Handbook of Visual Display Technology
Properties of Light

Timothy D. Wilkinson

Contents

Introduction
Coherence
Irradiance
Material Properties for Optical Components
Optical Dispersion
Polarized Light and Birefringence
Retardation Plates
Summary
References

Abstract

In order to understand the physical properties, design, and fabrication of any display technology, it is essential to have a good appreciation of the basic physics involved in the light sources and materials used in their construction. This does not mean that to be a display engineer you need an in-depth knowledge of Maxwell's equations, but rather that you need to understand the properties of light that lead to its control, propagation, and modulation. Key to this is an understanding of the basic wave properties of light and how these lead to energy transfer and to properties such as polarization which allow light to be controlled and manipulated. From these properties also stems the concept of optical coherence, which dictates which set of rules needs to be applied to understand how light interacts with the display technology and its environment. The wave properties of light also dictate many aspects of display performance, from wavelength and color control through to optical efficiency and dispersive effects. These can all be explained and analyzed using often simple properties and models to build a picture of how well a display works. This section is designed to introduce some of the fundamental concepts behind light and its propagation without getting buried in the heavy mathematics or physics which often lies just beneath the surface. By using simple analogies, a very powerful analysis can be performed as long as the correct assumptions are made about the fundamental properties of light, its sources, and propagation.

T.D. Wilkinson (*)
Centre of Molecular Materials for Photonics and Electronics (CMMPE), Electrical Engineering Division, University of Cambridge, Cambridge, UK
e-mail: [email protected]

© Springer-Verlag Berlin Heidelberg 2015
J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_1-2

Introduction

Light is a very complex notion which stems from the general properties of electromagnetic radiation. However, to use light in applications such as displays, it is often only the simplest concepts of propagation that are required. The basic theory of light propagation stems from complicated mathematical theories such as Maxwell's equations (Maxwell 1865), which form the fundamental underlying theory; however, from these relationships emerge a few basic concepts such as rays, waves, and polarization which can be used to solve the majority of display problems. The properties of light have been debated for many centuries, and there are still many phenomena which are not fully understood. The first significant steps were made by Newton, who in his book Opticks described light as a stream of particles or corpuscles (Newton 1730). The wave properties of light were first published by Grimaldi (1665) and then further developed by Huygens (1690). The debate about the properties of light has continued right through to modern quantum theory, with major contributions from Maxwell and Schrödinger (1926). The modern concept of light has shown that it has both wave and particle (now referred to as photon) properties and is at the heart of wave–particle duality. The simplest way to understand the properties of light is to apply the correct theoretical mechanism to the problem that is to be solved. In the case of displays, most problems can be solved either with photons, through ray or geometrical optics, or by wave propagation, through wave optics. Hence, we will discuss these two techniques in some detail. The ability to solve problems with simple theory quickly becomes limited in complex optical systems, and modern optical software can take a large amount of the pain out of solving these problems; however, it is important to understand the fundamentals on which these packages operate. A good mechanism for visualizing the propagation of light is as rays or wavefronts, as shown in Fig. 1. This is an idealized representation as it does not take into account what happens to small features or edges in the optical path; however, it is a very powerful tool for solving optical problems. The more complex properties of light at edges and boundaries are covered by diffraction theory. Different levels of difficulty can be included in ray theory, from simple geometric rays to paraxial rays, skew rays, and aberrations. Ray and wavefront propagation are the same representations using different geometric properties; hence, problems can be solved with either technique.

Fig. 1 Light propagation as (a) rays and (b) wavefronts

A ray can be drawn as if extending from a wavefront, as shown in Fig. 1b, and a wavefront can be generated from a group of rays using Huygens' principle of wavelet propagation. In most display applications, geometric ray theory will be used, but wavefronts could equally be applied to solve the problem. The wave properties of light can be used to generate wavefronts which propagate like ripples on the surface of a pond. Light propagation can be expressed in both space and time; however, the two dimensions are separable and can be solved independently. A typical propagating optical wave can be expressed as

\[ \psi = A \cos\left[ 2\pi \left( \frac{z}{\lambda} - \frac{t}{\tau} \right) \right] \]

where A is the amplitude of the wave propagating in space z and time t, λ is the wavelength of the light, and ν = 1/τ is the frequency of the wave (τ being the period). A more common representation is

\[ \psi = A \cos(kz - \omega t) \]

where k = 2π/λ is the wave number and ω is the angular frequency of the wave. The cosine function represents the oscillatory nature of light and will often be presented in harmonic form using a complex exponential. Above is the scalar form of the wave, but it is also possible to express it in vector form by adding a suitable unit vector to the wave number k. James Clerk Maxwell forged the connection between light and electromagnetism discovered by Michael Faraday. He formulated four important equations describing Faraday's discoveries in mathematical terms. Using the terminology of vector calculus, these are

\[ \nabla \times \mathbf{H} = \frac{\partial \mathbf{D}}{\partial t} + \mathbf{J}, \qquad \nabla \cdot \mathbf{B} = 0, \qquad \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad \nabla \cdot \mathbf{D} = \rho \]


where H is the magnetic field strength, D is the electric flux density, t is time, J is the current density, B is the magnetic flux density, E is the electric field strength, and ρ is the volume density of charge. These equations show that each field vector obeys a wave equation, that a varying electric field is accompanied by a varying orthogonal magnetic field and vice versa, and that the two form an electromagnetic field that can propagate as a transverse wave. The speed of light had been established experimentally in 1849, and Maxwell was able to relate it to purely electrical and magnetic constants by the equation

\[ c = \frac{1}{\sqrt{\epsilon_0 \mu_0}} \]

where ε₀ and μ₀ are, respectively, the permittivity and permeability of free space. A key property of light is its wavelength λ, as this represents its spatial frequency, which can be related to the speed of light c by c = fλ. The wavelength and frequency of light have similar properties to those more commonly associated with radio waves and microwaves. Maxwell's insight paved the way for Heinrich Hertz to discover and identify radio waves and to show that they behaved in a similar manner to light. The later discovery of further ranges of electromagnetic radiation led to the construction of the electromagnetic spectrum (Fig. 2), running from the lowest usable radio waves to the highest frequency gamma radiation. The visible part of this enormous spectrum covers barely an octave, a wavelength range of approximately 380–750 nm. This, however, matches closely the power spectrum of sunlight at the Earth's surface. Figure 2 also shows the different frequency bands as well as the visible spectrum of wavelengths critical to the field of displays.
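As a quick numerical illustration of the two relationships above (a sketch added here, not part of the original text), the snippet below evaluates c = 1/√(ε₀μ₀) from standard SI constant values and then uses c = fλ for an arbitrary 550 nm green wavelength:

```python
import math

# Standard SI values for the free-space constants
eps0 = 8.854e-12          # permittivity of free space, F/m
mu0 = 4 * math.pi * 1e-7  # permeability of free space, H/m

# Maxwell's relation: c = 1 / sqrt(eps0 * mu0)
c = 1 / math.sqrt(eps0 * mu0)
print(f"c = {c:.4e} m/s")               # ~2.998e8 m/s

# c = f * lambda for a nominal green wavelength of 550 nm
lam = 550e-9
print(f"f(550 nm) = {c / lam:.3e} Hz")  # ~5.45e14 Hz
```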

Coherence

Optical coherence is difficult to define, as there are many types and subtly different definitions. If we assume that light has the basic wave property described above, then we are assuming that light is fully coherent. Unfortunately, such light sources are very rare, and even the highest quality laser light source will have some degree of unpredictability. The measure of this unpredictability is referred to as its coherence properties and is often expressed in terms of a source's coherence length. The problem is that all light sources have some degree of predictability; hence, coherence is often a widely misquoted term. For example, why is it possible to see diffraction patterns such as the replay from a hologram or the diffraction colors from a compact disk (CD) even when viewed with a thermally random light source such as a tungsten filament light bulb? The answer lies in the relative size of the features on the CD or hologram with respect to the coherence length of the light source. As long as the feature size is of the order of or less than the coherence length of the source, diffraction effects will be seen. Even the most random of light sources will have some degree of predictability in its propagation, even if it is only over a few microns distance.

Fig. 2 The electromagnetic spectrum (frequency from roughly 10³ to 10²⁴ Hz, wavelength from kilometers down to picometers: radio, microwave, IR, visible, UV, X-rays, and gamma rays, with the solar spectrum spanning roughly 320–3,500 nm and the visible band roughly 400–760 nm)

Hence, there are two basic definitions that can be applied to coherence, which depend on the relative feature sizes and on the predictability and emission statistics of the light source. The first definition is the degree of axial coherence within the source. This means that there is some sort of relationship (not random) between the temporal and spatial emission of optical energy in the same direction as the propagation of the light energy. This is shown in Fig. 3, and the coherence length is defined as the distance the light propagates before the ability to predict where you are on the wave is lost. We can also define the source's spatial coherence. This is when a light source has a spatial distribution or physical area, and there is some form of relationship (not random) between energy emitted from different positions across the area of the light source. This is shown in Fig. 4. Different light sources will have different degrees of either axial or spatial coherence based on the physics of the process used to generate light. Thermal sources such as tungsten filament light bulbs tend to be very random and so have very short (a few microns) coherence lengths, whereas sources based on stimulated emission trapped within a defined optical cavity (such as found in a laser) tend to have long (often meters) coherence lengths.

Fig. 3 Definition of coherence length for axial propagation of light

Fig. 4 Definition of spatial coherence in the propagation of light

Light sources such as arc lamps and light-emitting diodes (LEDs) are often classified as partially coherent, as they exist somewhere between the two extremes. The most important reason for defining the coherence properties of a light source is that they dictate which propagation theory and principles can be used to understand its behavior within a given application. For instance, coherent sources such as lasers, which have a high degree of predictability, are subject to the rules of diffraction, where constructive and destructive interference can occur between waves, as seen in Fig. 5. On the other hand, incoherent light sources can be modeled using very simple principles based on the average energy and direction of propagation, as defined in Fig. 6. The vast majority of optical design done in displays is based on either harnessing or suppressing the effects that can be due to the coherence properties of the light source (Smith and King 2000; Hecht 1987). As the majority of light sources are relatively incoherent, simple geometric rules as used by ray tracing systems apply. More recent developments in bright arc sources and LEDs, along with laser projection (both holographic and direct), have meant that it is increasingly important to understand coherent diffraction-based properties of light within a given display system (Goodman 2005).
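A common rule of thumb (an assumption added here, not a formula given in the text) is that the temporal coherence length scales as l_c ≈ λ²/Δλ, where Δλ is the source linewidth. The sketch below uses illustrative, order-of-magnitude linewidths to reproduce the micron-to-meter span described above:

```python
def coherence_length(wavelength_nm, linewidth_nm):
    """Rule-of-thumb temporal coherence length: l_c ~ lambda^2 / delta-lambda."""
    return (wavelength_nm * 1e-9) ** 2 / (linewidth_nm * 1e-9)

# Assumed, order-of-magnitude linewidths for three representative sources
sources = {
    "tungsten filament": (550, 300),    # broadband thermal source
    "red LED":           (630, 20),     # partially coherent
    "HeNe laser":        (633, 0.002),  # narrow-line laser
}
for name, (lam, dlam) in sources.items():
    print(f"{name:18s} l_c ~ {coherence_length(lam, dlam):.2e} m")
# -> roughly 1e-6 m, 2e-5 m, and 2e-1 m respectively
```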

Irradiance

Another significant property of electromagnetic radiation is that it transports energy; hence, it is logical to represent this as an energy density, u, or radiant energy per unit volume. This is dealt with in more detail in the section on radiometry and photometry (see part Photometry). In terms of the electric field E or magnetic flux density B, this can be expressed as

Fig. 5 Constructive and destructive interference of waves

Fig. 6 Incoherent interference of waves

\[ u = \epsilon_0 E^2 = \frac{1}{\mu_0} B^2 \]

where ε₀ is the electric permittivity or dielectric constant of free space (8.85 × 10⁻¹² C² N⁻¹ m⁻²) and μ₀ is the permeability of free space (4π × 10⁻⁷ N s² C⁻²).


To represent the flow of electromagnetic energy, let S represent the transport of energy per unit time (the power) across a unit area. If we assume that the energy is flowing in the same direction as that of the propagation, then we can express S as a vector such that

\[ \mathbf{S} = \mathbf{E} \times \mathbf{H} \]

The magnitude of S is the power per unit area crossing a surface whose normal is parallel to S, and S is often referred to as the Poynting vector. E represents the electric field and H represents the magnetic field (B = μH), which are always orthogonal to one another. The term E × H will vary from maxima to minima, and S will vary in accordance with this. At optical frequencies, this variation is very fast indeed; hence, it is often more logical to talk about the time averages of such quantities. The time-averaged value of the magnitude of the Poynting vector is symbolized by ⟨S⟩ and is a measure of the important optical quantity known as the irradiance:

\[ I \equiv \langle S \rangle = \epsilon_0 c \left\langle E^2 \right\rangle = \frac{c}{\mu_0} \left\langle B^2 \right\rangle \]
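For a harmonic wave, ⟨E²⟩ = E₀²/2, so the irradiance relation above reduces to I = ½cε₀E₀² for a peak field amplitude E₀. A minimal sketch follows; the 1 kW/m² "bright sunlight" figure is a rough illustrative value:

```python
import math

eps0 = 8.854e-12   # permittivity of free space, F/m
c = 2.998e8        # speed of light, m/s

def irradiance_from_peak_field(E0):
    """I = eps0 * c * <E^2> with <E^2> = E0^2 / 2 for a harmonic wave."""
    return 0.5 * eps0 * c * E0 ** 2

def peak_field_from_irradiance(I):
    """Invert the relation above to recover the peak field amplitude."""
    return math.sqrt(2 * I / (eps0 * c))

# Roughly 1 kW/m^2 (bright sunlight) corresponds to a peak field near 870 V/m
print(f"E0 = {peak_field_from_irradiance(1000.0):.0f} V/m")
print(f"I  = {irradiance_from_peak_field(868.0):.0f} W/m^2")
```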

Material Properties for Optical Components

The job of the optical component design engineer is to marry together many different specifications, along with the packaging and environmental specifications, in order to create a commercially cost-effective component. Two very important considerations are the choice of materials used to make the waveguide and the fabrication methods employed. One of the key aspects of understanding the material properties is to recognize the phenomena which may cause effects at certain wavelengths. Figure 7 describes some of the more common phenomena, including those that may exist outside the visible light band but have a strong physical or chemical effect on a material's optical properties:

High-frequency modulation (MHz, microwave)

• Excitation of librational and vibrational modes in extended structures leads to dielectric loss (heating) and field-induced structural relaxation.
• Distinct spectral features lead to fluctuations in the dielectric function (with respect to frequency) and cause difficulties in predicting the performance (and design) of traveling wave electrodes.
• Some structural moieties (and trapped species) can absorb so strongly that damage results.

Fig. 7 Optical properties often found in display materials, spanning radio through X-ray frequencies: applied electric fields (field-induced polarization, dielectric loss, breakdown, relaxation, ionic drift) at radio/microwave frequencies; phonons, vibrations, librational modes, and relaxation in the infrared; electronic absorption spectra in the visible/ultraviolet (the optical communications band); and inner-shell and nuclear absorption spectra at X-ray energies

Infrared

• Fundamental vibrational spectra of chemical species, atom–atom bonds, and their overtones.
• In intense fields, the sub-resonant absorption can lead to multiphoton events; this can be the precursor to optical damage.
• Some specific photoinduced chemical reactions are driven by near-infrared absorption, e.g., triplet to singlet oxygen conversion leading to photo-oxidative decay of polymers.

Visible

• Electronic absorption spectra (color vision), causing optical loss.
• Strong absorption features underlie most nonlinear phenomena, charge transfer complexes, excitons, and other molecular orbital/band edge phenomena.
• Optical damage driven by multiple photon absorption leading to defect formation in semiconductors and dielectrics, photolysis in many systems.

The problem is in finding a unified theory which links dielectric properties to optical properties such as absorption, dispersion, and nonlinearities. A causal relationship linking absorption and the (complex) dielectric properties of all materials has been postulated and is known as the Kramers–Kronig relation (Kramers 1927):

\[ n = \sqrt{\epsilon_1}, \qquad \epsilon = \epsilon_1 - i\epsilon_2, \qquad a = 2\pi \times 10^6 \, \epsilon_2 \log(e) / \lambda \]

\[ \epsilon_1(\nu) - \epsilon_\infty = \frac{2}{\pi} \int_0^\infty \frac{\nu' \, \epsilon_2(\nu')}{\nu'^2 - \nu^2} \, d\nu' \]

\[ \epsilon_2(\nu) = -\frac{2}{\pi} \, \nu \int_0^\infty \frac{\epsilon_1(\nu') - \epsilon_\infty}{\nu'^2 - \nu^2} \, d\nu' \]

where n is the refractive index; the ε's are dielectric constants (ε₁ the real part, ε₂ the imaginary part, ε∞ the infinite-frequency value); a is the absorption; ν is frequency; and λ is wavelength.


This theory can be combined with the material's response to applied fields to create a theory which describes the complex interaction of the material with applied fields via the concept of polarizability:

\[ n^2 = \epsilon_0 + P/E, \qquad P = \epsilon_0 \left( \chi E + \chi_2 E^2 + \chi_3 E^3 + \ldots \right), \qquad n^2 = \epsilon_0 \left( 1 + \chi + \chi_2 E + \chi_3 E^2 + \ldots \right) \]

where P represents the interaction between the material and the electric field within the material, and the coefficients represent different nonlinear interactions and effects within the material. They are known as the susceptibilities or hyperpolarizabilities and are directly related to fundamental materials constants:

\[ \chi_i \propto \frac{\mu^{i+1}}{\left( E_g - \hbar\omega \right)^i} \]

Here the μ are the dipole moments, and the denominator is the energy difference between the band gap (for electronic absorption) and the energy of a photon at the observation frequency.

Optical Dispersion

When electromagnetic radiation enters a different medium, such as water or a piece of glass, it interacts with the medium in a very complex manner. The effect of this interaction can be expressed through either reflection or refraction, as described in the next section of this document. A simple way of expressing certain interactions between the radiation and the medium is the medium's refractive index. This is the ratio of the speed of light in vacuum c to the speed of light in the medium v:

\[ n = \frac{c}{v} \]

This interaction is not as simple as it looks, as it is a function of wavelength as well. The change in refractive index with wavelength is called dispersion and is often difficult to derive, as it depends on the chemical composition of the medium. Dispersion can arise from many chemical and physical interactions; however, a common form of dispersion is due to resonance with the chemical components of the medium, which can be expressed in the dispersion equation:

\[ n^2(\omega) = 1 + \frac{N q_e^2}{\epsilon_0 m_e} \left( \frac{1}{\omega_0^2 - \omega^2} \right) \]

where qₑ is the electron charge, N is the number of electrons per unit volume, mₑ is the electron mass, and ω₀ is the resonant frequency.

Fig. 8 Dispersion for typical materials (refractive index vs. wavelength, 0–1,000 nm, for dense flint glass, light flint glass, crystal quartz, borosilicate crown glass, acrylic plastic, and vitreous quartz) (Reprinted from Hecht (1987))

In fact, there will be a whole range of different interactions between atoms and components within the medium, which can be represented as a summation of the above equation for different resonant frequencies. This is made even more complicated when considering other electronic interactions such as fixed atomic boundaries. Hence, dispersion is a very complex concept, with many different contributing effects. Typical dispersion plots for some common optical materials are shown in Fig. 8.
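A minimal sketch of the single-resonance dispersion equation above; the electron density and the 100 nm ultraviolet resonance are assumed, illustrative values chosen only to show normal dispersion (n falling slowly with increasing wavelength) across the visible band:

```python
import math

# Physical constants (SI)
eps0 = 8.854e-12   # F/m
q_e = 1.602e-19    # C
m_e = 9.109e-31    # kg
c = 2.998e8        # m/s

def n_single_resonance(wavelength_m, N, lambda0_m):
    """n(omega) from the single-resonance dispersion equation.
    N: electron number density (m^-3); lambda0: resonance wavelength.
    Valid only away from the resonance, where n^2 stays positive."""
    w = 2 * math.pi * c / wavelength_m
    w0 = 2 * math.pi * c / lambda0_m
    n2 = 1 + (N * q_e**2 / (eps0 * m_e)) / (w0**2 - w**2)
    return math.sqrt(n2)

# Assumed illustrative numbers: a UV resonance at 100 nm, N ~ 5e28 m^-3
for lam_nm in (400, 550, 700):
    n = n_single_resonance(lam_nm * 1e-9, 5e28, 100e-9)
    print(f"n({lam_nm} nm) = {n:.4f}")   # n decreases toward the red
```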

Polarized Light and Birefringence

Up to this point, it has been assumed that the refractive index is a real property; however, it can also be complex. For almost all isotropic materials, such as soda-lime glass or water, the refractive index is a real property. For anisotropic materials such as calcite and liquid crystals, the case is not so simple, as the interaction between the radiation and the medium depends on the direction of propagation. This is especially the case with particular crystal lattices and structures. This complex interaction is simplest to analyze using polarized light. Monochromatic, coherent light sources such as lasers can be represented in terms of an orthogonal set of propagating eigenwaves, which are usually aligned to the x- and y-axes in a coordinate system with the direction of propagation along the z-axis, as depicted in Fig. 9.

Fig. 9 Vertically (a) and horizontally (b) polarized light

These eigenwaves can be used to describe the propagation of light through complex media. The Jones calculus, introduced by R.C. Jones (Jones 1941), allows us to describe these waves and their propagation. There are also situations where the polarization is scrambled in a random manner, leading to unpolarized light, but here we are only interested in purely polarized light. If we have an electromagnetic wave propagating in the z direction with its electric field along the x-axis, then the light is classified as linearly polarized in the x direction, or horizontally polarized. This wave can be represented as a Jones vector, assuming an amplitude Vx:

\[ V = \begin{pmatrix} V_x \\ 0 \end{pmatrix} \]

If the light is polarized in the direction of the y-axis, then we have linearly polarized light in the y direction of amplitude Vy, or vertically polarized light:

\[ V = \begin{pmatrix} 0 \\ V_y \end{pmatrix} \]

We can now combine these two eigenwaves to make any linear state of polarization we require. If the polarization of the light were to bisect the x- and y-axes at 45°, we could represent this as Vx = Vy and combine the two states into a single Jones vector:

\[ V = \begin{pmatrix} V_x \\ V_y \end{pmatrix} = V_x \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]

We can also represent more complex states of polarization, such as circular states. So far we have assumed that the eigenwaves are in phase (i.e., they start at the same point). We can also introduce a phase difference ϕ between the two eigenwaves, which leads to circularly polarized light. In these examples, the phase difference ϕ is positive in the direction of the z-axis and is always measured with reference to the vertically polarized eigenwave (parallel to the y-axis).


Hence, we can write the Jones vector:

\[ V = \begin{pmatrix} V_x \\ V_y e^{j\phi} \end{pmatrix} \]

There are two states which express circularly polarized light. If ϕ is positive, then the horizontal component leads the vertical, the resultant director appears to rotate to the right around the z-axis in a clockwise manner, and the light is right circularly polarized. Conversely, if the horizontal lags the vertical, then the rotation is counterclockwise and the light is left circularly polarized. In the case of pure circularly polarized light, ϕ = π/2 for right circular and ϕ = −π/2 for left circular:

\[ \text{Right circular: } V = V_x \begin{pmatrix} 1 \\ j \end{pmatrix} \qquad \text{Left circular: } V = V_x \begin{pmatrix} 1 \\ -j \end{pmatrix} \]

Other values of ϕ lead to elliptical polarization states, which are more complex to analyze. If ϕ = 0, then we have linearly polarized light at 45° to the y-axis, and if ϕ = π, then we have linearly polarized light at 135° to the y-axis.
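These states can be generated numerically from the convention used here, V = [Vx, Vy·e^{jϕ}], with ϕ measured against the vertical eigenwave. A minimal sketch (NumPy assumed; the helper name is illustrative):

```python
import numpy as np

def jones_vector(Vx, Vy, phi):
    """V = [Vx, Vy * e^{j*phi}], phi measured against the vertical eigenwave."""
    return np.array([Vx, Vy * np.exp(1j * phi)])

V_horizontal = jones_vector(1, 0, 0.0)                          # [1, 0]
V_linear_45 = jones_vector(1, 1, 0.0) / np.sqrt(2)              # linear at 45 deg
V_right_circular = jones_vector(1, 1, np.pi / 2) / np.sqrt(2)   # [1, j]/sqrt(2)
V_left_circular = jones_vector(1, 1, -np.pi / 2) / np.sqrt(2)   # [1, -j]/sqrt(2)

for name, V in [("horizontal", V_horizontal), ("linear 45 deg", V_linear_45),
                ("right circular", V_right_circular), ("left circular", V_left_circular)]:
    print(f"{name:15s} {np.round(V, 3)}")
```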

Retardation Plates

Some crystals, such as sodium chloride, have a cubic molecular structure. When light passes through these structures, it sees no preferred direction and is relatively unaffected. If the crystal has a hexagonal or trigonal structure, different directions of light will see very different crystalline structures. This effect is called birefringence, a property which is exploited in retarders. In a birefringent material, each eigenwave sees a different refractive index and will propagate at a different speed, as in Fig. 10. This leads to a phase retardation between the two eigenwaves which is dependent on the thickness of the birefringent material and the wavelength of the light. The preferred directions of propagation within the crystal are defined as the fast (or extraordinary) axis and the slow (or ordinary) axis. An eigenwave that passes in the same direction as the fast axis sees a refractive index n_f, and the eigenwave that passes along the slow axis sees n_s. For light of wavelength λ passing through a birefringent crystal of thickness t, we define the retardation Γ as

\[ \Gamma = \frac{2\pi t}{\lambda} \left( n_f - n_s \right) \]

The effect of this retardation can be expressed as a Jones matrix, assuming that the fast axis is in the same direction as the y-axis:

\[ W_0 = \begin{pmatrix} e^{-j\Gamma/2} & 0 \\ 0 & e^{j\Gamma/2} \end{pmatrix} \]



Fig. 10 Refractive index in anisotropic materials

It is more useful to be able to express the retardation for an arbitrary rotation of the fast axis by an angle ψ about the z-axis. Rotation with Jones matrices can be done as with normal matrix rotation. If we define a counterclockwise rotation of angle ψ as positive, then the rotation matrix is

\[ R(\psi) = \begin{pmatrix} \cos\psi & \sin\psi \\ -\sin\psi & \cos\psi \end{pmatrix} \]

Hence, the general form of the retardation plate is

\[ W = R(-\psi) \, W_0 \, R(\psi) \]

which can be expanded right to left (in normal matrix fashion) to give

\[ W = \begin{pmatrix} e^{-j\Gamma/2}\cos^2\psi + e^{j\Gamma/2}\sin^2\psi & -j\sin\frac{\Gamma}{2}\sin(2\psi) \\ -j\sin\frac{\Gamma}{2}\sin(2\psi) & e^{-j\Gamma/2}\sin^2\psi + e^{j\Gamma/2}\cos^2\psi \end{pmatrix} \]

This matrix now allows us to place an arbitrary wave plate in an optical system and illuminate it with polarized light.

Half-wave plate: A half-wave plate is a special example of the generalized retarder. In this case, the thickness of the plate has been chosen to give a phase retardation of exactly Γ = π. Hence, the Jones matrix for a half-wave plate at an angle ψ will be

\[ -j \begin{pmatrix} \cos 2\psi & \sin 2\psi \\ \sin 2\psi & -\cos 2\psi \end{pmatrix} \]




If the fast axis is aligned with the y-axis, then ψ = 0, and the Jones matrix is

\[ \begin{pmatrix} -j & 0 \\ 0 & j \end{pmatrix} \]

A half-wave plate can be used to rotate the direction of linearly polarized light from one linear state to another, which is a very useful property in an optical system.

The quarter-wave plate: In a similar fashion, we can tailor the thickness to give a quarter-wave retardation of Γ = π/2. Such a wave plate is useful for converting to and from circularly polarized light. For a quarter-wave plate with its fast axis aligned to the y-axis (ψ = 0), the Jones matrix will be

\[ \frac{1}{\sqrt{2}} \begin{pmatrix} 1-j & 0 \\ 0 & 1+j \end{pmatrix} \]

Linear polarizers: An important function in an optical system is to be able to filter out unwanted polarization states while passing desired states. This can be done using polarizers, which pass a single linear state while blocking all others (crossed), as shown in Fig. 11. The polarizer can be written as a Jones matrix; if the orientation of the polarizer is such that it passes vertically polarized light:

\[ P_y = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \]



Similarly, the polarizer can be rotated about the z-axis by an angle ψ such that

\[ P = R(-\psi) \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} R(\psi) \]

giving a generalized polarizer

\[ P = \begin{pmatrix} \sin^2\psi & -\frac{1}{2}\sin 2\psi \\ -\frac{1}{2}\sin 2\psi & \cos^2\psi \end{pmatrix} \]



Thus, if we rotate the polarizer by 90°, we get a horizontal polarizer. In a similar manner, we can make a right circular polarizer which passes right circularly polarized light but blocks left circularly polarized light. We can now use Jones calculus to solve the propagation of light through optical systems. A combination of optical elements from left to right can be expressed as a series of matrix multiplications. For example, if we have a pair of crossed polarizers (vertical and horizontal), then there will be no light propagated through the system.

Fig. 11 Principle of a linear polarizer

If we place a half-wave plate with its fast axis at 45° to the y-axis between the two crossed polarizers and illuminate with vertically polarized light, then the resultant will be as in Fig. 12:

\[ \begin{pmatrix} V'_x \\ V'_y \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & -j \\ -j & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ V_y \end{pmatrix} = \begin{pmatrix} -jV_y \\ 0 \end{pmatrix} \]
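This chain of matrix multiplications can be checked numerically. The sketch below (NumPy assumed; helper names are illustrative) builds the rotation matrix, general retarder, and polarizers exactly as defined above and reproduces the [−jVy, 0] result for unit input amplitude:

```python
import numpy as np

def rotation(psi):
    """Rotation matrix R(psi) as defined in the text."""
    return np.array([[np.cos(psi), np.sin(psi)],
                     [-np.sin(psi), np.cos(psi)]])

def retarder(gamma, psi):
    """General wave plate W = R(-psi) W0 R(psi) with fast axis at angle psi."""
    W0 = np.array([[np.exp(-1j * gamma / 2), 0],
                   [0, np.exp(1j * gamma / 2)]])
    return rotation(-psi) @ W0 @ rotation(psi)

P_V = np.array([[0, 0], [0, 1]])  # passes vertically polarized light
P_H = np.array([[1, 0], [0, 0]])  # passes horizontally polarized light

V_in = np.array([0, 1])           # unit-amplitude vertically polarized input

# Crossed polarizers alone block everything
print(np.round(P_H @ P_V @ V_in, 3))           # [0, 0]

# Inserting a half-wave plate (gamma = pi) at 45 degrees passes the light
HWP = retarder(np.pi, np.pi / 4)
print(np.round(P_H @ HWP @ P_V @ V_in, 3))     # [0.-1.j, 0.+0.j]
```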



Summary

A display engineer does not need an in-depth knowledge of Maxwell's equations, but rather needs to understand the fundamental physical properties of light that lead to its control, propagation, and modulation. Key to this is an understanding of the basic wave properties of light and how these lead to energy transfer and to properties such as polarization which allow light to be controlled and manipulated. From these properties also stems the concept of optical coherence, which dictates which set of rules needs to be applied to understand how light interacts with the display technology and its environment. The wave properties of light also dictate many aspects of display performance, from wavelength and color control through to optical efficiency and dispersive effects. These can all be explained and analyzed using often simple properties and models to build a picture of how well a display works. By using simple analogies, a very powerful analysis can be made as long as the correct assumptions about the fundamental properties of the light sources and their propagation are well defined.

Fig. 12 Jones algebra example (vertically polarized input, V polarizer, half-wave plate at 45°, H polarizer)

References

Goodman JW (2005) Introduction to Fourier optics, 3rd edn. Roberts and Co, Englewood
Grimaldi FM (1665) Physico-mathesis de lumine, coloribus, et iride, aliisque annexis libri duo (Bologna ["Bonomia"]). Vittorio Bonati, Italy, pp 1–11
Hecht E (1987) Optics, 2nd edn. Addison Wesley, Reading
Huygens C (1690) Traité de la lumière, Chap 1. Pieter van der Aa, Leiden (Note: In the preface to his Traité, Huygens states that in 1678 he first communicated his book to the French Royal Academy of Sciences)
Jones RC (1941) New calculus for the treatment of optical systems. J Opt Soc Am 31:488–493
Kramers HA (1927) La diffusion de la lumière par les atomes. Atti Cong Intern Fisica (trans: Volta Centenary Congress), Como 2:545–557
Maxwell JC (1865) A dynamical theory of the electromagnetic field. Philos Trans R Soc Lond 155:459–512
Newton I (1730) Opticks, 4th edn. William Innys, London (Dover, 1952)
Schrödinger E (1926) An undulatory theory of the mechanics of atoms and molecules. Phys Rev 28(6):1049–1070. doi:10.1103/PhysRev.28.1049
Smith FG, King TA (2000) Optics and photonics – an introduction. Wiley, New York

Geometric Optics

Timothy D. Wilkinson

Contents

Ray Propagation of Light
  Total Reflection from a Surface
  Snell's Law of Refraction
  The Thin Prism and the Thin Lens
  Aberrations
  Aperture Stops, Entrance and Exit Pupils
Reflections at a Boundary
Interference Films and AR Coatings
Multiple Reflections and Cavities
Display Applications
Summary
References

Abstract

The ability to model the propagation of light is a vital element of understanding any display technology. Whether it is a liquid crystal display backlight or a digital cinema projector, the principles of optical propagation still apply. For the majority of applications, a simple geometric optical model will suffice and allow the system to be carefully defined. Ray tracing is the most fundamental tool of geometric optics. By defining a ray and its direction, its path can be traced through the system using fundamental physical rules. This chapter covers the concept of ray tracing for simple lenses, including the derivation of the lensmaker's equation and the paraxial approximation. This is then analyzed further to define the basic set of aberrations which characterize imperfections in an optical system. Rays can then be traced through boundaries to build up the theory of reflections and transmission. This leads to the concept of total internal reflection, which is a powerful technique often used in displays to control the flow of optical energy within a system. The basic principles are explored from Fresnel reflection through to total internal reflection, and a simple display application is identified to illustrate the power of these applications. Finally, the properties of reflections are also used to define the function of antireflection coatings and optical enhancement cavity structures. A simple on-axis theory is presented to analyze the basic function of these optical structures based on the principles of geometric optics.

T.D. Wilkinson (*)
Centre of Molecular Materials for Photonics and Electronics (CMMPE), Electrical Engineering Division, University of Cambridge, Cambridge, UK
e-mail: [email protected]

© Springer-Verlag Berlin Heidelberg 2015
J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_2-2

Ray Propagation of Light

The simplest way to visualize the propagation of light is as rays or wavefronts, as shown in Fig. 1. This is an idealized representation as it does not take into account what happens to small features or edges in the optical path; however, it is a very powerful tool for solving optical problems (Hecht and Zajac 1987; Smith and King 2000). Different levels of difficulty can be included in ray theory, from simple geometric rays to paraxial rays, skew rays, and aberrations. Ray and wavefront propagation are the same representations using different geometric properties; hence, problems can be solved with either technique. A ray can be drawn as if extending from a wavefront, as shown in Fig. 1b, and a wavefront can be generated from a group of rays using Huygens' principle of wavelet propagation, as explained in the chapter "Optical Modulation," in the section on diffraction. In the next few examples, ray theory will be used, but wavefronts could equally be applied to solve the problem.

Fig. 1 Light propagation as (a) rays and (b) wavefronts

Total Reflection from a Surface

Figure 2 shows a ray from a point A incident on a reflecting surface (such as a mirror) at an angle of incidence i, intersecting at the point P on the surface. The reflected ray passes through point B and is reflected at an angle r.

Fig. 2 Reflection from a surface using ray notation

The obvious answer would be that r must equal i, but this was not proven fully until Fermat postulated that the ray must take the minimum time to get from point A to B. If we consider a mirror image point A′ below the surface, then the shortest path will be the straight line from A′ to B. Any other path, such as the one shown through point P′, will be longer and will take more time. Hence the angle of incidence must be equal to the angle of reflection (Smith and King 2000).

Snell’s Law of Refraction A ray propagating through a vacuum (which is approximately the same for air) will travel at the speed of light c; however, a ray passing through any other medium will travel at a slower speed such that c2 ¼ c=n2 where n2 is referred to as the refractive index of that medium. A ray traveling from one medium will be refracted as demonstrated in Fig. 3. The proof comes from Fermat’s principle which states the ray should transit through the system in the shortest possible time (Smith and King 2000). Snell’s law tells us that a ray traveling from a medium n1 to another medium n2 will be refracted (assuming n1 > n2) such that n1 sin θ1 ¼ n2 sin θ2 This is one of the fundamental properties used to describe ray propagation through optical systems as the interaction between media such as glass and plastic surfaces can be used to control and focus the direction of the light in displays.

The Thin Prism and the Thin Lens

The principle of Snell's law can be used to solve the optical problem of light propagation through a thin wedge-shaped prism such as the one shown in Fig. 4.

Fig. 3 A ray refracted from one medium to another

Fig. 4 Light through a thin prism

The refraction of the rays at each surface dictates how light will pass through the thin prism. From Snell's law we have n sin β₂ = sin β₁ and n sin β₃ = sin β₄, and the total deflection of the ray through the prism is θ such that

\[ \theta = \beta_1 - \beta_2 - \beta_3 + \beta_4 = \beta_1 + \beta_4 - \alpha \]

If we have a thin prism such that α is small, and a small angle of incidence such that sin β ≈ β (often referred to as the paraxial approximation), then the total deflected angle can be approximated as

\[ \theta = (n - 1)\alpha \]

This is the basic principle used in most geometric ray problems. Deviation from small values of α and β leads to aberration in the optical system; hence, these values form a solid basis for good lens design and the minimization of potential aberrations.
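The quality of the thin-prism approximation can be checked by tracing a ray exactly through both surfaces and comparing with θ = (n − 1)α. A minimal sketch with assumed example values (5° apex angle, n = 1.5):

```python
import math

def prism_deviation(beta1_deg, alpha_deg, n):
    """Exact deviation through a prism of apex angle alpha (degrees)."""
    b1 = math.radians(beta1_deg)
    a = math.radians(alpha_deg)
    b2 = math.asin(math.sin(b1) / n)   # refraction at the first surface
    b3 = a - b2                        # internal geometry of the prism
    b4 = math.asin(n * math.sin(b3))   # refraction at the second surface
    return math.degrees(b1 + b4 - a)

alpha, n = 5.0, 1.5
exact = prism_deviation(3.0, alpha, n)
thin = (n - 1) * alpha                 # theta = (n - 1) * alpha
print(f"exact: {exact:.3f} deg, thin-prism: {thin:.3f} deg")  # ~2.503 vs 2.500
```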

Fig. 5 A thin lens made from prism sections

Fig. 6 Classical thin lens optical system

These approximations do, however, limit what can be done in an optical system, especially if size is a constraint. A good example of how this property can be used is shown in Fig. 5, where a thin lens is made from a series of thin prism sections (Smith and King 2000). Each prism section in Fig. 5 is at a height y from the optical axis of the thin lens. As these are thin prisms, the apex angle can be expressed as α = 2y/r, where r is the radius of curvature of both surfaces; hence the deflected angle of a ray passing through each prism section will be

\[ \theta = (n - 1)\frac{2y}{r} \]

This deviation of each ray means that if parallel rays are incident on the lens, then they will all converge to the same point, called the focal point of the lens; the distance to it is the focal length. We can represent this as f = y/θ = r/2(n − 1). Moreover, parallel rays incident at an angle to the lens will converge to a different point on the optical axis. From this we can define the classical optical system in Fig. 6, assuming that the thin lens is circularly symmetric about the optical axis.


Fig. 7 Definition of sign convention

From the dimensions shown in Fig. 6, we define the object distance as u and the image distance as v, and we can then use classical geometrical optics to give the relationship

\[ \frac{1}{f} = \frac{1}{u} + \frac{1}{v} \]

This relationship forms the basis of most geometrical optical systems and is one of the fundamental relationships exploited on a regular basis in optical design procedures. It is important, however, to have a convention for signs when describing optical systems, as some image planes will be real and others will be virtual, depending on the lens system described. The convention we will be assuming is based on the Cartesian system as shown in Fig. 7:

• Object distance u is +ve when to the left of the lens.
• Object focal length f₀ is +ve when to the left of the lens.
• Object distance x₀ is +ve when to the left of the object focal plane F₀.
• Image distance v is +ve when to the right of the lens.
• Image focal length fᵢ is +ve when to the right of the lens.
• Image distance xᵢ is +ve when to the right of the image focal plane Fᵢ.
• Radius of curvature R is +ve if its center C is to the right of the center of the lens.
• Object and image heights y₀ and yᵢ are +ve above the optical axis.

A thin lens can be made from two spherical surfaces by applying the above relationship to each surface and adding the contributions for radii R1 and R2. There are many different types of thin lens depending on the radii and signs of curvature, some of which are shown in Fig. 8. The thin lens equation from the prism approximation can also be expressed in the form known as the lensmaker's equation, which incorporates the two surface radii, R1 and R2, as well as the refractive index of the lens material, nl:

Fig. 8 Different thin lenses (the surface on the left is R1): bi-convex (R1 > 0, R2 < 0), planar convex (R1 = ∞, R2 < 0), meniscus convex (R1 > 0, R2 > 0), bi-concave (R1 < 0, R2 > 0), planar concave (R1 = ∞, R2 > 0), and meniscus concave (R1 > 0, R2 > 0)

Fig. 9 Two special collimated cases: object at the focal plane (u = f₀, v = ∞) and object at infinity (u = ∞, v = fᵢ)

\[ \frac{1}{u} + \frac{1}{v} = \frac{1}{f} = (n_l - 1)\left( \frac{1}{R_1} - \frac{1}{R_2} \right) \]

From this equation, it is possible to see two special situations which may occur; both are shown in Fig. 9.

• If the object distance u equals the focal length f, then the image is at infinity and the rays on the image side will be collimated.
• Similarly, if the image distance v is equal to the focal length f, then the object must be at infinity.

There is also a third special case, which allows us to set up the basic structure of an imaging system through any thin lens. If a ray passes through the center of a thin lens (i.e., at the same point where the optical axis passes through the thin lens), then that ray will not be deviated on its course (Smith and King 2000; Smith 2000). Hence, we have the three cases we need to find the image of an object placed in front of a thin lens, as shown in Fig. 10.
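A minimal numeric sketch of the lensmaker's equation and the thin-lens relation above (helper names are illustrative); it confirms the classic result that an object at u = 2f images at v = 2f:

```python
def lensmaker_focal_length(n_l, R1, R2):
    """1/f = (n_l - 1) * (1/R1 - 1/R2); radii follow the sign convention above.
    Use float('inf') for a flat surface."""
    return 1.0 / ((n_l - 1) * (1.0 / R1 - 1.0 / R2))

def image_distance(f, u):
    """Thin-lens equation 1/f = 1/u + 1/v, solved for v."""
    return 1.0 / (1.0 / f - 1.0 / u)

# Bi-convex lens, n = 1.5, |R| = 100 mm on both sides -> f = 100 mm
f = lensmaker_focal_length(1.5, 100.0, -100.0)
print(f"f = {f:.1f} mm")

# Object at u = 2f images at v = 2f (unit magnification, inverted)
print(f"v = {image_distance(f, 2 * f):.1f} mm")
```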


Fig. 10 Three rays used to locate the image of an object

Table 1 Definitions of the object and image planes of a thin lens

Convex lens:
• ∞ > u > 2f: real image, inverted, minified, f < v < 2f
• u = 2f: real image, inverted, same size, v = 2f
• f < u < 2f: real image, inverted, magnified, v > 2f
• u = f: image at ±∞
• u < f: virtual image, erect, magnified, |v| > u

Concave lens:
• any object location: virtual image, erect, minified, |v| < |f|, u > |v|

1. A ray from the top of the object through the center of the lens will be undeviated.
2. A ray from the top of the object through the object focal point will leave the lens parallel to the optical axis on the image side.
3. A ray from the top of the object parallel to the optical axis will pass through the focal point in the image plane.

This can be repeated for the bottom of the object, if it is not on the optical axis. The majority of optical systems designed are used to perform some sort of imaging operation; hence, it is logical to define a series of magnifications to define the type and function of an imaging optical system. The longitudinal magnification is different from the transverse magnification, which means that an object with depth will suffer from distortions due to the differing magnifications. This can be compensated by multiple-surface lens design and ray tracing. The imaging properties of a thin lens are summarized in Table 1. An important point in geometric optics is the concept of apertures and stops. They form the fundamental limits of the rays and also effectively define the aberrations and quality of the images generated. The aperture stop of any optical system will restrict the cone of angles allowed through the optical system.


Another feature of the aperture stop is that it defines the f-number of a lens with a given aperture D and focal length f such that

\[ f/\# = \frac{f}{D} \]

For instance, a 50-mm focal length lens with an aperture size of 25 mm will have an f-number of 2, which is traditionally (and somewhat confusingly) written as f/2. If the aperture of the lens were reduced to 12.5 mm, then the f-number would be 4 (or f/4). A smaller f-number clearly allows more light to pass through the lens. Camera lenses are normally specified with their focal length and f-number, as the photographic exposure time is proportional to the f-number squared. Hence, the lower the f-number, the "faster" the lens. A typical camera lens will have a range of f-numbers from f/1 (fastest), 1.4, 2, 2.8, 4, 5.6, 8, 11, 16, to f/22 (slowest). Another way of expressing this relationship is by the numerical aperture, often shortened to NA. This is expressed as the index of refraction times the sine of the cone half-angle of illumination:

\[ \mathrm{NA} = n_1 \sin\frac{\theta}{2} \]

NA and f-number are similar ways of expressing the same concept in different optical systems. NA is more suited to conjugated systems where illumination angle is a useful measure of performance, such as in microscope objectives (Smith and King 2000).
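A minimal sketch of the two measures (helper names are illustrative); the final lines check the small-angle correspondence NA ≈ 1/(2·f/#) for the f/2 example above:

```python
import math

def f_number(focal_length, aperture):
    """f/# = f / D, in consistent units."""
    return focal_length / aperture

def numerical_aperture(n, cone_full_angle_deg):
    """NA = n * sin(theta/2) for a cone of full angle theta."""
    return n * math.sin(math.radians(cone_full_angle_deg) / 2)

print(f_number(50.0, 25.0))    # 2.0 -> f/2
print(f_number(50.0, 12.5))    # 4.0 -> f/4

# The f/2 lens accepts a cone of ~28 degrees in air; NA ~ 0.24,
# close to the 1/(2 * f/#) = 0.25 small-angle approximation
theta = 2 * math.degrees(math.atan(25.0 / (2 * 50.0)))
print(f"NA = {numerical_aperture(1.0, theta):.3f}")
```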

Aberrations

The analysis so far has used the paraxial ray approximation and has assumed that the lens operates at only one wavelength; however, this is rarely the case in real optical applications. Most lenses operate over a range of wavelengths, which means that dispersion will have an effect, as each glass element will have a different refractive index at each wavelength. This is usually referred to as chromatic aberration. This aberration can be corrected by using multiple glass surfaces with air gaps between them, or by combining two glass surfaces with different dispersions to form an achromatic doublet. The power of a single thin lens (the reciprocal of its focal length) is given by P = (n − 1)/R; a small change in refractive index δn will change the power by

\[ \delta P = P \frac{\delta n}{n - 1} \]

Varieties of optical glass differ quite widely in the way in which n varies with wavelength; hence, it is possible to combine two lenses with powers P₁ and P₂ in such a way that δP₁ + δP₂ = 0 without making P₁ + P₂ = 0 at the same time.


Since δn/(n − 1) has the same sign for all glasses, P₁ and P₂ must have different signs, and the combined pair will have a lower power than the two components. Hence, for a given wavelength range, we must satisfy the equation

\[ P_1 \frac{\delta n_1}{n_1 - 1} + P_2 \frac{\delta n_2}{n_2 - 1} = 0 \]
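This condition, together with a target total power P₁ + P₂ = P, can be solved directly for the two element powers. A minimal sketch; the crown/flint reciprocal-dispersion values (written as k = δn/(n − 1), i.e., 1/V for Abbe numbers of roughly 60 and 36) and the 10-dioptre target are assumed, illustrative figures:

```python
def achromat_powers(P_total, k1, k2):
    """Solve P1 + P2 = P_total and P1*k1 + P2*k2 = 0,
    where k = delta_n / (n - 1) for each glass over the chosen wavelength range."""
    P1 = P_total * k2 / (k2 - k1)
    P2 = -P1 * k1 / k2
    return P1, P2

# Assumed values: crown glass V1 ~ 60, flint glass V2 ~ 36,
# target focal length 100 mm -> P_total = 10 dioptres
k1, k2 = 1 / 60, 1 / 36
P1, P2 = achromat_powers(10.0, k1, k2)
print(f"P1 = {P1:.2f} D (positive crown), P2 = {P2:.2f} D (negative flint)")
# -> P1 = 25.00 D, P2 = -15.00 D; sum is 10 D and the condition holds
```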

The two surfaces may be in contact if they have the same radius of curvature, which is advantageous as it reduces the reflections from each air–glass interface. Such touching doublets are normally cemented together with a transparent glue which has the same refractive index as the glass (Hecht and Zajac 1987). The paraxial approximation is very useful for setting up basic systems, but it will also lead to aberrations. Hence, any optical system must undergo a series of optimizations after the initial approximation to minimize the relevant aberrations. No lens system will be perfect, so compromises must be made during the optimization procedure, based on which aberrations will most affect the image quality of the lens system. This is usually done through a series of trial-and-error simulations using ray tracing software such as ZEMAX and CODE V. The main types of aberration are either point-based aberrations, such as spherical, coma, and astigmatism, or image-based aberrations, such as Petzval field curvature and distortion. In the paraxial analysis of the spherical boundary problem, we generated the lensmaker's equation using the approximation sin x ≈ x; however, if we use the third-order approximation sin x ≈ x − x³/3!, then a more accurate estimate of the ray trajectories can be calculated. Most modern software packages perform both calculations and then compare the two results. The differences are expressed in a variety of plots to express the aberrations within the optical system. The more difficult aberrations to avoid and compensate for are those that occur off axis, when the ray angles begin to get larger. A common technique for expressing aberrations is to consider a cone of rays which strikes the boundary of an optical element in an off-axis circular region. The cone is specified in polar coordinates ρ and θ, and it is possible to use the third-order approximation to calculate the way in which the original ray cone is aberrated by the boundary transition. As a result of these aberrations, the lens will generate an ellipse of rays, and it is possible to calculate the associated minor and major axes using the form (Smith and King 2000)

\[ b_x = B\rho^3 \sin\theta - F y \rho^2 \sin 2\theta + D y^2 \rho \sin\theta \]
\[ b_y = B\rho^3 \cos\theta - F y \rho^2 \left(1 + 2\cos^2\theta\right) + (2C + D) y^2 \rho \cos\theta - E y^3 \]

The structure of this ellipse of rays is a powerful and simple method of presenting the different aberrations within the system. The aberrations associated with the coefficients B, C, D, E, and F are known as the Seidel aberrations and are often listed in tabular form so the lens designer can see which element has which associated aberration. The first term B is the same for both axes; hence, it will affect on-axis rays, and it is the spherical aberration.


Fig. 11 Different types of Seidel aberrations: spherical aberration, coma, and astigmatism (with its focal lines)

Figure 11 shows three different types of Seidel aberrations. Coma is an aberration, in addition to spherical aberration, which only appears for object points off the axis and is associated with the term F. Rays intersect the image plane in a comet-like fashion. Rays along a line θ = 90° do not contribute to coma, but rays at θ = 0° are significant. Astigmatism, related to coefficients C and D, is the result of a cylindrical wavefront aberration, which increases as the square of the distance off axis. The focus consists of two groups of rays referred to as the focal lines, with a blurred circular region in between. The Petzval field curvature also comes from terms C and D, in which the wavefront has an added curvature proportional to y². This shows that the focal length of the lens changes for off-axis points: a flat object plane will give a curved image plane. It is usual to find Petzval curvature even after a lens has been corrected for astigmatism. Distortion is related to term E and represents an angular deviation of the wavefront increasing as y³. This spreads or contracts the image, destroying the linear relationship between dimensions in the object and the image. Common distortions are referred to as the pin-cushion and barrel effects (Smith and King 2000).

Aperture Stops, Entrance and Exit Pupils

Very important tools in ray tracing an optical system are stops and pupils, as they define the limiting paths and angles that a ray can take when traversing the system. As a result of these limits, it is possible to estimate the angular limits within the system and hence estimate the aberrations that might occur due to those limits. An aperture stop (AS) is placed within a system to limit the possible ray paths. It also allows the limiting angles and rays through the system to be calculated, which is often very important when calculating imperfections and aberrations. Once such rays and paths have been defined, the performance of the optical system can then be optimized using commercial software algorithms.

Fig. 12 Example of an entrance pupil

Another common stop used is the field stop (FS), which defines the useful or operating region in the output or image plane of the optical system (Hecht and Zajac 1987; Smith and King 2000). From the aperture stop it is possible to derive the entrance and exit pupils, which are very useful tools in understanding the operation of an optical system. A pupil is simply a projection of the aperture stop through the optical system, and pupils can be used to find the maximum permissible cone of rays at both the object and image portions of the optical system. The entrance pupil of a system is the image of the aperture stop seen from an axial point on the object through those elements that precede (i.e., are before) the aperture stop, as shown in Fig. 12. If there are no elements before the AS, then the stop itself forms the entrance pupil. In contrast to the entrance pupil, the exit pupil is the image of the aperture stop as seen from an axial point on the image plane through the lenses between the image and the aperture stop, as shown in Fig. 13. Once again, if there are no lenses to the right of the aperture stop, then the aperture stop will be the exit pupil. The diagrams in Figs. 12 and 13 both include a ray labeled as the chief ray. This is defined as any ray from an off-axis object point that passes through the center of the aperture stop. The chief ray enters the optical system along a line directed toward the midpoint of the entrance pupil Enp and leaves the system along a line passing through the center of the exit pupil Exp. The chief ray is associated with a conical bundle of rays from a point on the object, effectively behaves as the central ray of the bundle, and is representative of it. Chief rays are particularly important when considering aberrations in an optical system. Figure 14 shows how the entrance and exit pupils can be used to investigate an optical system.

Fig. 13 Example of an exit pupil

Fig. 14 Example three-lens system with pupils and key rays

The aperture stop defines the entrance and exit pupils, and these can then be used to find the chief rays as well as the marginal ray, which goes from the axial object point to the rim or margin of the aperture stop. In most optical systems, one of the elements will be acting as an effective aperture stop, as each element will have a defined finite aperture diameter. In the situation when it is not clear which element is effectively acting as the aperture, each element must be imaged in turn, and the image that subtends the smallest angle at the axial object point is the entrance pupil.


Fig. 15 Irradiance incident at a material boundary: the incident (Ei, Hi), reflected (Er, Hr), and transmitted (Et, Ht) field components at the boundary plane between media ni and nt, with angles θi, θr, and θt defined in the plane of incidence

Reflections at a Boundary

The basic principle of Snell's law describes the process of refraction through a medium of different refractive index; however, this process does not account for the combination of both reflection and refraction at an interface. Unless a surface is antireflection (AR) coated, there will be a reflection due to the change in refractive index. Unfortunately, this is a rather complex area to analyze, as we have to understand Fresnel's equations for reflections at a boundary (Hecht and Zajac 1987). The reflection and transmission at an interface or boundary between materials with different refractive indices can be derived by breaking the radiation into its electric (E) and magnetic (H) field components and then solving the relevant conditions which exist at the boundaries. This splits into two sets of conditions which depend on the orientation of the electric field E relative to the plane of incidence for the interaction (the plane in which the incident, reflected, and transmitted rays all lie) (Fig. 15). The relationships can be summarized into two cases: the first is when the electric field (E field) is perpendicular to the plane of incidence, and the second is when the E field is parallel to the plane of incidence. The derivation of these relationships is based on the boundary conditions at the point of reflection. Here the materials' interaction dictates that certain properties of the E and H components must be continuous or inverted, depending on their orientation with respect to the materials themselves.

Fig. 16 Plot of the amplitude reflection (r) and transmission (t) coefficients versus angle of incidence

themselves. From this analysis we can define the reflected (r) and transmitted (t) amplitude coefficients by combining the Fresnel equations with Snell's law, giving the following relationships for the perpendicular and parallel components of the reflected and transmitted rays (Hecht and Zajac 1987):

$$r_\perp = -\frac{\sin(\theta_i - \theta_t)}{\sin(\theta_i + \theta_t)} \qquad r_\parallel = \frac{\tan(\theta_i - \theta_t)}{\tan(\theta_i + \theta_t)}$$

$$t_\perp = \frac{2\sin\theta_t\cos\theta_i}{\sin(\theta_i + \theta_t)} \qquad t_\parallel = \frac{2\sin\theta_t\cos\theta_i}{\sin(\theta_i + \theta_t)\cos(\theta_i - \theta_t)}$$

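These coefficients are straightforward to explore numerically. The short sketch below is an illustrative aid added here, not part of the original text; the index values are simply the air-glass example used in the figures:

```python
import numpy as np

def fresnel_amplitudes(theta_i, n_i=1.0, n_t=1.5):
    """Fresnel amplitude coefficients at a dielectric boundary (angles in radians)."""
    theta_t = np.arcsin(n_i * np.sin(theta_i) / n_t)   # Snell's law
    r_perp = -np.sin(theta_i - theta_t) / np.sin(theta_i + theta_t)
    r_para = np.tan(theta_i - theta_t) / np.tan(theta_i + theta_t)
    t_perp = 2 * np.sin(theta_t) * np.cos(theta_i) / np.sin(theta_i + theta_t)
    t_para = t_perp / np.cos(theta_i - theta_t)
    return r_perp, r_para, t_perp, t_para

# The polarization (Brewster) angle, where r_para crosses zero:
theta_p = np.arctan(1.5 / 1.0)
print(f"Brewster angle = {np.degrees(theta_p):.1f} deg")   # ~56.3 deg, as in Fig. 16
print(fresnel_amplitudes(np.radians(30.0)))
```

Squaring these amplitudes (with the geometric factor introduced below for the transmittance) reproduces the reflectance and transmittance curves of Fig. 17.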
We can use these relationships to analyze the complex interaction between the E and H fields at the point of incidence. Figure 16 shows a plot of all four coefficients versus angle of incidence for an air-glass (nt = 1.5) boundary. As can be seen from this plot, there is an angle θp, referred to as the polarization (or Brewster) angle, where the sign of the parallel component is inverted. As the angle of incidence approaches 90° (glancing incidence), the reflection becomes stronger. This is the technique used to reflect X-rays in telescopes. If we consider the conservation of energy in the system of Fig. 15, then the total energy flowing into the surface area (A) must be the same as the total energy flowing out of it:

Fig. 17 Plots of the reflection and transmission coefficients versus angle

$$I_i A \cos\theta_i = I_r A \cos\theta_r + I_t A \cos\theta_t$$

This can be expressed in terms of the electric field and broken into perpendicular and parallel reflectance (R) and transmittance (T) components such that R + T = 1:

$$R = \frac{I_r \cos\theta_r}{I_i \cos\theta_i} = \frac{I_r}{I_i} \qquad T = \frac{I_t \cos\theta_t}{I_i \cos\theta_i}$$

We can also express R and T in terms of the reflected and transmitted field components:

$$R_\perp = r_\perp^2 \qquad R_\parallel = r_\parallel^2 \qquad T_\perp = \frac{n_t \cos\theta_t}{n_i \cos\theta_i}\, t_\perp^2 \qquad T_\parallel = \frac{n_t \cos\theta_t}{n_i \cos\theta_i}\, t_\parallel^2$$

Hence we can say that $R_\perp + T_\perp = 1$ and $R_\parallel + T_\parallel = 1$. Figure 17 shows the components of the reflectance and transmittance for an air-glass boundary.

Fig. 18 Internal reflection (ni > nt) and the critical angle

As can be seen in Fig. 17, there is an angle at which the parallel reflected component is zero. This is the polarization angle shown before and is also referred to as the Brewster angle; it can be shown to be the angle at which θi + θt = 90°. This effect is exploited in Brewster prisms, where two orthogonal polarization states can be separated with a very high extinction ratio. We can also define both T and R at normal incidence (θi = 0):

$$R = R_\perp = R_\parallel = \left(\frac{n_t - n_i}{n_t + n_i}\right)^2 \qquad T = T_\perp = T_\parallel = \frac{4 n_t n_i}{(n_t + n_i)^2}$$

From the above relationships we can see that something interesting will happen in the case of internal reflection, when ni > nt. In this situation there is a critical angle θc, which leads to total internal reflection. If we consider a boundary between a dense medium such as soda lime glass and air, with rays hitting the boundary at an angle of incidence θi as in Fig. 18, and return to Snell's law, we see that as θi increases, the refracted ray approaches the tangent to the point P in the system of Fig. 18. This continues until θt = 90°, when we have reached the critical angle:

$$\theta_i = \theta_c \quad \text{with} \quad \sin\theta_c = \frac{n_t}{n_i}$$
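As a worked example (numbers chosen purely for illustration), for a soda lime glass-air boundary with ni ≈ 1.5 and nt = 1.0:

$$\theta_c = \sin^{-1}\left(\frac{1.0}{1.5}\right) \approx 41.8^\circ$$

so any internal ray striking the surface more obliquely than about 42° from the normal is totally reflected.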


For any incident angle greater than θc the ray will be totally internally reflected. Note that there is no discontinuity at the point of total internal reflection: at angles less than θc there is a gradual decrease in the refracted ray up to θc, when all of the light is reflected, as described earlier with the Fresnel equations. The equations can be rearranged (using nti = nt/ni):

$$r_\perp = \frac{\cos\theta_i - \sqrt{n_{ti}^2 - \sin^2\theta_i}}{\cos\theta_i + \sqrt{n_{ti}^2 - \sin^2\theta_i}} \qquad r_\parallel = \frac{n_{ti}^2\cos\theta_i - \sqrt{n_{ti}^2 - \sin^2\theta_i}}{n_{ti}^2\cos\theta_i + \sqrt{n_{ti}^2 - \sin^2\theta_i}}$$

As can be seen from these coefficients, when TIR occurs at θi > θc the reflection coefficients of the E field components become complex and alter the phase of the light in a complex fashion. At this point we have R = 1, and the entire wave is reflected. The process of total internal reflection (TIR) is used in many optical components such as wedge and corner prisms, which can make reflectors very close to 100 % efficient. There is a further interesting point when θi = θc, as θt will be 90°, giving TIR; however, there is in fact an evanescent wave in the direction of the refracted wave (which is now parallel to the optical boundary surface). This evanescent wave couples back into the surface to generate the internally reflected ray; at the boundary, however, the evanescent wave extends a small distance beyond the surface. If another medium with the same refractive index (ni) as the first medium is placed on top, then the evanescent wave will couple into the second medium and the ray will continue without being reflected at the point P. This evanescent wave can be used to partially couple between two different media in a process called frustrated TIR or FTIR. A thin layer of material is placed on top of the boundary so that a percentage of the evanescent wave passes through and couples into a second medium with the same refractive index. Hence a percentage of the light is reflected while the rest goes straight through. This is the technique used in optical beam splitters and the separation optics in microscopes.

The processes of refraction and TIR can be used to control a whole range of operations in an optical system rather than just deviating the path of a beam. One of the commonest is the dispersive glass prism originally demonstrated by Newton and shown in Fig. 4, where the dispersive qualities of the glass can be harnessed. By analyzing the ray path through the prism, it is possible to derive an equation for the total deviation of the ray, δ:

$$\delta = \beta_1 + \sin^{-1}\left[\sin\alpha\sqrt{n^2 - \sin^2\beta_1} - \sin\beta_1\cos\alpha\right] - \alpha$$

The angle δ is a function of n, which in turn is a function of wavelength through dispersion; hence the deviation angle will be different for different wavelengths. This prism is a simple means of splitting white light into its constituent wavelengths and was commonly used in early spectrometers. The process can also be reversed by finding the minimum angle of deviation for a given wavelength and then using this to calculate the refractive index at that wavelength. This is one of the most accurate ways of measuring refractive index.
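The deviation formula is simple to evaluate numerically. The sketch below is a minimal illustration with assumed values (apex angle α = 60°, n = 1.52); it locates the minimum deviation by a brute-force scan and then recovers the refractive index from the classical minimum-deviation relation n = sin[(α + δmin)/2]/sin(α/2):

```python
import numpy as np

def prism_deviation(beta1, alpha, n):
    """Total deviation delta through a prism of apex angle alpha (radians)."""
    s = np.sin(alpha) * np.sqrt(n**2 - np.sin(beta1)**2) - np.sin(beta1) * np.cos(alpha)
    return beta1 + np.arcsin(s) - alpha

alpha = np.radians(60.0)                        # equilateral prism (assumed)
beta1 = np.radians(np.linspace(35, 80, 4501))   # incidence angles to search
delta = prism_deviation(beta1, alpha, n=1.52)
d_min = delta.min()

# Recover n from the minimum deviation, the classical measurement:
n_est = np.sin((alpha + d_min) / 2) / np.sin(alpha / 2)
print(f"minimum deviation = {np.degrees(d_min):.2f} deg, n = {n_est:.4f}")
```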


Interference Films and AR Coatings

Any flat surface where there is a difference in refractive index will generate a reflection which is proportional to the difference in refractive indices. For a ray hitting such a boundary at normal incidence, the reflected R and transmitted T components are as given previously in section "▶ Human Vision and Photometry." If we have a parallel-sided slab of material such as glass in air, then there will be a reflection from the top surface and a reflection from the bottom surface of the ray transmitted through the top surface. The second reflection will then reach the top surface again, and it is possible to see interference between these two reflections if the path difference between them is correct. This interference is more pronounced if the slab of material is in fact a thin film or layer, as the path difference is then simpler to control. This is the effect seen when oil or petrol forms a thin layer on water: the colors come from the fact that the interference is a function of both thickness and wavelength (Hecht and Zajac 1987; Smith and King 2000).

The analysis of reflections at a boundary is very complex and requires the solution of Maxwell's equations for light hitting a surface at an angle, leading to a complex series of phase differences. This can be a very useful property of a multilayer stack of layers with different thicknesses and refractive indices, as it allows the reflection or transmission to be angularly selective, which is a very desirable element in flat panel displays such as LCDs. The design of these multilayer stacks requires complex software and often results in over 20 layers being required, which can be difficult to make. Here we will consider the simpler case of light at normal incidence on a thin optical coating. For many simple coatings such as antireflection (AR) coatings, the angular sensitivity is quite broad, and it is only above angles of 20-30° that the normal-incidence case is no longer accurate. The case of a single-layer coating is shown in Fig. 19. The rays are in fact normal to the surface and have only been drawn with a slight angle to demonstrate the principle. The reflectance (the subscript denotes the layer number) of a single layer with wave number k = 2π/λ can be derived and then simplified further when kd = π/2. This is equivalent to a layer thickness of d = λ/4, which means that the ray reflected from the lower boundary will destructively interfere with the ray reflected from the upper boundary, as shown in Fig. 19:

$$R_1 = \left(\frac{n_0 n_s - n_1^2}{n_0 n_s + n_1^2}\right)^2$$

The normal-incidence reflectance will in fact be zero if $n_1^2 = n_0 n_s$. Generally, for a visible AR coating, λ is set in the visible yellow-green region, which sets d as the quarter-wavelength thickness and allows a suitable refractive index for the layer to be chosen to cancel the reflection. Cryolite (n = 1.35), a sodium aluminum fluoride compound, and magnesium fluoride (MgF2, n = 1.38) are common low-index AR films for an air-glass boundary at visible wavelengths. For a typical glass (n = 1.5), the index of MgF2 is a little low, but it is still used as it forms a tough layer and reduces reflections from 4 % to around 1 %.
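As a numerical check of these results (a hypothetical example; the indices are the MgF2-on-glass values quoted above), the quarter-wave thickness and residual reflectance follow directly:

```python
def quarter_wave_ar(n0, n1, ns, wavelength=550e-9):
    """Thickness and normal-incidence reflectance of a quarter-wave layer."""
    d = wavelength / (4 * n1)                          # d = lambda / (4 n1)
    R1 = ((n0 * ns - n1**2) / (n0 * ns + n1**2))**2    # single-layer reflectance
    return d, R1

d, R1 = quarter_wave_ar(n0=1.0, n1=1.38, ns=1.5)       # MgF2 on typical glass
print(f"d = {d*1e9:.0f} nm, R = {100*R1:.2f} %")       # ~100 nm, ~1.4 %
```

The ideal index for this substrate would be the unavailable value of √(n0 ns) ≈ 1.22, which is why the MgF2 layer leaves a small residual reflection.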

Fig. 19 Reflections off a layer at normal incidence

For a double-layer coating, the reflections at each surface will cascade, giving a more complicated reflectance; however, for quarter-wave thickness layers, the reflectance simplifies to

$$R_2 = \left(\frac{n_2^2 n_0 - n_s n_1^2}{n_2^2 n_0 + n_s n_1^2}\right)^2$$

which equals zero at a particular wavelength when

$$\left(\frac{n_2}{n_1}\right)^2 = \frac{n_s}{n_0}$$

This sort of film is referred to as a double-quarter, single-minimum coating. When n1 and n2 are as low as possible, the reflectance will have its single broadest minimum equal to zero at the chosen frequency. It should be clear, however, that n2 > n1, and it is common practice to designate a (glass)-(high index)-(low index)-(air) stack as gHLa. Zirconium dioxide (n = 2.1), titanium dioxide (n = 2.4), and zinc sulfide (n = 2.6) are commonly used for the H layer, with MgF2 used for the L layer. Other double- and triple-layer stacks can be used to create different spectral and angular responses. A common approach is to make a multilayer periodic stack such as g(HL)^3 a. The maximum reflectance can be increased further by adding a final H layer to give a stack of g(HL)^m H a. A very-high-quality mirror can be produced this way. The small peak on the short-wavelength (UV) side can be reduced by adding an eighth-wave low-index film at each end of the stack, forming a high-pass filter g(0.5L)(HL)^m H(0.5L)a.
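For stacks of more than two layers, the standard tool is the characteristic-matrix (transfer-matrix) method. The sketch below is a bare-bones normal-incidence version written for this text, not the commercial design software mentioned above; it shows the reflectance of a periodic (HL)^m stack rising toward that of a quarter-wave mirror:

```python
import numpy as np

def stack_reflectance(n0, layers, ns, wavelength):
    """Normal-incidence reflectance of a dielectric stack via characteristic
    matrices; layers is a list of (index, thickness) from the incidence side."""
    M = np.eye(2, dtype=complex)
    for n, d in layers:
        delta = 2 * np.pi * n * d / wavelength          # phase thickness
        layer = np.array([[np.cos(delta), 1j * np.sin(delta) / n],
                          [1j * n * np.sin(delta), np.cos(delta)]])
        M = M @ layer
    a = M[0, 0] + M[0, 1] * ns
    b = M[1, 0] + M[1, 1] * ns
    r = (n0 * a - b) / (n0 * a + b)
    return abs(r)**2

lam = 550e-9
H = (2.4, lam / (4 * 2.4))      # quarter-wave high-index layer (e.g., TiO2)
L = (1.38, lam / (4 * 1.38))    # quarter-wave low-index layer (MgF2)
for m in (1, 3, 6):             # reflectance grows with the number of HL pairs
    print(m, round(stack_reflectance(1.0, [H, L] * m, 1.5, lam), 3))
```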

Multiple Reflections and Cavities

Another interesting case can be seen in Fig. 20, where multiple reflections between two surfaces occur. This will only happen when special measures are taken to increase the reflections, usually by partially silvering the surfaces of the slab or

Fig. 20 Multiple reflections off a layer

coating with dielectric mirrors. This increased reflection means that the reflections do not die out as they would at a simple air-glass boundary (Smith and King 2000). If r and t are the reflection and transmission coefficients from the surrounding medium to the slab and r′ and t′ are the reflection and transmission coefficients from the slab to the surrounding medium, we can write the amplitudes of the reflected rays as rA, tt′r′A, tt′r′³A, etc., and of the transmitted rays as tt′A, tt′r′²A, tt′r′⁴A, etc. With the exception of the first large reflected ray, the amplitudes form a geometric progression, and the closer to unity r′ is made, the more slowly the reflections die away. The phase of the reflections depends on θ, the angle of refraction inside the slab, its thickness h, and the refractive index n. The relative phase on a plane outside the slab perpendicular to the ray depends on θ as

$$\psi = \frac{4\pi n h \cos\theta}{\lambda}$$

If we use a lens to combine the reflected rays, then they will interfere, and constructive interference will occur when 2nh cos θ = Nλ for each integer N. This will lead to a series of interference fringe rings. What is interesting is when you consider the gaps between each interference fringe. The brightest point on each fringe corresponds to a ψ value at a multiple of 2π. Any value of ψ other than this creates a poorly defined phasor, which means that the gaps between the fringes will be dark and the fringes very thin. We can prove this by summing all of the reflected rays to get a ratio of the transmitted irradiance to the incident irradiance of

$$\frac{I_t}{I_i} = \frac{(tt')^2}{(1 - r'^2)^2 + 4r'^2\sin^2(\psi/2)} = \frac{1}{1 + F\sin^2(\psi/2)} \quad \text{where} \quad F = \frac{(2r')^2}{(1 - r'^2)^2}$$


Note that the F parameter becomes very large as r′² approaches unity; for example, if r′² = 0.8 then F = 80. This has the effect of keeping the transmitted irradiance very small, except when ψ is close enough to a multiple of 2π for sin²(ψ/2) to be less than 1/F. Hence for a large value of F, the ratio shoots to unity rapidly as ψ approaches multiples of 2π. The sharp fringes of multiple interference are known as Fabry-Perot fringes, and their sharpness is often referred to as the finesse:

$$\text{Finesse} = \frac{\pi\sqrt{F}}{2}$$

A Fabry-Perot cavity is capable of producing very-high-quality resonant peaks, given a suitably high finesse. This can be used as a wavelength filter or as an angular filtering system. More importantly, the cavity can be made tunable by placing a variable refractive index in the cavity with either a liquid crystal material or possibly a nonlinear optically active material such as a chromophore polymer. A more subtle effect which can be exploited in a multiple-path interference device is that, by operating a modulator on the very edge of a resonance peak, it is possible to create a large optical effect from a very small electro-optical effect, which may only have a weak response in its nonresonant state.
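The sharpness of these fringes is easy to visualize numerically. The sketch below is an illustrative addition using the r′² = 0.8 example from the text; it evaluates the transmission function and the corresponding finesse:

```python
import numpy as np

def airy_transmission(psi, r_sq):
    """It/Ii for a lossless cavity; r_sq is the intensity reflectance r'^2."""
    F = (2 * np.sqrt(r_sq))**2 / (1 - r_sq)**2
    return 1.0 / (1.0 + F * np.sin(psi / 2)**2)

r_sq = 0.8
F = (2 * np.sqrt(r_sq))**2 / (1 - r_sq)**2                       # F = 80
print(f"F = {F:.0f}, finesse = {np.pi * np.sqrt(F) / 2:.1f}")    # ~14
psi = np.linspace(0, 4 * np.pi, 2001)
T = airy_transmission(psi, r_sq)   # sharp unity peaks only near psi = 0, 2pi, 4pi
print(f"T halfway between fringes: {airy_transmission(np.pi, r_sq):.4f}")  # ~0.012
```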

Display Applications

The process of TIR can be harnessed in many ways. Waveguiding through TIR has revolutionized the telecommunications industry with the birth of the silica optical fiber. There are also applications for waveguides in displays, especially as backlights. Plastic optical fibers (POFs) are already being used to guide light by TIR as light pipes in displays such as motorway signs and traffic lights. They offer a cheap means of controlling the direction of light over a few tens of centimeters without significant loss or cost, and they can transmit high power without interference or crosstalk. Another use of waveguides is in the backlights of displays such as LCDs and mobile phone displays. Figure 21 shows a series of rays launched into a plastic slab at angles which are less than the critical angle for the plastic-air boundaries. The rays are bounced by TIR along the waveguide (Liu 2005). The two lines in Fig. 21 show two different angles, often referred to as modes within the waveguide. There is a limit to how far they will propagate through the waveguide, as there is a loss associated with the absorption of the plastic material. Hence there is a limited distance of waveguiding before efficiency and uniformity

Fig. 21 Waveguided modes in a plastic slab

Fig. 22 Double-ended and tapered backlights

become an issue. Even so, this is a good means of distributing light across an area, as would be required in an LCD backlight. The thinness of the modern laptop display is due mostly to the plastic waveguides at the back, which are edge illuminated by thin low-pressure tubes. The waveguide in Fig. 21 would be no use as a backlight, as no light is emitted from the top surface. Light is allowed to escape from the top surface in a controlled manner either by tapering the waveguide or by putting a series of shaped features on the top surface to let some of the light escape. This is shown in Fig. 22, and most commercial backlights employ both techniques in some form. Features and reflectors are also added on the bottom surface to aid emission. The backlights shown in Fig. 22 are greatly simplified, as a great deal of effort has gone into perfecting the waveguide structures over the last 10 years. The lamps along the edges are partially mirrored and are often embedded into the waveguide edge for maximum transmission of light. More modern designs now include light-emitting diodes (LEDs) as sources, and the waveguides are further optimized to cater for the nonuniform illumination characteristics (Mottier 2009). Some use edge-illuminated LED sources, and some now launch LED sources into the back surface of the backlight waveguide itself. The features on the surface of the waveguide are designed both for a bright backlight and for uniformity of the brightness. The surface is designed to compensate for the nonuniformity of the lamps as well as the absorption of the waveguide material. The waveguiding function of a slab waveguide has also been used to form a new large-area display, as shown in Fig. 23. With careful design, a wedge-shaped piece of plastic can be used to propagate an image from an engine such as a microdisplay (Travis et al. 2000).

Fig. 23 The wedge waveguide display system

Summary

For the majority of display-based optical applications, a simple geometric optical model is more than sufficient and allows the system to be carefully designed. Ray tracing is the most fundamental technique of geometric optics and forms the basic tool of any optical engineer. By defining a ray and its direction, its path can be traced through the system using fundamental physical rules. Simple lenses result from this basic concept along with Snell's law, leading on to the derivation of the lensmaker's equation and the paraxial approximation. The purpose and power of this approximation is then analyzed further to define the basic set of aberrations


which characterize imperfections in an optical system. Rays can then be traced through boundaries to build up the theory of apertures, aberrations, reflections, and transmission. This leads to the concept of total internal reflection, which is a powerful technique often used in displays to control the flow of optical energy within a system. The basic principles are explored from Fresnel reflection through to total internal reflection, and a simple display application is identified to illustrate the power of these techniques. Finally, the properties of reflections are also used to define the function of antireflection coatings and optical enhancement cavity structures. A simple on-axis theory is presented to analyze the basic function of these optical structures based on the principles of geometric optics.

References

Hecht E, Zajac A (1987) Optics, 2nd edn. Addison Wesley, Reading
Liu J-M (2005) Photonic devices. CUP, New York
Mottier P (2009) LEDs for lighting applications. Wiley, Hoboken
Smith WJ (2000) Modern optical engineering, 3rd edn. SPIE/McGraw-Hill, New York
Smith FG, King TA (2000) Optics and photonics. Wiley, Chichester
Travis A, Payne F, Zhong J, Moore J (2000) Flat panel display using projection within a wedge-shaped waveguide. In: Proceedings of conference, SID, Palm Beach, pp 292-295

Optical Modulation

Timothy D. Wilkinson

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
The Point Spread Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Modulation Transfer Function and Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
The Classical MTF Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Experimental Measurement of the MTF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Liquid Crystal Modulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
    Out-of-Plane Optical Modulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
    In-Plane Optical Modulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Diffraction of Light . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

Abstract

This chapter covers the issues that lead to developing and understanding a successful display technology. Optical modulation is a very important aspect of any display, and a fundamental principle is that of resolution and its definition in the context of the display technology. Whether the limiting factors are pixels, optics, or materials, the theory of modulation transfer and point spread functions is essential in assessing the resolution properties of any display technology. Optical modulation characteristics are especially important in liquid crystal displays, so the latter part of the chapter develops some of the relevant theory required. The analysis stems from the basic birefringence of these materials and describes their optical properties using Jones matrices. Finally, a brief


description of optical diffraction theory is given, as this forms a basis for most modulation and resolution analysis.

Introduction

The concept of resolution is key to understanding and analyzing modern display technologies. The majority of displays in use today use a finite number of picture elements or "pixels" in an array to form images. These pixels are then combined with optics to illuminate, project, or reorient in order to form the final image. The concept of optical modulation ties all of these elements together and can be used to analyze the performance of any display system and define parameters such as resolution and image reproduction quality.

The Point Spread Function

It is impossible to create an optical system that has perfect resolution properties. Even an infinitely small point source imaged through a perfect aberration-free lens will not form a perfect spot, due to diffraction, as described in Sect. 7, ▶ Liquid Crystal Displays. The wave properties of light lead to the beam spreading at the edges of apertures; thus, there will always be some form of diffraction effect which will cause the spot to spread and develop side lobes. If we shine collimated light through a circular aperture and focus it through a lens, the distribution of the light in the focal plane of the lens will show the effects of the diffraction of the light at the edge of the aperture, as shown in Fig. 1. It is possible to calculate the pattern of light generated from a given aperture via the Fourier transform relationship discussed in Sect. 7, ▶ Liquid Crystal Displays, on Fourier optics and diffraction. If we have a circular aperture, then the resulting distribution in the focal plane will be an Airy pattern described by the first-order Bessel function, with light and dark circular rings. The central bright spot is called the Airy disk, and it contains up to 84 % of the total energy (Hecht and Zajac 1987). This pattern, based on a finite aperture, can be used to define the resolution limit of an optical system (Smith and King 2000; Smith 2000). It is also useful to define the numerical aperture (NA) of the optical system in Fig. 1 as NA = n1 sin(θ/2), where n1 is the refractive index of the lens material and θ is the angle subtended by the lens diameter at the principal focal point. Let us consider a system which uses a lens to image two bright sources of light of wavelength λ; each source is imaged to an Airy disk and associated rings. If the sources are close together, then the disks will overlap. When the separation is such that it is just possible to resolve two separate disks rather than one, the points are said to be resolved. This is shown in Fig. 2 for various amounts of separation. When the image points are closer than 0.5λ/NA, the central maxima of both patterns blend into one, and the combined patterns appear to be from a single source. When the image separation reaches

Fig. 1 Focused light through an aperture

Fig. 2 Definition of the separations of resolvable points

0.61λ/NA, the maximum of one pattern coincides with the first dark ring of the other, and there is a clear indication of two separate maxima in the combined pattern. This is referred to as Rayleigh's criterion for an optical system:

$$Z = \frac{0.61\lambda}{NA} = 1.22\lambda\,(f/\text{no})$$

This expression is widely used in defining the resolution of telescopes. For the imaging side, the image NA is used, and for the object side, the object NA is used. The equivalent separation for a pair of line images is simply λ(f/no). The way in which an optical system alters the structure of the aperture or illumination source is called the point spread function (PSF) or, in the one-dimensional case, the line spread function (LSF). The effects of this can be seen in the next section, where the PSF and LSF are used to describe the resolution


limits of a display. The PSF, LSF, and modulation transfer function (MTF) are all interrelated and can be used to interpret the limits of an optical system or display. Furthermore, the PSF can also be used to detect or indicate any defects in the optical system, such as aberrations. By looking at the distortion in the PSF, we can spot sources of aberration: for example, spherical aberration leads to a broadening and flattening of the PSF, whereas coma and astigmatism lead to asymmetric shapes in the PSF and its side lobes. In a commercial software package such as CODE V or ZEMAX, the PSF is calculated by wavefront propagation rather than simple ray tracing. Hence it not only provides a different perspective, it may also indicate distortions such as diffractive effects which would not have shown up in a purely geometric analysis.
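As a worked example of Rayleigh's criterion above (values chosen purely for illustration), a lens of NA = 0.25 imaging green light at λ = 550 nm can just resolve two point images separated by

$$Z = \frac{0.61 \times 550\ \text{nm}}{0.25} \approx 1.3\ \mu\text{m}$$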

Modulation Transfer Function and Resolution

One of the most important considerations in the performance of a display is its resolution, i.e., its ability to display high-resolution images. There are several techniques that can be used to judge the performance of a display in terms of resolution. The choice of technique depends on the type of display and the way in which it operates. These techniques have one thing in common: they generate the MTF for the display, which allows the performance of the display to be evaluated either analytically or graphically (Barten 1995). Another feature these techniques have in common is that the effects of color are ignored. The MTF is commonly used to evaluate the resolution performance of optical systems, especially as it can often be represented with mathematical analysis for these systems. One of the commonest uses of the MTF is in the design of lens systems for cameras, as it provides a means of combining the effects of several optical aberrations to judge the quality of the lens design (Smith and King 2000). With this technique it is possible to follow the propagation of light through the optical path and then take into account the MTF to show the effect of the aberrations. This is not so easy in displays, as there is a finite resolution limit set by the pixelation and grayscale of the display. It is possible, however, to use the MTF to show how the display performs, taking into account the effects of the optics surrounding the display used for illumination or projection. Probably the simplest test for a display is to use a resolution chart such as the one in Fig. 3, the standard US Air Force (USAF) chart. The chart contains an array of bars of decreasing separation, which correspond to different spatial frequencies. The chart is displayed, and the resolution limit is measured from the finest bars that can be visibly resolved in the image. There is a fundamental limit on the resolution based on the number of pixels, but there will also be other, more influential limits due to the optics used to illuminate or project the image from the display. There is a resolution limit that can be seen in the image in Fig. 3 that is set by the performance of the printer, compression of the original bitmap, and further limits due to the photocopier used to reproduce it.

Fig. 3 The standard USAF resolution chart

The resolution chart gives us a visual evaluation of the performance but does not identify the source of the limitation. The only way this can be obtained with a resolution chart is to eliminate the possible sources one by one, with a carefully chosen sequence of chart tests. It is possible to replace the display with a high-resolution slide of the chart, which allows us to evaluate the quality of the optics as a separate system. Other charts also exist to test other features of the display system, including diffraction effects and grayscale quality.

The Classical MTF Approach

If we consider an image as a luminance distribution, then we can mathematically evaluate the MTF as the effect that the display system has on reproducing an ideal image (Williams and Becklund 1989). In order to understand this, we shall look at the case in one dimension, which is equivalent to looking at the shortest aspect of a display; the analysis can be directly converted into the two-dimensional case. If we have a one-dimensional luminance function f(x), we can use Fourier theory to express this as a sum of an infinite number of sinusoids with a spectrum of spatial frequencies. We can express the sum as an integral:

$$f(x) = \int_{-\infty}^{\infty} F(u)\, e^{i2\pi ux}\, du$$


where u is the frequency and F(u) is the complex amplitude of the spatial frequency components. Conversely, we can calculate F(u):

$$F(u) = \int_{-\infty}^{\infty} f(x)\, e^{-i2\pi ux}\, dx$$

The function F(u) is the Fourier transform of f(x). It describes the spatial frequency spectrum of the luminance pattern. If the original (undisplayed) luminance pattern is f(x) and its reproduction on a display is f′(x), then for f′(x) we can calculate the Fourier transform F′(u). The image rendition of a display system is characterized by the way in which the various spatial frequency components of an image are reproduced. A lack of rendition of high spatial frequency components causes unsharpness of the image. The MTF M(u) describes the ability of a system to reproduce the modulation of the various spatial frequency components and is defined as the output modulation of a system F′(u) divided by the original input modulation F(u), as a function of spatial frequency:

$$M(u) = \left|\frac{F'(u)}{F(u)}\right|$$

In the MTF, the phase relation between the input and output modulation is not taken into account. For cases where this could play a role, the optical transfer function (OTF) should be used. The definition of the OTF is the same as for the MTF, but without the absolute value signs. If the MTF becomes negative, this indicates a phase reversal, i.e., a black bar is imaged as white or vice versa. This is common in defocused optical systems. The MTF is a very important quantity for the characterization of the resolution capability of a display system; it is analogous to a filter function in electronic theory. Figure 4 shows a typical MTF for a cathode ray tube (CRT) display. At low spatial frequencies, the MTF is 1. At higher frequencies it decreases to 0. However, the following precautions must be taken into account:

1. The concept of the MTF (and also the OTF) strictly applies only to linear systems. In a nonlinear system, higher-frequency harmonics appear after the Fourier transform, leading to erroneous MTF values.
2. The average luminance of a display system may include effects such as ambient or stray illumination. This can also lead to errors in the MTF, as the display system is being influenced by external factors.

In many cases the displayed luminance pattern results from an image-forming process where each point of the original pattern is spread over a certain area in the displayed image. Examples are the image formed by a non-perfect optical lens and the scanning process of an electron spot in a CRT. Such systems can be

Fig. 4 The MTF of a CRT tube with a Gaussian spot

characterized by a line spread function (LSF) l(x) by which the original f(x) is scanned (or convolved) to form the displayed f′(x) (Williams and Becklund 1989):

$$f'(x) = \int_{-\infty}^{\infty} f(\xi)\, l(x - \xi)\, d\xi$$

where ξ is a variable used for the integration. If an image is formed by a convolution process, the MTF for this process can be determined from the LSF. From the Fourier transform convolution identities, we can directly state the relationship between the displayed image F′(u) and the Fourier transform of the LSF, L(u):

$$F'(u) = L(u)\,F(u)$$

This means that the MTF can be obtained directly from the absolute value of the Fourier transform of the LSF:

$$M(u) = |L(u)| = \left|\int_{-\infty}^{\infty} l(x)\, e^{-i2\pi ux}\, dx\right|$$
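This relationship maps directly onto a discrete calculation: sample the LSF, take its Fourier transform, and normalize. The sketch below is a minimal illustration assuming a Gaussian LSF, roughly like the CRT spot of Fig. 4; the spacing and width values are arbitrary:

```python
import numpy as np

dx = 0.05                                   # sample spacing in mm (assumed)
x = np.arange(-1.6, 1.6, dx)
lsf = np.exp(-x**2 / (2 * 0.2**2))          # assumed Gaussian LSF, sigma = 0.2 mm

mtf = np.abs(np.fft.rfft(lsf))              # |Fourier transform of the LSF|
mtf /= mtf[0]                               # normalize so that M(0) = 1
u = np.fft.rfftfreq(len(x), d=dx)           # spatial frequency in cycles/mm

print(f"M(1 cycle/mm) = {np.interp(1.0, u, mtf):.2f}")   # ~0.45 for this spot
```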

Figure 5 shows an LSF and its corresponding MTF. It is interesting to note from this example that the number of sample points required in the LSF is low. In this example, three different numbers of sample points are shown. The resulting MTFs are very similar, with the 8-sample-point case being slightly less accurate, but the 16- and 32-point cases being virtually the same. If the LSF is symmetrical about some central axis, then the sine terms in the Fourier transform disappear, leaving:

$$M(u) = \left|\int_{-\infty}^{\infty} l(x)\cos(2\pi ux)\, dx\right|$$

Fig. 5 A typical LSF for 8, 16, and 32 sample points with its corresponding MTF

Instead of the LSF, we can also use the PSF p(x, y). In this case, the LSF must first be calculated from the PSF:

$$l(x) = \int_{-\infty}^{\infty} p(x, y)\, dy$$

For the case of a rotationally symmetric PSF, the MTF can be calculated directly:

$$M(u) = \left|\int_0^{\infty} p(r)\, J_0(2\pi ur)\, 2\pi r\, dr\right|$$

where p(r) is the radial version of the PSF and J0 is the zero-order Bessel function. This operation is known as the Hankel transform (Williams and Becklund 1989).


We can also use the convolution rule to cascade the MTFs from component effects into the total MTF of the display. Hence, we can calculate the MTF components M1(u), M2(u), etc., separately and then obtain the final overall MTF:

$$M(u) = M_1(u)\,M_2(u)\,M_3(u)\cdots$$

The cascade rule applies only when each element is incoherently detached from the others. In the case of a coherent optical system, each element would have to be separated by a diffuser. In a multi-element coherent system, the MTF is defined for the entire system and not as a cascade of elements. This is because it is possible to correct for the deficiencies of one element with another within a coherent group of optical elements such as lenses.

The LSF needed for the calculation of the MTF can sometimes simply be obtained by measuring the luminance transition at a sharp luminance step (edge). This is called the Foucault method or knife-edge test. A small razor blade is used in the object plane to realize the step. If we assume that the image f(x) is a step function (0 to 1 transition), then we have:

$$f'(x) = \int_0^{\infty} 1 \cdot l(x - \xi)\, d\xi$$

Differentiating this expression gives:

$$l(x) = \frac{df'(x)}{dx}$$

Hence, by differentiating the step response of the display system, we can obtain the LSF and hence the MTF. The image-forming process in a matrix display, such as a liquid crystal display, an electroluminescent display, or a plasma display, differs from a convolution process. The image on a matrix display consists of luminance samples arranged in a regular array. If the distance between samples is Δx, the sampling frequency is 1/Δx. From a Fourier analysis of the sampled signal, it appears that the frequency spectrum of this function is equal to that of the original continuous spectrum with the addition of repetitions of the original spectrum at distances 1/Δx. Overlap of these spectra will cause interference visible as aliasing or a moiré effect on the display. To avoid this, the original signal must be prefiltered to remove components above 0.5/Δx. If the filtering is correct, then the components below the cutoff frequency of 0.5/Δx will be completely reconstructed. In information theory this is known as Shannon's theorem, and the limit of 0.5/Δx is known as the Nyquist frequency. In practice, the abovementioned reconstruction of the original signal is only possible with electronic circuits. Even in this case, the reconstruction will not be perfect, because of the finite limits of the filters. On a sampled display, the only post-processing that occurs is the spreading of the luminance over the finite size of each pixel. This is equivalent to the convolution of the sampled function with a


block-shaped LSF. The MTF of this pixel convolution process is given by the absolute value of a sinc (sin(x)/x) function, which is the FT of the block function:

$$M(u) = \left|\frac{\sin(\pi\,\Delta x\,u)}{\pi\,\Delta x\,u}\right|$$

assuming that the width of the pixel elements is the same as the sampling distance (Δx). The zero points of this function occur at frequency multiples of 1/Δx, so that the maximum suppression of the repetition spectra is obtained. According to the sampling theorem, no frequency above the Nyquist frequency can be transferred, so the MTF will drop to zero at the Nyquist frequency. The effects of sampling are also seen in the CRT tube in the vertical direction because of the line structure. In order to avoid possible aliasing caused by this sampling, television systems are designed with an over-sampling factor (known as the Kell factor). For most CRTs, a Kell factor of 0.7 is used.
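A one-line check of this expression (an illustrative sketch; the 0.1 mm pitch is an assumed value) can use numpy's normalized sinc, sinc(x) = sin(πx)/(πx):

```python
import numpy as np

def pixel_mtf(u, dx):
    """|sin(pi dx u)/(pi dx u)| MTF of a pixel of width dx."""
    return np.abs(np.sinc(dx * u))          # np.sinc(x) = sin(pi x)/(pi x)

dx = 0.1                                    # pixel pitch in mm (assumed)
print(f"M at Nyquist (0.5/dx): {pixel_mtf(0.5 / dx, dx):.3f}")  # ~0.637
print(f"M at 1/dx:             {pixel_mtf(1.0 / dx, dx):.3f}")  # first zero
```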

Experimental Measurement of the MTF

The above techniques are all for the mathematical analysis of the resolution of the display system. In practice, it is often easier to measure the MTF before interpreting its effects. One of the most direct ways of measuring the MTF is via the FT of progressively increasing sinusoids displayed on the system. This relates directly to the above theory, as we are essentially scanning different spatial frequencies through the display and then measuring the modulation. This is an especially useful technique for nonlinear display systems or where the performance is nonstandard. By scanning through the sinusoidal frequencies, we are evaluating each point on the MTF. The effect of the display on the scanned frequency is measured by means of a Fourier transform (the focal plane of a lens). This is shown in Fig. 6, where a sinusoid and its FT are shown. By scanning the frequency of the sinusoid, it is possible to shift the position of the diffracted spot in the Fourier plane. Even in the case of a binary display, it is possible to measure the MTF by looking at the diffracted first-order spot. The MTF shown in Fig. 7 is an example of scanned MTFs for a 128-pixel matrix display. As expected, the shape of the MTF is a truncated sinc (sin x/x) function. The truncation is due to the fact that it is a matrix display, and the frequency scanning is stopped at the Nyquist frequency to avoid aliasing. With these types of MTF measurement, it is possible to characterize nonlinear binary devices and displays which have nonlinear grayscale responses.

Liquid Crystal Modulation

Liquid crystals (LCs) are unique compounds with the properties of both the liquid and crystalline phases. They exist in mesophases, which have diffuse molecular order and orientation. Phase changes are often initiated by temperature

Fig. 6 Fourier transform of a single-frequency sinusoid

Fig. 7 MTF for a 128 pixel matrix display

(thermotropics) or solvent concentration (lyotropics). Different liquid crystal compounds have different mesophase combinations. Some involve subtle, microscopic changes in order, while others are much more dramatic. One of the commonest liquid crystal mesophases is the nematic phase. This is the least ordered mesophase before the isotropic phase. Here the molecules have only a long-range order and no longitudinal order. This means that the molecules retain a low viscosity, like a liquid, and are prone to flow. This can greatly affect the speed at which nematic LCs can modulate light. The order in the nematic phase arises from the calamitic (rod-like) molecular shape, which means that on average the molecules spend slightly less time spinning about their long axis than they do about their short axis. The molecular shape leads to an optical anisotropy or LC birefringence, with the two axes of the molecule presenting different refractive indices, as shown on the right of Fig. 8. The refractive index along the long axis of the molecules is referred to as the extraordinary index ne and that along the short axis as the ordinary index no. The difference between the two is the birefringence: Δn = ne − no. Along with the optical anisotropy, there is also often a charge imbalance due to the molecular shape, leading to a dielectric anisotropy. The dielectric anisotropy is also linked to the elastic properties of the liquid crystal molecule, which means that we can move the molecules around by applying an electric field across them. This combined

Fig. 8 The optical indicatrix of a typical liquid crystal molecule

with the flow properties means that a liquid crystal molecule can be oriented in any direction with the use of an applied external electric field. This is a very desirable feature as it leads to their ability to perform grayscale modulation of the light.

Out-of-Plane Optical Modulation

The ability to manipulate the liquid crystal position leads to two basic types of optical modulation: out-of-plane and in-plane rotation. The example shown in Fig. 8 shows a liquid crystal molecule at an angle θ to the plane of the cell (i.e., the upper and lower horizontal walls of the device in question). Hence, for a positive dielectric anisotropy material, when an electric field is applied to the upper and lower electrodes, the molecule will rotate about its central point and the angle θ will vary, as shown for the bulk case in Fig. 9. This is an example of out-of-plane rotation, and the refractive indices seen by the light will rotate accordingly. It is important to note that these parameters are bulk parameters based on a statistical average across billions of individual molecules. The bulk ordering of liquid crystals such as nematics is not always inherent in the material beyond domains of molecules a few micrometers in size. The electric field used to reorient the molecules is normally applied by either an active backplane, such as a thin-film transistor (TFT) or CMOS array, or a passive backplane using indium tin oxide (ITO) electrodes (Fig. 9). The combination of the flow allowing the molecules to move when an electric field is applied and the optical anisotropy means that we can effectively rotate the axes of the indicatrix as the molecules move, creating a movable wave plate or optical retarder. This, along with polarizing optics, forms the basis of most liquid crystal intensity and phase modulation characteristics. If we have an optical indicatrix oriented at an angle θ to the plane of the cell as in Fig. 8, then we can calculate the refractive index seen by light passing perpendicular to the cell walls:

$$n(\theta) = \frac{n_o n_e}{\left(n_e^2\sin^2\theta + n_o^2\cos^2\theta\right)^{1/2}}$$

Fig. 9 Out-of-plane bulk reorientation of a liquid crystal with an applied electric field

Fig. 10 Typical optical modulation of an out-of-plane liquid crystal with applied voltage

We can then calculate the retardance Γ of the liquid crystal layer for a given cell thickness d and wavelength λ. The retardance is the phase difference between a wave passing along the short axis and a wave passing through the material oriented at an angle θ:

$$\Gamma = \frac{2\pi d}{\lambda}\left(n(\theta) - n_o\right)$$

We can now use this expression in a Jones matrix representation of the optical LC retarder to get the optical characteristics of the LC material, as in Fig. 10. In the case of nematic LC materials, there is little restriction on the flow properties of the material; hence it is possible to continuously vary Γ through several rotations of π (Fig. 10), provided the cell is thick enough. The main drawback of these materials is that the speed of the modulation is often very slow (tens of milliseconds to seconds).
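The two expressions above combine into a few lines of code. The sketch below is an illustration with assumed material values (no = 1.5, ne = 1.7, a 5 μm cell, λ = 633 nm) evaluating n(θ) and the retardance as the molecules tilt out of the plane:

```python
import numpy as np

def retardance(theta, d=5e-6, lam=633e-9, n_o=1.5, n_e=1.7):
    """Retardance of a tilted uniaxial LC layer; theta measured from the cell plane."""
    n_theta = n_o * n_e / np.sqrt(n_e**2 * np.sin(theta)**2
                                  + n_o**2 * np.cos(theta)**2)
    return 2 * np.pi * d * (n_theta - n_o) / lam

print(f"Gamma(0)    = {retardance(0.0):.1f} rad")        # maximum: light sees n_e
print(f"Gamma(pi/2) = {retardance(np.pi / 2):.1f} rad")  # zero: light sees n_o
```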


The other main limitation of nematic LCs is that they are inherently polarization sensitive, so multiple passes through a cell with wave plate optics are required to make them insensitive (Barten 1995). The effects discussed with nematic LCs so far have all been planar effects, where there is just the simple angle θ with respect to the cell walls. Because the nematic LC molecules are free to rotate into any position, it is possible to make more complex geometries such as twisted structures. In these devices, the molecules follow a twisted or helical path, often rotating through an angle of 90°, as is the case with twisted nematic or TN displays. The twist can be extended by adding suitable chiral dopants to create "supertwist" structures with twists of the order of 270°. The analysis of twisted nematic LC structures is far more complicated and is usually accomplished using commercial software packages or a series of thin-slice approximations across the cell.

In-Plane Optical Modulation

Some liquid crystals have a higher degree of order than nematics; one such class is the smectic materials. Within this class is a group of materials known as chiral smectic C or ferroelectric liquid crystals (FLCs). If these materials are restricted to a cell thickness of 2-5 μm, then the motion of the molecules within the cell is restricted by the surfaces, and the molecules are bound into two stable states on either side of a cone of possible molecular positions. The angle between these two states is defined as the switching angle θ. This is referred to as a surface-stabilized FLC geometry and creates a high degree of ferroelectricity and a large birefringent in-plane electro-optical effect. The penalty is that the molecules are only stable in the two states, and therefore the modulation will be only binary. The advantage of this binary modulation is that it can be very fast (10 μs) and that the stability can lead to the molecules remaining in the two states in what is known as bistable switching. Figure 11 shows the two stable positions of the molecules. The plane that the molecules occupy is the same as the one that forms the physical boundary of the cell itself. The liquid crystal molecules switch through the angle θ in the plane of the cell, as shown in Fig. 11, when the electric field (E) is applied across the cell. The motion of the molecule is very fast, which is attractive for spatial light modulators (SLMs). The motion is only stable in two states, which limits the modulation to binary. The interaction between the light and the liquid crystal molecules depends on the polarization of the light and the orientation of the molecules. The director of the molecules acts like the extraordinary axis of a retardation plate, and the normal to the direction of the molecules represents the ordinary axis. Hence the liquid crystal displays birefringence aligned in the direction of the molecules across the cell. This means that the liquid crystal acts like a switchable wave plate whose fast and slow axes can be in two possible states separated by the angle θ and whose retardation depends on the thickness and birefringence of the material. Using Jones matrix notation, a pixel with retardation Γ at an angle θ can be represented as:

$$\begin{pmatrix} e^{-j\Gamma/2}\cos^2\theta + e^{j\Gamma/2}\sin^2\theta & j\sin\!\left(\frac{\Gamma}{2}\right)\sin(2\theta) \\ j\sin\!\left(\frac{\Gamma}{2}\right)\sin(2\theta) & e^{j\Gamma/2}\cos^2\theta + e^{-j\Gamma/2}\sin^2\theta \end{pmatrix}$$

Fig. 11 In-plane switching of a liquid crystal molecule

Binary Intensity Modulation

If the light is polarized so that it passes through a liquid crystal pixel parallel to the fast axis in one state, then there is no change due to the birefringence, and the light will pass through a polarizer which is also parallel to the fast axis. If the pixel is then switched into state 2, as shown in Fig. 12, the fast axis is rotated by θ, and the light now undergoes some birefringent action. We can use Jones matrices to represent the optical components:

State 1:

$$\begin{pmatrix} V'_x \\ V'_y \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} e^{-j\Gamma/2} & 0 \\ 0 & e^{j\Gamma/2} \end{pmatrix}\begin{pmatrix} 0 \\ V_y \end{pmatrix} = \begin{pmatrix} 0 \\ V_y\, e^{j\Gamma/2} \end{pmatrix}$$

State 2:

$$\begin{pmatrix} V'_x \\ V'_y \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} e^{-j\Gamma/2}\cos^2\theta + e^{j\Gamma/2}\sin^2\theta & j\sin(\Gamma/2)\sin(2\theta) \\ j\sin(\Gamma/2)\sin(2\theta) & e^{j\Gamma/2}\cos^2\theta + e^{-j\Gamma/2}\sin^2\theta \end{pmatrix}\begin{pmatrix} 0 \\ V_y \end{pmatrix} = \begin{pmatrix} 0 \\ V_y\left(e^{j\Gamma/2}\cos^2\theta + e^{-j\Gamma/2}\sin^2\theta\right) \end{pmatrix}$$

If the thickness of the liquid crystal cell d is set so that Γ = π, then the light in the direction of the slow axis will be retarded by 180°. This leads to a rotation of the polarization after the pixel, which is partially blocked by the following polarizer. Maximum contrast ratio will be achieved when state 2 is at 90° to the polarizer, so that the resulting horizontal polarization is blocked out. This will occur when

$$V_y\left(e^{j\pi/2}\cos^2\theta + e^{-j\pi/2}\sin^2\theta\right) = V_y\left(j\cos^2\theta - j\sin^2\theta\right) = 0$$

and the optimum switching angle will therefore be θ = 45°.
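The two states can be checked numerically with the Jones matrices above. The short sketch below is an illustrative verification (not from the original text); it confirms full transmission in state 1 and extinction in state 2 for Γ = π and θ = 45°:

```python
import numpy as np

def retarder(theta, gamma):
    """Jones matrix of a retarder with retardation gamma, axes rotated by theta."""
    c2, s2 = np.cos(theta)**2, np.sin(theta)**2
    off = 1j * np.sin(gamma / 2) * np.sin(2 * theta)
    return np.array([[np.exp(-1j*gamma/2)*c2 + np.exp(1j*gamma/2)*s2, off],
                     [off, np.exp(1j*gamma/2)*c2 + np.exp(-1j*gamma/2)*s2]])

pol_y = np.array([[0, 0], [0, 1]])      # output polarizer passing vertical light
v_in = np.array([0.0, 1.0])             # vertically polarized input

for theta in (0.0, np.pi / 4):          # state 1 and state 2 with Gamma = pi
    v_out = pol_y @ retarder(theta, np.pi) @ v_in
    print(f"theta = {np.degrees(theta):2.0f} deg: I = {np.abs(v_out[1])**2:.3f}")
# theta = 0 deg gives I = 1.000 (bright); theta = 45 deg gives I = 0.000 (dark)
```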

Fig. 12 In-plane binary modulation states

Fig. 13 In-plane binary modulation states

Binary Phase Modulation

If the light is polarized so that its direction bisects the switching angle, and an analyzer (polarizer) is placed after the pixel at 90° to the input light, then phase modulation is possible. If we start with vertically polarized light, then the liquid crystal pixel extraordinary-axis positions must bisect the vertical axis and will be oriented at angles of θ/2 and −θ/2, respectively, for each state, as shown in Fig. 13. Once again we can use Jones matrices to express the system:

State 1:

$$\begin{pmatrix} V'_x \\ V'_y \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} e^{-j\Gamma/2}\cos^2\frac{\theta}{2} + e^{j\Gamma/2}\sin^2\frac{\theta}{2} & j\sin(\Gamma/2)\sin\theta \\ j\sin(\Gamma/2)\sin\theta & e^{j\Gamma/2}\cos^2\frac{\theta}{2} + e^{-j\Gamma/2}\sin^2\frac{\theta}{2} \end{pmatrix}\begin{pmatrix} 0 \\ V_y \end{pmatrix} = \begin{pmatrix} V_y\, j\sin(\Gamma/2)\sin\theta \\ 0 \end{pmatrix}$$

State 2:

$$\begin{pmatrix} V'_x \\ V'_y \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} e^{-j\Gamma/2}\cos^2\frac{\theta}{2} + e^{j\Gamma/2}\sin^2\frac{\theta}{2} & -j\sin(\Gamma/2)\sin\theta \\ -j\sin(\Gamma/2)\sin\theta & e^{j\Gamma/2}\cos^2\frac{\theta}{2} + e^{-j\Gamma/2}\sin^2\frac{\theta}{2} \end{pmatrix}\begin{pmatrix} 0 \\ V_y \end{pmatrix} = \begin{pmatrix} -V_y\, j\sin(\Gamma/2)\sin\theta \\ 0 \end{pmatrix}$$

From these two expressions, we can see that the difference between the two states is just the minus sign, which means that the light has been modulated by 180° (π phase modulation). Moreover, the phase modulation is independent of the switching angle θ and the retardation Γ. These parameters only affect the loss in transmission through the pixel, which can be obtained by squaring the above expressions:

$$T = V_y^2\,\sin^2\theta\,\sin^2\!\left(\frac{\Gamma}{2}\right)$$

Hence maximum transmission (and therefore minimum loss) occurs when Γ = π and θ = π/2.

Diffraction of Light

Diffraction is the principle that we use to work out what happens when the limits of geometrical optics are exceeded, and it is often a critical part of understanding the modulating properties of an optical system. The theory that governs resolution, as defined in sections "▶ The Point Spread Function," "▶ Modulation Transfer Function and Resolution," "▶ The Classical MTF Approach," and "▶ Experimental Measurement of the MTF," all stems from diffraction analysis of the optical system. The effect is most noticeable when a series of rays (or a wavefront) is incident upon an obstruction, aperture, or edge. Geometrical optics tells us that once the rays have passed the edge, those rays that were blocked stop propagating and those that passed continue. If we look further down the optical path, a perfectly clear shadow of the edge would be maintained by the rays still propagating under this model. This is not the case in reality, as the light is seen to bend outward around the edge, eventually forming a series of rings or fringes. The process of forming these fringes is described by diffraction theory and is based on the light propagating as waves (like ripples on a pond). Let us assume we have an arbitrary aperture function A(x, y) in the plane S, with coordinates [x, y], as shown in Fig. 14. The light passing through this aperture will be diffracted at its edges, and the exact form of this pattern can be calculated (Wilson 1995). We want to calculate the field distribution at an arbitrary point P, a distance R from the aperture.

Fig. 14 Aperture and diffraction coordinate system

If we consider an arbitrarily small element of the aperture, dS, we can model this as a point source emitting spherical "Huygens" wavelets with an amplitude of A(x, y) dS. The wavelet acts as a radiating point source, so we can calculate its field at the point P, a distance r from dS. The point source dS can be considered to radiate a spherical wavefront of frequency ω. The total field distribution at P is evaluated by superposition (summation) of all the wavelets across the aperture. The process of interference of these spherical wavelets is the fundamental principle behind diffraction and is based on the Huygens-Fresnel approximation. In order to analyze the propagating wavelets, a series of approximations and assumptions must be made. If we consider only the part of the wavelets which propagates in the forward (+z) direction and is contained in a cone of small angles away from the z axis, then we can evaluate the change in field dE at the point P due to dS. As the wavelet dS acts as a point source, we can say that the power radiated is proportional to 1/r² (spherical wavefront); hence, the field dE will be proportional to 1/r. Thus for a propagating wave of frequency ω and wave number k, defined as k = 2π/λ, we have the complex wave:

$$dE = \frac{A(x, y)}{r}\, e^{i\omega t}\, e^{-ikr}\, dS$$

Now, we need to change coordinates to the plane containing the point P, which are defined as [α, β] using the relationship from Pythagoras:


$$r = R\sqrt{1 - \frac{2\alpha x + 2\beta y}{R^2} + \frac{x^2 + y^2}{R^2}}$$

The final full expression in terms of x and y (dS = dx dy) for dE will now be:

$$dE = \frac{A(x, y)\, e^{i\omega t}\, e^{-ikR\sqrt{1 - \frac{2\alpha x + 2\beta y}{R^2} + \frac{x^2 + y^2}{R^2}}}}{R\sqrt{1 - \frac{2\alpha x + 2\beta y}{R^2} + \frac{x^2 + y^2}{R^2}}}\; dx\, dy$$

Such an expression can only be solved directly for a few specific aperture functions. To account for an arbitrary aperture, we must restrict the regions in which we evaluate the diffracted pattern. If the point P is reasonably close to the z axis relative to the distance R, and the aperture A(x, y) is small compared with the distance R, then the lower (amplitude) part of the equation for dE can be assumed to be almost constant, and to all intents and purposes r = R:

$$dE = \frac{A(x, y)}{R}\, e^{i\omega t}\, e^{-ikR\sqrt{1 - \frac{2\alpha x + 2\beta y}{R^2} + \frac{x^2 + y^2}{R^2}}}\; dx\, dy$$

The corresponding expression in the exponential term is not so simple. It cannot be considered constant, as small variations are amplified by the exponential. To simplify this term, we must consider only the Fresnel region, where:

$$R^2 \gg x^2 + y^2$$

and so the final term in the square root ((x² + y²)/R²) can be neglected. To simplify further, we use the binomial expansion:

$$\sqrt{1 - d} = 1 - \frac{d}{2} - \frac{d^2}{8} - \cdots$$

and keep the first two terms only, as this expansion converges rapidly; this gives the Fraunhofer region. The simplified version of the field dE can now be expressed as:

$$dE = \frac{A(x, y)}{R}\, e^{i(\omega t - kR)}\, e^{-ik(\alpha x + \beta y)/R}\; dx\, dy$$

The acceptability of the approximation depends on a sufficiently large value of R to obtain the far-field diffraction pattern of the aperture. One useful guideline proposed by Goodman (1996) is to assume that the far-field pattern occurs when:

$$R \gg \frac{k\left(x_{max}^2 + y_{max}^2\right)}{2}$$
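To make the criterion concrete, here is a short Python sketch evaluating it for the square-aperture example quoted later in the text (1 mm extent at 633 nm); treating x_max = y_max = 1 mm is an assumption made here to match the 10 m figure given there:

```python
import numpy as np

# Goodman's far-field criterion: R >> k (x_max^2 + y_max^2) / 2.
wavelength = 633e-9          # He-Ne wavelength (m)
x_max = y_max = 1e-3         # aperture dimensions (m), assumed to match the text's example
k = 2 * np.pi / wavelength   # wave number (rad/m)

R_far = k * (x_max**2 + y_max**2) / 2
print(f"R >> {R_far:.1f} m")   # about 10 m, matching the distance quoted in the text
```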


Fig. 15 Diffraction regions for a square aperture

where xmax and ymax are the maximum dimensions of the aperture A(x, y). The regions of the approximation are defined such that in the far-field or Fraunhofer region the approximations are accurate; hence, the field distribution E(x, y) only changes in size with increasing z, rather than in structure, as shown in Fig. 15. In the region where the approximation is only reasonably accurate, we are in the Fresnel region. Before the Fresnel region, the evaluation of E is extremely difficult; this is called the near-field region. The exact boundary of the Fresnel region will depend on the acceptable accuracy. The total effect of the dS wavelets can be integrated over dE to get an expression for the far-field or Fraunhofer diffraction pattern:

$$E(\alpha, \beta) = \frac{1}{R}\, e^{i(\omega t - kR)} \iint_{Aperture} A(x, y)\, e^{-ik(\alpha x + \beta y)/R}\; dx\, dy$$

The initial exponential term e^{i(ωt−kR)} refers the wave to an origin at t = 0, but we are only interested in the relative values at points P with respect to each other, so it is safe to set this term to 1. Thus, our final expression for the far-field diffraction pattern becomes:

$$E(\alpha, \beta) = \iint_{A} A(x, y)\, e^{-ik(\alpha x + \beta y)/R}\; dx\, dy$$

which is recognizable as the two-dimensional Fourier transform of the aperture function A(x, y):

$$\text{Fraunhofer region} = \text{far-field pattern} = \mathrm{FT}\{\text{aperture function}\}$$


The final step is to remove the scaling effect of R in the equation, as it does not affect the structure of the pattern, only its size. The coordinates [α, β] are absolute and are scaled by the factor R. For this reason, we normalize the coordinates and define the Fourier transform of the aperture in terms of its spatial frequency components [u, v] (not to be confused with u and v in geometrical optics):

$$u = \frac{k\alpha}{2\pi R}, \qquad v = \frac{k\beta}{2\pi R}$$

Hence the final relationship is:

$$E(u, v) = \iint_{A} A(x, y)\, e^{-2\pi i(ux + vy)}\; dx\, dy$$

and inversely, we can calculate the aperture from the far-field pattern:

$$A(x, y) = \iint E(u, v)\, e^{2\pi i(ux + vy)}\; du\, dv$$

We define the Fourier transform pair as:

$$F(u, v) = \iint_{-\infty}^{\infty} f(x, y)\, e^{-2\pi i(ux + vy)}\; dx\, dy$$

$$f(x, y) = \iint_{-\infty}^{\infty} F(u, v)\, e^{2\pi i(ux + vy)}\; du\, dv$$
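Because the far-field pattern is a Fourier transform of the aperture, it can be explored numerically with a sampled 2-D FFT. The minimal Python sketch below (the grid size and sampling are illustrative choices) computes the Fraunhofer intensity of the 1 mm square aperture discussed next; the central cut follows the expected sinc² profile:

```python
import numpy as np

# Far-field (Fraunhofer) pattern of an aperture via a sampled 2-D Fourier
# transform: E(u, v) = FT{A(x, y)}.
N = 512                                   # samples per axis (illustrative)
x = np.linspace(-2e-3, 2e-3, N)           # 4 mm field of view
X, Y = np.meshgrid(x, x)
A = ((np.abs(X) < 0.5e-3) & (np.abs(Y) < 0.5e-3)).astype(float)  # 1 mm square aperture

E = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(A)))   # sampled E(u, v)
I = np.abs(E)**2                                        # observable intensity
I /= I.max()
# The central row is the familiar sinc^2 profile of a square aperture.
print(I[N // 2, N // 2 : N // 2 + 8].round(4))
```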

The coordinates [u, v] in the Fourier plane are now defined as the spatial frequencies. The far-field pattern of a square aperture of width 1 mm, at a wavelength of 633 nm, can only be accurately measured 10 m or more away from the aperture. Such a far-field distance is clearly difficult to achieve in practical terms, so a means of shortening the distance is needed. If a positive lens is positioned directly after the aperture, the far-field pattern appears in the focal plane of the lens, as shown in Fig. 16. The positive lens performs a Fourier transform of the aperture placed behind it. If we consider the aperture A(x, y) placed just before a positive lens of focal length f, then we can calculate the field just after the lens, as described by Goodman (1996), using the paraxial approximation:

$$A'(x, y) = A(x, y)\, e^{-\frac{ik}{2f}\left(x^2 + y^2\right)}$$


Fig. 16 Diffraction through a positive lens

The application of Snell's law at the spherical lens/air boundaries shows that the lens converts plane waves incident upon it into spherical waves converging to the focal plane. For this reason, the diffraction to the far-field pattern now occurs in the focal plane of the lens, as the Fresnel approximation only holds true in this plane. The effect of this is to shift the far-field or Fraunhofer region from R to the principal focal length f. The principles of Fresnel diffraction can be applied to A'(x, y) as if it were the aperture. The (x² + y²) exponential term which was originally ignored due to the large R is now combined with the corresponding term from A'(x, y) and cannot be removed, as the focal distance from the aperture is not large enough. The expression can now be analyzed by a combination of Fresnel diffraction and paraxial ray approximations, as outlined in Goodman (1996). The final result for the diffracted aperture A(x, y) through the lens is:

$$E(\alpha, \beta) = e^{\frac{ik}{2f}\left(\alpha^2 + \beta^2\right)} \iint_{A} A(x, y)\, e^{-\frac{ik}{f}(\alpha x + \beta y)}\; dx\, dy$$

Once again we can translate the equation into spatial frequency (u, v) coordinates, creating the Fourier transform relationship shown above for the far-field region, with an added phase distortion due to the compression of R down to the focal plane of a positive lens. It looks as if we are getting something for nothing, but this is not the case, as the lens introduces the quadratic phase distortion term in front of the transform. This can lead to a smearing of the Fourier transform and must be corrected for in compressed optical systems. There are several methods used to correct the phase, such as adding further lenses close to the focal plane or a compensating hologram to counter the phase distortion. If the aperture is placed a distance d behind the lens, then there will be a corresponding change in the phase distortion term of the Fourier transform:

$$E(\alpha, \beta) = e^{\frac{ik}{2f}\left(1 - \frac{d}{f}\right)\left(\alpha^2 + \beta^2\right)} \iint_{A} A(x, y)\, e^{-\frac{ik}{f}(\alpha x + \beta y)}\; dx\, dy$$


Fig. 17 Two possible 4f optical systems

From this equation we can see another way of removing the phase distortion. If the distance is set so that d = f, then the phase distortion term becomes unity, and we have the full Fourier transform scaled by the factor of the focal length, f. This is a very important feature used in the design of optical systems and is the principle behind the 4f system shown in Fig. 17. In a 4f system, there are two identical lenses separated by a distance 2f. This forms the basis of a low-distortion optical system. In both of the examples in Fig. 17, the distortions are minimized. In the top system, the aperture is transformed in the focal plane of the first lens and then re-imaged by the inverse transform of the second lens. The image shown at the aperture (a give way sign) appears at the output rotated by 180°. The reproduced image would be perfect if the two lenses were ideal; however, there are lens imperfections such as chromatic and spherical aberrations. Some of these distortions are reversible and cancel in the system, but some are not, leading to a slightly distorted image. The lower 4f system is in fact the same configuration traced for a hypothetical point on the axis. This model demonstrates how the wavefront from the input can be conserved and translated to the output and is an essential concept in optical systems such as free-space optical interconnects and telecommunications (Wilson 1995; Goodman 1996).
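The 180° rotation through an ideal 4f system can be demonstrated with two successive discrete Fourier transforms, since transforming twice inverts the coordinates. A toy Python sketch, ignoring aberrations, finite lens apertures, and physical scaling:

```python
import numpy as np

# Idealized 4f system: two successive Fourier transforms return the input
# image with inverted coordinates (a 180-degree rotation).
img = np.zeros((8, 8))
img[1, 2] = 1.0                         # a single off-axis "pixel"

F1 = np.fft.fft2(img)                   # focal plane of the first lens
out = np.fft.fft2(F1) / img.size        # output plane of the second lens

# The bright pixel reappears at the inverted coordinates (-1, -2) mod 8.
print(np.unravel_index(np.argmax(np.abs(out)), out.shape))   # -> (7, 6)
```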

Summary

In this chapter we have covered the most relevant issues concerning the design, development, and understanding of a successful display technology. Optical modulation is a very important aspect of any display, and a fundamental principle is that of resolution and its definition in the context of the display technology. Whether the limiting factors are pixels, optics, or materials, the theory of modulation transfer


and point spread functions is essential in assessing the resolution properties of any display technology. Optical modulation characteristics are especially important in birefringence-based technologies such as liquid crystal displays, so a basic theory for analyzing such optics is described. The analysis stems from the basic birefringence of these materials and describes their optical properties using Jones matrices. Finally, as a vital element in any optical analysis of modulation and its effects, a brief description of optical diffraction theory is given, as this forms the basis for most modulation and resolution analysis. Diffraction is often a limitation but is now becoming a powerful technique for display generation in its own right (see http://www.lightblueoptics.com).

References

Barten PJG (1995) Display image quality evaluation. In: SID applications seminars, Florida, 23 May 1995, pp A-5/1-40
Goodman JW (1996) Introduction to Fourier optics, 2nd edn. McGraw-Hill, New York
Hecht E, Zajac A (1987) Optics, 2nd edn. Addison Wesley, Reading
Smith FG, King TA (2000) Optics and photonics. Wiley, New York
Smith WJ (2000) Modern optical engineering, 3rd edn. SPIE/McGraw-Hill, New York
Williams CS, Becklund OA (1989) Introduction to the optical transfer function. Wiley-Interscience, New York
Wilson RG (1995) Fourier series and optical transform techniques in contemporary optics. Wiley, New York


Anatomy of the Eye

Christine Garhart(a)* and Vasudevan Lakshminarayanan(b)

(a) Biology Department, University of Missouri-St. Louis, St. Louis, MO, USA
(b) Departments of Physics and Electrical Engineering, School of Optometry, University of Waterloo, Waterloo, ON, Canada
*Email: [email protected]

Abstract

This chapter gives a basic introduction to the anatomy of the eye. This background is critical to understanding the physiology of the eye and in particular aspects of visual perception which are discussed elsewhere in this handbook.

List of Abbreviations

GRIN  Gradient index
ILM   Inner limiting membrane
INL   Inner nuclear layer
IPL   Inner plexiform layer
OCT   Optical coherence tomography
OLM   Outer limiting membrane
ONL   Outer nuclear layer
PEs   Pigment epitheliums

Introduction

The human eye is the basic organ of sight. The mechanism of sight and visual perception involves a set of structures, each of which has a definite function. The eye is housed in a protective framework of bones and connective tissue called the orbit. The eyelids contain glands that produce the tear film layer over the anterior (front) surface of the eye, as well as protecting that surface. The muscles that are attached to the eyeball and control the movement of the eyes are called extraocular muscles. These muscles are coordinated between the two eyes, a necessary condition for binocular vision. A complex network of blood vessels and neurons provides nutrients as well as sensory and motor innervation to the eye. The crystalline lens of the eye plays a major role in focusing the light rays through a process called accommodation, controlled via the ciliary muscles. The retina, the innermost of the various layers, contains the light-absorbing rod and cone photoreceptors, as well as a neural network to process and transmit the electrical signals via the optic nerve and the lateral geniculate body to the visual cortex in the brain. Figure 1 illustrates the main features of the human eye. It also shows the eye in its bony orbit. Additional details on the optical elements and functional and physiological properties of the eye can be found, for example, in the books by Grosvenor (1989), Oyster (1990), Remington (2005), and Hart (1992). A detailed description of the optics of the eye can be found in the book by Atchison and Smith (2000).



Fig. 1 (a) The human eye in its bony orbit, showing the extraocular muscles (From http://www.99main.com/charlief/Blindness.htm). (b) Anatomic cross section of the eye (From Walls 1942)

Basic Dimensions of the Eye

The eye is a spheroid structure that rests in the orbit on the frontal surface of the skull. The dimensions of the human eye are reasonably constant in adults, varying by only a millimeter or so. The sagittal (vertical) diameter is about 24 mm and is usually less than the transverse diameter, which is about 24.5-25 mm. The adult human eye weighs approximately 7.5 g.

Eye Formation and Growth

Eye formation begins during the end of the third week of development, when outgrowths of brain neural tissue, called the optic vesicles, form at the sides of the forebrain region. The major structures of the eye


are initially formed by the fifth month of fetal development. The eye structures enlarge, mature, and form increasingly complex neural networks prenatally. At birth, infant eyes are about two-thirds the size of an adult eye. During the first month of life, infants lack complete retinal development, with consequent effects on the development of visual functions (e.g., visual acuity, contrast sensitivity, motion perception). From the second year until puberty, eye growth progressively slows. It should be noted that infants are born hyperopic (the eye's optical power is too low for its axial length), and a process of emmetropization occurs. The mechanism and development of emmetropia (as well as myopia development in children) is a major area of current research (e.g., Charman and Radhakrishnan 2010; Pang et al. 2006).

Layers of the Eye

The thick wall of the eye contains three layers: the sclera, the choroid, and the retina. The sclera, the white part of the eye, is the outermost layer. This is a thick layer and gives structural stability and shape to the eye. A delicate membrane called the conjunctiva covers the visible portion of the eye. A study (Pang et al. 2006) of 50 eye-bank eyes found that the mean scleral thickness ± SD was 0.53 ± 0.14 mm at the corneoscleral limbus (where the cornea and sclera meet), significantly decreasing to 0.39 ± 0.17 mm near the equator and increasing to 0.9-1.0 mm near the optic nerve. The mean total scleral surface area was found to be about 17.0 cm². The sclera is optically opaque, and the transparent cornea allows light rays to enter the eye.

Underneath the sclera is the second layer of tissue, the choroid, composed of a dense pigment and blood vessels that nourish the tissues. This vascular layer is also called the uvea. The choroid is a dense vascular network and supplies nutrients to the retinal layers; it is also thought to play a role in eye growth. Recent studies show that the choroid is about 426 μm thick and that this thickness undergoes diurnal fluctuations (Brown et al. 2009). Optical coherence tomography (OCT) results reveal that choroidal thickness decreases with increasing axial length of the eye (Esmaeelpour et al. 2010). Near the center of the visible portion of the eye, the choroid layer forms the ciliary body, which contains the muscles used to change the shape of the lens (accommodation). The ciliary body in turn merges with the iris, a diaphragm that regulates the size of the pupil. The iris is the area of the eye where the pigmentation of the choroid layer, usually brown or blue, is visible because it is not covered by the sclera. The pupil is the round opening in the center of the iris; it is dilated and contracted by muscular action of the iris, thus regulating the amount of light that enters the eye, and is therefore like the aperture stop of the optical system of the eye. The change in pupil size not only controls the amount of light incident on the retina but can also affect the retinal image quality through diffraction and depth of focus. The diameter of the pupil can change because of factors such as illumination, age, accommodation, and drugs.

Behind the iris is the lens, a transparent, elastic, but solid ellipsoid body that focuses the light on the retina, the third and innermost layer of tissue. Accommodation is the process by which the eye lens changes its shape (and increases its power) to bring a near object to focus. The mechanism of accommodation is discussed, for example, in Muller and Strobel (2007), Charman (2008), and Koretz (2000). The retina is a network of nerve cells, notably the rods and cones, and nerve fibers that fan out over the choroid from the optic nerve as it enters the rear of the eyeball from the brain. Unlike the two outer layers of the eye, the retina does not extend to the front of the eyeball. The retina will be discussed in greater detail later in this chapter.



Fig. 2 The extraocular muscles and their actions (Reprinted from Simon and Calhoun 1998): lateral rectus (abduction), medial rectus (adduction), superior rectus (elevation), inferior rectus (depression), inferior oblique (excycloduction), superior oblique (incycloduction)

Accessory Structures

These include the lachrymal gland and its ducts in the upper lid, which bathe the eye with tears. The tear film layer is about a quarter of a wavelength thick and keeps the cornea moist and clean. The drainage ducts carry the excess moisture to the interior of the nose. The eye is protected from dust and dirt by the eyelashes, eyelids, and eyebrows. In addition, there are six muscles which extend from the eye socket to the eyeball, enabling it to move in various directions. The extraocular muscles and their actions are shown in Fig. 2. There are two types of movement: conjugate (both eyes move in the same direction) and disjunctive (the eyes move in opposite directions). There is an antagonist-agonist reciprocal innervation for the eye muscles. More details on oculomotor mechanisms and characteristics can be found in the book by Ciuffreda and Tannen (1995).

The Humors of the Eye

The space between the cornea and iris, known as the anterior chamber, is filled with a thin watery liquid, optically clear and slightly alkaline, called the aqueous humor. The aqueous humor is secreted into the chamber by the ciliary body and is drained by the trabecular meshwork. The aqueous humor inflates the globe of the eye, maintains the intraocular pressure, and provides nutrients to the avascular structures of the eye, namely, the posterior cornea, lens, etc. The refractive index of the aqueous humor is taken to be about 1.336 (for the sodium D line), and the anterior chamber has a volume of about 151 mm³ (Behndig and Markstrom 2007).



The space between the back of the lens and retina, called the posterior chamber, is filled with vitreous humor, a jellylike substance. The vitreous is colorless, transparent, and gelatinous. The viscosity of the material is about two to four times that of pure water. It too has a refractive index of about 1.336. The vitreous in contact with the retina helps to keep it in place by pressing against the choroid. In addition, unlike the aqueous, the vitreous is stagnant – it does not get replaced. Therefore, if blood, cells, or other debris get into the vitreous, they will cast shadows, diffraction effects, etc., on the retina and produce floaters. The depth of the posterior chamber is approximately 16.03 mm.

The Cornea

The cornea is the major structure that optically aids in retinal image formation. If the unaccommodated eye has a power of about 60 D (i.e., when the eye is looking at an object at optical infinity), then the cornea contributes roughly 42 D of this effective power. The cornea is the first refractive surface that light comes into contact with. The cornea has an anterior radius of curvature of about 7.8 mm. It is roughly 11.5 mm in diameter, has a thickness of about 0.5-0.6 mm in the center and about 0.8 mm in the periphery, and makes up about one-sixth of the globe. It is slightly raised from the sclera at the limbus. Even though a single anterior radius of curvature is given, in reality the cornea is spherical only in the central 1-3 mm zone. In the paracentral zone, from approximately 3-4 mm out to 7-8 mm, it is approximately a prolate spheroid; in the periphery, with an outer zone diameter of about 11 mm, there is the greatest asphericity and flattening. At the limbus, the cornea steepens. The front surface of the cornea can be roughly modeled as an ellipsoid, with an eccentricity of about 0.6-0.8 (Applegate and Howland 1995). Because of this asphericity, the cornea minimizes spherical aberration and coma in the retinal image. Young's modulus of elasticity of the cornea is 0.45-1.0 MPa, and the Poisson ratio is about 0.49. There are five layers in the human cornea. The cornea is completely transparent and hence has no blood vessels; there is also a systematic arrangement of collagen fibrils in a lattice formation. It gets its oxygen through direct diffusion from the air. The refractive index of the cornea is approximately 1.376. Because the change in refractive index between the posterior cornea and the aqueous is small, the posterior surface contributes only about 6 D to the overall refraction.

The Crystalline Lens

The lens, which is part of the anterior segment of the eye, lies behind the iris. The lens is suspended in place by the zonular fibers, which attach to the lens near its equatorial line and connect the lens to the ciliary body. Posterior to the lens is the vitreous. The lens contributes about 18 D to the overall effective power of the eye. The lens has an ellipsoid, biconvex shape; the anterior surface is less curved than the posterior. In the adult, the lens is typically 10 mm in diameter and has an axial thickness of about 4 mm, though it is important to note that the size and shape can change due to accommodation and because the lens continues to grow throughout a person's lifetime. During accommodation, for near objects, the ciliary muscle contracts, the zonule fibers loosen and relax, and the lens thickens, resulting in a rounder shape and thus increased refractive power. Changing focus to an object at a greater distance requires the relaxation of the ciliary muscle, which in turn increases the tension on the zonules, thereby flattening the lens and decreasing its refractive power. It should be noted that with age this ability to accommodate decreases, resulting in a condition called presbyopia, which typically affects people in their 40s. This decrease of accommodation with age is described by Donders' curve (Stark et al. 1985). The refractive index of the lens varies from approximately 1.406 in the central layers down to 1.386 in the less dense cortex of the lens.


Fig. 3 The structure of the retina (Reprinted from http://www.catalase.com/retina.gif)

The lens has a gradient index nature (Pierscionek 2010), which helps to reduce various optical aberrations. The lens grows throughout life; as a result, a gradient index (GRIN) structure is formed, with the highest index in the lens nucleus and the lowest index in the capsule of the lens. Even though the lens is completely transparent, it is made up of three layers: the lens capsule, the lens epithelium, and the lens fibers. The lens capsule completely covers the lens and contributes to a higher curvature on the anterior side than on the posterior side. The lens fibers form the majority of the lens material. Opacities in the lens are called cataracts; if they are large enough to interfere with vision, they are surgically removed and an artificial intraocular lens is substituted.

The Retina

The retina is the fundamental sensory layer of the eye. A cross section of the retina is shown in Fig. 3; note that light comes in from below in the diagram. The retina contains more than 120 million photoreceptors, both rods and cones. These photoreceptors absorb visible light and convert the absorbed light into nerve impulses that are sent to the brain via the optic nerve. The central retina contains the macula, a specialized area for perceiving fine detail and color. The center of the macula is called the fovea. The fovea contains only cone photoreceptors and is completely devoid of rods. The density of cones decreases toward the periphery of the retina (Fig. 4; Osterberg 1935; Curcio et al. 1990). When a photoreceptor absorbs a photon, a series of photochemical events results in a hyperpolarization of the photoreceptor cell. This electrical signal is transmitted down through the cell layers until it reaches the optic nerve head (optic disk), a region devoid of photoreceptors; the region of the optic nerve head is called the blind spot. The optic nerve is made up of about 1.2 million nerve fibers that collect information from the eye and send it to the visual cortex of the brain (Rodieck 1998; Dowling 1987). The human retina is approximately 0.2 mm thick and has an area of approximately 1,100 mm² (about the size of a silver dollar). The various layers of the retina are as follows:

1. Inner limiting membrane (ILM) is the boundary between the vitreous humor in the posterior chamber and the retina itself.

2. Ganglion cell layer comprises the cell bodies and axons of ganglion cells.


Fig. 4 Distribution of rods and cones in the retina (Data from Osterberg 1935)

3. Inner plexiform layer (IPL) contains the synapses made between bipolar, amacrine, and ganglion cells. The thickness of this layer varies considerably across species: "simpler" organisms (such as frogs, pigeons, and squirrels) possess thicker IPLs than "higher" organisms like primates. The thicker IPL indicates that these retinas perform more peripheral and specialized image processing.

4. Inner nuclear layer (INL) contains bipolar, horizontal, and amacrine cell bodies.

5. Outer plexiform layer (OPL) contains bipolar cell, horizontal cell, and receptor synapses.

6. Outer nuclear layer (ONL) contains the nuclei of the photoreceptors.

7. Outer limiting membrane (OLM) interfaces with the base of the inner segments of the photoreceptors.

8. Photoreceptor layer contains the inner and outer segments of rod and cone photoreceptors. The photoreceptors contain the opsins (the various proteins) and the chromophores (the photon-catching molecules, consisting of retinal, an aldehyde of vitamin A). The absorption of a photon by the chromophore results in a conformational change of the chromophore, which in turn triggers a photochemical cascade. The isomerization of 11-cis retinal to all-trans retinal begins the process of phototransduction. The chain of events is that the isomerized photopigment activates a molecule called transducin, which activates an enzyme called phosphodiesterase. Phosphodiesterase, in turn, breaks cGMP into its inactive form, which causes Na+ channels (which are open in the resting state) to close. Closing the Na+ channels hyperpolarizes the neuron (unlike in other neuronal systems). Light stimulation thus causes fewer transmitters to be released at the synapse. The hyperpolarization of the outer segment spreads to the inner segment by electrotonic conduction. Since receptors are small, the receptor potential is still large at the axon terminal in the inner segment. Thus, most retinal neurons transmit information using only graded potentials; some amacrine cells and all ganglion cells use action potentials. It is of interest to note that photoreceptors act as classical fiber optic devices and guide light to the sites of absorption (Lakshminarayanan and Enoch 2010). Photoreceptors are found to be oriented toward the center of the exit pupil of the eye (Enoch and Lakshminarayanan 1991).

9. Pigment epitheliums (PEs) are darkly pigmented cells which absorb light not captured by the photoreceptors, thus reducing scattering; they also play a role in "trimming" photoreceptors, which follows a diurnal cycle. Diurnal species (active in bright-light environments) typically possess dark PEs; nocturnal species (active in dim-light environments) possess an adaptation called a tapetum.


Fig. 5 Rod and cone sensitivity curves

The tapetum is a mirrorlike layer behind the photoreceptors which reflects photons not captured by the photoreceptors back out of the eye, thus giving the receptors a "second chance" to capture them. Sensitivity to light in these animals is thus increased by approximately twofold. The dominant wavelength of light reflected by the tapetum is usually close to the absorbance peak of rhodopsin (the photopigment contained in rods). People lack a tapetum lucidum; shining light into the eye while the pupil is dilated illuminates the blood-rich choroid behind the retina, hence the red eye seen in flash photos.

There are three types of cones, giving rise to trichromatic vision. The first responds maximally to light of long wavelengths, peaking in the yellow region (564-580 nm); this type is designated L for long-wavelength-sensitive cones, or red cones. The second type responds most to light of medium wavelength, peaking at green (534-545 nm), and is known as M cones or green cones. The third type responds most to short-wavelength light, of a violet color, the S cones (or blue cones), with a peak of absorption at 420-440 nm (see Fig. 5). The packing arrangements of these three cone types were recently studied using adaptive optics techniques (Roorda and Williams 1999). The packing arrangements are important because of sampling issues and aliasing that can occur in human vision (Lakshminarayanan and Nygaard 1992). The absence of one (or more) of these cone types leads to various color deficiencies (see chapter "▶ Color Vision Deficiencies"). The rods, on the other hand, are more sensitive at low light levels and peak around 495 nm. The sensitivity curves for rods and cones are shown in Fig. 5.

As noted previously, when light falls on a receptor, it sends a proportional response synaptically to bipolar cells, which in turn signal the retinal ganglion cells. The receptors are also "cross-linked" by horizontal cells and amacrine cells, which modify the synaptic signal before the ganglion cells. Rod and cone signals are intermixed and combined. Although all are nerve cells, only the retinal ganglion cells and a few amacrine cells create action potentials. Although there are more than 120 million photoreceptors, there are only about 1.2 million fibers in the optic nerve; a large amount of preprocessing is performed within the retina. The retina spatially encodes (compresses) the image to fit the limited capacity of the optic nerve, encoding the incoming images in a suitable manner. These operations are carried out by the center-surround receptive field structures as implemented by the bipolar



and ganglion cells. These center-surround structures are functional, not anatomical, and they encode the information by performing edge detection and other tasks. The fovea produces the most accurate information. Despite occupying about 0.01 % of the visual field (less than 2° of visual angle), about 10 % of the axons in the optic nerve are devoted to the fovea; there is an almost 1:1 connection between each cone cell in the fovea and a ganglion cell. As a result, there is great spatial resolution in the fovea. However, in the periphery there is a multiplexing of approximately ten rod cells to one ganglion cell, and hence resolution falls off considerably (see chapter "▶ Visual Acuity"). The information capacity of the retina is estimated at 500,000 bits per second without color, or around 600,000 bits per second with color coding.

Summary

In this chapter, a brief summary of ocular anatomy is given. Emphasis has been given to topics dealing with visual optics and retinal function.

Further Reading

Applegate RA, Howland HC (1995) Non-invasive measurement of corneal topography. IEEE Eng Med Biol 14:30-42
Atchison DA, Smith G (2000) Optics of the human eye. Butterworth Heinemann, St. Louis
Behndig A, Markstrom K (2007) Determination of aqueous humor volume by 3-D mapping of the anterior chamber. Ophthalmic Res 37:13-16
Brown JS, Flitcroft D, Ying G, Frances EL et al (2009) In vivo human choroidal thickness measurements: evidence for diurnal fluctuations. Invest Ophthalmol Vis Sci 50:5-12
Chalupa L, Werner J (2000) The visual neurosciences, vol 2. MIT Press, Cambridge, MA
Charman WN (2008) The eye in focus: accommodation and presbyopia. Clin Exp Optom 91:207-225
Charman WN, Radhakrishnan H (2010) Peripheral refraction and the development of refractive error: a review. Ophthalmic Physiol Opt 30:321-338
Ciuffreda K, Tannen B (1995) Eye movement basics for the clinician. C.V. Mosby, St. Louis
Curcio CA, Sloan KR, Kalina RE, Hendrickson AE (1990) Human photoreceptor topography. J Comp Neurol 292:497-523
Davson H (1972) The physiology of the eye, 3rd edn. Academic, New York
Dowling J (1987) The retina: an approachable part of the brain. Belknap, Cambridge, MA
Enoch JM, Lakshminarayanan V (1991) Retinal fiber optics. In: Charman N (ed) Vision and visual dysfunction, vol I. McMillan Press, London, pp 280-309
Esmaeelpour M, Povazay B, Hermann B, Hofer B et al (2010) Three dimensional 1060 nm OCT: choroidal thickness maps in normal subjects and improved posterior segment visualization in cataract patients. Invest Ophthalmol Vis Sci 51:5260-5266
Fatt I, Weissman BA (1992) Physiology of the eye: an introduction to vegetative functions. Butterworth-Heinemann, St. Louis
Grosvenor TP (1989) Primary care optometry. Professional Press, New York
Hart WM (1992) Adler's physiology of the eye, 9th edn. Mosby, St. Louis
Koretz JF (2000) Development and aging of human visual focusing mechanisms. In: Lakshminarayanan V (ed) Vision science and its applications, vol 35, Trends in optics and photonics series. Optical Society of America, Washington, DC, pp 246-258


Lakshminarayanan V, Enoch JM (2010) Biological waveguides, chapter 8. In: Bass M, DeCusatis C, Enoch JM, Lakshminarayanan V et al (eds) Handbook of optics, vol III. McGraw Hill, New York
Lakshminarayanan V, Nygaard RW (1992) Aliasing in the human visual system. Concepts Neurosci 3:201-212
Muller M, Strobel J (2007) Mechanism of accommodation of human eye - some new aspects. Klin Monbl Augenheilkd 224:653-658
Osterberg G (1935) Topography of the layers of the rods and cones in the human retina. Acta Ophthalmol Suppl 13:1-102
Oyster CW (1990) The human eye. Sinauer, Sunderland
Palmer SE (1999) Vision science. Photons to phenomenology. MIT Press, Cambridge, MA
Pang Y, Maino DM, Zhang G, Lu F (2006) Myopia: can its progression be controlled? Optom Vis Dev 37:75-79
Pierscionek B (2010) Gradient index optics in the eye, chapter 18. In: Bass M, De Cusatis C, Enoch JM, Lakshminarayanan V et al (eds) Handbook of optics, 3rd edn. McGraw Hill, New York
Remington LA (2005) Clinical anatomy of the visual system, 2nd edn. Butterworth-Heinemann, St. Louis
Rodieck RW (1998) The first steps in seeing. Sinauer, Sunderland
Rolls ET, Deco G (2004) The computational neuroscience of vision. Oxford University Press, Oxford
Roorda A, Williams DW (1999) The arrangement of the three cone classes in the living human eye. Nature 397:520-522
Simon JW, Calhoun JH (1998) A child's eyes: a guide to pediatric primary care. Triad Publishing, Gainesville
Stark L, Sun F, Lakshminarayanan V, Wong J, Nguyen A, Muller E (1985) Presbyopia in the light of accommodation. Work reports on the 3rd international symposium on presbyopia, vol 2. Essilor International, Paris, pp 340-352
Toyoda J, Murakami M, Kaneko A, Saito T (1999) The retinal basis of vision. Elsevier, Amsterdam
Walls GL (1942) The vertebrate eye and its adaptive radiations. Cranbook Institute of Science, Bloomfield Hills



Light Detection and Sensitivity

Vasudevan Lakshminarayanan*

Department of Physics, Department of Electrical and Computer Engineering and School of Optometry and Vision Science, University of Waterloo, Waterloo, ON, Canada
Department of Physics, University of Michigan, Ann Arbor, MI, USA
*Email: [email protected]

Abstract

This chapter deals with the absolute threshold of vision, namely, the minimum amount of light necessary to elicit a visual response. The effect of intrinsic retinal noise on light detection is considered; this is followed by a discussion of intensity discrimination by both rods and cones.

List of Abbreviations

Tvi  Threshold versus intensity

Introduction

Of all our sensory systems, the visual sense dominates - about 60 % of all nerve fibers from a sensory organ to the brain come from the eyes. The visual cortex contains about 500 million nerve cells (the corresponding number for the auditory cortex is about 800,000 nerve cells; from the ear, 30,000 nerve fibers convey acoustic information to the brain, while from the eye, one to three million nerve fibers convey visual information to the brain). The eye operates over an amazing range of light levels, covering an intensity range of approximately 12 log units, and possesses exquisite sensitivity. A good review is given by Rodieck (1998).

Nature of Vision at (or Near) Absolute Threshold

At absolute threshold, the rod photoreceptors signal the absorption of single photons. The question of the sensitivity of the eye is not simply one of physics but also of the criterion used by the observer, thus bringing in the behavioral response of the observer. Hence, the question of the absolute threshold of human vision can only be answered in terms of a response probability. Lorentz in 1901 hypothesized that a just detectable flash of light delivered approximately 100 photons to the cornea (quoted in Bialek 1987). Identifying the minimum number of photons required for seeing from this number is difficult because of uncertainties in determining the number of photons absorbed by the retinal photoreceptors. This quantum efficiency has been estimated to be in the range of about 0.1-0.3. The best values of thresholds are obtained after prolonged dark adaptation of about 30-45 min. It is found that completely dark-adapted rod photoreceptors approach the sensitivity limit imposed by the quantization of light and the Poisson fluctuations of photon absorption. In fact, isolated single photoreceptors signal the



absorption of a single photon (Bialek 1987). The number of photons required to give a psychophysical response (behavioral) was established by van der Velden (1946) and by Hecht et al. (1942).

Behavioral Results

In their classic experiment, Hecht et al. measured the fraction of trials in which a flash was seen as a function of the number of photons incident on the cornea. This curve, a psychometric function, showed a broad transition from flashes that were rarely seen to those frequently seen, and from this curve the threshold and quantum efficiency can be derived, based on the assumption that the variability in a subject's response was due to the Poisson statistics of photon absorption. Implicit in this analysis are two other basic ideas: a flash is seen only when the number of absorbed photons exceeds a threshold number, and the average number of photons contributing to "seeing" is directly proportional to the number of photons incident on the cornea, the constant of proportionality being the quantum efficiency. From these beautifully conducted and analyzed experiments, they concluded a quantum efficiency of about 0.06, which is much lower than that derived from light scattering and ocular media properties. The authors concluded that in order for a visual effect to be produced, one quantum must be absorbed by each of 5-8 or so rods in the retina (at a 65 % probability of correct response). To give the reader a sense of how sensitive human observers can be, Pirenne (1962a) used a white light test stimulus spread out over many degrees on the retina, and the threshold was determined to be 0.75 × 10⁻⁶ cd/m². This is of the order of 5-30 % of the luminance of the darkest night sky measured by the National Physical Laboratory. There is a big discrepancy between the in vitro result that individual photoreceptors can detect and signal a single photon and the psychophysical finding that 5-8 photon absorptions are required. The reason for this discrepancy is the presence of biological noise in the visual system. However, it should be emphasized that the sensitivity of the human eye approaches the limit set by the quantization of light, Poisson fluctuations in absorption, and the internal noise.
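The Poisson reasoning behind the Hecht et al. analysis is easy to reproduce numerically. The Python sketch below computes a frequency-of-seeing curve, the probability that at least a threshold number of photons is absorbed; the quantum efficiency (0.06) and threshold (6, within the quoted 5-8 range) follow the figures above, while the flash sizes are illustrative:

```python
from scipy.stats import poisson

# Frequency-of-seeing curve in the spirit of Hecht, Shlaer, and Pirenne:
# a flash is "seen" when at least `threshold` photons are absorbed, with
# absorbed counts Poisson-distributed about (efficiency x corneal photons).
alpha = 0.06          # overall quantum efficiency inferred by Hecht et al.
threshold = 6         # photons that must be absorbed (their estimate: 5-8)

for n_cornea in [25, 50, 100, 200, 400]:
    p_seen = poisson.sf(threshold - 1, alpha * n_cornea)   # P(k >= threshold)
    print(f"{n_cornea:4d} photons at cornea -> P(seen) = {p_seen:.3f}")
```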

The Effect of Noise on Detection of Light

Visual sensitivity is hampered by background noise occurring along the retina-cortex visual pathway. There have been many studies describing and modeling, physiologically and mathematically, the events occurring in the retina in response to stimuli and when no stimuli are present (see, e.g., Hamer et al. (2003, 2005)). The physiology and biochemistry are discussed in detail by Rodieck (1998) and Wickehart (2003). What is retinal background noise? Spectral absorptions that occur in the photoreceptors cause a neural cascade which is encoded as sight in the occipital cortex; however, there are ever-present visual stimulations occurring randomly without photon absorption by the photoreceptors. Barlow (1956) and others (Sakitt 1972) advocated the concept of "dark light" - a name given to internal events such as the spontaneous decomposition of photopigment. It was shown by these and other researchers that additive Poisson noise can account for the discrepancy. In fact, it has been estimated, based on the spatial and temporal summation characteristics of the rod array, rod density, assumed quantum efficiency, etc., that the equivalent rate of photon-like noise events in rod photoreceptors, the dark light, ranges from 0.002 to 0.03 per second (Donner 1992). Spontaneous background noise in the retina is usually described as having two components. Baylor et al. described discrete thermal activations in photoreceptors, which account partly for background noise


Fig. 1 (a) Light adaptation curve plotted as increment threshold versus background luminance (a threshold versus intensity, or tvi, curve). The plot shows increment threshold (Nλ) against background luminance (Mμ). Light of two different wavelengths is used in this case (580 nm for the test and 500 nm for the background) (Adapted from Stiles' data from Davson (1990)). (b) Schematic of the increment threshold curve (Adapted from Aguilar and Stiles' data from Davson (1990))

(Baylor et al. 1979, 1984). The rest of the noise is thought to be due to fluctuations in the concentrations of certain chemicals in the process of phototransduction. Background retinal noise is generated among both the rod and the cone photoreceptors. Cones have been shown to be noisier than rods, possibly hampering our vision when we fixate on important stimuli in our environment, but presumably with less of an effect on our sensitivity at absolute visual threshold levels. This makes sense because rods function as low-light detectors and therefore require quieter biological conditions. In addition, the power spectrum of dark noise is found to have the same shape as the spectrum of the response to a dim flash of light, evidence that retinal noise consists of random events with the average shape of the single photon response. There are a number of unanswered questions, such as the effect of continuous noise, false positives due to noise in setting the threshold, etc., which are beyond the scope of this chapter. In summary, physiological noise is indistinguishable from the signals generated by light stimuli, and it is hypothesized to be the main neural limit to our visual sensitivity. A more detailed description of the problem of noise in photoreceptors can be found in the articles by Lakshminarayanan (2005), Rieke and Baylor (1998), and Field et al. (2005).

Intensity Discrimination

Up to this point, we have been discussing the detection of light at (or near) absolute threshold values. Here, we discuss the important issue of intensity discrimination - that is, how does the visual system determine that one luminous stimulus differs in intensity from another? It should be obvious from the above discussion that quantum fluctuations provide a theoretical lower limit for intensity discrimination by an ideal observer. How about at stimulus values well above threshold? In an increment threshold measurement, a test stimulus of luminance Lt is compared to an adjacent reference stimulus of luminance L. The psychophysical task is to determine how different Lt must be from L (above or below) for it to be seen as different (at some preassigned probability value, say 50 % of the



time). Let this difference be ΔL. If we determine ΔL for a number of different values of the reference L, we get a curve called a threshold versus intensity (tvi) function. A typical tvi curve is shown in Fig. 1; typically there are two branches, showing the duplex nature of the retina. If we start from near absolute threshold, the threshold in the flat horizontal portion of the curve is determined by the dark light level or internal noise (portion 1); increases in luminance do not change ΔL very much. This implies that an observer would not always be able to detect an intensity difference between two flashed stimuli that on average differ in intensity by only a few quanta. As the background luminance is further increased, we pass into region 2 of the curve, the "square root law" region. Quantum fluctuations increase with the number of quanta in the stimulus, and as stimulus luminance increases, the minimum discriminable threshold increases in proportion to the square root of the intensity level. This is known as the deVries-Rose law or the square root law and is expressed as:

$$\Delta L = K\sqrt{L}$$

In this portion, the slope of the curve (on log-log axes) is 1/2. For the rod pathway, a slope of 0.6 is often found. At low reference luminance levels, humans behave as ideal detectors and follow the deVries-Rose law. As we further increase background levels, Weber's law holds, and the intensity discrimination threshold is higher than expected from an ideal detector. In this region, also called the Weber's law region, we get a straight-line portion of the curve where the slope is constant. The constant proportional relationship between increment threshold (ΔL) and reference luminance (L) is called Weber's law: ΔL/L = constant. This proportional change in threshold ΔL with L implies that the visual system is not detecting luminance differences at the theoretical limit. It should be noted that the Weber constant is affected by stimulus size, duration, wavelength, and retinal location (Blackwell 1946; Lynn et al. 1996; Harwerth et al. 1993; Gescheider 1997). Weber's law implies that there is a limitation on intensity discrimination due to loss of information. At still higher luminance levels, the Weber fraction becomes large - that is, ΔL increases faster than L and the visual system saturates.
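The three regimes of the tvi curve can be summarized in a schematic model: the threshold at a given background is whichever of the dark-light floor, the deVries-Rose term, and the Weber term is largest. The constants in this Python sketch are illustrative, not fitted values:

```python
import numpy as np

# Schematic increment threshold across the tvi regimes. On log-log axes the
# three pieces have slopes 0 (dark-light floor), 1/2 (deVries-Rose), and 1
# (Weber's law); the crossover points depend on the illustrative constants.
def delta_L(L, dark=1e-4, k_sqrt=1e-2, weber=0.02):
    return np.maximum.reduce([
        np.full_like(L, dark),      # region 1: dark-light floor
        k_sqrt * np.sqrt(L),        # region 2: deVries-Rose, slope 1/2
        weber * L,                  # region 3: Weber's law, slope 1
    ])

L = np.logspace(-6, 4, 11)
for Li, dLi in zip(L, delta_L(L)):
    print(f"L = {Li:9.2e}  ->  dL = {dLi:9.2e}")
```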

Visual Adaptation

As noted before, the visual system operates over a huge range of retinal illuminances. One of the reasons that the visual system can move its "operating curve" over such a wide range of illuminances is that we have different photoreceptor subsystems - the rods and cones (see chapter "▶ Anatomy of the Eye"). This adjustment of the operating level to the prevailing light level is known as adaptation. In this section, we will discuss the dark and light adaptation characteristics of the visual system. Light adaptation refers to the process that raises the threshold luminance (i.e., decreases sensitivity; recall that threshold is the inverse of sensitivity) in response to an increased level of illumination. Dark adaptation is the reverse of this change in sensitivity. Dark adaptation can be easily measured in the laboratory (and routinely tested in the clinic) using standard psychophysical methodology. The eyes are exposed to a suprathreshold adapting light (monochromatic or multichromatic) of large spatial dimension. Then the adaptation light is turned off (time 0), a test flash of a certain wavelength and size is presented at a specific retinal location, and ΔL is measured at specific times. The classic dark adaptation curve is shown in Fig. 2. The dark adaptation curve has certain specific features. There are two main branches - the cone branch and the rod branch. Initially, there is a rather large decrease in threshold luminance (up to about the 3 min mark or


Fig. 2 Dark adaptation curve. The shaded area represents 80 % of the group of subjects (Adapted from Hecht and Mandelbaum's data from Pirenne (1962b))

so), followed by a rather slow change until about 8-12 min. Here, the cones approach their lowest threshold level and are fully dark adapted; the curve is essentially flat during this portion. After about this time period, the cones saturate and the rod system takes over, and the curve drops again as the rods become more sensitive. This point of transition is called the cone-rod break. Within about 30 min, the rod portion becomes flat and the rods have adapted. Rods begin to dark-adapt at the same time as the cones, but initially they are less sensitive than cones and have a larger time constant. The reader is referred to the chapter by Birch (2003) for a detailed description of the physiology and mechanism of dark adaptation.

Summary

In this chapter, we have discussed the fact that luminance detection under certain conditions is limited only by internal noise in the visual system. We have also examined intensity discrimination by the eye and the basic laws of psychophysics, namely, the deVries-Rose square root law and Weber's law. Finally, we have examined the response of the photoreceptors to dark adaptation.

Further Readings

Barlow H (1956) Retinal noise and absolute threshold. J Opt Soc Am 46:634-639
Baylor DA, Lamb TK, Yau KW (1979) Responses of retinal rods to single photons. J Physiol 288:613-634
Baylor DA, Nunn B, Schnapf JL (1984) The photocurrent, noise and spectral sensitivity of the rods of the monkey Macaca fascicularis. J Physiol 357:575-607
Bialek W (1987) Physical limits to sensation and perception. Ann Rev Biophys Biophys Chem 16:455-478
Birch DG (2003) Chapter 24: visual adaptation. In: Kaufman PL, Alm A (eds) Adler's physiology of the eye, 10th edn. Mosby, St. Louis
Blackwell HR (1946) Contrast thresholds of the human eye. J Opt Soc Am 36:624-643



Cornsweet T (1970) Visual perception. Academic, New York
Davson H (1990) Physiology of the eye, 5th edn. Macmillan Academic and Professional Ltd., London
Donner K (1992) Noise and the absolute thresholds of cone and rod vision. Vision Res 32:853-866
Field GD, Sampath AP, Rieke F (2005) Retinal processing near absolute threshold: from behavior to mechanism. Annu Rev Physiol 67:491-514
Gescheider G (1997) Psychophysics: the fundamentals. Psychology Press, Philadelphia
Hamer RD, Nicholas SC, Tranchina D, Liebman PA, Lamb TD (2003) Multiple steps of phosphorylation of activated rhodopsin can account for the reproducibility of vertebrate rod single photon responses. J Gen Physiol 122:419-444
Hamer RD, Nicholas SC, Tranchina D, Lamb TD, Jarvinen JLP (2005) Toward a unified model of vertebrate rod phototransduction. Vis Neurosci 22:417-436
Harwerth RS, Smith EL, DeSantis L (1993) Mechanisms mediating visual detection in static perimetry. Invest Ophthalmol Vis Sci 34:3011-3023
Hecht S, Schlaer S, Pirenne MH (1942) Energy, quanta and vision. J Gen Physiol 25:819-840
Lakshminarayanan V (2005) Vision and the single photon. Proc SPIE 5836:332-337
Lynn JR, Felman RL, Starita RJ (1996) Principles of perimetry. In: Riitch R, Shields MB, Krupin T (eds) The glaucomas. Mosby, St. Louis, pp 491-521
Palmer SE (1999) Vision science. Photons to perception. MIT Press, Cambridge
Pirenne MH (1962a) Absolute thresholds and quantum effects. In: Davson H (ed) The eye. Academic, New York
Pirenne MH (1962b) Chapter 5: dark adaptation and night vision. In: Davson H (ed) The eye, vol 2. Academic, London
Rieke F, Baylor DA (1998) Single photon detection by rod cells of the retina. Rev Mod Phys 70:1027-1036
Rodieck RW (1998) The first steps in seeing. Sinauer, Sunderland
Sakitt B (1972) Counting every quantum. J Physiol 223:131-150
van der Velden HA (1946) The number of quanta necessary for the perception of light of the human eye. Ophthalmologica 111:321-331 (originally published in Dutch in Physica)
Wandell BA (1995) Foundations of vision. Sinauer, Sunderland
Wickehart DR (2003) Biochemistry of the eye. Butterworth-Heinemann, Philadelphia



Visual Acuity
Vasudevan Lakshminarayanan*
Department of Physics, Department of ECE, and the School of Optometry and Vision Science, University of Waterloo, Waterloo, ON, Canada
Department of Physics, The University of Michigan, Ann Arbor, MI, USA
*Email: [email protected]

Abstract
Visual acuity (VA) is a measure of the visual system's ability to see distinctly the details of an object. This chapter will discuss the commonly used VA measures and their measurement.

List of Abbreviations
MAR  Minimum angle of resolution
VA   Visual acuity

Introduction
Visual acuity (VA) is a measure of the spatial vision of observers. Eye doctors (ophthalmologists and optometrists) routinely use VA measures to assess spatial vision using high-contrast stimuli, namely, the finest spatial detail that an observer can discern. Some common examples of acuity measures are (1) the Snellen eye chart, where the observer names the Snellen letters, which is a form of recognition acuity (see Fig. 1a), (2) Landolt rings (or the tumbling E), where the observer reports where the gap of the letter C is located (or where the prongs of the letter E are pointing), which is resolution acuity (Fig. 1b), and (3) parallel lines, where the observer reports the orientation of the bars (resolution acuity). The VA that is measured is related to the highest spatial frequency grating (usually a sine-wave grating; finest detail) that can be detected (see chapter "▶ Spatial Vision and Pattern Perception"). In this chapter, we will only discuss the clinically measured VA and not other specialized forms of acuity such as detection acuity and hyperacuity.

Representation of VA
Visual acuity is quantitatively represented in one of two ways: (1) as the reciprocal of the minimum angle of resolution (MAR, measured in minutes of arc) or (2) as a Snellen fraction. This is measured using, as noted above, letters or Landolt rings. VA can also be expressed as a decimal, i.e., the Snellen fraction is reduced, or it is given by the reciprocal of the minimum angle of resolution. The Snellen fraction represents the visual acuity in the form of a fraction (e.g., 20/20, 20/80, and 20/200 if measured in feet or, equivalently, 6/6, 6/24, and 6/60 if measured in meters). Here the numerator is the testing distance (6 m or 20 ft in a clinician's office), and the denominator is the distance at which the smallest Snellen letter has an angular size of 5 min of arc and each detail in the letter (e.g., the gap between


Fig. 1 (a) Snellen acuity chart. (b) Landolt “C” chart in logMAR form. (c) Bailey-Lovie logMAR chart

the ends of letter C subtends 1 min of arc or the width of a horizontal single bar in the letter E subtends 1 min of arc; each letter subtends a total of 5 arc min). The minimum angle of resolution, MAR, corresponds to the smallest detail on a target that can be seen by an eye and can be estimated, for example, using the Rayleigh criterion:

θmin = 1.22λ/D

where λ is the wavelength of light and D represents the pupil diameter of the eye. Under photopic levels, D for the human eye is about 2–4 mm, and θmin works out to be about 1 min of arc in the mid-visible range.
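For a quick numerical check, the Rayleigh limit is easy to evaluate in code. The short Python sketch below is ours (the function name and the 555 nm example wavelength are illustrative choices); it converts θmin from radians to minutes of arc:

```python
import math

def rayleigh_min_angle_arcmin(wavelength_m, pupil_diameter_m):
    """Rayleigh criterion: theta_min = 1.22 * lambda / D (radians),
    converted to minutes of arc."""
    theta_rad = 1.22 * wavelength_m / pupil_diameter_m
    return math.degrees(theta_rad) * 60.0

# Mid-visible light (~555 nm) through a 3 mm photopic pupil:
print(rayleigh_min_angle_arcmin(555e-9, 3e-3))  # ~0.78 arcmin, i.e., about 1 min of arc
```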


Table 1 Various measures of VA for different performance levels

Performance criterion        MAR (min of arc)   Snellen denom. (m)   Snellen denom. (ft)   Decimal VA   Snell-Stirling visual efficiency (%)   logMAR
Normal adult                 0.82               4.9                  16.4                  1.22         103.3                                  −0.09
Unrestricted driving         2.00               12                   40                    0.50         83.6                                   0.30
Moderate visual impairment   3.50               21                   70                    0.29         64                                     0.54
Legal blindness              10                 60                   200                   0.10         20                                     1.00
Profound visual impairment   25                 150                  500                   0.04         1.4                                    1.40

(Snellen denominator (m) = 6 × MAR; Snellen denominator (ft) = 20 × MAR; decimal VA = 1/MAR; Snell-Stirling visual efficiency = 0.836^(MAR−1).)

However, under optimal viewing conditions, the eye's optics and the spacing of foveal cones (center-to-center spacing of about 20–40 arcsec) could allow some observers to resolve points that are approximately 0.5 min apart (Geisler 1989). Even so, with optimum correction of refractive errors, aberrations, and cone sizes (which vary somewhat), there will be variations in MAR. Other scales are used for representing VA – these include logMAR, decimal VA, and the Snell-Stirling visual efficiency scale. Table 1 gives the relationship between the various measures of VA for different levels of visual performance. It should be noted that logMAR is the preferred scale for representing VA in the research (and increasingly in the clinical) literature. This is especially true if the VA is better than 1 min or much worse. The reason is that the inverse slope of the visual acuity psychometric function (between logMAR and percent correct) remains more or less independent of the acuity value, which means that the measurement error remains approximately constant (Horner et al. 1985). As a result, logMAR charts such as the Bailey-Lovie charts are commonly used (Bailey and Lovie 1976; see Fig. 1b, c). In these logMAR charts, the letters on each line are approximately 26 % smaller than the letters on the line above.
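Since every scale in Table 1 is a deterministic function of the MAR, the conversions can be scripted directly. The Python sketch below is a hypothetical helper (the function name and output layout are ours) that reproduces the relationships used in the table:

```python
import math

def va_measures(mar_arcmin, test_distance_m=6.0):
    """Convert a minimum angle of resolution (MAR, arcmin) into the
    other VA scales described in the text."""
    return {
        "decimal_va": 1.0 / mar_arcmin,                  # reciprocal of MAR
        "logmar": math.log10(mar_arcmin),                # log of MAR
        "snellen": f"{test_distance_m:g}/{test_distance_m * mar_arcmin:g}",
        "snell_stirling_pct": 100.0 * 0.836 ** (mar_arcmin - 1.0),
    }

print(va_measures(2.0))   # decimal 0.5, logMAR 0.30, Snellen 6/12: the driving row
print(va_measures(10.0))  # decimal 0.1, logMAR 1.0, Snellen 6/60: legal blindness
```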

Factors Affecting VA
Measurements of letter acuity are affected by a number of factors. These include (a) choice of letters, (b) letter spacing, (c) retinal illumination, (d) target contrast, (e) retinal eccentricity, (f) target motion and duration of target presentation, (g) neural defocus, and (h) age. We shall briefly discuss each of these factors below:

1. Choice of letters: even if letters are constructed using a standard font and are of the same size, relative legibility can be a factor. For example, uppercase S and B (or C and O) are more difficult to identify when compared to L and T. The relative legibility of letters has been quantified by specifying the difference in size that is required for each letter to reach its identification threshold (Hedin and Olsson 1984; Gervais et al. 1984). In addition to these confusion letters, observers with uncorrected astigmatism will experience defocus of specific contours (e.g., a myopic astigmat with axis 180° will see vertical contours defocused, making letters like V and Y difficult to distinguish).
2. Letter spacing: because of the so-called contour interaction (Flom 1966) due to lateral inhibition, the ability to identify letters depends not only on size but also on how close the letters are to each other. In


Fig. 2 Variation of VA with illumination. Red lines show the maximum of photopic and scotopic VA

addition, other factors such as inaccurate eye movements can contribute to reduced VA (Flom 1991). These are all called "crowding effects." It is found that the maximum degradation of legibility is produced when a letter and its neighbor are separated by approximately half a letter width. It should be noted that contour interaction is of greater magnitude in peripheral vision (see chapter "▶ Spatial Vision and Pattern Perception").
3. Retinal illumination: at high levels of retinal illumination, cones mediate VA. Normally VA is measured using a standard test chart at approximately 100 cd/m2. At low luminance levels, the maximum VA reached by normal observers is about 0.7–1.0 logMAR (see Fig. 2). Best scotopic vision occurs at a location smaller than 20° eccentricity on the retina, at which rod density is highest. The variation in VA with retinal illumination strongly correlates with the high-frequency cutoff of the contrast sensitivity function as the illumination is lowered.
4. Contrast: VA is high for high-contrast letters. Even for well-focused foveal targets, letter acuity falls as contrast is lowered. Standard clinical tests use high-contrast targets. However, low-contrast targets have been used in acuity charts and are used clinically (e.g., Pelli-Robson charts, Regan charts (Pelli et al. 1988; Regan and Neima 1983)).
5. Eccentricity: VA falls rapidly with increasing eccentricity on the retina (correlating with the shift toward a lower spatial frequency cutoff of the contrast sensitivity function). In fact, it has been shown that letter acuities worsen to twice the foveal value at eccentricities of approximately 2.0°. In addition, the rate at which resolution changes with eccentricity is not the same along the different retinal meridians – the rate of change of acuity is smaller along the horizontal meridian than along the vertical and is better in the temporal than in the nasal visual field (Wertheim 1980). Figure 3 shows a radial eye chart developed by Prof. Stuart Anstis at the University of California at San Diego. Although letter sizes increase dramatically from center outward, the letters are scaled to be equally legible at different eccentricities.
6. Motion: for high-contrast targets, VA is not affected by low velocities of retinal image motion (to approximately 2 deg/s), but becomes progressively worse at higher image velocities (Baron and


Fig. 3 Peripheral VA chart. See text for details

Westheimer 1973). VA measured when the target or observer is in motion is called dynamic VA. If the target is intermittently flashed, VA becomes worse as the flash duration decreases, but improves as the flash duration increases to about 500 ms, the temporal summation period (see section 6 of chapter "▶ Flicker Sensitivity" and Demer and Amjadi 1993).
7. Neural defocus: the convergence of photoreceptor signals at ganglion cells is more extensive for peripheral rods than for cones (almost 10:1 multiplexing, compared to 1:1 for foveal cones). As the signal travels up the neural visual system, there is a greater chance of neural "defocus," which can be thought of as the blending together of the gradations of responses of neighboring cells. For example, VA goes down when the target moves so quickly that small receptive fields become insensitive.
8. Age: VA starts out relatively poor in a newborn infant. It has been shown, using a forced-choice preferential looking psychophysical procedure, that VA develops from approximately 1.5 logMAR at 1 month of age to about 0.7 logMAR by age 6 months. Adult levels of VA (0 logMAR) are reached at approximately 3 years of age (Mayer et al. 1996). At the other end of the age spectrum, high-contrast VA remains relatively invariant with age, but there is a significant decrease in VA measured with low-contrast targets (Enoch et al. 1999).

Conclusion
In this chapter, we have briefly discussed the visual acuity of the human eye. The measurement of VA as a psychophysical test and a number of factors that affect VA have been discussed.

Further Reading
Bailey IL, Lovie JE (1976) New design principles for visual acuity letter charts. Am J Optom Physiol Opt 53:740–745


Baron W, Westheimer G (1973) Visual acuity as a function of exposure duration. J Opt Soc Am 63:212–219
Bedell HE (2002) Chapter 5: Spatial acuity. In: Norton TT, Corliss DA, Bailey JE (eds) The psychophysical measurement of visual function. Butterworth-Heinemann, Woburn
Demer JL, Amjadi F (1993) Dynamic visual acuity of normal subjects during vertical optotype and head motion. Invest Ophthalmol Vis Sci 34:1894–1906
Enoch JM, Werner G, Hagerstrom Portnoy G, Lakshminarayanan V, Rynders M (1999) Forever young: visual functions not affected or minimally affected by aging. J Gerontol Biol Sci 54A:B336–B352
Flom MC (1966) New concepts on visual acuity. Optom Wkly 57:63–68
Flom MC (1991) Contour interaction and the crowding effect. Probl Optom 3:237–257
Geisler WS (1989) Sequential ideal observer analysis of visual discriminations. Psychol Rev 96:267–314
Gervais MJ, Harvey LO, Roberts JO (1984) Identification confusions among letters of the alphabet. J Exp Psychol Hum Percept Perform 10:655–666
Hedin A, Olsson K (1984) Letter legibility and construction of new visual acuity chart. Ophthalmologica 189:147–156
Horner DG, Paul AD, Katz B, Bedell HE (1985) Variations in the slope of the psychometric acuity function with acuity thresholds and scale. Am J Optom Physiol Opt 62:895–900
Mayer MJ, Beiser AS, Warner AF et al (1996) Monocular acuity norms for the Teller acuity cards between ages one month and four years. Invest Ophthalmol Vis Sci 36:671–685
Norton TT, Lakshminarayanan V, Bassi CJ (2002) Chapter 6: Spatial vision. In: Norton TT, Corliss DA, Bailey JE (eds) The psychophysical measurement of visual function. Butterworth-Heinemann, Woburn
Pelli DG, Robson JG, Wilkins AJ (1988) The design of a new letter chart for measuring contrast sensitivity. Clin Vis Sci 2:187–199
Regan D, Neima D (1983) Low contrast letter charts as a test of visual function. Ophthalmology 90:1192–1200
Wertheim T (1980) Peripheral visual acuity. Am J Optom Physiol Opt 57:915–924
Westheimer G (1979) The spatial sense of the eye. Invest Ophthalmol Vis Sci 18:893–912
Westheimer G (2002) Chapter 17: Visual acuity. In: Kaufman P, Alm A (eds) Adler's physiology of the eye. CV Mosby, St. Louis


Flicker Sensitivity
Vasudevan Lakshminarayanan

Contents
Introduction . . . 2
Temporal Resolution Acuity and Critical Flicker Fusion Frequency . . . 2
Some Properties of the CFF . . . 3
  CFF and Stimulus Luminance . . . 3
  CFF and Area of Stimulus . . . 4
  CFF and Retinal Eccentricity . . . 4
  CFF and Wavelength . . . 5
  CFF and Adaptation . . . 5
  CFF and Ambient Illumination . . . 5
  CFF and Refresh Rates of Monitors . . . 5
  CFF and Phosphor Persistence . . . 6
Brightness Enhancement Effects Because of the Temporal Properties of Vision . . . 6
The Temporal Contrast Sensitivity Function . . . 6
Temporal Summation . . . 8
Flicker in Monitors . . . 8
Summary . . . 9
Further Reading . . . 9

Abstract

The visual system is sensitive to temporal changes in stimuli. The image appears to be continuous because the visual system integrates the responses with respect to time. A crucial factor is the critical flicker fusion frequency (CFF), the temporal frequency beyond which flicker is no longer perceived; the CFF is a measure of the minimum temporal interval that can be resolved by the visual system. This chapter discusses the CFF and the effect of various parameters such V. Lakshminarayanan (*) School of Optometry and Vision Science, University of Waterloo, Waterloo, ON, Canada Department of Physics, University of Michigan, Ann Arbor, MI, USA e-mail: [email protected] # Springer-Verlag Berlin Heidelberg 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_7-2


as luminance, refresh rate of the monitor, wavelength, and retinal eccentricity on flicker perception.

Keywords

Adaptation • Ambient illumination • Band-pass shape • Bloch's law • Broca-Sulzer effect • Brucke-Bartley phenomenon • Critical flicker fusion frequency • Ferry-Porter law • Flicker perception • Function • Granit-Harper law • Limiting factor • Phosphor persistence • Regeneration rate • Retinal eccentricity • Retinal illuminance • Stimulus area • Stimulus luminance • Talbot brightness • VDT phosphors • Wavelength

Abbreviations

CFF  Critical flicker fusion frequency
CSF  Contrast sensitivity function

Introduction
Images on a monitor are not continuous; they are refreshed repeatedly. The rate at which the image is refreshed plays a crucial role in making the image appear continuous, even though the actual luminance of a point on the screen is intermittent. Although the visual system is sensitive to temporal changes, it integrates the responses with respect to time. Flicker arises when the display images are not repeated quickly enough. Flicker perception can be studied using grating stimuli whose luminance varies sinusoidally with time. Flicker perception depends upon stimulus size, luminance, retinal location, and temporal modulation, among other factors. It has been found that chromaticity has little or no effect on the CFF if the luminance is held constant (De Lange 1958).

Temporal Resolution Acuity and Critical Flicker Fusion Frequency
If we present to an observer an alternating, repetitive cycle of low and high luminance (a temporal square or sine wave stimulus), the light will appear to flicker (be intermittent) when the temporal frequency (in hertz) is low. If we increase the temporal frequency, the light will appear steady beyond a certain frequency. Psychophysically, we define the CFF (critical flicker fusion frequency) as the frequency at which the stimulus is seen flickering 50 % of the time and as steady or fused 50 % of the time. The CFF is a measure of the temporal resolving power of the visual system. This minimum interval of resolution is analogous to the minimum angle of resolution in spatial vision (see chapters "▶ Spatial Vision and Pattern Perception" and "▶ Visual Acuity"). In this sense, temporal acuity is analogous to grating acuity in spatial vision. The neural basis of the CFF is the modulation of firing rates of retinal neurons (Tyler and Hamer 1990; Lee et al. 1989).


Fig. 1 Critical flicker frequency as a function of retinal illuminance and retinal position (From Hecht and Verrijp 1933)

When a light is flickering above the CFF, it will appear steady, and the time-averaged luminance of a flickering light determines its brightness above the CFF. This time-averaged luminance is called the Talbot brightness. The Talbot brightness can be easily calculated using:

Talbot brightness = Lmin + (Lmax − Lmin) × f

Here Lmax and Lmin refer to the maximum and minimum luminance of the grating, and f is the fraction of time that Lmax is present during the total period.
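As a minimal worked example, the formula translates directly into code (the function name is ours):

```python
def talbot_brightness(l_max, l_min, duty_fraction):
    """Time-averaged luminance of a light flickering above the CFF:
    Lmin + (Lmax - Lmin) * f, with f the fraction of the period at Lmax."""
    return l_min + (l_max - l_min) * duty_fraction

# A light alternating between 0 and 100 cd/m^2 with a 50 % duty cycle
# looks as bright as a steady 50 cd/m^2 field once it is fused:
print(talbot_brightness(100.0, 0.0, 0.5))  # 50.0
```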

Some Properties of the CFF
The CFF depends upon a number of factors. In this section, some of the basic laws governing the behavior of the CFF and related flicker issues will be discussed.

CFF and Stimulus Luminance
The CFF increases linearly with the log of the stimulus luminance (Fig. 1). This is known as the Ferry-Porter law and is expressed as:

CFF = k log L + b

Experimental results hold over a range of approximately 4 log units of luminance. Here, k is the slope of the line and b is a constant; L is the stimulus luminance (Tyler


Fig. 2 Critical flicker frequency as a function of retinal illuminance and stimulus size (From Hecht and Smith 1936)

and Hamer 1993). This law implies that the temporal resolution acuity improves as the flickering stimulus luminance increases. This law holds at many stimulus eccentricities.

CFF and Area of Stimulus
CFF increases linearly with the logarithm of the stimulus area (Fig. 2). This is known as the Granit-Harper law and is written as:

CFF = k log A + b

Here, k and b are constants (different from those in the equation for the Ferry-Porter law), and A is the area of the flickering stimulus. The Granit-Harper law holds for approximately a 3 log unit range of luminance for stimuli presented in the fovea and up to about 10° eccentricity.
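Both laws are straight lines in the logarithm of luminance or area, so they are trivial to evaluate. In the sketch below, the slopes and intercepts are placeholder values chosen for illustration (the text only constrains k to roughly 10 Hz per log unit at the fovea), not fitted constants from the cited studies:

```python
import math

def cff_ferry_porter(luminance, k=10.0, b=35.0):
    """Ferry-Porter law: CFF = k * log10(L) + b (k, b illustrative)."""
    return k * math.log10(luminance) + b

def cff_granit_harper(area_deg2, k=10.0, b=30.0):
    """Granit-Harper law: CFF = k * log10(A) + b (k, b illustrative)."""
    return k * math.log10(area_deg2) + b

# Each 1 log unit increase in luminance raises the predicted CFF by k:
print(cff_ferry_porter(10.0) - cff_ferry_porter(1.0))  # 10.0
```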

CFF and Retinal Eccentricity
The slope of the CFF-luminance function is steeper outside the fovea, changing from a k of about 10 Hz/log unit at the fovea to about 20 Hz/log unit at about 10° eccentricity (Fig. 1). This implies that the peripheral retinal neurons respond with greater temporal acuity than central neurons. Tyler (1985) speculates that this might be due to the larger peripheral cone diameters. The maximum CFF


measured at approximately 35° eccentricity is about 90 Hz. The CFF is higher in the periphery at high luminance levels. At low luminance levels, there is not much change in CFF with eccentricity.

CFF and Wavelength
In general, flickering lights with equal photopic luminance but different wavelengths have equal CFFs. For intensities in the photopic range greater than 1 log troland, CFFs are independent of the wavelength. At low stimulus intensities, CFFs are highest for shorter wavelengths (Hecht and Shlaer 1935).

CFF and Adaptation
Given the relative densities of rods and cones in the different retinal areas, their different retinal sensitivities, and their neural interactions, the CFF depends in a complex manner on the adaptation state of the eye. It has been found that the CFF is highest when the eye is completely light adapted or when a uniform background of high luminance surrounds the flickering stimulus. As a result, a flickering stimulus at a fixed intensity can appear to be flickering when the observer is light adapted and steady when the observer is dark adapted (Coletta and Adams 1984; Goldberg et al. 1983). The adaptation effects on flicker are evident at frequencies above 15 Hz.

CFF and Ambient Illumination
At high levels of display illumination, ambient illumination has a small or negligible effect on the perception of flicker (Isensee and Bennett 1983). The ANSI standards for room illumination are approximately 200–500 lx. Under these conditions, illumination affects flicker perception on a CRT screen only to a small extent.

CFF and Refresh Rates of Monitors
The refresh rate (regeneration rate) is one of the most important factors in the perception of flicker on a display. If the rate at which the screen is refreshed is greater than the CFF of the observer, the observer will not perceive flicker. Refresh rates need to be higher for displays with higher luminance and wider fields of view in order to make the display flicker free. Refresh rates can be decreased by an increase in phosphor persistence.


CFF and Phosphor Persistence
It has been found that the medium-short phosphors P20 and P4 cause the maximum flicker effect (Turnage 1966). Many modern CRT phosphors contain a blend of two or more phosphors, and hence the ripple ratio is considered a better index to describe the persistence effect in a CRT. The modulation index of the fundamental frequency gives the ripple ratio (De Lange 1958). Pearson (1991) gives a list of ripple ratios for various refresh frequencies. In general, the smaller the ripple ratio, the less susceptible the monitor is to flicker. It should be emphasized that the ripple ratio alone does not give much information on perceived flicker; other factors such as luminance and angle of viewing also play a major role.

Brightness Enhancement Effects Because of the Temporal Properties of Vision
Two brightness enhancement effects are known: the Brucke-Bartley phenomenon and the Broca-Sulzer effect. If one views a flickering light and the flicker rate is varied without changing the time-averaged luminance, the brightness of the flickering light appears to be enhanced at certain frequencies. This is the Brucke-Bartley phenomenon. The maximum brightness enhancement appears for flicker rates at around 5–20 Hz (Wu et al. 1996). The second brightness enhancement effect, the Broca-Sulzer effect, is one wherein the brightness of a suprathreshold flash depends upon its duration (when compared to the brightness of a steady light of the same luminance). Flash durations shorter or longer than about 50–100 ms produce less brightness enhancement; with increasing luminance levels, the effect becomes stronger and the peak brightness occurs at shorter durations (Wu et al. 1996; Aiba and Stevens 1964).

The Temporal Contrast Sensitivity Function
A complete description of the temporal responsiveness of the human visual system is given by the temporal contrast sensitivity function (temporal CSF). Like its counterpart the spatial CSF (see chapter "▶ Spatial Vision and Pattern Perception"), the temporal CSF has a band-pass shape, with a peak, a high temporal frequency cutoff (the CFF), and a low temporal frequency roll-off (Fig. 3). In Fig. 3, the amplification scale on the right is nothing but the contrast sensitivity, and the threshold modulation is the threshold contrast. The peak sensitivity occurs at an intermediate flicker frequency; the cutoff high temporal frequency is the limit of temporal acuity above which flicker cannot be resolved (even if the contrast is 1), and there is also a reduction in sensitivity at low temporal frequencies.


Fig. 3 The human contrast sensitivity function for several mean luminance levels (From Kelly 1961)

The temporal peak frequency shifts from approximately 20 to 5 Hz as the mean luminance decreases. The cutoff high temporal frequency goes from 60 Hz to about 15 Hz as the luminance decreases. Because contrast sensitivity is low at low luminances, we can only see low temporal frequencies of medium to high contrast. Kelly (1961), in his classic experiments on flicker, used a large flickering field with blurred edges to measure the temporal CSF (see De Lange 1958). If a sharp-edged field is used, the visibility of low-frequency flicker is enhanced. The presence of spatial detail improves the visibility of flicker at low temporal frequencies.
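For experimentation, the band-pass shape can be caricatured with a simple log-parabola in temporal frequency. This is only a toy stand-in with invented parameters, not Kelly's model or his data:

```python
import math

def temporal_csf(freq_hz, peak_hz=8.0, peak_sensitivity=200.0, bandwidth_oct=2.0):
    """Toy band-pass temporal CSF: a log-parabola that falls off on
    either side of an intermediate peak frequency. All parameter
    values are illustrative placeholders."""
    log_ratio = math.log2(freq_hz / peak_hz)
    return peak_sensitivity * 10.0 ** (-((log_ratio / bandwidth_oct) ** 2))

# Sensitivity peaks at an intermediate frequency and falls at both ends:
for f in (1, 8, 60):
    print(f, round(temporal_csf(f), 1))  # 1 Hz: ~1.1, 8 Hz: 200.0, 60 Hz: ~1.5
```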


Temporal Summation
The visual system does not distinguish the temporal shape of light flashes shorter than a critical duration. This is inferred from Bloch's law. The law essentially states that the visual system summates visual inputs over a brief time period and is given by:

L × t = C

Here, L is the threshold luminance of the flash, t is the duration of the flash, and C is a constant. Bloch's law has a spatial analogue, namely, Ricco's law in spatial vision (see chapter "▶ Spatial Vision and Pattern Perception"). Bloch's law holds for flashes that are shorter than a critical duration tc of approximately 30–100 ms. During this time period, the visual system adds together the effects of the absorbed quanta regardless of the temporal pattern in which they arrive. Bloch's law is a consequence of the temporal filtering properties of the visual system. For a more detailed description, see Roufs (1972).
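A minimal sketch of this behavior, treating the critical duration as a hard knee (the constants C and tc below are arbitrary illustrative values):

```python
def threshold_luminance(flash_duration_s, c=1.0, t_critical_s=0.05):
    """Bloch's law: L * t = C for flashes shorter than the critical
    duration, so the threshold luminance is C / t; beyond t_critical
    the threshold stops falling with duration."""
    t = min(flash_duration_s, t_critical_s)
    return c / t

# Halving the duration of a brief flash doubles the luminance needed:
print(threshold_luminance(0.010) / threshold_luminance(0.020))  # 2.0
```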

Flicker in Monitors
Phosphor persistence can affect flicker modulation. In general, as phosphor persistence decreases, flicker increases. Persistence can vary widely depending on the colored VDT phosphors. In general, phosphors are classified as long, medium, medium short, and short in terms of persistence (see chapter "▶ Luminescence of Phosphors"). For example, the P22 blue phosphor is classified as medium short (10–100 μs), while the P22 red and green phosphors are classified as medium (1–100 ms). Therefore, even if luminance is held constant, CFF may vary as a function of the specific color that is displayed on the color VDT. Another important factor is that in many VDTs the green channel's maximum luminance is greater than the red channel's, which is greater than the blue channel's. Therefore, the colors displayed will differ in luminance, and hence the CFF will vary. Usually greens, whites, and yellows will have higher luminance and therefore will be the limiting factor. Recall from above that CFF increases with luminance. Various empirical methods have been proposed to predict whether a particular VDT will appear to flicker in a given environment (Rogowitz 1986). Many of these methods are cumbersome and time consuming. Farrell (1986, 1987) has developed analytic methods for predicting whether a VDT will flicker given the screen phosphor persistence, refresh frequency, distance from the observer to the VDT, etc. These methods predict a maximum screen luminance and minimum refresh frequency that will generate a flicker-free display for a theoretical standard observer and are based on Kelly's pioneering work on flicker (Kelly 1961, 1969, 1971). The model predicts that if the absolute amplitude of the fundamental temporal frequency of the VDT


luminance modulation, Eobs, is greater than Epred, then observers will perceive flicker. Here,

Epred = (1/a) exp[(2πfb)^(1/2)]

where a and b are constants that depend upon the size of the display and f is the display refresh frequency. Also, if the amount of energy in the fundamental temporal frequency of the VDT is Eobs, then the lowest refresh frequency that will render the VDT flicker free is given by:

f = [ln(aEobs)]^2 / (2πb)
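Assuming the two expressions above, the criterion can be coded directly. In the sketch below, a and b are arbitrary placeholders, since their true values depend on the display size:

```python
import math

def flicker_predicted(e_obs, refresh_hz, a, b):
    """Flicker is predicted when the observed fundamental amplitude
    E_obs exceeds E_pred = (1/a) * exp((2*pi*f*b)**0.5)."""
    e_pred = (1.0 / a) * math.exp(math.sqrt(2.0 * math.pi * refresh_hz * b))
    return e_obs > e_pred

def min_flicker_free_refresh(e_obs, a, b):
    """Lowest flicker-free refresh rate: f = [ln(a*E_obs)]**2 / (2*pi*b)."""
    return math.log(a * e_obs) ** 2 / (2.0 * math.pi * b)

a, b = 0.05, 0.01  # placeholder display-dependent constants
f_min = min_flicker_free_refresh(e_obs=100.0, a=a, b=b)
print(round(f_min, 1))                              # ~41.2 Hz for these values
print(flicker_predicted(100.0, f_min + 1.0, a, b))  # False: just above f_min
```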

Summary The detection of temporal changes is important for the organism. In this chapter, we have discussed the factor most important in use with VDTs, namely, the perception of flicker. The crucial factor is the critical flicker fusion frequency (CFF). The CFF depends upon a number of parameters, and these were discussed along with the temporal CSF (the temporal analogue of the spatial CSF). The CFF is the temporal analogue of the minimum angle of resolution.

Further Reading
Aiba TS, Stevens SS (1964) Relation of brightness to duration and luminance under light and dark adapted conditions. Vision Res 4:391–401
Coletta N (2002) Temporal factors in vision. In: Norton TT, Corliss DA, Bailey JE (eds) The psychophysical measurement of visual function. Butterworth/Heinemann, Woburn, Chapter 7
Coletta NJ, Adams AJ (1984) Rod cone interaction in flicker detection. Vision Res 24:1333–1340
Cornsweet TN (1970) Visual perception. Academic, New York
De Lange HD (1958) Research into the dynamic nature of the human fovea-cortex systems with intermittent and modulated light: attenuation characteristics with white and colored lights. J Opt Soc Am 48:777–784
Farrell JE (1986) An analytic method for predicting perceived flicker. Behav Inform Technol 5:349–358
Farrell JE (1987) Predicting flicker thresholds in video display terminals. Proc Soc Inform Display 28:18
Goldberg SH, Frumkes TE, Nygaard RW (1983) Inhibitory influence of unstimulated rods in the human retina: evidence provided by examining cone flicker. Science 221:180–182
Harwood K, Foley O (1987) Temporal resolution: an insight into the video display terminal problem. Hum Factors 29:447
Hecht S, Shlaer S (1935) Intermittent stimulation by light V. The relation between intensity and critical frequency for different parts of the spectrum. J Gen Physiol 19:321–337
Hecht S, Smith EL (1936) Intermittent stimulation by light. VI. Area and the relation between critical frequency and intensity. J Gen Physiol 19:979–991
Hecht S, Verrijp CD (1933) The influence of intensity, color and retinal location on the fusion frequency of intermittent illumination. Proc Natl Acad Sci U S A 19:522–535


Isensee SH, Bennett CA (1983) The perception of flicker and glare on computer CRT displays. Hum Factors 30:689
Kelly DH (1961) Visual responses to time dependent stimuli I. Amplitude sensitivity. J Opt Soc Am 51:422–429
Kelly DH (1969) Diffusion model of linear flicker responses. J Opt Soc Am 59:1665–1670
Kelly DH (1971) Theory of flicker and transient responses I. Uniform fields. J Opt Soc Am 61:537–546
Kelly DH (1972) Flicker. In: Jameson D, Hurvich L (eds) Visual psychophysics, vol VIII/4, Handbook of sensory physiology series. Springer, Heidelberg
Lee BB, Martin PR, Valberg A (1989) Sensitivity of macaque retinal ganglion cells to chromatic and luminance flicker. J Physiol 414:223–243
Pearson RA (1991) Predicting VDT flicker. Inf Display 7&8:22
Rogowitz B (1986) A practical guide to flicker measurement. Behav Inform Technol 5:359–378
Roufs JA (1972) Dynamic properties of vision, I. Experimental relationships between flicker and flash thresholds. Vision Res 12:261–278
Turnage RE (1966) The perception of flicker in CRT displays. Inf Display 3:38–42
Tyler CW (1985) Analysis of visual modulation sensitivity II. Peripheral retina and the role of photoreceptor dimensions. J Opt Soc Am A 2:393–398
Tyler CW, Hamer RD (1990) Analysis of visual modulation sensitivity IV. Validity of the Ferry-Porter law. J Opt Soc Am A 7:743–758
Tyler CW, Hamer RD (1993) Eccentricity and the Ferry-Porter law. J Opt Soc Am A 10:2084–2087
Wu S, Burns SA, Reeves A, Elsner AE (1996) Flicker brightness enhancement and visual nonlinearity. Vision Res 36:1573–1583

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_8-2 # Springer-Verlag Berlin Heidelberg 2015

Spatial Vision and Pattern Perception
L. Srinivasa Varadharajan*
Hyderabad Eye Research Foundation, L. V. Prasad Eye Institute, Hyderabad, India
*Email: [email protected]

Abstract This chapter deals with the visual system’s sensitivity to contrast and the various ways by which it is affected. Specifically, contrast detection and discrimination thresholds and superimposed and lateral masking effects will be discussed. Effects of pattern adaptation and the consequent modification of the contrast sensitivity function will also be discussed.

List of Abbreviations
cpd   Cycles per degree
CSF   Contrast sensitivity function
FACT  Functional acuity contrast test
MTF   Modulation transfer function

Introduction
The term spatial vision encompasses all things related to seeing the space around us. This definition is so broad that it includes everything related to vision; however, it is usually restricted to visual perception of nonmoving two-dimensional luminance patterns. With this restriction, the terms spatial vision, pattern vision, and pattern perception become interchangeable. Human pattern vision measurements can be grouped under two broad categories, namely, threshold measurements and stimulus matching. In the first category, the value of a particular dimension of interest, such as the contrast, at which the stimulus is detected (or discriminated from another stimulus with a slightly different value) with a given probability is measured. This stimulus value is called the threshold; sensitivity is usually defined as the reciprocal of this threshold value. Most of our understanding of the visual system comes from some sort of threshold measurements. In the second category, the value mentally assigned to the particular dimension of interest is measured. When the stimulus is presented in isolation in a uniform field or background of some sort, we have a simple detection, discrimination, or a matching measurement. When other stimuli are presented, either superimposed on or spatially separated from the stimulus of interest, we have a masking measurement. Unless stated otherwise, all information given below pertains to foveal vision when viewing is done with one eye only.

Ricco's Law
The simplest stimulus to display, and detect, is a spot of light in a uniform background. The earlier chapter on light detection and sensitivity (chapter "▶ Light Detection and Sensitivity") dealt with the temporal


aspects of this detection task. When the target is small, the luminance detection threshold is inversely proportional to the area of the target. This is known as Ricco's law, and it is usually written as:

log(Lt) = K − log(A)

where Lt denotes the threshold luminance for detecting a target of area A (given in deg2). The product of the threshold luminance, area, and the stimulus duration gives the threshold energy. Therefore, Ricco's law implies that for small targets, the visual system essentially integrates the energy over the entire target and sees it when the total energy just matches or exceeds the threshold value. The maximum area over which this integration happens is called Ricco's area. In other words, Ricco's area is the area of the target beyond which the relationship given above does not hold true. Estimates of Ricco's area are affected by various stimulus and observer parameters. In general, Ricco's area varies from 2 to 14 min2 in the fovea (Davila and Geisler 1991). This area increases enormously as the target is presented in regions away from the fovea (Westheimer 1965; Vassilev et al. 2005), when the background illumination is increased or the stimulus duration is decreased (Barlow 1958), and with age (Schefrin et al. 1998). These studies also showed that spatial summation of stimuli with areas smaller than Ricco's area happens at the retinal level. When the stimulus is larger than Ricco's area, the threshold luminance decreases as the square root of the area of the stimulus. This relationship is called Piper's law. A plot of log(threshold) versus log(area) will have a slope of −1/2 for such stimuli. However, this relationship is found to break down quite quickly beyond Ricco's area (Westheimer 1965; Barlow 1958). As one increases the size of the target beyond Ricco's area, the threshold luminance keeps decreasing, albeit by smaller amounts. Such continued summation could be explained only through the cortical areas of the brain.
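The two regimes can be sketched as a piecewise function of area: Ricco's law (slope −1) inside Ricco's area, matched to Piper's law (slope −1/2) beyond it. The constant K and the transition area below are illustrative values only:

```python
import math

def log_threshold_luminance(area_deg2, riccos_area_deg2=0.002, k=0.0):
    """log(Lt) = K - log(A) within Ricco's area; beyond it the slope
    flattens to -1/2 (Piper's law), matched at the transition point."""
    if area_deg2 <= riccos_area_deg2:
        return k - math.log10(area_deg2)
    log_at_break = k - math.log10(riccos_area_deg2)
    return log_at_break - 0.5 * math.log10(area_deg2 / riccos_area_deg2)

# Doubling the area of a tiny target halves the threshold luminance:
ratio = 10 ** (log_threshold_luminance(0.0005) - log_threshold_luminance(0.001))
print(ratio)  # 2.0
```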

Contrast Sensitivity Function (CSF)
Our ability to perceive luminance variation across space is denoted by the contrast sensitivity function (CSF). In general, sine wave gratings are used as the stimuli for obtaining the CSF. These sine wave gratings are shown on a background that has the same luminance as the average luminance of the grating. The threshold Michelson contrast at which a sine wave grating of a particular spatial frequency is detected is determined for various spatial frequencies. Contrast sensitivity at any given spatial frequency, as mentioned earlier, is the inverse of the threshold contrast. The dependence of this sensitivity on spatial frequency is called the contrast sensitivity function (Fig. 1). What the modulation transfer function (MTF) is to an optical system is what the CSF is to our optical-neural system. The image formed on the retina is solely dictated by the MTF (or more correctly, the optical transfer function) of the optics that precedes the retina. The neural system then operates on this retinal image to form the final mental picture. Unlike the MTF, the CSF is a band-pass filter. Typically, normal adults have maximum sensitivity between 2 and 6 cycles per degree of visual angle. There is a steep reduction in sensitivity at lower spatial frequencies. On the higher spatial frequency side, sensitivity reduces gradually, reaching a value of 1 (since the maximum contrast possible is 1 and sensitivity is the inverse of the threshold contrast, the minimum possible sensitivity is 1) at about 60 cycles per degree (cpd). This high spatial frequency cutoff determines the resolution limit of the visual system (see also chapter "▶ Visual Acuity") and is called the grating acuity. It is also interesting to note that the photoreceptor size and arrangement in the fovea result in a Nyquist frequency that is close to the grating acuity. The threshold contrast for human subjects could go as low as 0.3 %. On a linearized display driven by the usual 8-bit DAC, such a low contrast cannot be realized. Techniques like dithering, superimposition



Fig. 1 The CSF of two subjects (Adapted from Campbell and Green 1965)

with a high uniform luminance, or bit expansion circuits like Pelli and Zhang's attenuator (Pelli and Zhang 1991) or BITS++TM (Cambridge Research Systems Ltd., UK) are routinely used to overcome this problem. Clinically, the CSF is commonly measured using the functional acuity contrast test (FACT) chart (Vision Science Research Corporation, CA, USA) or the Pelli-Robson chart (Precision Vision, IL, USA). The FACT chart contains five rows of sine wave gratings. Each row has gratings of fixed spatial frequency; contrast decreases from left to right. Each grating is oriented vertical or tilted 15° to the left or right of vertical (Fig. 2). The subject is usually seated 1 m from the chart and asked to report the orientation of each grating. For each row, the contrast at which the subject makes the first error is taken as the threshold at that spatial frequency. The Pelli-Robson chart (Pelli et al. 1988) contains triplets of letters arranged as two triplets per line. The letters in each triplet are at the same contrast, and the contrast decreases by a factor of 1/√2 from one triplet to the next (Fig. 3). The letters subtend an angle of 0.5° at 3 m, i.e., the Pelli-Robson chart measures the contrast sensitivity at 5 cpd. The lowest contrast at which the subject reads at least two letters correctly is taken as the threshold contrast. The Pelli-Robson chart has been shown to have good reproducibility and reliability (Elliott et al. 1990) and is now increasingly used in clinics throughout the world.
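The grating stimuli themselves are straightforward to synthesize. The NumPy sketch below (ours) builds a sine-wave grating of specified Michelson contrast, C = (Lmax − Lmin)/(Lmax + Lmin), on a background equal to its mean luminance; the printed extremes illustrate why a 0.3 % contrast is below the step size of an 8-bit DAC:

```python
import numpy as np

def sine_grating(size_px, cycles_per_image, michelson_contrast,
                 mean_luminance=0.5, orientation_deg=0.0):
    """Luminance image of a sine-wave grating: L = Lmean *
    (1 + C * sin(2*pi*f*r)), which has Michelson contrast C."""
    y, x = np.mgrid[0:size_px, 0:size_px] / size_px
    theta = np.deg2rad(orientation_deg)
    ramp = x * np.cos(theta) + y * np.sin(theta)
    return mean_luminance * (1.0 + michelson_contrast
                             * np.sin(2.0 * np.pi * cycles_per_image * ramp))

g = sine_grating(256, 8, michelson_contrast=0.003)  # near-threshold contrast
print(g.max(), g.min())  # 0.5015, 0.4985: a swing finer than 1/255 of full range
```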

Factors Affecting CSF
Various factors are known to affect the human CSF in predictable ways. In general, alterations in the optics of the eye, the retinal/neural structure, the stimulus, or the adaptation state of the observer will alter the CSF.
Optical Factors: Any degradation in the optics of the system shows its effect on the MTF and consequently on the CSF. One would therefore expect little or no effect at low spatial frequencies and increasing reductions in sensitivity as the spatial frequency is increased. Thus, the high


Fig. 2 The functional acuity contrast test (FACT) chart

Fig. 3 Pelli-Robson chart (Adapted from Pelli et al. 1988)

spatial frequency cutoff would be moved toward lower spatial frequencies, and in some cases, the peak sensitivity would also be shifted toward lower spatial frequencies. Typical optical degradations include the defocus found in persons with uncorrected refractive errors or cataract, or when the pupil size is increased (Atchison et al. 1998; Hess and Woo 1978; Strang et al. 1999).


Fig. 4 Percentage threshold contrast (M) as a function of retinal illumination B0 in trolands (Adapted from Van Nes and Bouman 1967)

Retinal and Neural Factors: The photoreceptor arrangement in the retina dictates the high spatial frequency cutoff of the CSF. The receptor arrangement can be altered for various reasons, including age (Crassini et al. 1988), disease (Bour and Apkarian 1996), or the increased overall dimension of the eyeball found in people with high refractive errors. The effect of this retinal factor is similar to the optical factor in that sensitivity is unaffected at low spatial frequencies and losses in sensitivity are seen at high spatial frequencies. The term "neural factors" includes all cells upstream from the retina. Depending upon the type of deficit, which is usually due to some disease, sensitivity losses could be seen at all spatial frequencies. Another retinal effect is the position on the retina where the stimulus is presented. The photoreceptor packing becomes less dense, and the photoreceptors themselves become larger, as we move away from the fovea. This results in a reduction in the sampling rate. The effect is a reduction in the high-frequency cutoff value and in the sensitivity to high spatial frequencies; the low spatial frequencies are not much affected (Crassini et al. 1988).
Stimulus Factors: There are many stimulus parameters that could affect the CSF, but the mean luminance and the size of the target are the two most important. At any given spatial frequency, the threshold decreases as the square root of the average luminance (de Vries-Rose law) and then remains constant for high luminance values (Fig. 4). Also, as the luminance is increased, the peak sensitivity moves toward higher spatial frequencies (Van Nes and Bouman 1967; De Valois et al. 1974). Similarly, an increase in the size of the target initially results in a profound decrease in the threshold; this decrease becomes less pronounced at large sizes, appearing to approach an asymptote. In general, the rapid reduction in thresholds appears to take place for sizes up to five cycles of the target.
Adaptation: Adaptation is the process by which sensitivity to a particular stimulus parameter can be altered by prolonged exposure to a carefully chosen stimulus. Adaptation to a sine wave grating of a particular spatial frequency reduces the sensitivity around that frequency (Fig. 5) with a bandwidth of about 1 octave (Blakemore and Campbell 1969).

Contrast Detection
The presence of such a notch effect is used to explain the CSF as an envelope of various band-limited sensitivity functions centered at various spatial frequencies (Fig. 6). Detection of a grating of a given spatial frequency is then explained using the concepts of linear filters. In general, an image is processed by such a set of filters, and when the output of a filter exceeds a certain value – a threshold response – the


Fig. 5 Effect of adaptation to a sine wave grating of 7.1 cpd (for observer F.W.C.). The dots with the error bars show the contrast sensitivities at various spatial frequencies after adaptation to a 7.1 cpd grating indicated by the thick arrow (Adapted from Blakemore and Campbell 1969)

image or that component of the image with the specified spatial frequency content will be detected. In the spatial vision parlance, such filter-response systems are called mechanisms or channels. The response of such a mechanism is usually modeled as the product of the spatial sensitivity profile of the mechanism and the luminance profile of the stimulus. This spatial sensitivity profile is called the receptive field of the mechanism. The visual system consists of a vast number of such mechanisms. Besides the central spatial frequency, these mechanisms could be differentiated based on the center of their spatial profile (i.e., the point where the receptive field is centered), the orientation of the grating they are responsive to, the phase of the grating, the size of stimulus, the direction of motion of the stimulus, etc. The values of these various parameters define the “tuning” of the mechanism. The presence of such mechanisms could be used to explain many of the factors that have been described above. A thorough treatment of these mechanisms can be found in the excellent book by Graham (Graham 1992).
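As a deliberately simplified illustration, the sketch below models one mechanism's receptive field as a Gabor patch and computes its response as the spatially summed product with a stimulus. It reuses the sine_grating helper from the earlier sketch, and every parameter value is an arbitrary choice:

```python
import numpy as np

def gabor_receptive_field(size_px, cycles_per_image, orientation_deg,
                          phase_rad=0.0, sigma_frac=0.15):
    """Receptive field modeled as a Gabor: a sinusoidal carrier under
    a Gaussian envelope centered on the patch."""
    y, x = np.mgrid[0:size_px, 0:size_px] / size_px - 0.5
    theta = np.deg2rad(orientation_deg)
    ramp = x * np.cos(theta) + y * np.sin(theta)
    carrier = np.cos(2.0 * np.pi * cycles_per_image * ramp + phase_rad)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma_frac**2))
    return carrier * envelope

def mechanism_response(receptive_field, luminance_image):
    """Pointwise product of the spatial sensitivity profile and the
    stimulus luminance profile, summed over space."""
    return float(np.sum(receptive_field * luminance_image))

# Phase-matched grating at the mechanism's preferred frequency/orientation:
rf = gabor_receptive_field(128, 8, orientation_deg=0.0, phase_rad=-np.pi / 2)
stim = sine_grating(128, 8, michelson_contrast=0.5)
print(mechanism_response(rf, stim))  # large response: stimulus matches the tuning
```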

Contrast Discrimination
The threshold for detecting a change in the contrast of a stimulus depends on the initial contrast value. The stimulus over which the change in contrast is detected is called the pedestal. Typically, the threshold decreases as a function of the pedestal contrast, reaches a minimum, and then increases monotonically. At large values of the pedestal contrast, the threshold increases almost linearly. It is important to note that the discrimination threshold is smaller than the detection threshold when the pedestal contrast is low, i.e., very faint stimuli enhance our discrimination ability.


Fig. 6 The overall contrast sensitivity as an envelope of different band-limited sensitivity functions centered at various spatial frequencies

Superimposed and Lateral Masks
The pedestal, as mentioned above, can be thought of as a superimposed mask; the task then becomes a detection task instead of a discrimination one. With this change of perspective, we can then manipulate the parameters of the pedestal and study the detection of a grating in the presence of superimposed masks. The parameters that could be manipulated are the spatial frequency, orientation, phase, and size of the pedestal. In general, when the parameters are close to the values of the target and when the pedestal contrast is low, the detection task mimics the discrimination task mentioned above. When the values of the parameters are very different from that of the target and when the pedestal contrast is low, the detection of the target becomes more difficult, and hence, the thresholds are increased. The opposite is true when the pedestal contrast is high. In this case, when the values of the parameters of the pedestals become more and more different from that of the target, the target becomes easy to detect, and hence, the thresholds are reduced from the discrimination thresholds. Figure 7 shows the effect of phase and orientation of the pedestal on the detection threshold of the target. Instead of a pedestal, if we have other stimuli that are spatially displaced from the target, we have a lateral masking measurement. Usually, to maintain symmetry, masks are presented either completely surrounding the target or in pairs on opposite sides of the target. Lateral masks produce similar effects on the detection of a target as superimposed masks. The question of greatest importance is the effect the distance between the mask and the target has on the detection threshold because this would give us the information about the interactions of the underlying mechanism. In general, low contrast flankers that are close to the target seem to reduce detection thresholds, and the effect of this integration extends to quite a separation, sometimes up to eight times the target size (Adini et al. 1997). Such measurements clearly indicate that mechanisms with receptive fields that are widely separated interact. As mentioned earlier, each mechanism's response would be a product of its spatial sensitivity profile and the luminance distribution within its receptive field. These responses would then be combined to produce various interesting effects (Foley 1994). This lateral interaction of receptive fields is used to explain a lot of


Fig. 7 Effect of changing the pedestal parameters on the detection thresholds of the target. All contrasts are given in dB re 1, i.e., in units of 20 log(contrast). In all three graphs, the curve corresponding to the value "0" corresponds to the discrimination threshold; (a) and (b) show the effect of changing the orientation of the pedestal with respect to the target (Foley 1994); (c) shows the effect of the pedestal phase with respect to the target on target detection (Adapted from Foley and Chen 1999)

interesting phenomena, including the Hermann grid, Mach bands, grating induction, etc. Letter charts using the logMAR progression are designed to provide relatively equal amounts of lateral interaction for any letter in the chart except the ones at the extremities.

Summary
Spatial vision refers to the process of detecting and discriminating simple sinusoidal patterns. Our spatial vision ability is described by our contrast sensitivity function. This sensitivity is affected by various factors such as the optical, retinal, neural, and adaptation state of the observer. The detection of these simple patterns is described by invoking a set of linear filters that have various tuning properties. These mechanisms interact to produce a variety of effects.

Further Reading
Adini Y, Sagi D, Tsodykes M (1997) Excitatory – inhibitory networks in the visual cortex: psychophysical evidence. Proc Natl Acad Sci U S A 94:10426–10431


Atchison DA, Woods RL, Bradley A (1998) Predicting the effects of optical defocus on human contrast sensitivity. J Opt Soc Am A 15:2536–2544
Barlow HB (1958) Temporal and spatial summation in human vision at different background intensities. J Physiol 141:337–350
Blakemore C, Campbell FW (1969) On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images. J Physiol 203:237–260
Bour LJ, Apkarian P (1996) Selective broad-band spatial frequency loss in contrast sensitivity functions. Invest Ophthalmol Vis Sci 37:2475–2482
Campbell FW, Green DG (1965) Optical and retinal factors affecting visual resolution. J Physiol 181:576–593
Crassini B, Brown B, Bowman K (1988) Age-related changes in contrast sensitivity in central and peripheral retina. Perception 17:315–332
Davila KD, Geisler WS (1991) The relative contributions of pre-neural and neural factors to areal summation in the fovea. Vision Res 31:1369–1380
De Valois R, De Valois K (1988) Spatial vision. Oxford Science Publication, New York
De Valois RL, Morgan H, Snodderly DM (1974) Psychophysical studies of monkey vision-III. Spatial luminance contrast sensitivity tests of macaque and human observers. Vision Res 14:75–81
Elliott DB, Sanderson K, Conkey A (1990) The reliability of the Pelli-Robson contrast sensitivity chart. Ophthalmic Physiol Opt 10:21–24
Foley JM (1994) Human luminance pattern-vision mechanisms: masking experiments require a new model. J Opt Soc Am A 11:1710–1719
Foley JM, Chen C-C (1999) Pattern detection in the presence of maskers that differ in spatial phase and temporal offset: threshold measurements and a model. Vision Res 39:3855–3872
Graham N (1992) Visual pattern analyzers. Oxford University Press, New York
Hess R, Woo G (1978) Vision through cataracts. Invest Ophthalmol Vis Sci 17:428–435
Pelli DG, Zhang L (1991) Accurate control of contrast on microcomputer displays. Vision Res 31:1337–1350
Pelli DG, Robson JG, Wilkins AJ (1988) The design of a new letter chart for measuring contrast sensitivity. Clin Vision Sci 2:187–199
Schefrin BE, Bieber ML, McLean R, Werner JS (1998) The area of complete scotopic spatial summation enlarges with age. J Opt Soc Am A 15:340–348
Shapley R, Man-Kit Lam D (1992) Contrast sensitivity. MIT Press, Cambridge, MA
Strang NC, Atchison DA, Woods RL (1999) Effects of defocus and pupil size on human contrast sensitivity. Ophthalmic Physiol Opt 19:415–426
Van Nes FL, Bouman M (1967) Spatial modulation transfer in the human eye. J Opt Soc Am 57:401–406
Vassilev A, Ivanov I, Zlatkova MB, Anderson RS (2005) Human S-cone vision: relationship between perceptive field and ganglion cell dendritic field. J Vis 5:823–833
Westheimer G (1965) Spatial interaction in the human retina. J Physiol 181:881–894

Page 9 of 9

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_9-2 # Springer-Verlag Berlin Heidelberg 2015

Binocular Vision and Depth Perception
Robert Earl Patterson*
Air Force Research Laboratory, Wright-Patterson AFB, OH, USA

Abstract
This chapter covers several topics that are important for a basic understanding of binocular vision and depth perception. These topics include the horopter, binocular disparity, binocular rivalry, spatiotemporal frequency effects, and distance scaling of disparity.

Introduction
Stereopsis refers to the perception of depth based on binocular disparity, a cue that derives from the existence of horizontally separated eyes. Wheatstone (1838) was the first to report that disparity is the cue for stereopsis, which he called “seeing in solid.” Since his original observations, the phenomenon of binocular depth perception has attracted much interest from the basic and applied scientific and engineering communities. For a recent review of stereopsis, see Howard (2002) and Howard and Rogers (2002). For a review of stereopsis as applied to stereo displays, see Patterson and Martin (1992) and Patterson (2015). This chapter reviews the basics of human stereopsis, which includes the horopter, binocular disparity, and binocular rivalry. It also covers related topics such as spatiotemporal frequency effects and disparity scaling.

Horopter and Binocular Disparity
The basics of binocular vision begin with the concept of corresponding retinal areas or corresponding retinal points. Corresponding retinal areas in the two eyes are stimulated when fixation is directed toward an object in the visual field. The images from the fixated object stimulate the two foveae, which are considered to be corresponding. When a pair of images stimulates corresponding retinal areas, those images are said to possess zero binocular disparity. There is an imaginary arc passing through the fixation point called the “horopter.” The horopter is an important concept in binocular vision because all points or objects along its length define the locations in space that project images which also strike corresponding retinal areas in the two eyes and therefore possess zero disparity (see Fig. 1). One way to consider corresponding retinal areas is that they give rise to a common visual direction. Thus, an object that is positioned at the horopter will give rise to images that will stimulate each eye in such a way that the images will appear to come from the same direction out in the visual field. Because the horopter defines the locations in space that project zero binocular disparity, it is considered a reference or baseline depth plane from which the depth of other objects is judged, that is, the depth of objects that lie in front of or behind the horopter.

*Email: [email protected]


Fig. 1 Drawing depicting the basics of stereoscopic viewing. The drawing shows two circles which represent a top-down view of the two eyes, fixation point F, the horopter passing through the fixation point, Panum’s fusional area, crossed and uncrossed disparity regions, and object X and object Y. When point F is fixated, the images from F stimulate corresponding retinal points (foveae) in the two eyes and are fused. Object X is positioned in front of the horopter and thus carries a crossed disparity, but the images from X, which stimulate non-corresponding (disparate) retinal points in the two eyes, are fused because X is located within Panum’s fusional area. Object Y is positioned farther in front of the horopter and also carries a crossed disparity, but the images from Y, which also stimulate disparate retinal points in the two eyes, are seen as diplopic (double) because Y is located outside Panum’s fusional area (Reproduced from Patterson (2009) with permission from The Society for Information Display)

The region of depth surrounding the horopter is subdivided into a front region and a back region, which correspond to the direction of the disparity information projected by objects in those regions. An object located in a depth plane in front of the horopter (and therefore in front of fixation) will project images with “crossed” disparity (Fig. 1), whereas an object located in a depth plane behind the horopter (fixation) will project images with “uncrossed disparity.” Moreover, there is a zone of space surrounding the horopter that defines the region of binocular fusion and single vision called Panum’s fusional area (Fig. 1). Objects situated within Panum’s area are seen as fused, and depth perception is generally reliable within this region. Objects situated outside Panum’s area project images that cannot be fused (i.e., the images are diplopic or seen as double) and depth perception becomes unreliable. It has been established that these two directions of disparity, crossed versus uncrossed disparity, are processed differently by different sets of cortical neurons in the visual brain (Cumming and DeAngelis 2001; Poggio 1995; Poggio et al. 1985). Specifically, there are cortical neurons that are excited by crossed disparity and inhibited by uncrossed disparity, and different neurons excited by uncrossed disparities and inhibited by crossed disparities. There also are neurons which are excited or inhibited by zero or near-zero disparities. The conscious perception of depth is thought to be derived from the pooled signals from these various sets of neurons, which form a distributed-channel network for stereoscopic processing: depth in front of the horopter is perceived if neural responding is greatest in the neurons activated by crossed disparity, whereas depth behind the horopter is perceived if neural responding is greatest in the neurons activated by uncrossed disparity; depth in the plane of the horopter is perceived if neural responding is greatest in the neurons activated by zero disparity. An object that is positioned in front of the horopter and therefore projects crossed disparity to the visual system may end up projecting uncrossed disparity, or vice versa, when an observer executes vergence eye movements by shifting fixation to different objects in the visual field. In this case, the relative disparity between stationary objects in the visual field remains constant but their absolute disparity, which is the relevant cue for stereopsis (Cumming and Parker 1999), will change.
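This distributed-channel readout can be made concrete with a toy sketch. Nothing below comes from this chapter or the vision literature; the tuning curves and the winner-take-all readout are illustrative assumptions only, meant to show how pooled crossed, uncrossed, and near-zero responses could determine the sign of perceived depth.

```python
# Toy sketch of a distributed-channel readout for stereoscopic depth.
# The tuning curves are invented for illustration; by convention here,
# negative disparities are "crossed" (in front of the horopter) and
# positive disparities are "uncrossed" (behind it).

def channel_response(disparity_arcmin: float, preferred: str) -> float:
    """Crude, hypothetical tuning curves for the three channel types."""
    if preferred == "crossed":
        return max(0.0, -disparity_arcmin)   # excited by crossed disparity
    if preferred == "uncrossed":
        return max(0.0, disparity_arcmin)    # excited by uncrossed disparity
    return max(0.0, 1.0 - abs(disparity_arcmin))  # "zero" channel

def perceived_depth(disparity_arcmin: float) -> str:
    """Depth sign follows whichever pooled response is greatest."""
    pools = {p: channel_response(disparity_arcmin, p)
             for p in ("crossed", "uncrossed", "zero")}
    winner = max(pools, key=pools.get)
    return {"crossed": "in front of the horopter",
            "uncrossed": "behind the horopter",
            "zero": "in the plane of the horopter"}[winner]

print(perceived_depth(-5.0))  # in front of the horopter
print(perceived_depth(0.2))   # in the plane of the horopter
print(perceived_depth(8.0))   # behind the horopter
```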


Note that there are concepts associated with accommodation, such as the depth of field of the human eye, which are related to stereoscopic depth perception. These concepts are discussed in chapters “▶ Human Factors of 3D Displays” and “▶ Human Interface Factors Associated with HWDs.”

Binocular Rivalry
When an object is positioned outside Panum’s fusional area and its two monocular images cannot be fused (i.e., the images are diplopic), one of the two images may be perceptually suppressed and inhibited via a process called “binocular rivalry.” Binocular rivalry (Blake 1989, 2001; Breese 1899; Levelt 1965) refers to a situation in which one eye inhibits the visual processing of the partner eye, causing the visibility of the two monocular images to fluctuate over time. When viewing in the real world, binocular rivalry may go unnoticed. However, when viewing a stereo display with a very large disparity that cannot be fused, the binocular rivalry may be noticed. The neural inhibition provoked by binocular rivalry occurs at many levels of the visual system (Blake 2001), which makes visual processing unstable and unpredictable; the inhibition is known, for example, to impair the ability of observers to visually guide and direct attention to targets in the visual field (Schall et al. 1993).

Spatiotemporal Frequency Effects
Panum’s fusional area, the zone of binocular fusion and single vision shown in Fig. 1, is not static but rather varies with the luminance spatiotemporal frequency content of the images impinging upon the retina (Schor and Wood 1983; Schor et al. 1984; Patterson 1990; Blakemore 1970). Here, the term “spatiotemporal frequency” refers to the rate of modulation of luminance information over space and time in an image. High spatial frequency refers to fine luminance details in an image, and low spatial frequency refers to coarse details. Similarly, high temporal frequency refers to a high rate of temporal modulation or to sudden changes in luminance information in time (e.g., brief stimulus exposures), and low temporal frequency refers to a low rate of temporal modulation or gradual changes. Panum’s fusional area varies with the luminance spatiotemporal frequency content of imagery because the spatiotemporal frequency differentially engages various types of visual channels, and many of those channels feed into the neural substrate for disparity processing in the brain. Specifically, high spatial frequencies engage visual channels that process relatively small disparities and which subserve fine depth discrimination (i.e., fine stereoacuity) within a narrow Panum’s area. For example, stereoacuity is about 20 arcsec in the spatial frequency range of about 2–20 cycles per degree of visual angle. The maximum disparity that can be reliably discriminated is a little over 40 arcmin within this spatial frequency range (these values apply equally to both the crossed and uncrossed directions from the horopter) (Schor and Wood 1983). Conversely, low spatial frequencies engage visual channels that process relatively large disparities and therefore support a relatively large Panum’s area but which lack the capacity for fine depth discrimination (Schor and Wood 1983). Below a spatial frequency of about 2 cycles per degree, both stereoacuity and the maximum disparity that can be discriminated increase with decreasing spatial frequency such that at a spatial frequency of about 0.1 cycles per degree, stereoacuity is about 5 arcmin and the maximum discriminable disparity is about 4 arc degrees (again, these values apply equally to both the crossed and uncrossed directions from the horopter). However, low spatial frequencies combined with moderate to high temporal frequencies can support fine stereoacuity (Schor et al. 1984; Patterson 1990). In general, fine stereoacuity occurs near the horopter and fixation point (Blakemore 1970), which translates into stimulation near the fovea.

Fig. 2 Drawing depicting the change in disparity magnitude with variation in viewing distance. The diagram in the top panel of the figure shows a top-down view of two eyes fixating point F at a short distance, whereas the diagram in the bottom panel depicts two eyes fixating the same point F at a long distance. In both diagrams, the depth between F and Y is the same magnitude. Increasing the viewing distance causes disparity magnitude to decrease. For depth to be perceived reliably, viewing distance as well as disparity must be registered by the visual system (Reproduced from Patterson (2009) with permission from The Society for Information Display)

Distance Scaling of Disparity
Stereoscopic processing does not simply require binocular disparity information in order to generate a perception of depth. Rather, stereoscopic processing is a complicated phenomenon that entails synergistic processing of a number of visual cues in addition to disparity information, namely, cues as to the viewing distance established by an observer. Thus, a critical distinction needs to be made between binocular disparity, which is relative depth information (i.e., relative to the horopter), and egocentric viewing distance information, which refers to the distance between an observer and the point of fixation. Stereoscopic vision requires a synergistic processing of binocular disparity coupled with viewing distance information because the magnitude of binocular disparity that is projected to the two eyes will vary as viewing distances change. Generally, the magnitude of binocular disparity varies approximately inversely with the square of the viewing distance in the real world (see Fig. 2). If viewing distance to a constant interval of depth between two objects in the visual field is halved, then disparity will be approximately four times its initial value, and if viewing distance is doubled, disparity will be
approximately one-fourth its original value. This relation can be seen in the expression for computing the magnitude of disparity for real-world viewing (Cormack and Fox 1985a): with a relatively large viewing distance and symmetrical convergence, disparity magnitude is computed as

$$r = \frac{I \, d}{D^2},$$

where r is disparity (in radians), I is interpupillary distance, d is the depth interval, and D is viewing distance. Note that the relationship between disparity and depth is different for stereo displays (Cormack and Fox 1985b), a topic covered in chapter “▶ Human Interface Factors Associated with HWDs.” Thus, a given amount of binocular disparity will be ambiguous for determining how much depth exists out in the visual field unless viewing distance is taken into account. It is believed that the synergistic processing involves having the visual system recalibrate the relationship between disparity and depth for different viewing distances, a process called “distance scaling of disparity,” or “disparity scaling” for short (Patterson and Martin 1992; Patterson 2015; Ono and Comerford 1977; Wallach and Zuckerman 1963; Ritter 1977; Patterson et al. 1992). A number of viewing distance cues have been suggested as playing a role in the disparity scaling operation, such as accommodation and vergence (Foley 1980; Owens and Leibowitz 1976; Owens and Leibowitz 1980; von Hofsten 1976), and vertical disparity (Gillam and Lawergren 1983; Gillam et al. 1988; Rogers and Bradshaw 1993), but these cues would do so only for short viewing distances. There is also limited evidence that field cues, such as linear or texture perspective, may provide distance information for disparity scaling (Cormak 1984).
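The inverse-square behavior falls straight out of this expression. The following minimal sketch assumes the small-angle formula above, with illustrative values for the interpupillary distance and depth interval:

```python
import math

def disparity_radians(I: float, d: float, D: float) -> float:
    """Approximate disparity r = (I * d) / D**2 for a depth interval d at
    viewing distance D with interpupillary distance I (same units for all),
    valid when D is large relative to d and convergence is symmetrical."""
    return (I * d) / (D ** 2)

I = 0.065  # interpupillary distance, ~65 mm (typical adult value)
d = 0.10   # a 10 cm depth interval between two objects

for D in (1.0, 2.0, 4.0):  # viewing distances in metres
    r = disparity_radians(I, d, D)
    print(f"D = {D:.0f} m: disparity = {math.degrees(r) * 60:.2f} arcmin")

# Doubling the viewing distance quarters the disparity, as stated above.
```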

Summary and Conclusions
The ability to perceive depth with binocular vision arises from the existence of horizontally separated eyes. This horizontal separation between the eyes produces lateral shifts in the location of corresponding monocular images, called binocular disparity, which the visual system processes as relative depth information (i.e., relative to the horopter). This visual translation of disparity into relative depth entails a synergistic operation that recalibrates the relationship between disparity and depth for different viewing distances. The existence of this recalibration process, called disparity scaling, means that the reliability of stereoscopic depth perception is vulnerable to those factors that affect the visual registration of viewing distance.

Directions for Future Research
Future research should investigate further the factors that affect the disparity scaling process, especially the relative strength of the various distance cues thought to play a role in such scaling. The visual combination of various cues in the perception of depth is a recognized issue in the basic vision literature (Hillis et al. 2004; Jacobs 1999; Knill and Saunders 2003). Nonetheless, more could be known about the process, particularly how it affects depth perception in synthetic stereo displays.

Further Reading
Blake R (1989) A neural theory of binocular rivalry. Psychol Rev 96:145–167
Blake R (2001) A primer on binocular rivalry, including current controversies. Brain Mind 2:5–38
Blakemore C (1970) The range and scope of binocular depth discrimination in man. J Physiol 211:599–622


Breese B (1899) On inhibition. Psychol Monogr 3:1–65
Cormack R, Fox R (1985a) The computation of retinal disparity. Percept Psychophys 37:176
Cormack R, Fox R (1985b) The computation of disparity and depth in stereograms. Percept Psychophys 38:375
Cormak R (1984) Stereoscopic depth perception at far viewing distances. Percept Psychophys 35:423
Cumming B, DeAngelis G (2001) The physiology of stereopsis. Annu Rev Neurosci 24:203–238
Cumming B, Parker A (1999) Binocular neurons in V1 of awake monkeys are selective for absolute, not relative, disparity. J Neurosci 19:5602–5618
Foley J (1980) Binocular distance perception. Psychol Rev 87:411–434
Gillam B, Lawergren B (1983) The induced effect, vertical disparity, and stereoscopic theory. Percept Psychophys 34:121–130
Gillam B, Chambers D, Lawergren B (1988) The role of vertical disparity in the scaling of stereoscopic depth perception: an empirical and theoretical study. Percept Psychophys 44:473–483
Hillis J, Watt S, Landy M, Banks M (2004) Slant from texture and disparity cues: optimal cue combination. J Vis 4:967–992
Howard I (2002) Seeing in depth, vol 1, Basic mechanisms. Porteous, New York
Howard I, Rogers B (2002) Seeing in depth, vol 2, Depth perception. Porteous, New York
Jacobs R (1999) Optimal integration of texture and motion cues to depth. Vision Res 39:3621–3629
Knill D, Saunders J (2003) Do humans optimally integrate stereo and texture information for judgments of surface slant? Vision Res 43:2539–2558
Levelt W (1965) On binocular rivalry. Institute for Perception RVO-TNO, Soesterberg
Ono H, Comerford T (1977) Stereoscopic depth constancy. In: Epstein W (ed) Stability and constancy in visual perception: mechanisms and processes. Wiley, New York
Owens D, Leibowitz H (1976) Oculomotor adjustments in darkness and the specific distance tendency. Percept Psychophys 20:2–9
Owens D, Leibowitz H (1980) Accommodation, convergence, and distance perception in low illumination. Am J Optom Physiol Opt 57:540–550
Patterson R (1990) Spatio-temporal properties of stereoacuity. Optom Vis Sci 67:123–125
Patterson R (2009) Human factors of stereoscopic displays. In: Society for Information Display international symposium digest of technical papers, Campbell, CA, pp 805–807
Patterson R (2015) Human factors of stereoscopic displays. Springer, London
Patterson R, Martin W (1992) Human stereopsis. Hum Factors 34:669–692
Patterson R, Moe L, Hewitt T (1992) Factors that affect depth perception in stereoscopic displays. Hum Factors 34:655–667
Poggio G (1995) Mechanisms of stereopsis in monkey visual cortex. Cereb Cortex 5:193–204
Poggio G, Motter B, Squatrito S, Trotter Y (1985) Responses of neurons in visual cortex (V1 and V2) of the alert macaque to dynamic random dot stereograms. Vision Res 25:397–406
Ritter M (1977) Effect of disparity and viewing distance on perceived depth. Percept Psychophys 22:400–407
Rogers B, Bradshaw M (1993) Vertical disparities, differential perspective and binocular stereopsis. Nature 361:253–255
Schall J, Nawrot M, Blake R, Yu K (1993) Visual guided attention is neutralized when informative cues are visible but unperceived. Vision Res 33:2057–2064
Schor C, Wood I (1983) Disparity range for local stereopsis as a function of luminance spatial frequency. Vision Res 23:1649


Schor C, Wood I, Ogawa J (1984) Spatial tuning of static and dynamic local stereopsis. Vision Res 24:573–578
von Hofsten C (1976) The role of convergence in visual space perception. Vision Res 16:193–198
Wallach H, Zuckerman C (1963) The constancy of stereoscopic depth. Am J Psychol 76:404
Wheatstone C (1838) Contributions to the physiology of vision: 1. On some remarkable and hitherto unobserved phenomena of binocular vision. Philos Trans R Soc Lond 128:371


Color Communication
Stephen Westland*
Colour Science & Technology, University of Leeds, Leeds, UK

Abstract
The use of language to describe color is natural and intuitive, and there seems to be some evidence that different languages refer to color in a consistent way. It is clear, nonetheless, that reliance on language to communicate color is limited, not only by the number of color names but also by the lack of precision that language affords. As an alternative to natural language, color-order systems have found widespread use; these usually consist of physical books of patches or swatches, each of which carries a notation. Three representative systems are referred to in this chapter: the Munsell system, the Pantone system, and the NCS system. Although physical color-order systems can be effective, they are also limited by, for example, consisting of relatively few physical samples. The last couple of decades have seen increased use of numerical color communication and specification based upon the CIE system.

List of Abbreviations
CIE  Commission Internationale de l'Eclairage
CMM  Color Matching Module
ICC  International Color Consortium
NCS  Natural Color System
PMS  Pantone Matching System

Introduction
A desire to communicate color is natural in the arts and in our everyday lives. However, in the last 50 years color communication has become essential as part of the design and specification of products in industrialized societies. The use of language to describe color is natural and intuitive, and there seems to be some evidence that different languages refer to color in a consistent way. It is clear, nonetheless, that reliance on language to communicate color is limited, not only by the number of color names but also by the lack of precision that language affords. There is no clear answer to the question of how many different colors we can distinguish between; however, estimates range from three million to about ten million. It is clear that even if we use color names in a consistent and reliable way, there are limitations in the use of natural language for color communication. As an alternative to natural language, color-order systems have found widespread use; these usually consist of physical books of patches or swatches, each of which carries a notation. Color-order systems are, however, themselves limited as tools for color communication. Numerical color communication is increasingly becoming the preferred method for color communication by professionals working in the field.

*Email: [email protected]


Color and Language
Previous research has proposed that in the English language black, white, red, green, yellow, blue, purple, orange, pink, and gray are basic color terms or universal categories. The classic study in this area was conducted by Berlin and Kay, who proposed that not only are there 11 basic color terms but that cultures evolve the use of these terms in a way that is predictable and almost universal (Berlin and Kay 1969). As languages evolve, they acquire new basic color terms in a strict chronological sequence; if a basic color term is found in a language, then the colors of all earlier stages should also be present. The sequence is as follows:

• Stage I: Dark-cool and light-warm (this covers a larger set of colors than English “black” and “white”)
• Stage II: Red
• Stage III: Either green or yellow
• Stage IV: Both green and yellow
• Stage V: Blue
• Stage VI: Brown
• Stage VII: Purple, pink, orange, or gray

The Berlin and Kay study contested the Sapir-Whorf hypothesis: the idea that the varying cultural concepts and categories inherent in different languages affect the cognitive classification of the experienced world in such a way that speakers of different languages think and behave differently because of it. The study achieved widespread influence but has recently been criticized (Saunders 2000), and the notion of universality that has endured for the last 30 or so years is under attack from cultural relativists. Critics note that the language sample from which Berlin and Kay collected data was strongly biased in favor of written languages from industrialized societies. However, the Berlin and Kay study has not been refuted entirely, and a current study underway at U.C. Berkeley and the University of Chicago is statistically testing comprehensive color-naming data, collected from 110 unwritten languages from non-industrialized societies, through the World Color Survey (http://www.icsi.berkeley.edu/wcs/. Last accessed 11 Aug 2010). More recently, there is evidence that possession of linguistic categories facilitates recognition and influences perceptual judgments (Roberson et al. 2000). There is therefore doubt about the universality of color perceptions and, more crucially, color naming. A further limitation in the use of natural language to communicate color is that the number of colors that we can differentiate between is extremely large. The number of colors that are discernible is difficult to quantify but is certainly measured in the millions. Judd and Wyszecki (1975) estimated that there were ten million discernible colors, but a more recent estimate (Pointer and Attridge 1998) was more conservative and placed the number somewhere between two and three million.

Perceptual Color Attributes
Since the number of colors that can be observed is extremely large, it is natural to consider systematic ways of organizing and describing colors that could lead to a more efficient and meaningful representation. One of the first people to arrange colors in a circle appears to have been Aron Sigfrid Forsius (1550–1637), although his work was not discovered until the twentieth century (Koenig 2003). Forsius’s color circle included white and black. The first hue circle is credited to Newton, who considered the spectral hues and presented them in a circular diagram along with the non-spectral hues (which Newton realized were required to complete the circle).

Fig. 1 The three attributes of color vision: lightness (upper), chroma (middle), and hue (lower)

Hue is that attribute of a visual sensation according to which an area
appears to be similar to one of the perceived colors: red, yellow, green, and blue, or to a combination of two of them (Fairchild 2005a). Although hue is perhaps the most distinguishable attribute of a color, it is now established that color vision is based on three perceptual attributes: brightness, colorfulness, and hue (or correlates of three such attributes). Brightness is that attribute of visual sensation according to which an area appears to emit more or less light whereas colorfulness is that attribute according to which the perceived color of an area appears to be more or less chromatic (Fairchild 2005a). Relative color terms are frequently used. For example, lightness is a relative brightness (normalized for changes in illumination and viewing conditions) and chroma, saturation, and purity are all distinct from each other but describe various aspects of relative colorfulness. In this chapter a full explanation of these terms is not given but readers are directed to Fairchild (2005a) for authoritative definitions. In this chapter, the terms lightness, chroma and hue will be used in a general way to describe the three perceptual aspects of color perception. Figure 1 illustrates the three attributes. The significance of these perceptual attributes is that it becomes possible to describe a color using three attributes in a semi-systematic way; thus, we might describe a light, saturated orange or a dark, desaturated blue. It also becomes possible to describe differences between two similar colors in a meaningful way; thus, we may say that one color is darker, stronger and bluer, for example, than another. This method of color communication avoids the use of arbitrary color names but is still limited as a method for precise and accurate color communication. However, a systematic understanding of color ontology led to the development of sophisticated tools for color communication such as the Munsell system.

The Munsell System
The idea of using a three-dimensional color solid to represent all colors was developed during the eighteenth and nineteenth centuries. For example, in 1810 Philipp Otto Runge developed a system based upon a sphere. However, although these systems became progressively more sophisticated, before Munsell none was based on any rigorous scientific understanding of human color vision. Prior to Munsell’s contribution, the relationship between hue, lightness, and chroma was not well understood. Albert Munsell, an artist and educator, wanted to create a rational way to describe color that would use an alphanumeric notation instead of color names, which he could use to teach his students about color. In 1905 he published A Color Notation, a description of his system, with the first atlas being produced in 1907 (Kuehni 2003). In 1918, shortly before Munsell’s death, the Munsell Color Company was formed, and the first Munsell Book of Color was published in 1929. An extensive series of experiments carried out by the Optical Society of America in the 1940s resulted in an improvement to the system known as the Munsell Renotations (Newhall et al. 1943). Munsell was the first to separate lightness, chroma, and hue into perceptually uniform and independent dimensions. The system consists of three independent dimensions which can be represented cylindrically
as an irregular color solid: hue, measured by degrees around horizontal circles; chroma, measured radially outward from the neutral (gray) vertical axis; and value, measured vertically from 0 (black) to 10 (white). Munsell’s value scale can be interpreted as a lightness scale. Munsell determined the spacing of colors along these dimensions by taking measurements of human visual responses. Munsell based his hue notation on five principal hues: red, yellow, green, blue, and purple, along with five intermediate hues halfway between adjacent principal hues. Each of these ten steps is then broken into ten sub-steps, so that 100 hues are given integer values. Value, or lightness, varies vertically from black (value 0) at the bottom, to white (value 10) at the top. Neutral grays lie along the vertical axis between black and white. Chroma, measured radially from the center of each slice, represents the “purity” of a color (see Fig. 2). Note that there is no intrinsic upper limit to chroma. Different areas of the color space have different maximal chroma coordinates. For instance, light yellow colors have considerably more potential chroma than light purples.

Fig. 2 The Munsell system arranged color in three dimensions. This idea remains in modern numerical color-communication tools such as CIELAB

A color is fully specified in the Munsell system by listing the three numbers for hue, value, and chroma. For instance, a fairly saturated red of medium lightness would be R 5/10, with R indicating the hue, and 5 and 10 indicating value and chroma, respectively. The Munsell system is an example of a physical color-order system. By reference to the physical samples of color it is possible to provide reasonably accurate and precise color communication. The merits of the system are evident in the observation that the system is still in use today, more than 100 years after it was developed. The current Munsell atlas is published in two parts, glossy (1,488 samples) and matt (1,277 samples) (McLaren 1987). However, as a method of communication the system is not without limitations. One problem of relying upon physical samples is that over time they may fade or become soiled. Perhaps more critically, the system is limited to a couple of thousand color samples, and invariably the color that one wishes to communicate lies somewhere between the colors of two adjacent samples in the system.
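Because the notation is a simple alphanumeric triple, it can be parsed mechanically. The helper below is hypothetical (it is not part of the Munsell system or of any standard library) and splits a notation such as “R 5/10” or “2.5GY 6/8” into its hue, value, and chroma components:

```python
import re
from typing import NamedTuple

class MunsellColor(NamedTuple):
    hue: str       # hue step plus hue letters, e.g. "5R" or "2.5GY"
    value: float   # lightness, 0 (black) to 10 (white)
    chroma: float  # radial distance from the neutral (gray) axis

_PATTERN = re.compile(r"^\s*(\d*\.?\d*\s*[A-Z]{1,2})\s+(\d+\.?\d*)/(\d+\.?\d*)\s*$")

def parse_munsell(notation: str) -> MunsellColor:
    """Hypothetical parser for chromatic Munsell notations (neutrals such
    as 'N 5/' are deliberately not handled in this sketch)."""
    m = _PATTERN.match(notation)
    if not m:
        raise ValueError(f"not a Munsell notation: {notation!r}")
    hue, value, chroma = m.groups()
    return MunsellColor(hue.replace(" ", ""), float(value), float(chroma))

print(parse_munsell("R 5/10"))     # MunsellColor(hue='R', value=5.0, chroma=10.0)
print(parse_munsell("2.5GY 6/8"))  # MunsellColor(hue='2.5GY', value=6.0, chroma=8.0)
```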

Other Color-Order Systems
The Munsell system was important for the influence that it had upon ideas about color spaces and representations (Kuehni 2003). Although the Munsell system is still in use today, other color-order systems have been developed and have found widespread use. The Pantone system is particularly popular in the graphic arts and printing industries. Pantone Guides consist of a large number of cardboard sheets, printed on one side with a series of related color swatches and then bound into a small flipbook. For instance, a
particular sheet might contain a number of yellows varying in luminance from light to dark. There are several Pantone systems; the Pantone solid color system, for example, consists of over 1,100 unique, numbered colors (e.g., Pantone 198). The Pantone samples, unlike those of the Munsell system, are not organized in a way that is consistent with the human visual system nor spaced uniformly with respect to perception. However, Pantone systems have found widespread use because their use can assist printers to match colors (Pantone samples typically include information about which inks can be used to match that color). The Pantone Matching System (PMS), for example, provides information on how a printer should obtain the solid colors and there are also guides that provide the closest process color (CMYK) equivalent. Another color-order system that has been successful is the Natural Color System (NCS) (Hesselgren 2007). The NCS system is perhaps more like the Munsell system than the Pantone system. The color samples are logically arranged in a three-dimensional space. However, there are some important differences between Munsell and NCS. For example, the Munsell system is based upon five primary hues whereas the NCS system is based upon four hues. In fact, the NCS system is based upon three pairs of elementary color percepts: white-black, red-green, and yellow-blue. NCS colors are defined by the amount of blackness, chromaticness, and a percentage value between two hues, red, yellow, green or blue. For example, the NCS color NCS 0580-Y10R refers to a color with 5 % darkness, 80 % saturation, and whose hue is 90 % yellow and 10 % red.
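The NCS notation can be unpacked in the same mechanical way. The helper below is again hypothetical and handles only the common two-hue form quoted above; real NCS notations also include pure hues (e.g., -Y) and neutrals, which this sketch ignores:

```python
import re

def parse_ncs(notation: str) -> dict:
    """Hypothetical parser for NCS notations of the form 'NCS 0580-Y10R':
    two digits of blackness, two of chromaticness, then a hue expressed as
    a percentage step between two elementary hues (Y, R, B, or G)."""
    m = re.match(r"^NCS\s+(\d{2})(\d{2})-([YRBG])(\d{2})([YRBG])$", notation)
    if not m:
        raise ValueError(f"unsupported NCS notation: {notation!r}")
    blackness, chromaticness = int(m.group(1)), int(m.group(2))
    start_hue, percent, end_hue = m.group(3), int(m.group(4)), m.group(5)
    return {
        "blackness %": blackness,
        "chromaticness %": chromaticness,
        "whiteness %": 100 - blackness - chromaticness,  # the remainder
        "hue": f"{100 - percent} % {start_hue} / {percent} % {end_hue}",
    }

print(parse_ncs("NCS 0580-Y10R"))
# {'blackness %': 5, 'chromaticness %': 80, 'whiteness %': 15,
#  'hue': '90 % Y / 10 % R'}
```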

Numerical Color Communication
That color-order systems such as Pantone and Munsell contain relatively few samples is problematic for modern color communication. Physical color-order systems are also subject to the limitation that, even if they are manufactured to a very close tolerance, the samples inevitably fade over time or change in color because they become soiled. The use of color-order systems also requires so-called normal color vision, whereas approximately 5 % of the population is estimated to suffer from some type of color-vision defect. The Commission Internationale de l'Eclairage (CIE) developed a system for the specification of color stimuli that was recommended for widespread use in 1931 (Publication CIE No. 15.2 1986). This system allows measurements of spectral reflectance factors or spectral radiance to be converted to CIE XYZ tristimulus values or, in turn, to other (more uniform) color spaces such as CIE (1976) Lab or CIELAB. The CIE system is described in more detail in chapters “▶ The CIE System” and “▶ Uniform Color Spaces.” The second half of the twentieth century saw color measurement using the CIE system become ubiquitous in many industries including textiles, paints, and plastics. The advent of affordable color-imaging devices in the last couple of decades, however, has led to an explosion in digital color communication. Color management is a process that aims to allow color to be transferred across various technologies (printers, cameras, displays, etc.) without loss of fidelity. Color management systems are now embedded as part of most popular operating systems; they make use of imaging device profiles that allow a device’s color space to be transformed into a standard device-independent color space. Several device-independent color spaces are used, including the CIE system but also sRGB (a standard RGB color space) (Stokes and Anderson 1996). The International Color Consortium (ICC) is an industry consortium which has defined an open standard for a Color Matching Module (CMM) at the operating system level, and color profiles for the devices and working space (the color space the user edits in).
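To give a flavor of numerical, device-independent specification, the sketch below converts CIE XYZ (D65 white) to 8-bit sRGB using the matrix and transfer function published in the sRGB standard. The coefficients are standard values, but the helper itself is only an illustration; production systems would delegate this to a color-management module:

```python
def xyz_to_srgb(X: float, Y: float, Z: float) -> tuple:
    """Convert CIE XYZ (D65 white point, white scaled to Y = 1.0) to 8-bit
    sRGB. A minimal sketch of the standard sRGB encoding, with out-of-gamut
    values simply clipped."""
    # Linear RGB from XYZ (sRGB primaries, D65 white point)
    r = 3.2406 * X - 1.5372 * Y - 0.4986 * Z
    g = -0.9689 * X + 1.8758 * Y + 0.0415 * Z
    b = 0.0557 * X - 0.2040 * Y + 1.0570 * Z

    def encode(c: float) -> int:
        c = min(max(c, 0.0), 1.0)  # clip to the sRGB gamut
        c = 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055
        return round(255 * c)

    return tuple(encode(c) for c in (r, g, b))

print(xyz_to_srgb(0.9505, 1.0000, 1.0890))  # D65 white -> (255, 255, 255)
```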

Summary
There are numerous ways to specify and communicate color. The use of language is natural and intuitive but lacks precision for all but crude descriptions of color. The use of reference to physical samples
(arranged in a color-order system) has become widespread. The Munsell system was revolutionary in its approach and is still in use today but other systems (such as Pantone and NCS) have developed specialist appeal and offer advantages for certain applications. The main advantage of a color-order system is that it is easy to use. However, the number of different colors that we would like to communicate is certainly in the millions and yet even the largest color-order systems contain only a few thousand samples. Arguably, the notation systems of some of these color-order systems allow colors that are between colors in the systems to be notated but with some loss of precision. There is also the argument that to use a color-order system effectively one should possess normal color vision. For this reason, and others, numerical color specification is widely used in industry and is based upon a system for specifying color that was introduced in 1931.

Further Reading
Berlin B, Kay P (1969) Basic color terms: their universality and evolution. University of California Press, Berkeley
Berns RS (2000) Billmeyer and Saltzman’s principles of color technology, 3rd edn. Wiley-Interscience, New York
Fairchild MD (2005a) Color appearance models. Wiley, New York
Hesselgren S (2007) Why colour order systems? Color Res Appl 9(4):220–228
http://www.icsi.berkeley.edu/wcs/. Last accessed 11 Aug 2010
Judd DB, Wyszecki G (1975) Color in business, science and industry, 3rd edn. Wiley, New York
Koenig B (2003) Color workbook. Pearson Education, Harlow
Kuehni RG (2003) Color space and its divisions. Wiley, Hoboken
McLaren K (1987) Colour space, colour scales and colour difference. In: McDonald R (ed) Colour physics for industry. Society of Dyers and Colourists, Bradford
Newhall SM, Nickerson D, Judd DB (1943) Final report of the OSA subcommittee on the spacing of the Munsell colors. J Opt Soc Am 33:385
Pointer MR, Attridge GG (1998) The number of discernible colours. Color Res Appl 23(1):52–54
Publication CIE No. 15.2 (1986) Colorimetry, 2nd edn. Bureau of the Commission Internationale de l’Eclairage, Vienna
Roberson D, Davies I, Davidoff J (2000) Colour categories are not universal: replications and new evidence from a stone-age culture. J Exp Psychol Gen 129:369–398
Saunders B (2000) Basic color terms. J Roy Anthropol Inst 6:81–89
Stokes M, Anderson M (1996) http://www.w3.org/Graphics/Color/sRGB.html. Last accessed 11 Aug 2010


The CIE System
Stephen Westland*
Colour Science & Technology, University of Leeds, Leeds, UK

Abstract
Colorimetry is a branch of color science concerned with numerically specifying the color of physically defined stimuli such that two stimuli that look the same (under certain criteria) have identical specifications. The Commission Internationale de l’Eclairage (CIE) developed a system for the specification of color stimuli that was recommended for widespread use in 1931 and that has formed the basis of colorimetry for the last 80 years. This chapter briefly describes the development of the CIE system and explains key principles (such as additive color mixing and Grassmann’s laws) upon which the system is based. Specification of color by tristimulus values is described and the importance of chromaticity diagrams is discussed.

Introduction
Colorimetry is a branch of color science concerned with numerically specifying the color of physically defined stimuli such that two stimuli that look the same (under certain criteria) have identical specifications (Ohta and Robertson 2005). The Commission Internationale de l’Eclairage (CIE) developed a system for the specification of color stimuli that was recommended for widespread use in 1931. The CIE system has been the cornerstone of modern colorimetry for most of the twentieth century, and although today there are other color spaces and color metrics that are frequently used, these are invariably derived from the original work that was carried out in the 1920s and 1930s. This chapter therefore outlines the development of the CIE system. The key to understanding the CIE system is to understand the principles of additive color mixing; the associated experimental laws that were derived and clarified around the beginning of the twentieth century are referred to as Grassmann’s laws, and these are described. The notion of a color space is introduced and some properties of chromaticity diagrams are described.

Additive Color Mixing
There are two main types of color mixing: additive color mixing and subtractive color mixing. Subtractive color mixing occurs when colorants (inks, paints, dyes, etc.) are mixed together; additive color mixing refers to how lights of different wavelengths add together to form different colors. Perhaps the most important feature of additive color mixing is that we can mix together three colors – which we will call color primaries – and create a surprising range of colors. A common misapprehension is that it is possible to define three color primaries that could create any color by mixture. Unfortunately, the range of reproducible colors (or gamut) for a trichromatic additive (or subtractive) system is limited and is always smaller than the gamut of all the colors possible in the world. However, the gamut is smaller or larger depending upon the choice of primaries. Pragmatically, for additive color mixing the largest gamut is achieved when the primaries are red, green, and blue.

*Email: [email protected]


Fig. 1 Additive color mixing using RGB primaries

Figure 1 illustrates additive color mixing with red, green, and blue primaries. The situation described by Fig. 1 can be realized using three projection lamps each emitting a circular beam of one of the three additive primaries that are then superimposed onto a projection screen so that they partially overlap. Mixtures of red and green, red and blue, and blue and green can be seen to result in yellow, magenta, and cyan, respectively. Mixing all three primaries can result in a white. The colors (cyan, magenta, and yellow) that result from mixing any two of the primaries in equal amounts are sometimes referred to as secondary colors. It is important to be aware that there is no clear answer to the question of exactly which red, green, and blue would make the “best” additive primaries. Many image-display devices use different sets of primaries and even different sets of primaries have been used, at various times, within the CIE system of colorimetry.

Trichromatic Color Specification
The results of additive color matching are assumed to obey certain laws known as Grassmann’s laws of additive color mixture (Ohta and Robertson 2005). Let us represent the three additive primaries red, green, and blue by the symbols [R], [G], and [B] and the amounts or intensities of these primaries by R, G, and B, respectively. Grassmann’s first law states that we can match the color of an arbitrary color stimulus [C] with an additive mixture of the primaries; thus,

$$[C] \equiv R[R] + G[G] + B[B],$$

where ≡ denotes “matches” or “is equivalent to.” The amounts of the primaries needed to effect the match are referred to as tristimulus values and constitute a trichromatic or colorimetric specification of the color stimulus [C] according to the color-matching equation. At first, such a specification may appear somewhat arbitrary since it is clear that the actual tristimulus values RGB that specify a particular color stimulus [C] will depend upon the precise nature of the primaries [R], [G], and [B] that were selected. However, we need to consider Grassmann’s remaining laws. Grassmann’s second law states that an additive mixture of two stimuli [C1] and [C2] can be matched by linearly adding together the mixture of the primaries that individually match the two stimuli; thus,


if

$$[C_1] \equiv R_1[R] + G_1[G] + B_1[B] \quad \text{and} \quad [C_2] \equiv R_2[R] + G_2[G] + B_2[B],$$

then

$$[C_1] + [C_2] \equiv (R_1 + R_2)[R] + (G_1 + G_2)[G] + (B_1 + B_2)[B].$$

Grassmann’s third law states that color matching is invariant to changes in intensity. Thus,

$$a[C] \equiv aR[R] + aG[G] + aB[B],$$

where a is the intensity of the unit stimulus [C]. The implication of Grassmann’s laws is that although the tristimulus values that specify a color stimulus will depend upon the choice of color primaries, the matching condition will not be affected by this choice. Therefore, if two color stimuli are deemed a match by virtue of having the same tristimulus values under one system of primaries, then they would also be deemed a match under any other set of primaries.
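Grassmann’s second and third laws amount to saying that tristimulus specifications behave linearly: they add when stimuli are added and scale when intensity is scaled. A minimal numerical restatement, with arbitrary illustrative values:

```python
# Tristimulus values behave as linear (R, G, B) triples under Grassmann's
# second and third laws. The numbers are arbitrary and purely illustrative.

def add(c1, c2):
    """Tristimulus values of the additive mixture of two matched stimuli."""
    return tuple(a + b for a, b in zip(c1, c2))

def scale(a, c):
    """Tristimulus values of a stimulus viewed at a times the intensity."""
    return tuple(a * v for v in c)

C1 = (0.8, 0.3, 0.1)  # (R1, G1, B1) matching stimulus [C1]
C2 = (0.2, 0.5, 0.9)  # (R2, G2, B2) matching stimulus [C2]

print(add(C1, C2))     # (1.0, 0.8, 1.0) matches the mixture [C1] + [C2]
print(scale(2.0, C1))  # (1.6, 0.6, 0.2) matches [C1] at twice the intensity
```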

The CIE 1931 System of Colorimetry
Trichromatic additive color mixing, described in the previous section, provides a basis for colorimetry since the tristimulus values required to match a color stimulus for a given set of primaries form a colorimetric specification. One possible practical implementation of this idea would be to construct a visual colorimeter. With such a system, the user would view a screen showing (on, say, the left-hand side) the color stimulus to be specified and (on the right-hand side) an additive mixture of the three primary lights. The user would adjust the intensities of the three primaries to achieve a match, and the tristimulus values selected by the user could be communicated to other users, who could then replicate the color stimulus using other identically constructed colorimeters. However, colorimetry based upon visual colorimeters would face several challenges, not least the problem that there is variation in the trichromatic matches made by observers as a result of individual variation in color vision. In addition, it would only be possible to construct the colorimeters to be identical within a certain tolerance, and maintaining that tolerance over time would not be trivial. However, considering the full implications of Grassmann’s laws, if the tristimulus values needed to match a color stimulus at each wavelength (or wavelength interval) are known separately, the tristimulus values for the color stimulus can be calculated by summing across all wavelengths in the visible spectrum. This is the basis of the system of colorimetry recommended in 1931 by the CIE. The tristimulus values required to match 1 unit of energy at each wavelength in the visible spectrum were determined experimentally by two sets of workers (Guild, working at the National Physical Laboratory, used 7 observers; Wright, working at Imperial College London, used 10 observers) (Ohta and Robertson 2005). The two sets of experiments used different additive primaries. It is a relatively trivial matter to convert the tristimulus values obtained under one set of primaries to those for a second set of primaries. The data from both experiments were pooled and transformed to a single set of primaries different from either of those used in the actual experiments, and the results are illustrated in Fig. 2.

Fig. 2 CIE 1931 RGB color-matching functions for the red (red line), green (green line), and blue (blue line) RGB color primaries
Figure 2 is based upon three monochromatic primaries at standardized wavelengths of 700, 546.1, and 435.8 nm. The latter two wavelengths were chosen because they were easily reproduced from the discharge spectrum of mercury vapor. Guild, in particular, maintained that the primaries had to be reproducible with national-standardizing-laboratory accuracy (Fairman et al. 1997). It was reasoned that by positioning the long-wavelength primary at 700 nm, where the visual system is not very sensitive, small errors in its wavelength would have relatively little impact. Radiometrically, an equienergy white light could be matched using 1.0000, 4.5907, and 0.0600 lumens of the red, green, and blue primaries, respectively. The units of the RGB primaries were then defined such that one unit of each of the primaries would result in an equienergy white. The curves in Fig. 2 are known as the CIE 1931 RGB color-matching functions and they are available in CIE Publication 15.2 (1986). The reader may notice that the red color-matching function is negative at certain wavelengths. The implication of this is that it was not possible to match these wavelengths using all-positive amounts of the three primaries. In order to obtain a match for these wavelengths, one of the primaries (the red) was additively mixed with the stimulus, which could then be matched using a mixture of the other two primaries. The color-matching equation is then represented by

$$[C] + R[R] \equiv G[G] + B[B],$$

which, assuming that Grassmann’s laws apply, can be transformed into the following form:

$$[C] \equiv -R[R] + G[G] + B[B].$$

Thus, this situation can be represented as using a negative amount of one of the primaries (in this case, the red). When tristimulus values had to be calculated manually (in the 1930s), the presence of both negative and positive values made the calculations complicated and prone to error (Ohta and Robertson 2005). Therefore, the CIE introduced a transformation that would allow the use of three so-called “imaginary” primaries that are referred to as [X], [Y], and [Z]. One of the conditions of this transformation was that the XYZ tristimulus values would be all-positive for all real color stimuli. Another condition was that the
Y tristimulus value would represent the luminance of the stimulus (the luminance values of the other two primaries are correspondingly equal to zero). The additional conditions are widely available in the literature (Fairman et al. 1997), but it is interesting to note that it is unlikely that any of these conditions would be adopted if the CIE system were formulated today. However, the CIE 1931 XYZ color-matching functions are firmly established and are the basis on which many of the developments in colorimetry since 1931 have been based. Note that what is important is the matching condition (that two stimuli are a visual match if they have the same tristimulus values) and this is independent of which primaries we choose to base the system on (assuming that Grassmann’s laws hold true). The color-matching functions $\bar{x}(\lambda)$, $\bar{y}(\lambda)$, and $\bar{z}(\lambda)$ were defined by the CIE at intervals of 1 nm at wavelengths $\lambda$ between 360 and 830 nm and are shown in Fig. 3.

Fig. 3 CIE 1931 XYZ color-matching functions for the X (red line), Y (green line), and Z (blue line) XYZ color primaries

The CIE 1931 standard also specified CIE illuminants A, B, and C, although illuminant C was subsequently supplemented by the CIE D (daylight) illuminants in 1964, of which D65 and D50 are perhaps the most important today. The introduction of tables of illuminants allowed the computation of tristimulus values for surface colors as well as for self-luminous colors. Practical formulae for computing the CIE 1931 tristimulus values for a surface with spectral reflectance P(λ) under an illuminant of relative spectral power E(λ) were provided by the CIE in 1986 (CIE 1986); thus,

$$X = k \sum_{\lambda=360}^{830} E(\lambda)\,\bar{x}(\lambda)\,P(\lambda),$$
$$Y = k \sum_{\lambda=360}^{830} E(\lambda)\,\bar{y}(\lambda)\,P(\lambda),$$
$$Z = k \sum_{\lambda=360}^{830} E(\lambda)\,\bar{z}(\lambda)\,P(\lambda),$$

where k is a normalizing factor. The significance of this normalizing factor is that the Y tristimulus value (which corresponds to the luminance) would be equal to 100 for a perfectly white surface irrespective of which illuminant is used for the calculation.
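These summations translate directly into code. In the sketch below the color-matching functions are synthetic Gaussian stand-ins (tabulated CIE data should be substituted for any real calculation) and the illuminant and reflectance are hypothetical, but the normalization and weighted sums follow the formulas as written:

```python
import math

wavelengths = range(360, 831, 5)  # nm, 5 nm sampling interval

def gauss(lam: float, mu: float, sigma: float) -> float:
    return math.exp(-0.5 * ((lam - mu) / sigma) ** 2)

# Synthetic stand-ins for the CIE color-matching functions (NOT the real
# tables): rough Gaussian lobes placed to mimic their general shape.
xbar = {l: 1.06 * gauss(l, 600, 40) + 0.36 * gauss(l, 445, 25) for l in wavelengths}
ybar = {l: gauss(l, 555, 45) for l in wavelengths}
zbar = {l: 1.78 * gauss(l, 450, 25) for l in wavelengths}

E = {l: 100.0 for l in wavelengths}  # hypothetical flat (equienergy) illuminant
P = {l: 0.5 for l in wavelengths}    # hypothetical uniform 50 % reflector

# k is chosen so a perfect white (P = 1 everywhere) yields Y = 100
k = 100.0 / sum(E[l] * ybar[l] for l in wavelengths)

X = k * sum(E[l] * xbar[l] * P[l] for l in wavelengths)
Y = k * sum(E[l] * ybar[l] * P[l] for l in wavelengths)
Z = k * sum(E[l] * zbar[l] * P[l] for l in wavelengths)
print(f"X = {X:.1f}, Y = {Y:.1f}, Z = {Z:.1f}")  # Y = 50.0 for this gray
```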


Alternative methods for the calculation of tristimulus values using tables of weights (which precalculate the illuminant and the color-matching functions at each wavelength) are available (ASTM 2001), and software implementations can also be found (Westland and Ripamonti 2004). The 1931 CIE system was derived using a stimulus field size of 2° of visual angle. In 1964, an additional set of color-matching functions was derived based on 10° of visual angle and is preferred for many practical applications. This additional set of color-matching functions is known as the 1964 or 10° standard observer. If the primaries [X], [Y], and [Z] are considered as vector components, the three-dimensional space thus constructed can be used for the geometric expression of colors and is called a color space (Ohta and Robertson 2005). The tristimulus values for a particular color locate that color in the color space.

Chromaticity Diagrams
The concept of color can be divided into two parts: luminance and chromaticity. The CIE 1931 XYZ system was deliberately designed so that the Y tristimulus value was a measure of luminance. The chromaticity of a color can then be specified by the two remaining values through the calculation of the chromaticity coordinates x and y; thus,

$$x = \frac{X}{X+Y+Z}, \qquad y = \frac{Y}{X+Y+Z},$$

which allows chromaticities to be plotted in a chromaticity diagram (see Fig. 4). The chromaticity diagram reveals the characteristic horseshoe shape of the spectral locus. It is sometimes convenient to refer to the dominant wavelength and purity of a color. The dominant wavelength is obtained by extending a line from the white point through the color stimulus to the spectral locus and noting the wavelength of intersection; the purity is the proportional distance of the color stimulus from the white point to this intersection point on the spectral locus. If Grassmann’s laws hold true, then the color that results from additively mixing two lights will fall on the straight line in the chromaticity diagram that joins the two points that represent the two lights.
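A short sketch of the chromaticity calculation, together with a numerical check of the straight-line mixing property (the tristimulus triples are arbitrary illustrative values, not measurements):

```python
def chromaticity(X: float, Y: float, Z: float) -> tuple:
    """CIE 1931 chromaticity coordinates (x, y) from tristimulus values."""
    s = X + Y + Z
    return X / s, Y / s

def mix(c1, c2, t: float) -> tuple:
    """Additive mixture of two lights given as XYZ triples; t is the
    proportion contributed by the second light."""
    return tuple((1 - t) * a + t * b for a, b in zip(c1, c2))

light_1 = (45.0, 25.0, 2.0)   # arbitrary tristimulus values
light_2 = (18.0, 8.0, 95.0)

# The mixtures' chromaticities lie on the straight line joining the two
# endpoint chromaticities, as the text states.
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    x, y = chromaticity(*mix(light_1, light_2, t))
    print(f"t = {t:.2f}: x = {x:.3f}, y = {y:.3f}")
```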

Fig. 4 The 1931 CIE chromaticity diagram


Although the CIE 1931 (and 1964) systems have proved very effective for color specification, the corresponding color spaces and chromaticity diagrams are not visually uniform. That is, the distance between two points in the space does not properly correspond to the color difference between two color stimuli represented by those two points. This property restricted the usefulness of the system for certain practical problems such as predicting color difference. Chapter “▶ Uniform Color Spaces” describes the development of uniform color spaces.

Summary
The CIE system is the basis of modern colorimetry. The system can be confusing at first because of the use of so-called imaginary primaries (XYZ). However, it is important to realize that the most important properties of the CIE system are independent of the actual primaries that are used. It is also important to note that the system was developed to address the problem of color specification. The original 1931 CIE system was not concerned with color appearance, but rather focused on whether two color stimuli would be a visual match when viewed under identical and standardized viewing and illumination conditions. The matching condition is satisfied if two stimuli have identical tristimulus values, and this is invariant to the actual primaries that the tristimulus values refer to. Despite the success of the CIE system, a huge effort was made between 1960 and 2000 to address the lack of visual uniformity in the system. Some of these developments are described in chapter “▶ Uniform Color Spaces.” Also, during the last quarter of the century, the issue of color appearance has started to become more and more important. There have been considerable advances in this area and readers are directed to Fairchild’s book for a summary of these (Fairchild 2005).

Further Reading
ASTM E308-01 (2001) Standard practice for computing the colors of objects by using the CIE system. ASTM International, West Conshohocken. www.astm.org, doi:10.1520/E0308-01
Fairchild MD (2005) Color appearance models. Wiley, New York
Fairman HS, Brill MH, Hemmendinger H (1997) How the CIE 1931 color-matching functions were derived from Wright-Guild data. Color Res Appl 22(1):11–23
Hunt RWG (1998) Measuring colour, 3rd edn. Fountain Press, London
Kuehni RG (2003) Color space and its divisions. Wiley, Hoboken
Ohta N, Robertson AR (2005) Colorimetry: fundamentals and applications. Wiley, New York
Publication CIE No. 15.2 (1986) Colorimetry, 2nd edn. Bureau of the Commission Internationale de l'Eclairage, Vienna
Publication CIE No. S2 (1986) Standard colorimetric observers. Bureau of the Commission Internationale de l'Eclairage, Vienna
Westland S, Ripamonti C (2004) Computational colour science using MATLAB. Wiley, London
Wyszecki G, Stiles WS (1982) Color science: concepts and methods, quantitative data and formulae, 2nd edn. Wiley, New York



RGB Systems
Stephen Westland (Colour Science & Technology, University of Leeds, Leeds, UK) and Vien Cheung (School of Design, University of Leeds, Leeds, UK)

Abstract The additive primaries are red, green, and blue or RGB. Unfortunately, there is no single set of RGB primaries that has achieved universal acceptance. Rather, RGB primaries have evolved over time in response to consumer demand and technological advancement. Three important sets of primaries, however, are known as SMPTE-C, ITU-R BT.601, and ITU-R BT.709-3. Many display systems currently use SMPTE-C and ITU-R BT.601 and, as high-definition television develops, ITU-R BT.709-3 is becoming more prevalent. Two important issues for RGB color-reproduction systems are the color gamut and device dependency. These topics are briefly described and some recent developments are introduced.

Introduction Using additive color mixing and just three primaries, it is possible to create color-reproduction systems that can generate a wide range of colors. Modern examples of such systems include television, LED displays, plasma displays, and cinematography. The additive primaries are red, green, and blue or RGB. Unfortunately, there is no single set of RGB primaries that has achieved universal acceptance. Rather, RGB primaries have evolved over time in response to consumer demand and technological advancement. Three important sets of primaries, however, are known as SMPTE-C, ITU-R BT.601, and ITU-R BT.709-3. Many display systems currently use SMPTE-C and ITU-R BT.601 and, as high-definition television develops, ITU-R BT.709-3 is becoming more prevalent. Two important issues for RGB color-reproduction systems are the color gamut and device dependency.

Trichromatic Color Reproduction Human color vision is trichromatic. In most humans, our color vision systems are based upon the responses of three classes of cones in the retina, each of which has broadband sensitivity but maximum sensitivity at different wavelengths. A consequence of trichromacy is that color reproduction is trichromatic – the use of three color primaries allows a wide range of colors to be reproduced. The gamut of reproducible colors for a trichromatic additive system is limited and is always smaller than the gamut of all the colors possible in the world. However, the gamut is smaller or larger depending upon the choice of primaries. Pragmatically, the largest gamut is achieved when the additive primaries are red, green, and blue. It is sometimes useful to represent the RGB color space as a cube, with black (corresponding to zero intensity for R, G, and B) in one corner and white (corresponding to maximum intensity for R, G, and B)


Fig. 1 The RGB color cube (corners: black [000], red [100], green [010], blue [001], cyan [011], magenta [101], yellow [110], white [111])

Table 1 The CIE chromaticities of the SMPTE-C, Rec. 601, and Rec. 709 primaries

       SMPTE-C                 ITU-R BT.601            ITU-R BT.709-3
       R       G       B       R       G       B       R       G       B
x      0.6300  0.3100  0.1550  0.6400  0.2900  0.1550  0.6400  0.3000  0.1550
y      0.3400  0.5950  0.0700  0.3300  0.6000  0.0600  0.3300  0.6000  0.0600
z      0.0300  0.1050  0.7750  0.0300  0.1100  0.7900  0.0300  0.1000  0.7900

in the opposite corner. Such an illustration (see Fig. 1) makes explicit that the secondary colors of the RGB color solid are cyan (blue + green), magenta (red + blue), and yellow (red + green). The origins of the RGB color model can be traced back to the Young-Helmholtz theory of trichromatic color vision in the early nineteenth century. Early experiments by Maxwell (1998) used three separate filters (red, green, and blue) to capture three black-and-white images. These could be combined using three projectors and appropriate filters to generate color photographs in what became known as the color-separation method (Hirsch 2004). In modern imaging, the RGB color model is ubiquitous in digital photography, cinematography, and, most notably, in image-display devices such as LCD and plasma displays.

RGB Standards There is no single RGB color space that has achieved universal acceptance. Rather, many RGB standards and RGB primaries have evolved over time in response to consumer demand, professional interests, and technological advances. In the 1950s, the National Television System Committee (NTSC) specified a set of primaries that were representative of phosphors used in CRTs of that era in the USA (Poynton 2009). The NTSC primaries were more saturated than those now found in many modern displays but as a consequence were not very bright. Meanwhile, the European Broadcast Union (EBU) established a different set of primaries for use in European countries (and also in Japan and Canada), known now as ITU-R BT.601 or simply Rec. 601. The NTSC primaries were eventually replaced by the SMPTE-C primaries, which are slightly smaller in gamut but can achieve greater brightness. In 1990, a new set of primaries was agreed for high-definition television (HDTV), known as ITU-R BT.709-3 or simply Rec. 709.


Fig. 2 CIE chromaticity diagram showing sRGB/Rec. 709 (solid lines) and Adobe (1998) RGB (dashed lines) gamuts

Table 1 lists the CIE 1931 chromaticity coordinates of the SMPTE-C, Rec. 601, and Rec. 709 primaries. It is currently possible to find displays that correspond to each of these standards and, indeed, to several others. Figure 2 shows the gamut of colors that can be achieved using the Rec. 709 primaries. Note that, for a display system based on Rec. 709, colors that lie outside of the triangle in Fig. 2 cannot be reproduced by the display and are said to be out of gamut. However, even within the triangle many chromaticities would be out of gamut at certain luminance levels because gamuts are, of course, three dimensional (Morovič 2008). The use of more saturated primaries would, in principle, allow a greater gamut of reproducible colors; however, in practice – when the 3-D nature of the gamut is considered – the range of colors may even be reduced. Furthermore, in digital RGB systems it is normal to allocate 8 bits per color channel, resulting in 256 values for each of R, G, and B (such 24-bit color systems can reproduce approximately 16 million colors, though not all may be discriminable (Pointer and Attridge 1998)). Using a wider RGB gamut would mean that digital steps would be more widely spaced, and this may not be desirable. Consequently, there is growing demand for high-resolution (in terms of bit depth) systems that allocate color using more than 24 bits per pixel (Garcia-Suarez and Ruppertsberg 2010). It is important to note that the standards in current use for RGB systems have evolved for practical use subject to a number of considerations. Nevertheless, the range of colors that many printing systems can generate (using a subtractive mixing system based on cyan, magenta, and yellow inks) exceeds the RGB gamut for certain colors. Thus, bright yellows and magentas that are outside the gamut of RGB display systems can often be obtained using a CMY printing system; correspondingly, it is often not possible to obtain in print the bright greens and reds that can be obtained with display systems. The use of image-editing software on computers, in particular, has led to the introduction of some additional RGB standards, some of which have very large gamuts. In 1996, Hewlett-Packard and Microsoft proposed a standard color space, sRGB, intended for widespread use but particularly within the Microsoft operating systems, HP products, and the Internet (Stokes and Anderson 1996). sRGB was designed to be compatible with the Rec. 709 standard, and therefore the chromaticities of the primaries are the same as those in Table 1 for Rec. 709. The full specification – which includes a transfer function (gamma curve) that was typical of most CRTs – allows images encoded as sRGB to be directly displayed on typical CRT monitors, and this greatly aided its acceptance.


Table 2 The CIE chromaticities of the Adobe RGB (1998) and sRGB primaries

       Adobe RGB (1998)        sRGB
       R       G       B       R       G       B
x      0.6400  0.2100  0.1500  0.6400  0.3000  0.1550
y      0.3300  0.7100  0.0600  0.3300  0.6000  0.0600
z      0.0300  0.0800  0.7900  0.0300  0.1000  0.7900

However, sRGB is sometimes avoided by high-end print publishing professionals because its color gamut is not big enough, especially in the blue-green colors, to include all the colors that can be reproduced in printing. Adobe RGB (1998) was established by the Adobe software company (based on SMPTE 240M) and designed to encompass most of the colors achievable on CMYK color printers. As can be seen in Fig. 2, the Adobe RGB (1998) space improves upon the gamut of the sRGB color space primarily in the cyan-greens (Table 2).
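The transfer function mentioned above can be sketched as follows; the piecewise constants are those published in the sRGB specification (Stokes and Anderson 1996), though the function names here are illustrative:

```python
def srgb_encode(linear):
    """Encode linear light (0-1) with the sRGB transfer function (gamma curve)."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1 / 2.4) - 0.055

def srgb_decode(encoded):
    """Invert the sRGB transfer function back to linear light."""
    if encoded <= 0.04045:
        return encoded / 12.92
    return ((encoded + 0.055) / 1.055) ** 2.4
```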

Color Management and RGB RGB color values are often said to be device dependent. Imagine a camera system was used to capture a scene so that the RGB values are recorded at each pixel. Now imagine that the image is displayed on two displays, one that is based on Rec. 709 primaries and one that is based on Adobe RGB (1998) primaries. Without adjustment for the differences in the two sets of primaries, it is likely that the colors will look different in the two displayed images. The RGB values captured by the camera relate to the camera's spectral sensitivities; in other words, they are dependent upon that device and cannot be relied upon to produce reasonable color accuracy on a display device unless the RGB values are adjusted to account for the differences between the capture device and the display device (e.g., Rec. 709 or Adobe RGB). Fortunately such adjustment regularly takes place and is the key process of color management (see chapter "▶ Fundamentals of Image Color Management"). For color to be reproduced in a predictable manner across different devices and materials, it has to be described in a way that is independent of the specific behavior of the mechanisms and materials used to produce it. To address this issue, current methods require that color be described using device-independent color coordinates, which are translated into device-dependent color coordinates for each device (Stokes and Anderson 1996; International Color Consortium 2016). Originally, operating systems supported color for a particular color space, but since even RGB varies between devices, color was not reliably reproduced across different devices. The high-end publishing market could not meet its needs with the traditional means of color support, and the work of the International Color Consortium (ICC) was critical in terms of increasing color fidelity across a wide range of imaging devices (International Color Consortium 2016). The purpose of the ICC is to "promote the use and adoption of open, vendor-neutral, cross-platform color management systems." The ICC process converts input color values to output color values via a profile connection space (PCS). For this system to be effective, each device should be associated with a device profile that describes the relationship between the device's color space and the device-independent color space (PCS). However, the ICC process involves the overhead of transporting the input device's profile with the image and running the image through the transform. There is also the problem of what to do if an image file does not have a profile associated with it. As described earlier, the sRGB color space was proposed as an alternative means of managing color that is optimized to meet the needs of most users without the overhead of carrying an ICC profile with the image (Stokes and Anderson 1996). Most color management systems now assume that the default color space is sRGB if confronted with a digital color image that does


not have a profile. For this reason, sRGB is often the color space of choice for creating images for display over the Internet on a variety of platforms.
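As a sketch of how a device-independent translation can be built for a display whose primaries are known, the 3 × 3 RGB-to-XYZ matrix can be derived from the chromaticities of the primaries and the white point. The D65 white chromaticity (0.3127, 0.3290) and the rounded Rec. 709 values used below are common published figures, assumed here rather than taken from the text:

```python
import numpy as np

def rgb_to_xyz_matrix(xy_r, xy_g, xy_b, xy_white):
    """Build an RGB-to-XYZ matrix from primary and white-point chromaticities."""
    def col(x, y):
        # Chromaticity (x, y) to an XYZ column vector normalized to Y = 1
        return np.array([x / y, 1.0, (1.0 - x - y) / y])

    P = np.column_stack([col(*xy_r), col(*xy_g), col(*xy_b)])
    # Scale each primary so that R = G = B = 1 reproduces the white point
    S = np.linalg.solve(P, col(*xy_white))
    return P * S  # diagonal scaling of the columns via broadcasting

# Rec. 709 primaries with an assumed D65 white point
M = rgb_to_xyz_matrix((0.64, 0.33), (0.30, 0.60), (0.15, 0.06), (0.3127, 0.3290))
XYZ_white = M @ np.array([1.0, 1.0, 1.0])  # recovers the white point, Y = 1
```

Two displays with different primaries simply yield two different matrices, which is exactly why the same RGB triplet looks different on each unless a color-management step intervenes.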

Recent Developments There are a great number and variety of technologies that implement RGB image reproduction. These include phosphors, light-emitting diodes (LEDs), liquid crystal displays (LCDs), plasma, and organic LEDs (Jackson et al. 1994). The color properties of the RGB primaries vary from one device and manufacturer to another, as does the spatial arrangement of the RGB primaries. However, until recently all of these systems have been based on three primaries. A recent development by Sharp, known as Quad Pixel technology, has introduced a fourth primary (yellow) in addition to the RGB primaries in some LED display devices (http://www.sharp.ca/en-CA/ForHome/HomeEntertainment/LEDTV.aspx).

Summary Color reproduction is fundamentally based on trichromacy. A wide range of colors can be generated using three color primaries, and for additive systems (such as TVs, computer screens, and mobile phone displays), the optimum primaries (giving the greatest color gamut) are based on RGB. However, there are several standards for the design of the RGB primaries. Partly for this reason, color management is critical to enable good color fidelity as images are communicated between different imaging devices. Fortunately, color management is built in to almost all modern operating systems and, in most cases, makes appropriate adjustments without the user being aware of the underlying process or being required to provide any information or intervention. However, very accurate manipulation and communication of color requires a high level of knowledge about color mixing, color primaries, and color management. Although the underlying idea of additive color reproduction has remained largely unchanged for at least 50 years, advances in technology have led to better implementations of RGB with wider gamuts, better resolution, and increased image quality. A recent development, however, has seen the introduction of a quadchromatic (four-primary) additive color system that is claimed to provide further advances in the achievable color gamut. It is likely that the next decade will see similar advances, possibly with more than four primaries, moving toward a spectral reproduction method rather than a colorimetric one.

Further Reading
Garcia-Suarez L, Ruppertsberg AI (2010) Why higher resolution graphics cards are needed in colour vision research. Color Res Appl 36(2):118–126
Hirsch R (2004) Exploring colour photography: a complete guide. Lawrence King, London
http://www.sharp.ca/en-CA/ForHome/HomeEntertainment/LEDTV.aspx. Last accessed 11 Apr 2016
Hunt RWG (2006) The reproduction of colour, 6th edn. Wiley, Chichester
International Color Consortium (2016) http://www.color.org/. Last accessed 11 Apr 2016
Jackson R, MacDonald L, Freeman K (1994) Computer generated color. Wiley, New York
Maxwell JJ (1998) Color theory and color imaging systems: past, present and future. J Imaging Sci Technol 42(1):70–78
Morovič J (2008) Color gamut mapping. Wiley, Chichester
Pointer MR, Attridge GG (1998) The number of discernible colours. Color Res Appl 23(1):52–54
Poynton C (2009) http://www.poynton.com/PDFs/ColorFAQ.pdf. Last accessed 11 Apr 2016
Sharma G (2002) Digital color imaging handbook. CRC Press, Boca Raton
Stokes M, Anderson M (1996) http://www.w3.org/Graphics/Color/sRGB.html. Last accessed 11 Apr 2016



CMYK Systems
Stephen Westland and Vien Cheung
School of Design, University of Leeds, Leeds, UK

Abstract Color vision is based upon the responses of three classes of cones in the retina, each of which has broadband sensitivity but maximum sensitivity at different wavelengths. A consequence of this is that color reproduction is trichromatic – the use of three primaries allows a wide range of colors to be reproduced. Color-mixing behavior can be broadly classified as either additive or subtractive. The optimum primaries of the subtractive color system are cyan, magenta, and yellow. The use of cyan, magenta, and yellow subtractive primaries allows a surprisingly large – albeit limited – gamut of colors to be reproduced. In practical printing systems, black is also used so that CMYK is ubiquitous in printing. Extended gamuts can be achieved using more than three or four primaries and systems based on six or more primaries are becoming quite common.

Introduction Color reproduction is trichromatic – the use of three primaries allows a wide range of colors to be reproduced. The notion that there was something of a triple nature in color emerged in the seventeenth century, and by 1722 Le Blon was creating color images using three separations. The first trichromatic color photograph was produced by Maxwell in 1861, and Maxwell's method remains fundamental to modern processes of color reproduction (Hunt 2006). The optimum primaries of the subtractive color system are cyan, magenta, and yellow. The use of cyan, magenta, and yellow subtractive primaries allows a surprisingly large – albeit limited – gamut of colors to be reproduced.

Additive and Subtractive Color Mixing The gamut of reproducible colors for a trichromatic system is limited and is always smaller than the gamut of all the possible colors in the world. Moreover, the gamut is smaller or larger depending upon the choice of primaries. In order to fully appreciate this, it is necessary to differentiate between two classes of color reproduction: additive and subtractive color mixing. Additive color mixing describes the color mixing of lights and is normally realized in the spatial superposition of light from three primaries; pragmatically, the largest gamut is achieved when the additive primaries are red (R), green (G), and blue (B) (see chapter "▶ RGB Systems"). However, many color-image reproduction technologies (most notably printing) are based on primaries that absorb light rather than emit light, and these are characterized by subtractive mixing theory. Subtractive systems involve colored dyes, pigments, or filters (generically referred to as colorants) that absorb radiant power from selected regions of the electromagnetic spectrum. The dyes used in inks and paints selectively absorb certain wavelengths more than others; the light that is reflected (e.g., from the paper or textile to which the dye is applied) is that light that is not absorbed.


Fig. 1 Subtractive color mixing with cyan, magenta, and yellow primaries. The combination of two primaries can produce red, green, and blue

The color physics of pigments (also used in inks and paints) is a little more complex, since they typically also scatter light. However, in the case of both dyes and pigments, since the optimum gamut of an additive trichromatic color-mixing system results from red, green, and blue primaries, it follows that the optimum subtractive gamut would result from a dye (cyan) that primarily absorbs in the red region of the spectrum, a dye (magenta) that primarily absorbs in the green region of the spectrum, and a dye (yellow) that primarily absorbs in the blue region of the spectrum. So, for example, by controlling the amount of cyan used in a subtractive system, the amount of red light reflected is modulated. Figure 1 provides a schematic representation of subtractive color mixing with cyan, magenta, and yellow primaries.
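As a toy illustration of this subtractive behavior, idealized transparent colorants can be modeled by multiplying transmittance spectra wavelength by wavelength, a simplification that ignores scattering. The block-dye spectra below are invented for the sketch (with transitions near the 490 and 580 nm values cited in the next section), not measured dyes:

```python
import numpy as np

wavelengths = np.arange(400, 701, 10)  # nm, coarse sampling of the visible range

# Idealized block-dye transmittances: cyan absorbs the long (red) wavelengths,
# yellow absorbs the short (blue) wavelengths
cyan = np.where(wavelengths < 580, 1.0, 0.05)
yellow = np.where(wavelengths > 490, 1.0, 0.05)

# Subtractive mixing: transmittances multiply wavelength by wavelength,
# leaving mainly the middle (green) band of the spectrum
green_mix = cyan * yellow
```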

Ideal and Realistic Subtractive Primaries Since the purpose of the cyan (C), magenta (M), and yellow (Y) colorants is to absorb red, green, and blue light, respectively, one could argue that ideally they should have block spectral transmission curves. Such ideal transmission curves, also referred to as block dyes, are such that at every wavelength two of the colorants have 100 % transmission and the third is absorbing. Hunt (2006) notes that the optimum positions of the transition wavelengths are somewhat indeterminate but cites a study that gives values of around 490 and 580 nm (Clarkson and Vickerstaff 1948). The gamut of a subtractive color-mixing system comprising three such block dyes would be very similar to the gamuts of many additive color-mixing systems based on RGB. In practice, however, it is not possible to achieve dyes or pigments with "block" spectral properties. The reason for this is that the spectral properties of colorants are constrained (by the mechanisms by which they interact with light) to vary smoothly with the wavelength of light (Maloney 1986). Figure 2 shows the spectral transmission properties of a set of physically realizable CMY dyes. The sloping sides of most available colorants' spectral curves result in so-called unwanted absorptions. So, for example, the cyan dye illustrated in the upper pane of Fig. 2 absorbs at wavelengths greater than 580 nm but also exhibits some (unwanted) absorption at lower wavelengths. These unwanted absorptions result in some colors, especially blues and greens, being reproduced too dark.


Fig. 2 Spectral reflectance factors of physically realizable cyan (upper), magenta (middle), and yellow (lower) dyes, each shown at four concentrations (reflectance factor 0–1 plotted against wavelength, 400–700 nm)

Fig. 3 Schematic representation of the gamut of the dyes illustrated by Fig. 2 (plotted in CIE x, y chromaticity coordinates)

However, the unwanted absorptions also allow some colors to be reproduced that lie outside the triangle formed by the primaries in a chromaticity diagram (Hunt 2006). Figure 3 shows the gamut of the three dyes described in Fig. 2. We can therefore see that whereas additive trichromatic systems have strictly triangular gamuts, subtractive trichromatic systems in practice typically have convex or concave gamuts (when plotted in the 2-D CIE chromaticity space) whose shape can be quite complex. For a good choice of primaries (such as CMY), the gamut is large and convex; for a poor choice of primaries (such as RGB), the subtractive gamut is small and concave. However, when comparing gamuts of color-reproduction systems, we need to be aware that the gamuts are 3-D and that looking at 2-D projections of these can be misleading (Morovic 2008).



Table 1 Standard colors produced by CEI inks

Inks                   CIE x   CIE y   CIE Y
Yellow                 0.437   0.494   77.8
Magenta                0.464   0.232   17.1
Cyan                   0.153   0.196   21.9
Magenta over yellow    0.613   0.324   16.3
Cyan over yellow       0.194   0.526   16.5
Cyan over magenta      0.179   0.101   2.8

The literature on computer graphics presents relatively simple formulas for the relationship between the additive primaries (RGB) and the subtractive primaries. For example, Foley et al. (1997) postulate the following:

C = 1 − R
M = 1 − G
Y = 1 − B

These relationships would only be true, of course, if the subtractive primaries were ideal (block) colorants. Due to the spectral overlap among the colorants, converting to CMY using the "one-minus-RGB" method works for applications such as business graphics, where accurate color need not be preserved, but it fails to produce acceptable color images for color-critical applications. The true relationship between RGB and CMY in the real world is complex and nonlinear (Westland and Ripamonti 2004).
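A minimal sketch of this "one-minus-RGB" idiom, extended with a simple black-generation step (K = min(C, M, Y)) of the kind motivated in the next section. The rescaling heuristic is a common convention, not something specified in the text:

```python
def rgb_to_cmyk_naive(r, g, b):
    """Naive RGB (0-1) to CMYK via 'one-minus-RGB' plus simple black generation.

    Adequate for business graphics; real printer conversions are nonlinear
    and colorant-specific (see text).
    """
    c, m, y = 1.0 - r, 1.0 - g, 1.0 - b
    k = min(c, m, y)            # replace the shared gray component with black ink
    if k == 1.0:                # pure black: avoid division by zero below
        return 0.0, 0.0, 0.0, 1.0
    return (c - k) / (1 - k), (m - k) / (1 - k), (y - k) / (1 - k), k
```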

Process Colors Since printing systems using cyan, magenta, and yellow primaries can generate relatively large color gamuts, such systems are ubiquitous in commercial color printing. However, printing black by overlaying cyan, magenta, and yellow ink suffers from major problems. Colored ink is expensive, and printing three ink layers can result in the printed paper becoming excessively wet (which can reduce printing press speeds). In addition, small errors in registration of the CMY layers could result in the black having colored edges, and the black produced from a mixture of three colors is not always a very good black. For these reasons, and since black is a very important and common color when printing, a fourth color is often incorporated into CMY systems. Black ink – which can be manufactured inexpensively from carbon black pigment – is denoted by the letter K; the black printing plate in offset printing processes is historically referred to as the key plate, from which the initial K derives (Gatter 2004). The CMYK four-color printing model is sometimes referred to as the process color model. The use of CMY inks of different color properties would affect the range of colors that can be produced, so some standardization of the colors of inks is clearly desirable (Hunt 2006). Consequently, the Comité Européen d'Imprimerie (CEI) has standardized the colors that should be produced, under CIE illuminant D65 and for certain print conditions (e.g., 1 μm ink thickness on coated paper with no optical bleaching agent), for the CMY inks printed individually and in pairs. The CEI standard colors are shown in Table 1 (Hunt 2006). Note that specifying the inks in pairs better constrains the spectral absorption curves than specifying them only in individual use.



Beyond CMYK Four-color printing is used in many applications; however, the gamut of CMYK printing is limited. For example, it has been estimated that the CMYK process can generate only about 60 % of the Pantone color formula guide (Drew and Meyer 2008), which is itself a limited color gamut. Consequently, for high-quality color prints, a spot-color process can be used, where individually colored inks are employed to create specific colors rather than relying upon CMY subtractive mixing. An alternative approach is to use more than three or four colorants. For example, Pantone's proprietary six-color process using CMYKOG (orange and green being added to the process colors) and other so-called hexachrome systems (often based on CMYK with a light cyan and a light magenta ink added) are frequently found even in relatively low-cost desktop inkjet printing systems. These systems can provide quite large color gamuts.

Summary The optimum primaries of the subtractive color system are cyan, magenta, and yellow. The use of cyan, magenta, and yellow subtractive primaries allows a large – albeit limited – gamut of colors to be reproduced. Primarily for practical reasons, a separate black ink is usually used so that CMYK is the basis of many color-reproduction systems. Extended gamuts can be achieved using more than three or four primaries and systems based on six or more primaries are becoming quite common.

Further Reading
Clarkson ME, Vickerstaff T (1948) Brightness and hue of present-day dyes in relation to colour photography. Phot J 88b:26
Drew JT, Meyer SA (2008) Color management: a comprehensive guide for graphic designers. RotoVision SA, Switzerland
Foley JD, van Dam A, Feiner SK, Hughes JF, Phillips RL (1997) Introduction to computer graphics. Addison-Wesley, Reading
Gatter M (2004) Getting it right in print: digital pre-press for graphic designers. Laurence-King Publishing, London
Hunt RWG (2006) The reproduction of colour, 6th edn. Wiley, Chichester
Maloney LT (1986) Evaluation of linear models of surface spectral reflectance with small numbers of parameters. J Opt Soc Am A 3(10):1673–1683
Morovic J (2008) Color gamut mapping. Wiley, Chichester
Sharma G (2002) Digital color imaging handbook. CRC Press, Boca Raton
Westland S, Ripamonti C (2004) Computational colour science using MATLAB. Wiley, Chichester



Uniform Color Spaces
Vien Cheung
School of Design, University of Leeds, Leeds, UK

Abstract In 1976, the CIE specified CIELAB and CIELUV, two perceptually uniform color spaces for estimating the magnitude of the difference between two color stimuli. These color spaces were also designed to provide color-difference equations and to interpret color difference in terms of the dimensions of lightness, hue, and chroma. The CIELAB and CIELUV color-difference equations have been widely used in industry. However, they do not accurately quantify small- to medium-size color differences. More advanced equations, based upon modifications of the CIELAB color-difference equation, were subsequently developed. The CIEDE2000 color-difference equation is the current CIE recommendation for computing small color differences. Future research in the area of color difference may be based upon a more uniform color space, grounded in color-vision theory, and capable of accounting for different viewing parameters.

Introduction The CIE (Commission internationale de l'éclairage or International Commission on Illumination) 1931 XYZ system has been effective for measuring the luminance and chrominance of a color (see chapter "▶ The CIE System"). However, the distribution of colors in its x, y chromaticity diagram is very nonuniform: changes in chromaticity or luminance are not linearly related to changes in visual perception, so equal changes in x, y, or Y do not correspond to perceived differences of equal magnitude. In 1976, the CIE suggested two color spaces, CIELAB and CIELUV, as a way to overcome the limitations of the CIE XYZ system (CIE 1978). The visual magnitudes of color differences are intended to be approximately proportional to distances in these spaces.

The CIELAB and CIELUV Spaces CIELAB and CIELUV were primarily suggested to provide color-difference equations, but they were also designed to express color differences correlated with the perceptual attributes hue, lightness, and chroma. These spaces are represented by plotting the three attributes along axes at right angles to one another. Both the CIELAB (concerned with subtractive mixture, e.g., surface colorants) and CIELUV (for additive mixture of colored light, e.g., television) color spaces have the same lightness scale L*, which is defined in terms of the ratio of the Y tristimulus value of the color considered to that of the reference white Yn as follows:

$$L^* = 116\,(Y/Y_n)^{1/3} - 16 \quad \text{for } Y/Y_n > 0.008856$$
$$L^* = 903.3\,(Y/Y_n) \qquad\;\; \text{for } Y/Y_n \le 0.008856 \tag{1}$$


Fig. 1 A schematic representation of the CIELAB color space (lightness L* runs from black at 0 to white at 100; the opponent axes run from −a* green to +a* red and from −b* blue to +b* yellow; C*ab is chroma and hab the hue angle)

The opponent color axes, approximately red-green (a*) versus yellow-blue (b*) for the CIELAB color space (as shown in Fig. 1), are defined in Eq. 2. Chroma (C*ab) and hue (hab) are calculated from a* and b* as Eqs. 3 and 4, respectively:

$$a^* = 500\,[f(X/X_n) - f(Y/Y_n)], \quad b^* = 200\,[f(Y/Y_n) - f(Z/Z_n)] \tag{2}$$

where

$$f(I) = I^{1/3} \quad \text{for } I > 0.008856; \qquad f(I) = 7.787\,I + 16/116 \quad \text{for } I \le 0.008856$$

$$C^*_{ab} = \left(a^{*2} + b^{*2}\right)^{1/2} \tag{3}$$

$$h_{ab} = \arctan(b^*/a^*) \tag{4}$$

Similarly, the CIELUV color space contains the opponent color axes, approximately red-green (u*) versus yellow-blue (v*), which are defined in Eq. 5:

$$u^* = 13\,L^*\,(u' - u'_n), \quad v^* = 13\,L^*\,(v' - v'_n) \tag{5}$$

Saturation (suv), chroma (C*uv, which also correlates with saturation in the CIELUV space), and hue (huv) are calculated, respectively, in Eqs. 6, 7, and 8:

$$s_{uv} = 13\left[(u' - u'_n)^2 + (v' - v'_n)^2\right]^{1/2} \tag{6}$$

$$C^*_{uv} = \left(u^{*2} + v^{*2}\right)^{1/2} = L^*\,s_{uv} \tag{7}$$

$$h_{uv} = \arctan(v^*/u^*) \tag{8}$$
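Eqs. 1–4 translate directly into code. A minimal sketch (Xn, Yn, Zn are the reference-white tristimulus values; atan2 is used so the hue angle falls in the correct quadrant, which the bare arctan of Eq. 4 leaves implicit):

```python
import math

def xyz_to_lab(X, Y, Z, Xn, Yn, Zn):
    """CIELAB lightness, opponent axes, chroma, and hue per Eqs. 1-4."""
    def f(I):
        return I ** (1 / 3) if I > 0.008856 else 7.787 * I + 16 / 116

    ratio = Y / Yn
    L = 116 * ratio ** (1 / 3) - 16 if ratio > 0.008856 else 903.3 * ratio
    a = 500 * (f(X / Xn) - f(Y / Yn))
    b = 200 * (f(Y / Yn) - f(Z / Zn))
    C = math.hypot(a, b)                         # chroma, Eq. 3
    h = math.degrees(math.atan2(b, a)) % 360     # hue angle in degrees, Eq. 4
    return L, a, b, C, h
```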

Applications to Colorimetry The CIE system offers a precise means of specifying a color stimulus under a set of viewing conditions, and it suggested the CIELAB and CIELUV uniform color spaces as useful representations of colors that correlate with perceptual attributes (CIE 1978). Another important use of the CIE system is the evaluation of perceived color difference. Color-difference equations are designed to provide quantitative representations of the perceived color differences between pairs of colored samples. Color differences in the CIELAB (Eq. 9) and CIELUV (Eq. 10) spaces are measured by the Euclidean distance between the coordinates of the two stimuli. One unit represents approximately one just-noticeable difference (JND) for a pair of samples viewed side by side (Hunt 1998):

$$\Delta E^*_{ab} = \left[\Delta L^{*2} + \Delta a^{*2} + \Delta b^{*2}\right]^{1/2} = \left[\Delta L^{*2} + \Delta H^{*2}_{ab} + \Delta C^{*2}_{ab}\right]^{1/2} \tag{9}$$

where $\Delta H^*_{ab} = \left[\Delta E^{*2}_{ab} - (\Delta L^*)^2 - \Delta C^{*2}_{ab}\right]^{1/2}$

$$\Delta E^*_{uv} = \left[\Delta L^{*2} + \Delta u^{*2} + \Delta v^{*2}\right]^{1/2} = \left[\Delta L^{*2} + \Delta H^{*2}_{uv} + \Delta C^{*2}_{uv}\right]^{1/2} \tag{10}$$

where $\Delta H^*_{uv} = \left[\Delta E^{*2}_{uv} - (\Delta L^*)^2 - \Delta C^{*2}_{uv}\right]^{1/2}$
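The CIELAB distance of Eq. 9 is then a one-liner; a sketch with invented values:

```python
def delta_e_ab(lab1, lab2):
    """Euclidean CIELAB color difference, Eq. 9."""
    return sum((p - q) ** 2 for p, q in zip(lab1, lab2)) ** 0.5

# Roughly one just-noticeable difference for side-by-side samples
d = delta_e_ab((50.0, 10.0, -5.0), (50.5, 10.5, -5.5))
```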

Optimized Color-Difference Equations CIELAB and CIELUV were derived from nonlinear transformations of the CIE XYZ system and have been widely used in the color industries. The two CIE-recommended equations, however, do not accurately quantify small- to medium-size color differences (Luo 1999). Many attempts have been made to modify the CIELAB color-difference equation to develop more advanced equations, including JPC79 (McDonald 1980), CMC(l:c) (Clarke et al. 1984), BFD(l:c) (Luo and Rigg 1987a, b), CIE94 (CIE 1995), and CIEDE2000 (Luo et al. 2001).

JPC79 and CMC(l:c) Color-Difference Equations McDonald accumulated a large number of data involving polyester thread pairs and carried out visual pass/fail color-matching assessments (McDonald 1980). The visual results were used to derive the JPC79 equation. At a later stage, the JPC79 equation was modified, due to the problem that some anomalies were found for colors close to neutral and black (Smith 1997), and renamed as the CMC(l:c) equation:



$$\Delta E_{CMC(l:c)} = \left[\left(\frac{\Delta L^*_{ab}}{l\,S_L}\right)^2 + \left(\frac{\Delta C^*_{ab}}{c\,S_C}\right)^2 + \left(\frac{\Delta H^*_{ab}}{S_H}\right)^2\right]^{1/2} \tag{11}$$

where

$$S_L = 0.040975\,L^*_{ab,std}\,/\,(1 + 0.01765\,L^*_{ab,std}) \quad \text{if } L^*_{ab,std} \ge 16; \text{ otherwise } S_L = 0.511$$
$$S_C = 0.0638\,C^*_{ab,std}\,/\,(1 + 0.0131\,C^*_{ab,std}) + 0.638, \qquad S_H = S_C\,(T f + 1 - f)$$

The terms T and f are given by

$$f = \left[\left(C^*_{ab,std}\right)^4 / \left(\left(C^*_{ab,std}\right)^4 + 1900\right)\right]^{1/2}$$
$$T = 0.36 + \left|0.4\cos(h_{ab,std} + 35^\circ)\right| \quad \text{if } h_{ab,std} < 164^\circ \text{ or } h_{ab,std} > 345^\circ$$
$$\text{otherwise } T = 0.56 + \left|0.2\cos(h_{ab,std} + 168^\circ)\right|$$

The CMC(l:c) equation is based upon the CIELAB color space, and the terms L*, C*ab, and H*ab correspond to the CIELAB lightness, chroma, and hue, respectively. The terms SL, SC, and SH define the lengths of the semiaxes of the tolerance ellipsoid at the position of the standard in CIELAB space in each of the lightness (SL), chroma (SC), and hue (SH) directions. The ellipsoids were fitted to visual tolerances determined from psychophysical experiments, and the dimension of the ellipsoid is a function of the position of the standard in the color space. The parametric terms l and c allow the ratio between the lightness and chroma components to be adjusted. It is considered that there is greater acceptance for shifts in the lightness dimension than in the chromatic (chroma and hue) dimensions. For predicting the perceptibility of color differences, it was recommended that both l and c equal 1, whereas for predicting the acceptability of color differences, it was recommended that l and c equal 2 and 1, respectively. The subscript std refers to the standard of a pair of samples. The CMC(l:c) equation has been widely used in a number of industries and became an ISO standard for the textile industry in 1995 (Luo et al. 2001). It was also adopted as a British standard (BS 6923) and an AATCC test method (AATCC 173).
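A sketch of Eq. 11 in code, following the conventions in the text: the weighting functions are evaluated at the standard of the pair, and the hue difference is recovered from the Euclidean residual (ΔH² = Δa² + Δb² − ΔC²):

```python
import math

def delta_e_cmc(lab_std, lab_batch, l=2.0, c=1.0):
    """CMC(l:c) color difference, Eq. 11; l=2, c=1 is the acceptability setting."""
    L1, a1, b1 = lab_std
    L2, a2, b2 = lab_batch
    C1, C2 = math.hypot(a1, b1), math.hypot(a2, b2)
    dL, dC = L1 - L2, C1 - C2
    dH2 = max((a1 - a2) ** 2 + (b1 - b2) ** 2 - dC ** 2, 0.0)
    h1 = math.degrees(math.atan2(b1, a1)) % 360   # hue angle of the standard

    SL = 0.511 if L1 < 16 else 0.040975 * L1 / (1 + 0.01765 * L1)
    SC = 0.0638 * C1 / (1 + 0.0131 * C1) + 0.638
    f = (C1 ** 4 / (C1 ** 4 + 1900)) ** 0.5
    if 164 <= h1 <= 345:
        T = 0.56 + abs(0.2 * math.cos(math.radians(h1 + 168)))
    else:
        T = 0.36 + abs(0.4 * math.cos(math.radians(h1 + 35)))
    SH = SC * (f * T + 1 - f)
    return ((dL / (l * SL)) ** 2 + (dC / (c * SC)) ** 2 + dH2 / SH ** 2) ** 0.5
```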



BFD(l:c) Color-Difference Equation Luo and Rigg (1986, 1987a, b) accumulated a large set of experimental data relating to small and medium color differences between pairs of surface colors and developed the BFD(l:c) color-difference equation. The structure of the BFD(l:c) equation is similar to that of the CMC(l:c) equation; however, it includes an additional term that takes into account the fact that the chromaticity ellipses do not all point toward the neutral point, as is assumed in the CMC(l:c) equation. The effect is most significant in the blue region:

$$\Delta E_{BFD} = \left[\left(\frac{\Delta L_{BFD}}{l}\right)^2 + \left(\frac{\Delta C^*_{ab}}{c\,D_C}\right)^2 + \left(\frac{\Delta H^*_{ab}}{D_H}\right)^2 + R_T\,\frac{\Delta C^*_{ab}\,\Delta H^*_{ab}}{D_C\,D_H}\right]^{1/2} \tag{12}$$

where

$$L_{BFD} = 54.6\,\log(Y + 1.5) - 9.6$$
$$D_C = 0.035\,\bar{C}^*_{ab}\,/\,(1 + 0.00365\,\bar{C}^*_{ab}) + 0.521, \qquad D_H = D_C\,(G\,T' + 1 - G)$$
$$G = \left[\bar{C}^{*4}_{ab} / \left(\bar{C}^{*4}_{ab} + 14000\right)\right]^{1/2}$$
$$T' = 0.627 + 0.055\cos(\bar{h}_{ab} - 254^\circ) - 0.040\cos(2\bar{h}_{ab} - 136^\circ) + 0.070\cos(3\bar{h}_{ab} - 32^\circ) + 0.049\cos(4\bar{h}_{ab} + 114^\circ) - 0.015\cos(5\bar{h}_{ab} + 103^\circ)$$
$$R_T = R_C\,R_H$$
$$R_H = -0.260\cos(\bar{h}_{ab} - 308^\circ) - 0.379\cos(2\bar{h}_{ab} - 160^\circ) - 0.636\cos(3\bar{h}_{ab} + 254^\circ) + 0.226\cos(4\bar{h}_{ab} + 140^\circ) - 0.194\cos(5\bar{h}_{ab} + 280^\circ)$$
$$R_C = \left[\bar{C}^{*6}_{ab} / \left(\bar{C}^{*6}_{ab} + 7 \times 10^7\right)\right]^{1/2}$$

The terms C̄*ab and h̄ab refer to the arithmetic mean values of the chroma and hue angle of the sample pair. Both l and c equal 1 for predicting the perceptibility of color differences; l and c equal 1.5 and 1, respectively, for predicting the acceptability of color differences.

CIE94 Color-Difference Equation Berns suggested a color-difference equation derived, again, by modifying the CIELAB equation (Berns 1993). The equation was later recommended by the CIE in 1994 (CIE 1995) and is named the CIE94 color-difference equation. It has a similar structure to that of the CMC(l:c) equation but with simpler weighting functions. The CIE94 formula is given by

$$\Delta E^*_{94} = \left[\left(\frac{\Delta L^*}{K_L S_L}\right)^2 + \left(\frac{\Delta C^*_{ab}}{K_C S_C}\right)^2 + \left(\frac{\Delta H^*_{ab}}{K_H S_H}\right)^2\right]^{1/2} \tag{13}$$

where

$$S_L = 1, \qquad S_C = 1 + 0.045\,C^*_{ab,std}, \qquad S_H = 1 + 0.015\,C^*_{ab,std}$$

The parametric factors KL, KC, and KH are included to correct for variation in experimental conditions.



For all applications except the textile industry, a value of 1 is recommended for all the parametric factors. For the textile industry, CIE94(2:1:1) is recommended, where KL equals 2 and KC and KH equal 1.
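Eq. 13 in code, as a sketch (the weighting functions are again evaluated at the standard sample, and the hue difference is recovered from the Euclidean residual as before):

```python
import math

def delta_e_94(lab_std, lab_batch, kL=1.0, kC=1.0, kH=1.0):
    """CIE94 color difference, Eq. 13; use kL=2 for the textile variant CIE94(2:1:1)."""
    L1, a1, b1 = lab_std
    L2, a2, b2 = lab_batch
    C1, C2 = math.hypot(a1, b1), math.hypot(a2, b2)
    dL, dC = L1 - L2, C1 - C2
    dH2 = max((a1 - a2) ** 2 + (b1 - b2) ** 2 - dC ** 2, 0.0)
    SL, SC, SH = 1.0, 1 + 0.045 * C1, 1 + 0.015 * C1
    return ((dL / (kL * SL)) ** 2 + (dC / (kC * SC)) ** 2
            + dH2 / (kH * SH) ** 2) ** 0.5
```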

CIEDE2000 Color-Difference Equation Both the CMC(l:c) and CIE94 color-difference equations have been standardized, but by different organizations: CMC(l:c) by ISO (1995) and CIE94 by the CIE (1995). However, there are large discrepancies between the two equations in predicting lightness differences, and both have problems predicting grayish and bluish colors (Luo 1999). The value of the SL function in the CMC(l:c) equation increases markedly as L* increases, implying that for equal differences in L* the visual difference should be largest for the smaller L* values. The lightness correction in the CIE94 equation, by contrast, implies that equal differences in L* yield equal visual differences at all values of L*. The CIE subsequently formed Technical Committee (TC) 1-47 to develop a generalized color-difference equation. The CIEDE2000 equation was agreed by the CIE (2001) and includes not only lightness, chroma, and hue weighting functions but also an interactive term between chroma and hue differences, which improves the performance for blue colors, and a scaling factor for the CIELAB a* scale, which improves the performance for colors close to the achromatic axis. The equation is computed in four steps:

Step 1: Calculate the CIELAB L*, a*, and b* values.

Step 2: Calculate a', C'ab, and h'ab:

$$L' = L^*, \quad a' = (1 + G)\,a^*, \quad b' = b^*$$
$$C'_{ab} = \left(a'^2 + b'^2\right)^{1/2}, \quad h'_{ab} = \arctan(b'/a')$$

where

$$G = 0.5 - 0.5\left[\bar{C}^{*7}_{ab} / \left(\bar{C}^{*7}_{ab} + 25^7\right)\right]^{1/2}$$

Step 3: Calculate ΔL', ΔC', and ΔH':

$$\Delta L' = L'_{batch} - L'_{std}, \quad \Delta C'_{ab} = C'_{ab,batch} - C'_{ab,std}$$
$$\Delta H'_{ab} = 2\left(C'_{ab,batch}\,C'_{ab,std}\right)^{0.5} \sin\!\left(\Delta h'_{ab}/2\right), \quad \text{where } \Delta h'_{ab} = h'_{ab,batch} - h'_{ab,std}$$

Step 4: Calculate CIEDE2000 ΔE00:



$$\Delta E_{00} = \left[\left(\frac{\Delta L'}{k_L S_L}\right)^2 + \left(\frac{\Delta C'_{ab}}{k_C S_C}\right)^2 + \left(\frac{\Delta H'_{ab}}{k_H S_H}\right)^2 + R_T\left(\frac{\Delta C'_{ab}}{k_C S_C}\right)\left(\frac{\Delta H'_{ab}}{k_H S_H}\right)\right]^{1/2} \tag{14}$$

where

$$S_L = 1 + \frac{0.015\,(\bar{L}' - 50)^2}{\left[20 + (\bar{L}' - 50)^2\right]^{1/2}}, \quad S_C = 1 + 0.045\,\bar{C}'_{ab}, \quad S_H = 1 + 0.015\,\bar{C}'_{ab}\,T$$
$$T = 1 - 0.17\cos(\bar{h}'_{ab} - 30^\circ) + 0.24\cos(2\bar{h}'_{ab}) + 0.32\cos(3\bar{h}'_{ab} + 6^\circ) - 0.20\cos(4\bar{h}'_{ab} - 63^\circ)$$
$$R_T = -\sin(2\Delta\theta)\,R_C, \quad \Delta\theta = 30\exp\left\{-\left[(\bar{h}'_{ab} - 275^\circ)/25\right]^2\right\}, \quad R_C = 2\left[\bar{C}'^{\,7}_{ab} / \left(\bar{C}'^{\,7}_{ab} + 25^7\right)\right]^{1/2}$$

Note that L̄', C̄'ab, and h̄'ab are the arithmetic means of the L', C'ab, and h'ab values of a pair of samples. Caution needs to be taken for colors having hue angles in different quadrants: if the difference between the two hue angles is less than 180°, the arithmetic mean should be used; otherwise, 360° should first be subtracted from the larger angle before the arithmetic mean is calculated. The CIEDE2000 equation has been shown to perform better than the CMC(l:c) and CIE94 equations (Luo et al. 2001; Cui et al. 2001; Luo 2002), and it is the current CIE recommendation for computing small color differences.

Summary The need for a uniform color space resulted in the specification of the CIELAB and CIELUV color spaces as a way to represent colors that correlates with perceptual attributes. Another important use of the CIE system is the evaluation of perceived color difference between pairs of color stimuli. Color differences specified in the CIELAB and CIELUV spaces are widely adopted by industry. More advanced color-difference equations were developed, based upon modifications of the CIELAB color-difference equation, to better quantify small- to medium-size color differences. Note, however, that all these



color-difference equations were derived based on the perception of spatially uniform color patches. CIE colorimetry only considers color matching between two stimuli under identical conditions, including surround, background, size, shape, texture, and illuminating/viewing geometry. Color matches defined by the CIE may no longer be valid if any of these constraints is violated. Future research in the area of color difference may be based upon a more uniform color space rather than modifications of CIELAB. Instead of the empirical approach, it is expected that formulas may be based on color-vision theory and be capable of accounting for different viewing parameters such as sample size, size of color difference, spatial separation, background, and luminance level (CIE 1993).

Further Reading
Berns RS (1993) The mathematical development of CIE TC 1-29 proposed colour difference equation: CIELCH. In: Proceedings of the seventh congress of the International Colour Association B, Budapest, pp C19.1–C19.4
CIE (1978) Recommendations on uniform color spaces, color difference equations, psychometric color terms. Supplement 2 to CIE publication 15 (E1.3.1) 1971/(TC1.3). Central Bureau of the Commission Internationale de l'Éclairage, Vienna
CIE (1993) Parametric effects in colour difference evaluation, CIE publication 101. Central Bureau of the Commission Internationale de l'Éclairage, Vienna
CIE (1995) Industrial colour-difference evaluation, CIE publication 116. Central Bureau of the Commission Internationale de l'Éclairage, Vienna
CIE (2001) Improvement to industrial colour-difference evaluation, CIE publication 142. Central Bureau of the Commission Internationale de l'Éclairage, Vienna
Clarke FJJ, McDonald R, Rigg B (1984) Modification to the JPC79 colour difference formula. J Soc Dyers Colour 100:128–132
Cui G, Luo MR, Rigg B, Li W (2001) Colour-difference evaluation using CRT colours. Part I: data gathering and testing colour-difference formulae. Color Res Appl 26(5):394–402
Hunt RWG (1998) Measuring colour, 3rd edn. Fountain Press, Kingston-upon-Thames
ISO (1995) ISO 105-J03:1995 Textiles – test for colour fastness – Part 3: calculation of colour differences. International Organization for Standardization, Geneva
Kuehni RG (2003) Color space and its divisions. Wiley, New York
Luo MR (1999) Colour science: past, present and future. In: MacDonald LW, Luo MR (eds) Colour imaging, vision and technology. Wiley, New York
Luo MR (2002) Development of colour-difference formulae. Rev Prog Color 32:28–39
Luo MR, Rigg B (1986) Chromaticity-discrimination ellipses for surface colours. Color Res Appl 11:25–42
Luo MR, Rigg B (1987a) BFD(l:c) colour-difference formula – Part I: development of the formula. J Soc Dyers Colour 103:86–94
Luo MR, Rigg B (1987b) BFD(l:c) colour-difference formula – Part II: performance of the formula. J Soc Dyers Colour 103:126–132
Luo MR, Cui G, Rigg B (2001) The development of the CIE 2000 colour difference formula: CIEDE2000. Color Res Appl 26(5):340–350
McDonald R (1980) Industrial pass/fail colour matching – Part 1: preparation of visual colour-matching data. J Soc Dyers Colour 96:372–376
Ohta N, Robertson A (2003) Colorimetry: fundamentals and applications. Wiley, Chichester
Smith KJ (1997) Colour-order systems, colour spaces, colour difference and colour scales. In: MacDonald R (ed) Colour physics for industry. Society of Dyers and Colourists, Bradford



Color Perception
Marina Bloj and Monika Hedrich
Bradford School of Optometry and Vision Sciences, University of Bradford, Bradford, UK

Abstract In this chapter we present an overview of how we perceive color. We start with an outline of the physiology of the human visual system and discuss both the trichromatic and color-opponent theories of color vision. We conclude with a description of the phenomenon of color constancy and the factors that contribute to it.

List of Abbreviations
L   Long
M   Medium
nm  Nanometers
S   Short

Introduction The human visual system is sensitive to electromagnetic radiation with wavelengths between approximately 380 and 780 nm. The process of vision starts when radiation of this wavelength range is absorbed by the photoreceptors in the retina. The human retina contains two types of photoreceptors, called rods and cones. There are approximately 120 million rods and 6 million cones, which are unevenly distributed throughout the human retina (Curcio et al. 1987; Osterberg 1935). Rods are completely absent from the center of the fovea. Moving away from this area, the number of rods increases and reaches its maximum at about 15–20° eccentricity, before the density gradually declines again. Rods are responsive to low light levels, that is, in scotopic conditions. Cones have their highest concentration in the fovea. Outside the fovea, their density decreases rapidly and reaches a constant level at approximately 10–15° eccentricity. Cones are responsive to high light levels, that is, in photopic conditions, and are fundamental for visual acuity and color vision. Within the cones, three different photopigments have been identified. Each photopigment absorbs light over a broad range of wavelengths but has its maximum sensitivity at a different wavelength. The cones are named according to the wavelength range of their peak sensitivity: S-cones have their maximum sensitivity in the short-wavelength range at approximately 420 nm, M-cones in the mid-wavelength range at 534 nm, and L-cones in the long-wavelength range at 563 nm (see chapter "▶ Anatomy of the Eye" and Bowmaker and Dartnall (1980)). The sensitivity curve of a cone class is defined by the probability of a photon being absorbed as a function of wavelength (Fig. 1). After a photon of a certain wavelength is absorbed, this energy is transduced into an electrical signal by a complex photochemical reaction.



Fig. 1 The relative sensitivities of the S-, M-, and L-cones as a function of wavelength (derived from Naka and Rushton (1966)). Each curve is normalized to its maximum. Note that the peak sensitivities of the cones are shifted with respect to the ones mentioned above: the present data were obtained using psychophysical measurements, in contrast to the microspectrophotometry used by Bowmaker and Dartnall (1980), and the shifts are due to the transmission characteristics of the media of the eye


Fig. 2 The color-opponent process. The luminance channel is formed from the sum of L- and M-cone signals (L + M); the red/green channel from the comparison of L- and M-cone signals (L − M); the blue/yellow channel from the comparison of S-, M-, and L-cone signals (S − (L + M)) (Adapted from Kaiser and Boynton (1996))

This signal no longer carries separate information about intensity and wavelength but only about the number of photons that have been absorbed by the cone (Principle of Univariance) (Stockman and Sharpe 2000). The signals released by cones and rods are sent via bipolar cells to the ganglion cells. Most ganglion cells receive their input, however, not from a single photoreceptor but from many. These neural networks condense the signals of 126 million photoreceptors onto the approximately one million nerve fibers through which the information leaves the eye. Ganglion cells are crucial for color vision but also carry temporal and spatial information. Regarding color vision, the ganglion cells can be described as chromatically selective, as their responses depend on differential inputs from the three cone classes. Figure 2 shows a schematic of the contribution of the three cone classes to color opponency. While neural processing of visual information is often described as being rather complex within the early stages of the visual system, it becomes much more complex at the higher (cortical) stages.



Around 30 regions in the cerebral cortex have been identified that respond to color, and it has been shown that complex interactions take place between them. This indicates that there is no single color center in the cortex; however, it remains unclear where ultimately the perception of color is formed.

Our Visual System Is More than Trichromatic Because we have three types of cones, our visual system is referred to as trichromatic. A color signal excites each of the cone types to a different degree. Each cone mechanism on its own cannot distinguish between a change in excitation due to an alteration in the intensity of the color signal and one due to a change in the spectral composition of the signal. It is through the comparison of the outputs of the three classes of cones that we discriminate color signals. The trichromacy of our visual system becomes apparent in the fact that we can use the three colored guns (red, green, and blue) of a television set to display more than a million colors (see also chapter "▶ RGB Systems"). Maxwell and Helmholtz in the nineteenth century independently demonstrated that any colored light can be matched by a mixture of three suitably chosen spectral lights. (Any three lights can be used as primaries as long as one of them cannot be matched by a mixture of the other two. The more widely spaced the primaries are in the spectrum, the more lights can be matched by direct mixture of the three primaries. To match some lights, it is necessary to mix one of the primaries with the "test" light and obtain the "matching" light by combination of the other two primaries.) The resulting "matching" light has the same appearance as the "test" light but might have a different spectral distribution. These lights are known as metameric. The fact that metameric stimuli exist for our visual system indicates that the cones fail to produce a different set of outputs for every color signal. The trichromatic theory of color vision was developed solely from observations in light-mixing experiments rather than from anatomical studies of the eye. Thomas Young, Helmholtz, and others inferred from their observations that there were three different types of receptors with overlapping spectral sensitivities. This pioneering idea was later confirmed by the discovery of the three cone classes (e.g., Marks et al. (1964), Nathans et al. (1986)), and the trichromatic theory also became known as the Young-Helmholtz theory. The opponent-process theory was first proposed by the physiologist Ewald Hering in 1892 and is based on the appearance of color. He suggested, after observing the effect of afterimages, that color is processed by the visual system in terms of opponent color pairs. When we stare, for instance, at a red circle for a while and then look at a white piece of paper, we see green, the opponent color of red. Hering originally specified two pairs of opponent colors, red and green and blue and yellow. However, a third opponent pair, black and white, has been included, as brightness is transmitted in a similar way to color. Hering also noted that the perception of opponent colors is mutually exclusive; we never experience yellowish blue or greenish red. Physiological evidence that supports Hering's opponent-process theory was first found by Hurvich and Jameson (1957). Since then extensive research has been carried out to investigate opponent color processing in the visual system (e.g., De Valois et al. (1966, 1997); Derrington et al. (1984)). Nowadays, it is generally accepted that both theories of color vision are partially correct and that color processing occurs in two stages (De Valois and De Valois 1993).
In the first stage, on the receptor level, the Young-Helmholtz theory is correct with regard to the existence of the three cone classes, and the subsequent transmission of the signal to the cortex, the second stage, can be explained by the opponent-process theory.

Our Visual System Is Color Constant In everyday life, we refer to color as a constant property of an object, and we are unaware that most objects only reflect light.

Fig. 3 The emitted light hits a surface and is reflected; the reflected light reaches the eye of an observer who experiences color. Each light source emits an illumination with a characteristic spectral power distribution. Together with the surface reflectance of an object, they create the color signal, which reaches an observer’s eye. The incident color signal is absorbed by the cones, dependent on the cone sensitivities

when illuminated by white light. When the same balloon is illuminated by a specific green light, it appears black because no light will be reflected. This is a rather extreme example, but it demonstrates clearly that an object's color appearance depends both on the reflectance properties of the object and on the light it is illuminated with. When the same object is illuminated by a different light source, the spectral composition of the reflected light changes and should, in theory, lead to a changed color appearance of the object. However, we know from experience that a banana looks yellow no matter which light it is under; in other words, the change in the reflected light goes unnoticed. The fact that object colors appear unchanged despite a change in illumination is known as color constancy (e.g., Kaiser and Boynton (1996); Jameson and Hurvich (1989)), and in order to recognize a surface as unchanged, the visual system must access information about the illuminant and the surface reflectance separately.

In a simplified world, the light that reaches our eyes, commonly referred to as the color signal, is the product of the spectral power distribution of the illuminant and the spectral reflectance function of the surface (Fig. 3). Physically, the reflectance spectrum can be determined by first measuring the color signal (with a spectroradiometer) wavelength by wavelength and then dividing it by the spectral power distribution of the illuminant. In the human eye, however, the color signal is detected by the three classes of cones on the retina, which are unable to transmit full spectral information; they provide the visual system only with information about their individual photon catch. This information changes when the color signal changes, but from it alone the visual system cannot tell whether the alteration is due to a change in illumination or a change in surface. If there is no way to access separate information about the illumination and the surface reflectance from the color signal, how does the visual system achieve color constancy?

The paragraph above assumes a simplified world; the real world is more complex. Usually objects are surrounded by other objects, and there is often more than one light source. Therefore, the light reflected from an observed object depends not only on the reflectance properties of the object's surface but also on the illumination, on other objects, and on their location and position with respect to the illuminant (or illuminants) and each other. Thus, to be color constant, the visual system has to compensate for these various changes.
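Before turning to those complications, the simplified color-signal model is worth pinning down numerically. The Python sketch below uses invented spectra, not measured data; it forms the color signal as the wavelength-by-wavelength product described above and shows that dividing by the illuminant recovers the reflectance, which is exactly the separation the three cone catches alone cannot perform.

```python
# A minimal sketch of the simplified color-signal model described above.
# The spectra are invented placeholders, not measured data.
import numpy as np

wl = np.arange(390, 700, 1)                       # wavelength, nm
illuminant = 1.0 + 0.002 * (wl - 390)             # toy SPD, rising toward long wavelengths
reflectance = np.clip((wl - 500) / 200.0, 0, 1)   # toy long-wavelength reflector ("red balloon")

color_signal = illuminant * reflectance           # what actually reaches the eye

# A spectroradiometer measures color_signal directly; dividing it by the
# illuminant SPD recovers the surface reflectance, as stated in the text.
recovered = np.divide(color_signal, illuminant,
                      out=np.zeros_like(color_signal), where=illuminant > 0)
assert np.allclose(recovered, reflectance)

# The eye, in contrast, encodes only three numbers per patch (the cone
# photon catches), so illuminant and reflectance cannot be separated
# from those three numbers alone.
```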


Why Should the Visual System Be Color Constant?
Natural daylight is not constant during the day but changes considerably from dawn until dusk. Even larger changes in illumination are experienced when coming from outside into a room with artificial illumination. The visual system compensates for these permanent changes and allows the stable perception of color in our environment. Color is a reliable cue to object identification, and it has also been shown to enhance scene recognition (Rinner and Gegenfurtner 2000; Wichmann et al. 2002). As long as objects are under a constant illumination and context, it is sufficient simply to remember a color to recognize it as the same, though such situations are rare. It is much more common to encounter colored objects under changed illumination and context conditions; to recognize their colors, they must first be remembered, and second the change in illuminant and context must be accounted for. Color constancy can therefore be considered a sophisticated form of color memory. In everyday life, it is not always necessary to perceive colors as perfectly constant, but rather to make confident judgments about color categories, for example, to decide whether a banana is yellow or green and, thus, ripe or unripe.

Mechanisms and Cues Contributing to Color Constancy
It has been shown that color constancy is achieved not by a single mechanism but by the interaction of several mechanisms and cues. It is the variety and combination of cues that supports color constancy (e.g., Kraft and Brainard (1999), Kraft et al. (2002)), and it has been suggested that the importance of particular cues varies with their availability and that their weighting is a dynamic process. The phenomenon of color constancy has been known for more than two centuries: Monge (1789) and Hering (circa 1880) produced early demonstrations of the phenomenon, and Helmholtz, in 1868, proposed that the perceived color of an object is the result of mechanisms that take into account the color of the illumination. Since then, studies have sought to quantify the strength of color constancy, determine how the viewing context influences it, and identify the mechanisms involved.

Chromatic adaptation has been identified as a powerful mechanism contributing to color constancy. Pioneering research in this area was conducted by von Kries, who postulated that the mechanisms of chromatic adaptation operate as gain controls on the cone signals. Since then, an enormous number of studies have been dedicated to the investigation of chromatic adaptation (for reviews, see Kaiser and Boynton (1996), Webster (1996)). It was shown that von Kries' hypothesis can explain the phenomenon of color constancy only partially (for a review, see Jameson and Hurvich (1989)) and that chromatic adaptation involves not a single mechanism but several, at low-level as well as high-level stages (e.g., Albright and Stoner (2002), Zaidi et al. (1997)).

Several mechanisms within chromatic adaptation have been identified as determining color appearance. One of these is chromatic adaptation to the spatial average of a scene. This process operates over a large spatial area and is rather slow, taking approximately 2 min to stabilize, that is, to reach 90–95 % of its asymptotic steady state (e.g., Fairchild and Lennie 1992).
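Von Kries' gain-control idea is easy to state computationally: each cone signal is divided by the corresponding cone's response to the adapting field, for example the spatial mean of the scene just discussed. The Python sketch below uses invented cone-excitation numbers purely for illustration.

```python
# A minimal sketch of von Kries adaptation as independent gain controls on
# the cone signals. The numbers are illustrative, not measured excitations.
import numpy as np

def von_kries_adapt(lms, lms_of_adapting_field):
    """Scale each cone signal by a gain set by the adapting field."""
    gains = 1.0 / np.asarray(lms_of_adapting_field, dtype=float)
    return np.asarray(lms, dtype=float) * gains

# The same surface seen under two illuminants: the raw cone catches differ...
surface_under_daylight = np.array([0.40, 0.30, 0.20])
surface_under_tungsten = np.array([0.60, 0.33, 0.10])
mean_daylight = np.array([1.00, 1.00, 1.00])   # scene's spatial mean, illuminant 1
mean_tungsten = np.array([1.50, 1.10, 0.50])   # scene's spatial mean, illuminant 2

# ...but after von Kries scaling the adapted signals coincide, which is why
# this mechanism supports (partial) color constancy.
print(von_kries_adapt(surface_under_daylight, mean_daylight))  # [0.4 0.3 0.2]
print(von_kries_adapt(surface_under_tungsten, mean_tungsten))  # [0.4 0.3 0.2]
```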
Another mechanism is chromatic adaptation to local color contrast (Hurlbert and Wolf 2004; Webster and Mollon 1995). Color contrast arises through the interaction of adjacent surfaces. This process is almost instantaneous (Rinner and Gegenfurtner 2000) and strongly shapes color appearance (Webster et al. 2002). The extent to which adaptation to the spatial mean of a scene and adaptation to local color contrast mediate color constancy was studied by Kraft and Brainard (1999) using an achromatic setting task. They also investigated the effect of adaptation to the most intense scene region, which had been suggested to be crucial for color constancy (e.g., McCann et al. 1976). The setup


used by Kraft and Brainard made it possible to isolate the cues that activate these mechanisms and to study their effects on color constancy separately. Under each condition, observers showed only a moderate level of color constancy, indicating that more than a single mechanism is necessary to achieve high levels of color constancy.

Conclusion
The physiology that underpins human color perception is well documented. However, it does not fully explain our everyday experience of colors, including our ability to remember and recognize them; for this we need to look at the combination of several retinal and cortical mechanisms that are not yet fully understood.

Further Reading
Albright TD, Stoner GR (2002) Contextual influences on visual processing. Annu Rev Neurosci 25:339–379
Bowmaker JK, Dartnall HJ (1980) Visual pigments of rods and cones in a human retina. J Physiol 298:501–511
Brainard DH (2004) Color constancy. In: Chalupa L, Werner J (eds) The visual neurosciences. MIT Press, Cambridge, MA
Curcio CA, Sloan KR, Packer O, Hendrickson AE, Kalina RE (1987) Distribution of cones in human and monkey retina: individual variability and radial asymmetry. Science 236(4801):579–582
De Valois RL, De Valois KK (1993) A multi-stage color model. Vision Res 33(8):1053–1065
De Valois RL, Abramov I, Jacobs GH (1966) Analysis of response patterns of LGN cells. J Opt Soc Am 56(7):966–977
De Valois RL, De Valois KK, Switkes E, Mahon L (1997) Hue scaling of isoluminant and cone-specific lights. Vision Res 37(7):885–897
Derrington A, Krauskopf J, Lennie P (1984) Chromatic mechanisms in lateral geniculate nucleus of macaque. J Physiol 357:241–265
Ebner M (2007) Color constancy. Wiley, Chichester
Fairchild MD, Lennie P (1992) Chromatic adaptation to natural and incandescent illuminants. Vision Res 32(11):2077–2085
Hurlbert A, Wolf K (2004) Color contrast: a contributory mechanism to color constancy. Prog Brain Res 144:147–160
Hurvich LM, Jameson D (1957) An opponent-process theory of color vision. Psychol Rev 64(6):384–404
Jameson D, Hurvich LM (1989) Essay concerning color constancy. Annu Rev Psychol 40:1–22
Kaiser PK, Boynton RM (1996) Human color vision, 2nd edn. Optical Society of America, Washington, DC
Kraft JM, Brainard DH (1999) Mechanisms of color constancy under nearly natural viewing. Proc Natl Acad Sci U S A 96:307–312
Kraft JM, Maloney SI, Brainard DH (2002) Surface-illuminant ambiguity and color constancy: effects of scene complexity and depth cues. Perception 31(2):247–263
Marks WB, Dobelle WH, MacNichol EF Jr (1964) Visual pigments of single primate cones. Science 143:1181–1183


McCann JJ, McKee SP, Taylor TH (1976) Quantitative studies in retinex theory: a comparison between theoretical predictions and observer responses to the "color mondrian" experiments. Vision Res 16(5):445–458
Naka KI, Rushton WA (1966) S-potentials from colour units in the retina of fish (Cyprinidae). J Physiol 185(3):536–555
Nathans J, Thomas D, Hogness DS (1986) Molecular genetics of human color vision: the genes encoding blue, green, and red pigments. Science 232(4747):193–202
Osterberg G (1935) Topography of the layer of rods and cones in the human retina. Acta Ophthalmol Suppl 6(1):11–97
Rinner O, Gegenfurtner KR (2000) Time course of chromatic adaptation for color appearance and discrimination. Vision Res 40(14):1813–1826
Stockman A, Brainard DH (2010) Color vision mechanisms. In: Bass M (ed) The OSA handbook of optics, 3rd edn. McGraw-Hill, New York
Stockman A, Sharpe LT (2000) The spectral sensitivities of the middle- and long-wavelength-sensitive cones derived from measurements in observers of known genotype. Vision Res 40(13):1711–1737
Wandell BA (1995) Foundations of vision. Sinauer Associates, Sunderland
Webster MA (1996) Human colour perception and its adaptation. Netw Comput Neural Syst 7(4):587–634
Webster MA, Mollon JD (1995) Colour constancy influenced by contrast adaptation. Nature 373(6516):694–698
Webster MA, Webster SM, Malkoc G, Bilson AC (2002) Color contrast and contextual influences on color appearance. J Vis 2(6):505–519
Wichmann FA, Sharpe LT, Gegenfurtner KR (2002) The contributions of color to recognition memory for natural scenes. J Exp Psychol Learn Mem Cogn 28(3):509–520
Zaidi Q, Spehar B, DeBonet J (1997) Color constancy in variegated scenes: role of low-level mechanisms in discounting illumination changes. J Opt Soc Am A 14:2608–2621


Color Vision Deficiencies
Vasudevan Lakshminarayanan*
Department of Physics, Department of Electrical and Computer Engineering and School of Optometry and Vision Science, University of Waterloo, Waterloo, ON, Canada
Department of Physics, University of Michigan, Ann Arbor, MI, USA

Abstract
We will discuss basic color vision deficiencies, namely, protanomaly, deuteranomaly, dichromacy, and tritanomaly. The effect of these deficiencies on the Commission Internationale de l'Eclairage (CIE) diagram will be discussed. We also discuss the effect of eye diseases on color vision and briefly describe some color vision tests.

Introduction
John Dalton presented to the Manchester Literary and Philosophical Society in 1794 (the first of 116 lectures he would give to that body) a talk entitled Extraordinary Facts Relating to the Vision of Colours, in which he discussed his own variant color vision. Thomas Young discussed Dalton's observations in his own Lectures on Natural Philosophy (1807). Twenty years later, the term Daltonism was coined to describe red-green color deficiency. If a video display is to be used by those with color deficiencies, then it is essential to use color in ways that effectively convey information to them. In this chapter, we will discuss some aspects of color vision variations or abnormalities.

Types of Color Vision Deficiencies
It should be noted that the term "colorblind" is often used to describe all abnormalities of color vision. However, this term is not quite correct, since there are only two classes of people who cannot see color and can properly be called colorblind: the rod monochromats (or achromats) and the cone monochromats. The former are people who have only the rod photoreceptors; the latter are people who have only rods and one class of cones. Recall that there are three classes of cones: the short-wavelength sensitive (S-cones or blue cones), middle-wavelength sensitive (M-cones or green cones), and long-wavelength sensitive (L-cones or red cones) (see chapter "▶ Anatomy of the Eye"). People with other color vision deficiencies are not colorblind but have color vision that is different from that of the "color normal observer." In general, color-variant (or color-deficient) people see a much smaller number of spectral hues than normal observers. In addition, the relative luminous efficiency of the eye (i.e., the CIE V(λ) curve; see chapter "▶ The CIE System") is altered, and the color matching functions are abnormal (Kaiser and Boynton 1996; Birch 2001; Pokorny et al. 1979). Because of this abnormal color matching, there is color confusion: some colors that look different to color normal observers will look identical to people with color deficiencies. People with congenital color deficiencies have the defect from birth, and it remains stable throughout life. There are also acquired color vision defects due to pathology.

*Email: [email protected]


The most common types of congenital color vision defects are based on whether individuals have difficulty discriminating colors along the red-green axis of the color circle or the blue-yellow axis, with red-green confusion being the most common problem. This kind of deficit occurs in about 8 % of males and 0.4 % of females in the human population. However, the second type of deficiency, confusion along the blue-yellow axis, is rather rare, with a prevalence of about 0.005 % of the population (Delpro et al. 2005). Other facts about the prevalence of color deficiencies are (see below for explanations of the disorders):

1. 0.38 % of women are deuteranomalous (around 95 % of all color-deficient women).
2. 0.005 % of the population are totally color blind.
3. 0.003 % of the population have tritanopia.
4. Protanomaly occurs in about 1 % of males.
5. Deuteranomaly occurs in about 5 % of males.
6. Protanopia occurs in about 1 % of males.
7. Deuteranopia occurs in about 1 % of males.

Physiological/Genetic Basis of Color Vision Deficiencies
There are significant differences between the cone photopigments of the color-deficient population and those of color normals. In particular, red-green color vision defects occur because the photopigments in the L- or M-cones are altered or one of the photopigments is not present. These defects have an X-linked recessive inheritance pattern (Wissinger and Sharpe 1998), and the affected population can be further divided into two groups: (a) protans, whose L-cone photopigment is either missing or has an absorption curve shifted to shorter wavelengths relative to the normal L-cone photopigment, and (b) deutans, whose M-cone photopigment is either missing or has an absorption curve shifted to longer wavelengths. People with a congenital blue-yellow defect are called tritans and have S-cones with a nonfunctional or abnormal photopigment. Tritan defects have an autosomal dominant pattern of inheritance but with variable penetrance (individuals with the same genotype will have variable degrees of severity) (Wissinger and Sharpe 1998; Neitz and Neitz 2000).

In addition, within each group we can have dichromats or anomalous trichromats. That is, there are people who have only two classes of functioning cones (dichromats) and people who do have the three cone types but do not see the world as color normals do: the anomalous trichromats. It should be emphasized that dichromats have the same number of cones as color normals. Dichromatism is the most severe form of the congenital defects; the tritanope, for example, appears to have a nonfunctional S-cone and behaves as though he or she had only M- and L-cones in the retina. Anomalous trichromats comprise the majority of red-green color defectives. When asked to make a color match, anomalous trichromats use three primaries (just like color normals); however, the proportions of the primaries in the mixture are outside the normal range. The majority of anomalous trichromats also have reduced color discrimination (He and Shevell 1995).

Just like dichromats, anomalous trichromats can be classified into two categories: (1) deuteranomalous, or "green weak," and (2) protanomalous, or "red weak." People who are deuteranomalous require more of the green primary when mixing it with a red to form a standard yellow. These individuals have three distinct cone classes, but the M-cone photopigment has an absorption function that is shifted to longer wavelengths relative to the M-cones found in color normals. It has been speculated that this photopigment could actually be a hybrid L-cone photopigment that has an


Table 1 Classification of congenital color deficiencies (Adapted from Birch (2001))

Number of cone pigments | Type | Denomination | Hue discrimination
None | Monochromat | Rod monochromat | None
One | Monochromat | Atypical, incomplete (cone) monochromat | Limited ability in mesopic viewing conditions
Two | Dichromat | (a) Protanope, (b) Deuteranope, (c) Tritanope | Severely impaired
Three | Anomalous trichromat | (a) Protanomalous, (b) Deuteranomalous, (c) Tritanomalous | Continuous range from slight to severe impairment

absorption spectrum shifted to a shorter wavelength compared with the other L-cone pigment in the retina (Neitz and Neitz 2000; He and Shevell 1995; Shevell et al. 1998). In contrast, protanomalous individuals require more red when making a standard yellow match. Similar to protanopes, they have a decreased sensitivity to red. The L-cone photopigment in these individuals has an absorption curve that is shifted to shorter wavelengths relative to the color normal L-cone absorption. Analogous to the deuteranomalous case, the anomalous pigment in protanomaly is actually a hybrid M-cone photopigment that absorbs light at slightly longer wavelengths than the other M-cone pigment (Neitz and Neitz 2000; He and Shevell 1995; Shevell et al. 1998). The classification of congenital color vision deficiencies is given in Table 1.

Color Vision Deficiencies and the CIE Diagram
Color discrimination by color-variant observers can be analyzed easily using a CIE chromaticity diagram (see chapter "▶ The CIE System"). The "color confusion lines" are shown in Figs. 1 and 2 for protanopes and deuteranopes; the tritanopic color confusion lines are shown in Fig. 3. These lines are based on extensive psychophysical color matching and hue discrimination experiments (see, e.g., Wyszecki and Stiles (1982)). The orientation and spacing of the color confusion lines represent the averages of data obtained from several dichromats using a 2° field of view and moderate exposure durations. If the field size or the duration differs from that used by Pitt (1935), from whose data the figures are derived, individual dichromat performance might vary from predictions based on the confusion lines.

Colors lying on the same line will appear identical if their luminances are equal. Each line in the figures corresponds to the colors that require the same ratio of the two primaries that the observer uses to match colors. Different lines represent different ratios, and therefore colors on different lines should appear different even when they are equal in luminance. The distance between any two lines represents a just noticeable difference (JND) in color, and the area between them is referred to as a zone of confusion, because any colors that fall within a given zone will appear identical to a person with a dichromatic defect. For example, the red, green, and yellow colors all fall near the same solid line, and these colors will appear identical to the protanope if they are equal in luminance. The number of confusion lines indicates that protanopes and deuteranopes can distinguish only 21 and 31 distinct wavelengths, respectively, whereas color normals can distinguish 150 distinct wavelengths (Pitt 1935). Protans and deutans will have difficulty discriminating between greens, yellows, oranges, and reds. Tritans will have difficulty discriminating between gray and white, gray and yellow, gray and green, green and dark green, and blue and blue-green; they can distinguish only 44 distinct wavelengths.

The color discrimination performance of anomalous trichromats falls in between that of color normals and dichromats. Their confusion zones appear as a series of ellipses (Fig. 4), with the major axes of


Fig. 1 Color confusion lines for a protan, based on Pitt's data and plotted on the 1931 CIE chromaticity diagram

each ellipse lying along the corresponding dichromatic confusion line but shorter, as they do not include the complete range of confusions seen in dichromats. It should be noted that the length of the major axis of each ellipse varies with the severity of the defect (Birch-Cox 1974).
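One way the confusion lines can be used computationally is via the "copunctal" point: all of a given dichromat's confusion lines pass through a single point on the CIE 1931 (x, y) diagram, so two equiluminant colors are predicted to be confused when they lie on the same line through that point. In the Python sketch below, the copunctal coordinates are commonly quoted approximate values, and the helper functions and angular tolerance are illustrative choices, not part of any standard.

```python
# A minimal sketch of a copunctal-point confusion test on the CIE 1931
# chromaticity diagram. Copunctal coordinates are approximate values.
import math

COPUNCTAL = {
    "protan": (0.747, 0.253),
    "deutan": (1.400, -0.400),
    "tritan": (0.175, 0.000),
}

def confusion_direction(xy, deficiency):
    """Direction (radians) of the confusion line through chromaticity xy."""
    cx, cy = COPUNCTAL[deficiency]
    return math.atan2(xy[1] - cy, xy[0] - cx)

def likely_confused(xy1, xy2, deficiency, tol=0.01):
    """True if two chromaticities lie on (nearly) the same confusion line."""
    d1 = confusion_direction(xy1, deficiency)
    d2 = confusion_direction(xy2, deficiency)
    return abs(d1 - d2) < tol

# A reddish and a greenish chromaticity that a protanope is predicted to
# confuse, provided the two stimuli are equal in luminance:
print(likely_confused((0.60, 0.35), (0.30, 0.55), "protan"))  # True
```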

Acquired Color Vision Diseases
Injury or disease of the eye or the neural pathways can result in color vision changes. These are called acquired color vision defects, and Kollner (1912) proposed one of the first classifications of acquired color vision deficiency according to the location of the pathology. Kollner's law, as it is known, says that blue deficiencies develop from diseases of the retina, while red-green deficiencies develop from diseases of the pathways from the inner layers of the retina to the visual cortex. Pokorny and Smith (Pokorny et al. 1979) state that this rule is valid even though there are occasional contradictions (Schnek and Hagerstrom-Portnoy 2002). In general, acquired color vision deficiencies can occur in diseases such as cone degeneration, optic nerve disorders, vascular disorders, and glaucoma. The most common form of acquired color vision defect occurs as a natural part of aging: with age, the crystalline lens of the eye absorbs progressively more light at the shorter wavelengths, and there is therefore a loss of color discrimination.


Fig. 2 Color confusion lines for a deutan, based on Pitt's data and plotted on the 1931 CIE chromaticity diagram

Tests of Color Vision Deficiencies
There are many techniques for detecting color vision deficiencies in the clinic and in the laboratory. The "gold standard" for color vision testing is the anomaloscope, which uses the Rayleigh match procedure for diagnosing the four major X-linked color defects: the observer adjusts the relative energies of 670 nm and 546 nm primaries to make an additive match to a 589 nm reference (see chapter "▶ The CIE System").

Most color vision testing is based on a knowledge of the color confusion lines. The most common instruments are test books/plates, in which the colors of the figures and background are chosen to lie close to the confusion lines. The most common test is the Ishihara color test plate (see Fig. 5a, b); a similar test is the Hardy-Rand-Rittler color test plates. Clinically, the most commonly used test is the Farnsworth D-15 test (and its desaturated versions), whose test colors are the so-called Munsell colors. The Farnsworth Munsell (FM) 100-hue test is the most complicated of the lot, consisting of 85 caps drawn from 100 possible hues of the Munsell color ordering system (Fig. 6). There are also monitor-based color tests such as the Cambridge color test and the City University color test; however, caution should be used, since the monitors have to be calibrated in order to obtain reliable results. The reader is referred to the book by Birch (2001) for detailed discussions.
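The Rayleigh match described above lends itself to a small worked example. In the Python sketch below, the "cone fundamentals" are toy Gaussians (the peak wavelengths and bandwidth are our own assumptions; real fundamentals, such as Stockman and Sharpe (2000), would be substituted in any serious calculation). With S-cones responding negligibly at these wavelengths, the match reduces to two linear equations: the L- and M-cone excitations of the 670 nm plus 546 nm mixture must equal those of the 589 nm reference.

```python
# A toy Rayleigh-match calculation with made-up Gaussian cone curves.
import numpy as np

def cone(peak, wavelength, width=45.0):
    """Toy Gaussian 'cone fundamental' (illustrative, not measured)."""
    return np.exp(-0.5 * ((wavelength - peak) / width) ** 2)

wl_red, wl_green, wl_ref = 670.0, 546.0, 589.0
L_peak, M_peak = 570.0, 540.0   # shifting M_peak would mimic an anomalous pigment

# Solve: amounts of red and green primaries whose combined L- and M-cone
# excitations equal those of the yellow reference.
A = np.array([[cone(L_peak, wl_red), cone(L_peak, wl_green)],
              [cone(M_peak, wl_red), cone(M_peak, wl_green)]])
b = np.array([cone(L_peak, wl_ref), cone(M_peak, wl_ref)])
red_amt, green_amt = np.linalg.solve(A, b)

print(f"red:green ratio at match = {red_amt / green_amt:.2f}")
```

An anomalous observer, modeled here by moving one pigment's peak, would settle on a ratio outside the normal range, which is exactly what the anomaloscope measures.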


Fig. 3 Color confusion lines for a tritan, based on Pitt's data and plotted on the 1931 CIE chromaticity diagram

In terms of video displays, it is essential that, for color-deficient observers, color be varied in terms of intensity and saturation and not in terms of hue. Distinctions based on a single color should be avoided in such displays. Color contrast can be achieved by varying both chromaticity and luminance, and varying luminance alone can also benefit the color-deficient person. Finally, Kovacs et al. (2000) have designed a filter that can be used along with a video display to compensate for the color deficiency. In order to use this method, a thorough knowledge of the exact shape of the response functions of the abnormal and anomalous cones is necessary.
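One concrete way to act on this advice is to check that color pairs used in a display differ in luminance and not just in hue. The Python sketch below uses the sRGB relative-luminance and contrast-ratio formulas from the WCAG web-accessibility guidelines; the example colors are arbitrary.

```python
# Luminance-contrast check for display color pairs (WCAG formulas).
def relative_luminance(rgb):
    """Relative luminance of an 8-bit sRGB color."""
    def linearize(c):
        c /= 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(rgb1, rgb2):
    y1, y2 = sorted((relative_luminance(rgb1), relative_luminance(rgb2)),
                    reverse=True)
    return (y1 + 0.05) / (y2 + 0.05)

# Pure red vs. pure green differ mainly in hue - exactly the axis lost in
# red-green deficiency - and their luminance contrast is modest:
print(contrast_ratio((255, 0, 0), (0, 255, 0)))   # ~2.9
# Darkening the red adds a luminance cue that survives the deficiency:
print(contrast_ratio((128, 0, 0), (0, 255, 0)))   # ~8.0
```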


Fig. 4 Color confusions obtained for deuteranomalous observers. The length of the major axis indicates the severity. The gray lines are the deuteranope’s color confusion lines


Fig. 5 (a) Pseudoisochromatic plates: Ishihara plates. (b) Appearance of pseudoisochromatic plates to normal and color-deficient observers

Fig. 6 The Farnsworth Munsell (FM) 100-hue test


Summary
In this chapter, we have described the characteristics, epidemiology, and genetics of common color vision deficiencies. The effects of these deficiencies, as analyzed on a CIE diagram, were also presented. In addition to genetic color vision defects, it is also possible to acquire color vision deficiencies as a result of disease. Finally, we briefly described some color vision tests.

Further Reading
Birch J (2001) Diagnosis of defective color vision, 2nd edn. Butterworth-Heinemann, Oxford
Birch-Cox J (1974) Isochromatic lines and the design of color vision tests. Mod Probl Ophthalmol 13:8–13
Delpro WT, O'Neill H, Casson E, Hovis J (2005) Aviation relevant epidemiology of color vision deficiency. Aviat Space Environ Med 76:127–133
Gegenfurtner K, Sharpe LT (1999) Color vision. Cambridge University Press, Cambridge
He JC, Shevell SK (1995) Variation in color matching and discrimination among deuteranomalous trichromats: theoretical implications of small differences in photopigments. Vis Res 35:2579–2588
Kaiser PK, Boynton RM (1996) Human color vision, 2nd edn. Optical Society of America, Washington, DC
Kollner H (1912) Die Störungen des Farbensinnes, ihre klinische Bedeutung und ihre Diagnose. Karger, Berlin
Kovacs G, Abraham G, Kucsera I, Wenzel K (2000) Improving color vision for color deficient patients on video displays. In: Lakshminarayanan V (ed) Vision science and its applications, vol 35, Trends in optics and photonics. Optical Society of America, Washington, DC, pp 333–337
Neitz M, Neitz J (2000) Molecular genetics of color vision and color vision defects. Arch Ophthalmol 118:691–700
Norton TT, Corliss DA, Bailey JE (2002) The psychophysical measurement of visual function. Butterworth-Heinemann, Woburn, pp 217–288, Chap 8
Pitt FHG (1935) Characteristics of dichromatic vision. Report of the Committee on the Physiology of Vision XIV, Special report series. Medical Research Council of Britain, His Majesty's Stationery Office, London
Pokorny J, Smith VC, Verriest G, Pinckers AJLJ (1979) Congenital and acquired color vision defects. Grune and Stratton, New York
Schnek ME, Hagerstrom-Portnoy G (2002) Color vision defect type and spatial vision in the optic neuritis trial. Invest Ophthalmol Vis Sci 42:29–39
Sharpe LT, Stockman A, Jägle H, Nathans J (1999) Opsin genes, cone photopigments, color vision and color blindness. In: Gegenfurtner KR, Sharpe LT (eds) Color vision, from genes to perception. Cambridge University Press, Cambridge, pp 3–51
Shevell SK (2003) The science of color, 2nd edn. Elsevier, Oxford
Shevell SK, He JC, Kainz P, Neitz J, Neitz M (1998) Relating color discrimination to photopigment genes in deutan observers. Vis Res 38:3371–3376
Valberg A (2005) Light vision and color. Wiley, New York
Wissinger B, Sharpe LT (1998) New aspects of an old theme: the genetic basis of human color vision. Am J Hum Genet 63:1257–1262
Wyszecki G, Stiles WS (1982) Color science, 2nd edn. Wiley, New York


Displays in the Workplace
Sarah Sharples

Contents Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Challenge 1: Continuing to Consider “Traditional” Personal Computing . . . . . . . . . . . . . . . . . . . . . 3 Challenge 2: The Changing Nature of Displays in the Workplace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Challenge 3: The Diverse Context of Use of Displays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Challenge 4: Predicting How Displays Will Be Used in the Future . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Challenge 5: Displays as Part of an Interactive System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

Abstract

The way in which we use displays in the workplace is changing, with increasing diversity in display type, tasks, and context of use. This chapter outlines five challenges that we need to consider when designing and implementing displays in the workplace: (1) the need to continue to consider “traditional” personal computing; (2) the changing nature of displays in the workplace; (3) the diverse context of use of displays; (4) the need to predict how displays will be used in the future; and (5) considering displays as part of an interactive system. The chapter presents a set of display/human factor considerations that should be remembered when designing and evaluating displays in the workplace.

S. Sharples (*) Human Factors Research Group, University of Nottingham, Nottingham, UK e-mail: [email protected] # Springer-Verlag Berlin Heidelberg 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_17-2


Introduction
We typically spend between 30 and 45 h per week at work, in addition to time spent traveling to and from our workplace. The displays in our workplace are therefore probably the ones that we encounter and interact with more than any others. However, the nature of work, work environments, and the displays we use are continually changing. Developments such as wireless and mobile communication technologies, large screen displays, and portable computing mean that, where 10 years ago most work users had a single dedicated workplace with one cathode ray tube computer screen, the displays with which we interact on a daily basis in conducting our work now come in a wide range of types and forms. This chapter considers the changing nature of displays in the workplace. It then presents five key challenges that designers of displays in the workplace must address if future workplaces are to be as safe, effective, usable, and comfortable as possible.

The range and type of displays that we might encounter during a typical day, along with the tasks that we complete using them, vary considerably. Table 1 presents a selection of these examples and demonstrates that we encounter a wide diversity of display size and form; uses of displays including preparation of documents, analysis, review, or monitoring; and a range of roles of displays and associated technologies in communication and collaboration.

Table 1 A selection of typical displays and tasks encountered by everyday users

Example display | Example tasks
Smartphone/mobile device | E-mail, phone call, application, web page viewing
Stand-alone desktop LCD/TFT display | E-mail, word processing, CAD, shared viewing of collaborative work
Large screen projection or flat screen display | Lecture/presentation, entertainment, shared decision-making viewing (2D or 3D)
In-car technologies | Satellite navigation, control of entertainment/communications system
Handheld device | Signature for online shopping, collection of inventory/asset management data
Integrated device display (e.g., laptop) | E-mail, graphics/word processing, video viewing

Perry and O'Hara (2003) identified three key requirements for display-based activity in the workplace:
1. Ready access to information
2. Social orientation
3. Coordination and planning

This chapter considers these three requirements in the context of different types of workplace tasks and activities. If we are to design display technologies that effectively consider human factor requirements, we need to ensure that any


assessment methods or design techniques applied are appropriate to the different types of work display use that we may encounter now and in the future. The following sections of the chapter articulate five key challenges that we need to take into account in designing workplace displays. The conclusion then draws upon these challenges to identify the priorities for consideration in display design for the display user, task, and environment as well as the display technology itself.

Challenge 1: Continuing to Consider "Traditional" Personal Computing
It would be tempting to consider that, as we embrace a wide range of novel technologies, we can move our focus away from the standard office computer. However, the vast majority of workers still have either a dedicated space or "hot desk" setup containing the typical components of a screen, keyboard, mouse, and processor. Call-center work is prevalent: in 2007 it was estimated that there were 647,000 people in the United Kingdom working in call centers (BBC news website, February 14, 2007). In a report for the UK Health and Safety Executive, Sprigg, Smith, and Jackson (2003) found that workers spent on average 2 h 30 min continually using a display screen without a break as part of a typical shift pattern (7–9 h shifts); for the majority of respondents, over 75 % of the day was spent using display screen equipment. Of the 1,141 call-center employees surveyed, 53 % used hot desks, and 42 % reported that they were moderately or very dissatisfied with glare or reflection on their display screen.

Pheasant and Haslegrave (2006, p. 161) specify the typical office workstation as containing a desk, chair, and computer at which a user will undertake paper- and screen-based tasks. They note that the "paperless" office has not become as prevalent as was predicted in the 1980s and remind us that in many ways the standard desk, chair, and computer setup has become even more prevalent, with screen-based displays replacing physical or manual control panels in many contexts. In addition, many of us, either formally or informally, work at home on desktop computers. The setup of a traditional workstation may vary between individuals; individuals may differ, for example, in their preferred viewing distance (Jackinsi et al. 1999) and screen height (Jackinsi and Heuer 2004).

In addition, many workers may use a laptop or notebook for a significant part of their working day. Typically, the height of a laptop screen will be lower than that of a separate screen in a fixed desk setup, primarily because of the contexts in which the laptop is used (e.g., train, home living room, airport lounge). Pheasant and Haslegrave also note that a laptop screen is typically legible only when viewed from a narrow range of angles. This property, anisotropy, is encountered when luminance deviates by more than 10 % depending on target location or viewing angle (Oetjen and Ziefle 2009; International Organisation for Standardization 2001), and studies have demonstrated that performance deteriorates at viewing angles between 10° and 50° off-axis (Oetjen et al. 2005; Oetjen and Ziefle 2007).
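The 10 % anisotropy criterion lends itself to a direct check. The Python sketch below is a minimal illustration under one reading of the criterion (deviation from the mean of the measured luminances); the measurement values are invented for the example, and a real assessment would follow the measurement procedure of the relevant standard.

```python
# A minimal sketch of an anisotropy check (after the 10 % criterion
# described above). Luminance values are invented placeholders.
def is_anisotropic(luminances, threshold=0.10):
    """True if any luminance deviates more than `threshold` from the mean."""
    mean = sum(luminances) / len(luminances)
    return any(abs(lum - mean) / mean > threshold for lum in luminances)

# Luminance (cd/m^2) of the same target viewed 0-50 degrees off-axis:
print(is_anisotropic([200, 196, 183, 160, 131, 98]))  # True
```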


While Heasman et al. (2000) found no major difference between the discomfort experienced by users of standard and portable PC setups, it remains important to acknowledge that these types of displays are likely to be the primary means of viewing and interacting with information for most office-based workers. Although regulations and advice regarding the setup of the traditional personal computer have been available for several decades, users evidently still experience discomfort or dissatisfaction when using such displays.

Challenge 2: The Changing Nature of Displays in the Workplace
For over a decade, researchers have been debating the way in which our workplaces will change in nature and form (e.g., Tanaka 2002; Dalton et al. 1998). The human desire for collaboration and communication is well acknowledged. Ironically, the increased prevalence of open-plan offices does not necessarily encourage collaboration, because of the impact of noise disturbance; as a consequence, open-plan workplaces can tend to be quiet and not conducive to collaboration with physically colocated colleagues. This fits the category of social orientation proposed by Perry and O'Hara and potentially increases the likelihood of displays being used to support either text-based communication (both e-mail and instant messaging) or voice-based systems such as Skype.

Displays are getting both larger and smaller in a number of contexts. For example, in rail control, Wilson and Morrisroe (2005) note that traditional hardwired panel displays are increasingly being replaced by banks of small screens, usually in combination with larger overview displays. Figure 1 shows an example of a hardwired display comprising fixed static elements printed on the panel together with a combination of screens and LED indicators. Figure 2 shows an example of the type of control environment now more typically seen, taken from a Finnish power control environment. Within this environment there is a combination of large screen, small screen, and, of course, paper displays. This combination presents challenges in terms of viewing distances and lighting levels as well as consistency of interaction method and information presentation: for example, the lighting level requirements for a light-emitting or projected display will typically be lower than those ideal for a paper display, partly because the luminance of the display itself contributes to the overall light levels. A familiar situation when viewing a presentation in a meeting or conference is that the lighting level at which the display is most clearly viewed is lower than that needed for making personal notes on paper.

As well as large screen displays, we are more frequently using small screens, both as a primary work tool and as a communication tool used in our work and home environments. Asset management has been revolutionized by the use of handheld technologies: we see handheld displays being used by couriers and those delivering goods to our work and home, and increasingly handheld technologies are being used for quality monitoring and recording in a manufacturing


Fig. 1 Example of hardwired panel display in rail

Fig. 2 Example of typical multiple monitor display with overview screen (Image: Fingrid Oyj/Juhani Eskelinen)

environment as well as in the context of engineering and maintenance. The challenge of presenting information on small screens has been acknowledged (Jones and Marsden 2006; Weiss 2002); yet we maintain the desire to view increasingly large and complex data sets on increasingly small screens. These complex and


Fig. 3 Mixed reality architecture display (Schnädelbach et al. 2007)

detailed small screen displays present ergonomic challenges regarding viewing distance, brightness, and resolution. The text or graphics on such screens are often small, and the postures adopted when viewing such technologies may be less comfortable and controlled than at a dedicated workstation; however, this may not be of concern if such small devices are used for only short periods of the day. It is therefore important for those involved in the design and development of such devices to maintain an understanding of how the displays are actually being used and, if they do become primary tools for work, to ensure that good work design guidelines are implemented.

Displays are also getting larger, and we are increasingly using displays to support collaboration. Large shared screens facilitate colocated collaboration, and technologies such as Internet-mediated video messaging and dedicated collaborative meeting applications allow us to conduct "face-to-face" meetings (distributed collaboration) via technology. One such system is the Mixed Reality Architecture (see Fig. 3). This is an "always-on" technology that supports collaboration between multiple users via a display that presents a different view of the shared environment depending on the position of the user's "pod" within the virtual space. However, the use of a display for collaboration does not automatically imply a large screen; we may use technologies such as instant messaging via a standard PC or text messaging (SMS) via a mobile device.

In addition, we are increasingly taking advantage of 3D displays. VR technologies have increased their capacity for high-resolution and real-time rendering, and such technologies are now being used in a range of applications including the medical, automotive, and design fields. These display types present additional


challenges relating to visual strain, sickness (Sharples et al. 2007), and comfort (see ▶ 3D Displays, Sect. 9). Finally, our notion of the workplace has extended. Mobile technologies extend communications to allow work to be completed at home and while traveling; the concept of the "office" as our normal or regular space where we complete work may be limiting.

Challenge 3: The Diverse Context of Use of Displays
Displays in the workplace are not always used as originally predicted, intended, or designed. An obvious example is found in control contexts where multiple standard CRT/flat-screen displays are used by a single operator. The use of multiple displays means that the standard guidance regarding font size, viewing angle, and viewing distance is not directly transferable: a viewer of multiple displays will typically be seated further away from an individual screen and will not be viewing it from directly in front. As mentioned before, the context of use of laptops is also unpredictable, and the impact of this on user comfort, display angle, and brightness may likewise vary. In addition, users may adjust settings on portable computers for reasons other than ease of use; for example, a user may reduce the brightness of a portable display to preserve battery life, regardless of the impact this may have on viewing comfort.

Oetjen and Ziefle (2009) note that, while LCD/TFT screens such as those in laptops are typically intended for a single user, they are often used in collaborative contexts, such as radiology, patient monitoring, and schools. An alternative scenario is the familiar sight of a number of people gathered around a small screen, perhaps looking over the main "workspace" user's shoulder. With flat-screen displays in particular, this can lead to difficulties, as the angle in front of the screen from which the data can be seen clearly may be limited. As new technologies such as e-paper emerge, it is critical that we apply methods that enable us to understand and predict not only the type of display that will be used but also the context in which that display will be implemented.

In the EU-funded project VIEW of the Future, a scenario was envisaged in which multiple users interact with the same display via a number of input devices, including handheld screens with displays of their own. Figure 4 shows an example of one such envisionment: one user, either colocated or in a different location, can interact with the system via a handheld mobile device, while other physically present users can use devices such as 3D joysticks to move the viewpoint or interact with the object (e.g., opening and closing car doors, changing the color of the car). This illustrates that the context of display use can vary even between multiple users of a single system.


Fig. 4 Envisionment of future collaboration in visualization of prototypes (from VIEW of the Future)

Challenge 4: Predicting How Displays Will Be Used in the Future
Displays have changed considerably in the past 20 years. We have moved to both larger and smaller displays; the technologies that allow flat screens will in the future extend into e-paper, among other display types. Projected image quality has improved, and the weight, size, and lag of 3D headsets/glasses have decreased, accompanied by an increase in resolution. However, if we are to design effective and useful displays, we need to anticipate the workspace of the future. In 1998, Dalton et al. (1998) proposed the concept of a SmartSpace, a personal workspace that combined information visualization, manipulation, and exchange; high-bandwidth connectivity; and a large, curved "immersive" screen with a flat touch screen and sophisticated sound (Fig. 5). While this concept has not manifested in the exact form proposed, elements of it can be seen in the Apple iPad and other high-spec displays.

The EU project SATIN (Sharples et al. 2008) identified a potential future application of technologies in multimodal interfaces for the visualization of virtual prototypes. This interface (Fig. 6) combined visual, haptic, and auditory information to convey the shape and form of an object at the design stage. The user sees the visual representation of a virtual object while the haptic strip (on the left in the figure) takes the form of the highlighted section of the visual object, and the accompanying sound varies in pitch to represent the curvature value of any point on the curve being explored. This prototype is an illustration of the way in which future technologies may facilitate the integration of multiple modalities of


Fig. 5 Demonstration of the SmartSpace concept (image provided with permission from BT (A. Gower))

Fig. 6 Sound and Tangible Interface for Novel Product Design (SATIN) prototype

display; while the development of visual display technology is of course critical, we also need to understand how the visual elements complement or contradict the information presented by other modalities. It is critical that future technologies are anticipated so as to support good ergonomic design. While this is difficult, some methods are available to support it: for example, a road map exercise was conducted as part of an activity to anticipate the types of technology that might be used in future workplaces to support design-related activities.


Challenge 5: Displays as Part of an Interactive System
The final challenge facing the design of current and future displays in the workplace is that displays are increasingly not separate devices or monitors but inherently integrated into the device itself. We have long moved past the assumption that the display is a separate device such as a monitor; arguably, the only situations in which the display is still a separate device, apart from on a desktop PC, are shared projected or large screen displays. For a laptop or similar device, the display is part of the same physical device as the keyboard, and increasingly the display essentially is the device: as Weiss (2002) pointed out, many phone/personal digital assistant combination devices do not have a separate keyboard and instead include an on-screen keyboard operated by touch technologies using a stylus or finger-based system (see chapter "▶ Introduction to Touchscreen Technologies"). The implication is that, even when the device is separate, it is increasingly only meaningful to evaluate the impact, effectiveness, usability, and appeal of a display in combination with the other peripheral devices or interaction metaphors used.

Work completed as part of the project "VIEW of the Future" (see Fig. 7 for an image of a sample interaction device, menu, and display) demonstrated that it was not appropriate to evaluate the display and its contents alone: perceived usability and performance with the display resulted from an interaction between the device and menu designs. In this example, a device creates a wand effect, a "laser beam" that is used to select part of the display with the required level of accuracy. Current smartphones also feature integrated displays and interaction devices, and increasingly work is focusing on the use of gestures and movement to enhance interaction; for example, you may rotate your screen through 90° to select either a portrait or landscape view. It is likely that this integration of displays and interaction mechanisms will increase as innovation continues.

Conclusion
This chapter has outlined five key challenges facing the design of displays in the workplace:

1. Despite the increasing variety in display types, we must not forget to continue to develop the quality and form of conventional computer monitors and displays.
2. The nature of displays in the workplace is changing and increasingly includes collaborative displays and multiple devices per individual user.


Fig. 7 Example of interaction metaphors designed in VIEW of the Future

3. We can no longer assume that a worker will have a single dedicated workplace. Many people transit between work, home, and mobile locations when completing their work, and the increasing mobility of devices means that displays are often used in a diverse set of contexts.
4. We need to keep ahead of technological developments, and it is important to develop techniques that allow us to envision the way in which devices and displays will be used in the future.
5. We can no longer assume that the display is a separate device. It may be physically integrated with other peripherals or interaction devices, or it may be integrated into the main device itself, as in a smartphone or PDA. We therefore need to design the display and interaction techniques as a whole, rather than considering the separate components in isolation.

It is useful to articulate these challenges in terms of characteristics associated with the user, task, technology, and context (see Table 2) to highlight the diversity of areas that should be considered when designing and evaluating displays in the workplace. Displays in the workplace are changing, diversifying, and developing. While display quality and power are continually improving, developers are building new device types and forms, and users are continually finding new work-related tasks and contexts of use in which to employ them. Device designers and evaluators must track these developments carefully and acknowledge the impact of device design and type on user use and interaction.


Table 2 Examples of influential factors affecting displays in the workplace and impact of factors on display/human factor considerations

Influential factor | Examples of impact on display/human factor consideration

User
  User experience | Extent to which display has been customized; ability of user to select appropriate display settings (e.g., scale, brightness)
  User visual characteristics | Eyesight (e.g., scale, resolution), color blindness
  User linguistic experience/ability | Type of text displayed (e.g., character vs. letter based), use of icons vs. text
  User preference | Color, content (e.g., text vs. icons) selected

Task
  Office-based tasks (e.g., word processing, e-mail) | Design of stand-alone peripheral display
  Short tasks (e.g., e-mail, SMS) | Use of mobile devices to complete typing and reading tasks
  Reading/individual viewing | Desire to sit/stand in a range of locations (e.g., while traveling, on sofa)
  Shared viewing | Size and resolution of large screen or projected display
  Viewing objects or images | Resolution and color display requirements for CAD, video, film, or text displays

Technology
  Conventional PC monitor | Continue to consider standard display screen equipment design considerations (e.g., height of screen, angle of screen)
  Laptop | Screen angle, battery use, resolution, contrast
  Mobile device | Environmental conditions (rain, brightness of ambient environment), use of display as interaction device
  E-paper | Requirements for lighting in ambient environment, resolution, contrast

Context
  Conventional office setting | Standard display screen equipment design considerations
  Mobile | Varying requirements depending on environmental conditions; postural impact of use of mobile display for extended periods of time
  Home setting | Use of displays in comfort; prolonged use of "occasional" work settings

Acknowledgments The author of this chapter is partially funded by Horizon Digital Economy Research, through the support of RCUK grant EP/G065802/1.

Further Reading
Dalton G et al (1998) The design of SmartSpace: a personal working environment. Pers Technol 2(1):37–42
Heasman T, Brooks A, Stewart T (2000) Health and safety of portable display screen equipment. HSE Books, Sudbury


International Organisation for Standardization (2001) ISO 13406-2: Ergonomic requirements for work with visual displays based on flat panels. Part 2: Ergonomic requirements for flat panel displays. ISO, Geneva
Jackinsi W, Heuer H (2004) Vision and eyes. In: Delleman NJ, Haslegrave CM, Chaffin DB (eds) Working postures and movements. CRC, Boca Raton, pp 73–86
Jackinsi W, Heuer H, Kylian H (1999) A procedure to determine the individually comfortable position of visual displays relative to the eyes. Ergonomics 42(4):535–549
Jones M, Marsden G (2006) Mobile interaction design. Wiley, Chichester
Oetjen S, Ziefle M (2007) The effects of LCDs' anisotropy on the visual performance of users of different ages. Hum Factors 49(4):619–627
Oetjen S, Ziefle M (2009) A visual ergonomic evaluation of different screen types and screen technologies with respect to discrimination performance. Appl Ergon 40:69–81
Oetjen S, Ziefle M, Groger T (2005) Work with visually suboptimal displays: in what ways is the visual performance influenced when CRT and TFT displays are compared? In: Proceedings of HCI International, vol 4: theories, models and processes in human computer interaction. Mira Digital Publishing, CD-ROM
Perry M, O'Hara K (2003) Display-based activity in the workplace. In: Proceedings of the human-computer interaction conference, INTERACT '03. IOS, Amsterdam
Pheasant S, Haslegrave CM (2006) Bodyspace: anthropometry, ergonomics and the design of work. CRC, London
Schnädelbach H, Penn A, Steadman P (2007) Mixed reality architecture: a dynamic architectural topology. In: Space syntax symposium. Technical University Istanbul, Istanbul
Sharples S, Stedmon AW, D'Cruz M, Patel H, Cobb S, Yates T, Saikayasit R, Wilson JR (2007) Human factors of virtual reality: where are we now? In: Pikaar RN, Koningsveld EAP, Settels PJM (eds) Meeting diversity in ergonomics. Elsevier, Amsterdam
Sharples S, Hollowood J, Lawson G, Pettitt M, Stedmon A, Cobb S, Coloso C, Bordegoni M (2008) Evaluation of a multimodal interaction design tool. In: Create 2008: proceedings of the conference on creative inventions, innovations and everyday designs in HCI, London
Sprigg CA, Smith PR, Jackson PR (2003) Psychosocial risk factors in call centres: an evaluation of work design and well-being. HSE Research Report
Tanaka R (2002) Future workplace design. Displays 23(1):41–48
Weiss S (2002) Handheld usability. Wiley, Chichester
Wilson JR, Morrisroe G (2005) Systems analysis and design. In: Wilson JR, Corlett EN (eds) Evaluation of human work, 3rd edn. Taylor & Francis, London


Display Screen Equipment: Standards and Regulation
Sarah Atkinson*
Department of Mechanical, Materials and Manufacturing Engineering, Human Factors Research Group, Institute for Occupational Ergonomics, University of Nottingham, University Park, Nottingham, UK
*Email: [email protected]

Abstract
The increasing amount of time spent in front of display screens, sometimes involving prolonged daily use, has led to a growing body of legislation and recommendations that encourage employers to identify and adopt design features and work practices which minimize any risk to computer users and others using display screen equipment (DSE). As technologies change, the risks and the measures needed to address them change too. This chapter outlines the existing international legal framework for DSE and the standards and guidance that exist and considers their application to both current and future technologies. An example of the implementation of DSE standards, regulations, and guidance is discussed.

Lists of Abbreviations
ANSI: American National Standards Institute
DSE: Display Screen Equipment
HFES: Human Factors and Ergonomics Society
ISO: International Organization for Standardization
VDU: Visual display unit

Introduction
Visual displays can be found in offices, in shops, on the factory floor, in laboratories, and in control rooms. A very high proportion of the workforce now uses a computer or computer-controlled equipment as an integral part of their work. Display screen equipment (DSE) is any work equipment which has a screen that displays information. The most prolific type is the computer, whether a desktop PC with a visual display unit (VDU) or a laptop with an integral screen. However, DSE can also refer to other types of equipment, including some of "nonstandard" design such as those found in control rooms for process monitoring or for communication and travel. This chapter will outline the legislation and standards relevant to DSE use.

International Standards
The International Organization for Standardization (ISO) is the largest developer and publisher of international standards, with members from over 150 countries.


Standards are developed under the auspices of technical committees and subcommittees, with technical development undertaken by working groups. The objective of a standard is to define clear and unambiguous provisions in order to facilitate international trade and communication. To achieve this objective, the standard shall be as complete as necessary; consistent, clear, and concise; and comprehensible to qualified persons who have not participated in its preparation. Such standards are regarded as advisory, but not mandatory.

BS EN ISO 9241: The Ergonomics of Human-System Interaction
ISO 9241 originated as a multipart standard entitled Ergonomic requirements for office work with visual display terminals. It has been an influential standard, referenced in many countries' regulations. The interest in ISO 9241 encouraged the standard subcommittees to broaden its scope, to incorporate other relevant standards, and to make it more usable; it is now entitled the Ergonomics of human-system interaction (ISO 9241 2001). The structure and overview of each part is shown in Table 1.

ANSI/HFES 100–2007: Human Factors Engineering of Computer Workstations
The American National Standards Institute approved ANSI/HFES 100–2007, Human Factors Engineering of Computer Workstations (ANSI/HFS 2007), as an American national standard. Developed by the Human Factors and Ergonomics Society (HFES), it provides specific guidance for the design and installation of computer workstations, including displays, input devices, and furniture, that will accommodate a wide variety of users. ANSI/HFES 100–2007 includes computer mice and other pointing devices in its inputs chapter, and the displays chapter has been expanded to cover color devices.

Other Relevant Standards

Medical electrical equipment – Medical image display systems – Part 1: Evaluation methods (IEC 62563–1:2010) (IEC 62563–1 2010) provides evaluation methods for testing medical image display systems. It is directed to practical tests that can be visually evaluated or measured using basic test equipment. IEC 62563–1:2010 applies to medical image display systems, which can display monochrome image information in the form of grayscale values on color and grayscale image display systems (e.g., cathode ray tube (CRT) monitors, flat panel displays, projection systems). This standard applies to medical image display systems used for diagnostic (interpretation of medical images toward rendering clinical diagnosis) or viewing (viewing medical images for medical purposes other than for providing a medical interpretation) purposes and therefore having specific requirements in terms of image quality. Head-mounted image display systems and image display systems used for confirming positioning and for operation of the system are not covered by this standard.

ISO 11064–5:2008, Ergonomic design of control centres Part 5: Displays and controls (ISO 11064–5 2008), is part of a seven-part standard under the general title Ergonomic design of control centres. This part includes principles for selection, design, and implementation of displays for control room operation and supervision.

BS EN 61772:2013, Nuclear power plants – Control rooms – Application of visual display units (VDUs) (BS EN 61772 2013), presents design requirements for the application of VDUs in new main control rooms of nuclear power plants. It is designed to assist the designer in specifying VDU applications, including displays on individual workstations and larger displays for group working or distant viewing. It covers the use of large-screen displays, provides recommendations on the use of color, and improves the coverage of backfit or upgrade applications, as well as presenting examples of good practice.

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_18-2 # Springer-Verlag Berlin Heidelberg 2015

Table 1 Overview of ISO 9241
• ISO 9241–1:1997/Amd 1:2001 – General introduction (supersedes ISO 9241–1:1993): Introduces the multipart standard ISO 9241 for the ergonomic requirements for the use of visual display terminals for office tasks and explains some of the basic underlying principles. It provides some guidance on how to use the standard and describes how conformance to parts of ISO 9241 should be reported.
• ISO 9241–11:1998 – Guidance on usability: Defines usability as "extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use" and provides guidance on how to address usability in design projects.
• ISO 9241–20:2009 – Accessibility guidelines for information/communication technology (ICT) equipment and services: A high-level overview standard covering both hardware and software. It covers the design and selection of equipment and services for people with a wide range of sensory, physical, and cognitive abilities, including those who are temporarily disabled and the elderly.
• ISO 9241–110:2006 – Dialogue principles (supersedes ISO 9241–10:1996): Sets out seven dialogue principles and gives examples. The dialogue should be suitable for the task (including the user's task and skill level); self-descriptive (it should be obvious what to do next); controllable (especially in pace and sequence); conform to user expectations (i.e., consistent); error tolerant and forgiving; suitable for individualization and customizable; and should support learning.
• ISO 9241–14:2000 – Menu dialogues: Recommends best practice for designing menus (pop-up, pull-down, and text-based menus). Topics include menu structure, navigation, option selection, and menu presentation (including placement and use of icons). One of the annexes contains a ten-page checklist for determining compliance with the standard.
• ISO 9241–15:1998 – Command dialogues: Provides recommendations for the ergonomic design of command languages used in user-computer dialogues. The recommendations cover command language structure and syntax, command representations, input and output considerations, and feedback and help. Part 15 is intended to be used by both designers and evaluators of command dialogues, but the focus is primarily toward the designer.


• ISO 9241–16:1999 – Direct manipulation dialogues: Provides recommendations for the ergonomic design of direct manipulation dialogues and includes the manipulation of objects and the design of metaphors, objects, and attributes. It covers those aspects of "graphical user interfaces" which are directly manipulated and not covered by other parts of ISO 9241. Part 16 is intended to be used by both designers and evaluators of command dialogues, but the focus is primarily toward the designer.
• ISO 9241–143:2012 – Forms (supersedes ISO 9241–17:2008): Provides requirements and recommendations for the design and evaluation of forms in which the user fills in, selects entries for, or modifies labeled fields on a "form" or dialogue box presented by the system. Guidance is provided on the selection and design of those user-interface elements relevant to forms.
• ISO 9241–151:2008 – Guidance on World Wide Web user interfaces: Sets out detailed design principles for designing usable web sites, covering high-level design decisions and design strategy, content design, navigation, and content presentation.
• ISO 9241–171:2008 – Guidance on software accessibility: Aimed at software designers; provides guidance on the design of software to achieve as high a level of accessibility as possible. Replaces the earlier Technical Specification ISO TS 16071:2003 and follows the same definition of accessibility, "usability of a product, service, environment or facility by people with the widest range of capabilities." Applies to all software, not just web interfaces.
• ISO 9241–300:2008 – Introduction to electronic visual display requirements (the ISO 9241–300 series supersedes ISO 9241 parts 3, 7, and 8): A very short (four-page) introduction to the ISO 9241–300 series which explains what the other parts contain.
• ISO 9241–302:2008 – Terminology for electronic visual displays: Definitions, terms, and equations that are used throughout the ISO 9241–300 series.
• ISO 9241–303:2011 – Requirements for electronic visual displays: Sets general image quality requirements for electronic visual displays as well as providing guidelines for electronic visual displays. The requirements are intended to apply to any kind of display technology.
• ISO 9241–304:2008 – User performance test methods for electronic visual displays: Unlike the other parts in the subseries, which focus on optical and electronic measurements, this part sets out methods which involve testing how people perform when using the display. The method can be used with any display technology.


• ISO 9241–305:2008 – Optical laboratory test methods for electronic visual displays: Defines optical test methods and expert observation techniques for evaluating a visual display against the requirements in ISO 9241–303. Very detailed instructions on taking display measurements.
• ISO 9241–306:2008 – Field assessment methods for electronic visual displays: Provides guidance on how to evaluate visual displays in real-life workplaces.
• ISO 9241–307:2008 – Analysis and compliance test methods for electronic visual displays: Supports ISO 9241–305 with very detailed instructions on assessing whether a display meets the ergonomics requirements set out in part 303.
• ISO/TR 9241–308:2008 – Surface-conduction electron-emitter displays (SED): Technical report on a new eco-friendly display technology called "surface-conduction electron-emitter displays" (SED).
• ISO/TR 9241–309:2008 – Organic light-emitting diode (OLED) displays: Technical report on another new display technology called "organic light-emitting diode" (OLED) displays, which are better for fast-moving images than LCDs.
• ISO 9241–12:1998 – Presentation of information: Contains specific recommendations for presenting and representing information on visual displays. It includes guidance on ways of representing complex information using alphanumeric and graphical/symbolic codes, screen layout, and design, as well as the use of windows.
• ISO 9241–400:2007 – Principles and requirements for physical input devices: Sets out the general ergonomics principles and requirements which should be taken into account when designing or selecting physical input devices.
• PD CEN ISO/TS 9241–411:2014 – Evaluation methods for the design of physical input devices: Presents methods for the laboratory analysis and comparison of input devices for interactive systems and provides the means for evaluating conformance with the requirements of ISO 9241–410. The target users of this part are manufacturers, product designers, and test organizations concerned with commercial input devices.
• ISO 9241–420:2011 – Selection of physical input devices: Gives guidance for selecting products on the basis of the relevant properties of the input devices. Also includes test and evaluation methods for use at the workplace level.
• ISO 9241–410:2008 + A1:2012 – Design criteria for physical input devices: Describes ergonomics characteristics for input devices, including keyboards, mice, pucks, joysticks, trackballs, touch pads, tablets, styli, and touch-sensitive screens. The standard is aimed at those who are designing such devices and is very detailed.


• ISO 9241–5:1998 – Workstation layout and postural requirements: Specifies the ergonomics requirements for a visual display terminal workplace which will allow the user to adopt a comfortable and efficient posture.
• ISO 9241–6:1998 – Guidance on the work environment: Specifies the ergonomics requirements for the visual display terminal working environment which will provide the user with comfortable, safe, and productive working conditions.
• ISO 9241–13:1999 – User guidance: Provides recommendations for the design and evaluation of user guidance attributes of software user interfaces, including prompts, feedback, status, online help, and error management.
• ISO 9241–2:1993 – Guidance on task requirements: Deals with the design of tasks and jobs involving work with visual display terminals. It provides guidance on how task requirements may be identified and specified within individual organizations and how task requirements can be incorporated into the system design and implementation process.


Case Study: Implementation of Health and Safety Law in the UK
In recent years, much of Britain's health and safety law has originated in Europe. Proposals from the European Commission may be agreed by member states, who are then responsible for making them part of their domestic law. Modern health and safety law in Britain, including much of that from Europe, is based on the principle of risk assessment.

The basis of British health and safety law is the Health and Safety at Work etc. Act 1974 (1974). The Act sets out the general duties which employers have toward employees and members of the public, and which employees have to themselves and to each other. These duties are qualified in the Act by the principle of "so far as is reasonably practicable." In other words, an employer does not have to take measures to avoid or reduce the risk if they are technically impossible, or if the time, effort, or cost of the measures would be grossly disproportionate to the risk.

The Management of Health and Safety at Work Regulations 1999 (the Management Regulations) (Management of health and safety at work 2000) generally make more explicit what employers are required to do to manage health and safety under the Health and Safety at Work etc. Act. Like the Act, they apply to every work activity.


The Health and Safety (Display Screen Equipment) Regulations
Britain has implemented the European Directive 90/270/EEC (Council Directive 90/270/EEC of 29 May 1990) by the Health and Safety (Display Screen Equipment) Regulations 1992. The Health and Safety (Display Screen Equipment) Regulations came into force on January 1, 1993, and have been subsequently updated with amendments (2002). The majority of the regulations apply to self-employed workers and those people working from home but paid by an employer, in addition to employees at company workplaces. The aim of this directive and regulation is to reduce the risks of ill health associated with display screen equipment (DSE) work, notably musculoskeletal disorders (MSDs), stress, and visual fatigue.

The Health and Safety Executive (HSE) published guidance on the regulations, notably booklet L26, Work with display screen equipment (2002) (HSE (Health and Safety Executive) 2003). This gives detailed and comprehensive guidance about work with display screen equipment covering both office work and other environments where display screen equipment (DSE) may be used. Further guidance is also available in publications by the HSE (HSG90) (2000) and INDG36 (2005).

The legislation consists of nine regulations and a schedule of "minimum requirements." The regulations cover aspects of the display screen workstation, including minimum specifications for the workstation, work equipment and accessories, the work environment, and work organization. With a few exceptions, the definition of DSE covers both conventional (cathode ray tube) display screens and other types such as liquid crystal or plasma displays used in flat panel screens, touch screens, and other emerging technologies. Display screens mainly used to display line drawings, graphs, charts, or computer-generated graphics are included, as are screens used in work with television or film pictures. The definition is not limited to typical office situations or computer screens but also covers, for example, nonelectronic display systems such as microfiche. DSE used in factories and other non-office workplaces is included, although in some situations, such as screens used for process control or closed-circuit television (CCTV), certain requirements may not apply.

The regulations place a number of obligations on employers as follows:
• Analyze workstations and assess health and safety risks
• Reduce any risks to the lowest reasonably practicable level
• Provide eye and eyesight tests for employees
• Provide training and information to employees
• Plan work to allow breaks and changes of work activity

The regulations apply to people who habitually use display screen equipment as a significant part of their normal work; these people are classed as "users." DSE users:
• Use display screen equipment for continuous or near-continuous spells of an hour or more at a time
• Use DSE in this way more or less daily
• Have to transfer information quickly to or from the screen
• Have limited control over the time spent working on DSE

The legal distinction made between users and nonusers of DSE reflects the nature of the potential risks in the work and the likely causative factors. The degree of exposure and the factors linked to the health problems reported are strongly influenced by the duration and frequency of periods spent working with DSE and by the intensity of the work. The DSE regulations also define a "DSE workstation" as an assembly comprising:


• Display screen equipment (whether provided with software determining the interface between the equipment and its operator or user, a keyboard, or any other input device)
• Any optional accessories to the display screen equipment
• Any disk drive, telephone, modem, printer, document holder, work chair, work desk, work surface, or other items peripheral to the display screen equipment
• The immediate work environment around the display screen equipment

The following are excluded from the specific requirements of the DSE regulations, although their users will still be covered by general health and safety legislation:
• Drivers' cabs or control cabs for vehicles or machinery
• Display screen equipment on board a means of transport
• Display screen equipment mainly intended for public operation
• Portable systems not in prolonged use
• Calculators, cash registers, or any equipment having a small data or measurement display required for direct use of the equipment
• Window typewriters

The demands of work undertaken using DSE may vary, but the basic unit of chair, desk, and computer is similar in these jobs (Pheasant and Haslegrave 2006). Increasingly, workers may use a laptop or notebook portable computer for a significant part of their working day. The way in which a laptop is used varies mainly with the environment in which it is used (e.g., train, home living room, home working (see Fig. 1)).

The DSE regulations apply to portable DSE in prolonged use, which can include laptop and handheld computers, personal digital assistant devices, and some portable communication devices. While there are no hard-and-fast rules on what constitutes "prolonged" use, portable equipment that is habitually in use by a DSE user for a significant part of his or her normal work is to be regarded as covered by the DSE regulations. While some of the specific minimum requirements in the schedule may not be applicable to portables in prolonged use, employers are required to ensure that such work is assessed and measures are taken to control risks.

Portable DSE, such as laptop and notebook computers, is subject to the DSE regulations if it is in prolonged use. Increasing numbers of people are using portable DSE as part of their work. While research suggests that some aspects of using portable DSE are no worse than using full-sized equipment (Heasman et al. 2000), that is not true of every aspect. The design of portable DSE can include features (such as smaller keyboards or a lack of keyboard/screen separation) which may make it more difficult to achieve a comfortable working posture. Portable DSE is also used in a wider range of environments, some of which may be poorly suited to DSE work.

Conclusion
ISO 9241 sets out clear guidance related to DSE design and use, and the Health and Safety (Display Screen Equipment) Regulations in the UK have attempted to address changes in technology, for example, the increased use of portable equipment. Neither international standards nor UK regulations explicitly consider the increasing use of in-car technology, handheld devices, and smartphones. Neither is able to consider the complexity of some environments, for example, where multiple displays are used, and the location of working, for example, working across multiple work sites.


Fig. 1 Nonstandard use of portable DSE for home working

When assessing risk or evaluating a workplace, this diversity will need to be considered in the development of new guidance or the revision of existing guidance. International standards are developed slowly, by consensus, using consultation and development processes. This can be a disadvantage in a field where technology is rapidly advancing; however, having consensus results in standards that represent good practice.

Further Reading
ANSI/HFS (2007) 100 American National Standards Institute, human factors engineering of computer workstations. Human Factors Society, Santa Monica
BS EN 61772 (2013) Nuclear power plants – control rooms – application of visual display units (VDUs). International Standards Organisation, Geneva
Council Directive 90/270/EEC of 29 May 1990 on the minimum safety and health requirements for work with display screen equipment (fifth individual Directive within the meaning of Article 16(1) of Directive 89/391/EEC) [Official Journal L 156 of 21.06.1990]
Heasman T, Brooks A, Stewart T (2000) Health and safety of portable display screen equipment. Health and Safety Executive, Sudbury
HSE (Health and Safety Executive) (2003) Work with display screen equipment – health and safety (display screen equipment) regulations 1992 as amended by the health and safety (miscellaneous amendments) regulations 2002. Guidance on regulations, 2nd edn. HSE L26. HMSO, London
HSG90 (2000) The law on VDUs: an easy guide: making sure your office complies with the health and safety (display screen equipment) regulations 1992 (as amended in 2002). HSE Books, London. ISBN 0 7176 2602 4
IEC 62563–1 (2010) Medical electrical equipment – medical image display systems – part 1: evaluation methods. International Standards Organisation, Geneva
ISO 11064–5 (2008) Ergonomic design of control centres part 5: displays and controls. International Standards Organisation, Geneva
ISO 9241 (2001) Ergonomics of human-system interaction. International Standards Organisation, Geneva
Leaflet INDG36(rev3) (2005) Working with VDUs. HSE Books, London. ISBN 978 0 7176 2222. www.hse.gov.uk/pubns/indg36.pdf


Management of health and safety at work (2000) Management of health and safety at work regulations 1999. Approved code of practice and guidance L21, 2nd edn. HSE Books, London. ISBN 0 7176 2488 9
Pheasant S, Haslegrave CM (2006) Bodyspace: anthropometry, ergonomics and the design of work. CRC Press, London
The Health and Safety at Work etc. Act (HSWA) (1974) HMSO, London


Light Emission and Photometry
Teresa Goodman*
National Physical Laboratory, Teddington, Middlesex, UK
*Email: [email protected]

Abstract
Displays, obviously, are intended to be seen. It is therefore important to characterize their optical performance using measurements that relate to their ability to stimulate a human visual response – so-called photometric measurements. In this chapter, we explore the various measures that are used in photometry, which provide an internationally agreed measurement framework for quantifying the visual effectiveness of displays (and all other sources of optical radiation).

List of Abbreviations
CIE: Commission International de l'Éclairage
SI: Système International d'Unités, the International System of Units

Introduction
Measurement is at the heart of our modern technological world, supporting trade, industry, and science and underpinning the regulatory framework that helps maintain and improve our quality of life, in areas as diverse as medicine, transportation, and sport and leisure. Unlike a large proportion of the millions of measurements made each day, those relating to things that we can "see" (i.e., which emit optical radiation in the visible portion of the electromagnetic spectrum) are not based only on physical parameters, but must also take account of the "visual effectiveness" of the radiation produced, or in other words the ability to stimulate a visual response and facilitate vision. The science of the measurement of light in terms of its ability to stimulate human vision is termed photometry. It is distinct from the purely physical measurement of optical radiation in terms of absolute power or energy at each wavelength, which is termed radiometry.

Spectral Luminous Efficiency for Human Vision
The sensitivity or spectral response of the human eye is not constant over the visible spectrum but varies with wavelength; the relative ability of optical radiation at different wavelengths to produce a visual response is termed its spectral luminous efficiency. However, the spectral response of the human eye varies not just with wavelength but also according to the level of illumination; the position in the visual field; the visual task being performed; with age; and even from one individual to another. In order to make photometric measurements on a consistent basis, it is therefore necessary to define the spectral luminous efficiency curve that is being used. To this end, two internationally agreed curves have been defined (Commission International de l'Éclairage 2004), which are accepted as representing the relative spectral luminous efficiency function of a typical human observer in light-adapted conditions (the "photopic" or V(λ) function) or in dark-adapted conditions (the "scotopic" or V′(λ) function). These have been adopted as part of the SI system (Organisation Intergouvernementale de la Convention du Mètre 2006) and are shown graphically in Fig. 1.

Fig. 1 The CIE photopic and scotopic luminous efficiency functions V(λ) and V′(λ)

When a scene is viewed in good lighting conditions (e.g., "normal" indoor lighting levels), the spectral response of the eye is not affected by the actual level of illumination. These are the conditions for photopic vision, in which the visual process is entirely governed by cone receptors. The photopic condition is the most common one, and almost all measurements are made using the photopic spectral luminous efficiency function, V(λ). Under very dim lighting conditions (e.g., outdoors under starlight), only the rods are sensitive, and the eye operates in quite a different mode. This is the region of scotopic vision and extends down to the visual threshold, which corresponds to an illuminance in the region of a few microlux. At illuminance levels between photopic and scotopic vision (e.g., twilight), the eye is said to operate in the mesopic range. Under these conditions, the eye is in a state somewhere between the stable photopic and scotopic states, with the precise spectral response characteristics depending on the actual level of illumination and the field of view. Because of this nonlinear behavior, it has proved extremely difficult to reach international agreement for visual response functions for the mesopic range, and although this situation has now been resolved for applications such as nighttime road lighting (Commission International de l'Éclairage 2010), debate continues for applications involving brightness evaluations (Commission International de l'Éclairage 1989).

Luminous Efficacy of Radiation
The photopic and scotopic spectral luminous efficiency functions describe the relative visual effectiveness of optical radiation at different wavelengths, but in order to determine absolute values for "light output," it is necessary to know the luminous effect produced for each watt of optical power entering the eye. This is termed the spectral luminous efficacy and is measured in lumen per watt (lm W⁻¹), where the lumen is the unit of luminous flux (see section "Radiant Flux, Spectral Radiant Flux, and Luminous Flux"). Through the definition of the candela, the SI system of units defines the spectral luminous efficacy at a frequency of 540 × 10¹² Hz (which in standard air corresponds to a wavelength of 555.016 nm) as being 683 lm W⁻¹. This applies regardless of the spectral luminous efficiency function that is used. At this wavelength, V(λ) has a value of 0.999998, whereas V′(λ) is 0.401750. The photopic or scotopic spectral efficacy at any other wavelength, denoted K(λ) and K′(λ), respectively, can be calculated as follows:

$$K(\lambda) = \frac{683 \times V(\lambda)}{0.999998} \quad \text{or} \quad K'(\lambda) = \frac{683 \times V'(\lambda)}{0.401750}$$

The photopic and scotopic spectral luminous efficacy curves are shown in Fig. 2; K(λ) peaks at 555 nm and has a maximum value of 683.002 lm W⁻¹, whereas K′(λ) peaks at 507 nm and has a maximum value of 1,700.06 lm W⁻¹ (these are usually rounded to 683 lm W⁻¹ and 1,700 lm W⁻¹, respectively).

Fig. 2 Spectral luminous efficacy functions for photopic vision, K(λ), and scotopic vision, K′(λ)
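As a quick arithmetic check of these peak values, the minimal Python sketch below evaluates the two relations at their peaks, where V(λ) and V′(λ) each reach 1.0 (at 555 nm and 507 nm, respectively):

```python
# Peak luminous efficacies from the relations above, using the normalization
# values quoted in the text (V and V' both peak at 1.0).
K_peak_photopic = 683 * 1.0 / 0.999998   # ~683.002 lm/W at 555 nm
K_peak_scotopic = 683 * 1.0 / 0.401750   # ~1700.06 lm/W at 507 nm
print(round(K_peak_photopic, 3), round(K_peak_scotopic, 2))
```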

Photometric and Radiometric Units
There are many different units used in radiometry and photometry, some of which are now purely historical but still cause considerable confusion. Those given here are as set down in the International Lighting Vocabulary (Commission International de l'Éclairage 2011); other units that may be encountered are summarized in the Tables given later (see section "Irradiance and Illuminance"). Each radiometric quantity has a photometric equivalent, obtained by weighting the radiometric values by the photopic spectral luminous efficiency function, V(λ). The same symbols are used for both types of quantities, with subscripts to distinguish between them (the subscript v denotes a photometric quantity while e or no subscript indicates a radiometric quantity). In the case of flux, for example, which is usually given the symbol Φ, the symbol for radiant flux is Φe (or just Φ) and that for luminous flux is Φv. Subscripts are also used to denote whether the radiometric quantity relates to a specific wavelength. For example, if Φ is to be referenced to a wavelength λ, then the symbol used becomes Φλ, meaning

$$\Phi_\lambda = \frac{d\Phi}{d\lambda}$$

We can also define the quantity Φλ over a range of wavelengths as though it were a function of wavelength: Φλ(λ).

Solid Angle
Many of the measurement quantities listed below refer to solid angle. The unit of solid angle is the steradian (sr). The solid angle is defined by a closed curve and a point in space. Its magnitude is the area of a closed curve projected onto a sphere of unit radius, as shown in Fig. 3. Equivalently, the solid angle can be defined as the quotient of the area of the projected curve onto a sphere of radius R and the radius squared. In other words, an area A at a distance R from a point subtends a solid angle Ω at that point given by

$$\Omega = \frac{A}{R^2}$$
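As a minimal illustration of this relation, the short Python sketch below computes the solid angle subtended by a small flat patch; the patch size and viewing distance are hypothetical example values:

```python
# Solid angle subtended by a flat patch viewed from a point, Omega = A / R^2.
# This simple form is appropriate when the patch is small compared with the distance.
def solid_angle(area_m2: float, distance_m: float) -> float:
    """Solid angle in steradians subtended by area A at distance R."""
    return area_m2 / distance_m ** 2

# Example: a 10 cm x 10 cm patch of a display seen from 2 m subtends 0.0025 sr
print(solid_angle(0.01, 2.0))
```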

Radiant Intensity, Luminous Intensity, and the SI Unit of Light

Figure 4 represents a point source S emitting radiant flux in various directions. The radiant intensity Ie of the source in any given direction is defined as the quotient of the radiant flux dΦe leaving the source and propagated in the element of solid angle dΩ containing the given direction, by the element of solid angle:

$$I_e = \frac{d\Phi_e}{d\Omega}$$

The unit of radiant intensity is the watt per steradian (W sr⁻¹), and the corresponding spectroradiometric unit is spectral radiant intensity, Ie,λ, usually measured in watts per steradian per nanometer (W sr⁻¹ nm⁻¹). The photometric quantity analogous to radiant intensity is termed luminous intensity Iv and the unit of measurement is the candela (cd). A luminous intensity of one candela is equivalent to a luminous flux of one lumen emitted within unit solid angle:

$$1\ \text{cd} = 1\ \text{lm sr}^{-1}$$

The candela is in fact the SI unit of light. Since 1979, this has been defined as follows:

Fig. 3 Definition of solid angle, Ω


Fig. 4 Radiant intensity

The candela is the luminous intensity, in a given direction, of a source that emits monochromatic radiation of frequency 540 × 10¹² Hz and that has a radiant intensity in that direction of 1/683 W per steradian.

In other words, this definition states that at a specific frequency of green light, a monochromatic radiant intensity of 1/683 W sr⁻¹ will produce a luminous intensity of 1 cd. Therefore, at this specific frequency of green light, 1 W of radiant flux is directly equal to 683 lm. The specific frequency of the radiation was chosen because it corresponds to a wavelength of 555 nm, the peak of the photopic spectral luminous efficiency curve V(λ). Thus, the definition of the candela clearly recognizes the fact that light is a form of energy by basing the candela directly upon the watt. Luminous intensity Iv can be calculated from the spectral radiant intensity, Ie,λ(λ), by weighting with the photopic spectral luminous efficiency function, V(λ):

$$I_v = 683 \int_0^{\infty} I_{e,\lambda}(\lambda)\, V(\lambda)\, d\lambda$$
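In practice this integral is evaluated numerically from tabulated data. The Python sketch below illustrates the pattern on a deliberately coarse 50 nm grid using rounded CIE V(λ) values; the flat test spectrum is hypothetical, and a real calculation would use the full 1 nm or 5 nm tables. The same weighting pattern yields luminous flux, illuminance, and luminance from their spectral counterparts in the sections that follow:

```python
# Rounded CIE photopic V(lambda) values on a coarse 50 nm grid (illustrative only)
V = {450: 0.038, 500: 0.323, 550: 0.995, 600: 0.631, 650: 0.107}

def luminous_intensity(spectral_intensity, step_nm=50):
    """Approximate I_v (cd) from spectral radiant intensity samples (W sr^-1 nm^-1)."""
    return 683 * sum(spectral_intensity[wl] * V[wl] * step_nm for wl in spectral_intensity)

# Hypothetical source with a flat spectral radiant intensity of 1e-4 W sr^-1 nm^-1
flat_source = {wl: 1e-4 for wl in V}
print(round(luminous_intensity(flat_source), 1))  # ~7.2 cd
```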

Radiant Flux, Spectral Radiant Flux, and Luminous Flux
Although the SI unit for photometry is the candela, the most fundamental measure of optical radiation is that of radiant flux, denoted Φe. This is simply the total power in watts (W) of optical radiation emitted, transmitted, or received. It may include visible, ultraviolet, and infrared radiation and can be measured for any stated solid angle. The spectroradiometric equivalent of radiant flux is spectral radiant flux, denoted Φe,λ(λ) or Φe,λ, and is usually expressed in watts per nanometer wavelength interval, W nm⁻¹. To find the total radiant flux Φe we integrate the spectral radiant flux Φe,λ(λ) over the entire spectrum:

$$\Phi_e = \int_0^{\infty} \Phi_{e,\lambda}(\lambda)\, d\lambda$$


Fig. 5 Integrating the flux emitted by a source over the full solid angle gives the total radiant (or luminous) flux

Luminous flux, Φv, is measured in lumen. It is calculated from the spectral radiant flux, Φe,λ(λ), by weighting with the photopic spectral luminous efficiency function, V(λ):

$$\Phi_v = 683 \int_0^{\infty} \Phi_{e,\lambda}(\lambda)\, V(\lambda)\, d\lambda$$

The most common use of radiant flux, and the photometric and spectroradiometric equivalents, is that of the geometrically total radiant flux emitted by a source, in all directions, into a solid angle of 4π sr (see Fig. 4). In particular, when we talk of the "luminous flux" of a source, it is generally the geometrically total luminous flux that is meant (Fig. 5).
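A direct corollary, sketched below in Python, is that an isotropic point source of luminous intensity Iv emits a geometrically total flux of 4π × Iv lumen (the 100 cd example value is hypothetical):

```python
import math

def total_flux_isotropic(intensity_cd: float) -> float:
    """Total luminous flux (lm) of an isotropic source: Phi_v = 4 * pi * I_v."""
    return 4 * math.pi * intensity_cd

print(round(total_flux_isotropic(100)))  # a uniform 100 cd source emits ~1257 lm
```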

Irradiance and Illuminance
The irradiance Ee at a point on a surface is defined as the quotient of the radiant flux dΦe incident on an element of the surface containing that point, by the area dA of that element:

$$E_e = \frac{d\Phi_e}{dA}$$

The unit of irradiance is watt per meter squared (W m⁻²), and so the unit of spectral irradiance Ee,λ is W m⁻² nm⁻¹. The corresponding photometric quantity is termed the illuminance, denoted Ev. The unit of illuminance is lumen per meter squared, lm m⁻², usually termed lux (lx). An illuminance of one lux is equivalent to a luminous flux of one lumen falling on an area of one square meter:

$$1\ \text{lx} = 1\ \text{lm m}^{-2}$$


Fig. 6 Point source S irradiating surface dA

Illuminance Ev can be calculated from the spectral irradiance, Ee,λ(λ), by weighting with the photopic spectral luminous efficiency function, V(λ):

$$E_v = 683 \int_0^{\infty} E_{e,\lambda}(\lambda)\, V(\lambda)\, d\lambda$$

Figure 6 represents a point source S irradiating an element of area dA, at a distance r from the source. The normal to this element of area forms an angle θ with the radius vector r. The projection of the element of area dA onto a plane that is normal to the radius vector r will be given by

$$\text{Projected area} = dA \times \cos\theta$$

Therefore, the solid angle subtended at the source by the element of area dA is

$$d\Omega = \frac{dA \times \cos\theta}{r^2}$$

and so the radiant intensity of the source in the given direction will be given by

$$I_e = \frac{d\Phi_e}{d\Omega} = \frac{d\Phi_e \times r^2}{dA \times \cos\theta}$$

From above, the irradiance produced by the source at dA is given by

$$E_e = \frac{d\Phi_e}{dA}$$

If we substitute the expression for dΦe from the equation given above for the radiant intensity, we can rewrite the irradiance as

$$E_e = I_e \, \frac{\cos\theta}{r^2}$$

Thus, the irradiance produced by a point source is inversely proportional to the square of the distance; this is known as the inverse square law. Furthermore, the irradiance is proportional to the cosine of the angle between the direction of irradiation and the normal to the surface; this is known as the cosine law.
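These two laws combine into the simple illuminance calculator sketched below in Python; the intensity, distance, and angle values in the example are hypothetical:

```python
import math

def illuminance_from_point_source(intensity_cd: float, distance_m: float,
                                  angle_deg: float = 0.0) -> float:
    """Illuminance (lux) from a point source: E = I * cos(theta) / r^2."""
    return intensity_cd * math.cos(math.radians(angle_deg)) / distance_m ** 2

# Example: a 200 cd source at 2 m, with light arriving 30 degrees off the surface normal
print(round(illuminance_from_point_source(200, 2.0, 30), 1))  # ~43.3 lx
```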


Radiance and Luminance
Figure 7 represents a radiant surface acting as a source or an irradiated surface acting as a secondary source. The concept of radiant intensity cannot be applied to such a source, and so we consider it as a collection of small radiant surfaces of area dA, to which the concept of radiant intensity is applicable. The quotient of the radiant intensity dIe of one of these small surface elements, when viewed in a particular direction, by the projected area of the source under study, is known as the radiance of the surface Le. Equivalently, the radiance in a given direction, at a given point on a real or imaginary surface, can be defined by the formula

$$L_e = \frac{d\Phi_e}{\cos\theta \times d\Omega \times dA} = \frac{dI_e}{dA \times \cos\theta}$$

where dΦe is the radiant flux transmitted by an elementary beam passing through the given point and propagating in the solid angle dΩ containing the given direction, dA is the area of a section of that beam containing the given point, and θ is the angle between the normal to that section and the direction of the beam. Radiance is measured in watts per steradian per meter squared (W sr⁻¹ m⁻²), and the unit of spectral radiance, Le,λ, is therefore W sr⁻¹ m⁻² nm⁻¹. The corresponding photometric quantity is termed luminance Lv, and the unit is cd m⁻². Luminance can be calculated from the spectral radiance, Le,λ(λ), by weighting with the photopic spectral luminous efficiency function, V(λ):

$$L_v = 683 \int_0^{\infty} L_{e,\lambda}(\lambda)\, V(\lambda)\, d\lambda$$

Fig. 7 Radiance of an extended source

Radiance and luminance are generally the most relevant and useful quantities for characterizing the amount of light emitted from the surface of a visual display.

Radiant Exitance and Luminous Exitance
The radiant exitance Me at a point of a surface is defined as the quotient of the radiant flux dΦe leaving an element of the surface containing that point, by the area dA of that element:

$$M_e = \frac{d\Phi_e}{dA}$$

It is measured in watts per meter squared, W m⁻². For a uniform diffuse source, the relationship between radiant exitance and radiance is

$$M_e = \pi \times L_e$$

The spectroradiometric quantity is known as spectral radiant exitance Me,λ and is usually measured in W m⁻² nm⁻¹. The analogous photometric quantity is luminous exitance Mv measured in lumen per meter squared, lm m⁻².
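A minimal sketch of the diffuse-source relation above, assuming the photometric analogue Mv = π × Lv for a uniform diffuse (Lambertian) emitter; the 100 cd m⁻² example value is hypothetical:

```python
import math

def luminous_exitance(luminance_cd_m2: float) -> float:
    """Luminous exitance (lm m^-2) of a uniform diffuse (Lambertian) source: M = pi * L."""
    return math.pi * luminance_cd_m2

print(round(luminous_exitance(100), 1))  # a 100 cd/m^2 matte surface emits ~314.2 lm/m^2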

Non-SI Photometric Units
There are a number of non-SI units which may sometimes be encountered in the measurement of illuminance and luminance. The most common of these are summarized in Tables 1 and 2 below.

Table 1 Non-SI units for illuminance
• Phot (ph): lm cm⁻²; multiply by 10⁴ to convert to lux
• Milliphot (mph): 10⁻³ lm cm⁻²; multiply by 10 to convert to lux
• Footcandle (fcd): lm ft⁻²; multiply by 10.76 to convert to lux

Table 2 Non-SI units for luminance
• Nit (nt): cd m⁻²; multiply by 1 to convert to cd m⁻²
• Stilb (sb): cd cm⁻²; multiply by 10⁴ to convert to cd m⁻²
• Apostilb (blondel, equivalent lux) (asb): (1/π) cd m⁻²; multiply by 0.3183 to convert to cd m⁻²
• Lambert (equivalent phot) (L): (1/π) cd cm⁻²; multiply by 3,183 to convert to cd m⁻²
• Millilambert (mL): 10⁻³ × (1/π) cd cm⁻²; multiply by 3.183 to convert to cd m⁻²
• Footlambert (equivalent footcandle) (fL): (1/π) cd ft⁻²; multiply by 3.426 to convert to cd m⁻²
• Candela per square foot (cd ft⁻²): multiply by 10.76 to convert to cd m⁻²
• Candela per square inch (cd in⁻²): multiply by 1,550 to convert to cd m⁻²
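The factors in Tables 1 and 2 lend themselves to a small conversion helper; a Python sketch (the unit-name keys are chosen here purely for illustration):

```python
# Conversion factors taken directly from Tables 1 and 2 above.
TO_LUX = {"phot": 1e4, "milliphot": 10, "footcandle": 10.76}
TO_CD_M2 = {"nit": 1, "stilb": 1e4, "apostilb": 0.3183, "lambert": 3183,
            "millilambert": 3.183, "footlambert": 3.426,
            "cd/ft2": 10.76, "cd/in2": 1550}

def to_lux(value: float, unit: str) -> float:
    """Convert a non-SI illuminance value to lux."""
    return value * TO_LUX[unit]

def to_cd_m2(value: float, unit: str) -> float:
    """Convert a non-SI luminance value to cd m^-2."""
    return value * TO_CD_M2[unit]

# Example: a display luminance of 30 fL is about 103 cd/m^2
print(round(to_cd_m2(30, "footlambert"), 1))  # 102.8
```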

Summary
The SI system is an internationally agreed system of units, which is used throughout the world to provide a consistent basis for measurement. The importance of vision in our daily lives is recognized within this system by the fact that one of the seven SI base units relates to the visual effectiveness of optical radiation: this is the unit of luminous intensity, the candela. The candela relates the visual effectiveness of a light source to its spectral radiant intensity through defined spectral luminous efficiency functions, the most important being the photopic function, V(λ). Other parameters that are of importance for quantifying the performance of sources of optical radiation can be related to luminous intensity by considering the different geometrical configurations used for measurement. They are therefore expressed in terms of derived units (some of which have special names) that are related to the candela through these geometrical considerations, for example, luminous flux is measured in lumen (equivalent to candela steradian) and luminance is measured in candela per meter squared. The use of these internationally agreed units allows us to measure, express, and compare the visual performance of displays on a reliable basis and is therefore essential not only for trade and specification purposes but also for ensuring that regulatory requirements are met.

Acknowledgments This work was funded by the National Measurement Office of the UK Department for Business, Innovation, and Skills.

References
Commission International de l'Éclairage (1989) CIE 81:1989 mesopic photometry: history, special problems and practical solutions. Commission International de l'Éclairage, Vienna. www.cie.co.at/
Commission International de l'Éclairage (2004) ISO 23539:2005(E)/CIE S 010/E:2004 joint ISO/CIE standard: photometry – the CIE system of physical photometry. Commission International de l'Éclairage, Vienna. www.cie.co.at/
Commission International de l'Éclairage (2011) CIE S017.2/E:2011 ILV: international lighting vocabulary. Commission International de l'Éclairage, Vienna. www.cie.co.at/


Commission International de l'Éclairage (2010) CIE 191:2010 recommended system for mesopic photometry based on visual performance. Commission International de l'Éclairage, Vienna. www.cie.co.at/
Organisation Intergouvernementale de la Convention du Mètre (2006) The international system of units (SI), 8th edn. Bureau International des Poids et Mesures, Paris. www.bipm.org. ISBN 92-822-2213-6

Further Reading
Commission International de l'Éclairage (1983) CIE 18.2-1983: the basis of physical photometry. Commission International de l'Éclairage, Vienna. www.cie.co.at/
DeCusatis C (1998) Handbook of applied photometry. Springer, New York
Grum F, Becherer RJ (1979) Optical radiation measurements, vol 1, Radiometry. Academic, New York
McCluney R (1994) Introduction to radiometry and photometry. Artech House, Boston


Measurement Instrumentation and Calibration Standards
Teresa Goodman*
National Physical Laboratory, Teddington, Middlesex, UK
*Email: [email protected]

Abstract
Correct and reliable measurement requires not only the selection and use of appropriate measurement instrumentation but also (a) the use of approved or validated measurement procedures, (b) the use of traceable calibration reference artifacts or standards, and (c) the identification, correction, and/or allowance for potential measurement errors and uncertainties. This chapter will provide an overview of all of these considerations and highlight some fundamental issues that affect the key (generic) measurement approaches used when characterizing the optical properties of displays.

List of Abbreviations
CIE: Commission International de l'Éclairage
CIE x(λ), y(λ), z(λ) functions: CIE color-matching functions that define the internationally agreed standard colorimetric observer
LED: Light-emitting diode

Introduction
The starting point for any measurement of a display is to decide on what is the appropriate quantity to measure, which is usually based on what the measurement will be used for. This will determine (a) the required measurement geometry, (b) whether the measurements need to be made spectrally or in terms of the photometric value, and (c) the most appropriate measurement instrumentation to use. For example, one of the key characteristics of a display is its "brightness," which is usually specified in terms of its average luminance. In this case, the measurements would generally be made using a luminance meter that is able to average over a large area of the display. If, however, it was required to measure the spectral characteristics of the optical radiation falling on the surface of the display from other light sources in the environment, it would be more appropriate to use a spectroradiometer system and measure the spectral irradiance at the position of the display.

Once these basic questions have been decided, consideration can be given to the other factors that are necessary in order to ensure that measurements are both valid and reliable, namely:


• Clear linkage to national standards via an unbroken traceability chain (i.e., the use of measurement instrumentation or reference artifacts whose calibration can be traced directly to national standards)
• Recalibration of measurement instrumentation or reference artifacts at appropriate intervals, which may be determined by time or (as is commonly the case where the reference artifact is a lamp) by usage
• Consideration of the impact of environmental and other influences on the measurement results (i.e., an uncertainty evaluation)
• Use of approved or validated measurement techniques

More detail for specific measurement parameters, including information on recommended techniques for minimizing measurement errors and uncertainties, is given in section 11, "▶ Display Metrology."

Measurements Using a Photometer or Other Broadband Meters
Most photometric measurements are made using a photometer, which typically consists of a photosensitive element (usually a silicon photodiode) combined with a filter that is designed to modify the spectral responsivity to provide an approximation to the V(λ) function. Other elements may also be included depending on the geometry required for the quantity being measured, e.g., a diffuser is usually used for measurements of the amount of light falling on a working surface (illuminance), and a lens is often incorporated for measurements of the light emitted from a defined area on a source (luminance). The photometer is generally calibrated by comparison with a source of known photometric output, and this source is usually a tungsten lamp operating at a correlated color temperature of 2,856 K (CIE standard illuminant A) – see section 5, "▶ TFTs and Materials for Displays and Touchscreens."

The relative spectral responsivity of a photometer should, ideally, exactly match the V(λ) function. In practice, such an ideal is impossible to achieve and all photometers show some departure from V(λ), as shown in the example in Fig. 1. This is termed the spectral mismatch error of the photometer, and its effect on the results of measurements depends not only on the degree of departure from the V(λ) function but also, critically, on the spectral characteristics of the source being measured (see also chapter "▶ Measurement Devices"). Consider as an example a photometer that has a perfect match to V(λ) across most of the spectral range but with a significant departure from V(λ) in the region between about 450 and 500 nm, as shown in Fig. 2.

Fig. 1 Example of the spectral responsivity of a high-quality photometer


Fig. 2 Example of the impact of spectral mismatch errors for measurements on different sources

In this case, there will be no error due to spectral mismatch when measuring a source with emission only at longer wavelengths, such as a red LED, but significant error when measuring a source with emission at shorter wavelengths, such as a blue LED.

The impact of any spectral mismatch errors can be minimized by calibrating the photometer using a reference source with spectral characteristics that are identical to those of the sources to be measured. Large errors can arise if sources with different spectral characteristics are compared, even if these sources have the same color appearance (Lambe 1995). Alternatively, if the relative spectral responsivity of the photometer and the spectral power distributions of the standard and test sources are all known, then a correction can be calculated to allow for departures from V(λ). This "spectral mismatch correction factor," F, is given by

$$F = \frac{\int S_t(\lambda)\, V(\lambda)\, d\lambda \times \int S_r(\lambda)\, s(\lambda)\, d\lambda}{\int S_t(\lambda)\, s(\lambda)\, d\lambda \times \int S_r(\lambda)\, V(\lambda)\, d\lambda}$$

where St(λ) and Sr(λ) are the spectral power distributions of the test and reference sources, respectively, and s(λ) is the spectral responsivity of the photometer. Similar corrections can be calculated for other instruments that are designed to match a defined spectral responsivity function, such as tristimulus colorimeters that are intended to match the CIE x(λ), y(λ), z(λ) functions. The correction factors for even well-corrected photometers can be large, particularly for colored sources such as displays (even a "white" display). It should also be remembered, as mentioned previously, that most equipment which gives a direct readout in photometric units (e.g., illuminance meters calibrated in lux) will have been calibrated with a source approximating to CIE standard illuminant A (i.e., a tungsten lamp at a correlated color temperature of 2,856 K). Large errors may be introduced when displays or other non-tungsten sources are measured with such instruments. Thus, it will almost always be necessary to calculate and apply a spectral mismatch correction factor when measuring the photometric characteristics of a display using a photometer or, similarly, when measuring the colorimetric characteristics using a tristimulus colorimeter.
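A minimal numeric sketch of this correction (Python), assuming the two spectra and the responsivity are sampled on the same wavelength grid, so the integrals reduce to sums and the grid spacing cancels; all sample values are hypothetical:

```python
def mismatch_factor(S_t, S_r, s, V):
    """Spectral mismatch correction factor:
    F = (sum(S_t*V) * sum(S_r*s)) / (sum(S_t*s) * sum(S_r*V))."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return (dot(S_t, V) * dot(S_r, s)) / (dot(S_t, s) * dot(S_r, V))

# Toy three-sample spectra (hypothetical): test source, reference source,
# photometer responsivity, and V(lambda), all on the same grid.
S_t = [0.9, 1.0, 0.2]
S_r = [0.3, 0.8, 1.0]
s   = [0.45, 1.0, 0.11]
V   = [0.5, 1.0, 0.1]
print(round(mismatch_factor(S_t, S_r, s, V), 3))  # ~1.025

# The corrected photometric reading is the raw meter reading multiplied by F.
```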


A diffuser is often incorporated into a photometer to avoid problems associated with nonuniformity of the detector and/or the filter. Many diffusers have an appreciable coloration, and this must obviously be allowed for in the calculation of spectral correction factors. They also result in a very large field of view (approximately 180°), which means that photometers fitted with diffusers are generally very susceptible to stray light. Careful screening is therefore essential, and stray light checks are especially important. A baffle is often attached to the front of the instrument in order to reduce the field of view and thus minimize stray light.

Another common use of a diffuser is in an illuminance meter, where it is intended to provide a cosine correction, such that

$$s(\theta) = s(0°)\, \cos\theta$$

where s(θ) is the responsivity at angle θ to the normal. Such a correction is an essential requirement in situations where light is incident in all directions (e.g., when measuring the illumination falling on a desk in an office), but it cannot be perfectly achieved in practice and can lead to measurement errors (see chapter "▶ Measurement Devices"). Illuminance meters are calibrated using luminous intensity standards (luminous intensity Iv) operating at a known distance d meters from the front surface of the diffuser, the illuminance, Ev, being given by

$$E_v = \frac{I_v}{d^2}$$

Hence, the calibration does not relate to the conditions of use, and departures from ideal cosine law behavior are consequently often overlooked. It is possible to check the cosine correction of a meter fairly easily using a highly directional source. If the angle between the detector and the source is varied (keeping the distance constant), the signal should, ideally, vary as the cosine of this angle. All photometers have a "limiting aperture" of some kind, be it the sensitive area of the detector itself, the front surface of the diffuser (if there is one), or some other aperture. There are two cases where it is essential to know where this limiting aperture is located. The first is in the calibration of an illuminance meter using standards of luminous intensity. As has already been mentioned, the illuminance is given by the luminous intensity/distance², and it is therefore necessary to know from where the distance should be measured. If the illuminance meter is fitted with a plane diffuser, there is usually no problem; the front surface of the diffuser is the effective stop of the system. If no diffuser is present, it is necessary to determine exactly where the limiting aperture lies and to measure the distance from this point. The second situation where the location (and in this case, the size as well) of the limiting aperture must be known is when calibrating a highly directional source for luminous intensity. Here, the measured intensity depends critically on the solid angle over which it is measured and any statement of the measured luminous intensity should therefore also give the solid angle used. This means that the distance between the source and the photometer and the size of the aperture must both be correctly determined.
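The cosine-response check described above reduces to comparing each reading against s(0°) cos θ. The following is a minimal sketch of that comparison; the function name and the example readings are hypothetical, not values from this chapter.

```python
import math

def cosine_response_error(angles_deg, signals):
    """Relative deviation of each reading from the ideal s(0) * cos(theta)
    response, using the normal-incidence reading as the reference."""
    s0 = signals[0]  # reading at theta = 0 (normal incidence)
    return [(s - s0 * math.cos(math.radians(a))) / (s0 * math.cos(math.radians(a)))
            for a, s in zip(angles_deg, signals)]

# Hypothetical readings from a highly directional source at a fixed distance
angles = [0, 15, 30, 45, 60]
readings = [100.0, 96.8, 86.9, 71.2, 50.9]
for a, e in zip(angles, cosine_response_error(angles, readings)):
    print(f"{a:2d} deg: {100 * e:+.1f} % deviation from ideal cosine response")
```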

Measurements Using a Luminance Meter

Luminance meters usually incorporate some form of imaging system to focus the area being measured onto the detector. The optics are generally designed to allow the area being measured to be viewed and identified through an eyepiece. As in the case of other photometric measurements, spectral mismatch between the luminance meter responsivity and the V(λ) function means that spectral differences between the reference and test sources can lead to significant error and spectral correction factors may have to be applied. The spectral transmittance of the imaging system must be allowed for when determining the


spectral responsivity of the meter and calculating spectral correction factors. Some luminance meters also have an additional “close-up” lens that can be affixed to enable very small areas to be measured. These are frequently antireflection coated and therefore have a non-neutral spectral transmittance. It may be necessary to apply an additional color correction factor to allow for this coloration. Even a spectrally neutral lens will change the overall responsivity of the instrument, so that unless it has been calibrated with the lens in position, an appropriate correction should be applied. When making measurements on a display using a luminance meter, it is important to minimize the effect of stray light, either through the use of a stray light elimination tube (on the measurement instrument) or a mask (on the surface of the display) – see chapters “▶ Luminance, Contrast Ratio and Gray Scale” and “▶ Measurement Devices.”

Measurements Using a Spectroradiometer

The use of a spectroradiometer is essential if spectral values are required. However, spectroradiometers are also often used to avoid the problems due to spectral mismatch errors associated with photometers and other detectors that are designed to match a defined responsivity function, such as tristimulus colorimeters. In this case, the required photometric value is obtained by calculation as follows, where Qv is the photometric quantity, Qλ(λ) is the corresponding radiometric quantity, and Δλ is the step interval for the measurements (similar calculations can be performed to obtain other integral values, e.g., X, Y, Z tristimulus values; a short numerical sketch of this summation is given below, after the list of advantages of array-based systems):

Qv = 683 Σ Qλ(λ) V(λ) Δλ

Spectroradiometric measurements require the radiation produced by a source to be isolated into discrete bands of energy of known wavelength and bandwidth. There are various methods by which to do this, but the most common is to use a monochromator. A basic monochromator system consists of an entrance slit, a dispersing element, imaging optics, and an exit slit. The dispersing element serves to convert the homogeneous radiation from the source into a spectrum and is typically either a prism or a diffraction grating. The entrance slit is imaged by the monochromator optics onto the exit slit, and these two slits together determine the "slit function" and bandwidth of the monochromator. There are several different geometrical arrangements used in monochromators, and Fig. 3 shows one common version. Mirrors are used to collimate light from the entrance slit, which then falls onto a diffraction grating, is dispersed by the grating, and finally refocused onto the exit slit. The dispersed spectrum appears in the plane of the exit slit, so that any desired narrow wavelength band can be isolated by adjusting the angle of the grating. Adjusting the width of the exit slit changes the bandwidth of the emergent radiation. The slit function (i.e., the relative spectral transmittance of the monochromator) depends on the relationship between the sizes of the entrance and exit slits. It is triangular if the image of the entrance slit exactly fills the exit slit, and trapezoidal in other cases. It is important to remember that all the radiation at "unwanted" wavelengths is reflected or refracted within the monochromator, i.e., it is not absorbed by the dispersing element. This unwanted radiation is scattered within the instrument, often in such a way that it can leave through the exit slit – this is termed "stray light." The problem is reduced if two monochromators are used in series (as shown in Fig. 3); a coupled monochromator of this form is called a double monochromator. Increasingly, scanning-type spectroradiometers of the type described above are being replaced by array-based systems, which use a fixed monochromator with an array of detectors in the position normally occupied by the exit slit. The spectrum is distributed across the detector array, and each of the individual

[Fig. 3 The optical layout of a double Czerny-Turner monochromator (labeled elements: focusing mirrors, diffraction gratings, polychromatic light in, monochromatic light out)]

[Fig. 4 Schematic of an array-based spectroradiometer for irradiance measurements (labeled elements: source, integrating sphere with baffle, entrance slit, monochromator with diffraction grating, array detector)]

detector elements (pixels) effectively acts as a separate “exit slit” for the monochromator. An example of a typical system is illustrated in Fig. 4. Array-based systems can offer a number of advantages over scanning systems, e.g.: • They sample all wavelengths in the spectral range simultaneously, making them well suited to measurements on pulsed or time-varying sources. • The absence of moving parts means they can be more stable, reproducible, and rugged. • They are usually smaller and therefore more portable. • Simultaneous wavelength sampling is also an advantage in situations such as process control, where rapid data collection is necessary. • It is usually possible to integrate the signal over a period of time, thus reducing the effect of noise on the measured output.
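As a concrete illustration of the photometric summation given earlier in this section (Qv = 683 Σ Qλ(λ) V(λ) Δλ), the sketch below applies it to a sampled spectrum of the kind an array-based instrument delivers. The five-point spectrum and the coarse 50-nm grid are hypothetical; a real calculation would use the tabulated CIE V(λ) values at 5-nm or finer steps.

```python
# Minimal sketch of Qv = 683 * sum(Q_lambda(l) * V(l) * delta_lambda).
# The measured spectrum below is hypothetical; the V values are CIE V(lambda)
# at the listed wavelengths, here sampled on a deliberately coarse grid.
wavelengths = [450, 500, 550, 600, 650]          # nm
V = [0.038, 0.323, 0.995, 0.631, 0.107]          # CIE V(lambda)
Q_lambda = [0.010, 0.012, 0.015, 0.014, 0.011]   # e.g., spectral radiance, W m-2 sr-1 nm-1

delta_lambda = 50  # nm step interval
Qv = 683 * sum(q * v * delta_lambda for q, v in zip(Q_lambda, V))
print(f"Photometric quantity Qv = {Qv:.0f}")  # in cd m-2 if Q_lambda is spectral radiance
```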


The cost of array-based spectroradiometers has fallen significantly in recent years, and their reliability has also improved, making their use increasingly widespread. It is important to note, however, that although in many respects array-based spectroradiometers are very similar to conventional systems which use discrete sampling, some of the features of array-based systems mean that special calibration techniques and precautions are needed if correct results are to be obtained (Hopkinson et al. 2004). In particular, the problem of in-system stray light is significantly more acute with array systems than with more traditional systems, for the following reasons: • Array systems use a single monochromator to disperse the radiation across the array, whereas the majority of high-quality traditional spectrometers use a double monochromator (which has much better stray light performance) and mechanically scan through the spectrum by rotating the monochromator gratings. • Radiation can be reflected (or inter-reflected) off the array, onto the walls and/or other components within the monochromator, and then back onto the array. Reflections from detectors placed after an exit slit (as in scanning systems) are much less likely to reenter the monochromator and be re-reflected onto the detector. • The attraction of many array systems lies in their small size and portability; it is much more difficult to make effective use of baffles in a physically small system. • Radiation must be spread across the whole array, making the placement of baffles within the monochromator more difficult than in the situation where only the narrow exit slit is to be irradiated. • Most array spectrometers use silicon detectors, which have high responsivity in the red and near infrared (to ~1,100 nm) and much lower response in the blue and ultraviolet spectral regions. This makes them highly sensitive to any stray radiation in the red and near infrared, which is also the region of highest emission from many commonly used sources, particularly tungsten-based sources typically used as calibration reference artifacts (see section 5, “▶ TFTs and Materials for Displays and Touchscreens”). As a result of these problems, in-system stray light is a major factor influencing the performance of array systems for spectrometry and, in the blue region in particular, can dominate all other sources of measurement uncertainty. Other major factors to be considered are wavelength calibration, bandwidth, linearity, noise, dark current, and external stray light (Hopkinson et al. 2004). Because of the problems of stray light, the use of array spectrometers for measurements of displays is best restricted to situations where it is possible to make a direct comparison between the display under test and a reference display with known characteristics; use of a traditional calibration reference artifact, of the type described in section 5, “▶ TFTs and Materials for Displays and Touchscreens,” is likely to lead to large measurement errors.

Calibration Reference Artifacts

In the vast majority of cases, the measurement instrumentation used in photometry and spectroradiometry is calibrated using a reference source, which is most commonly a tungsten lamp operating at a correlated color temperature of 2,856 K. This type of reference source often bears little resemblance to the types of source being measured, and this, in turn, leads to many potential sources of error. Thus, if a specific calibration problem arises, the best approach may well be, first of all, to see whether it is possible to obtain a standard with the same characteristics as the test sources. If this can be done, it may be possible to achieve a given level of accuracy with a less complex and, therefore, less expensive measuring system.


Tungsten lamps used as calibration standards for luminous intensity, radiant intensity, illuminance, or spectral irradiance are usually operated cap down and calibrated in a specified horizontal direction. They are generally designed so that it is possible to set up the lamp on each occasion of operation in exactly the same position relative to the photometer or irradiated target. Although it is not essential for a source of illuminance or irradiance to obey the inverse square law if it is always used at the distance at which it was calibrated, it must be constructed so that the calibration distance can be measured precisely from a reference point on the lamp, the lamp enclosure, or the lamp mount. It is often useful, however, if a lamp used for this purpose does obey the inverse square law, at least over a limited range of distances, so that it can be used to provide a range of illuminance/irradiance levels. Another requirement of a source to be used as a standard of intensity, illuminance, or irradiance is that its field should be uniform, preferably to better than 0.25 % over the irradiated area or angle. Coiled filaments, especially single coils arranged in a regular pattern, can show rapid changes of intensity with angle of view as the front turns mask the rear turns to a greater or lesser extent. Uniformity is improved if the lamp envelope is diffusing, but if clear, it should be of good optical quality. Lamps are often grit blasted to achieve more uniform irradiance, but the surface is then vulnerable to contamination and consequent discoloration. Furthermore, where a lamp has a diffusing envelope, it is almost impossible to define the position of the light center, so the inverse square law is unlikely to be obeyed. Ribbon filament lamps can be used as luminance or spectral radiance standards when high power levels are required. These are normally operated cap down, with the ribbon vertical, and have a plane window of glass or silica to permit good optical imaging. The calibration applies to radiation about an axis, normally horizontal, from a specific area of the ribbon. The calibrated area must be readily identifiable and for this reason a pointer is often provided adjacent to the ribbon, or a small notch may be cut into the ribbon itself. Because the ribbon can move relative to the lamp base during warm-up, the alignment should be checked and adjusted if necessary once the lamp has reached its final operating temperature. In the more common situation where lower power levels are to be involved, such as for the measurement of a display, a standard of lower radiance is generally used. The most fundamental is a plane surface of known luminance (or spectral radiance) factor, such as a barium sulfate or magnesium oxide plaque or a calibrated white opal, which is illuminated using an intensity standard (see Fig. 5). The diffuser acts as a secondary source and is usually illuminated normally and viewed at 45°. The radiance or luminance, L, is given by

L = I s / (π d²)

where I is the intensity, d is the distance between the intensity standard and the diffuser, and s is the radiance factor of the diffuser for illumination at 0° and viewing at 45°.

[Fig. 5 Lower-level radiance or luminance standard (labeled elements: intensity standard, diffuse reflectance standard, diffuser viewed at 45°)]


[Fig. 6 Integrating sphere-based luminance standard (labeled elements: lamp in enclosure, entrance port with adjustable aperture, integrating sphere, exit port giving uniform Lambertian field)]

In the more general case, where a source of intensity, I, irradiates a diffusing surface at an angle θA to the normal, which is then viewed at angle θB to the normal, the radiance or luminance is given by

L = I s(θA, θB) cos θA / (π d²)

where s(θA, θB) is the radiance factor of the diffuser under these conditions of irradiance and view. In the case of a perfect diffuser (one for which s(θA, θB) is independent of θA and θB – often referred to as a cosine or Lambertian diffuser), the luminance varies as the cosine of the viewing angle; this ideal is rarely achieved, however, and significant errors can arise if the conditions of use do not replicate closely the conditions under which the diffuser was calibrated. A less direct, but widely employed, standard for luminance or radiance is a "luminance gauge." This often consists of a small integrating sphere coated internally with a white diffusing material of high reflectance, such as barium sulfate paint, with an illuminating source located either inside or outside the sphere (see Fig. 6). The device is frequently provided with some means of varying the luminance/radiance over a range of values (e.g., by use of an adjustable diaphragm or aperture). Whatever type of luminance/radiance source is chosen, the reference direction and alignment method, the location and size of the calibrated area, and the solid angle subtended by the optical system of the detector must all be specified. When using the integrating sphere-type source as a standard of spectral radiance, it is also necessary to ensure it has been calibrated against another standard of known spectral radiance. It is not sufficient to calculate the spectral distribution from the color temperature, since the sphere coating is not always neutral and this can lead to very large errors, particularly in the blue region (see Fig. 7).
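The two expressions above are easily evaluated numerically. In this sketch the intensity, distance, and radiance-factor values are hypothetical; the function simply encodes L = I s(θA, θB) cos θA / (π d²), which reduces to the 0°/45° case when θA = 0.

```python
import math

def diffuser_luminance(I, d, s, theta_A_deg=0.0):
    """L = I * s * cos(theta_A) / (pi * d**2)
    I: luminous intensity of the standard (cd)
    d: distance from the intensity standard to the diffuser (m)
    s: radiance factor for the actual illumination/viewing geometry
    theta_A_deg: angle of incidence measured from the diffuser normal"""
    return I * s * math.cos(math.radians(theta_A_deg)) / (math.pi * d ** 2)

# Hypothetical example: a 300 cd standard 1.5 m from a diffuser with
# radiance factor 0.98 for illumination at 0 deg and viewing at 45 deg
L = diffuser_luminance(I=300.0, d=1.5, s=0.98)
print(f"Diffuser luminance: {L:.1f} cd m-2")  # about 41.6 cd m-2
```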

Summary

For any measurement, it is important not only to select the most appropriate measurement equipment but also to ensure that it is correctly calibrated using stable reference standards that are directly traceable to national measurement standards. In addition, it is essential to consider and evaluate all potential sources of error or measurement uncertainty; these can arise from the measurement instrumentation used, the device being measured, the calibration reference artifact, and/or the environmental conditions and are typically exacerbated by any differences between the characteristics of the calibration reference and the device being measured (spectral properties, output power level, size and shape, etc.). The calibration reference artifacts used to calibrate instrumentation for measuring the optical emission characteristics of displays (and other sources of optical radiation) are usually based on tungsten lamps,

[Fig. 7 Spectral power distribution of a range of luminance gauges relative to a Planckian radiator at the same temperature (ratio of power plotted against wavelength, 400–800 nm)]

which provide relatively stable output across the whole of the visible spectral region. Typically some sort of diffuser is also incorporated, to provide a spatially uniform luminance or radiance source. The majority of measurements on visual displays are performed using broadband detectors, such as photometers or illuminance meters (also called luxmeters), luminance meters (sometimes referred to as spot photometers), and tristimulus colorimeters. For these devices, the main source of measurement error is usually the degree of spectral mismatch between the actual detector spectral responsivity and the desired spectral response function(s). Other major sources of potential error are stray light, which can affect luminance, illuminance, and colorimetric measurements, and departures from ideal cosine response, which is important mainly for illuminance measurements on large sources. Further, factors such as nonlinearity, temperature coefficient, noise, dark current, polarization sensitivity, drift, etc., also need to be considered but are generally (but not always) less significant. Measurements of the optical characteristics of displays may also be made using spectroradiometers, which provide measures of spectral radiance or spectral irradiance from which photometric and colorimetric quantities can be calculated. Spectroradiometer systems may be based on a scanning monochromator, but increasingly these are being superseded by systems using array spectrometers. In-system stray light is the major source of error for this type of instrument and can lead to very large measurement uncertainties; for display measurement, their use is currently restricted primarily to direct comparisons between the device under test and a calibrated reference display.

Directions for Future Research

Research into improved instrumentation for measurements on displays is focused on two main areas: (1) the reduction of measurement error, particularly through improved spectral matching for broadband detectors (photometers and tristimulus colorimeters) and reduced stray light for array-based spectroradiometers, and (2) the capture of more information in a single measurement, e.g., camera-based luminance meters, which can provide information on luminance nonuniformities across the entire surface of a display in a single measurement (see section 6, "▶ Emissive Displays"), and conoscope systems, which provide information on angular variations in the display output (see chapter "▶ Measurement Devices").


Acknowledgments

This work was funded by the National Measurement Office of the UK Department for Business, Innovation and Skills.

Further Reading

DeCusatis C (1998) Handbook of applied photometry. Springer, New York
Grum F, Becherer RJ (1979) Radiometry. In: Optical radiation measurements, vol 1. Academic, New York
Hopkinson GR, Goodman TM, Prince SR (2004) A guide to the use and calibration of detector array equipment. SPIE Press, Bellingham
Lambe R (1995) The role of measurements and of a national standards laboratory in energy efficient lighting. In: Proceedings of 3rd European conference on energy efficient lighting, Newcastle Upon Tyne, pp 271–278
McCluney R (1994) Introduction to radiometry and photometry. Artech House, Norwood


Overview of the Photometric Characterization of Visual Displays

Teresa Goodman*
National Physical Laboratory, Teddington, Middlesex, UK
*Email: [email protected]

Abstract

A wide range of measurements may be required in order to fully characterize the properties of a display, but these are generally based on a relatively small number of fundamental measurement parameters: luminance, color, spatial uniformity, angular distribution, reflectance, and temporal characteristics. This chapter provides an overview of the methods and instrumentation used for these underpinning measurements and outlines the major associated sources of potential measurement error and uncertainty.

List of Abbreviations

BRDF: Bidirectional reflectance distribution function
CIE: Commission Internationale de l'Éclairage
CRT: Cathode ray tube display
LCD: Liquid crystal display
LED: Light emitting diode
PC: Personal computer

Introduction

Measurements of the photometric characteristics of displays are important for many reasons, such as assessing performance for product development, manufacture, and quality control purposes, enabling purchasers to compare products on a consistent basis, and as inputs to software used to predict display legibility in specific installations. A range of measurements may be required, as summarized in Table 1. This chapter will provide a brief overview of these measurements, the types of instrumentation that are available, and the major potential sources of measurement error.

Luminance

Luminance is the most basic measurement for a display. Not only is it the foundation of many of the other measurements (as detailed in part "▶ Advanced Measurement Procedures"), but it is also often used within the display industry as a general rule of thumb regarding the suitability of the display for a particular application. For example, a display with a luminance of 500 cd m⁻² will be on the borderline of being acceptable ("bright enough") for use in daylight or other high illumination conditions, whereas a luminance of 1,500 cd m⁻² would be considered to be very likely to be acceptable. Luminance is usually measured for both a black and white screen using a luminance meter (sometimes called a spot photometer) or a telespectroradiometer (see part "▶ Standard Measurement Procedures" and chapter "▶ Standards and Test Patterns" for more details).


Table 1 Measurements for characterizing the optical performance of a visual display

Luminance – Key element for basic specification and comparison of display performance
Color – Key element for basic specification and comparison of color display performance
Gray-level step – Key element for basic specification and comparison of display performance
Contrast ratio – Key element for basic specification and comparison of display performance
Spatial luminance uniformity – Assessing whether luminance variations across the surface of the display will result in unacceptable disturbance to the end user
Spatial color uniformity – Assessing whether color variations across the surface of the display will result in unacceptable disturbance to the end user
Angular luminance distribution – Assessing the range of angles over which the display can be viewed
Angular color distribution – Assessing the range of angles over which the display can be viewed
Diffuse reflectance – Assessing legibility of the display under specific ambient illumination conditions
Specular reflectance – Assessing legibility of the display under specific ambient illumination conditions
Angular reflectance – Assessing legibility of the display under specific ambient illumination conditions
Temporal characteristics – Assessing display susceptibility to motion blur for moving images

The measurements are usually performed with the measuring instrument perpendicular to the surface of the display, or at the angle at which the user would normally view the display (if this is not perpendicular to the surface). The size and location on the display of the measured area should be stated, since most displays show some nonuniformity in luminance over the surface. It is also important not to measure too small an area, since this can result in large differences in the measured results with small changes in size or position (at the extreme, if the measurement area is the same size as a single pixel, small movements can mean that the area is either located precisely over a pixel, giving a "high" reading, or only partially over a pixel, giving a "low" reading; neither reading will adequately represent the true display luminance). Precautions must also be taken to minimize the effect of stray light from areas of the screen other than the defined measurement area, and this generally involves the use of either a stray light tube on the measurement instrument or a mask on the surface of the display. Other major sources of error are the performance of the measurement instrument (see section "Mobile Displays, Microdisplays, Projection and Headworn Displays" and chapter "▶ Measurement Devices") and external influences on the output of the display (e.g., CRTs are susceptible to external magnetic fields and LED displays can be affected by changes in ambient temperature). Results are expressed in terms of candela per meter squared (cd m⁻²).

Color

The range of colors that can be displayed (the "color gamut") is evaluated by measuring the chromaticities of red (R), green (G), blue (B), and white (R = G = B) screens. The approach, precautions, and sources of error are similar to those for measurements of luminance, but in this case, the instrumentation used is a colorimeter (which is typically placed directly on the surface of the screen) or a telespectroradiometer (which is imaged onto the screen and provides measurements of the spectral radiance). Results are expressed in terms of CIE (x, y) or (u′, v′) chromaticity coordinates or using another specified color system. More details are given in part "▶ Standard Measurement Procedures" and chapter "▶ Measurement Devices."
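For reference, the conversion from measured tristimulus values to the (x, y) and (u′, v′) coordinates mentioned above uses the standard CIE relations; the X, Y, Z values in this sketch are hypothetical.

```python
def chromaticity(X, Y, Z):
    """CIE (x, y) and (u', v') chromaticity coordinates from tristimulus values."""
    x = X / (X + Y + Z)
    y = Y / (X + Y + Z)
    denom = X + 15 * Y + 3 * Z
    u_prime = 4 * X / denom
    v_prime = 9 * Y / denom
    return (x, y), (u_prime, v_prime)

# Hypothetical white-screen measurement
(x, y), (u, v) = chromaticity(X=95.0, Y=100.0, Z=108.0)
print(f"(x, y) = ({x:.4f}, {y:.4f}); (u', v') = ({u:.4f}, {v:.4f})")
```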


[Fig. 1 Example of results from gray-level step measurements (luminance in cd m⁻² plotted against R, G, B demand level)]

Gray-Level Step

These measurements are also performed in a similar manner to luminance measurements, but in this case, the white screen is varied over the full range of RGB drive levels (i.e., from 0 to 255), and the luminance is determined for each step (see parts "▶ Standard Measurement Procedures" and "▶ Advanced Measurement Procedures"). A test pattern generator is usually used for this purpose (see chapter "▶ Standards and Test Patterns") to avoid problems that may arise if a personal computer (PC) is used to set the gray levels (a PC provides the means for altering brightness, contrast, and gray-level step in a way that is often hidden from the operator). Results are usually expressed graphically, as shown in Fig. 1.
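Data of the kind plotted in Fig. 1 are often summarized by fitting a gamma curve, L = Lmax (level/255)^γ. The sketch below fits γ in log-log space; the drive levels and luminance readings are hypothetical.

```python
import math

def fit_gamma(levels, luminances, max_level=255):
    """Fit gamma in L = L_max * (level / max_level) ** gamma by a
    least-squares slope through the origin in log-log space.
    The zero drive level must be excluded (log is undefined there)."""
    L_max = luminances[-1]  # take the top-level reading as L_max
    xs = [math.log(lv / max_level) for lv in levels if lv > 0]
    ys = [math.log(L / L_max) for lv, L in zip(levels, luminances) if lv > 0]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Hypothetical gray-level step readings (cd m-2)
levels = [64, 128, 192, 255]
lums = [22.0, 102.0, 255.0, 500.0]
print(f"Fitted gamma: {fit_gamma(levels, lums):.2f}")  # about 2.3
```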

Contrast Ratio

Contrast ratio is defined as the ratio of the highest luminance to the lowest luminance that the display system is capable of producing. The larger the contrast ratio, the greater is the difference between the brightest whites and the darkest blacks that can be displayed. A high contrast ratio is a desirable aspect of any display, but it is not always possible to make a direct comparison between the contrast ratio values provided by different display manufacturers, due to differences in the measurement methodologies used. The most representative measure of contrast ratio for assessing overall display performance is static contrast ratio, which refers to the ratio between the luminances of the brightest white and the darkest black that can be displayed simultaneously. Dynamic contrast, on the other hand, refers to the ratio between the deepest blacks and the brightest whites that a display can produce, but not at the same time, and generally results in higher contrast values. This is particularly true in the case of displays employing backlights (e.g., LCDs), where light can bleed through from the backlight into black areas when an image containing both white and black areas is being viewed (thus reducing contrast ratio), whereas the backlight can be reduced or even turned off if a fully black image is displayed. A further complication is that, regardless of whether static or dynamic contrast is measured, the results obtained can depend significantly on the ambient lighting conditions. Contrast ratio is often quoted for


dark room conditions, that is, with no ambient illumination present and minimal reflections from the surroundings. In most instances, this is not the environment in which displays are used, and different values will be obtained if ambient illumination is present, and/or the surrounding walls, floor, and ceiling can reflect light from the display back onto the screen. There are two commonly used methods of measuring contrast ratio. The “full on/off” method compares the luminance of a white screen (R = G = B = max) with that of a black screen (R = G = B = 0) and has the advantage that it largely cancels out the effect of the external environment (equal proportions of light are reflected from the display to the room and back for both the “black” and “white” measurements, as long as the room stays the same). This method is generally suited only to dynamic contrast measurements, unless it is possible to control the display such that the backlight is fully on even when displaying a black image. The second method is to use a checkerboard pattern, in which the luminance values of all the white squares (or rectangles) are measured and averaged, and similarly the luminance values of the black squares (or rectangles) are measured and averaged. The ratio of the averaged white readings to the averaged black readings is the contrast ratio. This method provides static contrast values. However, accurate measurement of contrast using this checkerboard approach requires the use of a well-controlled dark room, with all walls, floors, ceilings, etc., totally black and nonreflective; this can be difficult and expensive to achieve. Whatever method is used and regardless of whether static or dynamic contrast is being measured, results are expressed as the ratio between the luminances of the “white” and the “black” conditions, for example, 1,000:1. More details of contrast ratio definitions and measurement methods are given in chapters “▶ Luminance, Contrast Ratio and Grey Scale” and “▶ Standards and Test Patterns.”
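Both measurement approaches reduce to simple arithmetic on the luminance readings, as in this sketch (all readings hypothetical):

```python
def full_on_off_contrast(L_white, L_black):
    """Contrast from a full-white screen and a full-black screen."""
    return L_white / L_black

def checkerboard_contrast(white_readings, black_readings):
    """Static contrast: average the white squares and the black squares,
    then take the ratio of the averages."""
    avg_white = sum(white_readings) / len(white_readings)
    avg_black = sum(black_readings) / len(black_readings)
    return avg_white / avg_black

# Hypothetical luminance readings in cd m-2
print(f"Full on/off: {full_on_off_contrast(500.0, 0.25):.0f}:1")        # 2000:1
whites = [480.0, 495.0, 505.0, 490.0]
blacks = [0.55, 0.60, 0.52, 0.58]
print(f"Checkerboard: {checkerboard_contrast(whites, blacks):.0f}:1")   # about 876:1
```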

Spatial Luminance and Color Uniformity

The visual appearance and effectiveness of a display can be significantly degraded if there are perceptible variations in the luminance and/or color over the active area of the screen. Such nonuniformities may appear as a gradual variation from one part of the screen to another or as localized variations due, for example, to the structure within the backlight. Prior to the development of imaging photometers/colorimeters, display nonuniformity was assessed by making measurements at a large number of discrete points across the full area of the screen using a spot photometer, colorimeter, or telespectroradiometer (see chapter "▶ Spatial Effects"). The problem with this approach lies in achieving sufficient spatial resolution and conducting a sufficient number of measurements to characterize fully the performance of the display. In practice, the scientist/engineer performing the measurement generally identifies the brightest and dimmest locations on the display by visual inspection and then performs several measurements around these areas. The nonuniformity can be calculated as a contrast ratio between the areas of highest and lowest luminance. The main problem with this approach is that it does not fully represent the overall nonuniformity of the display but gives only two specific worst case points. The low spatial resolution of the measurements can also mask small area nonuniformities. More recently, therefore, this point-by-point approach to measurements of display nonuniformity has been largely superseded by the use of imaging photometers and colorimeters, which produce two-dimensional maps of the variations in luminance (or chromaticity) over the full screen surface, as illustrated in Fig. 2.


[Fig. 2 Example of results from an imaging colorimeter, showing contrast ratio values in false color (scale from 0 to 438)]
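Given a luminance map from an imaging photometer, the worst-case nonuniformity described above is simply the ratio of the highest to the lowest value in the map; the 3 × 3 map in this sketch is hypothetical (a real map has one value per measured pixel).

```python
def uniformity_ratio(luminance_map):
    """Worst-case nonuniformity: ratio of maximum to minimum luminance."""
    values = [v for row in luminance_map for v in row]
    return max(values) / min(values)

# Hypothetical 3 x 3 luminance map (cd m-2) of a white screen
lum_map = [
    [430.0, 455.0, 441.0],
    [462.0, 480.0, 470.0],
    [418.0, 447.0, 432.0],
]
print(f"Max/min luminance ratio: {uniformity_ratio(lum_map):.2f}")  # 480/418, about 1.15
```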

Angular Variations in Luminance and Color

Measurements of the angular variation in luminance and color provide the characteristics of the display over the whole forward hemisphere and can be used not only to determine the angular field of view of a display but also to provide a more comprehensive understanding of display legibility under a range of conditions (see chapter "▶ Viewing Angle"). The highest angular resolution and measurement sensitivity is achieved through goniometric methods, in which the luminance or color distribution is mapped as a function of angle. However, these methods require long setup and measurement times and are consequently often too expensive to fulfill the needs of the display industry. Other methods have therefore been developed, based on the use of the latest imaging technologies (Rykowski et al. 2006), but a detailed description of these is beyond the scope of this chapter (see chapter "▶ Measurement Devices" for more information).

Reflectance Measurements

Light reflected from the display surface into the user's line of sight is superimposed on the displayed image and degrades its legibility. Conventionally, two types of reflection have been considered within the display community: specular reflection and diffuse reflection. However, with the increased use of antiglare and touch screen coatings, a third type of reflection, termed "haze," is now also considered and can be a significant contributor to degraded visual performance.

Diffuse Reflectance

Diffuse reflectance measurements (see chapter "▶ Ambient Light" for more details) are intended to quantify the amount of reflected light that will be superimposed on the displayed image from a uniformly distributed diffuse light source. This diffuse light source provides a reasonable approximation of the illumination environments in which displays are often used. For example, the illumination from the sky is diffuse in nature, so this is an appropriate condition to use for displays used out-of-doors, and although indoor environments generally have a somewhat complicated illumination distribution, even these are often adequately represented by a diffuse source to a first degree (e.g., office lighting is often designed to produce good uniformity across the working plane with no visible "bright spots").

[Fig. 3 Schematic of sampling sphere measurement of diffuse reflectance (labeled elements: light source with baffle, light trap, display or reflectance standard, exit port, luminance meter)]

The baseline measurement technique for diffuse reflectance is to place the display inside a large integrating sphere, with a lamp (with baffle) placed behind the display such that this provides diffuse illumination onto the display. Measurements are made of the screen luminance for a measured level of diffuse illuminance and compared with the measured luminance under the same conditions for a calibrated reflectance standard (Kelley 2006). However, this method requires access to an integrating sphere that is large enough to accommodate the display (the diameter of the sphere should be at least ten times the diagonal of the display), and it is therefore not widely used. An alternative approach is the "sampling sphere method" (Kelley 2006), in which the display is placed against the sample port of an integrating sphere rather than inside it. A lamp is placed inside the sphere, close to the wall, and baffled to prevent direct illumination of the display. The luminance of the display surface is measured using a luminance meter and compared with that measured under identical conditions but with the display replaced by a calibrated diffuse reflectance standard (see Fig. 3). The diffuse reflectance of the display, ρdis, is given by

ρdis = ρstd Ldis / Lstd

where ρstd is the reflectance of the calibrated standard, Ldis is the luminance measured when the display is in position, and Lstd is the luminance measured with the reflectance standard in position. If the display emits light, then the luminance of the display must be subtracted from the luminance measured under reflection to obtain the net reflected luminance. Measurements are usually made for a range of display conditions (white and black as a minimum) since the reflectance may vary depending on the display settings.
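The sampling-sphere relation above, including the subtraction of the display's own emission, can be written directly as code; the readings in this sketch are hypothetical.

```python
def diffuse_reflectance(rho_std, L_dis, L_std, L_emitted=0.0):
    """Sampling-sphere diffuse reflectance:
    rho_dis = rho_std * (L_dis - L_emitted) / L_std
    where L_emitted is the display's own luminance, subtracted when the
    display emits light during the measurement."""
    return rho_std * (L_dis - L_emitted) / L_std

# Hypothetical readings: standard of reflectance 0.985 reads 610.0 cd m-2;
# the display reads 42.0 cd m-2 under the same illumination, of which
# 1.5 cd m-2 is its own (black-state) emission
rho = diffuse_reflectance(rho_std=0.985, L_dis=42.0, L_std=610.0, L_emitted=1.5)
print(f"Diffuse reflectance: {rho:.3f}")  # about 0.065
```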

Specular Reflectance

Measurements of specular reflectance are made to determine the degree of "mirrorlike" reflection from a display (see chapter "▶ Ambient Light" for further details). Specular reflections are generally several orders of magnitude larger than diffuse reflections and can be a major source of discomfort if a display is incorrectly positioned within a lit environment. Measurements are usually made by comparing the luminance of the display when viewed at angle θ to the normal and illuminated by a point source at −θ to the normal with that for a calibrated specular reflectance standard (typically a piece of black glass) that is illuminated and viewed under identical conditions.


Angular Reflectance

Measurements of angular reflectance provide the reflectance characteristics of the display over the whole forward hemisphere and can therefore be used to determine the details of how the display reflectance will impact on its legibility under any given conditions (see chapter "▶ Ambient Light" for further details). The importance of these measurements has grown in recent years due to the increasing use of display screens with antiglare or touch screen coatings, both of which introduce nontrivial "haze" reflections, that is, reflections that are intermediate between diffuse and specular in nature. The most serious effect of haze is to "broaden out" the specular reflection, making it less easy for an observer to avoid the reflection by changing their viewing location and thus resulting in a display that is less legible than would be the case if only specular reflection were present. As in the case of measurements of angular luminance and color variations, the most comprehensive and accurate measurements of the angular reflectance properties of a display are obtained using goniometric methods, yielding the bidirectional reflectance distribution function (BRDF) (Kelley et al. 1998). As for other angular measurements, simplified approaches are under development based on imaging technologies, but these have not yet gained international acceptance and are outside the scope of this chapter (see chapter "▶ Measurement Devices" for further information).

Temporal Performance (Motion Blur)

Measurements of the temporal performance of a display relate to the degree of image persistence for a dynamic image (see chapters "▶ Temporal Effects" and "▶ Standards and Test Patterns"). These measurements are particularly important for video images containing fast-moving, high-contrast targets, such as television coverage of football and tennis (where motion blur can cause the fast-moving ball to be hard to distinguish). The issue of motion blur has become more important with the advent of new displays, such as LCD screens. Unlike CRT displays, where the image is displayed only for a short time during each refresh cycle and is blank between each image, in an LCD display, the image is held on the screen during the entire refresh period. This means that for a fast-moving object in the image, the object position is correct for only a fraction of the time, and the eye interprets this as the object being blurred. In practice, there are two contributors to motion blur: the rise and decay time of the pixels and the hold time. The former can be measured using a fast photodiode and the latter is dictated by the display drive electronics.
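A rough sense of the blur extent follows from simple arithmetic: while a frame is held, a moving object travels speed × hold time across the retina relative to its tracked position. The sketch below encodes this back-of-envelope estimate; it is not a standardized motion-blur metric, and the numbers are hypothetical.

```python
def blur_extent_pixels(speed_px_per_s, hold_time_s, response_time_s=0.0):
    """Approximate blur width: distance the object moves during the hold
    time plus the distance moved while the pixel is still switching.
    A back-of-envelope estimate only, not a standardized metric."""
    return speed_px_per_s * (hold_time_s + response_time_s)

# Hypothetical case: object moving at 960 px/s on a 60 Hz sample-and-hold
# LCD (hold time about 16.7 ms) with a 4 ms pixel response time
blur = blur_extent_pixels(960.0, hold_time_s=1 / 60, response_time_s=0.004)
print(f"Approximate blur extent: {blur:.1f} px")  # about 19.8 px
```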

Basic Measurement Instrumentation

As described in chapters "▶ Measurement Instrumentation and Calibration Standards" and "▶ Measurement Devices," most measurements of a display are made using either a spectroradiometer (Commission Internationale de l'Éclairage 1984), which provides measurement results as a function of wavelength, or a filtered broadband detector (Commission Internationale de l'Éclairage 1982), which is designed to give an approximation to one or more of the CIE standard observer functions. The different characteristics of these instruments can lead to different, but in each case significant, measurement errors, as summarized in Table 2. Both types of instrument are typically calibrated using a stable and reproducible reference source, such as a luminance gauge. The reference source may be calibrated in terms of its luminance, its chromaticity, or its spectral output (usually absolute spectral radiance) as a function of wavelength by a laboratory that is traceable to national standards. The instrument is calibrated by comparing the measured values for the reference light source with the calibration data, yielding a correction factor or factors. All instruments will


show some drift in calibration with time, so it is important to check the calibration at regular intervals. Furthermore, reference sources also drift, both with time and with usage, so it is important that these are recalibrated at regular intervals by a laboratory providing measurements that are traceable to national standards.

Table 2 Major potential sources of error with spectroradiometer (SR) and filtered broadband detector (BB) systems

Stray light (SR)
Description: Radiation scattered within the spectrometer is measured at a wavelength that does not correspond to its true wavelength, leading to errors in the measured spectral power distribution and in any calculated results, for example, tristimulus values.
Evaluation/correction: Evaluation: (a) measure the output as a function of wavelength for a large number of monochromatic inputs or (b) use cut-on or cut-off filters to investigate levels of stray light arising from particular spectral regions (e.g., no signal should be observed at wavelengths below the filter cut-on wavelength when the monochromator is set to wavelengths above the cut-on). Correction: this is possible using evaluation method (a) but is difficult and complex. A better approach is to select a spectrometer with low levels of stray light – typically this means using a scanning double monochromator.

Spectral mismatch (BB)
Description: The match between the spectral response of a broadband meter and the target CIE standard observer function is never perfect, and residual mismatch errors lead to errors when comparing sources with different spectral characteristics. Errors of many tens of percent are common with colored sources such as displays.
Evaluation/correction: Evaluation: measure the spectral responsivity of the meter as a function of wavelength and compare with the desired function. Correction: spectral mismatch errors can be minimized by calibrating the meter using a source with spectral characteristics close to those of the sources to be measured. A correction factor F can be applied if the target spectral function R(λ), the spectral responsivity of the meter s(λ), and the spectral power distributions of the reference and test sources (Sr(λ) and St(λ), respectively) are known:

F = [∫ St(λ) R(λ) dλ × ∫ Sr(λ) s(λ) dλ] / [∫ St(λ) s(λ) dλ × ∫ Sr(λ) R(λ) dλ]

Wavelength scale (SR)
Description: Wavelength errors result in the measured irradiance or radiance values being assigned to an incorrect wavelength. This also impacts on quantities derived from spectral measurements, such as tristimulus values.
Evaluation/correction: Correction: calibrate the wavelength scale of the spectrometer at several wavelengths covering the wavelength range of interest, for example, by using monochromatic emission lines from a low-pressure discharge lamp or several laser lines. Note that wavelength errors can vary significantly across the spectral region of interest, and even relatively small shifts (less than 1 nm) can result in a change of several ΔE*ab units for some display colors.

Polarization (SR and BB)
Description: The optical radiation from some displays (e.g., LCDs) is highly polarized, and this can lead to significant errors if the responsivity of the detector system is polarization sensitive. Polarization effects can be wavelength dependent and can lead to errors in measured color or luminance values.
Evaluation/correction: Evaluation: make measurements of a uniform, nonpolarized light source (such as a luminance gauge) through a polarizer that is rotated to several different positions in turn. Any variation in the measurements is due to polarization sensitivity of the detector. Correction: if the detector system does show polarization sensitivity, corrected results should be calculated from the mean of two measurements made at orthogonal polarizations.

Linearity (SR and BB)
Description: The output signal does not vary in proportion with the input quantity.
Evaluation/correction: Evaluation: can be assessed by measuring the output value for a number of known input values that span the range of inputs over which the instrument will be used. Often checked using calibrated neutral density filters, lamps of different intensities, or superposition techniques (Hopkinson et al. 2004). Correction: corrections can be applied, based on the evaluation measurements, but it is usually preferable to restrict the range of input values to those over which the instrument is linear.

Dynamic range (saturation and noise) (SR and BB)
Description: The output of many displays (e.g., CRTs) can show large variations over the course of each refresh cycle. The dynamic range of the measurement instrument must be sufficiently large to avoid saturation at the peak of each pulse and to minimize the effect of noise on the signal at the low point of each cycle. This is important even for instruments in which the readings are averaged over several cycles, since saturation and noise effects can cause errors in the averaged signal.
Evaluation/correction: Evaluation: calibrate the system both with and without a neutral density filter in place and compare measurements of the display made under both conditions. Any difference in results indicates a probable problem due to saturation at the peak of each pulse. Correction: calibrate and use the system with a neutral density filter that provides sufficient attenuation to ensure that no saturation occurs at the peak of the pulse, while also ensuring that signal levels are high enough to avoid problems due to noise.

Synchronization (SR and BB)
Description: The color of a display is rarely uniform over a refresh cycle, so correct results will only be obtained if the exposure time is an exact integer multiple of the display refresh period. Any partial cycles captured will distort the measurement result.
Evaluation/correction: Correction: if possible, the instrument should be synchronized with the refresh cycle of the display and set so that the measurement exposure time captures an integer number of cycles (here, "synchronization" means that the sampling time of the measuring instrument is directly related to the refresh rate of the display, not that the measurement is initiated at a particular time in the display cycle). If this is not possible, a long measurement time should be used, so that the number of whole refresh cycles is large compared to the number of part cycles captured.

Bandwidth and step interval (SR)
Description: The choice of spectral bandwidth for a measurement is a compromise between signal level and spectral resolution. A wide bandwidth gives a high signal level and improved signal-to-noise ratio but at the cost of the resolution of narrow peaks, which may lead to errors, for example, in the calculation of tristimulus values. The step interval should be an integer multiple of the bandwidth, and to avoid significant errors in calculated tristimulus values, intervals of no greater than 5 nm should be used.
Evaluation/correction: Evaluation: bandwidth can be measured by scanning through a monochromatic line at very fine wavelength intervals and measuring the full width at half maximum. Note that the band-pass function may not remain the same size and shape over the entire wavelength range. Correction: methods for correcting for bandwidth and step interval are available (Woolliams et al. 2011) but these require detailed knowledge of the band-pass function at all wavelengths and are generally not easy to apply. Instruments with poor slit profiles (e.g., highly nonsymmetrical) or where the step interval is not an integer multiple of the bandwidth are best avoided.
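The spectral mismatch correction factor F defined in Table 2 (and earlier in this part) is a ratio of four numerical integrals and is straightforward to evaluate; in the sketch below the two source spectra and the meter responsivity are hypothetical five-point examples on a coarse grid.

```python
def mismatch_factor(step, S_t, S_r, s_meter, R_target):
    """F = (int S_t*R * int S_r*s) / (int S_t*s * int S_r*R), with each
    integral approximated as a sum over a common wavelength grid. The
    constant step size cancels in the ratio but is kept for clarity."""
    def integ(a, b):
        return sum(x * y for x, y in zip(a, b)) * step
    return (integ(S_t, R_target) * integ(S_r, s_meter)) / (
            integ(S_t, s_meter) * integ(S_r, R_target))

# Hypothetical 5-point spectra on a 450-650 nm grid (50 nm steps)
S_test = [0.9, 0.4, 0.3, 0.2, 0.1]            # e.g., a bluish test source
S_ref = [0.2, 0.5, 0.8, 1.0, 0.9]             # e.g., a tungsten-like reference
V = [0.038, 0.323, 0.995, 0.631, 0.107]       # target function, here V(lambda)
s_resp = [0.050, 0.310, 0.980, 0.650, 0.120]  # actual meter responsivity
print(f"F = {mismatch_factor(50, S_test, S_ref, s_resp, V):.4f}")
```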

Summary

The key measurements required in order to characterize the performance of a display are luminance, color, spatial uniformity, angular distribution, reflectance, and temporal characteristics. These provide basic, underpinning information relating to the quality, usability, and legibility of the display under different conditions of use. This section has provided an overview of the methods, instrumentation, and major sources of potential error and uncertainty associated with these measurements; more details on all these aspects are provided in section "▶ Display Metrology."

Acknowledgments

This work was funded by the National Measurement Office of the UK Department for Business, Innovation and Skills.

Further Reading

Commission Internationale de l'Éclairage (1982) CIE 53: methods of characterising the performance of radiometers and photometers. Commission Internationale de l'Éclairage, Vienna. www.cie.co.at/
Commission Internationale de l'Éclairage (1984) CIE 63: the spectroradiometric measurement of light sources. Commission Internationale de l'Éclairage, Vienna. www.cie.co.at/
DeCusatis C (1998) Handbook of applied photometry. Springer, New York
Grum F, Becherer RJ (1979) Optical radiation measurements volume 1: radiometry. Academic, New York
Hopkinson GR, Goodman TM, Prince SR (2004) A guide to the use and calibration of detector array equipment. SPIE Press, Bellingham
Keller PA (1997) Electronic display measurement: concepts, techniques and instrumentation. Wiley, New York


Kelley EF (2006) Diffuse reflectance and ambient contrast measurements using a sampling sphere. In: Proceedings of the 3rd Americas display engineering and applications conference, Society for Information Display, Atlanta, pp 1–5. Available from National Institute of Standards and Technology. ftp://ftp.fpdl.nist.gov/pub/reflection/ADEAC06_Sampling_Sphere_2-1.pdf
Kelley EF, Jones GR, Germer TA (1998) Display reflectance model based on the BRDF. Displays 19:27–34
McCluney WR (1994) Introduction to radiometry and photometry. Artech House, Boston
Rykowski R, Kreysar D, Wadman S (2006) The use of an imaging sphere for high-throughput measurements of display performance – technical challenges and mathematical solutions. In: SID international symposium digest of technical papers 37, pp 101–104
Woolliams ER, Baribeau R, Bialek A, Cox MG (2011) Spectrometer bandwidth correction for generalized bandpass functions. Metrologia 48:164–172


Digital Image Storage and Compression

Tom Coughlin

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Image Capture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Still Image Formats and Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Video Image Storage Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Image Storage Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Storage Requirements for Image Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Directions for Further Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Reading List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Abstract

There are many still and video image formats in current use. These vary by resolution and compression technique used. Higher-resolution formats are generally used in content capture, while content distribution uses compressed formats to reduce bandwidth demand. This section exhibits examples of digital image (still and video) storage requirements for various common formats and resolutions. It also includes discussion of the types of storage devices used in the capture of digital still and video images, as well as storage used for field editing and archiving. Future content will require even greater storage and bandwidth as demand for higher-resolution and more immersive content drive developments and changes in the required technology.

T. Coughlin (*) Coughlin Associates, Atascadero, CA, USA e-mail: [email protected] # Springer-Verlag Berlin Heidelberg 2015 J. Chen, W. Cranton (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_23-2


List of Abbreviations

2K – Generally refers to video resolution roughly 2,000 × 1,000 pixels
3D – Refers to stereoscopic images giving an illusion of depth
4K – Generally refers to video resolution roughly 4,000 × 2,000 pixels
8K – Generally refers to video resolution roughly 8,000 × 4,000 pixels
Blu-ray – A compressed video format used in Blu-ray disks
Charge-coupled device (CCD) – An electronic image capture technology that detects light levels using light-sensitive semiconductor circuits
GIF – Graphics Interchange Format, a lossless compressed format commonly used in Web sites
HDTV – High-definition TV
JPG – Joint Photographic Experts Group (also known as JPEG), a lossy compressed format that is popular in digital still cameras
Lossless compression – Compression of content that allows the recovery of all of the original content resolution with image processing
Lossy compression – Compression of content that results in irrecoverable loss in the resolution of the original content
LZW – Lempel-Ziv-Welch compression, a popular lossless compression algorithm
MPEG-2 – The compressed video format used in DVDs
MPEG-4 – A compressed video format often used for Web and mobile phone content
PNG – Portable Network Graphics, a lossless image format sometimes used for Web content
Raw – An image format (often proprietary to the camera used) that captures the raw camera sensor output
SDTV – Standard definition TV
TIF – Tagged Image File Format, which can be compressed or uncompressed. This still image format is often used in image archiving
Transcoding – Transforming content from one image format to another
Ultra-HDTV – Very high-resolution video format for next-generation video


Introduction

Key factors influencing the size of image files are (1) the increased sensitivity and lower cost of image sensors, (2) the availability of faster and more integrated electronics allowing rapid processing of images and content compression, and (3) the decreasing price and increasing availability of digital storage. All of these factors have enabled the increasing resolution of digital still and video cameras.

Image Capture

Various types of storage devices have been used in image capture. Analog silver halide film produces still and video images of very high resolution, which has only recently been matched by the very highest resolution digital images. Today's digital still and video cameras use charge-coupled devices (CCDs) or active pixel CMOS sensors. There are many factors in comparing the visual quality of digital images versus analog images on, e.g., 35-mm silver halide film, and many of the differences will only show up if the image is blown up to look at details. It is estimated that there are about 20 "quality" megapixels in a high-end film camera with a good lens system, with the finest-grained film, in good quality light. If the light level is lower or a lower quality lens system is used, the "quality" megapixels may be as low as 4 (or even less) (How Many Pixels Are There. . .).

Still Image Formats and Compression

Still digital imaging using electronic cameras produces data files of many different formats, as well as "Raw" formats. The "Raw" data captured by a camera's sensors is often processed by the camera electronics to produce a lossy or lossless image. Lossy compressed still image formats lose some of the original image detail, while lossless compressed image formats allow the details of the original image to be reconstructed. The maximum lossless compression possible in still images is about 50 % (Rabbani and Jones 1991). While lossless compression can save storage space, processing the compression algorithm to recover the image makes the time to open or save files longer. If a captured raw image is 3,000 × 2,000 pixels in size (or 6 megapixels) and has 24 bits (or 3 bytes) of color information per pixel, then the total size of this image is 6 megapixels × 3 bytes = 18 megabytes. Compression can be used to make the required storage capacity smaller than this. Some of the more popular still image formats include TIF, PNG, JPG, and GIF. There are also several uncompressed "Raw" formats. Table 1 compares some information about these formats (Digital Photography Photo File Formats). Note that the popular JPG digital still images use a lossy compressed format.


Table 1 Some characteristics of common still image file formats

Still image format | Color depth | Compression | Loss of detail on saves?
TIF | Variable | Lossless | No
PNG | Variable | Lossless | No
JPG | 24 | Lossy | Yes
GIF | 8 | Lossless | No

Also, image compression (pixels only) may be separate from the number of colors that are used in the image (essentially a color compression).

TIF or TIFF is Tagged Image File Format (compressed or uncompressed) (Tagged Image File Format). The TIF format is supported by several image-processing applications and is often the archiving format of choice for still image content. The header section of a TIF file contains color depth, color-encoding, and compression-type information. TIF images can use LZW lossless compression. LZW compression uses a table-based lookup algorithm invented by Abraham Lempel, Jacob Ziv, and Terry Welch. The LZW algorithm looks for reoccurrences of byte sequences in its input. The table maps input strings to their associated output codes. The table initially contains mappings for all possible strings of length one. Input is taken one byte at a time to find the longest initial string present in the table. The code for that string is output, and the string is then extended with one more input byte, b. A new entry is added to the table, mapping the extended string to the next unused code (obtained by incrementing a counter). The process repeats, starting from byte b. The number of bits in an output code, and hence the maximum number of entries in the table, is usually fixed; once this limit is reached, no more entries are added (Welch 1984; Ziv and Lempel 1977).

PNG is Portable Network Graphics (standardized compression). This is a lossless compressed image format (A Basic Introduction to PNG Features). This image format is sometimes used for Web content, although GIF images are more common. Note that PNG is the default image format for Apple computers. PNG supports true color (up to 48 bits), grayscale (up to 16 bits), and palette-based (8 bits) color. This format was designed to replace the older GIF format. PNG has three main advantages over GIF for Web content: alpha channels (variable transparency), gamma correction (cross-platform control of image brightness), and two-dimensional interlacing (a method of progressive display). PNG also compresses better than GIF, by around 5–25 %. PNG is a still-image-only format, while GIF supports animation as well.

JPG or JPEG is Joint Photographic Experts Group (variable compressed format) (JPEG). This is the most popular image format produced by many digital still cameras. JPG is generally a lossy compressed image format, and it is generally used to limit the size of the image files, so more images can be recorded on a given storage capacity. JPG files may be compressed to 1/10 of the size of the original data. Every time JPG images are processed or edited, more resolution is lost. It is much better to save images in a lossless format and only generate JPG images when a smaller format is needed for email or a Web site.

GIF is Graphics Interchange Format (compressed format) (GIF Files). This format was created in the 1980s by CompuServe Information Service for transmitting images across data networks. When the World Wide Web was created in the 1990s, GIF was adopted as the primary image format for Web sites. GIF files use a lossless compression scheme to keep file sizes at a minimum, but they are limited to 8-bit (256 or fewer colors) color palettes. The GIF file format uses LZW (Lempel-Ziv-Welch) file compression, which squeezes out inefficiencies in the data storage without losing data or distorting the image. The LZW compression scheme is best when compressing images having large fields of homogeneous color. It is less efficient at compressing complicated pictures having many colors and complex textures. As an example of the relative sizes of the same image saved with various formats and compression settings, see Table 2 (Digital Image File Types Explained).

Table 2 File sizes for the same image saved in common still image formats

Image file type | Size of file in KB
TIFF, uncompressed | 901
JPG, high quality | 319
JPG, medium quality | 188
JPG, low quality | 50
PNG, lossless compression | 741
GIF, lossless compression (only 256 colors) | 286
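To make the table-building concrete, here is a minimal Python sketch of the LZW scheme described above. It emits integer codes rather than packed variable-width bits, and it omits the table-size limit and reset codes used by real GIF and TIF writers.

```python
def lzw_compress(data: bytes) -> list:
    """Minimal LZW: map byte strings to integer codes (Welch 1984).

    The table starts with all 256 single-byte strings; each time the
    longest known prefix is emitted, that prefix extended by one more
    byte is added to the table as a new entry.
    """
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    result = []
    prefix = b""
    for byte in data:
        candidate = prefix + bytes([byte])
        if candidate in table:
            prefix = candidate            # keep extending the match
        else:
            result.append(table[prefix])  # emit code for longest match
            table[candidate] = next_code  # learn the extended string
            next_code += 1
            prefix = bytes([byte])
    if prefix:
        result.append(table[prefix])
    return result

# Repetitive input compresses well; random input would not.
data = b"ABABABABABABABAB"
codes = lzw_compress(data)
print(f"{len(codes)} codes for {len(data)} input bytes")  # 7 codes for 16 bytes
```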

Video Image Storage Formats

Table 3 compares digital storage capacity, pixels/frame, frame rate, and streaming bandwidth requirements for various uncompressed and compressed video formats (Coughlin 2008; Common Video Data Rates; 2006; Video Data Specifications). Compression of this content can change these numbers considerably and is very common in content distribution. Note that MPEG-4, DVD MPEG-2, and Blu-ray are all compressed distribution formats, while the remaining formats are professional format sizes without compression for distribution. It is clear that richer digital content requires much greater storage capacity and bandwidth. 4K Ultra-HD has over 12 times the data rate of today's HDTV (NHK Broadcast Ultra HD). A few simple equations can be used to calculate the overall digital storage requirements for an hour of video content with a given level of pixel surface density. Equations 1 and 2 show how the data rate can be calculated for a three-color RGB or YUV video.

Table 3 Storage and streaming bandwidth requirements for various formats of video content

Format | Pixels in a frame (width × height) | Frame rate (fps) | Data rate (MBps) | Storage capacity/hour (GB)
MPEG-4 (compressed) | Varies | Varies | ~0.750 | ~0.337
DVD MPEG-2 (NTSC, compressed) | 720 × 480 | 29.97 | 1.22 | 4.39
Blu-ray disk (compressed) | 1,920 × 1,080 | 24 | 4.56 | 16.4
SDTV (NTSC, 4:2:2, 8 bits) | 720 × 480 | 29.97 | 31 | 112
HDTV (1080p, 4:2:2, 8 bits) | 1,920 × 1,080 | 24 | 149 | 537
Digital cinema 2K (4:4:4, 10 bits) YUV | 2,048 × 1,080 | 24 | 199 | 716
Digital cinema 4K (4:4:4, 12 bits) YUV | 4,096 × 2,160 | 48 | 1,910 | 6,880
UHDTV 8K (4:4:4, 16 bits) (2012) | 7,680 × 4,320 | 120 | 23,890 | 86,000

Width × Height × Bytes/pixel × Colors = Frame Size (1)

For example, 2,048 × 1,080 × 10 bits/pixel × 3 colors = 66.3 Mb/frame = 8.29 MB/frame. (This is for a three-color RGB image. Note that this is a 10-bit-deep file.)

Frame Size × Frames per second = Data Rate (2)

8.29 MB/frame × 24 fps = 199 MB/s (2K)
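These equations are easy to mechanize; the short Python sketch below is a direct transcription of Eqs. 1 and 2 and reproduces the digital cinema 2K row of Table 3.

```python
def video_storage(width, height, bits_per_sample, colors, fps):
    """Apply Eq. 1 and Eq. 2: frame size, data rate, and storage per hour."""
    frame_bytes = width * height * (bits_per_sample / 8) * colors   # Eq. 1
    data_rate = frame_bytes * fps                                   # Eq. 2, bytes/s
    per_hour = data_rate * 3600                                     # one hour of video
    return frame_bytes / 1e6, data_rate / 1e6, per_hour / 1e9      # MB, MB/s, GB

# Digital cinema 2K: 2,048 x 1,080, 10 bits/sample, 3 colors, 24 fps
frame_mb, rate_mbps, hour_gb = video_storage(2048, 1080, 10, 3, 24)
print(f"{frame_mb:.2f} MB/frame, {rate_mbps:.0f} MB/s, {hour_gb:.0f} GB/h")
# -> 8.29 MB/frame, 199 MB/s, 716 GB/h
```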

Figure 1 graphically compares the resolution of various video formats (What Is Ultra HD?). Video images, like still images, can be compressed by a lossy or lossless compression algorithm. As with still images, lossless compression greater than about 50 % is not possible. It is likely that more advanced lossy compression technologies will be used for 4K delivery, such as HEVC (H.265) or related technologies from Netflix and other companies. The first chip products enabling HEVC decoding were released in 2014, with widespread adoption in consumer electronics over the next few years. Further details of video compression techniques are given in chapter "▶ Signal Filtering: Noise Reduction and Detail Enhancement."

Fig. 1 Resolution comparison of video formats: NTSC DVD (720 × 480), HDTV (1,920 × 1,080), UHDTV-1 4K (4,096 × 2,160), and UHDTV-2 8K (Coughlin Associates 2015)

Image Storage Devices

In digital still imaging, floppy disks were one of the first media used (Sony's Mavica Camera) (2000). Very small form factor hard disk drives (so-called 1-in. HDDs) were introduced in the late 1990s to provide very high storage capacity for professional photographers (1999). The storage capacity of 1-in. HDDs reached over 10 GB, but their production stopped about 2008. NAND-based flash memory is today the dominant storage medium for still images.

In video content capture, NAND flash memory is now the dominant storage medium; roughly 40 % of professional video cameras use flash memory (2014). Hard disk drives are the next most common due to their cost-effective storage capacity, with optical disks the next most common camera storage. Magnetic tape and film have been declining noticeably as storage media in professional as well as consumer video cameras. Flash memory and hard disk drive products can often be plugged directly into a computer, so the content can be accessed and copied. This is an advantage to users, since they do not have to play back the content as with most magnetic-tape-based cameras and all film-based video cameras. Optical disks can be inserted into an optical disk player for instant playback access and can also be inserted into and copied onto computers.

Over the last few years, NAND flash storage capacities have increased and prices have gone down, making possible flash-based video cameras with a reasonable capture time on a single flash card. Because card-based flash camcorders do not have the moving parts of a tape or optical media camera, and because they do not carry the additional cost of an internal hard disk drive, the original purchase price of consumer flash-based camcorders is generally the lowest in the market. This enabled a whole new low-cost market for camcorders and thus increased the overall market for these products. Also note that many mobile phones have still and video camera capabilities that store their images on flash memory cards. In today's world the capabilities of smartphone cameras rival those of lower-cost dedicated cameras, and as a consequence, dedicated camera sales have suffered.

Fig. 2 Comparison of storage capacity requirements and data rates for different size multimedia objects, ranging from one page of ASCII text through CD-quality stereo audio, DVD, HD, and Ultra HD movies up to virtual reality and 3D movies (data rate in Mbps versus multimedia object size; Coughlin Associates 2008)

Storage Requirements for Image Distribution

Figure 2 shows a projection for the required data rate and storage capacity to contain various sizes of multimedia objects. As can be seen from this chart, blue laser optical disks will be large enough to contain compressed HDTV-quality movies, but higher-resolution compressed content will require hundreds of GBs. It is likely that in the next few years resolutions as high as those currently used in digital cinema (4K × 2K pixels; HDTV is 2K × 1K pixels) will appear in many high-end households and will be commonplace by about 2020. In addition, humans want ever more realistic video experiences, which have even higher resolution and depth requirements. Imagine, for instance, the storage requirements for a true 3D movie displayed in full depth in the air at 4K (or higher) resolution in some sort of home projection system, viewable from every direction. Such products could well become common household products within the first 30 years of this century. The result would be demand for optical physical distribution media with storage capacities of 1 TB or higher. Disks of 1 TB or larger are a possible role for mass-produced holographic disks, for disks with a large number of optical layers, or for even higher-frequency laser-based optical disk products. Even with the rise of content distribution through the Internet, as long as the resolution requirements increase faster than the bandwidth available to most consumers, there will continue to be some demand for physical media. The mass production cost of optical media tends to be a few cents per disk, so as the storage capacity increases, the cost per unit of storage becomes very low: a Blu-ray disk consequently has a lower price per GB than a DVD. The $/GB of optical media will continue to drop if holographic recording or even higher-frequency laser optical recording moves into consumer products.

Directions for Further Research

As transport and storage technologies improve, it is likely that even higher video resolutions than 8K × 4K UHD will be developed, especially for very large format displays and special applications. In addition to higher resolution, adding true stereoscopic depth to images, especially moving images at very high resolution, will require even faster data transfer rates and storage capacities. Captured data with multiple camera angles would require immense storage by today's standards, and even a commercial 2-h movie distribution medium with very high-resolution stereoscopic content could require 1 TB (10^12 bytes) of information or greater. Professional raw video content may reach hundreds of petabytes, approaching an exabyte of raw content capacity. We must develop storage systems capable of finding and using this content as effectively as we work with today's usually smaller content files. As the ways to use content increase with many different size display devices, transcoding of one content format to another becomes important. Transcoding processes are needed that can be done in real time, enabled by faster electronic signal processing. This will ease management issues associated with multiple formats of a single piece of content.

Conclusions

Rich content requires large amounts of storage, and with growing resolution, storage requirements are increasing. Professional content continues to increase in resolution and richness, stereoscopic moving images being an example of this. Making higher-resolution content useful will require faster processing and delivery systems operating at much higher data rates. Compression of content will help in making the most use of the bandwidth available, but the amount and type of compression are a function of how the content is used. Lossy compression is generally only appropriate for content delivery. Even with the use of compression, the digital storage requirements for the professional media and entertainment markets will continue to increase.


Further Reading

(1999) One inch no cinch for IBM storage gurus. EE Times
(2000) From film to floppy to CD: new Sony Mavica first to store images on CD-R, using Cirrus Logic optical storage chip. Cirrus Logic Press Release. https://www.cirrus.com/en/company/releases/P158.html
(2006) Industry plays with ultra high definition. Inquirer
Yamashita Y et al (2012) "Super Hi-Vision" video parameters for next generation television. SMPTE Motion Imag J 212(4):63–68
(2014) Digital storage in media and entertainment report. Coughlin Associates
A basic introduction to PNG features. http://www.libpng.org/pub/png/pngintro.html
Common video data rates. Integrity Data Systems, Inc. www.integritydatasystems.net
Coughlin TM (2008) Digital storage in consumer electronics. Newnes Press, Boston
Coughlin Associates (2008) Newnes Press, Boston
Digital image file types explained. http://www.wfu.edu/~matthews/misc/graphics/formats/formats.html
Digital photography: photo file formats. www.cywarp.com/faq_digital_photo_formats.htm
GIF files. http://webstyleguide.com/graphics/gifs.html
How many pixels are there in a frame of 35mm film? Brad Templeton's Photo Pages. http://pic.templetons.com/brad/photo/pixels.html
JPEG – Joint Photographic Expert Group. http://www.scantips.com/basics9j.html
NHK Broadcast Ultra HD Web site. http://www.nhk.or.jp
Rabbani M, Jones PW (1991) Digital image compression techniques. SPIE Press, Bellingham
Tagged Image File Format. http://www.scantips.com/basics9t.html
Video data specifications. www.mpeg.org
Welch TA (1984) A technique for high performance data compression. IEEE Comput 17(6):8–19
What is Ultra HD? http://www.ultrahdtv.net/
Ziv J, Lempel A (1977) A universal algorithm for sequential data compression. IEEE Trans Inf Theory IT-23(3):337–343

Reading List

Coughlin TM (2008) Digital storage in consumer electronics. Newnes Press
Miano J (1999) Compressed image file formats. ACM Press, New York
Panasonic (2006) The video compression book. www.roadcastpapers.com
Symes P (2004) Video compression demystified. McGraw-Hill, Boston
Taubman D, Marcellin M (eds) (2002) JPEG2000: image compression fundamentals, standards and practice. The Springer international series in engineering and computer science


Video Compression

Scott Janus*
Visual and Parallel Computing Group, Intel Corporation, Folsom, CA, USA
*Email: [email protected]

Abstract

This section provides an introduction to why video compression is necessary (namely, due to technological transmission and storage constraints). After a brief historical overview, which sets the context that video compression has been with us since the first television broadcasts, it describes the fundamental compression stages used in contemporary codecs (i.e., entropy coding, domain transforms, and temporal coherence). It also describes how several of these algorithms were chosen due to various limitations and quirks of the human psychovisual system. The section concludes with a deeper look at the details of common video compression algorithms in use today, including AVC and HEVC.

List of Abbreviations

AVC – Advanced video coding
HEVC – High efficiency video coding
MPEG – Moving Picture Experts Group
RGB – Red, green, blue
VC-1 – Video codec 1

Introduction

Video refers to the electronic representation of moving images. Whereas film can be directly perceived by the human eye, video requires processing to be converted into a viewable image. As is the case with film, each individual picture does not actually move: the illusion of movement is created by rapidly presenting a series of still images. Unlike film, video is almost always compressed. Film images are uncompressed in the sense that the entire image is immediately available within a single frame. Storing uncompressed images electronically takes a large amount of space. Correspondingly, it takes a large amount of bandwidth to transfer uncompressed images from one place to another. Historically, video compression has been employed to make video systems affordable. Video compression is also used to overcome technological or physical limitations.

Chroma Subsampling

There certainly are venues that use uncompressed video. For instance, an increasing number of Hollywood movies are digitally edited and manipulated (adding special effects, for instance) using uncompressed ultra-high-definition (such as 4096 × 4096) video. This approach requires extremely large amounts of disk space and specialized high-speed networks to transfer the video from one work area to another. The expense of these real-time systems is well outside the range of affordability of the average consumer, although PC-based systems that can operate on uncompressed HD video in real time are becoming more affordable. Still, even on these systems it is typically impractical to store and transfer a large catalog of content in the uncompressed domain.

There are many different video compression techniques. In this section, we will examine some of the most commonly used approaches. Like all compression schemes, video encoding relies on detecting redundant information and replacing it with a more efficient representation. Video compression also takes into consideration the human psychovisual system by discarding information that is difficult or impossible for people to see. Almost all video is therefore lossy.

Color Difference Spaces

As discussed elsewhere in this handbook, it is very efficient to create image sensors and display devices that operate in an RGB color space (chapter "▶ RGB Systems"). These systems sample the real world using a tristimulus approach: taking a picture of the scene by conceptually using a red filter, a green filter, and a blue filter. The resulting images can be recreated by using red, green, and blue sub-pixels.

Although RGB is great for cameras and monitors, it is a very inefficient color space when it comes to human perception of reality. Many RGB spaces allocate equal bandwidth to the red, green, and blue channels, but in fact humans do not have equal sensitivity to these channels. People are most sensitive to green and least sensitive to blue. Furthermore, RGB allocates equal bandwidth to brightness and hue, even though humans can perceive changes in luminance much better than they can perceive changes in chrominance.

To address these issues, video is typically stored in a family of color spaces that is colloquially known as YUV. This terminology is not formally correct; there is no color space officially named YUV. Actual color difference spaces include Y'PrPb, Y'(B'-Y')(R'-Y'), Y'CrCb, and xvYCC. This family of spaces breaks each color into luminance and chrominance components. YUV color spaces offer advantages for video. For instance, many types of video processing such as brightness and contrast adjustment (see chapter "▶ TV and Video Processing") can be readily accomplished by operating on only the luma or chroma component. It is also possible to maintain independent resolution for luminance and chrominance information.

A commonly used digital YUV format is known as YCrCb. For today's consumer applications, 8 bits are assigned for each Y, Cr, and Cb sample. 10-bit and 12-bit versions are also used in certain professional applications. The highest fidelity version of YCrCb uses one Y, Cr, and Cb sample for every pixel, as shown in Fig. 1. This approach is known as 4:4:4 and is very intuitive.

Fig. 1 4:4:4 sampling: a Y sample and a Cr, Cb sample pair at every pixel

A 4:4:4 video signal can be compressed by downsampling the resolution of the chrominance. Although this is a lossy technique, it is not perceptually intrusive because humans do not notice color differences as readily as they do intensity differences. Almost all consumer video is stored and transmitted in a 4:2:0 format (Fig. 2), wherein there is only one chroma pair sample for every 2 × 2 grid of pixels. By reducing the chroma resolution to 25 % of 4:4:4, the 4:2:0 space offers substantial bandwidth savings with relatively minimal artifacts. Specifically, 8-bit 4:2:0 has 12 bits per pixel, compared to 8-bit 4:4:4 that has 24 bits per pixel.

Fig. 2 4:2:0 sampling: one Cr, Cb sample pair for every 2 × 2 grid of Y samples

Although chroma-subsampled video is compressed compared to 4:4:4, such content is generally labeled as uncompressed in the vernacular of the video industry. In this domain, video compression refers to a set of well-known techniques for converting YUV data into a compressed bitstream. The most commonly used compression standards for widespread consumer consumption are MPEG-2, H.264, and VC-1. These will be discussed in more detail later in the chapter. However, the underlying concepts of all these formats, as well as other formats such as MPEG-1 and H.263, are fundamentally the same. As such, we will first review these basic techniques and later describe how each of the formats uses them differently.
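To make the bandwidth arithmetic concrete, the following Python/NumPy sketch converts an RGB frame to luma plus 4:2:0 chroma by averaging each 2 × 2 chroma neighborhood. The BT.601 full-range conversion coefficients are an assumed example; a real pipeline would follow the exact matrix, offsets, and rounding of the relevant standard.

```python
import numpy as np

def rgb_to_ycbcr_420(rgb):
    """RGB image (H, W, 3), H and W even -> full-res Y plus 4:2:0 chroma.

    BT.601 full-range coefficients are assumed here for illustration.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b       # luma: full resolution
    cb = 128 + 0.564 * (b - y)                  # blue color difference
    cr = 128 + 0.713 * (r - y)                  # red color difference
    # 4:2:0 - one chroma pair per 2x2 pixel block: average each neighborhood
    cb = cb.reshape(cb.shape[0] // 2, 2, cb.shape[1] // 2, 2).mean(axis=(1, 3))
    cr = cr.reshape(cr.shape[0] // 2, 2, cr.shape[1] // 2, 2).mean(axis=(1, 3))
    return y, cb, cr

rgb = np.random.randint(0, 256, (1080, 1920, 3)).astype(np.float64)
y, cb, cr = rgb_to_ycbcr_420(rgb)
samples = y.size + cb.size + cr.size
print(8 * samples / (1080 * 1920))   # 12.0 bits per pixel at 8 bits/sample
```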

Predictive Coding

There is a lot of redundancy within video. For a typical video stream, any two adjacent frames are likely to be quite similar. A great amount of compression can be had by taking advantage of temporal coherence between adjacent frames. For example, rather than completely storing a frame, we can instead take the approach of just storing the differences between the current frame and one or more reference frames. Conceptually, in the case of a camera pan of a static scene, picture 1 could be stored as "picture 0 offset by two pixels to the left and one pixel downward." The offset between one image and another is stored as a two-dimensional motion vector. This temporal compression technique is an example of predictive coding, so called because a compressed sample is estimated from a previous decoded sample. When the prediction is done from a different frame, it is known as inter-coding. There is also intra-frame predictive coding, wherein a particular sample is estimated from adjacent samples.

In the case of inter-coding, the process of estimating the vectors during encoding is known as motion estimation. The decoding equivalent is known as motion compensation. Merely storing motion vectors is not sufficient for general video. Motion information will not capture changes in brightness or color, handle the introduction of new elements to the scene (such as someone walking on-screen), or handle changes in perspective due to camera movement. In practice, therefore, error correction data is also stored. This difference between the block being coded and the reference block is known as the error prediction signal or residual. Inter-coding consists of creating a predicted picture using motion vectors applied to one or more reference frames and then adding the residual terms to the resulting prediction. Inter-coding only offers compression if the storage, or coding, of the picture as motion vectors and error terms is smaller than describing the picture in a completely self-contained manner. If inter-coding a particular region of the image does not create good compression – as might be the case during a scene change – then that region can instead be intra-coded.

Intuitively, it makes sense to apply motion vectors only to regions of the scene that are moving. For instance, if somebody was waving their hand, you would ideally want to define the outline of the hand and specify a motion vector that applied just to that region. Although precisely this type of coding has been defined (ISO/IEC 14496-2), it is rarely used in practice. It is simply too expensive to perform this type of detailed image preprocessing to detect oddly shaped regions of movement. Also, it requires a relatively large number of bits to describe arbitrarily shaped areas. The more practical approach used by modern codecs is to decompose the scene into blocks of variable size. These blocks crudely define regions of motion but are efficient to code. One or more motion vectors are associated with each of these blocks. Specifically, the image is typically broken into 16 × 16 pixel regions known as macroblocks. A macroblock may contain a single motion vector, or it may be broken into smaller units known as blocks, each of which has its own motion vector. These blocks can be as small as 4 × 4 samples, although 8 × 8 blocks are quite common. Motion vectors are derived mathematically, and the algorithms used to derive them have no true understanding of the scene. As such, the derived motion vectors may not actually correspond to true motion on the screen.

The simplest form of motion compensation uses a single reference picture. These types of frames are known as predicted or P frames. P frames usually reference a self-contained or intra-coded (I) frame. Within a P picture, individual blocks can be coded as intra- or inter-blocks. Intra-blocks are known as I blocks, and in this case the inter-coded blocks are P blocks. When dealing with P frames, the reference I picture comes chronologically before the P frame. In this case, the reference frame is known as a backward reference. Bidirectionally predicted (B) frames use forward and backward references. Within a B-frame, each block can be an I block, P block, or a B block. The B blocks have at least two motion vectors: one for the forward reference and one for the backward reference. Some formats use an equal weighting between the forward and backward reference. More complex codecs allow for the forward and backward references to be weighted, so that one can count more heavily toward the predicted value. Classical B-frames use I and P pictures as references, never other B pictures. The bidirectional prediction allows for B-frames to be more highly compressed than P or I frames. As such, most streams consist primarily of B-frames. However, I frames must still be used (typically once every second or so) to make sure the picture quality does not degrade too far. It is hypothetically possible to encode an entire movie using the first frame as an I picture, the last frame as a P picture, and every one of the frames in between as a B-frame, but obviously the prediction of all those frames would be horribly inaccurate and have massive error terms.
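A toy full-search block matcher shows the core of motion estimation: for one block of the current picture, scan a small window of the reference picture and keep the motion vector with the smallest sum of absolute differences (SAD). Production encoders use smarter search strategies; the exhaustive scan and the SAD metric here are illustrative choices.

```python
import numpy as np

def best_motion_vector(ref, cur, bx, by, size=16, search=8):
    """Full-search SAD block matching for the block at (bx, by) in `cur`."""
    block = cur[by:by + size, bx:bx + size]
    best, best_sad = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + size > ref.shape[1] or y + size > ref.shape[0]:
                continue  # candidate block falls outside the reference frame
            cand = ref[y:y + size, x:x + size]
            sad = np.abs(block.astype(int) - cand.astype(int)).sum()
            if sad < best_sad:
                best_sad, best = sad, (dx, dy)
    return best, best_sad  # vector, plus the residual energy it leaves behind

# Shift a frame two pixels right and one down; the matcher recovers the move.
ref = np.random.randint(0, 256, (64, 64))
cur = np.roll(np.roll(ref, 1, axis=0), 2, axis=1)
print(best_motion_vector(ref, cur, 16, 16))
# ((-2, -1), 0): the matching reference block lies 2 left and 1 up
```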

Fig. 3 Picture dependencies: B1 and B2 depend on both I0 and P3

I frames must also be inserted frequently to handle corrupted bitstreams or the scenario of beginning playback in the middle of the stream, as may be the case when changing television channels or jumping to a particular scene in a movie. Any such random access must begin on an I frame; otherwise the P and B pictures will use a reference that is not available, and the frame cannot be reconstructed. The precision of the motion vectors varies among formats. The simplest ones use a half-pixel resolution (often referred to as half-pel), while more sophisticated ones use quarter-pel or even eighth-pel. Generally speaking, the greater the motion vector precision, the better the compression ratio that can be achieved. Further compression can be achieved by taking advantage of motion coherence. In most scenes, the motion vector of a particular block is likely to be very similar to the motion vectors of neighboring blocks. As such, compression efficiency can be further improved by predictively coding the motion vectors themselves. As an example, the motion vector of a block might be coded as the difference, or error, from the motion vector of the block immediately to its left.

Coded Versus Frame Order

The fact that predicted pictures may refer to forward references has a serious implication: the order of the frames in the bitstream, also known as the coded order, must be different than the display order. For instance, in a simple four-picture sequence consisting of an I picture, a P picture, and two B pictures, the display order would be I, B, B, P. To help identify the frames, we can assign them numbers based on their display order, to wit: I0, B1, B2, P3. The B pictures cannot be decoded until the P picture has been decoded; therefore P3 must come in the bitstream before the B-frames. As such, the coded order of the pictures in the bitstream is I0, P3, B1, B2. These dependencies are illustrated in Fig. 3. After a decoder has applied the motion compensation to decode the frames, it must reorder them to place them back into display order.
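The reordering step itself is simple bookkeeping, as the toy sketch below illustrates: pictures arrive and are decoded in coded order and are emitted sorted by their display index. (A real decoder bounds this buffering; here the whole group of pictures is assumed to fit in memory.)

```python
# Coded order: references must be decoded before the pictures that use them.
coded_order = ["I0", "P3", "B1", "B2"]

decoded = {}
for pic in coded_order:
    decoded[int(pic[1:])] = pic      # "decode" each picture as it arrives

display_order = [decoded[i] for i in sorted(decoded)]
print(display_order)                 # ['I0', 'B1', 'B2', 'P3']
```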

Transform Coding

Both intra-coded blocks and residuals tend to have spatial redundancy. By transforming from the spatial domain to a different domain, the redundancy can be reduced and the resulting signal more efficiently coded. A well-chosen transform can decorrelate the spatial redundancies and result in a set of coefficients that can be efficiently encoded. A good transform also allows for perceptual coding that takes advantage of the fact that humans are more sensitive to some spatial frequencies than others. The transform itself does not actually provide any compression, but converts the data into a format that is more amenable to compression. The 8 × 8 discrete cosine transform (DCT) was a commonly used transform for video coding, although it has been replaced by simpler 4 × 4 transforms in modern codecs.
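As an illustration of why the DCT is a good choice, the following NumPy sketch builds the orthonormal 8 × 8 DCT-II basis from its textbook definition and applies it to a smooth ramp block; nearly all of the energy lands in a handful of low-frequency coefficients. (Standardized codecs use scaled integer approximations rather than this floating-point form.)

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (the transform used in MPEG-2)."""
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)    # the DC row has a different normalization
    return c

C = dct_matrix()
# A smooth horizontal ramp: highly correlated, like most natural image blocks.
block = np.tile(np.arange(8, dtype=float), (8, 1))
coeffs = C @ block @ C.T          # separable 2-D transform
print((np.abs(coeffs) > 1e-6).sum(), "significant coefficients out of 64")
# prints 5 - the other 59 coefficients are (numerically) zero
```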


Quantization

The outputs of the transform operation are quantized to provide further compression. Unlike the preceding stages, quantization is a lossy operation. The quantization is done in the transform domain so as to provide perceptual coding: higher precision is given to the coefficients that are most noticeable to the human eye. Typically, low-frequency components are quantized with finer granularity, while high-frequency components are quantized more coarsely. Motion vectors are also quantized. It is worth noting that quantization is the only irreversible stage of encoding. Information is destroyed and cannot be recovered. It is this stage that makes video compression algorithms lossy. Therefore, picking the correct quantization parameters is of great importance in order for the reconstructed signal to be acceptable. The quantization stage increases the number of zero-value coefficients in a block, as many low-value components get rounded down to zero. For example, a typical 4 × 4 block in H.264 only has four nonzero samples. This outcome will prove helpful in the next stage of encoding.
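A minimal sketch of the idea, using a made-up 4 × 4 step-size matrix that grows with spatial frequency: quantization is a divide-and-round, inverse quantization is the corresponding multiply, and the rounding is exactly where the loss occurs.

```python
import numpy as np

# Hypothetical 4x4 step sizes: coarser steps for higher spatial frequencies.
qstep = 4 + 2 * (np.arange(4)[:, None] + np.arange(4)[None, :])

coeffs = np.array([[52, -11,  3, 1],
                   [-9,   4, -1, 0],
                   [ 2,  -1,  0, 0],
                   [ 1,   0,  0, 0]], dtype=float)

levels = np.round(coeffs / qstep)     # lossy: information is discarded here
reconstructed = levels * qstep        # inverse quantization (a multiplication)

print(int((levels != 0).sum()), "nonzero levels of 16")  # most round to zero
print(np.abs(coeffs - reconstructed).max(), "max reconstruction error")
```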

Entropy Coding

Entropy coding refers to the technique of efficiently coding symbols using variable length codes. Variable length coding works by using short codes for common symbols and longer codes for infrequent symbols. In the case of English, for instance, we would want to code the common letter E with a short symbol and the less common letter Z with a longer symbol. Consider a simple coding scheme where every letter is coded as 5 bits: A is coded as 00000, B as 00001, up to Z coded as 11001. Regardless of the text being coded, it will always average out to 5 bits per letter. Now consider a variable length coding scheme tuned to English letter frequency, say, E is 011, T is 000, A is 1100, and Z is 11111101010. At first glance, it seems unlikely that this is an efficient scheme, since some letters are coded at much more than 5 bits. However, for large passages of text, compression does indeed happen. For instance, a word like TEATIME can be encoded at about 3.7 bits per letter.

Effective variable length encoding requires a coding scheme that is tuned to the data being coded. A code table designed for standard English text would not fare as well when applied to acronym-riddled instant messages (i.e., TTYL, LOL, CYA). The code would also be very inefficient when applied to a stream of random letters. There need to be some symbols that consistently appear more frequently than others in order for variable length coding to be effective. Luckily, this condition is true for natural video content that has been predicted, transformed, and quantized. As such, video compression schemes use variable length encoding to great effect. The two fundamental approaches are using a fixed coding table for all video content and using adaptive coding. Fixed code tables require careful analysis of a wide variety of video streams during the creation of the compression standard. Once this prior work is complete, implementation of an encoder or decoder is relatively inexpensive, as the table can be hard-coded into place. However, since this approach provides a coding scheme that is effective on average for every possible video stream, there are certainly going to be specific video streams that are not coded as efficiently as others. The converse approach of context adaptive entropy coding allows for the encoding-time creation of a coding table that is well tuned for the current video stream. However, this efficiency comes at the cost of great complexity in the encoder and decoder.
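The TEATIME figure can be reproduced directly. In the snippet below, the codes for E, T, and A are the ones quoted above; the codes for I and M are hypothetical prefix-free fillers invented only so that the word can be completed.

```python
# E, T, A codes are from the example in the text; I and M are hypothetical
# prefix-free fillers chosen only to complete the word TEATIME.
codes = {"E": "011", "T": "000", "A": "1100", "I": "1010", "M": "111100"}

word = "TEATIME"
bits = "".join(codes[ch] for ch in word)
print(f"{len(bits) / len(word):.2f} bits per letter")  # 3.71 vs 5 fixed-length
```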


Deblocking

The block-based transform can sometimes lead to noticeable visual artifacts, such as obvious discontinuities between neighboring blocks. This blockiness most notably manifests when insufficient bits are allocated to compress the picture, which commonly happens in low-bit-rate video or in scenes of high complexity, such as a waterfall. Deblocking filters can be applied as part of the decoding process to alleviate these artifacts. The deblocking operation does not provide any compression in and of itself, but for some types of video it does allow a lower bit rate to be used without a noticeable degradation in visual quality. Deblocking can be applied to reconstructed pictures before they are used as references for other pictures. This approach is known as in-loop deblocking. Filtering can also be applied out of loop, where the deblocked pictures are not used for any predictions but are just intended to be displayed.
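A deliberately simple one-dimensional sketch of the idea: soften the pair of samples straddling each block boundary, but only when the step across the boundary is small enough to look like a coding artifact rather than a real edge. The block size, threshold, and filter strength here are arbitrary illustrative values; real codec filters adapt their strength per edge using quantization parameters and coding modes.

```python
import numpy as np

def deblock_row(row, block_size=8, threshold=12):
    """Soften small steps at block boundaries in a 1-D row of samples."""
    row = row.astype(float).copy()
    for edge in range(block_size, len(row), block_size):
        p, q = row[edge - 1], row[edge]    # samples straddling the boundary
        if abs(p - q) < threshold:         # large steps are real edges: keep
            delta = (q - p) / 4.0
            row[edge - 1] += delta         # pull both sides together
            row[edge] -= delta
    return row

row = np.array([100] * 8 + [108] * 8)      # a mild blocking artifact
print(deblock_row(row)[6:10])              # [100. 102. 106. 108.]
```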

Decoding

It is worth noting that video compression specifications typically specify the behavior of the decode operation and the bitstream syntax, but not how an encoder operates. Given a valid compressed bitstream, there will be only one valid decompressed set of frames. It is usually relatively straightforward to determine if a decoder is compliant with the spec by seeing if the output of a candidate decoder matches that of a reference decoder. By comparison, there is a many-to-one mapping of uncompressed frames to valid compressed bitstreams. Two different encoders can come up with completely different – yet equally valid – compressed bitstreams. Even the output of a single encoder can vary wildly depending on how parameters such as target bit rate are adjusted. This approach allows encoder developers to differentiate based on quality and performance while maintaining industry-wide playback compatibility. Much of the value of a particular encoder stems from the peculiarities of its underlying algorithms. There are professional-grade encoders that offer radically better quality compared to more inexpensive applications.

Let us now put all of the aforementioned compression tools together and see how they interoperate. Figure 4 is a block diagram of a generalized decoder. It is a simplified schematic showing the operations common to classical compression algorithms such as MPEG-2. Beginning with the compressed bitstream, entropy decoding is applied. The primary outputs of this stage are quantized coefficient data and motion vectors. On a block-by-block basis, the quantized coefficient data is scaled in a process known as inverse quantization (which is simply a multiplication). The results are moved back into the spatial domain via an inverse transform. At this point, the block either contains actual pixel samples or error prediction terms. In the latter case, the block is added to a predicted block to generate pixel samples. The predicted block is created either using motion vectors (inter-prediction) or by predicting values within the block itself (intra-prediction). The final stage is the application of a deblocking filter, resulting in the final uncompressed picture. The resulting picture may now also be used as a reference for decoding other pictures.

Fig. 4 Generalized decoder block diagram: entropy decoding feeds inverse quantization and the inverse transform; the result is added to a motion-compensated or intra prediction, deblocked, and stored in memory as an uncompressed picture
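The decode flow of Fig. 4 can be rendered as runnable skeleton code. Every stage below is a trivial stand-in (an identity transform, copy-the-last-reference motion compensation, and so on); only the order of operations and the feedback of decoded pictures into reference memory mirror the figure. None of the names correspond to a real codec API.

```python
import numpy as np

# Toy stand-ins for each decoder stage of Fig. 4 (illustrative only).
def entropy_decode(bitstream):        return bitstream    # list of "units"
def inverse_quantize(levels, step=8): return levels * step
def inverse_transform(coeffs):        return coeffs       # identity for the toy
def motion_compensate(refs, mv):      return refs[-1]     # copy last reference
def deblock(picture):                 return picture

def decode_picture(bitstream, reference_memory):
    """Order of operations of the generalized decoder in Fig. 4."""
    blocks = []
    for unit in entropy_decode(bitstream):
        residual = inverse_transform(inverse_quantize(unit["levels"]))
        if unit["intra"]:
            prediction = np.zeros_like(residual)  # trivial intra prediction
        else:
            prediction = motion_compensate(reference_memory, unit["mv"])
        blocks.append(prediction + residual)      # prediction plus residual
    picture = deblock(np.concatenate(blocks))
    reference_memory.append(picture)              # may reference later pictures
    return picture

refs = [np.full(4, 100.0)]
stream = [{"levels": np.array([1.0, 0, -1, 0]), "intra": False, "mv": (0, 0)}]
print(decode_picture(stream, refs))   # [108. 100.  92. 100.]
```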

Encoding

Encoding is a more complicated procedure than decoding. This reality is not accidental. In most applications, including broadcasts, webcasts, and optical disk storage, the ecosystem has many more decoders than encoders. It thus makes a great deal of economic sense to have relatively inexpensive decoders available to consumers and limit more costly encoders to the smaller number of content providers. There are also scenarios such as teleconferencing where there is a 1:1 ratio of encoders and decoders.

Fig. 5 Generalized encoder block diagram: the transform, quantization, and entropy coding stages are wrapped around an embedded decoding loop (inverse quantization, inverse transform, prediction, and deblocking) that reconstructs the reference pictures

The encoder contains most – if not all – of a decoder, as can be seen by careful inspection of Fig. 5. Specifically, the inverse quantization, inverse transform, deblocking, and motion compensation stages of the decoder are used in encoding. The entropy decoder is not necessarily needed, although it is often included in more sophisticated multi-pass implementations that are optimized for high-quality compression. When an uncompressed block is fed into the encoder, the first thing that must happen is the encoder must decide if the block will be inter-coded or intra-coded. If the block is part of an I picture, it must naturally be intra-coded. However, for most blocks in P and B pictures, each block can be either inter-coded or intra-coded. Generally, the encoder must compare the coding efficiency of each case and then choose which one provides the most efficient representation. This mode decision is typically made by performing motion compensation on the block and comparing the results with intra-coding it. The mode decision also has to determine a block shape, motion vector accuracy, motion vector prediction mode, as well as other coding details. It is a complex operation. Just considering motion vector selection, several different vectors have to be examined. The motion compensation stage ultimately picks the one(s) that offers the best result given a finite search area or search time. There are various algorithms for selecting motion vectors. Many of these techniques are proprietary, as the choice of motion vectors has a major impact on the final quality of the compression. The optimal result can always be found by exhaustively searching all possible candidates, but this is computationally impractical. For quarter-pel precision, there are over 30 million candidates for a single high-definition reference frame!

Once a final motion vector has been chosen, it is used to form a predicted block. This block is subtracted from the original block to form the prediction error coefficients. These coefficients are transformed and quantized, with the result being entropy coded along with the motion vector. With this generalized concept of video encoding and decoding in mind, we will now embark on a brief survey of popular compression standards.

MPEG-2

MPEG-2 is one of the most widely used video compression formats in the world today. It is the basis of many terrestrial and satellite television systems, is used in DVD-Video disks, and is one of the three formats supported in Blu-ray. It was created by the Moving Picture Experts Group (MPEG) in 1994 to replace MPEG-1. MPEG-1, designed to store VHS-quality video at CD bit rates, had very limited applicability. MPEG-2 was designed to support a wide range of applications, including high-definition studio-grade video. MPEG-2 has support for 4:2:0, 4:2:2, and 4:4:4 video. The MPEG-2 standard includes video and audio compression schemes, as well as multiplexed stream formats. MPEG-2 video is specified in ISO/IEC 13818-2 (ISO/IEC 13818-2). It is also specified as ITU-T Recommendation H.262, but the MPEG-2 moniker is far more common.

MPEG-2 uses half-pel precision motion vectors applied to blocks of size 16 × 16, 16 × 8, or 8 × 8. The format does not offer any sophisticated intra-prediction. Its P pictures use a single reference, and the B pictures use a forward and a backward reference. The B pictures use an equal weighting between the two references. The entropy coding uses a fixed variable length coding table. The only supported transform is an 8 × 8 discrete cosine transform (DCT). MPEG-2 is a very successful codec and has been widely adopted throughout the industry. It will continue to be used for quite some time, even though more recent formats – namely, the next two formats we will discuss – offer better compression quality.

AVC

A follow-on to MPEG-2 was standardized by MPEG and ITU in ISO/IEC 14496-10 (ISO/IEC 14496-10) and ITU-T Recommendation H.264. It is colloquially referred to by many different names. AVC and H.264 are the two most common labels; other variations include MPEG-4/AVC and MPEG-4 Part 10. It is occasionally referred to as JVT, because it was developed by the Joint Video Team. Sometimes the term MPEG-4 is used, but this term is ambiguous, as there is also a video standard known as MPEG-4 Part 2. Part 2 defines a standard that is incompatible with Part 10. AVC has found much wider adoption in the industry than MPEG-4 Part 2.

Fig. 6 Multiple backward references: blocks in the current picture can reference several previously decoded pictures

AVC was developed to provide greater compression than that offered by MPEG-2. It is targeted at a broad range of usage models including broadcasting, digital storage, and webcasting. As is the case with MPEG-2, it has support for a wide span of resolutions and bit rates. Likewise, it supports 4:2:0, 4:2:2, and 4:4:4 video. AVC supports 8–12 bits per sample. AVC can achieve about 2× bit rate savings compared to MPEG-2 (ISO/IEC 23008-2). Naturally the exact savings depend on many variables, including target bit rate and image resolution. However, AVC consistently outperforms MPEG-2. AVC achieves its high compression through the use of more complex tools not available in MPEG-2. Unsurprisingly, AVC encoding and decoding are computationally more intense than MPEG-2. In particular, CABAC (discussed below) is quite a challenge. The increased efficiency comes at the cost of more expensive components. AVC decoders have proven affordable on the commercial scale, and the format is used for Blu-ray disks, television broadcasting, and Internet streaming, most notably by YouTube.

AVC supports quarter-pel motion vectors. For motion prediction, the macroblocks can be decomposed into 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, and 4 × 4 blocks. Unlike MPEG-2, P pictures can use multiple backward references, as shown in Fig. 6. As such, each motion vector is associated with an index number indicating which reference frame it uses. The standard B picture concept is likewise extended for AVC. Entities known as B slices can use multiple previously decoded pictures as references. These references can be forward only, backward only, or backward and forward. Any given block within the slice can only use two references at a time, but the references used can vary from block to block within the slice. The two references can also be arbitrarily weighted, as compared to the equal weighting used in MPEG-2's B pictures. The arbitrary weighting can be very useful in situations like a cross-fade between two scenes. Blocks in B slices can also use direct mode, wherein the motion vector is not explicitly coded but recovered from previously decoded information. Unlike in many other standards, B pictures can also be used as references.

In addition to inter-prediction, AVC supports intra-block prediction. Samples in a block are predicted from already decoded blocks. Figure 7 shows some examples of the different types of predictions. The coding of this simple intra-prediction mode and the resulting residuals typically results in a smaller compressed bitstream than non-predicted intra-coding. Whereas MPEG-2 used fixed 8 × 8 DCT transforms, AVC primarily uses a 4 × 4 transform. This smaller size allows better mapping of blocks to the boundaries of moving objects in the scene. The AVC transforms exclusively use integers, making them cheaper to implement and free of any floating point precision issues endemic to the DCT.

AVC supports two types of entropy coding: context-based adaptive variable length coding (CAVLC) and context-based adaptive binary arithmetic coding (CABAC). Being adaptive, both of these schemes offer greater compression efficiency compared to static variable length coding schemes such as that used in MPEG-2. Of course, this increased efficiency comes at the price of increased encoder/decoder complexity.

Fig. 7 Intra-prediction examples for a 4 × 4 block predicted from neighboring decoded samples A–M: Mode 0 (Intra_4×4_vertical), Mode 1 (Intra_4×4_horizontal), and Mode 3 (Intra_4×4_diagonal_down_left)
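The vertical and horizontal modes of Fig. 7 amount to copying the row of samples above the block, or the column to its left, across all 16 positions. A small sketch of the prediction step only; the residual coding that follows is omitted.

```python
import numpy as np

def intra_4x4(mode, above, left):
    """Modes 0 and 1 from Fig. 7: copy neighbors vertically or horizontally."""
    if mode == 0:                      # Intra_4x4_vertical: copy row A..D down
        return np.tile(above[:4], (4, 1))
    if mode == 1:                      # Intra_4x4_horizontal: copy col I..L across
        return np.tile(left[:4, None], (1, 4))
    raise NotImplementedError("other modes interpolate the neighbor samples")

above = np.array([10, 20, 30, 40])    # already decoded samples A, B, C, D
left = np.array([15, 25, 35, 45])     # already decoded samples I, J, K, L
print(intra_4x4(0, above, left))      # each column is constant
```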

CAVLC is the simpler of the two schemes. It uses several different variable length coding tables; a particular code set is selected as the input stream is parsed, with the most efficient set chosen depending on the content. CABAC is substantially more complex than CAVLC but offers greater compression efficiency. CABAC has three major steps: binarization, context modeling, and binary arithmetic coding. In binarization, elements such as motion vectors and transform coefficients are converted into a binary code. This operation is similar to assigning a VLC to symbols, but it is only the first stage of CABAC. Next, binarized symbols are analyzed to select a probability model that is well suited to code the stream. Finally, the probability model is used to arithmetically code the symbols. Other AVC tools include motion vector prediction and an in-loop deblocking filter.

VC-1

VC-1 is a video compression format that is a contemporary of AVC and offers roughly the same level of compression as AVC. It was derived from Microsoft's WMV9 video format and standardized by SMPTE as VC-1 in SMPTE 421M (Wiegand et al. 2003a). VC-1 has numerous minor changes compared to WMV9, perhaps the most notable being the addition of interlaced content support. Historically, there were lively debates as to whether VC-1 or AVC is better. The codec that offers the best results depends on the content being encoded, the resolution and bit rates involved, and personal preference with regard to the subtle artifacts of each of the formats. On the whole, VC-1 is less complex than AVC. Keeping VC-1 simple while maintaining high quality was a conscious design goal of the format. Some of the simplicity was achieved by limiting its scope. For instance, it only supports 8-bit-per-sample 4:2:0 video. This limitation is acceptable for delivering video to consumers but makes the format unacceptable for high-end studio-grade use. VC-1's most common usage is on Blu-ray disks. It is also used in some versions of Microsoft's Silverlight framework for distributing video across the Internet. VC-1 supports quarter-pel motion vectors with classical P and B pictures: P pictures refer to one backward reference, and B pictures use bilinear prediction between one backward and one forward reference.


Fig. 8 A picture subdivided into tiles

One unique element of VC-1 is its intensity compensation feature. Designed to efficiently code cross-fades between scenes, it allows scaling of the luma and chroma values of the reference pictures prior to motion compensation. VC-1 has an integer-based adaptive transform scheme, where 4 × 4, 8 × 4, 4 × 8, and 8 × 8 blocks can be used. The flexible block sizes allow for precise mapping of moving objects in the scene. VC-1's intra-prediction is done in the frequency domain, unlike AVC's spatial domain intra-prediction. VC-1 uses a frame-adaptive VLC for entropy coding. Similar to AVC, VC-1 also includes deblocking support. VC-1's deblocking filter is applied to a narrower region around block boundaries compared to AVC.

HEVC

HEVC is the successor to AVC. Its specification was developed by the same teams that developed AVC. Much as AVC aimed to offer twice the compression efficiency of its predecessor (MPEG-2), HEVC is designed to achieve twice the encoding efficiency of AVC. AVC became the dominant compression standard for high-definition (HD) video with a resolution of 1920 × 1080 pixels. However, an industry transition is under way to move mainstream consumer video to a resolution of 3840 × 2160 pixels. Sometimes referred to as 4K or Ultra HD (UHD), this resolution has four times as many pixels as HD. As part of this transition, the higher resolution is typically being coded with the new, more efficient HEVC codec. Whereas previous codecs have primarily used only 8 bits per sample in the consumer space, the emerging HEVC usage models often use 10 bits per sample. The increased precision allows for a wider range of colors and a higher dynamic range of luminance.

HEVC achieves its improvements in coding efficiency by starting with the AVC tools and further refining them. The new tools include improved intra-picture coding, improved inter-picture prediction, CABAC entropy coding, and in-loop filtering. Unlike AVC, the HEVC standard was written at a time when it is common to watch video on power-constrained devices such as tablets and phones. Also, integrated circuit densities have reached a point where it is no longer efficient to improve computation performance by merely increasing clock frequency. The only way to achieve high performance in a power-efficient manner is by using parallel compute engines. As such, a recurring theme among the new HEVC tools is the ability for them to be implemented in parallel architectures.



As an example, HEVC allows a picture to be subdivided into rectangular regions called tiles, as shown in Fig. 8. Each tile is independent of the others, and so the decoding of the tiles can be dispatched to multiple computation engines that can operate in parallel on the same picture. They are similar to slices but are constrained to be rectangular in shape. Tiles provide additional coding efficiency over slices by not requiring per-tile header information. Tiles can also provide better spatial coherency compared to typical slice partitioning.
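The sketch below shows why this independence matters in practice: because no state is shared between tiles, a decoder can farm the tiles of one picture out to a pool of workers. It is only a schematic, with a placeholder decode_tile function standing in for a real HEVC tile decoder.

```python
from concurrent.futures import ProcessPoolExecutor

def decode_tile(tile_bitstream: bytes) -> str:
    # Placeholder for a real tile decoder; a tile's bitstream is
    # self-contained, so this function needs nothing from its siblings.
    return f"decoded {len(tile_bitstream)} bytes"

def decode_picture(tile_bitstreams, workers=4):
    # Tiles are independent, so they can be decoded concurrently and
    # the reconstructed regions stitched back together afterwards.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(decode_tile, tile_bitstreams))

if __name__ == "__main__":
    print(decode_picture([b"\x00" * 100, b"\x00" * 150, b"\x00" * 90]))
```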

VP9
Much as VC-1 was a contemporary alternative to AVC, Google's VP9 is a codec that is being deployed in the same time interval as HEVC and aims to be similarly efficient. As was the case with AVC vs. VC-1, VP9 and HEVC have similar compression efficiency; some scenes may compress better with HEVC, while others may compress better with VP9. One of the novelties of VP9 is not in its tools, but rather in the fact that it is promoted as a royalty-free codec. By comparison, most other codecs require that implementers pay royalties for the patented intellectual property used in the codec's tools. VP9 is currently limited to 4:2:0, which constrains its common usage to consumer video, since most video production houses use either 4:2:2 or 4:4:4 video. As an example, VP9 is being used extensively on YouTube.

Conclusions
Video must be compressed for practical and affordable storage and transmission. There are several different video compression standards in use in the industry today. Some are open standards, while others are proprietary. In general, they all take advantage of spatial and temporal coherence in the video, as well as coherence in the intermediate codes. Pictures are divided into regular blocks. These blocks are intra-predicted or inter-predicted, transformed, quantized, and entropy coded. The most prevalent codec today is AVC, although it is rapidly being supplanted by HEVC. VP9 is also seeing growing adoption for Internet video.

Directions for Future Research
Video compression is by no means a completed technology. Research continues into new algorithms that endeavor to greatly improve upon the efficiency offered by HEVC and VP9. For example, Google has announced that it is developing a new codec called VP10.

Further Reading
ISO/IEC 14496-10 Coding of audio-visual objects – part 10: advanced video coding
ISO/IEC 14496-2 Coding of audio-visual objects – part 2: video
ISO/IEC 23008-2 High efficiency coding and media delivery in heterogeneous environments – part 2: high efficiency video coding
ISO/IEC 13818-2 Generic coding of moving pictures and associated audio information – part 2: video


Janus S (2002) Video in the 21st century. Intel Press, Hillsboro
Keith J (2007) Video demystified: a handbook for the digital engineer, 5th edn. Newnes, Amsterdam
Ostermann J et al (2004) Video coding with H.264/AVC: tools, performance, and complexity. IEEE Circuits Syst Mag 4(1):7–28
Poynton C (1996) A technical introduction to digital video. Wiley, New York
SMPTE 421M VC-1 compressed video bitstream format and decoding process
Sze V et al (2014) High efficiency video coding. Springer, Cham. http://www.springer.com/us/book/9783319068947
Wiegand T et al (2003a) Rate-constrained coder control and comparison of video coding standards. IEEE Trans Circuits Syst Video Technol 13:688–703
Wiegand T et al (2003b) Overview of the H.264/AVC video coding standard. IEEE Trans Circuits Syst Video Technol 13:560–576



Fundamentals of Image Color Management
Matthew C. Forman (Create-3D, Sheffield, UK) and Karlheinz Blankenbach (Display Lab, Pforzheim University, Pforzheim, Germany)

Abstract
Ideally in modern multimedia systems, exact colors and tones should be maintained throughout the processing chain from input device (e.g., scanner, camera) to output device (e.g., display monitor, printer). This ensures that all viewers will be presented with practically the same final display, which is also consistent with the image originally captured. Each real-world capture or display device, however, interprets color values in its own specific manner, even if devices use ostensibly the same representations (e.g., RGB levels). Tonal (gray level) responses also vary widely. Specific steps must therefore be taken to ensure consistency of output on different devices; color management systems provide frameworks for this. This chapter presents the fundamental concepts of image color management and investigates some of the practicalities of their application in color management systems.

List of Abbreviations
CIE: Commission Internationale de l'Eclairage (International Commission on Illumination)
CMM: Color management module
CMY[K]: Cyan, magenta, yellow, [black] (color coordinates)
CRT: Cathode ray tube
ICC: International Color Consortium
LC: Liquid crystal
LCD: Liquid crystal display
LED: Light emitting diode
PCS: Profile connection space
RGB: Red, green, blue (color coordinates)
UCS: Uniform chromaticity scale

Introduction
Color displays and print systems should be able to reproduce images with color and luminance content that are as close as possible in an absolute sense to those in an original, real-world scene. However, maintaining accurate real-world image colors through complex digital processing chains and a wide variety of input and output hardware devices requires significant effort. Although image sensors in cameras and scanners generate RGB signals and display systems are also RGB addressed, these red, green, and blue intensity values are not related in any consistent, deterministic way. Such RGB representations are examples of device-dependent color spaces. In a simple everyday example to illustrate the issues involved (Fig. 1), a digital camera captures a picture. This is transferred to a computer where it is displayed on an LCD monitor as a preview, is possibly manipulated, and is also printed on paper using cyan, magenta, yellow, and black inks.

Fig. 1 A typical color image processing chain, involving three separate device-dependent color spaces

There are many differences between the spectral responses of the camera sensor and monitor pixels, and many variables are involved in converting to a chemical-based ink/paper process. Unless account is taken of these issues, differences between the original scene and the displayed and printed outputs in terms of gray level and color representation are highly likely to occur. Color management systems take steps to minimize these variations so that displayed and printed colors more closely match those originally captured (Berns 2000; Giorgianni and Madden 2008; Kohler 2000). This is important to ensure consistency of output both between individual jobs running at different times and between different installations. Accurate characterization and control of color in a digital imaging process also mean that useful techniques such as soft proofing become possible. This is the previewing of an image intended to be output on a certain device (e.g., a printing press), by displaying it on a different device (e.g., a monitor) in a representative manner.

Practical Color Management
The overall goal of digital color management is to maintain understanding and control of colors, and the relationships between colors, in a digital image processing chain. This is usually achieved by defining a device-independent color representation to serve as an absolute, neutral color reference space. Device-dependent source colors as produced by an input device are mapped into this reference space, followed by onward conversion to the device-dependent destination color space of the display or print process (Green and Macdonald 2002). The International Color Consortium (ICC) (International Color Consortium 2011) has defined a standard layout for color management and has also specified flexible processes for conversion between device color spaces (International Color Consortium 2010). An image is captured in a device-dependent color space, such as camera or scanner RGB. The colors in the source image are converted internally by the Color Management Module (CMM) software into a device-independent Profile Connection Space (PCS). The PCS represents chromaticities directly and is thus completely independent of the specifics of particular input or output device color characteristics.
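A minimal sketch of this source-profile, PCS, destination-profile flow is given below, with CIE XYZ playing the role of the PCS. The matrix shown is the widely published linear-sRGB-to-XYZ (D65) matrix, used here purely as illustrative source-profile data; the simple power-law gamma is a simplification of a real profile's tone curve, and since a destination profile would supply its own matrix, the inverse is used only to close the round trip.

```python
import numpy as np

# Illustrative "profile" data: the well-known linear-sRGB-to-XYZ (D65)
# matrix stands in for a source profile. A destination profile would
# carry its own matrix; here the inverse simply closes the round trip.
SRC_TO_PCS = np.array([[0.4124, 0.3576, 0.1805],
                       [0.2126, 0.7152, 0.0722],
                       [0.0193, 0.1192, 0.9505]])
PCS_TO_DST = np.linalg.inv(SRC_TO_PCS)

def to_pcs(rgb, gamma=2.2):
    linear = np.power(rgb, gamma)   # tone (gamma) adaptation
    return SRC_TO_PCS @ linear      # matrix transform into the XYZ PCS

def from_pcs(xyz, gamma=2.2):
    linear = np.clip(PCS_TO_DST @ xyz, 0.0, 1.0)
    return np.power(linear, 1.0 / gamma)

source_rgb = np.array([0.8, 0.4, 0.2])          # device-dependent color
destination_rgb = from_pcs(to_pcs(source_rgb))  # identical profiles: ~unchanged
```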

Color Profiles
The conversions between input and output device color spaces that take place in the CMM are specified by color profiles. These are collections of data structures which characterize the source and target device color spaces to the CMM. When fed to the CMM along with colors from the source image data, a pair of color profiles takes care of the mapping from source to destination device colors via the intermediate PCS.


Table 1 Basic sRGB color space definitions
Parameter: Value(s)
Luminance: 80 cd/m² for 100% white
White point: D65 (x = 0.3127, y = 0.3291)
Primaries (CIE 1931): Red: x = 0.64, y = 0.33; green: x = 0.30, y = 0.60; blue: x = 0.15, y = 0.06
Gamma: 2.4
Ambient illuminance: 64 lx
Ambient white point: D50 (x = 0.3457, y = 0.3585)
Veiling glare: 1%

A color profile describes a color space in terms of a set of colorimetric parameters, defining the mapping to/from the PCS. These are chiefly grayscale gamma value or curve (tone response), white point chromaticity, reference luminance, and primary color chromaticity coordinates. The profile defining the sRGB color space (see Table 1 for basic parameters) is a commonly available example. sRGB was developed as a standard color space for computer processing and communication of images via the Internet, with a response that approximates most CRT computer monitors for reasonable cross compatibility with end-user computer systems that do not include color management (Stokes et al. 1996).
Specification and Transport of Color Profiles
For color management to be a practical proposition, color profiles must be self-contained, portable entities that can be made readily available alongside image data that is in the color space they describe, ready to be used by a CMM in any system that needs to convert such an image to its destination device space. Color profiles are generally provided in ICC (or ICM) files. These use a standard file format defined by the ICC for the purpose. Manufacturers of input devices (cameras, scanners) and output devices (monitors, printers) often provide ICC profile files with their products, though these often need tweaking and regenerating to suit local conditions where color workflows need to be particularly well calibrated. The calibration data stored in ICC files can also be embedded in various common image files using special extensions. TIFF, JPEG, PDF, SVG, and PNG are well-known formats with this facility. Embedding a profile, directly describing the color space to which image pixel values relate, greatly aids the color management process by removing any ambiguity about the source color space for a profile-aware application opening the image. Digital cameras can often be set to embed a profile describing their source color space in this way automatically in every image captured.
Generating Color Profiles
Many input and output devices are not supplied with ICC profile files, and a calibration process is needed to generate a suitable one to match the actual response of a device (Green and Johnson 2000). Even if a profile is supplied as a result of a manufacturer's own calibration procedures, it is also often necessary for an end user to repeat it at some stage because the color response of a device can shift for many reasons. In printing, ink and media variations will do this, and display screen characteristics will change over the lifetime of the unit, not to mention the effects of fluctuations in ambient light and other external conditions. In environments such as commercial printing, where output consistency and accuracy are very important considerations, calibration is normally carried out at regular intervals as an element of an organization's quality assurance procedures. In general, calibration to generate profiles is performed using a standard color test chart containing a number of known chromaticity samples uniformly distributed in the device color space – see Fig. 2 for a representative example using output device RGB coordinates.



Fig. 2 Color test chart adapted from MAZeT, Jena, Germany, with corresponding device stimulus RGB values (8 bit, R top, G center, B bottom)

(Note that the chart as reproduced here is not suitable for use in any calibration procedure.) Test targets with patches of known gray and primary color tone densities are also used to set the gamma (tone curve) parameters in the profile. Such charts are sometimes provided, together with calibration software, with mid-range image scanners. They are more generally available as part of aftermarket color profiling kits intended to generate profiles for several types of device in environments where recalibration is often needed (Datacolor Inc. 2011; X-Rite Inc. 2011). Calibrating an RGB input device such as a scanner is a fairly straightforward procedure, since scanners have their own enclosed light sources with known properties. Various test charts are captured, and the resulting images are loaded into supplied calibration software. This software reads the scanned color and tone patches. Using a priori information on the absolute chromaticity and tone values of the test charts, it then produces a profile that transforms the device RGB values of the scanned test files to the PCS with errors minimized over the whole representable color space and all luminance (tone) levels. Calibrating a camera uses a similar process, though great care must be taken to use appropriate lighting of the test targets when they are captured. To calibrate an RGB output device (e.g., a monitor), the same process is effectively used in reverse. Rather than presenting known chromaticities and tones to the device and measuring its response, a series of known device RGB colors and tone values are displayed (under well-controlled lighting conditions and monitor setup), and the absolute chromaticity and luminance values are then measured using a colorimetric probe temporarily attached over the display. Calibration software controls the entire process in a closed loop: it displays the sample patches, directs the colorimeter to take measurements, and then processes the results to produce the final ICC profile file. Simple calibration processes can often be performed by computer users without specialized equipment using only operating system software utilities, though these are no substitute for dedicated calibration systems in terms of accuracy of the final profile. Some high-end LCD monitors are available with built-in color and/or luminance calibration functions. These are particularly useful in environments where consistent, accurate reproduction is essential and ambient luminance variations must be compensated for, such as in clinical medical imaging and prepress applications (Abileah 2005; EIZO 2010). In an LCD monitor, the color characteristic of the backlight is of great importance in ensuring well-calibrated output colors, so these systems use sensors to measure the luminance and chromatic output of the backlight either before or after it passes through the LC matrix layer. They then make automatic compensating adjustments to the backlight chromaticity, typically by modulation of separate red, green, and blue LED intensities.
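The numerical core of these profile-generation procedures can be sketched very simply: given linearized device RGB values for a set of patches and the colorimeter's XYZ readings, a 3 × 3 matrix relating the two can be fitted by least squares. The patch data below is synthetic, and real profiling software also fits tone curves and minimizes a perceptual (e.g., CIELAB) error rather than raw XYZ error, so this is an illustration of the principle only.

```python
import numpy as np

def fit_profile_matrix(device_rgb, measured_xyz):
    # Least-squares fit of M such that xyz ~ M @ rgb, given one patch
    # per row. Real profiling also characterizes the tone curves.
    X, *_ = np.linalg.lstsq(device_rgb, measured_xyz, rcond=None)
    return X.T  # transpose so that column vectors map as xyz = M @ rgb

# Synthetic patch set: 24 random device colors and the XYZ values a
# hypothetical colorimeter would report for them.
rng = np.random.default_rng(0)
rgb_patches = rng.random((24, 3))
true_matrix = np.array([[0.41, 0.36, 0.18],
                        [0.21, 0.72, 0.07],
                        [0.02, 0.12, 0.95]])
xyz_readings = rgb_patches @ true_matrix.T
M = fit_profile_matrix(rgb_patches, xyz_readings)  # recovers true_matrix
```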


Fig. 3 Typical color data path in the ICC color management framework

Calibration of printers, usually CMYK devices, is carried out using a process conceptually similar to that used for monitors. Test prints are made, comprising patches of known device color and tone on media with a certain white point. The absolute chromaticity and density of each patch are then read using a colorimeter with a self-contained light source. Finally, the calibration software generates a profile for this particular combination of printer settings and media. Typically, many profiles are generated for various different combinations of print media, printer resolution, and output settings, and possibly even different inks.

The Color Management Module
The Color Management Module contains the core processing functionality in the ICC color management system. It is a piece of software which, given source and destination color profiles, handles the conversion of image colors from the input to the output device color spaces, via the Profile Connection Space. The role of the CMM in the color management system is shown in Fig. 3 and is now described in detail. An image taken by a camera or scanner is in a device-dependent space, specified by its color profile. The CMM extracts a set of calibration information from this profile and uses it to map colors into the PCS. This is a device-independent space which represents colors as chromaticity values that map well to human color perception. In practice, either the CIE L*a*b* or CIE XYZ color space is usually used as the PCS. The input mapping takes place in two stages: firstly, gamma adaptation to account for the tonal response of the device and, secondly, a matrix-based transformation of color coordinates. A lookup table-based transformation can be used in place of the matrix transformation if faster processing is needed (at the expense of a small reduction in color accuracy). At this point, image colors are in the device-independent PCS. The reverse of the above input transformation is now applied, using calibration data from the destination device profile, to produce image colors in the output device color space. Note that the matrix or lookup table color coordinate transformation also takes account of the source or destination color coordinate system (RGB, CMY, CMYK, or even six- or eight-ink subtractive systems in some print applications). When the image is displayed or printed, the chromaticities and tones of points captured by the input device will be reproduced as accurately as possible to the viewer.
Gamut Mapping
There is one significant caveat in the color transformation process as described. In practically all cases, the color gamuts of the input and output devices (the ranges of chromaticities that the devices can represent) will not match, and a further gamut mapping stage is needed to map input to output chromaticity coordinates within the PCS (Stone et al. 1988; Morovič 2008). Figure 4 illustrates the problem by plotting the color gamuts of a typical CMYK printer and RGB monitor in device-independent coordinates. The gamut of the printer is much smaller than that of the monitor, and either device can only represent a small subset of the total CIE 1976 color range.

Fig. 4 Illustrative comparison of the color gamut of a typical RGB emissive display and a typical CMYK printer

Fig. 5 Basic color rendering approaches from a larger to a smaller gamut. (a) Colorimetric. (b) Perceptual

Only colors represented by the area of overlap between the printer and monitor gamuts can be reproduced accurately by both devices. The situation is similar when comparing typical input and output device gamuts, and algorithms must be used to map from one to the other according to a specific set of rules. This color rendering procedure defines the way in which source colors in the PCS are treated if the input and output device gamuts do not match, for example, if input chromaticity values fall outside the smaller gamut of the destination device in the PCS. Very broadly, there are two distinct approaches to the problem; Fig. 5 represents a situation where the gamut of the output device is far smaller than that of the input device, as will typically be the case when transforming a scanned image for print. The input device in the figure has a larger span (represented here as blue to green) than the output device, so the problem is one of determining what should be done with input colors that are outside the span of the output device. If colors must be maintained at their exact chromaticity values, then colorimetric rendering is used (Fig. 5a): chromaticities are mapped exactly from source to destination. Unfortunately, however, this results in the complete loss of color information which falls outside the destination gamut; out-of-gamut colors are typically saturated to the chromaticity value at the top or bottom of the range. This is generally unacceptable for real-world (e.g., photographic) images, and a perceptual rendering is more appropriate (Fig. 5b). Here, all colors in the source gamut are scaled into the destination gamut, and thus no color information at all is actually lost, although the absolute chromaticity values of all points in an image (except at the absolute center of the range) will shift from their original values.



The perceptual scaling operation is typically not linear; it attempts to transform colors with greater perceptual relevance in the final display with greater accuracy than others. In the ICC color management framework, four color rendering schemes are encapsulated as specific rendering intents. Each of these has a different aim and makes specific assumptions to suit different applications. The perceptual intent compresses or expands the full gamut of the input device to fill the full gamut of the destination device. Tonal balance is therefore preserved, but colorimetric accuracy is not. The mapping is nonlinear, with more perceptually relevant colors in the mid-tones being mapped more accurately than those in the lower or upper ranges. This rendering intent was designed for use with real-world (photographic) images to give a pleasing (though not colorimetrically accurate) display. Notwithstanding numerical precision issues in device color vs. PCS representations and in the calculations made by the CMM, a transformation made using the perceptual intent is generally reversible. The saturation rendering intent preserves heavily saturated colors in an image and is designed for use with bold computer graphics output such as business chart presentations. It ensures that saturated source colors remain vivid even in a larger destination gamut, though it does not attempt to maintain colorimetric accuracy and is therefore not suitable for transforming photographic image colors. It is also not generally reversible. When colorimetric accuracy must be preserved from source to destination, a colorimetric rendering method must be used, as already outlined. The relative colorimetric intent maintains the chromaticity values of the source device. If the destination device gamut is a superset of the source gamut, there are no losses, and the mapping is exact, but when the destination device gamut is smaller than that of the source, this intent clips unrepresentable colors, leading to a loss of color information. Also, the mapping moves the white point if it is different for the source and destination devices; this typically occurs when comparing RGB monitor and CMYK printer color spaces. Note, however, that this white point correction will result in a shift in the chromaticity values of the colors in the image. The relative colorimetric rendering intent is well suited for applications such as soft proofing of printer colors using a display monitor. The transformation is only reversible if the destination gamut is a superset of the source gamut. Finally, the absolute colorimetric rendering intent makes a direct mapping from source to destination gamut as for the relative colorimetric intent, but without remapping the white point if it has different coordinates in the two spaces. The results will therefore theoretically be the most colorimetrically accurate of all intents overall with respect to the source, but if the white points are in fact different, then there will be a noticeable, possibly undesirable shift of all of the colors in the image as displayed or printed. It can be seen that every rendering intent represents a compromise, and it is therefore important that an appropriate one is chosen for the task at hand.
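The trade-off between the colorimetric and perceptual approaches can be seen in a deliberately tiny model. Below, "chromaticity" is reduced to a single number, with the source gamut spanning [0, 1] and a hypothetical destination device only reaching [0.1, 0.8]; clipping preserves in-gamut colors exactly but collapses out-of-gamut ones, while scaling shifts everything yet keeps every color distinct.

```python
import numpy as np

DST_LO, DST_HI = 0.1, 0.8   # toy one-dimensional destination gamut

def colorimetric(c):
    # In-gamut values reproduce exactly; out-of-gamut values are
    # clipped, so their distinctions are lost.
    return np.clip(c, DST_LO, DST_HI)

def perceptual(c):
    # Every value shifts, but the relative relationships survive.
    return DST_LO + c * (DST_HI - DST_LO)

src = np.array([0.00, 0.05, 0.50, 0.95, 1.00])
print(colorimetric(src))  # [0.1 0.1 0.5 0.8 0.8]      - extremes collapse
print(perceptual(src))    # [0.1 0.135 0.45 0.765 0.8] - all distinct
```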

Color Management System Implementations
Color management can be implemented on computer systems either at the operating system or application level or a mixture of both. The advantage of operating system color management is that input devices, output devices (via functions in device driver modules), and applications that use operating system-supplied APIs are then fully integrated. The color workflow for image data throughout the system can then be controlled from capture, through application processing, right to display or print output. Certain high-end applications, however, make use of their own CMMs to maintain full control of the workflow. This is typically done to offer more control over transformations as is needed in a production environment, to be able to address more exotic input and output devices, and to be able to offer higher quality conversions than operating system CMMs.


Table 2 ICC rendering intents and names used in Microsoft Windows
ICC rendering intent: Windows name
Perceptual: "Picture"
Saturation: "Graphic"
Relative colorimetric: "Proof"
Absolute colorimetric: "Match"

In Microsoft Windows, the integrated Image Color Management subsystem has recently been superseded by the new Windows Color System architecture (Microsoft 2011). These systems allow profiles to be loaded and rendering intents to be set for scanners, cameras, displays, and printers, and transformations can be made automatically or under application control. Windows names the four ICC rendering intents described in the previous section according to the typical application of each – see Table 2. Apple Mac OS contains the ColorSync subsystem for system color management (Apple Inc. 2010), which is also integrated with camera, scanner, printer, and display drivers. It also provides a set of utilities that allow inspection and manipulation of profiles and visualization of their color spaces. Linux and UNIX operating systems do not generally have such mature support, and color and profile management is usually left to individual imaging applications to perform. Many image processing applications have color management functionality, supporting image embedded profiles and the concepts of working and output color spaces. However, most Web browsers in use, despite being the means via which a large proportion of final images are viewed, do not support color management consistently, and many do not yet support it at all.

Summary
Representations of colors and tones in real-world imaging and output devices are generally not compatible, even if they appear to use the same variables (e.g., red, green, and blue levels). This is partly because they have completely different physical origins and partly due to the influence of external variables such as ambient illumination and print media and ink differences. Environmental and component aging factors also cause variations. Color management systems attempt to correct for as many of these differences as possible by defining device-neutral representations of color values and the means to transform image data between devices' own color spaces and this device-independent color space. This chapter has presented these fundamental issues, introduced the ICC specification as a standardized framework for color management, and addressed some of the practicalities of its use.

Further Reading
Abileah A (2005) DICOM calibration for medical displays: a comparison of two methods. Planar Systems. http://medicaldisplaysforless.com/Dome/WP/DICOM_Calibration_for_Medical_Displays.pdf
Apple Inc (2010) Technical note TN2035: ColorSync on Mac OS X. http://developer.apple.com/library/mac/#technotes/tn/tn2035.html
Berns RS (2000) Billmeyer and Saltzman's principles of color technology. Wiley, New York
Datacolor Inc (2011) Datacolor. http://www.datacolor.com/
EIZO (2010) EIZO ColorEdge CG245W – the first self-calibrating monitor for graphics. http://www.eizo.com/global/products/coloredge/cg245w/


Giorgianni EJ, Madden TE (2008) Digital color management: encoding solutions. Wiley, Chichester
Green PJ, Johnson T (2000) Issues of measurement and assessment in hard copy color reproduction. Proc SPIE 3963:281–292
Green P, Macdonald L (eds) (2002) Colour engineering: achieving device independent colour. Wiley, Chichester
International Color Consortium (2010) ISO 15076-1:2010: image technology colour management – architecture, profile format and data structure
International Color Consortium (2011) International Color Consortium. http://www.color.org/. Accessed 11 Mar 2011
Kohler T (2000) The next generation of color management system. In: Eighth color imaging conference, Scottsdale, Nov 2000, pp 61–64
Microsoft (2011) Windows color system. http://msdn.microsoft.com/en-us/windows/hardware/gg487409
Morovič J (2008) Color gamut mapping. Wiley, New York
Stokes M, Anderson M, Chandrasekar S, Motta R (1996) A standard default color space for the internet – sRGB, Nov 1996. http://www.w3.org/Graphics/Color/sRGB.html
Stone MC, Cowan WB, Beatty JC (1988) Color gamut mapping and the printing of digital color images. ACM Trans Graphics 7(3):249–292
X-Rite Inc (2011) X-Rite. http://www.xrite.com/



Digital Image Operations
Matthew C. Forman (Create-3D, Sheffield, UK)

Abstract
In applications that deal with digitally represented visual images, various forms of processing are generally required before the results are ready to be displayed. Although many of the methods used are complex, all have their roots in a small number of core concepts and techniques. This chapter looks at these common core spatial domain operations, firstly reviewing those that rely on applying transformations of brightness and color in place within digital images. It then moves on to consider geometric manipulation of image data and resampling issues.

List of Abbreviations
API: Application programming interface
CPU: Central processing unit
GPU: Graphics processing unit
HSV: Hue, saturation, value (color space)
RGB: Red, green, blue (color space/color storage method)
Y′CbCr: Luma, blue-difference chroma, red-difference chroma (color space/color storage method)

Introduction
The rapid growth of digital storage of visual images has been driven by several factors. Digital representations inherently have far better robustness and noise immunity than direct analog recordings. This is extremely advantageous where both long-term storage and communication are concerned. One of the most significant advantages of digital representation, however, is the ease with which useful and complex processing operations can be implemented. The precise details of a particular implementation of a digital image storage scheme depend on the application: the manner of source content creation, storage or transmission system requirements, and the final destination of the image. At a low level, however, an image is stored in either a raster (also commonly known as bitmap or pixmap) or vector representation. A vector representation consists of instructions and parameters for drawing the final image, element by element, from geometric primitives such as lines, curves, polygons, and text. A raster format represents a lower level of abstraction of image data. It contains a sampled representation of any captured or synthesized image and thus offers a more general means of storage. Since display systems themselves are addressed in this manner, the final destination for all image representations is effectively raster; an image in a vector format is rasterized for display by executing the appropriate drawing instructions and sampling the result. This article therefore concentrates on processing that can be achieved when an image is stored in a raster format.



Fig. 1 General raster (bitmap) image layout

Raster Image Processing Format
In the most general sense, a raster image is comprised of a rectangular array of pixels ("picture elements") (Watt and Policarpo 1998; Gonzalez and Woods 2008). Each pixel is a sample of the information in a finite area of a spatially continuous image source, centered on a particular geometric location in the plane. The sample value may simply be the scalar irradiance arriving at an image sensor pixel, or equivalently the emittance of a display pixel; an array of these represents a grayscale image. Alternatively, a pixel may carry color information, typically by encoding irradiance/emittance proportions of red, green, and blue light; an array of such pixels can represent a full-color image. Figure 1 illustrates the layout of a general raster image (note that the origin is also commonly at bottom left).

Color Image Processing
Although color images are usually RGB encoded in capture and display devices, such a color representation is not necessarily best suited for supplying full-color image data to image processing operations. Hence, for processing, images are often transformed into alternative color spaces that are more compatible with the operation(s) to be carried out, or simply for ease of implementation. For example, it is often desired to separate the luma (brightness) from the chroma (color) information and process them separately – the Y′CbCr (luma, chroma blue, and chroma red) space is commonly used in these cases. In other operations the HSV space may be appropriate. Many image processing operations may be carried out directly in the pixel spatial domain – some of which may require resampling – though others are more easily applied in a spatial frequency domain such as Fourier space. This chapter continues by looking at spatial domain pixel operations, a group of common global processing operations that rely on applying transformations of brightness and color within digital images.
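As a concrete illustration of the luma/chroma separation just mentioned, the sketch below splits gamma-corrected RGB into Y′, Cb, and Cr using the BT.601 weights; other standards (e.g., BT.709) use different coefficients, and the offsets and quantization needed for integer storage are omitted.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    # Split gamma-corrected RGB (floats in [0, 1]) into luma and two
    # chroma channels using BT.601 weighting.
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma
    cb = 0.564 * (b - y)                    # blue-difference chroma
    cr = 0.713 * (r - y)                    # red-difference chroma
    return y, cb, cr

# Brightness processing can now operate on y alone, leaving cb and cr
# untouched, and the chroma planes can be subsampled independently.
```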

Global Pixel Operations
A fundamental class of image processing techniques, global pixel operations apply a single operation identically to each pixel. This section introduces a number of intensity-only and color-specific pixel operations. A second class of local pixel operations, where a number of values in a local neighborhood of the operation pixel are considered, is often used in filtering and enhancement applications (see chapter "▶ Signal Filtering: Noise Reduction and Detail Enhancement").



Fig. 2 Intensity transformation operations on sample images. (a) Contrast stretching. (b) Nonlinear brightness adjustment

Intensity Transformations
An intensity transformation uses a transfer function to remap input pixel intensity values (Watt and Policarpo 1998). If f(x, y) represents an intensity raster image and T(i) is an intensity transfer function, then the processed image is

$$ g(x, y) = T(f(x, y)). $$

A general remapping facility such as this allows a number of practical enhancement operations to be carried out. It is convenient to visualize the transfer function as a line plot relating output to corresponding input values, and software with intensity transformation features often allows transformations to be defined graphically in this way, generally with reference to the intensity histogram of the image. Some common intensity transformations are illustrated by Fig. 2, which also shows their effects on sample images. Contrast stretching (Fig. 2a) expands the intensity range of an image in parts of intensity space (typically the center) while compressing or clamping the intensity dynamic range in other parts. The goal is to improve the utilization of dynamic range for the most important parts of the image. The results can be seen in the example shown as improved contrast. Here, limiting low- and high-intensity ranges have been clamped to black and white. Normalization is a related process that determines stretch limits automatically from the image brightness histogram, so that the image's existing intensity range is mapped exactly onto the maximum range available. This is particularly useful to compensate for photographic underexposure. An offset can be added to the transfer function to increase or decrease overall image brightness; however, this generally results in saturation at the white or black level. As an alternative, nonlinear brightness adjustment (Fig. 2b) applies a smooth curve to increase or decrease overall image brightness in such a way that saturation at maximum or minimum intensity cannot occur. A power function, such as that used for display gamma correction (see chapter "▶ Luminance, Contrast Ratio and Grey Scale"), is often used.
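The sketch below implements contrast stretching and histogram-based normalization as just described; the clamping limits lo and hi correspond to the knee points of the transfer function in Fig. 2a, and 8-bit samples are assumed.

```python
import numpy as np

def contrast_stretch(img, lo, hi):
    # Map [lo, hi] onto the full 8-bit range; intensities outside the
    # limits are clamped to black or white, as in Fig. 2a.
    t = (img.astype(np.float32) - lo) / float(hi - lo)
    return (np.clip(t, 0.0, 1.0) * 255.0).astype(np.uint8)

def normalize(img):
    # Normalization derives the stretch limits from the image itself,
    # so its existing range maps exactly onto the available range.
    return contrast_stretch(img, int(img.min()), int(img.max()))

underexposed = np.random.randint(20, 90, (4, 4), dtype=np.uint8)
print(normalize(underexposed))  # now spans the full 0-255 range
```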



As in the examples shown, any intensity transformation can be applied to a color image by first transforming the image data into a color space which represents intensity information separately from color information, applying the transformation to the intensity component, and then transforming back to the original color space. Using the Y′CbCr space, for example, the luma (Y′) component would be subject to transformation, while the chroma components (Cb and Cr) would be passed unchanged.

Color Saturation Adjustment and Matrix Methods
It is often desired to make adjustments to color saturation in an image in RGB space. One way to accomplish this is to convert the image data into HSV space and make the appropriate modification to the S (saturation) component before transforming the data back to RGB for display. However, this requires several separate operations and hence may not result in a particularly efficient implementation. It also carries the risk of introducing precision, rounding, and overflow issues. A convenient way to implement global pixel processing operations directly in RGB space uses a general matrix-based framework (Haeberli 1993). The matrix multiplication of the input color pixel value (in the form of a column vector) with an operation matrix yields the output pixel value. We define the operation as a 4 × 4 matrix to result in a general linear transformation, and several operations can then be concatenated into one just by multiplying operation matrices. The input pixel vector is F(x, y) = [f_R, f_G, f_B, 1]^T, with f_R, f_G, and f_B being the source red, green, and blue component values, and the output pixel vector is G(x, y) = [g_R, g_G, g_B, g_w]^T, with g_R, g_G, and g_B being the destination red, green, and blue component values (g_w is not generally computed). If T is a 4 × 4 matrix defining the desired pixel operation, then the overall operation is represented as G(x, y) = T · F(x, y). The operation matrix for saturation adjustment is

$$
T_{sat}(s) = \begin{bmatrix} a+s & b & g & 0 \\ a & b+s & g & 0 \\ a & b & g+s & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \qquad
\begin{aligned} a &= 0.3086\,(1-s) \\ b &= 0.6094\,(1-s) \\ g &= 0.0820\,(1-s) \end{aligned}
$$

Here, a, b, and g are factors derived according to the contributions of the red, green, and blue components, and s is the saturation adjustment value. If s = 0, all color is removed, leaving only brightness information. If s = 1, there is no change, and values 0 < s < 1 result in various levels of desaturation. For values s > 1, saturation is enhanced. Figure 3 demonstrates the effects of applying saturation enhancement (example value s = 1.35) and desaturation (example value s = 0.65) to a sample image, using this process.
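The matrix above translates directly into code. The sketch below builds T_sat(s) and applies it to an RGB image by augmenting each pixel with a fourth component of 1; the image data is synthetic, but the coefficients are exactly those given in the text.

```python
import numpy as np

def saturation_matrix(s):
    a, b, g = 0.3086 * (1 - s), 0.6094 * (1 - s), 0.0820 * (1 - s)
    return np.array([[a + s, b,     g,     0.0],
                     [a,     b + s, g,     0.0],
                     [a,     b,     g + s, 0.0],
                     [0.0,   0.0,   0.0,   1.0]])

def apply_pixel_op(img, T):
    # Apply a 4x4 operation matrix to an H x W x 3 float RGB image by
    # augmenting each pixel to [R, G, B, 1] (row form of G = T.F).
    h, w, _ = img.shape
    pixels = np.concatenate([img.reshape(-1, 3), np.ones((h * w, 1))], axis=1)
    return (pixels @ T.T)[:, :3].reshape(h, w, 3)

img = np.random.rand(4, 4, 3)
desaturated = apply_pixel_op(img, saturation_matrix(0.65))
grayscale = apply_pixel_op(img, saturation_matrix(0.0))  # s = 0 removes color
```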



Fig. 3 Image color saturation modifications using matrix methods in RGB space

Intensity transformations for contrast and brightness changes, as well as many color-specific transformations – for example, hue rotation – can be specified concisely and applied using the matrix framework. Combined operations can also be computed efficiently, concatenating individual operations by multiplying together the appropriate matrices before applying the result. Matrix pixel transformations can be implemented efficiently using precomputed lookup tables. The fast matrix arithmetic facilities of GPUs and some CPUs are also useful for this.

Application Example: White Point Correction
When source material is shot, the color and luminance responses of the scanner or camera used are ideally calibrated and known. Color management techniques (see chapter "▶ Fundamentals of Image Color Management") can then be used to ensure that the image that is ultimately displayed is perceived to be as close as possible to the original scene, with neutral shades being reproduced as accurately as possible at the display. The response of a camera is not always known a priori, however, or material may have been shot with an incorrect white point setting in force. Correction can be achieved by remapping the white point using a simple global color pixel operation, with the capture device having been used to record a physical reference white point in the scene. This operation is also useful for deliberate manipulation of the white point of an image for special effect purposes. There are a number of options when defining this operation, particularly with regard to color space (Viggiano 2004). Here we assume operation in the RGB space of the camera itself. If the measured color pixel value of the white point reference is W = [w_R, w_G, w_B], then assuming the full-scale color component value is 255 (i.e., 8 bits per color component), the white point correction operation can be defined in the matrix framework above as

$$
T_{wpt} = \begin{bmatrix} 255/w_R & 0 & 0 & 0 \\ 0 & 255/w_G & 0 & 0 \\ 0 & 0 & 255/w_B & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
$$

Note that if any color component is zero, the result of the corresponding computation above would be undefined. Although this is very unlikely in practice, it must be considered in an implementation. Also, because of the likelihood of saturation of at least one color component of a white reference to the full-scale representable value, it is often more reliable to measure at least one known gray physical reference instead. The white point reference, W, can then be computed from these values.
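A sketch of the correction follows, building T_wpt from a measured reference white exactly as defined above; the guard for zero components reflects the caveat in the text, and the example white value is hypothetical.

```python
import numpy as np

def white_point_matrix(w, full_scale=255.0):
    # Build T_wpt from the measured reference white W. A zero component
    # would make the corresponding scale factor undefined.
    wR, wG, wB = w
    if 0 in (wR, wG, wB):
        raise ValueError("reference white has a zero component")
    return np.diag([full_scale / wR, full_scale / wG, full_scale / wB, 1.0])

# A scene white recorded with a warm cast...
W = (250.0, 240.0, 210.0)
T = white_point_matrix(W)
# ...is mapped back to neutral full-scale white:
print(T @ np.array([250.0, 240.0, 210.0, 1.0]))  # [255. 255. 255. 1.]
```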



Fig. 4 Image scaling and resampling. (a) New pixel centers are not aligned with original ones. (b) Filtered resampling using tent and cubic filter functions

Geometric Image Operations and Resampling
A further important set of processing techniques commonly applied to image data is the class of geometric image operations. Image resizing (scaling), rotation, and morphing are some very common practical applications. In the case of vector images, any required geometric transformations are generally applied directly in the vector representation, before rasterization for display takes place. Indeed, this point encapsulates the chief advantage of using a vector rather than a raster representation whenever possible: although images destined for digital display inevitably must be sampled into raster form eventually, transformations applied before sampling effectively take place in a continuous domain and are therefore scale independent. Geometric transformations on images in raster form must take account of the fact that these images have already been sampled and discretized. While the pixel processing methods outlined in the previous section deal with modifications only to color or intensity sample values in raster digital images, geometric operations involve changes to the positions of image samples.

Raster Image Resampling and Scaling
Scaling is a very common requirement in image processing, for example, when zooming at the display to inspect detail closely, or preparing an image optimally for a display device with a certain resolution. Consider for example that an image must be scaled up so that every six pixel rows and columns are instead represented by seven in the destination image. Figure 4a illustrates a section of a single row of pixels of the source image and also the corresponding section of pixels in the destination, scaled image. The problem now is of determining appropriate values for the new pixels. Many of the destination pixel centers are not aligned with source pixel centers, so the source pixel values must be mapped onto the destination pixel grid and resampled. A naïve approach simply selects the closest source pixel value to the center of each destination pixel; however, this results in significant image distortion, with portions of the image information being deleted completely or replicated.



An approach to scaling without introducing such distortion involves approximating the original continuous intensity/color surface and sampling it at locations corresponding to the new pixel centers (Schumacher 1995). This approximation is made by convolving the source image pixels with a filter impulse response function and setting destination pixel values at locations in the convolved signal that correspond to their centers. The theoretically ideal reconstruction filter is defined by the sinc function; it corresponds to an ideal low-pass cutoff characteristic in the spatial frequency domain. However, the sinc function contains negative values and has infinite support, and thus is not practical to use directly. Increasing the scale of an image as above (magnification) requires resampling to a higher pixel rate. This corresponds to interpolation in signal processing terms. Reducing image scale, sometimes known as minification, requires resampling to a lower pixel rate – decimation in signal processing terms. Note that in this case, care must be taken that the new, lower sampling rate is still adequate for the spatial frequency content of the image so as not to introduce aliasing distortion. This is generally achieved by scaling the filter function before convolution. The reconstruction process is illustrated in Fig. 4b using two typical filter functions: the "tent" (corresponding to linear interpolation between samples) and cubic filters (the names describing the shapes of their impulse responses). The cubic function approximates the ideal sinc function with better precision, resulting in higher fidelity results, but is slower to compute than the "tent." Others, such as the Lanczos filter, offer still better reconstruction accuracy (Turkowski 1995). In implementation, the filter is generally applied one-dimensionally through all rows and all columns separately.
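The one-dimensional case is sketched below for a single pixel row: each destination pixel is a kernel-weighted sum of nearby source pixels, here with the tent function, and the weights are renormalized at the image edges. A production scaler would add a proper cubic or Lanczos kernel and widen the kernel support when minifying, as noted above.

```python
import numpy as np

def tent(x):
    # "Tent" impulse response: linear interpolation between samples.
    return np.maximum(0.0, 1.0 - np.abs(x))

def resample_row(samples, new_len, kernel=tent, support=1.0):
    # Reconstruct-and-resample one pixel row (cf. Fig. 4b): each
    # destination pixel samples the kernel-filtered source signal.
    n = len(samples)
    out = np.empty(new_len)
    for j in range(new_len):
        center = j * (n - 1) / (new_len - 1)     # position in source
        taps = np.arange(int(np.floor(center - support)),
                         int(np.ceil(center + support)) + 1)
        weights = kernel(taps - center)
        values = samples[np.clip(taps, 0, n - 1)]
        out[j] = np.sum(weights * values) / np.sum(weights)
    return out

row = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])   # six source pixels
print(resample_row(row, 7))                      # seven destination pixels
```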

Image Rotation
A second extremely useful geometric operation is the rotation of an image. For rotation through multiples of 90°, resampling is not required unless pixels are nonsquare; a simple transfer of pixel values directly from one location to another is sufficient. However, rotation of an image through an arbitrary angle is often needed, and this does require resampling (see Fig. 5). A general, direct implementation would use filter functions for reconstructing the continuous intensity surface, and then resample this according to new, rotated pixel centers. Such an approach is, however, computationally inefficient and cumbersome to implement. A more practical algorithm is due to Paeth (1995). Here, the rotation operation to be applied anticlockwise through angle θ is decomposed into three simple shear operations. If source and destination pixel coordinates are represented as column vectors S = [s_x, s_y]^T and D = [d_x, d_y]^T, respectively, a general transformation from source to destination coordinates according to matrix M is D = M · S. The rotation matrix can be considered as the product of three shear matrices:

$$
M_{rot} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
        = \begin{bmatrix} 1 & -\tan(\theta/2) \\ 0 & 1 \end{bmatrix}
          \begin{bmatrix} 1 & 0 \\ \sin\theta & 1 \end{bmatrix}
          \begin{bmatrix} 1 & -\tan(\theta/2) \\ 0 & 1 \end{bmatrix}.
$$

Implementation of shear operations is straightforward, requiring only a shift of pixel data along one axis (effectively, resampling of translated pixels using a simple tent function) proportional to the distance along the second axis. Figure 6 illustrates the process for a clockwise rotation through 10°. Note that as a consequence of rotation, the rectangular pixel array area required to hold the image is increased, though cropping is often used to retain a rectangular subregion that does not include the rotated image boundary.



Fig. 5 Arbitrary rotation of a raster image requires resampling due to the complex overlay of source and destination pixels

Fig. 6 Rotation of a raster image using the three-shear method

For improved accuracy, rotations by angles of 90° or more are implemented by transfer of pixels for right-angle portions followed by the three-shear algorithm for the remaining portion.
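The sketch below realizes the three-shear idea for a grayscale image: a single row-shear routine (using linear, i.e., tent-filter, resampling via np.interp) is applied along x, then along y via a transpose, then along x again, with the coefficients from the decomposition above. It keeps the canvas size fixed, so corners that rotate outside the frame are simply lost, and sign conventions may need flipping depending on the coordinate system assumed.

```python
import numpy as np

def shear_rows(img, amount):
    # Shift each row horizontally by `amount` times its distance from
    # the image center, resampling with a simple tent (linear) filter.
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    x = np.arange(w, dtype=float)
    for y in range(h):
        shift = amount * (y - h / 2.0)
        out[y] = np.interp(x - shift, x, img[y], left=0.0, right=0.0)
    return out

def rotate_three_shear(img, theta):
    # Paeth-style rotation: x-shear, y-shear (a row shear of the
    # transpose), then x-shear again.
    a = -np.tan(theta / 2.0)
    b = np.sin(theta)
    sheared = shear_rows(img.astype(float), a)
    sheared = shear_rows(sheared.T, b).T
    return shear_rows(sheared, a)

img = np.zeros((64, 64))
img[24:40, 8:56] = 1.0                           # a bright horizontal bar
rotated = rotate_three_shear(img, np.radians(10.0))
```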

Other Geometric Image Operations
In addition to scaling and rotation, a number of other geometric operations requiring resampling are possible on raster image data. Considering straightforward affine transformations, the simple shear has already been outlined in its application to rotation using the Paeth method. Translation is also sometimes useful – this is effectively a phase shift of pixel data by an amount which is not necessarily an integer number of pixels. More general remapping and warping techniques are often required in certain higher-level applications (Watt and Policarpo 1998). An image may be mapped to a regular or irregular mesh, and mesh nodes manipulated to apply modifications to the image structure in a local sense. A common application of such a technique is perspective transformation, often used (with knowledge of camera parameters) to correct for perspective distortion in an image captured from a camera. Warping methods are also used in morphing: creating a smooth transition from one image to another, driven by relationships between the nodes of the mesh in both images, defined by the user. A closely related application to morphing is the synthesis of new viewpoints of a scene, given at least two known viewpoint images and a set of correspondences between mesh nodes in the source images. This is useful in three-dimensional imaging and modeling.


Fig. 7 Fast geometric transforms

Implementation of Fast Geometric Transformations
Modern commodity GPUs have evolved chiefly to accelerate three-dimensional object transformations and rendering for computer entertainment and visualization applications. The parallel processing facilities that make this possible, however, also greatly simplify implementation of very fast geometric operations on raster images – both affine transformations and more complex warps (Qureshi 2001). This can be achieved through common APIs such as OpenGL (Silicon Graphics 1992) and Microsoft DirectX (Akenine-Möller et al. 2008). A general technique for implementing 2D affine transformations of raster images is as follows (see Fig. 7):
1. Define a virtual camera, usually with an orthogonal projection and a viewport mapping to a destination pixel buffer.
2. Create a geometric entity in object space. For affine transformations, a simple quadrilateral surface is suitable.
3. Using the texture handling facilities of the API, map the source image to the geometric entity just created.
4. Transform the vertices of the geometric surface, either directly or using the vertex affine transformation facilities of the API.
Since the geometric surface is "carrying" the source image, the final rendered result will be a destination image transformed accordingly. The GPU's texture lookup filtering functionality ensures that appropriate resampling takes place automatically. General mesh warps can also be achieved directly, simply by using more complex geometry to define a suitable mesh containing internal vertices and then transforming those vertices as necessary to apply the desired warp.
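For comparison, what the GPU's filtered texture lookup performs in hardware can be sketched on the CPU: each destination pixel is inverse-mapped through the affine transform and the source is sampled bilinearly. This is a schematic only, with edge handling reduced to clamping, and the inverse matrix M_inv is assumed to be supplied by the caller.

```python
import numpy as np

def affine_warp(src, M_inv, out_shape):
    # For each destination pixel, inverse-map through the transform and
    # bilinearly sample the source - the CPU analogue of the filtered
    # texture lookup the GPU applies in step 4 above.
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    sx, sy, _ = M_inv @ pts                     # source coordinates
    x0 = np.clip(np.floor(sx).astype(int), 0, src.shape[1] - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, src.shape[0] - 2)
    fx, fy = np.clip(sx - x0, 0.0, 1.0), np.clip(sy - y0, 0.0, 1.0)
    out = ((1 - fx) * (1 - fy) * src[y0, x0] +
           fx * (1 - fy) * src[y0, x0 + 1] +
           (1 - fx) * fy * src[y0 + 1, x0] +
           fx * fy * src[y0 + 1, x0 + 1])
    return out.reshape(h, w)

# Example: rotate a small image by 10 degrees about its corner.
theta = np.radians(10.0)
M = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
result = affine_warp(np.random.rand(32, 32), np.linalg.inv(M), (32, 32))
```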



Summary
All practical image and video processing applications are built on a core set of low-level operations on digital representations of images. Some of these are applied in place at the pixel level, but others involving geometric transformations result in changes to the inherent structure of the image representation and therefore must take sampling issues into account. It is relatively straightforward to use modern graphics hardware and APIs to implement extremely fast fundamental image processing operations.

Further Reading
Akenine-Möller T, Haines E, Hoffman N (2008) Real-time rendering, 3rd edn. A K Peters, Natick
Gonzalez RC, Woods RE (2008) Digital image processing, 3rd edn. Pearson Prentice Hall, New York
Haeberli P (1993) Matrix operations for image processing. http://www.graficaobscura.com/matrix/index.html. Accessed Nov 1993
Paeth AW (1995) A fast algorithm for general raster rotation. In: Glassner AS (ed) Graphics gems. Academic, Boston
Qureshi S (2001) Image rotation using OpenGL texture maps. C/C++ User J 19:10–17
Schumacher D (1995) General filtered image rescaling. In: Kirk D (ed) Graphics gems III. Academic, Boston
Silicon Graphics, Inc (1992) The OpenGL graphics system: a specification, version 1.1
Turkowski K (1995) Filters for common resampling tasks. In: Glassner AS (ed) Graphics gems. Academic, Boston
Viggiano JAS (2004) Comparison of the accuracy of different white balancing options as quantified by their color constancy. In: Proceedings of the SPIE, vol 5301, Bellingham
Watt A, Policarpo F (1998) The computer image. Addison-Wesley, Reading


TV and Video Processing
Scott Janus
Visual and Parallel Computing Group, Intel Corporation, Folsom, CA, USA

Abstract
This chapter describes video processing algorithms commonly in use today. In addition to describing how they work at a generic level, this chapter also explains why such processing is necessary, even on today's high-definition content. Sample pictures are included that visually demonstrate the key principles of the various algorithms. The processing categories covered include deinterlacing, film mode detection, scaling, anamorphic scaling, nonlinear anamorphic scaling, hue, saturation, brightness, contrast, de-noise, detail enhancement, and frame rate conversion.

List of Abbreviations
CCD  Charge-coupled device
FPS  Frames per second
i    Interlaced, as in 1080i, referring to 1920 × 1080 interlaced content
p    Progressive, as in 720p, referring to 1280 × 720 progressive content

Introduction
The quality of video in the past few decades has improved remarkably, to the point where watching a nostalgic show from your childhood can be a surprisingly low-fidelity experience. Some of the quality improvements have come from an increase in resolution; standard definition broadcasts are slated to cease in 2009 in the USA, and DVD-Video is slowly but surely being supplanted by Blu-ray. Part of the increase in quality comes from new video compression codecs: H.264 and VC-1 offer substantial improvements in efficiency over MPEG-2 (as discussed in chapter "▶ Video Compression"). Yet despite these advances in the baseline video itself, many of the quality enhancements have come from algorithms applied to the decoded, uncompressed video. These techniques are sometimes referred to as post-processing algorithms, in that they are applied after decoding.

Few enhancements were needed in the early days of video, because the content almost always precisely matched the available displays. The format of the live broadcasts was guaranteed to match the characteristics of all the televisions capable of receiving the transmissions. Today, however, the world is a much different place. Even constrained to the traditional consumer electronics (CE) ecosystem in the early years of the twenty-first century, there was still a wide range of video formats with different resolutions, scan types, and aspect ratios. PCs make the situation more complex, in that the content can be practically any size and displayed in a window of arbitrary size. PC resolutions and refresh rates are much more varied than those supported by traditional CE interfaces. The much discussed convergence of PCs and dedicated CE video systems has steadily – yet asymptotically – progressed. Many engineers spent the previous decade working to ensure that CE


Fig. 1 Progressive scanning

video could be displayed well on PCs. Today, many engineers are working to ensure that PC/Internet video can be displayed well on CE devices. At the same time, video on mobile devices such as cell phones and mobile Internet devices (MIDs), as well as console game stations, further adds to the combinatorial complexity of the video ecosystem.

This chapter reviews many of the enhancement algorithms in place today. For many of the algorithms, there is no universal implementation, and, indeed, a great deal of intellectual property and company differentiation is tied up in the nuances of a particular implementation. However, the basic concepts are discussed in a generalized form herein.

These algorithms can be classified into three main categories. The first category contains those algorithms needed to address a fundamental difference between the video content and the display device. These differing parameters include the scanning type, aspect ratio, resolution, and frame rate. Without some form of conversion between these mismatched parameters, the video will be basically unwatchable. The second group consists of those algorithms needed to ensure that the presentation of the video on a particular display in a particular environment accurately reproduces a visual experience consistent with a reference implementation. These adjustments ensure that all the dark and bright areas are correctly visible and the colors are properly represented. The final category consists of those algorithms that strive to somehow improve upon the baseline implementation or otherwise differentiate a given product from its competitors.

Scan Conversion
There are two fundamentally different scan types: interlaced and progressive. Progressive scanning records every line of the scene for a particular instant in time. 1080p Blu-ray content and 720p HDTV broadcasts are examples of progressive content. Film content is also inherently frame-based. Being an analog physical medium, film has no real concept of rows or pixels; still, the entire scene is captured at full spatial resolution with every opening and closing of the shutter. Progressive scanning is illustrated in Fig. 1.

Interlaced scanning records or displays every other line of the scene for a given time stamp. The half frame is known as a field. The fields corresponding to the even-numbered lines of the scene are unsurprisingly known as even fields, and the odd-numbered lines compose the odd fields. Legacy analog television and HDTV's 1080i format are examples of interlaced scanning. Interlaced scanning has historically been used to address the technological bandwidth constraints of the time. It is illustrated in Fig. 2.

At first glance, it may seem that having every other line of the video missing would produce clearly visible artifacts. After all, half the content is missing! However, when interlaced content is displayed on an interlaced monitor (which includes almost every standard definition television made in the twentieth century), the human psychovisual system's perception of the TV's phosphor decay results in an acceptable presentation of the content. The missing lines are not obvious.


Fig. 2 Interlaced scanning (fields 0 through 3)

Fig. 3 3:2 pulldown (24p source content converted to 60i 3:2 pulldown content)

The fact that there are two fundamentally different scanning techniques means that some type of conversion must take place if one type of content is displayed on a different type of display. This problem cropped up in the early days of television as engineers considered how to transmit progressive film content over the interlaced-only television broadcast system. In geographies using a 60 Hz interlaced scan rate, the 24 FPS film content is converted using a technique known as 3:2 pulldown. The name comes from the fact that the first frame is sampled three times and the second frame is sampled two times to create fields at the necessary cadence. This procedure is shown in Fig. 3. A similar technique known as 2:2 pulldown is used to convert 24 FPS content to 50 fields per second for other geographies. Telecine (Matchell 1982) is the generic term for converting film content to interlaced video.

With the advent of progressive displays – most notably computer monitors – in the closing years of the twentieth century, the problem of deinterlacing content became widespread. Merely displaying the interlaced content natively is inadequate, as the human eye can easily see that half of the data is missing. The missing rows of data must be filled in with something. It is impossible to exactly recreate the missing fields, as that data was never captured. Instead, we must approximate the missing information.

One simple approach is to estimate the missing field by interpolating from the pixels directly above and below each absent pixel (De Haan and Bellers 1998). This approach is known as bobbing and is illustrated in Fig. 4. It is easy to implement but has noticeable artifacts around high-contrast, static images such as subtitles. Such regions tend to flicker, due in part to the fact that the odd and even fields are sampled at different spatial locations.

Another simple deinterlacing algorithm is to combine two adjacent odd and even fields into a single frame, as shown in Fig. 5. Weaving, as it is known, works well with static images. However, regions of large movement produce an artifact known as combing or feathering: distinctive horizontal artifacts can be seen because the two fields, which are samples of two different instants in time, are presented at the same time. Note that whereas bobbing converts 60 fields per second into 60 frames per second, weaving creates 30 frames per second.

A more sophisticated approach is to apply bobbing to regions of the screen with high motion and weaving to regions of the screen with little or no movement. This adaptive deinterlacing can be quite


Fig. 4 Bobbing (fields 0 and 1 expanded into frames 0 and 1)

Fig. 5 Weaving (fields 0 and 1 combined into frame 0)

effective, as it discriminately applies bob or weave to minimize the artifacts of each algorithm, as demonstrated in Fig. 6. The size of the regions varies based on the implementation; they can be as small as a pixel, in which case the algorithm is known as pixel-adaptive deinterlacing. Generally speaking, the smaller the region, the more expensive it is to implement the algorithm. Similarly, expense also increases with the range of motion the algorithm can detect. An even more advanced approach is to use the motion data of adjacent fields to reconstruct a completely new frame. The position of moving objects in the missing field is calculated from the references and placed appropriately. This motion-compensated approach is quite complex to implement.
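As a concrete illustration of the two simple deinterlacers described above, the following NumPy sketch builds full frames from half-height fields. Field parity conventions vary between systems, so treat this as illustrative rather than as any particular product's implementation:

import numpy as np

def weave(even_field, odd_field):
    """Interleave two half-height fields into one full frame."""
    h, w = even_field.shape
    frame = np.empty((2 * h, w), dtype=even_field.dtype)
    frame[0::2] = even_field        # even lines come from the even field
    frame[1::2] = odd_field         # odd lines come from the odd field
    return frame                    # static scenes look perfect; motion combs

def bob(even_field):
    """Expand one even field to full height, interpolating missing lines."""
    f = even_field.astype(np.float64)
    h, w = f.shape
    frame = np.zeros((2 * h, w))
    frame[0::2] = f
    below = np.vstack([f[1:], f[-1:]])     # repeat the last line at the edge
    frame[1::2] = (f + below) / 2.0        # average of line above and below
    return frame

A motion-adaptive deinterlacer would compute a per-region (or per-pixel) motion measure, for example the absolute difference between same-parity fields, and blend between the bob and weave outputs accordingly.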


Fig. 6 Advanced deinterlacing (fields 0 through 3 adaptively combined into frames 1 and 2)

Fig. 7 Incorrect deinterlacing (60i 3:2 pulldown content woven into 30p content)

Film Mode Detection
It is possible that the interlaced content harbors progressive content, such as a telecined movie. Applying the aforementioned deinterlacing algorithms to such content produces suboptimal results, an example of which is shown in Fig. 7. The sample illustrates the problem when applying weaving, but the problem applies to all the algorithms. The best approach is to monitor the content and compare adjacent fields to see if they originate from the same progressive frame. If done properly, the original progressive frames can be completely recovered from the interlaced content and optimally displayed, as shown in Fig. 8. This process is known as film mode detection (and correction) or inverse telecine.
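A hedged sketch of the cadence detection idea: in 3:2 pulldown, one field in every group of five is a repeat of the field two positions earlier (same parity), so monitoring same-parity field differences reveals the cadence. The function names and the threshold below are arbitrary assumptions for illustration:

import numpy as np

def field_difference(a, b):
    """Mean absolute luma difference between two fields."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))

def looks_telecined(fields, threshold=1.0):
    """Heuristic test on a short window (five or more fields) for a 3:2 repeat."""
    diffs = [field_difference(fields[i], fields[i + 2])
             for i in range(len(fields) - 2)]
    # telecined material shows one near-zero same-parity difference per
    # group of five fields, while the other differences reflect real motion
    return min(diffs) < threshold < np.median(diffs)

Once the cadence is locked, the deinterlacer can weave the correct field pairs together and discard the repeated fields, recovering the original 24p frames.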

Frame Rate Conversion
It is possible for the frame rate of the video content to differ from the refresh rate of the display device showing it. For instance, a Blu-ray movie may be recorded at 24 FPS and viewed on a PC with an 85 Hz refresh rate active. Obviously, some sort of conversion between the rates must occur for the video to be viewed in an acceptable fashion. The simplest approach is to replicate frames, as shown in Fig. 9. This approach provides reasonable quality, although an alert eye can detect that each unique film frame is presented for a different duration of time. The variable duration arises because the display refresh rate may not be an even multiple of the content's frame rate. In the case of 24 FPS content and a 60 Hz display, the video frames are alternately presented for three and two display refreshes. This cadence results in each video frame being presented for


Fig. 8 Correctly recovering progressive content (60i 3:2 pulldown content restored to 24p)

Fig. 9 Frame replication (24 FPS content mapped onto an 85 Hz display)

Fig. 10 Frame interpolation (24 FPS content interpolated for an 85 Hz display)

an average of 60/24 = 2.5 display refreshes per video frame. In the case of 24 FPS content and an 85 Hz display, the cadence is quite complex, as 85 is not an integer multiple of 24. A more sophisticated approach is not merely to replicate frames but to create new intermediate frames. This is a complex process; merely performing a linear interpolation of the intermediate frames will only produce blurred content, as conceptualized in Fig. 10. Instead, some motion-compensated approach must be used. This can be done on a global (picture-wide) basis or at a finer grain. Typically, these more advanced frame rate conversion algorithms are used in systems where the display refresh rate is an integer multiple of the content frame rate. For instance, many of the newer HDTVs take in a 24 FPS signal from Blu-ray movies and internally convert it to the display's native 120 Hz refresh rate. The increased number of frames can decrease the judder artifacts inherent in the cinematic content's low frame rate and help mitigate motion blur issues experienced by some display technologies.
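The replication cadence itself is easy to compute: each display refresh shows the source frame whose presentation interval covers that refresh. A small sketch (the function name is our own):

def replication_cadence(src_fps, refresh_hz, n_refreshes):
    """Source-frame index shown on each display refresh."""
    return [(i * src_fps) // refresh_hz for i in range(n_refreshes)]

print(replication_cadence(24, 60, 10))  # [0, 0, 0, 1, 1, 2, 2, 2, 3, 3] - 3:2
print(replication_cadence(24, 85, 12))  # irregular cadence on an 85 Hz panel

The 60 Hz case reproduces the familiar 3:2 pattern, while the 85 Hz case shows why the per-frame display durations vary.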

Aspect Ratio Conversion
Often, the aspect ratio of the content differs from the aspect ratio of the display. Most retail video has an aspect ratio of 4:3 or 16:9. Older TVs have a 4:3 aspect ratio; most newer ones have a 16:9 ratio. Playing 4:3 content on a 16:9 display or 16:9 content on a 4:3 display requires some form of conversion. The aspect ratio topic is further complicated by the wide availability of Internet video, which can have practically any aspect ratio, and by the fact that video is often played on PCs in a window of arbitrary size. Nevertheless, the techniques described below can be used to span any mismatch between content and display aspect ratios. Although I will use the term display aspect ratio in the following entries to refer to the properties of a monitor's entire screen, the concepts generalize to smaller target regions of video, such as might be seen in picture-in-picture scenarios or when watching windowed video on a PC.


Fig. 11 Full screen stretching

Fig. 12 Cropping

Fig. 13 Letterboxing

One simple approach is to stretch the video to completely fill the screen, as demonstrated in Fig. 11. This approach is popular with many consumers, as the entire screen is active. Video purists decry the technique, however, because the aspect ratio of the content is distorted. For instance, circles are deformed into ovals.

Another approach is to cut out a section of the content that matches the display's aspect ratio. This cropping technique (Fig. 12) makes use of the display's entire real estate while preserving the content's aspect ratio. However, portions of the video are lost. This artifact is most notable in scenes containing two people facing each other at the extreme edges of the screen: cropping can remove both of the actors from the viewable picture. The cropping of key elements can be mitigated to some extent by moving the cropping window back and forth across the content to the areas of interest. This technique is known as pan and scan. Note that pan and scan is only practical with human control, so it is only suitable for processing performed in advance, such as preparing wide-screen content to be broadcast in a 4:3 format.

Letterboxing is a technique that displays the entire scene on the screen while preserving the original aspect ratio. This feat is accomplished by uniformly scaling the content to match the most constraining dimension and filling the remainder of the screen with black, as illustrated in Fig. 13. This approach is the preferred one for videophiles; some people do not like it because it leaves regions of the screen unused. Letterboxing refers to the insertion of horizontal black bars above and below the content when adapting content to a display with a wider aspect ratio. When adapting content to a narrower aspect ratio, black columns are inserted on either side; this technique is known as pillarboxing.
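The geometry behind letterboxing and pillarboxing is a one-line calculation: scale uniformly by the most constraining dimension, then center the result. A small sketch (the function name is our own):

def fit_preserving_aspect(src_w, src_h, dst_w, dst_h):
    """Return (x, y, w, h) of the scaled video inside the display."""
    scale = min(dst_w / src_w, dst_h / src_h)   # most constraining axis
    w, h = round(src_w * scale), round(src_h * scale)
    x, y = (dst_w - w) // 2, (dst_h - h) // 2   # black bars fill the rest
    return x, y, w, h

print(fit_preserving_aspect(1920, 1080, 1024, 768))  # (0, 96, 1024, 576): letterbox
print(fit_preserving_aspect(720, 540, 1920, 1080))   # (240, 0, 1440, 1080): pillarbox

Replacing min with max in the first line turns the same function into the cropping calculation: the video then overfills the screen, and the negative offsets indicate how much is cut away on each side.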


Fig. 14 Wide-screen cropping

Fig. 15 Nonlinear anamorphic scaling

People are sometimes surprised to find letterboxes appearing on their wide-screen TVs when watching movies. This scenario occurs because cinematic aspect ratios are completely disconnected from video aspect ratios. The most common film aspect ratios (such as 1.85:1 and 2.35:1) are wider than 16:9, so letterboxing is still necessary.

A special case of cropping can be used when dealing with letterboxed content. If wide-screen content is letterboxed into a narrow-screen transmission or storage format and subsequently displayed on a wide-screen TV, the cropping window can be set to the display aspect ratio. The resulting image completely fills the screen while preserving the aspect ratio (Fig. 14).

In all the aspect ratio conversion schemes discussed so far, some scaling or resampling of the content is typically required. All of the scaling for the above techniques is linear, by which I mean that the scale factor is consistent across the picture. Stretching content to fill the screen uses anamorphic scaling: the scale factor in the horizontal direction is different from the scale factor in the vertical direction. Still, it is linear in that the scale factor remains constant in each direction. Anamorphic scaling is common in the cinema. It is also used in DVD-Video playback, because the video is sampled using non-square pixels. Typically, DVD-Video content is stored as 720 × 480 samples for both 4:3 and 16:9 content. For each aspect ratio, a different anamorphic scaling must be used to properly scale the 720 × 480 content to the square pixels of today's displays.

Scaling always results in some degradation of image quality. The topic is worthy of a book itself; for this brief overview, let us just note that unnecessary scaling should be avoided. Competitive solutions today use sophisticated algorithms involving many source samples (or taps) per single destination sample. Simple bilinear scaling is usually inadequate. The most sophisticated algorithms use temporally adjacent frames to increase the effective resolution of the current frame.

This brief segue on scaling leads us to the final aspect ratio conversion technique. It has many different brand names but no common generic name; I refer to it as nonlinear anamorphic scaling. In this technique, the vertical scale factor is constant, but the horizontal scale factor varies across the picture (Intel Corporation, Intel Clear Video Technology). This allows the content aspect ratio to be correctly reproduced in the center of the picture at the expense of exaggerated distortions at the edges of the screen, as seen in Fig. 15. This approach utilizes the entire real estate of the screen. As such, it is seen by some as an optimized compromise between cropping and stretching. However, it has very noticeable artifacts. For instance, a car driving across the frame will start with wildly distorted tires that morph into proper circles as it reaches the center region and then stretch back out as it reaches the far edge of the screen.
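An illustrative sketch of one possible mapping of this kind: an odd cubic maps normalized destination positions to source positions, so the horizontal scale factor is closest to correct mid-screen and grows toward the edges. The cubic and its gain parameter are arbitrary choices for this sketch, not any vendor's algorithm:

import numpy as np

def nonlinear_column_map(src_w, dst_w, a=1.4):
    """Source column sampled by each destination column (dest -> source).

    With a > 1 the map advances through the source quickly at the centre
    (little extra magnification there) and slowly at the edges (heavy
    stretch); a = 1 degenerates to an ordinary linear anamorphic stretch.
    """
    u = np.linspace(-1.0, 1.0, dst_w)          # normalised dest position
    v = a * u + (1.0 - a) * u ** 3             # odd cubic with v(+/-1) = +/-1
    return ((v + 1.0) / 2.0) * (src_w - 1)     # back to source pixel coords

cols = nonlinear_column_map(720, 1280)
stretch = 1.0 / np.gradient(cols)              # per-column magnification
# `stretch` is smallest mid-screen and largest at the left and right
# edges, matching the tire-distortion artifact described above.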


Fig. 16 Adjusting brightness

Fig. 17 Adjusting contrast

Basic Adjustments
Even in the early days of a primarily homogeneous video environment, some basic video processing controls were necessary to account for variations in products from different manufacturers and to adjust for ambient lighting conditions. These adjustments were necessary to get a perceptually uniform behavior across the different products and installations. In other words, by properly calibrating the TV using these basic adjustments, a person would ideally perceive the same image regardless of the living room or showroom they happened to be in. Of course, in practice, this precise level of calibration rarely happens, as the average consumer does not understand how to properly adjust all the settings to match the reference behavior.

Brightness and Contrast
Brightness and contrast are two commonly used but commonly misunderstood controls (Keith 2007). Both operate solely on the luma components. Brightness is more technically known as black level adjustment; contrast refers to the gain of the picture. Adjusting the brightness shifts the entire range of output values: decreasing the brightness lowers both the black level and the white level, while increasing it raises both. An example of brightness adjustment is shown in Fig. 16. Contrast, however, alters the distance between the black and white levels. Manipulating the contrast alone will not alter the black level but will raise or lower the white level, and all intermediate values are stretched accordingly. An example is shown in Fig. 17.


Fig. 18 Adjusting saturation

Fig. 19 Adjusting hue

Loosely speaking, the combined effect of brightness and contrast can be expressed as:

output_value = (contrast × input_value) + brightness
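Applied per pixel to 8-bit luma, with clipping to the valid range, the formula looks like this (a direct sketch of the equation above):

import numpy as np

def adjust_brightness_contrast(luma, contrast, brightness):
    out = contrast * luma.astype(np.float64) + brightness
    return np.clip(out, 0, 255).astype(np.uint8)   # stay inside 8-bit range

y = np.array([[16, 64, 128, 235]], dtype=np.uint8)
print(adjust_brightness_contrast(y, 1.2, -10))     # [[  9  66 143 255]]

Note that the white value 235 is driven past the top of the range and clipped; real implementations often work in extended-range arithmetic precisely to control where such clipping happens.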

Saturation and Hue
Hue and saturation manipulate the color. They operate solely on the chroma components and have no impact on luma. Saturation refers to the intensity of the chroma components; saturation adjustment is accomplished by multiplying the chroma values by a constant. Saturation adjustment examples are shown in Fig. 18. Hue adjustment rotates and shifts all the colors; it can be thought of as a rotation of the hue ring, as demonstrated by the examples in Fig. 19.
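A sketch of both chroma operations in 8-bit YCbCr, where the neutral chroma point is 128: saturation scales the (Cb, Cr) pair away from or toward neutral, and hue adjustment rotates the pair about it. The rotation direction convention is an assumption here:

import numpy as np

def adjust_chroma(cb, cr, saturation=1.0, hue_deg=0.0):
    t = np.deg2rad(hue_deg)
    b = (cb.astype(np.float64) - 128.0) * saturation   # scale about neutral
    r = (cr.astype(np.float64) - 128.0) * saturation
    cb2 = b * np.cos(t) - r * np.sin(t) + 128.0        # rotate the hue ring
    cr2 = b * np.sin(t) + r * np.cos(t) + 128.0
    clip = lambda x: np.clip(x, 0, 255).astype(np.uint8)
    return clip(cb2), clip(cr2)

Setting saturation to zero collapses every pixel to neutral chroma, producing a grayscale picture, which is a handy sanity check for the implementation.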

Advanced Adjustments
For quite some time now, television and display manufacturers have been adding features to their products that are designed to make the video appear better than a reference display. In this context, better is subjective; basically, the intent is to differentiate from competing products. If all displays exactly matched the reference display, then consumers presumably would just buy the cheapest unit. However, if one display presents the video in a fashion that somehow looks more pleasing to the eye than the others, then the consumer might be willing to pay more money for it. Even when dealing with the latest high definition video formats, there is still a lot of room for improvement. As discussed in chapter "▶ Video Compression," consumer video is compressed in a lossy fashion. Most of the content is stored as 8 bit per sample 4:2:0, which has limited dynamic range and


color fidelity. There is a distinguishable gap between this content and the limits of human perception. Work is under way to narrow this gap by increasing the bit depth and color gamut of the compressed video, but in the meantime, advanced video enhancement algorithms strive to improve today's content.

Adaptive Brightness and Contrast
One such class of algorithm is adaptive contrast enhancement. This feature generates a histogram of the image and automatically adjusts the contrast to provide an ideal presentation for the current scene. Some hysteresis is needed to ensure the contrast does not change wildly with every single frame, and care must be taken to make sure that intentionally dark scenes remain dark. Advanced versions of the algorithm can adaptively adjust the contrast in various regions of the screen.

Another technique is to monitor the ambient lighting conditions and adjust the brightness (and possibly contrast) of the image to match the characteristics of the human perceptual system. For instance, in a dark room, the brightness should be lower than when the same content is being viewed in a bright room. This approach does not necessarily monitor the video content but instead modulates parameters based on external conditions.
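A toy version of the histogram-driven approach, with simple exponential smoothing standing in for the hysteresis mentioned above; the percentiles and smoothing factor are arbitrary assumptions:

import numpy as np

class AdaptiveContrast:
    def __init__(self, smoothing=0.9):
        self.gain, self.offset = 1.0, 0.0
        self.smoothing = smoothing                 # hysteresis strength

    def process(self, luma):
        lo, hi = np.percentile(luma, [1, 99])      # robust black/white points
        target_gain = 255.0 / max(hi - lo, 1.0)
        target_offset = -lo * target_gain
        a = self.smoothing                         # blend with previous frame
        self.gain = a * self.gain + (1 - a) * target_gain
        self.offset = a * self.offset + (1 - a) * target_offset
        out = self.gain * luma.astype(np.float64) + self.offset
        return np.clip(out, 0, 255).astype(np.uint8)

Guarding against stretching intentionally dark scenes, as the text cautions, would require additional scene-level logic, for example capping the gain or detecting a low average picture level.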

Video Artifact Removal
As previously noted, video compression is lossy and introduces artifacts. These artifacts can be mitigated by applying clean-up algorithms to the decoded images.

Deblocking attempts to smooth the boundaries between block-shaped regions with different average values. The human eye is surprisingly good at detecting these regions. Newer video codecs (ISO/IEC 14496-10; SMPTE 421M VC-1) include deblocking filters as part of the decoding algorithm, but not all video streams encoded in these formats take full advantage of the deblocking capabilities. Also, older codecs such as MPEG-2 do not inherently support deblocking, so applying it as a post-processing stage can noticeably improve the picture quality. An example of deblocking is shown in Fig. 20.

High-frequency information is often discarded as part of the compression process. Detail or sharpness algorithms can reproduce some of these characteristics, as demonstrated in Fig. 21. Poorly implemented sharpness filters can create new artifacts, such as ringing around high-frequency details. Good filters recreate original detail without introducing new problems.

High-frequency noise is always a problem with analog content and even with most digital content. The CCDs that digitize the real world introduce noise, as do the compression algorithms. Noise removal algorithms can remove these artifacts, as seen in Fig. 22, and create a cleaner picture. Care must be taken to remove only the unwanted noise; just applying a general blurring filter will mitigate the noise but will also lose fine details that are part of the desired picture.
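As one concrete example from this family, a minimal unsharp-mask sharpener: boost the difference between the image and a blurred copy of itself. The kernel size and amount are arbitrary; overdriving the amount produces exactly the ringing artifacts cautioned against above:

import numpy as np

def box_blur(img, k=3):
    """Naive k x k box blur with edge replication."""
    pad = k // 2
    p = np.pad(img.astype(np.float64), pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def sharpen(img, amount=0.5):
    detail = img.astype(np.float64) - box_blur(img)   # high-frequency residue
    return np.clip(img + amount * detail, 0, 255).astype(np.uint8)

The same box_blur, applied on its own, is the crude "general blurring filter" the text warns about for noise reduction: it suppresses noise but also destroys genuine fine detail, which is why practical de-noise filters are edge- and motion-adaptive.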

Advanced Color Processing
There are many different techniques in contemporary products that strive to improve the color response of the system. These basically strive to create more intense or vibrant colors than the reference implementation. Some of these algorithms try to boost colors that have limited bandwidth in a particular domain.


Fig. 20 Deblocking

Fig. 21 Sharpening

Fig. 22 Noise reduction

Some take advantage of the broader range of colors that the new wide-gamut displays are capable of displaying. Often, these color enhancements are not technically correct, in that they produce colors that differ from a defined specification, such as BT.709. However, many viewers prefer the enhanced colors to the baseline implementation.


One interesting class of algorithms is skin tone detection. These algorithms attempt to detect regions of the screen that contain samples of human skin and adjust the color of the skin independently of the rest of the picture. These can arguably make the depicted people look better. Again, this is straying from an accurate reproduction of the original scene; presumably, a properly calibrated system would depict any flesh tones just as accurately as the rest of the scene. There is a wide range of different flesh tones out there in the real world, so it is not always clear whether there is a universal advantage to this type of processing.

Conclusion
We are in the middle of an amazing period of video quality improvement. Even as the transition from standard definition to high definition continues, a widespread proliferation of high-quality HDTVs and high-resolution PC monitors has sparked an arms race of competitive video enhancement features. As long as video storage techniques remain restricted to reproducing a significant subset of what humans can perceive – which seems likely to be the case for decades to come – there will always be a market to enhance that video to create a perceptually improved experience. It is also worth noting that the recent influx of relatively low bit-rate content, such as Internet-streamed video à la YouTube, provides particularly fertile ground for new video processing techniques to emerge.

Directions of Future Research
The field of video processing research remains active, with new capabilities being introduced with the annual release of new televisions by major manufacturers. These features include more sophisticated implementations of the basic techniques described in this chapter, as well as adaptive mechanisms that attempt to automatically improve subjective video quality. The increasing availability of displays with wider gamuts and higher refresh rates has also prompted research into mapping existing video standards into these new domains. Also of interest are techniques for improving the display of low bit-rate and low-resolution user-generated content, which has become widely available due to the proliferation of video-capable phones and Internet sites such as youtube.com.

Further Reading
De Haan G, Bellers EB (1998) Deinterlacing – an overview. Proc IEEE 86:1839–1857
Intel Corporation. Intel Clear Video Technology controls questions and answers. http://www.intel.com/support/graphics/sb/CS-029863.htm
ISO/IEC 14496-10 Coding of audio-visual objects – part 10: advanced video coding
Janus S (2002) Video in the 21st century. Intel Press, Hillsboro
Keith J (2007) Video demystified: a handbook for the digital engineer. Newnes, New York
List P et al (2003) Adaptive deblocking filter. IEEE Trans Circuits Syst Video Technol 13(7):614–619
Matchell R (1982) Digital techniques in film scanning. IEE Proc Sci Meas Technol 129:445–453
Poynton CA (1996) Technical introduction to digital video. Wiley, New York
Robin M, Poulin M (2000) Digital television fundamentals. McGraw-Hill, New York
SMPTE 421M VC-1 compressed video bitstream format and decoding process


Security Imaging: Data Hiding and Digital Watermarking
Daniel Taranovsky
Advanced Micro Devices, Markham, ON, Canada

Abstract
Securely embedding data in digital images has important applications for covert communication and copyright protection. The specific algorithm used to embed information depends on the characteristics of the application. Two prominent techniques for embedding information are least-significant-bit modification and discrete-cosine-transform modification. Detecting modifications to a high bit-rate signal such as a digital image can be difficult and is the subject of cross-disciplinary research in statistics and signal processing. This chapter gives an overview of the important issues associated with security imaging, with specific examples of techniques to embed and detect hidden information.

List of Abbreviations
DCT    Discrete cosine transform
LSB    Least significant bit of a binary number
YCrCb  A representation of an RGB color in terms of its luminance (Y) and chrominance (Cr and Cb) components

Introduction
Digital photography, social networking, and ubiquitous Internet access have made the distribution of digital media commonplace. This trend creates new problems and opportunities and warrants the development of technology for ownership control and the secure embedding of information into media assets. Traditional computer networks can offer a brute-force security approach that limits access privileges, but this is only effective as long as there is no breach of the network. An alternative is to implement embedded security that "travels with the content" in a sense. Data hiding technology, such as digital watermarking, can protect content without explicitly fire-walling access.

Covert communication channels can be established with data hiding in digital images. The hiding of this data in images shares theoretical foundations with digital rights management. Digital images are so common that it is impossible to have an inventory of all images shared, posted, stored, and otherwise in existence. This provides ample opportunities for sharing images containing hidden data without arousing suspicion. Images are e-mailed, made available on servers for download, and posted on websites every day. The act of sharing or viewing a digital image is part of the way people communicate with each other on a regular basis. Even if one were inclined to be suspicious of digital image sharing activity, the volume of content available makes it impractical to effectively search for covert communication.

Digital images are high bit-rate signals, making them excellent media for data hiding. The human eye also has limitations in the detail it can perceive, making it possible to alter the digital representation of an image with imperceptible difference to a viewer.


Fig. 1 Image with paragraph of text secretly embedded (a) and the original image (b)

Data hiding refers to the process of securely embedding information in some cover medium and is the focus of this section. Different applications have unique security requirements that reflect the usage model of the embedded information, while there are overarching concepts that apply to all cases. Steganography traditionally refers to data hiding for the purpose of covert communication, and digital watermarking applies data hiding to digital rights management. In terms of problem definition, we refer to steganography as the general data hiding problem and digital watermarking as a specific instance of steganography.

Principles of Steganography
We define a steganography embedding algorithm as a function:

f(M, C, k) → Z    (1)

C is the original digital image (referred to as the cover object or cover image) used to embed data M.
M is the message being embedded, referred to as the stego-message.
Z is the resulting image after embedding message M in cover object C, referred to as the stego-object or stego-image. Z is intended to be indistinguishable from C.
k is some digital key (referred to as the stego-key) used to extract and decipher the message M from the stego-image. k should only be known by the originator and intended recipient of M.

Cryptography is the study of encrypting data, while steganography techniques attempt to hide data. Cryptography and steganography are often used together: rather than extracting a legible message from the stego-object, the message is encrypted prior to embedding. The mere detection of a covert message would be considered a breach of the steganographic algorithm, but encryption serves as additional protection for the message in the event of a successful attack on the stego-object. Figure 1a is an example of a stego-object with an encrypted paragraph of text embedded.


Fig. 2 Steganography in practice (the stego-message M is embedded in cover image C by stego-function f under stego-key k, producing stego-image z; after transmission, with possible noise, manipulation, or attack, the received stego-image z′ is passed through the inverse stego-function f⁻¹ with the stego-key to recover the received stego-message M′)

Characteristics of Steganographic Algorithms
Figure 2 illustrates the steganography process. There are different steganography functions f, varying with the target application's criteria. Improving one criterion is typically achieved by compromising another (Chandramouli et al. 2004; Provos and Honeyman 2003).

Capacity
Capacity refers to the size of the message that can be embedded in the cover object. A message can be compressed to fit in a smaller cover object at the expense of computational complexity. Algorithms that offer higher capacity reduce the bandwidth and time required to transmit the stego-object. Low-capacity algorithms require larger distortions to the cover image as the message size increases, resulting in compromised perceptual transparency (Moulin and Mihcak 2002).

Perceptual Transparency
Perceptual transparency is the degree to which the algorithm distorts the original cover object. As a greater proportion of the stego-image signal is the message, higher distortion and higher capacity are observed. Steganographic algorithms should apply distortion characteristic of signal noise. A typical expectation of perceptual transparency is that the stego-object is visually indistinguishable from the cover object; an algorithm that offers low perceptual transparency is not useful for any application. A higher-order requirement is that statistical analysis of the stego-image signal should not reveal any evidence of an embedded message. While an image may appear to be identical, statistical analysis of the color data distribution can reveal anomalies when compared to everyday images. Section "Steganalysis Techniques" describes these cases in more detail.

Security
Security refers to the difficulty an attacker has in determining whether a message is embedded in the cover object. A digital watermark, for example, may not alter the cover image and may be impossible to remove but is clearly embedded in the object. In this case the perceptual transparency and robustness are high, but the security of the communication channel is low. Security in this sense is defined as the level of "invisibility" of the message. An algorithm that is not perceptually transparent is not secure, but the inverse is not necessarily true. Some digital watermarking applications may be secure, but all covert communication must be secure to be effective.

Robustness
Robustness refers to how difficult it is to destroy, modify, or remove the stego-message without damaging the cover medium. Compressing, resizing, or cropping images are transformations typically applied to digital images. Some applications require hidden information to remain intact and extractable


Table 1 Characterizing data hiding applications

Image data hiding application    Prime algorithm considerations
Digital watermarking             Robustness
Covert communication             Security
Photo annotations                Capacity

even after such geometric modifications. If an attacker must damage the image beyond recognition to remove the stego-message, then the algorithm is robust. If the quality of the cover object after an attack is not important, then the problem becomes trivial – simply deleting the stego-image will destroy both the cover and message objects. Steganographic algorithms are designed to force attackers to severely damage the image while removing the message. In the case of covert communication, a good steganographic algorithm will make determining the presence of a message no more successful than a random guess. The highest level of security breach in covert communication is if the attacker is able to read the stego-message and replace it with an alternate message. In this case, not only has the presence of covert communication been identified (a security breach on its own), but the robustness of the algorithm has also been compromised.

Complexity
Complexity refers to the computational complexity of the algorithm. Long messages embedded in small images may require significant computational resources to encode and decode. Depending on the application, this may or may not be a concern. For example, an application that embeds precise time stamp information or requires real-time copyright verification may be sensitive to the computational complexity of the algorithm.

Applications of Steganographic Algorithms
Table 1 summarizes the principal characteristics of data hiding applications.

Digital Watermarking
A proprietary image or logo can be protected by having ownership information bound to it. If the distribution rights are violated, then the offending image would have information on who violated the usage agreement bound to it. Other security features can include download information and time stamping to track the origin of ownership violations. Users typically know (or are told) a digital watermark is embedded in the image to deter copyright infringement. Reading the watermark message may not necessarily be considered a security breach. The content is expected to remain perceptibly unaltered after the digital watermark is applied, and it should be impossible to alter the watermark even after cropping, resizing, and other geometric image processing.

Covert Communication
Two parties communicating covertly with one another will want to hide the existence of messages. Depending on the communication channel and the expected mode of attack, robustness may not be an issue. If a message is embedded in an image on the World Wide Web, then the recipient will download the image without any transformation or processing applied to it. However, if the image is sent over a protected server that may selectively attack suspect image attachments, then robustness may become a concern. High capacity and high security are the most important algorithm considerations to avoid detection.


Fig. 3 Insertion of 011 in the least significant bit of pixels A, B, and C's red ("R") channel:

Before:
  Pixel A   R: 1011 0101  G: 0110 1001  B: 1000 1100
  Pixel B   R: 0011 0010  G: 0110 1001  B: 1000 1100
  Pixel C   R: 0101 1001  G: 0110 1001  B: 1000 1100
After:
  Pixel A′  R: 1011 0100  G: 0110 1001  B: 1000 1100
  Pixel B′  R: 0011 0011  G: 0110 1001  B: 1000 1100
  Pixel C′  R: 0101 1001  G: 0110 1001  B: 1000 1100

Photo Annotations
Information about archived photos may be most conveniently stored as embedded information. By embedding the information in the image with steganography techniques, the annotations do not require separate storage maintenance, cannot be lost or destroyed independently of the image, and access to them is controlled by those who have the stego-key. Information about the contents of the image, its contextual significance, or other information may be useful for search engines and people with need-to-know privileges.

Steganography Techniques
This section discusses data hiding principles and techniques. A comprehensive survey of steganographic algorithms is beyond the scope of this text. We focus on LSB (least-significant-bit) and transformation-based techniques, which are the basis for many application-specific implementations (Chandramouli et al. 2004; Provos and Honeyman 2003).

Hiding in the Spatial Domain: Least-Significant-Bit Modification
LSB techniques involve selectively modifying the least significant bits of a pixel's color. The bits modified can be a single least significant bit in a single channel or multiple bits in one or more RGB color channels. As more bits are modified, perceptual transparency decreases while capacity increases. An example of least-significant-bit insertion is shown in Fig. 3.

Some file formats (such as .GIF) use an indexed palette to color pixels. For example, each pixel is an 8-bit index referencing one entry in a palette of 2⁸ = 256 24-bit colors. Storing the stego-message in the least significant bit of the pixel's index can result in high degradation of perceptual transparency, since adjacent colors in the palette are not guaranteed to be contiguous in color space. Embedding the message in the palette (rather than the palette index) retains more perceptual transparency, but the size of the embedded message is limited to the size of the palette.

Another consideration is which pixels in the image to modify. Cover images with many "color pairs" that differ by a single bit, preferably in the same spatial locality, are best suited for least-significant-bit modification. Regions of flat color are prone to show discrepancies in neighboring pixels. Some algorithms may select random pixel locations, intelligently identify favorable image regions, or use hard-coded locations.

The principal deficiency of least-significant-bit algorithms tends to be robustness. Resizing, cropping, signal noise, and other basic modifications usually destroy the stego-message. Stego-messages can also be removed by simply overwriting all least significant bits in the image.
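A minimal sketch of sequential LSB embedding in the red channel, matching the Fig. 3 example; a real system would choose pixel locations from the stego-key rather than walking the image in order:

import numpy as np

def embed_lsb(cover, bits):
    """Write a '0'/'1' string into the red-channel LSBs, pixel by pixel."""
    stego = cover.copy()
    w = stego.shape[1]
    for i, b in enumerate(bits):
        y, x = divmod(i, w)                                 # row-major walk
        stego[y, x, 0] = (stego[y, x, 0] & 0xFE) | int(b)   # clear, then set
    return stego

def extract_lsb(stego, n_bits):
    w = stego.shape[1]
    out = []
    for i in range(n_bits):
        y, x = divmod(i, w)
        out.append(str(stego[y, x, 0] & 1))
    return "".join(out)

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
stego = embed_lsb(cover, "011")
assert extract_lsb(stego, 3) == "011"    # message survives the round trip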


Fig. 4 An image and one 8 × 8 pixel block (a, d), its compressed discrete-cosine-transform representation (b, e), and the resulting decoded image (c, f)

Hiding in the Frequency Domain: DCT Coefficient Modification
A digital image can be considered a multidimensional signal. Every pixel is represented as a three-channel (RGB or YCbCr) value on a two-dimensional image plane. Each channel can be mapped over the surface of the plane and treated as a distinct two-dimensional signal (Cox et al. 1997; Mohanty et al. 2000; Westerfeld 2001). Signals can be represented as a sum of oscillating functions of varying frequencies. The DCT (discrete cosine transform), for example, represents a signal as a sum of cosine functions. The JPEG image encoding algorithm divides an image into 8 × 8 blocks of pixels and encodes each block as its DCT coefficients. This is particularly effective for images since there is a high degree of correlation among neighboring pixels: if one pixel is red, there is a high likelihood that the next pixel is close to red. Therefore, in an 8 × 8 block of pixels, the color variations can usually be represented in the low-frequency bands, while high-frequency variations (large color changes from one pixel to the next) are rare, imperceptible to the human eye, and can be discarded without changing the principal content of the image.

Data hiding in an image's frequency domain can be accomplished by modifying the low-frequency DCT coefficients (the upper left values in Fig. 4e) (Cox et al. 1997). Let us assume cover image C is represented by a series of 8 × 8 DCT coefficient matrices, and we intend to embed data in one coefficient denoted by ci. Message M is composed of a series of values m0, m1, ..., mn, and the resulting value of the coefficient ci after embedding mi is zi. One simple algorithm for hiding mi is:

zi = ci + a × mi    (2)

a is a scalar value chosen to make a × mi significant enough to ensure robustness. The message should be hidden in a number of different coefficients ci throughout the image, and only those with the stego-key are able to identify which coefficients were modified and by how much.

Consider the example illustrated in Fig. 5. An 8 × 8 pixel block has a portion of stego-message M embedded in its (1,1) coefficient. a × mi = 63, which is added to the original DCT coefficient ci = 63 to give


Fig. 5 An 8 × 8 pixel block. Cover image (a), stego-image (b), cover difference image (c), attacked image (d), and stego-difference image (e)

zi = 126. Comparing the decoded stego-object (Fig. 5b) with the original image (Fig. 5a), we see that modifying ci = 63 to zi = 126 results in minor degradation. Some data is lost in the upper left corner (Fig. 5c), but otherwise the color structure remains intact. This satisfies the perceptual transparency condition.

The algorithm for embedding and extracting a message is application specific, but there are overriding principles shared among applications that implement data hiding in the frequency domain. We consider a naïve example of a stego-key that is comprised of the message M and the original DCT coefficients C. A watermark application checks for the presence of mi at coefficient location ci. By comparing the original ci = 63 to the discovered value zi = 126 in the stego-image, one can apply statistical analysis to determine the likelihood that zi − ci = 63 is the result of random noise. Assuming one concludes it is statistically unlikely for the coefficient to change from 63 to 126 due to random noise, then we determine zi exhibits evidence of a watermark or some other embedded message mi. In this simple example, the recipient of the cover image knows a × mi = 63 and ci = 63, and the stego-key will look for a value close to zi = 126 at location i. If the algorithm discovers, for example, zi = 100 at location i, then it is less likely that mi is present at location i and more likely that the discrepancy between zi = 100 and ci = 63 is due to noise or other unrelated modifications. The interpretation of the presence of mi is a matter of implementation.

This simple example is referenced for illustrative purposes. In practice more rigorous methods are used to select coefficients for embedding data. Some algorithms embed data in the middle-frequency coefficients or calculate a perceptual mask to determine the most suitable coefficients to modify. The perceptual mask computes how much each coefficient can change without perceptually modifying the image; coefficients that can be modified by a larger amount are better candidates for hiding data. Embedded data can employ multiple coefficients within an 8 × 8 block. For example, a DCT block can represent mi = 1 if zi > zj, and mi = 0 otherwise. The interpretation of these bits depends on the application.

With covert communication and photo annotation, one does not have any knowledge of message M. Digital watermarking applications, however, may strictly test for the presence or absence of a prescribed watermark M. In this circumstance statistical correlation is performed on the extracted M′ and known M to determine whether the image is marked. The risk of false-positive or false-negative


determinations is related to the size of a, and the size of a is inversely proportional to the watermark's perceptual transparency. The larger the value of a, the more the coefficients change when mi is embedded. If a is large, naively trying to remove mi without the stego-key (which identifies location i) would require modifying many low-frequency coefficients by a large amount. Such an attack would materially alter the quality of the image.

The strength of embedding messages in the frequency domain is robustness. Resizing the image, additive noise, and other pixel-based modifications will not alter the underlying structure of the image in its frequency domain. Using statistical methods to extract message mi (as in the above example) means the message is protected from minor modifications to the stego-object Z: changing zi from, for example, 126 to 123 will not significantly change the probability that mi was added to ci. Alternatively, large modifications to Z (from a naïve attack) will destroy the integrity of the object. Other modifications such as cropping may remove portions of the watermark, but assuming the watermark is distributed in the perceptually relevant portions of the image, one will still detect the presence of a partial watermark. For some applications detecting the presence of a watermark anywhere on the image is sufficient.

In Fig. 5d we illustrate a naïve attack on the stego-object in Fig. 5b. Assume we attempt to remove the watermark by adding noise to all low-frequency bands: in this example, we simply add 100 to all coefficients in the upper left quadrant of the DCT matrix. The result is significant color deviation from the original image (illustrated in Fig. 5e). Applying this across all 8 × 8 pixel blocks in the image will show serious degradation in image quality. This satisfies the robustness condition.
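A sketch of Eq. 2 on a single 8 × 8 block, using SciPy's DCT. The coefficient location, the scalar a, and the naive detection by direct comparison with the known original coefficient follow the illustrative example above, not any production watermarking scheme:

import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(1)
block = rng.integers(0, 256, size=(8, 8)).astype(np.float64)   # cover block

c = dctn(block, norm="ortho")        # DCT coefficients of the cover block
i = (1, 1)                           # a low-frequency coefficient location
a, m_i = 9.0, 7.0                    # embedding strength and message value
z = c.copy()
z[i] = c[i] + a * m_i                # z_i = c_i + a * m_i  (Eq. 2)
stego_block = idctn(z, norm="ortho") # decoded stego pixels

# Detection with knowledge of c and a (the naive stego-key in the text):
received = dctn(stego_block, norm="ortho")
estimate = (received[i] - c[i]) / a
print(round(float(estimate), 3))     # approximately 7.0: m_i recovered

Because the round trip here is lossless floating-point arithmetic, the message returns exactly; in practice quantization to 8-bit pixels and JPEG re-compression perturb the coefficient, which is why detection is framed statistically.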

Steganalysis Techniques
Steganalysis is the study of techniques to attack data hiding algorithms. For covert communication applications, simply detecting the presence of a message is considered a security breach. A successful digital watermarking attack would constitute removing or altering the watermark without damaging the media object. Successful techniques for attacking specific algorithms will always be considered on a case-by-case basis. However, there are statistical characteristics of images that can be applied to form a general framework for identifying hidden data in images. The general techniques that can be applied to any data hiding algorithm are most interesting, since one can rarely assume one knows all available algorithms being employed. These techniques also assume no knowledge of the hidden message or original cover image, yet can provide insight as to where data may be hidden in the image (Chandramouli et al. 2004).

An encrypted stego-message will be indistinguishable from a random sequence of bits. However, the distribution of bits in an image is not random. Colors in images are correlated, and specific colors tend to be prominent and clustered in ways that would not be characterized as random. When a message is written to the LSB of pixels in an image, the number of even and odd values in the image converges. If the stego-message is a random sequence of bits, and the message bits overwrite the LSB, then the pixels with hidden data will have even and odd parity in equal measure (approximately equal numbers of 0 and 1 LSB values). Images generally do not share this property: most images are skewed to being even or odd due to prominent, repetitive colors in the image.

When an LSB technique is used on an image, the number of color pairs that differ by one bit increases. A higher proportion of pixels will be adjacent to colors that differ by a single bit in a single color channel, and the total number of colors in the image will increase beyond what is statistically typical of a natural image. This is particularly evident in palette-based images where colors that differ by negligible amounts may be discarded to reduce the size of the palette. Stego-messages that are embedded in the LSB of palette indices run the risk of severely degrading image quality since there is no guarantee neighboring colors in


the palette share locality in the color spectrum. To alleviate this, the palette may be ordered so that index LSB modifications reference similar colors, but an ordered palette itself may arouse suspicion (Dumitrescu et al. 2003; Ker 2005; Lee et al. 2006).

DCT coefficients in the frequency domain are distributed similarly to pixel colors. Stego-message insertions tend to even out the frequency of odd and even coefficient values and increase the number of distinct coefficient values. Methods similar to statistics-based LSB steganalysis can be performed on DCT-encoded images by analyzing DCT coefficients rather than pixel colors.
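A toy parity test along these lines: randomized LSBs drive the even/odd balance of a channel toward 50/50, while natural channels are usually skewed. The tolerance is an arbitrary assumption:

import numpy as np

def lsb_balance(channel):
    """Fraction of pixels whose LSB is 1 (0.5 means perfectly balanced)."""
    return float(np.mean(channel & 1))

def suspicious(channel, tolerance=0.02):
    return abs(lsb_balance(channel) - 0.5) < tolerance

rng = np.random.default_rng(2)
natural = (2 * rng.integers(0, 128, size=(64, 64))).astype(np.uint8)  # all even
stego = natural | rng.integers(0, 2, size=(64, 64)).astype(np.uint8)  # random LSBs
print(suspicious(natural), suspicious(stego))    # False True (typically)

Practical steganalysis replaces this with chi-square tests on value-pair histograms and sample-pair analysis (Dumitrescu et al. 2003), but the underlying statistic is the same flattening of value pairs.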

Summary/Conclusion
Securely embedding data in an image provides a new means of protecting digital content and communicating covertly. Digital security has traditionally been associated with monitoring strict access privileges with a secret password. However, this does not address several key usage models of digital content. In some circumstances one may want others to access the digital content but limit their usage. For example, a user may have the right to view, store, and modify a collection of images, but not the right to distribute it to others. In this circumstance a strict "access" or "no-access" policy does not apply, and researchers need to be creative in addressing this growing usage model. Digital watermarking serves as the image's "passport" in a sense; the origin and usage rights information are uniquely bound to the content.

Similar to digital rights management, covert communication conducted with a strict "access" or "no-access" policy may have limitations. For example, assume the message is unencrypted and access is restricted by stealth, physical barriers, or other means. Such a scheme works as long as access is not breached. Covert communication embedded in digital images redefines the problem as one of determining where to look, and of the computational cost of scanning an insurmountable pile of images.

Directions for Future Research
Digital steganography is a relatively new subject with research opportunities spanning many directions. The robustness of hiding information in the frequency domain and the prevalence of JPEG images make DCT-based algorithms more popular from a research point of view than LSB implementations. While algorithms that hide information in images do exist, it remains difficult to assess how well the data is hidden. Quantifying the security of an algorithm and developing tools to systematically test the performance of information hiding implementations would be valuable (Ming et al. 2006): such tools would allow researchers to assess how well one algorithm performs relative to another and to expose specific weaknesses. Improving steganographic algorithms is another prominent area of research. A smarter heuristic for selecting candidate pixels, for example, would increase the security, robustness, and capacity of the system. Determining the portions of an image that are perceptually relevant (such as faces and foreground objects) helps identify areas that cannot be modified without altering the image's primary subject, while finding pixels and coefficients that can be substantially modified without degrading perceptual transparency would help increase algorithm robustness. Further research into the statistical characterization of colors in digital photographs would identify common and unlikely patterns in natural images. With such information, one can employ preventative measures to ensure hidden information does not appear as a statistical anomaly that could arouse suspicion (Provos 2001).


References
Chandramouli R, Kharrazi M, Memon N (2004) Image steganography and steganalysis: concepts and practice. Lecture notes in computer science, vol 2939. Springer, Heidelberg
Cox I, Kilian J, Leighton T, Shamoon T (1997) Secure spread spectrum watermarking for multimedia. IEEE Trans Image Process 6(12):1673–1687
Dumitrescu S, Wu X, Wang Z (2003) Detection of LSB steganography via sample pair analysis. Lecture notes in computer science, vol 2578. Springer, Heidelberg, pp 355–372
Ker A (2005) Improved detection of LSB steganography in grayscale images. Lecture notes in computer science, vol 3200. Springer, Heidelberg, pp 97–115
Lee K, Westfeld A, Lee S (2006) Category attack for LSB steganalysis of JPEG images. Lecture notes in computer science, vol 4283. Springer, Heidelberg, pp 35–48
Ming C, Ru Z, Xinxin N, Yixian Y (2006) Analysis of current steganography tools: classification and features. In: Intelligent information hiding and multimedia signal processing 2006, pp 384–387
Mohanty S, Ramakrishnan K, Kankanhalli M (2000) A DCT domain visible watermarking technique for images. IEEE Int Conf Multimedia 2:1029–1032
Moulin P, Mihcak M (2002) A framework for evaluating the data-hiding capacity of image sources. IEEE Trans Image Process 11(9):1029–1042
Provos N (2001) Defending against statistical steganalysis. In: Proceedings of the 10th USENIX security symposium, vol 10, Washington DC, pp 323–335
Provos N, Honeyman P (2003) Hide and seek: an introduction to steganography. IEEE Secur Priv 1:32–44
Westfeld A (2001) F5 – a steganographic algorithm: high capacity despite better steganalysis. Lecture notes in computer science, vol 2137. Springer, Heidelberg, pp 289–302

Further Reading
Cox I, Miller M, Bloom J, Fridrich J, Kalker T (2008) Digital watermarking and steganography, 2nd edn. Morgan Kaufmann, Burlington
Katzenbeisser S (2000) Information hiding techniques for steganography and digital watermarking. Artech House, Norwood
Salomon D (2003) Data privacy and security. Springer, New York
Wayner P (2009) Information hiding: steganography and watermarking, 3rd edn. Morgan Kaufmann, Burlington



Security Imaging: Biometrics and Recognition Technology
Daniel Taranovsky*
Advanced Micro Devices, Markham, ON, Canada

Abstract
The technology of accurately identifying people has improved with the advent of digitized biometric data. Cross-disciplinary advancements in the medical and computer sciences have allowed anatomical characteristics to be digitized so that traditional pattern recognition algorithms can be employed to reconcile a single instance with potentially millions of database entries. Two important biometric applications are facial and fingerprint recognition. One facial recognition technique expresses face images in terms of eigenfaces using principal component analysis. This chapter provides an overview of the biometric recognition problem, with examples of solution methods for fingerprint and facial recognition.

Introduction
The problem of uniquely identifying people is often encountered in forensics and security applications. Biometric recognition has evolved considerably, with display, image capture, and computer vision technologies being applied to develop cross-disciplinary solutions to the most difficult identification problems.
Biometric applications are broadly positioned along two axes: identification versus verification and involuntary versus voluntary. Identification systems match measured biometric data with instances in a database. An ideal system matches the measured data with a single database entry with 100 % accuracy; in reality, the system may return false positives, false negatives, or multiple entries that fall within some acceptable threshold of confidence. The quality of the measured data is often worse in involuntary systems, making the identification process even more challenging. Verification systems confirm that a person is who she/he claims to be and need only match the measured data with a single database entry; a Boolean result (with some confidence factor) is output. Standardized photo and signature identification systems have been in existence for some time and still serve as a cost-effective way of deploying secure verification processes in many circumstances. Fingerprint scanners and digital photography are used around the world to accompany passport control's document inspection. Figure 1 shows the application space of biometric systems.
Involuntary identification systems are designed to measure and match biometric data without requiring explicit participation from the sampled individual. Cameras that photograph subjects in a target area and the analysis of fingerprints left behind at a crime scene are examples of involuntary identification scenarios. Acquiring biometric data covertly is typical of involuntary biometric identification applications; the principal benefit of covert acquisition is that it becomes more difficult to circumvent the security system. Voluntary systems can be more intrusive and overt, with the benefit of acquiring better measurements and more accurate results. Retinal scans, fingerprint scans, and signatures are three examples of biometrics typically employed in voluntary identification and verification systems. Examples of voluntary participation in identity verification are passport control areas and providing a signature during credit card transactions.

*Email: [email protected]

Fig. 1 Biometric application space: a quadrant chart spanning the axes involuntary vs. voluntary and identification vs. verification; involuntary identification (the harder problem) includes tracking missing persons, crime scene investigations, and video surveillance, while voluntary verification (the easier problem) includes passport control and credit card transactions

Table 1 Biometric data characteristics

Characteristic   Description
Universal        Every person should have an instance of the biometric
Distinct         No two people should share the same biometric measurement or representation
Permanent        The biometric should be tamper-proof and remain unchanged over time
Collectable      It should be recordable, measurable, and have a suitable digital representation

A biometric measurement is a quantitative or qualitative characterization of the human body. Height, hair color, DNA, fingerprints, eye vascular structures, and voice patterns are all examples of biometrics (Zhao et al. 2003; Delac and Grgic 2004). A suitable biometric for forensic and security applications should be universal, distinct, permanent, and collectable; Table 1 describes these four key characteristics. A person's height is universal, somewhat permanent, and easily collectable, but far from distinct. Every person has a distinct and permanent personality or sense of humor, but this would prove difficult to collect. Vascular retinal scans are said to be among the most secure biometrics, since they are universal, distinct, and impossible to tamper with without seriously compromising one's health. It is, however, difficult to collect retinal scans in involuntary collection environments: the person must be willing to submit to a retinal scan in order for the data to be collected. Applications in the lower right quadrant of Fig. 1 (voluntary identification) are not typical, since the voluntary participation of the subject narrows the search problem from broad identification to verification.

Principles of Biometric Identification
Facial and fingerprint biometric data is universal, distinct, (relatively) permanent, and collectable. Fingerprint matching has a long history in forensic investigations and is one of the few nonintrusive biometrics with both voluntary and involuntary applications: a person may volunteer samples of her/his DNA or fingerprints, but is also likely to leave DNA and fingerprint samples wherever she/he goes. Facial recognition is particularly interesting for security applications, since the biometrics can be collected through the lens of a covert camera. Video footage collected anywhere in the world can be a source of biometric data, and concealing one's face in all circumstances is not practical in most parts of the world. Subsequent sections focus on fingerprint and facial identification algorithms.



Table 2 Biometric system output probability expressions

Probability expression         Natural language interpretation
P(f(P′,Pi) ≥ t | Pi ≠ P*)      Probability of a false-positive error
P(f(P′,Pi) ≥ t | Pi = P*)      Probability of a correct positive match
P(f(P′,Pi) < t | Pi = P*)      Probability of a false-negative error
P(f(P′,Pi) < t | Pi ≠ P*)      Probability of a correct negative match

Fig. 2 System flow diagram for biometric identification: biometric samples Si pass through feature extraction to data representations Pi, which are stored in the database; the query sample S′ is feature-extracted to P′ and compared with database entries by the matching function f, yielding a set of potential matches {Pa, Pb, Pc, ...}; P* denotes the database entry representing the same person from whom S′ was collected, so outputting P* is a positive match, and all other entries in the output set are false-positive errors

A formal description of the biometric identification problem is presented below. Assume one is attempting to match biometric data S′ with entries in a database. The features of S′ are extracted and represented as P′. Let P* denote the entry in the database that correctly corresponds to the person from whom S′ was extracted. The identification system compares P′ with each entry Pi and calculates a score f(P′,Pi); without loss of generality, we assume 0 ≤ f(P′,Pi) ≤ 1. A higher score f(P′,Pi) implies that the system identified a higher likelihood that Pi = P*. Sometimes the identification system may not compute f for every Pi in the system due to resource constraints; in this case, unlikely candidates may be pruned very early in the search to avoid computing f for entries that will probably not generate a high score. For example, if the database is partitioned by sex, race, or another identifiable characteristic, and this characteristic can be reliably determined from the biometric data S′, then a large portion of the database can be disregarded before computing the expensive evaluation function f for the remaining entries. After f(P′,Pi) is computed, it is compared with some threshold t: if f(P′,Pi) ≥ t, then Pi is deemed a candidate match for P*. Setting the threshold t depends on the application. A higher t results in a higher probability of false-negative errors, and a lower t results in a higher probability of false-positive errors. Forensic applications consider false-negative errors objectionable, since a falsely rejected entry may be disregarded in the subsequent investigation; secure access verification systems instead seek to minimize false-positive errors. Errors arise from the design of the underlying function f, the quality of the biometric data S′, the quality of the entry representations Pi, and the threshold t. A formal characterization of the system's output is presented in Table 2, and Fig. 2 illustrates a system flow diagram for biometric identification.
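These definitions translate directly into a simple search loop. The sketch below is illustrative only (f, the representations, and the pruning test are placeholders, not any specific algorithm from this chapter):

```python
from typing import Callable, Iterable, List, Tuple

def identify(p_query,
             database: Iterable[Tuple[int, object]],
             f: Callable[[object, object], float],
             t: float,
             prune: Callable[[object], bool] = lambda p: False) -> List[int]:
    """Return the IDs of all entries Pi with f(P', Pi) >= t.
    A cheap prune test (e.g., a partition mismatch on sex or race)
    skips entries before the expensive matching function f runs.
    Raising t trades false positives for false negatives."""
    candidates = []
    for entry_id, p_i in database:
        if prune(p_i):
            continue
        if f(p_query, p_i) >= t:
            candidates.append(entry_id)
    return candidates
```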



Facial Recognition
Human brains have evolved a remarkable capability to recognize faces. Variable lighting conditions, partial feature occlusion, years of aging, and other seemingly material alterations do not prevent humans from quickly recognizing a friend's familiar face. Given a suitable face representation and an effective matching algorithm, computer technology can be used to mine through large databases of face data. An example of a reliable, high-performance facial representation and matching algorithm is the focus of this section.
While there are many research topics in the area of facial recognition, we broadly characterize two approaches: statistical methods attempt to find closest-fit approximations in image space, while geometric methods attempt to model and match key features of the face. Psychological research on how the human brain recognizes faces suggests that both holistic and feature-based methods are used, and some facial recognition systems employ a hybrid statistical and geometric method (Lu and Jain 2006; Manjunath 1992; Gross et al. 2001).
One notable method for facial recognition involves adapting the concept of principal component analysis to images of faces (Zhao et al. 2003). Principal component analysis is a method to transform a set of data to a coordinate space that aligns itself along trends in the data. For example, assume we have the dataset {(1,1), (2,2), (3,3)}. The point (3,3) represents a single sample but does not provide insight into how its position relates to the other samples. If one rotates the coordinate system so the axes lie along the lines y = x and y = −x, then a more descriptive coordinate system appears. The dataset coordinates in the new coordinate space are {(√2,0)′, (2√2,0)′, (3√2,0)′}. All samples have Y′ = 0, so Y′ provides no descriptive information on the dataset and may be discarded altogether to condense the dataset representation without loss of information. X′ is the most descriptive dimension, since the sample data is expressed along the dominant trend line rather than an arbitrary coordinate system. In this case we refer to X′ as the principal component of the dataset. Figure 3 illustrates this example.
Now consider N × N dimensional data representing images of size N × N. Every pixel is an axis in N × N dimensional space. We are interested in finding the principal components of this dataset. An R × (N × N) matrix is constructed with every row corresponding to an image; N × N is the number of pixels in each image, and R is the number of images in the dataset.

Fig. 3 Sample set mapped on alternate axes: (1,1) → (√2,0)′, (2,2) → (2√2,0)′, (3,3) → (3√2,0)′, with the new axes X′ and Y′ along y = x and y = −x


Fig. 4 Example eigenfaces (Busey 2009)

Fig. 5 Approximating face images as a linear combination of eigenfaces: face ≈ mean face + δ0·eig0 + δ1·eig1 + δ2·eig2 + δ3·eig3 + ⋯ (Busey 2009)

The covariance matrix of the dataset is calculated and its eigenvectors are determined. The covariance matrix indicates how pixel colors change relative to one another, and its eigenvectors orient the axes along prominent trend lines, much like the example in Fig. 3. In the context of facial recognition and image data, the eigenvectors of the covariance matrix are referred to as eigenfaces (Busey 2009). One can imagine these eigenfaces as representing the "directions" in which the images differ from one another; they identify important relationships about how pixel colors change relative to one another. Consider a naïve coordinate system where each axis defines one pixel's color: in such a system, a pixel color is assumed to have no correlation with other pixel colors, which we know not to be true of human faces. The eigenfaces orient the coordinate system to represent the statistical correlation among pixel colors. Principal component analysis normalizes the data around its mean (the average face given all the faces in the dataset). The average face becomes the new origin in the eigenface coordinate system, and the eigenfaces indicate the prominent directions that best characterize how faces tend to differ from the average face. Figure 4 shows three examples of eigenfaces. Once the series of eigenfaces is generated, the system orders the eigenfaces by their eigenvalues. If some eigenvalues are close to zero, the corresponding eigenfaces may be discarded altogether without much consequence. In Fig. 3, we saw that Y′ did not contribute to the characterization of the dataset since no samples differed along the Y′ axis. The same concept applies to eigenfaces, and the system will usually be able to accurately reconstruct all the faces in the dataset using a small subset of prominent eigenfaces. Faces are composed of a linear combination of the mean face in the dataset and the eigenfaces (Fig. 5): δ0 is the weight associated with the most prominent eigenface (eig0), δ1 is the weight associated with the second most prominent eigenface (eig1), and so on. Typically, far fewer than all N × N eigenfaces are required to get a close approximation to the target face image. Since each image is represented by a small set of weights δi, a high level of compression can be achieved. The process is illustrated in Fig. 5 and formalized in Eqs. 1–4:



\begin{bmatrix} I_{0,0} & I_{0,1} & \cdots & I_{0,M} \\ & \vdots & \\ I_{R,0} & I_{R,1} & \cdots & I_{R,M} \end{bmatrix} \qquad (1)

The dataset of R images, each composed of M = N × N pixels, is arranged with one image per row. The mean value per dimension is subtracted. Each dimension (column) is a pixel location (Eq. 1).

\begin{bmatrix} \mathrm{cov}(0,0) & \mathrm{cov}(0,1) & \cdots & \mathrm{cov}(0,M) \\ & \vdots & \\ \mathrm{cov}(M,0) & \mathrm{cov}(M,1) & \cdots & \mathrm{cov}(M,M) \end{bmatrix} \qquad (2)

A covariance matrix is computed from the image data matrix (Eq. 2).

\begin{bmatrix} \mathrm{eig0}_0 \\ \vdots \\ \mathrm{eig0}_M \end{bmatrix}, \begin{bmatrix} \mathrm{eig1}_0 \\ \vdots \\ \mathrm{eig1}_M \end{bmatrix}, \cdots, \begin{bmatrix} \mathrm{eigM}_0 \\ \vdots \\ \mathrm{eigM}_M \end{bmatrix} \qquad (3)

Eigenvectors are computed from the covariance matrix and arranged according to their eigenvalues. eig0 is the most significant eigenvector, and eigM is the least significant. These vectors correspond to eigenfaces. The most significant eigenvector has the largest eigenvalue (Eq. 3).

\begin{bmatrix} \mathrm{eig0}_0 & \mathrm{eig0}_1 & \cdots & \mathrm{eig0}_M \\ & \vdots & \\ \mathrm{eigM}_0 & \mathrm{eigM}_1 & \cdots & \mathrm{eigM}_M \end{bmatrix} \begin{bmatrix} I_{0,0} & I_{1,0} & \cdots & I_{R,0} \\ & \vdots & \\ I_{0,M} & I_{1,M} & \cdots & I_{R,M} \end{bmatrix} = \begin{bmatrix} I'_{0,0} & I'_{1,0} & \cdots & I'_{R,0} \\ & \vdots & \\ I'_{0,M} & I'_{1,M} & \cdots & I'_{R,M} \end{bmatrix} \qquad (4)

Eigenvectors are arranged per row with the most significant at the top. The original image data matrix is transposed and transformed by the eigenvector matrix. This expresses the image dataset in eigenface space; transformed images are arranged in columns (Eq. 4). The column entries in the resulting matrix of Eq. 4 correspond to the weight factors associated with the eigenfaces. For example, I′0,0 corresponds to eigenface 0's weight when constructing database image entry 0, and I′3,2 corresponds to eigenface 2's weighted contribution to face image 3's representation. A face image's representation in eigenspace is its collection of eigenface weights. This results in a very compact and manageable digital representation, since many of the eigenfaces will be discarded, as in Fig. 3. Given an image one wishes to match with entries in the database, the system begins by representing the identification candidate image as a linear combination of eigenfaces. This is achieved by transforming the pixel data into eigenface space. For example, (I′0,0, I′0,1, ..., I′0,M) represents face image 0 in eigenspace, and every component of this M-tuple corresponds to an eigenface weight. The matching criterion can then be as trivial as finding the Euclidean distance between the candidate image and the images in the database. The eigenface weights δi serve as coordinates in eigenspace and are used to find likely closest-neighbor candidates. The faces in the dataset deemed possible matches are those within a tolerable distance of the candidate image; tolerable distances here are analogous to the system's threshold described in section "Principles of Biometric Identification."
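The pipeline of Eqs. 1–4 plus the Euclidean matching step can be sketched compactly in NumPy. This is an illustrative outline under simplifying assumptions (faces already cropped, aligned, and flattened); for realistic image sizes one would diagonalize the much smaller R × R matrix, as in Sirovich and Kirby (1987), or use an SVD rather than the full M × M covariance:

```python
import numpy as np

def train_eigenfaces(images: np.ndarray, k: int):
    """images: R x M array, one flattened face per row (M = N*N)."""
    mean = images.mean(axis=0)
    centered = images - mean                # Eq. 1: subtract the mean face
    cov = np.cov(centered, rowvar=False)    # Eq. 2: M x M covariance matrix
    vals, vecs = np.linalg.eigh(cov)        # Eq. 3: eigenvectors (eigenfaces)
    top = np.argsort(vals)[::-1][:k]        # keep the k largest eigenvalues
    eigenfaces = vecs[:, top].T             # k x M, one eigenface per row
    weights = centered @ eigenfaces.T       # Eq. 4: R x k weights in eigenspace
    return mean, eigenfaces, weights

def match(candidate: np.ndarray, mean, eigenfaces, weights, t: float):
    """Project a flattened candidate face into eigenspace and return the
    indices of database faces within Euclidean distance t."""
    w = (candidate - mean) @ eigenfaces.T
    dists = np.linalg.norm(weights - w, axis=1)
    return np.flatnonzero(dists <= t)
```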



This technique can be applied to normalized frontal face images. However, in an arbitrary photograph or video sequence, a host of problems must be solved prior to the identification process: faces in the image must be located, features extracted, and compensation made for different lighting conditions and face orientation (Brunelli and Poggio 1993). One method used to identify faces in a scene is to search for symmetry, since the face is inherently symmetric (Sirovich and Meytlis 2009). Edge detection methods can also be useful to locate faces and features. Eyes, mouth, and nose are then located on the face to determine the face's position and orientation. Once this information is known, methods can be employed to normalize the data and initiate the identification process (Kirby and Sirovich 1990; Sirovich and Kirby 1987). Another class of facial recognition techniques is based on geometric reconstruction of facial features and comparison with three-dimensional models in the database. Range image data is extracted from the face, and the curvatures of the face surfaces are analyzed. Surface minima and maxima and convex and concave gradients are used to locate the nose, eyes, and mouth. Given the locations of these facial features, the data is normalized and correlations are calculated with the dataset.

Fingerprint Recognition
Fingerprint recognition methods are broadly characterized as correlation- or minutiae-based techniques. Correlation-based techniques attempt to characterize the ridge patterns over the fingerprint and calculate a match in frequency space. Minutiae-based techniques identify distinctive points on the fingerprint and then perform a position and orientation best-fit computation to score the match. Current minutiae systems achieve more reliable results than correlation techniques, although systems adopting a hybrid approach claim to achieve fewer false-positive matches than minutiae techniques alone.
Some correlation techniques use 2D Gabor filters to represent a fingerprint's local orientation and ridge frequency. The ridges on a fingerprint are represented as a sinusoidal function, with interridge distances corresponding to the function's frequency. The global pattern of a fingerprint is best characterized as ridges wrapped around control points, called loops and deltas: the loop is the innermost cul-de-sac of ridges, and deltas are flat areas arising where ridges change orientation. Correlation techniques aim to describe the global ridge pattern about these control points (Ross et al. 2002).
Minutiae techniques scan the fingerprint for key markers, such as ridge bifurcations and ridge terminations (instances of a ridge forking into two separate ridges and of a ridge ending its path, respectively). A vector of minutiae {m0,m1,m2,...,mn} can be used to digitally represent a fingerprint, with each minutia being a three-tuple {a,φ,c} representing location (a), orientation (φ), and type of minutiae marker (c) on a normalized image of the fingerprint. When attempting to match a fingerprint with records in the database, the system attempts to minimize the logical distances between corresponding minutiae points. This is not an easy task, since one must first normalize both datasets to compensate for scaling, rotations, and other discrepancies likely to occur when a finger is placed on a sensor. Logical distances between corresponding minutiae nodes increase as the orientation or location discrepancies increase, and a minutia substitution or insertion contributes substantially to the cumulative logical distance between the two fingerprints. Figure 6 shows an example of a fingerprint with identified minutiae; a toy scoring function in this spirit is sketched after the figure caption below.



Fig. 6 Examples of ridge bifurcation and ridge termination minutiae
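The sketch below is a simplification: it assumes the two prints are already normalized and that the minutiae correspondences are given (which, in real systems, is itself the hard alignment problem), and all weights and penalties are invented for illustration:

```python
import math
from typing import List, Tuple

# One minutia: (x, y) location, orientation in radians, marker type
Minutia = Tuple[float, float, float, str]

def minutia_distance(m1: Minutia, m2: Minutia, w_angle: float = 5.0,
                     type_penalty: float = 50.0) -> float:
    """Logical distance between corresponding minutiae: positional offset
    plus weighted orientation mismatch; a marker-type substitution adds
    a large fixed penalty."""
    (x1, y1, a1, c1), (x2, y2, a2, c2) = m1, m2
    pos = math.hypot(x2 - x1, y2 - y1)
    ang = abs(math.atan2(math.sin(a2 - a1), math.cos(a2 - a1)))
    return pos + w_angle * ang + (type_penalty if c1 != c2 else 0.0)

def fingerprint_score(a: List[Minutia], b: List[Minutia],
                      unpaired_cost: float = 50.0) -> float:
    """Cumulative distance over paired minutiae; each unpaired minutia
    (an insertion or deletion) contributes a fixed cost."""
    n = min(len(a), len(b))
    paired = sum(minutia_distance(a[i], b[i]) for i in range(n))
    return paired + unpaired_cost * abs(len(a) - len(b))
```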

Summary/Conclusion
The key challenge with biometric recognition systems is establishing a method of digitally representing the face, fingerprint, or other biometric signature. The digital representation must be concise enough to allow fast comparisons and storage without discarding important differentiating features. In the case of fingerprints, minutiae locations and orientations can be used to digitally represent a fingerprint, and a scoring function is used to determine the closest matches to candidate samples. Using eigenfaces, images of faces can be digitally represented as a simple series of weights δi. These weights correspond to the contributions of the eigenfaces in the "reconstruction" of the candidate sample image. The eigenface weights define a new coordinate system, and two images with similar weights will have a similar appearance. The effectiveness of representing images as eigenfaces is most apparent when trying to find a match between a candidate sample image and the images in the database: since the weights can be treated as coordinates, the problem of facial recognition is reduced to calculating Euclidean distances.

Directions for Future Research
The problem of bridging the natural analog world and the digital world of computers to enable biometric recognition has required advancements in a number of areas, and every step of the recognition pipeline in Fig. 2 remains open to improvement. Identifying faces and normalizing the image prior to computing a face's digital representation is a challenge for automatic facial recognition systems; this step takes place before any recognition task and can still be improved. Developing face image representations that are invariant to environmental factors is a difficult problem. Different lighting conditions and orientations of the face pose a serious challenge for facial recognition systems and are the subject of ongoing research. Some research considers 3D modeling of the face to relight the image based on environmental conditions. Compensating for variations from facial hair, aging, and emotional expressions is also an open research topic.



Further Reading
Brunelli R, Poggio T (1993) Face recognition: features versus templates. IEEE Trans Pattern Anal Mach Intell 15(10):1042–1052
Busey T (2009) The face machine. http://cognitrn.psych.indiana.edu/nsfgrant/FaceMachine/faceMachine.html
Delac K, Grgic M (2004) A survey of biometric recognition methods. In: 46th International symposium electronics in marine, ELMAR-2004, Zadar, pp 184–193
Gross R, Shi J, Cohn J (2001) Quo vadis face recognition? The current state of the art in face recognition. Technical report, Robotics Institute, Carnegie Mellon University, Pittsburgh
Kirby M, Sirovich L (1990) Application of the Karhunen-Loeve procedure for the characterization of human faces. IEEE Trans Pattern Anal Mach Intell 12(1):103–108
Li S, Jain A (2005) Handbook of face recognition. Springer Science+Business Media, New York
Lu X, Jain A (2006) Automatic feature extraction for multiview 3D face recognition. In: Proceedings of the 7th International conference on automatic face and gesture recognition, Southampton, pp 585–590
Maltoni D, Maio D, Jain A, Prabhakar S (2009) Handbook of fingerprint recognition. Springer Science+Business Media, London
Manjunath B (1992) A feature based approach to face recognition. In: Proceedings of computer vision and pattern recognition, Champaign, pp 373–378
Ross A, Reisman J, Jain A (2002) Fingerprint matching using feature space correlation. In: Proceedings of post-ECCV workshop on biometric authentication, vol 2356. Springer, Heidelberg, pp 48–57
Sirovich L, Kirby M (1987) Low-dimensional procedure for the characterization of human faces. J Opt Soc Am 4(3):519–524
Sirovich L, Meytlis M (2009) Symmetry, probability, and recognition in face space. Proc Natl Acad Sci USA 106(17):6895–6899
Wechsler H (2007) Reliable face recognition methods: system design, implementation and evaluation (International Series on Biometrics). Springer Science+Business Media, New York
Zhao W, Chellappa R, Rosenfeld A, Phillips PJ (2003) Face recognition: a literature survey. ACM Comput Surv 35(4):399–458



Direct Drive, Multiplex, and Passive Matrix
Karlheinz Blankenbach*, Andreas Hudak and Michael Jentsch
Display Lab, Pforzheim University, Pforzheim, Germany

Abstract
This chapter is dedicated to the driving of low-content displays. It gives an overview of these methods, starting from the simplest, direct drive for segmented low-content displays, through multiplexed passive matrix addressing, to active matrix drives for displays with low to mid-size resolutions. The introduction explains some of the electrical principles involved in driving a display. Passive matrix addressing schemes are then described in more detail before the active matrix addressing scheme is presented. Finally, the handling of two typical low-resolution passive matrix LCD modules is explained.

List of Abbreviations
AM       Active matrix
ASCII    American Standard Code for Information Interchange
CGRAM    Character Generator Random Access Memory
CGROM    Character Generator Read Only Memory
DC       Direct current
DC/DC    Direct current converter
DDRAM    Display Data Random Access Memory
DR       Data register
E        Enable
EOTF     Electro-optical transfer function
IF       Interface
IR       Instruction register
ITO      Indium tin oxide
LC       Liquid crystal
LCD      Liquid crystal display
LED      Light-emitting diode
µC       Microcontroller
MPU      Microprocessor unit
MOSFET   Metal-oxide-semiconductor field-effect transistor
OLED     Organic light-emitting diode
OTP      One-time programmable
PM       Passive matrix
PWM      Pulse-width modulation
QVGA     Quarter Video Graphics Array
RAM      Random access memory
RGB      Red, green, blue
RMS      Root mean square
RS       Register select
R/W      Read/write
TCON     Timing controller
TFT      Thin-film transistor
VGA      Video Graphics Array
XOR      Exclusive OR

*Email: [email protected]

Introduction
Besides the physical properties of a display such as viewing angle, black state, response time, and so on, which depend mainly on the display technology being used, the electrical driving of a display also plays an important role. Simply considered, an LCD is a valve that controls the transmission of light. Of course, this picture is a little weak for emissive displays (see section ▶ Emissive Displays): they do not just control the light output of a backlight, because the light is generated by the pixel itself. From the electrical point of view, a single pixel can be considered as a parallel-plate capacitor for a voltage-driven technology like LCD, and as a simple LED for a current-driven technology like the organic light-emitting diode (OLED). So a voltage (U) or current (I) across the pixel controls the light output, as shown in Fig. 1.
The simplest approach is the direct-driven display, in which every single pixel has its own connection line (Fig. 2 left). With an increasing number of pixels, however, this becomes impossible: modern televisions have a resolution of 1,920 × 1,080 pixels, which would require about 6 million lines for direct driving (horizontal pixels × vertical pixels × RGB). This is technically impossible. To overcome this issue in high-resolution displays, the pixels are addressed by a matrix in which every intersection belongs to a pixel, as shown for a 5 × 7 matrix in Fig. 2 on the right. This reduces the number of lines needed for driving to 5,760 columns (1,920 × RGB) plus 1,080 rows.
The next step is the differentiation between active and passive matrix-driven displays. Active matrix driving enables most modern display technologies to have a higher resolution (more pixels), a higher contrast ratio, and more gray levels and colors compared to passive matrix addressing. However, passive matrix displays have competitive advantages in terms of price, especially for character and low-resolution graphic LCDs and OLEDs (Cristaldi et al. 2009).
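The saving from matrix addressing is easy to quantify; a small sketch using the figures above (the function names are ours):

```python
def direct_drive_lines(cols: int, rows: int, subpixels: int = 3) -> int:
    """Direct drive: one line per subpixel."""
    return cols * rows * subpixels

def matrix_lines(cols: int, rows: int, subpixels: int = 3) -> int:
    """Matrix addressing: column lines (times subpixels) plus row lines."""
    return cols * subpixels + rows

print(direct_drive_lines(1920, 1080))  # 6,220,800 lines: technically impossible
print(matrix_lines(1920, 1080))        # 6,840 lines: feasible
```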

Fig. 1 Visualization of voltage-driven (left) and current-driven (center) pixels and their typical light output characteristics (right): the light output (up to 100 %) rises with the driving voltage U or driving current I



Fig. 2 Direct-driven eight-segment digit with its driving electrodes (left) and a 5 × 7 electrode matrix (right), both showing "3"

Direct Drive
Today, direct-driven displays are used in low-information-content applications like simple digital watches, status indicators, or simple instrumentation for home appliances or industrial equipment. These displays are normally monochrome; sometimes they are combined with multicolor backlights, or they are used without a backlight in reflective applications. The addressable pixels of direct-driven displays are called segments and often differ in shape and size. Each segment is driven by one dedicated signal line, which is referenced to a common ground electrode. The resolution of such displays is limited by the number of signal lines that can be integrated on the display glass. Moreover, the driving circuit has to address every segment, which leads to a large number of connection pins.

Direct-Driven Liquid Crystal Displays
The most important direct-driven liquid crystal display is the eight-segment display, which is also available in starburst or multiburst variants. These displays are made as a stack of two glass sheets, each patterned with transparent electrodes, with the gap between the sheets filled with liquid crystal material. To reduce the number of lines, the rear glass is typically structured as one common electrode. The overlap of a top electrode with the bottom electrode forms an addressable segment; thus, for a segmented display with n segments, the number of connection lines is n + 1.
The electrical driving of liquid crystal displays needs to be DC free; otherwise, the lifetime of the material is reduced by ionization of the LC molecules. This leads to a pulse-modulated driving scheme. A typical waveform for the static driving of liquid crystal displays is shown in Fig. 3. The figure shows a square-wave voltage with a duty cycle of 50 % on the common electrode. To drive a segment into the off state, the same voltage pulses are applied to the segment line, so the voltage across the segment is zero. To drive a segment into the on state, the square-wave voltage applied to the segment line must be out of phase, which leads to a voltage across the segment of twice the driving voltage. For good contrast, the voltage across the segment should be about three times the threshold voltage of the liquid crystal material. To drive different segments with one common electrode, a simple driving circuit with logical XOR gates can be used. Figure 4 shows such a driving circuit, which uses a clock signal applied to the common electrode to control the waveforms of the segment signals. Often the single segments of an eight-segment digit are labeled with the letters a to g.
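The XOR drive of Fig. 4 can be emulated behaviorally in a few lines (a sketch, not driver-IC firmware): the common electrode carries the clock, each segment line carries the clock XORed with its on/off state, and the LC segment sees the difference of the two:

```python
def segment_waveform(on: bool, clock: list, v: float = 3.0):
    """Return (segment, common, difference) voltage samples.
    The difference toggles between +v and -v when the segment is on
    and stays at 0 when off, so the drive is DC free in both states."""
    seg = [(c ^ int(on)) * v for c in clock]
    com = [c * v for c in clock]
    diff = [s - c for s, c in zip(seg, com)]
    return seg, com, diff

clock = [0, 1] * 4                        # 50 % duty-cycle square wave
print(segment_waveform(True, clock)[2])   # [3.0, -3.0, 3.0, -3.0, ...]
print(segment_waveform(False, clock)[2])  # [0.0, 0.0, ...]
```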


Fig. 3 Waveform for direct-driven liquid crystal displays: square waves at 50–70 Hz on the segment (S) and common (C) electrodes; the segment–common voltage ΔU swings between +V and −V in the on state and stays at zero in the off state

Fig. 4 Electronic driving circuit for direct-driven liquid crystal displays: the logic level for each segment is XORed with the square wave applied to the common electrode

For graphical displays that need to show more information, the direct driving scheme is not suitable due to the increased number of connections, which would also limit the active display area because the signal lines must be integrated within the display glass.

Direct-Driven OLED Displays
As the lifetime of organic light-emitting diode (OLED) materials is now suitable for display applications, many segmented OLED displays are available on the market. The driving scheme for direct-driven OLED displays differs completely from that of liquid crystal displays: the organic material is self-emitting, which leads to a current-controlled driving approach instead of the voltage-controlled driving used for liquid crystal devices. OLED displays are also encapsulated between two glass sheets. The front sheet is coated with transparent ITO electrodes, and the rear sheet is usually coated with a common metallic electrode. The segments are formed between the front and rear electrodes, so for a segmented display with n segments, the number of connection lines is n + 1. To drive a segment into the on state, a current must be applied from the top electrode to the rear electrode; the emitted light of a segment is proportional to the current driven through it. Without a current source applied, the segment is in the off state. The brightness of a segment depends on its active area.



Fig. 5 Electronic driving circuit for a direct-driven OLED display: one current source per segment electrode against a common electrode, controlled via a microcontroller interface with data and dimming inputs

To achieve brightness uniformity across the whole display, segments with different sizes must be driven with different currents. Figure 5 shows an electrical driving circuit for direct-driven OLED displays: each segment is driven by one dedicated current source. Usually, the maximum current of each source is limited by the driver IC itself, so to address larger segments it may be necessary to connect several current sources together. To reduce cost, direct-driven OLED displays are usually built with the driver IC mounted on the display glass or on the flex cable, which reduces the number of signal lines between display and electronics. For OLED displays, the driver ICs usually offer a dimming pin, which allows global dimming by applying a pulse-width modulated (PWM) dimming signal. The frequency of the PWM dimming signal should be above 120 Hz to avoid flicker effects (Shinar 2003).
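Brightness uniformity then reduces to keeping the current density roughly constant across segments; a sketch with invented numbers:

```python
def segment_currents(areas_mm2, j_ma_per_mm2: float = 0.5):
    """Scale each segment's drive current with its active area so that
    all segments share the same current density and hence, to first
    order, the same luminance."""
    return [a * j_ma_per_mm2 for a in areas_mm2]

# A large bar segment needs proportionally more current than a small dot:
print(segment_currents([4.0, 1.0, 0.25]))  # [2.0, 0.5, 0.125] mA
```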

Multiplex Drive
The main issue with direct-driven displays is the high number of connections, which also requires an electrical driver with many output pins. To address a higher number of segments, a multiplex or duty-cycle driving approach is usually used. The bottom electrode is divided into m independent parts, as shown in Fig. 6, and on the other side up to m top electrodes are joined together. So for a display with n segments, the number of connections is n/m + m. For example, a display with 64 segments needs 65 connections with direct drive but only 34 connections with multiplex drive at m = 2. There are several limitations, like the switching time of liquid crystals, display contrast, etc., so usually m is limited to m ≤ 4 for multiplex-driven segment displays.
For driving, the m bottom (grid) electrodes are addressed with a duty cycle of 1/m. This means that the segments are not driven all the time, so the achievable contrast is reduced compared to direct-driven segments. Furthermore, the transmission of the display is also reduced by the multiplex drive, which leads to the need for a brighter backlight. Figure 7 shows the driving waveform to display "7 °C" on an LCD. What has to be taken into account is that an LCD responds to the RMS value applied to the electrodes: to activate a segment, a voltage has to be applied to the grid while the segment line is driven with zero volts (Scheffer and Nehring 1984).


Fig. 6 Multiplex-driven display showing "7 °C": segment electrodes 1–5 are multiplexed over common electrodes I and II

Fig. 7 Driving waveform for a multiplexed LCD (see Fig. 6) showing "7 °C" on the display: over one frame period T, common electrodes I and II and segment electrodes 1–5 are pulsed between 0 and +V



Matrix Driving Fundamentals
Compared to active matrix (AM)-driven displays, the passive matrix (PM) drive uses no active elements; in an AM display, the active part would be one or more thin-film transistors (TFTs) per pixel. Because of the Alt and Pleshko limit (see next section), the resolution of PM displays is restricted to about QVGA (320 × 240). Due to their poor performance in terms of image quality and resolution, they are mainly used in low-cost devices that do not need a high perceived quality. For example, a normal pocket calculator does not need a high-resolution color display, and nobody would pay the price of a high-end panel for such a device. A clear benefit of PM displays is their relatively low-cost manufacture.

Fundamentals of Passive Matrix LCDs
The driving of passive matrix displays is done by dedicated integrated circuits (ICs), which are connected to the display glass and to the data source, usually a microcontroller; further details are presented in section 5. Passive matrix displays replace the segments with pixels arranged in a matrix of row and column electrodes, as shown in simplified form in Fig. 8 (left) for 2 × 2 pixels. A passive matrix LCD is made up of two glass substrates sandwiching the liquid crystal material, with ITO electrodes structured on their surfaces. For example, the top substrate carries N row electrodes and the bottom substrate M column electrodes. This results in a matrix with N + M electrodes, and every intersection of the matrix corresponds to a pixel, so N × M pixels can be addressed. Assuming a resolution of 100 × 100 pixels, only 200 electrodes are needed; a direct-driven display would require 10,000 electrodes, so clearly there is a massive saving.
In a passive matrix display, each row is sequentially scanned by a pulse. Within the frame period T, the pulse lasts for a duration of T/N (TON) on each row, and while it is active, the data for that row is applied to the columns. The frame rate (refresh rate) is usually set above 50 Hz to prevent flicker. In Fig. 8, this is demonstrated for the upper right pixel; note that the signals on row and column must have a 90° phase shift to activate a pixel. Figure 9 shows the sequential scanning of rows. Assuming a frame rate of 60 Hz (frame period 16.67 ms) and a display with 128 lines, the pulse width is around 130 µs (Gulick and Mills 1994).
A passive matrix display responds to the RMS voltage applied to a pixel cell, so pixels on the same row or column are also addressed by the half voltage U, as shown in Fig. 8. This effect is called ghosting and can be overcome by selecting an on-state voltage above the threshold of the optoelectronic response curve, also known as the electro-optical transfer function (EOTF); for the unselected pixels, a voltage below this threshold has to be applied.
Fig. 8 Principle of PM addressing of an LCD: pattern of row and column electrodes, where the black square is a switched-on pixel (left), and the driving waveform (right)
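The row pulse width used above follows directly from the frame rate and the line count; a one-line check:

```python
def row_pulse_width(frame_rate_hz: float, n_rows: int) -> float:
    """T_on = T / N: each of the N rows gets an equal share of the frame period."""
    return 1.0 / (frame_rate_hz * n_rows)

print(row_pulse_width(60, 128) * 1e6)  # ~130 microseconds per row
```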


Fig. 9 Sequential scanning pulses for row addressing in a passive matrix display: rows 1, 2, 3, ..., N each receive a pulse of width Ton once per frame period T

Fig. 10 Example of an optoelectronic response curve of a liquid crystal cell: transmission in % versus driving voltage, with the threshold defined at 10 % transmission (Uoff) and saturation at 90 % (Uon)

Figure 10 shows an example of an EOTF for a typical liquid crystal cell. The threshold and saturation values are defined as 10 % and 90 % of the transmission (Lueder 2005). Another effect of passive matrix addressed displays is that the voltage difference between a selected and a nonselected pixel becomes smaller as the number of multiplexed lines increases, resulting in a smaller contrast. Alt and Pleshko formulated the limits of multiplexing in RMS-responding displays. They calculated the voltage ratio between the on and off states depending on the number of multiplexed lines N:

\frac{U_{on}}{U_{off}} = \left( \frac{\sqrt{N} + 1}{\sqrt{N} - 1} \right)^{1/2}

So the voltage gap between the on and off states is decreased by increasing the number of lines. This results in a reduced contrast ratio unless a liquid crystal with a very steep optoelectronic response curve is used (Alt and Pleshko 1974; Nehring and Kmetz 1979).
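Evaluating this ratio for a few line counts shows how quickly the selection margin collapses; a quick numerical check:

```python
from math import sqrt

def alt_pleshko_ratio(n: int) -> float:
    """U_on / U_off for N multiplexed lines in an RMS-responding display."""
    return sqrt((sqrt(n) + 1.0) / (sqrt(n) - 1.0))

for n in (2, 8, 32, 128, 240):
    print(n, round(alt_pleshko_ratio(n), 3))
# 2 -> 2.414, 8 -> 1.447, 32 -> 1.196, 128 -> 1.093, 240 -> 1.067
```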


Fig. 11 Typical assembly of the interface between display and driver: a microcontroller interface (µC IF) feeds the TCON, which drives the column and row drivers arranged around the display

Fig. 12 Setup of a passive matrix OLED display: organic layers sandwiched between crossing column and row electrodes

Figure 11 illustrates a typical assembly of a low-resolution passive matrix display and its interfacing to a microcontroller, which provides the data to be displayed. For displays more complex than the segmented layout, the column and row drivers are separate from the timing controller (TCON) and are often assembled directly on the glass substrate of the display. The TCON is configured via a microcontroller interface (µC IF), and the data transmission is also done via this interface. The TCON itself contains many functional blocks: the host interface, LUT (look-up table), display data RAM, color processing, gray generation, control logic, timing controller, OTP (one-time programmable) memory, oscillator, temperature sensor, temperature compensation, DC/DC converter, and finally the outputs for the column and row drivers. The important blocks are described in section 5 and in Cristaldi et al. (2009).

Fundamentals of Passive Matrix OLEDs

The mechanical setup of passive matrix OLEDs (see also part ▶ Organic Electroluminescent Displays) is similar to that of the liquid crystal versions: instead of liquid crystal, organic material is sandwiched between two glass sheets. The sheets are coated with ITO electrodes crossing each other, as shown in Fig. 12; the organic layer stack depends on the OLED technology and production technique (Shinar 2003). The driving of passive matrix OLED displays differs significantly from the driving of liquid crystal displays: unlike LCDs, OLEDs are driven by current sources. For the passive matrix approach, the display driver needs one current source per display column.


Fig. 13 Passive matrix driving for OLED displays: the column driver's current sources (I) supply the pixel data while the row driver selects one line; unselected lines are held at a voltage (VKH) close to the anode voltage

To address one display line, the row driver switches that line to logical ground, while all unaddressed lines are switched to a voltage similar to the anode voltage of the display driver; this is shown in Fig. 13 for line two. To switch on the dedicated pixels, the current sources of the column driver drive the OLED pixels. For passive matrix OLEDs, crosstalk is not a problem, because light has to be generated by a current flow, and that is not possible for non-driven pixels. However, the resolution of passive matrix OLEDs is limited to somewhat less than 200 lines, mostly because of the limited lifetime of the OLED material at high brightness levels. For the passive matrix line-addressing scheme, the display luminance L_display of a display with n lines depends on the pixel luminance L_pixel as follows:

L_{display} = \frac{L_{pixel}}{n}

This means that if a display is to have a luminance of 100 cd/m² with 100 lines, the luminance of a driven pixel must be 10,000 cd/m². So, typically, a compromise between resolution, lifetime, and display brightness has to be found. To use the passive matrix approach for displays with more lines, there are additional driving techniques like dual-line addressing or multiline addressing (Eisenbrand et al. 2007). As already noted, the lifetime of OLED displays depends on the display luminance, and lifetime-optimized passive matrix driving techniques are also known (Eisenbrand et al. 2007). As OLED displays are self-emitting, their refresh rate should be above 100 Hz; at lower refresh rates, flicker effects like flashing lines can occur for a moving observer.
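The trade-off is a one-line consequence of the equation above:

```python
def required_pixel_luminance(display_cd_m2: float, n_lines: int) -> float:
    """L_pixel = n * L_display for a 1/n duty-cycle passive matrix scan."""
    return display_cd_m2 * n_lines

print(required_pixel_luminance(100, 100))  # 10,000 cd/m^2 peak per pixel
```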

Fundamentals of Active Matrix Driving
For better understanding, the typical characteristics and consequent limitations of passive and active matrix driving are visualized in Fig. 14 for LCDs and summarized in Table 1. As a passive matrix pixel is formed by the crossing of two ITO (indium tin oxide, transparent semiconductor) electrodes, a voltage applied to a row (line) affects all pixels on this row; the same applies analogously to columns. The consequence of such a matrix arrangement is that the voltage of each passive matrix pixel is affected during the frame time by the (gray level) voltages intended for other pixels of the same column. The left side of Fig. 14 demonstrates this for a simplified 2 × 2 matrix: the top row (line) is selected by "+U", and the columns carry the gray levels "0" (black) for the top left pixel and "−U" for the top right pixel.


Fig. 14 Basic principle of passive (left) and active (right) matrix for LCDs: in the passive matrix, row electrodes (R1, R2) on the backplane cross column electrodes (C1, C2) on the frontplane, with one pixel per ITO crossing; in the active matrix, row (scan) and column (data) electrodes are both on the backplane, and each pixel has an address TFT, a storage capacitor, and the LC pixel capacitance CLC

Table 1 Fundamental characteristics of active and passive matrix driving

Principle
  Passive matrix: Crossings of two orthogonal ITO line arrangements (one on the frontplane, the other on the backplane) form the pixels
  Active matrix: Row and column electrodes are on the backplane; a thin-film transistor (TFT), as a MOSFET, acts as a nonlinear switching element for each pixel
Time characteristics
  Passive matrix: No storage of the gray level (voltage or current, if no bistable technology)
  Active matrix: A pixel capacitor stores the gray level (voltage or current)
Pixel voltage or current
  Passive matrix: Set up by row and column values for the whole row or column, which vary over the frame time
  Active matrix: Dedicated value for each pixel, selected via the gate of the TFT (row, line)

The top left pixel voltage (difference column − row) is then +U, which results in a certain gray level but not black (for a normally white LCD). In contrast, the top right pixel is black, as the voltage difference is 2U. In the figure, the magnitude of the voltage is given, as this is what is relevant for the LC transmission, and the voltage sign changes from frame to frame to avoid a DC voltage. Due to the passive matrix principle, the bottom pixels are also affected by the column voltages of the top line; therefore, the bottom right pixel is set to the same voltage as the top left one. This simplified example shows that during a frame scan, all pixels are affected in their gray-level behavior. It is obvious that this limits the useful resolution and gray-scale capability of passive matrix LCDs.
By introducing a nonlinear switching element, usually a MOSFET produced in thin-film transistor (TFT) technology (see also chapter ▶ Active Matrix Liquid Crystal Displays (AMLCDs) and part ▶ Inorganic Semiconductor TFT Technology), the ghosting of a passive matrix drive is suppressed (Fig. 14 right): the (address) TFT transfers the column voltage (data) to the LC pixel only if its gate (connected to the row line) is set to an appropriate positive voltage (e.g., +20 V; for details, see chapters ▶ Active Matrix Driving and ▶ Active Matrix Liquid Crystal Displays (AMLCDs)). Therefore, the column voltages intended for other rows (lines) do not affect this pixel.
In the next step, we look at the differences between passive and active matrix driving from the point of view of scanning the rows (Fig. 15). All the rows (lines) of a matrix display are scanned sequentially (here from top to bottom); when the last row is reached, the next frame starts again with the first row (line). As there is no "memory" in a passive matrix LCD pixel (the same holds for PM OLEDs), the intended gray level (here black) vanishes over time (depending on the response time, which is negligible for PM OLEDs). This results in a low contrast ratio for PM LCDs. Active matrix pixels, however, "store" the gray level (voltage), and therefore all pixels show their intended gray level. This results in a higher contrast ratio, but this "hold-type" approach causes motion blur (see chapter ▶ Video Processing Tasks).

Fig. 15 Visualization of passive (left) and active (right) matrix scan and pixel (gray level) representation for LCDs: in the passive matrix there is only a leftover from the actual scan and nothing from the last frame, while in the active matrix the gray level is stored from the actual scan and held from the last frame

Fig. 16 Basic circuitry of active matrix drives of an LCD (left) and an OLED (right) pixel: the LCD pixel uses an address TFT, a storage capacitor, and the LC pixel capacitance CLC; the OLED pixel adds a drive TFT and a power line feeding the OLED toward the frontplane

Table 2 Comparison of active matrix parameters for LCDs and OLEDs

                           LCD       OLED
Driving principle          Voltage   Current
Number of TFTs per pixel   1         2
Aperture                   70 %      30–70 %(a)

(a) 30 % for bottom emission (normal lateral stack) and 70 % for top emission (inverted stack)

Figure 16 and Table 2 summarize the differences between AM LCDs (left) and AM OLEDs (right; see also chapter ▶ Active Matrix for OLED Displays): as LCDs are voltage driven, only a capacitor is necessary to store (hold) the gray-level voltage between two scans (frames). In contrast, OLEDs are current driven and therefore need a permanent current to emit light. Consequently, a power (column) line has to be implemented, as well as an additional drive TFT to transfer the current to the OLED layer structure. It is clear that this second TFT and the power line reduce the aperture ratio (useful pixel area/total pixel area) significantly. An approach to overcome this is the top-emission structure (see chapter ▶ Active Matrix for OLED Displays), at the price of higher manufacturing complexity. Another difference is that voltage-driven display technologies like LCDs need only one TFT per pixel, while emissive displays like OLEDs require at least two. Actual designs of AM OLEDs have four TFTs per pixel due to uniformity issues.


uniformity issues. As an example, an XGA display has 1,024 × 768 × 3 ≈ 2.4 million subpixels, resulting in the same number of TFTs for LCDs but up to 10 million TFTs for an AM OLED, which in consequence reduces the production yield. After discussing the differences in pixel circuitries, the next step toward the requirements of the panel electronics (e.g., row and column drivers) is to look at the waveforms required for active matrix pixels. Figure 17 visualizes the row and column signals for a simplified 2 × 2 AM LCD in normally white mode (see bottom right, electro-optical curve; see chapter ▶ Twisted Nematic and Supertwisted Nematic LCDs). On the left side, the waveforms are plotted for two subsequent frames; the right side visualizes the pixel voltages of the pixels labeled “a” (black) and “b” (white, transparent). First, we start with the waveforms for scanning each row sequentially (left): Row 1 (red) applies a pulse to the connected gates of this line, and therefore the corresponding column voltage (data) is transferred to the pixel. In the next step, row 2 (green) is activated, and the columns have to provide the gray-level data for this row.

Fig. 17 Visualization of waveforms of a simplified 2 × 2 active matrix LCD (left) and resulting pixel voltage for two pixels (a, b, right)



This process is repeated typically 60 times per second (frame rate). The column (data) voltage and the front plane voltage set the gray level via their difference. On the right side, the time diagram of the voltages of the pixels “a” and “b” is shown: Pixel “a” receives its gray-level voltage during the first row period, and this voltage is stored (dotted line) while the second row (or, in the case of real panels, all other rows) is activated. A similar procedure, but during the second row period, is applied to pixel “b.” As pixel “a” is intended to show black, the voltage difference ΔU (data voltage minus front plane voltage) must be maximized, while for pixel “b” this difference should be zero. Comparing this to the passive matrix LCD in Fig. 14, it is clear that the image quality of AM LCDs is superior, as no voltage limitations (like the Alt–Pleshko limit) apply and the electro-optical curve can be used over its full span. From these waveforms, the requirement for the panel electronics follows: delivering the waveforms for rows and columns at the right time with the right value.

Adjustment of Electro-optical Characteristics to Gamma Curve

As all examples up to now referred practically to black-and-white content, we have to discuss how gray levels are reproduced. The electro-optical characteristics of nearly all displays except CRTs are far from the ideal luminance (LIdeal)-gray level (DInput) relationship L ∝ D^γ (Fig. 18 left; see chapter ▶ Luminance, Contrast Ratio and Grey Scale). DInput stands here for the digital gray-level data (usually 6 or 8 bits per color). In order to achieve a luminance-gray level relationship as plotted in Fig. 18 (left), some adjustments within the driving electronics have to be performed. This will be discussed for a positive mode AMLCD, whose electro-optical characteristic is shown in Fig. 18 (center); DLCD can be interpreted as the driving voltage as shown, for example, in Fig. 17. This basic electro-optic dependency is practically opposite to the ideal gamma curve (left). Therefore, the gray-level voltages have to be modified according to the properties of the display by a transfer function (Fig. 18 right). For AMLCDs, this transfer function is realized by gamma reference voltages and the column digital-to-analog converters (see chapters ▶ Active Matrix Driving and ▶ Active Matrix Liquid Crystal Displays (AMLCDs)). The following example illustrates the task (see dashed lines in Fig. 18): If a normalized gray level of 0.6 (= DInput) is set by the source (left), the corresponding luminance LIdeal is 30 % of the maximum level. If no further precautions are taken, this would lead to only 10 % of LLCD for the LCD (center). The correct gray shade for achieving 30 % of the maximum luminance on this LCD is, however, 0.45. This mapping is performed by the transfer function (right), so that the source gray levels (DInput) are converted to appropriate display gray levels (DLCD). The transfer function is usually represented by ten or more interpolation points (gamma voltages for LCDs).
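As a minimal sketch of this adjustment, the following C program inverts a tabulated electro-optic curve by linear interpolation between interpolation points: a source gray level DInput first yields the ideal luminance L = DInput^γ, which is then mapped back to the display gray level DLCD. The EOC sample values below are invented for illustration only.

#include <math.h>
#include <stdio.h>

#define N_PTS 11 /* interpolation points, cf. "ten or more" above */

/* Hypothetical measured EOC of a positive mode LCD: normalized luminance
 * as a function of the normalized drive level, sampled at N_PTS points. */
static const double eoc_d[N_PTS] =
    { 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0 };
static const double eoc_l[N_PTS] =
    { 0.0, 0.005, 0.02, 0.05, 0.13, 0.36, 0.62, 0.80, 0.91, 0.97, 1.0 };

/* Invert the EOC: find the drive level for a target luminance. */
static double eoc_inverse(double lum)
{
    for (int i = 1; i < N_PTS; ++i)
        if (lum <= eoc_l[i]) {
            double t = (lum - eoc_l[i - 1]) / (eoc_l[i] - eoc_l[i - 1]);
            return eoc_d[i - 1] + t * (eoc_d[i] - eoc_d[i - 1]);
        }
    return 1.0;
}

int main(void)
{
    const double gamma = 2.2; /* exponent of the ideal gamma curve */
    for (int g = 0; g <= 10; ++g) {
        double d_input = g / 10.0;
        double l_ideal = pow(d_input, gamma); /* L = DInput^gamma */
        printf("DInput %.1f -> L %.3f -> DLCD %.3f\n",
               d_input, l_ideal, eoc_inverse(l_ideal));
    }
    return 0;
}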

Fig. 18 Visualization of adaptation of an LCD electro-optic curve (center) to the required luminance (L)-gray level (D) relationship (left) via transfer function (right)



Passive Matrix LCD Modules

In the following, two low-resolution LCD modules with matrix driving will be described, starting from the display driver IC, covering the connection between microcontroller, driver, and display panel, and giving an example of how to program such an IC. In the field of character displays, the HD44780 by Hitachi is a very well-known device, but it is only able to address two lines with 16 characters each. For higher resolutions, there is the T6963 by Toshiba. This IC is able to drive panels with a resolution of up to 240 × 128 pixels and can also render graphics.

Character Displays

Character displays are often used in devices in which a high-resolution display is not needed for presenting information. A benefit is that they are very cheap to produce and easily accessible by a 4- or 8-bit microcontroller. Figure 19 shows how a typical character display is connected to the common and segment drivers. Character displays are available in sizes from 1 × 8 up to 4 × 40 characters, where one character is represented by 5 × 10 or 5 × 8 dots; the latter configuration is more commonly used. The de facto standard for driving such displays is the HD44780 controller by Hitachi. Most variants of character displays have more or less the same architecture, which is why this device is used for an introduction to driving a character display.

Driver Architecture

Let us start with some features of the HD44780. It is possible to drive displays with one 8-character line or two 8-character lines. As mentioned before, one character can consist of 5 × 8 or 5 × 10 dots. It supports low-power operation from 2.7 to 5.5 V and a wide liquid crystal drive voltage range from 3 to 11 V. For transferring display data and commands from the microprocessor to the HD44780 controller, a digital MPU (microprocessor unit) bus interface (4- or 8-bit parallel data) with a speed of up to 2 MHz is provided. Figure 20 shows the simplified block diagram of the HD44780 with its core features. The MPU interface consists of three control lines (RS, R/W, and E) and eight data lines (DB0 to DB7).

Fig. 19 Interfacing between a character display controller and the electrodes of a character display



Fig. 20 Simplified block diagram of an HD44780 architecture

This interface is used to transmit and receive data and commands to and from the display driver. The control lines are RS (register select), R/W (read/write), and E (enable). RS selects whether the data are written to the instruction register (IR) or written to/read from the data register (DR). The IR is used to store instructions like display clear or cursor shift. The DR stores data that will be written to or read from the DDRAM (display data RAM) or CGRAM (character generator RAM). R/W selects whether a read or a write operation is performed, and finally, the enable signal starts the read or write cycle. When only a 4-bit data interface is used, DB0 to DB3 are left unconnected and the data are transferred in two cycles over DB4 to DB7 (see the sketch below). The HD44780 has three different types of memory. First, there is the display data RAM (DDRAM), into which the data being displayed are stored; every character position on the display is mapped into this memory. So if there is no content change on the display, the data do not have to be sent again to the driver. The second is the character generator ROM (CGROM). This memory is addressed by 8-bit character codes and generates the 5 × 8 or 5 × 10 dot character patterns. Figure 21 shows an extract of the content of the CGROM of an HD44780. The code mapping is very close to the American Standard Code for Information Interchange (ASCII). Instead of addressing every dot of a character, the user can use these predefined patterns for writing letters and symbols on the display. Depending on the manufacturer and country, the content of the CGROM can differ slightly. Finally, there is the character generator RAM (CGRAM). If the user needs a character that is not implemented in the CGROM, it can be created and loaded into the CGRAM. As for the DDRAM, the addresses of this memory are mapped to dots on the display; the mapping is usually described in detail in the datasheet of a driver. The timing generator produces the clock signals for the operation of internal circuits like the memory and synchronizes the column and row operations for writing data to the display. Finally, the column and row signal drivers write the data onto the display (Cristaldi et al. 2009).

Interfacing a Character Display Driver

In the following, a short introduction to writing content to a character display is given. As for most display drivers, an initialization routine is necessary. In the case of a character display, this is relatively simple compared to a high-resolution color display: only a few parameters have to be set.
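Before walking through the initialization, the 4-bit transfer mentioned above can be sketched in C as follows; the GPIO helper functions are hypothetical placeholders for platform-specific code and are not part of the HD44780 datasheet.

/* Hypothetical platform-specific GPIO helpers (not part of the HD44780): */
void gpio_set_rs(int level);              /* RS control line              */
void gpio_set_e(int level);               /* E (enable) control line      */
void gpio_write_db4_db7(unsigned nibble); /* lower 4 bits -> DB4..DB7     */
void delay_us(unsigned us);

/* Send one byte over the 4-bit interface: upper nibble first, then lower
 * nibble, each latched by a pulse on E (R/W tied low for writing). */
void lcd_send_4bit(unsigned char byte, int is_data)
{
    gpio_set_rs(is_data);            /* RS = 0: instruction, RS = 1: data */

    gpio_write_db4_db7(byte >> 4);   /* first cycle: upper 4 bits  */
    gpio_set_e(1); delay_us(1); gpio_set_e(0);

    gpio_write_db4_db7(byte & 0x0F); /* second cycle: lower 4 bits */
    gpio_set_e(1); delay_us(1); gpio_set_e(0);

    delay_us(50);                    /* allow the instruction to execute */
}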



Fig. 21 Extract of character generator ROM table

Table 3 Initialization routine for an HD44780 display controller

Instruction                          RS  RW  D7 D6 D5 D4 D3 D2 D1 D0
8-bit interface (3 × instruction)    0   0   0  0  1  1  0  0  0  0
2 lines, 5 × 7                       0   0   0  0  1  1  1  0  0  0
Display on                           0   0   0  0  0  0  1  1  0  0
Display clear                        0   0   0  0  0  0  0  0  0  1
Entry mode                           0   0   0  0  0  0  0  1  1  0
Write character “F” to LCD RAM       1   0   0  1  0  0  0  1  1  0

Note: “Enable” pulse required for valid data

As an example, this is done for the HD44780 character display driver. It is assumed that a microcontroller is connected to the MPU interface and that the 8-bit interface is used for data transmission. The display connected to the driver has two lines and a character size of 5 × 8 dots. The initialization routine is listed in Table 3. The first step is to configure which interface width is used, in this case 8 bits. Because a 4-bit interface is also available, the last four bits are “don't care” for this command. The command has to be sent three times; some drivers require a minimum wait time between the three commands, and if so, this is stated in the datasheet of the driver. Afterward, the number of display lines and the character font are set. The display is turned on by setting data bit 2; bit 1 determines whether the cursor is turned on or off, and bit 0 can be used to turn the blinking of the cursor on and off. Finally, a display clear command is transmitted. The entry mode command sets the cursor movement direction (increment or decrement) via bit 1 and turns the display shift on or off via bit 0. After that, the controller is ready to accept content for the DDRAM; in the example, it is the letter “F.” By pulsing the enable pin, the content of the DDRAM is written onto the display.
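Expressed in C, the routine of Table 3 could look like the following sketch; lcd_write8 and the delay functions are hypothetical helpers (an 8-bit analog of the 4-bit transfer shown earlier), and the wait times are typical datasheet-style values, not taken from the text.

/* Hypothetical helpers: put a byte on DB0..DB7, set RS, pulse E (R/W low). */
void lcd_write8(unsigned char byte, int is_data);
void delay_ms(unsigned ms);

/* Initialization per Table 3: 8-bit interface, 2 lines, 5 x 8 font. */
void lcd_init(void)
{
    delay_ms(40);        /* wait for the supply to stabilize after power-on */

    lcd_write8(0x30, 0); /* function set: 8-bit interface (sent 3 times,    */
    delay_ms(5);         /* some drivers require a minimum wait in between) */
    lcd_write8(0x30, 0);
    delay_ms(1);
    lcd_write8(0x30, 0);

    lcd_write8(0x38, 0); /* function set: 8-bit, 2 lines, 5 x 8 font  */
    lcd_write8(0x0C, 0); /* display on (bit 2), cursor and blink off  */
    lcd_write8(0x01, 0); /* display clear                             */
    delay_ms(2);         /* clear takes longer than other commands    */
    lcd_write8(0x06, 0); /* entry mode: increment cursor, no shift    */

    lcd_write8('F', 1);  /* write the character "F" to DDRAM (RS = 1) */
}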



Fig. 22 Interfacing between microcontroller and a low-resolution graphic display

Low-Resolution Passive Matrix Graphic Displays (Up to QVGA)

The next step up from character displays is low-resolution graphic displays. They start at a size of 32 × 64 pixels and go up to 240 × 320 pixels, which is equivalent to QVGA resolution (quarter VGA; VGA = 640 × 480). Displays with resolutions above QVGA can be considered high-resolution displays, and nearly all of them use active matrix driving because of the better performance it delivers. The main applications of low-resolution graphic displays are portable MP3 players, mid-range mobile phones (high-end mobile phones or smartphones are nowadays equipped with VGA active matrix displays), and similar devices. Compared to a simple character display, they can show the user more information in terms of text as well as graphics, and they are able to produce gray scale and colors. Just as the HD44780 is the widely known driver IC for character displays, the T6963 by Toshiba fills this role for low-resolution passive matrix displays. This driver is able to drive displays with resolutions up to 240 × 128 pixels. The handling of the device is very similar to that of a character display driver, so we will not go into much detail about programming the T6963. It has an 8-bit data/command microcontroller unit interface with four control lines. The configuration of the driver is somewhat more extensive because more parameters have to be addressed. In contrast to a character display driver, the row and column drivers are not included in the driver IC because of the higher pin count; the memory for storing and displaying content is also separate from the driver. Figure 22 illustrates a typical assembly of microcontroller, driver with external components, and display.

Summary

This chapter dealt with the driving of low-content displays, starting from the simplest approach, direct drive, and going through to passive matrix-driven displays. Matrix addressing of pixel cells reduces the number of lines required to drive a display. In terms of cost efficiency, PM is still the technology of choice because of its simple fabrication process and the fact that the electronics can be realized relatively simply. Of course, there are some drawbacks, like lower optical performance compared to AM-driven displays, caused by cross talk and by the limited mapping of gray levels onto the electro-optical curve. Although PM driving technology is well established, there is still a lot of



progress in the improvement of drivers, related to intelligent driving algorithms and energy saving.

Further Reading

Alt PM, Pleshko P (1974) Scanning limitations of liquid-crystal displays. IEEE Trans Electron Devices 21(2):146–155
Brody TP (1984) The thin film transistor – a late flowering bloom. IEEE Trans Electron Devices 31(11):1614–1628
Cristaldi DJR, Pennisi S, Pulvirenti F (2009) Liquid crystal display drivers. Springer, Berlin, pp 75–78. ISBN 978-90-481-2254-7
Eisenbrand F, Karrenbauer A, Xu C (2007) Algorithm for longer OLED lifetime, experimental algorithms. In: Proceedings of the 6th international workshop, WEA 2007, Rome, June 2007, pp 338–351
Gulick P, Mills T (1994) Active addressing(TM) of passive matrix displays. Inform Disp 10:14–17
IEC 61966-4 (1998) Colour measurement and management in multimedia systems and equipment – part 4: equipment using liquid crystal display panel
Kawakami H (1976) Method of driving liquid crystal matrix display device. US Patent #3 976 362, issued 1976
Kleitz W (1998) Microprocessor and microcontroller fundamentals: the 8085 and 8051 hardware and software. Prentice Hall, Upper Saddle River. ISBN 0-13-262825-2
Kristiansen H, Liu J (1999) Overview of conductive adhesive technologies for display applications. In: Liu J (ed) Conductive adhesives for electronics packaging. Electrochemical Publications, Port Erin, pp 376–399
Kuijk KE (2000) Minimum-voltage driving of STN LCDs by optimized multiple-row addressing. J Soc Inform Disp 8(2):147–153
Kumar V (1995) Digital technology: principles and practice. New Age International, New Delhi. ISBN 81-224-0788-9
Lueder E (2005) Liquid crystal displays addressing schemes and electro-optical effects. Wiley, Chichester, pp 161–166. ISBN 0-471-49029-6
Mitescu M, Susnea I (2005) Microcontrollers in practice. Springer, Berlin/Heidelberg. ISBN 978-3-540-28308-9
Nauth P (2005) Embedded intelligent systems. Oldenbourg, München. ISBN 978-3-486-27522-3
Nehring J, Kmetz AK (1979) Ultimate limits for matrix addressing of RMS responding liquid crystal displays. IEEE Trans Electron Devices 26(5):795–802
Scheffer TJ, Nehring J (1984) A new highly multiplexable liquid crystal display. Appl Phys Lett 48(10):1021–1023
Shinar J (2003) Organic light-emitting devices: a survey. Springer, New York. ISBN 0-387-95343-4


Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_34-2 # Springer-Verlag Berlin Heidelberg 2015

Active Matrix Driving

Karlheinz Blankenbach
Display Lab, Pforzheim University, Pforzheim, Germany
Email: [email protected]

Abstract

Multimedia systems require electronic displays with high resolution and video performance. The solution in terms of display technologies such as LCDs and OLEDs is active matrix (AM) driving. This chapter describes the signal path from the digital panel input to the row and column signals in terms of ICs (integrated circuits) – the so-called panel electronics system.

List of Abbreviations

AM       Active matrix
AMLCD    Active matrix liquid crystal display
AMOLED   Active matrix organic light emitting display
DAC      Digital-to-analog converter
EEPROM   Electrically erasable programmable read-only memory
EOC      Electro-optic curve
IC       Integrated circuit
LVDS     Low-voltage differential signaling
MOSFET   Metal oxide semiconductor field effect transistor
PCB      Printed circuit board
PPDS     Point-to-point differential signaling interfacing
QVGA     Quarter video graphics array
RSDS     Reduced swing differential signaling interfacing
SVGA     Super video graphics array
TCON     Timing controller
TCP      Tape carrier package
TTL      Transistor-transistor logic
XGA      Extended graphics array

Fundamentals

As described in chapter “▶ Direct Drive, Multiplex and Passive Matrix,” matrix displays are addressed via row and column waveforms. In this chapter, the tasks and electronics (the so-called panel electronics) required to generate the row and column driving signals from the (digital) input data are discussed. Most professional displays up to (S)VGA have a digital parallel RGB TTL input (transistor-transistor logic, see chapter “▶ Panel Interfaces: Fundamentals”), while for higher resolutions, serial interfaces (see chapter “▶ Serial Display Interfaces”) are used. However, the fundamentals and principles of the panel electronics are independent of the interface type; for ease of understanding, the digital TTL RGB parallel


Fig. 1 Block diagram of a digital parallel input LCD module

interface is employed here. Details on active matrix pixels can be found in the corresponding chapters of the specific display technologies and in chapter “▶ Direct Drive, Multiplex and Passive Matrix.” Most of the examples refer to AMLCDs (often misleadingly named TFTs). Their basic driving method can be easily understood (an excellent reference for all LCD display driving is Cristaldi et al. 2009) and applied, with some modifications, to other electro-optical principles like AMOLEDs (see chapter “▶ Active Matrix for OLED Displays” and Chap. 14, Advances in AMOLED Technologies, in Bhowmik et al. 2008) and AM e-paper displays (see chapter “▶ Electrophoretic Displays” and Henzen et al. 2004). Plasma display panels (PDPs) are passive matrix driven by special waveforms, as discussed in chapter “▶ Plasma Display Panels.” Figure 1 shows a typical AM LCD panel block diagram (see also Cristaldi et al. 2009, Chap. 6.1, AMLCD Driver Architectures; Kim et al. 2007; Lee 2002) with a single AM pixel magnified for reference. The digital RGB TTL input, synchronization, and control data (left) are captured by an input IC (integrated circuit). These data are then transferred to the Timing Controller (often abbreviated as TCON). The TCON rearranges the input data so that they are in the right format for the row and column drivers. The column drivers are supported by the gamma buffer, which provides the appropriate voltages for the digital-to-analog conversion within the column driver ICs to supply the gray level LCD pixel voltage with correction for the electro-optic curve (EOC) (or the OLED voltage for the drive TFT). The VCom driver delivers the front plane voltage. The following subchapters follow the data path from the Timing Controller input to the pixel voltage and row (gate) pulses. Additional built-in electronics of an LCD panel are often the power supply (see chapter “▶ Power Supply Fundamentals”) and the backlight driver (mainly for LEDs, see chapter “▶ Dimming of LED LCD Backlights”). Figure 2 shows the typical waveforms of a parallel digital RGB data interface for one frame (detailed interface timing data and parameters can be found in the specifications of panel manufacturers): The vertical synchronization signal VSync sets the pulse for a new frame. There are no pixel data during and around this sync pulse, which is visualized by VPorch and the “Data enable” input. The same applies for the horizontal synchronization HSync for every row (line). Between two HSync pulses, the gray level RGB color data for one row (line) have to be transmitted. For an XGA color panel, 1,024 (24-bit) gray level data have to be provided and processed within less than 21.5 μs. The correspondence between input and pixel data is drawn in Fig. 3 for an XGA display. Rows (lines) and columns start in the top left corner of the panel, using the notation (row, column) for a pixel. A color pixel itself consists of a red, a green, and a blue subpixel (inset). The panel input data are usually already in the right order to reduce the data processing required within the Timing Controller.
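To make the timing of Fig. 2 concrete, the following C sketch collects the parameters of a parallel RGB interface in a structure and derives the line time and frame rate. The numbers are the common VESA values for XGA at 60 Hz and serve only as an example; every panel specifies its own sync and porch values in its datasheet.

#include <stdio.h>

/* Generic parallel-RGB timing parameters as described for Fig. 2. */
struct panel_timing {
    unsigned pixel_clock_khz;
    unsigned h_active, h_front_porch, h_sync, h_back_porch;
    unsigned v_active, v_front_porch, v_sync, v_back_porch;
};

/* Typical VESA values for XGA (1024 x 768) at 60 Hz - an assumption,
 * actual panels define their own porch and sync widths. */
static const struct panel_timing xga = {
    .pixel_clock_khz = 65000,
    .h_active = 1024, .h_front_porch = 24, .h_sync = 136, .h_back_porch = 160,
    .v_active = 768,  .v_front_porch = 3,  .v_sync = 6,   .v_back_porch = 29,
};

int main(void)
{
    unsigned h_total = xga.h_active + xga.h_front_porch + xga.h_sync + xga.h_back_porch;
    unsigned v_total = xga.v_active + xga.v_front_porch + xga.v_sync + xga.v_back_porch;
    double line_us  = 1000.0 * h_total / xga.pixel_clock_khz;       /* ~21 us   */
    double frame_hz = 1000.0 * xga.pixel_clock_khz / (h_total * (double)v_total);
    printf("line time %.2f us, frame rate %.1f Hz\n", line_us, frame_hz);
    return 0;
}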

Fig. 2 Typical example of parallel panel interface timing, which is practically the same for the Timing Controller data input (for details see section “Timing Controller and Intrapanel Interface”)

Fig. 3 Usual nomenclature of pixels (matrix notation, values for XGA resolution) and RGB subpixels (inset). The exact arrangement is provided in the display specifications

Timing Controller and Intrapanel Interface

From the block diagram of an LCD module (Fig. 1), the tasks of a Timing Controller (TCON) become clear: it must deliver control and data signals for both row and column drivers. The most relevant input and output signals (see also Fig. 2) are listed in Table 1, including some common abbreviations. The TCON usually receives the data to be displayed in the panel resolution; therefore, no scaling is required (see chapter “▶ Video Processing Tasks”). Figure 4 shows a typical block diagram of an AMLCD Timing Controller (see also Cristaldi et al. 2009, Chap. 6.1, AMLCD Driver Architectures; Lee and Lee 2000; Kim and Martin 2000). The input RGB gray level data are separated within the input logic into data and synchronization (frame and line). The gray level data have to be formatted and analyzed for the column drivers. The timing control block controls the row driver output via horizontal and vertical references (panel resolution) and synchronizes it with the column data output. As the column


Table 1 Signals for the row and column drivers delivered by the Timing Controller (TCON)

                                 Row driver                Column driver
Signals from TCON                VSync (vertical start),   HSync (horizontal start),
(most relevant)                  scan direction (DIR),     polarity (inversion),
                                 Output Enable (OE)        digital gray level data (video data)
Main output signal of driver     Row (TFT gate) pulse      Analog voltage acc. to gray levels

Fig. 4 Simplified block diagram of an LCD Timing Controller with its inputs and outputs

data have a high data rate, dedicated interfaces (see below) have to be implemented; furthermore, the Timing Controller must be able to process these data in real time. As the Timing Controller is often not designed for one dedicated display resolution and set of driver ICs, these parameters have to be set via control registers through the programming interface and stored in an EEPROM (electrically erasable programmable read-only memory). As mentioned before and calculated in chapter “▶ Direct Drive, Multiplex and Passive Matrix,” the data rate of the gray level data is relatively high (MHz to GHz range); for example, for XGA panels 1,024 × 768 × 60 Hz × 3 × 8 bit ≈ 1.2 Gb/s. The intrapanel data lines (see Fig. 5) must be able to transmit this gray level color stream, and the physical interface has to be “small,” as it otherwise increases the size of the display module. There are basically two approaches for the intrapanel data interface: parallel and serial transmission from the Timing Controller (TCON) to the column drivers (gray levels). The most prominent implementations, including their main features and limitations, are described in the following overview, summarized in Table 2, and illustrated by the data-rate sketch after the list (for further information see, e.g., Lee 2002; McCartney and Bell 2005; McCartney et al. 2001; Koh 2009).

• Transistor-Transistor Logic (TTL) Interfacing. For 6-bit gray scale, a minimum of 18 parallel wires (lines) is required for the data (gray levels) alone. Dual banking cuts the data rate in half by providing two banks of data busses but doubles the number of data wires to 36. This approach is only suitable for low-resolution, small-sized panels (and requires a relatively large PCB stripe). Some information can be found in Connor et al. (1994), as this intrapanel interface technology is somewhat outdated.


Fig. 5 Block diagram of the panel electronics (module see Fig. 1). Depending on the intrapanel interface, the number of data lines from the Timing Controller to the column drivers varies significantly

Table 2 Comparison of intrapanel interfaces

                                        TTL     RSDS    PPDS
Physical layer                          TTL     LVDS    High-speed serial
Number of lines for 6-bit gray scale    36*     18      2**
Recommended resolution                  QVGA    SVGA    ≥SVGA

*For a dual data bus, which is necessary to keep data rate and EMI low
**Number is independent of gray scale resolution

• Reduced Swing Differential Signaling (RSDS) Interfacing. RSDS (see Lee and Lee 2000) halves the number of data lines, e.g., to 18 compared to 36 for dual-bank TTL. Mid-resolution, mid-sized panels are suitable for RSDS.
• Point-to-Point Differential Signaling (PPDS) Interfacing. For PPDS (see, e.g., McCartney and Bell 2005; McCartney et al. 2005), only two wires are required, independent of gray scale resolution, compared to 18 for RSDS and 36 for TTL (dual data banks). This method is mainly suitable for high-resolution, large-sized panels; for lower resolutions, the price is not competitive. PPDS can deliver up to 600 Mb/s, which is fast enough for WUXGA panels. As only a few lines are required, the failure rate is significantly lowered, and the small number of lines enables a small-sized PCB stick and reduces electromagnetic interference (EMI) effects.
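As a small plausibility check of the numbers above, the following C sketch computes the aggregate gray level data rate of an XGA panel and the resulting rate per data line for the three interface families of Table 2 (line counts for 6-bit gray scale; names and layout are illustrative only):

#include <stdio.h>

/* Aggregate gray level data rate of an XGA panel and per-line rate for
 * the intrapanel interfaces of Table 2. */
int main(void)
{
    const double h = 1024, v = 768, fps = 60, bpp = 3 * 8; /* XGA, 24 bit */
    double total_mbps = h * v * fps * bpp / 1e6;           /* ~1,130 Mb/s */

    const char *name[]  = { "TTL (dual bank)", "RSDS", "PPDS" };
    const int   lines[] = { 36, 18, 2 };

    printf("total gray level data rate: %.0f Mb/s\n", total_mbps);
    for (int i = 0; i < 3; ++i)
        printf("%-16s %2d data lines -> %6.1f Mb/s per line\n",
               name[i], lines[i], total_mbps / lines[i]);
    return 0; /* PPDS per-line rate is consistent with "up to 600 Mb/s" */
}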

Row and Column Drivers

This section provides details about the tasks, characteristics, features, and implementation of the drivers for rows and columns. The examples refer mostly to LCDs but also provide the basis for other display technologies. The drivers' input signals are provided by the Timing Controller (TCON) in such a way that no further data handling, manipulation, or calculation is required. The fundamental task of these drivers is to output the right (gray level) data (column) at the right time (selected by the row driver) to all pixels of the (active) matrix. The major task of the row driver (often also named the gate driver) is to “activate” a row (also called a line). For AM LCDs (see chapter “▶ Active Matrix Liquid Crystal Displays (AMLCDs)” and Cristaldi

Fig. 6 Sequential activation of rows (lines) for one frame (values for XGA). An XGA panel with 768 lines usually has three gate driver ICs (3 × 256 outputs)

et al. 2009, Chap. 6.4, Gate Drivers), a positive pulse opens the gate of the pixel MOSFETs to transfer the column voltage (gray level data) to the pixel and storage capacitor. For a certain resolution in terms of rows, the activation time of a row signal can be calculated via

TON = TFrame / N = 1 / (N × fFrame)

where N is the number of rows (e.g., 768 for XGA), TFrame is the frame time (e.g., 16.67 ms for 60 Hz), and fFrame is the frame frequency (e.g., 60 Hz). At 60 Hz frame frequency, the TON duration for every row (line) equals about 35 μs for VGA and 21 μs for XGA. During that time, all column data for a line have to be transferred to the pixels of this row, while in parallel the gray level data for the next line have to be transferred from the Timing Controller to the column drivers. This is described in more detail below. For addressing a whole frame, the rows are activated sequentially until the last row is reached (Fig. 6). After that, the new frame starts with a VSync pulse from the Timing Controller (see chapter “▶ Video Processing Tasks”). A more detailed view of the operation of a row driver and its I/O signals is provided in Fig. 7 (see also Cristaldi et al. 2009, Chap. 6.4, Gate Drivers): The basic functionality is a bidirectional shift register with typical input signals from the Timing Controller (TCON) like clock, VSync (vertical start), scan direction (DIR; for projection, or for row driver ICs on the left or right side of the panel), and Output Enable (OE) to cut the gate pulse length in order to prevent row-to-row cross talk (see section “Dynamic Performance of AMLCDs”). The clock frequency is calculated as frame rate multiplied by the number of lines; for XGA, we obtain 60 Hz × 768 ≈ 46 kHz. Row drivers are cascadable and typically have 256 outputs (low-impedance buffers) switching between two voltages (typical values): +20 V for Gate High (row is “activated,” gray level data pass via the TFT to the pixel) and −5 V for Gate Low (row is not selected). The typical package of a row driver IC is a TCP (Tape Carrier Package) with a die pad pitch of about 50 μm, which has to be “expanded” to the pixel pitch (about 300 μm for a PC monitor). Figure 8 visualizes all typical and necessary timing signals of an AM LCD row driver. It is essential for good image quality that no delay occurs between subsequent row drivers (here row drivers 1 and 2). However, as a row line has a certain resistance and the many MOSFET gates have a certain capacitance, the waveform at a TFT gate is not ideal; this will be discussed in section “Dynamic Performance of AMLCDs.” Summarizing, row drivers are relatively simple ICs with low clock speed and digital outputs compared to Timing Controllers and column drivers.
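A minimal C sketch of these timing relations, using only the formula above and a 60 Hz frame rate:

#include <stdio.h>

/* Row timing for a matrix display: TON = TFrame / N = 1 / (N * fFrame),
 * plus the resulting row driver clock (frame rate times number of rows). */
static void row_timing(const char *name, unsigned rows, double f_frame)
{
    double t_on_us   = 1e6 / (rows * f_frame); /* activation time per row */
    double clock_khz = rows * f_frame / 1e3;   /* row driver clock        */
    printf("%-4s N=%4u: TON = %5.1f us, row clock = %5.1f kHz\n",
           name, rows, t_on_us, clock_khz);
}

int main(void)
{
    row_timing("VGA", 480, 60.0); /* ~34.7 us, ~28.8 kHz */
    row_timing("XGA", 768, 60.0); /* ~21.7 us, ~46.1 kHz */
    return 0;
}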

Fig. 7 Block diagram with TCON input signals (top left and center) and TFT output signals (right) of a typical AMLCD row driver

Fig. 8 Typical timing diagram of an AMLCD row driver

As the row driver delivers the gate signal for the MOSFET (TFT) of an AM pixel, the task of the column (or source) driver is to provide the appropriate gray level voltage for the (LC) pixel. Therefore, each output contains a digital-to-analog converter (DAC) for the gray level data. The fundamental input signals of a column driver, provided by the Timing Controller (TCON), are:
• RGB gray level data (PC domain: 8-bit; automotive: 6-bit; mobile phones: 4–6-bit), delivered via the intrapanel interface (see section “Timing Controller and Intrapanel Interface”)
• HSync (horizontal start) for the next row (line)
• Polarity for inversion (LCDs only)
• Load for transferring gray level data to the DACs
• Clock for data synchronization
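As an illustration of the DAC task described by this list, the following C sketch maps a 6-bit gray level and the polarity flag to an output voltage symmetric around VCom, as in the uncorrected curve of Fig. 10. The linear ramp and the 0–20 V span with VCom = 10 V are simplifying assumptions; real drivers use the gamma reference voltages V1–V10 and an asymmetric kickback correction.

#include <stdio.h>

/* Uncorrected, symmetric mapping of a 6-bit gray level and the polarity
 * flag to a column driver output voltage around VCom (cf. Fig. 10). */
static double column_voltage(unsigned gray, int positive,
                             double v_com, double v_swing)
{
    double level = gray / 63.0;      /* normalized 6-bit gray level     */
    double delta = level * v_swing;  /* voltage offset relative to VCom */
    return positive ? v_com + delta : v_com - delta;
}

int main(void)
{
    const double v_com = 10.0, v_swing = 10.0; /* 0..20 V span, VCom = 10 V */
    for (unsigned g = 0; g <= 63; g += 21)
        printf("gray %2u: pos %.2f V, neg %.2f V\n", g,
               column_voltage(g, 1, v_com, v_swing),
               column_voltage(g, 0, v_com, v_swing));
    return 0;
}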


Fig. 9 Block diagram of a typical AMLCD column driver

Fig. 10 Example of a 6-bit digital gray-level-to-voltage curve of an AMLCD without (symmetric) and with (asymmetric) kickback voltage (ΔVKB) correction. V1 to V10 are the gamma buffer reference voltages

A typical column driver (block diagram in Fig. 9; for further information see Cristaldi et al. 2009, Chap. 6.3, Source Drivers; Kim and Martin 2000; McCartney and Bell 2005; Connor et al. 1994; McCartney et al. 2005) has 384 outputs, which is equivalent to 128 RGB pixels. The 6–8-bit DACs provide a typical span from 0 to 20 V, which to a first approximation is symmetrical around VCom (see Fig. 10) for large panels. An output buffer reduces the impedance for a fast rise of the output DAC voltage (e.g., to load the storage capacitor). In order to avoid electrophoresis of the liquid crystals and image sticking, the effective pixel voltage must be DC free; therefore, the output voltage swings around VCom. Like row drivers, column drivers are designed for cascading. However, their operating frequency is significantly higher: 384 gray level data have to be transferred by clock pulses from the data input to the “last” data latch. For XGA, this results in a typical clock frequency of about 50 MHz (the data have to be processed 60 times per second for 768 rows). A column driver has a voltage supply for the logic (3.3 or 5 V) and reference voltages for the DACs (VGamma, see section “Gamma- and VCom-Supply for LCDs”). The package of a column driver is typically a TCP with a 50-μm die pad pitch, which has to be “expanded” to about 100 μm (subpixel size) for PC monitors.

Fig. 11 Typical timing diagram (clock and some control signals omitted) of an XGA AMLCD column driver

Fig. 12 Typical equivalent circuit diagram of an AMLCD. More details on pixel circuitry can be found in Fig. 13

A typical timing diagram of an AMLCD column driver is plotted in Fig. 11 for an XGA display: The top part of the figure shows an excerpt of a row period (see Fig. 2) at the time of an HSync pulse, which starts the transfer of the gray level data for a new row. As a typical column driver has 384 outputs, 128 RGB gray level data are stored in the data latch. The load pulse transfers these gray level data to the DACs. This means that the DAC outputs are set for one row while the gray level data of the next row are loaded into the data latch in parallel (Fig. 12). The signals at the bottom demonstrate how the polarity signal flips the output level of the DACs relative to VCom (constant for large panels); row inversion is plotted (see Fig. 14 and Meinstein et al. 1996). The rising and falling edges of the output waveform are not perfectly rectangular because of RC limitations (see Fig. 12).

Fig. 13 Input waveforms for the pixel transistor (left), AMLCD pixel circuitry (center) and resulting pixel voltage VLC (right)

Fig. 14 Examples of AMLCD polarity inversion methods: pixel polarity patterns for row, column, and pixel inversion in even and odd frames

Gamma- and VCom-Supply for LCDs

To provide the appropriate voltage supply for the column drivers and the frontplane (VCom), dedicated buffer circuitries have been introduced. Besides cost, the motivation for highly integrated ICs is a small footprint, reducing the nonactive area of a display module. First, the buffer for the gamma voltage supply will be presented. The fundamental task of the column drivers is to set the pixel voltage to the value corresponding to the gray scale value with respect to the (mostly) nonlinear electro-optic curve of the display. The final luminance output has to fulfill the “gamma curve” L ∝ D^γ (see chapter “▶ Direct Drive, Multiplex and Passive Matrix,” section 3 of chapter “▶ Luminance, Contrast Ratio and Gray Scale,” and Lueder 2010). An example of such a curve is visualized in Fig. 10, where the output voltage of the column driver is plotted over the 6-bit digital gray level (6-bit is used because it makes the figure simpler and more readily understandable; it can easily be transferred to 8-bit). The gamma supply buffer IC delivers the voltages V1 to V10 as reference voltages to the column drivers' DACs; in the “middle” (here between V5 and V6) lies the VCom voltage for the frontplane (see Cristaldi et al. 2009, Chap. 6.3.4, Analog Buffers). This approach is typically used for larger sized AMLCD panels (>5″). VCom modulation with a single-branch gray-level-to-voltage curve is an optimized method for smaller sized AMLCD panels (see Bhowmik et al. 2008, Chap. 9, Advances in Display Driver Electronics, and Kudo et al. 2003).

Fig. 15 Example of a 6-bit gamma voltage supply for AMLCD column drivers

Fig. 16 Basic buffer circuitry for the AMLCD frontplane voltage Vcom

Due to the dynamic effects of AMLCD driving, a kickback voltage ΔVKB (see section “Dynamic Performance of AMLCDs”) lowers the pixel voltage in the same direction for both polarities. This is visualized on the right side of the figure by the behavior of the gray level voltage when the gate ON-pulse (row selection) ends. Therefore – to avoid a residual DC offset (electrophoresis, image sticking; see section 3 of chapter “▶ Temporal Effects”) – a nonlinear and asymmetric correction for the (gray-level-dependent) kickback voltage ΔVKB has to be introduced. Figure 15 shows a fundamental block diagram of the gamma voltage buffer (Lee et al. 2009; Blyth and Orlando 2005) supplying the DACs of the column drivers, and its connection to the column driver ICs. The typical number of outputs is ten voltages (V1 ... V10), which should be independently adjustable, e.g., by digital potentiometers, for good gray level display performance. The gamma buffer IC is normally fed by a single voltage supply of typically 12 V DC, and the outputs are capable of delivering 10 mA at 10 V. Particularly for vertically aligned (xPA) LCs, the electro-optic curves of the RGB primaries differ slightly. Therefore, e.g., the white point will shift while stepping through different gray levels (color tracking; see section 2 of chapter “▶ Spatial Effects”). Implementing a gamma reference buffer for each primary allows precise tuning of each color and gray level without color shifts, resulting in a constant color temperature. The frontplane voltage VCom (Fig. 16) is adjusted by the resistors R1 and R2 (digital, programmable potentiometers are recommended) so that an overall compensation of the kickback voltage is achieved by the gamma voltages and the frontplane voltage together. The basic function should be a low-frequency transconductance


Fig. 17 Typical block diagram of an embedded graphics system with microcontroller (µC) and a display module with Timing Controller including built-in frame memory and display (only row and column drivers). The state-of-the-art interface for this configuration is a memory interface type

amplifier (usually an operational amplifier) with high-speed active feedback (see, e.g., Lee and Won 2007) compensating deviations during operation. All functionality necessary for driving AMLCDs has now been presented. However, some dynamic effects occur, which are discussed in section “Dynamic Performance of AMLCDs” after a brief introduction to highly integrated panel electronics ICs.

All-in-One Display Driving Controllers

Especially for high-volume electronic systems with display resolutions up to QVGA or even VGA, there is a competitive advantage in integrating the various panel electronics ICs (see Fig. 1) into a single one (for an overview and further details see Cristaldi et al. 2009, Chap. 6.1, AMLCD Driver Architectures, and Bhowmik et al. 2008, Chap. 9, Advances in Display Driver Electronics). This also brings benefits like space reduction (also in terms of the printed circuit board) and improved reliability. The first approach (Fig. 17) is similar to low-resolution passive matrix display modules with built-in character or graphics controller (see section 5 of chapter “▶ Direct Drive, Multiplex and Passive Matrix”): The microcontroller is connected via a standard memory interface to the display module, and the Timing Controller (TCON) is equipped with the display frame memory. This keeps the processor load low because only the data of pixels whose content changes have to be transmitted. The Timing Controller reads the data to be displayed for every frame from the memory and transfers them to the row and column drivers. The display module can be considered to be in a “free-running” mode if there is no change in display content. The TCON can be mounted on a separate PCB or foil as well as on the display glass. Higher integration can be achieved by combining TCON, display data RAM (frame memory), and driver ICs (row and column) into a single IC; Fig. 18 shows an example (more details are provided in the specifications of such ICs). The RGB data and system interface (bottom) is normally dedicated to the microcontroller or processor used. The input data are transferred to the built-in display memory (random access memory type) by row and column address. Each pixel corresponds to 3 bytes for 8-bit gray level color displays. The dual-ported display RAM is read out via the line address and display data latch blocks. The whole memory access is steered by the timing generator. This block is also responsible for the real-time data transfer to the row and column drivers via display address and data latch. Via the power and gamma (γ) voltage inputs, all necessary supplies are provided for the all-in-one display driving IC.

Dynamic Performance of AMLCDs

As active matrix displays can be manufactured in relatively large sizes (e.g., up to 108″ diagonal for AMLCDs), it is obvious that some dynamic effects will occur due to high-frequency driving (50 kHz for

Fig. 18 Typical block diagram of an “all-in-one” display controller with display memory, row and column drivers

Fig. 19 Waveform of an AMLCD TFT gate signal near and far from the row (gate) driver. The ideal pulse from the row driver is distorted by the RC low-pass characteristics of the row (resistance) and the gate-source capacitance. Without the Output Enable (OE) pulse, the TFTs of a row would pass the gray level data of the subsequent row (dashed line)

XGA row clocking) and signal propagation over long distances via thin-film lines and transistors (TFTs). A typical equivalent circuit of an AMLCD is shown in Fig. 12 (see also Cristaldi et al. 2009, Chap. 5.2, Structure of AMLCD Panel, and Tsukada 1996, Chap. 2.3, Design Analysis): The row and column lines have a non-negligible resistance (from pixel to pixel), and capacitances like the pixel capacitance (CLC), the storage capacitor, and the parasitic gate-source capacitance CGS of the TFT MOSFET (metal oxide semiconductor manufactured in thin-film technology; see Chap. 5.2.1 and Tsukada 1996, Chap. 3, Thin-Film Transistors) are present. The gate pulse (see Fig. 8) will be distorted from left to right (in the sense of the figure) by the resistance of the row line and the parasitic gate-source capacitance CGS in such a way that the rise and fall times increase (Cristaldi et al. 2009, Chap. 5.3, General Considerations; Tsukada 1996, Chap. 2.3, Design Analysis). The effect is an RC low-pass degradation of the pulse waveform; details will be


discussed later on. Similar effects occur for the column drivers due to the column line resistance and mainly the storage capacitors (visualized by horizontal arrows) from top to bottom. The consequences of this pulse smearing are plotted in Fig. 19: The center shows the above-discussed distortion of the rectangular gate pulse by RC low-pass filtering near (green line) and far (red line) from the row driver IC. The difference is caused by the propagation of the pulse along the row line with its resistors and capacitors (see Fig. 12). At the rising edge, a delay of the start of the (distorted) pulse occurs, which is named “gate delay” (Cristaldi et al. 2009, Chap. 5.3, General Considerations; Kim et al. 2006). This means that TFT gates far from the row driver IC reach the TFT threshold “later” than TFT gates near the row driver (in addition to having smeared edges). This causes fewer charges from the column drivers to reach the storage and pixel capacitors. A consequence is a DC voltage component, which can reach values of 100 mV and is critical in terms of image sticking (see chapter “▶ Temporal Effects”). At the trailing edge, the fall time increases from left to right (far from the row driver). Therefore, the TFT OFF threshold is reached later than under ideal conditions. This allows the column gray level voltage dedicated to the next row to influence the pixel voltage of the current row, leading to blurry images through crosstalk. This effect is avoided by the Output Enable (OE) pulse (Kim et al. 2006; see also Fig. 8): The trailing edge of the gate pulse is cut off by the OE pulse. In a well-designed AMLCD panel, the OE pulse is long enough that the furthest TFT is closed when the next row's column gray level data are transferred. However, the OE pulse also shortens the time available to load the gray level voltage from the column driver DACs into the pixel and storage capacitor, as this signal (pulse shape) is also distorted by RC effects along the column line. The parasitic gate-source capacitance CGS shown in Fig. 12 has another effect besides acting as the capacitor of the RC row low-pass filter: a falling gate pulse (“Gate OFF” edge) will diminish the pixel's gray level voltage. This voltage drop is in the range of 1 V and is called the kickback voltage ΔVKB (Cristaldi et al. 2009, Chap. 5.3, General Considerations). The consequences are (potential) gray level shift, flicker, and image sticking if not properly compensated. To explain this effect and its consequences, all necessary elements are drawn in Fig. 13: On the left side, the gate pulse (VG), the column driver gray level data voltage VD, and the frontplane voltage VCom are plotted over time. These waveforms are applied to the (equivalent) circuit in the center of the figure. The parasitic gate-source capacitance CGS (dotted) couples the row (gate) line to the pixel capacitor. Therefore, the switch-off of the row signal (via the OE pulse) lowers the gray level voltage of the pixel, as shown on the right side. This drop is in the same direction for positive and negative polarity (see Fig. 11). In consequence – as mentioned before in section “Gamma- and VCom-Supply for LCDs” and plotted in Fig. 10 – the gray level output voltage from the column drivers must be asymmetric to avoid any DC component in the resulting column driving signal (Cristaldi et al. 2009, Chap. 5.5, Kickback Compensation Methods). The kickback voltage ΔVKB can be calculated from parameters of both the pixel and the MOSFET (Cristaldi et al. 2009, Chap. 5.3.1, General Considerations; Park et al. 2007; Tsukada 1996, Chap. 2.3.1, Design Analysis):

ΔVKB = ΔVGate × CGS / (CGS + CLC + CSt)

with ΔVGate as the gate pulse voltage, CGS as the parasitic gate-source capacitance, CLC as the pixel capacitance, and CSt as the capacitance of the storage capacitor. With typical values for these parameters (ΔVGate = +20 V, CGS = 10 fF, CLC = 100 fF, CSt = 50 fF), we obtain a kickback voltage ΔVKB of about 1.2 V. However, this value is not constant, as the pixel capacitance CLC depends on the pixel voltage (gray level, polarity). Therefore, a compensation of ΔVKB for the various gray levels and the two polarities has to be made via the gamma voltage buffer as described above (see Fig. 10). Furthermore, the gate waveform at a TFT (see Fig. 19) has an



additional effect, but it is minor compared to the “Gate OFF” one: The longer fall time of the gate pulse far from the row driver allows more charge injection through distant TFTs than through TFTs near the driver. Both effects have to be properly handled and adjusted in order to prevent noticeable flicker (see below) and residual DC (which causes image sticking).
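The kickback formula and the typical values quoted above can be checked with a few lines of C (the variable names are, of course, illustrative):

#include <stdio.h>

/* Kickback voltage: dVKB = dVGate * CGS / (CGS + CLC + CSt), evaluated
 * with the typical values from the text. Note that CLC actually varies
 * with the pixel voltage, so dVKB is gray level dependent and must be
 * compensated via the gamma reference voltages. */
int main(void)
{
    const double dv_gate = 20.0;  /* gate pulse voltage in V  */
    const double c_gs = 10e-15;   /* parasitic gate-source, F */
    const double c_lc = 100e-15;  /* LC pixel capacitance, F  */
    const double c_st = 50e-15;   /* storage capacitor, F     */

    double dv_kb = dv_gate * c_gs / (c_gs + c_lc + c_st);
    printf("kickback voltage = %.2f V\n", dv_kb); /* ~1.25 V */
    return 0;
}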

Inversion to Prevent Flicker

Slight differences between the two gamma curves for positive and negative polarity (see Fig. 10), kickback voltage dependencies, and rise and fall time discrepancies lead to a slight modulation of the luminance output of a pixel showing a constant gray level. In order to compensate for this effect, neighboring “regions” of an AMLCD are driven with opposite polarity (inversion) so that potential flicker is largely reduced in visibility. There are several implementation methods (see Fig. 14; Cristaldi et al. 2009, Chap. 5.4.1, Crosstalk Reduction and Polarity Inversion Techniques; Meinstein et al. 1996; Lee and Won 2007), such as row, column, pixel, and dot inversion. Advantages and drawbacks of these approaches are discussed in Cristaldi et al. 2009, Chap. 5.4.1, and Lueder 2010. Frame inversion is “automatically” implemented via the polarity signal. With some signal processing, it can also be used to increase the gray scale resolution by “Frame Rate Control” (FRC): For even frames, the original gray level, say “i,” is used; for the subsequent odd frame, the gray level is increased by one digit to “i + 1.” At a 60 Hz frame rate, vision integrates this to an intermediate gray level, as the luminance difference is too small to be detected (no flicker). Therefore, a 6-bit column driver can provide 7-bit gray scale resolution; applying 4-frame FRC pushes a 6-bit DAC to “8-bit resolution.”
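A minimal C sketch of 4-frame FRC under these assumptions (the frame counter and level names are illustrative): an “8-bit” gray level g8 is rendered on a 6-bit driver by showing g8/4, incremented by one in g8 mod 4 out of every four frames, so that the eye integrates the average.

#include <stdio.h>

/* Frame Rate Control (FRC): temporal dithering of an extended gray level
 * onto a lower-resolution DAC (here: "8-bit" level on a 6-bit driver). */
static unsigned frc_level(unsigned g8, unsigned frame)
{
    unsigned base = g8 >> 2; /* 6-bit base level (g8 / 4)          */
    unsigned frac = g8 & 3;  /* remainder: number of "+1" frames   */
    return base + ((frame & 3) < frac ? 1 : 0);
}

int main(void)
{
    unsigned g8 = 141; /* target "8-bit" level: 141 / 4 = 35.25 */
    unsigned sum = 0;
    for (unsigned f = 0; f < 4; ++f) {
        unsigned d = frc_level(g8, f);
        sum += d;
        printf("frame %u: 6-bit level %u\n", f, d);
    }
    printf("average over 4 frames: %.2f\n", sum / 4.0); /* 35.25 */
    return 0;
}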

Summary and Directions for Future Research

In this chapter, we have described the path from digital input data to TFT waveforms, generated by the panel electronics consisting mainly of the Timing Controller (TCON) and the row and column drivers. Additional circuitry not described here, such as the power supply, is also integrated into the panel electronics, as is the gamma buffer IC providing the gray level references for the digital-to-analog converters (DACs) of the column ICs. The examples refer to AMLCDs (Cristaldi et al. 2009 is an excellent reference) but can easily be adapted, e.g., to AMOLEDs, as described in the fundamentals (chapter “▶ Direct Drive, Multiplex and Passive Matrix”; for further details see, e.g., Bhowmik et al. 2008, Chap. 14, Advances in AMOLED Technologies). As cost and space are always an issue, R&D activities focus on higher integration of functions, like all-in-one ICs (see Fig. 18), and on high-speed, low-wire-count interfaces from the TCON to the column drivers. Further challenges for panel electronics are high frame rates to reduce motion blur (see chapter “▶ Video Processing Tasks”). Another approach, especially for mobile displays, is to integrate as many electronic functions as possible onto the display (glass) substrate (for details see Bhowmik et al. 2008, Chap. 13, Recent SOG Development Based on LTPS-Technology; Nakatogawa et al. 2009; Kim et al. 2004).

References

Bhowmik A, Li Z, Bos PJ (2008) Mobile displays: technology and applications. Wiley, Hoboken
Blyth T, Orlando R (2005) A programmable analog reference memory for adaptive gamma correction. SID Dig 36:1094–1097
Connor B, Velamuri S, Mank D (1994) Low power 6-bit column driver for AMLCDs. SID Dig 25:351–354
Cristaldi DJR, Pennisi S, Pulvirenti F (2009) Liquid crystal display drivers. Techniques and circuits. Springer, New York


Henzen A, van de Kamer J, Nakamura T, Tsuji T, Yasui M, Pitt M, Duthaler G, Amundson K, Gates H, Zehner R (2004) Development of active-matrix electronic-ink displays for handheld devices. J SID 12(1):17–22
Kim EG, Martin R (2000) A compact LCD driver and timing controller system. SID Dig 31:46–49
Kim CH, Kim CM, Moon KC, Park KC, Kim IG, Joo SY, Park TH, Huk Maeng HS, Jung EJ, Kim CW (2004) Development of 200 ppi SOG-LCD. In: IMID'04 Digest. Society for Information Display, Campbell, pp 1–4
Kim SH, Park H, Kim S, McCartney R (2006) A new driving method to compensate for row-line signal-propagation delays in an AMLCD. J SID 14(4):379–386
Kim SS, Kim ND, Berkeley BH, You BH, Nam H, Park JH, Lee J (2007) Novel TFT-LCD technology for motion blur reduction using 120 Hz driving with McFi. SID Dig 38:1003–1006
Koh H (2009) pLVDS: a new intra-panel interface for the future flat-panel displays with higher resolution and larger size. SID Dig 40:1237–1240
Kudo Y, Akai A, Furuhashi T, Matsudo T, Yokota Y (2003) Low-power and high-integration driver IC for small-sized TFT-LCDs. SID Dig 34(2):1244–1247
Lee AY (2002) TFT-LCD module architecture for notebook computers. Inf Display 1:14–17
Lee A, Lee DW (2000) Integrated TFT-LCD timing controllers with RSDS column driver interface. SID Dig 31:43–45
Lee JB, Won T (2007) Feed-back control for the reduction of flicker and gray scale errors for large panel TFT-LCD. In: IDMC 2007. The Korean Information Display Society (KIDS), Kangnam-gu, Seoul, pp 536–539
Lee B, Kim KD, Jeon YJ, Lee SW, Jeon JY, Jung SC, Yang JH, Park KS, Cho GH (2009) A buffer amplifier with embodied 4-bit interpolation for 10-bit AMLCD column drivers. SID Dig 40:371–374
Lueder E (2010) Liquid crystal displays. Addressing schemes and electro-optical effects. Wiley, New York
McCartney RI, Bell MJ (2005) A third-generation timing controller and column-driver architecture using point-to-point differential signalling. J SID 13(2):91–97
McCartney R, Kozisek J, Bell M (2001) WhisperBus™: an advanced interconnect link for TFT column driver data. SID Dig 32:106–109
McCartney RI, Bell MJ, Poniatowski SR (2005) Evaluation results of LCD panels using the PPDS™ architecture. SID Dig 36:1692–1695
Meinstein K, Ludden C, Hagge M, Bily S (1996) A low-voltage source driver for column inversion applications. SID Dig 27:1–6
Nakatogawa H, Tsunashima T, Aoki Y, Motai T, Tada M, Ishida A, Nakamura H (2009) 3.5 inch VGA TFT LCD with system-on-glass technology for automotive applications. SID Dig 40:387–390
Park Y, Lee E, Kim S (2007) An analysis of common reference voltage architecture in wide TFT-LCD. In: IDMC 2007, pp 259–260
Tsukada T (1996) TFT/LCD. Liquid-crystal displays addressed by thin-film transistors. Gordon & Breach, Amsterdam

Further Reading
Den Boer W (2005) Active matrix liquid crystal displays. Fundamentals and applications. Newnes, Amsterdam
Maini AK (2007) Digital electronics: principles, devices and applications. Wiley, Chichester
Myers RL (2002) Display interfaces. Fundamentals and standards. Wiley, Chichester
Wobschall D (1987) Circuit design for electronic instrumentation: analog and digital devices from sensor to display. McGraw-Hill, New York


Panel Interfaces: Fundamentals
Karlheinz Blankenbach*
Display Lab, Pforzheim University, Pforzheim, Germany

Abstract
This chapter describes the fundamentals of interfaces between graphics adapters (or graphics controllers) as signal sources and the input of display modules. We distinguish between analog and digital transmission, while the latter divides into parallel and serial methods. In this chapter, analog and digital parallel interfaces are described; serialized data transmission standards will be presented in the chapters that follow.

List of Abbreviations
ADC Analog-to-digital converter
AM Active matrix
APIX Automotive pixel link
bpp Bit per pixel
BW Bandwidth
CRT Cathode ray tube
DAC Digital-to-analog converter
DP Displayport
DVD Digital versatile disk or digital video disk
DVI Digital visual interface
EMI Electromagnetic interference
HDMI High-definition multimedia interface
HDTV High-definition television
LVDS Low-voltage differential signaling
MDDI Mobile display digital interface
MIPI Mobile industry processor interface
NTSC National Television Systems Committee (USA)
PAL Phase alternating line (analog color television)
PC Personal computer
QVGA Quarter video graphics array
RGB Red, green, blue
RS-170 US standard black and white video format
SDTV Standard-definition television
TCON Timing controller
TTL Transistor-transistor-logic
TV Television
USB Universal serial bus
VESA Video Electronics Standards Association

*Email: [email protected]


VGA Video graphics array
XGA Extended graphics array

Introduction
The interface between the video data source (e.g., tuner, video player, PC) and the display panel has the task of sending all image data in real time. This can be done in an analog or digital way; a good overview and introduction to display interfaces can be found in, e.g., Myers (2002); Kim and Nam (2006); Weise and Weynand (2007). Depending on the display resolution, the pixel data rate to transmit the data information (gray level and color) from the graphics adapter to the display can reach relatively high values. This interface data rate determines the very basic interface characteristics and can be calculated by Eq. 1; its basic unit is bit/s, although practically Mbit/s or Gbit/s is used:

$$\text{Data rate} = \text{horizontal resolution} \times \text{vertical resolution} \times \text{frame rate} \times \text{gray scale} \times \text{RGB} \qquad (1)$$

"Gray scale" describes the number of gray levels (also known as grayscale resolution) in bits, such as 8 bit for 256 gray levels or 6 bit for 64 gray levels. "RGB" is three for standard color displays with RGB primaries and RGB = 1 for monochrome displays. Assuming the widespread color XGA display, the data rate results in 1024 × 768 × 60 Hz × 8 bit × 3 = 1,132,462,080 bit/s ≈ 1.1 Gbit/s. This is a relatively high amount for long (copper) cables; therefore, PC projectors, until today (2015), rely mostly on analog signals often known as VGA (do not confuse this with the resolution). The most prominent analog interfaces such as PC, VGA, and TV are presented in section "Analog Interfaces." On the other hand, digital transmission is less sensitive to external EMI issues, especially for twisted pair cables (serial interfaces). A basic visualization of those fundamental interface methods is provided in Fig. 1 including typical voltage levels; a comparison will follow below and is summarized in Table 1.

Fig. 1 Simplified visualization for analog (RS-170), parallel, and serial interfaces showing the first part of a line signal (start of line). Typical levels (not to scale): analog white 0.7 V, black 0 V, sync −0.3 V; digital parallel "1" = 3.3 V or 5 V, "0" = 0 V; digital serial ΔU ≈ 0.35 V
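Equation 1 is straightforward to evaluate; the following sketch (an illustration of the formula, not part of the original chapter) reproduces the XGA example:

```python
def interface_data_rate(h_res, v_res, frame_rate_hz, gray_scale_bits, rgb=3):
    """Interface data rate in bit/s according to Eq. 1."""
    return h_res * v_res * frame_rate_hz * gray_scale_bits * rgb

rate = interface_data_rate(1024, 768, 60, 8)  # color XGA at 60 Hz, 8 bit
print(rate, "bit/s ~", round(rate / 1e9, 2), "Gbit/s")  # 1132462080 ~ 1.13
```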


Table 1 Fundamental characteristics of display panel interfaces

Interface | Analog | Parallel | Serial
Typical applications | SDTV, PC, laptop (standard IF 2010) | Panels up to VGA including intra-panel interfaces | DVI, LVDS, (embedded) DisplayPort, HDMI, intra-panel interfaces
Typical resolution range (H × V) | 767 × 575 … 1280 × 1024 | 96 × 64 … 640 × 480 | 640 × 480 … 1920 × 1080
Merits | Hassle free (standardized and widespread); good quality; cable length up to 20 m | No DAC-ADC needed; high quality | Large cable length; few lines; high quality
Shortcomings | DAC-ADC necessary | Limited length of cable (…) | …

> 0. The effective anchoring strength, when compared to a Rapini–Papoular energy, may be thought of as $w_s = \frac{2}{3}S\left(3c_1 + (c_3 - 2c_2)S\right)$. It is clear that there is no dependence on the azimuthal angle φ, as such a degenerate anchoring condition would suggest.

The π-Cell
We again illustrate the use of these equations using the π-cell device (Fig. 1). We use the Q-tensor equivalent of the "one constant approximation," and, as before, we assume that the additional polarization term, Ps, is zero. We write the electric field in terms of the electric potential, E = −∇U, and solve for the q_i with i = 1, …, 5 and the electric potential, U. The problem is again inherently one-dimensional, and so the dependent variables are functions of z and t only. Unlike in the director case, we do not constrain the solution, as we want to allow the order to change and biaxial solutions to emerge when necessary. In this model, we do not constrain the director to lie in the xz-plane, which we did in the director model. The only constraints on the behavior of the liquid crystal enter through the governing equations; in the absence of evidence to the contrary, we assume that the Q-tensor at the boundaries is fixed to be in a uniaxial state with the equilibrium value of the order parameter. The equations for the five elements of the Q-tensor and the electric potential are then

$$\gamma\frac{\partial q_1}{\partial t} = L_1 q_{1,zz} - \frac{\Delta\epsilon\,\epsilon_0}{6}U_z^2 - 2a q_1 - \frac{2b}{3}\left(q_1^2 + q_2^2 + q_3^2 - 2q_4^2 - 2q_1 q_4 - 2q_5^2\right) - 2c\,q_1\,\mathrm{tr}(Q^2),$$

$$\gamma\frac{\partial q_2}{\partial t} = L_1 q_{2,zz} - 2a q_2 - 2b\left(q_1 q_2 + q_2 q_4 + q_3 q_5\right) - 2c\,q_2\,\mathrm{tr}(Q^2),$$

$$\gamma\frac{\partial q_3}{\partial t} = L_1 q_{3,zz} - 2a q_3 - 2b\left(q_2 q_5 - q_3 q_4\right) - 2c\,q_3\,\mathrm{tr}(Q^2),$$

$$\gamma\frac{\partial q_4}{\partial t} = L_1 q_{4,zz} - \frac{\Delta\epsilon\,\epsilon_0}{6}U_z^2 - 2a q_4 - \frac{2b}{3}\left(-2q_1^2 + q_2^2 - 2q_3^2 + q_4^2 - 2q_1 q_4 + q_5^2\right) - 2c\,q_4\,\mathrm{tr}(Q^2),$$

$$\gamma\frac{\partial q_5}{\partial t} = L_1 q_{5,zz} - 2a q_5 - 2b\left(q_2 q_3 - q_1 q_5\right) - 2c\,q_5\,\mathrm{tr}(Q^2),$$

with

$$\mathrm{tr}(Q^2) = 2\left(q_1^2 + q_2^2 + q_3^2 + q_4^2 + q_5^2 + q_1 q_4\right),$$

and the electrostatic displacement equation. We now carry out the same simulations as before, with the same parameter values as the director-based approach, as well as the additional parameters needed for the Q-tensor model: Sexp = 0.624, Seq = 0.624, a = 0.975 × 10⁵ N/(K m²), b = 36 × 10⁵ N/m², c = 43.875 × 10⁵ N/m², ΔT = 4.0 K. The results of these simulations are shown in Fig. 3, where we have determined the in-plane director angle θ from the Q-tensor by calculating the major eigenvalue as discussed above. For the bend state (Fig. 3b), the situation is very much the same as before, with a uniform bulk region (aligned with the electric field, θ = π/2) and high distortion regions close to the boundaries. (Note that, as in the director model, the solution for the bend state at 0 V is an unstable equilibrium solution which would, if allowed, relax to the twist state.) However, the case for the splay state (Fig. 3a) is different. For the lower values of the applied voltage, we see similar behavior to the director-based model, with a distortion region in the center of the cell. However, at high voltages (i.e., 30 V), the system has switched from the splay state to the bend state. By considering the major order parameter S1 for the bend state, Fig. 4, we see that the region of high distortion at the center of the cell has reduced the order parameter. This eventually leads to an effective "melting" of the liquid crystal (in fact the liquid crystal enters a transient biaxial state), and the liquid crystal order reforms to take the lower energy bend state. Although this change in order is a high energy event, it is favorable in order to achieve the low-order bend state.

Fig. 3 Plots of the director angle from the Q-tensor simulation, for the π-cell in the (a) splay state and (b) bend state

Fig. 4 The order parameter S1 as a function of distance through the cell, for various voltages. At V = 29 V, a significant reduction in order is caused by the high distortion in the center of the cell
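The in-plane director angle and the major order parameter S1 used for Figs. 3 and 4 follow from the eigen-decomposition of Q. The numpy sketch below is my own illustration of that extraction (the test state and the scaling convention Q = S(nn − I/3) are assumptions, not taken from the chapter):

```python
import numpy as np

def q_tensor(q1, q2, q3, q4, q5):
    """Assemble the symmetric, traceless Q-tensor from its five components."""
    return np.array([[q1, q2, q3],
                     [q2, q4, q5],
                     [q3, q5, -q1 - q4]])

def director_and_order(Q):
    """Major order parameter S1 and director n from the largest eigenvalue."""
    w, v = np.linalg.eigh(Q)   # eigenvalues returned in ascending order
    S1 = 1.5 * w[-1]           # for Q = S(nn - I/3), lambda_max = 2S/3
    return S1, v[:, -1]

# Example: a uniaxial state with S = 0.624 and the director in the xz-plane
S, theta = 0.624, np.deg2rad(30.0)
n = np.array([np.cos(theta), 0.0, np.sin(theta)])
Q = S * (np.outer(n, n) - np.eye(3) / 3.0)
print(director_and_order(Q))   # recovers S = 0.624 and +/- n
```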

Remarks
In this section, we have laid out the ingredients for a theory which allows the modeling of regions of changing order (i.e., near to defects (Schophol and Sluckin 1987)), including dynamic effects through a dissipation functional which considers the rate of change of the Q-tensor. However, a fuller description of the dynamics would include dissipation through flow, i.e., the equivalent to the Ericksen–Leslie equations for the director-based approach. However, describing such a theory in a short chapter is impossible, and given the, as yet, limited use of such a theory to model liquid crystal devices, we have opted to describe only this simpler no-flow model. We do however refer the reader to (Sonnet et al. 2004), which contains a detailed description of just such a theory, together with a summary of other similar theories which have started to be used.

Summary
We have tried in this chapter to explain and summarize two commonly used theories of nematics, one based on using the director as a dependent variable and one based on using the tensor order parameter. Both have their advantages and disadvantages. The director-based approach is relatively simple and has been used successfully in many different situations. All material parameters in this theory have been measured for a few materials, although a
complete characterization of all the necessary parameters is time-consuming and rarely undertaken. For instance, many of the viscosities remain unknown for most liquid crystal materials. The main problem with a director-only approach is the inability of such a theory to accurately handle defects, where there exists a singularity in the director, and the nematic order is reduced from its bulk value, or other more general regions of reduced order such as the example described above. Indeed we have seen that in this case, the director-based model fails completely to model the switch between the splay and bend state which occurs (Barberi et al. 2004b; Ramage and Newton 2008).

It is in such a situation that a Q-tensor approach is valuable. With the ability to model biaxial configurations of molecules, and reductions in order parameter, this theory is able to describe the cores of defects and other instances of changes in order, i.e., near to rough surfaces. However, when using the Q-tensor approach, it is unusual to be able to make any analytic headway, and for most realistic situations, the governing equations must be solved numerically. These numerical computations are often extremely expensive (in terms of computational time and memory) because of the large discrepancies in time and length scales that exist in problems which contain defects. The ratio of defect core size to the device dimensions is a few orders of magnitude, and it is often necessary to implement sophisticated numerical methods where time and space adaption are utilized.

As mentioned in the introduction, the crucial step in modeling liquid crystal devices is often the choice of the appropriate dependent variables and then the choice of theory. If a nematic liquid crystal device is thought to contain no defects, or regions of varying order parameter, the Ericksen–Leslie theory will be appropriate. When defects are present, an order parameter should be included as a dependent variable, and a theory such as the Q-tensor theory described above will be more appropriate.

Further Reading
Barberi R, Ciuchi F, Lombardo G, Bartolino R, Durand GE (2004a) Time resolved experimental analysis of the electric field induced biaxial order reconstruction in nematics. Phys Rev Lett 93:art.137801
Barberi R, Ciuchi F, Durand GE, Iovane M, Sikharulidze D, Sonnet AM, Virga EG (2004b) Electric field induced order reconstruction in a nematic cell. Euro Phys J E 13:61
Barbero G, Dozov I, Palierne JF, Durand GE (1986) Order electricity and surface orientation in nematic liquid-crystals. Phys Rev Lett 56:2056
Bos PJ, Koehler-Beran KR (1984) The pi-cell – a fast liquid-crystal optical-switching device. Mol Cryst Liq Cryst 113:329
Bose E (1908) Zur Theorie der anisotropen Flüssigkeiten. Phys Z 9:708
Care CM, Cleaver DJ (2005) Computer simulation of liquid crystals. Rep Prog Phys 68:2665
de Gennes PG, Prost J (1993) The physics of liquid crystals, vol 2. OUP Clarendon Press, Oxford
Dunmur D, Fukuda A, Luckhurst G (2001) Physical properties of liquid crystals: nematics. Institution of Engineering and Technology, Stevenage
Ericksen J (1991) Liquid crystals with variable degree of orientation. Arch Ration Mech Anal 113:97
Frank FC (1958) On the theory of liquid crystals. Discuss Faraday Soc 25:19
Kelker H (1973) History of liquid crystals. Mol Cryst Liq Cryst 21:1
Leslie FM (1968) Some constitutive equations for liquid crystals. Arch Ration Mech Anal 28:205
Meyer R (1969) Piezoelectric effects in liquid crystals. Phys Rev Lett 22:918
Miesowicz M (1935) Influence of a magnetic field on the viscosity of para-azoxyanisol. Nature 136:261
Miesowicz M (1936) Der Einfluss des magnetischen Feldes auf die Viskosität der Flüssigkeiten in der nematischen Phase. Bull Acad Pol A 28:228
Oseen CW (1933) The theory of liquid crystals. Trans Faraday Soc 29:883
Parodi O (1970) Stress tensor for a nematic liquid crystal. J Phys (Paris) 31:581
Ramage A, Newton CJP (2008) Adaptive grid methods for Q-tensor theory of liquid crystals: a one-dimensional feasibility study. Mol Cryst Liq Cryst 480:160
Rapini A, Papoular M (1969) Distortion d'une lamelle nématique sous champ magnétique conditions d'ancrage aux parois. J Phys (Paris) Colloq 30:C4
Schophol N, Sluckin TJ (1987) Defect core structure in nematic liquid-crystals. Phys Rev Lett 59:2582
Sluckin TJ, Dunmur DA, Stegemeyer H (2004) Crystals that flow: classic papers from the history of liquid crystals. Taylor & Francis, London
Sonnet AM, Maffettone PL, Virga EG (2004) Continuum theory for nematic liquid crystals with tensorial order. J Non-Newtonian Fluid Mech 119:51
Stewart IW (2004) The static and dynamic continuum theory of liquid crystals: a mathematical introduction. Taylor & Francis, London
Virga EG (1994) Variational theories for liquid crystals. Chapman & Hall, London


Twisted Nematic and Supertwisted Nematic LCDs
Peter Raynes*
Department of Chemistry, University of York, York, UK
*Email: [email protected]

Abstract
In this chapter we describe and discuss the two nematic liquid crystal display modes, the twisted nematic and supertwisted nematic, which dominated the early years of LCD technology and remain important today. The construction, operation, and optical properties of both modes are described together with their multiplexing performance.

List of Abbreviations
Δε = ε// − ε⊥ Permittivity anisotropy
ε//, ε⊥ Electric permittivity along and normal to the director, respectively
Δn = n// − n⊥ Optical anisotropy
n//, n⊥ Refractive index along and normal to the director, respectively
k11, k22, k33 Splay, twist, and bend elastic constants, respectively
LCD Liquid crystal display
N Number of lines in a matrix
Pre-tilt Angle between director and the planar surface
rms Root mean square
STN Supertwisted nematic liquid crystal
TFT Thin-film transistor
TN Twisted nematic liquid crystal
Vsel, Vunsel Select and unselect voltages, respectively

Introduction
The invention in 1971 of the twisted nematic (TN) electro-optic effect (Schadt and Helfrich 1971) was a major landmark in the development of liquid crystal display technology. The combination of the TN device with the cyanobiphenyl liquid crystal materials provided, for the first time, LCDs with an acceptable performance and long operating life. Initially the TN device was used in low information content displays with only a few characters or numbers showing limited amounts of information; the traditional wristwatch and calculator are classic examples of such low information content displays still found today. However, as the displays market developed, so did the desire for displays with larger amounts of information, and it soon became clear that the one major drawback of the basic TN device was the inability to multiplex, or share, the electrodes. Two quite different technologies arose in the years around 1980 that transformed the amount of information that could be displayed on LCDs. One is the amorphous silicon thin-film transistor (TFT), considered in chapter "▶ Hydrogenated Amorphous Silicon Thin-Film Transistors (a Si:H TFTs)," which is used with the standard TN device. The other is the


supertwisted nematic (STN) LCD, where the geometry of the TN device was changed to produce a different device structure which could display large amounts of information without the need for TFTs. In this section we consider the construction, operation, and performance of both TN and STN LCDs.

Twisted Nematic Liquid Crystal Device

Twisted Nematic Construction and Operation
The construction and operation of the twisted nematic display device is illustrated in Fig. 1. The liquid crystal layer has a twist angle (φ) of 90° induced by orthogonal planar alignment on the two glass surfaces containing the thin layer of liquid crystal. In the unactivated (OFF) state the nematic director twists uniformly from one surface to the other, hence the name twisted nematic. Light incident on the device is polarized by the first polarizer, and the twisted liquid crystal layer rotates, or guides, the plane of polarization by 90° so that it becomes parallel to the transmission axis of a second polarizer with polarization axis orthogonal to the first, and light is transmitted. The application of a small voltage of around 2–3 V across the liquid crystal layer reorients the director toward the electric field and distorts the uniformly twisted structure, provided that the nematic material has positive dielectric anisotropy (ε// > ε⊥). The ON state now no longer rotates the plane of polarization, and the light is blocked by the second polarizer. Removal of the electric field restores the uniformly twisted state, and light is again transmitted. The TN cell in this configuration is known as the normally white mode. It can also be operated using parallel polarizers; in this case the OFF state is black and the configuration is known as the normally black mode. The normally white mode is frequently used with a reflector in portable low-power applications (e.g., watches and calculators) which use ambient light. The normally black mode is usually used in backlit applications such as mobile phones and computers.

Fig. 1 The construction and operation of a TN display in the normally white mode

Domains
The construction of practical TN devices is complicated by the existence of two types of defects which occur as the boundaries between domains of different orientation of the liquid crystal. The domains are

visible as areas of different contrast and are particularly visible off axis. Apart from being unsightly they can result in misreading of the displayed information and are unacceptable in commercial devices. The orthogonal surface alignment used on the two glass plates is compatible with both right-hand and left-hand twisted structures in the OFF state (i.e., φ = +90° or −90°), and a TN device breaks up into two sets of domains with a typical size of several millimeters. These domains persist into the ON state, where they are highly visible, and are known as reverse twist domains. The application of an electric field across the layer can also result in two further sets of similar size domains, also visible as areas of different contrast in the ON state. These arise from the reorientation of the director in two possible directions toward the applied electric field, causing reverse tilt domains. The use of long pitch (P ≈ 200 μm) chiral nematic liquid crystals and surface alignments with a small pre-tilt (≈1°) removes both sets of domains and results in a defect-free display, provided that the chirality of the liquid crystal and of the pre-tilts are combined correctly (Raynes 1975).

Optical Properties of the OFF State
Full modeling of the optical properties of the OFF state of TN devices can be carried out using the techniques described in chapter "▶ Optics of Liquid Crystals and Liquid Crystal Displays." However, most of the essential features of the optical properties at normal incidence of the OFF state can be understood using the 2 × 2 Jones matrix method. For the general case of a nematic liquid crystal layer of thickness d, birefringence Δn, and twist angle φ enclosed between a polarizer and an analyzer with arbitrary orientations, the normalized transmitted intensity T_TN of light of wavelength λ for the geometry defined in Fig. 2 is given by (Raynes 1987a)

$$T_{TN} = \left\{\cos\delta\,\cos(\varphi+\theta-\gamma) + \frac{\sin\delta\,\sin(\varphi+\theta-\gamma)}{\sqrt{1+a^2}}\right\}^2 + \frac{a^2\sin^2\delta\,\cos^2(\varphi-\theta-\gamma)}{1+a^2} \qquad (1)$$

where $a = \Delta n\,d\,\pi/\varphi\lambda$ and $\delta = \varphi\sqrt{1+a^2}$.

Fig. 2 Director and polarizer orientations for twisted liquid crystal layers


The transmission of the OFF state of a standard TN device between parallel polarizers can be derived from this general equation by substituting φ = 90° and γ = θ = 0 and results in the Gooch–Tarry equation (Gooch and Tarry 1975):

$$T_{TN} = \frac{\sin^2\left[(\pi/2)\sqrt{1 + (2\Delta n\,d/\lambda)^2}\right]}{1 + (2\Delta n\,d/\lambda)^2} \qquad (2)$$

Equation 2 is plotted in Fig. 3, and as (2Δnd/λ) increases, the transmission is seen to oscillate with decreasing amplitude. TN cells with a large optical thickness (2Δnd/λ) satisfy the condition first identified by Mauguin (Mauguin 1911) and rotate all wavelengths of light by 90°; the plane of polarization is said to be guided by the twisted structure. Although the amplitude of the oscillations increases as (2Δnd/λ) is reduced, Gooch and Tarry noted that Eq. 2 predicts a series of transmission minima when

$$(\Delta n\,d/\lambda)^2 = m^2 - 1/4$$

e.g., for m = 1, 2, …, Δn·d = 0.87λ, 1.94λ, … TN devices operating in either of the first two of these so-called Gooch–Tarry minima, with m = 1 or 2, are thinner than conventional (Mauguin) TN devices, show faster switching speeds, and have improved viewing angle properties. First and second minima devices have become the standard commercial TN device.

Fig. 3 Normalized transmission of monochromatic light through a TN device in the normally black mode between parallel polarizers, where x is the dimensionless parameter (2Δnd/λ)
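Equation 2 and the minima condition are straightforward to verify numerically; the sketch below (my own illustration, not from the chapter) confirms that the transmission vanishes exactly at the first two Gooch–Tarry minima:

```python
import numpy as np

def t_parallel(dn_d_over_lambda):
    """Gooch-Tarry transmission, Eq. 2, as a function of dn*d/lambda."""
    s = np.sqrt(1.0 + (2.0 * dn_d_over_lambda) ** 2)
    return (np.sin(0.5 * np.pi * s) / s) ** 2

# Minima of Eq. 2: (dn*d/lambda)^2 = m^2 - 1/4
for m in (1, 2):
    dnd = np.sqrt(m**2 - 0.25)        # 0.866..., 1.936... (0.87, 1.94 in the text)
    print(m, round(dnd, 3), t_parallel(dnd))  # transmission is ~0 at each minimum
```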

Field-Induced Reorientation

The analytical solution of the continuum energy equation (see chapter "▶ Liquid Crystal Theory and Modelling") for a TN layer with φ = 90° and zero pre-tilt (θs = 0) shows that there is a threshold voltage given by (Raynes 1975)

$$V_c^2 = \frac{\pi^2\left\{k_{11} + (k_{33} - 2k_{22})/4 + 2k_{22}\,d/P\right\}}{\epsilon_0\left(\epsilon_{\parallel} - \epsilon_\perp\right)} \qquad (4)$$

and an initial slope of θm² just above threshold, for the case of (d/P) = 0, by (Raynes et al. 1979)

$$\theta_m^2 = \frac{4(V - V_c)}{V_c\left[5k_{33}/8k_{11} + \left(\epsilon_{\parallel} - \epsilon_\perp\right)/\epsilon_\perp\right]} \qquad (5)$$

Fig. 4 Numerically calculated voltage dependence of the midplane tilt angle, θm, of liquid crystal layers of various twist angles φ with θs = 1°, k11 = 10 pN, k22 = 5 pN, k33 = 20 pN, ε⊥ = 5, ε// = 15, and b = 1

Fig. 5 Numerically calculated full director profile (twist φ and tilt θ versus distance through the cell) of a TN layer at a voltage well above threshold

Minimization of the free energy using numerical procedures allows the calculation of the full director profile even in the case of finite surface pre-tilt (θs ≠ 0) and added chirality (d/P ≠ 0). Figure 4 shows the calculated voltage dependence of the midplane tilt angle θm for a variety of twist angles φ; the TN device corresponds to the specific case φ = 90°. This figure demonstrates the existence of a threshold voltage and a distinct slope above threshold. Figure 5 shows the full twist and tilt director profile for a typical TN device for an applied voltage well above threshold. The twist is symmetric about the middle of the layer and is concentrated in the center of the layer. The tilt profile is also symmetric about the center of the layer, with a "top hat"-like distribution.
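Equation 4 is easy to evaluate with the constants quoted in the Fig. 4 caption; the helper below is my own sketch (not the chapter's code), and for the 90° cell with d/P = 0 it gives a threshold of about 1.2 V, consistent with Fig. 4:

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m

def v_threshold(k11, k22, k33, d_over_P, eps_par, eps_perp):
    """TN threshold voltage from Eq. 4 (90-deg twist)."""
    num = k11 + (k33 - 2.0 * k22) / 4.0 + 2.0 * k22 * d_over_P
    return math.pi * math.sqrt(num / (EPS0 * (eps_par - eps_perp)))

# Constants from the Fig. 4 caption
print(round(v_threshold(10e-12, 5e-12, 20e-12, 0.0, 15.0, 5.0), 2))  # ~1.18 V
```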

Optical Properties of the ON State
The optical properties of the ON state of a TN device are calculated by applying the optical methods described in chapter "▶ Optics of Liquid Crystals and Liquid Crystal Displays" to the full director profiles calculated using the numerical methods. The transmission at normal incidence of a typical first minimum TN device for the normally black and normally white modes gives the results shown in Fig. 6; the two modes are complementary. Contrast ratios well in excess of 100:1 can be observed. The results of the numerical calculation of the transmission of light at oblique incidence show an asymmetry of contrast of TN cells about the cell normal. This strong asymmetry is a well-known weakness of TN devices and is easily observed, particularly in multiplexed devices and in TFT-driven devices with gray scale. The physical origin of this asymmetry is quite straightforward to understand. Figure 4 shows that for typical operating voltages, θm is in the range 45–70°, and therefore for some directions light will be propagating close to the optic axis in the center of the layer, resulting in a loss of guiding and a high contrast. The direction of highest contrast is determined by a combination of the twist sense of the layer and the sign of the surface pre-tilt and is carefully selected in most TN devices to be compatible with the intended usage. TN displays are widely used in computer screens where the optical properties (such as contrast ratio and viewing angle) are improved considerably by the use of optical compensation films.

Fig. 6 Transmission of a TN layer in the normally black (a) and normally white (b) modes

Multiplexed Twisted Nematic Displays
The information content of a typical display is too large to address each pixel individually, and a multiplexing, or matrix addressing, technique where electrodes are shared between many pixels is universal. Each row in the matrix is selected sequentially, while appropriate data waveforms are applied to the columns. The slow response times of TN LCDs and the quadratic dependence of the free energy on the applied field mean that each pixel responds to the root mean square (rms) of the resulting waveforms. As the number of lines (N) in the matrix increases, the fraction (1/N) of the total time for which the selected pixels see the full select pulse decreases, thereby reducing the ratio of the rms voltages seen by the selected (ON) and the
unselected (OFF) pixels. Alt and Pleshko (Alt and Pleshko 1974) showed that the maximum ratio of select to unselect voltages is given by

$$\frac{V_{sel}}{V_{unsel}} = \left(\frac{\sqrt{N}+1}{\sqrt{N}-1}\right)^{1/2} \qquad (6)$$

For a multiplexing ratio of 100:1 (N = 100), the effective select voltage is only 11 % higher than the voltage on unselected pixels. The multiplexing ratio achievable (the value of N) is therefore determined by the steepness of the transmission-voltage curve of the device, with any angular dependence of the contrast degrading the performance even further. From Eq. 5 it is evident that lowering both k33/k11 and (ε// − ε⊥)/ε⊥ increases the steepness of the curve. However, the need to preserve low operating voltages means that lowering k33/k11 represents the preferred method for improving the multiplexing performance of TN displays. Many attempts have been made to tailor the elastic constant ratio k33/k11 of nematic liquid crystal materials and mixtures (Bradshaw and Raynes 1983, 1986), but these efforts met with only limited success and the multiplexing possible corresponds to N ≈ 5, meaning the multiplexed TN device is inadequate for displays with a high information content such as mobile phones, TVs, and computer screens. Two quite different technologies came to the rescue. One is the use of amorphous silicon to generate an array of thin-film transistors (TFT), considered in chapter "▶ Hydrogenated Amorphous Silicon Thin-Film Transistors (a Si:H TFTs)." However, the development costs of TFT technology were initially considered to be prohibitively high, as the market for LCDs with high information content was uncertain. Fortunately a second solution was found which used a similar technology to the TN device and had only modest associated development costs. This used devices with increased twist angle, which was found to increase the steepness of the transmission-voltage curve, and hence the multiplexing performance. This alternative device became known as the supertwisted nematic (STN) and is described in the next section. It proved to be a key display device as it was developed at a modest cost and established the use of LCDs in products such as mobile phones and laptop computers. This gave confidence in the market, and the inherently superior TFT technology was developed and eventually replaced STN in these products. STN displays are still found, however, in credit card readers and a host of automotive, domestic, and office products.
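Equation 6 is easily tabulated; the short sketch below (illustrative only, not from the chapter) shows how quickly the select/unselect discrimination collapses toward unity as N grows, which is the heart of the multiplexing limit discussed above:

```python
import math

def selection_ratio(n_lines):
    """Maximum Vsel/Vunsel for rms addressing of N multiplexed lines (Eq. 6)."""
    r = math.sqrt(n_lines)
    return math.sqrt((r + 1.0) / (r - 1.0))

for n in (2, 10, 100, 240):
    print(n, round(selection_ratio(n), 3))
# N = 100 gives ~1.106, i.e., the select rms voltage is only ~11 % above unselect
```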

Supertwisted Nematic Liquid Crystal Devices

Supertwisted Nematic LCD Construction and Operation
It was found in 1982 (Waters et al. 1983) that the increase of steepness of the transmission-voltage curve necessary for multiplexing could be achieved quite simply by increasing the twist angle φ of the liquid crystal layer from the 90° used in the TN device to lie within the range 180–270°. This is clearly shown for a range of φ by both the experimental results of Fig. 7 and the calculated voltage dependence of the midplane tilt angle θm of Fig. 4. These larger twist angles are stabilized by a combination of surface alignment and a chiral nematic material with a pitch P within the range

$$\frac{\varphi}{2\pi} - 0.25 \le \frac{d}{P} \le \frac{\varphi}{2\pi} + 0.25 \qquad (7)$$


Fig. 7 Transmission-voltage curves of a standard LC mixture in devices with different twist angles

Fig. 8 Transmission-voltage curve of an optimized LC mixture in an STN display

As with the TN device, care is needed to use the correct combination of the surface pre-tilts and sign of P to produce a uniform director tilt across the layer. The device acquired the name of supertwisted nematic (STN) LCD and was later optimized for operation in a two-polarizer mode (Scheffer and Nehring 1984). When the twist angle and the material parameters are adjusted to give an approximately infinite transmission-voltage slope (Fig. 8), high levels of multiplexing are possible and, because of this, the STN device became the standard multiplexed LCD for a range of high information content displays until it was superseded by TFT LCDs.

Optical Properties of the OFF State

The large twist angle φ of the STN device (180–270°) inhibits the guiding of the plane of polarization of light characteristic of TN devices, and the normally bright OFF state typically has a greenish yellow color and the ON state a dark blue color. We can use Eq. 1 to calculate the transmission for the case when the polarizer is oriented at an angle of +45° to the input director and the analyzer at an angle of +45° to the exit director. The normalized transmission T_STN is found from Eq. 1 to be given by

$$T_{STN} = \cos^2\left\{\sqrt{\varphi^2 + (\Delta n\,d\,\pi/\lambda)^2}\right\} \qquad (8)$$

The device is therefore colored and has a maximum transmission (T_STN = 1) when

$$\Delta n\,d/\lambda = \sqrt{p^2 - (\varphi/\pi)^2} \qquad (9)$$

where p is an integer. The optimum values of Δnd, the transmission spectra, and characteristic color of a range of STN devices are readily derived from Eq. 8 (Raynes 1987b). The inherent color of STN devices posed a problem for some applications, and compensation techniques were developed to remove this color to produce a black and white display. In some applications this was then converted into a color display by incorporating color filters. The favored method of compensation involves the use of birefringent films between the polarizers and the STN layer (Okumura et al. 1987).
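Equation 9 can be evaluated directly, remembering that a real solution requires the integer p ≥ φ/π; the sketch below is my own, with an assumed 240° twist and 550 nm design wavelength:

```python
import math

def optimum_dnd(phi_deg, wavelength_nm, p):
    """Delta-n * d (in nm) giving T_STN = 1, Eq. 9; needs integer p >= phi/pi."""
    ratio = phi_deg / 180.0          # phi/pi for phi given in degrees
    return wavelength_nm * math.sqrt(p**2 - ratio**2)

# For a 240-deg twist, p = 1 has no real solution; the first usable orders are:
for p in (2, 3):
    print(p, round(optimum_dnd(240.0, 550.0, p), 1), "nm")  # 819.9, 1478.1
```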

Field-Induced Reorientation

A numerical solution of the continuum energy equation (see chapter "▶ Liquid Crystal Theory and Modelling") generates the voltage-dependent values for θm shown in Fig. 4 for a range of twist angles φ. These demonstrate that both the threshold voltage and the steepness of the transmission-voltage curve increase with φ. The optimum STN device should have a nearly infinite slope of the transmission-voltage curve, and the appropriate combinations of material and device parameters which induce this have been widely studied using numerical techniques (Waters et al. 1985; Raynes and Smith 1987). Analytical solutions of the continuum energy equation, extended to an arbitrary twist angle φ, have been found for the case of zero surface pre-tilt for both the threshold voltage and the slope just above threshold (Raynes 1986). These help in the understanding of the STN switching process. The threshold voltage is given by

$$V_c^2 = \frac{\pi^2 k_{11} + \varphi^2\left\{k_{33} - 2k_{22}(1-b)\right\}}{\epsilon_0\left(\epsilon_{\parallel} - \epsilon_\perp\right)} \qquad (10)$$

where b = 2πd/(Pφ), and the initial slope just above threshold is given by

$$\theta_m^2 = \frac{4(V - V_c)}{V_c\left[F(k) + \left(\epsilon_{\parallel} - \epsilon_\perp\right)/\epsilon_\perp\right]} \qquad (11)$$
where

$$F(k) = \frac{k_{33} - (\varphi/\pi)^2\left[k_{33}^2/k_{22} + k_{22}\left(1 - 4b + b^2\right) + k_{33}(2b - 1)\right]}{k_{11} + (\varphi/\pi)^2\left\{k_{33} - 2k_{22}(1 - b)\right\}} \qquad (12)$$

For all known nematic liquid crystal materials,

$$k_{33}^2/k_{22} + k_{22}\left(1 - 4b + b^2\right) + k_{33}(2b - 1) > 0 \quad \text{and therefore} \quad k_{33} - 2k_{22}(1 - b) > 0. \qquad (13)$$

F(k) is therefore reduced as the twist angle φ is increased, becoming zero for some value of φ and eventually negative as φ is increased further. For a large enough φ, F(k) is sufficiently negative that

$$F(k) + \left(\epsilon_{\parallel} - \epsilon_\perp\right)/\epsilon_\perp = 0 \qquad (14)$$

and the θm–voltage curve, and hence also the transmission-voltage curve, has infinite slope and is close to the optimum multiplexing conditions. Parameter sets calculated from Eqs. 10 and 14 form a useful starting point for the more detailed numerical calculations. The existence of other instabilities (Chigrinov et al. 1979) in chiral nematic layers subject to electric fields, which compete with the Freedericksz transition, is of considerable practical significance. For small ratios of d/P, a Freedericksz transition takes place, but as d/P is increased, as is the case in an STN with larger twist angle φ, a periodically modulated structure appears at a voltage below the Freedericksz threshold voltage. These periodic distortions appear as a striped texture (Waters et al. 1985) that scatters light and can be eliminated by the use of a combination of a high surface pre-tilt (≈5°) and a small d/P ratio (≈0.5) (Waters et al. 1985). A theoretical analysis (Schiller and Schiller 1990) of the periodic instability also prompted the design of liquid crystal mixtures which minimize the problem.
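To see how F(k) falls with increasing twist, Eq. 12 can be evaluated with the elastic constants quoted in the Fig. 4 caption and b = 1 as in that figure; the code itself is my sketch, not the chapter's:

```python
import math

def F_of_k(k11, k22, k33, phi_rad, b):
    """F(k) from Eq. 12; b = 2*pi*d/(P*phi)."""
    r = (phi_rad / math.pi) ** 2
    num = k33 - r * (k33**2 / k22 + k22 * (1 - 4*b + b**2) + k33 * (2*b - 1))
    den = k11 + r * (k33 - 2 * k22 * (1 - b))
    return num / den

k11, k22, k33 = 10e-12, 5e-12, 20e-12   # elastic constants from the Fig. 4 caption
for phi_deg in (90, 180, 240, 270):
    print(phi_deg, round(F_of_k(k11, k22, k33, math.radians(phi_deg), 1.0), 2))
# F(k) falls from about -0.17 to -3.32; with eps// = 15 and eps_perp = 5 the
# condition of Eq. 14, F(k) = -(eps// - eps_perp)/eps_perp = -2, is crossed
# as the twist angle increases
```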

Optical Properties of the ON State
The steep voltage dependence of θm shown in Fig. 4 results in the steep transmission-voltage dependence which makes the STN device so well suited for use in multiplexed high information content displays (Fig. 8). As was the case with the TN device, the optical techniques covered in chapter "▶ Optics of Liquid Crystals and Liquid Crystal Displays" can be combined with the numerical solution of the continuum energy equation to calculate the optical properties of the ON state of the STN device. This calculation is used routinely as an aid in the design of the materials and device parameters suitable for STN displays. The excellent transmission-voltage steepness and acceptable viewing characteristics of the STN display resulted in its preference over the TN device in passive multiplexed LCDs.

Conclusions
In this review we have examined the operation and physics, including the optics, of TN and STN liquid crystal display devices. Significant improvements in the performance of video rate color LCDs have been achieved by adding optical compensation films and by combining TN LCDs with arrays of thin-film transistors, based on amorphous silicon.


Further Reading
Alt PM, Pleshko P (1974) IEEE Trans Electron Devices ED-21:146
Bradshaw MJ, Raynes EP (1983) Mol Cryst Liq Cryst 91:145
Bradshaw MJ, Raynes EP (1986) Mol Cryst Liq Cryst 138:307
Chigrinov VG, Belyaev VV, Belyaev SV, Grebenkin MF (1979) Sov Phys JETP 50:994
Gooch CH, Tarry HA (1975) J Phys D Appl Phys 8:1575
Mauguin C (1911) Bull Soc Fr Miner 34:71
Okumura O, Nagata M, Wada K (1987) ITEJ Tech Rep 11:27
Raynes EP (1975) Rev Phys Appl 10:117
Raynes EP (1986) Mol Cryst Liq Cryst Lett 4:1
Raynes EP (1987a) Mol Cryst Liq Cryst Lett 4:69
Raynes EP (1987b) Mol Cryst Liq Cryst Lett 4:69
Raynes EP, Smith RA (1987) Proceedings of the Euro-display, London, p 100
Raynes EP, Tough RJA, Davies KA (1979) Mol Cryst Liq Cryst 56:63
Schadt M, Helfrich W (1971) Appl Phys Lett 19:127
Scheffer TJ, Nehring J (1984) Appl Phys Lett 45:1021
Schiller P, Schiller K (1990) Liq Cryst 8:553
Waters CM, Brimmell V, Raynes EP (1983) Proceedings of the third international display research conference, Kobe, p 396
Waters CM, Raynes EP, Brimmell V (1985) Mol Cryst Liq Cryst 123:303


Smectic LCD Modes
Per Rudquist*
Department of Microtechnology and Nanoscience, Chalmers University of Technology, Göteborg, Sweden

Abstract
In this article, we discuss display modes based on smectic liquid crystals with special focus on chiral tilted smectic materials. Here, we find the ferroelectric liquid crystals and antiferroelectric liquid crystals (FLCs and AFLCs) which can provide 100–1,000 times faster pixel switching than nematic LCs. The hysteretic switching of bistable surface-stabilized FLCs and AFLCs allows for passive matrix addressing, which made these materials the prime candidates for large direct view LCDs in the 1990s, before the active-matrix thin-film transistor (TFT) technology was mature enough to allow for large display panels. While FLCs have found a number of applications, no AFLC device has as yet been commercialized. With today's large TFT arrays – developed for nematic LCDs – there is an increasing interest in combining FLCs and AFLCs with active-matrix technology, e.g., with the fast FLCs used in monostable, analog switching modes. This could lead to even more powerful LCDs with full grayscale and superior speed (facilitating field sequential color generation) compared to nematic LCDs.

List of Abbreviations
AFLC Antiferroelectric liquid crystal
DH mode Deformed helix mode in short-pitch SmC*
FLC Ferroelectric liquid crystal (most used also for SSFLC)
ITO Indium tin oxide (for transparent electrodes)
LCD Liquid crystal display
LCOS Liquid-crystal-on-silicon
n||, n⊥ Refractive index along and normal to the director
NLS Non-layer shrinkage
P, P Spontaneous electric polarization density in tilted chiral smectic LCs, e.g., in SmC* and SmCa*
QBS Quasi-bookshelf structure
SmA, SmA* Smectic A and chiral smectic A liquid crystal phase
SmC, SmC* Smectic C and chiral smectic C liquid crystal phase
SSFLC Surface-stabilized ferroelectric liquid crystal
Δn = n|| − n⊥ Optical anisotropy in a birefringent material

*Email: [email protected]


Introduction

Smectic Liquid Crystals
Smectic liquid crystals have the molecules arranged in layers in contrast to the nematic liquid crystal (Fig. 1a), which has no layers but only long-range orientational order of the molecules. In orthogonal smectics, like the SmA phase (Fig. 1b), the director n is parallel to the layer normal z. In tilted smectics like SmC, n is tilted by a finite angle θ away from z (Fig. 1c). The SmA phase is uniaxial with the optic axis along z, whereas the SmC phase is optically biaxial.

Fig. 1 Schematic illustration of (a) the nematic phase, (b) the SmA phase, (c) the SmC phase. In the chiral smectic C (SmC*), there is (d) a spontaneous polarization P perpendicular to n and z and (e) a helical structure along z

Early Smectic LCDs
The earliest work with smectics relevant to displays utilized scattering effects in the smectic A phase and was intended for projection devices. In 1973 F.J. Kahn, then at Bell Labs, demonstrated (Kahn 1973) that high-resolution graphic images could be written by an infrared laser beam onto a transparent cell of a homeotropically aligned SmA. The smectic is locally heated to the isotropic phase and the disorder frozen in when the liquid crystal rapidly returns to the smectic state. The resultant tiny scattering centers then appear black on white background in projection. With an x-y-deflected and intensity-modulated 20 mW YAG laser, 10⁴ picture elements per second could be written with a resolution of 50 lines/mm over a 3 × 3 cm area, giving a contrast ratio of about 10:1. This display mode is thus thermo-optic and constitutes a true phase-change mode. The written information has infinite storage capacity if the ambient temperature is kept reasonably far away from the smectic-nematic transition. Erasure can be achieved thermally, eventually assisted by an electric field. The full erasure of a picture requires about 20 s. A variation of this mode was presented by Hareng and Le Berre at Thomson-CSF (Paris) without modulation of the laser beam, but with a spatial modulation of the voltage applied to the cell (Hareng and Le Berre 1975). In addition to a somewhat higher contrast, a faster erasure and a moderate grayscale were achieved. By skipping the laser, the same authors later developed a quite impressive 100 × 100 flat panel display with a line addressing time of 40 ms where, instead, the rows were sequentially heated and the electric writing signal was applied to the columns (Hareng and Le Berre 1978). A slightly similar, though different, panel was demonstrated in the same year by the STL Group in Harlow (Coates et al. 1978) without use of heating effects. As in the previously mentioned cases, the memorized scattering state is a metastable texture but now induced by a low-frequency high voltage (100–150 V). The mechanism bears resemblance to the dynamic scattering mode in nematics. It is current-driven and therefore requires a liquid crystal material with high electric conductivity. When the field is taken away, the induced turbulence becomes a static scattering smectic texture. The clear state is regained by application of a high-frequency field (dielectric alignment). A large area 780 × 420 flat display by this technique required 0.8 s to write a frame (Crossland and Canter 1985). Like the Thomson panel, it works
without polarizers. The development of scattering-type SmA displays was essentially abandoned after the introduction of chiral smectic modes using ferroelectric and antiferroelectric liquid crystals. Review chapters on the early smectic A display technology can be found in Coates (1990).

Ferroelectric and Antiferroelectric LCs
Liquid crystal materials are in general nonpolar, and physical properties with vector symmetry, like a spontaneous electric polarization, are not allowed. But in tilted chiral smectic LCs, e.g., the SmC* phase, the symmetry is low enough to allow for a local spontaneous electric polarization density P perpendicular to the "tilt plane" (Meyer et al. 1975; see Fig. 1d), with the sign of P defined through P = P z × n. The interplay between molecular tilt, polarity, and chirality is the basis for ferro- and antiferroelectricity in liquid crystals, and the electrooptic effects in FLCs and AFLCs are primarily based on the linear coupling between the applied electric field E and the polarization P. The torque P × E switches the director azimuthally about z at constant θ, i.e., "on the surface of the smectic cone" with the opening angle of 2θ. The chirality, however, also leads to a helical structure where the director rotates on the smectic cone along z, cf. Fig. 1e. The period or pitch p of the helix is typically on the μm-scale but can be larger in mixtures. The helix of SmC* makes P spiral about z, which cancels out the macroscopic polarization and makes the phase optically uniaxial. But in thin cells where p is large compared to the cell thickness d, the helix can be elastically unwound by the liquid crystal-surface interactions and a macroscopic, switchable polarization is present. This is the basis for the most important smectic LCD mode: the so-called surface-stabilized ferroelectric liquid crystal (SSFLC) device (Clark and Lagerwall 1980; Lagerwall 1999).

In the antiferroelectric SmCa* liquid crystal phase (Fukuda et al. 1994; Chandani et al. 1989), the director tilts in opposite directions in adjacent layers. This anticlinic arrangement of chiral tilted molecules makes the structure antipolar, with P antiparallel in adjacent layers. The electrooptic effect in AFLCs (for details, see AFLCD Principle) is the field-induced switching between the antiferroelectric ground state and the two symmetric field-induced synclinic ferroelectric states. In many respects, AFLCs have been regarded as more attractive than FLCs but, due to several reasons, for instance, the so far insufficient achievable contrast, the AFLCD technology is still in a development phase.

Smectics Versus Nematics
Today's common LCD screens are based on nematic liquid crystals, which present a problem when it comes to high switching speed. At a frame rate of 60 Hz, the total frame time is 16 ms. The pixel response times of nematics are at best about 10 ms, i.e., of the order of the frame time, which is one reason for image blur. Here, FLCs (10 μs) and AFLCs (100 μs) provide about 100–1,000 times faster switching. The main reason is that the electrooptic effects in FLCs are polar and the switching can be actively driven in both directions simply by applying voltages with opposite polarity. In contrast, nematics are nonpolar and do not respond to the sign of the applied field. While their response to an electric field is actively driven, the relaxation back to the original state is driven only elastically, which sets the limits for the working speed of nematics. But the higher switching speed in (A)FLCs comes at a cost – the physics of smectics is much more complex. The layered structure of smectics allows for a variety of defect structures, which can ruin the performance of the device. These defects, once formed, do not heal out over very short distances (as irregularities in a nematic – which is a 3D liquid – generally do), but can affect the alignment quality of the smectic over large areas. Moreover, the spontaneous polarization P interacts with ions and introduces polar terms in the surface anchoring, which further complicates both switching and alignment. The problems related to alignment quality are factors preventing FLCs and AFLCs from being exploited on a large scale. A difficulty that was originally considered a very serious obstacle for exploitation is that the cell gap d must be thinner than in nematic LCDs. In FLC and AFLC displays, the synclinic state should
fulfill the half-wave plate condition dΔn = λ/2 (λ/4 in reflective mode devices). For instance, with Δn ≈ 0.15 and λ ≈ 0.5 μm, we get d ≈ 1.7 μm, thus considerably smaller than in most nematic LCDs, which use cell gaps of about 3–5 μm. Despite these challenges, FLCs have been commercialized and are today used, for instance, as microdisplays with silicon backplanes in electronic viewfinders (several million units per year) and recently for picoprojectors. FLCs have also been commercialized in various photonics applications and in some passively driven FLC displays requiring absolute bistability. An example is the FLC key processing element in Agfa's DIMAX machine capable of producing 20,000 color prints per hour.
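The half-wave (or quarter-wave) condition above translates directly into a cell gap; a one-line check (illustrative only, not from the chapter):

```python
def cell_gap_um(delta_n, wavelength_um, reflective=False):
    """Cell gap satisfying d*dn = lambda/2 (lambda/4 for reflective devices)."""
    return wavelength_um / ((4.0 if reflective else 2.0) * delta_n)

print(round(cell_gap_um(0.15, 0.5), 2))        # 1.67 um, transmissive
print(round(cell_gap_um(0.15, 0.5, True), 2))  # 0.83 um, reflective
```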

Ferroelectric Liquid Crystal Displays

The Bistable SSFLC Display Principle
Surface Stabilization
The most important smectic LCD mode is the bistable SSFLC mode (Clark and Lagerwall 1980, Fig. 2). An SmC* liquid crystal is arranged with the smectic layers perpendicular to the cell substrates, commonly referred to as "bookshelf geometry." This is accomplished through planar aligning surface layers (e.g., rubbed polyimide layers or obliquely evaporated inorganic surface layers such as SiOx or SiO2) and using a material with the phase sequence isotropic-nematic-SmA-SmC*.

Fig. 2 (a) The helical structure of SmC* is not compatible with planar anchoring. The helix is unwound, and the two stable director states are defined by the intersection between the plane of the surface and the smectic cone. (b) The ideal bookshelf SSFLC device structure. (c) Schematic transmission-voltage characteristics (single hysteresis loop) of the SSFLC device for a half-wave plate cell between crossed polarizers (P and A), for θ = 22.5°

After filling the cell, the liquid crystal material is slowly cooled from the isotropic phase. In the (chiral) nematic phase, a homogeneous director field along the rubbing direction is obtained. The intrinsic helical structure of the chiral nematic phase is suppressed through choosing a pitch of the chiral nematic phase p > 4d (Bradshaw et al. 1987). On entering the SmA phase, the smectic layers form perpendicular to n, i.e., normal to the glass plates (bookshelf structure), before further cooling into the SmC* phase. The surface anchoring is strong enough to "unwind" the SmC* helical structure close to the surfaces, and when the cell gap d is sufficiently small, the helix is suppressed (unwound) by the surface action also in the volume. This allows for only two possible stable orientations of n which simultaneously fulfill the
conditions that the molecules must be parallel to the surfaces while they must stay on the smectic cone. Moreover, as the helix is suppressed, the electric polarization is no longer cancelled out, but there is a macroscopic polarization perpendicular to the surfaces. In the virgin (unpoled) state, spontaneous ferroelectric domains appear, representing opposite signs of P. The two states (UP and DOWN) should be energetically equivalent in zero field, but can be switched between each other by means of an applied electric field E of the right polarity. The switching takes place only when E has attained a certain threshold value, and the new state remains after the field has been taken off. This elastically unwound structure thus has a macroscopic polarization and two stable states in the absence of an electric field, properties which are characteristic of a ferroelectric. This motivates the name SSFLC. Note that the SmC*, the chiral smectic C phase, is not intrinsically ferroelectric. Unlike a solid crystal, a liquid crystal cannot by itself have ferroelectric properties. The reaction of an SmC* on applying an electric field in the same geometry is a dielectric effect: the helix unwinds and the SmC* gets polarized (see Deformed Helix Mode). The SSFLC structure is a bistable switchable waveplate with two field-controlled positions of the optical indicatrix in the plane of the cell. Between crossed polarizers with one of the director states parallel to one of the polarizers, we get a dark state. The other state is bright with the transmission

I = I0 sin²(4θ) sin²(πdΔn/λ)   (1)

where I0 is the intensity before the first polarizer, θ is the molecular tilt angle, Δn is the birefringence, and λ is the wavelength of light in vacuum. For maximum contrast, the tilt angle θ should be 22.5°, and the thickness should be tuned to constitute a half-wave plate, i.e., λ/2 = Δnd. The characteristic single hysteresis loop in the transmission-voltage characteristics of the SSFLC device is depicted in Fig. 2c. As the electrooptic states are in the plane of the cell (in-plane switching), the viewing angle is nearly hemispheric, with no need for optical compensation foils. The bookshelf structure is, however, very susceptible to mechanical shock – even a small mechanical action on the cell substrates can destroy the bookshelf alignment. This problem could be reduced by, for example, polymer stabilization of the SSFLC structure or, as in the Canon FLCD case, ruled out by protecting the SSFLC panel by means of an external hard transparent front protective sheet separated from the display panel.

Dynamics of the Switching

The equation of motion for the azimuthal director rotation on the smectic cone, disregarding the elastic torques from the surfaces, is (Lagerwall 1999)

γφ ∂φ/∂t = PE sin φ + (1/2)Δεε0E² sin²θ sin 2φ + K∇²φ   (2)

where φ is the azimuthal angle in Fig. 1d. The first term on the right side is the ferroelectric torque (∝E) dominating at low fields. The second term is the dielectric torque (∝E²) that becomes important at high fields, containing the dielectric anisotropy Δε (see t-Vmin mode). The third term is the elastic term tending to make n uniform in the smectic layer plane, with K as the effective elastic constant. If in a first approximation we disregard the dielectric and elastic terms in Eq. 2, we get

γφ ∂φ/∂t = PE sin φ   (3)



For small deviations from the equilibrium state, we get φ = φ0e^(t/τ) with the characteristic response time τ = γφ/(PE). At small fields, a typical response time is about 100 μs. But as we see, the speed increases linearly with the spontaneous polarization P, which is a molecular property. It is counteracted by the rotational viscosity γφ, which is rather a global property. Increasing the ratio P/γφ is a prime task for molecular synthesis and the design of well-engineered mixtures. Whereas an enormous amount has been achieved due to the dedicated effort on the corresponding task in nematics in the last two decades, very little has so far been invested in well-engineered FLC materials. An educated guess is that about a factor of 10 would result from a similar effort in FLC materials. In the SSFLC cell, the threshold for switching is actually in the voltage-time pulse area (Clark and Lagerwall 1980): the lower the magnitude of the pulse, the longer is the critical pulse length, and the shorter the pulse, the higher is the threshold field to switch the cell. (This is different from the pure voltage threshold in the typical nematic case.) If the pulse is too short, there is not enough time for the necessary nucleation of domains with opposite polarity, which then grow at the expense of the starting state through domain wall motion. If there is no switching at the surfaces, the elastic torque will make the cell relax back to the starting state when the pulse is switched off (Clark and Lagerwall 1980; see also Half-V-Mode). The voltage-time area threshold for switching the SSFLC has a large impact on the driving voltages in active matrix compared to passive matrix drive. In video-rate passive matrix displays, the length of the addressing pulse is about 10 μs, which might require tens of volts for switching. On the other hand, in active matrix, or in simple bistable SSFLC displays (e.g., shelf labels or e-book displays), the effective pulse length can be allowed to be several milliseconds, and even fractions of a volt can be enough for switching.
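To make the orders of magnitude concrete, the torque balance of Eq. 3 can be evaluated numerically. The sketch below estimates τ = γφ/(PE); the viscosity and polarization values are typical textbook orders of magnitude chosen for illustration, not data for a particular mixture:

def response_time_s(gamma_phi, p_s, voltage, gap_m):
    # Characteristic switching time tau = gamma_phi / (P * E), with E = V / d.
    e_field = voltage / gap_m
    return gamma_phi / (p_s * e_field)

gamma_phi = 0.05          # rotational viscosity [Pa s] (illustrative)
p_s = 50e-9 / 1e-4        # spontaneous polarization: 50 nC/cm^2 -> [C/m^2]

tau = response_time_s(gamma_phi, p_s, voltage=5.0, gap_m=1.7e-6)
print('tau ~ %.0f us' % (tau * 1e6))   # a few tens of microseconds

With 5 V across a 1.7-μm cell, the estimate lands in the tens of microseconds, consistent with the switching speeds quoted above.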

Chevron Structures

Formation of Chevrons

The simple bookshelf structure depicted in Fig. 2b is an idealized case. As it turned out, the real structure of an SSFLC device is more complicated. When entering the SmC phase, the layers tend to fold into a kind of chevron structure. The reason is the decrease of the smectic layer thickness as a result of the tilt. On cooling through the SmA-SmC transition, the layers do not slip along the surface (the smectic periodicity dA in the SmA phase is imprinted), and the only way for the material to satisfy the new smaller layer thickness dC ≈ dA cos θ in the bulk and the imprinted period dA at the surface is the creation of a folded structure (Rieker et al. 1987). The mechanisms behind chevron formation are today quite well understood, but mastering the difficulties resulting from these structures is still a central part of FLC technology. First, the chevron can form in two possible directions along the cell, and where regions with opposite direction meet, characteristic zigzag defects appear which cause light leakage in the dark state (Fig. 3b). Second, the chevron interface acts as a third surface (in addition to the two cell surfaces), with its own boundary conditions for the director and the polarization fields. This means that a plurality of device states, both uniform and twisted, is possible above and below the chevron interface (see, for instance, Takatoh et al. 2005). Third, the chevron increases the angle between the memorized state and the fully switched director orientation under applied field. Finally, the chevron automatically makes P noncollinear with the applied electric field E. Hence, there is an immediate torque when the field is applied, which changes the nature of the threshold. In summary, the chevron strongly influences the optics, the electrooptics, and the bistability of SSFLC displays and has a large impact on the multiplexed driving in passive matrix displays. For a detailed analysis of the electrooptics of different chevron states in SSFLC devices, we refer the reader to Lagerwall 1999, Takatoh et al. 2005, and Dijon 1990.
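Since the bulk layer spacing dC ≈ dA cos θ must join the imprinted surface spacing dA, the chevron kink angle δ follows from cos δ = dC/dA, so that ideally δ ≈ θ (in practice δ is somewhat smaller). A minimal sketch with invented, illustrative layer spacings:

import math

def chevron_angle_deg(d_a_nm, d_c_nm):
    # Fold angle needed to fit the bulk period d_C under the surface period d_A.
    return math.degrees(math.acos(d_c_nm / d_a_nm))

theta_deg = 22.5                                 # molecular tilt (illustrative)
d_a = 3.0                                        # SmA layer spacing [nm] (illustrative)
d_c = d_a * math.cos(math.radians(theta_deg))    # ideal SmC layer spacing
print('delta ~ %.1f deg' % chevron_angle_deg(d_a, d_c))   # ~22.5 deg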




Fig. 3 (a) Chevron in an FLCD. The director is continuous across the chevron interface, whereas P is discontinuous. (b) Microphotographs of zigzag defect regions in the bright and dark states (material Felix 015-000, polyimide 2610, cell thickness 1.3 μm). Outside the electrode area (right), dark and bright states occur with equal probability, illustrating the bistability of the structure. (c) Illustration of C1 and C2. (d) Illustration of the difference in memory angle in C1 and C2. Dots indicate the tip of the director on the smectic cone

The fact that the polarization is not collinear with the applied field in chevron cells gives an effective torque also on the smectic layers under applied fields. For high-polarization materials, this can be used to straighten up the chevron into a quasi-bookshelf structure (QBS) by applying strong AC fields (Hartmann and Luyckx-Smolders 1990). The chevron angle δ is then reduced and the memory angles widened, which provides higher contrast. Moreover, the QBS structure is attractive for multiplexed addressing, as it is less sensitive to cross-talk pulses under nonselecting bias waveforms (Sato et al. 1989). But unfortunately, the straightening up to QBS is not compatible with the conservation of the smectic layer thickness dC. Therefore, in QBS, the smectic layers form a kind of “horizontal chevrons” with stripe domains (Hartmann 1991). A potential solution to the chevron-related problems could be found in a still small class of materials showing no or only small layer shrinkage around the smectic A to C transition and therefore not forming chevrons (Lagerwall and Giesselmann 2006).

Control of the Chevron Direction

To avoid zigzag defects, it is crucial to make the chevron form in the same direction everywhere in the display. This can be done through careful tuning of the surface pretilt angle β, i.e., the preferred angle the director makes with the cell surfaces. If β = 0°, the two chevron directions are degenerate. For a finite value of β, however, they become nondegenerate, and we have to distinguish between two types of chevron, C1 and C2, with different structures and electrooptic properties (Kanbe et al. 1991). The surface pretilt of C1 and C2 is in the same and opposite direction, respectively, to the chevron kink; see Fig. 3c. For passive matrix FLCDs, where large memory angles are desired, C1 is often preferred as it gives a larger effective optical tilt in the quiescent state (cf. Fig. 3d). For active-matrix microdisplays, on the other hand, C2 can be used and is preferred as it is easier to stabilize. Both C1 and C2 can occur simultaneously at small pretilt angles. For very large values of β, only C1 is stable (Kanbe et al. 1991), as the chevron angle cannot be larger than θ.



SSFLCDs

In this section, we give two examples of passive matrix FLCDs, of which the Canon FLCD from 1995 has been commercialized. A number of video-rate passive matrix FLCD prototypes have also been developed, with grayscale produced by combinations of spatial and temporal dithering. However, the strong temperature dependence of the pixel switching characteristics in passive matrix FLCDs is a challenge. Therefore, for video-rate displays, there is an increasing interest in combining FLCs with a TFT active matrix, just as for nematics. Ferroelectric liquid-crystal-on-silicon (FLCOS) microdisplays are already a successful application of FLCs with an active matrix.

Addressing of SSFLCs

We have already discussed the unconventional threshold characteristics of SSFLC structures. But we also have to account for the ionic effects which give counteracting fields that may prevent the pixel from switching or even induce back-switching to the original state when the pulse is switched off (Dijon 1990). In order to avoid these effects, as well as the effect of burned-in images (“sticking”), the addressing must be DC balanced. This is a challenge in SSFLCs as opposite polarities of the field are used to write black and white states. In passive matrix displays, each pixel experiences the superposition of the voltages delivered to the row (selection waveform) and column (data waveform). A set of data and selection waveforms is called the addressing scheme. For reviews of different FLCD addressing schemes, see Dijon 1990 and Matuszczyk and Maltese 1995. The schemes have to be quite complex to comply both with the need for short line addressing times and with the chevron-induced reduction in memorized tilt angles (Fig. 3d). Non-addressed pixels always experience the data pulses to the other pixels of the same column, making the director vibrate slightly around the zero-field position and causing light leakage. In the case of perfect bistability, e.g., ideal bookshelf geometry, these “cross-talk vibrations” would be minimized and the contrast enhanced. In FLCDs with TFT active matrix, there are essentially no cross-talk pulses on nonaddressed pixels. The charge-controlled drive, in principle, also allows for generation of grayscale – the fractions of UP and DOWN domains in each pixel can be controlled with the amount of charge delivered to the pixel electrodes (Dijon 1990). Several attempts have been made to use voltage-controlled microdomains for generation of grayscale also in passive matrix using the QBS geometry (Lagerwall 1999). One of the many complications is that the previously written gray state has to be erased by a “blanking pulse” in order to ensure the same level of transmission for a given pulse amplitude. Above, we have considered only the ferroelectric torque P × E, which dominates at low applied voltages. But in materials exhibiting low polarization P and negative dielectric anisotropy and/or large positive dielectric biaxiality, the dielectric torque becomes important, and we get a minimum in the critical voltage-pulse time area (Orihara et al. 1986, cf. Fig. 4). At very high pulse amplitudes, the dielectric torque, proportional to E², dominates over the ferroelectric torque, proportional to E, and prevents the pixel from switching. This is utilized in the so-called t-Vmin addressing scheme (Surguy et al. 1991).
Furthermore, in the case of high dielectric biaxiality, the high-frequency signals on the pixels from cross-talk pulses stabilize large memory angles in the t-Vmin addressing mode (AC stabilization) (Jones et al. 1991).

The Canon FLCD

Figure 5 shows a photograph of the world’s first commercial high-resolution LCD from 1995 by Canon. It is a 15″ passive matrix FLCD with 1,280 × 1,024 pixels and uses the C1 structure with a surface pretilt of 18° and a cell gap of 1.1 μm (Lagerwall 1999; Hanyu et al. 1993). Each pixel, divided into four subpixels,



Fig. 4 Threshold characteristics for (a) conventional and (b) t-Vmin mode FLC material (Courtesy of T. Matuszczyk)

Fig. 5 Canon commercial FLCD from 1995

white, red, green, and blue, could display 16 colors. In order to further increase the symmetric bistability, Canon used “cross-rubbing,” i.e., the rubbing directions on the two glass plates were not collinear. In this



way, the unwanted effects of polar anchoring could essentially be neutralized and twisted states avoided (Mizutani et al. 1998). In 1997, Canon demonstrated a 1,024 × 768 digital full-color passive matrix FLCD prototype with 18 subpixels in each pixel, six for each color. This panel had a cell gap of 2 μm and was, in fact, the first example of “non-chevron” FLCDs using so-called non-layer-shrinkage materials (Lagerwall 1999).

t-Vmin Prototypes

In 1998, Sharp showed a 17″ FLCD full-color video screen developed in collaboration between Sharp (Japan), Sharp (Oxford), and DERA in Malvern, UK (Lagerwall 1999; Itoh et al. 1998). It used the C2 structure and the DERA t-Vmin mode. It had 720 × 916 pixels and scanned the two halves of the screen simultaneously, giving a line address time of 12 μs, i.e., 240 Hz, where eight subframes were used to write each picture. In 2000, a similar full-color 15″ prototype of a 60-Hz, 540 × 920 pixel FLCD with a contrast ratio of 150:1 was presented (Takatoh et al. 2005; Koden 2000). It used a cell gap of 1.4 μm, had an operating temperature range of 0–60 °C, and produced 256 gray levels (>16,000,000 colors).

Idemitsu’s Polymer FLCDs

Idemitsu developed a 0.5-mm-thin flexible FLC display manufactured in a roll-to-roll process (Hachiya et al. 1993). The alignment of the FLC layer was achieved through mechanical shear caused by the bending of the film over one of the rollers. As a polymer FLC, it exhibits slower pixel response than the monomer FLCs, about 1 ms at room temperature, but allowed for passive matrix displays with a frame rate of 2 Hz. It was the first nonstatic polymer liquid crystal display ever presented. Remarkably, this FLC polymer film can also be used for the realization of 3D TV. It then acts as a large-area shutter in front of a corresponding TV provided with a linear polarizer. By switching the optic axis of the FLC polymer film 45° back and forth at 120 Hz, two stereoscopic pictures with orthogonal linear polarization can be alternately displayed at 60 Hz. A person equipped with normal (passive) polaroid glasses for left and right eyes will then experience a 3D image (Okoshi et al. 1998).

Flexible SSFLCDs

Bistable flexible SSFLCDs using monomeric SmC materials between plastic substrates for ultralow power consumption displays have been developed, for example, by the University of Stuttgart (Lueder et al. 1999; Brill et al. 2002) and by Citizen (Iio and Kondoh 2007).

FLC-On-Silicon Microdisplays

So-called ferroelectric liquid-crystal-on-silicon (FLCOS) microdisplays (Clark et al. 2000a, Fig. 6) are very small (0

Fig. 8 The electroclinic effect in SmA*. (a) At E = 0 the optic axis is along z. (b) For E ≠ 0 the optic axis tilts out by the angle φ from the layer normal in a plane perpendicular to E. (c) Bookshelf electroclinic device


Fig. 9 Partial unwinding of a short-pitch helical structure gives a change in shape and orientation of the optical indicatrix (bottom)

In 1995–1996, A. Fukuda and coworkers reported a chiral smectic material exhibiting an analog, thresholdless switching behavior, with a characteristic “V-shaped” transmission-voltage behavior in the AFLCD geometry (Fukuda 1995; Inui et al. 1996). This mode later turned out not to be antiferroelectric but a special case of a surface-stabilized FLC (conventional SmC* phase) working in a dielectric mode (Rudquist et al. 1999). In the zero-field state – at “the tip of the V” – the director is at the bottom (or top) of the smectic cone with P parallel to the glass plates. The projection of n (and the optical indicatrix) onto the cell plane is parallel to one of the polarizers and gives the dark state. The analog switching is the homogeneous (at least in the volume) field-induced rotation of the director under the torque P × E, where


all states are accessible on the smectic cone. The dark state can be stabilized in “twisted FLC cells” with a high value of P (≈100 nC/cm²), in which the polarization charge self-interaction makes P, and hence the director field, homogeneous in the volume of the cell, with the twist confined to thin surface regions. The polar anchoring gives the necessary restoring torque. The dark state can also be stabilized in the case of thick dielectric surface layers. The restoring torque is then due to the building up of electrostatic energy in the surface layers as soon as P has a component perpendicular to the surfaces, and the applied field reorients P until it is compensated by the polarization charges built up at the surfaces (Clark et al. 2000b). The V-shaped switching mode was demonstrated by Toshiba in 1998 with a full-color 15″ TFT active-matrix prototype (Okimura et al. 1998; Takatoh et al. 2000) with an outstanding performance for that time. Although the charge needed to switch the high P typically cannot be delivered in one line addressing time, it had video performance and an unsurpassed viewing angle. However, the image quality suffered from rapid aging. As it turns out, the director configuration in the field-free dark state is not stable. The structure can, for instance, break up into domains with opposite directions of P (Hammarquist et al. 2008).

Half-V-Mode

The so-called half-V-mode FLC (Fig. 10) is based on a monostable surface-stabilized SmC structure (Bradshaw and Raynes 1990) in a material with a direct N-SmC transition. An SmC monodomain can be achieved in such materials by the use of applied electric fields during cooling from the N phase (Patel and Goodby 1986). In the analog half-V-mode, n is along the rubbing direction at zero field (strong anchoring), parallel to one of the crossed polarizers. An applied electric field will switch the director towards the opposite side of the cone, but as the surfaces do not switch, the molecules return to the initial state when the field is turned off. The field-induced rotation is continuous and depends on the magnitude of E. Hence, a continuous grayscale is obtained (Nonaka et al. 1999; Asao et al. 1999). In order to assure DC-balanced addressing, the same pulses, but with opposite polarity, are applied between frames. This frame is black, as there is essentially no switching of the molecules for the opposite polarity. It also means that the display is black 50 % of the time. But the insertion of the black frames between image frames significantly reduces motion blur, and the half-V-mode is therefore very attractive for high-speed TFT displays.


Fig. 10 Principle and transmission-voltage characteristics for the half-V-mode


Fig. 11 Photograph of a polymer-stabilized V-mode FLC display prototype by a collaboration of S. Kobayashi (TUS-Y) with DIC, Dainippon Inc., Japan. The display has 800 × 600 pixels and uses field sequential color, with a frame rate of 60 Hz (180-Hz RGB) (Courtesy of DIC Corp.)


Fig. 12 Schematics of AFLCD switching. (a) The monostable dark AF state and the two symmetric field-induced F states. (b) The double hysteresis transmission-voltage characteristics. Written gray levels are stable at the holding voltage Vh. (c) Simplified driving waveforms. In passive matrix displays, the actual voltage fluctuates around Vh because of cross-talk

PS-VFLC Mode

Polymer-stabilized V- and half-V-mode displays have been developed (Kobayashi et al. 2004). A joint research group of Tokyo University of Science, Yamaguchi (TUS-Y), and DIC Corp. has demonstrated a field-sequential full-color 4″ 800 × 600 polymer-stabilized V-mode FLC display, which uses materials with lower polarization than the V-shaped switching FLC mode and therefore gives excellent performance with ordinary TFT-matrix drive (Fujisawa et al. 2007, 2008; see Fig. 11).

Antiferroelectric Liquid Crystal Display Mode

AFLCD Principle

AFLCs (SmCa* phase) (Fukuda et al. 1994; Chandani et al. 1989) can provide almost as fast electrooptic switching as FLCs but, in principle, also allow for grayscale and easy DC-compensated addressing. However, the alignment issues of AFLCs are even more severe than in the FLC case, and AFLCDs have not yet been commercialized despite the development of large (17″) full-color video screen prototypes in the 1990s by Denso. As the SmC, the SmCa* has a superposed helix and is uniaxial, but in the surface-stabilized state, it is biaxial with the optical indicatrix axis along z. AFLCDs are, just as the FLCDs, used in the bookshelf geometry, but with the crossed polarizers oriented along and perpendicular to the smectic layer normal z. This gives a zero-field dark state. A sufficiently strong field will switch the material from


the dark (AF) state to one of the bright ferroelectric (F) states, with the optic axis tilted θ from the polarizer, depending on the sign of the field (Fig. 12a). The two F states are (ideally) symmetric around z, and thus ±E gives the same transmission. The transmission-voltage characteristic of AFLCDs is a double hysteresis loop (Fig. 12b). The switching occurs in domains, and the up-slopes of the transmission-voltage curve are produced by an increasing fraction of bright F domains at the expense of AF domains. By tuning the voltage and/or length of the data pulse, one can control this fraction and thereby produce essentially continuous gray levels. After the switching pulse, the voltage is reduced to a holding voltage Vh inside the hysteresis loop, for which the AF and F states are both stable (Fig. 12b, c). As a result, no further switching from AF to F or any significant relaxation from F to AF occurs during the frame time. The written gray level is thus stable at Vh. DC-compensated drive can be accomplished by simply dividing each frame into two subframes with opposite sign of the applied electric pulses. In real AFLCDs, the actual pulse train applied to each pixel is more complicated than what is indicated in Fig. 12c. Note that the AFLCD mode is not tri-stable – it is monostable but with three well-defined electrooptic states. The F states are stable only under application of the holding voltage. The major obstacle for commercialization of AFLCDs is the insufficient contrast obtained so far as a result of light leakage in the dark state. First, the lack of a nematic phase in materials exhibiting the SmCa* phase generally results in smectic layer misalignment and a static light leakage. Furthermore, the driving fields are generally strong enough to straighten up vertical chevron structures to QBS, which gives horizontal chevrons and a striped texture. Second, the transmission-voltage curve is not perfectly flat below the threshold for switching. This “pretransitional effect,” the field-induced distortion of the anticlinic state below the AF-F transition, gives dynamic light leakage in passive matrix displays, as the transmission in dark pixels is not zero at Vh. The dynamic part of the light leakage can be ruled out in active-matrix drive, but the relatively high P value of AFLCs is a challenge. One way to avoid the static (and minimize the dynamic) light leakage is to use so-called orthoconic AFLCs (see below).
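The domain-fraction picture of AFLC grayscale can be captured in a toy model: the F-domain fraction of a pixel grows only while the applied voltage exceeds the AF-F threshold, relaxes only below the F-AF threshold, and is frozen in between, so that any written gray level is retained at the holding voltage Vh. A minimal sketch; all thresholds and rates are invented illustrative numbers, not measured device data:

V_UP, V_DOWN, V_HOLD = 20.0, 5.0, 10.0   # volts (invented for illustration)

def step_fraction(f, volts, dt_over_tau):
    # Advance the bright F-domain fraction f by one time step.
    if abs(volts) > V_UP:        # switching pulse: F domains grow
        f += (1.0 - f) * dt_over_tau
    elif abs(volts) < V_DOWN:    # below the lower threshold: relax to AF
        f -= f * dt_over_tau
    # V_DOWN <= |volts| <= V_UP: inside the loop, f is frozen (gray memory)
    return min(max(f, 0.0), 1.0)

f = 0.0
for _ in range(3):               # a short select pulse writes a mid-gray
    f = step_fraction(f, 25.0, 0.2)
for _ in range(50):              # holding voltage: the gray level is retained
    f = step_fraction(f, V_HOLD, 0.2)
print('held gray level ~ %.2f' % f)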

Orthoconic Antiferroelectric Liquid Crystal (AFLC) Display Mode

It was mentioned above that AFLCs, in contrast to FLCs, have not yet been developed into commercial devices. This situation could change, however, with the new so-called orthoconic materials (OAFLCs). In the case of θ = 45°, the surface-stabilized AFLC state becomes (negative) uniaxial with the optic axis perpendicular to the glass plates, giving a perfect black state at zero field, irrespective of smectic layer misalignment (D’havé et al. 2000; Rudquist 2013). In such “orthoconic AFLCs,” the smectic cone angle is 90° and the optic axis is orthogonal to the tilt plane. The surface-stabilized orthoconic state thus avoids the static light leakage and, in fact, also minimizes the dynamic light leakage in passive addressing (Rudquist et al. 2002) (there is of course no dynamic light leakage in active matrix). Hence, orthoconic AFLCs provide a solution to the dark state problem in AFLCDs without having to solve the alignment problem. The cell surfaces are by symmetry always more or less polar and more or less incompatible with the antipolar ground state of AFLCs. Each surface tends to promote one of the polar field-induced F states. In cells thin enough to accomplish surface stabilization of today’s short-pitch OAFLC mixtures, the bright F states can become metastable, making them prevail for long times after the electric field is switched off. Fast back-relaxation to the dark state then requires specially designed waveforms or polymer stabilization (Rudquist et al. 2006). But the metastability of the F states could open up for truly tri-stable AFLC devices, where both the black state and the two symmetric bright states are stable. This feature could be very attractive for low power consumption displays showing static pictures. The obvious advantage of this tri-stable mode compared to static bistable SSFLCDs would be that the problem of image sticking could easily be avoided in the tri-stable orthoconic AFLCD case by just reversing the polarity of the pixels at a very low frequency. Such a display would only consume power during image update and/or polarity


reversal. Finally, by combining surface-stabilized OAFLCs with circular polarizers, the bright state also becomes independent of the quality of alignment (Rudquist 2013).

Conclusions

The peak of R&D in FLCD and AFLCD modes was in the 1990s, before the maturation of TFT-matrix technology. But today, the increasing need for speed and resolution in visual displays has again made FLCs, and perhaps AFLCs, attractive not only for high-performance microdisplays but also for large active-matrix direct-view displays. Moreover, there is an increasing need for perfectly bistable, ultralow power displays, where SSFLCs have already proven their high potential. Further development of non-layer-shrinkage (chevron-free) materials and polymer-stabilized technologies could open up for large-volume applications of smectics in several types of display modes.

Further Reading

Andersson G, Dahl I, Keller P, Kuchynski W, Lagerwall ST, Skarp K, Stebler B (1987) Submicrosecond electro-optic switching in the liquid-crystal smectic A phase: the soft-mode ferroelectric effect. Appl Phys Lett 51:640–642, US Patent 4,838,663 (1989)
Asao Y, Togano T, Terada M, Moriyama T, Nakamura S, Iba J (1999) Novel ferroelectric liquid crystal mode for active matrix liquid crystal display using cholesteric-chiral smectic C phase transition material. Jpn J Appl Phys 38:5977–5983
Beresnev LA, Blinov LM, Osipov MA, Pikin SA (1988) Ferroelectric liquid crystals. Mol Cryst Liq Cryst 158a:1 (Special topics XXIX). Gordon & Breach
Bradshaw MJ, Raynes EP (1990) Smectic liquid crystal devices. US Patent 4,719,969
Bradshaw MJ, Brimmel V, Raynes EP (1987) A novel alignment technique for ferroelectric smectics. Liq Cryst 2:107–110
Brill J, Lueder E, Randler M, Voegele S, Frey V (2002) A flexible ferroelectric liquid crystal display with improved stability for smart card applications. J Soc Inf Disp 10:189–194
Buckley E (2008) Holographic laser projection technology. Inf Disp 24:22
Chandani ADL, Gorecka E, Ouchi Y, Takezoe H (1989) Antiferroelectric chiral smectic phases responsible for the tristable switching in MHPOBC. Jpn J Appl Phys 28:L1265–L1268
Clark NA, Lagerwall ST (1980) Submicrosecond bistable electro-optic switching in liquid crystals. Appl Phys Lett 36:899–901, US Patent 4,367,924 (1983)
Clark NA, Lagerwall ST (1991) Introduction to ferroelectric liquid crystals, chapter I, pp 1–97; Applications of ferroelectric liquid crystals, chapter 6, pp 409–465. In: Goodby JW, Blinc R, Clark NA, Lagerwall ST, Osipov M, Pikin SA, Sakurai T, Yoshino K, Zeks B (eds) Ferroelectric liquid crystals – principles, properties, and applications. Gordon & Breach, New York
Clark NA, Crandall C, Handschy MA, Meadows MR, Malzbender RM, Park C, Xue JZ (2000a) Ferroelectric microdisplays. Ferroelectrics 246:1003–1016
Clark NA, Coleman DA, Maclennan JE (2000b) Electrostatics and the electro-optic behaviour of chiral smectics C: ‘block’ polarization screening of applied voltage and ‘V-shaped’ switching. Liq Cryst 20:985
Coates D (1990) Smectic A LCDs. In: Bahadur B (ed) Liquid crystals, applications and uses, vol 1. World Scientific, Singapore



Coates D, Crossland WA, Morrisy JH, Needham B (1978) Electrically induced scattering textures in smectic A phases and their electrical reversal. J Phys D Appl Phys 11:2025–2035
Crossland WA, Canter S (1985) An electrically addressed smectic storage device. In: SID international symposium digest, Orlando, p 124
Crossland W, Wilkinson TD (1998) Nondisplay applications of liquid crystals. In: Demus D, Goodby J, Gray GW, Spiess HW, Vill V (eds) Handbook of liquid crystals, vol 1. Wiley-VCH, Weinheim, pp 763–822
D’havé K, Rudquist P, Lagerwall ST, Pauwels H, Drzewinski W, Dabrowski R (2000) Solution to the dark state problem in AFLCDs. Appl Phys Lett 76:3528–3530, US Patent 6,919,950 (2005)
Dijon J (1990) Ferroelectric LCDs. In: Bahadur B (ed) Liquid crystal applications and uses. World Scientific, Singapore, Chapter 13
Fujisawa T, Hayashi M, Hasebe H, Takeuchi K, Takatsu H, Kobayashi S (2007) Novel PSV-FLCDs with temperature shift free operating voltages exhibiting high response speed, high optical throughput, and high contrast ratio for field sequential full color LCDs. SID Symposium Digest of Technical Papers 38:633–636
Fujisawa T, Nishiyama I, Hatsusaka K, Takeuchi K, Takatsu H, Kobayashi S (2008) Field sequential full color LCDs using polymer-stabilized V-shaped ferroelectric liquid crystals. Ferroelectrics 364(1):78–85
Fukuda A (1995) Pretransitional effect in AF-F switching: to suppress it or to enhance it, that is my question about AFLCDs. In: Proceedings of 15th IDRC, Asia display 1995, Hamamatsu, Japan, p 61
Fukuda A, Takanishi Y, Isozaki T, Ishikawa K, Takezoe H (1994) Antiferroelectric chiral smectic liquid crystals. J Mater Chem 4:997–1016
Fünfschilling J, Schadt M (1998) New ferroelectric displays and operation modes. Ferroelectrics 213:195–298
Garoff S, Meyer RB (1977) Electroclinic effect at the A-C phase change in a chiral smectic liquid crystal. Phys Rev Lett 38:848–851
Hachiya S, Tomoike K, Yuasa K, Togawa S, Sekiya T, Takahashi K, Kawasaki K (1993) Ferroelectric liquid-crystalline polymers and their application to display devices. J Soc Inf Disp 1:295–298
Hammarquist A, D’havé K, Matuszczyk M, Clark NA, Maclennan JE, Rudquist P (2008) Stabilization of V-shaped switching ferroelectric liquid crystal. Structure stabilized by dielectric surface layers. Phys Rev E 77:031707
Handschy MA, Spanner BF (2008) The future of pico projectors. Inf Disp 24:16
Handschy MA, McNeil JR, Weissman PE (2006) Ultrabright head mounted displays using LED-illuminated LCOS. Proc SPIE 6224:62240S–62241S
Hanyu Y, Nakamura K, Hotta Y, Yoshihara S, Kanbe J (1993) Molecular alignment of a very-large-size FLCD. SID Digest 24(2):364–367
Hareng M, Le Berre S (1975) Formation of synthetic images on a laser-beam-addressed smectic liquid-crystal display. Electron Lett 11:73
Hareng M, Le Berre S (1978) Liquid crystal flat display. In: Proceedings of IEDM, Washington, pp 258–260
Hartmann WJAM (1991) Ferroelectric liquid crystal displays for television application. Ferroelectrics 122:1–4
Hartmann WJAM, Luyckx-Smolders AMM (1990) The bistability of the surface-stabilized ferroelectric liquid-crystal effect in electrically reoriented chevron structures. J Appl Phys 67:1253–1261
Iio K, Kondoh S (2007) A memorable and flexible dot matrix display using ferroelectric liquid crystals. Ferroelectrics 365:148–157



Inui S, Iimura N, Suzuki T, Iwane H, Miyachi K, Takanishi Y, Fukuda A (1996) Thresholdless antiferroelectricity in liquid crystals and its application to displays. J Mater Chem 6:671
Itoh N, Akiyama H, Kawabata Y, Koden M, Miyoshi S, Nuamo T, Shigeta M, Sugino M, Bradshaw MJ, Brown CW, Graham A, Haslam SD, Hughes JR, Jones JC, McDonnell DG, Slaney AJ, Bonnett P, Gass PA, Raynes EP, Ulrich D (1998) 17″ video-rate full-color FLCD. Proc IDW 98:205
Jones JC et al (1991) The importance of dielectric biaxiality for ferroelectric liquid crystal devices. Ferroelectrics 121:91–102
Kahn FJ (1973) IR-laser-addressed thermo-optic smectic liquid-crystal storage displays. Appl Phys Lett 22:111–113
Kanbe J, Inoue H, Mitzutome A, Hanyuu Y, Katagiri K, Yoshihara S (1991) High resolution, large area FLC display with high graphic performance. Ferroelectrics 114:3–26
Kobayashi S, Jun X, Furuta H, Murakumi Y, Kawamoto S, Oh-kouchi M, Hasebe H, Takatsu H (2004) Fabrication and electrooptic characteristics of polymer-stabilized V-mode ferroelectric liquid crystal display and intrinsic H-V-mode ferroelectric liquid crystal displays: their application to field sequential full-color active matrix liquid crystal displays. Opt Eng 43:290–298, and references therein
Koden M (2000) Passive-matrix FLCDs with high contrast and video-rate full-color pictures. Ferroelectrics 246:993–1002
Lagerwall ST (1999) Ferroelectric and antiferroelectric liquid crystals. Wiley-VCH, Weinheim
Lagerwall ST (2001) Ferroelectric and antiferroelectric liquid crystals. In: Encyclopedia of materials: science and technology. Elsevier, Oxford, UK, pp 3044–3063
Lagerwall JPF, Giesselmann F (2006) Current topics in smectic liquid crystal research. Chemphyschem 7:21–45
Lagerwall S, Matuszczyk M, Rhode P, Ödman L (1998) The electroclinic effect. In: Elston S, Sambles R (eds) The optics of thermotropic liquid crystals. Taylor & Francis, London, pp 155–194
Lueder E, Buerkle R, Muecke M, Klette R, Bunz R, Kallfass T (1999) Flexible and bistable FLC and cholesteric displays on plastic substrates for mobile applications and smart cards. J Soc Inf Disp 7(1):29–35
Matuszczyk T, Maltese P (1995) Addressing modes of ferroelectric liquid crystal displays. Proc SPIE 2372:296–309
Meyer RB, Liebert L, Strzelecki L, Keller P (1975) Ferroelectric liquid crystals. J de Physique Lett 36:L69–L71
Mizutani H, Tsuboyama A, Hanyu Y, Okada S, Terada M, Katagiri K (1998) Digital full color ferroelectric liquid crystal display. Ferroelectrics 213:179–186
Musevic I, Blinc R, Zeks B (2000) The physics of ferroelectric and antiferroelectric liquid crystals. World Scientific Publishing Company, Singapore
Nonaka T, Li J, Ogawa A, Hornung B, Schmidt W, Wingen R, Dübal H-R (1999) Material characteristics of an active matrix LCD based upon chiral smectics. Liq Cryst 26:1599–1602
Okimura H, Akiyama M, Takato K, Uematsu Y (1998) SID Digest 29:1171; cf. also discussions in Takatoh et al. (2005)
Okoshi K, Yasa K, Moritwaki F, Kofuji T (1998) FLC polymer and plastic substrates for use in a large-area optical shutter for 3-D TV. SID 98 Digest 29:1135–1138
Orihara H, Nakamura K, Ishibashi Y, Yamada Y, Yamamoto N, Yamawaki M (1986) Anomalous switching behavior of a ferroelectric liquid crystal with negative dielectric anisotropy. Jpn J Appl Phys 25:839–840
Ostrovskij BI, Rabinovich AZ, Chigrinov VG (1980) Behavior of ferroelectric smectic liquid crystals in electric field.
In: Bata L (ed) Advances in liquid crystal research and applications. Pergamon Press, Oxford, UK, p 469


Patel JS (1992) Ferroelectric liquid crystal modulator using twisted smectic structure. Appl Phys Lett 60:280–282
Patel J, Goodby JW (1986) Alignment of liquid crystals which exhibit cholesteric to smectic C phase transitions. J Appl Phys 59:2355
Rieker TP, Clark NA, Smith GS, Parmar DS, Sirota EB, Safinya CR (1987) ‘Chevron’ local layer structure in surface-stabilized ferroelectric smectic-C cells. Phys Rev Lett 59:2658–2661
Rudquist P, Lagerwall JPF, Buivydas M, Gouda F, Lagerwall ST, Clark NA, Maclennan JE, Shao R, Coleman DA, Bardon S, Bellini T, Link DR, Natale G, Glaser MA, Walba DM, Wand MD, Chen X-H (1999) The case of thresholdless antiferroelectricity: polarization-stabilized twisted smectic C liquid crystals give analog response. J Mater Chem 9:1257–1261
Rudquist P, Meier JG, Lagerwall JPF, D’havé K, Lagerwall ST (2002) The tilt plane orientation in AFLCDs and the origin of the pretransitional effect. Phys Rev E 66:061708
Rudquist P, Elfström D, Lagerwall ST, Dabrowski R (2006) Polymer-stabilized orthoconic antiferroelectric liquid crystals. Ferroelectrics 344:177–188
Rudquist P (2013) Orthoconic antiferroelectric liquid crystals. Liq Cryst 40(12):1678–1697
Sato Y, Tanaka T, Kobayashi H, Aoki K, Watanabe H, Takeshita T, Ouchi Y, Takezoe H, Fukuda A (1989) High quality ferroelectric liquid crystal display with quasi-bookshelf layer structure. Jpn J Appl Phys 28:L483–L486
Surguy PWH, Ayliffe PJ, Birch MJ, Bone MF, Coulson I, Crossland WA, Hughes JR, Ross PW, Saunders FC, Towler MJ (1991) The ‘JOERS/Alvey’ ferroelectric multiplexing scheme. Ferroelectrics 122:63–79
Takatoh K, Yamaguchi H, Hasegawa R, Saishu T, Fukushima R (2000) Application of FLC/AFLC materials to active-matrix devices. Polym Adv Technol 11:413–426
Takatoh K, Hasegawa M, Koden M, Itoh N, Hasegawa R, Sakamoto M (2005) Alignment technologies and applications of liquid crystal devices. Taylor & Francis, Oxon
Takezoe H (2001) Ferroelectric, antiferroelectric, and ferrielectric liquid crystals: applications. In: Encyclopedia of materials: science and technology. Elsevier, Oxford, UK, pp 3064–3074
Verhulst AGH, Cnossen G (1996) Active-matrix deformed-helix ferroelectric liquid crystal displays. Ferroelectrics 179:141–152



In-Plane Switching (IPS) Technology

Hyungki Hong*

Department of Visual Optics, Seoul National University of Science and Technology, Nowon-gu, Seoul, Republic of Korea

Abstract

In-plane switching is a liquid crystal (LC) mode initially developed for large-area display applications such as televisions and monitors. Nowadays, this mode has also become very popular for mobile displays, as its domains are stable under physical stress such as touch. Its working principles and unique characteristics are explained in comparison with those of other LC modes.

List of Abbreviations

G2G  Gray to gray
IPS  In-plane switching
ITO  Indium tin oxide
LC   Liquid crystal
NB   Normally black
NW   Normally white
TFT  Thin-film transistor
TN   Twisted nematic
VA   Vertical alignment

Introduction

In designing the liquid crystal (LC) cell structure, various combinations are possible. The initial LC alignment directions can be controlled to be either homogeneous or homeotropic by the selection of the alignment material. The electric field direction can be made either vertical or in-plane by making planar electrodes on each side of the two substrates or by making line-shaped electrodes on one of the substrates. The LC molecules align parallel to the electric field for LCs of positive dielectric anisotropy and perpendicular to the electric field for LCs of negative dielectric anisotropy. When these combinations are considered, a few different structures can be used for display applications, and each has its own distinct characteristics. The effective birefringence of the LC is dependent on the angle with respect to the optical axes. As such, the limited viewing angle offered by LCDs has been a serious issue as LCDs of larger size have been developed. Hence, various methods have been devised to obtain angularly uniform electro-optical characteristics. An LC cell using homogeneous alignment, an in-plane electric field, and a positive LC was first reported in a discussion of the in-plane rotation of LC molecules and its electro-optical effect (Soref 1974). As the electro-optical effect by in-plane rotation of the LC molecules shows the least angular dependence, this structure has been characterized and successfully commercialized under the name

*Email: [email protected]


in-plane switching (IPS) mode (Kiefer et al. 1992; Oh-e and Kondo 1995; Yeo et al. 2005; Hong et al. 2007).

Principle of the IPS Mode

Voltage-Transmittance Relation

The basic electro-optical characteristics of IPS can be explained by a simplified one-dimensional model of the LC cell (Kiefer et al. 1992; Oh-e and Kondo 1995). When the LC molecules are placed between crossed polarizers, without any twist, the LC layer can be assumed to be a uniform uniaxial medium such that the optical axes of all of the LC molecules are defined by the polar and azimuth angles (θ, φ). Under this assumption, the voltage-transmittance relation can be written as follows:

T(V, λ) = sin²[2φ(V)] sin²[πdΔneff(V, λ, θ)/λ]   (1)
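To see how the two factors of Eq. 1 interact, the following minimal sketch evaluates T for a given switched angle φ and retardation dΔneff. The numbers are illustrative; the actual mapping φ(V) must come from solving the director equation discussed below:

import math

def transmittance(phi_deg, retardation_nm, wavelength_nm):
    # First factor: in-plane angle between the director and the polarizer (Eq. 1).
    angle_term = math.sin(math.radians(2.0 * phi_deg)) ** 2
    # Second factor: phase retardation of the LC layer.
    phase_term = math.sin(math.pi * retardation_nm / wavelength_nm) ** 2
    return angle_term * phase_term

# Fully switched state (phi = 45 deg) with ~half-wave retardation at 550 nm:
print(transmittance(45.0, 320.0, 550.0))   # close to the maximum
print(transmittance(0.0, 320.0, 550.0))    # dark state: exactly 0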

In the IPS mode, the initial rubbing direction of the homogeneously aligned LC molecules is parallel with the axis of the polarizer or analyzer, and the transmittance is zero when the external voltage is zero. The LC mode that gives zero transmittance at zero external voltage is typically called the normally black (NB) mode. As the voltage difference between the two electrodes on the lower substrate increases, the LC of positive Δε rotates to align parallel to the in-plane electric field, as shown in Fig. 1. As the angle φ from the rubbing direction increases, the transmittance increases. Compared with other LC modes that control the


Fig. 1 (a) Liquid crystal (LC) motion under in-plane electric field and (b) propagation of polarized light in in-plane switching (IPS) mode


effective refractive index Δn, the transmittance of the IPS mode is determined by the change of the angle φ. Therefore, the transmittance ratio between the different wavelengths remains the same irrespective of the gray levels in Eq. 1. The equation of rotation of the LC molecules can be derived from the Euler-Lagrange equation as follows (Oh-e and Kondo 1995):

∂f/∂φ − (d/dz)(∂f/∂φ′) = 0 = ε0ΔεE² sin φ cos φ − K22 ∂²φ/∂z²   (2)

where f is the elastic free energy density and φ′ = ∂φ/∂z. This equation is derived from the standard continuum elastic energy expression used for nematic LCs, assuming that the x-y plane is parallel with the substrate and the rubbing direction is parallel with the x-axis. In Eq. 2, K22 is one of the elastic constants of the LC material and E is the electric field. The elastic constants K11, K22, and K33 of the LC control the splay, twist, and bend motion of the LC molecules, respectively. As the twist of the LC molecules is induced by the in-plane electric field, K22 is the most important of the elastic constants in the IPS mode. In the real situation of display applications, the anchoring energy of the alignment layer is so strong that LC molecules near the alignment layer do not move. So the boundary condition can be given as φ(0) = φ(d) = 0, where d is the cell gap. In the case of this strong anchoring, the solution of Eq. 2 shows that the array of LC molecules twists approximately in the shape of a sine function, as shown in Fig. 2a and Eq. 3:

φ(z) = φm sin(πz/d)   (3)
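The sine-like profile of Eq. 3 can be checked numerically by relaxing the torque balance of Eq. 2 on a one-dimensional grid across the cell gap with fixed boundaries – a drastically simplified, one-dimensional cousin of the simulation programs mentioned below. All parameter values are illustrative orders of magnitude, not data for a specific material:

import math

EPS0 = 8.854e-12       # vacuum permittivity [F/m]
K22  = 7.0e-12         # twist elastic constant [N] (illustrative)
DEPS = 7.0             # dielectric anisotropy (positive LC, illustrative)
d, E = 3.5e-6, 4.0e5   # cell gap [m] and in-plane field [V/m]

N  = 101
dz = d / (N - 1)
dt = 5e-5              # pseudo-time step, below the stability limit dz^2/(2*K22)

# Initial guess; strong anchoring keeps phi = 0 at both surfaces.
phi = [0.5 * math.sin(math.pi * i / (N - 1)) for i in range(N)]

for _ in range(20000):                     # relax toward equilibrium
    new = phi[:]
    for i in range(1, N - 1):
        torque = (EPS0 * DEPS * E**2 * math.sin(phi[i]) * math.cos(phi[i])
                  + K22 * (phi[i - 1] - 2.0 * phi[i] + phi[i + 1]) / dz**2)
        new[i] = phi[i] + dt * torque
    phi = new

phi_m = max(phi)
print('mid-plane twist ~ %.0f deg' % math.degrees(phi_m))
# The relaxed profile stays close to the sine shape of Eq. 3:
print(round(phi[N // 4], 3), round(phi_m * math.sin(math.pi / 4.0), 3))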


As the LC molecules near the alignment layer barely move under the strong anchoring condition, the maximum transmittance occurs at a cell gap condition where the LC retardation is larger than half the wavelength of visible light (Satake et al. 2001; Bhowmik et al. 2008). An example is shown in Fig. 2b, where the maximum transmittance for 550-nm light occurs at LC retardation values beyond 275 nm. Therefore, the retardation of the LC cell is generally selected to be 10–30 % larger than half the wavelength for display applications.


Fig. 2 (a) Optical axis distribution of LC molecules along the direction of the cell gap under an external voltage above the threshold voltage and (b) dependency of the voltage-transmittance curve on retardation for incident light of 550 nm (Reprinted with permission from Bhowmik et al. (2008))


Thin-Film-Transistor Pixel Structures and Production Steps

To make an in-plane electric field inside the LC cell, electrodes of different voltage levels should be placed on the same glass substrate. For this purpose, a common electrode and pixel electrodes of line shape are placed side by side inside the active area of each pixel, as shown in Fig. 3a. The common electrode is connected to the constant voltage source through the pad contact area located at the edge of the display. Pixel electrodes are connected to the thin-film transistor (TFT) of each pixel, and a different data voltage is supplied to each pixel electrode through the on-state TFT. To reduce coupling between bus lines and pixel electrodes, common electrodes are generally placed near the bus lines. The production process on the lower substrate side of IPS is almost the same as that of other LC modes such as the twisted nematic (TN) mode. As an example, a cross section of an LC cell is shown in Fig. 3b, where a five-mask process is used to make the electrode structure on the lower substrate. In this example of a five-mask process, a metal layer is deposited on the glass substrate, and metal patterns for the gate bus line and common electrode are made through photolithography using the first mask. The gate insulator layer is then deposited, and the amorphous-silicon layer for the TFT is deposited and patterned using the second mask. Another metal layer is then deposited and patterned to make the source and drain of the TFT, the data bus line, and the pixel electrodes using the third mask. The passivation layer of SiNx is deposited. With the use of the fourth mask, contact holes are patterned through the gate insulator and the passivation layer, which is not shown in Fig. 3. Finally, the indium tin oxide (ITO) layer is deposited and patterned to make the pad contact using the fifth mask. In the case of Fig. 3, pixel electrodes of metal are made on the same layer as the data electrodes. To improve light efficiency, the pixel or common electrodes can alternatively be made from ITO by connecting the ITO layer to the metal patterns of the other layer. The production process on the upper substrate side of IPS is simpler than that of other LC modes using a vertical electric field, as the ITO layer need not be formed on the side of the upper substrate (Bhowmik et al. 2008).


Fig. 3 (a) Top view of electrode configuration inside one pixel on the side of the lower substrate and (b) cross section of IPS pixel structure


LC Rotation Under an Electric Field

A pixel of the IPS mode has a two-dimensional electrode structure, as shown in Fig. 3; therefore, the three-dimensional distribution of the LC molecules should be obtained to fully characterize the electro-optical performance of the LC cell. For this purpose, both commercial and in-house simulation programs of the three-dimensional LC motion have been widely used (LCD Master). An example of a simulated LC configuration for the horizontal cross section shown in Fig. 3a is shown in Fig. 4a. Curved lines represent the equipotentials caused by the common and pixel electrodes. These are not uniform throughout the area of the cross section and have vertical components as well as horizontal components; hence, the LC rotation induced by the electric field is more complicated than the simplified assumption of uniform twist rotation of the LC molecules. In Fig. 4a, short lines represent LC molecules. Whereas LC molecules near the center of each aperture rotate almost in-plane, LC molecules on top of the electrodes rarely depart from the initial homogeneous direction. Rotation of LC molecules out of plane exists as well. The simulated transmittance from the LC configuration shown in Fig. 4a is shown in Fig. 4b, where transmittance peaks exist near the central position between two adjacent electrodes.


Fig. 4 (a) Example of a simulated LC molecule configuration along the direction perpendicular to the electrodes. Short lines represent LC molecules and curved lines represent equipotentials. The vertical direction represents the LC cell layer. Colors represent the level of electric field intensity. (b) Simulated transmittance at the corresponding horizontal position (Reprinted with permission from Bhowmik et al. (2008))


To improve overall light efficiency, the electrode width has to be decreased and the distance between electrodes increased to optimum values. Electrode widths of 3–4 μm are currently used, and research to make narrower and more reliable electrodes is still ongoing. Electric field distributions are determined by the external voltage and the electrode configuration. The uniformity of the electric field deteriorates for larger electrode distances, which in turn affects the luminance distribution between the electrodes. An example of nonuniformity of the luminance distribution is given in Fig. 5, which shows microscope photographs of a test cell with an electrode distance of 28 μm (Hong and Shin 2008). LC molecules near the electrodes move at lower external voltage than LC molecules at the center, and the luminance near the electrodes is observed to be greater than that in the region between two adjacent electrodes at the voltage conditions V1 and V2. When the electric field distribution is nonuniform, it is difficult to align the LC molecules in each local region to the maximum transmittance. Also, the external voltage of the driving IC has a limited range. Owing to these various considerations, an electrode distance of less than 15 μm is generally used for display applications. The above example is based on the NB mode. The black state is not affected by the nonuniformity of the electric field. On the other hand, in the normally white (NW) mode, the transmittance is largest at zero voltage and the black state is achieved by applying a strong electric field. The TN mode is one example of an NW mode. As some of the LC molecules in the IPS mode remain in the initial configuration even under a strong electric field, the IPS mode is not used as an NW mode.


Fig. 5 Micrograph of an IPS test cell under different driving voltages, where V1 < V2 < V3. The electrode spacing is 28 μm and the electrode width is 5 μm. LC molecules move nonuniformly owing to the nonuniform electric field. Vertical black lines are metal electrodes (Reprinted with permission from Hong and Shin (2008))


Characteristics of the IPS Mode

Dynamic Characteristics

From the equation of rotation (Eq. 2), the temporal behavior of the LC molecules can be derived as follows:

γ1 ∂φ/∂t = ε0ΔεE² sin φ cos φ + K22 ∂²φ/∂z²   (4)

where γ1 is the twist viscosity of the LC (Oh-e and Kondo 1996). For the case when the external voltage at the final state is zero, the LC molecules show exponential-decay behavior, where the time constant is given as

τf = γ1d²/(K22π²)   (5)

Equation 5 shows that the cell gap and the material parameters of the LC are the key parameters that determine the response time of the IPS mode. However, there is no simple solution like Eq. 5 when the voltage at the final state is not zero. To characterize the dynamic image quality, the response time between any two gray levels has to be considered. IPS is known to have a relatively uniform gray-to-gray (G2G) response time distribution compared with other modes such as vertical alignment (VA) and TN. Figure 6 shows an example of the G2G distribution for IPS and VA (Bhowmik et al. 2008). For TV applications, various complex driving schemes have been applied to improve the G2G response time, and nowadays the dynamic performance of an LCD TV system is determined not only by the dynamic performance of the LC mode itself but also by driving schemes such as overdriving and scanning backlight (Nakamura and Sekiya 2001; Kurita 2001). Still, the IPS mode optimized for dynamic performance is reported to show dynamic image quality better than that of TV systems with other LC modes or other types of flat panel display (Kim et al. 2006). Figure 7 shows a photograph captured while images were horizontally scrolling at a speed of ten pixels per frame on the display. Minute detail can still be discerned for IPS LCD TVs, as shown in Fig. 7a.
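For a feel of the magnitudes in Eq. 5, the sketch below evaluates the relaxation time constant for typical order-of-magnitude values of γ1 and K22 (illustrative numbers, not data for a specific mixture); note the quadratic dependence on the cell gap:

import math

def relaxation_time_ms(gamma1, gap_um, k22):
    # tau_f = gamma1 * d^2 / (K22 * pi^2), returned in milliseconds.
    dgap = gap_um * 1e-6
    return gamma1 * dgap**2 / (k22 * math.pi**2) * 1e3

print(relaxation_time_ms(0.1, 3.5, 7e-12))   # ~17.7 ms for a 3.5-um gap
print(relaxation_time_ms(0.1, 3.0, 7e-12))   # ~13.0 ms for a 3.0-um gap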

Viewing Angle Characteristics and Their Effects on Other Characteristics

Little dependency of image quality on the viewing direction is a strong merit of the IPS mode when compared with other LC modes (Oh-e and Kondo 1995; Yeo et al. 2005; Hong et al. 2007). In-plane rotation of LCs has an inherently wide viewing angle, as shown schematically in Fig. 8. When the transmittance of the LC is controlled by out-of-plane LC rotation under a vertical electric field, the angle θ in the second sine term of Eq. 1 is different for different viewing directions, and consequently the transmittance depends on the viewing angle. In contrast, when the transmittance of the LC is controlled by in-plane LC rotation under an in-plane electric field, the angle φ between the LC optical axis and the polarizer in Eq. 1 is relatively unaffected by the viewing direction. Therefore, the angular dependence of transmittance of the in-plane configuration is much smaller than that of the out-of-plane configuration. The effective retardation of in-plane LCs is not completely independent of the viewing angle, however, and slight color shifts are observed in monodomain LCs as the viewing direction approaches 90°, as shown in Fig. 9. When the viewing direction becomes parallel to the optical axis of the LC, the effective retardation decreases slightly and the transmittance in the visible band becomes larger in the blue region. When the viewing direction becomes perpendicular to the optical axis of the LC, the effective retardation increases and the transmittance becomes higher for colors of longer


[Two surface plots titled "Gray to gray response time": response time (ms, 0–50) versus initial and final gray levels (G0–G255), panels (a) and (b).]

Fig. 6 Examples of gray to gray response time distribution for (a) IPS mode and (b) vertical alignment mode (Reprinted with permission from Bhowmik et al. (2008))

wavelength. To compensate for this phenomenon, in which the image looks bluish or yellowish depending on the viewing direction, a dual-domain structure has been devised (Aratani et al. 1997). Dual domains are induced by zigzag-shaped electrodes, as shown in Fig. 10. A zigzag electrode induces two different directions of the electric field at the upper and lower parts of the pixel; therefore, the LC rotates counterclockwise in the upper region and clockwise in the lower region. Thus, a dual domain can be induced even though only one rubbing direction is used inside the pixel. The bluish shift of one domain and the yellowish shift of the other counteract each other, eliminating the color shift phenomenon of the monodomain. Although the phase change induced by the LC is the same irrespective of the viewing direction, the condition of crossed polarizers can be affected by the viewing direction. Figure 11 illustrates the phenomenon whereby the angle between the polarizer and the analyzer is 90° at normal incidence but larger than 90° at oblique incidence. Compensation films to remedy this problem have been reported and contribute to the improvement of the viewing-angle characteristics of IPS (Saitoh et al. 1998; Lee et al. 2005). The LCD viewing angle was initially defined as the angular range over which the contrast ratio between the luminance of white and black is larger than 10. Although this specification is suitable for displaying text or graphics, the angular performance for TV applications should be considered in terms of gray level and


Fig. 7 Comparison of motion blur in photographs of an image scrolling horizontally at a speed of ten pixels per frame for (a) an IPS LCD and (b) a plasma display panel (Reprinted with permission from Kim et al. (2006))

Fig. 8 Comparison between (a) out-of-plane motion of an LC and (b) in-plane motion of an LC

[Diagram labels: polarizers, LC molecule with indices ne and no; Δneff increase gives a yellowish shift for θ > 60°, Δneff decrease gives a bluish shift for θ > 60°.]

Fig. 9 Color shift phenomenon of one-domain IPS


[Diagram labels: rubbing direction, common electrode, pixel electrode, and E-field directions in panels (a) and (b).]

Fig. 10 LC configuration of two-domain IPS (a) for the initial alignment state and (b) under a driving voltage larger than the voltage threshold

Fig. 11 The angular change between the polarizer and the analyzer for (a) normal incidence and (b) oblique incidence

color characteristics as well. Most of the wide-viewing-angle LC modes currently used for TVs satisfy this specification of a contrast ratio of 10 at more than 170°. New measurement methods to differentiate the viewing angle performance between LCD modes are needed and have been reported (ICDM (International Committee for Display Measurement) Standard 2011). Although the angular dependence of the luminance of LCDs is relatively well known, this angular dependence is also related to other performance criteria. The example in Fig. 12 shows how the luminance ratio and color are related to the viewing direction. In LCDs, color is represented by controlling the luminance of the individual red, green, and blue subpixels.


[Plot: luminance (0–1) versus gray level (0–256); curves R0, G0, B0 for the normal viewing direction and R0′, G0′, B0′ for an oblique viewing direction.]

Fig. 12 Effect of angular dependence on color for an LC cell having angular luminance dependence

[Plots: (a) luminance (0–1) versus gray level (0–256) for normal and oblique directions, marking the 10 % luminance states N(10) and O(10); (b) luminance versus time (0–10 ms), marking the response times RT(N) and RT(O).]

Fig. 13 Example of the effect of angular dependence on the response time for an LC cell with nonnegligible gamma shift. (a) An LC state of 10 % luminance at the normal viewing direction is different from that at an oblique viewing direction. (b) Change of the temporal luminance curve for the normal and the oblique viewing directions. RT response time, N normal, O oblique (Reprinted with permission from Bhowmik et al. (2008))

In Fig. 12, which shows the observed trend of a non-IPS mode, the luminance of the (R0, G0, B0) gray levels represents one color, and the luminance of the blue component is higher for the oblique viewing direction than for the normal direction. Therefore, this color would look more bluish from the oblique viewing direction. Dynamic performance has generally been considered only for the normal direction. The response time is defined as the time interval between the 10 % and 90 % luminance levels in the temporal luminance measurement data when the driving voltage changes from one state to the other. If an angular luminance dependence exists, the times taken for the luminance to reach 10 % or 90 % are different for the normal and oblique viewing directions, even though the temporal LC rotation is the same irrespective of the viewing direction. Figure 13 shows an example of this angular temporal dependence (Hong et al. 2008a). The angular luminance characteristics of IPS are reported to be almost negligible (Oh-e and Kondo 1995; Yeo et al. 2005), so the angular dependence of various characteristics of LCDs, such as color variation and temporal luminance, can be effectively minimized in the case of the IPS mode.


Stability of Domain After External Stress

When an external stress or touch is applied, the luminance of the LCD changes and then returns to the original LC state when the external stress is removed. However, if states of stable domains coexist inside one pixel, the domain deformed by the external stress may return not to the state of the original domain, but to another locally stable LC domain. This may result in extremely long restoration times, much longer than the typical response time of the LC. For the purpose of wide viewing angles, multidomain methods have been used for various LC modes. In the case of the two-domain TN mode, the stability of normal and reverse domains has been reported and analyzed from the viewpoint of the Gibbs free energy (Saitoh et al. 1997). A very long restoration time, of minutes or more, is also observed for multidomain VA after the application of external stress. In this case, the luminance change shows nonsymmetric angular characteristics, whereas the angular characteristics were initially symmetric. It may therefore be inferred that one domain is deformed into another quasi-stable domain by the external stress (Hong et al. 2008b). A dual domain is used for the IPS mode, but the luminance changes of IPS due to external stress are reported to disappear in less than 1 s, and unstable behavior of the domains has not been reported. Nowadays, touchscreen interfaces have become very popular in various electronic devices, and the display is required to restore to its initial state, or remain unchanged, under the stress of touch. In this respect, the IPS mode has become very popular in mobile devices such as smartphones and tablets.

Summary

The IPS mode was initially developed owing to the merit of its negligible angular dependence. As multimedia applications became more important, fast dynamic performance was required, and the response time of IPS gradually improved through the optimization of LC materials and driving schemes. As techniques for the accurate patterning of electrodes improved, the transmittance of the IPS mode gradually increased as well. The traditionally known shortcomings of LCDs, such as narrow viewing angle and slow response, are mostly overcome using IPS modes, and IPS-mode LCDs are currently used successfully for large multimedia applications. For other specific applications, the relative importance of particular features is slightly different. And due to the domain stability after external stress, IPS-mode LCDs are also suitable for touch-screen applications.

Further Reading

Aratani S, Klausmann H, Oh-e M, Ohta M, Ashizawa K, Yanagawa K, Kondo K (1997) Complete suppression of color shift in in-plane switching mode liquid crystal displays with a multidomain structure obtained by unidirectional rubbing. Jpn J Appl Phys 36:L27–L29
Bhowmik AK, Li Z, Bos PJ (2008) Mobile displays. Chapter 4: IPS LCD technology and application. Wiley, New York
Hong HK, Shin HH (2008) Effects of rubbing angle on maximum transmittance of in-plane switching liquid crystal display. Liq Cryst 35(2):173–177
Hong HK, Shin HH, Chung IJ (2007) In-plane switching technology for liquid crystal television. J Disp Technol 3(4):361–370


Hong HK, Yoon JK, Lim MJ (2008a) Analysis of dependence of optical response time of LCD on the viewing direction. J Soc Inf Disp 16(10):1063–1068
Hong HK, Ahn JY, Jung HY, Baek HI, Lim MJ, Shin HH (2008b) Moving-image-sticking phenomenon induced by an outside force in liquid-crystal displays. J Soc Inf Disp 16:883–888
ICDM (International Committee for Display Measurement) Standard (2011) to be published. http://icdmsid.org
Khoo IC, Simoni F (1991) Physics of liquid crystalline materials. Gordon and Breach, New Jersey
Kiefer R, Weber B, Windscheid F, Baur G (1992) In-plane switching of nematic liquid crystals. In: Japan display '92, p 547
Kim KD, Yoon JK, Lim M, Shin HH, Chung IJ (2006) Motion artifact comparison of PDP and MBR LCD: world's best MPRT LCD. In: International display workshop 2006, pp 1487–1488
Kurita T (2001) Moving picture quality improvement for hold-type AM-LCDs. Soc Inf Disp Digest Tech Papers, San Jose, pp 986–989
LCD Master. http://shintech.jp/software_e/lcdm2d_e.html
Lee JH et al (2005) Optical configurations of TW-IPS LC cell for very wide viewing angle in large size TV application. SID Int Symp Digest Tech Papers 36:642–645
Nakamura H, Sekiya K (2001) Overdrive method for reducing response times of liquid crystal displays. Soc Inf Disp Digest Tech Papers, San Jose, pp 1256–1259
Oh-e M, Kondo K (1995) Electro-optical characteristics and switching behavior of the in-plane switching mode. Appl Phys Lett 67:3895–3897
Oh-e M, Kondo K (1996) Response mechanism of nematic liquid crystals using the in-plane switching mode. Appl Phys Lett 69:623–625
Saitoh Y, Takano H, Chen CJ, Lien A (1997) Stability of UV-type two-domain wide-viewing-angle liquid crystal display. Jpn J Appl Phys 36:7216–7223
Saitoh Y et al (1998) Optimum film compensation of viewing angle of contrast in in-plane-switching-mode liquid crystal display. Jpn J Appl Phys 37(Part 1):4822–4828
Satake T, Nishioka T, Saito T, Kurata T (2001) Electrooptical study of an in-plane switching mode using a uniaxial medium model. Jpn J Appl Phys 40:195–199
Soref RA (1974) Field effects in nematic liquid crystals obtained with interdigital electrodes. J Appl Phys 45:5466
Yang DK, Wu ST (2006) Fundamentals of liquid crystal displays. Wiley, New York
Yeh P, Gu C (1999) Optics of liquid crystal displays. Wiley, New York
Yeo SD, Oh CH, Lee HW, Park MH (2005) LCD technologies for the TV application. Soc Inf Disp Digest Tech Papers, Boston, pp 1738–1741


Vertically Aligned Nematic (VAN) LCD Technology
Hidefumi Yoshida*
Display Device Development Division, Sharp Corporation, Tenri, Nara, Japan

Abstract

Vertically aligned liquid crystal displays (VA-LCDs) have now been well developed and are widely used in televisions (TVs), monitors, notebooks, and mobile devices due to the many advantages that they have over other LC modes. In particular, the contrast ratio in the normal direction is very high, and by adopting a multi-domain technique, a wide viewing angle is also possible. In contrast with some other LC modes, yields are higher because of the possibility of rubbing-free mass production, and there is the additional possibility of creating transflective displays. In this chapter, various VA-LCD modes and their respective advantages as well as the principal configuration of transflective displays are discussed.

List of Abbreviations

CF: Color filter
CR: Contrast ratio
FHD: Full high definition. The resolution is 1920 × RGB × 1080
IPS-LCD: In-plane switching mode LCD
ITO: Transparent electrode comprised of the oxide of indium and tin
LCD: Liquid crystal display
PSA: Polymer-sustained alignment
TAC: Tri-acetyl cellulose
TFT: Thin film transistor
TV: Television
VA-LCD: Vertically aligned liquid crystal display
VAN technology: Vertically aligned nematic technology

Introduction

Liquid crystal display (LCD) technologies have been improved and are now used for small-sized cellular phones, midsized monitors, large-sized TVs, and so on. In particular, vertical alignment (VA) technology has been adopted for various types of displays due to its many advantages compared with the twisted nematic (TN) mode or the in-plane switching (IPS) mode: high on-axis contrast ratio, rubbing-free process, wide viewing angle, good cost–performance, and, finally, simultaneous usability in reflective and transmissive modes. In this chapter, this VA technology is introduced firstly for TV applications and secondly for displays for mobile devices. As for vertically aligned liquid crystal displays (VA-LCDs), also known as vertically aligned nematic (VAN) LCDs, various other LC display modes have been developed, and each mode has its own specific advantage.

*Email: [email protected]


[Diagram labels: (a) black state, Voff, with linearly polarized light between crossed polarizers (absorption axes shown); (b) white or gray state, Von, with linearly or elliptically polarized light through the tilted LC molecules; incident light enters from below.]

Fig. 1 Principal structure (cross-sectional view) of a vertically aligned liquid crystal display (VA-LCD) in the case that a voltage is turned off (a) and turned on (b)

In some cases, a technology had previously been discarded but is nowadays back in the spotlight for new applications. These technologies are introduced toward the end of this chapter.

Background of Vertical Alignment Technology

Figure 1 shows the fundamental liquid crystal (LC) molecular alignment. The LC director is initially vertically aligned between crossed polarizers (Schiekel and Fahrenschon 1971). Without voltage applied, light traveling normally to the device experiences no birefringence and hence no polarization conversion between the two polarizers, and the device appears black (Fig. 1a). When a voltage is applied across the liquid crystal layer, because the LC material chosen will have a negative dielectric anisotropy, there will be a tendency for the LC director to tilt relative to the original vertical orientation, and above a threshold voltage

\[ V_{th} = \pi\left(\frac{K_{33}}{\epsilon_0\,\Delta\epsilon}\right)^{1/2} \qquad (1) \]

(where K33 is the elastic constant for bend distortion, ε0 is the electric constant, and Δε is the magnitude of the dielectric anisotropy in the form of relative permittivity), which is typically around 2 V, a tilted director profile will occur (Fig. 1b). The azimuthal angle of the tilt in the LC director is degenerate and, in general, will be decided either by thermal fluctuation or by defects or structures within the cell which can break the degeneracy (this will be discussed in more detail later in this chapter). In the tilted state, light traveling normally to the device will experience some birefringence, polarization conversion will occur, and light will be transmitted according to

\[ I = I_0\,\sin^2(2\psi)\,\sin^2\!\left(\frac{\pi d\,\Delta n_{eff}}{\lambda}\right) \qquad (2) \]

(where ψ is the angle between the optical axis of the polarizer and the projected director of the LCs, d is the cell gap or the thickness of the liquid crystal layer, Δn_eff is understood to mean the effective birefringence or the anisotropy of refractive index experienced by light traveling normally incident to the device, and λ is the wavelength of the incident light).
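A short numeric sketch of Eqs. 1 and 2 follows; the elastic constant, dielectric anisotropy, cell gap, and effective birefringence are assumed values chosen only to reproduce the roughly 2 V threshold quoted above, not figures from this chapter:

```python
import math

eps0 = 8.854e-12   # electric constant (F/m)
K33 = 15e-12       # bend elastic constant (N), assumed typical value
delta_eps = 4.0    # magnitude of dielectric anisotropy, assumed typical value

V_th = math.pi * math.sqrt(K33 / (eps0 * delta_eps))
print(f"threshold voltage: {V_th:.2f} V")  # ~2 V, as stated in the text

def transmission(psi_deg, d, dn_eff, lam, I0=1.0):
    # Eq. 2: transmitted intensity for a tilted VA layer
    psi = math.radians(psi_deg)
    return I0 * math.sin(2 * psi)**2 * math.sin(math.pi * d * dn_eff / lam)**2

lam = 550e-9    # design wavelength (m)
d = 3.2e-6      # cell gap (m), assumed
dn_eff = 0.086  # effective birefringence at full drive, assumed (d*dn ~ lambda/2)
print(f"T/I0 at psi = 45 deg: {transmission(45, d, dn_eff, lam):.3f}")  # ~1.0
```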


Table 1 Contrast ratio and viewing-angle (Δu′v′) characteristics of the VA mode and the IPS mode
[Table cells garbled in extraction; legible fragments include a contrast ratio of 1,500 for the in-plane switching mode and the film index definitions ny = nz and, for a positive c-plate, nx = ny < nz.]

[Poincaré-sphere labels: polarization of incident and outgoing light; best transition; effect of the a-plate (slow axis of a-plate); effect of the positive c-plate.]

Fig. 17 The principle of optical compensation. (a) The principal structure of the optical configuration which gives a black image at any viewing angle. (b) The optical path, or locus, on the Poincaré sphere

[Layer stacks: (a) polarizer (TAC/PVA/TAC), c-plate, glass substrate, LC layer, glass substrate, a-plate, polarizer; (b) biaxial TAC film on both sides of the LC layer; (c) biaxial TAC film on one side only.]

Fig. 18 The polarizer structure for optical compensation. (a) the combination of a positive a-plate and a negative c-plate, (b) the biaxial TAC layers used on both sides of the LC layer, (c) a biaxial TAC layer on only one side

The films that compensate for both the LC layer and the polarizers are the combination of a positive a-plate and a negative c-plate (Fig. 18a). In order to simplify the sheet configuration, an optically biaxial film was developed in which the functions of the negative c-plate and the positive a-plate are combined. The film is stretched in two different directions, the machine direction and the lateral direction, and is used to replace the internal TAC layer that normally supports one side of the PVA layer of the polarizer. In the ideal case, the biaxial TAC layer is used in a symmetrical configuration on both sides of the LC layer, as shown in Fig. 18b. This leads to a symmetrical viewing-angle characteristic, as is used in high-end displays. However, for cost reduction with reduced symmetry in performance, it is possible to put a biaxial TAC layer on only one side of the display (Fig. 18c).


Technologies for Mobile Applications: Transflective Display

VA-LCD is particularly suited to mobile applications because transflective display technology can readily be applied to this LC mode. Transflective displays have portions of each pixel which are transmissive and portions which are reflective, so that they can be viewed both indoors and outdoors, in both weak and strong ambient lighting conditions.

Transmissive Mode with Optimum Brightness

Although the fundamentals of VA-LCDs in mobile devices and in large area televisions are very similar, the priorities are rather different. In televisions, issues such as contrast ratio and viewing angle are very important. However, for mobile devices, although these issues are relevant, often brightness, or more accurately power efficiency, is more critical. The brightness of the white state of each pixel is given by the average value of Eq. 2. For optimum brightness, therefore, not only do the thickness and effective birefringence of the display at maximum applied voltage have to match the condition for a half-wave plate (Eq. 3), but the azimuthal angle of the director ψ must be at 45° to the polarizers. If a four-domain mode such as the MVA is used, the azimuthal angle of the protrusions is at 45° to the polarizers so that maximum light is transmitted through each domain. However, there are regions between each domain (disclinations) in which the azimuthal angle of the director changes rapidly through 90° from the value in one domain compared with the next. In these regions, the brightness will be less than optimal. One way to avoid this problem is to insert quarter-wave plates between the polarizers and the LC layer (Maltese and Ottavi 1978). If the optical axes of the quarter-wave plates are oriented at 45° to the polarizers, and perpendicular to each other, then the combination is that of two crossed circular polarizers (Fig. 19). Circularly polarized light therefore enters the LC layer, and (regardless of the azimuthal orientation of the LC layer) in the white state, the LC layer acts as a half-wave plate to change the handedness of the circular polarization so that it is transmitted by the second circular polarizer. This configuration is brighter than using linear polarizers simply because it is insensitive to the azimuthal angle of the director in the white state. If the absorption axes of the polarizers are simply set to an east–west or north–south azimuth as in the MVA, the azimuth with the maximum viewing range is rotated by 10° or 20°.
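The azimuth insensitivity of the crossed-circular-polarizer configuration can be checked with a minimal Jones-calculus sketch; ideal polarizers and wave plates are assumed, and this is an illustration rather than the chapter's own analysis:

```python
import numpy as np

def rot(a):
    return np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])

def waveplate(retardation, axis_angle):
    # Jones matrix of a retarder with the given retardation (radians)
    # and slow-axis angle (radians)
    J = np.diag([np.exp(-1j * retardation / 2), np.exp(1j * retardation / 2)])
    return rot(axis_angle) @ J @ rot(-axis_angle)

qwp_in = waveplate(np.pi / 2, np.pi / 4)    # quarter-wave plate at +45 deg
qwp_out = waveplate(np.pi / 2, -np.pi / 4)  # crossed quarter-wave plate
analyzer = np.array([[0, 0], [0, 1]])       # crossed linear analyzer
Ein = np.array([1, 0])                      # light from the input polarizer

for azimuth_deg in (0, 15, 30, 45, 80):
    lc = waveplate(np.pi, np.radians(azimuth_deg))  # LC layer as half-wave plate
    Eout = analyzer @ qwp_out @ lc @ qwp_in @ Ein
    print(azimuth_deg, round(float(np.vdot(Eout, Eout).real), 3))
# Transmission is 1.0 at every azimuth: the white state is insensitive
# to the in-plane director orientation, as described above.
```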

[Diagram labels: black (Voff) and white (Von) states; polarizer absorption axes and quarter-wave plate slow axes, each rotated; circularly polarized light in the LC layer; the polarization changes independently of the LC molecular azimuth.]

Fig. 19 The principal configuration of an LCD with a pair of circular polarizers in the case that a voltage is turned off (a) and turned on (b)


[Stack labels: polarizer, quarter-wave plate, glass substrate, LC layer with reflector, glass substrate, quarter-wave plate, polarizer, backlight; black (Voff) and white (Von) states.]

Fig. 20 The principal structure of a trans-reflective (transflective) display in the case that a voltage is turned off (a) and turned on (b)

In order to realize symmetrical viewing-angle characteristics in this configuration that uses circular polarizers, the absorption axes of the polarizers were rotated by about 10° to 20° (Yoshida et al. 2004b). The line-shaped protrusion of the MVA is then not necessary, and a nipple-like protrusion is located at the center of each sub-pixel instead. The LC molecules incline toward the nipple-shaped protrusion when a voltage is applied. Only a black spot remains at the center of each sub-pixel, realizing a bright white state. In addition to the abovementioned technology, a technology to realize a higher aperture ratio has been developed (Hirata et al. 1996). A thick organic material (e.g., a UV-curable resin) is deposited over the display area. As the electric field from the data and gate bus lines can then be suppressed or shielded, the transparent electrode can be fabricated over the data and gate bus lines, increasing the aperture ratio and realizing a brighter display image or lower power consumption.

Transflective Mode

Figure 20 shows the principal configuration of a transflective display (Nanutaki et al. 1999; Jisaki and Yamaguchi 2001). The optical configuration is the same as that for the transmissive mode, except that each pixel of the display has both a transmissive area and a reflective area. In the reflective area, the incident light is reflected by a reflector and goes through the LC layer twice. In order to keep the optical retardation dΔn of the transmissive area and the reflective area the same, the cell gap of the reflective area is half that of the transmissive area. Then the threshold voltage and the brightness–voltage characteristics of the two areas are the same, making it possible to use the display in transmissive or reflective mode in any circumstances.
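The cell-gap halving follows directly from the double pass: equating the round-trip retardation of the reflective area with the single-pass retardation of the transmissive area gives

\[ 2 \times \frac{d}{2} \times \Delta n \;=\; d\,\Delta n \]

so both areas reach the half-wave condition at the same drive voltage.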

Summary

VA-LCDs are widely used for TV and mobile applications due to their productivity and high performance (contrast ratio, brightness, response time, etc.). There are several types of VA-LCD mode, each with its own advantages. MVA- and PVA-LCDs have mainly been used for televisions and other large area displays because the yield and performance (contrast ratio, brightness, and response time) are high. Technologies with higher brightness and faster response times are now being adopted. PSA technology is used for mobile phones because of its good productivity with small-sized displays, whereas photo-alignment technology is now being adopted for large area televisions because its productivity is more suited to larger area displays. VA-LCD with an oblique electric field is used for automotive displays for its ultrahigh response speed.


In order to realize wider viewing-angle characteristics, optical compensation films and multi-pixel configurations have been introduced. For realizing the multi-pixel configuration, several technologies (capacitance coupling technology, multi-TFT driving technology, time-sharing technology, and so on) have been introduced and put into mass production. The viewing-angle characteristic is now almost satisfactory. Transflective-type VA-LCDs are widely used for mobile phones for their good readability under sunlight. I believe that the future of VA TFT-LCDs is very bright.

Further Reading

Chen J, Kim KH, Jyu JJ, Souk JH, Kelly JR, Bos PJ (1998) Optimum film compensation modes for TN and VA LCDs. In: SID '98 digest 21.2, p 315
Hanaoka K, Nakanishi Y, Inoue Y, Tanuma S, Koike Y, Okamoto K (2004) A new MVA-LCD by polymer sustained alignment technology. In: SID '04 digest, pp 1200–1203
Hirata M, Watanabe N, Shimada T, Okamoto M, Mizushima S, Take H, Hijikigawa M (1996) Development of 'Super-V' TFT-LCDs. In: Digest of AM-LCD '96/IDW '96, p 193
Jisaki M, Yamaguchi H (2001) Development of transflective LCD for high contrast and wide viewing angle by using homeotropic alignment. In: Digest of Asia display/IDW '01, p 134
Kamada T, Koike Y, Tsuyuki S, Takeda A, Okamoto K (1992) Wide viewing angle full-color TFT LCDs. In: Digest of Japan display '92, p 886
Kim SS (2005) The world's largest (82-in.) TFT-LCD. In: SID '05 digest, p 1842
Kim KH, Park SB, Song JK, Kim S, Souk JH (1998) Domain divided vertical alignment mode with optimized fringe field effect. In: Digest of Asia display '98, p 383
Kim SS, Berkely BH, Park JH, Kim T (2006) New era for TFT-LCD size and viewing angle performance. J SID 14(2):127
Kimura N, Ishihara T, Miyata H, Kumakura T, Tomizawa K, Inoue A, Horino S, Inaba Y (2005) New technologies for large-sized high-quality LCD TV. In: SID '05 digest 60.2, p 1734
Kobayashi S, Iimura Y (1997) Multidomain TN-LCD fabricated by photoalignment. In: SPIE 3015, p 40
Koike Y, Kamada T, Okamoto K, Ohashi M, Tomita I, Okabe M (1992) A full-color TFT-LCD with a domain-divided twisted-nematic structure. In: SID '92 digest, p 798
Maltese P, Ottavi CM (1978) Improved construction of liquid crystal cells. Alta Frequenza XLVII(9):664
Nakanishi Y, Yoshida H, Sasabayashi T, Tasaka Y, Okamoto K, Inoue H, Sukenori H, Fujikawa T (2000) Multi-domain vertically aligned LCD driven by oblique electric field. In: Digest of AM-LCD '00, p 13
Nam MS, Wu JW, Choi YJ, Yang JH, Kim JY, Kim JH, Kwon SB (1997) Wide-viewing-angle TFT-LCD with photo-aligned four-domain TN mode. In: SID '97 digest, p 933
Nanutaki Y, Kubo M, Shinomiya T, Kimura N, Ishii Y, Funada F, Hijikigawa M (1999) Development of a novel TFT-LCD with excellent legibility under any intensity of ambient light. In: Euro display '99 late-news paper, p 121
Ohmuro K, Kataoka S, Sasaki T, Koike Y (1997) Development of super-high-image-quality vertical alignment-mode LCD. In: SID '97 digest, p 845
Park SB, Lyu J, Um Y, Do H, Ahn S, Choi K, Kim KH, Kim SS (2007) A novel charge-shared S-PVA technology. In: SID '07 digest, p 1252
Schadt M, Schmitt K, Kozinkov V, Chigrinov V (1992) Surface-induced parallel alignment of liquid crystals by linearly polymerized photopolymers. Jpn J Appl Phys 31:2155


Schiekel MF, Fahrenschon K (1971) Deformation of nematic liquid crystals with vertical orientation in electrical fields. Appl Phys Lett 19:391
Takeda A, Kataoka S, Sasaki T, Chida H, Tsuda H, Ohmuro K, Sasabayashi T, Koike Y, Okamoto K (1998) In: SID '98 digest, p 1077
Tanaka Y, Taniguchi Y, Sakaki T, Takeda A, Koike Y, Okamoto K (1999) A new design to improve performance and simplify the manufacturing process of high quality MVA TFT-LCD panels. In: SID '99 digest, p 206
Tasaka Y, Yoshida H, Seino T, Tsuda H, Chida H, Kataoka S, Mayama T, Koike Y, Ohhashi M (1998) TFT-LCD with divided inclined vertical alignment by irradiation of unpolarized ultraviolet light. In: Digest of AM-LCD '98, p 35
Utsumi Y, Takeda S, Kagawa H, Kajita D, Hiyama I, Tomikoka Y, Asakura T, Shimura M, Ishii M, Miyazaki K, Ono K (2008) Improved contrast ratio in IPS-Pro LCD-TV by using quantitative analysis of depolarized light leakage from component materials. In: SID '08 digest, p 129
Yamamoto E, Kanno T, Katsuta S, Asaoka Y, Maeda T, Kamada T, Yoshida H, Tsuda Y, Kondo K (2013) Novel microstructure film for improving viewing angle characteristics of LCD. In: IDW '13, pp 82–83
Yamamoto E, Yui H, Katsuta S, Asaoka Y, Maeda T, Tsuda Y, Kondo K (2014) Wide viewing LCDs using novel microstructure film. In: SID 2014 digest, pp 385–388
Yoshida H, Koike Y (1997) Inclined homeotropic alignment by irradiation of unpolarized UV light. Jpn J Appl Phys 36:L428–L431
Yoshida H, Seino T, Koike Y (1997) Four-domain divided inclined vertical alignment by irradiation of unpolarized ultraviolet light. Jpn J Appl Phys 36:L1449
Yoshida H, Nakanishi Y, Sasabayashi T, Tasaka Y, Okamoto K, Inoue Y, Sukenori H, Fujikawa T (2000) Fast-switching LCD with multi-domain vertical alignment driven by oblique electric field. In: SID '00 digest, pp 334–337
Yoshida H, Kamada T, Ueda K, Tanaka R, Koike Y, Okamoto K, Chen PL, Lin J (2004a) Multi-domain vertically aligned LCDs with super-wide viewing range for gray-scale images. In: Digest of Asia display/IMID '04, pp 198–201
Yoshida H, Tasaka Y, Tanaka Y, Sukenori H, Koike Y, Okamoto K (2004b) MVA LCD for notebook or mobile PCs with high transmittance, high contrast ratio, and wide angle viewing. In: SID '04 digest, p 6


Bistable Liquid Crystal Displays
Cliff Jones*
School of Physics and Astronomy, University of Leeds, Leeds, Yorkshire, UK

Abstract

Bistable liquid crystal displays offer many benefits, including the ability to display high levels of image content using passive matrix addressing and without thin-film transistors, ultralow power reflective displays with image storage that only consume power when the image changes, and flexible plastic displays capable of showing color images. The topic is diverse, involving nematic, smectic, and cholesteric liquid crystals; retardation, anisotropic absorption, scattering, and selectively reflecting optical modes; dielectrically, ferroelectrically, and flexoelectrically driven electrooptic effects; bistable textures stabilized by monostable surfaces or smectic layers, and bistable surfaces; and applications ranging from electronic skins to high-definition television. Many different bistable display modes have been suggested over the past four decades, and this chapter concentrates on the bistable twisted nematic, surface-stabilized ferroelectric liquid crystals (FLCs), scattering smectic A, the grating-aligned zenithal bistable display, and bistable cholesteric displays (BCDs).

List of Abbreviations

BCD: Bistable cholesteric display
BiNem™: Trade name for the 0–π bistable nematic mode marketed by Nemoptic
BTN: Bistable twisted nematic, often used for the 0–2π metastable nematic display mode
C1 and C2: Chevron smectic layer orientation directions, defined with respect to the parallel surface alignment directions
CMOS: Complementary metal-oxide semiconductor/silicon
DMOS: Double-diffused metal-oxide semiconductor/silicon
DRAMA: Defence Research Agency Multiplexed Addressing, scheme used for τVmin FLC
ESL: Electronic shelf-edge label, used in retail for automatically displaying pricing and other product information
FLC: Ferroelectric liquid crystal, formed in chiral tilted smectic liquid crystals, but usually taken to mean chiral smectic C
FLCD: Ferroelectric liquid crystal display
HAN: Hybrid-aligned nematic, usually taken to mean homeotropic alignment on one surface and homogeneous or low pre-tilt alignment on the opposite surface
HDTV: High-definition television, usually corresponding to 1,920 × 1,080i (interlaced) or 1,920 × 1,080p (progressive) pixels
ITO: Indium tin oxide, the transparent conducting oxide layer most commonly used by the display, touch screen, and solar panel industries
LCD: Liquid crystal display
N and N*: Nematic and chiral nematic liquid crystal phases, where the pitch of the chiral nematic is arbitrarily taken to be much longer than the wavelength of light, to distinguish it from the cholesteric phase

*Email: [email protected]


NTSC: National Television System Committee, which defined the standards for US color television
OLED: Organic light-emitting diode
PEDOT: Poly(3,4-ethylenedioxythiophene), a polymeric transparent conductor
PES: Polyethersulfone, polymer film that can be made without birefringence
PET: Polyethylene terephthalate, polymer film
Ps and Pf: Electric polarization, either spontaneous (ferroelectric) or elastically induced (flexoelectric)
QVGA: Quarter Video Graphics Array, 320 × 240 pixels
RGBW: Red-green-blue and transparent (white) color filter system
RMS: Root mean square
SiOx: Silicon oxide layer, usually evaporated onto a glass surface to induce director alignment
SmA, SmC, and SmC*: Smectic A, smectic C, and chiral smectic C phases
STN: Supertwist nematic display, taken to include foil-compensated STN
TDP: Triangular director profile for SmC and FLC chevron structures
TFT: Thin-film transistor, usually meaning one or more such elements at each pixel
TN: Twisted nematic
VAN: Vertically aligned nematic, where both surfaces are homeotropically aligned
ZBD: Zenithal bistable display/device
Δn: The birefringence of the anisotropic liquid crystal phases, given as the difference between the extraordinary ne and ordinary no refractive indices
τVmin mode FLCD: Mode of operation for low Ps and highly positive dielectrically biaxial FLC

Introduction

Bistable displays exhibit electrooptic memory. They can be switched between two stable, optically distinguishable states with an appropriate electric field. The states have equivalent or similar energies and are separated by a much higher energy state or energy barrier. The barrier ensures that the desired state for a pixel is retained after the switching pulse without further electrical excitation: The pixel is then said to have been latched. The property of bistability has several potential benefits for a display technology. The most obvious advantage is the provision of ultralow power operation, because the display does not require constant updating and only consumes power when the image content is changed. The first market to gain significant traction that takes advantage of this is the electronic book reader. Such devices often incorporate a bistable electrophoretic ink-based display, providing sufficiently high image content but with a battery that needs charging only after many pages. However, unlike bistable liquid crystal displays (LCDs), this technology does not have a well-defined threshold and therefore necessitates a thin-film transistor (TFT) element at each pixel to prevent previously written rows from being affected by subsequent data signals. This adds to the display cost. Even with the TFTs, the display update is slow and can be distracting. Because of the relatively high tooling costs, TFTs have less design flexibility and tend to be available only in the formats dictated by large markets: Niche markets are better served by passive matrix displays. However, existing low-cost passive matrix displays without TFTs, such as the Supertwist Nematic (STN), provide neither sufficient optical quality nor the low power often required by


the market. For these applications, a bistable LCD uniquely combines low cost with superior optical performance and ultralow power. There are a variety of different bistable LCDs, each with its own merits. Displays have been produced using conventional nematic liquid crystals familiar from many consumer applications in the market today, but also utilizing either cholesteric (Coates 2011) or smectic (Rudquist 2011) liquid crystal phases. The history of bistable LCDs is almost as long as that of the LCD itself. For much of this time, the technological development was driven by the need to display high image contents, ultimately providing full color and video speed. Now that that need is largely satisfied by the success of TFT-driven displays, it is the advantages of cost, low power, and good image quality that are more important. However, bistable LCDs also offer advantages in developing markets, such as those using flexible plastic substrates. The bistable liquid crystal displays suited to each of these three types of application will be reviewed in turn.

Infinite Multiplexibility and Rapid Frame Response

Introduction

With conventional passive matrix displays, each row is addressed line by line and combined with a synchronous data signal on the columns that either increases or decreases the root mean square voltage applied to a given pixel when averaged over the frame time. The degree of discrimination provided by the signal depends on the fraction of the total signal represented by the selected row: That is, the data provide decreasing degrees of discrimination as the number of rows increases. High levels of multiplexing require the electrooptic threshold to be steep, but even the steep threshold of the STN limits displays to about 240 rows in practice. Higher levels of multiplexibility require either a separate nonlinear element such as a thin-film transistor (TFT) or a bistable response. For this reason, much of the initial development of bistable LCDs targeted high-image-content displays, and advantages such as power or image storage were rather secondary. Indeed, the displays were often designed in transmissive mode, operating with a backlight that dominated the power consumption. A bistable electrooptic response allows unlimited image content to be displayed because each row can be addressed independently and the image built up a line at a time. Usually, a pulse of voltage Vs and duration t is applied to a row while appropriate data Vd are applied to the columns. The voltages and times are arranged so that the pixel resultant, given by the row minus the column voltage, causes latching to the opposite state for the resultant |Vs + Vd|, whereas no latching occurs for the resultant |Vs − Vd| because it is lower than the latching threshold. During this operation, all of the remaining rows are held at ground, so that the unwritten resultant signals are ±Vd, which is far too low to cause any latching, particularly if Vs ≫ 2Vd. Each frame takes at least n × t seconds to address for a display of n rows. However, this minimum is rarely realized practically. Separate addressing signals are required for each transition: Usually a reset or blanking signal is applied to prepare each row for the addressing signal, and then discrimination is done using a separate select signal. Also, liquid crystals are sensitive to deterioration when DC is applied, and so bipolar addressing signals may be used. This means that a very fast latching response is needed if a complex image is to be refreshed rapidly, for example, to enable moving images or cursors to be displayed. The frame time depends not only on the latching time taken to address each row but also on the optical response time of the liquid crystal itself. Bistable LCDs with fast frame response include the 0–2π bistable twisted nematic and the surface-stabilized ferroelectric liquid crystal display.
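A minimal sketch of this row/column discrimination follows; the strobe, data, and threshold voltages are assumed values chosen purely for illustration:

```python
# Pixel resultants in bistable passive-matrix addressing (illustrative values).
V_s, V_d = 20.0, 5.0  # strobe and data amplitudes (V), assumed
V_latch = 22.0        # latching threshold (V), assumed

def resultant(row_selected, data_sign):
    row = V_s if row_selected else 0.0
    return row - data_sign * V_d  # pixel sees row minus column voltage

for selected in (True, False):
    for sign in (+1, -1):
        v = resultant(selected, sign)
        print(f"selected={selected!s:5} data={sign:+d}: |V| = {abs(v):4.1f} V "
              f"{'LATCH' if abs(v) >= V_latch else 'no latch'}")
# Selected row: |Vs + Vd| = 25 V latches, while |Vs - Vd| = 15 V does not.
# Unselected rows see only |Vd| = 5 V, far below threshold since Vs >> 2Vd.
```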

The 0–2π Bistable Twisted Nematic

Among the first bistable LCDs to be invented was the 0–2π bistable twisted nematic (BTN) (Berreman and Heffner 1980). Interest in this mode was reinvigorated in the mid-1990s when Seiko Epson produced


working demonstrators suitable for use in backlit graphic equalizer displays (Tanaka et al. 1995; Nomura et al. 2000) and a 5.7″ diagonal eight-color transmissive QVGA display. The bistable display mode was chosen because it offered a very fast optical response as well as a high level of multiplexing. Conventional twisted nematic (TN) LCDs have orthogonal rubbing directions and a long cholesteric pitch P to bestow a single handedness on the monostable twist. With the 0–2π BTN, parallel rubbed surfaces are spaced at cell gap d, and the inherent chiral pitch P is reduced to give a monostable π twist across the cell (i.e., d/P = 0.5) and two metastable states with 0 and 2π twist. If the surfaces have pre-tilt, the director profile of the π-twist state is splayed, whereas the splay is negligible in the metastable 0- and 2π-twist states (Fig. 1). Figure 1 also illustrates that the π-twist state is topologically distinct from the two metastable states: There is no continuous transition from this state to the others, the transition being mediated by the creation and movement of disclination loops. On the other hand, the metastable 0- and 2π-twist states are topologically equivalent, and the transition between these states occurs without the creation of defects. Application of a sufficiently high transverse field E reorients the director in the bulk of the sample to a vertical orientation when using a positive Δε nematic. If the field is removed quickly from this state, backflow causes the tilt in the cell midplane to initially increase, causing the 2π-twist state to be formed. Removing the field gradually allows the director profile to relax into the 0-twist state because the backflow is then lessened. Strictly speaking, the device is not genuinely bistable, and the metastable 0-twist and 2π-twist states relax back to the π-twist state a few seconds after removal of the applied field. The interpixel gaps remain in the π-state throughout, and defects spread from these regions across the latched pixel. Optimizing the mixture d/P can help increase the image retention time. Frequently updated displays such as those of reference (Tanaka et al. 1995) used a low-frequency signal below the Fréedericksz threshold

[Diagram states: π-twist (splayed); 0-twist and 2π-twist (un-splayed), connected to the π-twist state via defects; vertical state under field E; a gradual decrease of E leads to the 0-twist state, a sudden decrease to the 2π-twist state.]

Fig. 1 The 0–2π BTN


voltage but sufficient to prevent the spread of the π-state and "hold" the image. Other methods have been suggested for preventing the π-state formation in the inter-pixel regions altogether, thereby preventing each pixel from relaxing out of the desired state. These include locally reducing the cell gap (Berreman and Heffner 1980), phase separation of a polymer network into the inter-pixel region (Li et al. 1996), or patterning the alignment in that region to increase pre-tilt or twist (Acosta et al. 2006). Long-term bistability in these systems proved difficult to maintain, because microscopic irregularities within the pixel eventually nucleate the more stable π state. Optical contrast was provided using polarizers on either side of the LCD. Crossed polarizers were oriented at 45° to the rubbing directions such that the 0-twist state has the transmission T of a simple birefringent retarder, given by (Elston and Benzie 2011)

\[ \frac{T}{I_0} = \frac{1}{2}\sin^2\!\left(\frac{\pi\,\Delta n\, d}{\lambda}\right) \qquad (1) \]

where the illuminating intensity is I0 and the polarizer is assumed to be perfect, leading to 50 % of the light being transmitted at the half-wave plate condition Δn·d = λ/2. The tilt in both states is low. The transmission of the 2π state is given by the generalized expression for a twisted birefringent layer (Elston and Benzie 2011):

\[ \frac{T}{I_0} = \left[\cos\!\left(\phi\sqrt{1+a^2}\right)\cos(\phi+\varphi_1-\varphi_2) + \frac{\sin\!\left(\phi\sqrt{1+a^2}\right)\sin(\phi+\varphi_1-\varphi_2)}{\sqrt{1+a^2}}\right]^2 + \frac{a^2}{1+a^2}\,\sin^2\!\left(\phi\sqrt{1+a^2}\right)\cos^2(\phi-\varphi_1-\varphi_2) \qquad (2) \]

where a = πΔn·d/(φλ) is related to the twist angle φ of the director from the input to the output substrate, and φ1 and φ2 are the input and output polarizer directions with respect to the input director, respectively. Given that the polarizers and retardation are set to maximize the transmission of the 0-twist state (i.e., φ1 = 0°, φ2 = 90°, and a = 1/4), Eq. 2 shows that the 2π state has a transmission given by

\[ \frac{T}{I_0} = \frac{16}{17}\sin^2\!\left(2\pi\sqrt{\frac{17}{16}}\right) = 0.035 \qquad (3) \]

and hence appears dark. The chromaticity of this dark state is reduced when the device is operated in a reflective mode, because both states have a low tilt. The resulting displays have an excellent viewing angle characteristic, with contrast approaching 30:1 over a 100° viewing cone. For a typical liquid crystal with Δn = 0.14, these conditions are met for a cell gap of close to 2 μm. The time taken for the director to respond to the change in electric field is related to the anisotropic viscosities. Ignoring backflow effects, the optical response times are the same as for a conventional TN and can be related to the rotational viscosity γ1 through the simple expressions:

\[ t_{on} = \frac{\gamma_1 d^2}{\epsilon_0\,\Delta\epsilon\left(V^2 - V_{th}^2\right)}, \qquad t_{off} = \frac{\gamma_1 d^2}{\epsilon_0\,\Delta\epsilon\, V_{th}^2} \qquad (4) \]
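The leakage value in Eq. 3 can be checked numerically from Eq. 2; the sketch below assumes only the conditions stated above (φ1 = 0, φ2 = 90°, a = 1/4):

```python
import math

def twisted_transmission(phi, a, p1, p2, I0=1.0):
    # Eq. 2: transmission of a twisted birefringent layer between polarizers
    X = phi * math.sqrt(1 + a**2)
    term1 = (math.cos(X) * math.cos(phi + p1 - p2)
             + math.sin(X) * math.sin(phi + p1 - p2) / math.sqrt(1 + a**2))
    term2 = (a**2 / (1 + a**2)) * math.sin(X)**2 * math.cos(phi - p1 - p2)**2
    return I0 * (term1**2 + term2)

T = twisted_transmission(2 * math.pi, 0.25, 0.0, math.pi / 2)
print(f"2*pi-state leakage: {T:.3f}")  # 0.035, matching Eq. 3
print(f"Eq. 3 closed form:  {16 / 17 * math.sin(2 * math.pi * math.sqrt(17 / 16))**2:.3f}")
```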

Both the field-driven “on” response time and the field-independent “off” time strongly depend on cell gap.


A typical material with Δε ≈ 20 and γ1 ≈ 0.2 Pa·s has a total optical response time below 5 ms. Together with line-address times below 100 μs, a QVGA 240 × 360 pixel device could be updated at 60 Hz.
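An order-of-magnitude check of Eq. 4 with the material values quoted above follows; the threshold and drive voltages are assumed for illustration:

```python
# Order-of-magnitude estimate of Eq. 4 response times.
eps0 = 8.854e-12  # electric constant (F/m)
delta_eps = 20.0  # dielectric anisotropy, from the text
gamma_1 = 0.2     # rotational viscosity (Pa*s), from the text
d = 2.0e-6        # cell gap (m), from the half-wave estimate above
V, V_th = 5.0, 1.0  # drive and threshold voltages (V), assumed

t_on = gamma_1 * d**2 / (eps0 * delta_eps * (V**2 - V_th**2))
t_off = gamma_1 * d**2 / (eps0 * delta_eps * V_th**2)
print(f"t_on  = {t_on * 1e3:.2f} ms")   # ~0.19 ms
print(f"t_off = {t_off * 1e3:.2f} ms")  # ~4.5 ms, consistent with "below 5 ms"
```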

Other Early Bistable Nematic Displays

During the same period that Berreman and Heffner were developing the 0–2π BTN mode, researchers at the same laboratories were investigating bend-splay bistable modes (Boyd et al. 1980; Thurston et al. 1980), where the director latches between "vertical" (V) and "horizontal" (H) states. The upper and lower internal surfaces of a nematic LCD were patterned with alternating high-tilt stripes of +θS and −θS, as shown in Fig. 2. The boundaries between the different surface conditions produced pinning sites for ±½ surface disclinations if the pre-tilt was between 22.5° and 67.5°. Unlike the 0–2π BTN described in the previous section, this device was truly bistable, because the two textures are topologically distinct and separated by an energy barrier. Transitions between the states require the movement of the defects from one stripe to the next. Latching between the states was induced using interdigitated electrodes on the top and bottom surfaces, where each electrode coincided with the boundary between alternating alignment. A vertical electric field could be applied between the upper and lower surfaces, and a horizontal field applied using adjacent electrodes on the same surface. These fields couple to the positive Δε of the material to induce the V and H states, respectively. Optimum behavior was found where the width of each stripe s was the same as the cell spacing d. For d = s = 50 μm, switching occurred at 70 V with a duration of 20 ms for the transition from V to H and 80 ms for H to V. Optical contrast was produced using a pleochroic dye doped into the nematic liquid crystal. Rather than using monostable surfaces, an alternative approach is to create alignment layers that are inherently bistable and possess two or more favored alignment orientations. This has a number of potential advantages, such as the ability to induce bistable behavior in standard nematic mixtures, insensitivity of the written image to pressure-induced flow, and greater device design freedom. The first bistable surface alignment layer was formed by evaporating SiOx at a precisely controlled thickness and angle to impart two tilted and a single un-tilted alignment state, each of which imparted different azimuthal orientations to a contacting nematic (Monkade et al. 1987; Barberi et al. 1989). Evaporated layers are not suited to large-scale manufacturing due to the difficulties of batch processing and variations of the evaporation angle over large areas. Instead, Bryan-Brown et al. (Bryan-Brown et al. 1994) used a bi-grating to impart azimuthally bistable orientations at 45° to the modulation directions. One of the modulations was blazed substantially, to induce a high director pre-tilt in that state. More recently, azimuthal bistability has been demonstrated using bi-gratings with square features or pillars (Yi et al. 2009), square wells (Tsakonas et al. 2007), and compartments with sawtooth-shaped sidewalls (Lasak et al. 2009).

[Diagram labels: splay "H" and bend "V" states (topologically distinct); disclinations; interdigitated electrodes; alternating pre-tilt stripes +θs and −θs of width s.]

Fig. 2 The splay/bend mode


A number of latching mechanisms were proposed for use with the azimuthally bistable surfaces, including in-plane electrodes and chiral ions. One effective device configuration used the flexoelectric polarization inherent to nematic liquid crystals to induce in-plane latching between azimuthally bistable states using a transverse electric field (Barberi et al. 1992), as shown in Fig. 3. Opposing bistable surfaces are arranged so that the tilted state on one surface is aligned with the un-tilted state on the other. This leads to two states with opposite senses of splay and bend. The resulting flexoelectric polarization is then either in an "up" or a "down" state. Each pixel is addressed by first applying a pulse sufficient in strength to break the surface anchoring, followed by a DC pulse of appropriate polarity to latch the desired state. Latching occurred for pulses of 50 μs duration above threshold fields of 16 V/μm, demonstrated in devices with cell gaps of 1, 2, and 4 μm. However, this type of device was not successfully commercialized due to issues associated with ionic response and image sticking.

Ferroelectric Liquid Crystals and the τVmin

Such fast response and addressing times still fell short of what is needed for full-color television displays driven by a passive matrix. Arguably, the most ambitious market targeted by any passive matrix bistable display is large area HDTV. In the mid-1990s, a joint development program between the UK's Defence Evaluation and Research Agency (DERA), Sharp Laboratories of Europe, and Sharp Corporation of Japan (Bradshaw et al. 1997) used bistable ferroelectric liquid crystals (FLCs) operating in the τVmin mode (Surguy et al. 1991; Jones et al. 1993) to produce 6″ (Bradshaw et al. 1997) and 17″ diagonal (Itoh et al. 1998; Koden 1999) prototypes aimed at meeting the HDTV specification. The requirement of 1,080 interlaced lines driven at a 60 Hz frame rate to show 16.8 million colors (256 gray levels) was challenging for a passive matrix display. Eight bits of gray were achieved by combining two bits of spatial dither on the columns (weighted 1:2) with four temporal bits (weighted 1:4:16:64). Even with the use of interlaced lines, achieving a 60 Hz frame rate on a 1,920 × 1,080 panel sets the target line-address time at 15.4 μs and the optical response time target below 4 ms. Operation was required across the temperature range 0–60 °C, and achieving a very high contrast ratio in excess of 150:1 was critical. Such performance could only be achieved through the combination (Jones et al. 1993) of high dielectric biaxiality ferroelectric liquid crystal mixtures operating close to the minimum response time, using monopolar addressing schemes and the C2 chevron alignment, as described below.

[Plan and side views: SiOx bistable surface; tilted state T1 and un-tilted state T2 at 45° to each other; surface pre-tilt θs; polarizations PH and Pf; applied field E; top and bottom substrates.]

Fig. 3 Azimuthal bistable display. (a) Bistable SiOx surface and (b) arrangement for azimuthal bistable display latching with a transverse field (Barberi et al. 1992)


FLCs had been known to provide very fast bistable optical shutters following the original work of Clark and Lagerwall in 1980 (Clark and Lagerwall 1980). Devices were formed by cooling into the ferroelectric chiral smectic C (SmC*) phase from a smectic A (SmA) sample with layers aligned normal to the cell walls, as shown in Fig. 4. The sample spacing was arranged to be sufficiently thin to unwind the helical nature of the chiral nematic and smectic C* phases and to operate as a switchable half-wave plate optical shutter in the ferroelectric phase. X-ray studies (Rieker et al. 1987) showed that the smectic C layers tilt into a symmetrical chevron structure, with a layer tilt angle δC typically between 80 % and 90 % of the smectic C cone angle θC. Bistability results from the orientation of the director at the chevron interface: The liquid crystal director n is continuous across the interface, but constrained to either of two in-plane orientations βi at the interface, given by

\[ \cos\beta_i = \pm\frac{\cos\theta_C}{\cos\delta_C} \qquad (5) \]

This angle is typically between 45 % and 60 % of θC. The direction of layer tilt with respect to parallel aligned surfaces is dictated by the pre-tilt and the ratio of the zenithal and azimuthal anchoring energies. Two layer orientations may form, C1 and C2 (Kanbe et al. 1991), depending on whether the layers tilt in the same or the opposite direction to the surface alignment, respectively. The C1 state is the lowest energy state where the pre-tilt θS is relatively high compared to the layer tilt, such as at temperatures close to the SmA to SmC(*) phase transition, or if the azimuthal anchoring energy is low. At intermediate pre-tilts and for high azimuthal anchoring energies, the C2 state forms a few degrees centigrade below the phase transition. If low tilts are used, both C1 and C2 are formed and the sample is covered with unwanted zigzag defects. The quiescent state director profile is determined both by the chevron interface (Eq. 5) and by the surface alignment. For typical layer tilt angles, the out-of-plane tilt can be ignored and the director profile is approximately triangular from one surface to the other (Anderson et al. 1991), as shown in Fig. 4. Incident light is extinguished for one of the domains when the device is set between crossed polarizers at the angle βext to the rubbing directions, given by (Anderson et al. 1991)

Fig. 4 Ferroelectric liquid crystal alignment on cooling through the sequence N(*)–SmA–SmC(*) for parallel aligned surfaces, rubbing direction r. C1 and C2 chevron layer textures with low and high tilt and triangular director profiles (TDPs) are shown


"

rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi# 1 tan ðbi  bs Þ 1 þ a2 4 rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi tan 2ðbext  bS Þ ¼ 1 1 þ a2 4

(6)

where a = πΔn·d/(φλ), which, as for Eq. 2, now includes the director twist, given by φ = β_i − β_S, and the effect of the small out-of-plane tilt θ is ignored. For thick cells operating close to the full-wave plate condition (Δn·d ≈ λ), the extinction angle β_ext diverges and is highly wavelength dependent. At long wavelengths or low cell gaps, Eq. 6 predicts that the extinction angle tends toward

$$\beta_{\mathrm{ext}} \rightarrow \frac{\beta_i + \beta_S}{2} \qquad (7)$$

as shown in Fig. 5 for an FLC with β_S = 25° (such as for a high pre-tilt C1 state), β_S = 12.5° (i.e., equivalent to β_i, such as would occur if the azimuthal anchoring energy is negligible), and β_S = 0° (such as for the C2 case where θ_S ≤ θ_C − |δ_C|). For such thin samples, β_ext is approximately wavelength independent and equal to the so-called FLC memory angle β_m. When the polarizers are oriented at β_m, the transmission for the other domain is then approximately given by

$$\frac{2T}{I_0} = \sin^2(4\beta_m)\,\sin^2\!\left(\frac{\pi\Delta n\,d}{\lambda}\right) \qquad (8)$$

Canon launched the first commercial bistable LCDs as computer monitors in the early 1990s (Kanbe et al. 1991; Terada et al. 1993) using FLCDs operating in the C1 geometry. The C1 texture with high β_S, and therefore high memory angle, was achieved using a surface pre-tilt of about θ_S = 18°. Displays were 1,024 × 1,280 and either 15″ (Kanbe et al. 1991) or 21″ (Terada et al. 1993) diagonal, operating at between 65 and 150 μs/line, depending on the temperature. Page update rates of 15 Hz were attained by driving two of the 1,020 interlaced rows simultaneously. Any flicker was made less noticeable by scanning each page eight times while addressing every eighth row pair. Even so, this relatively slow frame rate was not suitable for displaying cursor movement. Hence, the addressing also included the ability to partially update the screen by addressing a limited set of rows at the same time. For example, a 32 × 32 pixel cursor could readily be addressed at 100 Hz.
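The wavelength dependence of Eq. 6 and the brightness penalty of Eq. 8 can be illustrated numerically. A minimal sketch, assuming the Fig. 5 parameters (Δn·d = 260 nm, β_i ≈ 12.3° from Eq. 5) and β_S = 0 for the C2 case:

```python
import numpy as np

# Eqs. 6-8 evaluated for the Fig. 5 parameters (a sketch): Dn.d = 260 nm,
# theta_C = 22.5 deg, delta_C = 19 deg, so beta_i ~ 12.3 deg (Eq. 5).
dnd = 260e-9
beta_i = np.radians(12.3)

def beta_ext(lam, beta_s):
    """Extinction angle (rad) from Eq. 6 for a triangular director profile."""
    phi = beta_i - beta_s                 # director twist, surface to interface
    a = np.pi * dnd / (phi * lam)         # a = pi.Dn.d/(phi.lambda)
    root = np.sqrt(1 + a**2 / 4)
    return beta_s + 0.5 * np.arctan(np.tan(phi * root) / root)

for lam in (450e-9, 550e-9, 650e-9):
    print(f"{lam*1e9:.0f} nm: beta_ext = {np.degrees(beta_ext(lam, 0.0)):.1f} deg")
# -> slowly varying ~7-9 deg, approaching beta_i/2 = 6.2 deg at long wavelengths

# Eq. 8: transmission with polarizers at the memory angle; the low C2 memory
# angle gives a value well below the sin^2(4 x 22.5 deg) = 1 optimum.
beta_m, lam = np.radians(12.3), 550e-9
T = np.sin(4 * beta_m)**2 * np.sin(np.pi * dnd / lam)**2
print(f"2T/I0 = {T:.2f}")
```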

Fig. 5 Optical extinction angle β_ext for the chevron quiescent state with the TDP, for samples with θ_C = 22.5° and δ_C = 19° (Δn·d = 260 nm) and for different surface azimuthal angles β_S = 0°, 12.5°, and 25°


Operating in the C2 chevron geometry with β_S = 0 as shown in Fig. 4 has a number of advantages for displays. In particular, latching between the states involves no reorientation of the director at the surfaces, being mediated by the transition at the chevron interface only, allowing extremely fast response times to be achieved. However, Eq. 7 and Fig. 5 show that the memory angle is significantly less than the β_m = 22.5° that is required for maximum brightness efficiency from Eq. 8. The display in applications such as computer or television monitors is continually updating, and unaddressed rows always have the data waveform applied. This voltage couples to the dielectric biaxiality inherent to smectic C(*) liquid crystals to increase the extinction angle back toward the 22.5° optimum, an effect termed "AC stabilization." Ignoring any effects due to viscous backflow and considering elastic torques for changes to the orientation of the director around the cone φ_C only, the response to an applied electric field is given by (Brown et al. 1997)

$$\begin{aligned}
\gamma_1 \sin^2\theta_C \frac{\partial\varphi_C}{\partial t} = {} & \left(B_1\sin^2\varphi_C + B_2\cos^2\varphi_C\cos^2\delta_C + B_3\sin^2\delta_C - B_{13}\sin 2\delta_C \sin\varphi_C\right)\frac{\partial^2\varphi_C}{\partial z^2} \\
& + \frac{1}{2}\left[(B_1 - B_2)\cos^2\delta_C\sin 2\varphi_C - B_{13}\sin 2\delta_C\cos\varphi_C\right]\left(\frac{\partial\varphi_C}{\partial z}\right)^{\!2} \\
& + P_S E_z \cos\delta_C\sin\varphi_C - \epsilon_0 E_z^2\,\partial\epsilon\,\sin\varphi_C\cos\varphi_C\cos^2\delta_C \\
& - \frac{1}{4}\epsilon_0 E_z^2\,\Delta\epsilon\left(\cos\varphi_C\sin 2\theta_C\sin 2\delta_C - 2\sin 2\varphi_C\cos^2\delta_C\sin^2\theta_C\right)
\end{aligned} \qquad (9)$$

where Δε is the uniaxial dielectric anisotropy (= ε₃ − ε₁); ∂ε is the dielectric biaxiality (= ε₂ − ε₁); B₁, B₂, B₃, and B₁₃ are the elastic constants associated with two-dimensional distortions of φ_C for uniform layers; and γ₁ is the rotational viscosity. At high field strengths, or at frequencies too high to cause ferroelectric switching, the effect of the dielectric terms in E_z²Δε and E_z²∂ε dominates. The AC stabilizing effect of the dielectric anisotropies is similar to the switching effect in nematic liquid crystals: The negative Δε tends to reduce any out-of-plane tilt and stabilize the condition given by Eq. 5. The dielectric biaxiality is positive and tends to stabilize either φ_C = 0° or 180°, where the in-plane component of the director β₀ and out-of-plane tilt ζ₀ are given by

$$\beta_0 = \pm\tan^{-1}\!\left(\frac{\tan\theta_C}{\cos\delta_C}\right); \qquad \zeta_0 = \pm\sin^{-1}\!\left(\sin\delta_C\cos\theta_C\right) \qquad (10)$$

Typically, β₀ is a couple of degrees greater than the cone angle, and ζ₀ a couple of degrees lower than the layer tilt angle. It is clear that the uniaxial and biaxial anisotropies act in opposition for chevron geometries. The electrostatic energy is a minimum at a director orientation given by (Jones and Raynes 1992)

$$\sin\varphi_{AC} = \frac{\Delta\epsilon\sin\theta_C\cos\theta_C\tan\delta_C}{\Delta\epsilon\sin^2\theta_C - \partial\epsilon} \qquad (11)$$

Measurements of the biaxial permittivities (Jones and Raynes 1992) show that ∂ε > Δε sin²θ_C, and hence the director tends toward the conditions of Eq. 10. Effective use of AC stabilization requires FLC mixtures with high dielectric biaxiality to be used (Jones et al. 2000). Moreover, the dielectric biaxiality plays a yet more important role in the latching mechanism of FLCDs operating in the τVmin mode.
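The AC-stabilized orientations of Eqs. 10 and 11 can be illustrated with a short sketch; the dielectric values below are assumed for illustration and are not taken from the text:

```python
import numpy as np

# AC-stabilized orientations (Eqs. 10 and 11) for illustrative C2 parameters
# (a sketch; the material values are assumed, not from the text).
theta_C, delta_C = np.radians(22.5), np.radians(19.0)
d_eps = 1.0        # dielectric biaxiality, positive
D_eps = -1.5       # uniaxial dielectric anisotropy, negative

# Eq. 10: in-plane angle and out-of-plane tilt stabilized by the biaxiality
beta0 = np.degrees(np.arctan(np.tan(theta_C) / np.cos(delta_C)))
zeta0 = np.degrees(np.arcsin(np.sin(delta_C) * np.cos(theta_C)))
print(f"beta0 = {beta0:.1f} deg (cone angle {np.degrees(theta_C):.1f} deg)")
print(f"zeta0 = {zeta0:.1f} deg (layer tilt {np.degrees(delta_C):.1f} deg)")

# Eq. 11: orientation minimizing the total electrostatic energy
s = D_eps * np.sin(theta_C) * np.cos(theta_C) * np.tan(delta_C) \
    / (D_eps * np.sin(theta_C)**2 - d_eps)
print(f"phi_AC = {np.degrees(np.arcsin(s)):.1f} deg")
```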



Fig. 6 Ferroelectric liquid crystal latching between "up" and "down" states. (a) Latching in the C1 chevron state with high pre-tilt. (b) Potential model for latching at the chevron interface. (c) Latching in the C2 chevron state with low pre-tilt

Unlike AC stabilization, the DC ferroelectric response is polarity dependent: An applied field of the appropriate polarity tends to reorient the spontaneous polarization toward the opposite side of the cone, as shown in Fig. 6. The resulting high gradient of director orientation close to the chevron interface causes a large latching torque that is eventually sufficient to cause the director to swap discontinuously from one allowed state to the other through the formation and movement of a domain wall. After removal of the field, the director relaxes back to a triangular director profile of the opposite sign to the original state. A simple one-dimensional model (Mottram et al. 1999) suggests that the latching transition is a momentary reduction of the smectic cone angle to the condition θ_C = δ_C, where the director has a single orientation at the interface and the director swaps continuously between the states (Fig. 6). At low DC field strengths, the coupling to the spontaneous polarization is far stronger than the dielectric effect, and the pulse width τ required for latching increases linearly with 1/E_z. At higher fields, the dielectric terms of the switching equation acting in E_z² increasingly oppose the ferroelectric latching torque P_S × E. This causes the latching response time to deviate from the 1/E_z behavior and begin to slow its rate of increase. Eventually, a minimum response time is reached, the so-called τV minimum, above which the response slows rapidly with increasing field, diverging at the field where the dielectric and ferroelectric torques balance. Numerical modeling shows that the τV minimum occurs at about 60–64 % of this divergence field (Jones et al. 1991) and is approximately given by

$$V_{\min} \approx 0.62\,\frac{P_S\,d}{\epsilon_0\cos\delta_C}\left\{\frac{1}{\left(\Delta\epsilon\sin^2\theta_C - \partial\epsilon\right)^{2/3} - \left(\Delta\epsilon\cos\theta_C\sin\theta_C\tan\delta_C\right)^{2/3}}\right\}^{3/2} \qquad (12)$$

with the minimum response time approximately



$$\tau_{\min} \approx \frac{\gamma_1 \sin^2\theta_C\,\epsilon_0\left(\partial\epsilon - \Delta\epsilon\sin^2\theta_C\right)}{P_S^2} \qquad (13)$$

The steep response above the minimum has the potential for high contrast ratios and wide operating windows. Addressing of τVmin FLCDs is done using monopolar row waveforms, such as in the JOERS/Alvey scheme (Surguy et al. 1991). Unlike conventional addressing, it is the lower voltage resultant |V_s − V_d| that causes latching above the minimum rather than the higher voltage |V_s + V_d|: The display operates in an inverted mode. A two-slot monopolar strobe pulse (0, V_s) applied to the rows combines synchronously with select data (−V_d, +V_d) and nonselect data (+V_d, −V_d) to produce the resultants (+V_d, V_s − V_d) and (−V_d, V_s + V_d), respectively, as shown in Fig. 7a. This scheme gives both a fast response and wide discrimination between select and nonselect resultants even for low data voltages. This is because the two portions of the resultant pulses act in unison: The select pulse V_s − V_d is immediately preceded by +V_d, which helps latching, but the nonselect pulse V_s + V_d is preceded by a pulse of the opposite sign −V_d, which hinders switching and thereby improves discrimination. Contrast this with a bipolar strobe pulse, where a trailing select pulse is always preceded by a high voltage that slows the response, yet the nonselect pulse is

Fig. 7 The FLC τVmin electrooptic latching characteristic: (a) The principles of τVmin addressing using the JOERS/Alvey scheme for the commercial mixture SCE8. (b) Example response for a low viscosity, high dielectric biaxiality mixture (Slaney et al. 1997)


preceded by a lower pulse, and hence discrimination is reduced. The JOERS/Alvey addressing scheme uses a simultaneous blanking pulse several lines ahead of the addressed row to ensure that the correct state is obtained while minimizing the total frame time. Figure 7b shows experimental results for a fast FLC mixture operating in the C2 geometry (Slaney et al. 1997). Assuming each row uses two slots to DC balance the data waveform, the target latching time for HDTV operation is 7.7 μs when operating with conventional 40 V STN drivers. Clearly, the fast response meets the target for high temperatures. However, achieving the response time target at lower temperatures required the use of multiple slot DRAMA schemes (Anderson et al. 1997) to increase the discrimination, combined with strobe waveforms extended into the following rows (Hughes and Raynes 1993). In this fashion, HDTV performance was obtained across the 0 °C to 60 °C operating temperature range (Koden 1999).
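The JOERS/Alvey resultants and the HDTV slot-time target can be reproduced with a few lines; the strobe and data voltages below are assumed for illustration:

```python
# JOERS/Alvey two-slot resultants and the HDTV slot-time target (a sketch;
# the strobe and data voltages are assumed illustrative values).
Vs, Vd = 40.0, 5.0

strobe = (0.0, Vs)
select_data = (-Vd, +Vd)    # applied to the columns
nonselect_data = (+Vd, -Vd)

resultant = lambda data: tuple(s - c for s, c in zip(strobe, data))
print("select resultant:   ", resultant(select_data))     # (+Vd, Vs - Vd)
print("nonselect resultant:", resultant(nonselect_data))  # (-Vd, Vs + Vd)

# Two slots per line for DC balance give the slot-time target quoted above:
slot_us = 1e6 / (60 * 1080 * 2)
print(f"slot time target: {slot_us:.1f} us")               # ~7.7 us
```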

Ultralow Power Reflective Mode Displays

Introduction

By the time passive matrix bistable FLC displays reached the target for full-color video imagery, monostable active matrix devices driven by TFTs were being successfully deployed in laptops and computer monitors. Since the turn of the millennium, bistable LCD developments have been concentrated on portable products, where low power and low cost are essential requirements. Such devices operate in a reflective mode, and a crucial part of the display design is ensuring that excellent reflectivity, high contrast, and wide viewing angles are achieved for the bistable quiescent states. Often, the application requires the display to be updated infrequently, and speed of update is only a secondary consideration. Operating voltages might be as high as 40 V, since the total energy consumed is small when the updates are infrequent. However, standard components and manufacturing methods should be used to ensure that costs are minimized. Other properties that may be advantageous for some applications include stability of the image to shock, wide operating temperature ranges, very high resolution, or the provision of inherent gray scales. Several bistable LCD modes have been considered or used in commercial products requiring low power reflective displays. These applications range from watch and label displays to large displays for electronic readers. The technologies and their strengths and weaknesses are summarized in the following two sections. This section concentrates on glass-based displays used for portable products, whereas the following section looks at plastic substrates and the provision of color.

Surface-Stabilized Ferroelectric Liquid Crystals

FLC displays operating in the C1 chevron geometry can be designed to give good extinction with a memory angle close to the ideal β_m = 22.5° using alignment surfaces with high pre-tilt. This ensures that the contrast is retained after the removal of the addressing signals. Recently, Citizen (Kondoh et al. 2005; Amakawa and Kondoh 2008) has developed displays using obliquely evaporated SiOx surface alignment with high pre-tilt to produce memory angles in excess of 16°, as predicted for β_S = 25° in Fig. 5. A number of issues arise in this mode when a high ferroelectric polarization P_S is used. First, image sticking may occur, where the latching voltages shift over several hours if the image is not updated. Second, the polar SiOx surface favors the orientation of P_S into the surface, thereby tending to induce half-splayed states of the director profile which greatly reduce the contrast ratio. This tendency is counteracted if the alignment directions are crossed with respect to each other by up to 20°. Both of these effects are minimized using P_S below 16 nC cm⁻².
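As a quick check, the thin-cell limit of Eq. 7 reproduces the quoted memory angle; a one-line sketch (β_i from Eq. 5 with the Fig. 5 cone and layer tilt angles):

```python
# Eq. 7 estimate of the C1 memory angle for the high pre-tilt SiOx alignment
# (a sketch; beta_i from Eq. 5 with theta_C = 22.5 deg, delta_C = 19 deg).
beta_i, beta_S = 12.3, 25.0
beta_m = (beta_i + beta_S) / 2
print(f"memory angle ~ {beta_m:.1f} deg")   # > 16 deg, consistent with the text
```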



The principal application for this technology was watches: Not only is ultralow power essential, but latching below 5 V and high resolutions of up to 1,000 dpi were also required. FLCDs are perhaps the only bistable LCD capable of operating at such low voltages. Of course, the use of the C1 texture, low voltage, and low spontaneous polarization combine to make the update slower than for other FLCDs, such as those described in the previous section. However, the attractive features offered by this technology also suited other small display applications, such as electronic shelf-edge labels, instrumentation, and electronic dictionaries. For each of these applications, it is important to shield the ferroelectric liquid crystal from shear caused by mechanical pressure on the display. Such shock would permanently damage the smectic layer alignment due to viscous flow in the plane of the cell, thereby disrupting the optical appearance. Photolithographically defined spacers, overcoated with adhesive to fix the upper and lower substrates to each other, were used to prevent shearing between the substrates and therefore ensure the aligned smectic layers were protected from damage. A different approach was taken by Sharp Laboratories of Europe, who targeted high-resolution displays for electronic paper and e-reader applications (Ulrich et al. 2001). Reflective displays using two external polarizers are limited to about 200 dpi due to image shadowing. This effect is caused by parallax, since the image plane and the reflector are separated by the rear glass substrate. Higher-resolution displays require a single-polarizer mode to be used, thereby allowing the rear substrate to include an internal reflector. In such instances, the image is formed at the front polarizer and all parallax is removed. Achromatic black-and-white states are achieved by combining the FLC acting as a switchable half-wave plate with a fixed quarter-wave plate in front of the reflector. The optimum configuration requires the memory angle between the FLC states to be 22.5° (i.e., β_m = 11.25°), with the polarizer oriented at +7.5° to the rubbing directions and +75° to the slow axis of the wave plate. This condition most closely matches the director profile obtained with the C2 layer orientation. A 300 × 300 pixel, 1.6 μm spaced device was fabricated operating in the τVmin mode. Application of strobe and data voltages of V_s = 15 V and V_d = 3 V, respectively, resulted in line-address times of 100 μs.

Scattering Smectic A

Another early bistable LCD that is receiving renewed attention is the bistable smectic A (Coates et al. 1978; Crossland and Canter 1985). The device is shown schematically in Fig. 8: It latches between a scattering focal-conic texture that appears white due to backscattering and a transmissive homeotropic state that appears black due to an absorbing layer coated onto the rear substrate. These states are separated by an energy barrier associated with movement of the smectic layers. The smectic layer normal and

Fig. 8 The bistable smectic A scattering device, latching between a backscattering focal-conic texture and a transmissive homeotropic texture over a black absorber


director of a positive Δε material are aligned parallel to an applied electric field in a transition analogous to the nematic Fréedericksz transition. Unlike the nematic case, the applied field both compresses the layers and induces splay of the n director to cause undulations. The threshold voltage for the transition from planar to homeotropic is related to the cell gap d (Coates 1998):

$$V_{PH} = \sqrt{\frac{2\pi^2 k_{11}\,d}{\epsilon_0\,\Delta\epsilon\,l_A}} \qquad (14)$$

where l_A is a characteristic length related to the layer spacing. Latching to the focal-conic state relies on the negative conductivity anisotropy of the smectic phase, wherein there is a high ionic mobility in the direction parallel to the layers. Applying a low-frequency field parallel to the layer normal leads to ionic flow in the sample, which tends to create vortices in the smectic centered on surface irregularities and reorient the layers toward the cell plane. The resulting focal-conic texture consists of domains typically between 1 and 10 μm in diameter. Strong backscattering of incident light results if a highly birefringent LC material is employed, creating an attractive white state with an excellent viewing angle. The threshold voltage between the uniform homeotropic and scattering focal-conic textures is related to the conductivities σ_∥ and σ_⊥ through the relationship

$$V_{HF} = \sqrt{\frac{2\pi^2 k_{11}\,d}{\epsilon_0\,\epsilon_\parallel\left(1 - \dfrac{\sigma_\parallel}{\sigma_\perp}\right)l_A}} \qquad (15)$$

The conductivities are usually enhanced by deliberately doping the liquid crystal with mobile ionic species. At low frequencies, the ions are sufficiently mobile to respond in time to the applied field. However, the disruptive action of the ions is counteracted by the E² effect of the field coupling to the positive Δε, and the threshold V_HF quickly increases with frequency. A high cell gap of 15 μm < d < 30 μm is needed to provide sufficient scattering centers and create the attractive white state. However, this leads to a trade-off between voltage and reflectance, and typical devices require voltages in excess of 100 V. Recent demonstrations of this technology have been made by PolyDisplay ASA (Büyüktanir et al. 2006) and Halation (Tang et al. 2010). Typical room temperature operating voltages are V_s = 100 V and V_d = 50 V (Tang et al. 2010); such high voltages are supplied using plasma display panel DMOS drivers. Employing a 20 Hz page blank for the low-frequency transition to the scattering texture, a 96 × 96 pixel display is updated within 5 s. This is rather slow for many portable products, but is suitable for electronic signage. Historically, devices also suffered from poor lifetimes due to electrochemical degradation of the liquid crystal. Despite the use of costly row and column drivers and the slow update speed, the device has the advantages of a simple construction without polarizers, excellent white coloration, a wide viewing angle from the near-Lambertian properties of the scattering, and a reflectance of 50 %.
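Eq. 14 can be used to gauge the operating voltage; a minimal sketch in which k₁₁, Δε, and the characteristic length l_A are assumed values (l_A taken to be of the order of the smectic layer spacing):

```python
import numpy as np

# Eq. 14 for the planar-to-homeotropic threshold of the bistable SmA device.
# A sketch: k11, D_eps, and l_A are assumed illustrative values.
k11 = 10e-12        # splay elastic constant, N (assumed)
D_eps = 10.0        # dielectric anisotropy (assumed)
l_A = 3e-9          # characteristic length ~ layer spacing, m (assumed)
eps0 = 8.854e-12
d = 20e-6           # cell gap, m (within the 15-30 um range quoted)

V_PH = np.sqrt(2 * np.pi**2 * k11 * d / (eps0 * D_eps * l_A))
print(f"V_PH ~ {V_PH:.0f} V")   # of order 100 V, as stated in the text
```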

Weak Anchoring and the 0–π Bistable Twisted Nematic

The 0–2π BTN described in Sect. 2.2 was metastable, since the device would eventually return to a single lowest energy π-twist state. That state differed topologically from the 0- and 2π-twist states used for switching. True bistability results when the device is arranged to allow latching between the topologically distinct 0- and π-twist states (Dozov et al. 1997). From 1998 to 2010, this approach was developed by



the French company Nemoptic S.A. and the technology marketed under the trade name BiNem™ (Joubert et al. 2003). Devices are constructed with parallel aligned surfaces and the cholesteric pitch set approximately to d/P ≈ 0.25. One of the surfaces is arranged to have weak zenithal anchoring and switches to homeotropic when a field above a critical value E_C is applied (Joubert et al. 2003):

$$E_C \approx \frac{W_\theta}{\sqrt{\epsilon_0\,\Delta\epsilon\,\langle K\rangle}} \qquad (16)$$

where W_θ is the zenithal anchoring energy of the weakly anchored surface, the dielectric anisotropy Δε is positive, and ⟨K⟩ is the average elastic constant. The opposing surface is strongly anchored and usually has a small finite pre-tilt to prevent the formation of reverse tilt domains. The weakly anchored surface has zero pre-tilt, thereby ensuring that the transition to vertical alignment is first order (Dozov and Martinot-Lagarde 1998). Once the weakly anchored surface is switched vertical, latching between the bistable states is driven by the backflow of the director in a similar way to the 0–2π BTN, as shown in Fig. 9. Immediately after the field has been removed, the vertical alignment of the weakly anchored surface is at an unstable equilibrium. This means that the surface director can relax to either one of the bistable states depending on the viscous flow in the vicinity of the surface. Immediate removal of the field creates a high

Fig. 9 The 0–π BTN or BiNem™ mode


degree of backflow close to the strongly anchored upper plate. If the cell gap is sufficiently small, this couples hydrodynamically to the director at the weakly anchored plate, inducing a relaxation into the π-twist state. Alternatively, if the field decreases progressively, or through an intermediate value, the backflow is lower and the uniform 0-twist texture forms. Once the final state is selected, the surface relaxes rapidly to the un-tilted alignment, usually within 10 ms. Hydrodynamic coupling between the surfaces requires the cell gap to be less than d_C, estimated (Dozov et al. 1997) as

$$d_C < \frac{1}{2}\sqrt{\frac{k_{33}\,\gamma_1}{\rho}}\;\frac{1}{W_\theta} \qquad (17)$$

The azimuthal anchoring energy must be sufficiently high to maintain the correct twist in the 0- and π-twist states, despite the mismatch of these states with the natural twist of the chiral nematic:

$$d \gg \frac{k_{22}}{W_\beta} \qquad (18)$$

Specialized alignment polymers have been formulated (Angelé et al. 2007) that provide weak zenithal anchoring W_θ and strong azimuthal anchoring W_β, while retaining the ability to be processed using standard industrial methods. When combined with a proprietary liquid crystal mixture, the zenithal anchoring energy was typically 2 × 10⁻⁴ J m⁻² < W_θ < 4 × 10⁻⁴ J m⁻², for which d_C ≈ 2 μm and the threshold voltage is below 20 V. The alignment polymers also satisfied Eq. 18 with W_β ≈ 0.1 W_θ, ensuring that the twist deviation from the 0- and π-twist conditions was less than a couple of degrees. Optical contrast is provided using parallel polarizers oriented at 45° to the rubbing direction and the retardation set to the half-wave plate condition (Δn·d = λ/2). The 0-twist state then appears dark and the π-twist state transmissive. For the wavelength where the half-wave plate condition is met, Eq. 2 then gives

$$\frac{2T}{I_0} = \cos^2\!\left(\pi\sqrt{1 + a^2}\right) = 0.869 \qquad (19)$$
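Both the critical anchoring-breaking field of Eq. 16 and the bright-state transmission of Eq. 19 can be checked numerically; a minimal sketch in which Δε and ⟨K⟩ are assumed values:

```python
import numpy as np

# Critical field (Eq. 16) and bright-state transmission (Eq. 19) for the
# BiNem parameters quoted above (a sketch; D_eps and <K> are assumed).
eps0 = 8.854e-12
W_theta = 3e-4          # zenithal anchoring, J/m^2 (mid-range of the text)
D_eps, K_avg = 10.0, 10e-12

E_C = W_theta / np.sqrt(eps0 * D_eps * K_avg)
print(f"E_C ~ {E_C:.1e} V/m -> ~{E_C * 1.5e-6:.0f} V across a 1.5 um cell")

# Eq. 19: at the half-wave condition, the pi-twist state has a = 1/2
a = 0.5
print(f"2T/I0 = {np.cos(np.pi * np.sqrt(1 + a**2))**2:.3f}")   # 0.869
```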

Chromaticity of the bright state is readily compensated using a violet filter layer to reduce green light transmission and meet the target white color balance. Optimization of the polarizer angles and retardation gives contrast ratios of 15:1, a reflectance of 32 %, and a viewing cone of 110° for contrast over 4:1 without compensation layers. The high viewing angle is a consequence of both bistable textures having in-plane director profiles. The most attractive appearance for black-and-white reflective displays requires the interpixel gaps to be in the white state. The liquid crystal mixture was arranged with d/P slightly less than 0.25 so that the π-twist state was favored on cooling into the nematic phase. The formation of the π-twist state is also helped by the slightly thicker cell gap that occurs in the inter-pixel gaps, due to the thickness of the transparent electrode. As with any other LCD that requires such a low cell gap, high levels of cleanliness are required during production to ensure satisfactory yields. This thin cell gap, however, gives the display a fast optical response, typically below 10 ms. Nemoptic produced several prototypes to demonstrate different aspects of the technology. A 5.1″ diagonal 400 × 300 reflective color demonstrator attained a 20.5 % reflectance with a contrast of 12:1 and an NTSC color saturation of 4.5 %. It used an RGBW color filter front plate, with the color depth designed for the double pass appropriate to a reflective mode (Angelé et al. 2007). Gray scale was produced by modulating the extent to which the 0-twist texture spreads across a pixel from one of the inter-pixel gaps in what was termed a "curtain" effect. Later, a single-polarizer BiNem mode demonstrator was


produced by reducing the cell gap still further to form a switchable quarter-wave plate (Osterman et al. 2010). Good achromatic extinction of light occurred for the 0-twist state when combined with an internal half-wave plate retarder and reflector on the rear substrate, with the polarizer oriented between 8° and 15° to the rubbing direction and the slow axis of the half-wave plate set between 16° and 30°. The use of a single-polarizer mode ensured that images were free from parallax while allowing reflectances in excess of 42 % to be attained. High viewing angles were maintained using a biaxial compensator built into the achromatic half-wave plate. Recently, this type of display was combined with a TFT backplane to allow video-frame rates to be achieved for highly complex images (Joly et al. 2010).

Zenithal Bistable Nematic Displays

The bistable LCD in this category with the most market success is the Zenithal Bistable Display or ZBD™, which is being sold into the retail sector for market signage and shelf-edge labeling (Jones 2009). The device uses a submicron pitch surface relief grating to provide bistable surface alignment (Bryan-Brown et al. 1997). Other than the grating alignment layer, the device shares the construction design of the standard TN display: Thus, it combines low-cost construction with infinite multiplexibility, excellent optical properties, image retention, and the concomitant low power. Deep grating structures provide antagonism between the alignment of the top and bottom of the grooves and the sidewalls, leading to elastic deformation of the local surface director orientation. This deformation can be reduced by the formation of ±½ disclination loops at the surface, close to regions of high surface curvature. The surface is designed to give the correct degree of curvature to ensure both continuous (defect free) and defect-containing states are stable. These states are separated by an energy barrier that represents the energy required to move, create, and annihilate the defects. A typical zenithal bistable surface is provided by a homeotropic grating with a depth A_G of about 1 μm and pitch L of 0.8 μm (Jones 2008). Although the director is elastically distorted close to the varying surface for both the homogeneous and homeotropic continuous states, the deformation decays into the bulk of the cell and becomes constant over distances greater than half the pitch, as shown in Fig. 10. At this distance, the grating acts as a standard alignment surface with either high, near homeotropic, pre-tilt alignment for the continuous or C state, or low pre-tilt for the defect or D state. The D state pre-tilt θ_D is dictated by the relative position of the defects and is strongly dependent on grating shape:

Fig. 10 Zenithal bistable surface


$$\theta_D \approx 90^\circ\left(1 - \frac{2 s_D}{L}\right) \qquad (20)$$

where s_D is the distance between the +½ and −½ defects as shown in Fig. 10, L is the distance between repeating defects (equal to the pitch for a simple, periodic grating), and the pre-tilt is defined from the plane of the LCD. Usually, a pre-tilt of less than 4° is ideal, and so a near symmetric, sinusoid-like grating is preferred. The disclination lines run parallel to the grooves to form defect loops defining areas of D state. At the boundary between D and C states, the +½ and −½ defects detach from the edges to annihilate close to the surface plane. Unless pinned strongly by inherent inhomogeneities of the surface, the defect loop may extend or retract along the grooves if disturbed by external influences, such as changes in temperature, voltage, or viscous flow. Rather than rely on such random defect pinning sites, the grating includes a π phase shift, or "slip," in the groove structure every few microns along its length to form vertical convex and concave edges. These stabilize the defect state and provide barriers to unwanted defect annihilation. Such structures enable devices with wide operating windows and good shock stability to be maintained from temperatures below −20 °C to above 70 °C (Jones 2008). The energy barrier between the two states ensures that bistability occurs for a wide range of different grating shapes and aspect ratios, even if one state has a lower energy. Indeed, it is often advantageous to deliberately favor one state. The D state always forms on first cooling from the isotropic into the nematic phase. A relatively deep grating is usually chosen to ensure that this state is maintained uniformly across a device at all temperatures in areas that cannot be selectively latched (e.g., in the inter-pixel gaps). An alternative zenithal bistable surface has a locally planar condition, so that the C state has a low, near planar surface pre-tilt, and the defect state has the higher pre-tilt. If a mono-grating is used, the director simply aligns parallel to the grooves in a monostable configuration. However, if a deep bi-grating is used, the director is forced to deform elastically around the surface features, and zenithal bistability is obtained. This type of surface is utilized in the post-aligned bistable nematic device (Kitson and Geisow 2002). Numerical modeling of the latching transitions has been developed and compared favorably to experimental results in Spencer et al. (2010). The model used a two-dimensional Q-tensor formulation of the director field, to allow for the reduction of nematic order at the defect cores. As well as the bulk anisotropic viscoelastic constants, the model included terms for the flexoelectricity, finite surface anchoring, and surface viscosity. Figure 11 shows snapshots of the director field evolution for latching from C to D and D to C. Both states of a zenithal bistable surface have substantial splay and bend deformations of the director close to the surface, leading to a local flexoelectric polarization. Latching between the states occurs if a field of sufficient magnitude, duration, and correct polarity is applied. Where the local surface condition is homeotropic, the applied field couples to a positive Δε liquid crystal, nucleating a ±½ defect pair on the near-vertical sidewall of each groove. If the applied transverse field is positive with respect to the grating surface, the defects separate due to the effect of the flexoelectric polarization close to the surface.
This causes the defects to move across the surface until the −½ defect is close to the convex grating top, and the +½ defect is close to the concave grating bottom, where the defects become "pinned" by the surface curvature. Application of a negative field of sufficient impulse causes the defects to detach from the grating edges and move toward each other across the surface until they annihilate and form the C state. These transitions are reversed if the local surface orientation is planar, and latching then requires a negative Δε liquid crystal material. Addressing signals for practical devices use bipolar pulses, with the polarity of the trailing pulse defining the final state. In these instances, the elastic deformation close to the grating is increased by the RMS effect ΔεE² of the first portion of the signal. A simple analytical model for a zenithal bistable device has also been derived (Davidson and Mottram 2002). This uses a surface polarization and a critical surface torque for discontinuous changes of surface



Fig. 11 Numerical simulation of (a) C to D and (b) D to C transitions (Spencer et al. 2010)

pre-tilt. Simplifying this treatment further (Spencer et al. 2010) leads to an expression for the latching voltage from the C to D states:

$$V_{CD} = \left[\frac{\gamma_1\,l_s\,d}{(e_{11}+e_{33})\,\tau} + \frac{2W_\theta\,d}{(e_{11}+e_{33}) + \sqrt{\epsilon_0\,\Delta\epsilon\,K_{33}}}\right]\left[1 + \frac{A_G}{d + h_u}\cdot\frac{\epsilon_\parallel(\epsilon_\parallel - \epsilon_g)\,w}{\epsilon_g\left[(\epsilon_g - \epsilon_\parallel)w + \epsilon_\parallel L\right]}\right] \qquad (21)$$

where the pulse duration is τ, the cell gap is d, the bend and splay flexoelectric coefficients are e₁₁ and e₃₃, the bulk twist viscosity is γ₁, the surface viscosity is l_s, and the zenithal anchoring energy is W_θ. The expression includes terms for the dielectric effect of the grating, approximated to a rectangular shape of amplitude A_G, offset h_u, dielectric constant ε_g, full-width half maximum w, and pitch L, as defined in Fig. 10. The form of this expression fits both numerical simulation and experimental results well, although the grating terms in this expression relate to the dielectric effect of the grating only and do not express the effect of shape differences on the defect dynamics. Both numerical simulation (Spencer et al. 2010) and experimental data (Jones and Amos 2011) give strong direct relationships between the latching voltages, grating pitch L, groove width (1 − w/L), and zenithal anchoring energy W_θ. A weak direct relationship with the amplitude A_G is also apparent. Current commercial devices use the grating opposite a conventional rubbed polymer alignment to latch between a HAN and a TN state (Bryan-Brown et al. 1997; Wood et al. 2000), as shown in Fig. 12a. The grating is aligned with the grooves parallel to the rubbing direction on the opposing surface to give a 90° TN for the low pre-tilt D state. On latching into the high pre-tilt C state, the twist in the cell is removed and the hybrid-aligned HAN is formed. The device is sandwiched between parallel polarizers with a diffusive rear reflector to produce a reflective display operating in the normally white TN mode. The best optical performance uses the grating on the front surface with the polarizer parallel to the grating grooves, and the material used to form the grating is index matched to the ordinary refractive index of the liquid crystal mixture to prevent diffractive losses. The device is operated either at the first Gooch–Tarry minimum or halfway between the first and second minima (Jones and Bryan-Brown 2010) to give good white balance.
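The Gooch–Tarry condition referred to here places the leakage minima of a 90° TN at √(1 + u²) equal to an even integer, with u = 2Δn·d/λ; a short sketch locates the first two minima:

```python
import numpy as np

# Gooch-Tarry minima for the 90-degree TN state of the ZBD (a sketch).
# Minima occur when sqrt(1 + u^2) is an even integer, with u = 2.Dn.d/lambda.
lam = 550e-9
for k in (1, 2):
    u = np.sqrt(4 * k**2 - 1)
    dnd = u * lam / 2
    print(f"minimum {k}: Dn.d = {dnd * 1e9:.0f} nm")
# -> ~476 nm and ~1065 nm at 550 nm; operating at the first minimum or
#    midway between the two gives the good white balance referred to above.
```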


Fig. 12 The ZBD® display. (a) Schematic of the device operating in the HAN–TN mode. (b) Photograph of a 480 × 360 pixel, 100 dpi, 6.8″ diagonal ZBD, designed for use in retail signage

This arrangement combines the high reflectance of the TN state with the contrast and viewing angle allowed by the TN–HAN combination. The excellent appearance is achieved without the need for compensation layers, and the tolerance on the 7 μm cell gap is lenient (typically ±0.25 μm rather than the ±0.05 μm of STN). ZBD latching occurs for pulse magnitudes of several volts and durations around 100 μs, though the optical change associated with the transition to the D state may take 100 ms. This means that the display is readily addressed using conventional STN drivers, and the image typically takes a few hundred milliseconds to update. Fabrication of the ZBD device is done on a conventional TN LCD production line with negligible equipment outlay. It is inherently low cost, using standard components for TN and STN displays, but without the need for strict tolerances or optical compensation foils. Low-cost fabrication of the grating is done by embossing the structure into a homeotropic photopolymer using an inverse of the desired grating shape on a carrier film. This film is fabricated by first copying a photolithographically defined grating master into nickel, using sputtering and electroforming, which is then used to form the grating into a resin on a PET backing. The process is designed to be self-patterning (Bryan-Brown et al. 2009) by arranging for the homeotropic photopolymer to adhere preferentially to the resin, except for regions of the display mother glass previously coated with an adhesion promoter. The simple fabrication ensures that the device has a similar cost to standard STN but with greatly improved optical appearance, higher multiplexibility, and ultralow power consumption. The first application for this technology is retail signage (see Fig. 12b) and electronic shelf-edge labels (Jones 2009). The bistable displays have been combined with an ultralow power RF communications


protocol that allows many thousands of labels to operate continuously from two button batteries for between 5 and 7 years when updated several times each day, despite two-way communication for each label with a single transceiver in the store. These attributes are leading to rapidly increasing deployment across Europe, with over a million bistable displays already in operation. Also, this combination of an ultralow power display and communications with full graphic information content has begun to attract other sectors, including office signage and manufacturing control. A bistable surface, particularly one with an inherent polar latching mechanism, lends much flexibility to device design. The bistable surface may be used opposite a monostable homeotropic surface to create a bistable device latching between VAN and HAN states, opposite a high tilt surface to form a bistable Pi-cell, or opposite a second bistable surface to give a multi-stable device latching among VAN–HAN1–HAN2–TN states (Jones 2006). Typical devices use a periodic grating structure to ensure a uniform display with constant orientation of the director pre-tilt and azimuthal orientation of the D state. The grating may be varied over length scales much smaller than a pixel, to give analogue gray scale or multidomain structures (Bryan-Brown et al. 1998; Jones et al. 2003). However, because the formation of the defects is dictated by the shape of the grating features, bistable pre-tilts may be stabilized by single isolated features, such as step edges (Uche et al. 2005), pillars, or wells (Jones 2001) of the correct aspect ratio. Alternatively, a random or pseudorandom distribution of features may be used to deliberately vary the alignment of the director, resulting in bistable scattering devices (Jones 2001), for example. Such behavior was also found for a deep homeotropic bi-grating structure, where the defect loops swapped from peak to peak to form random patches of D state. Surface relief is not a prerequisite for producing bistable surface alignment: Two or more stable alignment states can also be induced by patterning the local surface alignment in a periodic manner. For example, bistable pre-tilts of +θ_S and −θ_S occur for a flat surface patterned to give alternating strips of homeotropic and planar homogeneous surface alignment, where the orientation of the planar alignment is parallel to the direction of the modulation (Oo et al. 2008), in a similar fashion to the earlier work (Boyd et al. 1980) described in Sect. 2.3, but with submicron length scales.

Plastic, Flexibility, and Color in Reflective Bistable Displays

The glass-based displays summarized in the previous section are beginning to realize the market potential for low power and high image quality reflective displays. In this section, the potential for flexible plastic displays is described, which promise to help create new markets not currently possible using existing display technologies. Great effort worldwide is dedicated to producing TFT-based plastic displays, either by fabricating the transistors on the plastic substrate or producing them by conventional means and then transferring them to the plastic substrate. Various demonstrators have been made, some of which are likely to make successful entrants to the market over the next year or two. However, such devices are destined to be significantly more expensive than their glass counterparts for the foreseeable future, leaving unfulfilled the requirement for applications where low-cost plastic displays are essential. Bistable displays provide the additional advantage that the image is maintained after updating. This is essential where power is remote from the display, such as for smart cards. Bistability means that a card can be produced without an onboard battery, because the image is only updated when the card is inserted into a powered read/write terminal. As passive matrix displays, bistable LCDs are ideally suited for fabrication on plastic substrates, since complex images can be displayed using passive addressing, thereby avoiding the challenges associated with fabricating thin-film transistors on plastic substrates. The technologies are all reflective displays and so do



not need backlighting. As LCDs, they are tolerant to low levels of oxygen and moisture ingress and, unlike OLED displays, do not need extensive barrier layers to be added to the plastic substrates.

Bistable LCDs Using Retardation Modulation

Plastic displays based on ferroelectric liquid crystals (Brill et al. 2002), BiNem BTN (Barron et al. 2005), and zenithal bistable displays (Bryan-Brown 2000; Jones et al. 2002; Rudin et al. 2009) have all been demonstrated. Because the devices use external polarizers and modulate the retardation of the liquid crystal, it is important to use substrates that do not add to the retardation effect of the electrooptic medium. Practically, this means that an optically isotropic transparent material should be chosen, such as polyethersulfone (PES). This restricts the process temperatures to a maximum of 170 °C. Conventional glass sphere spacers may not be suitable if they impact into the softer plastic substrate material. Instead, photolithographically defined spacers are used, which may be either cylindrical (Barron et al. 2005; Bryan-Brown 2000) or walls (Rudin et al. 2009). All bistable displays may lose the image if exposed to sufficient shear between the two substrates, but for ferroelectric liquid crystals, the resulting alignment damage can be permanent. To meet the requirements for flexing of a smart card, a segmented FLCD was designed wherein all of the inactive areas outside the latching segments were used as spacers (Brill et al. 2002), thereby furnishing the device with the maximum protection against bending. The design of many of the other components for the display also depends on the degree of flexibility required for the application. Although insensitive to oxygen and moisture ingress, the substrates should still be coated with thin (…)

Fig. 13 Basics of bistable cholesteric display (BCD) operation


A schematic of the BCD operation is shown in Fig. 13. The cholesteric material is contained between electrode-bearing substrates; the front substrate is transparent, but the rear is opaque and absorbent (often black). In the planar state, up to 50 % of light in the wavelength band Δλ is reflected to appear colored. The focal-conic state is weakly forward scattering, so most of the light incident on areas of this texture is transmitted and absorbed by the black coating. This optical contrast is achieved without the use of separate polarizers, thereby simplifying device construction. The example display shown in Fig. 14a uses a cholesteric with the pitch tuned to reflect yellow wavelengths, to give a yellow and black appearance of the image. Alternatively, the rear of the display might be coated with a blue material that absorbs transmitted yellow light, but reflects the blue light unaffected by the cholesteric helicity. In this instance, the reflecting state appears white, since the yellow light reflected by the cholesteric combines with the blue light reflected by the rear coating. The absorbing state then appears blue. Tuning the reflected wavebands by adjusting the pitch of the material and choosing an appropriate contrasting rear absorbing layer allows a variety of different two-color combinations to be offered. Simulated reflection spectra for thick and thin samples are shown in Fig. 14b for a material with Δn = 0.19 at 550 nm. The number of turns of the helix needed to fully reflect half of the incident light depends on the birefringence of the liquid crystal. Typical materials suitable for displays have 0.17 ≤ Δn ≤ 0.25, for which only 10–15 turns of the helix are needed. Given a typical average refractive index of n̄ = 1.6, a green or yellow reflection band occurs for pitches around 360 nm, giving a high reflectance with a cell gap of only 5 μm, ideal for manufacturability. Good device operation also requires materials to be chosen with no or weak temperature dependence of the pitch. Latching between the states is accomplished by first applying a field above the threshold E_H, which couples to the positive dielectric anisotropy of the material to unwind the helix and induce the homeotropic orientation (de Gennes and Prost 1993):

Fig. 14 BCD. (a) Photograph of a 6.8″ diagonal 800 × 600 BCD from Kent Displays, Inc. (b) Predicted reflectivity for 2 and 15 turns of a 350 nm pitch cholesteric in the planar state (calculated using MOUSE-LCD simulation software from HKUST and using the refractive indices of the cyanobiphenyl 5CB). (c) Threshold characteristic for the zero-field state after an applied pulse; the latching fields are found from the voltages labeled divided by the cell gap used


$$E_H = \frac{\pi^2}{P}\sqrt{\frac{K_{22}}{\epsilon_0\,\Delta\epsilon}} \qquad (24)$$

For the 5 μm cell spacing of our previous example, Eq. 24 predicts that the critical unwinding field E_H corresponds to about 30 V across the cell for a typical dielectric anisotropy of Δε = 20 and twist elastic constant of K₂₂ = 7 pN, allowing multiplexing with CMOS driver voltages. The electrooptic characteristic of Fig. 14c shows the resulting textures that form after pulses of varying applied field for a sample that starts from either the focal-conic or planar texture. The final latched state depends on the rate at which the field is removed. If the field is immediately reduced to below E_HP* (Greubel 1974; Yang and Lu 1995; Yang et al. 1997):

$$E_{HP^*} = \frac{2}{\pi}\sqrt{\frac{K_{22}}{K_{33}}}\,E_H \approx 0.42\,E_H \qquad (25)$$

then a transient planar state is formed. At the instant the field is removed, the director tilt begins to relax into the cell plane, allowing the twist to return by forming a helical conical director profile (Fig. 13). This transient state has a helix parallel to the cell normal, but with a pitch P* that is longer than the intrinsic pitch of the material, given by (Kawachi et al. 1975)

$$P^* = \frac{K_{33}}{K_{22}}\,P \approx 2P \qquad (26)$$
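Eqs. 24 and 25 can be checked against the quoted 30 V figure; a minimal sketch using the example values from the text (K₃₃ is an assumed value for the E_HP* ratio):

```python
import numpy as np

# Eqs. 24-25 for the example in the text: P = 360 nm, K22 = 7 pN, D_eps = 20,
# cell gap d = 5 um (a sketch; K33 is assumed for the E_HP* ratio).
eps0 = 8.854e-12
P, K22, D_eps, d = 360e-9, 7e-12, 20.0, 5e-6

E_H = (np.pi**2 / P) * np.sqrt(K22 / (eps0 * D_eps))
print(f"E_H ~ {E_H:.2e} V/m -> ~{E_H * d:.0f} V across the cell")  # ~30 V

K33 = 17e-12   # assumed bend elastic constant
E_HP = (2 / np.pi) * np.sqrt(K22 / K33) * E_H
print(f"E_HP* ~ {E_HP / E_H:.2f} E_H")   # ~0.4 E_H, cf. Eq. 25
```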

The transient state is metastable: All tilts will typically have decayed within 1 ms, but the device will remain in the transient planar state until the more stable planar state nucleates from defects in the sample. This transition is much slower, limiting the latching time of the liquid crystal to 200 ms or longer. The alternative relaxation from the homeotropic into the focal-conic texture occurs if the field is first reduced to an intermediate level E_HP* < E < E_HF (where E_HF ≈ 0.9 E_H). At this lower field, the homeotropic state of the director is still favored, but the coupling to the dielectric anisotropy cannot fully maintain the unwound state because the field is lower than E_H, and the director begins to twist into the plane of the cell, forming the focal-conic state. A perfect planar state has a reflectance that both diminishes quickly and changes coloration when viewed off-axis, as predicted by Eq. 22. Preferably, the planar state is fractured into micro-domains, for example, using a polymer network, to distribute the helical axis about the cell normal. Defects are formed at the domain walls that scatter the incident light across a wider angle range, thereby reducing the reflectance of the planar state. Although the peak reflectance of the device is reduced from a practical maximum of about 45 % to between 30 % and 35 % by introducing the domain structure, devices with a 120° viewing cone and very little off-axis color shift are readily fabricated. The polymer network then tends to stabilize the planar texture, but careful choice of the surface alignment conditions ensures that the wide range of bistability is maintained. Several waveforms for driving the BCD have been suggested, the use of which depends on the application. The interested reader is referred to Coates (2011) for a more detailed description of the various options. Either bipolar or monopolar schemes can be used, since the display responds to the RMS voltage. If a monopolar scheme is used, the polarity should be reversed regularly to avoid electrolytic degradation of the cholesteric material. Conventional addressing (Yang and Doane 1992) is used when image update time is not important. This simple scheme is readily applied using conventional STN drivers and allows a large number of gray levels to be written. Each row is addressed sequentially, initially with a blank or reset pulse V_b, followed by selection using a row voltage V_s (typically 25 V) and synchronous data voltage V_d (typically 5 V) applied on the columns. The resultant pixel voltage (row


minus column) either switches the pixel homeotropic or leaves it in the focal-conic state. At the end of the addressing period, the voltage reduces to V_d, and the pixels return to the planar state via the transient planar texture. This type of addressing takes between 20 and 40 ms per line: Although satisfactory for indicators and signage with low levels of image content, this time is rather long for high content displays such as those used in portable equipment or electronic readers, taking several tens of seconds to update each page. Much faster addressing uses the dynamic addressing method (Huang et al. 1995). This scheme is similar to the FLCD Malvern schemes (Hughes and Raynes 1993), with the blanking (or preparation) phase and the evolution phase being applied during the preceding and trailing rows, respectively, to greatly improve the page update time. For example, a 1,000 row electronic reader can be addressed in little over a second using this scheme. There are several advantages that particularly suit the use of the BCD on plastic substrates, as listed below:

1. Optical contrast is achieved without the cost of polarizers or a diffusing reflector. The construction of a plastic cell is simple, without lamination of these additional components. As well as being attractive in its own right, the resulting decreased thickness also helps improve display flexibility.
2. Birefringent plastic films are applicable, allowing lower cost plastics such as polyethylene terephthalate (PET) or polycarbonate to be used.
3. The lower substrate can be made from an opaque material, and as such can utilize a variety of flexible materials, such as textile or paper.
4. Unlike some bistable LCD modes, the BCD is not sensitive to the cell gap and internal flatness of the substrates.
5. Liquid crystals are readily encapsulated into polymer gels. Cholesteric gels can be made in which each droplet retains its bistable nature. As described below, this allows novel fabrication processes to be adapted and employed, such as printing the bistable medium onto a single substrate.

Two approaches have been taken to form the liquid crystal droplets: phase separation or emulsification. One BCD (Khan et al. 2005) uses a 20 % mixture of prepolymer and cholesteric spread onto a polycarbonate substrate and laminated against a second substrate with appropriate spacer beads. The display is finished using a 15-min exposure to UV to complete the photo-cure and cutting the desired display from the resulting film. The use of the high photopolymer concentration means that the display image is retained when flexed, and the liquid crystal is contained by the polymer at the edges even without the presence of a gasket seal. Careful design of the system is required to ensure that the large droplet size is retained, and hence both bistability and an acceptable level of reflectivity are achieved. Recent displays have been produced using emulsification. Kodak (Stephenson 2004) extended the printing methods developed for the photographic industry to cholesteric emulsions. The ITO-coated PET was first coated with a dried gelatin layer to insulate the electrodes from the liquid crystal, before coating with the cholesteric dispersion and finally screen printing the opaque electrodes. More recently, Kent Displays Inc. used a transfer method (Shiyanovskaya et al. 2008) to form a cholesteric substrate-free display device.
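The per-page update times quoted above follow directly from the per-line figures; a minimal sketch (the dynamic-drive selection time is an assumed illustrative value):

```python
# Rough page-update times for the two BCD addressing approaches described
# above (a sketch; the conventional per-line time is taken from the text).
rows = 1000

conventional_ms_per_line = 30          # within the 20-40 ms range quoted
print(f"conventional: ~{rows * conventional_ms_per_line / 1000:.0f} s per page")

# Dynamic drive pipelines the preparation and evolution phases into the
# preceding and trailing rows, so only the short selection phase is serial.
dynamic_ms_per_line = 1.0              # assumed ~1 ms selection phase
print(f"dynamic:      ~{rows * dynamic_ms_per_line / 1000:.1f} s per page")
```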
Uniformly sized droplets of cholesteric are formed in water using membrane emulsification, together with an appropriate surfactant to prevent coalescence. Good reflectivity results from achieving 15 μm droplets with only a 2 μm variation. Each droplet is then coated with a 200 nm polymer shell by adding a film-forming polyurethane latex binder into the emulsion before use. The emulsion is then printed onto the electrode-bearing substrate, where it forms a densely packed array of droplets, and is dried to expel the water. During the drying process, the droplets tend to flatten into ellipsoids: This enhances the resulting reflectivity toward 30 %. After printing the electrodes using a PEDOT conductor, the display is protected by overcoating with a clear polymer layer. A substrate-free device is produced by first coating the glass or plastic preparation substrate with a dark protective layer and peeling away the


completed display from the carrier to provide displays with a total thickness of 20 μm. Each processing step is designed to enable web-based production of the displays rather than the conventional batch approach used to fabricate glass panels. Such methods have the potential to increase factory throughput and should eventually begin to reduce the costs associated with making complex displays. The use of selective reflection in BCDs facilitates forming full-color displays using a triple-stacked system (Hashimoto et al. 1998), with separate layers tuned to red, green, and blue wavebands. The most attractive appearance is achieved when the red layer is at the rear of the display and the green layer has the opposite twist sense to the red and blue layers to improve reflectivity (Taheri et al. 1998). The most efficient arrangement is that shown in Fig. 15a, where cholesteric droplets are used to build up three separate cholesteric layers, interspersed with shared transparent PEDOT electrodes. In addition to minimizing parallax between the top and bottom stacks, this construction restricts the number of electrodes traversed by the reflected red light to six or fewer, thereby significantly reducing absorptive losses. Further enhancements of the flexibility are possible by using the PET layer as a carrier layer during construction and then removing it before use to form a substrate-free display. Each panel is addressed sequentially, and so the frame time is increased threefold, but this is outweighed by the appealing look of the resulting full-color, fully flexible display, as shown in Fig. 15b. Glass-based triple BCD stacks are used in the Flepia electronic reader from Fujitsu (Kato et al. 2010; Nose et al. 2010). The device runs for 40 h while continually showing 260,000 colors on the 8″ display with a reflectivity approaching 30 %. Multiple stacked cholesteric displays are also used for large-area billboard signage by Magink (Coates 2009). The daylight readability and low power consumption give cholesteric displays a competitive edge over the large-area LED units often used for this application, while allowing the full color gamut required by that application to be achieved. Signs from 6 to 13 m2 are created using tiled units of triple-stack displays. Each 17 × 17 cm tile has a typical resolution of 3 dpi, which is satisfactory for the typical viewing distances of greater than 10 m. The panels are kept in a


Fig. 15 Plastic-based BCD. (a) Shared electrode arrangement for a triple stack using cholesteric emulsions applied to a single substrate. (b) Photograph of a single-substrate BCD triple stack. (c) The electronic skin, providing a latchable color molding to help personalize a mobile phone. (d) A BCD writing pad


Table 1 Summary of particular bistable liquid crystal display (LCD) technologies and the main advantages and disadvantages for the selected target market

Technology       | Target market          | Advantages                                       | Disadvantages
SmA scattering   | E-book reader          | Viewing angle                                    | High voltage, slow update
FLC C2           | HDTV                   | Fast response, multiplexibility                  | Alignment, temperature range, shock stability, low cell gap
FLC C1           | Watch                  | Very low voltage and power, ultrahigh resolution | Shock stability
BCD              | Electronic skin        | Flexibility, conformability, color               | Reflectance
0–2π BTN         | Graphic equalizer      | Fast response                                    | Low cell gap
0–π BTN (Binem™) | E-book reader          | Fast response, gray scale                        | Cell gap, viewing angle
ZBD              | Retail signage and ESL | Cost, temperature range, power                   | Maximum size (<7″ × 13″)

temperature-controlled unit to help ensure image uniformity, and the width of the inter-tile gap is minimized. Plastic BCDs offer distinct advantages for producing reflective color and flexible displays, and this has led to their application in several new markets. The ultrathin and flexible substrate-free displays enable a multitude of novel applications not possible using standard display approaches. An excellent example of such a market is electronic skin (Green et al. 2008). Figure 15c shows a mobile phone casing covered with a cholesteric skin. This is effectively a single-pixel display molded onto the outer layer of the product to allow the color to be personalized according to the taste of the user. Latching between states requires very little electronics. Alternatively, Fig. 15d is an electronic writing pad (Schneider et al. 2008) that takes advantage of the flow-induced change to the planar state caused by pressure from a writing implement. The electronics in this case are only required to latch from the planar to the focal-conic state, erasing the whole page and preparing the device for a new page to be written.

Discussion and Conclusion
The wide range of different bistable technologies described presents a choice for a potential end user. Each of the commercially available technologies has individual merits that might suit some markets but not others. Table 1 takes a target market for each of these technologies and summarizes the key advantages and disadvantages for that market. For that reason, the table is not a comparison between the different technologies. It also means that the disadvantages cannot be judged against the requirements of other markets. For example, BCDs offer among the brightest reflective color displays available today, yet the requirement for electronic skins demands further improvements. The speed of the FLC update beats all other display technologies, yet that market's need is fully met using TFTs, and the bistable technology has been superseded there. Despite a long gestation, there has been a burgeoning of markets requiring the advantages offered by bistability in recent years. This is most evident for e-book readers, but demand across many other applications and sectors continues to grow rapidly. Not only do bistable displays offer the ultralow power essential for long battery life, but they also have excellent flicker-free readability and low cost and are available in plastic. In addition to consumer portable equipment, bistable displays are creating new applications in equipment where large numbers of battery-operated units require occasional updates. A good example of this is electronic labeling used for retail, manufacturing operations, and postal


tracking. Particularly attractive to these markets is the combination of a bistable display and low-power RF communication, where remote updates of information are needed with minimal risk and infrastructure. Among the first applications to attract substantial sales is electronic shelf-edge labeling, now in supermarkets and spreading to the rest of the retail sector. Bistable displays are also seen in public information displays and electronic billboards, thereby proving their applicability from the smallest to the largest displays in signage. New markets are also being created by the new functionality of bistable displays, such as smart card displays and electronic skins. Continued advances in polarizer-free modes, low-voltage operation, and novel effects are evidence that bistable LCD science and technology is still in the early phase of its success.

Further Reading
Acosta EJ, Smith NJ, Towler MJ (2006) Isolation techniques for long-term bistability of the bistable twisted nematic mode. Liq Cryst 33(3):249–256 Amakawa H, Kondoh S (2008) Developments of ferroelectric liquid crystal devices and applications utilizing memory effect. In: Proceedings of international display workshop (IDW 2008), pp 1559–1562 Anderson MH, Jones JC, Raynes EP, Towler MJ (1991) Optical studies of thin layers of smectic-C materials. J Appl Phys 24:338–342 Anderson MH, Hughes JR, Jones JC, Russell KG (1997) Comparison of three slot DRAMA with other τVmin drive schemes. In: Proceedings of 14th international display research conference (IDRC 1997), pp L40–L41 Angelé J, Stoenescu D, Dozov I, Osterman J, Laffitte JD, Compagnon M, Emeraud T, Leblanc F (2007) New developments and applications update of BiNem displays. Proc SID Int Symp Dig Tech Papers 38:1351–1769 Barberi R, Boix M, Durand G (1989) Electrically controlled surface bistability in nematic liquid crystals. Appl Phys Lett 55(24):2506–2508 Barberi R, Giocondo M, Durand G (1992) Flexoelectrically controlled surface bistable switching in nematic liquid crystals. Appl Phys Lett 60(9):1085–1086 Barron C, Angelé J, Bajic L, Dozov I, Leblanc F, Perny S, Brill J, Specht J (2005) Development of Binem® displays on flexible plastic substrates. J Soc Inf Disp 13(3):193–198 Berreman DW, Heffner WR (1980) New bistable liquid crystal twist cell. Appl Phys Lett 37:109–111 Bobrov Y, Lazarev P, McMurtry D, Remizov S (2001) Incorporation of Optiva polarizers in LCD production line. Proc SID Int Symp Dig Tech Papers 32:639–641 Boyd GD, Cheng J, Ngo PDT (1980) Liquid-crystal orientational bistability and nematic storage effects. Appl Phys Lett 36:556–558 Bradshaw MJ, Brown CV, Haslam SD, Hughes JR, Graham A, Jones JC, Katsuse H, Kawabata Y, Koden M, Miyoshi S, Nonomura K, Numao T, Shigeta M, Sugino M, Tagawa A, Gass PA, Raynes EP, Towler MJ (1997) Key technologies for τVmin FLCDs. In: Proceedings of international display research conference, Toronto, Sept 1997, pp L16–L17 Brill J, Lueder E, Randler M, Voegele S, Frey V (2002) A flexible ferroelectric liquid-crystal display with improved mechanical stability for smart-card applications. J Soc Inf Disp 10(3):189–194 Brown CV, Dunn PE, Jones JC (1997) The effects of elastic constants on the alignment and electro-optic behaviour of smectic C liquid crystals. Euro J Appl Math 8(3):281–291 Bryan-Brown GP (2000) Zenithal bistable device (ZBD) using plastic substrates. In: Proceedings of 20th international display research conference (IDRC), pp 229–232


Bryan-Brown GP, Towler MJ, Bancroft MS, McDonnell DG (1994) Bistable nematic alignment using bigratings. In: Proceedings of international display research conference (IDRC 1994), pp 209–212 Bryan-Brown GP, Brown CV, Jones JC (1997) Bistable nematic liquid crystal device. US Patent 06249332, priority 16 Oct 1995 Bryan-Brown GP, Wood EL, Jones JC (1998) Optimisation of the Zenithal bistable nematic liquid crystal device. In: Proceedings of 18th international display research conference (IDRC 1998), pp 1051–1053 Bryan-Brown GP, Walker DRE, Jones JC (2009) Controlled grating replication for the ZBD technology. Proc SID Int Symp Dig Tech Papers 40:1334–1337 Büyüktanir EA, Mitrokhin M, Holter B, Glushchenko A, West J (2006) Flexible bistable smectic-A polymer dispersed liquid crystal display. Jpn J Appl Phys 45:4146–4151 Chigrinov VG (1999) Liquid crystal devices. Artech House, London Clark NA, Lagerwall ST (1980) Submicrosecond bistable electro-optic switching in liquid crystals. Appl Phys Lett 36(1):899–901 Coates D (1998) Non-chiral smectic liquid crystals – applications. In: Demus D, Goodby J, Gray GW, Spiess H-W, Vill V (eds) Handbook of liquid crystals. Wiley VCH/GmbH, Weinheim, pp 470–490 Coates D (2009) Low-power large-area cholesteric displays. Inf Disp 25(3):16–19 Coates D (2011) Cholesteric liquid crystals. In: Chen J, Cranton W, Fihn M (eds) Handbook of visual display technology. Springer, Berlin Coates D, Crossland WA, Morrissy JH, Needham B (1978) Electrically induced scattering textures in smectic A phases and their electrical reversal. J Phys D Appl Phys 11:2025–2034 Crawford G, Zumer S (eds) (1996) Liquid crystals in complex geometries formed by polymer and porous networks. Taylor and Francis, London Crossland WA, Canter S (1985) An electrically addressed smectic storage device. In: Digest of SID 1985, pp 124–126 Davidson AJ, Mottram NJ (2002) Flexo-electric switching in a bistable nematic device. Phys Rev E 65(05171000):1–10 de Gennes PG, Prost J (1993) The physics of liquid crystals, 2nd edn. Clarendon, Oxford Demus D, Goodby J, Gray GW, Spiess H-W, Vill V (eds) (1998) Handbook of liquid crystals. Wiley VCH, Weinheim Doane JW, Khan A, Huang X-Y, Miller N (2003) Bistable reflective cholesteric displays. In: Proceedings of international display research conference, Paper 6.1, pp 84–87 Dozov I, Martinot-Lagarde P (1998) First order breaking transition of tilted nematic anchoring. Phys Rev E 58(6):7442–7446 Dozov I, Nobili M, Durand G (1997) Fast bistable nematic display using monostable surface switching. Appl Phys Lett 70(9):1179–1181 Elston S, Benzie P (2011) In: Chen J, Cranton W, Fihn M (eds) Handbook of visual display technology. Springer, Berlin Green A, Montbach E, Miller N, Davis DJ, Khan A, Schneider T, Doane JW (2008) Energy efficient flexible Reflex™ displays. Proc SID Int Symp Dig Tech Papers 39:55–58 Greubel W (1974) Bistability behavior of texture in cholesteric liquid crystals in an electric field. Appl Phys Lett 25(1):5–7 Greubel W, Wolf U, Krüger H (1973) Electric field induced texture change in certain nematic/cholesteric liquid crystal mixtures. Mol Cryst Liq Cryst 24:103–106 Hashimoto K, Okada M, Nishiguchi L, Masazumi N, Yamagawa E, Taniguchi T (1998) Reflective color display using cholesteric liquid crystals. Proc SID Int Symp Dig Tech Papers 39:897–900 Heilmeier GH, Goldmacher JE (1968) A new electric-field-controlled reflective optical storage effect in mixed liquid crystal systems. Appl Phys Lett 13(4):132–133


Hilsum C (2010) Flat-panel electronic displays: a triumph of physics, chemistry and engineering. Philos Trans R Soc A Math Phys Eng Sci 368(1914):1027–1082 Huang X-Y, Yang D-K, Bos PJ, Doane JW (1995) Dynamic drive for bistable cholesteric displays: a rapid addressing scheme. J Soc Inf Disp 3:165–168 Hughes JR, Raynes EP (1993) A new set of high-speed matrix addressing schemes for ferroelectric liquid crystal displays. Liq Cryst 13:597–601 Itoh N, Akiyama H, Kawabata Y, Koden M, Miyoshi S, Numao T, Shigeta M, Bradshaw MJ, Brown CV, Graham A, Haslam SD, Hughes JR, Jones JC, Slaney AJ, Bonnett P, Gass PA, Raynes EP, Ulrich DC (1998) 17″ video-rate full colour FLCD. In: Proceedings of the international displays workshop (IDW), Kobe, Dec 1998, PLC 1–2, pp 205–208 Joly S, Thomas P, Osterman J, Simon A, Lallemant S, Faget L, Laffitte J-D, Irzyk M, Madsen L, Angelé J, Leblanc F, Martinot-Lagarde P (2010) Demonstration of a technological prototype of an active-matrix Binem liquid-crystal display. J Soc Inf Disp 18(12):1033–1039 Jones JC (2001) Bistable nematic liquid crystal device. US Patent 7,371,362, priority 30 Nov 1999 Jones JC (2006) Novel geometries of the Zenithal bistable device. Proc SID Int Symp Dig Tech Papers 37:1626–1629 Jones JC (2008) The Zenithal bistable display: from concept to consumer. J Soc Inf Disp 16(1):143–154 Jones JC (2009) Approaching the Zenith: bistable LCDs in a retail environment. Inf Disp 25(3):8–11 Jones JC, Amos RM (2011) Relating display performance and grating structure of a Zenithal bistable display. Mol Cryst Liq Cryst 543:823–834 Jones JC, Bryan-Brown GP (2010) Low cost Zenithal bistable device with improved white state. Proc SID Int Symp Dig Tech Papers 41:207–211 Jones JC, Raynes EP (1992) Measurement of the biaxial permittivities for several smectic-C host materials used in ferroelectric liquid crystals devices. Liq Cryst 11(10):199–217 Jones JC, Towler MJ, Raynes EP (1991) The importance of dielectric biaxiality for ferroelectric liquid crystal devices. Ferroelectrics 121(2):91–102 Jones JC, Towler MJ, Hughes JR (1993) Fast, high contrast ferroelectric liquid crystal displays and the role of dielectric biaxiality. Displays 14(2):86–93 Jones JC, Brown CV, Dunn PE (2000) The physics of τVmin ferroelectric liquid crystals. Ferroelectrics 246:191–201 Jones JC, Brett P, Bryan-Brown GP, Graham A, Wood EL, Scanlon RJ, Martin H-L (2002) Meeting the display requirements for portable applications using Zenithal bistable devices (ZBD). Proc SID Int Symp Dig Tech Papers 33:90–93 Jones JC, Beldon SM, Wood EL (2003) Gray scale in zenithal bistable displays: the route to ultra-low power color displays. J Soc Inf Disp 11(2):269–275 Joubert C, Angelé J, Boissier A, Pecout B, Forget SL, Dozov I, Stoenescu D, Lallemand S, Martinot-Lagarde P (2003) Reflective bistable nematic displays (BiNem) fabricated by standard manufacturing equipment. J Soc Inf Disp 11(1):217–224 Kanbe J, Inoue H, Mizutome A, Hanyuu Y, Katagiri K, Yoshihara S (1991) High resolution large area FLC display with high graphic performance. Ferroelectrics 114:3–26 Kato T, Kurosaki Y, Kiyota Y, Tomita J, Yoshihara T (2010) Application and effects of orientation control technology in electronic paper using cholesteric liquid crystals. Proc SID Int Symp Dig Tech Papers 41:568–571 Kawachi M, Kogure O, Yoshii S, Kato K (1975) Field-induced nematic-cholesteric relaxation in a small angle wedge.
Jpn J Appl Phys 14(9):1063–1064 Kawachi M, Kato K, Kogure O (1978) Light scattering characteristics in nematic-cholesteric mixtures with positive dielectric anisotropy. Jpn J Appl Phys 17:1245–1250


Khan A, Shiyanovskaya I, Schneider T, Miller N, Ernst T, Marhefka D, Nicholson F, Green S, Magyar G, Pishnyak O, Doane JW (2005) Reflective cholesteric displays: from rigid to flexible. J Soc Inf Disp 13(6):469–474 Kitson S, Geisow A (2002) Controllable alignment of nematic liquid crystals around microscopic posts: stabilisation of multiple states. Appl Phys Lett 80(19):3635–3637 Koden M (1999) Passive-matrix FLCDs with high contrast and video-rate full colour pictures. Ferroelectrics 246:87–96 Kondoh S, Suguro A, Noguchi K, Iio K, Ueda K, Takahashi N, Fujino M (2005) Low power consumption displays using ferroelectric liquid crystals. In: Proceedings of international display workshop (IDW 2005), pp 81–82 Lagerwall ST (1999) Ferroelectric and antiferroelectric liquid crystals. Wiley-VCH, Weinheim Lasak S, Davidson A, Brown CV, Mottram NJ (2009) Sidewall control of static azimuthal bistable nematic alignment states. J Phys D Appl Phys 42(085114):1–8 Li J, Hoke CD, Fredley S, Bos PJ (1996) A bistable LCD using polymer stabilization. Proc SID Int Symp Dig Tech Papers 27:265–268 Monkade M, Boix M, Durand G (1987) Order electricity and oblique nematic orientation on rough solid surfaces. Europhys Lett 5(8):697–702 Mottram NJ, Ul Islam N, Elston SJ (1999) Biaxial modeling of the structure of the chevron interface in smectic liquid crystals. Phys Rev E 60:613–619 Nomura H, Obikawa T, Ozawa Y, Tanaka T (2000) Recent studies on multiplex driving of BTN-LCDs. J Soc Inf Disp 8(4):289–294 Nose M, Uehara H, Shingai T (2010) Driving scheme of color e-paper using Ch-LC for high image quality. In: Proceedings of the international displays workshops (IDW 10), paper EP6-1 Oo TN, Kimura N, Akahane T (2008) Effect of elastic constant ratio K33/K11 and stripe width area ratio on micro-patterned alignment of nematic liquid crystal. Adv Tech Mat Mat Proc J 10(1):9–20 Osterman J, Madsen L, Angelé J, Leblanc F, Scheffer T, Nordström R (2010) Optical optimization of the single-polarizer BiNem display for e-reading applications. Proc SID Int Symp Dig Tech Papers 41:203–206 Rieker TP, Clark NA, Smith GS, Parmar DS, Sirota EB, Safinya CR (1987) Chevron local layer structure in surface stabilised ferroelectric smectic C cells. Phys Rev Lett 59(3):2658–2661 Rudin J, Kitson S, Geisow A (2009) Colour plastic bistable nematic display fabricated by imprint and ink-jet technology. J Soc Inf Disp 17(4):309–316 Rudquist P (2011) Smectic LCD modes. In: Chen J, Cranton W, Fihn M (eds) Handbook of visual display technology. Springer, Berlin Schneider T, Magyar G, Barua S, Ernst T, Miller N, Franklin S, Montbach E, Davis DJ, Khan A, Doane JW (2008) A flexible touch-sensitive writing tablet. Proc SID Int Symp Dig Tech Papers 39:1840–1842 Shiyanovskaya I, Green S, Khan A, Magyar G, Pishnyak O, Doane JW (2008) Substrate-free cholesteric liquid-crystal displays. J Soc Inf Disp 16(1):113–115 Slaney AJ, Minter V, Jones JC (1997) Assessment of LC materials for τVmin devices. In: Proceedings of international display workshop (IDW), Nagoya, Nov 1997, PLC3-5, pp 885–886 Spencer TJ, Care C, Amos RM, Jones JC (2010) A Zenithal bistable device: comparison of modelling and experiment. Phys Rev E 82(021702):1–13 Stephenson S (2004) Development of flexible displays using photographic technology. Proc SID Int Symp Dig Tech Papers 35:774–777 Surguy PWH, Ayliffe PJ, Birch MJ, Bone MF, Crossland I, Hughes JR, Ross PW, Saunders FC, Towler MJ (1991) The 'JOERS/Alvey' ferroelectric multiplexing scheme. Ferroelectrics 122:63–79


Taheri B, West JL, Yang DK (1998) Recent developments in bistable cholesteric displays. Proc SPIE 3297:115–121 Takatoh K, Sakamoto M, Hasegawa R, Koden M, Itoh N, Hasegawa M (2005) Alignment technologies and applications of liquid crystals. Taylor and Francis, London Tanaka T, Sato Y, Inoue A, Momose Y, Notuma H, Iino S (1995) A bistable twisted nematic (BTN) LCD driven by passive-matrix addressing. Asia Disp 26:259–262 Tang Y, Lei W, Zheng Y, Wang B, Sun G, Xia X (2010) Multi-gray level reflective LCD module design and its tiled application. Proc SID Int Symp Dig Tech Papers 41:1408–1410 Terada M, Yamada S, Katagiri K, Yoshihara S, Kanbe J (1993) Static and dynamic properties of chevron uniform FLC. Ferroelectrics 149:283–294 Thurston RN, Cheng J, Boyd GD (1980) Mechanically bistable liquid-crystal display structures. IEEE Trans Electron Devices ED-27(11):2069–2080 Tsakonas C, Davidson AJ, Brown CV, Mottram NJ (2007) Multistable alignment states in nematic liquid crystal filled wells. Appl Phys Lett 90(111913):1–3 Uche C, Elston SJ, Parry-Jones LA (2005) Microscopic observation of Zenithal bistable switching in nematic devices with different surface relief structures. J Phys D Appl Phys 38(13):2283–2291 Ulrich DC, Henley B, Tombling C, Tillin MD, Smith D, Dodgson N (2001) A 400-dpi reflective storage ferroelectric liquid-crystal display for highly readable, low-power mobile display applications. J Soc Inf Disp 9(4):295–300 Wood EL, Bryan-Brown GP, Brett P, Graham A, Jones JC, Hughes JR (2000) Zenithal bistable device (ZBD) suitable for portable applications. Proc SID Int Symp Dig Tech Papers 31:124–127 Wu ST, Yang DK (2001) Reflective liquid crystal displays. Wiley, Chichester/New York Yang D-K, Doane JW (1992) Cholesteric liquid crystal/polymer gel dispersions: reflective displays. Proc SID Int Symp Dig Tech Papers 23:759–762 Yang DK, Lu ZJ (1995) Switching mechanism of bistable reflective cholesteric displays. Proc SID Int Symp Dig Tech Papers 26:351–354 Yang D-K, Chien L-C, Doane JW (1991) Cholesteric liquid crystal/polymer gel dispersion bistable at zero field. In: Proceedings of international display research conference, pp 49–52 Yang D-K, West JL, Chien L-C, Doane JW (1994) Control of reflectivity and bistability in displays using cholesteric liquid crystals. J Appl Phys 76:1331–1333 Yang DK, Huang XY, Zhu YM (1997) Bistable cholesteric reflective displays: materials and drive schemes. Annu Rev Mater Sci 27:117–146 Yi Y, Lombardo G, Ashby N, Barberi R, Maclennan JE, Clark NA (2009) Topographic-pattern-induced homeotropic alignment of liquid crystals. Phys Rev E 79(041701):1–9


Cholesteric Reflective Displays
David Coates*
Wimborne, Dorset, UK

Abstract
Thin films of cholesteric liquid crystal can be electrically switched at modest voltages into either a reflecting state that provides a specific color or an almost transparent state that appears black when a black absorber is placed behind the display. Both states are stable over time. Many display variations, using glass and flexible plastic substrates, have been made using this general effect to produce a wide range of display devices.

List of Abbreviations
HTP   Helical Twisting Power
PDLC  Polymer-Dispersed Liquid Crystal
PSCT  Polymer-Stabilized Cholesteric Texture Displays
SSCT  Surface-Stabilized Cholesteric Texture Displays

Introduction
The cholesteric liquid crystal phase derives its name from the esters of cholesterol in which it was first discovered (Reinitzer 1888). The modern, systematic name is the chiral nematic phase, indicating that it is a chiral analogue of the nematic phase. In practice, but not strictly correct, it is common to refer to materials with a short pitch length simply as cholesterics. … Materials of high birefringence (Δn > 0.25) exhibit a wide reflection bandwidth (80–90 nm) which is of lower color purity but higher brightness than the color and brightness exhibited by lower-Δn materials. In a helical structure such as in this case, incident white light is treated as consisting of left- and right-handed circularly polarized (LHCP and RHCP) light, of which only the polarization component matching the helix handedness can be reflected (Fig. 1); the other handedness of circularly polarized light is transmitted. Thus, only 50 % of the appropriate wavelengths of incident light can be reflected. Wavelengths of light away from this region are transmitted and rotated; for display use, they are usually absorbed by a black absorber behind the display so that only the reflected wavelengths are seen by the observer; the display thus appears colored. There is an angle dependence to the wavelength of light reflected, which is described by the cos θ term in Eq. 1; thus, when viewed off-axis, the reflected wavelength shifts to a shorter wavelength (and the reflectance is also lower). This occurs ideally in perfectly aligned planar textures, but normally the angle dependence is less than predicted because the helices in, for example, an electrically driven film are far from perfectly aligned and exhibit a multidomain structure. The helices in multidomain planar structures are usually centered about normal incidence, and their angular distribution leads to a broader overall reflection peak and a wider viewing angle (Yang et al. 1994a). The Bragg equation (Eq. 1) provides a useful but not totally accurate representation of the reflection; other more rigorous techniques have been used to describe these optical properties more precisely (Kats 1971; Yang and Mi 2000; Berreman and Scheffer 1970). To obtain maximum reflection, the film of cholesteric in the planar texture must be well aligned between transparent substrates and have few discontinuities or "oily streaks" (bright lines that indicate the boundary of planar domains); there should be at least ten turns of the cholesteric helix (St. John et al. 1998; Yuan 1996). The peak reflectance from the planar texture in an electrically switched cell is typically in the range of 30–40 % rather than the theoretical 50 %. The color purity is often reduced due to short-wavelength light scattering and overtone bands that broaden the reflection peak. This is a particular problem with long-wavelength films whose shorter-wavelength side bands are more dominant to the human eye, making, for example, a red peak appear brown. This can be improved by incorporating a dye in the liquid crystal (West 1996; Zhou and Khan 2002) or a red overlay filter (Kipfer et al. 1997) or a filter having a birefringent gradient (Grupp et al. 2000). Measurements taken at near-normal incidence with diffuse lighting (integrating sphere with specular reflection included) give a measure of the total (specular and Lambertian) light reflected.
When illuminated and measured at normal incidence, the iso-contrast curves are symmetrical, but if the illumination is off-axis, then the reflected-light iso-contrast plot becomes asymmetric (Valyukh et al. 1999).
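The selective reflection just described is easy to evaluate numerically. The sketch below assumes the usual Bragg-type relations for a planar cholesteric film, λ ≈ n̄·p·cos θ for the reflected wavelength and Δλ ≈ p·Δn for the bandwidth (Eq. 1 itself is not reproduced above, so these forms, together with the material values, are assumptions consistent with the cos θ dependence and bandwidths quoted in the text):

import math

# Illustrative selective-reflection calculation for a planar cholesteric film.
n_avg = 1.60          # assumed average refractive index
p = 330e-9            # m, assumed helical pitch
dn = 0.20             # assumed birefringence

lam_normal = n_avg * p                               # wavelength at normal incidence
bandwidth = p * dn                                   # reflection bandwidth
lam_30 = n_avg * p * math.cos(math.radians(30))      # off-axis viewing

print(f"normal incidence: {lam_normal*1e9:.0f} nm "
      f"(bandwidth ~{bandwidth*1e9:.0f} nm)")        # ~528 nm green, ~66 nm wide
print(f"30 deg off-axis:  {lam_30*1e9:.0f} nm")      # shifts toward the blue

With Δn = 0.25, the same pitch gives a bandwidth of roughly 82 nm, consistent with the 80–90 nm figure quoted above for high-Δn materials.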


The Focal Conic Texture
In the focal conic texture, the helices are spatially much smaller and probably quite random in orientation but generally parallel to the substrates – they do not selectively reflect light but, due to the changes in refractive index between the domains, give rise to forward and backward light scattering (Fig. 1). Weak or strong light-scattering films can be created by optimizing the domain size, birefringence, and pitch. Given the state of materials development in the 1960s–1970s and a general desire to make black-and-white displays, strongly light-scattering focal conic textures using materials with a long pitch length and weakly positive or negative dielectric anisotropy (Δε) were investigated, but these display modes have faded from use. Modern cholesteric displays use positive-Δε liquid crystals, generally based on one of the earlier displays (Greubel et al. 1973); now benefiting from highly positive dielectric anisotropy nematic hosts and high-twisting-power chiral dopants, these displays can operate at modest voltages and reflect colored light in the planar texture. Thus, the focal conic state is optimized to be of low light scattering (typically 1–2 %) such that, with a black absorber behind the display, it can be made to appear black. If the black absorber is placed inside the cell, reflections from the rear substrate, conducting layer, and glass surfaces are eliminated, and the contrast is improved (Grupp 2001).

Temperature Effects
The pitch of the cholesteric helix varies with temperature. The order parameter of a nematic liquid crystal decreases with increasing temperature, much more so near the clearing point; this gives rise to changes in refractive index and birefringence (Pohl and Finkenzeller 1990). The helical twisting power of the chiral dopant is also temperature dependent. Thus, inevitably, there is a change of reflected wavelength and waveband width with temperature, which leads to a change in reflected color. This change in pitch length can be minimized by using mixtures of at least two chiral dopants with the same helical twist sense but having opposite twist responses when heated or cooled (Buchecker et al. 1992), thus averaging the twist effects due to temperature change such that the pitch changes less than it would for each individual chiral dopant. If a smectic phase is present as a lower temperature phase, then as the temperature approaches this lower transition, the pitch length of the cholesteric phase dramatically lengthens in preparation for the formation of the lamellar structure of the smectic phase. This effect is the basis of the thermochromic cholesteric mixtures used in thermometers, mood rings, temperature indicators on wine bottles and coffee mugs, etc. (McDonnell 2006). Smectic phases are thus avoided in mixtures intended for display use. Over a small temperature range at the cholesteric to isotropic liquid transition, there occurs a series of liquid crystal phases known as blue phases (Lehmann 1906; Collins 1984) which, when their temperature range is extended, for example, by polymer stabilization, can give rise to useful display effects (Kikuchi 2009).
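The dopant-pairing compensation can be made concrete with a toy model. The sketch assumes the common additive mixing rule 1/p = Σ HTPi·ci for chiral dopants, and invents two same-handed dopants whose twisting powers drift linearly in opposite directions with temperature (all numbers are illustrative, not measured values):

# Toy model of pitch compensation with two same-handed chiral dopants whose
# helical twisting powers (HTP) drift oppositely with temperature.
# Assumed additive rule: 1/p = HTP1*c1 + HTP2*c2 (c in weight fraction).
def pitch_um(T, c1=0.05, c2=0.05):
    htp1 = 20.0 + 0.04 * (T - 25)   # 1/(um * wt frac), rises with temperature
    htp2 = 20.0 - 0.04 * (T - 25)   # falls by the same amount
    return 1.0 / (htp1 * c1 + htp2 * c2)

for T in (5, 25, 45):
    print(f"{T:2d} C: pitch = {pitch_um(T):.4f} um")
# The opposing slopes cancel, so the net pitch (and hence the reflected color)
# stays fixed at 0.5 um; either dopant alone would shift the color with T.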

Alignment
Homogeneous and homeotropic alignments have been studied (Khan et al. 1996; Lu et al. 1995; Pfeiffer et al. 1995) and give rise to surface-stabilized cholesteric texture (SSCT) displays. Polyimide alignment layers are usually chosen for display use, thinner layers of which tend to provide displays with better reflectivity (Schlanger et al. 1998) and viewing angle. Other aligning methods, such as those produced by photo-alignment, microgrooves, PTFE, and obliquely sputtered SiO2 (Bunz et al. 1999), have also been investigated. Rubbing the polyimide layer, which provides some anisotropy or directionality to the surface, provides a more "ideal" planar texture with fewer discontinuities or "oily lines"; due to increased specular reflection,


such films appear to have a metallic sheen. The helices at the rubbed surface are within a few degrees of being normal to the substrate surface (Watson et al. 1998) and exhibit greater overall reflectivity than in a non-rubbed sample, but the focal conic texture transmission often suffers, and the planar texture reflectivity falls off more rapidly with viewing angle, soon reaching a level of reflectivity similar to that of a non-rubbed cell. Rubbing just one surface can provide a compromise of better reflectivity with a less severe change over viewing angle (Khan et al. 2001, 2002; Okada et al. 2005). The homeotropic alignment of short-pitch systems favors the focal conic texture; the planar texture, while having a wide viewing cone, is not as reflective at normal incidence. The focal conic texture can be stabilized against reverting to the planar state by incorporating within the liquid crystal a small amount of a UV-curable reactive monomer and polymerizing this in situ in the liquid crystal cell to give a range of polymer-stabilized cholesteric texture (PSCT) displays (Yang and Doane 1992; Yang et al. 1994b). Silica agglomerates, having strong hydrogen bonding, have also been used (Crawford 1997; Hisamitsu et al. 2006) to help stabilize the planar and focal conic states – the agglomerates break down during switching from one state to another and then reform to help stabilize the new state.

Electrical Switching
A typical electro-optic curve for a positive dielectric anisotropy liquid crystal is shown in Fig. 2. With bistable displays, a common starting state must be defined because the electro-optic curve may depend on it – in this case either the focal conic or the planar state. To generate this curve, a thin film of cholesteric liquid crystal sandwiched between transparent conducting electrodes is electrically driven into the starting state by applying a voltage above V4 (for planar) or between V2 and V3 (for focal conic), and then a second "select" or "defined" pulse corresponding to the voltage shown on the x-axis is applied. The curve shows the stable-state reflectance that arises (usually >100 ms) after a pulse of the defined voltage (given on the x-axis) and defined pulse width (usually 50 ms) has been applied to that starting state. Starting from a planar texture, the U-shaped curve is created. If the focal conic texture is used as the starting texture, the second, dotted curve in Fig. 2 is produced.


Fig. 2 Plot of a typical cholesteric electro-optic curve showing V1–V4 for the transition from the planar state and V5 and V6 for the transition from the focal conic state
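Read as a drive rule, Fig. 2 amounts to a mapping from select voltage to latched texture. A minimal sketch of that mapping for a planar starting state is given below; the threshold values V1–V4 are invented for illustration (real values depend on the material and cell), and the intermediate bands are simply labeled as mixed states of intermediate reflectance:

# Schematic latching rule implied by the planar-start curve of Fig. 2.
V1, V2, V3, V4 = 10.0, 20.0, 32.0, 40.0   # volts, assumed threshold values

def final_state_from_planar(v_select):
    if v_select < V1:
        return "planar (undisturbed, bright)"
    if v_select < V2:
        return "mixed planar/focal conic (intermediate)"
    if v_select < V3:
        return "focal conic (dark)"
    if v_select < V4:
        return "mixed (partially homeotropic, intermediate)"
    return "planar via homeotropic relaxation (bright)"

for v in (5, 15, 25, 35, 45):
    print(f"{v:4.1f} V -> {final_state_from_planar(v)}")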


When starting with the planar texture, select voltages … The ON current must be large enough to charge the pixel capacitance C to the target voltage V_SAT within the line-select period T_SELECT, i.e.,

I_ON > 2·V_SAT·C/T_SELECT


Fig. 9 Typical transfer characteristic of an a-Si:H TFT for an AMLCD display (log of drain current Idrain versus gate–source voltage VGS; the ratio Ion/Ioff is ~2 × 10^7)
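The charging condition above, together with the OFF-current leakage limit quoted just below the figure, sets the on/off current ratio the TFT must deliver. The sketch below uses invented but representative pixel values to show how comfortably the ~2 × 10^7 ratio of Fig. 9 covers the requirement:

# Rough a-Si:H TFT current budget for one pixel (all values assumed for
# illustration; the two inequalities are those quoted around Fig. 9).
C = 0.3e-12          # F, assumed total pixel + storage capacitance
V_SAT = 5.0          # V, assumed target pixel voltage
dV = 0.02            # V, assumed acceptable droop over one frame
rows, frame_rate = 1080, 60.0

T_FIELD = 1.0 / frame_rate          # frame (hold) period
T_SELECT = T_FIELD / rows           # line-select period

I_on_min = 2 * V_SAT * C / T_SELECT     # charging requirement
I_off_max = dV * C / T_FIELD            # leakage requirement
print(f"I_ON  > {I_on_min*1e6:.2f} uA")     # ~0.19 uA
print(f"I_OFF < {I_off_max*1e12:.2f} pA")   # ~0.36 pA
print(f"required on/off ratio ~ {I_on_min / I_off_max:.1e}")
# ~5e5 here, well inside the ~2e7 ratio shown in Fig. 9.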


The OFF current must be small enough for negligible leakage during the hold (frame) period T_FIELD, i.e., I_OFF < ΔV·C/T_FIELD (where ΔV is the maximum acceptable voltage drop across the pixel during the frame hold period). … A high-quality display must exhibit high luminance (>100 Cd/m2), good spatial uniformity (>80 %), a wide color gamut (comparable to a cathode ray tube standard), high resolution (>200 pixels per inch), a wide viewing angle, and high contrast (white–dark >300:1) and be free from visible defects. These requirements stem from human factors studies and an ever-increasing demand for photographic-quality images (Machover 1997; Le Grand 1957). The burden of each of these requirements is shared by both the LCP and the BLU. An example of this is the display brightness. Measurement of a typical LCD in its brightest white state will show luminance values ranging, for example, from 100 to 300 Cd/m2. Despite the function of the LCP as an optical shutter, its maximum transmittance is typically low. … Light injected into a lightguide is trapped by total internal reflection (TIR) wherever its internal angle to the surface normal exceeds the critical angle θc = arcsin(1/n), which is 42.1° for PMMA and 39.3° for PC. In order to extract light from the lightguide, the internal ray angle must locally be smaller than θc. Figure 4b shows one method of achieving this condition at a protrusion on the upper lightguide surface, where the incident angle to the protrusion surface allows the ray to escape by refraction. The output angle β is a function of the incident angle and the protrusion angle α. The minimum β in this case is given by Snell's law:


β_min = α + arcsin(n·sin(θc − α))

For a shallow protrusion with α = 5°, β_min = 70°. Numerous other extractor schemes exist, varying in their efficiency and the light angle distributions they produce. Examples include screen-printed white paint dots, molded prisms, domes, rough cylinders, diffractive elements, and shallow wells that extend into the lightguide surface. The angle distribution of the extracted light from a lightguide is principally controlled by the extractor optics. Typical lightguides have their peak luminance at β between 70° and 80°. Lightguides can also be used to control the angle distribution of extracted light across the lightguide by using linear structures molded onto the surface, running along the propagation direction. In general, the output angles are usually too large for direct axial viewing, and additional light management films are needed for optimal illumination of the LCP. Lightguides in some cases have a slight wedge angle γ along the propagation direction, as diagrammed in Fig. 4c. Each reflection off the top and bottom surfaces decreases θ2 by γ, leading to a progression of TIR failure from the launch to the distal end, aiding the operation of extraction features. Light exits the surface at very high (grazing) angles and may require an inverted prism film (turning film) above the lightguide to redirect the light toward the viewer, topped by a diffuser film to improve uniformity. Linear structures running from the launch to distal end may be molded onto the top lightguide surface to concentrate the light distribution across the lightguide. Output angle distributions from turning film BLUs tend to be narrower than their flat plate lightguide counterparts. The turning film, wedge lightguide, and diffuser films usually must be codesigned to achieve the desired BLU output. Angle recycling and, to some extent, polarization recycling films tend to be much less effective in these systems, since the light recycling process tends to be less efficient. Spatially uniform light extraction over the full lightguide area requires that the rate of extraction generally increase from the launch to the distal end of the lightguide, to accommodate the decrease in light power within the lightguide. As an illustration, consider a one-dimensional lightguide, where light propagates along the x direction, with an internal power Pi(x), and is extracted with efficiency per unit length ε(x), defined by

dPi/dx = −Pi(x)·ε(x)

For uniform illumination, the extracted power per unit length must be constant (C). This results in a linear drop in internal light power and a resulting extraction function that increases inversely with the remaining propagation distance:

Pi(x) = Pi(0) − Cx
ε(x) = C/(Pi(0) − Cx)

This can be written in terms of the total efficiency η of the lightguide:

η = (Pi(0) − Pi(L))/Pi(0)
ε(x) = η/(L − ηx)
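A quick numerical check of these relations (a sketch; the lightguide length and total efficiency below are assumed values) confirms that ε(x) = η/(L − ηx) delivers a constant extracted power per unit length:

# Verify that eps(x) = eta / (L - eta*x) extracts uniform power per unit length.
L = 100.0      # mm, assumed lightguide length
eta = 0.85     # assumed total extraction efficiency of the lightguide
P0 = 1.0       # normalized launched power

def eps(x):                 # extraction efficiency per unit length
    return eta / (L - eta * x)

def P_internal(x):          # linear power decay for uniform extraction
    return P0 - (eta * P0 / L) * x

for x in (0.0, 25.0, 50.0, 75.0, 99.0):
    print(f"x = {x:5.1f} mm: eps = {eps(x):.5f} /mm, "
          f"P*eps = {P_internal(x) * eps(x):.5f}")   # constant = eta*P0/L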



In practice, the extraction efficiency is increased along x using a higher surface density of extractor features or an increase in their unit area or angle with the surface. Practical limits to this range, as well as lightguide


absorption and scattering by imperfections or additional surface features, tend to limit the total efficiency to less than 90 %. Absorption can also limit the ultimate practical length of a lightguide. For a small power absorption α,

Pi(x) = Pi(0)(1 − αx) − Cx

Since most optical polymers tend to have higher absorption in the blue region of the spectrum, light extracted from the distal end of a lightguide may appear somewhat yellow. This can limit the length of lightguides made from certain types of polycarbonate, for example, which is otherwise favored for handheld devices due to its impact resistance. For larger displays, PMMA is often used because of its reduced absorption in the blue. Designers of backlights utilizing LEDs attempt to reduce costs and power consumption by minimizing the number of LEDs while maintaining the brightness target. When few LEDs are used, there is potential for objectionable nonuniformity in brightness near the launch edge. Light launched into a flat lightguide edge will have an internal angle of up to θ2max ≈ 40°, which must propagate some distance before it uniformly fills the space between the LEDs. To reduce this propagation length (often referred to as the mixing region), the launch edge is often structured with prisms, cylindrical lenses, or surface roughness to spread the light laterally. In addition, the extractor efficiency ε(x, y) is usually increased between the LEDs near the launch end to compensate for the reduced internal light in this region. Handheld devices such as cell phones demand the use of compact backlights, which require lightguides ranging in thickness from approximately 0.3 to 0.6 mm. LEDs, on the other hand, generally have higher efficacy and lower cost as the package becomes thicker. Since the input coupling efficiency scales as the overlap area between the LED active region and the lightguide edge, thin lightguides are often designed with a taper from the launch edge to the desired thickness over a length of about 1–2 mm, as represented in Fig. 4d. The taper region increases the ray internal angle, which can lead to TIR failure and light loss. This generally limits the taper to less than about 10°. High coupling efficiency also requires that the LEDs be aligned to within a few percent of the LED height from the edge center.
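The blue-absorption yellowing noted above can be illustrated with the same linear model (the absorption coefficients below are invented for illustration; real values depend on the polymer grade and wavelength):

# Illustrative color shift along a lightguide: blue is absorbed more strongly
# than red, so the distal end looks yellow. Linear small-absorption model
# from the text: P(x) ~ P(0) * (1 - a*x), extraction term omitted.
a_blue = 1.0e-3     # 1/mm, assumed absorption for blue light
a_red  = 2.5e-4     # 1/mm, assumed absorption for red light
for x_mm in (0, 150, 300):
    blue = 1.0 - a_blue * x_mm
    red  = 1.0 - a_red * x_mm
    print(f"x = {x_mm:3d} mm: blue {blue:.2f}, red {red:.2f}, "
          f"blue/red = {blue/red:.2f}")
# The blue/red ratio falls from 1.00 to ~0.76 over 300 mm, which is why long
# polycarbonate guides can look yellow at the far end and PMMA is preferred.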

Display Specific Backlights
LCDs in TV, computer monitor, laptop, tablet, and cell phone applications utilize different arrangements of BLU components, each optimized for a specific display area, thickness, and cost. Direct illumination, used in some LCD TVs, is illustrated in Fig. 5a. Spatial uniformity is a key challenge for these systems, requiring that the regions between and in front of the light sources be equally bright. This is generally achieved with a thick (2–3 mm) white diffuser plate, set sufficiently far from the lamps that the bulb image can no longer be resolved. In the simplest of systems, this air gap is on the order of the bulb spacing. In order to reduce the gap to make thinner displays, the diffuser plates may have a surface structure that creates multiple images of the light sources, which can reduce the gap thickness by about a factor of 2. Recently, edge illumination schemes using white LEDs have been used to create thin TVs. In some cases, lightguides and their associated LEDs are tiled to enable local light dimming for maximum power efficiency. BLUs for most monitors and many TVs utilize edge illumination. As shown in Fig. 5b, for the case of CCFL sources, single or multiple bulbs are placed along two opposite edges of a thick plastic lightguide. White LEDs in a single row along one or two opposing edges are also quite common. The extraction features on the surface of the lightguide consist of appropriately placed extractor features (such as screen-printed white dots or laser-ablated spots on a flat acrylic slab), increasing in density away from the light



Fig. 5 (a) A typical structure of a television LCD backlight, consisting of CCFL or LED sources and a reflector, covered by a diffuser plate and brightness enhancement films. (b) A monitor backlight system utilizing multiple CCFL light sources along the edge of a lightguide, a rear reflector, diffuser sheet, a prismatic angle recycling film such as BEF, and an additional diffuser or cover sheet. (c) A backlight for a notebook comprised of a slab lightguide, reflector, diffuser sheet, two prism angle recycling films (prism directions are perpendicular to one another), and a top diffuser cover sheet. (d) An alternative notebook backlight utilizing a wedge lightguide and a turning film, followed by a diffuser film. Its output angular distribution tends to be narrower than that shown in (c). (e) A schematic of a backlight found in LCDs for handheld devices consisting of LEDs along the edge of


injection edge. CCFL lamps are covered by a high-reflectivity white or mirrorlike reflector to direct as much light into the lightguide as possible. A white reflective sheet is placed beneath the lightguide to reflect extracted light back toward the viewer. In order to improve uniformity, a diffuser film is placed on top of the lightguide to hide the extractor features. At this point, the BLU output has a broad angular extent with its peak energy away from the axial direction. In order to more efficiently direct the light toward a viewer, a prismatic enhancement film with prisms running across the display is employed that concentrates the light along the vertical direction to achieve higher axial brightness. An increase in light concentration can also be provided by one or more gain diffusers. Reflective polarizers can be added to further increase the BLU efficiency by polarization recycling. In order to further improve uniformity and diminish artifacts such as moiré, a translucent diffuser (known as a cover sheet) may also be included just before the LCP. BLUs for computer notebooks (laptops) use edge illumination. The light source tends to be exclusively white LEDs equally spaced along the bottom edge of the lightguide. The key drivers in utilizing LEDs are energy efficiency, thinness, and elimination of the mercury disposal issues inherent in CCFLs. As was discussed in the previous section, LED sources require a more complex lightguide extractor pattern near the launch edge to achieve spatial uniformity, and the edge of the lightguide may also be structured to help laterally diffuse light to further reduce hot spots. The majority of lightguides for notebooks are plate type (Fig. 5c), with the remainder being V-cut or wedge type (Fig. 5d). The plate-type lightguide systems generally utilize a diffuser sheet just above the lightguide, followed by two crossed prism films, and are often topped with a translucent cover sheet diffuser. The prism films concentrate the light near the axial direction by an angle recycling process. An additional reflective polarizer may be added to enhance the brightness at all angles, either adhered directly to the LCP lower polarizer (Fig. 5e) or combined with a prism enhancement film (Fig. 5f). Thin and bright BLUs are paramount for handheld applications including cell phones and tablets, necessitating the use of edge illumination and white side-emitting LEDs along the lightguide injection edge (Fig. 5g). The BLU construction is similar to that of the LED notebook, absent a cover diffuser sheet (to reduce BLU thickness and cost), and usually with a specular (mirrorlike) reflector below the lightguide instead of a diffusive white reflector. Battery life is critical in these displays. Multilayer specular reflectors can have reflectivities >98 %, improving BLU efficiency by as much as 10 % over white reflectors. Removal of the cover sheet can increase optical efficiency but can lead to the appearance of visual artifacts such as moiré patterns and color nonuniformities from enhancement films. Various schemes are employed to reduce these artifacts, such as matte surfaces, adjustment of prism spacing, and rotation of the prism films about the axial direction. The viewing angle allowed for handheld displays is generally smaller than that for monitors or TVs, which enables the BLU designer to maximize the light-concentrating effect of the prism films to increase axial brightness.
Polarization recycling films are also common in handheld displays and can be laminated just below the LCP polarizer or combined with a prism film.
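The value of a high-reflectivity rear reflector can be estimated with a toy polarization-recycling model. It idealizes the reflective polarizer as lossless, transmitting half of unpolarized light and returning the rejected half to the rear reflector, which sends it back fully depolarized (both strong simplifying assumptions):

# Toy polarization-recycling model: geometric series in 0.5 * R_back.
def recycled_throughput(R_back):
    return 0.5 / (1.0 - 0.5 * R_back)   # sum of 0.5 * (0.5*R)^k over bounces

for R in (0.90, 0.98):
    print(f"rear reflectivity {R:.2f}: throughput {recycled_throughput(R):.3f}")
# 0.90 -> 0.909, 0.98 -> 0.980: moving from a white (~90 %) to a specular
# (>98 %) reflector is worth roughly 8 %, consistent with the ~10 % gain
# quoted above.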

Fig. 5 (continued) a lightguide, a diffuse reflector film, diffuser film, two prism angle recycling films, and a reflective polarizer attached to the LCP. A diffuser cover sheet can be included but is becoming less common. (f) An alternative backlight solution for handheld LCDs, similar to (e), in which a combination of a prism film and reflective polarizer is used, with or without an additional prism film. (g) A backlight similar to that shown in (e) and (f), utilizing a high-efficiency specular reflector instead of a diffuse reflector


System Optimization
The light path in the BLU from the source to the LCP is complex in all of the schemes just discussed, especially when light recycling elements are present to achieve maximum efficiency. In practice, designers utilize ray-trace software, experimentation, and experience to optimize BLU performance. Criteria for illumination brightness, angle distribution, degree of polarization, optical artifacts, cost, and spatial uniformity can vary widely from one LCD manufacturer to another and can depend on the specific market segment (high-end vs. lower-priced models). In many cases, brightness enhancement films are used to reduce the number of light sources, resulting in greater system efficacy and reduced power consumption. In all cases, the BLU designer must account for the interdependency of the light sources, lightguide or cavity, reflector, diffusers, and a variety of enhancement films. The combination of two crossed prism films will redirect light in the axial direction most efficiently if light enters these films near a specific azimuthal and polar angle, which varies with the prism film. System performance tends to improve as the output of the lightguide and diffuser film peaks near this preferred input angle. A BLU may be optimized by adjusting the lightguide extractor shape, the diffuser haze level, or both to achieve this condition. The diffuser nearest the lightguide may also be eliminated in some cases if the lightguide extractor pattern and input edge structure are adjusted to minimize spatial nonuniformity at the launch end, and the remaining light management films are made sufficiently diffusive to achieve overall uniformity. This is an especially important goal for handheld BLUs, in order to reduce total thickness and cost. Full BLU optimization relies on extensive experimentation and experience. Ray-tracing software can also be very useful in maximizing system performance, reducing experimentation, and designing new components, as well as in the analysis of numerous optics issues that appear in these complex systems. Popular commercial software for this purpose includes ASAP (Breault Research), LightTools (Synopsys), and TracePro (Lambda Research). Each of these tools launches rays and calculates reflection, transmission, refraction, absorption, and in some cases diffraction of the ray, repeating this process for typically millions of rays and compiling the results. Often, backlight and component suppliers find it necessary to write specialized software to handle the complex interaction of light sources, lightguide, diffusers, prism films, and reflective polarizers, in order to maximize computation speed and accuracy. Lightguide design can be particularly demanding, given that the extraction surface may contain millions of features that vary over the lightguide surface. The emphasis in this chapter has been on optical performance. Thermomechanical considerations are also important in BLU design, to minimize defects that can occur under various environmental conditions. For example, diffusers, plates, and optical recycling films can become physically distorted with changes in temperature and humidity. Care must be taken to allow sufficient room for expansion due to humidity and temperature, accounting for thermomechanical asymmetries in films, and to prevent local sticking of films in normal use and storage.
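Commercial ray tracers do this bookkeeping for millions of rays, but the core TIR test at each surface is simple. The minimal 2-D sketch below samples rays entering the polished edge of a PMMA slab and confirms that, absent extractor features or a wedge, every refracted ray remains guided:

import math, random

# Minimal 2-D ray bookkeeping for a flat lightguide edge (PMMA, n = 1.49).
# Rays refract at the edge, then hit the top/bottom faces; they escape only
# if their incidence angle at those faces drops below the critical angle.
n = 1.49
theta_c = math.degrees(math.asin(1.0 / n))     # ~42.1 deg

random.seed(0)
trapped = 0
N = 100_000
for _ in range(N):
    theta_in = random.uniform(-89.9, 89.9)               # air-side angle at edge
    theta_2 = math.degrees(math.asin(math.sin(math.radians(theta_in)) / n))
    face_angle = 90.0 - abs(theta_2)                     # angle at top/bottom face
    if face_angle > theta_c:                             # total internal reflection
        trapped += 1

print(f"critical angle = {theta_c:.1f} deg")
print(f"trapped fraction = {trapped / N:.3f}")   # 1.000: all rays are guided

Edge-launched light therefore stays guided until an extractor feature, wedge, or surface defect locally reduces the incidence angle below θc, which is exactly the job of the extraction optics described earlier.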

Conclusion
The majority of LCDs for handheld displays, laptops, monitors, and televisions utilize a backlight containing LED or CCFL light sources and either edge or direct illumination schemes to deliver light efficiently and uniformly across the display area. The optimum design depends greatly on a variety of specifications including thickness, brightness, viewing angle, weight, color gamut, contrast, and cost. Ever-increasing requirements for energy efficiency and the use of more environmentally friendly materials


are challenging BLU component and system manufacturers to continuously enhance these elements and their integration.

Directions for Future Research
Enhancement films, light sources, lightguides, diffusers, and reflectors continue to improve at a steady, though increasingly asymptotic, rate. Combinations of these elements, such as an integrated prism and reflective polarizer, unitary crossed-prism films, higher-contrast reflective polarizers, or the complete integration of all the BLU elements, are potential next steps. Thinner lightguides using film manufacturing methods and increased color gamut through the use of narrowband phosphors, either in the LED or in a diffuser film, are exciting possibilities under exploration. The challenge will be to achieve these goals for each of the display segments in a cost-effective manner, tailored for each display model. Ultimately, BLUs may take more standard formats that will reduce the complexity and manufacturing costs of a truly unitary optical system.

Further Reading
Kobayashi S, Mikoshiba S, Lim S (eds) (2009) LCD backlights. Society for Information Display/Wiley. ISBN 978-0-470-69967-6 Cree Breaks 200 Lumen per Watt Efficacy Barrier. http://www.cree.com/press/press_detail.asp?i=1265232091259 Duggal AR, Shiang JJ, Foust DF, Turner LG, Nealon WF, Bortscheller JC (2005) Large area white OLEDs. SID Symp Dig 36:28–31 ENERGY STAR® Program requirements for televisions Europe and China. Available at http://www.energystar.gov Haitz R, Kish F, Tsao J, Nelson J (1999) The case for a national research program on semiconductor lighting. Optoelectronics Industry Development Association Forum, Washington, DC Hulze HG, deGreef P (2009) Power savings by local dimming on a LCD panel with side lit backlight. In: SID symposium digest of technical papers, June 2009, vol 40, issue 1, pp 749–752 Le Grand Y (1957) Chapter 11: Luminance difference thresholds. In: Light, colour and vision. Chapman and Hall, London Lee J-H, Liu DN, Wu S-T (2009) Introduction to flat panel displays. Wiley, New York. ISBN 978-0-470-51693-5 Liu T, O'Neill M (2008) Increasing LCD energy efficiency with specialty light-management films. Inf Disp 24(11):24 Liu T, Wheatley J, O'Neill M, Sousa ME (2009) Edge-lit hollow backlight using tunable reflective polarizer for liquid crystal displays. In: SID symposium digest of technical papers, June 2009, vol 40, issue 1, pp 819–822 Machover C (1997) In: MacDonald LW, Lowe AC (eds) Display systems: design and applications. Wiley, New York, pp 3–14. ISBN 0-471-95870-0 Park JL, Lim S (2007) LCD backlights, light sources, and flat fluorescent lamps. J Soc Inf Disp 15:1109 TracePro is provided by Lambda Research Corp. http://www.lambdares.com; LightTools is provided by Optical Research Associates at http://www.opticalres.com/lt/ltprodds_f.html; ASAP is provided by Breault Research Organization, Inc. http://www.breault.com/software/software-overview.php


Vikuiti.com web site tutorial. http://solutions.3m.com/wps/portal/3M/en_US/Vikuiti1/BrandProducts/secondary/optics101/
Watson P, Boyd GT (2008) In: Bhowmik AK, Li Z, Bos PJ (eds) Mobile displays: technology and applications. Wiley, New York, pp 211–225. ISBN 978-0-470-72374-6


Optical Enhancement Films
Gary Boyd
Display Materials and Systems Division, 3M Center, St. Paul, MN, USA
Email: [email protected]

Abstract
The optical performance of liquid crystal displays, including axial brightness, viewing angle, spatial uniformity, and color, is improved with the use of optical enhancement films within the backlight. These films also significantly improve energy efficiency and battery life. The structure, operation, and optimal use of these films are described.

List of Abbreviations
BEF  Brightness enhancement film
BLU  Backlight unit
CCFL  Cold cathode fluorescent lamp
Cd/m2  Candelas per square meter
DBEF  Dual brightness enhancement film
LCD  Liquid crystal display
LCP  Liquid crystal panel
OEF  Optical enhancement film
PIA  Preferred input angle
TIR  Total internal reflection

Introduction
Liquid crystal displays (LCDs) require an illumination source, known as a backlight unit (BLU), which is placed behind the liquid crystal panel (LCP). Early backlights for computer notebooks consisted of a single cold cathode fluorescent lamp (CCFL) as a light source placed along the edge of a plastic light guide, which in turn distributed light across the area of the display. A white reflector was placed behind the light guide, and a translucent film was placed above it to diffuse the light and hide the light guide extractor features. Backlights today are considerably thinner and more efficient. Angle recycling films, introduced by 3M as brightness enhancement films (BEF), followed by polarization recycling films (dual brightness enhancement films or DBEF), can improve luminance measured perpendicular to the screen (axial luminance) by greater than 200 %. As a result, such films, which are collectively referred to here as optical enhancement films (OEFs), are found in almost all LCD backlights and are becoming increasingly important as the demand grows for improved energy efficiency and longer battery life (Graf et al. 2007; Boyd 2007).


[Fig. 1 graphics: (a) direct-lit backlight, with light sources in direct view of the user; requires significant diffusion to achieve spatial uniformity. (b) Edge-lit backlight, with light sources placed along the edge; requires a lightguide and extraction mechanism. Both panels show the LCP above the optical enhancement films.]
Fig. 1 Schematics of backlighting schemes for LCDs: (a) direct lighting using light sources distributed over the area of the display and (b) edge lighting with light sources placed along one or more edges of a light guide that spreads light over the display area. Both incorporate diffusive elements and optical enhancement films to maximize efficiency and spatial uniformity

Typical backlight constructions were described in chapter “▶ LCD Backlights.” The two predominant backlight schemes of direct illumination and edge illumination are diagrammed in Fig. 1a, b, respectively, showing the placement of the OEF in the backlight. Angle recycling films such as BEF concentrate the angle distribution toward the axial direction, thereby brightening the image for typical viewing. Polarization recycling films enhance the proportion of linearly polarized light required by the LCP.

Energy Savings and Battery Life Extension
The increase in backlight efficiency resulting from OEF can be utilized by display manufacturers to make displays brighter, reduce the number of light sources (thereby reducing cost), decrease backlight temperature, or reduce energy usage and increase battery life (Liu and O'Neill 2008). As a result, these films are often an integral component in any optimized backlight system. With an increasing mandate to reduce energy usage and provide longer battery life, many LCD manufacturers are using every means possible to translate improved efficiency into power savings. As examples in Liu and O'Neill (2008) demonstrate, enhancement and reflector films increased the measured run time of a mobile entertainment system by as much as 48 %, reduced the backlight electrical power in a laptop by 25 %, reduced the number of CCFLs in a monitor from four to two (saving 6 W of power), and dropped the number of CCFLs needed in a 40 in. LCD television from 20 to 12, while also reducing the temperature of the front panel by 7 °C (saving over 100 W of power).

Backlight Gain
OEFs share a common mechanism of transmitting light of a desired state (in angle distribution or polarization) and recycling light of a state which is less desired (Fig. 2a). The backlight components below the OEF, such as the bottom reflector, transform this light into a mixture of both states, which is returned to the OEF for further transmission, reflection, and recycling, resulting in a greater total transmission of the desired state than in the absence of such recycling. The resulting gain in brightness (G) can be expressed as a function of the axial transmission of the preferred state through the OEF (t), the corresponding reflectance of the less preferred state (R), and a reflectance factor (R′) of the backlight elements below the OEF.

Fig. 2 (a) Illustration of the operation of optical enhancement films (OEF). Rays reflect between the OEF and backlight reflective elements multiple times and are transmitted through the OEF to achieve the desired angle distribution and polarization state. (b) Plot of OEF gain, showing its increase with increasing reflectance of the OEF (R) and backlight elements (R′)

The R′ factor is actually a product of the backlight reflectance and the fraction of the nonpreferred state that is transformed into the preferred state. For angle recycling films, we can consider the case where all angles incident on the film are equally probable (a Lambertian source), and for polarization recycling films, all incident polarization states are assumed equally probable. The sum of each transmission through the enhancement film gives:

$$G = t + RR't + (RR')^2 t + \cdots = t \sum_{i=0}^{\infty} (RR')^i = \frac{t}{1 - RR'} \qquad (1)$$

In the absence of OEF, R = 0 and t = 1, giving G = 1. Any positive change in t, R, or R0 provides an increase in gain. In practice, there is usually a trade-off between t and R, and light absorption creates limits for all three factors. A more detailed analysis is provided in Watson and Boyd (2008). Gain can be measured using a Lambertian light source with a diffuse reflective surface (a white light box, for example), as the ratio of axial luminance with and without the OEF. The factor R0 will be dependent on many details of the reflective surface, including absorption, angle scattering, and


polarization rotation. It can be deduced from a measurement of G and an estimate of R and t of the OEF, using optical modeling. As an example, for certain prism and reflective polarizer films, ray trace calculations may show R ≈ 0.5 and t = 0.9. A measurement of G = 1.6 for a particular BLU or diffuse light source then implies R′ = 0.875 from the above equation. To then achieve a gain of 1.7, an enhancement film with R = 0.538 is required, all else being equal. In Fig. 2b, G is plotted for a fixed value of t as a function of R and R′. The need for efficient reflectors in the BLU (high R′) is evident from the plot, in order to best utilize an OEF for brightness improvement.
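The worked example above can be reproduced directly from Eq. 1. The following is a minimal sketch in Python; the function and parameter names are our own, not from the chapter:

# Recycling-gain model of Eq. 1: G = t / (1 - R * R'), using the worked
# example values from the text (t = 0.9, R ~ 0.5, measured G = 1.6).

def gain(t, R, R_prime):
    """Axial gain for preferred-state transmission t, less-preferred-state
    reflectance R, and effective backlight reflectance R_prime."""
    return t / (1.0 - R * R_prime)

def backlight_reflectance(G, t, R):
    """Invert Eq. 1 to deduce R' from a measured gain G."""
    return (1.0 - t / G) / R

def required_R(G_target, t, R_prime):
    """Film reflectance R needed to reach a target gain, all else equal."""
    return (1.0 - t / G_target) / R_prime

r_prime = backlight_reflectance(G=1.6, t=0.9, R=0.5)
print(f"R' = {r_prime:.3f}")                                   # -> 0.875
print(f"R for G = 1.7: {required_R(1.7, 0.9, r_prime):.3f}")   # -> 0.538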

Cascading Enhancement Films
The addition of optical enhancement films to a BLU increases axial brightness, but with somewhat diminishing improvement as the number of films is increased. Each enhancement film changes the angular distribution or polarization state of light for the film above it, which in turn alters the effective R′ below each film. One way to deduce R and t for a combination of OEFs is to calculate gain by ray tracing for various values of R′, then fit the results to the gain equation. Figure 3a illustrates the calculated gain of two different prism films (P1, P2, used for angle recycling) and a reflective polarizer (RP, used for polarization recycling) as a function of the reflectivity R′ of a Lambertian source. Using this technique, one finds that a typical range is R′ = 0.8–0.9. The gain for various combinations of OEF in a stack is shown in Fig. 3b for R′ = 0.86.

Fig. 3 (a) Gain plotted as a function of backlight system reflectivity R′ for two types of angle recycling prism films (P1, P2) and a polarization recycling reflective polarizer (RP). (b) A comparison of gain for various combinations of OEF. The notation A/B refers to element A placed over element B

Using films P1, P2, or RP alone provides a gain increase of 67 %, 69 %, or 84 %, respectively, while pair combinations increase gain by approximately another 50 %, and using all three provides an additional increase of 15–20 %.
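The fitting procedure mentioned above (computing gain at several values of R′ and fitting to Eq. 1) can be sketched as follows; the sample data here are illustrative stand-ins for ray-trace output, not values from the chapter:

import numpy as np
from scipy.optimize import curve_fit

# Fit effective t and R of a film stack to Eq. 1 from gains computed at
# several backlight reflectivities R'. Sample data are synthetic.

def gain_model(r_prime, t, R):
    return t / (1.0 - R * r_prime)

r_prime = np.array([0.70, 0.75, 0.80, 0.85, 0.90])
gains = gain_model(r_prime, 0.90, 0.50)  # stand-in for ray-trace results

(t_fit, R_fit), _ = curve_fit(gain_model, r_prime, gains, p0=(1.0, 0.3))
print(f"fitted t = {t_fit:.3f}, R = {R_fit:.3f}")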

Angle Recycling Enhancement Films
A typical angle recycling OEF consists of a set of parallel prisms, each with an apex angle of 90°. When placed on top of a diffuse light source, the axial brightness can be increased by approximately 60–80 %, depending on the details of the prism film and the reflectivity of the backlight system. The primary methods to manufacture prism films are extrusion through a die (typically using polycarbonate) or curing of a resin pressed between a mold and a transparent substrate. In resin curing, a typical substrate is polyethylene terephthalate, and the curing is performed using ultraviolet light sources and photo-initiated acrylates. Both methods have their strengths and weaknesses in cost, performance, and quality. Molds are usually created by precise diamond turning methods to create smooth facets and well-controlled angles to optimize and control the film performance.

The basic operation of a prism film OEF is to transmit light from a specific range of incident angles toward the axial direction, while reflecting rays incident at other angles back toward the backlight. The reflected rays eventually return at the correct incident direction to exit the film near axial, thereby redirecting most rays toward the axial direction. This mechanism can be understood in more detail by examining the light rays in the plane of the cross section of the prisms. At the extreme incident angle (90° in Fig. 4a), light exits the prism at an angle referred to as θcutoff, which is the maximum exit angle of the central bundle of light from the film. For a right-angle prism of index nprism:

$$\sin(2\theta_{\mathrm{cutoff}}) = 1 - n_{\mathrm{prism}}^{2} + 2\sqrt{n_{\mathrm{prism}}^{2} - 1} \qquad (2)$$

Fig. 4 Basic ray diagrams for a cross section of a prism OEF. (a) Rays entering the flat surface of the film at the preferred input angle θPIA exit the prisms in the axial direction. The cutoff angle θcutoff defines the maximum viewing angle from a single prism film when viewing across the prisms. (b) Rays incident at angles between the TIR limits of the prisms are reflected back into the backlight. (c) Just beyond the TIR incidence angles, rays exit the prisms and can reenter the adjacent prism to return to the backlight. (d) At still higher angles of incidence, light can exit through the facets, creating high-angle lobes

For a typical 3M BEF, θcutoff ≈ 32–37°, increasing as the prism index decreases. Also of interest is the angle of entry, referred to as the preferred input angle (PIA), at which light exits toward the axial direction. The PIA of a right-angle prism can be derived from Snell's law:

$$\sin\theta_{\mathrm{PIA}} = \tfrac{1}{2}\left(\sqrt{2 n_{\mathrm{prism}}^{2} - 1} - 1\right) \qquad (3)$$
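As a quick numerical check of Eqs. 2 and 3 (a sketch; the two index values are typical assumptions, not taken from the text):

import math

def cutoff_angle(n_prism):
    # Maximum exit angle of a 90-degree prism film, Eq. 2, in degrees.
    s = 1.0 - n_prism**2 + 2.0 * math.sqrt(n_prism**2 - 1.0)
    return 0.5 * math.degrees(math.asin(s))

def preferred_input_angle(n_prism):
    # Preferred input angle of a right-angle prism, Eq. 3, in degrees.
    return math.degrees(math.asin(0.5 * (math.sqrt(2.0 * n_prism**2 - 1.0) - 1.0)))

for n in (1.59, 1.65):  # assumed indices, e.g., polycarbonate vs. a higher-index resin
    print(f"n = {n}: cutoff ~ {cutoff_angle(n):.1f} deg, PIA ~ {preferred_input_angle(n):.1f} deg")

Both trends match the text: θcutoff falls and θPIA rises as the prism index increases.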

The PIA depends only on the prism refractive index and is typically about 30–34°, increasing with increasing prism index. Rays that enter the prism film over a wide angular range exit the film over a much narrower angle range, due to refraction at the lower planar surface and prism facet, leading to a concentration of the light in the axial direction. At lower incidence angles, rays are refracted through the prism facet and reflected twice internally, back toward the backlight. Over a range of angles about the normal, light is totally internally reflected without transmission (Fig. 4b). At higher (positive) angles, light can escape through an adjacent facet and enter a neighboring prism, where it is reflected again to the backlight (Fig. 4c). These paths are the basic mechanism of recycling for a prism film. At still higher positive angles, light exits an adjacent facet and escapes above the other prisms to create high-angle lobes (Fig. 4d). For a Lambertian source, approximately 60 % of the incident light is reflected, and about 5 % of the incoming rays are directly transmitted to the lobe angles.

In most handheld and notebook displays, two angle recycling enhancement films are utilized, one on top of the other, with their respective prism directions perpendicular to one another (crossed films). Conoscopic views (Fig. 5) show the effect of single and crossed films on the output angular distribution, for two types of input angle distributions: a Lambertian source and a light guide plus diffuser. In the case of the Lambertian source, a single prism film concentrates the light in the direction perpendicular to the prism direction. The cutoff angle of 35° is evident in these false color images, as are the higher angle lobes. When a second prism film is crossed with the lower film, light is concentrated along both axes for maximum axial luminance. For a typical input from a light guide and diffuser, the first prism film once again confines light perpendicular to the prism direction, in addition to moving the peak closer to the display normal (center of the diagram). Using crossed prism films concentrates the light near the normal, resulting in a pattern quite similar to that using the Lambertian source. Thus, crossed angular recycling films provide high axial brightness for a wide range of light source inputs. If the backlight reflectivity R′ is reduced to 0, as in a simulation depicted in Fig. 6, the luminance drops considerably, while the angle distribution remains essentially the same. This demonstrates that the angular distribution is primarily dictated by the prism transmission t, which is simply multiplied by the 1/(1 − RR′) factor by the recycling process.

Many parameters affect the gain of a prism film, including the prism apex angle, refractive index, tilt angle, the radius of curvature of the prism tip and valley, the spacing between prisms, film absorption, light scattering within or from the surface of the film, and any perturbations in the prism structure. These dependencies have been examined in detail by film manufacturers, with the goal of continuing to improve and tailor performance for backlight optimization (Graf et al. 2007; Boyd 2007). The maximum axial gain is achieved (for a Lambertian light source) when the prisms are symmetric and linear, with a 90° apex angle and the sharpest possible tip and valley. Altering the apex angle from 90° to 100° can, for example,

[Figs. 5 and 6: conoscopic (false color) plots, theta limit 80°, of the source input angle distribution and the output angular luminance distributions for single and crossed prism films, for a Lambertian source and for a light guide plus diffuser.]
[Fig. 21 graphic: alternating birefringent (A) and isotropic (B) polymer layers; the annotation indicates nxB = nyB = nzB = nzA.]

Fig. 21 Basic structure of a multilayer reflector, consisting of alternating layers of birefringent polymer A and isotropic polymer B. Arrows within the layers indicate the relative size of the refractive indices for light polarized in those directions. In this particular case, the refractive indices are matched in polymer A in the x and y directions and mismatched with the indices in B to achieve high reflectivity

[Fig. 22 graphic: reflectance vs. wavelength, 380–780 nm, at incidence angles from 0° to 80° in 10° steps.]
Fig. 22 Reflectance spectrum of Enhanced Specular Reflector (3M ESR) for various incidence angles. The multilayer film retains high reflectivity over the visible range at all angles

Optical Artifacts of Enhancement Films
A backlight unit must minimize any visual nonuniformity that might disturb the display image. While OEFs provide a means to maximize BLU efficiency, care must be taken not to also introduce optical artifacts. These include pixel moiré, reflective moiré, color bands, warping, and white spots, each with its own root cause and solutions.

Moiré is a general phenomenon occurring whenever two or more regular patterns overlap. Pixel moiré appears as a faint set of alternating bright and dark bands in the display and results from a spatial interference, usually between the topmost prism film and the LCP pixels. Prism films can produce a high contrast line set via total internal reflection when viewed near their cutoff angle, while an LCP has a


regular array of electrode lines along the horizontal and vertical directions. Methods used to mitigate this effect include adjustment of the prism film pitch to nonintegral multiples of the pixel pitch, rotation of the prism film to reduce the moiré band spacing below visual resolution, or the addition of a diffusive element between the prism film and LCP, such as a diffuse adhesive for the polarizer. The effect can also be somewhat muted by a randomization of the prism structure.

Reflective moiré resembles a faint wood grain pattern and occurs when a line set from a prism film spatially interferes with its own distorted reflection. The reflection of a top prism film can arise from the planar surface of a lower film, or from an RP placed above or below it. The reflection is distorted because the prism film and reflecting surface are not strictly parallel. Solutions include using a diffuse element above the prism film, diffusing the reflective surface with a matte finish, or laminating the prism film to the element with a reflective surface to keep both parallel at all points, since identically overlapping line sets do not form a moiré pattern. The matte surfaces on the backside of prism enhancement films are most commonly employed to reduce reflective moiré.

Color bands can result from refractive index dispersion in prism films, birefringence dispersion in prism film substrates, and spectral leaks in multilayer films. Most often these color artifacts can be eliminated by diffusive elements (to induce color mixing) or, in the case of an RP, by reduction of index mismatch in the film z direction and improved layer thickness control.

Warping or physical distortion of films can result in nonuniformities in brightness enhancement, leading to visible bright or dark bands. Such distortion may occur after changes in temperature and humidity, as a result of OEF shrinkage or expansion combined with localized constraints. Solutions include allowing space for movement of the films, proper conversion and tab placement to minimize distortion, use of thicker films, or reduction of local pinning by surface modification.

White spots can occur if the tips of a prism film are abraded during assembly or use, or may result from local optical contact between the prism tips and an adjacent polymer film. Such contact can lead to light leakage through the prism, producing a local bright spot. Solutions to these issues include utilizing durable (less brittle) resins for prisms, rounding the tips, and slightly adjusting the prism heights to minimize total contact area with a top film.

Durability in mobile displays is often tested by repeated local pressure, to simulate use of a touch panel, or by dropping a steel ball from a certain height, to simulate dropping of a phone or sudden impact. Backlight elements such as prism films and light guides can be especially sensitive to these tests. The ball drop test, for example, can result in local deformation of a prism, which can reduce the axial brightness and increase off-axis brightness, creating a spot in the damaged region which may appear bright or dark at certain angles. Methods to reduce this effect include the use of soft or tough polymers that quickly recover their shape after impact, rounding prism tips, or utilizing a matte coating to reduce the contrast of the damaged region. The lightguide might receive localized impressions from the backside of a diffuser film, creating unwanted light extraction.
Care must be taken to minimize the density and severity of these impressions by rounding the features on the underside of diffuser films. Each of these artifacts can potentially degrade display performance. Their respective solutions must be balanced with the goal of maximizing backlight efficiency economically.
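For the simplest case of two parallel line sets, the moiré band spacing is the spatial beat period p1·p2/|p1 − p2|, which is why pitch adjustment and film rotation are effective. A minimal illustration (the pitch values are assumptions, not from the chapter):

def moire_period_um(p1_um, p2_um):
    # Beat period of two overlapped parallel line sets with pitches p1, p2.
    return p1_um * p2_um / abs(p1_um - p2_um)

print(f"{moire_period_um(50.0, 55.0):.0f} um")   # close pitches: long, easily visible bands
print(f"{moire_period_um(24.0, 55.0):.1f} um")   # well-separated pitches: fine bands below visual resolution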

Conclusion
Optical enhancement films employ angle and polarization recycling optics that significantly increase backlight efficiency in LCDs, which may be used to reduce energy consumption, increase battery life, or improve brightness. Additional benefits include the management of the angular distribution of illumination and improved spatial uniformity. Angle recycling films utilize surface protrusions such as prisms that preferentially transmit light toward the axial direction, while polarization recycling films transmit light of the polarization required by the LCP.


Both film types recycle the reflected light, which is transformed by the backlight elements into a state that is readily transmitted. High-efficiency reflectors are an important part of the recycling process. The basic optical principles and manufacturing methods for these films were described. The choice of which combination of OEF to use must ultimately depend on balancing specifications for brightness, power, uniformity, display thickness, and component cost.

Directions for Future Research
Mobile displays steadily demand higher resolution (usually resulting in reduced LCP transmission), shrinking thickness, and longer battery life. LCD televisions and monitors are rapidly converting to LED-based illumination and require maximum efficiency to reduce component costs and energy consumption. A trend is anticipated toward increased integration of the BLU components to simplify manufacturing and reduce system cost, possibly leading to a single optic that will redirect light from the light sources efficiently and uniformly across the LCP area. In cell phones, tablets, laptops, and monitors, this integration may include the light guide element, reflector, enhancement optics, and polarizer in a single unit. A key advantage of such backlights will be improved durability. Thinner direct-lit backlights for televisions will require improved uniformity-enhancing components as well as brightness enhancement. Low cost must be maintained for any new backlighting scheme.

Further Reading
Ahn SW, Lee KD, Kim JS, Kim SH, Park JD, Lee SH, Yoon PW (2005) Fabrication of a 50 nm half-pitch wire grid polarizer using nanoimprint lithography. Nanotechnology 16:1874–1877
Anandan M (2008) Progress of LED backlights for LCDs. J Soc Inform Display 16(2):287–310
Boyd GT (2007) Optical films for LCD backlights. Seminar lecture notes, session M-4, Society for Information Display, 18 May 2008, ISBN: 0887-915X
Broer DJ, Lub J, Mol GN (1995) Wide-band reflective polarizers from cholesteric polymer networks with a pitch gradient. Nature 378:467–469. doi:10.1038/378467a0
Graf J, Olczak G, Yamada M, Coyle D, Yeung S (2007) Backlight film and sheet technologies for LCDs. Seminar lecture notes, session M-12, Society for Information Display, 20 May 2007, ISBN: 0887-915X
Information regarding diffusive polarizers may be found at http://www.3m.com/product/information/Vikuiti-Diffuse-Reflective-Polarizer-Film.html. Accessed 2011
Kobayashi S, Mikoshiba S, Lim S (2009) LCD backlights. Wiley, New York. ISBN 978-0-470-69967-6
Liu T, O'Neill M (2008) Increasing LCD energy efficiency with specialty light-management films. Inform Display 24(11):26
Watson P, Boyd GT (2008) Backlighting of mobile displays. In: Bhowmik AK, Li Z, Bos PJ (eds) Mobile displays. Wiley, Chichester, pp 219–223
Weber MF, Stover CA, Gilbert LR, Nevitt TJ, Ouderkirk AJ (2000) Giant birefringent optics in multilayer polymer mirrors. Science 287:2451–2456


LCD Processing and Testing
Yoshitaka Yamamoto

Contents
Introduction
The LCD Structure and Total Process Flow
TFT Fabrication Technology
Metal Wiring and Electrode
Insulators and Semiconductor
PE-CVD Technology
Liquid Crystal Cell Fabrication Process
Fabricating Color Filters
Liquid Crystal Cell Process and ODF Technology
Glass Scribing and Polarizer Assembling
Module Assembling
Assembling Driver LSIs
Backlight
Test and Repair Technology
Test and Measurement
Repair Technology
Conclusion
Further Reading

Abstract

The TFT-LCD technology is based upon semiconductor IC fabrication processing. Its unique point is that it uses a glass substrate instead of the conventional Si wafer. For the TFT fabrication process, thin-film formation technologies such as CVD, sputtering, and film coating on glass substrates are important. In the assembling of the color filter and TFT substrates, photo spacer and ODF technologies have been developed and applied for large-size LCDs. The light source of the backlight is being replaced from CCFL to LED. Test and repair technologies have been essential for stable production. As described in this chapter, these technologies contribute to realizing good yield in large-size display fabrication.

Y. Yamamoto, Display Technology Laboratories, Corporate R&D Group, Sharp Corporation, Tenri, Nara, Japan. e-mail: [email protected]

List of Abbreviations

ACF  Anisotropic conductive film
AOI  Automatic optical inspection system
a-Si  Amorphous silicon
BM  Black matrix
CCFL  Cold cathode fluorescent lamp
CF  Color filter
CDO  Critical dimension and overlay measurement
COF  Chip on film
COG  Chip on glass
CVD  Chemical vapor deposition
FHG  Fourth harmonic generation
FPC  Flexible printed circuit board
ITO  Indium tin oxide
LCD  Liquid crystal display
LED  Light-emitting diode
MVA  Multi-domain vertical alignment
NTSC  National Television System Committee (body that develops television standards)
OD  Optical density
ODF  One drop fill
OLB  Outer lead bonder
PE-CVD  Plasma-enhanced chemical vapor deposition
PET  Polyethylene terephthalate
p-Si  Polycrystalline silicon
PVA  Polyvinyl alcohol
SiNx  Silicon nitride
TAB  Tape-automated bonding
TAC  Triacetyl cellulose
TFT  Thin-film transistor
THG  Third harmonic generation
μe  Electron field-effect mobility
YAG  Yttrium aluminum garnet

Introduction
The thin-film transistor liquid crystal display (TFT-LCD) market has been expanding into the mobile phone, digital camera, game, PC, and TV areas. LCD-TV display size has been enlarged to 60 in. diagonal. These


applications are supported by device and fabrication technologies, such as higher resolution, wider viewing angle, and lower cost in manufacturing. In this chapter, TFT-LCD fabrication process and testing technologies are discussed. This chapter consists of five parts as follows:
1. LCD structure and total process flow. This part is useful for understanding the structure of the LCD fabrication process.
2. TFT fabrication technologies. The purpose of this section is to discuss TFT fabrication process technologies and equipment.
3. Liquid crystal cell fabrication process. In this section, color filter technology and cell fabrication processes are discussed.
4. Module assembling. The LCD module consists of an LCD cell, electric components, and a backlight. This part shows how the components are assembled.
5. Test and repair technology. Testing and repair technologies are important for large-size TFT-LCDs to attain a good and stable yield. In this part, test and repair technologies for TFTs, color filters, and LCD cells are discussed.

The LCD Structure and Total Process Flow
Figure 1 shows the pixel structure of an LCD-TV. One pixel consists of three sub-pixels, each having one TFT (thin-film transistor) for switching signals. Each sub-pixel's position coincides with a red, green, or blue (RGB) color filter sub-pixel. HD-TVs have about 2 million pixels. TFTs have a layered structure and are formed on a glass substrate. The manufacturing process of TFTs is explained in the following paragraphs.

Figures 2 and 3 show the TFT-LCD process flow and LCD module structure, respectively. The TFT substrate and the color filter substrate are aligned precisely together, leaving a gap between the two that is filled with liquid crystal material. The TFT transmits an electric signal to the liquid crystal, which in turn modulates the amount of light penetration accordingly. A specific color is given to the penetrating light by the color filter. After the LCD cell is assembled, polarizer films are affixed onto both sides of the LCD cell. To improve the optical performance of the LCD, an optical film, such as a retardation film or a front diffuser, may be attached. Driver LSIs are mounted onto the FPC (flexible printed circuit board) and attached to the LCD cell. The backlight module is attached to the LCD cell, and they are inserted into a metal frame. After the final test, the module is shipped to customers.

[Fig. 1 graphics: (a) pixel and RGB sub-pixel layout, (b) TFT structure showing gate, source, and drain electrodes, semiconductor, and transparent electrode, and (c) A–B cross-sectional view of the TFT on the glass substrate.]
Fig. 1 The pixel structure of TFT-LCD

[Fig. 2 flow: TFT substrate and CF (color filter) substrate fabrication → liquid crystal cell process (assembling the TFT and CF substrates; liquid crystal material injection) → attachment of polarizers, FPC, and driver LSIs → attachment of the backlight unit, setting in the vessel, and final test → shipment.]
Fig. 2 TFT-LCD process flow

[Fig. 3 graphic: exploded LCD module showing the metal vessel, optical film, polarizers, CF substrate, liquid crystal, TFT substrate, FPC, driver LSI, light source, light guide, and backlight.]

Fig. 3 The structure of an LCD module

TFT Fabrication Technology
TFT processes are derived from LSI technology, but require certain modifications. The reason is the substrate: in the TFT fabrication process, we have to handle thin, large glass substrates. Table 1 shows the history of glass substrate size over ten generations. The biggest size for TFT fabrication is G10, which is 2,850 × 3,050 mm. That is 73 times larger in area than the G1 size used in the early stages of TFT production. Additionally, substrate thickness has been reduced from 1.1 mm to 0.7 mm. This history shows the remarkable progress in TFT fabrication technologies over the last 23 years.

For large-size TVs, a-Si (amorphous silicon) TFT is widely used, because the fabrication process is short and simple. The a-Si TFT technology started from early research on applications of a-Si material. In 1975, Professor Spear of Dundee University and his group showed that a-Si thin film deposited by glow discharge chemical vapor deposition (CVD) has good semiconductor characteristics (Spear and Le Comber 1975). In 1979, they published a paper describing early TFT performance (Le Comber et al. 1979), which caused a sensation among researchers. In 1981, A.J. Snell et al. showed that a-Si TFT devices have considerable potential as the switching element in LCD panels (Snell et al. 1981). This discovery became the base of the current a-Si TFT technology (refer to chapter "▶ Hydrogenated Amorphous Silicon Thin Film Transistors (a-Si:H TFTs)" for more details). On the other hand, p-Si (polycrystalline silicon) TFT is mainly applied to small-size functional integrated display applications, because peripheral circuits such as monolithic drivers or power supply circuits and functional circuits such as sensors or memory can be fabricated monolithically by this technology. Figure 4 shows the comparison of a-Si and p-Si. In a-Si, atoms are arranged with no regularity in the long range.

Table 1 History of TFT-LCD glass size

Generation  Size (mm)       Area (m²)  Production
G1          300 × 400       0.12       1987
G2          360 × 465       0.17       1991
G3          550 × 650       0.36       1995
G4          730 × 920       0.67       2000
G5          1,100 × 1,300   1.4        2002
G6          1,500 × 1,850   2.8        2003
G8          2,160 × 2,460   5.3        2006
G10         2,850 × 3,050   8.7        2009

[Fig. 4 graphic: conceptual comparison of silicon materials and TFT performance (μe): a-Si, amorphous state, ~0.5 cm²/Vs (low electron mobility); p-Si, an aggregation of small crystal grains, ~100 cm²/Vs (high electron mobility).]
Fig. 4 The comparison of a-Si and p-Si
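As a quick arithmetic check of Table 1 (a sketch; the sizes are taken from the table):

# Panel area in m^2 from the Table 1 sizes, and the G10/G1 area ratio
# quoted in the text as ~73x (the exact value is ~72.4x).

sizes_mm = {"G1": (300, 400), "G5": (1100, 1300), "G10": (2850, 3050)}

for gen, (w, h) in sizes_mm.items():
    print(f"{gen}: {(w * h) * 1e-6:.2f} m^2")

print(f"G10/G1 area ratio = {(2850 * 3050) / (300 * 400):.1f}")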

It has no crystal-like characteristics. Because of many defects and unpaired electrons in the Si material, the electron field-effect mobility (μe) is limited to around 0.5 cm²/Vs. That performance is three orders of magnitude behind single-crystal silicon LSI. This is the reason why the a-Si TFT application is limited to switching transistors. The p-Si material is an aggregation of small crystal silicon grains with an average size of ~300 nm. As a result, μe is about 100 cm²/Vs, which is 200 times that of a-Si TFTs and near the characteristics of single-crystal silicon LSI. Driver circuits, power supply circuits, or sensor circuits are realized by p-Si TFTs. A more detailed description of this topic will be given in chapter "▶ Polycrystalline Silicon Thin Film Transistors (Poly-Si TFTs)."

Figure 5 shows the structure of the a-Si TFT. The a-Si TFT is composed of gate electrode, gate insulator, semiconductor, and source/drain electrodes. This structure is called the "inverted-staggered structure": it has a gate electrode under the semiconductor. In this structure, the gate insulator, a-Si, and n+Si layers are deposited sequentially without breaking the vacuum. This process step is called the "three-layer deposition step." The number of thin-film deposition steps is five, consisting of gate electrode, three-layer deposition, source/drain electrode, passivation, and transparent electrode.

[Fig. 5 graphic: inverted-staggered a-Si TFT cross section on glass, showing the gate electrode, gate insulator, semiconductor (a-Si), low-resistivity Si (n+Si), source and drain electrodes, and transparent electrode.]
Fig. 5 The structure of a-Si TFT

[Fig. 6 flow. Total process: gate electrode (sputtering) → three-layer deposition (SiNx\a-Si\n+Si) (PE-CVD) → source/drain electrode (sputtering) → transparent electrode (sputtering) → liquid crystal cell process. Unit process (per layer): thin-film deposition → photoresist coating → exposure and development → etching → resist removal and cleaning.]
Fig. 6 The process flow of a-Si TFT

Figure 6 shows a typical a-Si TFT process flow (the passivation layer is not shown in the figure). The process unit consists of thin-film formation, photoresist coating, exposure and development, etching, photoresist removal, and cleaning. The process unit is repeated while changing the thin-film materials.


Metal Wiring and Electrode
Sputtering is one type of physical vapor deposition technology (refer to chapter "▶ Indium Tin Oxide (ITO): Sputter Deposition Processes" for more details). It has been used for metal thin-film deposition: metal material on a target is evaporated by collision of Ar ions. In the TFT process, electrodes such as the gate, source/drain, or pixel electrodes are deposited by sputtering. To meet the requirements of large-size TFTs, low-resistivity materials have been developed. In the early stages of mass production, Ta or W was used for its thermal and chemical durability. Currently, these materials have been replaced by Al due to its low resistivity, about 3 μΩ·cm. Pure Al has an issue of hillocks. A hillock is roughness that appears on the surface of the metal film upon exposure to temperatures in the range of 250–400 °C, due to changes in the microstructure of the amorphous metal. One solution to this problem is alloying with other metals.

Cu is a promising material for high-performance LCDs (Colgan et al. 1996; Sirringhaus et al. 1996). Its resistivity is around 2 μΩ·cm. However, Cu has some issues in the TFT fabrication process:
• Poor adhesion to the substrate.
• Cu atoms move easily in Si material and can cause instability of the transistor properties.
To solve these problems, metals such as Ti or Mn have been applied as a blocking layer. Using such a layer, diffusion of Cu atoms into the Si layer is prevented. But the double-layer structure results in high sheet resistance and poor ohmic characteristics. To solve this problem, a new Cu material has been proposed (Koike et al. 2007). It contains a small amount of Mn (2 %) in the Cu metal. Mn atoms move to the surface during heat treatment (200–300 °C), forming a thin barrier layer that prevents further diffusion of Cu atoms. The good feature of this structure is that the Mn layer is very thin (1–2 nm), which is nevertheless enough for a good barrier property.

For transparent electrodes, ITO (indium tin oxide) is the common material. ITO has the unique characteristics of high light transmittance and low resistivity. Typical values of sheet resistance and transparency are 5 to several tens of Ω/square and 80–95 %, respectively. ITO film is deposited by the DC sputtering method. The ITO target is prepared by sintering a mixture of 5–10 % SnO2 and In2O3. Usually an in-line-type sputtering machine is used for ITO deposition. Transparency and resistivity are controlled by the deposition conditions.
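The quoted resistivities translate into sheet resistance through Rs = ρ/t. A minimal sketch (the film thicknesses are illustrative assumptions, not values from the text):

def sheet_resistance(rho_uohm_cm, thickness_nm):
    # Rs (ohm/square) from resistivity (uOhm*cm) and film thickness (nm).
    rho_ohm_m = rho_uohm_cm * 1e-8   # 1 uOhm*cm = 1e-8 Ohm*m
    return rho_ohm_m / (thickness_nm * 1e-9)

for name, rho, t_nm in (("Al", 3.0, 300), ("Cu", 2.0, 300)):
    print(f"{name}: {sheet_resistance(rho, t_nm):.2f} ohm/sq at {t_nm} nm")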

Insulators and Semiconductor
PE-CVD (plasma-enhanced chemical vapor deposition) is one of the most popular thin-film deposition methods. It is based upon a combination of plasma and


thermal energy. Usually, 13.56 MHz RF power is applied to a SiH4 gas feed at around 300 °C. The three layers – gate insulator (SiNx), semiconductor (a-Si), and low-resistivity Si layer (n+Si) – are deposited continuously in one PE-CVD chamber. The gate insulator requires high dielectric strength (resistance to electrical breakdown) and durability against chemical agents. The semiconductor (a-Si) has stable electrical performance without high-temperature heat treatment. The phosphorus-doped silicon (n+Si) realizes a low-resistivity interlayer between the semiconductor and the metal electrode. The three-layer continuous deposition technology eliminates impurities and particles from the interface between the gate insulator and semiconductor layers.

Gate Insulator
The gate insulator, SiNx, is one of the essential elements of thin-film transistors. To obtain a high yield, a thicker gate insulator is preferable; at the same time, a thin gate insulator is better for producing high-performance TFTs. Hence, the gate insulator thickness results from a trade-off between yield and performance. From a reliability point of view, fixed charges and carrier injection into the gate insulator are important issues. Injected carriers change the threshold voltage. To improve the stability of the transistor characteristics, the gate insulator should be strengthened against carrier injection. To improve the quality of the gate insulator, accurate control of the atomic ratio Si/N in the SiNx film is important. The main control factors are the SiH4/N2 gas flow ratio and the deposition temperature.

Semiconductor Film
To achieve high-performance electrical characteristics and better reliability of the TFT, it is very important to reduce the defect density in the semiconductor as well as achieve a low interface state density between the gate insulator and semiconductor. High-density (~10 %) hydrogen atoms in the a-Si film terminate dangling bonds (unpaired atomic bonds) in the semiconductor layer. The continuous deposition method plays an important part in preventing particles from entering the gate insulator or semiconductor films, thus maintaining a low interface state density.

Low-Resistivity Si Film
The contact resistance between the semiconductor and the metal electrode should be sufficiently low. Highly P-doped a-Si (n+Si) offers low resistivity to satisfy that requirement. In addition, microcrystalline n+Si shows lower resistivity than amorphous n+Si: the typical resistivity of amorphous n+Si and microcrystalline n+Si is 20–50 Ω·cm and 0.2–3.0 Ω·cm, respectively.

[Fig. 7 graphic: PE-CVD vacuum chamber with a 13.56 MHz power supply, upper (gas shower) electrode, SiH4/SiH2/Si plasma species, heated substrate stage, and exhaust.]

Fig. 7 PE-CVD equipment

PE-CVD Technology
Figure 7 shows a conceptual image of PE-CVD equipment, in this example for a-Si film deposition. SiH4 gas is introduced into the reactor. The chamber needs to be isolated from the atmosphere, as SiH4 is a combustible gas and can be easily ignited when mixed with air. The SiH4 gas is injected from fine holes in the upper electrode and spreads into the reaction chamber. The gas decomposes into a plasma under the applied high-frequency electric field. Radicals in the plasma (such as SiH2, SiH3, etc.) are deposited on the glass substrate surface, which is heated to about 320 °C. To get good uniformity of the deposited film thickness and quality, control of the thermal uniformity, pressure, and gas concentration is very important. However, due to improper deposition conditions or insufficient equipment maintenance, particles can fall from the chamber wall onto the glass substrate; that is another important issue for CVD technology.

For large-size CVD equipment, a fundamental problem has been arising in the uniformity of film thickness and quality as a result of the nonuniformity of the RF power density distribution. This problem was studied by Lieberman et al., who presented a detailed theoretical treatment of a cylindrical parallel-plate capacitively coupled plasma reactor model (Fig. 8) (Lieberman et al. 2002). The "standing wave effect" causes a nonuniform RF power density distribution, highest in the center and gradually decaying toward the edge. The standing wave effect depends on the frequency of the electric field; its influence on mass production equipment becomes an essential problem in G8 or higher (Takehara et al. 2004; Takehara 2005; Sun et al. 2004).

[Fig. 8 graphic: CVD chamber with upper and lower electrodes; the model parameters are the electrode half spacing L, sheath width s, and electrode radius R.]
Fig. 8 Theoretical modeling for standing wave

Table 2 Electrode size and λc
Generation  R (m)  λc (m)
G5          0.9    5.7
G6          1.2    8.0
G8          1.6    11.0
G10         2.1    14.0

Table 3 Frequency and λ0
Frequency (MHz)  λ0 (m)
13.56            22.1
27.12            11.1
40.70            7.4
60.00            5.0

The critical wavelength λc at which the standing wave effect appears is computed as follows:

$$\lambda_c = 2.6\,(L/s)^{1/2}\,R \qquad (1)$$

where L is the half spacing between the electrodes, s is the plasma sheath width, and R is the parallel electrode radius. When the wavelength of the power supply, λ0, comes close to λc, the standing wave effect influences the plasma distribution. For example, λ0 is about 22.1 m for PE-CVD at 13.56 MHz. For conventional CVD equipment, L and s are 10 mm and 1.5 mm, respectively. From Eq. 1, λc = 11.0 m in G8 and λc = 14 m in G10. Table 2 shows the results of the calculations. It shows that we have to consider the standing wave effect when using large PE-CVD chambers, because the nonuniformity effects of the standing wave are not negligible. Moreover, because the influence of the standing wave becomes critical at high RF frequency, frequencies over 13.56 MHz cannot be used in large parallel-plate-type PE-CVD equipment (Table 3).
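A minimal numeric sketch of Eq. 1 and the free-space wavelength comparison, using the L and s values quoted above (the function names are our own):

import math

C = 2.998e8  # speed of light, m/s

def lambda_c(L_m, s_m, R_m):
    # Critical wavelength at which the standing wave effect appears (Eq. 1).
    return 2.6 * math.sqrt(L_m / s_m) * R_m

def lambda_0(freq_mhz):
    # Free-space wavelength of the RF power supply.
    return C / (freq_mhz * 1e6)

for gen, R in (("G8", 1.6), ("G10", 2.1)):
    print(f"{gen}: lambda_c = {lambda_c(0.010, 0.0015, R):.1f} m")
# -> ~10.7 m and ~14.1 m, consistent (rounding aside) with Table 2
print(f"13.56 MHz: lambda_0 = {lambda_0(13.56):.1f} m")  # ~22.1 m, cf. Table 3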


From Eq. 1, to decrease the standing wave influence, it is necessary to reduce the electrode separation L and expand the sheath width s. To spread the sheath width s, the process conditions such as gas flow rate or deposition pressure must be optimized. To reduce the electrode separation L, a special narrow-gap electrode design is needed. These optimizations improve the uniformity of film quality and thickness, but the process window becomes narrower with bigger chamber size. So, it will be difficult to solve this issue just by optimizing the chamber and process conditions.

Liquid Crystal Cell Fabrication Process

Fabricating Color Filters
The next step of the TFT-LCD process is the liquid crystal cell process and module assembling. Color filters consist of three color elements: red, green, and blue. The color filter process includes color photoresist coating and lithography technology. Color filter elements must coincide with the position of the TFT pixel matrix, as shown in Fig. 2. Usually the color filter has a simple rectangular shape, so proximity-type exposure equipment is sufficient for production. The glass substrates for color filters have to meet the same specifications as the TFT substrate; this eliminates the influence of thermal stress caused by differences in thermal expansion coefficient.

Black Matrix (BM)
The first step of the color filter fabrication process is BM (black matrix) formation. The key functions of the BM are:
• To prevent light leakage from the spaces between pixels
• To minimize the color mixing of neighboring pixels
• To guard the TFT characteristics from incident light from the backlight
The BM keeps the image quality of the TFT-LCD clear and vivid in hue. OD (optical density) is given by the following equation:

$$\mathrm{OD} = \log_{10}(I_0/I)$$

where I0 is the intensity of the input light and I is the transmitted light intensity. To get good image quality, the OD of the BM should be 3.0 or more. There are two types of BM materials: metal (Cr is a typical material) and resin. Cr-BM has good performance as a light shield, with an OD of 4.0 or more. But from the viewpoint of environmental pollution, Cr has been replaced by resin. Resin BM is processed from photoresist resin and carbon black. This material shows good characteristics for BM: high resistivity and low light reflectance.
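A small sketch of what these OD targets mean in terms of transmitted light (the OD values follow the text):

def transmittance(od):
    # Fraction of light transmitted through a layer of optical density OD,
    # from OD = log10(I0 / I).
    return 10.0 ** (-od)

for od in (3.0, 4.0):  # resin BM target vs. typical Cr BM
    print(f"OD {od}: transmittance = {transmittance(od):.4%}")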


Recently, the titanium-black material, which was developed as a pigment, has found application in the resin BM. It shows excellent shading performance and high resistivity, and it also improves adhesion to the substrate: its adhesion strength is 1.5 times that of conventional materials.

Color Filters (CF)
In the early stages of development, color filters (CFs) were made by a "dyeing method" that colored resins such as caseins with dyestuff. But those materials did not have enough durability against light, and so they have been replaced by pigment-dispersed photoresists. In the color filter fabrication process, the photolithography method is used. Usually the photoresist material contains about 30 % solid pigment. In the photolithography process, this pigment absorbs the light from the exposure equipment; to overcome this issue, a high-sensitivity photoresist is required. The photoresist consists of polyfunctional acrylate compounds, pigment, photopolymerization initiators, and solvent. When light irradiates the initiators, radicals appear in the photoresist and start the polymerization reaction of the acrylic oligomer. Then, the photoresist is developed and dried. This series of steps is repeated for each of the primary colors: red, green, and blue. Recently, a new four-color filter, RGB plus yellow, has been developed. This technology expands the color-reproduction gamut to capture more natural colors and realizes brighter image quality by improving the transmittance of light.

Overcoat and Transparent Electrode
After the color filter process, the substrate surface is covered by an overcoat. The purposes of the overcoat are to protect the surface of the color filters from damage and to prevent impurities from moving into the liquid crystal material. Epoxy acrylate is used as the overcoat material. The transparent electrode requires high transparency and low resistivity; ITO is widely used for this purpose. To achieve low resistivity, one good approach is high-temperature deposition. However, the color photoresist is an organic material, so the maximum temperature should be lower than 230 °C. Typical characteristics of ITO for CF are a transparency ratio of 95 % and a sheet resistance of 30 Ω/□. In the ITO etching process, materials such as an aqueous solution of hydrogen iodide or ferric chloride have been used as etchants until now. Currently, oxalic acid (H2C2O4) solution with surfactant is widely used because of its stable and effective etching performance with low running costs.

Spacer
A spacer is used to control the spacing between the TFT and color filter substrates. Conventionally, the spacer has been a plastic bead. But that method had some issues, specifically difficulty in controlling bead position and a degradation of image


quality by cohesion of beads. A new trend in spacer technology is the photo spacer method (Okita and Masaki 1999; Ohmori et al. 2000). A photo spacer is a pillar-shaped photoresist structure formed on the color filter substrate at proper positions. Its size is 10–20 μm in width and 4–6 μm in height. The advantages of the photo spacer are accuracy in cell gap control, durability against mechanical shock, and a high contrast ratio of image quality. Another reason for the shift to photo spacer technology is that a new liquid crystal fill technology, named ODF (one drop fill), has been adopted for production. Requirements for photo spacers are as follows:
• Mechanical property: the change of pillar height under compression should be small.
• Uniformity: the thickness deviation should be small across a large-size glass substrate.
• Heat and chemical durability: good durability against heat and chemicals.
• No contamination: no ionic contamination of the liquid crystal material.
Two methods can be applied for photo spacer formation: one is to utilize layered color filter photoresist, and the other is to add a resin layer for the photo spacer. This photo spacer technology is the principal technology for spacer fabrication.

Alignment Layer
Liquid crystal molecules are arranged on the surface of a substrate, and their orientation is controlled by the alignment layer, a polyimide layer about 0.1 μm thick. In the fabrication process, polyamic acid type and solvent-soluble polyimide type materials are available. In the former case, the precursor material is coated onto a substrate and heated to convert the polyamic acid to a polyimide. In the latter case, a pre-imidized polyimide organic solution is coated onto the glass substrate and dried in an oven. Recently, the solvent-soluble polyimide method has become the mainstream in the fabrication process. Conventionally, printing technology has been used for coating, but for glass sizes over G6, ink-jet technology has become the favored method, because it is easy to exchange materials and it reduces the usage of polyimide.

The next step is "rubbing," where the surface of the substrate is rubbed in one direction by a special velvet rubbing cloth. The rubbing treatment is an important step for liquid crystal molecule alignment, but the mechanism of molecular alignment is not fully understood. The rubbing machine is equipped with a rubbing roll, which is guided without vibration at a well-defined speed and angle, with precisely adjusted pressure and adapted rotation. Important issues of the rubbing process are electrostatic damage to the TFTs and dust or particles from the cloth. On the other hand, with newly adopted modes such as MVA (multi-domain vertical alignment), the rubbing step is eliminated.

[Fig. 9 graphic: an LC dispenser drops liquid crystal material droplets onto a glass substrate inside a glue seal.]

Fig. 9 One drop fill technology

Liquid Crystal Cell Process and ODF Technology
The vacuum injection technology was applied in the liquid crystal cell fabrication process for a long time. As the market size of LCDs expanded, an alternative method was strongly requested. As the solution, a new technology called "one drop fill (ODF) technology" was proposed. Its technical point is dropping liquid crystal droplets onto a substrate. By that method, the treatment time became very short, which opened the way to large-size LCD production (Kamiya et al. 2001; Hirai et al. 2008; Yoshida et al. 2006; Yamada et al. 2001).

Figure 9 shows a conceptual diagram of the ODF technology. Liquid crystal material is dropped onto the TFT substrate (or CF substrate) in the area surrounded by the sealant. Controlling the amount of liquid crystal material is very important in the ODF process: the cell volume is estimated from the cell area and gap width, and the total amount of liquid crystal material must be matched to it (a simple estimate is sketched after the sealant requirements below). When the quantity is out of range, the liquid crystal panel becomes a defective product. In the case of small display manufacturing, higher accuracy becomes necessary, because the amount of liquid crystal material is rather small.

Figure 10 shows the manufacturing process flow of ODF. The first step is the formation of the polyimide film on the TFT substrate. This is followed by the formation of the seal. The sealant joins the TFT and color filter glass substrates; it runs along the edge of the display region via a printing or dispensing method. The requirements for the sealant in ODF are as follows:

No contamination spread in the liquid crystal material. It should be curable by both UV light and heat treatment. Have a viscosity whereby the dispenser can draw the line of constant width. Be able to bond two substrates together, tightly.

It is important that impurities do not spread into the liquid crystal material, because the seal material touches the liquid crystal before it stiffens. Several materials have been proposed for this purpose, which brought ODF technology into more practical use. In seal formation, the shape at the corners is also important, so a high degree of accuracy is required in controlling the amount of sealant to obtain a uniform seal width.
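To make the dosing requirement concrete, a back-of-the-envelope estimate can be written down as a short Python sketch; the panel dimensions, cell gap, and per-drop volume below are illustrative assumptions, not values from this chapter.

# Rough ODF dosing estimate (all numbers are hypothetical, for illustration only)
width_mm, height_mm, gap_um = 885.0, 498.0, 4.0   # ~40-inch-class panel, 4-um cell gap
drop_nl = 25.0                                    # assumed volume of one dispensed drop (nL)

volume_nl = width_mm * height_mm * gap_um         # 1 mm^2 x 1 um = 1 nL
n_drops = round(volume_nl / drop_nl)
print(f"required LC volume ~ {volume_nl / 1e6:.2f} mL, about {n_drops} drops")

On a small panel the total volume is tiny, so the same absolute dosing error is a much larger fraction of the fill, which is why higher dispensing accuracy is demanded there.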


Fig. 10 Comparison of ODF and conventional process.
ODF (one drop fill) process: cleaning and alignment layer process (PI coating and rubbing) → seal dispense (thermal and UV cure type) → LC dispense → assembling TFT/CF substrates and UV cure (in vacuum) → thermal cure in heat chamber → glass substrate cutting → inspection and transfer to the assembling process step.
Conventional process: cleaning and alignment layer process (PI coating and rubbing) → seal dispense (thermal cure type) → assembling TFT/CF substrates → thermal cure in heat chamber → glass substrate cutting → set in a vacuum chamber → dip the injection hole into the LC bath → LC injection (returning the pressure to atmosphere) → seal the injection hole → cleaning to remove residual LC → inspection and transfer to the assembling process step.

When this technology is applied to larger substrates, equipment size becomes a problem if the stage moves. To solve this, a moving-dispenser method has been applied, in which the stage is fixed and the dispenser moves and draws; a clean drive system is an important point here. The next step is the process of assembling the two substrates, which is performed in a vacuum. Two technologies are necessary in this step: glass substrate handling in a vacuum and high-accuracy positioning, as the two substrates must be aligned to within 1 μm or less. For large-size glass handling, vacuum chucking and electrostatic chucking technologies are available, but the former is difficult in a vacuum and the latter risks electrostatic damage to the TFT devices. Nowadays, combinations of these two methods, or newer chucking technologies such as PSC (physical sticky chuck), which uses a special diazo sticky sheet, are applied in mass-production equipment. The final step in the ODF process is stiffening the seal material: UV radiation and heat treatment are applied to strengthen the panel. The first advantage of the technology is good productivity. In the conventional vacuum injection method, a long time is necessary to inject liquid crystal into a large-size LCD panel; in the ODF method, the liquid crystal material is dropped onto the glass substrate instead, greatly reducing the process time. For


high-speed LCDs, a narrower cell gap is demanded, and this technology meets that requirement. For small-size LCDs, many panels must be handled in the injection process; ODF technology simplifies the cell assembling process, and an in-line assembly production style can be applied to fabrication. The second advantage is cost saving: the liquid crystal material itself is very expensive, and the ODF method offers a way to reduce the quantity consumed.

Glass Scribing and Polarizer Assembling
The assembled mother glass is scribed into individual LCD panels. A diamond tip runs over the surface of the substrates along the breaking line, and the glass is then pressed in one direction to break it. For some mobile applications, laser cutting is used when an accurate panel size is required. After the edges of the separated panels are cut away and cleaned, polarizer films are applied to the panel. The polarizer is composed of multilayered films of PVA (polyvinyl alcohol), TAC (triacetyl cellulose), and PET (polyethylene terephthalate). The PVA film contains iodine and acts as the polarizer; TAC films are attached on both sides of the PVA film, and PET is a protection film applied to one surface of the TAC. Other optical films, such as compensation films or wide-viewing films, are laminated onto the panel according to the specifications of the LCD. After the first image quality test, panels are sent to the module assembling fabrication process.

Module Assembling
The module assembling process is the final fabrication step in TFT-LCD production. It includes driver LSI assembly, backlight assembly, and the final test.

Assembling Driver LSIs
There are two types of LSI assembling methods. The first is the OLB (outer lead bonding) method, which mounts driver LSIs on a flexible printed film substrate that is then connected to the LCD panel. The second is COG (chip on glass), which mounts the LSIs directly on the glass substrate. The OLB method is applied mainly to large-size LCD-TVs, while COG is used for mobile applications. In the OLB method, driver LSIs are delivered as TAB (tape-automated bonding) or COF (chip on film) packages. In the TAB method, a wiring circuit is formed on polyimide film by lithographic technology; driver LSIs are mounted on the film by thermocompression and then covered with resin. In the COF method, a circuit is formed directly on a thin polyimide film by plating or casting. Typically, COF is good for fine-pitch wiring and bends well.

Fig. 11 Connecting by ACF (anisotropic conductive film): (a) the ACF structure, with conductive particles (plastic beads coated with a conductive layer of Au, etc.) dispersed in epoxy glue between the upper (FPC) and lower (TFT substrate) electrodes; (b) an LCD with FPC attached; (c) cross-sectional view A–A′ of the connection between the FPC, driver LSI, and TFT substrate through the conductive particles

ACF (anisotropic conductive film) is a convenient material for connecting TABs to an LCD panel; Figure 11 shows the concept. It is a resin film containing plastic balls whose surfaces are coated with metal (Au or Ni). The number and size of the balls and the choice of coating metal are determined by the contact-resistance requirement. TABs are connected to the LCD panel by ACF, and PCBs (printed circuit boards) are connected to the TABs by ACF as well.

Backlight
CCFL (cold cathode fluorescent lamp) has been the main light source for LCD-TV backlights, owing to its low cost and high luminous efficacy. On the other hand, the luminous efficacy of LEDs has improved year by year, and the LED backlight has become a new trend in TV applications. The LED backlight has the following features:
• Light and thin
• High color reproducibility and high image quality of the LCD
• Low power consumption
• No mercury (Hg)

The LED backlight is classified into "direct light" and "edge light" types according to the arrangement of the LEDs. In the direct-light-type LED backlight, many LEDs are arranged in a plane behind a diffuser plate.


Power consumption is reduced by "local dimming" technology, in which the brightness of each backlight area is controlled according to the local brightness of the image (a toy sketch of the idea is given below). With this technology, the contrast ratio of the LCD is improved to over 1,000,000:1. LEDs also offer high color reproducibility, up to 150 % of the NTSC color gamut (a common CCFL backlight: 75 %). The direct-light-type backlight is thicker, owing to the diffuser air gap required to achieve luminance uniformity. Phosphor-based LEDs have been used to reduce backlight cost; these consist of a blue LED (mostly InGaN) and a yellow fluorescent phosphor, in which case the color reproducibility is almost the same as that of a normal CCFL. In the edge-light type, white LEDs are arranged along the side of a light guide plate. This reduces the cost of the backlight, because fewer LEDs are required, and it has the advantage of enabling thin LCD modules. One issue is brightness uniformity in a large-size light guide; a second is the difficulty of applying local dimming. Some new technologies have been proposed to solve these problems (Yamamoto et al. 2009; Masuda et al. 2009; Gourlay et al. 2009).
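As a rough illustration of the local dimming idea, the following Python sketch sets each backlight zone to the peak level its pixels require and rescales the LC drive to compensate; the zone count and the peak-based dimming rule are simplifying assumptions, not a vendor algorithm.

import numpy as np

def local_dimming(frame, zones=(8, 8)):
    """Toy zone dimming: per-zone LED level = zone peak; LC values rescaled."""
    zh, zw = frame.shape[0] // zones[0], frame.shape[1] // zones[1]
    backlight = np.zeros(zones)
    lc = np.empty_like(frame, dtype=float)
    for i in range(zones[0]):
        for j in range(zones[1]):
            block = frame[i * zh:(i + 1) * zh, j * zw:(j + 1) * zw]
            level = max(float(block.max()), 1e-3)   # zone LED drive level
            backlight[i, j] = level
            lc[i * zh:(i + 1) * zh, j * zw:(j + 1) * zw] = block / level
    return backlight, lc

frame = np.random.rand(64, 64) ** 3   # mostly dark test image (relative luminance)
bl, lc = local_dimming(frame)
print(f"mean backlight level vs always-on: {bl.mean():.2f}")

Dark zones draw far less power, and because their LEDs are nearly off, residual LC light leakage contributes almost nothing to the black level, which is how contrast ratios beyond 1,000,000:1 become measurable.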

Test and Repair Technology
The required LCD-TV sizes are getting larger and larger, while image quality, natural color, and high definition such as full HD are demanded as general standards. Testing and repair technologies that meet these requirements enable improved product yields.

Test and Measurement

Test and Measurement in TFT and CF Fabrication Process
Test and measurement items and methods in the TFT process are shown in Table 4. The purposes of these tests are as follows:
• Process and yield management
• Obtaining defect data for repair
• Quality control of TFT panels
For large-substrate fabrication such as G8 or G10, the TFT process cost is considerably high, so defects and their root causes must be found in order to minimize the damage (Igarashi 2009; Freeman and Hawthorne 2000). There are two approaches to testing:
1. Detect defects by measuring TFT performance or image quality. I/V testers and open/short testers are available for this purpose.


Table 4 Test and measurement in TFT fabrication process

Checkpoint: Development and etching
  Pattern width: pattern width measurement
  Pattern matching: AOI (automatic optical inspection)
  Pattern defect: short/open tester
Checkpoint: Film deposition
  Film thickness: spectroscopic ellipsometer
  Resistivity: four-point probe sheet resistivity measurement
  Particles: laser scattering particle detector
Checkpoint: Complete the TFT process
  TFT characteristics: I/V tester
  Short/open: short/open tester
  Array testing: array tester

2. Detect patterning problems, which include pattern defects, line/space width shifts, and overlay inaccuracy. CDO (critical dimension and overlay) measurement and digital macro-inspection systems are good approaches for checking these problems.
Color filter defects are classified into two categories: bright spots and black spots (Suzuki et al. 2005). A bright spot is caused by light leakage through a defect in the black matrix (BM) or color filter; a black spot is caused by a particle or by BM/color filter material residue. For the color filter fabrication process, photolithography or ink-jet technology is adopted. An AOI (automatic optical inspection) system is commonly used for process control and repair (Mizumura 2009). The inspection items are the black matrix, color filter pixels, overcoat, ITO, photo spacers, etc., and the typical issues are particles, dust, residual BM, residual photoresist, and pinholes. These defects are detected by a combination of transmitted and reflected light; for example, in the case of overcoat defects, dark and bright spots are detected by illumination from the top and bottom sides. Table 5 shows the classification of defects by the AOI method. All defect mapping data can be stored on a data server and used for the repair process.

Dynamic Operation Test
The dynamic operation test is the first opportunity to check TFT performance and defects in thin films such as semiconductors, electrodes, insulators, and color filters (Hitomi 2009). Many thin-film transistors exist on a TFT substrate, and it would be difficult to check the characteristics of every TFT; this test is therefore very useful for catching problems across the total fabrication process. Test results are fed back to the TFT and liquid crystal assembling processes. At this stage, no driver LSI is connected to the panels, so this procedure prevents failed panels from being sent to the module assembling process.


Table 5 Defect classification and inspection result (rows: top surface particle, bottom surface particle, BM residue, photoresist residue, photoresist defect, overcoat defect, pinhole; columns: appearance as a bright or dark spot under illumination from the top surface and from the bottom surface)

One good method of checking the TFT and CF processes is "Mura" checking in the dynamic operation test. "Mura" is defined as nonuniformity of brightness or color. Many factors cause Mura problems; the key ones are as follows:
• Nonuniformity of the electric signal. The driver output signal is changed by TFT performance, wiring capacitance, or electrode resistivity. One example is striped Mura on an organic LED display driven by monolithic polysilicon TFT drivers, caused by performance differences between neighboring TFTs.
• Nonuniformity of CF photoresist thickness.
• Nonuniformity of cell gap. Deviations in the surface flatness of the substrate cause cell-gap nonuniformity, and particles on the substrate change the cell gap as well.
• Impurities in the liquid crystal material, which change its resistivity. This alters the charge-holding time, an important factor in controlling image quality.
• Nonuniformity of alignment-layer thickness and of the rubbing treatment, which causes a nonuniform arrangement of the liquid crystal molecules.
Short-range Mura is easy to catch; for example, a small brightness variation over a range of several millimeters is found very easily. The root of such problems can be traced because each problem has its own unique Mura mode, which makes Mura checking a very convenient way to catch problems in the TFT/CF process. The dynamic operation test was originally carried out manually, but automated test systems have since been introduced to the fabrication process. In the dynamic operation test, there are two probing methods: contact pin probes and FBP (flat board probing) (Sato and Kobayashi 2001). FBP probes are made of plastic film connected to a glass interposer; the contactors are made by a photolithography process, and their pitch can be reduced to 30 μm. FBP enables low-cost testing of high-resolution displays. For full testing, every electrode has to be contacted by probes. One idea for reducing the cost of testing is to divide all the electrodes


into groups and probe them group by group: for example, gate electrodes can be divided into odd and even groups, and source electrodes into R, G, and B groups. The number of probes can be significantly reduced by this grouping method, as the short count below illustrates.
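A quick count conveys the scale of the saving; the full-HD line counts are generic figures, and the grouping follows the odd/even and R/G/B example above.

# Illustrative saving from grouped testing (generic full-HD line counts)
gate_lines = 1080
source_lines = 1920 * 3            # one source line per R, G, B sub-pixel column
individual = gate_lines + source_lines
grouped = 2 + 3                    # odd/even gate groups plus R/G/B source groups
print(f"{individual} individually driven lines vs {grouped} drive groups")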

Repair Technology
Repair technology has been applied to improve production yield, and a method combining testing and repairing has been proposed (Wakabayashi et al. 2004; OMRON LASERFRONT INC 2009). Usually the repair procedure has to be applied before the liquid crystal cell process. The procedures of the test and repair technology in the TFT process are as follows:
• Cutting short-circuited parts by laser
• Connecting upper- and lower-layer electrodes at open circuits by laser
• Connecting failed electrodes by laser CVD
To achieve good performance, it is important to select the repair method and conditions according to the layer structure and the optical properties of the materials. The laser-positioning requirement is a few microns, and control of the laser power and its repeatability are important factors. To satisfy these demands, the solid-state Nd:YAG laser source (neodymium-doped yttrium aluminum garnet; fundamental wavelength 1,064 nm) has been applied for repair. In some cases, the THG (third harmonic generation) at 355 nm or FHG (fourth harmonic generation) at 266 nm is used. This laser provides about 100 mW of power and is operated in continuous-wave and Q-switched modes.
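The harmonic wavelengths quoted above follow directly from the 1,064-nm fundamental, as this short check confirms.

fundamental_nm = 1064.0
for n, name in ((2, "SHG"), (3, "THG"), (4, "FHG")):
    print(f"{name}: {fundamental_nm / n:.0f} nm")   # SHG 532, THG 355, FHG 266 nm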

Repair of TFT Substrates
A laser CVD process consists of two steps: filling with source gas and decomposition of the material by laser. Usually the area to be repaired is small and confined, so a gas-curtain sealing method is applied; Figure 12 shows the concept. The repair space is separated by an inert gas curtain, and the balance of gas supply and exhaust is controlled carefully. The laser is focused down to a few microns and irradiates a spot area. Commonly, metal carbonyl compounds, such as W(CO)6 or Mo(CO)6, are used as the source gas. Laser repair equipment has two functions, cutting and connecting; by combining the laser cutter and laser CVD, the repair system achieves high performance. The laser repair system consists of three parts: the "inspection system," the "data server," and the "repair part" (Kakishita 2004; Honoki et al. 2006). In the inspection system, in-line inspection equipment is installed to check the performance of the fabrication process and to make a defect map of every substrate. The data server stores this data and controls the repair systems. In the repair part, repair machines operate from the data held on the data server, and the results of repairs are returned to it. The whole test and repair system is connected and integrated into the production line, where it also serves to monitor fabrication.

Fig. 12 Schematic image of laser CVD for repairing: the laser beam passes through an objective lens and a window into a gas-curtain area above the TFT substrate, where purge gas, source gas, and exhaust flows confine the deposited metal (http://www.laserfront.jp/en/product/sl455/adv.html)

Repair of CF Substrate
Based on the results of the inspection, defects are repaired by "tape grinding," "laser processing," or "ink coating." Tape grinding removes projection defects with an abrasive tape; this method is effective for removing local bumps or cohesions of pigment material. Laser processing is applied to remove particles or residual materials, with the laser source selected according to the material to be removed: for example, the FHG of the YAG laser is used for the removal of organic materials such as color resist, with the laser operated at a repetition rate of a few hertz. Ink coating repairs CF defects by spraying the same ink as the original color filter material; a microdispenser is used, and the sprayed amount must be controlled accurately, to within a few picoliters.

Conclusion
As liquid crystal display sizes increase, processing and testing technologies become more and more important for production. TFT fabrication, cell assembly, module, and inspection/repair technologies have all made rapid strides, and the conventional vacuum injection method has been replaced by ODF technology. In plasma-enhanced CVD equipment, standing-wave phenomena have become a significant problem as process chambers have been enlarged, and engineers are currently working toward a solution to this issue.


Recently, higher-speed image display technology has been required for 3D displays, and customers demand large, low-priced TVs. To meet these demands, liquid crystal display engineers must continue to strive for improvement.

Further Reading
Colgan EG et al (1996) Copper-gate process for high information content a-Si TFT-LCDs. In: IDW96 Proceedings, Kobe, Japan, pp 29–32
Freeman D, Hawthorne J (2000) Implications of super high resolution to array testing. SID Symp Dig Tech Pap 31:375–377, paper 25.2, Long Beach. doi:10.1889/1.1832960
Gourlay J et al (2009) Low-cost large-area LED backlight. In: SID09 Digest, San Antonio, paper 48.1, pp 713–715
Hirai A, Abe I, Mitsumoto M, Ishida S (2008) One drop filling for liquid crystal display panel produced from larger-sized mother glass. Hitachi Rev 57(3):144–148
Hitomi K (2009) 2009 LCD technology outlook (complete works). Electron J 354–357
Honoki H, Nakasu N, Arai T, Yoshimura K, Edamura T (2006) In-line automatic defect inspection and repair method for possible highest yield TFT array production. In: IDW06 Proceedings, Ōtsu, Japan, pp 849–852
Igarashi D (2009) 2009 LCD technology outlook (complete works). Electron J 336–340
Kakishita N (2004) Optical inspection system for the next generation LCD production. In: IDW04 Proceedings, Niigata, Japan, pp 565–568
Kamiya H et al (2001) Development of one drop fill technology for AM-LCDs. In: SID01 Digest, Boston, pp 1354–1357
Koike J, Neishi K, Iijima J, Sutou Y (2007) Possibility of Cu-Mn alloy for TFT gate electrodes. In: IDW07 Proceedings, Sapporo, Japan, pp 2037–2040
Le Comber PG, Spear WE, Ghaith A (1979) Amorphous-silicon field-effect device and possible application. Electron Lett 15:179–181
Lieberman MA et al (2002) Standing wave and skin effects in large area, high frequency capacitive discharges. Plasma Sources Sci Technol 11:283–293
Masuda T, Ajichi Y, Kubo T, Yamamoto T, Shinomiya T, Nakamura M, Shimizu T, Kasai N, Mouri H, Feng XF, Teragawa M (2009) Ultra thin LED backlight system using tandem light guides for large-size LCD-TV. In: IDW09 Proceedings, Miyazaki, Japan, pp 1857–1860
Mizumura M (2009) 2009 LCD technology outlook (complete works). Electron J 345–349
Ohmori H, Sakagawa M, Tani M, Nagase T (2000) A new negative photoresist for LCD spacers with high resolution. In: IDW00 Proceedings, Kobe, Japan, pp 399–402
Okita T, Masaki Y (1999) The new photoresist for LCD panel spacer. In: IDW99 Proceedings, Sendai, Japan, pp 415–418
OMRON LASERFRONT INC (2009) http://www.laserfront.jp/en/product/sl455/adv.html
Sato K, Kobayashi S (2001) Flat board probing for 30 μm pitched flat panel inspection. In: SID01 Digest, Boston, pp 646–649
Sirringhaus H, Kahn A, Wagner S (1996) Self-passivated copper gates for thin film silicon transistors. In: IDW96 Proceedings, Kobe, Japan, pp 391–392
Snell AJ, Mackenzie KD, Spear WE, Le Comber PG, Hughes AJ (1981) Application of amorphous silicon field effect transistors in addressable liquid crystal display panels. Appl Phys 24:357–362
Spear WE, Le Comber PG (1975) Substitutional doping of amorphous silicon. Solid State Commun 17:1193–1196
Sun S, Takehara T, Kang ID (2004) Scaling-up PECVD system for large-size substrate processing. In: SID04 Digest, Seattle, pp 1499–1501
Suzuki Y et al (2005) Ekishou display no dekirumade [How liquid crystal displays are made]. Nikkan Kogyo Shimbunsha, pp 135–136


Takehara T (2005) Newest technology "AKT-APXL" process chamber of the PECVD equipment for large TFT-LCD. AKT News 18:32–39
Takehara T, Sun S, Kang ID (2004) The latest PECVD technology for large-size TFT-LCD. In: IDW04 Proceedings, Niigata, Japan, pp 603–606
Wakabayashi K, Mitobe K, Torigoe T (2004) Laser CVD repair technology for final yield improvement method in mass and large size TFT-LCD production process. In: IDW04 Proceedings, Niigata, Japan, pp 623–624
Yamada S et al (2001) A new production of large size TFT-panel by "LC-dropping method". In: SID01 Digest, Boston, pp 1350–1353
Yamamoto T, Tomiyoshi A, Masuda T, Fujiwara K, Ajichi Y (2009) The LED backlight of AQUOS XS1. Sharp Tech J 99:32–37
Yoshida M, Muramoto K, Oono T (2006) Liquid crystal drop filling (ODF)/vacuum bonding system: V-series. ULVAC Tech J 64:36–40

The π-Cell
Philip Bos

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
The Effect of Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
The Optically Self-Compensating Director Configuration . . . . . . . . . . . . . . . . . . 2
Conversion to the Operational State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

Abstract

The π-cell device is described. This type of device is shown to have a fast response, due to the lack of "backflow" effects, and a wide viewing angle due to an optically self-compensating director configuration. These characteristics are valuable for future low-cost and low-power AMLCDs.

List of Abbreviations
AMLCD  Active matrix liquid crystal display
ECB    Electrically controllable birefringence
TN     Twisted nematic

P. Bos (*)
Liquid Crystal Institute & Chemical Physics Program, Kent State University, Kent, OH, USA
e-mail: [email protected]
# Springer-Verlag Berlin Heidelberg 2015
J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_99-2

Introduction

A goal for future advances in AMLCDs will be simplification of the structure and required process steps. Currently, the structure and process steps are complicated by both the structure required for multidomain pixels to improve the viewing angle and

the use of color filters associated with each pixel to generate color. Another goal will be the reduction of power, which is primarily controlled by the power requirement of the backlight. Both of these goals could be addressed by an electro-optical effect that did not require multidomain alignment to achieve a good angle of view and that could switch quickly enough to support field sequential color. The π-cell, described in this section, has both properties.

The Effect of Flow
Conventional electrically controllable birefringence (ECB) and twisted nematic (TN) devices are known to suffer from a slowed relaxation rate due to "backflow." Van Doorn (1975) first made clear that this effect is due to material flow in the device after an electric field is removed. As shown in Fig. 1 for an ECB device, in the drawing on the left, the material flow shown by the arrows in a relaxing device imparts a torque on the director near the center of the cell in a "backward" direction relative to the direction of lower elastic energy. TN devices also show this phenomenon, as investigated by Berreman (1975). Faster relaxation can be achieved by removing this effect through a cell redesign in which the material flow is taken into account. For the case of the TN device, a −3π/2 cell was proposed (Hubbard and Bos 1981). The "minus" in −3π/2 refers to the twist of the device, where the negative sign indicates that the chiral additive used to achieve the twist has the opposite sign to the twist sense caused by the surface pretilt in the absence of any dopants (in this case, the tilt angle of the director is uniform across the thickness of the layer). For the case of a tunable birefringence device, a device called the π-cell has been proposed (Bos and Koehler-Beran 1984) in which the rotational sense of the pretilt on the two cell surfaces is opposite, as shown in Fig. 1 on the right side. In this case, the "backward" torque due to flow is eliminated. The relaxation of an ECB device and a π-cell, after a voltage is removed, is shown in Fig. 2. It has been suggested by Raynes (Peter Raynes, private communication) that the decreased response time could also be related to the bend structure that causes the director in the center of the cell to remain vertical, which effectively makes the cell relax as though it were two cells of half the thickness.

The Optically Self-Compensating Director Configuration
An additional advantage of the π-cell device is its excellent off-axis optical properties, which can be further improved through the use of external compensation layers. The excellent off-axis viewing properties are a result of the optically "self-compensating" director field, as can be understood by considering Fig. 1. If light is considered to be propagating almost along the vertical direction but tipped to the left (shown by the dotted line), then for the conventional ECB device on the left it can be seen that the off-axis light will be more nearly along the optical axis of the


Fig. 1 The effect of cell design on flow shown by solid arrows for a uniformly aligned cell (left) and a π-cell (right). A direction of light propagation is shown as the dotted arrow

Fig. 2 The optical transmission after 10 V has been applied to a 7-μm-thick cell for 500 ms. The green curve is for a uniform device, while the blue curve is for a π-cell. The birefringence of the liquid crystal is 0.1. The cell is assumed to be between crossed polarizers aligned at 45° to the rub axis of the cells, and the wavelength of light is 550 nm. Flow is taken into account in this calculation (axes: transmittance 0–40 %, time 500–520 ms)
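The field-off level in Fig. 2 can be roughly cross-checked with the textbook crossed-polarizer formula for a uniform retarder, T = sin²(2χ)·sin²(πΔn·d/λ), where χ is the angle between the rub axis and the polarizer. This is only a back-of-the-envelope sketch using the caption's values: it ignores the nonuniform director profile and the flow dynamics included in the figure's calculation.

import numpy as np

# Crossed-polarizer transmission of an ideal uniform retarder (caption values)
dn, d, lam = 0.1, 7e-6, 550e-9      # birefringence, thickness, wavelength
chi = np.deg2rad(45)                # rub axis at 45 deg to the polarizers
T = np.sin(2 * chi) ** 2 * np.sin(np.pi * dn * d / lam) ** 2
print(f"relaxed-state transmission ~ {T:.2f} of the polarized input")

Measured curves sit lower, not least because an unpolarized backlight loses about half its intensity at the first polarizer.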


director field near the top and bottom of the cell. This causes the effective birefringence of those areas to be decreased. On the other hand, if the light propagation direction is considered to be tipped slightly to the right, the effective birefringence in those areas would be increased. So the optical effect of light tipping to the left or to the right is quite different, and both differ from the case of vertically propagating light. Now, consider the right-side picture, which shows the director configuration in a π-cell. In this case, if light is considered to be propagating with a slight tip toward the left (shown by the dotted arrow), the propagation direction will be more nearly perpendicular to the optic axis near the lower surface but more nearly along the optic axis near the upper surface. As a result, the effective birefringence is increased in the lower part of the cell but decreased in the upper part, and the net birefringence is less affected by the tipping of the propagation direction than for the standard ECB cell. While the optical effect of tipping the light propagation direction is self-compensating for the top and bottom regions of the cell, the optical effect of the nearly vertically aligned region in the middle of the cell is significant for off-axis light. Bos showed that the optical effect of this vertically aligned area can be compensated for by the addition of an external negative C plate (Bos and Rahman 1993); however, other compensation approaches have also been considered. For example, Uchida's group has considered biaxial compensation films (Yamaguchi et al. 1993) (and proposed the name OCB for the resulting device). Also, Mori realized that a more ideal solution than those originally proposed by Bos or Uchida would be a negative-retardation material whose optic axis field is the mirror image of that of the π-cell in the low-retardation state (Mori and Bos 1997). For this compensator, he used polymerized discotic liquid crystals and was able to achieve excellent viewing-angle properties. Even further improvement can be realized by incorporating viewing-angle compensation for the polarizers. Figure 3 shows measured data from a π-cell used with a discotic negative-birefringence material of the appropriate optic axis profile, in a design that also compensates for the viewing-angle characteristics of the polarizers (Mori and Bos 1999). A number of authors have compared the switching speed and viewing-angle characteristics of the π-cell with other common modes; in particular, a comparison with the TN cell is given by Kumagawa et al. (2002).

Conversion to the Operational State
One issue with the π-cell structure is the need to acquire and maintain the bend structure shown on the right side of Fig. 1. This is because with 0 V applied to the cell, and when the pretilt is less than about 45°, a splay state, where the director in the center of the cell is horizontal, has the lowest energy. The splay state and bend


Fig. 3 The measured viewing angle characteristics for a π-cell compensated with a negative birefringence film combined with a wide viewing angle polarizer

state are not topologically equivalent, so the pixel must be converted to the bend state through the application of a voltage. The nucleation and stability of the bend state have been studied by a number of authors; two of particular note are Nakamura and Noguchi (2000), who explain the details of the transition and provide a very useful means of reducing the conversion time, and Acosta et al. (2004), who give an overview of nucleation techniques.

Summary In summary, the advantages of the π-cell are that it has excellent viewing angle characteristics due to its optically self-compensating director configuration and it has a sufficiently fast switching speed, due to the lack of “backflow” effects. These advantages make it potentially useful for future low-cost and low-power AMLCD devices because of a simplified pixel structure and the use of field sequential color.


Further Reading
Acosta E et al (2004) Nucleation of the pi-cell operating state: a comparison of techniques. Liq Cryst 31:1619–1625
Berreman DW (1975) Liquid crystal twist cell dynamics with backflow. J Appl Phys 46:3746
Bos P, Koehler-Beran K (1984) The π-cell, a fast liquid-crystal optical switching device. Mol Cryst Liq Cryst 113:329
Bos P, Rahman J (1993) An optically "self-compensating" electro-optical effect with wide angle of view. SID Dig Tech Pap 24:273
van Doorn CZ (1975) Transient behavior of a twisted nematic liquid crystal layer in an electric field. J Phys Colloq 36:C1-261
Hubbard R, Bos P (1981) Optical bounce removal and turn-off time reduction in twisted nematic displays. IEEE Trans Electron Devices 28(6):723
Kumagawa K, Takimoto A, Wakemoto H (2002) Fast response OCB-LCD for TV applications. SID Dig Tech Pap, paper 48.4, p 1289
Mori H, Bos P (1997) Application of a negative birefringence film to various LCD modes. In: Conference record of the international display research conference 17, San Jose, M88
Mori H, Bos P (1999) Optical performance of the pi cell compensated with a negative-birefringence film and an A-plate. Jpn J Appl Phys 38:2837–2844
Nakamura H, Noguchi M (2000) Bend transition in pi-cell. Jpn J Appl Phys 39:6368
Yamaguchi Y, Miyashita T, Uchida T (1993) Wide-viewing-angle display mode for the active-matrix LCD using bend-alignment liquid-crystal cell. In: SID international symposium digest of technical papers XXIV, Seattle. SID, Playa del Rey, pp 277–280

Flexoelectro-optic Liquid Crystal Displays
Harry J. Coles and Stephen M. Morris

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Flexoelectric Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
The Flexoelectro-optic Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Development of Materials for the Flexoelectro-optic Effect . . . . . . . . . . . . . . . . . 8
Uniform Lying Helix Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Uniform Standing Helix Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

Abstract

With the development of high-definition liquid-crystal display panels, the liquid-crystal response time has been improved so as to minimize motion blur, through the implementation of overdrive technology. However, for further reduction of the response times, alternative electro-optic effects that exhibit a faster response are required. One potential candidate is the flexoelectro-optic effect in chiral nematic liquid crystals, which is a fast, in-plane deflection of the optic axis that occurs on the microsecond timescale and is linear in the applied electric field strength. The flexoelectro-optic effect has been shown to operate in two different geometries: the uniform lying helix alignment and the uniform standing helix alignment. This chapter describes the benefits of each device mode and the challenges that need to be overcome if they are to be implemented in commercial display technology.

H.J. Coles (*) • S.M. Morris
Department of Engineering, University of Cambridge, Centre of Molecular Materials for Photonics and Electronics, Cambridge, UK
e-mail: [email protected]; [email protected]
# Springer-Verlag Berlin Heidelberg 2015
J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_100-2


List of Abbreviations

IPS   In-plane switching
ITO   Indium tin oxide
LC    Liquid crystal
NLC   Nematic liquid crystal
N*LC  Chiral nematic liquid crystal
ULH   Uniform lying helix
USH   Uniform standing helix
VAN   Vertically aligned nematic

Introduction
The response time of the liquid-crystal element in flat-panel display technology has always been a key factor in determining the overall dynamic image quality. Considerable progress has been made, both in the materials and in the driving technologies, which has enabled the current state-of-the-art high-definition panels to be developed. However, further refinements based upon existing technology are limited, and alternative electro-optic effects that inherently possess a faster response are continuously being pursued. One such mode is the flexoelectro-optic effect in chiral nematic liquid crystals (NLC), which can exhibit response times of the order of tens to hundreds of microseconds. In this chapter, we describe two alternative modes of operation that are based upon the flexoelectro-optic effect. Specifically, the two modes correspond to different orientations of the chiral nematic helix with respect to the substrates of the device: in one mode the axis of the helix lies in the plane of the device, referred to as the uniform lying helix (ULH), and in the other, the uniform standing helix (USH), the axis of the helix is aligned along the normal of the substrates (more commonly known as the Grandjean texture). The benefits of each mode are discussed following a brief introduction to flexoelectricity and the flexoelectro-optic effect. In addition, the outstanding factors that need to be resolved before these modes become commercially viable are considered, along with the importance of materials development for the flexoelectro-optic effect.

Flexoelectric Effect
The flexoelectric effect in nematic liquid crystals (LCs) was first identified by Meyer (1969). Therein, it was recognized that there was an interaction between a liquid-crystalline medium and an applied electric field which in many ways bore a certain resemblance to the piezoelectric effect known to occur in solids. Today, this interaction is referred to as the flexoelectric effect, which is due to a curvature strain (first-order spatial derivative), in order to avoid


confusion with piezoelectricity, which results from a positional strain (second-order spatial derivative). However, in analogy to piezoelectricity, flexoelectricity also gives rise to an electromechanical interaction. For the inverse flexoelectric effect, an applied electric field encourages an alignment of the dipole moments, which then generates a local deformation of the director field. If the nematic molecules possess shape polarity, then this deformation results in a net polarization. The nematic LC phase is centrosymmetric, which according to Curie's principle (Demus et al. 1998) prohibits the appearance of a net polarization. On the other hand, the three fundamental director deformations do not possess a center of symmetry, and thus a local polarization may indeed be present. Therefore, in a nematic medium, which has a center of symmetry, a polarization may appear under nonequilibrium conditions where local deformations of the director field exist. However, twist deformations do not generate a polarization, for symmetry reasons, and consequently a polarization can only appear for a splay or a bend deformation, or a combination of the two. The local polarization density (P) can be written in terms of the divergence and curl of the director field,

P = es n(∇·n) + eb n × (∇×n)    (1)

where es and eb are the so-called splay and bend flexoelectric coefficients, respectively, and n is the director. It is worth noting that the flexoelectric polarization is often written in the alternative form P = e1 n(∇·n) + e3 (∇×n)×n, which is the same as Eq. 1, provided that e1 = es and e3 = eb. While the flexoelectric effect may be observed for all nematic liquid crystals, the effect in most conventional compounds is small due to very modest values of the flexoelectric coefficients. To gain a deeper understanding of the deformation-induced polarization, Meyer considered the impact of the molecular shape on the flexoelectric coefficients. In his paper (Meyer 1969), molecular configurations were proposed for two distinct molecular shapes: one which favors a splay deformation and the other a bend distortion. For the splay deformation, a pear-shaped molecule was considered (Fig. 1). This example illustrates the packing geometry for an asymmetric-shaped molecule in the undeformed state, whereby the director axis is aligned along a horizontal direction. In the distorted state, the shape of the molecules under a splay deformation of the director field results in a breakdown of the symmetry of the alignment and the appearance of a local polarization. Similarly, Fig. 2 shows the deformation-induced polarization when banana-shaped molecules are subjected to a bend distortion of the local director field. For nematic LCs, the molecular shape is generally not exclusively pear- or banana-shaped but instead has a geometrical configuration that allows a polarization to appear for both splay and bend distortions. An example of a molecular shape that combines aspects of both pear and banana shapes is illustrated in Fig. 3. In the absence of an electric field, no polarization is present and the system remains in an undistorted equilibrium configuration. Conversely, with the application of an electric field, the dipole moments align in such a way as to create a periodic splay-

Fig. 1 The flexoelectric effect for pear-shaped molecules: (a) the undeformed state with a net polarization of zero and (b) the appearance of nonzero polarization for a splay deformation of the molecules

Fig. 2 The flexoelectric effect for a banana-shaped molecule: (a) the undeformed state with a net polarization of zero and (b) a flexoelectric polarization as a result of the bend deformation

Fig. 3 The so-called inverse flexoelectric effect. (a) In the unperturbed state, the n → −n invariance is satisfied and the polarization is zero. (b) In the presence of an electric field, the symmetry is broken and the formation of a periodic splay-bend deformation gives rise to a nonzero polarization


bend deformation of the director field. As a result, there is then a polarization along the direction of the applied electric field. The presence of a local polarization necessitates the inclusion of an additional term in the free energy per unit volume to account for the electrical energy associated with the polarization. The modified free-energy density is then given as a summation of the elastic and flexoelectric free-energy densities:

f = f_elastic + f_flexoelectric    (2)

where f_flexoelectric = −E·P, and therefore the full expression for the free-energy density in terms of the divergence and the curl of the director field is given by

f = (1/2)K11(∇·n)² + (1/2)K22[n·(∇×n)]² + (1/2)K33|n×(∇×n)|² − es E·n(∇·n) − eb E·[n×(∇×n)]    (3)
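As a quick consistency check of Eq. 1 (an added sketch, not part of the original text), the polarization can be evaluated symbolically: a pure twist field gives P = 0, while a periodic splay-bend field of the kind shown in Fig. 3 does not. The sympy library is assumed.

import sympy as sp

x, y, z, q = sp.symbols('x y z q', real=True)
e_s, e_b = sp.symbols('e_s e_b', real=True)

def div(n):
    return sum(sp.diff(ni, ci) for ni, ci in zip(n, (x, y, z)))

def curl(n):
    return sp.Matrix([sp.diff(n[2], y) - sp.diff(n[1], z),
                      sp.diff(n[0], z) - sp.diff(n[2], x),
                      sp.diff(n[1], x) - sp.diff(n[0], y)])

def P(n):  # Eq. 1: P = e_s n (div n) + e_b n x (curl n)
    nv = sp.Matrix(n)
    return sp.simplify(e_s * div(n) * nv + e_b * nv.cross(curl(n)))

twist = [sp.cos(q * z), sp.sin(q * z), 0]        # pure twist, helix along z
splay_bend = [sp.cos(q * z), 0, sp.sin(q * z)]   # periodic splay-bend field
print(P(twist).T)        # -> Matrix([[0, 0, 0]]): twist generates no polarization
print(P(splay_bend).T)   # -> nonzero, with both e_s and e_b contributions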

The Flexoelectro-optic Effect In 1987, Patel and Meyer showed that flexoelectric coupling between an applied electric field and a chiral nematic liquid crystal (NLC) results in a fast-switching electro-optic effect (Patel and Meyer 1987). This interaction between an external stimulus and the LC medium is an in-plane deflection of the optic axis away from the equilibrium position at zero field. NLCs are required since they facilitate a space-filling periodic splay-bend deformation shown schematically in Fig. 3. If a cut is made at an oblique angle to the axis of the helix (known as the Bouligand cut (Bouligand 1969)), the projection of the director field onto the plane of the cut then resembles a periodic splay-bend deformation. An illustrative example of the Bouligand cut in an undistorted NLC is shown in Fig. 4. Alternatively, the same distortion pattern can be observed if the director planes are rotated in unison about the y-axis. In this situation, the director fields form a periodic splay-bend deformation on the plane of the Bouligand cut when viewed along the helix axis, as shown in Fig. 5. The rotation of the director planes is a manifestation of the flexoelectric coupling between an applied electric field and a chiral nematic medium. When an electric field is applied along the y-axis, that is, perpendicular to the helix axis, then in order to minimize the free energy, the helix structure distorts so that the director field assumes a periodic splay-bend deformation. The result is that the helix axis remains fixed, but the optic axis rotates by some angle, φ (referred to as the tilt angle), relative to the equilibrium state. The macroscopic optic axis is oriented perpendicular to the director planes. As the field strength increases, the polarization also increases as a result of the enhanced splay-bend deformation. Consequently, the Bouligand plane occurs at more and more oblique angles, and the angle through which the optic axis is rotated grows linearly in magnitude with the applied field strength.


Fig. 4 An example of a Bouligand cut through an undistorted chiral nematic liquid crystal. The pattern inside the box on the right-hand side represents the periodic splay-bend deformation of the director field that is projected on to the Bouligand plane, which is observed when viewed along the helix axis


Fig. 5 The pattern of the director field in the plane of the cut when viewed along the helix axis. (a) For no electric field, the chiral nematic is in an undeformed state, and a cut along the director planes perpendicular to the helix axis reveals the striped pattern shown on the right-hand side. (b) The chiral nematic is deformed with the application of an electric field, and a cut along the director planes, which is now at an oblique angle to the helix axis, reveals a periodic splay-bend pattern

The relationship between the tilt angle of the optic axis and the applied E-field strength was first obtained by Patel and Meyer (1987). It was shown that the equilibrium helical distortion could be determined by assuming that, for this condition, the elastic energy compensates the energy due to flexoelectric coupling. The equation for the tilt angle can be written, in its simplest form, as

tan φ = e p E / (2π K)    (4)


Fig. 6 An example of the rotation of the optic axis in a chiral nematic liquid crystal with an electric field applied perpendicular to the helical axis. Since the tilt angle depends upon the sign of the field as well as the magnitude, the direction of the rotation of the optic axis is determined by the direction of the applied electric field

where e = (es + eb)/2, K = (K11 + K33)/2, and p is the pitch. From this relationship, it is apparent that the tilt angle depends not only on the strength of the applied electric field but also on its polarity. Figure 6 demonstrates schematically the deflection of the optic axis in different directions for positive and negative applied electric fields. Note that in response to a bipolar electric field, the total rotation of the optic axis is twice the tilt angle φ. It is clear from Eq. 4 that to maximize φ, both a large flexoelastic ratio (e/K) and a long pitch (p) are required.
The time taken for the optic axis to rotate through an angle φ is a very important parameter with regard to the performance of the electro-optic effect and is one feature that makes it of particular interest for future display modes. The dynamic behavior of the flexoelectro-optic effect can be considered to depend upon the viscoelastic forces associated with the macroscopic deformation. From a thermodynamic approach, it is possible to show that the response time, τ, is given by (Patel and Lee 1989):

τ = η p² / (4π² K)    (5)

where η is the effective viscosity associated with the distortion of the helix. The dominant factor in the response time is the pitch, and therefore in order to achieve fast switching, a short pitch value is required. Experimentally, the initial reports observed response times of the order of 100 μs (Patel and Meyer 1987).
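To put numbers to Eqs. 4 and 5, the short sketch below uses assumed, bimesogen-like material parameters; the specific values are illustrative and are not taken from this chapter.

import numpy as np

e   = 15e-12    # mean flexoelectric coefficient, C/m (assumed)
K   = 10e-12    # mean elastic constant, N (assumed)
eta = 0.1       # effective viscosity, Pa s (assumed)
p   = 300e-9    # pitch, m (short enough to avoid visible diffraction)
E   = 5e6       # applied field, 5 V/um

phi = np.degrees(np.arctan(e * p * E / (2 * np.pi * K)))   # Eq. 4
tau = eta * p ** 2 / (4 * np.pi ** 2 * K)                  # Eq. 5
print(f"tilt angle ~ {phi:.0f} deg, response time ~ {tau * 1e6:.0f} us")

With these values the sketch gives a tilt of roughly 20° and a response time of a few tens of microseconds, consistent with the magnitudes quoted in the text.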


It is important to consider that, to maximize the extent of the linear regime of the flexoelectro-optic response, the dielectric response needs to be as small as possible. Equation 3 assumes that dielectric coupling between the applied electric field and the NLC is insignificant but, in practice, most nematic LCs possess a nonzero dielectric anisotropy (Δε). Therefore, when the dielectric anisotropy is nonzero, an additional term has to be added to the free-energy density expression. This can be written as (de Gennes and Prost 1993)

f_dielectric = −(ε0Δε/2)(n·E)²    (6)

The dielectric term is quadratic in the field E as opposed to the linear dependence observed for flexoelectric coupling, and this form of coupling to the applied electric field will typically dominate. Most compounds that have been developed for current display technology possess a nonzero dielectric anisotropy, and therefore the flexoelectric behavior is observable only at very low E-fields before the helix begins to unwind and dielectric coupling dominates. This restriction to low fields then limits the amplitude of the flexoelectro-optic response. The threshold field for unwinding the helix when an electric field is applied perpendicular to the helix axis is given by (Patel and Meyer 1987)

E_crit = (π²/p) [ K22 / (ε0Δε − 4e²/(πK)) ]^(1/2)    (7)

Equation 7 includes contributions from both dielectric and flexoelectric coupling. Evidently, a low dielectric anisotropy and a large flexoelastic ratio are required to make the critical field for unwinding the helix very large, so that the linear regime due to flexoelectro-optic effects is preserved up to large electric field strengths. Overall, the combination of Eqs. 4, 5, and 7 shows that to optimize the flexoelectro-optic effect, the flexoelastic ratio must be large, whereas the pitch and the dielectric anisotropy must be small. Such a combination of material parameters is not readily available using conventional LC compounds, and alternative LC compounds are required. The development of new materials specifically for the flexoelectro-optic effect is discussed in the following section.

Development of Materials for the Flexoelectro-optic Effect
The initial studies on the flexoelectro-optic effect reported small tilt angles (e.g., 7°) as a result of small flexoelectric coefficients (Patel and Meyer 1987; Patel and Lee 1989). Extensive work was carried out by the Chalmers group in Sweden in the early 1990s (Komitov et al. 1994; Rudquist et al. 1994, 1997; Rudquist and Lagerwall 1997), and the tilt angles were increased to 30°, although impractically large electric field strengths of 130 V μm−1 were still required. Reduction in the driving

Fig. 7 Example of a bimesogenic compound (fluoro-substituted mesogenic end groups linked through a flexible OC9H18O spacer)

voltage was later achieved following the development of bimesogenic structures (Musgrave et al. 1999; Noot et al. 2001; Coles et al. 2001) (see Fig. 7), which combined relatively strong polar groups with a configuration that minimizes the dielectric anisotropy through the orientation of the dipole moments. The generic structure of a bimesogen consists of rigid aromatic cores as the terminal units, with a central flexible alkyl chain containing either an "odd" or "even" number of carbon atoms. The presence of the flexible spacer at the center of the molecule allows the opposing mesogenic units, which for all intents and purposes provide the source of the anisotropy, to assume different orientations relative to one another. For "odd" and "even" spacers, the dominant conformers are bent and linear configurations, respectively. To a first approximation, the improvement in the flexoelectric properties of bimesogenic structures can be understood in terms of dipolar theory. For example, Helfrich's model (Helfrich 1971) showed that the bend flexoelectric coefficient could be expressed as

e3 = μ⊥ θ K33 (b/a)^(2/3) N^(1/3) / (kB T)    (8)

where μ⊥ is the transverse dipole moment, θ is the bend angle, a and b are the length and breadth of the molecule, respectively, and N is the number density. An illustration of the geometrical model used by Helfrich is given in Fig. 8. Studies carried out on a nonsymmetric homologous series showed that there was an odd-even effect when the flexoelastic ratio (e/K) was plotted as a function of the number of ether links in the flexible spacer (Morris et al. 2007), similar to the odd-even effect observed for the phase-transitional properties. Larger e/K ratios were observed for structures with an odd number of units in the flexible spacer, because these predominantly exhibit a bent conformation and thus θ is increased, in accordance with Helfrich's model. Recently, bent-core and banana-shaped compounds have also been shown to have large flexoelectric coefficients for the same reasons, both as neat compounds (Harden et al. 2005) and in mixtures (Wild et al. 2005; Kundu et al. 2009), although subsequent studies suggest (Le et al. 2009; Kumar et al. 2009) that these coefficients are not quite as large as first reported. The larger flexoelectric coefficients of the bimesogens, coupled with their low dielectric anisotropy, meant that large tilt angles per unit electric field were observed, while at the same time the critical field for unwinding was significantly increased.
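An order-of-magnitude evaluation of Helfrich's expression (Eq. 8) shows that generic molecular values give coefficients of a plausible size; every parameter below is an assumed, illustrative number.

import numpy as np

mu_perp = 1.5 * 3.336e-30   # transverse dipole ~1.5 D, in C m (assumed)
theta   = np.deg2rad(60)    # molecular bend angle (assumed)
K33     = 10e-12            # bend elastic constant, N (assumed)
a, b    = 2e-9, 0.5e-9      # molecular length and breadth, m (assumed)
N       = 2e27              # number density, m^-3 (assumed)
kB, T   = 1.381e-23, 300.0  # Boltzmann constant and temperature

e3 = mu_perp * theta * K33 * (b / a) ** (2 / 3) * N ** (1 / 3) / (kB * T)
print(f"e3 ~ {e3 * 1e12:.0f} pC/m")   # -> a few pC/m

Values of a few pC/m match typical calamitic nematics, and the formula makes clear why increasing the effective bend angle θ, as odd-spacer bimesogens do, raises the coefficient directly.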

Fig. 8 Geometrical model used by Helfrich for bent-shaped molecules (Helfrich 1971)

Fig. 9 An example of the tilt angle as a function of the electric field strength for a bimesogenic compound (tilt angle φ from 0° to 100° plotted against electric field from 0 to 15 V μm−1)

The majority of the bimesogenic structures are achiral, and thus a chiral dopant with a high twisting power is required. Further research has led to the development of mixtures that exhibit very large tilt angles (greater than 80°); moreover, the field strength required for full-intensity modulation was found to be substantially reduced, to only 1 or 2 V/μm (Coles et al. 2006). An example of the large tilt angles achievable with bimesogenic mixtures is shown in Fig. 9. Other investigations have also been conducted on structure-flexoelectric property relationships. For example, the effects of photoisomerization on the flexoelectric properties of NLCs have been examined in order to evaluate the influence of molecular shape in monomesogenic compounds (Komitov et al. 2000). For this study, azobenzene dopants were used, since these materials undergo a trans-cis conformational change when subjected to ultraviolet radiation. The results were discussed in terms of the modification of the pitch of the NLC along with the change in the flexoelastic ratio. It was found that the flexoelastic ratio increased after the trans-cis conformational change, and this was attributed to an


increase in the bend flexoelectric coefficient. Furthermore, on comparison with an earlier study (Hermann et al. 1997), it was considered that molecular dissymmetry also plays an important role in the magnitude of e/K. This combined research has resulted in the development of liquid crystals designed specifically for the flexoelectro-optic effect, which enhance the coupling between the applied electric field and the LC. The following sections describe the two potential modes of operation of the electro-optic effect and the challenges that need to be solved before the effect can be exploited in commercial devices.

Uniform Lying Helix Mode
The original mode of operation of the flexoelectro-optic effect is the uniform lying helix (ULH) configuration, shown in Fig. 10. In this case, the helix axis is aligned parallel to the substrates, and an electric field is applied across the sample, satisfying the requirement that the direction of the electric field be orthogonal to the helix axis. With the application of an electric field, the optic axis is deflected in the plane of the device in either direction, depending upon the polarity of the applied electric field (as discussed in section "The Flexoelectro-optic Effect"). The linear dependence of the tilt angle on the electric field strength (cf. Fig. 9) provides grayscale modulation. Moreover, since the rotation is in the plane of the device, this mode offers the possibility of a wide viewing angle. As discussed in detail in section "The Flexoelectro-optic Effect", fast response times require short pitch values. Furthermore, to prevent losses due to diffraction from the periodic structure, the pitch should be smaller than the wavelength of the incident light. Generally, pitch values of the order of 300 nm are sufficient to obtain fast response times without significantly decreasing the tilt angle for a given electric field strength. Ultimately, the major challenge for the ULH mode is the spontaneous and stable alignment of the optic axis. It is well known that an NLC does not spontaneously align in the ULH configuration in conventional cells treated with standard alignment layers such as rubbed polyimide or homeotropic surfactants, because the periodic structure of the LC does not match the anchoring conditions imposed by the surfaces. Alignment can be achieved by certain procedures, for example, cooling in the presence of an electric field and/or applying mechanical stress to the glass substrates. However, this alignment is unstable: for planar anchoring, the helical structure reverts to a Grandjean texture once the electric field is removed, with a relaxation time governed by the cell thickness. Furthermore, these processes are not applicable to mass manufacturing, and it is often the case that the contrast ratio is lower than 100:1. Studies have been carried out to try to solve the problem of spontaneous and reliable alignment of the ULH. Komitov and coworkers (Komitov et al. 1999) showed that it was possible to obtain good alignment using planar periodic boundary conditions in the form of alternating planar/homeotropic anchoring coated onto one substrate, with the periodicity of the anchoring conditions equivalent to one-half turn of the helical structure. The helical structure in this case did not

Fig. 10 The uniform lying helix configuration

The helical structure in this case did not spontaneously align with its axis in the plane of the device, and an electric field was required to trigger alignment. A subsequent report replaced the planar periodic anchoring with an NLC polymer layer coated onto both substrates; in this case, the pitch of the polymer alignment layer was identical to that of the bulk NLC (Hegde and Komitov 2010). However, the process also involved applying an electric field to unwind the helix and then reducing the field strength while slowly cooling the material.

Several studies have used polymer stabilization to "freeze in" the ULH structure such that it is stable to thermal cycling. Polymer stabilization can be achieved without an adverse effect on the flexoelectro-optic switching properties, but this procedure does require the alignment to be obtained before the polymer network is formed (Rudquist et al. 1998; Kim et al. 2005; Broughton et al. 2006). Carbone and coworkers used polymer microchannels that were formed using the polymer-liquid crystal-polymer slices (POLICRYPS) method (Carbone et al. 2006). In contrast to the surface-based approaches, this method results in a spontaneous alignment of the ULH since this is now the lowest-energy configuration. Furthermore, for this arrangement, the ULH alignment is resilient to temperature cycling across the phase transitions and to unwinding by large electric field strengths. In addition, contrast ratios greater than 100:1 were observed.

Recent research has shown that the helical axis of the NLC actually aligns at some angle to the surface alignment direction for planar-aligned cells (Salter et al. 2009a). If both surfaces are coated with planar alignment, the resulting ULH structure can consist of two separate domains. Salter and coworkers have studied the dependence of these domains on the applied electric field strength. Moreover, it was found that it is possible to obtain a mono-domain ULH structure by altering the alignment directions on each substrate, which in turn results in an increase in the contrast ratio.


Uniform Standing Helix Mode

An alternative geometry is obtained by aligning the NLC in the more conventional Grandjean texture and then applying an electric field perpendicular to the helical axis using in-plane electrodes; this is referred to as the uniform standing helix (USH) mode and is illustrated in Fig. 11. Initially, the USH mode was considered in the context of a fast-switching phase device for telecommunication applications (Broughton et al. 2005a, b; Davidson et al. 2006), but more recently it has been considered as a potential new display mode (Castles et al. 2009, 2010). Unlike the ULH mode, the rotation of the optic axis is now out of the plane of the device, and thus a compensation film would be required to obtain a wide viewing angle. However, theoretical predictions have shown that a positive c-plate compensation film with an optic axis normal to the plane of the device should be adequate to cancel out the phase retardation of the LC (Castles et al. 2010). One of the key benefits of this mode over the ULH is that the alignment is trivial, since the Grandjean texture is the lowest-energy configuration when using conventional polyimide alignment layers.

A comparison of flexoelectro-optic switching in the ULH and USH modes revealed that the tilt angles are comparable for the two modes (Salter et al. 2009b). However, for the USH mode to operate as a display device, larger tilt angles are required for full-intensity modulation than the 22.5° required in the ULH mode. The actual tilt angle required for full-intensity modulation will depend upon a number of factors such as the flexoelastic ratio, the birefringence, and the cell thickness (Castles et al. 2009). Another significant benefit of this mode is the off-state, which can tend toward being optically inactive for visible wavelengths provided the pitch is very short (typically less than 200 nm).


Fig. 11 The uniform standing helix configuration



A report on flexoelectro-optic switching in the USH mode showed that the transmission in the zero-field condition can be expressed as (Castles et al. 2009)

$$I_{\mathrm{off}} \approx \frac{\pi^{2} n^{4} (\Delta n)^{4} p^{6} d^{2}}{16 \lambda^{8}} \qquad (9)$$

for the condition that $np \ll \lambda$. As a result, this can lead to very high contrast ratios (greater than 2000:1) when viewed at normal incidence to the cell (Castles et al. 2010). The USH mode is not unlike the vertically aligned nematic (VAN) LC mode, although a short-pitch NLC behaves as a negatively birefringent medium. The tight constraint on the pitch does, however, mean that very large flexoelectric coefficients are required if large tilt angles per unit of applied electric field strength are to be obtained (cf. Eq. 4). Nevertheless, the potential of this LC mode is clear: fast response times combined with high contrast ratios, a combination not readily achievable with many other LC modes. However, further developments in materials and in the understanding of the USH mode are required before this can become a commercially viable technology.
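As a quick plausibility check on Eq. 9, the following minimal sketch evaluates the off-state leakage for representative values (assumed here for illustration, not taken from the chapter): average index n = 1.6, Δn = 0.2, pitch p = 150 nm, cell gap d = 5 μm, and λ = 550 nm.

```python
import math

def off_state_transmission(n, delta_n, p, d, lam):
    """Zero-field (off-state) transmission of the USH cell between crossed
    polarizers, per Eq. 9; valid only in the short-pitch limit n*p << lambda."""
    return (math.pi**2 * n**4 * delta_n**4 * p**6 * d**2) / (16 * lam**8)

# Assumed, illustrative parameters (all lengths in metres)
I_off = off_state_transmission(n=1.6, delta_n=0.2, p=150e-9, d=5e-6, lam=550e-9)
print(f"I_off ~ {I_off:.1e}")  # ~2e-4 of the incident light leaks through
```

With these illustrative numbers the leakage is of order 10⁻⁴, consistent with the greater-than-2000:1 contrast ratios quoted above.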

Summary

This chapter has described the two different modes of operation of the flexoelectro-optic effect: the uniform lying helix and the uniform standing helix alignments. Both modes exploit the sub-millisecond response time of the flexoelectro-optic effect combined with the grayscale capability due to the linear response of the tilt angle with the electric field strength. However, the two modes do have different attributes. For the uniform lying helix configuration, the in-plane rotation of the optic axis leads to a wide viewing angle, negating the need for additional compensation films. The key challenge for this mode is achieving alignment of the helical axis in the plane of the device, which is nontrivial due to surface anchoring conditions. The combined research has made significant advancements in this area, both in terms of materials and in developing new strategies with which to achieve a high contrast ratio. Although spontaneous alignment of the helix axis with contrast ratios greater than 1000:1 has yet to be realized, recent developments in the understanding have identified potential routes forward. In contrast to the ULH configuration, the USH mode does not have the problem of alignment. Further, for very short pitch values, the very low transmission between crossed polarizers potentially leads to very high contrast ratios at normal incidence. A compensation film is required due to the out-of-plane rotation of the optic axis, and continued development of the materials is required to increase the flexoelectric coefficients further so as to obtain full-intensity modulation at low electric field strengths. Despite the present challenges, both modes are of significant interest for next-generation fast-switching flat-panel displays.


Further Reading

Bouligand Y (1969) Sur l'existence de pseudomorphoses cholestériques chez divers organismes vivants. J Physique (Coll. C4) 30(suppl. au 11–12):90–103
Broughton BJ, Betts RA, Bricheno T, Blatch AE, Coles HJ (2005a) Liquid crystal based continuous phase retarder: from optically neutral to a quarter waveplate in 200 microseconds. Proc SPIE 5741(28):190–196
Broughton BJ, Clarke MJ, Blatch AE, Coles HJ (2005b) Optimized flexoelectric response in a chiral liquid-crystal phase device. J Appl Phys 98(3):034109
Broughton BJ, Clarke MJ, Morris SM, Blatch AE, Coles HJ (2006) Effect of polymer concentration on stabilized large-tilt-angle flexoelectro-optic switching. J Appl Phys 99:023511
Carbone G, Salter P, Elston SJ, Raynes P, De Sio L, Ferjani S, Strangi G, Umeton C, Bartolino R (2006) Short pitch cholesteric electro-optical device based on periodic polymer structures. Appl Phys Lett 95:011102
Castles F, Morris SM, Coles HJ (2009) Flexoelectro-optic properties of chiral nematic liquid crystals in the uniform standing helix configuration. Phys Rev E 80:031709
Castles F, Morris SM, Gardiner DJ, Malik Q, Coles HJ (2010) Ultra-fast-switching flexoelectric liquid-crystal display with high contrast. J Soc Info Disp 18:128–133
Coles HJ, Musgrave B, Coles MJ, Willmott J (2001) The effect of the molecular structure on flexoelectric coupling in the chiral nematic phase. J Mater Chem 11:2709–2716
Coles HJ, Clarke MJ, Morris SM, Broughton BJ, Blatch AE (2006) Strong flexoelectric behavior in bimesogenic liquid crystals. J Appl Phys 99(3):034104
Davidson AJ, Elston SJ, Raynes EP (2006) Investigation into chiral active waveplates. J Appl Phys 99:093109
de Gennes PG, Prost J (1993) The physics of liquid crystals. Clarendon, Oxford
Demus D, Goodby J, Gray GW, Spiess H-W, Vill V (eds) (1998) The handbook of liquid crystals, vol 2B. Wiley, Weinheim, Chap. 2
Harden J, Mbanga B, Eber N, Fodor-Csorba K, Sprunt S, Gleeson JT, Jakli A (2005) Giant flexoelectricity of bent-core nematic liquid crystals. Phys Rev Lett 97:157802
Hegde G, Komitov L (2010) Periodic anchoring condition for alignment of a short pitch cholesteric liquid crystal in uniform lying helix texture. Appl Phys Lett 96:113503
Helfrich WZ (1971) Z Naturforsch 26a:833–835
Hermann DS, Rudquist P, Ichimura K, Kudo K, Komitov L, Lagerwall ST (1997) Flexoelectric polarization changes induced by light in a nematic liquid crystal. Phys Rev E 55:2857–2860
Kim S, Chien L, Komitov L (2005) Short pitch cholesteric electro-optical device stabilized by nonuniform polymer network. Appl Phys Lett 86:161118
Komitov L, Lagerwall ST, Stebler B, Strigazzi A (1994) Sign reversal of the linear electro-optic effect in the chiral nematic phase. J Appl Phys 76:3762
Komitov L, Bryan-Brown GP, Wood EL, Smout ABJ (1999) Alignment of cholesteric liquid crystals using periodic anchoring. J Appl Phys 86:3508–3511
Komitov L, Ruslim C, Ichimura K (2000) Effect of photoisomerization of azobenzene dopants on the flexoelectric properties of short-pitch cholesteric liquid crystals. Phys Rev E 61(5):5379–5384
Kumar P, Marinov YG, Hinov HP, Hiremath US, Yelamaggad CV, Krishnamurthy KS, Petrov AG (2009) Converse flexoelectric effect in bent-core nematic liquid crystals. J Phys Chem 113:9168–9174
Kundu B, Roy A, Pratibha R, Madhusudana NV (2009) Flexoelectric studies on mixtures of compounds made of rodlike and bent-core molecules. Appl Phys Lett 95:081902
Le KV, Araoka F, Fodor-Csorba K, Ishikawa K, Takezoe H (2009) Flexoelectric effect in a bent-core mesogen. Liq Cryst 36:1119–1124
Meyer RB (1969) Piezoelectric effects in liquid crystals. Phys Rev Lett 22:918–921
Morris SM, Clarke MJ, Blatch AE, Coles HJ (2007) Structure-flexoelastic properties of bimesogenic liquid crystals. Phys Rev E 75:041701
Musgrave B, Lehmann P, Coles HJ (1999) A new series of chiral nematic bimesogens for the flexoelectro-optic effect. Liq Cryst 26:1235–1249
Noot C, Coles MJ, Musgrave B, Perkins SP, Coles HJ (2001) The flexoelectric behaviour of a hypertwisted chiral nematic liquid crystal. Mol Cryst Liq Cryst 366:725–733
Patel JS, Lee S-D (1989) Fast linear electro-optic effect based on cholesteric liquid crystals. J Appl Phys 66:1879–1881
Patel JS, Meyer RB (1987) Flexoelectric electro-optics of a cholesteric liquid crystal. Phys Rev Lett 58:1538
Rudquist P, Lagerwall ST (1997) On the flexoelectric effect in nematics. Liq Cryst 23:503–510
Rudquist P, Buivydas M, Komitov L, Lagerwall ST (1994) Linear electro-optic effect based on flexoelectricity in a cholesteric with sign change of dielectric anisotropy. J Appl Phys 76:7778–7783
Rudquist P, Carlsson T, Komitov L, Lagerwall ST (1997) The flexoelectro-optic effect in cholesterics. Liq Cryst 22:445–449
Rudquist P, Komitov L, Lagerwall ST (1998) Volume-stabilized ULH structure for the flexoelectro-optic effect and the phase-shift effect in cholesterics. Liq Cryst 24:329–334
Salter PS, Elston SJ, Raynes EP, Parry-Jones LA (2009a) Alignment of the uniform lying helix structure in cholesteric liquid crystals. Jpn J Appl Phys 48:101302
Salter PS, Kischka C, Elston SJ, Raynes EP (2009b) The influence of chirality on the difference in flexoelectric coefficients investigated in uniform lying helix, Grandjean and twisted nematic structures. Liq Cryst 36:1355–1364
Wild JH, Bartle K, Kirkman NT, Kelly SM, O'Neill M, Stirner T, Tuffin RP (2005) Synthesis and investigation of nematic liquid crystals with flexoelectric properties. Chem Mater 17:6354–6360
Yang D-K, Wu S-T (2006) Fundamentals of liquid crystal devices. Wiley, Chichester

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_101-2 # Springer-Verlag Berlin Heidelberg 2015

Electrophoretic Displays
Karl Amundson*
E Ink Corporation, Cambridge, MA, USA

Abstract

Electrophoretic displays offer high brightness and contrast across the full range of viewing angles, an "ink-on-paper" appearance, and image stability. These attributes make electrophoretic displays an attractive candidate for portable devices that require easy readability in a variety of lighting conditions from indoor to bright sunlight without consuming much power. This chapter reviews the early development of electrophoretic displays and the challenges to commercialization, as well as the current electrophoretic display state of the art.

Introduction

An ideal electronic paper display offers ink-on-paper-like optical attributes, including the high contrast and brightness considered important for comfortable long-term reading, high comprehension, and easy readability in a variety of conditions from office lighting to bright sunlight; consumes power sparingly for long battery life in portable devices; and is durable, or even conformable or flexible. These attributes are desirable for incorporation into portable devices and especially for devices that are intended for long-term reading, such as electronic book readers.

A number of imaging film technologies have been considered as candidates for electronic paper displays. Because emissive displays consume significant power and commonly used types wash out in bright sunlight, top electronic paper candidates are non-emissive. The dominant display technology, the twisted nematic display, does not offer high brightness in its non-emissive (reflective or non-backlit) form, primarily due to the optical polarizers that are required to achieve optical contrast. Higher reflectance has been sought among reflective display technologies that do not employ polarizers. Within the class of liquid crystal displays, dispersions of liquid crystal droplets in a polymer matrix or bi-continuous polymer and liquid crystal dispersions give electric field-switchable scattering through manipulation of the anisotropic liquid crystal refractive index (see chapter "▶ Liquid Crystal Polymer Composite Materials for LCDs"). The scattering power of such films is limited by the small refractive index anisotropy, typically about 0.2 at best. Thin cholesteric liquid crystal films return light to the viewer through Bragg reflection off the periodic twist structure (see chapter "▶ Cholesteric Reflective Displays").

Particle-based imaging films offer the promise of very high reflectance. Particles can be chosen that give a much higher refractive index contrast than achieved by the refractive index anisotropy of the liquid crystal technologies just mentioned. For example, titania particles offer a refractive index of 2.6 versus that of 1.4 for simple oils. In addition to high scattering, particle-based displays offer near-Lambertian scattering, which gives them a paperlike appearance and constant performance across a full range of viewing angles.

One early type of reflective particle display developed at the Xerox Corporation (Sheridon and Berkovitz 1977; Sheridon et al. 1997; Sheridon 2005) was called "Gyricon" and was based upon bichromal balls held in oil cavities in a polymer sheet.

*Email: [email protected]


Fig. 1 A cross section of a thin electrophoretic film containing white scattering pigment particles in a dyed fluid. In (a), the display appears dark or colored (viewer is above the display) because the pigment is behind the dyed fluid. A voltage is applied to two pixels in (b), driving the pigment over those pixels to the front of the display. These pixels appear white because the pigment is in front of the dyed fluid

The bichromal balls are composed of particle-filled polymer; one hemisphere contains light-scattering particles and the other light-absorbing particles. These spheres have a permanent electric dipole aligned along the bichromal axis of rotational symmetry, so an electric field can be used to rotate these spheres between a state where their white hemispheres face the viewer and the opposing orientation where their black hemispheres face the viewer.

Another type of particle-based display is the electrophoretic display. Many groups have pursued electrophoretic displays of various forms. Common to electrophoretic displays is that pigment particles in an oil-based liquid or in air translate under the force of an applied electric field. This chapter reviews electrophoretic displays that employ through-film switching, where particles are moved primarily toward and away from the viewing surface to achieve optical contrast. "In-plane" electrophoretic technology, where optical contrast is achieved by moving particles primarily laterally to the viewing surface, is reviewed in chapter "▶ In-Plane Electrophoretic Displays." Through-film switching electrophoretic displays are by far the most prevalent type of commercial reflective electronic paper display today.

Early Development of Electrophoretic Imaging Films

Electrophoretic display imaging films were first developed in the 1970s, concurrently with the development of the twisted nematic display (Evans et al. 1971; Ota et al. 1973; Dalisa 1977; Amundson 2005). These electrophoretic displays used particles with a high refractive index in a dyed, oil-based fluid film. This "electrophoretic fluid" is held between a front and back electrode (Fig. 1). A voltage applied to the back electrode pushes the particles toward the viewing surface to achieve a white, reflective state. An opposing voltage pulls particles away from the viewer, so that light is absorbed by the dye before being reflected back to the viewer, giving a black or colored state, depending on the absorption spectrum of the dye. In this way, contrast is achieved by applying voltage pulses.


Fig. 2 A series of ribs extending vertically in the display contain particle populations and limit lateral migration of pigment

A large refractive index contrast between the electrophoretic particles and their surrounding fluid is important because the white state is achieved through backscattering of ambient light, and the intensity of light scattering scales roughly as the square of the refractive index contrast between the scattering particles and their surrounding fluid. The white scattering particle size is chosen to be on the order of the wavelength of visible light, or to contain scattering structure on that length scale, in order to maximize the light scattering. Properly designed, electrophoretic films can give a much brighter white state than reflective liquid crystal films, because liquid crystal films (except for certain cholesteric liquid crystal films) require light-absorbing polarizers in order to achieve optical contrast. Also, because scattering is used to achieve the bright state, these displays offer high contrast across the full range of viewing angles and an appearance similar to that of ink on paper.

The electrophoretic particles and their liquid medium must be carefully prepared for successful electrophoretic function. The particles must have a charge in order to move in response to backplane voltages. Many types of particles develop a natural charge in a fluid medium through ionization of surface species or preferential adsorption onto the surface. However, it is preferable to impart a charge to the surface through controlled surface functionalization. For example, ionizable species can be attached to particle surfaces whereby either the cation or anion is preferentially solvated, leaving behind the complementary charge on the particle surface. In addition, particle surfaces must be treated to avoid particle agglomeration or particles sticking to other surfaces. Surface treatment is most often achieved through chemical attachment of polymer brushes or physisorbed surfactants. These surface polymer assemblies must be sufficiently dense and of sufficient extent to create a steric hindrance to close particle approach, because irreversible association occurs if particles are brought close enough for the van der Waals attraction energy to overwhelm thermal energy.

While the twisted nematic display technology was successfully developed into commercial displays during the 1970s, development of electrophoretic films was hindered by several observed failure modes. Despite the fact that humans have used dyes since prehistoric times, dyes are commonly unstable and degrade over time. Under switching voltages, the electrophoretic particle swarm would not switch uniformly but would form "roll-cell" instabilities that leave an observable texture on the display surface. Particles would migrate laterally to electrode edges through a dielectrophoretic force, or gravitational force could cause particles to settle to one side of a display held vertically. The failure modes due to lateral migration, "roll-cell" patterns, migration to electrode edges, and settling under gravitational force can all be reduced to subcritical length scales by subdividing the fluid reservoir into an array of cells on the order of tens to a hundred microns wide. Hopper and Novotny (1979) and Blazo (1982) described microcell structures made from photo-patterned ribs of photoresist material. The microcells were filled with an electrophoretic fluid (Fig. 2). All extant electrophoretic displays utilize microcapsules or microcells to eliminate long-range lateral migration.
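To put the square-law scaling in rough numbers, here is an illustrative back-of-the-envelope check using the representative index values quoted in the Introduction (not measured data):

```python
# Scattering power scales roughly as the square of the refractive index contrast.
titania, oil = 2.6, 1.4      # representative indices from the Introduction
lc_anisotropy = 0.2          # typical liquid crystal index modulation
ratio = ((titania - oil) / lc_anisotropy) ** 2
print(f"particle system scatters ~{ratio:.0f}x more strongly")  # ~36x
```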

Fig. 3 Representation of drive voltages required to achieve an image sequence where one begins in one optical state, transitions to another optical state (at the first dashed line), then back to the original optical state (at the second dashed line). (a) shows a sample waveform for a twisted nematic display, and (b) shows a sample waveform for an image-stable electrophoretic display

Image Stability, Power Consumption, and Driving Schemes

When properly formulated, electrophoretic films exhibit image stability, whereby the display holds its image even in the absence of an electric field. Image-stable displays require a very different driving paradigm than monostable displays such as twisted nematic displays. A twisted nematic film requires continuous application of a voltage with amplitude chosen to maintain a graytone; in the absence of a driving voltage, a displayed image will vanish. An image-stable display requires a voltage only to change a graytone (the location of the pigments), and, in the absence of a driving voltage, an image will persist. The waveforms required to transition from one graytone to another are illustrated in simplified form in Fig. 3. Because an image-stable display requires power only to update an image, for displays that are not continuously updated, the power savings can be considerable.

Electrophoretic displays can be driven with direct-drive or active-matrix addressing in a straightforward fashion. Some formulations offer a threshold voltage sufficient to allow passive-matrix addressing (Ota et al. 1973, 1975; Amundson 2005; Lewis et al. 1977). One method for achieving an effective threshold voltage through backplane design is to employ a control grid structure (see Fig. 4). The physical action is analogous to vacuum tube operation. Particles behind a control grid electrode are shielded from a switching electric field. Once the control grid voltage falls below a critical value, the particles experience a switching electric field (Singer and Dalisa 1977; Murau 1984).

In yet a different scheme, Blazo (1982) describes a photo-responsive electrophoretic display made by stacking a photoconductive layer in series with an electrophoretic film between front and back electrodes. In the absence of light, the photoconductive layer is highly resistive and protects the electrophoretic film from a switching voltage. Exposing specific areas of the display to light renders the photoconductor in those areas conductive, and so the electrophoretic film in those areas experiences a switching electric field. In this way, an image is created in response to light exposure.


Fig. 4 Control grid structure for driving electrophoretic films. In this image, the pigment is trapped in electrostatic potential wells because of a voltage difference between the row and column electrode. Reversing the voltage bias between row and column electrodes releases the pigment to be moved toward the front in response to a voltage on the front electrode


Speed and Optical Performance Reciprocity in Electrophoretic Films

In this section, the relationship between electrophoretic film design and properties and the response time is explored, based upon a very simple model for the response time. As a starting point, electrophoretic films that utilize scattering particles for the bright state and dyes for the dark state typically need to be on the order of tens to a hundred microns in thickness. At voltages on the order of tens of volts, these films switched between optical extremes in tens to a few hundreds of milliseconds. Here, we review the basic factors that determine the switching speed.

In a fluid, charged particles reach a terminal velocity in response to an applied electric field very quickly (typically, in well under a microsecond). Therefore, one can ignore inertial contributions to particle motion and express the electrophoretic motion through the terminal velocity:

$$v = \mu E \qquad (1)$$

where $\mu$ is the electrophoretic mobility, determined by the balance between the Coulombic pull and the viscous fluid drag. Derivations of the electrophoretic mobility can be found in Morrison and Ross (2002) and Probstein (1994). An expression for the electrophoretic mobility was first developed by Smoluchowski:

$$\mu = \zeta \epsilon / \eta \qquad (2)$$

where $\epsilon$ and $\eta$ are the dielectric constant and viscosity of the surrounding fluid, and $\zeta$, the zeta potential, is the electrostatic potential at the shear plane around a particle. (The shear plane is an idealized representation of the dividing surface between fluid entrained by a particle and fluid that flows around a particle. Typically, the shear plane is near the outer reach of a polymer brush stabilization layer of a particle.) The zeta potential arises from the charge of the particle and can be approximated by

$$\zeta = q \, l_D / \epsilon \qquad (3)$$


where $q$ is the particle surface charge density and $l_D$ is the Debye length in the surrounding fluid. Equations 2 and 3 can be combined to give an electrophoretic mobility of

$$\mu = q \, l_D / \eta \qquad (4)$$

The time to switch an electrophoretic display film can be approximated by the distance traversed by the particles, $h$, divided by their electrophoretic velocity $v$. The electrophoretic velocity is simply the electrophoretic mobility times the applied electric field strength $E$, and the applied electric field is approximately the applied voltage drop across the display cell, $V$, divided by the cell gap $h$. This gives an approximate switching time of

$$T_{\mathrm{switch}} \approx h/v = h/(\mu E) = h^{2}/(\mu V) = h^{2} \eta / (q \, l_D \, V) \qquad (5)$$

From this equation, arising from a very simplified model for an electrophoretic display, we can see how the switching speed depends on the particle surface charge density, the fluid properties, the cell gap, and the drive voltage. Electrophoretic films, unlike twisted nematic films, can be switched between two graytones at a variety of drive voltages, as one would expect from Eq. 5. From these first principles, one may estimate that the switching time scales as the square of the cell gap divided by the drive voltage, once the particle and fluid properties have been fixed. The interest in faster response drives a cell design toward thinner dimensions; the desire for higher contrast gives an opposing drive toward thicker films. Likewise, there is reciprocity between switching voltage and response speed.
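The following minimal sketch evaluates Eq. 5 for assumed, representative values (chosen only to land in the regime the chapter quotes; they are not measured parameters) and illustrates the quadratic cell-gap scaling:

```python
def switching_time(h, eta, q, l_D, V):
    """Eq. 5: T_switch ~ h^2 * eta / (q * l_D * V).
    h: cell gap (m); eta: fluid viscosity (Pa s);
    q: particle surface charge density (C/m^2);
    l_D: Debye length (m); V: drive voltage (V)."""
    return h**2 * eta / (q * l_D * V)

# Assumed values: 50 um gap, 1 cP oil, 15 V drive, and a charge density and
# Debye length giving a mobility mu = q*l_D/eta = 5e-10 m^2/(V s).
h, eta, V = 50e-6, 1.0e-3, 15.0
q, l_D = 5e-6, 1.0e-7
print(f"T_switch ~ {switching_time(h, eta, q, l_D, V)*1e3:.0f} ms")     # ~333 ms
print(f"halved gap: {switching_time(h/2, eta, q, l_D, V)*1e3:.0f} ms")  # ~83 ms (4x faster)
```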

Recent and Present-Day Electrophoretic Displays

Microencapsulated Electrophoretic Displays

The proliferation in the 1990s of portable devices that incorporate displays renewed interest in electronic paper display technologies. By this time, the twisted nematic liquid crystal display technology was mature and successful, yet non-backlit twisted nematic displays performed poorly in brightly lit conditions and especially in direct sunlight. A growing opportunity existed for a display technology that would enable easy readability in both indoor and outdoor conditions while not consuming much battery power.

Based on early work at the MIT Media Laboratories, the E Ink® Corporation was founded in August 1997 and began the development of a commercially viable microencapsulated electrophoretic film (Comiskey et al. 1998). The film was composed of microcapsules that contained dyed oil and highly scattering, charged particles. In the current process, microcapsules are coated onto an indium tin oxide (ITO)-coated plastic sheet that serves as the transparent front plane of a display. This film, called a "front-plane laminate," is then laminated to a backplane, and an edge seal is applied to protect the imaging film from the environment to form a display cell (Fig. 5).

Several years after the particle-dye formulation was developed, E Ink developed a dual-particle electrophoretic imaging film that does not require the use of dyes. The dual-particle formulation is used today. In the dual-particle formulation, the white state is achieved through highly scattering particles, just as with the particle-dye systems. The dark state is achieved through the use of light-absorbing particles that have the opposite charge to that of the scattering particles. A voltage of one sign draws the scattering particles to the front surface to give a white state, and the opposite voltage draws the light-absorbing particles to the front surface to give a black state. Colored pigments can also be used to achieve a colored state instead of either state.

Fig. 5 Illustration of a microencapsulated electrophoretic display. The microcapsules are coated onto a front plane (top) with an ITO common electrode. The backplane electrodes are represented in cross section below. In this example, a positive voltage pushes light-scattering pigment toward the viewer and a negative voltage pulls the pigment toward the backplane


Fig. 6 A side-view schematic of a microencapsulated electrophoretic display film that contains two types of pigment particles and a microscopic image of such a display from the E Ink Corporation

A dual-particle film is more challenging to formulate because two sets of particles must be made to have opposite charges, and both must be surface treated to avoid particle agglomeration. However, a dual-particle formulation, illustrated in Fig. 6, offers significant optical advantages and eliminates the need for dyes, which are generally not stable to light over the long term. A dual-particle formulation also avoids "dye tainting" of the white state that arises in dye-based systems. White-state tainting occurs when backscattered light from multiple scattering events off of white pigment passes through the dyed oil between the scattering particles (Fig. 7) and is absorbed by the dye. Tainting can be reduced by decreasing the dye concentration, but then the oil reservoir must be made thicker in order to reach the optical saturation necessary for a good dark or colored state. Since the transition time scales roughly as the square of the thickness of the film (see Eq. 5), this increase in thickness rapidly decreases the speed of the imaging film. As a result, there is reciprocity between white-state quality and switching speed. This reciprocity can be made less severe using a dual-particle formulation, which eliminates the need for a dye, so that light can be reflected back to the viewer by scattering off of the white pigment without dimming by light-absorbing entities (Fig. 7).

The E Ink "front-plane laminate" can be laminated to a variety of backplanes to make various display types. E Ink's first products were direct-drive displays and were made using both rigid and flexible backplanes; the pixels in these displays were switched between dark and light optical states. E Ink later developed grayscale rendering and commercialized a wide variety of active-matrix displays.


Fig. 7 The white state is achieved through multiple scattering off of white scattering particles in (a) a particle-dye electrophoretic film and (b) a dual-particle electrophoretic film. In the particle-dye system, interstitial dye taints the reflected light

E Ink's first commercial active-matrix display was used in the Sony® Librié®, launched in April 2004 (Amundson 2005). The Sony Librié incorporated a 6-in. SVGA E Ink display that rendered images in 2-bit grayscale. This was the first electronic book reader using an electrophoretic display. Since the introduction of the Librié, numerous companies have made electronic readers using E Ink display modules. Currently, these electronic reader displays render images in 4-bit grayscale. Examples are shown in Fig. 8a, b. Prototype E Ink signage displays are shown in Fig. 8c, d.

In order to successfully integrate electrophoretic imaging films into active-matrix display modules, several modifications had to be made to the typical electronics for driving liquid crystal displays. The active-matrix backplane and the source and gate drivers had to be modified to support 15-V drive, which was the preferred driving voltage. The display controller also required considerable redesign. Electrophoretic films are not driven to maintain an optical state but instead are driven to change the optical state. This means that, in order to render a new image in an efficient manner, the display controller should hold in memory the current image as well as the new image. Based upon the current and new images, the voltage sequence suitable for achieving the transition from the current to the new graytone for each pixel is extracted from controller memory and applied to the appropriate pixel.
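To make the lookup-table driving concrete, here is a minimal sketch; the graytone depth, voltage levels, and waveform contents are illustrative assumptions, not E Ink controller data.

```python
from typing import Dict, List, Tuple

Waveform = List[float]  # per-frame drive voltages, in volts

# Hypothetical 2-bit waveform store: maps (current, target) graytone pairs to
# voltage sequences. The values here are placeholders, not calibrated data;
# a real controller holds factory-calibrated, temperature-indexed tables.
WAVEFORMS: Dict[Tuple[int, int], Waveform] = {
    (0, 3): [+15.0] * 10,             # black -> white
    (3, 0): [-15.0] * 10,             # white -> black
    (0, 1): [+15.0] * 4 + [0.0] * 6,  # black -> dark gray
    (1, 0): [-15.0] * 4 + [0.0] * 6,  # dark gray -> black
    # remaining (from, to) pairs would be filled in likewise
}

def pixel_waveform(current: int, target: int) -> Waveform:
    """Image stability: an unchanged pixel needs no drive at all."""
    if current == target:
        return []
    return WAVEFORMS[(current, target)]

# The controller keeps both the displayed and the new image in memory and
# derives each pixel's drive sequence from the (current, target) pair.
current_image = [[0, 3], [1, 0]]
new_image = [[3, 3], [0, 0]]
for row_cur, row_new in zip(current_image, new_image):
    for cur, new in zip(row_cur, row_new):
        print(cur, "->", new, pixel_waveform(cur, new))
```

The point of the lookup structure is the one highlighted in the text: an unchanged pixel draws no drive at all, so power is consumed only while an image is being updated.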

Microcell Electrophoretic Displays

Microcellular electrophoretic films were fabricated in the late 1970s and early 1980s using photoresist patterning to form the microcell walls (Hopper and Novotny 1979; Blazo 1982). The microcell walls block lateral migration of the pigment particles. The SiPix® Corporation (acquired by E Ink in 2012) developed a continuous process for the manufacture of microcell electrophoretic displays. An embossing wheel is used to form microcups in a deformable polymer layer coated over an ITO-coated display front sheet, followed by hardening of the polymer. Embossing, filling of the microcups with electrophoretic fluid, and cup sealing are done in one continuous process. A microscopic image of a microcell electrophoretic imaging film is shown in Fig. 9.

The in-line coating process allows for spot color. SiPix has demonstrated displays where one region contains an electrophoretic fluid with one dye and another region contains an electrophoretic fluid with another dye. This allows one region to be switched between white and one color and another region between white and a different color. In 2013, E Ink announced a three-pigment system using the SiPix microcell structure with white, black, and red pigments. Initial applications are retail signage and electronic shelf labels. The displays show white, black, and red reflective states. A Spectra sign is shown in Fig. 10.


Fig. 8 Examples of commercial and prototype devices using E Ink electrophoretic displays. (a) A commercial Kindle® Voyage® eReader, (b) a commercial Kobo® Aura HD eReader, (c) a prototype 32-in. public information display, and (d) E Ink Spectra™ electronic shelf label display

Fig. 9 A Microcup® electrophoretic film constructed with an embossed rib pattern and cells filled with an electrophoretic fluid and then sealed with an overcoat, shown in (a) top view and (b) cross section


Fig. 10 An E Ink Spectra display using the SiPix microcell structure and white, black, and red pigments

Air-Gap Electrophoretic Displays

The velocity of electrophoretic particles in a fluid medium is proportional to the electric driving force and inversely proportional to the viscous drag. The electrophoretic velocity is therefore roughly inversely proportional to the fluid viscosity, all else being equal. The appeal of an air-gap electrophoretic display, one in which charged particles move through air instead of a liquid phase, is a high electrophoretic mobility for achieving high switching speeds, as well as potentially high reflectivity due to the large difference in refractive index between the particles and air. One may note that the viscosity of simple fluids is typically on the order of 1 cP, while the viscosity of air is around 0.02 cP. Based upon this, one may expect about two orders of magnitude speed increase in air-gap electrophoretic films compared to fluid-based versions. A challenge to keeping this speed advantage, however, is achieving particle charging similar to that in fluid-based systems. The well-established methods for charging particles (ionization of dissociating species on the particle surface in aqueous fluids, and micellization of dissociating ions on the particle surface in nonaqueous media) are unavailable in a gas-phase system.

A microcellular, air-gap electrophoretic display was developed through a collaboration between the Bridgestone® Corporation and the Department of Electrical Engineering at Kyushu University in Japan (Hattori et al. 2003). A key to successful implementation is avoiding particles sticking to each other or to the walls of the microcells and electrode surfaces. This is especially challenging in an air-gap system because of strong van der Waals attractions in air. Bridgestone and Kyushu University reported the development of a "liquid powder" electrophoretic particle that exhibits extremely low association forces (Hattori et al. 2003). Figure 11 shows a demonstration of liquid powder. A liquid powder (Fig. 11a) and an ordinary powder (Fig. 11b) are poured onto a horizontal platform. The ordinary powder forms a pile with a particular angle of repose that is an indicator of the forces of particle association. The "liquid powder," on the other hand, does not form a pile, reflecting its extraordinarily low association forces.

Bridgestone and Kyushu University placed white and black "liquid powder" particles in an open-top microcell array and then applied the top substrate to form an electrophoretic cell. Despite the low particle association forces, the particle-particle and particle-wall forces are sufficient to impart a sizable threshold voltage, and so these displays can be driven using a passive addressing scheme. Switching of these displays required 70–100 V, presumably owing to the size of the threshold voltage. Individual pixels switched in about 0.2 ms with a 70-V drive voltage. Given the very fast switching speed, one can achieve relatively fast updates even with the row-by-row nature of the passive-matrix driving scheme. Hattori et al. (2003) reported a 67-ms update time for a 160 × 160 display. They also reported grayscale rendering through partial addressing. Figure 12 shows a microscopic image of a microcell structure and a 160 × 160 passive-matrix display made by Bridgestone and Kyushu University. A prototype display module is shown in Fig. 13.


Fig. 11 (a) Electronic Liquid Powder® made by Bridgestone and (b) an ordinary powder poured onto a platform. The ordinary powder forms a pyramidal pile owing to interparticle associations. The absence of a pile in (a) is indicative of very low interparticle association forces (Reprinted from Amundson 2005)

Fig. 12 A microscopic image of a microcell structure made by Bridgestone is shown in (a). (b) shows a 160 × 160 passive-matrix display on a plastic substrate made by Bridgestone and Kyushu University (Reprinted from Amundson 2005)

Fig. 13 A 100-dpi, QVGA, passive-matrix commercial display module made by Bridgestone and Kyushu University. This display module measures 83 × 62 mm

While the fast speed of an air-gap electrophoretic display is appealing, "liquid powder" displays suffered drawbacks. The driving voltage needed to be very high to overcome the attractive forces between particles and substrates, and this complicates the driving electronics. Another challenge of air-particle displays is the high-speed collisions that can damage the pigment surfaces and limit display lifetime. In 2012, Bridgestone ceased commercialization of its air-gap electrophoretic display.


Fig. 14 The world's first flexible electronic paper display, made using a flexible active-matrix backplane and E Ink's front-plane laminate. The backplane was fabricated using rubber stamping and organic semiconductors. The display has a resolution of 16 × 16 pixels (Photograph by CJ Gunther)


Flexible Electrophoretic Displays

Display flexibility offers several advantages. At the base level, a flexible display, even in an unflexed configuration, offers durability. Glass-based displays become increasingly delicate and difficult to protect as the size of the display increases. A flexible display, on the other hand, is much more durable and less likely to break if, for example, a handheld device is dropped or bent. The durability also allows the device to be designed without the additional size and weight required by the protective features demanded by glass-based displays. Actual conformability allows displays to be applied on curved surfaces, and full flexibility enables a much larger universe of applications.

Electrophoretic films are ideal candidates for incorporation into flexible displays. E Ink "front-plane laminates" are themselves flexible, and the microcapsules or microcups provide a solidity that maintains the display integrity during flexion. For non-flat displays, the full-viewing-angle optical performance is especially valuable. As a consequence, E Ink displays have been used in most development efforts directed toward flexible displays over the past decade. Flexible display technology and development are discussed in more detail in chapters "▶ Flexible Displays: Attributes, Technologies Compatible with Flexible Substrates and Applications" and "▶ Flexible Displays: Substrate Options."

E Ink's first involvement with flexible displays started in 1999 through a collaboration with Bell Laboratories, Lucent Technologies (Rogers et al. 2001). This collaboration resulted in the creation of the first fully flexible electronic paper active-matrix display (Fig. 14). The backplane was manufactured on a plastic substrate using rubber stamping to define molecular photoresist monolayers, with organic semiconductors in the pixel transistors. The E Ink front-plane laminate was laminated to the plastic backplane to form a flexible display. From this low-resolution concept demonstration, E Ink has progressed to commercial manufacture of flexible, high-resolution active-matrix displays. Examples of devices that incorporate E Ink flexible displays are shown in Fig. 15.


Fig. 15 Examples of devices that incorporate flexible E Ink displays. (a) Sony Reader with a 13.1” flexible display, (b) a YotaPhone ® made by Yota ® with an E Ink screen, (c) an E Ink Digital Hour Clock Watch made by Phosphor ®, and (d) a Sony SmartBand ® Talk. The E Ink displays in the latter three devices are curved for improved ergonomic design

The thin and lightweight Sony Reader shown in Fig. 15a would have been quite fragile had it been made with a large glass-based display. The all-plastic display allows for a very light weight; the entire device weighs only 6.8 oz (193 g) and is very rugged. The E Ink screen on the YotaPhone (Fig. 15b) enables a low-power, daylight readable display and is curved to enable a better fit into the palm of a hand. The displays in the watch and wristband (Fig. 15c, d) are curved to better fit around a wrist.

Summary

In the "second age" of electrophoretic display development, starting in the late 1990s, electrophoretic display technology has been developed to fulfill the desire for electronic paper, a role not well filled by the dominant liquid crystal display technology. These displays offer good reflectivity over a full viewing angle, and their Lambertian scattering characteristic imparts an "ink-on-paper" appearance, which is attractive for long-term reading. Their ability to hold an image in the absence of applied power and voltage makes this technology attractive for use in portable devices and enables readers that can last as long as 2 months without battery recharging. Electrophoretic displays have been integral to the recent commercial success of electronic books. They have also been used in handheld medical devices and retail-environment displays, and as indicators on electronic devices such as USB memory sticks, where their zero-power image stability is essential.

Further Reading

Amundson KR (2005) Electrophoretic imaging films for electronic paper displays. In: Crawford GP (ed) Flexible flat panel displays. Wiley, New York


Blazo SF (1982) High resolution electrophoretic display with photoconductor addressing. SID Dig 1982:93
Comiskey B, Albert JD, Yoshizawa H, Jacobson J (1998) An electrophoretic ink for all-printed reflective electronic displays. Nature 394:253–255
Dalisa AL (1977) Electrophoretic display technology. IEEE Trans Electron Devices 24(7):827–834
Evans PF, Lees H, Maltz M, Dailey J (1971) Color display devices. US Patent 3,612,758
Hattori R, Yamada S, Masuda Y (2003) A novel bistable reflective display using quick-response liquid powder. SID Dig 2003:846–849
Hopper MA, Novotny V (1979) An electrophoretic display, its properties, model, and addressing. IEEE Trans Electron Devices 26(8):1148–1151
Lewis JC, Garner GM, Blunt RT, Carter F (1977) Gravitational, inter-particle and particle-electrode forces in the electrophoretic display. Proc SID 18(3/4):235–242
Liang RC, Hou J, Zang H, Chung J, Tseng S (2003) Microcup displays: electronic paper by roll-to-roll manufacturing processes. J SID 11(4):621–628
Morrison I, Ross S (2002) Colloidal dispersions: suspensions, emulsions, and foams. Wiley, New York
Murau P (1984) Characteristics of an X-Y addressed electrophoretic image display (EPID). SID Dig 1984:141
Ota I, Ohnishi J, Yoshiyama M (1973) Electrophoretic Image Display (EPD) panel. Proc IEEE 61:832–836
Ota I, Sato T, Tanaka S, Yamagami T, Takeda H (1975) Electrophoretic display devices. In: Laser 75 optoelectronics conference proceedings, Munich, pp 145–148
Probstein RF (1994) Physicochemical hydrodynamics: an introduction. Wiley, New York
Rogers J, Bao Z, Baldwin K, Dodabalapur A, Crone B, Raju VR, Kuck V, Katz H, Amundson K, Ewing J, Drzaic P (2001) Paper-like electronic displays: large-area rubber-stamped plastic sheets of electronics and microencapsulated electrophoretic inks. Proc Natl Acad Sci U S A 98(9):4835–4840
Sheridon NK (2005) Gyricon materials for flexible displays. In: Crawford GP (ed) Flexible flat panel displays. Wiley, New York
Sheridon NK, Berkovitz MA (1977) The Gyricon, a rotating ball display. Proc Soc Inf Disp 18(3&4):289–293
Sheridon NK, Richley EA, Mikkelsen JC, Tsuda D, Crowley JM, Oraha KA, Howard ME, Rodkin MA, Swidler R, Sprague R (1997) The Gyricon, a rotating ball display. Int Disp Res Conf 7(1):L82–L84
Singer B, Dalisa AL (1977) An X-Y addressable electrophoretic display. Proc SID 18(3/4):255–256


Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_102-2 # Springer-Verlag Berlin Heidelberg 2015

In-Plane Electrophoretic Displays
Kars-Michiel H. Lenssen*
Philips Research, Eindhoven, The Netherlands

Abstract

In-plane electrophoretics is attracting increasing attention because it can provide bright color and/or transparent displays in combination with ultralow power consumption. Besides display applications, this enables novel applications such as smart windows, electronic skins, electronic wallpaper, and digital signage (as a replacement for color-printed paper signs). This section gives an overview of the developments in the field for nonmatrix, passive-matrix, as well as active-matrix devices. The latest concepts for grayscale and multicolor are also presented.

Introduction

In-plane electrophoretics is the basic principle of systems in which electrically charged, colored particles are moved laterally ("in plane") to create an optical effect, instead of vertically as described in chapter "▶ Electrophoretic Displays." For several reasons it can be advantageous to move colored particles out of the path of view by collecting them in a small fraction of the total area, instead of hiding them behind other particles or a colored liquid, as is the case for vertical electrophoretic devices. The most important benefit is that in-plane electrophoretics, similar to printing, allows the use of a subtractive color scheme, which makes it possible to realize bright, print-like full-color displays. Another advantage over conventional electrophoretic displays is the option of a highly transparent optical state, which enables different applications, such as smart windows. Also, the possibility to choose the reflector independently of the particles provides an extra degree of freedom: for example, a static image pattern can be added onto the reflector, or a brighter white state than is possible with white particles can be achieved.

Monochrome In-Plane Electrophoretic Devices

In its simplest form, a pixel of an in-plane electrophoretic display consists of a viewing area (large area) and a collector area (small area). Each of these areas contains at least one electrode, located on the same substrate, so the charged, colored particles can be moved from the collector area to the viewing area (and vice versa) by applying a voltage difference between the electrodes. When all particles are collected in the collector area, the pixel is transparent (see Fig. 1); for reflective displays this state can be made white by adding a reflector behind the pixel. This reflector can be as simple as plain printing paper (Lenssen et al. 2008a), or it can be a reflector with gain (Endo et al. 2004) to increase the brightness at the cost of a slightly reduced viewing angle. With an optimized reflector, however, this latter effect can be minimal: in the case of an isotropic reflector, light that is reflected over certain angles will be absorbed by pixel walls and other structures, whereas if an anisotropic reflector is used instead, this otherwise lost light can still contribute to the brightness of the display.

*Email: [email protected]



Fig. 1 The principle of an in-plane electrophoretic display

For the colored ("dark") optical state of the pixel there are two basic options: the particles can be collected on an electrode that covers the whole viewing area (Kishi et al. 2000), or the particles can be left suspended in the liquid. In the latter case, the particles are spread either over the whole pixel (see Fig. 4 of Swanson et al. 2000) or over the viewing area, kept between two electrodes (Lenssen et al. 2008a). Spreading offers the advantage of a higher brightness in the compacted (transparent) state, since only a small fraction of the pixel area is covered by electrodes; this even allows the use of nontransparent electrode materials. As with conventional electrophoretic devices, a division of the panel into compartments is desirable to limit migration of particles (see chapter "▶ Electrophoretic Displays"). Since a certain degree of alignment between compartments and in-plane electrodes is needed, wall structures, for example microcups (Wang et al. 2006), are used instead of microencapsulation.

Nonmatrix (Segmented) Displays

The concepts described can be applied directly to realize segmented displays, but in practice usually some variations are used. A relatively straightforward implementation of the concept of a collector and a viewing electrode, using a suspension of black toner particles (1–2-μm diameter) in Isopar and a 180-μm-thick poly(ethylene terephthalate) substrate, is described in Kishi et al. (2000). The highest contrast obtained was above 8, and the response time was 15 ms at a driving voltage of 100 V, or 30 ms at 40 V. In this device there are two pairs of collector and viewing electrodes per pixel. If the distance that the particles have to travel between the collector and the viewing electrodes is halved, the response time becomes roughly four times shorter for the same voltage difference (see Eq. 5 in chapter "▶ Electrophoretic Displays" and the sketch after this paragraph). A geometry of interdigitated electrodes can be considered an extreme case of subdividing and distributing collector and viewing electrodes over the pixel area. In this way, the distance that particles have to travel is minimized, and visible grids in the transparent state are avoided by the fine distribution of the collector area over the total area. An example is shown in Fig. 2; in this case, however, both electrodes have the same dimensions, and the dark state is achieved by spreading the particles instead of collecting them on an electrode. In Lenssen et al. (2009a), the typical switching speed was around 1 s for driving voltages of the order of 5 V.

Instead of linear electrodes, different geometries can be used, such as the so-called closed-loop-type design (Matsuda et al. 2002), in which the collector electrode surrounds the viewing electrode. Because the collector electrode can be positioned underneath the spacer, it was found that the aperture could be increased to 92 % for 200 dpi (5-μm-wide walls), compared with 74 % for a multiline structure. Another example of a geometry that is not so conventional in displays consists of hexagonal and dot electrodes (Lenssen et al. 2009a), as shown in Fig. 3. An advantage for manufacturing is that the electrodes form a kind of mesh and therefore are more robust against open defects than long line electrodes. In Fig. 3 the hexagonal and dot electrodes can be seen, consisting of 70-nm-thick indium tin oxide; the gap between the glass substrates (i.e., the height of the layer of electrophoretic suspension) is around 28 μm.
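To make the fourfold speedup explicit, the scaling argument can be restated for the in-plane geometry, with $L$ the collector-to-viewing-electrode travel distance and $\mu$ the electrophoretic mobility (this is a restatement of the referenced Eq. 5, not an additional result):

$$T \approx \frac{L}{v} = \frac{L}{\mu E} = \frac{L}{\mu (V/L)} = \frac{L^{2}}{\mu V}, \qquad T\!\left(\frac{L}{2}\right) = \frac{(L/2)^{2}}{\mu V} = \frac{T(L)}{4}$$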



Fig. 2 Microscopic photographs of a sample with interdigitated electrodes: left, particles uniformly spread (dark state); right, particles compacted onto one of the electrodes (transparent state) (Reproduced from Lenssen et al. 2009a with permission from the Society for Information Display)

Fig. 3 Microscopic photographs of panels with hexagonal and dot electrodes: (a) particles uniformly spread, (b) the same sample with particles compacted on the hexagonal electrode, and (c) the same sample with particles compacted on the dot electrode (Reproduced from Lenssen et al. 2009a with permission from the Society for Information Display)

In this geometry there are two configurations that result in a transparent optical state: the particles can be compacted either on the hexagonal electrodes or on the dot electrodes. In the dark state the particles are spread over the pixel. With one hexagonal and one dot electrode per pixel, transmission values of approximately 3 % and approximately 65 % were measured; the dark state was limited by the transparent wall material (SU-8) that was used. By reducing the area covered by walls, black panels switching between 1 % and 70 % were realized in a different geometry. In 2000 it was proposed to use the pixel walls as one of the electrodes in a so-called walls/post pixel design (Swanson et al. 2000). In (Ukigaya et al. 2003a) a study of embedding the collector electrode in the wall was reported; it was found that the driving voltage and the response time were greatly improved (250 ms at 10 V for 260 dpi, or less than approximately 1 s at 5 V; Ukigaya et al. 2003b) for a closed-loop-type design.



It was also observed that the electric-field cross talk was reduced, since the embedded collector electrode acts as a kind of electric shield. Nonmatrix displays are sometimes also called electronic skins, or e-skins, because they can be used to make the appearance of surfaces electronically adaptable by covering them with such nonmatrix electronic-paper (e-paper) displays (Lenssen et al. 2009a; Koch et al. 2009; Montbach et al. 2009). No systematic lifetime studies dedicated to in-plane electrophoretics have been reported yet. However, it can be expected that the relevant effects will be similar to those for vertical electrophoretics (chapter "▶ Electrophoretic Displays"). In (Lenssen et al. 2011) no significant degradation of the transmission value (67 %) was reported after 400,000 switching cycles for a panel with a geometry similar to that shown in Fig. 3.

Passive Matrix

As also discussed in chapter "▶ Electrophoretic Displays," some threshold is needed for passive-matrix addressing: it should be possible to write one row of pixels while the other rows (which are also exposed to part of the writing voltage) remain unchanged. In principle this threshold voltage could be intrinsic, achieved by careful tuning of the material properties of the particles and the electrode material. In this case, particles that are collected on an electrode will only move away if the repulsive voltage is above a certain threshold voltage. However, so far no passive-matrix panels based on an intrinsic threshold have been reported for in-plane electrophoresis. In 2000, various concepts for passive-matrix displays based on in-plane electrophoresis without an intrinsic threshold were described (Kishi et al. 2000); all used a control electrode to press the particles onto the bottom substrate by means of a repulsive voltage (of 60–450 V). Additionally, for good operation a barrier that separates the collector electrodes from the viewing electrodes turned out to be necessary; this barrier can be either mechanical (a "wall barrier") or electrical (generated by additional control electrodes). A composite structure with control electrodes on top of wall barriers was proposed, but no experimental results were presented. The first truly in-plane passive-matrix display prototypes (i.e., without a control electrode on the cover substrate) were reported 8 years later (Lenssen et al. 2008a, b). This concept used a gate electrode to control particle movement between the collector electrode and the viewing area. Also, particles were not kept on a large viewing electrode but between two narrow viewing electrodes. Besides improving the brightness (because of the small area covered by electrode material), this makes it much easier to spread particles over the viewing area in a controllable way: whereas a single big electrode covering the viewing area would create an equipotential plane, a voltage gradient can now be created in the viewing area between the two viewing electrodes. A schematic cross-sectional view of the pixel is shown in Fig. 4, indicating the collector, gate, and two viewing electrodes, all positioned on the same substrate. The driving scheme consists of four phases. The first phase is a reset, in which all particles are collected at the collector electrode, simultaneously for all pixels; subsequently all gate voltages are increased. In the second, programming phase, the gate voltage in all pixels of a specific row is lowered, and, depending on the desired optical state, a potential difference between the collector and the first viewing electrode is applied so that in selected pixels particles can cross the gate. This programming step is repeated for all rows in the display. Note that to program a pixel it is not necessary to move the particles from the collector electrode completely toward the other side of the pixel, because the optical state of a pixel is directly determined by the number of particles within the viewing area. Therefore, in the first instance it is sufficient to move them merely from the collector across the gate. Since this distance is much smaller than the pixel size, the time needed to program all pixels in a display is significantly reduced (by over a factor of 10 if this distance is one fifth of the pixel size).


Fig. 4 The four phases of the passive-matrix driving scheme: reset, program, evolution, and hold. The bars indicate applied voltages (Reproduced from Lenssen et al. 2009b with permission from the Society for Information Display)

Although this programming step is done row by row, the subsequent so-called evolution phase (in which the particles that arrived in the viewing area are spread between the viewing electrodes) can be done for the whole panel at once, so the time involved is independent of the number of rows. Finally, when the desired optical image is achieved, the panel can be switched to the hold phase, with a small repulsive voltage on all gate electrodes, so that no particles can move into or out of the viewing area.
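A back-of-the-envelope model shows why programming only across the gate pays off: with the quadratic distance scaling above, crossing one fifth of the pixel takes 1/25 of the full travel time per row, while the evolution phase runs only once for the whole panel. All values below are illustrative assumptions, not measured data:

```python
def frame_time(rows, t_pixel, gate_fraction=0.2):
    """Frame time for the reset/program/evolution/hold scheme.

    t_pixel: time for a particle to cross a full pixel width.
    Programming per row only crosses the gate (t ~ distance**2);
    the evolution (spreading) phase is done panel-wide, once.
    """
    t_program_row = t_pixel * gate_fraction ** 2
    return rows * t_program_row + t_pixel  # evolution ~ one pixel width

rows, t_pixel = 100, 1.0  # hypothetical 100-row panel, 1 s per pixel width
naive = rows * t_pixel    # naive scheme: cross the full pixel in every row
print(naive / frame_time(rows, t_pixel))  # -> 20x, i.e., over a factor of 10
```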



For the passive-matrix prototypes, consisting of 100 × 100 pixels (each 500 μm × 500 μm) and an external reflector of normal copying paper, a high brightness of 48 % and a contrast ratio of 7:1 were measured. The emphasis of this study was more on the optical properties and low voltages than on the switching time, which was reported to be 30 min for the whole display (with all applied voltages being 10 V or less); it was predicted that update times can be reduced to the order of seconds by improved suspensions, higher driving voltages, and/or smaller pixel sizes.

Active Matrix
It is relatively straightforward to adapt the concepts and designs described for nonmatrix displays for application in an active-matrix device, analogously to other display technologies. In 2002, prototypes of in-plane electrophoretic displays based on the closed-loop-type design with 250 × 300 pixels of 135 μm × 116.9 μm were reported (Matsuda et al. 2002). Because the writing voltage had to be limited to prevent cross talk, optimal dark and white states could not be achieved; for −15 V/+10 V a contrast ratio of 10:1 was measured. The writing time for a frame was less than 1 s. By embedding the collector electrodes in the pixel walls, cross talk could be avoided and the driving voltage could be reduced to 5 V (Ukigaya et al. 2003b). In (Henzen 2009a) it was estimated that an aperture around 70 % should be possible for active-matrix pixels of 200 μm × 200 μm, enabling reflectivity values of about 60 % (for full-color displays; see also section "Subtractive Color"). This is in agreement with an extrapolation of measured values (Verschueren et al. 2010), which estimates 54 % reflectivity for an aperture of 70 %.
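The quoted extrapolation is consistent with a simple aperture model in which the panel reflectivity is the product of the aperture ratio and the in-pixel reflectivity; the in-pixel value below is inferred from the two quoted numbers, and the 80 % case is purely an illustrative extrapolation:

```python
def panel_reflectivity(aperture, r_in_pixel):
    """Simple aperture model: light hitting electrodes and walls is lost."""
    return aperture * r_in_pixel

r_in_pixel = 0.54 / 0.70  # ~0.77, inferred from the cited 54 % at 70 % aperture
print(panel_reflectivity(0.70, r_in_pixel))  # -> 0.54, reproduces the estimate
print(panel_reflectivity(0.80, r_in_pixel))  # -> ~0.62 at a hypothetical 80 %
```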

Hybrid Electrophoretics
The distinction between vertical and in-plane electrophoretics is not always clear-cut. Besides the (relatively recent) purely in-plane electrophoretic displays, hybrid electrophoretic display geometries have also been studied. These geometries have electrodes on both substrates but make some use of lateral movement of particles, often in an attempt to merge the transparency option of in-plane electrophoretics with the higher switching speed of vertical electrophoretics (owing to the smaller distance that particles have to travel). Devices in which an electrophoretic suspension is sandwiched between a substrate with a continuous electrode and a substrate with a small electrode area (e.g., multiple lines) have been reported, for example, in (Swanson et al. 2000). In a sense, passive-matrix devices with a uniform control electrode on the other substrate are also hybrid devices (Kishi et al. 2000). Another example is the "dual mode" switching concept of (Chung et al. 2003); besides conventional vertical movement of white particles in a colored liquid, it is also possible to move them toward the walls, thus revealing a black back plate. This enables switching a pixel between three optical states: white, black, and the liquid color. In (Lenssen et al. 2009a) a hybrid device with at least two independent in-plane electrodes, realizing a discrete number of gray levels in a simple and robust way, was presented; this will be discussed below (see section "Grayscale").

Related Technologies
In the last half-decade, new parties have started research on related technologies that are also based on electrically charged colored particles that move with an in-plane component. Electrokinetic display technology (Koch et al. 2009) is a hybrid technology (see the previous section) whose name draws attention to the fact that physical phenomena other than pure electrophoresis are also important for understanding and controlling the movement of the charged particles. Switching times under 500 ms (at 24 V) have been demonstrated, with a transparency >60 %.


Fig. 5 A 4-bit grayscale image on a 200-dpi active-matrix panel based on a closed-loop type of design (Reproduced from Matsuda et al. 2002 with permission from the Society for Information Display)

For electro-osmotic technology it has been claimed that it may be possible to increase the switching speed by an order of magnitude (Henzen and Groenewold 2010) by exploiting the role of electro-osmosis; in this case the particles move with the liquid rather than through the liquid. However, so far no switching times faster than 3 s in a passive matrix have been reported (Hoehla et al. 2013). In a single-layer passive prototype the contrast ratio was approximately 5:1 (Hoehla et al. 2013).

Grayscale
In in-plane electrophoretic displays the gray level is determined by the number of particles in the viewing area. This unambiguous relationship makes grayscale a real strength of in-plane electrophoretics and is the reason why it is relatively easier to make reproducible gray levels in in-plane electrophoretic devices than in devices based on vertical electrophoresis (where the delicate gray levels depend on the precise metastable positions of the particles). By varying the number of scans per frame, 4-bit grayscale on a 200-dpi active-matrix panel based on a closed-loop-type design was demonstrated in 2002 (Matsuda et al. 2002), as shown in Fig. 5. Since the particles are attached to either the viewing electrode or the collector electrode, it was expected that the gray states can be maintained over a long period of time without power. Subsequently, various improvements were implemented in a 260-dpi active-matrix display (Ukigaya et al. 2003b). If there is electric-field cross talk between neighboring pixels, it causes a nonuniform distribution of particles. This not only affects the grayscale but also causes a so-called after-image: a single reset pulse is not sufficient to return the display to a uniform white state after an image has been displayed. It was shown that embedding a 10-μm-high collector electrode in the pixel walls diminished both issues. In a 50-dpi passive-matrix panel, 5-bit grayscale was demonstrated using a design with gate electrodes as discussed in the previous subsection (see Fig. 6; Lenssen et al. 2008a).



Fig. 6 A 5-bit grayscale image on 50-dpi passive-matrix panels using gate electrodes (Reproduced from Lenssen et al. 2008a with permission from the Society for Information Display)

Since a spreading concept was used, the grayscale is not maintained without power, but the required power in the "hold" phase is so low (roughly 4 nW/cm²) that bistability can be mimicked effectively. Most grayscale concepts require electronics to provide specific, and often rather complex, driving schemes. For matrix displays this is usually not a big problem, since controller electronics are needed anyway, but for applications of simpler panels sophisticated driver electronics could be prohibitive. In these cases "built-in grayscale" (Lenssen et al. 2009a) can be a solution if a limited number of discrete gray levels is sufficient. This concept uses a hybrid design in which a large fraction of one substrate is covered with an electrode, while the other substrate has at least two electrodes covering smaller fractions of the substrate area. By applying a DC voltage between two electrodes one obtains a certain gray level; selecting various pairs of electrodes provides five different gray levels in the simplest configuration with three electrodes (see Figs. 7 and 8, and the sketch below). If the areas of the in-plane electrodes are designed to differ, even more gray levels become available. An additional advantage of the built-in grayscale concept is its robustness, e.g., against temperature variations. Whereas in traditional electrophoretic displays the exact driving schemes usually depend on the ambient temperature, this is not necessary for the built-in grayscale concept. It is expected that the precise switching time to a different gray state may vary with temperature (because of viscosity changes), but the final optical state is largely independent of temperature. In an electrokinetic architecture, eight gray levels were demonstrated by dynamic driving, without the need for resetting the particles (Koch et al. 2011).
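One plausible reading of Figs. 7 and 8 is sketched below: each addressable state leaves a different fraction of the viewing area covered by particles, and the gray level follows directly from the uncovered fraction. The state names and coverage fractions are illustrative assumptions, not values from the cited paper:

```python
# Hypothetical states of a three-electrode pixel (one large electrode on
# one substrate, two smaller in-plane electrodes on the other) and the
# assumed fraction of the viewing area covered by particles in each:
states = {
    "spread under the large electrode": 1.00,   # darkest
    "spread between the two small electrodes": 0.60,
    "compacted on both small electrodes": 0.25,
    "compacted on small electrode 1": 0.12,
    "compacted on small electrode 2": 0.05,     # brightest
}

for name, covered in states.items():
    # The gray level is set by the uncovered (bright) pixel fraction.
    print(f"{name:42s} brightness ~ {1.0 - covered:.2f}")
```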

Concepts for Multicolor
In-plane electrophoretic technology provides several options for color e-paper. Two classes of color concepts can be distinguished: those based on subpixelation and those based on subtractive color schemes.

Subpixelation (Additive Color)

Obviously, a color filter could be added to a black panel (with a white reflector) for a straightforward full-color panel, analogously to the approach described for vertical electrophoresis in the previous section. Because of the superior brightness, this would be an improvement over the estimated maximum brightness of approximately 30 % (Henzen 2009a) that can be achieved with color panels based on vertical electrophoresis.



Fig. 7 The concept of built-in gray levels based on hybrid electrophoresis (cross-sectional views a–e); the labeled elements include a microstructured cell compartment, patterned electrodes 1 and 2 on substrates 1 and 2, and an unpatterned electrode (Reproduced from Lenssen et al. 2009a with permission from the Society for Information Display)

Fig. 8 Segmented panel based on hybrid electrophoresis demonstrating some built-in gray levels (Reproduced from Lenssen et al. 2009a with permission from the Society for Information Display)

So far, no in-plane electrophoretic display with a color filter array has been published, but an in-plane electrophoretic display that uses an RGB-subpixelated reflector has been reported (Endo et al. 2004). This segmented panel (see Fig. 9), using black particles in 73 μm × 73 μm subpixels and an anisotropic reflector, had a reflectance of 35 % (illumination at 30° and detection at 0°) in the white state and 4 % in the black state.



Fig. 9 An in-plane electrophoretic panel with a subpixelated RGB reflector showing four patterns: (a) combination of red, green, and blue segments, (b) combination of cyan, magenta, and yellow segments, (c) white in the whole display area, and (d) black in the whole display area (Reproduced from Endo et al. 2004 with permission from the Society for Information Display)

The measured color gamut changed much less for variation of the illumination angle from 30° to 70° than for a reflective LCD; as explanations, the absence of birefringence and the nontransparent walls that keep reflected light from penetrating neighboring pixels were mentioned. Instead of a subpixelated reflector, subpixelation can also be realized by filling microcups with differently colored (RGB) liquids, all containing white electrophoretic particles. A concept for this (as well as a white/black/red active-matrix test panel), based on "dual mode" switching (Chung et al. 2003) in combination with a black backplane, has been described in (Sprague 2011). In (Mukherjee et al. 2014) it was proposed to use dual suspensions consisting of particles with one of the RGB additive primary colors and particles of the complementary CMY subtractive primary color, and to fill subpixels with the resulting R/C, G/M, and B/Y suspensions. It was predicted that this biprimary color system can span a significantly larger color space than conventional RGBW subpixelation. The reason for the limited research activity on color in-plane electrophoretic displays based on subpixelation may be that subpixelation either limits the brightness of a reflective display to not much more than one third of the incident light or sacrifices color saturation, in addition to the extra cost and the loss of resolution.

Subtractive Color
A better option for achieving bright full-color displays is to use a subtractive color scheme and stack layers with a yellow, a cyan, and a magenta suspension on top of a white reflector (see Fig. 10).


Fig. 10 A subtractive color concept for in-plane electrophoretics: stacked suspension layers on a white reflector modulate the B, G, and R components of the incident light

This is possible since every layer can be switched to a transparent state, and therefore all color combinations can be realized by mixing the desired amounts of yellow, cyan, and magenta, in analogy to color printing. This truly enables "any pixel, any color." For subtractive color mixing to be possible, the colored particles must not influence the trajectory of the light, i.e., they must be nonscattering. This means that from a macroscopic viewpoint the electrophoretic suspension looks transparent instead of opaque. This requirement can be met by matching the refractive indices of the particles and the liquid, or by making the particles significantly smaller than the optical wavelength. In practice, the latter approach has been chosen so far. For well-saturated colors, a contrast ratio of around 20:1 should be sufficient (Henzen 2009a). Further, as a rule of thumb, the distance between the top and bottom layers in the stack should be small compared with the pixel size to avoid parallax, which is not trivial for high-resolution displays. Other important parameters are the aperture and the losses due to substrates plus electrodes. It is an advantage that for in-plane electrophoretics only a small fraction of the viewing area has to be covered with electrodes; this enables stacking without big losses. Extrapolations based on optical measurements on monochrome test samples indicate that in-plane electrophoretics has the potential to deliver full-color e-paper devices with high brightness and a wide viewing angle (50 % reflectance at 45°), approaching newspaper quality (Verschueren et al. 2010). The first in-plane electrophoretic display with nonscattering colored particles is the passive-matrix device shown in Fig. 6 (Lenssen et al. 2008a). In a panel with a similar layout, except for an additional gate and collector electrode, it was demonstrated that more than one type of particle can also be controlled independently in a single layer of suspension. Figure 11 shows such a device, based on a suspension with both cyan and orange particles. Because these differently colored particles carry different charges, they can be controlled independently, and mixed colors of cyan and orange can also be realized. This was claimed to be the first multicolor reflective matrix panel without subpixelation, stacking, or color-sequential driving. This option provides an elegant solution for a full-color in-plane electrophoretic display consisting of only two layers, each with two colors of particles. Excellent white can be obtained by the white reflector (while both layers are switched to transparent), excellent black by the black particles, and all colors can be made by combinations of cyan, magenta, and yellow particles.
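To make the subtractive mixing concrete, a minimal sketch of the stack optics: each layer attenuates the light once on the way in and once after reflection, so the reflected spectrum is the reflector reflectance multiplied by the square of each layer's transmittance. All transmittance values are illustrative assumptions:

```python
def reflected_rgb(layer_transmittances, reflector=(0.9, 0.9, 0.9)):
    """Reflected (R, G, B) of a subtractive stack on a white reflector.

    layer_transmittances: per-layer (R, G, B) transmittance tuples.
    Light crosses every layer twice (in and out), hence the square.
    """
    out = list(reflector)
    for t in layer_transmittances:
        for i in range(3):
            out[i] *= t[i] ** 2
    return tuple(out)

# Illustrative case: yellow layer switched on (absorbs blue), cyan and
# magenta layers switched to transparent -> the pixel looks yellow.
yellow_on = (0.95, 0.95, 0.10)
transparent = (0.95, 0.95, 0.95)
print(reflected_rgb([yellow_on, transparent, transparent]))
```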



Fig. 11 Photographs of 100 × 100 pixel passive-matrix panels demonstrating grayscale multicolor images, without subpixelation or stacking (Reproduced from Lenssen et al. 2009b with permission from the Society for Information Display)

The samples described so far contained one or two particle colors, but recently significant progress has been made in colored suspensions. Note that the requirements for these suspensions differ from those for vertical electrophoretics: the particles have to be nonscattering, and it is essential that all particles behave in a controlled way (so that the viewing area can be cleared completely). In (Goulding et al. 2011) the development of electrophoretic suspensions in all primary colors was described, as well as dual suspensions with white and black particles, respectively. Although most of the experiments in that paper used vertical switching, in-plane switching of single and dual (red/blue) suspensions was also reported (Goulding et al. 2010, 2011). A full set of colored suspensions, i.e., magenta, yellow, and cyan plus black and white, has been developed and demonstrated in electrokinetic displays (Koch et al. 2011). Bistable inks have also been reported, including a black ink that was used in panels that could retain the written state for at least 6 months (Benson et al. 2012); much attention was paid to the importance of a compatible material system, and HP-DEBPT, a surfactant, was considered by the authors a major step in achieving this (Zhou et al. 2012). Modeling of in-plane electrophoretics (Yeo et al. 2009; Jeon et al. 2010) is proceeding as well, including other electrokinetic phenomena that occur in in-plane electrophoretic devices. These developments in suspensions, processing, and modeling have already led to the realization of full-color prototype displays. Stacked segmented as well as active-matrix electrokinetic panels have been demonstrated (Fig. 12), showing full-color images with a color quality that approaches the Standard for Newsprint Advertising Production (SNAP) (Benson et al. 2012). The estimated average power consumption was 50 μW/cm² for the full stack displaying a different image every 10 s.

Processing and Application Directions
Not much more than 5 years ago, all samples in publications were still fabricated on rigid glass substrates, although some candidate roll-to-roll processes were already being developed (Wang et al. 2006). In the meantime, a lot of progress has been made. Plastic substrates for electrokinetic segmented panels have been fabricated in a roll-to-roll process (Koch et al. 2009) and assembled into flexible monochrome panels that were thinner than 0.25 mm and had a bending radius of 5–10 mm. The transparency was >60 % and the contrast ratio about 10:1. Later, an active-matrix backplane was fabricated on flexible glass (although not yet roll-to-roll, but using a bond–debond process), and subsequently plastic electrokinetic front-plane technology was laminated onto it (Mourey et al. 2011).



Fig. 12 Photographs of a stacked electrokinetic display, sitting on a white Lambertian reflector, displaying color images along with printed reference SNAP colors; 128 × 128 pixels with a 500-μm pitch (Reproduced from Benson et al. 2012 with permission from the Society for Information Display)

The result was a flexible monochrome active-matrix display (128 × 128 pixels of 500 μm × 500 μm). While color e-readers were initially considered one of the main applications, attention has moved toward smart windows and digital signage, which are viewed as multi-billion-dollar business opportunities. These large-volume applications could justify investments in roll-to-roll production facilities, in contrast to smaller-area applications such as electronic skins, watches, user interfaces, and toys. Recently, more application-oriented research has also appeared, resulting in publications about, e.g., a concept for window foils that can actively control light and heat independently (using a dedicated dual suspension, which still has to be realized) (Lenssen et al. 2013) and about energy harvesting as a power source for in-plane electrophoretic panels (Lenssen et al. 2010). It was demonstrated experimentally that photovoltaic cells or RF power can be used to realize autonomous panels that do not need power cables or batteries to function; this eliminates the need for failure-prone power connections and facilitates retrofit propositions, e.g., for smart windows.

Conclusion
In-plane electrophoretics can provide bright color and/or transparent displays in combination with ultralow power consumption and is therefore attracting increasing attention. Several companies and institutions consider it the most promising technology for full-color e-paper; see, for example, (Lenssen et al. 2009b; Ota 2009; Henzen 2009b). Besides the display applications mentioned in chapter "▶ Electrophoretic Displays," in-plane electrophoretics enables novel applications such as smart windows, e-skins, electronic wallpaper, and digital signage (as a replacement for color-printed paper signs).

Further Reading
Benson B, Liu Q, Koch TR, Mabeck J, Hoffman RL, Mourey DA, Combs G, Zhou Z-L, Henze D (2012) Ultra-low-power reflective display with world's best color. SID Symp Dig 43:708–710
Chung J, Hou J, Wang W, Chu LY, Yao W, Liang RC (2003) Microcup electrophoretic displays, grayscale and color rendition. In: Proceedings of IDW'03, Kobe, pp 243–246



Endo T, Soda T, Takagi S, Kitayama H, Yuasa S, Kishi E, Ikeda T, Matsuda H (2004) Color in-plane EPD using an anisotropic scattering layer. SID Symp Dig 35:674–677
Goulding M, Farrand L, Smith A, Greinert N, Wilson H, Topping C, Kemp R, Markham E, James M, Canisius J, Walker D, Vidal R, Khoukh S, Lee S-E, Lee H-K (2010) Dyed polymeric microparticles for colour rendering in electrophoretic displays. SID Symp Dig 41:564–567
Goulding M, Smith A, Farrand L, Greinert N, James M, Wilson H, Topping C, Kemp R, Markham E, Canisius J (2011) Polymer microparticle dispersions for transmissive and reflective full colour electrophoretic display applications. Nihon Gazo Gakkaishi J Imaging Soc Jpn 50(2):135–141
Henzen A (2009a) Development of e-paper color display technologies. SID Symp Dig 40:28–30
Henzen A (2009b) Progress in subtractive color electrophoretic displays. In: Proceedings of IDW'09, Miyazaki, pp 533–535
Henzen A, Groenewold J (2010) Electro-osmosis: the key to in-plane electrophoretic displays. In: Proceedings of IDW'10, Fukuoka, pp 1511–1513
Hoehla S, Henzen A, Fruehauf N (2013) Development of electro-osmotic color e-paper. SID Symp Dig 44:119–122
Jeon Y, Kornilovitch P, Beck P, Zhou Z-L, Henze R, Koch T (2010) Understanding electrophoretic displays: transient current characteristics of dispersed charges in a non-polar medium. In: Proceedings of IDW'10, Fukuoka, pp 1499–1502
Kishi E, Matsuda Y, Uno Y, Ogawa A, Goden T, Ukigaya N, Nakanishi M, Ikeda T, Matsuda H, Eguchi K (2000) Development of in-plane EPD. SID Symp Dig 31:24–27
Koch T, Hill D, Delos-Reyes M, Mabeck J, Yeo J-S, Stellbrink J, Henze D, Zhou Z-L (2009) Roll-to-roll manufacturing of electronic skins. SID Symp Dig 40:738–741
Koch T, Yeo J-S, Zhou Z-L, Liu Q, Mabeck J, Combs G, Korthuis V, Hoffman R, Benson B, Henze D (2011) Novel reflective color media with electronic inks. J Inf Disp 12(1):5–10
Kroeker KL (2009) Electronic paper's next chapter. Commun ACM 52(11):15–17
Lenssen K-MH, Baesjou PJ, Budzelaar FPM, van Delden MHWM, Roosendaal SJ, Stofmeel LWG, Verschueren ARM, van Glabbeek JJ, Osenga JTM, Schuurbiers RM (2008a) Novel design for full-color electronic paper. SID Symp Dig 39:685–688
Lenssen K-MH, Baesjou PJ, van Delden MHWM, Stofmeel LWG, Verschueren ARM, van Glabbeek JJ, Osenga JTM, Schuurbiers RM (2008b) Bright color electronic paper. In: Proceedings of IDW'08, Niigata, pp 219–222
Lenssen K-MH, van Delden MHWM, Müller M, Stofmeel LWG (2009a) Bright color electronic paper technology and applications. In: Proceedings of IDW'09, Miyazaki, pp 529–532
Lenssen K-MH, Baesjou PJ, Budzelaar FPM, van Delden MHWM, Roosendaal SJ, Stofmeel LWG, Verschueren ARM, van Glabbeek JJ, Osenga JTM, Schuurbiers RM (2009b) Novel concept for full-color electronic paper. J Soc Inf Disp 17(4):383–388
Lenssen K-MH, Stofmeel LWG, van Delden MHWM, Vullers RJM, Visser HJ, Pop V (2010) Zero-energy e-skin. In: Proceedings of IDW'10, Fukuoka, pp 1507–1510
Lenssen K-MH, van Delden MHWM, Müller M, Stofmeel LWG (2011) Bright e-skin technology and applications: simplified gray-scale e-paper. J Soc Inf Disp 19(1):1–7
Lenssen K-MH, Trouwborst ML, van Delden MHWM (2013) Novel concept for smart windows. In: Proceedings of IDW'13, Sapporo, pp 1315–1317
Matsuda Y, Kishi E, Goden T, Ogawa A, Ukigaya N, Uno Y, Ishige K, Ikeda T, Matsuda H (2002) Newly designed, high resolution, active-matrix addressing in-plane EPD. In: Proceedings of IDW'02, Hiroshima, pp 1341–1344
Montbach E, Pishnyak O, Lightfoot M, Miller N, Khan A, Doane WJ (2009) Flexible electronic skin display. SID Symp Dig 40:16–19


Mourey DA, Hoffman RL, Garner SM, Holm A, Benson B, Combs G, Abbott JE, Li X, Cimo P, Koch TR (2011) Amorphous oxide transistor electrokinetic reflective display on flexible glass. In: Proceedings of IDW'11, Nagoya
Mukherjee S, Heikenfeld J, Smith N, Goulding M, Topping C, Norman S, Liu Q, Kramer L (2014) The biprimary color system for e-paper: doubling color performance compared to RGBW. SID Symp Dig 45:869–872
Ota I (2009) History of electrophoretic displays and proposal of a novel cell structure for lateral particle movement display devices. In: Proceedings of IDW'09, Miyazaki, pp 525–528
Schurman K (2009) X-ray vision: any color, any pixel. Comput Power User 9(9):46–47
Sprague RA (2011) Active matrix displays for ereaders using microcup electrophoretics. SID Symp Dig 42:365–368
Swanson SA, Hart MW, Gordon JG II (2000) High performance electrophoretic displays. SID Symp Dig 31:29–31
Ukigaya N, Endo T, Matsuda Y, Goden T, Ishige K, Takagi S, Kishi E, Ikeda T, Matsuda H (2003a) In-plane EPD with an embedded collecting electrode in a spacer. SID Symp Dig 34:576–579
Ukigaya N, Endo T, Matsuda Y, Goden T, Ishige K, Takagi S, Nakanishi M, Kishi E, Ikeda T, Matsuda H (2003b) Active matrix addressing in-plane EPD with a collecting electrode embedded in a spacer. In: Proceedings of IDRC'03, Kobe, pp 107–110
Verschueren ARM, Stofmeel LWG, Baesjou PJ, van Delden MHWM, Lenssen K-MH, Müller M, Oversluizen G, van Glabbeek JJ, Osenga JTM, Schuurbiers RM (2010) Optical performance of in-plane electrophoretic color e-paper. J Soc Inf Disp 18(1):1–7
Wang X, Zang HM, Li P (2006) Roll-to-roll manufacturing process for full color electrophoretic film. SID Symp Dig 37:1587–1589
Yeo JH, Kim SW, Lee GD (2009) Dynamical behaviors of charged particles in horizontal switching electrophoretic cell. In: Proceedings of IDW'09, Miyazaki, pp 545–548
Zhou Z-L, Liu Q, Combs G, Benson B, Henze D, Koch T (2012) Development of bistable electronic inks for reflective color media. In: Proceedings of NIP28, Quebec City, pp 32–35



Video-Speed Electrowetting Display Technology
Johan Feenstra*
Liquavista BV, Eindhoven, The Netherlands

Abstract
In this chapter we discuss electrowetting display technology. We start by introducing electrowetting in general and briefly explain how this technology works. We then present some of the proposals that have been put forward in the past on how to use electrowetting as a display technology. The main focus of the chapter is the electrowetting display technology developed by Liquavista and its properties, in particular its full-color, video, and grayscale capability. In the last part of the chapter, we illustrate the versatility of this technology by showing prototypes of electrowetting displays in reflective, transmissive, and transflective modes.

Introduction
With electrowetting, a voltage is used to modify the wetting properties of a liquid on a solid material. An example of such increased wettability is illustrated in the photographs in Fig. 1. Figure 1a shows a water droplet on a hydrophobic surface. Creating the water/hydrophobic interface requires much energy, and the water droplet therefore minimizes its contact area with the underlying surface. In Fig. 1b a voltage difference is applied between the electrode in the water and a subsurface electrode underneath the hydrophobic insulator. As a result of the voltage, the droplet spreads, i.e., the wettability of the surface increases strongly. When the voltage is removed, the droplet returns to the original state shown in Fig. 1a. Electrowetting has its origin in the combination of two classic and very well-understood fields: interfacial chemistry and electrostatics. The starting situation, where a liquid droplet sits on a solid surface (Fig. 1a), is described by the Young equation:

$$\gamma_{LV}\cos\theta = \gamma_{SV} - \gamma_{SL} \tag{1}$$

where the $\gamma_{ij}$ are the surface tensions of the liquid/vapor, solid/vapor, and solid/liquid interfaces, respectively, and $\theta$ is the contact angle. In the case of electrowetting, an electrostatic term is added to the energy balance of the system. As a result, the droplet will adjust its shape to lower the energy of the total system, as shown in Fig. 1b. The final result, including the electrostatic energy, was found by Nobel Prize laureate Gabriel Lippmann (1895):

$$\gamma_{LV}\cos\theta = \gamma_{SV} - \gamma_{SL} + \frac{1}{2}\,\frac{\varepsilon_0\varepsilon_r}{d}\,V^2 \tag{2}$$

where $\varepsilon_r$ and $d$ are the relative dielectric constant and the thickness of the hydrophobic insulator, respectively, and $\varepsilon_0$ is the permittivity of free space.

*Email: [email protected]


Fig. 1 Water droplets on a hydrophobic surface (a) without and (b) with voltage applied

Since $\gamma_{LV}$, $\gamma_{SV}$, and $\gamma_{SL}$ are material constants, applying a voltage will increase $\cos\theta$, implying that the liquid will spread for both polarities. Rapid progress in the performance of electrowetting has been achieved in the last 20 years owing to improvements in materials and processing. In the last decade, electrowetting has been utilized for an increasing number of applications. These include pixelated optical filters (Prins et al. 2001), fiber optics (Mach et al. 2002), adaptive lenses (Kuiper and Hendriks 2004; Berge and Peseux 2000), lab-on-a-chip (Pollack et al. 2000), and curtain coating, in use by Kodak for more than 10 years (Blake et al. 2000).
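A minimal numeric sketch of Eq. 2 shows how the contact angle responds to voltage; all material parameters below (initial angle, surface tension, insulator thickness, and permittivity) are illustrative assumptions rather than device values:

```python
import math

def contact_angle_deg(v, theta0_deg=120.0, gamma_lv=0.050,
                      eps_r=2.0, d=100e-9):
    """Contact angle from the Young-Lippmann equation (Eq. 2).

    cos(theta) = cos(theta0) + eps0 * eps_r * V**2 / (2 * d * gamma_lv)
    gamma_lv in N/m, d in m; all values are illustrative only.
    """
    eps0 = 8.854e-12
    c = math.cos(math.radians(theta0_deg)) + eps0 * eps_r * v**2 / (2 * d * gamma_lv)
    c = min(c, 1.0)  # real devices saturate before complete wetting
    return math.degrees(math.acos(c))

for v in (0, 5, 10, 15, 20):
    # The V**2 term means the angle drops for both voltage polarities.
    print(v, round(contact_angle_deg(v), 1))
```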

Electrowetting as a Display Technology: A Historical Overview
The first proposals to use electrowetting as a display technology date back to the early 1980s. Beni and Hackwood (Bell Labs) proposed a display based on moving an index-matching liquid into and out of a porous structure using electrowetting (Beni et al., US Patent 4,411,495). Analogous to polymer-dispersed liquid crystal displays (LCDs), the optical state of the switch is scattering white in one state (no liquid in the pores) and transmissive in the other (liquid in the pores). Although the mechanism is elegant, making real displays based on it is rather complex. A few more suggestions followed in the 1980s (Kohashi, US Patent 4,488,785; Lea, US Patent 4,583,824), but the next serious attempts were made at Xerox by Sheridon and coworkers. The first variation they proposed was based on covering a variable area by spreading a droplet (Sheridon, US Patent 5,659,330). This theme was later followed by a number of others, including Canon, Fuji, and Sony. The approach was abandoned, most likely because of manufacturing difficulties and the limited optical efficiency that can be achieved. A second variation proposed by Sheridon et al. used capillaries as a reservoir in which a colored liquid is stored or from which it is removed (Sheridon, US Patent 5,956,005). If the colored liquid is pushed out of the capillary, it spreads across the display surface, thereby altering its color. In this case too, the complexity of manufacturing is one of the main reasons this variation had no success. Interestingly, a variation on the theme of capillaries or 3D structures to achieve a large active area has been proposed by Heikenfeld et al. (2009) for their electrofluidic display, which is discussed in more detail elsewhere in this book.



A method to realize bistable electrowetting displays, either by an in-plane geometric approach or by using a 3D-channeled structure, was recently proposed by ADT (Blankenbach et al. 2008). The electrowetting display technology introduced by Liquavista (Hayes and Feenstra 2003) has been developed specifically to overcome the manufacturing issues experienced in earlier attempts. By remaining as close as possible to the existing manufacturing process of the mainstream display technology (LCDs), this approach maximizes the chances of success. As such, full advantage is taken of the high optical efficiency, while no compromise is made on other advantageous properties of LCDs, including video-speed and grayscale capability as well as high manufacturing yield and low cost. The appeal of this approach has also been recognized by others, who have adopted the same architecture and principle (Zhou et al. 2009; Cheng et al. 2008; Sureshkumar et al. 2009). In the remainder of this chapter, we first describe the properties of electrowetting displays in more detail, focusing on the full-color, video, and grayscale capability. Next, the strong versatility of electrowetting displays is demonstrated by showing working displays in reflective, transmissive, and transflective modes.

Liquavista's Electrowetting Display Principle
In Fig. 2 the principle of an electrowetting display is shown. Figure 2a shows the optical stack, comprising a transparent electrode, a hydrophobic insulator, a colored oil layer, and water. In a display these layers are sandwiched between glass and polymeric substrates. In equilibrium, the colored oil naturally forms a continuous film between the water and the hydrophobic insulator (Fig. 2a), because this is the lowest-energy state of the system. At the typical length scales used in displays (pixel sizes around or below 200 μm), the surface tension force is more than 1,000 times stronger than the gravitational force. As a result, the oil film is stable in all orientations. When a voltage difference is applied across the hydrophobic insulator, the system lowers its energy by moving the water into contact with the insulator, thereby displacing the oil (Fig. 2b) and exposing the underlying reflecting surface. The balance between electrostatic and surface tension forces determines how far the oil is moved aside. In this way the optical properties of the stack, when viewed from above, can be continuously tuned between a colored off state and a transparent on state, provided the pixel is sufficiently small that the eye averages the optical response. The electrowetting optical switch is intrinsically transparent, except for the colored oil layer. This means that the switch can form the basis of transmissive, reflective, and transflective displays, as demonstrated in the last part of this chapter. The photographs in the insets in Fig. 2 show a typical oil retraction obtained for a group of pixels with a size of 160 × 160 μm². The photograph in the inset in Fig. 2b confirms the 80 % white area required for a 70 % in-pixel color reflectivity. Part of the electrode is omitted in the lower-left corner of each pixel to control the oil motion (Feenstra et al. 2003a). The photographs show that controlling the oil motion strongly improves pixel-to-pixel homogeneity and hence the display uniformity.

Full-Color Electrowetting Displays
The materials used in electrowetting displays are very simple: two pieces of glass or plastic, with water and oil in between. An essential ingredient to complete the display is the dye that is dissolved in the oil. The choice of dye determines the color of the display, in particular in the off state, where the oil covers the entire pixel. This implies that a wide range of colors can be achieved simply by varying the color of the dye.



Fig. 2 Electrowetting display principle: (a) homogeneous oil film in the off state and (b) oil pushed aside when a voltage V is applied. Each panel shows the electrolyte, colored oil, pixel wall, hydrophobic coating, electrode, and substrate (oil thickness and droplet size not to scale; the oil film is typically 4 μm thick across a 160-μm pixel)


Display Architectures for Full Color
Nearly all contemporary display technologies use RGB segmentation to realize full color, which constitutes an intrinsic loss of two thirds of the incoming light. For LCDs an additional 50 % of the light is lost owing to the presence of polarizers. With the exception of electrowetting displays, generating high-brightness color is a strong limitation of all contemporary display technologies, including those in development. Because of its intrinsic nature as a colored light switch, electrowetting allows for a variety of display architectures with improved color brightness. Two of these are discussed below, as they represent the extremes of the architectural possibilities. All architectures show the low power consumption, video-rate switching, and grayscale capability that will be addressed in more detail later in this chapter.



Fig. 3 Single-layer architecture with a black dye and an RGB color filter (not to scale)

Single-Layer Architecture
A low-cost, full-color display can be fabricated with electrowetting using an RGB color filter approach (Fig. 3). In this case, a black oil is required as the absorbing switch. Compared with microelectromechanical systems (MEMS) or electrophoretic approaches, electrowetting offers a similar front-of-screen performance in a simpler, lower-cost structure. One of the most important advantages of this approach is that the manufacturing process and flow are very similar to those used for LCDs. Compared with LCDs, on the other hand, this architecture offers an intrinsic improvement of a factor of 2 in the color conversion factor (the ratio of the theoretical light out to the light in). In practice the improvement is even larger, as the electrowetting display has a naturally unlimited viewing angle, whereas LCDs typically use viewing-angle-improvement approaches that further reduce brightness. In reflective mode, the electrowetting display offers a strong improvement in power consumption with respect to emissive technologies owing to the absence of a backlight. The absence of the backlight and optical enhancement films also results in a significant cost reduction. In transmissive mode, electrowetting displays also offer reduced power consumption owing to the increased efficiency of the optical switch.

Three-Layer Architecture
A further strong improvement in optical performance is obtained when three monochrome layers are placed on top of each other. Having three monochrome layers ensures that all processes used for the single-layer display can be used for the three-layer display as well. This approach means that at every area of the display all colors can be made, i.e., a theoretical color conversion efficiency of 100 % can be achieved. This is six times better than for an LCD and three times better than for nearly all other display technologies. In practice the color conversion will be reduced, as in all electronic displays. For the three-layer architecture, the presence of three active-matrix layers will be the most important factor determining the practical optical performance. Aligning the remaining oil droplet on the inactive part of the pixel and enhancing the aperture ratio will be critical factors in achieving the highest possible front-of-screen performance. Furthermore, this approach would allow for a unique combination of a large color gamut with high reflectivity, achieving full paperlike performance. More details on the three-layer approach can be found elsewhere (Hayes and Feenstra 2003).
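The quoted conversion factors follow from simple bookkeeping of the intrinsic losses; a minimal sketch of that arithmetic (idealized, ignoring aperture and substrate losses):

```python
# Theoretical color conversion factor = light out / light in (idealized).
lcd          = (1/3) * (1/2)  # RGB filter loss x polarizer loss = 1/6
single_layer = (1/3) * 1.0    # RGB filter, but no polarizers     = 1/3
three_layer  = 1.0            # three stacked monochrome layers: no subpixel loss

print(single_layer / lcd)  # -> 2.0, a factor of 2 over LCD
print(three_layer / lcd)   # -> 6.0, six times better than LCD
```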

Video Speed
With electrowetting, liquids can be moved very rapidly. As a result, it is possible to show video content on display pixels smaller than about 500 μm in size (Feenstra et al. 2003b). Figure 4 shows the response of a 100-ppi subpixel (250 × 80 μm²) upon voltage application. At t = 0 the voltage is switched on; in this case, the voltage required for a switch to the high-brightness state is about 20 V. The on switch occurs very fast, with a response time of about 3 ms. A commonly used definition of the response time is the time it takes for the pixel to reach 90 % of its final value.



Fig. 4 Response times for a 250 × 80 μm² pixel (reflectivity in arbitrary units and swing voltage in V versus time in ms), showing a 3-ms on switch and a 9-ms off switch

Fig. 5 Electro-optic response of 160 × 160 μm² pixels (white area in % versus DC voltage in V)

After 10 ms the voltage is switched off, and the pixel relaxes to its original state. The response time for the off switch is around 9 ms. The on and off response times are clearly fast enough for showing video content.
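As a worked example of the 90 % criterion, a small routine that extracts the response time from a sampled reflectivity trace; the exponential trace below is synthetic stand-in data, not a measurement:

```python
import numpy as np

def response_time_90(t, signal):
    """Time at which the signal first reaches 90 % of its total swing."""
    start, final = signal[0], signal[-1]
    threshold = start + 0.9 * (final - start)
    idx = (np.argmax(signal >= threshold) if final > start
           else np.argmax(signal <= threshold))
    return t[idx]

# A synthetic on-switch trace with a ~1.3-ms time constant yields a
# 90 % response time near 3 ms (ln(10) * tau):
t = np.linspace(0.0, 10.0, 1001)   # time in ms
trace = 1.0 - np.exp(-t / 1.3)
print(response_time_90(t, trace))  # ~3.0 ms
```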

Grayscales

The electro-optic response of an electrowetting pixel at 160-ppi resolution (160 × 160 μm²) is depicted in Fig. 5. The pixel white area, i.e., the area from which the oil is removed, is plotted as a function of the DC voltage. The electro-optic response shows a small threshold voltage before displacement of the oil film commences; beyond it, the white area increases steadily with voltage. The in-pixel reflectivity is proportional to the white area and can be as high as 70 % for a white area of 80 %. All intermediate optical states are stable, implying that analogue, voltage-controlled grayscales can be realized.


When part of the oil film is removed from the pixel area, the effective dielectric thickness in the pixel changes, as the oil forms part of the dielectric. This means that the capacitance of the pixel changes while the display is being addressed. This is not unusual; the pixel capacitance of an LCD also changes under operation, but here the size of the effect is somewhat larger, depending on the gray level used. The modification of the pixel capacitance needs to be accounted for in the addressing schemes, as sketched below. In addition to amplitude modulation, electrowetting displays can also be addressed with pulse-width modulation to realize grayscales. Both methods have advantages and disadvantages, but the flexibility of electrowetting displays, which can use either of the two or a combination of both, provides an excellent grayscale capability.
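A minimal parallel-plate sketch of this capacitance change, with illustrative layer thicknesses and permittivities (not Liquavista device values): the oil-free region is insulator only, while the oil-covered region is insulator and oil in series.

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m

def pixel_capacitance(white_fraction, area=160e-6 * 160e-6,
                      d_ins=0.4e-6, eps_ins=2.0,
                      d_oil=4e-6, eps_oil=2.5):
    """Pixel capacitance versus the fraction of the pixel cleared of oil.

    Two regions in parallel: oil-free (insulator only) and oil-covered
    (insulator + oil in series). All parameter values are illustrative.
    """
    c_free = EPS0 * eps_ins / d_ins  # capacitance per unit area, F/m^2
    c_covered = 1.0 / (d_ins / (EPS0 * eps_ins) + d_oil / (EPS0 * eps_oil))
    return area * (white_fraction * c_free + (1 - white_fraction) * c_covered)

for f in (0.0, 0.4, 0.8):  # gray levels from intact oil film to 80 % white
    print(f, pixel_capacitance(f))  # capacitance grows with the white area
```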

Power Consumption
One of the most important advantages of reflective electrowetting displays is their very low power consumption, even when showing full-color video at high brightness. In Fig. 6 we summarize the power consumption of a number of technologies, restricting ourselves to technologies capable of video-rate displays. As the value for backlit LCDs, we have taken a typical power consumption for a 2.5-in. display of around 250 mW. For active-matrix organic light-emitting diodes (OLEDs), the actual power consumption depends on the image content; therefore we show two numbers, one for Web-based content and one for a typical photograph (still image). As the former contains more white, its power consumption is significantly higher; generating light obviously costs much energy. For MEMS technology, the power consumption has been determined by ab initio calculations, assuming 6-bit grayscale by spatial dithering and an increased frame rate, resulting in a higher power consumption even though no illumination is required. For Liquavista displays we used the same calculations, with amplitude-modulated grayscales and our present driving voltage of about 20 V (DC). As we anticipate that the driving voltage will be reduced further in the (near) future, this power consumption can be lowered significantly. However, even at contemporary driving voltages, the power consumption is much lower than for the other technologies.

Fig. 6 Comparison of power consumption for a variety of video displays, in mW per square inch (approximate bar values: Liquavista display 28, OLED webpage 240, OLED still image 80, iMoD (MEMS) 70, backlit LCD 80). For the sake of clarity: OLED organic light-emitting diode, iMoD interferometric modulator, MEMS microelectromechanical switch, LCD liquid crystal display



For applications other than high-resolution video displays, the power consumption of electrowetting displays must also be compared with that of bistable displays, such as electrophoretic displays. Here the comparison differs strongly between usage modes. Bistable displays have an advantage in usage modes where the display is refreshed only once every few minutes (e.g., linear reading). On the other hand, the power consumption of electrophoretic displays increases rapidly when the display is updated more frequently, so electrowetting displays actually become much more power efficient when people are interacting with the content.

Manufacturing Process
The process flow for manufacturing an electrowetting display is shown in Fig. 7. For the bottom substrate, any type of substrate can be used, ranging from structured indium tin oxide (ITO)-coated glass for segmented displays and active-matrix substrates for high-resolution pixelated displays to polymeric substrates for flexible displays. On the substrate, a submicron-thick amorphous fluoropolymer layer, or a stack of a barrier layer and a fluoropolymer, is coated. The barrier layer is a precaution against imperfections in the fluoropolymer layer that may occur during volume manufacturing. The strongly hydrophobic nature of the fluoropolymer ensures the spreading of the oil film in the off state. Photolithographically defined walls form the pixel structure, which can be filled rapidly by simply dosing the oil across the surface. The height of the pixel walls plays an essential role in determining the amount of oil that self-assembles inside the pixels. This height, set during a standard photolithographic process, is very uniform across the surface, resulting in a uniform electro-optic response. The water, which forms a continuous phase throughout the display, acts as the common electrode. After the liquids have been applied, the display is closed with an ITO-coated cover substrate that provides the electrical contact with the water. As can be seen, electrowetting display processing consists of standard technologies, nearly all of which are used in existing LCD manufacturing facilities. The only exception is the filling and coupling process, which differs from the one used for LCDs.

Fig. 7 Process flow for manufacturing electrowetting displays: layer deposition and pixel wall formation (photolithography) on the bottom substrate, surface treatment of the top substrate, filling, seal deposition, coupling, and scribe and break (S&B)



This means that for current players in the industry, electrowetting displays offer a great opportunity to commercialize strongly improved displays with a relatively low investment.

Strong Versatility: Operational in All Modes
As stated earlier, the basic electrowetting switch is transmissive and forms the building block for displays of all modes. In the following subsections we discuss each of these modes in more detail and show working displays for each of them.

Reflective Mode
The transmissive display switch shown in Fig. 2 can be made reflective either by adding a reflector behind the display stack or by making one of the layers underneath the oil layer reflecting. In an active-matrix display, for instance, this is typically done by using a reflecting pixel electrode. For reflective displays too, the easiest way to obtain a full-color display is with a black-and-white switch and a color filter. This architecture works significantly better for electrowetting displays than for electrophoretic displays, as the specular reflective nature of the display allows for much better color saturation by avoiding light leakage between subpixels. An example of a color reflective electrowetting display (1.8 in., 256K colors, 115 dpi) is shown in Fig. 8. A reflective display, especially with this kind of optical efficiency, can provide strong advantages in mobile applications, including low power consumption (no backlight required) and readability in many lighting conditions. In addition, reflective displays are typically much more comfortable to read, as the brightness of the display adapts itself to the level of environmental lighting, in much the same way as all the other information the human eye registers.

Fig. 8 A 1.8-in. reflective electrowetting display with 256K colors


Typically, in a single-layer reflective display, the color gamut will still be limited, as one needs to find a compromise between high brightness and large color saturation. One way to break this compromise is to use a stacked approach, analogous to color printing, where subtractive color dots are placed on top of each other. For the stacked approach, the versatility of having both a transmissive and a reflective mode is essential, as the upper layers need to be transmissive, whereas in reflective displays the bottom layer needs to be reflective.

Transmissive Mode

As the only light-absorbing element in the electrowetting display stack is the oil layer, the optical efficiency of the switch can be very high. This makes electrowetting display technology very suitable for use in transmissive displays. Clearly, integration of the electrowetting display stack in a full display will also require optimization of the non-electrowetting-specific components that absorb light, such as the active-matrix backplane. Aligning the inactive areas in the frontplane and the backplane is desirable to retain the high optical transmissivity of the display. Recently, we presented the world’s first transmissive electrowetting display (Giraldo et al. 2009) and showed its favorable properties: unlimited viewing angle, high brightness, and a clear path toward competitive contrast ratios. In Fig. 9, we show photographs of transmissive electrowetting display prototypes in direct view and at a large (+50°) viewing angle. The variation in brightness at different angles is due to the backlight brightness, as the efficiency of the display itself is independent of the viewing angle (Giraldo et al. 2009).

Fig. 9 A 1.8-in. transmissive electrowetting display with 6-bit grayscale: (a) direct view and (b) view under an angle of 50°


Fig. 10 A 6-in. direct-drive transflective electrowetting display shown in (a) reflective mode and (b) transmissive mode

Transmissive electrowetting displays allow for a reduction in power consumption compared with LCDs by at least a factor of 2, and more likely around 3–4 owing to the increased aperture, which presents product designers with new opportunities. Firstly, the display size at which the power consumption remains acceptable will be significantly larger, pushing it to a size that goes significantly beyond even that of today’s smartphones. Moreover, for applications where the present power consumption is acceptable, one can also envisage the brightness of the display being boosted by a factor of 2–4, which would make it much more visible in most lighting conditions.

Transflective Mode

Finally, combining the reflective and transmissive options with the fast switching properties mentioned already, one can also devise novel architectures that support the customer’s desire for uncompromised color as well as low power consumption. This is achieved by combining a high-brightness, monochrome reflective mode with a highly saturated, transmissive field-sequential color mode (Giraldo et al. 2009). The reflective monochrome mode in this hybrid architecture can be used for reading books and documents and will have a front-of-screen performance comparable to that of existing solutions. The large-gamut color mode can be used when desired, for instance, when working with presentation files or when reading color magazines. In addition to this hybrid approach, electrowetting can also be used in a more conventional transflective display. An example of such a transflective, segmented display is presented in Fig. 10.

Summary

Electrowetting displays have very favorable optical properties, combining a paperlike performance with video-rate switching and high-brightness, full-color capability.


At the same time, electrowetting displays are manufactured using well-known processes that can be found in LCD manufacturing facilities. In addition, electrowetting displays show a high level of versatility, being able to operate in reflective, transmissive, and transflective modes. This implies that electrowetting displays are a disruptive technology from the user-experience point of view while not disrupting the existing LCD value chain, including backplanes, components, and system manufacturing. In the coming phase, most of the effort in the field of electrowetting displays will be aimed at commercial introduction and scaling up the volumes. In parallel, work will continue on delivering on the long-term technology road maps, on further improving the performance, and on introducing revolutionary novel multilayer architectures.

Further Reading

Beni G, Craighead HG, Hackwood S (Bell Labs) Refractive index switchable display cell. US Patent 4,411,495
Berge B, Peseux J (2000) Variable focus lens controlled by an external voltage: an application of electrowetting. Eur Phys J E 3:159
Blake TD, Clarke A, Stattersfield EH (2000) An investigation of electrostatic assist in dynamic wetting. Langmuir 16:2928
Blankenbach K, Schmoll A, Bitman A, Bartels F, Jerosch D (2008) Novel highly reflective and bistable electrowetting displays. J SID 16(2):237–244
Cheng WY et al (2008) Novel development of large-sized electrowetting display. SID Dig 39:526–529
Feenstra BJ (2008) Electrowetting displays for mobile multi-media applications. In: Mobile displays: technology and applications. Wiley, New York, Chap 18
Feenstra BJ, Hayes RA, Camps IGJ, Hage M, Franklin AR, Schlangen LJM, Roques-Carmes T (2003a) Video-speed response in a reflective electrowetting display. IDW Proc 3:1741
Feenstra BJ, Hayes RA, Camps IGJ, Hage M, Johnson MT, Schlangen LJM, Roques-Carmes T, Franklin AR, Ford RA, Valdes AS (2003b) A reflective display based on electrowetting: principle and properties. IDRC Proc 3:322
Giraldo A, Aubert J, Bergeron N, Li F, Slack A, van de Weijer M (2009) Transmissive electrowetting-based displays for portable multi-media devices. SID Dig 40(1):479–482
Hayes RA, Feenstra BJ (2003) Video-speed electronic paper based on electrowetting. Nature 425:383
Heikenfeld J, Zhou K, Kreit E, Raj B, Yang S, Sun B, Milarcik A, Clapp L, Schwartz R (2009) Electrofluidic displays using Young-Laplace transposition of brilliant pigment dispersions. Nat Photonics 3:292–296
Kohashi T (Matsushita) Display device having juxtaposed capillary openings for generating variable surface concavities. US Patent 4,488,785
Kuiper S, Hendriks BHW (2004) Variable-focus liquid lens for miniature cameras. Appl Phys Lett 85:1128
Lea MC (University of Rochester) Electrocapillary devices. US Patent 4,583,824
Lippmann G (1875) Relations entre les phénomènes électriques et capillaires. Ann Chim Phys 5:494
Mach P, Krupenkin T, Yang S, Rogers JA (2002) Dynamic tuning of optical waveguides with electrowetting pumps and recirculating fluid channels. Appl Phys Lett 81:202
Mugele F, Baret J-C (2005) Electrowetting: from basics to applications. J Phys Condens Matter 17:R705


Pollack MG, Fair RB, Shenderov AD (2000) Electrowetting-based actuation of liquid microdroplets for microfluidic applications. Appl Phys Lett 77:1725
Prins MWJ, Welters WJJ, Weekamp JW (2001) Fluid control in multichannel structures by electrocapillary pressure. Science 291:277
Quilliet C, Berge B (2001) Electrowetting: a recent outbreak. Curr Opin Colloid Interface Sci 6:34–39
Sheridon NK (Xerox Corporation) Electrocapillary color display sheet. US Patent 5,659,330
Sheridon NK (Xerox Corporation) Electrocapillary display sheet which utilizes an applied electric field to move a liquid inside the display sheet. US Patent 5,956,005
Sureshkumar P, Kim M, Song EG, Lim YJ, Lee SH (2009) Effect of surface roughness on the fabrication of electrowetting display cell and its electro-optic switching behavior. Surf Rev Lett 16(1):23–28
Technology white paper at http://www.liquavista.com/technology/default.aspx
Zhou K, Heikenfeld J, Dean KA, Howard EM, Johnson MR (2009) A full description of a simple and scalable fabrication process for electrowetting displays. J Micromech Microeng 19(6):065029


Droplet-Driven Electrowetting Displays

Frank Bartels*
Bartels Mikrotechnik GmbH, Dortmund, Germany
*Email: [email protected]

Abstract

To achieve a reflective display, a number of different approaches have been developed. One is based on the so-called electrowetting effect, which changes the surface energy of a liquid in contact with a surface. In the ADT display approach, some color is placed into droplets and these droplets are moved into different positions; hence, this is known as the “droplet-driven display,” or D3, technique. In this chapter, we focus on the D3 technology, its production conditions and applications, and some details concerning necessary optimization.

Introduction

Interest in nonemissive but electronically controllable displays has been steadily increasing over recent years. The main reason is that the power consumption of all emissive displays is quite high, especially for large display areas or battery-driven devices. If readability in outdoor conditions is required, the power consumption of traditional displays can be tremendous. For example, a standard LED-based billboard as may be installed in Times Square in New York consumes at least 50 kW of power (one LED consumes 2.5 V × 20 mA = 0.05 W; 640 × 480 × 3 × 0.05 W ≈ 46 kW). In notebooks and mobile devices, the display has the highest energy load of all components. Another important topic is readability: even if power is not an issue, readability cannot be guaranteed at high illumination levels. For example, there is a requirement to create simple signs for use in the desert, where the lighting level can be at least 100,000 lx. Looking at the different display markets, there is an extreme variety in nearly every respect: pixel sizes from a few micrometers up to 20 mm, total display dimensions from far below stamp size up to big walls for public viewing, total pixel numbers from one to millions, and color ranges from monochrome to full color. Hence, we clearly believe that for emissive as well as nonemissive displays, different technologies and optimization processes are needed to fulfill the requirements of such different applications. To achieve a reflective display, many rather different approaches have been developed. One is based on the so-called electrowetting effect (Moon et al. 2002), which changes the surface energy of a liquid in contact with a surface. A detailed description is given in chapter ▶ Video-Speed Electrowetting Display Technology, and a brief summary is given in the next section. Electrowetting was developed for so-called lab-on-chip applications, where usually a water droplet with a sample or a reagent is transported and analyzed (a good overview is given in Fair (2007)). The possibilities to manipulate the movement of the droplet are multifaceted. In the ADT display approach, some color is placed into a liquid – at the beginning simply water – and the droplets are moved into different positions; hence, this is known as the “droplet-driven display,” or D3, technique. In this chapter, we will focus on the D3 technology, its production conditions and applications, and some details concerning the necessary optimization.
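As a sanity check, the billboard estimate in the paragraph above can be reproduced with a few lines of arithmetic. The sketch below simply restates the per-LED figures quoted in the text (2.5 V forward voltage, 20 mA drive current, a VGA-resolution RGB panel); any other panel would need its own numbers.

```python
# Rough power estimate for an LED billboard, using the figures quoted above.
volts_per_led = 2.5            # V, forward voltage assumed in the text
amps_per_led = 0.020           # A, drive current assumed in the text
watts_per_led = volts_per_led * amps_per_led   # 0.05 W per LED

pixels = 640 * 480             # VGA-resolution panel
leds_per_pixel = 3             # one red, one green, and one blue LED

total_kw = pixels * leds_per_pixel * watts_per_led / 1000.0
print(f"Estimated billboard power: {total_kw:.0f} kW")   # ~46 kW
```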


Fig. 1 Principle of electrowetting and the Lippmann-Young equation (schematic: a water droplet on a hydrophobic layer over a dielectric, substrate electrode, and applied voltage U, with contact angles θ0 and θ)

Technical Electrowetting in General

Because the principles of electrowetting are described in chapter ▶ Video-Speed Electrowetting Display Technology, we will only briefly cite the basic equation, which explains the mode of action. The contact angle of a fluid on a surface depends on many chemical and physical parameters. To understand the basics of electrowetting displays and their parameters, we will limit ourselves to the Lippmann-Young equation (Lippmann 1875):

$\cos\theta = \cos\theta_0 + \dfrac{\varepsilon_0\,\varepsilon_r\,U^2}{2\,\gamma_{LG}\,d}$   (1)

This equation describes how the contact angle of a droplet in contact with a surface can be modified by applying an electric field. The cos θ0 term describes the contact angle (see Fig. 1) in the absence of a voltage or electric field, and the second term describes the influence of the field. As the voltage U increases, the sum increases and so θ decreases; hence, the liquid will start to wet the surface.
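To make the voltage dependence of Eq. 1 concrete, the following minimal sketch evaluates the Lippmann-Young equation over a voltage sweep. The zero-voltage angle, permittivity, surface tension, and dielectric thickness below are illustrative placeholder values, not measured device parameters.

```python
import math

EPS0 = 8.854e-12  # F/m, vacuum permittivity

def contact_angle_deg(u_volts, theta0_deg=120.0, eps_r=2.0,
                      gamma_lg=0.050, d=1.0e-6):
    """Contact angle from the Lippmann-Young equation (Eq. 1).

    theta0_deg : zero-voltage contact angle (deg)
    eps_r      : relative permittivity of the dielectric
    gamma_lg   : liquid-gas interfacial tension (N/m)
    d          : dielectric thickness (m)
    All defaults are illustrative, not measured values.
    """
    cos_theta = (math.cos(math.radians(theta0_deg))
                 + EPS0 * eps_r * u_volts ** 2 / (2.0 * gamma_lg * d))
    cos_theta = min(cos_theta, 1.0)  # physical limit: angle cannot go below 0
    return math.degrees(math.acos(cos_theta))

for u in (0, 10, 20, 30, 40):
    print(f"U = {u:2d} V -> theta = {contact_angle_deg(u):5.1f} deg")
```

The sweep shows the angle falling with the square of the voltage, which is the wetting behavior exploited throughout this chapter.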

Droplet-Driven Display

As already mentioned, the electrowetting effect can be used for displays, as also described in an early publication (Lippmann 1875), in two different ways:

1. Electrically induced contracting and relaxing of a locally fixed (e.g., by surrounding walls or a pixel grid) droplet, similar to the process shown in Fig. 1
2. Moving a droplet from one position to another

The displays produced by Liquavista, a Philips spin-out, are based on the first approach, which is described in detail in chapter ▶ Video-Speed Electrowetting Display Technology and in Hayes and Feenstra (2003). We have concentrated on the second approach, which is described in Pollack et al. (2000), Paik et al. (2003), Fair et al. (2003), and Kim et al. (2006) and illustrated in Fig. 2.

Fig. 2 Basic process for droplet movement by electrowetting (schematic: a water droplet over control electrodes E1 and E2, with data electrode, hydrophobic dielectric, oil, and drive voltage UEW, moving between two bistable states via an intermediate state)

Fig. 3 Bonelike cell structure for the 2D version of the droplet-driven display (D3) concept (reservoir and visible area; cell pitch about 2 mm)

A water or water-like droplet is positioned above a structured electrode. If a voltage is applied to the neighboring electrode, which has to be positioned quite close to the first one, the fluid can gain surface energy by wetting the area of the second electrode. Hence, the fluid is sucked toward the second electrode. If the voltage is then switched off, the droplet has simply moved one step. The next step is simply to bring some color into that droplet and to realize a structure that makes one droplet position visible from above and hides the other one. This simple concept creates the important bistability of the D3 approach. Because no further energy, and not even an electric field or charge, is needed to maintain the visible or hidden state, we call this display approach the “no-power” display (Blankenbach et al. 2008). To stabilize the droplet position even under high-shock conditions, we place the droplet in a fluidic structure which separates the droplets from each other and creates a structural barrier between the two droplet positions. In the simplest 2D version, the unit cell has a bonelike structure, as shown in Fig. 3. In the design, we have to balance the desired shock resistance, the switching voltage, and the switching time. These can be influenced by the length and minimum dimension of the interchange channel in relation to the absolute diameter and volume of the liquid droplet. The process of droplet movement can be influenced and controlled by different effects. One effect is the electrode structure: the behavior of the jumping process depends on the overlap between the fluid footprint and the electrode design. The necessary overlap can be realized by designing small electrodes combined with fluid droplets of slightly larger diameter. Another way is to design the edges of the electrodes like a sawtooth; these are then placed so close together that the droplet is nearly independent of its diameter

Fig. 4 (a) A D3 test pixel next to an activated LED under low, intermediate, and intense illumination. (b) Measured contrast ratio versus illuminance (up to 8,000 lx) for three channel heights (CMY 150, 300, and 450 μm) in a 45° (non-specular) geometry with reflector, compared with newspaper

and will always see a small portion of the neighboring electrode. By using other designs, one can trap a droplet in a fluidic chamber and guide it to a predefined position; an example will be given later. From the electrowetting point of view, the diameter and the thickness of the droplet can be varied over a wide range: we have successfully realized thicknesses from 30 to 800 μm and diameters from 100 to 10,000 μm. The choice of both parameters is mostly defined by the required display functionality. Most of our designs have a diameter of 1–2 mm with a channel height of 50–175 μm. In Fig. 4, a setup of some LEDs and a D3 test pixel is shown under different illumination levels. Under high illumination levels, it is difficult to see the on/off status of the LED, but the state of the colored droplet is easily seen. Without any illumination, the D3 pixel is intrinsically not visible, but because of the full transparency of the setup, a backlight can be used to operate in a transflective mode. The achievable contrast ratio for three different channel heights is given in Fig. 4b (Blankenbach et al. 2009).

Color

The reason for using relatively large channel heights is to achieve very high color saturation. Most of the other reflective technologies achieve only rather poor color saturation. Mainly, this is caused by the RGB or CMY color scheme with all three colors side by side in one plane, or by the use of a color filter. To make this clear, let us look at creating a full black. In a display whose pixels have parallel visible areas of RGB or CMY, one is only able to realize a maximum black level of 33 % – a light gray. It is equally obvious that only a 33 % red level is achievable. In reflective mode, full color is only achievable either if each pixel can have

Fig. 5 (a) Color effects with three layers stacked using CMY, which result in eight colors. (b) Calculated results for the achievable color range, assuming a 4 × 4 subpixel structure, plotted in the CIE 1976 UCS (u′, v′) diagram against a printer gamut and a cholesteric LCD; a 4 × 4 CMY electrowetting cluster yields an 18-bit color depth

Fig. 6 3D monochrome design for D3 concepts with improved aperture: (a) side view (stable off, switching, stable on), (b) 3D view (glass, intermediate layer with transfer hole, fluid chamber)

all eight colors (red, green, blue, cyan, magenta, yellow, black, and white) or by stacking three layers (CMY) of nonpigmented ink. A stacked three-layer device filled with CMY nonpigmented inks, with layer thicknesses of 175 μm, is shown in Fig. 5. On the basis of those measurements, we have calculated the possible color range if we form a 4 × 4 subpixel cluster in which every individual subpixel can be set to one of the eight colors; the result is shown in Fig. 5b (Blankenbach et al. 2009). By this combination, it is possible to achieve all eight colors in each pixel. The colorant is one of the most important parameters regarding device stability and lifetime. From the comments above, it is quite obvious that nonpigmented inks give us the best color freedom. On the other hand, the long-term stability of such colors needs further improvement, and currently only a limited number of colors are available which fulfill the lifetime requirements. With pigmented color, the stability is much better, but a stacked device will be limited by light absorption in the upper layers. Nevertheless, attractive designs can be realized.
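The eight colors of a CMY stack follow directly from switching each subtractive layer on or off; the short sketch below enumerates the 2³ combinations under idealized subtractive mixing (the lookup table is illustrative, not a measured gamut).

```python
from itertools import product

# Subtractive mixing of ideal CMY layers: each layer, when switched into
# the visible state, removes one additive primary from the reflected light.
ABSORBS = {"cyan": "red", "magenta": "green", "yellow": "blue"}

NAME = {  # remaining additive primaries -> perceived color
    frozenset(["red", "green", "blue"]): "white",
    frozenset(["red", "green"]): "yellow",
    frozenset(["red", "blue"]): "magenta",
    frozenset(["green", "blue"]): "cyan",
    frozenset(["red"]): "red",
    frozenset(["green"]): "green",
    frozenset(["blue"]): "blue",
    frozenset(): "black",
}

for c, m, y in product([False, True], repeat=3):
    remaining = {"red", "green", "blue"}
    for layer, on in zip(("cyan", "magenta", "yellow"), (c, m, y)):
        if on:
            remaining.discard(ABSORBS[layer])
    print(f"C={c!s:5} M={m!s:5} Y={y!s:5} -> {NAME[frozenset(remaining)]}")
```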

Some Improvement of Pixel Design

The 2D approach as described in section 2.2 is quite easy to realize and offers outstandingly low marginal costs when produced in high volumes. The approach will become even better if plastic material is used as the base layer. A big disadvantage is the reduction of the aperture size because of the hidden reservoir position. Hence, several different attempts were made to overcome that problem. One of the most advanced configurations is shown in Fig. 6.


Fig. 7 (a) Droplet generator. (b) Starlike electrode structure for controlled droplet positioning within a pixel cell. (c) Multilayer layout for high pixel number

The liquid droplet is stored in a layer below the visible layer (Bitman et al. 2010). Both chambers are separated by an intermediate foil, which carries the necessary openings and control electrodes. This design allows a very high aperture level of around 85 %. By choosing the opening properly, we are able to realize a fully bistable 2-mm pixel at a reasonable switching voltage of about 20–40 V and a switching time of about 300 ms. Besides such U-shaped geometries, we have realized an S shape and an L shape too. Especially the L shape offers a good combination of attractive specifications, because it combines the transflective capability of the 2D design with the good aperture of the 3D design. This must be balanced against a significant increase in complexity, which we have so far handled only for laboratory prototypes. With increasing production experience, we will improve our approach.

With the approach described in section 2.3, using the eight colors in the subpixel configuration, one can realize gray levels. As seen from Fig. 7, such a subpixel approach gives a reasonable color resolution, which can be used to achieve gray levels too. Besides this approach, we have realized several more complex structures which allow gray levels. It is well known from several independent publications that single droplets can be separated from a larger liquid reservoir by electrowetting forces. Usually, one has a line of individually controllable electrodes (an “electrode runway”), and by application of an electric field, these electrodes are wetted from the larger reservoir. Then all electrodes between the reservoir and the last electrode of the line are set field-free; hence, the liquid is forced back into the reservoir, except for the liquid which covers the last electrode. The droplet diameter is correlated to the electrode diameter. In this way, we are able to create single droplets from a large reservoir and place them in a pixel chamber. Such a droplet generator is shown in Fig. 7a. A gray level is then created by transporting a number of single droplets into the visible area. To get the droplets back, one has to make sure that the main, larger visible droplet is positioned at the entrance of the electrode runway, as shown in Fig. 7b. We place a starlike electrode structure in the main chamber and a set of separated electrodes in the entrance channel. By applying a voltage to the starlike electrode structure, the droplet or droplets inside are automatically forced to the desired position. Figure 7c shows how this concept can be transferred to a layer structure of central feeding lines, bottom electrode and through-holes, fluid channel, and top electrode.

One of the great advantages of this concept is the possibility not only to switch the position of a “predefined” droplet from one place to the other but also to generate the color from a large central reservoir, which feeds a whole set of pixels within a tile. The advantages are the increase in aperture size and the possibility to exchange the liquid in the reservoir from time to time to achieve the longest lifetime, even under difficult conditions, e.g., high UV radiation. A disadvantage is that building up a gray scale by this process is rather slow. We are currently working on design improvements, e.g., building runways of different sizes (ratio 1, 2, 4, 8, 16, etc.) working in parallel to feed a pixel faster, as sketched below.
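The parallel-runway idea amounts to binary weighting: runways delivering 1, 2, 4, 8, and 16 droplets per cycle can reach any droplet count up to 31 in a single pass. A minimal sketch of the selection logic follows, with the weights taken from the ratios above and everything else hypothetical.

```python
def runway_activation(target_droplets, weights=(1, 2, 4, 8, 16)):
    """Select which binary-weighted runways to fire so that their droplet
    counts sum to the requested gray level (0 .. sum(weights))."""
    if not 0 <= target_droplets <= sum(weights):
        raise ValueError("gray level out of range")
    active, remaining = [], target_droplets
    for w in sorted(weights, reverse=True):  # greedy is exact for powers of two
        if w <= remaining:
            active.append(w)
            remaining -= w
    return active

# Example: 21 droplets in the visible area -> fire the 16-, 4-, and 1-droplet
# runways in parallel instead of 21 sequential single-droplet transfers.
print(runway_activation(21))  # [16, 4, 1]
```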


Table 1 Electrical properties of several coating materials

Material   Maximum field strength (kV/mm)   Dielectric constant
Air        2                                1
Glass      10                               6–8
Teflon     20                               2
SiO2       4–10                             4
Si3N4      20                               10
Ta2O5      600                              42

Another way of creating gray scales is the well-known pulse-width modulation. The droplet is moved very fast between the visible and the nonvisible position, and the time ratio between the two states gives the gray-level impression. To realize such a configuration, a rather high speed of movement is needed. It was shown (Pollack et al. 2002) that with a rather small diameter, a droplet movement frequency of up to 1,000 Hz is achievable. With a much bigger diameter of 0.5 mm, we have realized a maximum frequency of 125 Hz. To achieve this high speed, the voltage has to be increased to levels in the 100–140-V range. Because the capacitor then needs to be charged all the time, the power consumption will obviously increase significantly. For our main application, low power consumption is one of the main strategic advantages; hence, we have shown that such a procedure can be done, but we will not be focusing on it for some time.
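The speed demand of pulse-width-modulated gray scales can be estimated directly: a frame must be divisible into as many visible/hidden time slots as there are gray levels, so the droplet toggle frequency scales with the product of gray depth and frame rate. The frame rate in the sketch below is a hypothetical example.

```python
def required_toggle_frequency(gray_levels, frame_rate_hz=50):
    """Minimum droplet toggle frequency (Hz) so that one frame can be
    divided into enough visible/hidden time slots for the gray depth."""
    return gray_levels * frame_rate_hz

# Even 8 gray levels at a hypothetical 50 Hz frame rate need 400 Hz droplet
# motion, well above the 125 Hz demonstrated for 0.5-mm droplets in the text.
print(required_toggle_frequency(8))   # 400
print(required_toggle_frequency(2))   # 100
```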

Device Details

The parameters in Eq. 1 show us the main optimization aspects of an electrowetting device. In the design of display systems, the driving voltage U should be as low as possible. Therefore, high values for the dielectric constant εr are recommended, as is keeping the surface tension γLG and the thickness d of the dielectric low. To achieve a low surface tension γLG, we tested a variety of different materials, such as Cytop (Berry et al. 2007), Teflon, and more than 20 others. A contact angle of water against such surfaces of better than 130°, in combination with a roll-off angle below 3°, indicated a suitably low surface energy. Good values are achievable with Teflon using spraying or dipping technologies. Unfortunately, other parameters also have to be optimized. Besides the γLG value, the breakdown voltage is an important parameter: to achieve a reasonable effect, the field strength across the dielectric has to approach the breakdown field strength. Some values are given in Table 1. For common insulating materials such as silicon dioxide, silicon nitride, and Teflon coatings, levels of 70 % of the breakdown voltage are reached in operation. This creates the demand for a rather high level of surface homogeneity: if the droplet moves across the surface and hits an area with a reduced dielectric thickness, a short circuit will easily occur, and this will destroy the device. The other way to reduce the required voltage is through the εr value. We identified tantalum pentoxide and niobium pentoxide as suitable materials. Because these materials do not give us the right γLG, we combine such a high-dielectric-constant coating with a rather thin Teflon or Cytop (Berry et al. 2007) layer, as estimated in the sketch below. An additional important factor is the surface roughness (Sureshkumar et al. 2009). As an electrode material, we usually use indium tin oxide (ITO) films, which are evaporated and structured by a laser technique. One of the great advantages of the chosen material combination is its transparency in the visible range, which is essential for building up three-color stacks and for transflective configurations. For our monochrome 3D approach, we have realized a chromium layer as the electrode material, which results in some improvement in surface roughness and a reduction of the switching voltage.
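The dielectric trade-off described above (a high-εr layer for low voltage, capped by a thin fluoropolymer for low surface energy) can be estimated by treating the coating as two capacitors in series. The layer thicknesses below are hypothetical examples; the εr values are taken from Table 1.

```python
EPS0 = 8.854e-12  # F/m, vacuum permittivity

def series_capacitance_per_area(layers):
    """Effective capacitance per unit area (F/m^2) of stacked dielectrics,
    each given as (relative permittivity, thickness in meters)."""
    inverse = sum(d / (EPS0 * eps_r) for eps_r, d in layers)
    return 1.0 / inverse

# Hypothetical stack: 200 nm Ta2O5 (eps_r = 42, Table 1) capped with
# 50 nm Teflon (eps_r = 2), versus a 250 nm Teflon-only coating.
c_stack = series_capacitance_per_area([(42.0, 200e-9), (2.0, 50e-9)])
c_teflon = series_capacitance_per_area([(2.0, 250e-9)])

# From Eq. 1, the voltage needed for a given contact-angle change
# scales as 1/sqrt(capacitance per area).
ratio = c_stack / c_teflon
print(f"Capacitance gain: {ratio:.1f}x -> voltage drops by {ratio ** 0.5:.1f}x")
```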


Fig. 8 (a) Electrical connection of base, top, and intermediate electrodes (connector, intermediate layer, glass, fluidic structure). (b) Fully assembled 5 × 7 matrix with cable connected

As the base material, we normally use a glass substrate, which allows the broadest variety of coating processes, but in laboratory prototypes we have successfully used flexible films such as poly(ethylene terephthalate) and polyphenylsulfone. Because of the wide range of operating temperatures which displays must tolerate, the water/silicone-oil system tested in our first attempts is not an appropriate choice. We now use polypropylene droplets, which run in a matrix of an organic solvent. In principle, with that configuration we achieve a temperature range from around −45 °C to 125 °C. Because of the packaging and assembly processes needed, we currently limit our operating range to −40 °C to 85 °C. Depending on the specific color, we additionally need some additives in the liquid to achieve a stable setup. Hence, the right choice of the base material, the coating process, and the surface finish is crucial for success. Having addressed all these topics, one is able to achieve quite good electrowetting behavior at reasonably low voltages together with superior lifetime and stability of the device. In short, we have broad experience in building up electrowetting structures with different materials and design configurations and can address nearly every special need.

Production Considerations

Some advice regarding the production of the basic electrowetting structure shown in Fig. 1 has been given so far, but some more comments are needed to understand the challenges; these will be described using the 3D design shown in Fig. 6. We normally start with a glass substrate, which we cover with an ITO film. The ITO is then structured by laser ablation. The structure obtains its dielectric layer by evaporation processes, and by dip coating or spin coating we place the Teflon or Cytop film on top. The technique for producing the fluidic structure depends on the desired droplet diameter. For a diameter of 2–5 mm and a channel height of around 150 μm, we usually use excimer laser structuring in the prototype stage; once the design is fixed, we change to a molding or stamping process. For smaller structures, we produce the layer by lithography. The fluidic layer is positioned on top of the electrode structure and is fixed by a glue dot. Having prepared two such samples, we put on top of one of them the intermediate layer, with holes for droplet and neutral-liquid transfer.

Fig. 9 Market matrix for electrowetting displays (content and function versus cost and size: smartphones, e-books, watches, and laptops at small size and high cost; billboards at large size; smart cards, price tags, signs, and status controls at low cost)

This intermediate layer again has electrodes on the top and bottom to address the individual rows and columns. We then fill the sample with the neutral liquid under vacuum to prevent gas-bubble formation. The colored liquid is then dosed into the partly covered fluid vessel through one of the holes. Finally, the other glass sample is positioned on top; again only a slight fixture is needed to hold the device together. By applying glue around the whole setup, we obtain a seal which prevents desorption of the solvent. At the end, we attach standard electrical pins from the side, as known from LCD production, to produce the electrical contact shown in Fig. 8a. An example of a fully assembled 5 × 7 matrix product is shown in Fig. 8b. We follow two different routes for improvement. First, we are establishing a cost-optimized production line for low-content products with only a limited number of pixels but a relatively large pixel size; here we use molded parts, which we assemble and fill on a fast automatic serial rotary-table-type production machine. The second path is production using lithographic processes on larger substrates. Because most of the processes needed are quite similar to those used in standard LCD production, only some specific steps, such as the filling of the device with the liquids, need to be added. Hence, a wide area of cooperation is possible for further volume production on larger areas.

Markets

The display market is heavily fragmented, with some big blockbuster applications such as TFT screens for computers and mobile phones. This is illustrated in Fig. 9 (Rawert et al. 2010). Electrowetting technology can be applied to all of the markets addressed in the figure, ranging from our activities in the low-content and small-size area to the yet-to-be-realized dream of a no-power but controllable billboard. The range of applications even for low-content displays is quite large, and we will give some examples. All studies have shown that the growth rate of radio-frequency identification techniques with smart sensor networks or smart switching is quite large. The power consumption of such a system is critical, because battery lifetime (or, better, the use of an energy harvester) is crucial for success. Figure 10a shows a household switch from the company EnOcean, which is powered just by the energy generated by the hand during actuation.

Fig. 10 (a) Sign on a self-powered radio-frequency identification switch. (b) Screen of household equipment with electrowetting signs

We have implemented simple on/off signs which can be driven within these power budgets and which expand the application range of the product significantly. Another example is given in Fig. 10b. Exploiting the transparency of the D3 concept, simple signs can be placed on household equipment; in the off state, the indicator does not restrict the designer’s freedom, even when he or she wants to realize a wooden appearance. Battery end-of-life indicators can be made highly visible without placing supplemental stress on the battery. Because of the unlimited bistability, safety indicators will show the status of a machine or device even after years of use.

Summary

We have collected some information on the electrowetting display approach using a movable droplet. We have given some information about the basic concept, about design variations which can be realized to fulfill specific requirements, and some advice regarding production and its challenges. The electrowetting concept is exceptionally flexible; on the one hand, this presents an opportunity, and on the other hand, it demands a focus on specific design directions. The technologies of organic LEDs, nanocoatings, roll-to-roll processes, and flexible substrates offer a continuous stream of new knowledge. It is quite clear that the combination of all of these with the electrowetting approach will generate new opportunities, and we expect the D3 concept to play an active role in these developments.

Next Development Steps

Several different development directions are needed for further progress of this technology. First, more basic tests regarding the lifetime of the device will be conducted. In parallel, we will start to adapt the design to much smaller droplet sizes to enable small display dimensions and video speed. In addition, optimization of the L-shaped fluidics will be combined with electrode designs which enable high pixel numbers.

Further Reading

Berry S, Kedzierski J, Abedian B (2007) Irreversible electrowetting on thin fluoropolymer films. Langmuir 23:12429–12435; see also the “Cytop” product sheet from Asahi Glass
Bitman A et al (2010) Bistable electrowetting displays: D3 – droplet-driven displays. Conference proceedings, ISBN 978-3-7723-1430-8


Blankenbach K et al (2008) Novel highly reflective and bistable electrowetting displays. J SID 16(2):237–244
Blankenbach K et al (2009) Recent improvements for applications of droplet-driven electrowetting displays. SID Digest of technical papers, pp 475–478, ISSN 0009-966X
Fair RB (2007) Digital microfluidics: is a true lab-on-a-chip possible? Microfluid Nanofluid 3:245–281
Fair RB, Srinivasan V, Ren H, Paik P, Pamula VK, Pollack MG (2003) Electrowetting-based on-chip sample processing for integrated microfluidics. In: IEEE International Electron Devices Meeting (IEDM), Duke University, Durham
Hayes RA, Feenstra BJ (2003) Video-speed electronic paper based on electrowetting. Nature 425:383–385
Kim NY, Hong SM, Park SS, Hong YP (2006) The movement of micro droplet with the effects of dielectric layer and hydrophobic surface treatment with RF atmospheric plasma in EWOD structure. J Phys Conf Ser 34:650–655
Lippmann G (1875) Relations entre les phénomènes électriques et capillaires. Ann Chim Phys 5(11):494–549
Moon H et al (2002) Low voltage electrowetting on dielectric. J Appl Phys 92(7):4080–4087
Paik P, Pamula VK, Pollack MG, Fair RB (2003) Electrowetting-based droplet mixers for microfluidic systems. Lab Chip 3:28–33
Pollack MG, Fair RB, Shenderov AD (2000) Electrowetting-based actuation of liquid droplets for microfluidic applications. Appl Phys Lett 77:1725–1726
Pollack MG et al (2002) Electrowetting-based actuation of droplets for integrated microfluidics. Lab Chip 2:96–101
Rawert J et al (2010) Bistable D3 electrowetting display products and applications. SID 2010 Digest of technical papers, Seattle, pp 199–202, ISSN 2154-6746
Sureshkumar P et al (2009) Effect of surface roughness on fabrication of electrowetting display cells and its electro-optic switching behavior. Surf Rev Lett 16(1):23–28


Electrofluidic Displays

Jason Heikenfeld (a)* and Kaichang Zhou (b)
(a) Novel Devices Laboratory, School of Electronics and Computing Systems, University of Cincinnati, Cincinnati, OH, USA
(b) Gamma Dynamics, Cincinnati, OH, USA
*Email: [email protected]

Abstract

Electrofluidic displays were first reported in 2009; they transpose brilliantly colored pigment dispersions via a competition between electromechanical and Young-Laplace pressure. To our knowledge, this is the first display technology to use a three-dimensional (3-D) microfluidic device structure, and it leverages brilliantly colored aqueous pigment dispersions. Reported herein is a brief review of electrofluidic display technology, including the device operating principle, fabrication, speed, brightness, and color performance. Also presented is recent progress in key areas needed for realizing products, including bistability, fabrication on flexible substrates, and performance at various temperatures (varied from −28 °C to 80 °C).

List of Abbreviations

C      Capacitance
CMY    Cyan, magenta, and yellow
d      Dielectric thickness
DPI    Dots per inch
h      Channel height
l      Pixel length
p      Pressure
R      Radius of curvature
R      Reflectance
RGB(W) Red, green, blue (white)
U      Velocity
V      Voltage
γ      Interfacial surface tension
ε      Dielectric constant
θV     Wetting angle under voltage
θY     Young’s angle

Introduction

Reflective displays have attracted increasing attention in recent years due to the success of portable applications such as electronic paper (e-paper). Unlike transmissive and emissive displays, which require a high-power light source at the back of the display, reflective displays utilize ambient light for illumination and therefore provide superior power efficiency, sunlight legibility, and reduced


long-term reading eyestrain. Nowadays, numerous reflective display technologies are on the market vying for the e-paper application, where a high white-state reflectance (R) is critical: electrophoretic (E-Ink, R ≈ 40 %) (Huitema et al. 2006), electrowetting (Liquavista, R ≈ 55 %) (Hayes and Feenstra 2003), cholesteric liquid crystal (Kent Displays Inc., R ≈ 40 %) (Khan et al. 2005), electrochromic (NTerra Inc., R ≈ 45 %) (Back et al. 2002), microelectromechanical interference (Qualcomm Inc., R ≈ 50 %) (Graham-Rowe 2008), and liquid powder (Bridgestone, R ≈ 40 %) (Hattori et al. 2004). However, all of these technologies fall short of the visual brilliance and contrast of pigments printed onto paper (R > 80 %). For example, electrowetting displays currently provide the best white-state reflectance, which is achieved by reconfiguring the coverage of a dye-colored oil film on a planar, high-reflectance white surface. However, the colored oil film typically can only be reduced to 20–30 % of the viewable area, which reduces the white state and limits contrast. Furthermore, dyes lack the stability and color performance of pigments, and the colored oil is not stable in a given position without continual application of voltage (not bistable). Therefore, if reflective displays are to achieve the performance of paper, an ideal approach may be to leverage the high-performance pigments used in modern printing media, with the pigments hidden in less than 5–10 % of the viewable area. Furthermore, the new approach should employ only planar photolithographic microfabrication, for the purpose of simple and large-area manufacturing. Electrofluidic displays (Heikenfeld et al. 2009) were first reported by the Heikenfeld group at the University of Cincinnati in 2009 and are the first 3-D microfluidic display devices. Electrofluidic displays have great potential for reflective displays because (1) they can potentially deliver a record 70 % white reflectance at all viewing angles; (2) they can be bistable (Yang et al. 2010), therefore requiring very low operational power and extending battery life; (3) the active layer can be as thin as 20 μm for flexible and rollable displays; (4) they are capable of high resolution (>150 ppi), video speed, and vivid colors; and (5) they use only planar photolithographic microfabrication, such that the approach is manufacturable. Therefore, electrofluidic displays might be an ideal approach for reflective displays. Reported herein is a brief review of electrofluidic display technology, which is now under development at the University of Cincinnati and Gamma Dynamics Corporation. First, the electrofluidic display pixel structure and its operating principle are introduced. Then, the fabrication process for electrofluidic displays on glass or flexible substrates is described. Next, the operation speed and optical performance of electrofluidic displays are discussed. After that, a novel pixel structure which enables zero-power grayscale operation is presented. Lastly, the operation of electrofluidic displays at low and high temperatures is briefly discussed.

Electrofluidic Display Operation

A brief discussion of the electrofluidic display pixel structure and operating principle is first provided.

Electrofluidic Display Pixel Structure

A typical electrofluidic display pixel structure is shown in Fig. 1; it contains several important geometrical features. Firstly, there is a reservoir in each pixel, which holds a colored pigment dispersion in less than 10 % of the viewable area. Secondly, there is a surface channel, occupying roughly the remaining 90 % of the visible area, which receives the pigment dispersion from the reservoir when a suitable voltage is applied. Thirdly, there is a duct surrounding each pixel which enables counterflow of oil to fill the base of the reservoir as the pigment dispersion leaves the reservoir. In addition, the duct serves as the boundary between adjacent pixels. In our experiments, the duct design shown in Fig. 1a has been effective at terminating the advancement of the pigment dispersion at the end of the surface channel. The physics governing this


Fig. 1 Schematic diagram of electrofluidic display pixel structure and operation: (a) 3-D view and scanning electron microscopy (SEM) picture (pixel length L ≈ 150 μm, SU-8 mesa ≈ 30 μm, with channel, reservoir, and duct); (b) cross-section and top view (ITO, hydrophobic dielectric, oil-filled channel h ≈ 3 μm, pigment dispersion, Al electrode) (Adapted from Heikenfeld et al. 2009)

termination has two origins: (1) the duct marks the end of the aluminum electrode; (2) the dispersion encounters a diverging capillary geometry at the channel/duct interface. Thus, the duct is further important as it prevents merging of pigment dispersions in adjacent pixels. It is important to note that all of these features can be formed by a single photolithographic or microreplication step, which allows for standard planar microfabrication and potentially simple manufacturing.

A complete electrofluidic pixel requires several additional coatings and a top substrate. As shown in Fig. 1, the surface channel is bound by two electrowetting plates (Fair 2007), each consisting of an electrode and a hydrophobic dielectric. The top electrowetting plate is composed of a transparent In2O3:SnO2 electrode (ITO), and the bottom electrowetting plate comprises a highly reflective aluminum electrode, such that the electrofluidic pixel can exhibit high reflection and brightness.


Fig. 2 Schematic diagram and plot of wetting angle and net pressure versus applied voltage (axes: contact angle θV from 30° to 170°, net pressure from −3 to +4 kN/m², voltage from 0 to 14 V; γao ≈ 5 mN/m, h ≈ 3 μm) (Adapted from Heikenfeld et al. 2009)

Basic Electrofluidic Pixel Operation

The basic operation of an electrofluidic display pixel is shown in Fig. 1b. Initially, with no voltage applied, the hydrophobic channel and reservoir impart a large and a small Laplace pressure, respectively (Berthier 2008), on the dispersion; the net Laplace pressure therefore causes the pigment dispersion to occupy the reservoir, the feature with the larger dimension. When a voltage is applied, the resulting electromechanical pressure (Jones 2005), exceeding the net Laplace pressure, pulls the aqueous pigment dispersion from the reservoir of small viewable area (<10 %) into the surface channel of large viewable area (>90 %). When the voltage is removed, the Laplace pressure pushes the pigment dispersion rapidly (tens of ms) back into the reservoir. Thus, a switchable device is created, and the perceived coloration area of the pixel can be altered between 10 % and 90 % of the pixel area. Next, a deeper discussion of the fundamental principle which governs the pixel operation is presented. As shown in the diagrams and plots of Fig. 2, when a voltage is applied, the movement of the pigment dispersion into and out of the surface channel is regulated by a competition between Young-Laplace pressure and electromechanical pressure, the latter generated by the electrowetting effect (Mugele and Baret 2005) on the two electrowetting plates. The Laplace pressure can be expressed as

$\Delta p = \gamma_{ao}\,(1/R_1 + 1/R_2)\ \ (\mathrm{N/m^2})$   (1)

where γao (N/m) is the interfacial surface tension between the aqueous pigment dispersion and the ambient oil, and R1 and R2 (m) are the principal radii of curvature of the pigment dispersion meniscus. For a typical pixel dimension such as 300 μm, the channel length is typically 100× the channel height (h). Therefore, the Young-Laplace pressure in the surface channel is governed by the smaller radius of curvature, which can


be expressed as Rh = h/(2 cos θY), where θY is Young’s angle for the pigment dispersion in the oil. As θY is >170°, the principal radius of curvature in the channel has a magnitude approximately equal to half the channel height. Therefore, the Young-Laplace pressure in the channel can be further approximated as Δp ≈ 2γao/h. In the reservoir, on the other hand, the curvature is governed by two equal radii, each equal to half the reservoir diameter. Because the reservoir diameter is typically >10× the channel height, the Laplace pressure in the reservoir is far less than that in the channel. As a result, at zero voltage the net Young-Laplace pressure can be simply approximated as Δp ≈ 2γao/h. For the electromechanical pressure, an understanding of the electrowetting effect is first required. Electrowetting can be expressed by the following equation:

$\cos\theta_V = \cos\theta_Y + \dfrac{1}{2}\,\dfrac{\epsilon\,V^2}{\gamma_{ao}\,d}$   (2)

where θV is the apparent wetting angle, ε/d (F/m²) is the hydrophobic dielectric capacitance per unit area, and V (V) is the applied DC voltage or AC RMS voltage. As predicted by Eq. 2, with increasing voltage, the apparent pigment dispersion wetting angle (θV) decreases. Combining Eqs. 1 and 2 and the stated approximations, the net pressure on the pigment dispersion in the channel can be described (Heikenfeld et al. 2009) by

$\Delta p \approx \dfrac{2\gamma_{ao}}{h} - \dfrac{\epsilon\,V^2}{2\,h\,d}\ \ (\mathrm{N/m^2})$   (3)

As predicted by Eq. 3, the ideal threshold for pulling the pigment dispersion into the channel is where the electromechanical pressure (εV²/2hd) exceeds the Laplace pressure (2γao/h). At this threshold, the electrowetted contact angle is reduced below 90°. When the onset of contact-angle saturation is reached, the net pressure on the pigment dispersion is maximized, and thus the speed of pulling the pigment dispersion into the channel reaches a maximum. Our experiments have shown that both DC and AC voltage can be used to operate the pixels, although AC bias (square wave, 60 Hz) showed reduced dielectric charging, a reduced saturation angle, and a faster ON speed. Currently, the maximum net pressure that can be achieved is 10 kN/m², for the case of γao ≈ 50 mN/m and h ≈ 2.0 μm with an apparent wetting angle θV ≈ 70°. It has been observed during experimentation that the pigment dispersion can be reliably pulled in and out of the channel with the application of voltage (Zhou et al. 2010a). Furthermore, voltages implemented near the ideal threshold (Δp = 0) were shown to hold the pigment dispersion in intermediate wetting positions (gray scale). The voltage range for stability can be determined from Eq. 3 by incorporating the effects of contact-angle hysteresis. Contact-angle hysteresis is typically only a few degrees in oil, but it can be increased by providing a rough or patterned surface on the hydrophobic dielectric (pressures on the order of 0.1 kN/m²).
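A minimal sketch of Eqs. 1–3 in code, using the oil-water tension and channel height shown in Fig. 2 (γao ≈ 5 mN/m, h ≈ 3 μm); the dielectric constant and thickness are hypothetical placeholders, so the printed threshold is indicative only.

```python
EPS0 = 8.854e-12  # F/m, vacuum permittivity

def net_pressure(v, gamma_ao=0.005, h=3e-6, eps_r=3.0, d=200e-9):
    """Net pressure on the pigment dispersion (Eq. 3), in N/m^2.
    Positive: Laplace pressure dominates and the ink stays in (or returns
    to) the reservoir; negative: electromechanical pressure pulls it out."""
    laplace = 2.0 * gamma_ao / h
    electromech = EPS0 * eps_r * v ** 2 / (2.0 * h * d)
    return laplace - electromech

def threshold_voltage(gamma_ao=0.005, eps_r=3.0, d=200e-9):
    """Ideal pull-in threshold, where Eq. 3 crosses zero (Delta p = 0)."""
    return (4.0 * gamma_ao * d / (EPS0 * eps_r)) ** 0.5

print(f"Ideal threshold: {threshold_voltage():.1f} V")   # ~12 V for these values
for v in (0, 5, 10, 15):
    print(f"V = {v:2d} V -> net pressure {net_pressure(v)/1000.0:+.2f} kN/m^2")
```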

Electrofluidic Display Microfabrication

Introduced herein is the basic fabrication process for electrofluidic displays, which has been shown to produce pixels with sizes ranging from 150 to 500 μm (a resolution sufficient for both smartphone screens and electronic paper). The process has also produced electrofluidic display modules on both glass and flexible substrates. Described here is the simplest embodiment, a direct-driven electrofluidic display prototype. First, the mesa structure, the bottom layer shown in Fig. 1, was created with a polymer patterned by


Fig. 3 Electrofluidic display module (a) fabricated on a PET backplane, with pixels at the (b) “OFF” and (c) “ON” states (Adapted from Zhou et al. 2010b)

photolithography. Typically, the polymers are permanent photoresists, which can be either in liquid form, such as SU-8 or KMPR (Microchem Corp.), or in dry-film form, such as PerMX (DuPont). Alternatively, the mesa structure can be created more quickly and at lower cost by using microreplication techniques such as embossing. Next, a highly reflective aluminum electrode was vacuum deposited. The electrode was patterned to form pixels either by angle evaporation or by photoresist patterning followed by etching. After that, a thin, pinhole-free dielectric such as Parylene C was deposited. These films were then further coated with a fluoropolymer such as Fluoropel PFC1601V (Cytonix Corp.) or Cytop CTL809M (Asahi). A top transparent substrate, which includes the same dielectric layers and a thin In2O3:SnO2 (ITO) film, was used to complete the device. The top or bottom substrate further includes patterned SU-8 spacers, which regulate the surface channel height. It is important to note that the entire fabrication process presented above can be implemented at temperatures as low as 100–120 °C and is therefore compatible with both glass and flexible plastic substrates. We have successfully transferred the process to a 125 μm thick polymer backplane (both PET and PEN) and demonstrated repeatable operation, as shown in Fig. 3 (Zhou et al. 2010b).

For the liquid dosing, a self-assembled approach utilizing capillary action was used. The colored pigment dispersion first filled the reservoirs and the surface channel completely; then, as soon as a few dodecane oil droplets were applied at one edge of the device, the oil moved into the channel, owing to its low surface tension, and pushed the colored pigment dispersion out of the channel. When the oil reaches a reservoir, because the height of the surface channel is much smaller than the depth of the reservoir, the oil moves around the reservoir and leaves the pigment dispersion behind in it. Thus, the pigment dispersion is dosed into the reservoirs only. After oil dosing, the display device can be permanently sealed with a UV epoxy.

Switching Speed of Electrofluidic Display

Switching speed is important for all displays, including reflective displays, as in the long term most display products are expected to be capable of video speed (20 ms switching speed) for their success. For electrofluidic displays, the demonstrated ON speed, at which the pigment dispersion fills 90 % of the surface channel area, is currently tON ≈ 50 ms for 150 μm electrofluidic display pixels, and the current OFF


speed is tOFF ≈ 30 ms for the 150 μm pixel size. Therefore, a route for improvement is required for video applications. One approach for improvement is device scaling. Considering 170 DPI color pixels, the RGB subpixel size (L) is 50 μm. This will result in a 9× decrease in tON, because the traveling distance is reduced by 3× and the traveling velocity U is increased by 3×, according to U ∝ h/L (Song et al. 2009). For the same reason, device scaling to L ≈ 50 μm provides a 9× decrease in tOFF. Besides the device-scaling effect, another approach for substantial improvement is materials improvement and optimization (γao, ε, d), which can allow as much as a 10× decrease in tON and a 15× decrease in tOFF. Finally, the current oil and pigment dispersion viscosities are ≈2 cSt and could potentially be reduced to 1 cSt, which would result in a 2× decrease in both tON and tOFF. The overall effect of the above design changes predicts tON and tOFF likely below 1 ms. Therefore, it can be concluded that with continued development, electrofluidic display pixels can easily satisfy video-speed requirements, because only a small part of the abovementioned improvements is needed.
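The scaling argument can be written down directly: travel distance scales with L and, via U ∝ h/L, velocity scales with 1/L, so switching time falls as L². A sketch using the reference values quoted in the text:

```python
def scaled_switching_time(t_ref_ms, l_ref_um, l_new_um):
    """Switching time scaling t ~ L^2: travel distance ~ L and
    velocity U ~ h/L each contribute one factor of L."""
    return t_ref_ms * (l_new_um / l_ref_um) ** 2

# 150-um pixels switch ON in ~50 ms and OFF in ~30 ms; scaling to 50-um
# RGB subpixels gives the 9x improvement quoted in the text.
print(f"tON  at 50 um: {scaled_switching_time(50, 150, 50):.1f} ms")  # ~5.6
print(f"tOFF at 50 um: {scaled_switching_time(30, 150, 50):.1f} ms")  # ~3.3
```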

Optical Performance of Electrofluidic Display

This section covers electrofluidic display brightness, which is determined by the pigments, the reflector, the fill factor, and the pixel architecture. Unlike electrowetting displays, which exhibit colors via the transport of a thin dyed oil film, electrofluidic displays can achieve brilliant coloration by transposing a 10–15 wt% aqueous pigment dispersion in front of a high-performance reflector such as aluminum. This approach is conceptually unique and has advantages over other pigment-based displays, such as electrophoretic or liquid powder displays, because those displays are fundamentally challenged by the need to place a thin white pigment layer in front of a black absorbing pigment layer. It should be noted that electrofluidic devices can operate not only with colored pigment dispersions but also with the dye-colored oils and clear polar fluids used in electrowetting displays. However, aqueous pigment dispersions provide a significant optical performance boost over dyed oil for the following reasons: (1) pigments are self-diffusive (optically scattering), giving an inherently wide viewing angle; (2) pigments provide superior light fastness compared with dyes, owing to reduced surface-area exposure to light and air (Cristea and Vilarem 2006); (3) pigments are dispersed, so >10–15 wt% pigment dispersions are readily achieved, while the dye concentration in oil can be more limited.

To achieve a maximally bright white or colored state with a wide viewing angle, a high-performance reflector together with a diffusely reflective white or colored state is required. Diffusely white coloration can be achieved by using a diffuse oil, by using a textured aluminum reflector, by placing a thin diffusely transmitting coating near the channel, or by directly building up the mesa with a diffuse polymer, which inherently provides a diffusely reflecting surface. The maximum brightness can be realized by using a high-quality reflector such as aluminum: when dielectric-protected, an aluminum reflector can have a measured R of >93 %. Considering a 150 μm pixel with a 3 μm-high surface channel and a 2 μm-wide duct, an aluminum reflector can enable a theoretical white-state reflectance of R > 77 % or >90 %, as calculated for a ratio of reservoir area to surface channel area of 10 % or 5 %, respectively. If a carbon-black pigment dispersion with 5 % reflectance at visible wavelengths is used, the resulting theoretical contrast ratio can range from >15:1 to >18:1. It should be noted that to achieve such performance, the refractive index and thickness of the films on the top substrate must be optimized to minimize Fresnel reflections.
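The white-state and contrast estimates above reduce to an area-weighted product of mirror reflectance and active area. The sketch below uses the figures quoted in the text but ignores duct geometry and Fresnel corrections, so it brackets rather than reproduces the quoted reflectances.

```python
def white_state_reflectance(r_mirror=0.93, reservoir_fraction=0.10):
    """Area-weighted white-state reflectance: only the channel area in
    front of the mirror reflects; the reservoir area is treated as lost."""
    return r_mirror * (1.0 - reservoir_fraction)

r_black = 0.05  # carbon-black dispersion reflectance quoted in the text

for frac in (0.10, 0.05):
    r_white = white_state_reflectance(reservoir_fraction=frac)
    print(f"reservoir {frac:.0%}: white {r_white:.0%}, "
          f"contrast {r_white / r_black:.0f}:1")
```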


Fig. 4 Full-color approaches for electrofluidic display technology: (a) red-green-blue-white (RGBW) and cyan-magenta-yellow-white (CMYW) pixel architectures, with an RGBW filter over white/black (WK) pixels giving a reflection of ~40 % ((0.33 × 0.85 + 0.33 × 0.85 + 0.33 × 0.85 + 1.00 × 0.85)/4) and a CMYW filter over CMYK pixels giving ~63 % ((0.66 × 0.85 + 0.66 × 0.85 + 0.66 × 0.85 + 1.00 × 0.85)/4); (b) optical spectra of CMY pigments (reflectance in % versus wavelength, 400–700 nm); (c) photos of pigment droplets with four colors CMYK, and an overlay of CMY stacks (Adapted from Heikenfeld et al. 2009)

For all tested pixel sizes, >98 % pixel yield has been achieved. Several combinations of white diffuse or colored (cyan, magenta, yellow or red, and black) pigment dispersions have been tested. Similar to other display technologies, electrofluidic displays require a more sophisticated pixel architecture to realize full-color operation. As illustrated in Fig. 4a, a high-efficiency black/white electrofluidic pixel can be combined with an RGBW color filter (Brown et al. 2005), which could theoretically result in a 40 % white-state brightness. Existing electrophoretic displays can offer such monochrome brightness, but they cannot offer full-color images the way electrofluidic displays can. As



shown in Fig. 4a, if a two-layer subtractive CMY approach is used, like that proposed for electrowetting and cholesteric displays, an even higher full-color brightness of 60 % is theoretically possible. However, multilayer displays cannot use active matrix addressing and are therefore not suited for mainstream e-paper applications. The spectra and photos of electrofluidic CMY pigment dispersions are shown in Fig. 4b, c, respectively. Use of three stacked CMY electrofluidic pixels is also possible. With this approach, the brightness could be close or equal to that of printed media, but manufacturing will be substantially more difficult for high-resolution pixels.
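As a quick check of the Fig. 4a brightness estimates, the white-state numbers follow from averaging the four subpixels; a minimal sketch, assuming the figure's filter pass fractions (0.33 for RGB, 0.66 for CMY) and a 0.85 white-state pixel reflectance:

# Sketch of the white-state brightness estimates behind Fig. 4a.
# Assumptions (from the figure annotations): each color filter passes ~0.33
# (RGB) or ~0.66 (CMY) of white light, the white subpixel passes 1.00, and
# the underlying electrofluidic pixel reflects ~0.85 in its white state.

def filter_brightness(color_pass, pixel_white=0.85):
    """Average reflection over three color subpixels plus one white subpixel."""
    return (3 * color_pass * pixel_white + 1.00 * pixel_white) / 4

print(f"RGBW: {filter_brightness(0.33):.0%}")  # ~40 %
print(f"CMYW: {filter_brightness(0.66):.0%}")  # ~63 %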

Bistable Electrofluidic Display Pixel
The electrofluidic display pixel discussed above can provide high resolution, video speed, and bright color due to the use of pigment dispersions similar to those used in the inkjet printing industry. However, no bistable operation was reported for this pixel structure. For the success of a reflective display technology, it is highly desirable that pixels be bistable, i.e., that they retain their image without any electrical power. Bistability therefore enables maximum power efficiency and is also beneficial for display longevity. In this section, an approach for creating infinitely bistable electrofluidic display pixels is introduced and demonstrated. Figure 5a shows the pixel structure and its operation principle. The 3-D pixel structure consists of two identical parallel channels, which are separated by a suspended middle layer and connected via two openings of small area.

Electrofluidic Imaging Film with >90 % Reflective Area
In addition to the bistable platform presented in the previous section, there is also a new and completely redesigned approach for e-Paper, which we refer to as an "electrofluidic imaging film." Unique among the dozen or more existing e-Paper technologies, an ink is electromechanically transported to the viewable front, or hidden in the back, of a specially designed porous film. This film is a first for all types of electrowetting-style displays in allowing nonaligned lamination fabrication. Furthermore, the film is able to efficiently split merged fluid and therefore eliminates the need for the ink microencapsulation or pixel borders needed in other displays that also move a colorant. Cumulatively, these advances provide a bistable/multistable electrofluidic imaging film that has fast switching, high reflectance (>76 %), and simpler manufacturing by nonaligned lamination. Due to a threshold voltage, imaging-film displays can be passive matrix-driven for low-cost electronic shelf label and signage applications. These displays can also be paired with an active matrix backplane for eReader applications. As shown in Fig. 7, the electrofluidic imaging film consists of a specially designed, diffusely reflective, dual-porosity film sandwiched between two fluidic channels of equal height. The reflective state of the device is controlled by whether the reflective film is covered by an optically absorbing polar fluid (ink) or a transparent nonpolar fluid (oil). A regular lattice of large pores and a dense, irregular lattice of small pores facilitate fluid flow through the film. By controlling the diameter, and therefore the Laplace pressure, of the two pore types in the film, the device has been designed such that the ink always occupies the larger pores and the oil always occupies the smaller pores. This guarantees that there always exists a path through the film for pigment flow and for oil counterflow.
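A minimal sketch of the pore-design argument, using the cylindrical-pore Laplace relation ΔP = 4γcosθ/D; the pore diameters, interfacial tension, and contact angle are illustrative assumptions, not values from the source:

# Why ink stays in the large pores and oil in the small ones: the capillary
# (Laplace) pressure of a cylindrical pore scales inversely with its diameter,
# dP = 4*gamma*cos(theta)/D, so the wetting oil is held far more strongly in
# the small pores. Numbers below are illustrative assumptions.

from math import cos, radians

gamma = 0.020          # ink/oil interfacial tension, N/m (assumed)
theta = radians(30.0)  # oil contact angle in the pore, degrees (assumed)

for name, D in (("large pore", 40e-6), ("small pore", 5e-6)):
    dP = 4 * gamma * cos(theta) / D   # Pa
    print(f"{name} ({D*1e6:.0f} um): Laplace pressure ~ {dP:.0f} Pa")

The roughly 8× pressure difference between the two pore populations is what keeps the fluid paths segregated regardless of the pixel state.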



Fig. 7 Illustration of the electrofluidic film device structure and operation. Left: a side view of the imaging film in its static off state with ink stable in the hidden channel. Additional important features are shown in the angled-view diagram of the white imaging film. The imaging film is coated with a thin reflective electrode, and spacers are also present on the opposite side of the film (not shown). Right: photographs of a microreplicated electrofluidic imaging film (top) flexed between two ITO-coated PET sheets filled with an oil, and illuminated in diffuse room light (bottom) in a simple electrode-stripe testing device with a diffuse light source at 15° off normal and a camera at 55° off normal (Reproduced from Hagedon et al. 2012)

By applying an electrical potential between the polar ink and an electrode coated with a hydrophobic dielectric in one of the two channels, an electromechanical pressure (electrowetting) draws the ink into the corresponding channel and displaces the clear nonpolar oil into the opposite channel, simultaneously causing breakup of the ink. In the zero-voltage state, the geometry exerts no net pressure on the fluids, and therefore the reflectance state of the device is stable. Fluid breakup of the pigment droplet is key to the high-speed operation of the device because it reduces the fluidic path length of both the ink and the oil. Breakup of the ink during switching is assisted by the addition of specially designed channel spacers between the plates and the film, which also serve to set the channel heights. These spacers create low-pressure fluid interfaces throughout the ink, which enables oil to inject into the ink and thereby cause it to break up (Hagedon et al. 2012). Importantly, the design has been optimized for alignment-free assembly of the three layers. Testing platforms have been developed for a number of resolutions from 25 PPI to 150 PPI. It has been shown that as display resolution increases, switching time is reduced. While 25 PPI modules have demonstrated 30 ms switching times, modules with resolutions as high as 150 PPI have been demonstrated with switching speeds below 15 ms. It is projected that resolutions over 300 PPI are attainable. Intermediate gray states have been achieved through pulse width modulation. Though some relaxation occurs after the voltage is removed, the fluid stabilizes quickly and remains in the gray state indefinitely. If desired, the imaging film can also be designed for strong voltage thresholding, which is an important attribute required for low-cost passive-matrix addressing. This opens up the use of this technology in a single-layer RGBW approach. Not only does a single layer allow for higher resolution than stacked CMY approaches, but unlike single-backplane CMY approaches (CMY electrochromic, CMY-particle electrophoretic), electrofluidic displays can easily be switched fast enough to mimic navigation in current e-readers and potentially fast enough for at least crude video and animation. The electrofluidic imaging film requires no pixel borders or ink microencapsulation, allowing >90 % reflective area. We have demonstrated that this reflective area, coupled with a proprietary engineered reflector, results in RGBW operation that provides SNAP (newsprint) quality color.



Fig. 8 Stability of the pigment dispersion over temperature from −28 °C to 80 °C. No change was observed in the response of wetting angle versus voltage after storage at various temperatures (Adapted from Heikenfeld et al. 2009)

Environmental Test of Electrofluidic Display
Electrofluidic displays must meet established environmental requirements for products, including storage and operating temperature. For electrofluidic displays, pigment dispersion stability is the key to obtaining a wide environmental range. We have demonstrated that our pigment dispersions are stable and can survive without degradation over a wide temperature range, from −28 °C to 80 °C. The pigment dispersion samples used for the stability tests were placed in silicone oil, the ambient fluid in electrofluidic display devices. Three samples of the pigment/silicone oil mixture were stored at −28 °C, 20 °C, and 80 °C, respectively, for 24 h. Measurements of the pigment dispersion wetting angle in silicone oil versus voltage were performed immediately after removal from the extreme environments for these three samples. For the measurement, a droplet of pigment dispersion (~1 nL) was dispensed into silicone oil on top of a test substrate and contacted by a metal probe. The response of wetting angle versus applied voltage was video recorded by camera (Fig. 8). As can be seen from the figure, there is no visual difference between the samples stored at different temperatures. The initial Young's angle is 160°; under applied voltage, the wetting angle reaches the saturation wetting angle of 60°. After removing the voltage, the droplet wetting angle returns to 160°, demonstrating that our pigment dispersion performs without change after storage between −28 °C and 80 °C. In addition to the pigment dispersion measurements over the storage range, we have most recently measured and observed satisfactory device operating performance of electrofluidic display modules over the entire measured temperature range (from −30 °C to 60 °C). Examples of electrofluidic display pixel operation at extreme operating temperatures are shown in Fig. 9. We also observed that our display devices remain operative after storage over the temperature range from −70 °C to 80 °C.
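The wetting-angle-versus-voltage response can be sketched with the Young-Lippmann relation, clamped at the observed saturation angle; only the 160° Young's angle and the 60° saturation angle come from the measurements above, while the dielectric and interfacial parameters are illustrative assumptions:

# Minimal sketch of the wetting-angle-versus-voltage curve:
#   cos(theta_V) = cos(theta_Y) + eps0*eps_r*V^2 / (2*gamma*d)
# with an empirical clamp at the observed saturation angle.

import numpy as np

eps0, eps_r = 8.854e-12, 3.0   # vacuum permittivity; dielectric constant (assumed)
d = 1e-6                       # hydrophobic dielectric thickness, m (assumed)
gamma = 0.020                  # dispersion/oil interfacial tension, N/m (assumed)
theta_young, theta_sat = 160.0, 60.0   # degrees (from the measurements above)

V = np.linspace(0, 40, 9)
cos_theta = np.cos(np.radians(theta_young)) + eps0 * eps_r * V**2 / (2 * gamma * d)
theta = np.degrees(np.arccos(np.clip(cos_theta, -1, 1)))
theta = np.maximum(theta, theta_sat)   # saturation clamp

for v, t in zip(V, theta):
    print(f"{v:5.1f} V -> {t:6.1f} deg")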



Fig. 9 Electrofluidic display pixel operation at extreme temperatures: −30 °C and 60 °C

Summary
This chapter briefly reviews electrofluidic display technology in its key aspects, including pixel design, device fabrication, speed, color, bistability, and environmental operation. It should allow anyone interested in reflective displays to quickly understand electrofluidic display technology. Future work includes new pixel architectures, novel liquid dosing technologies, advanced aging/reliability testing, improved brightness and color, generation of high-resolution grayscale images, and further adaptation to industry-standard manufacturing tools.

Directions for Future Research
As reported herein, electrofluidic displays are a new technology, and therefore future research and development will focus on a few key challenges: manufacturable dosing and sealing techniques, analogue grayscale capability for the driving scheme shown in Fig. 1, and reliable material systems for extended device lifetime, enhanced reflectance, and full-color operation.

References
Bach U et al (2002) Nanomaterials-based electrochromics for paper-quality display. Adv Mater 14:845–848
Berthier J (2008) Microdrops and digital microfluidics. William Andrew, New York
Brown EC et al (2005) Adding a white subpixel. Inf Disp 5:26–31
Cristea D, Vilarem G (2006) Improving light fastness of natural dyes on cotton yarn. Dyes Pigm 70:238–245
Fair R (2007) Digital microfluidics: is a true lab-on-a-chip possible? Microfluid Nanofluid 3:245–281
Graham-Rowe D (2008) Electronic paper targets colour video. Nat Photonics 2:204–205
Hagedon M, Yang S, Russell A et al (2012) Bright e-Paper by transport of ink through a white electrofluidic imaging film. Nat Commun 3:1173



Hattori R et al (2004) A novel bistable reflective display using quick-response liquid powder. J Soc Inf Disp 12:75–80
Hayes RA, Feenstra BJ (2003) Video-speed electronic paper based on electrowetting. Nature 425:383–385
Heikenfeld J et al (2009) Electrofluidic displays using Young-Laplace transposition of brilliant pigment dispersions. Nat Photonics 3:292–296
Huitema E et al (2006) Flexible electronic-paper active-matrix displays. J Soc Inf Disp 14:729–733
Jones TB (2005) An electromechanical interpretation of electrowetting. J Micromech Microeng 15:1184–1187
Khan A et al (2005) Reflective cholesteric LCDs for electronic paper applications. In: Proceedings of the international display manufacturing conference, pp 397–399
Mugele F, Baret J (2005) Electrowetting: from basics to applications. J Phys Condens Matter 17:R705–R774
Song J et al (2009) A scaling model for electrowetting-on-dielectric microfluidic actuators. Microfluid Nanofluid 7:75–89
Yang S et al (2010) High reflectivity electrofluidic pixels with zero-power grayscale operation. Appl Phys Lett 97:143501
Zhou K et al (2010a) Reliable electrofluidic display pixels without liquid splitting. SID 10 Digest, pp 1659–1662
Zhou K et al (2010b) Flexible electrofluidic displays using brilliantly colored pigments. SID 10 Digest, pp 484–486

Further Reading
Heikenfeld J et al (2011) Review paper: a critical review of the present and future prospects for electronic paper. J Soc Inf Disp 19(2):129–156
Zhou K et al (2009) A full description of a simple and scalable fabrication process for electrowetting displays. J Micromech Microeng 19:065029



Mirasol®: MEMS-based Direct View Reflective Display Technology
Ion Bita*, Evgeni Gousev and Alok Govil
Qualcomm MEMS Technologies, Inc., San Jose, CA, USA

Abstract
The mirasol® display is an emerging reflective technology particularly attractive for direct-view mobile display applications. Based on the principles of interferometric light and color modulation and on microelectromechanical device operation and fabrication, the technology demonstrates the unique attributes of low power consumption and consistent image quality in various ambient lighting conditions, including bright sunlight. Additionally, the intrinsically fast switching of the microelectromechanical systems (MEMS) pixel elements enables video operation of mirasol displays at various refresh rates. Mirasol displays can be manufactured in a conventional large-area glass LCD TFT fab suitably modified for MEMS processing steps, which makes the technology an attractive alternative to existing display technologies, especially for products and applications where power consumption and sunlight viewability are important factors.

List of Abbreviations
CRT  Cathode ray tube
IMOD  Interferometric modulator
LCD  Liquid crystal display
MEMS  Micro electro mechanical systems
mirasol  Reflective display devices based on IMOD technology
OLED  Organic light-emitting diode
PECVD  Plasma enhanced chemical vapor deposition
TFT  Thin film transistor

Introduction
In the modern, dynamic, always-and-everywhere-connected world, consumers' usage of mobile devices grows at an astronomic pace. This motivates rapid development of a wide range of new software applications and wireless network platforms, which, in turn, significantly drives demand for a new class of hardware products and portable devices that will satisfy the growing demand and enrich the user experience. The display becomes the central interface between the user and the connected world. While conventional display technologies perform reasonably well and meet basic customer requirements, the growing usage of mobile devices in professional and personal life dictates the development of new display technologies with significantly reduced power consumption and image quality independent of ambient lighting, in particular under outdoor conditions. Qualcomm's mirasol microelectromechanical systems (MEMS)-based approach is a promising technology that specifically addresses power consumption and consistent-viewability issues (Cathey 2009).

*Email: [email protected]


The core of the technology, the mirasol display pixel, is designed on the principles of (i) thin-film interferometric modulation optics to produce color in the reflected spectrum of (white) ambient light and (ii) a capacitive MEMS switching mechanism for changing the pixel state (e.g., color to black) and, as a result, refreshing an image by applying a low voltage. Fundamentally, these principles offer significant power advantages because (i) reflected light (energy) is "recycled" (more precisely, modulated) from an ambient lighting source and (ii) low-voltage MEMS pixel switching automatically means low power consumption during display array addressing. Compared to current mainstream display technologies, mirasol displays do not utilize backlight illumination units, while a frontlight solution can be included to supplement low ambient lighting as needed by the user. Finally, being a member of the reflective display family, mirasol display technology demonstrates image quality, for example, contrast ratio and color gamut, that does not degrade as a function of ambient lighting, resulting in a striking visual experience compared to the more common case of emissive displays.

Color Performance Metrics for Displays
Color performance of emissive display technologies such as liquid crystal displays (LCDs) and organic light-emitting diodes (OLEDs) is subject to ambient lighting conditions. Even with impressive color performance specifications in a completely dark environment, emissive displays are often difficult to use outdoors. Reflective display technologies, on the other hand, deliver a much more consistent performance under varied lighting conditions, while potentially requiring supplemental illumination in dark environments. Evaluating the color performance of emissive and reflective technologies with common metrics therefore requires measurements to be made under carefully specified real-world lighting conditions (Gille et al. 2008; Qualcomm MEMS Technologies). State-of-the-art color performance metrics and future expectations for reflective display technologies have been discussed and briefly compared to printed media such as newspaper and magazine quality prints (Henzen 2009). Specifically, a display reflectance of 40–60 %, a color gamut of 35 % of CIE-Luv (equivalent to photo-quality print), and a contrast ratio of 20:1 are considered to be good. (While much better performance is achieved by emissive display technologies in a completely dark environment, their performance is generally similar or worse than this under more realistic lighting conditions.)

Principle of Color Generation in IMODs
Color as perceived by the human eye depends on the spectrum of the incoming light, which in turn is shaped by the interplay of light sources and interactions with materials through basic physical phenomena such as absorption, emission, dispersion, or interference. For example, the white light emitted by a source can become colored upon reflection from objects, as a result of differential absorption of photons with different wavelengths, or upon transmission through the color filters used by liquid crystal displays (LCDs). Generation of color via interference is quite common in nature, with examples including the iridescence seen in butterfly wings and peacock feathers or the appearance of soap or oil films and Newton's rings. Interferometric modulator (IMOD) devices also generate color using interference. The typical implementation of an IMOD device is analogous to an etalon (or Fabry-Pérot interferometer) (Angus Macleod 2010) operating in reflection mode, where the distance between the reflective interfaces is modulated electrically. Figure 1a shows a highly reflective mirror that is movable with respect to fixed layers formed by a thin-film optical stack that includes a semitransparent layer. As the gap between the mirror and the


partially reflective fixed stack is changed, the reflected color is in turn modulated. The color versus air gap relationship of IMOD devices can be computed using transfer-matrix methods (Angus Macleod 2010). Figure 1b shows the color range plotted on a standard CIE u′–v′ plot for a given optical stack.

Fig. 1 Color generation in interferometric modulators (IMODs). (a) IMOD device structure and color versus gap. (b) IMOD color spiral on the CIE u′–v′ chart

IMOD Device Physics
From an electrical standpoint, the IMOD is similar to a parallel-plate capacitor formed by the deformable mirror and the fixed, partially reflective stack. Application of a voltage to this capacitor results in electrostatic attraction between the two plates, which in turn causes a displacement of the movable mirror toward the partial reflector and a change in the color seen by the viewer. Under equilibrium conditions, the electrostatic force is balanced by the mechanical restoring force, which results in the following voltage versus gap relationship in an IMOD device:

$$F_E = \frac{1}{2} Q E = \frac{1}{2} A \epsilon_0 \left( \frac{V}{d - x} \right)^2, \qquad F_M = K x, \qquad x \le g < d$$

where FE is the electrostatic force of attraction between the two plates, Q is the magnitude of the charge on the plates, E is the electric field between them, V is the applied voltage, d is the electrical gap in the undeflected state, FM is the mechanical restoring force, x is the deflection, K is the spring constant, A is the area of the plates, and g is the air gap in the undeflected state. Solving the above cubic equation for x as a function of V results in the three solutions shown in Fig. 2a, where only two of the three solutions correspond to stable equilibrium. As the voltage is increased in magnitude starting from 0 V, the IMOD starts in the "released" (or open) state while the mirror gradually deflects toward the absorber. Once a threshold, called the pull-in or actuation voltage, is crossed (5 V in Fig. 2a), the mirror collapses toward the absorber, reducing the air gap thickness to nominally zero (actuated state). The transition to the actuated state is accompanied by a nonlinear increase of the electrostatic attraction, which increases rapidly as the air gap decreases. If the voltage is now decreased in magnitude, the electrostatic force needs to be reduced significantly before the mirror is able to snap back to the released state. The threshold voltage level at which the transition to the released state happens is called the release voltage (2 V in Fig. 2a). The asymmetry of the actuation and release voltages leads to a hysteresis behavior.
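A small numerical sketch of this equilibrium follows. Setting FE = FM gives a cubic in x; the closed-form pull-in point x = d/3 and V_pi = sqrt(8Kd³/(27ε₀A)) follow from the standard parallel-plate actuator analysis. All parameter values are illustrative assumptions, chosen only so the pull-in voltage lands near the ~5 V of Fig. 2a:

# Static equilibrium: (1/2)*eps0*A*V^2/(d - x)^2 = K*x, a cubic in x.

import numpy as np

eps0 = 8.854e-12
A = (40e-6) ** 2      # 40 um x 40 um mirror (assumed)
K = 50.0              # spring constant, N/m (assumed)
d = 300e-9            # electrical gap, m (assumed)

V_pi = np.sqrt(8 * K * d**3 / (27 * eps0 * A))
print(f"pull-in voltage ~ {V_pi:.1f} V")

for V in (0.5 * V_pi, 0.9 * V_pi):
    # K*x*(d - x)^2 = c  ->  K*x^3 - 2*K*d*x^2 + K*d^2*x - c = 0
    c = eps0 * A * V**2 / 2
    roots = np.roots([K, -2 * K * d, K * d**2, -c])
    real_roots = [r.real for r in roots if abs(r.imag) < 1e-3 * d]
    stable = min(r for r in real_roots if 0 <= r < d / 3)  # stable branch
    print(f"V = {V:.2f} V: stable deflection x = {stable*1e9:.1f} nm")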



Fig. 2 (a) Hysteresis in the IMOD: deflection x (Å) versus applied voltage (V). Both actuated and released states are stable when the applied voltage magnitude is between 2 and 5 V in this example. (b) Truth table: |Voltage| < V_Release gives the released state; |Voltage| > V_Actuation gives the actuated state; otherwise the IMOD maintains its current state
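The hysteresis window is what makes simple drive logic possible; a minimal sketch of the Fig. 2b truth table as a state-update rule, using the example thresholds from Fig. 2a:

# The Fig. 2b truth table as code: a pixel held at a bias voltage inside the
# hysteresis window keeps its state while other rows are written.

V_RELEASE, V_ACTUATION = 2.0, 5.0   # volts (example values from Fig. 2a)

def next_state(voltage, state):
    """Return 'released', 'actuated', or the unchanged current state."""
    if abs(voltage) < V_RELEASE:
        return "released"
    if abs(voltage) > V_ACTUATION:
        return "actuated"
    return state   # inside the hysteresis window: maintain current state

assert next_state(6.0, "released") == "actuated"
assert next_state(3.5, "actuated") == "actuated"   # bias hold
assert next_state(1.0, "actuated") == "released"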

Both states are stable in the range of voltages defined by these two thresholds, akin to a memory effect where each pixel preserves the last state it was set to while being held in this voltage range. The dynamical response time of the IMOD device can be modeled by including the terms for acceleration and squeezed-film damping in the force equation (Veijola 1995; Bao 2005):

$$\frac{1}{2} A \epsilon_0 \left( \frac{V}{d - x} \right)^2 = m \frac{d^2 x}{d t^2} + \frac{\alpha \eta_0 A^2}{\left( 1 + 9.638\, K_n^{1.159} \right) (g - x)^3} \frac{d x}{d t} + K x$$

where m is the effective mass of the mirror, Kn is the Knudsen number, η0 is the coefficient of viscosity of air, and α is a constant dependent on the geometry of the device (Bao 2005). Mechanical response times of 10–20 μs are commonly achieved (compared to 2–25 ms for LCD), thus allowing for full video capability.
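The response time can be estimated by integrating this equation directly; a minimal sketch with scipy, under assumed (not published) parameter values:

# Direct integration of the force equation above, with Veijola's rarefaction
# correction in the squeezed-film damping term. All values are assumptions.

import numpy as np
from scipy.integrate import solve_ivp

eps0 = 8.854e-12
A, K, d, g = (40e-6)**2, 50.0, 300e-9, 250e-9   # assumed geometry/stiffness
m = 1e-11                                       # effective mirror mass, kg (assumed)
eta0, Kn, alpha = 1.8e-5, 0.3, 1.0              # air viscosity, Knudsen no., geometry const.

def rhs(t, y, V):
    x, v = y
    F_e = 0.5 * eps0 * A * (V / (d - x)) ** 2
    damping = alpha * eta0 * A**2 / ((1 + 9.638 * Kn**1.159) * (g - x) ** 3)
    return [v, (F_e - damping * v - K * x) / m]

sol = solve_ivp(rhs, (0, 50e-6), [0.0, 0.0], args=(3.0,), method="LSODA")
print(f"deflection after 50 us at 3 V: {sol.y[0, -1]*1e9:.1f} nm")

With these numbers the motion is heavily overdamped and settles on a tens-of-microseconds timescale, consistent with the 10–20 μs response quoted above.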

Image Addressing
The existence of hysteresis and bistability allows IMOD devices to be driven by a passive matrix without being subject to the degradation of contrast ratio as the vertical resolution of the display (number of scan lines) is increased (Alt and Pleshko 1974). Since each pixel remembers its last-set state, the display does not need to be refreshed until the pixel content needs to be updated. These attributes bring substantial power savings, especially for e-reader types of applications. Integration of an active matrix backplane with IMOD displays can further reduce power requirements, especially for high-refresh-rate, full-video operation. Each IMOD pixel typically consists of three subpixels, respectively, for the red, green, and blue color channels, as traditionally used in other displays. The individual subpixels are fabricated using a common process sequence platform producing three separate air gaps. The color points for the subpixels are chosen from the IMOD color spiral for a given optical stack (e.g., Fig. 1b), considering the white-balancing constraint and the trade-off between gamut and brightness pertinent to any reflective display. The subpixels can be independently transitioned between the chosen primary color state and a black state. Given the bistable or 1-bit digital nature of each IMOD device, intermediate colors can be generated for each subpixel by



dividing it into multiple area-weighted IMODs and/or using spatial dithering such as the error diffusion algorithms commonly used in the print industry. The IMOD mirror sizes used are optimized to avoid dither artifacts while still maintaining a reasonably high fill factor; IMOD sizes ranging from 30 to 60 μm are typical. Color processing, dithering, and gamma adjustment functions are generally included in the driver integrated circuit (IC). Future generations of the IMOD technology may use multistate or analog control of the air gap. This allows each IMOD to reach a larger range of colors from the color spiral shown in Fig. 1b, having varied color saturation and including a natively bright white state. The ability to generate all colors with a single IMOD device allows not only a better color gamut but also fundamental improvements in the overall brightness of the display (Henzen 2009).
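Floyd-Steinberg error diffusion, the classic print-industry algorithm of the kind referred to above, can be sketched in a few lines; input values are target reflectances in [0, 1] and the output is the 1-bit on/off pattern driven onto the IMODs:

# Floyd-Steinberg error diffusion: quantize each pixel to on/off and push the
# quantization error onto not-yet-processed neighbors.

import numpy as np

def floyd_steinberg(image):
    img = image.astype(float).copy()
    out = np.zeros_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = 1.0 if img[y, x] >= 0.5 else 0.0
            err = img[y, x] - out[y, x]
            # Distribute the error with the standard 7/16, 3/16, 5/16, 1/16 weights.
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16
                img[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16
    return out

gray = np.full((8, 8), 0.4)            # a flat 40 % tone
print(floyd_steinberg(gray).mean())    # ~0.4 when averaged over the patch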

Overview of Mirasol Display Manufacturing
The development of IMOD technology has been shaped by a defining decision to focus on the set of materials and processes available in conventional TFT LCD factories, in order to enable a scalable manufacturing platform compatible with a direct-view display application of the IMOD (Miles 2006; Floyd et al. 2006). Thus, mirasol displays are a significant development in the MEMS industry, which is dominated by the more traditional silicon-based platform. As a result of this strategy, IMOD device architectures were developed to fit the capabilities and constraints introduced by operating in a TFT LCD-like factory. For example, the typical lithography resolution in thin-film transistor (TFT) fabs lags significantly behind the state of the art in the silicon IC industry but is nevertheless adequate for a direct-view display application where each MEMS pixel may have a lateral dimension on the order of tens of microns. The manufacturing of mirasol displays consists of the following key steps: pixel array fabrication, display panel fabrication, and assembly into a display module.

Pixel Array Fabrication Process
The fabrication of IMOD pixels is based almost entirely on process steps common in TFT fabs, but used in a surface-micromachining approach for MEMS fabrication: deposition (PECVD, sputter, etc.), patterning (photolithography), and etching (dry, wet) of dielectrics and metals (Londergan et al. 2007). MEMS-specific processes include forming the air gap below the electrostatically actuatable membrane, which is accomplished by depositing a sacrificial layer that is later etched out through appropriately designed openings in the membrane. A simplified process flow for IMOD device fabrication is shown in Fig. 3. While there are similarities to standard processes used in TFT array manufacturing, a number of additional controls and optimizations are needed in order to ensure that the resulting MEMS device meets the desired operational and reliability specifications. Given that the IMOD is an electromechanical optical device, it is not surprising that additional attention is dedicated to engineering the combined optical, electrical, mechanical, and surface properties of the various layers forming the device. Some examples include the residual stress of the electrostatically actuatable membrane, the energy and topography of the surfaces on each side of the air gap, and the refractive index of each of these various layers. As the IMOD technology continues to be developed and refined, some of the new, advanced processes available for large-area substrate processing can become important enablers (Waldrop 2007).



Fig. 3 Fabrication process flow for an array of the basic IMOD pixel structure: start with a glass substrate; deposit, pattern, and etch the optical stack (TCO + absorber, defining row electrodes); deposit the insulator dielectric; deposit, pattern, and etch the support-post dielectric; deposit, pattern, and etch the sacrificial layer; deposit, pattern, and etch the reflective membrane (creating etch holes and defining column electrodes); MEMS release etch (create the air gap by etching the sacrificial layer); array fabrication complete

Fig. 4 Typical mirasol package (viewer side facing the glass substrate): [1] IMOD pixel array, [2] glass substrate, [3] glass backplate, [4] a thin polymer seal, [5] driver IC, [6] optical films

Module Fabrication Process

The basic flow includes panel fabrication, followed by assembly of the display module including all the required subcomponents. Mirasol display panels are made by encapsulating the array glass with a cover glass in order to protect the MEMS structures against physical damage or environmental contamination, and to create a reproducible ambient surrounding the pixels (Fig. 4). The typical mirasol package solutions borrow from existing flat-panel display (FPD) industry approaches, such as employing perimeter adhesive sealing with relatively thin glass cover plates. Encapsulation is done at the plate level under near-atmospheric pressure conditions, followed by singulation into individual panels. It should be noted that



Fig. 5 Photograph of a 5.7″ XGA-format color mirasol display

the perimeter seal requirements revolve primarily around preventing moisture ingress, without concern for oxygen. A thin desiccant layer is often included inside the package to extend the display lifetime. It should also be noted that vacuum encapsulation is not required as in other traditional MEMS devices, since the pixel response time is already very fast for display applications even under atmospheric pressure conditions. The intrinsic attributes of IMOD technology that eliminate the need for traditional display components such as polarizers and color filters also reduce the complexity of the basic mirasol display module architecture. As shown in Fig. 4, at minimum the panel needs to be integrated with the driver electronics and, further, to reap the benefits of the reflective nature of the display, antireflection optical films are also added in order to maintain the intrinsic contrast ratio of the display. However, in practice, depending on the particular usage requirements of the product employing the mirasol display, additional components may be integrated within the module: diffusive layers to manage glare and view angle, touch panels for enabling advanced user interactions, as well as frontlights to enable use of the display in dark ambient conditions. For example, since IMOD pixels are intrinsically specular reflectors, diffusing layers are added to the display module stack in order to tailor the view cone for ambient conditions with directed or narrow-angle illumination sources. By controlling the diffusion level, one can manage the angular variation of the reflection and tailor the perceived brightness in the desired view cone. This is an advantage, for example, for small-form-factor handheld displays, where the incoming light can be reflected into a smaller view cone, typically ±45°, to increase perceived brightness. The choice of the diffuser level is based on particular usage models (real ambients are a mixture of diffuse and directional light sources), and their impact on the display appearance can be modeled by understanding that they interact with light twice: they diffuse incoming light sources and further diffuse the reflection of the IMOD pixel array, resulting in the desired view cone. As expected, the particular processes used for module assembly also borrow from the well-established LCD-type infrastructure: driver attach with chip-on-glass or chip-on-flex solutions, film lamination using pressure-sensitive adhesives, and touch or cover plate assembly with air gaps (perimeter gasket) or with optically clear resins or adhesive films in a bonded configuration.



Summary
Mirasol, a new technology that mimics the way nature generates bright color by interferometric modulation, enables a new type of low-power reflective display for portable applications, with a consistent viewing experience under a wide range of ambient lighting conditions including bright sunlight (Waldrop 2007). Interferometric modulation, which can produce color by "recycling" light from the ambient lighting conditions, is used in a novel display pixel architecture based on microelectromechanical devices. This unique combination of color reflection, low-voltage switching MEMS, and bistable pixel architectures enables power consumption superior to that of existing conventional display technologies. Qualcomm MEMS Technologies (Sampsell 2006) develops and commercializes mirasol displays, covering the whole technology and product development cycle from the early phases of research all the way to module manufacturing, with a first product introduced to the marketplace in 2008. The technology is particularly attractive for mobile devices, including e-reader applications (Fig. 5). Current research activities center around new pixel architectures and materials with enhanced display performance, innovations in array addressing, and low-cost component research, such as frontlight technology.

References
Alt PM, Pleshko P (1974) Scanning limitations in liquid crystal displays. IEEE Trans Electron Devices ED-21:146
Angus Macleod H (2010) Thin-film optical filters, 4th edn. CRC Press
Bao M (2005) Analysis and design principles of MEMS devices. Elsevier Science
Cathey J (2009) Enhancing mobility through display innovation. Inf Disp 11:8–11
Floyd PD, Heald D, Arbuckle B et al (2006) IMOD display manufacturing. SID 2006 Proc 37:1980–1983
Gille J, Gally B, Shelby R (2008) Specifying color performance in mobile displays: the effects of environment, pixel size, and the use of dither. In: Conference record of the 28th international display research conference, pp 133–136
Henzen A (iRex Technologies) (2009) Development of e-paper color display technologies. Society of Information Displays 2009 Digest, pp 28–30
Londergan A, Gousev E, Chui C (2007) Advanced processes for MEMS-based displays. Proc Asia Disp SID 1:107–112
Miles M (2006) Toward an iMoD™ ecosystem. In: IDW'06 proceedings, pp 1583–1586
Qualcomm MEMS Technologies, Inc. Mobile color depth: quantifying the real world performance of displays. http://www.mirasoldisplays.com/mcd
Sampsell J (2006) Innovation to production: a continuum. SID Proc 71:1
Veijola T (1995) Equivalent circuit models of the squeezed film in a silicon accelerometer. Sens Actuators A 48:239–248
Waldrop M (2007) Brilliant displays. Sci Am 297:94–97

Further Reading
Heikenfeld J (2010) Lite, brite displays. IEEE Spectrum, p 29, March 2010
Motamedi ME (ed) (2005) MOEMS: micro-opto-electro-mechanical systems. SPIE Press



Time Multiplexed Optical Shutter Displays
Ram Ramakrishnana* and Dan Van Ostrandb
a Uni-Pixel Displays, Inc., The Woodlands, TX, USA
b Technology Consultant, Austin, TX, USA

Abstract
Time multiplexed optical shutter ("TMOS") describes the operational principle of the patented frustrated total internal reflection (FTIR) display systems invented by Uni-Pixel Displays, Inc. (Selbrede, US Patent 5,319,491). The fundamental principle of FTIR displays is that light injected into one edge of a thin planar transparent waveguide that is reflectively mirrored at the non-insertion edges remains bound within the waveguide, in the same way that light is trapped inside fiber-optic cables. The violation ("frustration") of total internal reflection (TIR) causes the light to emerge from the waveguide in the area where the violation occurs. TMOS achieves frustration of TIR (FTIR) by moving a transparent material or membrane of equal or slightly higher refractive index into contact (or near contact) with the waveguide. The light inside the waveguide is then "coupled" from the waveguide into the membrane, where it encounters surface features on the membrane that redirect the light toward the observer. The ultrafast response speed of TMOS pixel actuation enables the use of field sequential color generation (FSC) with pulse-width-modulated gray-scale generation at video-capable frame rates. The typical motional or color break-up artifacts associated with other FSC systems are resolved at their source by deconstructing the primary color image subframes into even smaller time segments and then rearranging the sequence in which they are presented to the viewer.

List of Abbreviations
AL  Active Layer
CRT  Cathode ray tube
FSC  Field sequential color
FTIR  Frustrated total internal reflection
ITO  Indium tin oxide
LCD  Liquid crystal display
LED  Light-emitting diode
LG  Light Guide, also known as an optical waveguide
MEMS  Microelectromechanical system
MEOPS  Micro electro-optical polymer system
OLED  Organic light-emitting display
PET  Polyethylene terephthalate
RGB  Red, green, and blue light
TFT  Thin-film transistor
TIR  Total internal reflection
TMOS  Time multiplexed optical shutter

*Email: [email protected]


Introduction
TMOS is an innovative flat-panel display technology (Van Ostrand and Cox 2008; Cox and Selbrede 2007) based on principles of operation quite distinct from those used in other technologies such as LCD, OLED, plasma displays, and CRTs. Localized frustration of TIR (de Fornel 2001; Zhu et al. 1986) using optical shutters with large apertures and high-speed response permits TMOS to implement field sequential color generation. This approach also significantly simplifies the TMOS display architecture relative to other systems. The fabrication of this novel micro electro-optical polymer system (MEOPS) display involves the precision integration of a polymer thin film (the Active Layer) extended across a transparent planar waveguide into which primary color light is sequentially edge injected. At each pixel, the central aperture of the thin-film membrane is deformed via electrostatic attraction, while at the pixel edges it is firmly tethered (analogous to a drumhead) to secure reliable mechanical actuation (Fig. 1). TMOS is a transmissive technology that can ultimately achieve ultrahigh efficiency (display output, in lumens, divided by raw lumens injected into the LG), theoretically as much as or more than ten times current LCD efficiencies (~5 %). The absence of color filters and polarizers, combined with a multipass light depletion approach achieved by the use of a TIR-based stochastic LG, fuels this efficiency advantage over LCD displays. The TMOS architecture enables a dramatic reduction in the complexity of manufacturing display panel devices by replacing many of the layers and materials currently found in LCD panels (such as liquid crystals, polarizers, color filters, and brightness-enhancing films) with a single thin-film layer (the Active Layer). TMOS can advance the state of the art in display performance in terms of luminance, power efficiency, and image quality. TMOS displays require neither backlights, brightness-enhancing films, polarizers, color filters, liquid crystals, noble gases, high vacuums, phosphors, nor the absorptive layers that are commonly used in other display technologies. The absence of the light-blocking/light-absorbing layers (such as polarizers, color filters, masks, and other light-impeding apertures) native to LCD technologies can push the relative power efficiency of TMOS significantly beyond LCD performance (Fig. 2).

Subsystems Overview
The TMOS display system architecture is a unique combination of discrete subsystems (Fig. 3). The subsystems are explained in the following sections.

Light Guide (LG)
The LG serves as the core light transmission medium in the system. Non-collimated light is injected into the LG from the "illumination system" and is maintained in stochastic multimode TIR propagation in the

Fig. 1 Image of a TMOS display


Fig. 2 LCD displays (right side) use multiple subtractive layers to modulate light in a one-pass system, emitting only ~5 % of the starting light. TMOS (left side) is a "multi-pass" system that does not incorporate the same number or type of light-absorbing mechanisms, emitting >60 %

Fig. 3 Conceptual view of the components of a TMOS panel (not to scale): a micro-optic layer on a carrier film (shown lifted, with a conductor between the lenses) over a light guide with mirrored edges on all sides, edge-lit by LED lights shared by all pixels

LG until emitted at an active pixel, depleted by absorption, or exhausted through a natural sink in the system. Initial developments have used standard thin-film transistor (TFT) mother glass as the LG.
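The trapping condition itself is just Snell's law; a one-line sketch with typical (assumed) indices for a glass light guide in air:

# TIR condition that keeps light bound in the Light Guide: rays hitting the
# glass/air surface at more than the critical angle from the normal are
# totally internally reflected. Index values are typical, assumed numbers.

from math import asin, degrees

n_glass, n_air = 1.52, 1.00
theta_c = degrees(asin(n_air / n_glass))
print(f"critical angle ~ {theta_c:.1f} deg from the surface normal")
# Rays striking the surface at more than ~41 deg from the normal stay trapped
# until an Active Layer feature of equal or higher index touches the surface
# and frustrates the TIR.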



Illumination System
The TMOS architecture, based on the conventional RGB model of color generation, incorporates red, green, and blue LEDs as its light sources. The LEDs are set in a light bar assembly that optimizes the angles of light injected into the insertion edge of the system's LG. The illumination system also incorporates the appropriate attachment means for mating the illumination system to the LG to prevent light leakage and to redirect stray light back into the system. The illumination system is tuned to optimize the overall system performance (emission angles, efficiency, etc.) relative to the limited range of transit angles achievable through the LG. Over time, the illumination system will be able to incorporate a broader mix of LED colors to achieve photo-quality imagery (extended gamuts not limited to the traditional NTSC color gamut) and will also be able to house infrared (IR) light sources in addition to visible light sources. Operating with such supergamuts (switchable and/or superimposed) assumes the image content is suitably encoded to take advantage of this TMOS feature. It is expected that a mix of visible and non-visible light will make the system night-vision compatible; the global luminance can be tuned to meet the needs of any ambient light environment, from night-vision compatibility to readability in sunlight.

Drive Control Mechanism at the Individual Pixel

The TMOS "optical shutter" light valve mechanism is controlled using a variable capacitor architecture at each individual pixel. The capacitor comprises two conductive planes held parallel to one another and separated by a submicron gap. When a voltage differential is created in the capacitor, Coulomb (or electrostatic) attraction pulls the two conductive planes together. In the TMOS architecture, one of the capacitor planes resides on the Light Guide (LG), and the other resides on or within the Active Layer (AL) film. The LG conductor planes are individual to each pixel and controlled at each pixel by one or more thin-film transistors (TFTs). The AL conductive plane is a thin contiguous network or layer of conductive material extending across the entire surface of the film. Controlling the charge and discharge of the capacitor at each pixel provides control of the attractive force that activates each individual pixel through local deformation of the AL membrane. Initially, prototype devices were built that utilized individual conductive traces extending from the edge of the LG to each individual pixel conductive pad on the LG. This "direct drive" approach allows the control of the pixels to be managed by transistors located off the LG. In addition, other development prototypes have TFTs located at each pixel that provide the pixel drive control capacitor management. Leveraging a unique approach invented by Uni-Pixel called "Simple Matrix" (K. Derichs, US Patent 7,764,281), it is possible to build a TMOS display without TFTs and yet provide individual pixel control by patterning the conductors on the LG and AL as stripes. The crossover intersection point of the row stripes on the LG and the column stripes on the AL provides the capacitance point of the pixel for hysteretically controlling the optical shutter.

Drive Control Circuitry: System Level
As in every display panel technology, TMOS has its own unique drive control timing and voltage requirements. A TMOS display is a series of optical shutters acting as pixels, with each pixel opening and closing to emit light over a specific time period. Each pixel handles all colors: there are no sub-pixels for red, green, or blue (hence the term "unicellular" pixel). Pixel operation is an "all digital" function, meaning that the pixel is either open ("on") or closed ("off"). As opposed to other technologies that require analog settings to control light modulation, TMOS uses only on/off timing. Gray scales for each primary are generated by way of pulse width modulation (i.e., opening the pixel for only a percentage of the time allotted for each color).
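A minimal sketch of such a binary-weighted PWM schedule within a 60 Hz field sequential frame follows; the derived ~22 μs least-significant subframe is an illustration of the arithmetic, not a published TMOS timing specification:

# Binary-weighted PWM inside field sequential color: at 60 Hz each frame is
# split into R, G, B fields, and each field into 8 binary-weighted on/off
# subframes, so an 8-bit gray level is just the set of open subframes.

FRAME_HZ, PRIMARIES, BITS = 60, 3, 8
field_ms = 1000 / FRAME_HZ / PRIMARIES    # ~5.56 ms per color field
lsb_ms = field_ms / (2**BITS - 1)         # least-significant subframe

def open_subframes(level):
    """Which of the 8 binary-weighted subframes are 'open' for a gray level."""
    return [b for b in range(BITS) if level & (1 << b)]

level = 180                               # example 8-bit gray level
on_time = sum(2**b for b in open_subframes(level)) * lsb_ms
print(f"LSB subframe: {lsb_ms*1000:.1f} us; level {level}: "
      f"shutter open {on_time:.2f} ms of {field_ms:.2f} ms")

The ~22 μs least-significant subframe makes clear why microsecond-class shutter speeds are needed for 24-bit FSC video.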



Fig. 4 Relationship of the primary elements of a TMOS display. The TFT backplane is the Light Guide, which is edge-illuminated, onto which the TMOS AL film is affixed

Fig. 5 Notional construction of TMOS AL film (the deformable element suspended over each pixel region). The shape of the optical microstructure is notional; actual geometries vary

Active Layer (AL)
The core elements that allow the TMOS system to perform as required are built into the Active Layer (AL). These core elements include a base carrier film, optical microstructures added to one surface, and a contiguous conductive layer added either on or between the optical microstructures or embedded within the AL film stack. The size, geometry, and optical properties of the optical microstructures govern the light output performance of the display system (extraction efficiency, dispersion pattern/viewing angle, backscatter, ambient light handling as it relates to contrast ratio, etc.) (Figs. 4 and 5).

Active Layer Film Specifications
The Active Layer is a polymer carrier film that has optical microstructures embossed on one surface, with an embedded or coated conductor. The optical microstructures have been developed using UV embossing (Mohr et al. 2004) of a photoacrylate material on thin PET. Other approaches, such as hot embossing



(Worgull 2009), can also be used to make optical microstructures on Active Layer film. Material properties that dominate the display performance are the mechanical properties of the carrier film, the mechanical and optical properties of the optical microstructure material, and the light-coupling efficiency of the optical microstructures. Young's modulus is a crucial parameter for the carrier film substrate and the optical microstructure material. This value determines the actuation voltage required for moving the AL membrane into contact with the LG for light emission and also provides the decoupling force used to pull the AL membrane away from the LG, back to the dark state (or off state). Typical values of modulus range from 5 to 100 GPa for the LG, 1 to 5 GPa for the carrier film, and 0.05 to 1 GPa for the optical microstructure material. A higher modulus value is essential for the AL to enable restoration or release of the film away from the LG to the non-light-coupling position when the actuation voltage is removed. However, this must be matched to the capacitive driving force required for pixel actuation, or else it could result in unacceptably high operating voltages. Any increase in the restoring force required to decouple the optical microstructures from the LG surface, beyond that supplied by the film stiffness or Young's modulus, would be due to "stiction." If stiction becomes a significant factor, a higher Young's modulus would be required for the AL film, and correspondingly a higher voltage would be required to actuate the pixel. Minimizing stiction effects is essential for the low-voltage operation, overall system efficiency, and lifetime performance of TMOS displays. Yield elongation and yield strength of the composite AL film are important since stretching beyond the elastic limit will directly affect the restoring force and could result in permanent damage to the AL film. Surface energy of the contacting surfaces is an important parameter for both the LG surface and the optical microstructure material. Higher values for both the LG surface and the optical microstructure material lead to adhesion and stiction. A lower value is preferred for minimizing stiction and is critical for maintaining a low actuation voltage.

Contrast Ratio
A full-on/full-off measurement can show a >1,000,000:1 contrast ratio, because the LEDs are off when there are no pixels open. A better measurement of true contrast ratio is the 4 × 4 measurement (alternating white and black squares in essentially a checkerboard fashion). In this measurement, some areas of pixels are on and some areas of pixels are off at the same time on the same screen. Both minimum and maximum luminances are measured at the same time. Using this measurement, most standard LCD panels today are between 50:1 and 150:1. Early component measurements showed TMOS could achieve a >3,000:1 4 × 4 contrast ratio. More recently, prototype panels have been measured with a 4 × 4 contrast ratio of over 700:1.

Resolution

TMOS has already achieved a 250 μm pixel pitch, in part due to its unicellular pixel structure. Densities as high as 300 dpi (83 μm pixel pitch) are projected.

Pixel Speed

TMOS prototypes have been measured to have off-to-on pixel response times of approximately 2 μs. The on-to-off speed is a function of the natural frequency of the membrane in the assembled pixel configuration. The on-to-off times for the same pixel have been measured at approximately 9 μs.

Viewing Angle and Light Emission

The Active Layer film is fabricated with customized optical microstructures. The shape of these microstructures defines the customizable range of solid angles produced by the display. Prototypes have been built that have 90° viewing angles in both the horizontal and the vertical. More restricted emission patterns can be achieved without adding extra optical layers to the TMOS display.

Number of Colors
Uni-Pixel developed encoding methods for TMOS that permit operation beyond 24-bit color within a highly optimized field sequential color-generating domain. TMOS is a purely digital system that converts


analog video to digital equivalents. Matching the drivers to TMOS’s digital color generation system permits operation up to 36-bit color.

Video Capability
TMOS is targeted toward applications in the highly competitive display market sectors where excellent video performance is required. Accordingly, TMOS prototypes have been built to display video at 60 or more frames per second while displaying 24-bit color (eight bits per primary color).

Power Consumption
TMOS uses a single-part pixel to display the full color spectrum. Color is generated using FSC. To generate 8-bit color for each red, green, and blue primary, a total of 24 subframes for each original video frame are required. Traditional LCD displays have red, green, and blue sub-pixels. Each LCD sub-pixel is set with an analog voltage that twists the crystal, blocking a corresponding percentage of the light, thereby enabling the tripart pixel to display the full color suite. At first glance, it appears that a TMOS display, which uses 24 subframes (or screen setups) for each video frame, would consume much more power setting up the screen than a traditional LCD panel. However, that is not the case. Examining the details shows why:

• LCDs have three-part pixels, and TMOS has a one-part pixel. For each screen setup, LCDs have to address three times the number of TFTs compared to a TMOS display. The 24:1 ratio just went to 24:3 (or 8:1).
• Current state-of-the-art LCDs use faster refresh rates to minimize blur from moving objects. Most LCD panels are driven at either 120 or 240 Hz. TMOS panels do not need higher frame rates to eliminate motion blur (Yohso and Ukai 2006), given the quick response speed of the TMOS pixels combined with algorithms that get the brightest subframes out first (Castet et al. 2002). This changes the comparison ratio to 24:6 (4:1) or 24:12 (2:1).
• LCD panels use the column lines to set precise analog voltages at each pixel. The voltage on the row address line must stay "high" for the required amount of time to ensure that the precise voltage is stored in the LCD pixel capacitor (for 256 levels, or 8 bits, per primary color, this is a precision of at least 1/512th, which can only be achieved by waiting for at least 6 tau, or RC time constants; higher precision requires more time; see the settling-time sketch below). TMOS uses a digital threshold pixel, and only 1 tau (or RC time constant) is required. Each pixel is either on or off, with no analog settings in between. The row line is held "high" only long enough to charge a small hold capacitor at each TMOS pixel. This small hold capacitor controls the gate of the pixel TFT, which provides a discharge path for the pixel pad. This hold capacitor is a fraction of the total capacitance of a typical LCD pixel capacitor; therefore, it takes much less charge and much less time to address each row/pixel on a TMOS panel.

The Active Layer has a common conductor shared across all TMOS pixels (pixel capacitors). Each TMOS pixel is a mechanical device, which makes it a variable capacitor. Voltage is only required to change a pixel's state. The common conductor on the Active Layer allows charge to be shared between pixels changing to opposite states, thereby reducing the total average power. The total voltage required to address a screen is significantly less than that required to address an LCD panel. The exact amount of the power savings depends on the exact screen size and the design of the TFTs and capacitors; however, all testing, modeling, and calculations have shown that the total power consumed to drive a TMOS panel is roughly equivalent to that of a typical LCD panel of the same size and resolution.
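The 6-tau versus 1-tau argument in the third bullet is simple exponential-settling arithmetic; a minimal sketch:

# Charging a pixel capacitor through resistance R approaches its target as
# 1 - exp(-t/tau). Settling to within 1/512 of full scale (8-bit analog
# accuracy) needs ln(512) ~ 6.24 time constants; a digital threshold pixel
# needs only about one.

from math import log, exp

taus_analog = log(512)                 # ~6.24 tau for 1/512 precision
print(f"analog LCD pixel: {taus_analog:.2f} tau "
      f"(residual error {exp(-taus_analog):.4f} of full scale)")
print(f"digital TMOS pixel: ~1 tau (residual {exp(-1):.2f}, "
      "acceptable because the pixel only needs to cross a threshold)")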


Fig. 10 Conceptual illustration of AMLCD layers: backlight, rear polarizer, TFT backplane, liquid crystal layer, color filters, front polarizer, and cover plate

Manufacturability and Display Assembly
Uni-Pixel has conducted evaluation projects with industry experts that have concluded that TMOS displays can be fabricated on existing LCD lines by removing many steps in the LCD manufacturing process that are not required for TMOS. This macroscale simplification of an LCD foundry has implications for yields and for the cost to produce TMOS displays versus LCD displays. A simpler architecture makes possible a potentially simpler manufacturing process (Figs. 10 and 11). TMOS architectures not only provide performance advantages, they also reduce the cost of display manufacturing relative to mainstream technologies. Significantly, greater yields for a TMOS arise from its fewer components, the potential for a registration-free assembly, its reduction of the TFT count by a factor of at least two-thirds, and its comparatively larger feature sizes. The fabrication of the Active Layer film, with its proprietary optical microstructures to optimize light output during pixel actuation, is innovative insofar as no such thin contiguous polymer sheet structures have previously been deployed within a MEOPS architecture to form pixels that exploit FTIR (Fig. 12).

Direction for Future Research
A few areas for further research are relevant for optimizing the overall Active Layer film performance, reliability, and durability. Films with different elastic properties have been simulated using MEMS models, and the effect of mechanical properties on the actuation characteristics has been estimated. A few promising candidate materials have been identified for future evaluation. Use of monolithic films with directly embossed structures is also an important consideration for better control of optical and electromechanical properties, as this eliminates the unwanted contribution from carrier films. During its operating lifetime, the active film undergoes many millions of cycles of contact with the Light Guide surface. A fundamental understanding of the adhesion, residual stresses, and wear, if any, of the lens material is essential. Further optimization of the film stack used in the Light Guide substrate would also enable improved performance. The present design uses traditional dielectric films including SiO2, Si3N4, and other planarizing polymer layers. Research into other highly transparent, index-matched dielectric films with improved electrical properties would enable lower operating voltages and better overall performance. Additional research can be done to further the knowledge of the potential power savings associated with the stored mechanical energy in each pixel and the timing relationship of the switching of the charge/discharge paths. Suffice it to say that the potential for power savings that can be achieved with this architecture has not yet been fully explored.

Fig. 11 Conceptual illustration of TMOS layers: TFT backplane, light guide, and illumination system, plus the Active Layer film. Note that TMOS has far fewer layers, and therefore fewer manufacturing steps (and thus fewer failure points), than LCD

Fig. 12 In this early fully functional TMOS prototype, the Active Layer is sandwiched between a cover glass and the TFT backplane/Light Guide assembly

Summary/Conclusion
TMOS has fewer layers and components (things to assemble) than a comparable LCD display, which leads to a lower-cost display. With fewer absorptive layers and higher optical efficiency, TMOS is brighter and/or uses less power for the same brightness. The Active Layer membrane operates over a wider temperature range than liquid crystals. Without color filters, a larger color gamut can be achieved. Also, by using RGB LEDs with no color filters, the color gamut can be dynamically adjusted to match the gamut of the source for true color reproduction. TMOS displays have gone from concept to live demonstration of working prototypes. Existing operational prototypes are a first-line proof that a properly constructed TMOS display can achieve its performance goals.

Further Reading
Armitage D, Underwood I, Wu S-T (2006) Introduction to microdisplays, Wiley SID series in display technology. Wiley, Chichester, pp 25–28
Castet E, Jeanjean S, Masson GS (2002) Motion perception of saccade-induced retinal translation. Proc Natl Acad Sci 99(23):15159–15163
Cox B, Selbrede M (2007) Inside FPD: UniPixel's TMOS display technology. Nikkei Microdevices 264:43–47
de Fornel F (2001) Evanescent waves: from Newtonian optics to atomic optics, 1st edn. Springer, New York, pp 18–28
den Boer W (2005) Active matrix liquid crystal displays. Elsevier, Burlington
Hornbeck L (1998) Current status and future applications for DMD™-based projection displays. In: Proceedings of the IDW '98 – the 5th international display workshops, Kobe, Japan, pp 65–68
Lee J-H, Liu DN, Wu S-T (2008) Introduction to flat panel displays: fundamentals and applications, Wiley SID series in display technology. Wiley, Chichester
Li Z, Meng H (2006) Organic light emitting materials and devices. CRC Taylor and Francis, Boca Raton
Lueder E (2010) Liquid crystal displays, Wiley SID series in display technology. Wiley, Chichester
Mohr J, Hollenbach U, Last A, Wallrabe U (2004) Polymer technologies: a way to low-cost optical components and systems. SPIE Proceedings 5453: Micro-Optics-I, pp 1–12
Nalwa HS (2008) Handbook of organic electronics and photonics (3-volume set). American Scientific, Dordrecht
Van Ostrand D, Cox B (2008) Time multiplexed optical shutter. SID Symp Dig 39:1054–1057
Worgull M (2009) Hot embossing: theory and technology of microreplication (micro and nano technologies). William Andrew, Burlington
Yang D-K, Wu S-T (2006) Fundamentals of liquid crystal devices, Wiley SID series in display technology. Wiley, Chichester
Yohso A, Ukai K (2006) How color break-up occurs in the human visual system: the mechanism of the color break-up phenomenon. J Soc Inf Disp 14(12):1127–1133
Zhu S, Yu AW, Hawley D, Roy R (1986) Frustrated total internal reflection: a demonstration and review. Am J Phys 54(7):601–607


Introduction to 3D Displays
Barry G. Blundell and Mark Fihn

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Spatial Perception and Historical Overview . . . . . . . . . . . . 3
Visual Considerations . . . . . . . . . . . . . . . . . . . . . . . 8
Classes of 3D Display . . . . . . . . . . . . . . . . . . . . . . . 9
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . 13

Abstract

3D display technologies are evolving very rapidly, particularly in respect of autostereoscopic systems which are able to be used without viewing glasses, and which provide much greater freedom in vantage position. However, significant questions still remain in relation to the perceptual mechanisms that underpin the 3D experience. Here we provide general background discussion as a precursor to other entries which review approaches in more detail.

Keywords

3D display • Classes • Spatial perception and historical overview • Visual considerations • 3D TV • Center of projection (COP) • Cyclostereoscope • Disparity independent • Luminance disparity • Parallax barrier • Parallax for observer dynamics (POD) • Pulfrich effect • Radial raster barrier • Stereoscope • Temporal disparity • Zograscope

B.G. Blundell (*) School of Computer and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand e-mail: [email protected]
M. Fihn Veritas et Visus, Temple, TX, USA e-mail: [email protected]; markfi[email protected]
# Springer-Verlag Berlin Heidelberg 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_108-2


Introduction
In its day, the advent of color TV and associated content represented a natural evolutionary development. Major technical challenges were quickly overcome, and audience expectations fueled the growth of an extensive industry. More recently, even greater commercial opportunities were offered by the introduction of thin panel TV display technologies. In a remarkably short period of time, the CRT (which had underpinned television since the 1930s) was cast into landfill history. Audiences quickly demanded larger-format, higher-quality screens, and sales flourished. Surely, the time had now come for a further bold evolutionary step in which TV would finally escape flatland and embrace the third dimension. Audience anticipation was fueled by extensive media hype, and this also increased short-term revenue-driven expectations. Against this backdrop, fundamental considerations were invariably cast to one side. Could there be any doubt that, from a technical perspective, the successful commercialization of 3D TV simply required support for the binocular parallax cue to depth, and that given appropriate content (produced directly in 3D or up-converted from 2D), any inconvenience associated with the need to don viewing glasses would be more than offset by the pleasure of the visual experience? As with the CRT, 2D screens would quickly become obsolete, and sales of 3D displays would be measured in terms of many millions of units. In reality, audience interest in 3D TV rapidly waned: up-converted TV content does not generally produce captivating results, viewing glasses are not widely accepted, and as many (including children) watch TV for significant periods of time, the interface between display and visual system is perhaps more complex than originally anticipated. Despite this debacle, from a technology perspective, we have never been better placed to develop innovative, cost-effective glasses-free systems able to address the real needs of key applications. However, there is much to be learned from the recent ill-fated commercial foray into glasses-based 3D TV. It is a history which reconfirms that 3D is not simply a natural incremental extension of 2D but rather needs to be considered as a potentially highly innovative tableau in its own right. As such, it is optimally placed to address certain application requirements but does not (and should not) represent a universally applicable extension to the 2D screen. Additionally, it is evident that a technology-centric focus is insufficient, and in order to learn how to best use and adapt to the 3D tableau (which offers major advances in both visualization and synergistic interaction), there is a need to adopt a strongly interdisciplinary approach and to fully embrace the remarkable capabilities of the human visual system.


Spatial Perception and Historical Overview
In the first half of the nineteenth century, Charles Wheatstone demonstrated the role of binocular vision in enabling images depicted on a 2D surface to exhibit spatial attributes that we invariably associate with our everyday experience of our physical surroundings. Parallel advances in photographic techniques enhanced the commercial opportunities of this research, and stereophotography flourished. By way of example, in 1854 George Swan Nottage founded the London Stereoscopic Company. Within 4 years, he had sold over half a million stereoscopes, and it was reported that his well-traveled photographic team had produced around 100,000 stereoscopic images of famous and exotic locations. This was indeed the era of the armchair traveler. The history surrounding the invention and development of the stereoscope is well documented, and support for binocular parallax forms the fundamental basis for practically all of today's commercial 3D systems. Less attention is invariably given to earlier interest in techniques able to unlock the spatial characteristics of some forms of surface rendered content. Two early exemplar approaches adopt a similar strategy by ensuring monocular viewing from a defined location (the center of projection (COP)) in such a way as to maximize the extent to which the image under observation occupies the visual field (maximum immersion). The first of these predates Wheatstone's work by more than four hundred years and forms a backdrop to Filippo Brunelleschi's (1377–1446) demonstration of accurate, mathematically based perspective painting. On the basis of first-hand knowledge, Brunelleschi's biographer Antonio di Tuccio Manetti reports the adoption of a "peepshow" technique in which the viewer was positioned behind the completed work and placed an eye to a small hole cut through the panel. A mirrored surface was positioned in front of the painting, enabling its reflected image to be seen (see, e.g., Kemp 1978) (Fig. 1).


From the position marked by the yellow marble disc, the arches supported by columns at both ends of the ceiling are seen to stand upright into space. They are seen in three dimensions, with a strength of illusion similar to that given by the stereoscope.

Fig. 1 John Pugh's "Art Imitating Life Imitating Art Imitating Life." The lower left image is an early concept layout; the lower right shows Pugh painting the statue (Images from John Pugh, http://artofjohnpugh.com, with courtesy of John Pugh)

Fig. 2 The Zograscope (commercialized in ~1745) and exemplar image (Zograscope images often employed a strong perspective framework); http://peintres-officiels-de-la-marine.com/gravures/vue-optique.htm with courtesy of Claude Durieux

The above examples employ techniques that maximize perspective accuracy and minimize both surface awareness and accurate judgment of image distance. Both Brunelleschi and van Hoogstraten eliminated the binocular cues, while Pozzo ensured that the viewing distance was large enough to weaken their efficacy. The Zograscope (see Fig. 2) was introduced circa 1745 and was intended to enhance the spatial realism of images rendered on a planar surface. This catalyzed the first commercial sales of 3D-like viewing devices and associated content. Toward the end of the eighteenth century, the Phantasmagoria (see, e.g., Fig. 3 and Blundell 2011) began to grow in popularity. Minimization of screen surface awareness was a key feature of this multisensory immersive experience, and with the advent of the Phantascope, this was coupled with image dynamics (acting on the linear perspective cue to depth) to generate a strong impression of emergence and recession. In “Letters On Natural Magic,” Sir David Brewster writes: The exhibition of these transmutations was followed by spectres, skeletons, and terrific figures, which, instead of receding and vanishing as before, suddenly advanced upon the spectators, becoming larger as they approached them, and finally vanished by appearing to sink into the ground. The effect of this part of the exhibition was naturally the most impressive. The spectators were not only surprised but agitated and many of them were of opinion that they could have touched the figures. M. Robertson, at Paris, introduced along with his pictures the direct shadows of living objects, which imitated coarsely the appearance of those objects in a dark night or in moonlight. (Brewster 1883)

The invention of the stereoscope provided conclusive evidence of the ability of binocular vision (and specifically spatial (geometric) retinal disparities) to deliver a strong sense of 3D spatial perception. Koenderink et al. (1994) observe: "Historically, a sharp caesura occurs with the invention of the stereoscope...any depth not based on [binocular] stereopsis was suspect." Certainly in the light of the opportunities offered by rapidly advancing stereoscopic techniques, this break with the past is quite understandable. Without doubt, stereophotography offered a generally predictable form of 3D which could be readily experienced by most viewers. In addition, the geometric constructs that describe binocular parallax were simple to understand, could be used to compose perceived depth, and seemed to provide a sufficient (albeit highly superficial) basis for explaining binocular 3D. In contrast, previous techniques delivered a mellower 3D experience, did not provide a means of composing perceived depth, were not (and are still not) easily explained, and were much less predictable (content that works for one person does not necessarily work for another, and often content simply does not work at all).

The stereoscopes devised by Charles Wheatstone and subsequently by David Brewster facilitate the fusing of pairs of images. In contrast, the anaglyph technique enables left and right stereo views to be incorporated in a single image – each being uniquely identified by means of color coding. Viewing glasses comprising passive (color) filters perform color separation and transmit only the correct view to each eye. This approach dates back to the work of Wilhelm Rollmann in Germany (circa 1853) (Symanzik 1993) and Joseph D'Almeida (circa 1858). However, some 30 years appear to have passed before Ducos Du Hauron introduced the term "anaglyph" (derived from the Greek anagluphein – "to carve in relief"). It is often assumed that the anaglyph technique precludes the delivery of 3D images comprising a rich spectrum of vibrant colors. However, this is not the case – see, e.g., ColorCode 3D (www.colorcode3d.com).

Fig. 3 The Phantasmagoria – an immersive multisensory experience. The invention of the Phantascope enabled focused images to exhibit a strong sense of emergence and recession. For optimal results, display screen visibility (surface awareness) had to be minimized (Image from "l'Optique" by Fulgence Marion (1890). Reproduced from Thomas Weynants' Collection (see the "Early Visual Media" website: http://www.visual-media.be))

The invention of the parallax barrier (originally devised for two views by F.E. Ives in 1903 and extended to multiple views by Kanolt in 1918) and lenticular techniques (pioneered by F.E. Ives and H.E. Ives in the first half of the twentieth century) enabled 3D images to be experienced without recourse to viewing glasses/apparatus. Innovative work undertaken by Edmond Noaillon in the 1920s in relation to barrier-based 3D (Blundell 2014) is unfortunately largely overlooked in the literature. Noaillon sought to use the barrier approach in the implementation of glasses-free cinema and invented the cunning radial form of raster in which the opaque and transparent regions radiate from a common point located below the barrier-screen assembly (see Fig. 4). This configuration permits considerable freedom in viewing distance and in principle offered an arrangement whereby an audience located across the width and breadth of a cinema was able to experience glasses-free 3D. Noaillon also endeavored to increase the transmission efficiency of the barrier and worked on the introduction of oscillatory barrier motion. This latter objective was frustrated by the lack of appropriate electromechanical components needed to synchronize main barrier motion with that of an equivalent but much smaller barrier (microfilter) located within the projection system. In addition, since the speed of an oscillating barrier continually varies, the implementation was fundamentally flawed. However, in the 1940s cyclic (rotational) motion of a radial barrier was used by Francois Savoye in the successful development and commercialization of the cyclostereoscope (Blundell 2014) (Fig. 5).

Fig. 4 The radial raster barrier-screen assembly employed in the first generation of glasses-free cinema which opened in Moscow early in 1941 (Blundell 2014)

The radial raster barrier was the core technology employed in the first glasses-free 3D cinema, which opened in Moscow in 1941. The screen measured ~5 m by ~3 m, and despite there being fewer than 400 seats, in a 4-month period (i.e., up until Russia's entry into WWII), approximately 500,000 people took the opportunity to experience 3D – glasses-free (Blundell 2014; Valyus 1966). Following the end of WWII, glasses-free 3D cinema reopened in Moscow in 1947 and subsequently flourished in other Russian cities. The earlier barrier form of radial raster was replaced by its lenticular rendition, which offered higher quality, brighter images. At about the same time, Hollywood led the first major foray into technically simpler glasses-based 3D cinema. This approach was able to work with the wide-screen format, whereas unfortunately the radial raster is much more limited in this respect.

Fig. 5 Terracotta figures created more than 2,000 years ago – forming part of a great army intended to guard the First Emperor of the Qin Dynasty on his voyage to the afterlife. Historical study suggests that accurately rendering 3D images in a 3D space is a more intuitive process than accurately mapping a 3D scene onto a 2D tableau (Blundell 2011)

Throughout the twentieth century, the grail of 3D has attracted (and continues to attract) the attention of many talented and highly inventive researchers who have proposed and prototyped many diverse technologies. Unfortunately, progress has often been hampered by a lack of suitable content, by the absence of applications able to truly capitalize on 3D display capabilities, and by the lack of necessary base technologies. For example, in this latter respect, in the late 1920s (several years after his demonstration of electromechanical TV), John Logie Baird turned his attention to 3D displays. This led to him patenting a swept-volume volumetric system able (in principle) to capture volumetric data and display this within a cylindrical image space (Blundell 2007). In the absence of simple electronic components – which today we take for granted – he had no way of processing data prior to its depiction, nor did he have the luxury of suitable buffering/storage facilities. Thus the data capture and display subsystems had to directly connect together, and this greatly constrained their respective designs. Indeed the lack of cost-effective high-performance graphics engine technologies severely limited the practicality of nearly all volumetric systems prototyped prior to the 1970s, and it was not until relatively recently that affordable graphics processing hardware offering appropriate performance could be readily purchased, thereby circumventing significant component-level design and implementation work.

Visual Considerations
Wheatstone's demonstration of the ability of binocular parallax to elicit a strong sense of spatial perception in the absence of other cues to depth remains as powerful today as when first elucidated. Since that time, practically all effort directed toward the development of 3D display paradigms has focused on supporting the formation of the spatial (geometric) retinal disparities which are directly derived from binocular vision. However, non-binocular parallax-based 3D techniques have continued to attract considerable research attention primarily within the arts and vision science communities. In today's world of "post-hype 3D," it is perhaps essential to embrace this work and to reexamine the nature of spatial perception (stereopsis) and the intended functionality of 3D systems. Adapting Vishwanath (2014), conventional stereopsis can be loosely described as the externalized visual impression of tangible form, order, and immersion that is set in a 3D framework of pervasive negative space and which is derived from binocular observation of our spatial surroundings, or by appropriate viewing of stereoscopic images. The expression "negative space" is assumed to denote the translucent permeable space in which physical entities generally appear to coexist and plays a crucial role in enhancing the tangibility of their spatial separation. Gabor (1960) coined the term "stereoscopy by default" in relation to spatial perception derived from monocular viewing of dynamic image content. It is convenient to slightly modify Gabor's terminology and use the term "stereopsis by default" (s-bd) when referring to a visual perceptual experience in which attributes associated with conventional stereopsis are derived from monocular or binocular viewing of single or identical pairs of images captured from a particular vantage point and which comprise static or kinetic content (Blundell 2015). Using the terms "conventional stereopsis" and "stereopsis by default" (in literature this is also referred to as "paradoxical monocular stereopsis" and the "plastic effect") not only avoids confusion but, most importantly, emphasizes that the overarching experience of stereopsis is not a single well-defined perceptual experience, but rather is multifaceted and driven by the complex interaction of multiple stimuli, thereby reflecting the powerful plasticity and adaptability of the visual system.

Classes of 3D Display
The overarching function of any 3D display system is to interface with the visual system to support spatial perception in a manner that conforms to our natural expectations. Table 1 provides a loose framework for the classification of 3D techniques and includes support for both conventional stereopsis and s-bd. In the case of the former, three classes of display are indicated on the right-hand side of the table. Stereoscopic systems are assumed to support stereopsis from a single vantage point and, other than in the case of direct fusion of stereo pairs, require the use of some form of viewing apparatus/glasses. In contrast, the autostereoscopic categories are implicitly assumed to support glasses-free viewing. The incorporation of two classes of autostereoscopic systems is primarily intended to distinguish between their ability to naturally satisfy the oculomotor cues to depth. We assume that Category I autostereoscopic systems are not able to naturally satisfy these cues, while in the case of Category II, the cues of accommodation and convergence are satisfied in a natural manner. In addition, for simplicity, it is assumed that this second category of autostereoscopic system is implicitly able to support motion parallax relating to changes in an observer's vantage point (parallax for observer dynamics (POD)). In the case of Category I systems, POD may or may not be satisfied. Although this classification structure is by no means perfect, it provides a useful framework for grouping display paradigms.

Table 1 A simple classification scheme embracing both conventional stereopsis and s-bd (Adapted from Blundell 2015)

Stereopsis-by-default (s-bd):
• Monocular s-bd, temporal disparity. Cue base: pictorial cues, temporal disparity. Viewing characteristics: maximization of immersion, single view. Image type: kinetic. Example techniques: viewing hole, "hole in the fist," or converging lens.
• Monocular s-bd, disparity independent. Cue base: pictorial cues, disparity independent. Viewing characteristics: maximization of immersion, single view. Image type: static. Example techniques: viewing hole, "hole in the fist," or converging lens.
• Binocular s-bd, temporal disparity. Cue base: pictorial cues, temporal disparity. Viewing characteristics: disruption of binocular cues, single view. Image type: kinetic. Example technique: converging lens.
• Binocular s-bd, luminance disparity. Cue base: pictorial cues, luminance disparity. Viewing characteristics: one neutral density filter, single view. Image type: kinetic. Example technique: Pulfrich effect.
• Binocular s-bd, disparity independent. Cue base: pictorial cues, disparity independent. Viewing characteristics: disruption of binocular cues, single view. Image type: static or kinetic. Example technique: Zograscope.
• Binocular s-bd, disparity independent. Cue base: pictorial cues, disparity independent. Viewing characteristics: synoptic viewing device, single view. Image type: static or kinetic. Example technique: synopter or equivalent.

Conventional stereopsis (binocular):
• Stereoscopic. Cue base: pictorial cues, spatial disparity. Viewing characteristics: direct or indirect, single view. Image type: static or kinetic. Example techniques: stereoscope, direct fusion, anaglyph, frame sequential.
• Autostereoscopic I. Cue base: pictorial cues, spatial disparity, POD. Viewing characteristics: direct, single or multiple views. Image type: static or kinetic. Example techniques: parallax barrier, lenticular.
• Autostereoscopic II. Cue base: pictorial cues, spatial disparity, POD, oculomotor cues. Viewing characteristics: direct, multiple views. Image type: static or kinetic. Example techniques: volumetric, electroholography, varifocal, virtual reality (VR/IVR).

Entries beginning on the left side of Table 1 relate to techniques that offer support for s-bd (although as noted previously, this is strongly dependent on image content/geometry). We loosely distinguish between approaches that offer to support s-bd via either monocular or binocular viewing. While conventional stereopsis is by definition assumed to be fundamentally underpinned by spatial (geometric) retinal disparities, we identify three disparity mechanisms that have the potential to support s-bd. These are briefly summarized below:
1. Temporal Disparity: This is assumed to relate to changes in the retinal image over time – specifically those directly induced by the dynamics of image components depicted on a surface. In connection with this method of stimulating s-bd, Denis Gabor observed: "Anybody can check 'stereoscopy by default' [s-bd] in the cinema by closing one eye and forming a tube with the hand around the other..." (Gabor 1960). This "hole-in-the-fist" approach is perhaps the simplest viewing technique and is generally quite effective provided that the size of the "hole" ensures that the display screen occupies the entire visual field (although with practice, this constraint can be relaxed). Alternatively, a small viewing hole cut in a piece of card will also suffice. Temporal disparity can provide strong support for s-bd and is not restricted to monocular viewing scenarios. Thus, for example, binocular viewing of kinetic image content depicted on a computer display via a converging lens (e.g., a magnifying glass or Fresnel lens) can also evoke s-bd.
2. Luminance Disparity: This underpins the Pulfrich effect and corresponds to a difference in the level of light impinging on the retinae. Given appropriate kinetic image content, the technique is able to evoke a strong sense of s-bd and, as with other s-bd techniques, invariably gives rise to a spatial perceptual experience which conforms to physical world expectations – false geometries generally appear to be absent.
3. Disparity Independent: This corresponds to viewing scenarios in which s-bd is evoked (or rather appears to be evoked) without any recourse to retinal disparities. Static image content may be viewed with one eye shaded or binocularly via a synopter (a device intended to present identical images to the two eyes). Exemplar viewing techniques are summarized in Table 1. Although these approaches can work well with some forms of image content, it is evident that monocular viewing of static images tends to yield the least vivid form of s-bd. Also, while s-bd may be clearly evident for some image components, it may be entirely lacking for others within the scene. In this context, Schlosberg (1941) indicates: "...we may get the plastic effect [s-bd], but find the depth more adequate in certain parts of the view than others."


In relation to the various approaches that can be adopted in order to evoke stereopsis, Enright (1993) observes: "...there can be little doubt that the most effective representations of the third dimension are those which involve [conventional] stereopsis; and that the second most effective way to convey a feeling of depth is through the use of image motion: optical flow patterns, image shear, motion parallax and the like [temporal disparity]. When both stereopsis and image motion are excluded, one is dealing with no more than third best [disparity independent]." The Pulfrich effect provides the simplest and most straightforward means of initially experiencing s-bd. The temporal disparity approach is slightly more problematic at the outset as not all kinetic content induces s-bd. Also, when first experimenting with this modality, it is helpful to adopt monocular viewing and employ a viewing aperture that ensures that the image largely fills the visual field. Disparity-independent techniques require more practice, although some researchers find the synopter-based technique straightforward. In general terms, a period of adaptation is often required (particularly in relation to disparity-independent approaches), and this may take several minutes. However, with a little practice, it becomes easier to experience s-bd, and viewing conditions may often be relaxed. The mechanisms that underpin s-bd remain uncertain. The most long-standing hypothesis focuses on depth cue coherence coupled with the minimization of surface awareness. Schlosberg (1941) suggests: "In normal binocular inspection of a picture the 'flatness' cues are strong enough to force the observer to see a flat picture; but if 'flatness' cues can be eliminated or weakened, or if the depth cues that are present can be sufficiently exaggerated, the perception takes on depth." Vishwanath and Hibbard (2013) and Vishwanath (2014) question the conventional explanations of s-bd and, on the basis of observations and experimental results, propose an alternative hypothesis.

Summary
From the standpoint of technology, we are well placed to develop cost-effective autostereoscopic display technologies which have the potential to satisfy real needs. Such technologies must not only address real needs but also integrate properly with the human visual system and capitalize on its remarkable capabilities. This not only requires that 3D images possess satisfactory visual attributes but also demands that components underpinning the display technology do not hamper the natural visual process. This latter requirement is of particular importance in the implementation of augmented reality headsets where optical components employed in image formation also act as a window through which the physical world is seen. The binocular parallax cue to depth plays a crucial role in the spatial perception of 3D physical space, and by presenting each eye with a slightly different view onto images rendered on a 2D surface, we are able to elicit a perceptual experience that largely mimics our natural world experience. However, as outlined above, binocular vision and the binocular parallax cue to depth are not fundamental requirements for vivid spatial perception, and other mechanisms can also give rise to stereopsis (which represents a complex multifaceted experience). Continued rigorous research into s-bd has the potential to greatly enhance understanding of the mechanisms supporting our perception of 3D. Results obtained to date suggest that display attributes can strongly influence initiation/perception of s-bd, and in this context, it is possible that large, ultra-high-definition screen formats (including curved screen profiles) may play an important role (Blundell 2015).

Further Reading
Blundell BG (2007) Enhanced visualization: making space for 3-D images. Wiley, Hoboken
Blundell BG (2011) 3D displays and spatial interaction: exploring the science, art, evolution and use of 3D technologies. Walker & Wood Ltd. Also available for download https://www.dropbox.com/sh/r96nhin3od4z81t/hepxE7-ju1. Accessed Mar 2015
Blundell BG (2012) On exemplar 3D display techniques: summary discussion. www.barrygblundell.com. Accessed Mar 2015
Blundell BG (2014) On aspects of glasses-free 3D cinema ~70 years ago. https://www.dropbox.com/sh/uw3yjnwn3qo9z1e/SyPYAkHSqc. Accessed Mar 2015
Blundell BG (2015) On curious 3D perception phenomena: stereoscopy by default, discussion document. www.barrygblundell.com. Accessed Mar 2015
Brewster SD (1883) Letters on natural magic. Chatto and Windus, London
Claparède E (1904) Stéréoscopie monoculaire paradoxale. Ann Ocul 132:465–466
Enright JT (1993) Paradoxical monocular stereopsis and perspective vergence. In: Ellis S (ed) Pictorial communication in virtual and real environments, 2nd edn. Taylor & Francis, Bristol, pp 567–576
Gabor D (1960) Three-dimensional cinema. New Sci 8(191):141–145
Ivanov SP (1945) Rastrovaia stereoskopiia v kino. Goskinizdat, Moskva
Kemp M (1978) Science, non-science and nonsense: the interpretation of Brunelleschi's perspective. Art History 1(2):134–161
Koenderink JJ, van Doorn AJ, Kappers AML (1994) On so-called paradoxical monocular stereoscopy. Perception 23:583–594
Koenderink JJ, Wijntjes M, van Doorn A (2013) Zograscopic viewing. i-Perception 4:192–206
Marion F (1890) L'Optique, 4th edn. Librairie Hachette et Cie, Paris
Okoshi T (1976) Three-dimensional imaging techniques. Academic Press, New York
Pirenne MH (1970) Optics, painting and photography. Cambridge University Press, London
Pirenne MH (1975) Vision and art. In: Carterette EC, Friedman MP (eds) Handbook of perception, vol 5, Seeing. Academic Press, New York, pp 433–490
Schlosberg H (1941) Stereoscopic depth from single pictures. Am J Psychol 54(4):601–605
Symanzik J (1993) Three-dimensional statistical graphics based on interactively animated anaglyphs. In: Proceedings of the section on statistical graphics, American Statistical Association, Alexandria
Tidbury LP, Black RH, O'Connor AR (2014) Perceiving 3D in the absence of measurable stereoacuity. Br Ir Orthopt J 11:34–38
Valyus NA (1966) Stereoscopy. The Focal Press, London
Vishwanath D (2011) Visual information in surface and depth perception: reconciling pictures and reality. In: Albertazzi L, van Tonder G, Vishwanath D (eds) Perception beyond inference: the information content of visual processes. MIT Press, Cambridge, MA, pp 201–240
Vishwanath D (2014) Towards a new theory of stereopsis. Psychol Rev 121(2):151–178
Vishwanath D, Hibbard PB (2013) Seeing in 3D with just one eye: stereopsis without binocular vision. Psychol Sci 24:1673–1685
Wade NJ (ed) (1983) Brewster and Wheatstone on vision. Academic Press, London/New York
Wade NJ (1987) On the late invention of the stereoscope. Perception 16:785–818

Human Factors of 3D Displays
Robert Earl Patterson

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Interocular Cross Talk . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Interocular Differences in Luminance and Contrast . . . . . . . . . . . 3
Accommodation-Vergence Mismatch . . . . . . . . . . . . . . . . . . . . 3
Stereoanomaly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Spatiotemporal Frequency Effects . . . . . . . . . . . . . . . . . . . . 6
Distance Scaling of Disparity . . . . . . . . . . . . . . . . . . . . . . 6
High-Level Cue Conflict . . . . . . . . . . . . . . . . . . . . . . . . . 8
Summary and Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 9
Directions for Future Research . . . . . . . . . . . . . . . . . . . . . 9
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Abstract

This chapter provides a selected review of a number of important perceptual and human factors issues that arise when three-dimensional (3D) displays are designed and used. Topics discussed include: interocular cross talk; interocular differences in luminance and contrast; accommodation-vergence mismatch; stereoanomaly; spatiotemporal frequency effects; distance scaling of disparity; and high-level cue conflict.

R.E. Patterson (*) Air Force Research Laboratory, Wright-Patterson AFB, OH, USA e-mail: [email protected]; [email protected] # Springer-Verlag Berlin Heidelberg (outside the USA) 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_109-2


Introduction
Over the past several years, there has been a growing interest in the development of high-quality displays that present binocular parallax information to the human visual system for inducing the perception of three-dimensional depth. The methods for presenting binocular parallax to an observer vary widely and include three broad categories of display: stereoscopic, holographic, and volumetric displays (Patterson 2015; Halle 1997). Because the technology for stereoscopic displays is more developed, and more widely used, than those based on holography or volumetric methods, the human factors issues involved in the viewing of stereoscopic displays will be emphasized in this chapter, with only brief mention of holographic and volumetric displays. Despite the diverse methods for creating 3D displays, which include stereo spatial multiplexing as well as temporal multiplexing (i.e., field sequential) techniques (Patterson 2015), there remain common human factors issues that arise when viewing such displays. The purpose of this chapter is to provide a selected review of these important issues so that they can be considered when designing and using 3D displays. In doing so, the following topics will be covered: interocular cross talk; interocular differences in luminance and contrast; accommodation-vergence mismatch; stereoanomaly; spatiotemporal frequency effects; distance scaling of disparity; and high-level cue conflict.

Interocular Cross Talk
Interocular cross talk refers to a situation in which information from one eye's view leaks into the other eye. Because cross talk serves to introduce a form of binocular, or interocular, noise into the visual system (Yeh and Silverstein 1990) that degrades stereopsis in all of its respects, this is probably the most serious human factors issue. Interocular cross talk can occur with both spatial multiplexing and temporal multiplexing stereo display techniques. For example, with spatial multiplexing, interocular cross talk can occur if there is significant chromatic aberration with lenticular displays or if there is significant diffraction with parallax barrier-type displays. With such autostereoscopic displays, there may be interocular cross talk if an observer is positioned at an incorrect viewing distance. With temporal multiplexing, interocular cross talk can occur if there is significant display persistence (Yeh and Silverstein 1990), wherein information from one eye's view persists past the termination of a given frame and thus leaks into the other eye when that eye's view is exposed. Studies have shown that as little as 2–7 % of interocular cross talk can significantly reduce the limits of binocular fusion and degrade image quality (Yeh and Silverstein 1990) and that as little as 5 % of cross talk can produce viewing discomfort (Kooi and Toet 2004). The remedy for this problem is to keep cross talk below 2 %. This issue arises mainly with stereo displays; cross talk should be less of a problem with either holographic displays or volumetric displays because the binocular parallax information is preserved in the differing directions of light emanating from the display, making it more difficult for the view of one eye to leak into the other eye (Patterson 2015).

Interocular Differences in Luminance and Contrast
Recent studies have revealed that stereo depth perception appears to be surprisingly robust despite significant interocular differences in luminance level (Kooi and Toet 2004; Boydstun et al. 2009). For example, the magnitude of perceived depth and depth discrimination thresholds were relatively unaffected by interocular luminance differences of up to 60 % (Boydstun et al. 2009), and visual discomfort was slight with interocular luminance differences of up to 25 % (Kooi and Toet 2004). Recent studies have also shown that stereo depth perception appears to be surprisingly robust despite significant interocular differences in stimulus contrast. For example, depth discrimination performance was largely unaffected by interocular contrast differences of up to 83 % (Hess et al. 2003), and visual discomfort was slight with interocular contrast differences of up to 25–50 % (Kooi and Toet 2004). Thus, for good stereo viewing, interocular differences in luminance or contrast should be kept below 25 % (Patterson 2015). Interocular differences in luminance and contrast would likely not be a problem with either holographic displays or volumetric displays. This is because, with these types of displays, the luminance information comes from one source, which should equate the information projected to the two eyes (Patterson 2015).
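As a rule of thumb, the 25 % guideline can be checked directly from measured per-eye values. A minimal sketch follows; note that expressing the interocular difference relative to the larger of the two values is our assumption, as the cited studies do not fix a single definition here.

```python
def interocular_diff_pct(left: float, right: float) -> float:
    """Interocular difference as a percentage of the larger value
    (applicable to luminance in cd/m^2 or to contrast values)."""
    return abs(left - right) / max(left, right) * 100.0

# Example: left eye 200 cd/m^2, right eye 160 cd/m^2 -> 20 % difference
diff = interocular_diff_pct(200.0, 160.0)
print(f"{diff:.0f} % interocular difference, "
      f"{'within' if diff < 25.0 else 'beyond'} the 25 % guideline")
```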

Accommodation-Vergence Mismatch
When viewing a stereo display, the stimulus for accommodation is the images on the surface of the display. However, when an observer converges to a virtual object appearing in depth in front of or behind the display, the vergence angle can be in conflict with accommodation. This is called accommodation-vergence mismatch (Wann et al. 1995). Due to the synergy between accommodation and vergence (Toates 1972, 1974), an accommodation-vergence mismatch can create problems such as eyestrain and visual discomfort (Patterson 2015; Wann et al. 1995). It is believed that converging or diverging to a depth plane that is different from the display surface will pull the accommodative response to that depth plane, thereby making the images on the display surface become blurred. This, in turn, would tend to drive the accommodative response back to the display surface, and a conflict between accommodative responding and vergence eye movements would ensue. It has been shown (Hoffman et al. 2008) that the presence of accommodation-vergence mismatch can hinder visual performance and cause visual fatigue.


However, the problem of accommodation-vergence mismatch should occur only for short viewing distances (Patterson 2015) due to the depth of field of the human eye. Depth of field refers to the range of distances in object space within which an image appears in sharp focus and is specified in meters. Depth of field is calculated from the depth of focus (which refers to the range of distances in image space within which an image appears in sharp focus and is given in terms of diopters) using the formula D = 1/F, where D is distance in meters and F is distance in diopters. For a given depth of focus, the depth of field will vary depending upon fixation distance, such that the eye can tolerate much larger intervals of depth when viewing from a far distance than from a near distance before images go out of focus. For example, a recent estimate of the total depth of focus comes from a comprehensive review of the literature (Wang and Ciuffreda 2006), from which it can be concluded that the average total depth of focus is on the order of 1.0 diopter (or, equivalently, 0.5 D in front of fixation and 0.5 D behind fixation). Thus, for a given viewing distance under consideration, the viewing distance in meters should be recast in diopters. Next, calculate the estimated closest point of the depth of field (in meters) from the observer by taking the reciprocal of the distance in diopters plus 0.5 D. The estimated farthest point of the depth of field (in meters) from the observer would be calculated as the reciprocal of the distance in diopters minus 0.5 D. Based on the total depth of focus being 1.0 diopter:
• For a fixation distance of 0.5 m, the total depth of field would range from about 0.1 m in front of fixation to about 0.17 m behind fixation.
• For a fixation distance of 1 m, from about 0.33 m in front of fixation to about 1.0 m behind fixation.
• For a fixation distance of 2 m, from about 1 m in front of fixation to an infinite distance behind fixation.
• For a fixation distance of 3 m, from 1.8 m in front of fixation to an infinite distance behind fixation.
• For a fixation distance of 20 m, such as when viewing 3D cinema (Patterson 2015), from about 18 m in front of fixation to an infinite distance behind fixation.
Thus, for 3D cinema, almost the entire viewing distance – from a couple meters in front of the user to an infinite distance away – represents the usable depth interval for which accommodation-vergence conflict should not occur. Note that these values are estimates given that the depth of focus is affected by several factors, including the luminance of the displayed imagery, which in turn affects pupil size and the level of resolution. Nonetheless, the total depth of field would be very large – from about 1 m in front of fixation to infinity behind fixation – when fixating an object approximately 2 m away and even larger when fixating objects farther away (Patterson 2015). Converging or diverging away from the display surface may pull accommodation to that position in depth, but if that position is within the depth of field, then the images of the stimulus on the display surface will still be in focus and the accommodative response would not be driven back to the display. A conflict between accommodative and vergence responses should not occur if the images on the display surface remain within the observer's depth of field. In this case, the accommodative response would be free to follow vergence without any conflict in responding. In general, there should likely be very little change in the accommodative state of the eyes when the observer directs his or her gaze to objects located a couple of meters away or farther, which should minimize discomfort and the other problems associated with accommodation-vergence mismatch. Accommodation-vergence mismatch should be a problem only with near-eye displays. More generally, the remedy for accommodation-vergence mismatch is to present the stereo depth information (i.e., the perceived depth planes) within the depth of field of the human eye or limit viewing duration. In presenting the stereo depth information within the depth of field, one can estimate the depth of field by using the calculations given above. One can estimate the location of the perceived depth plane by using the calculations given below in the section on "Distance Scaling of Disparity." When viewing holographic and volumetric displays, it is likely that the stimulus for accommodation and vergence will be the 3-D object, and thus the two oculomotor responses should be consistent with one another, that is, there should be no accommodation-vergence mismatch (Patterson 2015).
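The depth-of-field arithmetic above is straightforward to mechanize. The following is a minimal Python sketch assuming the average 1.0 D total depth of focus (0.5 D either side of fixation) cited from Wang and Ciuffreda (2006); the function name and layout are illustrative rather than taken from the source.

```python
def depth_of_field(fixation_m: float, half_dof_d: float = 0.5):
    """Estimate the near and far limits (in meters) of the depth of field
    for a given fixation distance, using D = 1/F to convert between
    meters and diopters."""
    fixation_d = 1.0 / fixation_m            # fixation distance in diopters
    near = 1.0 / (fixation_d + half_dof_d)   # closest point in sharp focus
    # Beyond ~2 m, fixation_d - 0.5 D is nonpositive: far limit is infinity.
    far = float("inf") if fixation_d <= half_dof_d else 1.0 / (fixation_d - half_dof_d)
    return near, far

for dist in (0.5, 1.0, 2.0, 3.0, 20.0):
    near, far = depth_of_field(dist)
    far_txt = "infinity" if far == float("inf") else f"{far:.2f} m"
    print(f"fixation at {dist:>4} m: sharp from {near:.2f} m to {far_txt}")
```

Running this reproduces the worked examples in the text, e.g., fixation at 0.5 m gives sharp focus from 0.40 m to 0.67 m (about 0.1 m in front of fixation and 0.17 m behind it).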

Stereoanomaly
In chapter "▶ Binocular Vision and Depth Perception," under the section entitled "Horopter and Binocular Disparity," there is a discussion of the two directions of disparity, crossed versus uncrossed. Objects located in a depth plane in front of fixation (and the horopter) will create images with "crossed disparity," whereas objects located in a depth plane behind fixation (horopter) will create images with "uncrossed disparity." That section also discusses the idea that these two directions of disparity are processed differently by different sets of cortical neurons in the visual brain. For some individuals, perceived depth in a stereo display is reversed such that depth induced by crossed disparity is actually perceived as back depth, or depth induced by uncrossed disparity is actually perceived as front depth. This condition is referred to as "stereoanomaly" (Richards 1970, 1971; Patterson and Fox 1984). Under degraded stimulus conditions, such as brief stimulus exposure, about 20–30 % of individuals can express this problem with stereo viewing. This estimate does not include the 6–8 % of individuals who are stereoblind, which can be a medical condition resulting from strabismus. One explanation for stereoanomaly comes from the idea that the neural substrate for stereopsis in such individuals is abnormally insensitive to disparity in one or the other direction, which is revealed under degraded stimulus conditions such as brief exposures (Patterson and Fox 1984). The conditions of stereoanomaly and stereoblindness may limit the number of individuals who have the capability to use stereoscopic displays in certain applications. Therefore, it may be important to screen for stereoanomaly and stereoblindness (Patterson 2015). Regarding stereoanomaly, the likely remedy is to present disparity information under nondegraded conditions or bolster the disparity information with other depth or distance cues (van den Enden and Spekreijse 1989). There seems to be a lack of information about the viewing of holographic or volumetric displays by stereoanomalous or stereoblind individuals. It may be that the enriched depth cues afforded by holographic or volumetric displays would enable stereoanomalous individuals, and perhaps some stereoblind individuals, to appreciate a reasonable quality of depth in one or the other type of display (Patterson 2015).

Spatiotemporal Frequency Effects
In chapter "▶ Binocular Vision and Depth Perception," under the section entitled "Spatiotemporal Frequency Effects," there is a discussion of how the spatial and temporal frequencies contained in the images projected to the two eyes can affect Panum's fusional area. Thus, the information contained in that section is relevant for human factors issues. To summarize: for displayed imagery with fine details (e.g., 20 cyc/deg or higher) and relatively sustained stimulation, the total effective disparity range can be about 80 arcmin centered on the plane of fixation (the horopter) before binocular fusion is lost, and stereoacuity thresholds are about 20 arcsec. For imagery with coarse details (e.g., for spatial frequencies at and below 3 cyc/deg) and sustained stimulation, the total effective disparity range can be about 8 arcdeg centered on fixation, and stereoacuity thresholds are about 5 arcmin. In this latter case, however, stereoacuity thresholds can improve to about 20 arcsec when stimulation is moderately transient. Large field-of-view immersive displays that induce stereo with the time-multiplexing (field sequential) technique stimulate peripheral areas of the retinae, which respond especially well to moderate and high rates of temporal modulation. Thus, disruptive peripheral flicker may be perceived when viewing such displays (Patterson 2015). The remedy for this problem is to employ a high frame rate so that the visual system temporally integrates the intermittent information seen in the periphery. This issue of spatiotemporal frequency effects has not been considered with the viewing of holographic or volumetric displays.

Distance Scaling of Disparity In chapter “▶ Binocular Vision and Depth Perception,” under the subheading “Distance Scaling of Disparity,” there is a discussion of how binocular disparity information must be recalibrated and scaled in accordance with viewing distance information in order for reliable depth to be perceived, an operation referred to as disparity scaling.

Fig. 1 Drawing depicting the change in disparity magnitude with variation in viewing distance. The diagram in the top panel (a) shows a top-down view of two eyes (LE left eye, RE right eye) fixating point F at a short distance, whereas the diagram in the bottom panel (b) depicts two eyes fixating the same point F at a long distance. In both diagrams, the depth between F and Y is the same magnitude. Increasing the viewing distance causes disparity magnitude to decrease. For depth to be perceived reliably, viewing distance as well as disparity must be registered by the visual system (This figure was reproduced from Patterson (2009) with permission of The Society for Information Display)

Thus, the information contained in that section is also relevant for human factors issues. That part of chapter “▶ Binocular Vision and Depth Perception” mentions that the magnitude of binocular disparity varies approximately inversely with the square of the viewing distance in the real world. For example, see Fig. 1 (this chapter). If viewing distance to a constant interval of depth between two objects in the visual field is halved, then disparity will be approximately four times its initial value, and if viewing distance is doubled, disparity will be approximately one-fourth its original value.


However, when stereo displays are viewed, the magnitude of disparity varies approximately inversely with the first power of viewing distance. This is presumably due to the way stereo depth is created on a flat display, that is, with half-image separation on a flat surface instead of viewing objects which are actually positioned at different depths in the visual field (Cormack and Fox 1985). Thus, if viewing distance to a stereoscopic display depicting a given depth interval is halved, then disparity will be approximately twice its initial value, and if viewing distance is doubled, disparity will be approximately one-half its original value. When viewing stereo displays with symmetrical convergence and targets located near the midsagittal plane, disparity magnitude is computed as: r (radians) = S/D, where r is disparity, S is the separation of the half images on the stereoscopic display, and D is viewing distance (Cormack and Fox 1985). This can be seen in the expression for calculating the perceived depth in a stereo display: d = (D × S)/(I ± S), where d is predicted depth, D is viewing distance, S is separation between half images on the display screen, and I is interpupillary distance (Cormack and Fox 1985). Note that when disparity is crossed, the denominator is (I + S), and when disparity is uncrossed, the denominator is (I − S). In the previous section on accommodation-vergence mismatch, it was mentioned that one remedy for that mismatch is to present stereo depth information within the depth of field of the human eye and that one can calculate an estimate for the depth of field by using the expression provided in that section. Those calculations for the depth of field can be combined with the calculations for the perceived depth in a stereo display given above, in order to ensure that the perceived depth falls within the depth of field. That changes in viewing distance affect disparity magnitude differently with stereo displays than in the real world may have implications for mixed-reality or augmented-reality applications (Patterson 2015). With these kinds of applications, virtual stereo objects are projected into real-world scenes, creating perceptual interactions between virtual objects and real objects. If a user moves around his or her environment, then the perceived depth of the virtual stereo objects may vary with changes in viewing distance while the depth of the real objects remains unchanged, which may complicate the use of mixed-reality or augmented-reality displays. The magnitude of disparity should vary approximately inversely with the square of the viewing distance with holographic and volumetric displays, as in the real world.
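The two expressions above are simple enough to capture directly in code. The sketch below implements them as given (Cormack and Fox 1985); the function names, the 63 mm interpupillary distance, and the example numbers are illustrative assumptions, not values from the text.

```python
def disparity_radians(separation_m: float, distance_m: float) -> float:
    """r = S / D, for symmetrical convergence near the midsagittal plane."""
    return separation_m / distance_m

def predicted_depth_m(distance_m, separation_m, ipd_m=0.063, crossed=True):
    """d = (D * S) / (I +/- S): '+' for crossed, '-' for uncrossed disparity."""
    denom = ipd_m + separation_m if crossed else ipd_m - separation_m
    return (distance_m * separation_m) / denom

# Example: 6 mm half-image separation viewed from 2 m with a 63 mm IPD.
D, S = 2.0, 0.006
print(disparity_radians(S, D))                 # ~0.003 rad of disparity
print(predicted_depth_m(D, S, crossed=True))   # ~0.17 m in front of the screen
print(predicted_depth_m(D, S, crossed=False))  # ~0.21 m behind the screen
```

Predicted depths computed this way can be compared against the depth-of-field estimate mentioned above to check that displayed depth stays within comfortable limits.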

High-Level Cue Conflict In many applications, stereo displays are used to recreate real-world scenes by presenting various cues to depth and distance. Potential cues would include binocular disparity, motion parallax, linear perspective, and texture perspective. It is important that the various depth cues convey the same magnitude of depth; note that the absence of a cue in a display is likely to be registered by the visual system as a zero value. If the various cues convey different magnitudes of depth, then viewing discomfort is likely to occur if the display is viewed for a prolonged period of time. The basis for this discomfort is termed high-level cue conflict (Patterson and Silzars 2009). It is likely due to the intuitive reasoning system, which is strongly engaged by immersive displays entailing the perception of simultaneous, redundant cues, attempting and failing to make reasoned sense of the conflicting perceptual information. When viewing holographic and volumetric displays, it is likely that all potential cues to distance and depth would be in registration, that is, there should be no high-level cue conflict.

Summary and Conclusions
(1) Interocular crosstalk can occur with spatial multiplexing when the technology permits light from one eye’s image to leak into the partner eye (Lueder 2012). Limit interocular crosstalk to a value less than 2 %.
(2) Keep interocular luminance differences and interocular contrast differences less than 25 %.
(3) View stereo displays from a distance of 2 m or greater if possible, and present stereo depth information within the depth of field of the human eye, or limit viewing duration.
(4) Screen for stereoanomaly and stereoblindness if needed, present displayed imagery under nondegraded conditions, and bolster disparity information with other, nonconflicted, depth or distance cues.
(5) Know the spatiotemporal properties of the displayed imagery, which will allow one to predict human sensitivity to various ranges of disparity magnitudes.
(6) The depth perceived with stereo displays may vary in ways that are different from real-world depth when users move around their environment, which may complicate the use of mixed-reality or augmented-reality displays.
(7) Make congruent all depth and distance cues in the stereo display, including binocular parallax, motion parallax, and the perspective field cues.
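The numeric guidelines in points (1) to (3) lend themselves to an automated sanity check. The helper below is hypothetical, not part of any standard tool; it simply encodes those thresholds.

```python
def check_stereo_setup(crosstalk_pct, luminance_diff_pct,
                       contrast_diff_pct, viewing_distance_m):
    """Return a list of guideline violations; empty list means all checks pass."""
    issues = []
    if crosstalk_pct >= 2.0:
        issues.append("interocular crosstalk should be below 2 %")
    if luminance_diff_pct >= 25.0:
        issues.append("interocular luminance difference should be below 25 %")
    if contrast_diff_pct >= 25.0:
        issues.append("interocular contrast difference should be below 25 %")
    if viewing_distance_m < 2.0:
        issues.append("prefer viewing distances of 2 m or greater")
    return issues

print(check_stereo_setup(1.5, 10.0, 30.0, 1.2))
```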

Directions for Future Research There are several issues that deserve to attract future research efforts: (1) develop methods for decreasing interocular crosstalk; (2) develop methods for quick screening of stereoanomaly and stereoblindness, and determine whether bolstering disparity information with other depth or distance cues minimizes or eliminates symptoms of stereoanomaly; (3) determine to what degree changes in viewing distance affect depth perception in mixed-reality or augmented-reality applications; (4) determine to what degree adding motion parallax to a stereo display minimizes or eliminates high-level cue conflict; and (5) develop a line of research that investigates many of the issues discussed in this chapter for holographic displays (e.g., distance scaling of disparity information; stereoanomaly and stereoblindness; high-level cue conflict).


Further Reading
Boydstun A, Rogers J, Tripp L, Patterson R (2009) Stereo depth perception survives significant interocular luminance differences. J Soc Inf Disp 17:467–471
Cormack R, Fox R (1985) The computation of disparity and depth in stereograms. Percept Psychophys 38:375
Halle M (1997) Autostereoscopic displays and computer graphics. ACM SIGGRAPH 31:58
Hess R, Liu C, Wang Y-Z (2003) Differential binocular input and local stereopsis. Vision Res 43:2303–2313
Hoffman D, Girshick A, Akeley K, Banks M (2008) Vergence-accommodation conflicts hinder visual performance and cause visual fatigue. J Vis 8:1–30
Kooi F, Toet A (2004) Visual comfort of binocular and 3D displays. Displays 25:99–108
Lueder E (2012) 3D displays, Wiley series in display technology. Wiley, West Sussex
Patterson R (2009) Human factors of stereoscopic displays. Soc Inf Disp Int Symp Dig Tech Pap 40:805–807
Patterson R (2015) Human factors of stereoscopic 3D displays. Springer, London
Patterson R, Fox R (1984) The effect of testing method on stereoanomaly. Vision Res 24:403
Patterson R, Silzars A (2009) Immersive stereo displays, intuitive reasoning, and cognitive engineering. J Soc Inf Disp 17:443–448
Richards W (1970) Stereopsis and stereoblindness. Exp Brain Res 10:380–388
Richards W (1971) Stereoanomalous depth perception. J Opt Soc Am 61:410
Toates F (1972) Accommodation function of the human eye. Physiol Rev 52:828–863
Toates F (1974) Vergence eye movements. Doc Ophthalmol 37:153–214
van den Enden A, Spekreijse H (1989) Binocular depth reversals despite familiarity cues. Science 244:959–961
Wang B, Ciuffreda K (2006) Depth of focus of the human eye: theory and clinical applications. Surv Ophthalmol 51:75
Wann J, Ruston S, Mon-Williams M (1995) Natural problems for stereoscopic depth perception in virtual environments. Vision Res 35:2731–2736
Yeh Y, Silverstein L (1990) Limits of fusion and depth judgments in stereoscopic color displays. Hum Factors 32:45

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_110-2 # Springer-Verlag Berlin Heidelberg 2015

Introduction to Projected Stereoscopic Displays Lenny Lipton* Los Angeles, CA, USA

Abstract This chapter deals with the subject of front-projected stereoscopic displays and specifically with what are known as plano-stereoscopic displays, which are the most common type. “Plano,” for “planar,” indicates that the display is made up of two two-dimensional images that together form what is called a stereo pair. The purpose of a stereo pair is to replicate the way we see with our two eyes with slightly different perspective views. We will cover the major methods for projecting such displays, and the venues or applications for such displays will also be considered.

Introduction: Selection Techniques There are three ways that a plano-stereoscopic display can be selected. Selection is the process by which the appropriate image is seen by the appropriate eye, and the inappropriate image is blocked. Selection is required because a single display surface is used when projecting plano-stereoscopic images, and there needs to be some way to sort out the appropriate image for the appropriate eye. Broadly, there are two approaches that can be used for the projection of such images. One is to use two projectors, and the other is to use a single projector. Obviously, a single projector is more dependable and convenient; however, there can be circumstances in which using two projectors is advantageous, depending on the venue itself and the degree of complexity and control that can be exerted during projection. Getting more light on the screen is one advantage of dual projectors, perhaps the only advantage. The three techniques for image selection, which apply equally to single or dual projection, are selection by means of color or wavelength, selection by means of polarization, and selection by means of time, or “temporal selection.” And these can be combined with each other and projected using a single or double projector setup: 1. Selection by means of color is called the “anaglyph” – a technique that is over 150 years old and familiar to people who will immediately recognize the complementary-colored glasses (typically red and blue or red and green), often in cardboard and sometimes in plastic frames. The technique is used for magazine and book illustrations because it is compatible with commercial printing processes. It is a multiplexing technique using color, with each end of the spectrum used for encoding a perspective view. Without going into a detailed explanation, the eyewear filters block and pass the appropriate images as required. The result is pleasing to some people in some circumstances, but cannot be taken seriously as a general-purpose solution because the image is limited to monochrome or compromised color effects. A typical conference room projector with an ordinary screen can be made to work for the anaglyph, and such illustrations can be inserted into a PowerPoint presentation, for example. The audience members are asked to put on the glasses to see the stereoscopic slide, and the result can be satisfactory.

*Email: [email protected] Page 1 of 4

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_110-2 # Springer-Verlag Berlin Heidelberg 2015

2. The next selection technique, polarization, can be accomplished with a single or dual projection setup, and it works because of the polarization characteristics of light. A description of the physics involved is beyond the scope of this chapter. There are two kinds of polarization employed. At the highest end, circular polarization is employed, which allows for head tipping. But generally speaking, linear polarization is more widely used in location-based entertainment venues and theme parks because the eyewear required is less expensive than that needed for circular polarization. In the case of linear polarized image selection, the technique uses sheet polarizers, which have been available since the late 1930s. For a dual-projector setup, the sheet polarizers’ axes over each projector lens are orthogonal. A so-called silver screen (usually a vinyl screen painted with aluminum pigment) is employed for the projection screen, and linear polarizing glasses are used by audience members. Polarization is a characteristic of light that is a useful one in the context of stereoscopic displays, because it is possible to extinguish the unwanted eye’s perspective and pass the wanted perspective through eyewear that has appropriate polarization filters. Unlike the anaglyph, it allows for full color. Circular polarization selection allows for head tipping. In the case of linear polarization, only a little bit of head tipping will cause leakage of the unwanted image, producing a result that looks like a double exposure (a numerical sketch of this leakage follows this list). In the case of circular polarization, head tipping does not materially increase leakage, so it is to be preferred if quality of projection is the major consideration. 3. Temporal selection is based on the fact that a rapid sequence of left and right images, when alternated and viewed through an appropriate shuttering device, produces seamless, flicker-free 3D images. Displays of this kind have been used for years for viewing images on CRT monitors; for defense applications, molecular modeling, and computer-aided design; and on some occasions for medical imaging.
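The head-tipping leakage described for linear polarization in item 2 can be estimated from Malus’s law: with ideally crossed sheet polarizers, a head tilt of theta transmits a fraction sin²(theta) of the unwanted eye’s image. The sketch below works through the numbers under those idealized assumptions; real filters, screens, and switches add leakage of their own, so treat these figures as lower bounds.

```python
import math

def linear_polarizer_leakage(head_tip_deg: float) -> float:
    """Fraction of the unwanted eye's image transmitted: sin^2(theta)."""
    return math.sin(math.radians(head_tip_deg)) ** 2

for tip in (0, 5, 10, 20):
    pct = 100 * linear_polarizer_leakage(tip)
    print(f"{tip:2d} deg head tip -> {pct:.1f} % leakage")
# Even a 10 deg head tip leaks ~3 % of the unwanted image, enough to be
# seen as ghosting (a "double exposure") in high-contrast scenes.
```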

Venues We will now discuss the major applications for stereoscopic imaging. Each has its own set of solutions.

Conference Rooms and Trade Show Booths Stereoscopic projection using polarized image selection technology is used for conference rooms and trade show booths. Drug companies like the pizzazz of stereoscopic images in their trade show booths, and people in fields where it is necessary to visualize three-dimensional data, such as molecular modeling, oil and gas exploration, or computer-aided design, have also used 3D projection. Both dual and single projection approaches have been used. Dual projection approaches are more complex, because they require the coordination of two projectors in terms of precision geometry and illumination. Also, often such setups require considerable tweaking, not only at setup but sometimes throughout the course of projection. The method usually used is linear polarization for projection. But the advent of low-cost electronic projection, say of the so-called road warrior class, makes dual projection an attractive proposition. The time-multiplex or field-sequential technique has also been used in venues like this, where each observer wears shuttering eyewear. Shuttering eyewear for such applications is available from the StereoGraphics division of Real D, from MacNaughton, Inc. under the brand name NuVision, from i-O Display Systems, and others. People who are seeking to produce robust dual projection systems and have the budget for it would be well advised to consult with a systems integrator, such as Mechdyne in the United States or Inition in England. There are off-the-shelf products, such as those marketed by Lightspeed, which produce field-sequential stereoscopic images that can be projected onto any projection screen and viewed with shuttering eyewear. But for screens up to 70 in. in diagonal, the reader is advised to also consider a rear-projection television set based on Texas Instruments’ DLP® technology (see chapter “▶ DLP Projection Technology”) as manufactured by Samsung and Mitsubishi.

Theme Parks and Location-Based Entertainment Almost 30 years ago, stereoscopic projection was used by Disney at their Disneyland theme park, and since then, most major theme parks and world’s fairs have found the addition of a stereoscopic theater to be de rigueur. The technology used for projection in these situations, up until now, has been dual projection. Often, the setups involved 70 mm projectors that were interlocked, projecting onto a polarization-conserving screen using linear polarized light. These tended to be big screens, and all the audience members were equipped with polarizing glasses. The projected images involved extreme stereoscopic effects and also often off-screen effects – so-called 4D effects – in which audience members had their seats buzzed or were sprayed with water, ad nauseam. In the past decade or so, dual digital projectors have also been used. A survey of theme park theaters in southern California has led the author to the conclusion that the dual-projector technique cannot be counted on to consistently produce good results. Half of the venues visited had misalignment problems, which makes the case that dual projection setups, even in venues where one would expect care to be taken, are problematical. Lately, theme parks have become interested in single electronic projector solutions. The most common solution employed is one offered by Real D, using a technology called ALPS, which produces linear polarization in combination with the temporal selection technique using a single DLP projector. This general technique will be described in more detail in the section on theatrical projection. ALPS is an electrooptical switch that toggles the axis of linearly polarized light at field rate. Since both the left and right images emerge from the same optical path, they are treated optically identically, and there are no alignment or color and geometric symmetry issues. It should be noted that a major player in the location-based entertainment business, IMAX, has been using dual 70 mm projectors for decades. Because of the design, one could argue that they are but a single projection machine with two parts, a left and right projector. IMAX stereoscopic projection is usually carried out with expertise. The screens are extremely large, and they employ linear polarization. The IMAX medium has moved more into the mainstream in addition to being used in location-based entertainment, as IMAX theaters have become part of the multiplex theatrical experience. Lately, IMAX has broadened its offering to include digital projection on smaller screens using two coordinated digital projectors in newer locations. But by doing so, they are calling into question the raison d’être for the brand: Big.

The Theatrical Cinema In the past, the 3D theatrical cinema depended on two projectors, using schemes that were very much like that offered by the theme parks. As long as two projectors were required, the medium had no opportunity to be established, because of the difficulties of coordinating two machines to work harmoniously to specification. In 2005, with the introduction of the Real D® projection system based in part on the Texas Instruments DLP light engine (see chapter “▶ DLP Projection Technology”) as embodied in the projection machines from three manufacturers – Christie, NEC, and Barco – the stereoscopic cinema got an enormous boost. Today, there are several thousand such venues, with more than a 90 % share held by Real D. The Real D system is interesting in that it combines both temporal and polarization characteristics, and it can be classified as a temporal selection technique, a circular polarization technique, or both. The basis for the Real D system is the ZScreen, which the author helped to develop while at StereoGraphics Corporation and which was further developed by Matt Cowan, Josh Greer, and Gary Sharp at Real D for its particularly demanding application to the theatrical cinema. The ZScreen is an electrooptical modulator or switch that changes the characteristic of polarized light between left- and right-handed polarized light at the video field rate. The projector projects alternate fields of left and right perspective views coordinated with the left and right circularly polarized light. When projected on a proper polarization-conserving screen and when viewed through appropriate polarizing analyzing spectacles, the result is a superb quality stereoscopic image. As important, installation is straightforward and the system needs little or no ongoing tweaking or calibration. The latest version of the ZScreen, the XL ZScreen, has twice the transmission and achieves this by recovering light that would have been lost by an absorbing sheet polarizer. This product now enables a single projector to project onto screens that are 70 ft in width. An alternative system that is also combined with time multiplexing is the Dolby system, based on technology from INFITEC that Dolby licenses, which is an advanced form of anaglyph that produces a full color image of good quality. Yet another technique that uses the Texas Instruments digital projectors is the XpanD system based on eyewear manufactured by MacNaughton. It is a pure time-sequential technique, and like the Dolby system, it does not require a special polarization-conserving screen. Both the Dolby and the XpanD systems require a high-gain screen in order to achieve projection on screens that are bigger than, say, 35 ft because of the light level. The systems also demand reuse of the eyewear because of their cost. For further details of 3D cinema technology, see chapter “▶ 3D Cinema Technology.”

Conclusion Nothing has been said in this chapter about the preparation of stereoscopic images, but that requires a great deal of expertise as well. There are a number of off-the-shelf software packages for producing stereoscopic computer-generated images, and stereoscopic cameras are available. Users have been known to simply use a single camera moved through two locations. There are also service bureaus that can process a planar image and turn it into a stereoscopic image. For content creation for the theatrical cinema, many tools and many experts are available. Today, the projection of stereoscopic images is flourishing. There is a very high degree of awareness about the medium because of the success of the theatrical cinema, which has led to or added to interest in other areas such as corporate communications, theme parks, trade shows, and also the home. This chapter has limited its scope to an attempt to introduce the most basic concepts of the medium to professionals who may be in the field of displays or seek to apply stereoscopic displays.

Further Reading
Crowther B (1957) The lion’s share. E. P. Dutton, New York
Eyman S (1999) The speed of sound: Hollywood and the talkie revolution, 1926–1930. Johns Hopkins University Press, Baltimore
http://www.dolby.com/professional/technology/cinema/dolby-3ddigital.html
Lipton L (1982) Foundations of the stereoscopic cinema. Van Nostrand Reinhold, New York. http://www.andrewwoods3d.com/library/foundation.cfm
Lipton L (2001) The stereoscopic cinema: from film to digital projection. SMPTE J 110:586–593


Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_111-2 # Springer-Verlag Berlin Heidelberg 2015

Addressing Stereoscopic 3D Displays Matthew C. Forman* Create-3D, Sheffield, UK

Abstract An end-to-end three dimensional (3D) content delivery system must be able to make use of a range of different sources and types of 3D content, and be able to deliver high quality images through any of a number of stereoscopic 3D display technologies. This chapter looks at 3D picture formats for storage and transmission and how these map to current and future display devices and content sources. In addition, connectivity and standardization issues in end-to-end 3D display applications are considered.

List of Abbreviations
2D Two dimensional
3D Three dimensional
API Application programming interface
AVC Advanced video coding
DIBR Depth image based rendering
DMD Digital micro-mirror device
DVB Digital video broadcasting
DVI Digital visual interface
E-DDC Enhanced data display channel
HD High definition (television program material)
HDMI High-definition multimedia interface
ITU International telecommunication union
LC Liquid crystal
LCD Liquid crystal display
MVC Multiview video coding
SD Standard definition (television program material)
SDI Serial digital interface
SMPTE Society of motion picture and television engineers

Introduction Recent years have seen the successful revival of stereoscopic 3D film presentation in cinemas. The entertainment industry as a whole has since been working hard to follow this with solutions for in-home 3D television/home cinema, digital photography, video, and gaming. A key requirement for in-home 3D is consumer adoption of suitable new 2D/3D display units – such as computer-connected monitors and stand-alone televisions – that support effective 3D viewing while being capable of showing conventional non-stereoscopic SD and HD program material and 2D computer displays.


Fig. 1 3D picture encoding formats: (a) original stereo source (X × Y × R); (b) colour-encoded (anaglyph); (c) frame-sequential; (d) side-by-side; (e) over-under; (f) row-interlace; (g) checkerboard; (h) independent; (i) multi-view (8 shown; not to scale); (j) 2D + depth

The need for rapid development of 2D-compatible 3D offerings has necessitated the use of relatively mature stereoscopic display technologies that require the user to wear special glasses to demultiplex the display correctly to left- and right-eye views, though a new generation of autostereoscopic (glasses-free) displays is expected to follow. This need for rapid development of end-to-end 3D systems has led to a number of display technologies, both projection-based and direct-view, being in use. These use different methods to present stereo 3D content, and this in turn results in a range of incompatible drive requirements which must be fulfilled by an equally wide range of 3D content sources and types. Additionally, as the demand for 3D content accelerates, the need for standardization becomes more pressing, to ensure compatibility with future display technologies. This chapter will look at the factors and issues that arise when using practical and commercially available stereoscopic 3D display systems in real-world end-to-end systems that present 3D content. It first considers a number of common formats used to represent 3D content at different stages of an end-to-end 3D system, and their adaptability to, and compatibility with, various stereoscopic display technologies.


Display interfacing and connectivity is then considered, followed by two short application studies. These consider issues of connectivity, transmission, compression, and storage in 3D television and computer display, and recent and ongoing efforts to standardize elements of the delivery chain right up to the display hardware itself.

Picture Formats for Stereo 3D Display Systems making use of stereoscopic 3D content and displays have been developed – until very recently – in research environments, for specific niche application areas such as training, simulation, and visualization, and through the activities of enthusiast groups. This has led to a wide variety of formats being used to deliver content to displays, mostly developed ad hoc to suit the application at hand and the specific requirements of the display technology being used (Smolic et al. 2009). Most of these formats multiplex a stereo pair either spatially within each frame or in time. In the new commercial 3D environment, cross compatibility of formats and displays becomes significantly more important. This chapter reviews current and proposed 3D video data formats and examines issues of cross compatibility and adaptability to current and future 3D display technologies, existing content sources and intermediate storage systems, and transmission infrastructures. In some applications, it is necessary or desirable to contain the 3D content stream within the constraints of existing infrastructure or standards; this is known as Frame-compatible 3D. However, a stereo video stream will generally contain twice as much raw visual information as a conventional 2D one; a reduction of information and hence some loss of quality must occur to maintain frame compatibility. In other applications or areas of a delivery chain, this is not an issue. In the descriptions given, the source 3D video stream is defined as comprising two simultaneous video signals, each with a frame size of X pixels horizontally by Y pixels vertically, and a progressive frame rate of R frames per second, the combination denoted X × Y × R (see Fig. 1a).

Anaglyph/Color-Encoded

This class of methods for formatting a stereoscopic picture stream uses color filter methods to multiplex the left- and right-eye stereo views into a single frame at the source (Fig. 1b) (Judge 1935; ColorCode-3D 2011). The viewer wears glasses containing appropriate color filters (typically red/cyan, red/blue or blue/amber) so that only the correct intended view reaches each eye. The use of such filters inevitably interferes with the color rendering accuracy of the final 3D picture; however, this mature technique has the distinct advantage that it is highly compatible with conventional 2D transmission infrastructure and storage systems, only requiring an overall resolution of X × Y × R. It is therefore still a useful method for offering occasional 3D stereo content in systems where 2D material predominates.

Frame-Sequential/Field-Sequential In this scheme, incoming left- and right-eye picture frames are interleaved temporally (Fig. 1c). The result remains a conventional time-based stream of frame images, and the method therefore offers good compatibility with conventional HD video storage and transmission systems. The spatial (per-frame) resolution of each stereo view image is maintained, though the effective temporal resolution (frame rate) of the stereo stream must be halved if the stream is to fit within conventional HD bandwidth constraints (overall resolution X × Y × (R/2)). If this is not a requirement, the display frame rate can be doubled so that the original content stereo frame rate is maintained.


Frame-sequential is the native addressing format for stereo 3D display systems that use active LC shutter glasses, whether direct-view or projection-based. Since it is relatively mature, this is a common solution in the current generation of consumer 3D display technologies. Field-sequential is a related format for use where compatibility with interlaced transmission and display environments is needed (SD&A/SPIE 2004). Alternating display fields of each frame carry left- and right-eye stereo views.

Side-By-Side The side-by-side format retains the frame synchronization and rate of the source stereo video material, and places the left- and right-eye views next to one another on the left- and right-hand sides of each frame (Fig. 1d). In frame-compatible usage of this format, it is necessary to halve the horizontal resolution of each stereo view (overall stream resolution of (X/2) × Y × R) and then rescale it back upwards for display. Again, if 2D frame compatibility is not needed and a nonstandard double-width signal can be accommodated, original overall stereo signal quality can be retained. Frame-compatible side-by-side is natively compatible with some direct-view polarizing displays (i.e., using passive glasses) that have horizontally alternating polarization bands since these also operate at half horizontal resolution. This scheme is a popular choice for efficient frame-compatible transmission in existing HD television infrastructures as the loss of horizontal resolution is not deemed to be as detrimental to overall quality at the display as halving the frame rate, which would be needed if the frame-sequential format were adopted for transmission.
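As an illustration of the bookkeeping involved, the following NumPy sketch packs and unpacks a frame-compatible side-by-side frame. Simple column slicing stands in for the anti-alias filtering a real encoder would apply before decimation, and the function names are mine; over-under packing is the same operation applied to rows instead of columns.

```python
import numpy as np

def pack_side_by_side(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Halve each view horizontally and place them side by side."""
    # Sample the right view from offset columns so the two half-views
    # carry complementary spatial samples.
    return np.concatenate((left[:, 0::2], right[:, 1::2]), axis=1)

def unpack_side_by_side(frame: np.ndarray):
    """Split the packed frame and stretch each half back to full width."""
    half = frame.shape[1] // 2
    return frame[:, :half].repeat(2, axis=1), frame[:, half:].repeat(2, axis=1)

left = np.random.rand(1080, 1920)    # one frame per view of an X x Y x R stream
right = np.random.rand(1080, 1920)
packed = pack_side_by_side(left, right)   # still 1080 x 1920: frame-compatible
l2, r2 = unpack_side_by_side(packed)
```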

Over-Under/Top-Bottom/Above-Below This is similar in principle to the side-by-side format but places the two stereo views into the target frame in a vertical layout instead (Fig. 1e). When used in a frame-compatible system, a reduction in vertical frame resolution must take place (X × (Y/2) × R). The frame-compatible Over/Under scheme is directly compatible with most direct-view polarizing displays which employ lines of vertically alternating polarization as this technology inherently operates at half the vertical resolution of the original content.

Row Interlace/Line Sequential In this format, left- and right-eye stereo views from each pair in the source material are placed on alternating horizontal pixel lines in each target frame (Fig. 1f). Each stereo view must be reduced to half its original vertical spatial resolution, though the original frame rate of the stereo video stream is effectively maintained (X × (Y/2) × R). If interleaved lines are sampled from the appropriate vertical positions in the original views, the effective resolution of streaming frames as perceived by the viewer can be higher than it would be for a single static half resolution frame. As in the case of Over/Under, this scheme is again most directly compatible with direct-view polarizing displays which employ lines of vertically alternating polarization. This format may also be seen in rotated form (Column Interlace) for direct addressing of single stereo viewpoint parallax barrier autostereoscopic displays.
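A row-interlace packer, sampling the interleaved lines from the matching vertical positions in each view as noted above, is similarly brief. This is again an illustrative sketch, without the prefiltering a real system would use:

```python
import numpy as np

def pack_row_interlace(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    out = np.empty_like(left)
    out[0::2] = left[0::2]    # even display lines: left view, even source rows
    out[1::2] = right[1::2]   # odd display lines: right view, odd source rows
    return out
```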

Checkerboard In the Checkerboard scheme, left- and right-eye stereo views are alternated in a single frame in both the horizontal and vertical dimensions (Fig. 1g). This results in the raw pixel count of each view being halved; however, provided the original full resolution views were sampled according to the checkerboard layout, the effective resolution of the stereo pair as perceived is higher than it would be using a simple horizontal/vertical packing scheme since each view effectively “fills in” gaps left in the other. This scheme is particularly compatible with DMD-based projection displays as a consequence of the layout of their micro-mirror arrays (Woods 2009).
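In the same illustrative spirit, a checkerboard multiplex can be written as a masked merge of the two views; the helper below is a sketch, not production code:

```python
import numpy as np

def pack_checkerboard(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    rows = np.arange(left.shape[0])[:, None]
    cols = np.arange(left.shape[1])[None, :]
    use_left = (rows + cols) % 2 == 0      # left view on "black" squares
    return np.where(use_left, left, right)
```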

Independent Stereo This is a general class of formats where left- and right-eye stereo views are represented separately at the full resolution and frame rate of the original stereo content, where there is no requirement for conventional 2D HD frame compatibility (Fig. 1h). A specialized encoding format is used; this is generally an extension to an established transmission or storage standard.

Multi-view The 3D picture delivery formats described so far have been designed specifically to carry stereoscopic video data: each overall “frame” comprises a pair of images, one destined for the viewer’s left eye and one for their right eye. This provides for a single 3D viewpoint on the overall scene: the viewpoint that was originally used when the content was shot or rendered. These formats are therefore perfectly suitable for the “first generation” of 3D display systems making use of active shutter or passive glasses and no capability for “look-around.” A number of “second generation” 3D display systems may employ head or eye tracking and/or directionally selective autostereoscopic techniques such as the use of lenticular sheets. These systems require 3D data in a format which enables them to construct and replay a number of different stereo viewpoints, depending on the location of the viewer in front of the display. One such group of formats is Multi-view. Instead of a single stereo pair, a number of images are encoded as if taken from an angular range of viewpoints (Fig. 1i). It is possible to create a stereo pair from any two neighboring viewpoint images, and hence the scheme is compatible with both fixed stereo viewpoint and multi-viewpoint 3D displays. Encoding and compression formats have specifically been developed for multi-view image arrays, such as the MVC (Multiview Video Coding) amendment of H.264/AVC (Merkle et al. 2006), though the bandwidth and storage requirements are somewhat greater than for stereoscopic 3D formats.

2D + Depth The 2D + Depth format is an embodiment of an altogether different approach to representing a 3D video stream that can be displayed with multiple viewpoints (Fig. 1j). Instead of a direct encoding of two or more fixed viewpoints, it augments a conventional HD 2D frame with a separate depth map image, in which each pixel represents the scene depth of the corresponding pixel in the main 2D image frame. It therefore encodes the 3D scene at a higher level of abstraction than a stereoscopic or multi-view format, and arbitrary stereo viewpoints (within a supported range) can be synthesized at the display unit; this synthesis process is known in general as Depth Image Based Rendering (DIBR) (Fehn 2004; Morvan et al. 2008). A disadvantage of the basic scheme is that a shift of viewpoint can reveal background areas that contain no image information if they were previously occluded by foreground objects, though this can be avoided by including extra occlusion layers in the stream. Provided the firmware is in place to support this format, 2D + Depth is compatible with a wide range of 3D display technologies whether they are capable of displaying multiple viewpoints or not, and indeed with conventional HD displays. A number of other formats have been proposed which typically combine one or more of the basic methods to enhance flexibility and/or image quality, for example, Depth Enhanced Stereo. This offers a basic stereo pair but with per-view depth and occlusion maps.


Table 1 3D encoding formats defined in HDMI version 1.4a

Frame Packing – Full resolution (therefore not 2D frame-compatible) independent stereo format. Progressive scan and interlaced versions are defined.
Field alternative – Full resolution version of generic field-sequential stereo. Interlaced frame formats only.
Line alternative – Full resolution version of generic line-sequential stereo. Progressive formats only.
Side-by-Side (Full) – Full resolution version of generic side-by-side stereo format, i.e., twice normal encoded width. Progressive or interlaced.
L + depth – Separately encoded full resolution 2D image frame and depth map (generic 2D + depth format). Raw encoding similar to Frame Packing. Progressive only.
L + depth + graphics + graphics-depth – As L + depth, but with an extra full resolution graphics layer and corresponding depth map.
Top-and-Bottom – 2D frame-compatible (half vertical resolution) generic top-bottom stereo. Interlaced or progressive.
Side-by-Side (Half) – 2D frame-compatible (half horizontal resolution) generic side-by-side stereo format. Interlaced or progressive.

Fixed stereo viewpoint equipment can then display the stereo pair without further processing, though arbitrary views can also be synthesized effectively if necessary (Smolic et al. 2009).
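To make the DIBR idea concrete, the following deliberately naive sketch synthesizes a shifted view from a 2D frame and a normalized depth map. Real renderers handle occlusion ordering, hole filling, and sub-pixel resampling (Fehn 2004); none of that is attempted here, and the maximum-disparity parameter is an arbitrary choice for illustration.

```python
import numpy as np

def synthesize_view(image: np.ndarray, depth: np.ndarray,
                    max_disparity_px: int = 8) -> np.ndarray:
    """Shift each pixel horizontally in proportion to its depth (depth in [0, 1])."""
    h, w = depth.shape
    out = np.zeros_like(image)
    shifts = np.rint(depth * max_disparity_px).astype(int)
    for y in range(h):
        for x in range(w):
            nx = x + shifts[y, x]
            if 0 <= nx < w:
                out[y, nx] = image[y, x]
    return out  # unfilled pixels (disocclusions) remain zero
```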

Connections to 3D Displays A range of technologies is currently being used to offer 3D display, with a yet greater number at various stages of development. Historically, physically connecting a source of 3D content to a display has been achieved in a largely ad hoc manner according to the nature of the application, either using specialized proprietary or existing standard interfaces, sometimes with the aid of add-on adapters and signal processors. For example, polarizing dual projector stereoscopic installations run from a computer require separate left- and right-view video signals; here, two conventional display connections (e.g., DVI; see below) are needed on the source computer. For broad consumer and industry acceptance of 3D display in general, there is a need for interface standardization. A number of efforts have already been made in this area, and others are ongoing. In the following section, the current state of 3D display interfacing is reviewed.

HDMI (High-Definition Multimedia Interface) HDMI (HDMI Licensing 2010) is intended as a replacement for consumer analogue standards for connection between sources of content (e.g., television set-top boxes, disc-based players) and display equipment. It provides all-digital carriage of uncompressed HD video and multiple channels of audio. It was designed to be compatible (at the video signal level) with the earlier DVI-D standard (Digital Display Working Group 1999), most commonly used for computer-monitor interfacing, given the appropriate cables. Although originally developed to support 2D HD video and audio, since version 1.4 HDMI has explicitly included support for 3D video formats. A wide range of formats is supported in recognition of the fact that any one format may not be the optimal choice for all sources of content. The standard defines the way in which each 3D picture format is encoded into the low-level video stream and also provides a method for a source device to learn the format capabilities of a display (“sink”) device via the E-DDC (Enhanced Display Data Channel) mechanism. Table 1 lists the 3D picture formats allowed by HDMI version 1.4a. The standard dictates that a conforming 3D display must support the Frame Packing, Side-by-Side (Half), and Top-and-Bottom formats at certain specified resolution and frame rate combinations. If the source device has material to offer in a different format from one of those supported, it must perform transparent conversion to one that is supported. The advantage of standardizing 3D connection formats in such a way is interoperability of 3D content sources with 3D display equipment whatever the underlying techniques and addressing formats they may be using to present the picture. While this is essential to drive consumer and industry take-up, the rigid support of specific formats and resolutions does not generally suit professional and “power-user” environments.

DVI (Digital Visual Interface) DVI is an earlier standard mainly used for connection of personal computers and monitors (Digital Display Working Group 1999). It offers analogue (DVI-A) and digital (DVI-D) signaling options together over the same cable. Unlike HDMI, it does not support 3D picture formats natively despite having many signaling and protocol features in common with HDMI. This is however an advantage in some circumstances, where a lower level video link protocol is desirable or necessary. Where it is used in 3D applications, any encoding of stereoscopic picture information over DVI must therefore be done at the application level, and in a manner that directly suits the low-level addressing requirements of the display equipment attached. Examples are the direct display of Frame-sequential stereo using a doubled frame rate for use with active shutter glasses and dual DVI ports being used simultaneously to drive a polarizing dual projector installation.

DisplayPort The new DisplayPort interface has been developed by the Video Electronics Standards Association chiefly as a replacement for computer-display connections (VESA 2009). It offers a significantly greater digital video throughput than earlier standards such as DVI and HDMI, to enable it to support very high display resolutions and frame rates. It has also been designed to reduce display device complexity since it specifies internal connection to display device components (such as LCD panels) in addition to external connection between devices. It also provides signaling protocols for control purposes and for source and sink devices to share information on their capabilities. An advantage, particularly valid for new 3D formats, is that DisplayPort uses a general packet-based encapsulation of the video signal rather than being constrained to planar raster representations. DisplayPort has included direct support for stereo 3D signals since version 1.1a of January 2008. This defines a frame-sequential and a field-sequential scheme for progressive and interlaced stereoscopic video respectively. These can operate at any resolution and frame rate that may be required, provided they can be transmitted at or below the maximum bit rate allowed by the standard. Version 1.2 of the standard has added a number of further stereo 3D features (Kobayashi 2010):
• A high frame rate full HD frame-sequential mode offering up to 120 fps for each eye
• Further formats: side-by-side, pixel interleaved, dual interface, and stacked
• A facility for sources to read back display hardware 3D capabilities

SDI (Serial Digital Interface) The SDI family of standards, originated by SMPTE, defines a coaxial cable-based interface for uncompressed digital video in professional applications. It is often used for interconnection of video cameras, recording devices, and television studio equipment, including display monitors. The original version of the standard (SDI) defines an interface capable of carrying SD video (Society of Motion Picture and Television Engineers 2008a). This was later enhanced to offer higher bit rates to carry HD video at resolutions up to 1,080 line interlaced (HD SDI) (Society of Motion Picture and Television Engineers 2008b), with the latest enhancement (known as 3G SDI) providing for transmission of higher resolution HD formats such as 1,080 line progressive (Society of Motion Picture and Television Engineers 2006). SDI interfaces do not directly support stereoscopic 3D video formats or translation between them; this is not generally as important a consideration in professional environments as it is in consumer equipment. However, SDI is often used to carry 3D video signals in 3D content creation workflows for connection of stereo camera rigs and processing and recording equipment using a left/right pair of connectors. Stereoscopic 3D display monitors are also available which carry such a pair of SDI connectors, typically for use as viewfinders and editing/post-processing/broadcast monitors.

3D Displays and Formats in Practical Applications This section gives a brief summary of formats and standards being applied in end-to-end systems using consumer 3D displays.

Television/Home Cinema

A number of broadcasters and carriers have announced or introduced first generation stereoscopic 3D television systems. A typical arrangement uses an HD frame-compatible side-by-side (halved horizontal resolution) format encoded at “1080i25” (1,920 × 1,080 pixels, 25 fps, interlaced) and compressed using H.264, for delivery over existing HD infrastructure (BSkyB 2010). Most television displays available that are compatible with these services use active shutter glasses, though polarizing systems using passive glasses are becoming available. Integration is, however, not quite complete in these first generation services. The HD set-top boxes in use often do not support stereo 3D format encoding natively, for example, through the use of HDMI version 1.4 ports. In these cases, the viewer must select the correct stereo picture format manually when stereo 3D content is being broadcast. Looking beyond current implementations, several organizations including the DVB Consortium and SMPTE have 3D television standardization efforts under way (Zou 2009). The aim is to create frameworks for both HD frame-compatible and future 3D television broadcast systems, defining formats and compression schemes to enable efficient transmission and integration of both 3D and conventional 2D content (The DVB Project 2010; Szypulski 2010). Where physical media are concerned, the Blu-ray Disc Association have defined Blu-ray 3D, a set of requirements for storage of stereoscopic 3D content (Blu-ray Disc 2010). It encodes pictures using the “Stereo High” profile of the MVC extension to the ITU-T H.264 AVC codec used in conventional HD Blu-ray. Using a full “1080p” (1,920 × 1,080 pixels, progressive scan) base video format, this encodes two stereo views in an Independent Stereo format in such a way that the system is backwards compatible with 2D displays – these will show just one of the views. This typically incurs a 50 % storage overhead compared with a disc encoded using the conventional AVC 2D HD codec.

Computer-Based Stereo 3D Stereoscopic 3D presentation has been in use in computing for several decades, mostly in specific visualization applications. The OpenGL graphics API has directly supported rendering to separate left-eye and right-eye buffers (if present in the system) since at least version 1.1, released in 1992 (Silicon Graphics 1992), and stereo-capable graphics hardware, including active shutter glasses, has also been available since the early 1990s. Growing consumer interest in stereo 3D presentation in general has seen the recent introduction of these technologies in computer entertainment (gaming) applications.


Generally, in visualization or gaming applications, the content to be presented already exists in a 3D object-based form and is rendered by graphics hardware to the display in real time for interactive operation. It is therefore relatively straightforward to change the “virtual camera” viewpoint during rendering to create a stereo view pair instead of a single central view. Software drivers are available that can do this as an intermediate translation stage between the application program and the graphics hardware (iZ3D Inc 2011; Dynamic Digital Depth 2011), and typically present the result to a fast refresh monitor as frame-sequential stereo at a high frame rate. Compatible active LC shutter glasses are also widely available. Such systems will be able to present stereo 3D content in other application areas (such as Internet based content) as this becomes more widely available.
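For illustration, a minimal quad-buffered stereo loop using PyOpenGL and GLUT might look as follows. It assumes stereo-capable hardware and drivers exposing the GL_BACK_LEFT/GL_BACK_RIGHT buffers; the empty scene callback and the eye-separation value are placeholders, not values from any standard.

```python
from OpenGL.GL import (glDrawBuffer, glClear, glLoadIdentity, glTranslatef,
                       GL_BACK_LEFT, GL_BACK_RIGHT,
                       GL_COLOR_BUFFER_BIT, GL_DEPTH_BUFFER_BIT)
from OpenGL.GLUT import (glutInit, glutInitDisplayMode, glutCreateWindow,
                         glutDisplayFunc, glutSwapBuffers, glutMainLoop,
                         GLUT_DOUBLE, GLUT_RGB, GLUT_DEPTH, GLUT_STEREO)

EYE_SEPARATION = 0.065  # metres; an assumed interocular distance

def draw_scene():
    pass  # application geometry goes here

def display():
    for buffer, offset in ((GL_BACK_LEFT, -EYE_SEPARATION / 2),
                           (GL_BACK_RIGHT, +EYE_SEPARATION / 2)):
        glDrawBuffer(buffer)                 # select the per-eye back buffer
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)
        glLoadIdentity()
        glTranslatef(-offset, 0.0, 0.0)      # shift the virtual camera per eye
        draw_scene()
    glutSwapBuffers()                        # swaps both eye buffers together

glutInit()
glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH | GLUT_STEREO)
glutCreateWindow(b"stereo sketch")
glutDisplayFunc(display)
glutMainLoop()
```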

Conclusions From a base of niche and enthusiast applications, consumer interest in 3D presentation and acceptance of glasses-based stereoscopic displays in cinemas has led to efforts in the consumer electronic, entertainment, and personal computing industries to offer “first generation” end-to-end 3D content delivery. To achieve this, a mix of existing 2D video standards and equipment and new compatible developments has been used. The aim has been to produce 3D displays as rapidly as possible to take advantage of renewed interest, and this also serves to drive development of improved end-user 3D display technologies. Industries have also embarked on a number of standardization initiatives, aiming to establish formats and frameworks to support future fully integrated 3D content delivery systems.

Further Reading
Blu-ray Disc Association (2010) BD ROM – audio visual application format specifications, July 2010. http://www.blu-raydisc.com/assets/Downloadablefile/BD-ROM_Audio_Visual_Application_Format_Specifications-18780.pdf
BSkyB Ltd (2010) BSkyB 3D technical specification for PlanoStereocopic (3D) program content. http://introducingsky3d.sky.com/a/bskyb-3d-tech-spec/
ColorCode-3D (2011) About ColorCode-3D. http://www.colorcode3d.com/ColorCode_3-D.html
DDD – Dynamic Digital Depth (2011) TriDef – Stereoscopic 3D Software. http://www.tridef.com/
Digital Display Working Group (1999) DVI 1.0 Specification
Fehn C (2004) Depth-image-based rendering (DIBR), compression, and transmission for a new approach on 3D-TV. Proc SPIE 5291:93–104
HDMI Licensing, LLC (2010) HDMI Specification Ver.1.4a
iZ3D Inc (2011) iZ3D drivers and software for stereoscopic 3D. http://www.iz3d.com/
Judge AW (1935) Stereoscopic photography. Chapman and Hall, London
Kobayashi A (2010) DisplayPort(TM) Ver.1.2 Overview. http://www.vesa.org/wp-content/uploads/2010/12/DisplayPort-DevCon-Presentation-DP-1.2-Dec-2010-rev-2b.pdf
Merkle P, Mueller K, Smolic A, Wiegand T (2006) Efficient compression of multi-view video exploiting inter-view dependencies based on H.264/MPEG4-AVC. In: IEEE international conference on multimedia (ICME), Toronto
Morvan Y, Farin D, De P (2008) System architecture for free-viewpoint video and 3D-TV. IEEE Trans Consum Electron 54(2):925–932
SD&A/SPIE (2004) Proposed standard for field-sequential 3D video (draft). http://www.stereoscopic.org/standard
Silicon Graphics, Inc (1992) The OpenGL graphics system: a specification (version 1.1)
Smolic A, Mueller K, Merkle P, Kauff P, Wiegand T (2009) An overview of available and emerging 3D video formats and depth enhanced stereo as efficient general solution. In: Picture coding symposium (PCS), Chicago
Society of Motion Picture and Television Engineers (2006) ST 424:2006 Television 3 Gb/s Signal/Data Serial Interface
Society of Motion Picture and Television Engineers (2008a) ST 259:2008 Television – SDTV Digital Signal/Data – Serial Digital Interface
Society of Motion Picture and Television Engineers (2008b) ST 292:2008 1.5 Gb/s Signal/Data Serial Interface
Szypulski T (2010) SMPTE 10E40 working group on 3D home master – progress and status update, 25 May 2010. http://www.smpte.org/sections/section_washingtondc/washington_previous/was_may10/SMPTE_Standards_Update_and_Producing_live_3DTV_Sports.pdf
The DVB Project (2010) DVB BlueBook A151 – commercial requirements for DVB-3DTV, July 2010
VESA (2009) DisplayPort 1.2 Standard
Woods A (2009) 3-D displays in the home. Inf Disp 25(7):8–12
Zou W (2009) An overview for developing end-to-end standards for 3-D TV in the home. Inf Disp 25(7):14–19


Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_112-2 # Springer-Verlag Berlin Heidelberg 2015

3D Cinema Technology Bernard Mendiburu* VP Innovation Volfoni, Los Angeles, CA, USA

Abstract This chapter describes the technologies and trade-offs related to the projection of stereoscopic 3D images in a cinematic environment. Particular emphasis is placed on the struggles throughout the industry to identify an optimal technology related to 3D glasses. Some discussion of next-generation solutions is included, but autostereoscopic 3D is not expected in the cinema environment for more than a decade.

List of Abbreviations
AS Active stereoscopy
ECB Electronically controlled birefringence
LCoS Liquid crystal on silicon
LCS Liquid crystal shutter
PS Passive stereoscopy

Introduction to 3D Cinema Exhibition Current 3D projection technology used in commercial movie theaters uses a dual-channel work flow, displaying two discrete images to the audience, one for the left eye and one for the right eye. In computer graphics animation movies, such as Toy Story 3, the two video streams are produced with two virtual cameras set in a 3D synthetic world. In real-world productions, such as Avatar, pairs of cameras are linked together in a specialized apparatus referred to as a “3D rig” or “camera rig,” and the left and right image streams are produced, mixed, transported, and broadcasted in parallel and synchronicity. The word “stereoscopic” literally means “seeing volume,” but it is often mistakenly understood to mean “dual imaging system.” This is probably because of the association we have with the notion of “stereoscopic sound,” another dual-channel stimuli system we are familiar with. This chapter will not cover multichannel 3D (technologies based on integral imaging, depth acquisition, or synoptic cameras and that can be seen without 3D glasses on autostereoscopic displays). Because of the display resolution and viewpoint synthesis requirements, the mass production of autostereoscopic displays and content for this next generation of 3D visualization is still 10–15 years away, beyond the accuracy limit of technology forecasting (see also chapters “▶ Autostereoscopic Displays,” ▶ Head- and Eye-Tracking Solutions for Autostereoscopic and Holographic 3D Displays,” and “▶ Emerging Autostereoscopic Displays”). 3D projection systems are limited by virtue of the fact that a single screen must be able to show both left and right images. Therefore, the left and right channels must be multiplexed in order to share a single *Email: [email protected] *Email: [email protected] Page 1 of 14


reflecting surface and then filtered out before reaching the eyeballs of the audience. So far, no better apparatus has been designed to that end than glasses, and so today, all cinematic solutions employ some type of glasses-based technology. (Note that in some autostereoscopic 3D cinema prototypes, the left and right images are beamed to the supposed position of the left and right eyes of the audience, but this approach requires a rather rigidly defined theater design and has not gained popularity.) As such, in current 3D theaters, the multiplexing is in time, polarization, wavelength, or a combination. It can be done using single or dual projectors, before or after the imaging device, inside or outside of the projector. The combination of all encoding and decoding stages involved in 3D cinema reduces the light efficiency to less than 20 %. This is currently the biggest challenge facing 3D cinema, because luminance is a key quality factor. The other major challenge facing the industry is related to cross talk between the two channels. This chapter is structured around the emission, encoding, transmission, and then decoding and reception of the 3D visual message:
• Emission refers to the stereoscopic projection setup and method.
• Encoding is based on a combination of various multiplexing processes.
• Transmission relies on light bouncing off the theater screen.
• Decoding is done by various eyewear technologies.
• Reception is assumed by the audience's eyes and brains in a process called stereopsis that is described elsewhere (see chapter "▶ Human Factors of 3D Displays").

Projection Setup and Mode

Projection Setup
3D projection can be achieved using a single-, dual-, or multiple-projector setup. Simpler setups have the advantage of easy deployment and operation; more complex setups offer better image quality, resolution, light level, contrast, and discrimination, each increasing with the number of projectors involved.

Single-Projector System
The single-projector setup is preferred for small theaters. It is a fail-proof concept and only marginally more complex to operate than a regular 2D digital projection system. The single-projector 3D solution has been used for many decades. The advent of digital cinema has enabled the widespread implementation of 3D theaters, particularly thanks to the imaging speed of DLP® chips. Other single-projector systems, based on reflective liquid crystal or simply 35 mm film, were introduced more recently and rely on periscope lenses that realign left and right images. All single-projector systems suffer from low luminance – as one might expect from filtered light from a single bulb.

Dual-Projector Systems
A dual-projector system is more powerful than a single-projector system, with twice the amount of light to begin with and the stability of full-time illumination when running passive stereo. Despite its apparent simplicity, with each projector displaying one of the two image streams, two-projector solutions struggle with issues related to setup and operational complexity. Dual projection setups require more specialized equipment and staff to preserve left-right coherent image geometry and sharpness. As light bulbs age, the projectors need to be recalibrated to maintain equal light levels and matched color accuracy between the two projectors. Keystone correction is needed to compensate for the trapezoidal image warping due to different projection paths. To that effect, some projectors use a camera that analyzes the picture to detect and correct asymmetries. Such constraints relegate dual projection systems primarily to high-end theaters and special venues.
Almost all dual projection systems use passive stereo and polarization multiplexing. There have been attempts to use dual active projection, with two DLP projectors working in synchronicity, to increase light levels on large screens. But such a solution did not prove very effective in field use: if the two images are not perfectly aligned on screen, the images are blurry. There are some rare cases of dual active projection using SXRD 4 K projectors with mechanical shutters to generate high-resolution images on regular white screens. Such setups are used in scientific visualization and have not been used in commercial cinema since the 1920s due to their complexity and expense.

Multiple-Projector Systems
Special venues are common users of gigantic screens and 3D imagery. "King Kong 3D 360" recently opened at Universal Studios in Los Angeles; sixteen projectors are used to project the movie. Such systems use the same principles as scientific visualization systems to align and match images, such as edge matching, genlock, and frame lock. Such systems are one of a kind and not very common.

Active and Passive Stereoscopy
The distinction between active stereoscopy (AS) and passive stereoscopy (PS) is an important one to understand when considering 3D display technologies. Because it is a central concept that interacts with many other components (such as the number of projectors and the active or passive glasses), the general public, as well as a good share of industry professionals, tend to misunderstand it and inappropriately label 3D systems as "active" or "passive." But to be clear, single-projector setups can run in active and passive modes, just as passive glasses can be used to watch passive or active stereo. The AS and PS labels only distinguish between concurrent or alternating transmission of the left and right image streams.

Passive Stereoscopy with Pixel Collocation
In passive stereoscopy, left and right images are displayed full time. This is most commonly achieved using two projectors. Another approach is to share the imaging surface of a single projector and realign the pictures on screen using a periscope optical mount with prisms placed between the projector and the two lenses. Through this method, a single 4 K digital projector can also be used for passive stereo, displaying two 2 K images at once. This was done in the 1950s using standard film. Modern reiterations of this process were introduced in 2009 and 2010 (by Technicolor, Oculus, and Panavision).

Active Stereoscopy with Time Multiplexing
In active stereoscopy, left and right images are alternately displayed, fast enough for humans not to notice that each eye is seeing a black screen half of the time. If the refresh rate is close enough to retinal persistence, the brain only detects inconsistent light levels, perceived as flickering. Over 100 Hz, the flickering effect is considered unnoticeable at the light levels required in the theater. In current AS digital 3D projectors, the stereoscopic multiplexing is run at 144 Hz, with each left and right frame alternately displayed three times following a [L R L R L R] pattern repeated 24 times a second.
The main problem with AS is that depth artifacts are introduced by the interaction between fast horizontal movement and the average time delay between the left and right image streams. An object traveling across the screen in 1 s at the current AS projection rate of 144 fps will be integrated by our visual system with a parallax artifact of 0.69 % of image width. As a reference point, most depth placements in modern 3D movies stay within a parallax of 1–2 % of image width. Luckily for AS in theaters, film makers avoid fast camera movement, in order to avoid strobe effects, or rely on motion blur to mask it. Still, motion-depth artifacts of up to 0.1 % of the screen width can occur and be noticeable, conflicting with other depth cues like occlusions.
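The 0.69 % figure follows directly from the sub-frame timing; as a quick check (assuming a one-sub-frame delay between the eyes at the 144 Hz rate quoted above):

$$\Delta t = \frac{1}{144\ \mathrm{Hz}} \approx 6.9\ \mathrm{ms}, \qquad \text{parallax artifact} = v\,\Delta t = \left(1\ \tfrac{\text{width}}{\text{s}}\right) \times \frac{1}{144}\ \mathrm{s} \approx 0.69\,\%\ \text{of image width}.$$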

Projection Technologies
The current resurgence of 3D cinema relies heavily on so-called glass-to-glass digitization, with images being stored and processed exclusively as discrete numerical values from the camera lens to the projector's optical system. Even though 3D has been strongly associated with the deployment of digital projectors in theaters, classic film distribution made an unexpected comeback as a projection method for 3D movies in 2009. 3D cinema imaging systems are based on high-end 2D projectors used in specific ways, with additional apparatus used to align and multiplex the left and right images. In this section, we cover all projection systems, including the latest digital cinema products and film-based retrofit technologies. The multiplexing uses a combination of time, light polarization, and wavelength technologies that will be presented in the next section.

Digital 3D Projection
Digital projectors are key assets in the current 3D cinema resurgence because they allow for pixel-accurate image projection and AS retrofit at marginal incremental cost. Texas Instruments' DLP® technology was the only digital system on the 3D market for a while, although it is now competing with Sony's SXRD technology.

Digital Micromirror Device or DLP® Projection
DLP® is the commercial name of the DMD, a technology that uses an array of micromirrors to create images (see chapter "▶ DLP® Projection Technology"). Being built upon a full on/off light control system, it modulates gray levels by pulse-width modulation: a data range of 10 bits (1024 light levels) is dithered by flashing 1024 one-bit images. Color is generated by using three imaging channels, fed with monochromatic red, green, and blue light. In this configuration, dividing the imaging time into two discrete images may look like a simple data-source and microcode upgrade issue. Actually, it involves a huge bandwidth load on the data processing chips that control the DLP chip. Driving a 2 K chip, with two megapixels, at 10 bits of color depth involves 2 Gb of data per frame. In 3D, with the projector running at 144 fps, the process generates 288 Gbps of data. Therefore, in 2005 the first generation of "digital 3D" cinema projectors was run at lower resolutions to accommodate the increased requirements of the stereoscopic triple flash. Recent developments of DLP technology include the deployment of a dual DLP projector system known as "digital IMAX" and the release of the first generation of projectors based on new 4 K DLP chips.

Liquid Crystal on Silicon, LCoS
The other family of digital imaging processors used in professional cinema is based on liquid crystals (see chapter "▶ Liquid Crystal on Silicon Reflective Microdisplays") and is marketed under the Silicon X-tal Reflective Display (SXRD) brand by Sony. Until recently, it was the only 4 K projection system, but it was too slow to sustain active stereoscopy from a single projector. A periscope lens was developed and is now available as "RealD XL-S." It separately deals with the upper and lower parts of the light beam coming from the lens, polarizes them, and aligns them on the screen. The resolution, theoretically full 2 K, is actually 852 × 2048, due to the actual frame format in theater releases and the necessity of a buffer area between the two images.
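The DLP bandwidth figures quoted above follow directly from the pulse-width dithering scheme (taking the chapter's 2 K, 10-bit numbers at face value):

$$2 \times 10^{6}\ \text{pixels} \times 2^{10}\ \text{one-bit planes} \approx 2\ \text{Gb per frame}, \qquad 2\ \text{Gb/frame} \times 144\ \text{fps} \approx 288\ \text{Gbps}.$$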

Film-Based 3D Projection
There may be thousands of digital 3D projectors in the USA, and tens of thousands are on their way. Film projectors still count in the hundreds of thousands in the USA and a few million in the world. As such, many industry leaders are betting on innovative 3D-on-film distribution technologies to take advantage of the installed base of film-based projectors.

Single-Strip 3D Film Projection
In 2010, Technicolor presented a film-based 3D projection system using an above/below image format on the 35 mm support and a periscope lens attachment that polarizes and realigns pictures on the screen (reference to be supplied). With such systems, however, in the event of a film rupture, projectionists fix the support by slicing out one frame. Unfortunately, this inverts the left/right alternation of images and subsequently presents "inverted stereo" to the audience. Addressing this headache-triggering hazard, 3D inventor Lenny Lipton introduced a few months later another system where the images are recorded side by side on the film strip and adequately rotated and realigned by a patented 3D lens attachment (Lipton 2011; Lipton et al. 2011). These two solutions are aimed at helping retrofit tens of thousands of existing 35 mm projectors in the USA for a fraction of the cost of a complete digital replacement and should help spread the 3D renaissance to worldwide locations where the economy cannot support the expense of a digital 3D projection system. It should be noted that film-based solutions like these generate potential depth-placement errors or vertical parallax, and scratches and dust on the film generate retinal rivalry (Figs. 1 and 2).

Dual-Strip 3D
Despite popular belief, the 1950s 3D movies were not shown using low-end red and blue color encoding systems. The Golden Age 3D movies were shown in full color, using two projectors and Polaroid polarizing filters. The main issues were the synchronization of the two projectors and the need for an intermission. Regular 2D movie reels could be loaded alternately on two projectors, and they would switch over, sometimes automatically. When a 3D film was shown, both projectors were running at once and an intermission was required to reload them. All these issues were addressed in the design of the IMAX 3D projection system, with electronic synchronization, a complex gate registration device using a vacuum pump, and giant platters to hold long runs of 70 mm film.

Fig. 1 (a) Left and right images on a single-strip film, classic over/under, and (b) lens attachment (Reprinted from Lipton et al. 2011)



Fig. 2 (a) Left and right images on a single-strip film, new side-by-side method, and (b) lens attachment (Reprinted from Lipton et al. 2011)

Multiplexing Techniques

Time Multiplexing
Time multiplexing is the simplest and cheapest multiplexing used in the stereoscopic display industry. The process was discussed in the section on active stereoscopy above. Time multiplexing requires an imaging engine fast enough to completely switch from left to right images with no residual effect. In the case of the DLP chip, the full transition is immediate. Still, some dark time has to be included to match the other apparatus involved in the encoding or decoding of the 3D, and that dark-time concept applies to many stereo projection systems. Active polarizer solutions, like RealD's "Z-Screen," or active liquid crystal shutter glasses have longer transition times than DLP technology and require the imaging device to project a dark frame while they get to their appropriate left or right state. Dolby 3D and MasterImage systems use encoding wheels: the border between the left and right half-circle filters must have crossed the whole light path before an image can be displayed.
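As an illustration of how dark time eats into the duty cycle, here is a minimal sketch of a 144 Hz triple-flash schedule; the 1 ms dark time is an assumed value for a generic active polarizer or shutter, not any vendor's specification:

SUBFRAME_HZ = 144          # sub-frame rate quoted in the text
DARK_TIME_MS = 1.0         # assumed transition time of the encoder/decoder

subframe_ms = 1000.0 / SUBFRAME_HZ    # ~6.94 ms per sub-frame

def triple_flash():
    """Yield (eye, image_ms, dark_ms) for one 24 Hz film frame: L R L R L R."""
    for i in range(6):
        eye = "L" if i % 2 == 0 else "R"
        yield eye, subframe_ms - DARK_TIME_MS, DARK_TIME_MS

for eye, image_ms, dark_ms in triple_flash():
    print(f"{eye}: image {image_ms:.2f} ms, dark {dark_ms:.2f} ms")

With these assumed numbers, roughly 14 % of each sub-frame is sacrificed to dark time, which is why reducing transition times matters so much for light efficiency.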



Wide-Band Wavelength Multiplexing
Anaglyphic encoding is the most common form of 3D imaging, noticeable because of the red-cyan glasses widely used as a visual synonym for "3D," even in recent advertisements for current full-color digital 3D products. The basic principle of anaglyph technologies is to assign two of the red, green, and blue channels to one eye and the remaining one to the other. That color assignment will define the efficiency, comfort, color, and luminance behavior of the overall system. The most widely used is red/cyan (green + blue). Two other color combinations are currently used: blue/yellow (red + green) suffers from a huge luminance asymmetry and has to be tweaked to blue/orange or blue/brown (Sorensen et al. 2004; Starks). Green/magenta (red + blue) offers the most balanced luminance and resolution in the 4:2:2 YUV video codecs used in almost all digital delivery systems (Cugnini 2009; Lanfranchi and Brossier 2010). The current development of full-3D display systems in theaters and at home is paradoxically reviving interest in such color-based 3D encoding, for it allows film companies to repurpose 3D content so that the vast majority of the global audience can enjoy 3D without upgrading their display systems. These solutions allow 3D gaming on 2D TV sets and 3D distribution on 2D media and 2D TV channels. It remains to be seen if this will lead to an anaglyphic projection revival in low-end cinema markets that cannot afford the cost of full-3D projection systems.
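For illustration, a minimal sketch of the red/cyan channel assignment (real encoders add gamma and colorimetric correction, which are omitted here):

import numpy as np

def red_cyan_anaglyph(left_rgb, right_rgb):
    """Compose a red/cyan anaglyph from two HxWx3 float images in [0, 1]:
    the red channel comes from the left eye, green and blue from the right."""
    out = right_rgb.copy()
    out[..., 0] = left_rgb[..., 0]   # left eye -> red channel
    return out                        # right eye keeps green + blue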

Narrow-Band Wavelength Multiplexing
Thin-layer technologies allow for narrow-band light selectivity. Wavelength multiplexing combines two discrete sets of red, green, and blue light spectra that do not overlap. These spectra are currently generated in digital cinema projection by filtering a generic light source into one or the other set, at a great efficiency cost. In a single-projector system, the filter is placed on a spinning wheel inside the projector, between the light source and the imaging system (Maximus et al. 2007). In dual projection systems, the filters can be placed in front of the lens (Jorke and Fritz 2006) (Fig. 3).
Upcoming solid-state light sources, like LEDs or lasers, offer both qualities: narrow emission bands, finely tunable to the desired wavelengths. Dual-light-engine projectors using such light sources will offer passive projection that requires neither active glasses nor special screens.

Linear Light Polarization
In 1929, Edwin H. Land, the founding genius of the Polaroid Corporation, patented the production of neutral gray polarizing filters eventually used for 3D cinema (Land and Friedman 1929). A pair of filters in front of the projectors is matched with a pair of filters enclosed in the 3D glasses. When linear light polarization is used, head tilting or imperfect filter alignment impairs the system's efficiency and generates crossover. This light leak between right and left images generates an image artifact called "ghosting." It is considered that 3–5 % of cross-channel leakage is acceptable for a comfortable 3D experience.
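That tolerance can be tied to head tilt with Malus's law; to first order (ignoring filter quality and screen depolarization, and taking a 10° tilt purely as an example), the leakage of the wrong eye's light grows as:

$$T_{\text{leak}}(\theta) = \sin^2\theta, \qquad \sin^2 10^{\circ} \approx 0.03 = 3\,\%.$$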

Circular Light Polarization

Circular polarization is obtained by running linearly polarized light through a "quarter-wave plate" that converts it to right-handed or left-handed circular polarization. The reverse treatment is done in the 3D glasses, doubled with classic linear polarization filters that extinguish the inappropriately oriented light. Therefore, circular light polarization has the benefit of not being sensitive to the alignment between the eyewear and the projectors' filters: tilting your head will not reduce the efficiency of the 3D system. On the downside, the discrimination is appreciably lower, leading to crossover levels of up to 10 %. To that effect, a patented visual-effects pass, dubbed "ghost busting," is applied to the left and right images to pre-compensate for the expected light leak (Lipton and James 2005).
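The idea behind such pre-compensation can be sketched as inverting a symmetric 2 × 2 crosstalk mixing; this is only an illustration of the principle, not RealD's patented pass, and it assumes linear light and a known, uniform leak c:

import numpy as np

def ghostbust(left, right, c=0.05):
    """Pre-compensate a stereo pair for symmetric crosstalk c.

    If the system delivers seen_L = disp_L + c*disp_R (and symmetrically),
    applying the inverse mixing matrix cancels the leak. Clipping shows the
    method's limit: leakage into near-black areas cannot be fully removed.
    """
    disp_l = (left - c * right) / (1.0 - c * c)
    disp_r = (right - c * left) / (1.0 - c * c)
    return np.clip(disp_l, 0.0, 1.0), np.clip(disp_r, 0.0, 1.0)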



Fig. 3 Narrow-band wavelength multiplexing, left and right image channel spectrum (Courtesy of SD&A, www.stereoscopic.org)

Combination of Multiplexing Techniques
Time multiplexing has the great advantage of requiring only one projector. Beyond the obvious monetary incentive, it offers the additional advantage of enabling simplified adjustment of the projection system: left/right synchronization cannot be mistaken, luminance and white balance are inherently matched, and keystone and magnification corrections are not needed. On the other hand, polarization and wavelength encoding offer advantages that cinema owners like, with glasses either cheap enough to be disposable or not needing battery and power management. In order to get the best of both worlds – single-projector active stereo and low-management passive glasses – vendors have developed encoders that actively polarize or filter the light output. Such a mixed breed of active/passive stereoscopic projection has dominated the US market since the renaissance of 3D cinema in 2005, leading to widespread confusion throughout the industry. The situation may not ease with upcoming single-projector passive projection (Sharp et al. 2010) and periscope-based active projection (Cowan et al. 2010). To clearly distinguish between technologies, see Table 1 at the end of this chapter.


Table 1 3D cinema projection technologies by date. The table compares, for each system, the date of release, vendor, commercial name, support (film or digital), number of projectors, stereo mode (active or passive), multiplexing (color, time, polarization, or wavelength), glasses type, and the encoding device on the projector, running from the anaglyph, synchronized-wheel, and Polaroid-filter dual-projector systems of the 1900s–1950s and IMAX 3D (1990s), to the digital systems of 2005–2010 (RealD Z screen, Xpand/NuVision active glasses, Dolby/Infitec, MasterImage, Digital IMAX, RealD XL, Sony/RealD 3D4K/XLS, Technicolor, Oculus, and Deluxe)


3D Screens
Polarized light multiplexing requires a polarization-preserving screen. The quality of the preservation will dictate the stereoscopic crossover, or left/right channel SNR. Such screens, dubbed "silver screens" in reference to the screens used in the early ages of the cinema industry, are now made with aluminum dust layered over synthetic fabric. Silver screens offer the additional benefit of a much higher reflectance gain than classic white screens, allowing for more overall light efficiency within the system. The proponents of non-polarized 3D systems point out that this high gain applies mostly to the center of the seating area, leaving the patrons in side seats with a much lower light level. Other systems, using active glasses or wavelength encoding, work perfectly on regular white cinema screens.

Filtering Techniques

Active Eyewear

Time de-multiplexing is typically done with liquid crystal shutter (LCS) glasses that "blind" the left or right eye according to a synchronization signal coming from the projector. Theaters are fitted with infrared transmitters that can be placed around the stage or simply put atop the projector itself, beaming through the projection booth window and bouncing off the screen. A newer synchronization protocol called "DLP Link" uses the visible light domain: the glasses are fitted with a light sensor that detects the presence of images on the screen, and the assignment of the detected light flux to either eye is commanded by an invisible light burst asymmetrically inserted in the "dark time" interval between the left and right images. A third synchronization system, using RF signaling, is currently used in professional computer graphics (Nvidia) and consumer 3D TV systems (BitCaldron).
LCS glasses used to be based on "pi-cell" filters (see chapter "▶ The p-Cell"), with the current generation using super-twisted nematic (STN) substrates. In January 2011, a new type of LC shutter was introduced, based on "electronically controlled birefringence" or ECB. This was used in hybrid lenses that decode both active and polarized passive stereoscopic images (www.volfoni.com/activeyes).
Among the limitations of LCS glasses is the need for a "dark time" when no image is displayed while one of the two shutters slowly gets back to its transparent state. Reducing this dark time to a minimum is a key issue in improving both the light efficiency and color purity of LCS systems. If the dark time is not appropriately tuned between the glasses and the projector, it generates a visual artifact known as color banding, where smooth color gradients are seen as series of discrete flat color areas (Figs. 4 and 5).

Wide-Band Color Filters

The best wide-band color filters are made of gelatin, as has been used for a century in stage lighting. In addition to being cheap to produce and easy to distribute (inserted into news magazines, for example), the widely distributed "paper glasses" offer the best color filtering. Hard-shell glasses with plastic lenses may look more fashionable or be more practicable for daylong use in postproduction, yet they are optically inferior.



Fig. 4 Transition graph of Volfoni ActivEyes 3D glasses using electronically controlled birefringence (ECB) liquid crystal shutters (LCS) (Image courtesy: Bertrand Caillaud, Volfoni R&D)

Fig. 5 Volfoni ActivEyes are passive 3D glasses that can be turned into active glasses when connected to the ActivMe electronic driver


Narrow-Band Color Filters
The Dolby/Infitec glasses are produced using thin-layer deposition technologies; their production process includes more than a dozen layers. Sensitivity to abrasion is an issue, with contrast ratio declining over time. Production cost has been a concern, especially for cinema owners who must buy and maintain many hundreds of glasses for a single screen. In 2010, the glasses' price was reduced to $17.

Polarized Glasses

Polarized glasses exist as celluloid-based disposable filters and as hard plastic lenses. Unlike color filters, the plastic lenses have the same efficiency as the gel-based filters. The orientation of the linear polarization is a key factor that can make glasses incompatible between systems. Even with circular polarization, the orientation of the underlying linear filter is important to get optimal discrimination.
When 3D cinema became massively popular, the ecological impact of the high volume of discarded disposable 3D glasses raised significant concern. Some vendors developed a recycling process that eventually was determined to be more costly and less eco-efficient than simply disposing of the glasses. Other vendors offer biodegradable glasses as an alternative (Dager 2010). With the wide use of passive 3D displays in 3D TV and 3D cinema production, a niche market exists for high-quality polarizing glasses. It is now possible to buy personal passive 3D glasses to be used at home and in 3D theaters.

Summary and Conclusion
As shown in Tables 1 and 2, 3D cinema projection employs a wide variety of technologies, using various time- and light-domain multiplexing, from single and dual projectors. This technological mix and match allows for a different solution design for every segment or niche in the market. It is also a sign that the best business model is not yet established. At the same time, film distribution channels are being revisited via alternative content distribution and extension of digital distribution to overseas markets. Soon, new low-cost 3D E-film solutions, based on high-end HD TV projectors, will reach the market.
The battle between active and passive solutions will be impacted by the 3D TV deployment. With huge business opportunities lying in the home 3D TV market, massive budgets are currently being invested in research and development on active glasses and full-resolution passive displays. Eventually, cinema display technology will be impacted by the results of 3D TV developments. Soon, we will see new light sources in projectors, and sooner than expected, giant flat screens will be deployed in cinemas.


Table 2 3D cinema projector technologies by manufacturer. The table compares the RealD, RealD XL, MasterImage, Dolby, LCS, IMAX Active 70 mm, RealD XLS, Digital IMAX, IMAX 3D, Oculus, Technicolor, Panavision, Dual Sony 4K, Dual 35 mm, and Infitec systems across stereoscopy (active or passive), support (digital, or 35/70 mm film), projection setup (single projector, periscopic lens, or dual projector), light multiplexing (linear or circular polarization, narrow wavelength, collocation, or time multiplexing), filters (dual passive filters, single active filter, encoding wheel, or comb filter), screen (white or silver), eyewear (active or passive glasses), and digital resolution or film format


Further Reading
BitCaldron reference to be supplied
Cowan M, Lipton L, Carollo J (2010) Combining P and S rays for bright stereoscopic projection. US Patent 7,857,455
Cugnini A (2009) 3D landscape still cloudy – or – anaglyph ain't dead, yet. http://displaydaily.com/2009/09/28/3d-landscape-still-cloudy-or-anaglyph-aint-dead-yet/
Dager N (2010) Stereoscopic 3D glasses can and should be eco-friendly. http://indiefilm3d.com/stereoscopic-3d-glasses-can-and-should-be-eco-friendly
Jorke H, Fritz M (2006) Stereo projection using interference filters. In: Woods AJ, Dodgson NA, Merritt JO, Bolas MT, McDowall IE (eds) Stereoscopic displays and applications XIII. SPIE, Bellingham
Land EH, Friedman JS (1929) Polarizing refracting bodies. US Patent 1,918,848
Lanfranchi C, Brossier C (2010) Method and equipment for producing and displaying stereoscopic images with coloured filters. US Patent 2010/0289877 A1
Lipton L (1982) Foundations of the stereoscopic cinema: a study in depth. Van Nostrand Reinhold, New York
Lipton L (2011) High brightness film projection system for stereoscopic movies. In: Woods AJ, Holliman NS, Dodgson NA (eds) Stereoscopic displays and applications XXII. SPIE, Bellingham
Lipton L, James HJ (2005) Polarizing modulator for an electronic stereoscopic display. US Patent 6,975,345
Lipton L, Mayer AL, Rupkalvis JA (2011) System for the projection of stereoscopic motion pictures. US Patent 2011/0085141 A1
Maximus B, Malfait K, Vermeirsch K (2007) Method and device for performing stereoscopic image display based on color selective filters. US Patent 2007/0127121 A1
Mendiburu B (2009) 3D movie making: stereoscopic digital cinema from script to screen. Focal Press, Amsterdam
Mendiburu B (2011) 3D TV and 3D cinema: tools and processes for creative stereoscopy. Focal Press, Amsterdam
Nvidia reference to be supplied
Reference to be supplied
Sharp GD, Robinson MG, McKnight DJ, Schuck MH (2010) Stereoscopic projection systems for employing spatial multiplexing at an intermediate image plane. US Patent 2010/0141856
Sorensen SEB, Hansen PS, Sorensen NL (2004) Method for recording and viewing stereoscopic images in color using multichrome filters. US Patent 6,687,003
Starks reference to be supplied
Zone R (2007) Stereoscopic cinema and the origins of 3-D film, 1838–1952. The University Press of Kentucky, Lexington
Zone R (2011) Deep screen, a history of stereoscopic motion pictures: 1952–2009



Autostereoscopic Displays
Adrian Travis*
Applied Sciences Group, Microsoft Corporation, Redmond, WA, USA
*Email: [email protected]

Abstract
Autostereoscopic displays show stereo 3D without the need for spectacles. The simplest show the same pair of views as a stereoscopic display and make them visible each to one eye with lenslets, barriers, or something similar. The more advanced displays are such that what the user sees depends on their point of view, and this is done by presenting many views, tracking the head of the viewer, or both. If the angle between views is sufficiently fine, autostereoscopic displays look like holograms except for the slight blurring caused by random phase between pixels. This chapter presents an overview of established and emerging autostereoscopic systems.

Introduction
The various successes of stereoscopic 3D demonstrate that people like what they see, but no doubt they would like the choice not to wear spectacles. Autostereoscopic displays aim to offer this choice and can be complicated, but the idea is that the requirements of 3D should place impositions on the display, not on the viewer. Much as with 2D, it is quite simple to specify what is needed to create an autostereoscopic 3D display, and the challenge instead is how to engineer a display that does what is required at sufficient quality and sufficiently low cost.
Without spectacles, each eye will see a different picture on the same screen only if that screen modulates light as a function of angle. Sometimes it is enough to produce only two views, but if we go further and produce views at many different angles, we have the potential for true 3D in the sense that the viewer sees nothing different from the original as seen through a window frame. It is not immediately obvious why angular modulation can achieve this, but this is where autostereoscopic displays are rapidly heading. This section therefore begins by describing the theory of what we may call autostereoscopic pixelation before explaining the various strategies being used in attempts to make the displays (Fig. 1).

Autostereoscopic Pixelation
Whereas the pixel (x, y) on a 2D display emits light equally to all directions in the manner of a light bulb, the pixel on an autostereoscopic display must modulate light as a function of angle (θ, φ). This might seem strange versus a 3D (x, y, z) array of pixels, but angular modulation is required if we are accurately to portray reflections and obscuration, the property that ensures that we do not see through one 3D image to that behind.

Accommodation

The acid test of a 3D display – that it should be able to create the image of a pixel somewhere other than on the screen – is in principle easily passed. It suffices to create a convergent set of rays which, after leaving the screen, pass through a point. This is seen by an eye or camera as an off-screen pixel because they focus on the waist of a ray bundle, and the fact that the rays began their journey before reaching the off-screen pixel is irrelevant as long as they thereafter diverge. In practice, the angle between views must be close to the diffraction limit if an eye is to accommodate offscreen, and there is debate as to whether this is necessary (Fig. 2).

Fig. 1 Each pixel on an autostereoscopic display modulates light as a function of angle

Fig. 2 Convergent rays produce the image of a pixel above the screen

Fig. 3 Autostereoscopy naturally forms perspective by tiling slices of views imaged from infinity

Perspective
A multi-view autostereoscopic display naturally allows the user to study the 3D image from any point of view and to see in stereo throughout, but we should also expect perspective to vary with distance. That it does so can be shown by considering an eye looking at a box from afar: the two sides will appear parallel because the angle each top edge forms to the eye's line of sight is approximately the same. Now move close and the top edges will appear to diverge because the eye now looks at one edge from a different direction to the other. The same is true with an autostereoscopic display: the line of sight from a close eye has a different angle to one side of the screen from the other, and since light is modulated as a function of angle, the display inherently has the propensity to form perspective. That it actually does so can be explained as follows (Fig. 3).
Use the concept of view-sequential projection where views of a 3D image are shown one by one rapidly on a liquid crystal display, each being illuminated by parallel rays of light shone in the direction from which the view was captured. If the parallel rays of light are created by LEDs in the focal plane of a lens, then we can trace rays from the eye of the viewer back through the lens to form a virtual image of the eye. All rays reaching the viewer originate as if from the virtual image, so we can draw lines from this point through each boundary between adjacent LEDs to the LCD, where they delineate zones of the LCD. In each zone, a different view is visible, so the eye sees a mosaic of sections from different views, and it is this mosaic that combines to form a correct perspective.
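The mosaic can be made concrete with a small sketch; all dimensions below (focal length, LED pitch, viewing distance) are arbitrary values chosen for illustration, not taken from any particular display:

import math

def visible_view(x_mm, eye_x_mm=0.0, eye_dist_mm=600.0,
                 focal_mm=50.0, led_pitch_mm=2.0, n_leds=9):
    """Index of the LED (view) whose collimated beam, leaving the LCD at
    position x, lands nearest the eye. LED i sits at offset
    (i - n_leds//2)*pitch in the focal plane of the lens."""
    best, err = 0, float("inf")
    for i in range(n_leds):
        theta = math.atan((i - n_leds // 2) * led_pitch_mm / focal_mm)
        landing = x_mm + eye_dist_mm * math.tan(theta)
        if abs(landing - eye_x_mm) < err:
            best, err = i, abs(landing - eye_x_mm)
    return best

# Moving across the screen, the visible view changes zone by zone,
# and the zones tile into the perspective mosaic described above:
print([visible_view(x) for x in range(-150, 151, 50)])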

Holograms Versus Aperture Diffraction
It is tempting to think that autostereoscopic pixelation is somehow incomplete and that only a hologram can perfectly reproduce the image of a 3D object. But the image created by a hologram is itself autostereoscopic and often differs from other autostereoscopic images only in how intensity is modulated as a function of angle. With a hologram, this is achieved by using diffraction gratings selectively to deflect collimated illumination, and a hologram can legitimately be treated as a mere superposition of diffraction gratings. Where a hologram differs from other autostereoscopic displays is if the illumination is highly coherent, in which case a hologram is capable of focusing light to pixels of higher resolution than is possible with autostereoscopic pixelation. This can tilt the argument in favor of holographic pixelation if screens are small or if it is important that the eye should see images on which it can accommodate; the lesser resolution of conventional autostereoscopic displays is caused by aperture diffraction.
Light emerging from one pixel of an autostereoscopic display is subject to aperture diffraction, so at a minimum, the ray must diverge at an angle in radians approximately equal to the wavelength of light divided by the diameter of the pixel. Suppose that a display of width X with n pixels per row is to subtend an angle θ to a viewer at normal reading distance and that the viewer chooses to place their nose against the screen so as to look at the image of distant mountains. The angle of divergence in radians is approximately λn/X, and if the distant image is to have the same resolution as the on-screen image, then it subtends an angle λn²/X to the viewer (Travis 1997a). It makes sense that this angle also should equal θ, so n = √(θX/λ); for example, a 300 mm wide display that subtends one radian at normal reading distance should have no more than 735 pixels per row, which is less than many laptop screens today (Fig. 4).

Fig. 4 Aperture diffraction limits the resolution of infinitely deep images
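Plugging in numbers (taking λ = 555 nm, the photopic peak, as an assumed wavelength) reproduces the quoted figure:

$$n = \sqrt{\frac{\theta X}{\lambda}} = \sqrt{\frac{1\ \text{rad} \times 0.3\ \text{m}}{555 \times 10^{-9}\ \text{m}}} \approx 735\ \text{pixels per row}.$$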

Integral Imaging
Perhaps the most established of all autostereoscopic technologies is that which uses an array of lenslets (i.e., small lenses) to convert a high-resolution 2D display into an autostereoscopic display by arranging that there are several pixels in the focal plane of each lenslet. If, in a duplicate of this system, each pixel of the 2D display is replaced by a photosensor, then it is possible to capture the whole of a 3D image on one panel and recreate it on the other, an arrangement that is called integral imaging (Lippmann 1908; Okano et al. 1997). Conventional cameras pointed at these displays can be shown to bring images to a focus at points that are not on the screen (Kim et al. 2010); so integral imaging has the great advantage that we can expect the human eye to accommodate (i.e., focus offscreen), which experiments suggest may be important if the viewer is to avoid discomfort (Hoffman et al. 2008; Schowengerdt and Seibel 2005). Furthermore, manufacturers have learnt to delineate pixels with such reliability that an increase in resolution has the advantage of seeming like a minimal change to what is already a complicated technology. Nevertheless, for any but the narrowest field of view, integral imaging displays need many more lines than the 2D ones. Particularly on big screens, the combination of line resistance and gate capacitance makes it a struggle to display images at standard resolution, let alone at the resolutions needed for integral imaging.

Slanted Lenticules
The prevalent compromise has been to provide for pixels to modulate light only in azimuth, that is, in the horizontal plane. This means that our 2D array of lenslets is replaced by a 1D array of cylindrical lenslets known as lenticules, and our 2D display needs fine resolution only in the horizontal plane. However, the pixels in a conventional display are square and each is subdivided into one red, one green, and one blue rectangular element, with gaps in between. Put such a panel behind a lenticular array and a distant observer will see a view that is red, green, blue, or dark (due to the gaps). A now-popular solution is to rotate the panel slightly with respect to the lenticules (Van Berkel 1999). If each lenticule spans, for example, three columns of the 2D display, then with an appropriately slight rotation, any three adjacent rows will be one red, one blue, and one green in any single direction through the lenticule. The lenticular array thus decreases resolution by a factor of three in each dimension in exchange for producing nine autostereoscopic views (Fig. 5).
A lenticular screen display with nine views lets a viewer see stereo without the need to wear spectacles or keep their head in one position, and the 3D effect can be very convincing, while displays with as many as 30 views have been demonstrated (Im et al. 2008). But the benefits of 3D remain uncertain while the penalties of a large reduction in resolution are clear. Lenticular screens have therefore been developed which can be switched on and off so that the user can choose whether to have a 3D image or high-resolution 2D.

Fig. 5 Lenticules can be slanted so the column of each view has rows of red then green then blue
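One possible subpixel-to-view mapping can be sketched as follows, assuming a lenticule pitch of nine subpixels (three RGB columns) and a slant of one subpixel per row; actual commercial mappings differ (Van Berkel 1999 gives the general treatment):

def view_of(i, j, n_views=9):
    """View index seen through a unit-slant lenticule at subpixel column i, row j."""
    return (i + j) % n_views

def color_of(i):
    """Repeating R, G, B subpixel stripes across each row."""
    return "RGB"[i % 3]

# For a fixed view, three adjacent rows sample three different colors,
# so every view still receives a full RGB triplet:
v = 4
for j in range(3):
    i = (v - j) % 9          # subpixel column feeding view v in row j
    print(f"row {j}: subpixel column {i} -> {color_of(i)}")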

Switchable Lenslets
It is not yet clear what the advantages of 3D are, while users rarely welcome a loss of resolution, so the fashion at the time of writing is to arrange that the lenslet array can be switched off, allowing the high-resolution 2D image underneath to appear when preferred. This can be done by making lenslet-shaped cavities within a transparent slab and filling them with birefringent material. If light passing through the device is polarized so that the refractive index of the material and the transparent slab are the same, then the lenticules vanish, allowing the high-resolution 2D image to be seen; otherwise, the image will be autostereoscopic. The lenslets can be switched in two ways: either the birefringent material is static and the polarization state of the light is rotated by a liquid crystal cell (Woodgate and Harrold 2003), or the polarization state of the light is constant and the birefringent material is a liquid crystal whose axis of birefringence can be rotated (de Zwart et al. 2004); a helpful feature is to allow 3D windows to be opened within a high-resolution 2D background (Hiddink et al. 2006). Liquid crystal panels are conventionally made with flat glass, so an alternative that is arguably more compatible with established manufacturing lines is to use an electric field to create a graded-index lens in the liquid crystal layer (Ren et al. 2007). This means that the lenses can not only be switched off or on, they can also be scanned (Kao et al. 2009), which lets us consider adding head tracking.
Field of view is an important figure of merit for autostereoscopic displays, the wider the better. Any system based on lenses has the problem that lens performance degrades as the angle of the central ray with respect to the lens axis increases. The degradation is less severe if the focal length of the lens is long, but in lenslet systems the consequence is that the viewer sees the same sequence of views repeated as they move horizontally off perpendicular. Sometimes this is desirable – many viewers can see the 3D image – but it is unnatural, and at the transition between the end of one sequence and the start of the next, the image is false.



Fig. 6 A barrier of slits limits the positions from which each pixel can be seen

Parallax Barriers
A crude imaging device can be made from a pinhole, so our lenticular array can be replaced by an array of slits, the advantage being that the field of view is no longer restricted by lens aberration. The array of slits is known as a parallax barrier because the direction of any ray is determined by parallax between the point of ray origin and the slit through which the ray passes. As with lenticular arrays, it helps to be able to switch the parallax barrier on and off, which can also be done using states of polarization. For example, if polarized light passed by a quarter-wave plate is blocked, then fine slits in the plate act as a parallax barrier, which becomes mostly transparent if the polarization state of the light is switched to the orthogonal by a liquid crystal cell (Moseley et al. 2002; Fig. 6).
Parallax barriers used like this not only waste light but also greatly increase aperture diffraction because the slit is so narrow. Furthermore, like lenslet array devices, their resolution is always a fraction of that of the latest 2D display, and users seem quickly to expect the latest high-resolution format as the minimum acceptable. Head tracking aside, autostereoscopic displays inevitably need high data rates, but these need not come only from high resolution: they can also come from high speed.
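The first-order design geometry is simple similar triangles; a sketch with assumed viewing parameters (air gap only, refraction inside the panel glass ignored):

def two_view_barrier(pixel_pitch_mm, view_dist_mm=700.0, eye_sep_mm=65.0):
    """Return (panel-to-barrier gap, barrier pitch) for a two-view barrier.

    The gap makes two adjacent pixels, seen through one slit, diverge to
    the two eyes; the barrier pitch comes out slightly under two pixel
    pitches so the viewing zones stay aligned across the whole screen.
    """
    gap = pixel_pitch_mm * view_dist_mm / eye_sep_mm
    pitch = 2.0 * pixel_pitch_mm * view_dist_mm / (view_dist_mm + gap)
    return gap, pitch

print(two_view_barrier(0.1))   # e.g., 0.1 mm pixels -> ~1.08 mm gap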

Scanning Illumination
Liquid crystal panels are no different from slits in the sense that they do not emit light but modulate it, and there is no geometric reason why the slits should not be put behind the panel instead of in front. The advantage of doing this is that slits in front of a backlight can be replaced by thin electroluminescent strips, which waste much less light. Furthermore, we can illuminate the strips of an array behind the liquid crystal display one at a time so that each pixel can modulate, one by one, a set of rays travelling to angular increments of a wide field of view (Eichenlaub et al. 1995). The aperture diffraction need be no different from that of a normal display, and if all is repeated sufficiently quickly, the eye need see no flicker. Unfortunately, that last proviso places great demands on the liquid crystal panel. It is not that liquid crystals cannot switch very fast – ferroelectric liquid crystals can switch in less than 50 μs – but conventional amorphous silicon transistor arrays have insufficient line rates to display frames at much more than the rates of 2D video. Can better economy be made of whatever frame rate is feasible?



View Sequential
Fewer frames may suffice if each is seen in its entirety by one eye since, particularly if we invoke crude head tracking, it may then be acceptable to address a limited part of the entire viewing space. This is in principle easily done: simply put a Fresnel lens behind the liquid crystal panel and a scanning spot source of light in the focal plane of the lens (Travis 1990). It is possible to display holographic images like this (Travis 1997b), and chapter "▶ Head- and Eye-Tracking Solutions for Autostereoscopic and Holographic 3D Displays" explains how head tracking (Ezra et al. 1995) can be used to limit frame rate. However, the bulkiness of this arrangement is undesirable, so efforts are being made to produce light-guides that have a similar ability to emit convergent rays whose direction can be scanned. One approach (Brott and Schultz 2010) is to place LEDs at each end of a light-guide and to emboss it with a zigzag of facets that are angled to deflect rays almost perpendicular to the guide. The arrangement is symmetric, so rays from the LEDs at one end will emerge slightly left of perpendicular and rays from the other end will emerge slightly to the right. A film of lenslets arranged like a Gabor super-lens causes both sets of rays to converge so that a viewer directly in front of the guide will have each of their eyes illuminated by rays from one end or the other. This arrangement has the excellent property that if the viewer moves to either side of center, the image gracefully degrades back to 2D, but it would be better still if more than two views could be supported.

Wedge Lensing
The wedge light-guides that conventionally distribute light across the back of a liquid crystal panel can be made to concentrate light if the spot source of light is placed at the thin end (Käläntär et al. 2004) and to act like a lens if the surfaces are made smooth and the thick end is curved (Travis et al. 2009). Rays from the spot source are collimated by reflection off the thick end, and if it is appropriately faceted, the rays will emerge uniformly from the surface of the wedge in a single direction. It is then a simple matter to scan this direction by driving the LEDs of an array along the thin end one at a time. If there are to be several viewers, we need at least several pairs of views, so the frame rate of the liquid crystal display must increase. Metal oxide transistors already have mobilities an order of magnitude higher than amorphous silicon (Mo et al. 2010) and, with fast-switching analogue liquid crystal effects, can combine to give frame rates exceeding 1 kHz (Koshida et al. 2009). At a repetition rate of 60 Hz, which gives 17 views, that might be adequate, but if there are more than just a few viewers, data rates must go even higher.
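The 17-view figure is simply the ratio of the panel frame rate to the repetition rate needed to avoid flicker:

$$N_{\text{views}} = \frac{f_{\text{panel}}}{f_{\text{repetition}}} = \frac{1000\ \text{Hz}}{60\ \text{Hz}} \approx 17.$$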

Projection
Few devices can turn data into spatially modulated light faster than the Texas Instruments Digital Micromirror Device™ (DMD, see chapter "▶ DLP® Projection Technology"). The combination of crystalline silicon transistors, short addressing lines, and mirrors that tilt in a few microseconds leads to frame rates of perhaps 5 kHz, depending on how the illumination is modulated. Ferroelectric liquid crystal on a silicon backplane can also deliver high data rates for the same reason (Wilkinson et al. 1997), but we must now consider schemes based on projection. This does not necessarily mean that the system has to be view sequential, however; pico-projectors are becoming so inexpensive that one can imagine tiling them in the focal plane of a lens.

Multi-projector
The eye sees an image on a screen only if rays of light travel from the screen to the eye. A projector is conventionally pointed at a white screen; so if we replace this with a Fresnel lens that images the projector into the eye (taking care to attenuate the light), then that eye and no other will see the entire picture from the projector. A row of projectors will produce views visible exclusively to a row of eyes, and a vertical diffuser allows the image to be seen by people of different heights. Recently, projectors have become so affordable that as many as 95 have been arranged in rows and columns to form an integral imaging display (Sakai et al. 2009). While simple in principle, execution of this concept is more difficult. Whether arc-light or LED, no one light source has the same color coordinates as any other, and the color coordinates vary with age. Furthermore, the center of a projection lens will typically pass twice as much light as the periphery, which itself will be separated at least by the lens holder from the periphery of the lens of the adjacent projector. Projectors can be interleaved at different heights, and optical feedback can be used to facilitate alignment and adjust the brightness and color coordinates of the projected images, but this is a laborious process (Fig. 7).

Fig. 7 The lens acts like a screen but each projector shows an image to a different direction

Shuttered Projection
Instead of using many small projectors, an alternative is to use one big projector and place a slit in front of it so that it behaves optically as if it were small (Baird 1942). Moving the slit is equivalent to moving a small projector, and if the slit is one element of a liquid crystal shutter (Travis and Lang 1990), then one can time-multiplex the equivalent of many small projectors by opening one element of the shutter at a time. The same projector is used for each view so there is no need to correct misalignments or color nonuniformities, and at least 18 views can be displayed using one or more DMDs (Cossairt et al. 2004). The system is bulky, and alternatives that may be slimmer are described in chapter "▶ Emerging Autostereoscopic Displays". However, the field of view of these displays is limited by the F-number (i.e., the ratio of focal length to diameter) of the projection lens. Even if we revert to the multi-projector system, the field of view is limited by the Fresnel lens, whereas the constant message from potential users is that the field of view should be as wide as possible.

Scanning Slits
Sometimes a new component transforms an established concept, and the combination of a DMD projector with the scanning parallax barrier of section "Parallax Barriers" produces a display with a field of view that is unlimited by optics and dependent entirely on the rate at which data can be modulated. Three DMDs – one for each color – improve matters, and there is the potential for binary modulation of an LED light source so as to get the best out of the DMD. Large liquid crystal shutters are not exactly easy to make but are certainly simpler than large liquid crystal displays, and the combination is reported to be excellent (Møller and Travis 2005). Nevertheless, this and all projection systems are bulky and users would prefer that they are flat.

Projection Through a Wedge
Shine a ray of light into the thick end of a wedge light-guide, and each time the ray reflects off one surface, the ray’s angle with respect to the perpendicular of the opposite surface will decrease. The ray will emerge into air once it reaches the critical angle; the number of reflections required to do this depends on the angle of injection and determines the distance to the point of exit. This conversion of launch angle to distance up a screen is just what is provided by the space between a screen and a video projector, so the wedge allows an image to be projected without the bulk otherwise inherent to projection (Travis et al. 2006). If the wedge is made from off-the-shelf polymer, then there is some loss of contrast due to material scatter, but multiple projectors give 3D, albeit with the usual problems of combining different images and with a field of view limited by lens aberration. With scanning slits, the result might be a simple way of getting a flat-panel 3D display with a particularly wide field of view, but scanning slits are very wasteful of light.
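A minimal sketch of the wedge argument: each reflection reduces the ray’s angle to the surface normal by roughly the wedge angle, and the ray escapes once the angle of incidence falls below the critical angle. The injection angle, wedge angle, and refractive index below are illustrative assumptions, not the authors’ design values.

```python
import math

def bounces_to_exit(injection_deg, wedge_deg, n=1.49):
    """Count reflections before a guided ray drops below the critical angle."""
    critical = math.degrees(math.asin(1.0 / n))  # ~42.2 deg for acrylic
    angle, bounces = injection_deg, 0
    while angle > critical:
        angle -= wedge_deg   # each surface hit reduces the incidence angle
        bounces += 1
    return bounces

# A ray injected at 80 deg into a 1 deg wedge exits after ~38 reflections:
print(bounces_to_exit(injection_deg=80.0, wedge_deg=1.0))
```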

Directions for Future Research
A pressing problem is how to show a pair of autostereo views on a display whether it is oriented in portrait or in landscape. Autostereo has good potential for handheld devices because there is usually only one user, and users tend to hold the device perpendicular to their line of sight, but they have come to expect the option of either orientation. Field of view should be the preoccupation of anyone trying to think of more advanced concepts for 3D. Lenses work poorly at as little as 25° to the screen perpendicular, so it can be difficult to concentrate rays into each eye of a viewer sitting off-center. Data rates also become a challenge if many views are to be displayed, and chapter “▶ Emerging Autostereoscopic Displays” describes a way of managing high data rates. The alternative of trying to reduce the data rates is described in chapter “▶ Head- and Eye-Tracking Solutions for Autostereoscopic and Holographic 3D Displays.”

Conclusion
Multi-view autostereoscopic images at full resolution prompt every 3D cue. Depth perception is greatly enhanced if the images can be seen over a wide field of view, but even if the display can handle the necessary data rates, many concepts use lenses that aberrate at large angles to the screen perpendicular. Autostereoscopic displays are therefore typically a compromise between pragmatism and the ideal, but the results can be a convincing improvement over concepts dependent on spectacles.

Further Reading
Baird JL (1942) Stereoscopic colour television. Wirel World 48:31–32
Brott R, Schultz J (2010) 16.3: directional backlight lightguide considerations for full resolution autostereoscopic 3D displays. In: SID symposium digest of technical papers, vol 41, Seattle, pp 218–221
van Berkel C (1999) Image preparation for 3D-LCD. In: Proceedings of SPIE, vol 3639, stereoscopic displays and virtual reality systems VI, San Jose, pp 84–91
Cossairt O, Møller C, Travis A, Benton SA (2004) Novel view sequential display based on DMD technology. In: Proceedings of SPIE, vol 5291, stereoscopic displays and virtual reality systems XI, San Jose, pp 273–278
de Zwart ST, IJzerman WL, Dekker T, Wolter WAM (2004) A 20″ switchable auto-stereoscopic 2D/3D display. In: Proceedings of 11th international display workshop (IDW), Niigata, pp 1459–1460
Eichenlaub JB, Hollands D, Hutchins JM (1995) A prototype flat plane hologram-like display that produces multiple perspective views at full resolution. In: Proceedings of SPIE, vol 2409, stereoscopic displays and virtual reality systems II, San Jose, pp 102–112
Ezra D, Woodgate GJ, Omar BA, Holliman NS, Harrold J, Shapiro LS (1995) New autostereoscopic display system. In: Proceedings of SPIE, vol 2409, stereoscopic displays and virtual reality systems II, San Jose, p 31
Hiddink MGH, de Zwart ST, Willemsen OH, Dekker T (2006) 20.1: locally switchable 3D displays. In: SID international symposium digest of technical papers, vol 37, San Francisco, pp 1142–1145
Hoffman DM, Girshick AR, Akeley K, Banks MS (2008) Vergence-accommodation conflicts hinder visual performance and cause visual fatigue. J Vis 8(3):33, 1–30
Im H-J, Jung S-M, Lee B-J, Hong H-K, Shin H-H (2008) 20.1: mobile 3D displays based on a LTPS 2.4″ VGA LCD panel attached with lenticular lens sheets. In: SID symposium digest of technical papers, vol 39, Los Angeles, pp 256–259
Käläntär K, Matsumoto SF, Katoh T, Mizuno T (2004) Backlight unit with double-surface light emission using a single micro-structured lightguide plate. J SID 12:379–387
Kao Y-Y, Huang Y-P, Yang K-X, Chao PCP, Tsai C-C, Mo C-N (2009) 11.1: an auto-stereoscopic 3D display using tunable liquid crystal lens array that mimics effects of GRIN lenticular lens array. In: SID symposium digest of technical papers, vol 40, San Antonio, pp 111–114
Kim Y, Jung J-H, Hong K, Park G, Lee B (2010) 37.4: accommodation response in viewing integral imaging. In: SID symposium digest of technical papers, vol 41, Seattle, pp 530–532
Koshida N, Dogen Y, Imaizumi E, Nakano A, Mochizuki A (2009) 45.2: an over 500 Hz frame rate drivable PSS-LCD: its basic performance. In: SID symposium digest of technical papers, vol 40, San Antonio, pp 669–672
Lippmann MG (1908) Épreuves réversibles donnant la sensation du relief. J Phys 7:821–825
Mo YG, Kim M, Kang CK, Jeong JH, Park YS, Choi CG, Kim HD, Kim SS (2010) 69.3: amorphous oxide TFT backplane for large size AMOLED TVs. In: SID symposium digest of technical papers, vol 41, Seattle, pp 1037–1040
Møller CN, Travis AR (2005) Time multiplexed autostereoscopic flat panel display using an optical wedge. In: Proceedings of SPIE, vol 5664, stereoscopic displays and virtual reality systems XII, San Jose, pp 150–157
Moseley RR, Woodgate GJ, Jacobs AMS, Harrold J, Ezra D (2002) Parallax barrier, display, passive polarization modulating optical element and method of making such an element. US Patent 6,437,915
Okano F, Hoshino H, Arai J, Yuyama I (1997) Real time pickup method for a three-dimensional image based on integral photography. Appl Opt 36:1598–1603
Ren H, Fox D, Wu S-T (2007) 62.1: liquid crystal and liquid lenses for displays and image processing. In: SID symposium digest of technical papers, vol 38, Long Beach, pp 1733–1736
Sakai H, Yamasaki M, Koike T, Oikawa M, Kobayashi M (2009) 41.2: autostereoscopic display based on enhanced integral photography using overlaid multiple projectors. In: SID symposium digest of technical papers, vol 40, San Antonio, pp 611–614
Schowengerdt BT, Seibel EJ (2005) 7.1: true 3D display technology. In: SID symposium digest of technical papers, vol 36, Boston, pp 86–89
Travis ARL (1990) Autostereoscopic 3-D display. Appl Opt 29:4341–4343
Travis ARL (1997a) The display of three-dimensional video images. Proc IEEE 85:1817–1832
Travis ARL (1997b) View-sequential holographic display. International Patent WO9900993
Travis ARL, Lang SR (1990) A CRT based autostereoscopic 3-D display. In: Eurodisplay 1990, 10th international display research conference, Amsterdam, LP10, 26–28 Sept 1990
Travis ARL, Møller CN, Lee CMG (2006) Flat projection for 3-D. Proc IEEE 94(3):539–549
Travis ARL, Large T, Emerton N, Bathiche S (2009) Collimated light from a waveguide for a display backlight. Opt Express 17:19714–19719
Wilkinson TD, Crossland WA, Coker T, Davey AB, Stanley M, Yu TC (1997) The fast bitplane SLM: a new ferroelectric liquid crystal on silicon spatial light modulator. In: Spatial light modulators, technical digest. Optical Society of America, Washington, DC, pp 149–150
Woodgate GJ, Harrold J (2003) LP-1: high efficiency reconfigurable 2D/3D autostereoscopic display. In: SID symposium digest of technical papers, vol 34, Baltimore, pp 394–397


Head- and Eye-Tracking Solutions for Autostereoscopic and Holographic 3D Displays

Enrico Zschau* and Stephan Reichelt
SeeReal Technologies GmbH, Dresden, Germany

Abstract
Head- and eye-tracked 3D display systems provide a glasses-free 3D experience while allowing free movement of the observer within the tracking range. In both autostereoscopic and holographic 3D displays, eye-tracking systems are used to follow the left and right eye positions of the observer in real time. Knowing the exact eye positions, the 3D display system provides the proper perspective views to the display user: for autostereoscopic displays these are the left and right 2D stereo sub-images, whereas for holographic displays these are the left and right holographic 3D reconstructions. For angular separation of the particular views, special optical light-steering devices are employed. Thus, fast eye tracking combined with smooth light steering ensures complete 3D visualization for the user at any time over a wide viewing range. In this chapter, we first discuss general aspects of video-based eye tracking for application in 3D displays and present implementations of real-time eye-tracking systems in autostereoscopic and holographic 3D displays. An implementation of an eye-tracking system is then discussed in detail, including the actual hardware and software solutions and the achieved performance. Commercially available solutions for eye tracking are evaluated in terms of their specifications and suitability for tracked 3D displays and compared with our system. We then continue with a general description of the optical system required for providing the designated views of a 3D display. Finally, we conclude with a brief summary and offer a perspective on possible future developments of tracked 3D displays.

List of Abbreviations
API     Application programming interface
ASD     Autostereoscopic display
CPU     Central processing unit
DSP     Digital signal processor
ET      Eye tracking
EWOD    Electrowetting on dielectrics
FD      Face detection
FOV     Field of view
FPGA    Field-programmable gate array
HAL     Hardware abstraction layer
IR      Infrared
LVDS    Low-voltage differential signaling
MC      Master control
OEM     Original equipment manufacturer
PC      Personal computer
PCA     Principal component analysis
PHY     Physical layer of Ethernet
RAM     Random-access memory
ROI     Region-of-interest
SDK     Software development kit
SRIO    Serial variant of RapidIO
SL      Support layer
SVM     Support vector machine
VW      Viewing window

*Email: [email protected]

Introduction
As commonly acknowledged within the electronic-display industry, the next evolutionary step will be the transition from two-dimensional (2D) to three-dimensional (3D) visualization. 3D technology has already entered the movie theater market and is starting to penetrate the consumer television market. Various technologies for visualizing three-dimensional scenes on displays have been technologically demonstrated and commercially refined over the last decade, among them stereoscopic, autostereoscopic, multiview, integral imaging, volumetric, and holographic types. As 3D displays vary in their optical design, the physical 3D image representation as well as the 3D performance is quite different. A good overview of state-of-the-art technologies is provided by the “3D Display Technology Family Tree” chart, compiled by the 3D@Home Consortium (ST4 02).

Which 3D technology finally becomes successful will depend crucially not only on its technical features and performance but especially on how well human factors are addressed. Visual conflicts will have a strong impact on the future acceptance and success of 3D displays. Inherent visual conflicts of stereo 3D with natural viewing are always going to limit capabilities and features, if not market saturation. Inconsistent depth cues such as convergence-accommodation mismatch and unnatural motion parallax reduce viewing comfort and might hinder the general progress of stereoscopic displays, especially for use at short viewing distances (Hoffman et al. 2008; Reichelt et al. 2010a).

The eye-tracking system of a general 3D display system (either autostereoscopic or holographic) is used for tracking the left and right eye positions of the observer. With this information, the proper perspective views (i.e., left and right stereo images or left and right holographic reconstructions, respectively) can be provided by specially designed optical light-steering devices. In general, the advantages of eye-tracked 3D displays can be summarized as follows:

• No need of wearing special 3D glasses
• Multiuser capability with individual content presentation
• Simultaneous 2D and 3D data visualization at any position and at any size
• Full resolution in autostereoscopic 3D at native display resolution (no spatial multiplexing required)
• Realization of important human-factor issues such as motion parallax (look-around effect) or region-of-interest-based 3D-visualization optimization

One special feature of tracking applications in 3D displays is that tracking is only needed when the viewer is actually watching the display. Owing to the incorporated video camera, the display system always knows the positions of the viewer(s) that are watching. This enables, for example, mimicking proper motion parallax even with autostereoscopic 3D displays, which is an important depth cue.


Fig. 1 Regions where the detection algorithms are executed: (a) search-phase and (b) tracking-phase

Tracked 3D display systems must fulfill the following key functions (with their determining parameters and design challenges listed in brackets):

1. Precise tracking of head and eye position in space (tracking area, camera exposure time, data transmission and processing, robustness, accuracy, and detection reliability)
2. Accurate light steering to the actual eye positions (data transmission, response time of the steering device, viewing range, aberrations)

Which approach is best suited for head/eye/gaze tracking and light steering depends not only on display size (signage, television, desktop, or mobile) but also on the application and the intended number of viewers, viewing range, viewer interaction, etc.

Basic Principles of Tracking

Tracking of Objects
Tracking in general means the process of locating specific moving objects and following their path over time. Adapted to the field of image processing, this means detecting and tracking previously selected objects in a continuous sequence of image frames from one or more video cameras. Tracking of objects is divided into two phases. During a first search-phase, interesting objects are detected by applying appropriate detection algorithms to the full scene provided by the camera; those objects are registered for the tracking process. During a second tracking-phase, the detection algorithm or a more specific tracking algorithm is applied, for each subsequent frame, only to small so-called tracking-regions around the registered objects. Compared with processing whole camera frames, the number of camera pixels to process when tracking those objects is thereby dramatically reduced. According to the current movements of the objects, the tracking-regions are adapted frame by frame in size and position. Their size depends on the maximum speed of the targeted objects and the camera’s frame rate – together these give an estimate of the maximum shift of an object between two subsequent camera frames. By using the information on speed, direction, and size, the object’s position and size in the following frame can be estimated, which helps to further reduce the tracking-regions to a size just slightly larger than the objects themselves (cf. Fig. 1a, b). For this estimation, the so-called Kalman filter can be applied (Kalman 1960). Alternatively, a direct calculation using 3D-vector operations can be applied, but effects like noise, which the Kalman filter models, then have to be taken into account.
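As an illustration of the estimation step, the following is a minimal one-dimensional constant-velocity Kalman filter of the kind referred to above, used here to predict where to center the next tracking-region. The noise levels, frame rate, and measurements are illustrative assumptions, not values from the text.

```python
import numpy as np

# Minimal 1-D constant-velocity Kalman filter for predicting where a tracked
# object will be in the next frame (one filter per axis).
class Kalman1D:
    def __init__(self, dt, q=1.0, r=4.0):
        self.x = np.zeros(2)                        # state: [position, velocity]
        self.P = np.eye(2) * 1e3                    # state covariance (uncertain)
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity model
        self.H = np.array([[1.0, 0.0]])             # we only measure position
        self.Q = np.eye(2) * q                      # process noise
        self.R = np.array([[r]])                    # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[0]                            # predicted position

    def update(self, z):
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + (K @ (z - self.H @ self.x)).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P

# Center the next tracking-region on the predicted position:
kf = Kalman1D(dt=1 / 60)                            # assumed 60 Hz camera
for z in (100.0, 102.1, 104.0):                     # measured x-positions (px)
    kf.update(np.array([z]))
    next_x = kf.predict()                           # where to place the region
```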


A major advantage of such tracking methods is the reduction of the computing time needed to process the camera frames while maintaining high frame rates in tracking mode; this presumes small-sized objects within a large camera frame. In changing environments, and when more than one object is to be tracked, the algorithm may additionally have to deal with objects appearing and disappearing. Disappearing can be handled quite simply: the object is no longer detected in its tracking-region and is unregistered. Newly appearing or temporarily occluded objects have to be detected again, which is typically done by occasionally performing a full detection or by scanning the full frame in multiple iterations over multiple frames.

Feature Detection in General
Depending on the application, specific objects have to be found and tracked. For autostereoscopic and holographic 3D displays, for example, the tracking system has to detect and track the faces and/or eyes of the observers to allow free head movements without losing the 3D effect. In this case, the objects or features to track are faces and eyes. A face may be split up into multiple features like eyes, nose, mouth, and eyebrows, so it can also be represented by a group of features, as may be necessary in a specific detection algorithm.

A feature is typically modeled by multiple parameters, which are extracted from the raw camera frame by applying different image-processing filters and algorithms. Hierarchical processing models use different levels of complexity for their features – with increasing level, feature models become more complex in terms of the number of parameters or the effort to calculate them. This allows complex hierarchical feature models: for example, by applying a simple but inaccurate feature model with many false positives in the first level, the number of locations to process in the next level is greatly reduced. For simple features, i.e., only one simple feature model, a simple processing model may be sufficient.

The detection of a feature typically occurs within a camera frame, where specific parameters defined in the feature model are extracted from the image data. Those parameters are calculated by applying filters, kernels, or complex algorithms to the raw color or grayscale values in the image and have to be chosen such that they differ significantly from those of other image regions. Afterwards, operators are used to reduce the information (i.e., many pixels) to significant parameters (e.g., an average value). As an example, filters or kernels can be used to highlight specific parts or objects in the camera frame, such as edges, gradients, or other special patterns.

In the following, a very simple example of image-based object detection is given: large red blobs in a blue liquid have to be found (cf. Fig. 2a). The model parameters of the blobs are: red color, large size, and nearly symmetric shape. At first, we apply a threshold operation to remove the blue liquid; only the red parts remain. Furthermore, a binarization is applied where all removed parts are set to zero and all other parts to one (Fig. 2b). By calculating the regions of all directly connected pixels marked by ones, we get multiple objects and their properties like size, mass, contour, and so on (Fig. 2c). By further analyzing the contours of the objects, one may filter out regions that do not have the expected shape or size – the remaining regions are the large red blobs we were originally looking for (Fig. 2d).

Fig. 2 Simple example of image-based object detection comprising (a) start, (b) thresholding and binarization, and (c) component connection. Sub-figure (d) shows the final result
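A minimal sketch of this blob-finding pipeline using OpenCV follows; the redness measure, the threshold, the minimum area, and the symmetry bounds are illustrative assumptions, not values from the text.

```python
import cv2
import numpy as np

def find_red_blobs(bgr_image, min_area=500):
    """Return centroids of large, nearly symmetric red regions."""
    b, g, r = cv2.split(bgr_image.astype(np.int16))
    redness = np.clip(r - b, 0, 255).astype(np.uint8)    # suppress the blue liquid
    _, binary = cv2.threshold(redness, 60, 255, cv2.THRESH_BINARY)
    # Connected-component analysis: label directly connected pixel regions.
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
    blobs = []
    for i in range(1, n):                                # label 0 is background
        area = stats[i, cv2.CC_STAT_AREA]
        w, h = stats[i, cv2.CC_STAT_WIDTH], stats[i, cv2.CC_STAT_HEIGHT]
        # Keep only large, nearly symmetric regions, as in the model above.
        if area >= min_area and 0.7 < w / h < 1.4:
            blobs.append(tuple(centroids[i]))
    return blobs
```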
Often features cannot be modeled that easily, e.g., when strong parameter variations occur or no generally distinct parameters can be found to differentiate the feature exactly. In this case, so-called classifiers known from the field of artificial intelligence are very convenient. Such classifiers are typically trained with multiple samples, where each sample is marked as belonging to a specific group. In the simplest case, two groups – positives and negatives – are used. After training with positive and negative samples, the classifier is able to assign one of the two groups to a new sample (one that has not been trained). This allows handling complex relationships between model parameters that are not easy to formulate directly. To succeed with this method, however, it is important to find appropriate parameters that together significantly separate a feature from all other features and non-features. A very popular classifier is the so-called support vector machine (SVM) (Cortes and Vapnik 1995). Other methods to classify data are the so-called eigenvectors (Lyons et al. 1999), which are based on principal component analysis (PCA) (Jolliffe 2002).
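For illustration, here is a tiny two-class SVM in the sense just described, using scikit-learn; the feature vectors and labels are stand-ins, not a real positive/negative training set.

```python
from sklearn.svm import SVC

# Feature vectors (e.g., filter responses) labeled positive (1) or negative (0).
X_train = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
y_train = [1, 1, 0, 0]

clf = SVC(kernel="rbf").fit(X_train, y_train)
print(clf.predict([[0.85, 0.15]]))  # -> [1], assigned to the positive group
```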

Methods to Detect Faces and Eyes
For face detection, a variety of methods have been developed. Typical methods use a combination of filters and classifiers, where the image is filtered by multiple operators such as edge detection (e.g., Sobel and Feldman 1973 or Canny 1986) or the Viola-Jones operator (Viola and Jones 2004). The output of each filter is classified to decide later between a face and a non-face. By using a hierarchical method as stated above, the quality and performance can be clearly improved to allow real-time operation. A major problem in face detection is the vast variety of faces across different ethnic groups, hairstyles, beards, and accessories, not to mention specific poses and facial expressions. Face-detection systems therefore typically concentrate on the facial areas with low variation, like the eyes, nose, and mouth. The image data are typically transformed to normalized grayscale, so that different skin colors and unknown scene illumination have less influence on the algorithms.
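A minimal sketch of the cascade approach (Viola and Jones 2004) using the pre-trained frontal-face detector shipped with OpenCV; the detection parameters are commonly used defaults and are assumptions here, as is the availability of the bundled cascade file via cv2.data.

```python
import cv2

# Load the stock Viola-Jones frontal-face cascade bundled with opencv-python.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(gray_frame):
    """Return (x, y, w, h) face boxes; input should be normalized grayscale."""
    return cascade.detectMultiScale(gray_frame, scaleFactor=1.1,
                                    minNeighbors=5, minSize=(40, 40))
```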

The methods for eye detection are similar to those for face detection. Often pattern-based approaches in combination with classifiers are used, because eyes do not vary as much as faces do. Depending on the available resolution in the image, details like the iris, eye corners, and pupils may be separately detected.

Another quite different method for detecting eyes is based on active, pulsed infrared (IR) illumination, which makes use of the so-called red-eye effect (or retina reflex) (Morimoto et al. 2000). This method uses a special camera setup that sequentially records two images, one where the observer is illuminated by an on-axis IR light source (located very close to the camera) and the other using an off-axis IR light source. The on-axis image shows a bright spot at the pupil’s location, whereas in the off-axis image the pupil appears black; together this is referred to as the “bright and dark pupil effect.” By differencing the two images, the pupil locations can be detected quite easily. Because of the time-sequential capture, cameras with a high frame rate are essential to cope with fast movements of the observers. In addition to the retina reflex, the so-called cornea reflex can be detected, which appears as a bright spot on the cornea resulting from the off-axis illumination. Combined with the retina reflex, this is used to determine gaze direction, which can serve as an additional feature in autostereoscopic and holographic 3D displays.
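A hedged sketch of the differencing step: subtract the off-axis frame from the on-axis frame so that mainly the bright retina reflex remains, then keep the two largest spots as pupil candidates. The threshold value is an illustrative assumption.

```python
import cv2

def find_pupils(on_axis, off_axis, thresh=40):
    """Return up to two pupil-candidate centroids from a bright/dark pupil pair."""
    diff = cv2.subtract(on_axis, off_axis)           # bright pupils stand out
    _, binary = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    n, _, stats, centroids = cv2.connectedComponentsWithStats(binary)
    # Sort candidate spots by area and keep the two largest as pupil guesses.
    spots = sorted(range(1, n), key=lambda i: -stats[i, cv2.CC_STAT_AREA])
    return [tuple(centroids[i]) for i in spots[:2]]
```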

Implementation of an Eye-Tracking System
This section describes the specific platform and software solution developed by SeeReal to enable real-time eye tracking in their autostereoscopic and holographic 3D displays. First, some general considerations regarding the realization of a complete eye-tracking system are mentioned; then components such as the camera system, the software design, and the hardware platform are presented.

General Eye-Tracking System Design
Depending on the type of 3D display, different requirements are imposed on the observer tracking. For 3D displays providing a large sweet spot, relatively inaccurate face tracking may be sufficient. If higher accuracy is needed, the system should provide eye tracking or even pupil tracking. Furthermore, the environment’s background illumination must be considered. In dark or highly variable lighting situations, a continuously emitting IR illumination and optical filters (band-pass or long-wave pass) are useful to improve image quality, but this implies the use of grayscale images only.

When the 3D display requires face or eye positions in 3D, at least two mutually adjusted cameras are needed: by using stereophotogrammetry, the distance of a feature or image position from the camera can be calculated. The accuracy of distance determination depends mainly on the distance between the cameras, the sensor resolution, and the lenses used, with regard to field of view and optical quality (vignetting, distortion, field curvature). The basic principle of stereo analysis is to find the location of one distinct feature or object in both images captured by the two cameras. The lateral difference between the two positions in the images, the so-called disparity, is used to calculate the metrical depth. For these processes, pattern-matching algorithms are often used, which require some structure inside the scene captured by the cameras to provide the best results; the image quality and the capabilities of the matching algorithms directly influence the accuracy that can finally be achieved. Even if the 3D display does not require 3D coordinates of the viewer position, e.g., if an angular position is sufficient, the depth information is very useful. For example, it can be utilized for a rough estimate of the feature sizes expected in the captured image.

Depending on the specific requirements of the eye-tracking system, the camera optics and sensor must be chosen as carefully as the set of algorithms for face and/or eye detection to finally fulfill all requirements with regard to tracking range, system accuracy, and system speed. The complexity of the algorithms defines the required performance of the hardware platform. The integration of a tracking system into a 3D display therefore requires not only performance issues to be considered; form factor, power consumption, and full integration capability have to be considered as well. A common practice is to implement the system on a PC first, which allows evaluation of the algorithm performance, hot spots, and computational needs in order to finally define the embedded hardware for product solutions.
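For a rectified stereo pair, the disparity-to-depth relation is Z = f·B/d, with focal length f in pixels, baseline B between the cameras, and disparity d in pixels. A minimal sketch; the numerical values are illustrative assumptions.

```python
def depth_mm(disparity_px, focal_px, baseline_mm):
    """Metrical depth of a feature matched in both images of a rectified pair."""
    if disparity_px <= 0:
        raise ValueError("feature must be matched in both images")
    return focal_px * baseline_mm / disparity_px

# Example: 80 px disparity, 1600 px focal length, 100 mm baseline -> 2 m depth.
print(depth_mm(disparity_px=80.0, focal_px=1600.0, baseline_mm=100.0))  # 2000.0
```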


Fig. 3 Stereo camera system for observer tracking. The camera system is integrated in the display housing

Camera System
The camera system for SeeReal’s autostereoscopic and holographic 3D displays is a custom, in-house designed stereo camera system based on two identical grayscale image sensors that operate at a frame rate of 60 Hz (Fig. 3). Different camera lenses have been chosen and integrated to meet different requirements on tracking accuracy and tracking range for autostereoscopic and holographic 3D displays. A continuously emitting IR light source and lens-mounted long-wave pass filters are used to address the potential impact of strongly varying background illumination. Thus, operation is possible anywhere from completely dark rooms to situations with direct sunlight. In any background illumination situation, sufficient IR light is received by the image sensor, providing high contrast within the face while maintaining good image quality with low noise. This is in contrast to systems that operate without extra illumination, where noise-prone electronic image amplification is occasionally required. For each individual camera system, a set of calibration data is generated using a special in-house designed calibration device. Hence, a precise calculation of the epipolar lines used for stereo analysis and the calculation of exact metric 3D positions from 2D image positions are guaranteed. This calibration process also measures and compensates for lens distortions.

General Software Design
The software for real-time eye tracking was developed with portability in mind, so it consists of the eye-tracking software proper and a general part providing a kind of operating system to abstract the hardware platform. The realized software system consists of three main components (cf. Fig. 4). First, the hardware abstraction layer (HAL) provides access to low-level interfaces and devices like the camera, in-system communication, Ethernet access, memory management, storage, and task management, among many others. The second component, the support layer (SL), provides high-level services like communication protocols, job queues, and the file system, which are independent of the hardware but may be shared between different applications. Together they form the general part of the system; they provide the basis for the third component, the application, and operate like an operating system specialized for image-processing applications. The main reason for creating the HAL and SL was to enable easy portability of the application to different hardware platforms. The system currently allows executing the same application compiled for both a DSP/FPGA platform and a Windows PC, with the application code being exactly the same on both platforms. The Windows platform consists of a standard PC with a Camera Link (http://www.imagelabs.com/pdf/CameraLink5.pdf) frame grabber add-on card. The DSP platform contains multiple DSPs and FPGAs and appropriate interfaces; their number depends on the processing power required.


Fig. 4 Overview of the general software design

The application is divided into multiple tasks running in parallel on multiple processors – on multiple DSPs, or inside multiple threads on the PC. This makes it possible to utilize multicore environments and enables scalability. The communication between those tasks is done using Serial RapidIO on the DSPs and inter-thread communication on the PC. The DSPs are designed to connect to an FPGA, which allows algorithms suitable for massive parallelization to be implemented as FPGA designs. One disadvantage of using FPGAs is the complexity of programming them, which is much more time consuming than optimizing an algorithm on a DSP or PC using assembly language. On the other hand, they provide much processing power at low energy consumption in a compact form, enabling integration. So a trade-off between development time and performance boost must be made when choosing which algorithms or code fragments should run on the FPGA and which on the DSP. As stated, the general suitability of an algorithm for execution on an FPGA must also be given – intensely sequential logic paths with multiple dependencies on former results may run much faster on a processor/DSP than on an FPGA, because FPGAs are typically driven at a lower clock frequency than DSPs.

The general FPGA design of the platform is structured into multiple modules besides buses and interfaces to external devices. The most important modules are the camera interface utilizing Camera Link to access image data, the RAM access to drive external RAM, the DSP interface to exchange data with the DSPs, the in-system bus connecting all modules with each other, and the job modules, which encapsulate the algorithms optimized as FPGA designs (see Fig. 5).

The Eye-Tracking Application
The application, in this case the eye-tracking software, is divided into three main tasks: (1) face detection (FD), (2) eye detection and tracking (ET), and (3) master control (MC). According to the available resources (i.e., number of DSPs or CPU cores), the face detection and tracking tasks make use of further tasks to speed up their operation. For each observer to be tracked, one ET-task is instantiated and executed on a separate core, which allows multiuser operation out of the box. Depending on the designated number of observers that are to see proper 3D scenery, the DSP/FPGA platform has to be dimensioned appropriately, so a modular hardware design is essential.

Fig. 5 Overview of the FPGA design

The FD-task is responsible for continuously detecting all faces in the entire tracking area (typically the camera’s field of view). The detected faces are collected by the MC-task and then sorted and logically assigned to the appropriate ET-tasks, where each ET-task tracks one particular observer. The ET-tasks are then updated with the new face positions. If a new observer enters the tracking area, a free ET-task (if available) is selected and assigned to this new observer. An ET-task performs a detailed eye detection in the facial area in the first frame. During the following frames, faster eye tracking is enabled as long as the eyes are steadily detected and not occluded or outside the tracking area. The continuously updated face positions are used for verification purposes. For the face detection algorithms, different software development kits (SDKs) can be used. For SeeReal’s eye-tracking system, the general requirements on such a face-detection algorithm are:

• General portability/availability on different platforms
• Less than 100 ms frame time to detect multiple observers within a large area (i.e., 30° in a depth range of 0.5–3 m)
• Reliability of more than 99 % detection rate at a precision of around 20 mm

The eye detection and tracking algorithms used in the ET-task are based on image processing to find the best eye candidates. The most important operations are normalization, thresholding, and segmentation as well as connected-component analysis. This, combined with a classifier, is used to filter out all false positives. The classifier is trained with a very large set of multicultural facial images to provide very high reliability across multiple ethnic groups. The use of stereo analysis allows the calculation of metrical 3D eye positions and helps to increase the detection and tracking performance, in terms of processing time and reliability, through precise feature-size estimation.

Hardware Design
The main reason for developing dedicated hardware for eye tracking is the ability to integrate it into a 3D display device. When deciding on a specific platform, a trade-off between universality and specialization has to be made. Furthermore, algorithms in this field continuously improve, so some headroom for algorithm upgrades should always be planned, especially when a new platform is designed. A good balance is to create an expandable hardware platform that can be upgraded by adding additional modules (e.g., processors/DSPs/FPGAs), to increase the lifetime of a hardware design and save development costs.


Fig. 6 Overview of the hardware design: (a) FPGA module and (b) DSP module

The hardware design developed by SeeReal consists of multiple hardware modules responsible for different tasks; in general, DSP modules, FPGA modules, and add-on modules have been designed. The base version is composed of two modules, a DSP and an FPGA module. The DSP module implements multiple DSPs from Texas Instruments, DDR2 RAM, flash memory, and a network PHY (see Fig. 6b). The FPGA module provides an Altera FPGA, DDR2 RAM, and flash memory besides multiple external interfaces for different purposes like LVDS and Camera Link (see Fig. 6a). The DSPs on a DSP module are connected by Serial RapidIO (SRIO) lanes to enable high-speed communication at 2.5 Gbit/s per lane. In addition, each DSP module provides an external SRIO port to connect to other DSP modules, which allows the calculation power to be increased by running more application tasks in parallel. Likewise, multiple FPGA modules can be added to the system and connected with each other using LVDS to enable high-resolution image data transfer at full frame rate. The current system is able to track up to four observers simultaneously; owing to the modular design, even more observers can be tracked simply by adding more hardware modules. This platform enables eye tracking to be integrated into 3D displays, but it is also useful as a research platform to improve algorithm performance, to test new algorithms, and to prepare for a higher level of integration, i.e., to realize the hardware platform as a highly integrated system or system-on-chip. In parallel to the DSP/FPGA hardware, the software system runs on a PC utilizing an Intel Core 2 Quad CPU, with a Camera Link frame grabber providing the images from the camera system. This system also allows tracking of up to four observers simultaneously.

Overall Performance
The eye-tracking software system makes it possible to precisely detect and follow the eye movements of up to four independent observers, and even more with additional hardware modules. Applied to the hardware platforms described above, a frame rate of 60 Hz is achieved in tracking mode. The position accuracy and detection rate have been determined using a large image database (stereo image sequences with continuously emitting IR illumination and hand-marked pupil positions), which covers a large number of individuals from different ethnic groups. The eye-region resolution is given by the specific camera system, which in turn has been designed for the needs of the particular 3D display (viewing distance, viewing range). The eye-region resolution is defined as the number of pixels per eye distance (number of pixels between both eyes). The results are shown in Table 1.

Table 1 Accuracy of SeeReal tracking systems for given eye-region resolutions. Eye-region resolution is adapted to different requirements for autostereoscopic and holographic 3D display prototypes

Eye-region resolution (pixel) | Accuracy (mm) | Detection rate (%)
100 | 2.5 | 99.4
50 | 5 | 99.4

Depending on the given eye-region resolution, an accuracy of 2.5 or 5 mm is achieved at a detection rate of >99.4 %. The detection rate is defined as the number of correctly identified eye pairs that are within the predefined accuracy tolerances, compared to all detectable eye pairs. Since the tracking accuracy depends only on the eye-region resolution, a specific system with special requirements for accuracy and tracking range can be realized by choosing appropriate sensors and lenses. The system has thus been adjusted to meet the requirements of different applications, especially for 3D displays. Table 2 gives an overview of the 3D displays realized by SeeReal and the camera systems implemented.

Table 2 Overview of applications for SeeReal’s eye-tracking system

Application | Tracking range | Accuracy (mm) | Camera system
Autostereoscopic 3D display | H/V: 45°/45°, DR: 0.5–1.5 m | 5 | 0.4 megapixel
Holographic 3D display | H/V: 16°/12°, DR: 1.7–2.5 m | 2.5 | 0.4 megapixel
Holographic 3D display | H/V: 40°/30°, DR: 1–2.5 m | 2.5 | 4 megapixel

H/V horizontal/vertical tracking area, DR depth range of the tracking area

The autostereoscopic 3D display (Fig. 7a; Stolle et al. 2008) uses eye tracking with lower accuracy but a larger tracking range. Another application is as the eye-tracking system for holographic 3D displays with very stringent requirements on tracking accuracy (see Fig. 7b; Reichelt et al. 2010b; Zschau et al. 2010). The display has to reposition the holographic Viewing Window of approximately 8 mm in size exactly to the pupil position of the observer, such that the pupil lies within this 8 mm range at all times. To shift the Viewing Window immediately along with the observer’s movements, a very short response time is essential. To manage the Viewing Window shifting sufficiently fast, the whole system has a low latency between sensor illumination (shutter opens) and the output of the 3D coordinates. To further minimize the system delay, a prediction algorithm can be applied, which provides an estimate of the expected 3D positions in the next few frames; this helps to overcome the delay for fast and continuous head movements. Experimental results show a distinct improvement in position stability compared to the standard algorithm without prediction.
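The prediction step can be as simple as a linear extrapolation from recent measurements. A minimal sketch of the idea; the frame rate, look-ahead, and coordinates are illustrative assumptions, not SeeReal’s actual implementation.

```python
# Extrapolate the eye position a few frames ahead from the last two
# measurements, assuming constant velocity over that short interval.
def predict_ahead(p_prev, p_curr, frames_ahead=2):
    """p_prev/p_curr: (x, y, z) eye positions from consecutive frames."""
    return tuple(c + frames_ahead * (c - p) for p, c in zip(p_prev, p_curr))

# Eye moving ~3 mm per frame in x at 60 Hz; predict two frames (~33 ms) ahead:
print(predict_ahead((100.0, 0.0, 700.0), (103.0, 0.0, 700.0)))
# -> (109.0, 0.0, 700.0)
```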

Comparison with Commercially Available Eye-Tracking Systems
There are several commercial eye-tracking solutions available that could, in principle, be integrated into a 3D display to perform the eye-tracking task. Some provide a separate camera system, and some provide a normal LC display with integrated eye-tracking cameras. Regarding the processing platform, both systems with dedicated calculation hardware and PC-based programs are offered.


Fig. 7 3D displays from SeeReal: (a) autostereoscopic 3D display “NextGen” and (b) holographic 3D display “VISIO 20”

Most of these systems are designed to measure not only the eye position but also the gaze direction. The main applications for such systems are user studies based on gaze direction, human-computer interaction, support of disabled people, and measurement of fatigue in automotive environments. On the other hand, this feature brings the disadvantage of a small tracking area, the so-called head-box, which mainly results from the requirement to capture the eye regions at high resolution. The observer distance is typically around 60 cm, where the head-box defines the freedom of movement; a typical movement range of around 20 cm in depth, 25 cm in the horizontal, and 15 cm in the vertical direction is given. Table 3 lists some eye-tracking systems on the market and their main specifications, with a focus on those with separate camera systems (specifications as available at the date of publication). Besides complete eye-tracking systems, SDKs providing software APIs for face or eye tracking are offered for use in custom system developments. Most of them are provided as Windows SDKs, besides other PC-based platforms. To realize an integrated version in a 3D display, the appropriate licensing to enable porting of the software to the target hardware must be considered, besides the additional development resources. Some of them, including their core features, are listed in Table 4.

Optical Light Steering in 3D Displays
Besides the tracking task, 3D displays have to perform a subsequent step of light steering according to the movement of the observer. We have developed different alternatives for the optical light steering in 3D displays; two of them are explained here in more detail. The steering of the sweet spot or Viewing Window can be done, for example, by shifting the light source and thus shifting the image of the light source accordingly, or by placing an additional optical deflection element, realizing a variable deflection, in front of the information panel. In the following, we discuss implementations of these alternatives.

Light Steering by Variable Light Source Positions

The first principle that was developed and implemented in autostereoscopic “NextGen” and holographic “VISIO 20” prototypes is based on active shifting of the light source. The optical principle is schematically sketched in Fig. 8. By imaging through a lens, a shift of any light source in object space results in a shift of its image.
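A minimal thin-lens sketch of this principle: shifting the light source by dx shifts its image (the sweet spot or Viewing Window) by the lateral magnification times dx, in the opposite direction. The distances used below are illustrative assumptions, not values from the prototypes.

```python
def image_shift_mm(source_shift_mm, source_dist_mm, focal_mm):
    """Lateral shift of a light source's image for a given source shift."""
    v = 1.0 / (1.0 / focal_mm - 1.0 / source_dist_mm)  # image distance
    return -(v / source_dist_mm) * source_shift_mm     # magnified, inverted

# Source plane 50 mm behind a 45 mm focal-length lens, image ~450 mm away:
print(image_shift_mm(1.0, 50.0, 45.0))  # -9.0 mm of Viewing-Window shift per mm
```

The magnification thus acts as a lever: a small, fast mechanical or electronic shift of the light source produces a much larger shift of the Viewing Window at the observer.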


Table 3 Overview of some commercially available eye-tracking systems

Product | Solution | Tracking range/head-box | Position accuracy | Speed
Tobii IS | Complete system with small form factor and embedded processing | H/V: 40 × 30 cm, D: 60 cm/var, DR: 40–80 cm | N/A | 20–30 Hz
SmartEye Pro 5 | Complete system and embedded processing | H/V: 44 × 22 cm, D: 70 cm, DR: 50–80 cm | N/A | 60 Hz, L: 33 ms
… | Custom setup | H/V: 35 × 23 cm, D: var, DR: 30 cm | 1 mm | N/A

10 % MTF at 35 cycles/mm for all optimization fields, 3 arcmin, 27 % distortion. However, this is compensated for in the relay lenses. In order to achieve 50 lp/mm resolution across the FOV and the weight target, the authors used a combination of plastic and glass materials. The relay was composed of six lenses: four polymer lenses, each having an aspherical surface, and two glass lenses with spherical surfaces. The strong aspheres close to the screen help with distortion correction. The authors also designed an all-polymer relay, and they report that the hybrid relay performed 33 % better in terms of transverse color correction.

Conclusion
A collection of head-worn display optical architectures was presented, which illustrates potential starting points organized by the FOV parameter for three FOV regimes.


Table 4 Mid-FOV designs (between 40° and 60°)

Chen. Helmet visor display employing reflective, refractive, and diffractive optical components. US 5,526,183, 11 June 1996
Chen. Wide spectral bandwidth virtual image display system. US 5,436,763, 25 July 1995
Chen. Ultrawide FOV, broad spectral band visor display optical system. US 5,499,139, 12 Mar 1996
Takeyama. Image display apparatus. US 6,342,871, 29 Jan 2002
Togino. Visual display apparatus. US 5,436,765, 25 July 1995
Erfle. Ocular. US 1,478,704, 25 Dec 1923
Hoshi et al. Off-axial HMD optical system consisting of aspherical surfaces without rotational symmetry. In: Proc. of SPIE, vol 2653
Becker. Head-mounted display for miniature video display system. US 5,003,300, 26 Mar 1991


Fig. 3 (a and b) Canon video see-through systems (Adapted from Takagi et al. (2000) and Inoguchi et al. (2008))


Fig. 4 Visually coupled airborne systems simulator (VCASS) (Adapted from Buchroeder et al. (1981))

Fig. 5 Rotationally symmetric but tilted optical system (Adapted from Sisodia et al. (2007))


Fig. 6 The advanced helmet-mounted display (AHMD)

Fig. 7 An example wide FOV (120° × 67°)

Further Reading
Azuma R, Baillot Y, Behringer R, Feiner S, Julier S, MacIntyre B (2001) Recent advances in augmented reality. IEEE Comput Graph Appl 21(6):34–47
Berman AL, Melzer JE (1989) Optical collimating apparatus. US Patent 4,859,031
Buchroeder RA, Seeley GW, Vukobratovich D (1981) Design of a catadioptric VCASS helmet-mounted display. Optical Sciences Center, University of Arizona, final report, under contract to U.S. Air Force Armstrong Aerospace Medical Research Laboratory, Wright-Patterson Air Force Base, Dayton, Ohio, AFAMRL-TR-81-133
Cakmakci O, Rolland JP (2007) Design and fabrication of a dual-element off-axis near-eye optical magnifier. Opt Lett 32(11):1363–1365
Cakmakci O, Moore B, Foroosh H, Rolland JP (2008) Optimal local shape description for rotationally non-symmetric optical surface design and analysis. Opt Express 16:1583–1589
Cakmakci O, Thompson KP, Vallee P, Cote J, Rolland JP (2010) Design of a freeform single-element head-worn display. Proc SPIE 7618:761803
Cameron AA (2009) The application of holographic optical waveguide technology to the Q-Sight family of helmet-mounted displays. Proc SPIE 7326:73260H
Garrard K, Bruegge T, Hoffman J, Dow T, Sohn A (2005) Design tools for freeform optics. In: Mouroulis P, Smith W, Johnson B (eds) Current developments in lens design and optical engineering VI. Proceedings of SPIE, vol 5874. SPIE Press, Washington, DC
Huxford RB (2004) Wide FOV head mounted display using hybrid optics. In: Mazuray L, Rogers PJ, Wartmann R (eds) Proceedings of optical design and engineering, vol 5249. St. Etienne, France, pp 230–237
Inoguchi K, Matsunaga M, Yamazaki S (2008) The development of a high-resolution HMD with a wide FOV using the shuttle system. In: Proceedings of head- and helmet-mounted displays XIII: design and applications, vol 6955. SPIE, Washington, DC, p 695503
La Russa JA (1976) Image forming apparatus. US Patent 3,940,203
Melzer JE (1998) Overcoming the FOV: resolution invariant in head mounted displays. In: Lewandowski RJ, Haworth LA, Girolamo HJ (eds) Helmet- and head-mounted display III. Proceedings of SPIE, vol 3362. SPIE, Washington, DC
Mukawa H, Akutsu K, Matsumura I, Nakano S, Yoshida T, Kuwahara M, Aiki K, Ogawa M (2008) A full color eyewear display using holographic planar waveguides. Proc Soc Inf Dis 39(1):89–92
Rodgers M, Thompson KP (2004) Benefits of freeform mirror surfaces in optical design. In: Proceedings of the American Society for Precision Engineering, Raleigh
Rogers JR (1983) Aberrations of optical systems with large tilts and decentrations. In: Rogers F (ed) Optical system design, analysis, and production. Proceedings of SPIE, vol 399. SPIE, Washington, DC
Rolland JP (1994) Head-mounted displays for virtual environments: the optical interface. In: International lens design conference, proceedings of the Optical Society of America, vol 22. Optical Society of America, Washington, DC, pp 329–333
Rolland JP (2000) Wide angle, off-axis, see-through head-mounted display. Opt Eng 39(7):1760–1767 (special issue on pushing the envelope in optical design software)
Sisodia A, Bayer M, Townley-Smith P, Nash B, Little J, Cassarly W, Gupta A (2007) Advanced helmet mounted display (AHMD). In: Brown RW, Reese CE, Marasco PL, Harding TH (eds) Head- and helmet-mounted displays XII. Proceedings of SPIE, vol 6557, 65570N. doi:10.1117/12.723765
Takagi A, Yamazaki S, Saito Y, Taniguchi N (2000) Development of a stereo video see-through HMD for AR systems. In: International symposium on augmented reality, Munich, 5–6 Oct 2000
Thompson KP (1980) Aberration fields in tilted and decentered optical systems. Ph.D. dissertation, University of Arizona, Tucson, AZ
Thompson KP (2005) Description of the third-order optical aberrations of near-circular pupil optical systems without symmetry. J Opt Soc Am A 22:1389–1401
Thompson KP (2009) The multi-nodal 5th order optical aberrations of optical systems without rotational symmetry, part 1: spherical aberration. J Opt Soc Am A 26:1090–1100
Thompson KP (2010) The multi-nodal 5th order optical aberrations of optical systems without rotational symmetry, part 2: the comatic aberrations. J Opt Soc Am A 27:1490–1504
Thompson KP, Schmid T, Cakmakci O, Rolland JP (2009) A real ray-based method for locating individual surface aberration field centers in imaging optical systems without symmetry. J Opt Soc Am A 26:1503–1517


Electronic Viewfinders

Ian Underwood and David Steven

Contents
Introduction and Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Optical Design and Trade-Offs in EVFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Magnification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Eye Relief and Eye Box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Field of View (FoV) and Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Lens Type, Material, and Surface Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Comparison of OVF with Electronic Display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Comparison of EVF with Direct-View Panel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Overview of Imager Technologies EVF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Applications of EVFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Consumer and Prosumer Digital Video Cameras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Digital Still Cameras (DSCs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Digital Video Camcorders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Further Readings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

Abstract

A viewfinder is a compact unit that, when positioned appropriately in front of the eye, allows a photographer or videographer to assess an image or video sequence before, during, and/or after capture. We review and compare different classes of viewfinder and their applications.

I. Underwood (*) School of Engineering, University of Edinburgh, Edinburgh, UK e-mail: [email protected] D. Steven Midlothian Innovation Centre, Optovise Ltd, Roslin, Edinburgh, Scotland, UK e-mail: [email protected] # Springer-Verlag Berlin Heidelberg 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_138-2


List of Abbreviations

DSC      Digital still camera
DVC      Digital video camera
DVP      Direct-view panel
D-SLR    Digital single-lens reflex
EID      Electronic information display
EVF      Electronic viewfinder
EVIL     Electronic viewfinder interchangeable lens
FoV      Field of view
HTPS     High-temperature polysilicon
OLED     Organic light-emitting diode
OVF      Optical viewfinder
PHOLED   Phosphorescent OLED
P-OLED   Polymer OLED
qVGA     Quarter VGA (320 × 240 pixels)
SLR      Single-lens reflex
SMOLED   Small-molecule organic light-emitting diode
VGA      Variable graphics adapter, normally 640 × 480 pixels
VF       Viewfinder or view-finder

Introduction and Classification

A generally accepted definition of a viewfinder (VF) is a compact unit that allows a photographer to see and assess an image before, during, or after capturing it or a videographer to view a video sequence before, during, or after capture. This covers the vast majority of viewfinders in use today. There are many additional niche applications of viewfinders including, for example, night vision units, range finders, and thermal imagers. VFs can be direct VF (Ray 2002), in which case the scene is imaged directly into the eye of the end user, or screen VF (Ray 2002), in which case a real image is produced on a screen for viewing. VFs can be optical (OVF), in which case the image is viewed through a purely optical arrangement, or electronic (EVF), in which case the image is captured electronically then displayed on a miniature electronic information display (EID) that is viewed under optical magnification through an eyepiece as illustrated in Fig. 1. Thus, to be pedantic, an EVF actually encompasses both optical and electronic functionality, whereas an optical viewfinder is purely optical. As a short aside, a binocular head-mounted display (HMD) might be thought of as a matched pair of hands-free or wearable EVFs. HMDs are treated in Part 10.4 in this volume (see chapter “▶ See-Through Head Worn Display (HWD) Architectures”).


Fig. 1 Schematic showing principle of operation of EVF

History

Some artificial assistance has always been necessary to allow the capture of precisely framed images on film or on an electronic image sensor. In the very early days of film photography (when the default medium was sheet film), a photographer could view the critical factors of the composition and focus of the intended photograph by looking at the scene imaged onto a ground glass screen in the plane of the film prior to image capture. When satisfied, the photographer would insert a sheet of film and expose the image. Viewed from today, this was a slow and laborious process. As film quality improved, it became possible to shrink the film size to the point where it was no longer practical to view the image directly. Furthermore, the introduction of roll film containing multiple images, which greatly aided the speed and convenience of photography, also prevented direct viewing of the image plane prior to picture taking. A separate viewfinder was required to aid the framing of a photograph. The most basic all-optical “viewfinder” is simply a rectangular mechanical frame through which the photographer can look in order to estimate how an image will be framed in the photograph. Estimate is an appropriate word as, for example, the position of the eye relative to the frame can change the perceived composition. A more sophisticated OVF has a lens to match the viewfinder’s field of view with that of the picture-taking lens and an eyecup as a means of correctly positioning the eye (both laterally and at an appropriate distance from the proximal lens element). In some digital cameras of limited zoom range (less than 5×), the OVF may have a zoom capability that is coupled to the main zoom lens of the camera.


The broadest definition of an electronic viewfinder (EVF) is a viewfinder on which an image, captured electronically by an image sensor, is displayed for viewing purposes. By this definition, the electronic display could be a modestly sized LCD or OLED panel held at a distance and viewed directly or a miniature panel (microdisplay) viewed through a magnifying eyepiece placed close to the eye. It has become the convention that the term EVF is applied to the latter, while the former is often referred to simply as “the LCD.” Here, we adopt the term direct-view panel (DVP), in contrast to EVF, for the former in order to allow for the recent introduction of OLED screens alongside LCDs. Although the distinguishing line is continually being blurred by dual functionality, we will refer to digital still cameras (DSCs) and digital video cameras (DVCs) distinctly and according to the primary function of the unit. Price erosion for LCD panels means that, from around the year 2000, almost all DSCs and DVCs have a DVP whose average size has grown with time from around 1 in. to more than 3 in. These have traditionally been LCDs, with an increasing proportion of OLED screens from around 2010.

Optical Design and Trade-Offs in EVFs

The design of an electronic viewfinder is determined by a number of parameters, each of which has an influence on one or more of the other design parameters. From a user’s perspective, a comfortable viewing experience is highly desirable, so the magnification and the distance between the viewfinder lens and the user’s eye become extremely important. Alongside this, the magnified virtual image has to be of a quality which matches or exceeds the application requirements. The size of the optical lens used to view a display in viewfinder applications is determined mainly by the display size, the eye relief, and the eye box. Once the display size and magnification of the system have been determined, the other parameters can be calculated and the preliminary design finalized.

Magnification

The magnification of an electronic display, when used in a viewfinder application, is the ratio of the angle the virtual magnified image subtends at the eye to the angle the display itself subtends to the naked eye at the comfortable viewing distance. The distance at which an object can be viewed comfortably with the human eye is taken as 254 mm, and, therefore, the angle subtended by the display at this distance can be calculated using simple geometry. When the display is located at the first focal plane of the viewfinder optic, this leads to an equation for magnification (M) as follows:

M = 254 / f    (1)


Fig. 2 Illustration of eye relief (the labeled elements are the virtual image, the microdisplay, and the eye relief distance)

where f is the focal length of the lens in mm. It follows that choosing a certain magnification for the system determines the focal length of the lens required to achieve it.
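As a quick illustration of Eq. 1, the following minimal Python sketch converts between magnification and eyepiece focal length (the numerical values are purely illustrative, not taken from any particular product):

NEAR_POINT_MM = 254.0  # comfortable viewing distance of the human eye, per Eq. 1

def magnification(focal_length_mm: float) -> float:
    """Angular magnification when the display sits at the first focal plane."""
    return NEAR_POINT_MM / focal_length_mm

def focal_length_for(target_magnification: float) -> float:
    """Invert Eq. 1 to find the eyepiece focal length for a target magnification."""
    return NEAR_POINT_MM / target_magnification

if __name__ == "__main__":
    f = 25.4  # mm, an assumed eyepiece focal length for illustration
    print(f"f = {f} mm  ->  M = {magnification(f):.1f}x")        # 10.0x
    print(f"M = 12x     ->  f = {focal_length_for(12):.1f} mm")  # 21.2 mm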

Eye Relief and Eye Box

Eye relief is an important consideration as it defines how close the user’s eye can approach the anterior lens surface of the viewfinder. The eye relief, illustrated in Fig. 2, should be calculated from the surface of the nearest lens element to the cornea of the user’s eye. A smaller eye relief allows a larger field of view (FoV) but may raise comfort or safety concerns. Because eyeglass wearers are not able to hold an electronic viewfinder as close to the eye as non-eyeglass wearers, the eye relief should account for this – an acceptable value for eye relief in this case would be no less than 25 mm but preferably higher. The eye box, shown in Fig. 3, is defined as the region through which the pupil can move laterally relative to the viewfinder while retaining a non-occluded view through the viewfinder. The larger the eye box, the more freedom the user’s eye has to move around with respect to the viewfinder in both vertical and horizontal directions before the image becomes clipped at the edges – for a circular lens, this could be considered as rotational movement. Thus, consideration should be given to a large eye box when comfort and ease of viewing is a priority or when an EVF may be used in a moving or unstable environment. Increased eye relief and eye box may be desirable, but both are achieved at the expense of increasing the diameter (and, of course, the cost) of the viewfinder optics if the magnification and field of view are to remain constant.
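The final trade-off can be made concrete with a first-order geometric sketch. This is an approximation only (it ignores the eye’s pupil diameter and lens thickness, and all numerical values are assumed for illustration): a pupil at the edge of the eye box must still see rays from the edge of the field, so the lens diameter must grow with both eye box and eye relief at a fixed FoV.

import math

def min_lens_diameter(eye_box_mm: float, eye_relief_mm: float,
                      full_fov_deg: float) -> float:
    # First-order estimate: semi-diameter >= eye_box/2 + eye_relief * tan(FoV/2)
    half_fov = math.radians(full_fov_deg / 2)
    return eye_box_mm + 2 * eye_relief_mm * math.tan(half_fov)

# Doubling the eye relief at a constant 25-degree FoV grows the optic:
for relief in (15.0, 25.0, 30.0):
    d = min_lens_diameter(eye_box_mm=8.0, eye_relief_mm=relief, full_fov_deg=25.0)
    print(f"eye relief {relief:4.1f} mm -> lens diameter >= {d:.1f} mm")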


Fig. 3 Illustration of eye box

Field of View (FoV) and Resolution

When deciding on a suitable field of view for the viewfinder optics, it is worth considering the resolution of the final virtual image. The angle, θpix, subtended at the eye by each pixel of the magnified image is equal to the overall FoV divided by the number of pixels in a given direction:

θpix = FoV / number of pixels

As an example, for a qVGA display with a 25° FoV, the angular subtense of each pixel (in the horizontal) would be

θpix_hor(25°, 320) = 25°/320 pixels = 4.69 min of arc

and for the same display with a 10° FoV, the horizontal angular subtense of each pixel would be

θpix_hor(10°, 320) = 10°/320 pixels = 1.875 min of arc

If we assume that the visual acuity of the human eye is around 3 min of arc in a mature adult, then in the examples above the individual pixels would be resolved in the 25° system, whereas in the 10° system the individual pixels would not be resolved. Changing to a VGA display in the 25° system would solve the problem.
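The two worked examples above, and the VGA remedy, can be reproduced in a few lines of Python (the 3 min of arc acuity figure is the one assumed in the text, not a general standard):

def pixel_subtense_arcmin(fov_deg: float, pixels: int) -> float:
    """Angular subtense per pixel, in minutes of arc."""
    return fov_deg / pixels * 60.0

ACUITY_ARCMIN = 3.0  # acuity figure assumed in the text for a mature adult

for fov, px, label in [(25, 320, "qVGA, 25 deg"),
                       (10, 320, "qVGA, 10 deg"),
                       (25, 640, "VGA,  25 deg")]:
    theta = pixel_subtense_arcmin(fov, px)
    verdict = "pixels resolved" if theta > ACUITY_ARCMIN else "pixels not resolved"
    print(f"{label}: {theta:.3f} arcmin -> {verdict}")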

Lens Type, Material, and Surface Forms

The design of the optical system for an electronic viewfinder depends on the particular constraints of the application it is to be used in. Very simple systems, where a single lens is used as a simple magnifier, would result in a non-pupil forming system (Fischer et al. 2008) that would allow the user’s eye to move around


and still see the image. A more complicated system where a multielement lens is used, such as those used in military applications, would result in a pupil forming system (Fischer et al. 2008), where the user’s eye has to be placed exactly at the exit pupil of the system in order to see the entire image. With the increase in the use of plastic optics and the easier manufacturing of aspheric surfaces in both glass and plastic optics, there is now more scope to have high-quality systems with fewer elements, making the entire system smaller and more lightweight. Diffractive surfaces can be used in single-element systems to correct for color shift in the system, and they can also be used in more complex arrangements for purposes such as light guiding.

Comparison of OVF with Electronic Display

Given the option of OVF or electronic display, let us explore the relative benefits. In the era of the film camera, OVF was pretty much the only option. The primary advantages of the OVF are that it consumes no power and that it gives a view of the actual scene that does not impose any artificial limitations, distortions (with the exception of optical aberrations), or delays on the viewed scene. Such limitations or distortions include resolution, dynamic range, color, frame rate, and time lag. The primary disadvantages of a simple OVF are an inability to review pictures already taken and parallax arising from the fact that the OVF lens and the imaging lens are physically separate as shown in Fig. 4. The latter issue becomes more significant with reducing focal distance and very troublesome in macrophotography. Parallax is removed by the use of a single-lens reflex (SLR) configuration (http://en.wikipedia.org/wiki/Single-lens_reflex_camera) in which a mirror directs light, typically via a pentaprism arrangement, to the OVF for the majority of the time and is removed to allow light to strike the film during exposure. But the SLR configuration carries a substantial penalty in terms of size, weight, and cost. The evolution of cine-cameras and still film cameras to video cameras and digital still cameras, respectively, opened new possibilities. With an electronic image sensor, the captured image signal could be transmitted immediately to an electronic display – either a direct-view LCD panel or an electronic viewfinder. With electronic image storage, a stored image or video sequence could be replayed on the screen. The primary advantages of an electronic display are no parallax, the ability to accurately track any range of optical zoom, the facility to overlay alphanumeric and graphical information on the scene, and the opportunity to review previously captured pictures or movies. The primary disadvantages of an electronic display are battery drain and time lag between capture and display that can become noticeable for fast-moving objects or fast panning. The requirement to keep cost down has historically resulted in displays that are small in size and/or low in definition or other aspects of performance. Some performance limitations of electronic displays in typical consumer models have


Fig. 4 Schematic showing simplistic parallax in OVF on a medium format twin-lens reflex camera

included limited definition (number of pixels) or resolution, limited maximum brightness and contrast, and limited range of colors and color fidelity. Thus, most electronic displays are suitable for high-level tasks such as picture framing and composition but are not sufficient for critical tasks such as fine focusing, estimating depth of field, and determining precise exposure or color balance. Some electronic displays allow a portion of the screen to be magnified to assist with critical tasks.

Comparison of EVF with Direct-View Panel

Given the option of EVF and/or DVP, what are the relative benefits? The primary advantage of a DVP over an EVF is the opportunity for several people to view the image simultaneously for review purposes. It also allows the camera to be used away from the eye. An obvious limitation of a DVP is that the screen size is limited to less than the size of the camera. As cameras shrink in size, the screen can fill the back of the camera but no more. The main limitation of a DVP in practice is that it can become difficult or impossible to see the displayed image in bright ambient conditions such as a beach or a ski slope on a sunny day. The primary advantages of an EVF over a display panel are the availability of a relatively large image from a small unit and screen, the ability to see a clear image in bright ambient conditions (as the EVF is largely shielded from ambient light), and the lower power consumption of the EVF that can help lengthen the useful operating time of the battery. If privacy is important, an EVF offers inherent privacy to a single individual.


Overview of Imager Technologies EVF

Early EVFs were based on miniature CRTs with in-line magnifying optics that resulted in the form factor of a thin tube. This was compatible with the overall shoebox form factor of early video cameras as shown in Fig. 5. Later EVFs have been designed around imaging engines based on a wide variety of microdisplay technologies, including various LCD technologies (that allowed power, volume, and weight shrinkage), transmissive high-temperature polysilicon (HTPS) LCD (Morozumi et al. 1984; Lewis et al. 1990), transmissive transferred silicon (t-Si) (Salerno et al. 1992), liquid crystal on silicon (LCoS) (Ernstoff et al. 1973), reflective nematic (McKnight et al. 1989) and ferroelectric (Underwood et al. 1991) LCoS, and various emissive technologies (that allowed further power, volume, and weight shrinkage), including small-molecule organic light-emitting diode (SMOLED) (Tang and Van Slyke 1987), polymer OLED (P-OLED) (Burroughes et al. 1990), and PIN-OLED (Huang et al. 2002) but not yet (to the best of our knowledge) phosphorescent OLED (PHOLED) (Baldo et al. 1998). These technologies are all described elsewhere in this volume and microdisplays using them elsewhere in section 10, “▶ Mobile Displays, Microdisplays, Projection and Headworn.” Cost pressure in the design and manufacture of consumer products has limited consumer EVFs typically to qVGA or lower resolution and optics to relatively small field of view (FoV).

Applications of EVFs

Consumer and Prosumer Digital Video Cameras

In the early years of consumer video cameras, the cost of direct-view LCD panels was prohibitive. At best, for high-end units, a small LCD panel might be included. The EVF was the preferred means of image display. The longitudinal small shoebox form factor of video cameras allowed the EVF on-a-stalk approach. The introduction of the digital still camera coincided with the early availability of small LCD panels at reasonable cost. For some years before and after the turn of the century, flip-out LCD screen and EVF coexisted. By 2010, digitization, including the migration from analog tape to digital tape to solid state (SS) image storage and the associated miniaturization, along with continued improvements in the daylight legibility of DVPs, caused a trend away from EVF-and-DVP models to DVP-only models of SS-DVC.

Digital Still Cameras (DSCs)

The DSC market can be classified as follows from high-end to low-end: the longstanding digital single-lens reflex (D-SLR) with interchangeable lenses; the more


Table 1 Presence of OVF, EVF, and DVP in classes of camera

Type of camera            OVF   DVP   EVF
Historical/film
  Rangefinder              ✓
  Single-lens reflex       ✓
  Compact                  ✓
  Early compact digital    ✓
Current DSC
  Digital SLR              ✓     ✓
  EVIL                           ✓     ✓
  Bridge                         ✓     ✓
  Luxury                         ✓     (✓)
  Enthusiast                     ✓     (✓)
  Style                          ✓
  Mainstream                     ✓
  Budget                         ✓

Key: (✓) means may be available as optional accessory

recent EVIL (electronic viewfinder interchangeable lens); bridge (box-shaped, high-zoom, noninterchangeable lens), of which an early example is shown in Fig. 6; luxury/premium compact; thin-and-light/style compact; and mainstream and budget. The general prevalence of OVF, DVP, and EVF in each of these classes is outlined in Table 1. Post-2010, almost all DSCs have a DVP. Some models have a DVP and an EVF, while a very few have a DVP and an OVF. A relatively new trend in the luxury and enthusiast segments is an external EVF as an optional accessory that typically attaches to the hot shoe.

Digital Video Camcorders

The DVC market is split between tape and solid-state storage but trending to solid state with time. Vertically, it splits into consumer, prosumer, and professional. There are almost no models with OVF, almost every model has DVP, and higher-end models add EVF as shown in Table 2. For broadcast-quality applications where many factors such as color balance and exposure are critical, an enclosed EVF of high quality, such as that shown in Fig. 7, is essential (see also Figs. 5 and 6).

Future Directions

As anyone who has tried to frame a photo or a video on a beach or on a ski slope on a sunny day will attest, a DVP is often not adequate, and, in those situations, an EVF is very helpful. There are several reasons why EVFs are not more prevalent than they are. First, in a competitive market, manufacturers are always looking to reduce the cost of manufacture, and having two components that do essentially the same

Table 2 Presence of OVF, EVF, and DVP in classes of video camera

Fig. 5 Picture of early video camera showing EVF and flip-out DVP

Fig. 6 Bridge-style DSC from mid-2000s with LCD EVF and LCD DVP

Type of video camera                 OVF   DVP   EVF
Historical/analog
  Consumer                                        ✓
  Prosumer/professional                           ✓
Typical in 2011
  Digital videotape consumer                ✓     ✓
  DV tape prosumer/professional             ✓     ✓
  Solid-state consumer                      ✓
  Solid-state prosumer/professional         ✓     ✓


Fig. 7 Broadcast-quality video camera with high-end EVF module

job is not smart. Given the stark choice of EVF or DVP, DVP wins as it makes a much better selling feature than an EVF. Not many would-be buyers are thinking of those difficult situations at the point of purchase, and not many in-store salespeople are trained to explain the benefits of the EVF. Keeping cost down has limited the performance of EVFs in consumer units. Classical optics has limited the form factor of EVFs, so there may be scope for some of the more novel HMD optical designs to transfer across and be incorporated in next-generation EVFs.

Further Readings

Armitage D, Underwood I, Wu ST (2006) Introduction to microdisplays. Wiley, Chichester
Baldo A, O’Brien DF, You Y, Shoustikov A, Sibley S, Thompson ME, Forrest SR (1998) Highly efficient phosphorescent emission from organic electroluminescent devices. Nature 395:151–154
Burroughes JH, Bradley DDC, Brown AR, Marks RN, MacKay K, Friend RH, Burns PL, Holmes AB (1990) Light-emitting diodes based on conjugated polymers. Nature 347:539–541
Ernstoff MN, Leupp AM, Little MJ, Peterson HT (1973) Liquid crystal pictorial display. In: Electron devices meeting, 1973 international, Washington, DC, pp 548–551
Fischer RE, Tadic-Galeb B, Yoder PR (2008) Chapter 8. In: Optical system design, 2nd edn. SPIE Press, Bellingham. ISBN 978-0-07-147248-7
Huang J, Pfeiffer M, Werner A, Blochwitz J, Leo K, Liu S (2002) Low-voltage organic electroluminescent devices using pin structures. Appl Phys Lett 80:139
Lewis AG et al (1990) International solid state circuits conference technical digest, pp 222–223
McKnight DJ, Vass DG, Sillitto RM (1989) Development of a spatial light modulator: a randomly addressed liquid-crystal-over-nMOS array. Appl Opt 28(22):4757–4762
Morozumi S et al (1984) SID international symposium. Technical Digest, pp 316–319
Ray SF (2002) Chapters 54 to 57. In: Applied photographic optics: lenses and optical systems for photography, film, video, electronic and digital imaging, 3rd edn. Focal Press, Oxford. ISBN 0 240 51540 4


Salerno JP, Vu DP, Dingle BD, Batty MW, Ipri AC, Skwart RG, Jose DL, Tilton ML (1992) Single crystal silicon transmissive AMLCD. SID Symposium Digest, p 555
Tang CW, Van Slyke SA (1987) Organic electroluminescent diodes. Appl Phys Lett 51:913–915
Underwood I, Vass DG, Sillitto RM, Bradford G, Fancey NE, Al-Chalabi AO, Birch MJH, Crossland WA, Sparks AP, Latham SG (1991) High-performance spatial light modulator. Proc SPIE 1562:107. doi:10.1117/12.50776

Multifocus Displays

Brian T. Schowengerdt and Eric J. Seibel

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Human Visual System: Accommodation and Vergence Are Linked . . . . . . . . . . . 3
Conventional Stereo 3D Displays Disrupt Accommodation/Vergence Linkage . . . 3
Volumetric 3D Displays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Multifocus 3D Displays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Scanned Voxel Displays Using a Single Focus Modulator . . . . . . . . . . . . . . . . . 9
Objective Measurements of Display’s Focal Range and Viewers’ Accommodation . . . 11
Speed Limitations of the Deformable Mirror . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Scanned Voxel Displays Using Multiple Light Sources . . . . . . . . . . . . . . . . . . . 14
Fiber Array Multifocal Light Source for Wearable Displays . . . . . . . . . . . . . . . . 15
Characterization of Multifocal Light Beams . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3D Volumetric Display Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Abstract

The human visual system has neurological cross-linkages that cause the lenses of the eyes to focus at the same distance to which the lines of sight of the two eyes converge (i.e., to point and focus the eyes to the same distance) – a useful linkage when scanning the eyes between real three-dimensional objects. To view objects rendered at different virtual viewing distances in a conventional stereoscopic 3D display, the eyes must shift their convergence between different distances. However, because these displays present all pixels in a single, narrow focus plane (e.g., in an LCD panel), they require the lenses of the eyes to maintain their focus at a single distance. When viewing most objects B.T. Schowengerdt (*) • E.J. Seibel Department of Mechanical Engineering, Human Photonics Laboratory, and Human Interface Technology Laboratory, University of Washington, Seattle, WA, USA e-mail: [email protected]; [email protected] # Springer-Verlag Berlin Heidelberg 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_139-2


in the scene, the eyes must point and focus to different distances, forcing the visual system to fight the linkage and generating eye fatigue. At the University of Washington, we have created various multifocus 3D displays to enable the eyes to point and focus to the same distance and provide more realistic depth cues. First, we provide a brief description of the visual processes of accommodation (the focus of the eyes’ lenses) and vergence (the convergence of the lines of sight) and their hardwired linkage. Next, we explain how conventional stereoscopic displays force the visual system to try to decouple these linked processes and discuss the adverse effects that decoupling creates. We discuss various volumetric displays that avoid accommodation/vergence decoupling for small objects over a limited range of distances. Finally, we describe three different approaches we are using to create multifocus 3D displays that overcome the conflicts in the visual system for an unlimited range of object size over an unlimited range of distances.

List of Abbreviations

DMM    Deformable membrane mirror
HMD    Head-mounted display
LCD    Liquid-crystal display
UAVs   Unmanned autonomous vehicles

Introduction

In our visual systems, a set of interacting processes act in concert to actively scan the environment, enabling the brain to track the relative locations of surrounding objects within a three-dimensional space. If some of the cues to depth are simply not available, the visual system uses the remaining cues to make best estimates. For difficult spatial tasks, the removal of some depth cues (e.g., by covering one eye) dramatically lowers performance. For demanding spatial tasks involving video displays, such as minimally invasive surgery, it is helpful to provide as many accurate depth cues as possible with a 3D display. The simplest and most common displays used for the presentation of 3D data are stereoscopic displays (see chapter “▶ Introduction to Projected Stereoscopic Displays”). However, though they can create a compelling feeling of depth by including some cues that aren’t available in 2D displays, they also generate inaccurate cues that provide conflicting depth information. Unfortunately, though the visual system can cope with the mere absence of some depth cues, it reacts poorly to conflicting depth information. The imperfect mimicry of 3D viewing conditions in stereoscopic displays can create sensory conflicts within the visual system, leading to eye fatigue and discomfort. This discomfort is analogous to the motion sickness experienced inside the cabin of a rocking boat, which is also the result of conflicting sensory information. We present various approaches to building displays that better recreate reality and do not create conflicts within the visual system. First, we provide a brief


description of the visual processes of accommodation and vergence and their hardwired linkage. Next, we explain how conventional stereoscopic displays force the visual system to try to decouple these linked processes and discuss the adverse effects that decoupling creates. We discuss various volumetric displays that avoid accommodation/vergence decoupling for small objects over a limited range of distances. Finally, we describe two different approaches we are using to create near-to-eye multifocal 3D displays that overcome the conflicts in the visual system for an unlimited range of object size over an unlimited range of distances, the principle of which is shown in Fig. 1.

Human Visual System: Accommodation and Vergence Are Linked

The viewing of real objects at different distances in the real world requires the visual system to adjust multiple processes in a coordinated fashion. One of these processes, accommodation, controls the focus of the eye’s optics. Like a camera, the eye changes its focal power to bring an object at a given viewing distance into sharp focus on the retina (the imaging plane of the eye). Whereas a camera slides a lens forward or backward to shift focus from a distant to a nearby point, the eye stretches or relaxes an elastic lens positioned behind the iris and pupil, to change its convexity (Helmholtz and König 1909). A second process, vergence, controls the distance at which the lines of sight of the eyes converge – i.e., the distance at which a viewer is pointing his or her eyes. When a viewer looks from an object in the distance to a closer object, two things must happen simultaneously in order to see the new object clearly: the viewer must converge both eyes to point at the new object and change the accommodation of the eyes’ lenses to bring the object into focus (Fig. 2). These processes of accommodation and vergence need to consistently act in concert when viewing real objects, and, accordingly, a hardwired link connects their operation. A movement in one process automatically triggers a synchronous and matching movement in the other process (Fincham 1951).

Conventional Stereo 3D Displays Disrupt Accommodation/Vergence Linkage

Stereoscopic and autostereoscopic (chapter “▶ Autostereoscopic Displays”) displays create a partial 3D percept by providing one image to the left eye and a different image to the right eye, but both of these images are generated by flat 2D imaging elements such as liquid-crystal display (LCD) panels. The light from every pixel originates from a flat surface (or, in some cases, two coplanar flat surfaces), so the optical viewing distance to each pixel is exactly the same, namely, the distance to the screen. This optical viewing distance acts as a cue to the visual system that all of the objects are flat and located at the same distance, even though the stereoscopic information provides cues that some objects are behind or in front of the screen, and it places a demand on accommodation to


Fig. 1 Principle of our multifocus 3D displays. Light is triaxially scanned to form a 3D volumetric image. As the eye adjusts its focus from far (a) to near (b) by changing the shape of the eye’s crystalline lens (yellow arrows), the voxels representing the statue are brought into sharp focus on the retina, while the brick wall becomes slightly defocused

focus the eye to the distance of the screen. Only when the viewed object is positioned at the actual distance of the screen (i.e., the degenerate case of using the 3D display as a 2D display) can the eyes point and focus to matching distances and see a sharp image of the object. Objects that are positioned stereoscopically behind the screen require the eyes to point behind the screen while focusing at the distance of the screen – that is, viewers must attempt to decouple the linked


Fig. 2 The cross-linked operation of accommodation and vergence when viewing real 3D scenes. Left: as a viewer looks at the house in the distance, the lines of sight (black dotted lines) of his/her eyes are converged to point at the house, while the eyes’ lenses (ovals near the front of each eye) are accommodated to a matching distance to focus the light reflected from the house (blue solid lines) onto the retina to form a sharp image of the house. Because the tree is at a different viewing distance, the light reflected from the tree in the foreground (solid green lines) comes to focus behind the retina and is blurred in the retinal image. Right: as the viewer shifts gaze to the tree, the eyes simultaneously converge to the distance of the tree and increase the convexity of the lenses to accommodate to the matched distance. The increased optical power of the lens brings the light reflected from the tree into focus on the retina, while the light reflected from the house shifts out of focus

processes of vergence and accommodation – or else the entire display will be blurry (Fig. 3). This forced decoupling is thought to be the major source of eye fatigue in stereoscopic displays (Hoffman et al. 2008; Mon-Williams et al. 1993; Howarth 1996; Ellis et al. 1993), compromises image quality, and may lead to visual system pathologies with long-term exposure (especially in the developing visual systems of children) (Rushton and Riddell 1999).
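The size of this decoupling is easy to quantify. The sketch below computes the vergence demand and accommodative demand for a stereoscopically rendered object against the fixed focal demand of the screen; the 63-mm interpupillary distance and the example distances are assumed values for illustration, not figures from this chapter.

import math

IPD_MM = 63.0  # assumed interpupillary distance

def vergence_deg(distance_m: float) -> float:
    """Convergence angle between the two lines of sight for a target at distance_m."""
    return 2 * math.degrees(math.atan((IPD_MM / 1000) / 2 / distance_m))

def accommodation_D(distance_m: float) -> float:
    """Accommodative demand in diopters for a target at distance_m."""
    return 1.0 / distance_m

# A stereoscopic object rendered at 0.5 m on a screen 2 m away drives
# vergence to 0.5 m while accommodation must stay at the screen distance.
obj, screen = 0.5, 2.0
print(f"vergence demand     : {vergence_deg(obj):.2f} deg (as if at {obj} m)")
print(f"accommodation demand: {accommodation_D(screen):.2f} D  (screen at {screen} m)")
print(f"conflict            : {accommodation_D(obj) - accommodation_D(screen):.2f} D")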

Fig. 3 Accommodation/vergence mismatch when viewing stereoscopic displays. The eyes converge at point (a), the location at which the object is stereoscopically rendered – but the eyes must also focus their lenses (accommodate) to the surface of the display (b), where the pixels are located optically


Volumetric 3D Displays

Volumetric displays (see chapter “▶ Volumetric 3D Displays”) take the points of light representing objects and physically distribute them throughout a 3D volume, such as by sweeping a projection screen through space while it is illuminated with a sequence of image slices. These points of light are referred to as “voxels,” as a contraction of “volume pixels.” Because the physical and optical location of each voxel is at the same distance as its intended stereoscopic depth, volumetric displays create matching accommodation and vergence cues and thereby avoid the conflicts generated by stereoscopic displays.

Swept-Screen Volumetric Displays

The volume of light can be generated using a number of different technical approaches. Some volumetric displays sweep a projection screen throughout a volume. In one such display, a circular projection screen (about 25 cm in diameter) spins around its center axis, sweeping the surface of the screen through a spherical 3D volume (Favalora et al. 2002). During each refresh cycle, a high-speed video projector projects 198 different 2D slices of a virtual 3D object onto 198 orientations of the spinning screen. Each point on the 3D object is represented by a voxel within the 3D volume, and light coming from that voxel reaches the viewer’s eyes with the correct cues for both vergence and accommodation.
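The rendering load implied by this architecture is easy to estimate. A minimal sketch follows, using the 198 slices just described and the 768 × 768 slice resolution cited in the disadvantages discussion below; the 24 Hz volume refresh rate is an assumed figure for illustration only.

slices, w, h = 198, 768, 768   # slice count and per-slice resolution
refresh_hz = 24                # assumed volume refresh rate

voxels_per_volume = slices * w * h
print(f"voxels per volume : {voxels_per_volume / 1e6:.1f} million")    # ~116.8 M
print(f"voxel throughput  : {voxels_per_volume * refresh_hz / 1e9:.2f} billion/s")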


Stacked-Screen Volumetric Displays

As an alternative to moving a single screen through multiple positions in space, another approach is to create an array of projection screens that can be activated sequentially. One such display contains a stack of 20 liquid-crystal light-scattering shutters (Sullivan 2003). At any given instant in time, 19 of the 20 shutters are placed in a transparent state, while one active shutter acts as a light-scattering rear-projection screen. The active state “sweeps” through the shutter stack, in a fashion functionally similar to the sweeping screen of other volumetric displays, while a high-speed video projector projects a sequence of 2D slices throughout a 3D volume.

Disadvantages of Conventional Swept-Screen and Stacked-Screen Displays

Though swept-screen and stacked-screen volumetric displays create matching accommodation and vergence demands for the objects they display, they possess three principal disadvantages. The first drawback is that the objects they depict are of limited size – they must physically fit within the scanned 3D volume. For a swept-screen display, this means all displayed objects must fit within the area subtended by the moving screen (e.g., a 25-cm-diameter sphere), while all objects in a stacked-screen display must fit within the dimensions of the stack (e.g., a 40 × 30 × 10 cm volume). These displays cannot place two objects on opposite sides of a table – much less place objects on the distant horizon – and can only shift the focal level of objects through an accordingly small range of accommodation. To be able to represent larger scenes, the physical dimensions of the screens must be increased. For swept screens, the effects of a larger mass screen that deflects more air during scanning limit the size increases. A second disadvantage of such volumetric displays is that they do not correctly represent occlusion. Every voxel is visible to the viewer, even if that voxel represents a point on the opposite side of the object that should not be visible from that angle. Finally, additional difficulties stem from the computational demand placed by the large number of voxels and the difficulty in leveraging conventional video card technology to handle this load. For instance, each of the 198 slices of the aforementioned swept-screen display has a resolution of 768 × 768 pixels – i.e., the display must render over 116 million pixels per frame, making computation of moving video infeasible with current graphics processing units.

Volumetric Displays with Static or Dynamic Viewing Optics

The first disadvantage can be addressed by adding viewing optics to the display that can serve to magnify the effective size of the volume. For example, if an image source is placed at the effective focal length of a lens (or multi-lens system), the image will be optically located at an infinite viewing distance. If the image source is moved toward the back surface of the viewing optics, the image will approach the actual location of the lens. If a stack of image sources is distributed between the focal length and back surface of the lens, the resultant multiplanar image will be


stretched between the optics and the distant horizon. The number of image sources will determine the Z-axis resolution of the display. Rolland and colleagues estimate that 14 image sources will seamlessly fill a volume extending from 0.5-m viewing distance to optical infinity, given the depth resolution of the human accommodation system (Rolland et al. 2000). In another approach, a single display screen is optically scanned through multiple depth positions using a variable focus lens, a deformable mirror, or translating mirrors (Wann et al. 1995; Suyama et al. 2000; Love et al. 2009; Hua and Liu 2009).
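The even dioptric spacing behind the Rolland et al. estimate can be sketched as follows. This is a simplified reading of their result: 14 planes distributed uniformly in diopters between 2 D (0.5 m) and 0 D (optical infinity), giving a spacing of roughly 0.15 D, close to the approximately 1/7 D depth resolution of human accommodation.

n_planes = 14
near_D = 2.0  # diopters; 1 / (0.5 m viewing distance)

for i in range(n_planes):
    d = near_D * (1 - i / (n_planes - 1))        # dioptric distance of plane i
    dist = "infinity" if d == 0 else f"{1.0 / d:.2f} m"
    print(f"plane {i + 1:2d}: {d:5.3f} D -> {dist}")
print(f"inter-plane spacing: {near_D / (n_planes - 1):.3f} D")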

Fixed-Viewpoint Volumetric Displays

By adding the constraint of a fixed viewpoint, the latter disadvantages can be addressed. When the viewer’s position is known, a viewpoint-specific image with correct occlusion cues can be rendered to the volumetric display. One prototype uses an approach somewhat similar to that of a stacked-screen display, in that there are image slices placed at fixed distances within a volume (Akeley et al. 2004). However, rather than turning those slices off and on sequentially, the slices are always on and optically superimposed using beam splitters. A separate stack of slices is used to create the 3D volume displayed to each eye, allowing accurate occlusion and viewpoint-dependent lighting effects to be presented. There are, however, only three focal planes in the current prototype, and the light loss associated with optical combining using beam splitters makes a significant increase in the number of layers problematic. In order to interpolate between limited depth layers, the display can distribute the luminance for a given represented object across two adjacent depth layers, to create the illusion that the object resides in between the physical layers, in a manner similar to the depth-fused display technique (Suyama et al. 2004; Lee et al. 2009; Liu and Hua 2010). In order to recreate the full range of real-world depth perception, a 3D display must be able to place voxels at optical distances ranging from the near point of accommodation (a focus distance of around 7 cm in a young viewer) to infinitely distant. We have developed a number of multifocus scanned voxel displays that, like volumetric displays, overcome the accommodation/vergence conflict but, unlike other volumetric displays, can place objects anywhere from 6.25 cm from the viewer’s eye to infinitely far away – surpassing the range required to match the full range of accommodation.

Multifocus 3D Displays

We have developed novel near-to-eye volumetric displays that project images directly to the eyes, by scanning beams of intensity-modulated light with varying focus distances in an XY pattern across the retina (Fig. 1). Pixels projected by a scanning beam that is collimated (i.e., focused at infinity) are optically positioned at the distant horizon. When viewing these pixels, the user relaxes accommodation, decreasing the curvature of the eye’s crystalline lens to focus at the distance (Fig. 1a), as the eyes would do if viewing a real object positioned far away. Pixels


Fig. 4 A scanned pixel display projects a beam of color- and luminance-modulated light into the eye, and the lens of the eye (to the right of the pupil) focuses the beam to a point on the retina, creating a pixel. As the beam is scanned biaxially (scanner shown as the white box at the left), the pixel moves across the retina, forming a 2D image. Only three pixels are shown, for simplicity of illustration

projected by a diverging beam are positioned at a closer viewing distance. When viewing these pixels, the user accommodates to a near point, increasing the curvature of the crystalline lens of the eye (Fig. 1b).

Scanned Voxel Displays Using a Single Focus Modulator

Scanned pixel displays, such as the virtual retinal display (Furness and Kollin 1995; Johnston and Willey 1995), biaxially scan a color- and luminance-modulated beam of light, serially moving a single pixel in 2D across the retina to form an image (Fig. 4). We have integrated a variable-focusing element into a scanned light display to enable a voxel to be triaxially scanned throughout a 3D volume (Fig. 5). Unlike the swept-screen or stacked-screen volumetric displays discussed above, the light is not projected onto a screen (moving or otherwise) but rather creates a 3D volume of light that is viewed directly by the eye. By positioning the 3D volume between the surface of a lens and its focal length, the 3D volume can be magnified to occupy a virtual space extending from the lens to the distant horizon. As when viewing real 3D objects, the eyes can focus upon different points within the 3D volume. We have designed and constructed a number of scanned voxel display prototypes using this approach (Schowengerdt and Seibel 2004, 2006; Schowengerdt et al. 2010). One such prototype presents full-color, stereoscopic, multiplanar video directly to each eye, using a scanning beam of light (Schowengerdt and Seibel 2004). Before the beam is raster scanned in the X- and Y-axes, it is first “scanned” in the Z-axis with a deformable membrane mirror (DMM; Fig. 6). The DMM contains a thin silicon nitride membrane, coated with a reflective layer of aluminum that is stretched in front of an electrode. The shape of the reflective membrane is controlled by applying bias and control voltages to the


Fig. 5 In the scanned voxel display, a modulated beam is triaxially scanned throughout a 3D volume that is viewed directly by the eye. For simplicity, only two image planes and five voxels are shown. In the top image, the viewer is accommodating to the distant horizon, with the far rear plane in the volume in focus on the retina (the foci are represented by two green circles). Graphics in that far plane (e.g., distant mountains and clouds) will be in focus for the viewer, while graphics in the other planes will be blurry proportionally to their distance from the viewer’s point of focus (represented by the three foci behind the retina – notice how their light is diffusely spread when it reaches the retina). In the bottom image, the viewer has shifted accommodation to a near point, increasing the optical power of the eye’s lens. Now, the front plane of the volume is in focus on the retina, bringing graphics in that plane (e.g., a branch from a nearby tree) into sharp focus for the viewer, while mountains and clouds in the far plane are shifted out of focus (the foci are in front of the retina, and the light is diffuse when it reaches the retina)

Fig. 6 The deformable membrane mirror (DMM) is used to dynamically change the focus of the beam before it is XY scanned. The beam is shown entering from the bottom of the figure and being reflected to the right. If no voltage is applied across the membrane and electrode (left side of figure), the membrane remains flat and doesn’t change the focus of a beam reflected from its surface. If a voltage is applied (right side of figure), the membrane electrostatically deflects toward the electrode, creating a concave parabolic mirror that shifts beam focus closer


membrane and electrode. With no applied voltage (left side of Fig. 6), the membrane forms a flat mirror and a collimated beam reflected from its surface remains collimated. With an applied voltage, the reflective membrane is electrostatically deflected toward the electrode, forming a concave parabolic surface that will focus a beam of light to a near point (right side of Fig. 6). Intermediate voltage levels shift the focal point anywhere between the near point and optical infinity (i.e., a collimated beam). After being scanned in the Z-axis with the DMM, the beam is scanned in the X-axis with a spinning polygon mirror (Lincoln Laser Company) and scanned in the Y-axis with a galvanometric mirror scanner (Cambridge Technologies), completing the triaxial scan. This 3D scanned voxel volume is optically divided with fold mirrors and relayed to left and right eyes. Figure 7 presents a graphical overview of the complete optical system. In this prototype, two planes are scanned frame sequentially into the eye. To provide the video content for the display, two images are presented in a “page-flipping” mode, in which even frames from the 60 Hz refresh rate are used to present one image, while the odd frames are used to present the second image. In synchrony with the page flipping of the images, the DMM shifts the focus of the scanning beam, such that the two images are projected to different depth planes, creating a two-plane voxel volume. The viewer perceives the superimposition of the two planes as one composite multilayer image. By naturally accommodating the eyes, the viewer can bring objects in the background (Fig. 7a) or foreground (Fig. 7b) into focus on his/her retina. By rendering an object to a plane in the volume that matches its stereoscopic viewing distance, the cues to accommodation and vergence are brought into correspondence. Figure 8 shows example photographs of multilayer images displayed on the prototype.
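The page-flipping timing can be sketched in a few lines of Python. Which plane occupies the even frames is an arbitrary choice made here for illustration; the text specifies only that even and odd frames carry the two images.

REFRESH_HZ = 60  # display refresh rate from the text

def plane_for_frame(frame: int) -> str:
    # Assumed assignment: far plane on even frames, near plane on odd frames.
    return "far plane (DMM flat, collimated beam)" if frame % 2 == 0 else \
           "near plane (DMM deflected, diverging beam)"

for frame in range(4):
    t_ms = 1000 * frame / REFRESH_HZ
    print(f"t = {t_ms:5.1f} ms  frame {frame}: {plane_for_frame(frame)}")
print(f"per-plane update rate: {REFRESH_HZ // 2} Hz")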

Objective Measurements of Display’s Focal Range and Viewers’ Accommodation

In order to assess the full focal range of the prototype, we measured the diameter of the scanning beam at multiple locations with a beam profiler and used these measurements to calculate the degree of divergence of the beam across a range of DMM control voltages. The beam divergence data were, in turn, used to calculate the viewing distance of the virtual image and the amount of accommodation needed to bring the image into focus (Fig. 9). Virtual images displayed with the prototype can be shifted from 6.25 cm from the eye (closer than the near point of human accommodation) to optical infinity. Figure 10 shows objective measurements of the diopter power of accommodation (1/focal length) of human subjects to the display, taken with an infrared autorefractor (for more details, see McQuaide et al. 2003 and Schowengerdt et al. 2003). Subjects accurately shifted accommodation to match the image plane as it was optically shifted forward and backward with the DMM.
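The distance-to-diopter conversion used for Fig. 9, where accommodation is the negative inverse of the image distance in meters, can be written directly; the sample distances below are chosen to match the range just described:

def accommodation_D(distance_cm: float) -> float:
    """Accommodative demand, sign convention of Fig. 9: -1 / distance (m)."""
    return -1.0 / (distance_cm / 100.0)

for d_cm in (6.25, 12.5, 25.0, 50.0, float("inf")):
    a = 0.0 if d_cm == float("inf") else accommodation_D(d_cm)
    label = "infinity" if d_cm == float("inf") else f"{d_cm:g} cm"
    print(f"image at {label:>8}: accommodation = {a:6.1f} D")
# 6.25 cm maps to -16 D, the near extreme of the display's focal range.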


Fig. 7 The viewer brings different depth planes into focus by naturally shifting the accommodation of the eyes’ lenses. By changing the voltage to the DMM rapidly, a frame-sequential multiplanar image is generated. (a) The viewer accommodates his/her eye to the distance, and thus the house in the background plane is in focus, while the tree in the foreground plane is somewhat blurred. (b) The viewer accommodates near, bringing the tree into focus on the retina, while the house is shifted out of focus

An interesting finding from our prior research is that the human accommodation response to a scanned light display is dependent upon the diameter of the scanning beam (Schowengerdt and Seibel 2004). When the scanning beam is greater than 2 mm in diameter, subjects accommodate accurately and consistently. However, if the diameter of the beam is reduced to 0.7 mm, the display creates the virtual equivalent of a pinhole lens (Ripps et al. 1962; Hennessy et al. 1976; Ward and Charman 1985, 1987) – the depth of focus of the display increases and


Fig. 8 Photographs taken of multilayer images displayed on the prototype scanned voxel display. Left: the camera is focused on the far voxel plane, which portrays a brick wall with green text. In the top photo, a voxel plane containing an image of a spider web is in front of the camera’s plane of focus (analogous to a human viewer’s point of accommodation). In the bottom photo, the voxel plane with the spider web is optically shifted with the DMM to align with the rear voxel plane. Middle: the display can also be used in a see-through augmented reality mode, in which the voxel image is presented to the eye with a beam splitter, enabling virtual objects to be optically placed within the real world. The camera is focused near the front voxel plane, which portrays a spider web. The rear voxel plane, containing a stone wall and yellow airplanes, is behind the camera’s plane of focus. Right: in the top photo, both voxel planes are aligned on the Z-axis, and the camera is focused at this point, yielding a uniformly focused image. In the middle and bottom photos, the voxel planes are separated, and the camera’s focus is shifted between the front and rear voxel plane

accommodation begins to operate in an open feedback loop and becomes more variable, both within and between subjects.

Speed Limitations of the Deformable Mirror

The first prototype described above frame-sequentially projects two planes in a voxel volume, providing a limited degree of resolution in the Z-axis. One way to improve this resolution is to increase the number of frame-sequentially presented planes, mimicking the arrangement of volumetric displays – such as the swept-screen displays discussed in the introduction. However, unlike such volumetric displays, our multifocus scanned voxel displays are not limited to varying the Z-axis of voxels on a frame-by-frame basis. Indeed, it is not very computationally efficient to create a full 3D voxel array since, for any given scene, the majority of voxels are not actively used to represent objects. A more elegant solution is to create a two-and-a-half-dimensional (2.5D) sculpted surface of voxels, in which there is one voxel per XY coordinate, and the Z-axis position of that voxel can be dynamically adjusted with a single focus modulator. This solution is more computationally efficient and better able to leverage conventional video card architecture, as the display can be

Fig. 9 Optical distance to (right axis) and accommodation required to focus (left axis) a plane in the scanned voxel display plotted as a function of the voltage used to drive the DMM (0–224 V, spanning image distances from 6.25 cm to infinity and accommodative demands from 0 to −16 D). The diopter power of ocular accommodation required to bring the image into focus is equal to the negative inverse of the distance to the virtual image as measured in meters

driven with a 2D source image paired with a depth map of Z-axis values. For each refresh cycle of the display, the beam is moved in a 2D XY raster, using the color and luminance data from the 2D source image to control the intensities of the RGB light sources and the depth map to dynamically control the position of a single focus modulator on a “pixel-sequential” basis. Unfortunately, current DMMs are only capable of kHz focus-modulation rates, rather than the MHz rates necessary to vary the focus of the beam on a pixel-sequential basis. Solid-state electro-optical materials promise a faster alternative to deformable membrane mirrors. Electro-optical polymers are being developed at the University of Washington (Dalton 2004), which will enable spatial light modulators that can operate at GHz rates – exceeding the speed requirements to perform pixelsequential focus adjustment with a single modulator. An alternative solution is to generate multiple focus channels in parallel, an approach that is discussed in the next section.
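To see why pixel-sequential operation demands MHz-class focus modulation, consider the pixel clock of even a modest raster; the VGA format and 60 Hz rate below are assumed for illustration and are not the prototype's actual scan format:

width, height, refresh_hz = 640, 480, 60   # assumed raster format
pixel_rate_hz = width * height * refresh_hz

print(f"pixel rate: {pixel_rate_hz / 1e6:.1f} MHz")  # ~18.4 MHz
print("kHz-rate DMMs : sufficient for frame-sequential depth planes")
print("MHz-GHz rates : required to re-focus the beam on every pixel")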

Scanned Voxel Displays Using Multiple Light Sources

In the approach described in the previous section, a single light beam is focus modulated and scanned into the eye. As an alternative to modulating the focus of a single beam over time, we can instead use multiple light beams, each placed at a different focus level and superimposed to form a composite multifocal beam.

Fig. 10 The observed accommodation responses to two prototype scanned voxel displays plotted as a function of the objectively measured focus levels of the displays. Black circles represent accommodation responses of 10 subjects (averaged over time and subject) while viewing a display with a 3.5-mm diameter exit pupil. Blue diamonds represent the average (over time) accommodation response of a single subject viewing a display with an exit pupil between 2.9 and 1.6 mm. Least squares linear regression lines have been fitted to each data set (y = 0.9816x − 0.4487, R² = 0.9749; y = 0.8111x − 0.27, R² = 0.9988)

The composite multifocal RGB beam is then XY scanned into the viewer’s eyes with each component beam creating a different plane in a voxel volume, creating a layered multifocal virtual image that appears to float in space. As with the deformable mirror approach, a collimated beam creates pixels at a viewing distance far from the user, while diverging light beams create pixels closer to the viewer. Initial designs used a free-space optics system, in which beams were superimposed using beam combiners, requiring precision alignment on an optical bench (Schowengerdt and Seibel 2006). Unlike the prior prototype in which multiple planes are produced frame sequentially, this system generates the multiple planes simultaneously.

Fiber Array Multifocal Light Source for Wearable Displays

We have developed a novel beveled fiber array to provide a compact, lightweight, and robust way to generate a multifocal bundle of light beams, enabling volumetric imagery in wearable display and HMD form factors. The array superimposes the multiple beams without using beam combiners, providing better light efficiency.


Fig. 11 Optical fibers with end faces positioned within the focal length of a lens produce a multifocal bundle of beams


Fig. 12 Lateral offsets in fiber end faces produce lateral shifts in sub-image position, after beams have been raster scanned

Fiber Array with Flat End Faces

The beveled fiber array multifocal beam generator consists of multiple single-mode optical fibers, each illuminated at one end with its own laser diode and with the other end of each fiber left uncoupled with an exposed end face. The cores in the fiber end faces act as point sources of light for the display, and because each core is a different distance from a collimating lens, each beam of light is focused to a different distance. Initial implementations used flat-polished end faces and staggered the fiber tips in the Z-axis to place them at different distances from the lens (Fig. 11). While the resultant beams are roughly superimposed because the fiber cores are only 127 μm apart, the angles at which their light approaches the lens can lead to significant lateral offsets between sub-images in the final projected view (Fig. 12) – so it is preferable to decrease the effective fiber spacing and beam angle to maximize the overlap of the projected images.

Fiber Array with Beveled End Faces

In order to decrease the effective distance between fiber end faces, we fabricated a beveled fiber array that, in a manner not unlike a prism, folds the emerging beams laterally, such that their primary axes are brought closer together (Fig. 13). With an extreme polish angle of about 42° from the normal, the beams are deflected to emerge parallel to the end face of the array, creating total beam superimposition. However, at such an extreme angle, much of the light is trapped by total internal


Fig. 13 Beveled fiber array deflects light to increase superimposition of multifocal beam. The end face of fiber A is positioned at the focal length of lens C, so the resultant beam (red) is collimated. Fiber B is closer to the surface of the lens, so the beam (blue) is diverging

Fig. 14 Array of fibers cut approximately 40° from the normal

reflection and scattered, so it is preferable to use a moderate polish angle (38–40°) that increases superimposition while minimizing light loss. The fiber array was fabricated by inserting 16 bare single-mode optical fibers (125-μm cladding diameter) into a silicon v-groove chip (5-mm total package width; OZ Optics), with a spacing of 127 μm between each groove, for a linear array of fibers 1.905 mm in total width. A Pyrex lid held the fibers in position while they were cemented with UV-cure epoxy. A diamond dicing blade (Ultratec) was used to cut the fiber array at an angle of approximately 40° from the normal (Fig. 14). Multiple shallow passes minimized the chance of fracture of the composite Pyrex, silicon, and glass fiber array. The bare fibers emerging from the rear of the array were sheathed in protective jacketing, and the loose ends were cemented in standard FC connectors and polished. To remove the microscopic scratches formed during the angle cutting, the fiber array was mounted on a custom-fabricated polishing jig, and the cut face of the fiber array was hand polished with a sequence of 5–1 μm polishing films (Fig. 15). Figure 16 shows the array after polishing.


Fig. 15 Front view of the fiber end face before polishing (left), midway during polishing (middle), and at the end of polishing (right)

Fig. 16 Front view of the polished fiber array, with a single optical fiber core illuminated

Characterization of Multifocal Light Beams
Light from pigtailed laser diodes (Thorlabs 635) was coupled into the fiber array for characterization of the generated light beams (Fig. 17). Figure 18a shows the approximately 20° deflection of the emerging light beam from the normal path it would follow with a flat end face polish. The array was placed at an angle in front of


Fig. 17 Multiple fibers illuminated. Each fiber face is a different distance from a collimating lens and therefore generates a beam at a different focus level

a collimating lens such that the farthest fiber core was approximately at the focal length of the lens (Fig. 18b). Figure 19 shows a roughly collimated beam produced by that fiber (left), a partially diverging beam produced by a center fiber (middle), and a highly divergent beam produced by the nearest fiber (right). We traversed a photosensing diode with a pinhole aperture vertically and horizontally through the center of each fiber's beam to measure the beam profile of the spots produced at 16 mm (Fig. 20, top), 28.5 mm (middle), and 41 mm (bottom) from the array (Schowengerdt and Seibel 2006). As would be expected, all of the beams are completely superimposed in the vertical axis (as the fibers lie in the same vertical plane). The peak-to-peak spacing profiles of adjacent spots are the same at all distances – i.e., all of the beams are deflected uniformly, such that the centers of the beams remain parallel. Beam superimposition is increased, with a spacing of approximately 87 μm – about two-thirds of the 127-μm spacing between fiber cores.
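The relation between a fiber core's distance from the collimating lens and the vergence of its output beam follows directly from the thin-lens equation; a minimal sketch (the 25-mm focal length and the offsets are hypothetical values for illustration only):

```python
def beam_vergence_diopters(f_mm, offset_mm):
    """Vergence (in diopters) of the beam leaving a thin lens of focal
    length f_mm when a fiber core sits offset_mm inside the focal plane.
    0 D means collimated; negative values mean a diverging beam."""
    f = f_mm / 1000.0                      # focal length in meters
    d = (f_mm - offset_mm) / 1000.0        # core-to-lens distance in meters
    return 1.0 / f - 1.0 / d               # thin-lens imaging equation

# Hypothetical 25-mm lens: cores pulled 0, 0.5, and 1.5 mm inside the
# focal plane give collimated, slightly diverging, and strongly
# diverging beams, i.e., far, middle, and near image planes.
for offset in (0.0, 0.5, 1.5):
    print(f"{offset} mm -> {beam_vergence_diopters(25.0, offset):+.2f} D")
```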

3D Volumetric Display Configuration
When integrated into our retinal scanned display system, the beveled fiber array forms a multifocal bundle of beams that is raster scanned to form a volumetric image projection. Figure 21 shows a simplified illustration of the display system, showing only four fibers and two of those fibers illuminated, so that individual beams can be better visualized. Each fiber's beam creates a different plane in a


Fig. 18 (a) The array is angle polished, causing light projected through each fiber to be deflected approximately 20° from the normal path (double green line). (b) Top view of the array placed at an angle in front of the collimating lens, such that different fibers are at different distances from the lens

Fig. 19 Light projected through the most distant fiber in the array, which is at the focal length of the lens, creates an approximately collimated beam (left). Closer fibers generate diverging beams (middle and right)

voxel volume, creating a layered multifocal virtual image that appears to float in space. By naturally accommodating the eyes, a viewer can bring objects in the background (Fig. 21, top) or foreground (Fig. 21, bottom) into focus. By rendering each object to a plane in the volume that matches its stereoscopic viewing distance, the cues to accommodation and vergence are brought into correspondence.

[Fig. 20 plots: beam intensity profiles along the vertical and horizontal axes, with panels labeled z = 0 mm, z = 12.5 mm, and z = 25 mm]

Fig. 20 Beam profiles produced by each fiber along vertical (left) and horizontal (right) axes measured at 16-mm (top), 28.5-mm (middle), and 41-mm (bottom) distance. The parallel beams produced a peak-to-peak spacing of 87 μm across all distances – approximately two-thirds of the 127-μm spacing between fiber cores

Discussion
As we have discussed, conventional stereoscopic displays create fatiguing cue conflicts in the visual system between accommodation and vergence, because viewers are forced to focus their eyes at one distance and point them at a different distance. Multi-viewpoint swept-screen and stacked-screen volumetric displays can


Fig. 21 Laser diodes (a) inject light into single-mode optical fibers (b), which are formed into the fiber array (c). Light from the fiber positioned at the focal length of lens (d) creates a collimated beam (red). When raster scanned by mirror (e) and relayed to the eye by lenses (f) and (h), the collimated beam creates an image that is optically positioned at the distant horizon. By accommodating the lens of the eye (i) to the far point, the distant object is brought into focus on the retina (j, top of figure). Light from a fiber positioned closer to the lens creates a diverging beam wave front, presenting an optically near sub-image to the eye. Shifting accommodation to a near point (i) brings this fiber’s sub-image into focus on the retina (k, bottom of figure). As with viewing real objects at different distances, layers that are not being focused upon by the eye are naturally optically blurred (k, top of figure; j, bottom of figure)

only overcome this conflict for small objects over a limited range of focus distances and cannot render occlusion cues correctly. We have presented two approaches to building 3D scanned voxel displays that better mimic natural vision, projecting objects of any size at viewing distances from 6.25 cm to optical infinity and overcoming the cue conflict throughout the full range of human accommodation. The first approach uses a dynamic focus-modulation device (a deformable membrane mirror in the current prototype) to scan a single beam in X-, Y-, and Z-axes. The second approach uses an array of optical fibers placed different distances from a lens to form a superimposed multifocal beam. The fiber array is very compact, enabling the system to be converted to a lightweight head-worn volumetric display, ideal for wearable computing, augmented reality, and gaming applications. This is in contrast with other volumetric 3D display technologies using scanning screens (Favalora et al. 2002) or multiple LCDs (Sullivan 2003; Rolland et al. 2000), which are not well suited for wearable display applications, due to size and weight limitations. One advantage to using multiple light sources to create different planes is that multiple focus distances can be presented along the same line of sight, enabling pixel-accurate depictions of transparency and reflections to be presented. For instance, a scene can be rendered in which a fish swimming under the surface of

Multifocus Displays

23

a lake and a reflection of a faraway mountain on the lake surface can be seen overlapped, with the fish and the mountain placed at different optical distances. Non-fatiguing 3D displays can be used for all 3D viewing applications for which conventional stereoscopic systems are typically used. There are, however, some applications for which they are critical. Surgeons are increasingly using minimally invasive methods (e.g., endoscopy and laparoscopy), which require looking at displays for many continuous hours. 3D displays enable surgeons to better guide endoscopes around obstructions within the narrow spaces of the body, but doctors must remain in top mental form throughout long surgeries, so it is crucial that these displays be non-fatiguing and comfortable. The guidance of minimally invasive surgery tools is a form of teleoperation, and other forms of teleoperation – such as the piloting of remote UAVs (unmanned aerial vehicles) – can also greatly benefit from 3D displays that can be comfortably viewed for extended durations. Finally, as 3D displays can be used for extended periods while playing video games, it is preferable to remove visual cue conflicts, especially for young viewers.

Acknowledgments
This research was supported by a grant from the National Science Foundation's MRI program, NSF grant # BES-0421579.

Further Reading
Akeley K, Watt SJ, Girshick AR, Banks MS (2004) A stereo display prototype with multiple focal distances. ACM Trans Graph 23:804–813
Dalton LR (2004) Organic electro-optic materials. Pure Appl Chem 76:1421–1433
Ellis SR, Hiruma N, Fukuda T (1993) Accommodation response to binocular stereoscopic TV images and their viewing conditions. SMPTE J 102:1137–1144
Favalora GE, Napoli J, Hall DM, Dorval RK, Giovinco MG, Richmond MJ, Chun WS (2002) 100 million-voxel volumetric display. Proc SPIE 4712:300–312
Fincham EF (1951) The accommodation reflex and its stimulus. Br J Ophthalmol 35:381–393
Furness TA, Kollin J (1995) Virtual retinal display. US Patent 5,467,104
Helmholtz HLFv, König AP (1909) Handbuch der physiologischen Optik. Voss, Leipzig
Hennessy RT, Iida T, Shina K, Leibowitz HW (1976) The effect of pupil size on accommodation. Vis Res 16:587–589
Hoffman DM, Girshick AR, Akeley K, Banks MS (2008) Vergence-accommodation conflicts hinder visual performance and cause visual fatigue. J Vis 8(3):1–30
Howarth PA (1996) Empirical studies of accommodation, convergence, and HMD use. In: Proceedings of the Hoso-Bunka foundation symposium: the human factors in 3-D imaging, Tokyo
Hua H, Liu S (2009) Time-multiplexed dual-focal plane head-mounted display with a liquid lens. Opt Lett 34:1642–1644
Johnston R, Willey S (1995) Development of a commercial virtual retinal display. In: Proceedings of SPIE, Helmet- and head-mounted displays and symbology design requirements, vol 2464, Orlando, pp 2–13
Lee C, Diverdi S, Höllerer T (2009) Depth-fused 3D imagery on an immaterial display. IEEE Trans Vis Comput Graph 15(1):20–33
Liu S, Hua H (2010) A systematic method for designing depth-fused multi-focal plane three-dimensional displays. Opt Express 18:11562–11573


Love GD, Hoffman DM, Hands PJW, Gao J, Kirby AK, Banks MS (2009) High-speed switchable lens enables the development of a volumetric stereoscopic display. Opt Express 17(18):15716–15725
McQuaide SC, Seibel EJ, Kelly JP, Schowengerdt BT, Furness TA (2003) A retinal scanning display system that produces multiple focal planes with a deformable membrane mirror. Displays 24:65–72
Mon-Williams M, Wann JP, Rushton S (1993) Binocular vision in a virtual world: visual deficits following the wearing of a head-mounted display. Ophthalmic Physiol Opt 13:387–391
Ripps H, Chin NB, Siegel IM, Breinen GM (1962) The effect of pupil size on accommodation, convergence, and the AC/A ratio. Invest Ophthalmol 1:127–135
Rolland JP, Krueger MW, Goon A (2000) Multifocal planes head-mounted displays. Appl Optics 39(19):3209–3215
Rushton SK, Riddell PM (1999) Developing visual systems and exposure to virtual reality and stereo displays: some concerns and speculations about the demands on accommodation and vergence. Appl Ergon 30:69–78
Schowengerdt BT, Seibel EJ (2004) True 3D displays that allow viewers to dynamically shift accommodation, bringing objects displayed at different viewing distances into and out of focus. Cyberpsychol Behav 7(6):610–620
Schowengerdt BT, Seibel EJ (2006) True 3D scanned voxel displays using single or multiple light sources. J Soc Inf Display 14(2):135–143
Schowengerdt BT, Seibel EJ, Silverman NL, Furness TA (2003) Binocular retinal scanning laser display with integrated focus cues for ocular accommodation. In: Woods AJ, Merritt JO, Benton SA, Bolas MT (eds) Stereoscopic displays and virtual reality systems X, Proceedings of SPIE-IS&T electronic imaging, SPIE, SPIE-IS&T, vol 5006. Bellingham, pp 1–9
Schowengerdt BT, Murari M, Seibel EJ (2010) Volumetric display using scanned fiber array. In: SID symposium digest of technical papers, vol 41. pp 653–655
Sullivan A (2003) A solid-state multi-planar volumetric display. SID Symp Dig 34:1531–1533
Suyama S, Date M, Takada H (2000) Three-dimensional display system with dual frequency liquid crystal varifocal lens. Jpn J Appl Phys 39(Part 1, No. 2A):480–484
Suyama S, Ohtsuka S, Takada H, Uehira K, Sakai S (2004) Apparent 3-D image perceived from luminance-modulated two 2-D images displayed at different depths. Vis Res 44(8):785–793
Wann JP, Rushton S, Mon-Williams M (1995) Natural problems for stereoscopic depth perception in virtual environments. Vis Res 35:2731–2736
Ward PA, Charman WN (1985) Effect of pupil size on steady state accommodation. Vis Res 25:1317–1326
Ward PA, Charman WN (1987) On the use of small artificial pupils to open-loop the accommodation system. Ophthalmic Physiol Opt 7:191–193

Occlusion Displays Kiyoshi Kiyokawa

Contents
Introduction ..................................................... 2
Ray Paths in an Optical See-through Display ...................... 3
Approach 1: Pattern Illumination-Based Approach ................ 4
Approach 2: Occluder-Based Approach ............................ 4
Approach 3: Light Modulator-Based Approach ..................... 5
Light Modulator-Based Approaches ................................. 5
Conclusion ....................................................... 7
Directions of Future Research .................................... 7
Further Reading .................................................. 8

Abstract

The occlusion capability of a see-through display is important in enhancing a user's perception, visibility, and realism of the synthetic scene presented. Unlike video see-through displays, occlusion of a real scene in an optical see-through fashion is quite difficult to achieve, as the real scene is always seen through the partially transmissive optical combiner. In this article, four portions of the ray paths of an optical see-through display are first identified between the light source and the eye. Corresponding to these, a number of existing approaches for an occlusion display are then introduced, each cutting off the light in a different manner. Finally, recent advancements and future directions of occlusion displays are discussed.

Keywords

Digital micromirror device (DMD) • Light modulator-based approach • Liquid crystal on silicon (LCOS) • Occluder-based approach • Optical see-through display • Pattern illumination-based approach • Video see-through displays

K. Kiyokawa (*) Cybermedia Center, Osaka University, Toyonaka, Osaka, Japan e-mail: [email protected] # Springer-Verlag Berlin Heidelberg 2016 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_140-2


List of Abbreviations

CAD   Computer-aided design
CG    Computer generated
DMD   Digital micromirror device
LCD   Liquid crystal display
LCOS  Liquid crystal on silicon

Introduction
Occlusion is well known to be a strong depth cue. In the real world, the order of objects in depth can be recognized by observing overlaps among them; in terms of cognitive psychology, incorrect occlusion confuses users. The occlusion capability of a see-through display is therefore important in enhancing a user's perception, visibility, and realism of the synthetic scene presented. Correct mutual occlusion between the real and the synthetic scenes is often essential in augmented reality applications, such as architectural previewing. Figure 1 shows some examples of mutual occlusion. Figure 1a is a real scene on which a synthetic image is to be superimposed. Figure 1b is the synthetic image with correct occlusion attributes, and Fig. 1c is the ideal result of the image overlay. Without occlusion attributes, the merged image looks unnatural, even though the synthetic image is perfectly registered on the real image (see Fig. 1d).

To present correct occlusion, depth information of both the real and the synthetic scenes is needed. Depth information of the synthetic imagery is normally available from the depth buffer. On the other hand, depth information of the real scene can be acquired in advance for a static scene or by real-time range sensors such as a high-speed stereo vision system. Once the depth information is acquired, occlusion is represented differently with optical and video see-through approaches. In both cases, a partially occluded virtual object can be represented simply by omitting the rendering of the occluded regions. Similarly, a partially occluded real object can be presented in a video see-through approach simply by rendering the occluding virtual object over the input video frame.

Fig. 1 An example of mutual occlusion. (a) Real scene, (b) synthetic image, (c) ideal result, (d) typical ghost image


A video see-through display electronically combines a synthetic image with a real image that is captured by a video camera mounted on the user's head. The combined image is then presented in front of the user's eyes. Video see-through displays have the following three advantages:

1. Pixel-based image processing of the real scene is available (e.g., intensity and tint corrections and blending ratio control).
2. Temporal registration error in each rendered frame can be eliminated by synchronously processing and presenting the real and virtual images.
3. Implicit and explicit visual information from the real scene is available for utilization (e.g., depth sensing by stereo vision and camera pose estimation by fiducial marker and/or natural feature tracking).

The first advantage allows video see-through displays to handle the occlusion problem without difficulty. If the system has the depth information of the real scene, rendering a combined image with correct occlusion is a simple matter of choosing, for each pixel, between the two image planes, synthetic and captured. Hence, this has so far been exclusively the type of display used in realizing mutual occlusion. However, a video see-through display reduces the rich information content of the real world because of the display's low spatial and temporal resolution, limited depth-of-field, fixed focus, and so forth. Another disadvantage is that a user may temporarily lose all vision under a system failure. Unlike video see-through displays, optical see-through displays preserve the real image as it is. However, mutual occlusion in an optical way is quite difficult to achieve, as the real scene is always seen through the partially transmissive optical combiner. Any optical combiner will reflect some percentage of the incoming light and transmit the rest, making it impossible to overlay opaque objects in an optical way. Besides, each pixel of the synthetic image is affected by the color of the real image at the corresponding point and never directly shows its intended color. In the following, a number of new optical see-through displays will be briefly explained that have recently been studied for tackling the problem of mutual occlusion.
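The per-pixel source selection just described for video see-through occlusion amounts to a depth comparison per pixel; a minimal sketch (array shapes and all values are hypothetical):

```python
import numpy as np

def composite(real_rgb, real_depth, synth_rgb, synth_depth):
    """Video see-through occlusion: for each pixel, show whichever
    source (captured or synthetic) is nearer to the camera."""
    synth_in_front = synth_depth < real_depth          # H x W boolean mask
    return np.where(synth_in_front[..., None], synth_rgb, real_rgb)

# Hypothetical 480x640 frames: RGB images plus per-pixel depth in meters.
real_rgb = np.zeros((480, 640, 3), dtype=np.uint8)
synth_rgb = np.full((480, 640, 3), 255, dtype=np.uint8)
real_depth = np.full((480, 640), 2.0)     # real scene 2 m away
synth_depth = np.full((480, 640), 1.0)    # virtual object 1 m away
frame = composite(real_rgb, real_depth, synth_rgb, synth_depth)
```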

Ray Paths in an Optical See-through Display
Let us first consider the ray paths of an optical see-through display so that we can better understand the problem and classify the existing solutions. To realize mutual occlusion with an optical see-through display, we must selectively cut off or pass the rays of the real scene by using a pixel-wise light-blocking mechanism. Figure 2 shows the ray paths of an optical see-through display. Firstly, rays are emitted from the light source (1); they are then reflected off the real objects (2) and pass through the air (3). Some of the rays are then blended with synthetic images (4) by an optical combiner to finally arrive at the viewer's pupils. Corresponding to these four portions of a path, the following three approaches are conceivable for covering the real scene with a synthetic image.


[Fig. 2 diagram: ray path portions labeled 1. Pre-object, 2. Object, 3. Post-object, and 4. CG, merged at the optical combiner]

Fig. 2 Ray paths in an optical see-through display

Approach 1: Pattern Illumination-Based Approach
This approach cuts rays off between the light source and the objects. Noda et al. developed a unique approach using pattern illumination (Noda et al. 1999). Firstly, a laser range finder acquires a depth map of the real objects in a darkroom. Then, those parts of the real objects that should appear are lit by a front projector. Finally, the pixels of the virtual objects that should be in front of the occluded real objects are rendered. Consequently, a correct mutual occlusion phenomenon is observed from the range finder's viewpoint. With this approach, it is possible to form mask patterns of any shape, in real time, corresponding to the geometry of the real objects. Bimber et al. have developed a similar system with higher visual quality (Bimber and Fröhlich 2002). Maimone et al. have developed a pattern illumination-based system combined with an optical see-through head-mounted display and a set of commodity RGB-D cameras (Maimone et al. 2013). However, with this approach, the lighting conditions must be strictly controlled in a darkroom, so it is not applicable to outdoor applications.

Approach 2: Occluder-Based Approach
This approach cuts rays off by preparing real counterparts of the virtual objects. Kameyama developed a simple CAD system that allows users to perceive mutual occlusion by appropriately controlling the intensity of a real environment (Kameyama 1999). In this system, the user manipulates a black input device on which a synthetic image is superimposed through a half-silvered mirror. The user's hands are so brightly lit that his or her fingers in front of the device are clearly visible through the superimposed image. As a result, the user's fingers are perceived as covering the synthetic image. Following pioneering studies of a head-mounted projective display (Kijima and Hirose 1995; Rolland et al. 2005), Inami et al. introduced handheld real objects covered with retroreflective material for two purposes: one as a screen onto which a synthetic scene is projected and the other as


an occluder to cover farther real objects (Inami et al. 2000). Stereo-paired synthetic images are cast on the real scene from projectors placed in positions optically equivalent to those of the user’s eyes. Most of the rays that hit the screen are then reflected and go back to each user’s eye, whereas most of the rays that hit other real objects are randomly diffused so that negligibly few rays go back to the user’s eyes. Therefore, the user sees the virtual objects only on the screen. Consequently, any other real objects in front of the screen occlude the virtual objects, and the screen itself occludes any real objects behind it. Although these approaches are useful and relatively simple (Kameyama 1999; Inami et al. 2000), they require a special occluder object corresponding to a virtual object present; hence, they are not applicable to arbitrary situations.

Approach 3: Light Modulator-Based Approach
This approach cuts rays off between the objects and the user's eyes using a spatial light modulator embedded in the display. This approach is suitable for a head-worn display, because no special environmental settings such as a darkroom or a counterpart object are necessary. In the last decade, a number of optical see-through displays have been developed that use a spatial light modulator. An early system in this regard was developed by Tatham (1999). He used a transparent LCD panel as the active mask for superimposing opaque virtual objects on the real scene as well as generating virtual cast shadows on a real object by using pixels on the active mask with neutral density. A similar idea has been employed in a more recent spatial display (Mulder 2005). In the next section, a few advanced systems in this regard will be introduced.

Light Modulator-Based Approaches As Tatham discussed in Tatham (1999), both transmissive and reflective approaches are conceivable. A transmissive approach is based on a transparent addressable optical filter, such as a LCD panel. However, simply inserting an active mask introduces an out-of-focus problem. The formed mask pattern on the addressable optical filter will itself not be in focus when the user is focusing on outside scenery. As a result, the edges between synthesized and real images look unnatural. To tackle this problem, a simple relay optics design named ELMO was proposed by Kiyokawa et al. that positions the transparent LCD panel at an intermediate focus point (Kiyokawa et al. 2000). Taking two convex lenses, each with the same focal length, and placing one in front of and one behind the LCD panel, a sighting telescope with a magnification of one can be structured. Then erecting the inverted real scene by an erecting prism, the mask pattern is always in focus regardless of user’s focus. Figure 3 shows the basic design of this optics. By opening and shutting the pixels on the LCD panel at positions where real objects should appear and disappear, respectively, it is possible to optically present a true mutual occlusion of real and

[Fig. 3 diagram: CG image injected via a half-silvered mirror; objective lens, LCD panel at the intermediate focus (a focal length f from each lens), erecting prism, and eyepiece]

Fig. 3 Schematic design of the ELMO optics

Fig. 4 Images seen through ELMO-4. Transparent objects (left), opaque objects (middle), and bare hand interaction (right)

Sony had already proposed the same idea in 1992 (Kawamura 1992), although there is no report of their having carried out a feasibility study. A disadvantage of the early ELMO design was that a certain amount of offset to the user's viewpoint was introduced, causing inconsistencies between proprioception and visual perception. The most advanced ELMO display to date (ELMO-4) features parallax-free optics with a built-in real-time range finder (Kiyokawa et al. 2003) and is actually head-mountable. Figure 4 shows a few snapshots taken through the ELMO-4 display. In 2013, Maimone and Fuchs proposed a near-eye, light field-based, occlusion-capable head-mounted display (Maimone and Fuchs 2013). The display is composed of a shutter layer, a backlight, and two or more thin transmissive spatial light modulation layers, packaged in a compact eyeglasses-like form factor. The display is positioned closer than the eye accommodation distance, so the user cannot directly focus on the image formed on the display. Instead, the stacked light modulators are used to create a focused image through a series of optimized time-multiplexed patterns. This display is promising as it can support selective occlusion, multiple


simultaneous focal depths, and a wide field of view, though the image quality needs to be significantly improved to compete with other types of head-mounted displays.

A reflective approach is based on a reflective addressable optical filter, such as a DMD or an LCOS device. This approach is potentially superior to a transmissive approach in terms of visual quality, as the visual problems introduced by a transparent active mask, such as attenuation, distortion, and color shift, are severe. Uchida et al. developed a DMD-based system (Uchida et al. 2002) and achieved high color reproducibility. This system, however, was a benchtop prototype, and no compact design following this study has been reported. On the other hand, Cakmakci et al. proposed an LCOS-based compact optical design utilizing an X-prism (Cakmakci et al. 2004). The X-prism eliminates the necessity for an erecting prism, which has been a major reason for the large display size of existing displays, including the ELMO series. However, a parallax along the optical axis and the telecentricity requirement imposed by the LCOS base make it difficult to use in personal space.

[Table 1 (flattened in extraction): excerpt of an AMLCD specification with columns Min./Typ./Max./Unit/Remark, covering the optimum viewing angle (θ > 10°, φ = 0°: min. 60°, typ. 70°, Note 1), further viewing-angle values (deg, Note 4), response times (ms, Notes 2–4), chromaticity coordinates (typical x, y values ±0.05, Note 3), and luminance (min. 350, typ. 450 cd/m², Note 5); table note 4: "Here it is often referred to the backlight for LCDs"]

A typical example of an excerpt (focusing on optical characteristics) of a specification of an active-matrix LCD (AMLCD) is provided in Table 1; other display technologies have similar optical parameters in their specifications. Some remarks regarding display specifications have to be made here:

• Notes (see Table 1, Remark) refer to various measurement procedures that often use dedicated setups and measurement devices; often there are no common standards, procedures, or methods used.
• Many specifications contain "TBD" (to be defined), brackets, or typical values. Only MIN and MAX values are relevant.
• All values usually refer to an operating temperature of 25 °C but are likely to degrade at other temperatures.
• Measurements are performed after warm-up (see Chapter ▶ Temporal Effects).
• Optical measurements are mostly performed under darkroom conditions (< 1 lx).
• Measurement (integration) time > frame time (see Chapter ▶ Measurement Devices).
• Use appropriate test patterns (see Sect. 2 of Chapter ▶ Standards and Test Patterns).
• Do not change settings, adjustments, or control knobs during measurement.
• For analog input displays, take care to match the output voltage range of the graphics adapter and the input voltage sensitivity range of the display (see Fig. 2).
• Avoid electromagnetic interference (EMI) on both the display (system) and the measurement equipment.

Before starting with measurements, the display (or device) under test (DUT) and the measurement setup must be prepared in an adequate manner, especially for display (systems) with analog input. It is obvious that improper setups, the measurement devices, and the parameters listed below are potential sources of error. The most relevant topics to be taken into account for the display under test are setup and display adjustment.

Standard display setup conditions (if not otherwise specified):

• Mount the display as closely as possible to the intended application, e.g., a vertical rather than horizontal arrangement if the display is to be vertically implemented.
• Check the electrical power supply.
• Check the data line(s).
• Environment: 25 °C; 90 kPa; 75–85 % relative humidity (RH), not condensing.
• Warm-up time for display and instruments > 20 min.
• Darkroom condition: < 1 lx.
• Perpendicular viewing direction.
• Center screen measurements.

Adjustment of display controls (for CRTs, analog input FPDs, graphics adapters with SW controls for gamma, OSD settings, etc.):

• Adjust display control by frequency and phase using the 1 × 1 grille test (Fig. 2).
• Set contrast to maximum (100 %).
• Set brightness to minimum (0 %).


Table 3 Dependency of display measurements on display characteristics

Parameter | Description
Uniformity | Distribution over the display area
Switch on | Time to stable operation
Lifetime | Change during operation and/or storage
Jitter/swim/drift | Short time image distortions
Response time | Low enough for video? LCD: temperature dependent
Displayed image | Dependency on the displayed image, ghosting
Burn-in | Effects when displaying static images
Differential aging | Color shift due to different lifetime of color materials

Table 4 Dependency of display measurements on environmental conditions

Parameter | Description
Temperature | Change of performance
Viewing angle | Dependency of image quality on observer angle
Illuminance | Degradation of display performance due to reflections

• Use the gray scale test pattern (Fig. 2).
• Increase brightness until the 0 % and 5 % boxes become clearly distinct but not too different.
• Reduce contrast until the 95 % and 100 % boxes become clearly distinct but not too different.
• Repeat the above two steps until all boxes show the same difference to their neighbors.
• Make full screen tests for visual uniformity and (for color displays) color saturation, e.g., white, black, R, G, B, C, M, and Y.
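The grille and gray-scale patterns used in these adjustment steps (and described below) are straightforward to synthesize; a minimal sketch using NumPy and Pillow, with arbitrary pattern sizes and file names:

```python
import numpy as np
from PIL import Image

def grille(width=256, height=256, pitch=1, vertical=True):
    """n x n grille: alternating runs of `pitch` black and white pixels."""
    coords = np.arange(width if vertical else height)
    line = ((coords // pitch) % 2) * 255            # 0/255 alternating runs
    pattern = np.tile(line, (height, 1)) if vertical \
        else np.tile(line[:, None], (1, width))
    return Image.fromarray(pattern.astype(np.uint8), mode="L")

def gray_scale(steps=21, box=32):
    """Row of gray boxes from 0 % (black) to 100 % (white) in equal steps."""
    values = np.linspace(0, 255, steps).round().astype(np.uint8)
    row = np.repeat(values, box)                    # one box per gray level
    return Image.fromarray(np.tile(row, (box, 1)), mode="L")

grille(pitch=1).save("grille_1x1.png")              # 1 x 1 grille test
gray_scale().save("gray_scale_0_to_100.png")        # 21 boxes in 5 % steps
```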

The very basic visual test patterns for adjustment are visualized in Fig. 2 (for details see Chapter ▶ Standards and Test Patterns):

• Grille test pattern: Both vertical and horizontal variants are used. For analog path evaluation, use vertical b/w bars. The name gives the number of neighboring black and white pixels: in a 1 × 1 pattern, 1 pixel is black and the next one (e.g., horizontally) white; a 2 × 2 pattern consists, horizontally, of 2 neighboring black pixels followed by 2 neighboring white ones. These definitions refer to lines either horizontal or vertical.
• Gray scale pattern: It is used for testing gray scale capability; the example on the right shows the values for 8-bit gray scale values. Adjust the display so that neighboring boxes, e.g., 0 % (full black) and 5 %, or 95 % and 100 % (full white), can be distinguished.

After the prerequisites and the display setup have been made, a large variety of display measurements can be carried out. The focus of this paragraph lies on optical parameters; however, some other procedures like electronic effects are also briefly discussed. Tables 3, 4, and 5 summarize the dependencies on display performance, environmental conditions, and display signals. Again, the simplest parameter for easy understanding is the effect of luminance. The intended application and the specific display technology determine the measurements that have to be performed for an appropriate evaluation. Be aware that many of the parameters are not mentioned in the manufacturer's specifications. A prominent example is ambient light performance: As the surroundings of the display and the light conditions differ largely from application to application (even in the same location, light conditions can change during the day), practically all optical parameters in display


Table 5 Dependency of display measurements on signal source and signal processing software (Wu and Rao 2006) Parameter Output and input Scaling Frame-rate conversion De-interlacing

Description Match Adjustment of resolution of source and display Adjustment of frame frequency of source and display Transforming half frames to progressive displays

specifications are measured under darkroom conditions (illuminance < 1 lx). In practice, however, the reflected luminance dominates the black-state luminance (LReflected ≫ LBlack (0 lx)) even for a moderate illuminance, while remaining small compared to the white state. Both approximations simplify the Contrast Ratio formula (11) toward a hands-on approach:

$$C_R^{AL} \approx \frac{L_{White}(0\ \mathrm{lx})}{L_{Reflected}} \qquad (12)$$

Furthermore, the parameters of this simplified formula are very easy to obtain, as there is no need to even drive the display: the white darkroom luminance we get from the specification (the perpendicular value is acceptable because most users prefer this angle), and the reflected luminance is easy to measure (e.g., in diffuse geometry) in the OFF state. This reduces the effort for a first evaluation of a display significantly. For the next step, we limit our calculations to diffuse reflections. The reflected luminance for diffuse geometry was defined in Eq. 3 and is here adapted to diffuse (index D) reflections:

$$L_{Reflected}^{Diffuse} = \frac{r_D\,E}{\pi} \qquad (13)$$

with rD as the diffuse reflection coefficient of the display system (the index "system" is omitted here) and E as the illuminance. Putting this into Eq. 12, we obtain a rather simple formula for the Contrast Ratio under diffuse conditions:

$$C_R^{Diffuse} \approx \frac{\pi\,L_{White}(0\ \mathrm{lx})}{r_D\,E} \qquad (14)$$

Even with the approximations used, the basic dependencies of the (diffuse) ambient light Contrast Ratio are clearly apparent:

• The higher the white luminance of the display, the higher the performance (Contrast Ratio) will be in ambient light, based on a linear relationship.
• Investing some money in antireflection (AR) and/or antiglare (AG) treatments is also recommended, as they improve the Contrast Ratio by reducing rD.
• The diffuse Contrast Ratio is lowered by the illuminance E in a hyperbolic dependency (1/E).
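These dependencies are easy to explore numerically; a minimal sketch of Eq. 14, where the luminance, reflectivity, and illuminance values are arbitrary examples:

```python
import math

def cr_diffuse(l_white_dark, r_d, illuminance_lx):
    """Eq. 14: diffuse ambient-light Contrast Ratio from the darkroom
    white luminance (cd/m^2), the diffuse reflection coefficient, and
    the illuminance (lx)."""
    return math.pi * l_white_dark / (r_d * illuminance_lx)

# 500 cd/m^2 display with 2 % diffuse reflectivity at various light levels
for e in (100, 500, 1_000, 10_000):
    print(f"{e:>6} lx -> CR ~ {cr_diffuse(500, 0.02, e):.0f}:1")
```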

Numerical Example for Contrast Ratio with Ambient Light

There are only a few things in the display business that are overestimated in their practical use – one of them is the Contrast Ratio. As pointed out in chapter “▶ Luminance, Contrast Ratio and Grey Scale,” the darkroom Contrast Ratio is often “highlighted” to demonstrate superior performance. But we will see that


Table 4 Numerical examples for Contrast Ratio under darkroom conditions and with a reflected luminance LReflected of 10 cd/m² (other values can also be chosen)

Display | LWhite (cd/m²) | LBlack (cd/m²) | CRDR | CRAL using Eq. 11ᵃ | CRAL using Eq. 12
A | 500 | 1 | 500 | 510/11 = 46 | 500/10 = 50
B | 100 | 1 | 100 | 110/11 = 10 | 100/10 = 10
C | 500 | 5 | 100 | 510/15 = 34 | 500/10 = 50
D | 100 | 5 | 20 | 110/15 = 7 | 100/10 = 10

ᵃ LReflected (W) = LReflected (K) (see above comments on Eq. 11)

this has only a very limited influence on ambient light performance. Table 4 compares four displays with white luminances of 500 cd/m² (A and C) and 100 cd/m² (B and D), respectively. They differ in their black luminances and therefore also in their darkroom Contrast Ratios. To simplify the comparison, we assume the same reflected luminance for black and white screens by adding 10 cd/m² in both cases. Please note that we use the "exact" formula (11) in the second column from the right, taking into account three luminance values (LWhite, LBlack, and LReflected), and not only two as in the simpler approach given in the right-hand column (see Eq. 12). At first impression, one is likely to argue that a high (darkroom) Contrast Ratio CRDR also results in superior ambient light performance. Let us start with the second column from the right ("exact" formula): Displays A and C have an identical white luminance of 500 cd/m², but the darkroom Contrast Ratio for display C is a factor of 5 lower than for A. However, this "advantage" shrinks to about 1.4 (46:34) when comparing the ambient light Contrast Ratio CRAL with a reflected luminance of 10 cd/m². That is less than 30 % of the darkroom "advantage." Black luminance generally has only a slight influence on the ambient light Contrast Ratio CRAL. This is demonstrated by comparing displays B and D: black in a darkroom differs by a factor of 5, while the ambient Contrast Ratio shows (again) only a factor of 1.4. On the basis of the white luminance, ambient light reduces the Contrast Ratio of A relative to B from a factor of 5 (50:10) under dark conditions to 4.6 (46:10) for CRAL, which is fairly similar. Roughly the same applies for C and D: 34:7 and 50:10. Table 4 is also suitable for evaluating the limits of the approximation of the ambient Contrast Ratio formula (12): for reasonable and typical values of the darkroom Contrast Ratio, the displays A and B differ only slightly between the two ambient light Contrast Ratios CRAL (46 vs. 50 for A, and 10 for both formulas for B). The simple hands-on formula (12) for the Contrast Ratio under diffuse illuminance, taking only the white luminance output of the display and the reflected luminance (or diffuse illuminance, Eq. 14) into account, is sufficient for many calculations.
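The entries of Table 4 can be reproduced directly from formulas (11) and (12); a minimal sketch:

```python
def cr_exact(l_white, l_black, l_reflected):
    """Eq. 11: the reflected luminance adds to both the white and black states."""
    return (l_white + l_reflected) / (l_black + l_reflected)

def cr_approx(l_white, l_reflected):
    """Eq. 12: hands-on approximation neglecting LBlack and the reflected
    contribution to the white state."""
    return l_white / l_reflected

for name, lw, lb in (("A", 500, 1), ("B", 100, 1), ("C", 500, 5), ("D", 100, 5)):
    # darkroom CR, "exact" ambient CR (Eq. 11), approximate ambient CR (Eq. 12)
    print(name, lw / lb, round(cr_exact(lw, lb, 10)), cr_approx(lw, 10))
```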

Measurements of Contrast Ratio Versus Diffuse Ambient Light for Various Displays
Figure 11 shows the measured results of various display technologies concerning their diffuse ambient light performance, and a comparison with the ambient light formula (14). Because the darkroom Contrast Ratios CRDR of the displays are different, the ambient light Contrast Ratio CRAL is divided by CRDR in order to normalize to "1" (darkroom conditions). All the displays (two transmissive AM LCDs and a CRT) under test show a similar degradation of the relative Contrast Ratio due to ambient light. Increasing the white darkroom luminance LWhite (0 lx) by using a high power backlight for a transmissive AM LCD shifts its curve (dashed line) to the right compared to a standard AM LCD (dashed-dotted line), so that the display remains readable for higher ambient brightness. The solid line is the result of a numerical simulation using the diffuse formula (14) with a reflectivity of 2 % and a darkroom Contrast Ratio of 75:1, which was typical for the displays under test (except the high power backlight LCD). This demonstrates again that the simple


[Fig. 11 plot: normalized Contrast Ratio CRAL/CRDR vs. diffuse ambient illuminance E (1–10,000 lx) for a CRT, a standard transmissive LCD, a transmissive LCD with high power backlight, and values calculated by Eq. 14 with LWhite = 150 cd/m², LBlack = 2 cd/m², rD = 0.02]

Fig. 11 Normalized Contrast Ratio degradation over illuminance for different display types and calculated values

[Fig. 12 plot: Contrast Ratio CR vs. illuminance E (1–10,000 lx; night, indoor, outdoor), with the contrast range of vision from 3:1 to 500:1; emissive and non-emissive characteristics with dimming, backlight, and the optimization CR ≠ CR(E)]

Fig. 12 Typical characteristics of Contrast Ratio for a wide range of ambient light conditions for emissive and reflective display technologies

approach for the diffuse ambient light Contrast Ratio (Eqs. 12 and 14) leads to easy-to-achieve and reasonable results for many setups. An optimum display system should be readable under the ambient light conditions of the corresponding application. This is visualized in Fig. 12, where the Contrast Ratio is plotted over the illuminance E across a wide range. Basically, vision can "process" Contrast Ratios from 3:1 to 500:1. The basic dependencies for emissive (dotted line) and nonemissive (dashed line) display technologies are plotted. The performance of emissive displays degrades with increasing ambient illumination because of increasing ambient light reflections, while the Contrast Ratio for nonemissive technologies rises with ambient illuminance to a constant value. Emissive displays should be dimmed in darker conditions to avoid eyestrain, while nonemissive displays have to be illuminated (e.g., a backlight for LCDs) to remain readable at low illuminance levels. The optimum is achieved if the Contrast Ratio is independent of the illuminance E (CR ≠ CR(E), solid line) and lies within the range of 10:1 for text and 30:1 for images and videos. As shown, ambient light significantly reduces the Contrast Ratio of displays. Some basic calculations have been performed here with relatively simple approximations that are validated with numerical examples and measurements. In the following text, we will have a closer look at the ambient light measurement standards.


Measurement Procedures with Ambient Light Simulations
As exact ambient light measurements would require the same environment as intended for the application, standards are very helpful in reducing the effort required in display evaluation. However, some experience is needed to extrapolate from measurement results to real applications. As outdoor applications like E-Signage become more widespread, their ambient light performance also has to be evaluated – often ISO 15008 is used for that; others are listed in chapter "▶ Standards and Test Patterns."

ISO 15008 for Automotive Evaluation
Infotainment systems including satellite navigation using electronic displays are becoming more and more standard even in mid-size cars. As automotive applications have quite different ambient light conditions, ranging from night to direct sunlight (especially in convertibles), these displays must be readable in all such environments. Dark light conditions are relatively easily managed by "just" dimming the display; bright sunlight scenarios are more challenging. In order to ensure a minimum readability in automotive applications, the ISO 15008 standard was created. It is related to the possible position of the driver's eyes (the "eye box") and the geometry of the display. The focus here is on the two different light simulations: daylight and sunlight simulations, as described below, are easy to perform and require less effort than setups using integrating spheres (see section 8.1 of chapter "▶ Overview of the Photometric Characterisation of Visual Displays"), which are also described in this standard.

Measurement Setup
The basic setup consists of a display, a lamp, and a luminance meter for measuring the combined luminance L from the display and reflections (LReflected). The daylight setup (Fig. 13, left) simulates a cloudy sky with diffuse light of illuminance E = 3,000 lx at the display, with angles α, β, γ, and 25° elevation of the display according to the geometry of the driver and the position and orientation of the display; for simplification, only β is shown here. A diffuser (the results strongly depend on its characteristics) placed close to the display represents large-area clouds and sky. A combination of diffuse and haze (if present) reflections characterizes the sunlight setup (Fig. 13, right): a distant small-sized lamp ensures that no specular light reaches the detector for this geometry; the illuminance E is specified as 45,000 lx at the display. Both geometries take into account that displays are usually not readable in specular sunlight conditions. The minimum Contrast Ratio is defined as 3:1 for daylight and 2:1 for the sunlight setup in ISO 15008.
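With the diffuse approximation (Eq. 14), the white luminance a display needs in order to reach these thresholds can be estimated; a minimal sketch (the 2 % diffuse reflectivity is an assumed example value, and the real standard geometries also include haze contributions):

```python
import math

def min_white_luminance(cr_min, r_d, illuminance_lx):
    """Rearranged Eq. 14: white luminance (cd/m^2) needed to reach a
    required Contrast Ratio under diffuse illumination."""
    return cr_min * r_d * illuminance_lx / math.pi

print(min_white_luminance(3, 0.02, 3_000))    # daylight, 3:1  -> ~57 cd/m^2
print(min_white_luminance(2, 0.02, 45_000))   # sunlight, 2:1 -> ~573 cd/m^2
```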

[Fig. 13 diagrams: daylight setup (lamp, diffuser, illuminance E, angle β) and sunlight setup (distant lamp, luminance meter L), with incidence/measurement angles of 30°, 45°, and 55°]

Fig. 13 ISO 15008: Daylight (left) and sunlight (right) geometry in simplified version (driver’s eye box – display geometry omitted)


[Fig. 14 plot: 55° incident angle, 30° measurement angle; reflected luminance LReflected (0–200 cd/m², left axis) and Contrast Ratio (1–1,000, logarithmic right axis) vs. illuminance (0–60,000 lx), with the fitted line LReflected = 0.0028 E]

Fig. 14 Typical chart of reflected luminance and Contrast Ratio degradation of an AM LCD under ISO 15008 sunlight geometry (see Fig. 13)

Results for an AM LCD for E-Signage applications are presented in Fig. 14 for sunlight conditions. The reflected luminance LReflected (left axis, blue curve) is proportional to the illuminance E (see also section 2.1), which is also verified here by a fit (dotted green line). This linear dependency allows extrapolation of the illuminance to higher values. This can be useful, as bright light heats up the display, potentially changing the optical performance of the display under test and making direct measurement at high incident illumination difficult. Indeed, a nonlinearity generally indicates a rise in the temperature of the display, which degrades the optical characteristics (usually specified for 25 °C). The constant of proportionality in this example is 0.0028, leading to a reflection coefficient (diffuse and haze) rD,System of 0.009 (see section 2.1). The Contrast Ratio (right axis, magenta curve) shows the theoretically predicted hyperbolic behavior (1/E) as pointed out in section 3. The Contrast Ratio is about 470:1 for darkroom conditions and degrades to about 6:1 at 45,000 lx, which is better than required by ISO 15008. The fitted curve corresponds well to the measured characteristic. However, one must be aware that this measured Contrast Ratio of 6:1 is not very representative of real light situations, because only diffuse light (under the given geometry) is evaluated. In real-world applications, specular reflections (observer – display – surroundings) such as those from clouds or walls cannot be avoided. In consequence, the two relatively simple measurements of ISO 15008 are most suitable for the comparison of the ambient light performance of various displays. Improvements are achievable via Ulbricht sphere setups (see above), which are also described in the standard.
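The linear fit makes both the reflection coefficient and the extrapolation to harsher illuminances a one-liner; a minimal sketch using the values quoted above:

```python
import math

slope = 0.0028                      # cd/m^2 per lx, fitted in Fig. 14
r_system = slope * math.pi          # diffuse + haze reflection coefficient
print(f"r = {r_system:.4f}")        # ~0.0088, i.e., about 0.9 %

for e in (45_000, 60_000, 100_000): # extrapolate LReflected via the fit
    print(f"{e} lx -> {slope * e:.0f} cd/m^2 reflected")
```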

Sunlight Readable Touch Entry Devices According to MIL-L-85762
Another method for evaluating the ambient light performance of electronic displays with touch screens is the MIL-L-85762 standard. Although dedicated to touch systems, it can also be used to test displays without touch screens. The measurement setup is shown in Fig. 15; in contrast to ISO 15008, the specular component is explicitly taken into account. The diffuse illuminance E (labeled "diffuse" in Fig. 15) via lamp and diffuser has to be set to 24,000 lx, and the light source in specular geometry (labeled "specular" in Fig. 15) has a luminance of 24,000 cd/m². The latter value is large but leads only to a relatively small illuminance on the display of about 200 lx, depending on the light source size and its distance from the screen. The specular arrangement must be made with care so that the "image" of the light source is homogeneous and larger than the area of the display captured by the luminance meter (FOV). The minimum Contrast Ratio to pass this standard is 1.6:1. This MIL standard is sometimes modified


[Fig. 15 diagram: display with illuminance E and luminance meter (FOV); diffuse source (lamp with diffuser) and specular source arranged at 30° on either side of the 90° display normal]

Fig. 15 MIL-L-85762 geometric setup for ambient light performance measurements

in terms of the lamp's intensity in specular geometry to avoid the screen content being completely washed out by the specular reflection. The following formulas and calculations combine the measurement values and the lighting conditions on the basis of formulas (5), (10), (12), and (14) ("1" is added here as the threshold Contrast Ratios are relatively low):

$$C_R = \frac{L_{WhiteDT} + L_{Reflected}}{L_{BlackDT} + L_{Reflected}} = \frac{L_{WhiteDT} + f_{DT}E}{L_{BlackDT} + f_{DT}E} \approx \frac{L_{WhiteDT}}{f_{DT}E} + 1 \qquad (15)$$

$$C_R \approx \frac{D \cdot L_{WhiteD}(30^\circ)}{\left(f_D^{spec} + f_T^{spec}\right)E^{spec} + \left(f_D^{diffus} + f_T^{diffus}\right)E^{diffus}} + 1 \qquad (16)$$

with

• LxxxDT: luminance of the display with touch screen for an ambient illumination of 0 lx and a 30° angle of vision. The index xxx stands for white or black.
• fDT: combined reflection coefficient of the display with touch screen for diffuse and specular geometry, see section 2.
• E: illuminance (lx).
• D: relative transmission of the touch screen for the display luminance.
• LxxD(30°): luminance of the display at 30° (cd/m²).
• f (with indices): reflection coefficient for diffuse or specular orientation, with index "D" for the display and "T" for the touch screen.
• E (with indices): illuminance (lx) for diffuse or specular orientation.

Figure 16 (left) shows example results for diffuse illuminance only, and Fig. 16 (right) the specular relationship, for an LCD (LWhite = 160 cd/m²) with two different touch screens (T1 and T2; D = 0.76) mounted on it, and without a touch screen (dotted line). All curves show a linear relationship, where touch screen T1, with advanced reflection reduction methods, has an even lower reflectivity under diffuse conditions than the LCD with no touch entry device. Table 5 shows an example of the MIL standard evaluation with this display-touch system: both measured Contrast Ratios (center column) are below the requirements of the standard (1.6:1). However, by using formulas (15) or (16), we are able to calculate


Table 5 Comparison of Contrast Ratio and required minimum luminance of the LCD from Fig. 16 with different touch screens T1 and T2 (example data)

Touch screen | Measured Contrast Ratio for both light sources (at the same time) | Required white luminance of the LCD for CR = 1.6:1
T1 | 1.18 | 540 cd/m²
T2 | 1.04 | 2,400 cd/m²

[Fig. 16 plots: reflected luminance LReflected (cd/m²) vs. illuminance (0–400 lx) for the bare LCD and with touch screens T1 and T2; left panel diffuse geometry (0°, 0–4 cd/m²), right panel specular geometry (30°, 0–4,000 cd/m²)]

Fig. 16 Example results for the reflected luminance (LReflected) of MIL-L-85762 test procedure for diffuse geometry (left) and specular geometry (right)

[Fig. 17 plots: luminance L (cd/m²) vs. angle θ (−40° to +40°) for LWhite (0 lx), LBlack (0 lx), and LBlack (1,000 lx), with specular and matte (diffuse) regions marked (left); Contrast Ratio at 1,000 lx, both as the approximation CRAL ≈ LWhite (0 lx)/LBlack (1,000 lx) and as the definition CRAL = LWhite/LBlack (right)]

Fig. 17 Horizontal angular dependency (example data) for an AM LCD of the luminance (left; emitted and reflected) and Contrast Ratio (right)

the display’s white luminance LWhite necessary to achieve a Contrast Ratio of 1.6:1 to fulfill the standard. This leads (right column) to a white luminance of the display of 540 cd/m2 for touch screen T1 and 2,400 cd/m2 for T2 respectively. Touch screen T1 is significantly more suitable as 540 cd/m2 is easier (and cheaper given the lower power consumption) to achieve by backlight improvements than 2,400 cd/m2 for T2.

Ambient Light and LCD with Viewing Angle Dependency
Another example of a Contrast Ratio measurement, and a further validation of the approximation formula (12) for the Contrast Ratio CRAL in the presence of ambient light, is provided in Fig. 17: a horizontal viewing angle scan (−40° to +40°; for the method see chapter "▶ Measurement Devices") was performed by turning an AM LCD in front of a fixed luminance meter (perpendicular to the display at 0°) and a light source (40° off the luminance meter, switched ON and OFF). Therefore, a combination of the viewing angle dependencies of the AM LCD and the reflections of the lamp from the display was measured. This procedure is also suitable for judging the reflection characteristics of the display (surface); for other procedures see, e.g., ASTM E1392-96 (American Society for Testing and Materials 1997).


At first, the luminance plot (Fig. 17, left) will be discussed for darkroom conditions (0 lx). The luminance for black, LBlack (0 lx), and white, LWhite (0 lx), has the typical LCD shape as described in section 2.1 of chapter "▶ Viewing Angle": the absolute value of the black luminance is relatively low compared to white (high darkroom Contrast Ratio CRDR); the curve for black (dashed-dotted line) rises for larger angles. The white luminance shows the typical Gaussian-like shape (dotted line). After the darkroom measurements were done, a distant lamp with a small diffuser in front of it was adjusted to deliver 1,000 lx at the display's surface; the geometry was 40° off the perpendicular incidence at 0°. Again, the luminances for black and white screens (the latter not shown here) were measured. When plotting the luminance LBlack (1,000 lx; Fig. 17, left, solid line) for the black display with reflections from the light source, we obtain the following results:

• Under specular conditions (−20°), the luminance LBlack (1,000 lx), which here corresponds to LReflected, for black is nearly twice the white luminance LWhite (0 lx) (dotted line) under darkroom conditions.
• The "specular peak" is relatively wide because the display under test had a matte surface (diffuse characteristics, distributing incident light in nearly all directions). The same characteristic was measured for the white luminance with reflections (not plotted).

We will now discuss the characteristics of the Contrast Ratio CRAL (Fig. 17, right) with the lamp switched on, i.e., with ambient light. It was measured and calculated in two ways – first according to the definition (dashed line, section 3 of chapter "▶ Luminance, Contrast Ratio and Grey Scale") by dividing the measured black and white luminances (1,000 lx, lamp ON), and second with the approximation formula (12) (solid line):

• The shape of both approaches for the Contrast Ratio is very similar for all angles.
• The peak value is slightly different because LReflected (here LBlack (1,000 lx)) is neglected in the numerator of the fraction in the approximation formula (see Eq. 11).
• There is practically no legibility at the specular reflection angle and within about 15° of it, due to the matte surface.
• The maximum Contrast Ratio with the lamp on is at about +20°, where the lamp-display-observer geometry is in the diffuse region and the viewing angle-dependent luminance of the display is reasonably high. The angle of maximum Contrast Ratio will differ from display to display, because it strongly depends on both the viewing angle characteristics of the display and its surface reflection properties.

The conclusion from all the abovementioned ambient light measurement setups and procedures is that they try to simulate real-world ambient light conditions. But, as one can easily imagine, their value for real scenarios can be somewhat limited. A more dedicated method is to use ray-tracing software to simulate light conditions. This is easier to adapt for various light situations approaching "real" environments, but BRDFs are required, which have to be obtained by measurements for all angles. Going into more detail is beyond the scope of this chapter. The simple measurement setups presented here are in any case better than having no information about ambient light performance.
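The two evaluation routes compared above differ only in whether the white state is measured with the lamp on or taken from the darkroom value; a minimal sketch with hypothetical per-angle samples (all numbers invented for illustration):

```python
import numpy as np

angles = np.array([-20.0, 0.0, 20.0])          # hypothetical scan angles (deg)
l_white_dark = np.array([90.0, 150.0, 95.0])   # LWhite(0 lx), cd/m^2
l_white_on = np.array([210.0, 165.0, 100.0])   # LWhite(1,000 lx), lamp ON
l_black_on = np.array([120.0, 15.0, 5.0])      # LBlack(1,000 lx) ~ LReflected

cr_definition = l_white_on / l_black_on        # divide measured luminances
cr_eq12 = l_white_dark / l_black_on            # approximation formula (12)
for a, c1, c2 in zip(angles, cr_definition, cr_eq12):
    print(f"{a:+5.0f} deg: CR(def) = {c1:5.1f}, CR(Eq.12) = {c2:5.1f}")
```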

Contrast Ratio for Projection
As we have seen for direct view displays, ambient light calculations do not transfer well to real-world conditions. For projection, however, the situation is more favorable, as standard screens can be regarded as nearly perfect diffuse reflectors. As introduced in chapter "▶ Introduction to Display Metrology" (photometric units), a flux F (unit lumen) generated by a projector causes an illuminance E (unit lx) over an image of size A (unit m²) on the screen. This formula is expanded by a (normalized) gain factor G (dimensionless) of the screen used:


E_Projector = (F · G) / A  (17)

The gain factor equals "1" for a perfectly diffuse (Lambertian) screen; for semiglossy or retroreflective screens, G can reach values of two or more. Formula (3) allows the calculation of the (reflected) luminance from the illuminance E (e.g., by ambient light) on a diffuse surface (here a screen) with a reflection coefficient r:

L = (E · r) / π  (18)

We will now apply these two formulas to a typical situation in a conference room: the projector throws white light with a flux F onto the screen, which is also illuminated by other, typically white, light sources such as lamps or sunlight falling through the windows. On the screen, all of the light sources combine and are diffusely reflected, e.g., toward the observer or luminance meter. We will assume in the following that all light not coming from the projector is summarized as E_Ambient light and that the output of the projector refers to a full white (or black) image. Combining both formulas (Eqs. 17 and 18) yields the luminance L_Projector generated by the projector on a diffuse screen, which typically refers to a white image of the projector; r_Screen represents the reflection coefficient of the screen:

L_Projector(White) = (F · G · r_Screen) / (A · π)  (19)

Typical values of the reflection coefficient r_Screen for nonglossy materials at perpendicular incidence are 0.95 (95 %) for standard screens and 0.8 for white-painted walls. Uncoated glass reflects about 5 % at perpendicular incidence, while a matte black wall typically reflects 1 %. Please note that the screen is often judged as "white" when the projector is switched off and only ambient light is present. The corresponding luminance is the "blackest black" we can achieve for projection. In the same way, we obtain the reflected luminance L_Diffuse ambient from the diffusely reflected ambient illuminance E_Ambient light (lx) on the screen:

L_Diffuse ambient = (E_Ambient light · r_Screen) / π  (20)

We can now put the luminance for the diffusely reflected ambient light L_Diffuse ambient (Eq. 20) and the projector L_Projector (for black and white, Eq. 19) into formula (11) for the Contrast Ratio CR_AL with ambient light:

CR_AL = (L_Projector(White) + L_Diffuse ambient) / (L_Projector(Black) + L_Diffuse ambient)  (21)

We will make (again) the reasonable assumption that the darkroom Contrast Ratio of the projector is relatively high, i.e., that L_Projector(Black) is negligible compared to the reflected ambient luminance, so that Eq. 21 simplifies to CR_AL ≈ (L_Projector(White) + L_Diffuse ambient) / L_Diffuse ambient. […]

[Table fragment, structure not recoverable: >35 cd/m²; 45 cd/m²; 50 cm; 2.3–6.5 mm (d = 50 cm; 3.1 preferred); 92 % of height; 2.6 mm; >7 × 9; >5 × 7; >0.75; 0.71; 3:1–15:1 (6:1 preferred); >50 %]

[…] by a list of selected parts and areas. Further standards can be found in Gray et al. (1985). Additional information is provided for some examples, like text reproduction in Table 3 and pixel faults in Tables 4, 5, and 6.

ISO 9241
The standard ISO 9241 covers many tasks of human interaction with computers, including software and input and output devices such as displays. The relevant parts for the latter are:
• Part 300: Introduction to electronic visual display requirements
• Part 302: Terminology for electronic visual displays
• Part 303: Requirements for electronic visual displays
• Part 304: User performance test methods for electronic visual displays
• Part 305: Optical laboratory test methods for electronic visual displays
• Part 306: Field assessment methods for electronic visual displays
• Part 307: Analysis and compliance test methods for electronic visual displays
• Part 308: Surface-conduction electron-emitter displays (SED)
• Part 309: Organic light-emitting diode (OLED) displays


Table 4 Flat panel devices: classes(a) for pixel faults (ISO 13406-2) – maximum allowed quantity of defective pixels per million:
• Class I: type 1: 0; type 2: 0; type 3: 0; 5×5 cluster of type 1 or 2: 0; 5×5 cluster of type 3: 0
• Class II: type 1: 2; type 2: 2; type 3: 5; 5×5 cluster of type 1 or 2: 0; 5×5 cluster of type 3: 2
• Class III: type 1: 5; type 2: 15; type 3: 50; 5×5 cluster of type 1 or 2: 0; 5×5 cluster of type 3: 2
• Class IV: type 1: 50; type 2: 150; type 3: 500; 5×5 cluster of type 1 or 2: 5; 5×5 cluster of type 3: 50
(a) ISO 13406-2 class II is recommended for industrial applications. There are often specific definitions used in specifications which do not refer to ISO.

Table 5 Flat panel devices: types for pixel faults (ISO 13406-2):
• Type 1: white pixel (RGB ON)
• Type 2: black pixel (RGB OFF)
• Type 3: subpixel (R, G, or B OFF or ON)

Table 6 Flat panel devices: cluster types for pixel faults (ISO 13406-2): a cluster of type 3 comprises defects (X) of subpixels (ON or OFF) within an array of 5 × 5 pixels; clusters of type 1 or 2 are defined equivalently. (The original table illustrates allowed and not-allowed defect arrangements.)
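As an illustration of how Table 4 might be applied in practice, the sketch below scales the per-million limits to a panel's actual pixel count; the dictionary is transcribed from the table, while the helper function and the Full-HD example are hypothetical.

```python
# ISO 13406-2 limits from Table 4: maximum defective pixels per million,
# per (type 1, type 2, type 3, 5x5 cluster type 1/2, 5x5 cluster type 3).
LIMITS_PER_MILLION = {
    "I":   (0, 0, 0, 0, 0),
    "II":  (2, 2, 5, 0, 2),
    "III": (5, 15, 50, 0, 2),
    "IV":  (50, 150, 500, 5, 50),
}

def allowed_faults(panel_class: str, total_pixels: int) -> tuple:
    """Scale the per-million limits of Table 4 to an actual pixel count."""
    return tuple(limit * total_pixels / 1_000_000
                 for limit in LIMITS_PER_MILLION[panel_class])

# Example: a Full-HD panel (1920 x 1080 ~ 2.07 Mpixels) of class II may
# have up to ~4 white, ~4 black, and ~10 defective subpixels.
print(allowed_faults("II", 1920 * 1080))
```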

IEC TC100
The IEC TC100 has the following technical areas (TAs):
• TA 1: Terminals for audio, video, and data services and contents
• TA 2: Color measurement and management
• TA 4: Digital system interfaces and protocols
• TA 5: Cable networks for television signals, sound signals, and interactive services
• TA 6: Higher data rate storage media, data structures, and equipment
• TA 7: Moderate data rate storage media, equipment, and systems
• TA 8: Multimedia home server systems
• TA 9: Audio, video, and multimedia applications for end-user networks
• TA 10: Multimedia e-publishing and e-book


Power Consumption in Europe (EU 642/2009)
A directive within the European Union limiting the power consumption of monitors and TV sets became effective on 1 April 2012. The maximum allowed power consumption can be calculated via the following formulas, with A as the screen area in dm², as provided in the directive:
• TV sets: P_TV ≤ 16 W + A · 3.4579 W/dm²
• Monitors: P_Monitors ≤ 12 W + A · 3.4579 W/dm²
This results in P_TV ≤ 113 W for a TV set with 81 cm diagonal and 16:9 aspect ratio.
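The quoted 113 W figure can be reproduced with a short calculation (Python sketch; the 81 cm diagonal and 16:9 aspect ratio are the example values above):

```python
import math

diagonal_dm = 8.1                    # 81 cm diagonal expressed in dm
k = diagonal_dm / math.hypot(16, 9)  # scale factor for a 16:9 panel
area_dm2 = (16 * k) * (9 * k)        # screen area, ~28 dm^2

p_tv_max = 16 + area_dm2 * 3.4579    # directive formula, result in W
print(f"A = {area_dm2:.1f} dm^2 -> P_TV <= {p_tv_max:.0f} W")  # ~113 W
```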

Test Patterns
In order to allow reproducible and comparable display measurement results, it is obvious that test patterns have to be defined and used. Two different types of test patterns (or images) exist:
• Measurement patterns: These patterns are optimized for luminance and color meters.
• Visual inspection: Many display evaluation tasks can be done most effectively by visual inspection, as often no suitable measurement device exists.
A good example is gray-level testing: for measurement devices, full screen patterns of successive (in time) gray levels are used, and the measurement spot is always at the same location of the screen (usually the center). Therefore, no repositioning is necessary, and nonuniformities over the screen do not influence the result. For visual judgment, about 10 vertical gray-level bars fill the whole screen at one time. The observer can then quickly notice whether the perceived luminance difference from one gray bar to its neighbor is roughly the same. Furthermore, with boxes of slightly different gray levels inside each bar, the gray-level resolution can easily be judged. This chapter presents the most relevant test patterns with a short description of their use and the judgment of the results. Table 7 provides an overview of measurement procedures and recommended test patterns. More information is given in the subsequent paragraphs. The measurement procedures are presented in Part "▶ Introduction to Display Metrology" and measurement devices in chapter "▶ Measurement Devices." However, not every aspect can be discussed and presented here, such as the specific requirements for medical displays (see chapter "▶ Reliability and Fidelity of Medical Imaging Data"; International Telecommunication Union Recommendation BT.709-5 2002; Deutsches Institut für Normung e.V. DIN 66234-1-8 1980).

Full Screen
The simplest test pattern is the full screen pattern, where a single gray level or color is shown uniformly over the entire screen (Fig. 1). Examples are black, white, gray, and colored content. This test image is mostly used for luminance, contrast ratio, gray level, and color (coordinate) measurements. Furthermore, the full screen test pattern is also used for uniformity measurements, e.g., for projection systems, which often have a bright spot in the center and darker corners.

Centered Box
Figure 2 visualizes a centered box test pattern where fore- and background are black and white and vice versa; gray levels and colors are rarely used. The inner box is usually square, with a side of one fifth of the screen height. Camera-based measurements refer to the steplike profile of the gray level and its luminance dependency (Fig. 3). The centered box can be shrunk to the size of a single pixel (Fig. 4), which also provides a measure of sharpness. As this figure shows, there are some nonuniformities within one pixel of the CRT.

Table 7 Overview of measurement tasks(a) and recommended test patterns(b):
• Luminance, contrast ratio, gray scale (gamma), color coordinates, response time – Full screen (§11.5.2.1)
• Contrast ratio (mainly PDP), loading, halation – Centered box (§11.5.2.2)
• Contrast ratio (mainly projection), burn-in – Checkerboard (§11.5.2.3)
• Modulation transfer function, visual sharpness – Resolution (grille) (§11.5.2.4)
• Visual sharpness, readability, scaling – Resolution (visual) (§11.5.2.5)
• Visual judgment of gray scale – Gray scale (§11.5.2.6)
• Ghosting – Shadowing (§11.5.2.7)
• Motion blur effects – Motion blur (§11.5.2.8)
• Intracharacter contrast, visual judgment of readability and motion blur for text – Text (§11.5.2.9)
• Visual judgment of real images including skin tones and gray levels – Color reproduction (§11.5.2.10)
• Deinterlacing, frame rate conversion – Motion test (§11.5.2.11)
• AMLCD Vcom adjustment, flicker – LCD flicker (§11.5.2.12)
(a) Including viewing angle, ambient light, temperature, short time, and lifetime measurements
(b) b/w, monochrome, and color

Fig. 1 Full screen test patterns: black (left) and white (right) as examples for other content like gray levels and colors

Fig. 2 Examples of centered box test patterns. The size of the inner box can vary

Dependency of White or Black Luminance on Size and Background
For some display technologies like PDPs (see chapter "▶ Display Technology-Dependent Issues") and CRTs, the maximum white luminance can depend on the histogram distribution of all gray levels shown. A typical example is visualized in Fig. 5 for the evaluation of loading (the white percentage increases relative to the black background) and halation. The measurement procedure is described in chapter "▶ Spatial Effects." These test patterns are also used for the evaluation of light measurement devices with optics (see chapter "▶ Measurement Devices").
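Patterns like these are straightforward to generate programmatically. The sketch below (assumed NumPy-based; resolution and gray values are arbitrary) produces a full screen pattern and a centered box with the one-fifth-height inner box described above:

```python
import numpy as np

W, H = 1920, 1080  # example resolution (assumed)

def full_screen(level: int) -> np.ndarray:
    """Uniform full-screen pattern, e.g., level=0 (black) or 255 (white)."""
    return np.full((H, W), level, dtype=np.uint8)

def centered_box(fg: int, bg: int, box_frac: float = 0.2) -> np.ndarray:
    """Centered square box with a side of box_frac * screen height."""
    img = np.full((H, W), bg, dtype=np.uint8)
    s = int(H * box_frac)
    y0, x0 = (H - s) // 2, (W - s) // 2
    img[y0:y0 + s, x0:x0 + s] = fg
    return img

white = full_screen(255)
black_box_on_white = centered_box(fg=0, bg=255)
```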


Fig. 3 Centered box (left) and typical luminance profile (right, dotted line)

Fig. 4 Examples of centered box of the size of one pixel of CRT (left) and LCD (right)


Fig. 5 Loading and halation test patterns with different center box area ratios

Checkerboard
The standard test pattern for improved contrast ratio and burn-in (and similar phenomena) measurements is a checkerboard (chessboard), as visualized in Fig. 6. Usually about five alternating black and white squares are arranged vertically, and the number of horizontal squares depends on the display's aspect ratio.

Resolution (Grille)
For examination of the real display resolution, the modulation transfer function (MTF, see chapter "▶ Luminance, Contrast Ratio and Grey Scale") is measured. This is done via different spatial frequencies, expressed as an n × m grille: n stands for the number of neighboring black pixels and m refers to the number of neighboring white pixels; both horizontal and vertical bars are used. Normally one takes n = m for MTF measurements. 1 × 1 means that one pixel is white and the next one black (rectangular bars).


Fig. 6 Checkerboard test pattern; box size and number may vary


Fig. 7 Grille test patterns: constant grille (e.g., 1  1, left) and increasing (right, for visual resolution inspection)

The lowest spatial frequency corresponds to the left half of the screen being white and the right half black. Figure 7 visualizes the two types of grille patterns: one (left) with a constant n value, filling the whole screen, for electronic measurements; the other (right), for visual inspection, uses an increasing n (each grille usually repeated 10 times before raising n by 1). Another widespread visual image is the USAF 1951 target. Grille patterns can be oriented horizontally or vertically and be of pulse- or sine-like shape (Table 8).
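The following NumPy sketch (a hypothetical helper, with arbitrary example resolution) generates such rectangular-bar grille patterns:

```python
import numpy as np

def grille(width: int, height: int, n: int, m: int,
           horizontal: bool = False) -> np.ndarray:
    """Rectangular-bar grille: n pixels black, m pixels white, repeated."""
    period = n + m
    axis = np.arange(height if horizontal else width)
    line = np.where((axis % period) < n, 0, 255).astype(np.uint8)
    if horizontal:
        # one column profile, repeated across the width
        return np.repeat(line[:, None], width, axis=1)
    # one row profile, repeated down the height
    return np.repeat(line[None, :], height, axis=0)

pattern_1x1 = grille(1920, 1080, 1, 1)  # finest grille: alternating columns
pattern_3x3 = grille(1920, 1080, 3, 3)  # lower spatial frequency
```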

Resolution (Visual)

As resolution measurements of displays require considerable effort (see chapters “▶ Luminance, Contrast Ratio and Grey Scale” and “▶ Measurement Devices”), visual inspection patterns are a good alternative (Fig. 8). For judgment, the pattern is often labeled with figures like lines per mm, which are then easy to compare. Most of these patterns are black and white and form part of many TV all-in-one test pictures. These patterns are also suitable for judgment of signal processing effects (see section “▶ Photometry” of chapters “▶ Spatial Effects” and “▶ Video Processing Tasks”) and video bandwidth limitations. Similar patterns are used for optical system evaluation.

Gray Scale
Visual judgment of gray levels is done by showing boxes or bars of different gray levels simultaneously on the screen, while grayscale measurements make use of full screen patterns with the measurement spot always at the same location on the screen to avoid potential uniformity deviations.


Table 8 Overview of grille patterns: arrangements are rectangular bars (grille) or sine-modulated bars, each in vertical or horizontal orientation.

Fig. 8 Examples of visual resolution test patterns: grille (left, often labeled with lines per mm), Siemens star (center), and zone plate (right)

All patterns shown here are monochrome gray, but color scales are also often used. Especially for analog input displays, it is essential that both brightness and contrast settings are adequate for optimum grayscale resolution, as described in chapter "▶ Luminance, Contrast Ratio and Grey Scale": both parameters are connected to gray-level resolution and are adjusted so that all of the boxes of each gray shade bar (Fig. 9) can be distinguished. Test patterns of gray bars showing all possible gray levels (e.g., 256 for 8-bit resolution) often allow quicker and better results by visual inspection than examinations done by meters (although visual inspection does not allow calculation of the gamma value). Figure 10 shows two typical gray-level test patterns. The pattern on the right, with boxes inside gray bars, enables quick counting of the gray levels that can be distinguished by vision. Furthermore, it is striking to notice low gray levels vanishing when the display is exposed to (bright) ambient light.

Shadowing and Ghosting
Particularly for passive matrix-driven displays, neighboring pixels with different gray levels can influence each other, including the effects of a whole line or row. These degradations can be captured by MTF measurements (see chapters "▶ Luminance, Contrast Ratio and Grey Scale" and "▶ Color"), while


Fig. 9 Gray-level test pattern for visual adjustment of contrast and brightness

Fig. 10 Monochrome gray-level resolution test patterns: linear increment (256 boxes for 8-bit color depth, left) and vertical bars with insets of ±1 gray level (right)

Fig. 11 shows patterns that are also useful for visual inspection. These effects are known as shadowing or ghosting and are expressed in terms of uniformity and contrast (reduction). They can be observed in the horizontal and/or vertical direction, also depending on the gray levels of fore- and background; therefore, several test images should be used.

Motion Blur

Measurements of motion blur require expensive equipment (see section "▶ Display Glass" of chapter "▶ Measurement Devices"), which is also often not suitable for measurements at low or high temperatures. Therefore, visual test patterns have gained significant importance. Examples of these patterns are shown in Fig. 12; the scrolling speed can typically be varied over a range of 1–30 pixels per frame. Due to the gray-to-gray response time dependency of LCDs, the background often has some gray shades as well, besides a black and/or white foreground (either text, a single line, or bars). It is essential to note that motion blur is observed in all hold-type displays (active matrix as well as AC memory type PDPs, which cover over 99 % of the PDPs sold), regardless of the display's electro-optical switching speed. However, LCD response times larger than about 5 ms for a 16.7 ms frame time increase the loss of sharpness significantly.

Text
The sharpness of characters (and therefore readability) as well as scaling effects can easily be judged by characters of different font sizes on the same screen (Fig. 13). Usually black and white are used, but gray levels and color are also examined.


Fig. 11 Test patterns for shadowing and ghosting


Fig. 12 Test patterns for motion blur by scrolling text (left) and scrolling bar (right). The dotted arrow visualizes the direction of motion (left to right here; the reverse is also possible)


Fig. 13 Examples of visual readability test patterns

Color Reproduction
As visual inspection is more sensitive than measurement devices for, e.g., skin tones and overall color reproduction, real images as shown in Fig. 14 are often used as a final test pattern after gray scale and color have been measured and correctly adjusted. Other test patterns for color reproduction are derived from gray-level patterns as shown in Fig. 9; an example for color testing is to display "colored" gray bars vertically (Fig. 15; gray, R, G, and B are shown; C, M, and Y can also be implemented). Depending on the display, one may notice that not all gray levels can be distinguished.


Fig. 14 Images as examples for testing of color reproduction including skin tones

Fig. 15 Color gray scales for visual evaluation of color performance

Fig. 16 Test patterns for visual judgment of motion performance (motion visualized by the dotted line)

Motion Test
As for skin tones, the visual system often allows quicker and more effective judgment than measurement devices. Figure 16 provides two examples for this: a gray-shaded circle and a moving pendulum. As scrolling text is often used by TV stations, it can also provide a simple evaluation; a more advanced test pattern is shown in Fig. 12 with different scrolling speeds. Besides response time, these moving patterns are very effective for observing artifacts of signal processing (see section "▶ Photometry" of chapters "▶ Spatial Effects" and "▶ Video Processing Tasks").

LCD Flicker
As DC-free driving waveforms are required for LCDs, the luminance can vary slightly when the inversion principle has some nonlinearities. This is not noticeable for well-designed panels, but flicker can be forced by black and white test patterns that are adapted to the inversion method. Frame-, row-, column-, pixel-, and dot-inversion principles can be used.



Fig. 17 LCD flicker due to polarity inversion (left) can be forced by dedicated test patterns (right) corresponding to inversion driving schemes

However, dot inversion (a dot is a subpixel) leads to the best results. Figure 17 visualizes on the left-hand side the polarity of the dots relative to Vcom (see chapter "▶ Active Matrix Driving"). If a pattern (right) is displayed which has the highest gray level on the "+" dots and black on the other ones ("−"), the image tends to flicker. The inversion method of an LCD can therefore be evaluated visually by sequentially stepping through the various inversion-dedicated test patterns. It is far beyond the scope of this chapter to explain the reason in detail. The consequence for applications is that one should use only plain gray levels for boxes and not dithering methods (as in printing), which can correspond to the inversion method applied.

Summary and Directions of Future Research
Many standards and test patterns exist for various demands and requirements. We distinguish between images for measurement and for visual inspection. Which method is the better one depends on the measurement task. For instance, gray-level gamma curves are easy to measure but somewhat difficult to judge by vision. On the other hand, grayscale bars with square insets of slightly different digital gray levels are faster and easier for an observer to evaluate. However, it must be stated that many display applications, like industrial gauge meters, are seldom measured according to standards. As digital image data processing and transmission become more and more widespread, future research is heading toward image quality in these systems; further information can be found in, for example, Wu and Rao (2006) and Berbecel (2003).

Further Reading
American National Standards Institute ANSI/HFS 100 (1988) American National Standard for human factors engineering of visual display terminal workstations. www.ansi.org
Berbecel G (2003) Digital image display. Algorithms and implementation. Wiley, Chichester
Deutsches Institut für Normung e.V. DIN 66234-1-8 (1980) VDU workstations. www.din.de
Downen PA (2005) A review of popular FPD measurement standards. In: SID'05 ADEAC, vol 2, pp 5–8


Electronics Industries Association EIA TEP 105 (1981) Industrial cathode-ray tube test methods. www.eia.org
Electronics Industries Association EIA TEP 105-17 (1990) MTF test method for monochrome CRT display systems. www.eia.org
Electronics Industries Association EIA TEP 116-C (1993) Optical characteristics of cathode-ray tube screens. www.eia.org
Engeldrum PG (2004) A theory of image quality: the image quality circle. J Imag Sci Technol 48(5):447–457
Gray JE, Lisk KG, Haddick DH, Harschbarger JH, Oosterhof A, Schwenker R (1985) Test pattern for video displays and hard-copy cameras. Radiology 154(2):519–527
International Electrotechnical Commission IEC 60441 (1974) Photometric and colorimetric methods of measurement of the light emitted by a cathode-ray tube screen. www.webstore.iec.ch
International Telecommunication Union Recommendation BT.709-5 (2002) Parameter values for the HDTV standards for production and international programme exchange, Geneva. www.itu.int
Keller PA (1997a) Display and related standards, Chap 11. In: Electronic display measurement: concepts, techniques and instrumentation. Wiley, New York
Keller PA (1997b) Electronic display measurement: concepts, techniques and instrumentation. Wiley, New York
National Electrical Manufacturers Association PS 3.14-2003 (2003) Digital imaging and communications in medicine (DICOM). Part 14: grayscale standard display function. National Electrical Manufacturers Association, Rosslyn. www.medical.nema.org
Video Electronics Standards Association VESA-2005-5 (2005) Flat panel display measurements FPDM standard version 2.0. www.vesa.org
Wu HR, Rao KR (2006) Digital video image quality and perceptual coding. Taylor & Francis, Boca Raton


Measurement Devices
Karlheinz Blankenbach*
Display Lab, Pforzheim University, Pforzheim, Germany

Abstract
As there is a huge variety of measurement procedures, a correspondingly wide range of measurement devices must exist. They have to cover, for example, luminance, color, viewing angle, response time, and illuminance measurement tasks. In this chapter, the most relevant devices and systems are presented and compared (where several approaches exist). The major source of error is the accuracy of the match of the device to the eye sensitivity V(λ) and/or the color-matching functions (CMFs). As devices differ somewhat from manufacturer to manufacturer, absolute comparability is limited.

List of Abbreviations
AC Alternating current
ADC Analog-to-digital converter
CCD Charge-coupled device
CCFL Cold cathode fluorescent lamp
CIE Commission Internationale de l'Éclairage
CMF Color-matching functions
CMOS Complementary metal-oxide-semiconductor
CR Contrast ratio
CRT Cathode ray tube
FOV Field of view
fps Frames per second
IC Integrated circuit
LCD Liquid crystal display
LED Light emitting diode
MPRT Motion picture response time
OLED Organic light emitting diode
PC Personal computer
PDP Plasma display panel
PM Passive matrix
RGB Red, green, blue
RMS Root mean square
TV Television
µC Microcontroller

*Email: [email protected]


Introduction
As human observers can only compare neighboring displays or patterns and describe the effects in words (the exception being the use of special patterns to determine resolution or gray levels), it is essential to rely on measurement devices to capture and specify the performance of electronic displays. In this chapter, the fundamentals of optical display measurement devices are described and discussed (for the fundamentals, see Grum and Bartleson 1984; McCluney 1994, Chap. 7.3.3.3, Sect. 3). The main topics are luminance and color measurement devices (including viewing angle dependency) with either spot or area detectors, response time, and illuminance meters. With such devices, we are able to perform the display measurement procedures described in the previous Parts "▶ Standard Measurement Procedures" and "▶ Advanced Measurement Procedures" in combination with the test patterns presented in chapter "▶ Display Technology-Dependent Issues." Additional and valuable information regarding measurement devices can be found in chapters "▶ Measurement Instrumentation and Calibration Standards" and "▶ Overview of the Photometric Characterization of Visual Displays." Before using these instruments, they must be calibrated, and the fundamentals of measurement must be respected.
There are basically three different principles for luminance and color measurements, as shown in Fig. 1: devices equipped with a single detector (including a triple sensor for colorimetric measurements), such as spot meters, which integrate over many pixels (typically 25–500 pixels); area detectors (calibrated cameras, see, e.g., Kelley 1999) that can measure an entire screen in one shot; and spectral measurement devices (see section "Color Measurement Devices: Spot Detectors"). Single detectors usually have a higher accuracy than camera-based systems. The latter are, however, more suitable for uniformity measurements or luminance profiles, but Moiré effects have to be avoided, e.g., by slightly defocusing. If a display emits polarized light, as LCDs and OLEDs do, one has to be aware of the potential sensitivity of the light detector to the orientation of the polarization axis. We can furthermore distinguish between two designs of single detector devices: the first type has a hood (shield) to limit the acceptance area on the display and to block ambient light (see Fig. 2, left). This allows one to perform "dark room" measurements "anywhere," which is very helpful when verifying specifications (nearly all measurements provided in specifications refer to zero ambient illumination, i.e., 0 lx) in the application environment (outside the lab). However, because such devices usually have no optics and a large acceptance angle (at distant geometry), measurements to determine the influence of ambient light or viewing angle are not possible. When testing LCDs, any pressure on the display should be avoided, because a change in the cell gap of an LCD can change its optical properties. In all cases, the calibration of the instruments used is essential; some details are described in Sect. 5 of chapter "▶ Measurement Instrumentation and Calibration Standards," and errors that can occur are covered in Sect. 10 of chapter "▶ Overview of the Photometric Characterization of Visual Displays." In contrast to devices with a hood, single detectors with optics (Fig. 2, right), as well as cameras, need dark room conditions to verify specifications.
However, measurements at the display location with ambient light, and viewing angle dependencies, are only possible with optics-based devices. The acceptance angle (field of view) should be 2°, according to the CIE standard observer; the viewfinder typically has a wider angle. As for all optical systems, stray light and veiling glare (internal reflections, e.g., off lenses) are an issue, for camera-based systems as well: imagine, for example, a black centered box to be measured which is surrounded by bright white – masks or shields should therefore be used (see below). Many of these devices are equipped with an optical viewfinder to aim precisely at the area to be measured. The focal length of optical devices has to be adjusted; at short distances from the display, close-up lenses are used.


[Fig. 1 hierarchy: luminance and color measurement devices → single detector (hood or optics), calibrated camera, or spectrum (monochromator or spectrometer)]

Fig. 1 Overview of measurement principles for luminance and color measurements


Fig. 2 Luminance meter (with single detector) principles of hood (left) and optics (right)

[Fig. 3 signal chain: light → filter → detector → analogue electronics → ADC → digital electronics → display/interface]

Fig. 3 Block diagram of luminance meter

Luminance Measurement Devices
In this section, the most important components of luminance meters and the issues relating to their use are presented. Basically, the block diagram shown in Fig. 3 applies to both hood and spot meters: light falls through a dedicated filter (which adjusts the spectral sensitivity of the semiconductor detector to, e.g., V(λ); see below) onto an optical (or photo) detector (for the fundamentals see, e.g., Kasap 2001), which transforms light intensity into a voltage or current. The detector of a single sensor device is, for example, a photodiode, while camera systems are equipped with a segmented chip having potentially millions of light-sensitive pixels. Both principles can be extended to color measurements (see section "Color Measurement Devices: Spot Detectors"). The output of the detector must be adjusted to the input range of the ADC for optimum conversion results (only digital devices are described here). This has to be done very carefully when measurements below 0.5 cd/m² for black and over 1,000 cd/m² for white are to be made with the same instrument. In the digital electronics section, a microcontroller performs all the necessary procedures, ranging from zero adjustment and calibration (see Sect. 5 of chapter "▶ Measurement Instrumentation and Calibration Standards") to visualizing the data on a display and enabling communications with a PC for data logging. The performance of all components from filter to microcontroller influences the precision of the instrument.

Spectral Sensitivity
Electronic detectors like photodiodes and camera chips have spectral sensitivities like those shown in Fig. 4, with the maximum typically in the infrared region. But when measuring luminance (and also color), the sensor sensitivity has to be matched by dedicated filters to achieve the spectral sensitivity of the human eye (photopic, V(λ) = CIE standard observer; see also Sect. 2 of chapter "▶ Light Emission and Photometry" and Sect. 2 of chapter "▶ Measurement Instrumentation and Calibration Standards").


Fig. 4 Spectral sensitivity and emission of receivers and emitters, with the CIE standard observer V(λ)

The absolute accuracy of luminance (and color) measurement devices therefore depends strongly on the deviations of the filter–detector curve from V(λ) (or the CMF ȳ), and this may vary from manufacturer to manufacturer, and even between different models from the same manufacturer. The absolute accuracy, and consequently the measurement uncertainty, also strongly depends on the spectral characteristics of both display and detector. Some examples illustrate this:
• Let us assume a deviation from V(λ) in the range of 500–550 nm. If we measure a red LED or a red color on a display (both with a center wavelength of, e.g., 620 nm and a half width of 30 nm), the error will be zero. Doing the same with an LED with a center wavelength of 525 nm would cause a relatively large error.
• CRTs typically have a relatively high peak at about 700 nm compared to blue or green. In this region, photodiodes are highly sensitive. If the filter has some mismatch there, these deviations result in a large error of the absolute luminance (and color). The same is true at 400 nm and for deviations in the steep flanks of the V(λ) curve.
The mismatch between observer sensitivity and many optoelectronic detectors can be demonstrated by capturing the light output of IR remote controls or IrDA interfaces: the IR diode becomes "visible" to devices such as digital CMOS-based cameras. This can also be used as a simple test of the basic function of IR devices. Therefore, one has to explicitly check the IR sensitivity of the measurement device. Furthermore, any light from sources other than the display has to be avoided for dark room measurements. Because of these sources of deviation (errors), display producers usually state explicitly the measurement device which has been used for the evaluation of the parameters in their specifications.

Evaluation of Luminance Meters and Sources of Error
As mentioned before, there are many pitfalls to be avoided when carrying out luminance (and color) measurements. Here we focus on topics besides the spectral sensitivity effects discussed above, such as stray light and polarization. Calibration and other fundamentals can be found in, e.g., Commission Internationale de l'Éclairage (1987) and Chap. 10 of Keller (1997).


Fig. 5 Test patterns (left) and typical results (right) of stray light and veiling glare measurements of meters with optics


Fig. 6 Test for stray light and veiling glare with a mask to characterize the influence of areas neighboring the measurement area

Optical Performance and Stray Light Issues
Light from outside the intended measurement area, defined by the FOV of the meter, can result in relatively large errors. Stray light from content displayed outside the measurement area (FOV) can affect the measured luminance and should therefore be checked before performing box pattern and checkerboard tests. The results are plotted as luminance over the area portion relative to the total screen area, as in the example presented in Fig. 5; the basic measurement procedure is described in Sect. 2.1.3 of chapter "▶ Spatial Effects." The "loading" and "halation" effects (Boynton and Kelley 1998) are caused by stray light (e.g., Coleman 1947; also known as veiling glare) from areas outside the (projected) FOV spot on the screen via internal reflections, e.g., off lenses, and can influence the measurements, especially for centered box procedures. These effects occur for both spot detectors and camera-based devices. For their evaluation, one has to ensure that the display's luminance is constant (independent of screen content); the effects can also be measured using paper printed with these test patterns. The consequence is that measurements like those with a centered box have to be undertaken carefully with respect to minimizing stray light and veiling glare issues, as plotted in Fig. 5. The error for loading (white box with area AW) can be about 5 %, and over 100 % for halation (black box with area AK). An ideal optical device should indicate a constant luminance reading which is independent of its surroundings. Another test for stray light sensitivity uses a black mask (Fig. 6) in order to eliminate the influence of the display. An improvement is to use a cone-shaped frustum mask (Fig. 7; see also, e.g., Boynton and Kelley 1997), which stops rear reflections from the display passing into the optics of the meter.
Polarization Sensitivity
As the light emitted from displays such as LCDs or OLEDs (circular polarizer due to the metal electrode) is polarized, a detector should not have any orientational dependency on the direction of polarization. A simple but common test is to turn a polarizer in front of the detector with light from an incandescent bulb or reflections from white paper – the indicated luminance should be constant.


Fig. 7 Reduction of stray light reflections of a mask (left) by a cone mask (right, also known as a frustum) in side view configuration


Fig. 8 Visualization of integration of pulsed light output to mean value by luminance meter

Another possibility is to turn the detector in front of an LCD. The results of such measurements are plotted, e.g., in a polar diagram, or given as a mean luminance value with deviation.
Time Domain Tasks
Another challenge for both detector and electronics is the waveform of the luminance in the time domain. Many displays like PM OLEDs, CRTs, and PDPs do not emit light continuously: time-resolved measurements, like those for response time, show that the luminance signal more often resembles pulses like those shown in Fig. 8 for a typical CRT output. This light output results in a visual sensation proportional to the mean value under typical display conditions. The integration time of the detector system should be significantly larger than the frame time. This is especially an issue for standard cameras with 50 or 60 Hz frame frequency – temporal aliasing can occur. An example will highlight the requirements of the time domain task: a PM OLED with 96 lines and a specified (mean) luminance of 100 cd/m² running at 60 Hz emits a peak luminance of 9,600 cd/m² for about 175 µs per pixel. Neighboring lines result in a superposed waveform with high-frequency modulation and large amplitude variations. No instrumental range adjustment can be made in such a short time, and four orders of magnitude and more have to be covered. In order to obtain this average luminance, several techniques are possible:
• The analog path method, using, e.g., a detector with limited bandwidth (which acts as a first low-pass filter) and an analog low-pass filter. The AD conversion and the microcontroller software can then be relatively slow.
• Use of a high-speed detector and a high-speed AC-to-RMS IC whose output can be converted relatively slowly, or a fast ADC with fast µC transmission and calculation of the mean value in software.
Figure 9 visualizes the major tasks and challenges for luminance measurements over many decades of time. In the µs and ms range, displays (red) emit light which is measured (integration time) and converted to luminance (blue) about every second. This has to be performed with high precision for years, for example, in the case of lifetime measurements.
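The numbers of the PM OLED example can be verified with a few lines of Python (a sketch; the three input values are taken from the text):

```python
# Back-of-envelope check of the PM OLED example above: 96 lines driven at
# 60 Hz with a specified mean luminance of 100 cd/m^2.
lines = 96
frame_rate = 60            # Hz
mean_luminance = 100.0     # cd/m^2

duty_cycle = 1 / lines                        # each line emits 1/96 of the time
peak_luminance = mean_luminance / duty_cycle  # -> 9,600 cd/m^2
on_time_s = 1 / (frame_rate * lines)          # -> ~174 us per line and frame

print(f"peak = {peak_luminance:.0f} cd/m^2, "
      f"on-time = {on_time_s * 1e6:.0f} us")
```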


Fig. 9 Typical characteristics of displays (red) and requirements for meters (blue) over time

Table 1 Principles and typical performance of spot color measurement devices:
• Monochromator – Principle: the spectrum is measured as an intensity for each wavelength with one optical sensor D and a moving grating G. As the grating turns, the wavelengths which fall on the single detector vary over time. The intensity for each wavelength is multiplied by the CMFs and added; a transformation to the desired color system is then done. Pros: very high accuracy. Cons: measurement very slow.
• Array spectrometer – Principle: the incoming light is split by a fixed grating G into a (lateral) spectrum and captured by an optical line detector D in a fixed geometry at one time. The output is multiplied with calibration values and the color-matching functions to give tristimulus values, with further transformation into CIE color spaces. Pros: fast and moderate in price. Cons: limited accuracy at low light levels.
• Colorimeter – Principle: the light input is measured with three sensors with dedicated filters according to the color-matching functions as X, Y, Z at one time (for details, see below, including Fig. 11). Coordinates are calculated according to the color system formulas. Pros: fast and inexpensive. Cons: limited accuracy (depends on fit to the color-matching functions).

Color Measurement Devices: Spot Detectors
For color measurements (see, e.g., Hunt 1992; American Society for Testing and Materials 1996), the same basic rules apply as for luminance. Again, spot detectors (described here) and area imagers (see Kelley 1999) can fulfill the task. There are three different ways that color can be captured by spot detectors: monochromator (for details see Sect. 4 of chapter "▶ Measurement Instrumentation and Calibration Standards" and, e.g., Chap. 8.12.2 in McCluney 1994), array spectrometer, and colorimeter (see below). They are compared in Table 1 and visualized in Fig. 10. Measurements are relatively time-consuming if the spectrum is measured with high precision by a scanning monochromator; fast acquisition, though with limited accuracy, is provided by a colorimeter (described below). A reasonable compromise is an array spectrometer, which captures a complete spectrum in one shot with a sensor array (the wavelength range divided by the number of pixels equals the wavelength resolution) – the accuracy is relatively high and the acquisition time low. Monochromators and spectrometers are usually connected to a PC to perform color space calculations and transformations into, e.g., CIE 1976 UCS. Some battery-powered spectrometers with a built-in processor allow mobile acquisition; high-priced ones have optics for targeting the display under test. Compared to those devices, a colorimeter is relatively simple (see below).

Principle of Colorimeters
Fast acquisition, low price, and stand-alone lightweight operation are the main advantages of colorimeters. Such a system is typically built like a single detector luminance meter (see section "Luminance Measurement Devices" and the block diagram in Fig. 3). They measure light with three sensors with matched filters (see Fig. 11), so that the resulting spectral sensitivity of each corresponds to one color-matching function (see chapter "▶ The CIE System" and Sect. 2 of chapter "▶ Color"). The outputs are amplified, and the tristimulus values are obtained without further calculations; the transformation to color spaces requires only little computing power. The accuracy of these colorimeters, however, depends strongly on the compliance of their spectral sensitivity with the CMFs.



Fig. 10 Basic principles of color measurement devices. Scanning monochromator (left), array spectrometer (center), and colorimeter (right, only IC; for details see Fig. 11); for explanations see text and Table 1

[Fig. 11 schematic: spectrum × CMF-corrected photodiodes = tristimulus values (e.g., X = 10, Y = 20, Z = 60)]

Fig. 11 Schematic of color measurements by a colorimeter

To keep the costs low, the left (short-wavelength) peak of the x color-matching function is often approximated using z (e.g., x′ = 0.2 · z + x), so that each photodiode has to be matched to only one spectral sensitivity. A major source of error is that the spectral CMF sensitivity is, in total, not very accurate. In consequence, many colorimeters can be "calibrated" to various display types like LEDs, CCFLs, LCDs, CRTs, and incandescent bulbs. This is, however, only helpful for measurements within one "family" of displays, not for absolute comparisons. The colorimeter principle is also applied in camera detectors, which capture images in a short time for further analysis of color via dedicated software. More information about colorimeters can be found in American Society for Testing and Materials (1996).
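The underlying computation, weighting a spectrum by the three color-matching functions to obtain X, Y, Z, can be sketched as follows; the Gaussian stand-ins below are crude placeholders for the real tabulated CIE 1931 CMFs and the display spectrum, used only to keep the example self-contained and runnable.

```python
import numpy as np

wl = np.arange(380.0, 781.0, 5.0)  # wavelengths in nm

def g(mu, sigma):
    """Crude Gaussian stand-in for a CMF lobe (illustration only)."""
    return np.exp(-0.5 * ((wl - mu) / sigma) ** 2)

# Very rough single-lobe stand-ins for the CIE 1931 CMFs; real tabulated
# data should be used in practice (note that x-bar actually has two lobes,
# which is why colorimeters approximate its short-wave lobe via z-bar).
x_bar = g(600, 40)
y_bar = g(555, 45)
z_bar = g(450, 25)

spectrum = g(530, 30)  # assumed (greenish) display spectrum

# Tristimulus values as wavelength integrals of spectrum x CMF
X = np.trapz(spectrum * x_bar, wl)
Y = np.trapz(spectrum * y_bar, wl)
Z = np.trapz(spectrum * z_bar, wl)

x, y = X / (X + Y + Z), Y / (X + Y + Z)  # chromaticity coordinates
print(f"x = {x:.3f}, y = {y:.3f}")
```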

Evaluation of the Performance of Color Measuring Systems/Sources of Errors
As for every measurement device, the accuracy should be known. There are several ways of evaluating this for color measurements, and various tests for errors and uncertainties, as briefly outlined here:
• Compare the actual device to a calibrated device.
• Most of the sources of error for luminance meters (see section "Luminance Measurement Devices") also apply to color measurement devices.
• Some devices, especially colorimeters, can be calibrated by factors for the tristimulus values, e.g., X_corr = k · X_measured. However, such a setting is only applicable to displays with the same (or very similar) spectral characteristics as the spectrum used for calibration.
• Measure RGB and CMY color coordinates and check whether the CMY points lie on the lines which connect the RGB coordinates (see Fig. 12). The excerpt on the right-hand side shows the measured coordinate for magenta (circle) near the line connecting the coordinates of blue (B) and red (R). The dashed lines represent the specified tolerance of the measurement device; consequently, the color locus for magenta is within this limit.



Fig. 12 Example of an accuracy measurement procedure with primary and secondary colors (left) and magnified excerpt (right) for tolerance evaluation

Fig. 13 Basic (goniometric) 2D viewing angle measurement setups: fixed display and moveable sensor (left) and vice versa (right)

• The accuracy of the reproduction of the color-matching functions can be tested with monochromatic light: use, for example, a monochromator output as the light source and step through the visible wavelengths. Determine the color coordinates for each wavelength (e.g., in steps of 1 nm) and compare them to the ideal curve (e.g., the horseshoe-like curve for CIE 1931). All deviations also influence the accuracy of color measurements of other light sources, e.g., those with a large half width or several peaks.
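As a hedged sketch of the RGB/CMY line test above (see the fourth bullet and Fig. 12), the following Python snippet checks whether a measured secondary color lies within a given tolerance of the line joining two primaries in CIE 1976 UCS; all coordinates and the tolerance are assumed example values.

```python
import math

def dist_to_line(p, a, b):
    """Perpendicular distance of point p from the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((by - ay) * px - (bx - ax) * py + bx * ay - by * ax)
    return num / math.hypot(bx - ax, by - ay)

blue = (0.175, 0.158)     # assumed u'v' of the blue primary
red = (0.451, 0.523)      # assumed u'v' of the red primary
magenta = (0.310, 0.330)  # measured u'v' of magenta (assumed)

tolerance = 0.005         # device tolerance in u'v' units (assumed)
d = dist_to_line(magenta, blue, red)
print(f"distance = {d:.4f}",
      "OK" if d <= tolerance else "out of tolerance")
```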

Viewing Angle Measurements
As viewing angle measurements typically consist of a luminance or color measurement at different angles, light-capturing devices as described before are combined with turntables in goniometric setups, either in 2D (usually horizontal or vertical) or in 3D (half sphere). A small field of view […]

[…] >100 ppi pixel densities in the living room.
2. Stereoscopic displays will demand higher-than-"Retina" performance. For us to effectively view the world in 3D, we will need even more pixels. Adding depth to an image increases our demand for enhanced image quality.
3. HDTV is just the tip of the iceberg. It has been less than 3 years since the industry had serious debates suggesting that 720p was entirely suitable; that 1,080p was just a waste of pixels. You can still occasionally read commentaries about how Blu-ray adds little over DVD. The "science" used to justify these claims seems sorely lacking, because it is truly the rare individual that cannot discern enormous differences, even at considerable viewing distances, between devices with different pixel densities. And this differentiation has not yet come close to the point of diminishing returns – we will certainly see more and more Quad-HD (3,840 × 2,160), and then UHD (7,680 × 4,320). As always, bandwidth will be an issue, but do not pay attention to the market "experts" who try to tell you that today's Blu-ray is perfectly adequate. It is not.
4. Color does not matter. OK. Color does matter, but it is not a question of how many colors – it is the range of colors that can be displayed on a screen. A display that shows one trillion different colors is meaningless if all of them are a different shade of blue. Likewise, a display that covers "90 % of the CIE curve" is meaningless if it skips one key color group. Since the human visual system cannot distinguish anywhere close to one billion colors, let alone one trillion colors, the industry's current focus on bit count needs to be displaced with a focus on extending the range of colors that are reliably depicted on a display.
5. Multicolor primary displays. RGB stripes within a square box have been a great way to create flat-panel displays to this point. Moving forward, there is little question that we will increasingly see multicolor primary displays that utilize a variety of sub-pixel structures. Things like RGBW, RGBY, the PenTile Matrix, and a broad array of novel pixel structures will increasingly enable performance differentiation and offer improvements to our visual experience.

Touch Panel
When we started the Touch Panel newsletter about 6 years ago, one well-known industry analyst suggested that we would have no subscribers, that the topic was simply too narrow and very "boring": "You just can't make a resistive touch screen seem interesting to anyone but an engineer who has been told their job depends on creating a touch-based device that almost no one will want." We ignored this pundit, and today the Touch Panel newsletter is the most popular of all our newsletters.
1. Explosion of hybrid touch technology offerings. One of the reasons that so many touch technologies are currently competing for a position in the market is that none of the existing technologies perfectly satisfies the needs of the application. As such, numerous developments are underway to combine more than one touch technology into a single solution – thereby broadening the usage model. For example, the popular projected-capacitive technology used in Apple's iPhone and iPad does not enable pen input and does not function with gloved hands or if the surface is wet. Likewise, the traditional tablet PC requires a stylus and does not allow for finger input. Hybrid solutions that combine digitizers with projected-capacitive touch technologies to enable both pen and finger operability are becoming popular. Another example is the recent emergence of analog multi-touch resistive (AMR, also called hybrid analog-digital). Both are alternatives to projected-capacitive technology that utilize the familiar resistive technology. There also have been recent announcements related to hybrid voltage-sensing and charge-sensing in-cell touch technologies. It is quite predictable that, in the absence of technology breakthroughs that satisfy all user needs, hybrid approaches will continue to be introduced to the market.
2. Haptic feedback. Studies indicate that the human sense of touch is enhanced significantly by both audio and force-feedback cues. Without such extra-sensory feedback, touching a glass-like surface is unappealing (which helps explain the appeal of the sounds we receive from a typewriter and the keystroke responses of a typical computer keyboard). Even the sound of a pencil on a sheet of paper provides feedback cues that are helpful to the user. Most touch technologies fail to provide these natural feedback cues – making it difficult to make many inputs that we are accustomed to making with our fingers or pen-based input devices. As such, it is very likely that the touch-screen market will increasingly include haptic feedback technologies. Such solutions have been demonstrated to improve both the user's experience and the efficiency of the touch technology. Haptics have clearly been demonstrated to speed up touch recognition, reduce user errors, improve safety in mission-critical applications, and increase touch confidence in distractive environments. Moreover, haptics can help reduce screen size in applications that demand small displays – by enabling a confirming feedback cue in a small space. There are several haptic technologies competing for a share of this growing market, and there will be a bit of a battle to identify the best haptic solutions for the future.
3. Non-touch interactivity. The popularity of Nintendo's Wii has demonstrated a need for enhanced motion recognition and digital interaction with display devices. Both Sony (with its newly released Move) and Microsoft (with Kinect) have signaled a substantial response to the Wii – enabling much more sophisticated interactive capabilities. We will almost certainly see these sorts of gestural solutions gain favor in the home and the workplace – ultimately replacing the traditional remote control, and perhaps even making inroads into the mouse market.
4. Indirect touch solutions. The notion of "touch-screen" technology predisposes one to consider touch technologies that directly address the surface of the display. But there are many surfaces besides the front of the screen that can be utilized to manipulate data on the screen. Consider the backside of a smart phone. Rather than obscuring the images on the display with your fingers, the touch interaction could be easily shifted to the back surface of the phone – functioning to some extent like a mouse. Many of the issues associated with fingerprints, transmissivity, scratch resistance, etc., disappear when a different surface is used. Or consider the dashboard in an automobile, including sound system controls, air controls, GPS, etc., all of which require some level of reaching out and touching a button or knob to adjust the settings. There is no reason that these distracting touch-based interactions cannot all take place on the steering column using indirect touch solutions. Such examples of indirect touch are likely to expand even faster than solutions in which the user directly touches the display surface.
5. Interaction with 3D displays. Stereoscopic 3D display technologies have recently gained mass market attention, particularly in the TV space, but also in various PC announcements. One of the biggest challenges associated with 3D displays is that most user interface technologies (including touch screens) register in only x/y space. Manipulating images in 3D space has not been developed in concert with the emergence of the 3D display market. Although 3D mice and camera-based solutions have been developed to recognize user inputs in 3D space, the technology is still in its infancy. It is predictable that in the coming years we will see more and more developments related to interacting in 3D space – across all applications.


Summary

As a practical matter, it is impossible to consistently predict the future with any meaningful accuracy, even based on the rather broad generalizations highlighted in this chapter. Nevertheless, some of these prognostications, and perhaps even most of them, will surely hit the target. One thing is for sure – in the world of displays, betting against the overall capabilities and demands of the human visual system is likely to be a bad bet. Whatever display you use today will be eclipsed in terms of performance by the display you use tomorrow.

Acknowledgment The second part of this chapter is adapted from Fihn (2011), published in Information Display magazine, May/June 2011. The author and publishers thank the editors and the Society for Information Display for permission to reprint this content.

References

Fihn M (2011) Predicting the future. Inform Disp 27(5). SID
Mentley D (1994) Stanford resources flat information display conference. Stanford Resources, Santa Clara
Texas Instruments Purchasing Department (1998) Internal study

Further Reading

3rd Dimension newsletters, Veritas et Visus
Display Standard newsletters, Veritas et Visus
Flexible Substrate newsletters, Veritas et Visus
High Resolution newsletters, Veritas et Visus
Touch Panel newsletters, Veritas et Visus


Display Market Forecasting

Ross Young

Contents

Fundamentals of Display Market Forecasting: Establishing Credibility Through Methodology . . . . 2
Displays: One of the Easiest Markets to Forecast . . . . 3
Lesson #1: Follow the Money . . . . 4
Lesson #2: Supply Can Determine Demand . . . . 5
Predictions . . . . 6
Summary . . . . 8

Abstract

This chapter examines the fundamentals of display market forecasting from one of the pioneers of this subject, Ross Young, founder and former CEO of DisplaySearch. It is his contention that displays are one of the easiest supply chains to track and forecast. In addition to a number of lessons he learned from tracking and forecasting the display market, he also provides his latest predictions about a number of important segments.

List of Abbreviations

ASP      Average selling price
FED      Field emission display
LTPS     Low-temperature polysilicon
OEM      Original equipment manufacturer
OLED     Organic light emitting diode
PV       Photovoltaic
SKU      Stock-keeping unit
TFT LCD  Thin film transistor liquid crystal display

R. Young (*) SVP Displays, LEDs and Lighting, IMS Research, Austin, TX, USA
e-mail: [email protected]
# Springer-Verlag Berlin Heidelberg 2015
J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_153-2

Fundamentals of Display Market Forecasting: Establishing Credibility Through Methodology

As a newcomer to any segment, the most important ingredient for success in market research is establishing credibility. Credibility can be established by hiring industry experts with extensive contacts or through methodology. At DisplaySearch, we tried to bring on both experienced industry experts with extensive contacts and veteran market researchers who had developed valuable forecasting methodologies. We hoped that combining market expertise with extensive contacts and innovative methodologies would lead to the most accurate data in the industry.
In terms of methodology, credibility can be established by providing a means to verify or check the data you are reporting. If company market share is provided, companies have every incentive to make their data look as good as possible and exaggerate the data they are providing. Because the display industry is marked by a supply chain with numerous inputs provided by other companies, particularly in the early days of the industry, a researcher could use the supply chain to advantage by tracking the inputs from each layer of the supply chain to the next, thereby verifying the data collected and provided. The ability to report on the entire supply chain also expands your target market.
Reporting on the entire supply chain was a big focus at DisplaySearch and was a key ingredient in our success. Our competitors were not as focused on tracking the shipments of the different equipment and materials used by the fabs, or the panel-to-OEM-to-brand shipments, which we focused on. As a result, we had better visibility into what was happening in the market and what was going to happen in the future. We found that panel customers were looking for unbiased information on their panel suppliers, panel suppliers were looking for the same on their customers, and component suppliers were looking for insights on their customers as well as their customers' customers.
This approach does have limitations, however, particularly for consumer products. If you can only see what is being shipped, but not what is selling through, your near-term forecasts, which are of greater value to brands, OEMs, and retailers, are at risk. If products aren't selling through, they will sit in inventory, and the next month's shipments will be diminished. Brands and retailers may slash prices to move the inventory, which puts your ASP and revenue forecasts at risk. If a research company doesn't have access to sell-through data, which only a handful of research companies provide, its near-term forecasts should be taken with a grain of salt.
Another benefit of sell-through data is SKU-by-SKU reporting, which allows researchers to determine which features are selling well and which aren't. It allows brands to determine if these features are priced too high, too low, or just right. Sell-through tracking also allows brands and researchers to determine the impact of promotional efforts, enabling an understanding of price elasticity for their products as well as their competitors'. Interestingly, when multiple companies are hungry for market share, this type of information can also lead to price wars, as we have seen in the TV industry between Sony and Samsung.
Despite the clear value of combining global supply chain tracking with global sell-through data, no company is actually providing this information. DisplayBank, DisplaySearch, and iSuppli offer global supply chain tracking to some degree, but no one company offers global sell-through tracking. GfK offers sell-through tracking for most of the world, excluding North America, while NPD, which acquired DisplaySearch, offers sell-through coverage for North America. Given that most of the growth for flat panel products will come from outside of North America, I would hope to see GfK working with a supply chain research firm to provide global sell-in and sell-through coverage in the near future, as this type of reporting would be critical for helping fab managers and TV and IT OEMs to better run their factories and minimize inventory growth.

Displays: One of the Easiest Markets to Forecast

When I look back at my experience in covering the display industry and compare it to a number of other segments that I have covered since leaving DisplaySearch, such as PV and LEDs, displays have been and continue to be a far easier segment to cover. In fact, I always expected one of the larger research firms, such as Gartner or IDC, to come in and take significant market share as we were having so much success. For that reason, we tried to make our reports as detailed and frequent as possible, raising the bar and requiring our larger potential competitors to make a significant commitment to this space in order to be successful. However, neither company ever made a significant commitment, enabling the existing players to maintain a high share. Why do I believe displays are easy to cover?
• There are relatively few display manufacturers. In the large-area space, which dominates the total LCD market, just five manufacturers account for most of the market. This compares quite favorably with PV, where there are over 200 cell manufacturers, and with LEDs, where there are over 70 manufacturers. It therefore requires less time and manpower to track the supply side of the market in displays than in other segments. At DisplaySearch, we made sure we had personnel near all the leading display manufacturers.
• Display manufacturers want to benchmark themselves against the competition. The leading LCD manufacturers have always been eager to understand their position versus their competitors, which requires them to disclose information about their shipment results to research firms. In addition, they want to see this information updated as frequently as possible to gain as much insight as they can into how the market is changing. Furthermore, all of the top manufacturers are publicly traded, which means they typically release their revenue figures, which helps research firms in tracking the market. This is not always the case in a number of other segments, which are marked by privately held companies as well as companies with unusually strong positions who fear revealing too much about their operations to their competitors via research firms. In addition, companies in more mature segments are satisfied with annual reporting, reducing the opportunity for market research firms. Thus, to the delight of display market research firms, display manufacturers are hungry for market data and are willing survey participants.
• Supply chain coverage allows for enhanced estimates and greater participation. In segments such as displays, which require numerous component inputs, a company's panel shipments can be estimated relatively accurately without ever actually receiving inputs from the panel manufacturer itself. When we first showed this ability to the panel manufacturers, they were surprised, and we convinced them that they might as well give us more reliable information, as we were going to make a reasonably accurate estimate of their data anyway.
• With relatively few exceptions, panel suppliers dictate the market evolution. In most cases, what panel suppliers put on their roadmaps become the dominant trends in the industry, making it relatively easy to forecast the evolution of the display industry. Most OEMs and brands simply take what the panel suppliers put on their roadmaps, resulting in fewer truly differentiated display products. The exception to this is Apple, which has started numerous trends and implemented a number of new technologies which required panel suppliers to agree to an exclusivity period for some time, as seen in its notebook, monitor, phone, and tablet products. Samsung and Sony have also had success in launching unique TVs which have remained differentiated for an extended period.
• Customers reward greater reporting frequency. Given the greater tactical value of data provided more frequently, display customers have shown that they are willing to pay more for content provided more frequently. As a result, focusing on a particular segment and establishing leadership in this segment was quickly rewarded.
• Following panel suppliers to new applications can be richly rewarded. Unlike a number of technologies, displays have moved into one application after another, and panel manufacturers have been hungry for expertise in these emerging segments. Hiring or quickly developing expertise in these segments as panel manufacturers need this information was a critical formula for success.

For all of these reasons, the display industry has proven to be a much easier market to cover than other components.

Lesson #1: Follow the Money

I first began tracking and forecasting the flat panel display market in 1994 while serving as a product manager at automation supplier Brooks Automation. After attending my first Society for Information Display (SID) event, I wrongly concluded that because there were many different display technologies and no one technology met the needs of all existing applications, these different technologies would coexist for some time. This assumption was incorrect because tens of billions of dollars were invested into expanding the capacity and improving the performance of amorphous silicon (a-Si) TFT LCDs, which quickly became the dominant display technology. It wasn't just the display manufacturers which made TFT LCDs the dominant display technology. It was also the result of large investments and technical breakthroughs from manufacturing equipment and materials suppliers. This massive supply chain was not going to be stopped. This provided one of the first lessons of display market forecasting: "follow the money." If one had simply forecasted the display market based on capital spending, they likely would have reached the right conclusion that TFT LCDs would dominate.
Fortunately, I quickly learned this lesson. I can recall a few years later when field emission displays (FEDs) were the buzz in the industry and we had to not only forecast this market but also decide how many resources to apply to tracking FEDs. I made the decision to keep our scarce resources on TFT LCDs and to conservatively forecast the outlook for FEDs, which failed to be successfully commercialized despite the significant capital behind them. There was just too much investment in the ground and forthcoming in TFT LCDs, which were improving at a rapid rate with costs rapidly declining. We took the same approach with every other non-TFT LCD technology, including carbon nanotubes, low-temperature polysilicon TFT LCDs, MEMS, OLEDs, plasma, and more.
It would be very difficult for an alternative technology to match the cost curve and performance improvements of TFT LCDs. In addition, alternative technologies may never have the benefit of an application such as notebook PCs, which were willing to pay a premium for TFT LCDs because no suitable alternative existed at the time for color, video-rate, low-power displays. The volume from the notebook market also enabled TFT LCDs to dramatically reduce their costs. Unfortunately for competing display technologies, a-Si TFT LCDs will play the role of viable alternative, making it difficult for other technologies to justify high premiums. Are there any high-volume applications that are willing to pay a premium for performance or functions that a-Si can't deliver? I am not aware of any applications that fit this requirement, although I imagine the flexible OLED manufacturers believe one may be developed in the near future.

Lesson #2: Supply Can Determine Demand

An interesting lesson from tracking the display market is that the timing and scale of large investments in supply can dramatically alter demand forecasts. The best example of this is in TVs. Based on fabs under construction, it was quite clear to us that TFT LCDs were going to penetrate the TV market very quickly. I remember trying to quantify how many TVs the LCD industry was going to be capable of producing at an InfoComm event in 2003, to the amazement of the RPTV-focused industry. They could not comprehend that despite the continued improvements to their TVs and the growth that they were enjoying, they were going to lose the market to flat panel TVs. Given the amount of capacity being brought online, prices were going to have to fall too fast for RPTVs to maintain their position, and the weak performance of LCDs was going to get better and better. Another interesting example was when Samsung's SW Lee keynoted SID and indicated that 100 M LCD TVs would be sold by a certain time, when all the forecasters were thinking of a much smaller number. He had the advantage of knowing exactly how much capacity there was and when it was going to be brought online, making the prediction a foregone conclusion.

Predictions

• 3D will get more confusing before it becomes commonplace. In 2010, 3D got off to a slower start than most brands predicted. The TVs performed well and were priced in line with high-end 2D TVs. The lack of success can be attributed to a lack of content and the expensive active shutter glasses. Content will come, but consumers still don't like the idea of expensive glasses that require batteries and that only work with their particular branded set. The LCD industry is responding by offering a lower-cost glasses solution through film patterned retarders (FPRs). AUO and LG Display each showcased this technology at FPD International 2010 and have aggressive commercialization plans for 2011. Passive glasses from movie theaters will work on these TVs, and they should be available at under $10. The FPR solution delivers a brighter, more stable 3D image with less ghosting through spatial rather than time multiplexing, but the resolution is cut in half (1920 × 540 rather than 1920 × 1080). The pros and cons are summarized in Table 1. Will consumers be able to understand these differences? Will they accept lower resolution performance? Later in 2011, we may see the active film patterned retarder approach, recently demonstrated by CMI and LG, which does not suffer a decline in resolution and can still use low-cost glasses. In addition, we will continue to see progress in glasses-free 3D, but the prototypes produced so far suggest adoption in TVs is still a long way off. Smaller displays targeting single viewers are more appropriate.


Table 1 Active versus passive glasses 3D comparison

3D glasses               FPR w/ polarized glasses    Shutter glasses
Ghosting performance     ✓
Flicker performance      ✓
Brightness               ✓
Resolution                                           ✓
Viewing angle                                        ✓
2D performance                                       ✓
Cost                     ✓

• Oxide TFTs may have a far-reaching impact on display technology and performance. If sufficient oxide TFT uniformity, stability, and reliability can be achieved, oxide TFTs can have a far-reaching impact on the future of display technology competition and performance. Materials such as ZnO, IGZO, and others have been proposed to replace a-Si in TFT LCD and AMOLED backplanes. In the case of a-Si TFT LCDs, the use of oxide TFTs makes it easier to achieve both large size and high resolution, such as the 70″ 3840 × 2160 display that Samsung showed at FPD International 2010. The higher mobilities of oxide TFTs result in faster transistor switching and smaller transistors, making it easier and less costly to drive large, high-resolution displays. In the case of OLEDs, oxide TFTs would eliminate the need for a low-temperature polysilicon (LTPS) backplane to drive the OLED. While LTPS backplanes achieve sufficient mobility, they suffer from lower yields associated with poor mobility and threshold-voltage uniformity caused by grain boundaries in the polysilicon film inherent in the excimer laser annealing crystallization process. This results in the need for additional compensation circuitry, which further complicates the LTPS TFT backplane and impacts yield and cost. An oxide TFT backplane could level the OLED playing field with LCDs, as the backplane process and costs would be similar between a-Si LCDs and OLEDs. It also eliminates the need for the crystallization step in LTPS, which is extremely slow and difficult to scale. With similar instead of >2× backplane costs, OLEDs should have a cost-of-manufacturing advantage versus TFT LCDs given the elimination of the backlight, color filter, etc. OLEDs would still have to improve the scaling of the frontplane patterning step, currently conducted with shadow masks, and extend blue lifetimes, but there has been good progress in these areas. With a growing number of oxide TFT prototypes being shown at tradeshows, it is increasingly likely that oxide TFTs will be commercialized in the near future, significantly improving the outlook for large-size OLEDs, particularly OLED TVs.
• Transparent displays are coming soon. Another popular prototype at FPD International 2010 was transparent displays. OLEDs have achieved >70 % transparency through the use of transparent oxide materials such as ITO, AZO, and WO. LCDs are able to use edge-lit LED backlights with transparent light guides. Applications for these transparent displays include retail windows, signage, hotel and residential windows, and many more. The day of multifunctional windows which serve as TVs, monitors, phones, solar energy sources, and solid-state lights is not too far away.


Summary

In summary, the display market continues to evolve and grow as costs fall and new applications are created. Fortunately, given the limited number of players, their ability to create and make markets, and their desire to work with leading research firms, it is one of the easiest industries for market research firms to cover.

The Crystal Cycle

David Barnes

Contents

Introduction . . . . 2
The Importance of Display Area . . . . 2
The Importance of Macro Effects . . . . 4
  Market Factors . . . . 4
  Future Supply . . . . 5
The Importance of Supply . . . . 6
  Sentiments . . . . 6
  Prices . . . . 7
  Pivots . . . . 8
The Capacity Acceleration Price Indicator . . . . 8
Conclusion . . . . 9
Further Reading . . . . 10

Abstract

The Crystal Cycle is a business cycle in the AMLCD industry driven by the pace of capital investment in new property, plant, and equipment. Such investments cause surges in the amount of production capacity panel makers must sell in terms of display area. Display panel prices fall rapidly in response to surges in supply but stabilize or rise afterward. This leads to cyclical price and profit fluctuations. This chapter describes ways of explaining and predicting the Crystal Cycle.

D. Barnes (*) Portland, OR, USA
e-mail: [email protected]
# Springer-Verlag Berlin Heidelberg 2015
J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_154-2


Introduction

The term Crystal Cycle means the rise and fall of active-matrix liquid-crystal display (AMLCD) prices over the course of a business cycle. The semiconductor industry has its silicon cycle, so participants in the AMLCD industry thought it only fair that they should coin a special name also. The implication of the term is that prices for some measure of goods fluctuate relative to some level, at some pace, distinct from commercial or seasonal cycles. This chapter describes such elements of the Crystal Cycle and proposes a model for explaining and anticipating their interactions.
Looking back at the assumptions and predictions made in the 2012 edition of this handbook, macroeconomic changes in 2007–2008 had more lasting effects on the Crystal Cycle than was realized in 2010, when the original chapter was written. In hindsight, the mathematical model developed in 2010 remained robust for data through 2007, but the AMLCD industry expanded at a slower pace after 2008. As a result, the reference line for assessing cycle turning points pivots at that point. This edition of the handbook updates the Crystal Cycle model and considers the influence of macroeconomics in more detail.

The Importance of Display Area

The price of goods requires some meaningful relationship to production cost and market value before it can be used to describe business cycles. For example, we seldom read about the average price for an airline ticket. We read instead about the average price per passenger mile. Flying passengers 5,000 miles incurs more cost than flying them 500 miles, and passengers are willing to pay more for flying farther. Unfortunately, we often read about average panel prices despite the fact that such numbers represent neither cost nor value well. If we consider that AMLCDs range in (diagonal) size from less than 2 in. to more than 100 in., we recognize that the largest panels deliver several thousand times more display area than the smallest ones do.
The display area of a flat panel is a key driver of production cost, and it is the most important factor determining its market value. From a cost perspective, the purchase price of most manufacturing inputs increases with area (e.g., glass or optical films), and the consumption of production assets (e.g., tools or facilities) increases with area also. From a consumer perspective, a 30-in. monitor seems more valuable than a 15-in. one, and historical analysis reveals that consumers recognize the correct value intuitively by spending about four times more on the larger display because it provides four times more information area. For such reasons, the proper price per good in the AMLCD industry is measured as dollars per square meter. Other denominators such as dollars per square inch could be used, but production capacity is measured in square meters, and comparisons between capacity, output, and prices are important points of analysis.
There are several reasons that the average price per square meter would change over time. Macroeconomic factors include regional differences in inflation rates, currency exchange rates, and consumer demographics. Commercial factors include new product introductions, taxes or tariffs, and seasonal changes in consumer behavior. The problem with such factors is that they are difficult to model. Though each factor may influence the business cycle, considering all of them would require a model as complicated as the real world. Thus, analysts choose a small number of factors and exclude the rest or make some assumption about their gross influence. The art of modeling is picking a few reasonable factors and remembering that excluded factors can create surprising results in the real world.
In many cases, analysts choose microeconomic factors such as supply or demand to explain prices. The problem with most conversations about price and demand in the AMLCD industry is that potential demand has been far greater than the potential supply, despite decades of capital and technical investments by panel makers. Capital investments in new thin-film transistor (TFT, as in TFT-LCD, synonymous with AMLCD) fabrication plants are only one factor driving supply growth. Incremental capital expenditures, process simplifications, and operational efficiencies have added one or more new plants' worth of capacity each year over the past two decades. As a result, total AMLCD fabrication area capacity doubled or more every 2 years over the decade of 1997–2007. Such expansion was not sufficient to replace incumbent display technologies such as cathode-ray tubes (CRTs, or Braun tubes) or plasma panels in every home or office, however. Rising middle-class and aspirational purchasing power outside rich countries has increased demand for AMLCD-based products faster than capital investments or technical improvements can supply it. This basic constraint may persist as consumer demand for mobile devices with flat panel displays increases. As a result, panel makers have limited productive potential at any point in time relative to potential, unconstrained demand.
From an analytical perspective then, commercial limits on display area demand are arbitrary and subject to pricing. Imagine entering a store offering big LCD TV sets for one dollar each. How many would you buy? Most people would buy more inexpensive sets than they would expensive ones. Measuring demand in such cases is difficult because there is no finite amount independent of price. All the AMLCD factories in the world running as fast as they could would be unable to meet the demand for one dollar TV sets. Supply and demand would be out of balance. Fortunately, markets have a way of bringing supply and demand into balance: price. In this case, suppliers would be rational if they increased prices to the level at which the number of consumers willing to buy TV sets matched the supply. That level is discovered only when suppliers raise prices above the optimal (supply-demand) balance point, so prices will rise and fall in what economists call the process of price discovery.
The machinations of price discovery help explain seasonal cycles. Panel prices often fall after the end-year holidays and rise as retailers prepare for the next holiday season. But what if suppliers add a large amount of capacity before the holidays? Microeconomics would predict that panel prices must fall suddenly if suppliers want to use their new capacity effectively. In such an event, demand would prove greater than expected and prices lower than expected, before considering the effect of supply.
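Since the chapter's preferred metric is dollars per square meter, it may help to see the diagonal-to-area conversion worked out. The following Python sketch is an illustration, not part of the original chapter; the function name and the $500 panel price are assumptions chosen only to show the arithmetic, including the 4× area ratio between the 30-in. and 15-in. monitors discussed above.

```python
# Illustrative sketch: converting a panel's diagonal size and aspect ratio
# into display area, the denominator the author argues prices should use.
import math

def panel_area_m2(diagonal_inches: float, aspect=(16, 9)) -> float:
    """Display area in square meters for a given diagonal and aspect ratio."""
    w, h = aspect
    diag_m = diagonal_inches * 0.0254            # inches -> meters
    width = diag_m * w / math.hypot(w, h)
    height = diag_m * h / math.hypot(w, h)
    return width * height

# A 30-in. panel has four times the area of a 15-in. panel of the same shape,
# matching the roughly 4x price consumers have historically paid for it.
small, large = panel_area_m2(15), panel_area_m2(30)
print(round(large / small, 2))                   # -> 4.0
print(round(500 / large))                        # $/m^2 for a $500 30-in. panel
```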


The Importance of Macro Effects

Basic microeconomic dynamics involving capital investment, operational improvement, and market development were sufficient to explain most display area price fluctuations through 2007. The first edition of this chapter therefore asserted that supply drives price and price drives demand, despite frequent stories in the press about supply-demand relationships. While the supply-price relationship has remained the simplest explanation of Crystal Cycle turning points since then, macroeconomic conditions have changed.
Prior to 2008, large panel cost and market factors had more influence on AMLCD industry dynamics than did small panel cost and market factors. That remained true after 2007, but the business influence of displays smaller than ten (10) diagonal inches increased markedly after the introduction of smartphone products in 2007. Compared to TV products, smartphones demanded displays with greater spatial resolution (more ppi, pixels per inch) and more cost per square inch. Increased spatial resolution increased the fabrication value of AMLCD panels from producers with the requisite technology. More components per square inch increased the material contribution value per display, and the addition of touch sensors increased total contribution opportunity along the supply chain. As a result, display makers had reason to allocate more capacity to serve demand for smartphones and associated mobile devices.
The initial stage of replacing CRT TVs with LCD TVs in rich countries was nearly complete by 2008, and the average area price of TV panels was declining more than 20 % per annum, in part because switching costs were low. A typical TV set or PC monitor maker (or brand) faced less risk switching panel suppliers than did a notebook PC maker. In turn, the typical smartphone or mobile device maker faced even greater switching costs than did notebook PC makers. Thus, display makers had additional motives for increasing small panel output, and they had the ability to cut hundreds of panels from glass substrates intended to support ever-larger TV panels. Such asset fungibility led to less correlation between area supply and price than in previous decades.

Market Factors

The period of 2008–2013 was remarkable for the influence of small panel market dynamics compared to historical trends, but small panel effects were overshadowed by macroeconomic events. The story of the global recession commencing in 2008 was ongoing at the time of this writing in mid-2014, but its influence can be seen already. AMLCD producers have not been able or willing to expand capacity as fast as they did from the mid-1990s to the mid-2000s. Modeling fabrication capacity from 2000 to 2013 (see Fig. 1) shows a classic S-curve pattern. For interest, the chart's columns show capacity by substrate size in geometric increments (0–1 m², 1–2 m², 2–4 m², 4–8 m², and 8–16 m²). A Pearl-Reed growth model fits the total capacity trend at 49 % per annum with an inflection point (second-order condition of zero acceleration) in mid-2008. Since the recession, capacity has been growing more slowly. Forecasts suggest annual capacity will increase less than 10 % a year for the remainder of this decade. Five years of slower investment were required to double capacity from 2008 to 2013, compared to less than 2 years to double from 2001 to late 2003. A slower pace of expansion, despite considerable investment by Chinese entrants, signals a change in the baseline acceleration of capacity.

Fig. 1 AMLCD area capacity, 2000–2013, in millions of m²
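Readers who want to reproduce the kind of S-curve fit described above can do so in a few lines of Python. This is a minimal sketch under stated assumptions, not the author's model: the capacity series below is hypothetical, and scipy's curve_fit stands in for whatever fitting procedure was actually used.

```python
# Fitting a Pearl-Reed (logistic) growth curve to annual capacity data,
# as described for Fig. 1. The numbers are placeholders, not the chapter's.
import numpy as np
from scipy.optimize import curve_fit

def pearl_reed(t, K, r, t0):
    """Logistic curve: capacity saturates at K, grows at rate r, inflects at t0."""
    return K / (1.0 + np.exp(-r * (t - t0)))

years = np.arange(2000, 2014)
capacity = np.array([4, 7, 11, 18, 29, 44, 65, 91, 120, 149, 175, 196, 211, 222],
                    dtype=float)  # hypothetical, millions of m^2

(K, r, t0), _ = curve_fit(pearl_reed, years, capacity, p0=(250, 0.5, 2008))
print(f"saturation K = {K:.0f} M m^2, growth r = {r:.2f}/yr, inflection = {t0:.1f}")
# The inflection point t0 is where acceleration crosses zero, the
# 'second-order condition' the author uses to date the regime change.
```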

Future Supply

Looking forward, we can see two trends. Asset fungibility should continue to provide options for capacity allocation shifts, and commodity price pressure should reduce the attractiveness of small panel demand. A close look at the composition of AMLCD substrate capacity reveals that 78 % of total capacity was in the form of substrates sized less than 1 m² in 2003. By 2007, such small substrates composed only 17 % of total capacity, and substrates in the 1–2 m² regime comprised the majority with a 32 % share. In 2013, substrates sized 4–8 m² accounted for 57 % of total capacity. Given that most Chinese companies are investing in larger substrate plants, we may see 60 % or more of capacity in the form of such large sheets in 2017. Since modern production methods allow hundreds of panels to be patterned on a substrate, we should expect panel makers to allocate capacity based on profit opportunity.
Looking back at market allocation decisions, we see small panel area accounting for 5 % of total output and 20 % of sales revenue in 2007. In 2013 those shares increased to 9 % of area and 34 % of sales. The area price multiple relative to large panels increased from 4.4 to 5.2 (times). That makes sense given the extra value provided by high-resolution displays for mobile devices, and it helps explain the attractiveness of allocating more capacity to small panel markets. The question is whether such premiums will be available in the future. News reports at the time of this writing anticipate $100 smartphones in 2015. If market dynamics in poorer countries demand lower-priced smartphones, then we should expect premiums for small panels to decline, despite demand for higher resolution panels in rich countries. Poor countries offer greater populations and greater unit demand opportunities than do rich countries. It seems reasonable to assume that panel makers will have less incentive to allocate capacity to small panel products in the future. If so, then overall market dynamics may revert to the pre-2007 condition in which large panel demand dominated microeconomic considerations. Thus, while leading panel makers may make money selling 600 ppi or higher resolution panels in the future, such market activities may matter less than they do today. In any case, it seems reasonable to assume that mean reversion will occur: recent profit opportunities provided by smartphones may not contribute as much in the future, and the future may look more like the pre-2008 market than the more recent era.

The Importance of Supply

In the pre-2008 era, the growth of panel area supply in any given quarter depended mostly upon investments made about 3 years earlier. Even leading AMLCD producers needed about 3 years to construct, equip, and ramp a new fabrication plant. Thus, while there were opportunities for incremental investment or technical improvement, capacity tended to rise in stair-step fashion as leading panel makers invested billions in new plants and fast followers matched such investments as soon as possible. In addition to bringing new plants online within the same time frame (spread over several quarters because of equipment supply constraints), most new plants used larger glass substrates. That compounded the effect of new investments. As a result, the supply of display area surged upward in wave after wave of investment on a nominal 3-year cycle.

Sentiments

Sentiment within the ranks of panel makers was another key factor driving supply expansion in the pre-2008 era. Given the high capital expenditures required for AMLCD capacity, executives reasoned that running plants at high utilization was essential, even if that drove the market-clearing price of output area down toward (or below) break-even. One way to assess the impact of such thinking is to sum the free cash flow (operational cash flow minus capital expenditures on property, plant, and equipment) for all AMLCD makers in Taiwan for the 6 years from 2002 to 2007. The sum is nearly minus 14 billion dollars: about $14 billion flowed from shareholders to employees and suppliers over the 6 years. Divided by cumulative sales revenue, the ratio was 11.5 %. Purchased materials account for the greater portion of AMLCD product cost, so running plants at high levels of utilization may not have been the best financial strategy. Regardless of the thoughts or motives, the result was successive waves of display area supply coming to market.

Prices

We can evaluate the effect of supply expansions using public disclosures from two leading AMLCD producers. AU Optronics of Taiwan and LG Display of Korea operate similar businesses in different home currencies, which tend to offset US dollar translation effects. More important, both producers report display area sold each quarter, and their combined capacity represents 40 % or more of the total industry. From such figures, we can estimate area price and cost trends for most of the industry (minimizing the effects of small producers serving niche markets). During the 6 years of 2002–2007, the two producers supplied 51 million square meters of AMLCD for $93 billion. That equates to a 6-year average area price of $1813 per square meter. Their cumulative operating profit margin was 9 %. On trend, area prices declined at an exponential rate of -21 % per annum, but we can see considerable fluctuation around that trend in Fig. 2. The chart plots price per square meter in thousands of dollars by calendar quarter from 2001 to 2013. Periods when the price rose above the exponential trend indicate up (boom) cycles. Periods when the price fell below trend indicate down (bust) cycles. Note that the (gray) trend line fit through the start of the 2008 recession (generally recognized in Q4'08) is steeper than the (black) trend line fit from 2009 onward.

Fig. 2 Average area price in USD by calendar quarter
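The boom/bust bookkeeping described above amounts to fitting a log-linear trend and reading the sign of the residual. A minimal sketch, assuming hypothetical quarterly prices rather than the producers' actual disclosures:

```python
# Fit an exponential (log-linear) trend to quarterly area prices and flag
# boom/bust quarters by the sign of the deviation. Prices are synthetic:
# a -21 %/yr decay with a cyclical wiggle layered on top.
import numpy as np

quarters = np.arange(24)                        # 6 years of quarterly data
prices = 1800 * np.exp(-0.21 / 4 * quarters) * (1 + 0.1 * np.sin(quarters / 2))

slope, intercept = np.polyfit(quarters, np.log(prices), 1)
trend = np.exp(intercept + slope * quarters)
deviation = prices / trend - 1                  # >0: boom quarter, <0: bust

print(f"annualized trend rate: {4 * slope:+.1%}")   # roughly -21 % per annum
print(np.where(deviation > 0, "boom", "bust"))
```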


Pivots

Leading AMLCD producers built few new fabrication plants after 2008. Supply expansions depended upon operational improvements and upon new entrants in China who established positions in the domestic display market using substrate sizes established in pre-2008 waves of expansion. For the 6 years of 2008–2013, AU Optronics and LG Display sold 265 million square meters of display for $205 billion at an average area price of $775. Their combined operating profit margin was 3 % over this period, but their area price declined only 5 % per annum. The chart scale suggests that the degree of price fluctuation in the later 6-year period is smaller than before, and that is true. Area price rose or fell as much as 40 % against the trend pre-recession. Post-recession, the fluctuation was on the order of 10 % or less. Thus, the model for Crystal Cycle turning points published in the first edition of this handbook should be modified to account for a trend line pivot point resulting from a combination of premium small panel innovation, slower capacity expansion, and macroeconomic recession.

The Capacity Acceleration Price Indicator

The plotted points (or intervals between points) where the area price curve crosses the trend line confirm Crystal Cycle turning points. One goal is predicting when booms may turn to busts or when busts may rebound to booms. Another goal is predicting such turning points without circularity, which arises when using demand forecasts that depend on an implied price assumption, for example. A capacity acceleration model has proven successful in this regard because capacity is, for the most part, a function of past decisions regarding investment. The author has been tracking and modeling AMLCD capacity since 1996. In addition, reliable time series are available from market researchers such as NPD DisplaySearch. A basic acceleration formula provides a way to quantify the growth of growth using a four-quarter smoothing of quarterly capacity data. The indicator shown below is simply the negative value of the acceleration, because rising growth drives falling prices. This equation, named the capacity acceleration price indicator (CAPi), has been used many times for more than a decade to predict turning points in the Crystal Cycle for the author's clients:

$$\mathrm{CAPi} \;=\; -\left[\,\frac{\left(\sum_{t=-3}^{0} C_t\right)\left(\sum_{t=-1}^{+2} C_t\right)}{\left(\sum_{t=-2}^{+1} C_t\right)^{2}} \;-\; 1\,\right]$$

where C_t is the quarterly capacity series indexed relative to the current quarter, and the three four-quarter sums are the smoothing window shifted forward one quarter at a time. The bracketed term is the ratio of two successive growth rates of smoothed capacity, so CAPi is negative while growth is accelerating.

Evaluating capacity data on the basis of negative acceleration provides excellent correlation with price fluctuations. Figure 3 plots the percentage deviation of area price relative to trend with CAPi plotted on the same axes. CAPi turning points for down cycles are noted by black dots on the gray line. These correspond with area price (for the representative makers AU Optronics and LG Display) falling below trend.

Fig. 3 Capacity acceleration and area price deviations from trend
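As a worked illustration, the following Python sketch implements CAPi as reconstructed above on a made-up capacity series. The window indexing and the data are illustrative assumptions, not the author's code or figures.

```python
# Four-quarter sums of quarterly capacity, one quarter apart, give two
# successive growth ratios; CAPi is the negative of their ratio minus one.
import numpy as np

def capi(capacity, t):
    """CAPi at quarter index t from a 1-D array of quarterly capacity."""
    a = capacity[t - 3 : t + 1].sum()    # four quarters ending at t
    b = capacity[t - 2 : t + 2].sum()    # window shifted forward one quarter
    c = capacity[t - 1 : t + 3].sum()    # window shifted forward two quarters
    return -((a * c) / b**2 - 1.0)       # negative of growth-of-growth

cap = np.array([10, 11, 12, 14, 17, 21, 26, 30, 33, 35, 36, 37], dtype=float)
for t in range(3, len(cap) - 2):
    print(t, round(capi(cap, t), 4))     # negative while growth accelerates,
                                         # positive once it decelerates
```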

Conclusion

From the foregoing, it seems reasonable to assume the Crystal Cycle will persist even if the absolute value of capacity growth remains less than 10 % per annum. Smaller absolute values may amplify the apparent acceleration caused by relatively small changes in the historical context of faster growth, but price deviations should still occur. For leading producers such as AU Optronics and LG Display, incentives for increasing display area supply persist. Charting their combined output and average annual price on a square meter basis shows substantial elasticity of demand relative to price. The slope of the log-log relationship from 2004 to 2013 is minus 1.7, which is quite strong. For those two leaders, at least, demand increased faster than price decreased. Sales revenue thus increased over the 10 years. Ignoring the question of economic profit, rising sales attract new entrants, such as those in China at the time of this writing. New entrants have incentives to accept prices that cover fixed costs, so they tend to act as price spoilers regardless of local or national policy conditions. Based on historical observations of Japanese, Korean, and Taiwanese entrants, Chinese entrants may drive prices down the long-term trend line for the remainder of this decade.
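The elasticity figure above comes from a log-log regression. A minimal sketch of that computation, with hypothetical output and price series chosen only to land near the quoted slope of minus 1.7:

```python
# Estimate price elasticity of demand as the slope of log(area demanded)
# against log(area price). Both series below are illustrative, not the
# producers' actual annual figures.
import numpy as np

price = np.array([1800, 1400, 1100, 900, 750, 650], dtype=float)   # $/m^2
area = np.array([8, 12, 18, 25, 34, 44], dtype=float)              # M m^2/yr

slope, _ = np.polyfit(np.log(price), np.log(area), 1)
print(f"price elasticity of demand ~ {slope:.1f}")  # < -1: demand grows
                                                    # faster than price falls
```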


It will be interesting to look back at the present some years hence. Will price fluctuations for the flat panel industry overall increase or decrease as smartphones become commodities or as AMOLED technologies mature? If the past is any guide, the Crystal Cycle will continue for flat panel displays as long as there are new entrants to spoil any attempt to rationalize prices.

Further Reading

Barnes D (2012) There be Dragons. SID business conference
Barnes D (2013) LCD Limbo. SID business conference
Barnes D (2014) Founderees. SID business conference. http://www.slideshare.net/DavidBarnes11/founderees
Besanko D, Dranove D, Shanley M, Schaefer S (2007) Economics of strategy, 4th edn. Wiley, Hoboken
LCD or OLED: who wins? Paper 5.1, SID symposium 2013
Varian HR (2002) Intermediate microeconomics: a modern approach, 6th edn. W. W. Norton, New York

Opportunities for Alternative Display Technologies: Touchscreens, E-Paper Displays and OLED Displays

Jennifer Colegrove

Contents

Introduction . . . . 1
Touch Screens . . . . 3
E-Paper Displays . . . . 5
OLED Displays . . . . 6
Summary . . . . 9
Further Reading . . . . 9

Abstract

The worldwide flat panel display (FPD) market was growing steadily until 2008, but following a downturn in 2009, slow growth is forecast from 2010 to 2017. Despite this, faster growth opportunities may be found in specific display technologies or through add-on functions to displays over the next several years. These faster growing areas include touch screens, e-paper displays, and OLED displays. This chapter provides a detailed discussion of the market drivers, suppliers, and market forecasts for these display technologies.

Introduction

The worldwide flat panel display (FPD) market grew steadily to more than $100 billion in 2008. The economic downturn at the end of 2008 and in 2009 hit many industries, including the FPD industry, and global FPD revenue dropped to about $90 billion in 2009. From 2010 and 2011 to 2017, DisplaySearch forecast slow but steady growth in worldwide FPD revenue (see Fig. 1). The CAGR (compound annual growth rate) is 2–3 % from 2006 to 2017.

Fig. 1 Worldwide flat panel display revenue: 2006–2009 history data and 2010–2017 forecast (Reprinted from Q2 2010)

J. Colegrove (*) Emerging Display Technologies, DisplaySearch, Santa Clara, CA, USA
e-mail: [email protected]
# Springer-Verlag Berlin Heidelberg 2015
J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_155-2

DisplaySearch's report (Q2 2010) covers all major FPD technologies including liquid crystal displays (LCDs), plasma display panels (PDPs), organic light-emitting diode (OLED) displays, and e-paper displays. Although slow growth is forecast for overall flat panel display revenue, faster growth opportunities may be found in a number of specific display technologies or add-on functions to displays over the next few years. These fast growing areas include:
1. Touch screens (see chapter "▶ Introduction to Touchscreen Technologies"), which may be added on or built in to displays
2. E-paper displays (see chapter "▶ Electrophoretic Displays"), an emerging display technology compared with LCD, PDP, etc.
3. OLED displays (see chapter "▶ Organic Light Emitting Diodes (OLEDS)"), an emerging display technology compared with LCD, PDP, etc.
In the following, we present an overview of the technologies, market drivers, suppliers, and market forecasts for each of these three fields.
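The CAGR arithmetic used throughout this chapter is simple enough to state precisely. The Python sketch below is illustrative rather than taken from the report; it reproduces the roughly 18 % rate quoted for touch modules in the Touch Screens section that follows.

```python
# Compound annual growth rate: the constant yearly growth factor that
# carries a start value to an end value over a given number of years.
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate over the given number of years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# $4.3 billion in 2009 growing to ~$14 billion in 2016:
print(f"{cagr(4.3, 14.0, 2016 - 2009):.1%}")   # -> 18.4 %, the ~18 % quoted
```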


Touch Screens

Touch screens are becoming widespread due to their ease of use and the intuitive interfaces they enable, which can save time and increase productivity. Falling prices have also spurred adoption, and consumer products have increasingly been designed around touch screens.
One of the differences between the touch-screen market and the display market is the relative degree of consolidation. The display market has consolidated to the point where the top ten manufacturers account for over 90 % of total display revenue, while the touch-screen market, in comparison, was made up of around 170 manufacturers in 2009 and over 190 manufacturers in 2010, with the top ten accounting for less than 50 % of the total market revenue. In other words, the touch-screen market is highly fragmented.
There are over a dozen touch-screen technologies in use (see chapter "▶ Introduction to Touchscreen Technologies"), but none of them is perfect. No single technology can meet 100 % of the requirements for every application. DisplaySearch groups touch technologies into 11 categories: resistive (both analog and digital), surface capacitive, projected capacitive, infrared (traditional infrared), optical imaging (camera-based), acoustic wave (both surface acoustic wave [SAW] and bending wave), digitizer, in-cell, on-cell, combination, and other.
Long used in industrial equipment, kiosks, and other nonconsumer products, touch screens have been rapidly increasing their penetration in mobile phones, portable navigation devices, gaming, and other applications. Over the next several years, touch screens will undergo strong growth in medium- and large-size (>5″) applications such as Mini-Note/slate PC, retail, ticketing, point of information, and education/training. The touch-screen industry is already a multibillion dollar industry and still has great growth momentum. This is what makes it so attractive. DisplaySearch forecast that total touch-screen module revenue will grow from $4.3 billion in 2009 to nearly $14 billion by 2016, a compound annual growth rate of 18 % (Fig. 2).
Led by Apple's iPad, the slate PC (also called tablet PC) has proved to be a fast-growing emerging application. As of early 2011, there were over 90 companies entering the slate PC industry. In October 2010, DisplaySearch increased its forecast of projected capacitive touch screens for mini-note and slate PCs. Total touch-screen shipments for Mini-Note/slate PCs in the size range of 5.0–10.2″ were forecast to reach 19.5 million in 2010 and 122 million in 2016.
Among the 20 touch-screen application categories that DisplaySearch tracks, the mobile phone is the largest in terms of shipments and revenues during the period 2009–2015. There were about 377 million touch screens shipped for mobile phone applications in 2009, which is a 26 % penetration rate. DisplaySearch forecasts that the penetration rate of touch in mobile phones will reach more than 50 % by 2016.
Projected capacitive touch has seen very fast growth in the last couple of years as it has been popularized by Apple's iPhone and iPod Touch starting in 2007. Projected capacitive is the first serious challenger to the long-term dominance of analog resistive in the touch-screen world. Not only have more resistive touch-screen manufacturers moved to produce projected capacitive, but projected capacitive technology has evolved to single substrate (the ITO coating layers are on one substrate). In recent years, more film-based projected capacitive systems are serving very large sizes, up to 100 in. and more. With the iPad and iPhone 4 adopting it in 2010, DisplaySearch forecasts that projected capacitive touch will surpass resistive touch technology for the first time to become the leading touch technology on a revenue basis. Several touch-screen module manufacturers are expanding capacity or adjusting technology to offer projected capacitive touch panels in an effort to take advantage of the growth spurt in this segment. However, with touch-screen prices quickly falling, it is important for suppliers to have the most efficient supply chain in order to remain competitive.
In-cell touch has been something of a Holy Grail in the touch industry for the past few years – it just seems natural for the touch technology to be totally (and invisibly) integrated into the display. In-cell touch is defined as the touch sensor being inside the display cell, typically between the TFT substrate and the color filter substrate. On-cell touch is outside of the display cell, typically on the surface of the color filter substrate, but beneath the polarizer. Both in-cell and on-cell touch have implications for the supply chain, as they involve integration of the touch sensor and display. Manufacturing process and yield improvement are critical.

Fig. 2 Touch-screen market forecast: 2008–2009 history data; 2010–2016 forecast (From Touch Panel Market 2010)


The touch-screen industry has many other opportunities besides those discussed above. For example, touch controller IC demand is increasing; there are opportunities here for semiconductor companies. Projected capacitive touch and resistive multitouch are growing, which creates, for example, opportunities for glass suppliers and ITO patterning equipment suppliers, and for those working on transparent conductive material to replace ITO (see Part “Transparent Conductors: ITO and ITO Replacements”).

E-Paper Displays

E-paper displays (see ▶ Paper-Like and Low Power Displays, section 8) are taking off with consumers due to their low power consumption and ease of reading – especially in sunlight. In addition to e-paper displays' "green" factor, electronic shelf labels (ESLs) can save time and labor costs by enabling dynamic pricing in stores. More and more digital content is available, and e-readers such as Amazon's Kindle using e-paper displays make a large number of books and documents portable, which is especially attractive for travelers. E-paper displays could also be used to differentiate products, such as mobile phone covers or dynamic keypads. Besides e-books, e-newspapers, and e-magazines, many other applications have adopted or are expected to adopt e-paper technologies: mobile phones, clothes (wearable displays), greeting cards with displays, ESLs, point-of-purchase (POP) displays, public signage, and other applications.
Our definition of an e-paper display is a direct-view electronic display that can be bi-stable or nearly bi-stable, with the benefit of very low power consumption. Bi-stability means the display has two stable states: the display can maintain an image without consuming any power; power is only needed to switch the image. E-paper display technologies include:
• Electrophoretic displays (see chapters "▶ Electrophoretic Displays" and "▶ In-Plane Electrophoretic Displays"), which are truly bi-stable; suppliers include E-Ink Holdings, Philips, SiPix/AUO, and Bridgestone
• Electrochromic displays, which are almost bi-stable because they require only a small amount of energy to maintain a static image; suppliers include NTera and Aveso
• Bi-stable LCDs (see chapter "▶ Bistable Liquid Crystal Displays"), which can be truly bi-stable or nearly bi-stable; suppliers include Kent Displays, Magink, Fujitsu, and ZBD
• MEMS (microelectromechanical system) displays (see chapters "▶ Mirasol®: MEMS-based Direct View Reflective Display Technology" and "▶ Time Multiplexed Optical Shutter Displays"), which are nearly bi-stable; suppliers include Qualcomm and UniPixel
• Electrowetting displays (see chapters "▶ Video-Speed Electrowetting Display Technology," "▶ Droplet-Driven Electrowetting Displays" and "▶ Electrofluidic Displays"), which can be truly bi-stable or nearly bi-stable; suppliers include Liquavista and Gamma Dynamics
• Other technologies that can be bi-stable
Currently, almost all e-book and e-newspaper/e-magazine products in the market are built on glass substrates. The TFT manufacturing process on glass substrates is mature, while TFT manufacture on plastic or metal foil has been demonstrated for several years by Plastic Logic, LG Display, E-Ink Holdings, ITRI, HP, FDC, etc. The first flexible active matrix display will enter the market by the end of 2010 or in 2011.
Most e-paper displays in the market are currently monochrome, with the exceptions of the Fujitsu e-book and the Magink CLC billboard. Vivid color has been a major challenge for manufacturers of e-paper displays. E Ink Holdings finally announced its commercially available color e-paper display in November 2010.
E-book/e-textbook displays currently account for the majority of e-paper revenues. Nearly all e-book devices currently in the market use E-Ink's microencapsulation-type electrophoretic display technology (see chapter "▶ Electrophoretic Displays"). SiPix and its biggest shareholder AUO started commercialization of its microcup-type electrophoretic display in 2010. There are also a small number – such as Fujitsu's FLEPia – using cholesteric LCD technology (see chapter "▶ Cholesteric Reflective Displays").
In DisplaySearch's e-Paper Displays Report (e-paper display technology and market forecast 2009), the e-paper display market for e-books/e-textbooks (screen sizes 5–10 in.) and e-newspapers/e-magazines (screen size >10″) was forecast to grow from about one million units in 2008 and 4 million units in 2009 to 14 million units in 2010, reaching 90 million units in 2018 (Fig. 3).

Fig. 3 E-paper display e-book/e-textbook (5–10″) and e-newspaper/e-magazine (>10″) market forecast (From e-paper display technology and market forecast 2009)

OLED Displays
The Organic Light-Emitting Diode (OLED) display (see chapter “▶ Organic Light Emitting Diodes (OLEDS)”) is a relatively new and superior display technology on the market. In particular, Active Matrix OLED (AMOLED) (chapter “▶ Active Matrix for OLED Displays”) mobile phones are a hot topic among consumers, and indeed this is the fastest growing sector in OLED displays. Shipments of AMOLED displays in 2009 reached over 20 million units, triple those of 2008, and DisplaySearch forecasts these will double in 2010, reaching about 46 million units. Currently, most OLED displays in the market are less than 15 in. in diagonal size. However, in September 2010, Mitsubishi Electric started to sell >100 in. OLED displays constructed by tiling many small full-color passive matrix OLED (PMOLED) displays. PMOLED display revenue peaked in 2006 and fell in 2007, 2008, and 2009, and worldwide PMOLED capacity is in oversupply. This tiling method immediately


Fig. 3 E-book/e-textbook (5–10 in.) and e-newspaper/e-magazine (>10 in.) e-paper display market forecast, in millions of units (From e-paper display technology and market forecast 2009)

increased PMOLED display size to >100 in. This might be a new way to boost PMOLED display sales in the future. As mentioned previously, the OLED display market’s growth is highly dependent on AMOLED’s growth. Most of the AMOLED products in the market are 5 in. or smaller; the market for larger AMOLED displays is currently very small. Medium- and large-size applications include OLED TV – such as the 15 in. AMOLED TV from LG – digital photo frames, and public signage. Figure 4 below is a comparison of the display characteristics of AMOLED and AMLCD. The advantages of AMOLED are given in the green-colored cells: thinness, light weight, excellent viewing angle, good color gamut, fast response time (no motion blur, good for 3D), high contrast ratio, wide operating temperature range (good for cold weather), and lower power consumption when about 30 % of pixels are on. However, AMOLED also has the challenges of higher price and shorter lifetime.

Characteristic | AMOLED | AMLCD
Color gamut | >100 % NTSC (top emission), ∼70 % NTSC (bottom emission); high at all gray levels | ∼70 %, up to ∼100 % NTSC (LED backlight and new color filter); falls at low gray levels
Color reproduction | Better; gamut independent of viewing angle | Good; gamut changes with viewing angle
Resolution | Lower; 308 dpi (SM), 202 dpi (polymer) | Higher; best is 498 dpi
Response time | Faster, nanoseconds; no motion blur, good for 3D | Slower, milliseconds
Contrast ratio | Higher | Lower
Sunlight readability | Better than transmissive LCD, worse than transflective LCD | OK if transflective
Operating temperature | Larger range; can operate at low temperatures such as −40 °C | Smaller range; lowest temperature is −10 °C
Power consumption | Lower at typical video content, when ∼30 % of pixels are on | Higher at typical video content
Lifetime | Shorter, 5K to 30K hours, but improving | Much longer, above 50K hours
Manufacturing investment | Lower, but lack of standards keeps the investment only slightly lower | Higher
Production cost | Expensive (low yield and complex structure), with potential to be low cost | Cheaper than AMOLED

Fig. 4 Comparison of AMOLED and AMLCD (Source: DisplaySearch data 2010)

In mid-2010, Samsung Mobile Display (SMD) announced its investment in a Gen 5.5 fab for AMOLEDs. The new Gen 5.5 facility will significantly increase its potential output of super-AMOLED displays, starting production in July 2011. It will produce an impressive 70K 1,300 × 1,500 mm sheets per month; with all lines running, Samsung can produce 30 million 3 in. class displays per month. Moving forward, SMD is also working on a Gen 8 line for AMOLED TV. LG Display is also taking an aggressive approach to AMOLEDs, having spent about $100 million at the end of 2009 to purchase all of Kodak’s OLED-related IP. LG Display announced in April 2010 that it would invest 250 billion won ($225.7 million) to triple the capacity of a line producing AMOLED displays. Its CEO also announced mass production of larger-size (>20 in.) OLED TVs in 2011/2012. Taiwan-based AUO mass produced AMOLEDs in 2005/2006 but stopped in 2007; in 2010, AUO restarted AMOLED mass production. ChiMei Innolux (CMI) has been producing AMOLEDs for several years through its subsidiaries TPO and CMEL. Several Chinese companies have announced that they will produce AMOLED displays in the near future. As mentioned previously, Mitsubishi Electric and Pioneer are selling tiled PMOLED displays. In 2010, Toppan Printing and Casio formed a joint venture, Ortus Technology, to mass produce AMOLEDs and LCDs. Ortus Technology will initially specialize in the small- and medium-sized LCD business it has taken over from Casio Computer; going forward, Ortus plans to launch an OLED business at the earliest possible date.


North America and other regions also have AMOLED activities. For example, eMagin has been mass producing microdisplay AMOLEDs for many years and recorded high revenue in 2010. The OLED display market was worth about $0.8 billion in 2009, and DisplaySearch forecasts it will pass the $1 billion milestone, growing to about $1.2 billion in 2010 and reaching $8 billion by 2017. The mobile phone main-display segment has grown strongly recently and will continue to lead in revenue, at $4 billion in 2017. TV will become the second largest revenue application, at $3 billion in 2017.

Summary
Despite the restrained growth forecast for the global FPD market, faster growth opportunities may be found among specific emergent display technologies. These include touch screens, e-paper displays, and OLED displays. In this chapter we have discussed the market drivers, suppliers, and market forecasts for these technologies.

Further Reading
Q2 2010 worldwide FPD report, DisplaySearch
Touch Panel Market Analysis, 2010 report, DisplaySearch
e-paper display technology and market forecast, 2009 report, DisplaySearch
Quarterly OLED shipment and forecast report, 2010 report, DisplaySearch
The emitter: emerging display technology monthly report, DisplaySearch

The History of Graphics: Software’s Sway Over Silicon
Adam Kerin

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Evolution of APIs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Enter Microsoft and DirectX . . . . . . . . . . . . . . . . . . . . . . . . 3
The OS and the GPU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
OS and Success . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
The Gears of Gaming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Graphics Processing Units Become General Purpose . . . . 6
The Age of Mobile Graphics . . . . . . . . . . . . . . . . . . . . . . . . 7
Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

Abstract

This chapter explores software as an influence in the brief history of computer graphics. The beginning of the industry with proprietary APIs is described, followed by the entry of DirectX and how the unified format changed how chips are designed and used. Examples of how PC gaming titles have driven the need for better and faster GPUs are given. Finally, an overview details how several nontraditional parallel applications have added to the role of the GPU.
Technical Terms

API: Application programming interface
CPU: Central processing unit
CTM: Close-to-metal
GPU: Graphics processing unit
HW T&L: Hardware transform and lighting

A. Kerin (*) Qualcomm Technologies, Inc, San Diego, CA, USA e-mail: [email protected] # Springer-Verlag Berlin Heidelberg 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_156-2


IRIS GL: Integrated raster imaging system graphics library
OpenCL: Open computing language
OpenGL: Open graphics library
PPU: Physics processing unit
RAM: Random access memory
Shader Core: Fundamental execution unit of a GPU
X86: Common PC instruction set architecture

Corporate Entities and Terms

ATI: Array Technologies Incorporated, acquired by AMD
CUDA: Compute Unified Device Architecture
HP: Hewlett Packard
MS-DOS: Microsoft Disk Operating System
PhysX: Nvidia GPU-accelerated game physics
SGI: Silicon Graphics, Inc.

Introduction
The history of computer graphics is more than a summary of release dates, product names, and silicon specifications. Other factors, not always advancements in silicon, have driven the graphics industry forward. Proprietary software interfaces unified into a common model, isolating the companies that did not adopt the standard. The PC gaming industry has pushed existing hardware to its limits, driving the demand for more performance. And the demand in some sectors for high computing power in parallel applications led to the consideration of the GPU as an additional resource. This chapter studies the dynamics between software, strategy, and silicon throughout the recent history of the GPU.

Evolution of APIs
Prior to the cross-platform programming environments that developers have today, the graphics industry was fragmented, with proprietary programming interfaces functional only on unique hardware. In contrast with the GPU during the 1980s, the CPU and the pervasive x86 environment enabled programmers’ code to run on Intel and AMD CPUs alike. Arguably, the added overhead and effort of programming for different interfaces curtailed the growth of the 3D industry. At the time of its founding in 1982, SGI was another company with yet another programming interface for its hardware. The proprietary API of SGI, IRIS GL, was required to program any of the company’s graphics products. It was not until a decade later, in 1992, that SGI decided to create an API that could be used across any platform. It was dubbed OpenGL and allowed developers to create a single driver or program to function across multiple platforms. A consortium, the OpenGL Architecture Review Board, was established in the year of OpenGL’s release to


continue to drive the open standard. The group’s most recent release was OpenGL 3.0, in August 2008 (OpenGL Overview).

Enter Microsoft and DirectX
With the creation of OpenGL, the graphics industry seemed to have converged on a single language. The unification was short lived, as Microsoft began development of the DirectX API within 3 years. Microsoft, SGI, and eventually HP attempted to combine the two standards, but the “Fahrenheit project,” as it was called, was canceled, and both APIs exist today. The following section focuses on the evolution of the DirectX API.

The OS and the GPU
The DirectX versions have evolved to incorporate new instructions, features, and new architectures to meet the advancing technologies and usages of the industry. With the arrival of DirectX 7 in late 1999, the API included the feature known as Hardware Transform and Lighting (HW T&L). Previously, the operations that move objects in a 3D space to a 2D view had been calculated in software or off-loaded to the CPU. DirectX 7 enabled direct acceleration of these operations on the GPU.

DirectX version | Windows version | Release date
DirectX 1.0 | 95 and NT 4.0 | September 1995
DirectX 2.0 | – | June 1996
DirectX 3.0 | NT 4.0 SP3 | September 1996
DirectX 4.0 | n/a | Never released
DirectX 5.0 | 98 | July 1997
DirectX 6.0 | 98 SE and ME | August 1998
DirectX 7.0 | 2000 | September 1999
DirectX 8.0 | 2000 | November 2000
DirectX 8.1 | XP | November 2001
DirectX 9.0 | XP | December 2002
DirectX 9.0a | XP | March 2003
DirectX 9.0b | XP | August 2003
DirectX 9.0c | XP | August 2004
DirectX 10 | Vista | November 2006
DirectX 10.1 | Vista SP1 | February 2008
DirectX 11 | “Windows 7” | February 2009
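Conceptually, the “transform” half of the HW T&L feature described above is a per-vertex mapping from 3D space into 2D screen coordinates. The following Python fragment is a minimal illustrative sketch of that idea, not DirectX code; the focal length and screen dimensions are invented example values:

```python
# Minimal sketch of perspective projection, the kind of per-vertex
# "transform" work that HW T&L moved from the CPU to the GPU.
def project(point, focal=1.0, width=640, height=480):
    x, y, z = point
    # Perspective divide: x/z and y/z shrink as the point moves away,
    # so distant geometry converges toward the center of the screen.
    sx = (focal * x / z + 1.0) * width / 2.0
    sy = (1.0 - focal * y / z) * height / 2.0
    return sx, sy

print(project((1.0, 1.0, 5.0)))   # a point 5 units from the camera
print(project((1.0, 1.0, 50.0)))  # the same offset, 10x farther away
```

In a real pipeline this arithmetic is a 4 × 4 matrix multiply applied to millions of vertices per frame, which is why dedicating hardware to it paid off.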

DirectX 8 brought forth the creation of the pixel and vertex shaders. These specialized components enabled changes to polygons or 3D objects that were not possible in previous versions. The purpose of the pixel shader was to paint pixels or


change colors of an object, while the vertex shader could change the vertices of an object. DirectX 10 unveiled a new architecture. Rather than the individual vertex and pixel shader units described above, the new model used a “unified shader,” which could switch between the roles of either the pixel or vertex shader depending on the program’s specific need. Using a 3D game as an example, if the player looked toward the sky, more pixel shaders would be required, while the vertex shaders might sit idle; with unified shaders, all available shaders could be devoted to acting as pixel shaders. Were the player to look down upon the ruins of a destroyed building, this potentially vertex-heavy scene would shift the shaders to that need. As such, the architecture became more “programmable” while moving away from “fixed function” (Torres 2008).
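As a rough illustration of the scheduling idea behind a unified shader architecture, consider the Python sketch below. The core counts and workload figures are invented for illustration, and real GPU schedulers are far more sophisticated:

```python
# Hypothetical sketch: a fixed pool of unified shader cores is split
# between vertex and pixel work in proportion to demand, instead of
# being hard-wired as dedicated vertex or pixel units.
def allocate_shaders(total_cores, vertex_work, pixel_work):
    demand = vertex_work + pixel_work
    if demand == 0:
        return 0, 0
    vertex_cores = round(total_cores * vertex_work / demand)
    return vertex_cores, total_cores - vertex_cores

# Player looks at the sky: almost all work is pixel shading.
print(allocate_shaders(128, vertex_work=5, pixel_work=95))   # (6, 122)
# Player looks at geometry-dense ruins: vertex load dominates.
print(allocate_shaders(128, vertex_work=80, pixel_work=20))  # (102, 26)
```

The benefit is utilization: whichever stage is the bottleneck in a given frame can claim most of the hardware rather than leaving dedicated units idle.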

OS and Success
The first DirectX came with the arrival of a new operating system, Windows 95. As the successor to Windows 3.1, Windows 95 launched the “Start” campaign, owing to the newly featured button of that name on the main desktop. A slogan within the campaign was “Start Playing,” as Microsoft attempted to convince game developers to advance from the MS-DOS operating system. As the ultimate performance of a design is dependent upon software and hardware capabilities, the shift by Microsoft to DirectX was in conflict with Nvidia’s strategy in 1995. The product code-named NV1 unleashed ground-breaking gaming performance for the time period. The issue for Nvidia was that while the architecture of the NV1 was finalized, the programming standards were not. The NV1 used quadratic texture maps instead of the polygons used within DirectX. This method allowed the graphics card to make more calculations of the surrounding environment in less time. In addition, the cards could also perform audio card functions, enabling playback of CD-quality audio. Jen-Hsun Huang, founder and CEO of Nvidia, made a deal with Sega to use the Nvidia chip within the Sega Saturn game console (Dang 2001). Despite the technical promise of the NV1, it did not succeed in the marketplace; Huang described the product as a massive failure (Reinhardt 2000). The texturing method used in the first-generation graphics cards, although superior to the competition at the time, was not compatible with DirectX. DirectX used polygon rendering rather than quadratic surfaces, and this method quickly became the standard for the industry. This incompatibility nearly killed Nvidia, according to Huang. The situation was grim for the company as the engineers confessed there was no way they could produce a polygon-capable chip within a reasonable amount of time. The company was on the verge of bankruptcy and was forced to cut its workforce. In addition, Nvidia attributed some of the failure to trying to overload the chip with impressive features that in the end hurt the overall performance (Reinhardt 2000).


The deal Nvidia made with Sega was responsible for keeping Nvidia afloat for a time. The company built a relationship with Sega and its game programmers through the NV1 product. Microsoft was not a factor in this sector, because many Japanese game developers were ready and willing to use the unconventional technology of quadratic surfaces if it brought additional performance. Sega funded a large portion of the NV2 project, the successor to the NV1 product. This success was short lived as well. Sega was developing its Dreamcast game console and wanted the system to have “a better future in an easier programming environment.” As a result, Sega chose a competitor’s graphics chipset for the Dreamcast, and again Huang and Nvidia were brought to the verge of bankruptcy. Huang realized that their GPUs and techniques, although superior to other products on the market, were nothing without the support of independent developers. As a result, Nvidia switched over to the polygon standard. Switching to the polygon method and using Direct3D exclusively also helped Nvidia bring the marketing power of Microsoft behind its products.

The Gears of Gaming
Video games are the most common applications that demand the most advanced GPUs. The video game industry has continued to boom: by 2008 it had surpassed movie box office receipts in total sales and was growing faster than the film industry (US video games sales hit record). While plenty of games have run comfortably on contemporary hardware, a few titles over the years have demanded that the hardware industry match their pace. This section focuses on a few of the games that impacted the industry. The release of Quake by id Software in June 1996 marked one of the first releases of a true 3D game. The Quake engine prerendered parts of the scene so as to reduce the real-time load on the CPU. Later variants of the Quake engine introduced 3D hardware acceleration. VQuake was launched in late 1996 and was named so as it utilized hardware features of the Vérité chip by Rendition. This code was of limited use as it was designed to function only on this specific chip, so a few months later an OpenGL version of Quake was released (GLQuake). These versions delivered better frame rates in most scenarios, coupled with higher visual quality – specifically, higher resolutions, transparent water, and antialiasing. After GLQuake, all subsequent Quake titles were also built on OpenGL. While Quake enabled the first 3D environments utilizing dedicated 3D hardware, other games have pushed the industry, and thus the silicon, in different ways. Upon release in late 2007, Crysis, developed by Crytek and published by Electronic Arts, became popular in both gaming and performance circles. PC Gamer awarded Crysis its Game of the Year and Action Game of the Year awards. GameSpot awarded the game Best PC Game, Best Shooter, and Best Graphics of 2007 and described it as one of the greatest shooters ever (Ocampo 2007). Furthermore, with a built-in GPU and CPU benchmarking test and some of the most demanding system requirements of the time, the game gained traction with hardware review Web sites as well.


Crysis placed significant strain on even the most expensive gaming machines and was the most demanding PC title upon its release. Nothing but the highest-performance multi-GPU systems could achieve playable frame rates at the highest quality settings. IGN.com observed that even on a high-end test machine with an Intel quad-core processor and Nvidia GeForce 8800 GTX, the game exhibited slow frame rates on the highest settings (Adams 2007). In short, Crysis demanded more performance than the current generation of hardware could provide.

Graphics Processing Units Become General Purpose
The most typical applications of the GPU are 3D or, at the very least, visual in nature: video games, image processing, and 3D rendering are the types of applications typically accelerated by graphics hardware. However, the year 2007 brought with it new software and thus a new usage scenario for the GPU. General-Purpose Graphics Processing Unit (GPGPU) computing utilizes the hardware once dedicated to graphical operations to execute “general purpose” code. Nvidia enabled this functionality with CUDA, or Compute Unified Device Architecture. CUDA enabled programmers to off-load parallel portions of algorithms to be executed on the GPU for applications such as oil and gas exploration, financial risk management, product design, medical imaging, and scientific research (CUDA Zone). The ATI variant of this technology was dubbed Close-to-Metal, or CTM. Most programs and applications do not benefit from the parallel nature of the GPU, and those that do tend to be scientific in nature. Stanford University’s “Folding@Home” program is one such example; it simulates protein folding to study disease. Applications for more mainstream users have appeared, however. Video transcoding programs emerged that off-load portions of the task to the GPU: ATI bundled a converter with its Catalyst drivers to access CTM, and the program Badaboom, which uses CUDA, utilizes the CPU as well as GPU acceleration for portions of the video transcoding. Nvidia also ported physics operations in games, traditionally done on the CPU, to the GPU using a proprietary API called PhysX. PhysX was originally intended to enable Ageia’s PPU (physics processing unit) to perform physics calculations. In this scenario, the GPU is used both to render the game being played and to accelerate real-time physics calculations within it. As the physics operations utilize the finite resources of the GPU, the rendering power of the GPU is reduced, lowering the frames per second while providing more physics operations. Popular games that have used the PhysX libraries include Unreal Tournament 3 and Ghost Recon Advanced Warfighter 2. Traditionally, physics operations have been done on the CPU. Havok physics is a prime example and is enabled in over 200 titles across the PC and gaming consoles like the Playstation 3, Nintendo Wii, and Microsoft Xbox 360 (Havok Physics). However, Havok is branching out to GPUs for physics support. AMD has announced it is working with Havok to optimize physics operations across the


company’s entire product line, starting with CPUs and eventually moving to the ATI GPU products (ATI CrossFire and Physics; AMD & Havok to optimize physics for gaming). By the end of 2008, OpenCL was released as a new standard for parallel programming. The standard was created by the Khronos Group with industry participation from AMD, Apple, Intel, IBM, Nvidia, and others (OpenCL Overview). Unlike CUDA, OpenCL is an open standard and targets “heterogeneous” systems, using the multiple cores of both the GPU and CPU for parallel tasks. OpenCL will launch within the next Apple operating system, code-named “Snow Leopard” (Mac OS X Snow Leopard).
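The pattern that CUDA, CTM, and OpenCL all expose is data parallelism: the same small operation applied independently to every element of a large array, so that thousands of GPU threads can each process one element. The Python sketch below is framework-agnostic and purely illustrative; in CUDA or OpenCL the loop body would become the kernel executed by one thread per element:

```python
# Illustrative sketch of the data-parallel pattern behind GPGPU computing.
# Each output element depends only on its own inputs, so on a GPU every
# iteration could run concurrently as an independent thread.
def saxpy(a, x, y):
    # y <- a*x + y, element-wise (the classic "SAXPY" example)
    return [a * xi + yi for xi, yi in zip(x, y)]

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
print(saxpy(2.0, x, y))  # -> [12.0, 24.0, 36.0, 48.0]
```

Workloads that fit this mold (protein folding, video transcoding, physics) map well to GPUs; branchy, serial code does not, which is why most applications see no benefit.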

The Age of Mobile Graphics
In 2007, Apple CEO Steve Jobs stood on stage to announce the first iPhone. The first Android phone, the HTC Dream, followed a year later. Since then, the majority of GPU shipments and gaming titles have moved to the mobile market. Within five years of these devices’ introduction, smartphone shipments alone surpassed those of PCs. By 2014, over two billion tablets and smartphones shipped, compared to 0.3 billion desktops and notebooks (Gartner Says Worldwide Traditional). This shift in form factor meant that the dominant compute platform shifted from the Windows operating system and x86 processors to Android/iOS and ARM processors (Windows has fallen behind Apple iOS and Google Android). In turn, this made OpenGL ES the dominant graphics API. The new mobile ecosystem of smartphones and tablets proved to be a significant gaming platform; during this time, mobile was the fastest growing segment of the gaming industry (Gartner 2013). This new segment brought with it new performance and power challenges. GPUs could no longer rely on whirring fans to cool themselves. Devices went from resting atop a desk to the palm of the hand. Users expected to play games but with all-day battery life. Despite these power and thermal restrictions, mobile graphics performance evolved to offer console-quality visuals and features. One such example was the inclusion of hardware tessellation and geometry shaders, capabilities initially limited to PCs and consoles. By 2014, these same features would waterfall down to Android 5.0 with an extension pack to OpenGL ES 3.1. On the hardware side of the platform, the Qualcomm Adreno 420 GPU within the Snapdragon 805 processor was first to enable these high-end features in mobile devices (Qualcomm). Mobile GPUs also adopted GPGPU features through support for OpenCL and Google’s Renderscript.

Conclusion
Changes in the graphics industry have been shaped by the dynamics between the software and its interface with the hardware. Standardization proved itself to be more important than the raw performance of a GPU. Computer games have taken


full advantage of the GPU’s performance and features while driving demand for more. Parallel and scientific applications have begun to use the GPU for added performance. Finally, the growth in mobile platforms has taken the GPU to new form factors while bringing high-end features and performance with it.

Further Reading
Adams D (2007) Crysis review. IGN. http://pc.ign.com/articles/834/834614p1.html. Accessed 23 Nov 2008
AMD and Havok to optimize physics for gaming. News Room. 12 June 2008. AMD. http://www.amd.com/us-en/corporate/virtualpressroom/0,,51_104_543126548,00.html. Accessed 15 Dec 2008
ATI CrossFire and Physics. AMD. http://ati.amd.com/technology/crossfire/physics/index.html. Accessed 15 Dec 2008
CUDA Zone. Nvidia. http://www.Nvidia.com/object/cuda_home.html. Accessed 23 Nov 2008
Dang A (2001) History of Nvidia. http://firingsquad.com/features/Nvidiahistory/. Accessed 23 Nov 2008
Gartner (2013) Forecast: Video Game Ecosystem, Worldwide
Gartner Says Worldwide Traditional PC, Tablet, Ultramobile and Mobile Phone Shipments to Grow 4.2 Percent in 2014. http://www.gartner.com/newsroom/id/2791017. Accessed 7 July 2014
GPGPU: General-Purpose Computation on Graphics Hardware. http://www.gpgpu.org
Havok Physics. Havok. http://www.havok.com/content/view/17/30/. Accessed 15 Dec 2008
Mac OS X Snow Leopard – Core Innovation. Apple. http://www.apple.com/macosx/snowleopard/. Accessed 15 Dec 2008
Nguyen H (2008) GPU Gems 3. http://http.developer.nvidia.com/GPUGems3/gpugems3_pref01.html
Ocampo J (2007) Crysis review. GameSpot. http://www.gamespot.com/pc/action/crysis/review.html. Accessed 23 Nov 2008
OpenCL Overview. Khronos Group. http://www.khronos.org/opencl/. Accessed 15 Dec 2008
OpenGL Overview. SGI. http://www.sgi.com/products/software/opengl/overview.html. Accessed 23 Nov 2008
Pharr M (2004) GPU Gems. http://http.developer.nvidia.com/GPUGems/gpugems_pref02.html
Pharr M (2005) GPU Gems 2. http://http.developer.nvidia.com/GPUGems2/gpugems2_part01.htm
Qualcomm moves to 4K with Snapdragon 805 – adds console features and new LTE companion modem. http://jonpeddie.com/back-pages/comments/qualcomm-moves-to-4k-with-snapdragon-805/. Accessed 21 Nov 2013
Reinhardt A (2000) Nvidia’s invasion. Business Week. http://www.businessweek.com/2000/00_25/b3686029.htm. Accessed 23 Nov 2008
The Rise of Mobile Gaming on Android (2014) https://developer.qualcomm.com/file/27978/rise-of-mobile-gaming.pdf/rise-of-mobile-gaming.pdf
Torres G (2008) DirectX versions. Hardware Secrets. http://www.hardwaresecrets.com/article/95. Accessed 23 Nov 2008
US video games sales hit record. BBC Business. 18 Jan 2008. BBC News. http://news.bbc.co.uk/2/hi/business/7195511.stm. Accessed 23 Nov 2008
Windows has fallen behind Apple iOS and Google Android. http://www.zdnet.com/windows-has-fallen-behind-apple-ios-and-google-android-7000008699/. Accessed 12 Dec 2012

Design Tools: Imaging, Vector Graphics, and Design Evolution
Kathleen Maher

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Raster Versus Vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
The Evolution of Professional Tools . . . . . . . . . . . . . . . . . . 3
The Influence of Networking and the Web . . . . . . . . . . . . . 5
The Democratization of Imaging . . . . . . . . . . . . . . . . . . . . . 6
Imaging and Digital Photography . . . . . . . . . . . . . . . . . . . . . 7
Raw Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
RAW Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
OpenRAW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Family Photography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
The Cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Free/Online Editing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Mid-Range Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Redefining the Market . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Illustration and Drawing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Drawing Tools for Artists . . . . . . . . . . . . . . . . . . . . . . . . . 19
Office Drawing Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Desktop Publishing Programs . . . . . . . . . . . . . . . . . . . . . . . . 20
New Platforms and Trends . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

Abstract

The relationship between artists and their tools has become very dynamic in the age of digital content creation. In the early days of computers, the limitations of the software tools at least partially dictated the style of art produced on the computer. With advances in computer graphics, the tools have become more flexible, prices have dropped, and some capabilities have been extended to the Web. This chapter provides an historical overview of the evolution of software tools for drawing and imaging and a classification of the tools being used today.

DNG: Digital Negative specification
DPI: Dots per inch
GUI: Graphical user interface
PPI: Pixels per inch
SPI: Samples per inch

K. Maher (*) Jon Peddie Research, Tiburon, CA, USA e-mail: [email protected] # Springer-Verlag Berlin Heidelberg 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_157-2

Introduction
Digital content creation tools, especially imaging and painting, were some of the first tools to follow the introduction of the graphical user interface (GUI) for personal computers in the 1980s. Those first products were far from easy to use. Early tools and input devices came with limitations that influenced the work. They required artists and designers to adapt their methods considerably, and, it could be argued, the advantages to early adopters were fairly modest. Most designers continued to work out their ideas on paper and then scan elements into the computer to be reassembled in content creation programs like Aldus Freehand, Adobe Illustrator, and Corel Graphics. Tools have gradually improved, but there is still plenty of room for improvement. And as tools have evolved, artists have also evolved. There has been a continuous give and take between artists and their tools that has resulted in distinct styles for print and the Web (Communication Arts Magazine 2009). For example, the rather arcane craft of type creation was democratized by digital tools that made it easier to fine-tune type designs. In addition, typography underwent a revolution as type was freed from physical boundaries and allowed to flow organically on the page or the web page. Similarly, early digital art featured heavy dark lines, outlines, and bold colors, because that is what the computer was able to do well. This chapter will discuss the evolution of tools and the technological changes that have affected the field of drawing and imaging for artists.

Raster Versus Vector
The key differentiator between drawing and illustration tools for all users, professional and consumer, is raster versus vector. Raster-based programs take advantage of the computer’s ability to display graphic information as pixels on the screen. For the sake of convenience, let us say that most computer screens display 72 pixels per


Fig. 1 A raster-based image will show jagged edges when zoomed on the computer screen or printed at higher resolutions

inch. This is generally true for consumers; professionals work at higher resolutions, and new displays are capable of much higher resolutions. Higher resolutions are needed for many professional applications because printers work within ranges of 600–2,400 dots per inch. (Again, this may vary with the quality of the printer and its applications.) Resolutions for printers and screens, pixels per inch (PPI), may also be expressed as dots per inch (DPI) or samples per inch (SPI). Raster-based programs produce files that are resolution dependent. An image created at 72 DPI on a computer will look fine on the screen, but if blown up to a comparable size on a printer, it will show considerable degradation. In contrast, vector-based drawing programs such as Freehand and Illustrator use mathematical techniques to produce the required image. As a result, vector images can scale to any resolution and can be printed at any size. Fonts are vector based to allow them to be scaled to the required size. CAD programs and desktop publishing are also vector based. In practical use, raster programs are used for editing photographs and creating images for the Web. In addition, raster-based programs may be used for drawing and sketching, and many hobbyists’ tools are raster based (Fig. 1). The majority of vector-based drawing tools have been created for professionals. Both raster and vector tools for professionals have features that recognize and accommodate the requirements of commercial printing, including color space options, registration, bleeds, etc.
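The resolution dependence of raster images is simple arithmetic: printed size equals pixel dimensions divided by output resolution. A short illustrative Python calculation (the image size and resolutions are example values):

```python
# Printed size of a raster image is fixed by its pixel count and the
# output resolution: inches = pixels / dpi.
def print_size(width_px, height_px, dpi):
    return width_px / dpi, height_px / dpi

# The same 1,440 x 960 pixel image:
print(print_size(1440, 960, 72))   # 72 ppi screen: 20.0 x ~13.3 in.
print(print_size(1440, 960, 300))  # 300 dpi print: 4.8 x 3.2 in.
```

A vector file sidesteps this trade-off entirely, since its curves are re-rasterized at the output device’s full resolution.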

The Evolution of Professional Tools
In general, computer programs for artists, typographers, and graphics designers were not created by professionals in the graphics arts industries, though the developers may well have relied heavily on experts to create their programs. For example, it is widely reported on the Web that Illustrator was originally developed by Adobe as an in-house font development tool. Photoshop was first developed by Thomas Knoll to help him with his Ph.D. work on processing digital images; he wrote subroutines to simulate grayscale levels on a Mac in 1987. It was his brother John, who was working at ILM, who recognized the usefulness of the program for


image processing and was interested in the ability to convert files to different formats (Derrick 2000). On the Windows side, CorelDraw began when engineers Michel Bouillon and Pat Beirne were hired to develop a vector-based illustration program to bundle with desktop publishing systems (Laver 1998). Even when developers were setting out to create tools specifically for graphics professionals, there is the practical problem of expressing smooth curves on a screen – which is essentially a grid of points. Thus, an artist wishing to create a free-form curve was confronted with the complication of splines and control points. As a result of the digital origins of drawing and imaging tools, early-adopting graphics professionals were asked to radically change the way they worked. The situation has considerably improved, but learning how to create art on the computer is still a different process from learning how to create art on paper. In many cases, the development of graphics software was heavily influenced by printing technology and desktop publishing. For instance, the concepts of layers, fills, and gradients are more related to printing than to drawing and illustration. It could be argued, however, that in the long run, the cross-pollination of disparate disciplines has been healthy for commercial art and design. Certainly, digital content creation tools have developed capabilities that enhance the native abilities of artists and even enable those with limited abilities to work in the field. The tools have spawned the field of computer art and have inspired unique styles. On the less positive side, visual content creation tools have forced artists to adapt to them rather than adapting to the requirements of the artists. Many developers are sensitive to this limitation, and as computers become more powerful, they are trying to improve the flexibility of content creation tools to allow artists to work more intuitively. The field of visual content design has been transformed by several technology waves, some internal to the programs, but most external. “Layers” is an example of an internal technology. Adobe introduced the concept of layers to artists with Illustrator 5 in 1993 and Photoshop 3 in 1994. However, computer-aided design (CAD) systems had been using layers since the late 1970s as a means of organizing information in complicated projects. For commercial artists working in Photoshop and Illustrator, the concept of layers has resonance because it is similar to the idea of creating separations for print. The capability also adds incredible power to the creation process and acts as a useful tool for packaging work for delivery. Illustrator makes extensive use of layers, reflecting the ability of artists to work with pens and different weights of lines and fills. Likewise, layers in Photoshop let artists try out different looks and adjustments and give them a path back out of changes if necessary. Other innovations in digital software that have changed the field of graphics arts include the ability to place text along a path, easily create runarounds (for instance, to set type close to a complex image), and convert raster lines to vector, making scanning and tracing a practical option for input. Text has been freed from the grid. In fact, the impact of digital technology on artistic style has not been adequately explored, but it has clearly been profound. And, as always, there are two sides of the


story. It has been argued that the dependence on software has degraded drawing skills because students spend less time doodling, sketching, and drawing and go to software sooner. The evidence for this is anecdotal and arguable. However, there is one important change that has impacted all work since the arrival of the computer: jobs are compressing. The people who create artwork take it all the way to the prepress stage or submit their work in a format suitable for the Web. There is considerably less handing off of work to different people to be reformatted for different output options; rather, the artist is called upon to do this work.
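The “splines and control points” mentioned above are, in most drawing tools, Bézier curves: a vector path is stored as a handful of control points, and the smooth curve is recomputed from them at whatever resolution the output requires. A minimal illustrative Python evaluation of a cubic Bézier curve (the control points are arbitrary example values):

```python
# Evaluate a cubic Bezier curve at parameter t (0..1) from its four
# control points. Because the curve is defined mathematically, it can
# be rendered sharply at any scale -- the basis of vector graphics.
def cubic_bezier(p0, p1, p2, p3, t):
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return x, y

pts = [(0, 0), (1, 2), (3, 2), (4, 0)]  # example control points
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(cubic_bezier(*pts, t))
```

The artist drags four handles; the program performs this arithmetic for every point it draws, which is part of why early adopters found the indirection awkward compared with a pencil stroke.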

The Influence of Networking and the Web
When talking about the evolution of commercial art and digital tools, it is important to remember that the tools were created before the broad deployment of the Web and the arrival of browsers and graphical user interfaces for the Web. The tools were primarily developed for professionals working in print publishing fields. But, even though the users and developers at the time did not know it, the wheels of change were already in motion. The Internet has emerged as a new form of media. It has almost completely displaced the trade press, it is having a huge impact on the newspaper industry, and even consumer magazines are seeing the effects. The biggest factor affecting the print industry is the defection of advertisers to the Web. In late 2008, with a slowdown in the world economy, advertising was declining across all media, but the decline has picked up speed for the print industry. Whole sectors of the print industry have disappeared. What the Web takes away, it gives back. Work for graphics artists has changed dramatically as some make the shift to creating artwork for the Web rather than for print. New jobs have been created as layout and typesetting transition to web design. In addition, the evolution to the Internet has enabled more efficient workflows for graphics artists. Before the Web arrived for the mainstream, companies were going digital and changing their workflows with the addition of local area networking (LAN). In hindsight, LANs can be thought of as training wheels for the Web. Local area networking, as it was gradually rolled out over the 1980s and 1990s, improved the ability of workers to collaborate, and, as the Web arrived, it extended the ability of people to collaborate over large distances and to access information from a variety of sources. Networking and the Web have improved the ability of people to collaborate; to find inspiration, photos, and clip art; and to buy software and support. Ironically, networking has also increased the need for organization, as it has added to the amount and complexity of data maintained by just about any computer user. The Internet, networking, and easy access to tools and training have served to democratize significant areas of the imaging and graphics arts fields.


The Democratization of Imaging
The evolution of the Internet, starting with the introduction and broad acceptance of a graphical user interface (GUI) in the form of browsers and simple coding via HTML (Tim Berners-Lee is credited with inventing the World Wide Web in 1991), has accelerated the process of democratization. As a result, new users and new applications are appearing in this segment that once seemed so stable and dominated by a few market leaders. So what happens when digital photography is widespread among professionals and consumers? The industry is at that point now. In research performed for The Digital Content Creation Software Report, Jon Peddie Research estimated the installed base of digital still cameras to be 352 million in 2010. Of those, approximately 14 million were digital SLR cameras (Digital Content Creation Software Report). That was a high point for digital still camera sales. As the quality of the cameras integrated with mobile phones has increased, most people have stopped carrying an additional camera; people who do buy an additional camera therefore tend to be enthusiasts. As a result, almost everyone has a camera in their pocket, and those cameras are becoming more capable every year. And those people who are carrying an additional camera tend to be much more committed to photography, and they are carrying better-quality cameras. Since the first days of digital imaging, there has been a class of imaging utilities that professionals tended to use in conjunction with their professional tools such as Photoshop, CorelDraw, Canvas, Illustrator, etc. Among these programs were Jasc’s PaintShop Pro and HiJaak (originally developed by Inset), which were very low cost and enabled screen capture and format conversion. However, the lightweight utilities of the past either disappeared or grew into full-blown applications that were every bit as slow to open and as unwieldy to navigate as their high-end professional counterparts. Most of the low-cost photo products, all with similar names – PhotoStudio, PhotoImpact, and PhotoPlus – have wound up in the hands of larger companies to become part of a suite. As some programs have blown up and become unwieldy, however, small companies have popped up to create useful small programs and utilities. Some programs, like SnagIt, a simple capture tool, perform single tasks, while others such as Irfanview, Picasa, Gimp, and Xara offer basic, stripped-down capabilities that a broad range of creative users – professionals and amateurs – might use. They take up where the useful utilities of the past left off (Table 1). The competition now comes from free or almost-free products that are available on the Web or ship with hardware. According to studies from Texas Instruments, Corel, and others, the average consumer only uses editing software for a few tasks. They are likely to use the software that came with their camera, and they are likely to use automatic fixes rather than fine-tune their results.


Table 1 Something for everyone – this sample gives an idea of the variety of software available for image processing (For more information about free resources for photographers see About.com; Sue Chastain has written an article listing several programs: http://graphicssoft.about.com/od/pixelbasedwin/tp/freephotoedw.htm. For a list of all kinds of freeware, including photo imaging, see Freeware Home: http://www.freewarehome.com/index.html?http%3A//www.freewarehome.com/Graphics/Graphic_Manipulation/Digital_Photos_t.html. www.infotrends.com) (From Jon Peddie Research)

Company | Product | Function | Price
Abrosoft | FantaMorph | Morphing movie software | $49.95
Flying Meat | Acorn | Mac editing software with support for layers, text, vector shapes, filters | Free version and $49.95 version
FastStone Soft | FastStone | Resizing, change color depth, file conversion | Free
Tank Software | Gallery Mage | Nondestructive organizing software | Free
Gimp.org open source group | Gimp | Photo editing; includes layers, channels, paths, etc. | Free
Morpheus | Morpheus | Morphing software | $29.95
Irfan Skiljan | Irfanview | Photo viewer with light editing | Free
Techsmith | SnagIt | Easy capture and file conversion, basic editing | $39.95
VicMan’s | VCW Maggi | Photo editing | Free
VicMan’s | Beauty Wizard | Hairstyle and makeup software | $39.85
Photofiltre.free.fr | Photofiltre | Lightweight photo editing; supports filters but not layers | Free
Serif | PhotoPlus | Photo editing with layer support, GIF animation, brushes, text | $79.95 (earlier versions offered free as an introduction to the software)
Pixarra | Twisted Brush | Digital paint software with natural art tools and photo editing | $39.95
Real Illusion | FaceFilter | Photo editor | $29.95

Imaging and Digital Photography
Perhaps no other influence has affected the field of digital imaging as much as the arrival of the digital camera. At first, the use of imaging software was limited by the ability of users to obtain or produce digital images: they could create original works using the computer or scan in images and photographs. The digital camera has made the process much easier, and as a result it has introduced the mainstream

[Fig. 2 chart: worldwide shipments of digital cameras, in millions of units, 1995–2012 (2010–2012 estimated). Source: JPR with data from CIPA and PMA]

Fig. 2 Worldwide shipments of cameras are slowing down, but revenues have been bolstered by the popularity of higher-priced DSLRs. Data for 2010 onward are forecast

of users to imaging. Mainstream users have returned the favor and have expanded the concept of photography by using their digital cameras to document events, illustrate their blogs, create posters and other items, and communicate more effectively on social networks (Fig. 2). The relationship between customers and digital cameras has evolved quickly in the decade or so since cameras became attractively priced. Every year new cameras are introduced with larger and larger megapixel (MP) counts – meaning that larger files are created and larger pictures can be printed. In 2014 the leading phones featured sensors comparable to those of low-cost digital still cameras of just a few years before. For example, the Apple iPhone 6 has an 8 megapixel camera based on a (reported) 1/3 in. sensor; the Samsung Galaxy S5 has a 16 megapixel camera using a 1/2.6 in. sensor. As phones like these increase in sales, they are replacing digital still cameras. On the higher end, there is new interest in digital SLR cameras, and analysts tracking digital camera sales point to the digital SLR market as the fastest growing market for cameras, even as the sales of mainstream cameras level off in the USA (and even more so in Europe and Japan).


Raw Format
Increasingly, cameras with ever larger resolutions are offering uncompressed formats for users willing to work with very large files in order to get all the data in an image. Technically, RAW data is the information gathered by the camera sensor before it has been compressed into JPEG or other formats, before adjustments such as white balance, sharpening, and contrast are applied, and before image quality and image size are changed. There is, however, no one RAW format: RAW formats vary according to camera maker, and a certain amount of processing and compression might take place. RAW typically captures 12 bits per color channel, or 4,096 shades per channel (68.7 billion colors). As a point of comparison, JPEG saves 8 bits per color channel, or 256 shades per channel (16.7 million colors). As a result, RAW images give users more flexibility in working with images than they would have after information is lost in JPEG. Adobe was early to offer access to RAW data via a Camera RAW plug-in for Photoshop in 2004. Soon after, the company offered to “standardize” RAW formats within its DNG (Digital Negative) format. Some camera manufacturers, including Leica and Hasselblad, have signed on, but others, including Canon, Nikon, and Sony, prefer to maintain control over their native RAW formats. Now RAW support is available through most image editing programs, and even mainstream photographers are learning how to use this format formerly reserved for professionals (Table 2). There are many more products that can work with these formats, but most cameras that have a RAW format also offer their own software to deal with those formats.
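The color-depth figures quoted above are simple powers of two, as the short Python check below illustrates (12 bits per channel is used for this example; some cameras record 14):

```python
# Shades per channel are 2**bits; total colors are that value cubed
# (one factor each for red, green, and blue).
for bits in (8, 12):
    shades = 2 ** bits
    colors = shades ** 3
    print(f"{bits}-bit: {shades:,} shades/channel, {colors:,} colors")

# 8-bit:  256 shades/channel, 16,777,216 colors        (JPEG)
# 12-bit: 4,096 shades/channel, 68,719,476,736 colors  (typical RAW)
```

Those extra bits are what give RAW its editing headroom: exposure and white-balance corrections can be applied before rounding down to 8 bits, rather than after.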

RAW Software
Working with RAW camera files has become as easy as working with processed camera files such as JPEG or TIFF, thanks to the introduction of software that handles RAW files. Cameras that support RAW files usually ship with software to deal with that camera’s proprietary file format. In addition, there are several software programs that can work with a variety of RAW file formats, including Google’s Picasa, Apple’s iPhoto and Aperture, Adobe’s products including Photoshop and Lightroom, and Corel’s AfterShot Pro and PaintShop Pro products. Commercial software products license the technology from the camera manufacturers, or they may reverse engineer – meaning they work backward from the file format – to develop software that can edit and convert RAW format files to standard file formats such as JPEG, TIFF, GIF, etc. Table 3 is a list of available software developed to handle a variety of RAW formats. The emphasis for most of these programs is to convert files and perform basic imaging functions such as adjusting white balance and sharpening. They are particularly useful in workflows where several cameras are being used, each with its own proprietary RAW format.


Table 2 A list of common RAW formats and their associated cameras and software

Proprietary raw format(s) | Camera manufacturer | Associated software
.3fr | Hasselblad | FlexColor, Phocus
.raf | Fuji | Finepix S2 Pro
.crw, .cr2 | Canon | Digital Photo Professional, ZoomBrowser
.tif, .k25, .kdc, .dcs, .dcr, .drf | Kodak | EasyShare
.mrw | Minolta | (Sony acquired Minolta’s camera technology)
.mef | Mamiya | –
.nef, .nrw | Nikon | Capture NX
.orf | Olympus | Studio 2
.ptx, .pef | Pentax | Pentax Photo
.arw, .srf, .sr2 | Sony | Sony Image Data Suite (data converter, Lightbox)
.x3f | Sigma | PhotoPro
.erf | Epson | Photolier
.mos | Leaf | Leaf Capture, Leaf Raw Converter
.raw, .rw2 | Panasonic | –
.cap, .iiq | Phase One | Capture One
.bay | Casio | –

Table 3 Several programs have been developed to handle a variety of RAW files

Product | Publisher | Cameras/formats supported | Platform | Price
PhotoStudio Darkroom | ArcSoft | Canon, Nikon, Panasonic, Adobe (DNG), Sony, Kodak, Olympus, Sigma, Mamiya, and Epson | Win/Mac/Linux | $99.99
BreezeBrowser Pro | Breeze Systems | Canon EOS 50D, Canon EOS 5D Mark II, Canon PowerShot G10, and Panasonic DMC-LX3 raw conversion | Win | $89.90
Bibble Raw Editing | Bibble Labs | Formats from Nikon, Canon, Olympus, Kodak, Pentax, Minolta, Epson, Fuji, Sony, Panasonic, Leica, Leaf, Mamiya, Samsung | Win/Mac/Linux | $159.95
Lightzone | Light Crafts Inc. | RAW files from the Canon 450D, Canon 1000D, Fujifilm S100FS, Nikon D60, Nikon D700, Olympus E-420, Olympus E-520, Olympus SP-550UZ, Panasonic DMC-L10, Pentax K20D, Pentax K200D, Sony DSLR-A200, Sony DSLR-A350 | Win/Mac | $149.95
AbleRAWer | Graphic Region | .raw, .crw, .cr2, .nef, .pef, .raf, .x3f, .bay, .orf, .srf, .mrw, .dcr, .dng, .arw | Win | Free


Several companies have made it their business to understand a variety of RAW formats and enable conversion, as the included list suggests. However, the trend is to add broad RAW support to products that combine conversion, image editing, and perhaps even image organization. There is considerable controversy over the usefulness of RAW, but professional photographers and graphics professionals will prefer to have the option. And such features as RAW support in software – for at least key cameras – will help separate professional products from mainstream products for many users. For RAW editing software, differentiation comes in the breadth of formats supported and also in flexibility. Nondestructive editing is a major feature for RAW workflows – the ease with which users can step back from adjustments differentiates professional products from low-cost, mainstream software.
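Nondestructive editing amounts to leaving the original pixels untouched and recording adjustments as a replayable list; a schematic sketch of the idea, not any particular product's implementation (the "gain" and "offset" operations are toy stand-ins for real adjustments):

```python
class NondestructiveEditor:
    """Keeps the original pixels untouched and records adjustments as an
    ordered, replayable list of operations, so any step can be undone."""

    def __init__(self, pixels):
        self.original = list(pixels)
        self.ops = []                    # e.g., ("gain", 1.2)

    def apply(self, op, amount):
        self.ops.append((op, amount))

    def undo(self):
        if self.ops:
            self.ops.pop()               # step back from the last adjustment

    def render(self):
        pixels = list(self.original)     # edits are replayed on a copy
        for op, amount in self.ops:
            if op == "gain":             # toy exposure adjustment
                pixels = [p * amount for p in pixels]
            elif op == "offset":         # toy brightness adjustment
                pixels = [p + amount for p in pixels]
        return pixels

editor = NondestructiveEditor([0.2, 0.5, 0.8])
editor.apply("gain", 1.5)
editor.undo()                            # the original is always recoverable
print(editor.render())                   # [0.2, 0.5, 0.8]
```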

OpenRAW
The OpenRAW organization was formed in an attempt to encourage camera manufacturers to open up their RAW formats to make it easier for users to work with RAW files. The group surveyed photographers and, not surprisingly, found them to be frustrated working with RAW formats. The supporters of the OpenRAW group come primarily from user groups and software developers. Manufacturers have been uninterested, and OpenRAW has had limited success in seeing the development of an open RAW format. As software companies have developed their own strategies for dealing with RAW formats, and as these programs have become easier and easier to work with, the problem is becoming moot.

Family Photography
Although there was never any doubt that digital cameras would become popular with mainstream users very rapidly, it has been surprising to see the development of applications enabled by the widespread use of digital photography. Retail printing is enjoying a bit of a recovery after the precipitous drop-off caused by the acceptance of digital cameras. However, InfoTrends has also reported that users are moving toward electronic viewing online, using digital photo frames and, increasingly, using their new HD televisions to view photos. There is a growing demand for retail printing, and consumers are creating their own books, thanks to the arrival of low-cost digital printing on demand. Several trends that are evolving with increased use of digital photographs include:
• Photo sharing online via social networking sites including Yahoo's Flickr, Facebook, MySpace, and others.
• Photo album software – stand-alone programs are showing considerable popularity.


• Photo scrapbooking and other project-based software.
• Increased printing – more photos are being printed, and more of them are being printed at retail locations.
• Photo books.
It has been remarked before that an increasing number of digital camera users are women. Unlike many other technical markets, women play a major role in purchasing and using cameras. The Photo Marketing Association (PMA) believes that women are the primary users of cameras in over 50 % of the households owning a digital camera. In addition, even when women are not taking pictures themselves, they are often the archivers and the people reusing images. For that reason they are helping drive the sales of album software and photo books; they maintain photo sharing sites; and they buy and share retail prints. Interestingly, printer manufacturer HP and the PMA have both found that users are increasingly keeping photos on their cameras: they do not download them until the memory is full. Tara Bunch, General Manager of HP's Imaging and Printing group, observes that people use their cameras as digital photo albums. It helps to keep a sense of perspective when talking about consumer trends. The vast majority of images are likely to remain digital, and the applications for them will be primarily digital, but that opens up a very fluid range of possibilities for the future.

The Cloud
In a related trend, we are seeing people become more comfortable with putting their images online. Services like Google's Picasa and Flickr allow people to store and share images online. Adobe is expanding its use of the cloud as a customer resource. The company has transitioned to a subscription-only model for its software and offers free services and storage for consumers. Customers are gradually showing a willingness to pay for online storage, and vendors including Google, Apple, Adobe, and others are making increasingly good deals in return for access to customers via a subscription relationship.

Free/Online Editing
The divide between professional tools and consumer tools has widened considerably with the entrance of free online tools such as Picasa offered by Google and Photoshop Express offered by Adobe. In addition, there are photo organizing and sharing sites including Flickr that offer some editing capabilities. The sites are free for various reasons – Google can sell ads and services, and Adobe hopes to sell products and protect its professional products from low-cost interlopers – but whatever the reason, free tools obviously threaten the position of low-cost photo


editing tools. Adobe, with the industry-standard Photoshop, has a unique position in the market. Photoshop is a requirement for many photographers and imaging professionals. As a result, the company has been able to move to a subscription model. The company offers its photography tools Photoshop and Lightroom for $9.99 a month and access to all of its creative tools for $49.99 a month. As of 2014 the company has over three million subscribers. Other companies are introducing subscription models, but no other company selling imaging tools has had this success with subscription sales. That may change as people become accustomed to the idea of subscriptions.
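A back-of-the-envelope reading of those subscriber numbers (the mix of plans is not disclosed, so the two quoted prices simply bracket the range):

```python
subscribers = 3_000_000
for monthly in (9.99, 49.99):
    annual = subscribers * monthly * 12      # revenue per year
    print(f"${monthly}/month -> ${annual / 1e6:,.0f}M per year")
# $9.99/month -> $360M per year
# $49.99/month -> $1,800M per year
```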

Mid-Range Tools
No vacuum lasts for long, and the gaping vacuum between the professional tools and consumer tools has been filled by several classes of tools – the leaders are Apple's Aperture and Adobe's Lightroom. Both combine organization for professionals and serious hobbyists with the most commonly used tools for editing, as well as presentation tools and printing. Apple introduced Aperture at $500 originally – a price that put it well under that of Photoshop but still in the class of "professional" software. It included a subset of Photoshop's digital photo editing tools, which the company described as the features professional photographers are likely to use most often in their day-to-day work. The price eventually dropped to $150–$199. Adobe countered with Lightroom for about $250, but it has since gone to a subscription model. In 2014 Apple announced plans to abandon Aperture in favor of new photo management and editing tools to come in the future. Corel hopes to fill the vacuum with its cross-platform product, AfterShot Pro, a RAW management and editing tool, which has dropped in price from $79.99 to $39.99.

Redefining the Market
Digital photography for consumers has created markets that have not existed before, such as online photo printing and custom t-shirts, coffee cups, and scrapbooks; consumer applications are helping drive the growth of the graphics and photo imaging market. In addition, consumers' enthusiasm for digital photography and sharing photographs has inspired new applications that have grown up with social media. Among these applications are Instagram, Pixlr, and Snapchat. And, obviously, social media applications such as Facebook and Twitter also thrive on photographic content. For consumers, the camera is more indispensable than ever. It should also be noted that the digital photography market is a useful model for a variety of digital media markets. Applications that make creative impulses easy to realize are changing the way professionals work as well (Fig. 3).

Fig. 3 (pie chart) Professional graphics users by market share: Adobe 46 %, Other 24 %, TechSmith 11 %, ACD 10 %, Corel/Jasc 5 %, Alien Skin 1 %, Xara 1 %, Imsi 1 %, Autodesk 1 %. Adobe owns most of the professional graphics market and has expanded it through its acquisition of Macromedia. In cases where people use other products, especially plug-ins, they also use Photoshop. Jon Peddie Research estimates that there are approximately 1.5 million professional users

Professional users tend to be conservative in their choices of tools: they do not want to spend time learning new products, and they do not want to risk unpredictability. For that reason, they are much less price conscious. In the graphics field, professional users tend to be conservative in their upgrades and their hardware purchases as well. As long as everything is working, many professionals will prefer not to rock the boat; they change only at the point of pain – when the software starts to outpace the hardware and slows down, or when there is not enough memory. New operating systems may inspire the purchase of a new system, but only if it is seen as a requirement – when version changes cause incompatibilities, for example, or functionality is increased to a significant degree. Even then, these users are rarely early adopters. The consumer market for digital imaging software is large and varied (Fig. 4). Users can choose among low-cost album software that lets them organize their photos, do light editing, create slide shows, add photos to web pages, and create coffee mugs, t-shirts, calendars, and scrapbooks. There is potential for more specialized products and also for a broader population of general-purpose users.


Fig. 4 (pie chart) Graphics consumers by market share: Adobe 30 %, Microsoft 16 %, Other 16 %, ArcSoft 11 %, Riverdeep/Broderbund 10 %, Corel/Jasc 6 %, Sonic Solutions/Roxio 4 %, Cosmi 4 %, Nova 1 %, InterVideo/Ulead 1 %, Nik Multimedia 1 %. The consumer market for graphics software is driven by the digital camera market. JPR estimates that there are approximately 6.6 million consumers who use graphics and imaging software. While this is the highest number of people in the digital content creation market, the numbers are relatively low compared to the number of cameras out there – approximately 350 million – and the installed base of mobile phones with cameras, which is approximately three billion (multiple sources, including Jon Peddie Research)

Illustration and Drawing
Digital illustration and drawing have never democratized to the same extent that imaging has, for a simple and obvious reason – drawing is a skill. Anyone can take a picture, but not everyone can draw. Illustration and drawing are likely to remain the domain of craftspeople with the ability and interest to pursue them. In addition, drawing and illustration are closely associated with the publishing industry, which is in decline. In general, applications designed for drawing are vector based to enable printing (Table 4); raster-based drawing tools, by contrast, are included in photo/imaging products and are classified with those products. Drawing products fall into two major classes: drawing tools for artists and office drawing tools. There is an overlap between the tools in many cases, and the distinction is based primarily on the types of templates and pre-drawn elements provided. For instance, office tools like Visio, EazyDraw, and SmartDraw offer pre-drawn components for block diagrams, flow charts, office layout, charts, and so on. They may offer quite a bit of clip art, and that too is oriented toward office projects. The emphasis is on making content creation easy for people not skilled in drawing. Drawing tools developed for commercial arts also offer templates and shortcuts, but the emphasis is on creating original art. The products usually have more features and may be more complicated, but they are designed to fit into a professional workflow including traditional print output and web development.
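The vector/raster distinction is easy to see in data terms: a vector file stores geometry that can be re-rendered at any size, while a raster file stores a fixed grid of pixels. A small sketch that writes a resolution-independent SVG file (the triangle and file name are illustrative):

```python
# Vector graphics store shapes as geometry (a path), not pixels, so the
# same file prints cleanly at any size or resolution.
svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    '<path d="M10 90 L50 10 L90 90 Z" fill="none" stroke="black"/>'
    "</svg>"
)
with open("triangle.svg", "w") as f:
    f.write(svg)
```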


Table 4 Commonly used vector drawing programs classified according to their primary use: creating office graphics or commercial art (From Jon Peddie Research)

Program | Company | Win | Mac | Linux | Office/Artist | Cost | Comments
Illustrator | Adobe | Y | Y | N | A | $499 | Illustrator is a de facto standard for commercial artists with strong competition
Canvas | ACDSee | Y | Y | N | A | $350 | A venerable program with longtime users. ACDSee has recently introduced Canvas 11 and added GIS support. It supports a wide variety of formats
CorelDraw | Corel | Y | Y | N | Both | $399 | A leading contender in the professional drawing market. CorelDraw's early expertise in raster-to-vector conversion has given it an edge in sign-making and pattern-making
ConceptDraw | Computer Systems Odessa Corporation | Y | Y | N | O | $499 | Office drawing tool which includes ConceptDraw's project management software and MindMap. ConceptDraw includes support for Visio files
EazyDraw | EazyDraw | N | Y | N | O | $139 | An Apple program for business users. The company reports growth year after year and is especially excited about the new release. Also supports ClarisWorks, MacDraw, and AppleWorks
DrawPlus | Serif Software | Y | N | N | O | $100 | Supports CMYK, making it suitable for professional printing applications. Serif offers earlier versions of its software as free downloads
Freehand | Adobe | Y | Y | N | A | $399 | Freehand, formerly part of Macromedia and included in the MX suite, is now being phased out by Adobe in favor of Illustrator or Fireworks
Mayura | Mayura | Y | N | N | O | $39 | Low-cost program that is easy to use and has good EPS output. Limited for professional use
Microsoft Expression Design | Microsoft | Y | N | N | A | – | Microsoft is mounting a very serious campaign against Adobe with its Expression tools
Corel Designer | Corel | Y | N | N | O | $429 | Originally developed by Micrografx, a Dallas-based software developer. Since acquiring the program, Corel has enhanced its usefulness for technical publishing. It supports a wide range of formats
Omnigraffle | Omni Group | N | Y | N | O | $200 | Mac tool for business users. Oriented toward graphs, diagrams, etc., and has support for Visio files
RealDraw | MediaChance | Y | N | N | O | $55 | Includes 3D capability in a big list of tools. Latest version released in November 2008
SmartDraw | SmartDraw | Y | N | N | O | $197 | Windows tool for business users. The company offers a wealth of pre-drawn elements and templates to create graphs and diagrams
Visio | Microsoft | Y | N | N | O | $259 | Visio has been a leading business graphics tool on the Windows side, with good tools for graphs, diagrams, facilities management, etc. Strong support for Office products gives Visio a lead for many business users. Microsoft has limited access to Visio to the Office suite, curtailing its use by some
Xara Extreme | Xara | Y | N | N | Both | $179 | Xara had worked with Corel to create CorelXara. Xara was acquired by German company Magix in 2008

Drawing Tools for Artists
Artists choose drawing tools for very personal reasons such as ease of use and intuitiveness, and they may be influenced by the application. Adobe Illustrator, by virtue of its wide use in commercial art and printing, is often specified by companies as a required tool for artists. Adobe has strengthened its position with the delivery of its products in interoperable suites: the Creative Cloud (CC) line of products. In addition, Adobe formats including Illustrator EPS/AI files and PDF have become de facto standards in the industry. Adobe sells its CC tools as subscriptions. Adobe is closely followed by Corel, whose CorelDraw was developed originally for the Windows platform and remains strong for Windows users. In addition, CorelDraw has attracted users, including pattern makers and sign makers, because of its raster-to-vector conversion abilities. It is also strong in office graphics because of its ease of use. Depending on their comfort with computer graphics tools, artists are likely to use a variety of tools, including raster-based tools, to get the effect they want. Autodesk is promoting its Sketchbook Pro product, a raster-based product developed for tablets, as a tool for industrial artists. Likewise, Corel offers its Painter tool; Ambient Design offers a popular low-cost tablet tool called ArtRage. Work created in these products may be imported into vector drawing tools for final output. The prepress industry is coalescing around PDF and EPS/AI as preferred output formats. The tablet has become an important tool for artists, who gravitate toward the ability to draw with a finger or pen in much the same way one works with pencil and paper. Apple's iPad has enjoyed an early-adopter edge, but Windows tablets with built-in digital pens are becoming important contenders. Adobe and Microsoft have teamed


up to make Adobe's products work better on touch-enabled screens for Windows devices. In addition, Adobe has considerably built out its mobile line of products and is courting third-party developers to work under the Creative Cloud umbrella. Users working with Adobe's mobile products can feed their work directly to their Creative Cloud accounts for distribution or further editing in the desktop products. There is considerable opportunity in the mobile world for illustration apps. Adobe and Autodesk are strong, but no company can be said to "own" the market. As the included list suggests, there is also a wealth of low-cost drawing products, some of which have been on the market for a long time. Artists may use them for particular features or because they are easy to use.

Office Drawing Tools
Tools designed for office applications are usually developed from the ground up for that purpose. In many cases, these products are a hybrid between CAD and drawing tools because the orientation is toward creating 2D drawings for diagrams, charts, resource management, simple design, etc. An example of this type of product is Microsoft's Visio, which was originally developed as a simple drawing tool for office users who needed to access vector content including CAD drawings. Another tool, Corel Designer, was originally developed as an all-purpose drawing tool, but Corel has refined it for use with CAD data to create technical illustrations. SmartDraw is an example of a product that was developed from the ground up as an office productivity tool with a wealth of templates and pre-drawn components for specific tasks including electrical diagrams, organizational charts, and simple illustrations. It is a Windows tool, but there are similar tools for the Mac such as ConceptDraw. Office drawing tools including Visio, SmartDraw, ConceptDraw, etc., have been relatively stable in terms of price. They do not have large user bases, but they meet the requirements of many office users who do not need full-blown computer-aided design (CAD) tools but need basic diagramming and drawing tools for office work. Office drawing tools usually offer strong support for Microsoft Office products, especially Word and PowerPoint. And, for its part, Microsoft has been enhancing the ability to produce graphics within all its products, including Word, PowerPoint, and Excel.

Desktop Publishing Programs
Desktop publishing programs are always vector based in order to accurately reproduce fonts and communicate precise visual information for printing. The desktop publishing market has been winnowed by consolidation and is probably the most stable of all DCC segments. It is also suffering from the effects of the transition of publishing from print to online. There will always be a need for printed material, but increasingly, information that was printed and distributed in the real world is becoming digital and distributed on the Web.


The digital publishing software field has been reduced to two market leaders, Adobe InDesign and Quark's QuarkXpress. Professionals who have committed to one product or the other tend to stay with it for a long time. In addition, customers do not upgrade machines or software unless it is strictly necessary, because doing so will disrupt their production cycle. Publishers working within longer production cycles, such as book publishers and some magazine publishers, are better able to change direction, but newspapers and weeklies have to plan carefully to make changes in their production pipeline. Desktop publishing is one of the oldest subsegments of the computer graphics field. As a result, there are quite a few products that continue to be used in certain industries, including Corel Ventura and Adobe's FrameMaker and PageMaker. However, in most professional publishing houses, companies settle on InDesign or Quark as a standard. At one time, Quark's output format was the standard for commercial printing houses, giving it an edge in some cases. Adobe changed the equation with Acrobat and its PDF format, which has been accepted as a universal format for publishing both online and printed documents. For individuals and light applications, users might turn to one of the vector drawing products listed in the previous section, or they might choose from a variety of products designed for light publishing including Microsoft Publisher, Print Shop, or Print Explosion. In general, the desktop publishing market for professionals has consolidated around Adobe InDesign and QuarkXpress (Table 5). There are some enterprise applications that rely on Ventura Publisher or FrameMaker for very long documents or books; these programs are often used for manuals. Users who are not routinely creating content for print publication may use some of the less-expensive programs for occasional print projects. Most professionals use either Adobe InDesign or QuarkXpress; however, there are professional users who hang on to PageMaker, FrameMaker, and Ventura Publisher. The desktop publishing software industry is very mature. There is not much growth, nor do professional users switch programs after they have settled on one. As might be expected, the software vendors are seeing a slowdown in sales as their customers are squeezed by declines in advertising and subscriptions. The print industry is somewhat stabilizing, and publishers are learning how to adapt to online publishing. Companies are introducing new tools for digital publishing, and a new layer of management is growing up around back-end infrastructure for online sales, distribution, and tracking of advertising. In 2009, Adobe acquired Omniture, which offers web tracking tools. As a result, the company is unique in offering end-to-end tools with content creation and back-end management, but there is a great deal of development going on for web-based distribution. Adobe has used the Omniture acquisition to build the Adobe Marketing Cloud. It is a much smaller business than Adobe's core Creative Cloud, but it is a fast-growing source of revenue. This is a new market with new competitors, but Adobe is unique in owning the tools for creating marketing content, including Photoshop, Illustrator, and InDesign, as well as tools for tracking the effectiveness of marketing.


Table 5 Commonly used desktop publishing programs

Product | Publisher | Price | Comments
InDesign | Adobe | $699 | With QuarkXpress, the leading DTP program
PageMaker | Adobe | $499 | Adobe's DTP for light applications
QuarkXpress | Quark | $799 | One of the leading DTP programs for professional applications
FrameMaker | Adobe | $899 | Adobe offers FrameMaker for technical publishing. It has been designed for long documents including books
iCalamus | Inverse Software | 129 Eur | Developed from the ground up for Mac OS X by a German company in Löningen
Microsoft Publisher | Microsoft | $169 | Designed for newsletters and short documents
Page Plus | Serif | Free | This is part of Serif's lineup of older products
Pages | Apple | $79 | Part of Apple's iWork suite
PageStream | PageStream | $99/$149 | First sold as Publishing Partner for Atari. PageStream is available for Windows, Mac, Linux, Amiga, and MorphOS
Print Explosion Deluxe | Nova Development | $49.95 | Print program for Mac hobbyists. Includes templates for greeting cards, business cards, CD labels, and scrapbooks
Print Shop | Broderbund | $29.95–$99.99 | Primarily a tool for hobbyists, Print Shop comes in three versions and has tools for all types of print projects including newsletters
Ventura Publisher | Corel | $599 | Ventura was originally developed for the PC platform and was first to incorporate tagging, style sheet concepts, and XML. It is used primarily for very long documents

New Platforms and Trends
The process of digital content creation is undergoing further changes as computer processors become more powerful and new capabilities such as touch screens are added. The introduction of the iPad by Apple is opening up the ability to draw directly on the screen. The iPad, along with electronic books, offers a new platform for content distribution, and it is seeing phenomenal uptake. New tablets, including Windows-based tablets with more powerful processors than those powering the Apple iPad, are appearing, and competition will drive down the price of these new devices. There is increasing interoperability between tablets and desktop products, allowing professional workflows. As a result, content creation is going to become much more mobile. It will not be necessary to rely on powerful computers for all tasks, and the process of creation will expand to include better tools for sketching and doodling. The freedom of traditional tools is returning, and artists also have the additional capabilities of digital tools.


Most significantly, the democratization of digital tools means that capabilities formerly reserved for those professionals who could afford high-end tools will soon be extended to almost anyone who wants them. The tools are becoming easy enough to use that learning the software is no longer a major barrier. Talent and training, rather than the ability to use certain tools, will differentiate artists.

Summary
Design is organization – it is the organization of ideas into a visual form that communicates. Computer graphics is just one more medium for design, but it touches almost every other medium including filmmaking, magazines, newspapers, television, online video, and web pages. There is a great deal of crossover in design tools. For example, imaging tools are used by filmmakers and video producers, illustration tools are used by desktop publishers, and so on. Increasingly, workflows are becoming completely digital. In the case of print, publications are created completely on the computer and sent to the printer in digital format. They may then be repurposed for the Web. Likewise, filmmaking and TV production are moving to a digital workflow as digital cameras gradually replace film and video tape and digital theaters replace theaters with film projectors. As mentioned throughout this chapter, the transition to digital has had a tremendous effect on content creation. One of the major advantages of a digital workflow is the improved ability to organize work all the way from concept to production. The computer and the reduced prices of storage allow users to save every stage of their work to use again or adapt for different mediums. And such capabilities as layers allow an increased level of organization within the project itself. Finally, the arrival of the Web enables people to share and collaborate no matter how close or how far away they are from each other. It can also be argued that confusion, as digital content proliferates on computers, hard drives, CDs, DVDs, and across the Internet, is one of the challenges introduced by digital workflows. There are always two sides to change. The digital revolution has had a profound effect on the design disciplines, bestowing a greater level of organization but also exacting a toll in facility, flexibility, and spontaneity. And, if not managed properly, digital workflows can even introduce disorganization and confusion. Since the introduction of the personal computer, digital content creation software has steadily improved to the point that there is almost no task that cannot be accomplished on the computer. That does not mean it is easy or pleasant for all content creators: more often than not, the tools still get in the way. In the next phase of digital content creation, the tools should disappear, giving artists a more direct relationship with the work they create.


Further Reading
Communication Arts Magazine (2009) Art and the lack of it. Mar/Apr 2009 50th anniversary issue, p 158
Derrick S (2000) Photoshop: a look back. http://www.storyphoto.com/multimedia/multimedia_photoshop.html. Accessed 18 Feb 2000
Digital Content Creation Software Report. Jon Peddie Research, available at www.jonpeddie.com
Gatter M (2006) Software essentials for graphic designers: Photoshop, Illustrator, Quark, InDesign, QuarkXpress, Dreamweaver, Flash, and Acrobat. Yale University Press, New Haven, 240 pp. ISBN-10: 0300118007, ISBN-13: 978-0300118001
Johnson S (2006) Stephen Johnson on digital photography, 1st edn. O'Reilly Media, Sebastopol. ISBN-10: 059652370X, ISBN-13: 978-0596523701
Laver R (1998) Random excess: the wild ride of Michael Cowpland and Corel. Viking, New York, p 106. ISBN 067087972X, 9780670879724
Macario J (2009) Graphic design essentials: skills, software and creative solutions, 1st edn. Prentice Hall, Upper Saddle River, 176 pp. ISBN-10: 0136052355, ISBN-13: 978-0136052357
Proctor RW, Vu K-PL (eds) (2005) Handbook of human factors in web design. Routledge, London, p 24. ISBN 0805846123, 9780805846126 (Tim Berners-Lee is credited with inventing the World Wide Web in 1991 and with introducing the first browser)


Introduction to High-Resolution Displays
Mark Fihn
Veritas et Visus, Temple, TX, USA

Abstract
Image quality on a display is determined to a large extent by pixel density. The smaller the pixels, the less likely the viewer is to perceive pixelation: images become photorealistic and fonts can be formed more precisely as pixel size decreases. Viewing distance is also a factor when it comes to image quality – as the viewing distance increases, details as fine as pixels become less and less significant. As a result, vision scientists and display engineers have determined "optimal" pixel densities for particular display-related applications. But these calculations tend to be based on visual acuity measuring tools, such as the Snellen visual acuity test, and fail to recognize the ability of the human visual system to discern much finer details. As such, it is predictable that display resolution will be significantly enhanced in the coming years to better match the capabilities of the human visual system.

Introduction
There is a common tendency in the display industry for "vision experts" to suggest that display resolution is necessarily limited by the human visual system. Of course there is some practical point of diminishing returns with regard to display resolution, but we are nowhere near achieving the limits of our visual system on our displays. Apple's announcement of the iPhone 4 with a 3.5-in. display at 960 × 640 pixels (326 pixels/in.) has sparked tremendous interest in high-resolution displays – and not just in handheld devices. There has been considerable discussion as to whether Apple's Retina Display really matches the resolution capabilities of the human visual system. Display engineers regularly explain that 20/20 vision is accurately represented by a display whose pixels subtend about 1 arcmin. The typical "scientific" analysis (e.g., http://blogs.discovermagazine.com/badastronomy/2010/06/10/resolving-the-iphone-resolution/) usually concludes for a handheld device like a mobile phone that 326 ppi is representative of what the human visual system can resolve. This analysis is somewhat flawed, as we can actually resolve points that are much finer than 1 arcmin.
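The 1-arcmin criterion is easy to turn into a pixel density; a quick sketch (the viewing distances are illustrative – by this formula, 326 ppi corresponds to a 1-arcmin pixel at roughly 10.5 in.):

```python
import math

def required_ppi(distance_in, acuity_arcmin=1.0):
    """Pixel density at which one pixel subtends acuity_arcmin
    at a viewing distance of distance_in inches."""
    pixel_pitch = distance_in * math.tan(math.radians(acuity_arcmin / 60))
    return 1 / pixel_pitch

for d in (10, 12, 16):
    print(f"{d} in. -> {required_ppi(d):.0f} ppi")
# 10 in. -> 344 ppi; 12 in. -> 286 ppi; 16 in. -> 215 ppi
```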

Visual Acuity
Consider the Snellen test, which is used frequently to determine visual acuity (see chapter "▶ Visual Acuity"). If you can read the properly sized and formed letters above the red line from 20 ft, given certain ambient light conditions, then you are said to have 20/20 vision. But keep in mind that if you can read those letters, you can also see that there are additional lines below them on the chart. While you might not be able to read them, you can almost certainly recognize that there are differences in the letters (Fig. 1).


Fig. 1 The Snellen test

Among other things, the Snellen exam fails to recognize the importance of light intensity. When we look at the stars in the sky, for example, we are more easily able to see very bright stars as opposed to very large stars. Likewise, if we consider a "bright dot" pixel defect on a display, the size of the pixel defect is actually less important than the brightness of the display. Even on the iPhone 4, a bright dot pixel defect will not be obvious if the display brightness is low, but it will become obnoxiously obvious as brightness is increased. Consider these facts about human visual acuity:
• "Optical infinity" is the least distance at which there is no significant accommodation by the crystalline lenses of a person's eyes (to represent parallel lines of light). Traditionally, optical infinity has been accepted to be 20 ft (hence the distance chosen for the Snellen test). However, at 20 ft, there is an accommodative demand on the eye of about 1/6 D (one-sixth of a diopter), which can be significant (the arithmetic is checked in the sketch after this list). Many experts maintain that optical infinity, for the purposes of examining the refractive error of the human eye, should be at least 8 m or 26 ft. In any case, 20/20 vision is primarily valuable when measuring visual acuity at optical infinity – it is a less valuable measure for determining visual acuity of objects that are positioned inside the point of optical infinity.
• 20/20 visual acuity resolves 1 arcmin. However, 20/20 is defined for a room maintained at a very dim ambient illumination (e.g., 100 lx). In nature, the range of illumination is many orders of magnitude higher (0.01–100,000 lx), and luminance contrast is often sufficient to resolve objects smaller than 50 arc sec.
• Also, 20/20 is defined for black/white luminance contrast and ignores color, 3D, and motion as image-resolving features of human vision.
• Human visual acuity is much finer than 20/20 implies. Stars subtend far less than 50 arc sec, perhaps as small as 5 arc sec in some cases, yet people see stars. Similarly, glint from a highly reflective surface is readily visible but often subtends 20–25 arc sec or less.
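The accommodative-demand figure in the first bullet follows from the reciprocal relationship between demand (in diopters) and distance (in meters); a quick check:

```python
# Accommodative demand in diopters is the reciprocal of the
# viewing distance in meters.
for feet in (20, 26):
    meters = feet * 0.3048
    print(f"{feet} ft -> {1 / meters:.2f} D")
# 20 ft -> 0.16 D (about 1/6 D); 26 ft -> 0.13 D
```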


Fig. 2 Here are three groupings of the letter “E” from the Snellen eye test. In the upper left is the “E” from the exam; the upper middle shows an outline of the letter; the upper right shows a single white dot in the letter; the lower left is an example of Vernier acuity, with a white line that is broken by a single line width; the lower middle shows a diagonal line within the outline; and the lower right is an example of contrast sensitivity, identifying the importance of contrast with regard to visual acuity

• Humans have a visual capability known as hyperacuity. The most famous example of hyperacuity is Vernier acuity, also known as positional acuity. An attribute of Vernier acuity is the ability to perceive colinearity between two line segments. Humans can resolve two line segments as being distinct if they differ by as little as 1 s of arc. In other words, when you hear a display expert explain that a "retina display" enables 1 arcmin of visual acuity (20/20 vision), bear in mind that our visual systems can actually resolve details 60 times finer than that! Note that even though you are probably looking at the "E" image in Fig. 2 on a display with less than 100 pixels per inch, and even though the image itself has been mangled by both bitmap and Acrobat compression issues, you can probably see things that the "experts" say the human eye cannot see. For example, if you are looking at the smallest cluster of "E"s on a sub-100 ppi monitor, you may not see the outline "E"s at all – the lines are less than a pixel wide and may not be reproduced at all. But on a 133 ppi display, for example, both the outline and the diagonal within the lower outline are clearly visible, even from a distance that is well beyond the "normal" viewing distance.
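To put the hyperacuity claim in display terms, one can compute the angle a single pixel subtends; a sketch (an iPhone-4-class 326 ppi panel at a 12-in. viewing distance is assumed):

```python
import math

def pixel_subtense_arcsec(ppi, distance_in):
    """Angle subtended by one pixel, in seconds of arc."""
    pitch_in = 1 / ppi
    return math.degrees(math.atan(pitch_in / distance_in)) * 3600

print(f"{pixel_subtense_arcsec(326, 12):.0f} arcsec")  # ~53 arcsec
# A 1-arcsec Vernier offset is roughly 50x smaller than one such pixel.
```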

Resolution
In a world where display performance has continuously increased and specsmanship seems to be the rule (to the point of the absurd), it is somewhat surprising that display engineers and marketers have largely ignored display resolution in their technology one-upmanship efforts. In terms of pixel density, there has been little improvement over the past decade (with mobile phones being the one significant exception). In the world of notebook PCs and LCD monitors, the market has remained stubbornly stuck at about 100 ppi for a decade now, with only an occasional experiment at higher pixel densities. In the TV space, although the shift to 1080p has been rapid, despite the fact that most TV broadcasting is not done at 1080p levels, the higher pixel count has been somewhat diminished in terms of display quality by the simultaneous trend to larger panel sizes (a 50-in. panel at 1,920 × 1,080 offers only about 44 ppi).

Subsequent development and commercialization of high-efficiency LEDs (light-emitting diodes) with high luminance, long lifetime, and reliability facilitated their use in LCD backlights. LED backlights have given a significant boost to AMLCDs in reducing the performance gap against emerging OLED (organic light-emitting diode) display technology (see section "AMOLED


Technology"). High-efficiency white LEDs are achieved using a blue LED and a high-efficiency yellow phosphor. For more precise control of the chromaticity to achieve the desired requirements, a mixture of white and individual color (such as red) LEDs may be utilized in the backlight design. Also, as the LED switching time is of the order of microseconds, a dynamic backlight mode can be employed to enhance the display contrast. A quantum dot (QD)-enhanced backlight is a recent development for AMLCDs (Steckle et al. 2014). Compared to a conventional LED backlight using a white LED or R, G, B LED elements, a QD-enhanced backlight system uses a blue LED array and a QD-encapsulated film (sheet) between the LED array and the LCD cell. The desired amount of blue light from the blue LEDs is down-converted to green and red colors by the appropriately sized and spaced QDs in the QD film. The major advantage of the QD approach is to enhance the color gamut of the display, as the QDs down-convert the blue wavelength to red and green emissions with very narrow bands. Using this technology, up to 100 % sRGB color gamut has been achieved, nearing the color gamut achievable by an OLED. The QD backlight design optimization involves tradeoffs between power consumption and color gamut. However, the suitability of the QD backlight technology for flight deck displays remains to be tested and validated.

Display Packaging and Integration
Various external components are added to the LCD cell to achieve the required environmental and optical performance (as shown in Fig. 4). The backlight system – consisting of the LEDs, backlight cavity, dimming control circuitry, diffuser, NVIS filters, and brightness enhancement and polarization-recycling films – is a crucial part of the display system. Similarly, the heater glass between the backlight and the LC cell and the cover glass in front of the LC cell are also very important. Efficient optical coupling is needed between the various components of the display system and can be obtained by use of index-matched lamination materials and antireflection coatings. The display packaging design must ensure that the system meets the mechanical-stress (shock, vibration, explosive decompression, and altitude), thermal-shock, and EMI requirements.

Projection Displays

Projection display technology has been evaluated as an alternative to direct view flat panel displays for avionics applications in the past (Kalmansh and Tomkins 2000). At that time, the rationale for projection display development for avionics displays was mainly the expected future difficulty of developing AMLCDs in a variety of custom sizes and aspect ratios (mostly square), in small quantities, with stringent performance requirements, at a reasonable cost. Projection display technology offered a potential path to develop flight deck displays with the required size and aspect ratio, at low volumes, using standard (commercial off-the-shelf (COTS)) light valves (such as liquid crystal on silicon (LCOS) or digital micromirror device (DMD)) and other optical components. The challenges associated with this development included the development of reliable and efficient projection light sources and illumination systems, projection screens required for achieving image quality competitive with direct view AMLCDs, and meeting other environmental requirements, including shock and vibration. Since the beginning of the avionics projection display investigations, owing to continued development and advancement in AMLCD technology with respect to performance improvements and cost reductions, and to the lack of adequate progress on resolving projection display development challenges, interest in the projection display approach for flight deck applications has decreased dramatically. More recently, there is renewed interest in the rear projection display approach for avionics (Cuypers et al. 2012; retrieved from https://www.thalesgroup.com/sites/default/files/asset/document/ODICIS%20Datasheet.pdf), using tiled, multiple short-throw, wide-angle projectors, under the European Project ODICIS (One DIsplay per Cockpit Interactive



Solution), to develop a single large-area display of arbitrary shape and curved surface to fit the flight deck, using multiple light valves (e.g., LCOS image sources and an LED back-illumination system).

Fig. 5 Schematic structure of an AMOLED display illustrating its simplicity compared to that of an AMLCD

AMOLED Technology

AMOLED has been under active development for more than 15 years as a next-generation flat panel display technology. Unlike an AMLCD, the AMOLED is an emissive display and does not require a backlight, the associated backlight system components, or color filters. Thus, the AMOLED has far fewer components, as illustrated in Fig. 5, with a potential for significant cost savings. Also, as the OLED is an emissive device, it can have a better black state with a very high contrast, and its viewing angle performance is intrinsically excellent. Further, because of the faster switching speed (tens of microseconds for an OLED versus several milliseconds for an LCD), the OLED has superior video performance compared to an LCD. Also, the OLED has a potential for lower power consumption, as the power losses due to color filter absorption can be eliminated and power is dissipated only by the pixels that are switched "on," unlike in an LCD. In addition, the OLED has superior potential for enabling flexible displays due to its solid-state nature. In spite of the OLED's advantages of superior viewing angle and video image quality, and the potential for lower cost and power consumption, it has taken a long time to overcome the technology and manufacturing challenges to be viable and achieve market success. The first AMOLED product (a display in a digital still camera (DSC)) was introduced in 2003 by Sanyo-Kodak. Subsequent product introductions included a 3.8-in. display for a PDA, an 11-in. TV by Sony, a 15-in. TV by LG Display, and a variety of specialized professional OLED monitors up to ~25-in. size by Sony. All these AMOLED-based products had varying degrees of market success. The main reason for this is that OLEDs have been competing with LCDs for essentially the same applications (i.e., displays for mobile electronic devices, desktop monitors, and TVs). It has been a competition with a moving target, as the AMLCD made remarkable progress during the same time frame with respect to image quality and cost and developed a huge infrastructure for very large mother glass sizes (e.g., Gen 10 fabs with a mother glass size of ~3 m × 3 m). For example, optimizing the LCD modes and replacing the fluorescent backlight with an LED backlight has allowed remarkable improvements to the viewing angle and image quality of AMLCDs in recent times. While an AMLCD with an LED backlight can realize significant improvements in contrast ratio and power savings by implementing local area dimming, the AMOLED takes this concept to the ultimate, with each pixel in the display possessing the capability for dimming.
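A rough way to see why local dimming helps: between a bright zone and a dimmed zone, the panel's native contrast is multiplied by the backlight's dimming range. A toy calculation (the 1,000:1 panel contrast and 100:1 dimming figures are illustrative, not from the text):

```python
def dimmed_contrast(panel_cr, dimming_ratio):
    """Effective zone-to-zone contrast when the backlight behind dark
    zones is dimmed: the LC leakage floor drops with the backlight."""
    return panel_cr * dimming_ratio

print(f"{dimmed_contrast(1_000, 100):,}:1")  # 100,000:1 between zones
```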



During the past couple of years, AMOLED has had remarkable commercial success in high-performance displays for smartphone and tablet applications. This commercial progress is due to dramatic advances in the enabling technologies, including efficiency and lifetimes of the R, G, B OLED emitters, TFT backplane technology, and OLED manufacturing technologies and infrastructure development. OLEDs offer unique advantages as well as challenges in their use as flight deck displays. In addition to the superior wide-viewing-angle image quality and SWaP (size, weight, and power) attributes, the optical performance of OLEDs is temperature independent, unlike that of an LCD. An OLED does not require a heater to maintain the response time at low temperatures, and the contrast does not decrease at elevated temperatures as in an LCD. The sunlight readability challenge for OLEDs can be successfully addressed (Sarma et al. 2012) by reducing their reflectivity through the use of antireflective coatings. One remaining challenge for adopting OLEDs broadly as flight deck displays is the requirement for further improvement in the lifetime of OLED devices, particularly the blue OLED emitters, to reduce the susceptibility to retained images (long-term image retention).

Flexible Displays
Flexible AMOLED displays have made remarkable progress in recent times and are beginning to be commercialized for some consumer applications such as smartphones, for example, by LG Display (retrieved from http://www.lg.com/us/mobile-phones/gflex). The flexible AMOLED is fabricated on a very thin (~0.1 mm) flexible plastic film, which is unbreakable, unlike a glass substrate. In addition to being very thin, lightweight, and unbreakable, flexible displays can be configured to conform to a desired shape, such as a flight deck, to achieve enhanced display real estate. Continuing developments in flexible displays offer the potential for their use in next-generation flight decks.

Touch Screen User Interface
Widely popularized in recent years by smartphones and tablet computers, touch screen controls are becoming highly accepted as a display interaction modality. This has resulted in an expectation by pilots of touch screen controls in the flight deck. Touch screen advantages include the direct interaction with displayed parameters/controls, software flexibility of controls, and more intuitive manipulation of the displayed information. Touch gestures such as tap, swipe, pinch, and scroll are now widely familiar and are candidates for use in the flight deck. However, vibrations and turbulence are common in the flight deck and can compromise the pilot's ability to accurately use such gestures.

Avionics Display System Design and Integration Considerations
Electronic displays in the flight deck are capable of presenting a tremendous amount of information to the pilots, ranging from low-importance advisories to flight-critical parameters that, if lost or erroneous, could jeopardize the expensive aircraft and the lives of its occupants. For example, in instrument meteorological conditions (IMC), the pilots rely solely on their instruments for situational awareness. Misinterpretation, incorrect display, or complete loss of parameters such as aircraft pitch and roll can result in loss of controllability of the aircraft and so must be prevented. System design and verification play a crucial role in ensuring that these aspects are appropriately addressed. Human factors methodologies help to ensure that the information presented can be properly absorbed and understood by the pilots. Input sensor comparisons, monitoring of hardware and software


functions, and levels of design assurance appropriate to the safety implications of the functions are important in preventing the display of incorrect or hazardously misleading information to the pilots. Loss of important information is addressed through analysis of failure modes/effects, hardware failure rates, redundancy, and independence of backup systems (including possibly dissimilarity between primary and backup systems to prevent total loss of function due to generic/design faults). In addition to addressing flight safety system design considerations, the flight deck display system must also be designed to support the business objectives of the aircraft operator. For many operators this will focus on maximizing the likelihood that the aircraft is capable of dispatching for a flight/mission when it is needed. Flight deck display systems are often designed with sufficient redundancy such that the aircraft can dispatch even with some of the system elements failed. Electronic displays and associated multifunctional controllers (e.g., cursor control devices and keyboards) are well suited for redistributing functions to the remaining operational units following the failure of some system elements.

Human Factors
Electronic displays in the flight deck, as with all flight deck systems, must be validated to accomplish their intended functions. Human factors has emerged in recent years as a discipline of increasing emphasis in establishing functional and user interaction requirements. With their advantages of increased graphical capability, integration of information/control, flexibility, and upgradability, electronic display systems promise a wide range of functional possibilities. However, the safety considerations of the flight deck mandate a careful, methodical, and scientific approach to assessing and validating the design of electronic display systems from the pilot's perspective. Thoughtful decomposition of flight deck function requirements, and the allocation of those requirements to systems that include the electronic displays and their associated controls, is an important first step in the design process. Throughout development, the needs of the pilots and the ramifications of design decisions on their ability to perform flight deck tasks must be addressed through methods such as analyses and empirical human factors experiments. Of particular importance are assessments of pilot workload (for both normal and abnormal situations), the propensity for errors, and pilot fatigue. Electronic displays have facilitated great strides in the capabilities, efficiencies, and safety of the flight deck through its human occupants, the pilots. As aircraft operational environments (e.g., NextGen and SESAR) become increasingly complex and demanding in the future, opportunities will continue to arise for leveraging continuing advances in display and user interface technologies for next-generation flight deck designs. With a focus on human factors and system design, effectively leveraging next-generation displays and user interfaces for the development of next-generation flight decks will improve upon the already stellar safety and efficiency performance of today's aviation.

Further Reading
Cuypers D et al (2012) Projection technology for future airplane cockpits. In: IDW 2012
Endoh S, Ohta M, Konishi N, Kondoh K (1999) Advanced 18.1 inch Super TFT LCD with mega-wide viewing angle and fast response speed of 20 ms. In: IDW '99, p 187
Federal Aviation Administration (2002) Function and installation (14 CFR 25.1301). Retrieved from http://www.gpo.gov/fdsys/pkg/CFR-2002-title14-vol1/xml/CFR-2002-title14-vol1-sec25-1301.xml
Haim E, McCartney R, Penn C, Inada T, Unate U, Sunata T, Taruta K, Ugai Y, Aoki S (1994) Full color grayscale LCD with wide viewing angle for avionics applications. In: SID '94, application digest, p 23
Kalmansh M, Tomkins R (2000) Projection display technology for avionics applications. In: Hopper D (ed) Cockpit displays VII, SPIE, April 2000
Kobayashi K, Masutani Y, Nakashima K, Nivano Y, Nishimura M, Tahata S, Mori Y, Lamberth L, Laddu R, Coyle M, Komenda V, Esposito C, Sarma K (1998) IPS-mode TFT-LCDs for aircraft applications. In: SID '98 digest, pp 70–73
Lee KH, Kim HY, Park KH, Jang SJ, Park IC, Lee JY (2006) A novel outdoor readability of portable TFT-LCD with AFFS technology. In: SID digest '06, p 1079
McCartney JA (1994) The primary flight instruments for the Boeing 777 airplane. SPIE, cockpit displays, vol 2219, p 98
Mori H et al (1997) Conference record. In: International display research conference (SID), p M-88
Ohta M, Oh-e M, Kondo K (1995) Asia display '95, p 707
Sarma KR, Franklin H, Johnson M, Frost K, Bernot A (1990) Grayscale in AMLCD panels with wide viewing angles. Proc Soc Inform Disp 31(1):7
Sarma KR, McCartney RI, Heinze B, Aoki S, Ukai Y, Sunata T, Inada T (1991) A wide viewing angle 5-in diagonal AM LCD using halftone grayscale. In: SID digest '91, p 555
Sarma KR, Laddu R, Harris D, Lamberth L, Li WY, Chien CC, Chu CY, Lee CS, Wei CK, Kuo CL (2003) MVA-AM LCD development for avionic applications. In: IDMC 2003, p 211
Sarma KR et al (2012) Recent advances in AM OLED technologies for application to aerospace and military systems. In: 2012 SPIE DSS conference proceedings, Baltimore, 23–27 April 2012
Steckle JS et al (2014) Quantum dots: the ultimate down-conversion material for LCD displays. In: SID 2014 digest, p 130
Takeda A, Kataoka S, Sasaki T, Chida H, Tsuda H, Ohmuro K, Sasabayashi T, Koike Y, Okamoto K (1998) A super-high image quality multi-domain vertical alignment LCD by new rubbing-less technology. In: SID '98 digest, p 1077



Medical Displays
Elizabeth A. Krupinski
Department of Medical Imaging, University of Arizona, Tucson, AZ, USA

Abstract
Medical imaging is an integral part of any healthcare practice today and cuts across multiple disciplines and clinical specialties to provide information to clinicians that will help them render diagnostic decisions and make treatment recommendations. The technologies available have expanded rapidly in recent years, and the number and range of image types clinicians have available to them is vast. Radiologic images are the most commonly acquired and used, but other specialties such as pathology, ophthalmology, and dermatology, and the telemedicine implementation of nearly every other clinical specialty, are acquiring, archiving, transmitting, and viewing images on a daily basis. Although the acquisition technologies differ, all of these images do have one thing in common – they must be displayed in a fashion in which the healthcare professional can view them and render an accurate and efficient diagnostic decision. In order for that to happen, the display itself must be considered as well as the environment in which the display is located. This chapter will review some of the basic factors to consider for the optimal display of medical images.

Before discussing the display options, it is important to realize that a key goal when choosing a display device for a given medical imaging application is to match the output of the imaging system to the display and then optimize the display to the observer's visual system for a given reading environment. Thus there are a few properties of the human eye–brain system that are useful to understand before talking about the displays themselves. The human visual system operates on (and is limited by) its anatomic and physiological properties. Photoreception is the mechanism by which light from the environment (in this case the display) produces changes in the photoreceptors or nerve cells in the retina. Rods (about 115 million) sense contrast, brightness, and motion and are located mostly in the periphery of the retina; cones (about 6.5 million) provide fine spatial resolution and color vision and are located in the foveal and parafoveal regions. When light hits these receptors, pigments undergo chemical transformations that convert light energy into electrical energy, which then acts on nerve cells that connect the eye to the optic nerve and subsequent visual pathways that extend to the visual cortices in the brain. Two key visual properties that relate to displays are spatial and contrast resolution. Spatial resolution is the ability to see fine details; it is highest at the fovea and declines sharply toward the retinal periphery. Contrast resolution is the ability to distinguish differences in intensity in an image. It permits one to distinguish between objects and background in an image. Contrast resolution depends on both the quality of the image and the visual capabilities of the human observer. Therefore, whenever a new type of image, display, or presentation state is developed, it is necessary to characterize its contrast resolution.

Grayscale Displays

Radiology, as already noted, is the most common specialty in which images are used for medical diagnoses. Although originally film-based, radiology throughout the world has, with rare exception, gone digital since the 1990s; thus radiology has the most experience with display technology and optimization compared to other clinical specialties. There are technical guidelines for the practice of digital radiography that incorporate standards for image display (Andriole et al. 2012; Kanal et al. 2013; Norweck et al. 2013), and they should be consulted when questions arise about display properties and minimum requirements.

The technical guidelines for radiology generally recommend the use of medical-grade monochrome displays for primary interpretation, and there are a number of manufacturers with models available up to at least 5 megapixel (MP) resolution (which is required for mammography), thus satisfying the spatial requirement for nearly every imaging modality. In terms of resolution, it is recommended that the display match as closely as possible the resolution of the images to be displayed so that all of the information in the image can be visualized, at least initially, without having to zoom and pan. The sizes of common radiologic images are shown in Table 1. Typical display options are 1.3 MP (1,280 × 1,024, 19″) for CT, MRI, and ultrasound (US), 3 MP (2,048 × 1,536, 21″) for non-chest CR/DR, and 5 MP (2,048 × 2,560, 21″) for chest CR/DR and mammography. Some manufacturers offer even higher-resolution displays (e.g., 10 MP) that are useful for displaying multiple images at full resolution at the same time rather than using two displays. Although display resolution/size is typically thought of in terms of the number of pixels, it is also important to look at pixel dimension, and it is recommended that the pixel size be about 0.2 mm for standard viewing conditions.

What are some of the important display properties that need to be considered? Excellent technical summaries are available in the practice guidelines and in a paper prepared by the American Association of Physicists in Medicine Task Group 18 (Samei et al. 2005), but some key points are provided here. Most displays today are liquid crystal displays (LCDs) with cold-cathode fluorescent (CCFL) backlights, and although there are other technologies available (e.g., LED backlights) and new ones on the horizon, the basic elements to consider will remain important. Spatial resolution, discussed above, is one key property. Maximum and minimum luminance are critical properties as well and depend on the type of image to be displayed. CT, MRI, and US displays should have a minimum luminance (black level) of 1 cd/m² and a maximum (white level) of at least 250 cd/m²; CR/DR, 0.6 and 400 cd/m²; and mammography, 0.6 and 500 cd/m². Most monitors today, even commercial off-the-shelf (COTS) displays, have a maximum of at least 500 cd/m², so this is generally quite readily achieved. The ratio of maximum to minimum luminance is called the contrast ratio; it is obviously a function of the display itself but is also impacted by the ambient lighting conditions. Contrast resolution is also affected by the viewer's location, although this aspect has improved significantly in the past few years. Viewing the display orthogonally is ideal since, with LCDs, as the viewer moves "off-axis" the contrast ratio is reduced and low-contrast targets become more difficult to see and thus can be missed. Contrast ratio can be adjusted/optimized by calibrating the display.
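To make the luminance figures concrete, the following is a minimal sketch (in Python; the chapter contains no code) of checking a measured display against the white-level values quoted above. The modality names, threshold table, and function are illustrative assumptions, not any published API.

```python
# Guideline white levels quoted above (cd/m^2); illustrative table, not an API.
WHITE_LEVEL_MIN = {"CT/MRI/US": 250.0, "CR/DR": 400.0, "Mammography": 500.0}

def check_display(modality: str, l_min: float, l_max: float) -> bool:
    """Report the contrast ratio and test the measured white level (cd/m^2)."""
    ratio = l_max / l_min  # also degraded by ambient light in practice
    print(f"{modality}: contrast ratio {ratio:.0f}:1, white {l_max} cd/m^2")
    return l_max >= WHITE_LEVEL_MIN[modality]

print(check_display("Mammography", l_min=0.6, l_max=520.0))  # -> True
```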
Table 1 Typical sizes of common images used in radiology

  Modality                                          Image size
  Mammography                                       4,000 × 5,000
  CR/DR chest (computed and digital radiography)    3,000 × 2,500
  CR/DR non-chest                                   1,760 × 2,140
  Ultrasound                                        1,024 × 768
  CT                                                512 × 512
  MRI                                               512 × 512
  Nuclear medicine                                  128 × 128

In radiology the DICOM GSDF (Digital Imaging and Communications in Medicine Grayscale Standard Display Function) is the standard for calibrating displays for the display of grayscale images (DICOM 2006). The GSDF matches the contrast resolution of the display to the contrast sensitivity of the human visual system and maps image pixel values to specified displayed luminances to yield consistent grayscale presentation across displays. Most medical-grade displays come calibrated to the DICOM GSDF and often have hardware and software that allow for periodic remote monitoring and recalibration. There are also tools available for manual calibration, and they are especially useful when using COTS displays, as these typically do not have an option for GSDF calibration.

Every display has some amount of noise, and there are two types that can impact image quality (especially for low-contrast images): temporal and spatial. Temporal noise refers to Poisson noise that comes from the emission of photons by the backlight and control electronics. Spatial noise refers to stationary pixel-to-pixel variation in output, arising from spatial variation in the panel itself, which some manufacturers correct for to reduce spatial nonuniformity. Display uniformity is one of the key differences between medical-grade (~15 %) and COTS (20–30 %) displays.

A critical part of any department or clinician using electronic displays for medical image display and interpretation is a solid quality assurance/quality control (QA/QC) program. This is important not only when setting up a new display but also over time, as monitors tend to degrade, and these changes need to be tracked and, if possible, corrected. Some QC procedures are addressed by the manufacturer (e.g., luminance uniformity). It is important to determine whether any of the tools necessary to carry out QA/QC procedures (e.g., luminance sensors/tools for calibration) are included with the display or need to be purchased separately (and from whom, if the vendor does not provide them). It is also important to set up a regular schedule for QA/QC and to decide (based as often as possible on published guidelines) when corrections need to be made (e.g., when the luminance falls below a given threshold). COTS monitors typically require more frequent QA/QC and adjustment or recalibration than medical-grade displays, since they are not manufactured to the same level of quality and typically have shorter useful lifetimes. In general it is probably useful to carry out basic QA/QC measures on a monthly basis if feasible and practical. As already noted, many medical-grade displays come with the option (at a cost) for the manufacturer to remotely monitor a given display and maintain its calibration. Manufacturers generally also provide regular reports that track the QA/QC results and advise users when the useful lifetime of a monitor is approaching its end.
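As an illustration of what GSDF calibration computes, here is a minimal sketch (Python) of the standard's luminance-versus-JND-index function. The coefficients are quoted from DICOM PS 3.14 to the best of this editor's knowledge; treat this as a sketch to be verified against the standard, not a validated clinical implementation.

```python
import math

# DICOM GSDF: maps a just-noticeable-difference (JND) index j in 1..1023 to
# luminance in cd/m^2 via a rational polynomial in ln(j).
_GSDF = dict(a=-1.3011877, b=-2.5840191e-2, c=8.0242636e-2, d=-1.0320229e-1,
             e=1.3646699e-1, f=2.8745620e-2, g=-2.5468404e-2, h=-3.1978977e-3,
             k=1.2992634e-4, m=1.3635334e-3)

def gsdf_luminance(j: float) -> float:
    """Luminance (cd/m^2) at JND index j per the GSDF rational polynomial."""
    x = math.log(j)
    c = _GSDF
    num = c['a'] + c['c']*x + c['e']*x**2 + c['g']*x**3 + c['m']*x**4
    den = 1 + c['b']*x + c['d']*x**2 + c['f']*x**3 + c['h']*x**4 + c['k']*x**5
    return 10.0 ** (num / den)

# Calibration then amounts to finding the JND range the panel can actually
# reproduce and mapping digital driving levels uniformly onto that range.
print(gsdf_luminance(1), gsdf_luminance(1023))  # ~0.05 and ~4000 cd/m^2
```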

Color Displays

Color displays are increasingly being used in radiology, but they are critical for viewing the many other types of medical images that are inherently in color. Pathology (Fig. 1), dermatology, endoscopy and laparoscopy, dentistry, ophthalmology, and telemedicine are some of the important clinical specialties where (usually visible-light) imaging is increasingly being used. There is, however, very little guidance available for calibrating color displays for medical imaging applications, although the FDA and many researchers are actively pursuing the development of color standardization techniques (Badano et al. 2014).

There are two aspects of color that need to be considered. The first is color accuracy: the ability of a system to produce exact color matches between input and output. This is relatively easy to measure with spectrometers and color reference charts like the Macbeth ColorChecker chart. The second is color consistency, which refers to the ability of a system to yield data that are identical or similar to the color perceptual response of the human visual system (like the DICOM GSDF for grayscale). This is somewhat more difficult to achieve, since color perception itself is a rather complicated issue. In any case, the goal is the same as with grayscale displays: to match the output to the human visual system and to create consistent presentation of images across different displays.


Fig. 1 Typical pathology images of tissues with different staining techniques, showing the range of colors that need to be accurately displayed when viewed on a monitor

One option is the use of ICC (International Color Consortium) device profiles, which provide a standardized architecture, profile format, and data structure for color management and data interchange between different imaging devices. The profiles incorporate characterization data for color-imaging devices along with data tags and metadata that detail a set of transforms between the native color spaces of a device and a device-independent color space. Computer operating systems can use color management modules, or software applications that utilize the ICC profiles, to provide consistent and perceptually meaningful color reproduction for input devices, output devices, and color image files.
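As a hedged illustration of this workflow, Pillow's ImageCms module (a real Python wrapper around littleCMS) can build and apply such profile transforms. The file names below are hypothetical, and the snippet is a sketch of the mechanics rather than a validated clinical pipeline.

```python
from PIL import Image, ImageCms

# Hypothetical inputs: a captured image and the capture device's ICC profile.
im = Image.open("slide.tif").convert("RGB")
src = ImageCms.getOpenProfile("scanner.icc")   # device (input) profile
dst = ImageCms.createProfile("sRGB")           # device-independent target

# Build a transform from the device's native space to sRGB and apply it.
xform = ImageCms.buildTransform(src, dst, "RGB", "RGB")
im_srgb = ImageCms.applyTransform(im, xform)
im_srgb.save("slide_srgb.tif")
```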

User Interface

Image display refers not only to the physical display and its properties but also to the actual presentation of the image, and in medical imaging this is critical, as users are likely to spend hours using these displays to interpret images. Thus, the workstation user interface is important to consider. User interfaces should be fast, intuitive, user-friendly, reliable, and able to integrate and expand. For example, the way the image options appear to the user (referred to as the hanging protocol in radiology) is important. In radiology, as well as many other specialties, it is typical to present all possible images available from an exam as thumbnails on one side of the display and the first image in a series in the center at full or close to full resolution. In areas such as pathology this can be difficult because the images are quite large, but they can be presented in adequate detail at low resolution and then magnified using the zoom/pan function for visualizing fine details.

Image processing and analysis tools are a standard part of most image viewing software packages. Some offer more options than others, but (a) the user should be able to use the basic navigation tools of the interface without any training or prior exposure, and (b) the system should be user-friendly, easy to use, and customizable. This means there should be simple menus and file managers; single mouse-click navigation; visually comfortable colors or gray scales and an uncluttered workspace; ergonomically positioned input devices such as mouse, keyboard, and pad; and ergonomically positioned monitors.

Perceptually, default image presentation quality is extremely important. It is crucial to provide the best, most perceptually useful information in the initial default presentation so the clinician can make correct decisions with as little unnecessary image manipulation as possible, so as not to prolong viewing times.

Speed is another important issue, in terms of both how long it takes to pull an image from the server and how long it takes to navigate through the image (e.g., zoom/pan, scroll through CT/MRI slices, rotate a volume-rendered object). Speed is very often a determining factor in the acceptance of a workstation and is determined by the programming, the number of processors, and the power and memory of the system. Major operations (e.g., loading the images of a patient for display) should take no more than 2 s per operation, and operations executed during the interpretation of a case should take milliseconds.

How the user interacts with the display is a function of the type of input device, and this is usually a matter of personal preference. Keyboards with hot-key options, the mouse (with or without a scroll wheel and with various numbers and types of buttons), and trackballs are the most commonly used interface devices for navigating through images, although other options such as foot pedals, joysticks, modified keyboard/mouse combinations, and even voice-controlled technologies are being used.

Human Factors and Ergonomics

There are a number of factors related to the environment in which the display is used that are important. As noted earlier, ambient lighting matters, and it is recommended that 20–40 lx of ambient light be used. This is sufficient to avoid most reflections and glare on the display while still providing adequate light for the human visual system to adapt to the surrounding environment and the displays. Ambient lighting should be indirect, and incandescent lights with dimmer switches, rather than fluorescent lights, are recommended. Light-colored clothing and lab coats can increase reflections and glare even with today's LCDs, so they should be avoided if possible.

An important issue, especially in radiology, is the number of displays that are used. This can depend on the amount of space available, but in radiology two monitors for image display and a third, lower-quality COTS display for the worklist are common. For other clinical specialties a single display is more common, although having one display dedicated to viewing images and one for other patient data, such as the electronic health record, makes it easier to view all the data at once rather than toggling between screens on a single display.

Viewing distance is a matter of personal preference, but typical viewing distances between a user and display are between 12 and 18 in. for most people. Many medical-grade displays have screen sizes around 25 × 16 in., making it necessary to have a viewing distance closer to 18 in. to be able to view the entire display. With two displays, especially the larger ones, it is best to have them side by side and angled slightly inward so both surfaces can be viewed close to orthogonally. This also reduces the strain and stress associated with having to move the head and neck during viewing, which could otherwise result in repetitive motion injuries.

A concern that has not been considered very much to date, but is being investigated, is visual fatigue (i.e., computer vision syndrome) (Reiner and Krupinski 2012). This can result from the long hours that clinicians now spend viewing images, and common symptoms are visual strain, headaches, blurry vision, dry eyes, and a host of other physical symptoms. There is increasing evidence, at least in radiology, that long work days increase fatigue and negatively impact both diagnostic accuracy (by about 4 %) and the time it takes to review a case (Fig. 2) (Krupinski et al. 2010). As clinicians in other specialties increase their volume of images and time spent viewing them, similar concerns about diagnostic accuracy and efficiency are likely to arise.

Page 5 of 7

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_169-1 # Springer-Verlag Berlin Heidelberg 2014

Fig. 2 Average time for radiologists to interpret bone cases with fractures (fx) before (early) and after (late) a long day of reading cases. The x-axis shows total viewing time (s); cases are grouped as moderate fx, subtle fx, and no fx

Conclusion

There are numerous aspects to consider when judging the type of display to use for interpreting medical images in order to have the optimal quality needed to efficiently render accurate diagnostic decisions. In today's clinical environment, grayscale displays are still the norm for radiology, but color displays are increasingly being used as manufacturers develop high-performance medical-grade color displays for radiology and for other clinical specialties, like pathology, that use color images. It seems quite likely that in the near future color monitors will be the norm even in radiology as the underlying technology improves further. Challenges still exist, such as the development of a standardized method for calibrating displays for accurate and consistent image presentation, but these will likely be solved in the next few years.

Displays are also very likely to keep changing. Tablets, cell phones, and other mobile devices are increasingly being used to capture, transmit, and display medical images, although there are clearly challenges associated with these displays, including but not limited to presentation size, the impact of ambient lighting, and the ability to navigate easily. Some images are more amenable to being viewed on these types of displays (e.g., photos of the skin) than others (e.g., radiographs), but clever viewing software and dedicated QA/QC metrics and criteria could make these options viable. In radiology there are apps that have been approved by the FDA for viewing radiologic images on mobile devices, with precise language regarding when and how they can and should be used. For example, they can readily be used for CT, MRI, and ultrasound images, since the resolution of these images is not as high as plain film or mammography. The apps have full zoom and pan capability so no details are lost during viewing, and they are Health Insurance Portability and Accountability Act (HIPAA) compliant, with encryption tools incorporated into the software to protect patient information. There are built-in QA measures as well, including a tool that can measure ambient light conditions and ensure that the environmental lighting is adequate to accommodate a diagnosis.

No matter what direction displays take in the future for medical imaging, there will always be the need to carefully match the display capabilities to those of the human visual system and to maintain a solid QA/QC program that monitors key display properties during setup and over the lifetime of the device.



Further Reading

Andriole KP, Ruckdeschel TG, Flynn MJ, Hangiandreou NJ, Jones AK, Krupinski E, Seibert JA, Shepard SJ, Walz-Flannigan A, Mian TA, Pollack MS, Wyatt M (2012) ACR-AAPM-SIIM practice guideline for digital radiography. J Digit Imaging 26:26–37

Badano A, Revie C, Casertano A, Cheng WC, Green P, Kimpe T, Krupinski E, Sisson C, Skrovseth S, Treanor D, Boynton P, Clunie D, Flynn MJ, Heki T, Hewitt S, Homma H, Masia A, Matsui T, Nagy B, Nishibori M, Penczek, Schopf T, Yagi Y, Yokoi H (2014) Consistency and standardization of color in medical imaging: a consensus report. J Digit Imaging. doi:10.1007/s10278-014-9721-0

Beutel J, Kundel HL, Van Metter RL (2000) Handbook of medical imaging, vol 1, Physics and psychophysics. SPIE Press, Bellingham

Branstetter BF, Rubin DL, Weiss DL (2009) Practical imaging informatics. Springer, New York

Digital Imaging and Communications in Medicine Grayscale Standard Display Function (2006) http://medical.nema.org/dicom/2006/06_14pu.pdf

Kagadis GC, Langer SG (2012) Informatics in medical imaging. CRC Press, New York

Kanal KM, Krupinski E, Berns EA, Geiser WR, Karellas A, Mainiero MB, Martin MC, Patel SB, Rubin DL, Shepard JD, Siegel EL, Wolfman JA, Mian TA, Mahoney MC, Wyatt M (2013) ACR-AAPM-SIIM practice guideline for determinants of image quality in digital mammography. J Digit Imaging 26:10–25

Kim Y, Horii SC (2000) Handbook of medical imaging, vol 3, Display and PACS. SPIE Press, Bellingham

Krupinski EA, Berbaum KS, Caldwell RT, Schartz KM, Kim J (2010) Long radiology workdays reduce detection and accommodation accuracy. J Am Coll Radiol 7:698–704

Norweck JT, Seibert JA, Andriole KP, Clunie DA, Curran BH, Flynn MJ, Krupinski E, Lieto RP, Peck DJ, Mian TA (2013) ACR-AAPM-SIIM technical standard for electronic practice of medical imaging. J Digit Imaging 26:38–52

Pak HS, Edison KE, Whited JD (2008) Teledermatology: a user's guide. Cambridge University Press, New York

Reiner BI, Krupinski E (2012) The insidious problem of fatigue in medical imaging practice. J Digit Imaging 25:3–6

Samei E, Krupinski E (2010) The handbook of medical image perception and techniques. Cambridge University Press, New York

Samei E, Badano A, Chakraborty D, Compton K, Cornelius C, Corrigan K, Flynn MJ, Hemminger B, Hangiandreou N, Johnson J, Moxley-Stevens DM, Pavlicek W, Roehrig H, Rutz L, Shapard J, Uzenoff RA, Wang J, Willis CE, AAPM TG-18 (2005) Assessment of display performance for medical imaging systems: executive summary of AAPM TG18 report. Med Phys 32:1205–1225



Image Acquisition

James E. Adams Jr.* (Eastman Kodak Company, Rochester, NY, USA)
Bruce H. Pillman (Harris Corporation, Rochester, NY, USA)

Abstract

A high-level overview of image formation in a digital camera is presented. The discussion includes hardware and software issues, highlighting the impact of hardware characteristics and image processing algorithms on the resulting images.

Introduction

Display of digital images is supported by a deep understanding of how those images are created. This chapter provides a foundation for understanding image creation by digital cameras, with emphasis on the artifacts that can be produced in the process. As such, the material will be an aid in distinguishing digital camera image artifacts from display image artifacts. This chapter deals primarily with digital still cameras, with section "Video Capture" discussing how video sequence capture differs from still capture. Most of the discussion applies equally to still and video capture. In addition, the technology of digital still cameras and video cameras is converging, further reducing differences. This chapter focuses on camera hardware and the subsequent processing algorithms.

Optical Imaging Path

Figure 1 is an exploded diagram of a typical digital camera optical imaging path. The taking lens forms an image of the scene on the surface of the substrate in the sensor. Antialiasing (AA) and infrared (IR) cutoff filters prevent unwanted spatial and spectral scene components from being imaged. A cover glass protects the imaging surface of the sensor from dust and other environmental contaminants. The sensor converts the incident radiation into photocharges, which are subsequently digitized and stored as raw image data. Each of these components is discussed in detail below.

Taking Lens

There are standard considerations when designing or selecting a taking lens that translate directly from film cameras to digital cameras, e.g., lens aberrations and optical material characteristics. Due to constraints introduced by the physical design of the individual pixels that compose the sensor, digital camera taking lenses must address additional concerns in order to achieve acceptable imaging results. It is these digital camera-specific considerations that will be discussed below.

James E. Adams Jr. has retired.
*Email: [email protected]


Fig. 1 Optical image path of a digital camera (taking lens, AA/IR filters, cover glass, sensor)

Fig. 2 Image space telecentricity

Image Space Telecentricity

Due to the optical characteristics of the sensor, a key requirement of high-quality digital camera taking lenses is telecentricity (Watanabe and Nayar 1995). More explicitly, image space telecentricity is a necessary or, at least, a highly desirable feature. Figure 2 illustrates this condition. An object in the scene to the left of a simple thin lens is imaged on the right. This thin lens is represented by its (coincident) principal planes passing through the middle of the lens. An aperture stop is placed at the front focal plane of the lens. By definition, any ray now passing through the center of the aperture stop (through the front focal point, F) will emerge from the lens parallel to the optical axis. Optically, this is known as having the exit pupil at infinity. (The exit pupil is the image of the aperture stop formed by the optics on the image side of the aperture.) This is the necessary condition of image space telecentricity.

Three ray bundles are shown in Fig. 2. One ray bundle originates at the base of the object on the optical axis. The lens produces a corresponding ray bundle at the base of the image on the optical axis, with the image ray bundle being essentially normal to the image plane. A second ray bundle originates from halfway up the object. This ray bundle is imaged by the lens at the halfway point of the (inverted) image with a ray bundle that is also normal to the image plane. Finally, a third ray bundle originates from the top of the object and is imaged at the top of the (inverted) image with a ray bundle also normal to the image plane. Inspection of the figure also shows that the ray bundles originating from the object are not normal to the object plane. Hence, in this situation telecentricity occurs in image space and not object space. Placing additional optics in front of the aperture as shown in Fig. 3 will change the system magnification and focal length but not the telecentricity condition.

The advantage of having an image space telecentric taking lens is that the cone of rays incident on each pixel is the same in size and orientation regardless of location within the image. As a result lens falloff, i.e., the darkening of image corners relative to the center of the image, is avoided. Even classic cos⁴ falloff is eliminated in the ideal case of perfect image space telecentricity.
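For a feel of the numbers, the sketch below (Python/NumPy; the focal length and sensor half-diagonal are illustrative assumptions, not values from the chapter) evaluates the classic cos⁴ law for a simple non-telecentric lens.

```python
import numpy as np

focal_mm = 4.0        # short focal length typical of a phone camera module
half_diag_mm = 3.0    # half-diagonal of a small sensor (illustrative)

r = np.linspace(0.0, half_diag_mm, 7)   # image-height samples
theta = np.arctan(r / focal_mm)         # chief-ray angle for a simple lens
falloff = np.cos(theta) ** 4            # classic cos^4 relative illumination

for ri, fi in zip(r, falloff):
    print(f"image height {ri:.1f} mm: relative illumination {fi:.2f}")
# With an image space telecentric lens the ray cone at each pixel is
# identical, so this term drops out (relative illumination stays at 1.0).
```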


Fig. 3 Image space telecentricity, two-lens system (focal point F2)

With all its evident advantages, perfect image space telecentricity is frequently impractical to achieve. To maintain constant illumination over the entire image plane, the size of the rear element of the telecentric lens may become rather large. The design constraint of having the exit pupil at infinity means there is one fewer degree of freedom for addressing other lens issues, such as aberrations. To offset this, the number of optical elements in a telecentric lens tends to be greater than that of the equivalent conventional lens, making telecentric lenses generally more expensive. Sometimes in the lens design process, the location where the aperture stop needs to be placed is inaccessible (e.g., inside a lens element) or impractical given other physical constraints of the system. In the low-cost imaging environment of consumer and mobile phone digital cameras, such cost and spatial footprint issues can become severe. As a consequence, perfect image space telecentricity in digital cameras is usually sacrificed, either partially or completely. The Four Thirds standard incorporates lenses that are "near telecentric" for DSLR cameras (Four thirds standard). Though there are some minor image quality consequences of being only "nearly" telecentric, the engineering compromises are reasonable, as befits a high-quality imaging system. In the case of low-end digital cameras, especially mobile phone cameras, the telecentric condition may be dismissed completely or addressed marginally. These latter cameras can freely exhibit lens falloff, which may be considered acceptable at the price point of the camera.

Point Spread Function Considerations

The basic imaging element of a digital camera sensor is a pixel, which for the present discussion can be considered to be a light bucket. The size of the photosensitive area of a pixel has a direct impact on the design goals of the taking lens. In Fig. 4 three pixels are shown with three different sizes of point spread functions (PSF). The PSF is the image created by the taking lens of a point of light in the scene. Generally speaking, the higher the quality of the taking lens, the smaller the PSF. Degrees of freedom that affect the size of the PSF at the pixel are lens aberrations, including defocus; the size, shape, and placement of the aperture stop; the focal length of the lens; and the wavelength of light. These will be discussed in greater detail below. What follows is a greatly simplified discussion of incoherent (white) light imaging theory in order to introduce some concepts relevant to display. For a more detailed development of this topic, the reader is directed to Gaskill (1978) and Goodman (1968).

In Fig. 4 the pixel is partitioned into a photosensitive region (gray) and a nonphotosensitive region (white). The latter refers to the region containing metal wires, light shields, and other nonimaging components. The quantity b in Fig. 4a is related to the fill factor of the pixel. Assuming a square pixel and a square photosensitive region, the fill factor can be defined as b², with b being relative to the full pixel width (Guidash and Lee 1999). The range of the fill factor is from zero (no photosensitive area) to unity (the entire pixel is photosensitive).

Fig. 4 Pixels and point spread functions (small, large, and oversized PSFs relative to the photosensitive region)

The width of the PSF in Fig. 4a, d is also given with respect to the full pixel width. As can be seen in Fig. 4a, the PSF easily fits inside the photosensitive region of the pixel. In Fig. 4b, the PSF is as large as possible while still fitting entirely in the photosensitive region. Recalling that the pixel is being treated as a light bucket, there is no difference in optical efficiency between Fig. 4a, b. Assuming each PSF is of the same input point of light, the same amount of energy is collected in both cases, resulting in the same number of photocharges being generated. In Fig. 4c the PSF is clearly larger than the photosensitive area of the pixel, resulting in a loss of optical efficiency.

From the foregoing analysis, the smaller the PSF, the better the quality of the image, both in terms of light sensitivity (signal-to-noise) and spatial fidelity (resolution). However, there are significant costs associated with maintaining a small PSF relative to the photosensitive area of the pixel. Today's digital cameras generally have rather small pixels with low fill factors, resulting in target sizes for the PSF that are challengingly small. A professional DSLR camera may have pixel sizes on the order of 5 μm, while consumer mobile phone cameras may have pixel sizes on the order of 1.4 μm. Fill factors are generally around 0.5. While the professional DSLR market will support the cost of taking lenses with sufficient size and image quality to meet the larger pixel PSF requirements, seemingly everything is working against consumer digital cameras. Consumer devices are under constant pressure to be reduced in size and to cost less to manufacture. In the case of taking lenses, this almost invariably leads to a reduction in lens elements, which in turn reduces the degrees of freedom available to the lens designer to control the size of the PSF. Making one or more of the lens surfaces aspheric restores some degrees of freedom, but at the price of higher manufacturing costs. As a consequence, it is not unusual for low-end consumer digital cameras to have PSFs that span two or three pixels. From the foregoing analysis this clearly results in significant losses of both signal-to-noise and spatial resolution. Some of these liabilities can be partially offset by other components in the optical chain to be discussed below, but the size of the PSF relative to the pixel sets a significant limit on what the optical system can and cannot achieve.
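The light-bucket picture lends itself to a quick numeric check. The following sketch (Python/NumPy) estimates the fraction of PSF energy collected by the photosensitive region; the Gaussian PSF shape and the pixel dimensions are illustrative assumptions, as the chapter does not specify a PSF model.

```python
import numpy as np

pitch = 1.4e-6       # pixel pitch (m), a small mobile-camera pixel
b = np.sqrt(0.5)     # fill factor 0.5 -> photosite width is b * pitch
sigma = 0.6e-6       # assumed Gaussian PSF standard deviation (m)

# Sample the PSF on a grid spanning +/- 2 pixel pitches and normalize it.
n = 401
x = np.linspace(-2 * pitch, 2 * pitch, n)
X, Y = np.meshgrid(x, x)
psf = np.exp(-(X**2 + Y**2) / (2 * sigma**2))
psf /= psf.sum()

# Energy falling inside the square photosensitive region of one pixel.
inside = (np.abs(X) <= b * pitch / 2) & (np.abs(Y) <= b * pitch / 2)
print(f"optical efficiency ~ {psf[inside].sum():.2f}")  # ~0.35 here
```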

Antialiasing Filter

The digital camera sensor consists of a rectilinear grid of pixels. As such it senses a sampled version of the image formed by the taking lens. A well-known consequence of such signal sampling is aliasing (Gaskill 1978). Though it can take on a number of manifestations, aliasing is usually associated in an image with distortions within a higher-frequency region, as shown in Fig. 5. In Fig. 5, the original image is of a chirped-frequency sinusoidal function of the form A cos(2πfr²), where A is an amplitude scalar, f is a frequency scalar, and r is the radial coordinate. As such, the only location where low-frequency circles should occur is near r = 0 at the center of the image. The repeated low-frequency circles throughout the rest of the image are the result of aliasing.
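The zone-plate target and its aliasing are easy to reproduce. Below is a short sketch (Python/NumPy; the constants A, f, and the sizes are illustrative) that builds A cos(2πfr²), subsamples it without band-limiting, and applies a crude box-filter stand-in for an antialiasing filter for comparison.

```python
import numpy as np

N, A, f = 512, 1.0, 4e-5
y, x = np.mgrid[0:N, 0:N].astype(float)
r2 = (x - N / 2) ** 2 + (y - N / 2) ** 2
zone = A * np.cos(2 * np.pi * f * r2)      # chirped zone-plate test image

# Sampling without band-limiting: keep every 4th pixel. High radial
# frequencies fold back and reappear as spurious circles away from r = 0.
aliased = zone[::4, ::4]

# Crude "antialiasing filter": 4x4 box average before subsampling.
box = zone.reshape(N // 4, 4, N // 4, 4).mean(axis=(1, 3))
print(zone.shape, aliased.shape, box.shape)
```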



Fig. 5 Example of aliasing in an image

There are two fundamental ways to reduce aliasing. The first is to increase the sampling rate by decreasing the pixel pitch. In the present case this would amount to exchanging the sensor for one with smaller pixels. Unfortunately, this sensor substitution comes with a number of problems. First, as discussed in section "Point Spread Function Considerations," shrinking the size of the pixel generally requires the PSF to also shrink in size to maintain image fidelity. Second, it will require more of the smaller pixels to cover the same physical sensor area than it did the larger pixels. This results in a larger amount of pixel data, which requires increased compute resources to process and store. Both of these situations raise the cost of the imaging system, making an increased sampling rate an expensive solution to aliasing. As a result, a second way to reduce aliasing is usually taken: bandlimiting the captured image.

Through the use of an antialiasing filter (Palum 2009), the image is optically low-pass filtered to eliminate the higher frequencies that produce aliasing. One of the most common antialiasing filters is the four-spot birefringent antialiasing filter. There are a number of different ways this filter can be constructed; Fig. 6 is an exploded view of one of the more common configurations. A pencil of light from the object is first split into two pencils of light by a thin slab of birefringent crystal, typically quartz. This splitting is accomplished by the birefringent crystal having a different index of refraction depending on the polarization of the incoming light and its orientation with respect to the optical axis of the crystal material. In the case of the four-spot filter being described, an unpolarized pencil of light splits into two polarized pencils of light. These polarized pencils then pass through a retarder plate (another crystal slab) that effectively depolarizes the two pencils of light. Finally, a second splitter plate splits each of the incoming two pencils of light into two (polarized) pencils of light. In order to achieve the desired antialiasing effect, the thicknesses of the three crystal plates are adjusted so that the four spots are separated horizontally and vertically by the pixel pitch of the sensor. (This statement will be revisited when color filter arrays are discussed in section "Color Filter Array.") Alternatively, one could think of the four-spot pattern as being the size of one pixel with a fill factor of unity.

Fig. 6 Four-spot birefringent antialiasing filter (object → splitter → retarder → splitter → sensor)

Unfortunately, the use of an antialiasing filter comes at the price of a low-pass filtering (softening) of the image projected onto the sensor. This loss of image fidelity is usually acceptable in consumer imaging applications, but can be anathema to the professional DSLR photographer. In the latter case the only solution available is to try to compose the scene and the capture conditions to minimize the most grievous aliasing, e.g., changing the camera distance slightly so that the weave in the model's clothing is not at one of the worst aliasing frequencies. It should be noted that for low-end imaging applications, the cost of including an antialiasing filter in the camera can become objectionable. Since the image quality requirements are more relaxed, it is possible to simply use a taking lens that produces a lower-quality (blurrier) image in the first place. While this may be a cost-effective solution, it imposes a hard upper bound on the possible image quality that the imaging system can produce. Still, for the low-cost solution, a lower-quality taking lens will itself generally be lower in cost and possibly smaller in size, which is usually another plus in this end of the market. The principle of antialiasing remains the same regardless: eliminate the higher spatial frequencies that will mix together once the image is sampled by the sensor.

Infrared Cutoff Filter

For all but the most specialized applications, digital cameras are used to capture images as they appear to the human visual system (HVS). As such, their wavelength sensitivity needs to be confined to that of the HVS, i.e., roughly from 400 to 700 nm (Hunt 1987). The main photosensitive element of the digital camera imaging system is its silicon sensor which, unfortunately, does not match the photometric sensitivity of the HVS very well, as shown in Fig. 7. The most grievous difference appears in the near-infrared region of the spectrum. In order to address this, an infrared cutoff filter or "IR cut filter" is incorporated into the digital camera optical path. This filter usually consists of a multilayer thin film coating. Figure 1 shows the IR cut filter coated onto the antialiasing filter, though other optical surfaces in the optical path can also act as the substrate. Figure 8 shows typical IR cut filter and glass substrate spectral responses. It is noted that the glass substrate itself tends to be opaque in the near ultraviolet, so both stopbands (the wavelength regions where light is blocked) are addressed by the filter/substrate package.

The key characteristics of the IR cut filter's spectral response are its cutoff wavelength and the sharpness of its cutoff. Foreshadowing the color filter array discussion of section "Color Filter Array," the color-sensing capability of the camera can be strongly distorted by the IR cut filter. In Fig. 9, the solid lines are the spectral responses of the camera color channels without an IR cut filter (Kodak KAI). The dashed lines include the effects of adding an IR cut filter. The blue and green responses are left largely unchanged in their primary regions of spectral sensitivity. However, the red response from approximately 650 to 700 nm has been significantly suppressed, if not eliminated outright. This leads to two opposing issues: color accuracy and signal-to-noise. In terms of color accuracy, the HVS is highly sensitive to color variation in the orange-red-magenta region of color space.

Infrared Cutoff Filter For all but the most specialized applications, digital cameras are used to capture images as they appear to the human visual system (HVS). As such, their wavelength sensitivity needs to be confined to that of the HVS, i.e., roughly from 400 to 700 nm (Hunt 1987). The main photosensitive element of the digital camera imaging system is its silicon sensor which, unfortunately, does not match the photometric sensitivity of the HVS very well, as shown in Fig. 7. The most grievous difference appears in the nearinfrared region of the spectrum. In order to address this, an infrared cutoff filter or “IR cut filter” is incorporated into the digital camera optical path. This filter usually consists of a multilayer thin film coating. Figure 1 shows the IR cut filter coated onto the antialiasing filter, though other optical surfaces in the optical path can also act as the substrate. Figure 8 shows typical IR cut filter and glass substrate spectral responses. It is noted that the glass substrate itself tends to be opaque in the near ultraviolet, so that both stopbands (the wavelength regions where the light is blocked) are addressed by the filter/substrate package. The key characteristics of the IR cut filter’s spectral response are its cutoff wavelength and the sharpness of its cutoff. Foreshadowing the color filter array discussion of section “Color Filter Array,” the color-sensing capability of the camera can be strongly distorted by the IR cut filter. In Fig. 9, the solid lines are the spectral responses of the camera color channels without an IR cut filter (Kodak KAI). The dashed lines include the effects of adding an IR cut filter. The blue and green responses are left largely unchanged in their primary regions of spectral sensitivity. However, the red response from approximately 650 to 700 nm has been significantly suppressed, if not eliminated outright. This leads to two opposing issues: color accuracy and signal-to-noise. In terms of color accuracy, the HVS is highly sensitive to color Page 6 of 32

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_170-1 # Springer-Verlag Berlin Heidelberg 2015 1 0.9 peak normalized response

0.8

HVS silicon

0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 300

400

500

600

700

800

900

1000

1100

λ (nm)

Fig. 7 Peak normalized photometric sensitivities of the HVS and silicon

Fig. 8 Peak normalized photometric sensitivities of the IR cut filter, silicon, and their product (IR cut × silicon)

If the IR cut filter cutoff wavelength is too high and lets too much long-wavelength energy through (especially beyond 700 nm), color names can change very quickly. For example, the flames of a bonfire can turn from orange to magenta. Sunset colors can be equally distorted. If the IR cut filter cutoff wavelength is too low, the system can become starved for red signal, and the corresponding signal gain needed to balance the red channel with the more photosensitive green and blue channels can lead to noticeable, if not unacceptable, noise amplification. Ideally, the proper IR cut filter response is the one that produces three color channels that are linear combinations of the HVS's color-matching functions (Hunt 1987). By sensing color in the same way as the HVS (to within a simple linear transform), maximum color accuracy is achieved. Of course, the camera's color channel spectral sensitivities are more strongly determined by the color filters in the color filter array. But a poorly realized IR cut filter can significantly distort these responses.

Fig. 9 Relative color photometric sensitivities of the camera color channels, with and without an IR cut filter (λ from 400 to 700 nm)

Since there are significant manufacturing limitations on producing the ideal spectral sensitivities in the color filters, the IR cut filter is usually used as the most "tunable" parameter in the digital camera's optical chain to compensate, at least to the degree possible, for any inaccuracies in the color filter spectral responsivities.

In terms of controlling the cutoff response of the digital camera's IR cut filter, the usual considerations with multilayer thin film coatings apply. The sharper the desired cutoff response, the more layers will generally be required. Each additional thin film layer complicates the manufacturing process and adds to its expense. The choice of coating materials for durability and ease of manufacturing also significantly impacts the cost and complexity of the multilayer stack. Recalling the discussion of section "Image Space Telecentricity," it is important that the angle of incidence be controlled when using a thin film stack. The spectral response (such as the cutoff wavelength) will shift toward the blue end of the spectrum as the angle of the incident light moves away from the normal of the filter stack. A telecentric taking lens will achieve this control, but departures from telecentricity can lead to changes in color spectral sensitivity as a function of distance from the center of the image.

There are simpler and less expensive filter technologies that can be used for producing IR cut filters. Perhaps the simplest is a filter made from IR absorbing glass (Pecoraro and Shelestak 1988). These glasses absorb near-infrared radiation and reradiate it as heat. A popular choice in some low-end consumer cameras, these glasses are characterized by more gradual cutoff responses, as shown in Fig. 10. As discussed above, these more gradual cutoffs can produce small amounts of IR leakage which, in turn, may produce color distortion. However, as with other engineering tradeoffs, the use of absorbing glass IR cut filters may be acceptable for certain imaging applications. As with the other engineering considerations previously discussed, the professional DSLR market will tend to support the more expensive multilayer IR cut filter construction, with additional thin film layers to achieve a sharper cutoff at more precisely located cutoff wavelengths. Lower-end consumer cameras will retreat from this position and strive only for more basic color control, e.g., preventing bonfires from turning magenta while letting other colors in the scene drift in fidelity.

From a display perspective, the accuracy of the color reproduction, and in particular significant color failures, can provide clues to the capturing camera's IR cut filter pedigree. Such clues are most easily revealed in scenes with illuminants carrying significant near-infrared energy, e.g., direct sunlight, firelight, and tungsten (incandescent) light.

Fig. 10 Peak normalized photometric sensitivities of multilayer and absorbing IR cut filters

The reflectance of objects in the scene in the near infrared will also strongly influence the detection of undesirable IR cut filter designs; e.g., some flowers, such as blue morning glories, gentians, and ageratums, are notorious for anomalous reproduction due to their high reflectances in the near infrared (Why a color may not reproduce correctly). Significant departures from visual colors can indicate the nature of an IR cut filter's design.

Color Filter Array

Perhaps the most distinctive and iconic element of the digital camera is the color filter array or CFA (Adams et al. 1998). The CFA is a mosaic of pixel-sized color filters that are arranged over the surface of the sensor to enable the detection of full-color information from a single sensor. As indicated in Fig. 1, each pixel has its own color filter. These filters are usually fabricated from colored dyes, pigments, or photoresists that are deposited onto the sensor surface. Sometimes a given color filter will consist of a single layer of colorant, and other times it may be composed of two or more layers of different colorants. The layers of colorants are usually thin enough that their effects on the direction and location of incident light can be safely ignored.

The colors chosen for the CFA usually compose one of two sets: red, green, and blue (RGB) or cyan, magenta, and yellow (CMY). Either set can be augmented with the inclusion of a noncolored pixel, usually denoted panchromatic (P) or white (W); therefore, one will also find RGBP and CMYW CFAs. It is also possible to create hybrid color CFAs from these colors, such as CMYG. The decision to use RGB versus CMY is usually based on signal-to-noise issues. A simple theoretical analysis can suggest the overall signal-to-noise differences between RGB and CMY systems. In Fig. 11 the idealized block spectral sensitivities of these CFA colors are shown. As indicated by the plots, it is assumed that the area-normalized CMY signals are Y = (R + G)/2, C = (G + B)/2, and M = (R + B)/2. Therefore, the CMY system is twice as light sensitive as the RGB system. However, the final image of the system is almost always required to be an RGB image, so a color rotation of the captured CMY image must occur, as shown in Eq. 1:

$$\begin{pmatrix} R' \\ G' \\ B' \end{pmatrix} = \begin{pmatrix} -1 & 1 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{pmatrix} \begin{pmatrix} C \\ M \\ Y \end{pmatrix} \qquad (1)$$

Fig. 11 Idealized block spectral sensitivities of the R, G, B and Y, C, M channels (each plotted against λ)

If it is assumed that the noise in each of the RGB color channels is independent and that its variance $\sigma^2_{RGB}$ is the same for all colors, then the variance of the noise in the CMY color channels is $\sigma^2_{CMY} = 2\sigma^2_{RGB}/4 = \sigma^2_{RGB}/2$. While this is a significant improvement in signal-to-noise performance, the improvement is lost in the color rotation operation: $\sigma^2_{R'G'B'} = 3\sigma^2_{CMY} = 3\sigma^2_{RGB}/2$. Therefore, in this idealized case the CMY system ends up having a lower signal-to-noise than the RGB system. However, with more realistic spectral sensitivities and noise characteristics, it is possible for CMY systems to have signal-to-noise responses comparable to RGB systems. The point is that there is no "free lunch" with respect to the set of colors chosen for the CFA. Large improvements in signal-to-noise capability over RGB are not automatically achieved through the use of CMY color systems.

Once the CFA color set is chosen, the arrangement of the colors across the sensor needs to be determined. The CFA is usually composed of a minimum repeating pattern or MRP that is tessellated over the surface of the sensor. The MRP must tile the sensor surface completely, with no gaps or overlaps, given the underlying pixel sampling grid. This underlying pixel sampling grid is almost always rectilinear, though hexagonal sampling grids have been investigated (Ochi 1984). Because of the usual rectilinear pixel sampling grid, the MRP is almost always rectangular, though this is not the only possible shape. One of the simplest possible MRPs is a 2 × 2 square of four colored pixels, as shown in Fig. 12. This MRP, known as the Bayer pattern, is the most popular MRP in use in digital cameras today (Bayer 1976). When tessellated over the surface of the sensor, the green pixels are sampled in a checkerboard manner, and the red and blue pixels are sampled over rectilinear grids with a pixel pitch of two in both the horizontal and vertical directions, as shown in Fig. 13.

A commonly used intuitive way of representing the frequency responses and potential aliasing patterns of an MRP is to draw the associated Nyquist diagram. Deriving this diagram from first principles, Fig. 14 shows the locations of the centers of the repeated frequency components for a sensor with no CFA pattern. Components appear at integer frequency coordinates (m, n), with m, n ∈ ℤ, in cycles/sample. The central square represents the boundary halfway between the fundamental at the origin and the nearest frequency sidebands. As such, this square describes the Nyquist frequency of the sensor. If the frequency spectrum of the fundamental is bandlimited (restricted) to be entirely within the square, then aliasing is avoided. Figures 15 and 16 show the corresponding frequency component location plots for the green and red/blue channels of the Bayer MRP and their Nyquist frequencies.
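The signal-to-noise bookkeeping around Eq. 1 above is easy to verify numerically. The following minimal sketch (Python/NumPy, not from the chapter) draws independent per-channel noise at the idealized CMY level of σ²_RGB/2 and applies the Eq. 1 rotation:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2_rgb = 1.0
# Each measured CMY channel carries independent noise of variance
# sigma^2_RGB / 2 (the idealized claim in the text).
cmy_noise = rng.normal(0.0, np.sqrt(sigma2_rgb / 2), size=(3, 1_000_000))

eq1 = np.array([[-1.0, 1.0, 1.0],
                [1.0, -1.0, 1.0],
                [1.0, 1.0, -1.0]])     # CMY -> R'G'B' rotation of Eq. 1

rgb_noise = eq1 @ cmy_noise
print(cmy_noise.var(axis=1))   # ~[0.5 0.5 0.5] = sigma^2_RGB / 2
print(rgb_noise.var(axis=1))   # ~[1.5 1.5 1.5] = 3 * sigma^2_CMY, worse than 1.0
```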

Fig. 12 Bayer CFA minimum repeating pattern:
G R
B G

Fig. 13 Bayer pattern tessellated:
G R G R
B G B G
G R G R
B G B G

Fig. 14 Sensor repeated frequency component locations (frequency axes ξ and η)

Amalgamating Figs. 14, 15, and 16, the resulting Nyquist diagram of Fig. 17 compares the frequency responses of the associated sampling patterns. It can be seen that along the horizontal and vertical frequency axes, the green channel has the same Nyquist frequency as the sensor without a CFA pattern. It is for this reason that the four-spot antialiasing filter is usually designed for a pixel pitch of unity in both horizontal and vertical directions, as described in section "Antialiasing Filter." From the Nyquist diagram it can also be seen that this strategy will break down for diagonally oriented edges, as the green channel Nyquist frequency pattern takes on a diamond shape that misses the corners of the sensor's Nyquist frequency square.

Fig. 15 Bayer CFA green channel repeated frequency component locations (frequency axes ξ and η)

Fig. 16 Bayer CFA red and blue channel repeated frequency component locations (frequency axes ξ and η)

The square corresponding to the red and blue channel Nyquist frequency pattern is clearly smaller than the green channel and sensor patterns. These aliasing effects are demonstrated in Fig. 18. The effects of exceeding the Nyquist frequency of the green channel can be seen in the green-magenta patterns in the corners of the figure. The effects of exceeding the Nyquist frequency of the red and blue channels can be seen in the cyan-yellow patterns at the middle of the edges of the figure. With the Bayer CFA MRP, the designer of the antialiasing filter has to make a compromise. If the antialiasing filter is designed for the green channel, it will permit aliasing in the red and blue channels. However, an antialiasing filter designed for the red and blue channels will discard valid green channel high-frequency information. The usual solution is to design for the green channel, i.e., a one-pixel pitch, and to address the subsequent red and blue aliasing with sophisticated image processing operations after the capture.
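To connect the diagram to actual sampling grids, the sketch below (Python/NumPy; the helper function is illustrative) builds the tessellated Bayer mosaic of Fig. 13 and shows the per-channel sampling densities behind the Nyquist limits just described.

```python
import numpy as np

def bayer_mask(rows: int, cols: int):
    """Boolean masks (R, G, B) for the G R / B G Bayer tessellation."""
    y, x = np.mgrid[0:rows, 0:cols]
    g = (x + y) % 2 == 0              # green checkerboard
    r = (y % 2 == 0) & (x % 2 == 1)   # red on a pitch-2 rectilinear grid
    b = (y % 2 == 1) & (x % 2 == 0)   # blue on the complementary pitch-2 grid
    return r, g, b

r, g, b = bayer_mask(4, 4)
print(g.mean(), r.mean(), b.mean())   # 0.5, 0.25, 0.25 of the pixels
# Green is sampled at every row/column position on-axis, matching the full
# sensor Nyquist limit there; red and blue are sampled at pitch 2, which
# halves their Nyquist frequency, as in Fig. 17.
```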

Fig. 17 Bayer CFA Nyquist frequency diagram, overlaying the sensor, green channel, and red and blue channel limits. Frequency axes (ξ, η) are in cycles/sample, with marks at ±1/2

Fig. 18 Example of aliasing due to the Bayer CFA pattern

Fig. 19 Idealized pixel designs: (a) front-illuminated pixel; (b) back-illuminated pixel


Image Sensor Operation Artifacts

In present digital cameras, all image sensors use integrating sensing technology: each sensing pixel accumulates light-induced photocharge for a finite exposure time and is then read out. Most sensors convert 20–60 % of the photons incident on the silicon to photocharge. The signal output from the sensor is usually linear with the accumulated charge. Image sensors are commonly grouped, based on their fabrication processes, into charge-coupled devices (CCD) and complementary metal oxide semiconductor (CMOS) sensors. While design variations and terminology continue to develop, the key difference between these groups is that CCD sensors transport charge from each active pixel to an output gate, where the charge is converted to a measurable signal. CMOS, or active pixel, sensors convert charge to voltage within each pixel and send that signal to the sensor output. Most sensors are illuminated from the front side, but backside-illuminated sensors are becoming more common. The literature on sensor development and technology is rich; recent books discussing the field include Holst and Lomheim (2007), Nakamura (2005), and Ohta (2007). The presentation here is at a very high level, providing only enough detail to help explain some of the common artifacts caused by sensor limitations. These artifacts are discussed in section "Sensor Artifacts."

Sensor Artifacts

Having presented a high-level flow of operation for the sensors, characteristic artifacts will be discussed next. Many of these artifacts may be used to help identify the specific camera or camera technology used to form an image. Digital camera image processing chains, discussed in the following chapter, will normally mitigate these artifacts, but the correction may not be perfect, and concealment artifacts can be introduced in the image.

Sensor Defects

Defects are those pixels whose response is abnormal enough that their data are discarded after readout rather than used as image data. There are many causes of defects: high dark current arising from an impurity in the silicon crystal, a flaw in a filter or lenslet, a surface flaw on the sensor (dirt or a scratch), an electrical fault, or even a flaw on the sensor cover glass. While manufacturers strive to minimize defects, they do occur and impact the yield of image sensors. In order to improve yield, limited quantities of certain types of defects are generally accepted in sensors. Details about what defects are acceptable are generally part of the specification negotiated between a sensor supplier and a camera manufacturer. Regardless of the details of the specification, defects are routinely concealed when processing images from cameras, though some defects are more difficult to conceal than others. The following discussion describes a number of classes of defects.

Isolated single-pixel defects are common; most sensors include dozens or even hundreds of these, and they are easily concealed in the processing path. A flaw in a vertical shift register can introduce a single-column defect, in which most or all of the pixels in a column are defective. This kind of defect is more difficult to conceal than an isolated pixel, but some camera processing paths support concealment of a few isolated column defects in order to improve image sensor yield and lower the cost of the image sensor. Because CMOS sensors address pixels with row and column selection, row defects are possible as well as column defects. Defects on the surface of the sensor, such as dirt, often affect several or many pixels in a cluster. Sensors with these cluster defects are usually rejected during manufacturing, though some cameras have implemented defect concealment for small clusters of pixels. In some CMOS designs, multiple pixels (usually two or four) share circuitry for conversion of charge to voltage, so a failure in the shared circuitry will produce a cluster of defective pixels (Guidash 2003).
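As a toy illustration of concealing the simplest class above (isolated single-pixel defects), the following sketch (Python/NumPy) replaces a mapped defect with the median of its same-color Bayer neighbors two pixels away. The function and defect map are illustrative, not any camera vendor's algorithm; a real defect map would come from sensor or camera manufacture, as described.

```python
import numpy as np

def conceal_defects(raw: np.ndarray, defects: list[tuple[int, int]]) -> np.ndarray:
    """Replace each mapped defect with the median of same-color neighbors."""
    out = raw.astype(np.float32).copy()
    h, w = raw.shape
    for (y, x) in defects:
        neighbors = [out[y + dy, x + dx]
                     for dy, dx in ((-2, 0), (2, 0), (0, -2), (0, 2))
                     if 0 <= y + dy < h and 0 <= x + dx < w]
        out[y, x] = np.median(neighbors)   # stays on the same Bayer plane
    return out

raw = np.full((6, 6), 100, dtype=np.uint16)
raw[3, 3] = 4095                             # a stuck-high pixel
print(conceal_defects(raw, [(3, 3)])[3, 3])  # -> 100.0
```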


Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_170-1 # Springer-Verlag Berlin Heidelberg 2015

Fig. 20 Example layout for a multiple output interline CCD

While most point defects in a sensor are present at manufacture and thus can be mapped during sensor or camera manufacture, some defects accumulate during the life of the sensor. Bright point defects can be caused by cosmic ray damage to the sensor, particularly when airborne, such as when being shipped or carried overseas (McColgin et al. 2007).

A flaw, such as dirt, on the cover glass of the sensor will produce a local variation in sensitivity that depends on the f-number of the taking lens, the distance from the exit pupil of the lens to the sensor, and the distance from the cover glass to the surface of the sensor. Generally, the shadow of the flaw cast onto the sensor subtends the same angle as the cone of light rays coming from the exit pupil of the lens. At lower f-numbers, the flaw will produce modest gain variations over a relatively large area, while at higher f-numbers, it will produce a more compact region of more significant gain variations. Dirt on the cover glass is common in cameras with interchangeable lenses, as dust can easily get onto the cover glass of the image sensor. Manufacturers of cameras with interchangeable lenses have introduced a variety of systems for cleaning the sensors and also for mapping and concealing the artifacts from dirt.

Dark Current

Dark current is signal accumulated within each pixel even in the absence of light. It arises because electrons are generated by thermal effects as well as by photon collection. The current is relatively constant, so the charge collected during integration depends linearly on integration time; dark current also varies exponentially with temperature (Nakamura 2005). Because dark correction must be precise over a wide range of temperatures and integration times, light-shielded (also known as dark or optical black) pixels are placed around the periphery of the image sensor to provide accurate data on the dark current accumulated during each exposure. Figure 20 shows the layout for a CCD with two outputs; the dark pixels are shown on the left and right sides of the sensor as well as the top and bottom. With small sensors, temperature variations across the sensor are usually insignificant, but larger sensors can have measurable variation in dark signal due to temperature nonuniformity.
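A hedged sketch of how the optical black pixels might be used follows: estimate a per-row dark floor from light-shielded columns and subtract it from the active area. The column count and layout are assumptions for illustration; real sensors also use dark rows and more careful statistics.

```python
import numpy as np

def subtract_dark_floor(raw, n_dark_cols=16):
    """Estimate a per-row dark floor from the light-shielded columns on
    the left and right edges of the frame and subtract it from the
    active pixels. A per-row estimate tracks the row-wise integration
    time and temperature differences described above; averaging many
    dark pixels per row keeps the estimate's own noise low.
    """
    raw = raw.astype(np.float64)
    dark = np.hstack([raw[:, :n_dark_cols], raw[:, -n_dark_cols:]])
    per_row_dark = dark.mean(axis=1, keepdims=True)
    return raw[:, n_dark_cols:-n_dark_cols] - per_row_dark
```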


In addition to temperature nonuniformities, some readout modes vary integration time slightly from row to row, with the last rows read out of the sensor being integrated longer than the first. This occurs most often with full-frame CCDs, since the exposure is usually controlled by performing a global reset of the CCD (draining all accumulated charge), opening the optical shutter for a time, then closing the optical shutter and reading out the CCD. The last rows read out have integrated dark current longer than the first rows read out, even though the optical shutter provides a global shutter for light collection. Depending on the dark current and the readout timing, there may be measurable differences in noise from the first rows read out to the last, because shot noise occurs with dark signal as well as with photocharge.

Fixed Patterns

The dark floor of a captured image has two main contributing factors: dark signal, which is dark current integrated over the exposure time, and offset in the analog signal chain. Fixed pattern noise likewise has two main components: variation in the dark floor and variation in gain or sensitivity. Variation in dark signal is primarily driven by slight changes in dark current from pixel to pixel. With short exposure times and low dark current levels, the variation in dark signal is usually insignificant. Long exposure times can bring the dark signal up high enough to be visible, and summing or averaging multiple exposures can bring down temporal noise, making the fixed pattern noise more visible.

Fixed pattern noise is usually more significant and complex in CMOS sensors than in CCDs because of the more complex circuitry in the array. In addition to more complex pixels, it is common to have signal variation for each row and for each column. The changes in dark-signal fixed pattern with row or column are usually more significant than gain changes. Because these structured patterns are highly objectionable, they are usually corrected to subthreshold levels during processing.

Variation in sensitivity is most often caused by subtle nonuniformity in color filter arrays and lenslets, as well as interactions between the sensor and the taking lens. These variations usually have a very low spatial frequency, though large sensors fabricated in several blocks, such as those described in Manoury et al. (2008) and Meynants et al. (2004), may have abrupt step features at block boundaries. Routine processing rarely compensates precisely for very gradual fixed offset or gain patterns, though partial correction is becoming more common.

Crosstalk

Most sensors suffer from crosstalk between pixels, where photons that should be collected in one pixel are collected in a neighboring one. Some crosstalk is optical, where light that passes through one lenslet and CFA filter ends up being collected in an adjacent pixel. As Fig. 19a shows in a simplified way, the region between the CFA and the photosensitive pixel area is formed from layers of silicon and metal, with many opportunities for reflections. Crosstalk is also caused by charge diffusion, especially for longer-wavelength light. Longer-wavelength light penetrates farther into silicon before being absorbed and converted to charge, and some photons are absorbed in the substrate rather than in the photosensitive pixel; the charge generated in the substrate can then diffuse to a nearby pixel. In the presence of a CFA, crosstalk presents several artifacts.
For example, in the Bayer CFA, the green channel is composed of two classes of pixels. Half of the green pixels have horizontal neighbors that are blue and vertical neighbors that are red; the other half have the converse, with horizontal neighbors that are red and vertical neighbors that are blue. Because of this layout, crosstalk makes the effective spectral sensitivity of each red (or blue) pixel combine with a portion of the sensitivity of neighboring green pixels. Another artifact is nonuniformity in the response of pixels within a color channel, based on their neighbors. Crosstalk in most sensors is asymmetric – either mostly vertical or mostly horizontal – due to asymmetry in the pixel layout. In interline CCD sensors, vertical crosstalk is usually greater because of the vertical shift registers separating the photosensitive pixel areas. Thus, half of the green pixels have a spectral sensitivity with more of a red contribution, while the other half has more of a blue contribution. This change in sensitivity is usually a small percentage of the total sensitivity, but the highly patterned nature of the artifact can cause trouble for demosaicing. Any filter array pattern that places pixels within varying neighborhoods is vulnerable to this kind of pattern problem.

Fig. 21 Example Bayer CFA binning pattern (a) Red (b) Green (c) Blue

Video Capture

For some time in the development of digital cameras, the somewhat different needs of camcorders and digital still cameras kept the technologies distinct. Digital video cameras (DVCs) were developed using relatively small sensors with relatively low resolution, optimized for capturing just enough data for video under a wide range of illumination conditions. Digital still cameras (DSCs) were developed to capture the highest-resolution images practical, usually using sensors with a larger area and electronic flash. More recently, the technology has been converging: nearly all digital cameras are expected to acquire both stills and video. The differentiation between DSCs, DVCs, and other devices such as mobile phone cameras lies in the relative priority given to cost, compactness, image quality, and so forth. Some current DVCs may use a sensor with roughly twice as many pixels as are provided in the output video stream, such as a five-megapixel sensor and a two-megapixel (1920 × 1080) video stream. Most current DSCs use higher-resolution sensors, such as a 14-megapixel sensor, while still providing a video stream limited to two megapixels. While the sensor resolution can vary, both DVCs and DSCs require the ability to efficiently read a relatively low-resolution image from the sensor at video frame rates, such as 30 or 60 frames per second.

Key in this convergence has been the development of flexible binning readout architectures, allowing good-quality low-resolution images to be read out from a high-resolution sensor. This was alluded to previously when discussing sensor readout schemes but deserves more detail here. The need for good-quality video from still sensors has been driven by two features: storing digital video files and providing electronic live view functionality, either on a display or in an electronic viewfinder with an eyepiece. Both of these features demand a rapidly updated image with relatively low resolution.


Fig. 22 Nominal flow for a digital camera processing chain: the raw CFA image passes through camera corrections, stochastic noise reduction, and exposure and white balance adjustment (yielding an adjusted CFA image), then demosaicing (yielding an interpolated RGB image), color noise reduction, color correction, tone scale and gamma correction, and edge enhancement (yielding the finished image), followed by compression and storage formatting to produce the stored image

Binning

A preferred, and now more common, approach to lower resolution is binning. In binning, charge from multiple pixels is combined before conversion to voltage, allowing a single-pixel readout to represent the sum of multiple pixels on the sensor. This improves the signal-to-noise ratio for the summed pixel and also provides a great reduction in aliasing. Successful binning becomes more complex with a color filter array; the binning architecture and the CFA pattern must complement each other to provide the proper color response for binned pixels. Figure 21 illustrates a simple 2 × 2 binning pattern, showing each of the three color channels separately to reduce clutter in a single figure. The diagonal lines joining the colored dots indicate which pixels are binned; the intersection of the colored lines conveniently shows the effective center of each binned pixel. The summing operation acts like a convolution with a sparse 2 × 2 kernel, applying an antialiasing filter as part of the subsampling. The reader may notice that only half of the green pixels are being binned and read out; binning and reading the other half of the green pixels is one of many additional options possible when designing binning and readout patterns for a sensor with a CFA. Other binning patterns are also possible, including 2 × 1, 1 × 2, and 3 × 3 patterns. Binning in CCDs is controlled by the design of shift registers and timing patterns, so some CCDs offer a number of binning options. Binning in CMOS sensors is controlled by specific interconnection hardware, so it tends to be less flexible.
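The following sketch models 2 × 2 same-color binning on a Bayer image numerically; on a real sensor, the four charges are summed in the charge domain before readout. For simplicity it bins all of the green pixels, unlike the half-green pattern of Fig. 21, so the layout assumptions are illustrative rather than a particular sensor's readout scheme.

```python
import numpy as np

def bin_bayer_2x2(raw):
    """Numerical model of 2x2 same-color binning on a Bayer CFA image.
    Same-color pixels sit two pixels apart in a Bayer pattern, so each
    binned sample sums a sparse 2x2 set of samples spanning a 3x3 pixel
    footprint. The output is a half-resolution image that preserves the
    Bayer arrangement.
    """
    h, w = raw.shape
    h -= h % 4
    w -= w % 4
    raw = raw[:h, :w].astype(np.uint32)        # headroom for the sums
    binned = np.empty((h // 2, w // 2), dtype=np.uint32)
    for dr in (0, 1):                          # row phase in the Bayer quartet
        for dc in (0, 1):                      # column phase in the Bayer quartet
            plane = raw[dr::2, dc::2]          # one CFA color plane
            binned[dr::2, dc::2] = (plane[0::2, 0::2] + plane[0::2, 1::2]
                                    + plane[1::2, 0::2] + plane[1::2, 1::2])
    return binned
```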

Exposure Control

In addition to the spatial sampling differences between full-resolution readout and low-resolution readout, the temporal sampling of video places constraints on exposure time. Conventional video readout requires the integration time to be less than the frame time. Conversely, in the presence of motion, it is important for the integration time to be fairly close to the frame time. Excessively short exposures cause an artifact sometimes known as judder, where individual frames seem to present discontinuous motion; this is the temporal equivalent of a small pixel fill factor. When integration times approaching the frame time are used, moving objects have a motion blur in each frame that is perceptually (and physically) consistent with the displacement from frame to frame. A common rule of thumb is that integration times of roughly 50 % or more of the frame time limit temporal artifacts to a practical level. Animated videos specifically add motion blur that is consistent with frame displacements in order to present more convincing motion. An example from the field of stop-motion animation is Brostow and Essa (2001), in which a sequence of frames is analyzed for motion displacements, and each frame is blurred with a locally varying PSF based on the displacements.

Nominal Image Processing Chain

To provide an order for the discussion of routine camera image processing, the processing steps are presented in the sequence shown in Fig. 22. The ordering shown in the figure is reasonable, though in practice steps are often moved, combined or split up, and applied in several locations in the chain. In Fig. 22, ellipses represent the image at key points of interest along the chain, while rectangles represent processing blocks. The blocks in this chain will be discussed in the following sections.

One reason for split or redundant operations is the tendency for noise to be amplified during the processing chain. White balancing, color correction, tone and gamma correction, and edge enhancement are all operations that tend to increase noise, or at least increase its visibility. Because noise is amplified during processing, it is common for several operations to take steps to reduce noise or at least limit its amplification. The chain is also often altered to meet different expectations for image quality, performance, and cost. For example, color correction usually works more accurately when performed on linear data, but lower-cost image chains will often apply gamma correction fairly early in the processing chain to reduce bit depth and then apply color correction on the gamma-corrected data. In general, image processing chain design is fairly complex, with tradeoffs in the use of cache, buffer memory, computing operations, image quality, and flexibility. This chapter discusses some of the more common processing chain operations, but the reader is advised to consult Adams and Hamilton (2008) for further discussion of the design of camera processing chains.

Raw CFA Image

The first image is the raw image as read from the sensor through the analog signal-processing chain. It is a single-channel image in which different pixels sense different colors through a color filter array, as discussed in the previous chapter.

Stochastic Noise Reduction

In most digital camera image processing chains, noise reduction is a critical operation. The main problem is that compact digital cameras usually operate with limited signal and significant noise. As mentioned at the beginning of section “Nominal Image Processing Chain,” noise reduction is often addressed in several places in the processing chain. All noise reduction operations seek to preserve as much scene information as possible while smoothing noise. To achieve this efficiently, it is important to use relatively simple models to discriminate between scene modulation and noise modulation. Before the demosaicing step in Fig. 22, it is somewhat difficult to exploit interchannel correlations for noise reduction, so in the stochastic noise reduction block, grayscale noise reduction techniques are usually applied to each color channel individually. There are many possible approaches, but two families of filtering, based on different models of the image capture process, are discussed here.

The first of these models represents the random noise in the sensor capture and is used for range-based filtering, such as in a sigma filter (Lee 1983) or the range component of a bilateral filter (Tomasi and Manduchi 1998). Early in a digital camera processing chain, a fairly simple model for noise variance is effective. There are two primary sources of random noise in the capture chain. The first is Poisson-distributed noise associated with the random process of photons being absorbed and converted to charge within a pixel. The second is electronic read noise, modeled with a Gaussian distribution. These two processes are independent, so a pixel value $Q$ may be modeled as $Q = k_Q(q + g)$, where $k_Q$ is the amplifier gain, $q$ is a Poisson random variable with mean $\mu_q$ and variance $\sigma_q^2$, and $g$ is a Gaussian random variable with mean $\mu_g$ and variance $\sigma_g^2$. Because $q$ is a Poisson variable, $\sigma_q^2 = \mu_q$, and the total variance for a digitized pixel $Q$ is

$$\sigma_Q^2 = k_Q^2\left(\mu_q + \sigma_g^2\right), \qquad (2)$$

where $\mu_q$ is the mean original signal level (captured photocharge) and $\sigma_g^2$ is the read noise. This relationship – that signal variance has a simple linear relationship with code value plus a positive offset – allows a very compact parameterization of $\sigma_Q^2$ based on a limited number of tests to characterize capture noise. Noise reduction based on the noise levels in captured images allows smoothing of modulations with a high probability of being noise while not smoothing over (larger) modulations that have a low probability of being noise.

Range-based filtering can introduce two main artifacts in the processed image. The most obvious is loss of fine texture: since the noise reduction is based on smoothing small modulations and retaining large ones, textures and edges with low contrast tend to get oversmoothed. The second artifact is the tendency to switch abruptly from smoothing to preservation as modulation gets larger. This results in a very nonuniform appearance in textured fields or edges, with portions of the texture being smoothed and other portions being much sharper; in some cases, this can lead to contouring artifacts as well.

The second simple model is an approximation of the capture PSF. Any point in the scene is spread over a finite area on the sensor in a spatially bandlimited capture. Thus, single-pixel outliers, or impulses, are more likely to be caused by sensor noise than by scene content. While range-based filtering handles signal-dependent noise fairly well, it is prone to leaving outliers unfiltered, and it tends to increase the kurtosis of the distribution, since it smooths small modulations more than larger ones. The likelihood that impulses are noise leads to the use of impulse-filtering noise reduction. A standard center-weighted median filter can be effective, especially with lower-cost cameras whose PSF is large enough to guarantee that any scene detail will be spread over several pixels in the capture, preventing it from appearing as an impulse. More sophisticated approaches may be used for cameras with smaller or variable point spread functions, such as digital SLR cameras. The characteristic artifact caused by impulse filtering is elimination of small details from the scene, especially specular reflections from eyes and small lights. When applying impulse filters to CFA data, the filtering is particularly vulnerable to creating colored highlights if an impulse is filtered out of one or two color channels but left in the remaining channel(s).
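To make the range-based approach concrete, here is a minimal sigma-filter sketch driven by the Eq. 2 noise model, with the variance parameterized linearly as var(Q) ≈ aQ + b. The calibration constants a and b, the window radius, and the threshold k are assumed inputs, not values from any particular camera.

```python
import numpy as np

def sigma_filter(channel, a, b, radius=2, k=2.0):
    """Range-based (sigma-style) smoothing driven by the Eq. 2 noise
    model: var(Q) ~= a*Q + b, where a and b come from a capture-noise
    calibration (both assumed known here).

    Window pixels within k standard deviations of the center value are
    averaged; larger modulations are treated as scene detail and kept.
    """
    src = channel.astype(np.float64)
    out = src.copy()
    h, w = src.shape
    for r in range(radius, h - radius):
        for c in range(radius, w - radius):
            center = src[r, c]
            sigma = np.sqrt(a * max(center, 0.0) + b)
            window = src[r - radius:r + radius + 1,
                         c - radius:c + radius + 1]
            close = np.abs(window - center) <= k * sigma
            out[r, c] = window[close].mean()
    return out
```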

Exposure and White Balance Correction

The human visual system automatically adapts in complex ways when viewing scenes with different illumination. Research in the areas of color appearance models and color constancy continues to focus on developing models for how different scenes appear to human observers under different conditions. Because the human visual system generally has nonlinear responses and operates over a wide range of conditions, this process is extremely complex. A more restricted form of the problem is normally addressed in digital cameras: the goal is to capture neutral scene content with equal responses in all color channels (R = G = B), with the midtones rendered near the middle of the tone scale, regardless of the illuminant or content of the scene.


Fig. 23 Example color image reconstruction from Bayer pattern (a) Bayer CFA image (b) full-resolution CFA interpolated color image

Fig. 24 Demosaicing algorithm comparisons (a) Original (ground truth) image (b) Bayer CFA image (c) nearest neighbor interpolation (d) bilinear interpolation (e) DLMMSE (Zhang and Wu 2005)

White balance adjustment is accomplished by multiplying the pixels in each color channel by a different gain factor that compensates for a non-neutral camera response and illuminant imbalance. In a digital camera, exposure adjustment is done primarily by controlling exposure time, analog gain, and f-number, but sometimes a digital exposure adjustment is used as well; this is done by scaling all three gain factors by a common factor. Applying the gain factors to the CFA data before demosaicing may be preferred, since some demosaicing algorithms presume equal responses for the different color channels.

The other part of the exposure and white balance task is estimation or selection of appropriate gain factors to correct for illumination imbalance. Knowledge of the illuminant (and the camera’s response to it) is critical. The camera’s response to typical illuminants, such as daylight, incandescent, and fluorescent, is easily stored in the camera. Illuminant identification can be done manually through a user interface, which then drives selection of the stored gains for the illuminant; this is generally quite simple for the camera but more tedious for the user. Another common approach is to allow the user to capture an image of a neutral (gray) target under the scene illuminant and have the camera compute gain factors from that image. The classic and most ill-posed form of the estimation problem is to analyze the image data and estimate the gains automatically, responding to the illuminant regardless of the scene content. A classic difficult example is a featureless flat image with a reddish cast: is it a gray card under incandescent illumination, or a reddish sheet of paper under daylight? Fortunately, normal scenes contain more information than a flat field. Current cameras approach this estimation problem with different algorithms having different responses to scene content and illuminants. Camera manufacturers usually have somewhat different preferences – for example, biasing white balance to render images warmer or cooler – as well as different approaches to estimating the scene illuminant.

Most automatic white balance and exposure algorithms are based on some extension of the gray world model, which holds that images of many different scenes will average out to 18 % gray (a midtone gray). Unfortunately, this says very little about a specific image, and the algorithm must work well for individual images. Most extensions of the gray world model try to discount large areas of single colors, to avoid having the balance driven one way or another by red buildings, blue skies, or green foliage. There is also a tendency to weight colors closer to neutral more heavily than colors far from neutral, but this complicates the algorithm, since the notion of neutral is not well defined before illuminant estimation has been performed. This can be approached by applying a calibrated daylight balance to the image for analysis, then analyzing the colors in the resulting image (Gindele et al. 2007). Another extension is to consider several possible illuminant classes and estimate the probability of each being the actual scene illuminant (Miyano and Shimizu 1997; Finlayson et al. 2001). Sometimes highlight colors are given special consideration, based on the theory that highlights are specular reflections that take the color of the illuminant (Lee 1986; McCann et al. 1976; Miyano 1997); this breaks down for scenes that have no truly specular highlights. Some approaches also consider the color of particular scene content. The most common of these is using face detection and adjusting balance to provide a reasonable color for the face(s); this has its own challenges, because faces themselves vary in color. Using exposure control information such as scene brightness can also help with illuminant estimation. For example, a reddish scene at high illumination levels is more likely to be an outdoor scene near sunset, while the same scene with dim illumination is somewhat more likely to be under indoor illumination. Another example is flash information: if an image is captured primarily with flash illumination, then the illuminant is largely known.
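As a baseline for the estimation approaches above, a minimal gray world sketch is shown below; it simply equalizes channel means against green and omits all of the refinements just described (discounting large color areas, near-neutral weighting, illuminant classes, face color).

```python
import numpy as np

def gray_world_gains(r, g, b):
    """Simplest gray world estimate: choose gains that equalize the
    channel means, with green as the reference channel.
    Returns (r_gain, g_gain, b_gain).
    """
    return g.mean() / r.mean(), 1.0, g.mean() / b.mean()

# Applying the gains channel-wise performs the white balance:
# r_bal, g_bal, b_bal = r * r_gain, g * g_gain, b * b_gain
```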


Demosaicing

As discussed in section “Color Filter Array,” capturing color images using a digital camera requires sensing at least three colors at each pixel location. In order to reduce the cost and complexity of full-color image capture, most digital cameras are designed using a single monochrome CCD or CMOS image sensor with a color filter array (CFA) laid on top of the sensor (Li et al. 2008; Gunturk et al. 2005). The CFA is a set of color filters that samples only one color at each pixel location, and the missing colors are estimated using interpolation algorithms. These CFA interpolation algorithms are widely referred to as CFA demosaicing or demosaicing algorithms (Glotzbach et al. 2001; Kimmel 1999; Trussell and Hartwig 2002). Because the quality of the CFA-interpolated image largely depends on the accuracy of the demosaicing algorithm, a great deal of attention has been paid to the demosaicing problem (Fig. 23).

Although simple nonadaptive image-interpolation techniques (e.g., nearest neighbor, bilinear interpolation, etc.) can be used to interpolate the CFA image, demosaicing algorithms designed to exploit inter-pixel and interchannel correlations outperform nonadaptive interpolation algorithms, as illustrated in Fig. 24. The original (ground truth) and the corresponding Bayer CFA images are shown in Fig. 24a, b, respectively. Three different demosaicing algorithms, namely, (i) nearest neighbor interpolation (Gupta and Chen 2001), (ii) bilinear interpolation (Alleysson et al. 2005), and (iii) directional linear minimum mean square-error estimation (DLMMSE) (Zhang and Wu 2005), were applied to the CFA image, and the corresponding demosaiced color images are shown in Fig. 24c–e. Both the nearest neighbor and bilinear interpolation algorithms are nonadaptive in nature and, as a result, produce aliasing artifacts in high-frequency regions. DLMMSE, due to its adaptive design, reconstructs the color image almost perfectly. For more details on the design and performance of various adaptive and nonadaptive CFA demosaicing algorithms, see Lu and Tan (2003), Mukherjee et al. (2001), Hirakawa and Parks (2005), and Zhang et al. (2009).

Placement of the CFA on top of the image sensor is essentially a downsampling operation. Therefore, the overall quality of the color images (e.g., spatial resolution, color fidelity, etc.) produced by demosaicing algorithms depends not only on the accuracy of the demosaicing algorithm but also, significantly, on the underlying CFA layout. Careful selection of a CFA pattern and a corresponding demosaicing algorithm leads to high-quality color image reconstruction. Although the Bayer pattern is one of the most commonly used CFA patterns, many others, such as GGRB, RGBE, CYMM, and CYGM (Yamanaka 1977; Miao et al. 2004; Yamagami et al. 1994; Hamilton et al. 2001), have also been suggested for consumer digital cameras. Primarily influenced by manufacturing constraints and implementation costs, these CFA constructions and the corresponding demosaicing algorithms have been researched extensively. However, a systematic research framework for designing optimal CFA patterns is still a fairly new research direction. Some of the state-of-the-art developments in this area include the Kodak panchromatic CFA (Kumar et al. 2009; Pillman et al. 2010) and the second-generation CFA (Hirakawa and Wolfe 2007). A detailed review of these and other similar CFA patterns is beyond the scope of this chapter; readers are encouraged to refer to Moghadam et al. (2010), Bawolek et al. (1999), Lukac and Plataniotis (2005), Bean (2003), and Roddy et al. (2006) for more details.
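For reference, a sketch of the bilinear baseline (the nonadaptive algorithm compared in Fig. 24d) is given below, implemented as normalized convolution on each masked channel. An RGGB layout is assumed, and SciPy is used for the convolution; this is a minimal illustration, not an optimized camera implementation.

```python
import numpy as np
from scipy.ndimage import convolve

def demosaic_bilinear(cfa):
    """Bilinear demosaicing of a Bayer CFA image (RGGB layout assumed).
    Each channel is zero-filled where it was not sampled, then
    interpolated from its available samples by normalized convolution.
    """
    cfa = cfa.astype(np.float64)
    h, w = cfa.shape
    masks = {c: np.zeros((h, w)) for c in "RGB"}
    masks["R"][0::2, 0::2] = 1
    masks["G"][0::2, 1::2] = 1
    masks["G"][1::2, 0::2] = 1
    masks["B"][1::2, 1::2] = 1

    kernel = np.array([[1.0, 2.0, 1.0],
                       [2.0, 4.0, 2.0],
                       [1.0, 2.0, 1.0]])
    planes = []
    for c in "RGB":
        num = convolve(cfa * masks[c], kernel, mode="mirror")
        den = convolve(masks[c], kernel, mode="mirror")
        plane = num / den                    # average of nearby samples
        sampled = masks[c].astype(bool)
        plane[sampled] = cfa[sampled]        # keep original samples exact
        planes.append(plane)
    return np.stack(planes, axis=-1)         # H x W x 3 RGB image
```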

Color Noise Reduction

Section “Stochastic Noise Reduction” discussed noise reduction based on simple single-channel noise models. This section discusses the exploitation of interchannel correlation and knowledge of the human visual system for further noise reduction. As mentioned before, these concepts are easier to apply after demosaicing, though their application before and during demosaicing remains a research opportunity.


Once three fully populated color channels are present, it is straightforward to rotate the color image to a luma-chroma color space. Because color correction has not yet been performed, this luma-chroma space will typically not be colorimetrically accurate. Still, a simple uncalibrated rotation such as Eq. 3, similar to one proposed by Ohta et al. (1980), suffices to get most of the scene detail into the luma channel and most of the color information into the two chroma channels:

$$\begin{bmatrix} Y \\ C_1 \\ C_2 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} 1 & 2 & 1 \\ -1 & 2 & -1 \\ -2 & 0 & 2 \end{bmatrix}\begin{bmatrix} R \\ G \\ B \end{bmatrix}. \qquad (3)$$

Once separated, each channel is cleaned based on the sensitivity of the human visual system. In particular, the luma channel is cleaned carefully, trying to preserve as much sharpness and detail as possible while providing adequate noise reduction. Digital cameras operate over a wide range of gain values, and the optimum balance of noise reduction and texture preservation typically varies with the input noise level. The chroma channels are cleaned more aggressively for several reasons. One reason is that sharp edges in the chroma channels may be color interpolation or aliasing artifacts left over from demosaicing (Hamilton and Adams 2003). Another is that viewers are less sensitive to chroma edges but especially sensitive to colored noise; the overall quality of the image is improved by emphasizing smoothness of the chroma channels rather than sharpness. After noise reduction, the rotation is inverted, as in Eq. 4:

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1 & -1 & -1 \\ 1 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}\begin{bmatrix} Y \\ C_1 \\ C_2 \end{bmatrix}. \qquad (4)$$

Chroma-based noise reduction can produce two signature artifacts: color blobs and color bleed. Color blobs are caused by chroma noise that is smoothed and pushed to lower spatial frequencies in the process, without being eliminated. Successful cleaning at lower spatial frequencies requires relatively large filter kernels or iterative filtering; implementations with constraints on available memory or processing power tend to leave low-frequency noise unfiltered. Desktop implementations, with fewer constraints, can use pyramid (Adams et al. 2007) or wavelet decomposition (Chang et al. 2000) to reach the lowest frequencies. Chroma-based noise reduction is also prone to color bleed artifacts, caused by a substantial mismatch in edge sharpness between the luma and chroma channels. Adaptive techniques that smooth the chroma channels using edge detection, such as Adams and Hamilton (2003), reduce the color bleeding problem by avoiding smoothing across edges. The visibility of color bleed artifacts depends in part on the luma-chroma color space used for noise reduction: if the color space is less accurate in separating colorimetric luminance from chrominance data, color bleed artifacts will also affect the lightness of the final image, increasing their visibility. Adaptive chroma noise cleaning that can clean to very low frequencies while avoiding significant color bleeding is an open research question.

Color moiré patterns are sometimes addressed in a processing chain, often treated as a form of color noise. The usual approach is a variation of chroma noise reduction, including an additional test to check for high-frequency textures (Adams et al. 2004, 2005).
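The rotation of Eqs. 3 and 4 is compact in code; the sketch below expresses both matrices and verifies that they are exact inverses of each other.

```python
import numpy as np

# Eq. 3 forward rotation and Eq. 4 inverse.
RGB_TO_YCC = 0.25 * np.array([[ 1.0, 2.0,  1.0],
                              [-1.0, 2.0, -1.0],
                              [-2.0, 0.0,  2.0]])
YCC_TO_RGB = np.array([[1.0, -1.0, -1.0],
                       [1.0,  1.0,  0.0],
                       [1.0, -1.0,  1.0]])

def rotate_colors(img, matrix):
    """Apply a 3x3 color rotation to an H x W x 3 image."""
    return img @ matrix.T

# Sanity check: the two matrices are exact inverses of each other.
assert np.allclose(YCC_TO_RGB @ RGB_TO_YCC, np.eye(3))
```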


Color Correction

After suitable noise reduction, colors are corrected, converting them from (white-balanced) camera responses to a set of color primaries appropriate for the finished image. This is usually accomplished through multiplication with a color correction matrix, as in Eq. 5:

$$\begin{bmatrix} R_S \\ G_S \\ B_S \end{bmatrix} = \mathbf{C}\begin{bmatrix} R_C \\ G_C \\ B_C \end{bmatrix}. \qquad (5)$$

In this equation, C is a 3 × 3 matrix with coefficients determined to convert from the camera’s white-balanced native sensitivity to a standard set of color primaries, such as sRGB (Anderson et al. 1995), ROMM RGB (Reference output medium metric RGB 1999), or Adobe RGB (1998). One characteristic of color correction is the placement of colors in the finished image color space; for example, the color of blue sky, skin, and foliage in the rendered image can vary from manufacturer to manufacturer. Different camera designers usually choose different objectives when doing this primary conversion, which can be thought of as combining color correction with preferred color rendering. Considerations affecting the determination of the color matrix are discussed in more depth by Hunt (1987) and by Giorgianni and Madden (2008).

If a standard color space with a relatively small gamut, such as sRGB, is chosen, then many saturated colors will fall outside the gamut. Depending on the gamut mapping strategy chosen, this can introduce clipping or gamut mapping artifacts. The color correction matrix will usually amplify noise in the image; sometimes, color correction is deliberately desaturated to reduce the noise in the rendered image. If a more complex color correction transform is chosen, such as one that increases color saturation but preserves flesh tones at a less saturated position, then some nonuniform noise characteristics and even contouring may be observable.
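A minimal sketch of applying Eq. 5 follows. The example matrix values are invented for illustration (with rows summing to 1.0 so that neutrals remain neutral) and are not any camera's calibrated CCM.

```python
import numpy as np

def apply_color_correction(rgb, ccm):
    """Apply a 3x3 color correction matrix (Eq. 5) to a linear
    H x W x 3 image, clipping the result to the valid range."""
    corrected = rgb @ np.asarray(ccm, dtype=np.float64).T
    return np.clip(corrected, 0.0, 1.0)

# Hypothetical saturation-boosting matrix; each row sums to 1.0 so
# that neutral pixels (R = G = B) pass through unchanged.
example_ccm = [[ 1.70, -0.50, -0.20],
               [-0.30,  1.60, -0.30],
               [-0.10, -0.60,  1.70]]
```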

Tone Scale and Gamma Correction

After color correction, a tone scale is applied to convert the image, still linear with respect to scene exposure, to a final color space. This is sometimes referred to as gamma correction, though most processing chains apply additional contrast adjustment beyond simple gamma correction. Along with color correction, the choice of the tone scale for rendering a reproduction of a scene is complex; as with color correction, the issues involved in this selection are discussed in more depth by Hunt (1987) and Giorgianni and Madden (2008). The most common processing chains simply apply a tone scale to all pixels in the image as a lookup table operation, regardless of scene content. Consumer preference is usually for a higher-contrast tone scale, as long as no significant scene information is lost in the process. This is somewhat dependent on scene and user expectation; professional portraits are an example where the preference is for a lower-contrast look.

More recently, processing chains have been using adaptive tone scales that are adjusted for each scene. When such adjustments get very aggressive (bringing up shadow detail, compressing highlight range), image texture becomes unnatural. If the contrast in the shadows is stretched, noise is amplified, leading to image quality degradation; if the highlights are compressed, texture in the highlights is flattened, leading to a different quality degradation. The more sophisticated processing chains apply the adaptive tone scale with spatial processing. The approach is usually to use a multilevel decomposition to transform the image into a base image containing low spatial frequencies and a detail or texture image (Goodwin and Gallagher 1998; Gindele and Gallagher 2001). Once the image is decomposed into base and detail images, the tone scale adjustments are applied to the base image with no changes, or with controlled changes, in detail. When the decomposition into base and texture images is imperfect, this approach can cause halo effects near high-contrast edges where the base image is adjusted extensively. In extreme cases, the image is rendered with an artificial decoupling of base image and texture, looking more like an animation than a photographic image. Image decomposition and the derivation of spatially adaptive tone scales for optimal rendering of images are open areas of research (Bae et al. 2006; Durand and Dorsey 2002; Farbman et al. 2008).

Fig. 25 Example edge enhancement nonlinearities, plotted as output edge value (“Edge Out”) versus input edge value (“Edge In”): (a) soft thresholding (b) edge limiting
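As a concrete example of the global (non-adaptive) case, the sketch below builds a tone scale lookup table using the sRGB transfer function plus a simple contrast adjustment about mid-scale; the bit depths and the form of the contrast adjustment are illustrative assumptions.

```python
import numpy as np

def tone_lut(contrast=1.0, bits_in=12, bits_out=8):
    """Build a lookup table mapping linear code values to gamma-
    corrected output, using the sRGB transfer function plus an
    optional global contrast adjustment about mid-scale.
    """
    x = np.linspace(0.0, 1.0, 2 ** bits_in)
    # sRGB opto-electronic transfer function (linear toe, 2.4 exponent).
    y = np.where(x <= 0.0031308,
                 12.92 * x,
                 1.055 * np.power(x, 1.0 / 2.4) - 0.055)
    if contrast != 1.0:
        y = np.clip(0.5 + contrast * (y - 0.5), 0.0, 1.0)
    return np.round(y * (2 ** bits_out - 1)).astype(np.uint16)

# Applying the tone scale is then a single indexed lookup per pixel:
# out8 = tone_lut(contrast=1.1)[raw12]
```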

Edge Enhancement

During image capture, edge detail is lost through optical and other effects, and most processing operations, especially noise reduction, further reduce high-spatial-frequency content. Display devices, printers, and the human visual system also attenuate the system response. Edge enhancement, also known as sharpening, is a relatively simple spatial operation that improves the appearance of images, making them appear sharper and partially compensating for these losses. The core of routine edge enhancement is a convolution operation to obtain an edge image, which is then scaled and added to the original image, as in Eq. 6:

$$A' = A + kE. \qquad (6)$$

In this equation, $A'$ is the enhanced image, $A$ is the color-corrected image from the previous processing stage, $k$ is a scalar gain, and $E$ is the edge enhancement image. The key variations in the process lie in the creation of $E$. This is normally done with a standard spatial convolution, as in Eq. 7:

$$E = A * h. \qquad (7)$$

In this equation, $h$ is a high-pass convolution kernel. In other implementations, an unsharp mask formulation is chosen, as in Eq. 8:

$$E = A - A * b. \qquad (8)$$

In this equation, $b$ is a low-pass convolution kernel. If $h$ in Eq. 7 is chosen to be $h = I - b$, where $I$ is the identity (delta) kernel, then Eqs. 7 and 8 are identical. In both implementations, the design of the convolution kernel and the choice of $k$ are the main tuning parameters: the kernel controls which spatial frequencies are enhanced, while $k$ controls the magnitude of the enhancement. Often, the kernel is designed to produce a band-pass edge image, providing limited gain or even zero gain at the highest spatial frequencies. In most practical implementations, the size of the kernel is relatively small, with 5 × 5 being a common size.


Even with a band-pass kernel, this formulation amplifies noise and can produce significant halo or ringing artifacts at high-contrast edges. Both of these problems can be treated by applying a nonlinearity $L$ to the edge image before scaling and adding it to the original image, as in Eqs. 9 and 10:

$$E = L(A * h) \qquad (9)$$

$$E = L(A - A * b). \qquad (10)$$

Figure 25 shows two example nonlinearities, both based on the magnitude of the edge value. The edge values with the smallest magnitude are most likely to be the result of noise, while larger edge values are likely to come from scene edges. The soft thresholding function shown in Fig. 25a reduces noise amplification by reducing the magnitude of all edge values by a constant and is widely used for noise reduction, such as in Chang et al. (2000). Soft thresholding eliminates edge enhancement for small modulations while continuing to enhance larger modulations. The edge-limiting nonlinearity shown in Fig. 25b limits halo artifacts by capping the largest edge values, since high-contrast edges are the most likely to exhibit halos after edge enhancement.

Application of edge enhancement in an RGB color space will tend to amplify colored edges caused by any capture or processing artifacts earlier in the chain, as well as colored noise. For this reason, edge enhancement is often applied to the luma channel of an image rather than to all three color channels, for the same reasons that noise reduction is often applied in a luma-chroma color space. With more aggressive edge enhancement, different artifacts can be caused by the choice of color space. A carefully chosen luma-chroma space tends to minimize artifacts, though aggressive edge enhancement can still lead to color bleeding problems if luma edges are enhanced much more than chroma edges. The selection of a color space for edge enhancement can depend on several factors. For chains planning to apply JPEG compression in a YCbCr color space, the Y channel is a natural choice. In other cases, the green channel may be an acceptable luma channel – it is often the channel with the lowest noise and highest captured spatial resolution. This choice can produce artifacts with edges of certain colors, however, since some luma edges will not show up in the green channel.
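Combining Eqs. 8 and 10 with the Fig. 25 nonlinearities gives a compact sharpening sketch; the box blur, threshold, limit, and gain values below are illustrative choices, not recommended tunings.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def sharpen_luma(luma, k=0.8, threshold=2.0, limit=32.0):
    """Unsharp-mask edge enhancement (Eqs. 8 and 10) on a luma channel,
    using both Fig. 25 nonlinearities: soft thresholding to avoid
    amplifying noise and edge limiting to suppress halos.
    """
    luma = luma.astype(np.float64)
    blurred = uniform_filter(luma, size=5)    # low-pass image A*b
    edge = luma - blurred                     # E = A - A*b  (Eq. 8)
    # Soft threshold: shrink every edge value toward zero (Fig. 25a).
    edge = np.sign(edge) * np.maximum(np.abs(edge) - threshold, 0.0)
    # Edge limiting: cap the largest edge values (Fig. 25b).
    edge = np.clip(edge, -limit, limit)
    return luma + k * edge                    # A' = A + kE  (Eq. 6)
```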

Compression

Once an image is fully processed, it is often compressed to reduce the amount of physical storage space required to represent the image data. Image compression algorithms can be divided into two categories: lossless and lossy. Lossless image compression algorithms are reversible, meaning that the exact original image data can be recovered from the compressed image data; this characteristic limits the amount of compression that is possible. Lossy image compression algorithms allow some of the original image data to be discarded, so only an approximation to the original image is recovered from the compressed image data. Many image compression algorithms have been proposed both academically and commercially, but digital cameras predominantly utilize the JPEG image compression standard, and in particular a baseline implementation of the JPEG lossy image compression standard (Pennebaker and Mitchell 1993).

Video Capture

Processing for video capture is normally a simplified form of the processing used for still image capture. The same basic processing blocks listed in the nominal processing chain in Fig. 22 are still required, though many of them are simplified. The primary reason for this simplification is the time constraint on the operations. Processing for a still capture can take from a small fraction of a second to many seconds, depending on the size of the image and the sophistication of the processing. Even when a rapid sequence of still images is captured, the normal practice is to buffer the raw images in memory and process them after capture. In contrast, video capture is performed at a fixed frame rate, such as 30 frames per second (fps), for an indefinite period of time. This precludes the buffering approaches used with still capture and requires the processing path to keep up with the capture rate. One advantage for a video processing path is the limited size of the frames, normally two megapixels (1920 × 1080) or less, compared with five megapixels or much more for still capture. Further, because each image in the video is normally viewed only fleetingly, the image quality of each frame need not be as high.

On the other hand, there are several challenges not present with full-resolution processing. One is the problem of aliasing and colored edge artifacts. The use of binning to limit aliasing in low-resolution readout was discussed in the previous chapter. One side effect of binning is the readout of color pixels whose spatial relationship (nonoverlapping, partially overlapping) depends on the binning pattern. As a result, a video processing path must deal with raw data that can have more complex spatial relationships between pixels than is usual with full-resolution readout.

Several additional issues not relevant to a single image come to light when processing a sequence of images. One is the use of temporal noise reduction. Approaches such as Jain and Sethuraman (2008) and Kim et al. (2009) can provide significant benefit, though processing budgets usually constrain them to be relatively simple. In addition, temporal compression techniques, such as MPEG or H.264, are normally used with video, though some processing chains simply use lossy JPEG compression on individual frames (ISO/IEC 14496-2 1999; ISO/IEC 14496-10 2003). This greatly increases the bit rate for a given level of quality but saves a great deal of compression processing.

Another processing block of potential interest with video is exposure control. Because exposure control is essentially a feedback control loop, the response of exposure control and automatic white balance to changes in the scene can vary with camera model. The response of the camera to flickering illuminants can also be revealing, especially if a rolling shutter is employed. Bands caused by illuminant flicker interacting with a rolling shutter exposure are a form of gain artifact that is sometimes treated in the processing path. The standard flickering illuminants oscillate at twice the AC power line frequency. Because the AC frequency is one of two values, flicker detection can be approached as a classification problem, analyzing a video or preview stream for evidence of flicker-induced bands at either 120 or 100 Hz (Kaplinsky and Subbotin 2006; Poplin 2006). The most common processing path treatment is really avoidance of flicker bands, by using an exposure time that is an integer multiple of the flicker period. If flicker is determined to be a problem and not avoidable through selection of exposure time, it can be mitigated using gain corrections to correct the banding, such as in Baer (2007). The resulting variations in digital gain can lead to noise amplification in bands of the image; as with other gain corrections, suitable noise reduction algorithms allow mitigation of the noise, though usually with an increase in noise reduction artifacts.
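The flicker-avoidance rule is easy to express in code; the following sketch picks the longest flicker-safe exposure not exceeding the requested one, assuming the mains frequency is known or has been detected.

```python
def flicker_safe_exposure(requested_s, mains_hz=50):
    """Return the longest exposure, not exceeding the requested one,
    that is an integer multiple of the illuminant flicker period
    (flicker runs at twice the AC mains frequency). Falls back to the
    requested time when it is shorter than one flicker period.
    """
    flicker_period = 1.0 / (2.0 * mains_hz)   # 10 ms at 50 Hz mains
    n = int(requested_s / flicker_period)
    return n * flicker_period if n >= 1 else requested_s

# Example: 1/30 s requested under 50 Hz mains -> 0.030 s (three periods)
```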

Acknowledgments

The authors gratefully acknowledge many helpful discussions with Mrityunjay Kumar and Aaron Deever in the development and review of this chapter.


References

Adams Jr JE, Hamilton Jr JF (2003) Removing chroma noise from digital images by using variable shape pixel neighborhood regions. US Patent 6621937
Adams Jr JE, Hamilton Jr JF, Hamilton JA (2004) Removing color aliasing artifacts from color digital images. US Patent 6804392
Adams Jr JE, Hamilton Jr JF, Smith CM (2005) Reducing color aliasing artifacts from digital color images. US Patent 6927804
Adams Jr JE, Hamilton Jr JF, Williams FC (2007) Noise reduction in color digital images using pyramid decomposition. US Patent 7257271
Adams JE Jr, Hamilton JF Jr (2008) Digital camera image processing chain design. In: Single-sensor imaging: methods and applications for digital cameras, 1st edn. CRC, Boca Raton, pp 67–103
Adams J, Parulski K, Spaulding K (1998) Color processing in digital cameras. IEEE Micro 18:20–30
Adobe RGB (1998) Color image encoding. Technical report. http://www.adobe.com/adobergb. Adobe Systems, Inc
Alleysson D, Susstrunk S, Herault J (2005) Linear demosaicing inspired by the human visual system. IEEE Trans Image Process 14(4):439–449
Anderson M, Motta R, Chandrasekar S, Stokes M (1995) Proposal for a standard default color space for the internet: sRGB. In: Fourth IS&T/SID color imaging conference, Scottsdale, pp 238–245
Bae S, Paris S, Durand F (2006) Two-scale tone management for photographic look. ACM Trans Graph 25(3):637–645. doi:10.1145/1141911.1141935
Baer R (2007) Method and apparatus for removing flicker from images. US Patent 7298401
Bawolek E, Li Z, Smith R (1999) Magenta-white-yellow (MWY) color system for digital image sensor applications. US Patent 5914749
Bayer B (1976) Color imaging array. US Patent 3971065
Bean J (2003) Cyan-magenta-yellow-blue color filter array. US Patent 6628331
Brostow GJ, Essa I (2001) Image-based motion blur for stop motion animation. In: SIGGRAPH ’01: proceedings of the 28th annual conference on computer graphics and interactive techniques. ACM, New York, pp 561–566. doi:10.1145/383259.383325
Chang S, Yu B, Vetterli M (2000) Adaptive wavelet thresholding for image denoising and compression. IEEE Trans Image Process 9(9):1532–1546
Durand F, Dorsey J (2002) Fast bilateral filtering for the display of high-dynamic-range images. In: SIGGRAPH ’02: proceedings of the 29th annual conference on computer graphics and interactive techniques. ACM, New York, pp 257–266. doi:10.1145/566570.566574
Farbman Z, Fattal R, Lischinski D, Szeliski R (2008) Edge-preserving decompositions for multi-scale tone and detail manipulation. In: SIGGRAPH ’08: ACM SIGGRAPH 2008 papers. ACM, New York, pp 1–10. doi:10.1145/1399504.1360666
Finlayson G, Hordley S, Hubel P (2001) Color by correlation: a simple, unifying framework for color constancy. IEEE Trans Pattern Anal Mach Intell 23(11):1209–1221
Four thirds standard. http://www.four-thirds.org/en/fourthirds/index.html
Gaskill J (1978) Linear systems, Fourier transforms, and optics. Wiley, New York
Gindele EB, Gallagher AC (2001) Method for adjusting the tone scale of a digital image. US Patent 6275605
Gindele E, Adams Jr JE, Hamilton Jr JF, Pillman BH (2007) Method for automatic white balance of digital images. US Patent 7158174
Giorgianni EJ, Madden TE (2008) Digital color management encoding solutions, 2nd edn. Wiley, New York


Glotzbach J, Schafer R, Illgner K (2001) A method of color filter array interpolation with alias cancellation properties. Proc IEEE Int Conf Image Process 1:141–144
Goodman J (1968) Introduction to Fourier optics. McGraw-Hill, San Francisco
Goodwin RM, Gallagher A (1998) Method and apparatus for areas selective exposure adjustment. US Patent 5818975
Guidash RM (2003) Active pixel sensor with wired floating diffusions and shared amplifier. US Patent 6657665
Guidash RM, Lee PP (1999) Active pixel sensor with punch-through reset and cross-talk suppression. US Patent 5872371
Gunturk B, Glotzbach J, Altunbasak Y, Schafer R, Mersereau R (2005) Demosaicking: color filter array interpolation. IEEE Signal Process Mag 22(1):44–54
Gupta M, Chen T (2001) Vector color filter array demosaicing. Proc SPIE 4306:374–382
Hamilton Jr JF, Adams Jr JE (2003) Correcting for chrominance interpolation artifacts. US Patent 6542187
Hamilton Jr JF, Adams Jr JE, Orlicki D (2001) Particular pattern of pixels for a color filter array which is used to derive luminance and chrominance values. US Patent 6330029B1
Hirakawa K, Parks T (2005) Adaptive homogeneity-directed demosaicing algorithm. IEEE Trans Image Process 14(3):360–369
Hirakawa K, Wolfe P (2007) Spatio-spectral color filter array design for enhanced image fidelity. Proc ICIP 2:II-81–II-84
Holst GC, Lomheim TS (2007) CMOS/CCD sensors and camera systems. The International Society for Optical Engineering, Bellingham
Hunt R (1987) The reproduction of colour. Fountain Press, England
ISO/IEC 14496-10:2003 (2003) Information technology – coding of audio-visual objects – part 10: advanced video coding
ISO/IEC 14496-2:1999 (1999) Information technology – coding of audio-visual objects – part 2: visual
Jain C, Sethuraman S (2008) A low-complexity, motion-robust, spatio-temporally adaptive video de-noiser with in-loop noise estimation. Proc ICIP:557–560
Kaplinsky M, Subbotin I (2006) Method for mismatch detection between the frequency of illumination source and the duration of optical integration time for image with rolling shutter. US Patent 7142234
Kim YJ, Oh JJ, Choi BT, Choi SJ, Kim ET (2009) Robust noise reduction using a hierarchical motion compensation in noisy image sequences. Digital Image Forensics 1–2
Kimmel R (1999) Demosaicing: image reconstruction from color CCD samples. IEEE Trans Image Process 8(9):1221–1228
Kodak KAI-16000 image sensor. http://www.kodak.com/global/plugins/acrobat/en/business/ISS/datasheet/interline/KAI-16000LongSpec.pdf
Kumar M, Morales E, Adams Jr JE, Hao W (2009) New digital camera sensor architecture for low light imaging. Proc ICIP:2681–2684
Lee J (1983) Digital image smoothing and the sigma filter. Comput Vis Graph Image Process 24:255–269
Lee H (1986) Method for computing the scene-illuminant chromaticity from specular highlights. J Opt Soc Am A 3:1694–1699
Li X, Gunturk B, Zhang L (2008) Image demosaicing: a systematic survey. SPIE Conf VCIP 6822:68221J
Lu W, Tan YP (2003) Color filter array demosaicking: new method and performance measures. IEEE Trans Image Process 12(10):1194–1210
Lukac R, Plataniotis K (2005) Color filter arrays: design and performance analysis. IEEE Trans Consum Electron 51(4):1260–1267


Manoury EJ, Klaassens W, van Kuijk H, Meessen L, Kleimann A, Bogaart E, Peters I, Stoldt H, Koyuncu M, Bosiers J (2008) A 36 × 48 mm² 48M-pixel CCD imager for professional DSC applications, pp 1–4
McCann JJ, McKee SP, Taylor TH (1976) Quantitative studies in retinex theory. Vision Res 16:445–458
McColgin WC, Tivarus C, Swanson CC, Filo AJ (2007) Bright-pixel defects in irradiated CCD image sensors. Mater Res Soc Symp Proc 994:341
Meynants G, Scheffer D, Dierickx B, Alaerts A (2004) A 14-megapixel 36 × 24 mm² image sensor. Proc SPIE 5301:168–174. doi:10.1117/12.525339
Miao L, Qi H, Snyder W (2004) A generic method for generating multispectral filter arrays. Proc ICIP 5:3343–3346
Miyano T (1997) Auto white adjusting device. US Patent 5659357
Miyano T, Shimizu E (1997) Automatic white balance adjusting device. US Patent 5644358
Moghadam A, Aghagolzadeh M, Radha H, Kumar M (2010) Compressive demosaicing. Proc MMSP:105–110
Mukherjee J, Parthasarathi R, Goyal S (2001) Markov random field processing for color demosaicing. Pattern Recogn Lett 22(3–4):339–351
Nakamura J (ed) (2005) Image sensors and signal processing for digital still cameras (Optical science and engineering). CRC Press, Boca Raton
Ochi S (1984) Photosensor pattern of solid-state imaging sensors. US Patent 4441123
Ohta J (2007) Smart CMOS image sensors and applications (Optical science and engineering). CRC Press, Boca Raton
Ohta YI, Kanade T, Sakai T (1980) Color information for region segmentation. Comput Graph Image Process 13(3):222–241
Palum R (2009) Optical antialiasing filters. In: Lukac R (ed) Single-sensor imaging. CRC Press, Boca Raton
Pecoraro G, Shelestak L (1988) Transparent infrared absorbing glass and method of making. US Patent 4792536
Pennebaker WB, Mitchell JL (1993) JPEG still image data compression standard. Van Nostrand Reinhold, New York
Pillman B, Deever A, Kumar M (2010) Flexible readout image capture with a four-channel CFA. Proc ICIP:561–564
Poplin D (2006) An automatic flicker detection method for embedded camera systems. IEEE Trans Consum Electron 52(2):308–311
Reference output medium metric RGB color space (ROMM RGB) white paper (1999) Tech Rep Version 2.2, Accession Number 324122H, Eastman Kodak Company
Roddy J, Zolla R, Blish N, Horvath L (2006) Four color image sensing apparatus. US Patent 7057654
Tomasi C, Manduchi R (1998) Bilateral filtering for gray and color images. In: Proceedings of the 1998 IEEE international conference on computer vision, Bombay
Trussell H, Hartwig R (2002) Mathematics for demosaicking. IEEE Trans Image Process 11(4):485–492
Watanabe M, Nayar S (1995) Telecentric optics for computational vision. In: Proceedings of European conference on computer vision, Cambridge, UK, pp 439–451
Why a color may not reproduce correctly. http://www.kodak.com/global/en/professional/support/techPubs/e73/e73.pdf
Yamagami T, Sasaki T, Suga A (1994) Image signal processing apparatus having a color filter with offset luminance filter elements. US Patent 5323233
Yamanaka S (1977) Solid state camera. US Patent 4054906

Page 31 of 32

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_170-1 # Springer-Verlag Berlin Heidelberg 2015

Zhang L, Wu X (2005) Color demosaicking via directional linear minimum mean square-error estimation. IEEE Trans Image Process 14(12):2167–2178 Zhang F, Wu X, Yang X, Zhang W, Zhang L (2009) Robust color demosaicking with adaptation to varying spectral correlations. IEEE Trans Image Process 18(12):2706–2717

Page 32 of 32

Computer Image Generation

Jon Peddie

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
The Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
  The Geometry Creation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
  Shading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
  Rendering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Generating the Image: Hardware Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
  GPUs and Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Software Stack from Application to API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Abstract

The image generation stage in a computer graphics visual display device is the final section where the magic of beautiful pictures is created. It is where the user has the most interaction with the images and their generation. In the case of movies, it's passive but emotional. In the case of computers, it's interactive, such as playing a game. There are two types of image generation: the content creation work and the consumption of the content. In both cases, there is an unrelenting demand for high-quality, fast-response image generation.

J. Peddie (*)
Jon Peddie Research, Tiburon, CA, USA
e-mail: [email protected]
# Springer-Verlag Berlin Heidelberg 2015
J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_171-1

Synonyms and Definitions of Key Terms

Ambient occlusion: To create realistic shadowing around objects, developers use an effect called ambient occlusion (AO), sometimes called "poor man's ray tracing." AO can account for the occlusion of light, creating nonuniform shadows that add depth to the scene. Most commonly, games use screen space ambient occlusion (SSAO) for the rendering of AO effects. There are many variants, though all are based on early AO tech, and as such AO effects suffer from a lack of shadow definition and quality, resulting in a minimal increase in image quality (IQ) compared to the same scene without AO.

API: Acronym for application program interface. A series of functions (located in a specialized programming library) which allow an application to perform certain specialized tasks. In computer graphics, APIs are used to expose or access graphics hardware functionality in a uniform way (i.e., for a variety of graphics hardware devices) so that applications can be written to take advantage of that functionality without needing to completely understand the underlying graphics hardware, while maintaining some level of portability across diverse graphics hardware. Examples of these types of APIs include OpenGL and Microsoft's Direct3D. An API is a software program that interfaces an application (Word, Excel, a game, etc.) to the GPU as well as the CPU and operating system of the PC. The API informs the application of the resources available to it, which is called exposing the functionality. If a GPU or CPU has certain capabilities and the API doesn't expose them, then the application will not be able to take advantage of them. The leading graphics APIs are DirectX and OpenGL.

Aspect ratio: The ratio of length to height of computer and TV screens, video, film, or still images. Traditional TV screens are 4:3 aspect ratio; digital TVs have moved to widescreen, which is 16:9.

CPU: Acronym for central processing unit. In PC terms, this refers to the microprocessor that runs the PC, such as an Intel Pentium chip.

CRT: Cathode ray tube. Technical name for a display, screen, and/or monitor. Most commonly associated with computer displays.

Device driver: A device driver is a low-level (i.e., close to the hardware) piece of software which allows operating systems and/or applications to access hardware functionality without actually having to understand exactly how the hardware operates. Without the appropriate device drivers, one would not be able to install a new graphics board, for example, to use with Windows, because Windows wouldn't know how to communicate with the graphics board to make it work.

DisplayPort: DisplayPort is a VESA digital display interface standard for a digital audio/video interconnect between a computer and its display monitor, or a computer and a home-theater system. DisplayPort is designed to replace digital (DVI) and analog component video (VGA) connectors in computer monitors and video cards.

DVI (Digital Visual Interface): DVI is a VESA (Video Electronics Standards Association) standard interface for a digital display system. DVI sockets are found on the back panel of AIBs and some PCs and also on flat panel monitors and TVs, DVD players, data projectors, and cable TV set-top boxes. DVI was introduced in 1999 and uses TMDS signaling. DVI supports high-bandwidth digital content protection, which enforces digital rights management (see HDCP).

Fragment shader: Pixel shaders, also known as fragment shaders, compute color and other attributes of each fragment. The simplest kinds of pixel shaders output one screen pixel as a color value; more complex shaders with multiple inputs/outputs are also possible. Pixel shaders range from always outputting the same color, to applying a lighting value, to doing bump mapping, shadows, specular highlights, translucency, and other phenomena. They can alter the depth of the fragment for Z-buffering.

Frame buffer: The separate and private local memory for a GPU on a graphics AIB. The term frame buffer is a bit out of date, since the GPU's local memory holds much more than just a frame or an image for the display as it did when originally developed. Today the GPU's local memory holds programs (known as shaders) and various textures, as well as partial results from various calculations, and two to three sets of images for the display as well as depth information known as a Z-buffer.

Geometry shaders: Geometry shaders, introduced in Direct3D 10 and OpenGL 3.2, generate graphics primitives, such as points, lines, and triangles, from primitives sent to the beginning of the graphics pipeline. Executed after vertex shaders, geometry shader programs take as input a whole primitive, possibly with adjacency information. For example, when operating on triangles, the three vertices are the geometry shader's input. The shader can then emit zero or more primitives, which are rasterized and their fragments ultimately passed to a pixel shader.

Global illumination: Global illumination (GI) is a term for lighting systems that model indirect lighting, that is, light that reaches a surface after bouncing off other objects in the scene. Without indirect lighting, scenes can look harsh and artificial. However, while light received directly is fairly simple to compute, indirect lighting computations are highly complex and computationally heavy.

GPU (Graphics processor unit): The GPU is the chip that drives the display (monitor) and generates the images on the screen (it has also been called a visual processing unit or VPU). The GPU processes the geometry and lighting effects and transforms objects every time a 3D scene is redrawn – these are mathematically intensive tasks, and hence the GPU has upwards of hundreds of floating-point processors (also called shaders or stream processors). Because the GPU has so many powerful 32-bit floating-point processors, it has been employed as a special-purpose processor for various scientific calculations other than display and is referred to as a GPGPU in that case. The GPU has to be compatible with several interface standards, including software APIs such as OpenGL and Microsoft's DirectX, physical I/O standards within the PC such as Intel's Accelerated Graphics Port (AGP) technology and PCI Express, and output standards known as VGA, DVI, HDMI, and DisplayPort. The GPU has its own private memory on a graphics AIB which is called a frame buffer. When a small (less than five processors) GPU is put inside a northbridge (making it an IGP), the frame buffer is dropped and the GPU uses system memory.

HDMI (High-Definition Multimedia Interface): HDMI is a digital, point-to-point interface for audio and video signals designed as a single-cable solution for home theater and consumer electronics equipment, which is also supported in graphics AIBs and some PC motherboards. Introduced in 2002 by the HDMI consortium, HDMI is electrically identical to video-only DVI.

IGP (Integrated Graphics Processor): An IGP is a chip that is the result of integrating a graphics processor with the northbridge chip (see northbridge and chipset). An IGP may refer to enhanced video capabilities, such as 3D acceleration, in contrast to an IGC (integrated graphics controller) that is a basic VGA controller. When a small (less than five processors) GPU is put inside a northbridge (making it an IGP), the frame buffer is dropped and the GPU uses system memory; this is also known as a UMA – unified memory architecture.

Pixel: Acronym for pix element ("pix" is a shortened version of "picture"). The name given to one sample of picture information. Can refer to an individual sample of RGB luminance or chrominance information. A pixel is the smallest unit of display that a computer can access to display information but may consist of one or more bits (see "BPP").

Reflective shadow maps: Reflective shadow maps (RSMs) are an extension to a standard shadow map, where every pixel is considered as an indirect light source. The illumination due to these indirect lights is evaluated on the fly using adaptive sampling in a fragment shader. By using screen-space interpolation of the indirect lighting, it is possible to achieve interactive rates, even for complex scenes. Since visualizations and games mainly work in screen space, the additional effort is largely independent of scene complexity. The resulting indirect light is approximate but leads to plausible results and is suited for dynamic scenes.

Tessellation shaders: A tessellation shader adds two new shader stages to the traditional model: tessellation control shaders (also known as hull shaders) and tessellation evaluation shaders (also known as domain shaders), which together allow simpler meshes to be subdivided into finer meshes at run-time according to a mathematical function. The function can be related to a variety of variables, most notably the distance from the viewing camera, to allow active level-of-detail scaling. This allows objects close to the camera to have fine detail, while ones further away can have coarser meshes yet seem comparable in quality. It also can drastically reduce mesh bandwidth by allowing meshes to be refined once inside the shader units instead of downsampling very complex ones from memory. Some algorithms can upsample any arbitrary mesh, while others allow for "hinting" in meshes to dictate the most characteristic vertices and edges. Tessellation shaders were introduced in OpenGL 4.0 and Direct3D 11.

Texture: A texture is a special bitmap image, much like a pattern, that is intended to be applied to a 3D surface in order to quickly and efficiently create a realistic rendering of a 3D image without having to simulate the contents of the image in 3D space. That sounds complicated, but in fact it's very simple. For example, if you have a sphere (a 3D circle) and want to make it look like the planet Earth, you have two options. The first is that you meticulously plot each nuance in the land and sea onto the surface of the sphere. The second option is that you take a picture of the Earth as seen from space, use it as a texture, and apply it to the surface of the sphere. While the first option could take days or months to get right, the second option can be nearly instantaneous. In fact, texture mapping is used broadly in all sorts of real-time 3D programs and their subsequent renderings because of its speed and efficiency. 3D games are certainly among the biggest beneficiaries of textures, but other 3D applications, such as simulators, virtual reality, and even design tools, take advantage of textures too.

Texture mapping: The act of applying a texture to a surface during the rendering process. In simple texture mapping, a single texture is used for the entire surface, no matter how visually close or distant the surface is from the viewer. A somewhat more visually appealing form of texture mapping involves using a single texture with bilinear filtering, while an even more advanced form of texture mapping uses multiple textures of the same image but with different levels of detail, also known as mip mapping. See also "Bilinear Filtering," "Level of Detail," "Mipmap," "Mip mapping," and "Trilinear Filtering."

Introduction

Creating the beautiful images we see in the theater, in video games, and in manipulated photos and videos used for advertising is done in the image generation section of a computer, and that means any computer, whether it's a handheld smartphone, a desktop PC, or a supercomputer. The basic arrangement of hardware and software in a computer graphics system is illustrated in Fig. 1 (Peddie 2013). Image generation, also known as computer graphics (CG) or CGI – computer-generated imagery (and also computer graphics interactive) – is a collection of techniques or tricks designed to make the most realistic image possible within the available computer hardware, as well as the budgets of time, funding, and skillset. In addition, the fidelity of the image generation influences the techniques used. For example, designers creating a photo-realistic image of an automobile want the result to be physically accurate and reflect light from every surface and facet of the automobile model. Animators and game developers, however, don't want a perfect reproduction of the world, and neither do many special effects technicians in the movies. To accomplish those broad and seemingly contradictory objectives requires that software tool providers and hardware processor suppliers offer several styles and versions of their products (Fig. 2).

Fig. 1 Basic block diagram of a computer graphics system


Fig. 2 Raster or direct rendering uses CG "tricks" to approach realism and is not physically correct. The bunny is such a rendered animated character (Blender), while a physically correct rendering of an automobile (Autogaleria) uses similar but different CG processes

Dozens of books have been written on these subjects, and this chapter will offer an overview of the concepts and provide references for those who want to dig deeper into the topics.

The Pipeline

The image generation pipeline, or CGI pipeline, is a complex process; for simplification, it can be described in three basic stages:
• Geometry generation and model construction – Design, modeling, transformations, and rigging
• Shading (the geometric model and world) – Surfaces (texture and color), staging, animation, lighting, and effects
• Post – the final stage in image generation – Rendering, clipping, hidden-surface removal, composite, touch-up, and final film/video output
Each of these stages has several subcomponents and is a ballet between the hardware and software. It's difficult to speak about one part without discussing the adjacent components or functions, even though we refer to it as a pipeline (Fig. 3) (Graphics Pipeline nd). The components of the pipeline will be discussed in more detail throughout the rest of the chapter.

The Geometry Creation

The process of generating a computer graphic (CG) image begins with a designer's idea, an idea for a building, a character, an airplane, a toaster, a watch, and almost anything imaginable.


Fig. 3 Graphics rendering pipeline

Fig. 4 Wire-frame models of a cube, icosahedron, and approximate sphere (Wikipedia)

The designer then translates the idea into a series of lines displayed on the screen of a computer monitor. Those lines represent the outline and several details of the thing imagined, and logically enough it's called a wire-frame model (Nasifoglu 2013; Principles of Engineering Graphics by Maxwell Macmillan International Editions). To build a wire-frame model, the designer uses triangles and so defines models in terms of how many triangles they have. We also evaluate a processor by how many millions of triangles a second it can process (Fig. 4). The wire frame can be constructed in 3D (x,y,z) space or, in the case of cartoons, simple games, or first impressions as in a storyboard, just 2D (x,y). In the case of architectural and many mechanical drawings, the designer uses four views: a flat 2D front, side, and top view (known as an A-B-C view) and then in the fourth quadrant an orthogonal or perspective view (Fig. 5).
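To make the storage concrete, here is a minimal sketch (in Python, not tied to any particular modeling package or file format) of how a wire-frame model is typically held in memory: a shared vertex list plus a list of triangles that index into it. The unit cube below is illustrative only.

```python
# A minimal wire-frame mesh: a shared vertex list plus a triangle index list.
# Twelve triangles (two per face) describe a unit cube.

# The 8 corner vertices of a unit cube in 3D (x, y, z) space
vertices = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),   # back face corners
    (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1),   # front face corners
]

# Each triangle is a triple of indices into the vertex list.
triangles = [
    (0, 1, 2), (0, 2, 3),  # back
    (4, 6, 5), (4, 7, 6),  # front
    (0, 4, 5), (0, 5, 1),  # bottom
    (3, 2, 6), (3, 6, 7),  # top
    (1, 5, 6), (1, 6, 2),  # right
    (0, 3, 7), (0, 7, 4),  # left
]

# The wire frame itself is just the set of unique triangle edges.
edges = {tuple(sorted((t[i], t[(i + 1) % 3]))) for t in triangles for i in range(3)}
print(len(vertices), "vertices,", len(triangles), "triangles,", len(edges), "edges")
# -> 8 vertices, 12 triangles, 18 edges (the 12 cube edges plus 6 face diagonals)
```

Counting triangles in exactly this way is what the "millions of triangles per second" processor metric refers to.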


Fig. 5 Three 2D views and a perspective view (Wikipedia)

When the designer needs curves in a drawing, as in an ultramodern building, an automobile, an iron, or a fantastic-looking character, one of two techniques is used. Sometimes combinations of them are used. There is the faceted or piece-part curve and the algorithmic curve. As mentioned in the beginning, CG is about tricks in order to balance resources and time. To draw a simple circle on a computer screen can be quite demanding on the computer. That is because the circle has a theoretically infinite number of points to describe it, and computers don't have an infinite amount of time or resolution; therefore, reasonable compromises have to be made. This is where the designer enters into a negotiation with the viewer. One of the qualities of the human brain is that it can make up for missing or incorrect image features (Harmon 2015). This fact was put to good use in the early days of television when the scan rate was interlaced and TVs had long-persistence CRT phosphor screens. The same compensation for incorrectness in an image is realized when a nonlinear line is approximated. For example, to compute a curve or a circle would require a calculation per pixel. If the user had a high-resolution screen, that could be quite a time-consuming calculation. Straight lines however are easily drawn in a computer, and so a circle can be approximated by a series of straight-line segments. The number of segments is decided based on computational time vs. desired realism. These types of approximations are what influence the viewer's suspension of disbelief (Fig. 6). As you can imagine, it's even trickier when designing something with a complex nonlinear curve like an automobile fender, a person's leg, or a graceful long cantilevered street lamp.
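The circle example can be made concrete with a few lines of Python (a sketch; the 100-pixel radius and the segment counts are illustrative values, not taken from the text). The worst-case gap between the true circle and a chord, the sagitta r(1 - cos(pi/n)), shrinks rapidly as segments are added:

```python
import math

def circle_points(radius, segments):
    """Approximate a circle with `segments` straight-line segments (an N-gon)."""
    return [(radius * math.cos(2 * math.pi * k / segments),
             radius * math.sin(2 * math.pi * k / segments))
            for k in range(segments)]

def max_error(radius, segments):
    """Worst-case gap between the true circle and a chord (the sagitta)."""
    return radius * (1 - math.cos(math.pi / segments))

# More segments cost more vertices but shrink the visible error:
for n in (8, 32, 128):
    print(n, "segments -> max error", round(max_error(100.0, n), 3), "pixels")
# 8 segments   -> max error 7.612 pixels  (clearly faceted, as in Fig. 6)
# 32 segments  -> max error 0.482 pixels  (below one pixel: it looks round)
# 128 segments -> max error 0.03 pixels
```

This is exactly the computational-time versus realism negotiation described above: once the error drops below a pixel, adding more segments buys nothing the viewer can see.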


Fig. 6 An eight-sided faceted circle approximation and a perfect circle

Fig. 7 Compare Lara Croft from 1996 to 2013 – a measure of how CG has improved due to more powerful hardware and more advanced software (Wikipedia)

Therefore, the developers of software and algorithms used in the CG industry have spent the last several decades, since about 1950 (Harmon 2015, 149), coming up with techniques (known in the industry as tricks) that would give the most believable image possible given the resources available at the time. Thanks to the continuous improvements in scaling capability in the manufacturing of semiconductors, referred to as Moore's law (Moore 1965), the CG industry has had faster, smaller, and less expensive computers every 18 months. Therefore, the developers have made better-looking images every year through the combination of improved algorithmic work and faster, less expensive hardware (Fig. 7). In 1996 the best the industry could do was an image of 100,000 triangles, 30 times a second, on a 1,024 × 768 screen; by 2013 it was possible to generate over 100 million triangles (also referred to as polygons) at 60–120 times a second on screens greater than HD (1,920 × 1,080). By 2015 UHD, or 4 K displays, were


commonplace, offering 3,840 × 2,160 resolution and again challenging the hardware engines.

Shading

To get impressive images like the above, the GPU fills in the triangles with color. The basic "filling in" is called flat shading, and it is a very useful technique to quickly get a feeling or impression of how a thing looks. Designers generally work with flat-shaded models because they are fast and not distracting. After the basic shading, the designers apply textures to the surfaces to create photo-realistic images (Fig. 8).
To realize a realistic image, computer graphics scientists and engineers strive to manage and simulate the way light behaves in a scene. The reflections, shadows, and special effects like glints and color contribution from one object to another can place a significant load on the graphics processor unit (GPU) and central or main processing unit (CPU). That itself isn't the only challenge. Generating a high-resolution image quickly, 30–60 times a second, is the goal. Techniques known as bump mapping (Blinn 1978), where textures seem to have depth because of the way shadows are created, are used to quickly add realism to an image by changing the lighting calculations based on the highlights in the image (Fig. 9) (Dreijer 2007).
Shading and reflection techniques are the heart and soul of computer graphics, and whether the goal is a movie, a flight simulator, or a fast-paced realistic game, the goal is to make it realistic and believable. Shading is a broad subject and many textbooks have been written about it. There are several techniques, from basic flat shading to highly realistic complex shading. Flat shading is a technique to shade (color) each polygon of an object based on the angle between the polygon's surface (normal) and the light source. It is used for high-speed rendering where more

Fig. 8 Texture mapping is a way to add realism to 3D models


Fig. 9 No bump mapping vs bump mapping (Image by Wesley Chartrand). A height map based on brightness is generated to simulate the surface displacement, and the interaction of the new "bumpy" surface with lights in the scene is then calculated. It is a classic and well-used CG "trick"

Fig. 10 Changing the intensity or color of the pixels between the line and the background gives the effect of an even line; the eye (brain) is tricked into seeing a smooth line

advanced shading techniques are too computationally expensive. In contrast to flat shading, smooth shading changes the color from pixel to pixel.
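As a small illustration of the flat-shading rule just described (a Python sketch using Lambert's cosine law, not code from any real renderer), one color is computed per polygon from the angle between its surface normal and the light direction:

```python
import math

def flat_shade(normal, light_dir, base_color):
    """Flat shading: one color for the whole polygon, based on the angle
    between the polygon's surface normal and the light direction."""
    nlen = math.sqrt(sum(c * c for c in normal))
    llen = math.sqrt(sum(c * c for c in light_dir))
    n = [c / nlen for c in normal]
    lv = [c / llen for c in light_dir]
    # Lambert's cosine law: brightness is the dot product, clamped at zero
    intensity = max(0.0, sum(a * b for a, b in zip(n, lv)))
    return tuple(int(c * intensity) for c in base_color)

# A polygon tilted 45 degrees away from the light keeps ~71 % of its color:
print(flat_shade((0, 1, 1), (0, 0, 1), (200, 60, 60)))  # -> (141, 42, 42)
```

Smooth shading applies the same arithmetic per pixel (with interpolated normals) rather than once per polygon, which is why it costs so much more.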

Rendering

The final rendering stage is where the most interesting/important tricks are used: special coloring, anti-aliasing of edges and lines – it's where the pixel polishing is done. ("Pixel polishing" is a term used by computer graphics people to refer to the improvement in appearance of the image.) Anti-aliasing is a CG trick to smooth out the chunky pixelated edges, colloquially known as "jaggies," from the edges of images and lines (Fig. 10).
Rendering is an all-encompassing term and can refer to the final stage where the actual pixels are produced or to the whole pipeline after the geometry is created. It


is the last major step, giving the final appearance to the models and animation. With the increasing sophistication of computer graphics since the 1970s, it has become a more distinct subject. The technique for producing the image is done in one of two ways, and often a combination of both. The two primary methods of rendering are scan conversion or scan-line rendering (also known as rasterization), where all the techniques and tricks discussed so far are finally converted in a time sequence to the scan lines of the display, and ray tracing (or ray casting). A special case of scan-line rendering is tile-based rendering, which is mostly used in mobile devices like smartphones and tablets.

Polygon Rendering
Polygon rendering is a method for creating two-dimensional images (what we see on the screen) from a group of three-dimensional objects. These three-dimensional objects are made up of flat polygons; any shape can be approximated using polygons. The set of algorithms which produce the image, collectively known as a rendering pipeline (Funkhouser 2000), process the image one step at a time. Most 3D computer graphics applications use a polygon-rendering pipeline to generate images (Akeley and Jermoluk 1988). This is the step taken before the final rendering (image drawing on the screen) takes place.

Scan-Line Rendering
Scan-line rendering is the technique used for determining which surfaces of the objects in the scene will be visible (Wylie et al. 1967). All the polygons are rendered and sorted by their depth location (z). The image data (pixels) is then fed to the display beginning with the top row or scan line. The main advantage of this method is that sorting vertices along the normal of the scanning plane reduces the number of comparisons between edges (Newell et al. 1972). The technique was derived from the scanning method used in television, dating back to the late 1930s. Traditional 3D graphics workstations have concentrated on the hardware that transforms points and lines from object space to screen space. As users' needs for display of realistic solid objects have increased, the demands on graphics architectures have changed significantly. The challenges include increases in transformation rate, incorporation of real-time illumination calculations, and dramatic increases in pixel fill rates.

Ray Tracing
The history of ray tracing dates back to Newton and his treatise Opticks (Newton 1730), and references go back as far as Euclid and Archimedes. Used originally for calculating the effects of lenses, the mathematical techniques were adapted to computer graphics in 1975 (Bui-Tuong 1975). In 1980 an improved and faster technique was introduced (Whitted 1980). In computer graphics, ray tracing is a technique for generating an image by tracing the path of light through pixels in an image plane and simulating the effects of its encounters with virtual objects (Fig. 11).
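To make the idea concrete, the geometric core of any ray tracer is an intersection test between a ray and a scene object. A minimal Python sketch for the simplest case, a sphere, follows (illustrative only; production ray tracers add shading, reflection, and acceleration structures on top of this):

```python
import math

def ray_sphere(origin, direction, center, radius):
    """Distance along a normalized ray to the nearest hit on a sphere,
    or None if the ray misses."""
    # Solve |origin + t*direction - center|^2 = radius^2 for t (a quadratic in t).
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * c           # the quadratic's 'a' term is 1 for a unit direction
    if disc < 0:
        return None                  # the ray misses the sphere
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0 else None      # the hit must lie in front of the origin

# A ray fired from the origin along +z hits a unit sphere centered 5 units away:
print(ray_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # -> 4.0
```

Repeating this test for every pixel's ray against every object, plus secondary rays for shadows and reflections, is what makes the technique so computationally heavy.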


Fig. 11 Ray-traced image of a Mercedes-Benz SS Roadster (Courtesy of Volkan Kaçar)

Ray tracing produces physically perfect renditions of a scene. The drawback is that it is a computationally heavy operation, especially as resolution and color depth are increased.

Tile-Based Rendering
While performance is the main goal for computer graphics, there is the requirement for balancing performance against power consumption and memory bandwidth. While Moore's law produces faster and higher performance processors, making computations relatively inexpensive, the further data has to be moved, the more power it takes (Torborg and Kajiya 1996). To reduce the bandwidth demand, CG systems, especially those in mobile devices, use tile-based rendering. As the name implies, the image is broken up into a grid of tiles (Fig. 12) (Molnar 1994). Tile-based rendering allows the smaller pieces of the image (the tiles) to be moved into the graphics processor's internal memory. Since that memory is in the processor, and close to where the computations occur, far less power is required to access it, and it can be fast.
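The tiling itself is straightforward; here is a sketch of how a frame might be split (the tile size and resolution are illustrative values, not any specific GPU's):

```python
def tiles(width, height, tile):
    """Split a frame into tile-sized rectangles; a tile-based renderer draws
    each one entirely in fast on-chip memory before writing it out."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield (x, y, min(tile, width - x), min(tile, height - y))

# A 1080p frame cut into 32 x 32 tiles:
print(len(list(tiles(1920, 1080, 32))), "tiles")
# -> 2040 tiles (60 across, 34 down; the last row is only 24 pixels tall)
```

The geometry touching each tile is binned to it first, so a tile can be completed without rereading the rest of the frame from external memory.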

Generating the Image: Hardware Issues

To get a photo-realistic image, a great deal of processing is needed, almost a processor per triangle. In 1999, Nvidia introduced the first programmable parallel processor exclusively designed to process and output graphics and called it the graphics processor unit or GPU (Peddie 1999). Today that term is part of the vocabulary of anyone in the computer industry; it did cause a little confusion because the industry had been using the geometry processor unit (GPU) term since 1980 (Clark 1980).


Fig. 12 TBR graphics pipeline

The first consumer GPUs had four major 32-bit processors, called pipelines at the time, and have since evolved to hundreds and thousands of parallel processors now called "shaders." Modern GPUs built in 2014 had 5,632 shaders, and Moore's law continued to raise the number. Today discrete semiconductor ("chip") GPUs have more transistors and use more power than the most powerful x86 CPU (Fig. 13). At the same time GPUs were increasing the number of shaders, screen resolution on all devices was going up. In 2014, a 5.7-in. smartphone with 2,560 × 1,440 resolution was introduced. It produces 515 PPI (pixels per inch). This compares with the iPhone's retina display resolution of 326 PPI and the Galaxy S5's 432 PPI. At the same time PCs were being equipped with 4 K displays – 3,840 × 2,160 resolution – 8.3 million (mega) pixels. PPI became the new metric for evaluating displays, with bigger being better. The world was becoming hooked on high-quality, high-resolution displays. In order to drive these super high-resolution displays, the industry had to switch from the old display interface of DVI and the older analog (1989) VGA to DisplayPort and HDMI (Clark 1980, 361, 362).
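The PPI figures quoted above follow directly from resolution and screen size (diagonal pixel count divided by diagonal inches), which is easy to verify:

```python
import math

def ppi(width_px, height_px, diagonal_in):
    """Pixels per inch: diagonal resolution divided by diagonal screen size."""
    return math.hypot(width_px, height_px) / diagonal_in

# The 5.7-in., 2,560 x 1,440 smartphone panel mentioned above:
print(round(ppi(2560, 1440, 5.7)))  # -> 515
```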

GPUs and Memory

In a PC there are two configurations of GPUs: discrete or stand-alone and integrated (in with the CPU). In mobile devices, there are only integrated GPUs in a semiconductor ("chip") called a system on a chip (SoC) or application processor (Fig. 14) (EE Times 2009). Discrete GPUs are very powerful, and part of their performance is derived from having a bank of private, high-speed memory, whereas an integrated GPU has to share the main memory with the CPU. Main memory is also slower than the dedicated graphics memory. Memory is referred to as double data rate random access memory or DDR RAM. There is a series of DDR memory types reflecting


Fig. 13 A modern discrete multi-thousand-shader GPU (Nvidia)

the development over time. As of this writing, the fastest system memory for PCs and servers was DDR4, and for mobile devices it was DDR2. In the case of mobile devices, where power consumption is paramount, there is specialized DDR2 RAM known as LP (low power) DDR. Discrete GPUs use a specialized very high-speed DDR known as GDDR or graphics DDR, and the current version is GDDR5.

Software Stack from Application to API

However, the hardware can't do anything useful without software. There are four major elements of software in a computer: the operating system (OS), the application, an interface program called an application program interface (API) that


Fig. 14 A modern SoC, with integrated GPU, of the type found in smartphones (Qualcomm)

connects the OS to the app and to the hardware, and a fourth very specialized program called a "driver" that resides between the API and the actual hardware. The driver acts as a translator between the hardware and the system's software. The hardware manufacturers develop drivers that are unique for each GPU. Shown in the following diagram is the basic arrangement of the elements. The APIs contain drawing instructions for lines, circles, and other geometric items, a collection known as a library (Fig. 15).
Because computers and mobile devices can have various CPU processor types and various operating systems, there is the need for various APIs as well. The following table shows the general association of APIs and operating systems (Table 1). The x86 processor is in PCs, servers, and mobile devices, and the ARM processor is in most mobile devices and servers. A third processor, MIPS, is in TVs, game machines, and a few mobile devices. In 2013 AMD introduced a new API called Mantle designed to accelerate games developed for x86-based machines. In 2014, Apple introduced a new API called Metal, and later Google introduced extensions to OpenGL ES.

Summary

A computer image generator's job is to paint the image on the screen. The image is generated in a processor in the computer, called the rendering engine. When computer graphics first began, all the computations for the image took place in the CPU. Gradually, beginning with graphics terminals in the early 1970s, some of the work


Fig. 15 The software stack in a typical computer

Table 1 APIs and operating systems

OS/API           DirectX   OpenGL   OpenGL ES
Windows (x86)       X         X
Windows (ARM)                            X
iOS (x86)                                X
iOS (ARM)                                X
Android                                  X
Linux                                    X

was off-loaded from the CPU to geometry engines. The concept evolved further to managing the vertices of the 3D models, which is known as the vertex processor. In the early 2000s, massive parallel processing engines were integrated into the graphics controller for processing the shader programs. The shader programs are the ones which create the reflections, shadows, and other lighting characteristics – which is called pixel polishing. With the introduction of the shaders, the graphics controller became known as a graphics processor unit or GPU. Modern GPUs look more like a computer, and the notion of a pipeline survives only in the data flow discussion, with almost every element in the GPU being capable of processing every function in the construction of a 3D image.

References

Blinn JF (1978) Simulation of wrinkled surfaces. Comput Graph (SIGGRAPH, ACM) 12(3):286–292
Bui-Tuong P (1975) Illumination for computer-generated pictures. Comm ACM 18(6):311–317
Clark J (1980) Special feature: a VLSI geometry processor for graphics. IEEE Computer 13:59–68


Dreijer S (2007) Bump mapping using CG, 3rd edn. Retrieved 30 May 2007. http://www.blacksmith-studios.dk/projects/downloads/bumpmapping_using_cg.php
Funkhouser T (2000) 3D polygon rendering pipeline. https://www.cs.princeton.edu/courses/archive/fall00/cs426/lectures/pipeline/pipeline.pdf
Graphics Pipeline (nd) Computer desktop encyclopedia. http://www.answers.com/topic/graphicspipeline. Retrieved 13 Dec 2005
Harmon K (2015) The brain adapts in a blink to compensate for missing information. Sci Am. http://www.scientificamerican.com/article/brain-adapts-in-a-blink/
Akeley K, Jermoluk T (1988) High-performance polygon rendering. Comput Graph 22(4)
Molnar S (1994) A sorting classification of parallel rendering. IEEE. Retrieved 24 Aug 2012. http://www.cs.cmu.edu/afs/cs/academic/class/15869-f11/www/readings/molnar94_sorting.pdf
Moore GE (1965) Cramming more components onto integrated circuits (PDF). Electron Mag, p 4. Retrieved 11 Nov 2006
Nasifoglu Y (2013) Renaissance wireframe. Architectural intentions from Vitruvius to the Renaissance, studio project for ARCH 531. McGill University. Retrieved 11 March 2013
Newell ME, Newell RG, Sancha TL (1972) A new approach to the shaded picture problem. Proceedings of the ACM national conference
Newton I (1730) Opticks or a treatise of the reflections, refractions, inflections & colours of light, 4th edn. London. https://archive.org/details/opticksortreatis1730newt
Peddie J (1999) Nvidia's GeForce graphics processor. The Peddie Rep XII(36):1421
Peddie J (2013) The history of visual magic in computers. Springer, London/Heidelberg/New York/Dordrecht
The Great Debate: SOC vs SIP. EE Times. Retrieved 12 Aug 2009
Torborg J, Kajiya J (1996) Talisman: commodity real-time 3D graphics for the PC. In: Proceedings of ACM SIGGRAPH 1996
Whitted T (1980) An improved illumination model for shaded display. Comm ACM 23(6):343–349
Wylie C, Romney GW, Evans DC, Erdahl A (1967) Halftone perspective drawings by computer. Proc AFIPS FJCC 31:49

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_172-1 # Springer-Verlag Berlin Heidelberg 2015

Consumer Imaging I – Processing Pipeline, Focus and Exposure

Peter Corcoran (a,*) and Petronel Bigioi (b)
(a) College of Engineering and Informatics, National University of Ireland Galway, Galway, Ireland
(b) FotoNation Ltd, Galway, Ireland
*Email: [email protected]

Abstract

Digital photography completely supplanted film photography, with a huge, disruptive shift occurring during the first decade of the twenty-first century. The collapse of film photography led to the fall of industry giants like Kodak. Then in 2007, Apple introduced the iPhone and digital photography went mobile. Today smartphones dominate everyday imaging, but the digital camera industry has survived. In this chapter, we will take a look at the state of the art in consumer digital imaging as it stands halfway through this second decade of the twenty-first century. We follow an image from the photons striking the CMOS electronic sensor through to the compressed image or video stored on a memory card. Starting with a look at the basics of digital photography, we move on to explore the complexity of the image processing pipeline (IPP) that is used on today's cameras. The reader will be introduced to different color spaces and how these are used for different purposes inside the IPP. The mechanisms of autofocus (AF) and exposure using CMOS sensors are explained, along with the concepts of the rolling shutter and high-dynamic-range (HDR) imaging. Video preview and image compression are also explored. In a companion article, we will take a look at some developments in "smart imaging" that allow pictures to be enhanced in ways that mimic high-end photography equipment on miniature smartphone cameras.

List of Abbreviations

ADC: Analog/digital convertor
APS-C: Advanced Photo System type-C
CDAF: Contrast detection autofocusing
CE: Consumer electronics
CFA: Color filter array
CMOS: Complementary metal-oxide semiconductor
DOF: Depth of field
DSLR: Digital single-lens reflex
DSP: Digital signal processing
FOV: Field of view
GPU: Graphics processing unit
HDR: High dynamic range
IPP: Image processing pipeline
ISP: Image signal processor
RLE: Run-length encoding
VCM: Voice-coil module
VGA/SVGA: Video graphics array/super video graphics array
WFOV: Wide field of view

Definition of Key Terms

Aperture: In optics, an aperture is a hole or an opening through which light travels. More specifically, the aperture of an optical system is the opening that determines the cone angle of a bundle of rays that come to a focus in the image plane.

Chroma subsampling: The practice of encoding images by implementing less resolution for chroma information than for luma information, taking advantage of the human visual system's lower acuity for color differences than for luminance.

Dark current: The constant response exhibited by a receptor of radiation during periods when it is not actively experiencing radiation.

Dark-frame subtraction: In digital photography, dark-frame subtraction is a way to minimize image noise for pictures taken with long exposure times.

Gamma adjustment: Also called gamma correction; a nonlinear operation used to code and decode luminance or tristimulus values in video or still image systems.

Hysteresis: The time-based dependence of a system's output on current and past inputs.

ISO: In the case of digital cameras, ISO sensitivity is a measure of the camera's ability to capture light.

Background

The first digital cameras started to appear in the early 1990s, and throughout the 1990s, efforts to substitute chemical film with electronic sensors continued. But it was not until the twenty-first century that digital cameras started to enter mainstream consumer markets in significant volumes. Once this transition occurred, the market for traditional cameras and film collapsed in less than a decade. Arguably, today's digital cameras still do not capture the same quality of image as traditional film (Dainty 2012), but factors of convenience, improved digital storage, and a growing ability to share images and videos in "real time" have led to an unprecedented shift in this consumer market. And the rapidly evolving imaging capabilities of smartphones suggest that further market shifts are already in progress.
In this chapter, we will take a look at the state of the art in consumer digital imaging as it stands halfway through this second decade of the twenty-first century. In particular, we will take a look at recent developments in "smart imaging" that allow pictures to be enhanced in ways that mimic high-end photography equipment or that can even achieve effects that were simply not possible with conventional cameras or older digital models. But first we should take a look at the basics of today's digital camera.

The Modern Digital Camera

The modern digital camera takes many forms, ranging from a full-sized professional digital single-lens reflex (DSLR) camera that can cost north of $10,000 down to a low-cost consumer camera costing less than $100. And recently we can find miniature camera modules with midlevel functionality incorporated into smartphones and tablets. For example, the iPhone 5s has an aperture of f/2.2 enabling impressive



high-quality image capture that is comparable to images taken with many larger consumer cameras. This illustrates that it is now smartphones that are driving the leading edge of digital image acquisition.

Inside the Camera

So how does a digital camera work? Well, a lens is still used to capture and focus light into the body of the camera element. Classic cameras always used glass lenses, but as cameras have shrunk in size, so too have their lens assemblies, to the point where glass can no longer be used. Thus most modern lenses are manufactured from advanced plastics. As the associated manufacturing and finishing technologies have improved, so too has the optical quality, as exemplified by the iPhone 5s.
In a modern camera, in the absence of photographic film, light is focused onto a silicon-based sensing element. This is typically constructed using a CMOS process to take advantage of the latest mass-manufacturing techniques. The most recent sensors also use clever process refinements to increase the surface area per pixel (Waltham 2013). These "back-illuminated" sensors have enabled the size of individual sensor pixels to grow even while the number of pixels continued to increase. Today, however, the sensors in smartphones have reached their performance limits in terms of pixel size due to a phenomenon known as the diffraction limit (Baer 2010).
But the images obtained from a modern camera are not like the single-exposure images that we used to capture with film cameras. Being "digital" in nature, it is possible to capture "more than one" image of a scene, and by processing the acquired digital data, it is possible to construct a better picture. But that digital data is not easy to produce, as it has to be postprocessed in near real time as it is offloaded from the digital sensor. This processing is complicated by many practical considerations: due to the limitations of the underlying electronics, we cannot grab every pixel at the same time, and mass manufacturing leads to process defects; then the image acquisition process at the pixel level is statistical in nature, relying on photons of light converting to electrons, and is further subject to electronic noise sources. Finally, the sensor is linearly sensitive to light, so that at low lighting levels, the stored photon energy has to be amplified (increasing the noise levels) and filtered to distinguish light-photon energy from dark-current energy originating from natural processes in the silicon lattice. Thus the image sensor requires a complex secondary layer of electronics simply to postprocess and correct the raw sensor data.
The situation is further complicated because the imaging sensor functions like the human eye, having each pixel on the image sensor surface sense one of the three fundamental components of visible light – red (R), green (G), or blue (B) wavelengths. But in a display or printed image, we perceive all three components to be combined at the same spatial location. Thus G and B values have to be extrapolated for each red pixel, R and G values for each B pixel, and so on. But additional corrections have to be performed to compensate for global lighting levels in the imaged scene and for color imbalances that may have occurred. In practice, a complex pipeline of localized image adjustments and corrections is followed by global spatial extrapolations, color space transformations, and gray-level/color balancing. This image processing pipeline (IPP) is typically so complex that it requires a dedicated image signal processor (ISP) to process the sensor data into a high-quality image suitable for local display or further processing and ultimately compression into a final end point of JPEG (still) or MPEG (video) data.
This overview of a digital camera system is shown in Fig. 1.

Types of Camera

As mentioned above, there is a wide range of digital cameras in the market, and commercial pressures and market forces have driven many changes in the last few years. Many professional DSLR cameras have moved into consumer markets by cutting prices to below $1000, while at the same time a new generation of technology has enabled DSLR-quality optics and large

Fig. 1 Outline of the main functional blocks of a modern digital camera (From Andorko et al. 2010)

sensors to fit into a smaller footprint of interchangeable lens cameras. These more compact models now command higher prices than low-end DSLR cameras. And for those upgrading from a compact consumer model, there are the bridge cameras, which feature a high-magnification lens system with a larger sensor than compact models but do not offer the ability to switch lenses. These allow consumers to experience the improved image quality resulting from improved optics and larger image sensors for a small price premium over compact consumer models.
Meanwhile, compact consumer models have lost out to the proliferation of smartphones and tablets and the rapidly improving image acquisition capabilities of these new devices. There are still many compact consumer models that can achieve better image quality than smartphones, in particular as they can accommodate a larger lens. However, market pricing dictates that the difference in image quality is often not sufficient to overcome the convenience of having an adequate camera built into a device you always carry with you. As has been said on many occasions, "The best camera is the one you always have with you."
In the midst of all this market turmoil, new categories of imaging device are emerging. Notably, the action cams are imaging devices that can be mounted on a helmet or a shoulder harness and capture the exploits of individuals involved in extreme sports activities such as windsurfing, skiing, and cross-country biking with a wide field-of-view (WFoV) lens so none of the action is missed. A variant is waterproof cameras, originally designed to be splashproof for use on the beach or in a water park, but the latest models can be used to capture underwater images in the swimming pool or during a snorkel run. Another variant is the accident cam that you can place in your car and which continuously loops, recording the last 1–2 h, until sudden movement causes it to stop recording. The idea is you now have a black box with video evidence of the events leading up to a crash.
The ubiquity of digital image capture devices is quite astonishing. And the majority of these devices now support the latest high-quality compression standards for digital video as well as single image acquisition. That means they can generate large volumes of video content, often of a quality that was rare only 3–4 years ago. Next let's take a more detailed look at the black magic you will find inside most of these devices that enables them to do their job so well.

Image Sensors

As with digital cameras, there is a wide range of image sensors in use. Most of today's digital cameras employ CMOS technology, so the most important differentiator is the sensor size, as shown in Fig. 2. The high end of professional DSLR cameras mostly employ 35 mm or full-frame-sized sensors, providing an equivalent image quality to traditional 35 mm film, whereas the new-generation interchangeable lens cameras use smaller APS-C or four-thirds system sensors. However, as manufacturers try to get the cost/performance mix right, we are seeing increasing overlap between low-end DSLR systems and high-end interchangeable lens cameras. At the consumer end, the sensor sizes are significantly smaller, being less than 2/3" across the diagonal, and in smartphones and tablets, the sensor size is typically less than 1/2". For example, the iPhone 5 uses a 1/3.2" or 4.54 × 3.42 mm-sized sensor – even smaller than the smallest size shown in Fig. 2. This is largely why the low-light performance of most smartphones is so poor. The undisputed king of smartphone sensors, used by Nokia PureView™ devices, has a 1/1.2" sensor (10.67 × 8 mm).


Fig. 2 Different sizes of image sensor

So when it comes to image sensors for consumer imaging, we can definitely say that size matters!

Optical Systems

Consumers tend to like their gadgets to be small. In recent years, we have seen the emergence of the handheld camera, followed by the smartphone revolution. But as cameras get smaller, there is increasing pressure on their optical systems to shrink in unison with the electronics. Unfortunately, that introduces some physical limits arising from the wavelength of light. You may recall experiments from your high-school physics classes with a diffraction grating – where a fine grating produces interference effects. Well, a similar effect limits the performance of optical systems and is known as the diffraction effect. This is discussed in detail in Dainty (2012), but in essence, it leads to an overlapping between image pixels, and this effect is increased towards the periphery of an image. This overlap of the light focused onto each pixel causes an underlying blurring of the image that has its origins in the optics.
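A back-of-envelope check shows why this matters at smartphone scale. Assuming an ideal lens, the diameter of the diffraction blur spot (the Airy disk) is roughly 2.44 times the wavelength times the f-number; the wavelength and pixel-pitch numbers below are typical assumptions for illustration, not figures from this chapter:

```python
# Airy-disk diameter d = 2.44 * wavelength * f-number (ideal-lens assumption)
wavelength_um = 0.55    # green light, mid-visible spectrum (assumed)
f_number = 2.2          # a typical smartphone aperture, e.g., the f/2.2 above

airy_diameter_um = 2.44 * wavelength_um * f_number
print(round(airy_diameter_um, 2), "micrometres")  # -> 2.95

# A ~3 um blur spot already spans two or more of the roughly 1.1-1.5 um pixels
# found on small smartphone sensors, so shrinking pixels further adds no detail.
```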

Focus and Exposure

Given a lens system that gathers light onto a light-sensitive pixel array, we have the essence of a photographic system. However, for better flexibility, the lens system should be capable of adjusting its focus in order to capture details on objects or people who can be at different distances from the camera. Early cameras avoided this problem by having a lens with fixed focus. However, this requires a wide depth of field. Thus everything appears to be in focus, but this can create unnatural looking pictures, and these cameras do not work well with close subjects. The limitations of fixed-focus lenses were acceptable when digital images were of VGA (0.3 megapixels) or SVGA size (0.8 megapixels), but now that images have 10× greater resolution, even the least expensive consumer cameras feature an autofocus lens system. And autofocus is also incorporated in the latest smartphones, even though device thickness is now well under 1 cm.

Autofocus Algorithms

Traditional autofocus algorithms relied mainly on contrast-based focus, but the compromises to fit an autofocus system into the latest smartphones were such that these devices are often quite slow to focus. This has seen the introduction of two approaches to increase focus speeds in these devices – some use face-tracking information to estimate the distance to subject and therefore optimize focus speed and accuracy when there are faces in the scene, and another alternative is phase-sensitive sensors or laser ranging sensors that provide focus based on depth measurements.
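Before moving on, a minimal sketch of the contrast measure behind traditional contrast-based focus may help (this is a generic gradient-energy score, not any vendor's algorithm; `capture_at` and `lens_positions` below are hypothetical placeholders):

```python
def sharpness(gray):
    """Contrast score for a grayscale image (a 2D list of values): the sum of
    squared horizontal and vertical gradients. In-focus frames score highest."""
    h, w = len(gray), len(gray[0])
    score = 0
    for y in range(h - 1):
        for x in range(w - 1):
            dx = gray[y][x + 1] - gray[y][x]
            dy = gray[y + 1][x] - gray[y][x]
            score += dx * dx + dy * dy
    return score

# Contrast-detection AF is then just hill climbing over lens positions:
# best = max(lens_positions, key=lambda p: sharpness(capture_at(p)))
```

The many capture-and-remeasure steps this hill climb requires on tiny lens actuators are what make the approach slow, motivating the face-distance and depth-sensing shortcuts described above.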


The former approach is a good example where smart imaging can leverage information about the scene being imaged to enhance image acquisition without adding the cost of a more expensive sensor technology.

Digital Exposure

In traditional cameras, a shutter mechanism allows a timed exposure of the photographic film. In high-end professional cameras, the shutter mechanism may be retained, but most consumer cameras and smartphones do not provide a shutter mechanism. Instead the image sensor is continually exposed and constantly gathers light photons. The mechanical shutter mechanism is replaced by an electronic equivalent known as a rolling shutter. The term rolling shutter is derived from the behavior of the electronic shutter, which simply off-loads image data pixel by pixel from the sensor, scanning across the image sensor with a zigzag motion similar to a raster scan. As data is off-loaded from each pixel, the pixel is reset to zero and begins to accumulate photons to form the next image frame. The overall effect is that image pixels from a particular image frame are not all acquired at the same instant in time. This can, in fact, lead to some unusual side effects, but there is not space to discuss them further here.
The maximum exposure time for a consumer camera in video mode using rolling shutter techniques is determined by the video capture rate – typically 30 frames per second (fps), giving an exposure time of 33 ms. Shorter exposure times are achieved by resetting pixels in advance of off-loading data – thus to obtain a 0.01 s exposure time, a wave of pixel resets would propagate 10 ms in advance of the data off-loading for each image frame. To increase equivalent exposure time, it is often possible to increase the electronic gain applied to individual picture sensing elements, or pixels, but this will have the effect of increasing noise levels in the image sensor. Thus many modern imaging devices often combine multiple images in addition to enhancing the sensitivity of a single image capture. High-end devices also have separate video and still image modes; in still image (photography) mode, the frame rate is slowed down, allowing longer exposure times, and this facilitates much higher image quality in low-light scenarios. After all, you do not need fast frame rates when you are composing and framing a picture, as you'll be holding the camera still.
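The timing relationships just described are simple to state in code (a sketch that follows the 30 fps and 10 ms figures from the text; real sensors add readout and blanking overheads not modeled here):

```python
def rolling_shutter_timing(frame_rate_fps, exposure_s):
    """Electronic rolling-shutter timing: each pixel integrates light for
    exposure_s between the reset wave and the readout wave that follows it."""
    frame_period = 1.0 / frame_rate_fps      # the readout wave repeats every frame
    if exposure_s > frame_period:
        raise ValueError("video-mode exposure cannot exceed the frame period")
    return {
        "frame_period_ms": 1000 * frame_period,
        "reset_lead_ms": 1000 * exposure_s,  # reset wave runs this far ahead of readout
    }

# At 30 fps the longest video exposure is the ~33 ms frame period; for a 0.01 s
# exposure the reset wave propagates 10 ms ahead of the data off-loading:
print(rolling_shutter_timing(30, 0.010))
# -> {'frame_period_ms': 33.3..., 'reset_lead_ms': 10.0}
```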

The Image Processing Pipeline

To fully understand the complexity of what happens in a modern digital camera, we need to expand on the concept of the image processing pipeline (IPP) – the sequence of underlying manipulations of the original image data to get to the image that you see on the main camera screen. The IPP was introduced in outline form in the introduction, but here we will take a look at the finer details. Figure 3 illustrates an example IPP adapted from US patent application 2013/0004071, Image signal processor architecture optimized for low-power, processing flexibility, and user experience. Here we notice many different processing steps, some compensating for the sensor, some for nonlinearities of the lens, and others relating to image exposure and white balance. You will also note several color space transitions. I will explain some of these steps in the following sections, but one point I want to make right now is that the image you get to see on your computer has changed very dramatically from the original image data obtained from the camera sensor.



Fig. 3 A typical state-of-the-art image processing pipeline. It is normally implemented in a dedicated image signal processor, and this processing occurs before the image data reaches the main CPU or the JPEG compression engine (Adapted from US patent application 2013/0004071)

Bayer and Color Acquisition
The Bayer pattern is named after its inventor, Bryce Bayer, who introduced the concept in US patent 3,971,065 in 1976. To be exact, it is a color filter array (CFA) of RGB filters on a grid of photosensitive pixel elements. The underlying idea is that the Bayer pattern mimics the light-sensitive physiology of the human eye. Our eyes obtain luminance or brightness information mainly from green wavelengths, while color information is derived from blue and red wavelengths. Thus Bayer wanted to have twice as many green sensors as red or blue. The resulting pattern led to the well-known RGBG organization which is used almost exclusively on modern image sensors. The operation of a Bayer sensor is shown in Fig. 4a, b, while Fig. 4c is taken from Bayer's original patent. You can see from Fig. 4a, c that the various pixel elements of a Bayer pattern sensor do not in fact overlap each other spatially. The Bayer pattern image that is obtained directly from the image sensor is often known as a RAW image, and you will see this term used frequently by professional photographers. This RAW data is the original, unprocessed image data, and depending on how it is subsequently processed, a wide variation in final image characteristics can be achieved. It can be regarded as a digital negative. Thus professional photographers often prefer to retain this original data as it allows them much more control over the appearance and presentation of the final image. In turn, all professional cameras allow images to be saved in the native RAW format. In consumer cameras, however, the postprocessing of an image is automated, although more recently many consumer cameras have started to offer the capability to store the RAW image. These images are quite large compared with compressed formats, and so most consumer cameras automate the most important postprocessing steps. This automated postprocessing is effected in the image processing pipeline (IPP). Let us take a closer look at some of the magic that goes on inside your consumer camera or smartphone.
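To make the CFA geometry concrete, the short sketch below splits a toy mosaic into its sparse color planes and shows why green carries twice the samples. The RGGB phase and the random array contents are assumptions for illustration only; real sensors may use GRBG, GBRG, or BGGR variants.

```python
import numpy as np

# A toy RGGB Bayer mosaic, assuming the layout
#   R G
#   G B
def bayer_channels(raw):
    """Split a RAW mosaic (2D array) into sparse R, G, B planes."""
    r = np.zeros_like(raw)
    g = np.zeros_like(raw)
    b = np.zeros_like(raw)
    r[0::2, 0::2] = raw[0::2, 0::2]      # red on even rows/cols
    g[0::2, 1::2] = raw[0::2, 1::2]      # green shares rows with red...
    g[1::2, 0::2] = raw[1::2, 0::2]      # ...and with blue (twice the density)
    b[1::2, 1::2] = raw[1::2, 1::2]      # blue on odd rows/cols
    return r, g, b

raw = np.random.randint(0, 1024, (4, 4))  # e.g., 10-bit RAW values
r, g, b = bayer_channels(raw)
print((r > 0).sum(), (g > 0).sum(), (b > 0).sum())  # green has 2x the samples
```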



Fig. 4 (a) Bayer color filter array; note there are twice as many green elements as these wavelengths provide both luminance and color information to the human eye (From http://en.wikipedia.org/wiki/File:Bayer_pattern_on_sensor.svg). (b) This illustrates how the different wavelengths of visible light only penetrate the R, G, or B elements of the CFA, as appropriate; the underlying photosensor array is typically fabricated from silicon and is sensitive across all visible wavelengths and into infrared wavelengths as well (From http://en.wikipedia.org/wiki/File:Bayer_pattern_on_sensor_profile.svg). (c) Bayer color filter array from US patent 3,971,065

Compensating for the Sensor and Lens
It was already mentioned that the lens in any camera is imperfect. It should not be a surprise to learn that image sensors also suffer from some defects. Typically these are manufactured in a volume production environment, and there are many small variations in the manufacturing process that inevitably lead to differences between individual sensor chips, and indeed between the sensitivities of individual rows and even individual pixels on a single chip. The situation is further complicated by the additional electronic circuits that process the analog pixel values, converting them into digital data; these can introduce additional offsets and errors. Most variations tend to cancel out, but occasionally some of them are cumulative and lead to a chip that is mostly functional but has some pixels or pixel rows that lie outside acceptable tolerances in sensitivity or performance. These must be compensated for when the image is postprocessed, typically in the early stages of the IPP (Fig. 5).



Fig. 5 The initial processing stages for sensor data involve compensation for sensor defects, fixed pattern noise, and lens shading prior to the conversion of raw Bayer pattern data to RGB (Adapted from US patent application 2013/0004071)

Optical Black Processing
The first step in normalizing the image that is obtained from the sensor is to eliminate offsets due to the dark current. This is a residual current that is present in the underlying electronic circuitry of the pixel and that leads to a slight offset, even if the pixel is shielded from light. For this purpose, some pixels along the edge of the sensor are permanently shielded from light and provide a reference indication of the variation in magnitude of this parameter across a sensor. Depending on the manufacturing process, it may be known that certain sensor rows have higher levels of dark current, and these data can be programmed into the IPP hardware. In general this processing step is necessary to allow very dark regions of an image to appear properly black.

Pixel Defects
These are more serious defects in the pixels of the sensor array. They can arise for numerous reasons, from small particulates in the processing environment and impurities in the manufacturing process to optical flaws in the lens array or even scratches during handling and mounting. In addition to flaws in individual pixels (hot and/or dead pixels), there can be clusters or entire rows or columns of pixels that are nonfunctional. Statistically some flaws are expected, and the testing carried out postmanufacturing will determine whether a sensor is of acceptable quality. It is common to expect tens or even hundreds of defects on an individual sensor, and these can be compensated for in the IPP processing using a defect map recorded during testing. Larger sensors with more pixels typically exhibit a higher number of defects, and much of their increase in cost is caused by lower batch yields from manufacturing.

Camera Noise
There are different noise sources that arise in a digital camera. Fixed pattern noise is a noise pattern that occurs on digital imaging sensors and is characterized by the same pattern of "hot" (brighter) and "cold" (darker) pixels occurring in images taken under the same illumination conditions. It is due to variations in the geometries and sizes of individual pixels and can to a large extent be calibrated out of a manufacturing batch by characterizing a selection of sensors. For an individual camera, a more precise compensation can be achieved by taking a number of reference shots with predetermined illumination conditions. Other noise sources originate in the underlying electronics and can become more noticeable in low-lighting conditions when the gain of individual pixels is increased. Salt-and-pepper noise leads to bright pixels in dark areas and dark pixels in bright areas of the image and tends to originate in the analog-to-digital conversion on-board the sensor. Shot noise has several sources, including the sensor dark current levels and statistical variations in the number of photons sensed by pixels at a given exposure level. A technique known as dark frame subtraction can help reduce the effects of most of these noise sources and is often implemented in the ISP.
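The early-IPP corrections just described can be sketched in a few lines. This is an illustrative simplification under assumed inputs (a scalar black level, a calibrated dark frame, and a boolean defect map); the function name and the crude neighbor-averaging are hypothetical, and a real IPP would interpolate from same-color Bayer neighbors.

```python
import numpy as np

def correct_sensor(raw, black_level, dark_frame, defect_map):
    # Remove the optical-black offset and the fixed-pattern dark frame.
    img = raw.astype(np.int32) - black_level - dark_frame
    img = np.clip(img, 0, None)
    # Replace flagged defective pixels with the mean of their valid
    # neighbors (crude; ignores the Bayer color phase for brevity).
    for y, x in np.argwhere(defect_map):
        y0, y1 = max(y - 1, 0), min(y + 2, img.shape[0])
        x0, x1 = max(x - 1, 0), min(x + 2, img.shape[1])
        patch = img[y0:y1, x0:x1]
        ok = ~defect_map[y0:y1, x0:x1]
        img[y, x] = patch[ok].mean() if ok.any() else 0
    return img

raw = np.random.randint(80, 1000, (6, 6))
defects = np.zeros((6, 6), dtype=bool)
defects[2, 3] = True                              # one hot pixel from testing
clean = correct_sensor(raw, black_level=64,
                       dark_frame=np.zeros((6, 6), dtype=int),
                       defect_map=defects)
```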


Lens Shading, Chromatic Aberrations, and Purple Fringing
As previously mentioned, the lens system in a consumer device is far from perfect. When it is coupled with the image sensor, there are several major sources of image distortion that require correction. The effects of an imperfect lens can be seen through the degradation of three main lens parameters: geometrical distortion, relative illumination, and modulation transfer function (MTF – a measure of lens sharpness). To start with, the lens only acts as an ideal lens for a small region at the center of the image sensor. Towards the edges of the sensor, the optical path is longer and thus the edges of the sensor are not on the correct imaging plane. This needs to be corrected in the optical design, leading to either geometrical distortion of the image, loss in sharpness towards the edges, or loss in relative illumination. As lens and sensor geometries are known, to a certain level of accuracy, the geometrical distortions can be compensated for. Another known quality issue is lens/sensor shading, and the problem is made worse when the lens has a short focal length and a larger image sensor is employed – both desirable aspects in a modern smartphone. (A larger sensor can employ larger pixels, compensating for the diffraction limit, while a short focal length is a necessary requirement to fit a thin lens/camera module into the body of a modern smartphone.) Chromatic aberrations arise from the nature of the lens material and the varying effects that it has on different wavelengths of light. This is the same effect that splits light into a rainbow when it passes through a prism – the optical paths of different wavelengths vary slightly. In the final image, a common result is lateral chromatic aberration (when the color channels are displaced on the sensor) or axial chromatic aberration (when the color channels have different sharpness). A related artifact is purple fringing, where the edges of objects in an image have a thick bluish border due to the blue light being slightly out of registration. These effects are most prominent in the lower-quality plastic lenses that are commonly found in consumer imaging devices.

Pixel Gain and Compensation
Most consumer imaging devices use active pixel sensors where an integral amplifier enables the sensitivity of the pixels to be increased or decreased depending on ambient light levels. This is important for CMOS sensors, especially in video mode – the camera is constantly acquiring image data, off-loading pixels one row at a time across the sensor. Thus the exposure time for an individual pixel is restricted by the frame rate – typically 60 frames per second (fps). (This applies when the imaging system operates in its normal acquisition mode; in low-light conditions, the standard frame rate can be reduced by 1/2, 1/3, 1/4, 1/6, . . . allowing significantly longer exposure times.) So pixels can only be exposed for a maximum of 1/60 s. To achieve the effect of a longer exposure, the pixel charge has to be amplified when it is off-loaded. But in turn this amplifies any noise that is present (requiring complex denoising circuitry). This capability corresponds to the ISO settings available on many modern digital cameras. Not only is any underlying noise amplified, but the gain also boosts the effects of dark current, and each gain setting requires a different offset compensation.
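Returning to the lens/sensor shading discussed above, a minimal correction sketch is shown below: a radial gain map boosts the darker corners back toward the center level. The quadratic falloff model and the falloff constant are illustrative assumptions; real modules are calibrated with flat-field exposures.

```python
import numpy as np

def shading_gain(h, w, falloff=0.4):
    """Gain map that undoes an assumed quadratic vignetting falloff."""
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2, (w - 1) / 2
    r2 = ((yy - cy) / cy) ** 2 + ((xx - cx) / cx) ** 2  # 0 center, ~2 corner
    relative_illum = 1.0 - falloff * (r2 / 2.0)         # darker toward corners
    return 1.0 / relative_illum                         # gain to undo it

image = np.ones((8, 12))                    # a flat test field
corrected = image * shading_gain(*image.shape)
print(corrected[0, 0], corrected[4, 6])     # corners boosted, center ~1.0
```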
Bayer Scaling
The last step in this initial processing stage is to modify the Bayer data by scaling each of the R, G, and B values according to the white balance settings of the camera. Typically the white balance can be predetermined for certain scene modes, or the camera may calculate it automatically based on previously acquired images – usually the camera is continually acquiring new frames at preview frame rates (usually lower than the video recording rates) while you are composing your picture. The image is now ready for conversion from Bayer or RAW format into a format more suitable for additional postprocessing.
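A sketch of this scaling step is given below. The gains here are derived with the simple "gray world" assumption (the average scene color is neutral), which is only one illustrative way to obtain them; cameras normally use preset scene modes or statistics from preview frames, and the RGGB phase is again an assumption.

```python
import numpy as np

def gray_world_gains(r_mean, g_mean, b_mean):
    # Normalize R and B to the green channel, which keeps unity gain.
    return g_mean / r_mean, 1.0, g_mean / b_mean

def scale_bayer_rggb(raw, gains):
    """Apply per-channel white-balance gains directly to an RGGB mosaic."""
    gr, gg, gb = gains
    out = raw.astype(np.float32)
    out[0::2, 0::2] *= gr      # R sites
    out[0::2, 1::2] *= gg      # G sites (rows shared with R)
    out[1::2, 0::2] *= gg      # G sites (rows shared with B)
    out[1::2, 1::2] *= gb      # B sites
    return out

raw = np.random.randint(0, 1024, (4, 4))
balanced = scale_bayer_rggb(raw, gray_world_gains(480.0, 520.0, 410.0))
```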



Fig. 6 The Bayer to RGB conversion also involves spatial interpolation and the image data is often further converted to YCC (YUV) so that luminance data is available and to prepare the image for JPEG compression (or MPEG if it is a video frame) (Adapted from US patent application 2013/0004071)

Color Processing (Fig. 6)

Debayering the Image
Most sensors use a conventional Bayer pattern with two green pixels to every red or blue pixel. In part the rationale is that the human eye is more sensitive to green wavelengths. Of the major camera manufacturers, only Fujifilm has brought an alternative color array to market in their X-Trans sensors. Other specialized sensors, such as Foveon's technology that enabled three independent color pixels to coexist at the same physical location, have not been successfully commercialized. For a conventional sensor, this means that at each pixel we have only got color information from a single range of color wavelengths – red, green, or blue. Thus we need to interpolate at each pixel in order to find the missing color values. This process is not as straightforward as it might at first seem, and because of the importance of this conversion to the final image quality, there has been a significant amount of research into the effects of different debayering algorithms. Sharpness from the G channel has to be transferred to the R and B channels, and the techniques to do so have been researched and improved year after year. It is probably fair to say that the differences between algorithms for most images are quite small, but occasionally some side effects may be noticeable. For this reason, most professional photographers prefer to store and archive images in RAW format. Nevertheless, recent cameras have provided improved compensation for such conversion artifacts, and some state-of-the-art debayering and high-quality JPEG conversion engines are considered good enough for all but the most demanding image scenes.

RGB Processing
The natural step from Bayer is not directly to JPEG, because there is still a need to balance and adjust the image data. Instead there is typically an intermediate RGB conversion to allow for gamma adjustment. The standard RGB color space formats use a nonlinear encoding (a gamma compression) of the intended intensities of the primary colors of the photographic reproduction. This is further dependent on the luminance and tonal distributions of the current scene – another complex nonlinear relationship. As it is too complex to correct in a single step, typically the image is gamma corrected after the initial RGB conversion, with further color and luminance balancing performed after YCC conversion.

YCC Conversion
YCC, or YCbCr as it is also known, is not an absolute color space; instead, it is a way of encoding RGB information that takes advantage of the human visual perception system. Because our vision systems are less sensitive to the compression of color data than of luminance, moving the image data into a YCC format is a first step towards JPEG compression. In fact there are multiple variants of YCC, but in most consumer imaging pipelines, YUV is used. The main reason to implement YUV is for interfacing with analog or digital television or photographic equipment, as many such devices conform to YUV industry standards such as PAL and



SECAM. Historically these standards were designed to support the evolution of TV signals from black and white (Y component only) to color television, which added two additional color signals to the original B&W analog signal. One particular advantage of YUV color representations is that transformations from RGB are possible using integer-math fixed-point approximations. These are well developed and proven from television technology. In the NTSC and PAL systems, the chrominance signals had significantly narrower bandwidth than that for the luminance. Early versions of NTSC rapidly alternated between particular colors in identical image areas so that, to the human eye, they appeared to blend together, while all modern analog and even most digital video standards use chroma subsampling, recording a picture's color information at reduced resolution.
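The fixed-point style of RGB-to-YCbCr conversion mentioned above can be illustrated as follows. This is a sketch of a BT.601-style approximation, with the floating-point matrix scaled by 256 and applied using integer arithmetic plus a final shift; the exact coefficients used by any particular ISP are an assumption here.

```python
def rgb_to_ycbcr(r, g, b):
    """Integer fixed-point approximation of BT.601 RGB -> YCbCr (8-bit)."""
    y  = ((  77 * r + 150 * g +  29 * b) >> 8)          # ~0.299, 0.587, 0.114
    cb = ((( -43 * r -  85 * g + 128 * b) >> 8) + 128)  # ~-0.169, -0.331, 0.5
    cr = ((( 128 * r - 107 * g -  21 * b) >> 8) + 128)  # ~0.5, -0.419, -0.081
    return y, cb, cr

print(rgb_to_ycbcr(255, 255, 255))   # white -> (255, 128, 128)
print(rgb_to_ycbcr(0, 0, 0))         # black -> (0, 128, 128)
```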

Optimizing the Image

White/Color Balance
In photography, this is the adjustment of the primary colors so that "white" actually appears "white." In a 24-bit RGB space, this would imply that a value of 255, 255, 255 – all RGB colors at maximum value – should appear as a brilliant white. But as we have seen, the image has been through multiple transformations and scaling steps, so it is very likely that some global offsets and imbalances have been introduced. Further, the acquired image data is highly dependent on the imaged scene, so some global compensation is also required. In practice white balancing should more correctly be called "gray-level" balancing, as it seeks to ensure that neutral colors are correctly represented (Fig. 7). As the image is typically in YUV format towards the end of the IPP, it is usually convenient to manage the color or chroma correction independently of the luminance. The original luminance levels can be referenced back to the RGB data as a baseline, and depending on the sophistication of the IPP, a range of local and global refinements of the luminance data can be performed. As examples, functions such as image sharpening, desaturation of highlights, or shadow-region enhancement can effectively be provided at this point in the IPP.

Fig. 7 Color balancing is followed by digital zoom/resize, which may be needed to rescale the image to match the size of the preview display screen on the device, to provide 16:9 movie format, or to facilitate advanced image processing algorithms (Adapted from US patent application 2013/0004071)

The Image Signal Processor
In its original forms, the IPP was typically implemented as a dedicated hardware pipeline with some programmable memory for lookup tables (LUTs) and a fixed set of control I/O. However, as the capabilities and configurability of digital imaging systems expanded, it became necessary to add increasing flexibility to the IPP. In many of today's devices, there is at least one dedicated CPU core embedded into the IPP, and this combination of IPP + CPU is commonly referred to as an image signal processor (ISP). In a digital camera, the ISP is likely also the main CPU, as the primary function of the device is simply to capture images. But a modern smartphone will have a separate and more powerful main CPU for the device – this is normally referred to as the application processor (AP). This arrangement allows the ISP to be highly optimized for low-level image processing, and it will frequently incorporate dedicated hardware modules or IP cores designed to implement more advanced functionality such as face detection and tracking. The use of such modules allows the main ISP core to be low power and can provide significant power efficiency advantages in terms of battery consumption over general-purpose DSP or GPU approaches (Fig. 8).

Fig. 8 A typical camera system found in mobile phones featuring both a high-resolution rear (primary) camera and a lower-resolution front (user-facing) camera

Image Focus and Exposure

The Autofocus (AF) System
Most consumer imaging devices employ some form of contrast detection autofocusing (CDAF). Such an autofocus system estimates the degree of focus by numerically calculating the energy of the high-frequency components in the input image. There are many different approaches and algorithms; see, for example, Groen et al. (1985), Ligthart and Groen (1982), Shih (2007), and Santos et al. (1997). The three main elements of a CDAF system are (i) selection of the focusing region (or focusing window), (ii) computation of a focus measure which yields a maximum at the focused position, and (iii) search for the peak value of the focus measure. Among these, the focus measure and the search for its peak value are the most important factors in determining the speed and accuracy of autofocusing. In CDAF systems, the focus sensor is the same as the imaging sensor (typically a rolling shutter CMOS sensor in smartphones and consumer cameras).
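A minimal sketch of such a focus measure is shown below: the energy of horizontal and vertical gradients inside the AF window. This gradient-energy measure is just one of many published options (see Groen et al. 1985 for comparisons), and the window coordinates are illustrative assumptions.

```python
import numpy as np

def focus_measure(image, window):
    """Sum of squared gradients inside the AF window (y0, y1, x0, x1)."""
    y0, y1, x0, x1 = window
    roi = image[y0:y1, x0:x1].astype(np.float64)
    gx = np.diff(roi, axis=1)        # horizontal differences
    gy = np.diff(roi, axis=0)        # vertical differences
    return (gx ** 2).sum() + (gy ** 2).sum()

img = np.random.rand(480, 640)
# A window roughly one-third of the frame height, full width:
print(focus_measure(img, (160, 320, 0, 640)))
```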



Fig. 9 A typical contrast detection AF system components showing the complexity of hardware/software interactions between ISP, lens driver and main AP

The image is captured by the sensor and analyzed by the ISP. An AF algorithm layer typically initializes the ISP. This includes setting a focusing region, or window, where the focus measure is computed – the AF window. There are many different methods to compute the sharpness in the target AF window (Groen et al. 1985; Ligthart and Groen 1982; Shih 2007). Usually the computation is done in hardware, as it is computationally intensive and must be performed in real time on each image frame.

The AF Control System
The control loop is closed by the AF algorithm and typically runs as software within the ISP. The sharpness function is analyzed by a specific algorithm that searches iteratively for a peak in the sharpness function by moving the lens and performing a sharpness evaluation on the target AF window. The AF window may be chosen by the AF algorithm itself, or it may be provided by the user (e.g., touch focus) or by other external intelligent algorithms (e.g., face detection and tracking, object detection and tracking, etc.). The AF algorithm layer may also incorporate a scene detection algorithm that will retrigger a focus cycle when changes in the scene are detected (e.g., the object in the AF window goes out of plane, a new scene has been framed, etc.) (Fig. 9). The ISP determines and outputs contrast (or sharpness) information (AF statistics data) at every frame, signaled with an interrupt. The AF algorithm evaluates the data and determines the position of the lens for the next image acquisition cycle.

AF Search Time
Most mobile phone cameras use a CMOS image sensor with no global shutter. A rolling shutter is used during both preview and full image frame acquisition.



Fig. 10 AF timing and lens actuation; contrast-based autofocus for a rolling shutter camera; frame-to-frame and focus cycle timings, including example lens motion and settling times

In state-of-the-art imaging systems, the preview is driven at full HD resolution, modern ISPs being well capable of processing full HD at real-time frame rates of 24–60 fps depending on the device hardware. The timing of such a system is depicted in Fig. 10, where a rolling shutter CMOS sensor is used and clocked to achieve 30 fps, with a readout time of 33 ms. The AF window (shown in blue) is typically about one-third of the height of the field of view (FOV) of the camera and can use the full width. The exposure time is considered to be quite low (below 10 ms), and a voice-coil module (VCM) open-loop actuator with around 40 ms actuation time plus settling time is assumed in this figure. The sharpness or focus measure is computed by the ISP in real time, as the image lines are transferred to the ISP. At the end of the sharpness computation, an interrupt takes place that will in turn trigger the AF algorithm running on the CPU to evaluate the new position of the lens. The CPU will then initiate a lens motion to the new position. Note that many modern smartphone devices use VCM technology which, due to the scaling effects of miniaturization, leads to slow actuation times. Thus these modules require at least two image frames per focus step. In consequence, current smartphones tend to suffer from slow focusing cycles and focus hunting.

AF Peak Search Algorithm
The hill climbing approach is the most common AF algorithm. It begins by checking the image contrast of the AF window from an infinity setting, moving back towards a macro position (the closest focus distance at which the camera can resolve an image). Once the algorithm detects a contrast decrease, it steps back towards the peak. In the simplest implementation, the first steps through the focus range are relatively large, and after a contrast decrease is detected, a smaller step size is used. There are many variations on the hill climbing technique using improved focus measures or using data from a small number of initial AF steps to find the peak value using smart interpolation techniques. For example, to greatly increase the speed of a regular hill climb while trading off some user experience, an AF algorithm could move the lens in increments of two positions instead of one, in effect taking a larger step, and when the peak is overshot, the actual peak can be interpolated by a second-order polynomial using the last three lens positions and their respective AF data. The graph below (Fig. 11) shows the structure of a simple hill climb AF, where the average number of AF cycles (assuming no hysteresis is present) is six.

Sharpness Function Response
Fast methods of peak search can be employed if the sharpness function is steep; otherwise approximating the peak can be difficult.



Fig. 11 Optimal hill climb focus – Steps 1–5 (From FotoNation datasheet – redrawn)

Fig. 12 Various sharpness functions' characterization – three sample algorithms (Data courtesy of FotoNation – www.fotonation.com)

The method presented above would work nicely if the sharpness function did indeed fit a second-order polynomial in all lighting conditions. Each imaging device or camera module has its own unique characteristics, and these will vary depending on the scene being imaged. Figure 12 presents three different sharpness functions that were measured for the same scene, in good illumination conditions and using the same camera module. The first function is an FPGA implementation of a modified Vollath F4, a basic CDAF algorithm that is widely used in digital cameras. The response is unsatisfactory for devising fast AF algorithms, but an improved technique based on the application of an infinite impulse response (IIR) filter yielded the much-improved characteristic shown as continuous autofocus (CAF) in Fig. 12. The equivalent sharpness function, implemented using an off-the-shelf industry-standard commercial ISP, is provided for comparison. Note that the IIR-based function is significantly more sensitive than the other approaches, but since the implementation requires dedicated hardware, the AF algorithms would need to be tuned to work on top of existing implementations in target systems.
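The coarse hill climb with parabolic refinement described above can be sketched as follows: step the lens in increments of two positions until the sharpness falls, then fit a parabola through the last three (position, sharpness) samples to estimate the true peak. This is an idealized illustration that assumes a smooth, hysteresis-free sharpness function; all names are hypothetical.

```python
def hill_climb(sharpness_at, positions, step=2):
    """Coarse hill climb with parabolic peak interpolation (sketch).

    Assumes the peak is bracketed within `positions` so that at least
    three samples exist when the contrast decrease is detected.
    """
    samples = []
    for i in range(0, len(positions), step):
        p = positions[i]
        samples.append((p, sharpness_at(p)))
        if len(samples) >= 3 and samples[-1][1] < samples[-2][1]:
            break                                # overshot the peak
    (x0, y0), (x1, y1), (x2, y2) = samples[-3:]
    # Vertex of the parabola through the last three samples.
    denom = (x0 - x1) * (x0 - x2) * (x1 - x2)
    a = (x2 * (y1 - y0) + x1 * (y0 - y2) + x0 * (y2 - y1)) / denom
    b = (x2**2 * (y0 - y1) + x1**2 * (y2 - y0) + x0**2 * (y1 - y2)) / denom
    return -b / (2 * a)

true_peak = 7.3
sharpness = lambda p: -(p - true_peak) ** 2      # idealized sharpness curve
print(hill_climb(sharpness, list(range(0, 20)))) # ~7.3
```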



Rolling Shutter Issues
Digital image sensors in modern consumer devices are typically CMOS based. For reasons of cost, complexity, and size, these do not have mechanical shutter mechanisms, relying entirely on electronic controls to provide an equivalent shutter function. Thus the sensor is continually exposed, and pixel elements continually accumulate charge generated by photons of light as they strike these pixel elements. To provide an electronic shutter mechanism, the sensor must implement a means to off-load and store the current charge levels of the sensor pixels. If it were to mimic an analog camera, then every pixel should be off-loaded and stored at the same instant in time – analogous to capturing a full film negative when the shutter is activated. This form of electronic shutter mechanism is known as a global shutter. However it requires (i) an independent external data interface for each pixel element, (ii) a dedicated analog-to-digital convertor (ADC) for each pixel element, and (iii) a fast memory cache sufficiently large to temporarily buffer/store the digital output from every pixel element. The large cache memory requirements, ADC elements, and the number of data lines make such solutions impractical for consumer systems. A more practical approach is to off-load the sensor one pixel row at a time. This serialized transfer greatly reduces the additional electronic circuitry required, as ADC and data lines can be shared. Data is constantly being off-loaded from the sensor and loaded into an image frame buffer. In this configuration, the image in the frame buffer is formed from rows of image data that were acquired at incrementally different times, and the frame buffer is constantly being updated until the shutter button is pressed. At that point the ongoing off-loading of data from the sensor is paused, and the current frame buffer is moved to more permanent long-term storage. (In some imaging systems, there can be multiple image frame buffers, so that while one image is being stored, the device can continue to acquire a following image frame; this explains how some cameras can acquire multiframe image sequences at the full sensor resolution.) This type of electronic shutter is known as a rolling shutter, as each row of the image corresponds to a time period that commenced a short time increment after the previous image row. Although it provides a simple and cost-effective solution for consumer devices, rolling-shutter sensors are not without their drawbacks. Because different image lines are effectively exposed over different intervals of time, the motion of either camera or subject will cause geometric distortions, such as skew or wobble. These problems become more significant and noticeable as high-quality imaging systems become the norm in handheld devices such as smartphones. They also create particular difficulties for the implementation of new modes of image acquisition such as sweep panorama imaging, where the user is required to sweep the camera across a scene in order to acquire multiple images for postprocessing into a single wide-frame image. The interested reader will find many papers in the research literature discussing additional details of the rolling shutter phenomenon. Some starting points include Liang et al. (2008), Han et al. (2011), Nicklin et al. (2007), Forssén and Ringaby (2010), Baker et al. (2010), Grundmann et al. (2012), and Oth et al. (2013).

Image Exposure
As can be seen from Fig. 13, modifying the reset time controls the exposure time for a digital camera that employs current CMOS sensor technology. Because the latest consumer devices are designed to acquire video as well as still images, this leads to some restrictions, and most devices will provide a separate still-image mode where the speed of the rolling shutter is slowed to allow exposures longer than 30 ms. In smartphones and other handheld devices that feature camera modules, it is generally not desirable to allow long exposure times – the form of these devices makes them more difficult to hold still than a dedicated camera device. Thus many smartphone devices do not support extended exposure times, and their use is restricted to environments with reasonable levels of ambient light.


Fig. 13 Rolling shutter. (a) CMOS image sensor layout; (b) exposure timing for rolling shutter

However there is a great deal of pressure to improve the low-light capabilities of smartphones, and the latest models have incorporated lens assemblies with surprisingly low f-numbers (large apertures) and dynamic frame rates that can be slowed significantly in low-light conditions. These devices feature an improved ability to acquire still images and low frame-rate video at much lower levels of ambient lighting.

High-Dynamic-Range (HDR) Imaging
One technique that is widely employed by handheld devices is that of multiple image capture. As has been indicated, the natural mode for a CMOS imaging device is a constant rolling shutter at 30 fps, leading to a 33 ms acquisition time window. Extending the acquisition time beyond this window tends to lead to blurred images due to hand movement. However, if two consecutive image frames are acquired, they will cover essentially the same image scene, and with suitable digital processing, an image can be generated that provides an improvement in image quality equivalent to a longer exposure. Thus the tendency for these devices is not to feature hardware to extend exposure time but rather to rely on capturing and merging multiple images of the same scene to provide improved dynamic range and better low-light sensitivity. The subject of HDR imaging is quite an involved one, as there are multiple problems that must be resolved to achieve a robust and effective combination of two (or more) images. However it is worth mentioning that this technology is now being adapted and incorporated into the actual image sensors and the individual pixels of these sensors (Bandoh et al. 2010; Fowler 2011).
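The simplest form of multi-frame merging can be sketched as below: averaging two already-aligned captures keeps the signal while reducing uncorrelated noise by roughly a factor of sqrt(2). This is a deliberately minimal illustration; real HDR pipelines also align frames, reject ghosts, vary the exposures, and tone-map the result.

```python
import numpy as np

def merge_frames(frame_a, frame_b):
    """Average two aligned frames to emulate a longer exposure (sketch)."""
    return (frame_a.astype(np.float32) + frame_b.astype(np.float32)) / 2.0

# Two noisy 33 ms captures of the same scene (photon shot noise modeled
# as Poisson); the merged frame has visibly lower relative noise.
a = np.random.poisson(100, (64, 64))
b = np.random.poisson(100, (64, 64))
merged = merge_frames(a, b)
print(a.std(), merged.std())   # merged std is ~1/sqrt(2) of a single frame
```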

From Still Images to Video
So far our focus has been on still images and their acquisition. However it has been fairly clear from earlier descriptions that modern digital cameras are designed to facilitate the capture of multiple image frames. With the increasing capabilities of modern handheld devices in terms of memory, data storage, and processing power, it is a small additional step for one of today's digital cameras to be adapted to perform as a fully functional video camera. In this section, we consider what happens to the image frames after they exit the IPP. Most cameras already use real-time image frames to generate a live display for the camera screen. The other end point for image frames is to store them, but typically they are compressed before being stored permanently. We consider both still and video compression and how there is significant commonality between them.



The reader should then understand the convergence of modern still photography and video cameras, as it makes sense to incorporate both compression modalities in a single hardware engine.

Displaying the Image
Prior to 1995, digital consumer cameras did not provide live preview but relied instead on optical viewfinders. The first digital still camera to provide live preview on-screen, with autoexposure framing, was the Casio QV-10. It was more than ten years later before high-end digital single-lens reflex (DSLR) cameras began to appear with this feature, as it is fundamentally incompatible with the swinging mirror-based mechanism in such cameras. Today most low-end cameras have completely eliminated optical viewfinders, as the live preview functionality is embedded into dedicated ISP chipsets.

Live Preview Modes
There are two main modes of operation for live preview, with only a handful of manufacturers offering both modes in their consumer cameras. The first mode directs the sensor output directly onto an electronic display to provide a preview of what the camera's sensor will detect prior to image acquisition and additional postprocessing. This is helpful when lighting conditions are low and too dark for an optical viewfinder. The sensor autogain will effectively boost the ISO to an appropriate level, and the user can see how noise levels and other side effects will affect the image in real time. This mode of live preview is known as autogain/framing mode, also described as framing priority display mode. The second mode of live preview is more sophisticated, displaying the exact final appearance on the display. In other words, the results of IPP processing are shown, and this allows the photographer to alter programmable acquisition settings such as shutter speed, ISO, and aperture to control the image exposure before the main image is acquired. This second type of live preview is known as exposure priority display. It can also allow dynamic adjustment for video acquisition. Many bridge and compact cameras with movie mode have only automatic exposure with limited exposure compensation control, and a live view that is primarily for framing only. However, as ISP chipsets become more powerful, it is to be expected that most cameras will feature both framing and exposure priority displays in the next few years. Some exposure priority preview cameras offer a live histogram graph for color balance or image tone; the graph changes dynamically as exposure adjustments are made, enabling the photographer to experiment with settings of aperture and exposure timing. Other features that may be available include a live depth of field (DOF) estimate and dynamic marking of overexposed areas of the imaged scene. Almost all modern bridge and compact cameras now have a movie mode, with 1080p or "full-HD" video capability now available on even low-cost consumer models. However the ability to adjust the exposure and other parameters of the image stream outside the autoexposure values is dependent on the power and real-time capabilities of the underlying ISP and/or application processor.

Compression Algorithms in Cameras
When considering the IPP, emphasis was placed on the color transformations that are performed and the fact that the typical output from the IPP is in YUV format. As mentioned, this color space has a long



tradition in the consumer electronics industry, and its underlying rationale is that it facilitates the further compression of color or chrominance data. In older analog systems, this was achieved by using a narrower bandwidth for the color signal, but in modern digital devices, the next step is to apply a lossy compression to the YUV image. For still images, the best known format is JPEG, a standard that originated in the 1980s but continues to dominate in consumer devices (Wallace 1992). For movies and video, the landscape of compression algorithms is more complicated but appears to have converged in practice on the latest MPEG-4 AVC (H.264) and HEVC (H.265) standards (Marpe et al. 2005, 2006; Pourazad et al. 2012; Sullivan et al. 2012).

Still Image Compression

The term "JPEG" is an acronym for the "Joint Photographic Experts Group," the organization responsible for the creation of the various JPEG standards. The basic JPEG method provides a lossy form of compression based in part on a spatial reduction of the chrominance data, combined with the application of the discrete cosine transform (DCT) to remove redundant high-frequency data and a form of Huffman compression. The compression method is described as lossy because some of the original image information is lost during the compression process. However the degree of lossiness can be adjusted, and there is even a lossless mode defined in the JPEG standard, but this is not widely supported. As with most variable compression schemes, there is a trade-off between image size and the lossiness of the compression. Nevertheless the main JPEG algorithm has proved very robust, both in terms of its wide deployment in the consumer digital imaging industry and in terms of its resistance to patent claims on the underlying techniques. Further, the standard has continued to evolve and improve to meet the needs of the digital photography industry. A more detailed description of an exemplary JPEG compression process is provided in the next section.

The JPEG Encoding Process
In this section, we provide a brief description of a common method of encoding when applied to a 24-bit input image (8 bits each for R, G, and B values at each pixel). This will help the reader understand how the output YUV image is further processed in the JPEG engine of a typical digital camera or smartphone.

Color Space Transformation
The starting point for JPEG compression is YCC (or the YUV subset) for a number of reasons. Firstly, as was previously discussed, this color space separates the image nicely into the luminance component, important for human perception, and the color components that are of less importance. A second reason is that the three individual components of the YCC color space are largely decorrelated, in contrast with the RGB color space where all three components are highly correlated (Ionita and Corcoran 2011; Ionita et al. 2009). In practical terms, this improves the performance of the compression techniques that are subsequently applied to the image. Within a camera system, no additional color conversion of the RAW images is needed because, as we saw earlier, the output of the IPP is normally in YCC (YUV) format, which is ideally suited as a starting point for the JPEG compression process.

Downsampling
The next step is a reduction in the spatial resolution of the two color components. The standard ratios of downsampling for JPEG images are 4:4:4, which implies that no downsampling is used; 4:2:2, indicating downsampling by a factor of 2 in the horizontal direction only; or, most commonly, 4:2:0, which describes downsampling in both the horizontal and vertical directions, again by a factor of 2.



Block Splitting
Each of the three YUV channels must next be split into 8 × 8 macroblocks. Depending on the chroma subsampling applied, these are derived from blocks of size 8 × 8 in the original image for the 4:4:4 case (no subsampling), from 16 × 8 in the 4:2:2 case (horizontal subsampling), or most typically blocks of 16 × 16 pixels in the 4:2:0 case (both horizontal and vertical subsampling).

Discrete Cosine Transform
Next, each 8 × 8 block of each component is transformed into a frequency-domain representation, through a normalized, 2D discrete cosine transform (DCT).

Quantization
The human visual system is not good at detecting high-frequency variations in brightness. Thus the amount of high-frequency information in the image can be greatly reduced. A simple approach is to divide each frequency-domain component by a fixed value, rounding to the nearest integer. The rounding operation and chroma subsampling are the lossy parts of the JPEG process. Typically many high-frequency components of the frequency-domain image will be rounded off to zero, and of the remaining components, many survive only as small positive or negative numbers. These can be truncated to provide additional compression gains in the final stages of compression.

Entropy Encoding
Entropy encoding is a final lossless run-length encoding (RLE) process applied to the output of the previous compression steps. As was just indicated, many DCT components may have been rounded off, and RLE is particularly suited to reducing strings of recurring values. It follows the pattern of the DCT coefficients, which are arranged in a zigzag order as shown in Fig. 14. The RLE algorithm thus groups similar frequencies together, run-length coding strings of zero or low integer values and using Huffman coding on the remaining values of the DCT coefficients. Some variants of the JPEG standard allowed for other compression techniques that were superior to Huffman coding, but patents owned by third parties covered these, and as a consequence, they were not adopted by the industry. In some cases, such improved entropy encoding might have offered a further 5 % increase in compression efficiency, but it would also have been slower to implement than Huffman. The first coefficient in each DCT block is the DC component. This represents the average luminance or chrominance value of that block. The remaining coefficients represent the higher frequencies, or AC variations, across that image block. One aspect of entropy encoding is that the previously quantized DC coefficient is used as a reference for the DC coefficient of the next macroblock; thus only the difference is recorded. This improves compression across image regions of a similar luminance or chrominance range, but it also means that a JPEG image has to be decoded in a single global operation – it is not possible to jump to a specific region of the image and decode it partially. This predictive differencing is unique to the DC coefficients – the original value of each of the 63 AC coefficients in each macroblock is recorded. However, as many of these are typically rounded off to zero, substantial compression is still achieved on most macroblocks of a typical image (Fig. 15).
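The lossy core of the per-block processing can be sketched as follows: a 2D DCT of an 8 × 8 block followed by division by a quantization value and rounding. A single uniform quantizer is used here for brevity; real encoders use an 8 × 8 quantization table weighted against high frequencies, so this is an illustration rather than a standards-conformant encoder.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (the transform used by JPEG)."""
    m = np.zeros((n, n))
    for k in range(n):
        for i in range(n):
            c = np.sqrt(1 / n) if k == 0 else np.sqrt(2 / n)
            m[k, i] = c * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    return m

def encode_block(block, q=16):
    d = dct_matrix()
    coeffs = d @ (block - 128.0) @ d.T          # level shift, then 2D DCT
    return np.round(coeffs / q).astype(int)     # the lossy rounding step

block = np.full((8, 8), 130.0)                  # a nearly flat image block...
print(np.count_nonzero(encode_block(block)))    # ...only the DC term survives
```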

Video Compression
Raw digital video requires high data rates. Those of us who recall the IEEE 1394 video standard will remember that 400 Mbps was considered a requirement for standard digital TV, and 800 Mbps and even 1600 Mbps were the targets for HD TV back in the 1990s. But then video compression using modern coding techniques came of age, and it became possible to stream high-quality video with a few Mbps.



Fig. 14 Entropy encoding is the final step in the JPEG compression process; the zigzag ordering of JPEG image components shows the path taken by entropy encoding (From http://en.wikipedia.org/wiki/File:JPEG_ZigZag.svg)

Fig. 15 The process of JPEG compression (top) and decompression (bottom) (From http://en.wikipedia.org/wiki/File:JPEG_process.svg)

Today, Netflix recommends 4 Mbps for 720p video, a reduction of two orders of magnitude from the IEEE 1394 standard! Today's highly optimized video compression algorithms combine spatial compression across an image frame (intraframe compression) with temporal motion compensation to enable compression between image frames. Typically image frames are arranged, with a key frame surrounded by prior and succeeding image frames, into a group of pictures (GOP). In practice, video codecs also implement audio compression techniques in parallel to enable combined data streams in a single package (Noll 1997; Gao et al. 2014; Sullivan and Wiegand 2005). A typical MPEG-4 lossy compression video has a compression factor between 20 and 200 (Vatolin et al. 2007). The majority of video compression schemes operate on rectangular areas of the image, often called macroblocks. (The use of rectangular image blocks is helpful when designing electronics hardware to implement the functionality of the algorithms associated with a particular codec.) These image blocks are compared at the pixel level, and changes from one frame to the next are determined and recorded. These



interframe variations are used to compile the intermediate frames between key frames. In sections of video that exhibit significant frame-to-frame motion, the compression algorithm must encode much more data to record the larger number of pixels that change between frames. In action sequences, explosions, flames, flocks of animals, crowds of people, and certain panning shots, the high-frequency detail leads to a noticeable decrease in quality or may require a significant increase in the variable bitrate.

Interframe Compression
One powerful technique for compressing video is that of interframe compression. This employs image data from prior or succeeding image frames to compress the current frame. Intraframe compression uses only the current frame, effectively using similar techniques to JPEG compression. (One of the early techniques used in digital cameras was motion JPEG (MJPEG) compression. It used the hardware JPEG engine of a digital camera to compress individual video frames. Unsurprisingly, compression rates were poor when compared to today's more sophisticated algorithms.) Interframe compression works well for programs that will simply be played back by the viewer but can cause problems if the video sequence needs to be edited. One drawback of interframe compression occurs if a key frame becomes corrupted: the prior or succeeding image frames cannot be reconstructed, as they rely on data in the missing key frame. The H.264/AVC/MPEG-4 standard contains new features to overcome or minimize such issues. For example, multiple key frames can be used for the reconstruction of an image frame, and motion estimation and compensation are more flexible than in previous standards. As a consequence, the latest compressed video sequences are more robust and can provide excellent video quality, sometimes at significantly lower bitrates. HEVC encoding (Pourazad et al. 2012; Sullivan et al. 2012) improves further by allowing variable macroblock sizes and improving the efficiency of implementation, to the point where it can typically achieve similar quality to H.264 at less than 50 % of the bitrate. HEVC was developed specifically to meet the challenges of the latest 4K and 8K video formats. In a digital camera, the compression engine may be implemented as a core on the main application processor or within the ISP. However, the ISP will typically make uncompressed images available to the application processor and will only compress these directly to JPEG or MPEG video if configured explicitly to do so.

Concluding Remarks
In this article, we have tracked the images in a typical consumer-imaging device from the sensor through the image-processing pipeline to their final compression and storage on an SD card or flash memory. The autofocus process has also been explored, and the digital equivalent of image exposure has been outlined. It is clearly a nontrivial process to convert the raw sensor data into the high-quality compressed data files that have become the new currency of consumer photography and videography. These images have been converted through multiple color formats and numerous compensating adjustments to provide a high-quality full-resolution image that is subsequently compressed carefully and transferred to its permanent storage. The development and evolution of this complex process over the past two decades has ensured that the picture you obtain is much improved over the raw pixel data from the image sensor. But the fact remains that the final image on your display or hard drive has been altered and enhanced in many subtle ways. The complexity and sophistication of this process lead one to further consider that additional processing of the acquired images might open the door to a range of new techniques and services that


could be provided within our digital imaging devices. And indeed this is the case – the techniques presented here are only the tip of the iceberg. In a second article in this series, we will introduce some more in-camera processing techniques that have become practical with the continued evolution of digital imaging devices.

Further Reading
Andorko I, Corcoran P, Bigioi P (2010) Hardware implementation of a real-time 3D video acquisition system. In: Optimization of electrical and electronic equipment (OPTIM), 2010 12th international conference, Brasov, Romania. IEEE, pp 920–925
Baer R (2010) Resolution limits in digital photography: the looming end of the pixel wars – OSA technical digest (CD). In: Imaging systems, 2010, p ITuB3
Baker S, Bennett E, Kang SB, Szeliski R (2010) Removing rolling shutter wobble. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, 2010, San Francisco, California, pp 2392–2399
Bandoh Y, Qiu G, Okuda M, Daly S, Aach T, Au OC (2010) Recent advances in high dynamic range imaging technology. In: ICIP 2010 (IEEE international conference on image processing), 2010, Hong Kong, pp 3125–3128
Corcoran P, Stan C, Florea C, Ciuc M, Bigioi P (2014) Digital beauty: the good, the bad, and the (not-so) ugly. Consum Electron Mag IEEE 3(4):55–62
Dainty C (2012) Film photography is dead: long live film: what can digital photography learn from the film era? IEEE Consum Electron Mag 1(1):61–64
Forssén PE, Ringaby E (2010) Rectifying rolling shutter video from hand-held devices. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, 2010, San Francisco, California, pp 507–514
Fowler B (2011) High dynamic range image sensor architectures. High Dyn Range Imaging Symp Work Stanford Univ Calif 7876:787602–787602-15
Gallagher P (2012) Smart-phones get even smarter cameras [future visions]. Consum Electron Mag IEEE 1(1):25–30
Gao W, Huang T, Reader C, Dou W, Chen X (2014) IEEE standards for advanced audio and video coding in emerging applications. Computer (Long Beach Calif) 47:81–83
Groen FC, Young IT, Ligthart G (1985) A comparison of different focus functions for use in autofocus algorithms. Cytometry 6(2):81–91
Grundmann M, Kwatra V, Castro D, Essa I (2012) Calibration-free rolling shutter removal. In: 2012 IEEE international conference on computational photography, Seattle, Washington, ICCP 2012
Han D, Choi J, Cho J-I, Kwak D (2011) Design and VLSI implementation of high-performance face-detection engine for mobile applications. In: 2011 IEEE international conference on consumer electronics (ICCE), 2011, Las Vegas, NV, pp 705–706
Ionita M, Corcoran P (2011) Advances in extending the AAM techniques from grayscale to color images. US 7965875, 28 Feb 2011
Ionita MC, Corcoran P, Buzuloiu V (2009) On color texture normalization for active appearance models. Image Process IEEE Trans 18(6):1372–1378
Liang C-K, Chang L-W, Chen HH (2008) Analysis and compensation of rolling shutter effect. IEEE Trans Image Process 17(8):1323–1330
Ligthart G, Groen F (1982) A comparison of different autofocus algorithms. In: Proceedings sixth international conference on pattern recognition, Munich, Germany, pp 597–600


Marpe D, Wiegand T, Gordon S (2005) H.264/MPEG4-AVC fidelity range extensions: Tools, profiles, performance, and application areas. In: Proceedings – international conference on image processing, ICIP, 2005, vol 1, Genoa, Italy, pp 593–596 Marpe D, Wiegand T, Sullivan GJ (2006) The H.264/MPEG4 advanced video coding standard and its applications. IEEE Commun Mag 44(8):134–143 Nicklin SP, Fisher RD, Middleton RH (2007) Rolling shutter image compensation. Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and lecture notes in bioinformatics), vol 4434. LNAI, pp 402–409 Noll P (1997) MPEG digital audio coding. Signal Process Mag IEEE 14:59–81 Oth L, Furgale P, Kneip L, Siegwart R (2013) Rolling shutter camera calibration. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, 2013, Columbus, Ohio, pp 1360–1367 Pourazad M, Doutre C, Azimi M, Nasiopoulos P (2012) HEVC: the new gold standard for video compression: how does HEVC compare with H.264/AVC? IEEE Consum Electron Mag 1(3):36–46 Santos A, Ortiz de Solorzano C, Vaquero JJ, Pena JM, Malpica N, Del Poze F (1997) Evaluation of autofocus functions in molecular cytogenetic analysis. J Microsc 188(03):597–600 Shih L (2007) Autofocus survey: a comparison of algorithms. In: Electronic imaging 2007 (proceedings SPIE), pp 65020B, 1–11 Sullivan GJ, Wiegand T (2005) Video compression-from concepts to the H.264/AVC standard. Proc IEEE 93:18–31 Sullivan GJ, Ohm JR, Han WJ, Wiegand T (2012) Overview of the high efficiency video coding (HEVC) standard. IEEE Trans Circuits Syst Video Technol 22:1649–1668 Vatolin DD, Seleznev I, Smirnov DM (2007) Lossless video codecs comparison ‘2007 Wallace GK (1992) The JPEG still picture compression standard. IEEE Trans Consum Electron 38(1):18–34 Waltham N (2013) CCD and CMOS sensors


DisplayPort
Rudolf Sosnowsky

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Specification Target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Other Standards Not to Be Confused with DisplayPort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
eDP: Embedded DisplayPort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
iDP: Internal DisplayPort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Thunderbolt: DisplayPort Including PCI Express . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Backward Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Summary and Directions of Future Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

Abstract

With existing display interface standards having reached their limits with respect to bandwidth, the definition of a new standard was inevitable. VESA defined "DisplayPort" (DP), which allows for higher resolutions, greater color depth, and increased frame rates. The most recent release of the standard is DP 1.2, allowing 8 MP displays with more than 16 million colors to be driven at a 60 Hz frame rate.

Glossary

DDC    Display Data Channel
DVI    Digital Visual Interface
EMI    Electromagnetic Interference
Gbps   Gigabits Per Second
GPU    Graphics Processing Unit
HDMI   High-Definition Multimedia Interface
LVDS   Low-Voltage Differential Signaling
MP     Megapixel
MST    Multi Stream Transport
TMDS   Transition Minimized Differential Signaling
VESA   Video Electronics Standards Association
VGA    Video Graphics Array

R. Sosnowsky (*) HY-LINE Computer Components, Unterhaching, Germany e-mail: [email protected] © Springer-Verlag Berlin Heidelberg 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_174-1

Introduction

DisplayPort is a family of new standards for the transmission of digital graphics data from a source (computer) to a sink (monitor). It was defined by the VESA committee and is meant to replace existing standards like VGA (analog signals) and DVI/HDMI (digital signals). All family members are based on the same electrical standard but use different protocols for the data transmission. Major chip vendors like Intel are implementing the DP interface in their chips, phasing out LVDS and DVI. Industrial customers prefer DP over HDMI since the connector is mechanically more stable and employs latches which lock the plug firmly into the receptacle. Implementing DisplayPort in an industrial device does not require paying license fees.

Technology

DP is different from existing display interfaces. At the physical layer, DP employs differential signaling with controlled edges, thus achieving minimal EMI radiation. The minimum requirement for a DP link is one data link ("lane"), an auxiliary channel to support bidirectional communication, one line signaling that a cable is plugged in at both ends ("Hot Plug Detect"), and one pin for power. DisplayPort uses a clock signal which is embedded in the data stream. For this reason, 8-bit data are coded into 10 bits, ensuring the signal is DC balanced and the clock can be recovered from the alternating bits. Requirements of the specification are to transmit full-bandwidth signals over a 2 m cable and to be able to transmit 1080p60/24bpp over a 15 m cable, using four lanes. At the protocol layer, DP uses a packetized transmission instead of the sequentially and continuously transmitted signals of earlier interfaces. By a mechanism called "link training," transmitter and receiver agree upon a nominal data rate. There are three fixed data rates: 1.62, 2.7, and 5.4 Gbps per lane, which result in a maximum raw throughput of 21.6 Gbps over four lanes, corresponding to 2160 MBps of payload for the application after 8b/10b coding. The design targets an uncompressed video stream. DP allows for a free choice of pixel depth, resolution, and frame rate, as long as the overall available bandwidth is not exceeded (Fig. 1).
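As a back-of-the-envelope illustration of these figures (a sketch with assumed lane counts; the three per-lane rates are those named above), the net payload rate follows directly from the raw lane rate and the 8b/10b coding overhead:

```python
# Net DisplayPort payload rate from raw lane rates and 8b/10b coding.
# Illustrative sketch only; the per-lane rates are the three fixed rates
# named in the text, and 4 lanes is an assumed configuration.

LANE_RATES_GBPS = {"RBR": 1.62, "HBR": 2.7, "HBR2": 5.4}

def net_payload_gbps(rate_name: str, lanes: int = 4) -> float:
    """Raw throughput times the 8b/10b efficiency (8 payload bits per 10 line bits)."""
    raw = LANE_RATES_GBPS[rate_name] * lanes
    return raw * 8 / 10

for name in LANE_RATES_GBPS:
    print(f"{name}: raw {LANE_RATES_GBPS[name] * 4:5.2f} Gbps, "
          f"net {net_payload_gbps(name):5.2f} Gbps over 4 lanes")
# HBR2 over 4 lanes: raw 21.6 Gbps, net 17.28 Gbps = 2160 MBps of payload
```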


Fig. 1 DisplayPort signals

Specification Target

Since the specification of DisplayPort was written with the limitations of existing display interfaces in mind, the features are state of the art: high bandwidth, allowing high-resolution displays, high refresh rates, and increased color depth. The contents of an 8 MP display with 4096 × 2304 pixels and 24-bit color per pixel can be transmitted at a frame rate of 60 Hz. The former DDC was upgraded to a multipurpose bidirectional AUX channel which enables enhanced communication between image source and sink. It also provides low-skew audio support. The latest revision, DP 1.2, takes advantage of the packetized protocol: the DP link can carry up to 63 individual data streams for multiple displays. These can be either connected to a hub or daisy-chained. The standard also defines power to be provided by both source and sink, allowing, e.g., fiber adapters to be used without an external power supply. Figure 2 depicts the dependency between several display parameters and the related data rate. On the vertical axis, several standard display interfaces and their bandwidth are shown. The bandwidth figure is the net rate without protocol overhead. On the right-hand axis, common display resolutions are shown. For a given resolution, the bandwidth requirement increases with the number of colors to be displayed (bpp = bits per pixel) and the desired frame rate. 24 Hz is common for cinema, 60 Hz is the standard frame rate commonly found in monitors, and 120 Hz is mostly used for 3D and gaming (VESA 2010).
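The mode budget can be checked in the same spirit. The following sketch (illustrative only; blanking and packet overhead are neglected, so real requirements are somewhat higher) verifies that the 8 MP example above fits into the net bandwidth of four HBR2 lanes:

```python
# Does a display mode fit into the net DisplayPort bandwidth?
# Sketch of the relationship shown in Fig. 2; blanking and packet
# overhead are not modeled, so real requirements are somewhat higher.

def mode_bandwidth_gbps(width, height, bpp, fps):
    """Uncompressed pixel-data rate of a video mode in Gbps."""
    return width * height * bpp * fps / 1e9

NET_HBR2_4LANES = 17.28  # Gbps after 8b/10b coding over four HBR2 lanes

# The 8 MP example from the text: 4096 x 2304 pixels, 24 bpp, 60 Hz
need = mode_bandwidth_gbps(4096, 2304, 24, 60)
print(f"required {need:.2f} Gbps, fits: {need <= NET_HBR2_4LANES}")
# required 13.59 Gbps, fits: True
```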


Fig. 2 Data rate versus display parameters (VESA 2010)

Other Standards Not to Be Confused with DisplayPort

eDP: Embedded DisplayPort

eDP has been defined as an internal display interface for PC products. eDP and DisplayPort share the same video port of the CPU; the protocol, however, is not compatible. Typical applications are portable products like notebook or netbook computers, tablet PCs, or AiO (all-in-one) PCs and also special applications like IPCs (industrial PCs) or panel PCs. Based upon the DisplayPort specification, protocol, features, and power consumption are optimized for internal displays. The configuration (i.e., format setup of the GPU) is done via the AUX channel and HPD. A single 30-pin connector carries all signals, including power. Due to the fixed environment, the initial link training procedure is shortened or skipped. eDP is meant to replace LVDS.


iDP: Internal DisplayPort

iDP is an interface tailored for consumer electronics devices, e.g., TV sets. It features high resolutions and frame rates and also supports 3D formats. The number of lanes can range from 1 to 16, depending on the bandwidth required. It is not directly compatible with DisplayPort; its main target is long internal connections with high data rates inside large TV sets (Kobayashi 2010).

Thunderbolt: DisplayPort Including PCI Express

Thunderbolt combines two protocols on the same link: DisplayPort and PCIe (PCI Express). Since DisplayPort uses data packets, PCIe packets can be transferred interleaved and bidirectionally. Since the same connector is used, the interface is backward compatible with DisplayPort-only devices. The available bandwidth is 10 Gbps. Furthermore, Thunderbolt provides power (like USB) for bus-powered devices (Intel 2012).

Backward Compatibility

The definition of the DP standard provided a method for backward compatibility by introducing "DisplayPort Dual Mode" (DP++). This logo indicates that the DisplayPort interface chipset is backward compatible with HDMI, thus enabling the connection of HDMI or DVI sinks to the source output. An adapter signals the DP source to switch to HDMI/DVI mode. Physically, all DP lanes are activated (note: a DP link does not necessarily need all four lanes), three of them outputting graphics data and the fourth one the clock signal. The AUX channel is used for bidirectional DDC transmission. The signal levels are adapted to TMDS inside the adapter. From a protocol point of view, the signals are transmitted sequentially instead of packetized (Video Electronics Standards Association 2010) (Figs. 3, 4, and 5, Table 1).

In 2009, VESA defined a more compact connector called Mini DP, or mDP for short. It can be used as an alternative to the connector shown above where space is limited. There is no difference in function compared to the standard DP connector (Fig. 6).

Summary and Directions of Future Research

DisplayPort is a graphics interface designed to overcome the limitations of existing interfaces – resolution, frame rate, and color depth. The bandwidth is sufficient for even the largest commercially available displays. Its AUX channel allows for high-speed bidirectional communication between source and sink. Electrical features like link training and Dual Mode allow for adaptation to the graphics environment. The MST feature enables multiple displays to be attached to one source.


Fig. 3 DP connector, source side

Fig. 4 Cable with DP plug

Fig. 5 Slot bracket of a graphics card with 1 DVI and 2 DP receptacles

All revisions of the specification are fully backward compatible. VESA is working on the next revision, 1.3, of the specification.

Table 1 Pinout and signals of the DP connector in Fig. 3 (VESA 2010)

Pin  Name             Signal  Comment
1    Lane 0 (p)       Out
2    Ground           GND
3    Lane 0 (n)       Out
4    Lane 1 (p)       Out
5    Ground           GND
6    Lane 1 (n)       Out
7    Lane 2 (p)       Out
8    Ground           GND
9    Lane 2 (n)       Out
10   Lane 3 (p)       Out
11   Ground           GND
12   Lane 3 (n)       Out
13   Config 1         In      Pull to GND
14   Config 2         In      Pull to GND
15   AUX CH (p)       I/O
16   Ground           GND
17   AUX CH (n)       I/O
18   Hot Plug Detect  In
19   Return           Power   PWR return path
20   DP PWR           Power   3.3 V/500 mA

Fig. 6 Slot bracket of a graphics card with 2 DVI and 1 mDP receptacle

Further Reading
http://www.displayport.org/
http://www.vesa.org/displayport-developer/about-displayport/
Intel (2012) Thunderbolt-technology-brief.pdf, downloaded from intel.com on 8 Sep 2014
Kobayashi A (2010) DisplayPort™ Ver.1.2 overview. DisplayPort-DevCon-presentation-DP-1.2-Dec-2010-rev-2b.pdf, downloaded from vesa.org on 30 Aug 2014
Kobayashi A (2010) iDP™ (Internal DisplayPort™) technology overview. DisplayPort-DevCon-presentation-iDP-Dec-2010-rev-2-.pdf, downloaded from vesa.org on 3 Sep 2014
Video Electronics Standards Association (2008) VESA DisplayPort standard, version 1, revision 1a. DportV1.1a.pdf, downloaded from vesa.org on 3 Jan 2014
Video Electronics Standards Association (2010) DisplayPort technical overview. DisplayPort Technical Overview.pdf, downloaded from vesa.org on 3 Sep 2014


Video Electronics Standards Association (2010) VESA dual-mode cable adaptor verification procedure, version 1, downloaded from vesa.org on 16 Dec 2013
Wiley C (2010) eDP™ embedded DisplayPort™. DisplayPort_DevCon_Presentation_eDP_Dec_2010_v3.pdf, downloaded from vesa.org on 30 Aug 2014

Contact Effects in Organic Thin-Film Transistors: Device Physics and Modeling

Luigi Mariucci, Matteo Rapisarda, Antonio Valletta, and Guglielmo Fortunato

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Parasitic Contact Resistance Effects . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Contact Resistance Evaluation Methods . . . . . . . . . . . . . . . . . . . . . . . 7
Device Structures and Contact Resistance . . . . . . . . . . . . . . . . . . . . . 9
Contact Effects in Staggered Devices . . . . . . . . . . . . . . . . . . . . . . . . . 11
Contact Effects in Coplanar OTFTs . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

Abstract

The electrical characteristics of organic thin-film transistors (OTFTs) are frequently affected by contact effects, which can seriously reduce transistor performance. The importance of the contact resistance, Rc, is greater in the case of high carrier mobility and/or small channel length devices, where its value may become comparable to or even larger than the channel resistance. Rc appears to be strongly affected by the device architecture, and much higher Rc values are typically observed, at low drain voltages, in coplanar structures (i.e., bottom-gate/bottom-contact (BGBC) devices) than in staggered structures (i.e., top-gate/bottom-contact (TGBC) devices). The presence of Schottky barriers, trap states, field dependence of carrier mobility, and defective regions near the electrodes has been suggested as the origin of Rc.

L. Mariucci (*) • M. Rapisarda • A. Valletta IMM-CNR, Rome, Italy e-mail: [email protected]; [email protected]; [email protected] G. Fortunato IMM-CNR, Rome, Italy e-mail: [email protected] © Springer-Verlag Berlin Heidelberg 2016 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_176-1


In this work, the contact effects in devices with different architectures (staggered and coplanar) are reviewed. Device characteristics are discussed and analyzed by using 2D numerical simulations. The electrical characteristics of both staggered and coplanar structures can be reproduced by considering Schottky barriers at the source and drain contacts and including an "effective" barrier lowering, which takes into account different field-dependent effects occurring at the contacts (Schottky effect, tunneling, trap-assisted tunneling). In the case of staggered devices, a detailed analysis of the current density shows that the current injection distribution depends on the drain voltage. At low Vds and for a given Vgs, the current is mainly injected from an extended source contact region overlapped by the gate electrode, inducing a current spreading along the contact. For higher Vds, the current injected from the edge of the source contact rapidly increases, due to field-enhanced injection mechanisms, while the current injected from the remaining part of the source contact basically saturates. Numerical simulation also clarifies the drain current dependence on gate voltage. In the case of coplanar devices, current spreading does not occur, as most of the current is injected at the contact edge, and even at low Vds, the current increase is related to the field-enhanced injection mechanisms that take place at the edge of the source contact. The lack of the extended contact contribution to the injected current explains the higher contact resistance usually observed in coplanar devices at low Vds compared to staggered OTFTs. On the other hand, at high Vds (saturation condition), higher electric fields are present in coplanar structures near the source, resulting in more efficient carrier injection. Consequently, in the saturation regime a nearly ohmic contact condition can be observed in coplanar devices.

Introduction

Organic electronics is an emerging technology that promises to cover a wide range of applications. Indeed, the intrinsic characteristics of organic materials, such as low-temperature processing, flexibility, biocompatibility, and even biodegradability, make them the basis of devices for emerging flexible plastic electronics. Among the most recent applications, the following are worth mentioning: organic electronic circuits driving electrophoretic ink displays (Sou et al. 2014); electronic skin (Hammock et al. 2013) using devices on ultrathin substrates (the so-called imperceptible electronics (Kaltenbrunner et al. 2013)) or on stretchable substrates (Sekitani and Someya 2010); and medical applications such as implantable devices (Kuribara et al. 2012) that exploit the biocompatibility of organic materials. Furthermore, sensor applications can also take advantage of the wide range of different organic materials available for electronic devices and of the different device structures allowed by organic technology (Mabeck and Malliaras 2006). Organic electrochemical transistors (OECTs) have been developed for chemical and biological sensing (White et al. 1984; Khodagholy et al. 2013) and applied, for instance, to control cell activity (Lin et al. 2010). The simple device configuration and manufacturing of OECTs allow their use in different applications, from very simple wearable sensors (Coppedè et al. 2014) to new devices for bioelectronics (Tybrandt et al. 2012; Rivnay et al. 2014).

The basic devices of organic electronics are organic thin-film transistors (OTFTs), whose performance has improved continuously in recent years, thanks to the development of new organic materials specifically designed to obtain high carrier mobility and high stability (Dong et al. 2013). The field-effect mobility of p-type OTFTs fabricated by using both evaporated (DNTT) (Kang et al. 2011) and solution-processed (Li et al. 2012) organic semiconductors (OSCs) is now approaching 10 cm2/Vs, with perylene-based n-type devices (Guo et al. 2014) showing high performance and high stability even when operated in air. These achievements now allow the development of complex CMOS circuitry (Jacob et al. 2013; Sekitani et al. 2010; Abdinia et al. 2014) with good performance and reliability. The performance of circuits can also be improved by a reduction of the transistor channel length. This requirement, along with the increase of carrier mobility that reduces the transistor channel resistance, makes parasitic contact resistance effects a major issue in organic transistors (Blanchet et al. 2004; Gundlach et al. 2006; Klauk et al. 2003; Natali et al. 2007; Necliudov et al. 2003; Pesavento et al. 2004). For very short-channel devices (L < 10 μm), parasitic contact resistance can severely limit or even dominate the overall transistor performance (Ante et al. 2012). The analysis of contact effects and their origins in the different device structures can provide guidelines to reduce contact effects, thus allowing further improvement of the device electrical characteristics.

Parasitic Contact Resistance Effects

Parasitic contact resistance induces a voltage drop at the contacts (Fig. 1) that reduces the effective drain/source as well as gate/source bias voltages applied to the intrinsic channel of the transistor and, consequently, reduces the device current. The equation of an ideal MOSFET is modified by the presence of parasitic resistance at the drain/source contacts and must be rewritten as follows. For low Vds,

$$I_d = \frac{W}{L}\,\mu_{FE} C_i \left(V_{gs} - V_T - V_{c1}\right)\left(V_{ds} - V_c\right) \qquad (1)$$

where $V_c = V_{c1} + (V_{ds} - V_{c2})$ is the sum of the voltage drops at the source and drain. Since at low Vds, $V_{c1} \ll V_{gs} - V_T$, Vc1 can be neglected compared to (Vgs − VT), and Eq. 1 can be approximated by

$$I_d \approx \frac{W}{L}\,\mu_{FE} C_i \left(V_{gs} - V_T\right)\left(V_{ds} - V_c\right) \qquad (2)$$

In the saturation condition, for $(V_{ds} - V_c) > (V_{gs} - V_T - V_{c1})$,

$$I_d = \frac{W}{L}\,\mu_{FE} C_i \left(V_{gs} - V_T - V_{c1}\right)^2 \qquad (3)$$

Fig. 1 Schematic of a transistor with parasitic contact elements located at the source and drain electrodes

For resistive contacts the voltage drops are simply related to the drain current, with Vc1 = RsId at the source and Vds − Vc2 = RdId at the drain, where Rs and Rd are the contact resistances at the source and drain contacts, respectively. More generally, however, Vc1 and Vc2 can depend on Id, Vgs, and Vds, as, for instance, in the case of a Schottky barrier, and can show a complicated dependence on the gate and drain voltages. Figure 2 summarizes the effect of parasitic components on the device output characteristics, depending on their position. At low Vds, the contact effects are independent of the position of the parasitic contact resistance, while the saturation current depends on whether the parasitic element is only at the source, only at the drain, or at both contacts. It can be noted that, according to Eq. 3, in the presence of contact resistance at the source, the saturation current is always lower than in the ideal case, due to the reduced Vgs applied to the device, unless Vc1 vanishes at high Vds or for increasing Vgs (nonohmic element). On the contrary, when the parasitic resistance is only at the drain contact, only the saturation voltage is modified, while the saturation current remains unchanged.

Figure 3a shows the effect of parasitic contact resistance on the transfer characteristics of an organic transistor, comparing the normalized transfer characteristics of OTFTs with different channel lengths. As can be seen, the drain current is reduced by the contact resistance due to the reduction of the drain voltage applied to the device channel. Consequently, the device transconductance (Fig. 3b), for a given Vgs, is reduced. It can also be noted from Fig. 3a that the transfer characteristic of a real organic device often does not follow the ideal MOSFET equation (Eq. 1): the drain current increases superlinearly with Vgs, at least for long-channel devices, where the parasitic contact resistance can be neglected compared to the channel resistance. For shorter channel lengths and/or higher parasitic resistance, the drain current can show a linear behavior, apparently in agreement with the ideal case, or a sublinear variation with gate voltage, characteristic of a strong influence of the contacts.

One of the origins of parasitic contact resistance in OTFTs is the interface between the metal S/D electrodes and the semiconductor layer. Indeed, contrary to conventional single-crystal Si MOSFETs, where carrier injection from the electrode is assisted by a heavily doped silicon layer, in organic transistors the source and drain electrodes are directly in contact with the undoped semiconductor active layer.
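To make Eqs. 1, 2, and 3 concrete for purely resistive contacts, the following sketch solves the implicit linear-regime equation Id = (W/L)μFECi(Vgs − VT − RsId)(Vds − (Rs + Rd)Id) by bisection; all parameter values are illustrative placeholders (with an n-type sign convention for brevity), not measured data:

```python
# Linear-regime drain current with resistive source/drain contacts (Eqs. 1-3).
# All parameter values are illustrative placeholders, not measured data.

def id_with_contacts(vgs, vds, W=1e-3, L=10e-6, mu=1e-4, Ci=1e-4,
                     VT=1.0, Rs=1e6, Rd=1e6):
    """Solve Id = (W/L)*mu*Ci*(Vgs - VT - Rs*Id)*(Vds - (Rs+Rd)*Id)
    for Id by bisection; the root is bracketed by 0 and the ideal current."""
    k = (W / L) * mu * Ci
    f = lambda i: i - k * (vgs - VT - Rs * i) * (vds - (Rs + Rd) * i)
    lo, hi = 0.0, k * (vgs - VT) * vds      # ideal (Rc = 0) current bounds the root
    for _ in range(60):                     # 60 halvings give ample precision
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

print(id_with_contacts(10, 1, Rs=0, Rd=0))  # ideal contacts: ~9.0e-06 A
print(id_with_contacts(10, 1))              # 1 MOhm per contact: ~4.7e-07 A
```

With these placeholder values, two 1 MΩ contacts reduce the linear-regime current by roughly a factor of 20, mirroring the trend sketched in Fig. 2.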

Fig. 2 Effect of parasitic components on the device output characteristics, depending on their position

Fig. 3 Normalized transfer characteristics (a) and transconductance (b) of organic transistors with different channel lengths

Fig. 4 Energy band diagram of the metal/OSC contact

Fig. 5 Schematic of organic molecular (pentacene) growth near the edge of a gold electrode

Consequently, an energy barrier is expected at the interface, induced by the difference between the metal work function and the ionization energy (IE) or electron affinity (EA) of the semiconductor, corresponding to the HOMO or LUMO level, respectively (see the diagram in Fig. 4). In the Schottky–Mott limit and assuming vacuum level alignment, the energy barrier height for hole (ΦBh) and electron (ΦBe) injection is equal to the difference between the metal work function and the HOMO and LUMO levels, respectively.

Another origin of contact resistance can be ascribed to a defective layer near the electrode (Bock et al. 2006; Risteska et al. 2014). Indeed, depending on the device structure, during the organic semiconductor growth a highly defective transitional region can be created at the edge of the contact, with reduced carrier mobility and increased defects (Fig. 5). Due to the different surface energies of the bare metal and the insulating substrate, the organic semiconductor molecules can assume different orientations on the two surfaces, resulting in a defective transitional region just at the edge of the metal contact, where the carrier injection occurs (Ruiz et al. 2004).
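Restating the Schottky–Mott limit above in symbols (Φm denotes the metal work function; this is only a rewriting of the preceding statement, not an additional result):

$$\Phi_{Bh} = IE - \Phi_m, \qquad \Phi_{Be} = \Phi_m - EA$$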

Contact Resistance Evaluation Methods

The most commonly used method to evaluate the contact resistance is the so-called transmission line method (TLM). This method was first developed to model rectangular contacts to planar crystalline silicon devices (Berger 1972) and then applied to estimate the contact resistance in amorphous silicon TFTs (Luan and Neudeck 1992). According to the TLM, the total resistance, RT, of the transistor in the linear regime (low drain voltage, Vds) is given by

$$R_T = \frac{V_{ds}}{I_d} = R_{ch} + R_s + R_d = r_{ch}\,L + R_c \qquad (4)$$

where rch is the channel resistance per unit channel length, L is the channel length, and Rc = Rs + Rd is the total contact resistance. The method assumes that Rc does not depend on the channel length, while Rch depends linearly on L (Fig. 6a), provided the electrical characteristics of the transistor follow the standard square law in the linear regime. By measuring the characteristics of transistors with different L and plotting RT versus L for different overdrive voltages (Vgs − VT), it is possible to extract the contact resistance value, Rc, by linear extrapolation of RT to L = 0 (Fig. 6):

Fig. 6 Total resistance as a function of channel length (a) for pentacene OTFTs measured at different Vgs and low Vds = −0.1 V, and parasitic contact resistance variation with gate voltage (b), calculated by linearly extrapolating the RT values to L = 0

Fig. 7 Schematic of the four possible OTFT architectures: (a) bottom gate/bottom contacts (BGBC), (b) top gate/bottom contacts (TGBC), (c) top gate/top contacts (TGTC), (d) bottom gate/top contacts (BGTC)

$$R_T = \frac{L}{W\mu_{FE} C_i\left(V_{gs} - V_T\right)} + R_c \qquad (5)$$

As can be seen in Fig. 6b, OTFTs usually show a dependence of the parasitic contact resistance on the gate voltage (Gupta et al. 2009), but the TLM cannot be applied to evaluate a possible Rc dependence on the drain voltage. Other methods, such as the gated four-point probe (gFPP) (Pesavento et al. 2004) and scanning Kelvin probe microscopy (SKPM) (Bürgi et al. 2002; Petrovic et al. 2009), can directly measure the voltage drops at the source and drain contacts, and hence they can be used to calculate the Rc variation with both Vgs and Vds. However, these methods can be applied only with an ad hoc structure (gFPP) or only to one device structure (bottom-gate devices, SKPM).

Assuming that the voltage drop occurs only at one electrode, usually the source contact, as suggested by SKPM measurements, it is possible to evaluate the Id–Vcs characteristics (Vcs = Vc1 − Vs) and the contact resistance by considering just a long-channel and a short-channel device (Hamadani, Hong, Valletta). Indeed, the contact potential drop, Vcs, can be calculated by solving the following equation in the case of short-channel devices and for a given set of Id, Vgs, and Vds:

$$I_d = \frac{W}{L}\int_{V_{gs}-V_{ds}}^{V_{gs}-V_{cs}} G(V)\,dV \qquad (6)$$

where G(V) is the transistor channel conductance, which can be determined from the transfer characteristics, measured at very low Vds (−0.1 V), of a long-channel device (usually L ≥ 100 μm), where the contact effects can be neglected (Rch >> Rc).
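As a sketch of the TLM extraction of Eqs. 4 and 5 (synthetic data with assumed values of rch and Rc, not measurements from this chapter), a straight-line fit of RT versus L at fixed overdrive yields rch as the slope and Rc as the intercept:

```python
import numpy as np

# TLM sketch (Eqs. 4-5): RT = rch*L + Rc, so a linear fit of the total
# resistance versus channel length gives rch (slope) and Rc (intercept).
# Synthetic data with assumed illustrative values, not measurements.

L_um = np.array([5.0, 10.0, 20.0, 50.0, 100.0])       # channel lengths, um
rch_true, Rc_true = 0.02, 0.15                        # MOhm/um and MOhm (assumed)
rng = np.random.default_rng(0)
RT = rch_true * L_um + Rc_true + rng.normal(0, 0.01, L_um.size)  # noisy "data"

slope, intercept = np.polyfit(L_um, RT, 1)            # linear regression
print(f"rch = {slope:.3f} MOhm/um, Rc = {intercept:.3f} MOhm")
# Repeating the fit at several (Vgs - VT) values yields Rc(Vgs), as in Fig. 6b.
```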


Device Structures and Contact Resistance

There are four possible OTFT structures (Fig. 7), usually called, depending on the position of the gate and source/drain (S/D) electrodes, bottom gate/bottom contacts (BGBC), top gate/top contacts (TGTC), top gate/bottom contacts (TGBC), and bottom gate/top contacts (BGTC). The first two OTFT types are coplanar structures, i.e., gate and S/D electrodes are on the same side of the semiconductor, while the last two are staggered structures, i.e., gate and S/D electrodes are on opposite sides of the semiconductor. While TGTC structures are rarely used, BGBC is widely used, as it can be manufactured simply and quickly by adopting a highly doped silicon wafer as an extended gate and silicon dioxide as the gate dielectric. The BGBC structure is often used as a test structure to evaluate new organic semiconductors but also to manufacture high-performance organic devices and circuits (Ante et al. 2012). BGTC devices usually show very low parasitic contact resistance (Gundlach et al. 2006; Xu et al. 2010; Noda et al. 2014), but, as a drawback, only relatively long channel lengths (L > 10 μm) can be achieved, due to the limitations of the contact mask process. The TGBC structure is often used for solution-processed devices, defining the S/D contacts by standard photolithography or printing techniques.

It has been shown that the parasitic contact effects influence the electrical characteristics differently depending on the OTFT structure (Gupta et al. 2009; Kim et al. 2011; Noda et al. 2014; Kumar et al. 2013; Gundlach et al. 2006). Considering the current flow in the different structures (red arrows in Fig. 7), it is quite evident that different contact effects are expected for different device layouts. In the coplanar structures, the carriers are injected directly into the device channel, while in the staggered structures, the injected carriers must flow through a highly resistive bulk region before reaching the accumulated region at the semiconductor/dielectric interface. This gives rise to an access resistance not present in the coplanar structures. On the other hand, while in the staggered devices the injection can take place along the whole contact length (typically several microns), in the coplanar structure the injection occurs just at the edge of the contact, over a region of the order of a few tens of nanometers. These differences result in a different influence of the contacts on the OTFT electrical characteristics, with higher parasitic resistances observed in coplanar devices (Kim et al. 2011; Gupta et al. 2009).

Figure 8 shows examples of output characteristics of staggered TGBC solution-processed high-mobility OTFTs (Valletta et al. 2011) and coplanar BGBC evaporated pentacene TFTs (Rapisarda et al. 2011), both showing evident contact effects. The staggered device shows a linear behavior at low Vds, but after a few volts a change in slope of the Id–Vds curve can be observed.

Fig. 8 Output characteristics of staggered TGBC (a) and coplanar BGBC (b) OTFTs showing evident contact effects

Fig. 9 Id–Vcs curves deduced by using a modified GCA expression that accounts for a voltage drop at the source contact in staggered OTFTs (solution-processed high-mobility TFT (Valletta et al. 2011))

The coplanar OTFT shows the typical S-shape near the origin, with a superlinear increase of the drain current. The two devices have different semiconductors and different channel lengths; however, the method to extract the contact characteristics discussed above allows the influence of the device channel to be removed and the Id–Vcs characteristics of the semiconductor/metal contact to be calculated (Figs. 9 and 10). As can be seen, the two devices show completely different contact characteristics. Staggered devices show contact characteristics that closely resemble those of a reverse-biased leaky diode, which can be approximated by the equation (Valletta et al. 2011)

$$I_d = I_0 \exp\!\left[\left(\frac{V_{cs}}{V_0}\right)^{\alpha}\right]\left[\exp\!\left(\frac{qV_{cs}}{\eta kT}\right) - 1\right] \qquad (7)$$

Fig. 10 Id–Vcs curves deduced by using a modified GCA expression that accounts for a voltage drop at the source contact in coplanar OTFTs (evaporated pentacene TFTs (Rapisarda et al. 2011))

where I0 is the diode reverse current, which depends on both Vcs (Schottky effect) and Vgs (gated diode); η is the diode ideality factor; V0 and α are parameters determining the diode current dependence upon Vcs; and q, k, and T have their usual meanings: elementary charge, Boltzmann constant, and absolute temperature. In turn, I0 depends on Vgs:

$$I_0 = I_{00}\,\frac{V_{gs}}{V_{00}} \qquad (8)$$

where I00 is the diode reverse current prefactor and V00 = 1 V is a normalization coefficient.

Coplanar OTFTs show superlinear Id–Vcs curves, which resemble forward-biased diode characteristics, with a weak dependence on the gate voltage. In order to explain these differences, the current injection from the metal into the semiconductor is analyzed in the next sections for the two device structures.
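A minimal numerical sketch of the leaky-diode contact model of Eqs. 7 and 8, evaluated with bias magnitudes (the p-type devices of Fig. 9 have negative Vgs and a reverse-biased source diode); every parameter value below is an arbitrary placeholder chosen to produce a qualitatively similar curve, not a fit from the cited papers:

```python
import numpy as np

# Reverse-biased "leaky diode" contact (Eqs. 7-8), in bias magnitudes: the
# thermionic factor saturates near 1 under reverse bias, while the
# exp[(Vcs/V0)^alpha] prefactor gives the soft superlinear rise of Fig. 9.
# All parameter values are arbitrary placeholders, not fitted values.

kT_q = 0.0259                                   # thermal voltage, V

def i_contact(Vcs, Vgs, I00=1e-12, V00=1.0, V0=5.0, alpha=1.5, eta=2.0):
    I0 = I00 * (Vgs / V00)                      # gated reverse current, Eq. 8
    return I0 * np.exp((Vcs / V0)**alpha) * (1 - np.exp(-Vcs / (eta * kT_q)))

Vcs = np.linspace(0.0, 15.0, 6)                 # reverse-bias magnitude, V
for Vgs in (20.0, 40.0):                        # gate-bias magnitude, V
    print(Vgs, i_contact(Vcs, Vgs))             # current rises with both biases
```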

Contact Effects in Staggered Devices

Staggered devices usually have a large overlap between the gate and the source/drain electrodes. As current injection occurs along most of the contact area, evaluation of the contact resistance must take into account the charge transport in this region. The charge injection from planar ohmic contacts into a thin semiconductor layer has been described by the transmission line model (TLM) (Berger 1972; Murrmann and Widmann 1969).

Fig. 11 Schematic of the current injection in staggered OTFTs: Pa – parasitic element that accounts for the presence of a Schottky barrier; Pb – parasitic element that accounts for the current flow through the non-accumulated organic semiconductor region

The model, considering ohmic contact resistances, allows the current distribution along the electrode and the total contact resistance to be calculated. In the case of staggered OTFTs, however, additional conditions must be added to the classical model, as illustrated in Fig. 11. At the semiconductor/dielectric interface, a conductive layer is present, induced by the gate electrode overlapping the contact area, which can be treated as an ohmic resistance rch depending on Vgs. The contact resistance (Pa) can be nonohmic, as can the resistance Pb related to the non-accumulated organic semiconductor region between the contact and the semiconductor/dielectric interface. All three elements, Pa, Pb, and rch, can contribute to the contact resistance, with different dependences on the gate and drain bias.

In order to evaluate the contact resistance, we can consider that the variation of the current component parallel to the channel, Ich (flowing in the accumulated region at the semiconductor/dielectric interface), is

$$\frac{\partial I_{ch}(x)}{\partial x} = -W J_c(x) \qquad (9)$$

where Jc(x) is the vertical current density injected by the contact at position x. This current is, in general, a function of the electrical potential in the channel at position x, Vch(x):

$$J_c(x) = f\left(V_{ch}(x)\right) \qquad (10)$$

The form of f(Vch(x)) depends on the type of the parasitic elements Pa and Pb. Considering the channel resistivity, rch, the variation of the potential along x can be expressed as

$$\frac{\partial V_{ch}(x)}{\partial x} = -r_{ch}\, I_{ch}(x) \qquad (11)$$

Combining Eqs. 9, 10, and 11, we obtain the relation

$$\frac{\partial^2 V_{ch}(x)}{\partial x^2} = W r_{ch}\, f\left(V_{ch}(x)\right) \qquad (12)$$

whose solution depends on the form of f(Vch(x)). The simplest case is when the parasitic elements can be approximated as ohmic resistances (Pa = ra and Pb = rb). This approximation has been shown to be valid for low Schottky barriers (Φb < 0.3 eV) and thin organic semiconductor layers (Richards and Sirringhaus 2007; Marinkovic et al. 2012). In this case

$$f\left(V_{ch}(x)\right) = \frac{V_{ch}(x)}{r_{ce}} \qquad (13)$$

where rce = ra + rb, and Eq. 12 can be solved, obtaining an analytical expression for the contact resistance (Chiang et al. 1998):

$$R_C\left(V_{gs}\right) = \frac{V_{ch}(x=0)}{I_0} = r_{ch}\left(V_{gs}\right) L_T\left(V_{gs}\right) \coth\!\left(\frac{d}{L_T\left(V_{gs}\right)}\right) \qquad (14)$$

where LT is the effective contact length (measured from the edge of the contact near the device channel, Fig. 11) over which the injection/extraction takes place. For a large gate-to-S/D overlap (d/LT >> 1), LT = Rc/rch, while for d << LT the contact resistance increases as Rc ≈ rch LT²/d, since the whole overlap length then contributes to the injection.

For high Schottky barriers (Φb > 0.3 eV (Richards and Sirringhaus 2007)), the contact resistance is controlled by the Schottky diode at the metal/OSC interface. In this case the contact can be considered as a "distributed diode" (Fig. 12), and the current injected along the electrode can be written as

$$J_c(x) = J_0(x)\left[\exp\!\left(\frac{qV_{ch}}{kT}\right) - 1\right] = f\left(V_{ch}\right) \qquad (15)$$

According to Scott and Malliaras (1999), the reverse current, J0, for a metal/organic semiconductor diode is

$$J_0(x) \propto \mu\, E(x)\exp\!\left(-\frac{\Phi_b}{kT}\right)\exp\!\left(\beta_b(T)\sqrt{E(x)}\right) \qquad (16)$$

where E(x) is the electric field. These expressions can explain the Id–Vcs characteristics shown in Fig. 9.
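Two short numerical sketches tie the staggered-contact formulas together: the first evaluates the ohmic-contact resistance of Eq. 14, the second the field-enhanced reverse current of Eq. 16. All parameter values (rch, LT, Φb, εr) are illustrative assumptions, not values from this chapter:

```python
import numpy as np

# Sketch 1: ohmic distributed contact, Eq. 14: Rc = rch * LT * coth(d / LT).
# For d >> LT, coth -> 1 and Rc saturates at rch*LT; for d << LT it grows
# as ~rch*LT^2/d. rch and LT below are assumed illustrative values.
rch, LT = 0.02e6, 5.0                      # Ohm/um and um
for d in (1.0, 5.0, 20.0, 50.0):           # gate/contact overlap, um
    Rc = rch * LT / np.tanh(d / LT)        # coth(x) = 1/tanh(x)
    print(f"d = {d:4.1f} um -> Rc = {Rc / 1e3:7.1f} kOhm")

# Sketch 2: Schottky contact, Eq. 16: J0 ~ mu*E*exp(-PhiB/kT)*exp(beta*sqrt(E)).
# beta is taken here from image-force barrier lowering with eps_r = 3 (assumed).
q, eps0, kT = 1.602e-19, 8.854e-12, 0.0259          # C, F/m, eV
PhiB, eps_r = 0.4, 3.0                              # assumed barrier and permittivity
for E in (1e6, 1e7, 1e8):                           # field at the contact, V/m
    dPhi = np.sqrt(q * E / (4 * np.pi * eps_r * eps0))  # barrier lowering, eV
    J_rel = E * np.exp((-PhiB + dPhi) / kT)             # relative units
    print(f"E = {E:.0e} V/m: lowering {dPhi:.3f} eV, J0 ~ {J_rel:.2e}")
```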

Fig. 12 Schematic of the current injection in staggered OTFTs when the parasitic element Pb (see Fig. 11) can be neglected and Pa accounts for the presence of a Schottky junction at the metal/OSC interface (distributed diode model)

Fig. 13 Experimental (symbols) and computed (lines) Id–Vds characteristics of short-channel OTFTs. The characteristics have been computed by using 2D numerical simulations, assuming the presence of a Schottky junction with field-enhanced mechanisms at the metal/OSC interface. The orange circles indicate the Vdsat2 voltage (see text)

When the contact is dominated by the "distributed diode" behavior, Eq. 12 cannot be solved analytically, and a numerical approach must be used (Rapisarda et al. 2012; Mariucci et al. 2013). The output characteristics of staggered OTFTs, previously discussed (Fig. 9), can be reproduced by using 2D numerical simulations (Fig. 13) that consider a Schottky barrier at the source/drain contacts with an effective barrier lowering mechanism, accounting for several field-enhanced mechanisms (field emission tunneling, phonon-assisted tunneling, and trap-assisted tunneling) (Rapisarda et al. 2012). In particular, the numerical simulations reproduce the change of slope at low Vds that is characteristic of staggered devices. This feature is strictly related to the presence of the Schottky barrier at the interface, with the related depletion regions, and cannot be reproduced by the distributed contact resistance model.

Fig. 14 (a) Potential distributions at the OSC/gate insulator interface at different drain biases, computed by using 2D numerical simulations. (b) Detail of the potential distribution near the source contact at low drain bias

To gain more insight into the device behavior, in particular the dependence of the electrical characteristics upon the drain and gate voltages, we can consider the potential distributions along the semiconductor/gate dielectric interface (Fig. 14). The main feature that can be observed is the presence of a large voltage drop at the source edge. A detailed analysis of the Vc potential, as well as of the current distribution near the source contact, suggests the presence of three regimes occurring in different drain voltage ranges. At very low |Vds| (below |Vdsat1| = 2 V in Fig. 14), the potential drops mainly linearly along the device channel, even though a potential variation is also observed (Fig. 14b) in the region above the source electrode, as predicted by the distributed contact resistance model.

Fig. 15 Schematic of the current injection at Vds > Vdsat1 (see text) in staggered OTFTs using the distributed diode model and taking into account the pinch-off at the source end of the channel

In this regime, the channel current is provided by the whole contact area (see the schematic of Fig. 12), and the injected current shows a distribution along the contact, related to the variation of the potential induced by the current flowing in the accumulated layer near the semiconductor/dielectric interface. At Vds = Vdsat1 the depletion layer of the Schottky contact reaches the insulator/semiconductor interface, so that for Vdsat1 < |Vds| < Vdsat2 (see Fig. 14) most of the potential drop is localized at the source end of the channel. In this Vds range, the potential above the source contact remains almost constant. The Vc potential, indicated in Fig. 14a, b, corresponds to the floating channel potential at the source edge of the transistor (Vc1 in Fig. 1). In this regime the current injected from the edge of the source contact (Iedge in Fig. 15) increases, due to the strong barrier lowering induced by the potential drop at the contact edge. On the contrary, the current injected from the remaining part of the source contact (Icontact in Fig. 15) remains basically constant. Figure 15 summarizes the contributions of the different OTFT regions to the Id–Vcs characteristics in the two regimes described. Finally, at high |Vds| (>|Vdsat2|), the standard pinch-off at the drain occurs at the same Vds as in the case of a long-channel device (negligible contact resistance), since the condition for saturation, (Vdsat2 − Vc) = (Vgs − Vc − VT), i.e., Vdsat2 = Vgs − VT, is independent of Vc.

Numerical simulations also clarify the Vgs dependence of the Id–Vcs contact characteristics observed in staggered devices (Fig. 9). The main contribution arises from the increase of Icontact in the range Vds < Vdsat1, related to two effects: (1) the increase of Vgs reduces the channel resistance and, consequently, increases the contact injection length (LT), effectively reducing the contact resistance, and (2) the increase of Vgs also increases the barrier lowering of the Schottky source contact, which is related to the potential drop distribution at the source contact.

Fig. 16 Contact (red curve) and edge (blue curve) contributions to the overall Id–Vcs contact characteristic (black curve) in staggered OTFTs, computed by using 2D numerical simulations

For Vds > Vdsat1, Icontact remains basically constant and the Vgs dependence of Iedge arises from the dependence on Vgs of the Schottky barrier lowering at the contact edge.

Contact Effects in Coplanar OTFTs

As shown in the previous section, in staggered devices with nonohmic contacts, most of the current flowing into the device channel at low Vds is injected along most of the source electrode. This contribution (Icontact in Fig. 16) is lacking in coplanar OTFTs. Indeed, in numerical simulations (Rapisarda et al. 2015), the current in coplanar BGBC devices is injected and extracted, at the source and drain contacts, respectively, within a few nanometers of the semiconductor/dielectric interface (see Fig. 17), both in the linear and in the saturation regime, for either ohmic or Schottky contacts. This implies that the contact effects and the S-shape of the output characteristics are strictly related to the characteristics of a small region located at the edge of the electrode near the semiconductor/dielectric interface.

In coplanar/bottom-contact OTFTs, the voltage drop at the source contact induced by contact resistance has been experimentally observed by SKPM measurements (Necliudov et al. 2003; Bürgi et al. 2003). This voltage drop and the resulting contact resistance have been related to two main causes: (a) the presence of a defective region near and on top of the source and drain electrodes (Gupta et al. 2009; Kim et al. 2013; Jung et al. 2010; Marinkovic et al. 2012; Wang et al. 2013) and (b) a Schottky barrier, which limits the current injection at the source (Brondijk et al. 2012).

Fig. 17 Current distribution near the source contact for devices with (a) ohmic contacts, (b) Schottky contacts, and (c) very thin metal contacts

Most likely, the two effects can either concurrently or separately control the electrical characteristics, depending on the Schottky barrier height and the semiconductor quality (Bürgi et al. 2003; Li et al. 2003). Both hypotheses must explain the characteristic S-shape of the output characteristics of coplanar TFTs. The presence of a defective region near the source and drain can explain the superlinear current dependence of the output characteristics experimentally observed if space-charge-limited current (SCLC) transport occurs (Bullejos et al. 2009), resulting in contact characteristics $I_c \propto \mu V_{cs}^{\alpha+1}/d^3$, where α = 1 for the Mott–Gurney model, while α > 1 in the case of SCLC transport in the presence of energetically distributed traps (Rose 1955). However, for devices with a low defect density in the semiconductor region near the contact and/or a high Schottky barrier at the contact (>0.3 eV), the contact resistance appears dominated by the charge carrier injection at the source (Bürgi et al. 2003; Brondijk et al. 2012), and the current can be evaluated by the diffusion-limited thermionic field emission model (Scott and Malliaras 1999). In this case, the S-shape observed in the output characteristics cannot be reproduced simply by a constant Schottky barrier (Gundlach et al. 2006; Scheinert and Paasch 2009).

Contact Effects in Organic Thin-Film Transistors: Device Physics and. . . Fig. 18 Homo band at the metal/organic semiconductor interface for three different gate voltages and Vds = 0.1 V

19

0.1 Vds=–0.1V 0 Vgs=–22V

–0.2

qΦB0=0.4eV

E (eV)

–0.1

Vgs=–10V

–0.3 Vgs=–3V

–0.4 source –0.5 0.035 0.04

organic semiconductor

0.045

0.05

0.055

0.06

X (μm)

In order to reproduce the experimental data, a field-dependent mobility (Scheinert 2009) or Schottky barrier lowering (Brondijk et al. 2012), induced by the increase of the electric field at the source with increasing Vds, has been suggested. It has also been shown by numerical simulations that in coplanar OTFTs under accumulation, the electric field at the source and the barrier lowering increase with increasing Vgs, explaining the gate voltage dependence of the contact resistance (Brondijk et al. 2012). These results suggest that the dependence of the current on Vds and Vgs, in particular the S-shaped output characteristics, can be reproduced by considering a high Schottky barrier without other assumptions (Brondijk et al. 2012). Then, as for staggered OTFTs, the barrier lowering appears to dominate the charge injection from the source contact. However, the different structure deeply influences the dependence of the charge injection on Vgs and Vds. In coplanar OTFTs, the source contact is directly in contact with the channel region, and the depletion region induced by the Schottky contact is limited to a small region near the contact edge, due to the charge accumulation induced by the gate bias (Fig. 18). For increasing Vgs the electric field and, consequently, the barrier lowering at the source contact progressively increase (Brondijk et al. 2012), as the voltage drop occurs in a smaller region (Fig. 18), resulting in the Vgs dependence observed for the Id–Vcs characteristics of coplanar OTFTs (Fig. 10).

Despite the beneficial effects of the gate bias on the carrier injection, the effects of the contact resistance are often clearly visible in the transfer characteristics of coplanar OTFTs, at least at low Vds, causing incorrect scaling with L and a gm reduction at high Vgs, when the channel resistance is low (Fig. 3). However, in coplanar OTFTs, the influence of Vds can be much stronger than in staggered devices. Indeed, for devices with moderate contact effects (a not excessively high barrier height), the Id–Vgs characteristics can show the correct scaling with L in the saturation condition (Fig. 19).

Fig. 19 Normalized transfer characteristics in the saturation regime for coplanar pentacene OTFTs with different channel lengths

Fig. 20 Normalized output characteristics, measured at different gate voltages, for coplanar pentacene OTFTs with channel lengths of 100 μm (dashed lines) and 10 μm (solid lines)

This implies that the electric field increase at the source, induced by the drain voltage, is high enough to make the contact ohmic, sufficiently reducing the barrier height. The corresponding normalized output characteristics (Fig. 20) show a reduced current at low Vds, but in the saturation condition the device performance appears similar for long- and short-channel OTFTs.

Conclusion

Contact resistance is mainly related to the energy barrier formed at the metal/organic semiconductor interface, which depends on the specific metal work function, on the HOMO (LUMO) level for p-type (n-type) organic semiconductors, and on dipole formation.

The presence of a Schottky barrier at the source severely limits the carrier injection, thus substantially reducing the on-current. The material quality at the metal/OSC interface and the use of self-assembled monolayers have been shown to allow control of the contact properties. Device geometry has also been found to influence contact effects, inducing different features and different Vgs and Vds dependences in the electrical characteristics of staggered and coplanar devices.

In the staggered structure, the current is injected from an extended source contact region, which can be represented by a distributed network of parasitic elements (see Fig. 11). For low Schottky barrier heights (<0.3 eV), the parasitic elements can be approximated as ohmic resistances and the classical transmission line model applies. For higher (>0.3 eV) Schottky barrier heights, the contact effects are dominated by the Schottky barrier at the metal/OSC interface and can be represented by a distributed diode network (see Fig. 12). In this case, numerical simulations have helped to clarify the distribution of the current spreading at the contacts: it remains basically constant for increasing Vds up to a value Vdsat1, at which the depletion layer of the Schottky contact reaches the insulator/semiconductor interface, thus causing the pinch-off at the source end of the channel. For Vds > Vdsat1, the current injected from the edge of the source contact rapidly increases, due to the barrier lowering induced by field-enhanced injection mechanisms, while the current injected from the remaining part of the source contact saturates. In the staggered configuration, the Vgs dependence of the contact characteristics is mainly related to the variation of the channel conductance in the overlap region between the gate and the S/D contacts.

In the case of coplanar devices, current spreading does not occur, as most of the current is injected at the contact edge, and even at low Vds, the current increase is related to the field-enhanced injection mechanisms that take place at the edge of the source contact. The lack of the extended contact contribution to the injected current explains the higher contact resistance usually observed in coplanar devices at low Vds, compared to staggered OTFTs. In the coplanar structure, both Vgs and Vds directly influence the electric field at the edge of the source contact, and at high Vds (saturation condition) higher electric fields are present near the source, resulting in more efficient carrier injection induced by field-enhanced barrier lowering mechanisms. Consequently, in the saturation regime a nearly ohmic contact condition can be achieved in coplanar structures in the presence of a not excessively high barrier.

Finally, we can note that the different device structures, the different carrier injection conditions, and the different influences of Vgs and Vds result in large differences in the electrical characteristics of devices with parasitic contact resistance. In particular, at low Vds, the staggered structure appears less affected by contact resistance, due to the wide injection area usually available in these devices. However, at high Vds, the normalized saturated current of short-channel staggered devices is always lower than that of long-channel ones, due to the voltage drop at the source contact. On the contrary, in the coplanar structure, even in the presence of a Schottky barrier that substantially reduces the drain current at low Vds, short-channel devices can operate efficiently in the saturation condition, thanks to the effective barrier lowering induced by both Vgs and Vds.

Further Reading
Abdinia S, Torricelli F, Maiellaro G, Coppard R, Daami A, Jacob S, Mariucci L, Palmisano G, Ragonese E, Tramontana F, van Roermund AHM, Cantatore E (2014) Variation-based design of an AM demodulator in a printed complementary organic technology. Org Electron 15:904–912
Ante F, Kälblein D, Zaki T, Zschieschang U, Takimiya K, Ikeda M, Sekitani T, Someya T, Burghartz JN, Kern K, Klauk H (2012) Contact resistance and megahertz operation of aggressively scaled organic transistors. Small 8:73
Berger HH (1972) Model for contacts to planar devices. Solid State Electron 15:145–158
Blanchet GB, Fincher CR, Lefenfeld M, Rogers JA (2004) Contact resistance in organic thin film transistors. Appl Phys Lett 84:2964
Bock C, Pham DV, Kunze U, Käfer D, Witte G, Wöll C (2006) Improved morphology and charge carrier injection in pentacene field-effect transistors with thiol-treated electrodes. J Appl Phys 100:114517
Bullejos P, Tejada JAJ, Rodríguez-Bolívar S, Deen MJ, Marinov O (2009) Model for the injection of charge through the contacts of organic transistors. J Appl Phys 105:084516
Bürgi L, Sirringhaus H, Friend RH (2002) Noncontact potentiometry of polymer field-effect transistors. Appl Phys Lett 80:2913
Chiang C-S, Martin S, Kanicki J, Ugai Y, Yukawa T, Takeuchi S (1998) Top-gate staggered amorphous silicon thin-film transistors: series resistance and nitride thickness effects. Jpn J Appl Phys 37:5914–5920
Coppedè N, Tarabella G, Villani M, Calestani D, Iannotta S, Zappettini A (2014) Human stress monitoring through an organic cotton-fiber biosensor. J Mater Chem B 2:5620
Dong H, Fu X, Liu J, Wang Z, Hu W (2013) 25th anniversary article: key points for high-mobility organic field-effect transistors. Adv Mater 25:6158–6183
Guo X, Facchetti A, Marks TJ (2014) Imide- and amide-functionalized polymer semiconductors. Chem Rev 114:8943–9021
Gupta D, Katiyar M, Gupta D (2009) An analysis of the difference in behavior of top and bottom contact organic thin film transistors using device simulation. Org Electron 10:775–784
Hamadani BH, Natelson D (2005) Nonlinear charge injection in organic field-effect transistors. J Appl Phys 97:064508
Hammock ML, Chortos A, Tee BC-K, Tok JB-H, Bao Z (2013) 25th anniversary article: the evolution of electronic skin (E-skin): a brief history, design considerations, and recent progress. Adv Mater 25:5997–6038
Hong J-P, Park A-Y, Lee S, Kang J, Shin N, Yoon DY (2008) Tuning of Ag work functions by self-assembled monolayers of aromatic thiols for an efficient hole injection for solution processed triisopropylsilylethynyl pentacene organic thin film transistors. Appl Phys Lett 92:143311
Jacob S, Abdinia S, Benwadih M, Bablet J, Chartier I, Gwoziecki R, Cantatore E, van Roermund AHM, Maddiona L, Tramontana F, Maiellaro G, Mariucci L, Rapisarda M, Palmisano G, Coppard R (2013) High performance printed n and p-type OTFTs enabling digital and analog complementary circuits on flexible plastic substrate. Solid State Electron 84:167–178
Jung K-D, Kim YC, Shin H, Park B-G, Lee JD, Cho ES, Kwon SJ (2010) A study on the carrier injection mechanism of the bottom-contact pentacene thin film transistor. Appl Phys Lett 96:103305
Kaltenbrunner M, Sekitani T, Reeder J, Yokota T, Kuribara K, Tokuhara T, Drack M, Schwödiauer R, Graz I, Bauer-Gogonea S, Bauer S, Someya T (2013) An ultra-lightweight design for imperceptible plastic electronics. Nature 499:458


Kang MJ, Doi I, Mori H, Miyazaki E, Takimiya K, Ikeda M, Kuwabara H (2011) Alkylated dinaphtho[2,3-b:2′,3′-f]thieno[3,2-b]thiophenes (Cn-DNTTs): organic semiconductors for high-performance thin-film transistors. Adv Mater 23:1222–1225
Khodagholy D, Rivnay J, Sessolo M, Gurfinkel M, Leleux P, Jimison LH, Stavrinidou E, Herve T, Sanaur S, Owens RM, Malliaras GG (2013) High transconductance organic electrochemical transistors. Nat Commun 4:2133
Kim CH, Bonnassieux Y, Horowitz G (2013) Charge distribution and contact resistance model for coplanar organic field-effect transistors. IEEE Trans Electron Devices 60:280
Klauk H, Schmid G, Radlik W, Weber W, Zhou L, Sheraw CD, Nichols JA, Jackson TN (2003) Contact resistance in organic thin film transistors. Solid State Electron 47:297–301
Kumar B, Kaushik BK, Negi YS, Saxena S, Varma GD (2013) Analytical modelling and parameter extraction of top and bottom contact structures of organic thin film transistors. Microelectron J 44:736–743
Kuribara K, Wang H, Uchiyama N, Fukuda K, Yokota T, Zschieschang U, Jaye C, Fischer D, Klauk H, Yamamoto T, Takimiya K, Ikeda M, Kuwabara H, Sekitani T, Loo Y-L, Someya T (2012) Organic transistors with high thermal stability for medical applications. Nat Commun 3:723
Li T, Ruden PP, Campbell IH, Smith DL (2003) Investigation of bottom-contact organic field effect transistors by two-dimensional device modelling. J Appl Phys 93:4017–4022
Li J, Zhao Y, Tan HS, Guo Y, Di C-A, Yu G, Liu Y, Lin M, Lim SH, Zhou Y, Su H, Ong BS (2012) A stable solution-processed polymer semiconductor with record high-mobility for printed transistors. Sci Rep 2:754
Lin P, Yan F, Yu JJ, Chan HLW, Yang M (2010) The application of organic electrochemical transistors in cell-based biosensors. Adv Mater 22:3655–3660
Luan SW, Neudeck GW (1992) An experimental study of the source/drain parasitic resistance effects in amorphous silicon thin film transistors. J Appl Phys 72:766
Mabeck JT, Malliaras GG (2006) Chemical and biological sensors based on organic thin-film transistors. Anal Bioanal Chem 384:343–353
Mariucci L, Rapisarda M, Valletta A, Jacob S, Benwadih M, Fortunato G (2013) Current spreading effects in fully printed p-channel organic thin film transistors with Schottky source–drain contacts. Org Electron 14:86–93
Murrmann H, Widmann D (1969) Current crowding on metal contacts to planar devices. ISSCC Dig Tech Pap: 162
Natali D, Fumagalli L, Sampietro M (2007) Modeling of organic thin film transistors: effect of contact resistances. J Appl Phys 101:014501
Necliudov PV, Shur MS, Gundlach DJ, Jackson TN (2003) Contact resistance extraction in pentacene thin film transistors. Solid State Electron 47:259–262
Noda K, Wada Y, Toyabe T (2014) Intrinsic difference in Schottky barrier effect for device configuration of organic thin-film transistors. Org Electron 15:1571–1578
Pesavento PV, Chesterfield RJ, Newman CR, Frisbie CD (2004) Gated four-probe measurements on pentacene thin-film transistors: contact resistance as a function of gate voltage and temperature. J Appl Phys 96:7312
Petrovic A, Pavlica E, Bratina G, Carpentiero A, Tormen M (2009) Contact resistance in organic thin film transistors. Synth Met 159:1210–1214
Rapisarda M, Simeone D, Fortunato G, Valletta A, Mariucci L (2011) Pentacene thin film transistors with (polytetrafluoroethylene) PTFE-like encapsulation layer. Org Electron 12:119–124
Rapisarda M, Valletta A, Daami A, Jacob S, Benwadih M, Coppard R, Fortunato G, Mariucci L (2012) Analysis of contact effects in fully printed p-channel organic thin film transistors. Org Electron 13:2017–2027
Rapisarda M, Calvi S, Valletta A, Fortunato G, Mariucci L (2015) The role of defective regions near the contacts on the electrical characteristics of bottom-gate bottom-contact organic TFTs. J Disp Technol 12:252–257

24

L. Mariucci et al.

Risteska A, Steudel S, Nakamura M, Knipp D (2014) Structural ordering versus energy band alignment: effects of self-assembled monolayers on the metal/semiconductor interfaces of small molecule organic thin-film transistors. Org Electron 15:3723–3728 Rivnay J, Owens RM, Malliaras GG (2014) The rise of organic bioelectronics. Chem Mater 26:679685 Rose A (1955) Space-charge-limited currents in solids. Phys Rev 97:1538 Ruiz AR, Choudhary D, Nickel B, Toccoli T, Chang K-C, Mayer AC, Clancy P, Blakely JM, Headrick RL, Iannotta S, Malliaras GG (2004) Pentacene thin film growth. Chem Mater 16:4497–4508 Scheinert S, Paasch G (2009) Interdependence of contact properties and field- and densitydependent mobility in organic field-effect transistors. J Appl Phys 105:014509 Scott JC, Malliaras GG (1999) Charge injection and recombination at the metal–organic interface. Chem Phys Lett 299:115–119 Sekitani T, Someya T (2010) Stretchable, large-area organic electronics. Adv Mater 22:2228–2246 Sekitani T, Zschieschang U, Klauk H, Someya T (2010) Flexible organic transistors and circuits with extreme bending stability. Nat Mater 9:1015 Sou A, Jung S, Gili E, Pecunia V, Joimel J, Fichet G, Sirringhaus H (2014) Programmable logic circuits for functional integrated smart plastic systems. Org Electron 15:3111–3119 Tybrandt K, Forchheimer R, Berggren M (2012) Logic gates based on ion transistors. Nat Commun 3:871 Valletta A, Daami A, Benwadih M, Coppard R, Fortunato G, Rapisarda M, Torricelli F, Mariucci L (2011) Contact effects in high performance fully printed p-channel organic thin film transistors. Appl Phys Lett 99:233309 Wang W, Li L, Ji Z, Ye T, Lu N, Li Z, Li D, Liu M (2013) Modified transmission line model for bottom-contact organic transistors. IEEE Electron Device Lett 34:1301 White HS, Kittlesen GP, Wrighton MS (1984) Chemical derivatization of an array of 3 gold microelectrodes with polypyrrole – fabrication of a molecule-based transistor. J Am Chem Soc 106:5375–5377 Xu Y, Minari T, Tsukagoshi K, Chroboczek JA, Ghibaudo G (2010) Direct evaluation of low-field mobility and access resistance in pentacene field-effect Transistors. J Appl Phys 107:114507

The following papers provide more insight into charge injection and contact resistance B€urgi L, Richards TJ, Friend RH, Sirringhaus H (2003) Close look at charge carrier injection in polymer field-effect transistors. J Appl Phys 94:6129–6136 Marinkovic M, Belaineh D, Wagner V, Knipp D (2012) On the origin of contact resistances of organic thin film transistors. Adv Mater 24:4005–4009 Natali D, Caironi M (2012) Charge injection in solution-processed organic field-effect transistors: physics, models and characterization methods. Adv Mater 24:1357–1387

The following paper is a recent review on the innovative developments of contact engineering Liu C, Xu Y, Noh Y-Y (2015) Contact engineering in organic field-effect transistors. Mater Today 18:79–96

Contact Effects in Organic Thin-Film Transistors: Device Physics and. . .

25

An exhaustive review on metal-organic interface can be found in Hwang J, Wan A, Kahn A (2009) Energetics of metal–organic interfaces: new experiments and assessment of the field. Mater Sci Eng R 64:1–31

More detail about the dependence of contact resistance on device structures can be found in: Brondijk JJ, Torricelli F, Smits ECP, Blom PWM, de Leeuw DM (2012) Gate-bias assisted charge injection in organic field-effect transistors. Org Electron 13:1526–1531 Gundlach DJ, Zhou L, Nichols JA, Jackson TN, Necliudov PV, Shur MS (2006) An experimental study of contact effects in organic thin film transistors. J Appl Phys 100:024509 Kim CH, Bonnassieux Y, Horowitz G (2011) Fundamental benefits of the staggered geometry for organic field-effect transistors. IEEE Electron Device Lett 32:1302 Richards TJ, Sirringhaus H (2007) Analysis of the contact resistance in staggered, top-gate organic field-effect transistors. J Appl Phys 102:094510

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_177-1 # Springer-Verlag Berlin Heidelberg 2014

Organic Ambipolar Transistors and Circuits

Anita Risteska and Dietmar Knipp*
Jacobs University Bremen, Bremen, Germany
*Email: [email protected]

Abstract

Ambipolar charge transport of a variety of small molecules and polymers has initiated research on ambipolar transistors and circuits. Such electronic devices can be fabricated at low temperatures on rigid or flexible large-area substrates. The realization of ambipolar logic circuits is simple in comparison to the fabrication of logic circuits in complementary metal-oxide-semiconductor (CMOS) technology: fewer layers have to be deposited, and fewer patterning steps are required to implement simple digital circuits. In the following chapter, the basic operation principles of ambipolar transistors and circuits are introduced. Moreover, an overview of current ambipolar transistor technology based on different materials is given, and the advantages and disadvantages of ambipolar devices compared to unipolar devices are described.

Introduction

Organic electronics is proclaimed to be a promising technology for large area display applications on flexible and rigid substrates (Klauk 2006; Sekitani and Someya 2010; Gelinck et al. 2004; Zhou et al. 2006; Benor et al. 2007a; Inoue et al. 2013). Organic thin-film transistors (OTFTs) have been used as pixel switches and pixel circuits in backplanes of active-matrix liquid crystal displays and light-emitting diode displays (Gelinck et al. 2004; Zhou et al. 2006; Benor et al. 2007b). Their compatibility with low fabrication temperatures has enabled the realization of digital circuitry on bendable and flexible substrates (Sekitani and Someya 2010; Gelinck et al. 2004; Zhou et al. 2006; Benor et al. 2007a). Organic field-effect transistors (FETs) with mobilities >3 cm2/Vs (Li et al. 2012; Hofmockel et al. 2013) and MHz frequency operation (Kitamura and Arakawa 2009; Hoppe et al. 2010; Risteska et al. 2014) have been reported. However, the realization of CMOS inverters based on p- and n-type organic transistors requires deposition and patterning of two different organic semiconductors, which increases the manufacturing complexity and thus the cost. Therefore, devices with ambipolar materials that provide both p- and n-type operation and enable the realization of CMOS-like circuitry without the need for multiple patterning steps are desirable. Ambipolar charge transport, where accumulation of both electrons and holes is possible, has been shown for various materials ranging from small molecules (Opitz et al. 2012; Singh et al. 2006; Schmechel et al. 2005; Yasuda et al. 2004; Yang et al. 2008) and polymers (Kronemeijer et al. 2012; Lei et al. 2012; Fan et al. 2012; Chen et al. 2012; Yuen et al. 2011; Bijleveld et al. 2009; Baeg et al. 2013) to graphene (Lemme et al. 2007; Liao et al. 2010) and carbon nanotubes (Yu et al. 2009). Although the realization of such organic ambipolar circuitry is less complex and less expensive, one of the challenges of fabricating ambipolar transistors is the efficient injection of both charge carrier types. Optimal hole injection occurs when the work function of the metal electrode lines up with the highest occupied molecular orbital (HOMO) level of the semiconductor, whereas for efficient electron injection, the work function of the metal should be close to the lowest unoccupied molecular orbital (LUMO) level (see Fig. 1a).


Fig. 1 Schematic sketch of an energy band alignment for a typical (a) p-type unipolar and (b) ambipolar transistor. For a unipolar transistor, the metal is chosen so that its work function is either close to the highest occupied molecular orbital (HOMO) for a p-type transistor or close to the lowest unoccupied molecular orbital (LUMO) for an n-type operation. In the case of an ambipolar transistor, a metal whose work function lies in the middle of the semiconductor’s bandgap is preferred

One way of realizing ambipolar devices is therefore to use different metals for the hole- and electron-injecting electrodes (Schmechel et al. 2005). However, this increases the complexity of the process and limits the assembly and performance of inverter-like circuitry. Another challenge in both unipolar and ambipolar organic transistors is trapping of charge carriers by impurities at the semiconductor/dielectric interface, limiting the charge transport of one or both charge carrier types (Melzer and von Seggern 2010; Horowitz 2010; Knipp and Northrup 2009). Thus, ambipolar organic FETs should combine a high-mobility ambipolar organic semiconductor, a largely trap-free dielectric/semiconductor interface, and source/drain electrodes with good electron and hole injection properties (Melzer and von Seggern 2010; Horowitz 2010). Ideally, balanced electron and hole injection should be achieved in ambipolar transistors. In order to develop transistors with similar contact resistances (Benor and Knipp 2008), a metal whose work function lies in the middle of the bandgap of the employed organic semiconductor is preferred (see Fig. 1b). Nevertheless, good injection efficiency requires low injection barriers, and hence a semiconductor system with a small energy difference between the electron- and hole-transporting states. The bandgap of typical organic semiconductors is usually at least 2 eV; therefore, metals with an appropriate work function, including its changes due to molecular adsorbates (interface dipoles), are needed. Alternatively, a blend of electron- and hole-transporting materials can be used to improve the intrinsic mobility of the electrons and holes in the active material. Such blends have been actively used in the field of organic solar cells (Bijleveld et al. 2009; Brabec et al. 2014; Conibeer and Willoughby 2014; Zhou et al. 2013; Sun et al. 2012) and also in organic field-effect transistors to enhance their ambipolar behavior (Meijer et al. 2003; Unni et al. 2006; Inoue et al. 2005; Hayashi et al. 2005; Rost et al. 2004). Moreover, ambipolar light-emitting transistors (LETs) have been studied as an alternative to organic light-emitting diodes (OLEDs) for efficient and controlled light emission (Rost et al. 2004). In the following, the basic operation principles of ambipolar transistors are described, and their performance at the single-transistor and circuit level is compared to unipolar devices. Moreover, we give an overview of the ambipolar transport observed in different organic transistors with a focus on single-component ambipolar organic semiconductors, as the best choice for facile circuit fabrication. Finally, the two main applications of ambipolar transistors, LETs and ambipolar circuits, are discussed.
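The work-function alignment argument above can be checked with one line of arithmetic. The following Python sketch (the function name and the pentacene-like energy values are our own illustrative assumptions; interface dipoles and image-force lowering are deliberately ignored) computes the nominal carrier injection barriers for a given metal:

```python
def injection_barriers(phi_metal, homo, lumo):
    """Nominal injection barriers in eV; all energies are magnitudes
    below the vacuum level (e.g. HOMO = 5.0 means 5.0 eV below vacuum)."""
    hole_barrier = homo - phi_metal      # metal Fermi level down to the HOMO
    electron_barrier = phi_metal - lumo  # metal Fermi level up to the LUMO
    return hole_barrier, electron_barrier

# A mid-gap work function (4.0 eV) in a 2 eV gap gives balanced 1 eV barriers
print(injection_barriers(phi_metal=4.0, homo=5.0, lumo=3.0))  # (1.0, 1.0)
```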

Operation Principles

An ambipolar field-effect transistor is a transistor that can transport both electrons and holes. Depending on the biasing voltages, different operating regions can be distinguished, as depicted in Fig. 2.


Fig. 2 Different operation modes of an ambipolar TFT depending on the applied bias voltages. The vertical dashed lines represent VG = VTp/n, while the sloping dashed lines indicate VD = VG − VTp/n, defining the boundaries between the different operating regimes (Reprinted with permission from Risteska et al. (2012) © 2012 Elsevier B.V.)

For small positive drain and gate voltages, the transistor exhibits unipolar behavior with electrons dominating the charge transport, and ambipolar behavior for large positive drain and gate voltages (Benor and Knipp 2008; Brabec et al. 2014; Conibeer and Willoughby 2014; Zhou et al. 2013; Sun et al. 2012; Meijer et al. 2003; Unni et al. 2006; Inoue et al. 2005; Hayashi et al. 2005; Rost et al. 2004; Risteska et al. 2012; Zaumseil and Sirringhaus 2007; Smits et al. 2006; Knipp et al. 2011). Since the unipolar behavior is dominated by electrons, this transistor is referred to as an n-type ambipolar field-effect transistor. Similarly, a p-type ambipolar transistor exhibits unipolar behavior with holes as the dominant charge carrier for small negative drain and gate voltages, and ambipolar behavior, where both electrons and holes contribute to the current flow, for large negative drain and gate voltages (Risteska et al. 2012; Zaumseil and Sirringhaus 2007; Smits et al. 2006; Knipp et al. 2011). If a simple charge sheet model is used to describe the charge transport of an ambipolar transistor, four operating regions can be distinguished. In an n-type ambipolar transistor, for VD < VG − VTN (with VD, VG, and VTN being the drain voltage, the gate voltage, and the electron threshold voltage, respectively), only electrons are injected into the channel, and the drain current exhibits linear behavior described by (Risteska et al. 2012)

$$I_D = \frac{W}{L}\, C_G\, \mu_n\, V_D \left( V_G - V_{TN} - \frac{V_D}{2} \right) \tag{1}$$

where W and L are the channel width and length, respectively, CG is the gate capacitance, and μn is the electron mobility. When VG − VTN ≤ VD ≤ VG − VTP (where VTP is the threshold voltage for hole transport), the transistor still exhibits unipolar behavior and the drain current saturates (Risteska et al. 2012):

$$I_D = \frac{W}{2L}\, C_G\, \mu_n \left( V_G - V_{TN} \right)^2 \tag{2}$$

For VD ≥ VG − VTP and VG − VTN > 0, the transistor shows ambipolar behavior with both electrons and holes contributing to the current flow (Risteska et al. 2012):

$$I_D = \frac{W}{2L}\, C_G \left[ \mu_n \left( V_G - V_{TN} \right)^2 + \mu_p \left( V_D - \left( V_G - V_{TP} \right) \right)^2 \right] \tag{3}$$

where μp is the hole charge carrier mobility. When the transistor is operated in this regime, an electron and a hole accumulation region are formed next to the respective electrodes and meet somewhere in the channel of the transistor. At this point, the oppositely charged carriers recombine, which in the case of electroluminescent materials results in light emission from the channel of the transistor. This type of light-emitting ambipolar transistor is discussed in section "Light-Emitting Transistors," where more details on the recombination and light emission process are given. For VD ≥ VG − VTP and VG − VTN < 0, the transistor operates in the saturation regime and the current flow is determined by holes (Risteska et al. 2012):

$$I_D = \frac{W}{2L}\, C_G\, \mu_p \left( V_D - \left( V_G - V_{TP} \right) \right)^2 \tag{4}$$
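Equations 1–4 translate directly into a short numerical sketch. The Python function below is a minimal implementation of the charge-sheet model described above; the function name, the regime bookkeeping, and the example parameter values are our own illustrative assumptions (any self-consistent unit system can be used), not a reference implementation from the text.

```python
import numpy as np

def drain_current(VG, VD, mu_n, mu_p, VTN, VTP, W, L, CG):
    """Drain current of an n-type ambipolar TFT (charge-sheet model, Eqs. 1-4).

    Assumes VG, VD >= 0 and VTP < 0. W, L: channel width/length; CG: gate
    capacitance per unit area; mu_n, mu_p: electron/hole mobilities.
    """
    k = W * CG / (2.0 * L)
    # Electron channel at the source side (Eqs. 1 and 2)
    if VG > VTN:
        if VD < VG - VTN:
            i_n = 2.0 * k * mu_n * VD * (VG - VTN - VD / 2.0)  # Eq. 1: linear
        else:
            i_n = k * mu_n * (VG - VTN) ** 2                   # Eq. 2: saturation
    else:
        i_n = 0.0
    # Hole channel at the drain side, present once VD exceeds VG - VTP
    # (hole term of Eq. 3; Eq. 4 when the electron channel is off)
    i_p = k * mu_p * (VD - (VG - VTP)) ** 2 if VD > VG - VTP else 0.0
    return i_n + i_p

# Illustrative parameters (arbitrary but self-consistent units) and a
# transfer curve at fixed drain voltage, as in Fig. 3b
dev = dict(mu_n=0.1, mu_p=0.1, VTN=1.0, VTP=-1.0, W=1e-3, L=2e-5, CG=1e-4)
VG_axis = np.linspace(0.0, 20.0, 201)
ID = np.array([drain_current(vg, 10.0, **dev) for vg in VG_axis])
```

Raising VD in this sketch increases the hole term at low VG, reproducing the elevated OFF current visible in Fig. 3b.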

Figure 3 shows a comparison of calculated current/voltage transfer characteristics of a unipolar and an ambipolar transistor. The unipolar field-effect transistor operates in three main regimes: the linear and the saturation region, depending on the applied drain voltage, for VG > VT, and the OFF state for VG < VT (see Fig. 3a). This results in high ON/OFF ratios, suitable for obtaining high voltage gains and negligible power losses when the devices are integrated in a digital circuit (McCall et al. 2014; Zhao et al. 2014). On the other hand, as described by Eqs. 1, 2, 3, and 4, an ambipolar transistor can be operated in three main regimes: linear, saturation, and ambipolar. As shown in Fig. 3b, increasing the applied drain voltage increases the ON current of the transistor but also substantially increases its OFF current. As a consequence, the high drain current in the nominal OFF state limits the application of ambipolar transistors in digital circuitry, as discussed in section "Ambipolar Circuits." A more detailed description of the operation and device physics of ambipolar transistors can be found in Risteska et al. (2012), Zaumseil and Sirringhaus (2007), and Knipp et al. (2011).

Fig. 3 Comparison of the current/voltage transfer characteristics of (a) a unipolar and (b) an ambipolar transistor. The plots show the calculated drain current for different applied drain voltages

Ambipolar Transistors

Small Molecules

Organic materials based on small molecules have been extensively used in the organic electronics industry (Klauk 2006; Sekitani and Someya 2010; Gelinck et al. 2004; Zhou et al. 2006). These semiconductors are usually suited to either hole or electron transport and have been widely used in the preparation of organic transistors (Klauk 2006; Opitz et al. 2012). However, many research groups have shown that they can exhibit ambipolar characteristics as well (Opitz et al. 2012; Singh et al. 2006; Yang et al. 2008; Tang et al. 2008; Liang et al. 2011; Song et al. 2011; Zeng et al. 2013; Glowacki et al. 2011; Irimia-Vladu et al. 2012). Figure 4 demonstrates the ambipolar behavior of a typically p-type semiconductor, 6,13-bis(triisopropylsilylethynyl)pentacene (TIPS-Pen) (Opitz et al. 2012). Opitz et al. achieved ambipolar operation in these OFETs through a proper molecular passivation layer and selection of electrode materials (Opitz et al. 2012). Electron and hole mobilities of 0.1 and 0.04 cm2/Vs were obtained for top-contact (TC) TIPS-Pen transistors with both aluminum (Al) and silver (Ag) electrodes (Opitz et al. 2012). A list of small-molecule materials that have demonstrated ambipolar behavior is given in Table 1. One of the most widely used and highest-mobility p-type organic semiconductors is pentacene. Opitz and coworkers demonstrated that pentacene TC transistors with a tetratetracontane (TTC)-passivated silicon oxide dielectric and with both Al and Ag electrodes show ambipolar transport (Opitz et al. 2012). The measured electron and hole mobilities were in the range of 0.02 and 0.1 cm2/Vs, respectively (Opitz et al. 2012).

Fig. 4 (a) Chemical structure of the typical p-type semiconductor 6,13-bis(triisopropylsilylethynyl)pentacene (TIPS-Pen). (b) Cross-sectional structure of the organic field-effect transistor (OFET) in bottom-gate and top-contact geometry. (c) Transfer characteristics in the linear region of thin-film transistors built with the molecular semiconductor TIPS-Pen (Ag contacts), using metal contacts suitable for bipolar transport. The drain voltage is 2 V. The channel length is always 70 μm (Reprinted with permission from Opitz et al. (2012) © 2012 Elsevier B.V.)


Table 1 List of small-molecule ambipolar semiconductors and their respective charge carrier mobilities, threshold voltages, and current on/off ratios

| Molecule | Electrode material | μe (cm2/Vs) | μh (cm2/Vs) | VTN (V) | VTP (V) | ION/IOFF (n, p) | References |
|---|---|---|---|---|---|---|---|
| Pentacene | Al, Ag | 0.02 | 0.1 | / | / | / | Opitz et al. (2012) |
| Pentacene | Al | 0.0023 | 0.026 | 42 | 27 | / | Yang et al. (2008) |
| Pentacene | Au | 0.04 | 0.3 | 30 | 13 | / | Singh et al. (2006) |
| α-Sexithiophene (6T) | Ag | 0.003 | 0.01 | / | / | / | Opitz et al. (2012) |
| 7,8,9,10-Tetrafluoro-5,12-bis(TIPS-ethynyl)tetraceno[2,3-b]thiophene | Au | 0.133 | 0.097 | 70 | 10 | 10^5 (n), 10^2 (p) | Tang et al. (2008) |
| TIPS-PEN silylethynylated N-heteropentacenes | Au | 1.1 | 0.22 | / | 28 | 10^4 (p) | Liang et al. (2011) |
| 8,9,10,11-Tetrachloro-6,13-bis(triisopropylsilylethynyl)-1-azapentacene (4Cl-Azapen) | Au/Ag | 0.14/0.10 | 0.12/0.11 | 28/45 | 22/36 | 10^4/10^3 (n, p) | Song et al. (2011) |
| 4Cl-Azapen | Au | 0.21 | 0.23 | 39 | 28 | / | Zeng et al. (2013) |
| Tyrian purple (6,6′-dibromoindigo) | Au | 0.03 | 0.22 | 3–5 | 1.5–1.75 | 10^3–10^4 | Glowacki et al. (2011) |
| Indigo | Au | 0.01 | 0.01 | 4.5–7 | 1.5–3 | / | Irimia-Vladu et al. (2012) |

Moreover, Yang et al. realized ambipolar pentacene transistors by incorporating polymethyl methacrylate (PMMA) as an interfacial modification layer for surface trap elimination (Yang et al. 2008). Using an electrode material with a suitable work function, like Al, electron and hole mobilities of 0.0023 cm2/Vs and 0.026 cm2/Vs, respectively, were achieved (Yang et al. 2008). Other metals, like calcium (Ca), silver (Ag), and gold (Au), were also investigated, but the transistors showed lower electron and hole mobilities than those with Al electrodes (Yang et al. 2008). Ambipolar transport in pentacene TC transistors has also been observed by Singh et al. (2006). With a proper choice of organic dielectric, like polyvinyl alcohol (PVA), pentacene OFETs showed ambipolar operation even with a high work-function metal (Au) used for the source and drain electrodes (Singh et al. 2006). The charge carrier mobilities of these pentacene devices were on the order of 0.3 cm2/Vs for holes and 0.04 cm2/Vs for electrons (Singh et al. 2006). Another small molecule for which ambipolar behavior was demonstrated by Opitz et al. is α-sexithiophene (6T), with a very low electron mobility of 0.003 cm2/Vs and a hole mobility on the order of 10^−2 cm2/Vs (Opitz et al. 2012). Next, acene-based semiconductors used in ambipolar organic transistors are discussed. Tang et al. observed balanced ambipolar transport in 7,8,9,10-tetrafluoro-5,12-bis(TIPS-ethynyl)tetraceno[2,3-b]thiophene transistors with a hole mobility of around 0.1 cm2/Vs and an electron mobility of 0.133 cm2/Vs in nitrogen (Tang et al. 2008). TIPS-PEN silylethynylated N-heteropentacene organic FETs have shown high electron mobilities of up to 1.1 cm2/Vs when measured in vacuum and hole mobilities as high as 0.22 cm2/Vs in ambient air (Liang et al. 2011). However, it should be noted that the high electron mobilities were obtained only in vacuum; when the devices were tested in ambient air, the mobilities dropped to 10^−3 cm2/Vs due to electron trapping. Song et al. and Zeng et al. prepared 4Cl-Azapen TC organic ambipolar transistors with balanced electron and hole mobilities in the range of 0.1–0.2 cm2/Vs (Song et al. 2011; Zeng et al. 2013). Although showing high and balanced electron and hole mobilities, all these acene-based semiconductors exhibited ambipolar behavior only in vacuum. Organic single crystals have also been used as semiconductors in transistors, as they show high field-effect mobilities because they are free of grain boundaries and have a low concentration of trap states. Nakanotani et al. demonstrated single-crystal ambipolar OFETs based on oligo(p-phenylenevinylene) (OPV) with balanced electron and hole mobilities in the range of 0.1 cm2/Vs (Nakanotani et al. 2009). Ambipolar behavior, again only in vacuum, has been demonstrated for rubrene single-crystal transistors with a PMMA dielectric and Ag source/drain electrodes (Takahashi et al. 2006). The devices showed low electron mobilities of 0.011 cm2/Vs and hole mobilities of 1.8 cm2/Vs, an order of magnitude lower than the highest hole mobility reported for rubrene single crystals (Menard et al. 2004).

Polymers

Polymer semiconductors are another group of materials that have been extensively used in the preparation of organic ambipolar field-effect transistors. Usually they exhibit unipolar behavior, more often p-type than n-type (Baeg et al. 2013). However, many researchers have demonstrated that polymeric organic semiconductors can show ambipolar behavior with very high and balanced charge transport of both electrons and holes (Kronemeijer et al. 2012; Lei et al. 2012; Fan et al. 2012; Chen et al. 2012; Yuen et al. 2011; Bijleveld et al. 2009; Zaumseil et al. 2006a; Lee et al. 2012; Sonar et al. 2010, 2012). Diketopyrrolopyrrole (DPP)-based copolymers are representative materials with very strong ambipolar properties. For this class of polymeric semiconductors, impressive charge carrier mobilities have been shown for both electrons and holes, close to or sometimes higher than those of state-of-the-art unipolar conjugated polymers. Figure 5 shows the current/voltage transfer characteristics of a bottom-contact/top-gate transistor with a selenophene-based low-bandgap ambipolar polymer, PSeDPPBT (Kronemeijer et al. 2012). Using Au source and drain electrodes and a spin-coated polymer film annealed at 200 °C on a 70 nm thick PMMA dielectric, mobilities of 0.84 and 0.46 cm2/Vs were achieved for electrons and holes, respectively (Kronemeijer et al. 2012). The prepared transistors were further used in high-performance ambipolar CMOS-like circuits, whose operation is discussed in more detail in section "Ambipolar Circuits." A list of some polymers exhibiting ambipolar behavior is given in Table 2. Lei et al. observed ambipolar transport in an isoindigo-based conjugated polymer (Lei et al. 2012). They showed that fluorination of the isoindigo unit considerably increases the electron mobility from 10^−2 to 0.43 cm2/Vs, while maintaining the high hole mobility of 1.85 cm2/Vs, for transistors fabricated and tested in ambient conditions (Lei et al. 2012). Apart from DPP-based polymers, benzobisthiadiazole (BBT)-based polymers have been used in the preparation of ambipolar transistors. Fan et al. demonstrated bottom-gate/Au bottom-contact ambipolar transistors with poly(benzobisthiadiazole bithiophene-thienothiophene) (PBBTTT) as the active layer on heavily doped silicon substrates (Fan et al. 2012). This semiconducting BBT-based polymer exhibited nearly balanced electron and hole mobilities of 0.7 and 1.0 cm2/Vs, respectively (Fan et al. 2012). On the other hand, poly(diketopyrrolopyrrole-terthiophene) (PDPP3T) and poly(9,9-di-n-octylfluorene-alt-benzothiadiazole) (F8BT) have shown balanced but low electron and hole mobilities (Bijleveld et al. 2009; Zaumseil et al. 2006a). Field-effect transistors with PDPP3T as the semiconductor were fabricated on a thermally grown silicon dioxide dielectric with mobilities on the order of 10^−2 cm2/Vs for both electrons and holes (Bijleveld et al. 2009). Moreover, Zaumseil et al. prepared F8BT ambipolar devices with mobilities in the range of 10^−4 cm2/Vs for light emission applications (Zaumseil et al. 2006a).


Fig. 5 (a) Chemical structure of the PSeDPPBT/PDPPBT polymers synthesized via Suzuki polycondensation. (b) Cross section of the bottom-contact/top-gate transistor structure. Transfer characteristics at (c) negative source/drain bias and (d) positive source/drain bias of an ambipolar transistor, L = 20 μm, W = 1,000 μm, based on PSeDPPBT annealed at 200 °C (Reprinted with permission from Kronemeijer et al. (2012) © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

As discussed above, DPP-based ambipolar polymers have so far shown the best mobilities for both electrons and holes. Recently, Lee et al. reported a solution-processable donor-acceptor-type ambipolar copolymer consisting of a DPP unit and hybrid siloxane substituents, PTDPPSe-Si (Lee et al. 2012). Electron and hole mobilities as high as 2.20 and 3.97 cm2/Vs, respectively, were achieved (Lee et al. 2012). They demonstrated the effectiveness of siloxane-terminated solubilizing groups as side chains for enhancing charge transport: their reference PTDPPSe transistors showed an electron mobility of 0.43 cm2/Vs and a hole mobility of 2.53 cm2/Vs, lower than the values measured for the PTDPPSe-Si field-effect transistors (Lee et al. 2012). Chen et al. prepared ambipolar DPPT-TT top-gate/bottom-contact transistors with balanced hole and electron mobilities of 1.36 cm2/Vs and 1.56 cm2/Vs, respectively (Chen et al. 2012). Additionally, Sonar et al. reported DPP-based ambipolar transistors with reasonably balanced electron and hole mobilities on the order of 10^−1 cm2/Vs (Sonar et al. 2010, 2012). Moreover, a combination of DPP- and BBT-based units has been used by Yuen et al. in the preparation of high-performance ambipolar transistors (Yuen et al. 2011). Their bottom-gate/bottom-contact devices exhibited balanced electron and hole mobilities close to 1 cm2/Vs with good thermal stability (Yuen et al. 2011).


Table 2 List of polymer ambipolar semiconductors and their respective charge carrier mobilities, threshold voltages, and current on/off ratios

| Molecule | Electrode material | μe (cm2/Vs) | μh (cm2/Vs) | VTN (V) | VTP (V) | ION/IOFF (n, p) | References |
|---|---|---|---|---|---|---|---|
| Fluorinated isoindigo (PFII2T) | Au | 0.43 | 1.85 | 30 | 38 | 10^5–10^6 (n), 10^6–10^7 (p) | Lei et al. (2012) |
| Poly(benzobisthiadiazole bithiophene-thienothiophene) (PBBTTT) | Au | 0.7 | 1.0 | / | / | / | Fan et al. (2012) |
| Diketopyrrolopyrrole-thieno[3,2-b]thiophene (DPPT-TT) | Au | 1.56 | 1.36 | / | / | 10^5–10^6 (n, p) | Chen et al. (2012) |
| Poly(benzobisthiadiazole diketopyrrolopyrrole) (PBBT12DPP) | Au | 0.99 | 0.89 | / | / | / | Yuen et al. (2011) |
| Poly(diketopyrrolopyrrole-terthiophene) (PDPP3T) | Au | 0.01 | 0.04 | / | / | / | Bijleveld et al. (2009) |
| Poly(9,9-di-n-octylfluorene-alt-benzothiadiazole) (F8BT) | Au | 8.5 × 10^−4 | 7.5 × 10^−4 | / | / | / | Zaumseil et al. (2006a) |
| Poly-3,6-dithien-2-yl-2,5-di(2-octyldodecyl)-pyrrolo[3,4-c]pyrrole-1,4-dione-5′,5″-diyl-alt-3(5-(selenophene-2-yl)) (PTDPPSe) | Au | 0.43 | 2.53 | / | / | >10^3 (n), >10^5 (p) | Lee et al. (2012) |
| Poly-3,6-dithien-2-yl-2,5-bis(6-(1,1,1,3,5,5,5-heptamethyltrisiloxan-3-yl)hexyl)-pyrrolo[3,4-c]pyrrole-1,4-dione-5′,5″-diyl-alt-3(5-(selenophene-2-yl)) (PTDPPSe-Si) | Au | 2.20 | 3.97 | / | / | >10 (n), >10^4 (p) | Lee et al. (2012) |
| Poly(3,6-difuran-2-yl-2,5-di(2-octyldodecyl)-pyrrolo[3,4-c]pyrrole-1,4-dione-alt-phenylene) (PDPP-FBF) | Au | 0.16 | 0.18 | / | / | 10^3 | Sonar et al. (2012) |
| Poly(diketopyrrolopyrrole-benzothiadiazole-thiophene) (PDPP-TBT) | Au | 0.40 | 0.35 | / | / | / | Sonar et al. (2010) |

Light-Emitting Transistors

One important issue of any semiconductor technology is to realize devices that integrate both electronic and optical functions. A simple way of combining the switching properties of transistors with the light emission properties of light-emitting diodes (LEDs) has led to the realization of light-emitting transistors (LETs). Organic semiconductors are good candidates for light emission applications, as a large number of small molecules and polymers have demonstrated relatively high electro- and photoluminescence efficiencies together with good charge transport properties (Rost et al. 2004; Nakanotani et al. 2014; Kuik et al. 2014; AlSalhi et al. 2011; Gather et al. 2011; Schidleja et al. 2009). Ambipolar organic field-effect transistors are especially well suited to light emission, since they provide an effective pn-junction in the channel of the transistor and thus recombination of electrons and holes. Moreover, only ambipolar transistors provide the technologically and scientifically most interesting properties that make light-emitting transistors desirable. The controllable position of the emission point, the possibility of light emission far away from the metal contacts, and the balanced electron and hole currents make ambipolar organic light-emitting transistors attractive for novel electro-optical applications (Kajii et al. 2003; Bader et al. 2002; Tessler et al. 2000; Baldo et al. 2002).

Fig. 6 (a) Schematic sketch of the light emission process in an ambipolar LET. The position of the light emission can be controlled by the applied drain/gate voltage. Potential and charge distribution along the channel for (b) small drain voltages (unipolar behavior) and (c) large drain voltages (ambipolar behavior). The potential profiles were calculated for a gate voltage and threshold voltage of 4 V and 0 V, respectively

In unipolar light-emitting transistors, the minority charge carriers are only injected at high electric fields from one of the electrodes, which forces the recombination to happen in the vicinity of that electrode (Ahles et al. 2004; Cicoira et al. 2006; Hepp et al. 2003). The light emission process therefore takes place only at the edge of the electrode, making it inefficient, as most of the majority charge carriers can escape into the contact without recombining. Figure 6a shows a schematic sketch of an ambipolar light-emitting transistor demonstrating how the recombination point and the emission zone can be moved throughout the whole channel by changing the applied voltages (Zaumseil et al. 2006b). The position of the recombination zone, x0, depends strongly on the drain voltage, the gate voltage, the threshold voltages, and the ratio of the hole and electron mobilities, as given by Eq. 5:

$$x_0 = \frac{L}{1 + \dfrac{\mu_p}{\mu_n} \left( \dfrac{V_D - \left( V_G - V_{TP} \right)}{V_G - V_{TN}} \right)^2} \tag{5}$$

Figure 6b and c show calculated potential profiles along the channel of the transistor. For small drain voltages, the charge distribution comprises only one charge carrier type and the transistor behaves as unipolar (see Fig. 6b). In Fig. 6c, potential profiles for larger VD are shown, and the transistor exhibits ambipolar behavior. The potential profiles nicely visualize the charge distribution in the channel of the transistor, showing a clear separation between the electron and hole conduction regions and demonstrating the shift of the recombination point as a function of the applied drain voltage. One of the first light-emitting ambipolar transistors, based on coevaporation of α-quinquethiophene (α-5T) as the hole-transporting material and N,N′-ditridecylperylene-3,4,9,10-tetracarboxylic diimide (P13) as the electron-transporting material, was reported by Rost et al. (2004). The transistors exhibited ambipolar behavior over a range of biasing voltages, accompanied by light emission (Rost et al. 2004). The light emission was proportional to the drain current, but the position of the recombination point could not be detected (Rost et al. 2004). Later, it was demonstrated that the light emission in this blend depends on the ratio of the coevaporated materials (Loi et al. 2006). Furthermore, Schidleja et al. reported light emission from a pentacene ambipolar field-effect transistor with Au and Ca drain and source electrodes, respectively (Schidleja et al. 2009). Electroluminescence was measured in a narrow wavelength range over the entire transistor channel, with the position of the recombination zone spatially controllable by the applied voltages (Schidleja et al. 2009). Zaumseil et al. were among the first to demonstrate an ambipolar light-emitting polymer transistor in a top-gate/bottom-contact configuration (Zaumseil et al. 2006a). An efficient green-emitting polymer, F8BT (poly(9,9-di-n-octylfluorene-alt-benzothiadiazole)), was used as the ambipolar semiconductor (Zaumseil et al. 2006a). Maximum external quantum efficiencies of 0.75 % were obtained for these devices, higher than the efficiency of 0.5 % observed for corresponding F8BT LEDs, despite the reflection and absorption losses caused by the top-gate electrode (Zaumseil et al. 2006a). Overall, ambipolar light-emitting transistors with controllable and resolved light emission enable new methods to study the emission physics of organic materials and new optoelectronic devices. More details on the recent advances in organic light-emitting transistors can be found in Wakayama et al. (2014).

Ambipolar Circuits

An inverter is the basic building block of all digital circuits. Complementary metal-oxide-semiconductor (CMOS) inverters have been widely used in the design of logic functions in microprocessors, microcontrollers, and other digital logic circuits, as well as in a broad variety of analog circuits, like image sensors and data converters (Kang and Leblebici 2003). In the implementation of a CMOS inverter, complementary unipolar p-type and n-type organic field-effect transistors (OFETs) are used (Yeap 2011). The advantage of using CMOS inverters is that at each logical state, one of the employed transistors is turned off; thus, there is an insignificant current flow through the inverter. This leads to very low static power consumption, with electrical losses restricted to the switching of the inverter. On the other hand, the integration of two different materials (a p-type and an n-type) on a single substrate is usually difficult and requires an increased number of steps in the fabrication process, making it more expensive. With the advent of ambipolar semiconducting materials, many researchers have proposed ambipolar circuitry as an alternative to CMOS technology (Singh et al. 2006; Yang et al. 2008; Kronemeijer et al. 2012; Fan et al. 2012; Chen et al. 2010, 2012; Bijleveld et al. 2009; Risteska et al. 2012). Using ambipolar OFETs, complementary logic can be realized since both charge polarities can be induced. This makes the fabrication process easier and therefore less expensive, since the logic functions can be implemented with a network of identical transistors. The circuit of such an ambipolar inverter is shown in Fig. 7a. The inverter consists of one ambipolar transistor intended for p-type operation (larger negative drain and gate voltages) and one for n-type operation (larger positive drain and gate voltages).


Fig. 7 (a) Circuit diagram and (b) typical voltage transfer characteristics indicating the different operating regions of an ambipolar inverter

The performance of an inverter is usually evaluated using its voltage transfer characteristics (VTCs). Depending on the applied input voltage, five different operating regions can be distinguished on the VTC curve, as indicated in Fig. 7b (Risteska et al. 2012). When 0 < VIN < VTN (region A), both the n-type and the p-type ambipolar transistors behave as unipolar. The n-type transistor operates in the saturation region with current flow due to holes, while the p-type transistor operates in the linear region and only holes contribute to the current flow. For VTN < VIN < (VDD/2) (region B), where VDD is the supply voltage of the inverter, the n-type transistor operates in the saturation regime and exhibits ambipolar behavior, with both electrons and holes contributing to the current flow, while the p-type transistor still operates in the linear regime with current flow due only to holes. Next, when VIN = (VDD/2) (region C), both the n-type and the p-type transistors operate in the saturation region with current flow due only to electrons and only to holes, respectively. In the fourth operating region D, where (VDD/2) < VIN < (VDD + VTP), the n-type transistor behaves as unipolar, operating in the linear regime with current flow due to electrons, while the p-type transistor shows ambipolar behavior with current flow due to both electrons and holes. In the last region E, where (VDD + VTP) < VIN < VDD, both transistors behave as unipolar with only electrons contributing to the current flow. More details on the different operating regions can be found in Risteska et al. (2012). From the VTCs, information on the static noise margin (SNM), which is the maximum noise signal that can be superimposed on a signal without causing malfunction of the circuit, can be obtained. Moreover, the power consumption and the inverter gain can be determined from the VTC of the inverter. VTCs of a CMOS and an ambipolar inverter are shown in Fig. 8a–d (Risteska et al. 2012). The figure shows the fundamental differences between CMOS and ambipolar inverters. In a CMOS inverter, at each logical state one of the transistors is turned off and the current flow is insignificant; therefore, the output voltage reaches either 0 V or VDD and the static power consumption is negligible (see Fig. 8a and b). On the other hand, for an ambipolar inverter, since the n-type and the p-type ambipolar transistors cannot be switched off at low and high input voltages, respectively, the VTC shows a "Z-shape" (see Fig. 8c and d). This strongly reduces the noise margin and increases the static power consumption compared to a CMOS inverter (Risteska et al. 2012). Moreover, the plots show the importance of having balanced charge carrier mobilities and larger threshold voltages for the n-type and the p-type transistors, for both complementary technologies.
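The Z-shaped VTC discussed above can be reproduced numerically by finding, for each input voltage, the output voltage at which the currents of the two identical transistors balance. The sketch below reuses the hypothetical drain_current function and dev parameters from the earlier charge-sheet example; modelling the negatively biased pull-up device by mirroring its voltages and swapping the electron/hole parameters is our own simplifying assumption for a symmetric device, not a procedure given in the text.

```python
def inverter_vout(VIN, VDD, dev, tol=1e-9):
    """Output voltage of an ambipolar inverter (two identical devices),
    found by bisection on the current balance of the two branches."""
    def imbalance(VOUT):
        i_down = drain_current(VIN, VOUT, **dev)         # pull-down: VG=VIN, VD=VOUT
        i_up = drain_current(VDD - VIN, VDD - VOUT,       # pull-up, mirrored
                             mu_n=dev['mu_p'], mu_p=dev['mu_n'],
                             VTN=-dev['VTP'], VTP=-dev['VTN'],
                             W=dev['W'], L=dev['L'], CG=dev['CG'])
        return i_down - i_up
    lo, hi = 0.0, VDD
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if imbalance(mid) > 0.0:
            hi = mid   # pull-down sinks more than pull-up sources: VOUT drops
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Trace the VTC: the ambipolar leakage keeps VOUT away from 0 and VDD,
# giving the "Z-shape" of Fig. 8c, d
vtc = [(vin, inverter_vout(vin, 20.0, dev)) for vin in np.linspace(0.0, 20.0, 101)]
```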


Fig. 8 Simulated VTCs of (a, b) a CMOS and (c, d) an ambipolar inverter for different (a, c) charge carrier mobility ratios (VDD = 5 V; VTN = −VTP = 0 V; μnN/μpP = μnP/μpN = 0.1, 1, 10) and (b, d) threshold voltages (VDD = 5 V; μnN/μpP = μnP/μpN = 1; VTN = −VTP = 0, 1, 2 V). The VTCs were simulated for LN = LP = 2 μm and WN = WP = 1,000 μm (Reprinted with permission from Risteska et al. (2012) © 2012 Elsevier B.V.)

The ratio of the mobilities of the n-type and the p-type transistors affects the switching voltage (see Fig. 8a and c), while an increase of the threshold voltages affects the shape of the VTC (see Fig. 8b and d), which in turn influences the noise margin of the inverters (Risteska et al. 2012). The noise margin is defined by the maximum signal distortion or noise that can be accepted while a proper identification of a logic "high" or a logic "low" signal is still ensured. The static noise margin of an inverter should be as large as possible; the common methods used for determining it are described in Risteska et al. (2012). For a CMOS inverter, the maximum SNM can reach half the supply voltage, whereas for ambipolar inverters, the static noise margin is considerably smaller (Risteska et al. 2012). The SNM, together with statistics on the other transistor parameters, can be used to calculate circuit yield, providing information on the failure rate and circuit reliability. As discussed before, from the perspective of facile integrated circuit (IC) fabrication, single-layer ambipolar semiconductors with balanced electron and hole mobilities, relatively high threshold voltages, and efficient injection of both charge types from the same electrode material are the best choice. Figure 9 shows experimentally measured VTCs of an ambipolar pentacene inverter (Singh et al. 2006). The constituent ambipolar transistors were prepared on a PVA dielectric with gold electrodes, a channel width-to-length ratio of 56, and a hole/electron mobility ratio of 7.5 (Singh et al. 2006). For a supply voltage of 100 V, a voltage gain of around 10 was achieved (Singh et al. 2006). Similarly, Yang et al. demonstrated ambipolar inverters based on pentacene transistors with Al electrodes and voltage gains between 8 and 11 for supply voltages of +100 and −100 V (Yang et al. 2008).
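As a worked example of one common SNM criterion, the unity-gain-point method (the text defers the full set of extraction methods to Risteska et al. (2012)), the helper below computes noise margins from any sampled VTC, for instance the vtc array produced by the inverter sketch above. The function name and the chosen criterion are our own illustrative assumptions.

```python
def noise_margins(vin, vout):
    """Static noise margins (NM_L, NM_H) from a VTC via unity-gain points."""
    gain = np.gradient(vout, vin)              # dVOUT/dVIN
    idx = np.where(np.abs(gain) >= 1.0)[0]     # transition region, |gain| >= 1
    VIL, VIH = vin[idx[0]], vin[idx[-1]]       # unity-gain input levels
    VOH, VOL = vout[idx[0]], vout[idx[-1]]     # corresponding output levels
    return VIL - VOL, VOH - VIH

vin = np.array([p[0] for p in vtc])
vout = np.array([p[1] for p in vtc])
print(noise_margins(vin, vout))                # smaller margins than a CMOS VTC
```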


Fig. 9 Transfer characteristics of a complementary-like inverter using two identical ambipolar pentacene OFETs. Inverter characteristics with Vin and Vdd positively biased, and the corresponding gain (Reprinted with permission from Singh et al. (2006) © 2006 AIP Publishing LLC)

Figure 10 shows the experimental VTC of the DPP-based selenophene copolymer ambipolar inverter measured by Kronemeijer et al. (2012). They reported high-performance complementary-like inverters and ring oscillators using PSeDPPBT ambipolar OFETs (Kronemeijer et al. 2012). With a gate dielectric thickness of only 70 nm, the inverters were able to operate at supply voltages as low as 10 V. Due to the almost balanced electron and hole mobilities, which are of the same order of magnitude, the switching voltage is very close to VDD/2. Moreover, average inverter gains of 30 and 40 were measured for VDD = 10 V and VDD = 20 V, respectively (Kronemeijer et al. 2012). The inverters exhibit noise margins of 3.25 V for VDD = 10 V and 5.75 V for VDD = 20 V (Kronemeijer et al. 2012). Further, the inverters were used in a three-stage ring oscillator, where a maximum oscillation frequency of 182 kHz at VDD = 50 V was obtained, corresponding to a stage delay of 0.91 μs (Kronemeijer et al. 2012). Fan et al. demonstrated BBT-based polymer ambipolar inverters operating in the first and the third quadrant with a gain of 35 (Fan et al. 2012). High-performance complementary-like inverters were also constructed by combining two identical ambipolar DPPT-TT transistors (Chen et al. 2012). The inverter's VTC exhibited good symmetry, with a switching voltage of nearly half the supply voltage and a relatively high gain of more than 20 (Chen et al. 2012). Further DPP-based ambipolar inverters were prepared by Bijleveld et al., with gains as high as 30 for VDD = ±50 V (Bijleveld et al. 2009).
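The quoted stage delay follows from the standard ring-oscillator relation, reproduced here as a quick check (the relation itself is textbook material, not stated in the source):

$$f_{\mathrm{osc}} = \frac{1}{2 N \tau} \;\Rightarrow\; \tau = \frac{1}{2 \times 3 \times 182\ \mathrm{kHz}} \approx 0.92\ \mu\mathrm{s}$$

consistent with the reported 0.91 μs for the three-stage (N = 3) oscillator.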


Fig. 10 Input-output characteristics, worst-case noise margin extraction, and gain of the constituent inverters at positive drive voltages of 10 and 20 V. The constituent PSeDPPBT ambipolar transistors have a channel length and width of 5 and 120 μm, respectively, with μh = 0.46 cm2/Vs and μe = 0.84 cm2/Vs (Reprinted with permission from Kronemeijer et al. (2012) © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

The highest voltage gain, 86 for VDD = 80 V, was reported by Chen et al. for polyselenophene polymeric ambipolar transistors (Chen et al. 2010). The top-gate/Au bottom-contact devices with L = 20 μm and W = 1 cm showed balanced electron and hole mobilities on the order of 10^−2 cm2/Vs (Chen et al. 2010). Although the efficient injection of electrons and holes and the ambipolar charge transport in ambipolar OFETs are very advantageous for many applications, in complementary-like electronic circuitry they cause drawbacks, such as increased static power consumption and a reduced static noise margin, as pointed out before. The reduced noise margin deteriorates the robustness to noise in digital logic circuits in comparison to CMOS inverters (Risteska et al. 2012). The power consumption of an inverter consists of two main components: dynamic power and static power. The dynamic power consumption is proportional to the switching frequency of the inverter and is similar for both CMOS and ambipolar inverters. The static power consumption, on the other hand, is related to the leakage current when the input is not switching. In CMOS inverters, the static power consumption is negligible, while in ambipolar inverters it is considerably higher, because the ambipolar transistors cannot be completely switched off. Concepts to turn the ambipolar OFET into a unipolar device by modifying and controlling the charge injection barriers for electrons and holes have been proposed (Baeg et al. 2011). Further techniques based on channel, gate, and contact engineering have been explored to suppress the ambipolar behavior of other carbon-based materials, as discussed in the next section (Wang et al. 2009a; Lin et al. 2004; Javey et al. 2004). With the ability to control the device polarity in the field, novel and promising design opportunities in both the analog and digital domains become possible (O'Connor et al. 2007; Jamaa et al. 2008; Wang et al. 2009b).
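To first order, the two power components discussed above can be written as follows; these are standard inverter power relations, stated here as an illustrative sketch rather than taken from the text:

$$P_{\mathrm{dyn}} = \alpha\, f\, C_L\, V_{DD}^{2}, \qquad P_{\mathrm{stat}} = I_{\mathrm{leak}}\, V_{DD}$$

where α is the switching activity, f the switching frequency, C_L the load capacitance, and I_leak the residual current of the branch that cannot be switched off. In a CMOS inverter I_leak is negligible, whereas in an ambipolar inverter it is set by the OFF-state current of the Z-shaped VTC.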


The growing interest in wearable electronics has initiated and stimulated much research and development of display media on flexible substrates. Display backplanes realized with organic transistors and organic circuitry might be a cornerstone in realizing bendable and rollable displays (Sekitani and Someya 2010; Zhou et al. 2006; Nomoto et al. 2011). So far, several commercial display manufacturers have demonstrated displays on flexible substrates (Nomoto et al. 2011). In the future, it can be expected that not only organic CMOS circuitry but also ambipolar transistor-based circuits will be used as part of display backplanes.

Outlook

Ambipolar charge transport has been reported in many newly investigated material systems, including graphene and carbon nanotubes (Lemme et al. 2007; Novoselov et al. 2004; Heinze et al. 2003). Due to their excellent electrical and physical properties, these materials have been extensively researched for device applications. Graphene MOSFETs with a cutoff frequency of 100 GHz (Lin et al. 2010) and charge carrier mobilities exceeding 20,000 cm2/Vs (Liao et al. 2010) have been demonstrated. Moreover, ambipolar carbon nanotube transistors have been used to realize logic circuits (Yu et al. 2009). Device structures integrating a second gate to control the ambipolarity of the transistor have also been investigated (Javey et al. 2004; Chen et al. 2006; Li et al. 2006). One of the gates is used to control the conductivity of the transistor's channel, while the second gate is intended to control the charge injection. Depending on the voltage applied to this second, so-called polarity gate, the transistor can be operated either as an n-type or a p-type transistor (Chen et al. 2006; Li et al. 2006). The adaptiveness of such logic gates, with the possibility of integrating several logic functions in a single logic circuit, enables new design opportunities for logic circuits with high integration density (O'Connor et al. 2007; Jamaa et al. 2008). Although the ambipolarity of these novel device materials has shown great potential for innovative design opportunities in different analog and digital fields (O'Connor et al. 2007; Jamaa et al. 2008; Wang et al. 2009b), there are still several challenges that need to be addressed and overcome before they can be used in commercial applications. More details on recent advances in graphene technology, including its advantages and disadvantages, can be found in Schwierz (2010).

Summary

Over the past few years, ambipolar TFTs have drawn a lot of attention as potential candidates for low-cost and flexible electronics. Tremendous progress has been made in understanding both electron and hole transport in a variety of small molecules and polymers, as well as in newly investigated materials such as graphene and carbon nanotubes. The ability to observe both electron and hole transport in a single semiconductor material has opened up new opportunities for both technological applications and scientific studies of these materials. As discussed, ambipolar FETs enable innovative device applications based on light-emitting transistors and complementary-like circuits. Unlike unipolar transistors, they operate independently of the polarity of the gate voltage. This intrinsic property of ambipolar transistors has the potential to lead to new paradigms in the design of simple analog and digital circuits. Therefore, ambipolar circuitry has been proposed as a complement to CMOS technology. Further, the ability to inject both electrons and holes into a semiconductor from a single metal contact material allows less complex and cheaper fabrication. However, there are still a number of technological and scientific challenges, related both to materials and to processes, that need to be addressed to fulfill the stringent constraints on yield and uniformity needed in a production environment before ambipolar devices are fully incorporated into commercial applications.

Further Reading

Ahles M, Hepp A, Schmechel R, von Seggern H (2004) Light emission from a polymer transistor. Appl Phys Lett 84:428–430
AlSalhi MS, Alam J, Dass LA, Raja M (2011) Recent advances in conjugated polymers for light emitting devices. Int J Mol Sci 12:2036–2054
Bader MA, Marowsky G, Bahtiar A, Koynov K, Bubeck C, Tillmann H, Hörhold HH, Pereira S (2002) Poly(p-phenylenevinylene) derivatives: new promising materials for nonlinear all-optical waveguide switching. J Opt Soc Am B 19:2250–2262
Baeg KJ, Kim J, Khim D, Caironi M, Kim DY, You IK, Quinn JR, Facchetti A, Noh YY (2011) Charge injection engineering of ambipolar field-effect transistors for high-performance organic complementary circuits. ACS Appl Mater Inter 3:3205–3214
Baeg KJ, Caironi M, Noh YY (2013) Toward printed integrated circuits based on unipolar or ambipolar polymer semiconductors. Adv Mater 25:4210–4244
Baldo MA, Holmes RJ, Forrest SR (2002) Prospects for electrically pumped organic lasers. Phys Rev B 66:35321
Benor A, Knipp D (2008) Contact effects in organic thin film transistors with printed electrodes. Org Electron 9:209–219
Benor A, Hoppe A, Wagner V, Knipp D (2007a) Microcontact printing and selective surface dewetting for large area electronic applications. Thin Solid Films 515:7679–7682
Benor A, Hoppe A, Wagner V, Knipp D (2007b) Electrical stability of pentacene thin-film transistors. Org Electron 8:749–758
Bijleveld JC, Zoombelt AP, Mathijssen SGJ, Wienk MM, Turbiez M, de Leeuw DM, Janssen RAJ (2009) Poly(diketopyrrolopyrrole-terthiophene) for ambipolar logic and photovoltaics. J Am Chem Soc 131:16616–16617
Brabec C, Scherf U, Dyakonov V (2014) Organic photovoltaics: materials, device physics, and manufacturing technologies, 2nd edn. Wiley, New York
Chen BH, Wei JH, Lo PY, Wang HH, Lai MJ, Mj T, Chao TS, Lin HC, Huang TY (2006) A carbon nanotube field effect transistor with tunable conduction-type by electrostatic effects. Solid-State Electron 50:1241–1348
Chen Z, Lemke H, Albert-Seifried S, Caironi M, Nielsen MM, Heeney M, Zhang W, McCulloch I, Sirringhaus H (2010) High mobility ambipolar charge transport in polyselenophene conjugated polymers. Adv Mater 22:2371–2375
Chen Z, Lee MJ, Ashraf RS, Gu Y, Albert-Seifried S, Nielsen MM, Schroeder B, Anthopoulos TD, Heeney M, McCulloch I, Sirringhaus H (2012) High-performance ambipolar diketopyrrolopyrrole-thieno[3,2-b]thiophene copolymer field-effect transistors with balanced hole and electron mobilities. Adv Mater 24:647–652
Cicoira F, Santato C, Melucci M, Facaretto L, Gazzano M, Muccini M, Barbarella G (2006) Organic light-emitting transistors based on solution-cast and vacuum-sublimed films of a rigid core thiophene oligomer. Adv Mater 18:169–174
Conibeer GJ, Willoughby A (2014) Solar cell materials: developing technologies. Wiley, Chichester


Fan J, Yuen JD, Wang M, Seifter J, Seo JH, Mohebbi AR, Zakhidov D, Heeger A, Wudl F (2012) High-performance ambipolar transistors and inverters from an ultralow bandgap polymer. Adv Mater 24:2186–2190
Gather MC, Könen A, Meerholz K (2011) White organic light-emitting diodes. Adv Mater 23:233–248
Gelinck GH, Huitema HEA, van Veenendaal E, Cantatore E, Schrijnemakers L, van der Putten J, Geuns TCT, Beenhakkers M, Giesbers JB, Huisman BH, Meijer EJ, Benito EM, Touwslager FJ, Marsman AW, van Rens BJE, de Leeuw DM (2004) Flexible active matrix displays and shift registers based on solution-processed organic transistors. Nat Mater 3:106–110
Glowacki ED, Leonat L, Voss G, Bodea MA, Bozkurt Z, Ramil AM, Irimia-Vladu M, Bauer S, Sariciftci NS (2011) Ambipolar organic field effect transistors and inverters with the natural material Tyrian Purple. AIP Adv 1:042132
Hayashi Y, Kanamori H, Yamada I, Takasu A, Takagi S, Kaneko K (2005) Facile fabrication method for p/n-type and ambipolar transport polyphenylenevinylene-based thin-film field-effect transistors by blending C60 fullerene. Appl Phys Lett 86:052104
Heinze S, Radosavljevic M, Tersoff J, Avouris P (2003) Unexpected scaling of the performance of carbon nanotube Schottky-barrier transistors. Phys Rev B 68:235418
Hepp A, Heil H, Weise W, Ahles M, Schmechel R, von Seggern H (2003) Light-emitting field-effect transistors based on a tetracene thin film. Phys Rev Lett 91:157406
Hofmockel R, Zschieschang U, Kraft U, Rödel R, Hansen NH, Stolte M, Würthner F, Takimiya K, Kern K, Pflaum J, Klauk H (2013) High-mobility organic thin-film transistors based on a small-molecule semiconductor deposited in vacuum and by solution shearing. Org Electron 14:3213–3221
Hoppe A, Knipp D, Gburek B, Benor A, Marinkovic M, Wagner V (2010) Scaling limits of organic thin film transistors. Org Electron 11:626–631
Horowitz G (2010) Interfaces in organic field-effect transistors. Adv Polym Sci 223:113–153
Inoue Y, Sakamoto Y, Suzuki T, Kobayashi M, Gao Y, Tokito S (2005) Organic thin-film transistors with high electron mobility based on perfluoropentacene. Jpn J Appl Phys 44:3663–3668
Inoue A, Okamoto T, Sakai M, Kuniyoshi S, Yamauchi H, Nakamura M, Kudo K (2013) Flexible organic field-effect transistors fabricated by thermal process. Phys Status Solidi A 210:1353–1357
Irimia-Vladu M, Glowacki ED, Troshin PA, Schwabegger G, Leonat L, Susarova DK, Krystal O, Ullah M, Kanbur Y, Bodea MA, Razumov VF, Sitter H, Bauer S, Sariciftci NS (2012) Indigo – a natural pigment for high performance ambipolar organic field effect transistors and circuits. Adv Mater 24:375–380
Jamaa MHB, Atienza D, Leblebici Y, De Micheli G (2008) Programmable logic circuits based on ambipolar CNFET. In: Proceedings of the design automation conference, Anaheim, CA, pp 339–340
Javey A, Guo J, Farmer DB, Wang Q, Wang D, Gordon RG, Lundstrom M, Dai H (2004) Carbon nanotube field-effect transistors with integrated ohmic contacts and high-k gate dielectrics. Nano Lett 4:447–450
Kajii H, Taneda T, Ohmori Y (2003) Organic light-emitting diode fabricated on a polymer substrate for optical links. Thin Solid Films 438:334–338
Kang SM, Leblebici Y (2003) CMOS digital integrated circuits: analysis and design. Tata McGraw-Hill
Kitamura M, Arakawa Y (2009) Current-gain cutoff frequencies above 10 MHz for organic thin-film transistors with high mobility and low parasitic capacitance. Appl Phys Lett 95:023503
Klauk H (2006) Organic electronics: materials, manufacturing, and applications. Wiley, Germany
Knipp D, Northrup JE (2009) Electric-field induced gap states in pentacene. Adv Mater 21:2511–2515
Knipp D, Chan KY, Gordijn A, Marinkovic M, Stiebig H (2011) Ambipolar charge transport in microcrystalline silicon thin-film transistors. J Appl Phys 109:024504

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_177-1 # Springer-Verlag Berlin Heidelberg 2014

Kronemeijer AJ, Gili E, Shahid M, Rivnay J, Salleo A, Heeney M, Sirringhaus H (2012) A Selenophenebased low-bandgap donor–acceptor polymer leading to fast ambipolar logic. Adv Mater 24:1558–1565 Kuik M, Wetzelaer GJAH, Nicolai HT, Craciun NI, De Leeuw DM, Blom PWM (2014) 25th Anniversary article: charge transport and recombination in polymer light-emitting diodes. Adv Mater 26:512–531 Lee J, Han AR, Kim J, Kim Y, Oh JH, Yang C (2012) Solution-processable ambipolar diketopyrrolopyrrole-selenophene polymer with unprecedentedly high hole and electron mobilities. J Am Chem Soc 134:20713–20721 Lei T, Dou JH, Ma ZJ, Yao CH, Liu CJ, Wang JY, Pei J (2012) Ambipolar polymer field-effect transistors based on fluorinated isoindigo: high performance and improved ambient stability. J Am Chem Soc 134:20025–20028 Lemme MC, Echtermeyer TJ, Baus M, Kurz H (2007) A Graphene field-effect device. IEEE Electron Device Lett 28:282–284 Li J, Zhang Q, Chan-Park MB (2006) Simulation of carbon nanotube based p-n junction diodes. Carbon 44:3087–3090 Li J, Zhao Y, Tan HS, Guo Y, Di CA, Yu G, Liu Y, Lin M, Lim SH, Zhou Y, Su H, Ong BS (2012) A stable solution-processed polymer semiconductor with record high-mobility for printed transistors. Sci Rep 2:754 Liang Z, Tang Q, Mao R, Liu D, Xu J, Miao Q (2011) The position of nitrogen in N-Heteropentacenes matters. Adv Mater 23:5514–5518 Liao L, Bai J, Qu Y, Lin YC, Li Y, Huang Y, Duan X (2010) High-k oxide nanoribbons as gate dielectrics for high mobility top-gated graphene transistors. Proc Natl Acad Sci U S A 107:6711–6715 Lin YM, Appenzeller J, Avouris P (2004) Ambipolar-to-unipolar conversion of carbon nanotube transistors by gate structure engineering. Nano Lett 4:947–950 Lin YM, Dimitrakopoulos C, Jenkins KA, Farmer DB, Chiu HY, Grill A, Avouris P (2010) 100-GHz transistors from wafer-scale epitaxial graphene. Science 327:662 Loi MA, Rost-Bietsch C, Murgia M, Karg S, Riess W, Muccini M (2006) Tuning optoelectronic properties of ambipolar organic light-emitting transistors using a bulk-heterojunction approach. Adv Funct Mater 16:41–47 McCall KL, Rutter SR, Bone EL, Forrest ND, Bissett JS, Jones JDE, Simms MJ, Page AJ, Fisher R, Brown BA, Ogier SD (2014) High performance organic transistors using small molecule semiconductors and high permittivity semiconducting polymers. Adv Funct Mater 24:3067–3074 Meijer EJ, De Leeuw DM, Setayesh S, van Veenendaal E, Huisman B-H, Blom PWM, Hummelen JC, Scherf U, Klapwijk TM (2003) Solution-processed ambipolar organic field-effect transistors and inverters. Nat Mater 2:678–682 Melzer C, von Seggern H (2010) Organic field-effect transistors for CMOS devices. Adv Polym Sci 223:213–257 Menard E, Podzorov V, Hur SH, Gaur A, Gershenson ME, Rogers JA (2004) High-performance n- and p-type single-crystal organic transistors with free-space gate dielectrics. Adv Mater 16:23–24 Nakanotani H, Saito M, Nakamura H, Adachi C (2009) Tuning of threshold voltage by interfacial carrier doping in organic single crystal ambipolar light-emitting transistors and their bright electroluminescence. Appl Phys Lett 95:103307 Nakanotani H, Higuchi T, Furukawa T, Masui K, Morimoto K, Numata M, Tanaka H, Sagara Y, Yasuda T, Adachi C (2014) High-efficiency organic light-emitting diodes with fluorescent emitters. Nat Commun 5:4016 Nomoto K, Noda M, Kobayashi N, Katsuhara M, Yumoto A, Ushikura S, Yasuda R, Hirai N, Yukawa G, Yagi I (2011) Rollable OLED display driven by organic TFTs. SID Symp Digest 42:488–491

Page 19 of 21

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_177-1 # Springer-Verlag Berlin Heidelberg 2014

Novoselov KS, Geim AK, Morozov SV, Jiang D, Zhang Y, Dubonos SV, Grigorieva IV, Firsov AA (2004) Electric field effect in atomically thin carbon films. Science 306:666–669 O’Connor I, Liu J, Gaffiot F, Pregaldiny F, Lallement C, Maneux C, Goguet J, Fregonese S, Zimmer T, Anghel L, Dang TT, Leveugle R (2007) CNTFET modeling and reconfigurable logic-circuit design. IEEE Trans Circuits Systems I 54:2365–2379 Opitz A, Horlet M, Kiwull M, Wagner J, Kraus M, Br€ utting W (2012) Bipolar charge transport in organic field-effect transistors: enabling high mobilities and transport of photo-generated charge carriers by a molecular passivation layer. Organic Electron 13:1614–1622 Risteska A, Chan KY, Anthopoulos TD, Gordijn A, Stiebig H, Nakamura M, Knipp D (2012) Designing organic and inorganic ambipolar thin-film transistors and inverters: theory and experiment. Organic Electron 13:2816–2824 Risteska A, Myny K, Steudel S, Nakamura M, Knipp D (2014) Scaling limits of organic digital circuits. Organic Electron 15:461–469 Rost C, Karg S, Riess W, Loi MA, Murgia M, Muccini M (2004) Ambipolar light-emitting organic fieldeffect transistor. Appl Phys Lett 85:1613–1615 Schidleja M, Melzer C, von Seggern H (2009) Electroluminescence from a pentacene based ambipolar organic field-effect transistor. Appl Phys Lett 94:123307 Schmechel R, Ahles M, von Seggern H (2005) A pentacene ambipolar transistor: experiment and theory. J Appl Phys 98:084511 Schwierz F (2010) Graphene transistors. Nat Nanotechnol 5:487–496 Sekitanu T, Someya T (2010) Stretchable, large-area organic electronics. Adv Mater 22:2228–2246 Singh TB, Senkarabacak P, Sariciftci NS, Tanda A, Lackner C, Hagelauer R, Horowitz G (2006) Organic inverter circuits employing ambipolar pentacene field-effect transistors. Appl Phys Lett 89:033512 Smits ECP, Anthopoulos TD, Setayesh S, van Veenendaal E, Coehoorn R, Blom PWM, de Boer B, de Leeuw DM (2006) Ambipolar charge transport in organic field-effect transistors. Phys Rev B 73:205316 Sonar P, Singh SP, Li Y, Soh MS, Dodabalapur A (2010) A low-bandgap diketopyrrolopyrrolebenzothiadiazole-based copolymer for high-mobility ambipolar organic thin-film transistors. Adv Mater 22:5409–5413 Sonar P, Foong TRB, Singh SP, Li Y, Dodabalapur A (2012) A furan- containing conjugated polymer for high mobility ambipolar organic thin film transistors. Chem Commun 48:8383–8385 Song CL, Ma CB, Yang F, Zeng WJ, Zhang HL, Gong X (2011) Synthesis of tetrachloro-azapentacene as ambipolar organic semiconductor with high and balanced carrier mobilities. Org Lett 13:2880–2883 Sun Y, Welch GC, Leong WL, Takacs CJ, Bazan GC, Heeger AJ (2012) Solution-processed smallmolecule solar cells with 6.7% efficiency. Nat Mater 11:44–48 Takahashi T, Takenobu T, Takeya J, Iwasa Y (2006) Ambipolar organic field-effect transistors based on rubrene single crystals. Appl Phys Lett 88:033505 Tang ML, Reichardt AD, Miyaki N, Stoltenberg RM, Bao Z (2008) Ambipolar, high performance, acenebased organic thin film transistors. J Am Chem Soc 130:6064–6065 Tessler N, Pinner DJ, Cleave V, Ho PKH, Friend RH, Yahioglu G, Le Barny P, Gray J, de Souza M, Rumbles G (2000) Properties of light emitting organic materials within the context of future electrically pumped lasers. Synth Met 115:57–62 Unni KNN, Pandey AK, Nunzi JM, Alem S (2006) Ambipolar organic field-effect transistor fabricated by co-evaporation of pentacene and N, N' -ditridecylperylene 3, 4, 9, 10-tetracarboxylic diimide. 
Chem Phys Lett 421:554–557 Wakayama Y, Hayakawa R, Seo HS (2014) Recent progress in photoactive organic field-effect transistors. Sci Technol Adv Mater 15:024202 Page 20 of 21

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_177-1 # Springer-Verlag Berlin Heidelberg 2014

Wang X, Li X, Zhang L, Yoon Y, Weber PK, Wang H, Guo J, Dai H (2009a) N-doping of graphene through electrothermal reactions with ammonia. Science 324:768–771 Wang H, Nezich D, Kong J, Palacios T (2009b) Graphene frequency multipliers. IEEE Electron Device Lett 30:547–549 Yang CY, Cheng SS, Ou CW, Chuang YC, Wu MC, Dhananjay CCW (2008) Realization of ambipolar pentacene thin film transistors through dual interfacial engineering. J Appl Phys 103:094519 Yasuda T, Goto T, Fujita K, Tsutsui T (2004) Ambipolar pentacene field-effect transistors with calcium source-drain electrodes. Appl Phys Lett 85:2098 Yeap KH (2011) Fundamentals of digital integrated circuit design. AuthorHouse, Central Milton Keynes Yu WJ, Kim UJ, Kang BR, Lee IH, Lee EH, Lee YH (2009) Adaptive logic circuits with doping-free ambipolar carbon nanotube transistors. Nano Lett 9:1401–1405 Yuen JD, Fan J, Seifter J, Lim B, Hufschmid R, Heeger AJ, Wudl F (2011) High performance weak donor–acceptor polymers in thin film transistors: effect of the acceptor on electronic properties, ambipolar conductivity, mobility, and thermal stability. J Am Chem Soc 133:20799–20807 Zaumseil J, Sirringhaus H (2007) Electron and ambipolar transport in organic field-effect transistors. Chem Rev 107:1296–1323 Zaumseil J, Donley CL, Kim JS, Friend RH, Sirringhaus H (2006a) Efficient top-gate, ambipolar, lightemitting field-effect transistors based on a green-light-emitting polyfluorene. Adv Mater 18:2708–2712 Zaumseil J, Friend RH, Sirringhaus H (2006b) Spatial control of the recombination zone in an ambipolar light-emitting organic transistor. Nat Mater 5:69–74 Zeng WJ, Zhou XY, Pan XJ, Song CL, Zhang HL (2013) High performance CMOS-like inverter based on an ambipolar organic semiconductor and low cost metals. AIP Advances 3:012101 Zhao X, Pei T, Cai B, Zhou S, Tang Q, Tong Y, Tian H, Geng Y, Liu Y (2014) High ON/OFF ratio single crystal transistors based on ultrathin thienoacene microplates. J Mater Chem C 2:5382 Zhou LS, Wanga A, Wu SC, Sun J, Park S, Jackson TN (2006) All-organic active matrix flexible display. Appl Phys Lett 88:083502 Zhou J, Zuo Y, Wan X, Long G, Zhang Q, Ni W, Liu Y, Li Z, He G, Li C, Kan B, Li M, Chen Y (2013) Solution-processed and high-performance organic solar cells using small molecules with a benzodithiophene unit. J Am Chem Soc 135:8484–8487

Page 21 of 21

AOS TFTs for AMOLED TV
Jin-Seong Park

Contents
Introduction . . . 2
Issues of AOS TFT for AMOLED TV . . . 5
TFT Architecture . . . 5
AOS Materials . . . 7
Gate Dielectrics and Passivation . . . 8
Gate and Contact Materials . . . 10
TFT Reliability and Pixel Compensation Circuit . . . 11
Current Status and Prospects of AMOLED TVs . . . 13
Display Panel Makers for AMOLED TVs . . . 13
Equipment Suppliers for AOS TFTs . . . 15
Summary and Future Prospects . . . 16
Further Reading . . . 17

Abstract

Since the introduction of televisions based on active-matrix organic light-emitting diode (AMOLED) technology, the major technical issues that have been addressed by the panel makers involve the fabrication of high-performance thin-film transistors (TFTs) and the selection of OLED materials, along with their color management (individual RGB patterns vs. a single white OLED with color filters). In the present review, a brief summary of the history and research motivation of AMOLED flat panel displays is provided first. In the ensuing section, the issues related to amorphous oxide semiconductor (AOS) TFTs are examined. While a small range of high-end products incorporating AOS TFT backplanes is already commercially available, some technical issues still need to be resolved in order to have AOS TFTs implemented in AMOLED panels routinely, under well-controlled processes. In this regard, an overview is provided concerning the effect of TFT architecture, AOS materials, gate insulator/passivation layers, and gate/contact metals on the device performance and stability. The achievement of high spatial uniformity is also important for large-area panels, which has led to the development of pixel compensation circuitry. The third section describes the current status of panel manufacturers and equipment suppliers. At present, only two major companies (LG and Samsung) are launching AMOLED TVs in the market, and only a few equipment suppliers provide the necessary AOS-oriented deposition systems (sputtering, plasma-enhanced chemical vapor deposition) and annealing furnaces. Finally, the prospects of AMOLED TVs based on AOS TFTs are discussed. The successful implementation of AOS TFTs is vital to the mass production of high-resolution AMOLED TVs at relatively low cost, which is anticipated to boost the growth of the AMOLED flat panel display industry.

J.-S. Park (*)
Division of Materials Science and Engineering, Hanyang University, Seoul, Republic of Korea
e-mail: [email protected]
# Springer-Verlag Berlin Heidelberg 2015
J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_178-1

Introduction

For the past few years, TFTs based on amorphous hydrogenated silicon (a-Si:H) and low-temperature polysilicon (LTPS) semiconductors have been used as the major switching and driving devices for large-area active-matrix organic light-emitting diode (AMOLED) panels. Despite the relatively high field-effect mobility of LTPS devices, which enables sufficient current to be conveyed to the organic emissive layers, the lack of areal uniformity owing to the presence of grain boundaries and the need for stable devices have been the main hurdles toward the achievement of large-area AMOLED panels. Samsung Electronics have successfully implemented small-/medium-size AMOLEDs in mobile products such as the Galaxy S and Note, which was possible because reasonable areal uniformity of LTPS TFTs can routinely be obtained in screens smaller than 5 ~ 6 in. The next step is to achieve identical levels of areal uniformity in terms of TFT performance and stability in TV-oriented large-area panels (>50 in.), and in this regard many companies and research groups are putting effort into the improvement of LTPS technology and the development of alternative semiconductors (Fig. 1).

Since the introduction of metal oxide semiconductors such as amorphous indium gallium zinc oxide (a-IGZO) in 2004, considerable work has been carried out on the study of amorphous oxide semiconductor devices (Nomura et al. 2004). The possibility of depositing amorphous oxide semiconductors (AOSs) by conventional sputtering methods makes them especially attractive for large-area AMOLED backplane applications. Consequently, a-IGZO and related semiconductors are emerging as promising replacements for silicon-based TFT technology in the field of flat panel displays. Figure 1 depicts the recent history of AOS TFT development and the related prototype displays, showing that extremely rapid progress has been made during the past decade or so.


Fig. 1 Historical overview of AOS TFT development for AMOLED applications

Although the research and development of oxide semiconductors has a relatively short history (Fig. 1), the major flat panel display (FPD) manufacturers (e.g., Samsung, LG, Sharp, and AUO) already announced in 2012 that they would undertake major investments in large AMOLED production based on oxide semiconductor devices. In 2013, 55-in. AMOLED TVs were released in the market by LG Electronics, with the option to choose between flat or curved screens. As anticipated, their competitor Samsung Electronics also launched 55-in. FHD AMOLED TVs of the same grade later that year. Very recently, the two leading companies have expressed opposite opinions regarding the production of large AMOLED TVs. While LG Electronics aggressively launched various AMOLED TV models, including 55-in. FHD (1920 × 1080) and 65- and 77-in. UHD (3840 × 2160) grades, Samsung Display on the other hand pulled back from large AMOLED TV production and decided to concentrate on sustaining their market share in the domain of small-/medium-size AMOLEDs while also maintaining their dominance in the active-matrix liquid crystal display (AMLCD) TV market.

Going back to CES 2012, LG and Samsung demonstrated identical 55-in. AMOLED TVs with 3D HD capability. The interesting fact is that although the two companies came up with the same AMOLED TV specifications, the adopted technologies were totally different, as shown in Fig. 2. LG demonstrated large-area (>50 in.) AMOLED TVs for the first time based on AOS TFT backplanes. In addition, they selected different OLED materials (white OLED and tandem layers), color patterning, and encapsulation techniques. After this event, other manufacturers (Sharp, SEL, AUO, and BOE) have competitively demonstrated AMOLED TV panels based on AOS TFTs, using generation 6 ~ 8 glass substrates.


Fig. 2 55-in. AMOLED prototype TV exhibitions with identical specifications (top left, Samsung; top right, LG), and a comparison of the AMOLED process techniques of the Samsung and LG AMOLED TVs (red: technical drawbacks) (Chaji and Nathan 2014)

Although several AMOLED TV prototypes based on AOS TFTs have been exhibited since then, LG Electronics are the sole AMOLED TV vendor (based on AOS TFTs) in the present market. This implies that the mass production of large AMOLED panels is technically challenging and that specific OLED technology must be employed in order to accompany AOS TFT backplanes. For instance, LG use white OLED as the light source in combination with color filters, which is quite similar to AMLCD technology. Several issues concerning the processing of TFT backplanes and device characteristics still need to be addressed in order to take full control of AOS materials. For example, a proper TFT structure needs to be configured for AMOLED TVs because it determines the process cost and device reliability. Another concern is the achievement of high-performance TFTs with high reliability, which is critical for the fabrication of viable products. The technical details will be discussed in the following sections, together with a summary of the current industrial trends and prospects regarding the commercialization of AMOLED TVs.


Issues of AOS TFT for AMOLED TV

AOS TFTs exhibit relatively high field-effect mobility (μFE > 10 cm2/Vs), a high on/off ratio of the drain current, and an almost nonexistent leakage current, well below femtoampere (fA) levels. The main advantage is the possibility of achieving high field-effect mobility at a relatively low cost compared to LTPS technology, which involves the use of expensive excimer laser equipment. The low leakage current of AOS TFTs allows the use of sufficiently large drain voltages to provide current to the emissive layer, whereas relatively small drain voltages must be used in the case of LTPS transistors in order to minimize the off-state leakage current. In this section, the issues related to TFT structure and processes, including AOS materials and compensating circuitry, are discussed.
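To put these mobility figures in context, the short Python sketch below estimates the saturation current a pixel TFT can deliver to an OLED using the textbook square-law model ID = (1/2)·μFE·Cox·(W/L)·(VGS − VTH)^2. All device parameters (200 nm SiO2 dielectric, W/L = 5, 5 V overdrive) are illustrative assumptions for the example, not values taken from this chapter.

# Square-law estimate of the TFT saturation current available to an OLED pixel.
# I_D = 0.5 * mu * Cox * (W/L) * (V_GS - V_TH)**2; all parameters below are
# illustrative assumptions, used only to compare the three technologies.

EPS0 = 8.85e-14        # F/cm, vacuum permittivity
EPS_SIO2 = 3.9         # relative permittivity of SiO2
T_OX = 200e-7          # cm, assumed 200 nm gate dielectric

def sat_current(mu, w_over_l, overdrive):
    """Saturation drain current (A) for mobility mu in cm2/Vs."""
    cox = EPS0 * EPS_SIO2 / T_OX        # F/cm^2
    return 0.5 * mu * cox * w_over_l * overdrive**2

for name, mu in [("a-Si:H", 0.5), ("AOS (a-IGZO)", 10.0), ("LTPS", 80.0)]:
    i_d = sat_current(mu, w_over_l=5.0, overdrive=5.0)
    print(f"{name:13s} mu = {mu:5.1f} cm2/Vs -> I_D = {i_d * 1e6:7.2f} uA")

With these assumed numbers, a-Si:H delivers well under 1 μA while an AOS channel reaches the ~10 μA range, a level far more comfortable for OLED pixel drive; this is the practical meaning of the mobility advantage discussed above.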

TFT Architecture

The TFT configuration plays a critical role in deciding the pixel aperture ratio, the number of photomasks, and the parasitic capacitance. As shown in Fig. 3, many AOS TFTs have been reported with the inverted-staggered back-channel etch (BCE) structure, similar to that of conventional a-Si:H TFTs, owing to the small number of photomasks and its cost-effectiveness (conventional a-Si:H TFT plants can be reused). Unfortunately, the semiconducting properties of AOSs are easily compromised by the process conditions and by exposure to plasma treatment or humid environments. In addition, the source-drain formation process in the BCE structure inevitably exposes the semiconductor surface to the etchant, which is detrimental to the device reliability. Moreover, reasonable areal uniformity of the active channel layer thickness must be achieved over large glass substrates (> Gen. 8), since the thickness of the semiconducting layer is an important parameter that determines the device's electrical properties. This is a quite challenging task, since the source-drain etch process may not guarantee a uniform thickness of the remaining channel layer over such large areas. An inverted-staggered structure using an etch stopper (ES) layer is thus more appropriate, as it reduces the channel damage during the source-drain etch process. Although the number of photomasks increases in order to pattern the ES layer, the resulting TFTs exhibit relatively more reliable performance and are less susceptible to the ensuing processes, as shown in Fig. 3. The ES layer generally consists of a silicon oxide (SiOx) layer deposited by plasma-enhanced chemical vapor deposition (PECVD). The relatively low hydrogen content of SiOx compared to SiNx, which used to be the ES material of choice in a-Si:H technology, makes SiOx more suitable for direct deposition over AOS semiconductors: hydrogen is well known to increase the number of free carriers in AOSs and to deteriorate their semiconducting properties. However, the ES structure has drawbacks such as high parasitic capacitance and a large number of photomasks.


Fig. 3 Representative AOS TFT configurations: BCE, etch stopper, and coplanar top gate. The process- and device-related issues are summarized

The major drawbacks of bottom-gate structures adopting an etch stopper layer include a relatively large parasitic capacitance in the pixel circuit and the difficulty of forming small transistors, owing to the overlap between the semiconductor and the source-drain electrodes above the etch stopper (the channel length cannot be reduced below 8 μm (Yang et al. 2014)). Recently, self-aligned coplanar top-gate structures have been developed in order to decrease the large parasitic capacitance and the number of photomasks, as shown in Fig. 3 (Morosawa et al. 2011). A few panel makers have successfully demonstrated large-size AMOLED panels based on self-aligned top-gate TFTs. To realize such TFT structures, two key issues need to be considered: the formation of a low-resistivity n+ layer and a gate insulator growth process that induces minimal damage to the underlying semiconductor. First, the n+ layer can be formed by carrying out plasma treatment directly on the exposed AOS surface. Second, the gate insulator may consist of a SiOx/SiNx tandem structure grown by plasma-enhanced chemical vapor deposition (PECVD) or of aluminum oxide (Al2O3) films grown by atomic layer deposition (ALD) or reactive sputtering. Nevertheless, the use of SiOx/SiNx stacks and Al2O3 layers still requires considerable process optimization to obtain reasonable device performance and reliability, and proper equipment must be available for large-size production (> Gen. 8). The TFT configuration must therefore be determined based on the above considerations, in order to allow the mass production of large-area AMOLEDs.
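The overlap-capacitance penalty can be made concrete with a parallel-plate estimate, C_ov = ε0·εr·W·L_ov/t_ox, applied to each gate/source-drain overlap region. The Python sketch below contrasts a hypothetical etch-stopper overlap with a self-aligned one; the geometry values are invented round numbers, not figures from a specific process.

# Parallel-plate estimate of gate-to-source/drain overlap capacitance.
# The overlap lengths below are hypothetical, chosen only to contrast
# an etch-stopper (ES) layout with a self-aligned top-gate layout.

EPS0 = 8.85e-14   # F/cm, vacuum permittivity
EPS_R = 3.9       # SiO2 gate dielectric (assumed)
T_OX = 200e-7     # cm, 200 nm (assumed)
W = 20e-4         # cm, 20 um channel width (assumed)

def overlap_cap(l_ov_um):
    """Overlap capacitance (F) for an overlap length given in micrometers."""
    return EPS0 * EPS_R * W * (l_ov_um * 1e-4) / T_OX

for label, l_ov in [("etch-stopper layout", 3.0), ("self-aligned top gate", 0.3)]:
    c_fF = overlap_cap(l_ov) * 1e15
    print(f"{label:22s} L_ov = {l_ov:3.1f} um -> C_ov = {c_fF:5.1f} fF")

Under these assumptions the ES overlap contributes roughly an order of magnitude more parasitic capacitance per transistor than the self-aligned geometry, which is the motivation for the top-gate development described above.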

(Figure 4 comprises a periodic table marking the main oxide semiconductor cations (solid circles) and candidate dopant elements (dashed circles), annotated "IGZO … X+IZO, X+ZTO, X+ITO … ITZO: looking for high mobility & stability," together with transfer curves (Vd = 10 V) of four AOS TFTs. The device parameters reported in the figure are tabulated below.)

Parameter               | (a) a-IGZO | (b) a-IGZO (modified composition) | (c) a-ITZO   | (d) c-IGO
Mobility (cm2/Vs)       | 11.5       | 24.2                              | 30.9         | 23.8
Vth (V)                 | 0.27       | 0.32                              | 0.97         | −0.07
S factor (V/decade)     | 0.30       | 0.13                              | 0.21         | 0.30
Chemical durability     | Poor       | Poor                              | High to PAN  | High to HF and PAN

HF: hydrofluoric acid; PAN: a mixture of phosphoric acid, acetic acid, and nitric acid.

Fig. 4 Combinatorial multi-compound AOS materials on the periodic table of elements (solid circle, main oxide semiconductors; dashed circle, dopants for AOS). Transfer curves (Vd = 10 V) of emerging AOS TFTs and their characteristics ((a) a-IGZO TFT, (b) modified a-IGZO TFT, (c) a-ITZO TFT, (d) c-IGO TFT) (Ari and Shiraishi 2012)

AOS Materials

Since the introduction of a-IGZO as a promising semiconductor for display backplanes, many research groups have investigated alternative AOS materials to improve the TFT mobility and reliability (Park et al. 2012, 2014; Fortunato et al. 2012; Kwon et al. 2011). Devices based on a-IGZO generally exhibit a field-effect mobility of ~10 cm2/Vs. However, the need for higher-mobility (~30 cm2/Vs) AOS TFTs is increasing for small-size AMOLED applications, because high current-driving capability is required in the embedded circuitry, and high-resolution products necessitate the implementation of relatively small TFTs in order to reduce the pixel size.


Developments in AOSs include X + IZO, X + ZTO, and X + ITO semiconductors (X: dopant elements indicated by dashed circles in Fig. 4), aimed at achieving high mobility and reliability. For example, alloying indium-based semiconductors with transition elements such as Hf, Ta, Zr, or Y results in improved device reliability, because the alloying element binds strongly with oxygen in the AOS matrix. Also, the addition of some group III and IV elements may enhance the mobility by generating excess free electrons. As shown in Fig. 4, two main approaches are possible in order to achieve high-mobility AOS TFTs. One is to modify the cation composition of a-IGZO toward a relatively high indium (In) content, because the In-O binding energy is weak and results in a higher concentration of oxygen vacancies, which in turn generates excess free carriers. A high mobility of 24.2 cm2/Vs was reported for such a composition adjustment (In:Ga:Zn = 2:2:1), compared to 11.5 cm2/Vs for conventional a-IGZO TFTs with a cation composition of 1:1:1 (Fig. 4). The other approach is to use a different base material, for instance by including tin (Sn) or reducing the zinc (Zn) content. High mobility values of 30.9 and 23.8 cm2/Vs have been achieved with a-ITZO (amorphous indium tin zinc oxide) and c-IGO (crystalline indium gallium oxide), respectively (Ari and Shiraishi 2012; Fuh et al. 2014; Song et al. 2014).

Gate Dielectrics and Passivation

The mobility and reliability of AOS TFTs are the major parameters that need to be considered for display applications, and the gate insulator and passivation materials play a key role in both the performance and the stability of the devices. The major gate insulators are PECVD SiOx and SiNx layers. The growth of PECVD SiNx is a routine process that is well developed in the field of a-Si:H. However, AOS TFTs adopt PECVD SiOx gate insulators, since the relatively large hydrogen content of SiNx accelerates the degradation of AOS properties upon prolonged operation. Therefore, if one needs to place a dielectric material directly in contact with the AOS layer, either below or above it, the material of choice is low-hydrogen SiOx. Thus, most manufacturers have settled on a single PECVD SiOx layer or a tandem PECVD SiNx/SiOx stack as the gate insulator in AMOLED panels. As shown in Table 1, different research groups have evaluated the effect of various insulator materials on the properties and stability of AOS devices (Su et al. 2009). Although various high-k insulators have also been studied, the field-effect mobility is not greatly improved, even though the subthreshold swing (SS) values improve in some cases. Indeed, the AOS material itself is most influential as far as the device's field-effect mobility is concerned.

In addition to gate dielectrics, the development of appropriate passivation layers for AOS TFTs is required in order to prevent the deterioration of device performance and reliability through the adsorption of moisture and oxygen molecules from the external environment. The protective layer has to contain relatively small amounts of hydrogen and act as an effective gas diffusion barrier.


Table 1 Comparison of IGZO TFTs with various gate dielectrics

Gate dielectric  | Operating voltage (V) | VT (V)    | μFE (cm2/Vs) | SS (V/decade) | Ion/Ioff     | μCI (μA/V2)
HfLaO            | 2                     | 0.22      | 25           | 0.076         | 5 × 10^7     | 12.6
SiO2 (a)         | 30                    | 1.01      | 51.7         | 0.25          | 1.88 × 10^8  | 1.2
SiO2             | 1.5                   | 0.11      | 3.1          | 0.063         | >10^8        | 0.16
AlTiO            | 2–20                  | 12        | 11–15        | 0.2–0.25      | >10^7        | 0.6
SiO2 (b)         | 10                    | 0.5       | 104          | 0.25          | >10^8        | 10.2
Si3N4 (c)        | 20                    | 3.25      | 5.1          | 0.68          | 3 × 10^7     | 0.12
Si3N4            | 20                    | 5         | 10           | 0.23          | >10^8        | 0.52
SiO2             | 30                    | 2         | 24.5         | –             | 6 × 10^7     | 0.56
SiO2             | 30                    | 5.9       | 35.8         | 0.59          | 4.9 × 10^6   | 0.82
Si3N4/TiO2 (d)   | 30                    | 5         | 9.9 ~ 1.8    | 0.22          | 3 × 10^7     | 0.16
Ba0.5Sr0.5TiO3   | 3                     | 0.5 ± 0.1 | 10 ± 1       | 0.06 ± 0.01   | 8 × 10^7     | 1.12
Y2O3             | 6                     | 1.4       | 12           | 0.2           | 10^8         | 1.24

(a) GIZO = 1:2:2, Pdep = 0.7 Pa, ds = 40 nm, PO2 = 1.5 mbar, annealed at 150 °C
(b) ITO/IGZO channel
(c) IGZO/Cu-doped IGZO channel
(d) For TiO2 = 8 nm
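One way to read the operating-voltage column of Table 1 is through the areal gate capacitance, Cox = ε0·εr/t: since the induced channel charge is Q = Cox·VGS, a higher-permittivity or thinner dielectric needs proportionally less gate voltage for the same charge. The Python sketch below illustrates the scaling; the relative permittivities and the common 100 nm thickness are assumed round values, not those of the cited studies.

# Areal capacitance Cox = eps0 * eps_r / t and the gate voltage needed to
# induce a fixed channel charge. Permittivities and thickness are assumptions.

EPS0 = 8.85e-14      # F/cm, vacuum permittivity
T_NM = 100.0         # nm, common assumed dielectric thickness
Q_TARGET = 1.0e-6    # C/cm^2, assumed channel charge to induce

dielectrics = {"SiO2": 3.9, "Si3N4": 7.0, "HfLaO": 20.0}  # assumed eps_r

for name, eps_r in dielectrics.items():
    cox = EPS0 * eps_r / (T_NM * 1e-7)          # F/cm^2
    v_needed = Q_TARGET / cox                   # V for the assumed charge
    print(f"{name:6s} Cox = {cox * 1e9:6.1f} nF/cm^2 -> V = {v_needed:5.2f} V")

With these round numbers, the high-k dielectric reaches the same channel charge at roughly a fifth of the gate voltage of SiO2, mirroring the few-volt operation of the HfLaO and BST entries in Table 1.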

Current panel manufacturers apply single SiOx or tandem SiOx/SiNx stacks as passivation layers. A few alternatives in terms of materials, structures, and processes have also been suggested, as shown in Fig. 5; however, conventional SiOx or SiOx/SiNx combinations remain the materials of choice owing to their process compatibility with a-Si:H technology. Figure 5a shows a schematic diagram of AOS TFTs using aluminum oxide (AlOx) grown by atomic layer deposition (ALD) as the protective film; this material contains relatively low concentrations of hydrogen, and the process is free from plasma damage (Yang et al. 2010; Park et al. 2013). AlOx is also an excellent gas diffusion barrier that prevents the infiltration of water and oxygen molecules. However, the ALD process is limited by the fact that only small-/medium-size equipment (< Gen. 6) is available to date, and by its relatively low growth rate and throughput. Figure 5b shows an example involving the use of a titanium oxide (TiOx) passivation layer, formed by oxidizing the Ti adhesion layer with an oxygen plasma after etching the upper Mo source-drain electrode (Seo et al. 2009). This is an efficient way to obtain a highly stable and robust passivation that also protects the underlying active layer from any possible damage during the source-drain patterning. Reactive sputtering methods using a metal target may also be used to deposit a stable metal oxide passivation layer such as AlOx, as in Fig. 5c (Morosawa et al. 2011). The use of metal oxide films as protective layers is quite promising for next-generation displays due to the cost-effectiveness of the process, provided that reasonable film properties (small hydrogen content and low gas permeability) can be achieved over large areas (> Gen. 8).

Fig. 5 Schematic cross-sectional views of AOS TFTs with various passivation concepts ((a) ALD AlOx passivation, (b) metal-oxidation passivation (e.g., TiOx), (c) Al reactive-sputtering passivation (e.g., AlOx)) (Morosawa et al. 2011; Park et al. 2013; Seo et al. 2009)

Gate and Contact Materials

In the case of a-Si:H TFTs, PECVD may be used to deposit all the important layers, such as the semiconductor, gate insulator, passivation, and highly conductive (n+) Si. The latter is usually deposited on top of the semiconductor in order to reduce the contact resistance between the active layer and the source-drain electrodes. In AOS materials, on the other hand, the reduction of the contact resistance is achieved by post-treatments such as plasma treatment (Park et al. 2007; Ahn et al. 2009). Ever-increasing display sizes and resolutions also require low-resistance wiring so as to reduce signal delays over long distances, which necessitates low-resistivity electrodes such as Al and copper (Cu). The power lines should also have low resistivity and be able to withstand relatively large current densities, to avoid degradation by Joule heating in the panel. Thus, as the panel size increases, Cu metal layers become appropriate for the gate and S/D electrodes. As summarized in Table 1, the device parameters do not depend on the type of contact material, except for some cases where the mobility may vary a little (Wu and Chien 2013). Although Cu has the lowest electrical resistivity among the metals in practical use, the diffusion of Cu into the adjacent layers is a critical problem, even at low temperature (~200 °C), and accelerates the degradation of the device properties. As shown in Fig. 6, the application of proper Cu diffusion barriers is essential for preserving the electrical performance and stability of AOS TFTs. For instance, AUO used Ti/Al/Ti electrode stacks in 37-in. AMLCD panels (Hung et al. 2010), Samsung Electronics applied Cu-based bus lines in 15-in. AMLCDs (Lee et al. 2008), and LG Display have adopted Cu-based electrodes as the gate and S/D layers in 55-, 65-, and 77-in. AMOLED TVs (Oh et al. 2013).


Fig. 6 (Top) Schematic diagram of AOS TFTs (inverted-staggered structure) using Cu electrodes. (Bottom) Summary of candidate Cu diffusion barrier materials (○, good; ×, bad)

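The wiring argument above can be quantified with a lumped RC estimate for a gate bus line, delay ≈ R·C/2 with R = ρ·L/(w·t). In the Python sketch below, the resistivities are standard bulk values, while the line geometry and total load capacitance are hypothetical numbers for a 55-in.-class panel.

# Lumped RC delay estimate for a gate bus line across a large panel.
# R = rho * L / (w * t); distributed-line delay ~ R * C / 2.
# Geometry and load capacitance are illustrative assumptions.

RHO = {"Al": 2.7e-6, "Cu": 1.7e-6}   # ohm*cm, bulk resistivities

L_LINE = 120.0      # cm, line length across a ~55-in. panel (assumed)
W_LINE = 10e-4      # cm, 10 um line width (assumed)
T_LINE = 0.3e-4     # cm, 300 nm metal thickness (assumed)
C_LOAD = 300e-12    # F, total capacitive load per line (assumed)

for metal, rho in RHO.items():
    r_line = rho * L_LINE / (W_LINE * T_LINE)   # ohm
    delay = 0.5 * r_line * C_LOAD               # s
    print(f"{metal}: R = {r_line / 1e3:5.1f} kOhm -> RC/2 = {delay * 1e6:4.1f} us")

Even with identical geometry, Cu cuts the line delay by roughly the resistivity ratio (~1.6×), which matters when the row time of a UHD panel is only a few microseconds.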

TFT Reliability and Pixel Compensation Circuit

Device reliability is a crucial issue for AOS TFTs, as a reasonable lifetime of the AOS backplane must be guaranteed in order to fabricate durable AMOLED TVs. Recent investigations in both academia and industry have focused intensively on understanding the device degradation mechanisms and on finding ways to improve the stability of AOS TFTs under bias stress. Four practical stress conditions are applied by default: negative and positive gate bias, high temperature, illumination, and humid environments (Park et al. 2014; Shin et al. 2009; Jeong et al. 2008). In addition, constant current stress (CCS) has also been used as a metric for the evaluation of AMOLED-oriented devices. Upon application of the above stresses, AOS TFTs usually exhibit shifts of the threshold voltage (Vth) without much mobility degradation (Nomura et al. 2010). In fact, a-Si:H TFTs exhibit similar Vth shifts under bias temperature stress (BTS), which is generally understood in terms of two mechanisms: charge trapping and defect creation (Powell 1983, 1989). In AOS TFTs, several groups have suggested plausible origins of instability, such as the presence of oxygen vacancies, hole trapping, and electron injection at the semiconductor/gate dielectric interface; other possible mechanisms are still under investigation.


Many researchers have attempted to improve the stability of AOS TFTs using various approaches, including optimized structures (Kwon et al. 2010), different gate insulator materials/processes (Jung et al. 2010), quasi-impermeable passivation layers (Kim et al. 2011), the growth of defect-free semiconductors (Kim et al. 2010), and post-annealing/heat treatments (Ji et al. 2011). For large-size AMOLED TV applications, it is important to examine the stability of Vth in AOS TFTs under constant current stress and negative bias illumination stress (NBIS), because two types of transistor exist in each pixel: the switching transistor, which provides charge to the storage capacitance, and the driving transistor, which conveys current to the emissive layer. As shown in Fig. 7, the effect of the external environment on a-IGZO TFTs was studied under positive-bias CCS and NBIS. To simulate the usual AMOLED panel structure, the devices were subjected to electrical stress using unsealed, resin-sealed (WVTR: 10−1–10−2 g/m2·day), and glass-sealed (WVTR:

Gen. 8 substrates, which can deposit not only AOS channels but also reactively sputtered AlOx layers. The latter contain relatively low hydrogen concentrations and can be suitable for passivation layers.


Fig. 10 PECVD system (in-line type, Gen. 8, Applied Materials, Inc.) and sputtering system (cluster type, Gen. 8, AKT or ULVAC)

Fig. 11 Future prospects of AMOLED TVs: technical issues and potential panel makers

Applied Materials, Inc. have also developed PECVD equipment for Gen. 8 to deposit SiOx gate insulators and passivation layers. The optimization of the deposition conditions is still in progress. The main issues involve the areal uniformity (thickness and composition) of the AOS materials, low-damage deposition and high areal uniformity of the SiOx films, and the development of low-hydrogen passivation layers. In other words, to achieve the routine fabrication of large-size AMOLED TVs, the equipment must be upgraded accordingly, so as to meet the process requirements that will result in the most uniform and stable AOS TFT backplanes in terms of performance and reliability.

Summary and Future Prospects Although a few leading panel makers have already launched AMOLED TVs, AMLCD TVs are still predominant in the TV market. Figure 11 shows why the panel makers have been guarded about the prospects of AMOLED TVs.


At present, intensive research is still ongoing concerning the origin of AOS device instability, the development of alternative high-mobility AOS materials, and innovative pixel-circuit designs for AMOLED applications. Considerable efforts are also being made concerning the process development of low-resistivity electrodes and effective diffusion barriers, as well as the optimization of gate insulator/passivation properties. Unlike AOS TFTs, LTPS technology is well established in the domain of small/medium AMOLED panels owing to the high mobility and good stability of LTPS devices. However, the need for high-resolution, low-cost, and large-size displays keeps increasing, and in this regard it is hoped that AOS TFTs will allow the realization of AMOLED TVs satisfying all of the above criteria.

Further Reading
Ahn BD, Shin HS, Kim GH et al (2009) A novel amorphous InGaZnO thin film transistor structure without source/drain layer deposition. Jpn J Appl Phys 48:03B019
Ari T, Shiraishi Y (2012) Manufacturing issues for oxide TFT technologies for large-sized AMOLED displays. SID Symp Dig Tech 42:756–759
Chaji R, Nathan A (2014) LTPS vs oxide backplanes for AMOLED displays: system design considerations and compensation techniques. SID Symp Dig Tech 45:153–156
Chen C-Y, Lin L-F, Lee J-Y, Wu W-H, Wang S-C et al (2013) A 65-inch amorphous oxide thin film transistors active-matrix organic light-emitting diode television using side by side and fine metal mask technology. SID Symp Dig Tech 44:247–250
Fortunato E, Barquinha P, Martins R (2012) Oxide semiconductor thin-film transistors: a review of recent advances. Adv Mater 24:2945–2986
Fuh CS, Liu PT, Fan YS et al (2014) Performance improvement for high mobility amorphous indium-zinc-tin-oxide thin-film transistors. SID Symp Dig Tech 45:1017–1020
Hung MC et al (2010) International workshop on TAOS, Tokyo
In HJ, Kwon OK (2009) External compensation of nonuniform electrical characteristics of thin-film transistors and degradation of OLED devices in AMOLED displays. IEEE Electron Dev Lett 30(4):377–379
Jeong JK, Yang HW, Jeong JH et al (2008) Origin of threshold voltage instability in indium-gallium-zinc oxide thin film transistors. Appl Phys Lett 93:123508
Ji KH, Kim JI, Jung HY et al (2011) Effect of high-pressure oxygen annealing on negative bias illumination stress-induced instability of InGaZnO thin film transistors. Appl Phys Lett 98:103509
Jung JS, Son KS, Lee KH et al (2010) The impact of SiNx gate insulators on amorphous indium-gallium-zinc oxide thin film transistors under bias-temperature-illumination stress. Appl Phys Lett 96:193506
Kim HD, Chung HJ, Berkeley BH, Kim SS (2009) Emerging technologies for the commercialization of AMOLED TVs. Info Dis 9:18–22
Kim HS, Park KB, Son KS et al (2010) The influence of sputtering power and O2/Ar flow ratio on the performance and stability of Hf-In-Zn-O thin film transistors under illumination. Appl Phys Lett 97:102103
Kim SI, Kim SW, Kim CJ, Park J-S (2011) The impact of passivation layers on the negative bias temperature illumination instability of Hf-In-Zn-O TFT. J Electrochem Soc 158(2):H115–H118
Kim TG, Kim ST, Yu SH et al (2014) A novel power saving technology for OLED TV with external TFT compensation. SID Symp Dig Tech 45:728–731
Kwon JH (2013) RGB color patterning for AMOLED TVs. Info Dis 2:12–15
Kwon JY, Son KS, Jung JS et al (2010) The impact of device configuration on the photon-enhanced negative bias thermal instability of GaInZnO thin film transistors. Electrochem Solid-State Lett 13(6):H213–H215
Kwon JY, Lee DJ, Kim KB (2011) Review paper: transparent amorphous oxide semiconductor thin film transistor. Electron Mater Lett 7(1):1–11
Lee JH, Kim DH, Yang DJ et al (2008) World's largest (15-inch) XGA AMLCD panel using IGZO oxide TFT. SID Symp Dig Tech 39:625–628
Lin C-C, Lee K-Y, Chen C-C, Lin H-S, Chang L-H, Lin Y-H (2014) A new pixel circuit to compensate panel non-uniformity and OLED degradation on large IGZO AMOLED panel. SID Symp Dig Tech 45:1013–1016
Mo YG, Kim M, Kang CK et al (2011) Amorphous-oxide TFT backplane for large-sized AMOLED TVs. J Soc Inf Disp 19(1):16–20
Morosawa N, Ohshima Y, Morooka M, Arai T, Sasaoka T (2011) A novel self-aligned top-gate oxide TFT for AM-OLED displays. SID Symp Dig Tech 42:479–482
Nam WJ, Shim JS, Shin H-J, Kim J-M et al (2013) 55-inch OLED TV using InGaZnO TFTs with WRGB pixel design. SID Symp Dig Tech 44:243–246
Nomura K, Ohta H, Takagi A, Kamiya T, Hirano M, Hosono H (2004) Room-temperature fabrication of transparent flexible thin-film transistors using amorphous oxide semiconductors. Nature 432:488–492
Nomura K, Kamiya T, Kikuchi Y et al (2010) Comprehensive studies on the stabilities of a-In-Ga-Zn-O based thin film transistor by constant current stress. Thin Solid Films 518:3012–3016
Oh CH, Shin HJ, Nam WJ et al (2013) Technological progress and commercialization of OLED TV. SID Symp Dig Tech 44:239–242
Park J-S, Jeong JK, Mo YG, Kim HD (2007) Improvements in the device characteristics of amorphous indium gallium zinc oxide thin-film transistors by Ar plasma treatment. Appl Phys Lett 90:262106
Park J, Maeng WJ, Kim H-S, Park J-S (2012) Review of recent developments in amorphous oxide semiconductor thin-film transistor devices. Thin Solid Films 520:1679–1693
Park S-H, Ryu M-K, Oh H, Hwang C-S, Jeon J-H, Yoon S-M (2013) Double-layered passivation film structure of Al2O3/SiNx for high mobility oxide thin film transistors. J Vac Sci Technol B 31(2):020601
Park J-S, Kim H, Kim I-D (2014) Overview of electroceramic materials for oxide semiconductor thin film transistors. J Electroceram 32:117–140
Powell MJ (1983) Charge trapping instabilities in amorphous silicon-silicon nitride thin-film transistors. Appl Phys Lett 43(6):597–599
Powell MJ (1989) The physics of amorphous-silicon thin-film transistors. IEEE Trans Electron Dev 36(12):2753–2763
Seo H-S, Bae J-U, Kim D-H, Park Y et al (2009) Reliable bottom gate amorphous indium-gallium-zinc oxide thin-film transistors with TiOx passivation layer. Electrochem Solid-State Lett 12(9):H348–H351
Shin JH, Lee JS, Hwang CS et al (2009) Light effects on the bias stability of transparent ZnO thin film transistors. ETRI J 31:62–64
Song JH, Kim KS, Mo YG et al (2014) Achieving high field-effect mobility exceeding 50 cm2/Vs in In-Zn-Sn-O thin-film transistors. IEEE Electron Dev Lett 35(8):853–855
Su NC, Wang SJ, Chin A (2009) High-performance InGaZnO thin-film transistors using HfLaO gate dielectric. IEEE Electron Dev Lett 30(2):1317–1319
Tanabe T, Amano S, Miyake H et al (2012) New threshold voltage compensation pixel circuits in 13.5-inch quad full high definition OLED display of crystalline In-Ga-Zn-Oxide FETs. SID Symp Dig Tech 43:88–91
Wu HC, Chien CH (2013) High performance InGaZnO thin film transistor with InGaZnO source and drain electrodes. Appl Phys Lett 102:062103
Yamada K, Nomura K, Abe K et al (2014) Examination of the ambient effects on the stability of amorphous indium-gallium-zinc oxide thin film transistors using a laser-glass-sealing technology. Appl Phys Lett 105:133503
Yang S, Cho DH, Ryu MK et al (2010) High-performance Al-Sn-Zn-In-O thin-film transistors: impact of passivation layer on device stability. IEEE Electron Dev Lett 31(2):144–146
Yang JY, Lee S, Cho SJ et al (2014) A new process and structure for oxide semiconductor LCDs. SID Symp Dig Tech 45:469–472

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_179-2 # Springer-Verlag Berlin Heidelberg 2015

Bias and Light-Induced Instabilities in a-IGZO Thin Film Transistors
Piero Migliorato (a, b)* and Jin Jang (a)
a Advanced Display Research Center and Department of Information Display, Kyung Hee University, Seoul, South Korea
b Department of Engineering, Electrical Engineering Division, Cambridge University, Cambridge, UK
*Email: [email protected]

Abstract

Instabilities due to various causes have been a major topic of investigation for all thin-film transistors (TFTs). Permanent or metastable changes in the current–voltage characteristics are induced by combinations of gate and drain voltage, aging, and environmental effects, as well as by exposure to light. Amorphous indium gallium zinc oxide (a-IGZO) TFTs are no exception, and these phenomena were reported at an early stage of the technology's development. These devices are relatively stable under moderate positive (negative) gate bias at room temperature (|VGS| < 15 V), but show increasing positive (negative) shifts of the transfer characteristics under higher positive (negative) VGS. The effects are referred to as positive bias stress (PBS) and negative bias stress (NBS). The latter causes negligible changes under operating conditions. More importantly, large (ΔV > 2 V) negative shifts of the transfer characteristics are observed when the devices are exposed to near- or above-bandgap radiation. The shifts increase dramatically when a negative bias is applied simultaneously. This effect is referred to as negative bias under illumination stress, or NBIS, and is obviously important since in display applications the devices operate in the presence of light. PBS and NBIS have received great attention in the scientific literature, which has led to an improved, but by no means complete, level of knowledge of the mechanisms involved. The main experiments, the characteristics of these effects, and their interpretation through the models that have been developed are reviewed in this chapter. Unanswered questions and possible future directions of research in this area are identified.

Introduction

Instabilities due to various causes have been a major topic of investigation for all thin-film transistors (TFTs). Permanent or metastable changes in the current–voltage characteristics are induced by combinations of gate and drain voltage, aging, and environmental effects, as well as by exposure to light. Amorphous indium gallium zinc oxide (a-IGZO) TFTs are no exception, and these phenomena were reported at an early stage of the technology's development (Suresh and Muth 2008; Shin et al. 2009). These devices are relatively stable under moderate positive (negative) gate bias at room temperature (|VGS| < 15 V, for SiO2 gate thicknesses of ~200 nm), but show increasing positive (negative) shifts of the transfer characteristics under higher positive (negative) VGS. The effects are referred to as positive bias stress (PBS) and negative bias stress (NBS). The latter causes negligible changes under operating conditions. More importantly, large (ΔV > 2 V) negative shifts of the transfer characteristics are observed when the devices are exposed to near- or above-bandgap radiation. The shifts increase dramatically when a negative bias is applied simultaneously. The effect is referred to as negative bias under illumination stress, or NBIS, and is obviously important since in display applications the devices operate in the presence of light. The two main instability effects, PBS and NBIS, exemplified in Fig. 1a, b, have received a lot of interest over the years. The work up to 2011 was reviewed by Kamiya et al. (2010) and Brotherton (2013).

Fig. 1 Transfer characteristics for PBS and NBIS for VDS = 0.1 V, W/L = 2,000 μm/10 μm, d = 50 nm, tox = 200 nm. (a) PBS: VGS-stress = +25 V, VDS-stress = 0 V. (b) Negative bias stress under illumination: VGS-stress = −20 V, VDS-stress = 0 V, light intensity 9,600 nit with the spectrum shown in the inset; TFT parameters as in (a) (Courtesy of J Jang, Advanced Display Research Center, Kyung Hee University, Seoul, Korea)

Fig. 2 Structure of a-IGZO TFT (bottom-gate stack on a glass or plastic substrate: molybdenum gate, PECVD SiO2 gate dielectric, DC-magnetron-sputtered a-IGZO active layer, PECVD SiO2 etch stopper, molybdenum source/drain electrodes, PECVD SiO2 passivation layer, and pixel electrode)

PBS was attributed to trapping of accumulated electrons in the gate dielectric (Suresh and Muth 2008; Fung et al. 2009) or near the gate/active layer interface (Nomura et al. 2009). For NBIS in top-gate ZnO devices, trapping of photogenerated holes in the gate dielectric or at the interface was proposed (Shin et al. 2009). A similar mechanism was described for bottom-gate a-IGZO TFTs (Nomura et al. 2010). On the other hand, the creation of ionized oxygen vacancies was proposed on a theoretical basis by Ryu et al. (2010) and, from experiments, by Oh et al. (2010) and Chowdhury et al. (2010). The role of adsorbates such as oxygen and water at the back interface was found to be important for both PBS (Jeong et al. 2008) and NBIS (Lee K-H et al. 2009), underscoring the need for back passivation. Considerable improvements in fabrication technology have occurred over the years. These involve the use of cluster-tool deposition systems, enabling fabrication of the gate oxide/active layer/passivation without breaking vacuum in order to minimize interlayer contamination; high-quality PECVD SiO2 gate dielectrics and back passivation by SiO2/SiNx or Al2O3; DC-magnetron-sputtered active layers; and etch stoppers to avoid back-surface contamination. A typical device structure is shown in Fig. 2. These improvements, however, did not eliminate instabilities, and work on the subject has kept expanding through to the present day. Probably because extrinsic process effects have now been minimized, more papers have appeared explaining NBIS instabilities through the creation of oxygen vacancy-related states (Migliorato et al. 2012b, 2014; Chowdhury et al. 2013a; Flewitt and Powell 2014) or through a combination of vacancy creation and trapping at interfaces (Oh et al. 2011; Hung et al. 2014; Ueoka et al. 2014).


This situation is not new in the field of TFT instabilities, the notable antecedent being the controversy between the various instability models for a-Si:H TFTs discussed in the relevant 1990s literature. Differentiating between the various effects is not straightforward since, for instance, in both trapping and state creation the time dependence of the voltage shift can be fitted to stretched exponentials, while the thinness of the active layers employed (20–50 nm) makes it difficult to separate bulk from interface effects. Comparing and contrasting different papers and specific interpretations is made difficult by the large literature, the different conditions employed in experiments and measurements, and the variety of arguments used. This article therefore focuses as much as possible on the quantitative characterization of the physical mechanisms, through a combination of experiments and modeling, so as to provide what are in our view the most convincing interpretations of the observed phenomena at the present stage of development.

The properties of defects in a-IGZO are undoubtedly important in the context of TFT instabilities. A wealth of theoretical calculations based on density functional theory (DFT) is available for intrinsic defects in single-crystal ZnO, yielding formation energies, electronic energy levels, and migration barriers (see, for instance, Janotti and Van de Walle 2009). Relatively few theoretical data are available for defects in a-IGZO. Published work on this subject is summarized in section "Defects in Oxide Semiconductors" and can provide broad guidelines for the study of instabilities in a-IGZO. However, due to the limitations in computing power and the complexity of amorphous devices, the accuracy of these studies falls far short of what is required to match the sensitivity of the device to relatively low defect levels. Characterization tools for a-IGZO TFTs, based on the analysis of their I–V and C–V characteristics, were recently reviewed (Migliorato et al. 2014). Detailed analyses of PBS and NBIS are presented in sections "Negative Bias Under Illumination Stress" and "Positive Bias Stress." Section "Control and Suppression of Instabilities" is devoted to methods to control and suppress instability effects. The last section presents the conclusions and outlook. A list of symbols is provided in a separate section.
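The stretched-exponential time dependence mentioned above, ΔVth(t) = ΔV0·{1 − exp[−(t/τ)^β]}, fits both trapping and state-creation data, which is precisely why time dependence alone cannot discriminate between the models. A minimal least-squares fit is sketched below in Python; the "measured" points are synthetic data generated with invented parameters, purely for illustration.

import numpy as np
from scipy.optimize import curve_fit

def stretched_exp(t, dv0, tau, beta):
    """Stretched-exponential threshold-voltage shift dVth(t)."""
    return dv0 * (1.0 - np.exp(-(t / tau) ** beta))

# Synthetic stress data (invented for illustration): times in seconds.
rng = np.random.default_rng(0)
t = np.array([1e2, 5e2, 1e3, 5e3, 1e4, 5e4, 1e5])
dvth = stretched_exp(t, 4.0, 2e4, 0.45) + rng.normal(0.0, 0.05, t.size)

# Fit the three parameters (saturation shift, time constant, exponent).
(dv0, tau, beta), _ = curve_fit(stretched_exp, t, dvth, p0=(3.0, 1e4, 0.5))
print(f"dV0 = {dv0:.2f} V, tau = {tau:.2e} s, beta = {beta:.2f}")

The same three-parameter form fits a trapping-limited and a defect-creation-limited data set equally well, so the fitted (τ, β) values by themselves carry no mechanistic signature.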

List of Symbols
Cox: Gate dielectric capacitance per unit area
d: Active layer thickness
EC: Conduction band minimum energy (EC = 3.1 eV)
ED: Shallow donor energy
EF: Fermi energy
EF0: Fermi energy for VGS = VFB
EV: Valence band maximum energy (EV = 0)
IDS: Drain current
L: Channel length
N(E): Density of states (DOS) [cm^−3 eV^−1]
NC: Effective density of states of the conduction band (NC = 5 × 10^18 cm^−3)
ND: Shallow donor concentration
ND20: Double donor concentration
nFB: Flat band electron concentration
nFB-exp: Extracted flat band electron concentration
nFB-sim: Calculated flat band electron concentration
Ngt: Tail states density for E = EC
Ngd: Deep states density for E = EC
QB: Bulk charge per unit area
Qint: Interface charge per unit area
Tgt: Tail states slope parameter (K)
Tgd: Deep states slope parameter (K)
tox: Gate dielectric thickness
VFB: Flat band voltage
VTH: Threshold voltage
VGS: Gate to source voltage
VDS: Drain to source voltage
Von: Turn-on voltage (corresponding to the minimum measurable current in the IDS–VGS characteristic)
VO0, VO+, VO2+: Neutral, singly, and doubly ionized oxygen vacancy
W: Channel width
ε0: Permittivity in vacuum (ε0 = 8.85 × 10^−14 F cm^−1)
εs: a-IGZO dielectric constant (εs = 11.9 ε0)
εox: SiO2 dielectric constant (εox = 3.9 ε0)
μ: Electron mobility
ψS: Surface potential

Defects in Oxide Semiconductors

Most of the theoretical work has been done on a close relative of a-IGZO, single crystal ZnO. The oxygen vacancy, VO, has attracted the most investigation. This center can exist in a neutral (VO0), singly ionized (VO+), and doubly ionized (VO2+) state. One would expect that, due to the repulsion between the two electrons in neutral VO0, the transition energy from the doubly to the singly ionized state, ε(2+/+), is lower than that from the singly ionized to the neutral state, ε(+/0). Instead, theory predicts that these energies are inverted. The effect is due to the strong lattice relaxation present in a polar material (Lany and Zunger 2005; Janotti and Van de Walle 2007a; Clark et al. 2010): the Zn cations surrounding the vacancy move inward when the center is neutral (occupied by two electrons), which increases the overlap between the cation wave functions. Since these are the basis from which the vacancy electron states are constructed, the binding energy increases, so that ε(+/0) < ε(2+/0) < ε(2+/+). Such a defect is said to have a negative correlation energy U. Diagrams illustrating the dependence of the charge state upon the Fermi level position for a positive and a negative U double donor are shown in Fig. 3. The plots of Fig. 3 are obtained from the equation

ΔH_{D,q}(EF, μX) = H_{D,q} − H0 + μX + q·EF      (1)

which applies to vacancy formation. In this equation, ΔH is the formation energy of a defect D with charge number q (= 0, 1, 2, etc.), H_{D,q} is the total energy of the crystal with the defect, H0 is the total energy of the perfect crystal, μX is the chemical potential of the removed atom, and EF is the Fermi energy, that is, the energy of the electron reservoir, taken with respect to the valence band maximum (EV). In the case of positive U (Fig. 3, top), one can identify two transition energies, corresponding to the intersections between the straight lines, with ε(2+/+) < ε(+/0). For negative U (Fig. 3, bottom) there are three transition energies, in order of increasing energy ε(+/0) < ε(2+/0) < ε(2+/+): the first and the last involve formation of the singly ionized state, D+, which is unstable due to the presence of either D0 or D2+ states with a lower formation energy at all EF. D+ can, in principle, be observed under nonequilibrium conditions, e.g., under illumination.

Fig. 3 Formation energy vs. EF for a double donor. The lowest energy states are shown dashed. (Top): positive correlation energy U; transition energies: ε(+/0), ε(2+/+). (Bottom): negative correlation energy U; transition energies: ε(+/0), ε(2+/0), ε(2+/+). For U < 0, D+ is unstable since there are always neutral or doubly ionized states with a lower energy

In fact, optical electron spin resonance (ESR) measurements on electron-irradiated single crystal ZnO are consistent with the presence of singly ionized oxygen vacancies, VO+, with a transition energy ε(+/0) = EV + 0.9 eV (Laiho et al. 2008), in agreement with theoretical predictions. However, only neutral or doubly ionized oxygen vacancy states ought to be observed in thermal equilibrium, since, according to the theory, these are deep levels.

Fig. 5 … (Δε ≈ 400, fr ≈ 1.2 kHz at 20 °C) and BPLC-R1 (Δε ≈ 50, fr ≈ 15 kHz at 20 °C) at the specified frequencies. λ = 633 nm (Redrawn from Peng et al. (2014))
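Returning to Eq. 1 and Fig. 3: the level orderings can be made concrete with a short numeric sketch. The defect energies below are hypothetical values chosen only to reproduce the positive-U and negative-U cases; they are not DFT results.

```python
# Sketch of Eq. 1 for a double donor: formation energy vs. Fermi level for
# charge states q = 0, +1, +2. Energies are hypothetical, chosen only to
# reproduce the two level orderings of Fig. 3.
import numpy as np

def formation_energy(EF, H_Dq, q):
    """dH(D,q) = H_Dq + q*EF (H0 and mu_X absorbed into H_Dq here)."""
    return H_Dq + q * EF

EF = np.linspace(0.0, 3.1, 312)  # Fermi level from EV = 0 to EC = 3.1 eV

# Hypothetical defect energies (eV) at EF = 0 for q = 0, +1, +2:
positive_U = {0: 3.0, 1: 2.0, 2: 1.5}   # gives e(2+/+) < e(+/0)
negative_U = {0: 3.0, 1: 2.6, 2: 1.5}   # gives e(+/0) < e(2+/0) < e(2+/+)

for label, H in (("positive U", positive_U), ("negative U", negative_U)):
    # The stable charge state at each EF is the one of lowest formation energy.
    dH = np.array([formation_energy(EF, H[q], q) for q in (0, 1, 2)])
    stable_q = np.argmin(dH, axis=0)  # 0 -> D0, 1 -> D+, 2 -> D2+
    states = sorted(set(stable_q))
    print(label, "-> stable states vs EF:", ["D0 D+ D2+".split()[s] for s in states])
```

For the negative-U parameter set, D+ never appears among the stable states, exactly as discussed above.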


From Eq. 10, an optimal operation temperature (Top) exists at which the Kerr constant reaches its maximum value. Generally speaking, Top is governed by several parameters, such as the drive frequency and the relaxation frequency together with its temperature dependence. Figure 5b depicts the temperature-dependent Kerr constant at different frequencies for two PSBP composites. In the high-temperature region, the Kerr constants are insensitive to the frequency, as the curves corresponding to different frequencies overlap well with each other. This is because the frequency term in Eq. 10 can be ignored when fr >> f, so that K is unaffected by frequency. As the temperature decreases, fr decreases exponentially (Eq. 9). The high-frequency curve (whose f is closer to fr) bends down first due to the dramatically reduced Δε. As a result, its maximum Kerr constant is smaller, which leads to a higher operating voltage. For the lower-frequency curves, the peak Kerr constant and the bending-over phenomenon occur at a lower temperature, as Fig. 5b depicts.
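Since Eq. 10 itself is not reproduced in this excerpt, the qualitative behavior can still be sketched under stated assumptions: a Debye-type dispersion K(f, T) = K0(T)/[1 + (f/fr(T))²], an Arrhenius-like fr(T), and a K0 that grows as the temperature drops. All parameter values below are illustrative only.

```python
# Qualitative sketch of the Kerr constant vs. temperature at several drive
# frequencies. A Debye-type dispersion and an Arrhenius relaxation frequency
# are assumed; every numeric parameter here is illustrative.
import numpy as np

kB = 8.617e-5                      # Boltzmann constant, eV/K

def fr(T, fr_inf=1e13, Ea=0.6):    # relaxation frequency, Hz (assumed)
    return fr_inf * np.exp(-Ea / (kB * T))

def K(f, T, K_ref=12.7e-9, T_ref=293.0):
    # K0 rising as T drops (illustrative 1/T-like growth), m/V^2
    K0 = K_ref * T_ref / T
    return K0 / (1.0 + (f / fr(T)) ** 2)

T = np.linspace(253, 333, 9)       # -20 C to +60 C
for f in (1e2, 1e3, 1e4):          # drive frequencies, Hz
    Topt = T[np.argmax(K(f, T))]
    print("f = %6.0f Hz: K peaks at T = %.0f K" % (f, Topt))
```

Consistent with Fig. 5b, the higher-frequency curves peak at a higher temperature and bend down first as the temperature is lowered.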

Device Configuration

From an application viewpoint, PS-BPLC exhibits three attractive features: no need for an alignment layer, submillisecond response time, and an optically isotropic dark state. The alignment-layer-free feature greatly simplifies the device fabrication process. The fast response time not only produces crisp pictures without image blur but also enables color-sequential displays, which require a 3× higher refresh rate. By eliminating spatial color filters, both the optical efficiency and the resolution density are tripled. This feature is particularly important for reducing the power consumption of a high-resolution display. Therefore, PS-BPLC is a strong contender for next-generation displays. However, a relatively high operating voltage, noticeable hysteresis (Chen et al. 2010), and slow charging time (Tu et al. 2013) still hinder the widespread application of PS-BPLC. Currently, the most fundamental issue for BPLC devices is to reduce the operating voltage to below 10 V. To achieve this goal, extensive efforts have been made on developing large Kerr-constant BPLC materials (Chen et al. 2013; Rao et al. 2011b; Wittek et al. 2012) and new device structures (Rao et al. 2009, 2010; Jiao et al. 2010; Cheng et al. 2011, 2012; Xu et al. 2013).

In-Plane-Switching (IPS) BPLCD

IPS structures, in which the electric fields are primarily in the lateral direction, are commonly used in BPLC devices. Figure 6 shows a planar IPS-based blue-phase LCD, in which w stands for the electrode width and l for the electrode gap. Before applying a voltage, no light is transmitted and the BPLC cell appears dark under crossed polarizers. Upon applying a voltage, the substantial lateral electric fields induce birefringence along the electric field direction, provided that the LC host has a positive Δε. As E increases, the induced birefringence increases. As a result, the linearly polarized incident light experiences more phase retardation and gradually transmits through the analyzer.

Fig. 6 Operation principle of a planar IPS BPLCD between crossed polarizers: (a) E = 0 and (b) E ≠ 0


The planar IPS structure has the intrinsic merits of wide viewing angle and small color shift. However, the operating voltage of planar IPS is still too high to enable a-Si TFT driving since the in-plane field cannot penetrate deeply into the LC bulk. To overcome this problem, protruded (Rao et al. 2009) and etched (Rao et al. 2010) electrodes have been implemented to enhance the penetration depth of the electric field.

Protruded Electrodes

Protruded electrodes are effective in dramatically lowering the operating voltage since they enable the horizontal electric fields to penetrate deeply into the bulk LC layer (Rao et al. 2009). Figure 7a depicts protruded rectangular electrodes with a height (h) from the bottom surface. Shown in Fig. 7b, c are the simulated VT curves of planar and protruded IPS cells with protrusion height h = 1 and 2 μm for IPS-2/4, employing two different BPLC materials: JC-BP01M (Δns = 0.154, Es = 4.05 V/μm) and JC-BP06 (Δns = 0.09, Es = 2.2 V/μm). Compared to planar IPS, the protruded IPS shows about the same transmittance because the protruded electrodes mainly enhance the field penetration depth in the electrode gaps but do not change the field distribution above the electrodes. By comparing these curves, we find that increasing the protrusion height is an effective way to reduce the operating voltage. However, higher protruded electrodes are more difficult to fabricate. On the other hand, although JC-BP06 has a larger Kerr constant than JC-BP01M (33.8 nm/V² vs. 17.1 nm/V² at λ = 550 nm), it shows a higher voltage than JC-BP01M in the planar IPS-2/4 structure (40 V vs. 34 V). This is because JC-BP06 has a relatively small Δns and IPS-2/4 has a relatively shallow penetration depth; therefore, a higher voltage is required to reach peak transmittance. Hence, a large Kerr constant is not the only goal to optimize for in BPLC material development; a large Δns is also essential. However, when protruded electrodes are employed, JC-BP06 exhibits a lower operating voltage than JC-BP01M because the relatively small Δns of JC-BP06 can be compensated by the increased penetration depth. As Fig. 7c shows, for 2-μm protruded IPS-2/4 employing JC-BP06, the operating voltage is reduced to 10 V while keeping 80 % transmittance. This is an important milestone toward a-Si TFT driving for IPS-BPLC.
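The (Δns, Es) pairs quoted above are the parameters of the extended Kerr expression commonly used in the literature to describe PS-BPLC, Δn_ind(E) = Δns·[1 − exp(−(E/Es)²)]. A minimal sketch comparing the two materials on this basis follows; the field grid is an arbitrary choice, and no attempt is made to reproduce the full VT simulations.

```python
# Sketch: induced birefringence of the two BPLC materials quoted above,
# using the extended Kerr expression dn(E) = dns * (1 - exp(-(E/Es)**2))
# to which these (dns, Es) parameter pairs belong. The field grid is
# illustrative only.
import numpy as np

materials = {
    "JC-BP01M": {"dns": 0.154, "Es": 4.05},   # Es in V/um (from the text)
    "JC-BP06":  {"dns": 0.090, "Es": 2.20},
}

def dn_induced(E, dns, Es):
    """Extended Kerr model: saturates at dns for E >> Es."""
    return dns * (1.0 - np.exp(-(E / Es) ** 2))

E = np.linspace(0.0, 10.0, 6)  # electric field, V/um
for name, p in materials.items():
    print(name, " dn(E):", np.round(dn_induced(E, p["dns"], p["Es"]), 3))
```

Note how JC-BP06 approaches its (smaller) saturation birefringence at a much lower field, which is consistent with its lower operating voltage once protruded or etched electrodes deepen the field penetration.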

Fig. 7 (a) Cell structure and parameter definitions of protruded IPS electrodes and simulated VT curves of the protruded IPS-BPLC with different electrode dimensions employing (b) JC-BP01M and (c) JC-BP06 (25 °C and λ = 550 nm) (Redrawn from Xu et al. (2013))


Fig. 8 (a) Cell structure and parameter definitions of etched IPS and simulated VT curves of etched-IPS cells with different electrode dimensions: (b) l/w = 2 and (c) l/w = 3 and 4 using JC-BP06 (25 °C and λ = 550 nm) (Redrawn from Xu et al. (2013))

Etched Electrodes

Besides protruded electrodes, the etched-electrode structure is another effective method to reduce the operating voltage (Rao et al. 2010). Figure 8a shows an IPS cell with an etched depth (h′). Taking IPS-2/4 as an example, the etching takes place along the 4-μm electrode gaps. As a result, the electric fields penetrate both above and below the 2-μm ITO electrodes. These doubled penetrating fringe fields help to reduce the operating voltage. Similar to a protruded IPS, an etched IPS using JC-BP06 also shows a lower operating voltage than one using JC-BP01M, since the bottom fringe fields provide extra phase retardation to compensate for the relatively small Δns of JC-BP06. Therefore, we use JC-BP06 as an example to demonstrate the effectiveness of this etched-electrode approach in our simulations. Figure 8b compares the simulated VT curves of the IPS-2/4 structure with h′ increased from 0, 1, 2, to 4 μm. As the etching depth increases, the operating voltage (Von) decreases rapidly (

X(t) = k · Σ_λ [R(λ, t) · S(λ) · x̄(λ) · Δλ]
Y(t) = k · Σ_λ [R(λ, t) · S(λ) · ȳ(λ) · Δλ]      (2)
Z(t) = k · Σ_λ [R(λ, t) · S(λ) · z̄(λ) · Δλ]

where

k = 100 / Σ_λ [S(λ) · ȳ(λ) · Δλ]

S(λ): the relative spectral power distribution of the illuminant
x̄(λ), ȳ(λ), z̄(λ): color matching functions of the CIE 1931 standard colorimetric observer
R(λ, t): spectral reflectance factor of the DUT at time t
Δλ: wavelength interval
λ: wavelength, ranging from 380 nm to 780 nm
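Eq. 2 is a straightforward numerical summation over wavelength. A sketch is given below; the illuminant, color-matching-function stand-ins, and reflectance data are dummy arrays, and in practice the tabulated CIE data would be used.

```python
# Sketch of Eq. 2: CIE tristimulus values of the DUT from its spectral
# reflectance factor. The arrays below stand in for tabulated data: S for
# the illuminant, xbar/ybar/zbar for the CIE 1931 color matching functions,
# R for the measured reflectance factor (380-780 nm, interval d_lambda).
import numpy as np

def tristimulus(R, S, xbar, ybar, zbar, d_lambda):
    k = 100.0 / np.sum(S * ybar * d_lambda)          # normalization of Eq. 2
    X = k * np.sum(R * S * xbar * d_lambda)
    Y = k * np.sum(R * S * ybar * d_lambda)
    Z = k * np.sum(R * S * zbar * d_lambda)
    return X, Y, Z

# Dummy data on a 10 nm grid, for illustration only:
wl = np.arange(380, 781, 10)
S = np.ones_like(wl, dtype=float)                    # equal-energy illuminant
xbar = np.exp(-0.5 * ((wl - 595) / 35.0) ** 2)       # crude CMF stand-ins
ybar = np.exp(-0.5 * ((wl - 555) / 40.0) ** 2)
zbar = np.exp(-0.5 * ((wl - 445) / 25.0) ** 2)
R = np.full_like(S, 0.4)                             # flat 40 % reflector
print("X, Y, Z = %.1f, %.1f, %.1f" % tristimulus(R, S, xbar, ybar, zbar, 10.0))
```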

Filter Photometric Reflection Measurement

Firstly, set up the measurement geometry and warm up the light source. Secondly, measure the CIE tristimulus values of the reference white standard, X′WS, Y′WS, and Z′WS, with a colorimeter. Thirdly, measure the CIE tristimulus values of the reference black standard, X′BS, Y′BS, and Z′BS. Fourthly, refresh the DUT to erase the previous image by displaying full-screen white and full-screen black alternately several times; the DUT then displays the test pattern. Finally, measure the CIE tristimulus values of the DUT at time t: X′(t), Y′(t), and Z′(t). The CIE tristimulus values at time t, X(t), Y(t), and Z(t), are obtained as

X(t) = [X′(t) − X′BS] · (XWS − XBS) / (X′WS − X′BS) + XBS
Y(t) = [Y′(t) − Y′BS] · (YWS − YBS) / (Y′WS − Y′BS) + YBS      (3)
Z(t) = [Z′(t) − Z′BS] · (ZWS − ZBS) / (Z′WS − Z′BS) + ZBS

where XWS, YWS, ZWS and XBS, YBS, ZBS are the calibrated CIE tristimulus values of the reference white standard and the reference black standard, respectively.
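The two-point correction of Eq. 3 maps raw colorimeter readings onto the calibrated white and black standards. A minimal sketch, with illustrative numbers only:

```python
# Sketch of Eq. 3: correcting raw colorimeter readings with the calibrated
# reference white and black standards. All tristimulus triples are
# (X, Y, Z) tuples; primed values are raw readings, unprimed calibrated.
def correct_tristimulus(raw_t, raw_ws, raw_bs, cal_ws, cal_bs):
    corrected = []
    for Xp, XWSp, XBSp, XWS, XBS in zip(raw_t, raw_ws, raw_bs, cal_ws, cal_bs):
        corrected.append((Xp - XBSp) * (XWS - XBS) / (XWSp - XBSp) + XBS)
    return tuple(corrected)

# Illustrative numbers only:
X_t = correct_tristimulus(
    raw_t=(30.0, 32.0, 28.0),     # X'(t), Y'(t), Z'(t) of the DUT
    raw_ws=(85.0, 88.0, 80.0),    # raw reading of the white standard
    raw_bs=(3.0, 3.2, 2.9),       # raw reading of the black standard
    cal_ws=(86.5, 90.0, 82.0),    # calibrated white-standard values
    cal_bs=(2.5, 2.6, 2.4),       # calibrated black-standard values
)
print("X(t), Y(t), Z(t) =", tuple(round(v, 1) for v in X_t))
```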

Ghosting Index Measurement

Firstly, follow the spectral reflection measurement procedure (section “Spectral Reflection Measurement”) or the filter photometric reflection measurement procedure (section “Filter Photometric Reflection Measurement”). Secondly, the DUT displays the 2 × 2 checker-board test pattern shown in Fig. 4 and then displays full-screen white. Thirdly, measure the spectral reflectance factors or the CIE tristimulus values at the four test points, marked in Fig. 5, on the full-screen white. The reflectance factor is defined as

R(t) = Y(t) / Yn      (4)

where Yn is the tristimulus value of the light reflected from a perfect reflecting diffuser illuminated by the same light source as the DUT. The contrast ratio is then obtained as

CR(t) = RW(t) / RB(t)      (5)

where RW(t) and RB(t) are the reflectance factors of the DUT's white state and black state, respectively.

Fig. 4 2 × 2 checker-board test pattern

Fig. 5 Full-screen white test pattern with four test points marked. D and W indicate the horizontal and vertical lengths of the active area; the numbers 1–4 stand for the test points

Additionally, CIELAB color space coordinates are obtained as

L*(t) = 116 · R(t)^(1/3) − 16
a*(t) = 500 · [(X(t)/Xn)^(1/3) − R(t)^(1/3)]      (6)
b*(t) = 200 · [R(t)^(1/3) − (Z(t)/Zn)^(1/3)]

where Xn and Zn are the tristimulus values of the light reflected from a perfect reflecting diffuser illuminated by the same light source as the DUT, and L*(t) is the lightness of the DUT. Finally, the ghosting index is obtained as

GI = Max(L*1, L*2, L*3, L*4) − Min(L*1, L*2, L*3, L*4)      (7)

where L*1, L*2, L*3, and L*4 are the lightness values of the four test points in Fig. 5, and Max(·) and Min(·) denote the maximum and the minimum among them.
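A compact sketch of the ghosting-index chain (Eqs. 4–7) follows; only the cube-root branch of the CIELAB lightness formula is used here, which is valid for reflectance factors above roughly 0.009, and the readings are invented for illustration.

```python
# Sketch of Eqs. 4-7: lightness of the four test points and the ghosting
# index. Yn is the perfect-diffuser reference.
def lightness(Y, Yn):
    R = Y / Yn                     # reflectance factor, Eq. 4
    return 116.0 * R ** (1.0 / 3.0) - 16.0

Yn = 100.0
Y_points = [62.1, 61.7, 60.9, 62.4]          # illustrative readings, points 1-4
L = [lightness(Y, Yn) for Y in Y_points]
GI = max(L) - min(L)                          # ghosting index, Eq. 7
print("L* =", [round(v, 2) for v in L], " GI = %.2f" % GI)
```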

Angular Optical Performance Measurement

There are two ways to evaluate the angular optical performance of electronic-paper displays. One is to measure the lightness range at some critical incidence angles, viewing angles, and azimuth angles. The other is to first set a threshold value of lightness range and then determine the threshold-based viewing angle within which the DUT displays a lightness range larger than or equal to the threshold value. SEMI D68-0512 (Semiconductor Equipment and Materials International 2012) recommends a hemispherical or collimated light illumination measurement geometry for evaluating the angular optical performance of the DUT. Illuminate the display at an inclination angle θS and place the LMD at a viewing angle θD for azimuth angles ϕ = 0°, 90°, 180°, and 270°. Following the spectral or filter photometric measurement procedure (section “Spectral Reflection Measurement” or “Filter Photometric Reflection Measurement”), derive the lightness ranges at the designated incidence, viewing, and azimuth angles, where the lightness range is obtained as

LR(t) = L*W(t) − L*B(t)      (8)

where L*W(t) and L*B(t) are the lightnesses of the white state and the black state of a display, respectively.

Bistability Measurement

Following the spectral measurement procedure (section “Spectral Reflection Measurement”) or the filter photometric measurement procedure (section “Filter Photometric Reflection Measurement”), record two measurements at different times, t1 and t2. According to SEMI D68-0512 (Semiconductor Equipment and Materials International 2012), time t1 is less than 10 s and t2 is more than 5 min. Finally, bistability is obtained as


Bistability = [1 − |L*(t1) − L*(t2)| / L*(t1)] × 100 %      (9)

where L*(t1) and L*(t2) are the lightnesses of a display at t1 and t2, respectively.
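Eq. 9 in code form, with illustrative lightness values:

```python
# Sketch of Eq. 9, with t1 < 10 s and t2 > 5 min per SEMI D68-0512:
def bistability(L1, L2):
    return (1.0 - abs(L1 - L2) / L1) * 100.0   # percent

print("Bistability = %.1f %%" % bistability(72.4, 70.9))  # example values
```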

Summary

In this article, measurement geometry, spectral reflection, filter photometric reflection, ghosting index, angular optical performance, and bistability measurement methods have been presented for evaluating the optical performance of e-Paper displays. The proposed test methods may help provide a better understanding of e-Paper displays and further guide the design and manufacture of useful and comfortable e-Paper products.

Further Reading

Commission Internationale de l'Eclairage (2004) Colorimetry. CIE 15:2004, 3rd edn
International Electrotechnical Commission (2011) IEC 61747-6-2:2011 – measuring methods for liquid crystal display modules – reflective type
Semiconductor Equipment and Materials International (2012) SEMI D68-0512: test methods for optical properties of electronic paper displays
Video Electronics Standards Association (2001) VESA 2.0 – flat panel display measurements standard
Wen BJ, Lai YY, Tsai TC, Ke MT, Chen CY, Liu TS (2011) Visual fatigue difference analysis between reflective and emissive backlight of electronic-book readers. In: Proceedings of China Display/Asia Display 2011, Kunshan

Imaging Light Measurement Devices

Michael E. Becker and Jürgen Neumeier

Contents Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Light Measurement Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Spot-LMDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . “How Much Light” Does the Detector See? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Measurement of Lateral Variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Imaging LMDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Realization of Imaging LMDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Imaging Optics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Spectral Sensitivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Photoinduced Charges and Currents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dynamic Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . LMD Artifacts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stray Light . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Vignetting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Detector Array Artifacts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . System Characterization, Correction, and Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Imaging Optics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Detector Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Calibration of the Instrument . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


M.E. Becker (*) Display-Messtechnik & Systeme, Rottenburg am Neckar, Baden-Württemberg, Germany e-mail: [email protected]; [email protected] J. Neumeier Instrument Systems GmbH, Munich, Germany e-mail: [email protected] # Springer-Verlag Berlin Heidelberg 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_196-1


Application-Relevant Aspects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stray Light . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Polarization Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Control of Exposure Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Spatial Sampling Artifacts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Typical Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Analysis of Features with Complex Outlines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Evaluation of Uniformity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Measurement of Directional Distribution of Emission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Measurement of Directional Distributions of Transmission and Reflection . . . . . . . . . . . . . . . . Outlook on Future Developments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


Abstract

This section introduces state-of-the-art light measurement devices with optics in general (spot meters) and then presents aspects that are specific to the class of imaging light measurement devices. Their structure is described in detail with a focus on critical aspects and artifacts that are typical of this class of instruments. Characterization of the performance of the instruments, as a basis for correction of nonideal properties, is described. Precautions for the user of imaging light measurement devices are provided in the next subsection in order to assure high-quality measurements. This section concludes with a listing of measurement tasks where the application of imaging light measurement devices has proven to be convenient, as well as new measurements that would, without imaging light measurement devices, require much more instrumental effort.

Introduction

Electronic visual display devices provide information by lateral modulation of luminance and chromaticity across the display area, thus generating contrast with respect to the uniform background of the display screen. Evaluation of the contrast of small features, e.g., alphanumeric characters and symbols, requires scanning a small measurement field across the display surface in small steps, conveniently with motorized translation stages. Evaluation of the uniformity of display screens is achieved by motorized scanning of a number of positions evenly distributed over the surface area of the display screen. Such time-sequential acquisition of measurement values results in long measurement times and requires the object of measurement to remain stable with respect to its optical properties over the time needed to complete the measurement. A more convenient way to measure lateral variations of luminance and chromaticity, as required for characterization of lateral uniformity, is provided by imaging light measurement devices, instruments that capture a regular array of luminance and chromaticity values, similar to images recorded by digital still cameras.


Light Measurement Devices

In the most basic type of light measurement device (LMD), light from a well-defined area element on the device under test (DUT) is guided to a detector element, in most cases via optical lenses, and the electrical output of the detector is, e.g., proportional to the “intensity” (i.e., more specifically, the radiant or luminous power or energy) of the incident light. This class of LMD is colloquially called a spot-LMD.

Spot-LMDs

The light-sensitive area of the detector element (photoelectric transducer) defines the area of measurement when a positive lens is used, as sketched in Fig. 1. In this configuration, the lens aperture and the working distance define the aperture angle α of the receiver (i.e., the combination of detector element and optical system), which should not exceed 5° when directional variations are to be evaluated (see, e.g., IEC 61747). The size of the measurement field together with the working distance defines the measurement field angle β, which is often specified in data sheets of LMDs. Both angles should be well distinguished. Since the measurement field angle is usually given by the size of the detector element (i.e., by the size of the light-sensitive area) and the focal length of the lens, the dimension of the measurement field can be adjusted by variation of the working distance. In such a kind of LMD, the electrical output of the detector is proportional to the time-integrated radiant or luminous power across the measurement field, also called the “measurement spot”; thus, this class of LMD is often called a “spot-LMD.”

Fig. 1 Basic concept of a light measurement device (LMD) with an objective lens for imaging the light from the measurement field to the detector element. The dimensions of the detector element define the measurement field (“spot”). The terminology is according to CIE S023-2013


Fig. 2 Geometrical details for specification of the transfer of radiant (luminous) flux from the area A1 on the object of measurement to the detector element of the LMD

“How Much Light” Does the Detector See?

Since this chapter is about light measurement devices, we will now explore the relation between the radiance (luminance) of the object of measurement and the output signal level (voltage) of the detector element(s). The radiant (luminous) flux Φ1 arising from the field of measurement according to Fig. 2 is given by the following relation:

Φ1 = ∫_Ω1 ∫_A1 L1(θ1) · cos θ1 · dA1 · dΩ1

L1: radiance (luminance) of the area A1
A2: aperture area of the LMD
dw: working distance

With the assumption that the luminance remains constant over the angles of inclination resulting from the condition dw >> diameter(A1, A2), and thus also cos(θ1) ≅ 1, the integral over the solid angle Ω can be replaced by a product, and the flux is approximated by:

Φ1 = ∫_A1 L1 · dA1 · Ω1 = ∫_A1 L1 · dA1 · A2/dw²

The irradiance (illuminance) of the detector, i.e., the flux per unit active detector area, is then reduced by the transmittance of the optical system and the filters, t, and still proportional to the area integral of radiance (luminance) across the measurement field A1. If the radiance (luminance) is constant over A1, the irradiance (illuminance) of the detector is:

Φ1 ∝ L1 · A1 · A2 · t / dw²

The electrical output of the detector is proportional to the irradiance (illuminance) and thus to the luminance of the area A1 on the object of measurement. When all measures are taken to assure this linear relationship, the final absolute value of the luminance for each given LMD is calibrated with a reference source of known luminance (calibration light source). Depending on the spectral sensitivity of the receiver (i.e., lens plus filter plus detector element), a spot meter measures luminance when the overall sensitivity is proportional to the color matching function ȳ according to CIE 1931, or chromaticity when, e.g., three detector elements are provided for the three CIE 1931 color matching functions x̄, ȳ, z̄. Spot-LMDs can thus be designed to measure the luminance (luminance meter) or the chromaticity (colorimeter) of the light coming from the measurement field on the object to be measured, depending on the spectral sensitivity of the detector(s).

Measurement of Lateral Variations Variations of the target quantities of luminance and chromaticity across the surface area of the device under measurement can be evaluated by performing a set of timesequential measurements at different locations (see, e.g., Fig. 3), provided the

Fig. 3 Locations of the field of measurement on the surface area of a device under measurement for evaluation of lateral variations (uniformity) according to IEC 62341


Fig. 4 3D pseudo-color representation of the luminance variation within a triangular symbol and two numerals (illustration courtesy of Instrument Systems GmbH)

property of the DUT (emission or reflection of light) remains constant over the time required for completion of such a set of sequential measurements. The positioning of the measurement field may conveniently be achieved with motorized (e.g., computer-controlled) translation stages. If the plane defined by the translation of the LMD is parallel to the surface area of the object of measurement, the measurement direction (e.g., normal) remains constant for all positions to be measured, e.g., P0 to P8 in Fig. 3. With the above-described approach of mechanical scanning of a surface area, the “lateral resolution” depends on the number of measurement locations and on the size of the measurement field. When a higher lateral resolution is required, the number of measurements must be increased until the measurement fields touch each other or even overlap. Also, the measurement of luminance and chromaticity within areas with irregular contours (i.e., not given by simple geometrical shapes), such as letters and numerals as illustrated in Fig. 4, is a complicated if not impossible task for spot-LMDs with motorized scanning. Both the increase of the time required for “high-resolution” measurements and the limited ability to detect variations over small distances (detection of small features) provide the motivation for an alternative approach to measuring lateral variations of luminance and/or chromaticity.

Imaging LMDs Such an alternative approach can be realized by application of a regular array of detector elements (CCD or CMOS detectors) as they are used in digital still-image cameras (DSCs). With the basic optical setup according to Fig. 5, each element of the detector array provides one output signal which is proportional to the “intensity” of the incident light coming from the object to be measured; more specifically, it is


Fig. 5 Basic concept of an imaging LMD where each detector element corresponds to one area element on the object of measurement, just like in a digital still-image camera (DSC)

proportional to the irradiance incident on each of the detector elements when the spectral power distribution remains constant. Imaging LMDs are based on an array of detectors providing signals that are proportional to the radiant power or energy of the light coming from the array of elementary measurement fields on the DUT. Depending on the spectral sensitivity of the detector elements, a luminance raster image and/or a stack of images for the chromaticity characteristics (R, G, B or X, Y, Z) of the DUT is obtained.

Digital electronic cameras for photography and video, however, do not provide images with such a linear relationship between the luminance of the object and the resulting image data, because the relation between object luminance and pixel intensity is mapped according to a power function in order to take account of the nonlinear sensitivity of the human visual system. This relation is called optoelectronic conversion function (OECF), and its shape is similar to that of a logarithmic curve (y = log x). Figure 6 illustrates the image of the object of measurement, here a display screen with a regular matrix of red, green, and blue picture elements (pixels), as it is projected onto the detector. In the case of a spot-LMD, the output level of the detector is proportional to the integral of the light coming from the DUT pixel area defined by the light-sensitive area of the detector. In the case of the imaging LMD, an output signal is obtained for each area element as illustrated by the white grid in Fig. 6.
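Before data from such a camera could be used photometrically, the OECF described above would have to be inverted to recover values proportional to luminance. The sketch below assumes a pure power-law OECF for simplicity; real camera OECFs (e.g., the piecewise sRGB curve) must be characterized and inverted accordingly.

```python
# Sketch: undoing a power-law OECF to recover values proportional to
# luminance. A pure gamma curve is assumed for illustration; real camera
# OECFs are piecewise and must be characterized per device.
import numpy as np

def linearize(pixel_values, gamma=2.2, full_scale=255.0):
    v = np.asarray(pixel_values, dtype=float) / full_scale  # scale to 0..1
    return v ** gamma                                       # invert the OECF

codes = [32, 64, 128, 255]
print("relative luminance:", np.round(linearize(codes), 4))
```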

Realization of Imaging LMDs Imaging LMDs are composed of three basic components: • Imaging optics that projects regions of the object of measurement onto the array of detector elements (picture elements, pixels)


Fig. 6 Comparison of spot-LMD and imaging LMD. (a) Spot-LMD: RGB subpixels of the DUT as projected on the photo detector. The grey regions are not sensitive to light. The output of the detector is proportional to the sum of the illuminance over the field of measurement (measurement spot). (b) Imaging LMD: RGB subpixels of the DUT as projected on the array of photo detectors (white grid, 5 × 6 elements). The detector array provides 5 × 6 output signals, each of them proportional to the illuminance incident on the corresponding detector pixel

• Two-dimensional array of detector elements (opto-electric transducers)
• Means for matching the spectral sensitivity of the detector elements

Realization of high-quality imaging light measurement devices requires attention to a range of features that will be introduced and discussed below.

Imaging Optics The imaging optics must assure that each detector element only receives light from exactly the corresponding area element on the object of measurement, and the elements in the detector array must not affect each other (as illustrated in Fig. 5). Lateral separation is an essential feature for imaging LMDs.

Spectral Sensitivity

The intrinsic spectral sensitivity of commercially available CCD and CMOS detector arrays is not suited for measurement of luminance and chromaticity since it does not correspond to any of the sensitivities described by the color matching functions x̄, ȳ, z̄ standardized by the CIE 1931 color space. Among them, the function ȳ represents the spectral luminous efficiency curve V(λ) for photopic vision, which is used for evaluation of luminance from spectral power distributions.


Fig. 7 Color matching functions x̄, ȳ, z̄ = f(λ) according to CIE 1931

In order to match the original spectral sensitivity of the detector elements to the CIE 1931 color matching functions illustrated in Fig. 7, transmissive filters with selective absorption are used in combination with the detector array. The resulting spectral sensitivities of the combination of detector, filter, and objective lens must be as close as possible to the color matching functions if the output signals are supposed to be proportional to the tristimulus values X, Y, and Z for a wide range of input spectra. Any other combination of spectral sensitivities can be used as well if the output quantities can be transformed by a linear matrix operation into the target values X, Y, and Z as follows:

[X, Y, Z]ᵀ = M · [S1, S2, S3]ᵀ      (1)

with S1, S2, and S3 being the output levels of a set of detectors with the spectral sensitivities s1, s2, and s3. Such a set of detectors may be realized with red, green, and blue filters as it is the case in digital still-image cameras for photography. The filter characteristics of such devices are rough approximations of the target filter functions and thus cannot be used for measurement purposes. The tristimulus values X, Y, and Z are obtained from the R, G, and B component levels (according to IEC 61966-2-1:1999) as follows:


Fig. 8 Concept of imaging LMD with filters on the detector elements (left) and a set of interchangeable filters for time-sequential measurement of, e.g., tristimulus values, X, Y, and Z. (a) The total spectral sensitivity of the imaging LMD is given by the lens transmission, tlens(λ) and detector spectral sensitivities, si(λ). (b) The total spectral sensitivity of the imaging LMD is given by the lens transmission, tlens(λ), the filter plate spectral transmittance, tx(λ) and detector spectral sensitivity, s(λ).

X = (1/0.17697) · (0.49·R + 0.31·G + 0.20·B)
Y = (1/0.17697) · (0.17697·R + 0.81240·G + 0.01063·B)      (2)
Z = (1/0.17697) · (0.00·R + 0.01·G + 0.99·B)
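Eq. 2 translates directly into a matrix product; the matrix is fixed by the CIE 1931 definition, and only the (R, G, B) input below is an example value.

```python
# Eq. 2 as code: the CIE 1931 RGB -> XYZ conversion. M is fixed by the
# standard; only the (R, G, B) input is an example value.
import numpy as np

M = (1.0 / 0.17697) * np.array([
    [0.49,    0.31,    0.20   ],
    [0.17697, 0.81240, 0.01063],
    [0.00,    0.01,    0.99   ],
])

RGB = np.array([0.30, 0.50, 0.20])     # example linear component levels
XYZ = M @ RGB
print("X, Y, Z =", np.round(XYZ, 4))
```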

The spectral sensitivity of the detector elements can be shaped, for instance, by covering each element with a filter (see Fig. 8, left) to obtain, e.g., a Bayer mosaic pattern of R, G, and B sensitive elements (pixels). With such a detector array, imaging LMDs can be realized that perform measurements of luminance and chromaticity with one exposure, but with reduced lateral resolution. Another approach is based on detector arrays in front of which uniform filter plates are provided that shape the spectral sensitivity of the receiver as a whole, as shown in Fig. 8, right. In such a configuration, transmissive glass filters are usually used to reproduce the color matching functions, with x̄ often realized by two filters, one for the shorter (<500 nm) and one for the longer (>500 nm) wavelength lobe. These filters are typically mounted in a motorized wheel (see Fig. 9) and placed in front of the detector array one after the other in order to acquire four images from which the luminance and the chromaticity of the object of measurement can be obtained pixel by pixel.


Fig. 9 Example for the realization of a filter wheel with six color filters and one opaque position (shutter) for measurement of the dark signal, courtesy of Instrument Systems GmbH

The transmittance vs. wavelength curve of high-quality objective lenses is generally flat enough in the wavelength ranges of interest that it can be neglected in a first consideration. The filter plates used to shape the spectral sensitivity of the imaging LMD are usually made from stained glass instead of polymer foils for reasons of durability. The limited availability of glass types, and thus of spectral absorbance curves, can make approximation of the target functions difficult, and this process requires considerable know-how. The quality of approximation of the target filter functions is characterized by spectral mismatch indices as specified by, e.g., CIE S023. One way to compensate for nonideal filter characteristics is to use more than the minimum of four filters required for realization of the color matching functions x̄, ȳ, z̄, e.g., six filters (see Figs. 9 and 10). With the six-filter approach, the error of chromaticity evaluation of narrow-band monochromatic LEDs (in comparison to spectroradiometric results) can be reduced down to 0.003 xy units (CIE 1931). Figure 10 illustrates the improved approximation of the color matching functions by six color filters. With an increasing number of filters, there is a gradual transition into what is called multispectral imaging, where individual images are taken for small wavelength intervals (e.g., 10 nm). This increases the quality of chromaticity evaluation, but also the time required for completion of the measurement sequence and thus the delay between the single-wavelength components. If the different spectral components have to be measured at the same moment in time, however, wavelength-selective beam-splitter devices can be used to separate the incoming light into the color channels, e.g., red, green, and blue, of which each

one is equipped with a detector array. This option is usually chosen for high-quality movie camera systems.

Fig. 10 Errors in the approximation of the color matching functions x̄, ȳ, z̄ by four filters (x1, x2, y, and z, dashed curves) and by six filters with matrix correction (dotted curves) (Data courtesy of Instrument Systems GmbH)

The quality of chromaticity evaluation, especially in the case of bandwidth-limited narrow spectra (e.g., monochromatic LED emission, LCD screens with LED or quantum-dot backlighting), can be significantly improved by combining a spectroradiometer with the imaging colorimeter, both using the same imaging optics as shown in Fig. 11. From the spectroradiometric chromaticity evaluation, which is performed for one spot within the much larger measurement field, a correction matrix can be derived which is then applied to the complete measurement field of the imaging LMD (Patent 1995). In order to reduce the variation of the filter characteristics across the area of the detector array as much as possible, the filter plates should be introduced into the optical system of the imaging LMD at a location where the beam divergence is as small as possible. The relative spectral sensitivity of both CCD and CMOS detector arrays depends on details of the technology used in the respective device; a general trend for CCD sensors, however, is an increase of sensitivity toward the red end of the spectrum. This requires effective suppression of the IR components in the incoming radiation to avoid false readings and thus measurement errors. On the other hand, some LMDs are specially designed for measurements in the near-IR range of the spectrum.

Fig. 11 Combination of an imaging LMD with a spectroradiometer for improvement of the precision of colorimetric evaluations

Fig. 12 Typical realization of an imaging LMD for measurement of lateral variations of luminance and chromaticity (Courtesy of Instrument Systems GmbH)

An essential requirement for both types of sensors is the uniformity of the physical properties of the detector elements across the sensor area (absolute and relative spectral sensitivity, dark signal, etc.). Any deviation from uniformity has to be characterized and corrected via software as a basis for reliable imaging photometry and colorimetry (Fig. 12).


Photoinduced Charges and Currents

Sensor arrays with CCDs (charge-coupled devices) and photodiodes in CMOS ICs are silicon-based integrated circuits that convert incoming electromagnetic radiation into electrical signals. In the CCD case, electrical charges are generated by absorption of radiation within the semiconductor material, with the number of accumulated charges being proportional to the energy of the radiation (inner photoelectric effect). These charges, unamplified as they are generated, are then transferred (shifted) out of the light-sensitive regions of the sensor for measurement of the corresponding voltage, which relates to the level of detected radiation. In the case of CMOS sensors, incident radiation generates a current in the elementary photodiodes which is proportional to the momentary irradiance of the diode. A wide range of different pixel architectures and output modes are available for CMOS sensor arrays (linear/logarithmic transfer characteristic, active/passive pixel, etc.); in image sensors the photocurrent is usually transformed into a voltage by a transconductance amplifier integrated in the pixel circuit. In both cases, shuttering (i.e., the control of shutter periods and thus the exposure time periods of the detector) critically affects the result of the measurements.

Dynamic Range

The dynamic range of detector elements is typically defined as the (usable) full-well capacity level in relation to the noise level. The full-well capacity is the largest charge a detector element can hold before saturation or nonlinearity occurs. The dynamic range of an LMD is determined by the noise level and the onset of nonlinearity at high radiance (luminance) values, as illustrated in Fig. 13. All contributions to the dark signal of the LMD have to be accurately measured and subtracted from the subsequent measurements. The onset of nonlinearity is usually indicated by the LMD by an overflow/overexposure warning signal. The net dynamic range made available by the detector array and the related electronics can in a next step be digitized and mapped to ranges comprising a minimum of 256 steps (8 bit) and up to 65,536 steps (16 bit). The number of available discrete steps, however, should not be mistaken for the dynamic range of the LMD. A convenient way to expand the dynamic range of a given detector system is based on a series of image acquisitions that are taken with different exposure times and then combined into one final image. Thus, the long exposure periods reproduce dark regions, while short exposure periods correctly capture those regions that would otherwise be overexposed (see, e.g., patent application US5828793, 1998). State-of-the-art imaging LMDs all feature automated multi-exposure high-dynamic-range imaging (HDRI) modes. The dynamic range of an LMD can be evaluated as follows: the instrument is tightly closed to protect the detector array from being illuminated, and in each sensitivity range, the luminance reading that corresponds to the sum of all noise


Fig. 13 Schematic opto-electrical transfer characteristic (i.e., output level vs. irradiance) of a detector element used for light measurement. The range that can be used for light measurement in between noise and saturation levels is further limited by the onset of nonlinearity at the upper end of the curve

contributions is recorded (i.e., noise equivalent luminance). Then the upper allowed level – usually indicated by an overflow signal – is determined with the same range setting. The ratio of these two levels specifies the dynamic range of the detector array. The overall dynamic range of the LMD is determined over the complete range of sensitivity settings. The sensitivity can be changed via amplification of the output signals with different gain settings and/or via the exposure time of the sensor. The readings at the transitions from one sensitivity range to the next may be critical and thus should be checked and specified.
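A minimal sketch of the multi-exposure combination described above: for each pixel, the longest unsaturated exposure is selected and scaled by its exposure time. The saturation threshold, frame data, and 16-bit range are illustrative assumptions.

```python
# Sketch of multi-exposure HDR combination: for each pixel take the longest
# exposure that is not saturated and scale by exposure time. Saturation
# threshold and data are illustrative.
import numpy as np

def combine_exposures(frames, times, saturation=0.95 * 65535):
    """frames: list of 2-D arrays (same scene, dark-signal corrected);
    times: matching exposure times in seconds. Returns relative radiance."""
    result = np.zeros_like(frames[0], dtype=float)
    filled = np.zeros(frames[0].shape, dtype=bool)
    # Walk from the longest to the shortest exposure:
    for frame, t in sorted(zip(frames, times), key=lambda p: -p[1]):
        ok = (frame < saturation) & ~filled
        result[ok] = frame[ok] / t        # counts per second, ~ radiance
        filled |= ok
    return result

rng = np.random.default_rng(0)
scene = rng.uniform(1e2, 1e7, (4, 4))            # "true" radiance map
times = [1e-4, 1e-2, 1.0]
frames = [np.clip(scene * t, 0, 65535) for t in times]
print(np.round(combine_exposures(frames, times) / scene, 2))  # ~1 everywhere
```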

LMD Artifacts The nonideal properties of imaging LMDs for measurement of luminance and chromaticity can only be controlled if they are accurately characterized and then corrected for during the measurements.

Stray Light There are two requirements for the optical system that projects an image of the object to be measured on the detector array: its spectral transmittance characteristics should be well behaved (i.e., achromatic) and, much more important, the light received from each single area element of the object surface shall be not affected by the light from the neighborhood of that area element. This may sound like a very


Fig. 14 Sources of stray light in a compound lens system: lens (glass) surfaces, lens mounts, aperture stops, etc. Both regular and diffuse stray light components effect a degradation of the image as sketched

basic requirement that is also the basis for photographic imaging with high contrast; it is, however, even more important for imaging LMDs than in photography or video recording. In the ideal case, every light ray (or ray bundle) should propagate through the optical system undisturbed and unaffected to form a correct image. In reality, however, each encounter of light with matter is the cause of a variety of light rays that propagate through the optical system in an uncontrolled way and disturb the image, as illustrated in Fig. 14. Such stray light is generated when a light ray hits a glass surface; even when an effective anti-reflection coating is applied, a fraction of the incoming light is reflected at the glass-air interface. In an optical system with, e.g., five lenses, depending on the details of the construction, there is a maximum of ten glass-air interfaces at which a fraction of the light is reflected and then travels through the optics in an uncontrolled way. Reflections at smooth glass-air interfaces are usually mirrorlike reflections, i.e., the inclination of the reflected beam equals that of the incident beam. As soon as a surface is not smooth, e.g., covered with dust particles, scattering occurs, and light rays are formed that propagate in a wide range of directions. A further source of stray light is every non-transmitting material in the lens system, e.g., lens fixtures and mounts, aperture stops, and even the tube that keeps the lenses together. Thus, any individual detector element is not only receiving light from the corresponding area element on the DUT; surrounding areas also contribute to the light to be measured and analyzed. Depending on the conditions of the measurement, the resulting error can be tremendous, as shown in the example of Fig. 13, which may be used as a test case for veiling glare (i.e., light scattered in the optical system forming a veil of light on top of the image). In imaging LMDs that use absorbing filters for shaping of the spectral sensitivity, stray light may also be generated by the filter (glass-air interfaces and adhesive layers in the multilayer stack), which thus must be manufactured with great care to keep stray light as low as possible.


The stray light of photographic lenses can be characterized by a method according to, e.g., ISO 9358 (1994, Veiling Glare Index, VGI) with a black field within a white surround, similar to the situation shown in Fig. 14; alternatively, the glare spread function (GSF) can be evaluated with a small light source within the object plane. The latter alternative provides the distribution of stray light across the detector plane which can be used for numerical correction of stray light.

Vignetting

Another source of uneven irradiance in the plane of the detector array is the natural falloff with cos⁴θ (natural vignetting), where θ is the angle between the optical axis and the beam direction under consideration. This deviation from the uniform case is usually rotationally symmetric about the optical axis and can be corrected numerically by calibration with a known uniform light source (sometimes called “flat-field correction”).
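The cos⁴θ falloff is fully determined by the imaging geometry, so the flat-field gain map can be sketched directly; the focal length and pixel pitch below are assumed example values, and in practice the correction is derived from a measured uniform source.

```python
# Sketch of flat-field correction for natural vignetting: the cos^4(theta)
# falloff is computed per pixel from the geometry and divided out. Focal
# length and pixel pitch are illustrative.
import numpy as np

def cos4_gain(shape, pixel_pitch_mm=0.005, focal_mm=25.0):
    """Gain map that compensates the cos^4 falloff about the optical axis."""
    h, w = shape
    y, x = np.indices(shape)
    # Radial distance of each pixel from the image center, in mm:
    r = np.hypot(x - (w - 1) / 2, y - (h - 1) / 2) * pixel_pitch_mm
    cos_theta = focal_mm / np.hypot(focal_mm, r)
    return 1.0 / cos_theta ** 4        # multiply the raw image by this map

image = np.ones((1080, 1920))
corrected = image * cos4_gain(image.shape)
print("corner gain: %.3f" % cos4_gain(image.shape)[0, 0])
```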

Detector Array Artifacts

When the pixels of a CCD array detector have accumulated too many charges due to intense irradiation or extended exposure, the charges may spill out of the storage capacitors and contribute to the charge of neighboring cells. This effect is called “blooming,” and several means in the architecture of the sensor array have been provided to prevent such blooming or keep it at a low level. Blooming does not occur in CMOS image sensors. When the charges accumulated in a CCD detector array are shifted toward the A/D converters without termination of illumination by a shutter, the generation of charges continues, and depending on the duration of the exposure period and shift period, unwanted charges are generated that let a bright light source appear as a bright vertical streak in the image. When the pixel charges are first saved in columns with light shielding, a bright light source may still generate parasitic charges when the shielding is not perfect (which is never the case in CCDs). This modification of the charge distribution in CCDs during shifting toward the A/D converters, causing bright vertical streaks in the image, is called smearing. It is most pronounced when exposure times are short and close to the time required for shifting of the charges to the end of the columns for A/D conversion. Smearing does not occur in CMOS image sensors since no charges are shifted for data readout. Another nonideal effect that can be found in image sensors is related to certain specific nonuniform patterns that are characteristic of each individual sensor array. This image distortion, which is often called “fixed pattern noise,” has two origins: first, the variation of the dark signal level across the sensor array and, second, the variation of sensitivity of the individual pixels. Those pixels that produce excessive signals in comparison to other, identically irradiated pixels are called “warm” or “hot” pixels. They can be identified by measurements with long exposure times without light, and their output can be disabled or replaced by an average of the


neighboring pixels (“hot pixel correction”). Variation of the dark signal over the sensor array can also be measured and then corrected, pixel by pixel, after a completed exposure. Noise components in electronic imaging have three main causes:

• Shot noise, caused by statistical variations of the power of the incoming irradiation
• Dark signal, i.e., thermally generated charges appearing as an output signal without irradiance
• Read noise, caused by shifting of the charges in a CCD array, by amplification, and by A/D conversion

While shot noise is especially apparent in the case of short exposure times and can be reduced by longer exposure times or by averaging several exposures, the dark signal can be reduced by keeping the detector array at lower temperatures. The readout noise depends on the design of the CCD array, and it can be kept at reasonable levels by careful control of the readout timing and by high-quality electronics on the detector chip and at the periphery.
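A sketch of the two fixed-pattern corrections described above, dark-signal subtraction and hot-pixel replacement, follows; the dark frame, hot-pixel map, and median-of-neighbors strategy are illustrative choices.

```python
# Sketch of fixed-pattern corrections: per-pixel dark-signal subtraction and
# replacement of hot pixels by the median of their neighbors. The hot-pixel
# map would come from long dark exposures during characterization.
import numpy as np
from scipy.ndimage import median_filter

def correct_fixed_pattern(raw, dark_frame, hot_mask):
    img = raw.astype(float) - dark_frame           # dark-signal correction
    neighborhood_median = median_filter(img, size=3)
    img[hot_mask] = neighborhood_median[hot_mask]  # hot-pixel correction
    return img

rng = np.random.default_rng(1)
raw = rng.normal(1000.0, 5.0, (8, 8))
dark = rng.normal(20.0, 1.0, (8, 8))               # measured without light
hot = np.zeros((8, 8), dtype=bool)
hot[2, 3] = True                                   # one known hot pixel
print(correct_fixed_pattern(raw, dark, hot).round(1))
```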

System Characterization, Correction, and Calibration A basis for high-quality imaging LMDs is a good design of the system, careful selection of the components, well-controlled manufacturing, and advanced quality assurance procedures. But even with all due care taken, the deviations from the ideal case remain to be characterized and compensated by the LMD software.

Imaging Optics

Even the best objective lenses with excellent resolution (high and flat MTF) do not provide uniform illuminance for the detector array, among other reasons due to the natural cos⁴θ falloff. This lateral nonuniformity usually varies with the aperture setting of the lens (F#) and with the selected working distance.

Filters The filters have to be manufactured to avoid scattering which would degrade the lateral resolution of the imaging LMD. However, the path length for the light transmitted through the filter layer increases with increasing angle of inclination of the light ray as sketched in Fig. 2. A longer path means increased absorption which may result in a variation of the resulting filter characteristics across the area of the detector array.


Detector Arrays

The relative and absolute spectral sensitivities of the detector elements vary across the array area, as do the dark signals. All of these variations across the area of the detector array have to be characterized with special light sources that are highly uniform with respect to luminance and chromaticity over the light-emitting area. Data arrays have to be worked out during the characterization process that make a numerical correction of the nonideal behavior of the real LMD possible during the actual application. Since the nonideal properties change with exposure time, working distance, and aperture of the optics, the characterization of imaging LMDs may turn out to be a demanding and time-consuming process, but it is the prerequisite for measurement results with low uncertainties.

Calibration of the Instrument

After accurate and detailed characterization of the performance of an imaging LMD, the last step of luminance calibration determines the degree to which systematic errors are reduced and how well the instrument is able to reproduce results of other devices. High-quality luminance calibration requires well-designed, transparent, and well-documented calibration procedures with a minimum of calibration transfer steps between national standards (kept by, e.g., PTB, NIST, METAS) and the target instrument. The systematic luminance error of imaging LMDs, often specified as "accuracy" in the data sheets, is in the range of 3–4 % for illuminant A. This means that for other spectral power distributions (e.g., LEDs), the error may become substantially larger.

Application-Relevant Aspects

Since imaging LMDs are nonideal instruments, there are some aspects that should be considered by the measurement engineer in order to produce reproducible and repeatable results with low measurement uncertainties.

Stray Light

The manufacturer of the LMD will duly consider all sources of stray light in the instrument and assure the lowest possible stray light levels by care in design and manufacturing. The measurement engineer should avoid stray light by appropriate selection of the test pattern and the measurement setup. The situation shown in Fig. 14 should not be used for measurement of the luminance of the black area, for the reasons described above.

When bright areas in the vicinity of dark regions cannot be avoided, e.g., for determination of display contrast with a checkerboard pattern, an opaque black mask with the dimensions of the black regions under consideration should be used in order to evaluate the luminance of black in the presence of stray light (see Boynton and Kelley 1998). The measurement must then be corrected for this black level.
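In its simplest form, this correction is a subtraction of the replica-mask reading from the black-area reading, as sketched below with illustrative numbers; it assumes the mask itself reflects a negligible amount of light compared to the stray light it samples:

```python
def corrected_black_luminance(black_measured_cdm2, mask_measured_cdm2):
    """Replica-mask stray-light correction (after Boynton and Kelley 1998).

    mask_measured_cdm2 is read with an opaque black mask of the same size and
    position as the black region, so it ideally contains only stray light.
    """
    return black_measured_cdm2 - mask_measured_cdm2

l_black = corrected_black_luminance(0.45, 0.30)  # 0.15 cd/m^2 of true black luminance
contrast = 220.0 / l_black                        # vs. the naive 220 / 0.45 ~ 489:1
print(f"black: {l_black:.2f} cd/m^2, checkerboard contrast: {contrast:.0f}:1")
```

As the example shows, uncorrected stray light can understate checkerboard contrast by a factor of several.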

Polarization Effects

When imaging LMDs are used for measurement and evaluation of display screens, e.g., for evaluation of mura or uniformity of screen luminance, it must be considered that most display screens (computer monitors, TV screens, etc.) are based on liquid crystal displays (LCDs), and thus the light emitted or transmitted by such display screens is linearly polarized. As a consequence, LMDs with spectral separation via beam splitters will feature pronounced variations of measured luminance and chromaticity with the angle of rotation of the DUT or the LMD. Such LMDs should not be used with LCDs because of their sensitivity with respect to polarized light.

Control of Exposure Timing

The radiant energy incident on the elements of a detector array is proportional to the time period over which the luminous flux generates charges or photocurrents. As a direct consequence, the time periods of exposure and, in the case of CCDs, also of the charge transfer must be precisely controlled. In the case of CCDs without electronic shuttering, means must be provided to block light from the sensor array with well-characterized transition timing. Most shutters with electromagnetic actuators take about 20 ms for a transition between the open and the closed state, and moreover, the latency of such systems is not specified and probably not constant. A more feasible approach is given by a disk with segment-shaped apertures that rotates with a well-defined and controlled constant angular speed. In the case of image sensors with electronic shuttering, the timing circuits have to be realized by the manufacturer of the image sensor and the related peripheral electronics in such a way that precise and repeatable exposure timing is possible.

Together with the exposure control of the LMD, temporal modulations of the luminance of the device under test must be taken into account in order to avoid temporal sampling artifacts that may induce large variations of the results. This has recently become an important aspect, since the backlights of display screens are increasingly realized with pulse-width-modulation controlled LEDs for adjustment of the display luminance.

The exposure time of the LMD has to be adjusted to comprise an integer number of frame periods of the DUT luminance modulation, or the exposure time has to be extended such that variations of the phase difference between the DUT modulation and the image acquisition time do not affect the results. In some cases this is achieved by application of neutral density filters to avoid overexposure of the detector array.
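A sketch of the first strategy – rounding the exposure up to an integer number of DUT frame periods – is given below; it assumes the backlight PWM is locked to the frame rate, which is not guaranteed in practice and should be verified:

```python
import math

def pwm_safe_exposure_s(min_exposure_s, dut_frame_rate_hz):
    """Shortest exposure >= min_exposure_s spanning an integer number of frame periods."""
    frame_period = 1.0 / dut_frame_rate_hz
    return math.ceil(min_exposure_s / frame_period) * frame_period

# At least 25 ms of exposure on a 60 Hz display:
print(f"{pwm_safe_exposure_s(0.025, 60.0) * 1e3:.2f} ms")  # 33.33 ms = 2 frames
```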

Spatial Sampling Artifacts

Further sampling artifacts can be found in the spatial domain when the size of the detector elements is close to the size of the display pixel image on the sensor array. In that case distinct low-frequency modulations can occur (Moiré interference) that may severely affect the measurement results. Well-defined defocusing to a specified MTF has been proposed for suppression of such disturbing modulations and is successfully applied in many laboratories. Another way of suppressing Moiré interference is oversampling of the display pixel matrix at sampling rates above five.

When imaging LMDs are used for evaluation of the lateral uniformity of display screen luminance, it is often forgotten that both luminance and chromaticity of all screens with LCDs vary more or less – depending on details of the LCD effect – with viewing direction and thus with direction of measurement. When an imaging LMD replaces the observer as shown in Fig. 17 and a measurement is carried out to evaluate the uniformity of luminance across the screen area, it cannot be decided whether the measured variations (nonuniformities) are due to lateral or directional variations. A separation of these two effects and the identification of the actual reason for the nonuniformities are only possible when telecentric lenses are used for such measurements. The disadvantage of telecentric lenses is the fact that the area of measurement is only as large as the aperture of the lens. Vice versa, if larger areas have to be measured, telecentric lenses become correspondingly large, thus heavy and expensive. A check of the effect of the direction of measurement can be performed by shifting the corner positions into the center of the iLMD measurement field (i.e., onto the LMD optical axis) and by comparing the corresponding measurement values.
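Whether a given setup respects the oversampling rule can be estimated from the display pixel pitch, the optical magnification, and the detector pitch, as in the sketch below (all numbers are illustrative):

```python
def samples_per_display_pixel(display_pitch_mm, magnification, sensor_pitch_mm):
    """Number of detector samples across the image of one display pixel."""
    return display_pitch_mm * magnification / sensor_pitch_mm

n = samples_per_display_pixel(0.25, 0.12, 0.005)  # 0.25 mm pitch, 0.12x optics, 5 um pixels
print(f"{n:.1f} samples per display pixel ->", "OK" if n >= 5 else "risk of Moiré")
```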

Typical Applications

Electronic visual displays are supposed to provide visual information that precisely corresponds to the electrical input signals, presented on a uniform display background. Deviations from that ideal case comprise nonuniformity of the display background, as caused by, e.g., nonuniform backlighting in the case of LCD screens, and artifacts caused by the activation of individual pixels (i.e., cross talk).

All standards for electronic visual displays from IEC TC110 comprise measurement methods for evaluation of display nonuniformities and lateral variations:

IEC 61747: Liquid crystal display devices (LCD)
IEC 61988: Plasma display panels (PDP)
IEC 62341: Organic light-emitting diode displays (OLED)
IEC 62679: Electronic paper displays (EPD)

Analysis of Features with Complex Outlines

In some parts of the industry, it is important to assure uniform luminance and chromaticity of small features, e.g., of letters and numerals in display systems that provide important visual information to the operator of a machine, for both esthetic and safety reasons. Analysis of luminance and chromaticity within the label "130" shown in Fig. 15 would be very time-consuming and error prone with a mechanical scanning approach; it is a simple task for imaging LMDs, since the location and the contours of the features are automatically obtained from the image of the display device. The operator just selects the area of analysis with a polygon line (here a rectangle), and for each detector element of the LMD, both luminance and chromaticity can be shown graphically, or the measurement values can be exported to a file for further processing or documentation.

Fig. 15 Lateral variation of luminance in one numerical label of an analog indicator in an automotive display (130) shown as a 3D representation with pseudo-colors and the profile of the luminance of the indicator needle of the speedometer. The regions of interest are marked white (Courtesy of Instrument Systems GmbH)

Fig. 16 Lateral variations of luminance across an LCD screen in the white and black states, pseudo-color representation. (a) Nonuniformity of an LCD screen in the white state. (b) Nonuniformity of an LCD screen in the black state (Courtesy of Instrument Systems GmbH)

When the measurement setup is calibrated accordingly or when reference graduations are included in a measurement, the lateral dimensions of features can readily be determined.

Evaluation of Uniformity

Any display screen should be able to present uniform luminance and chromaticity across its surface, since deviations are perceived as disturbing, at the least (see Fig. 16). In some diagnostic applications, nonuniformities may lead to wrong results (radiological and other diagnostic imaging examinations) and thus are extremely critical. Evaluation of the lateral uniformity of display screen luminance and chromaticity at first sight seems to be an ideal task for imaging LMDs, because all required information can be collected in one image. A second look, however, reveals that in such a measurement, in which the LMD takes the position of the eye of the observer shown in Fig. 17, each location on the screen area is seen and measured from a different direction. When the optical properties of the object of measurement vary with direction of observation, as is the case with LCD screens, the result of such measurements is based on a mixture of lateral and directional variations which cannot be separated right away. One way of separating lateral and directional effects is to resort to mechanical scanning (see Fig. 3) with the imaging LMD and only measure a small area on the optical axis. Another, more elegant way is based on a special kind of optics which is designed such that the principal rays are parallel to the optical axis of the LMD (i.e., telecentric optics). The aperture of such lenses is slightly larger than the area to be measured, and telecentric lenses with aperture diameters above 15 cm are heavy and expensive.

Fig. 17 The observer is looking at – and the LMD measuring – each location on the display from a different direction specified by, e.g., two polar angles, θ and ϕ. The larger the screen and the smaller the distance, the larger is the angle of inclination, θ, at the corner locations

While measurement and evaluation of variations over large distances (which means low spatial frequencies) are critical because of the intermingling of lateral and directional variations in the case of LCD-based screens, imaging LMDs can be used without immediate restrictions for evaluation of variations over small distances (i.e., high spatial frequencies). This group of measurements comprises defect detection and classification, evaluation of image sticking and cross talk in LCDs (see IEC 62747), and evaluation of the modulation transfer function of display screens (Yamazaki et al. 2013).

Measurement of Directional Distribution of Emission

The emission characteristics of light sources, provided they do not emit into extended solid angles, can be measured in a setup where the light source illuminates a transparent scattering screen and the imaging LMD measures the lateral luminance distribution of the screen, L(x, y), which can then be transformed into a directional distribution of the luminous intensity, I(θ, ϕ), as illustrated in Fig. 18. Figure 19 shows a photograph of the realization of the measurement setup sketched in Fig. 18. The scattering of the screen should be as uniform as possible (across its area and with respect to directions) to minimize its effect on the measurement results. The luminance L(x, y) of each area element dA(x, y) on the screen should be proportional to the illuminance E(x, y) at the position x, y and should not depend on the angle of inclination of the flux illuminating the area element dA(x, y). In a similar way, the directional emission characteristics can be measured with a reflective scattering screen when both light source and LMD are located on the same side of the screen (Fig. 20).
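For a point-like source on the screen axis, the transformation from screen position to emission direction follows from simple geometry: a ray leaving at inclination θ hits the screen at radius r = d·tanθ after travelling d/cosθ, so E = I·cos³θ/d². The sketch below assumes an ideal Lambertian screen whose luminance is proportional to its illuminance (L = k·E); the constant k would have to come from an absolute calibration:

```python
import numpy as np

def intensity_from_screen_luminance(x_m, y_m, luminance, distance_m, k=1.0):
    """Directional intensity I(theta, phi) from screen luminance L(x, y).

    Assumes L = k * E on the screen and a point source at distance d on the
    screen normal, so E = I * cos^3(theta) / d^2 and I = L * d^2 / (k * cos^3(theta)).
    """
    r = np.hypot(x_m, y_m)
    theta = np.arctan2(r, distance_m)  # inclination of the ray reaching (x, y)
    phi = np.arctan2(y_m, x_m)         # azimuth
    intensity = luminance * distance_m ** 2 / (k * np.cos(theta) ** 3)
    return np.degrees(theta), np.degrees(phi), intensity

print(intensity_from_screen_luminance(0.2, 0.0, 150.0, 1.0))
```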

Fig. 18 Setup for measurement of the directional emission distribution function, I(θ, ϕ), of a light source with a scattering transparent screen. The lateral luminance (radiance) distribution on the screen is transformed into the directional distribution of emission

Measurement of Directional Distributions of Transmission and Reflection

The measurement approach sketched in Fig. 18 can be modified and used for evaluation of the scattering properties of transparent samples (see, e.g., Becker 2006) from the lateral luminance distribution, when the directional emission characteristic of the light source is sufficiently isotropic in the range of directions of interest. A white LED with "Lambertian" emission can conveniently be used for that purpose. At each area element dA(x, y), the direction of light incidence can be calculated from the geometry, and so can the direction of the light received from dA(x, y) by the imaging LMD. This approach offers fast evaluation of the scattering characteristics (BSDF) without mechanical scanning, and high angular resolution for limited angular ranges. When the surface to be measured is rolled to form a cylinder and illumination of the sample is provided by a linear light source as sketched in Fig. 20, the bidirectional reflectance distribution function (BRDF) can be calculated from the lateral distribution of reflected luminance as shown in Fig. 20 (left). At each position on the cylinder surface, there is a different combination of light incidence angle, θi, and inclination of the received beam, θr, from which the difference angle θ* = θr − θi is calculated. The result is then the reflectance of the sample as a function of the difference angle, θ*. This approach is used for characterization of the gloss of, e.g., paper and hair.
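The mapping from cylinder position to the angle pair (θi, θr) is pure geometry. One possible 2-D cross-section model is sketched below, with the cylinder centered at the origin and the source line and camera at finite positions in the same plane; all coordinates are invented for illustration:

```python
import numpy as np

def difference_angle_deg(azimuth_deg, radius_m, source_xy, camera_xy):
    """theta* = theta_r - theta_i at one azimuth position on the cylinder cross-section.

    The outward surface normal at azimuth a points along (cos a, sin a); signed
    angles of the source and camera directions are measured against that normal.
    """
    a = np.radians(azimuth_deg)
    p = radius_m * np.array([np.cos(a), np.sin(a)])   # surface point

    def signed_angle_to_normal(target_xy):
        d = np.asarray(target_xy) - p
        ang = np.degrees(np.arctan2(d[1], d[0])) - azimuth_deg
        return (ang + 180.0) % 360.0 - 180.0          # wrap to [-180, 180)

    theta_i = signed_angle_to_normal(source_xy)
    theta_r = signed_angle_to_normal(camera_xy)
    return theta_r - theta_i

for az in (45.0, 60.0, 80.0):  # positions along the visible part of the cylinder
    print(az, round(difference_angle_deg(az, 0.05, (0.0, 0.15), (1.0, 0.3)), 1))
```

With the source close to the cylinder, θ* sweeps over a range of values along the surface, which is what lets a single image sample many incidence/viewing combinations at once.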

Fig. 19 Realization of the measurement setup sketched in Fig. 18 (Courtesy of Instrument Systems GmbH)

Fig. 20 Side view of a measurement setup for measurement of the reflectance distribution of a cylindrical surface with a linear light source (left) and image of the cylinder surface as recorded by the LMD (right)

When the light source is located along the axis of the cylinder, the bidirectional transmittance distribution function (BTDF) of transmissive samples attached to the surface of the transparent cylinder can be measured.

Near-Field Goniometric Measurements with Imaging LMDs

The design of optimized illumination devices requires precise knowledge of the directional distribution of the light emitted at each position of the light source, for numerical simulation of the light-shaping means (reflectors, lenses, etc.). Such data is usually acquired with imaging LMDs that are mechanically aligned with a goniometer to "look" at the light source to be measured from all directions within a large solid angle, preferably up to 2π (hemisphere). From a series of such measurements, the directional distribution of light emission at each area element of the light source can be calculated.

Fig. 21 Analysis of the luminance and chromaticity conditions in the surroundings of a computer monitor with an imaging LMD with calibrated wide-angle optics, shown in a spherical coordinate system. The concentric circles indicate angles of inclination of 20°, 40°, 60°, and 80°

Analysis and Documentation of Lighting Scenarios (Indoors, Outdoors)

When imaging LMDs are combined with wide-angle objective (e.g., fisheye) lenses, they can be used for evaluation of the illuminance conditions at certain locations within buildings from the distribution of luminance and chromaticity, as illustrated in Fig. 21. Such applications are becoming more important for improvement of working conditions according to ergonomic concepts and for improvement of the illumination situation, especially when daylight is mixed with a range of light sources with different emission spectra (incandescent lamps, fluorescent tubes, LED-based illumination).

Outlook on Future Developments

Manufacturers of imaging LMDs are working hard to further improve the performance of their instruments, especially by reducing the effects of the known weak spots described above. Additionally, the need for improved evaluation of chromaticity in the case of spectral power distributions with narrow emission bands, as provided by, e.g., LEDs, may stimulate an increase in the number of filters in imaging LMDs or the addition of a spectroradiometer for reference measurements of chromaticity at one location within the measurement field of the imaging LMD. The combination of an imaging LMD with image-transferring spectrographs, realized, e.g., by Fourier transform spectrometers or electrically tunable liquid crystal filters, is called "hyperspectral imaging"; it finds applications in those fields where the additional spectral information at each picture element provides clues on the nature of the items to be identified, among others in astronomy, agriculture, biomedical diagnostics, mineralogy, physics, and general surveillance.

Another aspect that will be improved in the course of time is the sensitivity of the detector arrays to incoming light. More current or charge per unit of incoming radiant flux means shorter exposure periods, which in turn means higher temporal sampling rates. Faster temporal sampling could make analysis of temporal variations of display luminance possible and thus form the basis for evaluation of transition periods between different gray levels in displays (dynamic response evaluation) and for the analysis of moving images.

Further Reading

Becker ME (2006) Display reflectance: basics, measurement, and rating. J SID 14(11):1003–1017
Boynton PA, Kelley EF (1998) Small-area black luminance measurements on white screen using replica masks. SID Symp Dig Tech Paper 29(1):941–944
Information Display Measurements Standard 1.03, issued by the International Committee for Display Metrology (ICDM) of the Society for Information Display (SID)
US Patent 5,432,609 (1995)
Yamazaki A et al (2013) Spatial resolution characteristics of organic light-emitting diode displays: a comparative analysis of MTF for handheld and workstation formats. SID 2013 Digest

Optically Clear Adhesives

Christopher J. Campbell

Contents

Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Chemical and Physical Principles . . . . . . . . . . . . . . . . . . 2
Key Chemistries and Properties of (L)OCAs . . . . . . . . . . . . . 5
Examples of Applications . . . . . . . . . . . . . . . . . . . . . . 7
Film Sensor Stack . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Bonding to the Cover Glass . . . . . . . . . . . . . . . . . . . . . 7
The Desire for Direct Bonding . . . . . . . . . . . . . . . . . . . . 8
LOCA Dispensing and Lamination Processes . . . . . . . . . . . . . . 10
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

Abstract

Optically clear adhesives are used to bond display components to improve contrast and brightness, as well as to enhance the mechanical and electrical performance of the display. Adhesive manufacturers take into account the reliability of the material, as well as fundamental mechanical and optical properties. Display integrators set key specifications and processes to utilize the materials in display bonding.

Synonyms and Definitions of Key Terms, Phrases, and Acronyms

AMOLED: Active-matrix organic light-emitting diode
Cfinger: Capacitance from a finger
CP: Parasitic capacitance
Ctouch: Capacitance that a touch sensor measures
DITO: Double-coated ITO film or glass
GFF: Coverglass bonded to film touch sensors

C.J. Campbell (*) Advanced Product Development Specialist, 3M Display Materials & Systems Division, Saint Paul, Minnesota, USA e-mail: [email protected] # Springer-Verlag Berlin Heidelberg 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_197-1


ITO: Indium tin oxide
LCM: Liquid crystal display module
LOCA: Liquid optically clear adhesive
Mura: Localized/area variation in luminance or color in LCD displays
OCA: Optically clear adhesive, often referring to a film adhesive
OGS: One glass or on glass solution
PC: Polycarbonate
PCAP: Projected capacitive sensing
PMMA: Poly(methyl methacrylate)
PSA1: Adhesive bonding cover lens to touch sensor
PSA2: Adhesive bonding touch sensor to polarizer surface of display
PSA3: Adhesive bonding touch sensor layers
SITO: Single-coated ITO film or glass
TOL: Touch on lens
UV: Ultraviolet radiation

Overview

Initial touch sensing in displays was achieved through resistive sensing (Askew 2013). An air gap typically separated two sets of transparent conductive traces (one set running vertically; the other, horizontally) or two continuous conductive substrates, and a force needed to be applied (e.g., through a stylus or a hard press of the finger) to make the circuit by physically connecting the two traces. As touch sensing migrated to projected capacitive sensing (PCAP) (Walker 2012), it was no longer necessary to physically connect the traces to register the touch. As a result, the air gap was no longer needed, and touch sensor and display integrators could then focus on improving the optics, touch sensitivity, and mechanical performance of touch-sensing displays. Optically clear adhesives (OCA) and liquid optically clear adhesives (LOCA) enable filling the air gap. Figure 1 highlights typical constructions that can be utilized today, ranging from discrete sensor layers, such as double-coated or single-coated indium tin oxide (ITO) glass between the coverglass and the liquid crystal display module (LCM), or coverglass bonded to film sensors (GFF), to newer technologies involving the touch circuitry patterned on the coverglass, commonly referred to as touch on lens (TOL) or one glass solution (OGS), or the touch circuitry patterned on the LCM via in-cell or on-cell technology (Hsieh et al. 2011).

Chemical and Physical Principles

Optical performance can be enhanced in a touch-enabled display by taking advantage of Snell's law through filling the air gap and removing reflective surfaces. This is achieved by choosing adhesives to bond parts of the display stack together that have matched refractive indices. With glass having a refractive index of 1.47–1.5, many adhesives used in display bonding are based on acrylate chemistry, which has a similar refractive index range of 1.46–1.49.

Fig. 1 Typical touch circuit lamination configurations (stack diagrams of the GG, OGS/TOL, GFF, GF2/DITO-film, GF1/SITO-film, in-cell, and on-cell constructions, built from cover glass, PSA1–PSA3 layers, X/Y sensing lines on glass or film, a polarizer, and the front surface of the display). In-cell and on-cell touch circuits are located behind and in front of the liquid crystal module, respectively

A fully bonded display could see an improvement in brightness of up to 20 % through the reduction of reflective surfaces. Figure 2 demonstrates the difference in reflective surfaces between an air gap construction and a fully bonded display.

Touch sensitivity is an important factor for touch-enabled displays. The use of (L)OCAs can help improve PCAP touch sensitivity through careful consideration of the dielectric constant. Figure 3 is an illustration of a PCAP touch sensor in a display. The touch is registered as a capacitance measurement (Ctouch), which is the sum of the parasitic capacitance (CP) and the finger capacitance (Cfinger). An optimal sensor construction would be sensitive to changes in Cfinger when the capacitance changes as a result of a finger or stylus affecting the electrical fields, while minimizing the parasitic capacitance noise (Hwang et al. 2010). To determine an optimal adhesive for use between the cover glass and the touch sensor, the impact on Cfinger is the most important factor. Cfinger is defined by:

Fig. 2 Reflective surfaces of each air-substrate interface (left) lead to glare and significant light loss. By using a refractive index matched (L)OCA (right), light loss and glare will only occur on the front surface of the device
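The magnitude of the brightness gain discussed above can be checked with the normal-incidence Fresnel equation, R = ((n1 − n2)/(n1 + n2))². The sketch below uses the refractive indices mentioned in the text and considers normal incidence only:

```python
def fresnel_reflectance(n1, n2):
    """Normal-incidence Fresnel reflectance of a single optical interface."""
    return ((n1 - n2) / (n1 + n2)) ** 2

print(f"glass/air interface: {fresnel_reflectance(1.5, 1.0):.2%}")    # ~4 % per interface
print(f"glass/OCA interface: {fresnel_reflectance(1.5, 1.47):.4%}")   # ~0.01 %
```

With several glass/air interfaces in an air-gap stack each losing about 4 %, index-matched bonding eliminates nearly all of that reflection loss.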

Fig. 3 Schematic for a PCAP touch sensor. In front of the sensor is the sensing region where the finger or stylus touch is detected. Behind the sensor is the parasitic capacitance region, where noise from the display and other electronics can interfere and create unwanted noise for the sensing region

$$C_{\mathrm{finger}} = \frac{\varepsilon_0 \varepsilon_R A}{t} = \frac{\varepsilon_0 A}{T_v}$$

where εR is the dielectric constant (or relative permittivity) between the finger and the electrode, ε0 is the vacuum permittivity, A is the touch area, t is the thickness between the electrodes and the finger, and Tv is the equivalent vacuum thickness, defined as the thickness between the electrodes and the finger divided by the dielectric constant between the finger and the electrode. An increase in εR leads to better electric field penetration and greater sensitivity. The easiest way to achieve this is to pick a (L)OCA with a higher dielectric constant. A reduction in the overall thickness between the top of the cover glass and the touch sensor is another way of achieving greater sensitivity. To determine an optimal adhesive for use between the touch sensor and the display (behind the sensing circuitry), CP should be minimized. The goal is to provide an insulator behind the touch sensor to isolate/minimize noise created by the display and the electronics located behind the display. With a dielectric constant of 1, an air gap is the ideal insulator in this case. However, to improve optical performance and enable thinner devices, (L)OCAs are often used to bond the touch sensor directly to the display. In this case, a (L)OCA should be chosen to maximize the equivalent vacuum thickness, Tv. This can be done through a lower dielectric constant or a thicker adhesive layer.
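Both design rules can be made concrete in a few lines of Python. The sketch below generalizes the single-layer definition of Tv above to a stack of layers (Tv = Σ tᵢ/εᵢ); the layer thicknesses and dielectric constants are assumptions for illustration, not vendor data:

```python
EPSILON_0 = 8.854e-12  # F/m, vacuum permittivity

def equivalent_vacuum_thickness_m(layers):
    """Tv for a stack: sum of (thickness / dielectric constant) over all layers."""
    return sum(t / eps_r for t, eps_r in layers)

def c_finger_farad(touch_area_m2, layers):
    """Finger capacitance from the formula above, for a layered cover stack."""
    return EPSILON_0 * touch_area_m2 / equivalent_vacuum_thickness_m(layers)

# Hypothetical stack above the sensor: 0.7 mm cover glass (eps_r ~ 7) + 0.2 mm OCA (eps_r ~ 3)
stack = [(0.7e-3, 7.0), (0.2e-3, 3.0)]
print(f"Tv = {equivalent_vacuum_thickness_m(stack) * 1e3:.3f} mm")
print(f"Cfinger = {c_finger_farad(1e-4, stack) * 1e12:.2f} pF for a 1 cm^2 touch")
```

Raising the OCA's dielectric constant in the example shrinks Tv and raises Cfinger, which is exactly the sensitivity lever described above.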

Another crucial factor for touch sensitivity is the prevention of corrosion on the touch sensor traces. Many touch sensor manufacturers address this issue by passivation of the touch sensor traces with an organic layer that prevents corrosion. However, environmental exposure of this layer, as well as of the metal flex connections to these traces, can lead to damage to the touch circuitry and a loss in touch sensitivity. To prevent this, a (L)OCA should be chosen that is acid-free and that does not cause oxidation and/or reduction of the materials it comes into contact with. Many (L)OCA manufacturers have adopted standard test methods for ITO corrosion resistance. As new touch sensor materials become more prevalent (e.g., metal mesh (Aryal et al. 2014), silver nanowire (Mayousse et al. 2013), carbon nanotube (Mikladal et al. 2013), and graphene (Obeng and Srinivasan 2011)), it will be critical to evaluate the sensitivity of (L)OCAs in contact with these materials.

Key Chemistries and Properties of (L)OCAs

Optically Clear Adhesives (OCA)

An optically clear adhesive typically refers to a product provided in film form. The material is often a transfer tape between two release liners. Adhesive manufacturers make this material in roll form. Most integrators handle this product in piece-part form to laminate discrete parts of the display, cover glass, and touch circuitry. To make the piece parts, the material is die cut by a converter (the adhesive manufacturer or a third-party converter). Optically clear adhesive films typically exhibit pressure-sensitive adhesive (PSA) characteristics, such as meeting the so-called Dahlquist criterion of a room temperature modulus of less than 3 × 10⁶ dyn/cm² (Pocius 2002), being a viscoelastic film, and having good adhesion. Good adhesion can be defined as having sufficient peel strength to avoid delamination through flexing of the device or through device warpage in thermal cycles, and having suitable shear and tensile adhesion to avoid creep under normal loading. The specific adhesion targets need to be defined by the end-use materials and application. An adhesive that adheres well to glass may show issues adhering to engineered plastics such as poly(methyl methacrylate) (PMMA) or polycarbonate (PC). Coupon-level tests can be used to assess adhesion performance on the relevant substrates (Pocius 2002). While many standard transfer tapes may exhibit sufficient adhesion, OCAs require a superior level of optical performance. The OCA should be manufactured in a clean room environment to avoid introduction of foreign particles or contaminants that could be visible in the display area. Subsequent processing of the OCA should also take place in a clean room environment. In terms of general optical properties, the OCA should be sufficiently transparent (>99 %), exhibit low color (b* or yellow being less than

5 V) and a steep electro-optical curve. This can be achieved for LCDs by STN technology (see chapter "▶ Twisted Nematic and Supertwisted Nematic LCDs"). Therefore, the resolution of PM LCDs and PM OLEDs (peak luminance) is limited. On the other hand, all commercial PDPs have been passive matrix (but in a special mode called memory mode, see part "▶ Plasma Display Panels").

Active matrix driving (Fig. 7b, chapter "▶ Active Matrix Driving"; for LCD examples see part "▶ LCD Addressing," for AMOLED see chapter "▶ Active Matrix for OLED Displays") refers to a matrix display which contains a so-called active (electronic) element for each pixel on its back plane (Fig. 3). The vast majority use thin film transistors (TFTs). These transistors, with the electronic functionality of a MOSFET (called the address TFT for AM LCDs), act like switches between the display columns and the pixels. The gate of a TFT is connected to its row (scan line); if this row is activated by a voltage, the gate "opens" and lets the column voltage (gray level data voltage) pass to the pixel. So a pixel is only exposed to its individual voltage and not, as for PM, to the voltages of the other rows during a frame time; the TFT "isolates" a pixel from the rest of the matrix when it is not selected by its row. A storage capacitor "holds" the gray level voltage (to compensate for the small but nonzero conductivity of liquid crystals) until the corresponding row is selected at the next frame (16.7 ms for 60 Hz).

This paves the road toward high resolution and high image quality. The key advantage of active matrix compared to passive matrix drive is that it allows accurate control of the data (usually in the form of a voltage) loaded onto each pixel. TFTs can be produced from different material types, such as a-Si (amorphous silicon, chapter "▶ Hydrogenated Amorphous Silicon Thin Film Transistors (a Si:H TFTs)"), p-Si (polycrystalline silicon, chapter "▶ Polycrystalline Silicon Thin Film Transistors (Poly-Si TFTs)"), oxide (chapter "▶ Oxide TFTs"), or organic (chapter "▶ Organic TFTs: Polymers") materials.

What Are A Display Module and Display System?

In this section, we consider more of the full system of which a display is a part. A display consists not only of a pure substrate (Fig. 3) or matrix electrodes and TFTs (Fig. 7) but also of the drive electronics and embedded software, which determine the gray level data for each subpixel and the selection of rows. A device consisting of display glass, interface (IF), electronics, power supply, and, for LCDs, a backlight is called a "display module" (see the block diagram in Fig. 8).

Fig. 8 Block diagram of a high-resolution matrix display module, backlight only for LCDs

Each of the subassemblies shown in this figure has dedicated tasks to perform:

• The module itself provides housing for all subassemblies and the mechanical fixture. Its dimensions are relevant for mechanical integration.
• The power supply (part "▶ Power Supply") delivers all necessary voltages, with their maximum currents, which are needed within the display module. Some modules have no built-in power supply, so that every voltage has to be provided from outside. This is relevant for the electronic design of the display system.
• The interface (part "▶ Panel Interfaces") transfers data from a data source (e.g., a display or graphics controller, see below) to the timing controller (TCON). Power is mostly supplied via the interface connector as well. The selection of the electronic interface depends mostly on the display resolution – the more pixels, the higher the data rate to be transferred (a rough estimate is sketched after this list).

Fig. 9 A display module within a complete electronic system with user interaction



• The timing controller (TCON) converts the input interface data into the format needed by the row and column drivers, including the gray level transfer function (see chapter "▶ Luminance, Contrast Ratio and Grey Scale") for the electro-optic display technology used.
• Row drivers select each row sequentially so that the gray level data can be transferred to the pixels of the selected (activated) row (line) by the column drivers.
• Column drivers deliver the gray level voltage waveforms for the currently selected row.
• Backlight for color LCDs (chapter "▶ Dimming of LED LCD Backlights," part "▶ LCD Backlights and Films") or front light for reflective e-paper.
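As referenced in the interface bullet above, the bandwidth an interface must carry can be estimated from resolution, color depth, and refresh rate. The sketch below is a rough first-order estimate under an assumed 20 % blanking overhead; real interfaces add further protocol overhead (encoding, packetization) on top:

```python
def interface_data_rate_gbps(h_px, v_px, bits_per_pixel, refresh_hz, blanking=0.20):
    """Rough video data rate of a display interface, in Gbit/s."""
    return h_px * v_px * bits_per_pixel * refresh_hz * (1.0 + blanking) / 1e9

print(f"QVGA:    {interface_data_rate_gbps(320, 240, 16, 60):.3f} Gbit/s")  # small-panel class
print(f"Full HD: {interface_data_rate_gbps(1920, 1080, 24, 60):.2f} Gbit/s")
```

The two orders of magnitude between the examples explain why small passive-matrix modules get by with simple parallel interfaces while high-resolution active-matrix panels need high-speed serial links.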

So, finally, a display module is a device for converting input data into a viewable form. This means that the input image data must be adapted to the driving principle and the electro-optical characteristics of the display technology used. Another important topic for judging display materials (such as emitters of emissive displays, see section "▶ Emissive Displays"), subassemblies (such as LCD backlights and films, see part "▶ LCD Backlights and Films"), and systems with a display is efficiency. This is defined as the luminance output multiplied by the active area size, divided by the total power consumption (see part "▶ Spatial Effects," section "▶ Efficiency"). It is clear that the efficiency should be maximized, especially for mobile and wearable display systems (see section "▶ Mobile Displays, Microdisplays, Projection and Headworn Displays"). For PC monitors and TV sets, legal requirements such as EC No 642/2009 in the EU and labels such as Energy Star in the US exist, forcing display manufacturers to reduce power consumption toward "green displays" (which also includes production, recycling, and disposal). A whole display system as shown in Fig. 9 combines the following elements and subassemblies:

• User input via, e.g., keyboard, touch, and mouse.
• Application features like housing and power supply.
• Software for the Human Machine Interface (HMI) or Graphical User Interface (GUI); this includes an optional operating system (OS).
• Data source and storage (hard disk, cloud, etc.).
• A microprocessor (including microcontrollers and the processors of PCs) to provide (generate) the content to be shown. The various approaches for microcontroller – display controller – display are presented in part "▶ Embedded Systems."
• The data generated by the microprocessor are transferred to the display controller and are stored in the display data (or video) RAM. Basically, the RAM must only be updated when pixel data change.
• The display controller (graphics adaptor) streams the display RAM data in real time to the display module's interface.
• The display module (see Fig. 8) processes and visualizes these data.
• The user output is mainly provided by the display module, but other channels like audio are often used.

It is evident from this introductory discussion that a display is a highly complex system with a vast number of technologies, optimization strategies, and application-specific considerations. Other very important aspects are the associated markets and economics (see section "▶ Display Markets and Economics"). The variety of data representation methods is illustrated in Fig. 10.

Fig. 10 Overview of character reproduction (for “A” and “a”) of displays and its impact on driving principle, display module, interface, and software

18

K. Blankenbach

Seg,” or “7 Segment”) due to their appearance when all segments (pixels) are activated; another explanation refers to 7 segments plus decimal point. Those displays are mostly used for watches and meters such as temperature devices. Their segment count is in the range of tens, and they are therefore addressed by direct drive or low MUX (multiplex, similar to passive matrix). 8-Segment digits can show only a very limited number of characters like A, B, C, E, F, H, L, O, and U. Starburst displays have 4 (diagonal) segments more than 8-Segment displays, enabling the reproduction of numbers and characters, however mostly limited to Latin ones. Furthermore, icons made by electrode structuring are often implemented to provide some “graphics.” As the total number of segments lies in the range of 100, these displays are typically MUX driven (chapter “▶ Direct Drive, Multiplex and Passive Matrix”). Both low-resolution approaches are typically driven by a microcontroller with built-in display controller (see chapter “▶ Direct Drive, Multiplex and Passive Matrix”). Software development is relatively easy as supported by available C-code subroutines for the microcontroller. Due to low pixel count, the data rate is relatively low. In the future, these segmented displays might gain some volume when integrated in lowest cost systems and smart systems (smart home, Industry 4.0) with rudimentary data visualization and control via wireless interfaces by mobile devices. Matrix displays range from character displays (typically a few 1,000 s of pixels) to graphics displays with millions of pixels. This type is currently the most widespread for consumer and professional applications. As a consequence of the wide range of pixels, display driving is done by either passive or active matrix (see part 1.4). It is obvious that character reproduction by matrix displays is better and more flexible than on segmented displays, and the facility for color provides further features. The biggest benefit is the visualization of graphics and multimedia content. Passive matrix-driven displays have mostly the display controller built in the display module (low data rate), while active matrix displays are typically a part of PC-like systems or ARM-based devices with the highest data rates up to GBit/s. Without the use of an operating system (OS) or graphics library (Lib), the software effort would be enormous.

Fundamentals of Designing Display Systems Toward Applications

Since the transducer or emitter materials, display subassemblies, or a display module are only a part of a complete electronic system for a dedicated application, a complete approach requires a full system design. Figure 11 provides an overview of such an approach to designing a full system incorporating a display. This approach is illustrated in logical order, with mutual dependencies indicated; yellow boxes are mainly related to displays; cyan symbolizes user and application characteristics:

Fig. 11 Typical design flow for a system with a display

• The first step in a new product design, or in a redesign of an existing one, is the product specification or an "idea."
• The next step is to define the data to be displayed which are necessary to control or operate the system. The way these data are presented on a display is named the Graphical User Interface (GUI) or Human Machine Interface (HMI). The latter often refers more to the whole system, including the input and output devices.
• Display resolution is defined by the data to be displayed, by summing up all text, icon, and graphics elements in terms of pixels.
• The display resolution determines the whole electronic system requirements, including the microprocessor and graphics processor (see section "What Are A Display Module and Display System?"). An operating system (OS) and GUI software are mostly used for high-resolution graphics systems.
• The screen size is set by the observer distance (viewing condition) and related to vision (see section "What Is A Pixel"); a sketch of this estimate follows after this list. If a touch input is used, the touch buttons on the display must also have an appropriate size to match input geometry, typically 2 cm × 1 cm for an OK button.
• The environment of the application, including the operating temperature and the incident angle of the observer (viewing angle), helps to inform the suitable display technology.
• Other application aspects to be considered are power consumption for battery-, vehicle-, or mains-powered devices; the operating lifetime; ambient lighting conditions such as whether it will be used at night, indoor, sheltered outdoor, outdoor with full sunlight exposure, etc.; and other environmental factors such as humidity and vibration. The optical performance of the resultant system is then determined by display metrology (see section "▶ Display Metrology").
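The screen-size bullet above can be turned into a quick estimate: assuming a visual acuity of about 1 arcmin, the pixel pitch that is just resolvable at a given viewing distance fixes the minimum useful screen size for a chosen resolution. The numbers below are illustrative:

```python
import math

def resolvable_pixel_pitch_mm(viewing_distance_m, acuity_arcmin=1.0):
    """Pixel pitch subtending `acuity_arcmin` at the given viewing distance."""
    return viewing_distance_m * 1000.0 * math.tan(math.radians(acuity_arcmin / 60.0))

def screen_size_mm(h_px, v_px, viewing_distance_m):
    pitch = resolvable_pixel_pitch_mm(viewing_distance_m)
    return h_px * pitch, v_px * pitch

w, h = screen_size_mm(800, 480, 0.5)  # 800 x 480 panel viewed from 0.5 m
print(f"pitch {resolvable_pixel_pitch_mm(0.5):.3f} mm -> at least {w:.0f} x {h:.0f} mm")
```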

Table 2 Consumer versus professional displays

                                              | Consumer displays                            | Professional displays
Minimum order quantity                        | High                                         | Low
Targeted display production                   | ~1 year                                      | >3 years
Optimization                                  | Typically cost                               | Typically performance
ECO (Engineering Change Order) notification   | Maybe                                        | Yes
LTB (last-/lifetime buy) notification         | Maybe                                        | Yes
Requirements                                  | Low                                          | High
Duration per use                              | Minutes (mobile) to hours (PC, TV)           | Several hours to 24/7 use
Lifetime (sum of use time)                    | 1,000–10,000 h                               | Up to 100,000+ h
Operational time (time in use incl. off time) | Several years                                | Decade(s)
Location of production                        | Usually complete value chain located in Asia | Display panels are produced in Asia; value is added in the country where the display is integrated into a system

• Finally, the system has to be built, so mechanical construction, procurement, and production issues will need to be considered.

In many cases, the display technology or module which would fulfill all requirements is not available (e.g., because of the supply chain for professional products, see below), or the price might be too high. Therefore, redesign and optimization are needed to improve a system.

Supply chain and price are often topics which are underestimated, especially when designing professional displays. As flat panel display production lines are expensive, mass production of each display type in vast numbers is the only economical way to run them. Therefore, consumer displays dominate the market with a share of over 90 % (see section "▶ Display Markets and Economics"). All other applications are summarized as professional displays. However, the requirements of consumer electronics and professional displays differ significantly, as summarized in Table 2. Consumer displays are typically produced at high volume for a period of about 1 year, due to the innovation and trend cycles of, e.g., smartphones. As system development time and lifetime are significantly longer, displays intended for professional use are produced consistently for at least 3 years. The optimization criterion for consumer displays is typically low cost, whereas professional systems must provide a certain performance. As with electronics as a whole, obsolescence is also a critical topic for professional displays, so communication between producer and customer (ECO and LTB notifications) is needed.

Professional systems have to fulfill dedicated tasks, so their requirements in terms of system design are significantly higher than those of, for example, a smartphone, where secure operation is not mission critical. Most consumer display systems are operational for periods in the range of hours per day for some years (lifetime), while some professional displays are continuously operated for years (summing up to 100,000 h of operation) and/or under extreme temperature conditions. Since around the year 2000, nearly all displays have been manufactured in Asia; one reason is that mass production of high-volume consumer devices is located there. As professional systems differ to a large extent, manufacturers of professional systems often customize their products (for added value) by modifying, for example, the housing, power supply, and cover lens of a touch module, up to complex systems for individual needs. However, consumer devices set the major trends, and professional displays then follow these trends. An example is the touch screen: touch control was used in industry long before the introduction of Apple's iPhone in 2007. However, soon after that time, touch screens were positively experienced by many people, and so they became a must for many professional display systems as well.

Summary and Directions of Future Research

Displays are the most essential devices in the information age, as they are the link between user and data visualization. To date, there is no single display technology which fulfills all the different needs of every application. Therefore, many different display technologies have been developed since the invention of the CRT. At the time of writing (2015), LCDs, OLEDs, and reflective e-paper are the dominating technologies, with some other display approaches achieving low volumes, as with large LED video walls. Most LCDs and OLEDs are full-color displays capable of presenting multimedia information, but with some issues under bright light conditions. Reflective e-paper displays are readable in sunlight but offer very limited color reproduction. As displays and their resolution set the requirements of a whole system (such as computing and graphics power), it is essential to have a wide knowledge of all aspects of displays – which is the premise of this handbook (and presented as an overview in this section). Current and future research is focused on flexible, plastic (unbreakable), and transparent displays to enable new form factors and applications, like foldable displays for smartphones as an ideal combination of a small device for making phone calls with an unfolded larger screen for browsing the internet. The major challenges are the bending radius and the number of bending cycles without noticeable degradation. However, it is clear that "traditional" flat panel displays are here to stay and have a bright future.

Acknowledgment This opening chapter of this revised edition of the Handbook of Visual Display Technology is dedicated to Mr Chris Williams, UK. His outstanding networking and enthusiasm for displays brought many of these contributing authors together – this Handbook is the result.

Further Reading

Blankenbach K, Gassler G, Koops HWP (2008) Vacuum displays. In: Eichmeier JA, Thumm MK (eds) Vacuum electronics, components and devices. Springer, Heidelberg, pp 85–125
den Boer W (2005) Active matrix LCDs. Newnes/Elsevier, Amsterdam
Lee J-H, Liu DN, Wu S-T (2008) Introduction to flat panel displays. Wiley, New York
MacDonald LW (2012) Display systems: design and applications. Wiley, Chichester
Wu ST, Yang D-K (2014) Fundamentals of liquid crystal devices. Wiley SID, Chichester


Displays for the Built Environment

Peter Dalsgaard* and Kim Halskov

Center for Advanced Visualisation and Interaction (CAVI), School of Communication and Culture, Aarhus University, Aarhus, Denmark

Abstract “Media architecture” is a term for installations in which displays are integrated into architectural structures. In this chapter, we address five unique qualities of media architecture displays: scale, shape, pixel configuration, pixel shape, and light quality. We exemplify this through two media architecture displays, Aarhus by Light and The Danish Pavilion at Expo 2010, followed by a discussion of two key topics for such displays, namely, their integration into the built environment and the potential for interacting with them.

Introduction

Within the field of urban computing (Foth 2009), media architecture is a term for installations in which displays are an integral part of a building's architectural structures (Haeusler 2009). (This chapter is based on the articles: Dalsgaard and Halskov 2010; Dalsgaard et al. 2011; Halskov and Ebsen 2013.) Media architecture is often implemented with LED displays and digital projectors, but the term has also been used to refer to dynamic lighting and mechanical façades, such as Jean Nouvel's Institut du Monde Arabe, where iris-like shutters automatically open or close to adjust to the lighting conditions. In media architecture, a number of genres may be identified: advertising, brand communication, games, and art. The buildings surrounding Times Square in New York, and Hachiko Square in Tokyo, are some of the archetypical examples of commercial advertising by means of media façades. The Allianz Arena football stadium in Munich, sponsored by Allianz, is a prominent exponent of the use of media architecture for branding purposes. Blinkenlights is a classic example of an installation in the games genre, where the windows of a high-rise building were turned into large pixels by placing a lamp behind each one. The pixel matrix formed a low-resolution display, which was used for playing Pong and displaying low-resolution animations. Artists are the driving force behind the creation of installations in the art genre of media architecture, as in the case of Body Movies, an installation by artist Rafael Lozano-Hemmer (Bullivant 2006).

Displays in the context of media architecture differ from conventional displays in several respects, as summarized in Table 1 (Halskov and Ebsen 2013). Whereas a conventional display commonly has a two-dimensional, rectangular shape, a media façade extends into three-dimensional space and may have any shape, including an organic form. The shape of individual pixels in ordinary displays is dot-like or square and ideally hardly noticeable, whereas the shape of pixels in media façades is nonstandard, and often part of the visual expression of the façade. Moreover, the pixels in conventional displays are organized in a grid or matrix structure, whereas in media façades, there exists no standardized way of organizing pixels; see Table 1.

*Email: [email protected] Page 1 of 8

Table 1 Comparison of media façades with conventional displays (Halskov and Ebsen 2013)

                    | Media architecture displays | Conventional displays
Scale               | Several meters              | Centimeters, to about 1 m
Shape of display    | Two- and three-dimensional  | Two-dimensional rectangular
Pixel configuration | Not standardized            | Grid or matrix
Pixel shape         | Not standardized            | Square
Light quality       | Not standardized            | Standardized

Table 2 Five specific qualities of media architecture displays (Halskov and Ebsen 2013)

Scale: Scale is important to address in order to understand the size and volume of the building. In contrast to conventional displays, with dimensions commonly measured in inches, media façades are huge. Most representations (drawings, models, and simulations) are scaled down to a smaller format, making it difficult to comprehend the actual scale of the media façade.

Shape: Shape refers to both the outer perimeter of the media screen and the shape of the image surface. Whereas traditional displays are flat, rectangular surfaces, media façades may have any shape and may even curve along the corners, bends, and curves of a building. This aspect is difficult to visualize on a two-dimensional computer screen.

Pixel configuration: Pixel configuration is the layout or pattern of pixels on the media façade, which on traditional screens is a grid system of equal and perpendicular lines. Media façades, on the other hand, may use any configuration of pixels, creating complex patterns on a building façade. Translating media content from a conventional digital medium often requires techniques like subpixel sampling in order to accommodate the nongrid configuration of some media façades.

Pixel shape: Pixel shape refers to the physical form of the pixels in the façade. Traditionally pixels are square shapes (preferably not identifiable as individual pixels), but on media façades the pixels can be any shape, determined by the lighting fixture or the architectural element that structures the configuration of pixels. On some media façades the pixel shape makes the visibility highly dependent on the viewing angle.

Light quality: Light quality is crucial to how smoothly colors are displayed, and how bright the media façade is. Brightness is mostly an issue of whether the media façade can be viewed in daylight or only in evening light conditions. The type of lighting fixture and the use of diffusers and reflectors may also produce visual qualities that are different from what traditional displays may produce.

Although media façades act as displays, to some extent, their scale, integration into buildings, and context raise many questions that need to be addressed. In this chapter, we focus on two areas of design concern in media architecture: (1) the interface and (2) the integration into physical structures and surroundings, two of the eight main media architecture design challenges identified by Dalsgaard and Halskov (2010). The others are (3) increased demands for robustness and stability, (4) developing content to suit the medium, (5) aligning stakeholders and balancing interests, (6) diversity of situations, (7) transforming social relations, and (8) emerging and unforeseen uses. In this chapter, we address five specific qualities of media architecture displays: scale, shape, pixel configuration, pixel shape, and light quality (see Table 2). We exemplify these through two media architecture displays, Aarhus by Light and The Danish Pavilion at Expo 2010, followed by a discussion of two key topics for such displays, namely, their integration into the built environment and the potential for interacting with them.

Fig. 1 The Danish Pavilion at Expo 2010

Two Media Architecture Installations

The Danish Pavilion at Expo 2010

The Danish Pavilion at Expo 2010 was designed by the Danish architectural firm BIG. The façade of the helical building was perforated by approximately 3600 holes of various sizes, which created an expressive surface that gave the building a characteristic visual texture. The façade was almost 300 m long, with a unique double-loop shape making the building appear, from some angles, as two stacked bands, one above the other (see Fig. 1). When our research laboratory, CAVI (Halskov 2011), became involved with the Danish Pavilion project, the idea was that the holes would be simple openings, but we suggested turning each hole in the façade into an individual pixel. The proposed solution was to furnish each hole with an LED lighting fixture behind a tube of semiopaque material, making each hole appear as an illuminated, tube-shaped pixel. During the design process, several combinations of lighting fixtures and tube materials were tested in order to investigate light quality, including color and intensity. A particular goal was ensuring that the PVC tubes diffused light uniformly. Owing to the three-dimensional form of the tubes, the shape of the individual pixels depended on the viewing angle (see Fig. 2). The diverse sizes of the holes further added to the uniqueness of the display. Owing to the helical shape and size of the building, the wavy shape and scale of the display were quite extraordinary. If unfolded, the display would have been 300 m long and 12 m high, with pixels organized in 627 columns, with a pixel distribution unique to each column. Moreover, the unique pixel configuration had an aspect ratio of 25:1, which is 13 times wider than what we normally define as "widescreen." The pavilion's scale, combined with the extraordinary shape of the display, the individual pixel shape, and the irregular pixel configuration, imposed serious limitations (Biskjaer and Halskov 2014) with respect to the types of content it made sense to show on the low-resolution display. The low resolution excluded the possibility of displaying conventional video, and the irregular pixel configuration, with its lack of horizontal lines, made traditional geometric figures very hard to perceive.

Page 3 of 8

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_202-1 # Springer-Verlag Berlin Heidelberg 2015

Fig. 2 Mock-up for testing the individual pixels during the design process

Fig. 3 Content on the low-resolution display

horizontal lines, made traditional geometric figures very hard to perceive. The final content was limited to white surfaces broken by lines, fades, or silhouettes of people walking or bicycling along the façade. For the dark evenings, the fixtures faded to a warmer color, and a show reel of animations was displayed on the façade, including shimmering, abstract graphics, sweeps, fades, and animations along the entire length of the façade (Halskov and Ebsen 2013), see Fig. 3.

Aarhus by Light

Aarhus by Light was an interactive media façade at Musikhuset concert hall in Aarhus, Denmark, which was installed and run 24/7 for 2 months in 2009 (Dalsgaard et al. 2011). The scale of the media façade was, for that time, very large: 180 m2 (1,937 sq ft), integrated into the 700-m2 glass façade of the concert hall, which is located in a central public park. The installation was developed through a partnership between our research lab, CAVI (Halskov 2011), and the lighting company Martin Professional, with additional interactive visuals developed by the Animation Workshop, Denmark.

The installation was developed to attract attention to the award-winning concert hall building and to enrich the experience of the park, which at that time was mostly a transit area. At the same time, it served as an experiment in how interactive media architecture could affect and transform social relations in a public space. As a consequence, the installation was designed to run evolving, interactive content, rather than prerendered visuals, and to offer park visitors and passers-by the means with which to interact with the façade in an accessible and easily recognizable manner. The latter aspect was accomplished by developing software to detect and track the silhouettes of people in three designated areas of the park, so-called interaction zones, so that people could interact with the façade solely by moving about and gesturing.

The pixel configuration of the display consisted of panels composed of 25 × 50 pixels (4 cm dot pitch), which were assembled into a display of 1250 × 150 pixels. The rectangular LED panels matched the existing glass façade modules of the concert hall. The shape of the display was configured as an irregular form measuring approximately 50 × 6 m across the front of the façade, facing the park, see Fig. 4.

Fig. 4 Musikhuset concert hall with the media facade installation and the three interaction zones in the park


Fig. 5 Luminous creatures inhabiting the building and the city skyline

A small portion of the display was placed on the side of the building, creating the impression that the display stretched around the entire building. We chose this shape purposely to go against the traditional, rectangular form of most displays, and thereby indicate to observers that this installation was beyond the ordinary.

The installation displayed three forms of content. First, and most prominently to passers-by, were 30 so-called luminous creatures, which moved around the structure of the building, seemingly inhabiting it. Each creature was autonomous, guided by a set of algorithms with a degree of randomness, to create emergent and never fully predictable patterns of movement and interaction. Second, and less prominently, a skyline of prominent local buildings and structures was rendered in the background, to establish a relationship to the city. Third, silhouettes of passers-by and visitors to the concert hall were captured via cameras in dedicated areas in the park and displayed on the façade, see Fig. 5.

The installation was designed to offer an intriguing experience for people in the park, by drawing them into the installation. When people entered one of the three designated interaction zones in the park, their silhouettes appeared on the façade, Figs. 4 and 5. This enabled them to interact with the luminous creatures. The creatures would exhibit different behaviors, such as waving and greeting the silhouettes, being pushed away by them, crawling onto them, jumping off them, kissing them, kicking them, and so on. In so doing, we sought to establish temporary relations among the various luminous creatures and the visitors to the park. Although the behavior of the creatures was to a large extent randomly determined by the underlying algorithms, we found that many visitors created coherent stories about the interaction, to make sense of the installation. Furthermore, many visitors would interact with each other's silhouettes, for instance, by engaging in shadow boxing or playing "catch."

In addition to the display being physically integrated into the façade, the integration of the display into the building was supported by having the content relate to the building, for instance, by having the luminous creatures crawl up and down the steel framework or enter and exit doors at the frames of the façade, Fig. 5.

Discussion

Integrating Media Architecture Displays into Buildings and Spatial Surroundings

When it comes to developing media architecture displays, two principal and interrelated challenges are to determine how to integrate displays into physical structures (typically, buildings) and how to establish a suitable fit with the surroundings, that is, the landscape as well as the built environment. The introduction of a new piece of media architecture may constitute a radical intervention into existing architecture and spaces and into the sociocultural situations and routines that unfold in them. This may obviously affect a specific building into which media architecture is integrated, for instance, when a building façade becomes a display. It may also affect the surrounding space less obviously, for example, when huge displays catch the attention of people in the space and alter their behavior.

There are various strategies for integrating media architecture displays. They may be designed as an integral part of a façade, as we see with the Danish Pavilion and the BIX façade of Kunsthaus Graz. Such integration usually requires media architecture designers to be part of a project from the very early stages. This may make it easier for designers to establish a strong and meaningful fit between the interactive installation and the intended form and function of the building. On the flip side, deeply integrated media architecture displays may be very costly to upgrade and change. Digital technologies change at a much greater pace than architecture, and therefore, this approach may lead to installations that appear dated, unless their appearance and behavior are thoroughly considered and developed to suit the building.

Another strategy is to integrate media architecture displays into existing architecture, as seen with Aarhus by Light, or in spaces such as Times Square, where displays are attached to existing buildings. In many respects, the benefits and drawbacks of this approach are opposite to those associated with the integrated display approach described above. Such displays may be replaced frequently or updated at less cost and without compromising the existing building. However, it may be harder to establish a good fit with the existing building. If we consider the example of Times Square, the displays may give the appearance of being plastered onto existing buildings. However, the Aarhus by Light example illustrates that much may be done to create a better fit with existing architecture. In this case, LED panels were fitted into the steel lattices of the building, matching the existing window panels. Also, the LEDs were distributed so that the display appeared semitransparent. This meant that in daylight, and when the display was turned off at night, the display was barely perceptible. Integration into the existing form of the building was further enhanced by programming the luminous creatures to walk along and crawl up and down the lattices of the building.

Interacting with Media Architecture

A key feature of media architecture displays is that they enable designers and architects to reconfigure the relations between people and architecture by developing ways for people to interact with the displays. However, a primary challenge when designing such displays is that we can seldom rely on well-known, off-the-shelf technologies to capture user input. Instead, designers often have to modify and/or develop new types of interfaces. Although this may change as media architecture becomes more widespread, currently, designers must consider not only the implementation of the display and the development of prerendered content for it but also the development of interfaces and dynamic content suited for interaction.

The Aarhus by Light case illustrates these challenges. In terms of designing and implementing the interface, we had to develop an input form that could be operated by anyone in the park, and to design content that would respond to this input in a way that required little or no learning or preexisting knowledge. For this reason, we developed a camera-tracking system, installed in weather-proof housing and mounted on top of existing light posts, that could capture the three interaction zones. The video feeds were then processed to isolate the silhouettes, in order to create forms that park visitors could immediately recognize, and whose movements they could quickly link to their own movement. As the field matures, we may expect to see more of the abovementioned technologies in an accessible form that will be easier for designers and developers to implement, but at the moment, most interactive media architecture displays rely on a combination of custom-developed software and hardware.
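Although the original installation relied on custom-developed software, the silhouette-isolation step can be approximated today with off-the-shelf computer vision tools. Below is a minimal sketch using OpenCV background subtraction; the video file name and tuning values are illustrative assumptions, not details of the actual Aarhus by Light system.

```python
import cv2

# 'zone.mp4' stands in for one interaction zone's camera feed (assumption).
cap = cv2.VideoCapture('zone.mp4')
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)   # 255 = foreground, 127 = shadow, 0 = background
    _, silhouette = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)  # discard shadows
    silhouette = cv2.medianBlur(silhouette, 5)  # suppress speckle noise
    # 'silhouette' can now be downsampled to the facade's LED pixel grid

cap.release()
```

A production system would additionally have to cope with changing daylight, weather, and long-term background drift, which is part of why custom software was needed.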


A similar logic applies to the content of media architecture displays: As we have discussed, they differ from most traditional displays in terms of properties such as scale, resolution, brightness, exposure, and shared use. In our experience, this means that often, you have to scrap your preexisting knowledge of which content will work on a display, and how to interact with it. In effect, media architecture displays are a new medium – just as a smartphone is a different medium than a traditional computer – and in most cases, copying existing content and interaction forms does not lead to good results. Instead, we must reconsider which forms of content it makes sense to present on and through architecture, and how people may meaningfully interact with it, given the opportunities and constraints these large displays and their placement encompass.

Further Reading
Biskjaer MB, Halskov K (2014) Decisive constraints as a creative resource in interaction design. Digit Creat 25(1):27–61
Bullivant L (2006) Responsive environments: architecture, art and design. V&A, London
Dalsgaard P, Halskov K (2010) Designing urban media façades: cases and challenges. In: Proceedings of CHI 2010. ACM, New York, pp 2277–2286
Dalsgaard P, Dindler C, Halskov K (2011) Understanding the dynamics of engaging interaction in public spaces. In: INTERACT 2011, vol 6947, Springer lecture notes in computer science. Springer, Berlin, pp 212–229
Foth M (2009) Handbook of research on urban informatics: the practice and promise of the real-time city. IGI Global, Hershey
Haeusler MH (2009) Media facades: history, technology, content. Avedition, Ludwigsburg
Halskov K (2011) CAVI – an interaction design research lab. Interactions 18(4):92–95
Halskov K, Ebsen T (2013) A framework for designing complex media facades. Des Stud 34(5):663–679



Resistive Touch Screen Technology
Robert Phares*
Display Sourcing & Service LLC, Knoxville, TN, USA
*Email: [email protected]

Abstract

Resistive is one of the oldest types of touch screen, having been produced in some form since the mid-1970s. These touch screens were once the largest-selling products in the touch screen business, but they have been surpassed in the last few years by PCAP (projected capacitive), mainly for smartphones and tablet computers. Four-wire resistive touch screens are the simplest type of analog resistive touch screen, and they are considered first. Many of the concepts applicable to other types of resistive touch screens are introduced in this section.

Synonyms and Definitions of Key Terms, Phrases, and Acronyms
3W: Three-wire, a three-wire touch screen
4W: Four-wire, referring to the number of connections between the controller and the sensor of a type of resistive touch screen
5W: Five-wire, a five-wire resistive touch screen
8W: Eight-wire, an eight-wire touch screen
Active area: The portion of a touch screen sensor that is responsive to touch by a finger or stylus
GUI: Graphical user interface, a type of computer user interface that uses mouse-clickable icons displayed on a video screen instead of text commands to select functions and initiate application programs
ITO: Indium tin oxide, a transparent conductive film common in touch screen and display products
Viewable area: The portion of a touch screen sensor that is reasonably transparent and includes the smaller active area
VLT: Visible light transmission, a measure of the percentage of human-perceivable light which passes through a surface

Four-Wire Resistive

The simplest analog resistive touch screen is known as the "four-wire" (4 W) resistive. The concept of this touch screen is that of an orthogonal two-axis voltage divider.
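As a rough illustration (not any vendor's firmware), the position arithmetic behind this voltage divider reduces to scaling two ADC readings; the sketch below assumes an ideal uniform sheet, a 10-bit ADC referenced to the drive voltage, and illustrative panel dimensions.

```python
# Minimal sketch of the two-axis voltage-divider arithmetic in a 4 W design.
# The ADC resolution, panel dimensions, and function names are assumptions.

ADC_MAX = 1023  # full-scale reading when the sensed voltage equals the drive voltage

def to_position(adc_x, adc_y, width_mm=70.0, height_mm=50.0):
    """Map two sequentially sampled divider voltages to panel coordinates.

    Phase 1: drive the X plane, sense through the Y plane -> adc_x.
    Phase 2: drive the Y plane, sense through the X plane -> adc_y.
    For an ideal uniform resistive sheet, each reading is proportional
    to the touch position along the driven axis.
    """
    x = (adc_x / ADC_MAX) * width_mm
    y = (adc_y / ADC_MAX) * height_mm
    return x, y

print(to_position(512, 256))  # roughly mid-panel horizontally, upper half vertically
```

Real controllers typically add per-unit calibration (offset and scale factors) on top of this to absorb electrode resistance and manufacturing variation.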



Principle of Operation

In the 4 W product, two conductive planes are developed on transparent conductive substrates, each dedicated to one of the orthogonal coordinate axes. In operation, an electric field is sequentially impressed on the two conductive planes, with the fields being orthogonal to each other. From each plane, a voltage proportional to the position of touch on that plane is passed to the controller for measurement. The details of operation are explained with reference to Fig. 1, where it is assumed that the X or horizontal axis is the lower plane and the Y or vertical axis is the upper plane. While the planes are shown widely separated, the actual construction of a practical touch screen incorporates separations of 0.2 mm or less. In operation as described here, the controller applies a potential, usually no greater than 5 VDC, across highly conductive "bias" or "drive" electrodes spanning each end of the X-axis. Each conductive plane is typically a thin transparent plastic or glass substrate with an applied transparent conductive coating. The sheet resistance, or resistivity, of the coating is usually 150–800 Ω per square (Ω/□) in commercially available coated glass and film, and the resistivity of the electrodes is typically


Fig. 5 Classification of optical compensation films

the deviation angle between the first and second polarizers. These two origins produce a polarization difference between the light exiting the cell and the absorption point of the second polarizer (Fig. 4b), which corresponds to the light leakage. Optical compensation films are designed to cancel out this polarization difference. They are placed at a suitable position between the first and second polarizers. The most suitable film depends on the LCD mode, as shown in section "Application for LCDs with Compensation Films."

Classification of Optical Compensation Films

To compensate the LC retardation and the deviation angle of the two crossed polarizers, various types of optical compensation films have been used. Different LC modes need different types of compensation films to obtain satisfactory compensation. The films are classified into uniaxial films, biaxial films, and others. Uniaxial and biaxial films are distinguished by their three-dimensional refractive indices (nx, ny, and nz), i.e., by their index ellipsoid (Fig. 5).


Uniaxial films are anisotropic birefringent films having only one optic axis. They can be further classified into "A films" and "C films." An A film's optic axis is parallel to the film surface, while a C film's optic axis is perpendicular to the film surface. Both A films and C films can be further divided into positive (+) or negative (−) films depending on the relative values of the extraordinary refractive index ne and the ordinary refractive index no: a positive uniaxial film has ne > no, while a negative uniaxial film has ne < no. Biaxial films are anisotropic birefringent films having two optic axes. They can be classified into three types depending on whether their two optic axes lie in the xy, yz, or xz plane.

The optical properties of uniaxial and biaxial compensation films are characterized by "Re," "Rth," and "Nz," which are calculated from the index ellipsoid and the film thickness. Re (in-plane retardation) corresponds to the retardation at normal incidence, and Rth (out-of-plane retardation) corresponds to the retardation at oblique incidence. The relationship between Re and Rth is represented by Nz. The definitions of Re, Rth, and Nz are shown in Fig. 5. Uniaxial and biaxial compensation films are generally used for VA-mode and IPS-mode LCDs to compensate the dark state of the liquid crystal layer, as described later.

Other compensation films are hybrid-type or twist-type films, which are neither uniaxial nor biaxial media. These films are well suited to compensating the complicated liquid crystal deformations in the dark state of TN-, STN-, and OCB-mode LCDs (Ito et al. 2013; Uesaka et al. 2005). Hybrid-type films compensate the voltage-on states of TN- and OCB-mode LCDs, while twist-type films compensate the voltage-off states of STN-mode LCDs.
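For reference, the commonly used definitions are sketched below, where d is the film thickness; note that sign conventions, particularly for Rth, vary somewhat across the literature and between manufacturers.

```latex
R_e = (n_x - n_y)\, d, \qquad
R_{th} = \left(\frac{n_x + n_y}{2} - n_z\right) d, \qquad
N_z = \frac{n_x - n_z}{n_x - n_y} = \frac{R_{th}}{R_e} + \frac{1}{2}
```

The last identity explains why contours of constant Nz appear as straight lines through the origin of the Rth–Re plane in Fig. 5.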

Application for LCDs with Compensation Films

Optical compensation films are applied to eliminate the light leakage in the dark states of LCDs. They are placed at proper positions in LCDs, and the type of suitable film depends on the LCD mode.

VA LCDs

The LC director in a VA LCD is aligned perpendicularly to the cell surfaces in the dark state. Therefore, the LC cell is a "homeotropic," uniaxial medium; it is effectively a +C film. To compensate this LC cell, a −C film is placed between the first polarizer and the second polarizer. The optic axis aligned along the z-axis does not create any retardation at normal incidence. For an oblique incidence of light, the positive retardation due to the LC cell can be compensated by the negative-phase retardation of the −C film. Combined with the −C film, another film is used to compensate the deviation angle of the two crossed polarizers; a +A film is usually used for this purpose (Ohmuro et al. 1997). This compensation principle (A&C type) is explained using the Poincaré sphere in Fig. 6a. The polarization state of light from the first polarizer is rotated by the +A film around the axis of the +A film, suppressing the deviation angle of the two crossed polarizers. Subsequently, the polarization state is rotated by the −C film to a point from which it is rotated by the LC layer to the second polarizer's absorption point. The proper retardations of the +A film and −C film are chosen so that the polarization state reaches the second polarizer's absorption point; consequently, the light leakage is completely suppressed. The arrangement can be modified by switching the positions of the +A film and −C film as long as the conditions of the compensation scheme are satisfied.

Biaxial films can also be used for VA-LCD compensation. Biaxial films simultaneously compensate both the LC retardation and the deviation angle of the two crossed polarizers. There are two types of compensation schemes for biaxial films: a one-film type (Ishiguro et al. 2010) and a two-film type (Takeda et al. 2011).




Fig. 6 Panel configuration and compensation principles of VA LCDs: (a) A&C type (−C film's Rth = 210 nm, +A film's Re = 140 nm), (b) biaxial 1-sheet type (biaxial film's Re = 55 nm, Rth = 220 nm), (c) biaxial 2-sheet type (biaxial films' Re = 55 nm, Rth = 125 nm). VA cell's retardation = 300 nm; viewing angle: azimuthal angle = 45°, polar angle = 60°

Figure 6b shows the one-film type compensation. The polarization state of light from the first polarizer is rotated by a biaxial film to a point from which it is rotated by the LC layer to the second polarizer's absorption point; the retardations Re and Rth of the biaxial film are chosen accordingly. This simple compensation scheme has the advantage of requiring only one compensation film. Figure 6c shows the two-film type compensation. The light from the first polarizer is rotated by the first biaxial film, then rotated by the LC layer, and finally rotated by the second biaxial film to the absorption point of the second polarizer; again, the proper Re and Rth of both biaxial films are chosen accordingly. Since the two-film compensation scheme is geometrically symmetrical on the Poincaré sphere, as shown in Fig. 6c, it provides a more symmetrical viewing-angle property than the other types.



IPS LCDs

The compensation scheme for an IPS LCD differs from that for a VA LCD in how the LC cell is treated. The LC molecules in an IPS LCD are aligned parallel to the cell surfaces in the dark state. Therefore, the LC cell is a "homogeneous," uniaxial medium; it is effectively a +A film. The optic axis of the IPS cell is parallel to the absorption axis of one of the two crossed polarizers; here, we assume it is the first polarizer. This configuration allows the light from the first polarizer to pass through the LC cell as the ordinary wave. There is no change of polarization due to the LC cell because only the ordinary wave exists. That means, to eliminate the light leakage, we only need to consider the leakage due to the deviation angle of the crossed polarizers and not the retardation of the LC cell. The important point is that the compensation films need to be placed between the LC cell and the second polarizer, whose absorption axis is perpendicular to the LC cell's slow axis.

The typical compensation of an IPS cell uses one biaxial film whose retardation is Re = λ/2 (half wave) and Nz = 0.5 (optic axes in the xy plane) (Saitoh et al. 1998). As shown in Fig. 7a, the polarization state of light from the first polarizer is not affected by the LC cell. Subsequently, the biaxial film directly rotates the polarization to the absorption axis of the second polarizer. Another approach uses two uniaxial films: a +C film and a +A film (Chen et al. 1998). As shown in Fig. 7b, the light from the first polarizer is not affected by the LC cell. Subsequently, the +A film rotates the polarization, followed by another rotation by the +C film to the absorption axis of the second polarizer. This compensation scheme can be modified by switching the positions of the +C film and the +A film as long as the conditions of the compensation scheme are satisfied. Under this circumstance, the +C and +A films can alternatively be replaced by −C and −A films.
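As a quick check of the half-wave condition against the example values in Fig. 7a, assuming a design wavelength of 550 nm (a common choice near the peak of photopic sensitivity):

```latex
R_e = \frac{\lambda}{2} = \frac{550\,\mathrm{nm}}{2} = 275\,\mathrm{nm}
```

which matches the biaxial film's Re quoted in the caption of Fig. 7.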


Fig. 7 Panel configuration and compensation principles of IPS LCDs: (a) biaxial 1-film type (biaxial film's Re = 275 nm, Rth = 0 nm (Nz = 0.5)), (b) A&C type (+A film's Re = 140 nm, +C film's Rth = 90 nm). IPS cell's retardation is 300 nm; viewing angle: azimuthal angle = 45°, polar angle = 60°




Fig. 8 Manufacturing methods of base films in optical compensation films


Fig. 9 Methods to give optical retardation in optical compensation films

Other compensation types for IPS LCDs have also been proposed, namely, the combination of a +A film and a −A film (Zhu and Wu 2005) and the combination of two biaxial films, which achieves compensation over a wide wavelength band (Ishinabe et al. 2002). In addition, low (nearly zero) retardation films have been proposed as protective films of polarizers for IPS LCDs, since no retardation between the LC cell and the first polarizer is preferred, to minimize the effect of the retardation of the LC layer (Nakayama et al. 2005).

Film Manufacturing Methods and Materials

Optical Compensation Films

The base polymer films in compensation films are manufactured mainly by two methods: the solvent-casting method and the melt extrusion method. The solvent-casting method, shown in Fig. 8a, is typically used with triacetyl cellulose (TAC) materials. The resin and the necessary additives are dissolved to form a pourable liquid and pumped through a die onto a moving belt. After the liquid settles to a uniform thickness on the belt, an oven heats it and vaporizes the solvent completely, leaving only the resin as a film. The advantage of the solvent-casting method is homogeneous mixing of additives, owing to the ability to mix them at low temperature and low viscosity. By contrast, the melt extrusion method, shown in Fig. 8b, is used with cyclic olefin polymer (COP) and other materials that have relatively low glass transition temperatures (Tg). A resin melted at high temperature (above Tg) is extruded through a die and forms a continuous sheet. The sheet is spread and cooled on roller stacks, yielding a film. The advantages of the melt extrusion method are a solvent-free process and more compact machinery.

These base polymer films are subsequently given optical anisotropy (retardation), yielding optical compensation films. According to the method used to generate the anisotropy, the films are divided into two groups: stretched polymer films and coated films.

Figure 9a shows a diagram of stretched films. When the base polymer film is stretched, the polymer chains are oriented along the stretching direction.



Since the refractive indices differ between the directions parallel and perpendicular to the polymer chain, an optical anisotropy is generated. When stretching is applied in one direction (i.e., the longitudinal direction), this is referred to as uniaxial stretching, while stretching in two directions (i.e., longitudinal and lateral) is called biaxial stretching. The base polymer film materials, which include TAC, COP, cyclic olefin copolymer (COC), polycarbonate (PC), polystyrene (PST), polymethyl methacrylate (PMMA), and other suitably oriented organic birefringent materials, have their own intrinsic birefringence. When the intrinsic birefringence is positive (TAC, COP, COC, PC, etc.), the stretched film exhibits a large refractive index along the stretching direction and is suitable for +A films, −C films, and biaxial films whose Nz > 0. On the other hand, when the intrinsic birefringence is negative (PST, PMMA, etc.), the stretched film exhibits a large refractive index perpendicular to the stretching direction and is suitable for −A films, +C films, and biaxial films whose Nz < 0.

Figure 9b shows a diagram of coated films. Anisotropic materials (usually LC compounds) are coated on the polymer base films (TAC, COP, etc.). Since the anisotropic materials have large birefringence compared with the polymers used in stretched films, the coating layer can be thin (i.e., submicrometer). This is the advantage of coated films for application in thinner LCDs. A +A film is obtained by coating rod-like LC materials with parallel (homogeneous) alignment, and a −A film of negative-birefringence material is fabricated by using coplanar-aligned films of discotic materials (Ito et al. 2013). A +C film is fabricated by the use of vertically (homeotropically) aligned rod-like LC materials, whereas a −C film is fabricated by the use of vertically aligned discotic materials. Moreover, hybrid-type and twist-type films are fabricated by coating the rod-like LC materials or discotic materials with a specifically controlled alignment treatment, such as rubbing or adding an alignment-controlling agent to the LC materials (Ito et al. 2013; Uesaka et al. 2005).

Polarizers

A polarizer is an optical filter that passes light of only a specific polarization. For LCDs, it is placed on both sides of the LCD panel. Polarizers are essential components of the LCD panel for obtaining optical switching performance (dark and white). The most common polarizer is the "PVA-type" polarizer (Land 1951). As shown in Fig. 10, it comprises a polyvinyl alcohol (PVA) film and two protective polymer films (typically TAC films). After the PVA film is dyed with an iodine complex in a solution-dyeing bath, it is uniaxially stretched. The PVA polymer chains are aligned along the stretching direction, and the iodine complexes are aligned along the PVA polymer chains. Iodine complexes are optically uniaxial media; they absorb light polarized parallel to their long axis and transmit light polarized perpendicular to it. As a result, polarized light is generated. The degree of polarization is given by (Ts − Tp)/(Ts + Tp), where Ts and Tp are the transmittances for the two orthogonal linear polarizations (maximum and minimum transmission, respectively), as shown in Fig. 10. Because of the high order of alignment of the iodine complexes along the PVA polymer chains, the PVA-type polarizer shows a high degree of polarization (i.e., >0.999). The density of the iodine complex is important for high brightness and small coloring.

Fig. 10 Structure of a PVA polarizer: a PVA film between two protective films, with iodine complexes aligned along the PVA molecular chains defining the absorption axis

Aside from PVA-type polarizers, there are coated-type polarizers and wire-grid polarizers. Coated-type polarizers (Peeters et al. 2006) are composed of dichroic dyes that are aligned uniformly in one direction. They are used for applications that need thin films. In addition, wire-grid polarizers (Yu and Kwok 2003) are composed of periodic metal nanostructures. They have high heat resistance because they are made of metal instead of polymer. Currently they are mostly used in small optical devices such as LCD projectors.

Challenges

Optical compensation films will continue to improve their compensation performance to reduce the leakage of light. One of the key optical properties of an optical compensation film is its wavelength dispersion. Proper wavelength dispersion can compensate light leakage over a wide range of wavelengths and reduce the color shift at oblique angles. Some optical films exhibit "reverse dispersion," meaning that the birefringence increases at longer wavelengths. The dispersion can be controlled by blending different chemical units (positive and negative birefringent units) in the polymer (Uchiyama and Yatabe 2003) or by additives in the polymer (Takeda et al. 2011). Meanwhile, a reduction of the haze in the films is also important for reducing light leakage; techniques such as reducing microscopic anisotropic nonuniform structures inside the stretched film can be applied (Aminaka et al. 2010). Both the wavelength-dispersion and haze-reduction aspects need further improvement to realize "true black" display performance.

Thinner optical compensation films will become necessary to satisfy the demands of mobile applications such as smartphones and tablets. The film substrate must be thinner. For that purpose, liquid crystal materials are expected to be used further for compensation films, because their high birefringence allows very thin layers. Precise control of alignment and thickness, low light leakage, and ease of the coating process are the main challenges for liquid crystal materials in thin optical compensation films.

Further Reading
Aminaka E, Hayashi H, Nakayama H, Ishiguro M, Saitoh Y, Ito Y, Mihayashi K, Kishima M, Tanaka H (2010) Development of low haze VA compensation TAC film and proposal of compensation film arrangement for improving CR in VA panel. SID'10 Dig 41:822
Born M, Wolf E (1999) Principles of optics. Cambridge University Press, Cambridge. ISBN 978-0521642224
Chen J, Kim K-H, Lyu J-J, Souk JH, Kelly JR, Bos PJ (1998) Optimum film compensation modes for TN and VA LCDs. SID'98 Dig 29:315
Ishiguro M, Sekiguchi M, Saitoh Y (2010) New approach to enhance contrast ratio at normal incidence by controlling the retardation of optical compensation film in vertically aligned liquid crystal displays. Jpn J Appl Phys 49:030208
Ishinabe T, Miyashita T, Uchida T (2002) Wide-viewing-angle polarizer with a large wavelength range. Jpn J Appl Phys 41:4553
Ito Y, Watanabe J, Saitoh Y, Takada K, Morishima S, Takahashi Y, Oikawa T, Arai T (2013) Innovation of optical films using polymerized discotic materials: past, present and future. SID'13 Dig 44:526


Land EH (1951) Some aspects on the development of sheet polarizers. J Opt Soc Am 41:957
Nakayama H, Fukagawa N, Nishiura Y, Nimura S, Yasuda T, Ito T, Mihayashi K (2005) Development of low-retardation TAC film. IDW/AD'05 Dig 36:1317
Ohmuro K, Kataoka S, Sasaki T, Koike Y (1997) Development of super-high-image-quality vertical-alignment-mode LCD. SID'97 Dig 28:845
Peeters E, Lub J, Steenbakkers JAM, Broer DJ (2006) High-contrast thin-film polarizers by photo-crosslinking of smectic guest–host systems. Adv Mater 18:2412
Saitoh Y, Kimura S, Kusafuka K, Shimizu H (1998) Optimum film compensation of viewing angle of contrast in in-plane-switching-mode liquid crystal display. Jpn J Appl Phys 37:4822
Takeda J, Tooyama H, Fujiwara I, Andou T, Ito Y, Mihayashi K (2011) High performance VA-LCD compensation film made with environmentally friendly material for VA-LCD. SID'11 Dig 42:50
Uchiyama A, Yatabe T (2003) Control of wavelength dispersion of birefringence for oriented copolycarbonate films containing positive and negative birefringent units. Jpn J Appl Phys 42:6941
Uesaka T, Nishimura S, Toyooka T, Yoda E (2005) Wide-viewing-angle transflective LCD using hybrid aligned nematic compensators. SID'05 Dig 36:742
Wu S-T, Yang D-K (2014) Fundamentals of liquid crystal devices, 2nd edn. Wiley, New York. ISBN 978-1118752005
Yeh P, Gu C (2013) Optics of liquid crystal displays, 2nd edn. Wiley, New York. ISBN 978-0470181768
Yu XJ, Kwok HS (2003) Optical wire-grid polarizers at oblique angles of incidence. J Appl Phys 93:4407
Zhu X, Wu ST (2005) Super wide view in-plane switching LCD with positive and negative uniaxial A-films compensation. SID'05 Dig 36:1164



Consumer Imaging II: Faces, Portraits, and Digital Beauty
Peter Corcoran* and Petronel Bigioi
National University of Ireland Galway, College of Engineering and Informatics, Galway, Ireland
FotoNation Ltd., Galway, Ireland
*Email: [email protected]

Abstract

In this second article, we explore some of the most recent technologies to make their way into today's digital imaging devices. Photography is primarily about people, and most of our photographs feature our family and friends. Here we explain how today's cameras can detect and track faces, and even facial features, in real time. We look at some of the ways that the growing computational power available in cameras can help analyze, evaluate, and enhance images based on information derived from the faces in a scene. We also look at how sophisticated eye tracking and analysis are now feasible, give an overview of the classic red-eye defects that occur when flash photography is used, and explain how red-eye correction became the first computational imaging solution to reach the mass market. Finally, we review the implementation of a range of subtler and more sophisticated enhancements that can be applied to improve our portrait images and enhance our personal appearance in both photographs and video clips.

Definitions of Key Terms
AAM: Active appearance model
CCD: Charge-coupled device
ISP: Image signal processor
IPP: Image processing pipeline
RIP: In-plane rotation
ROP: Out-of-plane rotation
TMM: Template-matching module
WQVGA: Wide quarter video graphics array
WFOV: Wide field of view

Introduction

In ▶ Consumer Imaging I – Processing Pipeline, Focus and Exposure of this series, the workings of a modern digital camera were explored in some detail. In particular, the digital processing of the data output from an image sensor was considered, and it was shown that significant post-processing occurs to convert the raw sensor data into a final image that will please the consumer. It does not take a great deal of additional consideration to understand that state-of-the-art image processing technology is powerful enough to do a lot more than conventional improvements to the raw image data.



In fact modern digital imaging devices can do a lot more, and in the last decade, there has been a range of new technologies that enable quite detailed analysis of the imaged scene, coupled with sophisticated modifications to the underlying image to improve it in less subtle ways. In this second article, we explore some of these "new" technologies and explain how today's cameras can detect faces, evaluate and correct images for flash defects, and implement a range of subtler and more sophisticated enhancements to improve our portraits and enhance our personal appearance in both photographs and video clips. It is a fascinating journey, but let's begin with the center of most of our personal imaging – the human face.

Photographic Composition and Faces

It is well known that human faces are the most photographed subject matter for both amateur and professional photographers. Some statistics show that almost 80 % of the pictures taken globally have human faces in them. In fact it is the people in our photographs and video clips that personalize them and make them part of our individual life record. Our families and friends are at the heart of our lives and form the basis for most of the material in our personal image and video collections.

It is no surprise that the human visual system is particularly sensitive to faces and our attention is drawn to them in photographs and video. In eye-tracking experiments with images that include a human being, subjects tend to focus first and foremost on the face, and more specifically on the eyes, and only later search the image surrounding the main subject. By default, when a picture includes a human figure, and in particular one or more faces, those faces become the natural key point and focus of the image. That image tells their story, and their presence is what gives it meaning for us. Thus, many artists and art teachers emphasize the location of the human figure, and the face in particular, as an important part of a pleasing image composition. For example, some teach to position faces around the "golden ratio," also known as the "divine proportion."

Enhancing Images Using Knowledge of Faces

Now it is the faces themselves, not just the location of the faces in an image, that have similar "divine proportion" characteristics. The head forms a golden rectangle with the eyes at its midpoint; the mouth and nose are each placed at golden sections of the distance between the eyes and the bottom of the chin. And once we can find the faces in a composition, we can learn a great deal more about the imaged scene, including how to improve and enhance it.

Color and Exposure of Faces
While the human visual system is tolerant to shifts in color balance, the human skin tone is one area where this tolerance is limited. Variations in skin tone are accepted only around the luminance axis. Indeed skin luminance is the main differentiating factor between the skin tones of faces of people of different races or ethnic backgrounds; in terms of chrominance, all tend to lie within a small region with significant overlap between ethnicities (Zeng and Luo 2013). Knowledge of faces can thus provide an important advantage in determining or automatically correcting the overall color balance of an image.

Camera Focus and Faces
Autofocusing is a popular feature among professional and amateur photographers alike. There are various ways to determine a region of focus. Some cameras use a center-weighted approach, while others allow the user to manually select the region. In most cases, it is the intention of the photographer to focus on the faces photographed, regardless of their location in the image. Other more sophisticated techniques include an attempt to guess the important regions of the image by determining the location where the photographer's eye is looking.


Most modern autofocus techniques also use facial information (e.g., distance between eyes, in-plane and out-of-plane face orientation) to estimate the distance to the subject and thus dramatically shorten the convergence time for autofocus.

Digital Fill Flash
Another useful feature, particularly for the amateur photographer, is fill-flash mode. In this mode, objects close to the camera may receive a boost in their exposure using artificial light such as a flash, while faraway objects, which are not affected by the flash, are exposed using available light. Another technique provides a digital fill-flash effect to add light to faces in the foreground that are in shadow or shot with backlight.

Orientation
The camera can be held horizontally or vertically when the picture is taken, creating what is referred to as landscape mode or portrait mode, respectively. When viewing images, it is preferable to determine ahead of time the orientation of the camera at acquisition, thus eliminating a step of rotating the image, and to orient the image automatically. The system may try to determine if the image was shot horizontally (landscape format) or vertically (portrait mode). Various techniques may be used to determine the orientation of an image. Primarily these include either recording the camera orientation at acquisition time using an in-camera mechanical indicator or attempting to analyze image content post-acquisition. In-camera methods, although providing precision, use additional hardware and, sometimes, movable hardware components, which can increase the price of the camera and add a maintenance challenge.

Given knowledge of the location, size, and orientation of faces in a photograph, a modern computerized system can offer powerful automatic tools to enhance and correct such images or to provide options for enhancing and correcting them.

Color Correction
Automatic color correction can involve adding or removing a color cast to or from an image. Such a cast can be created for many reasons, including the film or CCD being calibrated to one light source, such as daylight, while the lighting condition at the time of image capture is different, for example, cool-white fluorescent. In this example, an image can tend to have a greenish cast that it will be desirable to remove. In such situations it is helpful to have automatically generated or suggested color correction techniques for use with digital image enhancement processing.

Cropping
Automatic cropping may be performed on an image to create a more pleasing composition. This is particularly useful if it is desired to have automatic image processing means to resize and crop images for generating or suggesting more balanced image compositions (e.g., using rules such as the rule of thirds). Cropping an image makes it possible to remove distracting scene elements at the periphery of the scene and to bring the overall image geometry closer to the divine proportions discussed above.

Rendering
When an image is being rendered for printing or display, it undergoes operations such as color conversion, contrast enhancement, cropping, and/or resizing to accommodate the physical characteristics of the rendering device. Such characteristics may include a limited color gamut, a restricted aspect ratio, a restricted display orientation, a fixed contrast ratio, etc.
An accurate knowledge of faces in an image can enable the use of sophisticated models to analyze scene lighting and reflectance, to enhance the texture and appearance both of the face and of individual facial features such as the eyes, nose, and mouth. Advanced face models can separate scene lighting into global and directional components, and this knowledge can be applied to other aspects of the imaged scene to enhance shadows and mitigate highlights and specular reflections from objects and elements of the scene.

Perhaps what is most interesting is not that all of these techniques could be implemented on today's digital cameras, but that many of them are, in fact, already available, disguised in the various smart-scene modes to be found on the latest camera models. These smart-scene modes don't tell you explicitly how they work, but in many cases you'll be able to figure them out for yourself after reading this article.
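As a concrete illustration of the face-assisted autofocus idea from the "Camera Focus and Faces" subsection above, a pinhole-camera model lets the detected eye separation be converted into a subject-distance estimate. This is only a sketch under stated assumptions: the interpupillary distance, lens, and sensor values are illustrative, not parameters of any particular camera.

```python
# Pinhole-model sketch of distance-from-eye-separation estimation.
# All numeric values below are illustrative assumptions.

MEAN_EYE_SEPARATION_MM = 63.0  # typical adult interpupillary distance

def subject_distance_mm(focal_length_mm, eye_sep_px, pixel_pitch_um):
    """For a roughly frontal face, Z = f * B / b, where B is the real eye
    separation and b its size as imaged on the sensor."""
    b_mm = eye_sep_px * pixel_pitch_um / 1000.0
    return focal_length_mm * MEAN_EYE_SEPARATION_MM / b_mm

# 4.2 mm lens, eyes 60 px apart, 1.4 um pixels -> 3150 mm to the subject
print(subject_distance_mm(4.2, 60, 1.4))
```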

Finding Faces in an Image

From the above discussion, it is clear that a good starting point is to be able to find faces in imaged scenes with high reliability and accuracy. Face detection first became available in consumer digital cameras around 2006, and it has continued to evolve and develop since then. The original research that enabled detection algorithms to operate in real time was carried out by Paul Viola and Michael Jones over a decade ago (Viola and Jones 2001) and has been improved on ever since (Viola and Jones 2004; Brubaker et al. 2007; Do et al. 2009; Ren et al. 2008, 2011; Hefenbrock et al. 2010). However, face detection on its own is typically too slow to work at real-time rates in an imaging device, so engineers use a range of "tricks" to follow faces from frame to frame, while executing a background search for new faces. This brings us to the underlying concepts of face tracking (Kublbeck and Ernst 2006; Ianculescu et al. 2008; Corcoran et al. 2008; Lui et al. 2010; Tresadern et al. 2012). Skin color can also be used to improve the accuracy and speed of facial tracking, and in today's devices these techniques are all likely to be integrated into a hardware framework that can determine the size and shape of facial regions in an image frame almost instantly (Zaharia et al. 2011; Bigioi et al. 2012).

Basic face detection is a useful starting point to get an understanding of some of the techniques that become important when algorithms are converted from high-performance desktop workstations with practically limitless resources to embedded systems with lower processing capabilities and significantly less energy to operate at high clock rates.1

Face Detection and Tangential Thinking

Faces are approximately round objects, so it might come as a surprise to learn that the best way to find them, at least in a low-resource computing environment, is actually to use rectangular classifiers. This solution was first proposed in 1998 (Papageorgiou et al. 1998) as a general framework for object detection that reduces the computational complexity by using image luminance rather than full color information. The idea was further developed by Viola and Jones (2001) in their classic paper that has become the gold standard for face detection. The idea is simple and leverages another image processing tool known as the integral image or, alternatively, the summed area table.

The Integral Image
As the name suggests, the value at any point (x, y) in the summed area table is just the sum of all the pixels above and to the left of (x, y):

1 Having made this point, it is worth remarking that the latest handheld smartphones and tablets feature multi-core processors and GPU technology that are rapidly catching up on desktop capabilities. We live in interesting times.


Fig. 1 Integral image (II) and a selected region, ABCD illustrating how the II technique works (Original image from author)

I(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y')

In image processing we normally restrict ourselves to the image intensity or luminance value. If you are used to thinking in RGB space, this is complicated to calculate, but in a camera the YCC color space is what normally comes out of the imaging pipeline, so you have the Y or luminance component available directly for processing.2

Now you are probably wondering why we create this intermediate integral image representation – are we not replacing one complex set of calculations with another one? But the trick here is that it only requires a simple addition of two integer values to get the next value in the integral image. And the real reason for setting up this intermediate representation is that it allows us to compute the summed intensity of combinations of rectangular regions by a very simple set of additions and subtractions.

Efficient Luminance Calculations with the Integral Image
If you look at Fig. 1, you'll see a simple rectangle, ABCD, marked out. This is the rectangle we want to evaluate. Now the value at A represents the sum of all the pixels to the left of and above it, and the same holds for the other three points.3 The area above the rectangle that we want to evaluate is simply B − A, or, if you prefer, the sum of all the pixels above and left of B minus the sum of all pixels left of and above A, where the pixels are taken from the original luminance image. The same applies to the rectangle to the left of ABCD, which is simply D − A. Now what we want to calculate is the area inside the ABCD rectangle. That is the value at C less the three regions B − A, D − A, and A. Or to put it into an equation,

ABCD = C − (B − A) − (D − A) − A = C − B − D + 2A − A = C + A − B − D

What have we done here? We've replaced the summing of all the values within ABCD with a simple four-value summing operation. Now arguably we already did the summing when we calculated the integral image representation. However, the classification technique about to be explained requires that we calculate a very large number of such rectangles. In this way the information preserved in the integral image is going to be used (and reused) extensively, and the savings from calculating it all the first time will be very much worthwhile.
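A minimal sketch of both steps may make this concrete; the array contents are arbitrary, and the corner naming (A top-left, B top-right, D bottom-left, C bottom-right) follows the discussion above.

```python
import numpy as np

def integral_image(lum):
    """II[y, x] = sum of all luminance values above and to the left,
    inclusive of (x, y) itself."""
    return lum.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum over the inclusive rectangle via four lookups: C + A - B - D."""
    A = ii[top - 1, left - 1] if top > 0 and left > 0 else 0
    B = ii[top - 1, right] if top > 0 else 0
    D = ii[bottom, left - 1] if left > 0 else 0
    C = ii[bottom, right]
    return C + A - B - D

lum = np.arange(36).reshape(6, 6)      # stand-in for a luminance plane
ii = integral_image(lum)
left_rect = rect_sum(ii, 1, 1, 4, 2)   # the ABCD region
right_rect = rect_sum(ii, 1, 3, 4, 4)  # an adjacent BEFC-style region
print(left_rect > right_rect)          # one comparison decides darker/brighter
```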

2 You'll find more details in Part I of this article.
3 In many texts these are written as I(A), I(B), etc., but let's keep the notation simple here.


Fig. 2 Multiple integral image evaluations from the original; it simply requires a single value comparison to determine if the ABCD rectangular region is brighter (or darker) than the BEFC region (Original image from author)

Fig. 3 The Haar wavelet (taken from http://en.wikipedia.org/wiki/Haar_wavelet)

Thus, we now have a fast way to calculate the summed luminance value within any rectangular region of the original image. So let's assume we wanted to compare two adjacent regions to figure out which is darker or brighter. Figure 2 shows how to do this for a new rectangle, BEFC, adjacent to our original ABCD rectangle.

Rectangular Haar Classifiers
This, in turn, brings us to the concept of Haar classifiers. These derive their name from Haar wavelets – a sequence of rescaled "square-shaped" functions that together form a wavelet basis set. In other words, you can reconstruct an original dataset by adding together different weights of each basis function. Haar wavelets are a special case of the better-known Daubechies wavelets. They are also the simplest possible type of wavelet (Fig. 3).

Now the classifiers based on these wavelets are ideal for both hardware and software solutions, as they can be readily calculated for a particular scanning window from the equivalent integral image. However, we first need to perform some offline "training" to determine the particular set of Haar classifiers that are most useful in identifying facial regions. The details can be found in the literature, starting from the work of Viola and Jones (2001, 2004), with many refinements added by later researchers.


Fig. 4 (a, b) Two of the larger-sized Haar classifiers that identify faces inside a scanning window (Original image from author generated using OpenCV)

Fig. 5 (a–d) Four examples of smaller Haar classifiers that identify features within a face region; in these examples we see classifiers that identify (a) nose glint, (b) eye/nose/mouth structure, (c) upper lip region, and (d) eyebrow region (Original images from author generated using OpenCV)

Here in Figs. 4 and 5, we provide some examples derived from the OpenCV (http://opencv.org) implementation of the Viola-Jones technique. Figure 4 shows two of the larger-sized classifiers that are used in a typical Haar cascade. Note that the main scanning window is shown in dark red, while recently scanned windows are in a lighter red. Typically, once a scanning window achieves a relatively high probability score, the detection algorithm will step in a more granular fashion around that location and may also step up, or down, the size of the scanning window to maximize the detection score.

Fig. 6 The scanning process for a Viola-Jones style face detector

Figure 5 shows some examples of smaller, local classifiers. These classifiers are determined from training on a very large dataset, and not every classifier can be linked explicitly with a face feature as shown here. In a basic detection cascade, we expect to have several dozen up to 100 separate classifiers, but it is possible to have more sophisticated cascades where the detection is split into multiple branches (Brubaker et al. 2007; Verschae et al. 2007; Hiromoto et al. 2009; Wang et al. 2010). In such cases there may be significantly more than 100 classifiers in the cascade structure.

Cascading Haar Classifiers
Now the last part in the face detection puzzle is how all of the Haar classifiers for a particular detection window are put together. In essence this is illustrated in Fig. 6. The Haar classifiers for the current scanning window are calculated from the corresponding portion of the integral image, and if the luminance values correspond correctly (e.g., the left-hand rectangle is darker than the middle rectangle and the middle rectangle brighter than the right-hand rectangle), then that particular classifier test is passed. If any of these Haar tests fails, the current window is rejected as a face candidate and the algorithm moves immediately on to the next scanning window. If all of the predetermined classifiers are successfully applied, then the region is retained as a potential face candidate.

Typically at the end of a scan, there will be clusters of face candidates around each face region and these need to be reduced to a single candidate window. There are various methods to achieve this "pruning" – frequently a commercial detection algorithm will use a secondary detection process that scans around the cluster of candidate regions with a more granular step size and more granular sizes of scanning windows. Classifiers other than Haar can be used to achieve a probabilistic rather than a binary (pass/fail) result. In the end a single detection window is determined and recorded/displayed with the image frame.
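For readers who want to experiment, scanning, cascaded rejection, and candidate pruning are all wrapped up in OpenCV's pretrained cascades. A minimal sketch follows; the image file name is an illustrative assumption, and the minNeighbors parameter performs the candidate-cluster pruning described above.

```python
import cv2

img = cv2.imread('group_photo.jpg')               # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)      # cascade works on luminance

# Pretrained frontal-face Haar cascade shipped with OpenCV
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5,
                                 minSize=(30, 30))
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('group_photo_faces.jpg', img)
```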



Fig. 7 (a–c) Examples of primary Haar classifiers that extend horizontally across the full facial region

Modern face detectors also remember where faces were detected on a previous image frame. This "memory" of the past image frame is known as face tracking and will be described shortly. It means that most modern imaging devices can achieve real-time face detection rates at 30–60 image frames per second – in fact, the imaging system has only 16.5–33 ms per frame to perform all these calculations! We will return to this point in more detail later.

Facial Symmetry and Haar Classifiers

As already noted, Haar classifiers have the benefit that they can be easily computed from the underlying integral image, but there are some other interesting properties that can be used to accelerate or extend their use in hardware subsystems. As an example, we consider how facial symmetry can be leveraged to improve both the performance and capabilities of a face detector. Many of the classifiers that emerge from the machine learning process that trains a cascade extend horizontally across the entire facial region. Several of these are illustrated in Fig. 7. However, a second grouping of classifiers, shown in Fig. 8, is asymmetric – typically matching features on one side of the face.



Fig. 8 (a–c) Examples of primary Haar classifiers that match with features on one side of the face (Original image from author; previously published in Corcoran et al. 2013)

Interestingly, because of facial symmetry, each classifier in this second group has both a left-hand (LH) and a right-hand (RH) equivalent. This is useful from a hardware perspective, as the hardware can be designed to implement both LH and RH classifiers from a single hardware node. This reduces the size of the circuitry and lookup tables (LUTs) by 50 % for these classifiers. It also facilitates the implementation of a half-face detector (Corcoran et al. 2013). In this arrangement, a first pair of cascades scans the detection window for LH and RH half-faces, and a second detector scans for full faces. If all three cascades provide a match, then a full face is present; if only the LH or RH detector triggers, then a half face is present. This solves the problem of occlusions that can occur in group shots or crowded scenes where one individual's face is partly hidden by a person standing in front of them.
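As a simple illustration of why LH/RH sharing is cheap (a sketch of the general idea only, not the hardware design of the cited paper; the rectangle representation is our own assumption), mirroring a classifier about the vertical midline of the scanning window costs one subtraction per rectangle:

def mirror_classifier(rects, window_width):
    # Each rectangle is (x, y, w, h, weight) inside the scanning window.
    # Mirroring about the vertical midline maps x -> window_width - x - w,
    # so a left-hand classifier yields its right-hand twin for free.
    return [(window_width - x - w, y, w, h, weight)
            for (x, y, w, h, weight) in rects]

# A hypothetical two-rectangle classifier in a 22-pixel-wide window:
lh = [(2, 8, 6, 4, +1), (8, 8, 6, 4, -1)]
rh = mirror_classifier(lh, 22)

In hardware, the same observation means one stored classifier definition can service both halves of the face.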

Detection and Face Rotation

In an ideal world, all faces would be aligned with the vertical axis of our photographs. But in practical photography or videography, this rarely happens. Thus, a practical face detector has to compensate for a range of variations in facial orientation and pose.



Fig. 9 In-plane rotation of the face region by 0°, 30°, and 45° (Original image from author)

We'll take a quick look at some of these problems and briefly sketch some of the approaches that may be used to compensate. In broad terms we can say that there are two main variations that are most significant. Where a face remains directly facing the camera but is rotated relative to the camera horizontal, we describe this as in-plane rotation. This is shown in Fig. 9. Where the face rotates around the horizontal or vertical axis of the head – in other words, the facial pose is no longer frontal, facing directly toward the camera – then we have out-of-plane rotations. These are more challenging than the simpler in-plane rotations, as the 2D projection of the face onto the image sensor is changed from that of a frontal face image. We discuss each variant briefly in the next sections.

In-Plane Rotation

There have been a number of approaches to this problem in the literature. One of the earliest papers, by Lienhart and Maydt (2002), took the approach of rotating the conventional Haar classifiers and proposed a new extended set of classifiers that can be used in place of the conventional rectangular classifiers of Viola and Jones. Other approaches have included rotating the integral image itself (Messom and Barczak 2006), retraining the classifier sets using 12 rotational categories (Wu et al. 2004), or using vector boosting (Huang et al. 2005). A comprehensive approach that can handle both rotation in-plane (RIP) and rotation out-of-plane (ROP) is described by Huang et al. (2007). All of these techniques can be computationally challenging, as it becomes necessary to scan through multiple rotation angles for each location where a face may be located; a sketch of the brute-force variant is given below. Thus, some additional indicators of the likely degree of rotation of the face can be very helpful. Often knowledge of the faces in one or more previous image frames can provide this information, and this suggests a need to progress from face detection on a single image frame to face tracking across a sequence of frames. Another useful tool can be information on local gradient patterns – as we saw in Figs. 4 and 5, the smaller, weak Haar classifiers line up with the horizontal and vertical gradients of the eyes, nose, and mouth.
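The brute-force variant mentioned above can be sketched as follows (our own illustration, not any of the cited methods; the angle list is arbitrary, and the cascade is the one loaded in the earlier example):

import cv2

def detect_with_rotation(gray, cascade, angles=(0, -30, 30)):
    h, w = gray.shape
    hits = []
    for angle in angles:
        # Rotate the frame about its center and rerun the upright detector.
        m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        rotated = cv2.warpAffine(gray, m, (w, h))
        for face in cascade.detectMultiScale(rotated, 1.2, 5):
            # Face coordinates are in the rotated frame's coordinates.
            hits.append((angle, tuple(face)))
    return hits

The cost is linear in the number of candidate angles, which is exactly why the rotation hints discussed in the text are so valuable.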



Fig. 10 Out-of-plane rotation of the face region up to 45° can still be successfully analyzed without significant changes to the underlying face detection architecture (Original image from author)

Fig. 11 Rotation of the face region beyond 45° leads to a semi-profile face view; a different set of detection classifiers is required (Original image from author)

Out-of-Plane Rotation

Out-of-plane rotations lead to a distortion of the underlying facial region as the 2D projection is mapped from the rotated 3D face. Fortunately, the range of rotation can be divided into several main regimes. For rotation around the main vertical axis, a ±30° rotation does not lead to very significant distortions, and the face may be largely treated as being a frontal one. From 30° to 45°, the distortions are more significant, but if additional knowledge is available, such as eye-tracking and eye-gaze data (Nanu et al. 2011; Corcoran et al. 2012a), then it is possible to modify the face-cropping rules and still achieve a useful detection accuracy, as shown in Fig. 10. The main compensation can be a simple adjustment of the relative position of the final detection window, because the eyes and nose are no longer centered in the 2D face region (Fig. 11).



Table 1 Number of scan windows needed to cover a full image frame at different resolution settings and for different step sizes

Scan step size   Full HD (1080p)   HD (720p)   480p     240p
4                125,875           55,125      23,920   8525
8                 45,188           13,659       5928    2079
12                13,904            6090        2622     936

However, as the face rotates further, it moves into semi-profile and only a single eye region is visible. This requires a completely revised classifier set, similar in some ways to the half-face detector discussed in the earlier section on facial symmetry.

From Detection to Tracking

There is a very extensive literature on face detection, but much of the work is implemented on conventional desktop computer systems. Historically such systems are more capable computationally than digital cameras or smartphones, so many algorithms that work effectively in real time on 30 frames per second (fps) video sequences on a desktop would be impractical in real consumer devices. However, the consumer device has some advantages that have become more apparent in recent years. To begin with, any face detection system operating within the handheld device can obtain data relating to the device orientation and device motion and record how this is changing from frame to frame. This is helpful in predicting the frame-to-frame location of faces once they are initially detected. The detection system also has access to the image stream that is used to display on the device – often referred to as the preview stream – which is smaller in size than the full sensor frame used to capture still images. Often there is additional statistical data available from the image processing pipeline that provides information on global luminance, color balance data, and current exposure and focus settings. All of this data can be helpful to the detection algorithms. But the most useful data is knowledge about the face regions that were confirmed in the previous image frame.

How Many Scan Windows?

To understand why this is, we first need to understand how many scanning windows are used to cover an entire image frame. A typical scan window size is 22 × 22 pixels, but the step size for each scan window should be less than 22 pixels to ensure that we land close enough to any face region for an accurate match. Table 1 lists the number of scan windows required to fully cover an image at different frame sizes. Thus, scanning a full-HD image frame with a 4-pixel step size requires more than 125,000 scan windows, with a full classifier cascade applied at each. It is simply not possible to cover so many scan windows within the 33 ms available at a 30 frames per second video rate. However, if we use a larger step size of 12 pixels (about 50 % of the 22 × 22 pixel scan window), then just under 14,000 scan windows are required. If the image frame is down-sampled to 240p, then only about 940 scan windows are required. But if you have to apply a 100-classifier cascade to each of these windows, this still requires a lot of computational effort. In a practical detection algorithm, the scan window remains a fixed size and it is the image frame that is down-sampled in order to scan for larger faces. In the classic Viola-Jones embodiment, the size-scaling ratio between scales is 1.2, and thus our transition from 240p to 1080p represents a range of 8 size scales. Now in practice we are not generally interested in finding a 22 × 22 pixel face in a 1080p image frame – it will be quite a distant face – but the point is that the time required to fully search the scene, even if we stop at a 480p or 240p equivalent frame size, will be more than the 33 ms available for an individual frame capture. This is the reality for software algorithms (Fig. 12).
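The counts in Table 1 can be approximated from first principles. The sketch below uses a naive border-inclusive count for a 22 × 22 window and assumes 16:9 frame widths of 852 and 426 pixels for 480p and 240p; it reproduces the 1080p, 720p, and 480p step-4 entries exactly (e.g., 125,875 for 1080p), while other entries differ slightly, suggesting the published table used different rounding or frame widths:

def scan_window_count(width, height, window=22, step=4):
    # Number of window positions along each axis, stepping by `step`
    # until the window touches the frame border.
    nx = (width - window) // step + 1
    ny = (height - window) // step + 1
    return nx * ny

for name, (w, h) in {"1080p": (1920, 1080), "720p": (1280, 720),
                     "480p": (852, 480), "240p": (426, 240)}.items():
    print(name, [scan_window_count(w, h, step=s) for s in (4, 8, 12)])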



Fig. 12 Face search algorithm – adaptive step size as the scanning window approaches a face region

Predicted Regions

It is at this point that one must take a leap of faith and consider the typical use case for consumer cameras. Cameras are intended, more than 90 % of the time, to be pointed at people. So within the 33 ms that it takes to acquire the next image frame, it is very likely that the camera will continue to be pointed in the same direction, and any people in the next image frame will appear in almost the same location in the image scene and with a face that is almost the same size. Thus, once a face is initially detected, it becomes quite straightforward to estimate where it will be in the next image frame and what range of face scales matches its likely size. Typically the detection algorithm can limit the search area to a predicted region that is 25–50 % larger than the current scan window region, and scanning uses the same size of detection window, with perhaps 1–2 larger or smaller sizes depending on other data that may be available to the detection system (e.g., device motion); a sketch of this idea follows below. Data on the historical movement of the face region, coupled with motion and orientation data from the device and optionally focus/exposure data, can all help refine processing of the next image frame. In practical terms the detection algorithm is now tracking a person from frame to frame, and by maintaining a history of the detected region, a temporary buffer of useful data can be retained as long as a particular face remains within the field of view of the camera. There will still be a background search for new faces entering the field of the camera, but this becomes less important and can be spread over multiple image frames.
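A sketch of the predicted-region idea (illustrative only; a real tracker would also blend in motion-sensor data and region history): grow the last confirmed face rectangle by a margin in the 25–50 % range quoted above and restrict the next frame's search to that region.

def predicted_region(face, frame_w, frame_h, grow=0.4):
    # face = (x, y, w, h) from the previous frame; grow the box by
    # ~40 % (within the 25-50 % range quoted above) and clip to frame.
    x, y, w, h = face
    dx, dy = int(w * grow / 2), int(h * grow / 2)
    x0, y0 = max(0, x - dx), max(0, y - dy)
    x1, y1 = min(frame_w, x + w + dx), min(frame_h, y + h + dy)
    return x0, y0, x1 - x0, y1 - y0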


Fig. 13 Predicted regions – where we expect to find a face in the next image frame (Original image from author – taken from Corcoran et al. 2008)


Fig. 14 Hardware engine for scan and detect of faces in an image frame; the face searching is achieved by template-matching modules (TMM) which operate in parallel on an image window (matching locally cached templates against the current window image data) (Original image from author – originally published in Bigioi et al. 2012)

Locking on the Face

In a practical consumer system, what becomes important is keeping track of faces once they are detected. This is often described as a face-lock, and the effectiveness of a face-tracking system is defined in terms of the time to achieve a first lock and the subsequent robustness of the face-lock once it is established. In low-lighting conditions or during sudden camera movements, the lock can be lost and must be reestablished from a background search. The tracking algorithm that maintains a face-lock will often employ additional classifiers and use more sophisticated analysis techniques to follow and analyze face regions. For example, color data can be analyzed and matched using skin color filters – the base detection algorithms use only luminance data derived from the integral image, so color data provides additional confirmation that a region is a valid face and may help retain the face-lock if a person turns sideways or a face becomes blurred due to sudden movement in low light or is partly occluded. Other analyses can include tracking eye state and gaze direction or applying a more sophisticated face model that can determine facial pose and expressions (Fig. 13).



Fig. 15 Three different face models – each is trained with a different number of reference points, and variations in the underlying training dataset will determine the adaptability of a particular model (Original image from Ioana Bacivarov – taken from PhD thesis: Bacivarov 2009)

Face Detection Hardware Engine

As we have just seen, the basic detection process is a brute-force one, particularly when searching for smaller faces. In state-of-the-art imaging devices, it makes sense to offload much of the low-level scan and search to a dedicated hardware engine that can process each image frame as it is off-loaded from the image sensor. An example of such a hardware engine is given in Fig. 14. Data can be read from main memory (2) as well as being off-loaded from the image sensor (1), so that some recursive processing is possible. The input image data is then passed into a preprocessing block that transparently writes the image frame into main memory (3) but also selects portions of the image frame for direct transfer (4) to the template-matching module (TMM). The TMM contains multiple matching engines, each with its own local cache, which allows the cached areas of the image selected by the preprocessor to be scanned independently at multiple scales. The output detection results are written back to main memory (5) as metadata for the main image frame. Typically the hardware engine generates multiple overlapping detection windows that have to be pruned by the main application processor (AP) when the main image frame is processed. Additional processing of detected face regions on the AP can further refine the detection accuracy and reduce the false-positive rate. The processed image frame with its final "face" metadata is then sent for other processing, compression, and/or storage (7).

Face Models

So far the emphasis has been on simple detection followed by tracking of face regions. In a sense these technologies are already commoditized and can be found in many consumer cameras and smartphones. This has happened either because the onboard CPUs have become powerful enough to run software versions of the underlying algorithms or because the algorithms have been implemented as a hardware core that is often included in image signal processor (ISP) chipsets. However, research on facial analysis has provided more sophisticated facial models that can closely match the complex 3D morphology of the face and track it in real time.



Fig. 16 The main eye features: eye corners (blue), center of the pupil (green), and eye-socket center (red) (Original image from author – originally published in Corcoran et al. 2012a)

Today, the use of such models becomes more practical as smartphones gain GPU and multi-core capabilities. The concept of a full facial model will be revisited shortly, but first we'll take a look at the most important parts of the human face – the eyes! (Fig. 15)

From Faces to the Eyes

It was William Shakespeare who said "the eyes are the window to your soul." And whatever your views might be in a literary or philosophical context, there is an eerie truth to this statement when applied to videography and facial analysis. The eyes are indeed the key to understanding the people in a photograph or a video sequence. They create the ambiance in many scenes, they suggest how people are interacting, and they are central to our facial expressions and mirror our body language. For facial analytics the eyes are used to normalize the face (bring the face to the same size, in-plane and out-of-plane orientation) prior to further analysis. They are also a central component of still photographs – we don't want pictures with half-closed eyes or eyes with an unnatural color due to the effects of flash photography. So after mastering the capability to find faces, the next step is to find those eyes and see if we can make some use of them!

Eyes and Eye State

Eyes are important for a range of reasons. Optimizing the eyes in a photograph is important to achieve the best portrait image. And in a group shot, it is even more challenging – we have all had the experience of capturing an almost perfect picture, but with one of the people having closed eyes at the moment of capture. Typically this is a simple eye blink, but with bad timing it can be enough to spoil that perfect moment. And in the context of children, not getting them all to look at the camera and smile at the right time is every professional photographer's nightmare and ruins a large proportion of family photographs. The eyes in a photograph also convey messages, both within the context of the individuals in the picture, and additionally they can direct our attention to key elements or aspects of a scene.



Fig. 17 (LHS): A sanity check – the eye-socket boundary (yellow) should lie on the eye corners; (RHS) the pupil can be detected using a simple diagonal scanning search pattern and a modified Haar classifier (Original image from author – originally published in Corcoran et al. 2012a)

Thus, in the context of our eyes, it is not sufficient just to find them and track them; we also wish to deduce additional information about where people are looking at any instant in time.

Eye Gaze

There is nothing new about tracking people's eyes and attempting to make use of that data. In fact, the use of eye gaze as an interface mechanism has been studied since the 1980s. Early research focused mainly on computer interfaces for quadriplegic users (Hutchinson et al. 1989). Jacob (1991) was one of the first researchers to explore the use of eye gaze as a means to provide a real-time human-computer interface (HCI) for practical use by normal users. Interestingly, he comments on the "Midas Touch" effect:

At first, it is empowering to be able simply to look at what you want and have it happen, rather than having to look at it (as you would anyway) and then point and click it with the mouse or otherwise issue a command. Before long, though, it becomes like the Midas Touch. Everywhere you look, another command is activated; you cannot look anywhere without issuing a command.

More recently, Duchowski (2002) has taken a broad look at the use of eye tracking and provides observations similar to those of Jacob. He cautions against overloading the perceptual system of the eye with the motor task of a pointing device, but indicates that the use of gaze is more promising as an indicator of intent and to assist with interaction in increasingly complex contextual situations. Eye gaze has also been considered useful in the context of advanced consumer applications such as game terminals (Nanu et al. 2011; Corcoran et al. 2012a), and recent smartphone devices have also explored the use of eye gaze for device control (Corcoran 2015). However, in the context of a digital imaging device, we are not so concerned with the use of eye gaze for device control, but rather as a means to understand the imaged scene and to make some intelligent decisions about how to modify the acquisition and perhaps even guide and advise the photographer.

4 This assumes a software implementation of the algorithm; if you have a hardware tracker available, it can be implemented on larger screen sizes and scan a wider range of face scales.


Fig. 18 Example of combined face and eye tracking with eye-gaze algorithm and limited post-processing capabilities; as functionality migrates into hardware, complex real-time post-processing becomes practical (Original image from author – originally published in Corcoran et al. 2012a)

Fig. 19 A real-time face tracker in action; vertical bars at the side of the eyes indicate a blink (or wink) event in progress; eye-socket centers are shown as green crosses; face rectangles indicate head pose transitions: from left, frontal, semi-frontal, and half profile; exact pose angle data is above each face rectangle (Original image from author – originally published in Corcoran et al. 2012a)

External Structure of the Eye

It is possible to develop quite sophisticated eye models (Bacivarov et al. 2008a), but the resulting complexity is not necessarily helpful when real-time tracking is essential. A simpler but effective model, shown in Figs. 16 and 17, uses the external structure of the eye to quickly obtain a gaze estimate. The eye corners are straightforward to detect, and the pupil region can be detected reliably using a modified Haar classifier, as shown in Fig. 17 (RHS). The most challenging region to measure and quantify is the eye socket. This is shown in the top half of Fig. 17 (LHS), where the eye socket – the yellow circle with its center marked by the red cross – is correctly shown to intersect the two eye corners. However, the measurement can be prone to false detections if the algorithm is confused by the subject's eyebrow, interpreting it as equivalent to the eyelid. This is straightforward to cross-check, as the eye-socket boundary (yellow circle) should intersect the two eye corners. A dynamic adjustment of a threshold value – based on global and local lighting conditions – can help eliminate such false detections.


Fig. 20 (a) Example of golden-eye artifact on the left; red-eye on right. (b) Example of hybrid (half red/half yellow) artifacts (Original image from author – originally published in Corcoran et al. 2012b)

Holistic Tracking and Post-Processing Algorithms

In practical implementations, both face tracking and eye tracking followed by gaze estimation operate continuously on each image frame. The time window of 33 ms is still the key restriction, and some compromise is always necessary. A basic example of such a combined algorithm is shown in Fig. 18. Initial face detection is performed on a down-sampled image stream (WQVGA) with a minimum detection size for faces of 14 × 14 pixels.4 This can take advantage of the preview stream that is available on some devices for direct display of the acquired image stream. Detailed facial analysis operates on a larger-sized image, but the analysis can be restricted to detected face regions within the image frame – thus 90–95 % of the image frame can be immediately discarded, and algorithms can be aligned quite accurately to the face. Only larger face regions are considered, and for gaze tracking only the largest will be analyzed. The next step is to detect the eye sockets within the face. The eye corners and eye pupils can be quickly and reliably determined, and a radial search will normally converge correctly on the eye socket. Note that it is important to apply a blink test at this point, as an eye that is in the process of blinking will return unreliable gaze data. The pupil shape is next refined so that the precise center can be located, and the pupil–socket offset and direction provide the gaze direction. Now on its own, this does not provide a very accurate angular resolution of the precise gaze direction, but we must remember that our eyes are in constant motion. Thus, a statistical averaging of the eye behavior over 10 or more image frames can be used to achieve a more accurate indication both of eye activity and of the time-averaged gaze direction. Some live captures from a real-time tracker are shown in Fig. 19.
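The final two steps – the pupil–socket offset and the statistical averaging over 10 or more frames – can be sketched as follows (our own illustration; the class and parameter names are assumptions, and the blink test is reduced to a flag):

from collections import deque

class GazeEstimator:
    def __init__(self, window=10):
        self.history = deque(maxlen=window)  # last N per-frame offsets

    def update(self, pupil, socket_center, blinking):
        # Skip frames where a blink is in progress - they give
        # unreliable gaze data, as noted above.
        if blinking:
            return self.average()
        dx = pupil[0] - socket_center[0]
        dy = pupil[1] - socket_center[1]
        self.history.append((dx, dy))
        return self.average()

    def average(self):
        # Time-averaged pupil-socket offset ~ smoothed gaze direction.
        if not self.history:
            return None
        n = len(self.history)
        return (sum(d[0] for d in self.history) / n,
                sum(d[1] for d in self.history) / n)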

5 Probably more than you would want to know!


Fig. 21 The LHS image shows two different forms of flash-eye defect; the LH eye exhibits non-red or golden-eye effect, whereas the RH eye is a standard red-eye defect. The RHS image shows how the image can be convincingly restored so both defects are eliminated (Original image from author – originally published in Corcoran et al. 2014b)

Finally, it is worth commenting that for a more complete analysis of the eye gaze, it is also necessary to have a very accurate indication of the face orientation and pose. This requires a more sophisticated modeling of the entire head region, but the topic of detailed facial models would require a complete article of its own.

The Red-Eye Effect

Up to now we have considered the analysis of an input image frame and how to detect and track elements of human faces in the imaged scene. But unless this information is converted into an improved image, it has no value for the consumer. Now in the introduction to this article, there is a discussion of how global image properties can be adjusted, but until now we have not considered that we might actually wish to alter physical details of the acquired image. In a sense this is contrary to the philosophy of the photographic image, which aims to capture an instant in time for posterity; altering the underlying image would surely constitute a betrayal of that principle. Arguably, though, if the truth of that instant is distorted by the physics of the imaging process, we might justify some corrective actions. The red-eye effect is exactly that: the flash illumination is reflected from the blood vessels at the back of the eye and generates an unnatural-looking red color in the pupil. Detecting and removing that unnatural appearance from images would be just the sort of thing that digital image processing should be good at.

Not All Red-Eye Is Red

Now on first consideration, the problem of red-eye (RE) detection and correction may appear quite trivial, but in practice it is quite a can of worms. Being coinventor on nearly 50 patents relating to techniques for red-eye analysis, your author can reliably attest to the many-layered nature of this particular problem. The interested reader can learn more5 by consulting Corcoran et al. (2012b); a shorter version can be found in Corcoran et al. (2014a). Red-eye and flash-eye defects in still photography continue to cause problems for digital imaging devices. Much of the early literature assumed that red-eye artifacts are substantially red. In fact this is not the case, and to a researcher working with flash-eye defects it very quickly becomes apparent that a significant percentage of flash-eye defects exhibit very little red hue. The situation is further complicated by racial differences, particularly between Asian and Caucasian subjects.


The latter exhibit darker shades of red and are more susceptible to yellowish artifacts with a bright white central region. New variants of flash-eye defects have continued to appear and present new challenges, especially as cameras and camera subsystems continue to reduce in size (Fig. 20). A useful description of a simple yet efficient algorithm is given in Corcoran et al. (2005) and Steinberg (2005). A key consideration is the speed and simplicity of the initial segmentation. Implementing the segmentation in Lab color space gives the best results, but no production consumer devices provide Lab color channels. As a consequence, it is necessary to use lookup tables to achieve segmentation on YCC or RGB color images. This is best achieved using a hardware preprocessing architecture that can threshold individual pixels (Zaharia et al. 2011). Further details can be found in Corcoran et al. (2012b). Now the detection rates from this algorithm are only 80–85 % across instances of the red-eye effect. In a practical system, this is not enough to keep consumers happy, so the additional information that is available has to be leveraged to improve the detection rates, while minimizing false positives – things that look like a red-eye but in fact aren't. Today's state-of-the-art algorithms are good enough to be almost unnoticeable and work away quietly in the background, often without any requirement for user interaction (Fig. 21).
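A toy version of the initial segmentation step (our own sketch, not the patented algorithm; the thresholds are arbitrary) thresholds a per-pixel redness measure in Lab space, where the a* channel carries the red-green axis:

import cv2
import numpy as np

def redeye_mask(bgr_eye_region, a_thresh=150, l_thresh=60):
    # OpenCV stores 8-bit Lab with a* offset so that 128 is neutral;
    # strongly positive a* (red) plus sufficient lightness flags
    # candidate red-eye pixels within a detected eye region.
    lab = cv2.cvtColor(bgr_eye_region, cv2.COLOR_BGR2LAB)
    L, a, b = cv2.split(lab)
    return ((a > a_thresh) & (L > l_thresh)).astype(np.uint8) * 255

In a production pipeline, as the text notes, the equivalent thresholding would be performed via lookup tables on YCC or RGB data, ideally in a hardware preprocessing stage.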

Improving Images with Faces

So today, finding and tracking faces is a no-brainer. The key question that has been open for some time now is what you can do with these faces that genuinely adds value for the consumer. Some of the earliest attempts involved controlling image acquisition by waiting for people to smile (Whitehill et al. 2009), but these were only moderately successful in engaging consumers. However, behind the scenes, engineers were able to use face regions to improve the algorithms that refine white balance and image tones. By adjusting face coloration, the global illumination and color balance of an imaged scene can be determined. In turn this enables your camera to turn a dull overcast day into a brighter one. There are some limits to how much you can tweak the underlying luminance and colors of an image before side effects begin to make it look unnatural, but faces are a great assistance in stretching these limits. And remember that a digital camera is constantly grabbing new images to update the camera display – so having faces available allows you to adjust the exposure and focus of the next image frame as well as tweak the image processing pipeline (IPP). So you can improve the acquisition of the next image. And if you can do this dynamically and in real time – which most modern cameras can – that means you can use face data to improve dynamic response when capturing video. The introduction of digital correction of the red-eye effect opened the door to more aggressive approaches – if it was possible to change the underlying image to improve its appearance, why not move beyond image defects and consider "improving" the appearance of subjects within the image? After all, professional photographers and PR firms have long been "adjusting" and "manipulating" photographic images as part of their professional service offerings. This brings us to one of the most recent technologies to appear on digital imaging devices – digital beauty.

Digital Beauty

Beauty is something that concerns each of us, in particular our own "beauty" and that of our friends and family. We all want to look good, and this is especially true when it comes to capturing and saving our precious memories with those friends and family. Each of us will spend many hours of our lives behind a camera shooting pictures and movies and, naturally, we'd like to have everyone look their best. But it can be challenging to control all the variables in many everyday photographic situations to get people in the right place and setting.


Fig. 22 Could makeup be a thing of the past with digital skin smoothing? (Image from author – originally published in Corcoran et al. 2014b)

And even if you get the photographic composition right, often people may not look their best due to variables beyond your control: the lighting may not suit everyone's makeup, skin may look oily due to flash reflections, people's eyes may be half closed, teeth may appear off-white, or you catch someone from the wrong angle, so their face appears plump. And things get even more complex in a group photo where some people always seem to steal the show! Or if you have kids, you'll know how difficult it is to line them up and get them to stay still for even 10 s so you can properly compose a snapshot. Small wonder then that much of our personal photography ends up in a "virtual bin" on our computers. It is no surprise that imaging engineers have turned the new capabilities of digital imaging devices to help solve some of these everyday problems. And equally the public has become very enthusiastic about these new technologies, particularly in Asian markets where digital beauty is a must-have feature on all new smartphones. In fact many Asian ladies have abandoned their hand mirror and use their mobile phone as a substitute! The user-facing camera enables virtual makeup to be superimposed onto their face so that they can try different color combinations before actually applying their makeup. Let's take a quick tour through some of the basic enhancements that are possible with this emerging technology.

Skin Smoothing

As has been explained in earlier sections, a typical state-of-the-art digital camera is capable of performing quite detailed and sophisticated analysis of facial regions and uses this data to enhance image acquisition and global settings for the imaged scene. As part of this analysis, the camera knows both the location of each face region and additional data relating to facial features, in particular the eye regions. Often a face model, seeded precisely by eye location, is also employed to obtain additional geometric data about the face, and a multi-threshold skin map is used to fine-tune image exposure and act as a sanity test for face-tracking algorithms. It does not require a great deal of additional post-processing to select skin regions and carefully apply some graded filters to smooth out any skin blemishes while preserving the facial boundary features. Somewhat ironically, it was the introduction of full-HD video cameras that first created a demand for this technology, as the extra resolution of these cameras was sufficient to highlight many minor facial blemishes.


Fig. 23 Active appearance model for lips showing both "open" and "closed" states of the mouth. Note how the center feature points of the model overlap when the mouth is closed (Original image from Ioana Bacivarov – taken from PhD thesis: Bacivarov 2009)

Fig. 24 Lipstick can be preselected, or even changed afterward, according to your mood (Original image from author – originally published in Corcoran et al. 2014b)

Thus, the manufacturers were quickly forced to provide "soft-focus" modes to ensure market acceptance of their cameras. This digital "soft focus" was the precursor of today's beautification technologies (Fig. 22). Skin smoothing, as with red-eye, has many subtleties in how it is applied, and if used incorrectly, it can destroy the underlying photograph. For example, there are cameras that do not distinguish the size of the face and will indiscriminately blur and smudge smaller face regions. In the same way, processing the entire face uniformly will not yield desirable outcomes. It is important not only to distinguish the face region accurately but also to distinguish subregions for the eyes, mouth, nose, jawline, hairline, and intermediate regions (cheeks, forehead). It then becomes straightforward to determine the regions of the face where a soft-focus effect is to be applied.
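A minimal stand-in for such a graded filter (our own sketch; the filter parameters and mask convention are assumptions) blends an edge-preserving bilateral filter into the image only inside a supplied skin mask:

import cv2
import numpy as np

def smooth_skin(bgr, skin_mask, strength=0.6):
    # Bilateral filtering smooths small blemishes while preserving
    # strong edges such as the jawline and hairline.
    smoothed = cv2.bilateralFilter(bgr, d=9, sigmaColor=75, sigmaSpace=75)
    # skin_mask is an 8-bit mask (255 = skin); scale it into a blend weight.
    alpha = (skin_mask.astype(np.float32) / 255.0)[..., None] * strength
    return (bgr * (1 - alpha) + smoothed * alpha).astype(np.uint8)

A production implementation would grade the blend weight by face size and subregion rather than using a single global strength, for the reasons given above.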



Fig. 25 Better than a visit to the dentist and lasts longer – digital teeth whitening! (Original image from author – originally published in Corcoran et al. 2014b)

Suddenly wrinkles and small facial blemishes are eliminated, restoring youthfulness to the subject's face.

Enhancing the Mouth Region

In the world of digital imaging, we don't need to resort to cosmetic solutions. No, a simple digital model of the lip region is sufficient. This is similar to eye-blink models (Bacivarov et al. 2008a, b) where the object being modeled has a binary state and certain feature points of the model can overlap. The principal shape parameters for this model control the degree of openness of the mouth. Other parameters determine the left-right orientation, the yaw of the mouth region, and the plumpness of the lips. Although this form of model is more complex than conventional approaches due to its binary nature, it has the advantage that in the digital domain you can easily avoid getting lipstick on your teeth! (Figs. 23 and 24). A second offshoot of having an accurate model of the mouth and lips is that you can detect a person's teeth and determine if they need a bit of sprucing up. In Fig. 25 we show an example of some of the improvements that can be achieved. And you don't have to visit a professional photographer to benefit from such digital retouching – just look out for a device that offers "digital beauty" next time you go to upgrade your phone! So how does it work? Well, the active appearance model (AAM) of the face returns the location of the lips; an additional Viola-Jones-type cascade (Corcoran et al. 2012b) detects whether the teeth are visible. Once we know whether the mouth is open, the teeth are identified by determining the whitest segments. The whitening process consists of a careful adjustment to the luminance channel in parallel with desaturation of the color planes. These are only a small sample of what becomes possible with digital beauty – please refer to Corcoran et al. (2014b) for a more detailed review of what is possible with today's technology.
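That luminance-boost-plus-desaturation step can be sketched as follows (our own illustration, using HSV as a stand-in for the luminance/chroma separation described in the text; the gain and mask convention are arbitrary):

import cv2
import numpy as np

def whiten_teeth(bgr, teeth_mask, gain=1.15, desat=0.5):
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    m = teeth_mask > 0
    hsv[..., 2][m] = np.clip(hsv[..., 2][m] * gain, 0, 255)  # brighten V
    hsv[..., 1][m] *= desat                                  # desaturate S
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)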

Looking to the Future

This article has taken a look at what is possible today with consumer cameras and smartphones. Naturally the pace of technological progress will continue, and I suspect that in only a few more years, we'll see even smarter and more sophisticated imaging technologies implemented and finding their way into our everyday lives.



Some of these will be new, surprising, and disruptive, while others will be evolutionary rather than revolutionary. One clear trend is toward more sophisticated hardware processing of images. Once a high-level algorithm, such as face detection, becomes well established, the next step is to work out the underlying primitive operations that it implements and to realize these in an optimized hardware configuration. You might expect that the cost and complexity would not make sense, but in fact such highly optimized hardware architectures can greatly reduce the computing resources required. Today we are seeing multi-core processors with multiple GPUs – but these were designed to render graphics on a display, not to process and analyze the scene. As a consequence, when GPU cores are used to analyze and process video, they tend to be worked at full computational capacity, which also increases energy demands. In brief, these powerful GPU engines become battery hogs, and in many of today's smartphones they are now the top consumer of system power. Video stabilization is a particularly good example where top-end smartphones struggle to compensate for hand motion on full-HD videos. By contrast, a dedicated hardware system can perform the same stabilization function using significantly less energy and fewer hardware gates. And for the future? Well, we believe there will be a trend toward more sophisticated hardware architectures that can support advanced facial models. These would enable not only an accurate face pose to be determined but also full frame-to-frame tracking of facial expressions, eye gaze, and real-time beautification/augmentation of faces. The same engines could track our hands and body motions, enabling a detailed analysis of human motions within an imaged scene. Other common objects (vehicles, bicycles, footballs, beach toys, dolls) and common scene types (beach, urban, country, winter snow, indoor) can be detected, and background transitions and lighting contexts can be identified and used to enhance the overall scene. Foreground objects can be emphasized or de-emphasized; backgrounds can be replaced, faded, or made more vivid; colors can be remapped; and multiple professional effects can be enabled through the enhanced scene analysis and understanding that can now be available from image frame to frame. In brief, the age of smart imaging is upon us – and coming to a smartphone near you in the next year or even sooner! One wonders what will become of the phrase "The camera never lies!" It is surely about to become redundant. Some of these advances will bring enhanced photographs and videos, but there are other implications. Today our smartphones are becoming the data hubs of our modern lives. In the very near future, they will also become our financial hubs, as enhanced imaging will lead to enhanced authentication through biometric technologies and personal behavior profiling. After all, your smartphone is well placed to check your face, analyze your location, and even scan your iris or fingerprints to complete the authentication cycle. All of these functions can be achieved today, and most of them can be performed in the background without interrupting the user of the device. So as consumer imaging evolves as an integral component of our smartphones, it moves beyond a means to capture personal images and video, becoming an enabling technology of increasing power and sophistication.
This opens the door to many new applications and uses, but at the same time raises new questions and issues, particularly with regard to personal privacy and cybersecurity. But this is not the place or time to expand our discussion further – instead, let's conclude with the thought that while the future is never certain, there is one thing we can be sure of with regard to digital imaging: we are on the cusp of a new technology revolution, and driven by 4K video and increasingly powerful smartphones, this future will be anything but dull!



Further Reading

Bacivarov I (2009) Advances in the modelling of facial sub-regions and facial expressions using active appearance techniques. Doctoral dissertation
Bacivarov I, Ionita M, Corcoran P (2008a) Statistical models of appearance for eye tracking and eye-blink detection and measurement. IEEE Trans Consum Electron 54(3):1312–1320
Bacivarov I, Corcoran PM, Ionita MC (2008b) A statistical modeling based system for blink detection in digital cameras. In: 2008 digest of technical papers – international conference on consumer electronics, pp 1–2
Bigioi P, Zaharia C, Corcoran P (2012) Advanced hardware real time face detector. In: IEEE international conference on consumer electronics (ICCE)
Brubaker SC, Wu J, Sun J, Mullin MD, Rehg JM (2007) On the design of cascades of boosted ensembles for face detection. Int J Comput Vis 77(1–3):65–86
Corcoran P (2015) To gaze with undimmed eyes on all darkness [IP Corner]. IEEE Consum Electron Mag 4(1):99–103
Corcoran P, Bigioi P, Steinberg E, Pososin A (2005) Automated in-camera detection of flash-eye defects. IEEE Trans Consum Electron 51(1):11–17
Corcoran P, Steinberg E, Petrescu S, Drimbarean A, Nanu F, Pososin A, Bigioi P (2008) US Patent 7,315,631. US Patent and Trademark Office, Washington, DC
Corcoran P, Steinberg E, Petrescu S, Drimbarean A, Nanu F, Pososin A, Bigioi P (2008) Real-time face tracking in a digital image acquisition device. US Patent 7,469,055, 02 Dec 2008
Corcoran PM, Nanu F, Petrescu S, Bigioi P (2012a) Real-time eye gaze tracking for gaming design and consumer electronics systems. IEEE Trans Consum Electron 58:347–355
Corcoran P, Bigioi P, Nanu F (2012b) Advances in the detection & repair of flash-eye defects in digital images – a review of recent patents. Recent Patents Electr Electron Eng 5(1):30–54
Corcoran PM, Bigioi P, Nanu F (2013) Half-face detector for enhanced performance of flash-eye filter. In: 2013 IEEE international conference on consumer electronics (ICCE), pp 252–253
Corcoran P, Bigioi P, Nanu F (2014a) Detection and repair of flash-eye in handheld devices. In: 2014 IEEE international conference on consumer electronics (ICCE)
Corcoran P, Stan C, Florea C, Ciuc M, Bigioi P (2014b) Digital beauty: the good, the bad, and the (not-so) ugly. IEEE Consum Electron Mag 3(4):55–62
Do TT, Doan KN, Le TH, Le BH (2009) Boosted of Haar-like features and local binary pattern based face detection. In: 2009 IEEE-RIVF international conference on computing and communication technologies, pp 1–8
Duchowski AT (2002) A breadth-first survey of eye-tracking applications. Behav Res Methods Instrum Comput 34(4):455–470
Gallagher P (2012) Smart-phones get even smarter cameras [future visions]. IEEE Consum Electron Mag 1(1):25–30
Hefenbrock D, Oberg J, Thanh NTN, Kastner R, Baden SB (2010) Accelerating Viola-Jones face detection to FPGA-level using GPUs. In: 2010 18th IEEE annual international symposium on field-programmable custom computing machines, pp 11–18
Hiromoto M, Sugano H, Miyamoto R (2009) Partially parallel architecture for AdaBoost-based detection with Haar-like features. IEEE Trans Circuits Syst Video Technol 19(1):41–52
Huang C, Ai H, Li Y, Lao S (2005) Vector boosting for rotation invariant multi-view face detection. In: Tenth IEEE international conference on computer vision (ICCV'05), vol 1, pp 446–453
Huang C, Ai H, Li Y, Lao S (2007) High-performance rotation invariant multiview face detection. IEEE Trans Pattern Anal Mach Intell 29(4):671–686


Hutchinson TE, White KP, Martin WN, Reichert KC, Frey LA (1989) Human-computer interaction using eye-gaze input. IEEE Trans Syst Man Cybern 19(6):1527–1534
Ianculescu M, Bigioi P, Gangea M, Petrescu S, Corcoran P, Steinberg E (2008) Real-time face tracking in a digital image acquisition device. US Patent 7,403,643, 22 Jul 2008
Jacob RJK (1991) The use of eye movements in human-computer interaction techniques: what you look at is what you get. ACM Trans Inf Syst 9(2):152–169
Kublbeck C, Ernst A (2006) Face detection and tracking in video sequences using the modified census transformation. Image Vis Comput 24(6):564–572
Lienhart R, Maydt J (2002) An extended set of Haar-like features for rapid object detection. In: Proceedings of the 2002 international conference on image processing, vol 1, pp 900–903
Lui YM, Beveridge JR, Whitley LD (2010) Adaptive appearance model and condensation algorithm for robust face tracking. IEEE Trans Syst Man Cybern Part A Syst Humans 40(3):437–448
Messom C, Barczak A (2006) Fast and efficient rotated Haar-like features using rotated integral images. In: Australasian conference on robotics and automation, pp 1–6
Nanu F, Petrescu S, Corcoran P, Bigioi P (2011) Face and gaze tracking as input methods for gaming design. In: 2011 IEEE international games innovation conference (IGIC), pp 115–116
Papageorgiou CP, Oren M, Poggio T (1998) A general framework for object detection. In: Sixth international conference on computer vision, pp 555–562
Ren J, Kehtarnavaz N, Estevez L (2008) Real-time optimization of Viola-Jones face detection for mobile platforms. In: 2008 IEEE Dallas circuits and systems workshop: system-on-chip – design, applications, integration, and software, pp 1–4
Ren H, Che M, Haiwang R, Ming C (2011) A multi-core architecture for face detection based on application specific instruction-set processor. In: 2011 international conference on multimedia technology (ICMT), vol 10, pp 3354–3357
Steinberg S (2005) Method and apparatus for the automatic real-time detection and correction of red-eye defects in batches of digital images or in handheld appliances. US Patent 6,873,743
Tresadern PA, Ionita MC, Cootes TF (2012) Real-time facial feature tracking on a mobile device. Int J Comput Vis 96(3):280–289
Verschae R, Ruiz-del-Solar J, Correa M (2007) A unified learning framework for object detection and classification using nested cascades of boosted classifiers. Mach Vis Appl 19(2):85–103
Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE Computer Society conference on computer vision and pattern recognition (CVPR), vol 1, pp 511–518
Viola P, Jones MJ (2004) Robust real-time face detection. Int J Comput Vis 57(2):137–154
Wang P, Shen C, Zheng H, Ren Z (2010) Training a multi-exit cascade with linear asymmetric classification for efficient object detection. In: 2010 IEEE international conference on image processing, pp 61–64
Whitehill J, Littlewort G, Fasel I, Bartlett M, Movellan J (2009) Toward practical smile detection. IEEE Trans Pattern Anal Mach Intell 31:2106–2111
Wu B, Ai H, Huang C, Lao S (2004) Fast rotation invariant multi-view face detection based on real AdaBoost. In: Sixth IEEE international conference on automatic face and gesture recognition, pp 79–84
Zaharia C, Bigioi P, Corcoran P (2011) Hybrid video-frame pre-processing architecture for HD-video. In: IEEE international conference on consumer electronics (ICCE), pp 89–90
Zeng H, Luo R (2013) Colour and tolerance of preferred skin colours on digital photographic images. Color Res Appl 38:30–45



Useful URLs

http://en.wikipedia.org/wiki/Digital_camera
http://opencv.org/
http://opencv-python-tutroals.readthedocs.org/en/latest/index.html

Books

Corcoran P (ed) New approaches to characterization and recognition of faces. InTech. ISBN 978-953-307-515-0, 262 p, chapters published August 01, 2011 under CC BY-NC-SA 3.0 license. doi:10.5772/994
Corcoran PM (ed) Reviews, refinements and new ideas in face recognition. InTech. ISBN 978-953-307-368-2, 338 p, chapters published July 27, 2011 under CC BY-NC-SA 3.0 license. doi:10.5772/743


Color-Property Measurement of Electronic Paper Displays

Tzeng-Yow Lin, Bor-Jiunn Wen, Hsiu-Ju Tsai, and Wen-Chun Liu

Contents Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Measurement Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Apparatus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Viewing Direction Coordinate System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Reflection Measurement Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Luminance Measurement Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Colors of the DUT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Measurement for the DUT with ILU Off . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Measurement of the DUT with ILU On . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

Abstract

Reflective display technology is very important for electronic paper (e-paper) displays: such displays create images and need no power to hold them (bistability). Owing to this advantage, more and more applications of color e-paper displays have appeared in the market. For the convenience of reading, some e-paper displays are equipped with a front light. In this chapter, suitable measurement methods are proposed to evaluate the color and correlated optical properties of e-paper displays with the front light on or off.

T.-Y. Lin (*) Industrial Technology Research Institute, Hsinchu, Taiwan e-mail: [email protected]
B.-J. Wen Department of Mechanical and Mechatronic Engineering, National Taiwan Ocean University, Keelung, Taiwan e-mail: [email protected]
H.-J. Tsai • W.-C. Liu Industrial Technology Research Institute, Hsinchu, Taiwan e-mail: [email protected]; [email protected]
# Springer-Verlag Berlin Heidelberg 2015 J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_211-1




List of Abbreviations and Acronyms

DUT  Display under test
EPD  e-Paper display
ILU  Integrated lighting unit (e.g., an edge-lit front light plate)
LMD  Light-measuring device

List of Definitions

Bistability: Ability to hold the image without applied voltage.
Contrast ratio: Ratio of the DUT brightness at white state to that at black state.
Gamut area: Area enclosed by the chromaticity coordinates of the display primary colors.
Gamut area ratio: Ratio of the gamut area to the area defined by a particular specification in a color space coordinate.
Uniformity: Characterizes the change in brightness or color difference over the display area of the DUT.
Viewing direction: Direction from which the observer looks at the point of interest on the DUT. Viewing direction dependence characterizes the change in brightness, contrast ratio, and color relative to their values at the normal viewing direction.

Introduction

From the viewpoint of consumers, it is required that an electronic book reader based on a reflective display does not cause more visual fatigue during long hours of reading than one based on an emissive (backlit) display (Wen et al. 2011). The market already offers a variety of color electronic paper (e-paper) displays for different applications. To improve the color properties and to allow reading in environments with little or no light, a light source using light-emitting diodes (LEDs) mounted along the sides of the e-paper display is often added. For characterization, suitable measurement methods are proposed to evaluate the color and correlated optical properties of e-paper displays with the front light on or off.

Measurement Geometry

Apparatus
Light Source – If the ILU of the EPD is turned off, standard directional, ring light, or hemispherical illumination conditions shall be used. The illumination spectra should approximate CIE illuminant D50, D65, or A (incandescent lamp, e.g., a tungsten halogen lamp).


When different light sources are used, they shall be noted in the report. Directional illumination simulates sunlight or small indoor light sources. It is produced by approximately parallel rays incident on the DUT. The maximum deviation of the rays from the optical axis shall not exceed 5°, and the illumination across the cross section of the beam shall be uniform to within 5 %. For 45°x:0° geometry, the DUT is illuminated at an angle of 45° from the normal to the DUT surface and viewed along the normal to the DUT surface, or within 10° of the normal. Ring light illumination is a special case of directional illumination with rotational symmetry about the display's surface normal, centered on the measurement spot. For 45°a:0° geometry, the DUT is illuminated at 45° circumferential incidence from the normal to the DUT surface and viewed along the normal, or within 10° of it. Uniform hemispherical diffuse illumination simulates outdoor skylight or indoor diffuse background illumination. It is produced by an integrating sphere with the DUT inside, or by a sampling sphere with the DUT at the exit port.
Light-Measuring Device (LMD) – Spectroradiometers are used for the measurement of the spectral reflectance factor (ILU off) or the spectral radiance (ILU on). The wavelength interval, Δλ, shall not be greater than 10 nm over the wavelength range from 380 to 780 nm. Colorimeters are used for the measurement of CIE 1931 tristimulus values (X, Y, Z). Luminance meters are used for the measurement of photometric reflectance (ILU off) or luminance (ILU on).
Reflection Standards – For the spectral reflection measurement, one reference white standard and one reference black standard with calibrated spectral reflectance factors RWS(λ) and RBS(λ), respectively, for the same measurement geometry as in the test configuration shall be used. For the filter photometric reflection measurement, one reference white standard and one reference black standard with calibrated CIE tristimulus values (XWS, YWS, ZWS) and (XBS, YBS, ZBS), respectively, for the same measurement geometry as in the test configuration shall be used. Diffuse reflectance standards are highly Lambertian and optically flat to ±1 % over the visual region of the spectrum. White standards have a diffuse reflectance of 98 % or more; black standards have a diffuse reflectance of 2 % or less.

Viewing Direction Coordinate System The display is at the center of an x-y-z coordinate system, with x-y the display surface and z the display normal (Fig. 1). The direction of a directed light source and the direction of the LMD (or viewer) are parameterized by azimuth angle φ about the surface normal (rotating anticlockwise, starting from φ = 0 at 3 o’clock, which is also the x-direction) and angle of inclination θ to the normal z.
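To make the coordinate convention concrete, a minimal sketch in Python (the function name is illustrative and not part of any standard):

```python
import math

def viewing_direction(theta, phi):
    """Unit vector for a direction with inclination theta (measured from the
    display normal z) and azimuth phi (anticlockwise from the x-axis)."""
    return (math.sin(theta) * math.cos(phi),  # x component
            math.sin(theta) * math.sin(phi),  # y component
            math.cos(theta))                  # z component, the display normal

print(viewing_direction(0.0, 0.0))              # normal viewing: (0, 0, 1)
print(viewing_direction(math.pi / 4, math.pi))  # 45-degree tilt in the x-z plane
```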


Fig. 1 Representation of the viewing direction, or direction of measurement, defined by the angle of inclination and the angle of rotation (azimuth angle) in a polar coordinate system


Reflection Measurement Geometry
This geometry is used when the ILU is off, so that measurement illumination is required. For the geometry of the diffuse and directional illumination, refer to Sects. 2.2 and 2.3 in chapter "17.7.1."

Viewing Direction Dependent Reflection
Different viewing directions can be achieved by varying the inclination θd of the LMD in the horizontal x-z plane between (θd = π/2, φ = π) and (θd = π/2, φ = 0). The direct source is inclined in the y-z plane by (θs = π/4 = 45°, φ = π/2) so that light source reflections are excluded from the measurement (Fig. 2). The viewing direction rotation θd can be achieved either by rotating the LMD as shown in Fig. 2, or by rotating the DUT about the y-axis, with the directed source fixed to the DUT and rotating with it.
The vertical viewing direction (vertical relative to the display) can be measured with the same geometry after rotating the display by 90° about the z-axis (Fig. 3). Different viewing directions can be achieved by varying the detector inclination θd in the y-z plane between (θd = π/2, φ = π/2) and (θd = π/2, φ = 3π/2). The direct source is inclined in the x-z plane by (θs = π/4 = 45°, φ = 0) so that light source reflections are excluded from the measurement (Fig. 3).

Luminance Measurement Geometry
This measurement geometry is used when the ILU is on, so that no other measurement illumination is required.



Fig. 2 Horizontal viewing direction geometry with directional illumination; θs is source inclination in y-z-plane; θd is viewing direction in x-z-plane

Fig. 3 Vertical viewing direction geometry with directional illumination: illumination and measurement geometry for the vertical viewing direction


Normal Viewing Direction
The LMD should view the DUT along the normal direction to the DUT (Fig. 4). The angular uncertainty of the normal direction is less than 0.3°. The measurement field angle is 2° or less. The vertical viewing direction (vertical relative to the display) can be measured with the same geometry after rotating the display by 90° about the z-axis.
Viewing Direction Dependence
Different viewing directions can be achieved by varying the detector inclination θd in the horizontal x-z plane between (θd = π/2, φ = π) and (θd = π/2, φ = 0). The viewing direction rotation θd can be achieved either by rotating the LMD as shown in Fig. 5, or by rotating the DUT about the y-axis. The vertical viewing direction (vertical relative to the display) can be measured with the same geometry after rotating the display by 90° about the z-axis (Fig. 6).


Fig. 4 Luminance measured by perpendicular viewing


Fig. 5 Horizontal viewing direction geometry; θd is viewing direction in x-z-plane



Fig. 6 Vertical viewing direction measurement geometry


Different viewing directions can be achieved by varying the detector inclination θd in a y-z plane between (θd = π/2, φ = π/2) and (θd = π/2, φ = 3π/2).


Measurement

Colors of the DUT
The color performance of the DUT is evaluated by switching it to eight colors: red, green, blue, cyan, yellow, magenta, black, and white. Each color is displayed full screen at its maximum signal level. The signal levels for the colors are shown in Table 1.

Measurement for the DUT with ILU Off

Spectral Reflection Measurement
According to section "Measurement Geometry," set up the measurement geometry and warm up the light source. Firstly, measure the spectral reflectance factor of the reference white standard, R′WS(λ), with a spectroradiometer. Secondly, measure the spectral reflectance factor of the reference black standard, R′BS(λ). Thirdly, refresh the DUT to erase the previous image by displaying full-screen white and full-screen black alternately several times. Fourthly, have the DUT display the test pattern. Finally, measure the spectral reflectance factor of the DUT at time t, R′(λ, t), where time t starts from the moment the DUT is triggered to display the test pattern.

Color Measurement
Firstly, set up the measurement geometry based on section "Measurement Geometry" and warm up the light source. Secondly, measure the CIE tristimulus values of the reference white standard, X′WS, Y′WS, and Z′WS. Thirdly, measure the CIE tristimulus values of the reference black standard, X′BS, Y′BS, and Z′BS. Fourthly, refresh the DUT to erase the previous image by displaying full-screen white and full-screen black alternately several times. Fifthly, have the DUT display the test pattern. Finally, measure the CIE tristimulus values of the DUT at time t: X′(t), Y′(t), and Z′(t).

Table 1 8-bit signal levels for the test colors

Color     Red  Green  Blue
Red       255    0      0
Green       0  255      0
Blue        0    0    255
Yellow    255  255      0
Magenta   255    0    255
Cyan        0  255    255
White     255  255    255
Black       0    0      0



Fig. 7 Six primary colors in CIELAB color space coordinate

Color Gamut Measurement
Measure the gamut as the area enclosed by six colors, red, green, blue, cyan, yellow, and magenta, in the CIELAB color space under the reference illuminant. With the six primaries taken in hue order, $(a^*_1, b^*_1), \ldots, (a^*_6, b^*_6) = \mathrm{R, Y, G, C, B, M}$ and $(a^*_7, b^*_7) \equiv (a^*_1, b^*_1)$, the gamut area at time t is obtained as:

$$GA_{\mathrm{off}}(t) = \frac{1}{2}\left|\sum_{k=1}^{6}\bigl(a^*_k(t)\,b^*_{k+1}(t) - a^*_{k+1}(t)\,b^*_k(t)\bigr)\right| \qquad (1)$$

where (a*R(t), b*R(t)), (a*G(t), b*G(t)), (a*B(t), b*B(t)), (a*Y(t), b*Y(t)), (a*M(t), b*M(t)), and (a*C(t), b*C(t)) are produced by the primary red, green, blue, yellow, magenta, and cyan shown on the screen, in CIELAB color space coordinates (see Fig. 7). Then, the gamut area ratio is obtained as:

$$GA_{\mathrm{off}}\ \mathrm{Ratio}\,(\%) = \frac{GA_{\mathrm{off}}}{\mathrm{Newsprint\ Area}} \times 100\,\% \qquad (2)$$

where the Newsprint Area is the area enclosed by the primary red, green, blue, cyan, yellow, and magenta of ISO 12647-3 in CIELAB color space coordinates, shown in Table 2. The gamut area ratio should be calculated with the color space coordinates measured at the 45°:0° or 0°:45° geometry, D50 illuminant, 2° observer.

Table 2 CIELAB color space coordinates for the primary colors

Color      a*       b*
Red        41.0     25.0
Green     -34.0     17.0
Blue        7.0    -22.0
Yellow     -3.0     58.0
Magenta    44.0     -2.0
Cyan      -23.0    -27.0

Fig. 8 Full-screen uniformity test pattern with numbered test points (nine test points, 1-9, on a 3 × 3 grid located at D/6, D/2, and 5D/6 horizontally and W/6, W/2, and 5W/6 vertically within the active area; D and W are the horizontal and vertical lengths of the active area)
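The hexagon area of Eq. 1 and the ratio of Eq. 2 reduce to a polygon-area (shoelace) computation. A minimal sketch in Python, using the Table 2 newsprint coordinates as the reference (the function names are illustrative):

```python
def hexagon_area(primaries):
    """Eq. 1: shoelace area of the gamut polygon; `primaries` is a list of
    (a*, b*) pairs for R, Y, G, C, B, M taken in hue order."""
    n = len(primaries)
    s = sum(primaries[i][0] * primaries[(i + 1) % n][1]
            - primaries[(i + 1) % n][0] * primaries[i][1] for i in range(n))
    return abs(s) / 2.0

# ISO 12647-3 newsprint primaries (Table 2), in hue order R, Y, G, C, B, M
NEWSPRINT = [(41.0, 25.0), (-3.0, 58.0), (-34.0, 17.0),
             (-23.0, -27.0), (7.0, -22.0), (44.0, -2.0)]

def gamut_area_ratio(dut_primaries):
    """Eq. 2: DUT gamut area relative to the newsprint area, in percent."""
    return 100.0 * hexagon_area(dut_primaries) / hexagon_area(NEWSPRINT)
```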

Bistability Measurement
Measure the spectral reflectance factors or the CIE tristimulus values of a full-screen test color at two different times, t1 and t2. Time t1 is just after the triggering finishes, and t2 is more than 5 min later. Then, bistability is obtained as:

$$\mathrm{Bistability} = \left(1 - \frac{\left|L^*(t_1) - L^*(t_2)\right|}{L^*(t_1)}\right) \times 100\,\% \qquad (3)$$

where L*(t1) and L*(t2) are the CIELAB lightness values of the DUT measured at times t1 and t2, respectively.
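As a one-line check of Eq. 3 (the input values below are illustrative, not measured data):

```python
def bistability(L1, L2):
    """Eq. 3: percent lightness retention between times t1 and t2."""
    return (1.0 - abs(L1 - L2) / L1) * 100.0

print(bistability(70.0, 68.5))  # -> 97.857..., i.e. about 97.9 % retention
```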

Uniformity Measurement
Firstly, set the DUT to a full-screen test color and measure the spectral reflectance factors or the CIE tristimulus values at the nine test points marked in Fig. 8; D and W indicate the horizontal and vertical lengths of the active area. Secondly, calculate the ratio of the minimum to the maximum lightness over all nine test points. Thirdly, calculate the CIELAB color differences between points number 2 and number 8 and between points number 4 and number 6. Finally, uniformity is obtained as:

$$\mathrm{Uniformity} = \frac{L^*_{\min}}{L^*_{\max}} \times 100\,\% \qquad (4)$$

where L*min and L*max are the minimum and maximum CIELAB lightness values of the DUT measured at the nine sample points, respectively.
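Equation 4 is likewise a one-liner over the nine lightness readings (a sketch; the input values are illustrative):

```python
def uniformity(lightness):
    """Eq. 4: min/max CIELAB lightness ratio over the nine test points, in percent."""
    return 100.0 * min(lightness) / max(lightness)

print(uniformity([71.2, 70.8, 70.5, 71.0, 70.9, 70.4, 70.6, 70.7, 69.9]))  # ~98.2 %
```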

Viewing Direction Measurement
Use the collimated (directional) illumination measurement geometry to evaluate the viewing direction dependence of the DUT. Set the DUT to a full-screen test color. Measure the spectral reflectance factors or the CIE tristimulus values at different horizontal or vertical viewing directions. Evaluate the viewing direction dependence by calculating the contrast ratio (white to black reflectance factors) as a function of viewing direction θd. Calculate the CIELAB lightness difference and the CIELAB color difference between the viewing direction θd and the normal viewing direction θd = 0°. The CIELAB color difference is:

$$\Delta E^*_{ab} = \sqrt{\left(L^*_i - L^*_j\right)^2 + \left(a^*_i - a^*_j\right)^2 + \left(b^*_i - b^*_j\right)^2} \qquad (5)$$

where (L*i, a*i, b*i) and (L*j, a*j, b*j) are the CIELAB values of the DUT measured at two different times, positions, or viewing directions i and j.
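Equation 5 transcribes directly (a sketch with illustrative input values):

```python
def delta_E_ab(lab_i, lab_j):
    """Eq. 5: CIELAB color difference between two (L*, a*, b*) triples."""
    return sum((p - q) ** 2 for p, q in zip(lab_i, lab_j)) ** 0.5

print(delta_E_ab((70.0, 2.0, -1.0), (68.0, 2.5, 0.0)))  # ~2.29
```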

Measurement of the DUT with ILU On

Spectral Radiance Measurement
Firstly, turn on the ILU of the DUT and warm it up. Secondly, refresh the DUT to erase the previous image by displaying full-screen white and full-screen black alternately several times. Thirdly, have the DUT display the test pattern. Finally, measure the spectral radiance of the DUT at time t, L(λ, t), where time t starts from the moment the DUT is triggered to display the test pattern.

Color and Color Gamut Measurement
From the spectral radiance measurement in section "Spectral Radiance Measurement," the CIE tristimulus values of the DUT at time t, X′(t), Y′(t), and Z′(t), are obtained. Then, measure the color gamut as the area enclosed by the three colors red, green, and blue; the CIE 1976 chromaticity coordinates u′v′ for the reference illuminant are obtained as:

$$u' = \frac{4X}{X + 15Y + 3Z} = \frac{4x}{3 + 12y - 2x}, \qquad v' = \frac{9Y}{X + 15Y + 3Z} = \frac{9y}{3 + 12y - 2x} \qquad (6)$$

where

$$x = \frac{X}{X + Y + Z}, \qquad y = \frac{Y}{X + Y + Z}$$

and X, Y, and Z are the tristimulus measurement values of the DUT. Finally, calculate the gamut area at time t:

$$GA_{\mathrm{on}} = \frac{1}{2}\left|\det\begin{pmatrix} x_R(t) & y_R(t) & 1 \\ x_G(t) & y_G(t) & 1 \\ x_B(t) & y_B(t) & 1 \end{pmatrix}\right| \qquad (7)$$

where (xR(t), yR(t)), (xG(t), yG(t)), and (xB(t), yB(t)) are the coordinates of the DUT full-screen primary colors red, green, and blue in CIE 1931 chromaticity coordinates. Then, the gamut area ratio is obtained as:

$$GA_{\mathrm{on}}\ \mathrm{Ratio}\,(\%) = \frac{GA_{\mathrm{on}}}{\mathrm{NTSC\ Area}} \times 100\,\% \qquad (8)$$

where the NTSC Area is enclosed by the primary red (0.67, 0.33), primary green (0.21, 0.71), and primary blue (0.14, 0.08) of the original 1953 color NTSC specification in the CIE 1931 color space.
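Equations 6-8 can be transcribed directly; a minimal sketch (the function names are illustrative), with the NTSC 1953 primaries taken from the text:

```python
def xyz_to_uv(X, Y, Z):
    """Eq. 6: CIE 1976 u'v' chromaticity from tristimulus values."""
    d = X + 15.0 * Y + 3.0 * Z
    return 4.0 * X / d, 9.0 * Y / d

def triangle_area(p1, p2, p3):
    """Eq. 7: half the absolute 3x3 determinant over the (x, y) primaries."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0

# Eq. 8 reference: original 1953 NTSC primaries in CIE 1931 xy
NTSC_AREA = triangle_area((0.67, 0.33), (0.21, 0.71), (0.14, 0.08))

def gamut_area_ratio_on(red_xy, green_xy, blue_xy):
    """Eq. 8: DUT RGB triangle area relative to the NTSC area, in percent."""
    return 100.0 * triangle_area(red_xy, green_xy, blue_xy) / NTSC_AREA
```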

Bistability Measurement
Measure the spectral radiance or the CIE tristimulus values of a full-screen test color at two different times, t1 and t2. According to SEMI D68-0512 (Semiconductor Equipment and Materials International 2012), time t1 is just after the triggering finishes, and t2 is more than 5 min later. Then, bistability is obtained by using Eq. 3.

Uniformity Measurement
Set the DUT to a full-screen test color. Measure the spectral radiance or the CIE tristimulus values at the nine test points marked in Fig. 8; D and W indicate the horizontal and vertical lengths of the active area. Calculate the ratio of the minimum to the maximum luminance over all nine test points. Calculate the u′v′ color differences between points number 2 and number 8 and between points number 4 and number 6. Then, uniformity is obtained by Eq. 4.

Viewing Direction Measurement
Set the DUT to a full-screen test color. Measure the spectral radiance or the CIE tristimulus values at different horizontal or vertical viewing directions. Evaluate the viewing direction dependence by calculating the contrast ratio (white to black luminance) as a function of viewing direction θd. Calculate the luminance difference and the u′v′ color difference between a viewing direction θd and the normal viewing direction θd = 0°. Finally, the u′v′ color difference is obtained as:

$$\Delta u'v' = \sqrt{\left(u'_i - u'_j\right)^2 + \left(v'_i - v'_j\right)^2} \qquad (9)$$

where (u′i, v′i) and (u′j, v′j) are the u′v′ color space coordinates of the DUT measured at two different times, positions, or viewing directions i and j.
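And Eq. 9 as a sketch, taking pairs as returned by the xyz_to_uv function above:

```python
def delta_uv(uv_i, uv_j):
    """Eq. 9: u'v' color difference between two (u', v') chromaticity pairs."""
    return ((uv_i[0] - uv_j[0]) ** 2 + (uv_i[1] - uv_j[1]) ** 2) ** 0.5
```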

Summary
In this chapter, measurement geometry, spectral reflection, filter photometric reflection, color and color gamut, bistability, uniformity, and viewing direction measurement methods are proposed to evaluate the color performance of e-paper displays with the front light on or off. The proposed measurement methods may help designers and makers create useful and comfortable e-paper products.

Further Reading
Commission Internationale de l'Eclairage (2004) Colorimetry, 3rd edn. CIE 15:2004. CIE Central Bureau, Vienna
International Committee for Display Metrology (2012) Information display measurements standard (IDMS v1.03). Society for Information Display, Campbell, CA, USA
International Electrotechnical Commission (2011) IEC 61747-6-2:2011 – measuring methods for liquid crystal display modules – reflective type. IEC, Geneva, Switzerland
International Organization for Standardization (2005) ISO 12647-3:2005 – graphic technology – process control for the production of half-tone colour separations, proofs and production prints – part 3: coldset offset lithography on newsprint. ISO, Vernier, Geneva, Switzerland
Semiconductor Equipment and Materials International (2012) SEMI D68-0512: test methods for optical properties of electronic paper displays. SEMI, San Jose, CA, USA
Video Electronics Standards Association (2001) VESA Flat Panel Display Measurements (FPDM) Standard, Ver. 2.0. VESA, Milpitas, CA, USA
Wen BJ, Lai YY, Tsai TC, Ke MT, Chen CY, Liu TS (2011) Visual fatigue difference analysis between reflective and emissive backlight of electronic-book readers. In: Proceedings of China Display/Asia Display 2011, Kunshan

AMOLED Manufacture
Glory K. J. Chen and Janglin Chen

Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
White OLED Versus RGB Side-by-Side AMOLED: Pros and Cons . . . . . . . . . . . . . . . . . . . 2
Manufacturing of RGB Side-by-Side AMOLED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Manufacturing Process Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Production Equipment Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
OLED Evaporation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
OLED Encapsulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
High-Resolution AMOLED Manufacturing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
RGB SBS FMM Challenge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Main Issues with High-Resolution AMOLED Mass Production . . . . . . . . . . . . . . . . . . . . . . . 14
White OLED with Color Filter Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

Abstract

RGB side-by-side and white OLED with color filter are the two major technologies for manufacturing AMOLED. The skill sets and considerations for AMOLED manufacturing in the factory are much different from those for technical development in the laboratory. In this chapter, we discuss the manufacturing process flow, production equipment design, OLED evaporation, and encapsulation technology for mass production. The resolution requirement of displays is getting higher and higher, and RGB side-by-side AMOLED, which uses a fine metal mask, faces a far more difficult challenge than the white OLED with color filter approach does.

G.K.J. Chen (*) Display Technology Center, Industrial Technology Research Institute (ITRI), Hsinchu, Taiwan, Republic of China
Panel Integration Division I, ITRI, Chutung, Hsinchu, Taiwan
e-mail: [email protected]
J. Chen Panel Integration Division I, ITRI, Chutung, Hsinchu, Taiwan
e-mail: [email protected]
# Springer-Verlag Berlin Heidelberg 2015
J. Chen et al. (eds.), Handbook of Visual Display Technology, DOI 10.1007/978-3-642-35947-7_212-1

List of Abbreviations

AGV  Automated guided vehicle
AMOLED  Active matrix organic light-emitting diode
AR  Antireflection
BE  Bottom emission
ELA  Excimer laser annealing
EML  Emission layer
ETL  Electron transport layer
FPD  Flat panel display
FMM  Fine metal mask
HIL  Hole injection layer
HTL  Hole transport layer
LTPS  Low temperature polysilicon
MGV  Manually guided vehicle
NTSC  National Television System Committee
RPL  Redundant pixel line
SBS  Side-by-side
TE  Top emission
VAS  Vacuum assembly system
WOLED  White organic light-emitting diode

Introduction

White OLED Versus RGB Side-by-Side AMOLED: Pros and Cons
Active matrix organic light-emitting diode (AMOLED) displays have been adopted extensively in many mobile devices in recent years. The Samsung Galaxy series products are the best-known examples of AMOLED application, as illustrated in Fig. 1. RGB side-by-side (SBS) by fine metal mask (FMM) technology was used for these commercial AMOLED products. For mobile device applications, RGB SBS AMOLED has the merits of OLED efficiency and low power consumption; the challenge of the FMM lies in the yield of high-resolution AMOLED products in mass production. White OLED with color filter, on the other hand, was developed to achieve high-resolution AMOLED products without FMM technology. Moreover, the handling of a large-size FMM is a serious issue for larger than Gen 6 OLED mass production lines.


Fig. 1 Samsung Galaxy series smartphones (Source: ITRI/DTC Summary, 2014.11)

As a result, the white OLED with color filter technology is popular for large screens, such as TV applications (Chang-Wook Han et al. nd, 2002). Figure 2 shows the pros and cons of RGB SBS versus white OLED with color filter.

Manufacturing of RGB Side-by-Side AMOLED

Manufacturing Process Flow
The microcavity structure of the RGB SBS AMOLED is essential in the manufacturing process. The light emitted by the OLED interferes between the semitransparent cathode and the reflective anode. By adjusting the optical path of the cavity, i.e., the total thickness of the organic layers of the OLED, the emissive color can be tuned to the desired spectrum. The interaction of the OLED-emitted light in the optical cavity is illustrated in Fig. 3.
The RGB SBS AMOLED manufacturing process flow is shown in Fig. 4. TFT substrate treatment is especially important prior to the OLED evaporation process. A wet cleaning process is designed to remove most of the particles larger than 1 μm in size, while an additional baking process eliminates absorbed moisture. A high vacuum is required during OLED deposition. The FMM process is applied not only to RGB deposition but also to the hole injection layer (HIL) or hole transport layer (HTL) for top emission (TE) microcavity adjustment. A nitrogen environment is required in OLED encapsulation, with the moisture and oxygen concentrations controlled to less than 1 ppm.


AMOLED type                         Pros                                                    Cons                                       Application
RGB side-by-side                    High-performance OLED device; low power consumption     FMM technology                             Mobile devices
White OLED with color filter        High resolution by mature color filter technology;      Higher power consumption                   Mobile devices
                                    high OLED mass-production capability
White OLED with color filter        High resolution by mature color filter technology;      Higher power consumption                   OLED TV
                                    high OLED mass-production capability
White OLED with color filter        High OLED mass-production capability                    Higher power consumption; resolution       OLED TV
(dam-and-fill encapsulation)                                                                limited by TFT aperture ratio

Fig. 2 The pros and cons of RGB side-by-side and white OLED with color filter (Source: ITRI/DTC Summary, 2014.11)

Frit material is dispensed or patterned by screen printing, followed by laser treatment to cure the frit around the edge of the panel. Inspection, cutting, and light-on check processes are carried out after encapsulation to complete the whole process.
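The cavity tuning described above can be made concrete with an idealized Fabry-Perot estimate. The following sketch ignores the wavelength-dependent phase shifts at the semitransparent cathode and the reflective anode, which shift the resonance in a real device; the layer values are hypothetical:

```python
def resonant_wavelengths_nm(layers, m_max=3):
    """Idealized microcavity resonances: 2 * sum(n_i * d_i) = m * wavelength.
    `layers` is an iterable of (refractive_index, thickness_nm) for the stack."""
    optical_path = sum(n * d for n, d in layers)
    return [2.0 * optical_path / m for m in range(1, m_max + 1)]

# Hypothetical HIL/HTL/EML/ETL stack; thickening the HTL red-shifts the resonance
stack = [(1.8, 40.0), (1.8, 190.0), (1.75, 30.0), (1.75, 30.0)]
print(resonant_wavelengths_nm(stack))  # m = 2 resonance near 519 nm (green)
```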

Production Equipment Design
There are, presently, a number of RGB SBS AMOLED lines in production. Half Gen3.5, full Gen3.5, half Gen4.5, full Gen4.5, and quarter Gen5.5 are common for small- to medium-size panel manufacturing. The limitation in substrate size depends highly on the capability of FMM technology. It is known that Samsung adopts full Gen5.5 size for LTPS-TFT mass production and quarter Gen5.5 size OLED equipment for the AMOLED RGB SBS process.
One needs to take all critical issues into consideration before finalizing the RGB SBS AMOLED production layout. The OLED structure and process flow must be carefully planned before designing the production equipment. For example, if a top emission OLED's microcavity structure is adjusted by tuning the HIL thickness for the R/G/B sub-pixels, multiple HIL chambers are required to meet the capacity and microcavity design requirements in production.



Fig. 3 Top-emissive microcavity OLED structure and the effect of the thickness of organic layer on the RGB SBS OLEDs (Source: ITRI/DTC Summary, 2014.11)

The handling of the FMM is also important in mass production equipment design. Figure 5 shows the production concept for RGB SBS AMOLED. The FMM RGB SBS process chamber has to be designed with a high alignment accuracy, within 1 μm, for high-resolution AMOLED products. Linear source evaporation technology is adopted for the HIL, HTL, R, G, B, and ETL process chambers. It is worth mentioning that automated FMM handling is very critical for the yield of high-resolution AMOLED manufacturing: automated guided vehicle (AGV) transfer controls the quality of the FMM during the manufacturing process better than a manually guided vehicle (MGV) does.
OLED encapsulation equipment is connected to the evaporation line as shown in Fig. 6. The cover glass is first cleaned by plasma or UV O3 before the glass frit patterning process. Screen printing or dispensing methods may be chosen for glass frit patterning. A conveyor or batch-type furnace is used for glass frit baking.


Fig. 4 Manufacturing process flows for RGB SBS AMOLED. TE, top emission; BE, bottom emission (Source: ITRI/DTC Summary, 2014.11)

Pre-cured glass frit on the cover glass and the finished OLED substrate are transferred to the vacuum assembly system (VAS) chamber for assembly. The assembled AMOLED panel is then sealed by laser sintering. The purpose of the outside dam structure is to provide temporary protection for the OLED device within the N2 environment during the laser sealing process.

OLED Evaporation
Point source evaporation has been most popular in Gen 2 substrate size OLED mass production lines. The glass substrate must be rotated during the deposition process, and, to obtain good uniformity, the host and dopant sources are offset from the glass center. The distance between the point source and the glass substrate must be controlled for maximal thickness uniformity and material utilization. Figure 7 shows the point source evaporation concept.
Shadow effects occur during the deposition process with point source technology. Figure 8 shows the large shadow effect during point source deposition; it is a serious issue for high-resolution OLEDs made by point source evaporation technology.



Fig. 5 Mass production equipment concept of RGB SBS OLED evaporation line. AGV, automated guided vehicle; CS, cassette station; MS, mask stock (Source: ITRI/DTC Summary, 2014.11)


Fig. 6 Mass production equipment concept of encapsulation line. VAS, vacuum assembly system (Source: ITRI/DTC Summary, 2014.11)



Fig. 7 The point source evaporation concept (Source: ITRI/DTC Summary, 2014.11)


Fig. 8 Shadow effect during point source deposition (Source: ITRI/DTC Summary, 2014.11)

As glass substrates get larger and larger in the FPD industry, linear source evaporation technology (Steven Van Slyke et al. 2012) has been developed for high-resolution and large-size (>Gen 2) OLED deposition. Rotating a large glass substrate in a high-vacuum process chamber is not a good idea for manufacturing.



Fig. 9 The glass substrate and scanning motion of linear source (Source: ITRI/DTC Summary, 2014.11)

Keeping the glass substrate fixed and moving the deposition source is clearly better for mass production, as it reduces the required size of the vacuum chamber. Two host and one dopant sources are commonly designed into one linear source set, and the effective deposition area is typically wider than the glass substrate. Figure 9 shows the glass substrate and the scanning motion of the linear source.
The shadow effect is thus better controlled by the linear source technology, as shown in Fig. 10: the deposition angle is smaller with the scanning motion of the linear evaporation source. With the linear source, the filling ratio of the OLED materials in the sub-pixels can be much higher than with point source deposition. In addition, the FMM taper angle must be designed with the linear source deposition angle in mind during OLED manufacturing.
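The shadow effect discussed above follows from simple geometry: material arriving at an oblique deposition angle is blocked by the mask edge over a strip whose width grows with the mask thickness. A minimal sketch (the 50 μm foil thickness echoes the Invar figure quoted later in this chapter; the angles are illustrative):

```python
import math

def shadow_width_um(mask_thickness_um, deposition_angle_deg):
    """Approximate FMM shadow width: t * tan(theta) for a vertical mask wall;
    a tapered (angled) mask opening reduces this further."""
    return mask_thickness_um * math.tan(math.radians(deposition_angle_deg))

print(shadow_width_um(50.0, 55.0))  # ~71 um at a steep point-source angle
print(shadow_width_um(50.0, 15.0))  # ~13 um at a shallow linear-source scan angle
```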

OLED Encapsulation
A hermetically sealed glass package and a method for manufacturing it (Patent No. US 6,998,776 B2) are essential for OLED display applications. Notably, Samsung successfully introduced glass frit edge sealing by a laser process for AMOLED encapsulation. A screen printing process is generally used for the glass frit process (Yuneng Lai et al. 2012). The transient temperature field of the laser bonding process has been analyzed for top emission OLED devices (Maoyu Li et al. 2010). Melting the glass frit with a high-energy laser beam can damage the TFT array circuit through the distributed heat, so protection of the TFT array circuit under the glass frit is needed. An inorganic protection layer on the TFT metal trace is commonly used to block the heat from the laser annealing process. Routing the TFT metal trace underneath is also used to deal with the heat issue of glass frit encapsulation. Figure 11 shows the TFT array circuit protection design for the glass frit encapsulation process.


Fig. 10 Shadow effect control by linear source (Source: ITRI/DTC Summary, 2014.11)


Fig. 11 TFT array circuit design for glass frit encapsulation process (Source: ITRI/DTC Summary, 2014)

In order to meet the tact time requirement of OLED mass production, a multi-head laser beam is employed to seal the many panels on the mother glass. Figure 12 shows an example of multi-head laser AMOLED cell-sealing equipment.


Fig. 12 Multi-head laser beams for AMOLED cell sealing (Source: http://www.ltsolution.com)

The mass-production multi-head laser process chamber should be kept in a nitrogen, low-moisture environment in which H2O and O2 are less than 10 ppm. A programmed cooling of the heated glass frit may be considered after the high-temperature laser heating process.



Fig. 13 FMM attachment and color mixing in OLED deposition process (Source: ITRI/DTC Summary, 2014.11)

The finished AMOLED panels on the mother glass are then sent to the glass thinning process to reduce the total panel thickness. Laser cutting is preferred, to reduce the stress caused by mechanical cutting.

High-Resolution AMOLED Manufacturing

RGB SBS FMM Challenge
The most critical issue for high-resolution display production is the FMM patterning process in RGB SBS AMOLED. In the color patterning process, the substrate and the FMM are joined by magnets in the OLED deposition chambers. In general, no Newton ring should appear after aligning the substrate and the shadow mask joined by the magnetic force; otherwise, color mixing will occur in the adjacent sub-pixels due to misalignment, as shown in Fig. 13. In addition, the radiant heat from the linear source during deposition can affect all materials, especially the FMM, the substrate, and the upstage holder, and this therefore changes the alignment precision.


Key points: color mixing in the OLED process (CCD alignment; panel sub-pixel design vs. magnet accuracy; vibration) and the FMM itself (CTE matching; FMM total pitch accuracy; FMM hole-open accuracy; FMM shadow effect)
Fig. 14 Key points of the high-resolution OLED FMM process (Source: ITRI/DTC, 2014.11)

Figure 14 shows some important points which impact the high-resolution OLED FMM process. Careful control of these issues is necessary for improving the yield of RGB SBS AMOLED production.
The key technology and know-how in FMM equipment and material are still dominated by Japanese and Korean companies. Hitachi Metals is, to our knowledge, the only supplier of the FMM raw material, the Invar film, which is less than 50 μm thick. A list of FMM makers is shown in Fig. 15. Among the FMM suppliers, DNP and TOPPAN are the two major FMM makers. In addition, Sinto, HIMS, and HANSONG are popular FMM tension equipment suppliers. When AMOLED panel makers choose to adopt the FMM process, they will most likely have to develop their own FMM tension technology (Pub. No. US 2010/0055810 A1). Tension equipment for the FMM is thus also essential in the AMOLED fab. Figure 16 shows half Gen4.5 FMM tension equipment by HANSONG.

Main Issues with High-Resolution AMOLED Mass Production
In the high-resolution RGB SBS AMOLED mass production process, there are several issues which can impact the manufacturing yield. The first issue is defects caused in the LTPS-TFT array process. Most defects are caused by particles during the processes; Fig. 17 shows several types of defect in an AMOLED panel. Particles lead to point defects and line defects. In the TFT array process, particles bigger than 5 μm need to be well controlled; for the OLED process, more effort is necessary to minimize particles between 1 and 5 μm.


Fig. 15 Supply chain of OLED FMM industry (Source: ITRI/DTC Summary, 2015.04)

Fig. 16 Half Gen4.5 FMM tension equipment by HANSONG (Source: http://www.hansong.co.kr)

High-density TFT (e.g., 5T-1C or 6T-2C) sub-pixel arrays, especially, must be protected from particles during the manufacturing process. If a particle sits between the transparent cathode and the reflective anode of the OLED, sub-pixel electrical shorting may occur. By laser repair, such a dark sub-pixel can be fixed to function normally, except for a small dark spot inside the sub-pixel, as shown in Fig. 18.
The second issue of mass-producing AMOLED is color mixing in the FMM process, which, as shown in Fig. 19, easily occurs in high-resolution AMOLED panel production.


Fig. 17 Several types of defect in TFT array and AMOLED panel (Source: ITRI/DTC Summary, 2014.11)


Fig. 18 AMOLED before and after a repair by laser (Source: ITRI/DTC Summary, 2014.11)

In general, this phenomenon is observed on OLEDs with a resolution higher than 400 ppi.
The third issue is the mura of LTPS-TFT manufacturing for AMOLED. Some issues associated with the LTPS backplane during manufacturing, and solutions attempted to solve them, have been discussed (Patent No. US 8,334,536 B2). The ELA mura is caused during the laser annealing process which converts the semiconductor layer from a-Si to poly-Si.


Fig. 19 Color mixing picture of a light-on AMOLED (Source: ITRI/DRC, 2014.11)


Fig. 20 Full-color 5-in AMOLED display, (a) without and (b) with the RPL design (Source: 2012 IEEE)

Since the OLED device is current-driven, the performance of an OLED display is sensitive to the nonuniformity of the LTPS array. Therefore, a new pixel circuit design, called the redundant pixel line (RPL), has been proposed to deal with the nonuniformity of image quality (Guanghai et al. 2012). Figure 20a shows an example of nonuniform image quality. In addition to the ELA mura, the spin cleaning and photoresist spin coating processes also result in irregular spin mura. In response, some TFT array processes adopt a slit coating method instead of spin coating to avoid spin mura on the AMOLED panel. Spot defects, color mixing, and LTPS-TFT process mura after light-on are the top three manufacturing issues for high-resolution AMOLED mass production.

White OLED with Color Filter Approach
The major problem of the FMM process that existing panel makers confront is how to realize the production of high-resolution AMOLED panels.



Fig. 21 Reflection of a 4.4 in. white OLED with color filter panel by AUO

As opposed to the FMM process, white OLED (WOLED) with color filter is an alternative approach for producing high-resolution AMOLED panels. Two-thirds of the emitted light is absorbed, thus wasted, by the color filter; in order to obtain an efficient WOLED, a tandem white OLED structure has been commonly adopted by the industry. Higher efficiency and relatively lower driving voltages are the ultimate goals of the current tandem WOLED approach.
Although white OLED with color filter makes high-resolution OLEDs easier to achieve, it covers less than 100 % NTSC. For RGB SBS, the thickness of the OLED HIL or HTL can be adjusted to obtain over 100 % NTSC; for WOLED with color filter, the RGB microcavity effect cannot easily be adjusted by tuning the thickness of individual HIL or HTL layers. It is thus difficult, with current technology, to effectively increase the NTSC coverage of the white OLED with color filter structure by tuning the microcavity effect. Samsung disclosed a 4.65 in. AMOLED panel with >100 % NTSC through microcavity design. AUO also presented a 4.4 in. white OLED with color filter prototype with 413 ppi resolution at the 2013 SID (Chung-Chia Chen et al. 2013). The microcavity white OLED with color filter can attain a low reflectivity with an antireflection (AR) film, as the RGB SBS OLED does with a circular polarizer, as shown in Fig. 21. In addition, an RGBW sub-pixel arrangement can reduce the power consumption through the additional white sub-pixel; the RGBW color filter design combined with a high-efficiency tandem white OLED is the most effective approach known to reduce power consumption.
Table 1 shows the performance comparison between FMM RGB SBS and white OLED with color filter. The microcavity effect in the WOLED structure is adjusted by using different thicknesses of ITO and of the RG EML on the individual RGB sub-pixel areas (Meng-Ting Lee et al. 2014). For lower power consumption, FMM RGB SBS is the current mass production technology for small-size AMOLED.


Table 1 Comparison of RGB SBS (Samsung) and white OLED with color filter (AUO) AMOLEDs

Item                 Samsung Display (1)                           AUO
OLED type            FMM RGB side-by-side                          White OLED with color filter (not RGBW)
Size/resolution      4.65" (1113 × 623 HD, pentile)                4.4" (1600 × 900 HD+), 413 ppi
Panel brightness     240 cd/m²                                     240 cd/m²
NTSC                 >100 %                                        >100 %
Power consumption    434 mW (@160 cd/m²)                           300 mW (250 nits @ 30 % loading)
Note                 Microcavity improved by different             Microcavity improved by different thickness
                     thickness of HIL or HTL                       of ITO and one FMM of RG EML

(1): Samsung SID2012 demonstration
Source: ITRI/DTC Summary, 2014.11

Yet, as white OLED materials continue to improve, the light efficiency through the color filter is constantly getting better, and the power consumption of white OLED with color filter AMOLEDs in mobile device applications is now acceptable, as the AUO prototype demonstrates. It will not be long before white OLED with color filter becomes competitive for high-resolution AMOLED manufacturing, alongside FMM RGB SBS technology.

Summary
RGB SBS OLED by the FMM process is the chosen manufacturing technology for AMOLED, especially for high-resolution and low-power-consumption applications. The OLED structure and process flow dictate the present equipment design for mass production. Linear source deposition and laser-sealed glass frit technologies have been adopted in OLED evaporation and encapsulation manufacturing. Critical considerations in the OLED evaporation section are closely tied to the FMM's design and handling in the OLED deposition chamber, while protective measures for the TFT metal trace underneath the laser-sealed glass frit need to be considered in the OLED encapsulation section. Yield improvement in OLED manufacturing is highly dependent on the ability to control particles in the OLED mass production line. If one does not want to deal with FMM issues in OLED mass production, the white OLED with color filter technology is a good alternative for manufacturing AMOLED panels.


Further Reading
Chang-Wook Han, Sung-Hoon Pieh, Hee-Suk Pang, Jae-Man Lee, Hong-Seok Choi, Sung-Joon Bae, Hyun-Wook Kim, Woo Suk Ha, Yoon-Heung Tak, Byung-Chul Ahn (nd) 15-in. RGBW panel using two-stacked white OLED and color filter for TV applications. LG Display Co., Ltd., IDW'10, OLED3-1 Invited, 323–326
Chang-Wook Han, Kyung-Man Kim, Sung-Joon Bae, Hong-Seok Choi, Jae-Man Lee, Tae-Shick Kim, Yoon-Heung Tak, Soo-Youle Cha, Byung-Chul Ahn (2002) 55-inch FHD OLED TV employing new tandem WOLEDs. LG Display Co., Ltd., SID 2002 DIGEST, 279–281
Chung-Chia Chen, Ting-Yi Cho, Jing-Jie Fu, Hsiang-Yun Hsiao, Zai-Xien Weng, Chih-Hung Tsai, Chieh-Hsing Chung, Kuo-Chang Lo, Wen-Hung Lo, Tsu-Chiang Chang, Yu-Sin Lin (2013) High resolution 4.4" AMOLED display with 413 ppi real pixel density. AU Optronics Corporation, SID 2013 DIGEST, 416–418
Guanghai J, Sangmoo C, Moojin K, Sungchul K, Jonghyun S (2012) New pixel circuit design employing an additional pixel line insertion in AMOLED displays composed by excimer laser-crystallized TFTs. J Disp Technol 8(8):479–482
Maoyu Li, Yuanhao Huang, Zikai Hua, Jianhua Zhang (2010) Finite element analysis of laser bonding process on organic light-emitting device. Key Laboratory of Advanced Display and System Applications (Shanghai University), Ministry of Education, School of Mechatronics Engineering and Automation, Shanghai University, 2010 IEEE, 190–194
Meng-Ting Lee, Shih-Ming Shen, Zai-Xien Weng, Jing-Jie Fu, Chih-Lei Chen, Ching-Sang Chuang, Yusin Lin (2014) One FMM solution for achieving active-matrix OLED with 413 ppi real pixel density. AU Optronics Corporation, SID 2014 DIGEST, 573–575
Patent No. US 6,998,776 B2. Glass package that is hermetically sealed with a frit and method of fabrication. Assignee: Corning Incorporated
Patent No. US 8,334,536 B2. Thin film transistor, organic light emitting diode display device having the same, flat panel display device, and semiconductor device, and methods of fabricating the same. Assignee: Samsung Display Co., Ltd.
Pub. No. US 2010/0055810 A1. Mask for thin film deposition and method of manufacturing OLED using the same. Assignee: Samsung Mobile Display Co., Ltd.
Steven Van Slyke, Angelo Pignata, Dennis Freeman, Neil Redden (Eastman Kodak Company, Rochester, NY, USA); Dave Waters, H. Kikuchi, T. Negishi (Ulvac Japan, Ltd., Chigasaki, Japan); H. Kanno, Y. Nishio, M. Nakai (Sanyo Electric Company, Ltd., Gifu, Japan) (2012) Linear source deposition of organic layers for full-color OLED. SID 2012 DIGEST, 886–889
Yuneng Lai, Zunmiao Chen, Yuanhao Huang, Jianhua Zhang (2012) The process and reliability tests of glass-to-glass laser bonding for top-emission OLED device. Key Laboratory of Advanced Display and System Applications (Shanghai University), Ministry of Education; School of Mechanical & Electronic Engineering and Automation, Shanghai University, Shanghai 200072, China, 2012 IEEE, 2036–2041

Handbook of Visual Display Technology DOI 10.1007/978-3-642-35947-7_213-1 # Springer-Verlag Berlin Heidelberg 2015

Flexible Displays: Flexible AMOLED Manufacturing
Glory K. J. Chena,b* and Janglin Chena
a Display Technology Center, Industrial Technology Research Institute (ITRI), Hsinchu, Taiwan, ROC
b Panel Integration Division I, ITRI, Chutung, Hsinchu, Taiwan
*Email: [email protected]

Abstract
Flexible AMOLED manufacturing is expanding fast on the basis of existing glass-based AMOLED factories. In this chapter, we discuss the major challenges with sheet-to-sheet production of flexible AMOLED. The first is the LTPS-TFT process on a flexible substrate for small- to medium-size displays. The second, flexible encapsulation of the AMOLED for product reliability, remains an arduous undertaking for mass production. Finally, separation of the flexible AMOLED from the glass carrier dictates the yield of the entire manufacturing process.

List of Abbreviations
ALD  Atomic layer deposition
AMOLED  Active-matrix organic light-emitting diode
CCD  Charge-coupled device
COF  Chip on film
CTE  Coefficient of thermal expansion
ELA  Excimer laser annealing
EPLaR  Electronics on plastic by laser release
ESD  Electrostatic discharge
FlexUP  Flexible universal plane
FMM  Fine metal mask
FPC  Flexible print circuit
FPD  Flat-panel displays
GC-MS  Gas chromatography-mass spectrometer
GIP  Gate in panel
LTPS  Low-temperature polysilicon
PECVD  Plasma-enhanced chemical vapor deposition
PI  Polyimide
TFE  Thin film encapsulation
WVTR  Water vapor transmission rate

Considerations for Manufacturing Flexible AMOLED
The structure differs between glass AMOLED and flexible AMOLED products. A barrier film is used for encapsulation of the flexible AMOLED in current products, such as the curved plastic AMOLED.



Fig. 1 The structure of conventional glass AMOLED and the flexible AMOLED (Source: ITRI)


Fig. 2 Manufacturing process flow of glass AMOLED and flexible AMOLED (Source: ITRI)

Figure 1 shows the structure of the conventional glass AMOLED and the flexible AMOLED. Compared to the glass AMOLED, the encapsulation structure is completely changed in the flexible AMOLED.
Glass AMOLED manufacturing is a relatively new yet established mass production technology in the FPD industry. Flexible AMOLED, on the other hand, is in its infancy, proposed as a next-generation display technology for many kinds of new application. The major differences in manufacturing between glass AMOLED and flexible AMOLED are shown in Fig. 2. The flexible substrate on glass for the TFT array process, thin film encapsulation, flexible encapsulation, de-bonding from glass, film protection, and cutting are among the main considerations in the manufacturing of flexible AMOLED. Flexible material, process, and equipment are completely new items relative to existing glass-based AMOLED manufacturing.
The choice of flexible material is quite limited due to the high temperature of the TFT process. Several critical items of the flexible substrate for LTPS-TFT must be considered, as shown in Fig. 3. Outgassing and the ion concentration of the flexible substrate relate to the existing TFT process and equipment. The CTE (coefficient of thermal expansion) of the flexible substrate can have a strong impact on the alignment accuracy of the TFT process, and the surface roughness of the flexible substrate may also influence the WVTR (water vapor transmission rate) of the substrate. These factors must all be well controlled before entering the TFT process.
In the glass OLED evaporation process, a CCD captures the image of the TFT and FMM (fine metal mask) marks through the glass. For a PI (polyimide)-based TFT substrate in the OLED evaporation chamber, the TFT mark on the PI substrate should be precisely aligned to the FMM mark, as shown in Fig. 4. The alignment image in a current glass-based OLED evaporation chamber can suffer interference from the presence of the PI film.
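The impact of CTE on overlay can be estimated with a first-order thermal expansion model. A minimal sketch, with hypothetical numbers (the actual mismatch depends on the PI formulation and the process temperature profile):

```python
def thermal_misalignment_um(span_mm, delta_cte_ppm_per_c, delta_t_c):
    """Worst-case in-plane pattern shift across a span L for a CTE mismatch
    delta_alpha over a temperature excursion delta_T: shift = L * da * dT."""
    return span_mm * 1e3 * delta_cte_ppm_per_c * 1e-6 * delta_t_c

# Hypothetical: 300 mm span, 3 ppm/degC mismatch vs. the glass carrier, 200 degC excursion
print(thermal_misalignment_um(300.0, 3.0, 200.0))  # 180 um, far above micron-level overlay
```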



Fig. 3 Some key considerations of flexible substrate for TFT process (Source: ITRI)


Fig. 4 Alignment of PI-based TFT array substrate in OLED evaporation chamber (Source: ITRI)

Fig. 5 Pinholes existing in thin film encapsulation layers (Source: Vitex)

The thin film encapsulation (TFE) of the OLED is important for the reliability of flexible AMOLED encapsulation. For the OLED's TFE, there is still a huge gap between technology development and mass production. Pinholes in the layers of the OLED's TFE are very difficult to avoid during the mass production process; particles are the typical root cause of pinholes, as shown in Fig. 5. Particles need to be tightly controlled in OLED TFE manufacturing.


The particle issue concerns not only the OLED's TFE but also the flexible encapsulation process. Compared to current glass AMOLED production, barrier film lamination on the flexible AMOLED is a completely new process. There can be many sorts of barrier film, adhesive, and lamination process for bendable, foldable, and rollable AMOLED. These barrier films with adhesive are laminated directly on the surface of the OLED's TFE, and any particle inside or outside the adhesive layer is likely to damage the TFE. Tight control of particles in the OLED and barrier film lamination sections should therefore be a particular consideration during the manufacturing process.
De-bonding from the glass carrier and the subsequent handling of the flexible AMOLED are also completely new manufacturing experiences compared to glass AMOLED. Laser lift-off and mechanical de-bonding can be employed to separate the flexible AMOLED from the carrier glass. In addition, the handling and cell cutting of the flexible AMOLED are important process flows to be considered in mass production.

Challenges with Flexible AMOLED Manufacturing

TFT Processes with Flexible Substrate
A flexible substrate for TFT manufacturing is the most basic element for enabling commercialization of flexible AMOLED products. LG demonstrated flexible AMOLEDs with stainless steel substrates in 2008 and 2009. In 2009, Samsung also showed prototypes of flexible AMOLED on a polyimide substrate. ITRI is among the pioneers in using the polyimide substrate to integrate flexible AMOLED prototypes. Figure 6 shows the development of the polyimide substrate for flexible AMOLED prototypes and production.
Both LTPS-TFT and oxide-TFT technologies have been proposed for manufacturing flexible AMOLED. The performance and stability of LTPS-TFTs are superior to those of oxide-TFTs. Oxide-TFTs, however, can be manufactured at low cost, and their device performance is good enough for large-area AMOLED backplanes, although the run-to-run reproducibility and bias stability of oxide TFTs are much inferior to those of their LTPS counterparts (Mativenga et al. 2014). LTPS-TFT is already the proven technology for glass-based AMOLED mass production, and Samsung was the first company to realize the LTPS-TFT manufacturing process on a PI substrate (An et al. 2014).
Polyimide precursor coating by slot die equipment is commonly used to manufacture the PI substrate on glass. There are several considerations


Fig. 6 Development of AMOLED with flexible substrate (Source: ITRI)


required of the PI-on-glass substrate before the TFT process. Outgassing and the residual ion concentration in the PI are the two major concerns during the TFT manufacturing process. Many solvents are used for the synthesis of the polyamic acid precursor. The solution-type precursor is coated by a slot die coater and, after thermal curing in an oven, forms the PI film on the glass carrier; several types of oven have to be considered for the PI curing process in mass production. The residual outgassing from the PI film may damage TFT process equipment, such as the PECVD system, during manufacturing. Inspection methods have been proposed to detect the outgassing of the PI film before the TFT process; for example, a gas chromatography-mass spectrometer (GC-MS) tool can be used to detect outgassing from the PI material.
A barrier layer can be inserted between the PI substrate and the poly-Si layer, as shown in Fig. 7. The SiNx and SiOx layers are used not only as the TFT buffer layer but also to provide a barrier function, driving the WVTR to lower than 10⁻⁵ g/m²·day. Stress damping or control with these LTPS-TFT buffer layers can be factored into the design as well.
The challenge with the PI-based LTPS-TFT process lies in the high temperatures of the excimer laser annealing, dehydrogenation, and activation processes. High-temperature resistance (e.g., >400 °C) in the LTPS-TFT process is thus required of the PI material. Nevertheless, it has been found that excimer laser annealing (ELA) may not be so detrimental to the plastics, as it allows melting of the a-Si film while minimizing the heat transfer to the substrate (Jin and Kim 2010). Figure 8 shows a simulation result: during the laser process, when the temperature of the molten Si is raised to 2,200 °C, the plastic surface remains below 70 °C without any degradation (Kim et al. 2011).
The other difference that needs to be considered between the glass and PI-based LTPS-TFT processes is electrostatic discharge (ESD) during LTPS-TFT manufacturing.

Fig. 7 Buffer layer with barrier function in a PI-based LTPS-TFT substrate: poly-Si on a SiOx/SiNx buffer layer on PI (Source: ITRI)

Fig. 8 Temperature gradient of the a-Si/multilayer film on plastic layer (Source: SID’11)


Fig. 9 GIP circuit on PI damaged by ESD (Source: ITRI)

Fig. 10 Barrier film laminated outside the PI-based LTPS-TFT backplane for improved reliability: PDL (pixel define layer), poly-Si with pinholes, SiOx/SiNx buffer layer, and PI, with adhesive, barrier film, and plastic substrate beneath (Source: ITRI)

The other difference that needs to be considered between glass-based and PI-based LTPS-TFT processes is electrostatic discharge (ESD) during LTPS-TFT manufacturing. On PI, LTPS-TFT circuits such as the gate-in-panel (GIP) circuit are subject to ESD damage during the LTPS-TFT process, as shown in Fig. 9, and this damage causes the panel image to fail. Countermeasures such as adding ESD protection circuits to the LTPS-TFT array and installing ESD elimination equipment have been proposed to solve the ESD damage associated with PI-based LTPS-TFT manufacturing. Although the SiNx and SiOx barrier layers can ideally reach the 10⁻⁶ g/m²/day WVTR level, pinholes remain a big challenge in manufacturing the LTPS-TFT buffer layer. For flexible AMOLED product reliability, an additional barrier film is therefore often required to offset the imperfections in the LTPS-TFT buffer layer, as shown in Fig. 10. This laminated barrier film provides not only a barrier function but also protection for the PI-based LTPS-TFT backplane during handling. Based on the LTPS-TFT experience with glass AMOLEDs, handling a PI substrate on glass is not particularly difficult for current panel makers, but the barrier quality of the PI-based LTPS-TFT backplane can still be an issue if the flexible AMOLED panel relies only on the buffer layer. Technologies such as atomic layer deposition (ALD) and solution-coated gas barriers continue to be proposed to improve the barrier quality of the buffer layer.
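To put the quoted WVTR targets in perspective, the permeated water can be integrated over the product lifetime. The sketch below compares three barrier classes for an assumed 5-inch panel over five years; the area and the non-OLED WVTR values are illustrative assumptions.

AREA = 0.008       # active area of a ~5-inch panel in m^2 (assumed)
DAYS = 5 * 365     # five-year product lifetime (assumed)

for label, wvtr in [("bare plastic film", 1.0),
                    ("food-packaging barrier", 1e-2),
                    ("OLED-grade barrier", 1e-6)]:
    ingress = wvtr * AREA * DAYS   # grams of water reaching the OLED
    print(f"{label:24s} WVTR {wvtr:7.0e} g/m2/day -> {ingress:.1e} g in 5 years")

The four-to-six orders of magnitude between ordinary packaging films and the 10⁻⁶ g/m²/day level show why even a few pinholes in a single layer can dominate the reliability budget.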

Flexible Encapsulation on OLED

The OLED encapsulation structure is quite different between glass-based and flexible AMOLEDs, and face-seal encapsulation technology has been proposed for the flexible AMOLED structure (Hong et al. 2014). Figure 11 shows the trend of AMOLED encapsulation for different product applications.


Fig. 11 Trend of AMOLED encapsulation structure across smartphone, tablet, OLED TV, and flexible products, contrasting frit-sealed glass structures with N2-filled gaps against solid-state structures using OCA, barrier film, OLED adhesive, and polyimide (Source: ITRI)

Fig. 12 OLED TFE considerations for mass production (Source: ITRI): throughput and yield; structure (single layer vs. multilayer); inorganic material/process (AlOx by sputtering, SiNx by PECVD, AlOx by fast ALD); organic material/process (polymer by spray, CVD, or screen printing); and pinhole reduction (particle control, film-quality improvement), for films on the order of 2 µm

For small glass AMOLED products, frit edge sealing has been the dominant choice in smartphones. For future flexible AMOLED encapsulation, however, a solid-state encapsulation approach is preferred over a hollow structure. Vitex's multilayer thin-film encapsulation (TFE) is well known in OLED encapsulation, but it still faces certain challenges in mass production. First of all, the AlOx layer deposited by sputtering is not ideal for mass production because of its pinholes; SiNx deposited by PECVD could be the alternative inorganic barrier layer for manufacturing. In addition, a multilayer structure with a long tact time is not a productive approach for the OLED TFE process. Simplifying the process and reducing the number of film pairs are two crucial issues for flexible AMOLED encapsulation in manufacturing, and every approach under development must take the cost and yield issues shown in Fig. 12 into consideration. Besides the suppliers of Vitex's technology, ULVAC and Applied Materials are leading equipment suppliers for SiNx barrier layers fabricated by the PECVD process for OLED TFE.
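The tact-time concern can be made concrete with a toy throughput model in which each inorganic/organic pair adds deposition and transfer time, so substrates per hour fall roughly in proportion to the pair count. All per-step times below are assumed placeholders chosen only to show the trend, not actual equipment data.

T_INORGANIC = 120.0   # s per inorganic barrier layer deposition (assumed)
T_ORGANIC = 90.0      # s per organic planarization layer (assumed)
T_HANDLING = 60.0     # s substrate transfer per chamber visit (assumed)

def tact_time(pairs):
    """Seconds to encapsulate one substrate with `pairs` inorganic/organic
    dyads plus a final inorganic capping layer."""
    per_pair = T_INORGANIC + T_ORGANIC + 2 * T_HANDLING
    return pairs * per_pair + T_INORGANIC + T_HANDLING

for pairs in (1, 3, 5):
    t = tact_time(pairs)
    print(f"{pairs} pair(s): {t / 60:4.1f} min/substrate "
          f"-> {3600 / t:4.1f} substrates/hour")

In this model, cutting the pair count from five to one roughly quadruples throughput, which is why reducing the number of film pairs is treated as a first-order cost lever.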


Fig. 13 PECVD process by shadow mask during manufacturing: SiNx deposited through shadow masks over the OLED and TFT on the substrate (Source: ITRI)

Fig. 14 Shadow mask of PECVD versus panels on the mother glass: mask openings expose the panels while the bonding areas remain blocked (Source: ITRI)

Direct patterning by shadow mask is used in the PECVD process on the OLED layers during manufacturing, as shown in Fig. 13. The SiNx film is coated conformally over the surface of the OLED and the TFT circuit. The SiNx layer of the OLED TFE must not be deposited on the FPC bonding areas of the panels on the mother glass, so the PECVD shadow mask has blocking sections that protect the bonding areas, as shown in Fig. 14. The PECVD-with-shadow-mask process is not commonly used in the TFT-LCD industry, and the particle issue caused by the shadow mask has to be considered first in the mass-production process. It is extremely difficult to fabricate a perfect TFE without any pinholes on OLED panels during mass production on Gen3.5-, Gen4.5-, or Gen5.5-size substrates. A TFE combined with a barrier film is therefore a better choice for obtaining higher reliability in OLED mass production, as illustrated in Fig. 15. The yield of the laminated barrier film should be weighed carefully against the manufacturing cost, and the total thickness will increase when a commercial rolled barrier film is used. There is a strong relationship between the thickness of a flexible AMOLED and its flexibility. For wearable-device applications, curved PMOLEDs and AMOLEDs are used in products on the market; barrier film made by a roll-to-roll process is typically laminated on both sides of the panel to protect the PMOLED or AMOLED, and the total thickness is generally larger than 300 µm. For foldable touch AMOLEDs in smart mobile devices, a bending radius as small as 5 mm or less is required; in that case, the laminated barrier film should be 100 µm or thinner (a simple bending-strain estimate after Fig. 15 makes this concrete). Thinner barrier and touch films will drive the foldable touch


Fig. 15 Barrier-film encapsulation for OLED reliability, comparing stacks of circular polarizer, OLED TFE (with or without an upper barrier film), PI substrate, and a lower barrier film (Source: ITRI)
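As flagged above, the pairing of a 5 mm folding radius with a total stack of about 100 µm follows from elementary bending mechanics: with the neutral plane at mid-thickness, the outer surface of a stack of thickness t bent to radius R sees a strain of about eps = t / (2R). A minimal sketch, assuming a ~1 % cracking strain for inorganic barrier layers (a common rule of thumb, not a value from this chapter):

FAILURE_STRAIN = 0.01   # ~1% cracking strain of inorganic barriers (assumed)

def surface_strain(thickness_um, radius_mm):
    """Outer-surface bending strain, neutral plane assumed at mid-thickness."""
    return (thickness_um * 1e-6) / (2 * radius_mm * 1e-3)

for t_um in (300, 100, 50):
    eps = surface_strain(t_um, radius_mm=5.0)
    verdict = "OK" if eps <= FAILURE_STRAIN else "likely cracks"
    print(f"t = {t_um:3d} um at R = 5 mm: strain {eps * 100:.1f}% -> {verdict}")

At 300 µm the outer surface sees about 3 % strain at a 5 mm radius, three times the assumed cracking limit, while 100 µm sits right at the limit; this is consistent with the thickness targets stated above and with why neutral-plane engineering of the stack matters.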

Comparison of the conventional flexible AMOLED structure with ITRI's (Industrial Technology Research Institute) FlexUP™ structure:

Conventional structure (layers include circular polarizer, adhesive, touch sensor, plastic substrate, adhesive, barrier film, TFT, OLED, OLED's adhesive, and PI substrate). Thickness: relatively thick (300~600 µm).

ITRI FlexUP™ structure (layers include FlexUP™ cover plate, functional layer, touch, barrier, TFT, OLED, OLED's adhesive, and FlexUP™ substrate). Thickness: super thin (