
Digital Image Processing

S. K. Ghosh

Alpha Science International Ltd. Oxford, U.K.

Digital Image Processing 314 pgs. | 122 figs. (19 figs. in colour) | 52 tbls.

S. K. Ghosh
Department of Civil Engineering
Indian Institute of Technology Roorkee
Roorkee

Copyright © 2013
ALPHA SCIENCE INTERNATIONAL LTD.
7200 The Quorum, Oxford Business Park North
Garsington Road, Oxford OX4 2JZ, U.K.
www.alphasci.com

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior written permission of the publisher.

ISBN 978-1-84265-731-7
E-ISBN 978-1-84265-996-0

Printed in India

This book is dedicated to my wife Chitrita and my children Netri and Abhishek

Preface

The advent of the photographic camera gave birth to another passion for human beings: to record all important events, beautiful locations and objects on film, and to process the film chemically to yield photographs on paper. In the initial stages, analyzing B&W photographs for information required normal human interpretation skills. The categorization of grey levels was limited, making the interpretation process simple. With the availability of colour films and digital cameras, the interpretation procedure became more complicated as the number of colour shades or tones increased manyfold. Further, the change in media from film to digital form brought about a new phase in recording and storing information. These products became known as digital images, or simply images. It was now possible to extract information using a digital computer and relevant mathematics-based algorithms; thus the introduction of Digital Image Processing.

In the present day world, images have started to play an important role. People like to share images across networks, analyze them to extract critical information, and store them as references for future comparisons. One area where images have started to play a significant role is the use of satellite images for the extraction of information related to natural resources. Satellite data is available as both single-band and multi-spectral data. In view of this, the need for analysis of digital images has gained prominence. The aim of this book is to provide the fundamental concepts related to the formation of a digital image and its analysis for the extraction of information.

This book can be divided into two parts. The first four chapters deal with the fundamentals of image formation, image characteristics, file formats and some of the commercially available software. Chapter 1 deals with the principles of the electromagnetic spectrum and its role in image formation. Chapter 2 deals with the concept of an image and its characteristics. In order to store information in digital form, an image has to be designed in such a manner that it can be archived and retrieved; Chapter 3 deals with image file formats and provides details of the various file formats that have been used to store images. Further, it is equally important that the analyst develops customized analysis procedures or uses commercially developed software, so Chapter 4 provides an overview of some of the commercially available image processing software.

The next part of the book deals with the procedures for analyzing satellite data. Chapter 5 outlines the need to examine image quality by computing both univariate and multivariate statistical parameters. As electromagnetic radiation has to pass through the earth's atmosphere, it interacts with the various constituents of the atmosphere, causing an overall degradation in image quality. Further, the motions of the Earth and the satellite cause a distortion in the shape of objects, so Chapter 6 deals with the various pre-processing steps applied to satellite data. Chapter 7 is devoted to image enhancement techniques, which improve the quality of an input image degraded by background or sensor limitations. In general, multi-spectral data is beset with problems of data redundancy; the use of arithmetic operators and transformations tends to yield a new set of data having better information, and these concepts are discussed in Chapter 8. Chapter 9 is devoted to image analysis or classification and is the heart of the book. It discusses the various statistics-based classification techniques and algorithms and, at the end, provides details for assessing the accuracy of the classified images. Finally, Chapter 10 deals with various techniques of spatial filtering. The task of spatial filtering is to emphasize or de-emphasize information on the basis of its tonal variation or frequencies.

An attempt has also been made to provide an understanding of each of the processes or techniques by considering a sample satellite dataset, in order to provide a simple interpretation of the results. The same data has been used for all the tasks to provide consistency. It may be noted that these interpretations are as per the author's understanding and may vary from person to person.

Finally, it may be noted that the author's passion and many years of teaching and research in this area have culminated in this book. The basic idea has been to translate his knowledge and efforts for the benefit of students, researchers and anyone interested in working with digital images.

S. K. Ghosh

Acknowledgement

Writing a book is a journey through a never ending pool of knowledge. It takes a lot of time to put your thoughts in a well-structured way so that the reader can understand them. Many times you have a setback, when your computer hard disk crashes or the pen drive containing all the files lands inside the washing machine. Throughout this venture of ups and downs, there have been many good people who have helped me in their own way. A proper acknowledgement of all these people is appropriate.

The first person to be acknowledged is my wife Chitrita. During difficult times, her consoling words would motivate me to go back to my laptop and try to accomplish my task of completing this book. My daughter Netri has been one source of inspiration, while my son Abhishek has been my best critic, and I hope that in my future endeavours they play their roles to perfection. No amount of words can express my gratitude to my parents and my late in-laws for being my constant guides.

I would also like to thank my Ph.D. supervisor, Professor George Fleming, Department of Civil and Environmental Engineering, University of Strathclyde, Glasgow, UK, who indirectly inducted me into this area of digital image processing. A teacher's evaluators are his students, and during this period a few hundred students have played their role as the target group. I would like to especially acknowledge the help extended by Ms. Vandita Srivastava, Scientist, Indian Institute of Remote Sensing, in developing the model makers in ERDAS IMAGINE for the different types of vegetation index. I would also like to acknowledge the University of Roorkee, now the Indian Institute of Technology Roorkee, for providing an environment conducive to good teaching and research. Finally, I would like to thank my office assistant Mr. Amit Kumar, who spent hours typing and editing the manuscript and drawing all the diagrams.

S. K. Ghosh

Contents

Preface
Acknowledgement
List of Figures
List of Tables

1. Concept of Images
   1.1 Introduction
   1.2 Electromagnetic Energy
   1.3 Electromagnetic Spectrum and its Characteristics
   1.4 Utility of EM Radiation in Image Acquisition
   1.5 Image Processing
   1.6 Basic Image Processing Technique
       1.6.1 Image representation and modeling
       1.6.2 Image enhancement
       1.6.3 Image restoration
       1.6.4 Image analysis
       1.6.5 Image reconstruction
       1.6.6 Image data compression

2. The Process of Imaging
   2.1 Introduction
   2.2 Passive Sensors
       2.2.1 Gamma-ray spectrometer
       2.2.2 Aerial camera
       2.2.3 Video camera
       2.2.4 Multi-spectral scanner
       2.2.5 Imaging spectrometer
       2.2.6 Thermal scanner
       2.2.7 Radiometer
   2.3 Active Sensors
       2.3.1 Laser scanner
       2.3.2 Radar altimeter
       2.3.3 Imaging radar
   2.4 Platforms
   2.5 Characteristics of Image
       2.5.1 Sampling
       2.5.2 Spatial resolution
       2.5.3 Sampling pattern
       2.6.3 Quantization
   2.6 Colour Fundamentals
   2.7 Colour Models
       2.7.1 The RGB model
       2.7.2 CMY model
       2.7.3 HSI model
       2.7.4 Conversion of colour from RGB to HSI
       2.7.5 Converting colours from HSI to RGB

3. Image File Format
   3.1 Storage Media
   3.2 File Formats
   3.3 Common Interchangeable Formats
   3.4 Bitmap (BMP)
   3.5 Tagged Image File Format (TIFF)
       3.5.1 The TIFF file structure
       3.5.2 TIFF data compression
       3.5.3 TIFF classes
   3.6 Graphic Interchange Format (GIF)
   3.7 Joint Photographic Expert Graphic (JPEG)
   3.8 Portable Network Graphics (PNG)
       3.8.1 File structure of PNG
           3.8.1.1 PNG file signature
           3.8.1.2 Chunk layout
           3.8.1.3 Chunks specifications
       3.8.2 Primary Chunks
           3.8.2.1 IHDR chunk
           3.8.2.2 PLTE chunk
           3.8.2.3 IDAT chunk
           3.8.2.4 IEND chunk
       3.8.3 Ancillary chunks
           3.8.3.1 bKGD chunk
           3.8.3.2 cHRM chunk
           3.8.3.3 gAMA chunk
           3.8.3.4 hIST chunk
           3.8.3.5 pHYS chunk
           3.8.3.6 sBIT chunk
           3.8.3.7 tEXT chunk
           3.8.3.8 tIME chunk
           3.8.3.9 tRNS chunk
           3.8.3.10 zTXT chunk
       3.8.4 Summary of standard chunks
   3.9 Shape File
       3.9.1 The main file header
           3.9.1.1 Record headers
       3.9.2 Index file
       3.9.3 dBASE file
       3.9.4 Description of main file record contents
           3.9.4.1 Null shapes
           3.9.4.2 Point
           3.9.4.3 Multi point
           3.9.4.4 PolyLine
           3.9.4.5 Polygon
           3.9.4.6 PointM
           3.9.4.7 MultiPointM
           3.9.4.8 PolyLineM
           3.9.4.9 PolygonM
           3.9.4.10 PointZ
           3.9.4.11 MultiPointZ
           3.9.4.12 PolyLineZ
           3.9.4.14 MultiPatch
   3.10 Satellite Tape Formats

4. Image Processing Software
   4.1 Introduction
   4.2 ERDAS Imagine
       4.2.1 Imagine essential
           4.2.1.1 Data types and integration
           4.2.1.2 Data visualization
           4.2.1.3 Geometric correction
           4.2.1.4 Simple classification
           4.2.1.5 Map composer
           4.2.1.6 General tools and utilities
       4.2.2 IMAGINE Advantage
           4.2.2.1 Ortho correction
           4.2.2.2 Metric accuracy assessment (MAA) tools
           4.2.2.3 Mosaicking
           4.2.2.4 Image processing
           4.2.2.5 Modeling language
           4.2.2.6 Knowledge classifier
       4.2.3 IMAGINE Professional
           4.2.3.1 Spectral analysis
           4.2.3.2 Expert classifier
           4.2.3.3 Multispectral classifier
           4.2.3.4 Radar interpreter
       4.2.4 System specifications of ERDAS IMAGINE 9.0
   4.3 ENVI
       4.3.1 General review of ENVI functionality
       4.3.2 Advantages of ENVI
   4.4 IDRISI
       4.4.1 IDRISI system overview
       4.4.2 Image processing menu
           4.4.2.1 Restoration submenu
           4.4.2.2 Enhancement submenu
           4.4.3.2 Transformation submenu
           4.4.3.4 Fourier analysis submenu
           4.4.3.5 Signature development submenu
           4.4.3.6 Hard classifiers submenu
           4.4.3.7 Soft classifiers / mixture analysis submenu
           4.4.3.8 Hyperspectral image analysis submenu
           4.4.3.9 Accuracy assessment submenu
   4.5 ER Mapper
       4.5.1 Algorithms
           4.5.1.1 View and enhance raster data
           4.5.1.2 Filters
           4.5.1.3 Contrast stretches (Transforms)
           4.5.1.4 Formulae and statistics
           4.5.1.5 View and edit vector data
           4.5.1.6 Integrate data
           4.5.1.7 Raster translators
           4.5.1.8 Automatic data fusion and mosaicing
           4.5.1.9 Hardcopy including stereo pair generation
           4.5.1.10 Map composition
           4.5.1.11 Classify raster images
           4.5.1.12 Visualize in 3-D
           4.5.1.13 Traverse
           4.5.1.14 Application based toolbars and batch scripts
           4.5.1.15 Raster to vector polygon conversion
           4.5.1.16 Geocoding
           4.5.1.17 Gridding
           4.5.1.18 Fourier transformation
           4.5.1.19 Customizable functions
       4.5.2 Virtual datasets
       4.5.3 Compression wizard
   4.6 Concluding Remarks

5. Initial Statistics
   5.1 Introduction
   5.2 Univariate Statistics
       5.2.1 Histogram
       5.2.2 Cumulative histogram
       5.2.3 Minimum and maximum value
       5.2.4 Mean and standard deviation
       5.2.5 Median
       5.2.6 Mode
       5.2.7 Skewness
       5.2.8 Kurtosis
   5.3 Multivariate Image Statistics
       5.3.1 Scatterplot
       5.3.2 Covariance matrix
       5.3.3 Correlation
   5.4 Illustrative Example
       5.4.1 Discussion on univariate statistics
       5.4.2 Multi-variate statistics
       5.4.3 Concluding remarks

6. Pre-Processing of Data
   6.1 Introduction
   6.2 Radiometric Corrections
       6.2.1 Missing scan lines
       6.2.2 De-striping methods
   6.3 Atmospheric Correction Methods
   6.4 Geometric Correction and Registration
       6.4.1 Orbital geometry model
           6.4.1.1 Aspect ratio
           6.4.1.2 Skew correction
           6.4.1.3 Earth rotation correction
       6.4.2 Transformation based on ground control points
   6.5 Resampling

7. Enhancement Techniques
   7.1 Introduction
   7.2 Contrast Stretch or Enhancement
   7.3 Linear Enhancement
       7.3.1 Min-Max stretch
       7.3.2 Percentile stretching
       7.3.3 Piece wise linear stretch
   7.4 Non Linear Enhancement
       7.4.1 Histogram equalization
       7.4.2 Gaussian equalization
       7.4.3 Logarithmic contrast enhancement
       7.4.4 Exponential contrast enhancement
   7.5 Comparison of Enhancement Method
   7.6 Illustrative Example

8. Image Transformations
   8.1 Basic Arithmetic Operators
       8.1.1 Image addition
       8.1.2 Image subtraction
       8.1.3 Image multiplication
       8.1.4 Image division
   8.2 Vegetation Indices
   8.3 Classification of Vegetation Indices
   8.4 The Slope-based Vegetation Index
       8.4.1 Ratio vegetation index (RATIO)
       8.4.2 The normalized difference vegetation index (NDVI)
       8.4.3 The transformed vegetation index (TVI)
       8.4.4 The corrected transformed vegetation index (CTVI)
       8.4.5 Thiams transformed vegetation index (TTVI)
       8.4.6 Ratio vegetation index (RVI)
       8.4.7 The normalized ratio vegetation index (NRVI)
       8.4.8 Other indices
   8.5 The Distance-based Vegetation Index
       8.5.1 The perpendicular vegetation index (PVI)
       8.5.2 Difference vegetation index (DVI)
       8.5.3 The Ashburn vegetation index (AVI)
       8.5.4 The weighted difference vegetation index (WDVI)
       8.5.5 The soil-adjusted vegetation index (SAVI)
       8.5.6 The modified soil-adjusted vegetation indices (MSAVI1 and MSAVI2)
       8.5.7 Atmospherically resistant vegetation index (ARVI)
       8.5.8 Soil and atmospherically resistant vegetation index (SARVI)
       8.5.9 Enhanced vegetation index
   8.6 Special Indices
       8.6.1 Normalized difference water index
       8.6.2 Normalized difference snow index
       8.6.3 Normalized burn ratio
   8.7 The Orthogonal Transformations
       8.7.1 Principal component analysis
       8.7.2 Tasseled-cap components
       8.7.5 The concept of n-space indices
           8.7.5.1 Calculation of n-space coefficients
   8.8 Illustrative Example
       8.8.1 Ratio and NDVI images
       8.8.2 Vegetation indices
           8.8.2.1 Transformed vegetation index
           8.8.2.2 Corrected transformed vegetation index
           8.8.2.3 Thiams transformed vegetation index
           8.8.2.4 Ratio vegetation index (RVI)
           8.8.2.5 Normalized ratio vegetation index
           8.8.2.6 Infrared index (II)
           8.8.2.7 Moisture stress index
           8.8.2.8 Perpendicular vegetation index 1
       8.8.3 Principal component analysis images
       8.8.4 Tassel cap transformation images

9. Image Classification
   9.1 Introduction
   9.2 Supervised Classification
       9.2.1 Classification scheme
       9.2.2 Training site selection and statistics extraction
           9.2.2.1 Guidelines for training data
           9.2.2.2 Idealized sequence for selecting training data
       9.2.3 Training data statistics
       9.2.4 Feature selection
       9.2.5 Selection of appropriate classification algorithm
           9.2.5.1 The parallelopiped classifier
           9.2.5.2 The minimum-distance to means classifier
           9.2.5.3 The maximum likelihood classifier
   9.3 Unsupervised Classification
       9.3.1 Distance based clustering methods
       9.3.2 Model-based clustering methods
       9.3.3 Density-based clustering method
       9.3.4 Condensation-based method for clustering
       9.3.5 Subspace clustering methods
       9.3.6 Feature selection for clustering
       9.3.7 Clustering pattern classification by distance functions
           9.3.7.1 Minimum distance pattern classification
           9.3.7.2 Maximin distance algorithm
           9.3.7.4 K-means algorithm
           9.3.7.5 ISODATA algorithm
   9.4 Classification Accuracy Assessment
       9.4.1 Error matrix
   9.5 Illustration Example
       9.5.1 Selection of training dataset
       9.5.2 Feature selection
       9.5.3 Image classification
       9.5.4 Assessment of accuracy

10. Spatial Filtering
    10.1 Introduction
    10.2 Process of Filtering
    10.3 Noise Removal Filtering
        10.3.1 Mean filter
        10.3.2 Weighted mean filter
        10.3.3 Median filter
        10.3.4 Mode filter
        10.3.5 Olympic filter
        10.3.6 Multi level median (MLM) filter
        10.3.7 P-median (PM) filter
        10.3.8 Adaptive mean P-median (AMPM) filter
    10.4 Edge Detection
        10.4.1 Classification of Edge Detection Techniques
        10.4.2 Non-directional Filters
            10.4.2.1 Laplacian filter
            10.4.2.2 High boost filter
        10.4.3 Simple Directional Filtering
        10.4.4 Gradient filtering
            10.4.4.1 Roberts operator
            10.4.4.2 Prewitt operator
            10.4.4.3 Sobel operator
            10.4.4.4 Kirsch operator
        10.4.5 Zero Crossing Filtering
            10.4.5.1 LoG filter
            10.4.5.2 DDoG filter

References
Index
About the Author

List of Figures

Fig. 1.1 Electromagnetic Spectrum
Fig. 1.2 The wave model of electromagnetic energy
Fig. 1.3 Radiations from a blackbody
Fig. 1.4 Schematic description of image representation and modeling
Fig. 2.1 Overview of the sensors
Fig. 2.2 Connectivity of different sampling pattern
Fig. 2.3 A log-polar array of pixels
Fig. 2.4 Chromaticity diagram
Fig. 2.5 Typical color gamut of color monitors (triangle) and color printed devices (irregular region)
Fig. 2.6 The RGB Colour Model
Fig. 2.7 HSI Model
Fig. 2.8 Separation of primaries
Fig. 3.1 Simple file format model
Fig. 3.2 TIFF Data Structures
Fig. 3.3 Image File Header
Fig. 3.4 Organization of the Main File
Fig. 3.5 Organization of the Index File
Fig. 3.6 An Example Polygon Instance
Fig. 3.7 Multi-Patch Part Examples
Fig. 3.8 Layout of sample dataset
Fig. 3.9 Band Interleaved by Pixel (BIP) data format
Fig. 3.10 Band Interleaved by Line (BIL) data format
Fig. 3.11 Band Sequence (BSQ) data format
Fig. 4.1 Basic Utilities in an image processing software
Fig. 4.2 Application Window of IDRISI
Fig. 5.1 Different types of histograms
Fig. 5.2 Representation of a histogram
Fig. 5.3 Different types of Skewness
Fig. 5.4 Different types of Kurtosis
Fig. 5.5 Different types of scatter plot
Fig. 5.6 TM Band 1
Fig. 5.7 TM Band 2
Fig. 5.8 TM Band 3
Fig. 5.9 TM Band 4
Fig. 5.10 TM Band 5
Fig. 5.11 TM Band 7
Fig. 5.12 FCC of TM Band 4, 3 and 2
Fig. 5.13 Histogram of TM Band 1
Fig. 5.14 Histogram of TM Band 2
Fig. 5.15 Histogram of TM Band 3
Fig. 5.16 Histogram of TM Band 4
Fig. 5.17 Histogram of TM Band 5
Fig. 5.18 Histogram of TM Band 7
Fig. 5.19 Scatter plot – TM 1 & 2
Fig. 5.20 Scatter plot – TM 1 & 3
Fig. 5.21 Scatter plot – TM 1 & 4
Fig. 5.22 Scatter plot – TM 1 & 5
Fig. 5.23 Scatter plot – TM 1 & 7
Fig. 5.24 Scatter plot – TM 2 & 3
Fig. 5.25 Scatter plot – TM 2 & 4
Fig. 5.26 Scatter plot – TM 2 & 5
Fig. 5.27 Scatter plot – TM 2 & 7
Fig. 5.28 Scatter plot – TM 3 & 4
Fig. 5.29 Scatter plot – TM 3 & 5
Fig. 5.30 Scatter plot – TM 3 & 7
Fig. 5.31 Scatter plot – TM 4 & 5
Fig. 5.32 Scatter plot – TM 4 & 7
Fig. 5.33 Scatter plot – TM 5 & 7
Fig. 6.1 Components of the signal received by a satellite-mounted sensor
Fig. 6.2 Regression method for computation of atmospheric path radiance
Fig. 6.3 Effects of Earth rotation on the geometry of a line-scanned image
Fig. 6.4 Earth Rotation Correction
Fig. 7.1 Concept of Min-Max stretch
Fig. 7.2 Concept of Percentile stretching
Fig. 7.3 Piecewise linear contrast stretch
Fig. 7.4 Various contrast enhancement techniques applied to TM Band 1
Fig. 7.5 Various contrast enhancement techniques applied to TM Band 4
Fig. 8.1 Concept of Vegetation Index
Fig. 8.2 The Perpendicular Vegetation Index
Fig. 8.3 Distance from the Soil Line
Fig. 8.4 Light and dark soil influences on SAVI values as a function of the shift origin correction factor
Fig. 8.5 Effect of soil moisture on NDVI, SAVI and WDVI
Fig. 8.6 Completion of Red-NIR Space with vegetation density
Fig. 8.7 Laboratory-measured green and dry vegetation reflectance spectra
Fig. 8.8 Liquid water transmittances for different water thicknesses
Fig. 8.9 Sensitivity of NDWI to liquid water thickness
Fig. 8.10 Scatter diagram between reflectance of wet and dry soil at 1.2 µm and 0.86 µm
Fig. 8.11 Relationship of NDWI and reflectance at 0.86 µm
Fig. 8.12 The mean NDWI for mixtures of wet soils with green vegetation and drier soil with green vegetation as a function of vegetation area fraction
Fig. 8.13 Effects of different crystal radii on snow reflectance
Fig. 8.14 TM Tasseled Cap transformation axes system
Fig. 8.15 Approximate locations of some classes in TM Tasseled Cap feature space
Fig. 8.16 Ratio image
Fig. 8.17 NDVI Image
Fig. 8.18 Transformed Vegetation Index
Fig. 8.19 Corrected Transformed Vegetation Index
Fig. 8.20 Thiams Transformed Vegetation Index
Fig. 8.21 Ratio Vegetation Index
Fig. 8.22 Normalized Ratio Vegetation Index
Fig. 8.23 Infrared Index
Fig. 8.24 Moisture Stress Index
Fig. 8.25 Perpendicular Vegetation Index 1
Fig. 8.26 Principal Component Analysis Images
Fig. 8.27 Tassel Cap Transformation Images
Fig. 8.28 FCC of the TM band 4, 3 and 2
Fig. 8.29 FCC generated from the first three Principal Component Analysis
Fig. 8.30 FCC generated from Brightness, Greenness and Wetness images
Fig. 9.1 Schematic representation of overlapping information from training data
Fig. 9.2 Clustering procedure
Fig. 9.3 A general classification of clustering methods
Fig. 9.4 Two "patterns" in a two dimensional measurement space
Fig. 9.5 Decision boundaries defined by a single-prototype, minimum distance classifier
Fig. 9.6 Partitioning of feature space by maximin algorithm
Fig. 9.7 Schematic representation of an error matrix
Fig. 9.8 Histogram plot of different features in different training data in different TM Bands
Fig. 9.9 Minimum Distance to Means classifier
Fig. 9.10 Maximum Likelihood classifier
Fig. 10.1 Low pass filtering and high pass filtering
Fig. 10.2 An example of small image (left) and kernel (right) to illustrate convolution
Fig. 10.3 1-D Gaussian distribution with mean value of 0 and standard deviation of 1
Fig. 10.4 Layout of Mean filter
Fig. 10.5 1st and 2nd derivative of an edge illustrated in one dimension
Fig. 10.6 Representation of a Laplacian Filter procedure
Fig. 10.7 (a) Simple Laplacian mask
Fig. 10.7 (b) Laplacian mask with orientation invariant
Fig. 10.8 Simple Directional Filter
Fig. 10.9 Roberts operator
Fig. 10.10 Prewitt Operator
Fig. 10.11 Sobel operator
Fig. 10.12 Kirsch Operator (8 masks)
Fig. 10.13 Intensity profile of an ideal Step Edge
Fig. 10.14 Response of LoG Filter to a Step Edge

List of Tables

Table 1.1 Data Volumes of Image Sources (in Millions of Bytes)
Table 1.2 Storage Capacities (in Millions of Bytes)
Table 3.1 Summary of Tags in TIFF
Table 3.2 Tag Value
Table 3.3 Additional tags required for TIFF conformant images
Table 3.4 Comparison between JPEG and GIF
Table 3.5 Details of IHDR Chunk
Table 3.6 Allowed combination of colour type and allowed bit depth
Table 3.7 List of predefined keywords
Table 3.8 Details of tIME Chunk
Table 3.9 Structure of zTXT chunk
Table 3.10 Properties and ordering constraints of standard chunks
Table 3.11 Description of the Main File Header
Table 3.12 Values for Shape Type
Table 3.13 Description of Main File Record Headers
Table 3.14 Description of Index Records
Table 3.15 Multi Patch Part values
Table 4.1 System Specifications of ERDAS IMAGINE 9.0 for Windows OS
Table 4.2 Recommended System specifications of ERDAS IMAGINE 9.0 for UNIX OS
Table 5.1 Initial Statistics of the 6-band TM Dataset
Table 5.2 Variance Co-variance Matrix
Table 5.3 Correlation Matrix
Table 6.1 Comparative statement of Resampling Methods
Table 7.1 Look up Table
Table 8.1 Tasseled-cap coefficients for Landsat-1 MSS
Table 8.2 Formulae of Different Indices
Table 8.3 Eigen values of the Principal Component Analysis
Table 8.4 Eigen vectors of Principal Component Analysis
Table 8.5 Degree of correlation between each band and component
Table 9.1 U.S. Geological Survey land use/land cover classification system
Table 9.2 The Four levels of Remotely Sensed Data to be used
Table 9.3 Different feature selection methods
Table 9.4 Alternate definitions of discriminant function
Table 9.5 A sample Error Matrix
Table 9.6 Percentage errors of Omission and Commission
Table 9.7 Confidence limits of Omission and Commission
Table 9.8 Sample computation of κ
Table 9.9 List of accuracy measures
Table 9.10 Statistics for Dry Sand
Table 9.11 Statistics for Agriculture
Table 9.12 Statistics for High density forest
Table 9.13 Statistics for Medium density forest
Table 9.14 Statistics for Light dense forest
Table 9.15 Statistics for New forest
Table 9.16 Statistics for Urban
Table 9.17 Statistics for Water
Table 9.18 Statistics for Shallow water
Table 9.19 Statistics for Wet sand
Table 9.20 Classification results using different classifiers using different band combinations
Table 9.21 Error Matrix of 4-band combination using Maximum Likelihood Classifier
Table 9.22 Accuracy of Classification for 4-band combination using Maximum Likelihood Classifier
Table 9.23 Error Matrix of 4-band combination using Minimum Distance to Mean Classifier
Table 9.24 Accuracy of Classification for 4-band combination using Minimum Distance to Mean Classifier
Table 9.25 Error Matrix of 3-band combination using Maximum Likelihood Classifier
Table 9.26 Accuracy of Classification for 3-band combination using Maximum Likelihood Classifier
Table 9.27 Error Matrix of 3-band combination using Minimum Distance to Mean Classifier
Table 9.28 Accuracy of Classification for 3-band combination using Minimum Distance to Mean Classifier
Table 9.29 Error Matrix of 3-band combination using MLC
Table 9.30 Accuracy of Classification for 3-band combination using Minimum Distance to Mean Classifier
Table 9.31 Error Matrix of 4-band combination using MLC
Table 9.32 Accuracy of Classification for 3-band combination using Minimum Distance to Mean Classifier
Table 9.33 Error Matrix of 4-band combination using Minimum Distance to Means
Table 9.34 Accuracy of Classification for 3-band combination using Minimum Distance to Mean Classifier
Table 9.35 Error Matrix of 3-band combination using Minimum Distance to Means
Table 9.36 Accuracy of Classification for 3-band combination using Minimum Distance to Mean Classifier

1 Concept of Images

1.1

INTRODUCTION

Simply put, images are pictures in which information recorded by a camera or sensor is presented visually in either hard or soft copy form. Pictures are important as they can be an extraordinarily effective medium for the storage and communication of information. Photographs help human beings to create a permanent record of their visual experiences, and also to share these experiences with others. A photograph, in general, provides the necessary information which otherwise may require a lengthy, tedious and ambiguous verbal description of the area photographed. Normally, human beings rely on their eyes to receive information about their surroundings, and the brain is adept at visual data processing. Thus, it is very aptly said that a photograph or a picture is worth a thousand words. The process of acquiring a photograph is similar to the process of normal human vision. In both human vision and photography, a light source is required to illuminate a scene. The light interacts with the objects in the scene and the reflected light reaches the observer, whereupon it is detected by the eyes in the case of human vision or by a camera in the case of photography. Information about the objects in the scene is recorded as variations in the intensity and colour of the detected light. It is important to note that, although a scene is three-dimensional in nature, the image of that scene is always a two-dimensional one.

1.2

ELECTROMAGNETIC ENERGY

Apart from light, there are other forms of energy that can be used to create images. Light is the small portion of the electromagnetic (EM) spectrum, between 0.4 µm and 0.7 µm, known as the visible portion. It is important to note that this is the only portion of the spectrum that can be associated with the concept of colour. Blue, green, and red are known as the primary colours or wavelengths of the visible spectrum. The electromagnetic (EM) spectrum, however, ranges from gamma rays to radio waves (Fig. 1.1).

[Figure: logarithmic wavelength scale (µm) spanning cosmic rays, γ-rays, X-rays, ultraviolet (UV), the visible region (blue 0.4–0.5 µm, green 0.5–0.6 µm, red 0.6–0.7 µm), near-, mid- and thermal infrared, microwave, and television and radio waves.]

Fig. 1.1: Electromagnetic spectrum

1.3

ELECTROMAGNETIC SPECTRUM AND ITS CHARACTERISTICS

Electromagnetic energy can be modeled by waves or by energy-bearing particles called photons. In the wave model, electromagnetic energy is considered to propagate through space in the form of sinusoidal waves. These waves are characterized by an electric field (E) and a magnetic field (M), both of which are perpendicular to each other, and hence the term electromagnetic energy is used. The vibration of both fields is perpendicular to the direction of travel of the wave (Fig. 1.2). Both fields propagate through space at the speed of light c, which is 299,790,000 m/s and can be rounded off to 3 × 10^8 m/s.

[Figure: sinusoidal electric field E and magnetic field M oscillating perpendicular to each other and to the direction of travel; wavelength and the velocity of light c are indicated.]

Fig. 1.2 The wave model of electromagnetic energy

The wavelength λ of electromagnetic waves, which is particularly important for understanding remote sensing, is defined as the distance between successive wave crests. Wavelength is measured in metres (m) or some fraction of a metre, such as nanometres (nm, 10^-9 m) or micrometres (µm, 10^-6 m). The frequency ν of the electromagnetic energy is the number of cycles of a wave passing a fixed point over a specific period of time. Frequency is normally measured in hertz (Hz), which is equivalent to the number of cycles per second. Since the speed of light is constant, wavelength and frequency are inversely related to each other:

ν = c / λ                ... (1.1)

Most characteristics of EM energy can be described using the wave model. For some purposes, however, EM energy modeled by the particle (photon) theory is more convenient to use. This approach is considered when quantifying the amount of energy measured by a multi-spectral sensor. The amount of energy held by a photon of a specific wavelength is given by

Q = hν = hc / λ                ... (1.2)

where Q is the energy of a photon (J) and h is Planck's constant (6.6262 × 10^-34 J s). From Eq. (1.2) it follows that the longer the wavelength, the lower its energy content. Gamma rays (around 10^-9 m) are the most energetic, and radio waves (around >1 m) are the least energetic. It may be noted that it is easier to measure shorter wavelengths than longer wavelengths.

All matter with a temperature above absolute zero (0 K) radiates EM energy due to molecular agitation, in which movement of the molecules takes place. This means that the Sun, and also the Earth, radiates energy in the form of waves. Matter capable of absorbing and re-emitting all EM energy is known as a blackbody. For blackbodies, both the emissivity (ε) and the absorptance (α) are equal to 1. The amount of energy radiated by an object depends on its absolute temperature and emissivity, and it is a function of the wavelength. The radiation emitted by a blackbody at different temperatures is shown in Fig. 1.3. The area below the curve represents the total amount of energy emitted at a specific temperature. It can be concluded that a higher temperature corresponds to a greater contribution of shorter wavelengths; the peak radiation at 400°C is around 4 µm, while at 1000°C it is 2.5 µm. The emitting ability of a real material compared to that of the blackbody is referred to as the emissivity of the material. In reality, blackbodies are hardly found in nature, and most natural objects have an emissivity of less than one. This means that only a part, usually between 80–98%, of the received energy is re-emitted, and the remaining part of the energy is absorbed.

One of the useful properties of EM radiation, for imaging purposes, is its ability to travel in straight lines. Thus geometric characteristics of objects in a scene can be preserved in images. Further, EM radiation can interact with matter in different ways, depending on its wavelength. Images acquired at different wavelengths may have very different properties, and we may need to be aware of these differences when seeking appropriate image processing techniques.

[Figure: blackbody radiation curves wλ (watts/cm²/µm) versus wavelength (µm) for T = 400°C, 600°C, 800°C and 1000°C; as T increases, the peak moves towards shorter wavelengths and the total area under the curve increases.]

Fig. 1.3 Radiations from a blackbody
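To make Eqs. (1.1) and (1.2) concrete, the following minimal Python sketch computes the frequency and photon energy for a wavelength in the visible region; the 0.55 µm wavelength is an illustrative choice, not a value taken from the text.

```python
# Frequency and photon energy from Eqs. (1.1) and (1.2).
C = 3.0e8           # speed of light, m/s (rounded value used in the text)
H = 6.6262e-34      # Planck's constant, J s

def frequency_hz(wavelength_m):
    return C / wavelength_m            # Eq. (1.1): nu = c / lambda

def photon_energy_j(wavelength_m):
    return H * C / wavelength_m        # Eq. (1.2): Q = h * nu = h * c / lambda

lam = 0.55e-6  # 0.55 micrometres (green light), illustrative wavelength
print(f"frequency     : {frequency_hz(lam):.3e} Hz")     # ~5.5e14 Hz
print(f"photon energy : {photon_energy_j(lam):.3e} J")   # ~3.6e-19 J
```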

1.4

UTILITY OF EM RADIATION IN IMAGE ACQUISITION

The visible portion of the spectrum occurs between wavelengths of approximately 400 and 700 nanometres (nm). Within this region, wavelength is perceived as colour; light at 550 nm appears green, whereas light at 700 nm is seen as red. At shorter wavelengths, EM radiation carries larger energies. In the X-ray region of the spectrum (at a wavelength λ of around 10^-10 m), it carries sufficient energy to penetrate a significant volume of material. X-ray images therefore reveal the internal structure of objects that are opaque to light and are commonly used to image internal parts of the human body. At wavelengths around 10^-12 m, EM radiation has radioactive properties and is known as gamma rays. Gamma rays are highly penetrating and have medical applications. The information is represented as a function of the absorptivity of a radioactive tracer. This tracer is absorbed in varying amounts by different tissues in the body, according to their level of activity. Thereafter, a gamma camera is used to collect gamma-ray photons emitted by body tissues to form an image. Diseased areas such as a tumour will appear as a bright region in the image.

EM radiation having wavelengths longer than light can also be used for acquiring information. Warm objects emit large quantities of infrared (IR) radiation, which can be used to locate people or moving vehicles even in conditions of total darkness. Microwave energy is able to penetrate through cloud and fog and can be used to study the earth surface during cloudy or foggy conditions.

Apart from EM energy, images can be generated whenever information has a spatial nature. The Digital Elevation Model (DEM) is a common representation of elevation in a pictorial/image form. In this case, instead of depicting information as a function of some property of the object, such as reflectivity, emissivity or absorptivity, the variation of ground elevation is represented.

1.5

IMAGE PROCESSING

Image processing is a general term for the wide range of techniques that are used to manipulate, modify or classify images in various ways. In general, a digital image acquired through a satellite or a digital camera is analyzed on a computer, hence the term digital image processing. In this book, the focus is on using digital satellite images to highlight the various procedures of digital image analysis. In general, Digital Image Processing refers to the processing of a two-dimensional picture in digital form by a digital computer. A digital image can be defined as an array of real or complex numbers represented by a finite number of bits. An image can be acquired by scanning an existing photograph with a scanner, by a digital camera, or by a digital sensor on board a satellite or aircraft. This digital image can then be processed and/or displayed on a high-resolution computer monitor. Digital image processing has a broad spectrum of applications, such as remote sensing, image transmission and storage for business applications, medical processing, radar, sonar and acoustic image processing, robotics, and automated inspection of industrial parts. Images acquired by satellites are useful in identifying and mapping earth resources; prediction of agricultural crops, urban growth and weather; flood and fire control; and many other environmental applications. Image transmission and storage applications are frequently found in the broadcast of television programmes, teleconferencing between users, sending of facsimile messages (FAX), and the internet and intranet. In the field of medical applications, the use of X-rays, angiograms, tomography, nuclear magnetic resonance (NMR) and ultrasonic scanning are some of the uses of digital image processing. Radar and sonar images can be used for detection and recognition of various types of targets or in guidance and maneuvering of aircraft or missile systems. Robot vision for industrial automation, image synthesis, cartoon making and fashion design are some other application areas of digital image processing.

1.6

BASIC IMAGE PROCESSING TECHNIQUE

The basic image processing techniques can be enumerated as follows:
i. Image representation and modeling
ii. Image enhancement
iii. Image restoration
iv. Image analysis
v. Image reconstruction
vi. Image data compression

1.6.1

Image Representation and Modeling

A digital image can represent a typical informational characteristic of an area, such as the reflected, emitted or absorbed energy of an object, elevation, rainfall, electrical conductivity, etc. Thus, any information having spatial characteristics can be considered as an image. The spatial dimension of a digital image is represented by the pixel (picture element), which defines the spatial resolution of the image. Fig. 1.4 shows a schematic description of image representation and modeling.

[Figure: image representation and modeling grouped into perception models (visual perception of contrast, spatial frequencies and colour; image fidelity models; temporal perception; scene perception), local models (sampling and reconstruction; image quantization; deterministic models; series expansions/unitary; statistical models) and global models (scene analysis/artificial intelligence models; sequential and clustering models; image understanding models).]

Fig. 1.4: Schematic description of image representation and modeling

In image representation, fidelity is an important consideration as it defines the quality of the image and specifies the contrast, spatial resolution, colour, etc. It is helpful in designing the sensor, since it defines which parameters are to be measured with high accuracy. Further, the sampling rate of an image is an important criterion, as it defines the manner in which useful information in an image is preserved, and it depends upon the bandwidth of the image. In general, a television signal has a bandwidth of about 4 MHz and, from the sampling theorem, requires a minimum sampling rate of 8 MHz. At 30 frames per second, this implies that each frame should contain approximately 266,000 pixels. So, for a raster image of 512 lines, each image should be of the size of 512 × 512 pixels. Statistical models describe an image by its mean and covariance functions. This allows the development of algorithms that can be applied to an entire class of images. Often the images are assumed to be stationary, so that the mean and covariance functions can easily be estimated. Stationary models are useful in data compression problems. In global modeling, an image is considered as a composition of several objects. The various objects in the scene are detected (for example, by segmentation techniques), and the model gives the rules for defining the relationships among the various objects. Such representations fall under the category of image understanding models, which is not a subject of study in this text.
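The following minimal sketch reproduces the sampling-rate arithmetic quoted earlier in this subsection for a broadcast television signal; the 4 MHz bandwidth, 30 frames per second and 512-line raster are the values assumed in the text.

```python
# Sketch of the sampling-rate arithmetic for a ~4 MHz television signal.
# Values (bandwidth, frame rate, line count) are those assumed in the text.

bandwidth_hz = 4e6                     # signal bandwidth (~4 MHz)
nyquist_rate_hz = 2 * bandwidth_hz     # minimum sampling rate from the sampling theorem
frames_per_second = 30

samples_per_frame = nyquist_rate_hz / frames_per_second
print(f"Minimum sampling rate : {nyquist_rate_hz / 1e6:.0f} MHz")
print(f"Samples per frame     : {samples_per_frame:,.0f}")      # ~266,667 pixels

lines_per_frame = 512
samples_per_line = samples_per_frame / lines_per_frame
print(f"Samples per line      : {samples_per_line:.0f}")        # ~521, hence a 512 x 512 raster
```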

1.6.2

Image Enhancement

The role of image enhancement is to highlight certain image features so that they can be used for subsequent analysis or for image display. Some of the image enhancement techniques are contrast and edge enhancement, pseudo-colouring, noise filtering, sharpening, and magnifying. Image enhancement is useful in feature extraction, image analysis, and visual information display. It may be noted that the enhancement process does not increase the inherent information content of the data; it simply emphasizes certain specified portions of the image. Enhancement algorithms are generally interactive and application-dependent.
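As a simple illustration of contrast enhancement (treated in detail in Chapter 7), the sketch below applies a min-max linear stretch to a small array of grey levels; the array values are made-up illustrative digital numbers, not data from the book's sample scene.

```python
import numpy as np

# Minimal min-max linear contrast stretch: map the observed grey-level range
# of the input onto the full 0-255 display range (illustrative values only).
def minmax_stretch(band, out_min=0, out_max=255):
    band = band.astype(float)
    lo, hi = band.min(), band.max()
    stretched = (band - lo) / (hi - lo) * (out_max - out_min) + out_min
    return stretched.round().astype(np.uint8)

dn = np.array([[52, 60, 58],
               [61, 49, 55],
               [50, 63, 57]])        # narrow range of digital numbers
print(minmax_stretch(dn))            # values now span 0-255
```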

1.6.3

Image Restoration

The purpose of image restoration is to remove or minimize known degradations in an image. Examples include the deblurring of images degraded by the limitations of a sensor or its environment, noise filtering, and the correction of geometric distortion or non-linearities due to the sensor.

1.6.4

Image Analysis

Image analysis is performed to make quantitative measurements from an image and produce a description of it. Image analysis techniques require extraction of certain features that aid in the identification of the object. Quantitative measurements of object features allow classification and description of the image.

1.6.5

Image Reconstruction

In image reconstruction, a 2D or 3D object is reconstructed from several one-dimensional projections. Each projection is obtained by passing rays of energy through the object at a different angle, and the projections are combined to obtain a 2D or 3D view. Such techniques are important in medical imaging (CT scanners), astronomy, radar imaging, geological exploration and non-destructive testing of assemblies.

1.6.6

Image Data Compression

When an image of an object or scene is acquired, voluminous data is generated (Table 1.1), and storing it requires devices with enormous storage capacity (Table 1.2). Generally, the access speeds of storage devices are inversely proportional to their capacity. Image data compression techniques are concerned with reducing the number of bits required to store or transmit images without any appreciable loss of information. Image storage is commonly required for educational and business documents, and medical images are used in patient monitoring systems. Due to these wide applications, data compression is of great importance in digital image processing.

Table 1.1 Data Volumes of Image Sources (in Millions of Bytes)
National archives                      12.5 × 10^9
1 h of color television                28 × 10^3
Encyclopedia Britannica                12.5 × 10^3
Book (200 pages of text characters)    1.3
One page viewed as an image            13

Table 1.2 Storage Capacities (in Millions of Bytes)
Human brain                  125,000,000
Magnetic cartridge           250,000
Optical disc memory          12,500
Magnetic disc                760
2400-ft magnetic tape        200
Floppy disc                  125
Solid-state memory modules   0.25
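As a small illustration of why compression matters, the sketch below computes the raw (uncompressed) size of a multi-spectral image from its dimensions, band count and bit depth; the image dimensions and compression ratio used are arbitrary examples, not figures from the tables above.

```python
# Raw storage needed for an uncompressed digital image:
# rows x columns x bands x (bits per pixel / 8) bytes.
def raw_size_bytes(rows, cols, bands=1, bits_per_pixel=8):
    return rows * cols * bands * bits_per_pixel // 8

# Example: a 3000 x 3000 pixel, 6-band, 8-bit satellite scene (illustrative numbers).
size = raw_size_bytes(3000, 3000, bands=6, bits_per_pixel=8)
print(f"{size / 1e6:.0f} MB uncompressed")             # 54 MB

# A lossless scheme achieving a 2.5:1 compression ratio would cut this to:
print(f"{size / 2.5 / 1e6:.0f} MB after compression")
```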

With this brief background on images and image processing, the next chapter looks at the process of imaging, i.e., data acquisition.

2 The Process of Imaging

2.1

INTRODUCTION

Imaging is a short form for image acquisition. It is the process of sensing the surroundings and the subsequent representation of the measurements made, leading to the formation of an image. One of the important aspects of imaging is the source of energy; depending on this, it can be categorized as passive or active imaging. To record information, a device known as a sensor is required. Sensors can be grouped into two groups. Passive sensors depend on an external source of energy, usually the Sun. A passive sensor can record energy from 10^-12 m (gamma rays) to over 1 m (micro and radio waves). Active sensors have their own source of energy. Measurements by active sensors are controlled, as they do not depend upon the illumination conditions. Fig. 2.1 shows a schematic representation of the different types of sensors.

2.2

PASSIVE SENSORS

A passive sensor is one which does not have its own independent energy source and relies on some external source, such as the Sun, to illuminate the objects. The energy reflected from these objects is recorded by the sensor. The aerial camera is a typical example, as it can act as either a passive or an active sensor depending upon the illumination conditions.

2.2.1

Gamma-ray Spectrometer

The gamma-ray spectrometer measures the amount of gamma rays emitted by the upper soil or rock layers due to radioactive decay. The energy measured in specific wavelength bands provides information on the abundance of radio isotopes that relate to specific minerals. Therefore, the main application of the gamma-ray spectrometer is found in mineral exploration. Further, gamma rays have very short wavelengths (of the order of picometres); due to the large atmospheric absorption of these waves, this type of energy can only be measured up to a few hundred metres above the Earth's surface.

[Figure: overview of sensors arranged by wavelength across the visible, optical and microwave domains; passive sensors include the gamma-ray spectrometer, aerial camera, multispectral scanner, thermal scanner and passive microwave radiometer, while active sensors include the laser scanner, imaging radar and radar altimeter.]

Fig. 2.1 Overview of the sensors

2.2.2

Aerial Camera

The aerial camera uses an optical lens and film system and is mostly mounted in aircraft for aerial photography. Low orbiting satellites and NASA Space Shuttle missions have also used conventional cameras. The film types used in the camera enable electromagnetic energy in the range between 400 nm and 900 nm to be recorded. Aerial photographs are used in a wide range of applications. The rigid and regular geometry of aerial photographs allows stereo-photography to be acquired. This has enabled the development of photogrammetric procedures for obtaining precise 3D coordinates. Although aerial photos are used in many applications, the principal applications include medium and large scale topographic mapping and cadastral mapping. Presently, analogue photos are often scanned, stored and processed in digital systems.

2.2.3

Video Camera

A video camera is sometimes used to record image data. Most video sensors are sensitive only to the visible colours, although a few are able to record the near-infrared part of the spectrum. Until recently, only analogue video cameras were available. Today, digital video cameras are increasingly available, some of which are applied in remote sensing. Mostly, video images serve to provide low-cost image data for qualitative purposes, for example, to provide additional visual information about an area captured by another sensor.

2.2.4

Multi-spectral Scanner

The multi-spectral scanner is an instrument that mainly measures the reflected sunlight in the optical domain. A scanner systematically scans the Earth's surface, thereby measuring the energy reflected from the viewed area. This is done simultaneously for several wavelength bands, hence the name multi-spectral scanner. A wavelength band is an interval of the electromagnetic spectrum for which the average reflected energy is measured. The reason for measuring a number of distinct wavelength bands is that each band is related to specific characteristics of the Earth surface. For example, the reflection characteristics of 'blue' light give information about the mineral composition, while the reflection characteristics of 'infrared' light tell something about the type and health of vegetation. The definition of the wavebands of a scanner therefore depends on the applications for which the sensor has been designed.

2.2.5

Imaging Spectrometer

The principle of the imaging spectrometer is similar to that of the multi-spectral scanner, except that a spectrometer measures in very narrow (5-10 nm) spectral bands. This results in an almost continuous reflectance curve per pixel rather than values for relatively broad spectral bands. The spectral curves measured depend on the chemical composition of the material. Imaging spectrometer data, therefore, can be used to determine the mineral composition of the surface or the chlorophyll content of surface water.

2.2.6

Thermal Scanner

Thermal scanners measure thermal data in the range of 10-14 µm. Wavelengths in this range are directly related to the temperature of an object. Data on cloud, land and sea surface temperature are extremely useful for weather forecasting. For this reason, most remote sensing systems designed for meteorology include a thermal scanner. Thermal scanners can also be used to study the effects of drought ("water stress") on agricultural crops, or to monitor the temperature of cooling water discharged from thermal power plants. Another application is in the detection of coal fires, forest fires, geysers, etc.

2.2.7

Radiometer

EM energy with very long wavelengths (1-100 cm) is emitted from the soil and rocks on or just below the Earth's surface. The depth from which this energy is emitted depends on the properties, such as water content, of the specific material. Radiometers are used to detect this energy. The resulting data can be used in mineral exploration, soil mapping and soil moisture estimation.

2.3

ACTIVE SENSORS

Active sensors are those sensors which have their own energy source to illuminate the objects on the ground and then record the energy reflected from these objects. Some of the commonly used active sensors are discussed in the following section.

2.3.1

Laser Scanner

Laser scanners are mounted on aircraft and use a laser beam (infrared light) to measure the distance from the aircraft to points located on the ground. This distance measurement is then combined with exact information on the aircraft’s position to calculate the elevation of ground points. Laser scanning is mainly used to produce detailed and high resolution Digital Terrain Models (DTM) for topographic mapping. Laser scanning is increasingly used for other purposes, such as the production of detailed 3D models of city buildings and for measuring tree heights in forestry.

2.3.2

Radar Altimeter

Radar altimeters are used to measure the topographic profile parallel to the satellite orbit. They provide profiles (single lines of measurements) rather than 'image' data. Radar altimeters operate in the 1-6 cm domain and are able to determine height with a precision of 2-4 cm. Radar altimeters are useful for measuring relatively smooth surfaces such as oceans and for small scale mapping of continental terrain models.

2.3.3

Imaging Radar

Radar instruments operate in the 1-100 cm domain. As in multi-spectral scanning, different wavelength bands are related to particular characteristics of the Earth's surface. The radar backscatter is influenced by the illuminating signal (microwave parameters) and the illuminated surface characteristics (orientation, roughness, dielectric constant/moisture content). Since radar is an active sensor system and the applied wavelengths are able to penetrate clouds, it has an all-weather, day-and-night acquisition capability. The combination of two radar images of the same area can provide information about terrain heights. Combining two radar images acquired at different times can be used to precisely assess changes in height or vertical deformations (SAR Interferometry).

2.4

PLATFORMS

In remote sensing, the sensor is mounted on a platform. In general, remote sensing sensors are attached to moving platforms such as aircraft and satellites. Static platforms are occasionally used in an experimental context. For example, by using a multi-spectral sensor mounted on a pole, the changing reflection characteristics of a specific crop during the day or season can be assessed. Airborne observations are carried out using aircraft with specific modifications to carry sensors. An aircraft that carries an aerial camera or a scanner needs a hole in its floor. Sometimes, Ultra Light Vehicles (ULVs), balloons, zeppelins or kites are used for airborne remote sensing. Airborne observations are possible from 100 m up to 30-40 km height. Until recently, the navigation of an aircraft was one of the most difficult and crucial parts of airborne remote sensing. In recent years, the availability of satellite navigation technology has significantly improved the quality of flight execution. For space-borne remote sensing, satellites are used. Satellites are launched into space with rockets. Satellites for Earth observation are positioned in orbits between 150 and 36,000 km altitude. The specific orbit depends on the objectives of the mission, e.g. continuous observation of large areas or detailed observation of smaller areas. For a detailed discussion of the various satellite systems, the reader is advised to refer to any standard remote sensing book (Jensen, 2000; Chandra & Ghosh, 2005).

2.5

CHARACTERISTICS OF IMAGE

The optics of an imaging system focuses a continuous, two-dimensional pattern of varying light intensity and colour onto a sensor. The pattern is described by a function f(x, y), defined in a coordinate system whose origin is conventionally taken at the upper-left corner of the image. For monochrome images, the value of the function at any pair of coordinates, x and y, is the intensity of the light detected at that point. In the case of colour images, f(x, y) is a vector-valued function. The function f(x, y) must be translated into a discrete array of numerical data if it is to undergo computer processing. This digital representation is only an approximation of the original image, but it is what allows the analyst to manipulate the image using a computer. Translation of f(x, y) into an appropriate numerical form is accomplished by the processes of sampling and quantization. For standard video signals, both processes are usually carried out by a single piece of hardware, known as an Analogue to Digital Converter (ADC).
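A minimal sketch of these two steps, using a made-up continuous intensity function in place of a real optical signal: the function is sampled on a small grid and the samples are quantized to 8-bit digital numbers.

```python
import numpy as np

# Illustrative continuous image function f(x, y) (not a real sensor signal):
# intensity varies smoothly between 0 and 1 across the scene.
def f(x, y):
    return 0.5 * (1 + np.sin(0.5 * x) * np.cos(0.5 * y))

# Sampling: evaluate f at discrete pixel positions (here an 8 x 8 grid).
xs, ys = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
samples = f(xs, ys)

# Quantization: map each sample to one of 256 integer grey levels (8 bits).
digital_image = np.round(samples * 255).astype(np.uint8)
print(digital_image)
```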

2.5.1

Sampling

Sampling is the process of measuring the value of the image function I(x, y) at discrete intervals in space. Each sample corresponds to a small, square area of the image, known as a pixel. A digital image is a two-dimensional array of these pixels. Pixels are indexed by x and y coordinates, with x and y taking integer values. In a CCD sensor, which consists of an array of photo detectors, the pattern of sampling is already defined by the layout of the photo detectors. However, in conventional video cameras, the incoming radiation gets converted into an analogue video signal for compatibility as per the specification of video equipment in use today. A single frame from a standard video signal is already discrete in the y dimension, consisting of either 525 or 625 lines of data. Sampling the signal therefore involves measuring its amplitude at regular time intervals during the segments of the signal that correspond to each line. This makes the image discrete spatially in the x dimension. Video standards enforce a particular


sampling rate for a video signal. An RS-170 video signal, for instance, has 485 active lines, and each frame must have an aspect ratio of 4:3, so there must be 485 × (4/3) = 646 samples per line. In practice, a few lines and samples are trimmed from the signal to give an array of pixels with dimensions 640 × 480. To produce such an image, a temporal sampling rate of around 12 MHz is required. With a digital still picture camera, things are simpler, as there is no need to convert samples from the CCD into an analogue form and then resample. Further, there is no requirement to conform to broadcast video standards. Such cameras can typically produce images with dimensions of 1024 × 768, 1280 × 1024 pixels, etc. These dimensions are chosen to suit display standards originating from the computer industry (e.g., SVGA). Much higher resolutions than those of broadcast video are possible, and a 4:3 aspect ratio is not enforced, although this is often preferred. Other types of imaging equipment operate under different conditions. In medicine, for example, radio-isotope imaging devices produce images that are sampled very coarsely. This is because images are formed from gamma ray photons emitted by radioactive material inside the patient. For safety reasons, the quantity of this material is small, hence there are relatively few photons emitted. It is therefore necessary to integrate photon counts over a relatively large area in order to obtain statistically meaningful results. For example, an image of the chest may be represented by a 64 × 64 pixel array.

2.5.2 Spatial Resolution

The spatial resolution of an image is the physical size of a pixel in that image; i.e., the area in the scene that is represented by a single pixel in the image. For a given field of view, dense sampling will produce a high resolution image in which there are many pixels, each of which represents the contribution of a very small part of the scene; coarse sampling, on the other hand, will produce a low resolution image in which there are few pixels, each representing the contribution of a relatively large part of the scene to the image. Spatial resolution dictates the amount of useful information that can be extracted from an image. The quality of a digital image also depends upon the spatial frequency content of the image. Spatial frequency can be defined as the rate at which f (x, y) changes across the image. A gradual change in f (x, y) characterizes low spatial frequencies and can be represented adequately by a coarsely-sampled image, while rapid changes are characterized by high spatial frequencies and can be represented accurately only by a densely-sampled image. The appropriate sampling for an image is defined by the Nyquist criterion. Essentially, it states that the sampling frequency should be at least twice the highest spatial frequency found in the image. If an image is sampled coarsely, such that the Nyquist criterion is not met, then the image may suffer from the effects of aliasing. In general, the highest spatial frequency present in an image is not known in advance. Consequently, the sampling process is normally


preceded by anti-aliasing. This is a filtering operation designed to remove frequencies that exceed half the sampling rate achieved by the ADC hardware, thereby guaranteeing that the Nyquist criterion is met.
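As a rough sketch of this anti-aliasing step, the following Python fragment (assuming NumPy and SciPy are available; the Gaussian filter and the choice of sigma are illustrative assumptions, not a prescription from the text) low-pass filters an image before subsampling it:

import numpy as np
from scipy.ndimage import gaussian_filter

def downsample(image, factor, antialias=True):
    """Reduce the sampling rate of a 2-D image by an integer factor.

    If antialias is True, a Gaussian low-pass filter is applied first so that
    spatial frequencies above the new Nyquist limit are suppressed before the
    coarser sampling grid is imposed.
    """
    if antialias:
        # sigma proportional to the subsampling factor (illustrative choice)
        image = gaussian_filter(image.astype(float), sigma=factor / 2.0)
    return image[::factor, ::factor]

# Example: a high-frequency stripe pattern sampled with and without anti-aliasing
x = np.arange(512)
pattern = np.sin(2 * np.pi * x / 4.0)            # stripes with a period of 4 pixels
image = np.tile(pattern, (512, 1))
clean = downsample(image, 8, antialias=True)     # high frequencies removed first
aliased = downsample(image, 8, antialias=False)  # Nyquist criterion violated

Subsampling the striped test pattern without the filter violates the Nyquist criterion and produces an aliased result, whereas the filtered version simply loses the fine stripes.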

2.5.3 Sampling Pattern

When sampling an image, it is not only important to know the sampling rate, but also the physical arrangement of the samples. A rectangular pattern, in which pixels are aligned horizontally and vertically into rows and columns, is by far the most common. Unfortunately, a rectangular sampling pattern leads to ambiguities in pixel connectivity. A second problem with rectangular patterns is an inconsistency in distance measurement. Suppose that each pixel in Fig. 2.2(a) represents a region of the scene that is 1 cm wide and 1 cm high. Then the distance between pixels C and D is 1 cm; however, the distance between pixels B and C is not 1 cm but √2 cm, by simple trigonometry. Hence, the actual distance travelled over a fixed number of pixels in the image depends upon the direction moved.

(a) Rectangular   (b) Hexagonal
Fig 2.2 Connectivity of different sampling patterns (Efford, 2000)

These problems may be solved by a hexagonal sampling pattern (Fig.2.2b). Here, diagonal neighbours are properly connected and the distance traveled in an image does not depend on direction. Despite these advantages, a hexagonal pattern is seldom used. It cannot portray accurately the large number of horizontal and vertical features found in many images, and, in any case, sensors and display hardware generally do not support hexagonal sampling. The rectangular and hexagonal patterns described above are uniform, with the result that one part of an image is as important as any other part. This is useful in images intended for eventual human interpretation, for which prediction of where viewers will direct their attention is impossible. In other situations, where attention can be predicted or controlled, a non-uniform sampling scheme may be profitable. In particular, a log-polar sampling pattern has some interesting and useful properties. Fig. 2.3 shows an array of pixels that conforms to this pattern. The pixels of this array are sectors with a fixed angular


size and a radial size that increases logarithmically with increasing distance from the centre. This gives high resolution near the centre of the array and low resolution in the periphery. Such an arrangement satisfies the conflicting requirements of good resolution and wide field of view. However, a camera using a sensor with this sampling pattern must always point towards the most interesting or important part of the scene, to ensure that it lies in the centre of the array and is therefore imaged at the highest possible resolution. This is known as an attentive vision strategy. The human visual system supports attentive vision by means of eye, head and even body movements, thereby ensuring that the features of interest are always imaged using the fovea.

Fig 2.3 A log-polar array of pixels

2.5.4 Quantization
It is usual to digitize the values of the image function, f (x, y), in addition to its spatial coordinates. This process of quantization involves replacing a continuously varying f (x, y) with a discrete set of quantization levels. The accuracy with which variations in f (x, y) are represented is determined by the number of quantization levels used: the more levels, the better the approximation of the image information. Conventionally, a set of n quantization levels comprises the integers 0, 1, 2, …, n – 1, with 0 and n – 1 usually displayed or printed as black and white, respectively, and intermediate levels rendered in various shades of grey. Quantization levels are therefore commonly referred to as grey levels. The collective term for all the grey levels, ranging from black to white, is a grey scale. For convenient and efficient processing by a computer, the number of grey levels, n, is usually an integral power of two and may be expressed as
n = 2^b,

... (2.1)

where b is the number of bits used for quantization. A typical value of b = 8 gives images with 256 possible grey levels ranging from 0 (black) to 255 (white). Some ADCs are not capable of quantizing to 8 bits, producing 6-bit or 7-


bit images. However these may subsequently be represented in memory using 8 bits per pixel. The specialized equipment used in medicine and astronomy may produce images quantized using 10 or even 12 bits.
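A minimal sketch of Eq. (2.1) in Python (assuming NumPy; the variable names are arbitrary) requantizes an 8-bit grey-scale image to n = 2^b levels:

import numpy as np

def quantize(image, b):
    """Quantize an 8-bit grey-scale image to n = 2**b grey levels."""
    n = 2 ** b
    step = 256 / n                      # width of each quantization interval
    levels = np.floor(image / step)     # integer level index 0 .. n-1
    # map level indices back to displayable grey values spanning 0..255
    return (levels * (255.0 / (n - 1))).astype(np.uint8)

# Example: reduce a random 8-bit image to 16 grey levels (b = 4)
img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
img4 = quantize(img, 4)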

2.6 COLOUR FUNDAMENTALS
The process by which the human brain perceives and interprets colour is primarily a psycho-physiological phenomenon that is not yet fully understood. The physical nature of colour, however, can be expressed on a formal basis supported by experimental and theoretical results. Basically, the colours that humans and some other animals perceive in an object are determined by the nature of the light reflected from the object. Visible light is composed of a relatively narrow band of frequencies in the electromagnetic spectrum. A body that reflects light in a balanced way across all visible wavelengths appears white to the observer, while a body that reflects only a limited range of the visible spectrum exhibits some shade of colour. An object appears green because it reflects light in the 500 to 570 nm wavelength range while absorbing most of the energy at other wavelengths. Characterization of light is central to the science of colour. If the light is achromatic (void of colour), its only attribute is its intensity, or amount. Achromatic light is what viewers see on a black and white television set, and grey level refers to a scalar measure of intensity that ranges from black, through grey, to white. Chromatic light spans the electromagnetic spectrum from approximately 400 to 700 nm. Three basic quantities are used to describe the quality of a chromatic light source: radiance, luminance, and brightness. Radiance is the total amount of energy that flows from the light source, and it is usually measured in watts (W). Luminance, measured in lumens (lm), is a measure of the amount of energy that an observer perceives from a light source. For example, light emitted from a source operating in the far infrared region of the spectrum could have significant energy (radiance), but an observer would hardly perceive it, as its luminance would be almost zero. Finally, brightness is a subjective descriptor that is practically impossible to measure. It embodies the achromatic notion of intensity and is one of the key factors in describing colour sensation. The characteristics generally used to distinguish one colour from another are brightness, hue, and saturation. Brightness represents the chromatic notion of intensity. Hue is an attribute associated with the dominant wavelength in a mixture of light waves as perceived by an observer; so, when an object is referred to as red, orange, or yellow, the observer is specifying its hue. Saturation refers to the relative purity, or the amount of white light mixed with a hue. The pure spectrum colours are fully saturated. Colours such as pink (red and white) and lavender (violet and white) are less saturated, with the degree of saturation being inversely proportional to the amount of white light added. Hue and saturation taken together are called chromaticity, and, therefore, a colour may be characterized by its brightness and chromaticity. The amounts of red, green, and blue needed to form any particular colour are called the tristimulus values and are denoted by X, Y and Z, respectively. A colour can


then be specified by its trichromatic coefficients, defined as
x = X / (X + Y + Z)        … (2.2)
y = Y / (X + Y + Z)        ... (2.3)
and
z = Z / (X + Y + Z)        …(2.4)
It is noted from these equations that
x + y + z = 1.        … (2.5)
Another approach for specifying colours is to use the CIE chromaticity diagram (Fig. 2.4), which shows colour composition as a function of x (red) and y (green). For any value of x and y, the corresponding value of z (blue) is obtained from Eq. (2.5) by noting that z = 1 – (x + y). The positions of the various spectrum colours, from violet at 380 nm to red at 780 nm, are indicated around the boundary of the tongue-shaped chromaticity diagram. These are the pure colours shown in the spectrum of Fig. 2.4. Any point within the diagram represents some mixture of spectrum colours. The point of equal energy shown in Fig. 2.4 corresponds to equal fractions of the three primary colours; it represents the CIE standard for white light. Any point located on the boundary of the chromaticity chart is fully saturated. As a point moves away from the boundary and approaches the point of equal energy, more white light is added to the colour and it becomes less saturated; thus, at the point of equal energy, the saturation is zero. The chromaticity diagram is useful for colour mixing, since a straight line segment joining any two points in the diagram defines all the different colour variations that can be obtained by combining these two colours additively. If a straight line is drawn from the red to the green point shown in Fig. 2.4 and there is more red light than green light, the point representing the new colour will lie on the line segment, closer to the red point than to the green point. Similarly, a line drawn from the point of equal energy to any point on the boundary of the chart defines all the shades of that particular spectrum colour. Extension of this procedure to three colours is straightforward. To determine the range of colours that can be obtained from any three given colours in the chromaticity diagram, we simply draw connecting lines to each of the three colour points. The result is a triangle, and any colour inside the triangle can be produced by various combinations of the three initial colours. A triangle with vertices at any three fixed colours cannot enclose the entire colour region in Fig. 2.4. This observation supports graphically the remark made earlier that not all colours can be obtained with three single, fixed primaries.
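A very small Python sketch of Eqs. (2.2)-(2.5); the tristimulus values used here are arbitrary illustrative numbers:

def chromaticity(X, Y, Z):
    """Return the trichromatic coefficients (x, y, z) of Eqs. (2.2)-(2.4)."""
    total = X + Y + Z
    return X / total, Y / total, Z / total

x, y, z = chromaticity(20.0, 30.0, 50.0)
assert abs(x + y + z - 1.0) < 1e-12      # Eq. (2.5): x + y + z = 1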


Fig. 2.4 Chromaticity diagram. (Courtesy of the General Electric Co. Lamp Business Division)

The triangle in Fig. 2.5 shows a typical range of colours (called the colour gamut) produced by RGB monitors. The irregular region inside the triangle is representative of the colour gamut of today’s high-quality colour printing devices. The boundary of the colour printing gamut is irregular because colour printing is a combination of additive and subtractive colour mixing, a process that is much more difficult to control than that of displaying colours on a monitor, which is based on the addition of three highly controllable light primaries.

2.7 COLOUR MODELS
The purpose of a colour model (also called colour space or colour system) is to facilitate the specification of colours in some standard, generally accepted way. In essence, a colour model is a specification of a coordinate system and a subspace within that system where each colour is represented by a single point. Most colour models in use today are oriented either toward hardware (such as for colour monitors and printers) or toward applications where colour manipulation is a goal (such as in the creation of colour graphics for animation). In terms of digital image processing, the hardware-oriented models most commonly used in practice are the RGB (red, green, blue) model for colour monitors and a broad class of colour video cameras; the CMY (cyan, magenta, yellow) and CMYK (cyan, magenta, yellow, black) models for colour printing;


and the HSI (hue, saturation, intensity) model, which corresponds closely with the way humans describe and interpret colour. The HSI model also has the advantage that it decouples the colour and grey-scale information in an image, making it suitable for many of the techniques developed for grey-scale images. There are numerous colour models in use today because colour science is a broad field that encompasses many areas of application. It is tempting to dwell on some of these models here simply because they are interesting and informative. However, keeping to the task at hand, the models discussed in this chapter are the leading models for image processing. Having mastered the material in this chapter, the reader will have no difficulty in understanding additional colour models in use today.

Fig. 2.5 Typical colour gamut of colour monitors (triangle) and colour printing devices (irregular region)

2.7.1 The RGB Model
Technology for creating and displaying colour is based on the empirical observation that a wide variety of colours can be obtained by mixing red, green and blue light in different proportions. For this reason, red (R), green (G) and blue (B) are described as the primary colours of the additive colour system. However, not all colours can be obtained in this way. A colour image can be formed by making three measurements of scene brightness at each pixel using the red,


green and blue components of the detected light. This can be done either by using a colour camera, where the sensor is able to measure radiation at red, green and blue wavelengths for all points in the image, or by using a monochrome camera in conjunction with three special filters that each block all but a narrow band of wavelengths centred on red, green and blue, respectively. In a colour image conforming to the RGB model, the value of each f (x, y) is a vector of three components, corresponding to R, G and B. In a normalized model, these components can vary between 0.0 and 1.0. R, G and B can be regarded as orthogonal axes defining a three-dimensional colour space. Every possible value of f (x, y) is a point in this colour cube. The primary colours red, green and blue are at the corners (1, 0, 0), (0, 1, 0) and (0, 0, 1); the colours cyan, magenta and yellow are at the opposite corners, while black is at the origin and white is at the corner furthest from the origin (Fig. 2.6). Points on a straight line joining the origin to the most distant corner represent various shades of grey.

Fig. 2.6 The RGB colour model

Since each of the three components red, green and blue is normally quantized using 8 bits, an image made up of these components is commonly described as a 24-bit colour image. As each primary colour is represented to a precision of 1 part in 256, it is possible to specify any arbitrary colour to a precision of about 1 part in 16 million. Hence, in a 24 bit image 16.7 million colours are available. Despite its importance in image acquisition and display, the RGB model is of limited use when processing colour images, because it is not a perceptual model. In perceptual terms, colour and intensity are distinct from one another, but the R, G and B components each contain both colour and intensity information. Models which decouple these two different types of information tend to be more useful for image processing.


2.7.2 CMY Model
The CMY model has as its primaries Cyan (C), Magenta (M) and Yellow (Y). These are the primary colours of the subtractive system, which describes how colour is produced from pigments. A CMY colour is derived from an RGB colour as follows:

(C, M, Y) = (1, 1, 1) − (R, G, B)

… (2.6)

Theoretically, almost any colour can be produced on paper by mixing cyan, magenta and yellow pigments. However, this approach cannot produce a satisfactory black, so a fourth component, labelled K and representing black pigment, is added, resulting in the CMYK model. This is the model used when generating hardcopy versions of digital images on colour printers.
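A minimal Python sketch of Eq. (2.6), together with one common convention for extracting the black (K) component; the K-extraction rule is an assumption for illustration and is not taken from the text:

def rgb_to_cmy(r, g, b):
    """Eq. (2.6): CMY is the complement of RGB for normalized values in [0, 1]."""
    return 1.0 - r, 1.0 - g, 1.0 - b

def rgb_to_cmyk(r, g, b):
    """Convert RGB to CMYK by pulling the common black component out of CMY."""
    c, m, y = rgb_to_cmy(r, g, b)
    k = min(c, m, y)                     # black is the smallest CMY component
    if k == 1.0:                         # pure black: avoid division by zero
        return 0.0, 0.0, 0.0, 1.0
    scale = 1.0 - k
    return (c - k) / scale, (m - k) / scale, (y - k) / scale, k

print(rgb_to_cmyk(1.0, 0.0, 0.0))        # pure red -> (0, 1, 1, 0)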

2.7.3 HSI Model
The HSI model is more suitable than the RGB model for many image processing tasks. The three components are hue (H), saturation (S) and intensity (I). H and S specify colour: H specifies the dominant pure colour as perceived by an observer, while S measures the degree to which that pure colour has been diluted by white light. Since colour and intensity are kept independent, it is possible to manipulate one without affecting the other.


Fig. 2.7 HSI model


The HSI colour space is described by a cylindrical coordinate system and is commonly represented as a 'double cone' (Fig. 2.7). A colour is a single point inside or on the surface of the double cone, and the height of the point corresponds to intensity. If a point lies in a horizontal plane, a vector can be defined in this plane from the axis of the cones to that point; the saturation is the length of this vector and the hue is its orientation, expressed as an angle in degrees.

2.7.4 Conversion of Colour from RGB to HSI
Given an image in RGB colour format, the Hue (H) component of each RGB pixel is obtained by using the equation
H = θ if B ≤ G;  H = 360° − θ if B > G        … (2.7)
where
θ = cos⁻¹ { ½[(R − G) + (R − B)] / [(R − G)² + (R − B)(G − B)]^(1/2) }        … (2.8)

The saturation (S) component is given by
S = 1 − [3 / (R + G + B)] min(R, G, B)        … (2.9)

and the intensity (I) component is given by
I = (1/3)(R + G + B).        … (2.10)

It is assumed that the RGB values have been normalized to the range [0, 1] and that the angle θ is measured with respect to the red axis of the HSI space, as indicated in Fig. 2.7. Hue can be normalized to the range [0, 1] by dividing all values resulting from Eq. (2.7) by 360°. The other two HSI components are already in this range if the given RGB values are in the interval [0, 1].
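A Python sketch of Eqs. (2.7)-(2.10) for a single pixel, assuming normalized R, G and B in [0, 1]; the small eps guard against division by zero is an implementation assumption, not part of the equations:

import math

def rgb_to_hsi(r, g, b, eps=1e-12):
    """Convert one normalized RGB pixel to (H, S, I); H is returned in degrees."""
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    h = theta if b <= g else 360.0 - theta               # Eq. (2.7)
    s = 1.0 - (3.0 / (r + g + b + eps)) * min(r, g, b)   # Eq. (2.9)
    i = (r + g + b) / 3.0                                # Eq. (2.10)
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))    # pure red -> H = 0, S = 1, I = 1/3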

2.7.5 Converting Colours from HSI to RGB
Given values of HSI in the interval [0, 1], we want to find the corresponding RGB values in the same range. The transformation equations depend on the value of H. There are three sectors of interest, corresponding to the 120° intervals in the separation of primaries (Fig 2.8). First multiply H by 360°, so that the hue returns to its original range of [0°, 360°]. Depending on the value of H, three cases can be defined.


Fig. 2.8 Separation of primaries

Case I: If Hue (H) has a value between 0° and 120° (Fig 2.8(a)), i.e., H lies between the Red and Green primaries, the RGB components can be expressed by the following equations:
B = I(1 − S)        … (2.11)
R = I [1 + S cos H / cos(60° − H)]        … (2.12)
and
G = 3I − (R + B).        … (2.13)

Case II If the colour point is located such that Hue lies within the range of (120° < H 11% (Hall et al, 2002). However, if the MODIS band 4 reflectance is < 10%, then the pixel will not be mapped as snow even if the other criteria are met. This prevents pixels containing very dark targets such as black spruce forests, from being mapped as snow. This is required because very low reflectances cause the denominator in the NDSI to be quite small, and only small increases in the visible wavelengths are required to make the NDSI value high enough to classify a pixel, erroneously, as snow. Changes that occur in the spectra of a forest stand as it becomes snowcovered can be exploited to map snow cover in forest. The primary change in reflectance occurs in the visible wavelength as snow has a much higher visible reflectance than soil, leaves or bark. A fundamental change that snow cover causes in the spectral response of a forest, which can be used in a global algorithm, is that the reflectance in the visible will often increase with respect to the near-infrared reflectance. This behavior is captured in the normalized difference vegetation index (NDVI), as snow will tend to lower the NDVI. MODIS bands 1 (0.620-0.70 µm) are used to calculate the NDVI. The NDVI and NDSI are used together to improve snow mapping in dense forests. If the NDVI = –0.1, the pixel may be mapped as snow even if the NDSI is 2770 K, then the pixel will not be mapped as snow. Results in snow-covered areas have not changed much, but results in warm areas have improved dramatically. (Xiao et al, 2000) proposed a slightly modified version of NDSI known as the normalized difference snow/ice index (NDSII). This index has been defined for SPOT VEGETATION sensor as the snow and ice have very high reflectance


values in the visible spectral bands (blue, green and red), but very low reflectance in the middle infrared for VGT data. The NDSII is similar to the normalized difference snow index (NDSI) proposed by Hall et al. (1995), in which the spectral reflectance values of the green and mid-infrared bands are used. As VGT provides daily images of the globe at 1 km resolution, VGT data offer a new opportunity for mapping and monitoring snow and ice cover across large spatial scales.
NDSII = [(Band 2 – MIR)/(Band 2 + MIR)]        …(8.56)
A threshold approach is used to generate a thematic map of snow/ice cover using both the NDSII and the NIR reflectance value (band 3 in VEGETATION). The same threshold values as proposed by Hall et al. (1995, 1998) for the NDSI to delineate ice and snow cover, that is, NDSII values greater than 0.40 (Xiao et al., 2000; Hall et al., 1995, 1998), have been adopted. Snow/ice cover is assigned to those pixels where the NDSII ≥ 0.40 and the Band 3 (NIR) reflectance > 0.11. Sidjak and Wheate (1999) studied the use of Landsat TM data along with a high resolution digital elevation model (DEM) for mapping glacier extent for the Illecillewaet Icefield under the Canadian Glacier Inventory program. It was found through Principal Component Analysis (PCA) that PC2 is primarily loaded by the shortwave infrared bands 5 and 7 and cleanly isolates glacier from non-glacier surfaces as a result of the low reflection of snow and ice at longer wavelengths. It was also observed that an NDSI between TM 2 and TM 5 was effective in distinguishing snow from similarly bright soil, vegetation and rock, as well as from clouds in TM imagery. This is based on the difference between the strong reflection of visible radiation and the near total absorption of middle infrared wavelengths by snow. Further, it was found that by using a combination of the PC2 image, the TM4/TM5 ratio and the NDSI, virtually all glacier areas could be correctly identified, especially in shadow areas.
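A Python sketch of the NDSII thresholding described above, assuming NumPy arrays of reflectance for the VGT red (band 2), NIR (band 3) and MIR bands; the array names are placeholders, while the 0.40 and 0.11 thresholds are those quoted in the text:

import numpy as np

def map_snow_ice(red, nir, mir, ndsii_thresh=0.40, nir_thresh=0.11):
    """Return a boolean snow/ice mask from VGT-style reflectance arrays.

    NDSII = (band 2 - MIR) / (band 2 + MIR), Eq. (8.56); a pixel is labelled
    snow/ice when NDSII >= 0.40 and the NIR (band 3) reflectance exceeds 0.11.
    """
    ndsii = (red - mir) / (red + mir + 1e-12)   # small term avoids division by zero
    return (ndsii >= ndsii_thresh) & (nir > nir_thresh)

# Example with synthetic reflectances
red = np.array([[0.80, 0.10], [0.75, 0.05]])
nir = np.array([[0.70, 0.30], [0.65, 0.04]])
mir = np.array([[0.10, 0.08], [0.12, 0.04]])
print(map_snow_ice(red, nir, mir))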

8.6.3 Normalized Burn Ratio
On average, nearly 2.5 million hectares of forest land are destroyed by wildfire, causing disturbances which present severe challenges to the management of diverse and sustainable ecological systems. Similar destruction of forest land in Australia and India is reported every year. Wildfires are a natural part of ecosystems and play an important role in maintaining and regenerating fire-dependent ecosystems. Since 1960, the annual area burned by wildfires in the United States has seen a steady increase, and the increase has accelerated in recent years by more than 75% (Zhu and Eidenshink, 2007). A number of factors are thought to have led to the increase: fuel build-up from fire protection programs; aging forests; changes in fire policy related to prescribed burning; expanded public access to and use of the forests; and higher temperatures and lower rainfall associated with climate change. Therefore, not only does the increasing trend have a profound impact on the overall costs of managing fire-dependent ecosystems and the communities that live in wildland-urban interfaces, it also represents a scientific challenge for restoring impaired ecosystems and assessing feedback mechanisms in the face of climate change trends.


Within the field of remote sensing of wildfire, the detection of burned areas is relatively well established (Roy et al., 2005). A further refinement is to measure how a fire's effects vary within burned areas. Information on within-burn variability is useful to ecologists and resource managers who want to understand fire's effects on ecosystem processes, for example vegetation recovery and succession, and to plan post-fire rehabilitation and remediation. Such within-burn information may be estimated by visual examination of burned site conditions or by labor-intensive field measurements (Key & Benson, 2005; Ray and Landmann, 2006). One qualitative indicator used to assess fire effects within burned areas is termed fire severity. Definitions of fire severity vary but are used to describe how fire changes ecosystems differentially or results in different biological responses (Key and Benson, 2005; Tanaka et al., 1993; Jakubauskas et al., 1990; White et al., 1996; Michalek et al., 2000; van Wagtendonk et al., 2004; Epting et al., 2005). Definitions also vary in terms of the amount of time elapsed before fire severity is assessed; this time can vary from one day to several years post-fire (the longer time delays are necessary if the biological effects of fire are of interest). Parameters used to estimate severity in the field include the condition and colour of the soil, the amount of fuel consumed, resprouting from burned plants, blackening or scorching of trees, the depth of burn in the soil, and changes in fuel moisture. Although several of these parameters may not be amenable directly to optical wavelength remote sensing, or may not be related in a linear way to reflectance, field-based measures of fire severity have been used to parameterize and assess fire severity maps created using optical wavelength satellite data (Key and Benson, 2005; van Wagtendonk et al., 2004; Epting et al., 2005; Cocke et al., 2005). Fire severity has been estimated remotely using spectral indices, both from single-date and from multitemporal index data (Rogan and Yool, 2001). One index presented as a reliable means to map fire severity is the Normalized Burn Ratio (NBR), computed as the difference between near-infrared (NIR) and middle-infrared (MIR) reflectance divided by their sum (Key and Benson, 2005). The NBR is defined from NIR and MIR reflectance as

NBR = (ρNIR − ρMIR) / (ρNIR + ρMIR),   −1 ≤ NBR ≤ 1        …(8.57)

The advantage of this ratio is that many parameters related to forest and forest fire can be extracted using multispectral remote sensing data. Fire causes substantial spectral changes by consuming vegetation, destroying leaf chlorophyll, exposing soil, charring stems, and altering both aboveground and belowground moisture. Reduction of chlorophyll absorption leads to increased reflectance in the visible region, while tissue damage leads to a decreased reflectance in the near-infrared (NIR) region (Jensen, 2000). In contrast, with a decrease in crown shadow and a decrease in canopy moisture, mid-infrared (MIR) reflectance typically increases following a fire (van Wagtendonk et al., 2004; White et al., 1996), and the change in the NIR and


MIR regions has been effectively exploited for mapping burned regions (Jakubauskas et al., 1990). Indices applied to burn mapping have included single-date post-burn and bi-temporal post-burn approaches. Single-date models have several advantages over bi-temporal models. They are less expensive, less time-consuming, and they also reduce the inherent error found in bi-temporal models (Koutsias et al., 1999). Between-date error can be caused by differences in phenology (van Wagtendonk et al., 2004), misregistration of image pixels (Verbyla & Boles, 2000), and differences in sensor calibration, sun-sensor geometry, and atmospheric effects. Sensor differences and some atmospheric effects may be corrected through radiometric normalization techniques, but differences in vegetation phenology may be unavoidable, especially in cloudy regions, such as interior Alaska, where there may be few cloud-free optical images available. Use of a single post-burn image avoids these problems, but the lack of a pre-burn reference image leads to difficulties in mapping spectrally similar areas such as water and recent burns, or senescent vegetation and older burns (Pereira, 1999; Pereira & Setzer, 1993). Key and Benson (2003) proposed a slightly different approach for determining the Normalized Burn Ratio. They suggested that instead of using the TM 4 and TM 7 digital values, their transformed reflectance (R) values be used, and proposed that NBR be expressed as

NBR = (R4 − R7) / (R4 + R7)

…(8.58)

They observed that the value of R7 increases with fire, while the value of R4 decreases, and that these trends are accentuated in the NBR.
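A minimal Python sketch of Eq. (8.57), assuming NumPy reflectance arrays; the bi-temporal difference of pre-fire and post-fire NBR shown at the end is one common way such maps are compared, and the numbers are synthetic:

import numpy as np

def nbr(nir, mir):
    """Normalized Burn Ratio, Eq. (8.57): (NIR - MIR) / (NIR + MIR)."""
    return (nir - mir) / (nir + mir + 1e-12)

# Bi-temporal use: the difference of pre- and post-fire NBR highlights burned pixels.
nir_pre,  mir_pre  = np.array([0.45, 0.40]), np.array([0.15, 0.14])
nir_post, mir_post = np.array([0.20, 0.39]), np.array([0.30, 0.15])
dnbr = nbr(nir_pre, mir_pre) - nbr(nir_post, mir_post)
print(dnbr)   # large positive values indicate strongly changed (burned) pixels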

8.7 THE ORTHOGONAL TRANSFORMATIONS
The derivation of vegetation indices has also been approached through orthogonal transformation techniques such as PCA and the Tasseled Cap Transformation. The link between these techniques is that they all express green vegetation through the development of their second component.

8.7.1 Principal Component Analysis
In data mining you often encounter situations where there are a large number of variables in the database. In such situations it is very likely that subsets of variables are highly correlated with each other. The accuracy and reliability of a classification or prediction model will suffer if you include highly correlated variables or variables that are unrelated to the outcome of interest. Superfluous variables can increase the data-collection and data-processing costs of deploying a model on a large database. The dimensionality of a model is the number of independent or input variables used by the model. One of the key steps in data mining is finding ways to reduce dimensionality without sacrificing accuracy. Principal component analysis (PCA) is a mathematical procedure that transforms a number of (possibly) correlated variables into a (smaller) number of


uncorrelated variables called principal components. The objective of principal component analysis is to reduce the dimensionality (number of variables) of the dataset but retain most of the original variability in the data. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible. A principal component analysis is concerned with explaining the variance-covariance structure of a high dimensional random vector through a few linear combinations of the original component variables. Consider a p-dimensional random vector X = (X1, X2, ..., Xp). The k principal components of X are k (univariate) random variables Y1, Y2, ..., Yk which are defined by the following formulae:

Y1 = l1′X, Y2 = l2′X, …, Yk = lk′X        …(8.59)
where the coefficient vectors l1, l2, … are chosen such that they satisfy the following conditions:
i) First Principal Component = the linear combination l1′X that maximizes Var(l1′X), with || l1 || = 1
ii) Second Principal Component = the linear combination l2′X that maximizes Var(l2′X), with || l2 || = 1 and Cov(l1′X, l2′X) = 0
iii) jth Principal Component = the linear combination lj′X that maximizes Var(lj′X), with || lj || = 1 and Cov(lk′X, lj′X) = 0 for all k < j

This says that the principal components are those linear combinations of the original variables which maximize the variance of the linear combination and which have zero covariance (and hence zero correlation) with the previous principal components. Multispectral image bands are often highly correlated, i.e. they are visually and numerically similar. The correlation between spectral bands arises from a combination of factors: i) Material spectral correlation. This correlated component is caused by, for example, the relatively low reflectance of vegetation across the visible spectrum, yielding a similar signature in all visible bands. The wavelength range of correlation is determined by the material spectral reflectance. ii) Topography. For all practical purposes, topographic shading is the same in all solar reflectance bands and can even be the dominant image component in mountainous areas and at low sun angles. It therefore


leads to a band-to-band correlation in the solar reflective region, which is independent of surface material type. iii) Sensor band overlap. Ideally, this factor is minimized in the sensor design stage, but can seldom be avoided completely. The amount of overlap is typically small, but is nevertheless important for precise calibrations. Principal component analysis (PCA) is a classical statistical method. This linear transform has been widely used in data analysis and compression. Principal component analysis is based on the statistical representation of a random variable. Suppose we have a random vector population x, where x = (x1, …….., xn)T …(8.60) and the mean of that population is denoted by µx = E{x}

…(8.61)

and that the covariance matrix of the same data set is defined by Cx = E{(x – µx) (x – µx)T}

…(8.62)

The components of Cx, denoted by cij, represent the covariances between the random variable components xi and xj, and the component cii is the variance of the component xi. The variance of a component indicates the spread of the component values around its mean value. If two components xi and xj of the data are uncorrelated, their covariance is zero (cij = cji = 0). The covariance matrix is, by definition, always symmetric. From a sample of vectors x1, ……, xm, we calculate the sample mean and the sample covariance matrix as the estimates of the mean and the covariance matrix. From a symmetric matrix such as the covariance matrix, we can calculate an orthogonal basis by finding its eigen values and eigen vectors. The eigenvectors ei and the corresponding eigen values λi are the solutions of the equation Cx ei = λi ei, i = 1, ….., n …(8.63) For simplicity we assume that the λi are distinct. These values can be found, for example, by finding the solutions of the characteristic equation | Cx – λI | = 0, where I is the identity matrix having the same order as Cx and | . | denotes the determinant of the matrix. If the data vector has n components, the characteristic equation becomes of order n. This is easy to solve only if n is small. Solving for eigen values and the corresponding eigen vectors is a non-trivial task, and many methods exist. One way to solve the eigen value problem is to use a neural solution to the problem. The data is fed as the input, and the network converges to the required solution. By ordering the eigen vectors in the order of descending eigen values (largest first), one can create an ordered orthogonal basis with the first eigenvector having the direction of largest variance of the data. In this way, we


can find directions in which the data set has the most significant amounts of energy. Suppose one has a data set of which the sample mean and the covariance matrix have been calculated. Let A be a matrix consisting of eigen vectors of the covariance matrix as the row vectors. By transforming a data vector x, we get y = A(x - µx)

…(8.64)

which is a point in the orthogonal coordinate system defined by the eigenvectors. Components of y can be seen as the coordinates in the orthogonal base. Now reconstruct the original data vector X from y by x = ATy + µx

…(8.65)

Here we use the property of an orthogonal matrix that A⁻¹ = Aᵀ, where Aᵀ is the transpose of A. The original vector x is projected onto the coordinate axes defined by the orthogonal basis. The original vector may then be reconstructed by a linear combination of the orthogonal basis vectors. Instead of using all the eigenvectors of the covariance matrix, we may represent the data in terms of only a few basis vectors of the orthogonal basis. If we denote the matrix having the first K eigenvectors as its rows by AK, a similar transformation can be created as seen above y = AK(x – µx) …(8.66)

x = (AK)T y + µx

…(8.67)

This means that the original data vector can be projected onto a coordinate system of dimension K and transformed back by a linear combination of the basis vectors. This minimizes the mean-square error between the data and its representation for a given number of eigen vectors. If the data is concentrated in a linear subspace, this provides a way to compress the data without losing much information, while simplifying the representation. By picking the eigenvectors having the largest eigen values, as little information as possible is lost in the mean-square sense. It is thus possible to choose a fixed number of eigenvectors and their respective eigen values and obtain a consistent representation, or abstraction, of the data. This preserves a varying amount of the energy of the original data. Alternatively, it is possible to choose approximately the same amount of energy and a varying number of eigen vectors and their respective eigen values. This would in turn give an approximately consistent amount of information at the expense of representations that vary in the dimension of the subspace. However, there is a conflict of goals: on one hand, there is a need to simplify the problem by reducing the dimension of the representation, while on the other hand it is important to preserve as much as possible of the original information content. PCA offers a convenient way to control the trade-off between losing information and simplifying the problem at hand. It may also be


possible to create piecewise linear models by dividing the input data into smaller regions and fitting linear models locally to the data. The PCA has several interesting properties: i) It is a rigid rotation in K-D of the original coordinate axes to coincide with the major axes of the data. The data in the PC images result from projecting the original data onto the new axes. Some PC components may have negative digital values; that is not a problem, because the origin of the PC space can be arbitrarily shifted to make all PC values positive (routinely done for display), with no other change in the PCA properties (Richards, 1993). ii) Although the PC axes are orthogonal to each other in K-D, they are generally not orthogonal when projected to the original multispectral space. iii) It optimally redistributes the total image variance in the transformed data. The first PC image contains the maximum possible variance for any linear combination of the original bands, the second PC image contains the maximum possible variance for any axis orthogonal to the first PC, and so forth. The total image variance is preserved by the PCA. If the low variance (low contrast) information in the higher order components can be ignored, significant savings in data storage, transmission and processing time can result. Also, any uncorrelated noise in the original image will usually appear only in the higher order components, and can therefore be removed by setting those PC images to a constant value. Caution is advised, however, because small but significant band differences may also appear only in the higher-order components.
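A compact Python sketch of the procedure described above for a multi-band image, assuming NumPy; the random array merely stands in for a stack of spectral bands:

import numpy as np

def principal_components(bands, k):
    """Project an (n_bands, rows, cols) image onto its first k principal components."""
    n, rows, cols = bands.shape
    x = bands.reshape(n, -1).astype(float)           # one column per pixel
    mean = x.mean(axis=1, keepdims=True)
    cov = np.cov(x)                                  # n x n covariance matrix, Eq. (8.62)
    eigval, eigvec = np.linalg.eigh(cov)             # eigen decomposition, Eq. (8.63)
    order = np.argsort(eigval)[::-1]                 # sort eigen values descending
    a_k = eigvec[:, order[:k]].T                     # first k eigenvectors as rows
    y = a_k @ (x - mean)                             # Eq. (8.66): PC images
    return y.reshape(k, rows, cols), eigval[order]

bands = np.random.rand(6, 100, 100)                  # six synthetic "TM" bands
pcs, eigvals = principal_components(bands, 3)
print(eigvals / eigvals.sum())                       # fraction of variance in each PC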

8.7.2 Tasseled-Cap Components
The PVI considers spectral variations in two of the four Landsat MSS bands, and uses the distance from a soil line in the 2D space as a measure of biomass or green leaf area index. Kauth and Thomas (1976) used a similar concept using all four bands of Landsat MSS. Here all the pixels representing soil fall along an axis that is oblique with respect to each pair of the four MSS axes. A triangular region in the 4D MSS space is occupied by pixels representing vegetation in various stages of growth. The shape of this triangular region resembles a tasselled cap, and hence this transformation is known as the Tasseled Cap Transformation. The transformation is based on the Gram-Schmidt sequential orthogonalization technique, which produces an orthogonal transformation of the original 4-band MSS data space into a new 4D space. The axes of this new coordinate system are termed brightness, greenness, yellowness and non-such. The brightness axis is associated with variation in the soil background reflectance. The greenness axis is correlated with variations in the vigour of green vegetation, while the yellowness axis is related to variations in the yellowness of senescent vegetation. The non-such axis is correlated with atmospheric conditions.



The Tasseled Cap Transformation has been used primarily to monitor the growth of agricultural crops at different stages by using the brightness and greenness information. The advantage of using this transformation is that the axes provide a consistent, physically-based coordinate system, as they are defined a priori; thus variations in crop cover and stage of growth from image to image do not affect the interpretation. Crist and Cicone (1984) extended this concept to the six reflective bands of TM data (excluding the thermal band). It was observed that the TM bands contained significant information related to wetness in the third dimension. Fig. 8.14 shows the TM Tasseled Cap relationship between brightness, greenness and wetness in two planes.


Fig. 8.14 TM Tasseled Cap transformation axes system (Source: Crist and Cicone, 1986)

The brightness information is a weighted sum of all six TM bands and is a measure of overall reflectance. It is helpful in differentiating light soils from dark soils. Greenness is the contrast between the near-infrared and visible reflectance, and is a measure of the presence and density of green vegetation, while wetness is the contrast between the shortwave infrared and the visible/near-infrared reflectance. It is a measure of soil moisture content, vegetation density and other class characteristics. Fig. 8.15 shows the general locations of some of the important features in the TM Tasseled Cap feature space.

Fig. 8.15 Approximate locations of some classes in TM Tasseled Cap feature space (Source: Crist and Cicone, 1986)


The plane formed between Brightness and Greenness is known as the Plane of Vegetation, while the plane formed between Brightness and Wetness is known as the Plane of Soils. By plotting Brightness, Greenness and Wetness, certain specific and interesting information is revealed. When Greenness and Wetness are plotted, the result shows a strong correlation with the percentage of vegetation cover. The distinction between forest/natural vegetation and cultivated vegetation is enhanced in the Wetness dimension. In the Greenness/Brightness projection it is found that there is a distinct separation between cultivated vegetation and forest. The location of forest is of interest as it forms a 'badge of trees' in front of the cap [Fig. 8.15(a)]. Table 8.1 lists the Tasseled Cap coefficients for the Landsat MSS and TM sensors. The next section gives a generalized approach for an n-space data set.

Table 8.1 Tasseled-cap coefficients for Landsat-1 MSS (Kauth and Thomas, 1976), Landsat-2 MSS (Thompson and Whemanen, 1980), Landsat-4 TM (Crist and Cicone, 1984) and Landsat-5 TM (Crist et al., 1986)

L-1 MSS (bands MSS1, MSS2, MSS3, MSS4)
  Brightness:    +0.433  +0.632  +0.586  +0.234
  Greenness:     -0.290  -0.562  +0.600  +0.491
  Yellow stuff:  -0.829  +0.522  -0.039  +0.194
  Non-such:      +0.223  +0.120  -0.543  +0.810

L-2 MSS (bands MSS1, MSS2, MSS3, MSS4)
  Brightness:    +0.332  +0.603  +0.676  +0.263
  Greenness:     +0.283  -0.660  +0.577  +0.388
  Yellow stuff:  +0.900  +0.428  +0.0759 -0.041
  Non-such:      +0.016  +0.428  -0.452  +0.882

L-4 TM (bands TM1, TM2, TM3, TM4, TM5, TM7; last column is the additive term)
  Brightness:  +0.3037 +0.2793 +0.4743 +0.5585 +0.5082 +0.1863  10.3695
  Greenness:   -0.2848 -0.2435 -0.5436 +0.7243 +0.0840 -0.1800  -0.7310
  Wetness:     +0.1509 +0.1973 +0.3279 +0.3406 -0.7112 -0.4572  -0.3828
  Haze:        -0.8242 +0.0849 +0.4392 -0.0580 +0.2012 -0.2768   0.7879
  TC5:         -0.3280 +0.0549 +0.1075 +0.1855 -0.4357 +0.8058  -2.4750
  TC6:         +0.1084 -0.9022 +0.4120 +0.0573 -0.0251 +0.0238  -0.0336

L-5 TM (bands TM1, TM2, TM3, TM4, TM5, TM7; last column is the additive term)
  Brightness:  +0.2909 +0.2493 +0.4806 +0.5568 +0.4438 +0.1706  10.3695
  Greenness:   -0.2728 -0.2174 -0.5508 +0.7221 +0.0733 -0.1648  -0.7310
  Wetness:     +0.1446 +0.1761 +0.3322 +0.3396 -0.6210 -0.4186  -0.3828
  Haze:        +0.8461 +0.0731 +0.4640 -0.0032 -0.0492 +0.0119   0.7879
  TC5:         +0.0549 -0.0232 +0.0339 -0.1937 +0.4162 -0.7823  -2.4750
  TC6:         +0.1186 -0.8069 +0.4094 +0.0571 -0.0228 +0.0220  -0.0336
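As a rough sketch of how the transformation is applied, the following Python fragment (assuming NumPy) forms the brightness, greenness and wetness images as weighted sums of six Landsat-5 TM bands using the coefficients of Table 8.1; the additive terms are omitted and the coefficient values are used as transcribed:

import numpy as np

# Landsat-5 TM coefficients (bands 1, 2, 3, 4, 5, 7) from Table 8.1
TC_COEFFS = np.array([
    [0.2909, 0.2493, 0.4806, 0.5568, 0.4438, 0.1706],     # brightness
    [-0.2728, -0.2174, -0.5508, 0.7221, 0.0733, -0.1648], # greenness
    [0.1446, 0.1761, 0.3322, 0.3396, -0.6210, -0.4186],   # wetness
])

def tasseled_cap(bands):
    """Apply the Tasseled Cap transformation to a (6, rows, cols) TM image.

    Returns a (3, rows, cols) array holding brightness, greenness and wetness,
    each computed as a fixed weighted sum of the six input bands.
    """
    n, rows, cols = bands.shape
    flat = bands.reshape(n, -1).astype(float)
    return (TC_COEFFS @ flat).reshape(3, rows, cols)

tm = np.random.rand(6, 50, 50)          # synthetic stand-in for TM bands 1-5 and 7
brightness, greenness, wetness = tasseled_cap(tm)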

8.7.5 The Concept of n-Space Indices
The concept of n-space is difficult to visualize if n is greater than three. However, a two-dimensional index such as the PVI is relatively easy to visualize and can be used to demonstrate the physical basis for n-space indices. Richardson and Wiegand (1977) showed that a plot of MSS7 against MSS5 (i.e., a near-IR and a red band of the Landsat multispectral scanner) for soil would fall on a straight line (Fig 8.3). As vegetation grows on the soil, the red radiance decreases and the


near-IR radiance increases. A vegetation point would lie away from the soil line with the perpendicular distance from the point to the soil line being a measure of the amount of vegetation present.

8.7.5.1 Calculation of n-Space Coefficients
The number of dimensions (n) available in spectral space is the number of wavelength intervals (or bands) for which data are available. The number of spectral indices (m) that may be calculated is also equal to the number of bands (n). However, there is no requirement that n indices be calculated, only that the number m cannot exceed n if all indices are to be orthogonal. Often just the first two indices are of interest. In this development, m + 1 data points are required to specify m indices; however, if m = n, then the (m + 1)th point can be arbitrary. The n-space coefficients are unit vectors that give direction, and thus vector notation is appropriate. However, the dot product is the only vector manipulation necessary, and, since forming a dot product results in a scalar, the development can be written largely in algebraic terms. Thus, vectors will be discussed where necessary, but the algebraic forms of the equations will be stressed to facilitate both comprehension and computation. The terms brightness, greenness, and yellowness, as used by Kauth and Thomas (1976), will be used here. To obtain the first index, an equation for a line through the soil data points must be derived. A minimum of two soil points are required, with points differing considerably in reflectance preferred (e.g., wet and dry surfaces). The differences between the dry (Xsd) and the wet (Xsw) soil points are
bi = (Xsd − Xsw)i

…(8.69)

for each of the n bands. The vector (b1, b2, …, bn) is normalized to form a unit vector by dividing each of its components by the normalization factor B = ( ∑i=1..n bi² )^(1/2); then
A1,i = bi / B

…(8.70)

are the coefficients of brightness. Now, Brightness can be expressed as BR = A1, 1 X 1 + A1, 2 X 2 + ... + A1, n X n

…(8.71)

where Xi represents the value for a data point in the ith band. Calculation of the second index (greenness) begins by choosing a data point that represents green vegetation and forming the differences between that point and any point on the soil line, (Xg − Xs)i; then
gi = (Xg − Xs)i − D2,1 A1,i        …(8.72)
where
D2,1 = ∑i=1..n (Xg − Xs)i A1,i        …(8.73)


This procedure, called the Gram-Schmidt process (Freiberger, 1960), ensures that the vector (g1, g2, …, gn) is orthogonal to the soil line vector (b1, b2, …, bn). The subscripts of D indicate that it is associated with the second index (greenness) and also the first (brightness). The normalization factor is
G = ( ∑i=1..n gi² )^(1/2)

…(8.74)

The coefficients for the second index (greenness) are A2,i = gi / G,

…(8.75)

and greenness can be calculated from GN = A 2,1 X 1 + A 2, 2 X 2 + ... + A 2, n X n

…(8.76)

The third index (yellowness) must be orthogonal to both brightness and greenness. Choose an appropriate data point and form the differences (Xy − Xs)i; then
yi = (Xy − Xs)i − (D3,1 A1,i + D3,2 A2,i)        …(8.77)
The two D terms must be evaluated before calculating yi. The first is denoted D3,1 because it refers to the third (yellowness) and the first (brightness) indices, and the second D3,2 because it refers to the third and the second (greenness) indices. These terms are evaluated from the equations

D3,1 = ∑i=1..n (Xy − Xs)i A1,i        …(8.78)
and
D3,2 = ∑i=1..n (Xy − Xs)i A2,i        …(8.79)

The normalization factor is
Y = ( ∑i=1..n yi² )^(1/2)

…(8.80)

The coefficients are A3,i = yi / Y

…(8.81)

and the equation for yellowness is YE = A 3,1 X 1 + A 3, 2 X 2 + ... + A 3,n X n

…(8.82)

This procedure can be generalized to calculate m indices using n bands (m ≤ n). The dot products can be written
Dk,j = ∑i=1..n (Xk − Xs)i Aj,i

…(8.83)

for k = 1 to m and j = 1 to k – 1. If k = 1 (brightness), j = 0, and Dk,j = 0, thus Eq. (8.73) has only the differences of the soil point on the right-hand side. It follows that the mth index (called t for convenience, with no physical significance attached to the symbol) is


ti = (Xm − Xs)i − (Dm,1 A1,i + Dm,2 A2,i + ... + Dm,j Aj,i)        ...(8.84)

Table 8.2 Formulae of different indices (VI, equation and reference)

Simple Ratio (SR): SR = NIR / Red (Birth and McVey, 1968)
Normalized Difference Vegetation Index (NDVI): NDVI = (NIR − red) / (NIR + red) (Rouse et al., 1974; Deering et al., 1975)
Tassel Cap Transformation, Landsat Multispectral Scanner (MSS) (Kauth and Thomas, 1976; Kauth et al., 1979):
  Brightness: B = 0.332MSS1 + 0.603MSS2 + 0.675MSS3 + 0.262MSS4
  Greenness: G = −0.283MSS1 − 0.66MSS2 + 0.577MSS3 + 0.388MSS4
  Yellow stuff: Y = −0.899MSS1 + 0.428MSS2 + 0.076MSS3 − 0.041MSS4
  Non-such: N = −0.016MSS1 + 0.131MSS2 − 0.452MSS3 + 0.882MSS4
Tassel Cap Transformation, Landsat Thematic Mapper (TM) (Crist, 1985):
  Brightness: B = 0.0243TM1 + 0.4158TM2 + 0.5524TM3 + 0.5741TM4 + 0.3124TM5 + 0.2303TM7
  Greenness: G = −0.1603TM1 − 0.2819TM2 + 0.4939TM3 + 0.794TM4 + 0.0002TM5 + 0.1446TM7
  Wetness: W = 0.0315TM1 + 0.2021TM2 + 0.3102TM3 + 0.1594TM4 + 0.6806TM5 + 0.6109TM7
Infrared Index (II): II = (NIRTM4 − MidIRTM5) / (NIRTM4 + MidIRTM5) (Hardisky et al., 1983)
Perpendicular Vegetation Index (PVI): PVI = [(0.355MSS4 − 0.149MSS2)² + (0.355MSS2 − 0.852MSS4)²]^(1/2) (Richardson and Wiegand, 1977)
Greenness Above Bare Soil (GRABS): GRABS = G − 0.09178B + 5.589959 (Hay et al., 1986)
Moisture Stress Index (MSI): MSI = MidIRTM5 / NIRTM4 (Rock et al., 1986)
Leaf Relative Water Content Index (LWCI): LWCI = −log[1 − (NIRTM4 − MidIRTM5)] / −log[1 − (NIRTM4 − MidIRTM5)ft], where ft denotes values at full turgor (Hunt et al., 1987)
MidIR Index (Musick and Pelletier, 1988)
Soil Adjusted Vegetation Index (SAVI): SAVI = (1 + L)(NIR − red) / (NIR + red + L) (Huete, 1988; Huete and Liu, 1994; Running et al., 1994; Qi et al., 1995)
Atmospherically Resistant Vegetation Index (ARVI): ARVI = (ρ*nir − ρ*rb) / (ρ*nir + ρ*rb) (Kaufman and Tanre, 1992; Huete and Liu, 1994)
Soil and Atmospherically Resistant Vegetation Index (SARVI): SARVI = (ρ*nir − ρ*rb) / (ρ*nir + ρ*rb + L) (Huete and Liu, 1994; Running et al., 1994)
Enhanced Vegetation Index (EVI): EVI = (1 + L)(ρ*nir − ρ*red) / (ρ*nir + C1 × ρ*red − C2 × ρ*blue + L) (Huete and Justice, 1999)

The normalization is as before,
T = ( ∑i=1..n ti² )^(1/2)        …(8.85)
The coefficients of the mth index are
Am,i = ti / T        ...(8.86)
and the mth n-space index (Im) is
Im = Am,1 X1 + Am,2 X2 + ... + Am,n Xn.        ...(8.87)

All indices should be orthogonal. This can be checked by calculating dot products of the various coefficients, i.e.,
∑i=1..n Ak,i Aj,i = 0 for k ≠ j        ...(8.88)
                 = 1 for k = j.
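A Python sketch of the coefficient calculation of Eqs. (8.69)-(8.88), assuming NumPy; the soil and vegetation points below are made-up illustrative spectra:

import numpy as np

def n_space_coefficients(points):
    """Gram-Schmidt orthogonalization of index directions, Eqs. (8.69)-(8.86).

    points[0] and points[1] are the dry and wet soil points; each later point
    defines one further index (greenness, yellowness, ...). Returns the unit
    coefficient vectors A as rows.
    """
    coeffs = []
    b = points[0] - points[1]                     # Eq. (8.69): soil-line direction
    coeffs.append(b / np.linalg.norm(b))          # Eq. (8.70): brightness coefficients
    for target in points[2:]:
        diff = target - points[1]                 # differences from a soil point
        d = diff.copy()
        for a in coeffs:                          # Eq. (8.83): dot products D(k, j)
            d = d - np.dot(diff, a) * a           # Eq. (8.84): remove earlier components
        coeffs.append(d / np.linalg.norm(d))      # Eqs. (8.85)-(8.86): normalize
    return np.vstack(coeffs)

# dry soil, wet soil and a green-vegetation point in four MSS bands (illustrative)
pts = np.array([[60., 80., 90., 95.], [30., 40., 45., 50.], [25., 20., 70., 85.]])
A = n_space_coefficients(pts)
print(np.round(A @ A.T, 6))   # identity matrix confirms orthogonality, Eq. (8.88)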

8.8 ILLUSTRATIVE EXAMPLE
The data set used in Chapter 5 has been subjected to some of the standard image transformation techniques, namely i) Ratio, ii) NDVI, iii) Principal Component Analysis and iv) Tassel Cap Transformation.

8.8.1 Ratio and NDVI Images
Figures 8.16(a) and 8.17(a) show the Ratio and NDVI images using TM Bands 4 and 3. The difference between the two images is rather distinct, especially in the vegetative areas of the image. In the lower right portion of the image, the variation in forest density is more distinct in the NDVI image in comparison to the Ratio image. The water channel of the River Ganges and other tributaries can be seen clearly in both the Ratio and NDVI images. It may also be noted that since


these operators normalize for topographic effects, all the shadows in the mountainous areas are completely removed. However, the shapes of the histograms of the two operators are completely different (Figs 8.16b and 8.17b). The minimum and maximum values of the ratio image are 0.419 and 3.754 with a mean value of 1.558. Similarly, the minimum and maximum values of the NDVI image are -0.641 and 0.657 with a mean value of 0.188. From the histogram of the ratio image, it is observed that the information can be split into two distinct groups at a ratio value of 2.00. In the NDVI image, however, there are three distinct groups separated at NDVI values of 0.0 and 0.33.
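A small Python sketch of the two operators compared above, assuming NumPy arrays holding TM band 4 (NIR) and band 3 (red) values; the eps term avoiding division by zero is an implementation assumption:

import numpy as np

def ratio_and_ndvi(nir, red, eps=1e-12):
    """Return the band ratio NIR/red and the NDVI (NIR - red)/(NIR + red)."""
    ratio = nir / (red + eps)
    ndvi = (nir - red) / (nir + red + eps)
    return ratio, ndvi

nir = np.array([[0.45, 0.20], [0.50, 0.10]])   # synthetic TM band 4
red = np.array([[0.12, 0.18], [0.10, 0.09]])   # synthetic TM band 3
ratio, ndvi = ratio_and_ndvi(nir, red)
print(ratio.min(), ratio.max(), ndvi.min(), ndvi.max())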

(a) Image

(b) Histogram Fig. 8.16 Ratio image

(a) Image

(b) Histogram Fig. 8.17 NDVI image

8.8.2 Vegetation Indices
8.8.2.1 Transformed Vegetation Index
Here it is found that all forest areas appear in a dark tone of grey, whereas all sandy areas, barren lands and water bodies appear as black. However, the

8.42

Digital Image Processing

variation of forest density is lost as seen in NDVI image. In fact it is an inverse image of NDVI (Fig. 8.18a). Even though the frequency of the minimum value of 0 dominates the histogram, the shape of the histogram has a conical shape with a long tail towards the higher side (Fig. 8.18b).

Fig. 8.18 Transformed Vegetation Index: (a) Image, (b) Histogram

8.8.2.2 Corrected Transformed Vegetation Index
This image is similar to the NDVI image in that the variation between forest density types is again lost. However, the forest areas are seen in a much brighter tone in comparison to the NDVI image (Fig. 8.19a). Here the minimum value is 0.707 and the maximum value is 15.984.

Fig. 8.19 Corrected Transformed Vegetation Index: (a) Image, (b) Histogram

8.8.2.3 Thiam's Transformed Vegetation Index
This image is similar in appearance to the CTVI and NDVI images (Fig. 8.20a).

Fig. 8.20 Thiam's Transformed Vegetation Index: (a) Image, (b) Histogram

8.8.2.4 Ratio Vegetation Index (RVI)
This index is effectively the inverse of the Ratio image: all vegetation-related information appears in dark tones, while soil and water appear in brighter tones (Fig. 8.21a).

Fig. 8.21 Ratio Vegetation Index: (a) Image, (b) Histogram

8.8.2.5 Normalized Ratio Vegetation Index
It is the inverse of the NDVI image and is similar in appearance to the RVI image (Fig. 8.22a).


Fig. 8.22 Normalized Ratio Vegetation Index: (a) Image, (b) Histogram

8.8.2.6 Infrared Index (II)
This index provides information regarding changes in plant biomass and water stress. Thus, areas affected by water stress can be identified in the image. The river channel and the canal are also seen in brighter tones (Fig. 8.23a).

Fig. 8.23 Infrared Index: (a) Image, (b) Histogram

8.8.2.7 Moisture Stress Index
In this image, all areas having little or no moisture appear in brighter tones; thus all barren areas, sandy areas and afforested areas appear in a brighter tone, while dense forest, the river channel and the canal have little or no moisture stress and thus appear in darker tones (Fig. 8.24a).

Fig. 8.24 Moisture Stress Index: (a) Image, (b) Histogram

8.8.2.8 Perpendicular Vegetation Index 1
This index is a soil-line based transformation. PVI 1 is computed between TM Bands 4 and 3. Here, all vegetative areas are shown in varying tones of grey, with dense forest areas in dark shades of grey and less dense forest in brighter shades of grey. Water is shown in black, while shrubs and bushes within the river channels are shown in dark tones (Fig. 8.25a).

Fig. 8.25 Perpendicular Vegetation Index 1: (a) Image, (b) Histogram

8.8.3 Principal Component Analysis Images
Now let us examine the results of Principal Component Analysis. Using the Variance-Covariance matrix derived for this dataset (Table 5.3), the eigen values (Table 8.3) and eigen vectors (Table 8.4) have been computed. Table 8.3 shows that nearly 85% of the information is contained in the First Component and that 98.86% of the information is contained in the first three components. Thus, by using Principal Component Analysis, the redundancy in the data has been eliminated and the information contained in the six bands of data has been accommodated in the first three components, thereby reducing the number of data bands. If this data is used further in the image analysis, it will certainly reduce the burden of the number of bands of data, as nearly 99% of the information is now available in the first three components and the remaining information may be attributed to errors. Table 8.4 shows the eigen vectors of the Variance-Covariance matrix. These eigen vectors represent the weightage of each band in the newly derived principal components. However, they do not give the correlation between the original bands and the newly formed principal components. Table 8.5 gives the correlation between the original bands and the various components. It can be seen that the First Principal Component has a high correlation with five of the original bands, and this is reflected in its high share of the variance, 85.33% (Table 8.3). This is evident from Fig. 8.26(a), where the forested and other vegetative areas can be seen in varying dark tones, while bare soil areas and dry river and stream beds appear white. This is due to the high correlation with the three visible TM bands, in which vegetative areas have low reflectance. The Second Principal Component is highly correlated with TM Band 4 and has high reflectance in vegetative areas (compare Figs. 8.26(a) and (b)); all vegetative areas in Fig. 8.26(b) appear bright in tone. The third component is highly correlated with TM Bands 5 and 7, in which water has high absorption; thus, water features appear in dark tones and are very distinctive. The other three components hardly give any distinctive information.

Table 8.3 Eigen values of the Principal Component Analysis

Parameter      PC 1       PC 2      PC 3     PC 4     PC 5    PC 6
Eigen value    1447.831   165.494   64.110   10.761   5.967   2.482
Percentage     85.33      9.75      3.78     0.63     0.35    0.15
Cumulative     85.33      95.08     98.86    99.49    99.84   99.99
Total variance = 1696.651

Table 8.4 Eigen vectors of the Principal Component Analysis

Band    Component 1   Component 2   Component 3   Component 4   Component 5   Component 6
TM 1        0.232        -0.151        -0.414         0.250         0.625         0.546
TM 2        0.294        -0.116        -0.449         0.132         0.209        -0.798
TM 3        0.458        -0.238        -0.405         0.009        -0.714         0.245
TM 4        0.127         0.890        -0.340        -0.268        -0.005         0.063
TM 5        0.544         0.296         0.491         0.612        -0.029        -0.024
TM 7        0.581        -0.164         0.327        -0.688         0.235         0.002


Table 8.5 Degree of correlation between each band and component

Band    Component 1   Component 2   Component 3   Component 4   Component 5   Component 6
TM 1        0.882        -0.194        -0.331         0.082         0.153         0.086
TM 2        0.931        -0.124        -0.299         0.036         0.043        -0.105
TM 3        0.961        -0.169        -0.179         0.002        -0.096         0.021
TM 4        0.374         0.885        -0.211        -0.068        -0.001         0.008
TM 5        0.986         0.181         0.187         0.096        -0.003        -0.002
TM 7        0.982        -0.094         0.116        -0.100         0.026         0.000

Fig. 8.26 Principal Component Analysis Images: (a) Component 1, (b) Component 2, (c) Component 3, (d) Component 4, (e) Component 5, (f) Component 6
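The principal component images above can be reproduced in outline with the eigen-decomposition sketched below; the six-band image array and its values are assumed for illustration only, and the variance percentages reflect whatever covariance matrix the input data actually yield (Tables 8.3 to 8.5 were derived from the study data set, not from this toy example).

```python
import numpy as np

def principal_components(image):
    """image: array of shape (rows, cols, bands).
    Returns the PC images, eigen values, eigen vectors and the
    percentage of variance carried by each component."""
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(np.float64)
    cov = np.cov(pixels, rowvar=False)              # variance-covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)          # symmetric matrix -> eigh
    order = np.argsort(eigvals)[::-1]               # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    centred = pixels - pixels.mean(axis=0)
    pcs = centred @ eigvecs                         # project pixels onto the eigen vectors
    percent = 100.0 * eigvals / eigvals.sum()       # share of information per component
    return pcs.reshape(rows, cols, bands), eigvals, eigvecs, percent

# Illustrative call on a random six-band image
img = np.random.randint(0, 255, size=(100, 100, 6)).astype(float)
pcs, eigvals, eigvecs, percent = principal_components(img)
print(np.round(percent, 2))
```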


8.8.4 Tassel Cap Transformation Images
The next transformation is the Tassel Cap Transformation, performed on the sample data set using the Tassel Cap transformation coefficients given in Table 8.1. The Brightness image clearly emphasises all the barren soil and river sand areas in white tones, while all vegetative and water areas appear in darker tones (Fig. 8.27a). The Greenness image shows all the vegetative areas in white tones, while barren soil, sandy areas and water bodies appear in darker tones (Fig. 8.27b). Similarly, in the Wetness image, water bodies appear in white tones, while vegetative areas appear in a lighter greyish tone as they contain leaf moisture. Thus one can clearly see the utility of the Tassel Cap transformation in differentiating different types of land cover.

Fig. 8.27 Tassel Cap Transformation Images: (a) Brightness, (b) Greenness, (c) Wetness, (d) Haze, (e) Fifth, (f) Sixth
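A sketch of how the Tassel Cap images are obtained as linear combinations of the six TM bands; the coefficient matrix below is a placeholder filled with made-up numbers, and in practice its rows would be the Brightness, Greenness, Wetness (and remaining) coefficients of Table 8.1.

```python
import numpy as np

# Placeholder 6 x 6 coefficient matrix: one row per output feature
# (Brightness, Greenness, Wetness, Haze, Fifth, Sixth), one column per
# TM band (1, 2, 3, 4, 5, 7). Replace with the values from Table 8.1.
TC_COEFFS = np.array([
    [ 0.3,  0.3,  0.5,  0.6,  0.5,  0.3],   # Brightness (illustrative only)
    [-0.3, -0.2, -0.5,  0.7,  0.1, -0.2],   # Greenness  (illustrative only)
    [ 0.1,  0.2,  0.3,  0.3, -0.7, -0.4],   # Wetness    (illustrative only)
    [ 0.8, -0.1, -0.4,  0.1, -0.1,  0.1],   # Haze       (illustrative only)
    [ 0.1, -0.7,  0.6,  0.1, -0.2,  0.1],   # Fifth      (illustrative only)
    [ 0.1, -0.1,  0.1, -0.1,  0.6, -0.8],   # Sixth      (illustrative only)
])

def tasseled_cap(image, coeffs=TC_COEFFS):
    """image: (rows, cols, 6) array of TM Bands 1-5 and 7.
    Returns a (rows, cols, 6) array of Tassel Cap features."""
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(np.float64)
    features = pixels @ coeffs.T            # each feature is a weighted sum of the bands
    return features.reshape(rows, cols, -1)

img = np.random.randint(0, 255, size=(50, 50, 6)).astype(float)
tc = tasseled_cap(img)
brightness, greenness, wetness = tc[..., 0], tc[..., 1], tc[..., 2]
```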


On comparing the FCCs (Figs. 8.28, 8.29 and 8.30) created from the original data with those created from the first three principal components and from the combination of the Brightness, Greenness and Wetness images of the Tassel Cap transformation, it is observed that some information, such as waterlogged areas, villages and forest cut-out areas, can be identified more clearly, and that a better differentiation between barren land and dry river sand is obtained, on the FCCs formed from the Principal Component Analysis and the Tassel Cap Transformation.

Fig. 8.28 FCC of TM Bands 4, 3 and 2


Fig. 8.29 FCC generated from the first three Principal Components


Fig. 8.30 FCC generated from Brightness, Greenness and Wetness images


9 Image Classification

9.1 INTRODUCTION
Image classification of remote sensing data refers to the computer-assisted interpretation of the data to generate a thematic map showing the spatial distribution of identifiable earth features. In order to identify different objects or features, their spectral properties or signatures are used. The success of such a process depends on two aspects: (i) the presence of distinctive signatures for the objects and features of interest in the given data set, and (ii) the ability to reliably distinguish these signatures from other spectral response patterns that may be present. A number of factors can cause confusion among spectral signatures, including topography, shadow, atmospheric variability, changes in sensor calibration and mixing of classes. Although some of these effects can be accounted for or modeled, the rest may have to be treated simply as statistical variability. Basically, digital image classification involves three broad steps as discussed below:
(i) Feature extraction, i.e. transformation of a multispectral image by a spatial or spectral transform to a feature image. This may typically include selection of a subset of bands, a PCT to reduce the data dimensionality, or a spatial smoothing filter.
(ii) Training, i.e. use of specified areas in the image of known identity to train the classifier.
(iii) Classification, i.e. application of appropriate statistical or discriminant functions to the whole image in order to categorize it into different informational classes.


The process of multispectral classification may be performed using either of two approaches, namely supervised or unsupervised. In the supervised classification approach, the analyst classifies the image on the basis of known information referred to as training data. This stage is often called signature analysis and may involve the development of a characterization of each information class through statistical parameters such as the mean, standard deviation, covariance and correlation matrices. Once a statistical characterization has been achieved for each informational class, the image is then classified by examining the digital number of each pixel and making a decision about which information class it resembles most. These decisions are generally undertaken by a classifier, which may be statistical, probabilistic, fuzzy or knowledge based. In unsupervised classification, the image is partitioned, on the basis of some natural spectral properties, into clusters whose exact identity is not known. It is then the responsibility of the analyst to convert these clusters into informational classes. Since the definition of a cluster is based on some spectral property of the object, it is known as a spectral class. When talking about classes, it is important to distinguish between information classes and spectral classes. Information classes are those categories of interest that the analyst is actually trying to identify in the imagery, such as different kinds of crops, different forest types or tree species, different geologic units or rock types, etc. Spectral classes are groups of pixels that are uniform (or near-similar) with respect to their brightness values in the different spectral channels of the data. The objective is to match the spectral classes in the data to the information classes of interest. However, it is rare that there is a simple one-to-one match between these types of classes. Generally, it is found that two to three spectral classes may merge to form one information class, while some spectral classes may not be of any particular interest. It is the analyst's job to decide on the utility of the different spectral classes and their correspondence to useful information classes.

9.2 SUPERVISED CLASSIFICATION
For a well defined and structured approach to supervised classification, it is important to consider the following steps in order to generate a useful thematic map:
(i) Selection of an appropriate classification scheme.
(ii) Selection of representative training sites.
(iii) Generation of training data statistics from the selected training sites.
(iv) Selection of appropriate bands, on the basis of the training data statistics, to be used for classification.
(v) Selection of the appropriate classification algorithm.
(vi) Classification of the imagery into the desired informational classes.
(vii) Evaluation of the classification accuracy.
The following sections discuss the various aspects in detail, so as to emphasize the significance of each.


9.2.1 Classification Scheme
An object or feature can be categorized either on the basis of its physical or its spectral properties. The physical properties can be its shape, size, colour or texture, and are basically human-defined attributes; any categorization based on these criteria is known as an informational class. Spectral properties are primarily due to sensor characteristics, account for the chemical and biological characteristics of the object or feature, and hence define a spectral class. However, there is no identity attached to a spectral class; based on the analyst's experience and knowledge, an informational class label is provided. A classification scheme is one approach to defining information classes in a systematic manner with a well defined hierarchy. Many classification schemes have been developed so that land use and land cover data can be obtained by interpreting remotely sensed data. Some of the important ones are the U.S. Geological Survey Land Use/Land Cover Classification System, the Michigan Classification System, and the Cowardin Wetland Classification System. The major points of difference between the various classification schemes are their emphasis and their ability to incorporate information obtained using remote sensing data. The U.S. Geological Survey classification scheme is "resource" oriented, in contrast with various "people or activity" oriented systems such as the Standard Land Use Coding (SLUC) Manual. The system addresses nine Level I categories (Table 9.1). It is designed to be driven primarily by the interpretation of remote sensor data obtained at various scales and resolutions (Table 9.2) and not by data collected in situ. It was initially developed for land-use data that was visually interpreted, although it has been widely used for digital multispectral classification as well.

Table 9.1 U.S. Geological Survey Land Use/Land Cover Classification System

Level I                          Level II
1 Urban or built-up land         11 Residential
                                 12 Commercial and services
                                 13 Industrial
                                 14 Transportation and communications
                                 15 Industrial and commercial complexes
                                 16 Mixed urban or built-up land
                                 17 Other urban or built-up land
2 Agricultural land              21 Cropland and pasture
                                 22 Orchards, groves, vineyards, nurseries, and ornamental horticultural areas
                                 23 Confined feeding operations
                                 24 Other agricultural land
3 Rangeland                      31 Herbaceous rangeland
                                 32 Shrub and brush rangeland
                                 33 Mixed rangeland
4 Forest land                    41 Deciduous forest land
                                 42 Evergreen forest land
                                 43 Mixed forest land
5 Water                          51 Streams and canals
                                 52 Lakes
                                 53 Reservoirs
                                 54 Bays and estuaries
6 Wetland                        61 Forested wetland
                                 62 Nonforested wetland
7 Barren land                    71 Dry salt flats
                                 72 Beaches
                                 73 Sandy areas other than beaches
                                 74 Bare exposed rocks
                                 75 Strip mines, quarries, and gravel pits
                                 76 Transitional areas
                                 77 Mixed barren land
8 Tundra                         81 Shrub and brush tundra
                                 82 Herbaceous tundra
                                 83 Bare ground
                                 84 Mixed tundra
9 Perennial snow and ice         91 Perennial snowfields
                                 92 Glaciers

On the other hand, the Standard Land Use Coding (SLUC) Manual is a land-use "activity" oriented classification scheme and depends primarily on in situ observation to provide remarkably specific land-use information. However, there exists the need to merge the two approaches to produce a hybrid classification system that incorporates both land use as interpreted from remote sensor data and the precise but expensive land-use information obtained in situ.

Table 9.2 The four levels of remotely sensed data to be used

Classification level    Typical data characteristics
I                       Landsat (formerly ERTS) type of data
II                      High-altitude data acquired at 40,000 ft (12,400 m) or above; results in imagery that is smaller than 1:80,000 scale
III                     Medium-altitude data acquired between 10,000 and 40,000 ft (3,100 and 12,400 m); results in imagery that is between 1:80,000 and 1:20,000 scale
IV                      Low-altitude data acquired below 10,000 ft (3,100 m); results in imagery that is larger than 1:20,000 scale

Source: Anderson et al. (1976) and Jensen et al. (1983).

The Michigan classification system serves as a guide for the preparation of detailed urban/suburban classification schemes. Like the USGS system, the Michigan system permits aggregation from the lowest level upward to successively higher levels and, in certain instances, the disaggregation of higher-level data into lower levels. This reduces cost and increases benefits, as multidisciplinary users can access the same database and avoid duplication of effort. Basically, the nine Level I categories of the USGS system were adopted without change. In addition, however, the Michigan system includes detailed categories in Levels II, III, and IV consistent with Michigan environmental and cultural conditions. It is land cover oriented up to and including Level III, making it amenable to remote sensing data analysis, and it is both land cover and activity oriented at Level IV so as to meet user needs at the local level.

9.2.2 Training Site Selection and Statistics Extraction
Based on the classification scheme adopted, the analyst selects sites within the image that are representative of the land cover classes of interest. These sites should not be atypical, but ones that represent the norm for each class. The image coordinates of these sites are then identified and used to extract statistics from the spectral data for these areas. Training data are of value only if the environment from which they were obtained is relatively homogeneous. In defining the training areas for supervised classification, the analyst should adhere to certain guidelines, as outlined below. The main objective is to identify a group of pixels in the image which represents the spectral variation of each information class.

9.2.2.1 Guidelines for Training Data
The success of supervised classification depends entirely on the proper selection of training data. The training data should adhere to certain guidelines relating to:
(i) Number of pixels
(ii) Size
(iii) Shape
(iv) Location
(v) Number
(vi) Placement
(vii) Uniformity

Number of pixels
The number of pixels selected for each category is an important aspect. As a general guideline, the analyst should ensure that the several individual training areas selected for each category together provide a total of at least 100 or so pixels per category.

Size
The size of the training areas is an important consideration. Each must be large enough to provide accurate and reliable estimates of the properties of each informational class. However, the individual training areas should not be too large, as large areas may include undesirable variation, while very small training areas may be difficult to locate accurately on the image. Sample size is not simply a matter of 'the bigger the better', for cost is an important consideration. It may be recalled that sample size is related to the number of variables (i.e. the number of spectral bands) whose statistical properties are to be estimated, and to the number of statistical properties. In the case of a single variable and the estimation of a single property, such as the mean or the variance, a sample size of 30 is usually found to be sufficient. For the multivariate case, the size should be at least 30p pixels per class, where p is the number of features (spectral bands), and preferably more. Dobbertin and Biging (1996) showed that classification accuracy improves as sample size increases. However, neural-based classifiers appear to work better than statistical classifiers for small training sets, though better results were achieved when training set size is proportional to class size and variability (Blamire, 1996; Foody et al, 1995; Hepner et al, 1990). Thus, the analyst must devote time to the definition and analysis of the training areas. Joyce (1978) suggests 65 ha (160 acres) as the maximum size for training fields.

Shape Shape of training areas is not an important consideration, provided that shape does not prohibit accurate delineating and positioning of correct outlines of regions on digital images. Usually, it is easy to define square or rectangular areas; as such shapes minimize the number of vertices that may have to be specified, which is usually a bothersome task for the analyst.

Location
Location is important, as each informational category should be represented by several well distributed training areas located throughout the image. Proper care should be taken that the limits of the training data are not too close to the boundary of the class, as this may lead to the inclusion of pixels that are not pure but mixed in nature.

Number
The optimum number of training areas depends upon the number of categories to be mapped, their diversity, and the resources that can be devoted to delineating training areas. Ideally, each informational class, or spectral subclass, should be represented by at least five to ten training areas spread over the image area, so as to ensure that the spectral properties of each category are represented. Since informational classes are often spectrally diverse, it may be necessary to use several sets of training data for each informational category, owing to the presence of spectral subclasses. Selection of multiple training areas is also desirable because later in the classification process it may be necessary to discard some of the training areas if they are found to be unsuitable.


Placement
Placement of training areas may be an important criterion. Training areas should be placed within the image in a manner that permits convenient and accurate location with respect to distinctive features, such as water bodies or boundaries between distinctive features on the image. They should be distributed throughout the image so that they represent the diversity present within the scene. Boundaries of training fields should be placed well away from the edges of contrasting parcels so that they do not encompass edge pixels.

Uniformity Perhaps, one of the most important properties of a good training area is its uniformity or homogeneity. Data within each training area should exhibit unimodal frequency distribution for each spectral band to be used. Prospective training areas that exhibit bimodal histograms should be discarded if their boundaries cannot be adjusted to yield more uniformity. Training data provide values that estimate the means, variance, and covariances of spectral data measured in several spectral channels. For each class to be mapped, these estimates approximate the mean values for each band, variability of each band, and interrelationships between bands.

9.2.2.2 Idealized Sequence for Selecting Training Data
It is possible to outline an idealized sequence of the key steps in the selection and evaluation of training data:
(i) Assemble information, including maps and aerial photographs of the region to be mapped.
(ii) Conduct field trips to selected and representative sites to acquire first-hand knowledge of the area. The field trips should coincide with the date and time of data acquisition; if this is not possible, they should be at the same time of the year.
(iii) Conduct a preliminary examination of the digital data, in order to assess the quality of the image.
(iv) Identify prospective training areas. These locations may be defined with respect to some easily identifiable object in the image. Further, the same may be identified on maps and aerial photographs, if readily available.
(v) Extract the training data from the digital image.
(vi) For each informational class, display and inspect frequency histograms of all spectral bands to be used in the classification. Examine the mean, variance, divergence measures, covariance, and other measures to assess the usefulness of the training data.
(vii) Modify the boundaries of the training fields to eliminate bimodal frequency distributions, or, if necessary, discard those areas that are not suitable. If necessary, return to step (iv) to define new areas to replace those that have been eliminated.

(viii) Incorporate the training data information into a form suitable for use in the classification procedure, and proceed with the classification process as described in the subsequent sections of this chapter.

9.2.3 Training Data Statistics
In supervised classification, training data play a vital role: they provide the blueprint based on which a classifier assigns a pixel to an informational class. Once the training data have been properly identified as per the guidelines given in Section 9.2.2, they have to be converted into a usable form so that they can be used by a classifier. Classification algorithms can be grouped into two types: parametric and non-parametric. Parametric algorithms assume a particular class statistical distribution, commonly the normal distribution, and require estimates of the distribution parameters such as the mean vector and covariance matrix. On the other hand, non-parametric algorithms make no assumption about the probability distribution and are considered robust, as they may work well for a variety of class distributions. The different types of statistical parameters that may be generated are the class means, standard deviations, covariance and correlation matrices. Apart from these, the minimum and maximum values can be computed for each class. To check the homogeneity of the information defined by the training data for a given class, histogram plots of each class in each band may be examined; a good training data set should yield a uni-modal histogram in each band for each class. Further, scatter plots may be generated from the training data selected for all information classes to check for overlap. However, this may not be the ideal or convenient way to identify or select the best bands for classification. For this, the analyst may have to carry out feature selection, as discussed in the next section.
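The statistics described above can be collected with a few lines of NumPy; the training-pixel array, the class name and the numerical values used here are assumed purely for illustration.

```python
import numpy as np

def class_statistics(samples):
    """samples: (n_pixels, n_bands) array of training pixels for one class.
    Returns the mean vector, standard deviations, covariance and
    correlation matrices, and the per-band minimum and maximum."""
    return {
        "mean": samples.mean(axis=0),
        "std": samples.std(axis=0, ddof=1),
        "cov": np.cov(samples, rowvar=False),
        "corr": np.corrcoef(samples, rowvar=False),
        "min": samples.min(axis=0),
        "max": samples.max(axis=0),
    }

# Hypothetical training data: 200 pixels x 6 bands for a 'forest' class
forest_pixels = np.random.normal(loc=[40, 35, 30, 90, 60, 25],
                                 scale=5.0, size=(200, 6))
stats = class_statistics(forest_pixels)
print(np.round(stats["mean"], 2))
print(np.round(stats["cov"], 2))
```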

9.2.4 Feature Selection
Once the training statistics have been systematically collected from each band for each class of interest, an assessment has to be made to determine the band combination that is most effective in discriminating each class from all others. This process is commonly known as feature selection, and it minimizes the cost of the digital image classification. Feature selection may involve statistical and/or graphical analysis to determine the degree of separability between classes. Combinations of bands are normally ranked according to their potential ability to discriminate each class from all others using n bands at a time. Statistical methods of feature selection are used to quantitatively select the subset of bands (or features) that provides the greatest degree of statistical separability between any two classes, a and b. The basic problem of spectral pattern recognition is, given a spectral distribution of data in n bands, to find a discrimination technique that will allow separation of the major land cover categories with a minimum of error and a minimum number of bands. Generally, it is found that the cost of classification increases with the increase in the number of bands of data being used. The basis for feature selection lies in the fact that there may be large redundancy or repetition in the information content of the bands. Consider a simple case where, based on the training data, it is found that two informational classes overlap in a given band. In this situation, the analyst may like to apply a decision rule, for the given separation between the two classes, based on an explicit definition of the nature of the errors. Normally, the errors can be enumerated as:
(i) A pixel may be assigned to a class to which it does not belong, i.e. an error of commission.
(ii) A pixel is not assigned to its appropriate class, i.e. an error of omission.
The goal is to select an optimum subset of bands and apply appropriate classification techniques to minimize both types of error in the classification process. If the training data for each class in each band are normally distributed, as suggested in Fig. 9.1, it is possible to use some measure of separability to identify the optimum subset of bands to use in the classification procedure.

Fig. 9.1 Schematic representation of overlapping information from training data: histograms (number of pixels) of Class 1 and Class 2 separated by a one-dimensional decision boundary, with pixels of class 2 erroneously assigned to class 1 and pixels of class 1 erroneously assigned to class 2. (Source: Jensen, 1986)

Some of the statistical separability measures are:
(i) City Block Distance
(ii) Euclidean Distance
(iii) Angular Separation
(iv) Normalized City Block Distance
(v) Mahalanobis Distance
(vi) Divergence
(vii) Transformed Divergence
(viii) Bhattacharyya's Distance
(ix) Jeffries-Matusita Distance
City Block Distance, commonly known as Manhattan Distance or Boxcar Distance (Kardi, 2006), is basically a separability measure representing the distance between two points in a city road grid. It examines the absolute differences between the coordinates of two objects a and b, and hence is also known as the Absolute Value Distance. Its mathematical formulation is given in Table 9.3. Euclidean distance is a popular measure of the distance between two points or objects, based on the Pythagoras theorem. The Normalized City Block measure is better than the City Block distance in the sense that it is proportional to the separation of the class means and inversely proportional to their standard deviations. If the means are equal, however, it will be zero regardless of the class variances, which does not make sense for a statistical classifier based on probabilities. Angular separation is a similarity measure rather than a distance; it represents the cosine of the angle between two objects, and higher values indicate closer similarity (Kardi, 2006). However, all these measures fail to account for the overlap between class distributions due to their variance and are thus not good measures of separability for remote sensing data. For this reason, probability-based measures have also been defined. Divergence is one of the first measures of statistical separability used in the processing of remote sensor data, and it continues to be widely used as a method of feature selection. On the basis of the training data statistics, it tries to select the best subset of bands from the given dataset. If there are n bands in a given dataset, and the analyst is interested in finding the best q bands, then the number of band combinations C to be examined at a time can be expressed as:

C = n! / [q! (n − q)!]                ...(9.1)

For the six-band Landsat Thematic Mapper data set, if the analyst is interested in selecting the best three bands, then 20 combinations of 3-band sets have to be evaluated. Similarly, for a 4-band combination, 15 combinations have to be examined. Divergence is computed using the mean and covariance matrices of the class statistics collected in the training phase of the supervised classification (Table 9.3). For more than two classes, this can be extended by computing the average divergence, Davg, over all possible pairs of classes for the given subset of bands. The subset of bands having the maximum average divergence may be used in the classification algorithm; the average divergence can be expressed as:


Davg = [ Σ (c = 1 to m−1) Σ (d = c+1 to m) Dcd ] / C                ...(9.2)

Using this, the band subset q with the highest average divergence would be selected as the most appropriate set of bands for classifying the m classes. Kumar and Silva (1977) suggested that it is possible to take the divergence logic one step further and compute the transformed divergence, TD, as given in Table 9.3. This statistic gives an exponentially decreasing weight to increasing distances between the classes. It also scales the divergence values to lie between 0 and 2000. There are other methods of feature selection based on determining the separability between two classes at a time. For example, the Bhattacharyya (B) distance assumes that the two classes, a and b, are Gaussian in nature and that the means Ma and Mb and the covariance matrices Va and Vb are available. It is computed as given in Table 9.3. To select the best q features (i.e. combination of bands) from the original n bands in an m-class problem, the Bhattacharyya distance is calculated between each of the m(m−1)/2 pairs of classes for each of the possible ways of choosing q features from n dimensions. The best q features are those dimensions for which the sum of the Bhattacharyya distances between the m(m−1)/2 class pairs is highest. The Jeffries-Matusita (JM) distance is an extension of the Bhattacharyya distance (Table 9.3). Its purpose is similar to that of the Transformed Divergence (TD); however, it scales the values within the range 0 to 2.

Comparison of Divergence and JM Distance
The JM distance performs better as a feature selection criterion for multivariate normal classes than divergence, for the reasons given above; however, it is computationally more complex and thus more expensive to use. Suppose a particular problem involves M spectral classes, and consider the cost of computing all pairwise divergences and JM distances. These costs can be assessed largely on the basis of having to compute matrix inverses and determinants, assuming reasonably that they involve similar computational demands using numerical procedures. In the case of divergence, it is necessary to compute only M matrix inverses to allow all the pairwise divergences to be found. However, for the JM distance it is necessary to compute MC2 + M equivalent matrix inverses, since the individual class covariances appear as pairs which have to be added and then inverted. It may be noted that MC2 + M = M(M + 1)/2, so that divergence is a factor of (M + 1)/2 more economical to use. When it is recalled how many feature subsets may need to be checked in a feature selection exercise, this is clearly an important consideration.
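The separability measures of Table 9.3 can be computed directly from the class means and covariance matrices. The sketch below implements divergence, transformed divergence, the Bhattacharyya distance and the JM distance for one pair of classes, using made-up class statistics; the transformed divergence here is scaled to the 0 to 2000 range described in the text (the table's expression uses a 0 to 2 scaling).

```python
import numpy as np

def divergence(ma, va, mb, vb):
    """Divergence between two Gaussian classes (Table 9.3)."""
    va_inv, vb_inv = np.linalg.inv(va), np.linalg.inv(vb)
    d = ma - mb
    term1 = 0.5 * np.trace((va - vb) @ (vb_inv - va_inv))
    term2 = 0.5 * np.trace((va_inv + vb_inv) @ np.outer(d, d))
    return term1 + term2

def transformed_divergence(ma, va, mb, vb):
    return 2000.0 * (1.0 - np.exp(-divergence(ma, va, mb, vb) / 8.0))

def bhattacharyya(ma, va, mb, vb):
    d = ma - mb
    v = 0.5 * (va + vb)
    term1 = 0.125 * d @ np.linalg.inv(v) @ d
    term2 = 0.5 * np.log(np.linalg.det(v) /
                         np.sqrt(np.linalg.det(va) * np.linalg.det(vb)))
    return term1 + term2

def jeffries_matusita(ma, va, mb, vb):
    return np.sqrt(2.0 * (1.0 - np.exp(-bhattacharyya(ma, va, mb, vb))))

# Hypothetical two-band statistics for classes a and b
ma, va = np.array([40.0, 90.0]), np.array([[25.0, 5.0], [5.0, 30.0]])
mb, vb = np.array([55.0, 60.0]), np.array([[20.0, 3.0], [3.0, 40.0]])
print(transformed_divergence(ma, va, mb, vb), jeffries_matusita(ma, va, mb, vb))
```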


9.2.5 Selection of Appropriate Classification Algorithm
Various supervised classification methods may be used to assign an unknown pixel to one of a number of classes. The choice of a particular classifier or decision rule depends on the nature of the input data and the desired output. Parametric classification algorithms assume that the observed measurement vectors Xc obtained for each class in each spectral band during the training phase of the supervised classification are Gaussian in nature (i.e. they are normally distributed). Non-parametric classification algorithms make no such assumption. It is instructive to review the logic of several of the classifiers. Among the most frequently used classification algorithms are the minimum distance, parallelepiped and maximum likelihood classifiers.

9.2.5.1 The Parallelepiped Classifier
This classifier is based on simple Boolean "and/or" logic. Training data in n spectral bands are used in performing the classification. Brightness values from each pixel of the multispectral imagery are used to produce an n-dimensional mean vector, Mc = (µc1, µc2, µc3, ..., µcn), with µck being the mean value of the training data obtained for class c in band k, out of m possible classes as previously defined, and σck being the standard deviation of the training data for class c in band k. Using a one-standard-deviation threshold, the parallelepiped algorithm decides that BVijk is in class c if, and only if,

µck − σck ≤ BVijk ≤ µck + σck                ...(9.3)

where c = 1, 2, 3, ..., m (number of classes) and k = 1, 2, 3, ..., n (number of bands). Therefore, if the low and high decision boundaries are defined as

Lowck = µck − σck                ...(9.4)

and

Highck = µck + σck                ...(9.5)

the parallelepiped algorithm becomes

Lowck ≤ BVijk ≤ Highck                ...(9.6)

These decision boundaries form an n-dimensional parallelepiped in feature space. If the pixel value lies above the low threshold and below the high threshold for all n bands evaluated, it is assigned to that class. When an unknown pixel does not satisfy any of the Boolean logic criteria, it is assigned to an unclassified category. Although it is only possible to analyze visually up to three dimensions, it is possible to create an n-dimensional parallelepiped for classification purposes within a computer environment.


Table 9.3 Different feature selection methods (Schowengerdt, 1997)

City Block (L1):                 L1 = Σ (k = 1 to K) |mak − mbk|
Euclidean (L2):                  L2 = [(µa − µb)^T (µa − µb)]^(1/2) = [Σ (k = 1 to K) (mak − mbk)^2]^(1/2)
Angular (ANG):                   ANG = acos[ µa^T µb / (|µa| |µb|) ]
Normalized City Block (NL):      NL1 = Σ (k = 1 to K) |mak − mbk| / [(vak + vbk)/2]
Mahalanobis (MH):                MH = [(µa − µb)^T ((va + vb)/2)^(−1) (µa − µb)]^(1/2)
Divergence (D):                  D = (1/2) tr[(va − vb)(vb^(−1) − va^(−1))] + (1/2) tr[(va^(−1) + vb^(−1))(µa − µb)(µa − µb)^T]
                                 where tr is the trace of a matrix, i.e. the sum of its diagonal elements
Transformed Divergence (TD):     TD = 2[1 − e^(−D/8)]
Bhattacharyya Distance (B):      B = (1/8) MH^2 + (1/2) ln[ |(va + vb)/2| / (|va| |vb|)^(1/2) ]
Jeffries-Matusita Distance (JM): JM = [2(1 − e^(−B))]^(1/2)

The parallelepiped algorithm is a computationally efficient method of classifying remote sensor data. Unfortunately, as some parallelepipeds may overlap, it is possible that an unknown pixel might satisfy the criteria of more than one class. In such cases it is usually assigned to the first class for which it meets all criteria. A more elegant solution is to take the pixel that can be assigned to more than one class and use a minimum-distance-to-means decision rule to assign it to just one class.
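A minimal sketch of the parallelepiped decision rule of equations (9.3) to (9.6); the convention that a pixel satisfying more than one box goes to the first matching class follows the text above, while the class statistics and pixel values are assumed for illustration.

```python
import numpy as np

def parallelepiped_classify(pixels, means, stds, unclassified=-1):
    """pixels: (n_pixels, n_bands); means, stds: (n_classes, n_bands)
    per-class training statistics. Returns a class index for each pixel,
    or `unclassified` when no class brackets the pixel in every band."""
    low, high = means - stds, means + stds                 # Eqs. (9.4)-(9.5)
    labels = np.full(len(pixels), unclassified, dtype=int)
    for i, bv in enumerate(pixels):
        for c in range(len(means)):
            if np.all((low[c] <= bv) & (bv <= high[c])):   # Eq. (9.6)
                labels[i] = c          # first class whose box contains the pixel
                break
    return labels

means = np.array([[40.0, 90.0], [70.0, 30.0]])     # hypothetical 2 classes, 2 bands
stds = np.array([[8.0, 10.0], [6.0, 7.0]])
pixels = np.array([[42.0, 85.0], [68.0, 28.0], [55.0, 55.0]])
print(parallelepiped_classify(pixels, means, stds))   # -> [0, 1, -1]
```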


9.2.5.2 The Minimum-Distance-to-Means Classifier
This decision rule is computationally simple and commonly used. It can result in classification accuracies comparable to other, more computationally intensive algorithms, such as the maximum likelihood algorithm. It requires that the user provide the mean vector µck for each class c in each band k from the training data. To perform a minimum distance classification, the classifier calculates the distance from each unknown pixel (BVijk) to each class mean vector. The distance can be found using the Euclidean distance based on the Pythagorean theorem:

Dist = [(BVijk − µck)^2 + (BVijl − µcl)^2]^(1/2)                ...(9.7)

where µck and µcl represent the mean vectors for class c measured in bands k and l.
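A sketch of the minimum-distance-to-means rule: each pixel is assigned to the class whose mean vector is nearest in Euclidean distance (Eq. 9.7, generalized here to any number of bands). The class statistics and pixel values are assumed.

```python
import numpy as np

def minimum_distance_classify(pixels, means):
    """pixels: (n_pixels, n_bands); means: (n_classes, n_bands).
    Assigns each pixel to the class with the nearest mean vector."""
    # Euclidean distance from every pixel to every class mean
    dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

means = np.array([[40.0, 90.0], [70.0, 30.0], [20.0, 15.0]])   # hypothetical classes
pixels = np.array([[45.0, 80.0], [22.0, 18.0]])
print(minimum_distance_classify(pixels, means))   # -> [0, 2]
```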

9.2.5.3 The Maximum Likelihood Classifier
The classification strategies considered so far do not consider the variation that may be present within spectral categories and do not address the problems that arise when spectral classes overlap. Such a situation arises frequently, as one is often interested in classifying those pixels that tend to be spectrally similar, rather than those which are distinct enough to be easily and accurately classified by other classifiers. The essence of the maximum likelihood classifier is to assign a pixel to the class which maximizes the likelihood of a correct classification, based on the information available from the training data. It uses the training data to estimate the mean measurement vector Mc of each class c and the variance-covariance matrix Vc of each class. It decides that a pixel X is in class c if, and only if, pc > pi for all i = 1, 2, 3, ..., m possible classes, where pi is the corresponding value computed for class i and

pc = −0.5 loge[det(Vc)] − 0.5 [(X − Mc)^T (Vc)^(−1) (X − Mc)]                ...(9.8)

Theoretically, each class is given equal weightage if no knowledge regarding the occurrence of the features on the ground is available. If the chance of a particular class occurring is greater than that of the others, then the user can define a set of a priori probabilities ac for the classes and the equation can be slightly modified: decide X is in class c if, and only if, pc(ac) > pi(ai) for all i = 1, 2, 3, ..., m possible classes, where

pc(ac) = loge(ac) − 0.5 loge[det(Vc)] − 0.5 [(X − Mc)^T (Vc)^(−1) (X − Mc)]                ...(9.9)


The use of a priori probabilities helps in incorporating the effects of relief and other terrain characteristics. The disadvantage of this classifier is that it requires a large amount of computer memory and computing time, and yet it may sometimes not produce the best results (Jensen, 1986).
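A sketch of the maximum likelihood decision rule of equations (9.8) and (9.9): the log-likelihood term is computed for every class, an optional log prior is added, and the pixel goes to the class with the largest value. The class statistics and priors used here are assumed values.

```python
import numpy as np

def maximum_likelihood_classify(pixels, means, covs, priors=None):
    """pixels: (n_pixels, n_bands); means: (n_classes, n_bands);
    covs: (n_classes, n_bands, n_bands); priors: optional (n_classes,).
    Implements p_c of Eqs. (9.8)/(9.9) and picks the largest per pixel."""
    n_classes = len(means)
    scores = np.empty((len(pixels), n_classes))
    for c in range(n_classes):
        inv = np.linalg.inv(covs[c])
        logdet = np.log(np.linalg.det(covs[c]))
        diff = pixels - means[c]
        maha = np.einsum('ij,jk,ik->i', diff, inv, diff)   # (X - Mc)^T Vc^-1 (X - Mc)
        scores[:, c] = -0.5 * logdet - 0.5 * maha          # Eq. (9.8)
        if priors is not None:
            scores[:, c] += np.log(priors[c])              # Eq. (9.9)
    return np.argmax(scores, axis=1)

means = np.array([[40.0, 90.0], [70.0, 30.0]])
covs = np.array([[[25.0, 5.0], [5.0, 30.0]],
                 [[20.0, 3.0], [3.0, 40.0]]])
pixels = np.array([[45.0, 82.0], [66.0, 35.0]])
print(maximum_likelihood_classify(pixels, means, covs, priors=[0.6, 0.4]))
```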

9.3 UNSUPERVISED CLASSIFICATION
Clustering is one of the most important tasks in data mining and knowledge discovery (Fayyad et al, 1996). Cluster analysis attempts to find subsets within a given data set that are similar enough to warrant further analysis. It organizes a set of objects into groups (or clusters) such that objects in the same group are similar to each other and different from those in other groups (Gordon, 1987; Gordon, 1996; Everitt et al, 2001). These groups or clusters should have meaning in the context of a particular problem (Jain and Dubes, 1988). The definition of proximity (a similarity function) between two data objects (or between two groups of data objects) and the overall optimization search strategy, i.e. how to find the best overall grouping according to an optimization criterion, are the most important and fundamental parts of a clustering method. Clustering, known as unsupervised classification, does not need training data and is especially useful when the user has limited knowledge about the data. Clustering algorithms partition data into a certain number of clusters (groups, subsets, or categories). There is no universally agreed definition of a cluster (Everitt et al, 2001). Most researchers describe a cluster by considering the internal homogeneity and the external separation (Gordon, 1999; Hansen and Jaumard, 1997; Jain and Dubes, 1988), i.e. patterns in the same cluster should be similar to each other, while patterns in different clusters should not. Both the similarity and the dissimilarity should be examinable in a clear and meaningful way.

Fig. 9.2 Clustering procedure: data samples are subjected to feature selection or extraction, clustering algorithm design or selection, cluster validation and results interpretation, leading to knowledge. (Source: Xu and Wunsch, 2005)

The process of clustering can be explained by four basic operations (Fig. 9.2):
(i) Feature selection or extraction
(ii) Clustering algorithm design or selection
(iii) Cluster validation
(iv) Result interpretation


Feature selection or extraction
Feature selection refers to choosing distinguishing features from a set of candidates, while feature extraction utilizes some transformation to generate useful and novel features from the original ones. Both are very crucial to the effectiveness of clustering applications. Proper selection of features can greatly decrease the workload and simplify the subsequent design process. Generally, ideal features should be useful in distinguishing patterns belonging to different clusters, immune to noise, and easy to extract and interpret.

Clustering algorithm design or selection
This step is a crucial one in clustering, as it is related to the selection of a proximity measure and the construction of a criterion function. Patterns are grouped into clusters according to whether they resemble each other. Obviously, the proximity measure will directly affect the formation of the resulting clusters. Almost all clustering algorithms are explicitly or implicitly connected to some definition of a proximity measure. Once a proximity measure is chosen, the construction of a clustering criterion function makes the partitioning of clusters an optimization problem. Clustering is ubiquitous, and a wealth of clustering algorithms has been developed to solve different problems in specific fields. However, there is no clustering algorithm that can be universally used to solve all problems. "It has been very difficult to develop a unified framework for reasoning about it (clustering) at a technical level, and profoundly diverse approaches to clustering" (Kleinberg, 2002). Therefore, it is important to carefully investigate the characteristics of the problem at hand, in order to select or design an appropriate clustering strategy.

Cluster validation
Given a data set, a clustering algorithm can always generate a division, no matter whether a structure exists or not. Moreover, different approaches usually lead to different clusters; and even for the same algorithm, parameter selection or the presentation order of the input patterns may affect the final result. Therefore, effective criteria are important to provide the users with a degree of confidence in the clustering results derived from the algorithm used. These assessments should be objective and have no preference for any algorithm. Also, they should be useful for answering questions such as how many clusters are hidden in the data, whether the clusters obtained are meaningful or just an artifact of the algorithm, or why one algorithm should be selected over another. Generally, there are three categories of testing criteria: external indices, internal indices, and relative indices. These are defined on three types of clustering structures, known as partitional clustering, hierarchical clustering and individual clusters (Jain and Dubes, 1988). External indices are based on some pre-specified structure, which reflects prior information about the data and is used as a standard to validate the clustering solutions. Internal indices are not dependent on external information (prior knowledge); on the contrary, they examine the clustering structure directly from the original data. Relative indices place the emphasis on the comparison of different clustering structures, in order to provide a reference to decide which one best reveals the characteristics of the objects.

Result interpretation
The ultimate goal of clustering is to provide the users with meaningful insights from the original data, so that they can effectively solve the problems encountered. Experts in the relevant fields interpret the data partition, and further analysis, or even experiments, may be required to guarantee the reliability of the extracted knowledge. It may be observed that the flow chart (Fig. 9.2) also includes a feedback pathway: cluster analysis is not a one-way process, and in many circumstances it needs a series of iterations. Moreover, there are no universal and effective criteria to guide the selection of features and clustering schemes. Validation criteria provide some insight into the quality of clustering solutions, but choosing the appropriate criterion is still a problem requiring more effort. A cluster analysis generally consists of the following steps (Milligan, 1996; Everitt et al, 2001):
(i) Data to cluster – data acquisition, preparation and cleaning.
(ii) Variables to use – selecting the relevant variables to be included in the clustering procedure. Irrelevant or masking variables should be excluded as far as possible.
(iii) A proximity measure – designing a proper proximity measure.
(iv) The clustering procedure.
(v) Number of clusters – determining how many clusters there should be; even no cluster is a possible outcome.
(vi) Replication, testing and interpretation.
Clustering methods can be divided into two groups: partitioning and hierarchical approaches (Fig. 9.3). The partitioning approach aims to divide the data set into several clusters, which may not overlap with each other but together cover the whole data space. A data item is assigned to the closest cluster based on a proximity or dissimilarity measure. Hierarchical clustering approaches decompose the data set into a sequence of nested partitions, from fine to coarse resolution. Hierarchical clustering can be presented with dendrograms, which consist of layers of nodes, each representing a cluster (Jain and Dubes, 1988); a dendrogram is a tree that splits the data space recursively into smaller subsets (Duda et al, 2001). The simplest approach to hierarchical clustering is to start with each individual data point as a cluster and then recursively merge the two closest clusters (based on some distance measure) into a subset, until the whole data set is in one cluster. A top-down approach can also be used, recursively dividing the dataset into smaller subsets. Each data point can belong to several clusters at different levels. Within each group, according to their definition of a cluster, clustering methods may also be classified into three groups: distance based, model based (or distribution based), and density based methods. Detailed reviews of clustering research can be found in Jain et al (1999), Duda et al (2001), Everitt et al (2001) and Han and Kamber (2001).

9.3.1 Distance-based Clustering Methods
Distance-based clustering methods require a distance or dissimilarity measure, based on which the most similar objects are grouped into clusters. Distance-based partitioning methods include K-means and CLARANS (Ng and Han, 1994). Distance-based hierarchical clustering methods include single-linkage and graph-based methods (Gordon, 1987; Jain and Dubes, 1988; Duda et al, 2001). CLARANS (Clustering Large Applications based on RANdomized Search) (Ng and Han, 1994) is a K-medoids clustering method, which is a type of distance-based partitioning clustering. It uses a randomized and bounded search strategy to improve the initial partition and is more efficient than traditional K-medoids methods like PAM and CLARA (Kaufman and Rousseeuw, 1990). However, CLARANS still needs to scan through the whole database to compute the overall optimal solution, and thus it is very time-consuming even with the randomized search strategy. Ester et al (1995) proposed two techniques to integrate CLARANS with spatial databases using a spatial index structure to improve its time performance.
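A compact sketch of the K-means idea referred to above, the simplest distance-based partitioning method: samples are assigned to the nearest of k centres, the centres are recomputed as cluster means, and the two steps are iterated. The data and parameter values below are made up for illustration.

```python
import numpy as np

def kmeans(data, k, n_iter=20, seed=0):
    """data: (n_samples, n_features). Returns (labels, centres)."""
    rng = np.random.default_rng(seed)
    centres = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each sample to the nearest centre (distance-based rule)
        dists = np.linalg.norm(data[:, None, :] - centres[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # recompute each centre as the mean of its members
        for c in range(k):
            if np.any(labels == c):
                centres[c] = data[labels == c].mean(axis=0)
    return labels, centres

data = np.vstack([np.random.normal([20, 80], 4, (100, 2)),
                  np.random.normal([70, 30], 4, (100, 2))])
labels, centres = kmeans(data, k=2)
print(np.round(centres, 1))
```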

Fig. 9.3 A general classification of clustering methods

9.3.2 Model-based Clustering Methods
Model-based or distribution-based clustering methods assume that the data of each cluster conform to a specific statistical distribution (e.g. the Gaussian distribution) and that the whole dataset is a mixture of several distribution models. Maximum Likelihood Estimation (MLE) and Expectation-Maximization (EM) are two examples of distribution-based partitioning clustering methods (Bradley et al, 1998; Duda et al, 2001). Model-based hierarchical clustering has been studied by Fraley (1998) and Vaidyanathan and Dom (2000). Traditional model-based statistical clustering algorithms typically use an iterative refinement procedure to compute a locally optimal clustering solution that maximizes the fit to the data. These algorithms typically require many data scans to converge, and within each scan they require access to every item in the dataset. For large databases, the scans become prohibitively expensive. Bradley and others presented a scalable implementation of the Expectation-Maximization (EM) algorithm (Bradley et al, 1998). The EM method is a traditional distribution-based clustering method; the scalable implementation is based on a decomposition of the basic statistics the algorithm needs, operates within the confines of a limited main memory buffer, and requires a single scan of the database. The main problem with distribution-based methods generally is the distribution assumption itself: in high dimensional data sets it is seldom the case that the data points conform to any particular distribution.

9.3.3 Density-based Clustering Methods
Density-based approaches regard a cluster as a dense region of data objects (Jain and Dubes, 1988; Ester et al, 1996; Agrawal et al, 1998). Density-based clustering can adopt two different strategies: grid-based or neighborhood-based. A grid-based approach divides the data space into a finite set of multidimensional grid cells, calculates the density of each grid cell, and then groups neighboring dense cells into a cluster. Such methods include Grid-Clustering (Schikuta, 1996), CLIQUE (Agarwal et al, 1998), OptiGrid (Hinneburg and Keim, 1999) and ENCLUS (Cheng et al, 1999). The key idea of neighborhood-based approaches is that, given a radius ε (as in DBSCAN (Ester et al, 1996) and OPTICS (Ankerst et al, 1999)) or a side length w (Procopiuc et al, 2002), the neighborhood (either a hyper-sphere of radius ε or a hyper-cube of side length w) of an object has to contain at least a minimum number of objects (MinPts) to form a cluster around this object. Among these density-based methods, OPTICS can perform hierarchical clustering. DBSCAN (Density Based Spatial Clustering of Applications with Noise) (Ester et al, 1996) relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape in spatial databases with noise. The basic idea is that the neighborhood of a given radius (Eps) around each data point has to contain at least a minimum number of points (MinPts) to form a cluster (i.e. the density in the neighborhood has to exceed some threshold value). To find a cluster, DBSCAN starts with an arbitrary point p and retrieves all points density-reachable from p. Thus, it requires two input parameters, Eps and MinPts, which are subjectively decided and thus prone to errors. CLIQUE (Clustering In QUEst) (Agarwal et al, 1998) is another density-based clustering method. It differs from DBSCAN in three aspects. First, it is grid-based, i.e. it divides the data space into grid cells and identifies dense cells, while DBSCAN is circle-based, where dense regions are searched with a moving circle. Second, CLIQUE can automatically search subspaces (composed of a subset of dimensions of the original dataset) that contain significant clusters, while DBSCAN can only search for clusters in the full data space. Third, CLIQUE generates the description of clusters with a Disjunctive Normal Form (DNF) expression, which is of a rule-based form and easy to understand.


OptiGrid (Hinneburg and Keim, 1999) is a density-based clustering method using a grid-based approach. It differs from the CLIQUE method in three aspects. First, it uses a number of contracting projections to determine the optimal cutting planes and achieve an optimal partitioning of the data, so grid cells can be of different sizes. Forming a projection is similar to creating new dimensions in the Principal Component Analysis (PCA) method, and choosing the right projection is not a trivial task and is time-consuming. Second, OptiGrid later integrated visualization techniques to balance automaticity and interactivity (Hinneburg et al, 1999). Third, OptiGrid cannot perform subspace clustering.

9.3.4 Condensation-based Methods for Clustering
Strictly speaking, condensation-based methods cannot perform cluster analysis automatically; rather, they are designed to help other clustering methods deal with large volumes of data. The basic idea of condensation-based methods is to fit the huge data set into main memory by summarizing it with sufficient statistical information. In other words, the clustering algorithm takes the summarization (e.g. mean values, histograms, sums, etc.) of the data as input rather than the data itself. Examples include the BIRCH and STING methods (Zhang et al, 1996; Wang et al, 1997). These methods can be combined with other methods to work with large datasets efficiently. BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) (Zhang et al, 1996) was the first method performing condensation-based clustering. It uses a highly specialized tree structure for the purpose of clustering very large sets of points. The advantage of this structure is that its memory requirements can be adjusted to the available main memory. BIRCH incrementally computes compact descriptions of sub-clusters, called Clustering Features (CF), containing the number of points, the linear sum and the square sum of all points in the cluster. The CF values are sufficient for computing information about sub-clusters such as the centroid, radius and diameter. The clustering features are organized in a balanced tree with branching factor B and a threshold T. A non-leaf node represents a cluster consisting of all the sub-clusters represented by its entries, and the number of these entries should be no more than B. A leaf node has to contain at most L entries, and the diameter of each entry in a leaf node should be no more than T; thus T determines the size of the tree. The first phase of BIRCH performs a linear scan of all data points and builds a CF-tree. In an optional second phase, the CF-tree can be further reduced until a desired number of leaf nodes is reached. In the third phase, an arbitrary clustering algorithm (e.g. CLARANS) is used to cluster the leaf nodes of the CF-tree. STING (Wang et al, 1997) is another condensation-based approach, which divides the data space into rectangular cells and stores statistical parameters (such as the mean, variance, etc.) of the objects in the cells. This information can then be used to efficiently determine the clusters. Nevertheless, condensation-based methods have been designed for low-dimensional data (Hinneburg and Keim, 1999) and generally do not work well with high dimensional data.
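A small sketch of the Clustering Feature (CF) idea used by BIRCH: each sub-cluster is summarized by the triple (N, linear sum, square sum), two CFs can be merged by simple addition, and the centroid and radius can be recovered from the summary alone. The class and the numerical values are illustrative only; this is not an implementation of BIRCH itself.

```python
import numpy as np

class ClusteringFeature:
    """CF = (N, LS, SS): number of points, linear sum and square sum."""
    def __init__(self, points):
        points = np.asarray(points, dtype=float)
        self.n = len(points)
        self.ls = points.sum(axis=0)            # linear sum (vector)
        self.ss = (points ** 2).sum()           # square sum (scalar)

    def merge(self, other):
        """CFs are additive, which is what makes BIRCH incremental."""
        merged = ClusteringFeature(np.empty((0, len(self.ls))))
        merged.n, merged.ls, merged.ss = (self.n + other.n,
                                          self.ls + other.ls,
                                          self.ss + other.ss)
        return merged

    def centroid(self):
        return self.ls / self.n

    def radius(self):
        # root-mean-square distance of the member points from the centroid
        return np.sqrt(max(self.ss / self.n - np.sum(self.centroid() ** 2), 0.0))

cf1 = ClusteringFeature([[1.0, 2.0], [2.0, 3.0]])
cf2 = ClusteringFeature([[1.5, 2.5]])
cf = cf1.merge(cf2)
print(cf.n, cf.centroid(), round(cf.radius(), 3))
```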


9.3.5 Subspace Clustering Methods

Subspace clustering is very important for the effective treatment of high-dimensional datasets. It is not meaningful to look for clusters in a high-dimensional space because (i) the average density of points anywhere in the data space is likely to be very low, and (ii) many dimensions or combinations of dimensions can have noise or values that are uniformly distributed. Thus, either distance-based or density-based clustering methods that use all the dimensions of the data may be ineffective in identifying clusters. Furthermore, clusters may exist in different subspaces comprised of different subsets of attributes (Agrawal et al., 1998). Currently, two approaches are often used to tackle the problem of high dimensionality: (1) requiring the user to specify the subspace (a subset of the dimensions) for cluster analysis, or (2) applying a dimensionality reduction (or multi-dimensional scaling) method to the dataset, e.g., principal component analysis (PCA) or the self-organizing map. These approaches try to transform the original data space into a lower-dimensional space by forming dimensions that are combinations of the original attributes. While such techniques may succeed in reducing the dimensionality of the data, they have severe drawbacks: (i) the new dimensions can be very difficult to interpret, making the resulting clusters hard to understand, and (ii) these techniques cannot identify clusters that exist in different subspaces of the original data space (Agrawal et al., 1998).
CLIQUE is one of the first subspace clustering methods. It adopts a density-based approach to clustering: a cluster is a region that has a higher density of points than its surrounding area. The problem is to automatically identify projections of the input data onto a subset of the attributes such that these projections include regions of high density. To approximate the density of the data points, CLIQUE partitions the data space into a finite set of cells and then finds the number of points that lie inside each cell; each dimension is partitioned into the same number of equal-length intervals. MAFIA (Merging of Adaptive Finite IntervAls) (Goil et al., 1999) uses a grid- and density-based approach for cluster detection in subspaces. MAFIA is a scalable clustering algorithm using adaptive computation of the finite intervals (bins) in each dimension, which are merged to explore clusters in higher dimensions. The bins are determined based on the data distribution in a particular dimension; the size of the bins, and hence the number of bins in each dimension, in turn determines the computation and quality of the clustering. Later, Nagesh et al. (2000) presented the pMAFIA algorithm, a parallel extension of MAFIA.
ORCLUS (arbitrarily ORiented projected CLUSter generation) introduced the problem of generalized projected clusters. A generalized projected cluster is a set E of vectors together with a set D of data points such that the points in D are closely clustered in the subspace defined by the vectors in E. The subspace defined by the vectors in E may have a much lower dimensionality than the full-dimensional space. ORCLUS assumes that the number of clusters (k) and the dimensionality of each subspace (l) are input parameters. The output of the algorithm has two parts.


(i) A (k+1)-way partition of the data, such that the points in each partition element form a cluster; the points in the last partition element are outliers. (ii) A possibly different orthogonal set of vectors Ei for each cluster, such that the points in each cluster lie best in the subspace defined by those vectors, whose cardinality is equal to the user-defined parameter l. ORCLUS uses the Euclidean distance metric. Let v = {v1, v2, …, vd} be a point in a d-dimensional space, and let E = {e1, e2, …, el} be a set of l ≤ d orthogonal vectors in this d-dimensional space. These orthogonal vectors define a projected subspace. The projected energy of a cluster in a subspace E is defined as the mean square distance of the points to the centroid of the cluster in that subspace. The aim of the algorithm is to discover clusters with small projected energy in subspaces of user-specified dimensionality l. ORCLUS first picks k0 initial points (called seeds) from the input data set. It then runs a number of iterations, each of which performs a sequence of merging operations in order to reduce the number of current clusters by the factor α.

If Dmax > T/2, then let xm → z2; otherwise, if Dmax < T/2, terminate the procedure. In other words, if the maximum distance is greater than half the distance between the two closest cluster centers, then xm becomes a new cluster center; otherwise the procedure terminates.

Step 9 Increment the number of cluster centers by 1: Nc = Nc + 1.

Step 10 Reset the scaling distance:

T = [ Σ_{i=1}^{n} Σ_{j=1}^{n} ||z_i − z_j|| ] / [ k(k + 1)/2 ],   where k = n − 1        …(9.14)

Step 11 Return to step 7.

The partitioning of feature space by the maximin algorithm is a significant improvement over the simple clustering discussed previously. The essential idea is to isolate those points that are farthest apart, on the assumption that the greater the distance between points, the more likely they are to belong to different classes (Fig. 9.6a). The algorithm generally works well for circular clusters whose size (radius) is smaller than the separation between clusters. However, the algorithm is very sensitive to the details of the data structure. Consider, for instance, the two-class problem illustrated in Fig. 9.6b. The "true" data clusters are shown by the dark and open circles, while the partitioning of the space is indicated by the numbered cluster centers and the decision boundaries.

Computational complexity is an inherent problem in image processing. Due to the large volume of data and the need to repeat complex computations many times, it is crucial to streamline computations as much as possible. Clever, efficient programming is essential. The maximin algorithm, for example, is computationally demanding. Each time a new cluster center is selected, the feature-space distance must be computed from every point to every cluster center. On the ith pass through an m × n pixel image there will be i + 1 clusters, requiring m × n × i distance computations.
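One way to keep this distance bookkeeping manageable is to cache the distance from every pixel to its nearest existing center, as in the sketch below. This is a minimal NumPy illustration of the maximin selection loop under the assumptions that the threshold rule is the half-of-average-separation test described above and that all parameter names are illustrative.

```python
import numpy as np

def maximin_centers(X, max_centers=20):
    """X: (num_pixels, num_bands) feature vectors. Returns the selected cluster centers."""
    centers = [X[0]]                                   # first seed (arbitrary choice)
    d_near = np.linalg.norm(X - centers[0], axis=1)    # distance to nearest center so far

    while len(centers) < max_centers:
        m = int(np.argmax(d_near))                     # most isolated remaining point
        d_max = d_near[m]

        # Threshold test: compare with half the average separation of existing centers
        C = np.array(centers)
        if len(centers) > 1:
            pair_d = [np.linalg.norm(C[i] - C[j])
                      for i in range(len(C)) for j in range(i)]
            T = np.mean(pair_d)
            if d_max <= T / 2:
                break                                  # no sufficiently isolated point left

        centers.append(X[m])                           # the isolated point becomes a new center
        d_near = np.minimum(d_near, np.linalg.norm(X - X[m], axis=1))

    return np.array(centers)
```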

Fig. 9.6 Partitioning of feature space by the maximin algorithm: (a) clear and separable circular clusters; (b) improper clustering

9.3.7.4 K-means Algorithm

In the previous clustering algorithms, once a point has been selected as a clustering center, it remains a clustering center even if it is a relatively poor representative of its cluster. The K-means algorithm is an improvement in the sense that the clustering is optimized by shifting the cluster centers so as to optimize a performance index. Many variations of the K-means algorithm have been developed; the steps of a basic version are given below.

Step 1 Choose initial cluster centers, z1, z2, z3, …, zK. The only real requirement of the user in the K-means algorithm is the specification of the number of desired clusters. (In practice, the number of desired clusters should be greater than the number of expected classes.) The actual positions of the cluster centers are sometimes specified by the user and sometimes assigned by a particular routine. If selected by a routine, they may be chosen in a variety of ways:
a. by random selection from the image data,
b. by applying some a priori information (e.g., training data),
c. by selecting points based on preliminary calculations (minimum/maximum grey values, variance in each band, localized data density, etc.), or
d. by applying a theoretical principle independent of the actual data.
Step 2 Distribute the samples among the K means. Each sample should be assigned to the class represented by the nearest cluster center, i.e.:

x ∈ S_i(n)  if  ||x − z_i(n)|| ≤ ||x − z_j(n)||        …(9.15)

for all j = 1, 2, 3, …, K, where i ≠ j. Here S_i(n) is the set of samples whose cluster center is z_i(n), and n indicates that this is the nth iteration of the procedure.

Step 3 Compute the new cluster center for each set S_i(n). Find a new value for each z_i. The new cluster center, z_i(n + 1), is the mean of all the points in S_i(n):

z_i(n + 1) = (1/N_i) Σ_{x∈S_i(n)} x        …(9.16)

Step 4 Compare z_i(n) and z_i(n + 1) for all i. Compute the distance between each pair of centers for the consecutive iterations. If there is no substantial change, terminate the procedure; otherwise return to Step 2 for the next iteration. Some possible criteria for termination are:

|z_i(n + 1) − z_i(n)| < T  for all i,   or

Σ_{i=1}^{K} ||z_i(n + 1) − z_i(n)|| < T        …(9.17)
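A compact NumPy rendering of Steps 1 to 4 is sketched below; the initialization by random selection corresponds to option (a) of Step 1, and the tolerance T of Eq. (9.17) is an assumed input rather than a prescribed value.

```python
import numpy as np

def kmeans(X, K, T=1e-3, max_iter=100, rng=np.random.default_rng(0)):
    """X: (num_samples, num_bands). Returns (centers, labels)."""
    # Step 1: initial centers by random selection from the data
    centers = X[rng.choice(len(X), size=K, replace=False)].astype(float)

    for _ in range(max_iter):
        # Step 2: assign each sample to the nearest center (Eq. 9.15)
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)

        # Step 3: recompute each center as the mean of its members (Eq. 9.16)
        new_centers = np.array([X[labels == i].mean(axis=0) if np.any(labels == i)
                                else centers[i] for i in range(K)])

        # Step 4: terminate when the total center shift falls below T (Eq. 9.17)
        if np.linalg.norm(new_centers - centers, axis=1).sum() < T:
            centers = new_centers
            break
        centers = new_centers

    return centers, labels
```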

9.3.7.5 ISODATA Algorithm

The full form of ISODATA is Iterative Self-Organizing Data Analysis Technique A. The ISODATA algorithm is essentially a refinement of the K-means algorithm. The specific refinements are:
i) Clusters that have too few members are discarded.
ii) Clusters that have too many members are split into two new cluster groups.
iii) Clusters that are too large (too dispersed) are split into two new cluster groups.
iv) If two cluster centers are too close together, they are combined.
The user must select more or less arbitrary values for the various process parameters. This can be done iteratively, by choosing an initial parameter set, evaluating the clustering procedure and finally adjusting the parameters. Thus the automated "unsupervised" classifier can be used interactively.

Step 1 Select an initial (seed) set of cluster centers, z1, z2, …, zNc. A set of cluster centers (prototypes) must be chosen for most clustering algorithms. Generally, the analyst has to specify the number of initial cluster centers; some implementations also allow the analyst to specify the locations of the cluster centers. Clustering algorithms tend to be sensitive to the number and location of the initial cluster centers.


Selecting a good set of seed values can not only speed up the clustering process, but can also lead to better results.

Step 2 Specify the process parameters. In principle the analyst is not required to provide any inputs. However, to control the clustering procedure, the user may provide the following:
a) K = number of clusters desired. If specific initial cluster centers are not defined, K can also serve as the number of initial cluster centers. The actual number of clusters returned at the end of clustering will be between K/2 and 2K.
b) Nmin = the minimum number of samples allowed for a viable cluster.
c) σlim = a standard deviation parameter; a measure of the maximum allowable size of a single cluster.
d) Dlump = a lumping parameter. If two cluster centers are closer than this distance, the two clusters are grouped into a single cluster. Only pairs of clusters are lumped together at a time.
e) Nlump = the maximum number of pairs of cluster centers which can be lumped together during one iteration; generally this is set to two.
f) Niter = the maximum number of iterations allowed.

Step 3 Distribute the samples among the cluster centers:

x ∈ S_j  iff  ||x − z_j|| ≤ ||x − z_i||        …(9.18)

where S_j is the set of samples assigned to cluster center z_j; i, j = 1, 2, 3, …, Nc with i ≠ j; and Nc is the number of cluster centers.

Step 4 Discard sample subsets with fewer than Nmin members. If S_j contains N_j members and N_j < Nmin, discard z_j and reduce Nc by 1. The members of the discarded cluster may be dealt with in a number of ways. They may be:
a) incorporated into other clusters,
b) ignored for this iteration, or
c) assigned to an unclassified group and ignored for the rest of the procedure.

Step 5 Compute the new cluster center for each set S_j:

z_j = (1/N_j) Σ_{x∈S_j} x,   j = 1, 2, 3, …, Nc        …(9.19)

Step 6 Compute the cluster size parameter D_j, the average distance of the samples in cluster S_j from the corresponding cluster center z_j:

D_j = (1/N_j) Σ_{x∈S_j} ||x − z_j||,   j = 1, 2, 3, …, Nc        …(9.20)


Step 7 Compute the overall size, D_s, the weighted average distance of the samples from their respective cluster centers:

D_s = (1/N) Σ_{j=1}^{Nc} N_j D_j        …(9.21)

where N is the total number of pixels in all clusters.

Step 8 Branch point
a) If this is the last iteration, set Dlump = 0 and go to Step 12.
b) If Nc ≤ K/2, go to Step 9.
c) On even-numbered iterations, or if Nc ≥ 2K, go to Step 12.
d) Otherwise go to Step 9.

Step 9 Find the standard deviation vector for each cluster:

σ_ij = [ (1/N_j) Σ_{x∈S_j} (x_i − z_ij)² ]^{1/2}        …(9.22)

where j = 1, 2, 3, …, Nc; i = 1, 2, 3, …, n; and n is the number of bands. σ_ij represents the standard deviation of all the samples in S_j along the ith coordinate axis, x_i, of feature space.

Step 10 Find the maximum component for each cluster:

σ_{j,max} = max_i [ σ_ij ]        …(9.23)

This will be the primary measure of the size of the cluster.

Step 11 Test for splitting
IF the standard deviation of any cluster j exceeds the limiting value, i.e. if σ_{j,max} > σ_lim,
AND EITHER D_j > D_s and N_j > 2(K + 1), i.e. the cluster is more dispersed than the average and splitting will not result in exceeding the upper limit on the number of clusters,
OR Nc < K/2, i.e. there are too few clusters,
THEN split the cluster S_j into two new clusters:
z_{Nc+1} = (z_1, z_2, …, z_j + kσ_{j,max}, …, z_n)
z_j = (z_1, z_2, …, z_j − kσ_{j,max}, …, z_n)
and set Nc = Nc + 1. If splitting occurs, go to Step 2; otherwise continue.

Step 12 Compute the pairwise distances D(i, j) between all cluster centers:

D(i, j) = ||z_i − z_j||,   i = 2, …, Nc;  j = 1, 2, …, i − 1        …(9.24)

Step 13 Find and sort the Nlump smallest distances such that D(i, j) < Dlump. That is, find the set of distances D_k(i, j) < Dlump, where k = 1, 2, 3, …, Nlump and D_k(i, j) ≤ D_{k+1}(i, j).

Step 14 Lump clusters. Let D_k(i, j) be the distance between the cluster centers z_i and z_j. If neither z_i nor z_j has been used for lumping during this iteration, merge the two clusters:

z_lump = (N_i z_i + N_j z_j) / (N_i + N_j)        …(9.25)

then delete z_i and z_j and decrement Nc: Nc = Nc − 1.
Note: only pairwise lumping is allowed; if any z_i appears in more than one pair, only the first pair will be merged.

Step 15 Continue or terminate
IF Niter iterations have already been performed, OR IF on the kth iteration Nc(k) = Nc(k − 1) and ||z_i(k) − z_i(k − 1)|| < ε for all i, THEN terminate; ELSE continue.
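The split-and-merge bookkeeping of Steps 11 and 14 can be expressed compactly as below. This is an illustrative NumPy fragment; the displacement factor k and the function names follow the notation above and are assumptions for illustration, not prescribed values.

```python
import numpy as np

def split_cluster(z, sigma, k=0.5):
    """Step 11: displace the center along the band of greatest spread."""
    i_max = int(np.argmax(sigma))              # band with the largest standard deviation
    delta = np.zeros_like(z)
    delta[i_max] = k * sigma[i_max]
    return z + delta, z - delta                # the two new cluster centers

def lump_clusters(z_i, n_i, z_j, n_j):
    """Step 14, Eq. (9.25): merge two centers as a membership-weighted mean."""
    return (n_i * z_i + n_j * z_j) / (n_i + n_j)
```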

9.4 CLASSIFICATION ACCURACY ASSESSMENT

No classification task using remote sensing data is complete until an assessment of accuracy is performed. The analyst and the user of a classified map would like to know how accurately the classes on the ground have been identified on the image. The term accuracy relates to correctness. In digital image processing, accuracy is a measure of the agreement between standard information at a given location and the information at the same location on the classified image. Generally, the accuracy assessment is based on the comparison of two maps: one based on the analysis of the remote sensing data, and a second based on information derived from the actual ground, known as the reference map. This reference map is often compiled from detailed information gathered from different sources and is assumed to be more accurate than the map to be evaluated. The reference map consists of a network of discrete parcels, each designated by a single label. The simplest method of evaluation is to compare the two maps with respect to the areas assigned to each class or category. This yields a report of the areal extents of the classes that agree with each other. The accuracy assessment is presented either as an overall classification accuracy of the map or as a site-specific accuracy. Overall classification accuracy represents the overall agreement between the two maps in terms of the total area of each category; it does not take into account the agreement or disagreement between the two maps at specific locations. The second


form of accuracy measure is site-specific accuracy, which is based upon a detailed assessment of the agreement between the two maps at specific locations. In the case of supervised classification, the simplest strategy is to compare the classified data with the training samples used to generate the training statistics. However, this yields an optimistic value of accuracy; strictly speaking the accuracy should then be 100%, since one would expect all of the training data to be classified correctly. The best approach is to select a set of data of known identity, of which one portion is used as training data and the rest is used to assess the accuracy of the classification.
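A simple way to implement this split of the reference data is sketched below; it assumes scikit-learn is available and uses placeholder arrays for the reference feature vectors and labels (the sizes and names are illustrative only).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder reference data: one feature vector per ground-truth pixel and its
# known class label. In practice these come from field or ancillary information.
X = np.random.rand(1000, 6)          # e.g. six TM bands per reference pixel
y = np.random.randint(0, 6, 1000)    # e.g. six informational classes

# Half of the reference data trains the classifier; the held-out half is used
# only for the accuracy assessment described in this section.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=42)
```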

9.4.1 Error Matrix

The standard form for reporting site-specific accuracy is the error matrix, also known as the confusion matrix or the contingency table. An error matrix not only identifies the overall error for each category, but also the misclassifications for each category. An error matrix consists of an n × n array, where n is the number of classes or categories on the reference map. Here the rows of the matrix represent the true classes, i.e. the information on the reference map, while the columns represent the classes as identified on the classified map. Fig. 9.7 shows a schematic representation of the error matrix, while Table 9.5 shows an actual error matrix. The values in the last column give the total number of true points per class used for assessing the accuracy. Similarly, the total at the bottom of each column gives the number of points/pixels per class in the classified map. The diagonal elements of the error matrix indicate the number of points/pixels identified identically in both the reference and classified maps. The sum of these diagonal elements is entered in the bottom-right element, i.e. the total number of points/pixels correctly classified in both maps. The off-diagonal elements of the error matrix give information on the errors of omission and commission. In Table 9.5, a total of 4421 pixels have been selected from six informational classes for the assessment of classification accuracy. Of these, 4081 pixels have been identified correctly on the classified image, hence an overall accuracy (p′) of 92.3% has been achieved in the classification of the image. This statistic is useful; however, it does not report the confidence of the analyst. For this, a 95% one-tailed lower confidence limit test for a binomial distribution can be determined as given below:

p = p′ − [ 1.645 √( (p′)(q)/n ) + 50/n ]        …(9.26)

where p = the overall accuracy at the 95% confidence level, p′ = the observed overall accuracy, q = 100 − p′, and n = the sample size.
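Equation (9.26) is straightforward to evaluate directly; the sketch below applies it to the worked example (92.3% observed accuracy from 4421 samples), assuming the square-root term is the binomial standard error and 50/n is the small-sample correction.

```python
import math

def lower_confidence_limit(p_obs, n, z=1.645):
    """One-tailed lower limit (in percent) for an observed accuracy p_obs (percent)."""
    q = 100.0 - p_obs
    return p_obs - (z * math.sqrt(p_obs * q / n) + 50.0 / n)

print(round(lower_confidence_limit(92.3, 4421), 1))   # approximately 91.6
```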


If this value of p exceeds the defined criterion at the lower limit, then it is possible to accept the classification at the 95% confidence level. Normally, the defined criterion for the confidence limit is set at 85%. For the example given above, the accuracy at the lower limit is 91.6%, and hence the classification is acceptable, as the classified map has met or exceeded the defined accuracy standard.

Fig. 9.7 Schematic representation of an error matrix, with the reference-image classes (Urban, Crop, Range, Water, Forest, Barren) as rows and the classified-image classes as columns; each row shows errors of omission, each column shows errors of commission, the diagonal contains the correctly identified pixels, and the bottom-right cell holds their total (Source: Campbell, 1986)

Table 9.5 A sample error matrix (Source: Ghosh, 1991)

Actual class \ Predicted class | Heather | Water | Forest 1 | Forest 2 | Bare soil | Pasture | Total
Heather   | 826 | 0   | 0   | 5    | 27  | 0   | 858
Water     | 0   | 878 | 0   | 0    | 0   | 0   | 878
Forest 1  | 0   | 0   | 720 | 183  | 0   | 7   | 910
Forest 2  | 33  | 0   | 21  | 878  | 2   | 0   | 934
Bare soil | 61  | 0   | 0   | 0    | 560 | 0   | 621
Pasture   | 0   | 0   | 0   | 1    | 0   | 219 | 220
Total     | 920 | 878 | 741 | 1067 | 589 | 226 | 4081

The off-diagonal elements of the error matrix provide information on the errors of omission and commission respectively. Errors of omission are found in the off-diagonal elements of each row; for each class the error is computed by taking the sum of all the non-diagonal elements along that row and dividing it by the row total of that class. In the example given above, 32 pixels of the heather class have been omitted on the classified map; of these, 27 pixels have been identified as bare soil and 5 pixels as the forest 2 class. This results in a 3.7% error of omission. Table 9.6 shows the percentage error of omission for all classes. Similarly, the errors of commission are computed by taking the sum of all the non-diagonal elements of the error matrix in the column direction for that class. It is found that the classified image has identified 94 extra pixels as the heather class, out of which 61 pixels and 33 pixels actually belong to the bare soil


and forest 2 classes, respectively. This results in a 10.2% error of commission. Table 9.6 shows the percentage error of commission for all classes.

Table 9.6 Percentage errors of omission and commission (Source: Ghosh, 1991)

Class     | Omission: pixels omitted | Total pixels | % error | Commission: pixels committed | Total pixels | % error
Heather   | 32  | 858 | 3.7  | 94  | 920  | 10.2
Water     | 0   | 878 | 0    | 0   | 878  | 0
Forest 1  | 190 | 910 | 20.9 | 21  | 741  | 2.8
Forest 2  | 56  | 934 | 6.0  | 189 | 1067 | 17.7
Bare soil | 61  | 621 | 9.8  | 29  | 589  | 4.9
Pasture   | 1   | 220 | 0.5  | 7   | 226  | 3.1
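Given an error matrix with reference classes as rows and classified classes as columns, these percentages can be derived directly, as in the plain NumPy sketch below (the matrix shown is that of Table 9.5).

```python
import numpy as np

# Rows: actual class, columns: predicted class (order: Heather, Water,
# Forest 1, Forest 2, Bare soil, Pasture), as in Table 9.5.
E = np.array([[826,   0,   0,   5,  27,   0],
              [  0, 878,   0,   0,   0,   0],
              [  0,   0, 720, 183,   0,   7],
              [ 33,   0,  21, 878,   2,   0],
              [ 61,   0,   0,   0, 560,   0],
              [  0,   0,   0,   1,   0, 219]])

diag = np.diag(E)
omission_pct   = 100.0 * (E.sum(axis=1) - diag) / E.sum(axis=1)   # per reference class
commission_pct = 100.0 * (E.sum(axis=0) - diag) / E.sum(axis=0)   # per classified class
overall_pct    = 100.0 * diag.sum() / E.sum()                     # about 92.3 here
```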

While a single accuracy measure is adequate to describe the confidence limit of the entire map, it is also necessary to determine the accuracy of the individual classes. For this, a two-tailed 95% confidence limit test for each category can be determined as:

p = p′ ± [ 1.96 √( (p′)(q)/n ) + 50/n ]        …(9.27)

Table 9.7 shows the 95% confidence limits for the errors of omission and commission.

Table 9.7 Confidence limits of omission and commission (Source: Ghosh, 1991)

Class     | Points correct | Omission: n | % correct | 95% confidence limits | Commission: n | % correct | 95% confidence limits
Heather   | 826 | 858 | 96.3  | 95.0 – 97.6  | 920  | 89.8  | 87.8 – 91.8
Water     | 878 | 878 | 100.0 | 99.4 – 100.0 | 878  | 100.0 | 99.4 – 100.0
Forest 1  | 720 | 910 | 79.1  | 76.4 – 81.8  | 741  | 97.2  | 95.9 – 98.5
Forest 2  | 878 | 934 | 94.0  | 92.4 – 95.6  | 1067 | 82.3  | 80.0 – 84.6
Bare soil | 560 | 621 | 90.2  | 87.8 – 92.6  | 589  | 95.1  | 93.3 – 96.7
Pasture   | 219 | 220 | 99.5  | 98.3 – 100.0 | 226  | 96.9  | 94.4 – 99.1

Using 85% as the criterion, it is seen that Forest 1 has failed the test, as both the upper and lower limits of accuracy at the 95% confidence level are less than 85%. Similarly, when the errors of commission are evaluated, it is found that Forest 2 fails to meet the criterion. This is also evident from Table 9.5: 183 pixels have not been identified as Forest 1 but have instead been assigned to the Forest 2 class. This implies that even though the overall classification accuracy has exceeded the defined criterion of acceptability, the training samples of the classes Forest 1 and Forest 2 have not been able to provide the correct information to the


classification process, and hence these training samples have to be collected with caution.

The above procedure for determining the accuracy of classification is highly dependent upon the training samples used for the classification and for the assessment of accuracy. In order to assess the agreement between two maps, the Kappa coefficient (κ) is used. Kappa is a measure of the difference between the observed agreement between the two maps (as reported by the overall accuracy) and the agreement that might be contributed solely by chance matching of the two maps. It attempts to provide a measure of agreement that is adjusted for chance and is expressed as follows:

κ = (Observed − Expected) / (1 − Expected)        …(9.28)

Observed is the overall accuracy, while Expected is an estimate of the chance agreement corresponding to the observed percentage correct. Expected is computed by first taking the products of the row and column totals to estimate the number of pixels assigned to each element of the matrix if pixels were assigned to each class by chance. Table 9.8 shows a sample computation of κ for the error matrix given in Table 9.5.

Table 9.8 Sample computation of κ

          | Heather | Water  | Forest 1 | Forest 2 | Bare soil | Pasture | Row total
Heather   | 789360  | 753324 | 635778   | 915486   | 505362    | 193908  | 858
Water     | 807760  | 770884 | 650598   | 936826   | 517142    | 198428  | 878
Forest 1  | 837200  | 798980 | 674310   | 970970   | 535990    | 205660  | 910
Forest 2  | 859280  | 820052 | 692094   | 996578   | 550126    | 211084  | 934
Bare soil | 571320  | 545238 | 460161   | 662607   | 365769    | 140346  | 621
Pasture   | 202400  | 193160 | 163020   | 234740   | 129580    | 49720   | 220
Col total | 920     | 878    | 741      | 1067     | 589       | 226     |

Total of diagonal elements = 3646621
Total of all elements = 28866023
Expected agreement by chance = (sum of diagonal elements) / (total of all elements) = 0.126

κ̂ = (0.916 − 0.126) / (1 − 0.126) = 0.904

The value κ̂ = 0.904 means that the classification has achieved an accuracy that is 90% better than would be expected from a random assignment of pixels to classes. Apart from this, many other measures have been suggested; Table 9.9 lists further accuracy measures for the assessment of classification.
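The same chance-corrected agreement can be computed directly from an error matrix; the sketch below follows Eq. (9.28), estimating the expected agreement from the row and column marginals in the standard way. Applied to the Table 9.5 matrix it gives a value close to the 0.904 reported above.

```python
import numpy as np

def kappa(E):
    """Kappa coefficient for an error matrix E (reference rows, classified columns)."""
    n = E.sum()
    observed = np.trace(E) / n                              # overall accuracy
    expected = (E.sum(axis=1) @ E.sum(axis=0)) / (n * n)    # chance agreement from marginals
    return (observed - expected) / (1.0 - expected)
```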


Table 9.9 List of accuracy measures

Measure (abbreviation) | Explanation | Formula | Base reference(s)
Overall accuracy (OA) | Percent of samples correctly classified | OA = (1/N) Σ_{i=1}^{q} n_ii | Story and Congalton (1986)
User's accuracy (UA) | Index of individual class accuracy computed from the row total | UA_i = n_ii / N_i | Story and Congalton (1986)
Producer's accuracy (PA) | Index of individual class accuracy computed from the column total | PA_i = n_ii / M_i | Story and Congalton (1986)
Average accuracy (AA_u, AA_p) | Average of all the individual user's (producer's) accuracies | AA_u = (1/q) Σ_{i=1}^{q} n_ii/N_i;  AA_p = (1/q) Σ_{i=1}^{q} n_ii/M_i | Fung and LeDrew (1988)
Combined accuracy (CA_u, CA_p) | Average of the overall accuracy and the average user's (producer's) accuracy | CA_u = ½[OA + AA_u];  CA_p = ½[OA + AA_p] | Fung and LeDrew (1988)
Kappa coefficient of agreement (K) | Proportion of agreement after removing the proportion of agreement by chance | K = (P_o − P_e)/(1 − P_e) | Congalton et al. (1983)
Weighted Kappa (K_w) | Proportion of weighted disagreement corrected for chance | K_w = 1 − (Σ_ij v_ij P_o,ij)/(Σ_ij v_ij P_e,ij) | Rosenfield and Fitzpatrick-Lins (1986)
Conditional Kappa (K_+i) | Conditional Kappa computed from the ith column of the error matrix (producer's) | K_+i = (P_o(+i) − P_e(+i))/(1 − P_e(+i)) |
Tau coefficient (T_e) | Tau for classifications based on equal probabilities of class membership | T_e = (P_o − 1/q)/(1 − 1/q) | Foody (1992), Ma and Raymond (1995)
Tau coefficient (T_p) | Tau for classifications based on unequal probabilities of class membership | T_p = (P_o − P_r)/(1 − P_r) | Naesset (1996)
Conditional Tau (T_i+, T_+i) | Conditional Tau computed from the ith row (user's) or the ith column (producer's) | T_i+ = (P_o(i+) − P_i)/(1 − P_i);  T_+i = (P_o(+i) − P_i)/(1 − P_i) |

Here N is the total number of samples, q the number of classes, n_ii the diagonal elements of the error matrix, and N_i and M_i the row and column totals for class i.

9.5 ILLUSTRATION EXAMPLE

In this section, the various processes and steps discussed and outlined in Section 9.2 are adopted to perform the classification of the selected image. After the image classification has been carried out, it is subjected to an assessment of accuracy as per the discussion given in Section 9.4.

9.5.1 Selection of Training Dataset

The first step in image classification using the supervised classification method is the selection of training areas on the image. Since the area is well known to the author, the training areas have been selected directly from the FCC image generated using TM bands 4, 3 and 2. It is found that about 10 distinct classes can be identified in the image. These classes, with the number of sample points selected per class in parentheses, are:
i) Dry Sand (530)
ii) Agriculture (386)
iii) High Density Forest (5629)
iv) Medium Density Forest (961)
v) Light Density Forest (606)
vi) New Forest (1747)
vii) Urban (1673)
viii) Water (579)
ix) Shallow Water (177)
x) Wet Sand (436)
Tables 9.10 to 9.19 show the statistics of the training data selected for each class. On the basis of this information, histograms of each class in the different bands have been plotted (Fig. 9.8). In general, it is observed that the histogram plots for each class in the different bands are more or less unimodal in shape. Thus, for the present, it can be said that the training data are acceptable.

9.5.2 Feature Selection

The next step in the classification procedure is to perform feature selection based on the training data selected. The two common methods of feature selection are the Transformed Divergence, which is a scaled version of the Divergence, and the Jeffries–Matusita Distance, a scaled version of Bhattacharyya's Distance. In this example, the Transformed Divergence has been used to determine the separability between the different classes, for 3-band and 4-band combinations (Appendices 9-I and 9-II respectively). For a 3-band combination, it is found that TM bands 3, 4 and 5 give the highest average Transformed Divergence value of 1988. All the class combinations have a Transformed Divergence value of more than 1800, except for the class pair of medium density forest and agriculture, which has a value of 1777. This indicates a likelihood of confusion in classifying either of these classes. For a 4-band combination, TM bands 1, 3, 4 and 5 give the highest average Transformed Divergence value of 1991, and all class pairs have a divergence value greater than 1800. This indicates that for this combination of bands there is no overlap between the classes and that the training data selected are distinct in nature.
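The Transformed Divergence between two class signatures can be computed from their training statistics (mean vector and covariance matrix per class); the sketch below uses the common scaling TD = 2000·[1 − exp(−D/8)], which is a standard convention and an assumption about the software used here rather than a statement of its internals.

```python
import numpy as np

def divergence(m1, C1, m2, C2):
    """Divergence between two normal class signatures (mean vector, covariance matrix)."""
    C1i, C2i = np.linalg.inv(C1), np.linalg.inv(C2)
    dm = (m1 - m2).reshape(-1, 1)
    term1 = 0.5 * np.trace((C1 - C2) @ (C2i - C1i))
    term2 = 0.5 * np.trace((C1i + C2i) @ dm @ dm.T)
    return term1 + term2

def transformed_divergence(m1, C1, m2, C2):
    """Scaled to the 0-2000 range used in Appendices 9-I and 9-II."""
    return 2000.0 * (1.0 - np.exp(-divergence(m1, C1, m2, C2) / 8.0))
```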

Fig. 9.8 Histogram plots of the training data for the different classes in the different TM bands

Table 9.10 Statistics for Dry Sand

Layer Minimum Maximum Mean Std. Dev

1 90.000 122.000 104.985 4.080

Layer 1 2 3 4 5 6

1 16.650 16.706 21.533 9.714 13.946 18.071

Total No. of sample points – 530 2 3 4 81.000 95.000 89.000 121.000 142.000 111.000 101.685 121.108 97.870 4.911 6.759 3.693 Variance – Covariance Matrix 2 3 4 24.118 29.809 14.328 19.867 25.354

45.680 20.485 28.490 34.853

13.641 16.153 17.607

5 111.000 149.000 128.649 5.427

7 96.000 140.000 120.696 6.253

5

6

29.457 30.133

39.097

Table 9.11. Statistics for Agriculture

Layer Minimum Maximum Mean Std. Dev Layer 1 2 3 4 5 6

1 58.000 72.000 65.604 2.504 1 6.271 2.759 4.759 -3.285 0.703 3.276

Total No. of sample points – 386 2 3 4 46.000 34.000 64.000 60.000 60.000 151.000 52.609 43.047 103.728 2.733 4.205 16.048 Variance – Covariance Matrix 2 3 4 7.470 5.842 -12.011 7.962 7.260

17.681 -33.250 13.546 19.552

257.534 -55.420 -63.435

5 43.000 87.000 58.756 7.597

7 17.000 74.000 31.642 7.014

5

7

57.717 41.897

49.191

Table 9.12 Statistics for High density forest

Layer Minimum Maximum Mean Std. Dev

1 50.000 66.000 58.491 2.486

Layer 1 2 3 4 5 6

1 6.178 3.496 3.401 4.274 9.700 5.926

Total No. of sample points – 5629 2 3 4 34.000 23.000 45.000 49.000 42.000 95.000 41.728 33.621 69.498 2.208 2.608 5.156 Variance – Covariance Matrix 2 3 4 4.877 3.482 5.397 10.307 6.144

6.801 4.855 10.751 6.401

26.584 17.305 7.828

5 27.000 65.000 48.948 6.442

7 14.000 42.000 27.080 4.047

5

7

41.503 22.996

16.381

Table 9.13 Statistics for Medium density forest

Layer Minimum Maximum Mean Std. Dev

1 55.000 67.000 60.832 1.949

Layer 1 2 3 4 5 6

1 3.800 1.794 1.651 3.587 2.085 1.062

Total No. of sample points – 961 2 3 4 39.000 28.000 69.000 55.000 46.000 117.000 46.967 37.657 91.788 2.456 2.678 6.823 Variance – Covariance Matrix 2 3 4 6.034 3.752 10.513 6.197 2.659

7.172 6.551 6.117 3.042

46.551 13.935 3.106

5 45.000 75.000 59.518 4.252

7 22.000 42.000 30.436 2.755

5

7

18.079 8.161

7.590

Table 9.14 Statistics for Light dense forest

Layer Minimum Maximum Mean Std. Dev

1 52.000 71.000 61.089 3.092

Layer 1 2 3 4 5 6

1 9.562 9.456 18.951 14.060 28.671 20.120

Total No. of sample points – 600 2 3 4 39.000 30.000 69.000 65.000 79.000 119.000 47.871 45.993 92.139 4.089 8.261 9.478 Variance – Covariance Matrix 2 3 4 16.717 30.597 25.991 48.522 32.456

68.248 48.652 102.414 68.959

89.832 93.082 52.373

5 40.000 122.000 71.150 13.719

7 18.000 74.000 40.061 9.169

5

7

188.204 120.513

84.074

Table 9.15 Statistics for New forest

Layer Minimum Maximum Mean Std. Dev

1 57.000 73.000 65.614 2.212

Layer 1 2 3 4 5 6

1 4.893 2.260 4.585 1.906 6.532 6.452

Total No. of sample points – 1747 2 3 4 45.000 37.000 64.000 58.000 62.000 94.000 51.489 49.915 79.456 2.146 3.663 3.864 Variance – Covariance Matrix 2 3 4 4.605 5.150 3.177 8.241 7.552

13.417 3.648 16.766 15.592

14.933 6.098 3.149

5 44.000 102.000 81.958 6.511

7 24.000 73.000 49.732 5.619

5

7

42.391 32.520

31.573

Table 9.16 Statistics for Urban

Layer Minimum Maximum Mean Std. Dev

1 59.000 104.000 74.555 4.659

Layer 1 2 3 4 5 6

1 21.708 22.974 34.097 15.419 26.486 26.988

Total No. of sample points – 1673 2 3 4 42.000 35.000 31.000 97.000 114.000 88.000 58.988 61.527 53.763 5.535 8.447 6.607 Variance – Covariance Matrix 2 3 4 30.631 43.738 22.918 35.993 34.776

71.346 36.592 58.599 55.956

43.652 45.467 31.849

5 26.000 98.000 62.356 10.062

7 15.000 99.000 53.393 9.481

5

7

101.240 88.640

89.889

Table 9.17 Statistics for Water

Layer Minimum Maximum Mean Std. Dev

1 59.000 76.000 65.485 2.489

Layer 1 2 3 4 5 6

1 6.195 3.740 4.481 0.611 -0.092 0.515

Total No. of sample points – 579 2 3 4 5 44.000 29.000 14.000 1.000 85.000 52.000 56.000 56.000 50.475 37.111 20.789 13.774 2.539 4.652 3.954 3.816 Variance – Covariance Matrix 2 3 4 5 6.447 7.325 1.671 0.815 1.100

21.642 8.679 6.364 4.974

15.630 12.028 7.230

14.563 7.980

7 2.000 30.000 12.048 2.764 7

7.641

Table 9.18 Statistics for Shallow Water

Layer Minimum Maximum Mean Std. Dev Layer 1 2 3 4 5 6

1 69.000 85.000 75.175 3.240 1 10.498 9.236 14.960 20.019 39.402 31.052

Total No. of sample points – 177 2 3 4 5 55.000 50.000 31.000 11.000 73.000 81.000 70.000 83.000 62.271 62.192 49.537 47.525 3.583 5.906 9.003 17.161 Variance – Covariance Matrix 2 3 4 5 12.835 17.635 24.496 45.521 35.497

34.883 42.811 80.177 60.981

81.057 140.069 103.471

294.501 215.248

7 8.000 71.000 35.864 12.906 7

166.572

Table 9.19 Statistics for Wet sand

Layer Minimum Maximum Mean Std. Dev

1 70.000 113.000 87.218 6.754

Layer 1 2 3 4 5 6

1 45.617 48.155 61.878 36.154 66.146 71.831

Total No. of sample points – 436 2 3 4 57.000 54.000 37.000 100.000 117.000 90.000 75.618 83.428 64.957 7.531 9.965 6.890 Variance – Covariance Matrix 2 3 4 56.722 71.393 43.187 79.183 84.641

99.310 60.858 112.479 115.504

47.479 83.559 80.900

5 22.000 123.000 76.518 13.207

6 28.000 113.00 67.950 13.019

5

6

174.436 165.998

169.489

9.5.3 Image Classification

Based on this information, the image data can be classified using the best 3-band and 4-band combinations with the Minimum Distance to Means and Maximum Likelihood classifiers. Figures 9.9 and 9.10 show the classified images for the 3-band and 4-band combinations for these two classifiers. Table 9.20 shows the classification results (number of pixels per class) of the two classifiers for the 3-band and 4-band combinations.

Table 9.20 Classification results using different classifiers and band combinations

Class        | Minimum Distance to Means: 3 Band | 4 Band  | MLC: 3 Band | 4 Band
Urban        | 159912  | 154928  | 341884  | 279977
L.D. forest  | 74590   | 96981   | 165268  | 149108
H.D. forest  | 551096  | 531717  | 370740  | 382142
M.D. forest  | 143211  | 157067  | 67548   | 68530
Dry sand     | 55771   | 51647   | 26152   | 23602
Water        | 45087   | 39678   | 9361    | 9892
Sh. water    | 20794   | 21189   | 30974   | 72807
Wet sand     | 143269  | 158722  | 62577   | 69429
New forest   | 316833  | 299019  | 142397  | 151494
Agriculture  | 10874   | 10484   | 304531  | 314451
Total        | 1521432 | 1521432 | 1521432 | 1521432

A visual comparison of the results shows that, for a given classifier, there is only a small difference in the number of pixels assigned to a class irrespective of the band combination. However, there are large differences in the numbers of pixels per class between the two classifiers. To the author's knowledge, this area has two major cover types, forest and agriculture. It is observed that, for the different types of forest (high density, medium density, light density and new forest), Minimum Distance to Means identifies fewer points than the Maximum Likelihood classifier (MLC); the intra-class distribution is not correct in the case of the Minimum Distance to Means classifier, whereas Maximum Likelihood gives the better result. It is also observed that MLC identifies the urban class in excess, while fewer pixels are identified as wet sand; the reason could be attributed to the similarity in the signature characteristics of these two classes. At this stage no concrete decision regarding the acceptability of the results can be made, so the best course is to perform an assessment of accuracy.

Fig. 9.9 Classified images using the Minimum Distance to Means classifier (4-band and 3-band combinations)

Fig. 9.10 Classified images using the Maximum Likelihood classifier (4-band and 3-band combinations)

9.5.4 Assessment of Accuracy

For the assessment of accuracy, the analyst may first of all like to test the training data itself. Table 9.21 shows the error matrix obtained using the training data with the MLC. An overall accuracy of 89.23% is found. The producer's and user's accuracy of each class is given in Table 9.22, from which it can be seen that the producer's and user's accuracies for medium dense forest, light dense forest, wet sand, shallow water and agriculture are below 85%. From Table 9.21, it can be seen that 110 pixels of medium dense forest are assigned to agriculture. Similarly, 119 pixels belonging to agriculture are classified as 72 pixels of medium dense forest and 47 pixels of light dense forest, and 88 new forest pixels are identified as light dense forest. In the case of urban, 355 pixels are misclassified, 132 as wet sand and 223 as shallow water. The largest confusion exists for light dense forest: here 303 pixels have been misclassified, 173 to medium dense forest and 111 to new forest. On this basis it can be stated that the training samples of light dense forest, shallow water, agriculture and urban should be selected with caution and care.

0 0 0 0 18 0 0 11 602

0 0 0 23 0 0 0 1 554

Overall Accuracy = 89.04%

0 4

0 573

530 0

Dry Sand Water High dense forest Medium dense forest Light dense forest Wet sand Shallow water Agriculture New forest Urban Column Total

5664

0 11 26

0

0

43

35

5545

High dense forest

Water

Dry Sand

Reference Data

1118

72 8 0

0

0

173

781

84

0 0

Medium dense forest

443

47 88 0

0

0

273

35

0

0 0

Light dense forest

525

0 0 132

7

386

0

0

0

0 0

333

0 0 223

100

8

0

0

0

0 2

Classified Data Wet Shallow sand Water

371

255 0 0

0

0

6

110

0

0 0

Agriculture

1772

12 1640 9

0

0

111

0

0

0 0

New forest

Table 9.21 ERROR MATRIX of 4-band combination using Maximum Likelihood Classifier

1369

0 0 1271

52

46

0

0

0

0 0

Urban

11354

386 1747 1673

177

463

606

961

5629

530 579

Row Total

Table 9.22 Accuracy of Classification for 4-band combination using Maximum Likelihood Classifier

Class name          | Reference total | Classified total | No. of pixels correct | Producer's accuracy (%) | User's accuracy (%)
Dry sand            | 530   | 554   | 530   | 100   | 95.67
Water               | 579   | 602   | 573   | 98.96 | 95.18
High dense forest   | 5629  | 5664  | 5545  | 98.50 | 97.90
Medium dense forest | 961   | 1118  | 781   | 81.27 | 69.86
Light dense forest  | 606   | 443   | 273   | 45.05 | 61.63
Wet sand            | 463   | 525   | 386   | 83.37 | 73.52
Shallow water       | 177   | 333   | 100   | 56.50 | 30.03
Agriculture         | 386   | 371   | 255   | 66.06 | 68.73
New forest          | 1747  | 1772  | 1640  | 93.88 | 92.55
Urban               | 1673  | 1369  | 1271  | 75.97 | 92.84
Total               | 12751 | 12751 | 11354 |       |

1 0 0 1 17 0 0 13 605

0 0 0 18 0 0 0 1 549

Overall Accuracy = 88.48%

0 573

530 0

Dry Sand Water High dense forest Medium dense forest Light dense forest Wet sand Shallow water Agriculture New forest Urban Column Total

Water

Dry Sand

Reference Data

5646

0 13 17

0

0

38

35

5541

High dense forest 0 2

1140

90 8 0

0

0

188

767

87

Medium dense forest 0 0

425

43 47 0

0

0

289

46

0

Light dense forest 0 0

554

0 0 134

9

393

0

0

0

18 0

416

0 0 317

87

9

0

0

0

0 3

Classified Data Wet Shallow sand Water

364

243 0 0

0

0

8

113

0

0 0

Agriculture

1781

10 1678 1

0

0

83

0

0

0 0

New forest

1289

0 10 1181

64

42

0

0

0

1 1

Urban

Table 9.23 ERROR MATRIX of 4-band combination using Minimum Distance to Mean Classifier

11282

386 1747 1673

177

536

606

961

5629

530 579

Row Total

Table 9.24 Accuracy of Classification for 4-band combination using Minimum Distance to Mean Classifier

Class name          | Reference total | Classified total | No. of pixels correct | Producer's accuracy (%) | User's accuracy (%)
Dry sand            | 530   | 549   | 530   | 100   | 95.84
Water               | 579   | 605   | 573   | 98.96 | 95.02
High dense forest   | 5629  | 5646  | 5541  | 98.42 | 98.16
Medium dense forest | 961   | 1140  | 767   | 78.77 | 66.64
Light dense forest  | 606   | 425   | 289   | 47.53 | 67.61
Wet sand            | 463   | 536   | 393   | 84.88 | 70.94
Shallow water       | 177   | 416   | 87    | 49.15 | 20.71
Agriculture         | 386   | 364   | 243   | 60.10 | 63.32
New forest          | 1747  | 1781  | 1678  | 95.99 | 94.43
Urban               | 1673  | 1289  | 1181  | 69.58 | 91.51
Total               | 12751 | 12751 | 11282 |       |

0 0 0 0 0 0 0 0 568

0 0 0 3 0 0 0 0 533

Overall Accuracy = 90.71 %

0 0

0 568

530 0

Dry Sand Water High dense forest Medium dense forest Light dense forest Wet sand Shallow water Agriculture New forest Urban Column Total

5514

4 0 2

0

0

28

26

5454

High dense forest

Water

Dry Sand

Reference Data

1146

66 1 0

0

0

143

833

103

0 0

Medium dense forest

574

28 47 0

0

0

382

47

70

0 0

Light dense forest

544

0 0 124

5

415

0

0

0

0 0

386

0 0 234

134

15

0

0

0

0 0

Classified Data Wet Shallow sand Water

407

285 29 11

0

0

29

53

0

0 0

Agricul

1699

5 1666 0

0

0

24

2

0

0 0

New forest

1380

2 2 1300

38

30

0

0

2

0 0

Urban

Table 9.25 ERROR MATRIX of 3-band combination using Maximum Likelihood Classifier

11567

386 1747 1673

177

463

606

961

5629

530 579

Row Total

Dry Sand Water High dense forest Medium dense forest Light dense forest Wet sand Shallow water Agriculture New forest Urban

Class Name

Reference Total 530 579 5629 961 606 463 177 386 1747 1673 12751

Classified Total 533 568 5514 1146 574 544 386 407 1699 1380 12751

No of Pixels correct 530 568 5454 833 382 415 134 285 1666 1300 11567

Producer’s Accuracy 100.00 98.10 96.89 86.68 63.04 89.63 75.71 73.83 95.36 77.70

User’s Accuracy 99.44 100.00 98.91 72.69 66.55 76.29 34.90 70.02 98.06 94.20

Table 9.26 Accuracy of Classification for 3-band combination using Maximum Likelihood Classifier

1 0 0 1 17 0 0 13 603

0 0 4 18 0 0 0 1 553

Overall Accuracy = 88.16 %

0 2

0 573

530 0

Dry Sand Water High dense forest Medium dense forest Light dense forest Wet sand Shallow water Agriculture New forest Urban Column Total

5644

0 14 14

0

0

38

36

5540

High dense forest

Water

Dry Sand

Reference Data

1136

98 8 0

0

0

185

757

88

0 0

Medium dense forest

426

48 47 0

0

0

288

43

0

0 0

Light dense forest

536

0 0 152

9

393

0

0

0

0 0

416

0 0 317

87

9

0

0

0

0 3

Classified Data Wet Shallow sand Water

364

232 0 0

0

0

12

123

0

0 0

Agriculture

1781

10 1677 10

0

0

79

2

0

0 0

New forest

1289

0 1 1164

64

42

0

0

0

0 1

Urban

Table 9.27 ERROR MATRIX of 3-band combination using Minimum Distance to Mean Classifier

11241

364 1747 1673

177

463

606

961

5629

530 579

Row Total

Dry Sand Water High dense forest Medium dense forest Light dense forest Wet sand Shallow water Agriculture New forest Urban

Class Name

Reference Total 530 579 5629 961 606 463 177 386 1747 1673 12751

Classified Total 549 605 5646 1140 425 536 416 364 1781 1289 12751

No of Pixels correct 530 573 5540 757 288 393 87 232 1677 1164 11241

Producer’s Accuracy 100.00 98.96 98.42 78.77 47.52 84.88 49.15 60.10 95.99 69.58

User’s Accuracy 96.54 94.55 98.12 66.40 67.76 73.32 20.19 63.74 64.16 90.30

Table 9.28 Accuracy of Classification for 3-band combination using Minimum Distance to Mean Classifier

Overall Classification Accuracy =

Column Total

Urban Light dense forest High dense forest Med dense forest Dry sand Water Shallow water Wet sand New forest Agriculture

Reference Data

0 0 0 0 0 0 1 0

0 0 0 0 0 16 0 0

72.66%

48

18

17

1

31

Light dense forest 0

Urban

78

0 0 1

0

0 0

2

69

6

High dense forest 0

9

0 0 0

0

0 0

9

0

0

Med dense forest 0

3

0 0 0

0

3 0

0

0

0

0

3

0 0 0

0

0 2

0

0

1

0

Classified Data Dry Water sand

26

20 1 0

4

0 0

0

0

0

1

Shallow water

8

8 0 0

0

0 0

0

0

0

0

Wet sand

17

0 6 0

0

0 0

0

0

11

0

New forest

Table 9.29 Error Matrix of 3-band combination using MLC

46

1 2 38

0

0 0

1

0

4

0

Agriculture

187

45 10 39

4

3 3

12

78

40

32

Row Total

Table 9.30 Accuracy of Classification for 3-band combination using Maximum Likelihood Classifier

Class name         | Reference total | Classified total | Number correct | Producer's accuracy (%) | User's accuracy (%) | Kappa
Urban              | 32  | 48  | 31  | 96.88  | 64.58  | 0.59
Light dense forest | 40  | 18  | 17  | 42.50  | 94.44  | 0.93
High dense forest  | 69  | 77  | 69  | 100.00 | 88.31  | 0.84
Med dense forest   | 12  | 9   | 9   | 75.00  | 100.00 | 1.00
Dry sand           | 3   | 3   | 3   | 100.00 | 100.00 | 1.00
Water              | 2   | 3   | 2   | 100.00 | 66.67  | 0.66
Shallow water      | 4   | 26  | 4   | 100.00 | 15.38  | 0.14
Wet sand           | 45  | 8   | 8   | 17.78  | 100.00 | 1.00
New forest         | 10  | 17  | 6   | 60.00  | 35.29  | 0.33
Agriculture        | 39  | 46  | 38  | 97.44  | 82.61  | 0.79
Total              | 256 | 256 | 187 |        |        |
Overall Kappa Statistics = 0.6766

Overall Classification Accuracy =

Column Total

0 0 0 0

1 4 0 0

84.38%

29

1 0

0 0

55

0

2

63

0 0 1

0

0 0

6

6

0 0 0

0

0 0

6

0

0

0

2 53

0

0

3

28

2

6

0 0 0

0

6 0

0

0

0

0

0

2

0 0 0

0

0 2

0

0

0

0

Classified Data Dry Water sand

46

Urban Light dense forest High dense forest Med dense forest Dry sand Water Shallow water Wet sand New forest Agriculture

Med dense forest

High dense forest

Light dense forest

Urban

Reference Data

14

3 0 0

11

0 0

0

0

0

0

Shallow water

8

8 0 0

0

0 0

0

0

0

0

Wet sand

24

1 13 5

0

0 0

0

0

5

0

New forest

Table 9.31 Error Matrix of 4-band combination using MLC

49

0 0 43

0

0 0

0

0

6

0

Agriculture

216

16 13 49

12

6 2

13

55

44

46

Row Total

Table 9.32 Accuracy of Classification for 4-band combination using Maximum Likelihood Classifier

Class name         | Reference total | Classified total | Number correct | Producer's accuracy (%) | User's accuracy (%) | Kappa
Urban              | 46  | 55  | 46  | 100.00 | 83.64  | 0.80
Light dense forest | 44  | 29  | 28  | 63.64  | 96.55  | 0.96
High dense forest  | 55  | 63  | 53  | 96.36  | 84.13  | 0.80
Med dense forest   | 13  | 6   | 6   | 46.15  | 100.00 | 1.00
Dry sand           | 6   | 6   | 6   | 100.00 | 100.00 | 1.00
Water              | 2   | 2   | 2   | 100.00 | 100.00 | 1.00
Shallow water      | 12  | 14  | 11  | 91.67  | 78.57  | 0.78
Wet sand           | 16  | 8   | 8   | 50.00  | 100.00 | 1.00
New forest         | 13  | 24  | 13  | 100.00 | 54.17  | 0.52
Agriculture        | 49  | 49  | 43  | 87.76  | 87.76  | 0.85
Total              | 256 | 256 | 216 |        |        |
Overall Kappa Statistics = 0.8407

Overall Classification Accuracy =

Column Total

0 0 0 0

0 5 1 0

78.52%

10

0 0

0 3

0 0

0 0

33

0

1

0

0

1

99

1 2 0

0

0 0

3

24

0 0 0

0

0 0

24

1

0

0

0

1 92

0

0

1

10

1 0 0

0

9 0

5

0 0 0

0

0

3

0 0 0

2

0

0

1

10

0

0

0

0

0

0

Classified Data Dry Water Shallow sand water

25

Urban Light dense forest High dense forest Med dense forest Dry sand Water Shallow water Wet sand New forest Agriculture

Med dense forest

High dense forest

Light dense forest

Urban

Reference Data

22

49

1

0 0 1

5 13 0 22 0 0

0

0

0

0 0

0

0

0

0

Agriculture

0 0

1

1

29

0

New forest

0 0

0

0

0

0

Wet sand

Table 9.33 Error Matrix of 4-band combination using Minimum Distance to Means

201

34 16 1

2

9 5

30

95

40

26

Row Total

Table 9.34 Accuracy of Classification for 4-band combination using Minimum Distance to Means Classifier

Class name         | Reference total | Classified total | Number correct | Producer's accuracy (%) | User's accuracy (%) | Kappa
Urban              | 26  | 33  | 25  | 96.15  | 75.76  | 0.73
Light dense forest | 40  | 10  | 10  | 25.00  | 100.00 | 1.00
High dense forest  | 95  | 99  | 92  | 96.84  | 92.93  | 0.89
Med dense forest   | 30  | 24  | 24  | 80.00  | 100.00 | 1.00
Dry sand           | 9   | 10  | 9   | 100.00 | 90.00  | 0.90
Water              | 3   | 5   | 3   | 100.00 | 60.00  | 0.60
Shallow water      | 2   | 3   | 2   | 100.00 | 66.67  | 0.66
Wet sand           | 34  | 22  | 22  | 64.71  | 100.00 | 1.00
New forest         | 16  | 49  | 13  | 81.25  | 26.53  | 0.22
Agriculture        | 1   | 1   | 1   | 100.00 | 100.00 | 1.00
Total              | 256 | 256 | 201 |        |        |
Overall Kappa Statistics = 0.7319

0 0 0 0 0 0 0

0 0 1 0 0 8 0 0

Overall Classification Accuracy =

79.30%

10

0

0

22

10

13

Urban Light dense forest High dense forest Med dense forest Dry sand Water Shallow water Wet sand New forest Agriculture

Column Total

Light dense forest 0

Urban

Reference Data

107

0 0 6

0

0 0

7

85

9

High dense forest 0

21

0 0 1

0

0 0

20

0

0

Med dense forest 0

8

0 0 0

0

8 0

0

0

0

0

3

0 0 0

0

0 1

0

2

0

0

Classified Data Dry Water sand

2

0 0 0

2

0 0

0

0

0

0

Shallow water

29

27 0 0

0

0 0

0

0

0

2

Wet sand

54

0 37 4

0

0 0

1

0

12

0

New forest

0

0 0 0

0

0 0

0

0

0

0

Agriculture

Table 9.35 Error Matrix of 3-band combination using Minimum Distance to Mans

203

35 37 11

2

9 1

28

87

31

15

Row Total

Table 9.36 Accuracy of Classification for 3-band combination using Minimum Distance to Mean Classifier

Class name         | Reference total | Classified total | Number correct | Producer's accuracy (%) | User's accuracy (%) | Kappa
Urban              | 15  | 22  | 13  | 86.67  | 59.09  | 0.565
Light dense forest | 31  | 10  | 10  | 32.26  | 100.00 | 1.00
High dense forest  | 87  | 107 | 85  | 97.70  | 79.44  | 0.689
Med dense forest   | 28  | 21  | 20  | 71.43  | 95.24  | 0.947
Dry sand           | 9   | 8   | 8   | 88.89  | 100.00 | 1.00
Water              | 1   | 3   | 1   | 100.00 | 33.33  | 0.331
Shallow water      | 2   | 2   | 2   | 100.00 | 100.00 | 1.00
Wet sand           | 35  | 29  | 27  | 77.14  | 93.10  | 0.920
New forest         | 37  | 54  | 37  | 100.00 | 68.52  | 0.632
Agriculture        | 11  | 0   | 0   | ---    | ---    | 0
Total              | 256 | 256 | 203 |        |        |
Overall Kappa Statistics = 0.7386

Average T.D

1889

1914

1972

Bands

1 2 3

1 2 4

1 2 5

1654

60

850

Minimum T.D

2000 2000 2000

2000 2000

2000

2000 2000

1964

1745

2000

2000

2000

2000

2000 1961

2000

1774

1976 1963

6:9 2000

6:8 2000

2000

4:9

4:8

1546

3:6

3:5

2000

2:4

2:3

1982

1:3

1:2

2000

2000

2000

2000

2000

2000

2000

2000

1940

2000

2000

1941

6:10 2000

4:10

3:7

2:5

1:4

2000

1994

2000

2000

2000

2000

2000

2000

2000

1988

2000

1992

2000

2000

2000

2000

2000

1996

2000

2000

1997

1999

2000

1996

7:9 2000

5:7

3:9

2:7

1:6

2000

2000

1848

2000

2000

1980 2000

1926 1037

60

2000 1538

2000

2000

2000

1468 2000

1503 850

1693

1955

8:9 2000

5:9

4:5

2:9

1:8

1666

2000

1306

7:10 2000

5:8

3:10

2:8

1:7

5 Light dense forest 10 Urban

Class Pairs:

7:8 2000

5:6

3:8

2:6

1:5

Transformed Divergence Using bands: 1 2 3 4 5 6 taken 3 at a time 1 Dry Sand 2 Water 3 High Dense forest 4 Medium dense forest 6 Wet sand 7 Shallow water 8 Agriculture 9 New_forest

Signature Separability Listing

Appendix 9 - I

1: 9

2000

2000

2000

2000

2000

2000

2000

2000

1963

1980

2000

1980

8:10 2000

5:10

4:6

2:10

2000

1914

2000

2000

2000

2000

1922

2000

1975

1993

2000

1573

9:10 2000

4:7 6:7

3:4

1:10

1978

1936

1976

1962

1986

1 2 6

1 3 4

1 3 5

1 3 6

1 4 5

1752

1227

1537

384

1659

2000 2000 2000

2000 1999

2000 2000

2000 2000

2000

2000 1664

2000

2000

2000

2000

2000

2000

2000

2000

2000 2000

2000

2000

1537

2000

2000

1995

2000

2000

2000 1758

2000

2000 2000

2000

2000

1999

2000

2000

2000 2000

2000

2000

1752

2000

2000

1998

2000

1700

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

1998

2000

2000

1995

2000

1954

2000

2000

1998

2000

1977

2000

2000

2000

2000

2000

2000

2000

1971

2000

1996

2000

2000

1977

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

1654

1929

2000

2000

2000

1989

1671

2000

2000

2000

1956

1767

2000

2000

1891

2000

2000

1991

1826

1984

2000

2000

1998

1908

1958

2000

2000

1994 2000

1847 384

1724

2000

2000

1996

1880

1925

2000

2000

1995

1490

2000

2000

2000

1995

1856

2000

2000

2000

1964

2000

2000

2000

2000

2000

2000

2000

2000

1997

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

1784

1999

2000

1949

2000

1972

2000 2000

1227

2000

2000

2000 2000

1828

2000

2000

2000 2000

1947

2000

1980

2000 2000

1659

2000

2000

2000

1979

1932

1957

1982

1976

1 4 6

1 5 6

2 3 4

2 3 5

2 3 6

1694

1627

1130

1400

1553

2000 2000 2000

2000 2000

2000 2000

2000 2000

2000

2000 1627

2000

2000

2000

2000

2000

2000

1996

2000

2000 2000

2000

2000

1743

2000

1998

1998

2000

2000

2000 1416

2000

2000 2000

2000

2000

1941

2000

2000

2000 2000

2000

2000

1711

2000

2000

2000

2000

1752

2000

2000

2000

2000

2000

2000

2000

2000

1999

2000

2000

2000

2000

1993

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

1998

2000

2000

1998

2000

1997

2000

2000

2000

2000

2000

2000

2000

1983

2000

1830

2000

2000

2000

2000

1999

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

2000

Appendix 9 - II

Signature Separability Listing

Transformed divergence using bands 1 2 3 4 5 6 taken 4 at a time, for the signature classes: 1 Dry sand, 2 Water, 3 High dense forest, 4 Medium dense forest, 5 Light dense forest, 6 Wet sand, 7 Shallow water, 8 Agriculture, 9 New_forest and 10 Urban. For every four-band combination the listing gives the transformed divergence of each class pair together with the minimum and average transformed divergence for that combination; the large majority of class pairs attain the maximum value of 2000. (The full tabular listing is not reproduced here.)

10 Spatial Filtering

10.1 INTRODUCTION Space borne imagery has some inherent noise or interference due to sensor error or malfunctioning of the detector, channel transmission errors and sampling error. The noise caused by a noisy sensor or by channel transmission errors appears as discrete, isolated pixel variations that are not spatially correlated (Pratt, 1978). This means that pixels containing errors often appear different from their surrounding pixels. At times, the edges or boundaries between features are not clearly discernible, and this creates problems for the interpretation of images. Spatial filtering, a digital local enhancement technique, can be very useful in overcoming these problems. Spatial filtering is a pixel by pixel transformation which depends on the digital numbers of the surrounding pixels. Thus spatial filtering is a context dependent operation that alters the grey level of a pixel according to its relationship with the digital counts of the pixels in its immediate vicinity (Sabins, 1986). Spatial filtering refers to separating out certain features: its task is to either emphasize or de-emphasize image data of various spatial frequencies. Essentially, spatial frequency denotes the tonal variation per unit area of an image (Lillesand and Kiefer, 1986). An image area of high spatial frequency is tonally rough; this signifies that the digital number (DN) values in that area change abruptly over a small number of pixels. Conversely, an image area of low spatial frequency is tonally smooth; the digital numbers change gradually over a large number of pixels, e.g. water bodies. Spatial filtering is categorized as low pass, high pass (Fig. 10.1) and band pass filtering. Low pass filtering is used for smoothing or removing noise; it is designed to preserve the low frequency detail at the cost of high frequency detail. High pass filtering, on the other hand, preserves the high frequency detail at the cost of low frequency detail.


The latter is referred to as an edge enhancement or edge crispening operation. Band pass filtering is widely used in signal processing. It passes spatial frequencies in a limited region of the spectrum instead of all the frequencies above or below a cut-off frequency. It is applicable only for isolating periodic noise from an image and is of limited use for thematic applications. In view of application, spatial filtering operations are categorised as (i) noise removal, (ii) edge enhancement or edge crisping, and (iii) edge extraction.

High-Pass          Low-Pass
−1 −1 −1      1 2 1      1 1 1
−1 16 −1      2 4 2      1 2 1
−1 −1 −1      1 2 1      1 1 1

Fig. 10.1 Low pass filtering and high pass filtering

Each of these is dealt with separately at the appropriate place.

10.2 PROCESS OF FILTERING Filtering is performed by using convolution windows. These windows are called mask, template, filter or kernel. The window is moved over the input image from the extreme top left hand corner of the scene, pixel by pixel, performing a discrete mathematical function that transforms the original input digital number into a new digital value. First, it moves along the first line; as soon as the first line is complete, it starts to move along the next line, and continues till it covers the entire image (Sabins, 1976; Lillesand and Kiefer, 1986). To perform filtering, a process known as convolution is carried out. Convolution may be defined as a simple mathematical operation which is fundamental to many common image processing operators. Convolution provides a way of multiplying together two arrays of numbers, generally of different sizes but of the same dimensionality, to produce a third array of numbers of the same dimensionality. In the image processing context, one of the input arrays is normally a grey level image; the second array is usually much smaller, is also two-dimensional, and is known as the kernel. Consider an image of size 6×9 pixels and a kernel of size 2×3 (Fig 10.2). The convolution is performed by sliding the kernel over the image, generally starting at the top left corner, so as to move the kernel through all the positions where the kernel fits entirely within the boundaries of the image. Each kernel position corresponds to a single output pixel, the value of which is calculated by multiplying together the kernel value and the underlying image pixel value for each of the cells in the kernel, and then adding all these numbers together. If the image has M rows and N columns, and the kernel has m rows and n columns, then the output image will have M − m + 1 rows and N − n + 1 columns.

I11 I12 I13 I14 I15 I16 I17 I18 I19
I21 I22 I23 I24 I25 I26 I27 I28 I29
I31 I32 I33 I34 I35 I36 I37 I38 I39
I41 I42 I43 I44 I45 I46 I47 I48 I49
I51 I52 I53 I54 I55 I56 I57 I58 I59
I61 I62 I63 I64 I65 I66 I67 I68 I69

(a) Image

K11 K12 K13
K21 K22 K23

(b) Kernel

Fig. 10.2 An example of a small image (left) and kernel (right) to illustrate convolution.

So, the value of the bottom right pixel in the output image will be given by:
O57 = I57 K11 + I58 K12 + I59 K13 + I67 K21 + I68 K22 + I69 K23     …(10.1)

Mathematically, the convolution can be written as:
O(i, j) = Σ (k = 1 to m) Σ (l = 1 to n) I(i + k − 1, j + l − 1) × K(k, l)     …(10.2)

where i runs from 1 to M − m + 1 and j runs from 1 to N − n + 1. It may be noted that many implementations of convolution produce a larger output image than this because they relax the constraint that the kernel can only be moved to positions where it fits entirely within the image. Instead, these implementations typically slide the kernel to all positions where just the top left corner of the kernel is within the image; therefore the kernel 'overlaps' the image on the bottom and right edges. One advantage of this approach is that the output image is the same size as the input image. Unfortunately, in order to calculate the output pixel values for the bottom and right edges of the image, it is necessary to invent input pixel values for places where the kernel extends off the end of the image. Typically, pixel values of zero are chosen for regions outside the true image, but this can often distort the output image at these places. Therefore, in general, when using a convolution implementation that does this, it is better to clip the image to remove these spurious regions; removing n − 1 pixels from the right hand side and m − 1 pixels from the bottom will fix this. A kernel window may be rectangular (e.g. 1×3 or 1×5 pixels) or square (3×3, 5×5 or 7×7 pixels). Each pixel of the window is assigned a weight. For a low pass filter, all the weights in the window have positive values; for a high pass filter the values may be negative or zero, but the central pixel is positive with a higher weight, and the algebraic sum of all the weights in the window is zero. Many types of mask window of different sizes can be designed by changing the window size and by varying the weights within the window. The simplest form of mathematical function performed in a filtering operation is neighbourhood averaging. Another commonly used discrete function is to calculate the sum of the products of the elements of the


mask and the corresponding digital numbers of the input image at the same point, and to use this sum as the output digital number for the central pixel of the moving window.
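The operation of Eq. 10.2 can be sketched in a few lines of Python; the sketch below assumes NumPy, and the function name convolve_valid and the sample arrays are illustrative only, not part of the text.

import numpy as np

def convolve_valid(image, kernel):
    # Eq. 10.2: slide the kernel over every position where it fits entirely
    # inside the image and sum the element-wise products.
    M, N = image.shape
    m, n = kernel.shape
    out = np.zeros((M - m + 1, N - n + 1))
    for i in range(M - m + 1):
        for j in range(N - n + 1):
            out[i, j] = np.sum(image[i:i + m, j:j + n] * kernel)
    return out

# Example: a 6 x 9 image and a 2 x 3 kernel, as in Fig. 10.2
image = np.arange(54, dtype=float).reshape(6, 9)
kernel = np.ones((2, 3)) / 6.0               # simple averaging kernel
print(convolve_valid(image, kernel).shape)   # (5, 7), i.e. (M - m + 1, N - n + 1)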

10.3 NOISE REMOVAL FILTERING Noise is a term used to designate an unwanted signal recorded in the image. Noise in an image may be detected as isolated pixels having no correlation with their neighbourhood pixels. It may be present in the image due to preprocessing of the data, sensor characteristics or channel errors. If noise is present in the image, the corresponding histogram will not be Gaussian in nature and the image is generally found unsuitable for analysis. Hence, noise should be removed from images so that subsequent enhancement, registration and numerical analysis can be performed on images with a high signal to noise ratio (Moik, 1980). Noise can generally be grouped into two classes: (i) independent noise, and (ii) noise which is dependent on the image data.

Image independent noise can often be described by an additive noise model, where the recorded image f(i,j) is the sum of the true image s(i,j) and the noise n(i,j): f(i,j)=s(i,j)+n(i,j)

…(10.3)

The noise n(i,j) is often zero-mean and described by its variance σn². The impact of the noise on the image is often described by the signal to noise ratio (SNR), which is given by
SNR = σs / σn = (σf² / σn² − 1)^1/2     …(10.4)
where σs² and σf² are the variances of the true image and the recorded image, respectively. In many cases, additive noise is evenly distributed over the frequency domain (i.e. white noise), whereas an image contains mostly low frequency information. Hence, the noise is dominant for high frequencies and its effects can be reduced using some kind of low pass filter. This can be done either with a frequency filter or with a spatial filter. Often a spatial filter is preferable, as it is computationally less expensive than a frequency filter. In the second case of data-dependent noise (e.g. arising when monochromatic radiation is scattered from a surface whose roughness is of the order of a wavelength, resulting in image speckle), it is possible to model the noise with a multiplicative, or non-linear, model. These models are mathematically more complicated; hence, if possible, the noise is assumed to be data independent.


Another kind of noise which occurs in all recorded images to a certain extent is detector noise. This kind of noise is due to the discrete nature of radiation, i.e. each imaging system records an image by counting photons. Under some assumptions (which are valid for many applications) this noise can be modeled with an independent, additive model, where the noise n(i,j) has a zero-mean Gaussian distribution described by its standard deviation (σ) or variance (Fig 10.3). This means that each pixel in the noisy image is the sum of the true pixel value and a random, Gaussian distributed noise value.

Fig. 10.3 1-D Gaussian distribution with mean value of 0 and standard deviation of 1

Another common form of noise is data drop-out noise (commonly referred to as intensity spikes, speckle or salt and pepper noise). Here, the noise is caused by errors in the data transmission. The corrupted pixels are either set to the maximum value (which looks like snow in the image) or have single bits flipped over. In some cases, single pixels are set alternately to zero or to the maximum value, giving the image a 'salt and pepper' like appearance. Unaffected pixels always remain unchanged. The noise is usually quantified by the percentage of pixels which are corrupted. Smoothing filters are designed to reduce the noise, detail, or 'busyness' in an image. If multiple copies of the image are available or can be obtained, they can be averaged pixel by pixel to improve the signal to noise ratio. However, enhancement techniques have also been developed that can be applied to a single image; in this case, typical smoothing filters perform some form of moving window operation, which may be a convolution or another local computation in the window (Niblack, 1986).
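Purely as an illustrative sketch (NumPy assumed; the function names and parameter values are hypothetical), the additive Gaussian and salt and pepper noise models described above can be simulated as follows, which is convenient for testing the smoothing filters discussed next.

import numpy as np

def add_gaussian_noise(img, sigma=10.0):
    # Additive, zero-mean Gaussian (detector) noise: f(i,j) = s(i,j) + n(i,j)
    noisy = img.astype(float) + np.random.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255)

def add_salt_and_pepper(img, fraction=0.05):
    # Data drop-out noise: the given fraction of pixels is set to 0 or 255
    noisy = img.copy()
    mask = np.random.rand(*img.shape) < fraction
    noisy[mask] = np.random.choice([0, 255], size=int(mask.sum()))
    return noisy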


Typical noise removing filters are: (i) Mean filter (ii) Weighted mean filter (iii) Median filter (iv) Mode filter (v) Olympic filter (vi) Multi level Median (MLM) filter (vii) P-median (PM) filter (viii) Adaptive Mean P-median (AMPM) filter

10.3.1 Mean Filter In this technique, each pixel within a given window (say 3×3) is sequentially examined, and, if the magnitude of the central pixel is significantly different from the average brightness of its surrounding pixels (as given by a predetermined threshold value), then the central pixel is replaced by the average value (Fig 10.4).

a1 a2 a3
a4 X  a5
a6 a7 a8

Fig. 10.4 Layout of Mean filter

The process can be represented mathematically as
IF   ABS( X − (1/8) Σ (i = 1 to 8) ai ) > ε  (given threshold value)     … (10.5)
THEN X = (1/8) Σ (i = 1 to 8) ai     … (10.6)
ELSE X = X
where X = the brightness value of the central pixel, and ai = the brightness value of the surrounding ith pixel.
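A minimal sketch of this conditional replacement (Eqs. 10.5 and 10.6) for the interior pixels of an image, assuming NumPy and a user-supplied threshold eps, might look like this.

import numpy as np

def mean_filter(img, eps):
    out = img.astype(float).copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            window = img[i - 1:i + 2, j - 1:j + 2].astype(float)
            avg = (window.sum() - window[1, 1]) / 8.0   # mean of the 8 surrounding pixels
            if abs(window[1, 1] - avg) > eps:           # Eq. 10.5
                out[i, j] = avg                         # Eq. 10.6
    return out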

10.3.2 Weighted Mean Filter A weighted mean is often used, in which the weight for a pixel is related to its distance from the central pixel of the window. For example, if the boundary area of a region is to be processed, different weights are assigned inside and outside the boundary, such that the best results for the boundary as well as for the region can be obtained.


10.3.3 Median Filter The median filter uses the median rather than the average of the neighborhood pixels in a given window. The median of a set of numbers is that value such that 50% of the numbers are above it and 50% are below it. It is considered to be superior to the mean filter primarily for two reasons. First, the median of a set of n numbers (where n is odd) is always equal to one of the values present in the data set. Second, the median is less sensitive to errors or to extreme data values (Mather, 1987). Although conceptually simple, a median filter is less efficient to implement because sorting of the pixels in ascending or descending order is required. However, it is one of the best noise removing and edge preserving smoothing filters (Niblack, 1986), although thin lines narrower than the dimension of the filter window may be removed. This filter has no user-defined parameters.
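A possible sketch of a 3×3 median filter (NumPy assumed; border pixels are simply left unchanged here):

import numpy as np

def median_filter(img):
    out = img.astype(float).copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            # replace the central pixel by the median of its 3x3 neighbourhood
            out[i, j] = np.median(img[i - 1:i + 2, j - 1:j + 2])
    return out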

10.3.4 Mode Filter In this technique, the central pixel is replaced by its most common neighbour. This is particularly useful in coded images such as classification maps in which the pixel values represent object labels. Averaging labels makes no sense, but mode filters may clean up the isolated points (Niblack, 1986). It produces irregular shifts in edges which make them appear ragged.

10.3.5 Olympic Filter The Olympic filter is a variant of a mean filter. It is named after the system of scoring used in certain Olympic events, where the highest and lowest scores are dropped and the remaining ones averaged. The Olympic filter ranks the values within the filter window (number of values = N), and discards high and low values before calculating the mean of those remaining. The output of the Olympic filter is less influenced by outlier values than the mean filter, but the averaging process blurs edges and removes fine detail from the images.
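A sketch of the Olympic filter over a 3×3 window (NumPy assumed; the parameter drop, the number of extreme values discarded at each end of the ranked list, is an illustrative choice):

import numpy as np

def olympic_filter(img, drop=1):
    out = img.astype(float).copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            values = np.sort(img[i - 1:i + 2, j - 1:j + 2].ravel())
            trimmed = values[drop:values.size - drop]   # discard the extreme values
            out[i, j] = trimmed.mean()                  # average the remaining values
    return out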

10.3.6 Multi Level Median (MLM) Filter The MLM filter is designed to reduce image noise (outlier values) while preserving edges, corners, and thin line detail in the image. The filter calculates separate median values for horizontal, vertical, and two diagonal transects through the central cell in the filter window. The minimum and maximum of these four values are then found. The minimum, maximum, and original center raster value are then ranked, and the median of these three values is assigned as the filter output value. The MLM filter has no user-defined parameters. The MLM filter preserves edges better than the simple median filter, but does not smooth small-scale noise as well.

10.3.7 P-Median (PM) Filter The P-Median filter is designed to suppress noise while preserving edge and line detail. The filter calculates median values for two subsets of the values in the


filter window: 1) combined horizontal and vertical transects through the center cell, and 2) two diagonal transects through the center cell. These two median values are then averaged. The output of the filter is a user-controlled weighted average of the averaged median and the original central cell value. The PM filter preserves edges better than the simple median filter, but provides less noise suppression in uniform regions.

10.3.8 Adaptive Mean P-Median (AMPM) Filter The Adaptive Mean P-Median filter is a variant of the P-Median filter that is designed to provide better smoothing in uniform regions while still preserving edges and line detail. The filter first classifies the center cell in the filter window as belonging to either a uniform region or an edge region. The AMPM filter then applies the P-Median filter in regions containing edges, but uses a simple averaging filter in regions with relatively uniform values. The output of the AMPM filter in the edge regions is a user-controlled weighted average of the calculated P-Median filter value and the original center cell value; this weighting does not apply to the areas where simple averaging is performed.

10.4 EDGE DETECTION Edges are places in the image with strong intensity contrast. Since edges often occur at image locations representing object boundaries, edge detection is extensively used in image segmentation when we want to divide the image into areas corresponding to different objects. Representing an image by its edges has the further advantage that the amount of data is reduced significantly while most of the image information is retained. Since edges consist mainly of high frequencies, they can be detected, in theory, by applying a high pass frequency filter in the Fourier domain or by convolving the image with an appropriate kernel in the spatial domain. In practice, edge detection is performed in the spatial domain, as it is computationally less expensive and often yields better results. Since edges correspond to strong illumination gradients, they can be highlighted by calculating the derivatives of the image. This is illustrated for the one-dimensional case in Fig 10.5. It can be seen that the position of the edge can be estimated from the maximum of the 1st derivative or from the zero-crossing of the 2nd derivative. Therefore, the task is to find a technique to calculate the derivative of a two-dimensional image. For a discrete one-dimensional function f(i), the first derivative can be approximated by
df(i)/di = f(i + 1) − f(i)     ...(10.7)
Calculating this formula is equivalent to convolving the function with the kernel [−1 1]. Similarly, the 2nd derivative can be estimated by convolving f(i) with the kernel [1 −2 1].


Fig. 10.5 An edge and its 1st and 2nd derivatives, illustrated in one dimension (panels: the function f(i), its 1st derivative and its 2nd derivative)

In the spatial domain, the differentiation process sharpens a given image and thus enhances edges or linear features. It is known that averaging is analogous to integration and tends to blur the details of an image; it is therefore natural to expect that differentiation will have the opposite effect and sharpen a given image. The most commonly used term in the method of differentiation is the 'gradient'. Given a function f(x,y), the gradient of f at co-ordinates (x, y) is defined as the vector
G[f(x, y)] = [∂f/∂x, ∂f/∂y]ᵀ     … (10.8)
The important properties of the gradient are:
(i) the vector G[f(x,y)] points in the direction of the maximum rate of increase of the function f(x,y), and
(ii) the magnitude of G[f(x,y)] equals the maximum rate of increase of f(x,y) per unit distance in the direction of G, and is given by
G[f(x, y)] = mag[G] = [ (∂f/∂x)² + (∂f/∂y)² ]^1/2     … (10.9)
These properties have been used in the design of the various filters, non-directional and directional, discussed in the following sections. Some zero crossing filters are also described, which use the method of zero crossing in place of the gradient. Edge detection techniques can be categorized into two classes: frequency domain and spatial domain. In remote sensing, the use of spatial domain


techniques is easier and more economical; therefore, only spatial domain techniques are discussed here. These techniques can further be classified as non-directional filtering, directional filtering and zero crossing filtering.

10.4.1 Classification of Edge Detection Techniques The techniques for the detection of edges or linear features can be classified as:
(i) Non-directional filtering, performed by using either (a) the Laplacian filter or (b) the high boost filter.
(ii) Directional filtering, performed by using (a) a simple directional filter or (b) gradient filters, which consist of the Roberts, Sobel, Prewitt and Kirsch operators.
(iii) Zero crossing filtering, performed by using (a) the LoG filter or (b) the DDoG filter.
Non-directional filtering consists of the implementation of the Laplacian filter and the high boost filter. These filters are known as non-directional as they do not enhance only features having a particular orientation. Directional filtering is performed by a simple directional filter and by gradient filters. These filters work on the basis of a linear gradient: at each point the gradient is obtained to locate the intensity and direction of the edge. This technique, however, is sensitive to noise, because the process of differentiation amplifies the noise in an image and so may result in the detection of false edges; the noise should therefore first be removed with a suitable noise removal filter. Directional filtering is used to enhance linear features having a definite orientation in the image. Zero crossing filters may be directional or non-directional and use the second order derivative of a Gaussian function.

10.4.2 Non-directional Filters 10.4.2.1 Laplacian Filter Laplacian filters are non-directional filters as they enhance linear features having any orientation in the image. The exception applies to linear features oriented parallel to the direction of filter movement, as these features are not enhanced (Sabins, 1987). The Laplacian of any function f is given by ∇²f, where
∇²f = ∂²f/∂x² + ∂²f/∂y²     … (10.10)


Analogously, for a digital picture, ∂²f/∂x² and ∂²f/∂y² are the second order derivatives of f in the x and y directions respectively. This can be written as
∇²f(x, y) = ∇x²f(x, y) + ∇y²f(x, y)     … (10.11)
where ∇x² and ∇y² represent the second partial derivatives of f with respect to x and y respectively. These second partial derivatives can be written in the form of difference equations:
(∇x f)(x, y) = f(x, y) − f(x − 1, y)     … (10.12)
(∇x² f)(x, y) = [f(x + 1, y) − f(x, y)] − [f(x, y) − f(x − 1, y)]     … (10.13)
(∇x² f)(x, y) = f(x + 1, y) − 2f(x, y) + f(x − 1, y)     … (10.14)
Similarly,
(∇y² f)(x, y) = f(x, y + 1) − 2f(x, y) + f(x, y − 1)     … (10.15)
On combining equations 10.14 and 10.15 we get
(∇²f)(x, y) = f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1) − 4f(x, y)     …(10.16)
The above equation can be represented as shown in Fig 10.6.

a1 a2 a3         0 −1  0        a1 a2 a3
a4  X a5        −1  4 −1        a4  Y a5
a6 a7 a8         0 −1  0        a6 a7 a8

Input image     Filter or Mask     Output image

Fig. 10.6 Representation of a Laplacian filter procedure

The central pixel (Y) of the output image = 4X − (a2 + a4 + a5 + a7)     … (10.17)
This is conceptually very similar to the convolution filtering operation; only the operating window is different. The sum of all elements within the filtering window is zero; hence, it is a kind of high pass filtering. Basically, the Laplacian filtering operation computes the difference between the digital count of the central pixel and the average of the DN values of the four adjacent pixels in the horizontal and vertical directions. The above equation can also be written as
Y = (X − a4) + (X − a5) + (X − a2) + (X − a7)     ... (10.18)


So the output image is nothing but the sum of the partial differences in the horizontal and vertical pixels within the operator (Hall, 1979; Gonzalez and Wintz, 1983).

0  1  0        1  1  1
1 −4  1        1 −8  1
0  1  0        1  1  1

Fig. 10.7(a) Simple Laplacian mask     Fig. 10.7(b) Orientation invariant Laplacian mask

Eq 10.18 can be represented by the 3 × 3 window shown in Fig 10.7(a). This window calculates the derivative in four orientations, i.e. along the horizontal and vertical lines. In order to make this operator rotation invariant, the window is rotated by 45° in either direction, clockwise or anti-clockwise, and combined with the original window, thus giving the window shown in Fig 10.7(b). This window takes derivatives in eight orientations, i.e. horizontal, vertical and the two diagonal directions. This filter, being a second difference operator, has zero response to linear ramps, but it responds to the shoulders at the top and bottom of a ramp, where there is a change in the rate of change of grey level. The digital Laplacian does respond to edges, but it responds more strongly to corners, lines, line ends and isolated points. It also responds to noise as effectively as it does to edges (Rosenfeld & Kak, 1982).
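The two masks of Fig. 10.7 can be applied with a standard two-dimensional convolution. The sketch below is illustrative only and assumes NumPy and SciPy (scipy.signal.convolve2d).

import numpy as np
from scipy.signal import convolve2d

laplacian_4 = np.array([[0, 1, 0],
                        [1, -4, 1],
                        [0, 1, 0]], dtype=float)    # Fig. 10.7(a)
laplacian_8 = np.array([[1, 1, 1],
                        [1, -8, 1],
                        [1, 1, 1]], dtype=float)    # Fig. 10.7(b)

def laplacian_edges(img, eight_neighbour=False):
    mask = laplacian_8 if eight_neighbour else laplacian_4
    # 'same' keeps the output the same size as the input; 'symm' mirrors the border
    return convolve2d(img.astype(float), mask, mode='same', boundary='symm')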

10.4.2.2 High Boost Filter A high boost filter may be created by using a weight 'K' on the original image in a high pass filter calculation:
High boost = (K) original − low pass
           = (K − 1) original + original − low pass
           = (K − 1) original + high pass     … (10.19)
This results in partial restoration of the low frequency components which may have been lost in a high pass operation (Schowengerdt, 1983). A standard high pass image results if K equals one. For values of K greater than one, the processed image looks like the original image, with a degree of edge enhancement that depends on K. The same algorithm can be described by the following equation:
R* = R − f × R′ + C     …(10.24)
where R* is the filtered pixel value at the centre of the window,


R is the original value, R′ is the average of the elements of the filter window, f is a proportion between 0 and 1, and C is a constant whose function is to ensure that R* is always positive. This subtractive box filter has been used by Thomas et al. (1981) to enhance circular geological features.
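A sketch of Eq. 10.19, taking a 3×3 box average as the low pass component (an assumption made here for illustration; NumPy and SciPy assumed):

import numpy as np
from scipy.signal import convolve2d

def high_boost(img, k=1.5):
    img = img.astype(float)
    box = np.ones((3, 3)) / 9.0                      # simple low pass (box) filter
    low = convolve2d(img, box, mode='same', boundary='symm')
    # High boost = (K - 1) * original + (original - low pass)
    return (k - 1.0) * img + (img - low)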

10.4.3. Simple Directional Filtering Directional filters are used to enhance specific linear trends within an image. A typical filter consists of two kernels, each of which is an array of three-by-three pixels. The left kernel is multiplied by the cos(A), where A is the angle relative to North, of the linear feature to be enhanced. The right kernel is multiplied by sin(A). Angles in the north-east quadrant are considered negative; angles in the north-west quadrant are positive (Sabins, 1987) (Fig 10.8).

Fig. 10.8 Simple directional filter: FILTERED VALUE = COS(A) × (left kernel response) + SIN(A) × (right kernel response), where A is the angle of the line feature relative to North (angles in the north-east quadrant are negative, angles in the north-west quadrant are positive).

In this method, first the left filter kernel is placed over the original image, each pixel is multiplied by the corresponding filter value, and these nine values are summed. This summation is first combined with the original central pixel value of the image window, and the resulting value is multiplied by cos(A). This value then replaces the original central value of the image. The kernel moves over the entire image, and the image filtered by the left kernel is obtained. The process is then repeated with the right kernel, except that multiplication is done by sin(A) instead of cos(A), and the image filtered by the right kernel is


obtained. Then, two filtered images are summed to get the directionally filtered image.

10.4.4 Gradient Filtering This type of filtering is based on the 'gradient'. These filters yield a high value at places where the grey level values are changing rapidly. Before going into the details of these techniques, it is appropriate to understand the significance of the gradient. If ∂f/∂x and ∂f/∂y are the rates of change of a function f in any two perpendicular directions, then the direction of the maximum rate of change is given by tan⁻¹((∂f/∂y)/(∂f/∂x)), and its magnitude by [(∂f/∂x)² + (∂f/∂y)²]^1/2. The vector having this magnitude and direction is called the gradient of f. If a directional derivative is used as a measure of edge strength, its response would vary with the orientation of the edge. To avoid this, the magnitude of the gradient may simply be used, as this gives the rate of change in the direction of greatest slope. Analogously, for a digital picture, first differences can be used instead of first derivatives, as given below:

(∆x f)(x, y) = f(x, y) − f(x − 1, y)     …(10.25)
(∆y f)(x, y) = f(x, y) − f(x, y − 1)     …(10.26)
where ∆x f and ∆y f represent the gradient in the x and y directions respectively. These are digital convolution operators which convolve f with the patterns [−1 1] and [1; −1] (a column vector) respectively. Any partial derivative operator D = ∂ⁿ/∂xᵏ∂yⁿ⁻ᵏ is a linear operator; it follows that any linear combination of D's is also a linear operator. Any arbitrary combination of D is a local operator, since its value for a picture f at a point (x, y) depends only on the values of f in a small neighborhood of (x, y) (Rosenfeld & Kak, 1982). It is of particular interest to construct operators that are isotropic, i.e. rotation invariant (in the sense that rotating f and then applying the operator gives the same result as applying the operator to f and rotating the output). These operators must be isotropic so that they can sharpen blurred linear features, such as edges and lines, which may run in any direction. The isotropic operators have the following properties:
i) An isotropic linear derivative operator can involve only derivatives of even orders.


ii) In an arbitrary isotropic derivative operator, derivatives of odd orders can occur only raised to even powers. Following are the most common types of operators used for gradient filtering. i) Roberts operator ii) Prewitt operator iii) Sobel operator iv) Kirsch operator

10.4.4.1 Roberts Operator As given earlier, the first difference is used instead of the first derivative for a digital picture, i.e.
∂f(x, y)/∂x = ∇x f(x, y) = f(x, y) − f(x − 1, y)     …(10.27)
∂f(x, y)/∂y = ∇y f(x, y) = f(x, y) − f(x, y − 1)     …(10.28)
∇x f and ∇y f can be combined together by taking the square root of the sum of the squares. However, it does not seem correct to combine ∇x f and ∇y f at the position (x, y), since the difference measured by each operator is not symmetrically located with respect to (x, y); ∇x uses a pair of pixels centred at (x − 1/2, y) while ∇y uses a pair centred at (x, y − 1/2). This can be avoided by using difference operators other than ∇x and ∇y. One possibility is to use (∇+f)(x, y) and (∇−f)(x, y), where
∇+f(x, y) = f(x + 1, y + 1) − f(x, y)     …(10.29)
∇−f(x, y) = f(x, y + 1) − f(x + 1, y)     …(10.30)
These windows measure the 45° and 135° diagonal changes in f using pairs of points that symmetrically surround (x + 1/2, y + 1/2):

 0  1        1  0
−1  0        0 −1

Fig. 10.9 Roberts operator

10.4.4.2 Prewitt Operator The effect of noise on the response of an operator can be reduced by smoothing the picture before applying the operator. In particular, a local average can be taken before differencing or, equivalently, an operator can be used which computes


differences of local averages. Suppose the average over a 2 × 2 neighbourhood of pixels is taken, e.g.
f4(x, y) = (1/4)[f(x, y) + f(x + 1, y) + f(x, y + 1) + f(x + 1, y + 1)]     …(10.31)
For this type of operation, the difference should not be taken between averages at adjacent pixels, since the neighborhoods of such pixels overlap and the differencing will cancel out the common values and weaken the response (Rosenfeld and Kak, 1982). For example, we have
f4(x, y) − f4(x − 1, y) = (1/4)[f(x + 1, y) + f(x + 1, y + 1) − f(x − 1, y) − f(x − 1, y + 1)]     …(10.32)
which is a weakened difference of two two-pixel averages that come from adjacent, but non-overlapping, neighborhoods.

−1  0  1        −1 −1 −1
−1  0  1         0  0  0
−1  0  1         1  1  1

Fig. 10.10 Prewitt operator

This type of operator, which is based on differences of averages, will respond 'blurrily' to an edge at several positions. The blurring of the response across edges can be reduced, while still retaining some smoothing power, by averaging only in the direction along the edge. This is represented by the windows of Fig 10.10. They respond strongly to vertical and horizontal edges respectively and weakly to diagonal edges. The image can be convolved with both windows separately, and the sum of the absolute values or the root mean square of the two responses can be used to determine the magnitude of the edges.

10.4.4.3 Sobel Operator The Sobel operator, used in image processing, is technically a discrete differentiation operator, computing an approximation of the gradient of the image intensity function. At each point in the image, the result of the Sobel operator is either the corresponding gradient vector or the norm of this vector. The Sobel operator is based on convolving the image with a small, separable, integer-valued filter in the horizontal and vertical directions and is therefore relatively inexpensive in terms of computation. On the other hand, the gradient approximation which it produces is relatively crude, in particular for high frequency variations in the image. In simple terms, the operator calculates the gradient of the image intensity at each point, giving the direction of the largest possible increase from light to dark and the rate of change in that direction. The result therefore shows how abruptly or smoothly the image changes at that point, and therefore how likely it


is that part of the image represents an edge, as well as how that edge is likely to be oriented. In practice, the magnitude (likelihood of an edge) calculation is more reliable and easier to interpret than the direction calculation. Mathematically, the gradient of a two-variable function (here the image intensity function) is at each image point a 2-D vector with components given by the derivatives in the horizontal and vertical directions. At each image point, the gradient vector points in the direction of the largest possible intensity increase, and the length of the gradient vector corresponds to the rate of change in that direction. This implies that the result of the Sobel operator at an image point which is in a region of constant image intensity is a zero vector, and at a point on an edge is a vector which points across the edge, from darker to brighter values. Mathematically, the operator uses two 3×3 kernels which are convolved with the original image to calculate approximations of the derivatives, one for horizontal changes and one for vertical. If we define A as the source image, and Gx and Gy as two images which at each point contain the horizontal and vertical derivative approximations, the computations are as follows:

        +1  0 −1                 +1 +2 +1
Gx =    +2  0 −2  * A    Gy =     0  0  0  * A
        +1  0 −1                 −1 −2 −1

where * here denotes the two-dimensional convolution operation. The x-coordinate is here defined as increasing in the right direction, and the y-coordinate as increasing in the down direction. At each point in the image, the resulting gradient approximations can be combined to give the gradient magnitude, using:
G = (Gx² + Gy²)^1/2     … (10.33)

Using this information, we can also calculate the gradient's direction:
Θ = arctan(Gy / Gx)     … (10.34)
where, for example, Θ is 0 for a vertical edge which is darker on the left side.

−1  0  1        −1 −2 −1
−2  0  2         0  0  0
−1  0  1         1  2  1

Fig. 10.11 Sobel operator

The Sobel operator represents a rather inaccurate approximation of the image gradient, but is still of sufficient quality to be of practical use in many applications. More precisely, it uses intensity values only in a 3×3 region around


each image point to approximate the corresponding image gradient, and it uses only integer values for the coefficients which weight the image intensities to produce the gradient approximation.
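A sketch of the Sobel computation of Eqs. 10.33 and 10.34 (NumPy and SciPy assumed); arctan2 is used instead of arctan so that points with Gx = 0 do not cause a division by zero.

import numpy as np
from scipy.signal import convolve2d

def sobel(img):
    gx_kernel = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=float)
    gy_kernel = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)
    gx = convolve2d(img.astype(float), gx_kernel, mode='same', boundary='symm')
    gy = convolve2d(img.astype(float), gy_kernel, mode='same', boundary='symm')
    magnitude = np.hypot(gx, gy)        # Eq. 10.33
    direction = np.arctan2(gy, gx)      # Eq. 10.34
    return magnitude, direction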

10.4.4.4 Kirsch Operator All the operators discussed so far have one or two windows and are most sensitive to edges in the two principal directions. The Kirsch operator consists of eight windows and is therefore equally sensitive to edges in eight orientations. In each window, negative weights are given to a run of consecutive neighbours and positive weights to the remaining ones, with the weights chosen so that they sum to zero; the central pixel is thus given zero weight. All the windows are applied separately at each point and the root mean square of the responses is adopted as the measure of an edge. The windows, which are successive 45° rotations of one another, are shown in Fig 10.12.

−5  3  3     −5 −5  3     −5 −5 −5      3 −5 −5
−5  0  3     −5  0  3      3  0  3      3  0 −5
−5  3  3      3  3  3      3  3  3      3  3  3

 3  3 −5      3  3  3      3  3  3      3  3  3
 3  0 −5      3  0 −5      3  0  3     −5  0  3
 3  3 −5      3 −5 −5     −5 −5 −5     −5 −5  3

Fig. 10.12 Kirsch operator (8 masks)
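Because the eight Kirsch masks are successive 45° rotations of one another, they can be generated rather than stored individually. The sketch below (NumPy and SciPy assumed, illustrative only) applies all eight masks and adopts the root mean square of the responses, as described above.

import numpy as np
from scipy.signal import convolve2d

BASE = np.array([[-5, 3, 3], [-5, 0, 3], [-5, 3, 3]], dtype=float)   # first Kirsch mask

def rotate45(mask):
    # Shift the eight outer elements of a 3x3 mask one step clockwise
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    rotated = mask.copy()
    values = [mask[r] for r in ring]
    for idx, r in enumerate(ring):
        rotated[r] = values[idx - 1]
    return rotated

def kirsch(img):
    img = img.astype(float)
    masks, m = [], BASE
    for _ in range(8):
        masks.append(m)
        m = rotate45(m)
    responses = [convolve2d(img, k, mode='same', boundary='symm') for k in masks]
    return np.sqrt(np.mean(np.square(responses), axis=0))   # RMS of the eight responses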

10.4.5 Zero Crossing Filtering Most of the edge detectors use the gradient as the means of detecting edges. If we take the second derivative of the intensity function, we obtain a negatively or positively sloped zero-crossing at the gradient extremes, i.e. at edges, which can be proved mathematically. These zero crossings in the second derivative of an intensity function determine edge locations oriented in orthogonal directions. Typical zero crossing filters are the Laplacian of Gaussian (LoG) filter and the Directional Derivative of Gaussian (DDoG) filter. A major difficulty with natural images is that changes in pixel values occur over a wide range of scales. No single filter can be optimally suitable for all cases, so a way has to be found to deal separately with changes occurring at different scales. This can be achieved by first taking local averages of the image at various resolutions and then detecting the changes in intensity that occur. To realize this idea we need to determine the nature of the optimal smoothing filter and the technique to detect intensity changes at a given scale.


There are two physical considerations that combine to determine the appropriate smoothing filter. The first is that the motivation for filtering the image is to reduce the range of scales over which intensity changes take place. The filter spectrum should therefore be smooth and roughly band limited in the frequency domain; this condition may be expressed by requiring that its variance in the frequency domain, Δω, should be small. The second consideration is expressed as a constraint in the spatial domain: the contribution to each point in the filtered image should arise from a smooth average of nearby points, rather than any kind of average of widely scattered points. Hence, the filter should be smooth and localized in the spatial domain and, in particular, its spatial variance, Δx, should also be small. These two localization requirements, one in the spatial domain and the other in the frequency domain, are conflicting. They are in fact related by the uncertainty principle, which states Δx × Δω ≥ π/4. There is only one distribution which optimizes this relation, namely the Gaussian. The Gaussian function, in one dimension, is represented as:
G(x) = [1 / (σ(2π)^1/2)] exp(−x² / 2σ²)     …(10.35)
In two dimensions,
G(x, y) = (1 / 2πσ²) exp(−(x² + y²) / 2σ²)     …(10.36)
So, prior to detecting intensity changes, the image is smoothed by convolving it with the two-dimensional Gaussian function, i.e.
f(x, y) = G(x, y) * I(x, y)     …(10.37)
Whenever there is a change in intensity, there will be a corresponding peak in the first directional derivative, or a zero crossing in the second directional derivative, of intensity. So we determine the zero crossings by applying the rotation invariant second derivative Laplacian operator
g(x, y) = ∇²[G(x, y) * I(x, y)]     …(10.38)
where I(x, y) is the image. Some of the zero crossing filters are the LoG filter and the DDoG filter.

10.4.5.1 LoG Filter The two dimensional Gaussian function is given by
G(x, y) = K exp(−(x² + y²) / 2σ²)     …(10.39)
The expression for the Laplacian of the Gaussian is
∇²G(x, y) = ∂²G/∂x² + ∂²G/∂y²     …(10.40)
From Eq. 10.39 and Eq. 10.40 we get
∇²G(x, y) = ∂²/∂x² [K exp(−(x² + y²)/2σ²)] + ∂²/∂y² [K exp(−(x² + y²)/2σ²)]     …(10.41)
Taking the partial first derivative of the Gaussian with respect to x we get
∂G/∂x = −(Kx/σ²) exp(−(x² + y²)/2σ²)     …(10.42)
For the second partial derivative, by differentiating again with respect to x, we get
∂²G/∂x² = −(K/σ²)[exp(−(x² + y²)/2σ²) − (x²/σ²) exp(−(x² + y²)/2σ²)]     …(10.43)
or
∂²G/∂x² = −(K/σ²)[1 − x²/σ²] exp(−(x² + y²)/2σ²)     …(10.44)
Similarly,
∂²G/∂y² = −(K/σ²)[1 − y²/σ²] exp(−(x² + y²)/2σ²)     …(10.45)
By adding Eq. 10.44 and 10.45 we get
∇²G = (K/σ²)[(x² + y²)/σ² − 2] exp(−(x² + y²)/2σ²)     …(10.46)
where σ is the space constant of the Gaussian function and K is the scale factor. Whenever an intensity change occurs there is a peak in the first directional derivative of the intensity function and a zero crossing in the second directional derivative. The problem of detecting an intensity change then reduces to finding the zero crossing in the second derivative of the intensity function in the direction of maximum slope dmax. Fig. 10.13 shows the intensity profile of an ideal step edge; the corresponding zero crossing is shown in Fig 10.14.

Fig. 10.13 Intensity profile of an ideal step edge (h = difference in grey level between the adjacent regions of the step edge; dmax = direction of maximum slope)


Fig. 10.14 Response of the LoG filter, ∇²G(x, y) * I(x, y), to a step edge (P2 − P1 = peak to peak difference for a zero crossing)
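A sketch that builds a discrete LoG mask from Eq. 10.46 (with the scale factor K taken as 1) and marks zero crossings of the filtered image by sign changes between neighbouring pixels; NumPy and SciPy are assumed and the mask size rule is an illustrative choice.

import numpy as np
from scipy.signal import convolve2d

def log_kernel(sigma):
    half = int(np.ceil(3 * sigma))           # cover roughly +/- 3 sigma
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x ** 2 + y ** 2
    kernel = ((r2 / sigma ** 2) - 2.0) * np.exp(-r2 / (2.0 * sigma ** 2)) / sigma ** 2
    return kernel - kernel.mean()            # force zero sum: flat regions give zero response

def zero_crossings(response):
    sign = np.sign(response)
    zc = np.zeros(response.shape, dtype=bool)
    zc[:, :-1] |= (sign[:, :-1] * sign[:, 1:]) < 0   # horizontal sign changes
    zc[:-1, :] |= (sign[:-1, :] * sign[1:, :]) < 0   # vertical sign changes
    return zc

# Usage: edges = zero_crossings(convolve2d(img, log_kernel(2.0), mode='same', boundary='symm'))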

10.4.5.2 DDoG Filter The second directional derivative operator (∂²/∂n²) is another rotationally invariant operator used for edge detection. ∂²/∂n² is a non-linear function; it neither commutes nor associates with convolution, i.e.
∂²/∂n² (g * f) ≠ (∂²g/∂n²) * f     …(10.47)
∂²/∂n² (g * f) ≠ g * (∂²f/∂n²)     …(10.48)
For any function f this operator can be defined in cartesian co-ordinates as
∂²f/∂n² = (fx² fxx + 2 fx fy fxy + fy² fyy) / (fx² + fy²)     …(10.49)
and a corresponding expression can be written in polar co-ordinates (p, θ)     …(10.50)


These expressions are for the directional derivative in the direction of the gradient. The second directional derivative in the direction orthogonal to the gradient, i.e. along the edge, is given by
(∂²f/∂n²)⊥ = (fy² fxx − 2 fx fy fxy + fx² fyy) / (fx² + fy²)     …(10.51)
The representation in polar coordinates shows clearly that these operators are rotationally symmetric, since their form does not change for a rotation of the coordinate system by θ. A sufficient condition for an operator to be rotationally invariant is that θ appears only as a derivative in the polar representation of the operator. For the directional derivative of Gaussian (DDoG) filter, as the name suggests, the function chosen is the Gaussian. The point spread function of the filter is given by
∂²G/∂n² = (K/σ²)[(x² + y²)/σ² − 1] exp(−(x² + y²)/2σ²)     …(10.52)
This equation can be derived by considering the two dimensional Gaussian function
G(x, y) = K exp(−(x² + y²)/2σ²)     …(10.53)
Taking the first and second partial derivatives of the Gaussian with respect to x we get
∂G/∂x = −(Kx/σ²) exp(−(x² + y²)/2σ²)     …(10.54)
∂²G/∂x² = [Kx²/σ⁴ − K/σ²] exp(−(x² + y²)/2σ²)     …(10.55)
Similarly, by taking derivatives with respect to y we get
∂G/∂y = −(Ky/σ²) exp(−(x² + y²)/2σ²)     …(10.56)
∂²G/∂y² = [Ky²/σ⁴ − K/σ²] exp(−(x² + y²)/2σ²)     …(10.57)
By differentiating Eq. 10.56 with respect to x, we get
∂²G/∂x∂y = (Kxy/σ⁴) exp(−(x² + y²)/2σ²)     …(10.58)
Substituting these into Eq. 10.49, we get:


∂²G/∂n² = [ (∂G/∂x)² (∂²G/∂x²) + 2 (∂G/∂x)(∂G/∂y)(∂²G/∂x∂y) + (∂G/∂y)² (∂²G/∂y²) ] / [ (∂G/∂x)² + (∂G/∂y)² ]     ...(10.59)
= { K³ [(x² + y²)/σ⁶] [(x² + y²)/σ² − 1] } / { K² [(x² + y²)/σ⁴] } × exp(−(x² + y²)/2σ²)     …(10.60)
= (K/σ²)[(x² + y²)/σ² − 1] exp(−(x² + y²)/2σ²)     …(10.61)

where σ is the standard deviation of the Gaussian function. Let w denote the diameter of the central excitatory region of these masks, i.e. the distance between the central zeros, with w1D and w2D referring to the one and two dimensional cases respectively. The width w2D can be obtained directly from Eq. 10.52: setting y = 0, the response is zero when
x²/σ² = 1     …(10.62)
so that
w2D = 2σ     …(10.63)
If we compare these results with those of the LoG filter, we find that, in terms of the central excitatory region, the LoG filter in one dimension is equivalent to the DDoG filter in one or two dimensions, i.e.
w2D(DDoG) = w1D(DDoG) = w1D(LoG) = 2σ     …(10.64)
w2D(LoG) = 2σ√2     …(10.65)

Although different types of filters have been discussed here, no single one is suitable for the enhancement and extraction of all types of features, so a trial and error procedure is required for selecting the optimum filter for the detection of linear features.

References Agrawal, R., J.Gehrke, D.Gunopulos and P Raghavan (1998). Automatic subspace clustering of high dimensional data for data mining applications. ACM SIGMOD International Conference on Management of Data, Seattle, WA USA. Ankerst, M., M.M.Breunig, H.P.Kriegel and J.Sander (1999). OPTICS: Ordering Points To Identify the Clustering Structure. ACM SIGMOD International Conference on Management of Data, Philadelphia PA. Backer E. and A. Jain (1981), A clustering performance measure based on fuzzy set decomposition, IEEE Trans. Pattern Anal. Mach, Intell., vol. PAMI-3, no. 1, pp. 66-75, Jan. 1981. Baraldi A. and E.Alpaydin, Constructive feedforward ART clustering networks – Part I and II, IEEE Trans. Neural Netw., vol. 13,no. 3, pp. 645-677, May 2002. Bishop, C (1995), Neural Networks for pattern Recognition. New York: Oxford Univ. Press, 1995. Bradley, P., U.Fayyad and C.Reina (1998), Scaling clustering algorithms to large databases. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York City. Bromwich, D.H and R.-Y.Tzeng (1994), Simulation of modern arctic climate by the NCAR CCMI, Journal of Climate, 7: 1050–1069. Bunting, J.T. anf R.P.d’Entremont (1982), Improved cloud detection utilizing defense metereological satellite program near infrared measurements, Air Force Geophysical Laboratory, Hanscom, AFB, MA, AFGL – TR – 82–0027, Environmental Research Papers, 765, 91p. Carroll, T.R., J.V.Baglio Jr., J.P.Verdin and E.W.Holyroyd III (1989), Operational mapping of snow cover in United States and Canada using airborne and satellite data, Proceedings of the 12th Canadian Symposium on Remote Sensing, V3, IGARSS’89, 10-14 July, 1989, Vancouver, Canada. Cheng, C., A.Fu and Y.Zhang (1999), Entropy-based subspace clustering for mining numerical data. ACM SIGDD international Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA. Cherkassky V. and F.Mulier (1998), Learning From Data: Concepts, Theory, and Methods. New York: Wiley, 1998. Choudhury, B.J., and A.T.C. Chang (1979), Two- stream theory of reflectance of snow, IEEE Trans. Geosci. Electron., GE-17: 63–68. Choudhury, B.J., and A.T.C Chang (1981), The albedo of snow for partially cloudy skies, Boundary Layer Meteorol., 20: 371–389.

R.2

References

Cocke, A.E., P.Z. Zule and J.E. Crouse (2005), Comparison of burn severity assessments using normalized burn ratio and ground data, International Journal of Wildland Fire, Vol. 14, no. 2, pp 189–198. Dirnhirm, I. and F.D. Eaton (1975), Some characteristics of the albedo of snow. J. Appl. Meteorol, 14: 375 – 379. Dozier, J. (1984), Snow reflectance from Landsat – 4 thematic mapper, IEEE Transactions on Geosciences and Remote Sensing, 22, 323–328. Dozier, J. (1989), Spectral signature of alpine snow cover from Landsat Thematic Mapper, Remote Sensing of Environment, 28, 9–22. Dozier, J., S.R. Schneider and D.F. McGinnis Jr. (1981), Effect of grain size and snowpack water equivalence on visible and near infrared satellite observations of snow, Water Resources Res. 17: 1213–1221. Duda, R. O., P.E. Hart and D.G. Stork. (2001), Pattern classification. New York, John Wiley & Sons. Epting, J., D. Verbyla and B. Sorbel, Evaluation of remotely sensed indexes for assessing burn severity in Alaska using Landsat TM and ETM+, Remote Sensing of Environment, Vol 97, no.1, pp. 92–115. Ester. M., H.-P. Kriegel, J. Sander and X. Xu (1996), A density-based algorithm for discovering clusters in large spatial database with noise. The 2nd international Conference on Knowledge Discovery and Data Mining, Portland, Oregon, AAAI Press. Ester, M., H. P. Kriegel and X.W. Xu (1995), Knowledge discovery in large spatial databases: Focusing techniques for efficient class identification. Advances in Spatial Databases. Berlin 33, Springer-Verlag Berlin. 951:67–82. Everitt, B.S., S. Landau and M. Leese (2001). Cluster Analysis, Oxford University Press: Fayyad, U., G. Piatetsky-Shapiro and P. Smyth (1996), From data mining to knowledge discovery-An review. Advances in knowledge discovery. R. Uthurusay. Cambridge, MA, AAAI Press/The MIT Press: 1–33. Foster, J.L.and A.T.C. Chang (1993), Snow cover, in Atlas of Satellite Observations related to Global Changes, R.J.Gurney, C.L.Parkinson and J.L.Foster (Eds.), Cambridge University Press, Cambridge, pp361–370. Fraley, C. (1998). Algorithm for Model-based Gaussian Hierarchical Clustering. SIAM Journal on Scientific Computing 20(1): 270-281. Fritzke, B. (1997) Some competitive learning methods. [Online]. Available http://www. neuroinformatik. ruhr-uni-bochum. De/ini/VDM/reearch/gsn//JavaPaper

Goil, S., H Nagesh and A. Choudhary (1999), MAFIA: Efficient and Scalable Subspace Clustering for Very Large Data Sets. Technical Report CPDC-TR9906-010, Center for Parallel and Distributed Computing, Northwestern University, June 1999. Gordon, A. Classification, 2nd ed. London, U.K.: Chapman & Hall, 1999. Gordon, A.D. (1987). A Review of Hierarchical Classification. Journal of the Royal Statistical Society. Series A (General) 150(2): 119–137. Gordon, A.D. (1996). Hierarchical Classification. Clustering and Classification. P. Arabie, L.J. Hubert and G.D. Soete. River Edge, HJ, World Scientific Publ.: 65–122. Hall, D.K., G.A. Riggs, V.V. Salomonson (1995), Development of method for mapping global snow cover using moderate resolution imaging spectroradiometer data, Remote Sensing of Environment, 54 (2), 127–140.

References

R.3

Hall, D.K., G.A. Riggs, V.V. Salomonson, N.E. DeGirolamo, and K.J. Bayr (2002), MODIS Snow- Products, Remote Sensing of Environment, 83, 181–194. Han, J. and M. Kamber (2001), Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers. Hinneburg, A. and D.A. Keim (1999), Optimal Grid-Clustering: Towards Breaking the Curse of Dimensionality in High-Dimensional Clustering. Proceeding of the 25th VLDB Conference, Edingburgh, Scotland. Hinneburg, A., D.A. Keim and M. Wawryniuk (1999). “HD-Eye: Visual Mining of High-Dimensional Data.” IEEE Computer Graphics & Applications 19(5): 22–31. Jain, A.K. and R.C. Dubes (1988). Algorithm for clustering data. Englewood Cliffs, NJ, Prentice Hall. Jain, A.K., N. Murty and P.J. Flynn (1999). “Data clustering: a review.” ACM Computing Surveys (CSUR) 31(3): 264–323. Jakubauskas, M.E., K.P.Lulla, P.W. Mausel (1990), Assessment of vegetation changes in a fire altered forest landscape, Photogrammetric Engineering and Remote Sensing, Vol, 56, no. 3, pp 371–377. Jensen, J.R., (2000), Remote Sensing of Environment, Upper Saddle, NJ, Prentice Hall. Kaufman, L. and P. J. Rousseeuw (1990), Finding Groups in Data. An Introduction to Cluster Analysis, John Wiley & Sons, Belgium. Key, C.H and N.C. Benson (2005), Landscape assessment: Remote Sensing of severity, the normalized burn ratio; and ground measure of severity, the composite burn index, in FIREMON: Fire effect monitoring and Inventory system, D.C.Lutes, R.E.Keane, J.F.Caratti, C.H.Key, N.C. Benson and L.J.Gangi, Eds. Ogden, UT; USDA Forest Service, Rocky Mountain Research Station. Key, J.R., J.B. Collins, C. Fowler and R.S.Stone (1997), High latitude surface temperature estimates from thermal satellite data, Remote Sensing of Environment, 61, 302–309. Klien, A.G., D.K. Hall and G.A. Riggs (1998), Improved snow – cover mapping in forests through the use of a canopy reflectance model. Hydrological Processes, 12, 1723–1744. Koutsias, N., M.Karteris, A, Fernandez-Palacios, C. Navarro, J. Jurado, R. Navarro (1999), Burnt land mapping at local scales. In E. Chuvicco (editor)Remote sensing of large wildfires in the European Mediterranean basin, (pp 157-187), Berlin: Springer Verlag. Michalek, J.L., J.E. Colwell, N.H.F.French, E.S.Kasischke and R.D.Johnson (2000), Using Landsat data to estimate carbon release from burned biomass in an Alaskan spruce forest complex, International Journal of Remote Sensing, Vol 21, pp 329–343. Milligan, G.W. (1996), Clustering Valiation: Results and Implications for Applied Analysis. Clustering and Classification. P. Arabie, L. J. Hubert and G.D. Soete. River Edge, NJ, World Scientific Publ.: 341–375. Nagesh, H.S., S.Goil and Choudhary (2000), A scalable parallel parallel subspace clustering algorithm for massive data sets. Proceeding for the International Conference on Parallel Processing. Ng, R. and J. Han (1994), Efficient and Efficient and Effective Clustering Methods for Spatial Data Mining. Proc. 20th International Conference on Very Large Databases, Santiago, Chile.


Nolin, A. and S. Liang (2000), Progress in bidirectional reflectance modeling and application for surface particulate media: Snow and soils, Remote Sensing Reviews, 14, 307–342.
Pereira, M.C. (1999), A comparative evaluation of NOAA/AVHRR vegetation indices for burned surface detection and mapping, IEEE Transactions on Geoscience and Remote Sensing, 37, 217–226.
Pereira, M.C. and A.W. Setzer (1993), Spectral characteristics of fire scars in Landsat-5 TM images of Amazonia, International Journal of Remote Sensing, Vol. 14, pp. 2061–2078.
Procopiuc, C.M., M. Jones, P.K. Agrawal and T.M. Murali (2002), A Monte Carlo Algorithm for Fast Projective Clustering, ACM SIGMOD International Conference on Management of Data, Madison, Wisconsin, USA.
Riggs, G.A., D.K. Hall and V.V. Salomonson (1994), A snow index for the Landsat Thematic Mapper and Moderate Resolution Imaging Spectroradiometer, Proceedings of the International Geoscience and Remote Sensing Symposium, IGARSS'94, 8–12 August 1994, Pasadena, CA, pp. 1942–1944.
Rogan, J. and S.R. Yool (2001), Mapping fire-induced vegetation depletion in the Peloncillo Mountains, Arizona and New Mexico, International Journal of Remote Sensing, Vol. 22, pp. 3101–3121.
Romanov, P., G. Gutman and I. Csiszar (2000), Automated monitoring of snow cover over North America using multispectral satellite data, Journal of Applied Meteorology, 39, 115–130.
Roy, D.P., Y. Jin, P.E. Lewis and C.O. Justice (2005), Prototyping a global algorithm for systematic fire-affected area mapping using MODIS time series data, Remote Sensing of Environment, Vol. 97, pp. 137–162.
Salomonson, V.V. and I. Appel (2004), Estimating fractional snow cover from MODIS using the normalized difference snow index, Remote Sensing of Environment, 89, 351–360.
Salomonson, V.V. and D.C. Marlatt (1968), Anisotropic solar reflectance over white sand, snow and stratus clouds, Journal of Applied Meteorology, 7: 475–483.
Sidjak, R.W. and R.D. Wheate (1999), Glacier mapping of the Illecillewaet icefield, British Columbia, Canada, using Landsat TM and digital elevation data, International Journal of Remote Sensing, Vol. 20, No. 2, pp. 273–284.
Steffen, K. (1987), Bidirectional reflectance of snow, in Large Scale Effects of Seasonal Snow Cover (B.E. Goodison, R.G. Barry and J. Dozier, Eds.), Proceedings of the IAHS Symposium, 19–22 August 1987, Vancouver, Canada, pp. 415–425.
Tanaka, S., H. Kimura and Y. Suga (1993), Preparation of 1:25,000 Landsat map for assessment of burnt area on Etajima Island, International Journal of Remote Sensing, Vol. 4, pp. 17–31.
Tucker, C.J. (1979), Red and photographic infrared linear combinations for monitoring vegetation, Remote Sensing of Environment, 8: 127–150.
van Wagtendonk, J.W., R.R. Root and C.H. Key (2004), Comparison of AVIRIS and Landsat ETM+ detection capabilities for burn severity, Remote Sensing of Environment, Vol. 92, No. 3, pp. 397–408.
Vaithyanathan, S. and B. Dom (2000), Model-Based Hierarchical Clustering, The Sixteenth Conference on Uncertainty in Artificial Intelligence, Stanford, CA.


Verbyla, D.L. and S.H. Boles (2000), Bias in land cover change estimates due to mis-registration, International Journal of Remote Sensing, Vol. 21, pp. 3553–3560.
Wang, W., J. Yang and R. Muntz (1997), STING: A Statistical Information Grid Approach to Spatial Data Mining, 23rd Int. Conf. on Very Large Data Bases, Athens, Greece, Morgan Kaufmann.
Warren, S.G. (1982), Optical properties of snow, Reviews of Geophysics and Space Physics, 20: 67–89.
White, Y., K. Ryan, C. Key and S. Running (1996), Remote sensing of fire severity and vegetation recovery, International Journal of Wildland Fire, Vol. 6, pp. 125–136.
Xiao, X., S. Zhenxi and X. Qin (2000), Assessing the potential of VEGETATION sensor data for mapping snow and ice cover: Normalized Difference Snow and Ice Index, International Journal of Remote Sensing, Vol. 22, No. 13, pp. 2479–2487.
Zhang, T., R. Ramakrishnan and M. Livny (1996), BIRCH: An Efficient Data Clustering Method for Very Large Databases, ACM SIGMOD International Conference on Management of Data, Montreal, Canada, ACM Press.
Zhu, Z. and J. Eidenshink (2007), Assessment of spatial and temporal trends of fire severity in the United States, available at http://www.fire.uni-freiburg.de/sevilla2007/contributions/doc/REGIONALES/b_AUSTRALASIA_NORTEAMERICA/Zhu_Eidenshink_USA.pdf.

Index

A Absolute zero 1.3 Absorbed Photosynthetically Active Radiation (APAR) 8.5 Absorbivity 1.4 Accuracy 8.30, 9.6 Achromatic light 2.9 Acoustic image processing 1.4 Adaptive filtering 3.17 - Mean P-Median (AMPM) 10.6, 10.8 Additive 2.10, 2.11 - model 2.10 Aerial camera 2.1, 2.2 Affine transformation 6.11, 6.12 Aliasing 2.6 Alpha channel 3.13, 3.14 Ancillary chunk 3.14, 3.16 Angular separation 9.9, 9.10 Angular velocity of the satellite 6.9 SNOMAP 8.26 Archival Short-term storage 3.1 Archival storage 3.1 Arithmetic mean 5.4, 5.6 - Operators 8.3 Aspect Ratio 2.6, 6.8 Ashburn Vegetation Index (AVI) 8.12 Atmospheric Correction 6.4, 8.19 - effects 4.14 - scattering 5.10 Atmospherically Resistant Vegetation Index (ARVI) 8.17 Attentive vision 2.8 Average 3.12 - divergence 9.11 Averaging method 6.2

B Band pass filtering 10.2 Band Interleaved by Line (BIL) 3.31 Band Interleaved by Pixel (BIP) 3.31 Band Sequential (BSQ) 3.31 Bhattacharyya (B) distance 9.11 Bi-cubic or cubic convolution 6.12 - spline 4.4 Bimodal frequency distributions 9.7 Binomial distribution 9.33 BIRCH 9.20 BITMAP (BMP) 3.2 BKGD (Background Colour) 3.16 - chunk 3.16 Blackbody 1.3 BMP 3.2 Boolean 9.12 Boxcar Distance 9.10 Brightness 2.9 Byte oriented 3.6 - ordering field 3.4 C CGA 3.3 Change detection 8.2 Chlorophyll 8.4 CHRM 3.15 - chunk 3.15, 3.16 Chromatic light 2.9 Chromaticity 2.9 Chunk Layout 3.15 - Specification 3.16 CIE chromaticity diagram 2.10 City Block Distance 9.10 CLARA 9.18 CLARANS 9.18 Classification 9.32

- Accuracy Assessment 9.32 - Scheme 9.3 - Cowardin Wetland 9.3 - Michigan Classification 9.3, 9.4 - U.S. Geological Survey Land Use/Land Cover 9.3 Clean polygon 3.27 CLEARANS 9.20 CLIQUE 9.20, 9.21 Clusters 4.18 - algorithm design or selection 9.16 - validation 9.16 Clustering 9.15 - Features (CF) 9.20 - Hierarchical approaches 9.17 - Pattern Classification 9.23 CMY 2.11 - model 2.11 CMYK 2.11, 2.14 Colour 2.9 - gamut 2.11 - model 2.11 - space 2.11 - system 2.11 Condensation-based method 9.20 Confidence limit test 1-tailed lower 9.33 Confusion matrix 9.33 Contingency table 9.33 Contrast 1.6 - enhancement 1.7 - exponential 4.22 - stretch 4.22 Conversion of colour from RGB to HSI 2.15 - HSI to RGB 2.15 Convolution 10.2 - filtering operation 10.2 - windows 10.2 Corrected Transformed Vegetation Index (CTVI) 8.8 Correlation 5.9, 5.10 - matrix 5.12 Covariance 5.8 - Matrix 5.8 Cubic convolution 6.12 Cumulative frequency 7.5 - histogram 7.5, 7.6

D Data drop-out noise 10.5 Data Visualization 4.3 - dependent noise 10.4 dBASE File 3.23 - table 3.23 DBSCAN 9.19 DDoG filter 10.10 Density-based clustering method 9.19 De-striping methods 6.3 Detector noise 10.5 Device specialized 3.2 Difference Vegetation Index (DVI) 8.12 Digital Elevation Model 1.5 Digital image 1.5 - sensor 1.5 Digital Terrain Models 2.4 Directional 10.10 - Derivative of Gaussian Filter 10.18 Directory Entries 3.7 Discrete differentiation operator 10.16 Discriminant function 9.23 Distance weighted averaging procedure 6.12 - based clustering methods 9.18 - based Vegetation Index 8.6, 8.9 Distribution-based clustering methods 9.18 Divergence 9.38 Double cone 2.15 E Earth rotation correction 6.8, 6.9 Edge 10.3 - crispening operation 10.2 - detection 10.8 - enhancement 1.5, 10.12 - extractions 10.2 EGA 3.3 Eigen values 8.32 - vectors 8.32 Electrical field (E) 1.2 Electromagnetic (EM) spectrum 1.1 - energy 1.2 Emissivity 1.3, 1.4

ENCLUS 9.19 Enhanced Vegetation Index 8.19 ENVI 4.9 Equalized histogram 7.5 ER MAPPER 4.1, 4.21 ERDAS IMAGINE 4.2, 5.10 Error Matrix 4.21, 9.33 - of commission 4.7, 9.34 - of omission 4.21, 9.9 ESRI shape file 3.23 Euclidean distance 9.10, 9.22 Expectation-Maximization (EM) 9.18 - hyperbolic 7.2 External indices 9.16 - separation 9.26 F Feature extraction 1.7, 9.1 - selection 9.2 - selection for clustering 9.22 - selection or extraction 9.15 Fidelity 1.6 File formats 3.1 - Header 3.2 - Structure of PNG 3.15 Fire severity 8.29 First Ring 3.29 Frequency 1.3 - domain 10.4 - histograms 9.7 Full-colour JPEG 3.12 Fuzzy 9.2 G gAMA (Image Gamma) 3.15 - chunk 3.15, 3.19 Gamma – ray spectrometer 2.1 - camera 1.5 - rays 1.2, 1.4, 1.7 Gaussian 5.3 - distribution 5.3, 5.11, 7.6 - Equalization 7.6 - stretching 7.2 Geometric Correction 4.4 Global modeling 1.6 Gradient 10.9, 10.14 - filtering 10.14 Gram Schmidt process 8.38 Graph-based methods 9.18


Graphic Interchange Format (GIF) 3.11 Gray-scale JPEG file 3.12 - pixel 3.13 Grey levels 2.8 - scale 2.8 Grid-based approach 9.19 - Clustering 9.19 Ground Control Point (GCP) 4.4, 6.6 H Hexagonal sampling pattern 2.7 Hierarchical approaches 9.17 High boost filter 10.10, 10.12 - pass filtering 10.1 hIST 3.16 - chunk 3.19 Histogram 5.2 - equalization 4.5, 5.3, 7.5 - matching 4.5 - Minimum Method (HMM) 6.5 - range of 8.5 HSI 2.12 - model 2.12 Hue 2.9, 2.14 I IBM 3.5 IDAT (Image Data) 3.16 - chunk 3.17 IDRISI 4.1, 4.12 - System Overview 4.13 IEND (Image Trailer) 3.16 - Chunk 3.16 IHDR (Image header) 3.16 - chunk 3.16 Image Addition 8.1 - Analysis 1.6, 1.7 - Classification 9.3 - Data Compression 1.6, 1.7 - Difference 8.2 - Division 8.3 - enhancement 4.2, 7.1 - File Directory 3.7 - File 3.7 - Multiplication 8.1 - Reconstruction 1.6, 1.7 - Representation and modeling 1.5 - Restoration 1.5, 1.7

- storage 1.7 - subtraction 8.2 IMAGINE Advantage 4.2 - Essential 4.2 - Professional 4.2 Imaging radar 2.4 Imaging spectrometer 2.3 Independent noise 10.4 Index file 3.23 Index file (.shx) 3.25 Indexed-colour pixel 3.14 Informational classes 9.1, 9.2 Infrared 1.4 Infrared Index (II) 8.9 Initial statistics 5.1 Inner Ring 3.29 Intensity 4.22, 6.12 Interactive Data Language (IDL) 4.9 Interchangeable formats 3.2 Internal homogeneity 9.15 Internal indices 9.16 ISODATA Algorithm 9.29 J Jefferies – Matusita (JM) 9.11 Joint Photographic Expert Graphic (JPEG) 3.11 K Kappa (κ) 9.36 Kernel window 10.2 Kirsch 10.10, 10.18 K-means 9.18 - Algorithm 9.18 Knowledge based 9.2 Kurtosis 5.2, 5.6 L Laplacian filter 10.10, 10.11 - of Gaussian filter 10.18 Laser scanner 2.4 Leaf Water Content index (LWCI) 8.9 Leaf-area index (LAI) 8.5 Left-skewed 5.6 Linear 5.9 - Enhancement 5.11 Location of Training area 9.5 Log filter 10.10, 10.19

Logarithmic Contrast Enhancement 7.7 Log-polar sampling pattern 2.7 Look Up Table 7.2 Low pass filtering 10.1 Luminance 2.9 LZW (Lempel-Ziv & Welch) 3.9 M MAFIA (Merging of Adaptive Finite IntervAls) 9.21 Magic number 3.2 Magnetic fields (M) 1.2 Magnitude of the gradient 10.14 Main file 3.23 - header (.shp) 3.24 Manhattan Distance 9.10 Map projection 4.24 MARIA 9.21 Mask 10.2 Maximin distance algorithm 9.26 Maximum 4.18 - distance 9.23 - Likelihood Classifier 9.43 - likelihood Estimation (MLE) 9.18 - values 4.18 Mean 5.4 - filter 10.6 - values 5.11 Median 5.5 - Filter 10.7 Medical processing 1.5 Memory – based storage 3.1 Mesokurtic 5.7 Microwave energy 1.5 MidIR Index 8.9 Minimum 4.17 - distance 9.14 - value 5.4 - distance classifier 9.25 - Distance to Means 9.43 Min-Max stretch 7.2, 7.4 Missing scan lines 6.2 Mode 4.10 - Filter 10.6 Model-based clustering methods 9.18 Modified Soil-Adjusted Vegetation Indices 8.14

  MSAVI1 8.14 MSAVI2 8.14 Moisture Stress Index (MSI) 8.9 Multi - Variate Statistics 5.11 Multi Level Median (MLM) Filter 10.7 - modal 5.3, 5.11 - Patch 3.29 - Patch shapes 3.30 - point 3.27 - PointM 3.28 - PointZ 3.28 Multiplicative 10.4 Multi-spectral classification 4.6 Multi-spectral scanner 2.3 N Nearest neighbour method 6.12 Negative association 5.8 - kurtosis 5.6 - skew 5.6 Neighborhood-based approaches 9.19 Neighbourhood averaging 10.3 Neural-based classifiers 9.6 Noise 9.16 - filtering 1.7 - removal 10.4 Noise Removal Filtering 10.4 Non – systematic errors 6.6 Non Linear Enhancement 7.5 Non-associated 5.8 Non-directional 10.10 Non-directional Filters 10.10 Non-linear 7.2 Non-linear model 10.4 Non-linear techniques 7.2 Non-parametric algorithms 9.8 Non-such 8.34 Non-systematic 6.6 Normalized Burn Ratio (NBR) 8.29 - City Block distance 9.10 - Difference Snow Index (NDSI) 8.28 - Difference Snow/Ice Index (NDSII) 8.27 - Difference Vegetation Index (NDVI) 8.27 - Difference Water Index (NDWI) 8.20


- Ratio Vegetation Index (NRVI) 8.8 N-Space Indices 8.36 Null Shapes 3.24 Number of Training area 9.6 Numbers of pixels 9.43 Nyquist criterion 2.6 O Olympic Filter 10.7 Online 3.1 Online storage 3.1 OPTICS 9.19 OptiGrid 9.19 Optimization problem 9.16 Orbital geometry model 6.8 Orbital period 6.9 ORCLUS (arbitrarily ORiented projected CLUSter generation) 9.21 Orthogonal transformation 8.30 Overall classification 9.32 - optimization search strategy 9.15 P Paeth 3.17 PAM 9.18 Parallelopiped 9.12 Parallelopiped classifier 9.12 Parametric algorithms 9.8 Parametric nonparametric 9.8 Partitioning 9.17 Pearson Product Moment correlation coefficient 5.9 Percentile stretch 7.2, 7.4 Perpendicular Vegetation Index (PVI) 8.6 Photos 2.2 pHYS 3.16 - chunk 3.19 Physical models 6.4 Piece Wise Linear Stretch 7.4 Placement of Training area 9.7 Plane of soils 8.35 Platykurtic 5.7 PLTE (palette) 3.16 - Chunk 3.17 P-Median (PM) Filter 10.7 PNG file signature 3.15 Point 3.23

PointM 3.25 PointZ 3.25 Polygon 3.25 PolygonM 3.25 PolygonZ 3.25 PolyLine 3.25 PolyLineM 3.25 PolyLineZ 3.25 Portable Network Graphics (PNG) 3.12 Positive association 5.8 - kurtosis 5.7 - skew 5.6 Pre-processing 6.1 Prewitt 10.15 Primary chunks 3.16 - colours 2.13 Principal Component Analysis (PCA) 8.30 Priori probabilities 9.14 Probabilistic 9.2 Proximity measure 9.16 Pseudo-colouring 1.7 PVI1 8.11 PVI2 8.12 PVI3 8.12 Pythagorean Theorem 9.14 Q Quantization 2.8 Quantization levels 2.8 R Radar 1.5 Radar altimeter 2.4 Radiance 2.9 Radio waves 1.2, 1.3 Radiometer 2.3 Radiometric error 6.1 Random sampling pattern 1.15 Ratio Index (RATIO) 8.6 - Vegetation Index (RVI) 8.6 Record Headers 3.25 Rectangular pattern 2.7 Reference map 9.32 Reflectivity 1.5 Registration 4.12 Regression method 6.6 Relative indices 9.16

Resource oriented 9.3 RGB 2.11 Right-skewed 5.6 Ring 3.27 Roberts Operator 10.15 Robotics 1.5 Rotation 6.7 Run Length Encoding (RLE) 3.4 S Sampling 2.5 Sampling pattern 2.7 Satellite tape format 3.30 Saturation 2.9 - Brightness 2.9 sBIT (significant bits) 3.16 sBIT chunk 3.20 Scalable Vector Graphic (SVG) 3.11 Scanner 1.5 Scatter plot 5.8 Second order polynomial 6.12 Sensor 2.1 - Active 2.1 - Passive 2.1 Shape file 3.22 Shape of Training area 9.6 Sharpening 1.7 Short-term 3.1 Signal to noise ratio (SNR) 10.4 Signature 3.2 Similarity function 9.15 Simple directional filter 10.13 Single-linkage 9.18 Site-specific accuracy 9.33 Size of Training area 9.5 Skew 6.8 - angle 6.9 - Correction 6.8 Skewness 5.6 Slope-based 8.5 Smoothing filters 10.5 Sobel Operator 10.15, 10.16 Software specialized 3.2 Soil adjustment factor L 8.13 Soil and Atmospherically Resistant Vegetation Index (SARVI) 8.18 Soil Adjusted Vegetation Index (SAVI) 8.13 - brightness 8.9


- line 8.10 Sonar 1.5 Spatial autocorrelation 4.15 - domain 10.19 - filtering 10.15 - Frequency 2.6 - resolution 1.6, 2.6 Special Indices 8.19 Speckle 10.4 Spectral class 9.2, 9.3 - properties 9.2 - ratioing 8.3 Speed of light 1.2 Standard deviation 4.14, 4.18 Statistical models 1.6 Statistics Extraction 9.5 STING 9.20 Sub, Up 3.17 Subspace clustering methods 9.21 Subtractive colour mixing 2.11 Supervised 9.2 Systematic 4.21 T Tagged Image File Format (TIFF) 3.4 Target number 7.5 Target number of pixel (Nt) 7.5 Tasseled Cap Transformation (TCT) 8.30, 8.34 - Components 8.34 templet filter kernel 10.2 tEXT (Textual data) 3.16 - chunk 3.16 TIFF Classes 3.9 - conformant classes 3.10 - Data Compression 3.9 - Geo TIFF 4.3 - file structure 3.4 Transformed Vegetation Index (TVI) 8.7 Thermal scanners 2.3 Thiam's Transformed Vegetation Index (TTVI) 8.8 tIME 3.21 - chunk 3.21 Training 9.4, 9.8 - Data Statistics 9.2, 9.8 - Site Selection 9.5 Transformed divergence 9.10

Translation 6.11 Triangle Fan 3.29 -Strip 3.29 Tristimulus values 2.9 tRNS 3.14 - Transparency 3.16 - chunk 3.14 True classes 9.33 - colour pixel 3.14 TSAVI1 8.13 TSAVI2 8.14 U Unimodal 5.3, 5.5 Univariate statistics 5.1 Unsupervised Classification 9.15 V Variance 4.10 - covariance 4.17, 4.22 Vegetation Index (VI) 8.4, 8.5 VGA 3.3 Video camera 2.2 W Wavelength 1.2 WDVI 8.13 Weighted Difference Vegetation Index (WDVI) 8.13 - Mean Filter 10.6 Wetness 8.35 White noise 10.4 Y Yellow stuff 8.39 Z Zero crossing 10.18, 233 - Filtering 10.18 - crossing filters 10.18, 10.19 Zero kurtosis 5.7 - mean 9.28 zTXT (Compressed external data) 3.16 - chunk 3.21  


About the Author

Dr. Sanjay Kumar Ghosh, Professor in the Department of Civil Engineering, IIT Roorkee, has been teaching undergraduate and postgraduate students for the last 30 years. He received his B.E. (Civil) from the University of Roorkee (now IIT Roorkee) in 1980 and was awarded the Thomason Engineering Design Gold Medal for the best B.E. (Civil) engineering design project. He completed his Masters with Honours in Advanced Surveying and Photogrammetry from the University of Roorkee in 1982, and his Ph.D. in 1991 from the University of Strathclyde, Glasgow, under a Commonwealth Scholarship from the U.K. Government. His areas of interest are Remote Sensing, Image Processing and GIS-based applications. He is a Life Member of 12 national and international professional societies, has served on various DST expert panels, and was the National Investigator for a DST-funded project on PURA, a dream project of former President Dr. A.P.J. Kalam. He is also a member of AICTE evaluation teams. He has handled 9 research projects funded by DST, AICTE, the Department of Space and the Ministry of Water Resources, including 2 international collaborative projects with the Government of Vietnam. He was elected a member of the European Union Committee on Ecosystem and Floods for the years 2000 and 2001 and visited Vietnam and China as an expert. He has guided more than 65 M.Tech. and 10 Ph.D. theses, and 6 Ph.D. programmes are presently in progress. He has published 43 research papers in national and international journals and has organized more than 35 specialist courses for various government agencies in the areas of Land Surveying, GPS, Remote Sensing, Image Processing and GIS. At present, he has been entrusted with a prestigious project by the Defence Estates, Ministry of Defence, for the surveying, verification and mapping of defence lands in many cantonments within the country using GIS, GPS and high-resolution satellite data. He has already published a book on Remote Sensing and GIS, which has also been translated into Russian. He has been awarded the Vijay Shree Award for Outstanding Services, Achievements and Contributions by the India International Friendship Society, the Glory of India Gold Medal by the International Institute for Success Awareness in 2004, and the Indira Gandhi Shiromani Award for Excellence in Research and Teaching by IIFS, New Delhi, in 2006.