Trace Residue Analysis. Chemometric Estimations of Sampling, Amount, and Error 9780841209251, 9780841211148, 0-8412-0925-1

Content: Statistics : a child of our time? / Frauke Tschiltschke -- Sampling for chemical analysis of the environment :


English Pages 279 Year 1985


Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.fw001


ACS SYMPOSIUM SERIES

284

Trace Residue Analysis


Chemometric Estimations of Sampling, Amount, and Error

David A. Kurtz, EDITOR
The Pennsylvania State University

American Chemical Society, Washington, D.C. 1985


Library of Congress Cataloging in Publication Data
Trace residue analysis.
(ACS symposium series, ISSN 0097-6156; 284)
Bibliography: p.
Includes indexes.
1. Trace elements--Analysis--Statistical methods.
I. Kurtz, David A., 1932. II. American Chemical Society. III. Series.
QD139.T7T73 1985    543    85-11226
ISBN 0-8412-0925-1

Copyright © 1985 American Chemical Society All Rights Reserved. The appearance of the code at the bottom of the first page of each chapter in this volume indicates the copyright owner's consent that reprographic copies of the chapter may be made for personal or internal use or for the personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc., 27 Congress Street, Salem, MA 01970, for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to copying or transmission by any means—graphic or electronic—for any other purpose, such as for general distribution, for advertising or promotional purposes, for creating a new collective work, for resale, or for information storage and retrieval systems. The copying fee for each chapter is indicated in the code at the bottom of the first page of the chapter. The citation of trade names and/or names of manufacturers in this publication is not to be construed as an endorsement or as approval by ACS of the commercial products or services referenced herein; nor should the mere reference herein to any drawing, specification, chemical process, or other data be regarded as a license or as a conveyance of any right or permission, to the holder, reader, or any other person or corporation, to manufacture, reproduce, use, or sell any patented invention or copyrighted work that may in any way be related thereto. Registered names, trademarks, etc., used in this publication, even without specific indication thereof, are not to be considered unprotected by law. PRINTED IN THE UNITED STATES OF AMERICA

ACS Symposium Series


M. Joan Comstock, Series Editor

Advisory Board

Robert Baker, U.S. Geological Survey
Martin L. Gorbaty, Exxon Research and Engineering Co.
Robert Ory, USDA, Southern Regional Research Center
Geoffrey D. Parfitt, Carnegie-Mellon University
Roland F. Hirsch, U.S. Department of Energy
James C. Randall, Phillips Petroleum Company
Herbert D. Kaesz, University of California, Los Angeles
Charles N. Satterfield, Massachusetts Institute of Technology
Rudolph J. Marcus, Office of Naval Research
W. D. Shults, Oak Ridge National Laboratory
Vincent D. McGinniss, Battelle Columbus Laboratories
Charles S. Tuesday, General Motors Research Laboratory
Donald E. Moreland, USDA, Agricultural Research Service
Douglas B. Walters, National Institute of Environmental Health
W. H. Norton, J. T. Baker Chemical Company
C. Grant Willson, IBM Research Department

FOREWORD

The ACS SYMPOSIUM SERIES was founded in 1974 to provide a medium for publishing symposia quickly in book form. The format of the Series parallels that of the continuing ADVANCES IN CHEMISTRY SERIES except that, in order to save time, the papers are not typeset but are reproduced as they are submitted by the authors in camera-ready form. Papers are reviewed under the supervision of the Editors with the assistance of the Series Advisory Board and are selected to maintain the integrity of the symposia; however, verbatim reproductions of previously published papers are not accepted. Both reviews and reports of research are acceptable, because symposia may embrace both types of presentation.

PREFACE

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.pr001

The symposium upon which this volume is based was organized originally because of the perpetual need to better formalize both understanding and error in the analytical methods used in quantitative analytical work. In this field, problem areas occur in sampling, recovery, and quantitative measurement. These analyses involve the production of numbers or data that describe quantitatively the system under scrutiny. Those who have been a part of this process know the locations of the various errors and have some idea of the size of the error. They may even run appropriate statistical tests to quantitatively determine the amount of error. However, society likes to have decisions made in a black and white manner and to know whether something is there or not. This situation suggests that the analytical error should drop to zero. While this result is the goal of all analytical work, it is simply not realistic. Our basic need, then, is to simplify error determinations and explanations and to educate the public both for the reasons and for the interpretations of error.

The goal of this volume is to further the use of mathematical and statistical tools (the field of chemometrics) for chemical and, specifically, trace chemical analyses of pesticides and environmental contaminants. Statistics have been used in chemical analysis in increasing amounts to quantify errors. The focus shifts now to other areas, such as in sampling and in measurement calibrations. Statistical and computer methods can be brought into use to give a quantified amount of error and to clarify complex mixture problems. These areas are a part of chemometrics as we use the term today.

Errors in trace analyses are usually hidden to all except those intimately involved in the sample collection and, later, in the bench analysis. In chromatography, especially, it is too easy to hide behind uncertain work because published research does not concern itself with exactly how the chromatographer makes his quantitative decisions. Today, with the advent of the microprocessor and with the use of "black box" instruments, the chromatographer knows even less about his calibration graph or line, or the error associated with it. In these instruments, a single point and the origin may determine the calibration graph. Similar problems exist in other modern instrumental analysis techniques.

This volume addresses these problems directly. The use of statistics enables error to be determined in calibration measurements at a particular confidence level. Decisions can be made in sample selection, and the limits of detection can be determined in an orderly manner. The knotty problem of outliers can be approached systematically.

The symposium on which this volume is based was formatted, first, to outline appropriate and noncumbersome methods for analytical decision making and, second, to make the methods easily understandable to the ordinary bench chemist so that they will actually be used. I visualize this text, actively being used, next to an analytical instrument. I hope it is clear enough so that when used, the bench chemist will be able to obtain more meaningful results that can be interpreted on short notice.

Acknowledgment is made to the donors of the Petroleum Research Fund, administered by the American Chemical Society, for their financial support that enabled foreign speakers to travel to the symposium upon which this volume is based.

DAVID A. KURTZ
The Pennsylvania State University
University Park, Pennsylvania
March 25, 1985

1 Statistics: A Child of Our Time? FRAUKE TSCHILTSCHKE

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch001

Department of Philosophy, Christian Albrechts University of Kiel, Kiel, West Germany

The use of data to represent scientific information has been found not only to be of modern use but also to have been a part of our society for centuries. Science and statistics thus formed have explained our complex world in more and more detail but have yet to fully explain truth ... the goal of the philosophers.

"Every year 10% of the American chemists spend 40 hours in conference rooms and use 19 pounds of paper." Even if this statement is not a truthful one, it expresses one of the well established forms of scientific statements, namely a statistical one. We are quite used to dealing with statistics, the collection and analysis of data and the drawing of conclusions from this data (1). In a scientific way, this mode constitutes no problem. On the other hand compare these two statements: "Get a shot against the flu because only very few of the inoculated people will get the flu," versus "Get a shot against the flu, because only 3% of the inoculated people will get the flu". The second statement provides more precise information than the first. Statistics seems to be a "magic" word of our time!

Statistical use has rapidly increased in our century, which indicates that a strong belief is now present. In earlier times, this belief did not exist, but are we sure that the use of statistics was not present then? Can it be traced back to the early civilizations?

This paper will show that statistics has been with us for a long time. The techniques have gradually developed from simple counting or information gathering to explaining complex phenomena with only limited information. Not to be forgotten will be the ultimate interactive role of philosophy.

Current address: Klodtstrasse 12, 2408 Timmendorfer Strand, West Germany

0097-6156/85/0284-0001$06.00/0 © 1985 American Chemical Society


Scientific Thinking

In former times philosophers proposed generalities to describe life. They failed in some respects because they were out of touch with reality and did not look at facts. The explanations were only logically derived and suggested errant directions. Later scientific endeavor gained favor because of its self-correcting nature. Making an observation, collecting information, analyzing it, and drawing conclusions which express the reasons for the common behavior in a scientific law is exactly the way scientific thinking works. If facts found later were inconsistent, the generalities were modified to include them. In short, an understandable reason for an observed event was sought.

Too often, enough information simply was not available for adequate scientific laws to be fashioned. In this area, statistics was found to have an ever-increasing role. A confidence level was devised that allowed limited facts to express larger generalities. This narrowed the amount of work scientists had to do to come to good conclusions (and sometimes increased the work they thought they had to do to reach a desired confidence!). Conclusions or hypotheses are never absolute but are more and more certain as the number of facts available increases. Nonetheless, astounding amounts of good conclusions, again at a given confidence level, can be drawn from limited facts using statistics.

Statistics in Former Times

The early role of statistics was essentially only in collecting facts and assembling them in an orderly way. It seems to have been a valuable method since we have seen such an increase in its use. For example, we have found reports about weather, stars, sun, moon, and change of day and night in all of the old cultures. Even without a complex language script like ours it was possible to cut marks in stones and sticks, which allowed counting. In 4241 B.C. Egyptians had a fairly precise calendar; even the leap year was known (2).

Other examples of scientific observations include registrations of populations, harvests, and tenure. Around 3700 B.C. Seneferu, a mighty warrior, raided and captured 7,000 men and 200,000 sheep, cattle, and goats (3).

Wherever people lived together and depended on each other, it was necessary to make plans for the use of their land and available water for producing food. For example, in Egypt the cultivation depended on the flood area of the Nile River. They, therefore, had observation stations along the river to measure the water level. From this measuring they made the following very precise predictions: only a 21 foot level meant famine. A 23 foot level meant imperfectly watered land. However, at 26.5 feet the whole country had plenty of water (4).


One of the best examples of a precise statistical estimation of counting, analyzing, and drawing conclusions was the exact forecast of the eclipse of the sun in 585 B.C. made by Thales of Miletus (5).


New Beliefs in Numbers

Over the years people have switched their beliefs from the explanations of the gods of former times to the explanations of the gods of modern times, the scientists and statisticians. It is easy to see how people have done this. Modern people began to believe in numbers and data because they represented nature so well. The theories of science and statistics were explained, and the ideas of science and statistics became better founded and entrenched in their thinking. However, one mistake was made and that was a big one: facts and data were taken as truth and reality.

The change in the beliefs of the people did not happen overnight. As facts and data began to substantiate the theories and methods of science, the beliefs of the people slowly evolved away from the more general explanations of the philosophers. The feeling that the thinking of the philosophers represented truth and reality was lost.

In former times the friends of truth (which is the translation of the Greek word for philosopher) tried to find basic explanations from which they could explain all the natural phenomena in the world. However, the study proved to be too complex. In chemistry and physics, for example, there is the belief that the world is built of basic elements, but people kept finding smaller and smaller elements: atoms, neutrons, and now neutrinos and quarks. Our thinking just became shrouded with facts so that the wholeness of the world became lost.

Wholeness Thinking Lost

The truth the philosophers searched for was strongly influenced by the idea of wholeness. Wholeness gives a broad direction in life. However, our people have often felt that scientific thinking has lost its connection to the idea of truth and wholeness. The world is so complex and detailed that people have become only specialists instead of generalists. The latter category includes the philosophers. We switched over to the idea that specialists can be the only ones that bring truth. The different directions, such as math, physics, art, and philosophy, lost their connections to each other and were removed far away from their original study areas and the idea of truth and wholeness.

Today the different departments build up such large realms of special knowledge that year-long studies are necessary to find one's way through. For each of these scientific realms people developed their own language, which was almost like the building of the tower of Babel.


With statistics as an example, I have tried to show how easily a progressive thought, although valuable in explaining nature, loses its context. To be sure, statistics has aided the development of science tremendously, especially in recent times. In spite of this, however, explanations of nature that aid life have fallen short of this mark. It is now necessary to find again one common "language" so that we are able to put the results of the different fields together and to bring increased understanding of our world.

Today people wish for security. Their orientation is expressed by the way they believe in science, namely, in something god-like that should be able to rule the world. Our society puts all emphasis on a scientific education and scientific research, and eliminates at the same time all other possible methods which can offer explanations.

Therefore, it might be a good start to organize a different sort of conference, where scientists from all disciplines sit together and discuss ways to cooperate with each other. Here philosophers should lead the discussions. Lively discussions will ensure new directions, ideas, and goals, which will again be close to their original thought, the friendship to truth!

Acknowledgments

The author wishes to thank Jean Cummins, a friend from Kent, WA who spent quite some time to make my English understandable. I also want to thank Dr. Wolfgang Deppert, Department of Philosophy, Christian Albrechts University of Kiel, West Germany, who was my philosophy teacher and who was the first who confronted me with some of these ideas.

Literature Cited

1. Snedecor, G. W.; Cochran, W. G. in "Statistical Methods"; 7th Ed.; Iowa State University Press: Ames, IA, 1980.
2. Breasted, J. H. in "A History of the Ancient Egyptians"; Smith, Elder and Co.: London, 1908; Vol. V, pp. 35-36.
3. Budge, E. A. W. in "A Short History of the Egyptian People"; J. M. Dent and Sons, Ltd.: London, 1914; p. 38.
4. Budge, E. A. W. in "The Nile"; Thomas Cook and Son: London, 1902; pp. 76-77.
5. Rousseau, P. in "Man's Conquest of the Stars"; Jarrolds Publishers, Ltd.: London, 1959; p. 43.

RECEIVED May 6, 1985

2 Sampling for Chemical Analysis of the Environment: Statistical Considerations B. KRATOCHVIL

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch002

Department of Chemistry, University of Alberta, Edmonton, Alberta, Canada T6G 2G2

A statistically valid sampling plan requires careful design and execution so that generalizations based on mathematical probability can be drawn from a small number of test portions. Guidelines are given for estimation of the minimum number and size of sample increments needed to achieve a given level of confidence in chemical analyses.

Accurate sampling for pesticides and pesticide residues in the environment presents formidable problems. The population of interest is likely to be complex. It may consist of such diverse matrices as air, water, vegetation, soil, sediment, fish, or wildlife. Furthermore, concentrations of the sought-for substance may be low and unevenly distributed.

The necessity for a sound sampling program in any study of pesticide distribution in the environment is generally recognized. Yet programs are often so designed as to be statistically unsound, or a valid, well designed plan is compromised by expediency or carelessness. The effort expended on evaluation of sampling designs for pesticide monitoring is usually exceedingly small compared with that expended on the analytical measurements.

In only a few cases have general considerations for statistically sound environmental sampling plans been discussed (1, 2). An example of a thorough sampling study is the investigation of fungicide persistence in soil by a randomized sampling plan (3). Other authors have presented general criteria for sampling matrices such as soils (4), plants and soils (5), and air (6); a more general review on sampling for chemical analysis is available (7). A useful discussion containing much practical information has been provided by the monitoring panel of the Federal Working Group on Pest Management in the U.S.A. (8). This group observed that most recorded data on deleterious substances in the environment have not come from programs designed according to statistical principles, and so the reliability of extrapolations from the results cannot be assessed.

0097-6156/85/0284-0005$06.00/0 © 1985 American Chemical Society

The reliability of any environmental analytical data depends upon the reliability of sample quality. To generalize from analytical results on a small portion of material to a larger population requires careful planning and execution if bias is to be avoided. This article considers the general problems involved in sampling heterogeneous bulk populations such as soil, air, and natural waters; specific details for particular types of materials are not included. These problems include the heterogeneity of most environmental materials; the costs in time, manpower, and effort required for collection of real samples; and the need to avoid contamination or decomposition of samples after collection.

A set of definitions of terms frequently used in sampling is provided because usage sometimes differs among statisticians, chemists, and others. The definitions have been selected after consideration of the recommendations of various standards organizations.

Background

Sources of error in an analysis may be classified as random or systematic. Systematic errors generally bias a result in one direction in a relatively reproducible way and are not usually amenable to statistical treatment. Random errors vary in a nonreproducible way around the true value and can be treated statistically by the laws of probability.
Therefore in this discussion we shall deal only with random errors, keeping in mind that most errors are partly random and partly systematic and that systematic errors in the analytical operations can be controlled by proper use of blanks, standards, and reference samples. Because poor samples are not identifiable by such checks, sampling uncertainty is often treated separately.

For random errors the overall variance $s_o^2$ is the sum of the sampling variance $s_s^2$ and the variance of the remaining analytical operations $s_a^2$:

$$s_o^2 = s_s^2 + s_a^2$$

The value of $s_s^2$ may be obtained by subtraction of $s_a^2$ (known if a measurement process is in statistical control) from $s_o^2$ (obtained by analysis of the samples). Alternately, a series of replicate measurements or samples can be designed to evaluate both standard deviations. Reduction in the overall uncertainty requires, therefore, attention to both sampling and analytical operations. Once the analytical standard deviation $s_a$ is one third or less of the sampling standard deviation $s_s$, further reduction in $s_a$ has little effect on $s_o$ (9).

An example of the importance of sampling is in the determination of aflatoxins (a class of highly toxic compounds produced by molds) in peanuts (10, 11). Because the distribution of contaminated kernels is typically patchy and uneven, and because the tolerable level of contamination is so low (about 25 ppb), sampling is the major source of analytical uncertainty even with samples of over 20 kg.

The overall analytical process can be divided into five steps: construct a model, design a plan, take samples, perform analyses, and evaluate results (12). The model defines the population to be studied, the substances to be measured (including speciation), the extent of distribution within the population, and the level of precision required. The sampling plan specifies the number, size, and location of the sample increments, the extent of combining of increments (compositing), and the procedures for reduction of the bulk or gross sample to a laboratory sample and to test portions (subsampling).

The plan should be written as a detailed protocol before work begins and revised as warranted by new information. It should include on-site criteria for collection of a valid sample, such as whether a substance should be considered foreign to the population and rejected. A discarded piece of metal or plastic in a field, for example, might be considered foreign for a soil analysis and therefore legitimately rejected. It should also include information on procedures for protection of the sample from contamination before and after collection, for preservation, and for labeling and recording of all appropriate information. Field sampling operations are often costly in time and manpower. Those collecting samples should be aware of the possibility of bias and contamination.
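The additivity of variances stated in this Background section (overall variance = sampling variance + analytical variance) can be illustrated numerically. The following is only a minimal sketch: the function names and the standard deviations used are hypothetical and are not taken from the chapter.

```python
import math


def overall_std(s_sampling: float, s_analytical: float) -> float:
    """Combine independent random errors: s_o^2 = s_s^2 + s_a^2."""
    return math.sqrt(s_sampling**2 + s_analytical**2)


def sampling_std(s_overall: float, s_analytical: float) -> float:
    """Recover s_s by subtracting the analytical variance from the overall variance."""
    diff = s_overall**2 - s_analytical**2
    if diff < 0:
        raise ValueError("s_a exceeds s_o; the variance estimates are inconsistent")
    return math.sqrt(diff)


if __name__ == "__main__":
    s_s = 3.0  # hypothetical sampling standard deviation (arbitrary units)
    # Shrink the analytical standard deviation and watch the overall value level off.
    for s_a in (3.0, 1.0, 0.5, 0.0):
        print(f"s_a = {s_a:.1f}  ->  s_o = {overall_std(s_s, s_a):.3f}")
```

With the analytical standard deviation at one third of the sampling standard deviation, the overall standard deviation exceeds the sampling value by only about 5%, which is why further analytical refinement then gains little.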

Random and Systematic Sampling

In devising a model for an analytical operation, we identify a target population to which we want our conclusions to apply. This will differ from the parent population from which the samples are actually taken. The difference may be reduced by random selection of individual portions (increments) for analysis so that each part of the population has an equal chance of selection. Genuinely random sampling is difficult because bias, unconscious or deliberate, is readily introduced. Untrained individuals often have difficulty in accepting that an apparently unsystematic sampling pattern must be followed to be valid.

For simplicity and convenience, sampling at evenly spaced intervals over a population is often used in place of random sampling. For example, a field may be divided into uniform segments, and a sample taken from the center of each segment. This procedure is generally subject to more bias than random sampling. Should periodicity in the population be present or suspected, segments to be sampled should be selected with the aid of a table of random numbers (13). The sampling site within each segment should then be selected by further division into imaginary subsegments, each assigned a number, and the one to be sampled selected from a table of random numbers.

Sometimes random sampling is difficult to execute, as when a stream is being monitored with a time-activated automatic remote sample collection device. Under such conditions a random start or other superimposed random time element may be substituted. The efficiency of systematic sampling improves as the population becomes better understood. Both theoretical and experimental studies of this point have been made (14).

When the component of interest is distributed in a segregated way, special sampling precautions may be needed. Thus, a pesticide may have been distributed in higher concentration in one part of the area under study or may have undergone more rapid degradation in a low wet portion of a field. To obtain a valid sample of a stratified material, the procedure recommended (15) is to (i) divide the population into segments (strata) based on the known or suspected pattern of segregation, (ii) further divide the major strata into subsections and select the required number of subsections to be sampled by use of a table of random numbers, and (iii) collect samples proportional in number to the relative size of the major strata. Stratified random sampling is preferable to unrestricted random sampling, provided the number of major strata is kept sufficiently small that several increments can be taken from each.
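The three-step stratified procedure described above can be sketched in code. This is an illustration under stated assumptions, not the procedure of reference 15: a pseudo-random generator stands in for the table of random numbers, the size of a stratum is taken to be its number of equal-sized subsections, and the function names and the example field layout are hypothetical.

```python
import random


def allocate_proportional(stratum_sizes: dict, n_total: int) -> dict:
    """Split n_total samples among strata in proportion to size (largest-remainder rounding)."""
    total = sum(stratum_sizes.values())
    if not 0 <= n_total <= total:
        raise ValueError("n_total must lie between 0 and the number of subsections")
    raw = {name: n_total * size / total for name, size in stratum_sizes.items()}
    counts = {name: int(value) for name, value in raw.items()}  # round down first
    leftover = n_total - sum(counts.values())
    # Hand the remaining samples to the strata with the largest fractional parts.
    by_fraction = sorted(raw, key=lambda name: raw[name] - counts[name], reverse=True)
    for name in by_fraction[:leftover]:
        counts[name] += 1
    return counts


def stratified_random_sample(subsections: dict, n_total: int, seed=None) -> dict:
    """Randomly pick subsections within each stratum, in number proportional to stratum size."""
    rng = random.Random(seed)  # stand-in for a table of random numbers
    sizes = {name: len(units) for name, units in subsections.items()}
    counts = allocate_proportional(sizes, n_total)
    return {name: rng.sample(subsections[name], counts[name]) for name in subsections}


if __name__ == "__main__":
    # Hypothetical field: 30 upland subsections and 10 in a low wet portion.
    field = {"upland": list(range(1, 31)), "low_wet": list(range(1, 11))}
    print(stratified_random_sample(field, n_total=8, seed=1985))
```

In this hypothetical layout, eight increments are split six to two between the strata. Keeping the number of strata small, as the text advises, is what leaves several increments available for each.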
Composite Samples

When only the average properties of a population, and not the variability or distribution of the sought-for component, are of interest, a composite sample may be prepared and analyzed. Distinction should be made between composite and representative samples. A representative sample is frequently defined as one that possesses the average properties of a population; a composite sample is usually produced by homogenizing in any of several ways one or more sample increments, and it constitutes one approach to producing representative samples. Compositing usually means fewer analyses are required, and sample storage, recording, and handling are simplified once compositing is completed. But much useful information may be lost in preparing a composite sample. Analysis of individual samples collected by a properly designed and executed sampling plan permits determination of the between-sample and within-sample variability as well as the average composition. This information helps to establish the heterogeneity of the population, identify anomalous samples, and evaluate differences within and between laboratories. Thus composite samples provide limited information and should be employed only after careful consideration of the disadvantages involved.

2.

KRATOCHVIL

Sampling for Chemical Analysis of the Environment

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch002

Subsampling

If the sample increment is larger than the amount (test portion) needed per measurement, subsampling is necessary. This operation may be simple, as with many liquid or gaseous materials, or complex, as with certain bulk solids. The work required to produce a uniform subsample depends on the heterogeneity of the original material. Subsampling of solids may require several steps of particle size reduction and mixing; much has been written on this topic. Particle size reduction is important when the particles differ appreciably in composition because sampling error may occur even in a well mixed sample if too few particles are taken for analysis. One approach to determining the extent of the reduction needed is to treat the sample as a two-component mixture, with each component containing a different amount of the substance of interest (16,17). This treatment is based on a binomial distribution of the two kinds of particles. Because it has been covered in detail elsewhere, it will not be considered here.

Distributions Found in Nature

For the purpose of sampling for chemical analysis three types of distributions can be considered. These are the Gaussian (also known as the normal, Laplace, or DeMoivre), the Poisson, and the negative binomial. Knowledge of the type of distribution is useful in devising the most efficient sampling design. Gaussian and Poisson distributions are both closely related to the binomial distribution, which applies to the probability of whether or not an event will be observed in a series of independent observations.
[The binomial distribution is based on the probability of an event or property being observed, $p$, or not observed, $1-p$, in a series of $n$ independent observations. The distribution of the number of times the event is observed, $x$, in $n$ trials is given by

$$P(x) = \binom{n}{x}\, p^{x} (1-p)^{n-x}$$

For further information see Reference 18.]

The event might be the presence of any particular attribute in a sample, such as the detection of a pesticide. Only two levels of the attribute are possible, present or not present. If many attributes contribute to the result of an observation, the binomial probability distribution approaches a limiting curve whose equation is given by

$$y = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$

As applied to an analytical measurement of a substance, $y$ is the probability of a measurement value being observed, $\mu$ is the true value for the substance, and $\sigma$ is the standard deviation in $\mu$. This equation describes the Gaussian distribution. This distribution is observed for a large fraction of the systems encountered in chemical analysis; a characteristic is that $\mu$ is greater than $\sigma^2$.

The Poisson distribution is closely related to the binomial, and is likewise derived from consideration of discrete properties. [The Poisson distribution is given by

$$P(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$$

where $\lambda = Np$ when $N$ is large and $p$ is small. Thus $\lambda$ is the expected number of events occurring on any given observation, $\bar{x} = \lambda$. The Poisson distribution is a limiting form of the binomial distribution (18).] It applies when the possible number of values $N$ is large but the probability $p$ of the attribute of interest being observed is small. One example is the measurement of radioactive decay, where the probability of any one of a large number of atoms undergoing decay at a given time may be small. Another example might be the location of a weed seedling or a live insect in a field after spraying with a pesticide. In the field there are a large and unspecified number of points where a weed plant or insect might be found, but the probability of finding one at a given point will be small if the application of pesticide has been successful. The Poisson distribution is characterized by $\mu$, the mean or average, being equal to the variance $\sigma^2$. Thus the standard deviation $s$ for a set of measurements in a Poisson distribution is easily obtained by taking the square root of the average, $s = \sqrt{\bar{x}}$. Each observed event must be independent for the Poisson distribution to hold.

A third type of probability distribution frequently encountered in nature is where the occurrence of one event at some location increases the probability of other events being observed nearby. This leads to clumping or patchiness, characteristic of many biological systems such as weed or insect infestations, and mold growth in stored grains. Although a variety of probability distributions have been considered for contagious systems, the most successful appears to be the negative binomial. Here a distinguishing characteristic is that $\sigma^2$ is greater than $\mu$.

Major considerations in any sampling plan are the size and number as well as the location of the sampling increments. The following sections consider aspects of these points.
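As a brief aside on the three distributions just described, the mean-to-variance comparison that distinguishes them can be screened for in preliminary count data. The Python sketch below is a rough illustration only: the function name `dispersion_label`, the 20% tolerance, and the example counts are assumptions made here, and such a comparison is no substitute for a formal goodness-of-fit test.

```python
from statistics import mean, variance

def dispersion_label(counts, tolerance=0.2):
    """Compare the sample variance with the sample mean of count data.

    Returns the variance-to-mean ratio and a label following the chapter's
    characterization: mean > variance (Gaussian case), mean about equal to
    variance (Poisson), variance > mean (clumped, negative binomial).
    The tolerance is an arbitrary screening threshold, not a statistical test.
    """
    m = mean(counts)
    v = variance(counts)       # sample variance with n - 1 in the denominator
    ratio = v / m
    if ratio < 1 - tolerance:
        label = "mean > variance"
    elif ratio > 1 + tolerance:
        label = "variance > mean"
    else:
        label = "mean ~ variance"
    return ratio, label

# Hypothetical counts of insects per sampling site.
even_counts = [2, 3, 2, 3, 2, 3]             # very uniform
patchy_counts = [0, 0, 0, 10, 0, 0, 12, 0]   # clumped
```

With only a handful of preliminary samples the ratio is itself quite uncertain, so the label is best read as a hint about which model to examine first.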


Estimation of Minimum Size of Sample Increments

For the determination of a chemical or a pesticide in a field the sampling increment may be a bulk quantity such as a core of soil, a volume of air passed through a particulates collector, or a quantity of vegetation gathered from a single site. A useful method for relating the amount of sample in an increment to the sampling uncertainty, developed by Ingamells (19,20) for mining exploration, can be applied effectively to unsegregated Gaussian distributions. In this approach a sampling constant $K_s$, corresponding to the weight of sample required to limit the sampling uncertainty to 1% relative with 68% confidence, is defined by

$$K_s = W R^2 \qquad (1)$$

where $W$ represents the weight of sample taken and $R$ is the percent relative standard deviation in sample composition. For a given population, $K_s$ is evaluated by performing a series of analyses on sets of samples of differing size either by calculation or with the aid of a sampling diagram.

An example is a study of human liver homogenate prepared by cryogenic grinding at the National Bureau of Standards (21). The effectiveness of the homogenization step was assessed by withdrawing a small portion of tissue, irradiating it, adding it to the remainder of the sample, performing the homogenization operation, and measuring the sodium-24 activity in ten samples each of about 0.1, 1, and 5.5 g. For the first set of ten a value of 13.1% was obtained for the percent relative standard deviation $R$, for the second set a value of 5.5%, and for the third 2.53%. From Equation 1 values for $K_s$ are 17, 30, and 35 g. Thus the value for $K_s$ approaches 35 g, and this is the best estimate of the sampling constant. From Equation 1, then, we find that the weight of subsample required to hold the sampling uncertainty to 1% relative is 35 g.

Equation 1 can be used to estimate the sampling uncertainty for subsamples of other sizes. In the above example, a subsample of 0.5 g would be expected to give a sampling uncertainty of about 8% relative. Note that preliminary measurements are necessary to establish the degree of heterogeneity of the individual sample increments whenever the properties of the population are unknown. Under such conditions estimation of $K_s$ should not be based on a single increment, but on results from several.
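The arithmetic of the liver homogenate example can be reproduced with a few lines of Python. This is a minimal sketch of Equation 1 only; the function names are chosen here for illustration, and the weights and relative standard deviations are the values quoted in the text.

```python
import math

def sampling_constant(weight_g, rsd_percent):
    """Equation 1: K_s = W * R^2, with R in percent relative standard deviation."""
    return weight_g * rsd_percent ** 2

def expected_rsd(ks, weight_g):
    """Equation 1 rearranged: R = sqrt(K_s / W), in percent."""
    return math.sqrt(ks / weight_g)

# (weight in g, percent relative standard deviation) from the example above
observations = [(0.1, 13.1), (1.0, 5.5), (5.5, 2.53)]
ks_values = [sampling_constant(w, r) for w, r in observations]

# Sampling uncertainty expected for a 0.5 g subsample, taking K_s = 35 g
rsd_half_gram = expected_rsd(35.0, 0.5)
```

The three computed constants round to 17, 30, and 35 g, and the 0.5 g subsample gives a little over 8% relative, in line with the figures in the text.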
It is always sound practice whenever possible to perform a preliminary assessment of an unknown population by collecting a few samples and analyzing for the component of interest. These samples can be selected on the basis of experience and judgment. Then on the basis of the preliminary results a refined sampling plan can be designed.

Estimation of Minimum Number of Sample Increments

A second factor to consider in a valid sampling plan is the collection of enough individual sample increments to ensure that heterogeneity on a large scale does not bias the results. Estimation of this number can be made straightforwardly if the component of interest is distributed throughout the population according to a known statistical relation. If the distribution is Gaussian or binomial, the minimum number of increments $n$ can be estimated from

$$n = \frac{t^2 s^2}{R^2 \bar{x}^2} \times 10^4 \qquad (2)$$

where $t$ is the Student's $t$-table value for the level of confidence desired, $s^2$ and $\bar{x}$ are estimated from preliminary measurements on or previous knowledge of the population, and $R$ is the percent relative standard deviation acceptable as sampling uncertainty. Initially, $t$ can be set at the value for 95% confidence limits, 1.96, and an initial estimate of $n$ calculated. The $t$ value for this $n$ can then be substituted, and the system iterated to constant $n$. If the distribution is Poisson, $s^2 = \bar{x}$, and Equation 2 simplifies to

$$n = \frac{t^2}{R^2 \bar{x}} \times 10^4 \qquad (3)$$

For a negative binomial distribution an index of clumping $k$ must be incorporated, and Equation 2 becomes

$$n = \frac{t^2}{R^2} \left( \frac{1}{\bar{x}} + \frac{1}{k} \right) \times 10^4 \qquad (4)$$

Both $k$ and $\bar{x}$ are estimated from preliminary measurements.

Estimation of Number and Size of Increments for a Segregated Population
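Before turning to segregated populations, the estimate of the minimum number of increments from Equations 2 to 4 can be sketched in Python. Several things here are assumptions rather than part of the chapter: the function names, the example values, and the search strategy. Instead of iterating literally, the sketch looks for the smallest $n$ that is consistent with its own $t$ value, which is the value a converging iteration settles on. The $t$ values are the standard two-sided 95% table entries up to 30 degrees of freedom, with a first-order approximation beyond that.

```python
# Two-sided 95% Student's t values for 1 to 30 degrees of freedom (table values).
T95 = [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
       2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
       2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045, 2.042]

def t95(df):
    """95% two-sided t value; approximate expansion in 1/df beyond the table."""
    if df <= 30:
        return T95[df - 1]
    z = 1.960
    return z + (z ** 3 + z) / (4 * df)

def relvar_gaussian(s, xbar):
    return s ** 2 / xbar ** 2        # s^2 / xbar^2 term of Equation 2

def relvar_poisson(xbar):
    return 1.0 / xbar                # Equation 3: s^2 = xbar

def relvar_negbin(xbar, k):
    return 1.0 / xbar + 1.0 / k      # Equation 4: clumping index k

def min_increments(rel_var, r_percent):
    """Smallest n >= 2 with n >= t(n-1)^2 * rel_var * 1e4 / R^2."""
    base = rel_var * 1e4 / r_percent ** 2
    n = 2
    while n < t95(n - 1) ** 2 * base:
        n += 1
    return n

# Hypothetical preliminary data: s = 2, mean = 10, acceptable R = 10%.
n_gauss = min_increments(relvar_gaussian(2.0, 10.0), 10.0)
```

For these hypothetical inputs the first estimate with $t = 1.96$ is 16 increments, and accounting for the larger $t$ at small $n$ raises it to 18.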

When the population is segregated, a number of samples should be taken from each stratum or segment. A guide to the number of samples to collect under these circumstances has been developed by Visman (22,23). Through an empirical study, subsequently put on a theoretical footing by Duncan (24,25), Visman derived the relation

$$s_s^2 = \frac{A}{wn} + \frac{B}{n} \qquad (5)$$

where $s_s^2$ is the variance of the average of $n$ samples of individual weight $w$, and $A$ and $B$ are constants for a given population. The magnitude of $A$ depends on the degree of homogeneity at the local level, and may be calculated from Ingamells' subsampling constant $K_s$ and the average concentration of sought-for component $\bar{x}$ by $A = 10^{-4}\,\bar{x}^2 K_s$. Once $K_s$ and $\bar{x}$ have been estimated by $n$ preliminary measurements on a given material, $B$ can be estimated by calculating $s_s^2$ for the same preliminary measurements and substitution into Equation 5. The magnitude of $B$ depends on the extent of segregation or stratification in the material. Once $A$ and $B$ are known, and an acceptable level for the standard deviation of sampling decided on, various combinations of $w$ and $n$ can be chosen to hold $s_s$ within the selected value.

Two other methods of obtaining values for $A$ and $B$ have been developed. In the first, two sets of samples, one of relatively large and the other of relatively small increments, are collected; the constant $A$ is obtained from the measurements on the small samples, and the constant $B$ from the large samples. Small samples make the first term on the right side of Equation 5 larger than the second by emphasizing the effects of local heterogeneity and by making the value of $w$ smaller. Large samples have the reverse effect, and when $w$ is of such a size that the second term swamps the first, a value for $B$ can be calculated.

If the material being sampled consists of discrete particles such that an average particle mass can be calculated, then still another method is useful. In this procedure the constants $A$ and $B$ of Equation 5 are obtained from the intraclass correlation coefficient $r$ between pairs of small, single-increment samples of equal mass, the increments of each pair being collected near each other and the pairs distributed over the population under study. The value of $r$ can be estimated from the relation

$$r = \frac{2\sum (x-\bar{x})(x'-\bar{x})}{\sum (x-\bar{x})^2 + \sum (x'-\bar{x})^2}$$

where the sums are over all pairs $x$ and $x'$ and $\bar{x}$ is the mean of all measurements (26).
From this pilot study of 10 to 20 pairs the constants $A$ and $B$ are obtained by $A = s^2/(rm + 1/w)$ and $B = rAm$. Here $m$ equals 1/(average particle mass), $w$ the mass of the individual sample increments, and $s$ the pooled standard deviation for the measurements. An attractive aspect of this approach is that it also allows calculation of a minimum detectable bias (MDB) in the sampling operation for any specified confidence level and number of samples from the relation

$$\mathrm{MDB} = t s \sqrt{2/n}$$

The value for $t$ is obtained from a table of Student's $t$ values (see, for example, Table T-5 in Ref. 13, or Table A-4 in Ref. 26) for the desired confidence level and number $n$ of samples taken. The need to estimate the average particle mass limits this method to granular materials. An example of a calculation of $r$, $A$, $B$, and MDB is given in the Appendix.
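The quantities in this section fit together as in the following Python sketch. It is an illustration under stated assumptions, not the chapter's own calculation: the function names and all numerical inputs are hypothetical, the pooled standard deviation is taken as given rather than computed, and the MDB relation is read here as $t s \sqrt{2/n}$, which should itself be treated as an assumption about the printed formula.

```python
import math

def intraclass_r(pairs):
    """Intraclass correlation from pairs (x, x') of neighboring increments."""
    values = [v for pair in pairs for v in pair]
    xbar = sum(values) / len(values)          # mean of all measurements
    num = 2 * sum((x - xbar) * (xp - xbar) for x, xp in pairs)
    den = (sum((x - xbar) ** 2 for x, _ in pairs)
           + sum((xp - xbar) ** 2 for _, xp in pairs))
    return num / den

def visman_constants(r, s, w, avg_particle_mass):
    """A = s^2 / (r*m + 1/w) and B = r*A*m, with m = 1 / (average particle mass)."""
    m = 1.0 / avg_particle_mass
    a = s ** 2 / (r * m + 1.0 / w)
    b = r * a * m
    return a, b

def sampling_variance(a, b, w, n):
    """Equation 5: variance of the average of n increments of weight w."""
    return a / (w * n) + b / n

def min_detectable_bias(t, s, n):
    # Assumed reading of the printed relation: MDB = t * s * sqrt(2 / n)
    return t * s * math.sqrt(2.0 / n)

# Hypothetical pilot values: r = 0.5, pooled s = 2.0, 10 g increments, 0.01 g particles.
A, B = visman_constants(r=0.5, s=2.0, w=10.0, avg_particle_mass=0.01)
```

A useful internal check is that `sampling_variance(A, B, w, 1)` returns $s^2$ for the increment weight used in the pilot study, since the two constants were defined from that single-increment variance.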


Estimation of Sample Size when Form of Population Distribution is Unknown

In the preceding sections the Gaussian, Poisson, and clumped distributions have been discussed, and methods of calculating the number of samples in each case have been given. When no information is available about a population, however, the question arises as to the best approach to use. If sufficient samples can be collected and analyzed to establish the distribution as one of the three, the problem is solved. If the distribution does not fit one of the above, it should be checked to see whether it can be converted to Gaussian by taking the logarithm of the values. Transformations using functions other than logarithmic may be considered, but are not easily related to most real systems.

For unknown distribution forms where only limited data are available it is possible to draw useful conclusions without knowledge of the distribution. For example, a confidence interval can be established for a set of analytical values by plotting cumulative percent of the number of analyses on the vertical axis against the individual analytical values on the horizontal axis. Then draw lines parallel to this plot at a distance of $100\,d_{1-\alpha}$, the values for $d_{1-\alpha}$ being read from a table for various numbers of samples and confidence intervals (see, for example, Table A-21 in Ref. 18). Tables are also available to determine the number of samples required to be able to state that the population cumulative distribution is within a defined band at a selected confidence level (Ref. 18, Table A-21b). The numbers tend to be large. For example, to be 95% sure of containing the distribution within an interval of ±10% relative 740 samples would be required. Clearly the price required for not knowing the form of the population distribution is more data.
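The cumulative-percent plot with its parallel band can be sketched in Python as follows. This is a hedged illustration: the function name `cumulative_band` is hypothetical, and in place of the tabulated distances cited above the sketch uses the large-sample approximation $d \approx 1.36/\sqrt{n}$ for a 95% band, which is an assumption that is only reasonable for fairly large $n$.

```python
import math

def cumulative_band(values, coeff=1.36):
    """Empirical cumulative fraction with a distribution-free band.

    coeff = 1.36 is the large-sample 95% coefficient; for small n the
    tabulated distances should be used instead of this approximation.
    Returns d and a list of (value, lower, cumulative fraction, upper).
    """
    xs = sorted(values)
    n = len(xs)
    d = coeff / math.sqrt(n)
    band = []
    for i, x in enumerate(xs, start=1):
        f = i / n                      # cumulative fraction up to x
        band.append((x, max(0.0, f - d), f, min(1.0, f + d)))
    return d, band

# Hypothetical set of 100 analytical values.
d, band = cumulative_band(list(range(1, 101)))
```

Multiplying the fractions by 100 gives the cumulative percent scale described in the text. Because $d$ shrinks only as $1/\sqrt{n}$, narrowing the band requires a rapidly growing number of samples.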

Conclusions

A general theory for sampling a heterogeneous system such as the environment for trace levels of substances such as pesticides is not likely to become available for some time. Although a variety of models have been proposed to describe specific distributions, each requires prior knowledge of the system under study. The best approach appears to be to carry out a set of preliminary sampling and analysis operations based on knowledge of similar systems from past experience. The extent of the preliminary work depends on the time and resources available; the more care and effort expended, the better is the quality of the data ultimately collected. On the basis of this initial information a model and sampling plan can be developed. It must be borne in mind that the plan may need to be altered as a result of data being collected in the course of the work. Such alteration is valid if statistical principles are carefully adhered to throughout.

Acknowledgments


The assistance of Ram Thapa with the calculations and of Annabelle Wiseman with preparation of the manuscript is gratefully acknowledged. This work was supported by the Natural Sciences and Engineering Research Council of Canada and the University of Alberta.

Appendix: Example of Application of Sampling Theory to Pesticide Analysis

Taylor, Freeman, and Edwards (27) performed a study of the pathways and rate of loss of the pesticide dieldrin from a grass-meadow soil. The large number and variety of samples collected and analyzed allow evaluation of the uncertainty associated with the sampling operation. Briefly, in one part of their investigation a set of three soil cores of differing diameters was collected in a diagonal pattern within each square meter of a 6 m x 6 m square portion of a field (Figure 1). Core depth was 17.7 cm; core diameters were 21, 24, and 44 mm. The cores were each extracted with 1:1 hexane:2-propanol. The extract was washed with water and the residual hexane injected into a gas chromatograph. The precision of the extraction and measurement operations can be estimated to be of the order of a few per cent. Relative to the variability observed in the overall results, these uncertainties can be considered negligible.

The results, calculated on an area basis to facilitate comparison, are reproduced in Table I. The data show a wide range; the values in Columns B and C are relatively high, while those in Column E are relatively low. The authors suggested that these variations may reflect irregularities in the spray application of the pesticide. A second, more local, variability was attributed to incomplete mixing of the soil after the spray application. The result is a large value for the overall standard deviation, 166 mg per square meter.

Given these data, what statements can we make about the number and size of samples that would have to be taken to hold the sampling standard deviation to some prescribed level? Calculation of Ingamells' subsampling constant is not appropriate since segregation is present. The minimum number of sample increments required can be calculated from either Equation 2 or 5. From Equation 2, assuming that a 50% level of confidence is desired and that an acceptable percent relative standard deviation $R$ is 50,




[Figure 1. Sampling grid. Labels recoverable from the original figure: contour, spray track.]

[Figure 2. Decision flowchart, part 1. Recoverable nodes: "Classical sampling problem?"; "Data supposed to be normal?"; "Use AMT estimator for location parameter. Use median of deviations from sample median for scale estimator."; "Use robust regression by Andrews' rho" (reference number illegible).]

MUHLBAUER

Processing Outliers in Statistical Data

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch004

[Figure 3. Decision flowchart, part 2. Recoverable nodes: "Fitting a regression line?"; "Detecting inconsistent subsamples only?"; "Any form of the General Linear Model to be used?"; "Use the classical methods of multiple comparisons, e.g. analysis of variance."; "Are the classical assumptions for fitting regression lines met?"; "Prepare the problem so that it may be solved by classic regression methods. If necessary contact a fellow mathematician."; "Use references (numbers illegible) or contact a fellow mathematician."; "In the context of the General Linear Model use the MAXIMUM ABSOLUTE STUDENTIZED RESIDUAL to detect inconsistency. Keep in mind that inconsistency is RELATIVE to the assumed form of the model."; "Use the MAXIMUM ABSOLUTE STUDENTIZED RESIDUAL to detect inconsistent values. Keep in mind that inconsistency is RELATIVE to the assumed form of the regression line."]

[Figure 4. Decision flowchart, part 3. Recoverable nodes: "Data supposed to be exponential?"; "Is there a transformation of the data into normal or exponential form which transforms outliers into outliers?"; "Use Shapiro-Wilk Exponential W-Test consecutively to clear the data. Keep rejected data for supplementary studies." (reference number illegible); "Use robust statistical methods."; "Test the transformed data."]


analysis. The data are not part of any special set or special environment. Sometimes the data are collected by automated devices and sometimes by independent service organizations.
- results of the analysis are given only in a summarized manner, such as a mean value, a standard deviation, the slope of a regression line, etc.

Univariate Data - Multivariate Data. If one deals only with one variable under study, e.g., the concentration of a particular chemical in the water of a river, this is a univariate problem. It is univariate even when the variable under study depends on several other variables such as temperature and location of sampling. On the contrary, if more than one variable is under study simultaneously, this would be called a multivariate problem. An example of a multivariate problem is in determining water quality using several analyzed variables.

Classical Sampling Problem. If one is only interested in estimating the location and scatter parameters of a population, this is a classical sampling problem.

Classical Assumptions for Fitting Regression Lines. The dependent variable y might be expressed in the following way:

$$y = f(x_1, \ldots, x_n; b_1, \ldots, b_n) + e$$

In this formula, $f$ is a function of the independent variables $x_1$ to $x_n$ and the unknown parameters $b_1$ to $b_n$ which is linear in the parameters. The function

$$f(x_1, x_2; b_1, b_2) = b_1 x_1 + b_2 x_2$$

is the classical example, but the function

$$f(x_1, x_2; b_1, b_2) = b_1 \sin(x_1) + b_2 \cos(x_2)$$

is also possible. The error, $e$, is supposed to be normally distributed with mean 0 and standard deviation sigma. As a consequence this means that the various measurements for y are (stochastically) independent and the associated e's come from an identical population (they have homoscedasticity or equal variance over the full range).

Detecting Inconsistent Subsamples Only. A positive response to this choice results from an analytical problem involving an interlaboratory comparison. The main interest is to find those laboratories which produce inconsistent results. The results of each laboratory form a subsample. We can find inconsistent subsamples and, therefore, inconsistent laboratories.


Is There a Transformation of the Data into Normal or Exponential Form? Many data sets are distributed according to probability laws that are not the common normal distribution law. Transformations are possible to convert such data sets to a normal or a nearly normal distribution. It is evident that transforming the data is only appropriate when the original problem, for example, deciding whether two populations are different or not, is not affected by the transformation. Several cases are possible. The following transformation,

$$y = (t + 3/8)^{0.5}$$

where $t$ = number of occurrences, will transform Poisson data to normal. This next transformation,

$$y = \arcsin\left\{[(t + 3/8)/(n + 3/4)]^{0.5}\right\}$$

where $n$ = number of runs, will transform binomial data to nearly normal. Finally,

$$y = \operatorname{arcsinh}\left\{[(t + 3/8)/(n - 3/4)]^{0.5}\right\}$$

will transform negative binomial data to nearly normal.

Calculation and Processing Procedures; the Processing Flow Chart
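As a brief aside before the processing procedures, the three transformations given for Poisson, binomial, and negative binomial data can be written as small Python functions. This is a sketch under assumptions: the function names are chosen here, the square root is taken inside the arcsin and arcsinh, and the negative binomial argument is read as the ratio $(t + 3/8)/(n - 3/4)$.

```python
import math

def poisson_transform(t):
    """y = (t + 3/8)^0.5 for a Poisson count t."""
    return math.sqrt(t + 3 / 8)

def binomial_transform(t, n):
    """y = arcsin{[(t + 3/8) / (n + 3/4)]^0.5} for t occurrences in n runs."""
    return math.asin(math.sqrt((t + 3 / 8) / (n + 3 / 4)))

def negative_binomial_transform(t, n):
    """y = arcsinh{[(t + 3/8) / (n - 3/4)]^0.5}; requires n > 3/4."""
    return math.asinh(math.sqrt((t + 3 / 8) / (n - 3 / 4)))

# Hypothetical counts transformed before any outlier test is applied.
counts = [0, 1, 1, 2, 3, 7]
transformed = [poisson_transform(t) for t in counts]
```

All three functions are monotonic in $t$, so they preserve the ordering of the observations: the largest raw value remains the largest transformed value.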

There are various methods to process the data which are mentioned in the flowchart. All of them are covered only by citations. You will find the basic references in Table I.

Table I. Mathematical Methods Included in the Flow Chart

Method                                          Reference Number
AMT-estimator                                   1
Shapiro-Wilk W-test for normal data             2A
Shapiro-Wilk W-test for exponential data        2B
Maximum studentized residual                    2C
Median of deviations from sample median         3
Andrews' rho for robust regression              4
Classical methods of multiple comparisons       5
Multivariate methods                            6-9


Example. For an example of the use of t h i s d e c i s i o n procedure, I w i l l use DATASET D (see Appendix I ) . The data s e t i s t o be used to prepare a c a l i b r a t i o n graph i n chromatographic a n a l y s i s . I t contains a number of e x c e s s i v e l y h i g h values i n the lower l e v e l s due t o the presence of an o v e r l a p p i n g contaminant. We s t a r t a t the top of the D e c i s i o n Flow Chart, part 1, shown i n Figure 2.

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch004

Decision diamond: Automatic processing of standard data? Since the answer is "NO", the left branch is followed. Instructions are met to have a thorough look at the data. There are several numbers which seem to be inconsistent. However, with no additional data available to this author, I will proceed.

Decision diamond: Univariate data? "YES"

Decision diamond: Classical sampling problem? As the answer is "NO", I restart at the top of Figure 3.

Decision diamond: Fitting a regression line? "YES"

Decision diamond: Are the classical assumptions for fitting regression lines met? "NO" Clearly the measurements at the different x-levels differ in their variability. This can be shown by using the F-test. Another method is outlined in another chapter of this text (10). In this case weighted least squares will resolve the problem of heteroscedasticity, or unequal variance across the graph. I have chosen weights of 1, 1, 0.1, 0.01 and 0.01 for the resolution of this problem.
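The weighted least squares step can be illustrated with a minimal sketch. It assumes one weight per concentration level and uses the five weights quoted in the text; the x and y values are hypothetical placeholders (they are not DATASET D), and the function name `weighted_line_fit` is introduced here only for illustration.

```python
import numpy as np

def weighted_line_fit(x, y, w):
    """Fit y = a + b*x by weighted least squares (w = statistical weights)."""
    x, y, w = (np.asarray(v, dtype=float) for v in (x, y, w))
    sw = w.sum()
    xbar = (w * x).sum() / sw                    # weighted mean of x
    ybar = (w * y).sum() / sw                    # weighted mean of y
    sxx = (w * (x - xbar) ** 2).sum()            # weighted sum of squares of x
    sxy = (w * (x - xbar) * (y - ybar)).sum()    # weighted cross product
    b = sxy / sxx                                # slope
    a = ybar - b * xbar                          # intercept
    return a, b

# Hypothetical calibration levels and responses (NOT DATASET D).
x_demo = np.array([0.05, 0.1, 0.5, 1.0, 2.0])
y_demo = np.array([1.5, 2.9, 13.0, 27.1, 52.0])
# Weights quoted in the text, assumed here to apply one per level.
w_demo = np.array([1, 1, 0.1, 0.01, 0.01])

a_demo, b_demo = weighted_line_fit(x_demo, y_demo, w_demo)
print(f"a = {a_demo:.3f}, b = {b_demo:.3f}")
```

Down-weighting the upper levels lets the low-level points, where the variance is smallest, dominate the fit near the origin.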

Table II. Fitting DATASET D Data to the First Order Regression Model, y = a + bx

Quantity                        Calculated Values      Critical Values
                                t        Max ASR       t        Max ASR
Equation Coefficients (1)
  a = 0.13                      0.28                   2.10
  b = 26.42                     8.66                   2.10
Max ASR (2)                              2.29                   2.78

(1) Correlation coefficient for the regression fitting is 0.90.
(2) Max ASR occurs at x = 0.5, y = 44.1.

Decision command: Use the maximum absolute studentized residual method to detect inconsistent values.

When this method is used, Table II shows the results when the regression model is the normal first order linear model. Since the maximum absolute studentized residual (Max ASR) found, 2.29, was less than the critical value relative to this model, 2.78, the conclusion is that there are no inconsistent values. It is evident that the calculated t-value for the constant value, a, is less than the critical t-value. From the statistical viewpoint this value, then, is negligible. The data can then be recalculated according to the first order model without a constant value. Table III shows the result of this recalculation. There are no changes relating to the conclusions made concerning the outlier determination.

Three critical points can be made in this analysis. The first one is located at the "thorough look" instruction. This examination in reality involves a critical analysis of the experimental protocol and the data produced from it. For example, it was quite evident in collecting the standards data from DATASET D that values were well out of line with previous determinations. See other DATASETS, especially DATASET E in the Appendix, for confirmation of this idea. The second critical point is at the "Preparation of the problem" instruction. In this case heteroscedasticity must be removed before submitting the data to regression analysis. Weighted least squares of several types (11) and power transformations (10) can be used. The third critical point is at the same instruction and is the decision of the regression model used for the calibration graph. First order, higher order, and spline (12) methods can all be used for this model. All these choices will significantly influence the decision concerning the reality of inconsistent values.

Table III. Fitting DATASET D Data to the First Order Regression Model, y = bx

Quantity                        Calculated Values      Critical Values
                                t        Max ASR       t        Max ASR
Equation Coefficients (1)
  b = 27.14                     17.2                   2.09
Max ASR (2)                              2.36                   2.78

(1) Correlation coefficient for the regression fitting is 0.97.
(2) Max ASR occurs at x = 0.5, y = 44.1.
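As a rough illustration of the Max ASR statistic used in this example, the sketch below computes internally studentized residuals for an unweighted first order fit and returns the largest absolute value. This is an assumed, simplified form; the procedure in the chapter follows reference 2C and was applied after weighting. The critical value (2.78 in Tables II and III) is taken from published tables and is not computed here. The function name and data are hypothetical.

```python
import numpy as np

def max_abs_studentized_residual(x, y):
    """Return (max |studentized residual|, index) for the fit y = a + b*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    M = np.column_stack([np.ones_like(x), x])        # design matrix [1, x]
    beta, *_ = np.linalg.lstsq(M, y, rcond=None)     # ordinary least squares
    resid = y - M @ beta
    hat = M @ np.linalg.inv(M.T @ M) @ M.T           # hat (projection) matrix
    leverage = np.diag(hat)
    dof = len(x) - M.shape[1]                        # n minus number of parameters
    s2 = (resid @ resid) / dof                       # residual variance estimate
    r = resid / np.sqrt(s2 * (1.0 - leverage))       # internally studentized residuals
    i = int(np.argmax(np.abs(r)))
    return float(abs(r[i])), i

# Hypothetical data: an exact line with one value pushed high.
x_demo = [0, 1, 2, 3, 4, 5, 6, 7]
y_demo = [2 * v + 1 for v in x_demo]
y_demo[3] += 5
max_asr, where = max_abs_studentized_residual(x_demo, y_demo)
print(max_asr, where)
```

The returned value would then be compared with the tabulated critical value for the chosen model and sample size.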


Literature Cited

1. Andrews, D. F.; Bickel, P. J.; Hampel, F. R.; Huber, P. J.; Rogers, W. J.; Tukey, J. W. "Robust Estimates of Location: Survey and Advances"; Princeton Univ. Press: Princeton, NJ, 1972; pp. 5, 15, 39-44, others.
2. Barnett, V.; Lewis, T. "Outliers in Statistical Data"; Wiley: New York, 1978; pp. (A) 89-103, (B) 76-88, (C) 234-265.
3. Huber, P. J. "Robust Statistics"; Wiley: New York, 1981; pp. 107-109.
4. Lawson, J. S. J. Quality Technology 1982, 14, 19-33.
5. Miller, R. G. "Simultaneous Statistical Inference"; McGraw-Hill: New York, 1966; p. 98.
6. Beckman, R. J.; Cook, R. D. Technometrics 1983, 25, 119-163.
7. Campbell, N. A. Applied Statistics 1980, 29, 231-237.
8. Maronna, R. A. The Annals of Statistics 1976, 4, 51-67.
9. Schwager, S. J.; Margolin, B. H. The Annals of Statistics 1982, 10, 943-954.
10. Kurtz, D. A.; Rosenberger, J. R.; Tamayo, G., Chapter 9 in this book.
11. Mitchell, D. G., Chapter 8 in this book.
12. Wegscheider, W., Chapter 10 in this book.

RECEIVED March 25, 1985


5
The Many Dimensions of Detection in Chemical Analysis
with Special Emphasis on the One-Dimensional Calibration Curve

LLOYD A. CURRIE
Center for Analytical Chemistry, National Bureau of Standards, Washington, DC 20234

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch005

Simple detection decisions generally involve the comparison of scalar quantities (gross signal, blank). Conventional chromatography and spectrometry, on the other hand, involve one-dimensional variables (time, mass, wavelength, energy) where signal and baseline traces may be examined to decide whether a peak is present at a given location. Linked techniques, such as GC-MS or two-parameter nuclear spectroscopy, raise the question of detection in two dimensions. Finally, problems wherein a set of samples is characterized by many independent chemical and physical observations raise the issue of multidimensional detection. A unified approach for all such problems is given by the statistical theory of hypothesis testing. Following a brief review of underlying assumptions and techniques for applying the theory to detection decisions and detection limits, primary attention is given to a one-dimensional (reduced from two) problem involving the calibration curve and the pesticide, Fenvalerate. Other topics addressed include information-loss through faulty reporting (at trace levels) and its impact on regulatory issues, and chemometric quality assurance through standard interlaboratory test data sets.

This chapter not subject to U.S. copyright. Published 1985, American Chemical Society

One of the fundamental performance characteristics of any analytical procedure is the Limit of Detection. Just as with the imprecision (standard deviation), with which it is intimately connected, the Detection Limit ($L_D$) is undefined unless there exists a fully-specified Chemical Measurement Process (CMP) in a state of complete control. When these requirements are met, it is convenient to define $L_D$ in accordance with the statistical theory of Hypothesis Testing (1,2). Although this theory is well established, there continues to be a great diversity of terminology and formulations which generate needless confusion in our discipline. Still more serious is interdisciplinary confusion, when analysts are called upon to provide valid methods and critical data for regulatory, clinical, or environmental decision-making (3).

The objectives of this review will be to summarize the basic concepts of detection in Analytical Chemistry, with the development following a stepwise increase in dimensionality. Prime emphasis is given to the assumptions which must be met, and to illustrations having differing dimensions. In keeping with the Symposium title and in response to the invitation of the Symposium organizer, a detailed exposition is presented for the trace detection of a pesticide (Fenvalerate) by gas chromatography. This exercise highlights the relationship of the calibration process to the detection characteristic, and it exposed a surprising (and unnecessary) limitation to the detection capability. Treatment of a real, imperfect calibration data set revealed the full complexity and breadth of the calibration curve - detection limit problem, ranging from varying statistical weights to an uncertain model and data containing possible blunders to an artificially imposed response threshold. Attempts to simplify an actually complicated situation were rejected in favor of a full exposition including an Appendix containing worked-out numerical examples.

SIMPLE HYPOTHESIS TESTING - SCALAR SIGNALS

The basic detection concepts can be presented for the "zero-dimensional" case where detection decisions and detection limits are established simply from the characteristics of the chemical signal (instrument response), without giving detailed attention to other dimensions such as time, wavelength, analyte concentration, etc. Actually, higher dimensional situations (multiparameter separations or detector responses) reduce to this case either through sequential classification schemes or via algorithms which operate directly on the multidimensional data.

Our basic task is to distinguish the blank or background ($H_0$, null hypothesis), from a signal at the detection limit ($H_1$, alternative hypothesis). A straightforward probabilistic formulation can be given provided that the observed signals (arising from an underlying "true" signal) are random, independent and stationary. To completely specify the false positive ($\alpha$) and false negative ($\beta$) risks, we must know the form of the distribution and its parameters. For most analytical situations we assume the distribution to be normal (Gaussian), and the dispersion parameter is simply the imprecision (standard deviation, $\sigma$). As shown in Reference (2), it is sufficient to have an estimate of the blank (B) and its standard deviation ($\sigma_B$) plus the variation of $\sigma$ with the signal magnitude (y) to specify a decision criterion or level ($L_C$) given $\alpha$, and a detection limit ($L_D$) given $L_C$ and $\beta$. (See especially Figure 2 in Reference (2).) If $\sigma_y$ is independent of signal magnitude (at and below the detection limit), and if y is normally distributed, one concludes that

$L_C = z_{1-\alpha}\,\sigma_0$   (1a)

$L_D = L_C + z_{1-\beta}\,\sigma_0$   (1b)

where z 0, then the dispersion about the fitted values ($\hat{y}_i$) yields an estimate for $\sigma$. In every case, independent experiments and replicates are vital for external validation of the presumed $\sigma$'s, e.g., with the aid of the Analysis of Variance.
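For the constant-$\sigma$ case of Equations 1a and 1b, where $L_C = z_{1-\alpha}\sigma_0$ and $L_D = L_C + z_{1-\beta}\sigma_0$, the arithmetic can be sketched as follows. The function name is introduced here for illustration, and the value 0.028 is used only as an example $\sigma_0$ (it is the zero-point term of the fitted $\sigma(y)$ relation quoted later for the Fenvalerate data, not a result derived in this section).

```python
from statistics import NormalDist

def decision_and_detection_limits(sigma0, alpha=0.05, beta=0.05):
    """Return (L_C, L_D) for normally distributed signals with constant sigma."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # z_{1-alpha}, about 1.645 for 0.05
    z_beta = NormalDist().inv_cdf(1.0 - beta)     # z_{1-beta}
    l_c = z_alpha * sigma0                        # decision level, Eq. 1a
    l_d = l_c + z_beta * sigma0                   # detection limit, Eq. 1b
    return l_c, l_d

# Illustrative sigma0 only (assumption, see lead-in).
L_C, L_D = decision_and_detection_limits(0.028)
print(L_C, L_D)
```

With $\alpha = \beta$ the detection limit is simply twice the decision level, which is the reference point for the later discussion of the ratio $x_D/x_C$.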

Deviations from the Ideal Model. In all real situations, the error terms have a structure qualitatively represented by Equation 4, e ... $\rho_{BA}$ is necessarily zero; so $K = 1$. Second, if $\varphi_A \ll 1$ (precise A or A-known), both $K$ and $I \to 1$; so $x_D = 2x_C$. Third, since $\rho_{BA}$ is negative and $|\rho_{BA}|\,\sigma_B \le \sigma_0$, the ratio $K/I$ may be written as $[1 - \varepsilon(z\varphi_A)]/[1 - (z\varphi_A)(z\varphi_A)]$ where $0 \le \varepsilon < 1$. Thus, there can be no analyte detection limit ($x_D \to \infty$) if

$\varphi_A \ge 1/z = 1/1.645 = 0.608$

Also, $x_D = 2x_C$ if $\varepsilon = |\rho_{BA}|\,\sigma_B/\sigma_0 = z\varphi_A$ and, in the limit, $\varphi_A \to 0$. If $\varepsilon < z\varphi_A$, $x_D > 2x_C$ and the converse. The minimum in the ratio $x_D/x_C$ occurs when the design $\{x_i\}$ is such that $\bar{x} = x_D$ (see Table II and Equation 10).

More generally, the calibration curve can be represented as a matrix equation

$y = M\theta + e$   (8)

whose weighted least-squares solution is

$\hat{\theta} = (M^T W M)^{-1} M^T W y$   (9a)

$V_{\hat{\theta}} = (M^T W M)^{-1} \approx (M^T M)^{-1} V_y$   (9b)

The approximation for the variance of $\hat{\theta}$ in (9b) holds when $V_y$ is approximately constant. A summary of the application of Equation 9 to the linear calibration curve derived from known analyte concentrations $\{x\} = (x_1, x_2 \ldots x_n)$ and corresponding statistical weights (inverse variances) is given in Table II.

Table II. Decision and Detection (weighted least squares)

$y = M\theta = B + Ax$, with $\theta^T = (B \;\; A)$

$M^T W M = \begin{pmatrix} \Sigma w & \Sigma w x \\ \Sigma w x & \Sigma w x^2 \end{pmatrix}$, where $w_i = 1/V_y(x_i)$ [$k/V_y$, if k-replicates]

Then,

$V_{\hat{x}} \approx \dfrac{1}{A^2}\left[ V_y(x) + \dfrac{1}{\Sigma w} + \dfrac{(x - \bar{x}_w)^2}{\Sigma w (x - \bar{x}_w)^2} \right]$

Decision: $x = 0$.  Detection: $x = x_D$.

Expressions for the variances and covariance of B and A follow from the inverse matrix $(M^T W M)^{-1}$. See the discussion of "case f" from Table V in the Appendix, for explicit formulas.

(Note that $\bar{x}_w$ represents the weighted mean of the {x}.) Given the defining expressions for decision and detection limits together with the calibration design {x}, the equation for $V_{\hat{x}}$ in Table II immediately yields the desired quantities for the linear calibration curve. For equal weights ($V_y$ = const.) and taking roots, the expression simplifies to $[\;\cdots\;]^{1/2}$

Non-linear curves may be treated using Equation 9 directly, using the techniques of non-linear least squares, when appropriate. (Note that a non-linear calibration curve does not necessarily imply non-linear least squares. The latter is necessary only if the problem is non-linear in the estimated parameters (16). For example, $y = a + bx + cx^2$ and $y = a + bx^c$ are both non-linear functions, but only the latter is non-linear in the parameters.)
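The distinction between a non-linear curve and a problem that is non-linear in the parameters can be made concrete with a small sketch. It fits $y = a + bx + cx^2$ by ordinary linear least squares, since the model is linear in (a, b, c); the data are synthetic and noise-free, and the function name is hypothetical. A model such as $y = a + bx^c$ could not be handled this way without fixing c in advance.

```python
import numpy as np

def fit_quadratic(x, y):
    """Fit y = a + b*x + c*x**2 with linear least squares; returns (a, b, c)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Each column is a known function of x; the unknowns enter linearly.
    M = np.column_stack([np.ones_like(x), x, x ** 2])
    coef, *_ = np.linalg.lstsq(M, y, rcond=None)
    return tuple(float(c) for c in coef)

# Synthetic, noise-free data generated from a = 1, b = 2, c = 0.5.
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y_demo = 1.0 + 2.0 * x_demo + 0.5 * x_demo ** 2
print(fit_quadratic(x_demo, y_demo))
```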

Fenvalerate Data. Calibration data for the GC measurement of Fenvalerate were furnished by D. Kurtz (17). Average responses for five replicates at each of five standard concentrations are given in Table III. It should be noted that the stated responses are not raw observations, but rather on-line computer generated peak area estimates (cm²). (Had we started with the raw data [chromatograms], the problem would actually have been two-dimensional, including as variables retention time and concentration.) The stated uncertainties in the peak areas are based on a linear fit ($\sigma = a + bx$) of the replication standard deviations to concentration; and the "local slopes" [first differences] in the last column of Table III are presented simply to indicate the extent of non-linearity in the calibration curve. (This is not so easy to grasp from a plot, because of the very wide dynamic range.)

Table III. Fenvalerate (GC) Data - Set B (averages of 5 replicates)

Amount (x, ng)    Response (y, cm²) (a)         Δy/Δx
 0.05               1.18 ± 0.024  [0.023]       23.6
 0.25               7.08 ± 0.067  [0.365]       29.5
 1.00              29.68 ± 0.23   [0.18]        30.1
 5.00             209.0  ± 1.11   [1.87]        44.8
20.00             920.6  ± 4.4    [4.32]        47.4

(a) Uncertainties represent standard errors, based on the fitted equation σ(y) = (0.028 + 0.49x)/√5. Quantities in brackets are the observed standard errors.

In order to calculate $V_{\hat{x}}$, and therefore the detection limit, it is necessary first to estimate $V_y$ as a function of concentration and then to use this information to estimate the parameters of the calibration curve using weighted least squares (WLS) fitting. Rigorous application of WLS requires knowledge of relative weights, but the technique is already considered adequate when n ≥ 5 (18). In Table IV we present the results of fitting alternative models to the pattern of weights and the calibration curve.

Table IV. Alternative Calibration Models

Model                      B                  A                 √(χ²/df)
(1) y = B + Ax (a)         0.042              22.8
(2) y = B + Ax             -1.04 ± 0.02       38.81 ± 0.12      32.7
(3) y = B + Aq (b)         0.042 ± 0.024      32.46 ± 0.10      9.64

(a) This model is taken to be exact; it uses B from model-3 together with the initial point, (x,y) = (0.05, 1.18), to derive A.
(b) We take q to be exactly $x^{1.12}$ to account for the nonlinearity in the curve; the two parameters (B, A) are then estimated by linear least squares, using weights as indicated in Table III.

Before using the results in Tables III and IV to calculate detection limits, a number of observations should be made:

(a) The observed SE's (Table III) are generally monotonic (certainly not constant) with increasing concentration and consistent with the linear model, with the exception of the value at x = 0.25 ng.

(b) The initial observation (at x = 0.05) has a response already more than forty times the zero-point standard deviation (1.18/0.028); thus, it is clearly way in excess of the detection limit. A very large extrapolation is therefore necessary to estimate both the background (B) and standard deviation ($\sigma$) in the region of the detection limit. This is the basis of introducing model-1, for illustrative purposes (Table IV).

(c) Model-2 (Table IV) obviously is inadequate; the significantly negative intercept and poor fit rule it out (over the entire data range). Not shown are simple polynomial fits, which are also inadequate.

(d) Model-3 is better. The intercept is consistent with zero (to be expected from the technique of calculating net GC peak area). The fit, however, implies an additional (non-replication) error source. Again, for illustrative purposes, the function q(x) has been taken exact in order to avoid the distributional perturbations of non-linear least squares (not justified in view of the foregoing limitations of the data).

Before turning to the question of detection, it is illuminating to examine a plot of the data, and the residuals from the fit of model-3. These are shown in Figure 1. The principal observations which derive from the residual plot are that the assumed shape of the curve and variation of statistical weight with concentration are generally acceptable. The magnitude of the residuals and dispersion for certain replicates and concentrations are not. That is, there is additional scatter about the fitted curve, unaccounted for by the replication error; and certain replicates, especially (• and •) in the 0.25 ng and 5 ng samples, are more widely separated than the others. Queries which followed these observations led to suggestions that some untoward dilution errors may have been involved in preparing two of the standards, and random errors in "x" (concentrations of standards) may not be negligible. Thus, a detailed evaluation of the calibration process would require scrutiny (or restandardization) of standard solutions for possible blunders (outliers), and the difficult task of fitting the calibration data taking into account errors in both variables (19).
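The fitted standard errors and "local slopes" of Table III can be re-derived with a few lines of arithmetic. The sketch below uses the amounts and mean responses from Table III and the fitted relation $\sigma(y) = (0.028 + 0.49x)/\sqrt{5}$; taking the first difference from the origin (0, 0) is an inference made here because it reproduces the tabulated first slope, and the function names are hypothetical.

```python
import math

# Amounts (ng) and mean responses (cm^2) as listed in Table III.
amount_ng = [0.05, 0.25, 1.00, 5.00, 20.00]
response_cm2 = [1.18, 7.08, 29.68, 209.0, 920.6]

def fitted_standard_error(x, k=5):
    """Standard error of a k-replicate mean from the fitted sigma(y) relation."""
    return (0.028 + 0.49 * x) / math.sqrt(k)

def local_slopes(xs, ys):
    """First differences dy/dx between successive points, starting at the origin."""
    slopes = []
    prev_x, prev_y = 0.0, 0.0      # assumed starting point (see lead-in)
    for xv, yv in zip(xs, ys):
        slopes.append((yv - prev_y) / (xv - prev_x))
        prev_x, prev_y = xv, yv
    return slopes

for xv, yv, slope in zip(amount_ng, response_cm2, local_slopes(amount_ng, response_cm2)):
    print(f"{xv:6.2f} ng  {yv:7.2f} ± {fitted_standard_error(xv):.3f}  slope {slope:.1f}")
```

The roughly twofold rise in the local slope between the lowest and highest standards is the non-linearity that motivates model-3.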


Fenvalerate Detection Limits. To the extent that detection limits require knowledge of the calibration curve and random error (for x) as a function of concentration, all of the foregoing discussion is relevant, both for detection and estimation. However, curve shape and errors where $x \gg x_D$ are relatively unimportant at the detection limit, in contrast to direct observations of the initial slope and the blank and its variability. (It will be seen that the initial observation in the current data set exceeded the ultimate detection limit by more than an order of magnitude!) To give some perspective to the above remarks a set of alternative decision and detection limits are given in Table V, derived from appropriate information in the preceding three tables. First, we observe that there are two broad classes of

[Figure 1. Fenvalerate, Set B: plot of the data and of the residuals from the fit of model-3, against Amount (ng).]

$\cdots = \varphi_A^2 + 2\rho_{BA}\,\varphi_A\varphi_B\,(B/S_D) + \varphi_B^2\,(B/S_D)^2$   (11)

where the $\varphi$'s are relative standard deviations, and $S_D$ is the net signal ($y_D - B$) at the detection limit. (See the Appendix, Case-f, for the application of Equation 11.)

Within each of the two classes in Table V, the first two sets of limits ((a), (b), (d), (e)) use the constant and variable weights, respectively, and assume B and A are exactly known (model-1 in Table IV). The remaining limits involve estimated parameters, based on the design {x} and the equations of Table III. Method (c) utilizes the parameters of Model-1 and constant weight; method (f) uses Model-3 and variable y-errors (weight). Principal conclusions to be drawn from this exercise, displayed graphically in Figure 2, are that:

• The "black-box" threshold imposes a large and unnecessary increase in detection limit.
• In the region of the detection limit, for this data set, the alternative weighting scheme or model selected has little effect.
• The additional, non-replication, scatter about the fitted calibration curve, perhaps due to random error in the x-variable, does show a substantial effect. (See last paragraph, Appendix.)
• Optimal assessment of the minimum detection limit would require a design {x} well below the current standard concentrations and including the blank.

The scope of this article does not permit the consideration of physical vs. empirical models for the calibration curve, nor the effect of new designs on the detection limit, but these are extremely important issues in calibration. For example, it can be shown that with an inadequate design the detection limit (for $\alpha = \beta = 0.05$) may not even exist! ($x_D \to \infty$.)

Figure 2. Calibration curve for Fenvalerate in the Regions of the Detection Limits. Numerical values of decision (C) and detection limits (D) are shown for the "No Threshold", case (b) and "Threshold", case (e) from Table V. For the former, $\alpha$- and $\beta$-errors are indicated qualitatively. The first (lowest concentration) data point is shown at (x,y) = (50, 1.18). (Though topologically correct, the scales have been deliberately distorted to encompass both cases, and especially near the origin to dramatize the effect of some designs on the uncertainty of the intercept and the ratio $x_D/x_C$.)

HIGHER DIMENSIONS: EXPLORATION AND VALIDATION

Space remains for only a brief glance at detection in higher dimensions. The basic concept of hypothesis testing and the central significance of measurement errors and certain model assumptions, however, can be carried over directly from the lower dimensional discussions. In the following text we first examine the nature of dimensionality (and its reduction to a scalar for detection decisions), and then address the critical issue of detection limit validation in complex measurement situations.

Physicochemical Analysis vs. Chemometric Resolution. Once we pass beyond the one-dimensional calibration of a pure substance, we enter the realm of mixtures, and the analyzing dimensions of chromatography, spectrometry, relaxation times, morphology, chemical "fingerprinting", etc. When one has sufficient resolving power, whether by means of a simple dimension of extreme resolution or a linked ("hyphenated") series of independent dimensions yielding the product of their individual resolving powers, then the problem reduces to the zero-dimensional case. That is, one simply measures the signal in the appropriate hypercube in multidimensional space which marks the location of the species of interest. An outstanding example of such multispectral sorting is the new technique of Accelerator Mass Spectrometry (20), which has led to a revolution in measurements for radiocarbon dating, isotope geophysics, nuclear geology, etc. Here, for example, $^{14}$C atoms and clusters are initially mass analyzed as high energy (~2 MeV) negative ions, after which all molecular fragments are destroyed and most electrons removed; then additional acceleration and mass analysis occurs with ions, and final discrimination takes place with 8 MeV ions on the basis of ionization density (dE/dx) and energy (E) or range. The resolving power is so enormous that one can isolate the signal of one $^{14}$C atom from the associated $10^{11}$-$10^{12}$ $^{12}$C atoms. A subtle dimension in this spectroscopy is time, in that the overwhelming background of $^{14}$N is eliminated by the decay of N$^-$ during the initial acceleration phase. Final quantitative estimation comes from integrating counts in the appropriate region of the dE/dx, E plane.

More commonly, we are faced with the need for mathematical resolution of components, using their different patterns (or spectra) in the various dimensions. That is, literally, mathematical analysis must supplement the chemical or physical analysis. In this case, we very often initially lack sufficient model information for a rigorous analysis, and a number of methods have evolved to "explore the data", such as principal components and "self-modeling" analysis (21), cross correlation (22), Fourier and discrete (Hadamard, ...) transforms (23), digital filtering (24), rank annihilation (25), factor analysis (26), and data matrix ratioing (27).

Under the best of circumstances we can express the multidimensional signal (y) as a linear function of the unknown concentrations ($x_k$), such as decaying nuclear or optical spectra:

$y_{ij} = \sum_k A_{ijk}\,x_k + e_{ij}$   (12)

where

$A_{ijk} = U_k(\lambda_i)\,e^{-t_j/\tau_k}$   (13)

(The first factor $U_k(\lambda_i)$ is the spectrum of species-k vs wavelength ($\lambda_i$); the second is the decay curve vs time ($t_j$) with mean life $\tau_k$.) If the $e_{ij}$ are normally distributed with known (relative) variances, and we know the spectra and lifetimes for all components, then weighted, linear least squares will provide estimates for $x_k$ and $\sigma_{x_k}$ (28). Since each $\hat{x}_k$ is a linear sum of the normally distributed observations, it too is normal, and it is (almost) straightforward to compute the decision level ($x_C$) and detection limit ($x_D$) for species-k. In principle, the quantities would require the evaluation of $\sigma_x$ as $x_k$ increases from zero to its detection limit. If the signal is relatively weak (x

In the present case of = 0.00724, so the inclusion of J would increase $\sigma_0$ and therefore $x_C$ by less than 1 part in $10^4$. The existence of such a factor is important in principle, however, for it signifies the contribution of e to the variance of the estimated net analyte concentration (or amount) even when that concentration is zero.

[ii] Strictly speaking, $x_C$ and $x_D$ as calculated above must be viewed as approximations (though extremely good ones) since $A$ is not exactly known (denominator of Eq. 10). A useful viewpoint is to consider $x_C$ as exactly $z_{1-\alpha}\,\sigma_0/A$, where $A$ is a single, selected outcome

$$\varphi_q^2 = \varphi_A^2 + 2\rho_{AB}\,\varphi_A\varphi_B + \varphi_B^2 \qquad (11')$$

where

$$\varphi_B = \sigma_B/(y_D - B)$$

Using the previous results, we find

$$\rho_{BA} = -0.292$$

$$S_D = y_D - B = y_C - B + z_{1-\beta}\,\sigma_{y_D} \approx 1 - 0.042 + 1.645\,(0.028 + 0.49\,[0.0464]) = 1.041$$

$$\varphi_B = \sigma_B/S_D \approx 0.0243/1.041 = 0.0233$$

$$\varphi_A = \sigma_A/A \approx 0.103/32.46 = 0.00317$$

Thus, $\varphi_q \approx 0.0226$; so $\varphi_{x_D} \approx 0.0202$. The ("1σ") confidence interval for the detection limit is thus

$x_D$: 46.4 ± 0.94 pg
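The arithmetic of this worked example can be re-traced with a few lines of Python. This is only a sketch: the numbers are those printed above, while the variable names and the assumed form of Eq. 11' (two relative standard deviations combined with their correlation coefficient) are illustrative assumptions, not notation confirmed by the text.

```python
import math

# Numbers quoted in the worked example; the variable names are illustrative.
rho_ba = -0.292
s_d = 1 - 0.042 + 1.645 * (0.028 + 0.49 * 0.0464)   # net signal at the detection limit
phi_b = 0.0243 / s_d         # relative standard deviation of the blank term
phi_a = 0.103 / 32.46        # relative standard deviation of the sensitivity A

# Assumed form of Eq. 11': correlated combination of the two relative terms.
phi_q = math.sqrt(phi_a**2 + 2 * rho_ba * phi_a * phi_b + phi_b**2)

# Half-width of the "1 sigma" interval from the quoted relative uncertainty.
half_width = 46.4 * 0.0202

print(round(s_d, 3), round(phi_b, 4), round(phi_a, 5),
      round(phi_q, 4), round(half_width, 2))
```

Under these assumptions the intermediate values agree with the printed ones to the digits quoted.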

A symmetric and normal confidence interval is a good approximation, since the uncertainty is dominated by $\sigma_B$ (numerator of Eq 3b). Finally, if the poor fit simply reflected proportionately extra y-variance, then:

$$w_i \rightarrow w_i/(9.64)^2 \quad \text{and} \quad \sigma_y \rightarrow \sigma_y \cdot (9.64)/\sqrt{5}$$

and $\varphi_A$, $\varphi_B$ would be increased by a factor of 9.64. The resulting estimate for $x_D$ would be

$x_D$: 59.1 ± 11.5 pg

This result, however, should not be taken too seriously, because the poor fit may not be simply related to extra y-variance.

Literature Cited

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch005

1. Kaiser, H. Two papers on the Limit of Detection of a Complete Analytical Procedure. London: Hilger, 1968.
2. Currie, L. A. Anal. Chem. 1968, 40, 586.
3. Rogers, L. B. Subcommittee dealing with the scientific aspects of Regulatory Measurements, American Chemical Society, 1982.
4. Ingle, J. D., Jr. J. Chem. Educ. 1974, 51, 100-5.
5. Currie, L. A. Pure & Appl. Chem. 1982, 54, 715-754.
6. Currie, L. A. Nucl. Instr. Meth. 1972, 100, 387.
7. Currie, L. A. in "Modern Trends in Activation Analysis"; DeVoe, J. R.; LaFleur, P. D., Eds.; Nat. Bur. Stand (U.S.) Spec. Publ. 312, 1968; p. 1215.
8. Currie, L. A. in "Treatise on Analytical Chemistry"; Elving, P.; Kolthoff, I. M., Eds.; J. Wiley & Son: New York, 1978; Vol 1, Chap. 4.
9. Ingle, J. D., Jr.; Wilson, R. L. Anal. Chem. 1976, 48, 1641.
10. Patterson, C. C.; Settle, D. M. 7th Materials Res. Symposium, Nat. Bur. Stand (U.S.) Spec. Publ. 422, 1976; p. 321.
11. Scales, B. Anal. Biochem. 1963, 5, 489.
12. Horwitz, W. (FDA); Meinke, W. W. (NRC), personal communications, 1982. See also Reference (3).
13. Horwitz, W.; Kamps, L. R.; Boyer, K. W. J. Assoc. Off. Anal. Chem. 1980, 63, 1344.
14. Hubaux, A.; Vos, G. Anal. Chem. 1970, 42, 849.
15. Ku, H. H. J. Res. Natl. Bur. Stand. 1966, 70C, 263.
16. Brownlee, K. A. in "Statistical Theory and Methodology in Science and Engineering"; J. Wiley & Son: New York, 1960.
17. Kurtz, D. A., personal communication, 1982, 1983.
18. Jacquez, J. A.; Mather, F. J. and Crawford, C. R. in "Linear Regression with Non-Constant, Unknown Error Variances"; Biometrics 1968, 24, 607.
19. Golub, G. H.; van Loan, C. F. J. Numer. Anal. 1980, 17, 883.
20. Purser, K.; Russo, C.; Gove, H.; Elmore, R.; Ferraro, R.; Beukens, K.; Chang, L.; Kilius, L.; Lee, H.; Litherland, A. Chapt. 3 in Symposium on Nuclear and Chemical Dating Techniques, Currie, L. A., Ed.; American Chemical Society: Symposium Series No. 176, Washington, D.C., 1982.
21. Lawton, W. H.; Sylvestre, E. A.; Maggio, M. S. Technometrics 1972, 14, 3.
22. Horlick, G. Anal. Chem. 1973, 45, 319.
23. Marshall, A. G., Ed.; "Fourier, Hadamard, and Hilbert Transforms in Chemistry", Plenum Press: New York, 1982.
24. Savitzky, A.; Golay, M.J.E. Anal. Chem. 1964, 36, 1627.
25. Ho, C.-N.; Christian, G. D.; Davidson, E. R. Anal. Chem. 1981, 53, 92.
26. Malinowski, E. R.; Howery, D. G. in "Factor Analysis in Chemistry"; J. Wiley & Son: New York, 1980.
27. Fogarty, M. P.; Warner, I. M. Anal. Chem. 1981, 53, 259.
28. Nicholson, W. L.; Schlosser, J. E.; Brauer, F. P. Nucl. Instr. Meth. 1963, 25, 45.
29. Parr, R. M.; Houtermans, H.; Schaerf, K. Computers in Activation Analysis and Gamma-ray Spectroscopy Ed., Conf. -780421 1979, p. 544.
30. Currie, L. A.; Gerlach, R. W.; Lewis, C. W. Atm. Environment 1984, 18, 1517.
31. Liggett, W. ASTM Conf. on Quality Assurance for Environmental Measurements, STP (in press) 1984, Boulder, CO.

RECEIVED March 25, 1985

6

Introduction to the Theory of Correlation Chromatography

RAYMOND ANNINO

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch006

The Foxboro Company, Foxboro, MA 02035

The technique of correlation chromatography is described through text and figures in a step-by-step manner. The description explains how a Pseudo Random Binary Sequence (PRBS) can control multiple and overlapped input injections into a chromatograph and again be used to sort out the detector data to give a correlogram. Correlograms mimic chromatograms but represent chromatographic data at a much higher sensitivity. The method is used to immensely increase the signal-to-noise ratio of a chromatograph. Problems in sampling and non-linearity are also discussed.

One of the difficulties in writing about Correlation Chromatography (CC) is in providing the reader with enough conceptual understanding of this elegant procedure to appreciate both its potential and the aggravating problems which still hamper its use. Many of us require a physical model to aid us in understanding the elements of a problem. A mathematical formula is of little help unless we can associate it with some physical picture. For this reason, we will attempt in this paper a largely pictorial but still rigorous presentation of Correlation Chromatography (CC).

Background

Before proceeding to the main subject of CC, it is necessary for the reader to gain some understanding of the basic process of "correlation." Correlation is a mathematical procedure for measuring the similarity of two different signals or the spectral characteristics of one signal. Consider the two time-varying signals shown in Figure 1. Is there any similarity between the two signals?
To answer this question we might cut out the signal 1b so that after we place it underneath the other we can shift it a little along the x axis (which is a time axis), to visually compare it with the other. If we were to do this, we would find that when we had shifted it S units, the two signals would appear to be identical. Our conclusion is that the only difference in the two signals is that one of them is shifted in time with respect to the other one.

0097-6156/85/0284-0083$06.00/0 © 1985 American Chemical Society

Figure 1. Signals to be compared. Signals a and b are the same. Signal b has been delayed S units in time.

How can we document this comparison procedure mathematically? Suppose we again examine Figure 1. Let us digitize the two signals into the units we have shown along the x-axis. Now at each of these units multiply the intensity value of one signal with the corresponding intensity value of the other. Do this for each digitized unit, sum all of these individual products and divide by the number of units, T, in the sum to obtain an average value for all these products. To express this operation mathematically we write:

$$R_{xy}(0) = \frac{1}{T} \sum_{t=0}^{T} x(t)\, y(t)$$

The number obtained in this way is called a correlation coefficient, $R_{xy}$, and its magnitude furnishes us with a measure of the correlation between the two signals. Small values of the correlation coefficient indicate little or no correlation while large values are obtained when the two signals match.

What we would like to do now is shift the signal in the same manner that we did when we cut it out, and repeat the above multiplication, summing, and averaging procedure to calculate another correlation coefficient. In other words, each intensity value of signal 1a at a certain time value, t, will be multiplied with the intensity value in 1b found at t - 1. We will proceed with our multiplication, summing, and averaging in this manner for all values of t within the interval we have selected for examination, and thus produce a new value for the correlation coefficient, $R_{xy}$, at this time shift of one unit. Thus we have:

$$R_{xy}(1) = \frac{1}{T} \sum_{t=0}^{T} x(t-1)\, y(t)$$
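The multiply, sum, and average recipe at shifts of zero and one unit can be sketched in a few lines of Python. The digitized signals below are hypothetical values chosen for illustration (they are not read from Figure 1), and the sketch assumes that products whose shifted index falls before the start of the record are simply left out of the sum.

```python
def correlation_coefficient(x, y, shift=0):
    """Average of x(t - shift) * y(t) over T digitized units (shift >= 0)."""
    T = len(y)
    # Products that would need x before the start of the record are omitted.
    total = sum(x[t - shift] * y[t] for t in range(shift, T))
    return total / T

# Hypothetical digitized signals: y is x delayed by one unit.
x = [0, 1, 3, 1, 0, 0, 2, 0]
y = [0, 0, 1, 3, 1, 0, 0, 2]

r0 = correlation_coefficient(x, y, shift=0)   # signals misaligned
r1 = correlation_coefficient(x, y, shift=1)   # signals lined up
print(r0, r1)   # 0.75 1.875
```

The larger value at a shift of one unit reflects the fact that the second signal is the first one delayed by one unit.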

We continue this shifting and calculation process until we have shifted the desired interval. Our process is one of moving past events in signal 1b into the present for comparison with 1a. If we call the time shift tau, we can write a general expression for the calculation of the correlation coefficient at any value of tau.

$$R_{xy}(\tau) = \frac{1}{T} \sum_{t=0}^{T} x(t-\tau)\, y(t)$$

The limit of this expression gives us the familiar definition for the cross-correlation function with the limits of integration redefined for a distribution about zero.

$$R_{xy}(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x(t-\tau)\, y(t)\, dt$$

Following the calculation of the correlation coefficient at a number of values of tau, we can plot the correlation coefficients against tau. Such a plot might look like the one shown in Figure 2. Notice that the value of the correlation coefficient is small except at the time shift where everything in both signals lines up.

Suppose the signals in Figure 1 were random in nature. We know from our elementary statistics that if we had a set of totally random numbers centered around some value, say zero, the average value of the sum of these numbers would be zero. Thus, for a random signal centered about zero our above operation for calculating the correlation coefficient produces a correlation coefficient close to zero (the value about which the signal is centered) except when the signals are lined up. Since these are random signals, this will occur at only one value of tau, the time delay between the two signals. Our correlation plot would then look like the one in Figure 3. Notice that this correlation plot is noise-free as compared to the one shown in Figure 2 where some correlation exists at a number of time shifts. This noise-free correlogram is a characteristic of the cross- or autocorrelation of a random signal.

Correlation Chromatography

Single Peak CC. Let us now turn our attention to mating this procedure with chromatography in an effort to increase the detector sensitivity.
Suppose we were to set up a chromatograph in such a way that either sample or carrier gas will be flowing through the column. This setup is shown schematically in Figure 4. Which gas is being injected will be determined by the position of the two-position valve, V. Arbitrarily, we will label one position of this valve as 1 (sample gas being injected) and the other position as -1 (pure carrier gas being injected).

Figure 2. Plot of correlation coefficients vs. tau.

Figure 3. Plot of correlation coefficients vs. tau for random signals S units out of phase.

Figure 4. Schematic of a correlation chromatograph.

Assume the single component in the sample reservoir elutes with a time of 3 units and a signal height of 3D units. Also, in this example, it is assumed that the column will not distort the signal, only delay it in time. Thus, if we were to switch the valve according to the PRBS code shown in Figure 5a, we would expect, some time later, to see a detector signal such as that also shown in Figure 5a. If we compare this output signal to the input code, we see that there is not a one-to-one correspondence at a given time. Obviously the detector signal is not going to be exactly synchronized with the valve position code because of the column delay.

Let us now cross-correlate the detector signal with the valve position code in the manner just discussed for the signals of Figure 1. We have illustrated this procedure in Figure 5b. An array of detector signals has been memorized and the array of valve positions associated in time with detector signals is shown above it. If you multiply each detector signal with the valve position shown above it and sum each of these products, you will obtain the correlation coefficient shown at the right. The valve position code is then shifted one code unit (given by the value of tau) and the process repeated. The plot of these correlation coefficients vs. tau gives us the correlogram shown in Figure 5c. We have just performed correlation chromatography.
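The shift, multiply, and sum procedure of Figure 5b can be sketched as follows. The 7-unit code used here is an assumed PRBS chosen for illustration and may not be the exact sequence drawn in Figure 5a; the retention time of 3 units and the signal height of 3 (in units of D) come from the example above, and the column is taken to be ideal (pure delay, no distortion).

```python
code = [1, 1, 1, -1, 1, -1, -1]   # assumed 7-unit PRBS (+1 sample, -1 carrier)
RETENTION = 3                     # elution delay in code units
HEIGHT = 3                        # detector response per injection, in units of D

n = len(code)
# Ideal column: the detector repeats the injection pattern RETENTION units later.
detector = [HEIGHT if code[(t - RETENTION) % n] == 1 else 0 for t in range(n)]

# Shift the valve code by tau units, multiply by the detector array, and sum.
correlogram = [sum(code[(t - tau) % n] * detector[t] for t in range(n))
               for tau in range(n)]
print(correlogram)   # [0, 0, 0, 12, 0, 0, 0]
```

The single non-zero value, at tau equal to the retention time, is the four sample injections in the code multiplied by the single pulse height of 3.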
Notice that this procedure converts the time varying detector signal to a single pulse even though we are feeding sample to the column more or less continuously. This pulse is, in theory, identical to a conventional single pulse chromatogram. Actually, the pulse is modified into a Gaussian-like signal by the column just as the column modifies a single pulse chromatogram. However, there is an important difference. Because it is constructed from a number of single pulses and represents the average of these single pulse chromatograms, it, in theory, has a much larger signal-to-noise ratio than its single pulse analog. Again using our elementary statistics we predict that the signal strength has been increased by the number of sample injections made during the period of the code, while the noise has only increased by the square root of the number of injections. This then is the basis for using this procedure for trace analysis where one is usually working at the limit of the signal-to-noise ratio of the system.

One could consider the correlogram to be a snapshot of past events. We must memorize the detector signals and valve positions over a period of time and then calculate a correlogram with this data. To obtain a more up-to-date snapshot we must return to the array of data and calculate another correlogram based on the newest information which has been placed in the array. As mentioned previously, the correlation of random signals yields clean baselines.
The valve position code shown in Figure 5a has been chosen with this property in mind. It is from a set of codes which are called pseudo random binary sequences (PRBS), and can be generated on demand from published algorithms (1, 2, 3). These codes have the interesting property that within the interval defined by the sequence they behave like a totally random sequence of binary events. This is a convenient code to use. It depicts binary events, which is exactly our situation with a sample valve; it is either injecting sample or it is not. The shortest time interval for injection will establish the primary unit of the code and the selected code will determine the sequence of these units. A number of codes can be generated which will allow us the flexibility to select the desired resolution within the sequence. Correlation of the code with itself (autocorrelation) yields only one correlation point in the time domain defined by the sequence and the unit code interval (see Figure 5c) and an otherwise clean baseline. Since the detector in our chromatogram just follows what the sample valve is doing, it also should be a pseudo random sequence and the cross-correlation of input and output is really an autocorrelation and thus yields the single pulse correlogram with an otherwise clean baseline.

Figure 5. Construction of correlograms. a. Detector signal generated by the sample injected over a period of time according to the PRBS shown below. b. Calculation of correlation coefficients. c. Correlogram.

Suppose the detector signal contains random noise which has been contributed from other parts of the system. This random noise has a different source from our artificially created, randomly varying detector signal; therefore it is not correlatable with ours. It is not in phase with our signal.
Our procedure for calculating the correlogram averages this noise to zero. This will only be true for truly random noise and provided that you have selected a large enough period for the calculation, i.e. a statistically valid sample. In practice, this means that you must select a PRBS which is long enough so that the data array contains enough samples to average out the random noise.

Multiple Peaks. Finally, we must answer the question of how this method yields chromatograms containing many peaks and whether the correlogram intensity is related to component concentration. Notice in our previous examples our detector signal has been given the arbitrary units of 3D. The type (voltage, current) and intensity of this output will depend on the particular detector (TC, FID, EC, etc.) and the concentration of the species in the sample. The input code is just that. It is not a signal. The +1 and -1 values are used to depict a position of the injection valve. If the sample contains more than one species with different retention times, the effect of the column will be to transmit the same version of the switching code but at different delay times, according to the retention times of each of the components. This can be visualized then as a number of delayed versions of the sampling pattern transmitted through the column simultaneously, all originating simultaneously from the same injection code.

We will select a PRBS of 15 for this next example for increased resolution of the two solutes whose separation we are going to demonstrate. Consider, for example, a sample composed of two components. Component A has a retention time of 3 and, if injected over one of the time units of our code (a single pulse), would yield a detector signal of 2. Similarly, component B has a retention time of 7 and a single pulse detector intensity of 7. If we were to use two detectors, one specific for only A and one specific for only B, the signals would appear, shifted in time, as shown in Figure 6a. The position of the injection valve at the time these detector values were recorded is also shown. Similarly, if we were to use one non-specific detector, the sum of the signals due to A and B would yield the detector signal labeled in Figure 6a as Combined Signal.

Now let us take a time period of detector signals large enough to encompass the length of the pseudo random binary sequence injection code which produced it, and cross-correlate it with this injection code of -1 and 1. We have shown in Figure 6a an array of the combined signal values for one sequence of our code and underneath, in Figure 6b, the corresponding positions of the sample valve at the moment each detector value was recorded. Also, we have calculated a value for the correlation coefficient at each value of tau as we shifted the injection code from the past into the present. Finally, we have plotted the value of the correlation coefficients vs. tau in Figure 6c.
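The two-component example can be sketched in Python as below. The chapter does not say which generator produced its 15-unit code, so this sketch assumes a four-stage linear feedback shift register with the recurrence a[k+4] = a[k] XOR a[k+1] (one standard maximal-length choice); the function name `prbs` and its parameters are hypothetical. The retention times (3 and 7) and single pulse responses (2 and 7) are those given above.

```python
def prbs(n_bits, taps, seed=1):
    """One period (2**n_bits - 1 units) of a +1/-1 code from a shift register."""
    state = seed
    sequence = []
    for _ in range((1 << n_bits) - 1):
        sequence.append(1 if state & 1 else -1)   # oldest bit drives the valve
        feedback = 0
        for tap in taps:
            feedback ^= (state >> tap) & 1        # XOR of the tapped bits
        state = (state >> 1) | (feedback << (n_bits - 1))
    return sequence

code = prbs(4, taps=(0, 1))               # assumed 15-unit PRBS: eight +1, seven -1
components = {"A": (3, 2), "B": (7, 7)}   # name: (retention time, single pulse response)

n = len(code)
combined = [0] * n                        # non-specific detector: sum of A and B
for retention, response in components.values():
    for t in range(n):
        if code[(t - retention) % n] == 1:    # sample was injected `retention` units ago
            combined[t] += response

# Cross-correlate the combined signal with the injection code at each tau.
correlogram = [sum(code[(t - tau) % n] * combined[t] for t in range(n))
               for tau in range(n)]
print(correlogram)   # [0, 0, 0, 16, 0, 0, 0, 56, 0, 0, 0, 0, 0, 0, 0]
```

With any maximal-length 15-unit code the two peaks come out at eight times the single pulse responses, because the code contains eight sample injections.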
Notice the simple cross-correlation operation has deconvolved the two chromatograms from the rather nasty detector signal of overlapped chromatographic peaks. Also, if you examine the PRBS injection code of 15 which we used, you will see that there are a total of eight +1 positions and seven -1 positions. Thus we have injected sample eight times before repeating our code. The summation procedure at time shifts of 3 and 7 should reflect this, and they do. A value of 16 is obtained at tau equals 3. This is eight times the single pulse response of 2 obtained for this compound at this concentration. Similarly, at tau equal to 7, compound B has a correlation coefficient of 56 (eight times seven). Notice that we have set this example up so that the average value of the detector signal baseline is zero and a simple summation accomplishes the averaging.

Noise Addition. To illustrate the reduction of out-of-phase random noise which is possible with this procedure, let us put some random noise on the single pulse chromatogram of the sample used for the above example. We have plotted the single pulse chromatogram with a noise band (minimum to maximum noise signal) of 2 in Figure 7b. A similar noise band added to the signal from our correlation detector shown in Figure 6a produces the signal shown in Figure 7a. The noise signal is also reproduced here, and you can see it is really a shifted PRBS code of 15. This was done only to keep the calculations within the paper and pencil domain as we have throughout this paper. The results will be the same for truly random noise.
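For truly random noise the averaging can be tried numerically. The sketch below is an illustration under stated assumptions, not the chapter's paper and pencil example: it uses an assumed 15-unit PRBS, repeats it for a hypothetical 20 periods, and adds Gaussian noise of unit standard deviation to the combined signal of components A and B. The peak sums grow in proportion to the number of injections, while the noise contribution grows only as the square root of the number of samples summed.

```python
import random

code = [1, -1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, 1, 1, 1]   # assumed 15-unit PRBS
n = len(code)
PERIODS = 20                  # hypothetical number of code repetitions recorded
rng = random.Random(0)        # fixed seed so the sketch is repeatable

detector = []
for t in range(n * PERIODS):
    clean = (2 if code[(t - 3) % n] == 1 else 0) + (7 if code[(t - 7) % n] == 1 else 0)
    detector.append(clean + rng.gauss(0.0, 1.0))   # add random detector noise

# Cross-correlate the whole record with the repeating code, then average per period.
correlogram = [sum(code[(t - tau) % n] * detector[t] for t in range(n * PERIODS))
               for tau in range(n)]
per_period = [value / PERIODS for value in correlogram]
largest_two = sorted(sorted(range(n), key=lambda tau: per_period[tau])[-2:])

print(largest_two)
print([round(value, 1) for value in per_period])
```

In this sketch the two largest values should fall at tau = 3 and tau = 7, close to 16 and 56, with the remaining points scattered about zero; lengthening the record tightens that scatter further.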

Figure 6. Construction of a correlogram for a two component sample. (a: PRBS valve command, Detector A, Detector B, Combined Signal; b: position of shifted PRBS at each value of tau.)

Figure 7. Addition of noise to the detector signal of a single pulse chromatographic run and a run using a PRBS injection sequence. (a: Code, Combined Signal, Noise, Signal with Added Noise; b: single pulse chromatogram with noise added.)

We have kept the phases the same so you can cross-correlate the shift table shown in Figure 6b with the noisy combined detector signal shown in Figure 7a to obtain the corresponding correlation coefficients. Exactly the same correlation coefficients are obtained for the signal in Figure 7a as for the noise free detector signal shown in Figure 6a. The correlograms are the same. The noise has vanished! Clearly, the correlogram of Figure 6c represents a much better data set from which to calculate the concentration of the two components than the chromatogram shown in Figure 7b.

You can understand the excitement of early researchers in the field when they realized the potential of this technique. Large increases in the signal-to-noise ratio were possible simply by encoding the sample injections properly. Also, it might be possible to turn the batch-operated chromatograph into one giving more or less continuous answers. The calculations required to decode the results were trivial and thus would not require a large computing power.

Problems

Now that we have discussed the theoretical basis for correlation chromatography, let us examine some areas which may cause problems or at least, in practice, limit its application.

Sampling. Notice that the correlation procedure produces a triangle whose base width is two basic time units wide. In order not to degrade the peak shape produced by a single pulse width, our triangle should be of the same width.
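The triangular shape can be seen by sampling the code several times per code unit and correlating it with itself. In this sketch the 7-unit code and the four samples per code unit are both assumed values chosen to keep the numbers small.

```python
code = [1, 1, 1, -1, 1, -1, -1]   # assumed 7-unit PRBS
SAMPLES_PER_UNIT = 4              # hypothetical detector readings per code unit

# Hold each code value for SAMPLES_PER_UNIT consecutive readings.
signal = [level for level in code for _ in range(SAMPLES_PER_UNIT)]
n = len(signal)

autocorr = [sum(signal[t] * signal[(t - lag) % n] for t in range(n))
            for lag in range(n)]
print(autocorr[:6])   # [28, 20, 12, 4, -4, -4]
```

The values fall linearly from the peak to the baseline over one code unit (four readings here) and, by symmetry, rise over one code unit on the other side, so the base of the triangle spans two code units.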
Thus, in trace analysis we are limited to a unit code width which is one half the sample volume allowed for a single pulse chromatogram of the same sample. This may have implications on the sampling switch design if very fast switching is necessary to maintain resolution of closely spaced peaks. In this regard, we concluded in earlier work (4) that ensemble averaging a number of single pulse chromatograms might be a more effective way of increasing signal-to-noise ratios in fast chromatography of closely spaced peaks than by using CC.

In selecting parameters for correlation work, two factors need to be considered. First, the length of the pseudo random sequence is determined; it must be longer than the longest eluting peak. Second, the maximum unit pulse width is calculated to maintain a given chromatographic integrity. Both of these factors are then used as criteria to select the pseudo random code that one must use in a particular analysis.

The sample valve must be made so that it can be activated automatically at the command of the switching code. Also, in terms of hardware design, CC demands a reliable sample switch which must switch many more times over a given analysis cycle than it would have to if run in the single pulse chromatography mode. Since the sampling valve is probably the most unreliable component of an automated chromatograph, you can understand our interest in no moving part fluidic valves (4, 5). Other researchers have recognized the sample valve problem and offered their own contributions (6, 7).

Non-linearity. One of the most aggravating problems in CC which was recognized early in the research is associated with the so-called non-stationary nature of the system. In all of the examples shown above we have assumed linearity of system response. This means that peak shapes do not change with concentration (all the moments of the peak remain the same); only the magnitude of the response changes and this does so linearly with the concentration. Only if this property is maintained will our multiplication, summing and averaging work out to cancel everything except at the correct time shift. Otherwise, the operation produces significant values for the correlation coefficients at other points, and therefore, a noisy baseline. We have termed this noise "correlation noise" to identify its source. It is a noise component which will not be reduced by cross correlation. It is always present to some extent in CC and can many times be confused with real peaks. Also, the concentration of the analyzed species must remain the same during the time of the analysis or, again, we will produce correlation noise.

Further Discussion. For a detailed explanation of the problems of CC, backed by both experimental data and the results of computer modeling, you are referred to our first papers on the subject (8, 9).
On the basis of the results, we proposed at that time guidelines for the effective use of this technique which I believe are still valid. We have repeated some of these ideas in a largely tutorial paper that we published some time later (3) and in a review of signal enhancement techniques (10).

All of the above problems have been demonstrated in the literature. You can verify them yourself by taking the examples we have worked, moving the position of some of the peaks in the standing waves of each of the components and then adding them together to produce the final detector signal. Cross-correlation of this detector signal with the sampling code will not produce a nice clean correlogram. Similarly, you can demonstrate that the same thing happens if the intensity of the standing waves changes during the run, as would be the case if the concentration of the sample species were changing.

Burke and his students (11) have published a proposal for solving the non-linearity problem associated with CC and the consequent correlation noise. They used a constant-frequency multiple-injection signal whose frequency was modulated during the run. Before each injection, a random number was generated to determine the magnitude and sign of the deviation from the carrier frequency for the next injection time. Thus, the next


6.

ANNINO

Introduction to the Theory of Correlation Chromatography

injection would come a little earlier or a little later than the time dictated by the base frequency governing the multiple injection. By this means, the sample concentration in the column was kept at an almost constant value and, thus, within the linear range of the isotherm. This was an extremely clever proposal, and it is not clear why it did not work better than it did, since correlation noise was still present. It may be that this was due to the hardware implementation of the technique rather than to a conceptual problem.

Recognizing the problems associated with CC allows one to select the applications for which it is best suited. Trace analysis using CC is a natural choice, since it is under these very dilute solution conditions that chromatography is apt to be a linear phenomenon. However, as we have pointed out a number of times, it only makes sense if the trace material is dissolved in the solvent that we are using as carrier fluid. In gas chromatography, this means ambient air analysis with nitrogen as a carrier gas (12), and in liquid chromatography it probably means water pollution applications with water as a carrier fluid (13), although there may be some other applications in organic solvents which fortuitously can be used as a carrier fluid or which will not change their chromatographic solvent properties when diluted by one-half (remember, in essence, we inject our sample for 50% of the time and pure carrier fluid for the rest of the time).

The model we have presented for CC appears to be a simplified representation, but it is quite rigorous in terms of the correlation process. It can be further refined to yield the common chromatographic peak shapes by convoluting each unit pulse with a Gaussian-like function before summing them to give the detector signal. The output of this simple model then looks like that shown in Figure 8. However, the above operation merely makes the appearance of the correlogram more chromatographic-like; it does not make it a model which will mimic the results obtained in the laboratory. The computer model we have used to demonstrate the laboratory experimental results in CC was one we had previously developed to explain the anomalies in finite difference chromatography (14), which we modified to take the encoded sample input of a correlation chromatograph.

For the reader who is interested in pursuing the mathematical details of CC, I recommend the papers of Smit and his co-workers (15). These researchers published early in the area of CC (16) and continue to contribute regularly to the field (17). Phillips has also been active in the field of CC, which he considers to be a subset of what he calls multiplex chromatography (18). Some workers have used on-line correlation chromatography to study the thermal decomposition of polymers and compared the results against those using conventional injection procedures (19), while others have applied it to the study of gas-solid adsorption (20). In addition, for those of you who may wish to do some further reading on the general subject of correlation in analytical


TRACE RESIDUE ANALYSIS

Figure 8. Plots of computer-generated representations of the input code, x(t), the detector output, y(t), and the final correlogram output, (t), for a sample containing 18 and 82% concentrations of two components. Reproduced with permission from Ref. 9. Copyright 1973, American Chemical Society.


chemistry, the chapter by Horlick and Hieftje (21) is an excellent overview.

Literature Cited

1. Davies, W. D. T. Control 1966, 10, 302, 364, 431.
2. Scholefield, P. H. R. Electronic Tech. 1960, 389.
3. Annino, R.; Grushka, E. J. Chromatog. Sci. 1976, 14, 265.
4. Annino, R.; Leone, J. J. Chromatog. Sci. 1982, 20, 19.
5. Annino, R.; Gonnord, M.-F.; Guiochon, G. Anal. Chem. 1979, 51, 379.
6. Laeven, J. M.; Smit, H. C.; Kraak, J. C. Anal. Chim. Acta 1983, 150, 253.
7. Valentin, J. R.; Carle, G. C.; Phillips, J. B. J. High Resolut. Chromatog. Chromatog. Commun. 1982, 5, 269.
8. Annino, R.; Bullock, L. E. in "Gas Chromatography 1972"; Perry, S. G.; Adlard, E. R., Eds.; Applied Sci. Publisher: London, 1973; pp. 171-186.
9. Annino, R.; Bullock, L. E. Anal. Chem. 1973, 45, 1221.
10. Annino, R. in "Advances in Chromatography"; Giddings, J. C.; Grushka, E.; Cazes, J.; Brown, P. R., Eds.; Marcel Dekker Inc.: New York, 1977; Vol. 15, pp. 33-67.
11. Villalanti, D. C.; Burke, M. F.; Phillips, J. B. Anal. Chem. 1979, 51, 2222.
12. Moss, G. C.; Kipping, P. J.; Godfrey, K. R. in "Gas Chromatography 1972"; Perry, S. G.; Adlard, E. R., Eds.; Applied Sci. Publisher: London, 1973; pp. 187-197.
13. Lub, T. T.; Smit, H. C.; Poppe, H. J. Chromatog. 1978, 149, 721.
14. Annino, R.; Franko, J.; Keller, H. Anal. Chem. 1971, 43, 107.
15. Lub, T. T.; Smit, H. C. Anal. Chim. Acta 1979, 112, 341.
16. Smit, H. C. Chromatographia 1970, 3, 515.
17. Smit, H. C.; Duursma, R. P. J.; Steigstra, H. Anal. Chim. Acta 1981, 133, 283.
18. Phillips, J. B. Anal. Chem. 1980, 52, 468A-478A.
19. Kal'yurand, M. R.; Kullik, E. J. Chromatog. 1979, 186, 145.
20. Phillips, J. B.; Burke, M. F. J. Chromatog. Sci. 1976, 14, 495.
21. Horlick, G.; Hieftje, G. M. in "Contemporary Topics in Analytical and Clinical Chemistry"; Plenum Press: New York, 1978; Vol. 3, p. 153.

RECEIVED March 25, 1985

7

Developments in Correlation Chromatography
Application in Trace Analysis

H. C. SMIT

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch007

Laboratory for Analytical Chemistry, University of Amsterdam, Nieuwe Achtergracht 166, 1018 WV Amsterdam, The Netherlands

The use of the chemometric technique of correlation chromatography has been demonstrated in liquid chromatography. The sensitivity compared to normal liquid chromatography has been extended up to 100-fold. An analysis time of 2 hours was required. The estimated time for conventional enhancement to a similar signal-to-noise ratio is 50 days. Correlograms very similar in shape to chromatograms of 50-fold higher component concentration are shown. The advantages of correlation chromatography in ultra trace analysis are unmistakable, despite the relatively large amount of sample needed for high enhancement of signal.

The instrumental analytical techniques developed in the last three or four decades are almost all based on the limited signal and data processing capabilities of relatively simple analog instruments, and utilize a limited or simple theoretical basis for calculations. Apart from the rather advanced application of statistics, only modest use of mathematical techniques has been made in these traditional analyses. The computer, with its enormous power in data processing and its possibilities in automation and control, has added a new dimension both to the instrumental analytical method and to the application of mathematics and statistics in analytical chemistry. The introduction of the computer was one of the main factors initiating a new analytical subdiscipline, chemometrics, which has a strong mathematical character.

In most applications chemometric methods are applied to analytical data in an off-line mode; that is, the data have already been obtained by conventional techniques and are then subjected to a particular chemometric method. Examples of this use are cluster analysis and pattern recognition, which are applied to spectroscopic, chromatographic, and other analytical data.

0097-6156/85/0284-0101$06.00/0 © 1985 American Chemical Society


A more active kind of chemometrics aims at the integration of statistical and mathematical techniques with the analytical procedure. The conventional analytical process is modified, or a completely new process is developed, in studying reactions, transport processes, adsorption, absorption, etc. The ultimate aim is to obtain more and better information in an optimum way.

Correlation Chromatography (CC) can be considered a typical example of an active or on-line chemometric technique. Impossible without computers, it shows promising results in (ultra) trace analysis. This paper will describe two directions that utilize correlation techniques: a semi-continuous kind of chromatography (1) and an extension of the limit of detection in trace analysis (1). Correlation Chromatography will be shown to be a powerful method for application in (ultra) trace analysis.

Principles of Correlation Chromatography

Classically, the chromatogram is the response of a chromatographic system to an impulse, that is, a single injection of a sample. Correlation chromatography, on the other hand, utilizes semi-continuous injection of sample over a period of time. An example is needed at this point to make the comparison clearer. In normal chromatography an injection is made at a given instant. The introduction of the sample is a single, discrete action. The output of such an injection, after it has been carried through the separation column, is a chromatographic recorder trace which consists of a series of (hopefully) well-defined "peaks", where each peak represents (hopefully) a single compound. Hence, a single "impulse" of injection produces a response of a single peak (for each compound), which could be called an "impulse response". It is essentially single impulse chromatography.

In contrast, correlation chromatography is multiple impulse chromatography. It utilizes sample that is discretely added many times in a random way. Since any number and length of injections can be made in liquid chromatography before the compound of interest elutes from the column, the response (the total signal) of such chromatography is a massive group of fused peaks that looks like a lot of noise, often with a greatly raised baseline (see Figure 1). To the naked eye it is impossible to distinguish separated peaks. However, to the computer, which knows the injection function, the output response makes a lot of sense. It can resolve a peak from noise and produce a "correlogram" which is very similar to a normal chromatogram. The longer the system is run, the larger the sought peaks will be. In trace analysis the result is the detection of trace compounds otherwise not attainable by impulse techniques. The cost of such work is the larger amount of sample needed (milliliter instead of microliter quantities) and a longer analysis time than for single impulse chromatography.


Of course, the impulse response of a system can be determined by measuring the response to an impulse-shaped input signal, but an alternative way is to determine the cross-correlation function of a suitable stochastic (random) input signal and the resulting output. Omitting the mathematical proof, which is given in (2), we will describe this process by following the reasoning given in (3).

Some basic definitions are necessary. The definition of a cross-correlation function (CCF) of two non-zero average power signals, x(t) and y(t), is:

$$R_{xy}(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x(t-\tau)\, y(t)\, dt \qquad (1)$$

The cross-correlation function $R_{xy}(\tau)$ for two signals, x(t) and y(t), describes the general dependence (correlation) of the amplitude of one signal on the other as a function of the time displacement $\tau$. For example, x(t) can be the input signal and y(t) the resulting output signal of a system. The correlation between the input and the output signal is determined by the properties of the system. If in this system there exists only a pure delay, $\tau_d$, without otherwise affecting the signal (mathematically, $y(t) = x(t - \tau_d)$), then of course the maximum correlation is found at $\tau = \tau_d$. The definition of $R_{xy}(\tau)$ is given for non-zero average power signals, i.e., signals theoretically not limited in time, like noise and periodic signals. Peaks and other time-limited signals are zero average power signals. $R_{xy}(\tau)$ is the average product of signal y(t) and a version of signal x(t) delayed by a time $\tau$. The autocorrelation function (ACF) of a non-zero average power signal x(t) is defined by

$$R_{xx}(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x(t-\tau)\, x(t)\, dt \qquad (2)$$

$R_{xx}(\tau)$ is the average product of x(t) and a version of x(t) delayed by a time $\tau$. The autocorrelation function is a basic function in the characterization of a stochastic signal. Considering the ACF as a function of $\tau$, a relatively fast fluctuating (stationary) stochastic signal with an average value equal to zero shows a much faster decrease of the ACF than an otherwise identical but slowly fluctuating stochastic signal.
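The difference between fast and slowly fluctuating signals can be seen in a small numerical experiment. The following is only a sketch under stated assumptions: the signals are random +1/-1 sequences whose level may change every `hold` samples, the limit in Equation 2 is replaced by an average over a finite record, and the names `binary_noise` and `acf`, the record length and the seed are hypothetical.

```python
import random


def binary_noise(n_points, hold, rng):
    """Random +1/-1 signal whose level can change only every `hold` samples."""
    out = []
    while len(out) < n_points:
        out.extend([rng.choice((-1.0, 1.0))] * hold)
    return out[:n_points]


def acf(x, lag):
    """Finite-record estimate of the average product of x(t) and x(t - lag)."""
    return sum(x[t] * x[t - lag] for t in range(lag, len(x))) / (len(x) - lag)


rng = random.Random(0)
fast = binary_noise(40000, hold=1, rng=rng)    # level may change every sample
slow = binary_noise(40000, hold=16, rng=rng)   # level held for 16 samples

for lag in (0, 4, 8, 16, 32):
    print(lag, round(acf(fast, lag), 2), round(acf(slow, lag), 2))
```

For the fast signal the estimate drops to about zero at the first non-zero lag, so its ACF approximates an impulse. For the slow signal it falls off roughly linearly and only reaches about zero once the lag exceeds the hold time.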


It is impossible to predict the amplitude of a stochastic signal at a certain time in the future, in contrast to a deterministic signal like a sine wave. Only a statistical description, for instance by distribution functions and autocorrelation functions, can be given. Most kinds of noise have a stochastic character. An ACF is always an even function, symmetrical with respect to $\tau = 0$. The fast decreasing ACF of a very fast fluctuating stochastic signal can be considered as an impulse.

In correlation chromatography a special kind of stochastic input signal is used, a binary noise. In binary noise only two amplitude levels can occur, high or low (see Figure 1). Nevertheless, it is a stochastic signal because it is unpredictable which of the two levels will be present at a certain time in the future. On the average each of the two levels has a probability of 0.5. Generally, the binary noise is generated artificially by a generator controlled by an internal clock. The clock period $\Delta t$ determines the minimum time that one of the two states will exist. In the determination of the ACF $R_{xx}(\tau)$ of a binary noise with amplitude levels of +1 and -1, if the time shift $\tau$ is greater than the clock period $\Delta t$, then the average product of $x(t)$ and $x(t-\tau)$, being $R_{xx}(\tau)$, will be zero; the probability of each of the products +1 and -1 is 0.5. However, if $\tau$

Figure 1. Example of a PRBS. Determination of one point of R(τ).

Figure 2. Basic diagram of a correlation chromatograph. Diagram labels: sample, eluate, detector, computer, pseudo random binary sequence (PRBS).

delay for every input pattern. A conventional pulse-shaped injection (one component) causes a delayed pulse at the detector; the delay time is $t_R$. Of course the result of a PRBS input (one component) is a PRBS output, delayed by $t_R$ seconds. In that case the cross-correlation function (CCF) of input and output is identical to the ACF of the PRBS, but shifted $t_R$ seconds in time. The ACF in this case is again a triangle, located not at $\tau = 0$ but at $\tau = t_R$. If the clock period is small compared with the delay time, this triangle can be considered as an impulse. Hence, the correlogram is identical with the chromatogram. In both cases the amplitude is proportional to the input concentration of the component.

A sample with n components, each with a certain concentration and its own retention time, results in n summed PRBS functions at the detector output, each with an amplitude dependent on the concentration. Cross-correlation again gives a correlogram similar to the chromatogram. Considering a normal chromatogram to be made up of n points, each representing a "component" with a certain amplitude and retention time, leads by the same reasoning to the same conclusion: the correlogram is identical to the chromatogram.

The PRBS is to be preferred to other random inputs with approximately impulse-shaped autocorrelation functions for the following reasons:
1) It is a binary noise with only two levels (+1 and -1, or 1 and 0, respectively). The levels can be used to control simple on/off valves.
2) The function can be easily generated and reproduced.
3) Its special properties offer the possibility of reducing the so-called correlation noise, caused by a limited correlation time.

CC is essentially statistical by nature. The system noise (detector noise) is not correlated with the input PRBS; the noise in the correlogram resulting from the detector noise converges to zero with increasing correlation time.

Set Up of a Correlation Chromatograph

A correlation chromatograph requires only a modification of the injection system of a conventional chromatograph. Figure 3 shows a design for a correlation high pressure liquid chromatography system. It is suitable for high pressures up to 500 bar and for use with corrosive samples. An extensive description is given in (4). This chromatograph is intended for use in ultra trace analysis and research in CC. Apart from the injection system, the essential extensions of a chromatographic set-up required to allow CC are:
a) The pattern generator, which is necessary for generating either a single pulse (normal chromatography) or a PRBS adapted to the sample (CC). It is used for stimulation of the column via switching of valves and for calculation of the CCF from the detector output and the pattern.
b) The data sampler, which is used to sample the filtered electric detector signal and to convert it to a digital value.
c) An arithmetic unit to calculate the CCF.
d) A display for the results.

Some extensions are not essential for CC, but greatly improve its capabilities. Interfaces to a data storage device and to a hard copy unit are valuable. Some facility for subsequent data processing (baseline correction and peak area determination) is desirable.
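The roles of the pattern generator, the data sampler and the arithmetic unit can be mimicked in a short simulation. This is a minimal sketch and not the instrument described here: it assumes ideal components without peak broadening, Gaussian detector noise, circular cross-correlation, and a 127-point sequence from a simple shift-register recurrence. The function names, the two retention times (20 and 75 clock periods), the concentrations (0.18 and 0.82) and the noise level are hypothetical.

```python
import random


def prbs(nbits=7, lag=6):
    """Pattern generator: one period of a 0/1 PRBS (127 points by default)."""
    a = [1] * nbits
    while len(a) < 2 ** nbits - 1:
        a.append(a[-nbits] ^ a[-lag])
    return a


def detector(code, components, periods, noise_sd, rng):
    """Detector output: delayed, scaled copies of the code plus detector noise."""
    n = len(code)
    return [
        sum(conc * code[(t - delay) % n] for delay, conc in components.items())
        + rng.gauss(0.0, noise_sd)
        for t in range(periods * n)
    ]


def correlogram(code, signal):
    """Arithmetic unit: CCF of signal and +1/-1 code, scaled to concentration."""
    n = len(code)
    ref = [2 * c - 1 for c in code]
    periods = len(signal) // n
    scale = sum(code) * periods
    return [
        sum(ref[(t - k) % n] * signal[t] for t in range(periods * n)) / scale
        for k in range(n)
    ]


code = prbs()
components = {20: 0.18, 75: 0.82}  # retention time (clock periods): concentration
rng = random.Random(0)

for periods in (1, 64):
    corr = correlogram(code, detector(code, components, periods, 1.0, rng))
    print(periods, round(corr[20], 2), round(corr[75], 2))
```

With a single sequence the smaller peak is of the same order as the noise in the correlogram. Correlating over 64 sequences reduces that noise by about a factor of eight, which illustrates the statement that the detector noise contribution converges to zero with increasing correlation time.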


A microprocessor is ideally suited for CC. In our laboratory a microprocessor-based instrument, a correlator, has been developed which meets all the mentioned requirements for CC. Details can be found in (5).

Results in Trace Analysis

The first experiments directed to trace analysis were carried out in correlation gas chromatography (2). However, in recent years much attention has been paid to correlation HPLC, because detection is generally more of a problem than in GC and because injection is inherently easier. Results with a first experimental set-up and an off-line computer calculation of the CCF were very promising.
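The comparison quoted in the abstract (two hours of CC against an estimated 50 days of conventional signal enhancement for a 100-fold gain) can be checked with a few lines of arithmetic. The sketch assumes the usual rule for uncorrelated noise, namely that summing N chromatograms improves the signal-to-noise ratio by the square root of N. The time per chromatogram printed at the end is an inference from those numbers, not a figure given in the text.

```python
import math

enhancement = 100                      # required gain in signal-to-noise ratio
n_summed = enhancement ** 2            # chromatograms to be summed under the sqrt(N) rule
total_hours = 50 * 24                  # 50 days of conventional analysis time
minutes_per_run = total_hours * 60 / n_summed
cc_hours = 2                           # analysis time reported for CC

print(n_summed, "summed chromatograms")
print(total_hours, "hours in total, about", round(minutes_per_run, 1), "minutes per chromatogram")
print("CC is faster by a factor of", total_hours // cc_hours)
assert math.isclose(math.sqrt(n_summed), enhancement)
```

Under this assumption the 50-day estimate corresponds to 10,000 summed chromatograms of roughly seven minutes each.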

Table I. Listing of Solutes Present in the Chromatogram (Fig. 4).

Peak No.   Solute                        Capacity Ratio   Concentration (ppm)
1          impurities, THF
2          2,3-dichlorophenol            5.52             10.5
3          2,6-DCP                       6.34             12.6
4          3,4-DCP                       7.07             10.9
5          2,5-DCP                       7.78             10.0
6          2,3,4-trichlorophenol         8.53             10.1
7          2,3,6-TCP                     9.70             11.8
8          3,5-DCP                       10.52            9.9
9          3,4,5-TCP                     11.94            10.4
10         2,4,6-TCP                     12.76            10.0
11         2,3,4,5-tetrachlorophenol     15.52            10.1
12         2,3,5,6-TCP                   17.21            10.3
13         pentachlorophenol             24.96            10.4

A lowering of the detection limit by a factor of about 100 (6) was achieved in a total analysis time of two hours. The efficiency of CC compared to conventional signal enhancement methods (summing a number of chromatograms) is well demonstrated by these experiments. It is well known that the signal-to-noise enhancement increases only proportionally to the square root of the number of summations; to achieve the same result by summing conventional chromatograms would have required about 1200 hours, or 50 days (!), of analysis time.

Figure 4 shows the separation of twelve chlorinated phenols by conventional HPLC; the concentration of each component is about 10 ppm or 10 mg/L (see Table I). Figure 5 represents a correlogram of the same mixture obtained under slightly different separation conditions. Here the concentration of each component is 0.2 ppm, a lowering of the concentration by a factor of 50.

The analytical performance of a newly developed microprocessor-based correlator was demonstrated using phenol, with reversed phase HPLC separation and fluorometric detection. Five decades of concentration (0.01 - 100 µg/L) were measured. The two higher concentrations (10 and 100 µg/L) were determined by conventional HPLC, the two lower concentrations (0.01 and 0.1 µg/L) by correlation HPLC, and measurements at the 1 µg/L level were carried out by both methods. The calibration graph is shown in Figure 8. The bars represent ±3 times the standard deviation of the integrated signal plus noise (8). The inner bars at the 1 µg/L level represent the correlation results and the outer bars the single injection results. The detection limits of the single injection experiments and of the correlation procedure with 10 ng/L concentration, both defined as 3 ×

Figure 3. Set-up of a correlation HPLC system. The constant water flow is controlled by a PRBS pattern which directs the flow to either the sample or the eluent reservoir, driving the appropriate plunger forward. A 6-way rotary valve is placed at the outlet of the eluent reservoir to allow single injection experiments.

Figure 4. Separation of twelve different chlorinated phenols by conventional HPLC. The solutes are listed in Table I. Plot title: "Chromatogram, 10 PPM"; horizontal axis: time.

Figure 5. Correlogram corresponding to Figure 4 with slightly different separation conditions. The concentration of each component is 0.2 ppm. Plot title: "Correlogram: Cl-Phenols, All Concentr. 200 PPB"; horizontal axis: time.

Figure 8. Calibration graph of phenol with fluorometric detection.

Table I. Algebraic Equations for First-Order Regression Calculations

Parameter        Constant-variance data                      Variable-variance data
Model            Y = b0 + b1 X + e, e ~ N(0, σ²)             Y = b0 + b1 X + e, e ~ N(0, σ²/w)
Mean, X̄          ΣX/n                                        ΣwX/Σw
Slope, b1        (n ΣXY - ΣX ΣY)/[n ΣX² - (ΣX)²]             (Σw ΣwXY - ΣwX ΣwY)/[Σw ΣwX² - (ΣwX)²]
Intercept, b0    (ΣY ΣX² - ΣX ΣXY)/[n ΣX² - (ΣX)²]           (ΣwY ΣwX² - ΣwX ΣwXY)/[Σw ΣwX² - (ΣwX)²]

Band around predicted concentrations: substitute Y values (mean signal ± ΔY) in Y = b0 + b1 X ± band around regression, and solve for X. α = 0.05; the corresponding normal deviate is 1.96. (Reprinted with permission from Ref. 4.)

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch008


the narrowest band around the predicted concentration. Figure 5 shows typical curves selected by this procedure. At low concentrations a first-order equation based on, say, the three lowest standards is chosen. At high concentrations a second-order equation yields the narrowest band. Note that this procedure often does not use all available data, an omission which seems intuitively incorrect. The procedure will improve precision when the benefits from better mathematical modeling exceed the losses from not using some data. In general, the multiple-curve procedure produces maximum benefits at the low-concentration end of a long, nonlinear curve. For example, in typical data for the determination of fenvalerate by gas chromatography (Table II), use of the multiple-curve procedure improved the precision of the analysis by a factor of two at the 1 mcg level, and a factor of three at the 5 mcg level.

Correction for nonconstant variance. To correct for nonconstant variance, it is necessary to weight standard measurements according to their local variance, s². For each standard concentration the variance is determined by repetitive analysis at that level, and a weighting factor, w = 1/s², is calculated. These factors are used in the equations given in Table I. The computation requires only that the variance ratios be accurately known. The absolute precision of the method may change from day to day without affecting the validity of either the least-squares curve-of-best-fit procedure or the confidence band calculations. (It is not practical to regularly monitor local variances, and errors may develop in variance ratios. However, the error due to incorrect ratios will almost always be much less than the error due to assuming constant variance. Even guessed values of, say, s proportional to concentration are likely to yield more precise data.)

An unweighted least squares procedure is often adversely affected by high concentration standards, with high (absolute) variances. These may cause large errors in the slope of first-order equations. The line is 'rotated', causing large relative errors at low concentrations. The weighting procedure deemphasizes these points, thus reducing this effect. Figure 6 shows data for the determination of lead in blood by Delves cup AAS. The first-order curve is known to pass through zero. The weighted least-squares line is close, with an intercept of 1.5, but the unweighted line has been 'rotated' by a single low value (not an outlier) at 65 µg/dL, giving an incorrect intercept of 3.3. A sample yielding a signal of 5.3 has a calculated lead concentration of 10 mcg Pb/dl using the weighted least squares line and 6 mcg/dl using the unweighted line, a 40% error. Similarly, Figure 7 shows part of the weighted and unweighted least-squares curves for symposium Dataset B. Standards over the range 0.05 to 20 ng fenvalerate were analyzed, and the figure shows a range of only 0-1. The variance at each amount level was known, so both weighted and


Fig. 5. Use of multiple-curve procedure. Subsets of calibration data, each comprising several standards bracketing the samples, are used to calculate predicted concentrations for high- and low-concentration samples. [x-axis: Concentration]

Fig. 6. Determination of lead in blood by Delves-cup AAS. Both weighted and unweighted curves of best fit are shown. [x-axis: Concentration (µg Pb/dl)]


Table II. Use of multiple-curve procedure to improve precision of fenvalerate analysis by gas chromatography

Fenvalerate     Single-curve     Multiple-curve calibration
amount (mcg)    calibration      Calibration range       RCB (%)
                RCB (%)          Low        High
0.05            40               0.05       1.00         24
0.25            33               0.05       1.00         20
1.00            33               0.05       5.00         17
5.00            26               1.00       20.00        7
20.00           20               1.00       20.00        7

Note: $\alpha = 0.05$; $z_{1-\alpha/2} = 1.96$

Fig. 7. Determination of fenvalerate by gas chromatography with Dataset B, showing weighted and unweighted second-order curves.

unweighted curves could be calculated. These data are very precise, with five replicate measurements at 20 ng fenvalerate having a range of ±2.6%. Use of the unweighted procedure caused significant errors only at amount levels below 1 ng fenvalerate.
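The 'rotation' effect described above can be illustrated with a minimal sketch of a first-order fit using weights w = 1/s². The function names and the numerical data are hypothetical and chosen only for illustration (they are not the lead or fenvalerate data of the chapter); the sums follow the usual weighted least-squares expressions for slope and intercept.

```python
def weighted_line_fit(x, y, s=None):
    """Fit y = b0 + b1*x by least squares with weights w = 1/s**2.

    x, y : standard concentrations and their signals.
    s    : local standard deviation at each standard (None means unweighted).
    """
    w = [1.0] * len(x) if s is None else [1.0 / si ** 2 for si in s]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    denom = sw * swxx - swx ** 2
    b1 = (sw * swxy - swx * swy) / denom      # slope
    b0 = (swy * swxx - swx * swxy) / denom    # intercept
    return b0, b1


def predict_concentration(signal, b0, b1):
    """Invert the fitted line to estimate a concentration from a signal."""
    return (signal - b0) / b1


# Hypothetical standards: the highest standard reads low, and the
# standard deviation is assumed proportional to concentration.
conc = [5.0, 10.0, 20.0, 40.0, 65.0]
signal = [2.6, 5.1, 10.2, 20.3, 29.0]
sd = [0.1, 0.2, 0.4, 0.8, 1.3]

b0_u, b1_u = weighted_line_fit(conc, signal)        # unweighted
b0_w, b1_w = weighted_line_fit(conc, signal, sd)    # weighted, w = 1/s**2

print("unweighted intercept, slope:", b0_u, b1_u)
print("weighted intercept, slope:  ", b0_w, b1_w)
print("low-signal sample, unweighted:", predict_concentration(2.0, b0_u, b1_u))
print("low-signal sample, weighted:  ", predict_concentration(2.0, b0_w, b1_w))
```

In this made-up example the single low reading at the top standard pulls the unweighted intercept away from zero, while the weighted line stays close to the low-concentration standards, which is the behaviour the text attributes to weighting.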


Improved Measurement of Precision

Calibration curve quality. Calibration curve quality is usually evaluated by statistical parameters, such as the correlation coefficient and standard error of estimate, and by empirical indexes, such as the length of the linear range. Using confidence band statistics, curve quality can be better described in terms of confidence band widths at several key concentrations. Other semi-quantitative indexes become redundant. Alternatively, the effects of curve quality can be incorporated into statements of sample analysis data quality.

Sample analysis data quality. Precision of sample analysis is almost always measured by determining the RSD at two or more concentrations without using a calibration curve. Such data do not include the effects of the calibration process on precision. Much better information is given by the relative confidence bandwidth (RCB), defined as:

$$\mathrm{RCB}(\%) = \frac{(\text{upper band} - \text{lower band}) \times 100}{2 \times \text{predicted concentration}}$$
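As a small worked example of the RCB definition, the sketch below evaluates it for one predicted concentration. The function name and the numbers are hypothetical; only the formula itself comes from the text.

```python
def relative_confidence_bandwidth(lower, upper, predicted):
    """RCB(%) = (upper band - lower band) * 100 / (2 * predicted concentration)."""
    if predicted == 0:
        raise ValueError("RCB is undefined for a predicted concentration of zero")
    return (upper - lower) * 100.0 / (2.0 * predicted)


# Hypothetical band of 9.0 to 11.0 around a predicted concentration of 10.0.
rcb = relative_confidence_bandwidth(lower=9.0, upper=11.0, predicted=10.0)
print(rcb)  # 10.0
```

Because the band limits are taken from the calibration confidence band, the result reflects calibration uncertainty as well as measurement scatter, which a plain RSD does not.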

For example, Figure 8 shows both RSD and RCB data for determination of chloride and lead in water. In Figure 8a, the least-squares curve of best fit closely fits the lead standard data, and the calibration process has little adverse effect on precision. RSD's and RCB's are almost equal. On the other hand, chloride standard data in Figure 8b do not closely fit the mathematical model, and the RSD data overstate the precision of the analysis by a factor of about two.

Minimum reportable concentration. The lower concentration limit for a method is usually measured by determining the detection limit. This is basically an instrument signal-to-noise ratio, and it does not include calibration effects. At low concentrations the calibration process often has a major adverse effect on precision. Detection limits are useful for comparing the inherent sensitivity of methods, but they are not realistic indexes of measurable concentrations in routine analysis. We suggest using a new parameter, the minimum reportable concentration, defined as the concentration whose confidence band just includes zero (5). This parameter is obtained by reducing the value of signal Yo, Figure 4, until the band around predicted concentration, Xo, just touches zero. For example, for the determination of iron in water by AAS (data given in Table III), the detection limit, defined as the concentration at which the

[Figure 8, panels (a) and (b); x-axis: Concentration (µg/ml)]

Fig. 8. Comparison of RCB (-) and RSD for determination of (a) chloride and (b) lead in water. (Reprinted with permission from D. G. Mitchell and J. S. Garden, Talanta 1982, 29, 921-929, copyright 1982, Pergamon Press Ltd.)

Table III. Determination of Maximum Concentration of Iron in Water by AAS

Standard    Absorbance    Single-curve       Multiple-curve calibration
(µg/ml)     at nm         calibration:       Calibration range (µg/ml)    RCB (%)
                          RCB (%) (a)        Low        High
0.05        0.004         40                 0.05       1.5               33
0.10        0.008         26                 0.05       1.5               19
0.25        0.022         13                 0.05       1.5               6
0.50        0.045         9                  0.1        1.5               5
1.00        0.093         8                  0.1        1.5               4
1.5         0.142         7                  0.1        20.0              6
2.5         0.222         7                  0.1        20.0              6
5.0         0.430         5                  0.1        20.0              4
10.0        0.750         6                  0.1        20.0              5
15.0        0.961         11                 0.1        20.0              8
18.0        1.054         60                 15.0       40.0              11
20.0        1.086         60                 15.0       40.0              12
22.0        1.145         60                 15.0       40.0              16
25.0        1.191         60                 15.0       40.0              25
40.0        1.268         60                 25.0       40.0              38
50.0        1.300         60                 0.05       100.0             60
75.0        1.360         60                 40.0       100.0             60
100.0       1.405         60                 40.0       100.0             60

Note: $\alpha = 0.05$; $z_{1-\alpha/2} = 1.96$
(a) A weighted least-squares technique was used: calibration range 0.05-18 µg/ml.
Source: Reproduced with permission from Ref. 5.


RSD is 50%, is 0.015 µg/ml. The minimum reportable concentration is a factor of 2 higher when the method is calibrated over a narrow, low concentration range (0.05 to 0.1 µg/ml). It is a factor of 20 higher when the method is (inappropriately) calibrated over a dynamic range of 100 (0.05 to 5 µg/ml).

Maximum reportable concentration. The upper limit of measurement for a method is usually defined as the concentration at which the curve shows a certain deviation from linearity. This is a valid empirical criterion, since sensitivity and hence precision decrease as the curve flattens. However, linearity does not directly measure the performance parameter of interest: precision. In practice an analyst would accept curves at high concentrations, providing the precision is still adequate and providing the method does not have accuracy problems at high concentrations, e.g., because of light scattering in absorption methods. Confidence bands are direct precision data, and the maximum reportable concentration can be defined as the maximum concentration at which the method yields adequate precision (5) (excluding measurements near the minimum reportable concentration, where poor precision is unavoidable). Table III shows RCB for the determination of iron in water by AAS. The analyst may consider an RCB of, say, 15% to be adequate. The maximum reportable concentration would be 15 µg/ml from a single, weighted least-squares curve, and 20 µg/ml by the multiple-curve method.
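The selection of a maximum reportable concentration from tabulated RCB values can be sketched as follows. The standards and RCB values are those listed in Table III; the helper name `max_reportable` is hypothetical, and the sketch assumes that only the tabulated standard levels are considered (no interpolation between them).

```python
# Standards (ug/ml) and RCB (%) values as listed in Table III.
standards = [0.05, 0.10, 0.25, 0.50, 1.00, 1.5, 2.5, 5.0, 10.0,
             15.0, 18.0, 20.0, 22.0, 25.0, 40.0, 50.0, 75.0, 100.0]
rcb_single = [40, 26, 13, 9, 8, 7, 7, 5, 6, 11, 60, 60, 60, 60, 60, 60, 60, 60]
rcb_multiple = [33, 19, 6, 5, 4, 6, 6, 4, 5, 8, 11, 12, 16, 25, 38, 60, 60, 60]


def max_reportable(concentrations, rcbs, limit):
    """Highest tabulated concentration whose RCB does not exceed the limit."""
    adequate = [c for c, r in zip(concentrations, rcbs) if r <= limit]
    return max(adequate) if adequate else None


limit = 15  # RCB (%) regarded as adequate, as in the text
print(max_reportable(standards, rcb_single, limit))    # 15.0
print(max_reportable(standards, rcb_multiple, limit))  # 20.0
```

With a 15% criterion this reproduces the 15 and 20 µg/ml figures quoted above for the single-curve and multiple-curve calibrations.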
Samples containing > 20 µg/ml should be diluted to 1-10 µg/ml and analyzed using standards containing 0.05-15 µg/mL. (Note that it is always better to include a standard above the maximum desired concentration. The precision of this standard measurement will be poor, but poor data at this level are better than none.)

Implications For Method Development. The effects of the calibration process on precision suggest the need for an additional step in the development of an analytical method. A suggested flow chart is shown in Figure 9. The analyst should first develop a method of adequate accuracy and precision without using calibration curves. The calibration step is then added, and the precision is rechecked. If precision has been excessively degraded, the analyst can choose among alternative calibration strategies, such as use of more standard measurements and use of the multiple-curve procedure.

Conclusion

I have described a reasonably complete set of mathematical techniques for improving the precision of calibration-curve-based analyses and measuring their precision. Each technique may not be the optimum solution to each problem, but the overall philosophy should be correct. We should develop statistical techniques to measure precision which are self-consistent and

[Flow chart; recoverable box labels: SELECT AND DEVELOP SAMPLE PREPARATION, MEASUREMENT TECHNIQUES FOR ROUTINE ANALYSIS; decision branches marked YES; METHOD SATISFACTORY FOR ROUTINE ANALYSIS]

Figure 9. Method development procedure for calibration-curve-based analysis. (Reproduced with permission from D. G. Mitchell and J. S. Garden, Talanta, 1982, 29, 921-929, copyright 1982, Pergamon Press Ltd.)

which account for all the factors affecting precision. We should use them to choose optimum calibration strategies and to measure the precision of the resulting data.

The computer program REGRES was written by John S. Garden, New York State Department of Health. It can be obtained by sending a check for $30.00, made out to Health Research Inc., and a 9-track magnetic tape to John S. Garden, New York State Department of Health, CSMDP, Concourse Level, Empire State Plaza, Albany, New York 12237.


Literature Cited

1. Natrella, M. G., "Experimental Statistics", National Bureau of Standards Handbook 91 (1963).
2. Miller, R. G., "Simultaneous Statistical Inference", McGraw-Hill, New York (1966).
3. Mitchell, D. G., Mills, W. N., Garden, J. S., and Zdeb, M., Anal. Chem., 1977, 49, 1655-1660.
4. Garden, J. S., Mitchell, D. G., and Mills, W. M., Anal. Chem., 1980, 52, 2310-2315.
5. Mitchell, D. G., and Garden, J. S., Talanta, 1982, 29, 921-929.
6. Draper, N. R., and Smith, H., "Applied Regression Analysis", J. Wiley and Sons, New York (1966).

RECEIVED March 25, 1985

9

The Linear Calibration Graph and Its Confidence Bands from Regression on Transformed Data

DAVID A. KURTZ (1), JAMES L. ROSENBERGER (2), and GWEN J. TAMAYO (2)

(1) Pesticide Research Laboratory, Department of Entomology, The Pennsylvania State University, University Park, PA 16802
(2) Department of Statistics, The Pennsylvania State University, University Park, PA 16802

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch009

Linear calibration graphs were constructed from chromatographic response values by means of least squares statistical regression techniques to calculate amount estimates. The amount interval estimates reflect both the uncertainties of measuring the response values and the uncertainty of the calibration graph. The following steps were followed: transformation of response variables to constant variance across the graph using a family of power transformations approach, transformation of the amount variable with similar transformations towards linearity, calculation of the regression coefficients by sums of squares, and solving the regression equation for unknowns. The total range of the amount interval estimation was found by construction of the response confidence interval and the confidence band around the calibration graph. Estimated amounts and amount intervals were calculated from chromatographic analysis of pesticide standards data at 95% probabilities with an overall α=0.05. Data were presented that show large errors at the limit of detection using non-transformed or improperly transformed data.

0097-6156/85/0284-0133$09.50/0 © 1985 American Chemical Society

The calibration problem in chromatography and spectroscopy has been resolved over the years with varying success by a wide variety of methods. Calibration graphs have been drawn by hand, by instruments, and by commonly used statistical methods. Each method can be quite accurate when properly used. However, only a few papers, for example (1,2,15,16,26), show the sophisticated use of a chemometric method that contains high precision: regression with total assessment of error.

Difficulties in both the chemical and statistical aspects of the problem have been found to be enormous in utilizing such a


calibration method. Chromatographic detection is accomplished by flame ionization as well as by many species-specific detectors, such as the electron capture and the flame photometric detectors. The flame detector is non-specific to the chemical species found, is mass sensitive since the total sample is burned, and has a linear range of some 7 powers of ten (3). The electron capture detector is concentration dependent. It has been linearized through geometric design and electronic configuration to a range approaching 3 powers of ten (3). The flame photometric detector, similar in action to the flame ionization detector, has been found to be linear to 4 powers of ten in the phosphorus mode (3). Other known effects on the calibration have included contamination of the detector and day-to-day variability.

Fulfilling the statistical protocol also requires careful study. Aspects in this area include model fitting, preparing constant variance data across the graph, diagnostic tests of closeness of fit and constant variance achievement, and the construction of confidence limits.

There are a number of ways to model calibration data by regression. Most researchers have attempted to describe data with a linear function. Others (4,5) have chosen a higher order or a polynomial method. One report (6) compared the error in the interpolation using linear segments over a curved region versus using a curvilinear regression. Still others (7,8) chose empirical or spline functions. Mixed model descriptions have also been used (4,7).

Ordinary least squares regression requires constant variance across the range of data. This has typically not been satisfied with chromatographic data (4,9,10). Some have adjusted data to constant variance by a weighted least squares method (4). The other general adjustment method has been by transformation of data. The log-log transformation is commonly used (9,10). One author compares the robustness of nonweighted, weighted linear, and maximum likelihood estimation methods (11). Another has constructed calibration graphs and confidence limits under the condition of nonuniform variance (12). On the other hand, a completely different approach for the processing of chromatographic data has been suggested (13), which involves the use of a mean slope method.

The need for reporting accuracy and error in the form of confidence limits when reporting analytical results has already been well outlined (14). The confidence interval requires information about the number and distribution of calibration measurements, the location of the sample, and the number of sample replicates. Agterdenbos has simplified the calculation by assuming homogeneity of variances and assuming that the variance of the sample and the calibration standards are the same ( ). Schwartz has calculated the approximate confidence limits of linear graphs without elaborate digital computation ( JJ3 ).


We will describe an accurate statistical method that includes a full assessment of error in the overall calibration process, that is, (1) the confidence interval around the graph, (2) an error band around unknown responses, and finally (3) the estimated amount intervals. To properly use the method, data will be adjusted by using general data transformations to achieve constant variance and linearity. It utilizes a six-step process to calculate amounts or concentration values of unknown samples and their estimated intervals from chromatographic response values using calibration graphs that are constructed by regression.


Laboratory Methods and Equipment

Preparation of Standard Solutions. The standards used in the preparation of the solutions for this work were obtained from the Health Effects Laboratory, U. S. Environmental Protection Agency, Research Triangle Park, NC. Dilutions were obtained from concentrated solutions using wiretrol measuring capillaries (Drummond Scientific Co., Broomall, PA). The fenvalerate and chlorothalonil data sets were prepared with a statistically equivalent format: each of the standards at each concentration level had the same number of dilution steps and should therefore contain the same variance of dilution. In this case the initial dilution was used to prepare two working standards, 1 and 2. Each was diluted once to 1.1 and 2.1. Each of these was used to prepare three standards for chromatographic injection, 1.11, 1.12, 1.13 and 2.11, 2.12, and 2.13.

The pesticides included in this study were fenvalerate, chlordecone (kepone), chlorothalonil, and chlorpyrifos. Fenvalerate is a synthetic pyrethroid insecticide used, for example, for mites on chickens. Its chemical name is cyano(3-phenoxyphenyl)methyl 4-chloro-alpha-(1-methylethyl)benzeneacetate. Chlordecone is an insecticide, no longer used, and has the chemical name decachloro-octahydro-1,3,4-metheno-2H-cyclobuta(cd)pentalen-2-one. Chlorothalonil is a fungicide used on tomatoes whose chemical name is 2,4,5,6-tetrachloroisophthalonitrile. Chlorpyrifos is an insecticide with the chemical name O,O-diethyl O-(3,5,6-trichloro-2-pyridyl) phosphorothioate. Chlorpyrifos is the U. S. Food and Drug Administration chromatographic reference standard since numerous specific detectors (electron capture, flame photometric in both sulfur and phosphorus modes, alkali flame, nitrogen phosphorus, and Hall detectors) are sensitive to it.

Datasets A-F were also of fenvalerate and were obtained from an extensive study of fenvalerate residues in chickens and eggs. They show how much variability in data quality can be obtained in practice. Table VII describes the number of calibration levels, replicates at each level, and ranges in ng of amounts injected into the gas chromatograph. Dataset A is an "ideal" set, a set that looked ideal at the time it was recorded. Dataset B is a set of data taken over two days under constant


detector sensitivity. Dataset C is a set of data taken over two days under changing detector sensitivity. Dataset D has values where an artifact compound was present in the same peak as fenvalerate, which altered the areas of smaller peaks. Dataset E has the points containing the artifact removed from the set. Dataset F has a limited range and was still found to be non-linear when log-log transformation was performed.

Gas chromatographic data were obtained on a Tracor Model 220 gas chromatograph equipped with a Varian Model 8000 autosampler. The analysis column was a 1.7 m "U" column, 4 mm i.d., filled with 3% SP-2250 packing (Supelco, Inc., Bellefonte, PA) held at 200° C. The injection temperature was 250° and the nitrogen carrier gas flow rate was 60 mL/min. The detector temperatures were 350° for electron capture and 190° for flame photometric. Detector signals were processed by a Varian Vista 401, which gave retention times and peak areas. The symposium Appendix contains all the raw data sets analyzed in this paper.

General Analytical Plan. A six-step process is described to calculate the amount or concentration values of unknown samples using chromatographic response values and calibration graphs that were constructed by regression. The steps are:

1. Instrumental response values of the standards were transformed according to Tukey's simple family of power transformations (17), described later by Box-Cox (18), to a point where a statistical test of constant variance was accepted. In this work the state of constant variance was tested by the Hartley test (19). Response variances were calculated at each amount level. The H statistic is then found by dividing the maximum variance by the minimum variance, each taken from any level.
2. The amount data corresponding to the response values in 1 above were transformed by the same general family of power transformations until linearity was obtained. The F-test statistic that relates lack of fit and pure error was used as the criterion for linearity.
3. The transformed response values were regressed on the transformed amount values using the simple linear regression model and ordinary least squares estimation. The standard deviation of the response values (about the regression line) was calculated, and plots were formed of the transformed response values and of the residuals versus transformed amounts.
4. The Working-Hotelling confidence band (20) was then constructed around the estimated regression line.
5. Unknown amounts and error limits of those amounts were predicted by the Lieberman, Miller and Hamilton method (21).
6. The predicted transformed amount values and their interval limits were transformed back to their original units.

Though the six-step procedure is complicated, it is easily implemented on a computer. Figure 1 illustrates this process.
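Step 1 of the plan above can be sketched in a few lines: apply a power transformation to the replicate responses at each amount level and compute the H statistic as the ratio of the largest to the smallest variance. The function names and the replicate peak areas are hypothetical, and the convention of using the logarithm for a power of zero is an assumption about the transformation family. The decision to accept constant variance still requires comparing H with a tabulated critical value, which is not computed here.

```python
import math
from statistics import variance


def power_transform(value, power):
    """Simple power transformation; a power of 0 is taken to mean the logarithm."""
    return math.log(value) if power == 0 else value ** power


def hartley_h(responses_by_level, power=1.0):
    """Maximum response variance divided by minimum response variance across levels."""
    variances = [
        variance([power_transform(r, power) for r in replicates])
        for replicates in responses_by_level.values()
    ]
    return max(variances) / min(variances)


# Hypothetical replicate peak areas keyed by amount injected (ng),
# constructed so that the relative scatter is the same at every level.
areas = {
    0.1: [98.0, 102.0, 100.0],
    1.0: [980.0, 1020.0, 1000.0],
    10.0: [9800.0, 10200.0, 10000.0],
}

for power in (1.0, 0.5, 0.0):
    print(power, hartley_h(areas, power))
```

For these made-up data the untransformed responses give a very large H, and H falls toward 1 as the power is reduced to the logarithm, which is the pattern expected when scatter is proportional to response.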

[Figure 1, panels A to C; axis labels: Amount Level (low, high), Transformed Amount]

Figure 1. Plots showing the Calibration Process. A. Response transformation to constant variance: examples showing a. too little, b. appropriate, and c. too much transformation power. B. Amount transformation in conforming to a (linear) model. C. Construction of confidence bands about the regressed line, response error bounds, and intersection of these to determine the estimated amount interval.

Statistical Calibration

In the calibration problem two related quantities, X and Y, are investigated, where Y, the response variable, is relatively easy to measure while X, the amount or concentration variable, is relatively difficult to measure in terms of cost or effort. Furthermore, the measurement error for X is small compared with that of Y. The experimenter observes a calibration set of N pairs of values $(x_i, y_i)$, $i = 1, \ldots, N$, of the quantities X and Y, being the known standard amount or concentration values and the chromatographic response from the known standard. The calibration graph is determined from this set of calibration samples using regression techniques. Additional values of the dependent variable Y, say $y_j^*$, $j = 1, \ldots, M$, where M is arbitrary, are also observed whose corresponding X values, $x_j^*$, are the unknown quantities of interest. The statistical literature on the calibration problem considers the estimation of these unknown values, $x_j^*$, from the observed $y_j^*$, and the equally important aspect of calculating the upper and lower bounds for $x_j^*$. The technique for obtaining interval estimates for X*, discussed in this section, is presented in the paper by Lieberman, Miller, and Hamilton (21) and based on the Bonferroni inequality (22) described below. Other methods are found in the references (23,24).

The simple linear regression model will be assumed throughout this section. That is,

$$y_i = \beta_0 + \beta_1 x_i + e_i$$

where i=*l,...,N, e^ are independent e r r o r s with constant v a r i a n c e , and (x ,y^) are observations from the standard c a l i b r a t i o n sample. Since data from chromatography standards t y p i c a l l y do not s a t i s f y the assumptions of constant variance nor l i n e a r i t y , a procedure described above f o r f i t t i n g a f a m i l y of transformations on the y and w i l l be used. We assume f o r the r e s t of t h i s s e c t i o n that the above model i s s a t i s f i e d f o r the transformed data • Bonferroni I n t e r v a l Estimates. I n t e r v a l estimates f o r the unknown X*, r e f e r r e d to as u n l i m i t e d simultaneous d i s c r i m i n a t i o n i n t e r v a l s ( 2 1 ), are based on the estimated r e g r e s s i o n l i n e of y. on x., and the confidence i n t e r v a l (on the Y-axis) about trie response y * f o r an unknown. The r e s u l t i n g i n t e r v a l estimates have the property that f o r a t l e a s t 1 0 0 ( 1 - a )% of the d i f f e r e n t c a l i b r a t i o n s e t s , a t l e a s t 100P% of the amount i n t e r v a l s estimated from that c a l i b r a t i o n w i l l c o n t a i n true unknown amounts x*. The ( 1 - a ) confidence r e f e r s to the u n c e r t a i n t y inherent

9.

Linear Calibration Graph and Its Confidence Bands

KURTZ ET AL.

139

in the estimation of the calibration line, $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$, and the response variance, $\sigma^2$, whereas the probability P refers to the sampling distribution of the unknown samples $y_j^*$.

The Bonferroni interval estimate of $X^*$, given $Y^*$, is found in three moves. First, the Working-Hotelling confidence band for the regression line

$$E(Y) = \beta_0 + \beta_1 X$$

is obtained (25). Next, the confidence interval (on the Y-axis) for the true value of $Y^*$, say $\mu$, is determined. Lastly, the Bonferroni inequality is invoked to combine the two preceding confidence statements, each made with confidence $(1-\alpha/2)$, to yield an interval estimate for $X^*$ with confidence at least $(1-\alpha)$. The confidence band on the regression line and the confidence interval on $\mu$ are intersected, and the Bonferroni interval estimate of $X^*$ is found by projecting the intersection onto the x-axis. Figure 1c illustrates the procedure. If $\mu$ is in the interval on the Y-axis and if the hyperbolic confidence band contains the line

$$E(Y) = \beta_0 + \beta_1 X$$

then the shaded region must contain the point $(X^*, \beta_0 + \beta_1 X^*)$ and hence $X^*$ must lie in the interval on the X-axis, the interval estimate for $X^*$. With confidence $(1-\alpha/2)$ the Working-Hotelling confidence band contains the true line

$$E(Y) = \beta_0 + \beta_1 X$$

and with $(1-\alpha/2)$ confidence the true mean, $\mu$, of $Y^*$ is contained in the interval about $Y^*$. The Bonferroni inequality guarantees a confidence coefficient of at least $(1-\alpha)$ that both statements are true. The steps in the mathematical construction of these confidence intervals are given below and in Lieberman, Miller, and Hamilton (21) and Hunter (26).
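The intersect-and-project construction described above can be sketched numerically. The following is a minimal illustration only, not the authors' program. It assumes the usual Working-Hotelling multiplier, $W^2 = 2F_{1-\alpha/2;\,2,\,N-2}$, which is the standard form but is not confirmed by the surviving text, and it assumes the $(1-\alpha/2)$ interval for the true mean response of the unknown is supplied by the caller as `y_lo` and `y_hi`. All function names are hypothetical.

```python
import math


def f_quantile_2(p, nu):
    """Quantile of the F(2, nu) distribution; closed form exists for 2 numerator df."""
    return (nu / 2.0) * ((1.0 - p) ** (-2.0 / nu) - 1.0)


def fit_line(x, y):
    """Ordinary least squares fit; returns (b0, b1, s, xbar, sxx)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation
    return b0, b1, s, xbar, sxx


def wh_band(x0, fit, n, alpha):
    """Working-Hotelling band at x0 with confidence 1 - alpha/2 (assumed form of W)."""
    b0, b1, s, xbar, sxx = fit
    w = math.sqrt(2.0 * f_quantile_2(1.0 - alpha / 2.0, n - 2))
    half = w * s * math.sqrt(1.0 / n + (x0 - xbar) ** 2 / sxx)
    centre = b0 + b1 * x0
    return centre - half, centre + half


def discrimination_interval(y_lo, y_hi, fit, n, alpha, x_min, x_max, steps=20000):
    """Project the overlap of the band and the strip [y_lo, y_hi] onto the x-axis.

    A plain grid scan over [x_min, x_max]; returns None if there is no overlap.
    """
    inside = []
    for k in range(steps + 1):
        x0 = x_min + (x_max - x_min) * k / steps
        lo, hi = wh_band(x0, fit, n, alpha)
        if lo <= y_hi and hi >= y_lo:
            inside.append(x0)
    return (inside[0], inside[-1]) if inside else None


if __name__ == "__main__":
    # Made-up calibration data, for illustration only.
    xs = [1, 2, 3, 4, 5]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    fit = fit_line(xs, ys)
    print(discrimination_interval(5.8, 6.2, fit, len(xs), 0.05, 0.0, 6.0))
```

The grid scan is chosen for transparency rather than speed. When the slope is well determined the overlap region is a single interval, so reporting its first and last grid points is adequate; a poorly determined slope can give an unbounded or disconnected region, which this sketch does not handle.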

Move 1. First, the Working-Hotelling confidence band for the true regression line

$$E(Y) = \beta_0 + \beta_1 X$$

is obtained with confidence coefficient $(1-\alpha/2)$. For any X, the bounds are

140

TRACE RESIDUE ANALYSIS

$$\hat{Y} \pm W\,s(\hat{Y})$$

where $W^2$ ...

... from the statistical analysis of this data set were used as plotting coordinates for each sample (Table IV). The data were displayed and rotated about the axes in 3-D using a Dazzler TV graphics board (Cromemco, Inc., Mountain View, CA). Following this display of the data, the coordinates for each sample were used to generate a 3-D plot with a Texas Instruments plotter driven by a MUMPS program in the laboratory data base. Two graphics viewpoints were selected that allowed us to discern the three clusters of Aroclors as lines (Figure 11). The single Aroclor 1254 sample (point "3") that was composed of data from an alternate data set is readily observed in these 3-D plots as not being similar to any of the other sample types.

Classification. To illustrate the use of SIMCA in classification problems, we applied the method to the data for 23 samples of Aroclors and their mixtures (samples 1-23 in Appendix I). In this example, the Aroclor content of the three samples of transformer oil was unknown. Samples 1-4, 5-8, 9-12, and 13-16 were Aroclors 1242, 1248, 1254, and 1260, respectively. Samples 17-20 were 1:1:1:1 mixtures of the Aroclors. Application of SIMCA to these data generated a principal components score plot (Figure 12) that shows the transformer oil is similar, but not

Isomer Specific Analysis of PCBs

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch012

STALLING ET AL.

Figure 10. Principal Components Plot (Theta 1 vs Theta 2) from Aroclor Classes (Table IX). Axis label: Theta 1 (first 69 of 105 peaks).

Figure 11. 3-D Views of Theta Values Derived from PC Analysis of Aroclors (axes T1, T2, T3). Key: * = Aroclor 1248; 6 = 1:1:1:1 Mixture; # = Aroclor 1254 (69/105 pks); 2 = Aroclor 1248 (test); and 3 = Aroclor 1254.

Figure 12. Principal Components Plot from Five Aroclor Classes and a Used Transformer Fluid (most similar to Aroclor 1260). Horizontal axis: Theta 1. Labeled classes: 1242, 1248, 1254, 1260, 1:1:1:1 mixture, and used transformer oil.

identical, to Aroclor 1260. A more detailed discussion of classification using these data is presented by Dunn et al. (39).

Table IV. Theta Values from a Three-Component Principal Components Class Model (A=3) and Total Measured Concentration of PCBs.

Aroclor           Sample #   Theta 1    Theta 2    Theta 3    Conc.
1248              1          -0.180      0.11       0.227      6
1254              2          -0.813     -0.003     -1.39       6

1248              3           0.524      0.621      0.21       0.76
                  4           0.351      0.541      0.216      1.58
                  5           0.352      0.541      0.211      1.61
                  6          -0.0173     0.369      0.211      3.46
                  7          -0.0641     0.352      0.225      5.63
                  8          -0.645      0.077      0.22       6.56
                  10         -1.32      -0.246      0.231      9.85
                  11         -1.29      -0.236      0.231      9.74
                  9          -1.52      -0.334      0.245     10.9
                  12         -2.07      -0.592      0.240     13.7

1:1:1:1 mixture   19          0.486      0.570      0.048      1.48
                  14          0.442      0.546      0.038      1.76
                  13          0.255      0.418     -0.075      3.22
                  18          0.264      0.430     -0.089      3.24
                  17          0.167      0.370     -0.14       3.89
                  15          0.0237     0.273     -0.253      5.11
                  20         -0.587     -0.099     -0.629      9.7
                  16         -0.603     -0.122     -0.67       9.99

1254 (1)          24          0.721      0.513      0.173      0.80
                  26          0.733      0.41       0.159      1.38
                  25          0.839     -0.139      0.126      3.82
                  23          0.866     -0.353      0.112      4.76
                  27          1.08      -1.64       0.000     10.7

(1) Data set obtained from the first 69 of 105 isomers quantitated.

Prediction of Composition of Unknown Samples by PLS.

Because many samples are analyzed in which the analyst is interested in determining which Aroclor mixtures are present, we applied the PLS method to the data obtained from the analysis of the used transformer fluid previously discussed. In order to estimate the Aroclor content of the previously classified transformer oil samples, we obtained additional data from the analysis of Aroclors of varying proportions. In Appendix I, the data are ordered in an array, and the first four variables designate the fractional part of each Aroclor composing the sample in the order 1242, 1248, 1254, and 1260. These composition data represent the Y-block and the 69 peaks represent the X-block of data analyzed with the PLS-2 program.
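The block layout just described can be made concrete with a short sketch. It assumes each sample is stored as a flat row of 74 values in the order given in Appendix I (four Aroclor composition variables, 69 peak variables, and the total PCB concentration, which was excluded from the analysis). The function names `split_row` and `ratio_to_fractions` are hypothetical and are not part of SIMCA or PLS-2.

```python
def split_row(row):
    """Split one 74-variable sample row into (y_block, x_block, total_conc)."""
    if len(row) != 74:
        raise ValueError("expected 74 variables per sample")
    y_block = row[0:4]    # variables 1-4: Aroclors 1242, 1248, 1254, 1260
    x_block = row[4:73]   # variables 5-73: the 69 isomer peaks
    total = row[73]       # variable 74: total PCB concentration (not modeled)
    return y_block, x_block, total


def ratio_to_fractions(ratio):
    """Convert a mixing ratio such as '1:1:0:1' into fractional composition."""
    parts = [float(p) for p in ratio.split(":")]
    total = sum(parts)
    return [p / total for p in parts]


if __name__ == "__main__":
    # A 1:1:0:1 mixture of Aroclors 1242:1248:1254:1260.
    print(ratio_to_fractions("1:1:0:1"))
```

Keeping the total concentration out of both blocks mirrors the statement that variable 74 was not included in the analysis; the PLS fit itself is not reproduced here.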

Table V. Statistical Summary for A=3 Principal Components SIMCA Analysis of Aroclor Samples.

Step   Par.     A   NDF    SS         SD         1-SD/SDY   SS(TETA)   ITET
0               0   1862   4.36E+03   1.53E+00   0.000      0.
1      Alpha    0   1794   2.25E+03   1.12E+00   0.268      0.         0
2      Beta-    1   1700   5.83E+02   5.86E-01   0.617      1.7E+03    14
3      Beta-    2   1608   1.36E+02   2.91E-01   0.810      4.5E+02    9
4      Beta-    3   1518   5.24E+01   1.86E-01   0.879      8.4E+01    17

The samples of unknown composition were samples 21-23; samples 1-20 and 24-34 (Appendix I) were those of Aroclors of variable composition. Variables 5-73 are isomer concentrations (Variable 74, the total PCB concentration in ppm, was not included in the analysis). Variables 5-73 represent the fractional composition or isomer proportional concentration values. Representative concentration histograms of the data set are presented in Figure 13. Four PLS components were extracted and then used to estimate the Aroclor content of the unknowns and of a standard sample (No. 24). The Aroclor standard is a mixture of three Aroclors in the ratio of 0.33:0.33:0:0.33. Chromatograms of the samples for which the PLS estimates were made (Table VI) were similar when compared to a chromatogram of a similar mixture of standards. The partial least squares (PLS) method has been applied to structure-activity problems by Wold et al. (38). Recently, Lindberg et al. (40) employed this approach to resolve mixtures of humic acid and lignin sulfonate on the basis of fluorescence spectra. This example demonstrates that the PLS method gives a stable estimate of the Y-block, even though there are many more X-variables than samples, a condition that removes the possibility of applying multiple regression. Another advantage of the method

Figure 13. Fractional Composition Histograms of Used Transformer Fluid and Aroclors. Panels: Aroclor 1254 (#9), 2.903 ng; Aroclor 1254 + 1260, 3.044 ng; Aroclor 1260 (#13), 3.185 ng; transformer oil (#21). Horizontal axis: peak number.

that makes it attractive for use in analytical problems is its computational efficiency and simplicity, which makes it possible to use microcomputers and minicomputers to carry out such calculations.

The SIMCA software is available in two forms, both developed by Wold (25, 31): 1) an interactive Fortran version, which runs on Control Data Corporation (CDC) machines, and 2) an interactive version, SIMCA-3B. Additional information on these programs is contained in Appendix I. Only the SIMCA-3B version contains the CPLS-2 program used for PLS analysis.

Table VI. Fractional Composition of Aroclors in Transformer Oils Estimated by Partial Least Squares.

                         Aroclor (1)
Sample number   1242        1248        1254        1260
21              .03         .03         .08         .84
22              .03         .03         .08         .84
23              .03         .03         .08         .84
25              .37 (.33)   .36 (.33)   .05 (.00)   .24 (.33)

(1) Actual composition is given in parentheses.

Environmental Applications. To illustrate the environmental application of the SIMCA method, we examined a set of isomer-specific analyses of sediment samples. The data examined were derived from more than 200 sediment samples taken from a study site on the Upper Mississippi River (41). These analytical data were transferred via magnetic tape from the laboratory data base to the Cyber 175 computer, where principal component analyses were conducted on the isomer concentration data (µg/g each isomer). The first principal component values (Theta 1) for each sample were determined, and these values were correlated with the total PCB concentration (Figure 14) recorded for each sample in a separate computer data base that contained other environmental data such as hydrology and sediment texture. The results indicated that certain samples deviated by factors of about two. Upon examining the sample records, the recorded dilution values

Figure 15. Plot of Theta 2 vs. Total PCB Concentration Measured in 201 Lake Onalaska Sediment Samples. Horizontal axis: total PCB concentration, ppm.

Appendix I.

This appendix contains some of the data generated for the SIMCA and PLS analyses. The complete data set is available from the authors. Upon request the data will be provided on 8" single-density, single-sided floppy disks in IBM 3740 format for CP/M-based systems or on 5 1/4" double-sided, double-density floppy disks for the IBM PC or other MS-DOS-based computers. The requestor, however, must supply a properly formatted floppy disk.

Software Availability. The SIMCA software is available in two forms, both developed by Wold (25): 1) an interactive Fortran version, which runs on Control Data Corporation (CDC) machines, and 2) an interactive microcomputer version, SIMCA-3B, which is available from Principal Data Components, 2505 Shepard Blvd., Columbia, MO 65201. The SIMCA-3B pattern recognition programs include the CPLS-2 program used for PLS analysis and are available for CP/M (Digital Research, Pacific Grove, CA) and MS-DOS (Microsoft Corporation, Bellevue, WA) on 8088- or 8086-based microcomputers. The Fortran version used in this study was located at the Computer Center at the University of Illinois at Champaign/Urbana. The Fortran version is useful for analysis of very large data sets, i.e., 400 x 70 matrices. The SIMCA-3B version for microcomputer systems is interactive and menu driven, is applicable to intermediate-sized data sets, and runs under CP/M or MS-DOS.

In this study, the SIMCA-3B program CPLS-2 was used to obtain the results in the PLS examples discussed. An earlier Fortran version of SIMCA is available for use in the ARTHUR package available from Chemical Information Systems, Box 2227, Falls Church, VA. Recently, the operating system was changed on the CDC Cyber computer system at the University of Illinois. The new operating system does not allow the earlier SIMCA-2T version, used to perform the environmental analyses, to operate correctly. The authors expect that a new version of SIMCA will be installed that will function with the current operating system in use on the CDC Cyber computer.

Partial Summary of Data from the Gas Chromatographic Analysis of Aroclors, Aroclor Mixtures, and Transformer Oil.

Table I. Identity of Samples Analyzed: Aroclor 1242, 1248, 1254, 1260, Their Mixtures and a Transformer Oil. Data are included in this appendix for sample numbers designated with an asterisk.

SIMCA ID              Description
1*, 2, 3, 4           Aroclor 1242 Replicate
5*, 6, 7, 8           Aroclor 1248 Replicate
9*, 10, 11, 12        Aroclor 1254 Replicate
13*, 14, 15, 16       Aroclor 1260 Replicate
17*, 18*, 19*, 20*    Aroclor 42:48:54:60   1:1:1:1
21*, 22*, 23*         Used Transformer Oil Replicate
24                    Aroclor 42:48:54:60   1:1:0:1
25                    Aroclor 42:48:54:60   1:0:1:1
26                    Aroclor 42:48:54:60   0:1:1:1
27                    Aroclor 42:48:54:60   1:1:0:0
28                    Aroclor 42:48:54:60   1:0:1:0
29                    Aroclor 42:48:54:60   1:0:0:1
30                    Aroclor 42:48:54:60   0:1:1:0
31                    Aroclor 42:48:54:60   0:0:1:1
32                    Aroclor 42:48:54:60   0:1:0:1
33                    Aroclor 42:48:54:60   1:1:1:1

Table II. Data Matrix Organization for Aroclors and Samples.

Each sample record begins with the sample number and ID code, followed by the data matrix variables printed five to a row:

Data Matrix - Variable #
1 (1242)      2 (1248)      3 (1254)       4 (1260)            5 (peak 1)
6 (peak 2)    7 (peak 3)    8 (peak 4)     9                   10
11            12            13             14                  15
...
66            67            68             69                  70
71            72            73 (peak 69)   74 (Total conc.)

Variables 1-4 are the weight fraction of each Aroclor in the sample.
Variables 5-73 are the fractional concentration of each PCB isomer.
Variable 74 designates the total PCB concentration in the sample.

Table III. Representative Analyses of Aroclors, Their Mixtures and a Transformer Oil Sample. Refer to Table II for key to data organization.

100 0 .8347 .07662 .2134 .08603 .04002 .1051 .01593 0 0 0 0 0 0

Sample # ID code 1 1 42 0 .03722 .3023 .07048 .3437 .02791 .0355 .09709 .02423 0 0 0 0 0 0

0 .1324 .3691 .02987 .3375 .07272 .1661 0 .02633 0 0 0 0 0 0

0 .1501 .1762 .5528 .3169 .1301 .08603 .03421 0 .01582 0 0 0 0 7.511

.3318 .6295 0 1.14 .09121 .1613 .2138 .02422 0 .01318 0 0 0 0


0 0 .3688 .1108 .09049 .1164 .1572 .1657 .06115 0 0 0 .0009057 0 0 0 0 0 0 0 .001386 .2877 .003889 .07539 .1595 .1752 .005103 .02406 .001136 0 0 0 0 0 0 0 .1134 0 0 .3606 .5124 .2944 .01198 .01951 0

5 5 48 100 0 .09205 .08407 .6112 .04218 .1294 .1605 .09849 0 0 0 0 0 6.426E-05 9 9 54 0 0 0 0 .2006 0 .1281 .03496 .1657 .07247 .02599 .002069 0 0 0 13 13 60 0 0 0 0 0 0 0 .0004755 .006675 .07934 .01567 .08182 .02128 .02727 .04784

0 0 .1105 .03458 .4933 .1092 .3208 0 .1533 0 0 0 0 1.463E-05 0

0 .02243 .07285 .3342 .5239 .2177 .1477 .1479 0 .09543 0 0 0 0 6.257

.02185 .04143 0 .4488 .1222 .05379 .3446 .09148 0 .05943 0 0 .0003501 0

100 0 0 0 .03684 0 .06263 .03043 .3222 0 .1777 .001326 .007326 0 0

0 0 0 0 .07303 .01534 .013 97 .3352 .04839 .1939 .01992 0 .0102 0 2.903

0 0 0 0 0 0 .01731 .09876 .007758 .06609 0 .002222 0 0

0 0 0 0 0 0 0 .06472 .02628 .07441 .2802 .04352 .3146 .03142 .009139

100 0 0 0 0 0 0 .1251 .1837 .0009664 .01874 0 .1039 .02323 3.185

0 0 0 0 0 .02807 0 0 .02169 0 .1711 .006068 .06354 .002931


25 0 .2019 .03203 .04948 .03584 .1765 .04801 .04109 .1798 .2308 .1034 .01464 .007217 0 25 0 .1923 .02852 .04677 .03311 .1754 .04581 .03923 .1755 .2197 .09693 .01202 .006168 0 25 0 .1979 .02768 .04811 .03671 .1798 .04658 .04075 .1816 .2255 .09807 .01253 .006722 0

17 17 M4 25 .007108 .06352 .02606 .2394 .01305 .07449 .05987 .08646 .05681 .01501 .03089 .007318 .009673 .01765 18 18 M4 25 .009318 .06195 .0246 .2254 .02247 .06777 .0564 .08372 .05492 .01502 .02829 .006532 .008577 .01566 19 19 M4 25 .00758 .06332 .02566 .2343 .02365 .07212 .05797 .08714 .05516 .01544 .02874 .00762 .009386 .0158

25 .02049 .0797 .01205 .156 .02929 .09968 .03558 .1563 .02459 .1661 .01593 .103 .01155 .002694

25 .01411 .04155 .1385 .1621 .06368 .04474 .1951 .08263 .09232 .01465 0 .04175 .007575 4.385

.05351 .1062 0 .2308 .03699 .03058 .09625 .05449 .01062 .03886 .06061 .002988 .02351 .0004781

25 .02033 .07673 .01163 .1456 .03049 .0975 .03511 .1494 .02173 .1514 .01499 .09292 .01005 .002565

25 .01366 .037 57 .1362 .1597 .06241 .04221 .1943 .07 889 .0869 .01265 0 .03756 .007134 4.186

.0582 .1012 0 .2158 .03718 .02041 .09346 .05288 .01055 .03402 .05624 .002718 .02181 0

25 .01591 .07692 .01181 .1521 .03228 .09911 .03613 .1496 .02203 .1564 .01439 .09556 .01079 .001706

25 0 .03998 .139 .1627 .06273 .04306 .198 .08243 .09154 .0134 0 .03932 .006948 4.22

0 .1035 0 .2205 .0366 .01992 .0963 .05478 .01029 .03546 .05881 .002419 .02241 0


Table I I I . Continued 20 20 M4 25 25 0 0 .1888 .05618 .029 .02383 .04563 .2313 .03259 .0219 .1742 .07348 .04368 .05558 .03871 .08279 .1776 .05508 .2167 .0147 .09493 .0272 .01184 .007262 .006812 .008346 0 .01538 21 21 TO 0 0 0 0 21.21 9.022 0 0 9.456 29.72 5.23 0 67.07 11.2 16.32 19.48 7.393 2.339 169.9 41.68 259.6 10.67 141.5 39.53 9.323 8.306 10.12 18.16 0 27.11 022 22 TO 0 0 0. 0 18.84 6.861 0 0 9.354 28.37 4.054 0 68.15 11.62 15.17 17.98 6.871 2.331 167.6 42.73 258.6 10.65 140.7 39.08 8.846 8.124 10.08 17.99 0 26.88


25 .01953 .07373 .01114 .1486 .03021 .09438 .03355 .1486 .02429 .1513 .01298 .09327 .01053 .001 846

25 .02258 .03659 .1375 .159 .06214 .04086 .1881 .07911 .08819 .01244 0 .03842 .006725 4.101

0 .1012 0 .2149 .03436 .0217 .09305 .05164 .008697 .03461 .05699 .002228 .02261 0

0 0 7.668 0 18.87 3.79 13.59 25.79 36.12 17.97 150.3 20.73 156.3 20.98 6.732

0 0 4.709 22.01 16.88 6.906 0 90.13 83.84 19.92 11.58 9.535 62.34 10.67 1962

0 9.761 0 37.65 0 0 13.89 10.21 9.548 7.416 82.64 0 39.39 0

0 0 6.402 0 16.44 0 13.79 24.69 36.60 18.34 150.1 20.85 156.7 20.90 6.672

0 0 4.596 20.97 16.38 7.090 0 89.59 82.98 18.26 11.48 9.035 61.77 10.63 1929.

0 7.651 0 36.33 0 0 13.65 8.900 9.401 69.10 81.87 0 39.50 0


232 Table I I I . Continued 023 23 TO 0 0 0 0 8.285 20.07 0 0 27.98 8.918 0 0 10.25 63.37 19.00 15.78 2.185 0 39.50 161.7 9.602 244.7 36.90 131.9 7.868 8.368 16.73 9.241 24.92 0


0 0 5.973 0 15.53 0 13.64 24.82 37.18 17.65 141.9 19.59 146.8 19.50 6.579

0 0 4.554 21.05 15.96 6.221 0 86.27 79.38 20.34 10.87 8.435 56.95 9.806 1835

0 9.198 0 33.38 4.408 0 12.98 8.967 9.282 6.783 77.59 0 36.46 0

Literature Cited

1. Ballschmiter, K.; Zell, M. Fresenius Z. Anal. Chem. 1980, 302, 20.
2. Albro, P. W.; Corbett, J. T.; Schroeder, J. L. J. Chromatogr. 1981, 205, 103.
3. Bush, B.; Connor, S.; Snow, J. J. Assoc. Off. Anal. Chem. 1982, 65, 555.
4. Hutzinger, O.; Safe, S.; Zitko, V. In "The Chemistry of PCB's"; CRC Press: Cleveland, OH, 1974.
5. Brinkman, U. A. Th.; de Kok, A. In "Topics in Environmental Health, Halogenated Biphenyls, Terphenyls, Naphthalenes, Dibenzodioxins and Related Products"; Kimbrough, R. D., Ed.; Elsevier/North Holland Biomedical Press: New York, 1980; 2-4.
6. Jensen, S. New Sci. 1966, 32, 612.
7. Zell, M.; Ballschmiter, K. Fresenius Z. Anal. Chem. 1980, 304, 337.
8. Koeman, J. H.; Debrouw, M. C.; De Vos, R. H. Nature (Lond.) 1969, 221, 1126.
9. Biros, F. J.; Walker, A. C.; Medbery, A. Bull. Environ. Contam. Toxicol. 1970, 5, 317.
10. Fishbein, L. J. Chromatogr. 1972, 68, 345.
11. Hammond, P. B.; Nisbet, I. C. T.; Sarofim, A. F.; Drurry, W. H.; Nelson, W.; Rall, D. P. Environmental Impact. Environ. Res. 1972, 5, 249.
12. Bush, B.; Tumasonis, C. F.; Baker, F. D. Arch. Environ. Contam. Toxicol. 1982, 28, 97.
13. Mes, J.; Davies, D. J.; Turton, D. Bull. Environ. Contam. Toxicol. 1982, 28, 97.
14. Bandiera, S.; Sawyer, T.; Campbell, M. A.; Robertson, L. W.; Safe, S. Life Sciences 1982, 31, 517.
15. Safe, S.; Robertson, L. W.; Safe, L.; Parkinson, A.; Bandiera, S.; Sawyer, T.; Campbell, M. A. Can. J. Pharmacol. 1982, 60, 1057.
16. Rappe, C.; Buser, H.-R. In "Topics in Environmental Health, Halogenated Biphenyls, Terphenyls, Naphthalenes, Dibenzodioxins and Related Products"; Kimbrough, R. D., Ed.; Elsevier/North Holland Biomedical Press: New York, 1980; Chap. 2.
17. Poland, A.; Glover, E. Mol. Pharmacol. 1977, 13, 924.
18. Trotter, ; Young, W. J.; Casterline, S. J. V., Jr.; Bradlaw, J. L.; Kamps, L. R. J. Assoc. Off. Anal. Chem. 1982, 65, 838.
19. Webb, R. G.; McCall, A. C. J. Chromatogr. Sci. 1973, 11, 366.
20. Environmental Protection Agency, Washington, D.C., Method 625, Fed. Reg. 1979, 44, 69540.
21. Duinker, J. C.; Hillebrand, J. I. J.; Palmark, K. H.; Wilhemsen, S. Bull. Environ. Contam. Toxicol. 1980, 25, 956.
22. Zell, M.; Ballschmiter, K. Fresenius Z. Anal. Chem. 1980, 304, 337.
23. Bopp, R. F.; Simpson, J.; Olsen, C. R.; Kostyk, N. Environ. Sci. Technol. 1981, 15, 210.
24. Kowalski, B. R. Anal. Chem. 1980, 52, 112R.
25. Wold, S. Kemia-Kemi 1982, 9, 401.
26. Schwartz, T. R.; Stalling, D. L.; Petty, J. D.; Hogan, J. W.; Marlow, B. K.; Campbell, R. D.; Little, R. L. 184th National Meeting of the American Chemical Society, Environmental 1982, Papers 20, 21, Kansas City, MO.
27. Schwartz, T. R. M.S. Thesis, University of Missouri-Columbia, 1982.
28. Albro, P. W.; Fishbein, L. J. Chromatogr. 1972, 69, 273.
29. Ugawa, M.; Nakamura, A.; Kashimota, C. In "New Methods in Environmental Chemistry and Toxicology"; Proceedings of the International Symposium, Susonso, Japan; International Academic Printing Co.: Tokyo, Japan, 1973.
30. Wold, S. Pattern Recognition 1976, 8, 127.
31. Wold, S.; Albano, C.; Dunn, W. D., III; Esbensen, E.; Helberg, S.; Johansson, E.; Sjostrom, M. "Pattern Recognition: Finding and Using Regularities in Multivariate Data." In "Food Research and Data Analysis"; Martens, H.; Russwurm, H., Jr., Eds.; Applied Science Publishers: New York, 1983; pp. 147-188.
32. Wold, S.; Sjostrom, M. In ACS Symposium Series No. 52; American Chemical Society: Washington, D.C., 1977; p. 243.
33. Joreskog, K. G.; Klovan, J. E.; Reyment, R. A. "Geological Factor Analysis"; Elsevier: Amsterdam, 1976.
34. Wold, S. Technometrics 1978, 20, 397.
35. Kowalski, B. J. Amer. Chem. Soc. 1973, 95, 686.
36. Massart, D. L.; Dijkstra, A.; Kaufman, L. In "Evaluation and Optimization of Laboratory Methods and Analytical Procedures"; Elsevier: Amsterdam, 1978.
37. Wold, H. "Soft Modeling by Latent Variables: the Nonlinear Iterative Partial Least Squares Approach." In "Perspectives in Probability and Statistics - Papers in Honor of M. S. Bartlett"; Gani, J., Ed.; Academic Press: London, 1975; pp. 117-142.
38. Wold, S.; Dunn, W. J., III. J. Chem. Inf. Comput. Sci. 1983, 23, 6.
39. Dunn, W. J., III; Stalling, D. L.; Schwartz, T. R.; Hogan, J. W.; Petty, J. D.; Johansson, E.; Wold, S. "Pattern Recognition for Classification and Determination of Polychlorinated Biphenyls in Environmental Samples." Anal. Chem. 1984, in press.
40. Lindberg, W.; Persson, J. A.; Wold, S. Anal. Chem. 1983, 55, 643.
41. Dexter, R. N.; Pavlou, S. P.; Hines, W. G.; Anderson, C. "Dynamics of Polychlorinated Biphenyls in the Upper Mississippi River: Final Report. Phase I, Task 1: Compilation of Information"; U.S. Fish and Wildlife Service, Columbia, MO, 1978; Contract No. 14-16-009 78-026.

RECEIVED March 25, 1985

13 From Data to Information to Knowledge The Problems of Metamorphosis

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch013

C. ZERVOS Pharmaceutical Research and Testing, National Center for Drugs and Biologics, U.S. Food and Drug Administration, Washington, DC 20204

Rational control of health and environmental risks from technical development requires scientific knowledge which must be acquired through the orderly process of the scientific method of inquiry. Contrary to widely held opinions the latter is no less subjective than other rational human endeavors which require decisions under uncertainty. Indeed, to be applied, the method requires a value system which in ordinary research is supplied by the various scientific disciplines. Because of differences among the disciplinary value systems problems often arise in the interdisciplinary settings of efforts to control risks from technical development. Metrics, the concepts, theory, and practice of measurement is suggested here as a way to deal with such problems. The terms d a t a , i n f o r m a t i o n , and knowledge a r e o f t e n used i n t e r c h a n g e a b l y f o r d i v e r s e purposes by r e s e a r c h e r s i n a l l s c i e n t i f i c d i s c i p l i n e s . In t h e s c i e n t i f i c e n t e r p r i s e , however, they a r e n o t i n t e r c h a n g e a b l e , d e s p i t e arguments t o t h e c o n t r a r y . As a matter o f f a c t , Chemometrlcs, t h e s u b j e c t o f t h i s Symposium, may s p r i n g a t l e a s t i n p a r t from t h e r e a l This chapter not subject to U.S. copyright. Published 1985, American Chemical Society

236

TRACE RESIDUE ANALYSIS

d i f f e r e n c e s among t h e s e terms. The I n t e r n a t i o n a l Chemometrlcs S o c i e t y , f o r i n s t a n c e , d e c l a r e s t h a t :

Publication Date: July 15, 1985 | doi: 10.1021/bk-1985-0284.ch013

"Chemometrlcs i s t h e c h e m i c a l d i s c i p l i n e t h a t uses mathematical and s t a t i s t i c a l methods (a) t o d e s i g n and s e l e c t o p t i m a l measurement p r o c e d u r e s and e x p e r i m e n t s , and (b) t o p r o v i d e maximum c h e m i c a l i n f o r m a t i o n by a n a l y z i n g c h e m i c a l data."U) The d i f f e r e n c e s among t h e t h r e e terms a r e not j u s t o f t h e o r e t i c a l i n t e r e s t . They touch n e a r l y e v e r y a s p e c t o f our d a i l y l i v e s because they a r e c e n t r a l t o the f u n c t i o n s of the agencies t h a t p r o t e c t the p u b l i c h e a l t h and t h e environment. S p e c i f i c a l l y , t o be c r e d i b l e , t h e s e a g e n c i e s must base t h e i r a c t i o n s on " a c c e p t e d " s c i e n t i f i c knowledge. Consequently, they c o l l e c t enormous amounts o f e x p e r i m e n t a l d a t a . These d a t a , however, a r e o f l i t t l e use u n t i l they a r e f i r s t c o n v e r t e d t o s c i e n t i f i c i n f o r m a t i o n and then p l a c e d i n the context of other r e l e v a n t s c i e n t i f i c i n f o r m a t i o n and t h e r e b y become knowledge. The s t a n d a r d s f o r c o n v e r t i n g d a t a t o i n f o r m a t i o n a r e s h o r t - c u t o r "economy" s o l u t i o n s t o t h e u n i v e r s a l problem o f h a v i n g t o d e c i d e under u n c e r t a i n t y ; t h e s e s t a n d a r d s a r e based on c o n v e n t i o n , not on s c i e n c e . They v a r y from d i s c i p l i n e t o d i s c i p l i n e and from time to t i m e . 
As might be expected, although appropriate in the context of their development, such standards are often likely to be incomplete or otherwise inappropriate for universal application because they are value-laden rules for making choices (vide infra). Through use, however, they become valuable to those who use them. Often these standards also become a cause of contention when, in interdisciplinary settings, practical knowledge must be extracted from experimental data. Thus, controversies often arise when scientists trained in different disciplines influence public policies or make decisions based on the conversion standards in which they were trained. Here I will examine at some length the problems with one such standard, namely, the odds for deciding "gating" hypotheses (vide infra) in the life science related disciplines. This standard plays a pivotal role in the assessment or management of technological risks and thus is at the root of many controversies of the genre. I will also suggest that this and similar problems can be overcome by renewed emphasis on the proper use of the scientific method of inquiry and by focusing attention on Metrics, i.e., the concepts, the theory and the practice of measurement.

The following examination of these problems will include a brief description of the scientific method of inquiry; an analysis of its value foundations; a description of representative examples of subjective choices in science; an analysis of the clash of values during interdisciplinary investigations of societally important topics; and a recommendation to develop and expand the uses of Metrics to overcome the difficulties of making decisions under uncertainty.


The Scientific Method of Inquiry: An Overview.

Different authors describe the scientific method of inquiry differently depending on what they wish to emphasize