Computers in the Laboratory. Current Practice and Future Trends 9780841208674, 9780841210936, 0-8412-0867-0

Content: Planning an approach to laboratory automation / Joseph G. Liscouski -- Robots and robotics in the laboratory :


English Pages 127 Year 1984


Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.fw001

Computers in the Laboratory Current Practice and Future Trends

In Computers in the Laboratory; Liscouski, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1984.


ACS SYMPOSIUM SERIES

265

Computers in the Laboratory: Current Practice and Future Trends

Joseph G. Liscouski, EDITOR Digital Equipment Corporation

Based on a symposium sponsored by the Division of Computers in Chemistry at the 186th Meeting of the American Chemical Society, Washington, D.C., August 28-September 2, 1983

American Chemical Society, Washington, D.C. 1984


Library of Congress Cataloging in Publication Data Computers in the laboratory. (ACS symposium series, ISSN 0097-6156; 265) "Based on a symposium sponsored by the Division of Computers in Chemistry at the 186th Meeting of the American Chemical Society, Washington, D.C., August 28-September 2, 1983." Bibliography: p. Includes index. 1. Chemical laboratories—Data processing— Congresses. 2. Chemical laboratories—Automation— Congresses. 3. Chemistry, Analytic—Data processing— Congresses. I. Liscouski, Joseph G., 1945. II. American Chemical Society. Division of Computers in Chemistry. III. Series. QD51.C65 1984 ISBN 0-8412-0867-0

542'.028'54

84-18518

Copyright © 1984 American Chemical Society. All Rights Reserved. The appearance of the code at the bottom of the first page of each chapter in this volume indicates the copyright owner's consent that reprographic copies of the chapter may be made for personal or internal use or for the personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc., 21 Congress Street, Salem, MA 01970, for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to copying or transmission by any means - graphic or electronic - for any other purpose, such as for general distribution, for advertising or promotional purposes, for creating a new collective work, for resale, or for information storage and retrieval systems. The copying fee for each chapter is indicated in the code at the bottom of the first page of the chapter.

The citation of trade names and/or names of manufacturers in this publication is not to be construed as an endorsement or as approval by ACS of the commercial products or services referenced herein; nor should the mere reference herein to any drawing, specification, chemical process, or other data be regarded as a license or as a conveyance of any right or permission, to the holder, reader, or any other person or corporation, to manufacture, reproduce, use, or sell any patented invention or copyrighted work that may in any way be related thereto. Registered names, trademarks, etc., used in this publication, even without specific indication thereof, are not to be considered unprotected by law.

PRINTED IN THE UNITED STATES OF AMERICA

ACS Symposium Series

M. Joan Comstock, Series Editor

Advisory Board

Robert Baker, U.S. Geological Survey

Geoffrey D. Parfitt, Carnegie-Mellon University

Martin L. Gorbaty, Exxon Research and Engineering Co.

Theodore Provder, Glidden Coatings and Resins

Herbert D. Kaesz, University of California, Los Angeles

James C. Randall, Phillips Petroleum Company

Rudolph J. Marcus, Office of Naval Research

Charles N. Satterfield, Massachusetts Institute of Technology

Marvin Margoshes, Technicon Instruments Corporation

Dennis Schuetzle, Ford Motor Company Research Laboratory

Donald E. Moreland, USDA, Agricultural Research Service

W. H. Norton, J. T. Baker Chemical Company

Robert Ory, USDA, Southern Regional Research Center

Davis L. Temple, Jr., Mead Johnson

Charles S. Tuesday, General Motors Research Laboratory

C. Grant Willson, IBM Research Department


FOREWORD

The ACS SYMPOSIUM SERIES was founded in 1974 to provide a medium for publishing symposia quickly in book form. The format of the Series parallels that of the continuing ADVANCES IN CHEMISTRY SERIES except that in order to save time the papers are not typeset but are reproduced as they are submitted by the authors in camera-ready form. Papers are reviewed under the supervision of the Editors with the assistance of the Series Advisory Board and are selected to maintain the integrity of the symposia; however, verbatim reproductions of previously published papers are not accepted. Both reviews and reports of research are acceptable since symposia may embrace both types of presentation.


Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.pr001

PREFACE

A laboratory that took advantage of all of the forms of automation available today would be quite a place. Many routine sample preparation tasks would be done by robots that would introduce the samples to instruments for analysis. The data generated would be taken by a computer system, analyzed, reported, and then stored for later retrieval and more detailed analysis. Each chemist, technician, secretary, and manager might have his own work station at his desk with communication between fellow workers and with larger machines for data analysis and data management. Working with data would be considerably easier because of graphics displays that would make the information easier to extract and understand.

Sound far-fetched? In this volume, much of what I just described is covered. The intent of this collection is to give an idea of the breadth of computer usage in chemistry and the resultant gains that can be achieved.

I thank all who have contributed to this volume, and in particular Gerst Gibbon (Pittsburgh Energy Technology Center) for his assistance in reviewing the papers.

JOSEPH G. LISCOUSKI

Digital Equipment Corporation Marlboro, MA May 1984


1 Planning an Approach to Laboratory Automation JOSEPH G. LISCOUSKI

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ch001

Digital Equipment Corporation, 1 Iron Way, P.O. Box 1002, Mail Stop: MRO2-3/M91, Marlboro, MA 01752

Laboratory automation is not, in itself, a goal, but rather a means of achieving an objective and a process for solving some laboratory problems. That process involves a substantial planning effort. Without adequate planning, a laboratory automation project may generate more problems than it solves. Successful projects are planned to take into account both the current needs of a laboratory and some projections as to where the lab will be two or three years out. That time period roughly matches the pace of technology change in computing equipment and microprocessor-driven instrumentation. There are two important elements that need to be included in any planning for the future: flexibility and compatibility. Will my approach provide room for growth and changing requirements (more sophisticated analysis routines, for example)? Can different workstations or microprocessor systems transfer data and programs to each other?

The process of laboratory automation begins when you have clearly identified the things that you want to achieve, and why you want to achieve them. Those "things" should not be couched in phrases like "I want to automate the ...", but rather "I need faster sample turnaround", or "...more sophisticated analysis routines will...". As for the "why?", at some point you are going to have to justify the project and its cost in terms of time, money, and people.

1.0

PROBLEMS FOR LABORATORY AUTOMATION

Laboratory automation can be directed at two types of problems: instrument or experiment automation, and laboratory management systems. In the first case, the computer system (microprocessor-based or larger) may be resident in the instrument or external to it. These systems can provide us with control of the instrument, data acquisition, data analysis, and local storage. They should provide some means of communication to another system, and the information transmitted should allow you to work with the data. This implies the ability to obtain the "raw" data - digitized spectra, chromatograms, etc. - as well as reduced data.

The technology for communications is changing rapidly. The practical choices today range from serial ASCII (RS232) through the IEEE-488 bus. Eventually we can expect to see instruments and computer systems supporting the Ethernet approach. An instrument automation system that is expected to be around for several years needs to be able to take advantage of improving communications hardware and software technology.

The second class of problem for automation (or "computerization", which may be more accurate) is that of laboratory management and laboratory data management. These generally come under the heading of LIMS (Laboratory Information Management System) systems. The concern here is usually in the area of sample tracking, managing an archive of instrument data, and conformance to government regulations (Good Laboratory Practices, Good Manufacturing Practices, Environmental Protection Agency, and others). Word processing and administrative work (personnel, schedules, etc.) may represent added needs.

In a sense, this would be the hub of a fully automated, integrated laboratory system. It should be able to communicate with the instrument automation systems, handling the variety of data types noted above. A fully automated lab may need to contain both types of systems.

For instrument automation systems it is important to note that not all instruments (and experiments) can or should be interfaced to a computer. There are some whose accuracy or utility can be impaired by adding an interface. With some instruments there is also the problem of having to go inside the device to gain access to an analog signal, which could void any equipment warranty. One of the choices you may have to face is the early obsolescence of equipment due to the need for easier, and supported, computer-to-instrument interfacing.

Laboratory automation doesn't begin when the first computer is planned or delivered. Limits on your flexibility in lab automation began to appear the day you ordered your first piece of lab equipment. Thinking about lab automation should occur when you purchase instruments. How can they be interfaced? Are there data systems for them? Are those systems compatible with a range of computer systems, or have you (knowingly or not) locked yourself into a particular approach?

If it isn't possible to tackle the entire job at once, priorities can be established as to whether instrument or management problems are addressed first. Any system implemented over time must have compatibility and communications as prime factors in the planning process.
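The distinction drawn above between "raw" and reduced data can be illustrated with a short sketch. It assumes a digitized trace held as parallel lists of times and signal values, and it reduces the trace to a retention time and a trapezoidal peak area. The names ReducedPeak and reduce_trace, and the sample numbers, are hypothetical and chosen only for the illustration; they are not taken from any instrument data system.

```python
from dataclasses import dataclass


@dataclass
class ReducedPeak:
    retention_time: float  # time at which the signal is highest
    area: float            # trapezoidal area under the baseline-corrected trace


def reduce_trace(times, signal, baseline=0.0):
    """Reduce a digitized trace to one peak summary (hypothetical helper)."""
    if len(times) != len(signal) or len(times) < 2:
        raise ValueError("need at least two matching time/signal points")
    corrected = [s - baseline for s in signal]
    area = 0.0
    for i in range(1, len(times)):
        # Trapezoid between consecutive samples.
        area += 0.5 * (corrected[i] + corrected[i - 1]) * (times[i] - times[i - 1])
    apex = max(range(len(corrected)), key=lambda i: corrected[i])
    return ReducedPeak(retention_time=times[apex], area=area)


# Raw data: kept so that it can be re-analyzed later with a different method.
raw_times = [0.0, 1.0, 2.0, 3.0, 4.0]
raw_signal = [0.0, 2.0, 6.0, 2.0, 0.0]

# Reduced data: what a report would normally carry.
peak = reduce_trace(raw_times, raw_signal)
print(peak)
```

A system that transmits only the reduced values cannot support this kind of reprocessing elsewhere, which is why access to the raw points matters.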

2.0

GOALS FOR INSTRUMENT AUTOMATION

What kinds of goals might we have for instrument automation? One might be to improve sample throughput. The level of automation involved may take the form of an autosampler to work off-hours, a robot system to take care of routine tasks, or an automatic data reduction system to capture the data, reduce it, and provide a completed report. All of these can be used to off-load lab personnel and free them for more productive assignments. That latter point may speak to a goal of having to reduce the rate of growth of a laboratory's personnel, while supporting an increasing work load.

Instrument automation may be required to provide us with more powerful techniques of data analysis and data handling: using statistical techniques that would otherwise be too time consuming to be practical, or computer graphics to gain greater flexibility in data analysis. Small data base systems of spectral libraries can help address a problem of faster component identification. These are just a few examples where automation is a tool being used to achieve a goal.

3.0

SOME POTENTIAL GOALS FOR LIMS SYSTEMS

Here we are more concerned with data management than with data acquisition. The goals might stem from a need to comply with government regulations and gain faster access to information. This is a classical situation for larger (than instrument) computer systems, those capable of handling a large data base with enough flexibility to support routine and ad hoc queries, as well as exchange information with other systems. While this was noted as the hub of the laboratory's automation system, it may be on a lower tier of a larger structure of, perhaps, diverse machines with differing communications requirements. For example, a well-designed and well-integrated system can help address a goal of improving information management not only in the lab, but in a plant-wide scheme for process control. Communications through the "hub" system (serving as a data router or switch) can permit integration of data from different test stations and make possible more thorough and more sophisticated analysis of the lab's data.
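As a rough picture of what "routine and ad hoc queries" against sample-tracking data might look like, the sketch below uses an in-memory SQLite data base. The table layout, the status values, the helper names, and the sample records are all assumptions invented for the example; a real LIMS would carry far more (audit trails, results, regulatory fields).

```python
import sqlite3


def build_tracking_db():
    """Create an in-memory sample-tracking table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE samples (
               sample_id TEXT PRIMARY KEY,
               test      TEXT NOT NULL,
               status    TEXT NOT NULL,
               received  TEXT NOT NULL
           )"""
    )
    return conn


def log_sample(conn, sample_id, test, received):
    """Record a newly received sample."""
    conn.execute(
        "INSERT INTO samples VALUES (?, ?, 'received', ?)",
        (sample_id, test, received),
    )


def set_status(conn, sample_id, status):
    """Move a sample to a new tracking status."""
    conn.execute(
        "UPDATE samples SET status = ? WHERE sample_id = ?", (status, sample_id)
    )


def pending_samples(conn):
    """Routine query: everything not yet reported, oldest first."""
    rows = conn.execute(
        "SELECT sample_id FROM samples WHERE status != 'reported' "
        "ORDER BY received, sample_id"
    )
    return [row[0] for row in rows]


conn = build_tracking_db()
log_sample(conn, "S-001", "GC assay", "1984-06-01")
log_sample(conn, "S-002", "IR scan", "1984-06-02")
log_sample(conn, "S-003", "GC assay", "1984-06-03")
set_status(conn, "S-001", "reported")

backlog = pending_samples(conn)

# Ad hoc query: a question nobody planned for when the system was set up.
gc_count = conn.execute(
    "SELECT COUNT(*) FROM samples WHERE test = ?", ("GC assay",)
).fetchone()[0]
print(backlog, gc_count)
```

The routine query is wrapped in a function because it is asked every day; the ad hoc one is typed as needed, which is the flexibility the text asks of the larger system.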


4.0

TURNING GOALS INTO A LAB AUTOMATION PROJECT


Once a set of goals has been identified, how are they turned into a lab automation project? You can begin by setting up measurement criteria for each goal, against which success can be tested. This point can help you in several ways. Once the goals are quantified, the feasibility of your program can be determined. It will give you a means of evaluating different approaches to the problem, and provides something to point to to show that you have met your objectives - that's helpful the next time approval is needed for a project.

Review all potential remedies. Not all situations require access to a computer to improve them. In an earlier example - sample throughput - some approaches, such as an autosampler, may solve the problem. Computer solutions, while sometimes flashy and giving the appearance of a good solution, are not always the best or most effective. They should be used only when it is clear that there is no other alternative. Why? For many people, computers and microprocessors, while frequently encountered, may not be well understood. Their capabilities are sometimes overstated, and the work and knowledge required to do the job usually underestimated, even by the best of people. Exhaust the simpler approaches first rather than jump into a more ambitious program.

Determine if the problem you are facing is a short-term one (a spike in the request rate for a particular testing procedure) or a long-term concern. It may be easier to live with the short-term problem than rely on a project that may not move to completion fast enough.

Once goals have been stated and justified, determine a realistic time table for implementation. An urgent need may require the purchase of a "turn-key", ready-to-run system, rather than one in which customizing is necessary, or building one in-house. Before purchasing that turn-key system, evaluate its growth potential, and the ability to add to it without having to rely solely on the original vendor. Will it give you the expansion you need for the next few years or are you locked into what is now available?

If the solution looks to be a long-term effort, it may be worthwhile to segment it into smaller steps so that you can gain some early benefit. A project that requires a complete lab automation system, covering both instruments and management, might be divided into successive stages: choose some instrument automation first, with lab management at a second stage, and complete the instrument work later. This segmentation requires an evaluation of the entire problem and the individual stages in light of the following considerations:

ο compatibility - will what you do in one stage be compatible with the next? Will it permit easy integration, or will a second project be needed to handle it? Segmentation should make life easier, not create more work.

ο communications between the instrument automation system and management system. Does the capability exist? Does the software exist? Communications means more than just two machines having RS232 capability available. That is a wiring and voltage standard. Communication involves the useful transmission of information. In order for that to happen, the two systems have to agree on message format and protocols, error detection and correction, the ability to transmit ASCII and binary files, and a range of other factors. Who is responsible for making it happen? An answer of "someone" is a problem waiting to happen.

ο What can I expect from the vendors two or three years down the road? Will the equipment purchased still be supported then? Will they still be there?

ο Are there any changes in the vendor's plans that may have an effect on my direction? If you are standardizing on one vendor, periodically review your plan with them in light of their development plans. Many, after the necessary legal paperwork has been taken care of, will discuss the general directions, or at least comment on compatibility of yours and their directions. These reviews may lead to adjustments in direction as a result of newer, and hopefully compatible, technologies.
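One way to picture the agreement on message format and error detection called for in the communications item above is a small framing routine. The sketch below is only an illustration, not a protocol described in this chapter: the frame layout (a one-byte kind flag marking ASCII or binary content, a four-byte payload length, the payload, and a CRC-32 trailer) and the names encode_frame and decode_frame are assumptions made for the example. It detects errors but does not correct them; a real link would also need an agreed rule for requesting retransmission.

```python
import struct
import zlib

HEADER = struct.Struct(">cI")   # 1-byte kind flag, 4-byte payload length
TRAILER = struct.Struct(">I")   # CRC-32 over header + payload


def encode_frame(kind: bytes, payload: bytes) -> bytes:
    """Wrap a payload in a frame; kind is b"A" (ASCII) or b"B" (binary)."""
    if kind not in (b"A", b"B"):
        raise ValueError("kind must be b'A' or b'B'")
    header = HEADER.pack(kind, len(payload))
    crc = zlib.crc32(header + payload)
    return header + payload + TRAILER.pack(crc)


def decode_frame(frame: bytes):
    """Return (kind, payload), or raise ValueError if the frame is damaged."""
    if len(frame) < HEADER.size + TRAILER.size:
        raise ValueError("frame too short")
    kind, length = HEADER.unpack_from(frame, 0)
    end = HEADER.size + length
    if len(frame) != end + TRAILER.size:
        raise ValueError("length field does not match frame size")
    (crc,) = TRAILER.unpack_from(frame, end)
    if zlib.crc32(frame[:end]) != crc:
        raise ValueError("checksum mismatch")
    return kind, frame[HEADER.size:end]


frame = encode_frame(b"A", b"peak area=10.0")
decoded = decode_frame(frame)
print(decoded)
```

Both ends must share every one of these choices (field order, byte order, checksum algorithm), which is the sense in which having RS232 on both machines is not yet communication.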

It usually helps to have your plans reviewed by an outside party, either a consultant or the vendors you see as being key to your project. They may see things from a different perspective or have encountered similar situations, and can help you avoid blind alleys and pitfalls. It may help to have them recommend a solution without seeing yours. Someone fresh to a problem, and not biased by suggested solutions, may come up with interesting insights.
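The measurement criteria recommended at the start of this section are also something an outside reviewer can check, provided they are written down in testable form. The following sketch shows one possible way to record them. The Criterion class, the goal names, and every number in it are made up for illustration and do not come from any laboratory.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    goal: str
    unit: str
    baseline: float            # where the lab stands today
    target: float              # the value that counts as success
    higher_is_better: bool = True

    def met(self, measured: float) -> bool:
        """Test a measured value against the target."""
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


# Hypothetical goals for a project; the figures are placeholders.
criteria = [
    Criterion("sample throughput", "samples/day", baseline=40, target=60),
    Criterion("turnaround", "hours", baseline=48, target=24, higher_is_better=False),
]

# Values measured after the project has been running for a while.
measured = {"sample throughput": 65, "turnaround": 30}

report = {c.goal: c.met(measured[c.goal]) for c in criteria}
print(report)
```

Recording the baseline alongside the target makes it possible to show improvement even when a target is missed, which helps when the next project needs approval.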

5.0

MAKE OR BUY?

Eventually, in any lab automation project, you come to the same decision point: do I purchase a system, or have one built to my [particular, unique, special, one of a kind] - fill in the blank - needs? First, are your needs really that [particular, unique, special, one of a kind]? A consultant can help you find out. If they are, your choice is clear. If they aren't, you have some thinking to do.


6.0

CONVENTIONAL WISDOM FAVORS THE BUY DECISION


There was an interesting study published in 1975 titled "Achieving the Optimal Information System for the Laboratory" (published by J. Lloyd Johnson Associates, Northbrook, Ill.). Their study is primarily concerned with hospital systems, but for our purposes there is no difference in the applicability of the results to general laboratory automation.

In pre-1975 dollars, the cost of a minimal laboratory system ran as high as $900,000 for an in-house developed package. That was the extreme cited, but not far behind were numbers like $700,000, $650,000, $600,000, and ranging down to $100,000. What was a minimal system? A minimal system was one "having some of the automated instruments in chemistry and hematology on-line, along with any three or more of the following functions":

ο Test request entered through a CRT terminal in an efficient manner

ο Collection list with labels printed

ο Worklist generated

ο Test results entered without manual re-entry of patient or specimen number

ο Test result inquiry via CRT terminal

ο Ward report printed

ο Cumulative summaries printed

The study found that one in four will achieve a minimally satisfactory system at an average direct cost of $300,000. Another one in four will achieve a minimally satisfactory system at costs well in excess of $300,000. The development time averaged 3 years. Would a minimal system meet your goal? In addition, some actuarial results were also reported (again from the same study):

ο "Half the time the hospital will lose the total investment"

ο "a quarter of the time the hospital will invest more than $300,000, for a system with twice the operating cost of a leased turnkey system"

ο "There is less than one chance in twenty that an in-house system will be, for a brief time, marginally better than any available turnkey system"

ο "Turnkey system suppliers are constantly improving their capabilities"

7.0

WHAT HAS CHANGED SINCE 1975?

The biggest change is in price. Hardware has dropped in price dramatically. Manpower costs have risen. The power of the systems available - at a particular price - has improved, and the software is far superior to that of eight years ago. (That is true of the established 16- and 32-bit systems; microprocessor systems, while moving rapidly, are still far behind the capability of the larger vendors' products.)

Hardware costs are among the more deceptive points in pricing a package, or budgeting for a project. Computers can be purchased at any price range, from $50 to $500,000 and up, with the range extending from micro to mini and maxi. The concern is not the cost of the hardware but what you can do with it. That is governed by software. Good software can make up for the sins of poor hardware - to a point. Poor software can make a good hardware package unusable. Don't purchase a piece of hardware and then try to find, or develop, applications software. The better approach is to find the software that will do the job, and then buy the hardware best suited for it.

Since the study was done, the range of software packages, their capability, and their quality have improved. Standard libraries for data acquisition, analysis, graphics, and data base management now exist. Those in the study had to write their own, frequently in assembly language rather than a high-level language. Computer programming languages have evolved rapidly, with a number of them providing in one statement facilities that took pages of code before.
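The point about standard libraries replacing hand-written code can be shown with a small comparison. The sketch below computes the mean and sample standard deviation of a set of replicate measurements twice: once with explicit loops, roughly as it would have to be written without library support, and once with single calls into a standard statistics library. Python and its statistics module are used purely as a convenient modern illustration, and the replicate values are invented.

```python
import math
import statistics


def mean_and_stdev_by_hand(values):
    """Mean and sample standard deviation written out step by step."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    total = 0.0
    for v in values:
        total += v
    mean = total / n
    squares = 0.0
    for v in values:
        squares += (v - mean) ** 2
    return mean, math.sqrt(squares / (n - 1))


replicates = [10.1, 9.9, 10.0, 10.2, 9.8]  # made-up replicate readings

by_hand = mean_and_stdev_by_hand(replicates)

# The same results, one library statement each.
by_library = (statistics.mean(replicates), statistics.stdev(replicates))

print(by_hand, by_library)
```

The hand-written version is short here only because the calculation is simple; for curve fitting, graphics, or data base access the difference in effort is much larger, which is the argument for finding the software first.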

With all this improvement, is "buy" still the best answer? Yes, largely because of manpower costs and the truly [particular, unique, special, one of a kind] things that need to be addressed in any organization. Duplicating an existing package means that you will have to take on the support and maintenance effort for yourself, and that's not low-budget stuff! The questions noted above still have to be asked of any purchased system: questions regarding growth, expansion, support, reliability of the vendor, and so on.

9.0

OTHER POINTS TO PONDER IN PLANNING

Where are you going to put it? Laboratory environments are not noted as being kind to sensitive electrical equipment. While the government, through the Federal Communications Commission, is working on the problems of computers generating electromagnetic and radio frequency emissions that may affect other devices, these same devices, through dirty electrical lines or their own emissions, affect computers. Corrosive gases and other agents can render a machine into an expensive, though unusable, collection of metal, epoxy, and plastic. Poor electrical grounding has prematurely aged a number of people in general laboratory automation. Many of these problems can be circumvented by carefully picking the machine's location. The systems vendor should be able to give you the necessary guidelines and, if needed, make a visit to your site to look for potential problems.

Who is going to run the system? Like any other piece of equipment, computers require maintenance, updates, repairs, materials ordered, and the like. This should not be left to a committee; rather, pick someone to handle the responsibility and see that they get adequate training to handle the job.

In the course of this article, we have covered a lot of ground. The main points can be summarized easily:

o planning is essential,

o clear goals are needed with an implementation plan and responsibilities outlined,

o measurement criteria for success of the project need to be established, and,

o provision needs to be made for the system's growth and for communications.

Properly planned, a laboratory automation project can improve a lab's ability to collect, analyze, and manage data. That planning needs to begin early in a lab's lifetime, and should include a consideration of long term and short term goals. With that work in place, you have greatly improved the likelihood of a successful automation project.


RECEIVED June 20, 1984


2 Robots and Robotics in the Laboratory: What Does It Mean?

CHARLES H. LOCHMÜLLER

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ch002

Paul M. Gross Chemical Laboratory, Duke University, Durham, NC 27706

This paper addresses the limited current state of the art in laboratory robotics and compares it to current manufacturing practice. Important questions are: "When is automation robotics?", "What is a robot anyway?" and "Where does a robot fit in a laboratory environment?" Examples of current applications are reviewed and suggestions for future directions are presented.

The idea of a robot in the laboratory is at once a familiar and a very strange concept. Part of the problem is the association by many of the word ROBOT with a variety of ambulatory mechanical automatons of different degrees of sophistication. Currently available robots are a disappointment to many as they are neither as clever as R2D2 nor as human as C3PO of Star Wars fame. In fact, the vast majority of current robots are really arm-like machines with varying strength and dexterity; some are capable of moving hundreds of kilos and of placing such objects within fractions of a centimeter while others manipulate gram masses to sub-millimeter precision. They resemble parts of the common concept of a robot more than a whole.

Nevertheless, the previous paragraph provides the kernel of a definition for a robot. "A mechanical device which performs complex tasks with human-like skill" may be a little too general but is a good working definition. The word robot derives from the Czech "robota" (forced labor), which shares a Slavic root with words for "work" and "worker", and human work often requires significant mechanical skill.

Consider then that current laboratory robots are, in essence, "blind, one-armed men" and you immediately arrive at the crude nature they possess. Current robots do not have true human skill but many common tasks are accomplished satisfactorily given their inherent handicaps.

0097-6156/84/0265-0011$06.00/0 © 1984 American Chemical Society

Automation vs Robotics

How is a robot (which is used to automate a laboratory task) different from an automated instrument (which could be designed to perform the same task)? That is not an easy question to answer but a reference to manufacturing robots may provide some clue. There is a great deal of difference between a "roboticized" production line and an automated one. Automated production works well in situations in which the product is completely standardized and all spatial characteristics are fixed, e.g., bottling soda. The advantage of robotics is in the ability to adapt to new product characteristics, e.g., a complete body change on the "'84 model" in a welding or spraying operation. An "industrial" robot is a "reprogrammable mechanical device which performs complex tasks with human-like skill". It is this reprogrammable or retrainable aspect that makes the robot attractive from an engineering viewpoint. Of course, even a robot assembly line is not completely retrainable, i.e., auto assembly plants cannot become textile mills by simple software fixes. Not unreasonably, the same is true of current laboratory robots but, especially in a routine determination function (e.g., quality control) where the chemical "unit operations" are very similar in procedures involving radically different analytes, the retraining feature is an extremely desirable advantage.

Training a Robot

Robots which are required to mimic complex human motion (e.g., the spray painting of automobiles by a 20-year veteran painter) will require very sophisticated training utilities in the controller/operating system. In fact, such robots "learn" by being led through a task "hand-in-hand" with a skilled human operator. Such a continuous transduction of position, speed and direction into a control program is very expensive. No laboratory robot available today utilizes such a training scheme. Instead, current robots are led through a sequence of steps which are individually "programmed" by an operator to represent a unit operation (e.g., "pour", "tare", "weigh", "dilute", "dispense", "take aliquot") and which are linked to become a program to make the robot carry out a particular task, e.g., "Do 100 immunoassays - Type 1". Again the difference from automation is that the same robot can, after finishing the immunoassays, begin a new task: "Prepare 20 vitamin assay samples - Type 3". A real requirement for current robots is a totally fixed coordinate system. Current robots cannot find a tube rack on a table top; they simply go to where a tube rack "is supposed to be".
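The idea of linking individually taught unit operations into a task program, with every position fixed in advance, can be sketched roughly as follows. This is only an illustration in Python: the station names, coordinates, `Robot` class and the two example programs are hypothetical and do not describe the controller software of any actual laboratory robot.

```python
from dataclasses import dataclass, field

# Hypothetical fixed positions (cm) in the robot's own coordinate frame.
# The robot cannot "see": it only goes where each item is supposed to be.
STATIONS = {
    "balance": (0.0, 40.0, 8.0),
    "dispenser": (-25.0, 20.0, 12.0),
}


@dataclass
class Robot:
    """Toy robot that records what it would do instead of moving hardware."""
    log: list = field(default_factory=list)

    def move_to(self, station):
        x, y, z = STATIONS[station]  # KeyError if the station was never taught
        self.log.append(f"move to {station} at ({x}, {y}, {z})")

    # Each method below stands for one taught unit operation.
    def tare(self):
        self.move_to("balance")
        self.log.append("tare")

    def weigh(self):
        self.move_to("balance")
        self.log.append("weigh")

    def dispense(self, ml):
        self.move_to("dispenser")
        self.log.append(f"dispense {ml} mL")


def run(robot, program, n_samples):
    """Repeat a linked list of unit operations once per sample."""
    for sample in range(1, n_samples + 1):
        robot.log.append(f"sample {sample}")
        for op_name, args in program:
            getattr(robot, op_name)(*args)
    return robot.log


# Two different tasks built from the same unit operations (contents invented).
IMMUNOASSAY_TYPE_1 = [("tare", ()), ("weigh", ()), ("dispense", (2.0,))]
VITAMIN_ASSAY_TYPE_3 = [("dispense", (5.0,)), ("weigh", ())]
```

Calling `run(Robot(), IMMUNOASSAY_TYPE_1, 100)` and then `run` again with `VITAMIN_ASSAY_TYPE_3` on the same object mirrors the retraining point made above: the hardware is unchanged and only the linked list of operations differs.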

Robots: Types and Coordinate Systems

Current laboratory robot operations use many of the instrument modules familiar in conventional automation: syringe drives, relay drivers, current and/or voltage sensors (including A/D conversion), etc. The uniquely robotic component is a "pick and place" arm which serves as a "mass mover" of sample, solution, etc. from one unit operation to the next. The robot controller functions to control both the pick-and-place component and the separate unit operations. Actually it is poor practice to separate any of the functions of a robotic system and decide that it is the robotic element. It is the system that is reprogrammable or retrainable and should be thought of as an entity composed of numerous functional abilities.

Let us compare the two robot types currently available commercially for use (or adapted for use) in laboratory environments: Zymate (developed by Zymark Corp., Hopkinton, Mass.) and Microbot Alpha (manufactured by Microbot, Inc., Mountain View, CA but adapted by G. Owens and co-workers of the Procter and Gamble Advanced Instrumentation Group, Cincinnati, OH). These two robots differ in major ways, each with its unique personality and capabilities. The detail of implementation has been dealt with elsewhere (1) and need not be dwelt on here. The Zymate is a robot specifically built for laboratory operations and especially for sample preparation. The Microbot Alpha is an assembly robot typical of electronics manufacture but with slightly poorer positioning tolerances than the very best available for that purpose. Both are stationary robots (although the adaptation of Owens et al. translates in one dimension in some configurations) requiring precise positioning of work pieces in a circle around the workplace. Neither possesses tactile or visual "sense" in standard configuration. Tactile sense can be achieved by monitoring current in the hand/finger servo systems.
The Zymate [Figure 1] moves in a cylindrical coordinate system (rotate 370°, reach 60 cm, lift 56 cm) under control of a microprocessor computer using DC servomotor and cable drive with potentiometric sensing of position. It possesses a "broken wrist" capable of rotation (360°) but not bending. A unique feature lies in the interchangeability of the "hands". Gripper hands permit movement of tubes and other vessels while syringe hands can deliver small volumes, take aliquots and, with adapters, filter liquid samples. In some applications special hands control instrument on/off functions.

The Microbot Alpha [Figure 2] is an articulated arm with a 46 in. hemispherical envelope. The arm has a positioning accuracy of 0.5 mm within the envelope. It is a stepping-motor and cable driven robot controlled by a 6502 processor that communicates via an RS-232 interface to the "outside world". Like the Zymate, it can be trained using a hand-held pendant keyboard or can be externally driven by a laboratory microcomputer. The coordinate system of a fully articulated arm is more complicated than a simple cylindrical system but this is overcome by software control. The advantage is that the Alpha can bend its "wrist" to reach into tight, angled quarters such as when tubes must be removed from a slant-tube centrifuge head.
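To make the cylindrical coordinate system concrete, the following Python sketch converts between (rotate, reach, lift) and Cartesian (x, y, z) positions and checks a target against the travel limits quoted above. The choice of origin, the zero points of each axis and the function names are assumptions made for illustration; they are not taken from the Zymate controller.

```python
import math

# Travel limits as quoted in the text; the zero points are assumed.
MAX_ROTATE_DEG = 370.0
MAX_REACH_CM = 60.0
MAX_LIFT_CM = 56.0


def in_envelope(rotate_deg, reach_cm, lift_cm):
    """True if the cylindrical position lies inside the quoted limits."""
    return (0.0 <= rotate_deg <= MAX_ROTATE_DEG
            and 0.0 <= reach_cm <= MAX_REACH_CM
            and 0.0 <= lift_cm <= MAX_LIFT_CM)


def cylindrical_to_cartesian(rotate_deg, reach_cm, lift_cm):
    """Convert (rotation, reach, lift) to (x, y, z) about the robot base."""
    if not in_envelope(rotate_deg, reach_cm, lift_cm):
        raise ValueError("position is outside the working envelope")
    theta = math.radians(rotate_deg)
    return (reach_cm * math.cos(theta), reach_cm * math.sin(theta), lift_cm)


def cartesian_to_cylindrical(x, y, z):
    """Inverse conversion; rotation is returned in the range [0, 360)."""
    rotate_deg = math.degrees(math.atan2(y, x)) % 360.0
    return (rotate_deg, math.hypot(x, y), z)
```

For a cylindrical arm the conversion is a single trigonometric step per axis pair, which is one way to see why the text calls this system simple compared with a fully articulated arm, where several joint angles must be solved together in software.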
Figure 1. The Zymate (Zymark Corp., Hopkinton, MA) showing the main robot module (center) with universal wrist and "gripper" hand attached. In the upper right is the controller with programming keyboard and soft keys (right hand side of display screen). The soft keys can be duplicated in a "teach/learn" pendant (not shown).

Figure 2. The Alpha (Microbot Inc., Mountain View, CA) showing: teaching pendant (lower left), robot with gripper attached (alternate grippers in front center), system controller and operator control module (lower right). Note how wrist can both rotate and bend.

Current Applications

Robots are best suited (in their present form) for tedious, repetitive and humanly-hazardous jobs. Tablet analysis, immunoassay determinations, polymer solubility, etc. are ideal applications. Less routine perhaps, but just as tedious, are studies of enzyme action and activity which require variation in reagents and perhaps incubation timing, the optimisation of chemical reactions or routine preparation of derivatives in organic synthesis. To date, robots do not displace people by "taking over their job" because they are used in tasks in which there is a high turnover of personnel due to boredom. The personnel freed by the introduction of robots can be involved in more personally satisfying tasks. In addition, there is mounting evidence that the use of robots greatly reduces both the long- and short-term variance in quality control applications.

Immediate laboratory robot application is possible in almost any laboratory sample preparation program. Robots currently available can and do function to perform almost all of the unit operations associated with sample preparation: weighing, dissolution, centrifugation, reagent dispensing, mixing, incubation, filtering, liquid-liquid extraction and filling sample trays. All of this can be performed with complete logging of sample history.

Conclusion

There is little to conclude at present. Robotics is an infant engineering discipline and yet the manufacturing aspect of robotic implementation is far ahead of any laboratory application. Today's robots are reprogrammable automation; they are far from being cybernauts and are hardly "clever", but their potential as an "arm" for artificial intelligence experiments cannot be overlooked.

Literature Cited

1. Analytical Chemistry, A/C Interface, Vol. 55, 1100A-1114A, 1232A-1242A (1983).

RECEIVED June 5, 1984


3 General Laboratory Data Management and Specific Laboratory Needs

W. KIPINIAK and W. FINNERTY

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ch003

Computer Inquiry System Inc., 160 Hopper Avenue, Waldwick, NJ 07463

This paper describes the design and implementation of a versatile computerized laboratory automation and information management system. Discussion highlights its adaptability to a variety of laboratory environments. Data can be entered manually or acquired automatically from spectrum generating and spot reading instruments such as chromatographs, spectrophotometers, balances, pH meters and a wide variety of intelligent instrumentation. In accordance with good laboratory practice, information can be databased on-line or archived off-line in such a manner that allows quick and easy retrieval. Standard reports in fixed formats and ad-hoc reports in virtually any format can be easily generated either on demand or on a scheduled basis. Because every laboratory is unique in its function and organization, this package is designed to be easily adapted to any combination of environments, products, instruments and staffing. This adaptability makes it ideally suited to any laboratory application.

0097-6156/84/0265-0017$06.00/0 © 1984 American Chemical Society

A comprehensive computerized laboratory data management system has been designed and implemented successfully in a number of laboratories to provide test data collection, result reporting and total information control. The broad range of target laboratories focused the Lab Manager system design efforts on the flexibility necessary to accommodate custom configuration to the unique and evolving needs of diverse laboratories. It has been integrated into the Computer Automated Laboratory System (CALS) for complete laboratory management.

Operating in an on-line, user-friendly conversational mode, the Lab Manager system adapts to established laboratory procedures and methods, resulting in minimal operational retraining and loss of "learning curve" experience. Additionally, design criteria included the ability to easily accommodate the strict, evolving "Good Laboratory Practice" procedures derived from regulatory agency or in-house standards.

The system's flexibility in configuration is provided by the use of files, called dictionaries, which contain all the tables necessary to allow conversational on-line maintenance of user ID and security definitions, test descriptions, product specifications, calculation and report generation procedures. For instance, the identification dictionary defines logon information about each user, including name, password and security classification. The system manager generally modifies this dictionary on-line as employees and security considerations change. The system automatically references this dictionary to ensure that each operator has the required authority both to log on and to execute any requested command.

System security is managed by a scheme providing for the assignment of each command to any one or more of sixteen security keys. Every user is assigned one or more of the defined keys and allowed to execute only those commands which his keys enable. Initially defined during system installation, the security tables are maintained by assigning user identification, password and security key. Access to the security tables is controlled by the scheme itself: the tables are easily modified on-line, but only by one with the required authorization.
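One way to picture the sixteen-key scheme is as a 16-bit mask per command and per user, with a command permitted when the two masks share at least one key. The Python sketch below is an illustration only: the command names, user names and the "any shared key is sufficient" rule are assumptions, since the text does not say how the Lab Manager system stores or compares keys.

```python
NUM_KEYS = 16  # the text specifies sixteen security keys


def key_mask(*keys):
    """Build a bit mask from key numbers 1..16."""
    mask = 0
    for key in keys:
        if not 1 <= key <= NUM_KEYS:
            raise ValueError(f"key {key} is outside 1..{NUM_KEYS}")
        mask |= 1 << (key - 1)
    return mask


# Hypothetical command table: each command is assigned one or more keys.
COMMAND_KEYS = {
    "LOGIN_SAMPLE": key_mask(1, 2),
    "ENTER_RESULT": key_mask(2),
    "APPROVE_SAMPLE": key_mask(5),
    "EDIT_SECURITY": key_mask(16),
}

# Hypothetical user table: each user holds one or more keys.
USER_KEYS = {
    "clerk": key_mask(1),
    "analyst": key_mask(2),
    "manager": key_mask(2, 5, 16),
}


def can_execute(user, command):
    """A user may run a command if any of the user's keys enables it."""
    return bool(USER_KEYS[user] & COMMAND_KEYS[command])
```

Because `EDIT_SECURITY` is itself just another command in the table, the sketch also reflects the remark that access to the security tables is controlled by the scheme itself.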

Dictionaries

The Lab Manager system dictionaries contain all information relevant to the unique operation of the system in each laboratory. The fields within each dictionary record can be defined and redefined for the specific type of data and specifications encountered within any particular laboratory, making it possible for prompting messages and headings displayed on terminals to vary from one laboratory environment to another, each requesting or presenting data in appropriate terminology. All dictionaries, whether required by the system or dictated by site-specific requirements, can be used in retrievals and reporting as needed. The system is configured to automatically record, for each entry in every dictionary, the date, time and identification of the operator making the last change to the entry.

In addition to the identification dictionary already discussed, the test dictionary is a repository for all the tests that might be performed in the laboratory and is configured to contain user prompts, test protocols, quantity required for testing, assigned testing location and analyst, standard testing time, the name of any special calculation program to be used and cost per test. Additional fields are added as needed to meet local requirements.

The calculation dictionary defines the procedures required to perform calculations on test data. User-friendly features include variable declaration as either local to a test or fetched from another test on the same or a different sample, conversational input statements and algebraic calculations. Each calculation procedure may be referenced by any one or all entries in the test dictionary.
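The automatic recording of who last changed each dictionary entry, and when, can be sketched as a thin wrapper around a table. The Python below is an illustration under stated assumptions: the class name, field names and the example test entry are hypothetical, and the real system's file layout is not described in the text.

```python
from datetime import datetime


class AuditedDictionary:
    """Table whose entries carry the operator and time of the last change."""

    def __init__(self, name):
        self.name = name
        self._entries = {}

    def put(self, key, fields, operator, now=None):
        """Create or replace an entry and stamp it with operator and time."""
        stamp = now if now is not None else datetime.now()
        self._entries[key] = {
            "fields": dict(fields),   # copy so later caller edits do not leak in
            "changed_by": operator,
            "changed_at": stamp,
        }

    def get(self, key):
        return self._entries[key]


# Hypothetical test dictionary entry with a few of the fields listed above.
tests = AuditedDictionary("test")
tests.put(
    "PH",
    {"prompt": "Enter pH reading", "location": "Wet lab", "cost": 4.50},
    operator="jsmith",
    now=datetime(1984, 6, 5, 9, 30),
)
```

Stamping inside `put` rather than leaving it to the caller is the design point: every change path goes through one place, so no entry can be modified without the audit fields being updated.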


Each record in the product specification dictionary contains fields for the definition of tests to be conducted, test limits, product-specific information and any other auxiliary information pertaining to each different type of product. Predefined tests are automatically scheduled during the sample login process. Alternatively, assignment of tests and their applicable limits may be specified during or after the sample login. The number of system prompting messages issued during login of any sample may be reduced to a minimum, thus providing minimum operator interaction, fewer data entry errors and greater productivity.

Complete data retrieval and report generation procedures are entered and stored as records within the database procedures dictionary. Any data retrieved from the database, or any of the dictionaries, may be formatted and printed as a report via the system report generator. This feature of the system can be used by almost anyone without any knowledge of programming because instructions are entered in English-like commands and checked by the system before being executed. Margins, spacing, headings and footings can be defined easily. Reports may be designed, printed, labelled, sorted, totalled and averaged in numerous ways. Data can be easily plotted, labelled and automatically scaled on a variety of plotters, with multi-color plotting included. The process of data retrieval, report generation and plotting is typically initiated by a single command with no further operator interaction required. All retrieval and report procedures are permanently stored in this dictionary to eliminate the need for retyping each time they are used.

Sample Tracking and System Operation

The Lab Manager system provides extensive facilities to track and control samples throughout the laboratory; indeed, management of samples and results is its primary function. The status of any sample and its associated test results can be reviewed at any time with only a moment's notice. Prioritized worklists, sample status and backlogged sample reports are generated on request by any operator with the proper security definition.

Sample login involves registering a sample with the system by assigning, either manually or automatically, a unique identification called the sample ID. During this process, a "snapshot" of the product specification, test and calculation dictionaries is taken by moving all required information into the database, as defined in the configuration tables. This login process can be performed by a remote computer as well.

All samples logged into the system may require a "sampling" step before any testing can be conducted. The sampling step is optional, as defined in the configuration tables, and allows any pre-test processing, such as label printing, that may be required.

Recording and validating test data represents the single most tedious aspect of any laboratory operation. The Lab Manager system is designed to accept test results either directly from laboratory instruments or as manually entered by laboratory personnel.
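The login "snapshot" described above can be illustrated with a deep copy: the specification in force at login time is copied into the sample's database record, so later edits to the dictionary do not alter samples already logged in. The following Python is a minimal sketch; the product name, limits, ID format and in-memory tables are all hypothetical stand-ins for the system's dictionaries and database.

```python
import copy
import itertools

# Hypothetical product specification dictionary.
PRODUCT_SPECS = {
    "ASPIRIN_TABLET": {
        "tests": {"ASSAY": {"low": 95.0, "high": 105.0}},
    },
}

DATABASE = {}                  # sample ID -> sample record
_next_id = itertools.count(1)  # source of automatically assigned IDs


def login_sample(product, sample_id=None):
    """Register a sample and snapshot its specification into the database."""
    if sample_id is None:
        sample_id = f"S{next(_next_id):05d}"   # automatic assignment
    if sample_id in DATABASE:
        raise ValueError(f"sample ID {sample_id} is already in use")
    DATABASE[sample_id] = {
        "product": product,
        # Deep copy: the record keeps the limits as they were at login.
        "spec": copy.deepcopy(PRODUCT_SPECS[product]),
        "results": {},
    }
    return sample_id
```

The practical consequence is that a specification revision applies only to samples logged in afterwards, which is usually what an audit of historical results expects.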


Database updating and archiving functions are standard features of the Lab Manager system. Updating is performed "on-the-fly", providing retrievable data as soon as it is entered. Archiving is performed periodically to establish long term off-line storage of data from completed samples. Archived data may be recalled at any time for additional on-line analysis or reporting.


On-Line Data Acquisition and Manual Results Entry

Results of on-line, real-time instrumental analyses are posted directly to the database, with or without processing, as soon as the instrument presents the analytical data. All results are available for review and validation immediately after being posted, a feature critical to effective laboratory management. Data is acquired from laboratory instrumentation of virtually any manufacturer or function. Instruments are interfaced to the computer via analog to digital conversion, RS-232C, current loop, IEEE-488, binary coded decimal (BCD) or bit parallel techniques. Analog to digital conversion is performed by an interface operating at up to sixty readings per second in the +/-10 volt input range and is capable of resolving 0.3 microvolts.

Manual entry of data encompasses tests ranging from simple pass/fail tests or sample descriptions through recording quantitative results from complex assays, depending upon the configurable definition of the test. All prompts issued by the system can be quickly and easily changed on-line to suit the requirements of individual laboratories. Additionally, the Lab Manager system will calculate and record secondary results from raw data entered. For example, a titration test would prompt the analyst for the volume and normality of titrant used and the sample weight, calculate the result, post the raw data and result to the database and compare the data to the specified limits to determine whether the test passes or fails. All results are available for review and validation, if required, immediately after posting to the database.

Calculations required are structured as simple algebraic expressions and are easily specified by laboratory personnel without programming knowledge. Calculations can include addition, subtraction, multiplication, division, exponentiation and powerful intrinsic routines such as calculating arithmetic means, deviations and trigonometric functions. Complicated iterative calculations can be programmed and added to these features as required.
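The titration example can be worked through in a few lines. The sketch below, in Python, assumes the reported result is milliequivalents of titrant consumed per gram of sample (volume in mL times normality, divided by sample weight in g); the text does not state which formula or units the system uses, and the function names, record layout and limits are hypothetical.

```python
def titration_result(volume_ml, normality, sample_weight_g):
    """Milliequivalents of titrant consumed per gram of sample (assumed form)."""
    if sample_weight_g <= 0:
        raise ValueError("sample weight must be positive")
    return volume_ml * normality / sample_weight_g


def evaluate(result, low, high):
    """Compare a result with the specification limits."""
    return "PASS" if low <= result <= high else "FAIL"


def post_titration(database, sample_id, volume_ml, normality,
                   sample_weight_g, low, high):
    """Calculate the result, then post raw data, result and status together."""
    result = titration_result(volume_ml, normality, sample_weight_g)
    record = {
        "raw": {
            "volume_ml": volume_ml,
            "normality": normality,
            "sample_weight_g": sample_weight_g,
        },
        "result": result,
        "status": evaluate(result, low, high),
    }
    database.setdefault(sample_id, {})["TITRATION"] = record
    return record
```

For instance, 12.5 mL of 0.1 N titrant on a 0.5 g sample gives 12.5 x 0.1 / 0.5 = 2.5 meq/g, which would pass against hypothetical limits of 2.0 to 3.0. Keeping the raw inputs alongside the calculated value means the result can be recomputed or reviewed later.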

Testing, Validation and Sample Approval

The Lab Manager system allows setting up a sample and sequentially performing all tests associated with it. Often, laboratory procedures make it easier to set up and perform one test for a series of samples before proceeding to the next test. The system addresses this functionality through its "runsheet" processing feature, which groups together all samples scheduled for the same test and displays them on the terminal, allowing the operator to select samples sequentially or randomly for testing.
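The runsheet idea amounts to regrouping the schedule by test instead of by sample. The following Python sketch shows that regrouping under assumed names (`build_runsheets`, the sample identifiers and the test names are all hypothetical); it says nothing about how the Lab Manager system stores its schedule.

```python
from collections import defaultdict


def build_runsheets(scheduled):
    """Group scheduled work by test.

    scheduled: iterable of (sample_id, test_name) pairs.
    Returns a dict mapping each test name to the samples awaiting that test,
    in the order they were scheduled.
    """
    runsheets = defaultdict(list)
    for sample_id, test_name in scheduled:
        runsheets[test_name].append(sample_id)
    return dict(runsheets)


scheduled_work = [
    ("S-001", "pH"),
    ("S-001", "moisture"),
    ("S-002", "pH"),
    ("S-003", "moisture"),
]
runsheets = build_runsheets(scheduled_work)
```

An operator working the "pH" runsheet would then see S-001 and S-002 together and could take them in any order.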


The validation of test results represents the practice of a second analyst reviewing the work of another as required by various regulatory agencies and in-house policies. The system can prevent validation of a test result by the person who performed the test because the identification of the analyst as well as the date and time the test was conducted is recorded for each test. Validation is performed on a test-by-test basis and includes reviewing test results before validating or invalidating them. In either case, a retest may be scheduled. The Lab Manager system's configurability allows this step to be bypassed if it is not appropriate in a specific laboratory environment.

Approving the sample involves reviewing all test results associated with the sample and, optionally, other samples in the database. The approval process is typically limited, by the on-line configurable security scheme, to those laboratory personnel that are responsible for releasing samples from the laboratory. This step may be bypassed if it is not appropriate in a specific laboratory environment.
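The second-analyst rule follows directly from recording who performed each test. A minimal sketch in Python, assuming a hypothetical `ResultEntry` record and a configuration switch `allow_self_validation` that stands in for the system's configurability:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ResultEntry:
    """One posted test result (hypothetical field names)."""
    sample_id: str
    test_name: str
    value: float
    analyst: str
    performed_at: datetime
    validated_by: Optional[str] = None
    status: str = "pending"  # "pending", "valid" or "invalid"


def validate(entry, reviewer, accept, allow_self_validation=False):
    """Record a reviewer's decision, refusing self-review unless configured."""
    if reviewer == entry.analyst and not allow_self_validation:
        raise PermissionError("a result may not be validated by the analyst who produced it")
    entry.validated_by = reviewer
    entry.status = "valid" if accept else "invalid"
    return entry
```

Either outcome leaves the entry available for scheduling a retest, which is outside the scope of this sketch.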

Retrieval and Reporting

Reporting of laboratory data is performed by the system report generator. Standard and ad-hoc reports are provided. The standard reports, designed to meet regulatory requirements for documentation, verification and control, are difficult to change, whereas ad-hoc reports can be changed easily on-line.

The most important of the standard reports is the Certificate of Analysis. Although the report format is relatively fixed, its unique feature is that all copies generated before or after the official copy are labelled either preliminary or duplicate as appropriate. This feature is essential for any laboratory that must conform to specific standards, certify its processes and control report documents.

Many other reports can be produced by the system on demand or on a routine schedule. The content and format of these reports are easily established and generally defined, as required, by laboratory personnel with no computer programming knowledge. Reports can contain any data stored in the database. Sophisticated features are available such as footings, headings, sorting and conditional reporting. These report formats are stored easily and permanently in the database procedures dictionary.
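The labelling rule for the Certificate of Analysis can be expressed as a small state machine: copies printed before the official copy are preliminary, exactly one copy is official, and every later copy is a duplicate. The Python sketch below is an illustration under that reading of the text; the class and method names are assumptions.

```python
class CertificateOfAnalysis:
    """Tracks which label each printed copy of a certificate should carry."""

    def __init__(self, sample_id):
        self.sample_id = sample_id
        self.official_issued = False

    def print_copy(self, official=False):
        """Return the label for the copy being printed."""
        if official and not self.official_issued:
            # The first official request produces the single official copy.
            self.official_issued = True
            return "OFFICIAL"
        # Any other copy is labelled by its position relative to the official one.
        return "DUPLICATE" if self.official_issued else "PRELIMINARY"
```

A second request for an official copy is deliberately answered with a duplicate, so that only one unlabelled official document can ever exist.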

Data Networking

Laboratory instruments are interfaced to the computer by communication loops, each up to twelve thousand feet long and supporting fifteen instruments. All instruments can acquire data simultaneously with all laboratory management functions without sacrificing terminal response times.

The system can be configured to run in a dual computer environment when system up time cannot be sacrificed even for preventive maintenance servicing. This configuration allows two computers to access a single database at the same time, providing total system reliability. The shut down of one CPU, for any reason, results in the second CPU assuming all essential operations of the first, including instrument data acquisition, test data input and report generation. The Lab Manager system can be initially installed as a dual configuration or may be upgraded to this capability at some time in the future. Upgrading to a dual computer configuration is a simple and effective means to increase responsiveness while virtually eliminating disruption to laboratory operations by computer down time.

All data processed by the system is easily communicated to remote computers either by the industry standard 2780 Remote Job Entry protocol or the virtual terminal facility available. The RJE facility is designed to communicate with remote computers at speeds up to 9600 baud and is recommended for large data transfer loads. The virtual terminal feature is designed for asynchronous intercomputer communication and provides the capability to access remote databases such as CAS On-Line, Toxline and Medline. Data stored on the local system can be sent to remote computers by either facility.

Summary

The Lab Manager system is an efficient and comprehensive laboratory data management computer system designed and implemented specifically to accommodate the unique operations of diverse laboratories. The configurability and adaptability of the Lab Manager system promote productivity, allow systems meeting individual laboratory needs to be created rapidly at low cost and permit the system to adapt inexpensively and expeditiously to evolving laboratory operations.

The cost of total system maintenance and enhancement is currently shared by over one hundred laboratories, reducing the cost and raising the overall system quality as compared to in-house efforts. As software life cycle experience has shown time and again, such a package is superior to a custom written system in long term support and acquisition cost.

RECEIVED May 21, 1984


4 Applying Database Management in the Analytical Chemistry Laboratory FRED BAUMANN, KENNETH A. LEWIS, and ARTHUR C. BROWN III

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ch004

Varian Instrument Group, Walnut Creek, CA 94598

A general purpose, CODASYL compliant database management system is used to implement the Varian/Digital VAX Laboratory Information Management System (LIMS). The VAX LIMS runs under the VMS operating system and is compatible with the VAX family of 32-bit superminicomputers. Database utilities provided by the VAX Database Management System were extensively applied to implement many laboratory-imposed requirements. Records and set relationships were developed to meet the specific needs of the analytical environment. Ordinary programming languages are used along with the database utilities to retrieve, analyze and report data. Datatrieve, a high level database query and reporting language, is optionally available. A number of data integrity and security features are built into the system. Modification and extension of the database is possible at several levels depending on the complexity of the change and ability of the user.

Databases are used widely in commercial applications and have become the foundation of modern data processing. Various bibliographic, financial and chemical reference databases are perhaps the most familiar to scientists at this time. However, the proliferation of Laboratory Information Management Systems (LIMS) makes analytical laboratory databases accessible to most laboratory personnel. Such databases store analytical data and scientific information from which a variety of documents and reports are generated. Analytical database design and implementation are important to the analytical chemist for several reasons:

1. The explosive growth in the amount of laboratory data;
2. The need to enhance laboratory consistency and productivity;
3. The need to share data among laboratory workers;
4. The increasing importance of data security and integrity;
5. The widening scope of laboratory automation from instruments to data management offers both opportunity and challenge to the way data is handled in a laboratory.

0097-6156/84/0265-0023$06.00/0 © 1984 American Chemical Society


The interest in LIMS is directly due to the need to manage the increasing amounts of data generated by the modern analytical laboratory. LIMS systems are used in quality control and analytical services laboratories within the petroleum, petrochemical, chemical, pharmaceutical industries and others, where intelligent, automatic instruments generate large amounts of data. The laboratory must process, correlate, report and store these data securely for long periods of time.

The operating environment of an analytical laboratory involves analytical chemists and technicians generating data both automatically using instruments, as well as by manual techniques. The LIMS acquires data in several forms before transforming it finally into desired information. The LIMS may also manage data associated with products, processes, pilot plants, animal studies, toxicological studies and environmental monitoring. The laboratory manager needs records on productivity, performance, customers, accounting, personnel and inventory. This complex laboratory environment must be reflected in the database structure and consequently in the LIMS design.

The research chemist also has need for a LIMS system to store the vast amounts of analytical and other data generated in research projects. A systematic way of handling such data makes it easier to retrieve, transform and report the acquired data.

In addition to handling large amounts of data generated automatically, the LIMS database must handle data from a number of data sources: instruments, terminals, personal work stations, and other computers. Not only does data exist in several forms but textual information such as header records, comments, reports and other documents must be accommodated.

There exist well-defined relationships among the various data types in the laboratory. The data set relationships must be carefully considered in designing the database. All data in the LIMS must be accessible by key fields such as sample number, method, instrument I.D. or laboratory. It is also necessary to support access of the stored data by ad hoc queries to extract information for correlations, summaries, retrospective studies and special reports.

Additional LIMS functions must include archiving of data, test procedures and other information necessary to meet Good Manufacturing Practices (GMP) and Good Laboratory Practices (GLP) guidelines of government agencies such as FDA and EPA. Security protection must be provided for these reasons and also to limit access to sensitive information. These requirements are stringent but not beyond the capabilities of modern database management systems.

A database can be described as a collection of inter-related data organized into records and connected by known (set) relationships. Typically, a database is organized around a function such as personnel, manufacturing, etc. A LIMS database is organized around the analytical and research laboratory. Good database design involves several well established principles: (1)

1. Data organization and storage is independent of application programs. By insulating the programs from the organization and storage of data, the users can concentrate on the "meaning" of the data instead of the physical characteristics and location of the data. Several views (subschemas) of a database are presented to the outside world depending on involvement with the database.


The non-expert user needs to view only a subset of the records, fields and sets in a full LIMS database. This is the view provided to the scientist in a turn-key LIMS system. Programmers see the database through subschemas specific to the application. At a different level, the database administrator views the complete database through the schema. Finally, the physical layout of the records is viewed by the systems programmer and the database administrator as the storage schema. These views are functional and are dependent upon the specific level of involvement with the database.

2. Data redundancy is minimized. Data redundancy is kept to a minimum by normalizing data into simple data sets which can then point to related data sets. This saves disk storage space and speeds up storage and modification operations.

3. Database schemas are centrally stored and controlled. Data definitions (schema) are stored in the centralized data dictionary. The user's view(s) of the database is defined and stored in the same data dictionary. Programs are given access to individual data fields, records, sets and areas of the database on a need-to-know basis. The database administrator creates and maintains integrity of the database schemas. The benefits of this approach are:

A. Adjustment (tuning) of the database may be performed outside of the application programs.
B. Programs deal with data logically rather than physically, simplifying the programming task.
C. The database may be modified without affecting the application programs. Only those programs affected by the schema changes need to be recompiled.
D. Database integrity is maintained in a multi-user environment through the centralized data dictionary.

4. Security protection is provided to assure data integrity. Database access is controlled to prevent unauthorized user access (for example, to sensitive areas) and to prevent unauthorized operations (for example, delete a record).

The remainder of this paper will discuss the Varian/Digital VAX LIMS and the way laboratory requirements are fulfilled using a database management system.

VAX LIMS

VAX LIMS Functions. A functional diagram of the VAX LIMS is shown in Figure 1. The LIMS database is organized into two portions according to function. The Data Management portion (DMDB) stores data, methods and other records related to the analytical laboratory. The Sample Management portion (SMDB) stores records pertaining to sample tracking and final results. This report deals specifically with the DMDB although the basic principles apply to both since they use the same VAX Information Architecture.

Instruments and other devices are interfaced to the VAX and the DMDB through the Data Management system. After analysis, final results are transferred to the SMDB for tracking, reporting and archiving. Final results also may be input manually from a terminal. Sample Management contains software for tracking samples and data through the processes of sample collection, login, scheduling, testing, verifying and reporting. Data Analysis modules are programs used to transform and report the data. Laboratory Management consists of a collection of software modules and reports relating to the administration of the laboratory such as performance monitoring, quality control, accounting, inventory and scheduling.

Figure 1. VAX LIMS Functional Diagram

VAX LIMS Architecture. The above modules are application programs layered upon the VAX Information Architecture shown in Figure 2. At the lowest level is the VAX/VMS Operating System. (2) It supports all VAX computers in both real time multitasking and multiuser timesharing environments. The VAX Database Management System (DBMS) is the heart of the LIMS, providing the fundamental data storage and retrieval capabilities used throughout the system. (3) The VAX Common Data Dictionary (CDD) contains record, field and set definitions in the schema, subschema and storage schema. VAX Datatrieve is a nonprocedural query and report writing language for data stored in the LIMS or other database. VAX Forms Management System (FMS) is an interactive tool to develop forms for both the entry and reporting of data, and serves both applications languages and VAX Datatrieve.
Layered upon this VAX Information Architecture are the LIMS modules:

Sample Management (LIMS/SM)
Data Management (LIMS/DM)
Data Analysis Library (LIMS/DA)
Lab Management (LIMS/LM)

VAX DBMS. VAX DBMS is a CODASYL (Conference on Data Systems Languages) compliant, general purpose database management system based on the March, 1981 Working Document of the ANSI Data Definition Language Committee. It supplies utilities to create, maintain and use databases with complex network set relationships. VAX database utilities are summarized in Table I.

Table I. Summary of VAX Database Utilities

UTILITY                               DESCRIPTION

Data Definition Language (DDL)        Used to define the schema, security schema, subschema and storage schema

Dictionary Management Utility (DMU)   Creates, modifies, deletes or reports entities in the CDD

DBMS Operator Utility (DBO)           Used to create, modify, delete, monitor, start and stop, journal, backup, restore, recover or verify a database

Database Query (DBQ)                  Interactive language used to retrieve, update and report data either directly from a terminal or called from BASIC, PASCAL, etc.

Data Manipulation Language (DML)      Data manipulation statements callable by FORTRAN or COBOL


VAX DBMS components and relationships are shown in Figure 3. The database is composed of:

Fields - individual data items
Records - collection of data items
Sets - relationship between records
Areas - physical subdivisions of the database

A schema Data Definition Language (DDL) is provided to define the records, sets and areas in the database. Storage Schema DDL produces the physical description of the database records, sets and areas. A subschema DDL produces a logical subset of the database to provide alternative views of the database for different applications programs. A DDL utility is provided to compile schemas and subschemas. The CDD stores the schema, subschema, storage and security schemas. Security schemas define the actions which users are allowed to perform on the database. Also stored in the CDD are the Datatrieve procedures. The Database Operator utility (DBO) allows databases to be created, modified and deleted. The CDD has a dictionary management utility (DMU) for examining and maintaining the CDD contents.

DBMS access is provided to all VAX languages by means of Data Manipulation Language (DML) for FORTRAN and COBOL, and Database Query Language (DBQ) statements embedded in the program for BASIC, PASCAL and other VAX languages. The DML or DBQ statements are compiled along with the application language source code. Application languages do not access the CDD following compilation. When the compiled program is subsequently executed, DBQ or DML statements request records from or write records to the DBMS. A User Work Area (UWA) is the buffer through which records are transferred to and from the application programs by the Database Control System (DBCS). VAX Datatrieve refers to the data descriptions and user procedures in the CDD at run time. VAX Datatrieve is also callable from application languages.

VAX LIMS/DM System. The LIMS/DM system interfaces instruments, data systems and other devices to the VAX LIMS DMDB via the Instrument Network Architecture (INA). Instruments are interfaced by storing their communications protocols and data characteristics in records within the LIMS database. The International Standards Organization's seven layer open network architecture is used to separate instrument interface problems into layers. Flexibility and simplicity are introduced since each layer deals with a simple function. The upper layers deal with the user application program. The middle layers are concerned with routing messages between user applications and the instrument on the system. The lower layers deal with the physical routing of messages between devices in the system. In the LIMS/DM, these functions are performed by I/O servers and I/O device drivers.
In distributed environments, DECnet can be used for transparent communications between applications running on multiple VAX's or PDP-11's, and can be used within INA for instrument interfacing.

Figure 2. VAX LIMS Architecture (application modules: Data Analysis Library, Lab Management, Data Management, Sample Management; beneath them: languages (BASIC, FORTRAN, PASCAL), VAX FMS forms and VAX Datatrieve query and reporting; VAX CDD data dictionary; VAX DBMS CODASYL database; VAX/VMS operating system)

Figure 3. VAX DBMS Components and Relationships (CDD contents: schema, subschema, storage schema, Datatrieve procedures, security schema)

Figure 4. VAX LIMS DMDB Schema Diagram (records shown: Sample, Method, I/O Device, Instrument, Worklist, Analysis, Run, Run Parameter and Result, with SYSTEM-owned entry points)

VAX LIMS DMDB. The key to good database design is the definition of records and the set relationships between them. The VAX DMDB schema (Bachman diagram) is shown in Figure 4. The diagram shows the major records (boxes) in the database and the relationships (arrows) between the records (sets). The records and their fields are determined by the nature of the data encountered in an analytical laboratory environment. The set relationships are determined by how accesses to the database will be handled. A direct set relationship between two records is established when a logical connection exists between them. The linkage facilitates inter-record types of queries. For example, it is easy to retrieve all analyses for a sample because there is a direct set relationship between the Sample Record and the Analysis Record. From the Analysis Record all run information for the sample can be directly retrieved and from the Run Record all results for the sample can be found. Using the database to store results for later retrieval by sample number is one of the fundamental uses of a database in the analytical chemistry laboratory.

Another important set relationship is the Instrument to Analysis Record, which allows all analyses for an instrument to be easily retrieved. Since the Sample Record and Instrument Record are owners of the Analysis Record, retrieval of all analysis information for designated samples and instruments can be readily accomplished.

Those records which have SYSTEM as an owner can be accessed directly without prior knowledge of their relation to other records. For example, given a worklist name, the worklist record can be accessed directly without knowing which instrument or test method it is related to.
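The owner and member navigation described above can be mimicked with plain Python objects. This is an illustrative model only, not VAX DBMS DML: each owner record holds a list of its member records, an Analysis is connected into two sets (one owned by a Sample, one by an Instrument), and the class and function names are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Result:
    component: str
    value: float


@dataclass
class Run:
    run_number: int
    results: List[Result] = field(default_factory=list)  # Run owns Results


@dataclass
class Analysis:
    method_id: str
    runs: List[Run] = field(default_factory=list)  # Analysis owns Runs


@dataclass
class Sample:
    sample_id: str
    analyses: List[Analysis] = field(default_factory=list)  # Sample owns Analyses


@dataclass
class Instrument:
    instrument_id: str
    analyses: List[Analysis] = field(default_factory=list)  # Instrument owns Analyses


def connect(analysis, sample, instrument):
    """Make one Analysis a member of both owner sets."""
    sample.analyses.append(analysis)
    instrument.analyses.append(analysis)


def results_for_sample(sample):
    """Walk Sample -> Analysis -> Run -> Result, as in the retrieval example."""
    return [
        (a.method_id, r.run_number, res.component, res.value)
        for a in sample.analyses
        for r in a.runs
        for res in r.results
    ]
```

Because the same Analysis object sits in both owner lists, starting from an Instrument and walking `instrument.analyses` reaches the same runs and results without any search by key.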
It is extremely important to design the database with the dynamics of the laboratory environment in mind. The user must be involved to ensure the set relationships will permit the necessary questions to be asked. Poorly defined records and relationships will result in awkward programming, poor performance and, in some cases, a nonfunctional system. A few of the records are explained below.

The Method Record contains information about the analytical procedures used with instruments interfaced to LIMS. Fields include:
    Method I.D.
    Method version
    Collection procedure
    Sample storage procedure
    Sample preparation procedure
    Analysis procedure
    Calculation procedure
    Report procedure
    Sample disposition procedure
    Test components
    High, low limits for expected test results
A one-to-many relationship exists from the Method Record to the Analysis Record since one Method generally is used for the analysis of many samples.

The I/O Device Record contains information about specific characteristics of equipment interfaced to LIMS. Fields include:
    I/O device number
    I/O port I.D.
    Baud rate
    Number of data bits
    Number of start and stop bits
    Time out period
    Parity
    Error detection technique
These fields are used to set up the I/O drivers and I/O servers in the LIMS/DM module.

The Analysis Record contains descriptive information for an analysis to be run on an instrument. Fields include:
    Sample I.D.
    Aliquot I.D.
    Parent aliquot
    In date
    Approval date
    Source type
    Worklist assignment
    Analysis priority
    Analyst name
A one-to-many set relationship exists to the Run Record since a sample may be analyzed several times.

The Run Record describes conditions which occurred during the run and comments added by the operator. Fields for the Run Record include:
    Run date
    Run number
    Instrument file name
    Instrument file type
    Instrument operator I.D.

The Run Parameter Record includes descriptive information about the run. Fields for a chromatographic run include:
    Title
    Total area
    Remote program name
    Drift, noise, offset
    Autosampler rack and vial numbers
    Injection number
    Error messages
    Instrument condition
    Notes
    Area or height flag
    Calculation
    Number of peaks
    Number of unidentified peaks
    Weight of sample
    Weight of internal standard

The Result Record contains fields for blocks of data for a given sample and run.
For a chromatographic run, result data consist of a series of data records for each peak:
    Peak name
    Peak result
    Retention time
    Peak offset
    Peak height or area
    Relative retention time
    Separation code
    Peak width
Raw or intermediate data such as digitized signals or area slices from chromatographs consist of one or two dimensional arrays of
floating point numbers. The Result Record used to store data from chromatographs and spectrophotometers can be extended to other instruments which produce n-dimensional data by storing the points by columns. Separating descriptive information about the run into the Run and Run Parameter Records allows data to be stored from a variety of instrument types.
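As a rough illustration of "storing the points by columns", the following Python sketch turns a list of n-dimensional points into n arrays of floating point numbers and back again. The function names are hypothetical and the actual on-disk layout used by the Result Record is not described in the text, so this shows only the general idea.

```python
from array import array


def to_columns(points):
    """Convert rows of n-dimensional points into n columns of doubles."""
    if not points:
        return []
    width = len(points[0])
    if any(len(p) != width for p in points):
        raise ValueError("all points must have the same dimension")
    # One array of floating point numbers per dimension.
    return [array("d", (p[i] for p in points)) for i in range(width)]


def from_columns(columns):
    """Rebuild the original points from the stored columns."""
    return [tuple(values) for values in zip(*columns)]


# Example: (time, signal) pairs from a hypothetical digitized trace.
trace = [(0.0, 1.5), (0.1, 2.5), (0.2, 2.0)]
columns = to_columns(trace)
```

Storing each dimension as its own array means a two-column chromatographic trace and a higher-dimensional data set differ only in the number of columns kept, which is the kind of extension the text describes.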

Using The Database In The LIMS Environment

Information Retrieval and Reporting. Ad hoc retrieval and reporting of data using Datatrieve and other VAX languages is an important feature of LIMS as it is impossible to foresee all the future requirements for reports. The Database Query Language utility (DBQ) is used to retrieve, update and report data from compiled BASIC, PASCAL or other VAX languages. Data Manipulation Language (DML) is used by FORTRAN to access the data.

VAX Datatrieve is a high level database query and reporting language with data manipulation and graphics capability. It is a non-procedural language intended for both the non-programmer and programmer. A simple set of commands is used interactively and is also callable from other languages. Guide mode can be used by beginners to learn how to navigate the database. Remote databases on other VAX's also can be accessed through DECnet. VAX DBMS is dictionary-oriented and all data descriptions and Datatrieve procedures are stored in the VAX CDD.

VAX Datatrieve is ideal for ad hoc queries and low volume data manipulations. While execution time is longer than for compiled application languages, a trade-off needs to be made between the execution time, the cost of writing the program in a traditional, compiled language and the frequency of running the program.
A Datatrieve report for peak data would be obtained as follows:

    FOR ANALYSIS where sample ID EQ "123"
        FOR RUN WITHIN ANALYSIS_RUN
            FOR RESULT_DATA WITHIN RUN_RESULT
                PRINT RUN_RESULT

The resulting report is shown below:

    PEAK NAME    PEAK RESULT    RETENTION TIME
    Peak 1       123            1.0
    Peak n       456            2.0

VAX FMS provides forms management capability for application languages and VAX Datatrieve. Forms are defined interactively at a terminal and stored in the FMS forms library independent of data and programs. VAX Datatrieve and FMS, used with VAX DBMS, provide the capability to input, retrieve, modify and report data easily and quickly.
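For readers unfamiliar with Datatrieve, the nested FOR statements in the peak report above behave much like nested loops over owner and member records. The Python sketch below is an analogy only: the dictionary layout, the field names and the `peak_report` function are assumptions made for illustration, and the data values are the placeholder values from the sample report.

```python
# Hypothetical in-memory stand-in for Analysis -> Run -> Result records.
ANALYSES = [
    {
        "sample_id": "123",
        "runs": [
            {
                "results": [
                    {"peak_name": "Peak 1", "peak_result": 123, "retention_time": 1.0},
                    {"peak_name": "Peak n", "peak_result": 456, "retention_time": 2.0},
                ]
            }
        ],
    }
]


def peak_report(analyses, sample_id):
    """Build a plain-text peak report for one sample."""
    lines = [f"{'PEAK NAME':<12}{'PEAK RESULT':>12}{'RETENTION TIME':>16}"]
    for analysis in analyses:                 # FOR ANALYSIS ... EQ sample_id
        if analysis["sample_id"] != sample_id:
            continue
        for run in analysis["runs"]:          # FOR RUN WITHIN ANALYSIS_RUN
            for peak in run["results"]:       # FOR RESULT_DATA WITHIN RUN_RESULT
                lines.append(
                    f"{peak['peak_name']:<12}"
                    f"{peak['peak_result']:>12}"
                    f"{peak['retention_time']:>16.1f}"
                )
    return "\n".join(lines)


print(peak_report(ANALYSES, "123"))
```

The compiled-language equivalent needs the loops, the selection test and the formatting written out by hand, which is the programming cost the text weighs against Datatrieve's slower execution.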

Database Integrity. Integrity of the database must be assured particularly if the data are to be used to meet government regulations or used as legal evidence. Several things can be done to secure the data against user errors and hardware or software failures. Journaling is the writing of all before and after images of modifications of the database to a journal file as well as to the database file. The journal device should be a device other than that used to store the database in case of failure. Database Operator utilities (DBO) are provided to specify the after image journal device (DBO/AFTER-JOURNAL), make backup copies of the database (DBO/BACKUP), restore the corrupted database with the backup (DBO/RESTORE), and reapply all changes since the last backup from the after-image journal to the backup database (DBO/RECOVER).

Archive and Retrieve Records. The VAX LIMS/DM provides utilities for archiving and retrieving old data. To archive, the user selects the sample I.D.'s to be archived. The LIMS/DM ARCHIVE extracts the selected Analysis, Run and Result Records and stores them on tape or disk, optionally deleting them from the database. The tape or disks can then be stored off-site or in a vault. To retrieve data from the archive, the user invokes the LIMS/DM RETRIEVE utility which reloads the data in the database. The user selects the sample I.D.'s to be retrieved and writes these to the LIMS/DMDB. The user can now access and use these records in the normal manner.
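The restore-then-recover sequence described under Database Integrity can be sketched as follows. This is a conceptual model in Python, not the behavior of the DBO utilities themselves: the `recover` function, the journal entry layout (`time`, `key`, `after`) and the use of a dictionary as the "database" are all assumptions made for illustration.

```python
import copy


def recover(backup, journal, up_to=None):
    """Restore a backup copy, then reapply after-images from the journal.

    `journal` is assumed to be ordered by time. If `up_to` is given,
    entries later than that time are not reapplied.
    """
    database = copy.deepcopy(backup)          # the "restore" step
    for entry in journal:                     # the "recover" step
        if up_to is not None and entry["time"] > up_to:
            break
        database[entry["key"]] = copy.deepcopy(entry["after"])
    return database


backup = {"S1": {"status": "logged"}}
journal = [
    {"time": 1, "key": "S1", "after": {"status": "analyzed"}},
    {"time": 2, "key": "S2", "after": {"status": "logged"}},
]

latest = recover(backup, journal)             # everything since the backup
as_of_t1 = recover(backup, journal, up_to=1)  # stop at an earlier point in time
```

Keeping the journal on a different device from the database matters here because recovery needs both the backup and the journal to survive the failure that damaged the live database.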

Database Security. Database security is maintained by limiting access to the database to authorized users. Several methods are provided by the VAX DBMS: (1) segmenting the database into areas and restricting access by the appropriate level of VAX/VMS file security; (2) subschemas to limit users to those sets, records and fields which they need; (3) a security schema which limits users' access to the database, and also defines the transactions which they can perform; (4) the VAX CDD restricts access to data descriptions stored in the dictionary. Each user is granted access privileges according to their needs. Some need only READ access to the data for writing reports, others require READ and WRITE privileges. Terminals also have restricted privileges. A terminal located in a public area may be granted READ only access, for example. Off-site dialup terminals may be restricted to use during certain hours. Unattended terminals may be automatically logged out after a time out period has elapsed.

Audit Trails. Audit trails are intrinsic in the design of the VAX LIMS/DMDB. Records have creation dates, name of creator, and comments on why the change was made. No data is over-written, changed or deleted in place; rather, if a change is to be made to the data, the old record is marked as having been superseded (not deleted or modified). The new record contains all the data from the old record along with any modifications, an indication of why the changes were made and who made the changes. This process allows an audit trail to be produced, sorted by sample I.D., aliquot and test method, like any other report within the normal context of the LIMS system. The advantage of this process is that the audit trail, along with all the other data within the LIMS/DMDB, is maintained
and secured by the above-described access control mechanism of the VAX DBMS. Thus, only users with proper access privileges can change the data and then only by copying and modifying data without changing old data (extend access). The journaling facility maintains the integrity of the audit trail as well. The auditor can roll back the database to the last backup and see the transactions reapplied up to any point in time.

Modification and Extensions. The analytical chemistry laboratory is a dynamic environment. New processes, new tests, increased sample volume, government regulations, etc., all contribute to the continual change taking place. Improved computer systems, peripherals and software are continually appearing and must be accommodated. The whole concept of LIMS and laboratory automation is new and rapidly evolving. Without the capability to extend and modify a LIMS, a once state-of-the-art system will rapidly become obsolete. The VAX LIMS is considered to be a basic system which can be modified and extended to meet specific requirements. The changes can be made at various user levels corresponding to the view of the database.

1. Cosmetic changes to the input screens using the VAX FMS editor do not change the database but only the display-only terms appearing on the screen. Fields may be broken up into subfields using commas, dashes, slashes, etc., to make them easier to read. Fields can also be surrounded by boxes, set in reverse video (black on white), underlined, colored, made to blink, made double height, or displayed in bold. Such changes make the forms easier to use and friendlier.

2. Comment fields, generic test result records and parameter records are included in the database. These fields and records can be used without recompiling the database to store data, instrument parameters and comments which were not explicitly defined in the original database. The actual format and use of the data and parameters is determined by the application programs which use them.

3. The database subschema entities such as set names, record names, or field names can be renamed without changing the data or the set relationships by means of the ALIAS feature in VAX DBMS. This is a useful feature for renaming fields, records and sets to suit a particular laboratory environment. For example, the term 'sample' or 'specimen' may be preferred, depending upon whether the laboratory is in an industrial or hospital environment. ALIAS is also used to create LIMS subschemas in foreign languages. This is done by making a copy of the subschema using the DBO/EXTRACT utility, adding the ALIAS entry, and compiling the new subschema using the DDL/COMPILE utility and DBO/MODIFY utility. Only those programs using the ALIAS need to be modified and recompiled. The database does not need to be rebuilt.

4. Although more complicated, the schema may be edited to add new records or fields, or to create new set relationships within the LIMS database. This function is usually done by the database administrator, a systems programmer or system manager in the role of database administrator. VAX DBMS
   provides a Database Administration Manual for this purpose (4). The VAX database utilities summarized in Table I provide facilities to develop schemas, create CDD directories, and create, modify and use the database. After the schema and storage schema have been modified using the text editor to add the new fields, records, or sets, they are compiled using the DDL/COMPILE utility. Next, the database is modified using the DBO/MODIFY utility. Old subschemas are still operative and old application programs which do not use the new fields, records or sets may still be used. New subschemas are created to use the new entities and these are used by new application programs. One cannot delete or modify anything from the old schemas or subschemas or add new areas. This allows old programs using old subschemas to continue to run without recompiling and rebuilding the database.

The LIMS database is amenable to changes in application modules such as instrument interfacing, data analysis and laboratory management. The independence of the data from the programs is a major advantage of database systems. A carefully designed system with built-in tools and utilities to provide easy modification and extension is the best solution for a system configurable to the user's environment and capable of fulfilling future needs.
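As one illustration of the append-only approach described under Audit Trails earlier in this chapter, the Python sketch below never edits a stored value in place: a change marks the old version as superseded and appends a new version carrying the author and the reason. The class and field names are hypothetical and are not taken from the LIMS/DMDB schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ResultVersion:
    sample_id: str
    value: float
    created_by: str
    created_at: datetime
    reason: str
    superseded: bool = False


class AuditedResults:
    """Append-only store: old versions are flagged, never removed."""

    def __init__(self) -> None:
        self._versions: List[ResultVersion] = []

    def record(self, sample_id: str, value: float, user: str,
               reason: str = "initial entry") -> ResultVersion:
        # Flag the current version instead of overwriting its data.
        for version in self._versions:
            if version.sample_id == sample_id and not version.superseded:
                version.superseded = True
        new = ResultVersion(sample_id, value, user,
                            datetime.now(timezone.utc), reason)
        self._versions.append(new)
        return new

    def current(self, sample_id: str) -> Optional[ResultVersion]:
        for version in reversed(self._versions):
            if version.sample_id == sample_id and not version.superseded:
                return version
        return None

    def audit_trail(self, sample_id: str) -> List[ResultVersion]:
        # Every version, oldest first, is the audit trail for the sample.
        return [v for v in self._versions if v.sample_id == sample_id]


store = AuditedResults()
store.record("123", 4.2, "analyst_a")
store.record("123", 4.5, "analyst_b", reason="recalibrated")
```

Because the history lives in the same records as the data, producing the audit trail is an ordinary query rather than a separate logging mechanism.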

Literature Cited

1. Martin, James. In "Computer Database Organization"; Prentice-Hall, Inc.: Englewood Cliffs, New Jersey, 1975.
2. "VAX Technical Summary", Digital Equipment Corporation, 1982.
3. "VAX-11 DBMS", Digital Equipment Corporation, August, 1982; Vol. 1-3.
4. Ibid., "Database Administration Manual", Vol. 1 ADJ966A-TI.

The following are trademarks of Digital Equipment Corporation: Datatrieve, DECnet, FMS, VAX, VMS. INA is a trademark of Varian Associates, Inc.

RECEIVED June 5, 1984

5 Network and Communications

DOUGLAS ST. CLAIR

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ch005

Digital Equipment Corporation, Stow, MA 01775

Frequently the data communications solution is addressed independently of other organizational communications needs. The knowledgeable user considers Data Processing Networks a PART of the communications problem, and rightly so. The entire communications for the organization consists of voice, FAX, written, etc. A large number of new services are becoming available on the "computer network" that augment or replace existing systems. Interoffice mail, for example, has an analogue in the Electronic Mail systems available on computers.

This paper will address various aspects of computer networks independently of the total communications solution for an organization. But thought should be given to integration of all communications in your organization based on available resources and your unique needs.

There are two primary definitions for the term Network. One definition for a network describes a few computers with lots of terminals. Such installations frequently have grown from a batch environment with a large central mainframe computer to which terminals have been added. The term data communications is used to describe terminal (or terminal-like) communications between the terminal and host. This is perhaps the most mature communications area. It was developed initially to support Teletype equipment over telephone lines. Once computers began to support terminals this technology was a natural adaptation.

The second, more modern definition describes a Network comprised of more computers tied together, with relatively few terminals on each computer. This second form of network became a reality with the minicomputer revolution and was spurred by acceptance of Digital's PDP-8 and PDP-11 minicomputers in the 1960s and development of Digital's Network Architecture (DNA) in the 1970s.

In order to clarify the distinction, let's call the first form of "network" Data Communications and the second type a Digital Computer Network.

0097-6156/84/0265-0037$06.00/0 © 1984 American Chemical Society

A primary consideration for networking is resource sharing. The network is clearly becoming more and more like what was formerly the system. The various functions once located within a single system (mass storage, computation, printing, etc.) are now being shared and called file servers, compute servers, print servers, etc. If resources are cheap then they can be replicated. If they are expensive, resources must be shared. This law of economics dictates putting computing power at the desk and sharing printing and storage.

The need for storage is not related to computing power. Laboratory problems with trivial computational need can acquire large amounts of data. Storage costs are not as amenable to price reductions as CPU power. Therefore the network should offer the economy of large disks as a shared resource. An efficient networking scheme would also allow the user to either move the entire file over the network or only the records of interest from the file. The second approach, moving records, is apt to be preferred because it minimizes total storage requirements (only one copy of the file in the net) and simplifies procedures since the one copy is updated and there is no question about getting an "old" copy.

Level 1

Central. This highest level provides resources whose cost is realistic only when the benefits are spread across the entire organization. It is also the level at which information which is of greatest sensitivity should reside.

Level 2

Division. This level merges the resources necessary to support two or more departments. The emphasis is moving toward more managerial functions. However, it is also the logical level for resources whose benefits are such that the cost is only realistic when spread across all lower levels.

Level 3

Laboratory. This level provides support to more than one group. It is assumed that data collected by several groups within the department is necessarily related and therefore resources at this level support collating and analyzing data from several groups. In addition, this is the lowest level at which administrative, financial, and management functions reside.

Level 4

Department. A department is large enough to contain more than one group, something on the order of seven members as a practical minimum. The department is the level at which administration and management become required.

Level 5

Group. The group is the smallest unit that can function with essentially no administrative overhead. Groups therefore consist of approximately 3 members. People at this level are implementors. They perform the tasks associated with collecting raw data and laboratory analysis. The only function likely to bridge group and department would be clerical.

Project or Task. This is the smallest functional unit in the organization. One person could be responsible for several projects.

There are three primary measures of performance for a communications channel. They are distance, speed, and cost. Speed and distance are inversely related to one another. That is, you can have speed or distance but not both. Cost is directly related to both. That is to say, if you want either speed or distance you are going to pay for it. Unfortunately the relationship is not linear and increases in speed result in dramatic increases in cost. The cost-distance relationship is not quite so dramatically non-linear.

Laboratory Front Ends

The laboratory front end device looks conspicuously like a personal computer to a network. In fact there are other interesting parallels between the growth of terminals to personal computers and the evolution of laboratory front ends. At this time the most common interfaces are RS-232, RS-422, IEEE-488, and parallel interfaces. Many of these are clearly adaptations of terminal interfaces, made to suit these devices to "automatic data entry". However, these interfaces were never designed to allow for Networking. They were intended to make the devices look like dumb terminals to the host. But the advent of microprocessors and their implementation in laboratory equipment clearly indicates that their direction continues to parallel the path followed by the terminal toward a personal computer. They will very quickly require the same services that personal computers require. The next generation of Lab devices will have increased mass storage to utilize more fully the intelligence buried inside.

Personal Computers (Disk Based Systems)

A significant difference in communications requirements is expected when storage media are distributed along with personal computer systems. These configurations introduce a requirement for networking (host to host) as opposed to data communications (terminal to host).

File Transfer And Task To Task Communications

Personal computers will require file transfer and task to task communications traffic from the network. The characteristics of networking are therefore marked by considerable increases in both the speed and quantity of data to transfer. In addition, it is not the speed and power of the computers but the size of the storage at the ends of the network link that dictate networking requirements.

Mass Storage Devices In Laboratory Systems

The laboratory front end manufacturers are considering the inclusion of mass storage devices in laboratory instruments. The backup of these devices is generally assumed to take place from Winchester disks to floppy diskettes or streaming tapes. A cop-out solution is to add a second Winchester and produce a shadow volume and hope both disks don't go belly up at the same time. However, a communications link to a larger host is clearly a vastly superior solution to any of the local mass storage approaches (floppies, streaming tapes, shadow volumes).

Problems Associated With Local Storage

In the case of floppy diskettes the backup procedure is clearly cumbersome. Archival storage on streaming tape is an unknown quantity. It is necessary to handle 1/2 inch tape every 18 to 24 months to avoid print-through and mechanical problems. It is not clear that streaming tapes are immune to the same problems, and they will probably require the same type of handling. This handling is at this time not supported either by equipment, knowledge, or procedures for the streaming tape media. Local backup of Winchester disks to mass media also offers the prospect of incredible labor costs with the proposed proliferation of these devices.

A Networked Solution To Distributed Storage

If the appropriate communications link existed, then there is tremendous potential for its application in a large number of these Winchester-based systems.

Networking Effects on Data Manipulation

A digital computer network allows the users to place an impressive amount of computer power anywhere. In the research environment this allows the researcher to control the experiment, verify the data, and extract and record it at the experiment site. The dramatic reduction in the cost of data manipulation allows one to put many computers wherever they are needed.


Networking Effects On Data Access

The researcher can store the data at the point of capture or send the information over a network to a larger department-level machine. Since the cost of storage is considerable relative to the cost of computation, every attempt should be made to minimize this cost. The cost of networking tends to take advantage of the same technology as data manipulation; therefore the transport of information over the digital computer network to larger, more economical storage on department-level machines makes sense. In addition, the use of a small machine on site allows the amount of data to be reduced by selecting the data before transmission. Also, use of the department's machine to centralize the costs of backup and restoration of data prevents errors caused by a scientist trying to do a computer operator's job.

Wide Area Networks Characteristics (Distances Greater Than 3000 Meters)

Wide area networks are defined as those utilizing the services provided when the user does not control the channel. For example, few users can purchase the rights to install a wire from a facility in Boston, MA to connect a facility in St. Louis, MO. The same technology is generally used if you need to communicate across town or across the street: you buy it as a service. The divestiture of AT&T may produce a step function in long lines costs in the near future. The costs of wide area network services from the telephone company have caused a number of large users to stop using these services and buy satellite communications to replace long lines, and broadband networks to replace the service within a campus-like facility. Cost of these services is regulated and set by tariff. Recently the operating company in Oklahoma requested a rate increase for residential telephone service attaching personal computers to public data base services. The cost of connection would increase the basic rate to approximately 5 times the voice-only rate, although there is no apparent change in the actual service being provided. A similar rate filing is being made in Texas.

Local Area Networks

These networks are limited to distances greater than 45 meters and less than 3000 meters. Examples of the techniques employed include PABX, Ethernet, broadband, fiber optics, and microwave.

Clusters

This distance, less than 45 meters, is roughly 3 times the distance minicomputer internal busses ran in the 1960s. A bus is the name for the communications channel inside the machine. Data rate for the Ethernet is 10 million bits per second.
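As a rough illustration of what a 10 million bit per second channel means for moving local storage to a host, the arithmetic can be sketched as follows. The 10-megabyte disk size and the function name are assumptions chosen for illustration, and the calculation ignores protocol overhead and contention, so a real transfer would take longer.

```python
def transfer_seconds(size_bytes: int, bits_per_second: float) -> float:
    """Ideal time to move size_bytes over a channel, ignoring all overhead."""
    return size_bytes * 8 / bits_per_second


ETHERNET_BPS = 10_000_000      # raw Ethernet data rate quoted in the text
DISK_BYTES = 10 * 1_000_000    # hypothetical 10-megabyte Winchester disk

print(transfer_seconds(DISK_BYTES, ETHERNET_BPS))  # 8.0 seconds at the raw rate
```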

Broadband

Connecting terminals to hosts appears to be the most widely used area for broadband. Broadband allows the same channel to carry voice and video signals. The signal rates for broadband are not impressive for data communications at this time: under 1 million bits per second for most applications.

Fiber Optics

This technology is attractive from a number of standpoints. Everyone would like to see coordination of the material and connectors so that migration in the field could take place as follows. For example, a customer obtains a fiber optic terminal-to-host link, then replaces the terminal with a Professional/Personal Computer. The single Professional/Personal Computer is replaced by a cluster of several Professional/Personal Computers. The cluster expands to contain a machine of the VAX class, which is connected to the Professional/Personal Computers by Ethernet and into a high performance cluster (a very high performance cluster of large machines) via the same fiber optic link. At each stage, from the terminal through the high performance cluster, the same fiber optic cable could handle the traffic. Planning ahead with the right fiber optics material will allow migration such as this to take place.

Historical Development Of Laboratory Communications

The development of laboratory devices has proceeded as follows.

Phase 1 Manual Data Entry. Data is keypunched manually and entered via cards or paper tape to the computer. Later, data is entered via a terminal to the computer.


Phase 2 Analog To Digital In The Host. The laboratory manufacturers follow by connecting their devices directly to the data processing system. These laboratory devices provide an analog signal. The laboratory device relies entirely on the host for intelligence. The host provides both control for an analog to digital (A/D) converter and processing of the data. A moderate improvement in efficiency is achieved by multiplexing several analog signals to a single A/D in the computer.

Phase 3 Digital To Host. A/D conversion moves into the laboratory device and BCD or ASCII digits are sent in serial or parallel to the host. The laboratory device now looks like a terminal. The controller operates in character interrupt mode. Logical extensions of this technology have followed the manner in which we interface terminals. The first interface is a single interface per card, and then multiple interfaces per card. However, both are interrupt driven and considerably load the CPU. The next logical step is a DMA or silo controller. There are "standards": the RS-232-C (in many variations) and the IEEE-488.

Phase 4 Intelligent Lab Devices. Laboratory device manufacturers offer intelligent devices with dedicated CPUs. To do this they have developed software expertise. These laboratory devices rely on CPU manufacturers' hardware and software. The laboratory device manufacturers include a microprocessor chip in the device with sufficient power to do the job that once took a PDP-8. In addition, the machine troubleshoots the process and offers remedial action to the operator. This substantially changes the communications interface. The data moved to the host is substantially pre-processed. The analyzer is now capable of sending "packets" of pre-processed data to the host for more general analysis. Increased storage is also made available. The analytical device now begins to look more like a disk than a terminal in terms of potential data transfer rates.

Phase 5 Multiple CPUs And Intelligent Laboratory Devices On The Network. Computer manufacturers have developed systems with multiple CPUs and multiple operating systems (i.e. a network). The construction of networks comprised completely of intelligent laboratory devices is a real possibility with the availability of the Ethernet specification. Small local area networks of laboratory devices will develop using Ethernet and other technologies. Later these local networks will be interconnected, probably via routers and gateways, to broadband, satellite, and telephone facilities.
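To make the Phase 3 idea concrete, here is a minimal sketch of how a host might turn digits sent by an instrument into a number. The framing (packed BCD with two digits per byte, most significant digit first) and the function names are assumptions for illustration only; instruments used many different formats.

```python
def decode_packed_bcd(data: bytes) -> int:
    """Decode packed BCD (two decimal digits per byte, high nibble first)."""
    value = 0
    for byte in data:
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError(f"invalid BCD byte: {byte:#04x}")
        value = value * 100 + high * 10 + low
    return value


def decode_ascii_digits(data: bytes) -> int:
    """Decode a reading sent as ASCII digit characters, e.g. b"1984"."""
    return int(data.decode("ascii"))


print(decode_packed_bcd(bytes([0x19, 0x84])))  # 1984
print(decode_ascii_digits(b"1984"))            # 1984
```

Either way the host handles the instrument one character at a time, which is why the text compares the device to a terminal.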

Conclusions

The path the data processing manufacturers have taken is being followed by the laboratory device manufacturers. The laboratory devices are becoming increasingly intelligent. This trend implies that the personal computer may have come too late for many instruments. They will use the chips instead. Manufacturers are now integrating chips into their devices and will soon not be satisfied with less than true host-to-host communications capability.


RECEIVED July 13, 1984


6 Introduction to Graphics

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ch006

JOSEPH G. LISCOUSKI Digital Equipment Corporation, 1 Iron Way, P.O. Box 1002, Mail Stop: MRO 2-3/M91, Marlboro, MA 01752

Graphics -- (noun) the science or art of drawing, particularly of mechanical drawing, or of drawing to mathematical rules. (Britannica World Language Dictionary) Computer -- A device capable of accepting information, applying prescribed processes to the information, and supplying the results of these processes. It usually consists of input and output devices, storage, arithmetic, and logical units, and a control unit. (Computer Dictionary, Sippl and Sippl)

Taken together, those two words describe both a body of knowledge and a tool for applying the "rules" to information and manipulating it. In a broader context, computer graphics is an endeavor that not only deals with rules and data, but also encompasses the means of displaying that information and interacting with it.

0097-6156/84/0265-0045$10.50/0 © 1984 American Chemical Society

1.0 GRAPHICS APPLICATIONS

The applications that the field embraces can be grouped into four categories:

o Data Representation - data plotting packages that permit us to visualize the relationships between variables,

o Modeling and Line Drawing - the representation and display of real or imagined objects,

o Image Processing - the display and analysis of real or imagined objects. This would also include the enhancement of information about those objects.

o Document Preparation - combining text processing and graphics.

Over the next few pages, we will present a description of these classifications and the particular hardware and software for their use.

1.1 DATA REPRESENTATION

1.1.1 Graphs And Charts -

Data plotting is one of the most common laboratory and commercial applications of graphics. Most of us became acquainted with the topic in high school algebra and science classes. The basic problem is to pictorially present the relationship of two or more variables (as a species we can grasp pictures much more easily than lists of numbers). This usually involves deciding the best way to view the data (bar charts, pie charts, or line graphs, for example), scaling the figure, adding the data, labeling, and so on. This is a straightforward problem for us, with a pencil and paper, but not so trivial for a machine whose strength lies in manipulating discrete integers.

Let's take a step back and look at the process. Say that we are plotting a simple scatter plot of some X,Y pairs. The first step is to pick a suitable graph paper. There are a lot of them, of different types and scales; someone else has gone through the trouble of working out the rulings, line thickness and such and printed the pattern on a writable (and fortunately erasable) surface. They left some white space on the edges for labels and notes. That's not the case on a CRT (or pen plotter). The best we can usually hope for is that the display surface will have a rectangular coordinate system. The lower left corner will have a particular set of coordinates and so will the upper right corner. Given this situation, the plotting program must allocate the actual plotting region, allowing space for labels, and then set up the grid according to the user's needs. That process is not difficult if we are dealing with linear axes (drawing a box, figuring out where the tick marks should go, what line patterns are needed for major and minor ticks) but can become very complex when non-linear axes are involved. The program now has to take into account scaling functions (to account for the non-linearity) as well as scaling factors (to make sure that the data winds up in the graph paper rather than in the margins).

Next, we deal with the scaling of the paper. Again for us, it is a minor matter of working out a convenient scale. We have so many labeling positions, the data covers a known range, and we can pick some set of labels that are easy to work with; increments of 5 or 10 might work out well for a particular case. Computers don't understand the word "convenient". Given a set of numbers and told to scale them, you might wind up with values (at the minimum and maximum) of .237 and 9.314 with increments of .908 if we have 10 major tick marks. Not very "convenient" if you are trying to read information from the plot. The program needs to have sufficient intelligence to choose an easy to work with set of labels. This is a complex problem, since we first need to agree on the definition of convenient and then add the constraint that it not waste a lot of the viewing surface by picking too broad a range. Adding the data to the scaled plot is the least troublesome aspect of the task and can be handled easily by most plotting programs as well as people.

The problem becomes more complex as we begin working with bar-graphs, pie charts and more involved applications. The point of this exercise is to illustrate that simple graph plotting problems that would not tax a high school student are far from simple for a programmer to anticipate while he is designing a package for general purpose use, when you consider the steps that are involved and taken for granted. This is the basic reason why computer graphics systems, particularly software, are expensive and not easily produced.
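The steps described above (choosing "convenient" labels, then mapping data into the plotting region through a scaling function) can be sketched in a few lines. This is one common heuristic, not the method of any particular package: tick steps are restricted to 1, 2, or 5 times a power of ten. The function names, the screen coordinates, and the margin are assumptions for illustration.

```python
import math


def nice_step(lo, hi, max_intervals=10):
    """Smallest step of the form 1, 2 or 5 x 10^k giving at most max_intervals."""
    if hi <= lo:
        raise ValueError("need hi > lo")
    raw = (hi - lo) / max_intervals
    power = 10.0 ** math.floor(math.log10(raw))
    for mult in (1, 2, 5, 10):
        if mult * power >= raw:
            return mult * power
    return 10 * power  # guard against floating-point edge cases


def nice_axis(lo, hi, max_intervals=10):
    """Tick labels covering [lo, hi], widened to whole multiples of the step."""
    step = nice_step(lo, hi, max_intervals)
    start = math.floor(lo / step) * step
    end = math.ceil(hi / step) * step
    count = round((end - start) / step)
    return [round(start + i * step, 10) for i in range(count + 1)]


def make_mapper(data_lo, data_hi, screen_lo, screen_hi, scale=lambda v: v):
    """Map data values to screen coordinates.

    scale is the axis scaling function: the identity for a linear axis,
    math.log10 for a logarithmic one.
    """
    s_lo, s_hi = scale(data_lo), scale(data_hi)
    factor = (screen_hi - screen_lo) / (s_hi - s_lo)
    return lambda v: screen_lo + (scale(v) - s_lo) * factor


# The awkward range from the text becomes labels 0, 1, 2, ... 10.
print(nice_axis(0.237, 9.314))

# Hypothetical plotting region: x from 60 to 760, leaving a margin for labels.
to_x = make_mapper(0, 10, 60, 760)
print(to_x(5))  # 410.0

# Same region with a logarithmic axis covering 1 to 1000.
to_x_log = make_mapper(1, 1000, 60, 760, scale=math.log10)
print(to_x_log(10))
```

Widening the range to whole steps costs a little of the viewing surface, which is the trade-off the text describes; a real package would also have to decide when that cost is too high.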


The use of data plotting can vary widely, from a researcher trying to display an acquired analog signal to a response surface plot that might be used in a publication or presentation. The figures that follow (figures 1, 2, and 3) are some simple applications produced with one scientific graphics package (Digital Equipment Corporation's VAX-11 RGL).

Figure 1. Polar coordinate grid. Papers of this type can be used to show the radiation pattern of an antenna, or the EMI/RFI emissions for a VAX computer or terminal under test for FCC compliance.

[Figure 2 graphic: chart titled "Year Growth of MEGABUCKS, INC.", plotting Gross Profit and an after-taxes series by quarter (Q-1 to Q-4).]

Figure 2. Different sets of financial data can be more easily understood and compared graphically.

Figure 3. Different viewpoints can be used to show data or descriptive text.


1.1.2 Hardware Requirements -

The display requirements depend on the user's needs and thus can vary over the spectrum of resolution, number of colors, and display type. Interactive users will need a CRT, the only form of an erasable graphics medium. The resolution of the device and its sophistication depend on what is to be shown: simple plots (one or two per screen) can be done on a low to medium resolution device (on the order of 768 x 240 addressable points). As the graphs become more complex or more numerous on a single screen, the need for higher resolution (1024 x 1024) increases, not only because of the need to put up a lot of distinguishable points, but for text used to label and annotate the graphs. Many terminals use an 8 x 10 character cell for their normal size characters. The half size characters (available on some graphics terminals) on those same devices are barely readable. If a lot of text and graphics is to appear on the screen, high resolution is needed simply to have the necessary number of dots available to write small characters that can be understood.

If the user wants to walk away with his graph, he needs some form of hardcopy unit. The nature (raster hardcopy, raster printer, or pen plotter) depends on his use for the copy. A screen copier might be used to produce suitable overheads for an informal presentation, and a pen plotter's output might be better for a speech given to a professional society or funding agency (he may be judged by the quality of his presentation as well as its content).
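The arithmetic behind the resolution argument is simple. As a sketch, assuming characters are laid out on a fixed grid of cells with no extra spacing (an assumption; real terminals differ), the amount of text a screen can hold follows directly from the figures quoted above. The 4 x 5 cell is an assumed reading of "half size".

```python
def text_capacity(width, height, cell_w, cell_h):
    """Columns and rows of whole character cells that fit on the display."""
    return width // cell_w, height // cell_h


print(text_capacity(768, 240, 8, 10))    # (96, 24)
print(text_capacity(1024, 1024, 8, 10))  # (128, 102)
print(text_capacity(768, 240, 4, 5))     # (192, 48) with half size cells
```

Halving the cell quadruples the capacity but leaves only 4 x 5 dots to form each character, which is why such text is barely readable and why the higher resolution device is the better route to dense annotation.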

1.1.3 Other Forms Of Data Representation -

While graphs and charts are the more common forms, there are other approaches to illustrating data, many of which are only practical when generated by a computer. Let's take the pie-chart as a starting point. The segmented circle might be an appropriate format for displaying the proportion of our business that is generated by various market segments, but its utility is limited to only a few variables. Use too many divisions and labeling becomes a problem. You might wind up obscuring information rather than making it clearer. The oil companies solved that problem neatly when they wanted to show the proportions of our total oil imports from various countries in one display. The number of data points is large, so rather than use a pie-chart, they took a map of the world and then, dynamically, distorted the size of each country so that it was proportional to the amount of oil we imported. An example of such a display on another topic can be found on the following page (figure 4, taken from the Boston Globe for March 7th, 1982).

The concept of the bar-chart has also been extended through the use of computer graphics, extensions that are particularly effective when the item being measured is a function of political geography: states and towns. The illustration on the following page (figure 5) comes from Harvard University. The sequence of nine frames shows the population of the United States as a function of time (the height of the contour at any point is proportional to the population in that area). The Laboratory for Computer Graphics and Spatial Analysis (Harvard Graduate School of Design) has a system called ODYSSEY. A combination of data base and graphics display, the ODYSSEY system can represent social and economic statistics as a function of political subdivisions. For example, the amount of agricultural land in Massachusetts (by town) could be shown as a map of the state, with the outline of a town projected above the background by an amount proportional to its agricultural land mass. Color can also be used to perform the same classification, although the segmentation would not be as fine due to limitations on the number of colors available in the printing process. This same set of techniques has been used to show voting patterns.

An "Animation Information Retrieval" package by the same group can be used to show the relationship between multiple variables. One example they cite is the pattern of airline traffic, arrivals and departures from U.S. airports, both as a function of time of day and geography. The animation gives the viewer an easy grasp of the data where more traditional forms of display (including lists of numbers) might leave you a bit bewildered.

The appreciation of multi-variate relationships need not be as involved as it is in the Harvard package. There are some relatively simple techniques for illustrating data. Take a familiar figure, a face or truck for example. Let a copy of a figure represent an individual product line in Digital. The sizes of the noses might be proportional to the amount of revenue obtained from government sources, the ears the university segment and the size of the eyes the industrial income.
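As a sketch of the face idea, each product line's revenue figures can be converted to feature sizes by scaling every variable against its largest value across all product lines. The product line names, the revenue numbers, and the function name are invented purely for illustration, and the drawing itself is left out.

```python
# Which facial feature carries which revenue source (from the text).
FEATURES = {"government": "nose", "university": "ears", "industrial": "eyes"}


def feature_sizes(records, max_size=1.0):
    """Relative feature sizes, proportional to each variable's value.

    records maps a product line name to a dict of revenue by source.
    The largest value of each variable is drawn at max_size.
    """
    peak = {var: max(rec[var] for rec in records.values()) for var in FEATURES}
    sizes = {}
    for name, rec in records.items():
        sizes[name] = {
            feat: (max_size * rec[var] / peak[var]) if peak[var] else 0.0
            for var, feat in FEATURES.items()
        }
    return sizes


# Hypothetical revenue figures, in arbitrary units.
revenue = {
    "Line A": {"government": 40, "university": 10, "industrial": 50},
    "Line B": {"government": 20, "university": 30, "industrial": 25},
}
for line, sizes in feature_sizes(revenue).items():
    print(line, sizes)
```

With these made-up numbers, Line A would be drawn with a large nose and eyes but small ears, and Line B the reverse, so the two faces can be told apart at a glance.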


Figure 5. Population growth in the United States from 1790 - 1970.


1.1.4 Hardware Requirements -

The displays used in the last section are usually raster devices due to the need for a variety of colors or shading options. The resolution of the equipment is high so that the figures and shapes on the screen can be accurately represented without distortion. Some form of hardcopy is almost a given. The type will depend on the nature of the presentation. 35 mm slides might be appropriate (a Matrix system [camera attachment for screen copying] for this purpose could run about $10,000) or computer output on film could be used for animation work. Plotters are useful where the emphasis is on line drawing with little filling of areas, due to the amount of time needed to complete the operation.

1.2 Modeling And Line Drawing

Modeling and Line Drawing applications pretty much cover the map of end users. One common reference is CAD - Computer Aided Design. The uses here range from circuit board layout to automotive and aircraft design. Stone and Webster had a package that was running on the PDP-15s for architectural work. That program allowed you to draw a building or room in three dimensions and walk through it. One demonstration showed an auditorium. With the package you could "stand" in the back and view the stage, or "stand" on the stage and see how the audience would be arranged.

The visualization of molecular structures finds a home in this realm. Using packages such as TRIBBLE (DuPont) or the PROPHET system (Bolt Beranek & Newman, Cambridge, Mass.) one can describe a molecular structure and have the system display the three-dimensional structure as it would appear when viewed from various reference points.

Artificial Intelligence, urban planning and the automotive industry use this form of graphics to represent and view objects. The design of a building complex might be represented in the machine and shown on a color monitor so that the designer could "walk" around the structure and view it from different perspectives.


Advertising is a common use. The CBS and ABC television logos are the result of computer generated graphics. The coloring, shading and dynamics are machine generated, though not in real-time. The display medium is film and the image is drawn with light, one frame at a time. The same approach was used in the past year to advertise an FM radio station. During the time slot, the viewer was taken on a simulation of a night-time car ride. And speaking of simulation, flight simulation is an important one: the landing of a jet on an aircraft carrier, for example.


Regardless of the application, the elements of such a system can be broken down as follows:

> Data entry devices - this is the means of describing the object of interest. It can be a mathematical function, or a digitizer that provides a set of coordinates (two or three dimensions depending on the problem) that describe the object.

> A data base used to store and retrieve the object's description.

> A software package that can be used to generate the display. Depending on the needs of the user, the package might have hidden line removal [the ability to not display lines or surfaces that would not be seen if the model were a solid body] plus the ability to rotate the object about two or three axes.

> A display device - either a CRT (usually) or hardcopy unit.

> Some means of interacting with the display - a keyboard to enter commands, a light pen or tablet to pick from a menu or point to an object, or a joystick.

The real fun begins as the data is being entered. The way data is stored is a major consideration. Given a set of coordinates, you must store them as well as their relationship to other locations on the object.
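To make the storage point concrete, here is a minimal sketch of one way to keep both the coordinates and their relationships: a list of vertices plus a list of index pairs saying which vertices are joined. The `Wireframe` class, its method names and the `unit_cube` helper are hypothetical, chosen only for illustration; a real package might well store surfaces and other attributes too.

```python
from dataclasses import dataclass, field


@dataclass
class Wireframe:
    """An object stored as coordinates plus the connections between them."""
    vertices: list = field(default_factory=list)  # (x, y, z) tuples
    edges: list = field(default_factory=list)     # (i, j) vertex index pairs

    def add_vertex(self, x, y, z):
        """Store one coordinate triple and return its index."""
        self.vertices.append((x, y, z))
        return len(self.vertices) - 1

    def connect(self, i, j):
        """Record that vertices i and j are joined by a line."""
        count = len(self.vertices)
        if i == j or not (0 <= i < count and 0 <= j < count):
            raise ValueError("edge must join two different stored vertices")
        self.edges.append((min(i, j), max(i, j)))

    def neighbors(self, i):
        """Return the indices of every vertex joined to vertex i."""
        return sorted({b if a == i else a for a, b in self.edges if i in (a, b)})


def unit_cube():
    """Build a cube: 8 corners, joined wherever two corners differ in one axis."""
    cube = Wireframe()
    corners = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
    for corner in corners:
        cube.add_vertex(*corner)
    for i, a in enumerate(corners):
        for j in range(i + 1, len(corners)):
            if sum(p != q for p, q in zip(a, corners[j])) == 1:
                cube.connect(i, j)
    return cube
```

Keeping the edges as index pairs means a coordinate can be moved without touching the record of what it is connected to.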


The programming for these packages can be complex and expensive, particularly if the scaling (varying the size of the object), translation (moving the object from one spot to another), and rotation (turning the object) are done in software. The alternative is to do it in hardware: the software cost goes down, but then you pay for it in iron (and glass).

The simplest case is the realization of an object - without hidden line removal - in three dimensions (two dimensional representations are not of much interest as a general case). Here we need to be able to describe the object in three dimensional space, pick a point of view and then determine what that object would look like when projected onto a viewing surface imposed between the observer and the object - try sketching the chair you are sitting on when viewed from any angle. Now consider the added complication of having the observer - you - be able to take any reference point, including a spot inside part of the chair. Don't forget that your viewing screen isn't infinite in extent; it has physical limitations and, as a result, the part of the image that extends beyond the screen must be clipped. To lessen the confusion of having unnecessary lines - parts of the image that might be blocked from view by a solid part of the chair - introduce hidden line elimination.

The development of software and hardware for these applications is expensive even for a limited library. A complete applications system would easily run in the $100,000 range in end user purchase price.
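The scaling, translation, rotation and projection steps described above can be sketched in a few lines when done in software. This is only an illustration under stated assumptions: the observer sits at the origin looking along +z, the viewing surface is the plane z = `screen_distance`, and all function names are hypothetical. Clipping is reduced here to a point-by-point visibility test; clipping whole line segments at the screen edge, and hidden line elimination, take considerably more work.

```python
import math


def scale(point, factor):
    """Vary the size of the object (applied to one of its points)."""
    x, y, z = point
    return (x * factor, y * factor, z * factor)


def translate(point, offset):
    """Move a point from one spot to another."""
    return tuple(p + d for p, d in zip(point, offset))


def rotate_y(point, degrees):
    """Turn a point about the vertical (y) axis."""
    a = math.radians(degrees)
    x, y, z = point
    return (x * math.cos(a) + z * math.sin(a), y, -x * math.sin(a) + z * math.cos(a))


def project(point, screen_distance):
    """Perspective projection onto the viewing surface.

    Assumes the observer is at the origin looking along +z. Returns None
    for points at or behind the observer, which cannot be drawn.
    """
    x, y, z = point
    if z <= 0:
        return None
    k = screen_distance / z
    return (x * k, y * k)


def on_screen(point2d, half_width, half_height):
    """True if a projected point falls inside the finite viewing surface."""
    if point2d is None:
        return False
    x, y = point2d
    return abs(x) <= half_width and abs(y) <= half_height


# Usage: enlarge a corner, turn it, push it away from the observer, project.
corner = translate(rotate_y(scale((1.0, 1.0, 1.0), 2.0), 30), (0.0, 0.0, 10.0))
visible = on_screen(project(corner, 5.0), 4.0, 3.0)
```

Every point of the object goes through the same chain, which is why doing it in software for a complex model becomes expensive.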

1.2.1 Hardware Requirements

The CRT displays are usually vector devices, usually of high resolution (1024 x 1024) to provide clean lines. Raster equipment with its lower resolution and "jaggies" does not provide any advantage until we add the complexity of the solid fill for surfaces, or a range of colors. Hardcopy - frequently pen plotters - is a normal requirement.

1.3 IMAGE PROCESSING

One key difference between this application area and the previous two is that the image is usually a representation of a real object. It might be a LANDSAT photograph or a dental x-ray that needs to be enhanced. The problem that is dealt with here is not necessarily the display of the image - though that factor is here - but rather the use of graphics to extract more information from the data.

Consider a satellite photograph taken with a multi-spectral camera. Rather than viewing a piece of geography as we might in a more conventional camera, the multi-spectral unit uses filters to record detail at four (as an example) wavelengths. A green filter might be used to detect vegetation, an orange for soil, another for water, etc. Each of these can be digitized so that the computer has four copies of the same image - four two-dimensional arrays, each element of which is a measure of the intensity of light for the appropriate wavelength. By ratioing the images from the orange and green filters, we can emphasize the vegetation or barren soil.

Dental x-ray images contain more information than might be apparent to the human eye. The loss of detail is due to the low contrast level. After digitizing the image with a sensitive densitometer, the intensities can be rescaled and displayed with greater detail than we would have seen in the original image. Image Processing is a major growth area for graphics and the interpretation of images.
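Both operations just described, ratioing two digitized bands and rescaling low-contrast intensities, are simple element-by-element arithmetic on two-dimensional arrays. The sketch below uses plain nested lists; the function names, the `floor` guard against division by zero and the 0 to 255 output range are assumptions made for illustration, not details of any particular system.

```python
def ratio_image(band_a, band_b, floor=1):
    """Divide one band by another, element by element.

    `floor` (an assumed guard) replaces zero intensities in the divisor so
    the division is always defined.
    """
    return [
        [a / max(b, floor) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(band_a, band_b)
    ]


def rescale(image, out_max=255):
    """Stretch the intensities linearly so they span 0..out_max."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # A perfectly flat image has no contrast to stretch.
        return [[0 for _ in row] for row in image]
    return [[round((v - lo) * out_max / (hi - lo)) for v in row] for row in image]


# Usage: a low-contrast patch occupying only 100..130 is spread over 0..255.
stretched = rescale([[100, 110], [120, 130]])
```

In the stretched patch, neighboring values that differed by 10 now differ by 85, which is the sense in which rescaling recovers detail the eye could not separate.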
The pattern recognition aspects are of much current interest. Lockheed has recently been running advertisements in magazines illustrating the problem of automatically detecting tanks in battle situations (the desirable feature of distinguishing them from us was also noted). Closer to home, we have the area of robotics in automated manufacturing and parts inspection. Western Electric some years ago reported on their efforts to find faulty drill holes in fabricated parts automatically. During the recent (March, 1982) Corporate Research open house, they showed work on the problem of parts inspection. The list goes on, including the "deblurring" of photographs and their enhancement. A recent article in Scientific American ("Image Processing by Computer", Cannon and Hunt, October 1981) provides a good overview.

1.3.1 Hardware Requirements

The display hardware requires a high resolution unit (512 x 512 can be acceptable for some uses, others may require 1024 x 1024) capable of at least 16 gray shades or colors (the human eye can distinguish 64 shades of gray). Some form of hardcopy is needed. The most appropriate is a photographic device (Dunn Instruments or Matrix - about $10,000). A "pick" device - a light pen, cursor, or tablet - is sometimes desirable to indicate special regions of interest.

Beyond the graphics equipment we also need to look at the computer and storage sub-systems. Digitized images take up a lot of disk space; a single 1024 x 1024 x 8 frame is 1024K bytes. We also need to be concerned about the movement of that data from storage to the display. High resolution real-time animation can put a substantial load on a CPU, so a high bandwidth is important (we must be careful to distinguish between real-time animation and simply animation: in the first case high throughput is needed; in the second, data can be recorded on film at a slow speed and played back at normal speeds).
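The storage figure quoted above, and the throughput that real-time animation implies, follow from straightforward arithmetic. The sketch below reproduces the 1024K byte frame; the function names are hypothetical, and the 30 frames per second used in the usage line is an assumed rate for illustration only, not a figure from the text.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Bytes needed to hold one digitized frame, rounded up to whole bytes."""
    return (width * height * bits_per_pixel + 7) // 8


def bytes_per_second(width, height, bits_per_pixel, frames_per_second):
    """Sustained transfer rate needed to feed the display in real time."""
    return frame_bytes(width, height, bits_per_pixel) * frames_per_second


# Usage: the 1024 x 1024 x 8 frame from the text, in K bytes.
kbytes = frame_bytes(1024, 1024, 8) // 1024
# Assumed rate of 30 frames per second, purely to show the scale involved.
rate = bytes_per_second(1024, 1024, 8, 30)
```

At that assumed rate the storage system would have to deliver about 30 megabytes every second, which is why recording on film frame by frame is the practical route when real-time response is not required.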

1.4 DOCUMENT PREPARATION

Throughout this document (prepared using RUNOFF, a text processing package) you will see examples of text formatting: automatic generation of table-of-contents, indented and bulleted lists, bold type and other features. You will also see some rather simple minded examples of illustrations - graphics - mixed on the same page as text, as well as more sophisticated graphics on separate pages.

While a combination of the VAX EDT editor or the PDP-11 version - KED - and RUNOFF make a reasonable text editing and production facility (once you get used to the editor and the command structure of RUNOFF), it does not allow the integration of formatted text and pictures beyond what you see here. The goal of such a product would be the preparation of a document with the end result looking like a book. There, graphics appropriate to the subject matter are found together on the same page rather than a page or two away.


1.4.1 Hardware Requirements

The key component of this type of system is the printer. It must be letter quality and still be able to function as a graphics device. Ideally it would be able to function like the letter quality printers on the Word Processing systems, able to take either tractor feed paper or single sheets (the roll form paper is awkward to separate into sheets and there is a high likelihood of mechanical damage to the paper during separation). The production of graphics could be either through a printer that was "smart" enough to interpret the graphics commands or by having it function as a screen copier.

A full page CRT display would be the ideal choice as a preview and production device. Since the entire page would be available as a drawing medium - giving a one-to-one image on the printer - the page layout could be composed on the screen, treating the text material as graphics and then doing a transfer of the bit plane to the printer.


COMPUTER GRAPHICS HARDWARE AND SOFTWARE

2.0 HARDWARE

2.1 Graphics Displays

Rather than give a detailed tutorial on graphics hardware, the intent of this section is to give you an overview of the types of display hardware available. CRT displays can be broken down into two categories: vector and raster technologies. A vector is a line drawn from some current position to a new one. This type of display is also referred to as a "random scan" device since the pattern of painting the screen depends on the figure drawn. With raster systems, the vector is interpreted as the dots (their position) needed to draw it. In addition, raster devices paint the screen in an ordered fashion, regardless of the resulting image.

2.1.1 Vector Displays

The basic concept of a vector display is something most of us have been used to since we were children - basically a connect-the-dots approach to drawing pictures. The typical resulting figures from this type of display are line drawings rather than filled in areas. The lines are sharp, owing to the method of drawing.

Consider a piece of graph paper (the common quadrangle will do nicely) or the figure below. The intersections of the lines are the addressable points on the display surface. The number of lines - the resolution - (1024 x 1024 is common) is governed by the design of the hardware, specifically the digital-to-analog converter (D/A) used (10 bits for 1024 points). The D/As (2 are used, one each for the horizontal and vertical positioning) are used to generate a voltage that is in turn applied to pairs of deflection plates (one set for the horizontal axis and one for the vertical axis). That voltage causes an electron beam to be deflected from its normal straight line path to another point on the screen. Just as the graph paper is a continuous writing surface, so is the display screen (it is evenly coated with phosphor), so as the beam moves from one addressable point to another, it leaves a straight line track on the screen. It is the same as choosing two points of intersection on the paper and joining them with a line; the line is continuous between the points.

Repositioning the beam without drawing can be done by not intensifying it during its movement. These two modes of operation give rise to the two fundamental graphics commands - move (repositioning without intensification) and draw (repositioning with intensification).

[Figure: a grid of addressable points with a diagonal line segment drawn between two of them.] The * shows the beginning and end of the line segment with the \ joining them.
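The two fundamental commands, move and draw, together with the 10-bit addressing set by the D/A converters, can be modeled in a few lines. This is a toy model, not a driver for any real display: the `VectorDisplay` class and its method names are hypothetical, and `deflection` simply reports each D/A input code as a fraction of full scale.

```python
class VectorDisplay:
    """Toy model of a vector display driven by move and draw commands."""

    def __init__(self, dac_bits=10):
        self.size = 2 ** dac_bits   # 10 bits give 1024 addressable points per axis
        self.beam = (0, 0)          # current beam position
        self.segments = []          # lines left on the phosphor

    def _check(self, x, y):
        if not (0 <= x < self.size and 0 <= y < self.size):
            raise ValueError("point is not addressable on this display")

    def move(self, x, y):
        """Reposition the beam without intensifying it: nothing is drawn."""
        self._check(x, y)
        self.beam = (x, y)

    def draw(self, x, y):
        """Reposition the beam with intensification: a line is left behind."""
        self._check(x, y)
        self.segments.append((self.beam, (x, y)))
        self.beam = (x, y)

    def deflection(self, code):
        """Fraction of full-scale deflection produced by one D/A input code."""
        return code / (self.size - 1)


# Usage: two sides of a box, starting from a repositioned beam.
display = VectorDisplay()
display.move(100, 100)
display.draw(200, 100)
display.draw(200, 200)
```

Only the endpoints are ever stored or sent; the straight line between them is produced by the beam itself, which is where the sharpness of vector lines comes from.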

There are two basic versions of this technology: storage tubes and vector refresh displays. Storage tube technology (developed and owned by Tektronix) allows the image to be "stored" on the screen. This method permits a lot of lines to be drawn on the screen, but does not allow the user to change part of the image without redrawing the entire display, which can take up to a minute to do (in order to change it the entire screen must be flooded with electrons and then the image decays - the cause of the green flash on Tektronix tubes). The contrast is low and only monochrome images can be produced.

Vector refresh displays on the other hand can be updated without having to reproduce the entire picture. The figure on the screen is completely redrawn - refreshed - every 60th of a second (this is an automatic process) and, as a result, any changes are quickly seen (there is no flash to erase the screen). The problem that this introduces is this: if the image cannot be redrawn (due to having too many lines) in that time period, an objectionable flickering (see below) of the image will result.

The primary advantage of vector CRTs is the ability to produce sharp clean images. The drawbacks are the limited colors available (monochrome usually; expensive beam penetration units can produce several) and the fact that the user has to choose between a static display for a large number of lines or contend with flicker on a display which can be updated.
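The refresh limit on vector displays amounts to a time budget: everything must be redrawn within one 60th of a second. A rough sketch of that budget follows. The time needed to draw one vector varies with the hardware and with line length, so `seconds_per_vector` is an assumed average supplied by the caller, and the 10 microsecond value in the usage lines is purely illustrative.

```python
REFRESH_HZ = 60  # the picture is redrawn every 60th of a second


def max_vectors(seconds_per_vector, refresh_hz=REFRESH_HZ):
    """Largest number of vectors that fit in one refresh period."""
    return int((1.0 / refresh_hz) / seconds_per_vector)


def will_flicker(vector_count, seconds_per_vector, refresh_hz=REFRESH_HZ):
    """True if redrawing the whole picture overruns the refresh period."""
    return vector_count * seconds_per_vector > 1.0 / refresh_hz


# Usage, with an assumed average of 10 microseconds per vector.
budget = max_vectors(10e-6)
too_many = will_flicker(2000, 10e-6)
```

A storage tube has no such budget, which is the trade the text describes: many lines on a static picture, or fewer lines on one that can change.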

Flicker - when an electron beam strikes the surface of a CRT, a chemical compound [or mixture of chemicals] coating that surface - referred to as a phosphor - is excited and emits light for a short period of time. The decay of that light is not instantaneous but takes some time, referred to as the persistence of the phosphor; there are a number of different types of phosphors with different persistences and colors. When an image is drawn on the surface it is done with a moving electron beam which can only be in one place at a time. If the beam gets back to a spot that should be excited before the phosphor dims substantially, there is no problem; if it takes too long, the light will be emitted in pulses - visible to us - and the image will be seen to flicker.

2.1.2 Raster Displays

Raster technology provides a different approach to CRT graphics. Raster devices have one thing in common with vector refresh displays - they must be updated 60 times a second. But rather than moving a beam between random points, raster units scan the screen in a fixed pattern, beginning at the upper left corner and moving along a horizontal line. When that line is finished, it moves to the left of the next line and refreshes it, and continues this process until the entire display has been completed.

This approach introduces a significant step between our "natural" approach to drawing - vectors - and the resulting image. Any line we want to draw has to be transformed from a vector to something that the raster process can work with - that transformation is called a scan conversion. Instead of storing the location of successive coordinates as might be done in vector units, raster displays store the scan converted image in local (within the terminal) memory. It is the amount of memory in the terminal that limits the accuracy of the resulting image.

In the simplest case - monochrome black-and-white - the screen is broken down into a two dimensional array of pixels (picture elements). In very high resolution displays each dot on the screen (the tube technology is the same as television) could be a pixel, but most raster displays are of low to medium resolution (190 x 240 for low, 240 x 768 [VT125] for medium) and here a group of dots would define a pixel. With very low resolution devices, such as the Atari video game unit, the pixels can be seen as filled boxes. Each pixel would be represented as a single bit of memory.
If we want to move into color, we need additional bits of memory for each pixel so that the color can be described. The VT125, for example, supports four colors per pixel, so two bits per pixel are required. The amount of memory that the terminal contains puts a limit on the sharpness of lines.

One characteristic of raster displays is something called the "jaggies" which result when a line is drawn. It's not difficult to see why the jaggies occur. To begin with, let's go back to that piece of graph paper. This time we will pay attention to the boxes on the paper rather than the lines. Each box - or pixel - is the addressable element in the display. If we want to draw a line on the paper in the manner of the raster display, pick boxes for the endpoints (a diagonal shows the problem best) and draw a straight line connecting the centers of those boxes. Now shade in any box that the line passes through - the stair-step pattern shows the jaggies. The higher the resolution (the smaller the boxes) the less apparent the problem, but it's still there.

[Figure: a grid of pixel boxes with a diagonal line and the boxes it passes through filled in.] The * are the beginning and end of the line segment, the \ is the desired line and the # are used to show the filled boxes. Each box represents a single pixel.
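The graph paper exercise can be tried in code. The sketch below uses a simple digital differential analyzer (DDA), which steps along the longer axis of the line and fills the box nearest the ideal line at each step. Note this is a swapped-in simplification: it fills one box per step rather than literally every box the line touches, but it produces the same stair-step effect. The function names are hypothetical.

```python
import math


def scan_convert(x0, y0, x1, y1):
    """Return the pixels that approximate a line between two pixel centers."""
    steps = max(abs(x1 - x0), abs(y1 - y0))
    if steps == 0:
        return [(x0, y0)]
    pixels = []
    for i in range(steps + 1):
        t = i / steps
        x = x0 + (x1 - x0) * t
        y = y0 + (y1 - y0) * t
        # Round to the nearest box (halves round up).
        pixels.append((math.floor(x + 0.5), math.floor(y + 0.5)))
    return pixels


def render(pixels, width, height):
    """Draw the filled boxes as '#' on a grid of '.' characters."""
    rows = [["."] * width for _ in range(height)]
    for x, y in pixels:
        if 0 <= x < width and 0 <= y < height:
            rows[y][x] = "#"
    return ["".join(row) for row in rows]


# Usage: a shallow diagonal shows the stair steps clearly.
for row in render(scan_convert(0, 0, 8, 3), 9, 4):
    print(row)
```

Doubling the grid size in both directions halves the height of each step relative to the picture, but the steps never disappear.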

There are some problems associated with going to higher resolution. Two in particular: the cost of additional memory - we have to store the images locally and each added pixel will cost one or more bits depending on the number of bit planes - and the phosphor coating on the display itself. The phosphor in the VT100 for example is not a long persistence phosphor (if it were, the text would appear to smear in the smooth scroll mode - this can be seen by using smooth scroll for full lines of text in a dark room). In order to paint the screen with double the resolution of the VT125, we would have to go to an interlace mode in which the odd numbered lines are refreshed on one pass and the even lines on the next. With a short persistence phosphor, that would lead to a flickering display. The only way to overcome it is to change the bottle - the CRT tube - and that would eliminate field upgrades [Retrographics - a company that rebuilds VT100s for higher resolution graphics (480 lines vertical) - does change the bottle and uses a longer persistence green phosphor].
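The memory cost mentioned above can be worked out directly: each pixel needs one bit per plane, and the number of planes sets how many colors a pixel can take. A minimal sketch, with hypothetical function names, using the 240 x 768 figure and the four colors quoted earlier for the VT125:

```python
def planes_for_colors(colors):
    """Bit planes needed so each pixel can take one of `colors` values."""
    if colors < 2:
        raise ValueError("need at least two colors (on and off)")
    return (colors - 1).bit_length()


def display_memory_bits(width, height, planes):
    """Bits of local memory needed to hold the scan converted image."""
    return width * height * planes


# Usage: four colors need two planes; doubling the resolution in both
# directions quadruples the memory for the same number of planes.
planes = planes_for_colors(4)
base = display_memory_bits(768, 240, planes)
doubled = display_memory_bits(2 * 768, 2 * 240, planes)
```

The same arithmetic explains the 256-color terminals: 256 simultaneous colors require eight planes, four times the memory of a four-color unit at the same resolution.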


Raster units do have some distinct advantages over vector devices:

o Low cost - a typical vector device might begin at $7-8,000 (the VT-11 began at $11,000 some years ago).

o Color - depending on what you are willing to pay, you can purchase 4 or 8 color units for less than $4000 and get terminals with 512 x 512 pixel resolution and a choice of 256 out of a palette of 16 million (AED 512) for $15-18,000. Some limited color is available on vector units (using beam penetration techniques, for example [Evans & Sutherland]), but at high cost.

2.1.3 Raster Color

Color is another factor in raster displays. Using vector systems we were limited to a single color against a dark background; here we have more flexibility. By adding more memory in the form of additional planes, we can store more information about each pixel. With only one plane, we can tell if that pixel should be illuminated or not, giving a monochrome black-and-white display. The second plane

[Figure: two 12 x 12 bit planes (# is one pixel).]

[Figure: graphics library for B, with devices DEV A and DEV B.]

At the next level, the subroutine library that the user works with remains functionally constant, but its internal structure will directly take into account differences in hardware. This allows some portability, but puts the burden of support for different devices on the system/library manager.

[Figure: a single graphics library for A or B, connected to DEV A and DEV B.]

At a higher level the graphics library can remain constant, and hardware differences can be taken into account at the device handler - the lowest level before talking directly to the hardware, as shown below. The library would dispatch standard messages to the handler and it is that item's responsibility to implement them. This allows a high degree of portability and the easiest integration of mixtures of devices, and it is at this structure that the proposed standards are aimed. The proposed CORE standard in particular deals with this level and the segmentation of software.

As noted much earlier, a software system can exist at a primitive level and be built upon, adding layers of software as the applications dictate. This layering is directly addressed by the CORE, defining what capabilities will exist at each level (both graphics input and output), the functions of various subroutines and their relation to each other - it is a functional specification for a layered software product.

[Figure: a common graphics library connected through separate handlers to DEV A and DEV B.] Here the differences in hardware are taken care of by the handler.
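The handler arrangement just described, one common library dispatching standard messages to device-specific handlers, can be sketched as follows. Everything here is hypothetical and for illustration only: the class names, the two-message interface (move and draw) and especially the command strings, which are invented and do not correspond to any real terminal's protocol or to the CORE specification.

```python
class DeviceAHandler:
    """Hypothetical handler: implements the standard messages for device A."""

    def __init__(self):
        self.sent = []  # stands in for bytes written to the device

    def move(self, x, y):
        self.sent.append(f"A MOVE {x} {y}")

    def draw(self, x, y):
        self.sent.append(f"A DRAW {x} {y}")


class DeviceBHandler:
    """Hypothetical handler: same messages, a different invented encoding."""

    def __init__(self):
        self.sent = []

    def move(self, x, y):
        self.sent.append(f"B:P[{x},{y}]")

    def draw(self, x, y):
        self.sent.append(f"B:V[{x},{y}]")


class GraphicsLibrary:
    """The common library: knows about boxes, not about devices."""

    def __init__(self, handler):
        self.handler = handler

    def box(self, x, y, width, height):
        """Draw a box using only the standard move and draw messages."""
        h = self.handler
        h.move(x, y)
        h.draw(x + width, y)
        h.draw(x + width, y + height)
        h.draw(x, y + height)
        h.draw(x, y)


# Usage: the same library call drives either device.
a, b = DeviceAHandler(), DeviceBHandler()
GraphicsLibrary(a).box(10, 20, 5, 5)
GraphicsLibrary(b).box(10, 20, 5, 5)
```

Supporting a new device then means writing one more handler; neither the library nor the applications built on it need to change.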

The problem with these standards is that they are still years away. Thus packages that state that they are CORE "compatible" - rather than fully CORE compliant or compliant to a particular level - exist because the CORE is a moving target, that movement being forced by a need for completeness and rapidly changing technology (the 1979 specification has raster graphics as an appendix - it is now a major consideration).

If all this looks to you like a very messy situation, then all this verbiage has had its point. But take heart, there appears to be some light ahead and it's not an on-coming train. The bottom line for us is that we must make sure that our software is consistent across operating systems and hardware. On more than one occasion we have been told by customers that "we don't care what standard you set, set one and use it across operating systems and hardware".

How does the software relate to the applications noted above? Let's consider the diagram below (figures 7a and 7b) to illustrate the layering of software (the diagram was originally presented by Tom McIntyre, Central Engineering, Digital Equipment Corporation). At the lower left corner we have the terminals with some on board intelligence. The software does not exist and the user must not only worry about his application, but how to get the hardware to do what he wants done. The software is device specific.
As we move across t h e bottom, t h e hardware gets "smarter" and t a k e s on more o f t h e b u r d e n . A s we move up, t h e software g e t more p o w e r f u l , and f u r t h e r removes t h e a p p l i c a t i o n programmer from t h e hardware, he worries

In Computers in the Laboratory; Liscouski, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1984.

80

COMPUTERS IN T H E LABORATORY

Figures 7a and 7b. Graphics Product Environment. (Both panels plot SOFTWARE LEVEL, from execution through viewing to modeling, against HARDWARE LEVEL in order of cost and performance: TEK 4010, VT125, VS11. Commercial, engineering, scientific, and educational applications sit at the top. Labels recoverable from the panels: applications specific models, applications specific viewing, applications specific device drivers, general purpose modeling system, general purpose viewing system, virtual device interface, Tektronix emulators and other device handlers, special purpose hardware.)


about his application requirements and thinks in terms of the graphics figures he is drawing ("I want a box here", rather than "how do I draw a box on this terminal"). The characteristics of a particular terminal (or its limitations) are taken into account by the software. At the top, we concern ourselves with the needs of a particular problem, using a data plotting package, rather than trying to figure out how to label the axis. Let's take a look at the same diagram with some of the points on software noted earlier added.
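The layering described above can be sketched in a few lines of modern code. This is only an illustration of the idea, not any vendor's package: the class names, the normalized 0 to 1 coordinate convention, and the two "devices" (which simply record line commands instead of driving a real terminal) are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class Device(ABC):
    """Lowest layer: the only code that knows how one terminal draws a line."""

    @abstractmethod
    def line(self, x0, y0, x1, y1):
        ...


class RecordingDevice(Device):
    """Hypothetical stand-in for a real terminal driver; it just logs commands."""

    def __init__(self, name, width, height):
        self.name = name
        self.width = width      # addressable points across (assumed)
        self.height = height    # addressable points down (assumed)
        self.commands = []

    def line(self, x0, y0, x1, y1):
        self.commands.append((self.name, x0, y0, x1, y1))


class Viewer:
    """Middle layer: maps normalized 0..1 coordinates onto any device."""

    def __init__(self, device):
        self.device = device

    def _to_device(self, x, y):
        return (round(x * (self.device.width - 1)),
                round(y * (self.device.height - 1)))

    def box(self, x, y, w, h):
        corners = [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]
        points = [self._to_device(cx, cy) for cx, cy in corners]
        # Join each corner to the next, closing the box back to the start.
        for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
            self.device.line(x0, y0, x1, y1)


def draw_report_frame(viewer):
    """Application layer: says "I want a box here" and nothing about hardware."""
    viewer.box(0.25, 0.25, 0.5, 0.5)


coarse = RecordingDevice("coarse", width=101, height=101)
fine = RecordingDevice("fine", width=1001, height=1001)
for dev in (coarse, fine):
    draw_report_frame(Viewer(dev))
```

The application function is identical for both devices; only the bottom layer differs, which is the point of moving up the software axis in the diagram.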


The International Standards Organization, of which ANSI is a member, has approved the Graphical Kernel System (GKS) as its graphics "standard" of choice. This, in effect, makes the GKS approach the ANSI standard. While this would appear to settle the issue of CORE vs GKS, all it really does is provide a modification of direction. There are a lot of graphics software systems that are built on the CORE approach, which represent significant financial and manpower investments, and they are not likely to wilt away. The closing salvo of the standards battle is still a long way off.

3.0 SUMMARY

Through the course of this article, I have tried to present an overview of the scope of graphics and an idea of what can be done, and what it takes to do it. In a field as fast moving as this, everything said was out of date as soon as I finished typing. To give you some idea, for a few hundred dollars ($100 - $200) you can purchase a digitizing tablet and some software that allows a popular microprocessor system to become a free-form drawing package. Because of the amount of information that can be presented and the increased clarity gained from graphics, it is going to find an increasingly important role in the laboratory. Compared to dealing with reams of numbers, it will be a welcome relief. The software and hardware are becoming less costly and easier to use. Graphics will not only change the way the data in the laboratory is used, but make it more interesting and allow us to extract more useful information, faster.


Trademark and Product Acknowledgements

o 4010 and 4014 are model numbers for terminal products of Tektronix Inc.

o Plot-10 is a trademark of Tektronix Inc.

o DECgraph, Gamma-11, GIGI, PDP-11, ReGIS, VSV-11, VT105, VT100, VT125, VT-11, VS-60, and VAX are products of Digital Equipment Corp., Maynard, Mass.

o PDP, VAX, and VT are trademarks of Digital Equipment Corp., Maynard, Mass.

o TRS-80 is a trademark of Tandy Corporation

o DISSPLA and TEL-A-GRAPH are trademarks of ISSCO Graphics Software

o AED is a trademark of AED, Inc.

o Missile-Command is a trademark of Atari, Inc.

o PERQ is a trademark of PERQ Systems Corporation.

RECEIVED July 31, 1984


7 Chemists and Computers in the Corps of Engineers

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ch007

RICHARD E. ENRIONE

U.S. Army Corps of Engineers, Ohio River Division, P.O. Box 1159, Cincinnati, OH 54201

The chemists in the U.S. Army Corps of Engineers, Ohio River Division perform a variety of computer assisted tasks in assessing water quality and providing input to water management decisions. These include developing and running sophisticated water quality models; evaluating water quality data; and operating an automated water quality laboratory.

This chapter not subject to U.S. copyright. Published 1984, American Chemical Society

The U.S. Army Corps of Engineers has a relatively small number of chemists. They are located either in research laboratories or in the district and division offices. This paper will be limited to the Ohio River Division headquartered in Cincinnati and the four district offices in Huntington, Louisville, Nashville, and Pittsburgh.

There are approximately 75 multipurpose storage reservoirs and another 75 river lock and dam structures which are operated by the division. In their operation, the highest priority is given to safety of the structures, followed closely by either flood control or river navigation. Depending on the particular project, several competing purposes govern most of the day-to-day operations. These include: hydropower, recreation, water supply, water quality, and minimum flow releases. The important point is that water control decisions have an impact on water quality. It is this aspect which requires chemical expertise.

There are three general ways in which water quality considerations impact on water management decisions. These are the long-term development, or modification, of operating guidelines; an intermediate term tracking of the effectiveness of the guidelines; and the real time or quasi real time monitoring of situations which have the potential for rapid change.

The guideline modification is exemplified by Bluestone Reservoir. In this case, proposals were made to modify the project operation in two different ways: first, change the release schedule to accommodate downstream whitewater rafting; second, increase the reservoir depth and add hydropower. To evaluate these, a mathematical model was used which incorporated


the two dimensional hydrodynamic, thermal, chemical, and biological characteristics of the reservoir. The results of this modeling indicated that the scheduling of rafting releases would not aggravate the intermittent algae problem currently found in the lake and might help it slightly. On the other hand, the addition of hydropower would greatly increase the algae and also degrade the water quality of the reservoir releases to the detriment of the fishery below the dam. The modeling, particularly the computer generated graphics to illustrate for non-scientists the potential changes in algal growth, had two results. A modified release schedule for rafting will be implemented; and the hydropower addition is being delayed pending the results of studies aimed at reducing the nutrient load and a risk analysis of the impact of poor quality water on the downstream fishery.

The tracking of guidelines involves the routine sampling and analysis for a variety of chemical and biological constituents. At J. Percy Priest Reservoir, for example, there is a large historical data base for iron, manganese, ammonia, dissolved oxygen, etc. The monitoring in this case is to determine if the lake is changing in response to rapidly changing land use patterns in the watershed, and, if so, whether the water management scheme should be reevaluated. This effort results in a large number of samples for chemical analysis. There are five laboratories which perform most of these analyses. The four district laboratories perform the field sampling, biological and chlorophyll analysis, and in some cases analysis for TOC, dissolved carbon, solids, alkalinity and acidity. The division laboratory performs typical water quality chemical analysis using, almost exclusively, mechanized/computerized equipment.

The real time field data is collected hourly via a telephone network in the case of the Ohio River and through a GOES satellite and a downlink in Cincinnati for other locations. The satellite system, which consists of over 900 stations in the basin, was set up for flow forecasting. A provision was made to add water quality information and, as the need arises, appropriate monitors are installed at the required locations. Two examples of the uses of this information are: to change gate openings on the Ohio River locks to maximize reaeration when the dissolved oxygen level gets too low; and to monitor hourly fluctuations from petroleum brine discharges in the Blaine Creek watershed.

These efforts are carried out using a variety of computers with different primary purposes. Each district and the division have a water control minicomputer (typically a Harris 100) devoted mainly to hydrologic modeling on a real time basis, and the maintenance of appropriate on-line hydrologic data bases. Also, the real time water quality information is processed and analyzed on these machines. The division counterpart, in addition to these tasks, is used for the development of hydrologic models and the development and use of water quality models, and is the control point for the gathering, sorting and disseminating of real time data. Each office also has a general purpose, engineering use mini (typically a Harris 500), which is used for the maintenance of the


district water quality data base and the running of various data analysis and depiction programs. The division has a general purpose Honeywell which is used principally for financial purposes but augments the laboratory computer by generating most of the management reports, performing some of the calculations and quality control checks, and acting as the control point for disseminating the data to the districts.

The district lab computers are applied differently in each of the four districts. Among the uses are the storage, retrieval and analysis of biological data; the point of entry for field data; data reduction of laboratory analytical results; and direct data gathering from equipment such as spectrophotometers.

The division laboratory has two Wang VP2200 computers which are interfaced to a variety of instruments. The nature of the interface and the associated programming depend on the instruments involved. In terms of computer usage, the instruments either accumulate data which is then batch processed, or they require continuous on-line support by the computer. For the ICAP, TOC, and GC, completed reports are sent to the Wang and require little more than reformatting. For the atomic absorption instrument, a data logger sends in a sequence of numbers which requires some additional processing. The electronic balance is operated in conjunction with an interactive program for the analysis of solids. Six channels of Technicon AutoAnalyzer are interfaced through a Fluidyne scanning A/D converter. Both of the last two instruments place a considerable burden on the computer resources. In the operation of any of these instruments, the first step for the chemist is to obtain a sample list from the computer; the last step is to review the quality control data on the computer and accept or reject all or part of the run. All computer operations are menu driven question/answer sequences with the most common responses available by default.

All of the efforts require a rapid interchange of information and access to a wide variety of data bases, both external, such as USGS's WATSTORE and EPA's STORET, and internal, containing historical analytical results and reservoir hydraulic information. This is accomplished by a network of computers tied together by telephone lines and autodialing equipment, and is shown in Figure 1. For example, on an hourly basis, a computer calls the downlink to retrieve the latest set of satellite data. This machine on a daily basis (or on demand) is automatically called by the district water control computer to obtain the relevant data. As part of the autodialing protocols, a user working on a modeling problem in Cincinnati can access, and have in his own files in a few minutes, such information as current water quality data from a district general purpose computer, flow data from district water control machines, or historical chemical data from STORET.

The backup procedures are an integral part of the system. To deal with the problems of computer failure, loss of data base integrity or communication failures, a combination of approaches is used. The basic assumptions are: One day's worth of laboratory results on the lab computer is expendable in the sense that a significant fraction is still in the memory of the


Figure 1. Schematic of network showing primary data paths (water control and water quality). Nodes shown: GOES Satellite; Division General Purpose Harris 700; ORSANCO*; Instruments; Division Lab Wang VP2200; Division Water Control Harris 100; Weather Service AFOS*; Division Honeywell; USEPA STORET*; District Lab Wang; District Water Control Harris 100 and/or General Purpose Harris 500; USGS WATSTORE*.

*Equipment not operated by the Corps of Engineers. The portion in the box is replicated in each of four districts.


instruments and the rest can be repeated. Real time hourly data for periods of 24-48 hours are expendable except during flood emergencies.

All files on all computers except the laboratory Wangs are backed up on a daily basis with tape copies. The lab computers send all results to a second computer every day. All appropriate results are sent to STORET quarterly. Within 24 hours, all current data is stored on computers at two or more sites (this is a direct result of the fact that the data users work on different machines than the data generators/data collectors). On a monthly basis, all programs and data files on all computers are copied to tape or disc and stored at a different site. The computers and communication links which are critical at flood times have specifically designated backups where all the necessary data and programs are kept on a standby basis. For example, the Cincinnati water control Harris, which is the key to disseminating real time data, can have its function taken over by a machine in Louisville; the local satellite downlink can be replaced by one in Mississippi.

In the lab, future expansion plans include the use of optical scanners for reading sample labels, operation of robots to relieve some of the manual operations and an artificial intelligence system to track quality control. In other areas, there will be an increase in the number of real time monitors, not necessarily because real time data is needed, but because the cost can be small compared to sending out a field team. There will be some applications of direct monitoring by satellites such as LANDSAT D. Both of these will be incorporated into water quality models which will allow more intelligent choices of where to send a field team to collect samples for detailed analysis.

RECEIVED May 30, 1984


8 Computer Generation of Structure-Effect Relationships from Text Databases

RUDOLPH J. MARCUS¹

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ch008

Office of Naval Research, 1030 East Green Street, Pasadena, CA 91106

The use of text as data for clustering combines computer text processing, on the one hand, with a heuristic research methodology, on the other hand. In addition, the clustering of text data involves non-parametric hyperspaces, where definitions of the closeness of clusters are more difficult than they are in parametric hyperspaces used in the clustering of numerical data. Three different approaches to clustering in text material are described. The first one makes use of the format of two properties, such as structure and activity, on the same computer line or at least in the same file. This method assumes more importance with the addition of text searching to CAS ONLINE announced for this year. A second approach involves assigning vectors, such as +1, -1, or 0, to a property, an opposite property, or absence of a property, and then searching the data base for items with desired vectors in appropriate columns. A third approach is more quantitative and uses various projections of the hyperspace of properties onto one or more recognizable Cartesian axes. Examples are taken from a data base of chemical compounds and their medical uses, extracted from the Merck Index.

¹Current address: 605 Cavedale Road, Sonoma, CA 95476

This chapter not subject to U.S. copyright. Published 1984, American Chemical Society

The work which my associates and I have done in the use of text as data by chemists and for chemists draws on two sources. The first of these is computer text processing. It is even more clear today than it was 13 years ago, when this work was started, that computers may be used not only for number crunching but for a number of other jobs in which letters and words as well as numbers can be used. Tasks such as extracting, sorting, reformatting, associating, and counting can be done not only with numeric information, but also with alphanumeric information. Whereas such operations may have been useful but esoteric 13 years ago, the tremendous use of word


processors and personal computers has made this an everyday event today.

It is my contention that chemists are slow to use some of these alphanumeric, non-number-crunching techniques and it is interesting to speculate why chemists seem to be slower than those of many other disciplines, particularly psychology and sociology, to use these techniques. In the first place, chemists are used to dealing with formulae and with numbers more than with words. Chemical elements and compounds are more easily expressed as formulae than they are as words and certainly all of our figuring is done with numbers rather than with words, i.e., digitally rather than analog. However, as my coworkers and I have shown in a number of papers, formulae and the Geneva System names which these formulae express are alphanumeric information of the type that can be handled easily by the computer and which is therefore a fitting subject for the operations I was talking about earlier: extracting, sorting, reformatting, associating, and counting.

Not only has the microcomputer revolution changed the field enormously since this work was started, but also the resources which are available to chemists have increased tremendously. Chemical Abstracts and also the Merck Index, which we will be dealing with in this paper, are now available on tape.
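To make the five operations concrete, here is a minimal sketch in a present-day language. It is not the program used in this work; the record layout (a compound name, a vertical bar, and a semicolon-separated list of uses) and the four toy records are assumptions chosen for illustration and are not taken from the Merck Index tape.

```python
from collections import Counter, defaultdict

# Illustrative records only; the layout "name | use; use" is an assumption.
RAW = """\
aspirin | analgesic; antipyretic
acetaminophen | analgesic; antipyretic
ibuprofen | anti-inflammatory; analgesic
caffeine | stimulant
"""


def extract(text):
    """Extracting: pull (compound, use) pairs out of the text lines."""
    pairs = []
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip anything that is not a record
        name, uses = line.split("|", 1)
        for use in uses.split(";"):
            pairs.append((name.strip(), use.strip()))
    return pairs


pairs = extract(RAW)

# Sorting: order the pairs by use, then by compound name.
by_use = sorted(pairs, key=lambda p: (p[1], p[0]))

# Reformatting: one fixed-width report line per pair, use first.
report = [f"{use:<18}{name}" for name, use in by_use]

# Associating: collect the compounds that share each use.
compounds_for = defaultdict(list)
for name, use in by_use:
    compounds_for[use].append(name)

# Counting: how many compounds carry each use.
use_counts = Counter(use for _, use in pairs)
```

None of these steps involves arithmetic on the data; the words themselves are the data, which is the sense in which text is used here.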
The last few years have seen the advent of CAS ONLINE, which is of tremendous help not only in searching bibliographic information but, as I intend to show in this paper, for actually associating information and therefore applying the inductive process to the data thus extracted from a data base.

Methods of searching this large amount of machine-readable information have also been augmented during the last 13 years. Systems such as those operated by SDC and Lockheed Data as well as others have become very efficient in extracting information. It is a curiosity that the use of these systems is very much more developed in industry than it is in universities. Apparently in universities it is still cheaper to send a graduate student to the library instead of allowing intelligent use of these systems.

I have dwelled so far on the extracting of information and I will deal in the rest of this talk with the other four processes I was mentioning earlier: sorting, reformatting, associating, and counting information. Dr. Dessy, in his introductory talk at this symposium (1), argued very convincingly for greater use of these systems by chemists on the basis that anything less will leave the user overwhelmed by available data. Another argument may be made for the use of such systems, and against sending the graduate student to the library for machine-readable items. That argument is that, when associating data, only about 30 or 40 items can be kept in mind at one time. It is quite true that one can start reading the Merck Index to sort out which chemical structures are associated with specified medical uses. One may do it for the first three pages, only to get swamped by the fourth page of an 1100-page volume.

The other source on which our work draws is the heuristic research methodology. Heuristic programming involves trial and error procedures rather than algorithms, and has become more practical with the advent of real-time interaction. The only condition for the practice


of heuristic programming is that the practitioner himself sits down at the console and interacts with the computer. Only the practitioner can quickly recognize errors in his own specialty.

In the more usual deductive method of working, we formulate a specific hypothesis and then gather data which will validate or invalidate the hypothesis. By contrast, in the heuristic approach we begin with a comprehensive group of data such as a data base and we examine this data base repeatedly and interactively with new hypotheses. This is essentially an inductive technique. We have forgotten today, when the deductive technique is almost second nature among scientists, that some of the greatest advances even in the hard sciences were made by inductive techniques. I refer here to such seminal events as the formulation of the periodic table by Mendeleev and, going back even further, the derivation of a hierarchical organization of all animals and plants by Linnaeus in Sweden. A recent paper (2) has listed other examples of important rules arrived at by the inductive process: Kepler's third law, Ohm's law, Prout's hypothesis, Balmer's formula, and others.

I might add that induction does not prescribe a priori the form in which data should be organized. Dr. Perone has been talking about this at this symposium (3) and he makes a strong point for not prejudging the form which data should fit. Herman Chernoff, the statistician, observes that chemists may well be throwing away as much as 95% of the information contained in their data simply because they prescribe the form in which the data should be calculated (4). For example, we express much of our spectroscopic data in the form of spectra and we expect to see certain peaks and valleys even if we plot them in a derivative manner. For some kinds of data a spectral form may not be the form which gives the greatest amount of information inherent in those data, and that is what we have to be prepared for in thinking about things in this way. Perhaps it is the freedom of form in the heuristic, inductive approach which is one reason why these methodologies may be less familiar to physical scientists than they are to behavioral scientists.

This heuristic approach has spawned a large number of clustering techniques in which Dr. Perone and some of his more chemometric colleagues have been active as far as chemistry is concerned. Again, there are statisticians such as Solomon (5) and others who construct hierarchical taxonomies by these processes. The key to using these techniques is the computer, and the techniques up to now have been largely limited to the use of numbers. What we have done is to try to extend this inductive approach to machine-readable alphanumeric chemical data.

Clustering

I now wish to consider the process of clustering in a little bit more detail. The data, whether they are literature data or experimental data, are contained in a hyperspace and, while there are probably a lot of better definitions of clustering, the way I visualize clustering when I wish to explain it to chemists is that

the data being considered form a hyperspace and one drops various two-dimensional planes, one at a time, with recognizable axes through the data and sees whether the data group on one or the other of these planes. Furthermore, the game is to see whether the two-dimensional planes indeed have recognizable axes, that is, whether that particular plane on which data group corresponds to an orthogonal relationship of variables which is recognizable to us. Solomon, whom I quoted before, has told the story that he was asked how many times does he run a clustering program for a client in his consulting practice. How does he know when he has run the clustering programs sufficiently often? His answer is an "ah-so point" at which the client finally recognizes two orthogonal variables which were common but not necessarily related in his experience and which the computer-aided clustering had revealed to be related.

We generally refer to hyperspaces with numeric data points as being parametric hyperspaces. In those parametric hyperspaces a distance measure is relatively easy to construct and this distance measure or metric then permits a measurement of closeness to be assigned to any two elements in a cluster. That way we can define a cluster of elements in hyperspace with great accuracy because we can tell how close the points are. Alphanumerics, on the other hand, form non-parametric hyperspaces. Here no numerical parameters are associated with any element and here the prescription of distance or closeness is very much more difficult. As a matter of fact, in alphanumeric data the dimensionality of the hyperspace may not even be completely defined. In such open-ended spaces clustering becomes a rather ill-defined operation.

It is the purpose of this paper to indicate some heuristic approaches to clustering in such open-ended spaces and to show the formatting for their use. The generalization of these techniques to other types of data bases may be fruitful and is something that I am very much looking forward to, particularly now with the availability not only of word processing software but also of the great new extracting and associating power which is available to us from CAS ONLINE.

The Data Base

I have described the data base which we used in the development of these techniques previously (6-8) and will not repeat the description or derivation of the data base. Suffice it to say that it was derived from the Eighth Edition of the Merck Index, and that it lists all of the synonyms and all of the medical uses for all compounds in the Eighth Edition of the Merck Index which show a medical use, some 3400 compounds in all. We used conventional sorting programs and text processing programs in interrogating this data base. One of the things which we found out early on was that it was not necessary to code the structure of chemical compounds into machine-readable form. Rather, the use of the Geneva System name of each chemical compound for alphanumeric data processing was the substance of some of our early papers.

I will now describe three different alphanumeric clustering methods which we derived in the course of our work.

Clustering Method I

The first method took advantage of the fact that from the very beginning we placed at least two properties on the same computer line, for example chemical structure, in the form of the Geneva System name, and medical use. We did not format the data base very rigorously at the beginning because we were not at all sure of the information that lurked in the data base and therefore did not want to restrict ourselves by adopting formatting that would produce only the information that we could predict or guess was available from the data base. I should add that this kind of humility in the face of data is one of the characteristics of the heuristic, inductive approach.

When I say that we placed two properties on the same computer line and mention chemical structure as one of these properties, I mean that we used chemical structure as one of the hyperspace coordinates rather than as a word to be used mainly for bibliographic retrieval. Since we had two properties on the computer line we were able to run the two properties against each other (Table I).

Table I. Formatting for Alphanumeric Clustering Method I.

              FIELD 1    FIELD 2       FIELD 3
    LINE 1    Item 1     Property A    Property B
    LINE 2    Item 2     Property A    Property B

In other words, we could pull out all of the computer lines with property "A", could see what kinds of properties "B" were associated with property "A", and then run the reverse search for property "B" to see whether we had missed any previous "A's" or not. In this way we obtained the intersection of two hyperspace axes and therefore a primitive form of clustering in an alphanumeric system.

What did we do with this primitive form of clustering? Well, we studied compounds active in the autonomic nervous system. We did that because my collaborator was a psychologist. We were particularly interested in the question of physiological versus behavioral effects of the same kinds of compounds. In other words, if some related compounds had a physiological effect, could one read between the lines of the information and see that they also had a behavioral effect? On the other hand, when a number of related compounds had a behavioral effect, could one read between the lines and also see whether they had a physiological effect or not? If that had been the only purpose of our work we would have quit it after six months because we very quickly found out the answer to the question is obviously yes (6). Let me illustrate with some of the indole-nucleus compounds that we examined early on in our work (Table II).
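The two-way search described above (pull out every line with a given property "A", collect the associated properties "B", then search back from each "B") can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout follows Table I, but the item and property labels, and the function names, are hypothetical placeholders and not entries or programs from the original work.

```python
# Hypothetical lines in the Table I layout: (item, property A, property B).
# The labels are placeholders, not entries from the Merck Index data base.
LINES = [
    ("item-1", "A1", "B1"),
    ("item-2", "A1", "B2"),
    ("item-3", "A2", "B1"),
    ("item-4", "A2", "B3"),
]


def b_values_for(lines, a_value):
    """Forward search: every property B found on a line carrying a_value."""
    return sorted({b for _, a, b in lines if a == a_value})


def a_values_for(lines, b_value):
    """Reverse search: every property A found on a line carrying b_value."""
    return sorted({a for _, a, b in lines if b == b_value})


def two_way_search(lines, a_value):
    """Run the forward search, then the reverse search on each B it returns."""
    return {b: a_values_for(lines, b) for b in b_values_for(lines, a_value)}


result = two_way_search(LINES, "A1")
print(result)
```

In this toy data the reverse search on "B1" turns up "A2" as well as "A1", which is the kind of previously missed "A" the text refers to.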

Table II. Adrenergic Effects of Some Indole-Nucleus Compounds, Classified by Alphanumeric Clustering Method I.

    Physiologically recognizable
        hemostatic
        antihistaminic
        serotonin antagonist
    Behaviorally recognizable
        hallucinogenic
        diuretic, antihypertensive
        analgesic, antipyretic
        tranquilizer

Some of these have physiologically recognizable adrenergic effects. Other indole compounds have behaviorally recognizable adrenergic effects. In this list of behaviorally recognizable adrenergic effects one already sees the mixture with physiologically recognizable effects. For example, the same compound that is an antihypertensive is also a diuretic, the same compound that is an analgesic is also an antipyretic, etc.

We also looked into the method of action of mescaline (2). We found when we searched for phenethylamines that mescaline is always associated with sympathomimetics and other substances which stimulate the sympathetic nervous system such as anorexigenics.

Clustering Method II

A second, more sophisticated clustering technique for alphanumeric information proceeds as follows: The data base can be imagined as a collection of items which are described by a set of properties. Again, our properties in the Merck Index data base are chemical structures as names, and medical use. Each item in the data base is assigned a vector whose column elements indicate whether the item has a given set of properties or whether it does not. For example, a 1 indicates the presence of the property and a zero indicates the absence of the property (Table III).

Table III. Formatting for Alphanumeric Clustering Method II.

              Property A    Property B    Property C    ...
    Item 1    1             0             1             ...
    Item 2    0             1             1             ...
    Item 3    1             1             0             ...
      :       :             :             :
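The presence/absence vectors of Table III map directly onto a small data structure. The Python sketch below is illustrative only: the three vectors are copied from Table III, while the function name and the dictionary layout are assumptions made for this example. It retrieves the items that carry a 1 in every selected property column.

```python
# Property columns and item vectors copied from Table III
# (1 = property present, 0 = property absent).
PROPERTIES = ["A", "B", "C"]
ITEMS = {
    "Item 1": [1, 0, 1],
    "Item 2": [0, 1, 1],
    "Item 3": [1, 1, 0],
}


def items_with_all(items, properties, wanted):
    """Return the items whose vector holds a 1 in every wanted column."""
    columns = [properties.index(name) for name in wanted]
    return [
        name
        for name, vector in items.items()
        if all(vector[c] == 1 for c in columns)
    ]


hits_a = items_with_all(ITEMS, PROPERTIES, ["A"])
hits_ac = items_with_all(ITEMS, PROPERTIES, ["A", "C"])
print(hits_a, hits_ac)
```

Adding a property to the search can only narrow the result: Items 1 and 3 carry Property A, but only Item 1 carries both A and C.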

We can now select a subset of the properties, i.e., some particular medical uses or some particular chemical structures, and search the data base for those items which have all those properties. This,
by the way, is the same way in which CAS ONLINE works. The computer tries to find all the items with a 1 in the appropriate columns. The clustering aspect of this procedure is that the more properties any two items have in common, the closer they should lie in hyperspace.

Besides similarity, it is also possible to consider dissimilarity as a clustering technique. There may be pairs of properties which are exclusive or almost exclusive, that is, items with one property almost never have the other property. Such a finding would suggest that this exclusiveness expresses a relation between the properties. A relation of exclusion might thus indicate that properties are opposed in some way. Here we can use values of +1, -1 and 0 to show presence of the property, the opposite property or neither.

As an example I cite our early searches on the chemical structure properties indole and ethylamine and the medical use properties sympathomimetic and parasympathomimetic (2, 8). These searches revealed two sets of intersecting hyperspace axes. One of these intersections contains compounds with ethylamine structures (Property A in Table IV) which have the medical use property sympathomimetic; the other cluster indicates organic ammonium ions which are parasympathomimetics.

Table IV. Clustering of Ethylamines and Organic Ammonium Ions by Alphanumeric Clustering Method II.

                          Property A    Property B
     1  Similarity        51 items      27 items
    -1  Dissimilarity                   9 items

The 51 sympathomimetics included the catechol-type nerve impulse transmitters as well as numerous compounds which mimic their action in the sympathetic nervous system. The 27 parasympathomimetics found include the nerve impulse transmitter acetylcholine as well as compounds which mimic its action in the parasympathetic nervous system.

Searches on the chemical property organic ammonium ion turned up compounds which were not parasympathomimetics (Dissimilarity in Table IV). These compounds contain bulky side groups and function as skeletal muscle relaxants by blocking acetylcholine. Nine compounds of this type known by the medical use property curarimimetic were found in the search. The medical use term or property parasympatholytic, denoting inhibition of nerve impulse transmission in the parasympathetic nervous system, is not used in the Eighth Edition of the Merck Index.

Dissimilarity (opposite effect) due to bulky side groups hindering the action of compounds containing the ethylamine moiety was also found. Again the data base lacked a descriptor indicating the medical use property sympatholytic. Examples of such compounds would be tranquilizers such as the reserpine alkaloids. An effective search strategy for compounds of this type has not been developed and consequently the space for them in Table IV cannot yet be filled in.
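Both ideas in this section, closeness as the number of shared properties and exclusion as properties that almost never occur together, can be sketched on presence/absence vectors. The Python below is a minimal illustration with hypothetical items and properties (P, Q, R); the function names and the `max_overlap` threshold are assumptions made for this example, not part of the original programs.

```python
from itertools import combinations

# Hypothetical presence/absence vectors over three made-up properties.
PROPERTIES = ["P", "Q", "R"]
ITEMS = {
    "item-1": [1, 0, 1],
    "item-2": [1, 0, 1],
    "item-3": [0, 1, 1],
    "item-4": [0, 1, 0],
}


def shared_properties(u, v):
    """Similarity: number of properties that both vectors mark as present."""
    return sum(1 for x, y in zip(u, v) if x == 1 and y == 1)


def co_occurrence(items, properties):
    """For each pair of properties, count the items that carry both."""
    counts = {}
    for i, j in combinations(range(len(properties)), 2):
        counts[(properties[i], properties[j])] = sum(
            1 for v in items.values() if v[i] == 1 and v[j] == 1
        )
    return counts


def exclusive_pairs(items, properties, max_overlap=0):
    """Property pairs that occur together in at most max_overlap items."""
    return [
        pair
        for pair, n in co_occurrence(items, properties).items()
        if n <= max_overlap
    ]


closest = shared_properties(ITEMS["item-1"], ITEMS["item-2"])
opposed = exclusive_pairs(ITEMS, PROPERTIES)
print(closest, opposed)
```

Raising `max_overlap` above zero admits the "almost exclusive" pairs mentioned in the text; what counts as "almost" is a judgment the sketch leaves to the user.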

Clustering Method III

A third and more quantitative method of clustering allowed us to make four different cuts through the hyperspace of medical uses and their frequency in the Merck data base. By cut I mean a projection of the hyperspace onto one or more recognizable orthogonal axes. This is equivalent to the process I was talking about earlier in which we pass a plane with hopefully recognizable orthogonal axes through the hyperspace. Here we are talking about two different sets of properties than we did in the previous methods, in which we talked about items having the properties chemical structure and medical use. Here we talk about items having the properties medical use and the frequency of that medical use in the data base.

Two parameters were used in making these cuts. First of all we counted the number of medical uses per item (i.e., per compound; that is the connection to chemical structure which may be exploited later). The way in which uses/compound were tabulated is shown in Table V; a more complete tabulation can be found in our earlier papers (9).

Table V. Formatting for Alphanumeric Clustering Method III. Derivation of Distribution Curve.

             Uses/Compound
    1      2      3      4      5      Freq.    Use Name
    201    24     1                    226      antimicrobial

If we sum the uses/compound, we get a frequency for each medical use. If we plot the number of different uses against the frequency of that use on probability paper or a log-log plot, we get a distribution curve. The distribution curves for four different subsets of the data appear in (9). Each shows that the data form a fairly straight line over two orders of magnitude. The curves represent an attempt to fit the points with a normal, rather than a log-normal, distribution. It is obvious that the curves do not fit the points and that therefore the medical use distribution function is log-normal rather than normal. My understanding is that this log-normal distribution is typical of natural text data bases despite the highly specialized character of the medical use data base.

When we take away the first column of Table V, we construct a table which shows various pairs of uses as a function of how often these use pairs occurred on a uses per compound axis (Table VI).
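The summation step can be checked against the one row shown in Table V. The Python sketch below assumes that each entry under Uses/Compound counts the compounds listing the use together with that many uses in total; that reading of the table, the function names, and the example frequencies in `EXAMPLE_FREQUENCIES` are assumptions made for illustration.

```python
from collections import Counter

# Uses/compound row for "antimicrobial" as it appears in Table V.
ANTIMICROBIAL_ROW = {1: 201, 2: 24, 3: 1}

# Hypothetical frequencies for a handful of uses, for illustration only.
EXAMPLE_FREQUENCIES = {"use-a": 3, "use-b": 3, "use-c": 10, "use-d": 226}


def use_frequency(row):
    """Sum a uses/compound row to get the frequency of that medical use."""
    return sum(row.values())


def frequency_distribution(frequencies):
    """Count how many different uses share each frequency."""
    return Counter(frequencies.values())


freq = use_frequency(ANTIMICROBIAL_ROW)
points = sorted(frequency_distribution(EXAMPLE_FREQUENCIES).items())
print(freq, points)
```

The sum reproduces the Freq. entry of 226 in Table V. Each (frequency, number of uses) pair in `points` is one point of the kind of distribution curve the text describes plotting on log-log axes.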

Table VI. Formatting for Alphanumeric Clustering Method III. Derivation of Use Pair Taxonomy.

          Uses/Compound
    2      3      4      5
    29     12     1      14     pair 1
    27     5      1      0      pair 2

For example, the second use pair occurs 27 times as a pair but also in multiple combinations, five times with a third use and once with two other uses. A more complete version can be found in our earlier papers (9). The use pairs in that table can be grouped by closeness of relation. For example, the first nine use-pairs are all either analgesics or sedatives, the next four are all either antiseptics or astringents. In a taxonomic sense these grouped use pairs represent a cluster; the seven we found are listed in Table VII.

Table VII. Use Pair Clusters in Merck Index Data Base, Derived by Alphanumeric Clustering Method III.

    Analgesic-Sedative
    Antiseptic-Astringent
    Cardiotonic
    Diuretic-Antihypertensive
    Adrenocortical steroid
    Parasympathomimetic
    Sympathomimetic
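A tabulation in the style of Table VI can be sketched as a count of how often each pair of uses appears, broken down by the total number of uses listed for the compound. The Python below is a hedged illustration: the compounds and use names are hypothetical, and the function name and nested-dictionary layout are assumptions, not a reconstruction of the original sorting programs.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical compounds and their listed medical uses (not Merck Index data).
COMPOUND_USES = {
    "compound-1": ["use-a", "use-b"],
    "compound-2": ["use-a", "use-b"],
    "compound-3": ["use-a", "use-b", "use-c"],
    "compound-4": ["use-b", "use-c"],
}


def use_pair_table(compound_uses):
    """Map each unordered use pair to {uses per compound: number of compounds}."""
    table = defaultdict(lambda: defaultdict(int))
    for uses in compound_uses.values():
        distinct = sorted(set(uses))
        for pair in combinations(distinct, 2):
            table[pair][len(distinct)] += 1
    return {pair: dict(row) for pair, row in table.items()}


table = use_pair_table(COMPOUND_USES)
print(table[("use-a", "use-b")])
```

In the toy data the pair (use-a, use-b) occurs twice on its own and once together with a third use, mirroring the way the text reads the second row of Table VI.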

This kind of clustering, a second cut through the hyperspace, can lead to extrapolation, divining additional uses for existing compounds, and to structure-activity relationships.

The mean uses per compound in Table V is 1.7. The mean uses per compound can be counted for each of the uses; it was shown (9) that mean uses per compound is distributed statistically and the outliers on each side could be identified. The ones with low mean uses per compound were highly specific uses such as antimicrobial, antineoplastics, antihistaminics, estrogenics, etc. The ones with high mean uses were the ones which were present in dominant use pairs or clusters such as diuretics, vasodilator, CNS depressant, etc., or which had old imprecise use descriptors such as sudorific, diaphoretic, dermatoses, etc.

Certain medical use qualifiers were coded into the data base. These medical use qualifiers form a fourth cut through the hyperspace, which again offers a statistically significant way of distinguishing between kinds of medical uses. As was seen from the distribution function for uses/compound, the X qualifier (null character: no additional information) showed a mean uses/compound which is not significantly different from the entire data base. This was also true of the Z qualifier (additional information). However,
the H (has been used) and F (formerly used) qualifiers show significantly more mean uses per compound, and the set of experimental qualifiers shows significantly fewer mean uses per compound.
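One plausible way to compute a mean uses per compound for a single use is a weighted average over its uses/compound row. The sketch below assumes that definition, which the text does not spell out, and applies it to the antimicrobial row of Table V; the function name is hypothetical.

```python
# Uses/compound row for "antimicrobial" as it appears in Table V.
ANTIMICROBIAL_ROW = {1: 201, 2: 24, 3: 1}


def mean_uses_per_compound(row):
    """Weighted mean of uses/compound over the compounds that list this use."""
    compounds = sum(row.values())
    total_uses = sum(n_uses * count for n_uses, count in row.items())
    return total_uses / compounds


mean = mean_uses_per_compound(ANTIMICROBIAL_ROW)
print(round(mean, 2))
```

Under this assumed definition the antimicrobial row gives 252/226, well below the overall mean of 1.7 quoted in the text, which is consistent with antimicrobial being named among the highly specific uses.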

Conclusion

I have reviewed three methods of clustering text data. Obviously the field is in its infancy, and there is much work that can be done. However, the power of this approach is demonstrated by its simplicity and usefulness. I believe that the world of physical data has exploded so much in the last 20 years that the inductive approach can be helpful in organizing it for extrapolation, interpolation, and planning, as well as for recognizing interactions. In that process the heuristic use of computers can be of tremendous help if we only permit it to help.

Literature Cited

1. Dessy, R. "The Rational Electronic Laboratory," presented at the 186th ACS National Meeting, Washington, D.C., September 1983.
2. Bradshaw, G. F.; Langley, P. W.; Simon, H. A. Science 1983, 222, 971-5.
3. Perone, S. Chapter 9 in this book.
4. Chernoff, H., personal communication.
5. Solomon, H. "Numerical Taxonomy"; Technical Report No. 167, Stanford University Department of Statistics, Stanford, CA, Dec. 1970.
6. Gloye, E. E.; Marcus, R. J. Science 1970, 169, 88-91.
7. Marcus, R. J.; Gloye, E. E. J. Chem. Documentation 1971, 11, 163.
8. Marcus, R. J.; Gloye, E. E.; Florance, E. T. Computers & Chemistry 1977, 1, 235-241.
9. Marcus, R. J.; Florance, E. T.; Gloye, E. E. in "Retrieval of Medicinal Chemical Information"; Howe, W. J.; Milne, M. M.; Pennell, A. F., Eds.; ACS SYMPOSIUM SERIES No. 84, American Chemical Society: Washington, D.C., 1978; pp. 39-57.

RECEIVED June 1, 1984

9

Coping with the Information Explosion Provided by Modern Chemical Instrumentation

SAM P. PERONE

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ch009

Chemistry and Materials Science Department, Lawrence Livermore National Laboratory, Livermore, CA 94550

Modern chemical instrumentation is capable of generating enormous amounts of data in very short periods of time. It is clear that a major task of scientists for the near future is to develop techniques to utilize more effectively this capability, in order to avoid the typical dilemma of being buried in data with little or no perspective of the information content. Thus, there are three key developments that must be pursued: definition of "information content"; identification of methods to correlate instrumental parameters with information content; and development of tools for the instrumental enhancement of information content and the efficient extraction of information from data. These developments should allow the evolution of "smart instruments", perhaps guided by artificial intelligence principles. This paper will describe some of the principles and tools that have already been developed, and will identify the areas where work needs to be done.

Modern instrumentation for chemical analysis, because of the incorporation of digital computer systems, allows the generation and collection of immense amounts of data. This is facilitated by computer control of experimental variables and high-speed collection of multiple channels of data. This in turn allows complex measurement principles to be implemented, with correspondingly complicated multivariate analysis. Unfortunately, the data explosion that has accompanied the evolution of modern chemical instrumentation has not provided a corresponding information explosion. This is because relatively little attention has been paid to the development of techniques for optimization of information content, or for enhancement and extraction of information. It is not uncommon to observe a scientist buried in a data printout from an experiment, manually scanning columns of data, calculator in hand, attempting to extract useful information.

0097-6156/84/0265-0099$06.00/0 © 1984 American Chemical Society

It is time to turn our attention to developing more effective methods for obtaining information from complex experimental systems. The first step involves the definition of generic concepts of information content which are independent of the specific instrumental system. This is a task which has been surprisingly neglected in the past. The very simplest concepts which must be defined include:

    o informational goals
    o information content
    o information enhancement

The next step is to apply the basic principles of information theory, signal processing theory, multivariate data interpretation, and adaptive instrumental control in order to enhance and effectively extract information.

Information Goals

The primary requirement in the process of information enhancement is to define the informational goal(s) associated with a set of experimental measurements. Equally important is the definition of an appropriate measure of the degree to which the informational goal is achieved. Some generic quantitative informational goals and their respective figures of merit might be:

    GOAL              FIGURES OF MERIT
    concentration     accuracy/precision
    resolution        peak separation/peak width
    sensitivity       detection limit/response slope
    matrix effects    linearity/interference effects

In addition, it is possible to define qualitative informational goals. These might include:

    o identification of chemical components
    o classification of materials/properties
    o establishment of chemical mechanism.

Corresponding figures of merit for the qualitative informational goals can be defined in terms of statistical accuracy by evaluation with systems of known properties.

Information Content

This concept is one of the most difficult to quantitate. There are some relatively explicit definitions of information content for electronic communications. (For example, the Nyquist theorem defines the minimum sampling rate required in order to preserve the maximum frequency information in a periodic signal. And, the relationships between digital encoding formats and information content of a data base can be quantitated.) However, for the general problem of evaluating the results of instrumental measurements of chemical systems, the definitions for information content of data are not very clear.

One goal of our research program is to develop explicit and quantitative definitions of information content which may be useful for chemical instrumentation systems. These will be based on the principles of information theory, sampling theory, and signal processing theory. At this time, however, we can describe an
empirical approach to evaluation of information content which we have found very useful. This approach involves the following steps:

    o define the "desired information" (informational goal(s))
    o define a figure of merit for goal achievement (e.g., accuracy, precision, reliability, etc.)
    o empirically determine "information content" from the relationship:

        [INFO. GOAL] = f[INFO. CONTENT]                                        (1)

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ch009

From the above statement the information content of a chemical measurement system can be evaluated by studying the effects of experimental factors on the degree of achievement of the informational goal(s). This is elaborated below.

Information Enhancement

An empirical procedure can be defined for the enhancement of information content. First, it must be recognized that the achievement of desired informational goal(s) depends not only on the inherent information content of data, but also on the data management and analysis procedures. This is expressed in Equation (2):

[INFO. GOAL] = f[CONTENT, MGMT, ANALYSIS]    (2)

Thus, to examine the relationship between information content and experimental factors, it is necessary to maintain consistent data management and analysis procedures. Then, one can assume a direct relationship between the achievement of informational goals and information content as implied in Equation (1). A study designed to determine the effects of experimental factors on information content might be based on the relationship defined by Equation (3):

[INFO. CONTENT] = f[MEASUREMENT PRINCIPLES, EXPTL DESIGN, EXPTL PARAMETERS]    (3)
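As a rough illustration of the procedure implied by Equations (1) to (3), the sketch below holds the analysis procedure fixed and varies a single experimental parameter, recording the figure of merit at each setting. Everything in it is an assumption made for illustration: the two-class "instrument" (`measure`), the threshold classifier (`classify`), and the noise levels are hypothetical stand-ins, not the measurement system or data of this study.

```python
import random


def measure(true_value, noise_sd, rng):
    """Hypothetical instrument response: true value plus Gaussian noise."""
    return true_value + rng.gauss(0.0, noise_sd)


def classify(response, threshold=0.5):
    """Fixed analysis procedure: assign class 1 above the threshold."""
    return 1 if response > threshold else 0


def goal_achievement(noise_sd, n_samples=200, seed=0):
    """Figure of merit: fraction of known samples classified correctly."""
    rng = random.Random(seed)
    correct = 0
    for i in range(n_samples):
        true_class = i % 2  # alternate between the two known classes
        response = measure(float(true_class), noise_sd, rng)
        if classify(response) == true_class:
            correct += 1
    return correct / n_samples


# Vary one experimental parameter (here, noise) while the analysis stays fixed.
results = {sd: goal_achievement(sd) for sd in (0.0, 0.2, 0.5, 1.0)}
for sd, accuracy in results.items():
    print(f"noise sd = {sd:.1f}  accuracy = {accuracy:.2f}")
```

Because the classifier never changes, any change in accuracy between settings can be attributed to the experimental parameter, which is the condition under which Equation (1) is taken to hold.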

Procedurally, one could vary any of the experimental factors in Equation (3) and evaluate the effects on information content under conditions where Equation (1) applies.

In order to clarify the general concepts defined in the above sections, the following sections will describe an experimental study which followed those principles in order to achieve specified informational goals.

Electrochemical Structural and Activity Classifications

The classification of chemical structure using electrochemical techniques is a challenging problem. Voltammetric responses lack fine structure and probably will never compete with spectroscopic methods in qualitative analysis. The complex dependence of an electrochemical response on many variables, and theoretical


problems in relating structure to electrochemical activity, make qualitative voltammetric analysis even more formidable. Even though the difficulties in qualitative electroanalysis are great, the rewards of developing a reliable means of structural identification through electroanalysis would also be great. Due to recently developed miniaturization techniques, electrodes are the most promising probes of in vivo chemical species. Carbon fiber electrodes may be implanted within a single cell or neuron (1). Electrochemical detectors in liquid chromatography are becoming very important because of their high sensitivity and selectivity. Quantities of electroactive material in the picogram range have been analyzed. Osteryoung, et al. (2) have demonstrated the feasibility of scanning the potential of a liquid chromatographic electrochemical detector, so the development of qualitative voltammetric methods would open up the possibility of the characterization of eluants that are 1000 times less concentrated than those which can be analyzed by spectroscopic techniques.

Linear-free-energy relationships have generally been the most useful expressions for relating structure to electrochemical activity in the past. A substituent group will have a characteristic effect on the free energy of an electrochemical reaction occurring in its vicinity. This effect may occur through electron withdrawal, electron donation, or it may be steric in nature. In any case, the effect may be quantified through the use of Hammett substituent constants. For a given class of electrochemical reactions, there will be a linear relationship between E1/2 and the substituent constants σ (3). There are two main problems in the use of linear-free-energy relationships. The first and largest problem is the determination of the reaction series to which an unknown belongs. Such a deduction from electrochemical behavior is not straightforward. Furthermore, there may be several reaction series which may be constructed for a class of compounds depending on solution conditions. The slope of the E1/2 plot would be different at high pH's due to a change in the mechanism of reduction. The second main problem is that there is often not enough E1/2 separation for different substituents or substituent combinations to allow for confidence in identification, especially when experimental reproducibility is low due to uncontrolled matrix effects. The consideration of more information than E1/2 would clearly be helpful.
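The linear relationship between E1/2 and σ, and the separation problem that follows from it, can be sketched numerically. The example below is only an illustration: the substituent constants are representative values chosen for the sketch, the half-wave potentials are synthetic (generated from an assumed slope and intercept, not measured), and `fit_line` is a plain least-squares helper introduced here.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical reaction series: substituent constants (sigma) and synthetic
# half-wave potentials in volts, generated from an assumed straight line.
sigma = {"OCH3": -0.27, "CH3": -0.17, "H": 0.00, "Cl": 0.23, "NO2": 0.78}
assumed_rho, assumed_e0 = 0.30, -1.10
e_half = {name: assumed_e0 + assumed_rho * s for name, s in sigma.items()}

# Recover the reaction-series slope and intercept from the (sigma, E1/2) pairs.
rho, e0 = fit_line(list(sigma.values()), list(e_half.values()))

# E1/2 separation between two substituents, to be compared with the
# experimental reproducibility of E1/2.
separation = abs(e_half["Cl"] - e_half["H"])
print(f"rho = {rho:.3f} V  intercept = {e0:.3f} V  Cl/H separation = {separation:.3f} V")
```

With a slope of this size, two substituents whose σ values differ by a few tenths are separated by only tens of millivolts, which is the situation the second problem above describes.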
Because pattern recognition is well suited to the consideration of large amounts of information and to making use of obscure relations, we have applied it to chemical structure identification from electrochemical data. The main questions have been what data should be collected and how much? Burgard and Perone (4) used staircase voltammetry to analyze 29 compounds belonging to four different electroactive group/skeleton combinations. The classes examined were aromatic-nitro, aliphatic-nitro, aromatic-aldehyde and aromatic-aliphatic-ketone. Fortuitously these classes were almost completely separated on the basis of peak potential; but this feature alone cannot be considered sufficient for many identification problems. Thus, the voltammograms were examined for any shape information which might characterize a particular an

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ch010


10. STURDIVAN AND SEIDERS    Stochastic and Nonlinear Universe

fitting dichotomous data to the Logistic function, a mathematically tractable distribution, the method of Walker and Duncan (3) is convenient. Figure 2 shows the typical S-shaped probability distribution resulting from fitting the Logistic function:

\[ P = \frac{1}{1 + e^{-(\alpha + \beta \ln x)}} \]

to the dichotomous data on penetration. The straight line of slope 1 in Figure 1 is actually the 50% probability contour from the equation fitted to raw data. It is not a least squares fit to the means plotted on the figure.

The second example is from a mixed biological/physical problem. It deals with the probability that blunt trauma to the chest or abdomen would be lethal to man. It has been used to assess the hazard of large ballistic projectiles moving at moderate velocity, the hazard behind body armor which has stopped a handgun bullet, etc. The scaling model, which again is too lengthy to derive, is (4)

\[ x = \frac{\tfrac{1}{2} M V^{2}}{W^{1/3}\, t\, d} \]

where

M = mass of the projectile
V = velocity of the projectile
W = mass of the individual
t = thickness of the body wall over the vulnerable organ
d = $\sqrt{4A/\pi}$ = the effective diameter of the projectile
A = mean presented area

Notice that if the constants

ρ = mean density of the individual
T = tensile strength of the tissue

were included, the product would be a dimensionless ratio comparable to that of the previous example; i.e.,

\[ x' = \frac{\tfrac{1}{2} M V^{2}}{T\, t\, d} \left( \frac{\rho}{W} \right)^{1/3} \propto x \]
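The statement that including ρ and T yields a dimensionless ratio can be checked by unit bookkeeping. The check below is a sketch under an assumption about the form of the model: it takes the scaling variable to be $x = \tfrac{1}{2}MV^{2}/(W^{1/3}\,t\,d)$ and the constants to enter as the factor $\rho^{1/3}/T$. SI units are used only for concreteness.

```latex
\begin{align*}
\left[\tfrac{1}{2} M V^{2}\right] &= \mathrm{kg\,m^{2}\,s^{-2}}, &
\left[W^{1/3}\, t\, d\right] &= \mathrm{kg^{1/3}\,m^{2}}, \\
[x] &= \frac{\mathrm{kg\,m^{2}\,s^{-2}}}{\mathrm{kg^{1/3}\,m^{2}}}
     = \mathrm{kg^{2/3}\,s^{-2}}, &
\left[\frac{\rho^{1/3}}{T}\right] &= \frac{\mathrm{kg^{1/3}\,m^{-1}}}{\mathrm{kg\,m^{-1}\,s^{-2}}}
     = \mathrm{kg^{-2/3}\,s^{2}}, \\
\left[x\,\frac{\rho^{1/3}}{T}\right] &= \mathrm{kg^{2/3}\,s^{-2}} \cdot \mathrm{kg^{-2/3}\,s^{2}} = 1 .
\end{align*}
```

Under these assumptions x itself carries units, and it is only the product with the constant factor that is a pure number; treating ρ and T as constants leaves x proportional to the dimensionless ratio.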

As in the previous model, the factors assumed to remain constant, ρ and T, are assumed to be absorbed in the curve fitting constants when fitted to the probability function. Figure 3 shows how well the model fits the mean data. A plot of the probability curve would be exactly like Figure 2 with a change in scale.

Given these introductory examples of applied stochastic models,


Figure 2. The Probability of Penetrating Fabric Armor as a Function of the Model Variable, x.

Figure 3. Vulnerability of the Thorax to Blunt Trauma (see text for a definition of variables).


we can discuss in more detail some of the techniques and principles which are particularly useful in deriving and fitting this type of model.

One of the most useful modeling and scaling techniques to be found is Dimensional Analysis, embodying the principle of similitude (5,6), so-named by Galileo in the 17th century and given a formal framework in 1822 by Fourier. The rules for manipulating the fundamental units of measure which Fourier proposed have evolved into the modern technique of dimensional analysis. The major addition in modern times is the Buckingham Pi Theorem, by means of which dimensionless ratios of the type used above may be derived. It should be noted that in each of the examples the model shown was not the first tried. For dimensional analysis to produce useful results the whole set of relevant variables must be included, the proper dimensionless ratios must be found, and, finally, the best method of employing those dimensionless ratios in a model must be determined.

Dimensional analysis is just one method of normalizing the data; i.e., making it independent of the units of measure. It is, however, the best. Another method which is widely employed is to subtract a known or inferred population mean from the individual datum and to divide by the population standard deviation.

Once a scaling model has been found the scaled data should be examined carefully to ascertain that the variance is equal over the domain of the data. If not, then a suitable transform must be found to equalize the variance. Otherwise, no single stochastic model will accurately reflect the probability of an occurrence of the "event" in question over the data domain, much less for an extrapolated prediction.
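One way to carry out the variance check described above is to compare the spread of the scaled data at several levels of the model variable, before and after a candidate transform. The sketch below is illustrative only: the data are synthetic, constructed so that the scatter grows in proportion to the level, the logarithm is used as the candidate transform, and `spread_by_group` is a helper name introduced here.

```python
import math
import statistics


def spread_by_group(groups, transform=lambda value: value):
    """Sample standard deviation of the (transformed) data at each level."""
    return {
        level: statistics.stdev([transform(v) for v in values])
        for level, values in groups.items()
    }


# Synthetic scaled data: the same relative errors applied at three levels,
# so the absolute scatter is proportional to the mean.
levels = (1.0, 10.0, 100.0)
relative_errors = (-0.10, -0.05, 0.0, 0.05, 0.10)
groups = {m: [m * (1.0 + e) for e in relative_errors] for m in levels}

raw = spread_by_group(groups)
logged = spread_by_group(groups, transform=math.log)

for m in levels:
    print(f"level {m:6.1f}  sd(raw) = {raw[m]:8.4f}  sd(log) = {logged[m]:.4f}")
```

A spread that changes systematically with the level signals unequal variance; a transform is acceptable when the spread in the transformed space is roughly the same at every level.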
For example, if the standard deviation is proportional to the mean, a very common situation in nature, the variance is equalized by taking the log of the model variable. This is the case for both of the above examples, where the probability model was fitted to ln x rather than x itself. Suitable transformations for other common situations, as well as a general method for finding transforms, are given by Johnson & Leone (7).

When a suitable scaling model has been found and equal variance confirmed or obtained, a probability function is fitted to the data. For dichotomous data, the Gaussian (probit) or Logistic (logit) functions are the most common mathematical functions used. The Central Limit Theorem has been used to justify assuming normality (Gaussian) in an over-wide number of cases. For a reasonable sample size from a distribution quite different from the Gaussian, this is a bad assumption. If one knows, or has reason to believe, that a certain probability function prevails, then that is the function to use. An argument can be made for not assuming any "standard" distribution, but using a non-parametric distribution based on the data itself. This is fine for large amounts of data and for prediction within the central portion (say 0.2 to 0.8) of the distribution. However, such distributions are not usually well defined in the tails, especially with small sample size, so some assumption must be made concerning a distribution function appropriate for these areas. The Logistic function is often used because of its mathematical tractability. For dichotomous (0-1 or pass/fail) data the method of Walker and Duncan (3) is convenient. Notice, however, that they disregard


physical units in their example of its application. Ignoring the principle of dimensional homogeneity is a dangerous oversight in any model used for extrapolation.

If the data to be fit are continuous there are general nonlinear methods which can be used to fit almost any probability function (8), including a variety of so-called probit analyses for (assumed) Gaussian data (9). For many of these methods, convergence is slow or nonexistent if the values initially selected for the fitted parameters are not sufficiently close to the final values. If the function may be made linear with respect to its unknown parameters by a suitable transformation, then it may be fitted by the Linearized Least Squares method (10) so as to minimize the root mean square error in the original (untransformed) space. The essence of this technique is to use weighted (linear) least squares to effect a non-linear least squares fit. Assume that the equation has been transformed into an equal variance space and let

y = the resulting dependent variable
x = the vector of independent variables
b = the vector of parameters to be fitted
a = the vector of known constants

then

\[ y = f(x, b, a) \tag{1} \]

The function (1) may be linearized if, through any set of mathematical operations, equation 1 may be transformed into

\[ h(y) = \sum_i b_i\, g_i(a, x) \tag{2} \]

The usual procedure is to employ least squares directly on equation 2. However, this results in minimizing the squared error in h, not y. That is, the procedure finds the set $b_i$ such that the quantity

\[ \sum_j (\Delta h_j)^2 = \sum_j \Big[ h_j(y) - \sum_i b_i\, g_i(a, x_j) \Big]^2 \tag{3} \]

is minimized. What is desired is the minimum of $\sum_j (\Delta y_j)^2$. This may be achieved by iteratively conducting a least squares procedure on equation 2 with weights:

\[ w_j^2 = \left( \frac{\Delta y_j}{\Delta h_j} \right)^2 \tag{4} \]

where the Δ's are from the previous iteration. Starting weights are obtained from the differential approximation to the ratio of differences of equation 4; i.e.,

\[ w_j^2 = \left| \frac{dh}{dy} \right|_j^{-2} \]

where the derivative is evaluated at the jth data point to provide


the weight appropriate at that point. Unlike most nonlinear methods, therefore, Linearized Least Squares does not require initial guesses, but derives good starting values from the data and the derivative.

A simple example is found in the Logistic function discussed above:

\[ P = \frac{1}{1 + e^{-(b_0 + b_1 \ln x)}} \]

In the original space the dependent variable is the probability, P. The equation may be linearized as:

\[ h(P) = \ln\left( \frac{P}{1-P} \right) = b_0 + b_1 \ln x = \sum_{i=0}^{1} b_i\, g_i(x) \]

then:

\[ g_0(x) = 1, \qquad g_1(x) = \ln x \]

where x is the only independent variable. For the first iteration,

\[ w_j^2 = \left( \frac{dP}{dh} \right)_j^2 = P_j^2 (1 - P_j)^2 \]

we minimize $\sum_j w_j^2\, \Delta h_j^2$, which results in the usual weighted least squares,

\[ \min \sum_j w_j^2\, \Delta h_j^2 = \min \sum_j w_j^2\, (h_j - b_0 - b_1 \ln x_j)^2 . \]

In their dichotomous fit, Walker and Duncan transform the Logistic function to an equal variance space by dividing each data point by its variance. The variance of a probability value, P, is P(1-P). For the first iteration, P(1-P) is equal to w. This suggests minimizing the function

\[ \sum_j \frac{(\Delta P_j)^2}{P_j (1 - P_j)} \approx \sum_j w_j\, (\Delta h_j)^2 \tag{5} \]

In linear least squares (unweighted), where $\sum_i (\Delta y_i)^2$ is minimized, it can be shown that $\sum_i \Delta y_i = 0$. With weighted least squares, the sum

\[ \sum_j \Delta y_j = \sum_j w_j\, \Delta h_j \neq 0. \tag{6} \]

However, if equation 5 is used (weights $w_j$ rather than $w_j^2$), then equation 6 does equal zero. When a zero sum of deviations is desirable, function 5 may be minimized, often without increasing the root-mean-square-error by an undue amount.

In conclusion, the following principles may be of some help in modeling in a nonlinear, stochastic universe:

Model first. Propose as many reasonable models as you


can - then design experiment(s) to discriminate among them.

For maximum applicability (extrapolation) be consistent with physical laws - including the principle of similitude.

Whenever none of the proposed models is acceptable, amend the model to fit the data.

Specific to Probability Models:
  . Model on means - then fit on all data.
  . Normalization is strongly advisable, preferably by dimensional analysis.
  . Transform, if necessary, to equalize variance over domain of definition.

Stochastic models often require larger data bases than deterministic models.

Be prepared to seek a nonlinear, stochastic model until it is demonstrated that a linear or deterministic approximation is acceptable.
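The Linearized Least Squares procedure described earlier in this chapter, applied to the Logistic function, might be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' program: it assumes continuous probabilities strictly between 0 and 1, a single independent variable x, starting weights P(1-P), and reweighting by the ratio ΔP/Δh from the previous iteration. The function names, the iteration count, and the `tiny` guard against division by a vanishing Δh are choices made for this sketch.

```python
import math


def logit(p):
    """Linearizing transform h(P) = ln(P / (1 - P))."""
    return math.log(p / (1.0 - p))


def logistic(h):
    """Inverse transform P = 1 / (1 + exp(-h))."""
    return 1.0 / (1.0 + math.exp(-h))


def weighted_line_fit(u, h, w2):
    """Minimize sum(w2 * (h - b0 - b1 * u)**2) for a straight line."""
    sw = sum(w2)
    su = sum(w * ui for w, ui in zip(w2, u))
    sh = sum(w * hi for w, hi in zip(w2, h))
    suu = sum(w * ui * ui for w, ui in zip(w2, u))
    suh = sum(w * ui * hi for w, ui, hi in zip(w2, u, h))
    b1 = (sw * suh - su * sh) / (sw * suu - su * su)
    b0 = (sh - b1 * su) / sw
    return b0, b1


def linearized_least_squares(xs, ps, n_iter=10, tiny=1e-12):
    """Fit P = 1 / (1 + exp(-(b0 + b1 ln x))) by iteratively reweighted
    linear least squares in the transformed space."""
    u = [math.log(x) for x in xs]
    h = [logit(p) for p in ps]
    # Starting weights from the derivative: dP/dh = P (1 - P).
    w2 = [(p * (1.0 - p)) ** 2 for p in ps]
    b0, b1 = weighted_line_fit(u, h, w2)
    for _ in range(n_iter):
        updated = []
        for ui, hi, pi, old in zip(u, h, ps, w2):
            dh = hi - (b0 + b1 * ui)           # residual in transformed space
            dp = pi - logistic(b0 + b1 * ui)   # residual in original space
            # Keep the previous weight when the residual is essentially zero.
            updated.append((dp / dh) ** 2 if abs(dh) > tiny else old)
        w2 = updated
        b0, b1 = weighted_line_fit(u, h, w2)
    return b0, b1


# Noise-free synthetic data generated from known parameters (illustrative).
true_b0, true_b1 = -2.0, 1.5
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ps = [logistic(true_b0 + true_b1 * math.log(x)) for x in xs]
b0, b1 = linearized_least_squares(xs, ps)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
```

No initial parameter guesses are supplied: the first pass uses only the data and the derivative of the transform, and later passes replace the derivative with the observed ratio of residuals.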

Literature Cited

1. "Albert Einstein - Hedwig und Max Born: Briefwechsel 1916-1955", Nymphenburger, Munich, 1969.
2. Rosen, R.; Am. J. Physiol., 1983, 244, R591-R599, "Role of Similarity Principles in Data Extrapolation".
3. Walker, S. and Duncan, D.; Biometrika 54, 1 and 2, 1967, 167-179, "Estimation of the Probability of an Event as a Function of Several Independent Variables".
4. Sturdivan, L. M.; "Modeling in Blunt Trauma Research", Second Annual Soft Body Armor Symposium, Miami Beach, FL, Sept 1976.
5. Bridgman, P.; "Dimensional Analysis", Yale University Press, New Haven, CT, 1922.
6. Langhaar, H.; "Dimensional Analysis and Theory of Models", Wiley, NY, 1951.
7. Johnson, N. and Leone, F.; "Statistics and Experimental Design in Engineering and the Physical Sciences", Wiley, NY, Vol II, 1964, 54-56.
8. Marquardt, D.; J. Soc. Ind. App. Math 11, 1963, 431-441, "An Algorithm for Least Squares Estimation of Nonlinear Parameters".
9. Finney, D.; "Probit Analysis", Cambridge University Press, NY, 1952.
10. Sturdivan, L. M. and Jameson, J.; "Linearized Least Squares", Proceedings of the 1976 Army Numerical Analysis and Computer Conference. ARO Report 76-3, US Army Research Office, 1976.

RECEIVED August 6, 1984


Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ix001

Author Index

Baumann, Fred, 23
Brown, Arthur C. III, 23
Enrione, Richard E., 83
Finnerty, W., 17
Kipiniak, W., 17
Lewis, Kenneth A., 23
Liscouski, Joseph G., 1,45
Lochmuller, Charles H., 11
Marcus, Rudolph J., 89
Perone, Sam P., 99
Seiders, Barbara A. B., 109
St. Clair, Douglas, 37
Sturdivan, Larry M., 109

Subject Index

A
Animation, real-time, 58
Applications
  graphics, 46
  robots, 13
Audit trails, 34
Automation
  goals, 3
  planning, 4,6
  problems, 1
Automation vs. robotics, 11

B
Bachman diagram, 30f
Backup procedures for computer management of water quality, 85
Bar chart, computer graphics, 52
BASIC, 29f
Broadband technology, 42
Buckingham Pi Theorem, 115

C
CALS - See Computer automated laboratory system
CAS On-Line, 22,95
Central limit theorem, 115
Certificate of analysis, 21
Clustering methods, 91,93-97
Clusters technology, 42
CODASYL - See Conference on Data Systems Languages
Communications and network, 37-44
Communications satellite, 41
Computer
  definition, 45
  dual, 21
  updating, 22
Computer automated laboratory system, 17
Computer generation of structure effect relationships, 89-98
Computer graphics, 45-82
  to illustrate algal growth, 84
Conference on Data Systems Languages, 27,29f
Connections to the host, 66
Conversion techniques, analog to digital, 20
Converter, digital to analog (D/A), 60
CRT, 47,55

D
Data
  manual entry, 20
  recording and validating, 19
Data networking, 21
Data plotting, 46
Data reporting, 21
Data retrieval, 19,21

Publication Date: October 5, 1984 | doi: 10.1021/bk-1984-0265.ix002


Database
  adjustment, 25
  archiving, 20
  design, 23,24
  implementation, 23
  integrity, 25,34
  schema, 25
  security, 25,34
  subschema, 24
  updating, 20
  used in development of computer techniques, 92
Database management, applied in analytical chemistry laboratory, 23-36
Datasets, 25
Datatrieve, 29f,33
Declaration variable, 18
Deductive method vs. heuristic approach, 91
Dictionaries, lab manager systems, 18-19
Dimensional analysis, 115
Displays
  graphics, 60-69
  raster, 62-65
  vector, 60-62
Distributed storage, 40
Document preparation, 46,58

E
Electrochemical structural-activity classifications, 101
Ethernet technology, 2,42

F
Factorial design to study structural effects on voltametric data, 104t
Fiber optics technology, 42
File transfer, 40
Flicker, 62
FORTRAN, 29f

G
Gaussian function, 115
Graphics
  applications, 46
  definition, 45
  hard copy, 68
Graphics devices
  closely coupled, 66
  loosely coupled, 67
Graphics displays, 60
  discussion, 60-69
Graphics protocols, 67
Graphics software, 73
Graphs and charts, computer representation, 46

H
Hardware requirements for computer graphics, 50,54,56,58-60
Heuristic programming, 90
Historical development of laboratory communications, 42
Hub system, 3
Hyperspace, 91,95

I
Image interpretation, 57
Image processing, 46,57
INA - See Instrument network architecture
Inductive techniques, 91
Industrial robot, definition, 12
Information
  content, 100
  enhancement, 101
  goals, 100
  retrieval and reporting, 33
Information explosion, effect of instrumentation, 99-107
Ink jet printers, 68
Instrument network architecture (INA), 28
Instrumentation, effect on information, 99-107
Interfacing, computer to instrument, 2

J
Joysticks, computer graphics, 70

L
Lab manager system, 17
Laboratory automation, 1-9
Laboratory data management, 17-24
Laboratory front end device, 39
Laboratory Information Management Systems, 2
  extensions and modifications, 35
  goals, 3
Light pens, 69
LIMS - See Laboratory Information Management Systems
Line drawing, 46
  applications, 54


Linearity, definition, 110
Linearized least squares method, 116
Local storage, problems, 40
Logistic function, 115

M
Mass storage, devices, 40
Medline, 22
Modeling
  applications, 54
  computer graphics, 46
Models
  stochastic, 111-115
  theory behind, 110

N
Network
  computers for water management, 85
  Data Communications, 37
  definition, 37
  Digital Computer, 37
  local area, 42
  wide area, 41
Network and communications, 37-44
Network showing primary data paths, 86
Networking
  cost, 41
  effects on data access and manipulation, 41

O
ODYSSEY, computer graphics, 52
On-line data acquisition, review, and validation, 20

P
PASCAL, 29f
Pen plotters, 69
Personal computers, 40
Photographic systems, 68
Pick-and-place components of robots, 12
Pie chart computer graphics, 50
Pixels, in raster displays, 63
Plotting, data management, 19
Plotting program, computer graphics, 46
Probability function, 111
Purchasing, 5-6

R
Raster, definition, 60
Raster color, 65
Raster displays, 62
  discussion, 62-65
Raster printers, 68
Record
  analysis, 30f, 31, 32
  archiving, 34
  I/O device, 30f, 31
  instrument, 30f, 31
  method, 30f, 31
  result, 30f, 32, 33
  retrieving, 34
  run, 30f, 32, 33
  run parameter, 30f, 32, 33
  sample, 30f, 31
  worklist, 30f, 31
Remote computer, 19
Remote job entry (RJE), 22
Report generation, 19
Reporting, information retrieval, 33
Resolution, in raster displays, 64
Retrieval, information and reporting, 33
RJE - See Remote job entry
Robot, 3
  definition, 11
  training, 12
  types and applications, 13
Robotics in the laboratory, 11-16
Robotics vs. automation, 11
Robots, pick-and-place components, 12
Rotation, computer graphics, 56
Runsheet processing, 20

S
Sample approval, 20
Sample identification, 19
Sample management, 19
Sample tracking, LIMS, 2
Scaling
  computer graphics, 56
  functions and factors, 46
  laws, 111
  model, 111-115
Scan conversion in raster displays, 63
Schema, VAX LIMS architecture, 29f
Screen copiers, computer graphics, 68
Screens, touch sensitive, 71
Secondary results, manual data entry, 20
Security, 29f,34
  data management, 18
Storage tube technology, 61


Structural-activity classifications, electrochemical, 101
Structure-effect relationships, computer generation, 89-98
Subschema, VAX LIMS architecture, 29f

T
Tablet, computer graphics, 71
Task-to-task communications, networking, 40
Test, dictionaries, data management, 19
Text as data, use, 89
Toxline, 22
Track ball, computer graphics, 70
Training robot, 11
Translation, computer graphics, 56

U
U.S. Army Corps of Engineers, 83-87
Use of text as data, 89

V
VAX database utilities, 27t
VAX LIMS, 26f
  architecture, 27,29f
  functions, 25
Vector
  definition, 60
  displays, 60
Vector displays, discussion, 60-62
Voltametric electroanalytical data, methods of obtaining, 103

W
Water management, 83-87
  network of computers, 85
Water quality
  chemical analysis using mechanized/computerized equipment, 84
  lab computers, 85
  management by computers, backup procedures, 85