
Artificial Intelligence and International Politics
edited by Valerie M. Hudson

First published 1991 by Westview Press, Inc. Published 2018 by Routledge, 52 Vanderbilt Avenue, New York, NY 10017 and 2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN. Routledge is an imprint of the Taylor & Francis Group, an informa business.

Copyright © 1991 Taylor & Francis. All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data
Artificial intelligence and international politics / edited by Valerie M. Hudson.
p. cm.
Includes bibliographical references and index.

ISBN 0-8133-0937-9
1. International relations-Research. 2. International relations-Data processing. 3. Artificial intelligence. I. Hudson, Valerie M.
JX1291.A73 1991
327'.072-dc20   90-43085 CIP
ISBN 13: 978-0-367-00371-5 (hbk)

CONTENTS

About the Contributors ... vii

Introduction, Valerie M. Hudson ... 1

1. Artificial Intelligence and International Relations: An Overview, Philip A. Schrodt ... 9

PART ONE: CONCEPTUAL ISSUES AND PRACTICAL CONCERNS

2. Artificial Intelligence and Intuitive Foreign Policy Decision-Makers Viewed as Limited Information Processors: Some Conceptual Issues and Practical Concerns for the Future, Helen E. Purkitt ... 35
3. Steps Toward Artificial Intelligence: Rule-Based, Case-Based, and Explanation-Based Models of Politics, Dwain Mefford ... 56
4. Text Modeling for International Politics: A Tourist's Guide to RELATUS, Hayward R. Alker, Jr., Gavan Duffy, Roger Hurwitz, and John C. Mallery ... 97
5. Reasoning and Intelligibility, James P. Bennett and Stuart J. Thorson ... 127
6. The Computational Modeling of Strategic Time, Howard Tamashiro ... 149

PART TWO: AI/IR RESEARCH: INTERNATIONAL EVENTS AND FOREIGN POLICY DECISION MAKING

7. Pattern Recognition of International Event Sequences: A Machine Learning Approach, Philip A. Schrodt ... 169
8. Scripting International Power Dramas: A Model of Situational Predisposition, Valerie M. Hudson ... 194
9. UNCLESAM: The Application of a Rule-Based Model of U.S. Foreign Policy Making, Brian L. Job and Douglas Johnson ... 221
10. Modeling Foreign Policy Decision Making as Knowledge-Based Reasoning, Donald A. Sylvan, Ashok Goel, and B. Chandrasekaran ... 245
11. Decision Making and Development: A "Glass Box" Approach to Representation, Margee M. Ensign and Warren R. Phillips ... 274

PART THREE: AI/IR RESEARCH: THE DISCOURSE OF FOREIGN POLICY

12. The Expertise of the Senate Foreign Relations Committee, G. R. Boynton ... 291
13. Reproduction of Perception and Decision in the Early Cold War, Sanjoy Banerjee ... 310
14. Theoretical Categories and Data Construction in Computational Models of Foreign Policy, David J. Sylvan, Stephen J. Majeski, and Jennifer L. Milliken ... 327
15. Semantic Content Analysis: A New Methodology for the RELATUS Natural Language Environment, John C. Mallery ... 347
16. Time Space: Representing Historical Time for Efficient Event Retrieval, Gavan Duffy ... 386

About the Book and Editor ... 407
Index ... 409

CONTRIBUTORS

Hayward R. Alker, Jr., is a Professor of Political Science at MIT.
Sanjoy Banerjee is an Assistant Professor in the International Relations Program at San Francisco State University.
James P. Bennett is an Associate Professor of Political Science at Syracuse University.
G. R. Boynton is a Professor of Political Science at the University of Iowa.
B. Chandrasekaran is a Professor of Computer and Information Sciences at the Ohio State University.
Gavan Duffy is Assistant Professor of Political Science at Syracuse University.
Margee M. Ensign is an Assistant Professor of Political Science at Columbia University and the coordinator of the International Political Economy Program in the School of International and Public Affairs.
Ashok Goel is an Assistant Professor of Computer and Information Sciences at the Georgia Institute of Technology.
Valerie M. Hudson is an Assistant Professor of Political Science and the Director of Graduate Studies at the David M. Kennedy Center for International and Area Studies at Brigham Young University.
Roger Hurwitz is a doctoral candidate in Political Science at MIT.
Brian L. Job is a Professor of Political Science at the University of British Columbia.
Douglas Johnson is a free-lance computer programmer based in Minnesota.
Stephen J. Majeski is an Assistant Professor of Political Science at the University of Washington.
John C. Mallery is a doctoral candidate in Political Science and Electrical Engineering and Computer Science at MIT.
Dwain Mefford is an Assistant Professor of Political Science at the Ohio State University.


Jennifer L. Milliken is a doctoral candidate in Political Science at the University of Minnesota.
Warren R. Phillips is a Professor of International Relations in the Department of Government and Politics at the University of Maryland.
Helen E. Purkitt is an Associate Professor of Political Science at the U.S. Naval Academy.
Philip A. Schrodt is an Associate Professor of Political Science at the University of Kansas.
David J. Sylvan is an Associate Professor of Political Science at the University of Minnesota.
Donald A. Sylvan is an Associate Professor of Political Science at the Ohio State University.
Howard Tamashiro is an Assistant Professor of Political Science at Allegheny College.
Stuart J. Thorson is a Professor in, and Chairman of, the Department of Political Science at Syracuse University.

Introduction
Valerie M. Hudson

The purpose of this volume is to provide a general yet relatively comprehensive overview of the study of international relations using a computational modeling approach (herein abbreviated AI/IR). 1 Such an approach incorporates sophisticated, formal models of human reasoning into explanations of human behavior. In the international relations context, this means to offer explanations of international behavior in terms of the reasoning, inference, and language use of relevant humans-whether these be national leaders, foreign policy decision-making groups, or even scholars of international relations themselves.

At this stage in its development, separating the broader theoretical objectives of AI/IR from the more personal motivations of those doing the research is difficult. This is in part because only a relatively small number of scholars are self-consciously working in the AI/IR subfield, and their research aims have thus far served to define AI/IR for others. At some point in the growth of this subfield a separation will be easier to make. In the meantime, it is instructive to trace the general route that led most AI/IR researchers to the point this edited volume represents.

By and large, most of the contributors to the volume began their careers developing and applying mathematical and/or statistical models to "hard" (read "quantifiable") data in IR. To use the jargon, they would be seen as QIPers-QIP standing for "quantitative international politics." 2 Hayward Alker at one point in his career searched for statistical patterns in United Nations voting behavior. Philip Schrodt elaborated mathematical models of arms races. Donald Sylvan used autoregressive integrated moving averages to explain the dynamics of arms transfers between nations. Stuart Thorson once wrote of visualizing the process of foreign policy adaptation as a chreod. And so on, with few exceptions.

Dissatisfaction crept in, however. The central theme of articles expressing dissatisfaction (see, for example, Alker and Christensen, 1972; Schrodt, 1984) was the inadequate expressibility inherent in the way they had been studying international relations.


[Figure 1: Weinberg's Classification of Subjects (from Weinberg, 1975, p. 18). The original figure plots randomness against complexity and marks three regions: I, organized simplicity (mechanisms), the domain of analytical treatment; II, unorganized complexity, the domain of statistical treatment; and III, organized complexity (systems), tractable by neither.]

This problem had numerous facets: inadequate expressibility of human reasoning and intentionality, of the richness of qualitative IR data, of the nonrandom but untidy patterns found in human behavior. Above all, what was most troublesome about the methods they employed was that they seemed incapable of offering an explanation of human behavior in IR that was fundamentally satisfying. The remark, attributed to Rudolph Rummel, to the effect of "if your r² is 1.0, then you have explained the phenomenon" did not sit well with some scholars who were, in fact, busy calculating r²s themselves.

This problem is not unique to IR: It is endemic to the social sciences. Philosophers of social science have amplified this dissatisfaction for decades now (see Taylor, 1985; Harré and Secord, 1973; Harré, 1984). When you bring the human being or collectivities of humans back in as the focus of research, 3 as these and other philosophers would have us do, the richness of human experience and human reasoning overpowers our methods. This is ironic because for quite some time now quantitative methods have been viewed as the most powerful methods available to the social scientist.

Gerald Weinberg's conceptualization of the problem remains a very useful one (see Figure 1). He breaks down realms of scientific inquiry into three regions according to the amount of randomness and complexity found therein (see Weinberg, 1975).


In Region I, which Weinberg calls the realm of "organized simplicity," or of "mechanisms," there is enough structure and a small enough population to allow analytic or mathematical methods to accurately and completely explain and predict phenomena. In Region II, the region of "unorganized complexity," there is such a lack of structure (or, alternatively, presence of randomness), and such large N sizes, that statistical methods are the appropriate tool of inquiry. However, in Region III, that of "organized complexity," lie all the realms too complex for analytical treatment and too structured for statistical treatment. The realm of human experience, decision making, and behavior is most profitably seen as such a "medium-number system." It is quite possible to apply the methodology of Regions I and II to understand humans and their behavior, but Weinberg likens this to using a band saw on one's fingernails:

[A] band saw is [not] responsible for the consequences of its being used to trim fingernails. If fingernails need cutting, and the band saw is the only available cutting tool, then the results are more or less predictable. A band saw is a most useful tool, but not for certain jobs (Weinberg, 1975, 20).

The search for more refined tools of greater expressive power led some to paths quite foreign: computer science, information processing, social psychology, linguistic analysis, hermeneutics, and even a return to the more traditional case study approach. Such journeys can be traced in the contributions to this volume.

Two observations merit making at this point. The first is that, with some exceptions, 4 there is still commitment to what we might call certain "scientific desiderata" among those who do AI/IR. One reason that the transition from QIP to AI/IR was not painful was that the techniques of the latter enabled researchers to keep the rigor, explicitness, and overt evaluation criteria of the former. The attractiveness to some of an approach capable of expressing the intricacies of human reasoning and experience while at the same time affording some vision of scientific progression should not be understated.

The second observation is that the odyssey made by many who have contributed to this volume is, in a fundamental sense, a very familiar one. It is common to all forms of human inquiry to seek means of expressing puzzles or problems that could not previously be expressed. At some point, scientists become inventors: They invent the world they personally wish to explore. I see the pieces in this volume as the work of inventors, adapting existing tools and constructing new ones in order to see what is to them more interesting than what they could see with the tools at hand. Robert Root-Bernstein, a biochemist and historian of science, could easily have written of AI/IR when he stated,


It is not science's ability to reach solutions that needs to be accounted for, but its ability to define solvable problems. . . . [I]nduction, deduction, and abduction will not suffice to solve the range of problems scientists address. Scientific "tools of thought" are more diverse than this and have been developed not only to reason and to test, but to invent. These tools of invention include (and are probably not limited to): abstracting, modeling, analogizing, pattern forming and pattern recognition, aesthetics, visual thinking, playacting, and manipulative skill. I suggest that scientists might better educate their successors if they included these tools of thought in teaching their science (Root-Bernstein, 1989a, 486).

As with any odyssey, there are pitfalls, dead ends, wrong turns. AI/IR represents no magical shortcut to the explanation of international relations. Indeed, Charles Taylor's taunt of several years ago should still bother those who choose this journey: Theories [such as those found in artificial intelligence] lead to very bad science: either they end up in wordy elaborations of the obvious, or they fail altogether to address the interesting questions, or their practitioners end up squandering their talents and ingenuity in the attempt to show that they can after all recapture the insights of ordinary life in their manifestly reductive explanatory languages (Taylor, 1985, 1).

What is so very ironic is that AI/IR does wish to recapture the insights of ordinary life, to wit, how humans reason about international relations. Furthermore, use of a computer does consign one at this point to a "manifestly reductive language." What remains to be seen, pace Taylor, is the value of reintroducing to IR the study of how policy-makers and scholars reason about IR and explain IR to themselves. I suspect that value to be quite high.

As Mefford argues in this volume, to provide satisfying explanation in international relations, we must strive for "strong" theory. Such theory, in a domain devoted to the study of intentional behavior produced by human reasoning as is international relations, would produce models that capture this process as realistically as possible. AI/IR models go a long way toward realizing that objective and, by so doing, progress what we are capable of offering as explanation in international relations. In my opinion, AI/IR bridges the gap between QIP and traditional IR by merging the strengths of the former with the strengths of the latter. (See Hudson, 1987.) Now one can speak of modeling, in a rigorous and explicit manner, the richness of IR as a human activity, with all that that adjective implies. To begin to accomplish that, even at first by means of a reductionist technology, could be revolutionary in the study of IR. It may well be the catalyst for a whole new class of theory-building efforts in international relations.


At the very least, the efforts of those who head in this direction deserve serious scrutiny and reflection-hence this volume.

Philip Schrodt opens the volume by tracing the evolution of AI/IR by reference to the evolution of AI in general. The remainder of the volume is divided into two major sections: Conceptual Issues and Practical Concerns, and AI/IR Research. The first section is designed to acquaint the reader with general theoretical and applied issues that are of interest and relevance to the subfield. Helen Purkitt's chapter reinforces the shift to a computational modeling approach in IR by surveying recent empirical findings in social psychology (and related fields), cataloguing what is known about the idiosyncrasies of human reasoning and perception. Dwain Mefford's piece, which follows, takes the reader on an informative reconnaissance of AI approaches: rule-based systems, case-based systems, and explanation-based learning systems. Mefford compares their strengths and weaknesses by constructing a model of political reasoning about coups d'etat using each of the three approaches. Hayward Alker and his coauthors then introduce the study of "natural language processing" (NLP), which can be seen as distinct from much of the work in the volume in terms of its objectives. NLP work endeavors to construct computer-aided tools for the systematic processing of what some consider to be the heart of politics-textual accounts of political phenomena. James Bennett and Stuart Thorson also explore how natural language accounts of political concepts-specifically, deterrence-can be formalized using computer languages. This formalization allows for a type of analysis impossible using other methods. Rounding out this first section is Howard Tamashiro's chapter, which illustrates how extremely important strategic concepts (here, time) are often neglected in the absence of techniques capable of capturing their nuances, which techniques can be found in computational modeling.

The second section of the volume showcases actual research (completed or ongoing) produced by those working in the AI/IR subfield. This section is subdivided into two parts to distinguish different broad categories of AI/IR research. It should be noted that these subdivisions are not cut-and-dried, and that the reader will discover some overlap. The first subdivision of research I have labeled "International Events and Foreign Policy Decision Making." The research I have called "International Events" attempts to uncover patterns in the behavior of nations by utilizing AI techniques developed to facilitate such a process through use of a computer. These models are not designed to simulate how national authorities came to produce such behavior; in that respect, they constitute a "minority line" of research within the subfield. However, they can be viewed as efforts to make the vast capabilities of the computer perform tasks similar to those which international relations scholars perform when they endeavor to make sense of sequences of international events.


Schrodt's chapter uses pattern recognition techniques to tease out similarities and dissimilarities in sequences of events composing international crises. Chapter 8 posits a performance-oriented rule-based production system enabling one to postdict to an events data set.

The "Foreign Policy Decision Making" research consists of work aimed at modeling the information-processing and policy decision-making activities of political actors. Brian Job and Doug Johnson explicate their UNCLESAM program, a rule-based simulation, which captures the basic structures and processes of U.S. decision making toward Latin America. Donald Sylvan and his colleagues detail JESSE-a model of Japanese energy policy making that incorporates the theory of generic tasks in its architecture. JESSE's design permits it to function as a compiled reasoning system. Margee Ensign and Warren Phillips discuss the application of AI/IR methods to understanding the development policy choices of Third World leaders. In addition to assisting them in this primary task, the authors note that the use of AI/IR methods permits them to envision their model as a classroom teaching tool as well.

The second research subsection, "The Discourse of Foreign Policy," involves the search for techniques to interpret, process, and critically analyze textual material of relevance to the study of international relations. G.R. Boynton uses the notion of "interpretive triples" to discover how the Senate Foreign Relations Committee attempted to make sense of the Reagan administration's foreign policy during the summer of 1987. Sanjoy Banerjee posits a "computational hermeneutic" model of Cold War superpower rhetoric and shows how each side helped to "reproduce" the Cold War script of its adversary. David Sylvan and his coauthors discuss how the notion of "grounded theory" helps them to abduce categories in textual data consistent with the self-understanding of those who generated the text. Finally, the chapters by John Mallery and Gavan Duffy detail the components of the RELATUS system introduced in the contribution by Hayward Alker et al. in the conceptual section of the volume.

Grateful acknowledgments are due those who helped make this volume a reality. Without the support of the David M. Kennedy Center for International Studies at Brigham Young University and of Hayward R. Alker, Jr., this volume probably would not exist. Without the encouragement and advice of Donald Sylvan, I would never have attempted this project. He and Philip Schrodt were crucial in supporting me through the grimmer moments of editorship, and their advice proved invariably sound. Brian Job was also very helpful. I would like to thank John Mallery and Don Sorenson for helping to fill the gaps in my knowledge on a variety of matters. Without the hard work of Louis Floyd, the manuscript would never have assumed its final form. Several tough figures were rendered camera-ready by Beccy Martin's selfless assistance.


I am also appreciative of the financial support provided by the Department of Political Science at Brigham Young University. Finally, I would like to dedicate this volume to my grandmother, Roberta Edstrom, whose memories made me whole and whose example lights my way.

Notes

1. Computational modeling is the term to be preferred over AI modeling in this regard, for the models put forth in this volume do not aspire to the ultimate goal of AI, which is to produce a machine that can in some significant sense be said to possess human intelligence. However, since most of the methodology used is derived from AI, as long as the differentiation in objective is understood, I see no reason not to make use of the more widely recognized acronym of "AI."
2. The term originates from the two edited volumes, both entitled Quantitative International Politics, that came out during the heyday of this type of analysis in international relations.
3. Whether the human in question be the researcher (see Schrodt, Mefford, Hudson pieces this volume) or the research subject.
4. David Sylvan makes a strong case for a "possibilist" research agenda, and this finds echoes in the works of others, as well, such as Hayward Alker, Stuart Thorson, James Bennett, and Gavan Duffy.

Bibliography

Alker, Hayward, and C. Christensen, 1972. "From Causal Modelling to Artificial Intelligence: The Evolution of a United Nations Peace-keeping Simulation," in J.A. LaPonce and Paul Smoker (eds.), Experimentation and Simulation in Political Science, Toronto: University of Toronto Press.
Harré, Rom, 1984. Personal Being, Cambridge, Massachusetts: Harvard University Press.
Harré, Rom, and P.F. Secord, 1973. The Explanation of Social Behavior, Totowa, New Jersey: Littlefield, Adams, & Co.
Hudson, Valerie M., 1987. "Using a Rule-Based Production System to Estimate Foreign Policy Behavior: Conceptual Issues and Practical Concerns," in Stephen Cimbala (ed.), Artificial Intelligence and National Security, Lexington, Massachusetts: Lexington Books, pp. 109-132.
Root-Bernstein, Robert, 1989a. "How Scientists Really Think," Perspectives in Biology and Medicine, 32, 4, Summer, pp. 472-488.
___, 1989b. Discovering, Cambridge, Massachusetts: Harvard University Press.
Schrodt, Philip, 1984. "Artificial Intelligence and the State of Mathematical Modeling in International Relations." Paper presented at the U.S. and Swiss National Science Foundation Conference on Dynamic Models of International Conflict, Boulder, Colorado, October 31-November 3.


Taylor, Charles, 1985. Philosophy and the Human Sciences, New York: Cambridge University Press.
Weinberg, Gerald, 1975. An Introduction to General Systems Thinking, New York: John Wiley and Sons.

1
Artificial Intelligence and International Relations: An Overview
Philip A. Schrodt

Artificial intelligence techniques have been used to model international behavior for close to twenty years. Alker and his students at MIT were generating papers throughout the 1970s (Alker and Christensen 1972; Alker and Greenberg 1976; Alker, Bennett, and Mefford 1980), and by the early 1980s work at Ohio State was responsible for the first article to apply AI in a mainstream IR journal (Thorson and Sylvan 1982) and the first edited book containing a number of AI articles (Sylvan and Chan 1984). This volume provides the first collection of essays focusing on AI as a technique for modeling international behavior, 1 an approach commonly, if controversially, labeled "AI/IR." The essays come from about twenty-five contributors at fifteen institutions across the United States and Canada; the substantive foci range from Japanese energy security policy to Vietnam policy in the Eisenhower administration.

The purpose of this overview is twofold. First, it will provide a brief background of relevant developments in AI in order to provide some perspective on the concepts used in AI/IR. Second, it will identify some common themes in the AI/IR literature that use artificial languages for modeling. I will not deal with any of the issues in depth, nor provide extensive bibliographical guidance, as these are ably presented in the chapters themselves (e.g., those by Mefford and Purkitt in this volume; Mallery [1988] also provides an excellent introduction). I will not discuss the natural language processing (NLP) literature-which is covered in the chapters by Mallery, Duffy, and Alker et al.-though I will discuss projects that use text as a source of information for development of models rendered in artificial language (e.g., Bennett and Thorson; Boynton; Sylvan, Milliken, and Majeski).


This chapter does not purport to provide a definitive description of the AI/IR field; it is simply one person's view of the organization of the field at the moment. In contrast to many other modeling approaches, the AI/IR community is characterized by a healthy level of internal debate. This chapter is the overture, not the symphony. I intend only to draw your attention to themes; the details, in both melody and counterpoint, are found in the chapters that follow.

*This research was supported in part by National Science Foundation Grant SES-8910738 and by the University of Kansas General Research Allocation 3884-XQ-0038. My thanks to John Mallery for helpful comments on an earlier draft.

Artificial Intelligence

The label "artificial intelligence" is, ironically, rejected by a majority of the authors in this volume as a description of their shared endeavor. The preferred label is "computational modeling," which acknowledges the field's intellectual roots in the formal modeling and computer simulation literature within political science, rather than in the AI literature of computer science. As will be noted below, the AI/IR efforts utilize only a tiny subset of AI methods, and in many respects Al/IR overlaps at least as much with cognitive psychology as with computer science. The AI label poses two additional problems. The most severe is guilt by association with "the AI hype": the inflated claims made for AI by the popular media, science fiction, and consulting firms. The AI hype has been followed by the backlash of the "AI winter," and so AI/IR risks being caught in a counterrevolution just as it is beginning to produce results. The second problem is the controversial word "intelligence." In the AI hype, "intelligence" has usually been associated with superior intelligence such as that exhibited by Star Wars robots (either the George Lucas or Ronald Reagan variety). The most common retort I encounter when presenting AI/IR overviews to unsympathetic audiences is: "You can't model politics using artificial intelligence; you'd have to use artificial stupidity. " 2 As the chapters that follow indicate, that is precisely our shared agenda! "Artificial stupidity" involves limited information processing, heuristics, bounded rationality, group decision processes, the naive use of precedent and memory over logical reasoning, and so forth. These features of human reasoning, amply documented in the historical and psychological literature, are key to AI/IR but largely absent from optimizing models of the dominant formal paradigm in political science, rational choice (RC). Ironically, the true "artificial" intelligence is utility maximization, not the processes invoked in computational models. All this being said, one must confront two social facts. First, the term "computational modeling" has not caught on because it is not reinforced by the popular media. Second, Al/IR has borrowed considerably from that part of computer science and the cognitive sciences which calls itself


"artificial intelligence," including the widespread use of LISP and Prolog as formal languages, the formalization of rules, cases and learning, and a great deal of vocabulary. In the spirit of mathematician David Hilbert's definition of geometry as "that which is done by geometers," the AI label will probably stick. AI in the Early 1980s

The term "artificial intelligence" refers to a very large set of problems and techniques ranging from formal linguistic analysis to robots. Researchers in "AI" may be mathematicians or mechanics, linguists or librarians, psychologists or programmers. Schank (1987:60) notes: Most practitioners would agree on two main goals in Al. The primary goal is to build an intelligent machine. The second goal is to find out about the nature of intelligence. . . . [However,] when it comes down to it, there is very little agreement about what exactly constitutes intelligence. lt follows that little agreement exists in the AI community about exactly what AI is and what it should be.

Research in AI has always proceeded in parallel, rather than serially, with dozens of different approaches being tried on any given problem. As such, AI tends to progress through the incremental accumulation of partial solutions to existing problems, rather than through dramatic breakthroughs. Nonetheless, from the standpoint of AI/IR, there were two important changes in AI research in the late 1970s.

First, rule-based "expert systems" were shown to be able to solve messy and nontrivial real-world problems such as medical diagnosis, credit approval, and mechanical repair at the same level of competence as human experts (see, for example, Klahr and Waterman 1986). Expert systems research broke away from the classical emphasis in AI on generic problem solving (e.g., as embodied in chess-playing and theorem-solving programs) toward an emphasis on knowledge representation. Expert systems use simple logical inference on complex sets of knowledge, rather than complex inference on simple sets of knowledge. The commercial success of expert systems led to an increase in new research in AI generally-the influx of funding helped-and spun off a series of additional developments such as memory-based reasoning, scripts, schemas, and other complex knowledge representation structures.

Second, the personal computer, and the exponential increase in the capabilities of computers more generally, brought the capabilities of a 1960s mainframe onto the researcher's desk. The small computers also freed AI researchers from dependence on the slow and idiosyncratic software development designed for centralized mainframes. The "AI style" of programming led to a generation of programmers and programming environments able to construct complicated programs that would have been virtually impossible using older languages and techniques.


All of this activity led to a substantial increase in the number of people doing AI. The American Association for Artificial Intelligence (AAAI) was founded in 1979, had 9,935 members by 1985 and 14,269 by 1986-a growth of 43 percent in a single year. In short, AI in the 1980s was accompanied by a great deal of concrete research activity, in contrast to faddish techniques such as catastrophe theory.

The AI Hype

Perhaps predictably, the increase in AI research was accompanied (and partially fueled) by a great deal of hype in the popular and semiprofessional media. An assortment of popular books on AI were produced by researchers such as Feigenbaum (Feigenbaum and McCorduck, 1983; Feigenbaum, McCorduck, and Nii, 1988), Minsky (1986), and Schank (Schank and Riesbeck, 1981); at times these reached sufficient popularity to be featured by paperback book clubs. Journalists such as McCorduck (1979), Sanger (1985), and Leithauser (1987) provided glowing appraisals of AI; these are only three of the hundreds of books and popular articles appearing in the early to mid-1980s. Concern over the Japanese "Fifth Generation Project" (Feigenbaum and McCorduck, 1983) provided impetus for the wildly unrealistic 3 "Strategic Computing Initiative" of the Defense Advanced Research Projects Agency (DARPA, 1983).

These popular works provided a useful corrective to the outdated and largely philosophical criticisms of Dreyfus (1979) and Weizenbaum (1976) about the supposed limits of AI. By the early 1980s researchers had made substantial progress on problems that by any reasonable definition required "intelligence" and were exhibiting performance comparable to or exceeding that of humans. However, the popularizations were understandably long on concepts and short on code, and their explicit or implicit promises for continued exponential expansion of the capabilities of various systems did not take into account the tendency of technological innovation to follow a logistic curve. 4 Because the promises made in these popularizations were based largely on laboratory results that had not been scaled up nor widely applied in real-world settings, such promises set up AI for a fall.

The AI Winter

The hype of the mid-1980s leveled off by the latter part of that decade and some segments of the AI community-particularly companies producing specialized hardware-experienced the "AI Winter." However, the decline of AI was more apparent than real and reflected the short attention span of the popular press as attention turned away from AI to global warming, superconducting supercolliders, parallel processing, and cold fusion.


Experimental developments, most notably neural networks, continued to attract periodic media attention, but the mainstream of AI assumed a level of glamor somewhere between that of biotechnology and X-ray lasers: yesterday's news, and somewhat suspect at that. Yet ironically, the well-publicized bankruptcies of "AI firms" (see, for example, Pollack 1988) were due to the success rather than the failure of AI. As AI techniques moved out of the laboratories and into offices and factories, commercial demand shifted from specialized "AI workstations" and languages such as LISP to systems implemented on powerful general-purpose microcomputers using standard procedural programming languages such as C or off-the-shelf expert systems shells. AI research moved in-house and was diffused into thousands of small applications rather than a few large ones.

Overall, the AI field remained very healthy. Although membership in the AAAI declined in 1988 and 1989, dropping to 12,500 members, the 1989 International Joint Conference on Artificial Intelligence, the AI equivalent of the International Political Science Association, was large enough to require the Detroit Convention Center, fill every convention hotel in downtown Detroit and nearby Windsor, Ontario, and all this despite a $200 conference registration fee.

Beyond the issues of popular perception, it is important to note that the future of AI/IR is largely independent of the successes or failures of AI generally. Whether a chess-playing program will be able to defeat the reigning human grand master or whether simultaneous translation of spoken language is possible will have no effect on most AI/IR research. Even if mainstream AI has some implications for the development of computational models of international behavior, the AI/IR literature is primarily shaped by literatures in psychology, political science, and history rather than computer science. The techniques borrowed from computer science are only tools for implementing those theories.

AI/IR Research: A Framework

This section will attempt to structure the various sets of problems studied in AI/IR. For example, there has been frequent confusion outside the field as to why discourse analysis (represented in this volume by Boynton; Thorson and Bennett; and Sylvan, Milliken, and Majeski) should have anything to do with rule-based models (e.g., Job and Johnson), because the techniques are entirely different. The simple answer is that the AI/IR literature is primarily linked by underlying theories and questions rather than by methodology. Although this is consistent with classical Kuhnian notions of science, it is decidedly uncharacteristic of formal approaches to the study of international behavior such as correlational analysis, arms-races models, and game theoretic models of war initiation, which are largely linked by technique.


As noted earlier, any effort to find common themes in a field as conceptually rich and disputatious as AI/IR is fraught with the risk of oversimplification: This chapter is simply a survey of the high points. AI/IR developed in an evolutionary fashion; the organization I have presented below is a typology imposed, ex post facto, on an existing literature, rather than an attempt to present a consensus view of where the field is going.

The typology consists of three parts. The first category is the research on patterns of political reasoning, which provides the empirical grounding for models of organizational decision making. The second category involves the development of static models of organizational decision making, which aim to duplicate the behavior of an organization or system at a specific point in time. This is the largest part of the literature in terms of models that have actually been implemented, and it relies on the expert systems literature in the AI mainstream. The final category contains dynamic models that incorporate precedent, learning, and adaptation, which can show how an organization acquired its behavior as well as what that behavior is. These models are necessarily more elaborate and experimental, though some large-scale implementations exist, notably the JESSE model of Sylvan, Goel, and Chandrasekaran.

Patterns of Political Reasoning

The Psychological Basis. Virtually all work in AI/IR acknowledges an extensive debt to experimental work in cognitive psychology.

Although a wide variety of approaches are cited, two literatures stand out. The first influence is the work of Allen Newell and Herbert Simon on human problem solving (Newell and Simon 1972; Simon 1979, 1982). Simon's early work pointing to the preeminence of satisficing over maximizing behavior is almost universally accepted in AI/IR, as is the Newell-Simon observation that human cognition involves an effectively unlimited (albeit highly fallible) memory but a fairly limited capacity for logical reasoning. These assumptions about human problem solving are exactly opposite those of the rational choice approach, where cognition involves very little memory but optimization is possible. In addition to these general principles, other work by Newell and Simon on specific characteristics of human problem solving is frequently invoked, for example the re-use of partial solutions and the distinction between expert and novice decision making.
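The computational difference between the two assumptions is easy to state. The following minimal sketch (in Python, with invented policy options and payoff values purely for illustration; it is not drawn from any model in this volume) contrasts a maximizer, which must score every option, with a Simon-style satisficer, which accepts the first option that clears an aspiration level:

    # Contrast between utility maximization and Simon's satisficing.
    # Options and payoffs below are hypothetical, for illustration only.

    def maximize(options, utility):
        """Rational choice: evaluate every option and return the best."""
        return max(options, key=utility)

    def satisfice(options, utility, aspiration):
        """Simon: accept the first option that is 'good enough'."""
        for option in options:
            if utility(option) >= aspiration:
                return option
        return None  # nothing acceptable; a real actor would widen the search

    policies = ["do nothing", "protest note", "sanctions", "blockade"]
    payoff = {"do nothing": 2, "protest note": 5, "sanctions": 6, "blockade": 4}.get

    print(maximize(policies, payoff))      # "sanctions": requires scoring all options
    print(satisfice(policies, payoff, 5))  # "protest note": first option over the bar

Note that the satisficer's answer depends on the order in which options are considered, which is one reason search order and memory matter in these models.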


The second very large experimental literature is the work of Daniel Kahneman, Paul Slovic, Amos Tversky (KST), and their associates in exploring the numerous departures of actual human decision making from the characteristics predicted by utility maximization and statistical decision theories (Kahneman, Slovic, and Tversky 1982). This work has emphasized, for example, the importance of problem framing, the use of heuristics, the effects of familiarity and representativeness, and so forth.

Even though the experimental results, general principles, and concepts of these two literatures are used extensively in AI/IR, their theoretical frameworks are not. Newell and his students have developed a general computational paradigm for AI, SOAR (Laird, Rosenbloom, and Newell 1986; also see Waldrop 1988), but to my knowledge it has not been applied in the IR context. "Prospect theory," the term usually applied to the KST work, is also rarely used. These research results are used instead to explicate some of the characteristics of IR decision making, which, because it is organizational and frequently involves unusual circumstances such as the decision to engage in lethal violence, is far removed from the individualistic studies of much of the psychological literature. Some work on group decision-making dynamics has also been used-for example, Pennington and Hastie on decisions by juries (cited in Boynton)-but in general this literature is smaller and less well known in cognitive psychology than the theories and experiments on individuals.

Knowledge Representation. Consistent with the Newell-Simon approach, AI/IR models are heavily information-intensive. However, the theory of data employed in AI/IR is generally closer to that of history or traditional political science than it is to statistical political science. Most of the chapters in this volume use archival text as a point of departure; the remainder use secondary data such as events that were originally derived from text using procedures similar to content analysis. The ordinal and interval-level measures common to correlational studies, numerical simulations, and expected utility models are almost entirely absent.

This, in turn, leads to the issue of knowledge representation, which is a theme permeating almost all of the papers and which accounts for much of their arcane vocabulary and seeming inconsistency. Whereas behavioral political analysis essentially has only three forms of knowledge representation-nominal, ordinal, and interval variables-AI presents a huge variety, ranging from simple if ... then statements and decision trees to scripts and frames to self-modifying programs and neural networks. This surfeit of data structures is both a blessing and a curse. It provides a much broader range of alternatives than are available in classical statistical or mathematical modeling, and certainly provides a number of formal structures for representing the large amounts of information involved in political decision making; this comes at the expense of a lack of closure and an unfamiliar vocabulary. Part of this problem stems from the fact that knowledge representation concepts have yet to totally jell within their parent discipline of computer science. In this regard AI/IR is quite different from the behavioralist adoption of statistical techniques and the RC adoption of economic techniques: In both cases stable concepts and vocabulary were borrowed from more mature fields.
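To make this variety concrete, here is a minimal sketch (in Python; all of the political content is invented for illustration and corresponds to no actual AI/IR system) of three representations mentioned above: a nominal variable of the behavioral sort, a simple if ... then rule, and a frame with slots of the kind used in scripts and schemas:

    # Three representations of political knowledge, from poorest to richest.
    # All content is illustrative only.

    # 1. Behavioral/statistical style: a single nominal variable.
    regime_type = "authoritarian"

    # 2. A simple if ... then rule.
    def escalation_rule(posture, stability):
        if posture > 4 and stability >= 5:
            return "increase use-of-force level"
        return "no change"

    # 3. A frame: a structured object with slots, defaults, and an
    #    expected event sequence, closer to scripts and schemas.
    crisis_frame = {
        "type": "coup-threat",
        "actors": {"target": "small state", "patron": "great power"},
        "preconditions": ["elite split", "economic decline"],
        "expected_sequence": ["unrest", "ultimatum", "seizure of power"],
        "default_response": "consult regional allies",
    }

    print(escalation_rule(posture=5, stability=6))
    print(crisis_frame["expected_sequence"])

The frame can hold the kind of qualitative, text-derived information described above, which has no natural home in a nominal, ordinal, or interval variable.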


Structure of Discourse and Argument. For the outsider, perhaps the most confusing aspect of the AI/IR literature is the emphasis on the analysis of political argument and discourse. This type of analysis is found in the articles by Boynton; Sylvan, Milliken, and Majeski; and Bennett and Thorson; more sophisticated tools for dealing with discourse are found in the natural language processing (NLP) articles. At first glance, rummaging through the Congressional Record or using Freedom of Information Act requests to uncover Vietnam-era documents is the antithesis of the formal modeling approach: Archival sources are the stuff of history, not models. In fact, these analyses are at the core of modeling organizational cognition. As such the archival work is simply specialized research along the lines of the general psychological studies.

The political activities modeled in AI/IR are, without exception, the output of organizations. Because the AI/IR approach assumes that organizations are intentional and knowledge-seeking, it is important to know how they reason. Conveniently, organizations leave a very extensive paper trail of their deliberations. Although archival sources do not contain all of the relevant information required to reconstruct an organization's behavior-organizations engage in deliberations that are not recorded and occasionally purposely conceal or distort the records of their deliberations-it is certainly worthy of serious consideration. 5

Static Modeling: Rule-Based Systems

Rule-based systems (RBS) are currently the most common form of AI/IR model, and even systems that go well beyond rules, such as the JESSE simulation, contain substantial amounts of information in the form of rules. Contemporary RBS are largely based on an expert systems framework, but the "production systems" that dominated much of AI modeling from the late 1950s to the early 1970s are also largely based on rules; early production system models of political behavior include Carbonell (1978) in computer science and Sylvan and Thorson (1982) in IR. In addition to chapters by the authors in this volume, other models of international behavior using the rule-based approach have included Soviet crisis response (Kaw 1989), Chinese foreign policy (Tanaka 1986), the political worldview of Jimmy Carter (Lane 1986), and Chinese policy toward Hong Kong (Katzenstein 1989). In its simplest form an RBS is just a large set of if ... then statements. For example, a typical rule from Job and Johnson's UNCLESAM program-a simulation of U.S. policy toward the Dominican Republic-has the form

    IF    U.S. Posture to the Dominican Republic government > 4 and
          Stability Level >= 5 and
          Stability Level Change > 0
    THEN  Increment U.S. Use of Force Level by 1

An RBS may have hundreds or thousands of such rules; they may exist independently, as in Job and Johnson or production system models, but more typically are organized into hierarchical trees (for example, Hudson in this volume or Kaw 1989). Typical commercial expert systems used for diagnosis or repair have about 5,000 rules; most AI/IR systems are far simpler. Mefford's chapter in this volume describes in considerable detail RBS developments beyond basic if ... then formulations; one should also note that the boundaries between the more complicated RBS and other types of models (for example, case-based reasoning and machine learning systems) are extremely fuzzy. Nonetheless, virtually all AI models encode some of their knowledge in the form of rules. 6

Despite the near ubiquity of rules in computational models, this approach stands in clear contrast to all existing formal modeling traditions in political science, which, without exception, use algebraic formulations to capture information. These methods encode knowledge by setting up a mathematical statement of a problem and then doing some operations on it (in RC models, optimization; in statistics, estimation; in dynamic models, algebraic solution or numerical simulation). The cascading branching of multiple rules found in RBS is seldom if ever invoked; when branches are present they usually only deal with boundary conditions or bifurcations 7 and are simple in structure.

Although much of the impetus for the development of RBS in political science came from their success in the expert systems literature, rules are unusually well suited to the study of politics, as much of political behavior is explicitly rule-based through legal and bureaucratic constraints. Laws and regulations are nothing more than rules: These may be vague, and they certainly do not entirely determine behavior, but they constrain behavior considerably. Analyses of the Cuban Missile Crisis, for example, repeatedly observe that the military options were constrained by the standard operating procedures of the forces involved. Informal rules-"regimes" or "operational codes" in the IR literature (e.g., Krasner 1983; George 1969)-impose additional constraints. For example, in the Cuban Missile Crisis, John F. Kennedy did not consider kidnapping the family of the Soviet ambassador and holding them hostage until the missiles were removed, though in some earlier periods of international history (e.g., relations between the Roman and Persian empires, circa 200 C.E.) this would have been considered acceptable behavior.
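The mechanics of such a system are easily sketched. The fragment below (Python; the state variables and thresholds paraphrase the UNCLESAM rule quoted above, while the second rule and all values are invented) applies a small rule base to a state vector, firing each rule at most once per simulated period; real systems add conflict resolution among competing rules and far larger rule sets:

    # A miniature rule-based system in the spirit of the UNCLESAM example.
    # The state values and the second rule are hypothetical.

    state = {"posture": 5, "stability": 6, "stability_change": 1, "force_level": 2}

    def rule_escalate(s):
        # Paraphrase of the quoted UNCLESAM rule.
        if s["posture"] > 4 and s["stability"] >= 5 and s["stability_change"] > 0:
            s["force_level"] += 1

    def rule_deescalate(s):
        # Invented companion rule, for illustration.
        if s["stability"] < 3 and s["force_level"] > 0:
            s["force_level"] -= 1

    rules = [rule_escalate, rule_deescalate]

    def run_period(s):
        """Apply each rule at most once per period (crude refraction)."""
        for rule in rules:
            rule(s)

    run_period(state)
    print(state["force_level"])  # 3: the escalation rule fired once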


In short, rule-based political behavior is not an "as if" proposition: It can be empirically confirmed. However, because bureaucracies do not solely follow their rules-in fact most bureaucracies would be paralyzed if they attempted to do so-the extent to which rules can capture actual behavior and the complexity required to do so is an open question. As the chapters in this volume and other RBS research indicate, it is clearly possible to simulate political behavior using rules. The complexity of these systems, though substantially greater than the complexity of most existing formal models (other than simulations), is also well within the limits of existing RBS developed in other fields.

Despite the widespread use of rules in AI/IR models, many of the chapters in this volume argue against rule-based formulations, or at least indicate problems with the use of rules. This should not be interpreted as a rejection of any use of rules but only as a rejection of depending solely on rules. This, in turn, is a reaction to theoretical issues in AI rather than political science: A variety of approaches advocate rules in some form as a universal standard of knowledge representation and argue that all work in AI should use a single, unified concept of knowledge representation and manipulation. This comprehensive approach is rejected by almost all of AI/IR work as premature at best, given the evidence from the cognitive and organizational decision-making literature.

Dynamic Modeling: Adaptation and Learning

Learning is one of the most basic characteristics of human cognitive behavior but has been largely absent from existing formal models in political science. 8 One of the most distinctive-and potentially revolutionary-characteristics of the AI/IR models is the use of learning as a dynamic element. By attempting to model not only what organizations do but why they do it-in the sense of providing an explanation based on the prior experience of the organization-these models can potentially provide greater detail and process validity than those currently available.

An assortment of learning schemes are currently under development, but most involve at least two elements. First, the basic rule of learning is "Bureaucracies do not make the same big mistake twice." Reactions to a situation are based in part on precedents, and the success or failure of a particular plan will affect whether it is used in the future. Knowledge is modified, not merely acquired. The second general characteristic is the reapplication of solutions that worked in the past to problems encountered in the present: In the JESSE simulation this is called "compiled reasoning." As Mefford notes, this same concept is central to Polya's scheme of human problem solving; more recently it has been the central focus of the Laird, Rosenbloom, and Newell SOAR project.


The content of an organization's compiled reasoning in turn depends on the problems it has previously encountered, and so the history of the organization is important in determining what it will do. This emphasis on learning and adaptation provides the "path dependence" mentioned in a number of the chapters. International politics is not like a game of chess where future plays can be evaluated solely on the basis of the current board position. Instead, how a situation came into being may be as important as its static characteristics. Deductive reasoning from first principles alone is not sufficient. Learning also partially resolves the problem of underdetermination in satisficing-it indicates not just the suboptimal character of decisions but also substantially narrows which suboptimal decision will be made.

Learning is such a basic human activity that it is taken for granted in the informal bureaucratic politics literature, though in recent years a number of studies reemphasizing the importance and problems of learning as a separate activity have emerged (see, for example, Neustadt and May 1986 on precedent and analogy; Etheredge 1985 on learning in foreign policy; Margolis 1987 on patterns). The difference is that the AI/IR models have succeeded in formally modeling this learning process in a complex environment, which no other models have. 9 To the extent that learning and adaptation are key human behaviors, this is a major step forward.

Perhaps because learning is so inherently human, the extent to which it can be simulated is frequently underestimated, if not rejected outright. One of the most popular-and most inaccurate-characterizations of computer models is "A computer can't do anything it wasn't programmed to do." Strictly speaking, this may be true: A computer can't do in five minutes anything a human with a pencil and paper couldn't do in eleven centuries of sustained effort, 10 but for practical purposes there is a qualitative difference. As a complex system, a computer can very easily be programmed to perform in ways unanticipated by its programmers. By that same token one can reject another hoary characterization, "Computers can't be creative." In fact, there are a variety of rather straightforward techniques by which programs can "create" solutions to problems that were unanticipated by their programmers, and the creation of such programs has been a longstanding research tradition in mainstream AI.

Modeling organizational learning involves modeling at least two different types of learning. The first, and simpler, learning is knowledge acquisition: the process by which precedents, analogies, cases, plans, or whatever are acquired and retained. The second and more difficult phase is modeling the adaptation of the means by which those are invoked. In other words, the process of "reasoning by analogy" involves both the availability of analogous situations and also the means by which the analogy is made: Neither issue is self-evident.
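A toy version of the first, simpler type of learning can be sketched as follows (Python; the similarity measure, cases, and feature sets are invented for illustration). The organization retains precedents with their outcomes and refuses to reuse a plan that failed in a sufficiently similar situation, a crude rendering of "bureaucracies do not make the same big mistake twice"; the second type of learning would correspond to modifying the similarity function itself:

    # Toy precedent memory: store cases with outcomes, block plans that
    # failed in similar situations. All cases and features are invented.

    case_library = []  # each case: (situation_features, plan, succeeded)

    def similarity(a, b):
        """Crude feature-overlap measure between two situations."""
        return len(a & b) / len(a | b)

    def learn(situation, plan, succeeded):
        case_library.append((situation, plan, succeeded))

    def choose_plan(situation, candidates, threshold=0.5):
        for plan in candidates:
            failed_before = any(
                past_plan == plan and not ok
                and similarity(situation, feats) >= threshold
                for feats, past_plan, ok in case_library
            )
            if not failed_before:
                return plan  # reuse a solution not known to have failed
        return "improvise"   # no stored experience applies

    learn({"island", "exile force", "covert"}, "sponsored overthrow", False)
    print(choose_plan({"island", "covert", "small force"},
                      ["sponsored overthrow", "overt intervention"]))
    # -> "overt intervention": the failed precedent blocks the first plan

Changing the threshold or the feature set changes which precedents are "available," which is precisely the second, harder modeling problem.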


Consider, for example, the situation of a failure of reasoning by analogy in the Bay of Pigs invasion. This exercise was modeled on the earlier successful CIA overthrows of Mossadegh in Iran and Arbenz in Guatemala. Had the Marines been involved in the planning, analogies might have been made to the problems of invading islands encountered in World War II, leading, presumably, to greater skepticism or at least better preparation. Hence one could say that the Bay of Pigs failed because only "overthrow" precedents were used to the exclusion of "invasion" precedents. Alternatively, one could argue that the situation itself had been incorrectly understood in the sense that important variables were not considered: The theory that identified the precedents was in error. If the prevailing theory indicates that two things should match and they did not, the organization will recognize that something is wrong with the theory and change it. Arguably, this occurred historically: The failure at the Bay of Pigs was attributed to the lack of sufficient force and a clear U.S. commitment, and therefore the next two U.S. interventions in the Third World-in the Dominican Republic and Vietnam-involved overt use of large numbers of troops. These two types of learning have very different implications for the structure of the model, and the question as to which dominates is an empirical one. Modeling the use of precedent is relatively straightforward, but organizational modifications to the interpretation of precedent may be more important, particularly when dealing with unexpected behaviors.

Precedent. The concept of precedent is probably second only to that of "rules" in the discussions in this volume, and precedent is closely linked to the issue of memory. In the simplest form, actors will simply do what they've done before, and the best predictor for behavior in a complex system is the patterns of history. Precedent, analogy, and case studies have strong antecedents in the traditional political literature (e.g., Neustadt and May 1986), as well as being formally invoked in Anglo-American legal reasoning. As with the use of rules, precedent does not require "as if" apologies: It is used openly and explicitly.

The use of precedent is, however, somewhat ambiguous. In foreign policy discourse, precedent is most likely encountered as a justification-in other words, it is invoked as an empirical regularity. In legal reasoning and in much of the case-based reasoning literature, in contrast, it is a plan-a series of actions which should be taken. In the latter sense a precedent is merely a complex antecedent clause of an if ... then clause with a complex consequent. These two uses of precedent are not mutually incompatible-in a sufficiently regular and well-defined system, the use of precedents as plans would cause them to become empirical regularities-but they are different.

As noted in several of the chapters, empirical studies of actual policy deliberations find little evidence of a dominant role for precedent. Several factors may account for this.


factors may account for this. First, in contrast to the legal arena, the international system is neither well defined nor particularly regular, and consequently the precedents are not necessarily clear. Furthermore, in IR discourse, a precedent such as "Munich," "Pearl Harbor," or "Vietnam" is most likely to be invoked as something to be avoided, not implemented. 11 Precedents used repeatedly are incorporated into the standard operating procedures of the organization-they become rules. Consequently, precedent may be a powerful tool for decision making even if it isn't actively invoked in debate. The second possible problem is that precedents are probably generalized into "ideal cases." If a decision-maker refers to the danger of a coup in El Salvador, what is usually invoked is not a specific coup 12 but rather coups in general.

Compiled Reasoning and Structure. On the surface, compiled reasoning is only a modest enhancement of existing psychological models: The notion that individuals and organizations reuse prior successful behavior is not particularly controversial and is certainly strongly supported by experimental evidence with individuals going back several decades. However, the critical contribution of such models may be in the specification of "learning-driven models" (LDMs)-models whose key dynamics are determined by learning. In particular, this might begin to concretize the heretofore exceedingly mushy concept of "structure" in political behavior.

For example, suppose that at any given time the actors in a system can be viewed (or modeled) as simple stimulus-response actors: Ceteris paribus, given an input, one can predict the resultant behavior. This, in turn, provides a set of mutual constraints on those activities, which we refer to as "structure." For example, in the 1970s any Eastern European state, on the basis of Soviet activities in Berlin, Hungary, and Czechoslovakia during the 1950s and 1960s, could reasonably assume that excessive economic and political liberalization would lead to Soviet military intervention. From the standpoint of the actor, this is simply a rule of the system.

However, in an LDM every event has the potential of changing those reaction functions-in other words, the system is self-modifying (either through the accumulation of cases or the modification of rules). Although most events do not change the functions, when learning occurs, the reaction functions of the system may change dramatically. For example, by 1989, after some initial experimentation along these lines by Poland had been reinforced, Eastern European states reacted as if economic and political liberalization would not cause Soviet intervention. Because some-if not most-of the knowledge of structure is embodied in complex qualitative structures, the change is not necessarily incremental, and it may be mutually reinforcing (as happened, for example, in Hungary, Poland, and the GDR in the autumn of 1989, followed later by Czechoslovakia, Bulgaria, and Romania) so as to cause a series of fairly dramatic changes.
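
An LDM can be caricatured in a few lines of code. The sketch below is a minimal illustration under invented stimuli, responses, and an invented update rule; nothing in it is taken from an actual model in this volume. Each actor is a stimulus-response table, most observations change nothing, and a disconfirming observation rewrites the table itself-the self-modification just described.

    # A minimal learning-driven model (LDM): a stimulus-response table
    # that most events leave untouched but some events rewrite.
    reaction = {
        # stimulus -> expected response, as "learned" from Berlin,
        # Hungary, and Czechoslovakia (all labels invented)
        "liberalize": "soviet_intervention",
        "conform":    "status_quo",
    }

    history = []

    def act(stimulus):
        """Ceteris paribus, behavior is predictable from the current table."""
        return reaction[stimulus]

    def observe(stimulus, outcome):
        """Most events change nothing; a disconfirming one rewrites the rule."""
        history.append((stimulus, outcome))
        if reaction.get(stimulus) != outcome:
            reaction[stimulus] = outcome  # the system is self-modifying

    # Poland liberalizes and is not invaded; the learned "structure" shifts.
    observe("liberalize", "status_quo")
    assert act("liberalize") == "status_quo"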


This does not, however, mean that the situation is chaotic-a key advantage of LDMs over the other formulations. Most of the knowledge base, and the learning mechanism itself, have not changed;13 only the output has changed. The environment, the actors, and the parameters of their decision making are almost unmodified; the change is embodied in relatively simple and predictable modifications of knowledge and of the rules for interpreting that knowledge, often empirically available in the archival record. Realistically modeling this type of behavior is a difficult task, and the AI/IR system that comes closest to embodying it, unsurprisingly, is the complex JESSE simulation. But this objective underlies many of the chapters.

The Upshot: The Ideal Model

To summarize these points, the following is a list of the characteristics I believe most researchers would include in an ideal AI/IR model.

• The model would be consistent with the psychological literature on individual and group decision making; in particular, it would involve suboptimal reasoning, heuristics, anchoring, and other aspects of "artificial stupidity."

• Both the actions predicted by the model and the reasoning reflected in the determination of those actions would not be inconsistent with that found in actual political debate.

• The model would store knowledge in complex, qualitative data structures such as rules, scripts, frames, schemas, and sequences. Among that knowledge, though not necessarily dominant, would be precedents and cases.

• The model would learn from both successes and failures: Successful solutions would be likely to be reused; unsuccessful solutions or observations about changes in the environment would cause changes in the rules that are used to solve problems. This learning would be similar to that seen in actual political behavior.

• The overall model would provide a general engine for the study of international politics-in other words, it would work on a variety of problems. The specific knowledge required to implement the model, of course, would differ significantly with the problem.

None of the existing models incorporates all of these factors, and none is considered even close to a final solution to them. However, almost all of the chapters in this volume contribute to this agenda, and quite a number of the components have already been demonstrated.


AI and Contemporary Political Science

One of the most intriguing aspects of the AI/IR research is the breadth of the substantive foci of these studies. The topics dealt with in this volume include the Johnson administration's Vietnam policy, Dwight Eisenhower's Vietnam policy, Japanese energy security policy, development, U.S. policy in the Caribbean, international relations during the early Cold War period, the international behavior recorded in the BCOW and CREON data sets, the Senate Foreign Relations Committee on the Persian Gulf in the 1980s, the Soviet intervention in Hungary, and Luttwak's theory of coups d'etat. This list reads like the contents of any randomly chosen collection of articles on international relations, in distinct contrast to the arms race and world modeling literatures, which have focused on fairly narrow types of behavior.

By that same token, there is a strong empirical focus in virtually all of the studies. The research is grounded in the detailed study of actual political behavior, not in techniques borrowed from economics, computer science, or mathematics.

AI and Rational Choice

The most important alternative to the AI/IR approach is the rational choice approach, which presumes that political behavior, like economic behavior, can best be modeled by assuming individuals optimize their choices within a system of preferences and constraints, without any higher cognitive processes. RC uses, for the most part, expected utility decision making and game theory, and it is important both because of its dominant role in explaining domestic behavior and because of its increasing application as a theory of international behavior. 14 In addition to its role as "straw man of choice" in contemporary discourse in political science, RC provides an alternative to AI/IR in positing an explicit cognitive mechanism, in contrast to statistical studies and most dynamic models, which merely posit regularities. This debate between RC and cognitive psychology is not confined to IR but is found generally in the social sciences: Hogarth and Reder (1987) provide an excellent survey of the arguments; much of the KST agenda is directed toward falsifying RC assumptions; and Simon (1985) deals explicitly with these issues in terms of political science.

Although in general AI/IR is viewed as competition to RC-and this is the position taken by most authors who address themselves to the issue-the possibility remains that the two approaches are complementary: Problems appropriate for an RC framework might be inappropriate for AI and vice versa. Most of the discourse analysis and NLP articles would probably concur with this characterization, and NLP tends not to address the RC issue at all. Using AI in a situation that can be accurately described by optimization


using preferences and constraints would be using a sledgehammer to kill a fly: The problem of nuclear deterrence comes to mind. In relatively static situations of reasonably complete information, limited bureaucratic infighting, quantitative or strictly ordered payoffs, and repeated plays, the expected value framework may be completely appropriate. This assumption of complementarity is the position of many economists who are now using computational modeling: They maintain the basic RC concepts and vocabulary but use computational models to deal with phenomena such as memory and learning that confound existing mathematical techniques. If one can take the structuring of a situation as known and static-for example, the voting decision in a stable liberal democracy doesn't include the option of changing the rules of voting-then RC will predict behavior quite efficiently. However, when self-modification is an option, simple RC models aren't very useful.

Alternatively, the two approaches may be simply competitive; this view is implicit in most of the chapters in this volume that discuss RC. Fundamental to virtually all critiques of RC is criticism on empirical grounds: The "as if" approach, though possibly acceptable when formulated by Milton Friedman in 1953, is unacceptable when viewed in the light of thirty years of experimentation that has failed to validate those assumptions except in highly artificial settings. The AI/IR literature, with its base in experimental psychology and its heavy use of primary data, provides an alternative formulation of political rationality with a superior empirical grounding.

In evaluating this debate, note that the difference between AI/IR and RC does not lie in the assumption of preferences and goal-seeking behavior. The crucial difference is in the issue of optimization: RC assumes utility maximizers; AI/IR assumes individuals and organizations have inadequate information-processing abilities to optimize. In addition, AI/IR generally assumes a much more complicated and dynamic world than RC theories, which it can do because its models are computational rather than algebraic. For example, RC models usually assume that the consequences of actions are known (with a known degree of uncertainty), whereas AI/IR approaches may try to model the acquisition of that knowledge. AI/IR, following KST, postulates that the framing of a problem is a very important phase of finding the answer and, in fact, may well determine the answer. Following the Simon-Newell tradition, memory is not considered a constraint in either the cognitive theory or the computational model, so huge amounts of information in complex structures can be used. In this sense, AI/IR and RC are mirror images: AI assumes sophisticated memory but limited processing; RC assumes sophisticated processing (optimization) but limited memory.

On the down side, AI has far less mathematical closure than RC. RC's intellectual coherence stems in no small part from a series of mathematical
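
The mirror image can be made concrete with a deliberately crude sketch. All numbers, options, and the precedent store below are invented; the point is only the contrast between a memory-poor optimizer and a memory-rich, processing-poor matcher.

    # RC: limited memory, sophisticated processing - maximize E(U).
    options = {"blockade":   [(0.6, 40), (0.4, -20)],  # (probability, payoff)
               "air_strike": [(0.5, 60), (0.5, -70)]}

    def rc_choose(options):
        eu = lambda lottery: sum(p * u for p, u in lottery)
        return max(options, key=lambda o: eu(options[o]))

    # AI/IR: large memory of prior cases, crude similarity matching.
    precedents = [({"island", "superpower_rival"}, "blockade"),
                  ({"landlocked", "client_state"}, "air_strike")]

    def ai_choose(situation):
        _, action = max(precedents, key=lambda c: len(c[0] & situation))
        return action

    print(rc_choose(options))                         # 'blockade' (E(U): 16 vs. -5)
    print(ai_choose({"island", "superpower_rival"}))  # 'blockade', by precedent

The RC chooser carries almost no memory but computes an optimum; the AI/IR chooser computes almost nothing but consults a (here, tiny) store of prior cases.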


bottlenecks imposed by the available techniques. Common mathematical tools such as fixed-point theorems and results from game theory severely restrict the assumptions that can be made if a model is to be mathematically tractable. On the one hand, this gives RC an "assume the can opener" air;15 on the other hand, it provides a large body of shared assumptions, so that many new results are widely applicable. RC has a cumulativeness that is less evident in AI/IR. Because AI is algorithmic rather than algebraic, there are virtually no constraints on the complexity of the underlying argument; the techniques, in fact, tend to be data-limited rather than technique-limited, a problem shared with numerical simulations, as Mefford notes.

AI/IR and Data

The AI/IR approach is strongly empirical, another aspect that differentiates it from the RC tendency to refine theoretical concepts with little concern for empirical testing. From a sociology of science standpoint, the empirical focus of AI/IR is unsurprising given its dual roots in experimental psychology and the fact that most of the early researchers, from Alker onward, initially studied politics using behavioralist statistical methodologies. The AI/IR approach is developing, from the beginning, testable theories with the corrective feedback empirical studies provide.

The data used in AI/IR resemble those of historical/archival research more than those of the quantitative statistical traditions of the 1960s and 1970s. The majority of the chapters in this volume use primary text or interviews as their data source, and they also emphasize issues such as context, structure, and discourse. Even those studies not using text (for example, Hudson or Schrodt) utilize very detailed descriptive sequences with thousands of points, a distinct contrast to the annualized forty-year time series typical of contemporary statistical studies.

Almost all existing political research has concentrated on simple data structures, usually the rectangular case/variable data array. Even when more complicated structures were built-as, for example, in factor analysis or Guttman scaling-these were built within the constraints of linear models. However, it seems obvious that political behavior involves complex data structures such as rules and hierarchies, scripts and sequences, plans and agendas. The rational choice tradition has begun to work with some complex structures-for example, with agenda setting-but has a very limited empirical tradition. In contrast, most of the methods being developed in AI/IR are general: For example, the methods of analyzing informal rules in the context of bureaucracies or international regimes should be of substantial use to those studying formal institutions such as Congress or the evolution of international regimes.

The question of what will be tested remains open. In general there are two possible criteria: outcome validity and process validity. A model with
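
The difference between the two kinds of data is easy to see side by side. The records below are invented illustrations-the field names and codings correspond to no actual data set-but the first is the rectangular case/variable array of the statistical tradition, and the second is the kind of nested, ordered structure the AI/IR work requires.

    # Rectangular tradition: one row per case, fixed columns.
    rectangular = [
        {"dyad": "USA-CUB", "year": 1962, "conflict": 1, "trade": 0.0},
        {"dyad": "USA-CAN", "year": 1962, "conflict": 0, "trade": 9.3},
    ]

    # AI/IR tradition: an ordered event sequence with arbitrary nesting.
    crisis_script = {
        "name": "missile_crisis",
        "precondition": ["deployment_detected"],
        "sequence": [
            {"actor": "USA", "action": "blockade",
             "rationale": {"precedent": "Berlin_1948", "goal": "signal_resolve"}},
            {"actor": "USSR", "action": "withdraw",
             "rationale": {"concession": "missiles_in_Turkey"}},
        ],
    }

A linear model can consume the first directly; only rule matching, precedent retrieval, and sequence comparison can exploit the second.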


outcome validity would reproduce or predict behavior but make no presumption that the mechanisms through which that behavior was generated correspond to those in the international system. The chapters using standard IR data sets (e.g., Hudson and Schrodt) are most clearly in this camp. A stricter, and more useful, criterion is process validity: The mechanisms by which an outcome is reached should also correspond to those observed in the actual system. The models strongly based in archival and primary source material (e.g., JESSE; Sylvan, Milliken, and Majeski) are coming closer to achieving this. A problem in testing any model for process validity, however, is the arbitrariness and incompleteness of the empirical record: A model may produce behavior and rationales that are entirely plausible (as evaluated by experts) but that are not found in the actual historical record.

Conclusion

Despite the tendency toward paradigm proliferation in the IR literature-two articles (at most) seem to establish a new IR paradigm-the AI/IR literature has most of the characteristics of a classical Kuhnian paradigm, including a new set of questions, theories of data, and techniques. As a paradigm, AI/IR allows one to look at old data in new ways; one could argue that it also arose out of the failures of behavioralist modeling techniques to deal with the contextual complexity found in primary source material.

The projects reported in this volume are in various stages of completion. Although "trust me" assurances in research are quite rightly viewed with skepticism in a new and intensely hyped technique such as AI, most of this research has a large inductive component based in extensive primary source material rather than data sets from the ICPSR and justifiably proceeds slowly. The completed research, frequently of considerable complexity (e.g., Hudson, the JESSE simulation, Job and Johnson), is, I hope, a harbinger for the results of the projects still in progress (e.g., Ensign and Phillips). Its vocabulary is evolving and its critical concepts emergent, yet the quantity and diversity of the AI/IR research projects have already transcended those of most faddish modeling techniques. The research reported in this volume opens a variety of doors to further research, and the list of potentially interesting projects is far from exhausted.

Notes

1. Cimbala (1987) and Andriole and Hopple (1988) provide introductions oriented toward U.S. defense concerns, but these are only a narrow part of the field of IR generally.

2. This seemingly original joke seems to occur to almost everyone. . . .


3. For example, DARPA's 1983 timetable calls for the following developments by 1990: vision subsystems with "1 trillion Von-Neumann equivalent instructions per second"; speech subsystems operating at a speed of 500 MIPS that "can carry on conversation and actively help user form a plan [sic]," and "1,000 word continuous speech recognition." Each of these projected capabilities is 10 to 100 times greater than the actual capabilities available in 1990, despite massive investments by DARPA. For additional discussion, see Waldrop (1984); Bonasso (1988) provides a more sympathetic assessment.

4. A 10,000-rule expert system is unlikely to achieve ten times the performance of a 1,000-rule system. To the contrary, the 1,000-rule system will probably have 80-90 percent of the functionality of the large system; the additional 9,000 rules are devoted almost exclusively to the residual 10-20 percent of the cases.

5. Conceptually, this effort is similar to the "cognitive mapping" methodology pursued in Axelrod (1976); the "operational code" studies (e.g., George 1969; George and McKeown 1985) are other antecedents.

6. Neural networks are the primary exception to this characteristic and are a current research focus precisely because they offer an alternative to rule-based formulations.

7. For example, an action A would be taken in the expected utility formulation E(A) = p(100) + (1-p)(-50) > 0 if and only if p > 1/3; the dynamic model x_t = ax_{t-1} + b is stable if and only if |a| < 1; and so forth.

8. Formal models of simple learning are common in the psychological literature, but these are largely inappropriate to the organizational learning of complex, ill-defined tasks common to political settings.

9. Learning is another aspect acquired from the AI tradition. One of the key elements of LISP, the dominant AI programming language, is the absence of a distinction between program and data. As a consequence, programs could be "self-modifying," changing to adapt to the problem at hand. This is the essence of learning, and so models drawing on the AI tradition had a rich set of learning concepts at their core, whereas learning was difficult to add to RC or dynamic models.

10. Figuring eight-hour days and a sustained rate of 100 calculations per hour for the human; a very modest 1 MIPS for the computer.

11. This can occur for normative as well as pragmatic reasons: For example, Allison (1971:197) emphasizes the strong negative impact the Pearl Harbor analogy had on Robert Kennedy's assessment of the option of bombing Cuba during the Cuban Missile Crisis. "I now know how Tojo felt when he was planning Pearl Harbor," Kennedy wrote during an ExCom meeting, and he would later write "America's traditions and history would not permit . . . advocating a surprise attack by a very large nation against a very small one."

12. Unless there is a clear and obvious precedent: The future of Ferdinand Marcos in the Philippines was discussed in terms of Marcos as "another Somoza," referring


to the Nicaraguan dictator whose fall led to the establishment of the Sandinista regime. Usually, however, the search for precedent does not go very deep.

13. In contrast, consider the treatment of international crises found in Snyder and Diesing (1977), which uses an RC approach. For a given configuration of payoffs at any stage in a crisis, Snyder and Diesing can analyze, using game theoretic concepts, the likely behavior of the actors in the crisis (or, more likely, induce the payoffs from the behavior). But this approach does not provide a means of moving from one game matrix to the next: It predicts only a single decision, not a succession of decisions. An LDM approach, in contrast, would seek to model the changes in the payoffs as well as the decisions themselves. In a much simpler framework, the "sequential gaming" models of the RC tradition are also trying to attack this problem using algebraic methods.

14. Supporters of the rational choice approach frequently consider it to be as pervasive in IR as in domestic politics. Although it has dominated one problem-the counterfactual analysis of nuclear war and nuclear deterrence (see Brams 1985)-it is a relatively recent newcomer in the remainder of IR, dating mostly from the work of Bueno de Mesquita and his students, and more recently "crossovers" such as Ordeshook and Niou (see Ordeshook 1989). Even this work deals primarily with a single issue, war initiation. One need only compare the scope of the arms race bibliography of Anderton (1985), with 200+ entries, or the dynamic simulation literature (e.g., Guetzkow and Valdez, 1981) to see how relatively small the RC literature is in IR. Outside of IR, of course, RC is clearly the dominant formal paradigm in political science.

15. For the benefit of those few who do not understand this allusion, the underlying joke goes as follows: A physicist, a chemist, and an economist were stranded on a desert island. Exploring the island, they found a case of canned food but had nothing to open it with. They decided to share their expertise and, being academics, each gave a short lecture on how their discipline would approach the problem. The physicist began, "We should concentrate sufficient mechanical force to separate the lid from the can. . . ." The chemist began, "We should create a corrosive agent to dissolve the lid from the can. . . ." The economist began, "First, assume we possess a can opener. . . ."

Bibliography

Abelson, Robert P. 1973. "The Structure of Belief Systems," pp. 287-339. In R.C. Schank and K.M. Colby (eds.), Computer Models of Thought and Language. San Francisco: Freeman.
Alker, Hayward R., James Bennett, and Dwain Mefford. 1980. "Generalized Precedent Logics for Resolving Security Dilemmas." International Interactions 7: 165-200.
Alker, Hayward R., and C. Christensen. 1972. "From Causal Modeling to Artificial Intelligence: The Evolving of a UN Peace-Making Simulation," pp. 177-224. In J.A. LaPonce and P. Smoker (eds.), Experimentation and Simulation in Political Science. Toronto: University of Toronto Press.
Alker, Hayward R., and W. Greenberg. 1976. "On Simulating Collective Security Regime Alternatives," pp. 263-306. In M. Bonham and M. Shapiro (eds.), Thought and Action in Foreign Policy. Basel: Birkhauser Verlag.


Alker, Hayward R., and F. Sherman. 1982. "Collective Security-Seeking Practice Since 1945," pp. 113-45. In D. Frei (ed.), Managing International Crises. Beverly Hills: Sage.
Allison, Graham T. 1971. The Essence of Decision. Boston: Little, Brown.
Anderton, Charles H. 1985. "Arms Race Modelling: Categorization and Systematic Analysis." International Studies Association, Washington.
Andriole, Stephen J., and Gerald W. Hopple. 1988. Defense Applications of Artificial Intelligence. Lexington, MA: Lexington Books.
Axelrod, Robert, ed. 1976. Structure of Decision. Princeton: Princeton University Press.
Bonasso, R. Peter. 1988. "What AI Can Do for Battle Management: A Report of the First AAAI Workshop on AI Applications to Battle Management." AI Magazine 9,3: 77-83.
Bonham, G. Matthew, and Michael J. Shapiro. 1976. "Explanation of the Unexpected: The Syrian Intervention in Jordan in 1970," pp. 113-141. In Robert Axelrod (ed.), The Structure of Decision. Princeton: Princeton University Press.
Brams, Steven J. 1985. Superpower Games. New Haven: Yale University Press.
Carbonell, Jaime G. 1978. "POLITICS: Automated Ideological Reasoning." Cognitive Science 2: 27-51.
Cimbala, Stephen. 1987. Artificial Intelligence and National Security. Lexington, MA: Lexington Books.
Defense Advanced Research Projects Agency. 1983. "Strategic Computing: New Generation Computing Technology: A Strategic Plan for its Development and Application to Critical Problems in Defense." Memo, 28 October 1983.
Doyle, Jon. 1988. "Big Problems for Artificial Intelligence." AI Magazine 9,2: 19-22.
Dreyfus, Hubert L. 1979. What Computers Can't Do. New York: Harper and Row.
Etheredge, Lloyd S. 1985. Can Governments Learn? New York: Pergamon.
Feigenbaum, Edward A., and Pamela McCorduck. 1983. The Fifth Generation. Reading, MA: Addison-Wesley.
Feigenbaum, Edward A., Pamela McCorduck, and H. Penny Nii. 1988. The Rise of the Expert Company. New York: Times Books.
George, Alexander L. 1969. "The 'Operational Code': A Neglected Approach to the Study of Political Leaders and Decision-making." International Studies Quarterly 13: 190-222.
George, Alexander L., and Timothy J. McKeown. 1985. "Case Studies and Theories of Organizational Decision Making." Advances in Information Processing in Organizations 2: 21-58.
Guetzkow, Harold, and Joseph J. Valdez, eds. 1981. Simulated International Processes: Theories and Research in Global Modeling. Beverly Hills: Sage.
Hogarth, Robin M., and Melvin W. Reder, eds. 1987. Rational Choice: The Contrast Between Economics and Psychology. Chicago: University of Chicago Press.
Kahneman, Daniel, Paul Slovic, and Amos Tversky, eds. 1982. Judgment Under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
Katzenstein, Lawrence C. 1989. "PRC Relations with Hong Kong: An AI Model of Decision Making in Turbo Prolog." Northeastern Political Science Association, Philadelphia.


Kaw, Marita. 1989. "Predicting Soviet Military Intervention." Journal of Conflict Resolution 33,3: 402-429.
Klahr, Philip, and Donald A. Waterman. 1986. Expert Systems: Techniques, Tools and Applications. Reading, MA: Addison-Wesley.
Krasner, Stephen D., ed. 1983. International Regimes. Ithaca: Cornell University Press.
Laird, John E., Paul S. Rosenbloom, and Allen Newell. 1986. Universal Subgoaling and Chunking: The Automatic Generation and Learning of Goal Hierarchies. Norwell, MA: Kluwer.
Lane, Ruth. 1986. "Artificial Intelligence and the Political Construction of Reality: The Case of James E. Carter." American Political Science Association, Washington.
Leithauser, Brad. 1987. "The Space of One Breath." The New Yorker, 9 March 1987, pp. 41-73.
Mallery, John C. 1988. "Thinking About Foreign Policy: Finding an Appropriate Role for Artificially Intelligent Computers." International Studies Association, St. Louis.
Margolis, Howard. 1987. Patterns, Thinking and Cognition: A Theory of Judgment. Chicago: University of Chicago Press.
McCorduck, Pamela. 1979. Machines Who Think. San Francisco: Freeman.
Minsky, Marvin. 1986. The Society of Mind. New York: Simon and Schuster.
Neustadt, Richard E., and Ernest R. May. 1986. Thinking in Time: The Uses of History for Decision Makers. New York: Free Press.
Newell, Allen, and Herbert Simon. 1972. Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Ordeshook, Peter C., ed. 1989. Models of Strategic Choice. Ann Arbor: University of Michigan Press.
Pollack, Andrew. 1988. "Setbacks for Artificial Intelligence." New York Times, 4 March 1988.
Sanger, David E. 1985. "Smarter Machines Get Smarter." New York Times, 15 December 1985.
Schank, Roger C. 1987. "What is AI, Anyway?" AI Magazine 8: 59-65.
Schank, Roger C., and Christopher K. Riesbeck. 1981. Inside Computer Understanding. Hillsdale, NJ: Erlbaum Associates.
Simon, Herbert A. 1979. Models of Thought. New Haven: Yale University Press.
___. 1982. The Sciences of the Artificial. 2nd ed. Cambridge: MIT Press.
___. 1985. "Human Nature in Politics: The Dialogue of Psychology with Political Science." American Political Science Review 79,2: 293-304.
Snyder, Glenn H., and Paul Diesing. 1977. Conflict Among Nations. Princeton: Princeton University Press.
Sylvan, Donald A., and Steve Chan. 1984. Foreign Policy Decision Making: Perception, Cognition and Artificial Intelligence. New York: Praeger.
Tanaka, Akihiko. 1984. "China, China Watching and CHINA-WATCHER," pp. 310-344. In Donald A. Sylvan and Steve Chan, Foreign Policy Decision Making: Perception, Cognition, and Artificial Intelligence. New York: Praeger.
Thorson, Stuart, and Donald A. Sylvan. 1982. "Counterfactuals and the Cuban Missile Crisis." International Studies Quarterly 26: 537-71.


Waldrop, M. Mitchell. 1984. "Artificial Intelligence (I): Into the World." Science 223: 802-805 (24 February 1984).
___. 1988. "Toward a Unified Theory of Cognition." Science 241: 27-29 (1 July 1988).
Weizenbaum, Joseph. 1976. Computer Power and Human Reason. San Francisco: Freeman.

PART ONE

Conceptual Issues and Practical Concerns

2

Artificial Intelligence and Intuitive Foreign Policy Decision-Makers Viewed as Limited Information Processors: Some Conceptual Issues and Practical Concerns for the Future

Helen E. Purkitt

The application of artificial intelligence techniques and computational models to study intentional activity in international relations reflects a diversity of concepts, research foci, and methods. Underlying this diversity, however, are several important commonalities. One of these commonalities involves a shared recognition of the need to build models that are both computationally tractable and cognitively plausible. Thus, most computational modeling efforts seek to build upon what is already known about the role of cognitive processes of political choosers operating as participants within some larger context (i.e., small group, organizational, social, or cultural environment). Although many of these symbolic models do not seek to replicate the actual foreign policy decision-making process, most AI modelers claim that their underlying model is consistent with what is known about the cognitive processes of foreign policy decision-makers and of intuitive problem-solvers in general. 1

Despite recent interest among computational modelers in developing techniques capable of representing the complex and dynamic social, cultural, and linguistic bases of political cognition of multiple actors at various levels of analysis (see, for example, Alker, 1984; Bennett, 1989; Mallery, Hurwitz, and Duffy, 1987; and Alker et al., Banerjee, Bennett and Thorson, Boynton, and Sylvan, Majeski, and Milliken, all this volume), most AI modeling efforts

This research was supported by a grant from the U.S. Naval Academy Research Council. I would also like to thank James W. Dyson, my collaborator on a longer-term research project. Many of the ideas developed in this chapter reflect his prior contributions. However, any errors in this manuscript are mine.


continue to emphasize cognitive research and concepts related to how the content and structure of prior knowledge influence problem understanding and choice. This emphasis is understandable, given the intellectual roots of computational models currently in use in international relations, computer science, and cognitive psychology. In these fields the major effort centers on understanding and developing computational techniques that model how individuals store, represent, and access knowledge to understand current situations and problems.

Unfortunately, these three research traditions are difficult ones to reconcile, and this difficulty will present problems for AI researchers in the future. One problem, for example, is at the conceptual level. Conceptual difficulties persist in part because much of the past descriptive research on political decision making has not focused on the cognitive basis of choice except to suggest that people do not follow the dictates of rational choice models of decision making (Kegley, 1987; Powell, Purkitt, and Dyson, 1987; Schrodt, 1985). Thus, the common recognition of the shortcomings of the rational choice model evident among AI researchers complicates rather than simplifies AI modeling. (Once choice is described as rooted in more than one motive, modeling the cognitive process becomes considerably more difficult.) Although binary or multimotive choice models may be more flexible, and perhaps more accurate as well, how people process information to satisfy two or more values is difficult to model fully.

In sum, computational modelers using AI techniques share an interest with descriptive researchers focusing on how people make decisions under conditions of uncertainty: constructing a new process-based paradigm of foreign policy and international relations. 2 However, dissatisfaction with past research that has either taken a black-box approach to the foreign policy process or used "as if" reasoning to establish the plausibility of a rational choice paradigm does not facilitate model development. How, for example, Simon's (1958, 1959, 1976) satisficer thinks through a problem is highly varied and uncertain-one simply decides when one subjectively believes enough is known to make a choice. Thus, subjective decisions may vary across situations and choosers.

Although it is not possible to offer concrete theoretical solutions to many of the problems AI models in international relations must address, three major insights that have emerged over the past several decades on how people actually process information to solve problems intuitively seem to offer useful heuristic guidelines for future AI modeling efforts. These information-processing insights are important because they underscore the fact that there are patterned regularities in the way people structure and process incoming information to solve problems intuitively. Thus, cognitive variety in political problem solving may not be as open-ended as the satisficing perspective implies.
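
That underdetermination is visible even in a toy satisficing routine. The sketch below is a minimal illustration-the options, scores, and aspiration level are all invented-showing that with a purely subjective stopping rule, the choice depends on the order in which alternatives happen to be considered.

    # Satisficing: search stops at the first "good enough" option, so
    # different search orders yield different choices.
    def satisfice(options, evaluate, aspiration):
        for option in options:
            if evaluate(option) >= aspiration:
                return option
        return None  # nothing met the aspiration level; revise and retry

    subjective_value = {"negotiate": 6, "blockade": 7, "air_strike": 5}

    # Two choosers with identical values but different search orders:
    print(satisfice(["negotiate", "blockade"], subjective_value.get, 6))  # negotiate
    print(satisfice(["blockade", "negotiate"], subjective_value.get, 6))  # blockade

Two choosers with an identical aspiration level select different options simply because they searched in a different order, which is why satisficing alone underdetermines the modeled choice.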


There are several reasons for relying more on information-processing insights in future AI modeling efforts. One of the most important reasons is that past research on how people process information intuitively under conditions of uncertainty underscores the critical role of incoming information and other external stimulus cues as determinants of choice in a given situation. These results suggest there may be problems with models that rely primarily on a priori assumptions about "the primary mechanisms" used by intuitive processors. Given the primitive nature of our understanding of the determinants of human cognition, it may be more productive to use inductive-based approaches in constructing models to "fit" what is already known about actual decisional processes. There may be a danger at this early stage of becoming "locked in" on a particular type of model (i.e., knowledge systems based on predicate logic and justified in terms of the pervasive use of analogical reasoning by real-world foreign policy decision-makers), a model that does not conform with what is already known about how people actually think about and make foreign policy decisions. 3

Greater attention to the insights about how people actually structure and process information may also serve several heuristic functions for computational modelers. A fundamental issue to be addressed in constructing a computational model pertains to the appropriate level of analysis (i.e., individual, group, or aggregate). Although the answer depends on the purpose of the analysis, information-processing insights suggest the existence of important cross-level generalizations that can and should be incorporated into computational models designed to simulate some aspect of human cognition. At a minimum, greater familiarity with these information-processing insights should help sensitize modelers to some of the more difficult conceptual issues and practical concerns related to the role of external stimulus cues and cognition that need to be addressed in future efforts to simulate aspects of intentional behavior in international relations.

Basic Information-Processing Insights

Research on how people process information has tended to highlight the fact that political decision-makers act more like limited information processors than rational maximizers or incremental satisficers. This information-processing image of the "typical" political decision-maker is based on evidence on intuitive problem solving and decision making that has been accumulating for decades. Research on how people solve complex problems has tended to converge around a few basic generalizations about human cognition. The most important of these generalizations revolve around three related ideas: (1) People can only process a limited amount of information without experiencing information overload. The main effect of increasing information tends to be an increase in subjective confidence in the adequacy of one's


chosen solution. Because decision-makers under conditions of uncertainty routinely seek increased amounts of information, a pervasive characteristic of intuitive decision making is an unwarranted faith in the adequacy of intuitively based problem solutions. This faith has been termed "cognitive conceit."4 (2) People use heuristics, or simplifying mental rules of thumb, as cognitive aids during all stages of intuitive decision making. (3) There are recurring patterns in the way people structure and process information (i.e., both their logic and the type of information they stress is patterned).

When considering these three generalizations, we should bear in mind that the specific components of the information-processing routines used at any particular point in the process will be determined by both internal cues (related to the organization and structure of prior experiences and knowledge in long-term memory) and external cues (e.g., the amount and type of information and the key aspects of incoming information perceived to be relevant).

Intuitive Problem Solving

On the basis of information-processing research we may describe intuitive problem solving as highly "adaptable." Evidence from experimental research indicates that individuals adapt to the perceived demands of the immediate task at hand. Thus, in a laboratory setting it is rather easy to induce dramatic preference reversals and to demonstrate that people do not follow the dictates of normative models of choice. Seemingly minor variations in the amount and type of information made available to experimental subjects, or minor changes in the way an experimental task is presented, can profoundly affect all subsequent stages of intuitive problem solving (see, for example, Carroll, 1980; Carroll and Payne, 1976; Kahneman, Slovic, and Tversky, 1982; Payne, 1980; Slovic, Fischhoff, and Lichtenstein, 1977, 1984; Slovic, Lichtenstein, and Fischhoff, 1988).

Thus, the highly adaptable nature of intuitive decision making and the use of intuitive heuristics to understand and to solve complex problems often lead to fundamental errors or biases in the processing of information. The evidence to date suggests that the overutilization of simple intuitive heuristics and the underutilization of formal logical or statistical rules cause errors and biases at each of five decision-making stages (Nisbett and Wilson, 1977; Nisbett and Ross, 1980; Pitz, 1977, 1980; Pitz and Sachs, 1984):

1. describing and coding information during the initial stage of problem identification,

2. explaining important causal dimensions and making intuitive predictions during the initial stage of problem identification,


3. explaining important causal dimensions and making intuitive predictions during the problem-evaluation stage,

4. choosing among alternatives during the decision stage, and

5. evaluating, learning, integrating, and adapting to feedback information on the efficacy, adequacy, and accuracy of prior behaviors and actions.

The essence of this line of research has been aptly stated by Nisbett and Ross (1980: 12):

In ordinary social experience people often look for the wrong data, often see the wrong data, often retain the wrong data, often weight the data improperly, often fail to ask the correct questions of the data and often make the wrong inferences on the basis of their understanding of the data. With so many errors on the cognitive side, it is often redundant and unparsimonious to look for motivational errors. We argue that many phenomena generally regarded as motivation (for example, self-serving perceptions and attributions, ethnocentric beliefs, and many types of human conflicts) can be understood better as products of relatively pervasive information processing errors than of deep-seated motivational forces.

Recognition that intuitive problem solving is highly adaptable to perceptions of the nature of the immediate task, and that human inferences are often subject to systematic errors and biases when compared to normative models of choice, helps us to appreciate why it is difficult to model how people will decide across a variety of political situations. To overcome this difficulty we need to find the recurring patterns in the way people structure information and, next, to use these patterned regularities to gain a reasonably accurate description and perhaps also a formal representation of the problem-solving logic used by intuitive decision-makers.

Some of the requisite descriptive research has already been completed. This research has indicated that when confronted with a new but complex problem, intuitive problem-solvers seem to act like naive scientists; that both novice and experienced political problem-solvers first try to gain a crude sense of the problem and then develop a response; and that how the problem is initially understood (framed) conditions all subsequent stages of problem solving. 5 Initial framing seems to be accomplished by evaluating and applying available information judged to be highly plausible, salient, and valid by intuitive decision-makers (Nisbett, Borgida, Crandall, and Reed, 1976; Nisbett and Ross, 1980; Pitz, 1980; Ross, 1977; Kahneman, Slovic, and Tversky, 1982), by seeking additional information (Einhorn and Hogarth, 1978, 1980, 1981, 1985; Slovic, Fischhoff, and Lichtenstein, 1977; Slovic and MacPhillamy, 1974), and by searching memory for relevant precedents, analogies, or clues


to aid in the categorization of the current problem and to structure a response (Mefford, this volume).

Cognitive Limitations

Generally speaking, the power and complexity of human cognition are tied to the almost unlimited capacity of humans to store and combine vast amounts of information in long-term or associative memory. Presently, researchers do not fully grasp the structure of human knowledge and the processes used to access and to modify prior knowledge. Conceptual issues related to the representation of human memory and the mechanisms used to access and to modify human knowledge are fundamental, unresolved puzzles in cognitive psychology, AI, and computational modeling in international relations.

An important aspect of human thinking is that humans operate under severe cognitive constraints because of the limited capacity of working or short-term memory. Miller's (1956) suggestion that people can only retain seven (plus or minus two) pieces of information in working memory has been confirmed by memory-tracing research (Hayes, 1981; Payne, 1980; Newell and Simon, 1972). Research has also demonstrated that the active processing of information is a serial process within the limited capacity of working memory (Payne, 1980: 95).

In moving from the level of pieces of information to the level of factors or indicators, it is now clear that individuals can only systematically process information on a few (probably two or three) factors without explicit cognitive aids (e.g., algorithms). Past information-processing research suggests that the number of dimensions a person can systematically process at any one time is extremely limited: It may be no more than one and is certainly not more than five (Dyson, Godwin, and Hazlewood, 1974; Dyson and Purkitt, 1986a; Heuer, 1978; Shepard, 1978). The exact number of dimensions employed appears to vary, depending more on such factors as the nature of the immediate task at hand and the amount and type of information presented than on variations in the prior knowledge (expertise) or experiences of individual choosers.

The experimental literature has repeatedly documented this pervasive tendency of people to use a limited (one to five) number of dimensions in processing information, and the critical role of information and task characteristics, across a variety of problems and task settings. We can perhaps illustrate this tendency and the influence of task characteristics and information with the findings from one line of experimental research on how people's perceptions of risk vary with different types of information and task settings. Slovic and Lichtenstein (1968) and Payne (1975) found that in a gambling experiment that presented risk in terms


of four dimensions, subjects tended to perceive the probability of losing as the most important dimension. Slovic, Fischhoff, and Lichtenstein (1983) and Johnson and Tversky (1983), in a series of related risk experiments, found that two or three factors accounted for over 80 percent of the variance in the dimensions of risk used by subjects. Collectively, these and other related experiments indicate that the structure of decision making across specific problem domains evidences a remarkable degree of stability in terms of the number of analytical dimensions used. People generally do not seem to be able to analyze a large amount of information on the basis of complex multidimensional intuitive analysis. Instead, we find that as information increases, people appear to shift dimensions, discarding earlier information, even initially valued analytical dimensions, without being conscious that they are in fact employing a switching strategy during analysis. 6

A number of theoretical frameworks have been offered as fundamental heuristics for studying foreign policy decision making. Two of the more prevalent ones are Holsti's (1962) belief system approach and George's (1969, 1980) operational code framework. Researchers using these perspectives often assumed a highly complex underlying cognitive structure. Although the results from empirical research using these a priori models of cognition confirm that individuals do use prior beliefs, operational codes, and schema, most studies suggest that the structure of these cognitions is rather simple. Holsti's (1962) study of John Foster Dulles concluded that a few belief dimensions were adequate for describing Dulles's "image" of the Soviets over time. Empirical research using George's operational code framework rarely focuses on more than four, and often as few as one, element of the code to understand changes in U.S. foreign policy (Lampton, 1973; Rosati, 1984; Walker, 1977). Thus, the fundamental insight that people only use a few dimensions of a problem in forging solutions has filtered into the literature on foreign policy decision making.

The limitation on the number of dimensions used in decision making may operate at higher levels of organized cognitive activity. For example, Sylvan, Majeski, and Milliken (this volume) note in reviewing the record concerning various phases of the Vietnam ground war during the 1964-1965 period that, as a general rule, at any given time the president and his top advisers simply did not consider more than three principal policy lines. 7 Although they speculate that this "rule of three" may be tied to either cognitive limitations or structural and functional interorganizational distinctions, the experimental evidence to date suggests that cognitive limitations are an important aspect of the explanation.

In a recent reexamination of U.S. decision making during the Cuban Missile Crisis, Sylvan and Thorson (1989) found similar evidence that political decision-makers focus on a limited number of dimensions of a problem in


forging political solutions intuitively. In the original JFK/Cuba production system, Thorson and Sylvan (1982) posited that a model representing knowledge as semantic kernels would be adequate. Their model was based on archival materials and memoir literature available in the early 1980s indicating that U.S. decision-makers considered six separate options (i.e., blockade, air strike, etc.) during the crisis. This interpretation of the decision-making process assumed that the blockade option, for example, emerged in a sequential fashion as a separate option. However, the more detailed transcripts of actual ExCom proceedings now available indicate that U.S. decision-makers did not focus on a number of options in a sequential fashion but instead quickly limited their focus to a few alternative option packages. Thus, the blockade option surfaced as an early follow-up action within a large set of possible military actions. Consequently, in reexamining the content validity of their original assumptions in light of recently released transcripts and other heretofore classified documentation of ExCom deliberations, Sylvan and Thorson (1989) concluded that their original model did not adequately fit the actual process JFK and his advisers followed in processing information and structuring a few "packages of alternatives" involving multiple, simultaneous actions. 8

Basically, the historical record in this particular case supports a large amount of experimental evidence indicating that intuitive problem-solvers usually focus on a few aspects of a problem in their quest to find a problem solution and quickly narrow their consideration of alternative solution paths to a limited number of "option" packages. In fact, a careful reading of the experimental evidence on how people actually process information suggests that intuitive political problem-solvers will usually structure their discussion of problem solutions into a series of binary choices (Dyson and Purkitt, 1986a).

From this information-processing insight, practical implications can be derived for future efforts to construct computational models. One implication is to underscore the difficulties involved in attempting to construct computational models of decision processes from retrospective accounts (i.e., memoirs and secondhand accounts) of actual decisional processes. There is usually a large gap between accounts of the deliberation and choice processes of small-group decision making or policy proceedings that are based on secondary sources and retrospective accounts and the actual historical record (i.e., video tapes in laboratory settings or adequate written records of significant small-group deliberations and choices within an organizational setting). Apparently, individuals tend to embellish what took place in their meetings. In view of the limited amount of past research on the actual cognitive processes and information-processing routines used by foreign policy decision-makers, this gap raises some serious questions about the ability of production systems to adequately represent knowledge


in "cognitively plausible" ways in the absence of a rich record of actual group proceedings. This observation in turn seems to highlight the need for a greater emphasis on developing highly flexible, conceptually neutral, inductive-based AI methodologies (see Alker et al., Boynton, and Sylvan, Majeski, and Milliken, this volume). The Use of Heuristics and Resulting Mental Biases

The Use of Heuristics and Resulting Mental Biases

To cope with limited cognitive capabilities, individuals selectively process information and use a limited number of heuristics, or mental rules of thumb, as cognitive aids in their effort to manage information. This apparently universal reliance on intuitive heuristics to solve all types of problems seems to be due to the need to compensate for the limitations of short-term memory and information-processing capabilities. By using intuitive mental heuristics, people can develop a problem attack plan that permits them to reach a subjectively acceptable problem solution. Simon's description of political decision making as bounded rationality involving a satisficing process (1958, 1959), Wildavsky's (1964) description of the incremental chooser, and Lindblom's (1959) model of muddling through all represent efforts to describe the intuitive metaheuristics frequently relied on by political decision-makers.

An intuitively grounded heuristic used across a wide variety of questions relies on past problem-solving experiences; a vague sense of the knowns and unknowns of a problem (i.e., topic/issue/question); a sense of what sort of data, conditions, and situations are relevant to the problem; and a subjective estimate of the consequences that follow from alternative courses of action. As a result of this problem-solving approach, political decision-makers are hindered by an unwieldy problem attack plan that highlights a feedback loop between proposed options and inferred consequences. Such a plan lacks an objective or systematic means of evaluation because of its essential subjectivity. Thus, lessons of history or learning from past experiences generally are improperly drawn (Anderson, 1981; George, 1980; May, 1973). Instead, adaptation appears to be largely a function of trial-and-error learning done on a case-by-case basis (Mefford, 1987 and this volume).

Four findings from social psychological research on the tendency of humans to use simplifying heuristics appear to be very relevant to AI-based approaches in international relations. First, the results of experimental research on intuitive decision making within the highly decomposed environment of a laboratory, which focuses on how people strive to solve highly constrained problems within fixed time periods, suggest that the specific heuristics used during problem understanding and subsequent stages of decision making are heavily influenced by the amount and type of information made available to subjects and by perceptions of the nature of the task at hand (see,


for example, Carroll and Payne, 1976; Dawes, 1988; Einhorn and Hogarth, 1978, 1985; Kogan and Wallach, 1967; Slovic, Lichtenstein, and Fischhoff, 1988; Payne, 1980; Wallsten, 1980). These findings suggest that AI modelers will need extremely rich data sources, including information about key interaction processes and informational inputs, in order to build "cognitively plausible" models that approximate actual foreign policy decisional processes (see, for example, Mefford and Sylvan, Majeski, and Milliken, this volume).

Second, framing research has documented quite conclusively the subtle power of initial information to manipulate the way people think about a problem. How a question is framed may affect all stages of intuitive decision making for both laypersons and experts (Slovic, Fischhoff, and Lichtenstein, 1984; Kahneman, Slovic, and Tversky, 1982). For example, framing research indicates that people are more sensitive to negative consequences than to positive ones and are more worried about losing money than they are eager to gain the same amounts of money. Experiments in this area (Slovic, Lichtenstein, and Fischhoff, 1988; Tversky and Kahneman, 1983; Quattrone and Tversky, 1983) have also found that people are more willing to accept negative outcomes if they are considered as "costs" rather than "losses" and tend to overvalue the complete elimination of a lesser hazard while undervaluing the reduction of a greater hazard. These findings suggest the importance of combining AI computational techniques with experimentation in order to learn more about the determinants of framing, particularly in the context of small-group decision making (see, for example, Hurwitz, 1988).

A third relevant result for AI-based research concerns the experimental evidence showing that people do not use heuristics in a consistent or integrated fashion. Rather than integrate information in a coherent way, people tend to shift dimensions as more information becomes available, without tying the old information to the new (Estes, 1978; Einhorn and Hogarth, 1978, 1981; Pitz, 1977). Thus, social psychological experimental evidence on the use of heuristics underscores the crucial role played by information cues and situational factors, in addition to the structure of prior experiences, beliefs, and knowledge bases, in shaping the initial problem understanding and thereby subsequent decision making. This line of research suggests the importance of developing computational methods capable of modeling inductive inference processes and trial-and-error learning, both within the context of small groups and across actors at different levels of abstraction (see Mallery, 1987, 1988; Schrodt, this volume). 9

Fourth, computational modelers might also consider that a variety of perspectives have been proposed to explain the pervasive tendency of people to use simple and often widely shared perspectives to explain behavior, both in experiments and in the world generally: Kahneman, Slovic, and Tversky (1982) posit that people use a few metaheuristics such as representativeness,


Although this proliferation of theories might suggest diversity in the sources of people's intuitive attributions, the theories actually share a common focus: what people actually do in introspective analyses to establish causal connections is to rely upon relatively simple judgments about the extent to which some input is representative, plausible, or consistent with some output.

Modeling External Cues

The specific stimulus cues used for making attributional linkages seem to depend upon both the number and types of external stimulus cues available and the results of individuals' searches through easily recalled stored connotative relations in memory (Nisbett, Borgida, Crandall, and Reed, 1976; Pitz, 1977, 1980). A major task still facing AI researchers is to determine how to model the importance of external stimulus cues. In considering this problem, computational modelers might find it useful to review the evidence suggesting that perceptions of the level of uncertainty associated with the immediate task at hand play a critical role in structuring the problem in working memory. For example, Mitchell, Russo, and Pennington (1989) discuss evidence suggesting that the amount of uncertainty associated with outcomes, rather than temporal sequence, has the greatest impact on the nature of explanations used for events. These ideas may relate to studies reported by Tversky and associates. Tversky (1977), Tversky and Heath (1989), and Tversky and Kahneman (1983) present evidence indicating that people find prediction easier than explanation. What this pattern may indicate is that predictive tasks (which seem like games of chance) are less cognitively demanding than explanatory tasks (which seem like games of skill). Thus, the perceived nature of the task may be a confounding factor in the effort to model human thought.

In contrast to the focus on the role of external cues and perceptions of the immediate task in research on intuitive judgment and decision making under uncertainty, most AI research has been concerned primarily with ways to model internal stimuli and prior knowledge. Mefford captured the current emphasis on the posited central role of precedent-based analogical reasoning, in which relevant political analogies structure the political thought and subsequent actions of a cognitive agent, when he explained that


we focus on how key actors manage to "construct realities," that is, to select out what is critical in a situation (including evidence of threat or opportunity) and to formulate appropriate courses of action. We argue that much of the cognitive work involved in interpreting situations essentially entails posing and reworking historical analogies. In short, a real or hypothetical situation-in our case a crisis-is understood against the backdrop of selected past incidents (1987: 222).

However, there is a problem with the approach Mefford outlines. Analogical reasoning is clearly a primary cognitive mechanism, but AI researchers encounter difficulties capturing the role of analogies within simple a priori coding categories. As Sylvan, Majeski, and Milliken note, analogies "do not come prepackaged with a whole series of features that can be plucked like arrows from a quiver. . . . Instead, we find that analogical features are adduced in the process of argumentation" (this volume). Their finding that high-level bureaucrats involved in U.S. decision making about Vietnam rarely produced more than one or two analogies seems to "fit" with a general perspective of political decision-makers as limited information processors.

The problems associated to date with the use of a priori coding schemes seem to be moving increasing numbers of computational modelers toward more inductive approaches that build upon what is already known about human information processing. In his AI research, Schrodt uses insights about the limitations of working memory and the prevalence of pattern recognition and precedent-based logic in accessing long-term memory to illustrate an important isomorphic similarity between human and machine information processing. For Schrodt, this provides cognitive reasons why AI methods based on pattern recognition and a precedent logic approach will be more intuitively pleasing to potential users than axiomatic approaches (1985, 1987, this volume; see also Mefford, this volume).

Boynton's work (this volume, 1988, 1989; Boynton and Kim, 1987) is relevant to our thesis that by blending AI and information-processing research more progress may be made in model development. Boynton starts with the assumption (supported by extensive past experimental research) that there are consistent patterns in the way senators serving on the Senate Agricultural Committee, the Senate Armed Services Committee, the Senate Environmental Protection Subcommittee, and the Senate Foreign Relations Committee frame the issues facing them, in the procedures they use for assessing recommendations made to them, and in the legislative provisions used and reused as new circumstances arise (1988, 1989, this volume). By carefully reconstructing the details of the Senate Foreign Relations Committee members' deliberations in the form of a "reconstructed narrative," Boynton is collecting the type of detailed data we feel is necessary to adequately model political decision-making processes in cognitively plausible ways.
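
The core mechanism of the precedent logic just described can be conveyed in a few lines of code. The sketch below, written in Python purely for illustration, compares a current event sequence against a small "long-term memory" of stored precedents and lets the best match drive the expectation. The event codes, the precedent library, and the similarity measure are all hypothetical inventions for this example; they are not Schrodt's actual data or algorithm.

    # Toy precedent-based matcher; the precedents and event codes are
    # invented for illustration.
    def overlap(seq_a, seq_b):
        """Crude similarity: number of positions where two sequences agree."""
        return sum(1 for a, b in zip(seq_a, seq_b) if a == b)

    PRECEDENTS = [  # past event sequences paired with what followed them
        (("protest", "crackdown", "strike"), "escalation"),
        (("protest", "negotiation", "accord"), "accommodation"),
        (("border clash", "mobilization", "ultimatum"), "war"),
    ]

    def predict(current):
        """Return the outcome of the single best-matching precedent."""
        best_sequence, outcome = max(PRECEDENTS,
                                     key=lambda p: overlap(p[0], current))
        return outcome

    print(predict(("protest", "crackdown", "mutiny")))  # -> escalation

Nothing in this sketch is axiomatic: the "inference" is simply recognition of the stored case that most resembles the present situation, which is why such methods may seem more intuitively plausible to users than deductive alternatives.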


Instead of immediately building an AI model based on off-the-shelf methods and a large number of a priori assumptions, Boynton is proceeding inductively to develop a text-base file that includes information on the events in the historical record that are highly salient to this particular actor (the Senate Foreign Relations Committee), while also retaining critical contextual and situational information. In addition, Boynton's approach relies on the type of highly detailed interaction data that may be necessary to build a "cognitively plausible" model of information processing and intentional inference in a political context.

Finally, Boynton's descriptive analysis seems to provide empirical support for the information-processing proposition that political decision-makers use only a limited number of dimensions (i.e., beliefs, problem factors, policy options, analogies, causal mechanisms, etc.) in intuitively developing a representation of a problem and a proposed solution. He finds (this volume) that there were only three times during the question-and-answer period when congressional committee members or witnesses strung together a longer set of "interpretive triples" (plausible pairwise connections) to establish causal explanations for unexplained events, to construct a historical narrative, or to ground counterfactual arguments and predictions during committee deliberations. This finding is consistent both with past experimental research and with an information-processing perspective on intuitive decision making, and, if it is replicated in other politically relevant conversational systems, it may provide some useful practical guidance for constructing future AI models.
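
The notion of an "interpretive triple" can be made concrete with a small illustration. In the sketch below (Python, for concreteness), a triple is simply a subject-relation-object tuple, and a causal explanation is a short chain of such pairwise connections; the political content of the triples is invented for the example and is not drawn from Boynton's transcripts.

    # Hypothetical interpretive triples: plausible pairwise connections.
    TRIPLES = [
        ("arms shipments", "strengthen", "the insurgents"),
        ("the insurgents", "threaten", "the government"),
        ("the government", "requests", "U.S. assistance"),
    ]

    def chain(start, triples):
        """Follow pairwise links from a starting term to build a short
        causal narrative of the kind found in committee argument."""
        story, current = [], start
        for subject, relation, obj in triples:
            if subject == current:
                story.append(f"{subject} {relation} {obj}")
                current = obj
        return "; hence ".join(story)

    print(chain("arms shipments", TRIPLES))
    # -> arms shipments strengthen the insurgents; hence the insurgents
    #    threaten the government; hence the government requests U.S. assistance

The point of the illustration is the brevity of the chain: if committee members rarely string together more than a few such links, a cognitively plausible model of their reasoning can afford to keep its inferential chains correspondingly short.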

Conclusion

By combining descriptive research on how intuitive decision-makers actually decide with AI techniques for representing how prior knowledge and incoming information interact, progress in more accurately modeling political decision processes may follow. But an essential aspect of intuitive decision processes will confound these modeling efforts, to wit: Important differences in task environments seem to influence political decision making. As Boynton demonstrated, the Senate Agriculture Committee had a well-developed perspective on agricultural policy, and its members used this shared perspective in developing policy changes, whereas the Senate Armed Services Committee lacked such a perspective and encountered greater difficulty in formulating problem responses (Boynton, 1988, this volume).

Recognition of the pervasive use of mental heuristics is central to AI-based research and forms an important rationale for the use of precedent-based models in international relations; it is also an important dimension in information-processing research. These two approaches to the study of choosing share a number of common interests. However, if individuals are highly task-specific in the particular heuristics they employ, as information-processing research suggests, then computational modelers will face increased difficulties.


Although there are recurring patterns in the structure of how people process information (in terms of the amount and types of information they stress and the logic they employ), the specific information-processing routines applied will vary by situation and by decision-maker. Past research on actual intuitive inference processes suggests that a number of external cues also influence the choice process. We have tried to show in a number of ways how this confounds AI modeling efforts without dooming them, since political decision-makers process only a limited number of dimensions (i.e., beliefs, problem factors, policy options, analogies, or solutions) of a problem within the confines of working memory. In fact, more attention to the patterned regularities in the ways people process information may help in developing computational techniques capable of modeling inductive inference processes.

Notes

1. The question whether success in developing machine intelligence is fully dependent on understanding the substance and principles of how humans acquire and use knowledge is a controversial subject in the general artificial intelligence literature, yet there is a remarkable consensus among artificial intelligence modelers in international relations on the need to develop "cognitively plausible" models. This position means that AI modelers in international relations must consider a host of issues related to the process validity of their models. See Mallery (1988) and Thorson and Andersen (1987) for a discussion of validity issues that must be addressed.

2. Relevant past descriptive research on decision making under conditions of uncertainty includes early research on human problem solving (i.e., Newell and Simon, 1972; Simon and Hayes, 1976), experimental studies in social psychology (see, for example, Carroll and Payne, 1976; Dawes, 1988; Einhorn and Hogarth, 1978; Estes, 1978; Janis and Mann, 1977; Kahneman, Slovic, and Tversky, 1982; Pitz, 1980; Pitz and Sachs, 1984; Rohrmann, Beach, Vlek, and Watson, 1989; Slovic, Lichtenstein, and Fischhoff, 1988), and experimental studies in political science. The relevant experimental research in politics is reviewed in Dyson and Purkitt, 1986a, 1986b; Kirkpatrick et al., 1976; see also Bennett, 1981; Davis, 1978; Dyson, Godwin, and Hazlewood, 1974; Purkitt and Dyson, 1988; Tetlock and McGuire, 1986. Thorson and Andersen (1987) provide a useful introduction to the relevant past descriptive decision-making research in social psychology for artificial intelligence applications.

3. Precedent-based logic underlies most recent AI models developed in international relations (see Mallery, 1988; Mefford, this volume). Efforts to develop computational techniques capable of representing inductive inferences and trial-and-error learning are only now being developed. See, for example, Boynton and Schrodt (this volume). See also recent applications using the RELATUS system (Mallery, 1987, 1988; Mallery, Hurwitz, and Duffy, 1987).

4. For evidence of the pervasiveness of cognitive conceit and the general tendency for increasing amounts of information to lead to increases in subjective confidence
without a concomitant increase in the quality of intuitive decision making, see Einhorn and Hogarth, 1985; Kogan and Wallach, 1967; Heuer, 1978; and Shanteau, 1989.

5. Although there are important differences in the way prior knowledge is structured and stored in the associative memory of novices and experts, the experimental evidence suggests that both novices and experts are highly ineffective at intuitive judgmental tasks under conditions of uncertainty. See Shanteau (1987, 1989) and Purkitt and Dyson (1987) for a review of past research on how experts and novices process information intuitively. See Anderson (1988) and Boynton (1989) for discussions of some possible ways that findings about expert-novice differences in experimental research may be related to our understanding of the policy process.

6. For evidence that people use a multistage process to interpret information and employ a number of different decision rules in their efforts to solve complex problems, see Tversky and Kahneman (1979, 1986), Montgomery and Svenson (1976), Park (1978), Wallsten (1980), and Dahlstrand and Montgomery (1984). See Gallhofer, Saris, and Schellekens (1988) for a discussion of nonexperimental evidence of the same sort of patterning in the deliberations of "real-world" foreign policy decision-makers.

7. For evidence from another computer-based study of political decision making during the Vietnam War that was based on textual materials and found relatively simple structuring in terms of decisional activity, see Andrus and Powell (1975).

8. See Purkitt and Dyson (1989) and Purkitt (1990, forthcoming) for a detailed analysis of the ExCom deliberations during the Cuban Missile Crisis using an information-processing perspective. Several other analyses have also concluded that U.S. decision making during this crisis departed markedly from the descriptions offered by participants in their memoirs. See, for example, Anderson (1983).

9. Recent computational models developed by Alker and his associates illustrate one potentially promising approach because the RELATUS system is a highly flexible system that can be used to code verbal data and to model organizations as conversational processing systems. See Alker (this volume); Bennett (1989); Mallery (1988); Mallery, Hurwitz, and Duffy (1987); and Mefford (this volume) for some artificial intelligence applications of modeling organizations as conversational processing systems.

Bibliography

Abelson, Robert P. 1981. "The psychological status of the script concept," American Psychologist, vol. 36, pp. 715-29.
Alker, Hayward R., Jr. 1984. "Historical argumentation and statistical inference: towards more appropriate logics for historical research," Historical Methods, vol. 17(3), pp. 164-173.
Anderson, Paul A. 1981. "Justification and precedents as constraints in foreign policy decisionmaking," American Journal of Political Science, vol. 25, pp. 738-61.
___. 1983. "Decision making by objection and the Cuban missile crisis," Administrative Science Quarterly, vol. 28, pp. 201-22.


___. 1988. "Novices, experts and advisers," paper presented at the annual meeting of the International Studies Association, March 29-April 1, St. Louis, Missouri.
Andrus, David, and Powell, Charles A. 1975. "Thematic content analysis of decisionmaking processes: the Pentagon Papers," paper presented at the International Studies Association West Annual Conference, San Francisco.
Bennett, James P. 1989. "The ABM and INF Treaties as computational rules of superpower competition, or an essay on inhuman understanding," paper presented at the annual meeting of the International Studies Association, 28-31 April, London.
Bennett, W.L. 1981. "Perception and cognition: an information processing framework for politics," in S.L. Long (ed.), Handbook of Political Behavior, vol. 1. New York: Plenum, pp. 69-193.
Boynton, G.R. 1988. "Micro managing in foreign policy and agricultural policy: communication and cognition in policy making," paper presented at the 1988 meeting of the International Studies Association.
___. 1989. "Language and understanding in conversations and politics," paper presented at the annual meeting of the Midwest Political Science Association.
Boynton, G.R., and Kim, C.L. 1987. "Political representation as information processing and problem solving," paper presented at the 1987 meeting of the Southern Political Science Association.
Carroll, J.S. 1980. "Analyzing decision behavior: the magician's audience," in T.S. Wallsten (ed.), Cognitive Processes in Choice and Decision Behavior. Hillsdale, NJ: Lawrence Erlbaum.
Carroll, J.S., and Payne, J.W. (eds.) 1976. Cognition and Social Behavior. Hillsdale, NJ: Lawrence Erlbaum.
Chapman, L.J., and Chapman, J.P. 1969. "Illusory correlation as an obstacle to the use of valid psychodiagnostic signs," Journal of Abnormal Psychology, vol. 74, pp. 271-280.
Dahlstrand, U., and Montgomery, H. 1984. "Information search and evaluation processes in decision making: a computer based process tracing study," Acta Psychologica, vol. 56, pp. 113-123.
Davis, D.F. 1978. "Search behavior of small decision making groups: an information processing perspective," in R.T. Golembiewski (ed.), The Small Group in Political Science. Athens, Ga.: University of Georgia Press.
Dawes, Robyn. 1988. Rational Choice in an Uncertain World. San Diego, Ca.: Harcourt Brace Jovanovich.
Dyson, James W., Godwin, H.B., and Hazlewood, L. 1974. "Group composition, leadership orientation and decisional outcomes," Small Group Behavior, vol. 1, pp. 114-28.
Dyson, James W., and Purkitt, Helen E. 1986a. "An experimental study of cognitive processes and information in political problem solving," final report to the National Science Foundation, Florida State University and U.S. Naval Academy.
___. 1986b. "Review of experimental small group research," in S. Long (ed.), Political Behavior Annual, vol. 1. Boulder, Colo.: Westview, pp. 71-101.
Einhorn, H.J., and Hogarth, R.M. 1978. "Confidence in judgement: persistence of the illusion of validity," Psychological Review, vol. 85, pp. 395-416.


___. 1980. "Learning from experience and suboptimal rules in decision making," in T.S. Wallsten (ed.), Cognitive Processes in Choice and Decision Behavior. Hillsdale, NJ: Lawrence Erlbaum.
___. 1981. "Behavioral decision theory: processes of judgement and choice," Annual Review of Psychology, vol. 32, pp. 53-88.
___. 1985. "Ambiguity and uncertainty in probabilistic inference," Psychological Review, vol. 93, pp. 433-461.
Estes, W.K. (ed.) 1978. Handbook of Learning and Cognitive Processes, vol. 5. Hillsdale, NJ: Lawrence Erlbaum.
Gallhofer, Irmtraud N., Saris, Willem E., and Schellekens, Maarten. 1988. "People's recognition of political decision arguments," Acta Psychologica, vol. 68, pp. 313-327.
George, Alexander L. 1969. "The operational code: a neglected approach to the study of political leaders and decisionmaking," International Studies Quarterly, vol. 13, pp. 190-222.
___. 1980. Presidential Decision Making in Foreign Policy: The Effective Use of Information and Advice. Boulder, Colo.: Westview Press.
Hamilton, D.L. 1979. "A cognitive attributional analysis of stereotyping," in L. Berkowitz (ed.), Advances in Experimental Social Psychology, vol. 12. New York: Academic Press.
Hamilton, D.L., and Gifford, R.K. 1976. "Illusory correlation in interpersonal perception: a cognitive basis of stereotypic judgments," Journal of Experimental Social Psychology, vol. 12, pp. 392-407.
Hayes, J.R. 1981. The Complete Problem Solver. Philadelphia, Pa.: Franklin Institute Press.
Heuer, R.J. 1978. "Do you think you need more information?" Mimeograph. October.
Holsti, Ole R. 1962. "The belief system and national images: a case study," Journal of Conflict Resolution, vol. 4, pp. 244-252.
Hurwitz, Roger. 1988. "Reading prisoner's dilemma interactions as drama," paper presented at the annual meeting of the International Studies Association, March 30-April 1, St. Louis.
Janis, I.L., and Mann, L. 1977. Decision Making: A Psychological Analysis of Conflict, Choice and Commitment. New York: Free Press.
Johnson, E.J., and Tversky, Amos. 1983. "Representations of perceptions of risks." Technical Report NR 197-058. Office of Naval Research, June.
Kahneman, Daniel, Slovic, Paul, and Tversky, Amos (eds.) 1982. Judgment Under Uncertainty: Heuristics and Biases. Cambridge, England: Cambridge University Press.
Kegley, Charles W., Jr. 1987. "Decision regimes and the comparative study of foreign policy," in C.F. Hermann, C.W. Kegley, and J.N. Rosenau (eds.), New Directions in the Study of Foreign Policy. Boston: Allen and Unwin, pp. 247-268.
Kelley, H.H. 1971. Attribution in Social Interaction. Morristown, NJ: General Learning Press.
Kirkpatrick, S.A., Davis, D.F., and Robertson, R.O. 1976. "The process of political decisionmaking in groups: search behavior and choice shifts," American Behavioral Scientist, vol. 20, pp. 33-64.


Kogan, N., and Wallach, M.A. 1967. "Risk taking as a function of the situation, the person, and the group," in G. Mandler, P. Mussen, N. Kogan, and M. Wallach (eds.), New Directions in Psychology, vol. 3. New York: Holt, Rinehart, and Winston.
Lampton, D.M. 1973. "The US image of Peking in three international crises," Western Political Quarterly, vol. 26, pp. 28-50.
Lindblom, Charles E. 1959. "The science of muddling through," Public Administration Review, vol. 19, pp. 79-88.
Mallery, John C. 1987. "Computing strategic language: natural language models of belief and intention," paper presented at the annual meeting of the International Studies Association, April.
___. 1988. "Thinking about foreign policy: finding an appropriate role for artificially intelligent computers," paper presented at the annual meeting of the International Studies Association, March 29-April 1, St. Louis.
Mallery, John C., Hurwitz, Roger, and Duffy, G. 1987. "Hermeneutics," Encyclopedia of Artificial Intelligence. New York: John Wiley.
May, E. 1973. Lessons of the Past: The Use and Misuse of Power in American Foreign Policy. New York: Oxford University Press.
Mefford, Dwain. 1987. "Analogical reasoning and the definition of the situation: back to Snyder for concepts and forward to artificial intelligence for method," in C.F. Hermann, C.W. Kegley, and J.N. Rosenau (eds.), New Directions in the Study of Foreign Policy. Boston: Allen and Unwin, pp. 221-224.
Miller, George A. 1956. "The magical number seven, plus or minus two: some limits on our capacity for processing information," Psychological Review, vol. 63, pp. 81-97.
Mitchell, D.J., Russo, J.E., and Pennington, N. 1989. "Back to the future: temporal perspective in the explanation of events," Journal of Behavioral Decision Making, vol. 2 (1, January-March), pp. 25-38.
Montgomery, H., and Svenson, O. 1976. "On decision rules and information processing strategies for choices among multiattribute alternatives," Scandinavian Journal of Psychology, vol. 17, pp. 273-291.
Newell, Allen, and Simon, Herbert. 1972. Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.
Nisbett, R.E., Borgida, E., Crandall, R., and Reed, H. 1976. "Popular induction: information is not necessarily informative," in J.S. Carroll and J.W. Payne (eds.), Cognition and Social Behavior. Hillsdale, NJ: Lawrence Erlbaum.
Nisbett, R.E., and Ross, L. 1980. Human Inference: Strategies and Shortcomings in Social Judgement. New York: John Wiley.
Nisbett, R.E., and Wilson, T.D. 1977. "Telling more than we can know: verbal reports on mental processes," Psychological Review, vol. 84, pp. 231-59.
Park, C.W. 1978. "A seven point scale and a decision maker's simplifying choice strategy: an operationalized satisficing-plus model," Organizational Behavior and Human Performance, vol. 21, pp. 252-272.
Payne, J.W. 1975. "Relation of perceived risk to preferences among gambles," Journal of Experimental Psychology: Human Perception and Performance, vol. 21, pp. 286-94.


___. 1980. "Information processing theory: some concepts and methods applied to decision research," in T.S. Wallsten (ed.), Cognitive Processes in Choice and Decision Behavior. Hillsdale, NJ: Lawrence Erlbaum, pp. 95-115.
Pitz, G.F. 1977. "Decision making and cognition," in H. Jungerman and G. DeZeeuw (eds.), Decision Making and Change in Human Affairs. Dordrecht, the Netherlands: Reidel.
___. 1980. "The very guides of life: the use of probabilistic information for making decisions," in T.S. Wallsten (ed.), Cognitive Processes in Choice and Decision Behavior. Hillsdale, NJ: Lawrence Erlbaum, pp. 77-94.
Pitz, G.F., and Sachs, N.J. 1984. "Judgment and decision making: theory and applications," Annual Review of Psychology, vol. 35.
Powell, Charles A., Purkitt, Helen E., and Dyson, James W. 1987. "Opening the black box: cognitive processing and optimal choice in foreign policy decision making," in C.F. Hermann, C.W. Kegley, and J.N. Rosenau (eds.), New Directions in the Study of Foreign Policy. Boston: Allen and Unwin, pp. 203-220.
Purkitt, Helen E. 1990, forthcoming. "Political decision making in the context of small groups: the Cuban missile crisis revisited-one more time," in E. Singer and V. Hudson (eds.), Political Psychology and Foreign Policy. Boulder, Colo.: Westview.
Purkitt, Helen E., and Dyson, James W. 1987. "Does experience and more information lead to high quality political decisions? an information processing perspective." Mimeograph. Annapolis, Md.: U.S. Naval Academy.
___. 1988. "An experimental study of cognitive processes and information in political problem solving," Acta Psychologica, vol. 68, pp. 329-342.
___. 1989. "Foreign policy decision making under varying situational constraints: an information processing perspective," paper presented at the Twelfth Research Conference on Subjective Probability, Utility and Decision Making, August 21-25, Moscow.
Quattrone, G.A., and Tversky, A. 1983. "Causal versus diagnostic contingencies: on self-deception and on the voter's illusion." Technical Report NR 197-058. Office of Naval Research, June.
Rohrmann, Bernd, Beach, Lee R., Vlek, Charles, and Watson, Stephen R. (eds.) 1989. Advances in Decision Research. Amsterdam: North-Holland.
Rosati, J.A. 1984. "The impact of beliefs on behavior: the foreign policy of the Carter administration," in D.A. Sylvan and S. Chan (eds.), Foreign Policy Decision Making. New York: Praeger, pp. 158-91.
Schrodt, Philip A. 1985. "Adaptive precedent based logic and rational choice: a comparison of two approaches to the modeling of international behavior," in U. Luterbacher and M.D. Ward (eds.), Dynamic Models of International Conflict. Boulder, Colo.: Lynne Rienner, pp. 373-400.
___. 1987. "Pattern matching, set prediction and foreign policy analysis," in S.J. Cimbala (ed.), Artificial Intelligence and National Security. Lexington, Ma.: Lexington Books, pp. 89-107.
Shanteau, James. 1987. "Psychological characteristics of expert decision makers," in J. Mumpower, O. Renn, L.D. Phillips, and V.R.R. Uppuluri (eds.), Expert Judgment and Expert Systems. Berlin: Springer Verlag.


___. 1989. "Psychological characteristics and strategies of expert decision makers," in Bernd Rohrmann, Lee R. Beach, Charles Vlek, and Stephen Watson (eds.), Advances in Decision Research. Amsterdam: North-Holland, pp. 203-215.
Shepard, R.N. 1978. "On subjectively optimum selections among multi-attribute alternatives," in W.K. Estes (ed.), Handbook of Learning and Cognitive Processes, vol. 5. Hillsdale, NJ: Lawrence Erlbaum.
Simon, H.A. 1945, rev. 3d ed. 1976. Administrative Behavior. New York: Free Press.
___. 1958. Models of Man. New York: John Wiley.
___. 1959. "Theories of decision-making in economic and behavioral science," American Economic Review, vol. 49, pp. 253-283.
Simon, H.A., and Hayes, John R. 1976. "Understanding complex task instructions," in David Klahr (ed.), Cognition and Instruction. Hillsdale, NJ: Lawrence Erlbaum, pp. 269-286.
Slovic, P., Fischhoff, B., and Lichtenstein, S. 1977. "Behavioral decision theories," Annual Review of Psychology, vol. 28, pp. 1-39.
___. 1983. "Characterizing perceived risk," in R.W. Kates and C. Hohenemser (eds.), Technological Hazard Management. Cambridge, MA: Gunn and Hain.
___. 1984. "Behavioral decision theory perspectives in risk and safety," Acta Psychologica, vol. 56, pp. 183-203.
Slovic, P., and Lichtenstein, S. 1968. "The relative importance of probabilities and payoffs in risk taking," Journal of Experimental Psychology, monograph supplement, vol. 78(3), part 2.
Slovic, P., Lichtenstein, S., and Fischhoff, B. 1988. "Decision making," in R.C. Atkinson, R.J. Herrnstein, G. Lindzey, and R.D. Luce (eds.), Stevens' Handbook of Experimental Psychology, 2d ed. New York: John Wiley.
Slovic, P., and MacPhillamy, D.J. 1974. "Dimensional commensurability and cue utilization in comparative judgment," Organizational Behavior and Human Performance, vol. 11, pp. 172-194.
Sylvan, Donald A., and Thorson, Stuart J. 1989. "Looking back at JFK/CBA: considering new information and alternative technology," paper presented at the annual meeting of the International Studies Association, March 27-April 2, London.
Taylor, S.E., Fiske, S.T., Etcoff, N.L., and Ruderman, A.J. 1978. "Categorical and contextual bases of person memory and stereotyping," Journal of Personality and Social Psychology, vol. 36, pp. 778-793.
Tetlock, Philip E., and McGuire, C., Jr. 1986. "Cognitive perspectives on foreign policy," in S. Long (ed.), Political Behavior Annual, vol. 1. Boulder, Colo.: Westview, pp. 147-179.
Thorson, Stuart, and Andersen, Kristi. 1987. "Computational models, expert systems, and foreign policy," in S.J. Cimbala (ed.), Artificial Intelligence and National Security. Lexington, Ma.: Lexington Books, pp. 147-158.
Thorson, Stuart, and Sylvan, Donald. 1982. "Counterfactuals and the Cuban missile crisis," International Studies Quarterly, vol. 26, pp. 537-71.
Tversky, Amos. 1977. "Features of similarity," Psychological Review, vol. 84, pp. 327-352.


Tversky, Amos, and Heath, C. 1989. "Ambiguity and confidence in choice under uncertainty," paper presented at the Twelfth Research Conference on Subjective Probability, Utility and Decision Making, August 21-25, Moscow.
Tversky, Amos, and Kahneman, D. 1983. "Extensional vs. intuitive reasoning: the conjunction fallacy in probability judgment," Psychological Review, vol. 90, pp. 293-315.
___. 1986. "Rational choice and the framing of decisions," in R.M. Hogarth and M.W. Reder (eds.), Rational Choice: The Contrast Between Economics and Psychology. Chicago, Ill.: University of Chicago Press, pp. 67-94.
Walker, Stephen G. 1977. "The interface between beliefs and behavior: Henry Kissinger's operational code and the Vietnam war," Journal of Conflict Resolution, vol. 21, pp. 129-168.
Wallsten, T.S. (ed.) 1980. Cognitive Processes in Choice and Decision Behavior. Hillsdale, NJ: Lawrence Erlbaum.
Wildavsky, A. 1964. The Politics of the Budgetary Process. Boston: Little, Brown.

3

Steps Toward Artificial Intelligence: Rule-Based, Case-Based, and Explanation-Based Models of Politics

Dwain Mefford

New subfields within artificial intelligence emerge on a regular basis, about every five to ten years. The most recent include Case-Based Reasoning (Kolodner 1988; Hammond 1989a, 1989b) and Explanation-Based Learning (DeJong 1988; DeJong and Mooney 1986; Mitchell, Keller, and Kedar-Cabelli 1986). These approaches cast reasoning in terms of the acquisition and use of large qualitative structures in memory and show signs of merging into a general approach to problem solving and planning.1 In this chapter we argue, as we have argued elsewhere (Mefford, 1988b, 1989), that these approaches speak to some of the most fundamental questions in the study of politics. The concepts and mechanisms of these approaches can be readily applied to address paradigmatic issues in the areas of decision making, collective action, and the theory of regimes and institutions.

The development of case-based reasoning (CBR) and explanation-based learning (EBL) within artificial intelligence in the 1980s was motivated by several sets of issues, including that of the organization of dynamic memory (Schank 1982; Kolodner 1980, 1983, 1984) and the desire to overcome the limitations of rule-based systems, limitations that have become all too apparent after twenty years of experience. Political scientists who have built rule-based systems have themselves become increasingly aware of these limitations. My purpose is not to argue the virtue of one class of systems over another, but to trace out the substantive and theoretical issues that have motivated the development of alternative approaches. There is a conceptual path that leads from rule-based systems, with their uniform and atomistic data structures, to systems that use larger and more complex constructs to reproduce inferential and learning processes. The assumption of a fixed and static knowledge base is increasingly giving way to a fundamental concern with how knowledge is acquired and organized dynamically.


This progression, if extended, can be steered in the direction of certain core questions in empirical political theory. These issues include, for example, the conceptual challenge of systematically investigating what March and Olsen call "path dependent processes" (1984). A path-dependent process is one in which events impact on the structure of a system, changing its character and thereby opening and closing possibilities for subsequent change or development (David 1985). March and Olsen express this notion cryptically in the context of organization theory with the idea that institutions encode their histories in their structure (March and Olsen 1984: 743). Path-dependent processes are ubiquitous: The sexual abuse of a child may be expressed through a lifetime of false intimacy and failed relationships; it matters for the character of today's regimes in Eastern Europe that the German advance was stopped at Stalingrad fifty years ago. Within the study of foreign policy and international relations, the role of history in shaping strategic relations and institutions is a path-dependent process, as is the impact of contingent experience on the political beliefs of individuals. These issues, and a host of others that are closely linked to them, can only be addressed awkwardly, if at all, by behavioral methods (Alker 1975, 1984; March and Olsen 1984: 749). The concepts and mechanisms that are emerging in the most recent work in artificial intelligence provide a basis for the systematic formal and empirical investigation of processes of this order.

To describe what is essential and characteristic of each of several approaches, and to do so such that they can be systematically compared, we employ a single working example that will be cast and recast in several forms. The example is based on a classic in the literature on the violent transfer of political power, Luttwak's study of coups d'etat (Luttwak 1979). The patterns Luttwak finds in the planning and execution of coups will be presented, first, as a rule-based expert system, then as a planning system that packages and reapplies pieces of plans, then as a case-based reasoning system that reasons on the basis of historical examples, and finally as a system that constructs and updates its theory of coups d'etat as a function not only of the cases that history presents but of the order in which these cases are encountered. This last class of systems paves the way for the systematic study of the experiential learning and other path-dependent processes that account for institutional change (Haas 1982).

The Rational Science of Artificial Intelligence

It will become apparent as we step from system to system that the object of the model shifts progressively from behavior, viewed in a sense from the "outside," to the reasoning and adaptation that underlie behavior. This effort to systematically investigate behavior from the "inside" flies in the face of a deep prejudice of behavioral science, which, as Simon observes, is "always wary of intervening variables that are not directly observable" (Simon 1976: 261).


is "always wary of intervening variables that are not directly observable" (Simon 1976: 261). But this is precisely the purpose of artificial intelligence as it is of cognitive science in general: The object is to systematically reconstruct the complex process that stands between the stimulus and the response in the S-R model. In short, the theoretical action is in the mechanism that relates the S to the R; the theoretical action is in the hyphen. Artificial intelligence, and cognitive science more broadly (Newel! 1983a; Simon 1980), departs from the norms of behavioral science not only in its method and its mathematics but more fundamentally in how it defines its scientific problem. Because conventional methodology restricts the complexity of the problems we can study, Simon observes that if we are to study process, we are going to have to employ research methods and research designs that are appropriate to that kind of investigation, and these are going to be different from the methods and designs we have used to study simple stimulus-response connections. The shift from S-R formulations to theory of information-processing formulations is a fundamental shift in paradigm, and it is bringing with it fundamental shifts in method also (Simon 1976: 261).

In terms of our working example, rather than attempting to find a relationship between, say, types of regimes and the incidence of coups d'etat that can be expressed in some linear form (e.g., Thompson 1974, 1975), the object is to reconstruct, abstractly, the planning process that might be typical of a class of powerful but disaffected political figures. Intentional behavior is explained in terms of the capacity of intelligent agents to reason, communicate, and act on the basis of, and within the constraints imposed by, their knowledge and goals. In the search for mechanism, artificial intelligence and cognitive science pick up where descriptive behavioral science too often leaves off. Explanation within this conception of science corresponds to the commonsense notion: The object is to answer the question of how and why something happened, or could happen. There is a stark difference between this meaning of explanation and "explanation" as used in the notion of "explaining the variance" between a pair of variables.2 To use the example that we will return to throughout this chapter, to know that military rank and age "explain," "predict," or are "significantly" related to the likelihood that an individual will participate in a coup is not to know why this might be the case.

As a method of inquiry, artificial intelligence is sometimes presented as a radical departure from the types of formal and empirical investigation that are familiar in political science and economics (Schrodt, 1985). There are differences in the substantive issues raised and in the function that computational methods play in the formulation of theory, but it can be argued that these differences remain largely on the surface.


The core scientific enterprise across the full spectrum of the AI community, from "left-wing scruffies" to "right-wing neats" (Abelson 1981), conforms to a basic image of science that is broadly shared. The object is to extract general principles or mechanisms that underlie intelligent behavior.

To date, the political scientists who have applied concepts and techniques from AI to politics have concentrated their attention on two subfields within artificial intelligence: natural language processing, and what we will loosely call problem solving and planning. Duffy and Mallery's project, which is reported elsewhere in this volume, is far and away the preeminent example of the work by political scientists in the area of natural language understanding (Duffy and Mallery 1986; Duffy 1988; Mallery 1987, 1988a, 1988b). This chapter concentrates exclusively on the second of the two agendas, that of problem solving and planning, which, when viewed as a repeated process, generalizes to the question of learning. We will first chart AI work in planning and problem solving as it has evolved in the 1970s and 1980s. Viewed from the altitude that we adopt, the overlapping developments in this body of work reveal a basic progression from rule-based, performance-oriented systems to more cognitively oriented programs that utilize larger, qualitative structures and that, increasingly, emphasize the acquisition and modification of knowledge. The actual evolution is not, of course, as neat and linear as our examples will suggest. (For a more complete survey that charts the parallel lines of development and backtracking that have occurred as these ideas have unfolded, see Mefford forthcoming, chapters 3 and 4.)3

Three Generations of Rule-Based Systems

In 1967 Edward Luttwak published Coup d'Etat: A Practical Handbook, which is remarkable as much for its format as for its findings. The book distills the concept of coup d'etat from a nearly exhaustive data base covering the period from 1945 through the early 1960s, later extended to 1978. Luttwak explicates the concept of coup d'etat by adopting the perspective of a hypothetical group bent on overthrowing a seated government. The general scheme is the precipitate of 282 cases (Table 11, pp. 195-207). From these cases, particularly from cases that strike Luttwak as paradigmatic, such as Nasser's rise to power in 1952, he assembles a dynamic concept of coup in terms of the opportunities and challenges that confront any would-be organizer of an armed revolt. The book is written in the genre of the "advice books" to princes, of which Machiavelli's is the most noted example (Gilbert 1965). It is a how-to book that prescribes the steps in the conduct of a coup from the initial recruitment of co-conspirators, through the planning and execution, to the consolidation of power and the return to "normalcy" under the rule of a new leadership.


In the preface to the second edition (1979), Luttwak notes, with more satisfaction than chagrin, that the book was apparently used as a guide on at least one occasion: A heavily annotated copy of the French edition was found among the personal effects of an ambitious colonel in an unnamed country.

Our first attempt to model the political reasoning of Luttwak's coupmakers takes the form of a rule-based system. The object is to reconstruct the plans and calculations of a hypothetical individual or group as it organizes and executes a coup d'etat. After observing what can and cannot be represented in this class of systems, we proceed to other types of programs based on principles that promise to better represent the cognitive mechanisms at work. In addition to reconstructing the strategic reasoning of the actors, the attempt will be to capture more fundamental processes, such as how individuals model themselves and their political programs after historical examples. The research question shifts, in effect, from reconstructing patterns in data to modeling the political reasoning and political education of intelligent, purposive agents.

When we shift the object of the model in this way, Luttwak himself, as a stand-in for an intelligent and historically informed political actor, becomes the subject. The point is to reproduce how Luttwak reasons, how he recognizes patterns in the historical examples, and how he formulates those patterns into principles for a particular kind of political action, namely, armed rebellion. A theory of how regimes are overthrown requires an explicit treatment of how individuals come to see this course of action not only as an option but as a political goal.

Production Systems: The Cognitive and Computer Architecture of First-Generation Rule-Based Systems

Rule-based systems and production systems, if they are not the same concept, are so closely related that they can safely be treated as synonymous.4 They define an architecture or principle of design for a class of computer programs. Production systems consist essentially of a collection of rules or productions plus a pattern matcher, deductive procedure, or other device for applying the rules to the problem or task. The rules or productions are condition-action pairs, or if . . . then clauses. The condition is matched against data or statements in the workspace. If the match succeeds, i.e., if a rule's condition is satisfied, then the associated action is executed, which generally results in a change in the contents of the workspace; e.g., hypotheses are added or removed, or likelihoods are increased or decremented. This conception of the function and design of computer systems is credited to Allen Newell, building on the work of Post, Floyd, Simon, and Feigenbaum (Post 1943; Minsky 1967; Floyd 1961; Feigenbaum 1963; Simon and Feigenbaum 1964). Production systems have proven to be extremely powerful and plastic.


Newell, Anderson, and others have argued, and to some extent have demonstrated, that the components of a production system are sufficient to reproduce a wide range of intelligent behavior. The conception is, therefore, a candidate theory of what is necessary and sufficient for intelligence in human beings and machines (Anderson 1976, 1983; Newell 1980, 1989).5

Edward Feigenbaum is responsible for first applying the concept of production system outside the domain of the theoretical and empirical study of human memory (Buchanan and Shortliffe 1984: 7-8). He convinced the designers of DENDRAL, a program designed to identify organic compounds from mass spectrographs, to use productions in place of the conventional programming language and architecture. The insight was that the principles in this branch of chemistry are better conceived as a collection of rules than as some large algorithm. The success of that system (Feigenbaum, Buchanan, and Lederberg 1971; Lindsay, Buchanan, Feigenbaum, and Lederberg 1981) inspired efforts not only to employ production systems in other domains, most notably medical diagnosis, but also to develop shells or languages to facilitate the construction of such systems across a broad range of applications (Feigenbaum 1977). The potential that DENDRAL, MYCIN, and related systems demonstrated in the 1970s,6 together with the advent of special-purpose languages and development tools, including EMYCIN (Van Melle 1980) and OPS5 (Brownston, Farrell, Kant, and Martin 1985), and the founding of companies dedicated to capitalizing on expert system design (Harmon 1989), is credited with "bringing AI out of the laboratory" and into the world of business and industry (Winston and Prendergast 1984). For this reason artificial intelligence is often identified exclusively with rule-based systems.7

Sophisticated rule-based systems may consist of a number of modules, some of which elicit data from users, monitor the state of the rule base, or offer explanations by backtracking through the chain of rules used to reach a conclusion (Davis, Buchanan, and Shortliffe 1977). But there are only three essential components of a rule-based system: a set of rules, e.g., relationships between symptoms and diseases; a data set that, in a medical application, might include the patient's medical history and the results of laboratory tests; and a device for applying these rules in the process of solving a problem or executing a task (Davis and King 1984: 21). The classical rule-based systems were primarily designed for the task of classification or diagnosis, though a diagnostic system like MYCIN is also intended to prescribe therapies, which is a form of planning (Chandrasekaran 1986: 27-28). Classification goes hand in hand with planning in these systems, and the same is the case for our rule-based version of Luttwak's theory of coups d'etat.
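
The three components just listed are few enough to sketch in code. The following toy forward-chaining interpreter, written in Python for concreteness, illustrates the architecture only; the two rules are invented stand-ins and are not drawn from MYCIN or any actual system.

    # Rules as condition-action pairs: IF all conditions hold in working
    # memory, THEN add the conclusion. The rule content is hypothetical.
    RULES = [
        ({"fever", "stiff neck"}, "suspect meningitis"),
        ({"suspect meningitis"}, "order lumbar puncture"),
    ]

    def forward_chain(working_memory, rules):
        """Fire any rule whose conditions are all present in working
        memory, adding its conclusion, until nothing new can be added."""
        changed = True
        while changed:
            changed = False
            for conditions, conclusion in rules:
                if conditions <= working_memory and conclusion not in working_memory:
                    working_memory.add(conclusion)
                    changed = True
        return working_memory

    print(forward_chain({"fever", "stiff neck"}, RULES))
    # -> {'fever', 'stiff neck', 'suspect meningitis', 'order lumbar puncture'}

Note that the control loop is entirely generic; all of the "expertise" resides in the rule set, which is what makes such systems easy to extend by simply adding rules.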


Model 1: Luttwak Rendered as a Rule-Based System

Luttwak's account of the steps and substeps in organizing and executing a coup clearly has the character of a planning system. The particular plan that it, in effect, constructs depends on contingent factors, such as the ethnic divisions within the leadership circle (Luttwak 1979: 55). The reasoning that relates these factors to what is involved in carrying out a coup d'etat is readily represented in a rule-based system. For example, the essential goal for the perpetrator of a coup is to take over and control the power of the state rather than to destroy it. To do this requires neutralizing the instruments of the state: the armed forces, the security agencies, the bureaucracy, etc. This in turn requires infiltrating and co-opting crucial personnel and/or rendering their ability to oppose the coup ineffective, which can be achieved by interfering with the channels of communication, e.g., seizing radio stations and silencing government officials by arresting or kidnapping them. In a rule-based system, these actions, and a thousand others, can each be represented as a condition-action pair of the form:

IF CONDITION, e.g., an army garrison is stationed in the capital, THEN ACTION, e.g., reduce the effectiveness of that garrison by disrupting its communication system.

Luttwak does not in fact formulate discrete rules, but the rulelike character of much of what he extracts from examples of coups d'etat can be reexpressed in such a form. For instance, as part of his analysis of the potential for mutiny in Portugal in the 1970s, Luttwak formulates a rulelike principle: "The forces relevant to a coup are those whose locations and/or equipment enables them to intervene in its locale (usually the capital city) within the 12-24-hour time-span which precedes the establishment of its control over the machinery of government" (1979: 70). This principle is fleshed out via an analysis of the tactics that are likely to be effective in blocking military units of various descriptions from coming to the defense of the government. In a table Luttwak matches functional descriptions of army units with the tactic that is likely to be effective in "turning" or neutralizing that unit. Table 3.1 is effectively an array of simple rules, which can be read off by matching descriptions of the military unit to the corresponding tactic.

Equipped with rules of this sort, encoded in some convenient form, the program chains backwards from the goal, i.e., to overthrow the regime, through the actions and conditions that would realize that goal. A sample of the rules arranged into a tree structure is pictured in Figure 3.1, in a later section on planning systems. It is important to observe that first-generation rule-based systems are blind to the order or dependency among the rules.


Table 3.1: Infiltration Strategies (Simplified from Luttwak, 1979, table 3, p. 74)

Unit:                      Battalion No. 1         Battalion No. 2         Battalion No. 3

Technical structure:       Very simple; relies     Very complex;           Medium; relies on
                           on ordinary             requires airlift and    land transport and
                           communication and       sophisticated           radio links
                           transportation          communications

Key men (leaders who                               40 technicians          5 technicians
must be co-opted):

Figure 3.1: AND/OR Graph of Tasks Implicit in the Concept of Coup d'Etat

[Tree diagram. The root goal, "to execute a coup d'etat," decomposes (AND) into subtasks: recruit conspirators (identify disaffected leaders); neutralize the machinery of power (AND: neutralize the military command, whether by exploiting geographical dispersement or by compromising communications (OR); neutralize the police and security forces; neutralize the bureaucracy); neutralize non-state centers of power (political parties, labor, etc.); and consolidate.]


That structure is implicit in the rule base but is itself not directly accessible to the system. As a consequence it cannot be explicitly used in the process of monitoring a situation or solving a problem, and it surfaces only as the system attempts to solve its problem. The "flat" and "linear" character (Sacerdoti 1977) of early rule-based systems, i.e., the principle of design whereby all knowledge in the system is represented in the same form at the same level, is both a strength and a weakness. Such systems are easy to modify by adding or deleting rules because there is no need to update complex data structures. But, by the same token, because the system cannot explicitly access this structural knowledge, it cannot utilize it. These issues, among others, motivated the development of the next generation of rule-based systems.
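
The contrast can be made explicit with a short sketch. The fragment below (Python, for illustration) represents the AND/OR structure of Figure 3.1 directly and chains backwards from the goal; the decomposition condenses and paraphrases Luttwak's categories and is not his actual rule set. Unlike the flat rule base just described, the dependency structure here is an explicit data object the program can inspect.

    # Each goal maps to alternative decompositions (OR), each of which is
    # a list of subgoals that must all be achieved (AND). Hypothetical.
    GOALS = {
        "execute coup": [["recruit conspirators",
                          "neutralize military command",
                          "neutralize police and security forces",
                          "consolidate power"]],
        "neutralize military command": [["exploit geographical dispersement"],
                                        ["compromise communications"]],
    }

    # Primitive actions assumed feasible in the situation being modeled.
    FEASIBLE = {"recruit conspirators", "neutralize police and security forces",
                "consolidate power", "compromise communications"}

    def achievable(goal):
        """Backward chaining: a goal holds if it is a feasible primitive,
        or if every subgoal of at least one decomposition holds."""
        if goal in FEASIBLE:
            return True
        return any(all(achievable(subgoal) for subgoal in option)
                   for option in GOALS.get(goal, []))

    print(achievable("execute coup"))  # -> True, via compromised communications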

The Step to Second-Generation Rule-Based Systems: Functional and Contextual Knowledge Plus the Problem of Explanation

What Davis has called "first-generation" rule-based or expert systems, most notably DENDRAL (Buchanan, Sutherland, and Feigenbaum 1969; Lindsay, Buchanan, Feigenbaum, and Lederberg 1981) and MACSYMA (Martin and Fateman 1971; MACSYMA group 1974), were concerned exclusively with performance. No claims were advanced as to any correspondence between how these systems solved problems in their domains and how human beings accomplish similar tasks (Davis 1984: 20). What mattered was the accuracy of the result and the efficiency with which it was obtained. In the next generation of systems, most notably the medical diagnostic program MYCIN (Shortliffe 1976; Buchanan and Shortliffe 1984), the interaction between the program and human experts became an explicit concern. For the system to be effective in a hospital setting it was necessary to equip it with procedures for entering data and for explaining or justifying its reasoning in a manner consistent with the conventions that people use and expect. Expert systems like MYCIN fell well short of producing convincing explanations, but the fact that this complex issue was given prominence on the research agenda marks an advance in the conception and design of rule-based systems.

For a system to interact with human beings in even the restricted conversational mode required for data entry, it must be equipped with knowledge of a type and form that cannot be readily represented in discrete, rulelike pieces. In addition to being incapable of explicitly representing connections among the rules in the rule base, rule-based systems cannot relate individual rules to the source or evidence from which the rules were extracted. Clancy's work shows that "individual rules play different roles, have different kinds of justifications, and are constructed using different rationales for the ordering and choice of premise clauses" (Clancy 1983: 215-216).


For a rule-based system to be intelligible to a human being, what Clancy calls the "structural," "strategic," and "support" knowledge that underlies the rules must be made explicit; otherwise the user cannot evaluate the system's conclusions, nor is it possible for the user to intervene and alter the rule base appropriately in the event that the program needs correction. Though several correctives were proposed (Davis 1980; Clancy 1985), what is important in the present context is that the limitations that Clancy describes in MYCIN-like systems also plague any rule-based system of the same type,8 like the one we sketched with the Luttwak example. However, equally applicable to any attempt by political scientists to construct similar rule-based systems is Clancy's evaluation of the MYCIN project:

Despite the now apparent shortcoming of MYCIN's rule formalism, we must remember that the program was influential because it worked well. . . . [W]e can treat the knowledge base as a reservoir of expertise, something never before captured in quite this way, and use it to suggest better representations (Clancy 1983: 249).

As Davis concludes, rule-based systems at present can be characterized with the phrases "narrow domain of expertise. Fragile behavior at the boundaries. Limited knowledge representation language. Limited I/O. Limited explanation. One expert as knowledge base 'czar'" (Davis 1982: 9). What is needed, minimally, is a system capable of a "range of behaviors associated with expertise: solve the problem, explain the result, learn, restructure knowledge, break rules, determine relevance, degrade gracefully" (Davis 1982: 3). What cannot easily be captured in a collection of atomized rules, even if thousands are encoded, is the strategic knowledge of how to use them, when to allow variations and exceptions, and how to adapt and revise them as needed. This demands a richer representation than that which rules provide. These limitations, coupled with the fact that rules are hard to extract because people are often unable to articulate their knowledge in rule form, motivate the search for alternative approaches.

Third-Generation Rule-Based Systems: Structural Knowledge and Task Decomposition

The next generation of expert systems, e.g., Davis's work on a computer fault-diagnostic system (Davis 1984b), requires the ability to reason in terms of the structure and function of objects in a domain as well as the capacity to draw inferences on the basis of empirical associations in the manner of earlier rule-based systems. Adding structural, functional, and behavioral descriptions entails employing multiple representations; rules no longer enjoy a monopoly. What we are calling "third-generation" systems introduce a new level of abstraction that focuses on the inferential tasks rather than on the implementation of these tasks in any particular formalism, such as rules or frames. The JESSE system developed by Chandrasekaran, Goel, and Sylvan (Goel and Chandrasekaran 1987; Goel, Chandrasekaran, and Sylvan 1987; Sylvan, Goel, and Chandrasekaran 1988 and this volume) applies this theoretical concern with information-processing tasks to policy formation in the area of Japanese energy planning. This program marks a major departure from other rule-based applications to decision making principally because of the theoretical questions that the system poses in its architecture. Programs like Job and Johnson's reconstruction of the Johnson administration's decision in the Dominican Republic crisis (Job and Johnson 1986; this volume) or Thorson and Sylvan's earlier effort to reconstruct Kennedy's options in the Cuban Missile Crisis (Thorson and Sylvan 1982) or my own work in modeling Soviet interventions in Eastern Europe (Mefford 1984, 1987a) essentially serve to demonstrate that it is possible to reproduce specific complex behavior characteristic of decision making in realistic contexts. But these models remain case-specific, as Mallery points out (Mallery 1988c). The principal merit of these experiments is to demonstrate the utility of a formalism or technology to capture a class of behavior in a reproducible form. This is an important achievement; it prepares the way for, but does not deliver, a theoretical contribution to the study of foreign policy. What distinguishes these efforts from case studies (George and Smoke 1989; Lebow and Stein 1989) is that, in principle, the decision mechanisms explicated in the design of the program can serve as building blocks for a theory of decision making by U.S. administrations, or by governments in general, across a range of contexts. But even though this is the end objective of the exercise, in existing systems the logical steps from the specific case to the general theory are incompletely worked out at best.

In contrast, by virtue of the theoretical concerns that characterize third-generation rule-based systems, Goel, Chandrasekaran, and Sylvan pose an explicit theoretical question through the theory of cognitive tasks embodied in the architecture chosen for their system. They ask in effect whether it is possible to reproduce patterns in Japanese energy policy formation using only two basic inferential processes, that of diagnosis/classification and that of planning. Viewed from the perspective of the theory of design of expert systems, the performance of the system is a measure of the analysis of the set of essential tasks or problems embodied in the Japanese case, along with the adequacy of the high-level language in which these tasks are implemented (Bylander, Mittal, and Chandrasekaran 1983; Bylander and Mittal 1986). Viewed from the perspective of the theory of foreign policy, the adequacy of the system is a validation of the hypothesis that the policy-making apparatus realized in the Japanese bureaucracy is an instance of a class of information-processing systems of which the computer program is an example. The verisimilitude of the program's performance reflects back on the sufficiency of the underlying theory of tasks and task composition that is operationalized in a high-level computer language that has been developed by Chandrasekaran and his colleagues.9
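Read as an information-processing hypothesis, this architecture claims that two generic tasks suffice. The following schematic sketch is our illustration of that claim, not the actual high-level language the chapter cites; the toy domain loosely echoes the Japanese energy case, and all names are hypothetical.

    # Schematic of the two generic inferential processes the JESSE
    # hypothesis combines; illustrative only, not the implemented system.

    def classify(situation, hierarchy):
        """Establish-refine classification: keep a category when its
        condition matches, then try to refine it via its subcategories."""
        established = []
        for category, (condition, subcategories) in hierarchy.items():
            if condition(situation):
                established.append(category)
                established += classify(situation, subcategories)
        return established

    def plan(categories, plan_library):
        """Planning as retrieval: map established categories onto
        stored policy responses."""
        return [plan_library[c] for c in categories if c in plan_library]

    # Hypothetical toy domain:
    hierarchy = {
        "energy-threat": (lambda s: s["imports_disrupted"], {
            "oil-supply-shock": (lambda s: s["oil_share_high"], {}),
        }),
    }
    plan_library = {"oil-supply-shock": "diversify suppliers; build reserves"}

    situation = {"imports_disrupted": True, "oil_share_high": True}
    print(plan(classify(situation, hierarchy), plan_library))

Here classification establishes what kind of situation obtains, and planning reduces to retrieving the responses attached to the established categories; the actual task decomposition in JESSE is, of course, far richer.

From Rules to Cases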

Systems That Plan: A Bridge Between Rule-Based and Case-Based Systems

For the purpose of making rule-based systems more transparent and less brittle, system designers like Davis and Clancy explored such issues as that of providing the system with "structural knowledge," or with the capacity to trace back and question the evidence that justifies a rule. For the theoretically oriented social scientist, the character of these knowledge structures is of intrinsic interest. Programs serve as a means for posing hypotheses about their properties. One of these structures, or a family of structures, that has been an essential concern for artificial intelligence from the beginning is that of plans and planning. The activity of planning, by individuals or organizations, and the construction of strategic plans predicated on the plans of other agents, is part of the core agenda of foreign policy analysis. The applications of artificial intelligence to foreign policy are, by and large, planning systems, for the most part cast in a rule-based form. To make the argument that recent developments in AI, such as case-based reasoning, can be used to pose important theoretical questions that are difficult to pursue in rule-based systems, it is useful to first establish the characteristics of the planning activity that the computer model should capture.

In what has been called the "classical planning framework" in robotics (Pednault 1987; Georgeff 1987), constructing a plan consists of ordering a sequence of actions that, when executed, progressively change an initial state of the world into a state that satisfies a goal. A recipe, a blueprint, and a SIOP, viewed functionally, are all examples of this conception. The notion of a plan as a set of instructions or as a program of actions predates the major work in robot planning. In Plans and the Structure of Behavior, Miller, Galanter, and Pribram redefine the familiar notion of a "plan" such that it becomes functionally equivalent to a computer program (1960). Indeed, they capitalize the word to emphasize this special meaning. A "Plan" is "any hierarchical process in the organism that can control the order in which a sequence of operations is to be performed" (Miller, Galanter, and Pribram 1960: 16). Miller et al. go further to speculate that the instructions that make up a plan are themselves conditioned actions, that is, actions that can be applied only given that specified conditions are satisfied, or actions that may be repeated until some condition is achieved (Miller, Galanter, and Pribram 1960: 38). Construed in this way, plans are quite literally programmable instructions with an explicit control structure, i.e., it is clear under which conditions which actions will be executed. The notion of "public plans" connects planning and problem solving at the level of the individual with the analogous activity at the level of organizations and institutions as conceived by Simon (March and Simon 1958). Plans, or what March and Simon in their book call "programs" (1958: ch. 6), can be viewed as the product, qua policy, of purposive activity of complex organizations. Unfortunately for the social sciences, subsequent work in the field of cognitive science and artificial intelligence has, until comparatively recently, neglected the collective dimension. Within political science these ideas have been pursued by Crecine (1969) and by Alker and his students (Alker 1981; Alker, Bennett, and Mefford 1980; Alker and Greenberg 1977; Bennett and Alker 1977; Tanaka 1984).10

A decade after Miller, Galanter, and Pribram, the work in planning and problem solving had progressed to the point that working systems, i.e., robots, or programs that simulated robots, could be built for certain restricted domains. The most notable of these are the STRIPS program developed at Stanford Research Institute (Fikes and Nilsson 1971; Fikes, Hart, and Nilsson 1972; Nilsson 1980: ch. 7) and Winograd's "blocks world" program, which was his thesis at MIT (Winograd 1972). Systems like STRIPS and SHRDLU describe states of the world as a list of statements or well-formed formulas in first-order logic. The planning problem becomes one of selecting a list of operators from those available to the system, e.g., to "move" or "clear" blocks, or to "open" and "pass through" doors, etc., which, if executed sequentially, transform the world from an initial state to one corresponding to a goal. Since Newell and Simon's work on the Logic Theorist, which proved theorems in the propositional calculus (Newell and Simon 1963, 1972), the task of planning has been construed as a task of constructing a proof. Literally, a plan is a proof by construction in which the theorem to be proved corresponds to a set of statements describing a goal, and the assumptions and axioms consist of a set of initial statements plus a set of operations, interpreted as actions. These operations transform or rewrite statements into statements. Under this definition of the task, a planning system can harness the power of algorithms developed for automated theorem proving, principally Robinson's Resolution Principle, which revolutionized that field in the mid-1960s (Robinson 1965, 1979; Nilsson 1980: ch. 5). The engine that powers planning systems like STRIPS is a theorem prover modified to work efficiently with a particular data structure, e.g., lists of formulae representing the robot's world. The system is equipped with operators, such as "pass through door," that, when applied, have the effect of altering the state of the world (or at least altering the program's image of the world). Changes are effected by adding or deleting statements from the list of statements that define the current state of the world. In later versions of the program, and in its successors, additional procedures are incorporated for generalizing and storing plans, and for constructing plans in a multipass fashion that adds efficiencies and avoids dilemmas that plagued the original system (Sacerdoti 1974, 1977; Stefik 1981). These extensions are of great interest for what they suggest for the study of the policy process, which is manifestly a multilevel process realized in large organizations or governments (Durfee 1986; Durfee and Lesser 1987; Durfee, Lesser, and Corkill 1987).
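The add-and-delete mechanics just described can be stated compactly. The following toy sketch in the STRIPS style is our illustration, not the original system's code: a state is a set of atomic formulas, and an operator carries preconditions, an add list, and a delete list. The operator and formulas shown are hypothetical.

    # Toy STRIPS-style state transformation: a world state is a set of
    # atomic formulas; an operator has preconditions, an add list, and
    # a delete list. Illustrative only.

    def applicable(state, op):
        return op["pre"] <= state          # all preconditions present

    def apply_op(state, op):
        return (state - op["delete"]) | op["add"]

    pass_through_door = {
        "pre":    {"at(robot, roomA)", "open(door)"},
        "delete": {"at(robot, roomA)"},
        "add":    {"at(robot, roomB)"},
    }

    state = {"at(robot, roomA)", "open(door)"}
    if applicable(state, pass_through_door):
        state = apply_op(state, pass_through_door)
    print(state)   # {'open(door)', 'at(robot, roomB)'}

A planner, in these terms, searches for a sequence of such operators whose composition carries the initial state into one that satisfies the goal formulas.

Model 2: Coups d'Etat Represented as Operative Plans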

A system that explicitly uses the structure of plans in its reasoning is an intuitively appealing representation of political calculations. Using our Luttwak example, the perpetrator of a coup can be modeled as a system that constructs and modifies plans in the face of a changeable environment. This can be implemented using a rule-based technology, provided that the structure of the plan itself is accessible to the system and not left simply implicit in the rule base. What is of interest is how the system constructs and adapts its operative plan in the face of actions and events. The conditional structure of the plan for a particular coup might take the general form of an and/or graph, as would the more general conception of what is involved in seizing power. A part of that structure might take the form of the diagram in Figure 3.1. To each of the branches in the tree corresponds one or more rules or productions, the choice of which will likely depend on concrete constraints and opportunities in the particular case. One of Luttwak's examples is the failed coup attempt by the French generals in Algeria in 1961. To overthrow a charismatic leader like de Gaulle requires not only the active support or tacit acceptance by the military but also the power to deny that leader the opportunity to appeal to the rank and file in the military (Luttwak 1979: 105). These factors, and others like them, would figure among the contingencies in a fully developed plan. In effect, the rule base and the control structure that is implicit in the scheduling of the rules embody a theory of coup d'etat. A "strong" theory of this type would assert that the perpetrators of coups, and, for that matter, all other actors involved, reason, to some approximation, in ways that correspond to the steps that such a rule-based system takes as it solves the problem of executing a coup in a given context. A "weak" theory makes no claim as to the correspondence between the functioning of the program and the reasoning and actions of people. The assertion would simply be that the program generates behavior that, at some level, corresponds to or predicts the behavior of some class of political systems. It is our view that political science should strive for explanative theory in the strong sense rather than for mere prediction.11 The premise that the scientific purpose should be to capture processes that operate in the world is a prime tenet of modern scientific realism (Boyd 1984; Harre 1986; Wendt 1987). In the present context this means seeking out those developments within artificial intelligence that promise to provide conceptual leverage for reconstructing the cognitive mechanisms at work in political settings. One of the most promising developments differs from the rule-based approach both in its data structure and in the reasoning process it attempts to reproduce.
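Before turning to that development, a concrete rendering of the and/or plan structure discussed above may help. The sketch below is our illustration, with node labels loosely adapted from the Algeria example rather than taken from Figure 3.1: an AND node requires all of its children, an OR node at least one, and leaves are tested against the facts of the case.

    # An and/or fragment of a "seize power" plan; node labels are
    # illustrative, not drawn from Figure 3.1.

    plan = ("AND", [
        ("OR",  [("LEAF", "military actively supports the coup"),
                 ("LEAF", "military tacitly accepts the coup")]),
        ("LEAF", "leader cannot appeal to the rank and file"),
    ])

    def satisfied(node, facts):
        kind, body = node
        if kind == "LEAF":
            return body in facts
        tests = [satisfied(child, facts) for child in body]
        return all(tests) if kind == "AND" else any(tests)

    # The 1961 generals' coup fails on the second conjunct: de Gaulle
    # could still appeal to the rank and file.
    facts = {"military tacitly accepts the coup"}
    print(satisfied(plan, facts))   # False

Case-Based Reasoning as a Refinement of the Information-Processing Model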

Case-based reasoning addresses several of the limitations inherent in rule-based systems. It expressly raises the issue of whether human beings reason on the basis of rules using rulelike deduction. In place of large numbers of discrete rules, people seem to rely on much more complex structures like those of metaphor and analogy (Vosniadou and Ortony 1989). Because rule-based systems are poor models of cognition, the builders of such systems face not only the problem that Davis and Clancy explore, that of making a program transparent to the user, but the related and perhaps more serious problem of converting knowledge from the forms in which people use it to the form that a rule-based system can apply. This extraction and conversion process, which Feigenbaum calls "knowledge engineering" (1977), is not only a bottleneck in the application of this technology but may well bound the universe of problems to which these systems can be applied. More sophisticated interfaces may not be enough. It may be necessary to radically rethink the fundamental design of these systems, i.e., to explore data structures that are not rulelike and to attempt to mimic informal types of reasoning. Case-based reasoning can be interpreted as one such effort. If case-based reasoning is a truer model of human reasoning, then it should be easier to transfer knowledge from human experts to systems designed on these principles. In addition, unlike rule-based systems, whose performance degrades precipitously when confronted with tasks that go beyond what has been explicitly encoded in the rule base, the analogical inference process used in case-based systems generally degrades more gracefully, in much the same fashion that human reasoning begins to fail but does not collapse in the face of novelty (Waltz 1988). After spelling out the motivations and characteristics of case-based reasoning, we will illustrate the notion by applying it not to the task of reconstructing the patterns in Luttwak's data, as in the previous section, but to the task of reconstructing Luttwak's analysis and interpretation of the data. Luttwak himself serves as an example of political intelligence at work as he constructs the concept of coup d'etat from historical cases. Instead of reproducing patterns in a body of data, we reproduce the process by which political agents identify patterns or paradigms and reapply them in new contexts.

Ironically, the fundamental insight in CBR systems conforms to the essentials of the information-processing model that Newell and Simon introduced in the first decade of artificial intelligence (Newell and Simon 1972). What CBR offers in addition is a more explicit account of the objects or structures in memory, their organization, their use in problem solving and, as we will investigate in a later section, their acquisition. In short, case-based reasoning can be approached as a refinement of the information-processing model,12 which holds that human beings solve problems by assembling information in short-term memory that has been retrieved from the virtually infinite store of long-term memory. Solving, for example, a chess problem involves recognizing which of these structures in memory are relevant to the current state of the game. Once recognized and retrieved, the principles of analysis including problem decomposition and heuristic search can be applied. The components of the information-processing model are pictured in Figure 3.2. The overlap with the basic conception of case-based reasoning is apparent when this figure is juxtaposed to the flow of control typical of a CBR system, as diagrammed in Figure 3.3, taken from Kolodner and Riesbeck (1989). The two models differ in the perspective adopted: Newell and Simon focus on the architecture of human problem solving, depicting the workspace of short-term memory and its relation via a channel to a memory store. Kolodner and Riesbeck, in their figure, focus on the functions of retrieving, adapting, and evaluating cases in memory. What CBR adds is a more focused concern with certain theoretical and computational issues, such as that of how structures in memory are indexed and how partial matches are handled (Hammond 1989b).

Case-Based Reasoning: How It Works, What Motivates Its Questions

"The case-based approach to planning is to treat planning tasks as memory problems. Instead of building up new plans from scratch, a casebased planner recalls and modifies past plans" (Hammond 1986: 287). The case-based approach assumes that a basic form of learning takes place from problem to problem or episode to episode. Rather than drawing inferences by chaining together rules in deductive fashion, case-based systems operate by matching the current problem against large structures in memory, which comprise the system's store of previously solved problems or previously encountered situations. What these structures are and how the memory is organized to facilitate retrieval depends essentially on the characteristics of

72

Dwain Mefford

Figure 3.2: The Neweii-Simon Information Processing Model (Newel! and Simon, 1972; Simon and Newel!, 1971) (Adapted from Davis and Olson, 1985, p. 242)

Internal Long-term memory

I

Processor Short-term memory Inpu t

>

Elementary processor

>

ou tput

Interpreter

t External memory

the domain (Hammond 1986, 1987; Ashley and Rissland 1987, 1988; Mefford 1988d, 1989; Rissland and Ashley 1986). What makes up the basic structures of memory? 13 The variation in ideas on this subject is illustrated in the shifts in Roger Schank's thinking on the question, from the master concept of "script" (Abelson 1973, 1978; Schank and Abelson 1977; Schank 1982) to the idea of congeries of more elemental scenelike units (Schank 1982: eh. 2). The question of what counts as a "case" is equally problematic for Schank's students, and the students of Schank's students, who are heavily represented among the proponents of case-based reasoning (Carbonell 1981, 1986; Farrell 1987, 1988a, 1988b; Hammond 1986, 1987; Kolodner 1983, 1984; Kolodner and Simpson 1984; Kolodner, Simpson, and Sycara 1985; Sycara 1985a, 1985b, 1988a, 1988b). It should come as no surprise that designers of systems intended to construct legal briefs would seize upon precedent-based reasoning (Carter 1987; Woodard 1987). Because the notion of "case" is so plastic, it is evident that what distinguishes case-based reasoning from other forms of inference is not its

73

Rule-Based, Case-Based, and Explanation-Based Models

Figure 3.3: Case-Based Reasoning: Basic Flow of Control (from Kolodner and Riesbeck 1989, p. 2). [The diagram traces a problem statement through case memory: indices retrieve candidate cases, a best match is selected and adapted using prior adaptations, the result is evaluated against prior outcomes, and the new case and its indices are stored back in case memory.]
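The flow of control in Figure 3.3 reduces to a short loop. The sketch below is our minimal illustration of that loop, not Kolodner and Riesbeck's code; the similarity, adaptation, and evaluation functions are deliberately naive placeholders.

    # Minimal case-based reasoning loop following Figure 3.3's flow of
    # control; retrieve, adapt, evaluate, and store are stubbed out.

    def similarity(a, b):
        return len(set(a) & set(b))      # crude feature overlap as an index

    def adapt(solution, problem):
        return solution                  # a real system modifies the old plan

    def evaluate(candidate, problem):
        return True                      # a real system checks prior outcomes

    def solve(problem, case_memory):
        best = max(case_memory, key=lambda c: similarity(problem, c["problem"]))
        candidate = adapt(best["solution"], problem)
        if evaluate(candidate, problem):
            case_memory.append({"problem": problem,
                                "solution": candidate})   # store the new case
        return candidate

    memory = [{"problem": {"strikes", "troop movements"},
               "solution": "expect intervention"}]
    print(solve({"strikes", "mass protest"}, memory))

A real system's leverage lies precisely in the parts stubbed out here: the indexing scheme that makes retrieval tractable and the adaptation knowledge that fits an old plan to a new situation.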
Figure 15.4: Answering questions about the 1956 Soviet intervention in Hungary. Question answering is a way to inspect the text model and to see how lexical classification has extended the represented knowledge. [The screen reproduces an interactive question-answering session (prompt: MORRISON Q) in which the analyst asks, among other things, "What directives are there?" (answered with DEMAND-1, DEMAND-4, DEMAND-3, and INSTRUCT-1), "Did the UN-General-Assembly demand that the USSR withdraw its troops from Hungary?" (answered "Yes, because DEMAND-4 is TRUE"), "Did the USSR withdraw its troops from Hungary?" (answered "Since WITHDRAW-2 is a bracketed belief, I cannot be sure"), and "Why did Imre-Nagy fall from political . . ."]


commands related to the lexical classifier. Since its implementation, the belief-system examiner has proved an invaluable tool for inspecting and verifying knowledge structures.

Discussion

After a text has been prepared for parsing, the "debugged" text parses and references at a rate of 0.24 seconds for the average sentence, which multiplies out to about 450 to 600 pages per hour.24 A number of support tools available in the RELATUS environment, such as the editor mode, simplify and speed the text preparation process. Interestingly, the discipline of converting text to the immediate reference model forces users to think closely about what they themselves must do to understand the sentences. This reflective process is a continuing source of insights into how language works, suggesting how computers might model it. A competent RELATUS user can process about ten pages of raw text from a new domain in a working day.25 As domain-relevant vocabulary and background knowledge are developed, the daily amount of new text processed should increase, converging to the time required to make the text meet the processing model.

Lexical Classification

Systems of Lexical Recognizers

Recognizer Organization. Lexical recognizers generally have associated words and selection constraints designed to find the various lexical realizations that instantiate the category. If a user defines a number of recognizers, it is convenient to organize them into independent collections, or lexical classification systems, and to organize the collections hierarchically (or heterarchically). When lexical recognizers are organized hierarchically, two types can be distinguished: base categories, which generally form the leaves of the hierarchy and recognize instances, and abstract categories, which sit above them and have no tokens or selection constraints with which to recognize instances. Instead, abstract categories have specialization, generalization, and equality relationships that situate them taxonomically with respect to base categories or other abstract categories.26 (A schematic sketch of this organization follows Figure 15.5 below.)

Instance Classification. Although some commands in the RELATUS text editor mode and the belief-system examiner simply find instances of categories and view their source sentences, as seen in Figure 15.5, it is generally more useful to label the category instances. For this reason, every lexical recognizer should have an associated concept reference specification. This makes possible commands in the belief-system examiner and editor to lexically classify instances. Lexical classification makes available the categorizations

[Figure 15.5, a screen excerpt, lists instances of lexical categories together with their source sentences, for example:

INSTRUCT-1: Imre-Nagy's government requested that the UN-Security-Council instruct the USSR to negotiate its differences with Imre-Nagy's government.

ACCEPT-1: The masses believed that the USSR had legitimated national communism because the USSR accepted Gomulka in Poland.

REQUEST-2: Imre-Nagy's government's request was moved to the UN-General-Assembly because the USSR deadlocked the UN-Security-Council.

PROMISE-1: . . .

INFORM-2: Janos-K . . .]
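As noted under Recognizer Organization above, the division of labor between base and abstract categories can be sketched schematically. The following sketch is our illustration, not RELATUS code; all class names, fields, and the sample clause are hypothetical.

    # Schematic of a lexical classification system: base categories carry
    # words and a selection constraint; abstract categories carry only
    # taxonomic links. Illustrative only.

    class Category:                       # abstract: no tokens, no constraints
        def __init__(self, name, parent=None):
            self.name, self.parent = name, parent

    class BaseCategory(Category):         # a leaf that recognizes instances
        def __init__(self, name, words, constraint, parent=None):
            super().__init__(name, parent)
            self.words = set(words)       # lexical realizations to match
            self.constraint = constraint  # further test on the candidate clause

        def recognize(self, clause):
            return bool(self.words & clause["verbs"]) and self.constraint(clause)

    directive = Category("directive")     # abstract category
    demand = BaseCategory("demand", {"demand", "insist"},
                          lambda c: c.get("addressee") is not None,
                          parent=directive)

    clause = {"verbs": {"demand"}, "addressee": "USSR"}
    print(demand.recognize(clause))       # True -> label the instance DEMAND-n

The point of the sketch is the asymmetry: only base categories carry the words and selection constraints that recognize instances, while abstract categories contribute nothing but their taxonomic position.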