Computational Logic [1st Edition] 9780080930671, 9780444516244

Handbook of the History of Logic brings to the development of logic the best in modern techniques of historical and interpretative scholarship.


English Pages 736 [718] Year 2014





Table of contents:

Copyright (page iv)
Editorial Note (page vii), by Jörg Siekmann and Dov Gabbay
Contributors (pages viii-xii)
Computational Logic (pages 15-30), by Jörg Siekmann
Logic and the development of the computer (pages 31-38), by Martin Davis
What is a logical system? An evolutionary view: 1964-2014 (pages 41-132), by Dov M. Gabbay
History of Interactive Theorem Proving (pages 135-214), by John Harrison, Josef Urban and Freek Wiedijk
Automation of Higher-Order Logic (pages 215-254), by Christoph Benzmüller and Dale Miller
Equational Logic and Rewriting (pages 255-282), by Claude Kirchner and Hélène Kirchner
Possibilistic Logic — An Overview (pages 283-342), by Didier Dubois and Henri Prade
Computerising Mathematical Text (pages 343-396), by Fairouz Kamareddine, Joe Wells, Christoph Zengler and Henk Barendregt
Concurrency Theory: A Historical Perspective on Coinduction and Process Calculi (pages 399-442), by Jos C.M. Baeten and Davide Sangiorgi
Degrees of Unsolvability (pages 443-494), by Klaus Ambos-Spies and Peter A. Fejer
Computational Complexity (pages 495-521), by Lance Fortnow and Steven Homer
Logic Programming (pages 523-569), by Robert Kowalski
Logic and Databases: A History of Deductive Databases (pages 571-627), by Jack Minker, Dietmar Seipel and Carlo Zaniolo
Logics for Intelligent Agents and Multi-Agent Systems (pages 629-658), by John-Jules Ch. Meyer
Description Logics (pages 659-678), by Matthias Knorr and Pascal Hitzler
Logics for the Semantic Web (pages 679-710), by Pascal Hitzler, Jens Lehmann and Axel Polleres
Index (pages 711-734)


North Holland is an imprint of Elsevier
Radarweg 29, PO Box 211, 1000 AE Amsterdam, The Netherlands
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, UK
225 Wyman Street, Waltham, MA 02451, USA

First edition 2014
Copyright © 2014 Elsevier B.V. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

ISBN: 978-0-444-51624-4
ISSN: 1874-5857

For information on all North Holland publications visit our website.

EDITORIAL NOTE

Jörg Siekmann and Dov Gabbay

Because of space and time considerations, not all topics could be covered by chapters in this Handbook. They will appear in a separate publication soon.


AUTHORS

Klaus Ambos-Spies, Heidelberg University, Germany. [email protected]
Jos Baeten, CWI Amsterdam, The Netherlands. [email protected]
Henk Barendregt, Radboud University Nijmegen, The Netherlands. [email protected]
Christoph Benzmüller, Freie Universität Berlin, Germany. [email protected]
Martin Davis, New York University, USA. [email protected]
Didier Dubois, IRIT, France. [email protected]
Peter A. Fejer, University of Massachusetts Boston, USA. [email protected]
Lance Fortnow, Georgia Institute of Technology, USA. [email protected]
Dov Gabbay, Bar Ilan University, Israel; King's College London, UK; University of Luxembourg, Luxembourg; University of Manchester, UK. [email protected]
John R. Harrison, Intel Corporation, USA. [email protected]


Pascal Hitzler, Wright State University, USA. [email protected]
Steven Homer, Boston University, USA. [email protected]
Fairouz Kamareddine, Heriot-Watt University, UK. [email protected]
Matthias Knorr, Universidade Nova de Lisboa, Portugal. [email protected]
Robert Kowalski, Imperial College London, UK. [email protected]
Claude Kirchner, Inria, France. [email protected]
Hélène Kirchner, Inria, France. [email protected]
Jens Lehmann, University of Leipzig, Germany. [email protected]
John-Jules Meyer, Universiteit Utrecht, The Netherlands. [email protected]
Dale Miller, Inria, France. [email protected]
Jack Minker, University of Maryland, USA. [email protected]
Henri Prade, IRIT, France. [email protected]
Davide Sangiorgi, University of Bologna, Italy. [email protected]




Axel Polleres, Vienna University of Economics and Business, Austria. [email protected]
Dietmar Seipel, University of Würzburg, Germany. [email protected]
Jörg Siekmann, Saarland University, DFKI, Germany. [email protected]
Josef Urban, Radboud University Nijmegen, The Netherlands. [email protected]
Joe Wells, Heriot-Watt University, UK. [email protected]
Freek Wiedijk, Radboud University Nijmegen, The Netherlands. [email protected]
Carlo Zaniolo, University of California, Los Angeles, USA. [email protected]
Christoph Zengler, University of Tübingen, Germany. [email protected]


READERS

Luca Aceto, Reykjavik University, Iceland. [email protected]
Peter B. Andrews, Carnegie Mellon University, USA. [email protected]
Serge Autexier, DFKI, Germany. [email protected]
Franz Baader, Dresden University, Germany. [email protected]
Johan van Benthem, Universiteit Amsterdam, The Netherlands. [email protected]
Jasmin Blanchette, Technical University of Munich, Germany. [email protected]
Maarten H. van Emden, University of Victoria, Canada. [email protected]
William Farmer, McMaster University, Canada. [email protected]
Peter Fejer, University of Massachusetts Boston, USA. [email protected]
Herman Geuvers, Radboud University Nijmegen, The Netherlands. [email protected]
Robert van Glabbeek, NICTA, Australia. [email protected]
Lluis Godo Lacasa, Universitat Autònoma de Barcelona, Spain. [email protected]




Georg Gottlob, University of Oxford, UK. [email protected]
Patrick Hayes, Florida Institute for Human and Machine Cognition, USA. [email protected]
Ian Horrocks, Oxford University, UK. [email protected]
Deepak Kapur, University of New Mexico, USA. [email protected]
Kurt Mehlhorn, Max-Planck-Institut für Informatik, Germany. [email protected]
Lawrence C. Paulson, Cambridge University, UK. [email protected]
Luís Moniz Pereira, University of Lisbon, Portugal. [email protected]
Richard Shore, Cornell University, USA. [email protected]
Jörg Siekmann, DFKI, Germany. [email protected]
Bruno Woltzenlogel Paleo, Vienna University of Technology, Austria. [email protected]
Michael Wooldridge, Liverpool University, UK. [email protected]

COMPUTATIONAL LOGIC

Jörg Siekmann

Computational logic was born in the twentieth century and evolved in close symbiosis with the first electronic computers and the growing importance of computer science, informatics and artificial intelligence (AI). The field has now far outgrown its humble beginnings and early expectations: with more than ten thousand people working in research and development of logic and logic-related methods, with several dozen international conferences and several times as many workshops addressing the growing richness and diversity of the field, and with the foundational role and importance these methods now assume in mathematics, computer science, artificial intelligence, cognitive science, linguistics, law, mechatronics and many other engineering fields, where logic-related techniques are used inter alia to state and settle correctness issues, the field has diversified in ways that the pure logicians working in the early decades of the twentieth century could hardly have anticipated, let alone the researchers of the previous centuries presented in this eleven-volume account of the history of logic. Dating back to its roots in Greek, Indian, Chinese and Arabic philosophy, the field has grown in richness and diversity over the centuries to finally reach the modern methodological approach first expressed in the work of Gottlob Frege.1 Logical calculi, which capture not only formal reasoning but also an important aspect of human thought, are now amenable to investigation with mathematical rigour and computational support, giving new life to Leibniz's early dream of mechanized reasoning: "Calculemus".
The beginning of the last century saw the influence of these developments in the foundations of mathematics in the works of David Hilbert and Paul Bernays, Bertrand Russell and Alfred North Whitehead2 and others, in the foundations of the syntax and semantics of language, and in analytic philosophy, most vividly expressed in the previous century by the logicians and philosophers of the Vienna Circle. The Dartmouth Conference in 1956, generally considered the birthplace of artificial intelligence, explicitly raised hopes for the new possibilities that the advent of electronic computing machinery offered: logical statements could now be executed on a machine, with all the far-reaching consequences that ultimately led to logic programming,3 question answering systems, deduction systems for mathematics and engineering,4 logical design and verification of computer software and hardware, deductive databases5 and software synthesis, as well as logical techniques for analysis and verification in the fields of mechanical engineering.

In this way the growing richness of foundational and purely logical investigations that had led to such developments as:

• first-order calculi
• type theory and higher-order logic
• nonclassical logics
• semantics
• constructivism

and others was extended by new questions and problems, in particular from computer science and artificial intelligence, leading to:

• denotational semantics for programming languages
• first- and higher-order logical calculi for automated reasoning
• non-monotonic reasoning
• logical foundations for computing machinery, such as CSP, the π-calculus and others for program verification
• knowledge representation formalisms such as description logics
• logics for the semantic web
• logical foundations for cognitive robotics
• syntax and semantics for natural language processing
• logical foundations of databases
• linear logics, probabilistic reasoning and uncertainty management
• logical foundations and their relationship to the philosophy of mind

and many others.

1 See volume 3, "The Rise of Modern Logic: From Leibniz to Frege", in this eleven-volume handbook on the history of logic.
2 See volume 4, "British Logic in the 19th Century", and volume 5, "Logic from Russell to Church", of this handbook on the history of logic.
3 See the chapter by Robert Kowalski.
4 See the chapter by John Harrison, Freek Wiedijk and Josef Urban; the chapter by Christoph Benzmüller and Dale Miller; the chapter by Gilles Dowek and Herman Geuvers; the chapter by Claude and Hélène Kirchner; and the chapter by Fairouz Kamareddine, J. B. Wells, C. Zengler and Henk Barendregt.
5 See the chapter by Jack Minker, Dietmar Seipel and Carlo Zaniolo.

Handbook of the History of Logic. Volume 9: Computational Logic. Volume editor: Jörg Siekmann. Series editors: Dov M. Gabbay and John Woods. Copyright © 2014 Elsevier BV. All rights reserved.



In many respects, logic provides computer science with both a unifying foundational framework and a tool for modeling.6 In fact, logic has been called the calculus of computer science, most prominently represented at the LICS conference (Logic in Computer Science), playing a crucial role in diverse areas such as computational complexity,7 unsolvability,8 distributed computing, concurrency,9 multi-agent systems,10 database systems, hardware design, programming languages, knowledge representation,11 the semantic web12 and software engineering. As John McCarthy succinctly put it: "Newtonian physics is to mechanical engineering as logic is to computer science". But the demands from artificial intelligence, computational linguistics and philosophy have spawned even more new ideas and developments, as we shall argue below.13

This growing diversity is reflected in the numerous conferences and workshops that address particular aspects of the fields mentioned above. For example, only forty years ago there was just one international conference on automated deduction (later to be called CADE). Today there is not only the annual CADE but also the biannual IJCAR, the International Joint Conference on Automated Reasoning, which every two years unites CADE, FroCoS (the International Symposium on Frontiers of Combining Systems), FTP (the International Workshop on First-Order Theorem Proving), the TABLEAUX conference (Automated Reasoning with Analytic Tableaux and Related Methods) and sometimes other smaller conferences as well. There are also the RTA conference (Rewriting Techniques and Applications), LPAR (Logic Programming and Automated Reasoning), TPHOLs (Theorem Proving in Higher Order Logics) and UNIF (the Unification Workshop).
Several conferences on mathematical theorem proving have recently united into CICM (Conference on Intelligent Computer Mathematics), among others CALCULEMUS, MKM (Mathematical Knowledge Management), DML (Digital Mathematical Libraries) and the OpenMath workshops. CICM usually has half a dozen co-located workshops, such as MathUI (Mathematical User Interfaces) and ThEdu (Theorem Prover Components for Educational Software). Each of these conferences is held regularly on its own or back to back with a related conference, but with its own set of proceedings and supported by a mature scientific community. Frequently, these conferences spawn dozens of national and international workshops that drive the development and represent the creative innovation, but may not yet be ready for archival presentation. Some of those that united the young rebels of the field are CALCULEMUS, MKM (Mathematical Knowledge Management), CIAO (the workshop that started with induction but soon became an event of its own), the Proof Presentation workshop, UITP (User Interfaces for ATP) and JELIA (the European Conference on Logics in Artificial Intelligence).

Furthermore there are all the conferences focussing on logic for verification and specification of hardware and software, like CAV (the International Conference on Computer Aided Verification), FM (Formal Methods), WING (the International Workshop on Invariant Generation), RR (the Web Reasoning and Rule Systems conference series) and dozens more. Finally there are the related conferences and workshops in AI, like the Nonmonotonic Reasoning conference and workshops, the Knowledge Representation conferences, the Frame Problem meetings, the workshops on nonclassical reasoning and many more, including the numerous national workshops in Europe and the US, and nowadays increasingly in China and the Pacific Rim as well. Furthermore there are dozens of highly specialized workshops at the large AI conferences such as IJCAI, ECAI, PRICAI and AAAI.

In the summer of 2014, Vienna hosted VSL, the Vienna Summer of Logic, with FLoC (the Federated Logic Conference) plus some further related conferences, which was the largest event in the history of logic up to that time. It consisted of twelve large federated conferences and over seventy workshops, attracting more than 2000 researchers from all over the world. It was organized into three main thematic sections: logic in computer science, mathematical logic and logic in artificial intelligence. While not every conference on mathematical and computational logic was present at this event (in fact there may be twice as many), it nevertheless gives a fair account of the current breadth and scope of our subject in 2014. As a historical snapshot it is worth listing in detail.

6 See the chapter by Martin Davis.
7 See the chapter by Lance Fortnow and Steven Homer.
8 See the chapter by Klaus Ambos-Spies and Peter Fejer.
9 See the chapter by Jos Baeten and Davide Sangiorgi.
10 See the chapter by John-Jules Meyer.
11 See the chapter by Matthias Knorr and Pascal Hitzler.
12 See the chapter by Pascal Hitzler, Jens Lehmann and Axel Polleres.
13 See the chapter by Dov Gabbay and the chapter by Didier Dubois and Henri Prade.
The first section, "Logic in Computer Science", united the following independent conferences: the 26th International Conference on Computer Aided Verification (CAV), the 27th IEEE Computer Security Foundations Symposium (CSF), the 30th International Conference on Logic Programming (ICLP), the 7th International Joint Conference on Automated Reasoning (IJCAR), which itself combines the Conference on Automated Deduction (CADE), Theorem Proving in Higher-Order Logics (TPHOLs) and TABLEAUX; furthermore the 5th Conference on Interactive Theorem Proving (ITP), the Annual Conference on Computer Science Logic (CSL) and the 29th ACM/IEEE Symposium on Logic in Computer Science (LICS), the 25th International Conference on Rewriting Techniques and Applications (RTA) joint with the 12th International Conference on Typed Lambda Calculi and Applications (TLCA), the 17th International Conference on Theory and Applications of Satisfiability Testing (SAT) and various system competitions.

The second section, "Mathematical Logic", comprised the Logic Colloquium (LC), the Logic, Algebra and Truth Degrees conference (LATD), the Compositional Meaning in Logic workshop (GeTFun 2.0), the Infinity Workshop (INFINITY), the Workshop on Logic and Games (LG) and the Kurt Gödel Fellowship Competition.

The third section, "Logic in Artificial Intelligence", consisted of the 14th International Conference on Principles of Knowledge Representation and Reasoning (KR), the 27th International Workshop on Description Logics (DL), the 15th International Workshop on Non-Monotonic Reasoning (NMR) and the International Workshop on Knowledge Representation for Health Care 2014 (KR4HC).

A similar growth of logic-related meetings can be seen in other academic fields as well, for example in logics for computational linguistics, in logic and law, in sociology, as well as in subareas of artificial intelligence like multi-agent systems, deductive databases, logic and the semantic web, logics for knowledge representation, argumentation, artificial intelligence in education (AIEd) and many more; a subset of these is presented in this volume. Other interesting areas for the application of computational logic are bioinformatics and biochemistry, but expansion and diversity can also be found in philosophical logic and in the philosophy of science, with its world congress, the Congress of Logic, Methodology and the Philosophy of Science, accompanied by national and international conferences. There are also many new conferences, workshops and other events that reflect the growing industrial importance of these techniques.

The richness and diversity of this knowledge, with several million publications in these subareas of computational logic and related fields,14 which we have learned to access via specialized repositories such as DBLP (the computer science bibliography) and with smart parameter settings of search engines such as Google Scholar and other more specialized computer support systems, is generally not well structured and at first overwhelming. For example, if we ask Google Scholar for Computational Logic, we get 1,630,000 hits in 0.13 sec, so we had better narrow the query down to Overview of Computational Logic, which gives us 699,000 entries in 0.13 sec. Maybe a better wording will help, so we try Survey on Computational Logic and obtain 524,000 entries in 0.21 sec.
Since we still do not like to scan half a million entries searching for gold in garbage, we may decide instead to search the subareas of computational logic: for example Survey on Logic Programming, which gives us 524,000 hits in 0.18 sec. Another area? Survey on Automated Reasoning yields 142,000 hits in 0.08 sec, and hence any student (and scholar) will probably give up at this point and look for other entry points, either by a more informed search, i.e. better keywords and smarter search parameters, or for example by looking for handbooks covering an important related subfield.

However, here again we are confronted with a plenitude that is as difficult to comprehend as it is to assess: there has always been a long scholarly tradition of handbooks of logic. Even the seminal Handbook of Logic by J. D. Morell from 1857 is now available again in digitized form [Morell, 1857, digitized 2006]. Since there is unfortunately no standard website or repository to collect and uniformly represent all these handbooks and survey articles, it may be worthwhile here to name at least the better-known standard references. There is the standard five-volume Handbook of Logic in Artificial Intelligence and Logic Programming by D. Gabbay, C. J. Hogger and J. A. Robinson [Gabbay et al., 1993-1998], and there is an excellent reference in the open-access Stanford Encyclopedia of Philosophy under the key Logic and Artificial Intelligence by Richmond Thomason [Thomason, 2013] (revised 2013), which gives in 12 chapters a comprehensive overview of logic and AI. This is probably the best entry point into the literature for the interested scholar today. Jack Minker's Logic-Based Artificial Intelligence [Minker, 2000] and the Handbook of the Logic of Argument and Inference: The Turn Towards the Practical by Dov Gabbay [Gabbay, 2002], as well as the "red" books in the series Practical Logic and Reasoning by Dov Gabbay, Jörg Siekmann, Johan van Benthem and John Woods, are good entry points for AI and reasoning. A standard reference for description logics and logics for the semantic web, as presented in chapters 19 and 20, is Franz Baader's Description Logic Handbook: Theory, Implementation and Applications [Baader, 2003]. The Handbook of Defeasible Reasoning and Uncertainty Management Systems [Gabbay and Smets, 1998] by Dov Gabbay and P. Smets and the Handbook of Paraconsistency [Beziau et al., 2007] by J. Y. Beziau, W. A. Carnielli and D. Gabbay are also standard references for AI-inspired logics. The importance of stable model semantics for logic-based AI is covered inter alia in the first chapter of [Minker, 2000]. Knowledge representation in AI is another source for the development of specialized logics; standard references are Knowledge Representation and Reasoning [Brachman and Levesque, 2004] by R. J. Brachman and H. Levesque and the Handbook of Knowledge Representation [van Harmelen et al., 2008] by Frank van Harmelen, Vladimir Lifschitz and B. Porter.

14 In the small subarea of unification theory within automated reasoning, there is a problem called string unification or word equations. Google Scholar finds 1,330,000 entries in 0.11 sec for word equations this year (not all of which are relevant for this topic, of course); the query needs to be narrowed down by quoting "word equations", and searching for "word equations in maths", "word equations in comp. science" etc. still shows a few thousand hits. In 2008, at the unification workshop where we published a preliminary result, we asked Dr. Google and the system found 70,300 entries for "word equations" in 0.13 sec, so what are we to make of these facts?
The subject of automated reasoning, one of the oldest subareas of AI, which is covered in part one of this volume, has attracted many handbooks and surveys; among them are the two-volume Handbook of Practical Logic and Automated Reasoning [Harrison, 2009] by John Harrison, the Handbook of Automated Reasoning by Alan Robinson and Andrei Voronkov [Robinson and Voronkov, 2001] and the Handbook of Tableau Methods [D'Agostino and Gabbay, 1999] by M. D'Agostino and Dov Gabbay. Temporal reasoning is another active area within AI and logic; see Temporal Logic: Mathematical Foundations and Computational Aspects [Gabbay et al., 1994] by Dov Gabbay, I. Hodkinson, M. Reynolds and M. Finger. Logic, language and AI is another prolific field; see the Handbook of Logic and Language [van Benthem and Meulen, 1996] by Johan van Benthem and A. Ter Meulen. Standard references for logic in computer science are the four-volume Handbook of Logic in Computer Science by S. Abramsky, D. Gabbay and T. Maibaum [Abramsky et al., 1992-1995] (with Henk Barendregt for the fourth volume) and the Handbook of Mathematical Logic by Jon Barwise [Barwise, 1982].

Other interesting sources to find our way into the field are the collections of the most influential papers which shaped a subfield, like the well-known "readings" of Morgan Kaufmann: for example Ronald Brachman and Hector Levesque's Readings in Knowledge Representation [Brachman and Levesque, 1985] or Goldman's Readings in Philosophy and Cognitive Science [Goldman, 1993]. Other readers are M. Ginsberg's Readings in Nonmonotonic Reasoning [Ginsberg, 1980], Logic and Philosophy for Linguists: A Book of Readings by J. M. E. Moravcsik, and Austin Tate, A. Hendler and James Allen's Readings in Planning [Tate et al., 1994]. A collection of the early papers in automated theorem proving is Automation of Reasoning: Classical Papers on Computational Logic 1957-1966 by Jörg Siekmann and Graham Wrightson [Siekmann and Wrightson, 1983].

Of course logic in general and computational logic in particular are not the only research areas that witness such unprecedented growth; many subareas of mathematics or physics would show similar quantitative growth. But why do we witness such an exponential growth of our written corpus of knowledge in this field, which is still viewed even by many academics as rather moot and part of philosophy? There is of course the general growth and economic importance of science and technology in our modern society and the transformation from agriculture-based subsistence to a knowledge-based economy, which leads to the paradox that there are currently more scientists and technicians living on this planet than the sum of all their predecessors, while the corpus of written knowledge grows exponentially; in fact it almost doubles every 3 years.15 But apart from this general transition, we could still ask: why is it that a subject like logic, widely considered part of philosophy in previous centuries, is now also part of this transition? It appears that there are two main reasons: (i) an inherent academic and scientific reason that drives the development, but also (ii) the fact that the number of industrial applications and their commercial use increases exponentially as well.
So let us look at the first issue, the academic diversity, and quote from our roadmap for computational logic from the International Federation for Computational Logic (IFCoLog), where we addressed this issue. To understand the nature of and the connections between the different areas of computational logic, we need to adopt a focus and a point of view. This logical point of view has been dominated during the previous two and a half centuries by the role logic plays in the foundation of mathematics in particular and logicism in general. In contrast, today's logic is also trying to understand and model human beings and their computational avatars in their daily activities, and hence there is in addition strong pressure and urgency from artificial intelligence, robotics, natural language analysis, psychology, philosophy, law, social decision theory, computer science, formal methods and engineering to understand and model human and machine behaviour in these areas, with a view to developing and marketing devices which help or replace humans in their activities. This is a difficult but lucrative and rewarding task, and these needs have accelerated the evolution of logic in the last century far more than the traditional roles it used to play. Let us adopt this human activity point of view and try to see how the various areas of computational logic relate to one another when viewed from this angle.

15 see



Imagine early man living in prehistoric times, trying to survive and provide for his family in a hostile and dangerous world. Our human has scarce resources and very little time to make decisions crucial to his survival. He comes up naturally with two immediate logical principles:

Categorical classification. He classifies his environment into categories: animate, edible, good, bad, dangerous and others. In logical terms this means that he naturally defines predicates A(x), B(x), ... and has class relationships between them. So A1(x) ∧ A2(x) → B(x) means that, by definition, every A1 ∩ A2 is a B. This kind of deductive logic was formalised by Aristotle, later refined into first- and higher-order logical formalisms.

Hasty generalisation. He also has to learn very quickly about the dangers of his environment. For example, if something from the edible category, say a greenish flat leaf, turns out several times to be the cause of stomach upset, then our prehistoric man has to generalise and decide that such leaves are bad for his health. So he introduces the approximate quick rule by induction: Green(x) ∧ Flat(x) ∧ Leaf(x) ⇒ Bad(x).

This rule is not absolute like the classification rules, as it is defeasible. He may later find out that, when cooked, the leaves are OK. Thus another rule is introduced and learned, namely: Green(x) ∧ Flat(x) ∧ Leaf(x) ∧ Cooked(x) ⇒ ¬Bad(x).

Here we use ⇒ to indicate that the rules are defeasible. This simple-minded, resource-bounded approach manifests itself in modern diagnostic terms in what is known as defeasible logics and resource-bounded argumentation. These logics now have absolute rules of the form A1(x) ∧ A2(x) → B(x) and defeasible rules, which are not absolute, of the form A1(x) ∧ A2(x) ⇒ C(x). If we have a query whether B(a) or ¬B(a) holds for some element a, we may have more than one rule which may give us an answer. For example, we may have A1(x) ∧ A2(x) ⇒ B(x) and E(x) ∧ A1(x) ∧ A2(x) ⇒ ¬B(x). Now the second rule has priority, because it is more specific, relying on more experience and more information. The above example points to a connection between three different areas of computational logic:
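The priority-by-specificity mechanism just described is easy to make concrete. The following is a minimal, illustrative Python sketch (the names Rule and conclude are ours, not taken from any defeasible-logic library): defeasible rules are represented by their premise sets, and when several rules fire, the one with the most premises, i.e. the most specific one, wins, exactly as in the cooked-leaf example.

```python
# Minimal sketch of defeasible reasoning with specificity priority.
# All names (Rule, conclude, ...) are illustrative, not a standard API.

class Rule:
    """A defeasible rule: if all premises hold, tentatively draw `conclusion`."""
    def __init__(self, premises, conclusion):
        self.premises = premises      # set of required properties
        self.conclusion = conclusion  # e.g. "Bad" or "not Bad"

def conclude(facts, rules):
    """Apply the most specific applicable defeasible rule.
    `facts` is the set of properties of one object, e.g. {"green", "flat", "leaf"}.
    More premises means more specific, which means higher priority."""
    applicable = [r for r in rules if r.premises <= facts]
    if not applicable:
        return None  # no rule fires: no opinion
    # The rule with the largest premise set is the most specific one.
    best = max(applicable, key=lambda r: len(r.premises))
    return best.conclusion

rules = [
    Rule({"green", "flat", "leaf"}, "Bad"),                # Green ∧ Flat ∧ Leaf ⇒ Bad
    Rule({"green", "flat", "leaf", "cooked"}, "not Bad"),  # ... ∧ Cooked ⇒ ¬Bad
]

print(conclude({"green", "flat", "leaf"}, rules))            # Bad
print(conclude({"green", "flat", "leaf", "cooked"}, rules))  # not Bad
```

A real defeasible logic would in addition distinguish strict rules (which can never be overridden) from defeasible ones and handle chains of rules; the sketch only shows the core conflict-resolution idea.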



1. General logical theory, required to provide discipline and methodology for proper reasoning and for handling such logics as they arise in applications: this was the subject of logic in the previous millennia.

2. Artificial intelligence methodology, which investigates such observations as hasty generalisations, a variety of diagnostic principles and other human common-sense reasoning rules, leading to persistence, abduction, negation as failure, non-monotonic reasoning and other logical formalisms.

3. Automated deduction and constraint programming, designed to give quick answers from the database to our human agent in real time. In practice, the automated deduction and constraint programming areas have developed into large central subareas of computational logic with many diverse applications.

Specification and Verification
If we move from prehistoric times forward to the 21st century, we find that the situation confronting man has not changed in its basic nature; it has only become more complex. Instead of trying to survive in the forest, we are now wandering in our modern technological and highly bureaucratic society, trying to survive the intricacies of our system. Our resources are still limited and our time is still scarce. To model these environments we need more complex formal languages, more sophisticated logics and faster real-time automated reasoning systems. What had been rather easy for the simple prehistoric model has now evolved into a new discipline for modern times. As an example take the area of formal methods and the specific case of controlling and running the entire train system of the European Community. This is a complex structure, with many aspects and many constraints and demands, so that we can no longer just write rules of the form A1(x) → B1(x) and A2(x) ∧ B2(x) ⇒ C(x). One reason is that such a language is not expressive enough for these needs. A second reason is that we cannot allow for errors in this application domain. So the very defeasibility of ‘⇒’-rules can no longer be tolerated, and instead we need to develop a new systematic area of formal methods for:
4. Specification, i.e. writing down exactly what we want.
5. Synthesis, i.e. developing executable code from specifications.
6. Verification, i.e. methods for proving that programs do what they are supposed to do without error; and finally
7. Logic-based safety and security engineering technologies.
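The distinction between specification (item 4) and verification (item 6) can be made concrete with a toy sketch of my own devising: a specification written as a logical predicate, and a bounded, exhaustive check that an implementation satisfies it. This is only an illustration; industrial-strength verification constructs proofs rather than enumerating inputs:

```python
# Toy example: "specification" as a predicate, "verification" as a bounded,
# exhaustive check over a small input space (illustrative only).
from itertools import product

def insertion_sort(xs):
    """The implementation to be verified."""
    out = []
    for x in xs:
        i = 0
        while i < len(out) and out[i] <= x:
            i += 1
        out.insert(i, x)
    return out

def meets_spec(inp, out):
    """Specification: the output is ascending and a permutation of the input."""
    ascending = all(a <= b for a, b in zip(out, out[1:]))
    permutation = sorted(inp) == sorted(out)
    return ascending and permutation

# Bounded "verification": every input over {0,1,2} up to length 4.
ok = all(meets_spec(list(xs), insertion_sort(list(xs)))
         for n in range(5) for xs in product(range(3), repeat=n))
print(ok)  # -> True
```

Note that the specification says only *what* the result must be, never *how* it is computed; synthesis (item 5) would go the other way, deriving the sorting code from the predicate.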


Jörg Siekmann

The large number of companies interested in these areas has turned this into a big and central area, and computational logic is at its core. There is another corollary of points 4, 5, 6 and 7 as well: learning and teaching logic. A good case in point is the car manufacturing industry: more than one third of the future overall value of a motor car will be in its electronic equipment. A car is, like an aeroplane, a military tank or a ship, only at first sight a moving physical object: its essence is the dozens of processors and their complex communication devices that share information among themselves as well as with the outside world, which may be a human, another vehicle or just a street sign. From the internal control of the combustion engine, the brakes or the steering wheel (drive-by-wire), to the more elaborate speech-controlled devices that activate your iPhone, the park-in algorithm or your satellite-based navigation system, up to the driverless car of the future: computer systems and their reliability are central and paramount. For many years I worked as a consultant and head of joint projects with Daimler: they knew what was coming well before the turn of the last century. But how do you reorganise and re-educate a large company, with its justified pride in its traditional engineering skills, which still builds the best cars in the world? How do you convert your self-confident employees to the more humble role of learning new skills, including formal methods?16 A task that even now is far from being accomplished. Daimler was not the only manufacturer with serious problems with the reliability of its car electronics and costly recalls; jointly with other car manufacturers they developed complex testing and verification processes. So a very important task today is to teach formal methods and a basic understanding of logical formalisms. Hence there is another point:
8. Teaching logic, i.e. passing the knowledge on from one generation to the next with pedagogical insight and appropriate computer support, such as intelligent tutoring systems (ITS) and other (worldwide) computer-supported media for logic.

Master and Ph.D. in Computational Logic
The International Center for Computational Logic (ICCL) at the University of Dresden (Germany) is an interdisciplinary center of competence in research and teaching in the field of Computational Logic, with special emphasis on Algebra, Logic, and Formal Methods in Computer Science. It offers a distributed two-year European Master’s Programme in Computational Logic (EMCL)17 run jointly by four European universities. They have also joined other European universities for a distributed European PhD Programme in Computational Logic (EPCL)18 in cooperation with the Free University of Bozen-Bolzano (Italy), the Technische Universität Wien (Austria) and the Universidade Nova de Lisboa (Portugal). These training and education opportunities, which include financial support and student scholarships, are supported by the IPID programme of the DAAD (Deutscher Akademischer Austauschdienst), the German Federal Ministry of Education and Research (BMBF) and the DFG (the German Research Foundation).
16 My favourite quote from the heart of a frustrated industrial advisor is from Upton Sinclair: “It is difficult to get a man to understand something, when his salary depends on his not understanding it” (Wikiquote). 17 see 18 see

Time and Interaction
But coming back to our ancient man: he is not alone, and his survival and his capabilities derive from the fact that he lives in a small community with a well-defined role for each of its members. The overall behaviour of the tribe is an emergent functionality of the individual behaviours, and the “egoistic gene” must learn altruistic values as well. The analysis and modelling of multi-agent systems in artificial intelligence (and some artificial life systems) address these interaction problems, and there are specially tailored logics to capture these phenomena. There is also another aspect of modern times that requires modelling: the supremacy of time, action and change. As societies gained in complexity, in resources and especially in long-term planning, the time and change aspect has become more and more dominant. Whereas prehistoric man did not think beyond his daily problems, modern man developed plans, actions, laws and commitments. For example, the huge task of planning and moving the complete equipment for one hundred thousand soldiers and their support troops, from the tent pole to the maintenance tools for a tank, into a foreign land by air is carried out these days by computer planning and other techniques from artificial intelligence that have to be reliable (for the first time, for example, in the Iraq war). So a large area of logic is now devoted to temporal processes and interactions. This manifests itself on two fronts:
9. Models of interactive agents in artificial intelligence,
10. Analysis of interactive, parallel and evolving processes in software engineering and theoretical computer science.
These two subjects have evolved independently of each other, studied by separate communities, in spite of the fact that they are highly related. There is also the “logic of action” research community; see [Kracht et al., 2009] for a survey.
Another aspect related to time is revision, uncertainty, change and the resolution of conflicts: with worldwide access to knowledge and people, we receive conflicting information from different sources and with different degrees of reliability, so we have to reconcile and deal with this kind of information. Special logics have been constructed and technical machinery has been developed for this purpose.

Language and Law
Finally let us look at two additional major areas where computational logic is needed and frequently used: logic and language, and logic and law. Human activity and reasoning are reflected in language and in laws. Language is the medium and instrument of interaction of the active human agents, and the laws govern that interaction, largely in a way compatible with common sense. It is no surprise, therefore, that computational logic is a core discipline for analysing and modelling the structure of language and for modelling reasoning and fundamental concepts in law. The language community, in particular in computational linguistics, is well aware of the central role which logic plays in language, both as a tool for analysis and computation and as a resident component in the structure of language, shaping language use and substructure. The subcommunity of linguists interested in semantics and using logic is large, well organised and connected. This is unfortunately the case only to a lesser degree in law: there is a community of philosophical logicians and artificial intelligence researchers interested in law and well-structured argumentation, but the law community in general does not always realise how closely their core concepts are related to logic and its computational support systems. In fact, the theoretical logicians as a community do not realise in turn how much they can learn by observing the discussions and legal manoeuvring of the law community. Many new logical principles can be extracted from legal behaviour, much in the same way that new mathematics was extracted from attempts to model physical behaviour in the material universe. Logic and law is one of the most promising and at the same time least explored areas for logical analysis. The same holds for theological reasoning and the respective logical analysis within the main religious writings: Christian, Buddhist and Islamic scholarship, and the reasoning and argumentation of the Torah.

Industrial Applications
When I ask my colleague from the maths department, whose research is in abstract algebra, about applications of his work, he would answer enthusiastically: “Oh, there are so many, you can hardly count them: for example in vector analysis, in Galois theory and in Boolean algebra, in ...” In Amir Pnueli’s hallel (laudatio) for Moshe Vardi’s honorary doctorate here at Saarbrücken in the year 2002, he justifiably praised Moshe as one of the most outstanding logicians in our community, particularly focussing on Moshe’s many contributions to the applications of logic, such as the logical theory of databases, the computational logic of knowledge, the logic- and automata-theoretic approach to verification, as well as his finite model theory and its applications in model checking for hardware and software verification. The subsequent academic presentations emphasised these application aspects even more, and we all departed with a deep impression of the sizeable market opportunities. In my time as a director of the German Research Center for Artificial Intelligence (DFKI), where the logic-related projects were mainly in my research lab, I worked as an advisor for many European government institutions as well as a consultant to German and other multinational companies. So I also showed and discussed the presented material with one of our representatives on the industrial advisory board



of the DFKI (whose boss and predecessor in the nullary years might have earned about a million euros a year, whereas his successor today may claim at least as much per month)19. A mild, not necessarily unfriendly expression entered his face, such as he might choose to display to his aspiring son, who, as everybody expects, would one day grow up to follow in the footsteps of his father, and he started a fascinating but slightly patronising conversation about multinationals and international market sizes, time-to-market and market indicators, venture capital, investment funds and more. When I ask “Dr. Google Scholar” for industrial applications of fuzzy logic, for example, the system comes up with 350,000 hits in 0.37 sec, including excellent scholarly survey articles. The same query sent to Google delivers 741,000 hits in 0.22 sec, showing some of the scholarly articles as well, but also a myriad of concrete devices, companies and market shares. The sheer size of even this single application area is overwhelming! The original plan for the establishment of the DFKI was based on about 200 million euros of seed money over a time span of ten years from the German Federal Ministry BMBF. It included a milestone after five years, whereupon the institute was to be evaluated and, in case of a positive outcome, turned into an industrial application centre, i.e. a link between industry and academia. So we, the academic directors and founders, were blessed with a new managing director from one of the largest German companies, and we were all sent to management courses and training: a great journey into market shares, market entry barriers and time-to-market indices, as well as the order of magnitude of a market, i.e. the fact that there is more to the difference between a market of a million and a billion dollars than just ten to the power of three.
We, the academics, tend to ignore the enormous complexity of this economic world and, vice versa, we are more often than not seen as the small constant gardeners cultivating our “akademisches Vorgärtchen”, to quote from the opening speech of our SIEMENS representative. Looking for an application of our verification tool VSE at the DFKI a long time ago, we managed in our lab to verify a cardiac pacemaker “in principle” and proudly presented our results to one of our shareholders, a manufacturer of these devices. After a warm welcome and a well-received presentation we were asked to a confidential meeting with the top management board, where we learned that the company had moved the official residence of its subsidiary to Sweden, with its well-known less restrictive liability laws; within that legal framework it turned out that the cost of full-scale verification by far exceeds the potential liability sum multiplied by the current risk factor of their device, and so we were politely sent back to our institute. Hardly the kind of reasoning a young postdoc trying to establish his own application-driven research and development group is accustomed to. 19 Justified or not, there is currently an interesting debate about Thomas Piketty’s book “Capital in the Twenty-First Century” [Piketty, 2014]; see, for example, Paul Krugman, “Why We’re in a New Gilded Age”, The New York Review of Books, May 2014, vol. LXI, no. 8.



In précis: a single person will need more than one lifetime to fully appreciate both worlds, the academic as well as the market economy, and a competent account of industrial applications of logic would require several volumes in a set of handbooks of its own. Logical techniques in stand-alone systems, such as model checking, theorem proving by induction or fuzzy controllers, have found their well-defined markets that can be quantified and accounted for. However, logic as an enabling or at least supporting technology in otherwise independent markets, such as the manufacturing, car, train or aeroplane industries, mechatronics or chemistry, to name just some major application areas, is no longer within the order of magnitude of millions but of billions of euros.

The International Federation for Computational Logic: IFCoLog
The enormous diversity outlined above is not necessarily disadvantageous, as each of these evolved communities addresses its own important set of problems and issues, and it is clear that one group cannot address them all. However, fragmentation can carry a heavy price, intellectually as well as politically, in the wider arena of scientific activity where, unfortunately, logical investigations are often still perceived as limited in scope and value. So how can these hundreds of societies, sociologically evolved communities of workshop and conference affiliates, be reunited without losing their historical identities? Our solution is inspired by the manner in which the European AI societies are organised: there is one registered society, namely ECCAI (European Coordinating Committee for Artificial Intelligence), whose members are the European national AI societies. With the growing unification of Europe there are currently more than two dozen members, who represent all European AI researchers and whose representatives meet every two years at the time of ECAI, the European Conference on Artificial Intelligence. The idea for IFCoLog is the same as in AI and most other scientific fields these days. The International Federation for Computational Logic (IFCoLog)20 was created more than twenty years ago with the help of Dana Scott and legally registered in the Netherlands, as well as a charity in London, with the financial aid of the European Networks of Excellence for Computational Logic (CologNet and COMPULOG); its members are the current (and future) communities related to computational logic. Some of these are actually organised into legal societies; others are simply associated with a conference, but nevertheless form scientific communities of considerable size and importance.
Inasmuch as the Federation aims to counterbalance the growing division in the field and to represent it once again in its entirety, it is working on four major goals: information, representation, promotion and cooperation. More specifically, its activities are: to influence funding policy and to increase our international visibility; to set up concrete educational curricula and to encourage high-quality teaching materials; and to maintain an active information policy by creating an
20 see



infrastructure for web sites and links and to found and maintain formal scientific journals in the name of IFCoLog, such as the Oxford IGPL journal, the Journal of Applied Logic (JAL), the journal Computational Logic and the recently founded IFCoLog Journal of Applied Logic. There is also an informal journal, PhiNews. Finally, it supports and identifies with FLoC, the major federated conference, which is held every three years. We want to establish a permanent office for the federation that coordinates and maintains all of these activities, similar to, say, the Royal Society in Great Britain or the scientific academies in various countries. At the time of writing we have applied, with positive response, for membership in the International Council for Science (ICSU) in order to realise our final goal, namely to establish computational logic as an academic field of its own.

Special Interest Group of the Association for Computing Machinery: SIGLOG
At the time of writing, the Association for Computing Machinery (ACM) has approved a new Special Interest Group for computational logic, called SIGLOG21, after several years of negotiation with the ACM authorities by Moshe Vardi, Dana Scott and Prakash Panangaden, who is now its chairman. Representing almost every major area of computing, ACM’s Special Interest Groups offer the infrastructure to announce conferences, publications and scientific activities and to provide opportunities for sharing technical expertise and first-hand knowledge, so this is another important step towards the recognition of computational logic as an academic field.

ACKNOWLEDGEMENTS

I would like to thank the authors of this volume for their critical reading of this chapter; particular thanks to Alan Bundy, Moshe Vardi, Jack Minker, Pascal Hitzler and John-Jules Meyer for their corrections and helpful information.

BIBLIOGRAPHY

[Abramsky et al., 1992-1995] S. Abramsky, D. Gabbay, and T. Maibaum, editors. Handbook of Logic in Computer Science. Four volumes. Oxford University Press, 1992-1995.
[Baader, 2003] F. Baader, editor. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2003.
[Barwise, 1982] J. Barwise, editor. Handbook of Mathematical Logic. North Holland, Elsevier, 1982.
[Beziau et al., 2007] J. Y. Beziau, W. A. Carnielli, and D. Gabbay, editors. Handbook of Paraconsistency. King’s College Publications, 2007.
[Brachman and Levesque, 1985] R. J. Brachman and H. J. Levesque, editors. Readings in Knowledge Representation. Morgan Kaufmann Publishers Inc., 1985.
[Brachman and Levesque, 2004] R. J. Brachman and H. Levesque, editors. Knowledge Representation and Reasoning. Amsterdam: Elsevier, 2004.
21 see, and SIGLOG.inf



[D’Agostino and Gabbay, 1999] M. D’Agostino and D. Gabbay, editors. Handbook of Tableau Methods. Springer, 1999.
[Gabbay and Smets, 1998] D. Gabbay and P. Smets, editors. Handbook of Defeasible Reasoning and Uncertainty Management Systems, Volume 1: Quantified Representation of Uncertainty and Imprecision. Springer, 1998.
[Gabbay et al., 1993-1998] D. Gabbay, C. J. Hogger, and J. A. Robinson, editors. Handbook of Logic in Artificial Intelligence and Logic Programming. Five volumes. Oxford University Press, 1993-1998.
[Gabbay et al., 1994] D. Gabbay, I. Hodkinson, M. Reynolds, and M. Finger, editors. Temporal Logic: Mathematical Foundations and Computational Aspects. Vol. 1. Oxford: Clarendon Press, 1994.
[Gabbay, 2002] D. Gabbay, editor. Handbook of the Logic of Argument and Inference: The Turn towards the Practical. North Holland, Elsevier, 2002.
[Ginsberg, 1980] M. Ginsberg, editor. Readings in Nonmonotonic Reasoning. Morgan Kaufmann Publishers Inc., 1980.
[Goldman, 1993] A. I. Goldman, editor. Readings in Philosophy and Cognitive Science. MIT Press, 1993.
[Harrison, 2009] J. Harrison. Handbook of Practical Logic and Automated Reasoning. Cambridge University Press, 2009.
[Kracht et al., 2009] M. Kracht, J.-J. Ch. Meyer, and K. Segerberg. The Logic of Action. In The Stanford Encyclopedia of Philosophy, Edward N. Zalta, editor, 2009 edition.
[Minker, 2000] J. Minker, editor. Logic-Based Artificial Intelligence. Springer, 2000.
[Morell, 1857] J. D. Morell. Handbook of Logic. Oxford University, 1857 (digitized 2006).
[Piketty, 2014] T. Piketty. Capital in the Twenty-First Century. Belknap Press of Harvard University Press (translated from the French), 2014.
[Robinson and Voronkov, 2001] A. Robinson and A. Voronkov, editors. Handbook of Automated Reasoning. Elsevier, 2001.
[Siekmann and Wrightson, 1983] J. Siekmann and G. Wrightson, editors. Automation of Reasoning: Classical Papers on Computational Logic. Vols. 1 and 2. Springer, 1983.
[Tate et al., 1994] A. Tate, J. Hendler, and J. Allen, editors. Readings in Planning. Morgan Kaufmann Publishers Inc., 1994.
[Thomason, 2013] R. Thomason. Logic and Artificial Intelligence. 2013.
[van Benthem and Meulen, 1996] J. F. van Benthem and A. ter Meulen, editors. Handbook of Logic and Language. Elsevier, 1996.
[van Harmelen et al., 2008] F. van Harmelen, V. Lifschitz, and B. Porter, editors. Handbook of Knowledge Representation. Elsevier, 2008.

LOGIC AND THE DEVELOPMENT OF THE COMPUTER
Martin Davis
Reader: Jörg Siekmann

In the fall of 1945, as the ENIAC, a gigantic calculating engine containing thousands of vacuum tubes, neared completion at the Moore School of Electrical Engineering in Philadelphia, a committee of experts was meeting regularly to discuss the design of its successor, the proposed EDVAC. As the weeks progressed, their meetings became acrimonious; the experts found themselves dividing into two groups: the “engineers” and the “logicians”. J. Presper Eckert was the leader of the engineers. He was justly proud of his accomplishment with the ENIAC. It had been thought impossible that 15,000 hot vacuum tubes could all be made to function without failing long enough for anything useful to be accomplished, and Eckert, by using careful, conservative design principles, had succeeded brilliantly in doing exactly this. The leading “logician” was the eminent mathematician John von Neumann. Eckert was particularly furious over von Neumann’s circulating his draft EDVAC report under his own name. This report paid little attention to engineering details, but set forth the fundamental logical computer design known to this day as the von Neumann architecture [Von Neumann, 1945]. Although the ENIAC was an engineering tour de force, it was a logical mess. It was von Neumann’s expertise as a logician that enabled him to understand the fundamental fact that a computing machine is a logic machine. In its circuits it embodies the distilled insights of a remarkable collection of logicians, developed over centuries. Nowadays, when computer technology is advancing with such breathtaking rapidity, as we admire the truly remarkable accomplishments of the engineers, it is all too easy to overlook the logicians whose ideas made it all possible.1 We begin with the amazing G. W. Leibniz. No project was too ambitious for him.
He was ready to try to talk Louis XIV into invading Egypt and building a canal through the Isthmus of Suez, to work on convincing the leaders of Christianity that the doctrinal differences between the various sects could be overcome, and to offer his boss, the Duke of Hanover, to the English to be their king. Incidentally, he developed a philosophical system that purported to show how God’s perfection could be reconciled with the apparent imperfections of the world, and (with Newton) was 1 This and the preceding paragraph are copied verbatim from my book [Davis, 2012] in which this history is discussed in more detail.

Handbook of the History of Logic. Volume 9: Computational Logic. Volume editor: Jörg Siekmann Series editors: Dov M. Gabbay and John Woods Copyright © 2014 Elsevier BV. All rights reserved.



one of the inventors of the differential and integral calculus. But what was perhaps his most audacious conception of all was of a universal language of human thought. In this language the symbols would directly represent ideas and concepts, thereby making their relationships transparent. Just as algebraic notation facilitates calculations revealing numerical relationships, so Leibniz’s language would enable logical relationships to be discovered by straightforward calculations. In Leibniz’s vision, his language would be used by a group of scholars sitting around a table and addressing some profound question in philosophy or human affairs. “Let us calculate,” they would say; out would come their pencils, and the required answer would soon be forthcoming. Leibniz was well aware of the essentially mechanical nature of computation, and despite the rudimentary technology of his time, he looked forward to the mechanization of computation and grasped the potential of mechanical calculation for saving humanity from mindless drudgery.2 Engineers designing the circuits used in computers make use of a special kind of mathematics called Boolean algebra. This is an algebra of logic, fragments of which Leibniz had already developed. But it was the Englishman George Boole, quite unaware of what Leibniz had begun, who completed the endeavor a century and a half later. Upright to a fault, Boole was “the sort of man to trust your daughter with”. Denied a university education by the poverty of his family, his outstanding contributions were eventually rewarded by a professorship in Ireland. Boole was only 49 when he died after walking three miles in a cold rainstorm to lecture in his wet clothes. The pneumonia he developed was certainly not helped by his wife, who placed him between cold, soaking bed sheets.3 “There is nothing worse that can happen to a scientist than to have the foundation collapse just as the work is finished.
I have been placed in this position by a letter from Mr. Bertrand Russell . . . ” These despondent words were written by the German mathematician Gottlob Frege in 1902 in an appendix to a treatise that was to have crowned his life’s work. Frege never recovered from the blow administered by the young Englishman. He died in obscurity a little over two decades later leaving behind a diary filled with extreme right-wing ideas close to those that were soon to prove so devastating to Germany and the rest of Europe. Actually, counter to Frege’s perception, his work had by no means been destroyed by Bertrand Russell’s letter. Frege would have been astonished to learn that his seminal ideas are viewed as fundamental by computer scientists engaged in attempting to program computers to perform logical reasoning. Researchers in this field marked the 100th anniversary in 1979 of Frege’s publication of a pamphlet with the almost untranslatable title Begriffsschrift, with a special historical lecture at their annual conference. (Begriff means “concept”, and schrift means “script” or “mode of writing”.) Frege was thoroughly familiar with Leibniz’s thought, and indeed, he believed that in his pamphlet, he had brought into being the universal language of reasoning that Leibniz had sought. While the logical reasoning used by 2 For Leibniz’s life see [Aiton, 1985], for his mathematical work see [Edwards, 1979; Hofmann, 1974], for his philosophy see [Mates, 1986], and for his work on logic [Couturat, 1961]. 3 For Boole’s life see [MacHale, 1985], for his work in logic [Boole, 1847; Boole, 1958].



mathematicians in developing proofs goes far beyond what Boole’s algebra allows, Frege’s rules of inference proved to be completely adequate.4 Although Frege saw himself as carrying out part of Leibniz’s program, he showed no interest in following up on Leibniz’s “Let us calculate”. He was content to show how his rules could be applied to the most fundamental parts of mathematics, never trying to develop a computational procedure (algorithm, we would say) for determining whether some proposed inference is or is not correct. More than half a century was to elapse before the English mathematician Alan Turing and, independently, the American logician Alonzo Church, proved that no such computational procedure could exist. While Turing’s paper showed that one of Leibniz’s goals was unattainable, it also provided a revolutionary insight into the very nature of computation. In order to prove that there is no possible algorithm for accomplishing some task, it was necessary for Turing to furnish a clear and precise analysis of the very concept of computation. In doing so he developed a mathematical theory of computation in terms of which a machine for carrying out a particular computation and a list of instructions for carrying out the same computation were seen to be two sides of the same coin: either could be readily transformed into the other. Turing saw that all computational tasks that could in principle be carried out by a machine could also be accomplished by “software” running on one single all-purpose machine. This insight ultimately changed the way builders understood what it means to construct a calculating machine. It made it possible to think of a single machine that, suitably “programmed”, is capable of accomplishing the myriad tasks for which computers are used today. Farsighted people began to envision digital computing machines that would be capable of things that Leibniz could hardly have imagined.5 The path from Frege to Turing was no straight line. 
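Turing's insight that a machine and its list of instructions are two sides of the same coin can be hinted at with a tiny simulator of my own: the "machine" is just a table of instructions, i.e. data, and one general-purpose program interprets any such table. The encoding below is purely illustrative, not Turing's own formulation:

```python
# Illustrative sketch: a machine description as data, interpreted by one
# universal program -- the germ of the stored-program computer.

def run(table, tape, state="q0", head=0, steps=1000):
    """table maps (state, symbol) -> (write, move, next_state);
    '_' is the blank symbol, 'halt' stops the machine."""
    cells = dict(enumerate(tape))
    for _ in range(steps):
        if state == "halt":
            break
        symbol = cells.get(head, "_")
        write, move, state = table[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1}[move]
    return "".join(cells[i] for i in sorted(cells))

# A one-state machine that flips every bit, then halts on blank.
flipper = {
    ("q0", "0"): ("1", "R", "q0"),
    ("q0", "1"): ("0", "R", "q0"),
    ("q0", "_"): ("_", "R", "halt"),
}
print(run(flipper, "1011"))  # -> 0100_
```

Swapping in a different table changes what the same `run` interpreter computes, which is precisely the machine/program duality described above.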
Logicians during the first third of the twentieth century found themselves faced with what was thought to be a “crisis” in the very foundations of mathematics itself. The “crisis” originated in the efforts of Georg Cantor to expand the grasp of mathematics into the realm of the “transfinite” where he found not merely one infinity, but cascades of larger and larger infinities. Cantor himself was not disturbed by the seeming contradiction between the theorem that there is always another infinity larger than any of his transfinites and the evident fact that one could not exceed the “absolute infinity” consisting of all of his transfinites. For the deeply religious Cantor, this absolute had a clear theological interpretation, but others were not so sanguine. Matters were brought to a head when Bertrand Russell showed that the reasoning embodied in this paradox could be distilled into what appeared to be an outright contradiction in our simplest logical intuitions. This was communicated in Russell’s letter that so upset Frege, and Frege saw that it rendered his own system self-contradictory.6 4 [Kreiser, 2001] is an excellent biography but is only available in the original German; for an English translation of Frege’s Begriffsschrift see [van Heijenoort, 1967]. 5 For Turing’s revolutionary paper see [Turing, 1936]. [Petzold, 2008] provides a carefully guided tour of the article. 6 For Cantor’s life see [Grattan-Guinness, 1971; Meschkowski, 1983; Purkert-Ilgauds, 1987], for


Martin Davis

The perceived crisis arose over the issue of incorporating Cantor’s transfinites into the body of respectable mathematics while avoiding the contradictions to which they seemed inevitably to lead. The technical developments on which Alan Turing based his crucial insights emerged from efforts to deal with the “crisis”. Some leading mathematicians proposed a radical revisionism that would have banished not only Cantor’s transfinites, but a number of other modes of reasoning now felt to be suspect. The great German mathematician David Hilbert, whose mathematical trademark was the use of general abstract ideas instead of brute calculation, would have none of this. He contemptuously rejected this attempt “to dismember our science”. Cantor’s theory of transfinites was “the finest product of mathematical genius, and one of the supreme achievements of purely intellectual human activity”. Hilbert extracted from Frege’s language a set of rules of inference for dealing with individual elements of a perfectly arbitrary nature. Any particular part of mathematics, including Cantor’s transfinites, could be expressed as a system consisting of these rules with an appropriate collection of axioms adjoined to serve as premises. Transcending Frege and Russell’s conception, Hilbert saw that such a system had an inside and an outside. Viewed from the inside, it was simply a piece of mathematics turned into an artificial language in which all of the reasoning had been reduced to symbol manipulation. But from the outside, it could be viewed as a purely formal game with rules for the manipulation of symbols. Hilbert’s program for dealing with the “crisis” was what he called metamathematics: viewed from the outside, the supposed fact that no contradictions were derivable using the rules of the game could be regarded as a mathematical theorem to be proved.
To silence the opposition, Hilbert insisted that the proof of such a consistency theorem be carried out using “finitary” methods that everyone would admit were impeccable.7

In Vienna during the 1920s, a group of scholars with radical ideas about the content and nature of philosophy met regularly, calling themselves the “Vienna Circle”. They embraced Hilbert’s restriction to finitary methods in logical investigations, and indeed tended to go much further. Where Hilbert had conceived of his “games” as a device to insure the validity of abstract and infinitary methods in mathematics, in the Vienna Circle the tendency was to reject any notions of meaning and truth in mathematics except as symbol manipulation permitted by these games. Hilbert had added an “outside” to the conceptual basis of mathematics for the purpose of justifying the “inside”. For the participants in the Vienna Circle, there was only the “inside”.

Among those attending the meetings of the Vienna Circle was the young mathematician Kurt Gödel, who was quite skeptical about what he was hearing. He saw that Hilbert’s insistence on finitary methods was in some cases an obstacle to obtaining very important results. Moreover, embracing Hilbert’s invitation to study the interplay between the “inside” and “outside” of his systems, Gödel showed that for such a system, certain simple true statements about the whole numbers 0, 1, 2, . . . could not be proved using the rules of that system.

7 For Hilbert’s life see [Reid, 1970], for his polemics [Mancosu, 1998].

Logic and the Development of the Computer

One could not, as the members of the Vienna Circle had wished, ignore the “outside”, the meaningful content, of Hilbert’s formal systems! Hilbert was almost 70 in 1930 when Gödel announced this result. Soon afterwards, Gödel realized that a corollary of his work showed that with the restriction to Hilbert’s finitary methods, it would be impossible to prove that his systems were free from contradictions.8

Gödel’s work killed Hilbert’s program, but it also opened new vistas in a number of directions. In line with Leibniz’s ideas, he made it clear that the study of the structure of artificial languages is worthwhile. Gödel’s technique showed how to embed the “outside” of one of Hilbert’s systems in its “inside”. To do so, the strings of symbols by which a system presents its “outside” were represented by numbers, which are available “inside”. This anticipated the technique of representing a computer program, which the programmer sees as consisting of strings of symbols, by strings of zeros and ones. In fact, in order to show that various metamathematical relations could be expressed “inside”, Gödel developed what amounted to a programming language.

In his writings on logic, Hilbert had emphasized the crucial importance of what he called the Entscheidungsproblem. This was nothing but the problem of providing the calculational methods for Frege’s rules of inference that would have been needed to carry out Leibniz’s dream of mechanizing reason. Alan Turing’s decisive proof that no such methods could exist came a few years after Gödel’s work and was heavily influenced by it. The all-purpose digital computers in Turing’s paper were mathematical abstractions, but he was intrigued by the possibility of building a real one. Such thoughts had to be put on hold because of the Second World War, during which Turing played a key role in the successful deciphering of the German secret military codes.
In his spare time, he thought about how computers could be made to play a good chess game, and indeed to behave “intelligently”. When the war was over he wrote a report showing how to build a computer using the available technology. Turing took a job at the University of Manchester working with the computer that was being built there. After the police in Manchester learned of Turing’s sexual involvement with a young man, he was arrested, convicted of “gross indecency” and compelled to undergo “treatments” with female sex hormones. He died of cyanide poisoning, apparently a suicide, two years later.9

As a young mathematician, John von Neumann had been attracted by Hilbert’s program and had written a number of articles on consistency proofs and on systems of axioms for Cantor’s set theory. He was in the audience at the conference at Königsberg in 1930 where Gödel first announced his fateful results, and immediately grasped their significance. Von Neumann was so impressed by what Gödel had accomplished that a number of his lectures during that period were devoted to Gödel’s results. He knew Turing personally, and was certainly familiar with

8 For Gödel’s life see [Dawson, 1997; Kreisel, 1981].
9 The little textbook in which Hilbert emphasized the Entscheidungsproblem was [Hilb-Acker, 1928]. The definitive biography of Alan Turing is [Hodges, 1983]. The briefer [Leavitt, 2006] is also excellent.



his work. Von Neumann’s draft EDVAC report, which so influenced subsequent computer development and so inflamed J. Presper Eckert, shows unmistakable evidence of Turing’s influence.

With the construction of general purpose digital computers, the role of logicians in the developing new technology became a day-to-day matter. From the beginning, in the 1950s, logic has played a key role in the design of programming languages. The PROLOG language (PROgramming in LOGic) made the connection quite explicit, and can be thought of as a partial realization of Leibniz’s dream. Going beyond anything Leibniz could have dreamt of is the effort to demonstrate that the full power of human thought can be emulated by computers. This very possibility has been denied by various thinkers, some by appealing to Gödel’s work, others by more general philosophical analysis. Meanwhile, “artificial intelligence” has become a significant branch of computer science, and while it has hardly accomplished what its advocates had expected, it has certainly achieved some impressive results, including the defeat of the world’s leading chess champion by a computer. From a dispassionate point of view, all that can really be said is that neither side of the ongoing debate over whether computers can exhibit truly intelligent behavior has really settled the matter. What is clear is that the revolution wrought by computers is just beginning, and that their ability to perform any symbolic task whose logical structure can be explicitly defined will lead to developments far beyond what we can imagine today.

BIBLIOGRAPHY

[Aiton, 1985] Aiton, E.J., Leibniz: a Biography, Adam Hilger Ltd, Bristol and Boston 1985.
[Boole, 1847] Boole, George, The Mathematical Analysis of Logic, Being an Essay towards a Calculus of Deductive Reasoning, Macmillan, Barclay and Macmillan, Cambridge, 1847.
[Boole, 1958] Boole, George, An Investigation of the Laws of Thought on which Are Founded the Mathematical Theories of Logic and Probabilities, Walton and Maberly, London 1854; reprinted Dover, New York, 1958.
[Cantor, 1941] Cantor, Georg, Contributions to the Founding of the Theory of Transfinite Numbers, translated from the German with an introduction and notes by Philip E.B. Jourdain, Open Court, La Salle, Illinois, 1941.
[Carp-Doran, 1977] Carpenter, B.E. and R.W. Doran, “The Other Turing Machine,” Computer Journal, vol. 20 (1977), pp. 269-279.
[Ceruzzi, 1983] Ceruzzi, Paul E., Reckoners, the Prehistory of the Digital Computer, from Relays to the Stored Program Concept, 1933-1945, Greenwood Press, Westport, Connecticut 1983.
[Copeland, 2004] Copeland, B. Jack, editor, The Essential Turing, Oxford 2004.
[Couturat, 1961] Couturat, Louis, La Logique de Leibniz d’Après des Documents Inédits, Paris, F. Alcan, 1901. Reprinted Georg Olms, Hildesheim 1961.
[Dauben, 1979] Dauben, Joseph Warren, Georg Cantor: His Mathematics and Philosophy of the Infinite, Princeton University Press, 1979.
[Davis, 1988] Davis, Martin, “Mathematical Logic and the Origin of Modern Computers,” Studies in the History of Mathematics, pp. 137-165, Mathematical Association of America, 1987. Reprinted in The Universal Turing Machine - A Half-Century Survey, Rolf Herken, ed., pp. 149-174, Verlag Kemmerer & Unverzagt, Hamburg, Berlin 1988; Oxford University Press, 1988.
[Davis, 2004] Davis, Martin, ed., The Undecidable, Raven Press 1965. Reprinted: Dover 2004.



[Davis, 2012] Davis, Martin, The Universal Computer: The Road from Leibniz to Turing, W.W. Norton, 2000. Paperback edition titled Engines of Logic: Mathematicians and the Origin of the Computer, W.W. Norton, 2001. Turing Centenary Edition, CRC Press, Taylor & Francis 2012.
[Dawson, 1997] Dawson, John W., Jr., Logical Dilemmas: The Life and Work of Kurt Gödel, A K Peters, Wellesley, Massachusetts, 1997.
[Edwards, 1979] Edwards, Charles Henry, Jr., The Historical Development of the Calculus, Springer-Verlag, New York, 1979.
[Goldstine, 1972] Goldstine, Herman H., The Computer from Pascal to von Neumann, Princeton University Press 1972.
[Grattan-Guinness, 1971] Grattan-Guinness, I., “Towards a Biography of Georg Cantor,” Annals of Science, vol. 27 (1971), pp. 345-391.
[Hilb-Acker, 1928] Hilbert, D., and W. Ackermann, Grundzüge der Theoretischen Logik, Julius Springer 1928.
[Hodges, 1983] Hodges, Andrew, Alan Turing: The Enigma, Simon and Schuster, New York 1983.
[Hofmann, 1974] Hofmann, J.E., Leibniz in Paris 1672-1676, Cambridge University Press, London 1974.
[Kreisel, 1981] Kreisel, Georg, “Kurt Gödel: 1906-1978,” Biographical Memoirs of Fellows of the Royal Society, vol. 26 (1980), pp. 149-224; corrigenda, vol. 27 (1981), p. 697.
[Kreiser, 2001] Kreiser, Lothar, Gottlob Frege: Leben – Werk – Zeit, Felix Meiner Verlag, Hamburg 2001.
[Leavitt, 2006] Leavitt, David, The Man Who Knew Too Much: Alan Turing and the Invention of the Computer, Norton, New York 2006.
[MacHale, 1985] MacHale, Desmond, George Boole: His Life and Work, Boole Press, Dublin 1985.
[Mancosu, 1998] Mancosu, Paolo, From Brouwer to Hilbert, Oxford 1998.
[Mates, 1986] Mates, Benson, The Philosophy of Leibniz: Metaphysics & Language, Oxford University Press 1986.
[McCull-Pitts, 1943] McCulloch, W.S. and W. Pitts, “A Logical Calculus of the Ideas Immanent in Nervous Activity,” Bulletin of Mathematical Biophysics, 5 (1943), 115-133. Reprinted in McCulloch, W.S., Embodiments of Mind, M.I.T. Press 1965, 19-39.
[Meschkowski, 1983] Meschkowski, Herbert, Georg Cantor: Leben, Werk und Wirkung, Bibliographisches Institut, Mannheim, Vienna, Zürich 1983.
[Petzold, 2008] Petzold, Charles, The Annotated Turing: A Guided Tour through Alan Turing’s Historic Paper on Computability and the Turing Machine, Wiley, Indianapolis 2008.
[Purkert-Ilgauds, 1987] Purkert, Walter, and Hans Joachim Ilgauds, Georg Cantor: 1845-1918, Vita Mathematica, v. 1, Birkhäuser, Stuttgart 1987.
[Randell, 1982] Randell, Brian, ed., The Origins of Digital Computers, Selected Papers (third edition), Springer-Verlag 1982.
[Reid, 1970] Reid, Constance, Hilbert–Courant, Springer-Verlag, New York 1986. [Originally published by Springer-Verlag as two separate works: “Hilbert” 1970 and “Courant in Göttingen and New York: The Story of an Improbable Mathematician” 1976.]
[Stern, 1981] Stern, Nancy, From Eniac to Univac: An Appraisal of the Eckert-Mauchly Machines, Digital Press 1981.
[Turing, 1936] Turing, Alan, “On Computable Numbers with an Application to the Entscheidungsproblem,” Proceedings of the London Mathematical Society, ser. 2, 42 (1936), pp. 230-267. Correction: ibid., 43 (1937), pp. 544-546. Reprinted in [Davis, 2004] pp. 116-154. Reprinted in [Turing, 2001] pp. 18-56. Reprinted in [Copeland, 2004] pp. 58-90, 94-96. Reprinted in [Petzold, 2008] (the original text interspersed with commentary).
[Turing, 1950] Turing, Alan, “Computing Machinery and Intelligence,” Mind, vol. LIX (1950), pp. 433-460. Reprinted in [Turing, 1992] pp. 133-160. Reprinted in [Copeland, 2004] pp. 433-464.
[Turing, 1992] Turing, Alan, Collected Works: Mechanical Intelligence, D.C. Ince, editor. North-Holland, Amsterdam 1992.
[Turing, 2001] Turing, Alan, Collected Works: Mathematical Logic, R.O. Gandy & C.E.M. Yates, editors. North-Holland, Amsterdam 2001.
[van Heijenoort, 1967] van Heijenoort, Jean, From Frege to Gödel, Harvard 1967.



[Von Neumann, 1945] von Neumann, John, First Draft of a Report on the EDVAC, Moore School of Electrical Engineering, University of Pennsylvania, 1945. First printed in [Stern, 1981], pp. 177-246.
[Von Neumann, 1963] von Neumann, John, Collected Works, vol. 5, A.H. Taub (editor). Pergamon Press 1963.
[Welchman, 1982] Welchman, Gordon, The Hut Six Story, McGraw-Hill 1982.
[Weyl, 1944] Weyl, Hermann, “David Hilbert and His Mathematical Work,” Bulletin of the American Mathematical Society, vol. 50 (1944), pp. 612-654.
[White-Russ, 1925] Whitehead, Alfred North and Bertrand Russell, Principia Mathematica, vol. I, second edition, Cambridge 1925.

WHAT IS A LOGICAL SYSTEM? AN EVOLUTIONARY VIEW: 1964–2014

Dov M. Gabbay

Reader: Johan van Benthem


In the past half century, there has been an increasing demand from many disciplines such as law, artificial intelligence, logic programming, argumentation, agent theory, planning, game theory, social decision theory, mathematics, automated deduction, economics, psychology, theoretical computer science, linguistics and philosophy for a variety of logical systems. This was prompted by the extensive applications of logic in these areas, especially in agent theory, linguistics, theoretical computer science, artificial intelligence and logic programming. In these fields there is a growing need for a diversity of semantically meaningful and algorithmically presented logical systems which can serve various applications. Therefore renewed research activity is being devoted to analysing and tinkering with old and new logics. This activity has produced a shift in the notion of a logical system. Traditionally a logic was perceived as a ‘consequence relation’ or a proof system between sets of formulas. Problems arising in application areas have emphasised the need for consequence relations between structures of formulas (such as multisets, sequences or even richer structures). The general notion of a structured consequence relation was put forward in [Gabbay, 1993a]. This finer-tuned approach to the notion of a logical system introduces new problems, which have called for an improved general framework in which many of the new logics arising from computer science applications can be presented and investigated.

This chapter is a systematic study of the notion of what a logical system is.1 It will incrementally motivate a notion of logical system through the needs of various

1 It is a description of a personal evolution of the author’s view of what a logical system is, resulting from a systematic effort to record and develop such systems. (The author was heavily involved with the development of the new logics; in the past 50 years he has edited over 60 handbook volumes of applied logic, authored or coauthored over 35 research monographs, published over 500 papers, and founded and was editor-in-chief of 5 top journals in the field.) I state my own views and those of people in a similar school of thought. I make no attempt to arrive at a sort of community opinion drawing on other strands in the literature (say, the work of Barwise and Seligman and of Johan van Benthem on what logic and logical systems are, or the vast tradition of abstract model theory or category-theoretic approaches, etc.). See http:

Handbook of the History of Logic. Volume 9: Computational Logic. Volume editor: Jörg Siekmann Series editors: Dov M. Gabbay and John Woods Copyright © 2014 Elsevier BV. All rights reserved.


Dov M. Gabbay

applications and applied logical activity. The chapter proposes an increasingly detailed and evolving image of a logical system. The initial position is that of a logical system as a consequence relation on sets of formulas. Thus any set theoretical binary relation of the form ∆ |∼ Γ satisfying certain conditions (reflexivity, monotonicity and cut) is a logical system. Such a relation has to be mathematically presented. This can be done either semantically, or set theoretically, or it can be algorithmically generated. There are several options for the latter: generate first the set {A | ∅ |∼ A} as a Hilbert system and then generate {(∆, Γ) | ∆ |∼ Γ}, or generate the pairs (∆, Γ) directly (via Gentzen rules), or use any other means (semantical interpretations, dynamics, dialogues, games or other proof theories).

The concepts of a logical system, semantics and proof theory are not sharp enough even in the traditional literature. There are no clear definitions of what a proof theoretic formulation of a logic is (as opposed to, e.g., a decision procedure algorithm) or of what, e.g., a Gentzen formulation is. Let us try here to propose a working definition, only for the purpose of making the reader a bit more comfortable and not necessarily for the purpose of giving a definitive formulation of these concepts.

• We start with the notion of a well formed formula of the language L of the logic.

• A consequence relation is a binary relation on finite sets of formulas ∆, Γ, written as ∆ |∼ Γ, satisfying certain conditions, namely reflexivity, monotonicity and cut.

• Such a relation can be defined in many ways. For example, one can list all pairs (∆, Γ) such that ∆ |∼ Γ should hold. Another way is to give ∆, Γ to some computer program and wait for an answer (which should always come).

• A semantics is an interpretation of the language L into some family of set theoretical structures, together with a definition of the consequence relation |∼ in terms of the interpretation.
What I have just said is not clear in itself, because I have not explained what ‘structures’ are and what an interpretation is. Indeed, there is no clear definition of what a semantics is. In my book [Gabbay, 1976], following Scott, I defined a model as a function s giving each wff of the language a value in {0, 1}. A semantics S is a set of models, and ∆ |∼S Γ is defined by the condition:

(∀s ∈ S)[∀X ∈ ∆ (s(X) = 1) → ∃Y ∈ Γ (s(Y) = 1)]

1 (continued) and see references under Barwise and under van Benthem. I am, however, optimistic, and I don’t think that reconciliation is impossible. I emphasize the practice in AI/CS, and I draw general conclusions about what a logical system is. Not all logicians will agree that one’s concept of logic has to be dominated by what happens in the practice of CS/AI; see [Gabbay, 1994a]. Also note that the emphasis on deduction as crucial in CS/AI might be contrasted with the Halpern-Vardi manifesto, which claims that most significant uses of logic in computer science revolve around model checking.
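For classical propositional logic this semantics-as-a-set-of-models definition can be executed directly: take S to be the set of all two-valued valuations of the atoms involved, and check the displayed condition by enumeration. A minimal Python sketch (the tuple representation of formulas and the function names are my own illustration, not the chapter's):

```python
from itertools import product

def atoms(f):
    """Atomic propositions in a formula; atoms are strings,
    compound formulas are tuples like ('->', A, B) or ('not', A)."""
    return {f} if isinstance(f, str) else set().union(*map(atoms, f[1:]))

def ev(s, f):
    """Value of formula f under a valuation s : atom -> {0, 1}."""
    if isinstance(f, str):
        return s[f]
    op = f[0]
    if op == 'not':
        return 1 - ev(s, f[1])
    if op == 'and':
        return min(ev(s, f[1]), ev(s, f[2]))
    if op == 'or':
        return max(ev(s, f[1]), ev(s, f[2]))
    if op == '->':
        return max(1 - ev(s, f[1]), ev(s, f[2]))
    raise ValueError(f"unknown connective {op!r}")

def follows(delta, gamma):
    """Delta |~S Gamma: every valuation making every member of Delta
    take value 1 makes some member of Gamma take value 1."""
    voc = sorted(set().union(*(atoms(f) for f in list(delta) + list(gamma))))
    for bits in product((0, 1), repeat=len(voc)):
        s = dict(zip(voc, bits))
        if all(ev(s, X) == 1 for X in delta) and not any(ev(s, Y) == 1 for Y in gamma):
            return False        # found a countermodel
    return True

assert follows(['p', ('->', 'p', 'q')], ['q'])    # modus ponens holds
assert not follows(['p'], ['q'])                  # p alone does not yield q
assert follows([], ['p', ('not', 'p')])           # Scott-style: |~ p, ¬p
```

The last assertion illustrates the point of allowing a set Γ on the right: neither p nor ¬p is a consequence of the empty set, but the pair together is, since every valuation satisfies one of them.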

What is a Logical System?


• There can be algorithmic systems for generating |∼. Such systems are not to be considered ‘proof theoretical systems’ for |∼. They could be decision procedures or just optimal theorem proving machines.

• The notion of a proof system is not well defined in the literature. There are some recognised methodologies such as ‘Gentzen formulations’, ‘tableaux’, ‘Hilbert style axiomatic systems’, but these are not sharply defined. For our purpose, let us agree that a proof system is any algorithmic system for generating |∼ using rules of the form

  ∆1 |∼ Γ1, . . . , ∆n |∼ Γn
  ──────────────────────────
  ∆ |∼ Γ

and ‘axioms’ of the form

  ∅
  ──────
  ∆ |∼ Γ

The axioms are the initial list of pairs (∆, Γ) ∈ |∼, and the other rules generate more. So a proof system is a particular way of generating |∼. Note that there need not be any structural requirement on the rules (that each involves a main connective and some subformulas, etc.). A Hilbert formulation is a proof system where all the ∆s involved are ∅. A Gentzen formulation would be a proof system where the rules are very nicely structured (try to define something reasonable yourself; again, there is no clear definition!). A Gentzen system can be viewed as a higher level Hilbert system for the ‘connective’ ‘|∼’. A tableau formulation is a syntactical counter model construction relative to some semantics. We have ∆ |∼ Γ if the counter model construction is ‘closed’, i.e. must always fail. It is also possible to present tableau formulations for logics which have no semantics, if the consequence |∼ and the connectives satisfy some conditions.

The central role which proof theoretical methodologies play in generating logics compels us to put forward the view that a logical system is a pair (|∼, S|∼), where S|∼ is a proof theory for |∼. In other words, we are saying that it is not enough to know |∼ to ‘understand’ the logic; we must also know how it is presented (i.e. S|∼).

The next shift in our concept of a logic came when we observed, from application areas whose knowledge representation involves data and assumptions, the need to add structure to the assumptions, and the fact that the reasoning involved relies on and uses that structure. This view also includes non-monotonic systems. This led us to develop the notion of Labelled Deductive Systems (LDS) and to adopt the view that this is the framework for presenting logics. Whether we accept these new systems as logics or not, classical logic must be able to represent them.
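A Hilbert formulation in the sense just described (all the ∆s are ∅, so the axioms are an initial stock of theorems and the rules generate more) can be sketched concretely. The following illustration, with an encoding of my own, closes a set of implicational axiom instances under modus ponens; the instances are taken from the standard schemas K: X → (Y → X) and S: (X → (Y → Z)) → ((X → Y) → (X → Z)), chosen so that the classical derivation of A → A goes through:

```python
def mp_closure(theorems):
    """Close a set of formulas under modus ponens (from A and A -> B infer B).
    This generates the theorems of a Hilbert system: every 'sequent'
    involved has an empty left-hand side."""
    thms = set(theorems)
    while True:
        new = {f[2] for f in thms
               if isinstance(f, tuple) and f[0] == '->' and f[1] in thms}
        new -= thms
        if not new:
            return thms
        thms |= new

imp = lambda x, y: ('->', x, y)
A = 'A'

axioms = {
    imp(A, imp(imp(A, A), A)),                  # K with X := A, Y := A -> A
    imp(A, imp(A, A)),                          # K with X := A, Y := A
    imp(imp(A, imp(imp(A, A), A)),
        imp(imp(A, imp(A, A)), imp(A, A))),     # S with X := A, Y := A -> A, Z := A
}
assert imp(A, A) in mp_closure(axioms)          # A -> A is derived
```

Two applications of modus ponens suffice: the S instance applied to the first K instance yields (A → (A → A)) → (A → A), which applied to the second K instance yields A → A. This is the standard textbook derivation, reproduced here purely as an instance of "axioms plus rules generate |∼".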



The real departure from traditional logics (as opposed to just giving them more structure) comes with the notion of aggregating arguments. Real human reasoning does aggregate arguments (circumstantial evidence in favour of A as opposed to evidence for ¬A), and what are known as quantitative (fuzzy) reasoning systems make heavy use of that. Fortunately, LDS can handle that easily. The section concludes with the view that a proper practical reasoning system has ‘mechanisms’ for updates, inputs, abduction, actions, etc., as well as databases (theories, assumptions), and that a proper logic is an integrated LDS system together with a specific choice of such mechanisms.2



Traditionally, to present a logic L, we need first to present the set of well formed formulas of that logic. This is the language of the logic. We define the sets of atomic formulas, connectives, quantifiers and the set of arbitrary formulas. Secondly, we define mathematically the notion of consequence: for a given set of formulas ∆ and a given formula Q, we define the consequence relation ∆ |∼L Q, reading ‘Q follows from ∆ in the logic L’. The consequence relation is required to satisfy the following intuitive properties (∆, ∆′ abbreviates ∆ ∪ ∆′):

Reflexivity: ∆ |∼ Q if Q ∈ ∆

Monotonicity: ∆ |∼ Q implies ∆, ∆′ |∼ Q

2 My personal view in 1994 was that this is a logic, i.e. Logic = LDS system + several mechanisms. In AI circles this might be called an agent. Unfortunately, the traditional logic community was (in 1994) still very conservative, in the sense that it had not even accepted non-monotonic reasoning systems as logics yet. They believe that all this excitement is transient, temporarily generated by computer science, and that it will fizzle out sooner or later. They believe that we will soon be back to the old research problems, such as how many non-isomorphic models a theory has in some inaccessible cardinal, or what the ordinal of yet another subsystem of analysis is. I think this is fine for mathematical logic but not for the logic of human reasoning. There is no conflict here between the new and the old, just further evolution of the subject. We shall see later that a more refined view is called for in 2014.



Transitivity (cut)3: ∆ |∼ A and ∆, A |∼ Q imply ∆ |∼ Q

The consequence relation may be defined in various ways: either through an algorithmic system S|∼, or implicitly by postulates on the properties of |∼. Thus a logic is obtained by specifying L and |∼. Two algorithmic systems S1 and S2 which give rise to the same |∼ are considered the same logic.

If you think of ∆ as a database and Q as a query, then reflexivity means that the answer is yes to any Q which is officially listed in the database. Monotonicity reflects the accumulation of data, and transitivity is nothing but lemma generation: if ∆ |∼ A, then A can be used as a lemma in obtaining B from ∆. These properties seemed minimal and most natural for a logical system to have, given that the main applications of logic were in mathematics and philosophy. The above notion was essentially put forward by [Tarski, 1936] and is referred to as Tarski consequence.

[Scott, 1974], following ideas from [Gabbay, 1969], generalised the notion to allow Q to be a set of formulas Γ. The basic relation is then of the form ∆ |∼ Γ, satisfying:

Reflexivity: ∆ |∼ Γ if ∆ ∩ Γ ≠ ∅

Monotonicity:
  ∆ |∼ Γ
  ──────────
  ∆, ∆′ |∼ Γ

Transitivity (cut):
  ∆, A |∼ Γ; ∆′ |∼ A, Γ′
  ──────────────────────
  ∆′, ∆ |∼ Γ, Γ′

Scott has shown that for any Tarski consequence relation there exist two Scott consequence relations (a maximal one and a minimal one) that agree with it (see my book [Gabbay, 1986]).

3 There are several versions of the cut rule in the literature; they are all equivalent for the cases of classical and intuitionistic logic but are not equivalent in the context of this section. The version in the main text we call transitivity (lemma generation). Another version is

  Γ |∼ A; ∆, A |∼ B
  ─────────────────
  ∆, Γ |∼ B

This version implies monotonicity, when added to reflexivity. Another version we call internal cut:

  ∆, A |∼ Γ; ∆ |∼ A, Γ
  ────────────────────
  ∆ |∼ Γ

A more restricted version of cut is unitary cut:

  ∆ |∼ A; A |∼ Q
  ──────────────
  ∆ |∼ Q



The above notions are monotonic. However, the increasing use of logic in artificial intelligence has given rise to logical systems which are not monotonic: the axiom of monotonicity is not satisfied in these systems. There are many such systems, satisfying a variety of conditions and presented in a variety of ways. Furthermore, some are proof theoretical and some are model theoretical. All these different presentations give rise to some notion of consequence ∆ |∼ Q, but they all seem to agree only on some form of restricted reflexivity (A |∼ A). The essential difference between these logics (commonly called non-monotonic logics) and the more traditional logics (now referred to as monotonic logics) is the fact that ∆ |∼ A holds in the monotonic case because of some subset ∆A ⊆ ∆, while in the non-monotonic case the entire set ∆ is used to derive A. Thus if ∆ is increased to ∆′, there is no change in the monotonic case, while there may be a change in the non-monotonic case.4

The above describes the situation current in the early 1980s. We had a multitude of systems generally accepted as ‘logics’ without a unifying underlying theory; many had semantics without proof theory, and many had proof theory without semantics, though almost all of them were based on some sound intuitions of one form or another. Clearly there was the need for a general unifying framework. An early attempt at classifying non-monotonic systems was [Gabbay, 1985]. It was put forward that the basic axioms for a consequence relation should be reflexivity, transitivity (cut) and restricted monotonicity, namely:

Restricted monotonicity:
  ∆ |∼ A; ∆ |∼ B
  ──────────────
  ∆, A |∼ B

A variety of systems seem to satisfy this axiom.
Further results were obtained [Kraus et al., 1990; Lehmann and Magidor, 1992; Makinson, 1989; Wójcicki, 1988], and the area was called the ‘axiomatic theory of the consequence relation’ by Wójcicki.5 Although some classification was obtained and semantical results were proved, the approach does not seem to be strong enough. Many systems do not satisfy restricted monotonicity. Other systems, such as relevance logic, do not even satisfy reflexivity. Others have a richness of their own which is lost in a simple presentation as an axiomatic consequence relation. Obviously a different approach is needed, one which would be more sensitive to the variety of features of the systems in the field. Fortunately, developments in a neighbouring area, that of automated deduction, seem to give us a clue. This is the 1994 view; it was modified after 2009, see [D’Agostino and Floridi, 2009; D’Agostino et al., 2013; D’Agostino and Gabbay, 2014].

4 These systems arise due to various practical mechanisms compensating for lack of reasoning resources. See for example [Gabbay and Woods, 2008].
5 In general, the exact formulations of transitivity and reflexivity can force some form of monotonicity.





The relative importance of automated deduction is on the increase, in view of its wide applicability. New automated deduction methods have been developed for non-classical logics, and resolution has been generalised and modified to be applicable to these logics. In general, because of the value of these logics in theoretical computer science and artificial intelligence, a greater awareness of the computational aspects of logical systems is developing, and more attention is being devoted to proof theoretical presentations.

It became apparent to us that a key feature in the proof theoretic study of these logics is that a slight natural variation in an automated or proof theoretic system for one logic (say L1) can yield another logic (say L2). Although L1 and L2 may be conceptually far apart (in their philosophical motivation and mathematical definitions), when we give them a proof theoretical presentation they turn out to be brother and sister. This kind of relationship is not isolated and seems to be widespread. Furthermore, non-monotonic systems seem to be obtainable from monotonic ones through variations on some of their monotonic proof theoretical formulations. This seems to give us some handle on classifying non-monotonic systems.

This phenomenon has prompted us to put forward the view that a logical system L is not just the traditional consequence relation |∼ (monotonic or non-monotonic) but a pair (|∼, S|∼), where |∼ is a mathematically defined consequence relation (i.e. the set of pairs (∆, Γ) such that ∆ |∼ Γ) satisfying whatever minimal conditions on a consequence one happens to agree to, and S|∼ is an algorithmic system for generating all those pairs [Gabbay, 1992]. Thus, according to this definition, classical propositional logic |∼ perceived as a set of tautologies together with a Gentzen system S|∼ is not the same as classical logic together with the two valued truth table decision procedure T|∼ for it.
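How little separates the classical truth-table decision procedure T|∼ from an n-valued one can be made concrete. In the following sketch (representation and names my own), the classical procedure is the parameter choice n = 2, and Łukasiewicz's n-valued tables, which use the values 0, 1/(n−1), ..., 1 with x → y = min(1, 1 − x + y) and ¬x = 1 − x, are n > 2:

```python
from itertools import product

def l_imp(x, y):
    """Łukasiewicz implication; restricted to the values {0, 1} this
    coincides with the classical truth table for ->."""
    return min(1.0, 1.0 - x + y)

def ev(s, f):
    """Evaluate a formula (atoms are strings; ('not', A), ('->', A, B))."""
    if isinstance(f, str):
        return s[f]
    if f[0] == 'not':
        return 1.0 - ev(s, f[1])
    return l_imp(ev(s, f[1]), ev(s, f[2]))

def atoms(f):
    return {f} if isinstance(f, str) else set().union(*map(atoms, f[1:]))

def tautology(f, n=2):
    """Truth-table decision procedure: n = 2 is classical logic,
    n > 2 is Łukasiewicz n-valued logic."""
    vals = [i / (n - 1) for i in range(n)]
    voc = sorted(atoms(f))
    return all(ev(dict(zip(voc, vs)), f) == 1.0
               for vs in product(vals, repeat=len(voc)))

peirce = ('->', ('->', ('->', 'p', 'q'), 'p'), 'p')   # ((p -> q) -> p) -> p
assert tautology(peirce, n=2)        # classically valid
assert not tautology(peirce, n=3)    # fails in 3-valued Łukasiewicz logic
```

Only the value of n changes between the two procedures, while the Gentzen presentations of these logics differ considerably; this is exactly the kind of proof-theoretic proximity and distance discussed next.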
In our conceptual framework, (|∼, S|∼ ) is not the same logic as (|∼, T|∼ ). To illustrate and motivate our way of thinking, observe that it is very easy to move from T|∼ for classical logic to a truth table system Tn|∼ for Łukasiewicz n-valued logic. It is not so easy to move to an algorithmic system for intuitionistic logic. In comparison, for a Gentzen system presentation, exactly the opposite is true. Intuitionistic and classical logics are neighbours, while Łukasiewicz logics seem completely different. In fact, for Hilbert style or Gentzen style formulations, one can show proof theoretic similarities between Łukasiewicz's infinite valued logic and Girard's linear logic, which in turn is proof theoretically similar to intuitionistic logic. This issue has a bearing on the notion of 'what is a classical logic'. Given an algorithmic proof system S|∼c for classical logic |∼c , then (|∼c , S|∼c ) is certainly classical logic. Now suppose we change S|∼c a bit by adding heuristics to obtain S′ . The heuristics and modifications are needed to support an application area. Can we still say that we are essentially in 'classical logic'? I suppose we can, because S′ is just a slight modification of S|∼c . However, slight modifications of an algorithmic


Dov M. Gabbay

system may yield another well known logic. So is linear logic essentially classical logic, slightly modified, or vice versa? We give an example from goal directed implicational logic. Consider a language with implication only. It is easy to see that all wffs A have the form A1 → (A2 → . . . → (An → q) . . .), q atomic, where each Ai has the same form as A. We now describe a computation with database a multiset ∆ of wffs of the above form and goal a wff of the above form. We use the metapredicate ∆ ⊢ A to mean that the computation succeeds, i.e. A follows from ∆. Here are the rules:
1. ∆, q ⊢ q, q atomic and ∆ empty. (Note that we are not writing A ⊢ A for arbitrary A. We are not writing a Gentzen system.)
2. ∆ ⊢ A1 → (A2 → . . . → (An → q) . . .) if ∆ ∪ {A1 , . . . , An } ⊢ q. Remember we are dealing with multisets.
3. ∆′ = ∆ ∪ {A1 → (A2 → . . . (An → q) . . .)} ⊢ q if ∆ = ∆1 ∪ ∆2 ∪ . . . ∪ ∆n , where the ∆i , i = 1, . . . , n, are pairwise disjoint and ∆i ⊢ Ai .
The above computation characterises linear implication. If we relinquish the side condition in (3), letting ∆i = ∆′ , and the side condition in (1) that ∆ is empty, we get intuitionistic implication. The difference in logics is serious. In terms of proof methodologies, the difference is minor. More examples can be found in [Gabbay, 1992]. Given a consequence relation |∼ we can ask: for a theory ∆ and a wff A, how do we check whether ∆ |∼ A? We need an algorithmic proof system S|∼ . Here are some examples of major proof methodologies:
• Gentzen systems
• Tableaux
• Semantics (effective truth tables)
• Goal directed methodology
• Resolution
• Dialogue systems
• Game theoretic interpretations
• Labelled Deductive Systems, etc.
To summarise, we have argued that a logical system is a pair (|∼, S|∼ ), where |∼ is a consequence relation and S|∼ is an algorithmic metapredicate S|∼ (∆, A) which, when succeeding, means that ∆ |∼ A. The reasons for this claim are:


1. We have an intuitive recognition of the different proof methodologies and show individual preferences for some of them, depending on taste and the need for applications.
2. Slight variations in the parameters of the proof systems can change the logics significantly. (For a systematic example of this see [Gabbay and Olivetti, 2000].)

Figure 1. Logics landscape. (Diagram omitted: it places Łukasiewicz infinite valued logic, classical logic and intuitionistic logic on a grid, one of whose axes is the truth table methodology.)

Figure 1 is an example of such a relationship. In the truth table methodology classical logic and Łukasiewicz logic are a slight variation of each other. They are not so close to intuitionistic logic. In the Gentzen approach, classical and intuitionistic logics are very close, while Łukasiewicz logic is a problem to characterise. This evidence suggests strongly that the landscape of logics is better viewed as a two-dimensional grid.6
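The goal-directed rules (1)–(3) above, and the switch between their linear and intuitionistic readings, can be sketched in code (the encoding and function names are ours, not from the text; the depth bound is a crude termination device):

```python
from itertools import product

# Encoding: the wff A1 -> (A2 -> ... -> (An -> q)...) is the pair
# ((A1, ..., An), q); an atom q on its own is ((), q).

def prove(db, goal, linear=True, depth=10):
    """Goal-directed computation; db is a tuple used as a multiset.
    linear=True gives linear implication; linear=False drops the
    disjointness condition in (3) and the emptiness condition in (1)."""
    if depth == 0:
        return False
    body, q = goal
    if body:                                   # rule (2): load the antecedents
        return prove(db + body, ((), q), linear, depth - 1)
    if ((), q) in db and (not linear or len(db) == 1):
        return True                            # rule (1)
    for i, (abody, head) in enumerate(db):     # rule (3): backchain on a clause
        if head != q or not abody:
            continue
        rest = db[:i] + db[i + 1:]
        if linear:
            # try every split of rest into pairwise disjoint parts, one per subgoal
            for assign in product(range(len(abody)), repeat=len(rest)):
                parts = [tuple(rest[j] for j in range(len(rest)) if assign[j] == k)
                         for k in range(len(abody))]
                if all(prove(parts[k], abody[k], True, depth - 1)
                       for k in range(len(abody))):
                    return True
        elif all(prove(db, a, False, depth - 1) for a in abody):
            return True
    return False

a = ((), 'a')
contract = ((a, a), 'b')     # a -> (a -> b)
```

With this sketch, a → (a → b), a ⊢ b fails linearly (contraction is unavailable) but succeeds intuitionistically, while a → b, a ⊢ b and ⊢ a → a succeed in both readings.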


Further observation of field examples shows that in many cases the database is not just a set of formulas but a structured set of formulas. The most common structures are lists and multisets.7 Such structures appear already in linear and concatenation logics and in many non-monotonic systems such as priority and inheritance systems.
6 See, however, our book [Metcalfe et al., 2008].
7 Classical logic cannot make these distinctions using conjunction only. It needs further annotation or use of predicates.
In many algorithmically presented systems much use is made (either explicitly or implicitly) of this additional structure. A very common example is a Horn clause program. The list of clauses
(a1) q
(a2) q → q
does not behave in the same way as the list
(b1) q → q
(b2) q

The query ?q succeeds from one and loops from the other. It is necessary to formulate axioms and notions of consequence relations for structures. This is studied in detail in [Gabbay, 1993a]. Here are the main features:
• Databases (assumptions) are structured. They are not just sets of formulas but have a more general structure such as multisets, lists, partially ordered sets, etc. To present a database formally, we need to describe the structures. Let M be a class of structures (e.g. all finite trees). Then a database ∆ has the form ∆ = (M, f ), where M ∈ M and f : M → wffs, such that for each t ∈ M , f (t) is a formula. We assume the one point structure {t} is always in M. We also assume that we know how to take any single point t ∈ M out of M and obtain (M ′ , f ′ ), with f ′ = f ↾ M ′ . This we need for some versions of the cut rule and the deduction theorem.
• A structured-consequence relation |∼ is a relation ∆ |∼ A between structured databases ∆ and formulas A. (We will not deal here with structured consequence relations ∆ |∼ Γ between two structured databases. See [Gabbay, 1993a].)
• |∼ must satisfy the minimal conditions, namely
Identity: {A} |∼ A
Surgical cut: from ∆ |∼ A and Γ[A] |∼ B infer Γ[∆] |∼ B,
where Γ[A] means that A resides somewhere in the structure Γ and Γ[∆] means that ∆ replaces A in the structure. These concepts have to be defined precisely. If ∆ = (M1 , f1 ) and Γ = (M2 , f2 ) then Γ[A] displays the fact that for some t ∈ M2 , f2 (t) = A. We also allow for the case that M2 = f2 = ∅ (i.e. taking A out). We need a notion of substitution, which is a three place function Sub(Γ, ∆, t), meaning that for t ∈ M2 we substitute M1 in place



of t. This gives us a structure (M3 , f3 ) according to the definition of Sub. (M3 , f3 ) is displayed as Γ[∆], and Γ[∅] displays the case of taking A out. Many non-monotonic systems satisfy a more restricted version of surgical cut:
from Γ[∅/A] |∼ A and Γ[A] |∼ B infer Γ[Γ[∅/A]] |∼ B.

Another variant would be deletional cut:
from Γ[∅/A] |∼ A and Γ[A] |∼ B infer Γ[∅/A] |∼ B.

• A logical system is a pair (|∼, S|∼ ), where |∼ is a structured-consequence relation and S|∼ is an algorithmic system for it.
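The Horn clause example above — the list (a1), (a2) against the list (b1), (b2) — can be simulated with a small depth-limited interpreter that respects clause order, in the style of a Prolog execution (a sketch; the encoding is ours, not from the text):

```python
def solve(program, goal, depth=50):
    """Depth-first, clause-order-respecting search for an atomic goal.
    Returns True (success), False (finite failure) or 'loop' when the
    depth bound is hit -- modelling Prolog-style infinite descent."""
    if depth == 0:
        return 'loop'
    for head, body in program:
        if head != goal:
            continue
        ok = True
        for subgoal in body:
            r = solve(program, subgoal, depth - 1)
            if r == 'loop':
                return 'loop'   # the execution never escapes this subgoal
            if r is False:
                ok = False
                break
        if ok:
            return True
    return False

prog_a = [('q', []), ('q', ['q'])]    # (a1) q.        (a2) q :- q.
prog_b = [('q', ['q']), ('q', [])]    # (b1) q :- q.   (b2) q.
```

The two programs contain the same clauses as a set; only the list structure differs, and the query ?q succeeds from prog_a but descends forever (here: hits the depth bound) from prog_b.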


Logical systems are idealizations and, as such, not intended to faithfully describe the actual deductive behaviour of rational agents. As Gabbay and Woods put it: A logic is an idealization of certain sorts of real-life phenomena. By their very nature, idealizations misdescribe the behaviour of actual agents. This is to be tolerated when two conditions are met. One is that the actual behaviour of actual agents can defensibly be made out to approximate to the behaviour of the ideal agents of the logician’s idealization. The other is the idealization’s facilitation of the logician’s discovery and demonstration of deep laws. [Gabbay and Woods, 2001, p. 158]

This should not necessarily be intended as a plea for a more descriptive approach to the actual inferential behaviour of agents that takes into account their "cognitive biases". Even from a prescriptive viewpoint, the requirements that Logic imposes on agents are too strong, since it is known that most interesting logics are either undecidable or (likely to be) computationally intractable. Therefore we cannot assume any realistic agent to be always able to recognize the logical consequences of her assumptions or to realize that such assumptions are logically inconsistent. This raises what can be called the approximation problem that, in the context of logical systems, can be concisely stated as follows: PROBLEM 1 (Approximation Problem). Can we define, in a natural way, a hierarchy of logical systems that indefinitely approximate a given idealized Logic in such a way that, in all practical contexts, suitable approximations can be taken as prescriptive models of the inferential power of realistic, resource-bounded agents? Stable solutions to this problem are likely to have a significant practical impact in all research areas — from knowledge engineering to economics — where there is an urgent need for more realistic models of deduction. From this point of view, we now claim that a logical system is not only given by a logic L and an algorithmic presentation A of it, but must also include a definition of how the ideal logic L can be approximated in practice by realistic agents (whether human or artificial).
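One concrete shape such a hierarchy can take, in the spirit of depth-bounded approximations of classical logic, is a family of inconsistency tests indexed by the number of case splits allowed: level 0 is cheap unit propagation, and increasing the index converges to full classical consequence. A sketch for clausal propositional logic (encoding and names are ours; this is an illustration, not the construction of any particular paper):

```python
def up(clauses):
    """Unit propagation on clauses (sets of non-zero integer literals).
    Returns None if a contradiction is found, else the simplified clauses."""
    clauses = [set(c) for c in clauses]
    while True:
        unit = next((next(iter(c)) for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses
        new = []
        for c in clauses:
            if unit in c:
                continue            # clause satisfied by the unit
            d = c - {-unit}
            if not d:
                return None         # empty clause: contradiction
            new.append(d)
        clauses = new

def refutable(clauses, k):
    """Level-k approximate inconsistency test: unit propagation plus at
    most k nested case splits.  k = 0 is a cheap, incomplete test; for
    large enough k the test coincides with classical unsatisfiability."""
    simplified = up(clauses)
    if simplified is None:
        return True
    if k == 0 or not simplified:
        return False
    lit = next(iter(simplified[0]))
    return (refutable(simplified + [{lit}], k - 1) and
            refutable(simplified + [{-lit}], k - 1))
```

For example, {p, ¬p ∨ q, ¬q} is already refuted at level 0, while the unsatisfiable set {p ∨ q, ¬p ∨ q, p ∨ ¬q, ¬p ∨ ¬q} needs one case split, so it is invisible to a level-0 agent but transparent to a level-1 agent.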



This idea has occasionally received some attention in Computer Science and Artificial Intelligence [Cadoli and Schaerf, 1992; Dalal, 1996; Dalal, 1998; Crawford and Etherington, 1998; Sheeran and Stalmarck, 2000; Finger, 2004; Finger, 2004b; Finger and Wasserman, 2004; Finger and Wassermann, 2006; Finger and Gabbay, 2006] but comparatively little attention has been devoted to embedding such efforts in a systematic proof-theoretical and semantic framework. In [D'Agostino et al., 2013], starting from ideas put forward in [D'Agostino and Floridi, 2009], the authors aim to fill this gap and propose a unifying approach. They also argue that traditional Gentzen-style presentations of classical logic are not apt to address the approximation problem and that its solution must therefore involve an imaginative re-examination of the proof-theory and semantics of a logical system as they are usually presented in the literature.


Of course we continue to maintain our view that different algorithmic systems for the same structured consequence relation define different logics. Still, although we now have a fairly general concept of a logic, we do not have a general framework. Monotonic and non-monotonic systems still seem conceptually different. There are many diverse examples among temporal logics, modal logics, defeasible logics and more. Obviously, there is a need for a more unifying framework. The question is, can we adopt a concept of a logic where the passage from one logic to another is natural, and along predefined acceptable modes of variation? Can we put forward a framework where the computational aspects of a logic also play a role? Is it possible to find a common home for a variety of seemingly different techniques introduced for different purposes in seemingly different intellectual logical traditions? To find an answer, let us ask ourselves what makes one logic different from another. How is a new logic presented, described and compared to another? The answer is obvious. These considerations are performed at the metalevel. Most logics are based on modus ponens anyway. The quantifier rules are formally the same anyway, and the differences between them are metalevel considerations on the proof theory or semantics. If we can find a mode of presentation of logical systems where metalevel features and semantic parts can reside side by side with object level features, then we can hope for a general framework. We must be careful here. In the logical community the notions of object level vs. metalevel are not so clear. Most people think of naming and proof predicates in this connection. This is not what we mean by metalevel here. We need a more refined understanding of the concept. There is a similar need in computer science. We found that the best framework to put forward is that of a Labelled Deductive System, LDS; see [Gabbay, 1996].
Our notion of what is a logic is that of a pair (|∼, S|∼ ), where |∼ is a structured (possibly non-monotonic) consequence relation on a language L and S|∼ is an LDS, and where |∼ is essentially required to satisfy no more than Identity (i.e. {A} |∼ A) and a version of Cut. This is a refinement of our concept of a logical system presented in [Gabbay, 1992]. We now not only



say that a logical system is a pair (|∼, S|∼ ), but we are adding that S|∼ itself has a special presentation, that of an LDS. As a first approximation, we can say that an LDS system is a triple (L, A, M), where L is a logical language (connectives and wffs), A is an algebra (with some operations) of labels, and M is a discipline of labelling formulas of the logic (from the algebra of labels A), together with deduction rules and with agreed ways of propagating the labels via the application of the deduction rules. The way the rules are used is more or less uniform for all systems. To present an LDS system we first need to define its set of formulas and its set of labels. For example, we can take the language of classical logic as the formulas (with variables, constants and quantifiers) and take some set of function symbols on the same variables and constants as generating the labels. More precisely, we allow ordinary formulas of predicate logic with quantifiers to be our LDS formulas. Thus ∃xA(x, y) is a formula with free variable y and bound variable x. To generate the labels, we start with a new set of function symbols t1 (y), t2 (x, y), . . . of various arities which can be applied to the same variables which appear in the formulas. Thus the labels and formulas can share variables, or even some constants and function symbols. In other words, in some applications it might be useful to allow some labels to appear inside formulas A. We can form declarative units of the form t1 (y) : ∃xA(x, y). When y is assigned a value y = a, so is the label, and we get t1 (a) : ∃xA(x, a). The labels should be viewed as more information about the formulas which is not coded inside the formula (hence dependence of the labels on variables x makes sense, as the extra information may be different for different x).
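A declarative unit t : A and the propagation of labels through a deduction rule can be sketched as follows; the label algebra (sets of assumption names) and the two disciplines shown are one illustrative choice of ours, not the general definition:

```python
# Formulas: atoms are strings; implications are tuples ('->', A, B).
# A declarative unit is (label, formula); here labels are frozensets of
# assumption names, one common choice of labelling algebra.

def mp(unit_ab, unit_a, discipline='relevance'):
    """Labelled modus ponens: from s : A -> B and t : A derive (s op t) : B.
    The labelling discipline decides how labels propagate and when the
    rule may fire at all."""
    s, f = unit_ab
    t, a = unit_a
    if not (isinstance(f, tuple) and f[0] == '->' and f[1] == a):
        return None                      # rule does not apply
    if discipline == 'linear' and s & t:
        return None                      # linear: resources may not be shared
    return (s | t, f[2])                 # label records the assumptions used

# two assumptions, each labelled by its own name
u1 = (frozenset({'x1'}), ('->', 'p', 'q'))
u2 = (frozenset({'x2'}), 'p')
```

Here mp(u1, u2) yields the unit {x1, x2} : q, so the label carries exactly the "extra information" — which assumptions the conclusion depends on — that is not coded inside the formula q itself.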
A formal definition of an algebraic LDS system will be given later; meanwhile, let us give an informal definition of an LDS system and some examples which help us understand what labels are and why we would want them.8
8 The idea of annotating formulas for various purposes is not new. A. R. Anderson and N. Belnap, in their book on Entailment, label formulas and propagate labels during proofs to keep track of the relevance of assumptions. Term annotations (the Curry–Howard formulas-as-types approach) are also known, where the propagation rules are functional application. The Lambek Calculus and the categorial approach are also related to labelling. The extra arguments sometimes present in the Demo predicate of metalogic programming are also a form of labelling. What is new is that we are proposing that we use an arbitrary algebra for the labels and consider the labelling as part of the logic. We are creating a discipline of LDS and claiming that we have a unifying framework for logics and that almost any logic can be given an LDS formulation. We can give |∼ an LDS formulation provided |∼ is reflexive and transitive and each connective is either |∼ monotonic or anti-monotonic in each of its arguments. See [Gabbay, 1993]. We are claiming that the notion of a logic is an LDS. This is not the same as the occasional use of labelling with some specific purpose in mind. We are translating and investigating systematically all the traditional logical concepts in the context of LDS and generalising them. I am reminded of the story of the Yuppy who hired an interior decorator to redesign his sitting room. After much study, the decorator recommended that the Yuppy needed a feeling of space, and so the best thing to do was to arrange the furniture against the wall, so that there would be a lot of space in the middle. The cleaning person, when she/he first saw the new design, was very pleased. She/he thought it was an excellent idea. 'Yes', said the Yuppy, 'and I paid £1000 for it'.
'That was stupid', exclaimed the cleaning person, 'I could have told you for free! I arrange the furniture this way every time I clean the floor!' Of course she/he is right, but she/he used the idea of the new arrangement purely as a side effect!

DEFINITION 2 (Prototype LDS system). Let A be a first-order language of the form A = (A, R1 , . . . , Rk , f1 , . . . , fm ), where A is the set of terms of the algebra (individual variables and constants), the Ri are predicate symbols (on A, possibly binary but not necessarily so) and f1 , . . . , fm are function symbols (on A) of various arities. We think of the elements of A as atomic labels, of the functions as generating more labels, and of the predicates as giving additional structure to the labels. A typical example would be (A, R, f1 , f2 ) where R is binary and f1 , f2 are unary. A diagram of labels is a set M containing elements generated from A by the function symbols, together with formulas of the form ±R(t1 , . . . , tk ), where ti ∈ M and R is a predicate symbol of the algebra. Let L be a predicate language with connectives ♯1 , . . . , ♯n of various arities, with quantifiers and with the same set of atomic terms A as the algebra. We define the notions of a declarative unit, a database and a label as follows:
1. An atomic label is any t ∈ A. A label is any term generated from the atomic labels by the symbols f1 , . . . , fm .
2. A formula is any formula of L.
3. A declarative unit is a pair t : A, where t is a label and A is a formula.
4. A database is either a declarative unit, or has the form (a, M, f ), where M is a finite diagram of labels, a ∈ M is the distinguished label, and f is a function associating with each label t in M either a database or a finite set of formulas. (Note that this is a recursive clause. We get simple databases if we allow f to associate with each label t only single formulas or finite sets of formulas. Simple databases are adequate for a large number of applications.)
Definition 2 is simplified. To understand it intuitively, think of the atomic labels as atomic places and times (top of the hill, 1 January 1992, etc.) and the function symbols as generating more labels, namely more times and more places (behind(x), day after(t), etc.).
We form declarative units by taking labels and attaching formulas to them. Complex structures (a, M, f ) of these units are databases. This definition can be made more complex. Here the labels are terms generated by function symbols from atomic labels. We can complicate matters by using databases themselves as labels. This would give us recursively more complex, richer labels. We will not go into that now. The first simplification is therefore that we are not using databases as labels. The second simplification is that we assume constant domains. All times and places have the same elements (population) in them. If this were not the case we would need a function Ut giving the elements residing in t, and a database would have the form (A, M, f, Ut ).
EXAMPLE 3. Consider a language with the predicate VS900(x, t). This is a two-sorted predicate, denoting the Virgin airline flight London–Tokyo, where t is the flight



date and x is the name of an individual. For example VS900(Dov, 15.11.91) may be put in the database, denoting that Dov is booked on this flight, scheduled to embark on 15.11.91. If the airline practices overbooking and cancellation procedures (whatever that means), it might wish to annotate the entries with further useful information such as
• time of booking;
• individual/group travel booking;
• type of ticket;
• ± VIP.
This information may be of a different nature to that coded in the main predicate, and it is therefore more convenient to keep it as an annotation, or label. It may also be the case that the manipulation of the extra information is of a different nature to that of the predicate. In general, there may be many uses for the label t in the declarative unit t : A. Here is a partial list:
• Fuzzy reliability value (a number x, 0 ≤ x ≤ 1): used mainly in expert systems.
• Origin of A: t indicates where the input A came from. Very useful in complex databases.
• Priority of A: t can be a date of entry of updates, and a later date (label) means a higher priority.
• Time when A holds (temporal logic).
• Possible world where A holds (modal logic).
• t indicates the proof of A (which assumptions were used in deriving A and the history of the proof). This is a useful labelling for Truth Maintenance Systems.
• t can be the situation and A the infon (of situation semantics).
EXAMPLE 4. Let us look at one particular example, connected with modal logic. Assume the algebra A has the form (A,
For some rule as in (4), we have that there exists (y1 , . . . , yn′ ), yn′ = y, which is a proof of level ≤ m of y from T ∪ {x′j } ∪ {x1 , . . . , xn−1 }. We also have z ′ = xn .
REMARK 69. Note that in logic-based argumentation networks (see [Caminada and Amgoud, 2007] or [Hunter, 2010]) only level 0 proofs are used. The rules have the form A1 ∧ . . . ∧ An ⇒c B and only ⇒c eliminations are used.

DEFINITION 70 (Soundness of rules). Let (S, R, ht ) be a logic in the sense of Definition 64. Let r1 , . . . , rk be rules in the sense of Definition 66. We say the rules are sound iff whenever b is proved from a in (S, R) as in Definition 68, for a, b ∈ S, then a ⊢ b holds as defined in Definition 64. We say the rules are complete iff we have
• a ⊢ b iff b is provable from a using the rules.
If a subset of S is marked inconsistent then the constraint arising from that set cannot be solved.
EXAMPLE 71 (Defeasible rules). Ordinary (strict) implication we can write as A1 ∧ . . . ∧ An → B or as A1 → (A2 → . . . → (An → B) . . .).



Figure 17. z = x → y; base node x′ is an auxiliary node, so that we can tell the difference between input and output. When identifying this pattern in an argumentation network it is required that nodes z and x′ bear exactly the attacks shown in the pattern. (Diagram with input node x, output node y and auxiliary nodes omitted.)

Figure 18. z = x ⇒ y; base node x′ is an auxiliary node. When identifying this pattern in an argumentation network it is required that nodes z, e and x′ bear exactly the attacks shown in the pattern. (Diagram omitted.)








Figure 19. (Diagram omitted: it shows a chain of nodes a1 , . . . , an with auxiliary nodes a′1 , . . . , a′n and nodes b0 , . . . , bn .)

Let us do geometrically A → B and A ⇒ B. All we need are some markers in the figures representing these two implications, to distinguish one from the other. See Figures 17 and 18. Consider now (S, R) of Figure 19 and consider the rule r of Figure 18. Take the theory T = {a1 , . . . , an } ∪ {bn }. What can it prove using r? The answer is that it can prove b0 , and all of bn−1 , . . . , b1 along the way. The deduction is essentially: aj and bj = [aj ⇒ (aj−1 ⇒ . . . (a1 ⇒ b0 ) . . .)] yield bj−1 = [aj−1 ⇒ . . . ⇒ (a1 ⇒ b0 ) . . .]. We can turn this ordering into a logic if we give the functions ht , for any t of Figure 19. Try h(ei ) = 1/2, h(ai ) arbitrary, h(a′i ) = h(ai ), h(b0 ) arbitrary, and h(bj ) = min(1, 1 − h(a′j ) + h(bj−1 )) for j ≥ 1. We need to show the rule is sound in this semantics, but in this case it is clear, because the rules are versions of modus ponens and the functions hx are from Łukasiewicz many-valued logic.
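The soundness claim at the end of the example — that modus ponens preserves the value 1 under the Łukasiewicz functions h — can be checked mechanically on a grid of values (a sketch; the names are ours):

```python
def luk_imp(a, b):
    """Lukasiewicz implication on [0, 1]: v(A -> B) = min(1, 1 - v(A) + v(B))."""
    return min(1.0, 1.0 - a + b)

# Soundness of modus ponens in this semantics: whenever v(a) = 1 and
# v(a -> b) = 1, we must have v(b) = 1.  Check over a grid of values.
grid = [i / 10 for i in range(11)]
mp_sound = all(b == 1.0
               for a in grid for b in grid
               if a == 1.0 and luk_imp(a, b) == 1.0)
```

Indeed, if v(a) = 1 then v(a → b) = min(1, v(b)), which equals 1 only when v(b) = 1, so the grid check succeeds.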


We cannot address the problem of what is a logical system without saying something about our view of semantics. The traditional view, for classical, intuitionistic, or modal logic is to have some notion of a class of models and of an evaluation procedure of a formula in a model. Thus we may have a set K of models and



a notion of validity in m ∈ K of a formula A of the logic. We use the notation m ⊨ A. Given no details on the internal structure of m and on how m ⊨ A is evaluated, all we can say about the model is that m is a {0, 1} function on wffs. Completeness of K for |∼ means that the following holds: A |∼ B iff for all m ∈ K (if m ⊨ A then m ⊨ B). We would like to present a different view of semantics. We would like to remain totally within the world of logical systems (in our sense, i.e. LDS with mechanisms) and, to the extent that semantics is needed, we bring it into the syntax. This can obviously and transparently be done in modal logic, where the labels denote possible worlds and the proof rules closely reflect semantical evaluation rules. This in fact can also be done in general. So what then is the basic notion involved in a purely syntactical set up? What replaces the notions of a 'model', 'evaluation', and completeness? We give the following definition.
DEFINITION 72 (Syntactical semantics). Let |∼ be a consequence relation and let K be a class of consequence relations, not necessarily of the same language. For each |∼∗ ∈ K, let k|∼∗ be an interpretation of |∼ into |∼∗ . This involves a mapping of the language of |∼ into the language of |∼∗ and the following homomorphic commitment: A |∼ B implies A∗ |∼∗ B ∗ (where A∗ is k|∼∗ (A), and resp. B ∗ ). We say |∼ is complete for (K, k) iff we have A |∼ B iff for all |∼∗ ∈ K, A∗ |∼∗ B ∗ .
EXAMPLE 73. The following can be considered as semantical interpretations in our sense:
1. The Solovay–Boolos interpretation of modal logic G (with Löb's axiom) in arithmetic, with □ meaning 'provable'.
2. The interpretation of intuitionistic propositional logic into various sequences of intermediate logics whose intersection is intuitionistic logic (e.g. the Jaśkowski sequence).
3. The interpretation of modal logic into classical logic.
REMARK 74. We gave a definition of interpretation for consequence relations |∼.
Of course, there are always trivial interpretations which ‘technically’ qualify as semantics. This is not intended. Further note that in the general case we have a general LDS proof system with algorithmic proof systems S|∼ and various mechanisms. These should also be interpreted. Each algorithmic move in S|∼ should be interpreted as a move package in S|∼∗ , and similarly for mechanisms.

What is a Logical System?


It is possible to justify and motivate our syntactical notion of semantics from the more traditional one. Let us take as our starting point the notion of Scott semantics described in [Gabbay, 1976].
DEFINITION 75. Let L be a propositional language, for example the modal language with □ or the intuitionistic language with →.
1. A model for the language is a function s assigning a value in {0, 1} to each wff of the language.
2. A semantics S is a class of models.
3. Let ∆ be a set of wffs and A a wff. We say ∆ ⊨S A iff for all s ∈ S, if s(B) = 1 for all B ∈ ∆ then s(A) = 1.
The above definition relies on the intuition that no matter what our basic concept of a 'model' or interpretation is, sooner or later we have to say whether a formula A 'holds' in it or does not 'hold' in it. Thus the technical 'essence' of a model is a {0, 1} function s (we ignore the possibility of no value). It can be shown that this notion of semantics can characterise any monotonic (syntactical) consequence relation, i.e. any relation |∼ between sets ∆ (including ∆ = ∅) of wffs and wffs A satisfying reflexivity, monotonicity and cut. Thus for any such |∼ there exists an S such that |∼ equals ⊨S . The semantics S can be given further structure, depending on the connectives of L. The simplest is through the binary relation ≤, defined as follows:
• t ≤ s iff (definition) for all wffs A, t(A) ≤ s(A).
Other relations can be defined on S. For example, if the original language is modal we can define:
• tRs iff for all A of L, if t(□A) = 1 then s(A) = 1.
One can then postulate connections between values, such as:
• t(□A) = 1 iff ∀s[tRs ⇒ s(A) = 1]
or, for a language with →:
• t(A → B) = 1 iff ∀s(t ≤ s and s(A) = 1 imply s(B) = 1).
In some logics and their semantics the above may hold. For example, the respective conditions above hold for the modal logic K and for intuitionistic logic. For other logics, further refinements are needed.
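Definition 75 can be sketched directly: a model is just a {0, 1} valuation on formulas, and ⊨S quantifies over a class of such valuations. The toy class S below is our illustration, not from the text:

```python
def consequence(S, delta, A):
    """Scott semantics in the style of Definition 75: Delta |= A iff every
    model s in S making all of Delta true makes A true.  A 'model' is
    just a {0, 1} valuation on formulas, here encoded as a dict."""
    return all(s.get(A, 0) == 1
               for s in S
               if all(s.get(B, 0) == 1 for B in delta))

# a toy semantics: three valuations on the formulas 'p', 'q', 'p->q'
S = [{'p': 1, 'q': 1, 'p->q': 1},
     {'p': 1, 'q': 0, 'p->q': 0},
     {'p': 0, 'q': 0, 'p->q': 1}]
```

With this S, the set {p, p → q} entails q (only the first valuation satisfies both premises), while {p} alone does not; and the consequence relation so defined is reflexive and monotonic by construction, as the text asserts.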
The nature of what is happening here can best be explained through a translation into classical logic. The language L can be considered as a Herbrand universe of terms (i.e. the free algebra based on the atomic propositions, with the connectives acting as function symbols), and the models can be considered as another sort of terms (i.e. the names of the models can be terms). The 'predicate' t(A) = 1 can be considered as a two-sorted predicate Hold(t, A). Thus the reductions above become



• Hold(t, □A) iff ∀s(tRs ⇒ Hold(s, A)), where tRs is ∀B(Hold(t, □B) ⇒ Hold(s, B)).
This condition reduces to
• Hold(t, □A) iff ∀s[∀X(Hold(t, □X) ⇒ Hold(s, X)) ⇒ Hold(s, A)].
This is an internal reduction on Hold. In general we want to define Hold(t, ♯(A1 , . . . , An )) in terms of some relations Ri (x1 , . . . , xni ) on sort t (the first coordinate of Hold) and the predicates Hold(x, Aj ) for the subformulas of ♯(A1 , . . . , An ). The Ri (t1 , . . . , tni ), in turn, are expected to be defined using Hold(ti , Xj ) for some formulas Xj . Thus in predicate logic we have formulas ϕi and Ψ♯ such that:
• Hold(t, ♯(A1 , . . . , An )) iff Ψ♯ (t, Ri , Hold(xi , Aj ))
• Ri (t1 , . . . , tni ) iff (definition) ϕi (t1 , . . . , tni , Hold(tj , Xk )).
Together they imply a possible closure condition on the semantics:
• Hold(t, ♯(A1 , . . . , An )) iff Ψ♯ (t, ϕi (. . . , Hold(tj , Xk )), Hold(xi , Ak ))
which may or may not hold.
REMARK 76 (Representation of algebras). The above considerations can be viewed as a special case of a general set-representation problem for algebras. Let A be an algebra with some function symbols fi satisfying some axioms. Take for example the language of lattices, A = (A, ⊓, ⊔). We ask the following question: can A be represented as an algebra of sets? In other words, is there a set S and a mapping h with h(a) ⊆ S for a ∈ A, and a monadic first-order language L1 on S involving possibly some relation symbols R1 , . . . , Rk on S, such that for all s ∈ S and each function symbol f of the algebra we have the following inductive reduction, for all x1 , . . . , xn ∈ A:
s ∈ h(f (x1 , . . . , xn )) iff ⊨ Ψf (s, h(x1 ), . . . , h(xn ))
where Ψf is a non-monadic wff of L1 involving R1 , . . . , Rk and the subsets h(xj ). If the relations R(t1 , . . . , tm ) on S can be defined using h by some formula ϕR of the algebra (involving the classical connectives, equality and the monadic predicates Ti (x) on the algebra, with Ti (x) meaning ti ∈ h(x)), then
⊨ R(t1 , . . . , tm ) iff A ⊨ ϕR (T1 , . . . , Tm , R).
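Remark 76's representation question has a classical positive answer for distributive lattices: Birkhoff's representation by sets of join-irreducibles, under which meet goes to intersection and join to union. A sketch on one concrete lattice, the divisors of 12 under gcd and lcm (the example is ours, not from the text):

```python
from math import gcd

# A finite distributive lattice: the divisors of 12, with meet = gcd
# and join = lcm.
L = [1, 2, 3, 4, 6, 12]
J = [2, 3, 4]          # the join-irreducible elements of this lattice

def h(x):
    """Represent x as the set of join-irreducibles below it (Birkhoff)."""
    return frozenset(j for j in J if x % j == 0)

def lcm(a, b):
    return a * b // gcd(a, b)

# The inductive reduction of Remark 76, checked for both operations:
# meet is represented by intersection, join by union.
ok = all(h(gcd(a, b)) == h(a) & h(b) and h(lcm(a, b)) == h(a) | h(b)
         for a in L for b in L)
```

For instance h(4) = {2, 4} and h(6) = {2, 3}, and indeed h(gcd(4, 6)) = h(2) = {2} = h(4) ∩ h(6), while h(lcm(4, 6)) = h(12) = {2, 3, 4} = h(4) ∪ h(6).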



1. K1 ⊢ A iff for every m as above am  A Let am be the function satisfying 2. am (A) = 1 iff am  A and let 3. S0 = {am |m as above}. Then we have a semantics S0 ⊆ S (of the language L of modal logic) where Am (A) cannot be reduced to values of s(A) for s ∈ S0 , but can be reduced to values s(A), for s ∈ S, This is so because when we evaluate am  A, we evaluate at points b ∈ S m such that am Rm b and the Kripke structure (S m , Rm b, hm ) is a K structure, but not necessarily a K1 structure, as bRb need not hold. Let bm be the function defined by 4. bm (A) = 1 iff b  A in m. We get 5. am (A) = 1 iff for all s ∈ {bm | am Rbm }, we have s(A) = 1. Let ϕ(a, b) mean as follows:

6. ϕ(a, b) iff (definition) for some m, a = am and b = bm and am Rm bm.

Then K1 is characterised by a designated subset S0 of S and the truth definition:

7. s(□A) = 1 iff for all s′, ϕ(s, s′) implies s′(A) = 1.

8. A ⊨ B iff for all s ∈ S0, s(A) = 1 implies s(B) = 1.

We are now ready to say what it means to give technical semantics to a consequence relation |∼.
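The behaviour of K1 described in items 1–5 can be checked on a small finite model. The following Python sketch is our own illustration (the world names, the atom p and the valuation are assumptions, not from the text): the actual world a is reflexive, so instances of □A → A hold at a, yet □(□p → p) fails at a because the accessible world b need not be reflexive.

```python
# A finite Kripke model illustrating K1 (K plus the schema □A → A, without
# necessitation): only the actual world 'a' is required to see itself.
worlds = ['a', 'b', 'c']
R = {('a', 'a'), ('a', 'b'), ('b', 'c')}   # b is not reflexive
val = {'a': True, 'b': False, 'c': True}   # valuation of the atom p

def successors(w):
    return [v for (u, v) in R if u == w]

def holds(w, formula):
    # formulas: 'p', ('box', F), ('imp', F, G)
    if formula == 'p':
        return val[w]
    if formula[0] == 'box':
        return all(holds(v, formula[1]) for v in successors(w))
    if formula[0] == 'imp':
        return (not holds(w, formula[1])) or holds(w, formula[2])

T = ('imp', ('box', 'p'), 'p')   # the schema instance □p → p

print(holds('a', T))             # True: a is reflexive, so □p → p holds at a
print(holds('a', ('box', T)))    # False: at b, □p holds (its only successor c
                                 # satisfies p) yet p fails, so □(□p → p) fails
```

This is precisely the phenomenon noted above: evaluating am ⊨ □A takes us to points that are K worlds but not K1 worlds.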

DEFINITION 78 (What is a semantics for |∼). Let |∼ be a consequence relation (reflexive and transitive) in a language with connectives. Then a semantics for |∼ is any set-theoretic representation (in the sense of Remark 76) of the free term algebra based on |∼.

The previous definition does not take account of Remark 77. If we want a better concept of what semantics is, we need to talk about fibred semantics and label-dependent connectives. These topics are addressed in [Gabbay, 1996].


We have, incrementally, gone through several notions of ‘what is a logical system’ and ended up with a concept of a logic that is very far from the traditional one. In artificial intelligence circles, what we call a ‘logic’ is perceived as an ‘agent’ or ‘intelligent agent’. This is no accident. Whereas traditional logical systems


Dov M. Gabbay

(classical logic, intuitionistic logic, linear logic) model mathematical reasoning and mathematical proof, our new concept of logic attempts to model, and stay tuned to, human practical reasoning. What we tried to do is to observe what features and mechanisms are at play in human practical reasoning, and proceed to formalise them. The systems emerging from this formalisation we accept as the new ‘logics’. It is therefore no surprise that in AI circles such systems are perceived as intelligent agents. However, compared with AI, our motives are different. We are looking for general logical principles of human reasoning and not necessarily seeking to build practical applied systems.

There is one more point to make before we can close this chapter. The above ‘logics’ manipulate formulas, algebraic terms and, in general, syntactical symbols. We have maintained already in 1988 [Gabbay and Reyle, 1994] that deduction is a form of stylised movement, which can be carried out directly on natural objects from an application area. Thus ‘logic’ can be done not only on syntactical formulas, but on any set of structured objects naturally residing in some application area. To reason about gardening, for example, we can either represent the area in some language and manipulate the syntax in some logic, or we can directly manipulate and move the plants themselves and ‘show’ the conclusion. The style of movement is the ‘logic’. This concept of logic as movement is clearly apparent in automated reasoning. Different kinds of ‘shuffling’ licensed by a theorem prover can lead to different ‘logics’, because then different sets of theorems become provable. Our insight was that similar movements can be applied directly to the objects of the application areas, and therefore reasoning can be achieved directly in the application area without formalisation. This philosophy has been carried out on Discourse Representation Structures in [Gabbay and Reyle, 1994].
Our approach is compatible with the more mixed approach in the contribution by Barwise and Hammer in [Gabbay, 1994a]. We have summarised, with many examples, the various means and mechanisms, syntactical, semantical and algorithmic, for defining a logic. In any application case study in need of logical formalisation, we can construct a suitable logic for the application by drawing on and interleaving these available means and mechanisms. The following further points (I am indebted to Johan van Benthem for his valuable comments) need to be addressed:

• Since our notion of a logical system has changed and evolved, the traditional notions associated with logical systems also need to change. Notions like interpolation, the notion of cut, the notions of expansion, revision, etc., all have to be redefined and studied for the new logical systems. This has been studied in [Gabbay, 1996; Gabbay, 1999]. In particular we need to study the question of when two logical systems are equivalent, what the appropriate notions of translation are, and so on. Indeed, one could wonder generally what sort of abstract model theory (or abstract proof theory) might make sense here. For instance, could there be generalized Lindström theorems capturing our richer notions of first-order logic as a system?



• Architecture: How do different logical systems work together and combine? We have studied this notion of fibring in [Gabbay, 1999], but we feel further systematic study is needed.

• Representation: Can one logical system have different natural grain levels? Can/should there be an account of ‘zooming in’ and ‘zooming out’ between such levels? This also seems connected to the following: we emphasized the role of ‘presentation’ in a richer view of a logical system. Would not just the proof engine but also the choice of the system language itself (finer or coarser) be an essential parameter of this? Our presentation in this paper is implicitly saying yes.

• Our notion of a logical system seems largely (though not exclusively) inference-driven. This might be seen as a bias by semantics-oriented people who want a semantics to provide a deeper explanation of why particular packages of proof rules (among the multitude of combinatorial possibilities) make particular sense. It is true that using labels (in Labelled Deductive Systems) we can bring semantics into the syntax, as we have seen [Gabbay, 1996]. But should we blur the distinction or continue to maintain it?

• As we enrich our notion of a system (e.g., with ‘approximations’ for bounded agents) we hit an issue (also confronted by Johan van Benthem): are we really describing “agents using systems” rather than the systems themselves? I have said in print on several past occasions that the ultimate logical system is what is in the head of a living human agent. Is a logical system then, according to my view, the same as what other people call (a formalisation of) an agent? I now think that it is not, in which case what is an agent? I am addressing this problem now together with Michael Luck. It will take some time to figure out.
Note the daring thought that, according to the equational approach to logic, logics can be characterised by equations; so since an agent can be characterised by a logic, an agent can be characterised by a system of equations, just like a particle in mechanics. We are going to check this thought. By the way, Johan van Benthem has also come up against this question, of the connection between logic and agents, in the context of complexity of logical systems. He is exploring the idea that we should not be thinking so much of finding new decidable fragments of logical systems, but rather of simple agents using only tractable parts of larger systems. In other words: would it not make sense to separate system and agent? This is connected with our Section 5; see reference [D’Agostino and Gabbay, 2014]. See also [Gabbay and Woods, 2008], where we show how an agent with very limited resources would naturally come up with Donald Nute’s system of defeasible reasoning.

• Computational issues. Our motivation for enriching the notion of a logical system comes from computation, but we say little about the computational
aspects of richer logical systems in our sense, such as the computational complexity of (old and new) key tasks associated with them. This needs to be studied. There is, however, an important point about our approach which the reader must keep in mind. We develop our notion bottom-up. For any application area, the logic consumer/practitioner can add incrementally to his logic the additional features he needs. So complexity issues become more understandable and manageable.

• One main piece of evidence for our view is the current variety of logical systems in practice. There are two reasons for this. The first is that different systems arise from different applications, and that practitioners/consumers of logic just build what they need locally. The second reason comes from philosophy, where different philosophical motivations and views give rise to different systems. My view is that we should not seek the one true all-encompassing logical system, but that part of the notion of logic is how to combine different systems and have them work together. So any new logical system should have as part of its nature procedures for working with other systems. I note that Johan van Benthem, for example, while in great sympathy with my view, does raise the issue of whether some more fundamental level of motivation exists as well, perhaps for mathematical system reasons.

• Our emphasis is ‘what is a logical system’. Actually, one might just as well say that our topic in this paper is really a more ambitious one: “what is logic?” I just do not want to go this way at this stage, before we figure out “what is an agent” and get a better understanding of the philosophical and psychological literature on what is logic.

ACKNOWLEDGEMENTS

I am grateful to Arnon Avron, Johan van Benthem, Marcello D’Agostino, Jörg Siekmann and Leon van der Torre for valuable comments on this paper.

BIBLIOGRAPHY

[Abraham et al., 2011] M. Abraham, I. Belfer, D. Gabbay, and U. Schild.
Delegation, count as and security in Talmudic logic, a preliminary study. In Logic without Frontiers: Festschrift for Walter Alexandre Carnielli on the Occasion of his 60th Birthday, Jean-Yves Béziau and Marcelo Esteban Coniglio, eds., pp. 73–96. Volume 17 of Tribute Series, College Publications, London, 2011.
[Alchourrón et al., 1985] C. E. Alchourrón, P. Gärdenfors, and D. Makinson. On the logic of theory change: partial meet contraction and revision functions. Journal of Symbolic Logic, 50, 510–530, 1985.
[Allwein and Barwise, 1996] G. Allwein and J. Barwise. Logical Reasoning with Diagrams. Studies in Logic and Computation, 1996.
[Arieli et al., 2011] O. Arieli, A. Avron and A. Zamansky. Ideal paraconsistent logics. Studia Logica, 99(1–3), 31–60, 2011.



[Avron, 2014] A. Avron. Paraconsistency, paracompleteness, Gentzen systems, and trivalent semantics. Journal of Applied Non-Classical Logics, DOI: 10.1080/11663081.2014.911515.
[Avron and Lev, 2005] A. Avron and I. Lev. Non-deterministic multi-valued structures. Journal of Logic and Computation, 15:241–261, 2005.
[Avron and Zamansky, 2011] A. Avron and A. Zamansky. Non-deterministic semantics for logical systems: a survey. In Handbook of Philosophical Logic, D. Gabbay and F. Guenthner, eds., Vol. 16, pp. 227–304, Kluwer, 2011.
[Barringer et al., 2009] H. Barringer, D. M. Gabbay, and D. Rydeheard. Reactive grammars. In N. Dershowitz and E. Nissan, editors, Language, Culture, Computation, LNCS. Springer, 2009. In honour of Yakov Choueka, to appear.
[Barringer et al., 2010] H. Barringer, K. Havelund and D. Rydeheard. Rule systems for run-time monitoring: from Eagle to RuleR (extended version). Journal of Logic and Computation, Oxford University Press, 20(3): 675–706, 2010.
[Barringer et al., 2012] H. Barringer, D. Gabbay and J. Woods. Temporal, numerical and meta-level dynamics in argumentation networks. Argument and Computation, 3(2–3), 143–202, 2012.
[Barwise and Seligman, 2008] J. Barwise and J. Seligman. Information Flow: The Logic of Distributed Systems. Cambridge Tracts in Theoretical Computer Science, 2008.
[Barwise and Perry, 1983] J. Barwise and J. Perry. Situations and Attitudes. A Bradford Book/MIT Press, 1983.
[Batens et al., 1999] D. Batens, K. De Clercq and N. Kurtonina. Embedding and interpolation for some paralogics. The propositional case. Reports on Mathematical Logic, 33:29–44, 1999.
[van Benthem, 2005] J. van Benthem. An essay on sabotage and obstruction. In D. Hutter and W. Stephan, editors, Mechanizing Mathematical Reasoning, volume 2605 of Lecture Notes in Computer Science, pages 268–276. Springer, 2005.
[van Benthem et al., 2009] J. van Benthem, G. Heinzmann, M. Rebuschi, eds.
The Age of Alternative Logics: Assessing Philosophy of Logic and Mathematics Today (Logic, Epistemology, and the Unity of Science), Springer, 2009.
[van Benthem, 2014] J. van Benthem. Logic in Games, MIT Press, 2014.
[van Benthem et al., 2011] J. van Benthem, A. Gupta and R. Parikh, eds. Proof, Computation and Agency: Logic at the Crossroads, Springer, 2011.
[van Benthem, 2011a] J. van Benthem. Logical Dynamics of Information and Interaction. Cambridge University Press, 2011.
[Beziau, 1999] J. Y. Beziau. Classical negation can be expressed by one of its halves. Logic Journal of the IGPL, 7:145–151, 1999.
[Boole, 1847] G. Boole. The Mathematical Analysis of Logic, Cambridge and London, 1847.
[Brown, 2012] F. Brown. Boolean Reasoning: The Logic of Boolean Equations. Dover, 2012.
[Cadoli and Schaerf, 1992] M. Cadoli and M. Schaerf. Approximate reasoning and non-omniscient agents. In TARK ’92: Proceedings of the 4th Conference on Theoretical Aspects of Reasoning about Knowledge, pages 169–183, San Francisco, CA, USA, 1992. Morgan Kaufmann Publishers Inc.
[Caminada and Amgoud, 2007] M. Caminada and L. Amgoud. On the evaluation of argumentation formalisms. Artificial Intelligence, 171(5–6):286–310, 2007.
[Console et al., 1991] L. Console, D. T. Dupre, and P. Torasso. On the relationship between deduction and abduction. Journal of Logic and Computation, 1, 661–690, 1991.
[Couturat, 1914] L. Couturat. The Algebra of Logic. Open Court, 1914.
[Crawford and Etherington, 1998] J. M. Crawford and D. W. Etherington. A non-deterministic semantics for tractable inference. In AAAI/IAAI, pages 286–291, 1998.
[Crochemore and Gabbay, 2011] M. Crochemore and D. M. Gabbay. Reactive automata. Information and Computation, 209(4), 692–704, 2011. DOI: 10.1016/j.ic.2011.01.002.
[Cross, 1999] Sir Rupert Cross. On Evidence. Butterworth, 1999.
[D’Agostino and Floridi, 2009] M. D’Agostino and L. Floridi.
The enduring scandal of deduction: is propositional logic really uninformative? Synthese, 167:271–315, 2009.
[D’Agostino et al., 2013] M. D’Agostino, M. Finger and D. M. Gabbay. Semantics and proof theory of depth-bounded Boolean logics. Theoretical Computer Science, 480:43–68, 2013.
[D’Agostino and Gabbay, 2014] M. D’Agostino and D. M. Gabbay. Feasible Deduction for Realistic Agents, College Publications, 2014.



[Dalal, 1996] M. Dalal. Anytime families of tractable propositional reasoners. In Proceedings of the Fourth International Symposium on AI and Mathematics (AI/MATH-96), pages 42–45, 1996.
[Dalal, 1998] M. Dalal. Anytime families of tractable propositional reasoners. Annals of Mathematics and Artificial Intelligence, 22: 297–318, 1998.
[Dennis, 1992] H. Dennis. Law of Evidence, Sweet and Maxwell, 1992.
[Finger, 2004] M. Finger. Polynomial approximations of full propositional logic via limited bivalence. In 9th European Conference on Logics in Artificial Intelligence (JELIA 2004), volume 3229 of Lecture Notes in Artificial Intelligence, pages 526–538. Springer, 2004.
[Finger, 2004b] M. Finger. Towards polynomial approximations of full propositional logic. In A. L. C. Bazzan and S. Labidi, editors, XVII Brazilian Symposium on Artificial Intelligence (SBIA 2004), volume 3171 of Lecture Notes in Artificial Intelligence, pages 11–20. Springer, 2004.
[Finger and Gabbay, 2006] M. Finger and D. M. Gabbay. Cut and pay. Journal of Logic, Language and Information, 15(3):195–218, 2006.
[Finger and Wassermann, 2004] M. Finger and R. Wassermann. Approximate and limited reasoning: semantics, proof theory, expressivity and control. Journal of Logic and Computation, 14(2):179–204, 2004.
[Finger and Wassermann, 2006] M. Finger and R. Wassermann. The universe of propositional approximations. Theoretical Computer Science, 355(2):153–166, 2006.
[Gabbay, 1969] D. Gabbay. Semantic proof of the Craig interpolation theorem for intuitionistic logic. In Logic Colloquium ’69, pp. 391–410, North-Holland, 1969.
[Gabbay, 1976] D. Gabbay. Semantical Investigations in Modal and Temporal Logics. D. Reidel, 1976.
[Gabbay, 1985] D. Gabbay. Theoretical foundations for non-monotonic reasoning in expert systems. In K. Apt, editor, Logics and Models of Concurrent Systems, pp. 439–459, Springer-Verlag, Berlin, 1985.
[Gabbay, 1986] D. Gabbay. Investigations in Heyting Intuitionistic Logic. D. Reidel, 1986.
[Gabbay, 1992] D. Gabbay. Theory of algorithmic proof. In Handbook of Logic in Theoretical Computer Science, Volume 1, S. Abramsky, D. Gabbay and T. Maibaum, eds., pp. 307–408. Oxford University Press, 1992.
[Gabbay, 1993] D. Gabbay. Classical vs. non-classical logic. In D. M. Gabbay, C. J. Hogger and J. A. Robinson, eds., Handbook of Logic in AI and Logic Programming, Volume 1, pp. 349–489. Oxford University Press, 1993.
[Gabbay, 1993a] D. Gabbay. General theory of structured consequence relations. In Substructural Logics, K. Došen and P. Schröder-Heister, eds., pp. 109–151. Studies in Logic and Computation, Oxford University Press, 1993.
[Gabbay, 1994a] D. M. Gabbay. What is a Logical System?, Oxford University Press, 1994.
[Gabbay, 1996] D. M. Gabbay. Labelled Deductive Systems, Vol. 1. Oxford University Press, 1996.
[Gabbay, 1998] D. M. Gabbay. Elementary Logic. Prentice Hall, 1998.
[Gabbay, 1999] D. M. Gabbay. Fibring Logics, Oxford University Press, 1999.
[Gabbay, 2009] D. Gabbay. Semantics for higher level attacks in extended argumentation frames. Part 1: overview. Studia Logica, 93: 355–379, 2009.
[Gabbay, 2012] D. Gabbay. An equational approach to argumentation networks. Argument and Computation, 3(2–3), pp. 87–142, 2012.
[Gabbay, 2012b] D. Gabbay. Reactive Beth tableaux for modal logic. Annals of Mathematics and Artificial Intelligence, 66(1–4), 55–79, 2012.
[Gabbay, 2012a] D. M. Gabbay. Equational approach to default logic. In preparation, 90pp, 2012.
[Gabbay, 2012b] D. Gabbay. Bipolar argumentation frames and contrary-to-duty obligations, a position paper. In Proceedings of CLIMA 2012, M. Fisher et al., eds., pp. 1–24. LNAI 7486, Springer, 2012.
[Gabbay, 2012c] D. Gabbay. Temporal deontic logic for the generalised Chisholm set of contrary-to-duty obligations. In T. Agotnes, J. Broersen, and D. Elgesem, eds., DEON 2012, LNAI 7393, pp. 91–107. Springer, Heidelberg, 2012.
[Gabbay, 2013] D. Gabbay. Reactive Kripke Semantics, Theory and Applications. Springer, 2013, 450pp.
[Gabbay, 2013a] D. Gabbay.
Meta-Logical Investigations in Argumentation Networks. Research monograph, College Publications, 2013, 770 pp.



[Gabbay et al., 1993] D. Gabbay, A. Martelli, L. Giordano and N. Olivetti. Conditional logic programming. Technical report, University of Turin, 1993. Proceedings of ICLP ’94, MIT Press.
[Gabbay and Hunter, 1991] D. M. Gabbay and A. Hunter. Making inconsistency respectable, part I. In Proceedings of Fundamentals of Artificial Intelligence Research (FAIR ’91), Ph. Jorrand and J. Kelemen, eds., pp. 19–32. Vol. 535 of LNAI, Springer Verlag, 1991.
[Gabbay and Hunter, 1993] D. M. Gabbay and A. Hunter. Making inconsistency respectable, part II. In Proceedings of the European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty, M. Clarke, R. Kruse and S. Moral, eds., pp. 129–136. Vol. 747 of LNCS, Springer Verlag, 1993.
[Gabbay and Marcelino, 2009] D. Gabbay and S. Marcelino. Modal logics of reactive frames. Studia Logica, 93, 403–444, 2009.
[Gabbay and Olivetti, 2000] D. Gabbay and N. Olivetti. Goal Directed Algorithmic Proof Theory, Kluwer Academic Publishers, 2000, 266pp.
[Gabbay and Reyle, 1994] D. Gabbay and U. Reyle. Direct deductive computation on discourse representation structures. Technical report, University of Stuttgart, 1988. Linguistics and Philosophy, 17, 345–390, 1994.
[Gabbay and Rodrigues, 2012] D. Gabbay and O. Rodrigues. Voting and fuzzy argumentation networks, submitted to Journal of Approximate Reasoning. Short version to appear in Proceedings of CLIMA 2012, Springer. New revised version now entitled: Equilibrium States on Numerical Argumentation Networks.
[Gabbay and Woods, 2001] D. M. Gabbay and J. Woods. The new logic. Logic Journal of the IGPL, 9(2):141–174, 2001.
[Gabbay and Woods, 2003] D. M. Gabbay and J. Woods. The law of evidence and labelled deductive systems. Phi-News, 4, 5–46, October 2003. Also Chapter 15 in D. M. Gabbay et al., eds., Approaches to Legal Rationality, Logic, Epistemology and the Unity of Science 20, pp. 295–331, Springer, 2010.
[Gabbay and Woods, 2008] D. Gabbay and J. Woods.
Resource origins of non-monotonicity. Studia Logica, 88, pp. 85–112, 2008.
[Hunter, 2010] A. Hunter. Base logics in argumentation. In Proceedings of COMMA 2010, pp. 275–286.
[Kowalski, 1979] R. A. Kowalski. Logic for Problem Solving. North-Holland, 1979.
[Kraus et al., 1990] S. Kraus, D. Lehmann, and M. Magidor. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence, 44, 167–207, 1990.
[Lehmann and Magidor, 1992] D. Lehmann and M. Magidor. What does a conditional knowledge base entail? Artificial Intelligence, 55, 1–60, 1992.
[Makinson, 1989] D. Makinson. General theory of cumulative inference. In M. Reinfrank et al., eds., Non-monotonic Reasoning, volume 346 of Lecture Notes in Artificial Intelligence, pp. 1–18, Springer-Verlag, Berlin, 1989.
[Makinson and van der Torre, 2000] D. Makinson and L. van der Torre. Input/output logics. Journal of Philosophical Logic, 29(4): 383–408, 2000.
[Makinson and van der Torre, 2001] D. Makinson and L. van der Torre. Constraints for input/output logics. Journal of Philosophical Logic, 30(2): 155–185, 2001.
[Makinson and van der Torre, 2001b] D. Makinson and L. van der Torre. What is input/output logic? ESSLLI, 2001.
[Metcalfe et al., 2008] G. Metcalfe, N. Olivetti and D. Gabbay. Proof Theory for Fuzzy Logics. Monograph, Springer, 2008.
[Nute and Cross, 2002] D. Nute and C. Cross. Conditional logic. In Handbook of Philosophical Logic, D. Gabbay and F. Guenthner, eds., pp. 1–98, Volume 4, 2nd edition. Kluwer, 2002.
[Parent et al., 2014] X. Parent, D. Gabbay and L. van der Torre. Intuitionistic basis for input/output logic. In David Makinson on Classical Methods to Non-Classical Problems, S. O. Hansson, ed., pp. 263–286. Springer, 2014.
[Rohde, 2004] P. Rohde. Moving in a crumbling network: the balanced case. In J. Marcinkowski and A. Tarlecki, eds., CSL 2004, LNCS, pp. 1–25, Springer.
[Schröder, 1890–1905] E. Schröder. Vorlesungen über die Algebra der Logik, 3 vols. B. G.
Teubner, Leipzig, 1890–1905. Reprints, Chelsea, 1996; Thoemmes Press, 2000.
[Scott, 1974] D. Scott. Completeness and axiomatizability in many-valued logics. In Proceedings of the Tarski Symposium, pp. 411–436, American Mathematical Society, Providence, Rhode Island, 1974.
[Shapiro, 2014] S. Shapiro. Varieties of Logic, Oxford University Press, 2014.



[Sheeran and Stålmarck, 2000] M. Sheeran and G. Stålmarck. A tutorial on Stålmarck’s proof procedure for propositional logic. Formal Methods in System Design, 16:23–58, 2000.
[Tarski, 1936] A. Tarski. On the concept of logical consequence. In Logic, Semantics, Metamathematics. Oxford University Press, 1936.
[Varzi, 1999] A. C. Varzi, ed. The Nature of Logic. European Review of Philosophy, Volume 4, CSLI Publications, 1999.
[Wójcicki, 1988] R. Wójcicki. Theory of Logical Calculi. Reidel, Dordrecht, 1988.
[Wójcicki, 1988] R. Wójcicki. An axiomatic treatment of non-monotonic arguments. Bulletin of the Section of Logic, 17(2): 56–61, 1988.

HISTORY OF INTERACTIVE THEOREM PROVING

John Harrison, Josef Urban and Freek Wiedijk

Reader: Lawrence C. Paulson


By interactive theorem proving, we mean some arrangement where the machine and a human user work together interactively to produce a formal proof. There is a wide spectrum of possibilities. At one extreme, the computer may act merely as a checker on a detailed formal proof produced by a human; at the other the prover may be highly automated and powerful, while nevertheless being subject to some degree of human guidance. In view of the practical limitations of pure automation, it seems today that, whether one likes it or not, interactive proof is likely to be the only way to formalize most non-trivial theorems in mathematics or computer system correctness. Almost all the earliest work on computer-assisted proof in the 1950s [Davis, 1957; Gilmore, 1960; Davis and Putnam, 1960; Wang, 1960; Prawitz et al., 1960] and 1960s [Robinson, 1965; Maslov, 1964; Loveland, 1968] was devoted to truly automated theorem proving, in the sense that the machine was supposed to prove assertions fully automatically. It is true that there was still a considerable diversity of methods, with some researchers pursuing AI-style approaches [Newell and Simon, 1956; Gelernter, 1959; Bledsoe, 1984] rather than the dominant theme of automated proof search, and that the proof search programs were often highly tunable by setting a complicated array of parameters. As described by Dick [2011], the designers of automated systems would often study the details of runs and tune the systems accordingly, leading to a continuous process of improvement and understanding that could in a very general sense be considered interactive. Nevertheless, this is not quite what we understand by interactive theorem proving today. Serious interest in a more interactive arrangement where the human actively guides the proof started somewhat later. On the face of it, this is surprising, as full automation seems a much more difficult problem than supporting human-guided proof.
But in an age when excitement about the potential of artificial intelligence was widespread, mere proof-checking might have seemed dull. In any case it’s not so clear that it is really so much easier as a research agenda, especially in the context of the technology of the time.

Handbook of the History of Logic. Volume 9: Computational Logic. Volume editor: Jörg Siekmann. Series editors: Dov M. Gabbay and John Woods. Copyright © 2014 Elsevier BV. All rights reserved.

In order to guide a machine proof, there needs to be a language for the user to communicate that proof to the machine, and designing an effective and convenient language is non-trivial, still a topic of active research to this day. Moreover, early computers were typically batch-oriented, often with very limited facilities for interaction. In the worst case one might submit a job to be executed overnight on a mainframe, only to find the next day that it failed because of a trivial syntactic error. The increasing availability of interactive time-sharing computer operating systems in the 1960s, and later the rise of minicomputers and personal workstations, was surely a valuable enabler for the development of interactive theorem proving. However, we use the phrase interactive theorem proving to distinguish it from purely automated theorem proving, without supposing any particular style of human-computer interaction. Indeed the influential proof-checking system Mizar, described later, maintains to this day a batch-oriented style where proof scripts are checked in their entirety per run. In any case, perhaps the most powerful driver of interactive theorem proving was not so much technology, but simply the recognition that after a flurry of activity in automated proving, with waves of new ideas like unification that greatly increased their power, the capabilities of purely automated systems were beginning to plateau. Indeed, at least one pioneer clearly had automated proving in mind only as a way of filling in the details of a human-provided proof outline, not as a way of proving substantial theorems unaided [Wang, 1960]:

    The original aim of the writer was to take mathematical textbooks such as Landau on the number system, Hardy-Wright on number theory, Hardy on the calculus, Veblen-Young on projective geometry, the volumes by Bourbaki, as outlines and make the machine formalize all the proofs (fill in the gaps).
and the idea of proof checking was also emphasized by McCarthy [1961]:

    Checking mathematical proofs is potentially one of the most interesting and useful applications of automatic computers. Computers can check not only the proofs of new mathematical theorems but also proofs that complex engineering systems and computer programs meet their specifications. Proofs to be checked by computer may be briefer and easier to write than the informal proofs acceptable to mathematicians. This is because the computer can be asked to do much more work to check each step than a human is willing to do, and this permits longer and fewer steps. [. . . ] The combination of proof-checking techniques with proof-finding heuristics will permit mathematicians to try out ideas for proofs that are still quite vague and may speed up mathematical research.

McCarthy’s emphasis on the potential importance of applications to program verification may well have helped to shift the emphasis away from purely automatic theorem proving programs to interactive arrangements that could be of more immediate help in such work.

A pioneering implementation of an interactive theorem prover in the modern sense was the Proofchecker program developed by Paul Abrahams [1963]. While Abrahams hardly succeeded in the ambitious goal of ‘verification of textbook proofs, i.e. proofs resembling those that normally appear in mathematical textbooks and journals’, he was able to prove a number of theorems from Principia Mathematica [Whitehead and Russell, 1910]. He also introduced in embryonic form many ideas that became significant later: a kind of macro facility for derived inference rules, and the integration of calculational derivations as well as natural deduction rules. Another interesting early proof-checking effort [Bledsoe and Gilbert, 1967] was inspired by Bledsoe’s interest in formalizing the already unusually formal proofs in his PhD adviser A. P. Morse’s ‘Set Theory’ [Morse, 1965]; a flyer for a conference devoted to this research agenda is shown in Figure 1. We shall have more to say about Bledsoe’s influence on our field later.

Figure 1: Proof-checking project for Morse’s ‘Set Theory’

Perhaps the earliest sustained research program in interactive theorem proving was the development of the SAM (Semi-Automated Mathematics) family of provers. This evolved over several years starting with SAM I, a relatively simple prover for natural deduction proofs in propositional logic. Subsequent members of the family supported more general logical formulas, had increasingly powerful reasoning systems and made the input-output process ever more convenient and accessible, with SAM V first making use of the then-modern CRT (cathode ray tube) displays. The provers were applied in a number of fields, and SAM V was used in 1966 to construct a proof of a hitherto unproven conjecture in lattice theory [Bumcrot, 1965], now called ‘SAM’s Lemma’. The description of SAM explicitly describes interactive theorem proving in the modern sense [Guard et al., 1969]:

    Semi-automated mathematics is an approach to theorem-proving which seeks to combine automatic logic routines with ordinary proof procedures in such a manner that the resulting procedure is both efficient and subject to human intervention in the form of control and guidance. Because it makes the mathematician an essential factor in the quest to establish theorems, this approach is a departure from the usual theorem-proving attempts in which the computer unaided seeks to establish proofs.

Since the pioneering SAM work, there has been an explosion of activity in the area of interactive theorem proving, with the development of innumerable different systems; a few of the more significant contemporary ones are surveyed by Wiedijk [2006]. Despite this, it is difficult to find a general overview of the field, and one of the goals of this chapter is to present clearly some of the most influential threads of work that have led to the systems of today. It should be said at the outset that we focus on the systems we consider to have been seminal in the introduction or first systematic exploitation of certain key ideas, regardless of those systems’ present-day status.
The relative space allocated to particular provers should not be taken as indicative of any opinions about their present value as systems. After our survey of these different provers, we then present a more thematic discussion of some of the key ideas that were developed, and the topics that animate research in the field today. Needless to say, the development of automated theorem provers has continued apace in parallel. The traditional ideas of first-order proof search and equational reasoning [Knuth and Bendix, 1970] have been developed and refined into powerful tools that have achieved notable successes in some areas [McCune, 1997; McCune and Padmanabhan, 1996]. The formerly neglected area of propositional tautology and satisfiability checking (SAT) underwent a dramatic revival, with systems in the established Davis-Putnam tradition making great strides in efficiency [Moskewicz et al., 2001; Goldberg and Novikov, 2002; Eén and Sörensson, 2003], other algorithms being developed [Bryant, 1986; Stålmarck and Säflund, 1990], and applications to new and sometimes surprising areas appearing. For verification applications in particular, a quantifier-free combination of first-order theories [Nelson and Oppen, 1979; Shostak, 1984] has proven to be especially valuable and has led to the current SMT (satisfiability modulo theories) solvers. Some

History of Interactive Theorem Proving


more domain-specific automated algorithms have proven to be highly effective in areas like geometry and ideal theory [Wu, 1978; Chou, 1988; Buchberger, 1965], hypergeometric summation [Petkovšek et al., 1996] and the analysis of finite-state systems [Clarke and Emerson, 1981; Queille and Sifakis, 1982; Burch et al., 1992; Seger and Bryant, 1995], the last-mentioned (model checking) being of great value in many system verification applications. Indeed, some researchers reacted to the limitations of automation not by redirecting their energy away from the area, but by attempting to combine different techniques into more powerful AI-inspired frameworks like MKRP [Eisinger and Ohlbach, 1986] and Ωmega [Huang et al., 1994]. Opinions on the relative values of automation and interaction differ greatly. To those familiar with highly efficient automated approaches, the painstaking use of interactive provers can seem lamentably clumsy and impractical by comparison. On the other hand, attacking problems that are barely within reach of automated methods (typically for reasons of time and space complexity) often requires prodigious runtime and/or heroic efforts of tuning and optimization, time and effort that might more productively be spent by simple problem reduction using an interactive prover. Despite important exceptions, the clear intellectual center of gravity of automated theorem proving has been the USA, while for interactive theorem proving it has been Europe. It is therefore tempting to fit such preferences into stereotypical national characteristics, in particular the relative importance attached to efficiently automatable industrial processes versus the painstaking labor of the artisan.
Such speculations aside, in recent years we have seen something of a rapprochement: automated tools have been equipped with more sophisticated control languages [de Moura and Passmore, 2013], while interactive provers are incorporating many of the ideas behind automated systems or even using the tools themselves as components — we will later describe some of the methodological issues that arise from such combinations. Even today, we are still striving towards the optimal combination of human and machine that the pioneers anticipated 50 years ago.

2 AUTOMATH


Automath is arguably the earliest interactive theorem prover that started a tradition of systems which continues to this day. It was the first program that used the Curry-Howard isomorphism for the encoding of proofs. There are actually two variants of the Curry-Howard approach [Geuvers and Barendsen, 1999]: one in which a formula is represented by a type, and one in which formulas are not themselves types, but each formula has an associated type of proof objects of that formula. (The popular slogan 'formulas as types' only applies to the first variant, while the better slogan 'proofs as objects' applies to both.) The first approach is used by modern systems like Coq, Agda and NuPRL. The second approach is used in the LF framework [Harper et al., 1987], and was also the one used in the Automath systems. The idea of the Curry-Howard isomorphism in



either style is that the type of 'proof objects' associated with a formula is non-empty exactly when that formula is true. As an example, here is an Automath text that introduces implication and the two natural deduction rules for this connective (this text appears almost verbatim on pp. 23-24 of [de Bruijn, 1968b]). The other connectives of first order logic are handled analogously.

           *  bool   := PN  : TYPE
           *  b      := --- : bool
         b *  TRUE   := PN  : TYPE
         b *  c      := --- : bool
         c *  impl   := PN  : bool
         c *  asp1   := --- : TRUE(b)
      asp1 *  asp2   := --- : TRUE(impl)
      asp2 *  modpon := PN  : TRUE(c)
         c *  asp4   := --- : [x,TRUE(b)]TRUE(c)
      asp4 *  axiom  := PN  : TRUE(impl)

This code first introduces (axiomatically: PN abbreviates 'Primitive Notion') a type for the formulas of the logic called bool¹, and for every such formula b a type TRUE(b) of the 'proof objects' of that formula. The --- notation extends a context with a variable, where contexts are named by the last variable, and are indicated before the * in each line. Next, it introduces a function impl(b,c) that represents the implication b ⇒ c. Furthermore, it encodes the Modus Ponens rule

    b    b ⇒ c
    ──────────
        c

using the function modpon. If asp1 is a 'proof object' of type TRUE(b) and asp2 is a 'proof object' of type TRUE(impl(b,c)), then the 'proof term' modpon(b,c,asp1,asp2) denotes a 'proof object' of type TRUE(c). This term represents the syntax of the proof in first order logic using higher order abstract syntax. Finally, the rule

      b
      ⋮
      c
    ─────
    b ⇒ c

is encoded by the function axiom. If asp4 is a 'function' that maps 'proof objects' of type TRUE(b) to those of type TRUE(c), then axiom(b,c,asp4) is a 'proof object' of type TRUE(impl(b,c)).

¹ For a modern type theorist bool will be a strange choice for this name. However, in HOL the same name is used for the type of formulas (which shows that HOL is a classical system).



This Automath code corresponds directly to the modern typing judgments:

    bool : ∗
    TRUE : bool → ∗
    impl : bool → bool → bool
    modpon : Πb : bool. Πc : bool. TRUE b → TRUE (impl b c) → TRUE c
    axiom : Πb : bool. Πc : bool. (TRUE b → TRUE c) → TRUE (impl b c)

The way one codes logic in LF style today is still exactly the same as it was in the sixties when Automath was first designed. Note that in this example, the proof of p ⇒ p is encoded by the term

    axiom p p (λH : TRUE(p). H)

which has type TRUE (impl p p). In the 'direct' Curry-Howard style of Coq, Agda and NuPRL, p is itself a type, and the term encoding the proof of p ⇒ p becomes simply λH : p. H, which has type p → p. Another difference between Automath and the modern type theoretical systems is that in Automath the logic and basic axioms have to be introduced axiomatically (as PN lines), while in Coq, Agda and NuPRL these are given by an 'inductive types' definitional package, and as such are defined using the type theory of the system.

The earliest publication about Automath is technical report number 68-WSK05 from the Technical University in Eindhoven, dated November 1968 [de Bruijn, 1968a]. At that time de Bruijn was already a very successful mathematician: he had been a full professor of mathematics for sixteen years (first in Amsterdam and then in Eindhoven), and was fifty years old. The report states that Automath had been developed in the years 1967-1968. Two other people were already involved at that time: Jutting as a first 'user', and both Jutting and van Bree as programmers who helped with the first implementations of the language. These implementations were written in a variant of the Algol programming language (probably Algol 60, although Algol W, which already existed by that time, was also used at some point for Automath implementations). Automath was presented in December 1968 at the Symposium on Automatic Demonstration, held at INRIA Rocquencourt in Paris. The paper presented there was later published in 1970 in the proceedings of that conference [de Bruijn, 1968c].
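The 'formulas as types' reading can be made concrete in any typed functional language. The following OCaml fragment is our own illustration, not Automath or LF syntax; the names modpon and id_proof are ours, chosen to echo the text above. Implication becomes the function type, Modus Ponens becomes function application, and the proof of p ⇒ p is the identity function.

```ocaml
(* 'Formulas as types': a proposition is a type, a proof is a program
   of that type, and implication b => c is the function type 'b -> 'c. *)

(* Modus ponens: from a proof of b => c and a proof of b we obtain a
   proof of c; at the level of programs this is just application. *)
let modpon (f : 'b -> 'c) (x : 'b) : 'c = f x

(* Introducing an implication is lambda abstraction; in particular the
   proof of p => p is the identity function. *)
let id_proof : 'p -> 'p = fun h -> h

let () =
  (* Instantiating p with a concrete type (here int) shows the
     'proofs' computing. *)
  assert (modpon id_proof 42 = 42);
  print_endline "ok"
```

A proof term that does not typecheck corresponds to an invalid derivation, which is exactly the discipline the Automath checker enforced.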
The Automath system that is described in those two publications is very similar to the Twelf system that implements the LF logical framework. A formalization in this system essentially consists of a long list of definitions, in which a sequence of constants is defined as abbreviations for typed lambda terms. Through the Curry-Howard isomorphism this allows one to encode arbitrary logical reasoning. It appears that de Bruijn was not aware of the work by Curry and Howard at the time. Neither of the two publications mentioned contains any references to the literature,



and the notations used are very unlike what one would expect from someone who knew about lambda calculus. For example, function application is written with the argument in front of the function being applied (which is indeed a more natural order): instead of MN one writes ⟨N⟩M, and lambda abstraction is written not as λx:A.M but as [x:A]M (a notation later inherited by the Coq system, although in modern Coq it has been changed). Also, the type theory of Automath is quite different from current type theories. In modern type theory, if we have the typings M : B and B : s, then we have (λx:A.M) : (Πx:A.B) and (Πx:A.B) : s. However, in Automath one would have ([x:A]M) : ([x:A]B) and ([x:A]B) : ([x:A]s). In other words, in Automath there was no difference between λ and Π, and while in modern type theory binders 'evaporate' after two steps when calculating types of types, in Automath they never do. This means that the typed terms in Automath do not have a natural set-theoretic interpretation (which is probably the reason that this variant of type theory has been largely forgotten). However, this does not mean that it is not perfectly usable as a logical framework. Apparently de Bruijn rediscovered the Curry-Howard isomorphism mostly independently (although he had heard from Heyting about the intuitionistic interpretation of the logical connectives). One of the inspirations for the Automath language was a manual check by de Bruijn of a very involved proof, in which he wrote out all the reasoning steps on a large sheet of paper [de Bruijn, 1990]. The scoping of variables and assumptions was indicated by drawing lines in the proof, with the variables and assumptions written in a 'flag' at the top of each line. This is very similar to Jaśkowski-Fitch style natural deduction, but in Automath (just like in LF) it is not tied to a specific logic.
There have essentially been four groups of Automath languages, of which only the first two were ever properly implemented:

AUT-68  This was the first variant of Automath, a simple and clean system, which was explained in the early papers through various weaker and less practical systems, with names like PAL, LONGPAL, LONGAL, SEMIPAL and SEMIPAL 2, where 'PAL' abbreviates 'Primitive Automath Language' [de Bruijn, 1969; de Bruijn, 1970]. Recently there has been a revival of interest in these systems from people investigating weak logical frameworks [Luo, 2003].

AUT-QE  This was the second version of the Automath language; 'QE' stands for 'Quasi-Expressions'. With this, Automath evolved towards the current type theories (although it still was quite different): one now had both ([x:A]B) : ([x:A]s) and ([x:A]B) : s. This was called type inclusion. AUT-QE is the dialect of Automath in which the biggest formalization, Jutting's translation of Landau's Grundlagen, was written [van Benthem Jutting, 1979]. It has much later been re-implemented in C by one of the authors of this chapter [Wiedijk, 2002].

Later AUT languages  These are later variants of Automath, like AUT-QE-NTI (a subset of AUT-QE in which subtyping was removed, the 'NTI' standing



for 'no type inclusion'), and the AUT-Π and AUT-SYNTH extensions of AUT-QE. These languages were modifications of the AUT-QE framework, but although implementations were worked on, it seems none of them was ever really finished.

AUT-SL  This was a very elegant version of Automath developed by de Bruijn (with variants of the same idea also developed by others). In this language the distinction between definitions and redexes is removed, and the formalization, including the definitions, becomes a single very large lambda term. The 'SL' stands for 'single line' (Section B.2 of [Nederpelt et al., 1994]). The system was also called ∆Λ, and sometimes Λ∆ (Section B.7 of [Nederpelt et al., 1994]). A more recent variant of this system was De Groote's λλ type theory [de Groote, 1993]. The system AUT-QE-NTI can be seen as a step towards the AUT-SL language.

There were later languages, by de Bruijn and by others, that were more loosely related to the Automath languages. One of these was WOT, an abbreviation of 'wiskundige omgangstaal', Dutch for 'mathematical vernacular' [de Bruijn, 1979]. Unlike Trybulec with Mizar, de Bruijn only felt the need for this 'vernacular' to be structurally similar to actual mathematical prose, and never tried to make it resemble natural language.

In 1975, the Dutch science foundation ZWO (nowadays called NWO) awarded a large five-year grant for the project Wiskundige Taal AUTOMATH to further develop the Automath ideas. This grant financed five researchers, two programmers and a secretary [de Bruijn, 1978]. During the project many students of the Technical University Eindhoven carried out formalization projects.
Examples of the subjects that were formalized are:

• two treatments of set theory
• a basic development of group theory
• an axiomatic development of linear algebra
• the König-Hall theorem in graph theory
• automatic generation of Automath texts that prove arithmetic identities
• the sine and cosine functions
• irrationality of π
• real numbers as infinite sequences of the symbols 0 and 1

We do not know how complete these formalizations were, nor whether they were written in an Automath dialect that was actually implemented. In 1984 de Bruijn retired (although he stayed scientifically active), and the Automath project was effectively discontinued. In 1994 a volume containing the



most important Automath papers was published [Nederpelt et al., 1994], and in 2003 most other relevant documents were scanned, resulting in the Automath Archive, which is freely available on the web [Scheffer, 2003]. Automath has been one of the precursors of a line of development in type systems called type theory:

• On the predicative side there were the type theories of Martin-Löf, developed from 1971 on (after the discovery of the Girard paradox, replaced in 1972 by an apparently consistent system), which among other things introduced the notion of inductive types [Nordström et al., 1990].

• On the impredicative side there were the polymorphic lambda calculi of Girard (1972) and Reynolds (1974). This was combined with the dependent types from Martin-Löf's type theory in Coquand's CC (the Calculus of Constructions), described in his PhD thesis [Coquand and Huet, 1988]. CC was structured by Barendregt into the eight systems of the lambda cube [Barendregt, 1992], which was then generalised into the framework of pure type systems (PTSs) by Berardi (1988) and Terlouw (1989).

Both the Martin-Löf theories and the Calculus of Constructions were further developed and merged in various systems, such as ECC (the Extended Calculus of Constructions) [Luo, 1989] and UTT (Universal Type Theory) [Luo, 1992], and by Paulin in CIC (the Calculus of Inductive Constructions) [Coquand and Paulin, 1990], later further developed into pCIC (the predicative version of CIC, which was also extended with coinductive types), the current type theory behind the Coq system [Coq development team, 2012]. All these type theories are related yet different (and have grown away from the Automath type theory). Two of the axes along which one might compare them are predicativity ('objects are not allowed to be defined using quantification over the domain to which the object belongs') and intensionality ('equality between functions is not just determined by whether the values of the functions coincide').
For example, the type theory of Coq is not predicative but it is intensional, the type theory of NuPRL is predicative but it is not intensional, and the type theory of Agda is both predicative and intensional. Recently, there has been another development in type theory with the introduction of univalent type theory or homotopy type theory (HoTT) by Voevodsky in 2009 [Univalent Foundations Program, 2013]. Here type theory is extended with an interpretation of equality as homotopy, which gives rise to the axiom of univalence. This means that representation independence is now hardwired into the type theory. For this reason, some people consider HoTT to be a new foundation for mathematics. Also, in this system the inductive types machinery is extended to higher inductive types (inductive types where equalities can be added inductively as well). Together, this compensates for several problems when using Coq's type theory for mathematics: one gets functional extensionality and can have proper definitions of subtypes and quotient types. This field is still young and under very active development.



We now list some important current systems that are based on type theory and the Curry-Howard isomorphism, and as such can be considered successors to Automath. We leave out important historical systems like LEGO [Pollack, 1994], and focus on systems that still have an active user community. For each we give a brief overview of its development.

NuPRL

In 1979 Martin-Löf introduced an extensional version of his type theory. In the same year Bates and Constable, after earlier work on the PL/CV verification framework [Constable and O'Donnell, 1978], founded the PRL research group at Cornell University to develop a program development system in which programs are created in a mathematical fashion by interactive refinement (PRL at that time stood for Program Refinement Logic). In this group various systems were developed: the AVID system (Aid Verification through the techniques of Interactive program Development) [Krafft, 1981], the Micro-PRL system, also in 1981, and the λPRL system [Constable and Bates, 1983]. In 1984 the PRL group implemented a variant of Martin-Löf's extensional type theory in a system called NuPRL (also written as νPRL, to be read as 'new PRL'; PRL was now taken to stand for Proof Refinement Logic). This system [Constable, 1986] has had five versions; NuPRL 5, the latest, is also called NuPRL LPE (Logical Programming Environment). In 2003, a new architecture for NuPRL was implemented, called MetaPRL (after first having been called NuPRL-Light) [Hickey et al., 2003]. The NuPRL type theory has always had a very large number of typing rules, and in the MetaPRL system this is handled through a logical framework; in that sense the system resembles Isabelle. Part of the NuPRL/MetaPRL project is the development of a library of formal results called the FDL (Formal Digital Library).

Coq

In 1984 Huet and Coquand at INRIA started implementing the Calculus of Constructions in the CAML dialect of ML. Originally this was just a type checker for the type theory, but with version 4.10 in 1989 the system was extended in the style of the LCF system, with tactics that operate on goals. Subsequently many people have worked on the system, and many parts have been redesigned and re-implemented several times, including the syntax of the proof language and the kernel of the system. The Coq manual [Coq development team, 2012] gives an extensive history of the development of the system, which has involved dozens of researchers, a majority of whom made contributions that have turned out to be essential for its efficient use. Examples of features that were added are: coercions, canonical structures, type classes, coinductive types, universe polymorphism, various decision procedures (e.g., for equalities in rings and fields, and for linear arithmetic), various tactics (e.g., for induction and inversion, and for rewriting with a congruence, in type theory called 'setoid equality'), mechanisms for customizing term syntax, the coqdoc documentation system, the Ltac scripting language and the Program command (which defines yet another functional programming language within the Coq system). The system is nowadays a very feature-rich environment, which makes it currently the most popular interactive theorem prover based on type theory. The latest released version of Coq is 8.4.

The system can be seen as a theorem prover, but also as an implementation of a functional programming language with an execution speed comparable to functional languages like OCaml. A byte code machine similar to the one of OCaml was implemented by Grégoire and Leroy, but there is now also native code compilation, implemented by Dénès and Grégoire. Also, computations on small integers can be done at the machine level, due to Spiwack. Another feature of Coq is that Coq programs can be exported in the syntax of other functional programming languages, like OCaml and Haskell. Coq has more than one user interface, of which Proof General [Aspinall, 2000] and CoqIDE [Bertot and Théry, 1998] are currently the most popular.

There are two important extensions of Coq. First, there is the SSReflect proof language and associated mathematical components library by Gonthier and others [Gonthier et al., 2008; Gonthier and Mahboubi, 2010], which was developed for the formalization of the proofs of the 4-color theorem (2005) and the Feit-Thompson theorem (2012). This is a compact and powerful proof language, which has not been merged into the mainstream version of Coq. Second, there are implementations of Coq adapted for homotopy type theory.

Finally there is the Matita system from Bologna by Asperti and others [Asperti et al., 2006]. This started out in 2004 as an independent implementation of a type checker for the Coq type theory.
It was developed in the HELM project (Hypertextual Electronic Library of Mathematics), which was about presenting Coq libraries on the web [Asperti et al., 2003], and therefore at the time was quite similar to Coq, but has in the meantime diverged significantly, with many improvements of its own.

Twelf

In 1987, Harper, Honsell and Plotkin introduced the Edinburgh Logical Framework, generally abbreviated as Edinburgh LF, or just LF [Harper et al., 1987]. This is a quite simple predicative type theory, inspired by and similar to Automath, in which one can define logical systems in order to reason about them. An important property of an encoded logic, which has to be proved at the meta level, is adequacy: the property that the beta-eta-long normal forms of the terms that encode proofs in the system are in one-to-one correspondence with the proofs of the logic themselves. A first implementation of LF was EFS (Environment for Formal Systems) [Griffin, 1988]. Soon after, in 1989, Pfenning implemented the Elf system, which added



a meta-programming layer [Pfenning, 1994]. In 1999 a new version of this system was implemented by Pfenning and Schürmann, called Twelf [Pfenning and Schürmann, 1999]. In the meta layer of Twelf, one can execute Twelf specifications as logic programs, and it also contains a theorem prover that can establish properties of the Twelf encoding automatically, given the right definitions and annotations.

Agda

In 1990 a first implementation of a type checker for Martin-Löf's type theory was created by Coquand and Nordström. In 1992 this turned into the ALF (Another Logical Framework) system, implemented by Magnusson in SML [Magnusson and Nordström, 1993]. Subsequently a Haskell implementation of the same system, called Half (Haskell Alf), was worked on by Coquand and Synek. A C version of this system called CHalf, also by Synek, was used in 1997 for a significant formalization of the Hahn-Banach theorem by Cederquist [Cederquist et al., 1998]. Synek developed for this system an innovative Emacs interface that allows one to work in a procedural style on a proof that is essentially declarative [Coquand et al., 2005]. A new version of this system called Agda was written by Catarina Coquand in 1996, for which a graphical user interface was developed around 2000 by Hallgren. Finally, Norell implemented a new version of this system in 2007 under the name Agda2 [Bove et al., 2009]. For a while there were two different versions of the system, a stable one and a more experimental one, but by now only one version remains.

3 LCF


The LCF approach to interactive theorem proving has its origins in the work of Robin Milner, who from early in his career, in David Cooper's research group in Swansea, was interested specifically in interactive proof:

    I wrote an automatic theorem prover in Swansea for myself and became shattered with the difficulty of doing anything interesting in that direction and I still am. I greatly admired Robinson's resolution principle, a wonderful breakthrough; but in fact the amount of stuff you can prove with fully automatic theorem proving is still very small. So I was always more interested in amplifying human intelligence than I am in artificial intelligence.²

Milner subsequently moved to Stanford, where he worked in 1971-2 in John McCarthy's AI lab. There he, together with Whitfield Diffie, Richard Weyhrauch and Malcolm Newey, designed an interactive proof assistant for what Milner called the Logic of Computable Functions (LCF). This formalism, devised by Dana Scott in



1969, though only published much later [Scott, 1993], was intended for reasoning about recursively defined functions on complete partial orders (CPOs), such as typically occur in the Scott-Strachey approach to denotational semantics. The proof assistant, known as Stanford LCF [Milner, 1972], was intended more for applications in computer science than for mainstream pure mathematics. Although it was a proof checker rather than an automated theorem prover, it did provide a powerful automatic simplification mechanism and convenient support for backward, goal-directed proof. There were at least two major problems with Stanford LCF. First, the storage of proofs tended to fill up memory very quickly. Second, the repertoire of proof commands was fixed and could not be customized. When he moved to Edinburgh, Milner set about fixing these defects. With the aid of his research assistants, Lockwood Morris, Malcolm Newey, Chris Wadsworth and Mike Gordon, he designed a new system called Edinburgh LCF [Gordon et al., 1979]. To allow full customizability, Edinburgh LCF was embedded in a general programming language, ML.3 ML was a higher-order functional programming language, featuring a novel polymorphic type system [Milner, 1978] and a simple but useful exception mechanism, as well as imperative features. Although the ML language was invented as part of the LCF project specifically for the purpose of writing proof procedures, it has in itself been seminal: many contemporary functional languages such as CAML Light and OCaml [Cousineau and Mauny, 1998; Weis and Leroy, 1993] are directly descended from it or at least heavily influenced by it, and their applications go far beyond just theorem proving. In LCF, recursive (tree-structured) types are defined in the ML metalanguage to represent the (object) logical entities such as types, terms, formulas and theorems. For illustration, we will use thm for the ML type of theorems, though the exact name is not important.
Logical inference rules are then realized concretely as functions that return an object of type thm. For example, a classic logical inference rule is Modus Ponens or ⇒-elimination, which might conventionally be represented in a logic textbook or paper as a rule asserting that if p ⇒ q and p are both provable (from respective assumptions Γ and ∆) then q is also provable (from the combined assumptions):4

    Γ ⊢ p ⇒ q    ∆ ⊢ p
    ──────────────────
        Γ ∪ ∆ ⊢ q

The LCF approach puts a concrete and computational twist on this by turning each such rule into a function in the metalanguage. In this case the function, say MP, takes two theorems as input, Γ ⊢ p ⇒ q and ∆ ⊢ p, and returns the theorem Γ ∪ ∆ ⊢ q; it therefore has a function type thm->thm->thm in ML (assuming

3 ML for metalanguage; following Tarski [1936] and Carnap [1937], it has become usual in logic and linguistics to distinguish carefully the object logic and the metalogic (which is used to reason about the object logic).
4 We show it in a sequent context where we also take the union of assumption lists. In a Hilbert-style proof system these assumptions would be absent; in other presentations we might assume that the sets of hypotheses are the same in both cases and have a separate weakening rule. Such fine details of the logical system are orthogonal to the ideas we are explaining here.



curried functions). When logical systems are presented, it's common to make some terminological distinctions among the components of the foundation, and all these get reflected in the types in the metalanguage when implemented in LCF style:

• An axiom is simply an element of type thm.

• An axiom schema (for example a first-order induction principle with an instance for each formula) becomes a function that takes some argument(s), like a term indicating which instance is required, and returns something of type thm.

• A true inference rule becomes an ML function like the MP example above that takes objects, at least one of which is of type thm, as arguments and returns something of type thm.

The traditional idea of logical systems is to use them as a foundation, by choosing once and for all some relatively small and simple set of rules, axioms and axiom schemas, which we will call the primitive inference rules, and thereafter performing all proof using just those primitives. In an LCF prover one can, if one wishes, create arbitrary proofs using these logical inference rules, simply by composing the ML functions appropriately. Although a proof is always performed, the proof itself exists only ephemerally as part of ML's (strict) evaluation process, and therefore no longer fills up memory. Gordon [2000] makes a nice analogy with writing out a proof on a blackboard, and rubbing out early steps to make room for more. In order to retain the guarantee that objects of type thm really were created by application of the primitive rules, Milner had the ingenious idea of making thm an abstract type, with the primitive rules as its only constructors. After this, one needs only to have confidence in the fairly small amount of code underlying the primitive inference rules to be quite sure that all theorems must have been properly deduced, simply because of their type.
But even for the somewhat general meta-arguments in logic textbooks, and certainly for concretely performing proofs by computer, the idea of proving something non-trivial by decomposing it into primitive inference rules is usually daunting in the extreme. In practice one needs derived rules embodying convenient inference patterns that are not part of the axiomatic basis but can be derived from it. A derived inference rule too has a concrete realization in LCF systems, as a function whose definition composes other inference rules; using parametrization by the function arguments, it can work in a general and schematic way just like the metatheorems of logic textbooks. For example, if we also have a primitive axiom schema called ASSUME returning a theorem of the form p ⊢ p:

    ───────
    {p} ⊢ p


John Harrison, Josef Urban and Freek Wiedijk

then we can implement the following derived inference rule, which we will call UNDISCH:

  Γ ⊢ p ⇒ q
  ─────────────
  Γ ∪ {p} ⊢ q

as a simple function in the metalanguage. For example, the code might look something like the following. It starts by breaking apart the implication of the input theorem to determine the appropriate p and q. (Although objects of type thm can only be constructed by the primitive rules, they can be examined and deconstructed freely.) Based on this, the appropriate instance of the ASSUME schema is used and the two inference rules plugged together.

let UNDISCH th =
  let Imp(p,q) = concl th in
  MP th (ASSUME p);;

This is just a very simple example, but because a full programming language is available, one can implement much more complex derived rules that perform sophisticated reasoning and automated proof search but still ultimately reduce to the primitive rules. Indeed, although LCF and most of its successors use a traditional forward presentation of logic, it is easy to use a layer of programming to support goal-directed proof in the style of Stanford LCF, via so-called tactics. This flexibility gives LCF an appealing combination of reliability and extensibility. In most theorem proving systems, in order to install new facilities it is necessary to modify the basic code of the prover. But in LCF an ordinary user can write an arbitrary ML program to automate a useful inference pattern, while all the time being assured that even if the program has bugs, no false theorems will arise (though the program may fail in this case, or produce a valid theorem other than the one that was hoped for). As Slind [1991] puts it, 'the user controls the means of (theorem) production'. LCF was employed in several applications at Edinburgh, and this motivated certain developments in the system. By now, the system had attracted attention elsewhere. Edinburgh LCF was ported to LeLisp and MacLisp by Gérard Huet, and this formed the basis for a rationalization and redevelopment of the system by Paulson [1987] at Cambridge, resulting in Cambridge LCF.
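The kernel idea described here (thm as an abstract type whose only constructors are the primitive rules, with derived rules such as UNDISCH defined outside the kernel by composition) can be illustrated with a toy sketch. The following OCaml fragment is invented for illustration, using a minimal propositional syntax with only implication; it is not the code of any actual LCF system:

```ocaml
(* A toy illustration of the LCF kernel idea: the type thm is abstract,
   so the only way to obtain a thm is via the exported rules.
   The formula syntax is a hypothetical minimal one invented here. *)

type form = Atom of string | Imp of form * form

module Kernel : sig
  type thm
  val assume : form -> thm                (* ASSUME: {p} |- p *)
  val mp : thm -> thm -> thm              (* MP: from |- p ==> q and |- p, infer |- q *)
  val dest_thm : thm -> form list * form  (* theorems may be examined freely *)
end = struct
  type thm = form list * form             (* assumptions, conclusion *)
  let assume p = ([p], p)
  let mp (asl1, c1) (asl2, c2) =
    match c1 with
    | Imp (p, q) when p = c2 ->
        (* union of assumption lists, conclusion q *)
        (List.sort_uniq compare (asl1 @ asl2), q)
    | _ -> failwith "mp: theorems do not match"
  let dest_thm th = th
end

(* A derived rule, defined outside the kernel purely by composing
   the primitive rules: the analogue of UNDISCH in the text. *)
let undisch th =
  let _, c = Kernel.dest_thm th in
  match c with
  | Imp (p, _) -> Kernel.mp th (Kernel.assume p)
  | _ -> failwith "undisch: not an implication"
```

Sealing the module behind the signature is the analogue of Milner's abstract type: client code can call undisch freely, but any attempt to build a thm other than through assume and mp is rejected by the type checker.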
First, Huet and Paulson modified the ML system to generate Lisp code that could be compiled rather than interpreted, which greatly improved performance. Among many other improvements, Paulson [1983] replaced Edinburgh LCF's complicated and monolithic simplifier with an elegant scheme based on conversions. A conversion is a particular kind of derived rule (of ML type :term->thm) that given a term t returns a theorem of the form Γ ⊢ t = t′ for some other term t′. (For example, a conversion for addition of numbers might map the term 2 + 3 to the theorem ⊢ 2 + 3 = 5.) This gives a uniform framework for converting ad hoc simplification routines into those that are justified by inference: instead of simply taking t and asserting its equality to t′, we actually carry theorems asserting such equivalences through the procedure. Via convenient higher-order

History of Interactive Theorem Proving


functions, conversions can be combined in various ways, applied recursively in a depth-first fashion etc., with all the appropriate inference to plug the steps together (transitivity and congruences and so on) happening automatically.
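As a toy illustration of this scheme (an invented miniature term language, not Paulson's actual code), a conversion can be modelled as a function from terms to equational 'theorems', with combinators supplying the transitivity and congruence plumbing:

```ocaml
(* A toy sketch of conversions: a conversion maps a term t to a
   theorem |- t = t', and higher-order combinators plug such
   equations together.  The term language is invented for this sketch;
   in a real LCF system thm would be the abstract kernel type. *)

type term = Num of int | Add of term * term

type thm = Eq of term * term   (* stands for |- t = t' *)

type conv = term -> thm

(* Primitive conversion: evaluate a single addition of numerals. *)
let add_conv : conv = function
  | Add (Num m, Num n) -> Eq (Add (Num m, Num n), Num (m + n))
  | _ -> failwith "add_conv: not applicable"

(* Combinator: apply c1 then c2, joining the equations by transitivity. *)
let thenc (c1 : conv) (c2 : conv) : conv = fun t ->
  let Eq (_, t') = c1 t in
  let Eq (_, t'') = c2 t' in
  Eq (t, t'')

(* Combinator: apply a conversion to immediate subterms (a congruence step). *)
let sub_conv (c : conv) : conv = function
  | Add (a, b) ->
      let Eq (_, a') = c a and Eq (_, b') = c b in
      Eq (Add (a, b), Add (a', b'))
  | t -> Eq (t, t)

(* Depth-first normalization built from the pieces, e.g. (1+2)+3 to 6. *)
let rec depth_conv t =
  match t with
  | Num _ -> Eq (t, t)
  | Add _ -> thenc (sub_conv depth_conv) add_conv t
```

The point of the combinators is exactly the one made in the text: the transitivity and congruence reasoning that glues the individual steps together happens automatically inside thenc and sub_conv.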



As emphasized by Gordon [1982], despite the name 'LCF', nothing in the Edinburgh LCF methodology is tied to the Logic of Computable Functions. In the early 1980s Gordon, now in Cambridge, as well as supervising Paulson's development of LCF, was interested in the formal verification of hardware. For this purpose, classical higher order logic seemed a natural vehicle, since it allows a rather direct rendering of notions like signals as functions from time to values. The case was first put by Hanna and Daeche [1986] and, after a brief experiment with an ad hoc formalism 'LSM' based on Milner's Calculus of Communicating Systems, Gordon [1985] also became a strong advocate. Gordon modified Cambridge LCF to support classical higher order logic, and so HOL (for Higher Order Logic) was born. Following Church [1940], the system is based on simply typed λ-calculus, so all terms are either variables, constants, applications or abstractions; there is no distinguished class of formulas, merely terms of boolean type. The main difference from Church's system is that polymorphism is an object-level, rather than a meta-level, notion; essentially the same Hindley-Milner automated typechecking algorithm used in ML [Milner, 1978] is used in the interface so that most general types for terms can be deduced automatically. Using defined constants and a layer of parser and pretty-printer support, many standard syntactic conventions are broken down to λ-calculus. For example, the universal quantifier, following Church, is simply a higher order function, but the conventional notation ∀x.P [x] is supported, mapping down to ∀(λx.P [x]). Similarly there is a constant LET, which is semantically the identity and is used only as a tag for the pretty-printer, and following Landin [1966], the construct 'let x = t in s' is broken down to 'LET (λx.s) t'.5 The advantage of keeping the internal representation simple is that the underlying proof procedures, e.g.
those that do term traversal during simplification, need only consider a few cases. The exact axiomatization of the logic was partly influenced by Church, partly by the way things were done in LCF, and partly through consultation with the logicians Mike Fourman, Martin Hyland and Andy Pitts in Cambridge. HOL originally included a simple constant definitional mechanism, allowing new equational axioms of the form ⊢ c = t to be added, where t is a closed term and c a new constant symbol. A mechanism for defining new types, due to Mike Fourman, was also included. Roughly speaking, one may introduce a new type in bijection with any nonempty subset of an existing type (identified by its characteristic predicate). An important feature of these definitional mechanisms bears emphasizing: they are not metalogical translations, but means of extending the signature of the object logic, while guaranteeing that such extension preserves consistency. In fact, the definitional principles are conservative, meaning roughly that no new statements not involving the defined concept become provable as a result of the extension. HOL emphasized the systematic development of theories by these principles of conservative extension to the point where it became the norm, purely axiomatic extensions becoming frowned upon. Such an approach is obviously a very natural fit with the LCF philosophy, since it entails pushing back the burden of consistency proofs or whatever to the beginning, once and for all, such that all extensions, whether of the theory hierarchy or proof mechanisms, are correct by construction. (Or at least consistent by construction. Of course, it is perfectly possible to introduce definitions that do not correspond to the intuitive notion being formalized, but no computable process can resolve such difficulties.) This contrasts with LCF, where there was no distinction between definitions and axioms, and new types were often simply characterized by axioms without any formal consistency proof. Though there was usually a feeling that such a proof would be routine, it is easy to make mistakes in such a situation. It can be much harder to produce useful structures by definitional extension than simply to assert suitable axioms — the advantages of the latter were likened by Russell [1919] to those of theft over honest toil. For example, Melham's derived definitional principle [Melham, 1989] for recursive data types was perhaps at the time the most sophisticated LCF-style derived rule ever written, and introduced important techniques for maintaining efficiency in complex rules that are still used today — we discuss the issues around efficient implementations of decision procedures later.

5 Landin, by the way, is credited with inventing the term 'syntactic sugar', as well as this notable example of it.
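The term structure sketched above (variables, constants, applications and abstractions, with quantifiers as ordinary higher-order constants) might be rendered as the following illustrative OCaml datatype; the constructor names are hypothetical and the fragment is not HOL's actual implementation:

```ocaml
(* An illustrative sketch of a simply typed lambda-calculus term
   representation in the style described in the text.  Names and
   details are invented for this example. *)

type hol_type = Bool | Fun of hol_type * hol_type | TyVar of string

type term =
  | Var of string * hol_type
  | Const of string * hol_type
  | Comb of term * term                (* application *)
  | Abs of string * hol_type * term    (* abstraction *)

(* The universal quantifier is just a higher-order constant, so the
   surface notation  !x. P[x]  maps down to  Comb (forall, Abs x P). *)
let forall_tm ty = Const ("!", Fun (Fun (ty, Bool), Bool))
let mk_forall x ty body = Comb (forall_tm ty, Abs (x, ty, body))

(* Term-traversal procedures need only consider these four cases,
   e.g. computing the free variables of a term: *)
let rec frees tm =
  match tm with
  | Var (x, _) -> [x]
  | Const _ -> []
  | Comb (f, a) -> List.sort_uniq compare (frees f @ frees a)
  | Abs (x, _, b) -> List.filter (fun y -> y <> x) (frees b)
```

Because quantifiers, LET and the rest are broken down to this uniform representation, a function like frees above automatically handles them with no extra cases.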
This was the first of a wave of derived definitional principles in LCF-like systems for defining inductive or coinductive sets or relations [Andersen and Petersen, 1991; Camilleri and Melham, 1992; Roxas, 1993; Paulson, 1994a], general recursive functions [Ploegaerts et al., 1991; van der Voort, 1992; Slind, 1996; Krauss, 2010], quotient types with automated lifting of definitions and theorems [Harrison, 1998; Homeier, 2005], more general forms of recursive datatypes with infinite branching, nested and mutual recursion or dual codatatypes [Gunter, 1993; Harrison, 1995a; Berghofer and Wenzel, 1999; Blanchette et al., 2014], as well as special nominal datatypes to formalize variable binding in a natural way [Urban, 2008]. Supporting such complex definitions as a primitive aspect of the logic, done to some extent in systems as different as ACL2 and Coq, is a complex, intricate and error-prone activity, and there is a lot to be said for the way the derived approach maintains foundational simplicity and security. In fact, general well-founded recursive functions in Coq are also supported using derived definitional principles [Balaa and Bertot, 2000], while quotients in the current foundations of Coq are problematic for deeper foundational reasons too.

The HOL system was consolidated and rationalized in a major release in late 1988, which was called, accordingly, HOL88. It became fairly popular, acquired good documentation, and attracted many users around the world. Nevertheless, despite its growing polish and popularity, HOL88 was open to criticism. In particular, though the higher-level parts were coded directly in ML, most of the term operations below were ancient and obscure Lisp code (much of it probably written



by Milner in the 1970s). Moreover, ML had since been standardized, and the new Standard ML seemed a more promising vehicle for the future than Lisp, especially with several new compilers appearing at the time. These considerations motivated two new versions of HOL in Standard ML. One was developed by Roger Jones, Rob Arthan and others at ICL Secure Systems and called ProofPower. This was intended as a commercial product and has been mainly used for high-assurance applications, though the current version is freely available and has also been used in other areas like the formalization of mathematics.6 The other, called hol90, was written by Konrad Slind [1991], under the supervision of Graham Birtwistle at the University of Calgary. The entire system was coded in Standard ML, which made all the pre-logic operations such as substitution accessible. Subsequently several other versions of HOL were written, including HOL Light, a version with a simplified axiomatization and rationalized structure written in CAML Light by one of the present authors and subsequently ported to OCaml [Harrison, 2006a], and Isabelle/HOL, described in more detail in the next section. The ‘family DAG’ in Figure 2 gives an approximate idea of some of the influences. While HOL88, hol90 and hol98 are little used today (though HOL88 is available as a Debian package), all the other provers in this picture are under active development and/or have significant user communities.



We will discuss one more LCF-style system in a little more depth because it has some distinguishing features compared to others in the family, and is also perhaps currently the most widely used member of the LCF family. This is the Isabelle system, originally developed by Paulson [1990]. The initial vision of Isabelle was as an LCF-style logical framework in which to embed other logics. Indeed, the subtitle 'The Next 700 Theorem Provers' in Paulson's paper (with its nod to Landin's 'The Next 700 Programming Languages') calls attention to the proliferation of different theorem proving systems already existing at the time. Many researchers, especially in computer science, were (and still are) interested in proof support for particular logics (classical, constructive, many-valued, temporal, etc.). While these all have their distinctive features, they also have many common characteristics, making the appeal of a re-usable generic framework obvious. Isabelle effectively introduces yet another layer into the meta-object distinction, with the logic directly implemented in the LCF style itself being considered a meta-logic for the embedding of other logics. The Isabelle metalogic is a simple form of higher-order logic. It is intentionally weak (for example, having no induction principles) so that it does not in itself introduce foundational assumptions that some might find questionable or cause incompatibilities with the way object logics are embedded. But it serves its purpose well as a framework for embedding object logics and providing a common infrastructure across them. The inference rules in the object logic are then given in a declarative fashion as meta-implications

6 See



(Diagram showing the systems HOL88, hol90, ProofPower, Isabelle/HOL, hol98, HOL Light, HOL Zero and HOL4.)

Figure 2: The HOL family tree



(implications in the meta-logic). For example, our earlier example of Modus Ponens can be represented as the following (meta) theorem. The variables starting with '?' are metavariables, i.e. variables in the metalogic; → denotes object-level implication while ⇒ denotes meta-level implication.

⟦?P → ?Q; ?P⟧ ⇒ ?Q

By representing object-level inference rules in this fashion, the actual implementations often just need to perform forward or backward chaining with matching and/or unification. Isabelle supports the effective automation of this process with a powerful higher-order unification algorithm [Huet, 1975] giving a kind of higher-order resolution principle. Many design decisions in Isabelle were based on Paulson's experience with Cambridge LCF and introduced a number of improvements. In particular, backward proof ('tactics') in LCF actually worked by iteratively creating function closures to eventually reverse the refinement process into a sequence of the primitive forward rules. This non-declarative formulation meant, for example, that it was a non-trivial change to add support in LCF for logic variables allowing the instantiation of existential goals to be deferred [Sokolowski, 1983]. Isabelle simply represented goals as theorems, with tactics effectively working forwards on their assumptions, making the whole framework much cleaner and giving metavariables with no extra effort. This variant of tactics was also adopted in the ProofPower and LAMBDA systems (see next section). Isabelle's tactic mechanism also allowed backtracking search over lazy lists of possible outcomes. Together with unification, this framework could be used to give very simple direct implementations of some classic first-order proof search algorithms like tableaux à la leanTAP [Beckert and Posegga, 1995] (fast_tac in Isabelle) and the Loveland-Stickel presentation [Loveland, 1978; Stickel, 1988] of model elimination (meson_tac).
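As a toy sketch of this representation (a minimal invented syntax, with first-order matching standing in for Isabelle's higher-order unification), a rule can literally be data, and a backward refinement step just matches its conclusion against the goal and returns the instantiated premises as subgoals, leaving unmatched metavariables such as ?P to be instantiated later:

```ocaml
(* A toy sketch of rules-as-meta-theorems.  Syntax and names are
   invented for illustration; real Isabelle uses higher-order
   unification, not this simple first-order matching. *)

type prop = Atom of string | Meta of string | Imp of prop * prop

type rule = { prems : prop list; concl : prop }

(* Modus ponens as the meta-theorem  [| ?P --> ?Q; ?P |] ==> ?Q  *)
let mp = { prems = [Imp (Meta "P", Meta "Q"); Meta "P"]; concl = Meta "Q" }

(* Match a pattern (which may contain metavariables) against a goal,
   extending the environment of metavariable bindings. *)
let rec pmatch env pat goal =
  match pat, goal with
  | Meta v, _ ->
      (match List.assoc_opt v env with
       | Some t -> if t = goal then Some env else None
       | None -> Some ((v, goal) :: env))
  | Atom a, Atom b -> if a = b then Some env else None
  | Imp (p1, q1), Imp (p2, q2) ->
      (match pmatch env p1 p2 with
       | Some env' -> pmatch env' q1 q2
       | None -> None)
  | _ -> None

(* Apply a metavariable binding; unbound metavariables stay as they are. *)
let rec subst env = function
  | Meta v -> (match List.assoc_opt v env with Some t -> t | None -> Meta v)
  | Atom a -> Atom a
  | Imp (p, q) -> Imp (subst env p, subst env q)

(* Backward refinement: match the rule's conclusion against the goal
   and return the instantiated premises as new subgoals. *)
let refine r g =
  match pmatch [] r.concl g with
  | Some env -> Some (List.map (subst env) r.prems)
  | None -> None
```

Refining the goal q with the mp rule above yields the subgoals ?P → q and ?P, with the metavariable ?P left to be filled in by later proof steps, which is exactly the deferred-instantiation behaviour described in the text.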
While nowadays largely superseded by much more powerful automation of the kind that we consider later, these simple tactics were at the time very convenient in making typical proofs less low-level. It's customary to use appellations like 'Isabelle/X' to describe the particular instantiation of Isabelle with object logic X. Among the many object logics embedded in Isabelle are constructive type theory, classical higher order logic (Isabelle/HOL) and first-order Zermelo-Fraenkel set theory (Isabelle/ZF) [Paulson, 1994b]. Despite this diversity, only a few have been extensively used. Some axiom of choice equivalences have been formalized in Isabelle/ZF [Paulson and Grąbczewski, 1996], as has Gödel's hierarchy L of constructible sets, leading to a proof of the relative consistency of the Axiom of Choice [Paulson, 2003]. But by far the largest user community has developed around the Isabelle/HOL instantiation [Nipkow et al., 2002]. This was originally developed by Tobias Nipkow as an instantiation of Isabelle with something very close to the logic of the various HOL systems described above, but with the addition (at the level of the metalogic) of a system of axiomatic type classes similar to those of Haskell. In this instantiation, the ties between the



Isabelle object and metalogic are particularly intimate. Since its inception, Isabelle/HOL has become another full-fledged HOL implementation. In fact, from the point of view of the typical user the existence of a separate metalogic can largely be ignored, so the effective common ground between Isabelle/HOL and other HOL implementations is closer than might be expected. However, one more recent departure (not limited to the HOL instantiation of Isabelle) takes it further from its LCF roots and other HOL implementations. This is the adoption of a structured proof language called Isar, inspired by Mizar [Wenzel, 1999]. For most users, this is the primary interaction language, so they no longer use the ML read-eval-print loop as the interface. The underlying LCF mechanisms still exist and can be accessed, but many facilities are mediated by Isar and use a sophisticated system of contexts. We describe some of the design decisions in proof languages later in this chapter.


Other LCF systems

There have been quite a few other theorem provers either directly implemented in the LCF style or at least heavily influenced by it. Some of them, such as NuPRL and Coq, we have discussed above because of their links to constructive type theory, and so we will not discuss them again here, but their LCF implementation pedigree is also worth noting. For example, Bates and Constable [1985] describe the LCF approach in detail and discuss how NuPRL developed from an earlier system PL/CV. Another notable example is the LAMBDA (Logic And Mathematics Behind Design Automation) system, which was developed by a team led by Mike Fourman for use in hardware verification. Among other distinctive features, it uses a logic of partial functions, as did the original LCF system and as does the non-LCF system IMPS [Farmer et al., 1990].

4 Mizar


The history of Mizar [Matuszewski and Rudnicki, 2005] is a history of a team of Polish mathematicians, linguists and computer scientists analyzing mathematical texts and looking for a satisfactory human-oriented formal counterpart. One of the mottos of this effort has been Kreisel's 'ENOD: Experience, Not Only Doctrine' [Rudnicki and Trybulec, 1999], which in the Mizar context was loosely understood in the sense of today's ideas of rapid/agile software development. There were Mizar prototypes whose semantics was only partially clear, and with only partial verification procedures. For a long time, much of the focus was on designing a suitable language and on testing it by translating real mathematical papers into the language. The emphasis has not been just on capturing common patterns of reasoning and theory development, but also on capturing common syntactic patterns of the language of mathematics. A Mizar text is not only supposed to be written by humans and then read by machines; it is also supposed to be directly and easily readable by humans, avoiding too many parentheses, quotation marks, etc.



The idea of such a language and system was proposed in 1973 by Andrzej Trybulec, who was at that time finishing his PhD thesis in topology under Karol Borsuk, and also teaching at the Płock Branch of the Warsaw University of Technology. Trybulec already had many interests then: since 1967 he had been publishing on topics in topology and linguistics, and in Płock he was also running the Informatics Club. The name Mizar (after a star in the Big Dipper)7 was proposed by Trybulec's wife Zinaida, originally for a different project. The writing of his PhD thesis and the incorporation of Borsuk's feedback prompted Trybulec to think about languages and computer systems that would help mathematicians with such tasks. He presented these ideas for the first time at a seminar at the Warsaw University on November 14, 1973, and was soon joined by a growing team of collaborators attracted by his vision8 and personality, in many cases for their whole lives: Piotr Rudnicki, Czesław Byliński, Grzegorz Bancerek, Roman Matuszewski, Artur Korniłowicz, Adam Grabowski and Adam Naumowicz, to name just a few. The total count of Mizar authors had grown to 246 by May 2014. In his first note about Mizar (June 1975 [Trybulec, 1977]), Trybulec called such languages Logic-Information Languages (LIL) and defined them as facto-graphic languages that enable recording of both facts from a given domain and logical relationships among them. He proposed several applications for such languages, such as:

• Input to information retrieval systems which use logical dependencies.
• Intermediate languages for machine translation (especially of science).
• Automation of the editorial process of scientific papers, where the input language is based on LIL and the output language is natural (translated into many languages).
• Developing verification procedures for such languages, not only in mathematics, but also in law and medicine, where such procedures would typically interact with a database of relevant facts depending on the domain.
• Education, especially remote learning.
• Artificial intelligence research.

For the concrete instantiation to mathematics, the 1975 note already specified the main features of what is today's Mizar, noting that although such a vision borders on science fiction, it is a proper research direction:

7 Some backronyms related to Mathematics and Informatics have been proposed later.
8 It was easy to get excited, for several reasons. Andrzej Trybulec and most of the Mizar team have been a showcase of the popularity of science fiction in Poland and its academia. A great selection of world sci-fi books has been shared by the Mizar team, by no means limited to famous Polish sci-fi authors such as Stanisław Lem. Another surprising and inspiring analogy appeared after the project moved in 1976 to Białystok: the city where Ludwik Zamenhof grew up and started to create Esperanto in 1873 [Zalewska, 2010] – 100 years before the development of Mizar started.



• The paper should be directly written in a LIL.
• The paper is checked automatically for syntactic and some semantic errors.
• There are procedures for checking the correctness of the reasoning, giving reports about the reasoning steps that could not be automatically justified.
• A large database of mathematics is built on top of the system and used to check if the established results are new, providing references, etc.
• When the paper is verified, its results are included in such a database.
• The paper is automatically translated into natural languages, given to other information retrieval systems such as citation indexes, and printed.

The proposal further suggested the use of classical first-order logic with a rich set of reasoning rules, and the inclusion of support for arithmetic and set theory. The language should be rich and closely resemble natural languages, but on the other hand it should not be too complicated to learn, and at the lexical and syntactic level it should resemble programming languages. In particular, Algol 60 and later the Pascal language and compiler, which appeared in 1970, became sources of inspiration for Mizar and its implementation languages. Various experiments with Pascal program verification were done later with Mizar [Rudnicki and Drabent, 1985]. It is likely that in the beginning Trybulec did not know about Automath, LCF, SAM, and other Western efforts, and despite his good contacts with Russian mathematicians and linguists, probably not about the work of Glushkov and his team in Kiev [Letichevsky et al., 2013] on the Evidence Algorithm and the SAD system. However, the team learned about these projects quite quickly, and apart from citing them, the 1978 paper on Mizar-QC/6000 [Trybulec, 1978] also makes interesting references to Kaluzhnin's 1962 paper on 'information language for mathematics' [Kaluzhnin, 1962], and even earlier (1959, 1961) related papers by Paducheva and others on such languages for geometry.
As in some other scientific fields, the wealth of early research done by these Soviet groups has been largely unknown in the Western world; see [Lyaletski and Verchinine, 2010; Verchinine et al., 2008] for more details about them.


Development of Mizar

The construction of the Mizar language was started by looking at the paper by H. Patkowska9, A homotopy extension theorem for fundamental sequences [Patkowska, 1969], and trying to express it in the designed language. During the course of the following forty years, a number of versions of Mizar have been developed (see the timeline in Figure 3), starting with bare propositional calculus and rule-based proof checking in Mizar-PC, and ending with today's version of Mizar (Mizar 8 as of 2014), in which a library of 1200 interconnected formal articles is written and verified.

9 Another PhD student of Karol Borsuk.

(Timeline: Mizar-PC 1975, Mizar-MS 1977, Mizar-MSE 1981, Mizar 2, Mizar 3, Mizar 4 1985, Mizar HPF, MML started 1989.)

Figure 3: The Mizar timeline

Mizar-PC 1975-1976

While a sufficiently expressive mathematical language was a clear target of the Mizar language from the very beginning, the first implementation [Trybulec, 1977] of Mizar (written in Algol 1204) was limited to propositional calculus (Mizar-PC). A number of features particular to Mizar were however introduced already in this first version, especially the Mizar vocabulary and grammar motivated by Algol 60, and the Mizar suppositional proof style which was later found10 to correspond to Jaśkowski-Fitch natural deduction [Jaśkowski, 1934]. An example proof taken from the June 1975 description of Mizar-PC is as follows:

begin
((p ⊇ q) ∧ (q ⊇ r)) ⊇ (p ⊇ r)
proof
let A: (p ⊇ q) ∧ (q ⊇ r) ;
then B: p ⊇ q ;
C: q ⊇ r by A ;
let p ;
then q by B ;
hence r by C
end
end

The thesis (contents, meaning) of the proof in Mizar-PC is constructed from assumptions introduced by the keyword let (later changed to assume) by placing an implication after them, and from conclusions introduced by the keywords thus and hence by placing a conjunction after them (with the exception of the last one). The by keyword denotes inference steps where the formula on the left is justified by the conjunction of the formulas whose labels are on the right. The immediately preceding formula can be used for justification without referring to its label by using the linkage mechanism invoked by the keywords then for normal inference steps and hence for conclusions. The proof checker verifies that the formula to be proved is the same as the thesis constructed from the proof, and that all inference steps are instances of the about five hundred inference rules available in the database of implemented schematic proof-checking rules. This rule-based approach was changed in later Mizar versions to an approach based on model checking (in a general sense, not in connection with temporal logic model checking). Mizar-PC already allowed the construction of a so-called compound statement (later renamed to diffuse statement), i.e., a statement that is constructed implicitly from its suppositional proof given inside the begin ... end brackets (later changed to now ... end) and can be given a label and referred to in the same way as normal statements. An actual use of Mizar-PC was for teaching propositional logic in Płock and Warsaw.

10 Indeed, the Mizar team found only later that they had re-invented the Jaśkowski-Fitch proof style. Andrzej Trybulec was never completely sure if he had not heard about it earlier, for example at Roman Suszko's seminars.

Mizar-QC 1977-1978

Mizar-QC added quantifiers to the language and proof rules for them. An example proof (taken from [Matuszewski and Rudnicki, 2005]) is:

BEGIN
((EX X ST (FOR Y HOLDS P2[X,Y])) > (FOR X HOLDS (EX Y ST P2[Y,X])))
PROOF
ASSUME THAT A: (EX X ST (FOR Y HOLDS P2[X,Y]));
LET X BE ANYTHING;
CONSIDER Z SUCH THAT C: (FOR Y HOLDS P2[Z,Y]) BY A;
SET Y = Z;
THUS D: P2[Y,X] BY C;
END
END

The FOR and EX keywords started to be used as quantifiers in formulas, and the LET statement started to be used for introducing a local constant in the proof corresponding to the universal quantifier of the thesis.
The keyword ASSUME replaced the use of LET for introducing assumptions. The CONSIDER statement also introduces a local constant with a proposition justified by an existential statement. The SET statement (replaced by TAKE later) chooses an object that will correspond to an existential quantifier in the current thesis. While the LET X BE ANYTHING statement suggests that a sort/type system was already in place, in the QC version the only allowed sort was just ANYTHING. The BY justification proceeds by transforming the set of formulas into a standard form that uses only conjunction, negation and universal quantification and then



applying a set of rewrite rules restricted by a bound on a sum of complexity coefficients assigned to the rules. The verifier was implemented in Pascal/6000 for the CDC CYBER-70, and run in a similar way to the Pascal compiler itself, i.e., producing a list of error messages for the lines of the Mizar text. The error messages are inspected by the author, who modifies the text and runs the verifier again. This batch/compilation style of processing of the whole text is also similar to TeX, which was born at the same time. It has become one of the distinctive features of Mizar when compared to interpreter-like ITPs implemented in Lisp and ML.

Mizar Development in 1978-1988

The development of Mizar-QC was followed by a decade of additions leading to the first version of PC-Mizar11 in 1989. In 1989 the building of the Mizar Mathematical Library (MML) started, using PC-Mizar.

Mizar MS (Multi Sorted) (1978) added predicate definitions and syntax for schemes (such as the Induction and Replacement schemes), i.e., patterns of theorems parameterized by arbitrary predicates and functors. Type declarations were added, the logic became many-sorted, and equality was built in.

Mizar FC (1978-1979) added function (usually called functor in Mizar) definitions. The syntax allowed both equational definitions and definitions by conditions that guarantee existence and uniqueness of the function. The BY justification procedure based on rewrite rules was replaced by a procedure based on 'model checking': the formula to be proved is negated, conjoined with all its premises, and the procedure tries to refute every possible model of this conjunction. This procedure [Wiedijk, 2000] has been subject to a number of additions and experiments throughout the history of Mizar development, focusing on the balance between speed, strength, and obviousness to the human reader [Davis, 1981; Rudnicki, 1987a].
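A drastically simplified, purely propositional sketch of this refutational 'model checking' idea (exhaustive enumeration of valuations, which the real Mizar checker of course does not do) might look as follows:

```ocaml
(* A toy illustration of the obvious-inference check described above:
   negate the goal, conjoin it with the premises, and try to refute
   every valuation of the conjunction.  Purely propositional and
   exhaustive, a deliberate simplification for illustration only. *)

type prop = Atom of string | Not of prop | And of prop * prop | Or of prop * prop

let rec atoms = function
  | Atom a -> [a]
  | Not p -> atoms p
  | And (p, q) | Or (p, q) -> List.sort_uniq compare (atoms p @ atoms q)

let rec eval v = function
  | Atom a -> List.assoc a v
  | Not p -> not (eval v p)
  | And (p, q) -> eval v p && eval v q
  | Or (p, q) -> eval v p || eval v q

(* All truth-value assignments over a list of atoms. *)
let rec valuations = function
  | [] -> [[]]
  | a :: rest ->
      List.concat_map (fun v -> [(a, true) :: v; (a, false) :: v])
        (valuations rest)

(* goal `by` premises: accept iff premises /\ ~goal has no model. *)
let obvious premises goal =
  let conj = List.fold_left (fun acc p -> And (acc, p)) (Not goal) premises in
  List.for_all (fun v -> not (eval v conj)) (valuations (atoms conj))
```

For instance, q by p and ¬p ∨ q is accepted (every valuation refutes p ∧ (¬p ∨ q) ∧ ¬q), while q by p alone is rejected, which mirrors the accept/reject behaviour of a BY step.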
They were mainly concerned with various restricted versions of matching and unification, algorithms such as congruence closure for handling equality efficiently, and the handling of the type system and various built-in constructs [Naumowicz and Bylinski, 2002; Naumowicz and Bylinski, 2004].

Mizar-2 (1981) introduced the environment part of an article, at that time containing statements that are checked only syntactically, i.e., without requiring their justification. Later this part evolved into its current form used for importing theorems, definitions and other items from other articles. Type definitions were introduced: types were no longer disjoint sorts, but nonempty sets or classes defined by a predicate. This marked the beginning of another large Mizar research topic: its soft type system added on top of the underlying first-order logic. Types in Mizar are neither foundational nor disjoint as in the HOL and Automath families [Wiedijk, 2007]. The best way in which to think of the Mizar types is as a hierarchy of predicates (not just monadic: n-ary predicates result in dependent types), where the traversing of this hierarchy – the subtyping, intersection, disjointness relations, etc.

11 PC stands here for Personal Computer.


John Harrison, Josef Urban and Freek Wiedijk

– is to a large extent automated and user-programmable, allowing the automation of large parts of the mundane mathematical reasoning.12 Where such automation fails, the RECONSIDER statement can be used, allowing one to change the type of an object explicitly after a justification.

Mizar-3 and Mizar-4 (1982-1988) divided the processing into multiple cheaper passes with file-based communication, such as scanning, parsing, type and natural-deduction analysis, and justification checking. The use of special vocabulary files for symbols, together with allowing infix, prefix and postfix notation and their combinations, resulted in greater closeness to mathematical texts. Reservations were introduced for predeclaring variables and their default types. Other changes and extensions included unified syntax for definitions of functors, predicates, attributes and types, keywords for various correctness conditions related to definitions such as uniqueness and existence, etc. In 1986, Mizar-4 was ported to the IBM PC platform running MS-DOS and renamed to PC-Mizar in 1988.

PC-Mizar and the Mizar Mathematical Library (1988-)

In 1987-1991 a relatively large amount of national funding was given to the Mizar team to develop the system and to use it for substantial formalization. The main modification to Mizar required for this was a mechanism for importing parts of other articles. The official start of the building of the Mizar Mathematical Library (MML) dates to January 1, 1989, when three basic articles defining the language and axioms of set theory and arithmetic were submitted. Since then the development of Mizar has been largely driven by the growth of the MML. Apart from the further development of the language and proof-checking mechanisms, a number of tools for proof and library refactoring have been developed.
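Returning to Mizar's soft types: their flavour can be caricatured as predicates plus a user-extensible subtyping graph. The following toy Python rendition is our own invention and is far removed from Mizar's actual registration mechanisms; it only shows automatic widening along registered edges, and an explicit, justified retyping loosely in the spirit of RECONSIDER.

```python
# A "type" is just a predicate on untyped objects; subtype_edges are the
# registered widenings the checker traverses automatically.
preds = {'natural': lambda x: isinstance(x, int) and x >= 0,
         'integer': lambda x: isinstance(x, int),
         'even':    lambda x: isinstance(x, int) and x % 2 == 0}
subtype_edges = {('natural', 'integer'), ('even', 'integer')}

def widens(a, b):
    # Does type a widen automatically to type b along registered edges?
    # (Assumes the registered graph is acyclic, as a real system would check.)
    if a == b:
        return True
    return any(x == a and widens(y, b) for x, y in subtype_edges)

def reconsider(x, ty):
    # Explicit retyping backed by a justification (here simply an
    # executable check), loosely analogous to Mizar's RECONSIDER.
    assert preds[ty](x), f'{x} cannot be reconsidered as {ty}'
    return x, ty

print(widens('natural', 'integer'))   # True: handled automatically
print(widens('integer', 'even'))      # False: needs explicit justification
print(reconsider(4, 'even'))          # (4, 'even')
```

The point of the sketch is only the division of labour: widening along registered edges costs the user nothing, while a step outside the hierarchy demands an explicit, checked justification.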
A Library Committee was established, and gradually more and more of the core Mizar team's work shifted to refactoring the library so that duplication is avoided and theories are developed in a general and useful form [Rudnicki and Trybulec, 2003]. Perhaps the largest project done in Mizar so far has been the formalization of about 60% of the Compendium of Continuous Lattices [Bancerek and Rudnicki, 2002], which followed the last QED workshop organized by the Mizar team in Warsaw.13 This effort resulted in about 60 MML articles. One of the main lessons learned (see also Kreisel's motto above) by the Mizar team from such large projects has been expressed in [Rudnicki and Trybulec, 1999] as follows:

The MIZAR experience indicates that computerized support for mathematics aiming at the QED goals cannot be designed once and then simply implemented. A system of mechanized support for mathematics

12 This soft typing system bears some similarities to the sort system implemented in ACL2, and also to the type system used by the early Ontic system. Also the more recent soft (non-foundational) typing mechanisms such as type classes in Isabelle and Coq, and canonical structures in Coq, can to a large extent be seen as driven by the same purpose as types have in Mizar: non-foundational mechanisms for automating the work with hierarchies of defined concepts that can overlap in various ways.

History of Interactive Theorem Proving


is likely to succeed if it has an evolutionary nature. The main components of such a system – the authoring language, the checking software, and the organization of the data base – must evolve as more experience is collected. At this moment it seems difficult to extrapolate the experience with MIZAR to the fully fledged goals of the QED Project. However, the people involved in MIZAR are optimistic.


The LCF approach and the systems based on type theory all tend to emphasize a highly foundational approach to proof, with a (relatively) small proof kernel and a simple axiomatic basis for the mathematics used. While Mizar’s software architecture doesn’t ostensibly have the same foundational style, in practice its automation is rather simple, arguably an important characteristic since it also enables batch proof script checking to be efficient. Thus, all these systems emphasize simple and secure foundations and try to build up from there. Nowadays LCF-style systems in particular offer quite powerful automated support, but this represents the culmination of decades of sometimes arduous research and development work in foundational implementations of automation. In the first decade of their life, many systems like Coq and Isabelle/HOL that nowadays seem quite powerful only offered rather simple and limited automation, making some of the applications of the time seem even more impressive. A contrasting approach is to begin with state-of-the-art automation, regardless of its foundational characteristics. Systems with this philosophy were usually intended to be applied immediately to interesting examples, particularly in software verification, and in many cases were intimately connected with custom program verification frameworks. (For example, the GYPSY verification framework [Good et al., 1979] tried to achieve just the kind of effective blend of interaction and automation we are considering, and had a significant impact on the development of proof assistants.) Indeed, in many cases these proof assistants became vehicles for exploring approaches to automation, and thus pioneered many techniques that were later re-implemented in a foundational context by the other systems. 
Although there are numerous systems worthy of mention — we note in passing EVES/Never [Craigen et al., 1991], KIV [Reif, 1995] and SDVS [Marcus et al., 1985] — we will focus on two major lines of research that we consider to have been the most influential. Interestingly, their overall architectures have relatively little common ground — one emphasizes automation of inductive proofs with iterated waves of simplification by conditional rewriting, the other integration of quantifier-free decision procedures via congruence closure. In their different ways, both have profoundly influenced the field. One might of course characterize them as automated provers rather than interactive ones, and some of this work has certainly been influential in the field of pure automation. Nevertheless, we consider that they belong primarily to the interactive world, because they are systems that are normally used to attack challenging problems via a human process of interaction and lemma generation, even though the automation in the background is unusually powerful. For example, the authors of NQTHM say the following [Boyer and Moore, 1988]:

In a shallow sense, the theorem prover is fully automatic: the system requires no advice or directives from the user once a proof attempt has started. The only way the user can alter the behavior of the system during a proof attempt is to abort the proof attempt. However, in a deeper sense, the theorem prover is interactive: the data base – and hence the user's past actions – influences the behavior of the theorem prover.



The story of NQTHM and ACL2 really starts with the fertile collaboration between Robert Boyer and J Strother Moore, both Texans who nevertheless began their work together in 1971 when they were both at the University of Edinburgh. However, we can detect the germ of some of the ideas further back in the work of Boyer's PhD advisor, Woody Bledsoe. Bledsoe was at the time interested in more human-oriented approaches to proof, swimming against the tide of the then-dominant interest in resolution-like proof search. For example, Bledsoe and Bruell [1974] implemented a theorem prover that was used to explore semi-automated proof, particularly in general topology. In a sense this belongs in our list of pioneering interactive systems because it did provide a rudimentary interactive language for the human to guide the proof, e.g. PUT to explicitly instantiate a quantified variable.

The program placed particular emphasis on the systematic use of rewriting, using equations to simplify other formulas. Although this also appeared in other contexts under the name of demodulation [Wos et al., 1967] or as a special case of superposition in completion [Knuth and Bendix, 1970], and has subsequently developed into a major research area in itself [Baader and Nipkow, 1998], Bledsoe's emphasis was instrumental in establishing rewriting and simplification as a key component of many interactive systems.

Although Boyer and Moore briefly worked together on the popular theme of resolution proving, they soon established their own research agenda: formalizing proofs by induction. In a fairly short time they developed their ‘Pure LISP theorem prover’, which as the name suggests was designed to reason about recursive functions in a subset of pure (functional) Lisp.14 The prover used some relatively simple but remarkably effective techniques.
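The left-to-right use of equations just described can be sketched as a naive rewriter. The tuple encoding of terms and the function names below are our own illustrative choices; production simplifiers add conditional rules, term orderings and termination checks.

```python
def match(pattern, term, subst):
    # Try to extend subst so that pattern instantiated equals term.
    # Variables are plain strings; compound terms are tuples (op, args...).
    if isinstance(pattern, str):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if (isinstance(term, tuple) and pattern[0] == term[0]
            and len(pattern) == len(term)):
        for p, t in zip(pattern[1:], term[1:]):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return None

def instantiate(pattern, subst):
    if isinstance(pattern, str):
        return subst[pattern]
    return (pattern[0],) + tuple(instantiate(p, subst) for p in pattern[1:])

def normalize(rules, term):
    # Rewrite innermost-first until no rule applies (termination is the
    # rule author's responsibility, as with real simplifiers).
    if isinstance(term, tuple):
        term = (term[0],) + tuple(normalize(rules, t) for t in term[1:])
    for lhs, rhs in rules:
        s = match(lhs, term, {})
        if s is not None:
            return normalize(rules, instantiate(rhs, s))
    return term

# 0 + x = x  and  s(x) + y = s(x + y), used left-to-right
rules = [(('add', ('zero',), 'x'), 'x'),
         (('add', ('s', 'x'), 'y'), ('s', ('add', 'x', 'y')))]
one = ('s', ('zero',))
two = ('s', ('s', ('zero',)))
print(normalize(rules, ('add', two, one)))  # ('s', ('s', ('s', ('zero',))))
```

Using previously proved equations this way, as directed simplification rules rather than as symmetric facts for search, is precisely the practice Bledsoe's work helped to establish.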
Most of the interesting functions were defined by primitive recursion of one sort or another, for example over N by defining f (n + 1) in terms of f (n) or over lists by defining f (CONS h t), where CONS is the Lisp list constructor, h the head element and t the tail of remaining elements, in terms of f (t). (In fact, only the list form was primitive in the prover, with

14 Note that the prover was not then implemented in Lisp, but rather in POP-2, a language developed at Edinburgh by Robin Popplestone and Rod Burstall.



[Diagram: conjectures cascade through stages including simplification, destructor elimination, and elimination of irrelevance.]

Figure 4: The Boyer-Moore ‘Waterfall’ model
natural numbers being represented via lists in zero-successor form.) The pattern of recursive definitions was used to guide the application of induction principles and so produce explicit induction hypotheses. Moreover, the prover was also able to generalize the statement to be proved in order better to apply induction — it is a well-known phenomenon that this can make inductive proofs easier because one strengthens the inductive hypothesis that is available. These two distinctive features were at the heart of the prover, but it also benefited from a number of additional techniques like the systematic use of rewriting. Indeed, it was emphasized that proofs should first be attempted using simpler and more controllable techniques like rewriting, with induction and generalization only applied if those were not sufficient.

The overall organization of the prover was a prototypical form of what has become known as the ‘Boyer-Moore waterfall model’. One imagines conjectures as water flowing down a waterfall to a ‘pool’ below. On their way down, conjectures may be modified (for example by rewriting), they may be proven (in which case they evaporate), they may be refuted (in which case the overall process fails) or they may get split into others. When all the ‘easy’ methods have been applied, generalization and induction take place, and the new induction hypotheses generated give rise to another waterfall. This process is shown graphically in the traditional picture in Figure 4, although not all the initial steps were present from the beginning.

The next stage in development was a theorem prover concisely known as THM, which then evolved via QTHM (‘quantified THM’) into NQTHM (‘new quantified THM’).
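The waterfall control flow just described can be sketched as a toy driver. Everything here is a schematic stand-in for the real prover's heuristics: stages and the induction step are pluggable functions, and in the demonstration a "conjecture" is simply an integer, not a formula.

```python
def waterfall(stages, induction, conjectures):
    # Each stage maps a conjecture to 'proved', 'refuted', a list of
    # residual conjectures (modified or split), or None (no progress).
    # Anything that survives every stage lands in the pool, and the
    # output of induction is poured over the waterfall again.
    pool = []
    queue = list(conjectures)
    while queue:
        c = queue.pop()
        for stage in stages:
            result = stage(c)
            if result is None:
                continue
            if result == 'proved':
                break              # the conjecture evaporates
            if result == 'refuted':
                return False       # the overall process fails
            queue.extend(result)   # modified or split conjectures
            break
        else:
            pool.append(c)         # survived every 'easy' stage
    if not pool:
        return True
    return waterfall(stages, induction,
                     [h for c in pool for h in induction(c)])

# Schematic demonstration: one stage proves 0, another splits big goals,
# and "induction" replaces a goal by its predecessor.
stages = [lambda n: 'proved' if n == 0 else None,
          lambda n: [n // 2, n - n // 2] if n > 4 else None]
induction = lambda n: [n - 1]
print(waterfall(stages, induction, [6]))   # True
```

The essential point the sketch preserves is the ordering: cheap, controllable stages run to exhaustion before the expensive induction step generates a fresh round of goals.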



This system was presented in book form [Boyer and Moore, 1979] and brought Boyer and Moore's ideas to a much wider audience as well as encouraging actual use of the system. Note that Boyer and Moore did not at that time use the term NQTHM in their own publications, and although it was widely known simply as ‘the Boyer-Moore theorem prover’, they were too modest to use that term themselves.

NQTHM had a number of developments over its predecessors. As the name implies, it supported formulas with (bounded) quantification. It made more extensive and systematic use of simplification, using previously proved lemmas as conditional, contextual rewrite rules. A so-called shell principle allowed users to define new data types instead of reducing everything explicitly to lists. The system was able to handle not only primitive recursive definitions and structural inductions [Burstall, 1969] over these types, but also definitions by wellfounded recursion and proofs by wellfounded induction, using an explicit representation of countable ordinals internally. A decision procedure was also added for rational linear arithmetic.

All these enhancements made the system much more practical and it was subsequently used for many non-trivial applications, including Hunt's pioneering verification of a microprocessor [Hunt, 1985] and Shankar's checking of Gödel's First Incompleteness Theorem [Shankar, 1994], as well as others we will discuss briefly later on. In a significant departure from the entirely automatic Pure Lisp Theorem Prover, NQTHM also supported the provision of hints for guidance and a proof checker allowing each step of the proof to be specified interactively.

The next step in the evolution of this family, mainly the work of Moore and Matt Kaufmann, was a new system called ACL2, ‘A Computational Logic for Applicative Common Lisp’ [Kaufmann et al., 2000b].
(Boyer himself helped to establish the project and continued as an important inspiration and source of ideas, but at some point stopped being involved with the actual coding.) Although many aspects of NQTHM including its general style were retained, this system all but eliminated the distinction between its logic and its implementation language — both are a specific pure subset of Common Lisp. One advantage of such an identification is efficiency. In many of the industrial-scale applications of NQTHM mentioned above, a key requirement is efficient execution of functions inside the logic. Instead of the custom symbolic execution framework in NQTHM, ACL2 simply uses direct Lisp execution, which is generally much faster. This identification of the implementation language and logic also makes possible a more principled way of extending the system with new verified decision procedures, a topic we discuss later. ACL2 has further accelerated and consolidated the use of this family of provers in many applications of practical interest [Kaufmann et al., 2000a].

Besides such concrete applications of their tools, Boyer and Moore's ideas on induction in particular have spawned a large amount of research in automated theorem proving. A more detailed overview of the development we have described, from the perspective of the automation of inductive proof, is given by Moore and Wirth [2013]. Among the research topics directly inspired by Boyer and Moore's work on induction are Bundy's development of proof planning [Bundy et al., 1991] and the associated techniques like rippling. Since this is somewhat outside our



purview we will not say more about this topic.



There was intense interest in the 1970s and 1980s in the development of frameworks that could perform computer system verification. This was most pronounced, accompanied by substantial funding in the US, for verification of security properties such as isolation in time-sharing operating systems (these were then quite new and this property was a source of some concern), which was quite a stimulus to the development of formal verification and theorem proving in general [MacKenzie, 2001]. Among the other systems developed were AFFIRM, GYPSY [Good et al., 1979], Ina Jo and the Stanford Pascal Verifier. Closely associated with this was the development of combined decision procedures by Nelson and Oppen [1980] and by Shostak [1984].

One other influential framework was HDM, the ‘hierarchical development methodology’. The ‘hierarchical’ aspect meant that it could be used to describe systems at different levels of abstraction where a ‘black box’ at one level could be broken down into other components at a lower level. HDM top-level specifications were written in SPECIAL, a ‘Specification and Assertion Language’. A security flow analyzer generated verification conditions that were primarily handled using the Boyer-Moore prover discussed previously.

Unfortunately, the SPECIAL language and the Boyer-Moore prover were not designed together, and turned out not to be very smoothly compatible. This meant that a layer of translation needed to be applied, which often rendered the back-end formulas difficult to understand in terms of the original specification. Together with the limited interaction model of the prover, this effectively made it clumsy for users to provide any useful interactive guidance.
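The combined decision procedures of Nelson-Oppen and Shostak rest on congruence closure, which can be sketched as union-find over ground terms. The class below is our own naive, quadratic illustration, not any particular system's algorithm; the essential step is the congruence propagation that merges f(a) with f(b) once a and b have been merged.

```python
class CongruenceClosure:
    # Terms are strings (constants) or tuples (op, args...).
    def __init__(self, equations):
        self.parent = {}
        self.terms = set()
        for lhs, rhs in equations:
            self._register(lhs)
            self._register(rhs)
        for lhs, rhs in equations:
            self._union(lhs, rhs)
            self._propagate()

    def _register(self, t):
        if t in self.terms:
            return
        self.terms.add(t)
        self.parent[t] = t
        if isinstance(t, tuple):
            for arg in t[1:]:
                self._register(arg)

    def find(self, t):
        while self.parent[t] != t:
            t = self.parent[t]
        return t

    def _union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb

    def _propagate(self):
        # Congruence: merge any two compound terms with the same operator
        # and pairwise-equal arguments, until nothing changes.
        changed = True
        while changed:
            changed = False
            compounds = [t for t in self.terms if isinstance(t, tuple)]
            for i in range(len(compounds)):
                for j in range(i + 1, len(compounds)):
                    s, t = compounds[i], compounds[j]
                    if (s[0] == t[0] and len(s) == len(t)
                            and self.find(s) != self.find(t)
                            and all(self.find(x) == self.find(y)
                                    for x, y in zip(s[1:], t[1:]))):
                        self._union(s, t)
                        changed = True

    def equal(self, a, b):
        self._register(a)
        self._register(b)
        self._propagate()
        return self.find(a) == self.find(b)

# From a = b, conclude f(f(a)) = f(f(b)).
cc = CongruenceClosure([('a', 'b')])
print(cc.equal(('f', ('f', 'a')), ('f', ('f', 'b'))))   # True
```

Production implementations achieve near-linear behaviour with signature tables and use lists, but the inference drawn is the same.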
Based on the experiences with HDM, a new version EHDM (‘Enhanced HDM’) was developed starting in 1983, with most of the system designed by Michael Melliar-Smith, John Rushby and Richard Schwarz, while Shostak's decision procedure suite STP was further developed and used as a key component [Melliar-Smith and Rushby, 1985]. Technically this was somewhat successful, introducing many influential ideas such as a system of modules giving parametrization at the theory level (though not fine-grained polymorphism in the HOL sense). It was also used in a number of interesting case studies such as the formalization [Rushby and von Henke, 1991] of an article by Lamport and Melliar-Smith [1985] containing a proof of correctness for a fault-tolerant clock synchronization algorithm, which identified several issues with the informal proof.

Working with Sam Owre and Natarajan Shankar, John Rushby led the project to develop PVS (originally at least standing for ‘Prototype Verification System’) [Owre et al., 1992] as a new prover for EHDM. Over time it took on a life of its own while EHDM for a variety of technical, pragmatic and social reasons fell into disuse. Among other things, Shostak and Schwarz left to start the database company Paradox, and US Government restrictions made it inordinately difficult for many prospective users to get access to EHDM. Indeed, it was common to



hear PVS expanded as ‘People's Verification System’ to emphasize the more liberal terms on which it could be used. The goal of PVS was to retain the advantages of EHDM, such as the richly typed logic and the parametrized theories, while addressing some of its weaknesses, making automated proof more powerful (combining Shostak-style decision procedures and effective use of rewriting) and supporting top-down interactive proof via a programmable proof language.

At the time there was a widespread belief that one had to make an exclusive choice between a rich logic with weak automation (Automath) or a weak logic with strong automation (NQTHM). One of the notable successes of PVS was in demonstrating convincingly that it was quite feasible to have both.

The PVS logic (or ‘specification language’) is a strongly typed higher-order logic. It does not have the sophisticated dependent type constructors found in some constructive type theories, but unlike HOL it allows some limited use of dependent types, where types are parametrized by terms. In particular, given any type α and a subset of (or predicate over) the type α, there is always a type corresponding to that subset. In other words, PVS supports predicate subtypes. In HOL, the simple type system has the appealing property that one can infer the most general types of terms fully automatically. The price paid for the predicate subtypes in PVS is that in general typechecking (that is, deciding whether a term has a specific type) may involve arbitrarily difficult theorem proving, and the processes of type checking and theorem proving are therefore intimately intertwined. On the other hand, because of the powerful automation, many of the type correctness conditions (TCCs) can still be decided without user interaction.

The PVS proof checker presents the user with a goal-directed view of the proving process, representing goals using multiple-conclusion sequents.
Many basic commands for decomposing and simplifying goals are as in many other interactive systems like Coq or HOL. But PVS also features powerful and tightly integrated decision procedures that are able to handle many routine goals automatically in response to a simple invocation of the simplify command. Although PVS does not make the full implementation language available for programming proof procedures, there is a special Lisp-like language that can be used to link proof commands together into custom strategies.
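How predicate subtypes turn typechecking into proof-obligation generation can be illustrated with a toy checker. The term encoding and the function typecheck below are entirely hypothetical, not PVS syntax; real TCCs are of course generated from a much richer language and then attacked by the decision procedures.

```python
# A term is a constant name or ('app', f, arg).  env maps constants to
# their declared types; sigs maps a function to (domain type, domain
# predicate or None, result type).  Applying f emits a TCC string when
# the argument is not already known to satisfy f's domain predicate.
def typecheck(term, env, sigs, tccs):
    if isinstance(term, str):
        return env[term]
    _, f, arg = term
    arg_ty = typecheck(arg, env, sigs, tccs)
    dom, pred, cod = sigs[f]
    if pred is not None and arg_ty != dom:
        # Argument only known at the base type: emit a proof obligation
        # rather than rejecting the term outright.
        tccs.append(f'{pred}({arg})')
    return cod

# inv has domain {x : real | nonzero(x)}; y is a bare real, while z is
# already known to be nonzero.
sigs = {'inv': ('nonzero', 'nonzero', 'real')}
env = {'y': 'real', 'z': 'nonzero'}
tccs = []
typecheck(('app', 'inv', 'y'), env, sigs, tccs)
typecheck(('app', 'inv', 'z'), env, sigs, tccs)
print(tccs)   # ['nonzero(y)'] -- only the unproven case generates a TCC
```

The sketch shows why typechecking and proving intertwine: accepting inv(y) is exactly as hard as discharging the emitted obligation nonzero(y).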



Having seen some of the main systems and the ideas they introduced in foundations, software architecture, proof language etc., let us step back and reflect on some of the interesting sources of diversity and examine some of the research topics that naturally preoccupy researchers in the field.





For those with only a vague interest in foundations who somehow had the idea that ZF set theory was the standard foundation for mathematics, the diversity, not to say Balkanization, of theorem provers according to foundational system may come as a surprise. We have seen at least the following as foundations even in the relatively few systems we've surveyed here:

• Quantifier-free logic with induction (NQTHM, ACL2)
• Classical higher-order logic (HOLs, PVS)
• Constructive type theory (Coq, NuPRL)
• First-order set theory (Mizar, EVES, Isabelle/ZF)
• Logics of partial terms (LCF, IMPS, Isabelle/HOLCF)

Some of this diversity arises because of specific philosophical positions among the systems' developers regarding the foundations of mathematics. For example, modern mathematicians (for the most part) use nonconstructive existence proofs without a second thought, and this style fits very naturally into the framework of classical first-order set theory. Yet ever since Brouwer's passionate advocacy [van Dalen, 1981] there has been a distinct school of intuitionistic or constructive mathematics [Beeson, 1984; Bishop and Bridges, 1985]. While Brouwer had an almost visceral distaste for formal logic, Heyting introduced an intuitionistic version of logic, and although there are workable intuitionistic versions of formal set theory, the type-theoretic frameworks exploiting the Curry-Howard correspondence between propositions and types, such as Martin-Löf's type theory [Martin-Löf, 1984], are arguably the most elegant intuitionistic formal systems, and it is these that have inspired Coq, NuPRL and many other provers. Other motivations for particular foundational schemes are pragmatic.
For example, HOL's simple type theory pushes a lot of basic domain reasoning into automatic typechecking, simplifying the task of producing a reasonable level of mechanical support, while the very close similarity with the type system of the ML programming language makes it feel natural to a lot of computer scientists. The quantifier-free logic of NQTHM may seem impoverished, but the very restrictiveness makes it easier to provide powerful automation, especially of inductive proof, and forces definitions to be suitable for actual execution.

Indeed, a little reflection shows that the distinction between philosophical and pragmatic motivations is not clear-cut. While one will not find any philosophical claims about constructivism associated with NQTHM and ACL2, it is a fact that the logic is even more clearly constructive than intuitionistic type theories.15 Despite the Lisp-like syntax, it is conceptually close to primitive recursive arithmetic

15 At least in its typical use — we neglect here the built-in interpreter axiomatized in NQTHM, which could be used to prove nonconstructive results [Kunen, 1998].



(PRA) [Goodstein, 1957]. And many people find intuitionistic logic appealing not so much because of philosophical positions on the foundations of mathematics but because, at least in principle, the Curry-Howard correspondence has a more pragmatic side: one can consider a proof in a constructive system actually to be a program [Bates and Constable, 1985].

The language we use can often significantly influence our thoughts, whether it be natural language, mathematical notation or a programming language [Iverson, 1980]. Similarly, the chosen foundations can influence mathematical formalization either for good or ill, unifying and simplifying it or twisting it out of shape. Indeed, it can even influence the kinds of proofs we try to formalize. For example, ACL2's lack of traditional quantifiers makes it unappealing to formalize traditional epsilon-delta proofs in real analysis, yet it seems ideally suited to the reasoning in nonstandard analysis, an idea that has been extensively developed by Gamboa [1999]; for another development of this topic in Isabelle/HOL see [Fleuriot, 2001].

In particular, the value of types is somewhat controversial. Both types [Whitehead and Russell, 1910; Ramsey, 1926; Church, 1940] and the axiomatic approach to set theory culminating in modern systems like ZF, NBG etc. originated in attempts to resolve the paradoxes of naive set theory, and may be seen as two competing approaches. Set theory has long been regarded as the standard foundation, but it seems that at least when working in concrete domains, most mathematicians do respect natural type distinctions (points versus lines, real numbers versus sets of real numbers). Even simpler type systems like that of HOL make a lot of formalizations very convenient, keeping track of domain conditions and compatibility automatically and catching blunders at an early stage. However, for some formalizations the type system ceases to help and becomes an obstacle.
This seems to occur particularly in traditional abstract algebra, where constructions are sometimes presented in a very type-free way. For example, a typical “construction” of the algebraic closure of a field proceeds by showing that one can extend a given field F with a root a of a polynomial p ∈ F[x], and then, roughly speaking, iterating that construction transfinitely (this is more typically done via Zorn's Lemma or some such maximal principle, but one can consider it as a transfinite recursion). Yet the usual way of adding a single root takes one from the field F to a field of equivalence classes of polynomials over F (the quotient of F[x] by the ideal generated by p). When implemented straightforwardly this might lie two levels above F itself: if we think of elements of F as belonging to a type α, then polynomials over F might be functions N → F (albeit with finite support) and then equivalence classes represented as Boolean functions over that type, so we have moved to (N → F) → 2. And that whole process needs to be iterated transfinitely.

Of course one can use cardinality arguments to choose some sufficiently large type once and for all and map everything back into that type at each stage. One may even argue that this gives a more refined theorem with information about the cardinality of the algebraic closure, but the value of being forced to do so by the foundation is at best questionable. Another limitation of the simple HOL type system is that there is no explicit quantifier over polymorphic



type variables, which can make many standard results like completeness theorems and universal properties awkward to express, though there are extensions with varying degrees of generality that fix this issue [Melham, 1992; Voelker, 2007; Homeier, 2009]. Inflexibilities of these kinds certainly arise in simple type theories, and it is not even clear that more flexible dependent type theories (where types can be parametrized by terms) are immune. For example, in one of the most impressive formalization efforts to date [Gonthier et al., 2013] the entire group theory framework is developed in terms of subsets of a single universe group, apparently to avoid the complications from groups with general and possibly heterogeneous types.

Even if one considers types a profoundly useful concept, it does not follow of course that they need to be hardwired into the logic. Starting from a type-free foundation, it is perfectly possible to build soft types as a derived concept on top, and this is effectively what Mizar does, arguably giving a good combination of flexibility, convenience and simplicity [Wiedijk, 2007]. In this sense, types can be considered just as sets or something very similar (in general they can be proper classes in Mizar). On the other hand, some recent developments in foundations known as homotopy type theory or univalent foundations give a distinctive role to types, treating equality on types according to a homotopic interpretation that may help to formalize some everyday intuitions about identifying isomorphic objects.

Another interesting difference between the various systems (or at least the way mathematics is usually formalized in them) is the treatment of undefined terms like 0^-1 that arise from the application of functions outside their domain. In informal mathematics we often filter out such questions subconsciously, but the exact interpretation of such undefinedness can be critical to the assertion being made.
We can identify three main approaches taken in interactive provers:

• Totalization (usual in HOL) — functions are treated as total, either giving them an arbitrary value outside their domain or choosing one that is particularly convenient for making handy theorems work in the degenerate cases too. For example, setting 0^-1 = 0 [Harrison, 1998] looks bizarre at first sight, but it lets us employ natural rewrite principles like (x^-1)^-1 = x, -x^-1 = (-x)^-1, (xy)^-1 = x^-1 y^-1 and x^-1 ≥ 0 ⇔ x ≥ 0 without any special treatment of the zero case. (There is actually an algebraic theory of meadows, effectively fields with this totalization [Bergstra et al., 2007].) While simple, it has the disadvantage that equations like f(x) = y do not carry with them the information that f is actually defined at point x, arguably a contrast with informal usage, so one must add additional conditions or use relational reformulations.

• Type restrictions (usual in PVS) — the domain restrictions of partial functions are implemented via the type system, for example giving the inverse operation a type R′ → R where R′ corresponds to R − {0}. This seems quite natural in some ways, but it can mean that types become very intricate for complicated theorems. It can also mean that the precise meaning of formulas



like ∀x ∈ R. tan(x) = 0 ⇒ ∃n ∈ Z. x = nπ, or even whether such a formula is acceptable or meaningful, can depend on quite small details of how the typechecking and basic logic interact.

• Logics of partial terms (as supported by IMPS [Farmer et al., 1990]) — here there is a first-class notion of ‘defined’ and ‘undefined’ in the foundational system itself. Note that while it is possible to make the logic itself 3-valued so there is also an ‘undefined’ proposition [Barringer et al., 1984], this is not necessary and many systems allow partial terms while maintaining bivalence. One can have different variants of the equality relation such as ‘either both sides are undefined or both are defined and equal’. While apparently complicated and apt to throw up additional proof obligations, this sort of logical system and interpretation of the equality relation arguably gives the most faithful analysis of informal mathematics.
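The claim in the first item, that the totalized inverse validates these rewrites with no zero side condition, is easy to spot-check. The following sketch is our own, using exact rational arithmetic; it simply tests the four quoted laws on sample values including zero.

```python
from fractions import Fraction

def inv(x):
    # Totalized inverse in the HOL Light style: 0^-1 is defined to be 0.
    return Fraction(0) if x == 0 else 1 / x

# Sample rationals including 0 and negatives.
samples = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]

assert all(inv(inv(x)) == x for x in samples)                      # (x^-1)^-1 = x
assert all(-inv(x) == inv(-x) for x in samples)                    # -x^-1 = (-x)^-1
assert all(inv(x * y) == inv(x) * inv(y)
           for x in samples for y in samples)                      # (xy)^-1 = x^-1 y^-1
assert all((inv(x) >= 0) == (x >= 0) for x in samples)             # x^-1 >= 0 <=> x >= 0
print('all four laws hold, including at 0')
```

Of course such testing proves nothing by itself; in HOL the laws are theorems, but the experiment conveys why the apparently bizarre choice 0^-1 = 0 is so convenient for rewriting.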


Proof language

As we noted at the beginning, one significant design decision in interactive theorem proving is choosing a language in which a human can communicate a proof outline to the machine. From the point of view of the user, the most natural desideratum might be that the machine should understand a proof written in much the same way as a traditional one from a paper or textbook. Even accepting that this is indeed desirable, there are two problems in realizing it: getting the computer to understand the linguistic structure of the text, and having the computer fill in the gaps that human mathematicians consider as obvious. Recently there has been some progress in elucidating the structure of traditional mathematical texts such that a computer could unravel much of it algorithmically [Ganesalingam, 2013], but we are still some way from having computers routinely understand arbitrary mathematical documents. And even quite intelligent human readers sometimes have difficulty in filling in the gaps in mathematical proofs. Subjectively, one can sometimes argue that such gaps amount to errors of omission where the author did not properly appreciate some of the difficulties, even if the final conclusion is indeed accurate.

All in all, we are some way from the ideal of accepting existing documents, if ideal it is. The more hawkish might argue that formalization presents an excellent opportunity to present proofs in a more precise, unambiguous and systematic — one might almost say machine-like — way [Dijkstra and Scholten, 1990].

In current practice, the proof languages supported by different theorem proving systems differ in a variety of ways. One interesting dichotomy is between procedural and declarative proof styles [Harrison, 1996c]. This terminology, close to its established meaning in the world of programming languages, was suggested by Mike Gordon.
Roughly, a declarative proof outlines what is to be proved, for example a series of intermediate assertions that act as waystations between the assumptions and conclusions. By contrast, a procedural proof explicitly states how to perform the proof (‘rewrite the second term with lemma 7 . . . ’), and some procedural theorem provers, such as those in the LCF tradition, use a full programming language to choreograph the proof process. To exemplify procedural proof, here is a HOL Light proof of the core lemma in the theorem that √2 is irrational, as given in [Wiedijk, 2006]. It contains a sequence of procedural steps, and even for the author it is not easy to understand what they all do without stepping through them in the system.

let NSQRT_2 = prove
 (‘!p q. p * p = 2 * q * q ==> q = 0‘,
  MATCH_MP_TAC num_WF THEN REWRITE_TAC[RIGHT_IMP_FORALL_THM] THEN
  REPEAT STRIP_TAC THEN FIRST_ASSUM(MP_TAC o AP_TERM ‘EVEN‘) THEN
  REWRITE_TAC[EVEN_MULT; ARITH] THEN REWRITE_TAC[EVEN_EXISTS] THEN
  DISCH_THEN(X_CHOOSE_THEN ‘m:num‘ SUBST_ALL_TAC) THEN
  FIRST_X_ASSUM(MP_TAC o SPECL [‘q:num‘; ‘m:num‘]) THEN
  POP_ASSUM MP_TAC THEN CONV_TAC SOS_RULE);;

By contrast, consider the following declarative proof using the Mizar mode for HOL Light [Harrison, 1996b], which is a substantial fragment of the proof of the Knaster-Tarski fixed point theorem [Knaster, 1927; Tarski, 1955].16 There is not a single procedural step, merely structuring commands like variable introduction (‘let a’) together with a sequence of intermediate assertions and the premises from which they are supposed (somehow) to follow:

consider a such that
  lub: . . .

[. . . ]

Possibilistic Logic — An Overview
Didier Dubois, Henri Prade

A possibility distribution is a mapping π from a universe U of mutually exclusive states to a totally ordered possibility scale S, with top denoted 1 and bottom denoted 0, e.g. a finite chain L = {1 = λ0 > λ1 > . . . > λn > λn+1 = 0}. The possibility scale can be the unit interval as suggested by Zadeh, or generally any finite chain, or even the set of non-negative integers.¹ For a detailed discussion of different types of scales in a possibility theory perspective, the reader is referred to [Benferhat et al., 2010]. The function π represents the state of knowledge of an agent (about the actual state of affairs), also called an epistemic state, distinguishing what is plausible from what is less plausible, what is the normal course of things from what is not, and what is surprising from what is expected.
It represents a flexible restriction on what is the actual state of facts, with the following conventions (similar to probability, but opposite to Shackle’s potential surprise scale, which refers to impossibility):

¹ If S = N, the conventions are opposite: 0 means possible and ∞ means impossible.



• π(u) = 0 means that state u is rejected as impossible;
• π(u) = 1 means that state u is totally possible (= plausible).

The larger π(u), the more possible, i.e., plausible, the state u is. Formally, the mapping π is the membership function of a fuzzy set [Zadeh, 1978], where membership grades are interpreted in terms of plausibility. If the universe U is exhaustive, at least one of its elements should be the actual world, so that ∃u, π(u) = 1 (normalization). This condition expresses the consistency of the epistemic state described by π. Distinct states may simultaneously have a degree of possibility equal to 1. Moreover, as Shackle wrote as early as 1949: “An outcome that we looked on as perfectly possible before is not rendered less possible by the fact that we have extended the list of perfectly possible outcomes” (see p. 114 in [Shackle, 1949]). In the {0, 1}-valued case, π is just the characteristic function of a subset E ⊆ U of mutually exclusive states, ruling out all the states outside E as impossible. Possibility theory is thus a (fuzzy) set-based representation of incomplete information.

Specificity. A possibility distribution π is said to be at least as specific as another π′ if and only if for each state of affairs u: π(u) ≤ π′(u) [Yager, 1983]. Then π is at least as restrictive and informative as π′, since it rules out at least as many states with at least as much strength. In the possibilistic framework, extreme forms of partial knowledge can be captured, namely:

• Complete knowledge: for some u0, π(u0) = 1 and π(u) = 0, ∀u ≠ u0 (only u0 is possible);
• Complete ignorance: π(u) = 1, ∀u ∈ U (all states are possible).

Possibility theory is driven by the principle of minimal specificity. It states that any hypothesis not known to be impossible cannot be ruled out. It is a minimal commitment, cautious information principle. Basically, we must always try to maximize possibility degrees, taking constraints into account.
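These notions are easy to experiment with on a finite universe. The following Python sketch (purely illustrative; the universe and state names are invented) encodes possibility distributions as dictionaries and checks normalization, the specificity ordering, and the two extreme epistemic states:

```python
# Possibility distributions over a finite universe, as dicts state -> degree in [0, 1].
U = ["u1", "u2", "u3"]

def is_normalized(pi):
    # Consistency: at least one state must be fully possible.
    return max(pi.values()) == 1.0

def at_least_as_specific(pi, pi_prime):
    # pi is at least as specific as pi' iff pi(u) <= pi'(u) for every state u.
    return all(pi[u] <= pi_prime[u] for u in pi)

complete_ignorance = {u: 1.0 for u in U}                 # every state fully possible
complete_knowledge = {"u1": 1.0, "u2": 0.0, "u3": 0.0}   # only u1 possible
partial = {"u1": 1.0, "u2": 0.4, "u3": 0.0}              # some intermediate state

assert is_normalized(complete_ignorance) and is_normalized(complete_knowledge)
# Complete knowledge is the most specific epistemic state, ignorance the least:
assert at_least_as_specific(complete_knowledge, partial)
assert at_least_as_specific(partial, complete_ignorance)
```

The minimal specificity principle amounts to always picking, among the normalized distributions compatible with the constraints, the pointwise largest one.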
Given a piece of information of the form “x is F”, where F is a fuzzy set restricting the values of the ill-known quantity x, this leads to representing the knowledge by the inequality π ≤ µF, where µF is the membership function of F. The minimal specificity principle enforces the possibility distribution π = µF if no other piece of knowledge is available. Generally, there may be further impossible values of x due to other pieces of information. Thus given several pieces



of knowledge of the form “x is Fi”, for i = 1, . . . , n, each of them translates into the constraint π ≤ µFi ; hence, several constraints lead to the inequality π ≤ min_{i=1,...,n} µFi and, on behalf of the minimal specificity principle, to the possibility distribution

π = min_{i=1,...,n} πi

where πi is induced by the information item “x is Fi” (πi = µFi). It justifies the use of the minimum operation for combining information items. It is noticeable that this way of combining pieces of information fully agrees with classical logic, since a classical logic knowledge base (i.e. a set of formulas) is equivalent to the logical conjunction of the formulas that belong to the base, and its set of models is obtained by intersecting the sets of models of its formulas. Indeed, in propositional logic, asserting a logical proposition a amounts to declaring that any interpretation (state) that makes a false is impossible, as being incompatible with the state of knowledge.

Possibility and necessity functions. Given a simple query of the form “does event A occur?” (is the corresponding proposition a true?), where A is a subset of states (the set of models of a), the response to the query can be obtained by computing the degree of possibility of A [Zadeh, 1978] and of its complement Ac:

Π(A) = sup_{u∈A} π(u) ;   Π(Ac) = sup_{u∉A} π(u).

Π(A) evaluates to what extent A is consistent with π, while Π(Ac) can be easily related to the idea of certainty of A. Indeed, the smaller Π(Ac), the more impossible Ac is, and the more certain A is. If the possibility scale S is equipped with an order-reversing map λ ∈ S ↦ ν(λ), it enables a degree of necessity (certainty) of A to be defined as N(A) = ν(Π(Ac)), which expresses the well-known duality between possibility and necessity. N(A) evaluates to what extent A is certainly implied by π. If S is the unit interval, it is usual to choose ν(λ) = 1 − λ, so that N(A) = 1 − Π(Ac) [Dubois and Prade, 1980]. Generally, Π(U) = N(U) = 1 and Π(∅) = N(∅) = 0 (since π is normalized to 1). In the {0, 1}-valued case, the possibility distribution comes down to the disjunctive (epistemic) set E ⊆ U, and possibility and necessity are then as follows:

• Π(A) = 1 if A ∩ E ≠ ∅, and 0 otherwise: function Π checks whether A is logically consistent with the available information or not.
• N(A) = 1 if E ⊆ A, and 0 otherwise: function N checks whether A is logically entailed by the available information or not.
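In the finite case both set functions are one-liners. A small Python sketch (the universe and distribution are made up for illustration), using the duality N(A) = 1 − Π(Ac):

```python
U = {"u1", "u2", "u3", "u4"}

def Pi(pi, A):
    # Possibility: Pi(A) = sup over u in A of pi(u) (0 for the empty set).
    return max((pi[u] for u in A), default=0.0)

def N(pi, A):
    # Necessity, by duality: N(A) = 1 - Pi(A^c).
    return 1.0 - Pi(pi, U - set(A))

pi = {"u1": 1.0, "u2": 0.5, "u3": 0.25, "u4": 0.0}   # normalized: pi(u1) = 1
A = {"u1", "u2"}
assert Pi(pi, A) == 1.0 and N(pi, A) == 0.75
assert max(Pi(pi, A), Pi(pi, U - A)) == 1.0   # max(Pi(A), Pi(A^c)) = 1
assert min(N(pi, A), N(pi, U - A)) == 0.0     # min(N(A), N(A^c)) = 0

# {0,1}-valued case: pi is the characteristic function of an epistemic set E.
E = {"u1", "u2"}
pi01 = {u: (1.0 if u in E else 0.0) for u in U}
assert Pi(pi01, {"u2", "u3"}) == 1.0          # A intersects E: consistent
assert N(pi01, {"u1", "u2", "u3"}) == 1.0     # E is a subset of A: entailed
```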



Possibility measures satisfy the characteristic “maxitivity” property Π(A ∪ B) = max(Π(A), Π(B)). Necessity measures satisfy an axiom dual to that of possibility measures: N(A ∩ B) = min(N(A), N(B)). On infinite spaces, these axioms must hold for infinite families of sets. As a consequence of the normalization of π, min(N(A), N(Ac)) = 0 and max(Π(A), Π(Ac)) = 1, where Ac is the complement of A, or equivalently Π(A) = 1 whenever N(A) > 0, which totally fits the intuition behind this formalism, namely that something somewhat certain should first be fully possible, i.e. consistent with the available information. Moreover, one cannot be somewhat certain of both A and Ac without being inconsistent. Note also that we only have N(A ∪ B) ≥ max(N(A), N(B)). This goes well with the idea that one may be certain about the event A ∪ B without being really certain about more specific events such as A and B.

Certainty qualification. Human knowledge is often expressed in a declarative way using statements to which belief degrees are attached. Certainty-qualified pieces of information of the form “A is certain to degree α” can be modeled by the constraint N(A) ≥ α. It represents a family of possible epistemic states π that obey this constraint. The least specific possibility distribution among them exists and is given by [Dubois and Prade, 1988]:

π(A,α)(u) = 1 if u ∈ A;  π(A,α)(u) = 1 − α otherwise.

If α = 1 we get the characteristic function of A. If α = 0, we get total ignorance. This possibility distribution is a key building block to construct possibility distributions from several pieces of uncertain knowledge. It is instrumental in possibilistic logic semantics. Indeed, e.g. in the finite case, any possibility distribution can be viewed as a collection of nested certainty-qualified statements. Let Ei = {u | π(u) ≥ λi} be the λi-cut of π, for λi ∈ L. Then it can be checked that π(u) = min_{i: u∉Ei} 1 − N(Ei) (with convention min ∅ = 1).
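Both the construction of π(A,α) and its inverse reading via level cuts can be checked numerically. A Python sketch (the example universe, levels, and helper names are ours; levels are chosen binary-exact to avoid float noise):

```python
U = {"u1", "u2", "u3"}

def Pi(pi, A):
    return max((pi[u] for u in A), default=0.0)

def N(pi, A):
    return 1.0 - Pi(pi, U - set(A))

def certainty_qualified(A, alpha):
    # Least specific distribution satisfying N(A) >= alpha.
    return {u: (1.0 if u in A else 1.0 - alpha) for u in U}

pi = certainty_qualified({"u1"}, 0.75)
assert N(pi, {"u1"}) == 0.75   # the lower bound is attained

# Reconstruction from nested cuts E_i = {u | pi(u) >= lambda_i}:
pi2 = {"u1": 1.0, "u2": 0.5, "u3": 0.25}
levels = sorted(set(pi2.values()), reverse=True)
cuts = [{u for u in U if pi2[u] >= lam} for lam in levels]
for u in U:
    vals = [1.0 - N(pi2, E) for E in cuts if u not in E]
    # pi(u) = min over i with u not in E_i of 1 - N(E_i), with min of nothing = 1
    assert min(vals, default=1.0) == pi2[u]
```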
We can also consider possibility-qualified statements of the form Π(A) ≥ β; however, the least specific epistemic state compatible with this constraint is trivial and expresses total ignorance. Two other measures A measure of guaranteed possibility or strong possibility can be defined, that differs from the functions Π (weak possibility)



and N (strong necessity) [Dubois and Prade, 1992; Dubois et al., 2000]:

∆(A) = inf_{u∈A} π(u).

It estimates to what extent all states in A are actually possible according to evidence. ∆(A) can be used as a degree of evidential support for A. Of course, this function possesses a dual conjugate ∇ such that ∇(A) = 1 − ∆(Ac) = sup_{u∉A} 1 − π(u). Function ∇(A) evaluates the degree of potential or weak necessity of A, as it is 1 only if some state u outside A is impossible. It follows that the functions ∆ and ∇ are decreasing with respect to set inclusion, and that they satisfy the characteristic properties ∆(A ∪ B) = min(∆(A), ∆(B)) and ∇(A ∩ B) = max(∇(A), ∇(B)) respectively.

Uncertain statements of the form “A is possible to degree β” often mean that any realization of A is possible to degree β (e.g. “it is possible that the museum is open this afternoon”). They can then be modeled by a constraint of the form ∆(A) ≥ β. It corresponds to the idea of observed evidence. This type of information is better exploited by assuming an informational principle opposite to that of minimal specificity, namely, any situation not yet observed is tentatively considered as potentially impossible. This is similar to the closed-world assumption. The most specific distribution π[A,β] in agreement with ∆(A) ≥ β is:

π[A,β](u) = β if u ∈ A;  π[A,β](u) = 0 otherwise.

Note that while possibility distributions induced from certainty-qualified pieces of knowledge combine conjunctively, by discarding possible states, evidential support distributions induced by possibility-qualified pieces of evidence combine disjunctively, by accumulating possible states. Given several pieces of evidence of the form “x is Fi is possible” (in the sense of guaranteed or strong possibility), for i = 1, . . . , n, each of them translates into the constraint π ≥ µFi ; hence, several constraints lead to the inequality π ≥ max_{i=1,...,n} µFi and, on behalf of a closed-world-assumption-like principle based on maximal specificity, to the possibility distribution

π = max_{i=1,...,n} πi

where πi represents the information item “x is Fi is possible”. This principle justifies the use of the maximum for combining evidential support functions. Acquiring pieces of possibility-qualified evidence leads to updating π[A,β] into some wider distribution π > π[A,β]. Any possibility distribution can be



represented as a collection of nested possibility-qualified statements of the form (Ei, ∆(Ei)), with Ei = {u | π(u) ≥ λi}, since π(u) = max_{i: u∈Ei} ∆(Ei), dually to the case of certainty-qualified statements.

The possibilistic cube of opposition. Interestingly enough, it has been shown [Dubois and Prade, 2012] that the four set-function evaluations of an event A and the four evaluations of its opposite Ac can be organized in a cube of opposition (see Figure 1), whose front and back facets are graded extensions of the traditional square of opposition [Parsons, 1997]. One facet has vertices A: N(A), I: Π(A), E: N(Ac), O: Π(Ac), and the other has vertices a: ∆(A), i: ∇(A), e: ∆(Ac), o: ∇(Ac). Counterparts of the characteristic properties of the square of opposition do hold. First, the diagonals (in dotted lines) of these facets link dual measures through the involutive order-reversing function 1 − (·). The vertical edges of the cube, as well as the diagonals of the side facets, which are bottom-oriented arrows, correspond to entailments here expressed by inequalities. Indeed, provided that π and 1 − π are both normalized, we have for all A, max(N(A), ∆(A)) ≤ min(Π(A), ∇(A)). The thick black lines of the top facet express mutual exclusiveness in the form min(N(A), N(Ac)) = min(∆(A), ∆(Ac)) = min(N(A), ∆(Ac)) = min(∆(A), N(Ac)) = 0. Dually, the double lines of the bottom facet correspond to max(Π(A), Π(Ac)) = max(∇(A), ∇(Ac)) = max(Π(A), ∇(Ac)) = max(∇(A), Π(Ac)) = 1. Thus, the cube in Figure 1 summarizes the interplay between the different measures in possibility theory.

Figure 1. The cube of opposition of possibility theory
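The four measures and the inequality carried by the cube’s vertical edges can be spot-checked on a small example. A Python sketch (the distribution is invented, chosen so that both π and 1 − π are normalized; we restrict A to nonempty proper subsets, matching the existential-import proviso of the square of opposition):

```python
from itertools import combinations

U = {"u1", "u2", "u3"}
pi = {"u1": 1.0, "u2": 0.5, "u3": 0.0}   # pi and 1 - pi are both normalized

def Pi(A):     # weak possibility: sup over A
    return max((pi[u] for u in A), default=0.0)

def N(A):      # strong necessity: 1 - Pi(A^c)
    return 1.0 - Pi(U - set(A))

def Delta(A):  # guaranteed (strong) possibility: inf over A
    return min((pi[u] for u in A), default=1.0)

def Nabla(A):  # weak necessity: 1 - Delta(A^c)
    return 1.0 - Delta(U - set(A))

# Delta is decreasing with respect to set inclusion:
assert Delta({"u1"}) >= Delta({"u1", "u2"})

# Cube of opposition edges: max(N(A), Delta(A)) <= min(Pi(A), Nabla(A)).
subsets = [set(c) for r in range(1, len(U)) for c in combinations(U, r)]
for A in subsets:
    assert max(N(A), Delta(A)) <= min(Pi(A), Nabla(A))
```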





Possibilistic logic has been developed for about thirty years; see [Dubois and Prade, 2004] for historical details. Basic possibilistic logic (also called standard possibilistic logic) was first introduced in artificial intelligence as a tool for handling uncertainty in a qualitative way in a logical setting. Later on, it appeared that basic possibilistic logic can also be used for representing preferences [Lang, 1991]. Then each logic formula represents a goal to be reached with its priority level (rather than a statement that is believed to be true with some certainty level). Possibilistic logic heavily relies on the notion of necessity measure, but may also be related [Dubois et al., 1991a] to Zadeh’s theory of approximate reasoning [Zadeh, 1979b]. A basic possibilistic logic formula is a pair (a, α) made of a classical logic formula a associated with a certainty level α ∈ (0, 1], viewed as a lower bound of a necessity measure, i.e., (a, α) is understood as N(a) ≥ α. Formulas of the form (a, 0), which do not contain any information (N(a) ≥ 0 always holds), are not part of the possibilistic language. As already said, the interval [0, 1] can be replaced by any linearly ordered scale. Necessity measures N are monotonic w.r.t. entailment: if a |= b then N(a) ≤ N(b). The min-decomposability property of necessity measures for conjunction, i.e., N(a ∧ b) = min(N(a), N(b)), expresses that to be certain about a ∧ b, one has to be certain about a and to be certain about b. Thanks to this decomposability property, a possibilistic logic base can always be put in an equivalent clausal form. An interesting feature of possibilistic logic is its ability to deal with inconsistency. Indeed a possibilistic logic base Γ, i.e.
a set of possibilistic logic formulas, viewed as a conjunction thereof, is associated with an inconsistency level inc-l(Γ), which is such that the formulas associated with a level strictly greater than inc-l(Γ) form a consistent subset of formulas. A possibilistic logic base is semantically equivalent to a possibility distribution that restricts the set of interpretations (w. r. t. the considered language) that are more or less compatible with the base. Instead of an ordinary subset of models as in classical logic, we have a fuzzy set of models, since the violation by an interpretation of a formula that is not fully certain (or imperative) does not completely rule out the interpretation. The certainty-qualified statements of possibilistic logic have a clear modal flavor. Possibilistic logic can also be viewed as a special case of a labelled deductive system [Gabbay, 1996]. Inference in possibilistic logic propagates certainty in a qualitative manner, using the law of the weakest link,



and is inconsistency-tolerant, as it enables non-trivial reasoning to be performed from the largest consistent subset of the most certain formulas. A characteristic feature of this uncertainty theory is that a set of propositions {a ∈ L : N(a) ≥ α}, in a propositional language L, that are believed at least to a certain extent is deductively closed (thanks to the min-decomposability of necessity measures with respect to conjunction). As a consequence, possibilistic logic remains very close to classical logic. From now on, we shall use letters such as a, b, c for denoting propositional formulas, while letters such as p, q, r will denote atomic formulas.

This section is organized in four main parts. First, the syntactic and semantic aspects of basic possibilistic logic are presented. Then we briefly survey extended inference machineries that take into account formulas that are “drowned” under the inconsistency level, as well as an extension of possibilistic logic that handles certainty levels in a symbolic manner, which allows for a partially known ordering of these levels. Lastly, we review another noticeable, Bayesian-like representation framework, namely possibilistic networks. They are associated with a possibility distribution decomposed thanks to conditional independence information, and provide a graphical counterpart to possibilistic logic bases, to which they are semantically equivalent.


Basic possibilistic logic

Basic possibilistic logic [Dubois et al., 1994c] has been mainly developed as a formalism for handling qualitative uncertainty (or preferences) with an inference mechanism that is a simple extension of classical logic. A possibilistic logic formula is a pair made of i) any well-formed classical logic formula, propositional or first-order, and ii) a weight expressing its certainty or priority. Its classical logic component can only be true or false: fuzzy statements with intermediary degrees of truth are not allowed in standard possibilistic logic (although extensions exist for handling fuzzy predicates [Dubois et al., 1998b; Alsinet and Godo, 2000; Alsinet et al., 2002]).

Syntactic aspects. In the following, we only consider the case of (basic) possibilistic propositional logic, ΠL for short, i.e., possibilistic logic formulas (a, α) such that a is a formula in a propositional language; for (basic) possibilistic first-order logic, the reader is referred to [Dubois et al., 1994c].

Axioms and inference rules. The axioms of ΠL are those of propositional logic, PL for short, where each axiom schema is now supposed to hold with maximal certainty, i.e. is associated with level 1 [Dubois et al., 1994c]. It has two inference rules:



• if β ≤ α then (a, α) ⊢ (a, β) (certainty weakening)
• (¬a ∨ b, α), (a, α) ⊢ (b, α), ∀α ∈ (0, 1] (modus ponens)

We may equivalently use the certainty weakening rule with the ΠL counterpart of the resolution rule:

(¬a ∨ b, α), (a ∨ c, α) ⊢ (b ∨ c, α), ∀α ∈ (0, 1] (resolution)

Using certainty weakening, it is then easy to see that the following inference rule is valid:

(¬a ∨ b, α), (a ∨ c, β) ⊢ (b ∨ c, min(α, β)) (weakest link resolution, also referred to below as α-β-resolution)

The idea that in a reasoning chain the certainty level of the conclusion is the smallest of the certainty levels of the formulas involved in the premises is at the basis of the syntactic approach proposed by [Rescher, 1976] for plausible reasoning, and would date back to Theophrastus, a follower of Aristotle. The following inference rule, which we call formula weakening, also holds as a consequence of α-β-resolution:

if a ⊢ b then (a, α) ⊢ (b, α), ∀α ∈ (0, 1] (formula weakening)

Indeed a ⊢ b expresses that ¬a ∨ b is valid in PL and thus (¬a ∨ b, 1) holds, which by applying the α-β-resolution rule with (a, α) yields the result. It turns out that any valid deduction in propositional logic is valid in possibilistic logic as well, where the corresponding propositions are associated with any level α ∈ (0, 1]. Thus since a, b ⊢ a ∧ b, we have (a, α), (b, α) ⊢ (a ∧ b, α). Note that we also have (a ∧ b, α) ⊢ (a, α) and (a ∧ b, α) ⊢ (b, α) by the formula weakening rule. Thus, stating (a ∧ b, α) is equivalent to stating (a, α) and (b, α). Thanks to this property, it is always possible to rewrite a ΠL base in the form of a collection of weighted clauses. Note also that if we assume that, for any propositional tautology t (i.e., such that t ≡ ⊤), (t, α) holds with any certainty level (which amounts to saying that each PL axiom holds with any certainty level), then the α-β-resolution rule entails the certainty weakening rule, since (¬a ∨ a, β) together with (a ∨ c, α) entails (a ∨ c, β) with β ≤ α.

Inference and consistency.
Let Γ = {(ai, αi), i = 1, . . . , m} be a set of possibilistic formulas. Inference in ΠL from a base Γ is quite similar to inference in PL. We may either use the ΠL axioms with the certainty weakening and modus ponens rules, or equivalently proceed by refutation: proving Γ ⊢ (a, α) amounts to proving Γ, (¬a, 1) ⊢ (⊥, α) by repeated application of the weakest link resolution rule, where Γ stands for the collection of ΠL formulas (a1, α1), . . . , (am, αm). Moreover, note that Γ ⊢ (a, α) if and only if Γα ⊢ (a, α) if and only if (Γα)∗ ⊢ a, where Γα = {(ai, αi) ∈ Γ | αi ≥ α} and Γ∗ = {ai | (ai, αi) ∈ Γ}. The certainty levels stratify the knowledge base Γ into nested level cuts Γα, i.e. Γα ⊆ Γβ if β ≤ α. A consequence (a, α) from Γ can only be obtained from formulas having a certainty level at least equal to α, so from formulas in Γα ; then a is a classical consequence of the PL knowledge base (Γα)∗, and α = max{β | (Γβ)∗ ⊢ a}. The inconsistency level of Γ is defined by

inc-l(Γ) = max{α | Γ ⊢ (⊥, α)}.

The possibilistic formulas in Γ whose level is strictly above inc-l(Γ) are safe from inconsistency, namely inc-l({(ai, αi) | (ai, αi) ∈ Γ and αi > inc-l(Γ)}) = 0. Indeed, if α > inc-l(Γ), (Γα)∗ is consistent. In particular, we have the remarkable property that the classical consistency of Γ∗ is equivalent to saying that Γ has an inconsistency level equal to 0: inc-l(Γ) = 0 if and only if Γ∗ is consistent.

Semantic aspects. The semantics of ΠL [Dubois et al., 1994c] is expressed in terms of possibility distributions, (weak) possibility measures and (strong) necessity measures. Let us first consider a ΠL formula (a, α) that encodes the statement N(a) ≥ α. Its semantics is given by the following possibility distribution π(a,α), defined in agreement with the certainty qualification formula of Section 2.2 by:

π(a,α)(ω) = 1 if ω ⊨ a, and π(a,α)(ω) = 1 − α if ω ⊨ ¬a.

Intuitively, the underlying idea is that any model of a should be fully possible, and that any interpretation that is a counter-model of a is all the less possible as a is more certain, i.e. as α is higher. When α = 0, the (trivial) information N(a) ≥ 0 is represented by π(a,0) = 1, and the formula (a, 0) can be ignored.
It can be easily checked that the associated necessity measure is such that N(a,α) (a) = α, and π(a,α) is the least informative possibility distribution (i.e. maximizing possibility degrees) such that this constraint holds. In fact, any possibility distribution π such that ∀ω, π(ω) ≤ π(a,α) (ω) is such that its associated necessity measure N satisfies N (a) ≥ N(a,α) (a) = α (hence is more committed).



Due to the min-decomposability of necessity measures, N(⋀i ai) ≥ α ⇔ ∀i, N(ai) ≥ α, and then any possibilistic propositional formula can be put in clausal form. Let us now consider a ΠL knowledge base Γ = {(ai, αi), i = 1, . . . , m}, thus corresponding to the conjunction of the ΠL formulas (ai, αi), each representing a constraint N(ai) ≥ αi. The base Γ is semantically associated with the possibility distribution:

πΓ(ω) = min_{i=1,...,m} π(ai,αi)(ω) = min_{i=1,...,m} max([ai](ω), 1 − αi)

where [ai] is the characteristic function of the models of ai, namely [ai](ω) = 1 if ω ⊨ ai and [ai](ω) = 0 otherwise. Thus, the least informative induced possibility distribution πΓ is obtained as the min-based conjunction of the fuzzy sets of interpretations (with membership functions π(ai,αi)) representing each formula. It can be checked that NΓ(ai) ≥ αi for i = 1, . . . , m, where NΓ is the necessity measure defined from πΓ. Note that we may only have an inequality here since Γ may, for instance, include two formulas associated with equivalent propositions, but with distinct certainty levels.

Remark 1. Let us mention that a similar construction can be made in an additive setting where each formula is associated with a cost (in N ∪ {+∞}), the weight (cost) attached to an interpretation being the sum of the costs of the formulas in the base violated by the interpretation, as in penalty logic [Dupin de Saint Cyr et al., 1994; Pinkas, 1991]. The so-called “cost of consistency” of a formula is then defined as the minimum of the weights of its models (which is nothing but a ranking function in the sense of Spohn [1988], or the counterpart of a possibility measure defined on N ∪ {+∞}, where now 0 expresses full possibility and +∞ complete impossibility, since it is a cost that cannot be paid).

So a ΠL knowledge base is understood as a set of constraints N(ai) ≥ αi for i = 1, . . . , m, and the set of possibility distributions π compatible with this set of constraints has a largest element, which is nothing but πΓ, i.e. we have ∀ω, π(ω) ≤ min_{i=1,...,m} π(ai,αi)(ω) = πΓ(ω). Thus, the possibility distribution πΓ representing semantically a ΠL base Γ is the one assigning the largest possibility degree to each interpretation, in agreement with the semantic constraints N(ai) ≥ αi for i = 1, . . . , m associated with the formulas (ai, αi) in Γ.
Thus, any possibility distribution π ≤ πΓ semantically agrees with Γ, which can be written π ⊨ Γ. The semantic entailment is defined by Γ ⊨ (a, α) if and only if ∀ω, πΓ(ω) ≤ π(a,α)(ω).



We also have Γ ⊨ (a, α) if and only if Nπ(a) ≥ α for all π ≤ πΓ, where Nπ is the necessity measure associated with π. It can be shown [Dubois et al., 1994c] that possibilistic logic is sound and complete w.r.t. this semantics, namely

Γ ⊢ (a, α) if and only if Γ ⊨ (a, α).
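This soundness-and-completeness statement can be spot-checked by brute force on a tiny propositional language. In the Python sketch below (all names and the example base are ours), the syntactic side computes max{β | (Γβ)∗ ⊢ a} by classical entailment over level cuts, and the semantic side computes max{α | ∀ω, πΓ(ω) ≤ π(a,α)(ω)}, i.e. 1 minus the largest πΓ value over counter-models of a; the two degrees coincide:

```python
from itertools import product

atoms = ["p", "q"]
worlds = [dict(zip(atoms, vals)) for vals in product([True, False], repeat=len(atoms))]

# The base {(p, 0.8), (¬p ∨ q, 0.5)}; formulas as Boolean functions of a world.
base = [
    (lambda w: w["p"], 0.8),
    (lambda w: (not w["p"]) or w["q"], 0.5),
]

def entails(formulas, goal):
    # Classical entailment by enumeration of interpretations.
    return all(goal(w) for w in worlds if all(f(w) for f in formulas))

def syntactic_degree(gamma, goal):
    # max{beta | (Gamma_beta)* |- goal} (0 if no level cut entails the goal).
    betas = [b for (_, b) in gamma]
    good = [b for b in betas if entails([f for (f, bi) in gamma if bi >= b], goal)]
    return max(good, default=0.0)

def pi_gamma(gamma, w):
    # pi_Gamma(omega) = min_i max([a_i](omega), 1 - alpha_i), rounded against float noise.
    return min(max(1.0 if f(w) else 0.0, round(1.0 - a, 10)) for (f, a) in gamma)

def semantic_degree(gamma, goal):
    # max{alpha | pi_Gamma <= pi_(goal, alpha)} = 1 - max pi_Gamma over counter-models.
    counter = [pi_gamma(gamma, w) for w in worlds if not goal(w)]
    return 1.0 - max(counter, default=0.0)

goal = lambda w: w["q"]
assert syntactic_degree(base, goal) == 0.5
assert semantic_degree(base, goal) == 0.5
```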

Moreover, we have inc-l(Γ) = 1 − max_{ω∈Ω} πΓ(ω). This acknowledges the fact that the normalization of πΓ is equivalent to the classical consistency of Γ∗. Thus, an important feature of possibilistic logic is its ability to deal with inconsistency. The consistency of Γ is estimated by the extent to which there is at least one completely possible interpretation for Γ, i.e. by the quantity cons-l(Γ) = 1 − inc-l(Γ) = max_{ω∈Ω} πΓ(ω) = max_{π⊨Γ} max_{ω∈Ω} π(ω) (where π ⊨ Γ iff π ≤ πΓ).

EXAMPLE 1. Let us illustrate the previously introduced notions on the following ΠL base Γ, which is in clausal form (p, q, r are atoms):

{(¬p ∨ q, 0.8), (¬p ∨ r, 0.9), (¬p ∨ ¬r, 0.1), (p, 0.3), (q, 0.7), (¬q, 0.2), (r, 0.8)}.

First, it can be checked that inc-l(Γ) = 0.2. Thus, the sub-base Γ0.3 = {(¬p ∨ q, 0.8), (¬p ∨ r, 0.9), (p, 0.3), (q, 0.7), (r, 0.8)} is safe from inconsistency, and its deductive closure is consistent, i.e. ∄a, ∄α > 0, ∄β > 0 such that Γ0.3 ⊢ (a, α) and Γ0.3 ⊢ (¬a, β). By contrast, Γ0.1 ⊢ (¬r, 0.1) and Γ0.1 ⊢ (r, 0.8). Note also that, while (¬p ∨ r, 0.9), (p, 0.3) ⊢ (r, 0.3), we clearly have Γ ⊢ (r, 0.8) as well. This illustrates the fact that in possibilistic logic we are interested in practice in the proofs having the strongest weakest link, and thus leading to the highest certainty levels. Besides, in case Γ contained (r, 0.2) rather than (r, 0.8), then (r, 0.2) would be of no use, since it is subsumed by the derivable (r, 0.3). Indeed, it can be checked that Γ \ {(r, 0.8)} and (Γ \ {(r, 0.8)}) ∪ {(r, 0.2)} are associated with the same possibility distribution.

The possibility distribution associated with Γ, whose computation is detailed in Table 1, is given by: πΓ(pqr) = 0.8; πΓ(¬pqr) = 0.7; πΓ(¬p¬qr) = 0.3; πΓ(p¬qr) = πΓ(¬pq¬r) = πΓ(¬p¬q¬r) = 0.2; πΓ(pq¬r) = πΓ(p¬q¬r) = 0.1.

ω        π(¬p∨q,.8)  π(¬p∨r,.9)  π(¬p∨¬r,.1)  π(p,.3)  π(q,.7)  π(¬q,.2)  π(r,.8)   πΓ
pqr         1           1           0.9         1        1        0.8       1       0.8
pq¬r        1           0.1         1           1        1        0.8       0.2     0.1
p¬qr        0.2         1           0.9         1        0.3      1         1       0.2
p¬q¬r       0.2         0.1         1           1        0.3      1         0.2     0.1
¬pqr        1           1           1           0.7      1        0.8       1       0.7
¬pq¬r       1           1           1           0.7      1        0.8       0.2     0.2
¬p¬qr       1           1           1           0.7      0.3      1         1       0.3
¬p¬q¬r      1           1           1           0.7      0.3      1         0.2     0.2

Table 1. Detailed computation of the possibility distribution in the example
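Table 1 can be reproduced mechanically. The following Python sketch (representation choices are ours) enumerates the eight interpretations of Example 1, recomputes πΓ, and checks the announced values of cons-l(Γ) and inc-l(Γ):

```python
from itertools import product

worlds = [dict(zip("pqr", vals)) for vals in product([True, False], repeat=3)]

# The base of Example 1: each pair is (formula as a function of a world, level).
base = [
    (lambda w: (not w["p"]) or w["q"], 0.8),         # (¬p ∨ q, 0.8)
    (lambda w: (not w["p"]) or w["r"], 0.9),         # (¬p ∨ r, 0.9)
    (lambda w: (not w["p"]) or (not w["r"]), 0.1),   # (¬p ∨ ¬r, 0.1)
    (lambda w: w["p"], 0.3),                         # (p, 0.3)
    (lambda w: w["q"], 0.7),                         # (q, 0.7)
    (lambda w: not w["q"], 0.2),                     # (¬q, 0.2)
    (lambda w: w["r"], 0.8),                         # (r, 0.8)
]

def pi_gamma(w):
    # pi_Gamma(omega) = min_i max([a_i](omega), 1 - alpha_i);
    # round() guards against binary floating-point noise in 1 - alpha.
    return min(max(1.0 if f(w) else 0.0, round(1.0 - a, 10)) for (f, a) in base)

def val(p, q, r):
    return pi_gamma({"p": p, "q": q, "r": r})

assert val(True, True, True) == 0.8                         # pi(pqr)
assert val(False, True, True) == 0.7                        # pi(¬pqr)
assert val(False, False, True) == 0.3                       # pi(¬p¬qr)
assert val(True, False, True) == val(False, True, False) == val(False, False, False) == 0.2
assert val(True, True, False) == val(True, False, False) == 0.1

cons = max(pi_gamma(w) for w in worlds)
assert cons == 0.8                       # cons-l(Gamma)
assert round(1.0 - cons, 10) == 0.2      # inc-l(Gamma)
```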



Thus cons-l(Γ) = max_{ω∈Ω} πΓ(ω) = 0.8 and inc-l(Γ) = 1 − 0.8 = 0.2. Moreover inc-l(Γ \ {(¬q, 0.2)}) = 0.1, and inc-l(Γ \ {(¬q, 0.2), (¬p ∨ ¬r, 0.1)}) = 0. □

Remark 2. Using the weakest link resolution rule repeatedly leads to a refutation-based proof procedure that is sound and complete w.r.t. the semantics for propositional possibilistic logic [Dubois et al., 1994c]. It exploits an adaptation of an A∗ search algorithm in order to reach the empty clause with the greatest possible certainty level [Dubois et al., 1987]. Algorithms and complexity evaluations (similar to those of classical logic) can be found in [Lang, 2001].

Remark 3. It is also worth pointing out that a similar approach with lower bounds on probabilities would not ensure completeness [Dubois et al., 1994a]. Indeed the repeated use of the probabilistic counterpart of the above resolution rule, namely (¬a ∨ b, α), (a ∨ c, β) |= (b ∨ c, max(0, α + β − 1)) (where (d, α) here means Probability(d) ≥ α), is not always enough for computing the best probability lower bounds on a formula, given a set of probabilistic constraints of the above form. This is due to the fact that a set of formulas all having a probability at least equal to α is not deductively closed in general (except if α = 1).

Remark 4. Moreover, a formula such as (¬a ∨ b, α) can be rewritten in the semantically equivalent form (b, min(t(a), α)), where t(a) = 1 if a is true and t(a) = 0 if a is false. This latter formula now reads “b is α-certain, provided that a is true” and can be used in hypothetical reasoning in case (a, γ) is not deducible from the available information (for some γ > 0) [Benferhat et al., 1994a; Dubois and Prade, 1996].

Formulas associated with lower bounds of possibility. A piece of information of the form (a, α) (meaning N(a) ≥ α) is also semantically equivalent to Π(¬a) ≤ 1 − α, i.e. in basic possibilistic logic we are dealing with upper bounds of possibility measures.
Formulas associated with lower bounds of possibility measures (rather than necessity measures) have also been introduced [Dubois and Prade, 1990b; Lang et al., 1991; Dubois et al., 1994c]. A possibility-necessity resolution rule then governs the inference:

N(a ∨ b) ≥ α, Π(¬a ∨ c) ≥ β ⊢ Π(b ∨ c) ≥ α ∗ β

with α ∗ β = β if β > 1 − α and α ∗ β = 0 if 1 − α ≥ β.²

² Noticing that Π(¬a ∧ ¬b) ≤ 1 − α, and observing that ¬a ∨ c ≡ (¬a ∧ ¬b) ∨ [(¬a ∧ b) ∨ c], we have β ≤ Π(¬a ∨ c) ≤ max(1 − α, Π(b ∨ c)). Besides, β ≤ max(1 − α, x) ⇔ x ≥ α ∗ β, where ∗ is a (non-commutative) conjunction operator associated by residuation to the multiple-valued implication α → x = max(1 − α, x), i.e. α ∗ β = inf{x ∈ [0, 1] | β ≤ α → x}. Hence the result.



This rule and the weakest link resolution rule of basic possibilistic logic are graded counterparts of two inference rules well known in modal logic [Fariñas del Cerro, 1985]. The possibility-necessity resolution rule can be used for handling partial ignorance, where fully ignoring a amounts to writing Π(a) = 1 = Π(¬a). This expresses “alleged ignorance” and corresponds more generally to the situation where Π(a) ≥ α > 0 and Π(¬a) ≥ β > 0. This states that both a and ¬a are somewhat possible, and contrasts with the type of uncertainty encoded by (a, α), which expresses that ¬a is rather impossible. Alleged ignorance can be transmitted through equivalences. Namely, from Π(a) ≥ α > 0 and Π(¬a) ≥ β > 0, one can deduce Π(b) ≥ α > 0 and Π(¬b) ≥ β > 0, provided that we have (¬a ∨ b, 1) and (a ∨ ¬b, 1) [Dubois and Prade, 1990b; Prade, 2006].
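The operator ∗ used in the possibility-necessity resolution rule can be recovered numerically from its residuation definition in footnote 2, α∗β = inf{x ∈ [0, 1] | β ≤ max(1 − α, x)}. A Python sketch (the grid discretization is our own device for approximating the infimum):

```python
def star(alpha, beta, steps=1000):
    # alpha * beta = inf{x in [0, 1] | beta <= max(1 - alpha, x)},
    # approximated on a grid of steps + 1 points in [0, 1].
    xs = [i / steps for i in range(steps + 1)]
    return min(x for x in xs if beta <= max(1 - alpha, x))

# If beta > 1 - alpha the conclusion keeps the level beta; otherwise it is trivial:
assert star(0.8, 0.6) == 0.6    # 0.6 > 1 - 0.8, so alpha * beta = beta
assert star(0.2, 0.6) == 0.0    # 0.6 <= 1 - 0.2 = 0.8, so alpha * beta = 0
```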


Reasoning under inconsistency

As already emphasized, an important feature of possibilistic logic is its ability to deal with inconsistency [Dubois and Prade, 2011b]. Indeed, all formulas whose level is strictly greater than inc-l(Γ) are safe from inconsistency in a possibilistic logic base Γ. But any formula in Γ whose level is less than or equal to inc-l(Γ) is ignored in the standard possibilistic inference process; these formulas are said to be "drowned". However, other inferences have been defined and studied that salvage formulas which are below the level of inconsistency but are not involved in any inconsistent subset of formulas; see [Benferhat et al., 1999a] for a survey. One may indeed take advantage of the weights for handling inconsistency in inferences, while avoiding the drowning effect (at least partially). The main approaches are now reviewed.

Degree of paraconsistency and safely supported consequences

An extension of the possibilistic inference has been proposed for handling paraconsistent information [Dubois et al., 1994b; Benferhat et al., 1999a]. It is defined as follows. First, for each formula a such that (a, α) is in Γ, we extend the language and compute triples (a, β, γ), where β (resp. γ) is the highest degree with which a (resp. ¬a) is supported in Γ. More precisely, a is said to be supported in Γ at least at degree β if there is a consistent subbase of (Γβ)∗ that entails a, where (Γβ)∗ = {ai | (ai, αi) ∈ Γ and αi ≥ β}. Let Γo denote the set of bi-weighted formulas thus obtained.

EXAMPLE 2. Take Γ = {(p, 0.8), (¬p ∨ q, 0.6), (¬p, 0.5), (¬r, 0.3), (r, 0.2), (¬r ∨ q, 0.1)}.



Then, Γo = {(p, 0.8, 0.5), (¬p, 0.5, 0.8), (¬r, 0.3, 0.2), (r, 0.2, 0.3), (¬p∨q, 0.6, 0), (¬r∨q, 0.6, 0)}. Indeed, consider, e.g., (¬r ∨ q, 0.6, 0): (p, 0.8) and (¬p ∨ q, 0.6) entail (q, 0.6) (modus ponens), which implies (¬r ∨ q, 0.6) (logical weakening), using only formulas above the level of inconsistency 0.5; and there is no way to derive ¬q, hence no way to derive r ∧ ¬q, from any consistent subset of Γ∗, so γ = 0 for ¬r ∨ q.

A formula (a, β, γ) is said to have a paraconsistency degree equal to min(β, γ). Then the following generalized resolution rule is valid [Dubois et al., 1994b]:

(¬a ∨ b, β, γ), (a ∨ c, β′, γ′) ⊢ (b ∨ c, min(β, β′), max(γ, γ′)).

In particular, the formulas of interest are such that β ≥ γ, i.e. the formula is at least as certain as it is paraconsistent. Clearly the set of formulas of the form (a, β, 0) in Γo has an inconsistency level equal to 0, and thus leads to safe conclusions. However, one may obtain a larger set of consistent conclusions from Γo, as explained now. Defining an inference relation from Γo requires two evaluations:

- the undefeasibility degree of a consistent set A of formulas: UD(A) = min{β | (a, β, γ) ∈ Γo and a ∈ A};

- the unsafeness degree of a consistent set A of formulas: US(A) = max{γ | (a, β, γ) ∈ Γo and a ∈ A}.

We say that A is a reason (or an argument) for b if A is a minimal (for set inclusion) consistent subset of Γ that implies b, i.e.,

• A ⊆ Γ
• A∗ ⊬PL ⊥
• A∗ ⊢PL b
• ∀B ⊂ A, B∗ ⊬PL b

Let label(b) = {(A, UD(A), US(A)) | A is a reason for b}, and label(b)∗ = {A | (A, UD(A), US(A)) ∈ label(b)}. Then (b, UD(A′), US(A′)) is said to be a DS-consequence of Γo (or Γ), denoted by Γo ⊢DS (b, UD(A′), US(A′)), if and only if UD(A′) > US(A′), where A′ maximizes UD(A) over label(b)∗ and, in case of several such A′, minimizes US(A′). It can be shown that ⊢DS extends the entailment in possibilistic logic [Benferhat et al., 1999a].
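The generalized resolution rule on bi-weighted formulas can be sketched as follows (an illustrative encoding of ours, with clauses as frozensets of integer literals); applied to the triples of Example 2 it reproduces the evaluation (q, 0.6, 0.5) that the DS-consequence relation will return.

```python
def bi_resolve(t1, t2):
    """Generalized resolution on bi-weighted clauses:
    (¬a∨b, β, γ), (a∨c, β', γ') ⊢ (b∨c, min(β, β'), max(γ, γ')).
    A clause is a frozenset of integer literals (-n negates atom n)."""
    (c1, b1, g1), (c2, b2, g2) = t1, t2
    for lit in c1:
        if -lit in c2:
            resolvent = frozenset((c1 - {lit}) | (c2 - {-lit}))
            return resolvent, min(b1, b2), max(g1, g2)
    return None  # the two clauses do not clash

# From Example 2: (p, 0.8, 0.5) and (¬p∨q, 0.6, 0), with atoms p=1, q=2
p = (frozenset({1}), 0.8, 0.5)
np_q = (frozenset({-1, 2}), 0.6, 0.0)
print(bi_resolve(p, np_q))  # (frozenset({2}), 0.6, 0.5): q certain at 0.6, paraconsistent at 0.5
```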



EXAMPLE 3. (Example 2 continued) In the above example, label(q) = {(A, 0.6, 0.5), (B, 0.2, 0.3)} with A = {(p, 0.8, 0.5), (¬p ∨ q, 0.6, 0)} and B = {(r, 0.2, 0.3), (¬r ∨ q, 0.6, 0)}. Then, Γo ⊢DS (q, 0.6, 0.5).

But if we rather first minimize US(A′) and then maximize UD(A′), the entailment would not extend the possibilistic entailment. Indeed, in the above example we would select (B, 0.2, 0.3), but 0.2 > 0.3 does not hold, while Γ ⊢ (q, 0.6) since 0.6 > inc-l(Γ) = 0.5. Note also that ⊢DS is more productive than the possibilistic entailment, as seen on the example: e.g., Γo ⊢DS (¬r, 0.3, 0.2), while Γ ⊢ (¬r, 0.3) does not hold since 0.3 < inc-l(Γ) = 0.5. Another entailment, denoted by ⊢SS and named the safely supported consequence relation, less demanding than ⊢DS, is defined by Γo ⊢SS b if and only if there exists A ∈ label(b)∗ such that UD(A) > US(A). It can be shown that the set {b | Γo ⊢SS b} is classically consistent [Benferhat et al., 1999a]. Another proposal that sounds natural, investigated in [Benferhat et al., 1993a], is the idea of argued consequence ⊢A, where Γ ⊢A b if there exists a reason A for b stronger than any reason B for ¬b, in the sense that the possibilistic inference from A yields b with a level strictly greater than the level obtained for ¬b from any reason B. ⊢A is more productive than ⊢SS. Unfortunately, ⊢A may lead to classically inconsistent sets of conclusions. Several other inference relations have been defined, in particular using a selection of maximal consistent sub-bases based on the certainty or priority levels. However, these approaches are more adventurous than ⊢SS and may lead to debatable conclusions. See [Benferhat et al., 1999a] for details.

From quasi-classical logic to quasi-possibilistic logic

Besnard and Hunter [1995] [Hunter, 2000] have defined a paraconsistent logic called quasi-classical logic. This logic has several nice features: in particular, the connectives behave classically, and when the knowledge base is classically consistent, quasi-classical logic gives almost the same conclusions as classical logic (with the exception of tautologies or formulas containing tautologies).
Moreover, inference in quasi-classical logic has a low computational complexity. The basic idea behind this logic is to use all the rules of classical proof theory, but to forbid the use of resolution after the introduction of a disjunction (this blocks ex falso sequitur quodlibet). So the rules of quasi-classical logic are split into two classes, composition and decomposition rules, and proofs cannot use decomposition rules once a composition rule has been used. Intuitively speaking,



this means that we may have resolution-based proofs both for a and ¬a. We also derive the disjunctions built from such previous consequences (e.g. ¬a ∨ b) as additional valid consequences. But it is forbidden to reuse such additional consequences for building further proofs [Hunter, 2000]. Although possibilistic logic takes advantage of its levels for handling inconsistency, there are situations where it offers no useful answers while quasi-classical logic does, namely when formulas involved in inconsistency have the same level, especially the highest one, 1. Thus, consider the example Γ = {(p, 1), (¬p ∨ q, 1), (¬p, 1)}: quasi-classical logic infers p, ¬p, q from Γ∗, while everything is drowned in possibilistic logic, and nothing can be derived by the safely supported consequence relation. This has led to a preliminary proposal of a quasi-possibilistic logic [Dubois et al., 2003]. It would also deserve a careful comparison with the use of the generalized resolution rule, mentioned above, applied to the set Γo of bi-weighted formulas. Note that in the above example, this rule yields (p, 1, 1), (¬p, 1, 1) and (q, 1, 1), as expected. Such concerns should also be related to Belnap's four-valued logic, which leaves open the possibility that a formula and its negation be supported by distinct sources [Belnap, 1977; Dubois, 2012].


Possibilistic logic with symbolic weights

In basic possibilistic logic, the certainty levels associated with formulas are assumed to belong to a totally ordered scale. In some cases, their values and their relative ordering may be unknown, and it may then be of interest to process them in a purely symbolic manner, i.e., to compute the level of a derived formula as a symbolic expression. For instance, we have {(p, α), (¬p ∨ q, β), (q, γ)} ⊢ (q, max(min(α, β), γ)). This induces a partial order between formulas based on the partial order between symbolic weights (e.g., max(min(α, β), α, γ) ≥ min(α, δ) for any values of α, β, γ, δ). Possibilistic logic formulas with symbolic weights have been used in preference modeling [Dubois et al., 2006; Dubois et al., 2013b; Dubois et al., 2013c]. Then, interpretations (corresponding to the different alternatives) are compared in terms of vectors acknowledging the satisfaction or the violation of the formulas associated with the different (conditional) preferences, using suitable order relations. Thus, partial orderings of interpretations can be obtained, and may be refined in case some additional information on the relative priority of the preferences is given.

Possibilistic formulas with symbolic weights can be reinterpreted as two-sorted classical logic formulas. Thus, the formula (a, α) can be re-encoded by the formula a ∨ A. Such a formula can be intuitively thought of as expressing



that a should be true if the situation is not abnormal (a ∨ A ≡ ¬A → a). Then it can be seen that {a ∨ A, ¬a ∨ b ∨ B} ⊢ b ∨ A ∨ B is the counterpart of {(a, α), (¬a ∨ b, β)} ⊢ (b, min(α, β)) in possibilistic logic, just as {a ∨ A, a ∨ A′} ⊢ a ∨ (A ∧ A′) is the counterpart of {(a, α), (a, α′)} ⊢ (a, max(α, α′)). Partial information about the ordering between levels associated with possibilistic formulas can also be represented by classical logic formulas pertaining to symbolic levels. Thus, the constraint α ≥ β translates into the formula ¬A ∨ B (see footnote 3). This agrees with the ideas that "the more abnormal 'a false' is, the more certain a", and that "if it is very abnormal, it is abnormal". It can be checked that {a ∨ A, ¬a ∨ b ∨ B, ¬A ∨ B} ⊢ a ∨ B and b ∨ B, i.e., we do obtain the counterpart of (b, β), with β = min(α, β). The possibilistic logic inference machinery can be recast in this symbolic setting, and efficient computation procedures can be developed by taking advantage of the compilation of the base in a DNNF format [Benferhat and Prade, 2005], including the special case where the levels are totally ordered [Benferhat and Prade, 2006]. In this latter case, it provides a way for compiling a possibilistic logic base and then processing inferences from it in polynomial time. One motivation for dealing with a partial order on formulas is that possibilistic logic formulas coming from different sources may not always be stratified according to a complete preorder. Apart from the above one, several other extensions of possibilistic logic have been proposed where the total order on formulas is replaced by a partial preorder [Benferhat et al., 2004b; Cayrol et al., 2014]. The primary focus is usually on semantic aspects, namely the construction of a partial order on interpretations from a partial order on formulas and conversely.
The difficulty lies in the fact that equivalent definitions in the totally ordered case are no longer equivalent in the partially ordered one, and that a partial ordering on the subsets of a set cannot in general be expressed by means of a single partial order on the set of elements.
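When levels are handled purely symbolically, dominance between derived weight expressions such as max(min(α, β), α, γ) ≥ min(α, δ) can be decided combinatorially. A sketch, assuming (our assumption, not a construction from the text) that weights are independent symbols ranging over [0, 1] and that every level is written as a max of min-terms:

```python
def dominates(e1, e2):
    """Decide whether the max-of-mins expression e1 >= e2 for every
    valuation of the symbolic weights in [0, 1].  An expression is a set
    of min-terms, each a frozenset of weight symbols.  e1 >= e2 holds
    universally iff every min-term of e2 contains some min-term of e1."""
    return all(any(t1 <= t2 for t1 in e1) for t2 in e2)

# max(min(α,β), α, γ) >= min(α,δ) for all values of the weights:
e1 = {frozenset({"a", "b"}), frozenset({"a"}), frozenset({"g"})}
e2 = {frozenset({"a", "d"})}
print(dominates(e1, e2))  # True: the min-term {a} is included in {a, d}
print(dominates(e2, e1))  # False: min(α,δ) may fall below γ
```

The subset criterion is sound because a smaller min-term always takes a value at least as high, and complete because a failing min-term of e2 can be set to 1 while all other symbols are set to 0.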


Possibilistic networks

(Basic) possibilistic logic bases provide a compact representation of possibility distributions involving a finite number of possibility levels. Another compact representation of such qualitative possibility distributions is in terms of possibilistic directed graphs, which use the same conventions as Bayesian nets, but rely on conditional possibility [Benferhat et al., 2002a]. An interesting feature of possibilistic logic is then that a possibilistic logic base has graphical representation counterparts to which the base is semantically equivalent. We start with a brief reminder on the notions of conditioning and independence in possibility theory.

Footnote 3: Note that one cannot express strict inequalities (α > β) in this way (except on a finite scale).

Conditioning

Notions of conditioning exist in possibility theory. Conditional possibility can be defined similarly to probability theory, using a Bayesian-like equation of the form [Dubois and Prade, 1990a]

Π(B ∩ A) = Π(B | A) ⋆ Π(A)

where Π(A) > 0 and ⋆ may be the minimum or the product; moreover N(B | A) = 1 − Π(Bᶜ | A). A similar equation makes little sense for necessity measures, as it becomes trivial when N(A) = 0, that is, under lack of certainty; in the above definition, the equation becomes problematic only if Π(A) = 0, which is natural since A is then considered impossible (see [Coletti and Vantaggi, 2009] for the handling of this situation).

If the operation ⋆ is the minimum, the equation Π(B ∩ A) = min(Π(B | A), Π(A)) fails to characterize Π(B | A), and we must resort to the minimal specificity principle to come up with a qualitative conditioning of possibility [Dubois and Prade, 1988]:

Π(B | A) = 1 if Π(B ∩ A) = Π(A) > 0, and Π(B | A) = Π(B ∩ A) otherwise.

It is clear that N(B | A) > 0 if and only if Π(B ∩ A) > Π(Bᶜ ∩ A). Note also that N(B | A) = N(Aᶜ ∪ B) if N(B | A) > 0. Moreover, if Π(B | A) > Π(B) then Π(B | A) = 1, which points out the limited expressiveness of this qualitative notion (no gradual positive reinforcement of possibility). However, it is possible to have N(B) > 0, N(Bᶜ | A1) > 0, N(B | A1 ∩ A2) > 0 (i.e., oscillating beliefs). In the numerical setting, we must choose ⋆ = product, which preserves continuity, so that Π(B | A) = Π(B ∩ A)/Π(A), which makes possibilistic and probabilistic conditionings very similar [De Baets et al., 1999] (now, gradual positive reinforcement of possibility is allowed).

Independence

There are also several variants of possibilistic independence between events.
Let us mention here the two basic approaches:

• Unrelatedness: Π(A ∩ B) = min(Π(A), Π(B)). When it does not hold, it indicates an epistemic form of mutual exclusion between A and B. It is symmetric but sensitive to negation. When it holds for all pairs made of A, B and their complements, it is an epistemic version of logical independence.

• Causal independence: Π(B | A) = Π(B). This notion is different from the former one and stronger. It is a form of directed epistemic independence whereby learning A does not affect the plausibility of B. It is neither symmetric nor insensitive to negation: in particular, it is not equivalent to N(B | A) = N(B).

Generally, independence in possibility theory is neither symmetric nor insensitive to negation. For Boolean variables, independence between events is not equivalent to independence between variables. But since the possibility scale can be qualitative or quantitative, and there are several forms of conditioning, there are also various possible forms of independence. For studies of these different notions and their properties see [De Cooman, 1997; De Campos and Huete, 1999; Dubois et al., 1997; Dubois et al., 1999a; Ben Amor et al., 2002].

Graphical structures

Like joint probability distributions, joint possibility distributions can be decomposed into a conjunction of conditional possibility distributions (using ⋆ = minimum, or product), once an ordering of the variables is chosen, in a way similar to Bayes nets [Benferhat et al., 2002a]. A joint possibility distribution associated with ordered variables X1, . . . , Xn can be decomposed by the chain rule

π(X1, . . . , Xn) = π(Xn | X1, . . . , Xn−1) ⋆ · · · ⋆ π(X2 | X1) ⋆ π(X1).

Such a decomposition can be simplified by assuming conditional independence relations between variables, as reflected by the structure of the graph. The form of independence between variables at work here is conditional noninteractivity: two variables X and Y are independent in the context Z if, for each instance (x, y, z) of (X, Y, Z), we have π(x, y | z) = π(x | z) ⋆ π(y | z). Possibilistic networks are thus defined as counterparts of Bayesian networks [Pearl, 1988] in the context of possibility theory.
They share the same basic components, namely: (i) a graphical component, a DAG (Directed Acyclic Graph) G = (V, E), where V is a set of nodes representing variables and E a set of edges encoding conditional (in)dependencies between them; (ii) a valued component associating a local normalized conditional possibility distribution with each variable Vi ∈ V in the context of its parents. The two definitions of possibilistic conditioning lead to two variants of possibilistic



networks: in the numerical context, we get product-based networks, while in the ordinal context, we get min-based networks (also known as qualitative possibilistic networks). Given a possibilistic network, we can compute its joint possibility distribution using the above chain rule. Counterparts of product-based numerical possibilistic nets using ranking functions exist as well [Spohn, 2012]. Ben Amor and Benferhat [2005] have investigated the properties of qualitative independence that enable local inferences to be performed in possibilistic nets. Uncertainty propagation algorithms suitable for possibilistic graphical structures have been studied [Ben Amor et al., 2003; Benferhat et al., 2005], taking advantage of the idempotency of the min operator in the qualitative case [Ben Amor et al., 2003]. Such graphical structures may also be of particular interest for representing preferences [Ben Amor et al., 2014].

Possibilistic nets and possibilistic logic

Since possibilistic nets and possibilistic logic bases are both compact representations of possibility distributions, it should not come as a surprise that possibilistic nets can be directly translated into possibilistic logic bases and vice versa, both when conditioning is based on minimum and when it is based on product [Benferhat et al., 2002a; Benferhat et al., 2001b]. Hybrid representation formats have been introduced where local possibilistic logic bases, rather than conditional possibility tables, are associated with the nodes of a graphical structure [Benferhat and Smaoui, 2007a]. An important feature of the possibilistic logic setting is the existence of such equivalent representation formats: sets of prioritized logical formulas, preorders on interpretations (possibility distributions) at the semantic level, possibilistic nets, but also sets of conditionals of the form Π(p ∧ q) > Π(p ∧ ¬q); and there are algorithms for translating one format into another [Benferhat et al., 2001a].
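A minimal sketch of qualitative (min-based) conditioning and of the min-based chain rule over a two-variable toy distribution (the distribution values below are illustrative, not taken from the text):

```python
def poss(pi, event):
    """Possibility measure: Π(A) = max of π over the worlds in A."""
    return max((pi[w] for w in event), default=0.0)

def cond_min(pi, b, a):
    """Qualitative (min-based) conditioning from the minimal specificity
    principle: Π(B|A) = 1 if Π(B∩A) = Π(A) > 0, and Π(B∩A) otherwise."""
    p_a, p_ab = poss(pi, a), poss(pi, b & a)
    return 1.0 if 0.0 < p_ab == p_a else p_ab

# Toy distribution over the worlds of two Boolean variables (x, y):
pi = {(0, 0): 1.0, (0, 1): 0.7, (1, 0): 0.4, (1, 1): 0.4}
worlds = set(pi)
x1 = {w for w in worlds if w[0] == 1}   # event "x is true"
y1 = {w for w in worlds if w[1] == 1}   # event "y is true"
print(cond_min(pi, y1, x1))             # 1.0: Π(y∧x) = Π(x) = 0.4

# Min-based chain rule: min(π(y|x), π(x)) recovers the joint distribution
for w in sorted(worlds):
    x_event = {v for v in worlds if v[0] == w[0]}
    assert min(cond_min(pi, {w}, x_event), poss(pi, x_event)) == pi[w]
print("chain rule recovers the joint distribution")
```

Note how the most possible y-value in each x-context gets conditional possibility 1, the signature of the minimal specificity principle.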



Basic possibilistic logic has found many applications in different reasoning tasks, beyond the simple deductive propagation of certainty or priority levels. In the following, we survey its use in default reasoning (and its utilization in causality ascription), in belief revision and information fusion, in decision under uncertainty, and in the handling of uncertainty in information systems. From a computational perspective, possibilistic logic has also impacted logic



programming [Dubois et al., 1991c; Benferhat et al., 1993b; Alsinet and Godo, 2000; Alsinet et al., 2002; Nicolas et al., 2006; Nieves et al., 2007; Confalonieri et al., 2012; Bauters et al., 2010; Bauters et al., 2011; Bauters et al., 2012]. It has also somewhat influenced the handling of soft constraints in constraint satisfaction problems [Schiex, 1992; Schiex et al., 1995]. Let us also mention applications to diagnosis and recognition problems [Dubois et al., 1990; Benferhat et al., 1997a; Dubois and Prade, 2000; Grabisch and Prade, 2001; Grabisch, 2003], and to the encoding of access control policies [Benferhat et al., 2003].


Default reasoning and causality

Possibilistic logic can be used for describing the normal course of things, and a possibilistic logic base reflects how entrenched the beliefs of an agent are. This is why possibilistic logic is of interest in default reasoning, but also in causality ascription, as surveyed in this subsection.

Default reasoning

Nonmonotonic reasoning has been extensively studied in AI in relation with the problem of reasoning under incomplete information with rules having potential exceptions [Léa Sombé Group, 1990], or for dealing with the frame problem in dynamic worlds [Brewka et al., 2011]. In the following, we recall the possibilistic approach [Benferhat et al., 1998b; Dubois and Prade, 2011c], which turned out [Benferhat et al., 1997b] to provide a faithful representation of the postulate-based approach proposed by Kraus, Lehmann and Magidor [1990], and completed in [Lehmann and Magidor, 1992]. A default rule "if a then b, generally", denoted a ⇝ b, is then understood formally as the constraint Π(a ∧ b) > Π(a ∧ ¬b) on a possibility measure Π describing the semantics of the available knowledge. It expresses that, in the context where a is true, there exist situations where having b true is strictly more satisfactory than any situation where b is false in the same context. As already said, this constraint is equivalent to N(b | a) = 1 − Π(¬b | a) > 0, when Π(b | a) is defined as the greatest solution of the min-based equation Π(a ∧ b) = min(Π(b | a), Π(a)) (footnote 4: this would be true as well with the product-based conditioning). The above constraint can be equivalently expressed in terms of a comparative possibility relation, namely a ∧ b >Π a ∧ ¬b. Any finite consistent set



of constraints of the form ai ∧ bi >Π ai ∧ ¬bi, representing a set of defaults ∆ = {ai ⇝ bi, i = 1, · · · , n}, is compatible with a non-empty family of relations >Π, and induces a partially defined ranking >π on Ω, which can be completed according to the principle of minimal specificity, e.g. [Benferhat et al., 1999b]. This principle assigns to each world ω the highest possibility level (thus forming a well-ordered partition of Ω) without violating the constraints. This defines a unique complete preorder. Let E1, . . . , Em be the obtained partition. Then ω >π ω′ if ω ∈ Ei and ω′ ∈ Ej with i < j, while ω ∼π ω′ if ω ∈ Ei and ω′ ∈ Ei (where ∼π means both ≥π and ≤π). A numerical counterpart to >π can be defined by π(ω) = (m + 1 − i)/m if ω ∈ Ei, for i = 1, . . . , m. Note that the use of a numerical scale is purely a matter of convenience, and any other numerical counterpart such that π(ω) > π(ω′) iff ω >π ω′ will work as well. Namely, the range of π is used as an ordinal scale.

EXAMPLE 4. Let us consider the following classical example with the default rules d1: "birds fly", d2: "penguins do not fly", d3: "penguins are birds", symbolically written d1: b ⇝ f; d2: p ⇝ ¬f; d3: p ⇝ b.


The set of three defaults is thus represented by the following set C of constraints: b ∧ f >Π b ∧ ¬f; p ∧ ¬f >Π p ∧ f; p ∧ b >Π p ∧ ¬b. Let Ω be the finite set of interpretations of the considered propositional language, generated by b, f, p in the example, that is Ω = {ω0: ¬b∧¬f∧¬p, ω1: ¬b∧¬f∧p, ω2: ¬b∧f∧¬p, ω3: ¬b∧f∧p, ω4: b∧¬f∧¬p, ω5: b∧¬f∧p, ω6: b∧f∧¬p, ω7: b∧f∧p}. Any interpretation ω thus corresponds to a particular proposition, and one can compute the possibility of any proposition; for instance, Π(b ∧ f) = max(π(ω6), π(ω7)). The set of constraints C can then be written in terms of interpretations as:

C1: max(π(ω6), π(ω7)) > max(π(ω4), π(ω5)),
C2: max(π(ω5), π(ω1)) > max(π(ω3), π(ω7)),
C3: max(π(ω5), π(ω7)) > max(π(ω1), π(ω3)).

The well-ordered partition of Ω obtained in this example is {ω0, ω2, ω6} >π {ω4, ω5} >π {ω1, ω3, ω7}. Here m = 3 and π(ω0) = π(ω2) = π(ω6) = 1; π(ω4) = π(ω5) = 2/3; π(ω1) = π(ω3) = π(ω7) = 1/3.

From the possibility distribution π associated with the well-ordered partition, we can compute the necessity level N(a) of any proposition a. The



method then consists in turning each default pi ⇝ qi into a possibilistic clause (¬pi ∨ qi, N(¬pi ∨ qi)), where N is computed from the greatest possibility distribution π induced by the set of constraints corresponding to the default knowledge base, as already explained. We thus obtain a possibilistic logic base K, which encodes the generic knowledge embedded in the default rules. Then we apply the possibilistic inference for reasoning with the formulas in K encoding the defaults, together with the available factual knowledge encoded as fully certain possibilistic formulas in a base F. The conclusions that can be obtained from K ∪ F with a certainty level strictly greater than the level of inconsistency of this base are safe. Roughly speaking, it turns out that in this approach, the most specific rules w.r.t. a given context remain above the level of inconsistency.

EXAMPLE 5. (Example 4 continued) Using the possibility distribution obtained at the previous step, we compute:

N(¬p ∨ ¬f) = min{1 − π(ω) | ω |= p ∧ f} = min(1 − π(ω3), 1 − π(ω7)) = 2/3,

N(¬b ∨ f) = min{1 − π(ω) | ω |= b ∧ ¬f} = min(1 − π(ω4), 1 − π(ω5)) = 1/3, and N(¬p ∨ b) = min(1 − π(ω1), 1 − π(ω3)) = 2/3. Thus, we have the possibilistic logic base K = {(¬p ∨ ¬f, 2/3), (¬p ∨ b, 2/3), (¬b ∨ f, 1/3)}. Suppose that all we know about the factual situation under consideration is that "Tweety" is a bird, which is encoded by F = {(b, 1)}. Then we apply the weakest link resolution rule, and we can check that K ∪ {(b, 1)} ⊢ (f, 1/3), i.e., we conclude that if all we know about "Tweety" is that it is a bird, then it flies. If we are told that "Tweety" is in fact a penguin, i.e., F = {(b, 1), (p, 1)}, then K ∪ F ⊢ (⊥, 1/3), which means that K ∪ {(b, 1)} augmented with the new piece of factual information {(p, 1)} is now inconsistent (at level 1/3). But the inference K ∪ F ⊢ (¬f, 2/3) is valid (since 2/3 > inc-l(K ∪ F) = 1/3). Thus, knowing that "Tweety" is a penguin, we now conclude that it does not fly.

This encoding takes advantage of the fact that when a new piece of information is received, the level of inconsistency of the base cannot decrease, and if it strictly increases, some inferences that were safe before are now drowned in the new inconsistency level of the base and thus no longer allowed; hence a nonmonotonic consequence mechanism takes place. Such an approach has been proved to be in full agreement with the Kraus-Lehmann-Magidor postulate-based approach to nonmonotonic reasoning



[Kraus et al., 1990]. More precisely, two nonmonotonic entailments can be defined in the possibilistic setting: the one presented above, based on the least specific possibility distribution compatible with the constraints encoding the set of defaults, and a more cautious one, where b is deduced in the situation where all we know is F = {a} if and only if the inequality Π(a ∧ b) > Π(a ∧ ¬b) holds for all the Π compatible with the constraints encoding the set of defaults. The first entailment coincides with the rational closure inference [Lehmann and Magidor, 1992], while the latter corresponds to the (cautious) preferential entailment [Kraus et al., 1990]; see [Dubois and Prade, 1995a; Benferhat et al., 1997b]. Besides, the ranking of the defaults obtained above from the well-ordered partition [Benferhat et al., 1992] is the same as the Z-ranking introduced by Pearl [1990]. While the consequences obtained with the preferential entailment are hardly debatable, the ones derived with the rational closure are more adventurous. However, these latter consequences can always be modified, if necessary, by the addition of further defaults. These added defaults may express independence information [Dubois et al., 1999a] of the type "in context c, the truth or the falsity of a has no influence on the truth of b" [Benferhat et al., 1994b; Benferhat et al., 1998b]. Lastly, a default rule may itself be associated with a certainty level; in such a case each formula will be associated with two levels, namely a priority level reflecting its relative specificity in the base, and its certainty level [Dupin de Saint Cyr and Prade, 2008]. Let us also mention an application to possibilistic inductive logic programming.
Indeed, learning a stratified set of first-order logic rules as a hypothesis in inductive logic programming has been shown to be of interest for learning both rules covering normal cases and more specific rules that handle more exceptional cases [Serrurier and Prade, 2007].

Causality

Our understanding of sequences of reported facts depends on our own beliefs about the normal course of things. We have seen that N(b|a) > 0 can be used for expressing that in context a, b is normally true. Then qualitative necessity measures may be used for describing how (potential) causality is perceived in relation with the advent of an abnormal event that precedes a change. Namely, if on the one hand an agent holds the two following beliefs, represented by N(b|a) > 0 and N(¬b|a ∧ c) > 0, about the normal course of things, and if on the other hand it has been reported that we are in context a, and that b, which was true, has become false after c takes place, then the



agent will be led to think that "c caused ¬b". See [Bonnefon et al., 2008] for a detailed presentation and discussion, and [Bonnefon et al., 2012] for a study of the very restricted conditions under which causality is transitive in this approach. The theoretical consequences of this model have been validated from a cognitive psychology point of view. Perceived causality may be badly affected by spurious correlations. For a proper assessment of causality relations, Pearl [2000] has introduced the notion of intervention in Bayesian networks, which comes down to enforcing the values of some variables so as to lay bare their influence on other ones. Following the same line, possibilistic networks have been studied from the standpoint of causal reasoning, using the concept of intervention; see [Benferhat and Smaoui, 2007b; Benferhat, 2010; Benferhat and Smaoui, 2011], where tools for handling interventions in the possibilistic setting have been developed. Finally, a counterpart of the idea of intervention has been investigated in possibilistic logic knowledge bases, which are non-directed structures (thus contrasting with Bayesian and possibilistic networks) [Benferhat et al., 2009].
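To close this subsection, the construction underlying Examples 4 and 5 (computing the minimal-specificity well-ordered partition from the constraints, then reading off necessity degrees for the base K) can be sketched as follows; the set-based encoding of constraints is an illustrative choice of ours, and the greedy loop is essentially the procedure described above.

```python
from fractions import Fraction

def min_specificity_partition(worlds, constraints):
    """Each constraint (L, R) reads: max of π over L > max of π over R.
    Greedily put in the current (highest) stratum every world that is
    not forced to be strictly less possible by a pending constraint."""
    remaining, pending, strata = set(worlds), list(constraints), []
    while remaining:
        blocked = set().union(*(r for _, r in pending)) if pending else set()
        stratum = remaining - blocked
        if not stratum:
            raise ValueError("inconsistent set of constraints")
        strata.append(stratum)
        remaining -= stratum
        # a constraint is satisfied once its left side reaches this stratum
        pending = [(l, r) for (l, r) in pending if not l & stratum]
    return strata

# Example 4, with worlds ω0..ω7 encoded as the integers 0..7:
constraints = [({6, 7}, {4, 5}),   # C1: Π(b∧f) > Π(b∧¬f)
               ({1, 5}, {3, 7}),   # C2: Π(p∧¬f) > Π(p∧f)
               ({5, 7}, {1, 3})]   # C3: Π(p∧b) > Π(p∧¬b)
strata = min_specificity_partition(range(8), constraints)
print(strata)    # [{0, 2, 6}, {4, 5}, {1, 3, 7}]

m = len(strata)
pi = {w: Fraction(m + 1 - i, m) for i, s in enumerate(strata, 1) for w in s}
# Necessity degree of ¬p∨¬f: its countermodels are the p∧f worlds ω3, ω7
print(min(1 - pi[w] for w in (3, 7)))   # 2/3, as in Example 5
```

The greedy loop is correct because a world placed now is strictly more possible than every world still blocked, so any constraint whose left side reaches the current stratum is satisfied once and for all.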


Belief revision and information fusion

In belief revision, the new input information that fires the revision process has priority over the information contained in the current belief set. This contrasts with information fusion, where sources play symmetric roles (even if they have different reliability levels). We briefly survey the contributions of possibilistic logic to these two problems.

Belief revision and updating

Keeping in mind that nonmonotonic reasoning and belief revision can be closely related [Gärdenfors, 1990], it should not be a surprise that possibilistic logic also finds application in belief revision. In fact, comparative necessity relations (which can be encoded by necessity measures) [Dubois, 1986] are nothing but the epistemic entrenchment relations [Dubois and Prade, 1991] that underlie well-behaved belief revision processes [Gärdenfors, 1988]. This enables the possibilistic logic setting to provide syntactic revision operators that apply to possibilistic knowledge bases, including the case of uncertain inputs [Dubois and Prade, 1997a; Benferhat et al., 2002c; Benferhat et al., 2010; Qi, 2008; Qi and Wang, 2012]. Note that in possibilistic logic, the epistemic entrenchment of the formulas is made explicit through the certainty levels: formulas (a, α) are viewed as pieces of belief that are more or less certain. Moreover, in a revision process it is expected that all formulas independent of the validity



of the input information should be retained in the revised state of belief; this intuitive idea may receive a precise meaning using a suitable definition of possibilistic independence between events [Dubois et al., 1999a]. See [Dubois et al., 1998a] for a comparative overview of belief change operations in the different representation settings (including possibilistic logic). Updating in a dynamic environment obeys principles other than those governing the revision of a belief state by an input in a static world; see, e.g., [Léa Sombé Group (ed.), 1994]. It can be related to the idea of Lewis' imaging [1976], whose possibilistic counterpart has been proposed in [Dubois and Prade, 1993]. A possibilistic logic transposition of Kalman filtering that combines the ideas of updating and revision can be found in [Benferhat et al., 2000b].

Information fusion

Information fusion can take place in the different representation formats of the possibilistic setting. In particular, the combination of possibility distributions can be equivalently performed in terms of possibilistic logic bases. Namely, the syntactic counterpart of the pointwise combination of two possibility distributions π1 and π2 into a distribution π1 ⊕ π2, by any monotonic combination operator ⊕ (see footnote 5) such that 1 ⊕ 1 = 1, can be computed, following an idea first proposed in [Boldrin, 1995; Boldrin and Sossai, 1997]. If the possibilistic logic base Γ1 is associated with π1 and the base Γ2 with π2, a possibilistic base that is semantically equivalent to π1 ⊕ π2 can be obtained in the following way [Benferhat et al., 1998a]:

Γ1⊕2 = {(ai, 1 − (1 − αi) ⊕ 1) s.t. (ai, αi) ∈ Γ1}
     ∪ {(bj, 1 − 1 ⊕ (1 − βj)) s.t. (bj, βj) ∈ Γ2}
     ∪ {(ai ∨ bj, 1 − (1 − αi) ⊕ (1 − βj)) s.t. (ai, αi) ∈ Γ1, (bj, βj) ∈ Γ2}.

For ⊕ = min, we get

Γ1⊕2 = Γ1 ∪ Γ2

with πΓ1 ∪Γ2 = min(π1 , π2 )

as expected (conjunctive combination). For ⊕ = max (disjunctive combination), we get

Γ_{1⊕2} = {(a_i ∨ b_j, min(α_i, β_j)) s.t. (a_i, α_i) ∈ Γ_1 and (b_j, β_j) ∈ Γ_2}.

(The operator ⊕ is supposed to be monotonic in the wide sense in each of its arguments: α ⊕ β ≥ γ ⊕ δ as soon as α ≥ γ and β ≥ δ. Examples of such combination operators ⊕ are triangular norms (non-decreasing semi-group operations on the unit interval, with identity 1 and absorbing element 0) and the dual triangular co-norms, which respectively extend conjunction and disjunction to multiple-valued settings [Klement et al., 2000].)
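As a sanity check on this construction, one can encode small bases in Python, with formulas represented as predicates over interpretations, and verify that the fused base induces exactly the pointwise combination of the two distributions. This is our own sketch (the toy formulas and levels are arbitrary illustrations, not from the chapter):

```python
from itertools import product

def pi(base, w):
    """Possibility degree of world w induced by a base {(a, alpha)}:
    pi(w) = min over (a, alpha) of max([a](w), 1 - alpha)."""
    return min((1.0 if a(w) else 1.0 - alpha) for a, alpha in base)

def disj(a, b):
    """Classical disjunction of two formulas (predicates over worlds)."""
    return lambda w: a(w) or b(w)

def fuse(base1, base2, op):
    """Syntactic counterpart of the pointwise combination pi1 (+) pi2,
    for a monotonic operator op with op(1, 1) = 1 [Benferhat et al., 1998a]."""
    fused = [(a, 1 - op(1 - alpha, 1.0)) for a, alpha in base1]
    fused += [(b, 1 - op(1.0, 1 - beta)) for b, beta in base2]
    fused += [(disj(a, b), 1 - op(1 - alpha, 1 - beta))
              for a, alpha in base1 for b, beta in base2]
    return fused

# Toy bases: Gamma1 = {(p, 0.8)}, Gamma2 = {(q, 0.5)}.
g1 = [(lambda w: w['p'], 0.8)]
g2 = [(lambda w: w['q'], 0.5)]
worlds = [dict(zip('pq', v)) for v in product([False, True], repeat=2)]
for w in worlds:
    # conjunctive (min) and disjunctive (max) fusion, checked semantically
    assert abs(pi(fuse(g1, g2, min), w) - min(pi(g1, w), pi(g2, w))) < 1e-12
    assert abs(pi(fuse(g1, g2, max), w) - max(pi(g1, w), pi(g2, w))) < 1e-12
```

For ⊕ = min, the disjunctive clauses produced by the third set are subsumed by Γ_1 ∪ Γ_2, which is why the result simplifies as stated above.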



With non-idempotent ⊕ operators, some reinforcement effects may be obtained. Moreover, fusion can be applied directly to qualitative or quantitative possibilistic networks [Benferhat and Titouna, 2005; Benferhat and Titouna, 2009]. See [Benferhat et al., 1999c; Benferhat et al., 2001c; Kaci et al., 2000; Qi et al., 2010b; Qi et al., 2010a] for further studies on possibilistic logic merging operators. Besides, this approach has also been applied to the syntactic encoding of the merging of classical logic bases based on Hamming distance (where distances are computed between each interpretation and the different classical logic bases, thus giving rise to counterparts of possibility distributions) [Benferhat et al., 2002b].


Qualitative handling of uncertainty in decision and information systems

Uncertainty often pervades the available information. Possibility theory offers an appropriate setting for the representation of incomplete and uncertain epistemic information in a qualitative manner. In this subsection, we provide a brief presentation of the possibilistic logic approach to decision under uncertainty and to the management of uncertain databases.

Qualitative decision under uncertainty

Possibility theory provides a valuable setting for qualitative decision under uncertainty, where pessimistic and optimistic decision criteria have been axiomatized [Dubois and Prade, 1995b; Dubois et al., 2001a; Benferhat et al., 2000a]. The exact counterparts of these pessimistic and optimistic criteria, when the knowledge and the preferences are respectively expressed under the form of two distinct possibilistic logic bases, have been shown in [Dubois et al., 1999b] to correspond to the following definitions:

• The pessimistic utility u_*(d) of a decision d is the maximal value of α ∈ S such that K_α ∧ d ⊢_PL P_{ν(α)}

• The optimistic utility u^*(d) of a decision d is the maximal value of ν(α) ∈ S such that K_α ∧ d ∧ P_α ⊬ ⊥

where S denotes a finite bounded totally ordered scale, ν is the order-reversing map of this scale, K_α is a set of classical logic formulas gathering the pieces of knowledge that are certain at a level at least equal to α, and



where P_β is a set of classical logic formulas made of goals (modeling preferences) whose priority level is strictly greater than β. As can be seen, an optimal pessimistic decision leads for sure to the satisfaction of all the goals in P_{ν(α)} whose priority is greater than a level as low as possible, according to a part K_α of our knowledge which is as certain as possible. An optimal optimistic decision only maximizes the consistency of all the more or less important goals with all the more or less certain pieces of knowledge. Optimal pessimistic or optimistic decisions can then be computed in an answer set programming setting [Confalonieri and Prade, 2014]. Besides, this possibilistic treatment of qualitative decision can also be related to an argumentative view of decision [Amgoud and Prade, 2009]. See also [Liau, 1999; Liau and Liu, 2001] for other possibility theory-based logical approaches to decision.

Handling uncertainty in information systems

In the possibilistic approach to the handling of uncertainty in databases, the available information on the value of an attribute A for an item x is usually represented by a possibility distribution defined on the domain of attribute A. Then, considering a classical query, we can compute two sets of answers, namely the set of items that more or less certainly satisfy the query (this corresponds to the above pessimistic viewpoint), and the larger set of items that more or less possibly satisfy the query (this corresponds to the above optimistic viewpoint) [Dubois and Prade, 1988]. Computation may become tricky for some basic relational operations, such as the join of two relations, for which it becomes necessary to keep track of the fact that some uncertain values should remain equal in any extension. As in the probabilistic case, methods based on lineage have been proposed to handle such problems [Bosc and Pivert, 2005]. Their computational cost remains heavy in practice.
However, uncertain data can be processed at a much more affordable cost provided that we restrict ourselves to pieces of information of the form (a(x), α) expressing that it is certain at level α that a(x) is the value of attribute A for the item x. More generally, a(x) can be replaced by a disjunction of values. Then, a possibilistic logic-like treatment of uncertainty in databases can take place in a relational database framework. It can be shown that such an uncertainty modeling is a representation system for the whole relational algebra. An important result is that the data complexity associated with the extended operators in this context is the same as in the classical database case [Bosc et al., 2009; Pivert and Prade, 2014]. An additional benefit of the possibilistic setting is



Relation R:

     Name    Married    City
  1  John    (yes, α)   (Toulouse, µ)
  2  Mary    (yes, 1)   (Albi, ρ)
  3  Peter   (no, β)    (Toulouse, φ)

Relation S:

     City        FleaMarket
  1  Albi        (yes, γ)
  2  Toulouse    (yes, δ)
Table 2. A database with possibilistic uncertainty

an easier elicitation of the certainty levels. We illustrate the idea with the following simple example.

EXAMPLE 6. Let us consider a database example with two relations R and S containing uncertain pieces of data; see Table 2. If we look here for the persons who are married and live in a city with a flea market, we shall retrieve John with certainty min(α, µ, δ) and Mary with certainty min(ρ, γ). It is also possible to accommodate disjunctive information in this setting. Assume for instance that the third tuple of relation R is now (Peter, (no, β), (Albi ∨ Toulouse, φ)). Then, if we look for persons who are not married and live in a city with a flea market, one retrieves Peter with certainty min(β, φ, γ, δ). Indeed we have in possibilistic logic that (¬Married, β) and (Albi ∨ Toulouse, φ), (¬Albi ∨ FleaMarket, γ), (¬Toulouse ∨ FleaMarket, δ) entail (¬Married, β) and (FleaMarket, min(φ, γ, δ)).

This suggests the potential of a necessity measure-based approach to the handling of uncertain pieces of information. Clearly, the limited setting of certainty-qualified information is less expressive than the use of general possibility distributions (we cannot here retrieve items that are just somewhat possible without being somewhat certain), but this framework seems to be expressive enough to be useful in practice. Let us also mention a possibilistic modeling of the validity and of the completeness of the information (pertaining to a given topic) in a database [Dubois and Prade, 1997b]. Besides, the possibilistic handling of uncertainty in description logics [Qi et al., 2011; Zhu et al., 2013] also has computational advantages, in particular in the case of the possibilistic DL-Lite family [Benferhat and Bouraoui, 2013; Benferhat et al., 2013]. Lastly, possibilistic logic has recently been shown to be of interest in database design [Koehler et al., 2014a; Koehler et al., 2014b].
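The computation in Example 6 is easy to mechanize. Below is a small Python sketch of ours; the numeric certainty levels stand in for the symbolic ones of Table 2 and are purely illustrative:

```python
# Hypothetical numeric values for the symbolic certainty levels of Table 2.
alpha, mu, rho, beta, phi = 0.8, 0.7, 0.9, 0.6, 0.5
gamma, delta = 0.4, 0.95

# Relation R: (Name, (Married, certainty), (City, certainty))
R = [("John",  ("yes", alpha), ("Toulouse", mu)),
     ("Mary",  ("yes", 1.0),   ("Albi", rho)),
     ("Peter", ("no", beta),   ("Toulouse", phi))]
# Relation S: City -> (FleaMarket, certainty)
S = {"Albi": ("yes", gamma), "Toulouse": ("yes", delta)}

def certainly_married_near_flea_market(row):
    """Necessity degree with which the person is married and lives in a
    city with a flea market (0.0 when this is not certain at all)."""
    name, (married, c_m), (city, c_c) = row
    flea, c_f = S[city]
    if married == "yes" and flea == "yes":
        # certainty of a conjunction of certainty-qualified facts = min
        return min(c_m, c_c, c_f)
    return 0.0

answers = {row[0]: certainly_married_near_flea_market(row) for row in R}
# John -> min(alpha, mu, delta), Mary -> min(rho, gamma), Peter -> 0.0
```

The result matches the example: John is retrieved with certainty min(α, µ, δ) and Mary with min(ρ, γ).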





Possibilistic logic has been extended in different manners. In this section, we consider three main types of extension: i) replacing the totally ordered scale of the certainty levels by a partially ordered structure; ii) dealing with logical formulas that are weighted in terms of lower bounds of a strong (guaranteed) possibility measure ∆ (see subsection 2.2); iii) allowing for negation of basic possibilistic logic formulas, or for their disjunction (and no longer only for their conjunction), which leads to generalized possibilistic logic.


Lattice-based possibilistic logics

Basically, a possibilistic formula is a pair made of a classical logic formula and a label that qualifies in what conditions or in what manner the classical logic formula is regarded as true. One may think of associating "labels" other than certainty levels. They may be lower bounds of other measures in possibility theory, such as in particular strong possibility measures, as reviewed in the next subsection. They may also be labels taking values in partially ordered structures, such as lattices. This can be motivated by different needs, as briefly reviewed now.

Different intended purposes

Timed possibilistic logic [Dubois et al., 1991b] was the first extension of this kind to be proposed. In timed possibilistic logic, logical formulas are associated with sets of time instants where the formula is known to be certainly true. More generally, certainty may be graded as in basic possibilistic logic, and then formulas are associated with fuzzy sets of time instants, where the grade attached to a time instant is the certainty level with which the formula is true at that time. At the semantic level, this leads to an extension of necessity (and possibility) measures, now valued in a distributive lattice structure. Taking inspiration from possibilistic logic, Lafage, Lang and Sabbadin [1999] have proposed a logic of supporters, where each formula a is associated with a set of logical arguments in favor of a. More recently, an interval-based possibilistic logic has been presented [Benferhat et al., 2011], where classical logic formulas are associated with intervals, thought of as imprecise certainty levels. Another early proposed idea, in an information fusion perspective, is to associate each formula with a set of distinct explicit sources that support its truth [Dubois et al., 1992]. Again, a certainty level may be attached to



each source, and then formulas are associated with fuzzy sets of sources. This has led to the proposal of a "multiple agent" logic where formulas are of the form (a, A), where A denotes a subset of agents that are known to believe that a is true. In contrast with timed possibilistic logic, where it is important to make sure that the knowledge base remains consistent over time, what matters in multiple agent logic is the collective consistency of subsets of agents (while the collection of the beliefs held by the whole set of agents may be inconsistent). We now indicate the main features of this latter logic.

Multiple agent logic

Multiple agent possibilistic logic was outlined in [Dubois and Prade, 2007], but its underlying semantics has been laid bare more recently [Belhadi et al., 2013]. A multiple agent propositional formula is a pair (a, A), where a is a classical propositional formula of L and A is a non-empty subset of All, i.e., A ⊆ All (All denotes the finite set of all considered agents). The intuitive meaning of formula (a, A) is that at least all the agents in A believe that a is true. In spite of the obvious parallel with possibilistic logic (where propositions are associated with levels expressing the strength with which the propositions are believed to be true), (a, A) should not be used just as another way of expressing the strength of the support in favor of a (the larger A, the stronger the support), but rather as a piece of information linking a proposition with a group of agents. Multiple agent logic has two inference rules:

• if B ⊆ A then (a, A) ⊢ (a, B) (subset weakening)

• (¬a ∨ b, A), (a, A) ⊢ (b, A), ∀A ∈ 2^All \ {∅} (modus ponens)

As a consequence, we also have the resolution rule: if A ∩ B ≠ ∅, then (¬a ∨ b, A), (a ∨ c, B) ⊢ (b ∨ c, A ∩ B).
If A ∩ B = ∅, the information resulting from applying the rule does not belong to the language, and would make little sense: it is of no use to put formulas of the form (a, ∅) in a base, as they correspond to information possessed by no agent. Since 2^All is not totally ordered, unlike the scale of certainty levels, we cannot "slice" a multiple agent knowledge base Γ = {(a_i, A_i), i = 1, . . . , m} into layers as in basic possibilistic logic. Still, one can define the restriction of Γ to a subset A ⊆ All as Γ_A = {(a_i, A_i ∩ A) | A_i ∩ A ≠ ∅ and (a_i, A_i) ∈ Γ}.



Moreover, an inconsistency subset of agents can be defined for Γ as

inc-s(Γ) = ∪ {A ⊆ All | Γ ⊢ (⊥, A)}, and inc-s(Γ) = ∅ if there is no A s.t. Γ ⊢ (⊥, A).

Note that in this definition A = ∅ is not forbidden. For instance, let Γ = {(p, A), (q, B), (¬p ∨ q, C), (¬q, D)}; then inc-s(Γ) = (A ∩ C ∩ D) ∪ (B ∩ D), and obviously inc-s(Γ_{(A∩C∩D)^c ∩ (B∩D)^c}) = ∅ (restricting Γ to the agents involved in no inconsistency, where ·^c denotes complementation in All). Clearly, it is not the case that the consistency of Γ (inc-s(Γ) = ∅) implies that Γ° is consistent. This feature contrasts with possibilistic logic. Just consider the example Γ = {(a, A), (¬a, A^c)}; then inc-s(Γ) = A ∩ A^c = ∅, while Γ° = {a_i | (a_i, A_i) ∈ Γ, i = 1, . . . , m} is inconsistent. This is compatible with situations where agents contradict each other. Yet, the consistency of Γ° does entail inc-s(Γ) = ∅. The semantics of multiple agent logic is expressed in terms of set-valued possibility distributions, set-valued possibility measures and set-valued necessity measures. Namely, the semantics of formula (a, A) is given by the set-valued distribution π_{(a,A)}:

∀ω ∈ Ω, π_{(a,A)}(ω) = All if ω ⊨ a, and π_{(a,A)}(ω) = A^c if ω ⊨ ¬a

where A^c = All \ A, and the formula (a, A) is understood as expressing the constraint N(a) ⊇ A, where N is a set-valued necessity measure. Soundness and completeness results can be established with respect to this semantics [Belhadi et al., 2013]. Basic possibilistic logic and multiple agent logic may then be combined in a possibilistic multiple agent logic. Formulas are pairs (a, F), where F is now a fuzzy subset of All. One may in particular consider the fuzzy sets F = (α/A) such that (α/A)(k) = α if k ∈ A, and (α/A)(k) = 0 if k ∉ A, i.e., we restrict ourselves to formulas of the form (a, α/A) that encode the piece of information "at least all agents in A believe a at least at level α". Then the resolution rule becomes

(¬p ∨ q, α/A), (p ∨ r, β/B) ⊢ (q ∨ r, min(α, β)/(A ∩ B)).
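This resolution rule is straightforward to implement. The sketch below is ours (the clause encoding is an arbitrary choice, not from the chapter): a formula is a clause, a level, and an agent set, and resolution is defined only when the agent sets intersect.

```python
def resolve(f1, f2):
    """One resolution step in possibilistic multiple agent logic:
    (not p or q, alpha/A), (p or r, beta/B) |- (q or r, min(alpha, beta)/(A & B)),
    defined only when A and B intersect. A clause is a frozenset of
    literals; a literal is a (variable, polarity) pair."""
    c1, lvl1, A = f1
    c2, lvl2, B = f2
    agents = A & B
    if not agents:            # (., empty set) is not in the language
        return None
    for (v, pol) in c1:
        if (v, not pol) in c2:
            # remove the complementary literals, keep everything else
            resolvent = (c1 - {(v, pol)}) | (c2 - {(v, not pol)})
            return (resolvent, min(lvl1, lvl2), agents)
    return None

f1 = (frozenset({('p', False), ('q', True)}), 0.8, frozenset({1, 2, 3}))
f2 = (frozenset({('p', True), ('r', True)}),  0.6, frozenset({2, 3, 4}))
# resolve(f1, f2) yields (q or r) at level min(0.8, 0.6) for agents {2, 3}
```

When the agent sets are disjoint, the function returns None, mirroring the restriction that (a, ∅) is not in the language.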


Uses of the strong possibility set function in possibilistic logic

As recalled in subsection 2.2, a possibility distribution can be associated not only with the increasing set functions Π and N , but also with the decreasing set functions ∆ and ∇. As we are going to see, this enables a double reading, from above and from below, of a possibility distribution. This



double reading may be of interest in preference representation, by allowing the use of different but equivalent representation formats. Moreover, a ∆-based possibilistic logic, handling formulas associated with lower bounds of a ∆ set function, can be developed. This is reviewed first. The last part of this subsection is devoted to a different use of ∆ set functions, namely the modeling of bipolar information. There one distinguishes between positive information (expressed by means of ∆-based possibilistic logic formulas) and negative information (expressed by means of N-based possibilistic logic formulas), these two types of information being associated with two distinct possibility distributions.

Double reading of a possibility distribution

In basic possibilistic logic, a base Γ_N = {(a_i, α_i), i = 1, ..., m} is semantically associated with the possibility distribution

π_{Γ_N}(ω) = min_{i=1,...,m} max([a_i](ω), 1 − α_i),

where [a_i] is the characteristic function of the models of a_i. Being the result of a min-combination, this corresponds to a reading "from above" of the possibility distribution. Let us consider another type of logical formula (now denoted between brackets rather than parentheses), a pair [b, β], expressing the constraint ∆(b) ≥ β, where ∆ is a guaranteed or strong possibility measure. Then, a ∆-base Γ_∆ = {[b_j, β_j] | j = 1, . . . , n} is associated with the distribution

π_{Γ_∆}(ω) = max_{j=1,...,n} π_{[b_j,β_j]}(ω), with π_{[b_j,β_j]}(ω) = β_j if ω ∈ [b_j], and 0 otherwise.

Being the result of a max-combination, this corresponds to a reading "from below" of the possibility distribution. It can be proved [Dubois et al., 2014b] that the N-base Γ_N = {(a_i, α_i) | i = 1, . . . , m} is semantically equivalent to the ∆-base

Γ_∆ = {[∧_{i∈J} a_i, min_{k∉J}(1 − α_k)] : J ⊆ {1, . . . , m}}.

Although it looks as if the translated knowledge base is exponentially larger than the original one, it can be simplified. Indeed, suppose, without loss of generality, that 1 = α_1 > α_2 > · · · > α_m, with α_{m+1} = 0 by convention (we combine conjunctively all formulas with the same level). Then it is easy to check that

max_{J ⊆ {1,...,m}} min( min_{k∉J}(1 − α_k), [∧_{j∈J} a_j](ω) ) = max_{k=1,...,m+1} min( 1 − α_k, [∧_{j=1}^{k−1} a_j](ω) ),

which corresponds to the ∆-base

Γ_∆ = {[∧_{j=1}^{k−1} a_j, 1 − α_k] : k = 1, . . . , m + 1},



with ∧_{j=1}^{0} a_j = ⊤ (tautology). Of course, likewise, the ∆-base Γ_∆ = {[b_j, β_j] | j = 1, . . . , n} is semantically equivalent to the N-base

Γ_N = {(∨_{j∈J} b_j, max_{k∉J}(1 − β_k)) : J ⊆ {1, . . . , n}},

which can be simplified as Γ_N = {(∨_{j=1}^{k−1} b_j, 1 − β_k) : k = 1, . . . , n + 1}, with β_1 > β_2 > · · · > β_n > β_{n+1} = 0 and ∨_{j=1}^{0} b_j = ⊥ (contradiction), by convention. Thus, a possibilistic logic base Γ_∆ expressed in terms of a strong possibility measure can always be rewritten equivalently in terms of a standard possibilistic logic base Γ_N using necessity measures, and conversely, enforcing the equality π_{Γ_N} = π_{Γ_∆}. The transformation from π_{Γ_N} to π_{Γ_∆} corresponds to rewriting the min-max expression of π_{Γ_N} as a max-min expression (applying the distributivity of min over max), and conversely. This is now illustrated on a preference example.

Preference representation

As already emphasized, possibilistic logic applies to the representation of both knowledge and preferences. In the case of preferences, the level α associated with a formula a in (a, α) is understood as a priority.

EXAMPLE 7. A piece of preference such as "I prefer p to q and q to r" (where p, q, r may not be mutually exclusive) can be represented by the possibilistic base Γ_N = {(p ∨ q ∨ r, 1), (p ∨ q, 1 − γ), (p, 1 − β)} with γ < β < 1, by translating the preference into a set of more or less imperative goals. Namely, Γ_N states that p is somewhat imperative, that p ∨ q is more imperative, and that p ∨ q ∨ r is compulsory. Note that the preferences are here expressed negatively: "nothing is possible outside p, q, or r", "nothing is really possible outside p or q", and "nothing is strongly possible outside p". The possibilistic base Γ_N is associated with the possibility distribution π_{Γ_N}, which rank-orders the alternatives:

π_{Γ_N}(pqr) = 1, π_{Γ_N}(p¬qr) = 1, π_{Γ_N}(pq¬r) = 1, π_{Γ_N}(p¬q¬r) = 1, π_{Γ_N}(¬pqr) = β, π_{Γ_N}(¬pq¬r) = β, π_{Γ_N}(¬p¬qr) = γ, π_{Γ_N}(¬p¬q¬r) = 0.

From this possibility distribution, one can compute the associated measure of strong possibility for some events of interest:

∆(p) = min(π_{Γ_N}(pqr), π_{Γ_N}(p¬qr), π_{Γ_N}(pq¬r), π_{Γ_N}(p¬q¬r)) = 1
∆(q) = min(π_{Γ_N}(pqr), π_{Γ_N}(¬pqr), π_{Γ_N}(pq¬r), π_{Γ_N}(¬pq¬r)) = β



∆(r) = min(π_{Γ_N}(pqr), π_{Γ_N}(¬pqr), π_{Γ_N}(p¬qr), π_{Γ_N}(¬p¬qr)) = γ.

This gives birth to the positive base Γ_∆ = {[p, 1], [q, β], [r, γ]}, itself associated with a possibility distribution:

π_{Γ_∆}(pqr) = 1, π_{Γ_∆}(p¬qr) = 1, π_{Γ_∆}(pq¬r) = 1, π_{Γ_∆}(p¬q¬r) = 1, π_{Γ_∆}(¬pqr) = β, π_{Γ_∆}(¬pq¬r) = β, π_{Γ_∆}(¬p¬qr) = γ, π_{Γ_∆}(¬p¬q¬r) = 0.

It can be checked that π_{Γ_N} = π_{Γ_∆}. Thus, the preferences are here equivalently expressed in a positive manner as a "weighted" disjunction of the three choices p, q and r, stating that p is fully satisfactory, q is less satisfactory, and r is still less satisfactory.

This shows that the preferences here can be equivalently encoded under the form of the positive base Γ_∆, or of the negative base Γ_N [Benferhat et al., 2001d]. Let us mention the representational equivalence [Benferhat et al., 2004a] between qualitative choice logic [Brewka et al., 2004; Benferhat and Sedki, 2008] and ∆-based possibilistic logic, which can itself be viewed as a kind of DNF-like counterpart of standard (CNF-like) possibilistic logic at the representation level. The above ideas have been applied to preference queries to databases [Bosc et al., 2010; Dubois and Prade, 2013], for modeling the connectives "and if possible" and "or at least" in queries. Besides, it has been shown that the behavior of Sugeno integrals, a well-known family of qualitative multiple criteria aggregation operators, can be described under the form of possibilistic logic bases (of the N-type or of the ∆-type) [Dubois et al., 2014b]. It is also possible to represent preferences with an additive structure in the possibilistic setting, thanks to appropriate fusion operators, as noticed in [Prade, 2009].
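The equality π_{Γ_N} = π_{Γ_∆} claimed in Example 7 can be verified mechanically. In the Python sketch below (ours), the numeric values chosen for β and γ are arbitrary; only γ < β < 1 matters:

```python
from itertools import product

beta, gamma = 0.6, 0.3          # illustrative levels with gamma < beta < 1

def pi_N(base, w):
    """Distribution of an N-base: min over (a, alpha) of max([a](w), 1 - alpha)."""
    return min(1.0 if a(w) else 1.0 - alpha for a, alpha in base)

def pi_D(base, w):
    """Distribution of a Delta-base: max over [b, lvl] of lvl on [b], 0 outside."""
    return max(lvl if b(w) else 0.0 for b, lvl in base)

# Gamma_N = {(p or q or r, 1), (p or q, 1 - gamma), (p, 1 - beta)}
N_base = [(lambda w: w['p'] or w['q'] or w['r'], 1.0),
          (lambda w: w['p'] or w['q'],           1.0 - gamma),
          (lambda w: w['p'],                     1.0 - beta)]
# Gamma_Delta = {[p, 1], [q, beta], [r, gamma]}
D_base = [(lambda w: w['p'], 1.0),
          (lambda w: w['q'], beta),
          (lambda w: w['r'], gamma)]

for v in product([False, True], repeat=3):
    w = dict(zip('pqr', v))
    assert abs(pi_N(N_base, w) - pi_D(D_base, w)) < 1e-12
```

The loop confirms that the min-max ("from above") and max-min ("from below") readings induce the same distribution over all eight interpretations.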
Inference in ∆-based possibilistic logic

While in basic possibilistic logic the certainty level of a formula assesses the certainty that the interpretations violating the formula are excluded as possible worlds, ∆-based formulas rather express to what extent the models of the formulas are actually possible in the real world. This is a consequence of the decreasingness of the set functions ∆, which leads to a non-standard behavior with respect to inference. Indeed, the following cut rule can be established [Dubois et al., 2000; Dubois and Prade, 2004] (using the notation of ∆-based formulas):

[a ∧ b, α], [¬a ∧ c, β] ⊢ [b ∧ c, min(α, β)]
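This rule can also be checked numerically by brute force over randomly generated possibility distributions, as in the following sketch of ours:

```python
from itertools import product
import random

random.seed(0)
worlds = [dict(zip('abc', v)) for v in product([False, True], repeat=3)]

def delta(event, pi):
    """Guaranteed possibility Delta(E) = min of pi over the models of E
    (1.0 by convention when E has no model)."""
    return min((pi[i] for i, w in enumerate(worlds) if event(w)), default=1.0)

for _ in range(1000):
    pi = [random.random() for _ in worlds]
    lhs = delta(lambda w: w['b'] and w['c'], pi)
    rhs = min(delta(lambda w: w['a'] and w['b'], pi),
              delta(lambda w: not w['a'] and w['c'], pi))
    # semantic counterpart of the cut rule:
    # Delta(b and c) >= min(Delta(a and b), Delta(not a and c))
    assert lhs >= rhs
```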



This is due to the fact that, in terms of models, we have [b∧c] ⊆ [a∧b] ∪ [¬a∧c]. Thus, if both any model of [a ∧ b] and any model of [¬a ∧ c] are satisfactory, it should also be the case of any model of [b ∧ c]. Moreover, there is also an inference rule mixing strong possibility and weak necessity, established in [Dubois et al., 2013a]:

∆([a ∧ b]) ≥ α and ∇([¬a ∧ c]) ≥ β entail ∇([b ∧ c]) ≥ α ∗ β,

where α ∗ β = α if α > 1 − β, and α ∗ β = 0 if 1 − β ≥ α. Besides, it has been advocated in [Casali et al., 2011; Dubois et al., 2013a] that desires obey the characteristic postulate of the set functions ∆, namely ∆(a ∨ b) = min(∆(a), ∆(b)). Indeed, all the models of a ∨ b are satisfactory (or desirable) if both all the models of a and all the models of b are actually satisfactory. Then desiring a amounts to finding satisfactory any situation where a is true. However, this may be a bit too strong, since there may exist some exceptional situations that are not satisfactory although a is true. This calls for a nonmonotonic treatment of desires in terms of the ∆ function, which is outlined in [Dubois et al., 2014a].

Bipolar representation

The representation capabilities of possibilistic logic are suitable for expressing bipolar information [Dubois and Prade, 2006; Benferhat et al., 2008]. Indeed, this setting allows the representation of both negative information and positive information. The bipolar setting is of interest for representing observations and knowledge, or for representing positive and negative preferences. Negative information reflects what is not (fully) impossible and thus remains potentially possible. It induces constraints restricting where the real world is (when expressing knowledge), or delimiting the potentially satisfactory choices (when dealing with preferences). Negative information can be encoded by basic (i.e., necessity-based) possibilistic logic formulas.
Indeed, (a, α) encodes N(a) ≥ α, which is equivalent to Π(¬a) ≤ 1 − α, and thus reflects the impossibility of ¬a, which is all the stronger as α is high. Positive information, expressing what is actually possible or what is really desirable, is encoded by ∆-based formulas [b, β], which express the constraint ∆(b) ≥ β. Positive information and negative information are not necessarily provided by the same sources: in other words, they may rely on two different possibility distributions. The modeling of beliefs and desires provides another example where two possibility distributions are needed: one for restricting the more or less plausible states of the world according to the available knowledge, another for



describing the more or less satisfactory states according to the expressed desires [Dubois et al., 2013a]. Fusion operations can be defined at the semantic and at the syntactic level in the bipolar setting [Benferhat and Kaci, 2003; Benferhat et al., 2006]. The fusion of the negative part of the information is performed by using the formulas of subsection 4.2 for basic possibilistic logic. Their counterpart for positive information is

Γ^∆_{1⊕2} = {[a_i, α_i ⊕ 0] s.t. [a_i, α_i] ∈ Γ^∆_1}
         ∪ {[b_j, 0 ⊕ β_j] s.t. [b_j, β_j] ∈ Γ^∆_2}
         ∪ {[a_i ∧ b_j, α_i ⊕ β_j] s.t. [a_i, α_i] ∈ Γ^∆_1, [b_j, β_j] ∈ Γ^∆_2},

while π_{Γ^∆_{1⊕2}} = π_{Γ^∆_1} ⊕ π_{Γ^∆_2}. This may be used for aggregating positive (together with negative) preferences given by different agents who state what would be really satisfactory for them (and what they reject more or less strongly). It may also be used for combining positive (together with negative) knowledge. Then positive knowledge is usually made of reported cases that testify to what is actually possible, while negative knowledge excludes what is (more or less certainly) impossible. A consistency condition is natural between positive and negative information, namely: what is actually possible (positive information) should be included in what is not impossible (complement of the negative information). Since positive information is combined disjunctively (the more positive information we have, the more interpretations are actually possible), and negative information conjunctively (the more negative information we have, the fewer worlds are non-impossible), this consistency condition should be enforced in the result of a fusion process. This can be done by a revision step that gives priority either to the negative side (in general when handling preferences, where rejections are more important), or to the positive side (which may apply to knowledge, when reliable observations conflict with general beliefs) [Dubois et al., 2001b].


Generalized possibilistic logic

In basic possibilistic logic, only conjunctions of possibilistic logic formulas are allowed (since a conjunction is equivalent to the conjunction of its conjuncts, due to the min-decomposability of necessity measures). However, the negation and the disjunction of possibilistic logic formulas make sense as well. Indeed, the pair (a, α) is both a possibilistic logic formula at the



object level and a classical formula at the meta level. Since (a, α) is semantically interpreted as N(a) ≥ α, a possibilistic formula can be manipulated as a formula that is true (if N(a) ≥ α) or false (if N(a) < α). Then possibilistic formulas can be combined with all propositional connectives. We are then in the realm of generalized possibilistic logic (GPL) [Dubois and Prade, 2011a], first suggested in [Dubois and Prade, 2007]. Note that, for disjunction, the set of possibility distributions representing the disjunctive constraint 'N(a) ≥ α or N(b) ≥ β' no longer has a unique extremal element in general, as is the case for conjunction. Thus the semantics of GPL is in terms of sets of possibility distributions, rather than given by a unique possibility distribution as in basic possibilistic logic. More precisely, GPL is a two-tier propositional logic, in which propositional formulas are encapsulated by modal operators that are interpreted in terms of uncertainty measures from possibility theory. Let S_k = {0, 1/k, 2/k, . . . , 1} with k ∈ N \ {0} be the finite set of certainty degrees under consideration, and let S_k^+ = S_k \ {0}. Let L be the language of all propositional formulas. The language of GPL L_N^k with k + 1 certainty levels is as follows:

• If a ∈ L and α ∈ S_k^+, then N_α(a) ∈ L_N^k.

• If ϕ ∈ L_N^k and ψ ∈ L_N^k, then ¬ϕ and ϕ ∧ ψ are also in L_N^k.

Here we use the notation N_α(a), instead of (a, α), emphasizing the closeness with modal logic calculi and allowing the introduction of other associated modalities. So, an agent asserting N_α(a) has an epistemic state π such that N(a) ≥ α > 0. Hence ¬N_α(a) stands for N(a) < α, which, given the finiteness of the set of considered certainty degrees, means N(a) ≤ α − 1/k, and thus Π(¬a) ≥ 1 − α + 1/k. Let ν(α) = 1 − α + 1/k. Then ν(α) ∈ S_k^+ iff α ∈ S_k^+, and ν(ν(α)) = α, ∀α ∈ S_k^+. Thus, we can write Π_α(a) ≡ ¬N_{ν(α)}(¬a).
Thus, in GPL, one can distinguish between the absence of certainty that a is true (¬N_α(a)) and the (stronger) certainty statement that a is false (N_α(¬a)). The semantics of GPL is defined in terms of normalized possibility distributions over propositional interpretations, where possibility degrees are limited to S_k. A model of a GPL formula is any S_k-valued possibility distribution π which satisfies:

• π is a model of N_α(a) iff N(a) ≥ α;

• π is a model of ϕ_1 ∧ ϕ_2 iff π is a model of ϕ_1 and a model of ϕ_2;

• π is a model of ¬ϕ iff π is not a model of ϕ;



where N is the necessity measure induced by π. As usual, π is called a model of a set of GPL formulas K, written π ⊨ K, if π is a model of each formula in K. We write K ⊨ Φ, for K a set of GPL formulas and Φ a GPL formula, iff every model of K is also a model of Φ. The soundness and completeness of the following axiomatization of GPL has been established with respect to the above semantics [Dubois et al., 2012; Dubois et al., 2014c]:

(PL) The Hilbert axioms of classical logic
(K) N_α(a → b) → (N_α(a) → N_α(b))
(N) N_1(⊤)
(D) N_α(a) → Π_1(a)
(W) N_{α_1}(a) → N_{α_2}(a), if α_1 ≥ α_2

with modus ponens as the only inference rule. Note in particular that, when α is fixed, we get a fragment of the modal logic KD. See [Herzig and Fariñas del Cerro, 1991; Dubois et al., 1988; Liau and Lin, 1992; Dubois et al., 2000] for previous studies of the links between modal logics and possibility theory. The case where k = 1 coincides with the Meta-Epistemic Logic (MEL) that was introduced by Banerjee and Dubois [2009; 2014]. This simpler logic, a fragment of KD with no nested modalities nor objective formulas, can express full certainty and full ignorance only, and its semantics is in terms of non-empty subsets of interpretations. Moreover, an extension of MEL [Banerjee et al., 2014] to a language containing modal formulas of depth 0 or 1 only has been shown to be in some sense equivalent to S5 with a restricted language, but with the same expressive power, the semantics being based on pairs made of an interpretation (representing the real world) and a non-empty set of possible interpretations (representing an epistemic state). Note that in MEL we have Π_1(a) ≡ ¬N_1(¬a), whereas in general we only have Π_1(a) ≡ ¬N_{1/k}(¬a). GPL is suitable for reasoning about the revealed beliefs of another agent.
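Because S_k and the set of S_k-valued normalized distributions over a small vocabulary are finite, the GPL semantics can be explored by brute-force enumeration. The sketch below (our own encoding, with k = 2 and a single variable p) illustrates the distinction made above between the absence of certainty ¬N_1(p) and the stronger statement N_1(¬p):

```python
from fractions import Fraction
from itertools import product

k = 2
S = [Fraction(i, k) for i in range(k + 1)]          # S_k = {0, 1/2, 1}
worlds = [{'p': False}, {'p': True}]

def necessity(a, pi):
    """N(a) = min over counter-models of a of 1 - pi(w)."""
    return min((1 - pi[i] for i, w in enumerate(worlds) if not a(w)),
               default=Fraction(1))

# GPL formulas as predicates over distributions pi (tuples indexed by world)
def N(alpha, a):   return lambda pi: necessity(a, pi) >= alpha
def NOT(phi):      return lambda pi: not phi(pi)

pis = [d for d in product(S, repeat=len(worlds)) if max(d) == 1]  # normalized
p = lambda w: w['p']
not_p = lambda w: not w['p']

weak = {pi for pi in pis if NOT(N(1, p))(pi)}    # no full certainty in p
strong = {pi for pi in pis if N(1, not_p)(pi)}   # full certainty that p is false
assert strong < weak                              # strictly more demanding
```

Every distribution fully certain of ¬p is a model of ¬N_1(p), but not conversely, matching the intended reading.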
It captures the idea that, while the consistent epistemic state of an agent about the world is represented by a normalized possibility distribution over possible worlds, the meta-epistemic state of another agent about the former's epistemic state is a family of possibility distributions. Modalities associated with the set functions ∆ and ∇ can also be introduced in the GPL language [Dubois et al., 2014c]. For a propositional interpretation ω, let us write conj_ω for the conjunction of all literals made true by ω, i.e., conj_ω = ∧_{ω⊨a} a ∧ ∧_{ω⊨¬a} ¬a. Since ∆(a) = min_{ω∈[a]} Π({ω}), we define:


∆_α(a) = ∧_{ω∈[a]} Π_α(conj_ω);   ∇_α(a) = ¬∆_{ν(α)}(¬a).

Using the modality ∆, for any possibility distribution π over the set of interpretations Ω, we can easily define a GPL theory which has π as its only model [Dubois et al., 2014c]. In particular, let a_1, ..., a_k be propositional formulas such that [a_i] = {ω | π(ω) ≥ i/k}. Then we define the theory Φ_π as:

Φ_π = ∧_{i=1}^{k} ( N_{ν(i/k)}(a_i) ∧ ∆_{i/k}(a_i) ).

In this equation, the degree of possibility of each ω ∈ [a_i] is bounded from above and from below. Indeed, ∆_{i/k}(a_i) means that π(ω) ≥ i/k for all ω ∈ [a_i], whereas N_{ν(i/k)}(a_i) means that π(ω) ≤ (i − 1)/k for all ω ∉ [a_i]. It follows that π(ω) = 0 if ω ∉ [a_1], π(ω) = i/k if ω ∈ [a_i] \ [a_{i+1}] (for i < k), and π(ω) = 1 if ω ∈ [a_k]. In other words, π is indeed the only model of Φ_π. If we view the epistemic state of an agent as a possibility distribution, this means that every epistemic state can be modeled using a GPL theory. Conceptually, the construction of Φ_π relates to the notion of "only knowing" from Levesque [1990]. See [Dubois et al., 2014c] for a detailed study. Another remarkable application of generalized possibilistic logic is its capability to encode any answer set program, choosing S_2^+ = {1/2, 1}. In this case, we can discriminate between propositions of which we are fully certain and propositions which we consider to be more plausible than not. This is sufficient to enable us to capture the semantics of rules (with negation as failure) within GPL. See [Dubois et al., 2011] for the introduction of the basic ideas in a possibility theory and approximate reasoning perspective, and [Dubois et al., 2012] for theoretical results (including the encoding of equilibrium logic [Pearce, 2006]). In GPL, modalities cannot be nested. Still, it seems possible to give a meaning in the possibility theory setting to a formula of the form ((a, α), β). Its semantics, viewing (a, α) as a true or false statement, is given by a possibility distribution over possibility distributions: the distributions π such that π ≤ π_{(a,α)} (which make N(a) ≥ α true) receive weight 1, and all the other possibility distributions receive weight 1 − β.
This may reduce to a single possibility distribution corresponding to the semantics of (a, min(α, β)), via the disjunctive weighted aggregation max(min(π_{(a,α)}, 1), min(1, 1 − β)), which expresses that either it is the case that N(a) ≥ α, with a possibility level equal to 1, or one knows nothing, with possibility 1 − β. Nested modalities are in particular of interest for expressing mutual beliefs of multiple agents.
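As a minimal numeric check of this reduction (all names below are our own illustrative assumptions), recall that π_{(a,α)} equals 1 on the models of a and 1 − α elsewhere; combining it (weight 1) disjunctively with the vacuous "know nothing" distribution (weight 1 − β) gives back π_{(a,min(α,β))}:

```python
def pi_a_alpha(models_a, omega, alpha):
    """Least specific distribution making N(a) >= alpha true."""
    return 1.0 if omega in models_a else 1.0 - alpha

def aggregated(models_a, omega, alpha, beta):
    # disjunctive weighted aggregation: max(min(1, pi_(a,alpha)), min(1, 1 - beta))
    return max(min(1.0, pi_a_alpha(models_a, omega, alpha)), min(1.0, 1.0 - beta))

models_a = {"w1"}          # interpretations where a holds; w2 falsifies a
alpha, beta = 0.8, 0.6
for w in ["w1", "w2"]:
    assert aggregated(models_a, w, alpha, beta) == pi_a_alpha(models_a, w, min(alpha, beta))
```

On a model of a both sides equal 1; outside, max(1 − α, 1 − β) = 1 − min(α, β), as the reduction to (a, min(α, β)) requires.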



This suggests hybridizing GPL with possibilistic multiple-agent logic, and studying whether the Booleanization of possibilistic formulas gives us the capability of properly expressing mutual beliefs between agents, as well as validating inferences with nested modalities such as (¬(a, 1), α), ((a, 1) ∨ b, β) ⊢ (b, min(α, β)), following ideas suggested in [Dubois and Prade, 2007; Dubois and Prade, 2011a]. Other possibility theory-based logical formalisms have been developed which are at least as expressive as GPL, but which are based on fuzzy logic [Hájek et al., 1995; Hájek, 1998] instead of keeping a Boolean view of possibilistic formulas as in GPL. Moreover, these formalisms have been extended to cope with fuzzy propositions as well [Flaminio et al., 2011; Flaminio et al., 2011b; Flaminio et al., 2012], and may allow for nested modalities [Hájek et al., 1994; Bou et al., 2014], or propose a fuzzy modal logic of possibilistic conditionals [Marchioni, 2006]. See also [Liau, 1998; Liau and Lin, 1996] for other possibilistic logic formalisms (over classical logic propositions). A careful comparison of GPL with these different formalisms remains to be done. However, the distinctive feature of basic possibilistic logic, as well as of GPL, is to remain as close as possible to classical logic, which makes possibilistic logic simple to handle, and should also have computational advantages.



CONCLUSION

Possibilistic logic is thirty years old. Although related to the idea of fuzzy sets through possibility measures, possibilistic logic departs from other fuzzy logics [Dubois et al., 2007], since it primarily focuses on classical logic formulas pervaded with qualitative uncertainty. Indeed, basic possibilistic logic, as well as generalized possibilistic logic, remains close to classical logic, but still allows for a sophisticated and powerful treatment of modalities. This chapter is an attempt at offering a broad overview of the basic ideas underlying the possibilistic logic setting, through the richness of its representation formats and its various applications to many AI problems, in relation with the representation of epistemic states and their handling when reasoning from and about them. In that respect possibilistic logic can be compared to other approaches, including nonmonotonic logics, modal logics, and Bayesian nets.

Directions for further research in possibilistic logic include both theoretical issues and application concerns. On the theoretical side, extensions to nonclassical logics [Besnard and Lang, 1994], to the handling of fuzzy predicates [Dellunde et al., 2011; El-Zekey and Godo, 2012], and to partially ordered sets of logical formulas [Cayrol et al., 2014] are worth continuing, and relations with conditional logics [Halpern, 2005; Lewis, 1973; Hájek, 1998] are worth investigating. On the applied side, the development of efficient implementations, of applications to information systems, and of extensions of possibilistic logic to multiple-agent settings and to argumentation [Chesñevar et al., 2005; Alsinet et al., 2008; Nieves and Cortés, 2006; Godo et al., 2012; Amgoud and Prade, 2012] would be of particular interest.

ACKNOWLEDGMENTS

Some people have been instrumental in the development of possibilistic logic over three decades. In that respect, particular thanks are especially due to Salem Benferhat, Souhila Kaci, Jérôme Lang, and Steven Schockaert. The authors also wish to thank Leila Amgoud, Mohua Banerjee, Philippe Besnard, Claudette Cayrol, Florence Dupin de Saint-Cyr, Luis Fariñas del Cerro, Henri Farreny, Hélène Fargier, Dov Gabbay, Lluis Godo, Andy Herzig, Tony Hunter, Sebastian Link, Weiru Liu, Olivier Pivert, Agnès Rico, Régis Sabbadin, Mary-Anne Williams, and Lotfi Zadeh for discussion, encouragement and support over the years.

BIBLIOGRAPHY

[Alsinet and Godo, 2000] T. Alsinet and L. Godo. A complete calculus for possibilistic logic programming with fuzzy propositional variables. In Proc. 16th Conf. on Uncertainty in Artificial Intelligence (UAI’00), Stanford, Ca., pages 1–10, San Francisco, 2000. Morgan Kaufmann. [Alsinet et al., 2002] T. Alsinet, L. Godo, and S. Sandri. Two formalisms of extended possibilistic logic programming with context-dependent fuzzy unification: a comparative description. Elec. Notes in Theor. Computer Sci., 66(5), 2002. [Alsinet et al., 2008] T. Alsinet, C. I. Chesñevar, and L. Godo. A level-based approach to computing warranted arguments in possibilistic defeasible logic programming. In Ph. Besnard, S. Doutre, and A. Hunter, editors, Proc. 2nd Inter. Conf. on Computational Models of Argument (COMMA’08), Toulouse, May 28-30, pages 1–12. IOS Press, 2008. [Amgoud and Prade, 2009] L. Amgoud and H. Prade. Using arguments for making and explaining decisions. Artificial Intelligence, 173:413–436, 2009. [Amgoud and Prade, 2012] L. Amgoud and H. Prade. Towards a logic of argumentation. In E. Hüllermeier, S. Link, Th. Fober, and B. Seeger, editors, Proc. 6th Int. Conf. on Scalable Uncertainty Management (SUM’12), Marburg, Sept. 17-19, volume 7520 of LNCS, pages 558–565. Springer, 2012. [Banerjee and Dubois, 2009] M. Banerjee and D. Dubois. A simple modal logic for reasoning about revealed beliefs. In C. Sossai and G. Chemello, editors, Proc. 10th Europ. Conf. on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU), Verona, July 1-3, number 5590 in LNCS, pages 805–816. Springer, 2009.



[Banerjee and Dubois, 2014] M. Banerjee and D. Dubois. A simple logic for reasoning about incomplete knowledge. Int. J. of Approximate Reasoning, 55:639–653, 2014. [Banerjee et al., 2014] M. Banerjee, D. Dubois, and L. Godo. Possibilistic vs. relational semantics for logics of incomplete information. In A. Laurent, O. Strauss, B. Bouchon-Meunier, and R. R. Yager, editors, Proc. 15th Int. Conf. on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU’14), Part I, Montpellier, July 15-19, volume 442 of Comm. in Comp. and Inf. Sci., pages 335–344. Springer, 2014. [Bauters et al., 2010] K. Bauters, S. Schockaert, M. De Cock, and D. Vermeir. Possibilistic answer set programming revisited. In P. Grünwald and P. Spirtes, editors, Proc. 26th Conf. on Uncertainty in Artificial Intelligence (UAI’10), Catalina Island, July 8-11, pages 48–55. AUAI Press, 2010. [Bauters et al., 2011] K. Bauters, S. Schockaert, M. De Cock, and D. Vermeir. Weak and strong disjunction in possibilistic ASP. In S. Benferhat and J. Grant, editors, Proc. 5th Int. Conf. on Scalable Uncertainty Management (SUM’11), Dayton, October 10-13, volume 6929 of LNCS, pages 475–488. Springer, 2011. [Bauters et al., 2012] K. Bauters, S. Schockaert, M. De Cock, and D. Vermeir. Possible and necessary answer sets of possibilistic answer set programs. In Proc. 24th IEEE Int. Conf. on Tools with Artificial Intelligence (ICTAI’12), Athens, Nov. 7-9, pages 836–843, 2012. [Belhadi et al., 2013] A. Belhadi, D. Dubois, F. Khellaf-Haned, and H. Prade. Multiple agent possibilistic logic. J. of Applied Non-Classical Logics, 23:299–320, 2013. [Belnap, 1977] N. D. Belnap. A useful four-valued logic. In J. M. Dunn and G. Epstein, editors, Modern Uses of Multiple-Valued Logic, pages 7–37. D. Reidel, Dordrecht, 1977. [Ben Amor and Benferhat, 2005] N. Ben Amor and S. Benferhat. Graphoid properties of qualitative possibilistic independence relations. Int. J.
Uncertainty, Fuzziness & Knowledge-based Syst., 13:59–97, 2005. [Ben Amor et al., 2002] N. Ben Amor, S. Benferhat, D. Dubois, K. Mellouli, and H. Prade. A theoretical framework for possibilistic independence in a weakly ordered setting. Int. J. Uncertainty, Fuzziness & Knowledge-based Syst., 10:117–155, 2002. [Ben Amor et al., 2003] N. Ben Amor, S. Benferhat, and K. Mellouli. Anytime propagation algorithm for min-based possibilistic graphs. Soft Comput., 8(2):150–161, 2003. [Ben Amor et al., 2014] N. Ben Amor, D. Dubois, H. Gouider, and H. Prade. Possibilistic networks: A new setting for modeling preferences. In U. Straccia and A. Calì, editors, Proc. 8th Int. Conf. on Scalable Uncertainty Management (SUM 2014), Oxford, Sept. 15-17, volume 8720 of LNCS, pages 1–7. Springer, 2014. [Benferhat and Bouraoui, 2013] S. Benferhat and Z. Bouraoui. Possibilistic DL-Lite. In W. Liu, V. S. Subrahmanian, and J. Wijsen, editors, Proc. 7th Int. Conf. on Scalable Uncertainty Management (SUM’13), Washington, DC, Sept. 16-18, volume 8078 of LNCS, pages 346–359. Springer, 2013. [Benferhat and Kaci, 2003] S. Benferhat and S. Kaci. Logical representation and fusion of prioritized information based on guaranteed possibility measures: Application to the distance-based merging of classical bases. Artificial Intelligence, 148(1-2):291–333, 2003. [Benferhat and Prade, 2005] S. Benferhat and H. Prade. Encoding formulas with partially constrained weights in a possibilistic-like many-sorted propositional logic. In L. Pack Kaelbling and A. Saffiotti, editors, Proc. of the 19th Inter. Joint Conf. on Artificial Intelligence (IJCAI’05), Edinburgh, July 30-Aug. 5, pages 1281–1286, 2005. [Benferhat and Prade, 2006] S. Benferhat and H. Prade. Compiling possibilistic knowledge bases. In G. Brewka, S. Coradeschi, A. Perini, and P. Traverso, editors, Proc.



17th Europ. Conf. on Artificial Intelligence (ECAI’06), Riva del Garda, Aug. 29 Sept. 1, pages 337–341. IOS Press, 2006. [Benferhat and Sedki, 2008] S. Benferhat and K. Sedki. Two alternatives for handling preferences in qualitative choice logic. Fuzzy Sets and Systems, 159(15):1889–1912, 2008. [Benferhat and Smaoui, 2007a] S. Benferhat and S. Smaoui. Hybrid possibilistic networks. Int. J. Approx. Reasoning, 44(3):224–243, 2007. [Benferhat and Smaoui, 2007b] S. Benferhat and S. Smaoui. Possibilistic causal networks for handling interventions: A new propagation algorithm. In Proc. 22nd AAAI Conf. on Artificial Intelligence (AAAI’07), Vancouver, July 22-26,, pages 373–378, 2007. [Benferhat and Smaoui, 2011] S. Benferhat and S. Smaoui. Inferring interventions in product-based possibilistic causal networks. Fuzzy Sets and Systems, 169:26–50, 2011. [Benferhat and Titouna, 2005] S. Benferhat and F. Titouna. Min-based fusion of possibilistic networks. In E. Montseny and P. Sobrevilla, editors, Proc. 4th Conf. of the Europ. Soc. for Fuzzy Logic and Technology (EUSFLAT’05), Barcelona, Sept. 7-9, pages 553–558. Universidad Polytecnica de Catalunya, 2005. [Benferhat and Titouna, 2009] S. Benferhat and F. Titouna. Fusion and normalization of quantitative possibilistic networks. Applied Intelligence, 31(2):135–160, 2009. [Benferhat et al., 1992] S. Benferhat, D. Dubois, and H. Prade. Representing default rules in possibilistic logic. In Proc. 3rd Inter. Conf. on Principles of Knowledge Representation and Reasoning (KR’92), Cambridge, Ma, Oct. 26-29, pages 673–684, 1992. [Benferhat et al., 1993a] S. Benferhat, D. Dubois, and H. Prade. Argumentative inference in uncertain and inconsistent knowledge base. In Proc. 9th Conf. on Uncertainty in Artificial Intelligence, Washington, DC, July 9-11, pages 411–419. Morgan Kaufmann, 1993. [Benferhat et al., 1993b] S. Benferhat, D. Dubois, and H. Prade. Possibilistic logic: From nonmonotonicity to logic programming. In M. Clarke, R. 
Kruse, and S. Moral, editors, Proc. Europ. Conf. on Symbolic and Quantitative Approaches to Reasoning and Uncertainty (ECSQARU’93), Granada, Nov. 8-10, volume 747 of LNCS, pages 17–24. Springer, 1993. [Benferhat et al., 1994a] S. Benferhat, D. Dubois, J. Lang, and H. Prade. Hypothetical reasoning in possibilistic logic: basic notions and implementation issues. In P. Z. Wang and K. F. Loe, editors, Between Mind and Computer, Fuzzy Science and Engineering, pages 1–29. World Scientific Publ., Singapore, 1994. [Benferhat et al., 1994b] S. Benferhat, D. Dubois, and H. Prade. Expressing independence in a possibilistic framework and its application to default reasoning. In Proc. 11th Europ. Conf. on Artificial Intelligence (ECAI’94), Amsterdam, Aug. 8-12, pages 150–154, 1994. [Benferhat et al., 1997a] S. Benferhat, T. Chehire, and F. Monai. Possibilistic ATMS in a data fusion problem. In D. Dubois, H. Prade, and R.R. Yager, editors, Fuzzy Information Engineering: A Guided Tour of Applications, pages 417–435. John Wiley & Sons, New York, 1997. [Benferhat et al., 1997b] S. Benferhat, D. Dubois, and H. Prade. Nonmonotonic reasoning, conditional objects and possibility theory. Artificial Intelligence, 92(1-2):259–276, 1997. [Benferhat et al., 1998a] S. Benferhat, D. Dubois, and H. Prade. From semantic to syntactic approaches to information combination in possibilistic logic. In B. BouchonMeunier, editor, Aggregation and Fusion of Imperfect Information, pages 141–161. Physica-Verlag, Heidelberg, 1998.



[Benferhat et al., 1998b] S. Benferhat, D. Dubois, and H. Prade. Practical handling of exception-tainted rules and independence information in possibilistic logic. Applied Intelligence, 9(2):101–127, 1998. [Benferhat et al., 1999a] S. Benferhat, D. Dubois, and H. Prade. An overview of inconsistency-tolerant inferences in prioritized knowledge bases. In D. Dubois, E. P. Klement, and H. Prade, editors, Fuzzy Sets, Logic and Reasoning about Knowledge, volume 15 of Applied Logic Series, pages 395–417. Kluwer, Dordrecht, 1999. [Benferhat et al., 1999b] S. Benferhat, D. Dubois, and H. Prade. Possibilistic and standard probabilistic semantics of conditional knowledge bases. J. of Logic and Computation, 9(6):873–895, 1999. [Benferhat et al., 1999c] S. Benferhat, D. Dubois, H. Prade, and M.-A. Williams. A practical approach to fusing prioritized knowledge bases. In Proc. 9th Portuguese Conf. on Artificial Intelligence (EPIA’99), Evora, Sept. 21-24, volume 1695 of LNCS, pages 222–236. Springer, 1999. [Benferhat et al., 2000a] S. Benferhat, D. Dubois, H. Fargier, H. Prade, and R. Sabbadin. Decision, nonmonotonic reasoning and possibilistic logic. In J. Minker, editor, Logic-Based Artificial Intelligence, pages 333–358. Kluwer Acad. Publ., 2000. [Benferhat et al., 2000b] S. Benferhat, D. Dubois, and H. Prade. Kalman-like filtering in a possibilistic setting. In W. Horn, editor, Proc.14th Europ. Conf. on Artificial Intelligence (ECAI’00), Berlin, Aug. 20-25, pages 8–12. IOS Press, 2000. [Benferhat et al., 2001a] S. Benferhat, D. Dubois, S. Kaci, and H.Prade. Bridging logical, comparative and graphical possibilistic representation frameworks. In S. Benferhat and P. Besnard, editors, Proc. 6th Europ. Conf. on Symbolic and Quantitative Approaches to reasoning with Uncertainty (ECSQARU’01), Toulouse, Sept. 19-21, volume 2143 of LNAI, pages 422–431. Springer, 2001. [Benferhat et al., 2001b] S. Benferhat, D. Dubois, S. Kaci, and H. Prade. 
Graphical readings of a possibilistic logic base. In J. Breese and D. Koller, editors, Proc. 17th Conf. on Uncertainty in Artificial Intelligence (UAI’01), Seattle, Aug. 2-5, pages 24–31. Morgan Kaufmann, 2001. [Benferhat et al., 2001c] S. Benferhat, D. Dubois, and H. Prade. A computational model for belief change and fusing ordered belief bases. In M.-A. Williams and H. Rott, editors, Frontiers in Belief Revision, pages 109–134. Kluwer Acad. Publ., 2001. [Benferhat et al., 2001d] S. Benferhat, D. Dubois, and H. Prade. Towards a possibilistic logic handling of preferences. Applied Intelligence, 14(3):303–317, 2001. [Benferhat et al., 2002a] S. Benferhat, D. Dubois, L. Garcia, and H. Prade. On the transformation between possibilistic logic bases and possibilistic causal networks. Int. J. Approx. Reasoning, 29(2):135–173, 2002. [Benferhat et al., 2002b] S. Benferhat, D. Dubois, S. Kaci, and H. Prade. Possibilistic merging and distance-based fusion of propositional information. Annals of Mathematics and Artificial Intelligence, 34(1-3):217–252, 2002. [Benferhat et al., 2002c] S. Benferhat, D. Dubois, H. Prade, and M.-A. Williams. A practical approach to revising prioritized knowledge bases. Studia Logica, 70(1):105– 130, 2002. [Benferhat et al., 2003] S. Benferhat, R. El Baida, and F. Cuppens. A possibilistic logic encoding of access control. In I. Russell and S. M. Haller, editors, Proc. 16th Int. Florida Artificial Intelligence Research Society Conf., St. Augustine, Fl., May 12-14, pages 481–485. AAAI Press, 2003. [Benferhat et al., 2004a] S. Benferhat, G. Brewka, and D. Le Berre. On the relation between qualitative choice logic and possibilistic logic. In Proc. 10th Inter. Conf. Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 04), July 4-9, Perugia, pages 951–957, 2004.



[Benferhat et al., 2004b] S. Benferhat, S. Lagrue, and O. Papini. Reasoning with partially ordered information in a possibilistic logic framework. Fuzzy Sets and Systems, 144(1):25–41, 2004. [Benferhat et al., 2005] S. Benferhat, F. Khellaf, and A. Mokhtari. Product-based causal networks and quantitative possibilistic bases. Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems, 13:469–493, 2005. [Benferhat et al., 2006] S. Benferhat, D. Dubois, S. Kaci, and H. Prade. Bipolar possibility theory in preference modeling: Representation, fusion and optimal solutions. Information Fusion, 7(1):135–150, 2006. [Benferhat et al., 2008] S. Benferhat, D. Dubois, S. Kaci, and H. Prade. Modeling positive and negative information in possibility theory. Int. J. of Intelligent Systems, 23(10):1094–1118, 2008. [Benferhat et al., 2009] S. Benferhat, D. Dubois, and H. Prade. Interventions in possibilistic logic. In L. Godo and A. Pugliese, editors, Proc. 3rd Int. Conf. on Scalable Uncertainty Management (SUM’09), Washington, DC, Sept. 28-30, volume 5785 of LNCS, pages 40–54. Springer, 2009. [Benferhat et al., 2010] S. Benferhat, D. Dubois, H. Prade, and M.-A. Williams. A framework for iterated belief revision using possibilistic counterparts to Jeffrey’s rule. Fundam. Inform., 99(2):147–168, 2010. [Benferhat et al., 2011] S. Benferhat, J. Hu´ e, S. Lagrue, and J. Rossit. Interval-based possibilistic logic. In T. Walsh, editor, Proc. 22nd Inter. Joint Conf. on Artificial Intelligence (IJCAI’11), Barcelona, July 16-22,, pages 750–755, 2011. [Benferhat et al., 2013] S. Benferhat, Z. Bouraoui, and Z. Loukil. Min-based fusion of possibilistic DL-Lite knowledge bases. In Proc. IEEE/WIC/ACM Int. Conf. on Web Intelligence (WI’13), Atlanta, GA Nov. 17-20, pages 23–28. IEEE Computer Society, 2013. [Benferhat, 2010] S. Benferhat. Interventions and belief change in possibilistic graphical models. Artificial Intelligence, 174:177–189, 2010. [Besnard and Hunter, 1995] Ph. Besnard and A. 
Hunter. Quasi-classical logic: Non-trivializable classical reasoning from inconsistent information. In Ch. Froidevaux and J. Kohlas, editors, Proc. 3rd Europ. Conf. Symbolic and Quantitative Approaches to Reasoning and Uncertainty (ECSQARU’95), Fribourg, July 3-5, volume 946 of LNCS, pages 44–51. Springer, 1995. [Besnard and Lang, 1994] Ph. Besnard and J. Lang. Possibility and necessity functions over non-classical logics. In R. López de Mántaras and D. Poole, editors, Proc. 10th Conf. on Uncertainty in Artificial Intelligence (UAI’94), Seattle, July 29-31, pages 69–76. Morgan Kaufmann, 1994. [Bocheński, 1947] I. M. Bocheński. La Logique de Théophraste. Librairie de l’Université de Fribourg en Suisse, 1947. [Boldrin and Sossai, 1997] L. Boldrin and C. Sossai. Local possibilistic logic. J. of Applied Non-Classical Logics, 7(3):309–333, 1997. [Boldrin, 1995] L. Boldrin. A substructural connective for possibilistic logic. In Ch. Froidevaux and J. Kohlas, editors, Proc. 3rd Europ. Conf. on Symbolic and Quantitative Approaches to Reasoning and Uncertainty (ECSQARU’95), Fribourg, July 3-5, volume 946 of LNCS, pages 60–68. Springer, 1995. [Bonnefon et al., 2008] J.-F. Bonnefon, R. Da Silva Neves, D. Dubois, and H. Prade. Predicting causality ascriptions from background knowledge: model and experimental validation. Int. J. Approximate Reasoning, 48(3):752–765, 2008. [Bonnefon et al., 2012] J.-F. Bonnefon, R. Da Silva Neves, D. Dubois, and H. Prade. Qualitative and quantitative conditions for the transitivity of perceived causation: Theoretical and experimental results. Ann. Math. Artif. Intell., 64(2-3):311–333, 2012.



[Bosc and Pivert, 2005] P. Bosc and O. Pivert. About projection-selection-join queries addressed to possibilistic relational databases. IEEE Trans. on Fuzzy Systems, 13:124– 139, 2005. [Bosc et al., 2009] P. Bosc, O. Pivert, and H. Prade. A model based on possibilistic certainty levels for incomplete databases. In L. Godo and A. Pugliese, editors, Proc. 3rd Int. Conf. on Scalable Uncertainty Management (SUM’09), Washington, DC, Sept. 28-30, volume 5785 of LNCS, pages 80–94. Springer, 2009. [Bosc et al., 2010] P. Bosc, O. Pivert, and H. Prade. A possibilistic logic view of preference queries to an uncertain database. In Proc. 19th IEEE Int. Conf. on Fuzzy Systems (FUZZ-IEEE’10), Barcelona, July 18-23, pages 379–384, 2010. [Bou et al., 2014] F. Bou, F. Esteva, and L. Godo. On possibilistic modal logics defined over MTL-chains. In Petr H´ ajek on Mathematical Fuzzy Logic, Trends in Logic, Springer, in press, 2014. [Brewka et al., 2004] G. Brewka, S. Benferhat, and D. Le Berre. Qualitative choice logic. Artificial Intelligence, 157(1-2):203–237, 2004. [Brewka et al., 2011] G. Brewka, V. Marek, and M. Truszczynski, eds. Nonmonotonic Reasoning. Essays Celebrating its 30th Anniversary., volume 31 of Studies in Logic. College Publications, 2011. [Buchanan and Shortliffe, 1984] B. G. Buchanan and E. H. Shortliffe, editors. RuleBased Expert Systems. The MYCIN Experiments of the Stanford Heuristic Programming Project. Addison-Wesley, Reading, Ma., 1984. [Casali et al., 2011] A. Casali, L. Godo, and C. Sierra. A graded BDI agent model to represent and reason about preferences. Artificial Intelligence, 175(7-8):1468–1478, 2011. [Cayrol et al., 2014] C. Cayrol, D. Dubois, and F. Touazi. On the semantics of partially ordered bases. In Ch. Beierle and C. Meghini, editors, Proc. 8th Int. Symp. on Foundations of Information and Knowledge Systems (FoIKS’14), Bordeaux, Mar. 3-7, volume 8367 of LNCS, pages 136–153. Springer, 2014. [Chellas, 1980] B. F. Chellas. 
Modal Logic, an Introduction. Cambridge University Press, Cambridge, 1980. [Chesñevar et al., 2005] C. I. Chesñevar, G. R. Simari, L. Godo, and T. Alsinet. Argument-based expansion operators in possibilistic defeasible logic programming: Characterization and logical properties. In L. Godo, editor, Proc. 8th Europ. Conf. on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU’05), Barcelona, July 6-8, volume 3571 of LNCS, pages 353–365. Springer, 2005. [Cohen, 1977] L. J. Cohen. The Probable and the Provable. Clarendon Press, Oxford, 1977. [Coletti and Vantaggi, 2009] G. Coletti and B. Vantaggi. T-conditional possibilities: Coherence and inference. Fuzzy Sets and Systems, 160(3):306–324, 2009. [Confalonieri and Prade, 2014] R. Confalonieri and H. Prade. Using possibilistic logic for modeling qualitative decision: Answer set programming algorithms. Int. J. Approximate Reasoning, 55(2):711–738, 2014. [Confalonieri et al., 2012] R. Confalonieri, J. C. Nieves, M. Osorio, and J. Vázquez-Salceda. Dealing with explicit preferences and uncertainty in answer set programming. Ann. Math. Artif. Intell., 65(2-3):159–198, 2012. [De Baets et al., 1999] B. De Baets, E. Tsiporkova, and R. Mesiar. Conditioning in possibility with strict order norms. Fuzzy Sets and Systems, 106:221–229, 1999. [De Campos and Huete, 1999] L. M. De Campos and J. F. Huete. Independence concepts in possibility theory. Fuzzy Sets and Systems, 103:127–152 & 487–506, 1999. [De Cooman, 1997] G. De Cooman. Possibility theory. Part I: Measure- and integral-theoretic groundwork; Part II: Conditional possibility; Part III: Possibilistic independence. Int. J. of General Syst., 25:291–371, 1997.



[Dellunde et al., 2011] P. Dellunde, L. Godo, and E. Marchioni. Extending possibilistic logic over Gödel logic. Int. J. Approx. Reasoning, 52(1):63–75, 2011. [Dubois and Prade, 1980] D. Dubois and H. Prade. Fuzzy Sets and Systems - Theory and Applications. Academic Press, New York, 1980. [Dubois and Prade, 1988] D. Dubois and H. Prade. Possibility Theory. An Approach to Computerized Processing of Uncertainty. Plenum Press, New York and London, 1988. With the collaboration of H. Farreny, R. Martin-Clouaire and C. Testemale. [Dubois and Prade, 1990a] D. Dubois and H. Prade. The logical view of conditioning and its application to possibility and evidence theories. Int. J. Approx. Reasoning, 4(1):23–46, 1990. [Dubois and Prade, 1990b] D. Dubois and H. Prade. Resolution principles in possibilistic logic. Int. J. Approximate Reasoning, 4(1):1–21, 1990. [Dubois and Prade, 1991] D. Dubois and H. Prade. Epistemic entrenchment and possibilistic logic. Artificial Intelligence, 50:223–239, 1991. [Dubois and Prade, 1992] D. Dubois and H. Prade. Possibility theory as a basis for preference propagation in automated reasoning. In Proc. 1st IEEE Inter. Conf. on Fuzzy Systems (FUZZ-IEEE’92), San Diego, Ca., March 8-12, pages 821–832, 1992. [Dubois and Prade, 1993] D. Dubois and H. Prade. Belief revision and updates in numerical formalisms: An overview, with new results for the possibilistic framework. In R. Bajcsy, editor, Proc. 13th Int. Joint Conf. on Artificial Intelligence, Chambéry, Aug. 28 - Sept. 3, pages 620–625. Morgan Kaufmann, 1993. [Dubois and Prade, 1995a] D. Dubois and H. Prade. Conditional objects, possibility theory and default rules. In G. Crocco, L. Fariñas del Cerro, and A. Herzig, editors, Conditionals: From Philosophy to Computer Science, Studies in Logic and Computation, pages 301–336. Oxford Science Publ., 1995. [Dubois and Prade, 1995b] D. Dubois and H. Prade. Possibility theory as a basis for qualitative decision theory. In Proc. 14th Int. Joint Conf.
on Artificial Intelligence (IJCAI’95), Montr´ eal , Aug. 20-25, pages 1924–1932. Morgan Kaufmann, 1995. [Dubois and Prade, 1996] D. Dubois and H. Prade. Combining hypothetical reasoning and plausible inference in possibilistic logic. J. of Multiple Valued Logic, 1:219–239, 1996. [Dubois and Prade, 1997a] D. Dubois and H. Prade. A synthetic view of belief revision with uncertain inputs in the framework of possibility theory. Int. J. Approx. Reasoning, 17:295–324, 1997. [Dubois and Prade, 1997b] D. Dubois and H. Prade. Valid or complete information in databases - a possibility theory-based analysis. In A. Hameurlain and A.M. Tjoa, editors, Database and Expert Systems Applications, Proc. of the 8th Inter. Conf. DEXA’97, Toulouse, Sept. 1-5, volume 1308 of LNCS, pages 603–612. Springer, 1997. [Dubois and Prade, 1998] D. Dubois and H. Prade. Possibility theory: Qualitative and quantitative aspects. In D. M. Gabbay and Ph. Smets, editors, Quantified Representation of Uncertainty and Imprecision, volume 1 of Handbook of Defeasible Reasoning and Uncertainty Management Systems, pages 169–226. Kluwer Acad. Publ., 1998. [Dubois and Prade, 2000] D. Dubois and H. Prade. An overview of ordinal and numerical approaches to causal diagnostic problem solving. In D.M. Gabbay and R. Kruse, editors, Abductive Reasoning and Learning, Vol. 4 in Handbooks of Defeasible Reasoning and Uncertainty Management Systems, pages 231–280. Kluwer Acad. Publ., Boston, 2000. [Dubois and Prade, 2004] D. Dubois and H. Prade. Possibilistic logic: A retrospective and prospective view. Fuzzy Sets and Systems, 144:3–23, 2004. [Dubois and Prade, 2006] D. Dubois and H. Prade. A bipolar possibilistic representation of knowledge and preferences and its applications. In I. Bloch, A. Petrosino, A. Tettamanzi, and G. B. Andrea, editors, Revised Selected Papers from the Inter.



Workshop on Fuzzy Logic and Applications (WILF’05), Crema, Italy, Sept. 2005, volume 3849 of LNCS, pages 1–10. Springer, 2006. [Dubois and Prade, 2007] D. Dubois and H. Prade. Toward multiple-agent extensions of possibilistic logic. In Proc. IEEE Inter. Conf. on Fuzzy Systems (FUZZ-IEEE’07), London, July 23-26, pages 187–192, 2007. [Dubois and Prade, 2011a] D. Dubois and H. Prade. Generalized possibilistic logic. In S. Benferhat and J. Grant, editors, Proc. 5th Int. Conf. on Scalable Uncertainty Management (SUM’11), Dayton, Oh, Oct. 10-13, volume 6929 of LNCS, pages 428– 432. Springer, 2011. [Dubois and Prade, 2011b] D. Dubois and H. Prade. Handling various forms of inconsistency in possibilistic logic. In F. Morvan, A. Min Tjoa, and R. Wagner, editors, Proc. 2011 Database and Expert Systems Applications, DEXA, Int. Workshops, Toulouse, Aug. 29 - Sept. 2, pages 327–331. IEEE Computer Society, 2011. [Dubois and Prade, 2011c] D. Dubois and H. Prade. Non-monotonic reasoning and uncertainty theories. In G. Brewka, V. Marek, and M. Truszczynski, editors, Nonmonotonic Reasoning. Essays Celebrating its 30th Anniversary, volume 31 of Studies in Logic, pages 141–176. College Publications, 2011. [Dubois and Prade, 2012] D. Dubois and H. Prade. From Blanch´ e’s hexagonal organization of concepts to formal concept analysis and possibility theory. Logica Universalis, 6 (1-2):149–169, 2012. [Dubois and Prade, 2013] D. Dubois and H. Prade. Modeling “and if possible” and “or at least”: Different forms of bipolarity in flexible querying. In O. Pivert and S. Zadrozny, editors, Flexible Approaches in Data, Information and Knowledge Management, volume 497 of Studies in Computational Intelligence, pages 3–19. Springer, 2013. [Dubois et al., 1987] D. Dubois, J. Lang, and H. Prade. Theorem proving under uncertainty - A possibility theory-based approach. In J. P. McDermott, editor, Proc. 10th Int. Joint Conf. on Artificial Intelligence. Milan, Aug., pages 984–986. Morgan Kaufmann, 1987. 
[Dubois et al., 1988] D. Dubois, H. Prade, and C. Testemale. In search of a modal system for possibility theory. In Y. Kodratoff, editor, Proc. 8th Europ. Conf. on Artificial Intelligence (ECAI’88), Munich, Aug. 1-5, pages 501–506, London: Pitmann Publ., 1988. [Dubois et al., 1990] D. Dubois, J. Lang, and H. Prade. Handling uncertain knowledge in an ATMS using possibilistic logic. In Proc. 5th Inter. Symp. on Methodologies for Intelligent Systems, Knoxville, Oct. 25-27, pages 252–259. North-Holland, 1990. [Dubois et al., 1991a] D. Dubois, J. Lang, and H. Prade. Fuzzy sets in approximate reasoning. Part 2: Logical approaches. Fuzzy Sets and Systems, 40:203–244, 1991. [Dubois et al., 1991b] D. Dubois, J. Lang, and H. Prade. Timed possibilistic logic. Fundamenta Informaticae, 15:211–234, 1991. [Dubois et al., 1991c] D. Dubois, J. Lang, and H. Prade. Towards possibilistic logic programming. In K. Furukawa, editor, Proc. 8th Int. Conf. on Logic Programming (ICLP’91), Paris, June 24-28, 1991, pages 581–595. MIT Press, 1991. [Dubois et al., 1992] D. Dubois, J. Lang, and H. Prade. Dealing with multi-source information in possibilistic logic. In B. Neumann, editor, Proc. 10th Europ. Conf. on Artificial Intelligence (ECAI’92), Vienna, Aug. 3-7, pages 38–42. IEEE Computer Society, 1992. [Dubois et al., 1994a] D. Dubois, J. Lang, and H. Prade. Automated reasoning using possibilistic logic: semantics, belief revision and variable certainty weights. IEEE Trans. on Data and Knowledge Engineering, 6(1):64–71, 1994. [Dubois et al., 1994b] D. Dubois, J. Lang, and H. Prade. Handling uncertainty, context, vague predicates, and partial inconsistency in possibilistic logic. In D. Driankov, P. W.



Eklund, and A. L. Ralescu, editors, Fuzzy Logic and Fuzzy Control, Proc. IJCAI ’91 Workshop, Sydney, Aug. 24, 1991, volume 833 of LNCS, pages 45–55. Springer, 1994. [Dubois et al., 1994c] D. Dubois, J. Lang, and H. Prade. Possibilistic logic. In D. M. Gabbay, C. J. Hogger, J. A. Robinson, and D. Nute, editors, Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 3, pages 439–513. Oxford Univ. Press, 1994. [Dubois et al., 1997] D. Dubois, L. Fariñas del Cerro, A. Herzig, and H. Prade. Qualitative relevance and independence: A roadmap. In Proc. 15th Int. Joint Conf. on Artificial Intelligence, Nagoya, pages 62–67, 1997. [Dubois et al., 1998a] D. Dubois, S. Moral, and H. Prade. Belief change rules in ordinal and numerical uncertainty theories. In D. Dubois and H. Prade, editors, Belief Change, pages 311–392. Kluwer, Dordrecht, 1998. [Dubois et al., 1998b] D. Dubois, H. Prade, and S. Sandri. A possibilistic logic with fuzzy constants and fuzzily restricted quantifiers. In T. P. Martin and F. Arcelli-Fontana, editors, Logic Programming and Soft Computing, pages 69–90. Research Studies Press, Baldock, UK, 1998. [Dubois et al., 1999a] D. Dubois, L. Fariñas del Cerro, A. Herzig, and H. Prade. A roadmap of qualitative independence. In D. Dubois, H. Prade, and E. P. Klement, editors, Fuzzy Sets, Logics and Reasoning about Knowledge, volume 15 of Applied Logic Series, pages 325–350. Kluwer Acad. Publ., Dordrecht, 1999. [Dubois et al., 1999b] D. Dubois, D. Le Berre, H. Prade, and R. Sabbadin. Using possibilistic logic for modeling qualitative decision: ATMS-based algorithms. Fundamenta Informaticae, 37(1-2):1–30, 1999. [Dubois et al., 2000] D. Dubois, P. Hájek, and H. Prade. Knowledge-driven versus data-driven logics. J. Logic, Language, and Information, 9:65–89, 2000. [Dubois et al., 2001a] D. Dubois, H. Prade, and R. Sabbadin. Decision-theoretic foundations of qualitative possibility theory. Europ. J. of Operational Research, 128(3):459–478, 2001.
[Dubois et al., 2001b] D. Dubois, H. Prade, and Ph. Smets. “Not impossible” vs. “guaranteed possible” in fusion and revision. In S. Benferhat and Ph. Besnard, editors, Proc. 6th Europ. Conf. Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU’01), Toulouse, Sept. 19-21, volume 2143 of LNCS, pages 522–531. Springer, 2001. [Dubois et al., 2003] D. Dubois, S. Konieczny, and H. Prade. Quasi-possibilistic logic and its measures of information and conflict. Fundamenta Informaticae, 57(2-4):101– 125, 2003. [Dubois et al., 2006] D. Dubois, S. Kaci, and H. Prade. Approximation of conditional preferences networks ?CP-nets? in possibilistic logic. In Proc. IEEE Int. Conf. on Fuzzy Systems (FUZZ-IEEE’06), Vancouver, July 16-21, pages 2337– 2342, 2006. [Dubois et al., 2007] D. Dubois, F. Esteva, L. Godo, and H. Prade. Fuzzy-set based logics - An history-oriented presentation of their main developments. In D. M. Gabbay and J. Woods, editors, Handbook of the History of Logic, Vol. 8, The Many-Valued and Nonmonotonic Turn in Logic, pages 325–449. Elsevier, 2007. [Dubois et al., 2011] D. Dubois, H. Prade, and S. Schockaert. Rules and metarules in the framework of possibility theory and possibilistic logic. Scientia Iranica, Transactions D, 18:566–573, 2011. [Dubois et al., 2012] D. Dubois, H. Prade, and S. Schockaert. Stable models in generalized possibilistic logic. In G. Brewka, Th. Eiter, and S. A. McIlraith, editors, Proc. 13th Int. Conf. Principles of Knowledge Representation and Reasoning (KR’12), Rome, June 10-14, pages 519–529. AAAI Press, 2012. [Dubois et al., 2013a] D. Dubois, E. Lorini, and H. Prade. Bipolar possibility theory as a basis for a logic of desire and beliefs. In W.r. Liu, V.S. Subramanian, and



J. Wijsen, editors, Proc. Int. Conf. on Scalable Uncertainty Management (SUM’13), Washington, DC, Sept. 16-18, number 8078 in LNCS, pages 204–218. Springer, 2013. [Dubois et al., 2013b] D. Dubois, H. Prade, and F. Touazi. Conditional preference nets and possibilistic logic. In L. C. van der Gaag, editor, Proc. 12th Europ. Conf. on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU’13), Utrecht, July 8-10, volume 7958 of LNCS, pages 181–193. Springer, 2013. [Dubois et al., 2013c] D. Dubois, H. Prade, and F. Touazi. Conditional preference-nets, possibilistic logic, and the transitivity of priorities. In M. Bramer and M. Petridis, editors, Proc. of AI-2013, the 33rd SGAI Int. Conf. on Innovative Techniques and Applications of Artificial Intelligence, Cambridge, UK, Dec. 10-12, pages 175–184. Springer, 2013. [Dubois et al., 2014a] D. Dubois, E. Lorini, and H. Prade. Nonmonotonic desires - A possibility theory viewpoint. In Proc. ECAI Int. Workshop on Defeasible and Ampliative Reasoning (DARe’14), Prague, Aug. 19. CEUR, 2014. [Dubois et al., 2014b] D. Dubois, H. Prade, and A. Rico. The logical encoding of Sugeno integrals. Fuzzy Sets and Systems, 241:61–75, 2014. [Dubois et al., 2014c] D. Dubois, H. Prade, and S. Schockaert. Reasoning about uncertainty and explicit ignorance in generalized possibilistic logic. In Proc. 21st Europ. Conf. on Artificial Intelligence (ECAI’14), Prague, Aug. 20-22, 2014. [Dubois, 1986] D. Dubois. Belief structures, possibility theory and decomposable measures on finite sets. Computers and AI, 5:403–416, 1986. [Dubois, 2012] D. Dubois. Reasoning about ignorance and contradiction: many-valued logics versus epistemic logic. Soft Computing, 16(11):1817–1831, 2012. [Dupin de Saint Cyr and Prade, 2008] F. Dupin de Saint Cyr and H. Prade. Handling uncertainty and defeasibility in a possibilistic logic setting. Int. J. Approximate Reasoning, 49(1):67–82, 2008. [Dupin de Saint Cyr et al., 1994] F. Dupin de Saint Cyr, J. 
Lang, and Th. Schiex. Penalty logic and its link with Dempster-Shafer theory. In R. Lopez de Mantaras and D. Poole, editors, Proc. Annual Conf. on Uncertainty in Artificial Intelligence (UAI’94), Seattle, July 29-31, pages 204–211. Morgan Kaufmann, 1994. [El-Zekey and Godo, 2012] M. El-Zekey and L. Godo. An extension of G¨ odel logic for reasoning under both vagueness and possibilistic uncertainty. In S. Greco, B. BouchonMeunier, G. Coletti, M. Fedrizzi, B. Matarazzo, and R. R. Yager, editors, Proc. 14th Int. Conf. on Information Processing and Management of Uncertainty in KnowledgeBased Systems (IPMU’12), Part II, Catania, July 9-13, volume 298 of Comm. in Comp. and Inf. Sci., pages 216–225. Springer, 2012. [Fari˜ nas del Cerro, 1985] L. Fari˜ nas del Cerro. Resolution modal logic. Logique et Analyse, 110-111:153–172, 1985. [Flaminio et al., 2011] T. Flaminio, L. Godo, and E. Marchioni. On the logical formalization of possibilistic counterparts of states over n-valued Lukasiewicz events. J. Log. Comput., 21(3): 429–446, 2011. [Flaminio et al., 2011b] T. Flaminio, L. Godo and E. Marchioni. Reasoning about uncertainty of fuzzy events: an overview. In P. Cintula, C. G. Ferm¨ uller, L. Godo and P. H´ ajek, editors, Understanding Vagueness - Logical, Philosophical, and Linguistic Perspectives, Studies in Logic no. 36, London: College Publications, pages 367–401, 2011. [Flaminio et al., 2012] T. Flaminio, L. Godo, and E. Marchioni. Geometrical aspects of possibility measures on finite domain MV-clans. Soft Comput., 16(11): 1863–1873, 2012. [Gabbay, 1996] D. Gabbay. Labelled Deductive Systems. Volume 1. Oxford University Press, Oxford, 1996. [G¨ ardenfors, 1988] P. G¨ ardenfors. Knowledge in Flux: Modeling the Dynamics of Epistemic States. The MIT Press, 1988. 2nd ed., College Publications, 2008.



[G¨ ardenfors, 1990] P. G¨ ardenfors. Belief revision and nonmonotonic logic: Two sides of the same coin? In L. Aiello, editor, Proc. 9th Europ. Conf. in Artificial Intelligence (ECAI’90), Stockholm, Aug. 8-10, pages 768–773, London, 1990. Pitman. [Godo et al., 2012] L. Godo, E. Marchioni, and P. Pardo. Extending a temporal defeasible argumentation framework with possibilistic weights. In L. Fari˜ nas del Cerro, A. Herzig, and J. Mengin, editors, Proc. 13th Europ. Conf. on Logics in Artificial Intelligence (JELIA’12), Toulouse, Sept. 26-28, volume 7519 of LNCS, pages 242–254. Springer, 2012. [Grabisch and Prade, 2001] M. Grabisch and H. Prade. The correlation problem in sensor fusion in a possibilistic framework. Int. J. of Intelligent Systems, 16(11):1273–1283, 2001. [Grabisch, 2003] M. Grabisch. Temporal scenario modelling and recognition based on possibilistic logic. Artificial Intelligence, 148(1-2):261–289, 2003. [Grove, 1988] A. Grove. Two modellings for theory change. J. Philos. Logic, 17:157–170, 1988. [H´ ajek, 1998] P. H´ ajek. Metamathematics of Fuzzy Logic, volume 4 of Trends in Logic – Studia Logica Library. Kluwer Acad. Publ., Dordrecht, 1998. [H´ ajek et al., 1995] P. H´ ajek, L. Godo and F. Esteva. Fuzzy logic and probability. In Ph. Besnard and S. Hanks, editors, Proc. 11th Conf. on Uncertainty in Artificial Intelligence (UAI’95), Montreal, Aug. 18-20, pages 237–244, 1995. [H´ ajek et al., 1994] P. H´ ajek, D. Harmancov´ a, F. Esteva, P. Garcia and L. Godo: On modal logics for qualitative possibility in a fuzzy setting. In R. Lopez de Mantaras and D. Poole, editors, Proc. 10th Conf. on Uncertainty in Artificial Intelligence (UAI’94) , Seattle, Jul. 29-31, San Francisco: Morgan Kaufmann, pages 278–285, 1994. [Halpern, 2005] J. Y. Halpern. Reasoning about Uncertainty. MIT Press, Cambridge, Ma, 2005. [Herzig and Fari˜ nas del Cerro, 1991] A. Herzig and L. Fari˜ nas del Cerro. A modal analysis of possibility theory. In Ph. Jorrand and J. 
Kelemen, editors, Fundamentals of Artificial Intelligence Research (FAIR’91), Smolenice, Sept. 8-13, volume 535 of LNCS, pages 11–18. Springer, 1991. [Hunter, 2000] A. Hunter. Reasoning with contradictory information using quasiclassical logic. J. Log. Comput., 10(5):677–703, 2000. [Jensen, 2001] F. V. Jensen. Bayesian Networks and Graphs. Springer Verlag, 2001. [Kaci et al., 2000] S. Kaci, S. Benferhat, D. Dubois, and H. Prade. A principled analysis of merging operations in possibilistic logic. In Proc. 16th Conf. on Uncertainty in Artificial (UAI’00), Stanford, June 30 - July 3, pages 24–31, 2000. [Klement et al., 2000] E. P. Klement, R. Mesiar, and E. Pap. Triangular Norms. Springer, 2000. [Koehler et al., 2014a] H. Koehler, U. Leck, S. Link, and H. Prade. Logical foundations of possibilistic keys. In E. Ferm´ e and J. Leite, editors, Proc. 14th Europ. Conf. on Logics in Artificial Intelligence (JELIA’14), Madeira, Sept. 24-26, LNCS 8761, Springer, pages 181–195, 2014. [Koehler et al., 2014b] H. Koehler, S. Link, H. Prade, and X.f. Zhou. Cardinality constraints for uncertain data. In Proc. 33rd Int. Conf. on Conceptual Modeling (ER’14), Atlanta, Oct. 27-29, 2014. [Kraus et al., 1990] S. Kraus, D. Lehmann, and M. Magidor. Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence, 44:167–207, 1990. [Lafage et al., 1999] C. Lafage, J. Lang, and R. Sabbadin. A logic of supporters. In B. Bouchon-Meunier, R. R. Yager, and L. A. Zadeh, editors, Information, Uncertainty and Fusion, pages 381–392. Kluwer Acad. Publ., 1999. [Lang et al., 1991] J. Lang, D. Dubois, and H. Prade. A logic of graded possibility and certainty coping with partial inconsistency. In B. D’Ambrosio and Ph. Smets,



editors, Proc 7th Annual Conf. on Uncertainty in Artificial Intelligence (UAI ’91), Los Angeles, July 13-15, pages 188–196. Morgan Kaufmann, 1991. [Lang, 1991] J. Lang. Possibilistic logic as a logical framework for min-max discrete optimisation problems and prioritized constraints. In P. Jorrand and J. Kelemen, editors, Proc. Inter. Workshop on Fundamentals of Artificial Intelligence Research (FAIR’91), Smolenice, Sept. 8-13, volume 535 of LNCS, pages 112–126. Springer, 1991. [Lang, 2001] J. Lang. Possibilistic logic: complexity and algorithms. In D. Gabbay, Ph. Smets, J. Kohlas, and S. Moral, editors, Algorithms for Uncertainty and Defeasible Reasoning, Vol. 5 of Handbook of Defeasible Reasoning and Uncertainty Management Systems, pages 179–220. Kluwer Acad. Publ., Dordrecht, 2001. [L´ ea Somb´ e Group (ed.), 1994] L´ ea Somb´ e Group (ed.). Ph. Besnard, L. Cholvy, M. O. Cordier, D. Dubois, L. Fari˜ nas del Cerro, C. Froidevaux, F. L´ evy, Y. Moinard, H. Prade, C. Schwind, and P. Siegel. Revision and Updating in Knowledge bases. Int. J. Intelligent Systems, 9(1):1–182, 1994. Also simultaneously published as a book by John Wiley & Sons, New York. [L´ ea Somb´ e Group, 1990] L´ ea Somb´ e Group. Ph. Besnard, M. O. Cordier, D. Dubois, L. Fari˜ nas del Cerro, C. Froidevaux, Y. Moinard, H. Prade, C. Schwind, and P. Siegel. Reasoning under incomplete information in Artificial Intelligence: A comparison of formalisms using a single example. Int. J. Intelligent Systems, 5(4):323–472, 1990. Also simultaneously published as a book by John Wiley & Sons, New York. [Lehmann and Magidor, 1992] D. Lehmann and M. Magidor. What does a conditional knowledge base entail? Artificial Intelligence, 55:1–60, 1992. [Levesque, 1990] H. J. Levesque. All I know: A study in autoepistemic logic. Artificial Intelligence, 42:263–309, 1990. [Levi, 1966] I. Levi. On potential surprise. Ratio, 8:107–129, 1966. [Levi, 1967] I. Levi. Gambling with Truth, chapters VIII and IX. Knopf, New York, 1967. 
[Levi, 1979] I. Levi. Support and surprise: L. J. Cohen’s view of inductive probability. Brit. J. Phil. Sci., 30:279–292, 1979. [Lewis, 1973] D. K. Lewis. Counterfactuals. Basil Blackwell, Oxford, 1973. [Lewis, 1976] D. K. Lewis. Probabilities of conditionals and conditional probabilities. Philosophical Review, 85:297–315, 1976. [Liau, 1998] C.-J. Liau. Possibilistic residuated implication logics with applications. Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems 6(4): 365–386, 1998. [Liau, 1999] C.-J. Liau. On the possibility theory-based semantics for logics of preference. Int. J. Approx. Reasoning, 20(2): 173–190,1999. [Liau and Lin, 1992] C.-J. Liau and B. I-P. Lin. Quantitative modal logic and possibilistic reasoning. In B. Neumann, editor, Proc. 10th Europ. Conf. on Artificial Intelligence (ECAI 92), Vienna, Aug. 3-7, pages 43–47, John Wiley and Sons, 1992. [Liau and Lin, 1996] C.-J. Liau and B. I-P. Lin. Possibilistic reasoning - A mini-survey and uniform semantics. Artificial Intelligence 88(1-2): 163–193,1996. [Liau and Liu, 2001] C.-J. Liau and D.-R. Liu. A possibilistic decision logic with applications. Fundam. Inform., 46(3): 199–217, 2001. [Marchioni, 2006] E. Marchioni. Possibilistic conditioning framed in fuzzy logics. Int. J. Approx. Reasoning, 43(2): 133–165, 2006. [Minker, 2000] J. Minker, editor. Logic-Based Artificial Intelligence. Kluwer, Dordrecht, 2000. [Nicolas et al., 2006] P. Nicolas, L. Garcia, I. St´ ephan, and C. Lef` evre. Possibilistic uncertainty handling for answer set programming. Ann. Math. Artif. Intell., 47(12):139–181, 2006.



[Nieves and Cort´ es, 2006] J. C. Nieves and U. Cort´ es. Modality argumentation programming. In V. Torra, Y. Narukawa, A. Valls, and J. Domingo-Ferrer, editors, Proc. 3rd Inter. Conf. on Modeling Decisions for Artificial Intelligence (MDAI’06), Tarragona, Spain, April 3-5, volume 3885 of LNCS, pages 295–306. Springer, 2006. [Nieves et al., 2007] J. C. Nieves, M. Osorio, and U. Cort´ es. Semantics for possibilistic disjunctive programs. In C. Baral, G. Brewka, and J. S. Schlipf, editors, Proc. 9th Int. Conf. on Logic Programming and Nonmonotonic Reasoning (LPNMR’07), Tempe, AZ, May 15-17, volume 4483 of LNCS, pages 315–320. Springer, 2007. [Parsons, 1997] T. Parsons. The traditional square of opposition. In E. N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Stanford University, spring 2014 edition, 1997. [Pearce, 2006] D. Pearce. Equilibrium logic. Annals of Mathematics and Artificial Intelligence, 47:3–41, 2006. [Pearl, 1988] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publ., 1988. [Pearl, 1990] J. Pearl. System Z: A natural ordering of defaults with tractable applications to nonmonotonic reasoning. In R. Parikh, editor, Proc. 3rd Conf. on Theoretical Aspects of Reasoning about Knowledge, Pacific Grove, pages 121–135. Morgan Kaufmann, 1990. [Pearl, 2000] J. Pearl. Causality: Models, Reasoning and Inference. Cambridge University Press, 2000. 2nd edition, 2009. [Pinkas, 1991] G. Pinkas. Propositional non-monotonic reasoning and inconsistency in symmetric neural networks. In Proc. 12th Int. Joint Conf. on Artificial Intelligence (IJCAI’91) - Vol. 1, pages 525–530, San Francisco, 1991. Morgan Kaufmann Publ. [Pivert and Prade, 2014] O. Pivert and H. Prade. A certainty-based model for uncertain databases. IEEE Trans. on Fuzzy Systems, 2014. To appear. [Prade, 2006] H. Prade. Handling (un)awareness and related issues in possibilistic logic: A preliminary discussion. In J. Dix and A. 
Hunter, editors, Proc. 11th Int. Workshop on Non-Monotonic Reasoning (NMR 2006), Lake District, May 30-June 1, pages 219–225. Clausthal Univ. of Techn., 2006. [Prade, 2009] H. Prade. Current research trends in possibilistic logic: Multiple agent reasoning, preference representation, and uncertain database. In Z. W. Ras and A. Dardzinska, editors, Advances in Data Management, pages 311–330. Springer, 2009. [Qi and Wang, 2012] G.l. Qi and K.w. Wang. Conflict-based belief revision operators in possibilistic logic. In J. Hoffmann and B. Selman, editors, Proc. 26th AAAI Conf. on Artificial Intelligence, Toronto, July 22-26. AAAI Press, 2012. [Qi et al., 2010a] G.l. Qi, J.f. Du, W.r. Liu, and D. A. Bell. Merging knowledge bases in possibilistic logic by lexicographic aggregation. In P. Gr¨ unwald and P. Spirtes, editors, UAI 2010, Proc. 26th Conf. on Uncertainty in Artificial Intelligence, Catalina Island, July 8-11, pages 458–465. AUAI Press, 2010. [Qi et al., 2010b] G.l. Qi, W.r. Liu, and D. A. Bell. A comparison of merging operators in possibilistic logic. In Y.x. Bi and M.-A. Williams, editors, Proc. 4th Int. Conf. on Knowledge Science, Engineering and Management (KSEM’10), Belfast, Sept. 1-3, volume 6291 of LNCS, pages 39–50. Springer, 2010. [Qi et al., 2011] G.l. Qi, Q. Ji, J. Z. Pan, and J.f. Du. Extending description logics with uncertainty reasoning in possibilistic logic. Int. J. Intell. Syst., 26(4), 2011. [Qi, 2008] G.l. Qi. A semantic approach for iterated revision in possibilistic logic. In D. Fox and C. P. Gomes, editors, Proc. 23rd AAAI Conf. on Artificial Intelligence (AAAI’08), Chicago, July 13-17, pages 523–528. AAAI Press, 2008. [Rescher, 1976] N. Rescher. Plausible Reasoning. Van Gorcum, Amsterdam, 1976. [Schiex et al., 1995] T. Schiex, H. Fargier, and G. Verfaillie. Valued constraint satisfaction problems: Hard and easy problems. In Proc. 14th Int. Joint Conf. on Artificial



Intelligence (IJCAI’95), Montr´ eal, Aug. 20-25, Vol.1, pages 631–639. Morgan Kaufmann, 1995. [Schiex, 1992] Th. Schiex. Possibilistic constraint satisfaction problems or “how to handle soft constraints”. In D. Dubois and M. P. Wellman, editors, Proc. 8th Annual Conf. on Uncertainty in Artificial Intelligence (UAI’92), Stanford, July 17-19, pages 268–275, 1992. [Serrurier and Prade, 2007] M. Serrurier and H. Prade. Introducing possibilistic logic in ILP for dealing with exceptions. Artificial Intelligence, 171:939–950, 2007. [Shackle, 1949] G. L. S. Shackle. Expectation in Economics. Cambridge University Press, UK, 1949. 2nd edition, 1952. [Shackle, 1961] G. L. S. Shackle. Decision, Order and Time in Human Affairs. (2nd edition), Cambridge University Press, UK, 1961. [Shackle, 1979] G. L. S. Shackle. Imagination and the Nature of Choice. Edinburgh University Press, 1979. [Shafer, 1976] G. Shafer. A Mathematical Theory of Evidence. Princeton Univ. Press, 1976. [Shortliffe, 1976] E. H. Shortliffe. Computer-based Medical Consultations MYCIN. Elsevier, 1976. [Spohn, 1988] W. Spohn. Ordinal conditional functions: a dynamic theory of epistemic states. In W. L. Harper and B. Skyrms, editors, Causation in Decision, Belief Change, and Statistics, volume 2, pages 105–134. Kluwer, 1988. [Spohn, 2012] W. Spohn. The Laws of Belief: Ranking Theory and Its Philosophical Applications. Oxford Univ. Press, 2012. [Walley, 1991] P. Walley. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, London, 1991. [Walley, 1996] P. Walley. Measures of uncertainty in expert systems. Artificial Intelligence, 83:1–58, 1996. [Yager and Liu, 2008] R. R. Yager and L. P. Liu, editors. Classic Works of the Dempster-Shafer Theory of Belief Functions. Springer Verlag, Heidelberg, 2008. [Yager, 1983] R. R. Yager. An introduction to applications of possibility theory. Human Systems Management, 3:246–269, 1983. [Zadeh, 1978] L. A. Zadeh. Fuzzy sets as a basis for a theory of possibility. 
Fuzzy Sets and Systems, 1:3–28, 1978. [Zadeh, 1979a] L. A. Zadeh. Fuzzy sets and information granularity. In M. M. Gupta, R. Ragade, and R. R. Yager, editors, Advances in Fuzzy Set Theory and Applications, pages 3–18. North-Holland, Amsterdam, 1979. [Zadeh, 1979b] L. A. Zadeh. A theory of approximate reasoning. In J. E. Hayes, D. Mitchie, and L. I. Mikulich, editors, Machine intelligence, Vol. 9, pages 149–194. Ellis Horwood, 1979. [Zadeh, 1982] L. A. Zadeh. Possibility theory and soft data analysis. In L. Cobb and R. Thrall, editors, Mathematical Frontiers of Social and Policy Sciences, pages 69– 129. Westview Press, Boulder, Co., 1982. [Zhu et al., 2013] J.f. Zhu, G.l. Qi, and B. Suntisrivaraporn. Tableaux algorithms for expressive possibilistic description logics. In Proc. IEEE/WIC/ACM Int. Conf. on Web Intelligence (WI’13), Atlanta, Nov.17-20, pages 227–232. IEEE Comp. Soc., 2013.

COMPUTERISING MATHEMATICAL TEXT

Fairouz Kamareddine, Joe Wells, Christoph Zengler and Henk Barendregt

Reader: Serge Autexier


Mathematical texts can be computerised in many ways that capture differing amounts of the mathematical meaning. At one end there is document imaging, which captures only the arrangement of black marks on paper, while at the other end there are proof assistants (e.g., Mizar, Isabelle, Coq, etc.), which capture the full mathematical meaning and have proofs expressed in a formal foundation of mathematics. In between, there are computer typesetting systems (e.g., LaTeX and Presentation MathML) and semantically oriented systems (e.g., Content MathML, OpenMath, OMDoc, etc.). In this paper we advocate a style of computerisation of mathematical texts which is flexible enough to connect the different approaches to computerisation, which allows various degrees of formalisation, and which is compatible with different logical frameworks (e.g., set theory, category theory, type theory, etc.) and proof systems. The basic idea is to allow a man-machine collaboration which weaves human input with machine computation at every step of the way. We propose that the huge step from informal mathematics to fully formalised mathematics be divided into smaller steps, each of which is a fully developed method in which human input is minimal.

Let us consider the following two questions:

1. What is the relationship between the logical foundations of mathematical reasoning and the actual practice of mathematicians?

2. In what ways can computers support the development and communication of mathematical knowledge?


Logical Foundations

Handbook of the History of Logic. Volume 9: Computational Logic.
Volume editor: Jörg Siekmann. Series editors: Dov M. Gabbay and John Woods.
Copyright © 2014 Elsevier B.V. All rights reserved.

Our first question, of the relationship between the practice of mathematics and its logical foundations, has been an issue for at least two millennia. Logic has been influential in the study and development of mathematics since the time of the ancient Greeks. One of the main issues was already known to Aristotle, namely that for a logical/mathematical proposition Φ,
• given a purported proof of Φ, it is not hard to check whether the argument really proves Φ, but

• in contrast, if one is asked to find a proof of Φ, the search may take a very long time (or even go on forever without success) even if Φ is true.

Aristotle used logic to reason about everything (mathematics, farming, medicine, law, etc.). A formal logical style of deductive reasoning about mathematics was introduced in Euclid's geometry [Heath, 1956]. The 1600s saw an increase in the importance of logic. Researchers like Leibniz wanted to use logic to address not just mathematical questions but also more esoteric questions like the existence of God.

In the 1800s, the need for a more precise style in mathematics arose, because controversial results had appeared in analysis [Kamareddine et al., 2004a]. Some controversies were solved by Cauchy's precise definition of convergence in his Cours d'Analyse [Cauchy, 1821], others benefited from the more exact definition of real numbers given by Dedekind [Dedekind, 1872], while at the same time Cantor was making a tremendous contribution to the formalisation of set theory and number theory [Cantor, 1895; Cantor, 1897] and Peano was making influential steps in formalised arithmetic [Peano, 1889] (albeit without an extensive treatment of logic or quantification).

In the last decades of the 1800s, the contributions of Frege made the move toward formalisation much more serious. Frege found

"... the inadequacy of language to be an obstacle; no matter how unwieldy the expressions I was ready to accept, I was less and less able, as the relations became more and more complex, to attain precision."

Based on this understanding of the need for greater precision, Frege presented the Begriffsschrift [Frege, 1879], the first formalisation of logic, giving logical concepts via symbols rather than natural language. "Begriffsschrift" is the name both of the book and of the formal system the book presents.
Frege wrote:

"[Begriffsschrift's] first purpose, therefore, is to provide us with the most reliable test of the validity of a chain of inferences and to point out every presupposition that tries to sneak in unnoticed, so that its origin can be investigated."

Later, Frege wrote Die Grundlagen der Arithmetik and Grundgesetze der Arithmetik [Frege, 1893; Frege, 1903; van Heijenoort, 1967], where he argued that mathematics is a branch of logic and described arithmetic in the Begriffsschrift. The Grundgesetze was the culmination of Frege's work on building a formal foundation for mathematics.

One of the major issues in the logical foundations of mathematics is that the naive approach of Frege's Grundgesetze (and of Cantor's earlier set theory) is inconsistent. Russell discovered a paradox in Frege's system (and also in Russell's own system) that allows proving a contradiction, from which everything can be proven, including all the false statements [Kamareddine et al., 2004a]. The need to build
logical foundations for mathematics that do not suffer from such paradoxes has led to many diverging approaches. Russell invented a form of type theory which he used in the famous Principia Mathematica [Whitehead and Russell, 1910–1913]. Others have subsequently introduced many kinds of type theories, and modern type theories are quite different from Russell's [Barendregt et al., 2013]. Brouwer introduced a different direction, that of intuitionism. Later, ideas from intuitionism and type theory were combined, and even extended to cover the power of classical logic (which Brouwer's intuitionism rejects). Zermelo followed a different direction in introducing an axiomatisation of set theory [Zermelo, 1908], later extended by Fraenkel and Skolem to form the well-known Zermelo/Fraenkel (ZF) system. In yet another direction, it is possible to use category theory as a foundation. And there are other proposed foundations, too many to discuss here.

Despite the variety of possible foundations for mathematics, in practice real mathematicians do not express their work in terms of a foundation. It seems that most modern mathematicians tend to think in terms that are compatible with ZFC (ZF extended with the Axiom of Choice), but in practice they almost never write the full formal details. And it is quite rare for mathematicians to do their thinking while regarding a type theory as the foundation, even though type theories are among the most thoroughly developed logical foundations (in particular, with well-developed computer proof software systems). Instead, mathematicians write in a kind of common mathematical language (CML) (sometimes called a mathematical vernacular), for a number of reasons:

• Mathematicians have developed conventional ways of using nouns, adjectives, verbs, sentences, and larger chunks of text to express mathematical meaning.
However, the existing logical foundations do not address the convenient use of natural language text to express mathematical meanings.

• Using a foundation requires picking one specific foundation, and any foundation commits to some number of fixed choices. Such choices include what kinds of mathematical objects to take as the primitives (e.g., sets, functions, types, categories, etc.), what kinds of logical rules to use (e.g., "natural deduction" vs. "logical deduction", whether to allow the full power of classical logic, etc.), and what kinds of syntax and semantics to allow for logical propositions (first-order vs. higher-order), etc. Having made some initial choices, further choices follow, e.g., for a set theory one must then choose the axioms (Zermelo/Fraenkel, Tarski/Grothendieck, etc.), and for a type theory the kinds of types and the typing rules (Calculus of Constructions, Martin-Löf, etc.). Fixed choices make logical foundations undesirable to use, for three reasons:

– Much of mathematics can be built on top of all of the different foundations. Hence, committing to a particular foundation would seem to unnecessarily limit the applicability of mathematical results.

– The details of how to build some mathematical concepts can vary quite a bit from foundation to foundation. Issues that cause difficulty include
how to handle "partial functions", induction, reasoning modulo equations, etc. Since these issues can be handled in all foundations, mathematicians tend to see the low-level details of these issues as inessential and uninteresting, and are not willing to write the low-level details.

– Some mathematics only works for some foundations. Hence, for a mathematician to develop the specialised expertise needed to express mathematics in terms of one particular foundation would seem to unnecessarily limit the scope of mathematics he/she could address. A mathematician is happy to be reassured by a mathematical logician that what they are doing can be expressed in some foundation, but the mathematician usually does not care to work out precisely how. Moreover, there is no universal agreement as to which is the best logical foundation.

• In practice, formalising a mathematical text in any of the existing foundations is an extremely time-consuming, costly, and mentally painful activity. Formalisation also requires special expertise in the particular foundation used that goes far beyond the ordinary expertise of even extremely good mathematicians. Furthermore, mathematical texts formalised in any of the existing foundations are generally structured in a way which is radically different from what is optimal for the human reader's understanding, and which is difficult for ordinary mathematicians to use. (Some proof software systems like Mizar, which is based on Tarski/Grothendieck set theory, attempt to reduce this problem, and partially succeed.) What is a single step in a usual human-readable mathematical text may turn into a multitude of smaller steps in a formalised version. New details completely missing from the human-readable version may need to be woven throughout the entire text. The original text may need to be reorganised and reordered so radically that it seems almost turned inside out in the formal version.
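As a concrete illustration of how low-level encodings vary across foundations (the formulas below use standard textbook notation and are not drawn from this text): even the ordered pair, seemingly a single primitive notion, is encoded in ZF-style set theory via Kuratowski's construction, whereas in a typical type theory pairing is a primitive with its own formation and projection rules:

```latex
% Ordered pairs in ZF-style set theory (Kuratowski encoding):
\[
  (a,b) \;:=\; \{\{a\},\{a,b\}\},
  \qquad (a,b)=(c,d) \iff a=c \wedge b=d .
\]
% In a typical type theory, pairing is primitive, governed by rules such as:
\[
  \frac{a : A \qquad b : B}{\langle a,b\rangle : A \times B}
  \qquad
  \pi_1\langle a,b\rangle = a, \qquad \pi_2\langle a,b\rangle = b .
\]
```

A working mathematician relies only on the characteristic property on the right of the first line; which encoding (if any) lies underneath is exactly the kind of low-level detail that, as argued above, mathematicians are unwilling to spell out.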
So, although mathematics was a driving force for the research in logic in the 19th and 20th centuries, mathematics and logic have kept a distance from each other. Practising mathematicians do not use mathematical logic and have for centuries done most mathematical work outside of the strict boundaries of formal logic.


Computerisation of Mathematical Knowledge

Our second question, of how to use mechanical computers to support mathematical knowledge, is more recent but is unavoidable, since automation and computation can provide tremendous services to mathematics. There are also extensive opportunities for combining progress in logic and computerisation not only in mathematics but also in other areas: bio-informatics, chemistry, music, etc.

Mechanical computers have been used from their beginning for mathematical purposes. Starting in the 1960s, computers began to play a role in handling not just computations, but abstract mathematical knowledge. Nowadays, computers can represent mathematical knowledge in various ways:



• Pixel map images of pages of mathematical articles may be stored on the computer. While useful, it is very difficult for computer programs to access the semantics of mathematical knowledge presented this way [Autexier et al., 2010]. Even keyword searching is hard, since OCR (Optical Character Recognition) must be performed, and high-quality OCR for mathematical texts is an area with significant research challenges rather than a proven technology (e.g., there is great difficulty with matrices [Kanahori et al., 2006]).

• Typesetting systems like LaTeX or TeXmacs [van der Hoeven, 2004] can be used with mathematical texts for editing them and formatting them for viewing or printing. The document formats of these systems can also be used for storage and archiving. Such systems provide good defaults for visual appearance and allow fine control when needed. They support commonly needed document structures and allow custom structures to be created, at least to the extent of being able to produce the correct visual appearance. Unfortunately, unless the mathematician is amazingly disciplined, the logical structure of symbolic formulas is not directly represented. Furthermore, the logical structure of mathematics as embedded in natural language text is not represented at all. This makes it difficult for computer programs to access document semantics, because fully automated discovery of the semantics of natural language text still performs too poorly to use in practical systems. Even human-assisted semi-automated semantic analysis of natural language is primitive, and we are aware of no such systems with special support for mathematical text. As a consequence, there is generally no computer support for checking the correctness of mathematics represented this way or for doing searching based on semantics (as opposed to keywords).
• Mathematical texts can be written in more semantically oriented document representations like OpenMath [Abbott et al., 1996], OMDoc [Kohlhase, 2006], Content MathML [W3C, 2003], etc. There is generally support for converting from these representations to typesetting systems like LATEX or Presentation MathML in order to produce readable/printable versions of the mathematical text. These systems are 1) better than the typesetting systems at representing the knowledge in a computer-accessible way, and 2) able to represent some aspects of the semantics of symbolic formulas.

• There are software systems like proof assistants (also called proof checkers; these include Coq [Team, 1999–2003], Isabelle [Nipkow et al., 2002], NuPrL [Constable and others, 1986], Mizar [Rudnicki, 1992], HOL [Gordon and Melham, 1993], etc.) and automated theorem provers (Boyer-Moore, Otter, etc.), which we collectively call proof systems. Each proof system provides a formal language (based on some foundation of logic and mathematics) for writing/mechanically checking logic, mathematics, and computer software. Work on computer support for formal foundations began in the late 1960s with work by de Bruijn on Automath (AUTOmating MATHematics) [Nederpelt et al., 1994]. Automath supported automated checking of the full correctness of a mathematical text written in Automath’s formal language. Generally, most proof systems support checking full correctness, and it is possible in theory (although not easy) for computer programs to access and manipulate the semantics of the mathematical statements. Closely related to proof systems, we find proof development/planning systems (e.g., Ωmega [Siekmann et al., 2002; Siekmann et al., 2003] and λClam [Bundy et al., 1990]), which are mathematical assistant tools that support proof development in mathematical domains at a user-friendly level of abstraction. An additional advantage of these systems is that they focus on proof planning and hence can provide different styles of proof development.

Unfortunately, there are great disadvantages in using proof systems. First, all of the problems mentioned for logical foundations in section 1a are incurred, e.g., the enormous expense of formalisation. Furthermore, one must choose a specific proof system (Isabelle, Coq, Mizar, PVS, etc.), and each software system has its own advantages and pitfalls and takes quite some time to learn. In practice, some of these systems are only ever learnt from a “master” in an “apprenticeship” setting. Most proof systems have no meaningful support for the mathematical use of natural language text. A notable exception is Mizar, which however requires the use of natural language in a rigid and somewhat inflexible way. Most proof systems suffer from the use of proof tactics, which make it easier to construct proofs and make proofs smaller, but obscure the reasoning for readers because the meaning of each tactic is often ad hoc and implementation-dependent.
As a result of these and other disadvantages, ordinary mathematicians do not generally read mathematics written in the language of a proof system, and are usually not willing to spend the effort to formalise their own work in a proof system.

• Computer algebra systems (CAS: e.g., Maxima, Maple, Mathematica, etc.) are widely used software environments designed for carrying out computations, primarily symbolic but sometimes also numeric. Each CAS has a language for writing mathematical expressions and statements and for describing computations. These languages can also be used for representing mathematical knowledge. The main advantage of such a language is integration with a CAS. Typically, a CAS language is not tied to any specific foundation and has little or no support for guaranteeing correctness of mathematical statements. A CAS language also typically has little or no support for embedded natural language text, or for precise control over typesetting. So a CAS is often used for calculating results, but these results are usually converted into some other language or format for dissemination or verification. Nonetheless, there are useful possibilities for using a CAS for archiving and communicating mathematical knowledge.

It is important to build a bridge between more than one of the above categories of ways of representing mathematical knowledge, and to make easier (without requiring) the partial or full formalisation of mathematical texts in some foundation. In this paper, we discuss two approaches aimed at achieving this: Barendregt’s approach [Barendregt, 2003] towards an interactive mathematical proof mode and Kamareddine and Wells’ MathLang approach [Kamareddine and Wells, 2008] towards the gradual computerisation of mathematics.


Mathematical assistants are workstations running a program that verifies the correctness of mathematical theorems when provided with enough evidence. Systems for automated deduction require less evidence or even none at all; proof-checkers, on the other hand, require a fully formalised proof. In the pioneering systems Automath (of N.G. de Bruijn, based on dependent type theory) and Mizar (of Andrzej Trybulec, based on set theory), proofs had to be supplied complete and fully worked out. On the other hand, for systems like NuPrl, Isabelle, and Coq, the proofs are obtained in an interactive fashion between the user and the proof-checker. One therefore speaks of an interactive mathematical assistant. The list of statements that have to be given to such a checker, the proof-script, is usually not mathematical in nature; see e.g. table 8. The problem is that the script consists of fine-grained steps of what should be done, devoid of any mathematical meaning. Mizar is the only system having a substantial library of certified results in which the proof-script is mathematical in nature. Freek Wiedijk [Wiedijk, 2006] speaks of the declarative style of Mizar. In [de Bruijn, 1987] a plea was made for using a mathematical vernacular for formalising proofs. This paper discusses two approaches influenced by de Bruijn’s mathematical vernacular: MathLang [Kamareddine and Wells, 2008] and MPL (Mathematical Proof Language) [Barendregt, 2003]. These approaches aim to develop a framework for computerising mathematical texts which is flexible enough to connect the different approaches to computerisation, which allows various degrees of formalisation, and which is compatible with different logical frameworks (e.g., set theory, category theory, type theory, etc.) and proof systems. Both approaches aim to bridge informal mathematics and formalised mathematics via automatic translations into the formalised language of interactive proof-assistants.
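The contrast between the two styles of proof script can be seen in miniature below. This is our own illustrative example in Lean syntax (not from the paper, whose case studies use Coq and Mizar); the procedural script lists steps to execute, while the declarative proof names the mathematical fact that justifies the statement.

```lean
-- Procedural style: a script of tactic steps, whose mathematical
-- content is not visible in the text itself.
example (a b : Nat) : a + b = b + a := by
  rw [Nat.add_comm]

-- Declarative style: the proof term names the fact being used.
example (a b : Nat) : a + b = b + a := Nat.add_comm a b
```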
In particular:

• MPL aims to provide an interactive script language for an interactive proof assistant like Coq that is declarative and hence mathematical in flavour (a similar approach to MPL is found in Isar, with an implemented system).

• MathLang is embodied in a computer representation and associated software tools, and its progress and design are driven by the need for computerising representative mathematical texts from various branches of mathematics.

At this stage, MPL remains a script language and has no associated software tools. MathLang, on the other hand, supports entry of original mathematical texts either



in an XML format or using the TEXMACS editor, and these texts are manipulated by a number of MathLang software tools. These tools provide methods for adding, checking, and displaying various information aspects. One aspect is a kind of weak type system that assigns categories (term, statement, noun (class), adjective (class modifier), etc.) to parts of the text, deals with binding names to meanings, and checks that a kind of grammatical sense is maintained. Another aspect allows weaving together mathematical meaning and visual presentation and can associate natural language text with its mathematical meaning. Another aspect allows identifying chunks of text, marking their roles (theorem, definition, explanation, example, section, etc.), and indicating relationships between the chunks (A uses B, A contradicts B, A follows from B, etc.). Software tool support can use this aspect to check and explain the overall logical structure of a text. Further aspects are being designed to allow adding additional formality to a text, such as proof structure and details of how a human-readable proof is encoded into a fully formalised version (previously [Kamareddine et al., 2007b; Lamar, 2011] we used Mizar and Isabelle, but here, for the first time, we develop the MathLang formalisation into Coq). [Kamareddine and Wells, 2008] surveyed the status of the MathLang project up to November 2007. This paper picks up from that survey, fills in a number of formalisation and implementation gaps, and creates a formalisation path via MathLang into Coq. We show for the first time how the DRa information can be used to automatically generate proof skeletons for different theorem provers, and we formalise and implement the textual order of a text and explain how it can be derived from the original text.
Our proposed generic algorithm (for generating the proof skeleton, which depends on the original mathematical text and the desired theorem prover) is highly configurable and caters for arbitrary theorem provers. This generic algorithm, as well as all the new algorithms and concepts we present here, is implemented in our software tool. We give hints for the development of an algorithm which is able to convert parts of a CGa-annotated text automatically into the syntax of a particular theorem prover. To test our approaches, we specify using MPL a feasible interactive mathematical proof development for Newman’s Lemma, and we create the complete path of encoding in and formalising through MathLang for the first chapter of Landau’s book “Grundlagen der Analysis”. For Newman’s Lemma in MPL, we show that the declarative interactive mathematical mode is more pleasant than the operational mode of Coq. For Landau’s chapter in MathLang, we show that the entire path from the informal text to the fully formalised Coq text is much easier to construct and comprehend in MathLang than in Coq. For this, we show how the plain text document of Landau’s chapter can be easily annotated with categories and mathematical roles and how Coq and Mizar proof skeletons can be automatically generated for the chapter. We then use hints to convert parts of the annotated text of Landau’s first chapter into Coq. Both the Coq proof skeleton and the parts converted into Coq simplified the process of the full formalisation of the first chapter of Landau’s book in Coq.



Although in this paper we only illustrate MPL and MathLang for Coq, the proposed approaches should work equally well for other proof systems (indeed, we have previously illustrated MathLang for Mizar and Isabelle [Kamareddine et al., 2007a; Kamareddine et al., 2007b]).


Sections 1a and 1b described issues with the practice of mathematics: the difficulty for the normal mathematician in directly using a formal foundation, and the disadvantages of the various computer representations of mathematics. To address these issues, we set out to develop two new mathematical languages, so that texts written in CML (the common mathematical language, expressed either with pen and paper or LATEX) are written instead in a way that satisfies these goals:

1. A MathLang/MPL text should support the usual features of CML: natural language text, symbolic formulas, images, document structures, control over visual presentation, etc. And the usual computer support for editing such texts should be available.

2. It should be possible to write a MathLang/MPL text in a way that is significantly less ambiguous than the corresponding CML text. A MathLang/MPL text should somehow support representing the text’s mathematical semantics and structure. The support for semantics should cover not just individual pieces of text and symbolic formulas but also the entire document and the document’s relationship to other documents (to allow building connected libraries). The degree of formality in representing the mathematical semantics should be flexible, and at least one choice of degree of formality should be both inexpensive and useful. There should be some automated checking of the well-formedness of the mathematical semantics.

3. The structure of a MathLang/MPL text should follow the structure of the corresponding CML, so that the experience of reading and writing MathLang/MPL should be close to that of reading and writing CML. This should make it easier for an author to see and have confidence that a MathLang/MPL text correctly represents their intentions.
Thus, if any foundational formal systems are used in MathLang/MPL, then the latter should somehow adapt the formal systems to the needs of the authors and readers, rather than requiring the authors and readers to adapt their thinking to fit the rigid confines of any existing foundations.

4. The structure of a MathLang/MPL text should make it easier to support further post-authorship computer manipulations that respect its mathematical structure and meaning. Examples include semantics-based searches, computations via computer algebra systems, extraction of proof sketches (to be completed into a full formalisation in a proof system), etc.



5. A particularly important case of the previous point is that MathLang/MPL should support (but not require) interfacing with proof systems, so that a MathLang/MPL text can contain full formal details in some foundation and the formalisation can be automatically verified.

6. Authoring of a MathLang/MPL text should not be significantly harder for the ordinary mathematician than authoring LATEX. Features that the author does not want (such as formalisation in a proof system) should not require any extra effort from an author.

7. The design of MathLang/MPL should be compatible with (as yet undetermined) future extensions to support additional uses of mathematical knowledge. Also, the design of MathLang/MPL should make it easy to combine with existing languages (e.g., OMDoc, TEXMACS). This way, MathLang/MPL might end up being a method for extending an existing language in addition to (or instead of) a language on its own.

None of the previously existing representations for mathematical texts satisfies our goals, so we have been developing new techniques. In this paper we discuss where we are with both MathLang and MPL. MathLang/MPL are intended to support different degrees of formalisation. Furthermore, for those documents where full formalisation is a goal, we intend to allow this to be accomplished in gradual steps. Some of the motivations for varying degrees of formalisation have already been discussed in sections 1a and 1b. Full formalisation is sometimes desirable, but is also often undesirable due to its expense and the requirement to commit to many inessential foundational details. Partial formalisation can be desirable for various reasons; for example, it has the potential to be helpful with automated checking, semantics-based searching and querying, and interfacing with computer algebra systems (and other mathematical computation environments). In both our languages, MathLang and MPL, partial formalisation can be carried out to different degrees.
For example:

• The abstract syntax trees of symbolic formulas can be represented accurately. This is usually missing when using systems like LATEX or Presentation MathML, while more semantically oriented systems provide this to some degree. This can provide editing support for algebraic rearrangements and simplifications, and can help interfacing with computer algebra systems.

• The mathematical structure of natural language text can be represented in a way similar to how symbolic formulas are handled. Furthermore, mixed text and symbols can be handled. This can help in the same way as capturing the structure of symbolic formulas can help. Nearly all previous systems do not support handling natural language text in this way.

• A weak type system can be used to check simple grammatical conditions without checking full semantic sensibility.



• Justifications (inside proofs and between formal statements) can be linked (without necessarily always indicating precisely how they are used). Some examples of potential uses of this feature include the following:

– Extracting only those parts of a document that are relevant to specific results. (This could be useful in educational systems.)

– Checking that each instance of apparently circular reasoning is actually handled via induction.

– Calculating proof gaps as a first step toward fuller formalisation.

• If one commits to a foundation (or in some cases, to a family of foundations), one can start to use more sophisticated type systems in formulas and statements for checking more aspects of well-formedness.

• And there are further possibilities.
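As a sketch of the second use above: the linked justifications form a directed graph of “A uses B” edges, on which detecting apparent circularity is ordinary cycle detection. The encoding and chunk names below are hypothetical, for illustration only.

```python
# Sketch: detect apparently circular reasoning in "A uses B" justification
# links via depth-first search. The chunk names are hypothetical.

def find_cycle(uses):
    """uses maps each chunk to the chunks it cites; return one cycle
    (as a list of chunks, first element repeated at the end) or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in uses}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for dep in uses.get(node, ()):
            if colour.get(dep, WHITE) == GREY:      # back edge: a cycle
                return path[path.index(dep):] + [dep]
            if colour.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        colour[node] = BLACK
        path.pop()
        return None

    for node in list(uses):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None

# "lemma1 uses lemma2" and vice versa: apparently circular, so the text
# must justify it via induction (or be rejected).
links = {"thm": ["lemma1"], "lemma1": ["lemma2"], "lemma2": ["lemma1"]}
print(find_cycle(links))
```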


The design of MathLang is gradually being refined based on experience testing the use of MathLang for representative mathematical texts. Throughout the development, the design is tested by evaluating encodings of real mathematical texts, during which issues and difficulties are encountered, which lead to new needs being discovered and corresponding design adjustments. The design includes formal rules for the representation of mathematical texts, as well as patterns and methodology for entering texts in this representation, and supporting software. The choice of mathematical texts for testing is primarily oriented toward texts that represent the variety of mathematical writing by ordinary mathematicians rather than texts that represent the interests of formalists and mathematical logicians. Much of the testing has been with pre-existing texts. In some cases, texts that have previously been formalised by others were chosen in order to compare representations, e.g., A Compendium of Continuous Lattices [Gierz et al., 1980] of which at least 60% has been formalised in Mizar [Rudnicki, 1992], and Landau’s Foundations of Analysis [Landau, 1951] which was fully formalised in Automath [van Benthem Jutting, 1977a]. In other cases, texts of historical value which are known to have errors were chosen to ensure that MathLang’s design will not exclude them, e.g., Euclid’s Elements [Heath, 1956]. Other texts were chosen to exercise other aspects of MathLang. Authoring new texts has also been tested. In addition to the design of MathLang itself, there has been work on relating a MathLang text to a fully formalised version of the text. Using the information in the CGa and DRa aspects of a MathLang text, [Kamareddine et al., 2007b; Retel, 2009] developed a procedure for producing a corresponding Mizar document, first as a proof sketch with holes and then as a fully completed proof. [Lamar, 2011] attempted to follow suit with Isabelle. 
In this paper, we make further progress in completing the path in MathLang in order to reach full formalisation and we introduce a third theorem prover (Coq) as a test bed for MathLang (in addition to Mizar and Isabelle). We develop the proof skeleton idea presented



earlier in [Kamareddine et al., 2007b] specifically for Mizar, into an automatically generated proof skeleton in a choice of theorem provers (including Mizar, Isar and Coq). To achieve this, we give a generic algorithm for proof skeleton generation which takes the required prover as one of its arguments. We also give hints for the development of a generic algorithm which automatically converts parts of a CGa-annotated text into the syntax of the theorem prover it is given as an argument. Figure 1 (adapted from [Kamareddine et al., 2007b]) diagrams the overall current situation of work on MathLang. In the rest of this paper, we discuss the aspects CGa, TSa, and DRa in more detail, introduce the generic automatic proof skeleton generator, and show how parts of the CGa-annotated text can be formalised into a theorem prover. We also discuss interfacing MathLang with Coq.

Figure 1. Overall situation of work in MathLang
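The shape of such a prover-parametric skeleton generator can be sketched as follows. This is a toy illustration under our own assumptions (the chunk encoding, role names, and per-prover templates are all hypothetical simplifications), not the algorithm implemented in the MathLang tools.

```python
# Illustrative sketch of a generic, prover-parametric proof-skeleton
# generator. The DRa-like input (chunks with roles and names) and the
# per-prover templates are hypothetical simplifications.

TEMPLATES = {
    "coq":   {"theorem": "Theorem {name} : (* statement *).",
              "open":    "Proof.",
              "hole":    "  admit.",
              "close":   "Admitted."},
    "mizar": {"theorem": "theorem {name}: :: statement",
              "open":    "proof",
              "hole":    "  thus thesis; :: gap to be filled",
              "close":   "end;"},
}

def proof_skeleton(chunks, prover):
    """Emit a skeleton, in the syntax of the requested prover, for every
    theorem-like chunk of the annotated text."""
    t = TEMPLATES[prover]
    lines = []
    for chunk in chunks:
        if chunk["role"] in ("theorem", "lemma"):
            lines.append(t["theorem"].format(name=chunk["name"]))
            lines.append(t["open"])
            lines.append(t["hole"])
            lines.append(t["close"])
    return "\n".join(lines)

chunks = [{"role": "theorem", "name": "satz1"},
          {"role": "definition", "name": "def1"}]
print(proof_skeleton(chunks, "coq"))
```

Adding a new target prover then amounts to supplying another template entry, which is the sense in which such an algorithm can cater for arbitrary theorem provers.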


The Core Grammatical aspect (CGa)

The Core Grammatical aspect (CGa) [Kamareddine et al., 2004c; Kamareddine et al., 2006; Maarek, 2007] is based on the Weak Type Theory (WTT) of Nederpelt [Nederpelt, 2002], whose metatheory was established by Kamareddine [Kamareddine and Nederpelt, 2004]. WTT in turn was heavily inspired by the Mathematical Vernacular (MV) [de Bruijn, 1987]. In WTT, a document is a book which is a sequence of lines, each of which is a pair of a sentence (a statement or a definition) and a context of facts (declarations or statements) assumed in the sentence. WTT has four ways of introducing names. A definition introduces a name whose scope is the rest of the book and associates the name with its meaning. A name introduced by a definition can have parameters whose scope is the body of the definition. A declaration in a context introduces a name (with no parameters) whose scope is only the current line. Finally, a preface gives names whose scope is the document; names introduced by prefaces have parameters, but unlike definitions their meanings are not provided (and thus presumed to be given externally to the document). Declarations, definitions, and statements can contain phrases which are built from terms, sets, nouns, and adjectives. Using the terminology of object-oriented programming languages, nouns act like classes and adjectives act like mixins (a special kind of function from classes to classes). WTT uses a weak type system with types like noun, set, term, adjective, statement, definition, context, and book to check basic well-formedness. Sets are used when something is definitely known to be a set and the richer structure of a noun is not needed, and terms are used for things that are not sets (and sometimes for sets in cases where the type system is too weak).

Although WTT provides many useful ideas, the definition of WTT has many limitations. The many different ways of introducing names are too complicated and awkward. WTT provides no way to indicate which statements are used to justify other statements and in general does not deal with proofs and logical correctness. WTT provides no ways to present the structure of a text to human readers; there is no way of grouping statements and identifying their mathematical/discourse roles such as theorem, lemma, conjecture, proof, section, chapter. WTT provides no way to give human names to statements (e.g., “Newman’s Lemma”). WTT provides no way to use in one document concepts defined in another document. The Core Grammatical aspect (CGa) was shaped by repeated experiences of annotating mathematical texts. CGa simplifies difficult aspects of WTT, and enhances the nouns and adjectives of WTT with ideas from object-oriented programming so that nouns are more like classes and adjectives are more like mixins.
In CGa, the different kinds of name-introducing forms of WTT are unified; all definitions by default have indefinite forward scope and a local scope operator allows local definitions. The basic constructs of CGa are the step and the expression. The tasks handled in WTT by books, prefaces, lines, declarations, definitions, and statements are all represented as steps in CGa. A step can be a block {s1 , . . . , sn }, which is merely a sequence of steps. A step can be a local scoping s1 ⊲ s2 , which is a pair of steps s1 and s2 where the definitions and declarations of s1 are restricted in scope to s2 and the assertions of s1 are assumptions of s2 . A step can also be a definition, a declaration, or an expression (which asserts a truth). Expressions are also used for the bodies of definitions and inside the types in declarations. The possibilities for expressions include uses of defined identifiers, identifier declarations, and noun descriptions. A noun description allows specifying characteristics of a class of entities. For example, {M : set; y : natural number; x : natural number; ∈(x, M )}⊲=(+(x, y), +(y, x)) is an encoding of this (silly) CML text: “Given that M is a set, y and x are natural numbers, and x belongs to M, it holds that x + y = y + x.” This example assumes that earlier in the document there are declarations like: . . . ; ∈(term, set) : stat; =(term, term) : stat; natural number : noun; + (natural number, natural number) : natural number; . . .



Here, M, y, x, ∈, =, and + are identifiers (our current implementation only allows ASCII characters in identifiers, but we plan to support any graphic Unicode characters), while term, set, stat, and noun are keywords of CGa. The semicolon, colon, comma, parentheses, braces, and right triangle (⊲) symbols are part of the syntax of CGa. Statements like ∈(term, set) : stat are declarations; this example declares ∈ to be an operator that takes two arguments, one of type term and one of type set, and yields a result of type stat (statement). The statement M : set is an abbreviation for M() : set, which declares the identifier M to have zero parameters. CGa uses grammatical/linguistic/syntactic categories (also called types) to make explicit the grammatical role played by the elements of a mathematical text. In the above example, we see the category expressions term, set, stat, noun, and natural number. In fact, the category expression natural number acts as an abbreviation for term(natural number), and term, set, and noun are abbreviations for term(Noun {}), set(Noun {}), and noun(Noun {}), which all use the uncharacterised noun description Noun {}. A noun description is of the form Noun s and describes a class of entities with characteristics (declared operations and true facts) defined by the step s. The arguments of the category constructors term, set, and noun are expressions which evaluate to noun descriptions. The category term(e) describes individual entities belonging to the class described by the noun expression e, and the category set(e) describes any set of such entities. The category noun(e) describes any noun which defines all the operations described by e with the same types.

So in the above example, the abbreviation term is the type of all mathematical entities, the abbreviation set is the type of any set, noun is the type of any noun (and specifies no characteristics for it), and natural number is the type of any mathematical entity having the characteristics described by the noun natural number. (CGa has other mechanisms that allow specifying additional characteristics of the noun natural number separate from its declaration, and we assume in this example that this is done.) The behaviour of nouns in CGa is similar to that of classes in object-oriented programming languages. CGa also has adjectives which are like object-oriented mixins and act as functions from nouns to nouns. These linguistic levels and syntactic elements are summarised in the following definitions.

DEFINITION 1 (Linguistic levels). The syntax of CGa is based on a hierarchy of the five different linguistic levels given below. Elements from I and C are part of E, expressions are part of the phrases of P, and steps S are built from phrases.
1. Identifier level I
2. Category level C
3. Expression level E
4. Phrase level P
5. Step level S

DEFINITION 2 (Syntactic elements). The syntactic elements at each level are:
1. At identifier level: term identifiers I_T, set identifiers I_S, noun identifiers I_N, adjective identifiers I_A and statement identifiers I_P.
2. At category level: term categories T, set categories S, noun categories N, adjective categories A, statement categories P and declaration categories D.



3. At expression level: declaration expressions DEC, instantiation expressions INST, description expressions DSC, refinement expressions REF and the self expression SEL.
4. At phrase level: sub refinement phrases SUB, definition phrases DEF, declaration expressions DEC and statement expressions P.
5. At step level: local scoping steps LOC and block steps BLO; each phrase of P is a basic step.

For details of the rules of CGa see [Kamareddine et al., 2006; Maarek, 2007]. Here, it is crucial to mention that each of the following CGa categories carries its own colour coding: term, set, noun, adjective, statement, definition, declaration, step, context. The types of CGa are more sophisticated than the weak types of WTT and allow tracking which operations are meaningful in some additional cases.

Although CGa’s types are more powerful than WTT’s, there are still significant limitations. One limitation is that higher-order types are not allowed. For example, although CGa allows the type (term, term) → term, which is the type of an operator that takes two arguments of type term and returns a result of type term, CGa does not allow using the type ((term) → term, term) → term, which would be the type of an operator that takes another operator as its first argument. Higher-order types can be awkwardly and crudely emulated in CGa by encapsulation with noun types, but this emulation does not work well because CGa’s type polymorphism is shallow, which is another significant limitation. To work around the weakness of CGa’s type polymorphism, in practice we often find ourselves giving entities the type term instead of a more precise type. We continue to work on making the type system more flexible without making it too complex. It is important to understand that the goal of CGa’s type system is not to ensure full correctness, but merely to check whether the reasoning parts of a document are coherently built in a sensible way.
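The flavour of this checking can be conveyed by a toy first-order checker. This is our own drastic simplification for illustration (ASCII operator names, a flat term type in place of natural number), not CGa's actual rules.

```python
# Toy checker in the spirit of CGa's weak type system: declarations
# assign each identifier an argument-type list and a result type, and
# applications are checked against them. A drastic simplification of
# CGa, for illustration only.

DECLS = {
    "in":  (["term", "set"], "stat"),   # like the declaration of the membership operator
    "eq":  (["term", "term"], "stat"),  # like the declaration of equality
    "add": (["term", "term"], "term"),  # addition, with "natural number" flattened to term
    "x":   ([], "term"),
    "y":   ([], "term"),
    "M":   ([], "set"),
}

def check(expr):
    """expr is ("identifier", arg, ...); return its category
    or raise TypeError if the application is ill-formed."""
    op, *args = expr
    arg_types, result = DECLS[op]
    if len(args) != len(arg_types):
        raise TypeError(f"{op}: expected {len(arg_types)} arguments")
    for arg, expected in zip(args, arg_types):
        actual = check(arg)
        if actual != expected:
            raise TypeError(f"{op}: expected {expected}, got {actual}")
    return result

# The membership statement from the example is grammatically well formed;
# applying membership to two sets is not (a set where a term is expected).
print(check(("in", ("x",), ("M",))))   # stat
```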
CGa provides a kind of grammar for well-formed mathematics with grammatical categories and allows checking for basic well-formedness conditions (e.g., the origin of all names/symbols can be tracked). The design of CGa is due to Kamareddine, Maarek and Wells [Kamareddine et al., 2006]. The implementation of CGa is due to Maarek [Maarek, 2007].


The Text and Symbol aspect (TSa)

The Text and Symbol aspect (TSa) [Kamareddine et al., 2004b; Kamareddine et al., 2007a; Maarek, 2007; Lamar, 2011] allows integrating normal typesetting and authoring software with the mathematical structure represented with CGa. TSa allows weaving together usual mathematical authoring representations such as LATEX, XML, or TEXMACS with CGa data. Thanks to a notion of souring rules (called “souring” because it does the opposite of syntactic sugar), TSa allows the structure of the mathematical text to follow that of the CML text as conceived by the mathematician. TSa allows interleaving pieces of CGa with pieces of CML in the form of mixtures of natural language, symbolic formulas, and formatting instructions for visual presentation. The interleaving can be at any level of granularity: meanings can be associated at a coarse grain with entire paragraphs or



sections, or at a fine grain with individual words, phrases, and symbols. Arbitrary amounts of mathematically uninterpreted text can be included.

Figure 2. Example of CGa encoding of CML text: the sentence “There is an element 0 in R such that a + 0 = a” is encoded as ∃(0 : R, =(+(a, 0), a)).

The TSa representation is inspired by the XQuery/XPath Data Model (XDM) [W3C, 2007] used for representing the information content of XML documents. In TSa, a document d is built from the empty document ([ ]) by sequencing (d1, d2) and labelling (ℓ⟨d⟩). For example, the CML text and its CGa representation given in figure 2 could be represented in TSa by the following fine-grained interleaving of CGa and LATEX (the representation shown here omits type/category annotations that we usually include with the CGa identifiers used in the TSa representation):

“There is #1 such that #2.”⟨∃⟨“#1 in #2”⟨:⟨“an element $0$”⟨0⟩, “$R$”⟨R⟩⟩⟩, “$#1 = #2$”⟨=⟨“#1 + #2”⟨+⟨“a”⟨a⟩, “0”⟨0⟩⟩⟩, “a”⟨a⟩⟩⟩⟩⟩

This example (see [Kamareddine et al., 2007a]) uses the abbreviation that ℓ stands for ℓ⟨[ ]⟩. For example, “a”⟨a⟩ actually stands for “a”⟨a⟨[ ]⟩⟩. Associated with TSa are methods for extracting separately the CGa and the typesetting instructions or other visual representation. E.g., from the TSa above can be extracted the following TSa representation of just the CGa portion:

∃⟨:⟨0, R⟩, =⟨+⟨a, 0⟩, a⟩⟩

The CGa portion of this text can be type checked and used for processing that needs to know the mathematical meaning of the text. Similarly, the following pieces of LATEX can also be extracted:

“There is #1 such that #2.”⟨“#1 in #2”⟨“an element $0$”, “$R$”⟩, “$#1 = #2$”⟨“#1 + #2”⟨“a”, “0”⟩, “a”⟩⟩

This tree of LATEX typesetting instructions can be further flattened for actual processing by LATEX into a string such as: “There is an element $0$ in $R$ such that $a + 0 = a$.” The idea of the TSa representation is independent of the visual formatting language used. Although we use LATEX in our example here, in our implementations so far we have used the TEXMACS internal representation and also XML.
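The labelled-tree model and the two extractions can be sketched as follows. The nested-tuple encoding and the label handling are our own simplification of TSa (which, for instance, has souring labels as well); the example is the figure 2 sentence.

```python
# Sketch of TSa interleaving: each node carries a typesetting template
# (with #1, #2 holes filled by its children) and a CGa label. This
# encoding is a simplification of TSa's labelled trees, for illustration.

def tsa(tex, cga, *children):
    return (tex, cga, list(children))

def flatten_tex(node):
    """Flatten to the plain typeset string, as TSa's extraction does."""
    tex, _cga, children = node
    for i, child in enumerate(children, start=1):
        tex = tex.replace(f"#{i}", flatten_tex(child))
    return tex

def extract_cga(node):
    """Extract just the CGa portion as a nested (label, args) tree."""
    _tex, cga, children = node
    return (cga, [extract_cga(c) for c in children])

doc = tsa("There is #1 such that #2.", "exists",
          tsa("#1 in #2", "decl",
              tsa("an element $0$", "0"),
              tsa("$R$", "R")),
          tsa("$#1 = #2$", "=",
              tsa("#1 + #2", "+", tsa("a", "a"), tsa("0", "0")),
              tsa("a", "a")))

print(flatten_tex(doc))
# There is an element $0$ in $R$ such that $a + 0 = a$.
```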
As part of using TSa to interleave CGa and more traditional natural language and typesetting information, we needed to develop techniques for handling certain challenging CML formations where the mathematical structure and the CML representation do not nicely match. For example, in the text 0 + a0 = a0 = a(0 + 0) =

[Footnote 6: The representation shown here omits type/category annotations that we usually include with the CGa identifiers used in the TSa representation.]

Computerising Mathematical Text

Figure 3. Example of using souring in TSa to support sharing

a0 + a0, the terms a0 and a(0 + 0) are each shared between two equations. Most formal representations would require either duplicating these shared terms, as in 0 + a0 = a0 ∧ a0 = a(0 + 0) ∧ a(0 + 0) = a0 + a0, or explicitly abstracting the shared terms. To allow the TSa representation to be as close to CML as possible, we instead solve this by using “souring” annotations in the TSa representation [Kamareddine et al., 2007a]. These annotations are a third kind of node label used in TSa, in addition to the CGa and formatting labels. Souring annotations are used to extract the correct mathematical meaning and the nice visual presentation in the CML style. For the above example, see figure 3. We have developed more sophisticated annotations that can handle more complicated cases of sharing of terms between equations. Souring annotations have also been developed to support several other common CML formulations. Support for folding and mapping over lists allows using forms like ∀a, b, c ∈ S.P as shorthand for ∀a ∈ S.∀b ∈ S.∀c ∈ S.P and {a, b, c} as shorthand for {a}∪({b}∪({c}∪∅)). We have not yet developed folding that is sophisticated enough to handle ellipsis (. . .) as in CML formulations like the next example (from [Sexton and Sorge, 2006]):

f[x, . . . , x] = f⁽ⁿ⁾(x)/n!

where the argument list x, . . . , x (marked with an underbrace in the original) contains n + 1 arguments.
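The folding shorthands above can be desugared mechanically, one variable or element at a time. The following Python sketch is our own illustration of that expansion (the function names are ours, not MathLang's):

```python
def expand_forall(vars_, set_name, body):
    """Desugar the CML shorthand ∀a, b, c ∈ S.P into nested quantifiers,
    folding over the variable list from right to left."""
    out = body
    for v in reversed(vars_):
        out = f"∀{v} ∈ {set_name}.{out}"
    return out

def expand_finite_set(elems):
    """Desugar {a, b, c} into the right-nested unions {a}∪({b}∪({c}∪∅))."""
    out = "∅"
    for e in reversed(elems):
        out = f"{{{e}}}∪{out}" if out == "∅" else f"{{{e}}}∪({out})"
    return out

print(expand_forall(["a", "b", "c"], "S", "P"))  # ∀a ∈ S.∀b ∈ S.∀c ∈ S.P
print(expand_finite_set(["a", "b", "c"]))        # {a}∪({b}∪({c}∪∅))
```

The ellipsis case discussed above is harder precisely because the number of repetitions (n + 1) is symbolic rather than a concrete list, so no such fold applies directly.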

We have implemented a user interface as an extension of the TeXmacs editor for entering the TSa MathLang representation. The author can use mouse and keyboard commands to annotate CML text entered in TeXmacs with boxes representing the CGa grammatical categories, in order to assign CGa identifiers and explicitly indicate mathematical meanings. The user interface can display a pure CML view which hides the TSa and CGa information, a pure CGa view, or various combined views including a view like that of figure 2. This interface allows adding souring annotations like those of figure 3. We plan to develop techniques for not just pairing a single CML presentation with its CGa meaning, but also allowing multiple parallel visual presentations such as multiple natural languages (not just English), both natural language and symbolic formulas, and presentations in different symbolic notations. We also plan to develop better software support to aid in semi-automatically converting existing CML texts into MathLang via TSa and CGa.

The design of TSa is due to Kamareddine, Maarek, and Wells with contributions by Lamar to the souring rules [Kamareddine et al., 2007a; Maarek, 2007; Lamar, 2011]. The implementation is primarily by Maarek [Maarek, 2007] with contributions from Lamar [Lamar, 2011].




The Document Rhetorical aspect (DRa)

The Document Rhetorical aspect (DRa) [Kamareddine et al., 2007c; Retel, 2009; Zengler, 2008] supports identifying portions of a text and expressing the relationships between them. Any portion of text (e.g., phrase, step, block, etc.) can be given an identity. Many kinds of relationships can be expressed between identified pieces of text. E.g., a chunk of text can be identified as a “theorem”, and another can be identified as the “proof” of that theorem. Similarly, one chunk of text can be a “subsection” or “chapter” of another. Given these relationships, it becomes possible to do computations to check whether all dependencies are identified, to check whether the relationships are sensible or problematic (and whether the author should therefore be warned), and to extract and explain the logical structure of a text. Dependencies identified this way have been used in generating formal proof sketches and identifying the proof holes that remain to be filled. This paper presents further formalisation and implementation of notions related to DRa. DRa is a system for attaching annotations to mathematical documents that indicate the roles played by different parts of a document. DRa assumes that the underlying mathematical representation (which can be the MathLang aspects CGa or TSa) has some mechanism for identifying document parts. Some DRa annotations are unary predicates on parts; these include annotations indicating ordinary document sectioning roles such as part, chapter, section, etc. (like the sectioning supported by LaTeX, OMDoc, DocBook, etc.) and others indicating special mathematical roles such as theorem, lemma, proof, etc. Other DRa annotations are binary predicates on parts; these include such relationships between parts as “justifies”, “uses”, “inconsistent with”, and “example of”.
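Since DRa annotations are plain subject-predicate-object triples, they can be stored and queried with very little machinery. A minimal sketch in Python (the sample triples are taken from figure 5 later in this section; the helper names are ours):

```python
# DRa annotations as subject-predicate-object triples (RDF-style).
triples = [
    ("A", "hasMathematicalRhetoricalRole", "lemma"),
    ("B", "hasMathematicalRhetoricalRole", "proof"),
    ("B", "justifies", "A"),
    ("D", "uses", "A"),
]

def objects(subj, pred):
    """All objects o with (subj, pred, o) in the store."""
    return [o for s, p, o in triples if s == subj and p == pred]

def subjects(pred, obj):
    """All subjects s with (s, pred, obj) in the store."""
    return [s for s, p, o in triples if p == pred and o == obj]

print(subjects("justifies", "A"))  # ['B']  (the proof of lemma A)
print(objects("D", "uses"))        # ['A']
```

In the actual system such queries run over an RDF store rather than a Python list, but the triple shape is the same.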
Regarding the annotation of justifications, remember that a CML text is usually incomplete: a mathematical thought process makes jumps from one interesting point to the next, skipping over details. This does not mean that many mistakes occur: the skipped details are usually so obvious to the mathematician that a couple of words are enough (e.g., “apply theorem 35”). The mathematician knows that too many details hinder concentration. To allow MathLang text to be close to the CML text, DRa allows informal justifications, which can be seen as hints about which statements would be used in the proof of another statement. Figure 4 gives an example (taken from [Kamareddine et al., 2007b] and implemented by Retel [Retel, 2009]) where the mathematician has identified parts of the text (indicated by letters A through I in the figure). Figure 5 shows the underlying mathematical representation of some example DRa annotations for the example in figure 4. Here, the mathematician has given each identified part a structural (e.g., chapter, section, etc.) and/or mathematical (e.g., lemma, corollary, proof, etc.) rhetorical role, and has indicated the relations between wrapped chunks of text (e.g., justifies, uses, etc.). Note that all the DRa annotations are represented as triples; this allows using the machinery of RDF [WC3, 2004] (a W3C standard aimed at the “semantic web”) to represent and manipulate them. The DRa structure of a text can be represented as a tree (which is exactly the



Lemma 1. For m, n ∈ N one has: m² = 2n² ⟹ m = n = 0

Proof. Define on N the predicate: P(m) ⟺ ∃n. m² = 2n² & m > 0.

Claim. P(m) ⟹ ∃m′ < m. P(m′).

Indeed suppose m² = 2n² and m > 0. It follows that m² is even, but then m must be even, as odds square to odds. So m = 2k and we have 2n² = m² = 4k² ⟹ n² = 2k². Since m > 0, it follows that m² > 0, n² > 0 and n > 0. Therefore P(n). Moreover, m² = n² + n² > n², so m² > n² and hence m > n. So we can take m′ = n.

By the claim ∀m ∈ N. ¬P(m), since there are no infinite descending sequences of natural numbers. Now suppose m² = 2n² with m ≠ 0. Then m > 0 and hence P(m). Contradiction. Therefore m = 0. But then also n = 0.

Corollary 2. √2 ∉ Q

Proof. Suppose √2 ∈ Q, i.e. √2 = p/q with p ∈ Z, q ∈ Z − {0}. Then √2 = m/n with m = |p|, n = |q| ≠ 0. It follows that m² = 2n². But then n = 0 by the lemma. Contradiction shows that √2 ∉ Q.

(In the figure, these chunks are wrapped in labelled boxes A through I and connected by arrows for relations such as “justifies”, “uses”, “subpartOf”, and “caseOf”.)

Figure 4. Wrapping/naming chunks of text and marking relationships in DRa

(A, hasMathematicalRhetoricalRole, lemma)
(E, hasMathematicalRhetoricalRole, definition)
(F, hasMathematicalRhetoricalRole, claim)
(G, hasMathematicalRhetoricalRole, proof)
(B, hasMathematicalRhetoricalRole, proof)
(H, hasMathematicalRhetoricalRole, case)
(I, hasMathematicalRhetoricalRole, case)
(C, hasMathematicalRhetoricalRole, corollary)
(D, hasMathematicalRhetoricalRole, proof)

(B, justifies, A)
(D, justifies, C)
(D, uses, A)
(G, uses, E)
(F, uses, E)
(H, uses, E)
(H, caseOf, B)
(H, caseOf, I)

Figure 5. Example of DRa relationships between chunks of text in figure 4

tree of the XML representation of the DRa annotated MathLang document). Due to the tree structure of a DRa annotated document, we refer to an annotated part of a text as a DRa node. We see an example of such a DRa node in figure 8. The role of this node is declaration and its name is decA. Note that the content of a DRa node is the user’s CGa and TSa annotation. In the DRa annotation of a document, there is a dedicated root node (the Document node), and each top-level DRa node is a child of this root node. In figure 6, we see a tree consisting of 10 nodes. The root node (labelled Document) has four children and five grandchildren (which are all children of B). We distinguish proved nodes (theorem, lemma, etc.), drawn with a solid line in the picture, from unproved nodes (axiom, definition, etc.), drawn with a broken line. We introduce this distinction because the current implementation of DRa lets the user create their own mathematical and structural roles. Since we want to check a DRa annotated document for validity, the information whether a node is to be proved or not is important. For example, such information would result in an error if someone tries to prove an unproved node, e.g. by proving



Figure 6. Example of a tree of the DRa nodes of a document

Figure 7. Dependency graph for an example DRa tree

a definition or an axiom. When document D2 references document D1, it can reference the root node of D1 to include all of its mathematical text. In figure 4 one can see that there are four top-level nodes: A, B, C and D, representing respectively lemma 1, a proof of lemma 1, corollary 2 and a proof of corollary 2. The proof of lemma 1 has five children: E, F, G, H, I, representing respectively the definition of the predicate, a claim, the proof of the claim, case 1 and case 2. The visual representation of this tree can be seen in figure 6. By traversing the tree in pre-order we derive the original linear order of the DRa nodes of the text. Pre-order means that the traversal starts with the root node and that we visit each parent node before we visit its children. The nodes at the same level are also ordered, from left to right: we enumerate the children of a node from 1 to n and process them in this order. In the example of figure 6, the pre-order would yield the order A, B, E, F, G, H, I, C, D. The DRa implementation can automatically extract a dependency graph (as seen in figure 7) that shows how the parts of a document are related.

Textual Order

To be able to examine the proper structure of a DRa tree we introduce the concept of textual order between two nodes in the tree. The concept of textual order is a modification of the logical precedence presented in [Kamareddine et al., 2007c]. In what follows, we formalise this concept of order and show how it can be used to automatically generate a proof skeleton. The textual order expresses the dependencies between parts of the text. For example, if a node A uses a part of a node B, then in a sequence of reasoning steps, B has to come before A. In order to



Figure 8. An example of a single DRa node

formally define textual order, we introduce some notions for DRa nodes. Recall that the content of a DRa node is its CGa and TSa part and that we have nine kinds of CGa annotations: term, set, noun, adjective, statement, declaration, definition, step, and context. A DRa node can also have further DRa nodes as children (e.g. B in figure 6 has the children E, F, G, H and I). We define several sets for a DRa node n. All these sets can be automatically generated from the user’s CGa and TSa annotations of the text. Table 1 defines these sets and gives examples for the CGa annotated text in figure 9, which is the definition of the subset relation.

Figure 9. CGa annotations for the definition of the subset relation

T(n) = {x | x is part of n and x is annotated as term}. Example of fig. 9: {x}
S(n) = {x | x is part of n and x is annotated as set}. Example: {A, B}
N(n) = {x | x is part of n and x is annotated as noun}. Example: {}
A(n) = {x | x is part of n and x is annotated as adjective}. Example: {}
ST(n) = {x | x is part of n and x is annotated as statement}. Example: {A ⊂ B, x ∈ A, x ∈ B, x ∈ A ⟹ x ∈ B, ∀x(x ∈ A ⟹ x ∈ B)}
DC(n) = {x | ∃q part of n, q is annotated as declaration and x is the declared symbol of q}. Example: {x}
DF(n) = {x | ∃q part of n, q is annotated as definition and x is the defined symbol of q}. Example: {⊂}
SP(n) = {x | x is part of n and x is annotated as step}. Example: {A ⊂ B ⟺ ∀x(x ∈ A ⟹ x ∈ B)}
C(n) = the set of all parts of n annotated as context
ENV(n) = {x | ∃m ≠ n, m is a node in the pre-order path from the root node to the node n, x is a part of m, and x is annotated as statement}. Example: {}

Table 1. Sets for a DRa node n and examples
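The set ENV(n) relies on the pre-order path from the root, and the same pre-order traversal recovers the original linear order of the DRa nodes described earlier. A sketch in Python (the dictionary encoding of the tree is our own simplification):

```python
def preorder(tree, node="Document"):
    """Visit each parent before its children, children left to right,
    omitting the artificial Document root label from the output."""
    order = [] if node == "Document" else [node]
    for child in tree.get(node, []):
        order += preorder(tree, child)
    return order

# The tree of figure 6: four top-level nodes; B has five children.
tree = {"Document": ["A", "B", "C", "D"], "B": ["E", "F", "G", "H", "I"]}
print(preorder(tree))  # ['A', 'B', 'E', 'F', 'G', 'H', 'I', 'C', 'D']
```

Collecting the statements of the nodes visited before n along this path is exactly how an implementation can build ENV(n).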

Let us give further examples of DC(n) and DF(n). In section 4a we had the following example of a list of declarations (call it ex): . . . ; ∈(term, set) : stat; =(term, term) : stat; natural number : noun; + (natural number, natural number) : natural number; . . . For ex, we have that DC(ex) = {∈, =, natural number, +}. Now take the example of figure 10 (and call it ex′ ). This example introduces the definition of ¬ (Definition 1). We have that DF(ex′ ) = {¬}.



The syntax of a definition in the internal representation of CGa (which is not necessarily the same as that seen by the reader) is an identifier with a (possibly empty) list of arguments on the left-hand side, followed by “:=” and an expression. The introduced symbol is the identifier of the left-hand side. For the example of figure 9 (call it ex″), the introduced symbol is ⊂, hence DF(ex″) = {⊂} and the internal CGa representation is: ⊂(A, B) := forall(a, impl(in(a, A), in(a, B))). Note that ENV(n) is the environment of all mathematical statements that occur before the statements of n (on the path from the root node). Note furthermore that in the CGa syntax of MathLang, a definition or a declaration can only introduce a term, set, noun, adjective, or statement. Furthermore, recall that mathematical symbols or notions can only be introduced by a definition or a declaration and that mathematical facts can only be introduced by a statement. We define the set IN(n) of introduced symbols and facts of a DRa node n as follows:

IN(n) := DF(n) ∪ DC(n) ∪ {s | s ∈ ST(n) ∧ s ∉ ENV(n)} ∪ ⋃_{c childOf n} IN(c)

At the heart of a context, step, definition, or declaration is a set of statement, set, noun, adjective, and term elements. A DRa node n uses the set USE(n), where:

USE(n) := T(n) ∪ S(n) ∪ N(n) ∪ A(n) ∪ ST(n) ∪ ⋃_{c childOf n} USE(c)
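The two recursive definitions can be transcribed almost literally. In the following Python sketch the per-node sets are plain Python sets; the DRaNode class and its field names are our own encoding, not MathLang's:

```python
class DRaNode:
    """A DRa node carrying its CGa sets and its children."""
    def __init__(self, DF=(), DC=(), ST=(), T=(), S=(), N=(), A=(),
                 ENV=(), children=()):
        self.DF, self.DC, self.ST = set(DF), set(DC), set(ST)
        self.T, self.S, self.N, self.A = set(T), set(S), set(N), set(A)
        self.ENV, self.children = set(ENV), list(children)

def IN(n):
    """IN(n) := DF(n) ∪ DC(n) ∪ {s ∈ ST(n) | s ∉ ENV(n)} ∪ IN of children."""
    out = n.DF | n.DC | {s for s in n.ST if s not in n.ENV}
    for c in n.children:
        out |= IN(c)
    return out

def USE(n):
    """USE(n) := T(n) ∪ S(n) ∪ N(n) ∪ A(n) ∪ ST(n) ∪ USE of children."""
    out = n.T | n.S | n.N | n.A | n.ST
    for c in n.children:
        out |= USE(c)
    return out

# The definition of ⊂ from figure 9, with the sets of table 1.
sub = DRaNode(DF={"⊂"}, DC={"x"}, T={"x"}, S={"A", "B"},
              ST={"A ⊂ B", "x ∈ A", "x ∈ B", "x ∈ A ⟹ x ∈ B"})
print(sorted(IN(sub)))   # the defined ⊂, the declared x, and the statements
```

Note how a statement that is already in the environment is not counted as newly introduced, which is exactly the role ENV(n) plays below.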

Lemma 1. For every DRa node n we have:
1. DF(n) ∪ DC(n) ⊆ T(n) ∪ S(n) ∪ N(n) ∪ A(n) ∪ ST(n).
2. IN(n) ⊆ USE(n).

Proof. We prove 2 by induction on the depth of parenthood of n. If n has no children then part 1 suffices. Assume the property holds for all children c of n. By part 1 and the induction hypothesis, we have IN(n) ⊆ USE(n). □

We demonstrate these notions with an example. Consider a part of a mathematical text and its corresponding DRa tree with relations as in figure 10. We assume the document starts with an environment which contains two statements, True and False. Hence ENV(def1) = {True, False}. When traversing the tree we start with the given environment for the node def1:

ENV(def1) = {True, False}

The environment for case1 consists of the environment of def1 and everything newly introduced by def1. In def1 only the newly defined ¬ is added to the environment:

ENV(case1) = {¬} ∪ ENV(def1)

After case1, all the statements of this node are added to the environment. These are ¬True and ¬True = False:

ENV(case2) = {¬True, ¬True = False} ∪ ENV(case1)

We can proceed with the building of the environment in the same way and get the last two environments of lem1 and pr1:

ENV(lem1) = {¬False, ¬False = True} ∪ ENV(case2)
ENV(pr1) = {¬¬True, ¬¬True = True} ∪ ENV(lem1)

With this information we derive the sets as shown in table 2 for the single nodes. We can now formalise three different kinds of textual order ≺, ≼ and ↔:







Figure 10. Example of an annotated text and its corresponding DRa tree

def1: IN = {¬} ∪ IN(case1) ∪ IN(case2); USE = {¬} ∪ USE(case1) ∪ USE(case2)
case1: IN = {¬True, ¬True = False}; USE = {True, False, ¬, ¬True, ¬True = False}
case2: IN = {¬False, ¬False = True}; USE = {True, False, ¬, ¬False, ¬False = True}
lem1: IN = {¬¬True, ¬¬True = True}; USE = {True, ¬, ¬True, ¬¬True, ¬¬True = True}
pr1: IN = {¬¬True = ¬False}; USE = {True, False, ¬, ¬True, ¬¬True, ¬False, ¬¬True = ¬False, ¬False = True}

Table 2. The sets IN and USE for the example

• Strong textual order ≺: If a node A uses a declared/defined symbol x or a statement x introduced by a node B, we say that A succeeds B and write B ≺ A. More formally: B ≺ A := ∃x(x ∈ IN(B) ∧ x ∈ USE(A)).

• Weak textual order ≼: This order describes a subpart relation between two nodes (A is a subpart of B, written as A ≼ B). More formally: A ≼ B := IN(A) ⊆ IN(B) ∧ USE(A) ⊆ USE(B).

• Common textual order ↔: This order describes the relation that two nodes use at least one common symbol or statement. More formally: A ↔ B := ∃x(x ∈ USE(A) ∧ x ∈ USE(B)).

When B ≺ A (resp. A ≼ B) we also write A ≻ B (resp. B ≽ A). A DRa relation induces a textual order. Table 3 gives some relations and their textual orders. We can now verify the relations of the example of figure 10 and their textual orders (table 4). It is easy to check that all five conditions hold and hence the relations are valid. By contrast, the relation (case2, uses, lem1) would not be valid, because ¬∃x(x ∈ USE(case2) ∧ x ∈ IN(lem1)). Note that these conditions are purely syntactic: there is no semantic check that e.g. a “justifies” relation really connects a proved node and its proof.
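The three orders are directly computable from the IN and USE sets. A sketch in Python, using the sets of table 2 with statements written as plain strings (the function names are ours):

```python
def strong_prec(b, a, IN, USE):
    """B ≺ A: node a uses a symbol or statement introduced by node b."""
    return bool(IN[b] & USE[a])

def weak_prec(a, b, IN, USE):
    """A ≼ B: a is a subpart of b (IN and USE sets contained in b's)."""
    return IN[a] <= IN[b] and USE[a] <= USE[b]

def common(a, b, IN, USE):
    """A ↔ B: a and b use at least one common symbol or statement."""
    return bool(USE[a] & USE[b])

# The IN and USE sets of table 2.
IN_ = {"case1": {"¬True", "¬True = False"},
       "case2": {"¬False", "¬False = True"},
       "lem1":  {"¬¬True", "¬¬True = True"},
       "pr1":   {"¬¬True = ¬False"}}
IN_["def1"] = {"¬"} | IN_["case1"] | IN_["case2"]
USE_ = {"case1": {"True", "False", "¬", "¬True", "¬True = False"},
        "case2": {"True", "False", "¬", "¬False", "¬False = True"},
        "lem1":  {"True", "¬", "¬True", "¬¬True", "¬¬True = True"},
        "pr1":   {"True", "False", "¬", "¬True", "¬¬True", "¬False",
                  "¬¬True = ¬False", "¬False = True"}}
USE_["def1"] = {"¬"} | USE_["case1"] | USE_["case2"]

print(strong_prec("def1", "lem1", IN_, USE_))   # True: def1 ≺ lem1
print(strong_prec("lem1", "case2", IN_, USE_))  # False: (case2, uses, lem1) invalid
```

Running the three predicates over the relations of table 4 reproduces exactly the orders listed there.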



A uses B: A uses a statement or a symbol of B. Order: B ≺ A
A inconsistentWith B: some statement in A contradicts a statement in B. Order: B ≺ A
A justifies B: A is the proof for B. Order: A ↔ B
A relatesTo B: there is a connection between A and B but no dependence. Order: A ↔ B
A caseOf B: A is a case of B. Order: A ≼ B

Table 3. Example of DRa relations and their textual order

(case1, caseOf, def1): condition IN(case1) ⊆ IN(def1) ∧ USE(case1) ⊆ USE(def1); order case1 ≼ def1
(case2, caseOf, def1): condition IN(case2) ⊆ IN(def1) ∧ USE(case2) ⊆ USE(def1); order case2 ≼ def1
(pr1, justifies, lem1): condition ∃x(x ∈ USE(pr1) ∧ x ∈ USE(lem1)); order pr1 ↔ lem1
(lem1, uses, def1): condition ∃x(x ∈ USE(lem1) ∧ x ∈ IN(def1)); order def1 ≺ lem1
(pr1, uses, def1): condition ∃x(x ∈ USE(pr1) ∧ x ∈ IN(def1)); order def1 ≺ pr1

Table 4. Conditions for the relations of the example

The GoTO

The GoTO is the Graph of Textual Order. For each kind of relation in the dependency graph (DG) of a DRa tree we can provide a corresponding textual order ≺, ≼ or ↔. These different kinds of order can be interpreted as edges in a directed graph, so we can transform the dependency graph into a GoTO by transforming each edge of the DG. So far there are two reasons why the GoTO is produced:

1. Automatic checking of the GoTO can reveal errors in the document (e.g. loops in the structure of the document).
2. The GoTO is used to automatically produce a proof skeleton for a prover.

To transform an edge of the DG we need to know which textual order it induces. Each relation has a specific order ≺, ≻, ≼, ≽ or ↔. Table 5 shows the graphical representation of such edges for relations we have seen in our examples: (A, uses, B), (A, caseOf, B), and (A, justifies, B).

Table 5. Graphical representation of edges in the GoTO

There is also a relation between a DRa node and its children: for each child c of a node n we have the edge c ≼ n in the GoTO. This “childOf” relation is added automatically when producing the GoTO, but it can also be added manually by the user. This can be useful e.g. in papers with a page restriction, where some parts



of the text are relocated to the appendix but originally belong in the main text. The algorithm for producing the GoTO from the DG works in two steps:

1. transform each relation of the DG into its corresponding edge in the GoTO;
2. for each child c of a node n, add the edge c ≼ n to the GoTO.

When we perform this algorithm on the example of figure 7 we get the GoTO shown in figure 11. Each relation of the DG which induces a ↔ textual order is replaced by the corresponding edge in the GoTO. We can see these edges between a proved node and its proof, where the “justifies” relation induces a ↔ order (e.g. between A and B, C and D, and F and G). The children of the node B are connected to B via ≼ edges in the GoTO. For the “caseOf” relation the user has manually specified the relation; the other ≼ edges were added automatically by the algorithm generating the GoTO. The relations which induce the order ≺ are transformed into the corresponding directed edges in the GoTO. We see that the direction of these edges has changed with respect to the DG. This is because we only have “uses” relations, and for a relation (A, uses, B) we have the textual order B ≺ A, which means that the direction of the edge changes.
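The two-step DG-to-GoTO transformation can be sketched as follows. The edge encoding (pairs for directed ≺/≼ edges, frozensets for undirected ↔ edges) is our own choice:

```python
def dg_to_goto(relations, children):
    """Step 1: transform DG relation triples into GoTO edges.
    Step 2: add the automatic childOf edges c ≼ n."""
    directed, undirected = set(), set()
    for a, rel, b in relations:
        if rel == "uses":
            directed.add((b, a))               # (A, uses, B) gives B ≺ A
        elif rel == "caseOf":
            directed.add((a, b))               # (A, caseOf, B) gives A ≼ B
        elif rel == "justifies":
            undirected.add(frozenset((a, b)))  # (A, justifies, B) gives A ↔ B
    for n, cs in children.items():
        for c in cs:
            directed.add((c, n))               # childOf: c ≼ n
    return directed, undirected

# A fragment of the figure 5 relations and the figure 6 tree.
relations = [("B", "justifies", "A"), ("D", "uses", "A"), ("H", "caseOf", "B")]
children = {"B": ["E", "F", "G", "H", "I"]}
directed, undirected = dg_to_goto(relations, children)
print(("A", "D") in directed)  # True: the "uses" edge flips direction
```

The direction flip for "uses" is visible in the returned edge set: the triple (D, uses, A) becomes the directed edge from A to D.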

Figure 11. Graph of Textual Order for an example DRa tree

Automatic checking of DG and GoTO

We distinguish two kinds of failures: warnings and errors. At the current state of development of DRa we check for four different kinds of failures:

1. Loops in the GoTO (error)
2. Proof of an unproved node (error)
3. More than one proof for a proved node (warning)
4. Missing proof for a proved node (warning)

The checks for 2)–4) are performed on the DG. For 2) we check, for every node of type “unproved”, whether there is an incoming edge of type “justifies”. If so, an error is returned (e.g. when someone tries to prove an axiom or a definition). For 3) and 4) we check, for each node of type “proved”, whether there is an incoming edge of type “justifies”. If not, we return a warning (this can be a deliberate omission of the proof or just a mistake). If there is more than one proof for one node we also return a warning (most formal systems cannot handle multiple proofs). For 1) we search for cycles in the GoTO. To do so, we have to define how we treat the three different kinds of edges. Edges of type ≺ and ≼ are treated as directed edges. Edges of type ↔ are in principle undirected edges, which means



for an edge A ↔ B, one can get from A to B and from B to A in the GoTO. It is vital that within one cycle such an edge is only used in one direction; otherwise we would have a trivial cycle between any two nodes connected by a ↔ edge. As we will see in the next section, a single node in the DRa tree can only be translated when all its children are ready to be translated. To reflect this circumstance we have to add certain edges to the GoTO for the cycle check. Let us demonstrate this with an example. Consider a DG and GoTO as in figure 12.

Figure 12. Example of a not recognised loop in a DRa (left DG, right GoTO) Apparently there is a cycle in this tree, because to be able to translate C we need to translate its children D and E. Since C uses A, A must be translated before translating C. But the child D of C is used by A leading to a deadlock. Neither A nor C can be processed. To recognise such cycles we add certain edges to the GoTO when checking for cycles. Therefore we have to look at the children of a node n: hidden cycles can only evolve, when there are edges ei from a child node ci to a target node ti which is not a sibling of ci . Hence we add an edge ci ≻ n for each such node ei to the GoTO. This can be done via algorithm 1. We could also foreach node n of the tree do foreach child c of n do foreach outgoing edge e of c do if target node t of e is no sibling of c then add a Strong textual precedence edge from n to t; end end end end

Algorithm 1: Adding additional edges to the GoTO add new edges for all incoming edges of the children ci but this is not necessary since the textual order of the “childOf” relation is a directed edge from each child ci to its parent node n and the transitivity of the edges helps find a cycle anyway. In the example from figure 12, algorithm 1 would add one edge to the GoTO: The child node D of C has an outgoing node to the non-sibling node A. So a new directed edge from C to A is added which yields the result of figure 13 where a cycle between the nodes A, C and A with the edges A-C and C-A appears.

Figure 13. GoTO graph of the example of figure 12 with added edges
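Algorithm 1 and the subsequent cycle search can be sketched in Python. We additionally skip edges whose target is the parent node itself, which we assume the implementation handles via the childOf edges; a plain depth-first search then finds the cycle of figure 13:

```python
def add_hidden_cycle_edges(children, directed):
    """Algorithm 1 (sketch): for each child c of n with an outgoing edge
    to a target t that is neither a sibling of c nor n itself, add a
    strong textual precedence edge from n to t."""
    added = set()
    for n, cs in children.items():
        for c in cs:
            for (src, t) in directed:
                if src == c and t != n and t not in cs:
                    added.add((n, t))
    return added

def has_cycle(edges):
    """Detect a directed cycle with a depth-first search."""
    nodes = {v for e in edges for v in e}
    colour = dict.fromkeys(nodes, "white")
    def visit(v):
        colour[v] = "grey"
        for (a, b) in edges:
            if a == v:
                if colour[b] == "grey" or (colour[b] == "white" and visit(b)):
                    return True
        colour[v] = "black"
        return False
    return any(colour[v] == "white" and visit(v) for v in nodes)

# Figure 12: C has children D and E; C uses A (edge A → C);
# A uses D (edge D → A); childOf edges D → C and E → C.
children = {"C": ["D", "E"]}
goto = {("A", "C"), ("D", "A"), ("D", "C"), ("E", "C")}
print(has_cycle(goto))                            # False: deadlock is hidden
added = add_hidden_cycle_edges(children, goto)
print(added, has_cycle(goto | added))             # {('C', 'A')} True
```

Without the added edge the deadlock is invisible to a plain cycle search; with it, the cycle A→C→A of figure 13 is found.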





Figure 14. Example of a loop in the GoTO (DG left, GoTO right)

Figure 14 demonstrates another situation of a cycle in a DRa annotated text. The problem is that lemma 1 uses lemma 2, but the proof of lemma 2 uses a part of the proof of lemma 1. This situation would end up in a deadlock when processing the GoTO, e.g. when producing the proof skeleton. We see a cycle through the nodes A, C, D, F and B with the edges A→C, C→D, D→F, F→B, and B→A. Here we also see why we do not need to add incoming edges to the parent nodes: for node F we have an incoming edge, but due to the direction of the “childOf” edge from F to B, we can use transitivity. In both examples, an error would be returned with the corresponding nodes and edges. The design and implementation of DRa were the subject of Retel’s thesis [Retel, 2009]. Further additions have since been carried out by Zengler as reported here.

5 CONNECTING MATHLANG TO FORMAL FOUNDATIONS

Current approaches to formalising CML texts generally involve rewriting the text from scratch; there is no clear methodology by which the text can gradually change in small steps into its formal version. One of MathLang’s goals is to support formalising a text in small steps that do not require radically reorganising the text. Also, a text with fully formal content should still be presentable in the same way as the less formal version originally developed by a mathematician. We envision formalisation as working by adding additional layers of information to a MathLang document to support embedding formal proofs. Ideally, there should be flexible control over how much of the additional information is presented to the reader; the additional information could form part of the visual presentation, or could exist “behind the scenes” to provide assurance of correctness. As part of the goal of supporting formalisation in MathLang, we desire to keep MathLang independent of any particular formal foundation.
However, as proofs embedded in a MathLang document become more formal, it will be necessary to tie them more closely to a particular proof system. It might be possible to keep fully formal documents independent of any particular foundation by allowing the most formal parts of a document to be expressed redundantly in multiple proof systems. (This is similar in spirit to the way the natural language portion of a document might be expressed simultaneously in multiple natural languages.) In this section we report on a methodology and software for connecting a MathLang document with formal versions of its content. We mainly concentrate on a formal foundation in Coq, but for Mizar see [Kamareddine et al., 2007b; Retel, 2009] and for Isabelle see [Lamar, 2011]. Our formalisation into Mizar involved constructing a skeleton of a Mizar document (e.g. figure 15) from a MathLang document, and then completing the Mizar skeleton separately.

Figure 15. Generating a Mizar Text-Proper skeleton from DRa and CGa

A Mizar document consists of an Environment-Declaration and a Text-Proper. In Mizar, the Environment-Declaration is used to generate the Environment, which holds the needed knowledge from MML (Mizar’s Mathematical Library). The Text-Proper is checked for correctness using the knowledge in the Environment. In this paper, we present the automation of the skeleton generation for arbitrary theorem provers and we give a generic algorithm for transforming the DRa tree into a proof skeleton. Since at this stage of formalisation we do not want to tie ourselves to any particular foundation, the algorithm is highly configurable: it takes the desired theorem prover as an argument and generates the proof skeleton for that theorem prover. The aim of this skeleton generation is once again to stay as close as possible to the mathematician’s original CML text. But due to certain restrictions of different theorem provers, the original order cannot always be respected. We give some classical examples of when this can happen:

• Nested lemmas/theorems: Sometimes mathematicians define new lemmas or theorems inside proofs. Not every theorem prover can handle such an approach (e.g. Coq). For such theorem provers, it is necessary to “de-nest” the theorems/lemmas.

• Forward references: Sometimes a paper first gives an example for a theorem before it states the theorem. Some theorem provers (e.g. Mizar) do not support such forward references. The text has to be rewritten so that it only has backward references (i.e. to already stated mathematical constructs).

• Outsourced proofs: The practice in mathematical writing is to outsource to the appendix complex proofs that are not mandatory for the main results.



When formalising such texts, these proofs need to be put in the right place. The algorithm for re-arranging the parts of the text and generating the proof skeleton performs reordering only when necessary for the theorem prover at hand.


The generic automated Skeleton Generation Algorithm (gSGA)

The proof skeleton generation algorithm takes as arguments (cf. table 6): 1. the input MathLang XML file with DRa annotations; and 2. a configuration file (in XML format) for the theorem prover.

Table 6. The skeleton generation algorithm

This algorithm works on the DRa tree as seen in the last section. A DRa node can have one of three states: processed (black), in-process (grey) and unprocessed (white). A processed node has already been translated into a part of the proof skeleton, a node in-process is one that is being checked, while an unprocessed node is still awaiting translation. This information allows us to identify which nodes have already been translated and which are still to be translated. The method for generating the output of a single node is shown in algorithm 2.

foundwhite := true;
while foundwhite do
    foundwhite := false;
    foreach child c of the node do
        if c is unprocessed && isReady(c) then
            processNode(c);
            generateOutput(c);
            foundwhite := true;
            break;
        end
    end
end

Algorithm 2: generateOutput(Node node)

The algorithm starts at the Document root node, recursively searches for nodes in need of processing, and processes them so that the node at hand is translated and added to the proof skeleton. Whether a node is ready to be processed or not depends only on the GoTO of the DRa tree. A node is ready to be processed if:

1. it has no incoming ≺ edges (in the GoTO) from unprocessed (white) nodes;
2. all its children are ready to be processed;
3. if it is a proved node: its proof is ready to be processed.

Algorithm 3 tests these three properties of a node and returns the result. When checking whether each of the n children is ready, it is important to perform the test n times, because a rearrangement can also be required for the children. If there


Fairouz Kamareddine, Joe Wells, Christoph Zengler and Henk Barendregt

    foreach incoming edge e of the node do
        if type of e is ≺ && source of e is unprocessed (white) then
            return false
        end
    end
    mark the node as grey;
    n := number of children of the node;
    for 1..n do
        foreach child c of the node do
            if c is not processed && isReady(c) then
                mark c as grey;
                break;
            end
        end
    end
    if a white node is still among the children of the node then
        reset all grey nodes back to white;
        return false
    end
    if the node is a proved node then
        proof := proof of the node;
        if not isReady(proof) then
            reset all grey nodes back to white;
            return false
        end
    end
    reset all grey nodes back to white;
    return true

Algorithm 3: isReady(Node node)

are still white children after n steps, then the children cannot yet be processed and so the node cannot be processed. To illustrate algorithm 3, we look at a (typical and not well-structured) mathematical text whose DG and GoTO edges are shown in figure 16.

Figure 16. To illustrate Skeleton generation (DG at top, GoTO at bottom)

The root node of the document can be marked as processed and the algorithm starts at this node. The first child is Lemma 1. Criterion 1) is fulfilled, since the node has no incoming ≺ edges in the GoTO. Criterion 2) is fulfilled because the node has no children. For criterion 3), Proof 1 has to be ready to be processed before we can mark Lemma 1 as ready to be processed. Proof 1 has no incoming ≺ edges, so criterion 1) is fulfilled. For criterion 2), the children of the proof have to be ready to be processed. Definition 1 is ready, but the proof of Claim 1, Proof C1, has an incoming ≺ edge from an unprocessed node (Lemma 2). So Claim 1 is not ready and hence neither are Proof 1 and Lemma 1.



The next node to check is Lemma 2. Criteria 1) and 2) are fulfilled; for criterion 3), its proof Proof 2 has to be ready to be processed. Criteria 1) and 3) of the proof are fulfilled, so we check whether its children are ready to be processed. The first child that can be processed is Definition 2, so it is marked as in-process (grey).

In a second run of the for loop checking the children of Proof 2, Claim 2 and its proof are now ready, because Definition 2 is no longer white but grey. This situation is why we must perform the readiness check on the n children of a node exactly n times.

Output:
    Lemma 2
    Proof 2
    Definition 2
    Claim 2
    Proof C2

Since now all the children of Proof 2 are ready, the complete proof is ready and so is Lemma 2. The grey flags are removed and the output for Lemma 2 is generated. In this step, the nodes Lemma 2, Proof 2, Claim 2, Proof C2 and Definition 2 are all permanently marked as processed (black).

Since a node has now been processed, the algorithm starts again with the first white node. So Lemma 1 is checked again. Now the children of its proof can be processed, because Lemma 2 is processed and no longer prevents the processing of Proof C1.


Output:
    Lemma 1
    Proof 1
    Definition 1
    Claim 1
    Proof C1

At the end, Lemma 1 and its proof can be processed. The final order of the nodes is:

    Lemma 2, Proof 2, Definition 2, Claim 2, Proof C2, Lemma 1, Proof 1, Definition 1, Claim 1, Proof C1

We see that with this order no node references other nodes that are not already translated. Lemma 2 is translated first; its proof follows immediately. Definition 2 is reordered because Claim 2 and its proof refer to it, so it has to be written in front of them. Lemma 1 can then be translated because Lemma 2, which it refers to, is already translated.
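The two algorithms and the example above can be sketched in executable form. The following Python sketch is purely illustrative: the node fields, helper names and the emit callback are assumptions of this sketch, not the authors' implementation.

```python
# Illustrative sketch of Algorithms 2 and 3 on a simplified DRa tree.

class Node:
    def __init__(self, name, children=None, proof=None):
        self.name = name
        self.children = children or []
        self.proof = proof          # the justifying proof of a proved node
        self.state = "white"        # white / grey / black
        self.preceded_by = []       # sources of incoming "before" (<) GoTO edges

def is_ready(node):
    # Criterion 1: no incoming < edge from an unprocessed (white) node.
    if any(src.state == "white" for src in node.preceded_by):
        return False
    node.state = "grey"
    # Criterion 2: test the n children up to n times, since marking one
    # child grey can unblock another (the required rearrangement).
    for _ in range(len(node.children)):
        for c in node.children:
            if c.state == "white" and is_ready(c):
                c.state = "grey"
                break
    ok = all(c.state != "white" for c in node.children)
    # Criterion 3: a proved node also needs its proof to be ready.
    if ok and node.proof is not None:
        ok = is_ready(node.proof)
    # Reset trial markings either way; permanent (black) marking only
    # happens when the node is actually processed.
    for c in node.children:
        if c.state == "grey":
            c.state = "white"
    node.state = "white"
    return ok

def process(node, emit):
    # Emit a node and everything it contains, in "ready" order.
    emit(node.name)
    node.state = "black"
    while any(c.state != "black" for c in node.children):
        for c in node.children:
            if c.state == "white" and is_ready(c):
                process(c, emit)
                break
        else:
            break   # no child ready; cannot happen on a well-formed GoTO
    if node.proof is not None and node.proof.state != "black":
        process(node.proof, emit)

def generate_output(root, emit):
    # Algorithm 2: repeatedly process the first ready unprocessed child.
    found = True
    while found:
        found = False
        for c in root.children:
            if c.state == "white" and is_ready(c):
                process(c, emit)
                found = True
                break
```

On the example of figure 16, this sketch reproduces the node order derived above: Lemma 2 and its proof first, with Definition 2 pulled in front of Claim 2.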


The configuration of gSGA

The transformation of the DRa annotated text into a proof skeleton has two steps:
• Reorder the text to satisfy the constraints of the particular theorem prover.
• Translate each DRa annotation into the language of the theorem prover.

The configuration file for a particular theorem prover reflects these two steps: there is a dictionary part and a constraints part. The dictionary contains a rule for each mathematical or structural role of DRa. A single DRa node has two important properties: a name and a content. This information is used in the translation. Within the configuration file we can refer to the name of a node with %name and to the body with %body. A new line (for better readability) can be inserted with %nl.

Consider the example of the DRa node from figure 8. The role of this node is declaration and its name is decA. The body of this node is the sentence "Let A be a set" or its CGa annotation. The rule for translating such a declaration into Mizar would be:

reserve %body ;

and the rule for the corresponding kind of declaration in Coq would be:

Variable %body .

Here, we let a single rule be embedded in an XML tag whose attribute "name" is the corresponding keyword, e.g.:

reserve %body ;
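Instantiating a dictionary rule is a plain template substitution. A minimal sketch follows; the rule strings are those from the examples above, while the node name and body used here are illustrative.

```python
# Illustrative sketch of applying a gSGA dictionary rule: the rule for a
# DRa role is a template in which %name, %body and %nl are substituted.

def apply_rule(rule, name, body):
    """Instantiate a dictionary rule for one DRa node."""
    return (rule.replace("%name", name)
                .replace("%body", body)
                .replace("%nl", "\n"))

mizar_rule = "reserve %body ;"
coq_rule = "Variable %body ."
```

For the declaration node decA this turns the Mizar rule into a reserve statement and the Coq rule into a Variable statement, with the node's body filling the %body hole.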



The constraints section of the configuration file for a theorem prover configures two main properties: whether forward references are allowed and whether mathematical constructs may be nested. Forward references can be allowed by setting the content of the corresponding tag to "true"; changing the content of the tag to "false" forbids forward references. If there is no such tag, the default value is "false". For the configuration of nested constructs there are two possibilities:
• either allow the nesting of constructs in general, listing those exceptions for which nesting is not allowed;
• or forbid the nesting of constructs in general, listing those exceptions for which nesting is allowed.
For example, a configuration can allow nesting in general ("true") but forbid it for definitions and axioms ("false", "false").
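A possible shape of such a constraints section, sketched in Python. Note that the concrete XML element names used below (forward-references, nesting, construct) are hypothetical: the text describes the structure of the section, but the element names themselves are not shown here.

```python
# Illustrative sketch of parsing a gSGA constraints section.
# The tag names in CONFIG are assumptions of this sketch.
import xml.etree.ElementTree as ET

CONFIG = """
<constraints>
  <forward-references>false</forward-references>
  <nesting default="true">
    <construct name="definition">false</construct>
    <construct name="axiom">false</construct>
  </nesting>
</constraints>
"""

def parse_constraints(xml_text):
    root = ET.fromstring(xml_text)
    # Forward references default to "false" when the tag is absent.
    fwd = root.findtext("forward-references", default="false") == "true"
    nesting = root.find("nesting")
    default_nest = nesting.get("default") == "true"
    exceptions = {c.get("name"): c.text == "true"
                  for c in nesting.findall("construct")}
    def may_nest(role):
        return exceptions.get(role, default_nest)
    return fwd, may_nest
```

With the configuration above, nesting is allowed in general but forbidden for definitions and axioms, matching the example in the text.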


The flattening of the DRa graph

The next question we have to deal with is how to change the tree when certain nestings are not allowed. We call this a flattening of the graph, because certain nodes are removed from their original position and inserted as direct children of the DRa top-level node. Algorithm 4 achieves this effect.

    foreach child c of the node do
        flattenNode(c);
        if c cannot be nested then
            nodelist := transitive closure of incoming nodes of c;
            foreach node n of nodelist do
                remove n from the list of children of the node;
                add n in front of the node as a sibling;
            end
        end
    end

Algorithm 4: flattenNode(Node node)

We refer to every child of the DRa top-level node as a node at level 1. Every child of such a node is at level 2, and so on. If a mathematical role must not be nested, it can only appear at level 1. So we check for each node at a level greater than 1 whether its mathematical role can be nested. If not, then the node and all its required siblings are removed from this level and put in front of their parent node. Since there is no "childOf" relation between this no-longer-child and its parent node, the relation between child and parent changes from a childOf edge to ≺: when a node is moved in front of its parent node, there is a ≺ edge between this node and its former parent. The required sibling nodes are determined in the GoTO. Each sibling of the removed node from which there is an incoming edge must be moved with the node. This includes its children or, for a proved node, its proof. Since for these children we have to move the related nodes too, we can



Figure 17. A flattened graph of the GoTO of figure 16 without nested definitions

Figure 18. A flattened graph of the GoTO of figure 16 without nested claims

build the transitive closure over the incoming nodes of the node which has to be moved. All nodes in this closure have to be relocated in front of the parent node.

We demonstrate this algorithm again on the example from figure 16. For a first demonstration we assume that the nesting of definitions is not allowed. So Definition 1 and Definition 2 have to be removed from level 2 and relocated in front of their parent nodes. The transitive closure over incoming edges in the GoTO yields no new nodes for removal (because the definitions have no incoming edges in the GoTO). The resulting flattened graph can be seen in figure 17. We see that the two definitions are now at level 1 and their edges to their former parent nodes have changed from childOf to ≺. The output for this graph according to the algorithm from the last section is given on the left-hand side of table 7.

    Definition 1        Definition 2
    Definition 2        Claim 2
    Lemma 2             Proof C2
    Proof 2             Lemma 2
    Claim 2             Proof 2
    Proof C2            Claim 1
    Lemma 1             Proof C1
    Proof 1             Lemma 1
    Claim 1             Proof 1
    Proof C1            Definition 1

Table 7. Outputs of the graphs of figures 17 (left-hand side) and 18 (right-hand side)

On the other hand, if we allow definitions to be nested but forbid nested claims, we get the graph of figure 18. The first claim found in the graph is Claim 1. The transitive closure yields that Proof C1 also needs to be removed, since there is a ↔ edge to the claim. The second claim found is Claim 2. The transitive closure yields again that its proof as well as Definition 2 have to be removed. The output for this graph is given on the right-hand side of table 7.
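Algorithm 4 can be sketched as follows. The node model (role, children, parent, incoming GoTO edges) is an assumption of this sketch, chosen only to make the flattening executable.

```python
# Illustrative sketch of Algorithm 4 (flattenNode).

class FlatNode:
    def __init__(self, name, role, children=None, incoming=None):
        self.name, self.role = name, role
        self.children = children or []
        self.incoming = incoming or []   # sources of GoTO edges into this node
        self.parent = None
        for c in self.children:
            c.parent = self

def closure_incoming(node):
    """Transitive closure over incoming GoTO edges, node itself included."""
    seen, todo = set(), [node]
    while todo:
        n = todo.pop()
        if n not in seen:
            seen.add(n)
            todo.extend(n.incoming)
    return seen

def flatten(node, can_nest):
    for c in list(node.children):
        flatten(c, can_nest)                  # flatten bottom-up
        if not can_nest(c.role):
            clos = closure_incoming(c)
            movers = [m for m in node.children if m in clos]
            for m in movers:
                # The childOf relation becomes a "before" (<) relation:
                # m is re-inserted in front of `node` as a sibling.
                node.children.remove(m)
                i = node.parent.children.index(node)
                node.parent.children.insert(i, m)
                m.parent = node.parent
```

On the examples above, forbidding nested definitions lifts only the definitions, while forbidding nested claims drags the claim's proof (and, through it, Definition 2) along, exactly as in figures 17 and 18.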






Newman’s Lemma

As a case study we specify for Newman's Lemma a feasible interactive mathematical proof development, of the kind that should be accepted by an interactive proof assistant if such assistants are to be accepted by working mathematicians. Table 8 gives an actual proof development in Coq for the main lemma. We start with the informal statement and proof.

Let A be a set and let R be a binary relation on A. R+ is the transitive closure of R and R* is the transitive reflexive closure of R. Confluence of R, notation CR(R) (Church-Rosser property), is defined as follows; the notation crR(a), also written cr(R,a), denotes confluence from a ∈ A, and WCR(R) stands for weak confluence.

1. crR(a) ⇐⇒ ∀b1,b2 ∈ A. [a R* b1 ∧ a R* b2 ⇒ ∃c. b1 R* c ∧ b2 R* c].
2. CR(R) ⇐⇒ ∀a ∈ A. crR(a).
3. WCR(R) ⇐⇒ ∀a,b1,b2 ∈ A. [a R b1 ∧ a R b2 ⇒ ∃c. b1 R* c ∧ b2 R* c].

Newman's lemma states that for well-founded relations weak confluence implies confluence. The notion of well-foundedness is formulated as the possibility to prove statements by transfinite induction. Let P ∈ P(A).

4. INDR(P) ⇐⇒ ∀a ∈ A. (∀y ∈ A. a R y ⇒ P(y)) ⇒ P(a).
5. WF(R) ⇐⇒ ∀P ∈ P(A). [INDR(P) ⇒ ∀a ∈ A. P(a)].

LEMMA 3 (Main Lemma). WCR(R) ⇒ INDR(crR).

Proof. Assume WCR(R). Remember that INDR(crR) ⇐⇒ ∀a : A. (∀y : A. a R y → crR(y)) → crR(a). Let a : A and assume ∀y : A. a R y → crR(y),


in order to show crR(a), i.e. ∀b1,b2 : A. a R* b1 ∧ a R* b2 → (∃c : A. b1 R* c ∧ b2 R* c). So let b1,b2 : A with a R* bi, in order to show ∃c : A. b1 R* c ∧ b2 R* c. If a = b1 or a = b2, then the result is trivial (take c = b2 or c = b1 respectively). So by lemma p7 (below) we may assume a R+ bi, which by lemma p6 (below) means a R xi R* bi, for some x1, x2.





[Diagram omitted: the completion of a R x1 R* b1 and a R x2 R* b2 via x, b and c.]
By WCR(R) there is an x such that xi R* x. By (IH) one has crR(x1), so x R* b ∧ b1 R* b, for some b. Again by (IH), crR(x2). As x2 R* x R* b, one has b R* c ∧ b2 R* c, for some c. Then b1 R* b R* c and we are done.
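The notions above can be checked mechanically on small finite relations. The following Python sketch is purely illustrative of the definitions and has nothing to do with the Coq development below.

```python
# Illustrative check of the confluence notions on finite relations.
# R is a set of pairs over a finite carrier A; rstar computes R*.
from itertools import product

def rstar(A, R):
    """Reflexive transitive closure of R over A."""
    S = {(a, a) for a in A} | set(R)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(S), repeat=2):
            if b == c and (a, d) not in S:
                S.add((a, d))
                changed = True
    return S

def joinable(S, A, b1, b2):
    return any((b1, c) in S and (b2, c) in S for c in A)

def wcr(A, R):
    S = rstar(A, R)
    return all(joinable(S, A, b1, b2)
               for a, b1, b2 in product(A, repeat=3)
               if (a, b1) in R and (a, b2) in R)

def cr(A, R):
    S = rstar(A, R)
    return all(joinable(S, A, b1, b2)
               for a, b1, b2 in product(A, repeat=3)
               if (a, b1) in S and (a, b2) in S)
```

On a terminating (hence well-founded) relation, weak confluence implies confluence, as Newman's lemma asserts; the classic non-terminating counterexample with a loop between a and c (a R b, a R c, c R a, c R d) is weakly confluent but not confluent, showing that well-foundedness cannot be dropped.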

PROPOSITION 3 (Newman's Lemma). WCR(R) ∧ WF(R) ⇒ CR(R).

Proof. By WCR(R) and the main lemma we have INDR(crR). Hence by WF(R) it follows that for P(a) = crR(a), one has ∀a ∈ A. crR(a). This is CR(R).

Now we start a proof development for Newman's lemma.

Variable A:Set.
Definition Bin:=[B:Set](B->B->Prop).

Inductive TC [R:(Bin A)]: (Bin A) :=
  TCb: (x,y:A)(R x y)->(TC R x y)
| TCf: (x,y,z:A)((R x z)->(TC R z y)->(TC R x y)).

Inductive TRC [R:(Bin A)]: (Bin A) :=
  TRCb: (x:A)(TRC R x x)
| TRCf: (x,y,z:A)((R x z)->(TRC R z y)->(TRC R x y)).

Definition Trans [R:(Bin A)]: Prop:=
  (x,y,z:A)((R x y)->(R y z)->(R x z)).

Definition IND [R:(Bin A);P:(A->Prop)]: Prop :=
  (a:A)((y:A)(a R y)->(P y))->(P a).

Definition cr [R:(Bin A);a:A]:=
  (b1,b2:A)(TRC R a b1)/\(TRC R a b2)->(EX c:A|(TRC R b1 c)/\(TRC R b2 c)).

Definition CR [R:(Bin A)]:=(a:A)(cr R a).

Definition WCR [R:(Bin A)]:=
  (a,b1,b2:A)(a R b1)->(a R b2)->(EX c:A|(TRC R b1 c)/\(TRC R b2 c)).

Definition WF [R:(Bin A)]:Prop:=
  (P:A->Prop)(IND R P)->(a:A)(P a).



Variable R:(Bin A).

Lemma p0: (x,y:A)((R x y) -> (TC R x y)).
Lemma p1: (x,y:A)((R x y) -> (TRC R x y)).
Lemma p2: (x,y:A)((TC R x y) -> (TRC R x y)).
Lemma p3: (Trans (TC R)).
Lemma p4: (Trans (TRC R)).
Lemma p5: (x,y,z:A)(R x y)->(TRC R y z)->(TRC R x z).
Lemma p6: (x,y:A)((TC R x y)->(EX z:A | (R x z)/\(TRC R z y))).
Lemma p7: (x,y:A)((TRC R x y)-> (eq A x y)\/(TC R x y)).

The proof-scripts for these lemmas are not shown. The main lemma is as follows.

Lemma main : (WCR R)->(IND R (cr R)).

The proof-script in Coq is given in table 8. Now we give an interactive mathematical proof script, which we claim should essentially be acceptable by a mathematician-friendly proof-assistant. On the left we find the mathematical script, on the right the proof-state. These may contain some Coq-like statements, like "Intros", but these disappear and are replaced by mathematical statements. For an interactive version see {ps, dvi}. The dvi version has to be viewed in advi, obtainable from pauillac.inria.fr/~miquel. First we introduce some user-friendly notation.

NOTATION 4. For a,b:A we write
(i) a R b := (R a b).
(ii) a R+ b := (TC R a b).
(iii) a R* b := (TRC R a b).

Proof.

Assume WCR(R).
Remember IND. Let a:A.
Assume (y:A)((a R y)->(cr R y)).    (IH)
Remember cr. Let b1,b2:A. Assume a R* bi, i=1,2.
We have [a=b1 \/ a R+ b1], [a=b2 \/ a R+ b2], by lemma p7.
Case a=b1, take c=b2. Trivial. Hence wlog (a R+ b1).
Case a=b2, take c=b1. Trivial. Hence wlog (a R+ b2).
Therefore (EX xi:A | a R xi R* bi), i=1,2, by lemma p6.
Pick x1. Pick x2.
We have (EX x. xi R* x), i=1,2, by (WCR R). Pick x.



We have (cr R x1), by IH.
Hence (EX b. b1 R* b /\ x R* b). Pick b.
Moreover (cr R x2), by IH.
Hence (EX c. b R* c /\ b2 R* c), by x2 R* b. Pick c.
Since b1 R* c, by (Trans R*), we have (bi R* c), i=1,2.
Thus c works.
QED

Newman's Lemma.


Proof.
Assume WCR(R) and WF(R).
Then (IND R (cr R)), by WCR(R) and main.
Remember CR and WF.
We have (P:(A->Prop))((a:A)((y:A)(a R y)->(P y))->(P a)).    (+)
Apply (+) to (cr R).
Then CR(R).
QED



Towards a Mathematical Proof Language MPL

We will now sketch, rather loosely, a language that may be called MPL: Mathematical Proof Language. The language will need many extensions, but this kernel may already be useful.

DEFINITION 5. The phrases used in MPL for the proposed proof-assistant with interactive mathematical mode belong to the following set.

Assume B            Then B [, by C]
Towards A           Suffices B
Remember t          Wlog B [, since B \/ C]
Let x:D             and
Pick [in L] x       QED
Case B
Take x=t [in B]
Apply B to t
As to B

Here A, B, C are propositions in context Gamma, D is a type, x is a variable and t is a term of the right type. "Wlog" stands for "without loss of generality".

DEFINITION 6 (Synonyms). Suffices = In order to show = We must show = Towards; Let = Given; Then = We have = It follows that = Hence = Moreover = Again; and = with; by = since.

Before giving a grammar for tactic statements we give their semantics: they have a precise effect on the proof-state. In the following definition we show what the effect of a statement on the proof-state is. In some cases the tactic has a side-effect on the proof-script, as we saw in the case of Newman's lemma.



DEFINITION 7.
(i) A proof-state (within a context Gamma) is a set of statements Delta and a statement A, such that all members of Delta are well-formed in Gamma and A is well-formed in Gamma, Delta. If the proof-state is (Delta;A), then the goal is to show Delta ⊢ A.
(ii) The initial proof-state of a statement A to be proved is of course (∅;A).
(iii) A tactic is a map from proof-states to a list of proof-states, usually having a formula or an element as extra argument.

DEFINITION 8.

Assume C
  (Delta;C->B) = (Delta,C;B), and "Towards B" may be left in the script.

Let a:D
  (Delta;(x:D)P) = (Delta,a:D;P[x:=a]).

Remember name
  (Delta;A) = (Delta;A'), where A' results from A by unfolding the defined concept 'name'. This can be applied to an occurrence of 'name' by clicking on it. Other occurrences remain closed but become transparent (as if opened).

Pick [in L] x
  (Delta,L;A) = (Delta,x:D,B(x);A), where L is a formula reference of (EX x:D.B).

Take x=name
  (Delta;EX x:D.A) = (Delta;A[x:=name]), if Delta |- name:D.

Apply B to name
  (Delta;A) = (Delta,P[y:=name];A), where B of the form ((y:D)P) is in Delta.

Case B
  (Delta;A) = (Delta,B;A),(Delta,C;A), if B \/ C in Delta; the second proof-state represents the next subgoal.

As to Bi
  (Delta;B0 /\ B1) = (Delta;Bi),(Delta;B(1-i)); the second proof-state represents the next subgoal.

As to B
  (Delta;B) = (Delta;B).

QED
  (Delta;B) = no proof-states remain (the goal is discharged), if B in Delta.
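As an illustration of Definition 8's view of tactics as maps from proof-states to lists of proof-states, here is a toy Python model of two of the tactics. Formulas are modelled as plain strings; the whole model is an assumption of this sketch, not an implementation of MPL.

```python
# Toy model of Definition 8: a proof-state is (Delta, goal); a tactic
# returns a list of proof-states, and nothing happens when the side
# condition is not satisfied.

def assume(state, C):
    """Assume C: (Delta; C->B) becomes (Delta,C; B)."""
    delta, goal = state
    prefix = C + "->"
    if not goal.startswith(prefix):
        return [state]          # side condition not met: nothing happens
    return [(delta | {C}, goal[len(prefix):])]

def case(state, B, C):
    """Case B: if B\\/C is in Delta, split into (Delta,B;A) and (Delta,C;A);
    the second proof-state is the next subgoal."""
    delta, goal = state
    if B + "\\/" + C not in delta:
        return [state]
    return [(delta | {B}, goal), (delta | {C}, goal)]

s0 = (frozenset(), "P->Q")      # initial proof-state for the goal P->Q
```

Applying assume to s0 moves P into Delta and leaves the goal Q, exactly as the first entry of the table prescribes.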

In all cases nothing happens if the side conditions are not satisfied. One should be able to refer to a statement C in two ways: either by naming C directly or by referring to a label for C, like "IH" in the proof of the main lemma above. We say that L is a formula reference of formula B if L is B or if L is a label for B. Labels are sometimes handy, but they should also be suppressed in order to keep the proof-state clean. If the argument of a tactic occurs at several places, the system should complain; then reference should be made to a unique label. It is assumed that proof-states (Delta;A) are in normal form, that is, if B /\ C is in Delta, then it is replaced by the pair B, C. If the final QED is accepted, then all



the statements in the proof that did not have an effect on the proof-state will be suppressed in the final lay-out of the proof (or may be kept in orange as an option, in order to learn where one did superfluous steps). The following tactics require some automated deduction. If the proof-assistant cannot prove the claimed result, an extra proof-state will be generated so that this result will be treated as the next subgoal.

DEFINITION 9.

[Since B\/C] Wlog C
  (Delta;A) = (Delta,C;A), if B\/C in Delta and the assistant can establish Delta |- B->A.

Then B [, by C]
  (Delta;A) = (Delta,B;A), if C is a known lemma and the assistant can establish Delta,C |- B.

Suffices B
  (Delta;A) = (Delta;B), and the assistant can establish Delta |- B->A.

May assume B
  (Delta;A) = (Delta,B;A), if the assistant can establish Delta |- ~B->A and Delta |- B \/ ~B.

The tactic language MPL is defined by the following grammar.

formref := label | form
form+   := formref | form+ and formref
tactic  := Assume form+ | Towards form | Remember name | Let var:set
         | Pick [in formref] var | Case form | Take var = term [in formref]
         | Apply formref to term | Then form [, by form+]
         | Suffices formref | Wlog form [, since form\/form]
tactic+ := tactic. | tactic, tactic+ | tactic. tactic+

Here label is a proof-variable, used as a name for a statement (like IH in the proof of the main lemma), form is an inhabitant of Prop in the context Gamma, Delta, name is any notion defined during the proof development, and var is a variable. An extension of MPL capable of dealing with computations will be useful:

We have A(t). Then A(s), since t=s.

Another one: Then t=s, by computation. It would be nice to have this in an ambiguous way: "computation" is meant to be either pure conversion or an application of reflection. This corresponds to the actual mathematical usage:



5! = 120, by computation.
In a commutative ring, (x + y)² = x² + 2xy + y², by computation.

In type theory the first equality would be an application of the conversion rule, but for the second one reflection, see e.g. [Constable, 1995], is needed.


Procedural statements in the implementation of MPL

As we have seen in section 6a, it is handy to have statements that modify the proof-state but are not recorded as such. For example, if the proof-state is (Delta; (x:D)(A(x) -> B(x))), then Intros is a fast way to generate

Let x:D. Assume A(x), in order to prove B(x).

in the proof. Another example is Clear L, which removes formula L from the assumptions of the current subgoal. Also renaming variables is useful, as some statements may come from libraries and have a "wrong" choice of bound variables.

A FULL FORMALISATION IN COQ VIA MATHLANG: CHAPTER 1 OF LANDAU'S "GRUNDLAGEN DER ANALYSIS"

Landau's "Grundlagen der Analysis" [Landau, 1951] remains the only book which has been fully formalised in a theorem prover [van Benthem Jutting, 1977b]. This section summarises the encoding of the first chapter (natural numbers) of Landau's book into all aspects of MathLang, up to a full formalisation in Coq. We give a complete CGa, TSa and DRa annotation for the chapter, we generate a proof skeleton automatically with the gSGA for both Mizar and Coq, and then we give a complete formalised version of the chapter in Coq. To accomplish this, we have used the MathLang TEXMACS plugin to annotate the existing plain text of the book.

To clarify the path we took, we look once again at the overall diagram of the different paths in MathLang (figure 1). We first used path (a) and annotated the complete text with CGa, TSa and DRa annotations with the help of the MathLang TEXMACS plugin. The second step was to automatically generate a proof skeleton of the annotated text. With the help of the proof skeleton and the CGa annotations we fully formalised the proofs in Coq, completing the paths (d) and (e). The final result is a fully formalised version of the first chapter of Landau's book in Coq.


CGa and TSa annotations

The Preface

In the preface of a MathLang document we introduce symbols that are not defined in the text but are used throughout it. These are often quantifiers or Boolean connectives like ∧ or ∨. These symbols are often pre-encoded in theorem provers (e.g. Coq has special symbols for the logical and, or, implication, etc.). The preface



of the first chapter of Landau's book consists of 17 different symbols, as given in table 7a. Two functions deserve further explanation:
1. The "is a" function is used to express that a particular term is an instance of a noun. E.g. the first axiom of the book is that 1 is a natural number, so the encoding of this axiom is "1 is a natural number".
2. The "index" function is used to express a notion in the style of a_b = c, which can be defined as a function index(a, b) = c. So the index function has two terms as arguments and yields a term as a result.

The first section

The first section of the first chapter introduces the natural numbers, equality on natural numbers and five axioms (an extension of the Peano axioms). We introduce a noun natural number and the set N of natural numbers. Equality = and inequality ≠ between natural numbers are declared rather than defined. Three properties of equality are encoded. We will show the encoding of one of these, to recapitulate TSa annotations with sharing and to see how to use the symbols given in the preface. The original statement is

x = x for every x

We see that this is a universal quantification of x, and a well-formed equivalent statement would be ∀x(x = x). Since the positions are swapped in Landau's text, we use the position souring. The souring annotation of this statement is:

x = x for every x

[The soured annotation boxes are omitted here.] This yields the final statement, in which the quantifier "for every x" is soured to the front, i.e. ∀x(x = x).


Next we show how to encode axiom 2 showing that “wordy” parts of a text can also be annotated, not only mathematical statements. The original statement is: For each x there exists exactly one natural number, called the successor of x, which will be denoted by x′ The “for each” can be translated with a universal quantifier, the “exactly one” with the ∃! quantifier. So we get the general structure:

The complete statement can be e.g. encoded corresponding to the following formal statement ∀x(∃!x′ (succ(x) = x′ )):



Sections 2 - 4

Within the next sections, addition (section 2), ordering (section 3) and multiplication (section 4) are introduced. There are 36 theorems with proofs and 6 definitions: addition, greater than, less than, greater than or equal, less than or equal, and multiplication. There are many simply structured theorems like that of figure 19. We want to examine our way of annotating these theorems.

Figure 19. Simple theorem of the second section

The main theorem x + y = y + x is annotated in a straightforward manner: x and y are annotated as terms, plus as a function taking two terms as arguments and yielding a term as a result. The equality between these terms is a statement. Since we did not declare x and y in the preface or in a global context, we do this with a local scoping. This information is added in the first annotated line, where we declare x and y as terms and put these two annotations into a context, which means that this binding holds within the whole step.

Figure 20. Souring in chains of equations

Landau often used chains of equations in proofs, as in this proof of the equality of x(y + z′) and xy + xz′ in the proof of Theorem 30 of the first chapter:

x(y + z′) = x((y + z)′) = x(y + z) + x = (xy + xz) + x = xy + (xz + x) = xy + xz′

Here we benefit from our souring methods, especially the sharing of variables (see figure 20). There are also often hidden quantifications, like the one in the example "x = x for every x", where we need the souring for swapping positions. These TSa functionalities save a lot of time in annotating mathematical documents. For some theorems we use the Boolean connectives although they are not mentioned explicitly in the text. E.g. Theorem 16 states:

If x ≤ y, y < z or x < y, y ≤ z then x < z

We annotate the premise of the theorem as a disjunction of two conjunctions, as seen in figure 21. Another use of Boolean connectives is when we have formulations like "exactly one of the following must be the case...". There we use the exclusive or ⊕ to annotate the fact that exactly one of the cases must hold. We defined the exclusive or in the preface and therefore have to take care that we find a corresponding construct in the used theorem prover (see table 7a).
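What the sharing souring buys in such chains can be pictured as splitting the chain into its pairwise equations, each reusing the previous right-hand side. A minimal illustrative sketch:

```python
# Illustrative sketch: a chain t0 = t1 = ... = tn abbreviates the pairwise
# equations t_i = t_(i+1); each left-hand side is shared with the previous
# right-hand side, which is what the sharing annotation exploits.

def split_chain(chain):
    terms = [t.strip() for t in chain.split("=")]
    return list(zip(terms, terms[1:]))
```

Applied to the chain from Theorem 30, this yields five equations whose shared sides need to be annotated only once.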



Figure 21. The annotated Theorem 16 of Landau's first chapter


Figure 22. The DRa tree of sections 1 and 2 of chapter 1 of Landau’s book

DRa annotation

The structure of Landau's "Grundlagen der Analysis" is very clear: in the first section he introduces five axioms. We annotate these axioms with the mathematical role "axiom", give them the names "ax11" - "ax15" and classify them as unproved nodes. In the following sections we have 6 definitions, which we annotate with the mathematical role "definition", give the names "def11" - "def16" and classify as unproved nodes. We have 36 proved nodes with the role "theorem", named "th11" - "th136", with proofs "pr11" - "pr136". Some proofs are partitioned into an existential part and a uniqueness part. This partitioning can be useful e.g. for Mizar, where there are keywords for these parts of a proof. In the Coq formalisation, we used this partitioning to generate two single proofs in the proof skeleton, which makes it easier to formalise. Other proofs consist of different cases, which we annotate as unproved nodes with the mathematical role "case". This can be translated into the Mizar "per cases" statement or into single proofs in Coq. The DRa tree for sections 1 and 2 can be seen in figure 22. The relations are annotated in a straightforward manner. Each proof justifies its corresponding theorem. Some of the axioms depend on each other. Axiom 5 ("ax15") is the axiom of induction, so every proof which uses induction also uses this axiom. Definition 1 ("def11") is the definition of addition, hence every node which uses addition also uses this definition. Some theorems use other theorems via texts like "By Theorem ...". In total we have 36 justifies relations, 154 uses relations, 6 caseOf, 3 existencePartOf and 3 uniquenessPartOf relations. Figures 23 and 24 give the DG and GoTO of sections 1 and 2, resp. of the whole book. The DGs and GoTOs are automatically produced from the DRa annotated text. There are no errors or warnings in the document, which means we have no loops in the GoTO, no proofs for unproved nodes, no double proofs for a node and no missing proofs for proved nodes.
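The consistency checks just described (an acyclic GoTO and exactly one proof per proved node) can be sketched as follows; the graph encoding and function names are illustrative, not the tool's actual implementation.

```python
# Illustrative sketch of two of the DRa consistency checks.

def has_cycle(edges):
    """Detect a cycle in a directed graph given as {node: [successors]}."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in edges}
    def visit(n):
        colour[n] = GREY
        for m in edges.get(n, []):
            if colour.get(m, WHITE) == GREY:
                return True                 # back edge: loop in the GoTO
            if colour.get(m, WHITE) == WHITE and visit(m):
                return True
        colour[n] = BLACK
        return False
    return any(colour[n] == WHITE and visit(n) for n in list(edges))

def proof_errors(justifies, proved_nodes):
    """Each proved node needs exactly one justifying proof; report the
    nodes with zero (missing) or more than one (double) proofs."""
    counts = {n: 0 for n in proved_nodes}
    for proof, node in justifies:
        counts[node] = counts.get(node, 0) + 1
    return {n: c for n, c in counts.items() if c != 1}
```

A document passes when has_cycle is false for its GoTO and proof_errors is empty for its justifies relation.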


Generation of the proof skeleton

Since there are no errors in the GoTO, the proof skeleton can be produced without warnings. We have 8 mathematical roles in the document: axioms, definitions, theorems, proofs, cases, case, existenceParts and uniquenessParts. We make a



distinction between cases and case because e.g. in Mizar we have a special keyword introducing cases (per cases;) and then keywords for each case (suppose ...). So we annotated the cases as child nodes of the case node. Table 10 gives an overview of the rules that were used to generate the Mizar and the Coq proof skeletons. Since in Coq there are no special keywords for uniqueness, existence or cases, these rules translate only the body of these nodes and add no keywords. In table 11 we give a part of the Mizar and Coq skeletons for section 4.

Figure 23. DG (top) and GOTO (bottom) of sections 1 and 2 of chapter 1 of Landau's book


Completing the proofs in Coq

As we already explained, MathLang aims to remain as independent as possible of a particular foundation, while in addition facilitating the process of formalising mathematics in different theorem provers. Two PhD students of the MathLang project (Retel, respectively Lamar) are concerned with the MathLang paths into Mizar, respectively Isar. In this paper we study for the first time the MathLang path into Coq. We show how the CGa, TSa and DRa encoding of chapter one of Landau's book is taken into fully formalised Coq code. Currently we use the proof skeleton produced in the last section and fill all the %body parts by hand. We intend to investigate in the future how parts of the CGa and DRa annotations can be transformed automatically to Coq. In this section we explain why the process of formalising a mathematical text into Coq through MathLang is simpler than the formalisation of the text directly into Coq.

(The complete output of the skeleton for Mizar and Coq for the whole chapter can be found in the extended article on the web pages of the authors.)

To begin with, we code the preface of the document (see table 7a). The most complicated section to code in Coq was the first one, because we had to translate the axioms in a way that lets us use them productively in Coq. We defined the natural numbers as an inductive set, just as Landau does in his book:

Inductive nats : Set :=
| I : nats



| succ : nats -> nats.

Figure 24. DG (top) and GOTO (bottom) of all of chapter 1 of Landau's book

Then we translate axioms 2 - 4 almost literally from our CGa annotations. For example, the annotation of Axiom 3 ("ax13") in our document is:

We always have x′ ≠ 1.

[CGa annotation of Axiom 3]
By just viewing the interpretations of the annotations we get (a):

forall x (neq (succ(x), 1))


The automatically generated Coq proof skeleton for this axiom is (b):

Axiom ax13 : .


Now, we simply replace the placeholder of (b) with the literal translation of the interpretations in (a) to get the valid Coq axiom (this literal translation could also be done by an algorithm that we plan to implement soon):

Axiom ax13 : forall x:nats, neq (succ x) I .

The other axioms can be completed in a similar way. As can be seen, this is a very simple process that can be carried out using automated tools that reduce the burden on the user: the proof skeleton is generated automatically, the interpretations are obtained automatically from the CGa annotations (which are simple to produce), and for many parts of the text the combination of the proof skeleton with the interpretations can also be automated.
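Putting the pieces together, the axiom type-checks against the inductive definition of nats as soon as neq is available as a binary predicate on nats. The following sketch is ours rather than part of the generated skeleton: neq is simply declared as an abstract predicate, since its actual definition in the development is not shown in this excerpt.

```coq
(* Sketch only: neq is assumed abstract here. *)
Inductive nats : Set :=
  | I : nats
  | succ : nats -> nats.

(* Assumption of this sketch: an abstract inequality predicate on nats. *)
Parameter neq : nats -> nats -> Prop.

(* Axiom 3 of Landau: the successor of a natural number is never 1. *)
Axiom ax13 : forall x : nats, neq (succ x) I.
```

With this context in place, Coq accepts ax13 exactly as produced by combining the skeleton with the interpretation.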

Computerising Mathematical Text


Similarly, for the theorems of chapter 1 of Landau's book, full formalisation is straightforward. E.g. Theorem 1 is written by Landau as:

If x ≠ y then x′ ≠ y′

Its annotation in MathLang CGa is:

[CGa annotation of "If x ≠ y then x′ ≠ y′"]
The CGa annotation of the context (called local scoping) can also be seen as the premise of an implication. So the statement above can be translated, via a simple rewriting of the interpretations of the annotations, to:

decl(x), decl(y) : neq x y -> neq (succ x) (succ y)

When we compare this line with its Coq translation, we see again that it is just a literal transcription of the interpretation parts of CGa, and therefore could easily be performed by an algorithm:

Theorem th11 (x y:nats) : neq x y -> neq (succ x) (succ y) .

Of the 36 theorems of the chapter, 28 could be translated literally into their corresponding Coq theorems. Now we also want to look at a simple proof and how it can be translated into Coq. The encoding of Theorem 2 of the first chapter in Coq is:

Theorem th12 (x:nats) : neq (succ x) x .

Landau proves this theorem by induction: he first shows that 1′ ≠ 1, and then that with the assumption of x′ ≠ x it also holds that (x′)′ ≠ x′. Since we defined the natural numbers as an inductive set, we can also do our proof in the Landau style. We introduce the variable x and eliminate it, which yields two subgoals that we need to prove. These subgoals are exactly the induction basis and the induction step.

Proof. intro x. elim x.

2 subgoals
  x : nats
  ___________________________________(1/2)
  neq (succ I) I
  ___________________________________(2/2)
  forall n : nats, neq (succ n) n -> neq (succ (succ n)) (succ n)

To prove the first case, Landau used Axiom 3, which states that for all x it holds that x′ ≠ 1. We can just apply this axiom in Coq to prove the first case:

apply ax13.

1 subgoal
  x : nats
  ___________________________________(1/1)
  forall n : nats, neq (succ n) n -> neq (succ (succ n)) (succ n)

The next step is to introduce n as a natural number and the induction hypothesis:


intros n H.

1 subgoal
  x : nats
  n : nats
  H : neq (succ n) n
  ___________________________________(1/1)
  neq (succ (succ n)) (succ n)

We see that this is exactly the second case of Landau's proof. He proved this case with Theorem 1; we do the same:

apply th11.

1 subgoal
  x : nats
  n : nats
  H : neq (succ n) n
  ___________________________________(1/1)
  neq (succ n) n

And of course this is exactly the induction hypothesis, which we already have as an assumption, so we can finish the proof:

assumption.

Proof completed.

The complete theorem and its proof in Coq finally look like this:

Theorem th12 (x:nats) : neq (succ x) x .
Proof.
  intro x. elim x.
  apply ax13.
  intros n H. apply th11. assumption.
Qed.

We also used another hint for translating from the CGa part to the Coq formalisation. When we have a theorem of the following kind:

Theorem th11 (x y:nats) : neq x y -> neq (succ x) (succ y) .

this is equivalent to:

Theorem th11 : forall x y:nats, neq x y -> neq (succ x) (succ y) .

A proof of such a theorem always starts with the introduction of the universally quantified variables, in this case x and y. In terms of Coq this means:

intros x y.

We can do this for every proof. If it is a proof by induction, we can also choose the induction variable in the next step. For example, if we have an induction variable x, we would write:

elim x.

We took the proof skeleton for Coq and extended it with these hints and the straightforward encoding of the 28 theorems. The result can be found in the extended article on the authors' web pages. With the help of these hints we were able to produce 234 lines of correct Coq. The completed proof has 957



lines. In other words, we could automatically generate one fourth of the complete formalised text. This is a large simplification of the formalisation process, even for an expert in Coq, who can then better devote his attention to the important issues of formalisation: the proofs. Of course there are some proofs within this chapter whose translation is not as straightforward as the proof of Theorem 2 given above. But with the help of the CGa annotations and the automatically generated proof skeleton, we completed the Coq proofs of the whole of chapter one in a couple of hours. Moreover, the combination of interpretations and proof skeletons can be implemented so that, for parts of the text, it leads to automatically generated Coq proofs. This will further speed up the formalisation and again remove more of the burden from the user. The complete Coq proof of chapter 1 of Landau's book can again be found in the extended article on the authors' web pages.


MathLang and MPL are long-term projects, and we expect there will be years of design, implementation, and evaluation, followed by repeated redesign, reimplementation, and re-evaluation. There are many areas which we have identified as needing more work and investigation. One area is improvements to the MathLang and MPL software (currently MathLang is based on the TeXmacs editor) to make it easier to enter information for the core MathLang aspects (currently CGa, TSa and DRa). This is likely to include work on semi-automatically recognising the mathematical meaning of natural-language text. A second area is further designing and developing the portions of MathLang and MPL needed for better support of formalisation. An issue here is how much expertise in any particular target proof system will be needed for authoring. It may be possible to arrange things in MathLang and MPL to make it easy for an expert in a proof system to collaborate with an ordinary mathematician in completing a formalisation. A third area where work is needed is the overall evaluation process needed to ensure MathLang and MPL meet actual needs. This will require testing MathLang and MPL with ordinary mathematicians, mathematics students, and other users. And there are additional areas where work will be needed, including areas we have not yet anticipated.

The MathLang and MPL projects aim for a number of outcomes. MathLang aims to support mathematics as practised by the ordinary mathematician, which is generally not formalised, as well as work toward full formalisation. MPL aims to improve the interactive mathematical proof mode for proof assistants so that they can be more user-friendly. We expect that after further improvements to the MathLang and MPL designs and software, writing MathLang documents (without formalising them) will be easy for ordinary mathematicians. MathLang and MPL aim to support various kinds of consistency checking even for non-formalised mathematics.
MathLang and MPL will be independent of any particular logical foundation of mathematics; individual documents will be able to be formal in one or more



particular foundations, or not formalised. MathLang and MPL hope to open a new useful era of collaboration between ordinary mathematicians, logicians (who ordinarily stay apart from other mathematicians), and computer science researchers working in such areas as theorem proving and mathematical knowledge management, who can develop tools to link them together. MathLang's and MPL's document representations are intended to help with various kinds of automated computerised processing of mathematical knowledge. It should be possible to link MathLang and MPL documents together to form a public library of reusable mathematics. MathLang and MPL aim to better support the translation of mathematical texts between natural languages, and multi-lingual texts. They also aim to better support the differing uses of mathematical knowledge by different kinds of people, including ordinary practising mathematicians, students, computer scientists, logicians, linguists, etc.

BIBLIOGRAPHY

[Abbott et al., 1996] J. Abbott, A. van Leeuwen, and A. Strotmann. Objectives of OpenMath. Technical Report 12, RIACA (Research Institute for Applications of Computer Algebra), 1996. The TR archives of RIACA are incomplete; earlier versions of this paper can be found at the "old OpenMath Home Pages" archived at the University of Köln.
[Autexier et al., 2010] Serge Autexier, Petr Sojka, and Masakazu Suzuki. Foreword to the special issue on authoring, digitalization and management of mathematical knowledge. Mathematics in Computer Science, 3(3):225–226, 2010.
[Barendregt et al., 2013] Henk Barendregt, Wil Dekkers, and Richard Statman. Lambda Calculus with Types. Cambridge University Press, 2013.
[Barendregt, 2003] Henk Barendregt. Towards an interactive mathematical proof mode. In Kamareddine [2003], pages 25–36.
[Bundy et al., 1990] Alan Bundy, Frank van Harmelen, Christian Horn, and Alan Smaill. The Oyster-Clam system. In Mark E. Stickel, editor, CADE, volume 449 of Lecture Notes in Computer Science, pages 647–648.
Springer, 1990.
[Cantor, 1895] Georg Cantor. Beiträge zur Begründung der transfiniten Mengenlehre (part 1). Mathematische Annalen, 46:481–512, 1895.
[Cantor, 1897] Georg Cantor. Beiträge zur Begründung der transfiniten Mengenlehre (part 2). Mathematische Annalen, 49:207–246, 1897.
[Cauchy, 1821] Augustin-Louis Cauchy. Cours d'Analyse de l'École Royale Polytechnique. Debure, Paris, 1821. Also in Œuvres Complètes (2), volume III, Gauthier-Villars, Paris, 1897.
[Constable and others, 1986] R. Constable et al. Implementing Mathematics with the Nuprl Proof Development System. Prentice-Hall, 1986.
[Constable, 1995] Robert L. Constable. Using reflection to explain and enhance type theory. In H. Schwichtenberg, editor, Proof and Computation, Computer and System Sciences 139, pages 109–144. Springer, 1995.
[de Bruijn, 1987] N.G. de Bruijn. The mathematical vernacular, a language for mathematics with typed sets. In Workshop on Programming Logic, 1987. Reprinted in [Nederpelt et al., 1994, F.3].
[Dedekind, 1872] Richard Dedekind. Stetigkeit und irrationale Zahlen. Vieweg & Sohn, Braunschweig, 1872. Fourth edition published in 1912.
[Frege, 1879] Gottlob Frege. Begriffsschrift: eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Nebert, Halle, 1879. Can be found on pp. 1–82 in [van Heijenoort, 1967].
[Frege, 1893] Gottlob Frege. Grundgesetze der Arithmetik, volume 1. Hermann Pohle, Jena, 1893. Republished 1962 (Olms, Hildesheim).
[Frege, 1903] Gottlob Frege. Grundgesetze der Arithmetik, volume 2. Hermann Pohle, Jena, 1903. Republished 1962 (Olms, Hildesheim).



[Gierz et al., 1980] G. Gierz, K. H. Hofmann, K. Keimel, J. D. Lawson, M. W. Mislove, and D. S. Scott. A Compendium of Continuous Lattices. Springer-Verlag, 1980.
[Gordon and Melham, 1993] M. Gordon and T. Melham. Introduction to HOL: A theorem proving environment for higher order logic. Cambridge University Press, 1993.
[Heath, 1956] Thomas L. Heath. The 13 Books of Euclid's Elements. Dover, 1956. In 3 volumes. Sir Thomas Heath originally published this in 1908.
[Kamareddine and Nederpelt, 2004] Fairouz Kamareddine and Rob Nederpelt. A refinement of de Bruijn's formal language of mathematics. J. Logic Lang. Inform., 13(3):287–340, 2004.
[Kamareddine and Wells, 2008] Fairouz Kamareddine and J. B. Wells. Computerizing mathematical text with MathLang. Electron. Notes Theor. Comput. Sci., 205:5–30, 2008.
[Kamareddine et al., 2004a] Fairouz Kamareddine, Twan Laan, and Rob Nederpelt. A Modern Perspective on Type Theory from Its Origins Until Today, volume 29 of Kluwer Applied Logic Series. Kluwer Academic Publishers, May 2004.
[Kamareddine et al., 2004b] Fairouz Kamareddine, Manuel Maarek, and J. B. Wells. Flexible encoding of mathematics on the computer. In Mathematical Knowledge Management, 3rd Int'l Conf., Proceedings, volume 3119 of Lecture Notes in Computer Science, pages 160–174. Springer, 2004.
[Kamareddine et al., 2004c] Fairouz Kamareddine, Manuel Maarek, and J. B. Wells. MathLang: Experience-driven development of a new mathematical language. In Proc. [MKMNET] Mathematical Knowledge Management Symposium, volume 93 of ENTCS, pages 138–160, Edinburgh, UK (2003-11-25/29), February 2004. Elsevier Science.
[Kamareddine et al., 2006] Fairouz Kamareddine, Manuel Maarek, and J. B. Wells. Toward an object-oriented structure for mathematical text. In Mathematical Knowledge Management, 4th Int'l Conf., Proceedings, volume 3863 of Lecture Notes in Artificial Intelligence, pages 217–233. Springer, 2006.
[Kamareddine et al., 2007a] Fairouz Kamareddine, Robert Lamar, Manuel Maarek, and J. B. Wells. Restoring natural language as a computerised mathematics input method. In MKM '07 [2007], pages 280–295.
[Kamareddine et al., 2007b] Fairouz Kamareddine, Manuel Maarek, Krzysztof Retel, and J. B. Wells. Gradual computerisation/formalisation of mathematical texts into Mizar. In Roman Matuszewski and Anna Zalewska, editors, From Insight to Proof: Festschrift in Honour of Andrzej Trybulec, volume 10(23) of Studies in Logic, Grammar and Rhetoric, pages 95–120. University of Bialystok, 2007. Under the auspices of the Polish Association for Logic and Philosophy of Science.
[Kamareddine et al., 2007c] Fairouz Kamareddine, Manuel Maarek, Krzysztof Retel, and J. B. Wells. Narrative structure of mathematical texts. In MKM '07 [2007], pages 296–311.
[Kamareddine, 2003] Fairouz Kamareddine, editor. Thirty Five Years of Automating Mathematics, volume 28 of Kluwer Applied Logic Series. Kluwer Academic Publishers, November 2003.
[Kanahori et al., 2006] Toshihiro Kanahori, Alan Sexton, Volker Sorge, and Masakazu Suzuki. Capturing abstract matrices from paper. In Mathematical Knowledge Management, 5th Int'l Conf., Proceedings, volume 4108 of Lecture Notes in Computer Science, pages 124–138. Springer, 2006.
[Kohlhase, 2006] Michael Kohlhase. An Open Markup Format for Mathematical Documents, OMDoc (Version 1.2), volume 4180 of Lecture Notes in Artificial Intelligence. Springer-Verlag, 2006.
[Lamar, 2011] Robert Lamar. A Partial Translation Path from MathLang to Isabelle. PhD thesis, Heriot-Watt University, Edinburgh, Scotland, May 2011.
[Landau, 1930] Edmund Landau. Grundlagen der Analysis. Chelsea, 1930.
[Landau, 1951] Edmund Landau. Foundations of Analysis. Chelsea, 1951. Translation of [Landau, 1930] by F. Steinhardt.
[Maarek, 2007] Manuel Maarek. Mathematical Documents Faithfully Computerised: the Grammatical and Text & Symbol Aspects of the MathLang Framework. PhD thesis, Heriot-Watt University, Edinburgh, Scotland, June 2007.
[MKM '07, 2007] Towards Mechanized Mathematical Assistants (Calculemus 2007 and MKM 2007 Joint Proceedings), volume 4573 of Lecture Notes in Artificial Intelligence. Springer, 2007.



[Nederpelt et al., 1994] Rob Nederpelt, J. H. Geuvers, and Roel C. de Vrijer. Selected Papers on Automath, volume 133 of Studies in Logic and the Foundations of Mathematics. North-Holland, Amsterdam, 1994.
[Nederpelt, 2002] Rob Nederpelt. Weak Type Theory: a formal language for mathematics. Technical Report 02-05, Eindhoven University of Technology, 2002.
[Nipkow et al., 2002] Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. Isabelle/HOL — A Proof Assistant for Higher-Order Logic, volume 2283 of LNCS. Springer-Verlag, 2002.
[Peano, 1889] Giuseppe Peano. Arithmetices Principia, Nova Methodo Exposita. Bocca, Turin, 1889. An English translation can be found on pp. 83–97 in [van Heijenoort, 1967].
[Retel, 2009] Krzysztof Retel. Gradual Computerisation and Verification of Mathematics: MathLang's Path into Mizar. PhD thesis, Heriot-Watt University, Edinburgh, Scotland, April 2009.
[Rudnicki, 1992] P. Rudnicki. An overview of the Mizar project. In Proceedings of the 1992 Workshop on Types for Proofs and Programs, 1992.
[Sexton and Sorge, 2006] Alan Sexton and Volker Sorge. The ellipsis in mathematical documents. Talk overhead images presented at the IMA (Institute for Mathematics and its Applications, University of Minnesota) "Hot Topic" Workshop The Evolution of Mathematical Communication in the Age of Digital Libraries, held on 2006-12-08/09, 2006.
[Siekmann et al., 2002] Jörg H. Siekmann, Christoph Benzmüller, Vladimir Brezhnev, Lassaad Cheikhrouhou, Armin Fiedler, Andreas Franke, Helmut Horacek, Michael Kohlhase, Andreas Meier, Erica Melis, Markus Moschner, Immanuel Normann, Martin Pollet, Volker Sorge, Carsten Ullrich, Claus-Peter Wirth, and Jürgen Zimmer. Proof development with Omega. In Andrei Voronkov, editor, CADE, volume 2392 of Lecture Notes in Computer Science, pages 144–149. Springer, 2002.
[Siekmann et al., 2003] Siekmann, Benzmüller, Fiedler, Meier, Normann, and Pollet. Proof development with Ωmega: The irrationality of √2. In Kamareddine [2003], pages 271–314.
[Team, 1999–2003] Coq Development Team. The Coq proof assistant reference manual. INRIA, 1999–2003.
[van Benthem Jutting, 1977a] Lambert S. van Benthem Jutting. Checking Landau's "Grundlagen" in the AUTOMATH System. PhD thesis, Eindhoven, 1977. Partially reprinted in [Nederpelt et al., 1994, B.5, D.2, D.3, D.5, E.2].
[van Benthem Jutting, 1977b] Lambert S. van Benthem Jutting. Checking Landau's "Grundlagen" in the AUTOMATH system. PhD thesis, Eindhoven, 1977.
[van der Hoeven, 2004] Joris van der Hoeven. GNU TeXmacs. SIGSAM Bulletin, 38(1):24–25, 2004.
[van Heijenoort, 1967] J. van Heijenoort. From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931. Harvard University Press, 1967.
[W3C, 2003] W3C. Mathematical markup language (MathML) version 2.0. W3C Recommendation, October 2003. W3C (World Wide Web Consortium).
[W3C, 2004] W3C. RDF Primer. W3C Recommendation, February 2004. W3C (World Wide Web Consortium).
[W3C, 2007] W3C. XQuery 1.0 and XPath 2.0 data model (XDM). W3C Recommendation, 2007. W3C (World Wide Web Consortium).
[Whitehead and Russell, 1910–1913] Alfred North Whitehead and Bertrand Russell. Principia Mathematica. Cambridge University Press, 1910–1913. In three volumes published from 1910 through 1913. Second edition published from 1925 through 1927. Abridged edition published in 1962.
[Wiedijk, 2006] F. Wiedijk, editor. The Seventeen Provers of the World, foreword by Dana S. Scott, volume 3600 of LNCS. Springer, Berlin/Heidelberg, 2006.
[Zengler, 2008] Christoph Zengler. Research report. Technical report, Heriot-Watt University, November 2008.
[Zermelo, 1908] Ernst Zermelo. Untersuchungen über die Grundlagen der Mengenlehre (part 1). Mathematische Annalen, 65:261–281, 1908. An English translation can be found on pp. 199–215 in [van Heijenoort, 1967].

Lemma main : (WCR R) -> (IND R (cr R)).
Proof.
  Intros. Unfold IND. Intro a. Intro IH. Unfold cr. Intuition.
  Assert (a=b1 \/ (TC R a b1)) /\ (a=b2 \/ (TC R a b2)).
  Split. Apply p7. Assumption. Apply p7. Assumption.
  Tactic Definition Get x := Elim x; Intros; Clear x.
  Get H0; Get H3. Exists b2. Split. Rewrite ...





Figure 3. Sets as graphs

• the converse, i.e. for all t′ ∈ f(t) there is s′ ∈ f(s) such that s′ R t′.

A reflexive and symmetric relation R is f-conservative if R ⊆ F(R); it is f-admissible if it is a fixed point of F, i.e., R = F(R). The authors note that F is monotone over a complete lattice, hence it has a greatest fixed point (the largest f-admissible relation). They also prove that this greatest fixed point can be obtained as the union over all f-conservative relations (the coinduction proof principle), and also, inductively, as the limit of a decreasing sequence of relations over the ordinals that starts with the universal relation A × A. The main difference between f-conservative relations and today's bisimulations is that the former are required to be reflexive and symmetric. However, while the bisimulation proof method is introduced, as derived from the theory of fixed points, it remains rather hidden in Forti and Honsell's works, whose main goal is to prove the consistency of anti-foundation axioms; for this, the main technique uses the f-admissible relations.

Aczel reformulates Forti and Honsell's anti-foundation axiom X1. In Forti and Honsell [1983], the axiom says that from every relational structure there is a unique homomorphism onto a transitive set (a relational structure is a set equipped with a relation on its elements; a set A is transitive if each set B that is an element of A has the property that all the elements of B also belong to A; that is, all composite elements of A are also subsets of A). Aczel calls the axiom AFA and expresses it with the help of graph theory, in terms of graphs whose nodes are decorated with sets.
For this, sets are thought of as (pointed) graphs, where the nodes represent sets, the edges represent the converse membership relation (e.g., an edge from a node x to a node y indicates that the set represented by y is a member of the set represented by x), and the root of the graph indicates the starting point, that is, the node that represents the set under consideration. For instance, the sets {∅, {∅}} and D = {∅, {D}} naturally correspond to the graphs of Figure 3 (where for convenience nodes are named), with nodes 2 and c being the roots. The graphs for the well-founded sets are those without infinite paths or cycles, such as the graph on the left in Figure 3.

AFA essentially states that each graph represents a unique set. This is formalised via the notion of decoration. A decoration for a graph is an assignment of sets to nodes that respects the structure of the edges; that is, the set assigned to a node is equal to the set of the sets assigned to the children of the node. For instance, the decoration for the graph on the left of Figure 3 assigns ∅ to node 0, {∅} to node 1, and {∅, {∅}} to node 2, whereas that


Jos C. M. Baeten and Davide Sangiorgi

for the graph on the right assigns ∅ to a, {D} to b, and {∅, {D}} to c. Axiom AFA stipulates that every graph has a unique decoration. (In Aczel, the graph plays the role of the relational structure in Forti and Honsell, and the decoration the role of the homomorphism onto a transitive set.) Two facts are important here: the existence of the decoration, and its uniqueness. The former tells us that the non-well-founded sets we need do exist. The latter tells us what equality is for them: two sets are equal if they can be assigned to the same node of a graph. For instance, the sets Ω, A and B mentioned at the beginning of this section are equal, because the two-node graph in which each node has an edge to the other has a decoration in which both nodes receive Ω, and another decoration in which the node on the left receives A and that on the right B.

Bisimulation comes out when one tries to extract the meaning of equality. A bisimulation relates sets A and B such that

• for all A1 ∈ A there is B1 ∈ B with A1 and B1 related, and conversely for the elements of B.

Two sets are equal precisely if there is a bisimulation relating them. The bisimulation proof method can then be used to prove equalities between sets, for instance the equality between the sets A and B above. Aczel formulates AFA towards the end of 1983; he does not publish it immediately, having by then discovered the earlier work of Forti and Honsell and the equivalence between AFA and X1. Instead, he goes on to develop the theory of non-well-founded sets, mostly through a series of lectures at Stanford between January and March 1985, which leads to the book [Aczel, 1988]. Aczel shows how to use the bisimulation proof method to prove equalities between non-well-founded sets, and develops a theory of coinduction that sets the basis for the coalgebraic approach to semantics (Final Semantics). Up to Aczel's book [Aczel, 1988], all the work on non-well-founded sets had remained outside the mainstream.
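The bisimulation clause on sets stated above can be written out directly. The following Coq sketch is our own rendering (not Aczel's): graphs are encoded by an abstract type of nodes together with an edge relation, where edge x y means that the set represented by y is a member of the set represented by x.

```coq
(* Sketch: bisimulations on sets-as-graphs; node and edge are our own
   hypothetical encoding of pointed graphs. *)
Section Bisimulation.
  Variable node : Type.
  Variable edge : node -> node -> Prop.

  (* R is a bisimulation if related nodes can match each other's edges. *)
  Definition bisimulation (R : node -> node -> Prop) : Prop :=
    forall a b, R a b ->
      (forall a', edge a a' -> exists b', edge b b' /\ R a' b') /\
      (forall b', edge b b' -> exists a', edge a a' /\ R a' b').
End Bisimulation.
```

Under AFA, two nodes represent the same set precisely when some bisimulation relates them.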
This changes with Aczel's book, for two main reasons: the elegant theory that he develops, and the concrete motivations for studying non-well-founded sets that he brings up, namely the mathematical foundations of processes, prompted in this by the work of Milner on CCS and his way of equating processes with an infinite behaviour via a bisimulation quotient.

4


At this point we switch our attention from bisimulation and coinduction to process calculi. Process calculi start from a syntax, a language describing the objects of interest, elements of concurrent behavior and how they are put together. In this section the history of process calculi is traced back to the early seventies of the twentieth century, and developments since that time are sketched.

Concurrency Theory


The word ‘process’ refers to the discrete behavior of agents, as discussed in Section 2. The word ‘calculus’ refers to doing calculations with processes, in order to calculate a property of a process, or to prove that processes are equal. We sketch the state of research in the early seventies, and state which breakthroughs were needed in order for the theories to appear. We consider the development of CCS, CSP and ACP. In Section 4.7, we sketch the main developments since then.

The calculations are based on a basic set of laws that are established or postulated for processes. These laws are usually stated in the form of an algebra, using techniques and results from mathematical universal algebra (see e.g. [MacLane and Birkhoff, 1967]). To emphasize the algebraic basis, the term ‘process algebra’ is often used instead of ‘process calculus’. Strictly speaking, a process algebra only uses laws stated in the form of an algebra, while a process calculus can also use laws that involve multiple sorts and binding variables, thus going outside the realm of universal algebra. A process calculus can start from a given syntax (set of operators) and try to find the laws concerning these operators that hold in a given semantical domain, while a process algebra can start from a given syntax and a set of laws or axioms concerning these operators, and next consider all the different semantical domains where these laws hold. Comparing process calculi that have different semantical domains works best by considering the sets of laws that they satisfy [Glabbeek, 1990a]. On the basis of the set of laws or axioms, we can calculate, i.e. perform equational reasoning. By way of comparison, calculations with automata can be done by means of the algebra of regular expressions (see e.g. [Linz, 2001]). Since a process calculus addresses interaction, agents acting in parallel, a process calculus will usually (but not necessarily) have a form of parallel composition as a basic operator.

To repeat, the study of process calculi is the study of the behavior of parallel or distributed systems based on a set of (algebraic) laws. It offers means to describe or specify such systems, and thus it has means to talk about parallel composition. Besides this, it can usually also talk about alternative composition (choice) and a form of sequential composition (sequencing). By means of calculation, we can do verification, i.e. we can establish that a system satisfies a certain property.

What are these basic laws of process algebra? We can list some that can be called structural laws. We start out from a given set of atomic actions, and use the basic operators to compose these into more complicated processes. We use notation from [Baeten et al., 2010], which unifies the literature on process calculi. As basic operators, we use + denoting alternative composition, · denoting sequential composition and ∥ denoting parallel composition. Usually, there are also neutral elements for some or all of these operators, but we do not consider these yet. Some basic laws are the following (+ binding weakest, · binding strongest):

• x + y = y + x (commutativity of alternative composition)

• x + (y + z) = (x + y) + z (associativity of alternative composition)
• x + x = x (idempotence of alternative composition)



• (x + y) · z = x · z + y · z (right distributivity of + over ·)
• (x · y) · z = x · (y · z) (associativity of sequential composition)
• x ∥ y = y ∥ x (commutativity of parallel composition)
• (x ∥ y) ∥ z = x ∥ (y ∥ z) (associativity of parallel composition)

These laws list some general properties of the operators involved. Note that there is a law stating the right distributivity of + over ·, but no law of left distributivity. Adding the left distributivity law leads to a so-called linear-time theory. Usually, left distributivity is absent, and we speak of a branching-time theory, where the moment of choice is relevant. We can see that there is a law connecting alternative and sequential composition; in some cases, other connections are considered. On the other hand, we list no law connecting parallel composition to the other operators. It turns out that such a connection is at the heart of process algebra, and it is the tool that makes calculation possible. In most process calculi, this law allows one to express parallel composition in terms of the other operators, and is called an expansion theorem. Process calculi with an expansion theorem are called interleaving process calculi; those without (such as a calculus of Petri nets) are called partial-order or true-concurrency calculi. For a discussion concerning this dichotomy, see [Baeten, 1993]. So we can say that any mathematical structure with three binary operations satisfying these seven laws is a process algebra. Most often, these structures are formulated in terms of automata-like models, namely the labeled transition systems of Definition 1. The notion of equivalence studied is usually not language equivalence; prominent among the equivalences studied is bisimulation, as discussed in the previous section. Strictly speaking, the study of labeled transition systems, ways to define them, and equivalences on them is not part of a process calculus.
We can use the term process theory as a wider notion that also encompasses semantical issues. Below, we describe the history of process algebra from the early seventies to the early eighties by focusing on the central people involved. By the early eighties, process algebra was established as a separate area of research. Subsection 4.7 will consider the main extensions from the early eighties until the present time.
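As an aside, the seven structural laws above are easy to state formally. The following Coq sketch is our own rendering, with alt, seq and par standing for +, · and ∥ over an abstract carrier proc:

```coq
(* Sketch: the seven structural laws over an abstract carrier. *)
Parameter proc : Type.
Parameter alt seq par : proc -> proc -> proc.  (* +, . and || *)

Axiom alt_comm   : forall x y : proc, alt x y = alt y x.
Axiom alt_assoc  : forall x y z : proc, alt x (alt y z) = alt (alt x y) z.
Axiom alt_idem   : forall x : proc, alt x x = x.
Axiom seq_rdistr : forall x y z : proc, seq (alt x y) z = alt (seq x z) (seq y z).
Axiom seq_assoc  : forall x y z : proc, seq (seq x y) z = seq x (seq y z).
Axiom par_comm   : forall x y : proc, par x y = par y x.
Axiom par_assoc  : forall x y z : proc, par (par x y) z = par x (par y z).
```

Any structure satisfying these axioms is a process algebra in the sense above; note the deliberate absence of a left-distributivity law for · over +, and of any law relating par to the other operators.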



One of the people studying the semantics of parallel programs in the early seventies was Hans Bekič. He was born in 1936, and died in a mountain accident in 1982. In the period we are speaking of, he worked at the IBM lab in Vienna, Austria. The lab was well known in the sixties and seventies for its work on the definition and semantics of programming languages, and Bekič played a part in this, working on the denotational semantics of ALGOL and PL/I. Growing out of his work on PL/I, the problem arose of how to give a denotational semantics for



parallel composition. Bekič tackled this problem in [Bekič, 1971]. This internal report, and indeed all the work of Bekič, is made accessible to us through the work of Cliff Jones [Bekič, 1984]; on this book, we base the following remarks. In [Bekič, 1971], Bekič addresses the semantics of what he calls "quasi-parallel execution of processes". From the introduction, we quote:

Our plan to develop an algebra of processes may be viewed as a high-level approach: we are interested in how to compose complex processes from simpler (still arbitrarily complex) ones.

Bekič uses global variables, so a state ξ is a valuation of variables, and a program determines an action A, which gives, in a state (non-deterministically), either null iff it is an end-state, or an elementary step f, giving a new state fξ and rest-action A′. Further, there are ⊔ and cases denoting alternative composition, ; denoting sequential composition, and // denoting (quasi-)parallel composition. On page 183 of [Bekič, 1984], we see the following law for quasi-parallel composition:

(A//B)ξ = (cases Aξ : null → Bξ
                      (f, A′) → f, (A′//B))
        ⊔ (cases Bξ : null → Aξ
                      (g, B′) → g, (A//B′))
and this is called the "unspecified merging" of the elementary steps of A and B. This is definitely a precursor of what would later be called the expansion law of process calculi. It also makes explicit that Bekić has made the first paradigm shift: the next step in a merge is not determined, so we have abandoned the idea of a program as a function.

The book [Bekić, 1984] goes on with clarifications of [Bekić, 1971] from a lecture in Amsterdam in 1972. Here, Bekić states that an action is tree-like and behaves like a scheduler, so that for instance f ; (g ⊔ h) is not the same as (f ; g) ⊔ (f ; h) for elementary steps f, g, h, another example of non-functional behavior. In a letter to Peter Lucas from 1975, Bekić is still struggling with his notion of an action, and writes:

These actions still contain enough information so that the normal operations can be defined between them, but on the other hand little enough information to fulfil certain desirable equivalences, such as:

   a; 0 = a
   a; (b; c) = (a; b); c
   a//b = b//a

etc.


Jos C. M. Baeten and Davide Sangiorgi

In a lecture on this material in 1974 in Newcastle, Bekić has changed the notation // to ‖ and calls the operator parallel composition. In giving the equations, we even encounter a "left-parallel" operator, with laws, having the same meaning that Bergstra and Klop would later give to their left-merge operator [Bergstra and Klop, 1982]. Concluding, we can say that Bekić contributed a number of basic ingredients to the emergence of process algebra, but we see no coherent comprehensive theory yet.



The central person in the history of process calculi is, without a doubt, Robin Milner; we already mentioned his relevance for concurrency theory in Section 3.4, discussing bisimulation. A.J.R.G. Milner, who was born in 1934 and died in 2010, developed his process theory CCS (Calculus of Communicating Systems) over the years 1973 to 1980, culminating in the publication of the book [Milner, 1980] in 1980.

The oldest publications concerning the semantics of parallel composition are [Milner, 1973; Milner, 1975], formulated within the framework of denotational semantics, using so-called transducers. He considers the problems caused by non-terminating programs, with side effects, and non-determinism. He uses the operations ∗ for sequential composition, ? for alternative composition and ‖ for parallel composition. He refers to [Bekić, 1971] as related work.

Next, chronologically, are the articles [Milner, 1979; Milne and Milner, 1979]. Here, Milner introduces flow graphs, with ports, where a named port synchronizes with the port carrying its co-name. Operators are | for parallel composition, restriction and relabeling. The symbol ‖ is now reserved for restricted parallel composition. Structural laws are stated for these operators.

The following two papers are [Milner, 1978a; Milner, 1978b], putting in place most of CCS as we know it. The operators prefixing and alternative composition are added and provided with laws. Synchronization trees are used as a model. The prefix τ occurs as a communication trace (what remains of a synchronization of a name and a co-name). The paradigm of message passing is taken over from [Hoare, 1978]. Interleaving is introduced as the observation of a single observer of a communicating system, and the expansion law is stated. Sequential composition is not a basic operator, but a derived one, using communication, abstraction and restriction.
The paper [Hennessy and Milner, 1980], with Matthew Hennessy, formulates basic CCS, with observational equivalence and strong equivalence defined inductively. Also, so-called Hennessy-Milner logic is introduced, which provides a logical characterization of process equivalence. Next, the book [Milner, 1980] is the standard process calculus reference. Here we have for the first time in history a complete process calculus, with a set of equations and a semantical model. He presents the equational laws as truths about his chosen semantical domain, rather



than considering the laws as primary, and investigating the range of models that they have. We pointed out in Section 3.4 that an important contribution, realized just after the appearance of [Milner, 1980], is the formulation of bisimulation. This became a central notion in process theory subsequently. The book [Milner, 1980] was later updated in [Milner, 1989]. A related development is the birth of structural operational semantics in [Plotkin, 1981]. More can be read about this in the historical paper [Plotkin, 2004b].

To recap, CCS has the following syntax:

• A constant 0 that is the neutral element of alternative composition, the process that shows no behavior. It is the seed process from which other processes can be constructed using the operators.

• For each action a from a given set of actions A, the action prefix operator a. , which prefixes a given process with the action a: after execution of a, the process continues. This is a restricted form of sequential composition.

• Alternative composition +. It is important to note that the choice between the given alternatives is made by the execution of an action from one of them, thereby discarding the alternative, not before. As a consequence, there is the law x + 0 = x, as 0 is an alternative that cannot be chosen.

• Parallel composition |. In a parallel composition x | y, an action from x or from y can be executed, or they can jointly execute a communication action. The set of actions A is divided into a set of names and a set of co-names (for each name a there is a co-name ā). The joint execution of a name and its corresponding co-name results in the execution of the special communication action τ. The action prefix operator τ. has a special set of laws called the τ-laws, which allow one to eliminate the τ action in a number of cases. Thus, the parallel composition operator does two things at a time: it allows a communication, and hides the result of the communication in a number of cases (a form of abstraction).

• Recursion or fixed point construction. If P is a process expression possibly containing the variable x, then µx.P is the smallest fixed point of P, the process showing the least behavior satisfying the equation µx.P = P[µx.P/x], where the last construct is the process expression P with all occurrences of the variable x replaced by µx.P. The notions "smallest" and "least" refer to the fact that only behavior is included that can be inferred from the behavior of P and this equation. This construct is used to define processes that can execute an unrestricted number of actions, a so-called reactive process. In later work, Milner does not use binding of variables, but instead sees the fixed point as a new constant X, whose behavior is given by the recursive equation X = P.



• Restriction or encapsulation ∂H, where H is a set of names and their corresponding co-names, blocks execution of the actions in H. By blocking execution of the names and co-names, but always allowing τ, communication in a parallel composition can be enforced. CCS uses a different notation for this operator.

• Relabeling or renaming ρf, where f is a function on actions that preserves the co-name relation and does not rename τ. This operator is useful to obtain different instances of some generically defined process. CCS uses a different notation for this operator.
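To make the operational reading of these operators concrete, here is a small illustrative sketch (our own encoding, not from the chapter; the tuple representation and function names are ours) of a CCS-style transition relation covering 0, action prefix, + and |, where a name synchronizing with its co-name yields τ:

```python
# Illustrative sketch of CCS-style transitions.
# Processes: ('nil',), ('act', a, P), ('sum', P, Q), ('par', P, Q).
# Co-names carry a trailing '~'; 'tau' is the silent action.

def co(name):
    """Co-name of a name: a <-> a~."""
    return name[:-1] if name.endswith('~') else name + '~'

def transitions(p):
    """All pairs (action, successor process) that p can perform."""
    kind = p[0]
    if kind == 'nil':                       # 0: no behavior
        return []
    if kind == 'act':                       # a.P  --a-->  P
        return [(p[1], p[2])]
    if kind == 'sum':                       # choice resolved by executing an action
        return transitions(p[1]) + transitions(p[2])
    if kind == 'par':                       # interleaving plus synchronization
        left, right = p[1], p[2]
        steps = [(a, ('par', q, right)) for a, q in transitions(left)]
        steps += [(a, ('par', left, q)) for a, q in transitions(right)]
        for a, q1 in transitions(left):     # name meets co-name: communication tau
            for b, q2 in transitions(right):
                if a != 'tau' and b == co(a):
                    steps.append(('tau', ('par', q1, q2)))
        return steps
    raise ValueError(kind)

NIL = ('nil',)
# a.0 | a~.0 can do a, a~, or the synchronization tau (the expansion law).
labels = {a for a, _ in transitions(('par', ('act', 'a', NIL), ('act', 'a~', NIL)))}
```

Enumerating transitions this way lets one check, for instance, that x + 0 has exactly the transitions of x, matching the law x + 0 = x.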



A very important contributor to the development of process calculi is Tony Hoare. C.A.R. Hoare, born in 1934, published the influential paper [Hoare, 1978] as a technical report in 1976. The important step is that he does away completely with global variables, and adopts the message-passing paradigm of communication, thus realizing the second paradigm shift. The language CSP (Communicating Sequential Processes) described in [Hoare, 1978] has synchronous communication and is a guarded command language (based on [Dijkstra, 1975]). No model or semantics is provided. This paper inspired Milner to treat message passing in CCS in the same way.

A model for CSP was elaborated in [Hoare, 1980]. This is a model based on trace theory, i.e. on the sequences of actions a process can perform. Later on, it was found that this model was lacking, for instance because deadlock behavior is not preserved. For this reason, a new model based on failure pairs was presented in [Brookes et al., 1984] for the language that was then called TCSP (Theoretical CSP). Later, TCSP was called CSP again. Some time later it was established that the failures model is the least discriminating model that preserves deadlock behavior (see e.g. [Glabbeek, 2001]). In the language, due to the presence of two alternative composition operators, it is possible to do without a silent step like τ altogether. The book [Hoare, 1985] gives a good overview of CSP.

Between CCS and CSP, there is some debate concerning the nature of alternative composition. Some say the + of CCS is difficult to understand ("the weather of Milner"), and CSP proposes to distinguish between internal and external non-determinism, using two separate operators. See also [Hennessy, 1988].

The syntax of CSP from [Hoare, 1985] comprises:

• A constant called STOP that acts like the 0 of CCS, but also a constant called SKIP (that we call 1) that is the neutral element of sequential composition. Thus, 0 stands for unsuccessful termination and 1 for successful termination. Neither process executes any action.

• Action prefix operators a. as in CCS. There is no τ prefixing.



• CSP has two alternative composition operators: ⊓ denoting non-deterministic or internal choice, and □ denoting external choice. The internal choice operator denotes a non-deterministic choice that cannot be influenced by the environment (other processes in parallel) and can simply be defined in CCS terms as follows: x ⊓ y = τ.x + τ.y. The external choice operator is not so easily defined in terms of CCS (see [Glabbeek, 1986]). It denotes a choice that can be influenced by the environment. If the arguments of the operator have initial silent non-determinism, then these τ-steps can be executed without making a choice, and the choice will be made as soon as a visible (non-τ) action occurs. Because of the presence of two choice operators and a semantics that equates more processes than bisimilarity, all silent actions that might occur in a process expression can be removed [Bergstra et al., 1987].

• There is action prefixing as in CCS, but also full sequential composition.

• The parallel composition operator of CSP allows interleaving but also synchronization on the same name, so that execution of an action a by both components results in a communication action again named a. This enables multi-way synchronization.

• Recursion is handled as in CCS.

• There is a concealment or abstraction operator that renames a set of actions into τ. The τ actions so introduced can subsequently be removed from an expression.
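Why the trace model is "lacking" and internal choice matters can be made concrete with a small sketch (our own encoding, not from the chapter): external choice over a.0 and b.0 and internal choice, encoded CCS-style as τ.a.0 + τ.b.0, have the same traces, yet only the internal choice can refuse an environment that offers just a:

```python
# Illustrative sketch: trace-equivalent processes that differ on deadlock.
# Processes: ('nil',), ('act', a, P), ('sum', P, Q); 'tau' is the silent action.

def transitions(p):
    if p[0] == 'nil':
        return []
    if p[0] == 'act':
        return [(p[1], p[2])]
    return transitions(p[1]) + transitions(p[2])    # 'sum'

def traces(p):
    """All finite visible traces; tau steps are invisible."""
    result = {()}
    for a, q in transitions(p):
        for t in traces(q):
            result.add(t if a == 'tau' else (a,) + t)
    return result

def can_refuse(p, offer):
    """Can p silently reach a stable state where no offered action is enabled?"""
    taus = [q for a, q in transitions(p) if a == 'tau']
    if not taus:  # stable state: refusal iff nothing offered is enabled
        return all(a not in offer for a, _ in transitions(p))
    return any(can_refuse(q, offer) for q in taus)

NIL = ('nil',)
EXT = ('sum', ('act', 'a', NIL), ('act', 'b', NIL))         # external choice
INT = ('sum', ('act', 'tau', ('act', 'a', NIL)),
              ('act', 'tau', ('act', 'b', NIL)))            # internal choice
```

Here `traces(EXT) == traces(INT)`, but `can_refuse(INT, {'a'})` holds while `can_refuse(EXT, {'a'})` does not: the internal choice may commit to the b branch and deadlock, which is exactly the distinction the failures model records and trace semantics loses.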


Some Other Process Calculi

Around 1980, concurrency theory, and in particular process theory, is a vibrant field with a lot of activity worldwide. We already mentioned research on Petri nets, which is an active area [Petri, 1980]. Another partial-order process theory is given in [Mazurkiewicz, 1977]. Research on temporal logic has started, see e.g. [Pnueli, 1977].

Some other process calculi can be mentioned. We already remarked that Hoare investigated trace theory. More work was done in this direction, e.g. by Rem [Rem, 1983]. There is also the invariants calculus [Apt et al., 1980]. Another process theory that should be mentioned is the metric approach by De Bakker and Zucker [Bakker and Zucker, 1982a; Bakker and Zucker, 1982b]. There is a notion of distance between processes: processes that do not differ in behavior before the n-th step have a distance of at most 2⁻ⁿ. This turns the domain of processes into a metric space, which can be completed, and solutions to guarded recursive equations (a type of well-behaved recursive equations) exist by application of Banach's fixed point theorem [Banach, 1922].
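A much simplified illustration of this metric (our own, not from the chapter: we identify a process with a single finite sequence of actions rather than a tree of behaviors) shows why guarded equations have unique solutions: the distance is 2⁻ⁿ for n the length of the longest common prefix, and prefixing both behaviors with an action halves the distance, so a guarded equation is a contraction and Banach's theorem applies.

```python
# Simplified De Bakker-Zucker-style distance on behaviors, here modeled
# as finite sequences of actions (illustrative; real processes are trees).

def distance(p, q):
    """2**(-n), with n the length of the longest common prefix; 0 if equal."""
    n = 0
    while n < len(p) and n < len(q) and p[n] == q[n]:
        n += 1
    if n == len(p) and n == len(q):
        return 0.0
    return 2.0 ** (-n)

# Prefixing with the same action halves the distance:
d_unguarded = distance(['b'], ['c'])            # differ at the first step
d_guarded = distance(['a', 'b'], ['a', 'c'])    # agree on the first step
```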





Jan Bergstra and Jan Willem Klop in 1982 started work on a question of De Bakker as to what can be said about solutions of unguarded recursive equations. As a result, they wrote the paper [Bergstra and Klop, 1982]. In this paper, the phrase "process algebra" is used for the first time. We quote:

A process algebra over a set of atomic actions A is a structure A = ⟨A, +, ·, ⌊⌊, aᵢ (i ∈ I)⟩ where A is a set containing A, the aᵢ are constant symbols corresponding to the aᵢ ∈ A, and + (union), · (concatenation or composition, left out in the axioms), ⌊⌊ (left merge) satisfy for all x, y, z ∈ A and a ∈ A the following axioms:

   PA1   x + y = y + x
   PA2   x + (y + z) = (x + y) + z
   PA3   x + x = x
   PA4   (xy)z = x(yz)
   PA5   (x + y)z = xz + yz
   PA6   (x + y)⌊⌊z = x⌊⌊z + y⌊⌊z
   PA7   ax⌊⌊y = a(x⌊⌊y + y⌊⌊x)
   PA8   a⌊⌊y = ay
This clearly establishes a process calculus in the framework of universal algebra. In the paper, process algebra was defined with alternative, sequential and parallel composition, but without communication. A model was established based on projective sequences (a process is given by a sequence of approximations by finite terms), and in this model it is established that all recursive equations have a solution. In adapted form, this paper was later published as [Bergstra and Klop, 1992]. In [Bergstra and Klop, 1984b], this process algebra PA was extended with communication to yield the theory ACP (Algebra of Communicating Processes). The book [Baeten and Weijland, 1990] gives an overview of ACP.

The syntax of ACP comprises:

• A constant 0 denoting inaction, as in CCS and CSP (in ACP written δ).

• A set of actions A, each element denoting a constant in the syntax. Expressed in terms of prefixing and successful termination, each such constant can be denoted as a.1: execution of a followed by successful termination. This lumping together of two notions causes problems when the theory is extended with explicit timing, see [Baeten, 2003].

• Alternative composition + as in CCS.

• Sequential composition · as in CSP.



• The set of actions A comes equipped with a partial, binary, commutative and associative communication function that tells when two actions can synchronize in a parallel composition, and what the resulting communication action is. ACP does not have a special silent action τ; each communication action is just a regular action. Parallel composition then has interleaving and communication. The finite axiomatization of parallel composition uses an auxiliary operator left merge as shown above, and in addition another auxiliary operator called communication merge.

• Encapsulation ∂H, blocking a subset of actions H of A, as in CCS.

• Recursion. A process constant X can be defined by means of a recursive equation X = P, where the constant may appear in the expression P. Also, a set of constants can be defined by a set of recursive equations, one for each constant.
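As a small worked example (ours, not from the quoted paper), the left-merge axioms quoted above suffice to compute merges of finite processes, with x‖y abbreviating x⌊⌊y + y⌊⌊x. For atomic actions:

   a‖b = a⌊⌊b + b⌊⌊a = ab + ba

and, one step deeper,

   (ab)‖c = ab⌊⌊c + c⌊⌊ab = a(b⌊⌊c + c⌊⌊b) + c(ab) = a(bc + cb) + c(ab),

which is exactly the set of interleavings abc + acb + cab of ab with c.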



Comparing the three most well-known process calculi CCS, CSP and ACP, we can say there is a considerable amount of work and applications realized in all three of them. In that sense, there seem to be no fundamental differences between the theories with respect to the range of applications. Historically, CCS was the first with a complete theory. Different from the other two, CSP has a least distinguishing equational theory. More than the other two, ACP emphasizes the algebraic aspect: there is an equational theory with a range of semantical models. Also, ACP has a more general communication scheme: in CCS, communication is combined with abstraction; in CSP, there is a restricted communication scheme.

In ensuing years, other process calculi were developed. We can mention SCCS [Milner, 1983], CIRCAL [Milne, 1983], MEIJE [Austry and Boudol, 1984], and the process calculus of Hennessy [Hennessy, 1988]. We see that over the years many process calculi have been developed, each making its own set of choices among the various possibilities. The reader may wonder whether this is something to be lamented. In the paper [Baeten et al., 1991], it is argued that this is actually a good thing, as long as there is a good exchange of information between the different groups, since each process calculus has its own set of advantages and disadvantages. When a certain notion is used in two different process calculi with the same underlying intuition, but with a different set of equational laws, there are some who argue for the same notation, in order to show that we are really talking about the same thing, and others who argue for different notations, in order to emphasize the different semantical setting.

With the book [Baeten et al., 2010], an integrated overview is presented of all features of CCS, CSP and ACP, based on an algebraic presentation, together with highlights of the main extensions since the 1980s.
A good overview of developments is also provided by the impressive handbook [Bergstra et al., 2001].





Theory
A nice overview of the most important theoretical results since the start of process calculi is the paper [Aceto, 2003], where remaining open problems are also stated. For a process calculus based on partial-order semantics, see [Best et al., 2001]. There is a wealth of results concerning process calculi extended with some form of recursion, see e.g. the complete axiomatization of regular processes by Milner [Milner, 1984] or the overview on decidability in [Burkart et al., 2001]. Also, there is a whole range of expressiveness results; some examples can be found in [Bergstra and Klop, 1984a].

Tooling
Over the years, several software systems have been developed in order to facilitate the application of process calculi in the analysis of systems. Here, we only mention general process calculus tools; tools that deal with specific extensions are mentioned below. The most well-known general tool is the Concurrency Workbench, see [Moller and Stevens, 1999], dealing with CCS-type process calculi. There is also the variant CWB-NC, see [Zhang et al., 2003] for the current state of affairs. There is the French set of tools CADP, see e.g. [Fernandez et al., 1996]. Further, in the CSP tradition, there is the FDR tool. The challenge in tool development is to combine an attractive user interface with a powerful and fast verification engine.

Verification
A measure of the success of process calculus is the range of systems that have been successfully verified by means of techniques that come from process calculus. A good overview can be found in [Groote and Reniers, 2001]. Process calculus focuses on equational reasoning; other successful techniques are model checking and theorem proving. The combination of these different approaches proves to be very promising.

Data
Process calculi are very successful in describing the dynamic behavior of systems. In describing the static aspects, treatment of data is very important: actions and processes are parametrized with data elements. The combination of processes and data has received much attention over the years. A standardized formal description technique is LOTOS, see [Brinksma, 1989]. Another combination of processes and data is PSF, see [Mauw, 1991], with associated tooling. The process calculus with data µCRL (succeeded by mCRL2 [Groote and Mousavi, 2013]) has tooling focusing on equational verification, see e.g. [Groote and Lisser, 2001].



Time
Research on process calculus extended with a quantitative notion of time started with the work of Reed and Roscoe in the CSP context, see [Reed and Roscoe, 1988]. A textbook in this tradition is [Schneider, 2000]. There are many variants of CCS with timing, see e.g. [Yi, 1990] and [Moller and Tofts, 1990]. In the ACP tradition, work starts with [Baeten and Bergstra, 1991]. An integrated theory, involving both discrete and dense time, both relative and absolute time, is presented in the book [Baeten and Middelburg, 2002]. Also the theory ATP can be mentioned, see [Nicollin and Sifakis, 1994]. An overview and comparison of different process algebras with timing can be found in [Baeten, 2003]. Tooling has been developed for processes with timing mostly in terms of timed automata, see e.g. UPPAAL [Larsen et al., 1997] or KRONOS [Yovine, 1997]. Equational reasoning is investigated for µCRL with timing [Usenko, 2002].

Mobility
Research on networks of processes where processes are mobile and the configuration of communication links is dynamic has been dominated by the π-calculus. An early reference is [Engberg and Nielsen, 1986]; the standard reference is [Milner et al., 1992] and the textbooks are [Milner, 1999; Sangiorgi and Walker, 2001]. The associated tool is the Mobility Workbench, see [Victor, 1994]. Also in this domain, it is important to gain more experience with protocol verification. On the theory side, a number of different equivalences have been defined, and it is not clear which is the 'right' one to use. Subsequently, other calculi concerning mobility have been developed, notably the ambient calculus, see [Cardelli and Gordon, 2000]. As to unifying frameworks for different mobile calculi, Milner investigated action calculus [Milner, 1996] and bigraphs [Milner, 2001]. Over the years, the π-calculus has come to be considered more and more as the standard process calculus to use. Important extensions that simplify some things are the psi-calculi, see [Bengtson et al., 2011].

Probabilities and Stochastics
Process calculi extended with probabilistic or stochastic information have generated a lot of research. An early reference is [Hansson, 1991]. In the CSP tradition, there is [Lowe, 1993]; in the CCS tradition, [Hillston, 1996]; in the ACP tradition, [Baeten et al., 1995]. There is the process algebra TIPP with associated tool, see e.g. [Götz et al., 1993], and EMPA, see e.g. [Bernardo and Gorrieri, 1998]. The insight that both (unquantified) alternative composition and probabilistic choice are needed for a useful theory has gained attention, see e.g. the work in [D'Argenio, 1999] or [Andova, 2002].



Notions of abstraction are still a matter of continued research. The goal is to combine functional verification with performance analysis. A notion of approximation is very important here, see e.g. [Desharnais et al., 2004]. Some recent references are [Jonsson et al., 2001; Markovski, 2008; Georgievska, 2011].

Hybrid Systems
Systems whose behavior depends on continuously changing variables other than time are the latest challenge to be addressed by process calculi. System descriptions involve differential algebraic equations, so here we reach the border of computer science with dynamics, in particular dynamic control theory. When discrete events are leading, but aspects of evolution are also taken into account, this is part of computer science; when dynamic evolution is paramount, and some switching points occur, it becomes part of dynamic control theory. Process calculus research that can be mentioned is [Bergstra and Middelburg, 2005; Cuijpers and Reniers, 2003]. In process theory, work centres around hybrid automata [Alur et al., 1995] and hybrid I/O automata [Lynch et al., 1995]. A tool is HyTech, see [Henzinger et al., 1995]. A connection with process calculus can be found in [Willemse, 2003; Baeten et al., 2008].

Other Application Areas
Application of process calculus in other areas can be mentioned. A process calculus dealing with shared resources is ACSR [Lee et al., 1994]. Process calculus has been used to give semantics to specification languages, such as POOL [Vaandrager, 1990] or MSC [Mauw and Reniers, 1994]. There is work on applications in security, see e.g. [Focardi and Gorrieri, 1995], [Abadi and Gordon, 1999] or [Schneider, 2001]. Work can be mentioned on the application of process calculi to biological processes, see e.g. [Priami et al., 2001]. Other application areas are web services [Bravetti and Zavattaro, 2008; Laneve and Padovani, 2013], ubiquitous computing [Honda, 2006] and workflow [Puhlmann and Weske, 2005].

5 CONCLUSION

In this chapter, a brief history has been sketched of concurrency theory, following two central breakthroughs. Early work centred around giving semantics to programming languages involving a parallel construct. Two breakthroughs were needed. The first was abandoning the idea that a program is a transformation from input to output, replacing this by an approach in which all intermediate states are important; we traced this development through the history of bisimulation. The second consisted of replacing the notion of global variables by the paradigm of message passing and local variables; we traced this development through the history of process calculi.



In the seventies of the twentieth century, both these steps were taken, and full concurrency theories evolved. In doing so, concurrency theory became the underlying theory of parallel and distributed systems, extending formal language and automata theory with the central ingredient of interaction. In the following years, much work has been done, and many concurrency theories have been formulated, extended with data, time, mobility, probabilities and stochastics. The work is not finished, however. We formulated some challenges for the future; more can be found in [Aceto et al., 2005]. An interesting recent development is a reconsideration of the foundations of computation, including interaction from concurrency theory. This yields a theory of executability, which is computability integrated with interaction, see [Baeten et al., 2012].

ACKNOWLEDGEMENTS

The authors are grateful to Luca Aceto and Rob van Glabbeek for comments on an earlier draft of the paper. Sangiorgi's work has been partially supported by the ANR project 12IS02001 "PACE".

BIBLIOGRAPHY

[Abadi and Gordon, 1999] M. Abadi and A.D. Gordon. A calculus for cryptographic protocols: The spi calculus. Inf. Comput., 148(1):1–70, 1999.
[Aceto et al., 2005] L. Aceto, W.J. Fokkink, A. Ingólfsdóttir, and Z. Ésik. Guest editors' foreword: Process algebra. Theor. Comput. Sci., 335(2-3):127–129, 2005.
[Aceto et al., 2007] L. Aceto, A. Ingólfsdóttir, K.G. Larsen, and J. Srba. Reactive Systems: Modelling, Specification and Verification. Cambridge University Press, 2007.
[Aceto et al., 2012] L. Aceto, A. Ingólfsdóttir, and J. Srba. The algorithmics of bisimilarity. In Sangiorgi and Rutten [2012].
[Aceto, 2003] L. Aceto. Some of my favourite results in classic process algebra. Bulletin of the EATCS, 81:90–108, 2003.
[Aczel, 1988] P. Aczel. Non-well-founded Sets. CSLI Lecture Notes, no. 14, 1988.
[Alur et al., 1995] R. Alur, C. Courcoubetis, N. Halbwachs, T.A. Henzinger, P.-H. Ho, X. Nicollin, A. Olivero, J. Sifakis, and S. Yovine. The algorithmic analysis of hybrid systems. Theoretical Computer Science, 138:3–34, 1995.
[Alvarez et al., 1991] C. Alvarez, J.L. Balcázar, J. Gabarró, and M. Santha. Parallel complexity in the design and analysis of concurrent systems. In Proc. PARLE '91: Parallel Architectures and Languages Europe, Volume I: Parallel Architectures and Algorithms, volume 505 of Lecture Notes in Computer Science, pages 288–303. Springer, 1991.
[Andova, 2002] S. Andova. Probabilistic Process Algebra. PhD thesis, Technische Universiteit Eindhoven, 2002.
[Apt et al., 1980] K.R. Apt, N. Francez, and W.P. de Roever. A proof system for communicating sequential processes. TOPLAS, 2:359–385, 1980.
[Austry and Boudol, 1984] D. Austry and G. Boudol. Algèbre de processus et synchronisation. Theoretical Computer Science, 30:91–131, 1984.
[Baeten and Bergstra, 1991] J.C.M. Baeten and J.A. Bergstra. Real time process algebra. Formal Aspects of Computing, 3(2):142–188, 1991.
[Baeten and Middelburg, 2002] J.C.M. Baeten and C.A. Middelburg. Process Algebra with Timing. EATCS Monographs. Springer Verlag, 2002.



[Baeten and Weijland, 1990] J. Baeten and W. Weijland. Process Algebra, volume 18 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 1990.
[Baeten et al., 1991] J.C.M. Baeten, J.A. Bergstra, C.A.R. Hoare, R. Milner, J. Parrow, and R. de Simone. The variety of process algebra. Deliverable ESPRIT Basic Research Action 3006, CONCUR, 1991.
[Baeten et al., 1995] J.C.M. Baeten, J.A. Bergstra, and S.A. Smolka. Axiomatizing probabilistic processes: ACP with generative probabilities. Information and Computation, 121(2):234–255, 1995.
[Baeten et al., 2008] J.C.M. Baeten, D.A. van Beek, P.J.L. Cuijpers, M.A. Reniers, J.E. Rooda, R.R.H. Schiffelers, and R.J.M. Theunissen. Model-based engineering of embedded systems using the hybrid process algebra Chi. In C. Palamidessi and F.D. Valencia, editors, Electronic Notes in Theoretical Computer Science, volume 209. Elsevier Science Publishers, 2008.
[Baeten et al., 2010] J.C.M. Baeten, T. Basten, and M.A. Reniers. Process Algebra: Equational Theories of Communicating Processes. Number 50 in Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 2010.
[Baeten et al., 2012] J.C.M. Baeten, B. Luttik, and P. van Tilburg. Turing meets Milner. In M. Koutny and I. Ulidowski, editors, Proceedings CONCUR 2012, number 7454 in Lecture Notes in Computer Science, pages 1–20, 2012.
[Baeten, 1993] J.C.M. Baeten. The total order assumption. In S. Purushothaman and A. Zwarico, editors, Proceedings First North American Process Algebra Workshop, Workshops in Computing, pages 231–240. Springer Verlag, 1993.
[Baeten, 2003] J.C.M. Baeten. Embedding untimed into timed process algebra: The case for explicit termination. Mathematical Structures in Computer Science, 13:589–618, 2003.
[Bakker and Zucker, 1982a] J.W. de Bakker and J.I. Zucker. Denotational semantics of concurrency. In Proceedings 14th Symposium on Theory of Computing, pages 153–158. ACM, 1982.
[Bakker and Zucker, 1982b] J.W. de Bakker and J.I. Zucker. Processes and the denotational semantics of concurrency. Information and Control, 54:70–120, 1982.
[Balcázar et al., 1992] J.L. Balcázar, J. Gabarró, and M. Santha. Deciding bisimilarity is P-complete. Formal Asp. Comput., 4(6A):638–648, 1992.
[Banach, 1922] S. Banach. Sur les opérations dans les ensembles abstraits et leur application aux équations intégrales. Fundamenta Mathematicae, 3:133–181, 1922.
[Barwise and Moss, 1996] J. Barwise and L. Moss. Vicious Circles: On the Mathematics of Non-Wellfounded Phenomena. CSLI (Center for the Study of Language and Information), 1996.
[Bekić, 1971] H. Bekić. Towards a mathematical theory of processes. Technical Report TR 25.125, IBM Laboratory Vienna, 1971.
[Bekić, 1984] H. Bekić. Programming Languages and Their Definition (Selected Papers edited by C.B. Jones). Number 177 in LNCS. Springer Verlag, 1984.
[Bengtson et al., 2011] J. Bengtson, M. Johansson, J. Parrow, and B. Victor. Psi-calculi: a framework for mobile processes with nominal data and logic. Logical Methods in Computer Science, 7:1–44, 2011.
[Benthem, 1976] J. van Benthem. Modal Correspondence Theory. PhD thesis, Mathematisch Instituut & Instituut voor Grondslagenonderzoek, University of Amsterdam, 1976.
[Benthem, 1983] J. van Benthem. Modal Logic and Classical Logic. Bibliopolis, 1983.
[Benthem, 1984] J. van Benthem. Correspondence theory. In D.M. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, volume 2, pages 167–247. Reidel, 1984.
[Bergstra and Klop, 1982] J.A. Bergstra and J.W. Klop. Fixed point semantics in process algebra. Technical Report IW 208, Mathematical Centre, Amsterdam, 1982.
[Bergstra and Klop, 1984a] J.A. Bergstra and J.W. Klop. The algebra of recursively defined processes and the algebra of regular processes. In J. Paredaens, editor, Proceedings 11th ICALP, number 172 in LNCS, pages 82–95. Springer Verlag, 1984.
[Bergstra and Klop, 1984b] J.A. Bergstra and J.W. Klop. Process algebra for synchronous communication. Information and Control, 60(1/3):109–137, 1984.
[Bergstra and Klop, 1992] J.A. Bergstra and J.W. Klop. A convergence theorem in process algebra. In J.W. de Bakker and J.J.M.M. Rutten, editors, Ten Years of Concurrency Semantics, pages 164–195. World Scientific, 1992.
[Bergstra and Middelburg, 2005] J.A. Bergstra and C.A. Middelburg. Process algebra for hybrid systems. Theoretical Computer Science, 335, 2005.

Concurrency Theory


[Bergstra et al., 1987] J.A. Bergstra, J.W. Klop, and E.-R. Olderog. Failures without chaos: A new process semantics for fair abstraction. In M. Wirsing, editor, Proceedings IFIP Conference on Formal Description of Programming Concepts III, pages 77–103. North-Holland, 1987. [Bergstra et al., 2001] J.A. Bergstra, A. Ponse, and S.A. Smolka, editors. Handbook of Process Algebra. North-Holland, Amsterdam, 2001. [Bernardo and Gorrieri, 1998] M. Bernardo and R. Gorrieri. A tutorial on EMPA: A theory of concurrent processes with non-determinism, priorities, probabilities and time. Theoretical Computer Science, 202:1–54, 1998. [Best et al., 2001] E. Best, R. Devillers, and M. Koutny. A unified model for nets and process algebras. In [Bergstra et al., 2001], pp. 945–1045, 2001. [Brand, 1978] D. Brand. Algebraic simulation between parallel programs. Research Report RC 7206, Yorktown Heights, N.Y., 39 pp., 1978. [Brauer and Reisig, 2006] Wilfried Brauer and Wolfgang Reisig. Carl Adam Petri und die ”Petrinetze”. Informatik Spektrum, 29:369–374, 2006. [Bravetti and Zavattaro, 2008] M. Bravetti and G. Zavattaro. A foundational theory of contracts for multi-party service composition. Fundamenta Informaticae, 89:451–478, 2008. [Brinksma, 1989] E. Brinksma, editor. Information Processing Systems, Open Systems Interconnection, LOTOS – A Formal Description Technique Based on the Temporal Ordering of Observational Behaviour, volume IS-8807 of International Standard. ISO, Geneva, 1989. [Brookes et al., 1984] S.D. Brookes, C.A.R. Hoare, and A.W. Roscoe. A theory of communicating sequential processes. Journal of the ACM, 31(3):560–599, 1984. [Buchholz, 1994] P. Buchholz. Markovian process algebra: composition and equivalence. In U. Herzog and M. Rettelbach, editors, Proc. 2nd Workshop on Process Algebras and Performance Modelling, pages 11–30. Arbeitsberichte des IMMD, Band 27, Nr. 4, 1994. [Burge, 1975] William H. Burge. Stream processing functions. 
IBM Journal of Research and Development, 19(1):12–25, 1975. [Burkart et al., 2001] O. Burkart, D. Caucal, F. Moller, and B. Steffen. Verification on infinite structures. In [Bergstra et al., 2001], pp. 545–623, 2001. [Cardelli and Gordon, 2000] L. Cardelli and A.D. Gordon. Mobile ambients. Theoretical Computer Science, 240:177–213, 2000. [Cuijpers and Reniers, 2003] P.J.L. Cuijpers and M.A. Reniers. Hybrid process algebra. Technical Report CS-R 03/07, Technische Universiteit Eindhoven, Dept. of Comp. Sci., 2003. [D’Argenio, 1999] P.R. D’Argenio. Algebras and Automata for Timed and Stochastic Systems. PhD thesis, University of Twente, 1999. [de Roever, 1977] Willem P. de Roever. On backtracking and greatest fixpoints. In Arto Salomaa and Magnus Steinby, editors, Fourth Colloquium on Automata, Languages and Programming (ICALP), volume 52 of Lecture Notes in Computer Science, pages 412–429. Springer, 1977. [Desharnais et al., 2004] J. Desharnais, V. Gupta, R. Jagadeesan, and P. Panangaden. Metrics for labeled Markov systems. Theoretical Computer Science, 318:323–354, 2004. [Dijkstra, 1975] E.W. Dijkstra. Guarded commands, nondeterminacy, and formal derivation of programs. Communications of the ACM, 18(8):453–457, 1975. [Engberg and Nielsen, 1986] U. Engberg and M. Nielsen. A calculus of communicating systems with label passing. Technical Report DAIMI PB-208, Aarhus University, 1986. [Fernandez et al., 1996] J.-C. Fernandez, H. Garavel, A. Kerbrat, R. Mateescu, L. Mounier, and M. Sighireanu. CADP (CAESAR/ALDEBARAN development package): A protocol validation and verification toolbox. In R. Alur and T.A. Henzinger, editors, Proceedings CAV ’96, number 1102 in Lecture Notes in Computer Science, pages 437–440. Springer Verlag, 1996. [Floyd, 1967] R. W. Floyd. Assigning meaning to programs. In Proc. Symposia in Applied Mathematics, volume 19, pages 19–32. American Mathematical Society, 1967. [Focardi and Gorrieri, 1995] R. Focardi and R. Gorrieri. 
A classification of security properties for process algebras. Journal of Computer Security, 3:5–33, 1995. [Forti and Honsell, 1983] M. Forti and F. Honsell. Set theory with free construction principles. Annali Scuola Normale Superiore, Pisa, Serie IV, X(3):493–522, 1983. [Georgievska, 2011] S. Georgievska. Probability and Hiding in Concurrent Processes. PhD thesis, Eindhoven University of Technology, Department of Mathematics and Computer Science, Eindhoven, the Netherlands, 2011. [Ginzburg, 1968] A. Ginzburg. Algebraic Theory of Automata. Academic Press, 1968.


Jos C. M. Baeten and Davide Sangiorgi

[Glabbeek, 1986] R.J. van Glabbeek. Notes on the methodology of CCS and CSP. Technical Report CS-R8624, Centrum Wiskunde & Informatica, Amsterdam, 1986. [Glabbeek, 1990a] R.J. van Glabbeek. Comparative Concurrency Semantics and Refinement of Actions. PhD thesis, Vrije Universiteit, Amsterdam, 1990. [Glabbeek, 1990b] R.J. van Glabbeek. The linear time-branching time spectrum (extended abstract). In Jos C. M. Baeten and Jan Willem Klop, editors, First Conference on Concurrency Theory (CONCUR’90), volume 458 of Lecture Notes in Computer Science, pages 278–297. Springer, 1990. [Glabbeek, 1993] R.J. van Glabbeek. The linear time — branching time spectrum II (the semantics of sequential systems with silent moves). In E. Best, editor, Fourth Conference on Concurrency Theory (CONCUR’93), volume 715, pages 66–81. Springer, 1993. [Glabbeek, 2001] R.J. van Glabbeek. The linear time – branching time spectrum I. The semantics of concrete, sequential processes. In [Bergstra et al., 2001], pp. 3–100, 2001. [Götz et al., 1993] N. Götz, U. Herzog, and M. Rettelbach. Multiprocessor and distributed system design: The integration of functional specification and performance analysis using stochastic process algebras. In L. Donatiello and R. Nelson, editors, Performance Evaluation of Computer and Communication Systems, number 729 in LNCS, pages 121–146. Springer, 1993. [Gourlay et al., 1979] John S. Gourlay, William C. Rounds, and Richard Statman. On properties preserved by contraction of concurrent systems. In Gilles Kahn, editor, International Symposium on Semantics of Concurrent Computation, volume 70 of Lecture Notes in Computer Science, pages 51–65. Springer, 1979. [Groote and Lisser, 2001] J.F. Groote and B. Lisser. Computer assisted manipulation of algebraic process specifications. Technical Report SEN-R0117, CWI, Amsterdam, 2001. [Groote and Mousavi, 2013] J.F. Groote and M.R. Mousavi. Modelling and Analysis of Communicating Systems. MIT Press, 2013. 
[Groote and Reniers, 2001] J.F. Groote and M.A. Reniers. Algebraic process verification. In [Bergstra et al., 2001], pp. 1151–1208, 2001. [Hansson, 1991] H. Hansson. Time and Probability in Formal Design of Distributed Systems. PhD thesis, University of Uppsala, 1991. [Harel and Pnueli, 1985] D. Harel and A. Pnueli. On the development of reactive systems. In Logic and Models of Concurrent Systems, NATO Advanced Study Institute on Logics and Models for Verification and Specification of Concurrent Systems. Springer, 1985. [Hennessy and Milner, 1980] M. Hennessy and R. Milner. On observing nondeterminism and concurrency. In J.W. de Bakker and J. van Leeuwen, editors, Proceedings 7th ICALP, number 85 in Lecture Notes in Computer Science, pages 299–309. Springer Verlag, 1980. [Hennessy, 1988] M. Hennessy. Algebraic Theory of Processes. MIT Press, 1988. [Henzinger et al., 1995] T.A. Henzinger, P. Ho, and H. Wong-Toi. Hy-Tech: The next generation. In Proceedings RTSS, pages 56–65. IEEE, 1995. [Hillston, 1996] J. Hillston. A Compositional Approach to Performance Modelling. PhD thesis, Cambridge University Press, 1996. [Hinnion, 1980] R. Hinnion. Contraction de structures et application à NFU. Comptes Rendus Acad. des Sciences de Paris, 290, Sér. A:677–680, 1980. [Hinnion, 1981] R. Hinnion. Extensional quotients of structures and applications to the study of the axiom of extensionality. Bulletin de la Société Mathématique de Belgique, XXXIII (Fas. II, Sér. B):173–206, 1981. [Hoare, 1969] C.A.R. Hoare. An axiomatic basis for computer programming. Communications of the ACM, 12:576–580, 1969. [Hoare, 1978] C.A.R. Hoare. Communicating sequential processes. Communications of the ACM, 21(8):666–677, 1978. [Hoare, 1980] C.A.R. Hoare. A model for communicating sequential processes. In R.M. McKeag and A.M. Macnaghten, editors, On the Construction of Programs, pages 229–254. Cambridge University Press, 1980. [Hoare, 1985] C.A.R. Hoare. Communicating Sequential Processes. 
Prentice Hall, 1985. [Honda, 2006] K. Honda. Process algebras in the age of ubiquitous computing. Electr. Notes Theor. Comput. Sci., 162:217–220, 2006. [Huffman, 1954] D.A. Huffman. The synthesis of sequential switching circuits. Journal of the Franklin Institute (Mar. 1954) and (Apr. 1954), 257(3–4):161–190 and 275–303, 1954.



[Jensen, 1980] Kurt Jensen. A method to compare the descriptive power of different types of Petri nets. In Piotr Dembinski, editor, Proc. 9th Mathematical Foundations of Computer Science 1980 (MFCS’80), Rydzyna, Poland, September 1980, volume 88 of Lecture Notes in Computer Science, pages 348–361. Springer, 1980. [Jonsson et al., 2001] B. Jonsson, Yi Wang, and K.G. Larsen. Probabilistic extensions of process algebras. In J.A. Bergstra, A. Ponse, and S.A. Smolka, editors, Handbook of Process Algebra, pages 685–710. North-Holland, 2001. [Kahn, 1974] Gilles Kahn. The semantics of a simple language for parallel programming. In IFIP Congress, pages 471–475. North-Holland, 1974. [Kanellakis and Smolka, 1990] Paris C. Kanellakis and Scott A. Smolka. CCS expressions, finite state processes, and three problems of equivalence. Inf. Comput., 86(1):43–68, 1990. [Kemeny and Snell, 1960] J. Kemeny and J. L. Snell. Finite Markov Chains. Van Nostrand Co. Ltd., London, 1960. [Kwong, 1977] Y. S. Kwong. On reduction of asynchronous systems. Theoretical Computer Science, 5(1):25–50, 1977. [Landin, 1964] Peter J. Landin. The mechanical evaluation of expressions. The Computer Journal, 6(4):308–320, 1964. [Landin, 1965a] Peter J. Landin. Correspondence between ALGOL 60 and Church’s Lambda-notation: Part I. Commun. ACM, 8(2):89–101, 1965. [Landin, 1965b] Peter J. Landin. A correspondence between ALGOL 60 and Church’s Lambda-notation: Part II. Commun. ACM, 8(3):158–167, 1965. [Landin, 1969] P. Landin. A program-machine symmetric automata theory. Machine Intelligence, 5:99–120, 1969. [Laneve and Padovani, 2013] C. Laneve and L. Padovani. An algebraic theory for web service contracts. In E. Broch Johnsen and L. Petre, editors, IFM, volume 7940 of Lecture Notes in Computer Science, pages 301–315. Springer, 2013. [Larsen and Skou, 1991] Kim Guldstrand Larsen and Arne Skou. Bisimulation through probabilistic testing. Inf. Comput., 94(1):1–28, 1991. Preliminary version in POPL’89, 344–352, 1989. 
[Larsen et al., 1997] K.G. Larsen, P. Pettersson, and Wang Yi. Uppaal in a nutshell. Journal of Software Tools for Technology Transfer, 1, 1997. [Lee et al., 1994] I. Lee, P. Bremond-Gregoire, and R. Gerber. A process algebraic approach to the specification and analysis of resource-bound real-time systems. Proceedings of the IEEE, 1994. Special Issue on Real-Time. [Linz, 2001] P. Linz. An Introduction to Formal Languages and Automata. Jones and Bartlett, 2001. [Lowe, 1993] G. Lowe. Probabilities and Priorities in Timed CSP. PhD thesis, University of Oxford, 1993. [Lynch et al., 1995] N. Lynch, R. Segala, F. Vaandrager, and H.B. Weinberg. Hybrid I/O automata. In T. Henzinger, R. Alur, and E. Sontag, editors, Hybrid Systems III, number 1066 in Lecture Notes in Computer Science. Springer Verlag, 1995. [MacLane and Birkhoff, 1967] S. MacLane and G. Birkhoff. Algebra. MacMillan, 1967. [Manna, 1969] Z. Manna. The correctness of programs. J. Computer and System Sciences, 3(2):119–127, 1969. [Markovski, 2008] J. Markovski. Real and Stochastic Time in Process Algebras for Performance Evaluation. PhD thesis, Eindhoven University of Technology, Department of Mathematics and Computer Science, Eindhoven, the Netherlands, 2008. [Mauw and Reniers, 1994] S. Mauw and M.A. Reniers. An algebraic semantics for basic message sequence charts. The Computer Journal, 37:269–277, 1994. [Mauw, 1991] S. Mauw. PSF: a Process Specification Formalism. PhD thesis, University of Amsterdam, 1991. [Mazurkiewicz, 1977] A. Mazurkiewicz. Concurrent program schemes and their interpretations. Technical Report DAIMI PB-78, Aarhus University, 1977. [McCarthy, 1963] J. McCarthy. A basis for a mathematical theory of computation. In P. Braffort and D. Hirshberg, editors, Computer Programming and Formal Systems, pages 33–70. North-Holland, Amsterdam, 1963. [Meyer and Stockmeyer, 1972] Albert R. Meyer and Larry J. Stockmeyer. 
The equivalence problem for regular expressions with squaring requires exponential space. In 13th Annual Symposium on Switching and Automata Theory (FOCS), pages 125–129. IEEE, 1972.



[Milne and Milner, 1979] G.J. Milne and R. Milner. Concurrent processes and their syntax. Journal of the ACM, 26(2):302–321, 1979. [Milne, 1983] G.J. Milne. CIRCAL: A calculus for circuit description. Integration, 1:121–160, 1983. [Milner and Tofte, 1991] R. Milner and M. Tofte. Co-induction in relational semantics. Theoretical Computer Science, 87:209–220, 1991. Also Tech. Rep. ECS-LFCS-88-65, University of Edinburgh, 1988. [Milner et al., 1992] R. Milner, J. Parrow, and D. Walker. A calculus of mobile processes. Information and Computation, 100:1–77, 1992. [Milner, 1970] R. Milner. A formal notion of simulation between programs. Memo 14, Computers and Logic Research Group, University College of Swansea, U.K., 1970. [Milner, 1971a] R. Milner. An algebraic definition of simulation between programs. In Proc. 2nd Int. Joint Conference on Artificial Intelligence. British Comp. Soc. 1971. [Milner, 1971b] R. Milner. Program simulation: an extended formal notion. Memo 17, Computers and Logic Research Group, University College of Swansea, U.K., 1971. [Milner, 1973] R. Milner. An approach to the semantics of parallel programs. In Proceedings Convegno di Informatica Teorica, pages 285–301, Pisa, 1973. Istituto di Elaborazione della Informazione. [Milner, 1975] R. Milner. Processes: A mathematical model of computing agents. In H.E. Rose and J.C. Shepherdson, editors, Proceedings Logic Colloquium, number 80 in Studies in Logic and the Foundations of Mathematics, pages 157–174. North-Holland, 1975. [Milner, 1978a] R. Milner. Algebras for communicating systems. In Proc. AFCET/SMF joint colloquium in Applied Mathematics, Paris, 1978. [Milner, 1978b] R. Milner. Synthesis of communicating behaviour. In J. Winkowski, editor, Proc. 7th MFCS, number 64 in LNCS, pages 71–83, Zakopane, 1978. Springer Verlag. [Milner, 1979] R. Milner. Flowgraphs and flow algebras. Journal of the ACM, 26(4):794–818, 1979. [Milner, 1980] R. Milner. 
A Calculus of Communicating Systems, volume 92 of Lecture Notes in Computer Science. Springer, 1980. [Milner, 1983] R. Milner. Calculi for synchrony and asynchrony. Theoretical Computer Science, 25:267–310, 1983. [Milner, 1984] R. Milner. A complete inference system for a class of regular behaviours. Journal of Computer System Science, 28:439–466, 1984. [Milner, 1989] R. Milner. Communication and Concurrency. Prentice Hall, 1989. [Milner, 1996] R. Milner. Calculi for interaction. Acta Informatica, 33:707–737, 1996. [Milner, 1999] R. Milner. Communicating and Mobile Systems: the π-Calculus. Cambridge University Press, 1999. [Milner, 2001] R. Milner. Bigraphical reactive systems. In K.G. Larsen and M. Nielsen, editors, Proceedings CONCUR ’01, number 2154 in LNCS, pages 16–35. Springer Verlag, 2001. [Moller and Stevens, 1999] F. Moller and P. Stevens. Edinburgh Concurrency Workbench user manual (version 7.1). Available from, 1999. [Moller and Tofts, 1990] F. Moller and C. Tofts. A temporal calculus of communicating systems. In J.C.M. Baeten and J.W. Klop, editors, Proceedings CONCUR’90, number 458 in LNCS, pages 401–415. Springer Verlag, 1990. [Moore, 1956] E.F. Moore. Gedanken experiments on sequential machines. Automata Studies, Annals of Mathematics Series, 34:129–153, 1956. [Nerode, 1958] A. Nerode. Linear automaton transformations. In Proc. American Mathematical Society, volume 9, pages 541–544, 1958. [Nicollin and Sifakis, 1994] X. Nicollin and J. Sifakis. The algebra of timed processes ATP: Theory and application. Information and Computation, 114:131–178, 1994. [Owicki and Gries, 1976] S. Owicki and D. Gries. Verifying properties of parallel programs: An axiomatic approach. Communications of the ACM, 19:279–285, 1976. [Park, 1979] D. Park. On the semantics of fair parallelism. In Proc. Abstract Software Specifications, Copenhagen Winter School, Lecture Notes in Computer Science, pages 504–526. Springer, 1979. [Park, 1981a] D.M.R. Park. 
Concurrency on automata and infinite sequences. In P. Deussen, editor, Conf. on Theoretical Computer Science, volume 104 of Lecture Notes in Computer Science, pages 167–183. Springer, 1981.



[Park, 1981b] D.M.R. Park. A new equivalence notion for communicating systems. In G. Maurer, editor, Bulletin EATCS, volume 14, pages 78–80, 1981. Abstract of the talk presented at the Second Workshop on the Semantics of Programming Languages, Bad Honnef, March 16–20 1981. Abstracts collected in the Bulletin by B. Mayoh. [Petri, 1962] C.A. Petri. Kommunikation mit Automaten. PhD thesis, Institut für Instrumentelle Mathematik, Bonn, 1962. [Petri, 1980] C.A. Petri. Introduction to general net theory. In W. Brauer, editor, Proc. Advanced Course on General Net Theory, Processes and Systems, number 84 in LNCS, pages 1–20. Springer Verlag, 1980. [Plotkin, 1976] G.D. Plotkin. A powerdomain construction. SIAM Journal on Computing, 5:452–487, 1976. [Plotkin, 1981] G.D. Plotkin. A structural approach to operational semantics. Technical Report DAIMI FN-19, Aarhus University, 1981. Reprinted as [Plotkin, 2004a]. [Plotkin, 2004a] G.D. Plotkin. A structural approach to operational semantics. Journal of Logic and Algebraic Programming, 60:17–139, 2004. [Plotkin, 2004b] G.D. Plotkin. The origins of structural operational semantics. Journal of Logic and Algebraic Programming, 60(1):3–16, 2004. [Pnueli, 1977] A. Pnueli. The temporal logic of programs. In Proceedings 18th Symposium on Foundations of Computer Science, pages 46–57. IEEE, 1977. [Pous and Sangiorgi, 2012] Damien Pous and Davide Sangiorgi. Enhancements of the bisimulation proof method. In Sangiorgi and Rutten [2012]. [Priami et al., 2001] C. Priami, A. Regev, W. Silverman, and E. Shapiro. Application of stochastic process algebras to bioinformatics of molecular processes. Information Processing Letters, 80:25–31, 2001. [Puhlmann and Weske, 2005] F. Puhlmann and M. Weske. Using the pi-calculus for formalizing workflow patterns. In W.M.P. van der Aalst, B. Benatallah, F. Casati, and F. Curbera, editors, Business Process Management, volume 3649, pages 153–168, 2005. [Reed and Roscoe, 1988] G.M. Reed and A.W. Roscoe. 
A timed model for communicating sequential processes. Theoretical Computer Science, 58:249–261, 1988. [Rem, 1983] M. Rem. Partially ordered computations, with applications to VLSI design. In J.W. de Bakker and J. van Leeuwen, editors, Foundations of Computer Science IV, volume 159 of Mathematical Centre Tracts, pages 1–44. Mathematical Centre, Amsterdam, 1983. [Rutten and Jacobs, 2012] Jan Rutten and Bart Jacobs. (co)algebras and (co)induction. In Sangiorgi and Rutten [2012]. [Rutten and Turi, 1994] J. Rutten and D. Turi. Initial algebra and final coalgebra semantics for concurrency. In Proc. Rex School/Symposium 1993 “A Decade of Concurrency — Reflexions and Perspectives”, volume 803 of Lecture Notes in Computer Science. Springer, 1994. [Sangiorgi and Rutten, 2012] Davide Sangiorgi and Jan Rutten, editors. Advanced Topics in Bisimulation and Coinduction. Cambridge University Press, 2012. [Sangiorgi and Walker, 2001] D. Sangiorgi and D. Walker. The π-calculus: a Theory of Mobile Processes. Cambridge University Press, 2001. [Sangiorgi, 2009] Davide Sangiorgi. On the origins of bisimulation and coinduction. ACM Trans. Program. Lang. Syst., 31(4), 2009. [Sangiorgi, 2012] Davide Sangiorgi. Introduction to Bisimulation and Coinduction. Cambridge University Press, 2012. [Schneider, 2000] S.A. Schneider. Concurrent and Real-Time Systems (the CSP Approach). Worldwide Series in Computer Science. Wiley, 2000. [Schneider, 2001] S.A. Schneider. Process algebra and security. In K.G. Larsen and M. Nielsen, editors, Proceedings CONCUR ’01, number 2154 in LNCS, pages 37–38. Springer Verlag, 2001. [Scott and Strachey, 1971] D.S. Scott and C. Strachey. Towards a mathematical semantics for computer languages. In J. Fox, editor, Proceedings Symposium Computers and Automata, pages 19–46. Polytechnic Institute of Brooklyn Press, 1971. [Scott, 1960] D. Scott. A different kind of model for set theory. 
Unpublished paper, given at the 1960 Stanford Congress of Logic, Methodology and Philosophy of Science, 1960. [Segerberg, 1968] K. Segerberg. Decidability of S4.1. Theoria, 34:7–20, 1968. [Segerberg, 1971] Krister Segerberg. An essay in classical modal logic. Filosofiska Studier, Uppsala, 1971. [Stirling, 2012] Colin Stirling. Bisimulation and logic. In Sangiorgi and Rutten [2012].



[Usenko, 2002] Y.S. Usenko. Linearization in µCRL. PhD thesis, Technische Universiteit Eindhoven, 2002. [Vaandrager, 1990] F.W. Vaandrager. Process algebra semantics of POOL. In J.C.M. Baeten, editor, Applications of Process Algebra, number 17 in Cambridge Tracts in Theoretical Computer Science, pages 173–236. Cambridge University Press, 1990. [Victor, 1994] B. Victor. A Verification Tool for the Polyadic π-Calculus. Licentiate thesis, Department of Computer Systems, Uppsala University, Sweden, May 1994. Available as report DoCS 94/50. [Willemse, 2003] T.A.C. Willemse. Semantics and Verification in Process Algebras with Data and Timing. PhD thesis, Technische Universiteit Eindhoven, 2003. [Yi, 1990] Wang Yi. Real-time behaviour of asynchronous agents. In J.C.M. Baeten and J.W. Klop, editors, Proceedings CONCUR’90, number 458 in LNCS, pages 502–520. Springer Verlag, 1990. [Yovine, 1997] S. Yovine. Kronos: A verification tool for real-time systems. Journal of Software Tools for Technology Transfer, 1:123–133, 1997. [Zhang et al., 2003] D. Zhang, R. Cleaveland, and E. Stark. The integrated CWB-NC/PIOAtool for functional verification and performance analysis of concurrent systems. In H. Garavel and J. Hatcliff, editors, Proceedings TACAS ’03, number 2619 in Lecture Notes in Computer Science, pages 431–436. Springer-Verlag, 2003.

DEGREES OF UNSOLVABILITY

Klaus Ambos-Spies and Peter A. Fejer

Reader: Richard Shore

1


Modern computability theory took off with Turing [1936], where he introduced the notion of a function computable by a Turing machine. Soon after, it was shown that this definition was equivalent to several others that had been proposed previously, and the Church-Turing thesis, according to which Turing computability captures precisely the informal notion of computability, was commonly accepted. This isolation of the concept of computable function was one of the greatest advances of twentieth century mathematics and gave rise to the field of computability theory. Among the first results in computability theory was Church’s and Turing’s work on the unsolvability of the decision problem for first-order logic. Computability theory to a great extent deals with noncomputable problems. Relativized computation, which also originated with Turing [1939], allows the comparison of the complexity of unsolvable problems. Turing formalized relative computation with oracle Turing machines. If a set A is computable relative to a set B, we say that A is Turing reducible to B (A ≤T B). By identifying sets that are reducible to each other, we are led to the notion of degree of unsolvability, first introduced by Post [1944]. The degrees form a partially ordered set whose study is called degree theory. From the start, the study of the computably enumerable degrees has played a prominent role in degree theory. This may be partially due to the fact that until recently most of the unsolvable problems that have arisen outside of computability theory have been computably enumerable (c.e.). The c.e. sets can intuitively be viewed as unbounded search problems, a typical example being the set of formulas provable in some effectively given formal system. Reducibility allows us to isolate the most difficult c.e. problems, the complete problems. The standard method for showing that a c.e. problem is undecidable is to show that it is complete. 
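The picture of c.e. sets as unbounded search problems can be made concrete by dovetailing. The sketch below is purely illustrative and not from the chapter: "programs" are modelled as step-bounded Python generators standing in for Turing machines, and the enumerator lists exactly the indices of programs that eventually halt, without ever deciding that a program runs forever.

```python
# Illustrative model (not from the chapter): a c.e. set as the set of
# indices of halting "programs". A program is a Python generator; each
# yield is one computation step, and returning means the program halts.

def halts_immediately():
    return
    yield  # unreachable; its presence makes this a generator function

def loops_forever():
    while True:
        yield

def halts_after(n):
    def prog():
        for _ in range(n):
            yield
    return prog

PROGRAMS = [halts_immediately, loops_forever, halts_after(3), halts_after(10)]

def halts_within(prog, steps):
    """Run prog for at most `steps` steps; report whether it halted."""
    g = prog()
    try:
        for _ in range(steps):
            next(g)
    except StopIteration:
        return True
    return False

def enumerate_halting(max_stage):
    """Dovetail: at stage s, run every program for s steps. Indices of
    halting programs appear eventually; divergent ones are never listed."""
    found = []
    for s in range(1, max_stage + 1):
        for x, prog in enumerate(PROGRAMS):
            if x not in found and halts_within(prog, s):
                found.append(x)
    return found
```

Here `enumerate_halting` lists an index as soon as its program is seen to halt; index 1 never appears, yet no finite stage certifies that it never will, which is exactly the asymmetry between computable enumerability and computability.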
Post [1944] asked if this technique always works, i.e., whether there is a noncomputable, incomplete c.e. set. This problem came to be known as Post’s Problem and it was the origin of degree theory. Degree theory became one of the core areas of computability theory and attracted some of the most brilliant logicians of the second half of the twentieth century. The fascination with the field stems from the quite sophisticated techniques needed to solve the problems that arose, many of which are quite easy to state. The hallmark of the study of the c.e. degrees is the priority method introduced by Friedberg and Mučnik to solve Post’s Problem. Advances in c.e. degree theory were closely tied to developments of this method. For the degrees as a whole, forcing arguments are central. Forcing techniques originated in set theory, but their use in degree theory can be traced back to the paper of Kleene and Post [1954], which predated the introduction of forcing in set theory. Degree theory has been central to computability theory in the sense that the priority method was developed to solve problems about degrees but has been applied throughout computability theory. In this chapter, we will limit ourselves to Turing reducibility, though many other reducibilities have been studied in computability theory. By formalizing relative computability, Turing reducibility is the most general effective reducibility, but by limiting the access to the oracle in various ways interesting special cases arise, such as many-one or truth-table reducibilities. (See e.g. Odifreddi [1999b] for more on these so-called strong reducibilities.) Other reducibilities are obtained either by giving up effectivity of the reduction, as is done for instance in the enumeration reducibilities or the arithmetical reducibilities, where computability is replaced by computable enumerability or by first-order definability in arithmetic, or by considering resource-bounded computability, as is done in computational complexity. The most prominent examples of the latter are the polynomial time reducibilities leading to the notion of NP-completeness (see the chapter of Fortnow and Homer in this volume).

Handbook of the History of Logic. Volume 9: Computational Logic. Volume editor: Jörg Siekmann. Series editors: Dov M. Gabbay and John Woods. Copyright © 2014 Elsevier B.V. All rights reserved.
The concentration on Turing reducibility is also justified by the fact that the core technical work in classical computability theory was done to prove results about the Turing degrees, and the main techniques in the field were developed to prove these results. The two structures of Turing degrees that will concern us are the upper semi-lattices of all the Turing degrees and of the computably enumerable Turing degrees, denoted by D and R. Our focus will be on the structure R of the c.e. degrees. This emphasis may be justified by the particularly important role played by the c.e. sets in mathematical logic and by the specific challenges of this area at the dividing line of computability and noncomputability, which led to the development of fascinating new techniques, but it also reflects some of our prejudice based on our research interests. The period covered in this chapter is from the beginning of degree theory in the 1940s until the end of the 20th century. More recent results which are directly related to the work falling in this period are included in the presentation, and in the final section there are a few pointers to some of the more recent developments. We feel, however, that it is too early for a general review and evaluation of this more recent work from a historical point of view. The emphasis of the chapter will be on the early developments of degree theory. We use the term “computable” rather than “recursive” following the suggestion of Soare [1999]. This change in terminology has been widely adopted and reflects more accurately the nature of the subject. In the same vein, we use “computably enumerable” for “recursively enumerable” and so on. The old terminology survives in our use of the symbol R for the structure of the c.e. degrees. We have used little notation in this chapter, and what we do use is standard and can be found in the relevant chapters of the Handbook of Computability Theory [Griffor, 1999].

2


In this section we trace the origins of the central concepts underlying degree theory. The history of the concept of computable function has been dealt with in detail in the literature (see for instance Kleene [1981] and Soare [1999]). For this reason, we do not discuss the topic here. The study of degrees of unsolvability begins with the concept of Turing reducibility, which originated with Turing in Section 4 of [Turing, 1939]. As Post puts it in [Post, 1944], Turing presents the definition as a “side issue.” In the paper, Turing defines an “o-machine” (oracle machine) as an oracle Turing machine as we understand the concept today, but with a fixed oracle, namely, the set of well-formed formulas A (i.e., λ-terms) that are dual (i.e., have the property that A(n) is convertible to 2 for every well-formed formula n representing a positive integer). Turing is interested in such oracle machines because he considers a problem to be number-theoretic if it can be solved by an o-machine. Turing proves that the problem of determining whether an o-machine is “circle-free” (i.e., prints out infinitely many 0s and 1s) is not a number-theoretic problem. He does not come back to the idea of oracle machines in the rest of the paper. Although Turing does not consider arbitrary oracles and hence does not give a general definition of relative computability, as Post puts it in [Post, 1944], Turing’s formulation “can immediately be restated as the general formulation of ‘recursive reducibility’ of one problem to another.” Post himself does not give any formal definition of Turing reducibility in his 1944 paper, but instead relies on an intuitive description. Although it is clear, then, that it was known since at least 1944 how to give a general definition of relative computability based on oracle Turing machines, the first complete such definition to appear in print is in Kleene’s Introduction to Metamathematics [Kleene, 1952]. 
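Restated in modern terms, an o-machine is ordinary computation with access to a membership test for a fixed oracle set. A minimal sketch (the sets A and B below are invented for illustration, not drawn from Turing's or Post's papers):

```python
from typing import Callable

# An oracle is the characteristic function of a set B of natural numbers.
Oracle = Callable[[int], bool]

def decide_A(x: int, oracle: Oracle) -> bool:
    """Decide x in A, where A = {n : exactly one of n, n+1 lies in B}.
    Only finitely many oracle queries are made, so this single procedure
    witnesses A <=_T B for whichever oracle set B is plugged in."""
    return oracle(x) != oracle(x + 1)

def evens(n: int) -> bool:
    return n % 2 == 0

def multiples_of_three(n: int) -> bool:
    return n % 3 == 0
```

The same machine computes different sets relative to different oracles: relative to `evens`, membership alternates at every step, so `decide_A` is true everywhere, while relative to `multiples_of_three` it holds only at numbers adjacent to a multiple of 3.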
By different means, the first formal definition of relative computability to appear in print was in Kleene [1943] which used general recursive functions. A definition of relative computability using canonical sets is in Post’s 1948 abstract [Post, 1948]. The next concept fundamental to degree theory is that of degree itself. Post [1944] defines two unsolvable problems to have the same degree of unsolvability if each is reducible to the other, one to have lower degree of unsolvability than the other if the first is reducible to the second but the second is not reducible to the first, and to have incomparable degree of unsolvability if neither is reducible to the other. The abstraction of this idea to achieve the current concept of degree as an equivalence class of sets of natural numbers each reducible to the other appears first in the Kleene-Post paper [1954]. (Actually, in this paper a degree is defined as an equivalence class of number-theoretic functions, predicates and sets, but the


Klaus Ambos-Spies and Peter A. Fejer

authors realize that there would be no loss of generality in considering sets only.) This same paper is the first place where the upper semi-lattice structure of the Turing degrees is described in print. The origin of the concept of computable enumerability is more straightforward. The concept first appeared in print in Kleene’s article [1936], and his definition is equivalent to the modern one except that he does not allow the empty set as computably enumerable. (Of course he used the term “recursively enumerable” instead of “computably enumerable”.) Post in 1921 invented an equivalent concept which he called a generated set. This work was not submitted for publication until 1941 and did not appear until 1965, in a collection of early papers in computability theory edited by Martin Davis [Post, 1965]. The final concept whose origins we wish to comment on is the jump operator. In 1936, Kleene showed in [Kleene, 1936] that K = {x : {x}(x) ↓} (or more precisely, the predicate ∃yT(x, x, y)) is computably enumerable but not computable. Not having a definition of reducibility at this point, Kleene could not show that K was complete (i.e., that every computably enumerable set is reducible to K). In his 1943 paper, Kleene again shows that K is c.e. but not computable, and here he has a definition of reducibility, but the completeness of K is not shown. Thus it was Post in his 1944 paper who first showed the completeness of K; in fact, he showed that every c.e. set is 1-reducible to K. (Actually, Post’s set K is equivalent to {⟨x, y⟩ : {x}(y) ↓}.)
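As a toy illustration (ours, not from the original text), the sketch below models Kleene’s K in miniature: “machines” are step-bounded Python functions, and dovetailing enumerates exactly those indices e whose machine halts on input e. This is an enumeration procedure, not a decision procedure, which is the point of K being c.e. but not computable. The machine list and the step-budget convention are assumptions made for the example.

```python
def make_machines():
    """A tiny fixed list of toy 'machines'. Each takes an input x and a
    step budget; returning None means 'has not halted within the budget'."""
    def m0(x, steps):           # halts immediately on every input
        return 0 if steps >= 1 else None
    def m1(x, steps):           # halts on input x after x + 5 steps
        return x if steps >= x + 5 else None
    def m2(x, steps):           # never halts, no matter the budget
        return None
    return [m0, m1, m2]

def enumerate_K(machines, max_rounds):
    """Dovetailing: at round t, run each machine e on input e for t steps.
    Yield each index e whose machine halts on input e.  Membership is only
    ever confirmed, never refuted -- just as for Kleene's K."""
    seen = set()
    for t in range(max_rounds):
        for e, m in enumerate(machines):
            if e not in seen and m(e, t) is not None:
                seen.add(e)
                yield e

machines = make_machines()
K_approx = sorted(enumerate_K(machines, 100))
```

Running the enumeration long enough surfaces indices 0 and 1, while index 2 (the non-halting machine) is never emitted; no finite amount of dovetailing certifies its absence.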
Post used the term “complete” to describe K, but wrote in a footnote “Just how to abstract from K the property of completeness is not, at the moment, clear.” By 1948, the abstract concept of completeness had become clear to Post, because he wrote in his abstract [Post, 1948] that to each set S of positive integers he associated a “complete” S-canonical set S′ (S-canonical is equivalent to computably enumerable in S), and each S-canonical set is Turing reducible to S′, while S′ is not reducible to S. Post did not give the definition of S′ in his abstract, nor did he publish his work later. Thus, the first published proof that for each set A there is a set A′ complete for A in the sense of Post is due to Kleene [1952]. The final step in the introduction of the jump operator is in Kleene and Post [1954], where it is shown that if A and B are in the same Turing degree, then so are their jumps, so the jump is well-defined on degrees. The arithmetic hierarchy was invented by Kleene [1943] and independently by Mostowski [1947]. The connection between the arithmetic hierarchy and the jump appears to be due to Post, but he never published it. In his 1948 abstract, Post announces the result that for all n, both of the classes Σn+1, Πn+1 contain a set of higher degree of unsolvability than any set in ∆n+1. The obvious way to see this is the recognition that a set is in Σn+1 if and only if it is one-one reducible to ∅(n+1) and that a set is ∆n+1 if and only if it is Turing reducible to ∅(n). Post gives no indication of how his theorem is proven except that it is connected with the scale of sets ∅, ∅′, ∅′′, . . .. The theorem that a set is ∆n+1 if and only if it is Turing reducible to a finite collection of Σn and Πn sets is attributed by Kleene [1952] to Post and this abstract, and while this result does not explicitly involve the jump, it suggests again that Post was using the sets ∅(n) for his result.
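In modern notation, the two equivalences just described, which presumably lie behind Post’s announcement, can be displayed as:

```latex
A \in \Sigma_{n+1} \iff A \le_1 \emptyset^{(n+1)},
\qquad
A \in \Delta_{n+1} \iff A \le_T \emptyset^{(n)} .
```

Together these give sets in Σn+1 and Πn+1 of strictly higher degree than anything in ∆n+1, since ∅(n+1) is not Turing reducible to ∅(n).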

Degrees of Unsolvability




Having looked at the origin of the basic concepts of degree theory, we now turn to the papers that founded the subject. The first paper in degree theory, and perhaps the most important, is Emil Post’s 1944 paper “Recursively enumerable sets of positive integers and their decision problems.” Beyond the completeness of K, this paper does not contain any results on the Turing degrees. Its importance lies rather in what has become known as Post’s Problem and Post’s Program, as well as in the attention it drew to the field of degree theory, particularly the computably enumerable degrees, and the clarity of its exposition. The results that do occur in the paper were of great importance in two other related fields of computability theory, strong reducibilities and the lattice of c.e. sets under inclusion. Post’s Problem is the question of whether there exists a computably enumerable set that is neither computable nor complete. In degree-theoretic terms, the problem is whether there are more than two c.e. Turing degrees. Post’s Problem received a lot of attention, and the solution finally obtained for the problem introduced the priority method, the most important proof technique in computably enumerable degree theory. Post’s Program was to try to construct a c.e. set that is neither computable nor complete by defining a structural property of a set, proving that sets with the structural property exist, and then showing that any set with the structural property must be noncomputable and incomplete. Post in particular tried to use thinness properties of the complement of a set to achieve this goal. Though Post failed to achieve this goal for Turing reducibility, he succeeded for some stronger reducibilities he introduced in his paper, namely one-one (1), many-one (m), bounded truth-table (btt) and truth-table (tt) reducibilities. These reducibilities, although not as fundamental as Turing reducibility, are very natural and have been widely studied. 
For showing the existence of noncomputable btt-incomplete (hence m- and 1-incomplete) sets, Post introduced simple sets, i.e., c.e. sets whose complements are infinite but contain no infinite c.e. sets. He proved that simple sets exist and cannot be bounded truth-table complete, but can be truth-table complete. Post also introduced hypersimple sets, a refinement of simple sets, and proved that hypersimple sets exist and are truth-table incomplete. He suggested a further strengthening of simplicity, namely hyperhypersimplicity, but he left open the question whether hyperhypersimple sets exist and whether they have to be Turing incomplete. Thus, Post initiated the study of the c.e. sets under reducibilities stronger than Turing reducibility and showed that the structural approach is a powerful tool in this area. Strong reducibilities have been widely studied, particularly in the Russian school of computability theory, where the structural approach has been used very fruitfully, although this approach has not been very successful in studying the Turing degrees. Another area influenced by the results in Post’s paper is the study of the lattice of c.e. sets. In this field, the simple, hypersimple and hyperhypersimple sets have played an important role. Even though the initial solution to Post’s Problem made no use of Post’s Program, the program has had an influence for many decades and eventually was justified. We describe the relevant results here. Myhill [1956] introduced the notion of maximal set. A maximal set is a c.e. set whose complement is as thin as possible, from the computability-theoretic point of view, without being finite. Yates [1965] constructed a complete maximal set, thereby showing that Post’s Program, narrowly defined, cannot succeed. However, taken in a broader sense, namely if one allows any structural property of a c.e. set, not just a thinness property of the complement, Post’s Program does succeed. The first solution, due to Marchenkov [1976] and based on an earlier result of Dëgtev [1973], in part follows Post’s approach quite closely. The thinness notions of Post are generalized by replacing numbers with equivalence classes of any c.e. equivalence relation η. Then it is shown that, for Tennenbaum’s Q-reducibility, η-hyperhypersimple sets are Q-incomplete. Finally this result is transferred to Turing reducibility by observing that any Turing complete semirecursive set is already Q-complete and by showing that there are semirecursive η-hyperhypersimple sets for appropriately chosen η. So this solution combines a thinness property, η-hyperhypersimplicity, with some other structural property, semirecursiveness. In an attempt to define what a natural incompleteness property is, it has been suggested to consider lattice-theoretic properties. After Myhill [1956] observed that the partial ordering of c.e. sets under inclusion is a lattice, this lattice E became a common setting for studying structural properties of the c.e. sets. A property is called lattice-theoretic if it is definable in E. Simplicity, hyperhypersimplicity and maximality are lattice-theoretic, but hypersimplicity and Marchenkov’s incompleteness property are not. The question whether there is a lattice-theoretic solution of Post’s Program was answered positively by Harrington and Soare [1991].
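Post’s simple sets come with a concrete construction, which the toy sketch below follows (our illustration; the finite lists standing in for c.e. sets W_e are an assumption): for each e, put into S the first element of W_e found that exceeds 2e. Meeting every infinite W_e forces the complement of S to contain no infinite c.e. set, while the bound 2e keeps the complement infinite, since at most e of the numbers 0, …, 2e ever enter S.

```python
def post_simple_set(W):
    """Sketch of Post's simple-set construction.  W[e] is a finite list
    standing in for the enumeration of the e-th c.e. set; for each index e,
    add to S the first element of W[e], in enumeration order, that is
    greater than 2e.  An infinite W[e] would eventually supply one."""
    S = set()
    for e, W_e in enumerate(W):
        for x in W_e:
            if x > 2 * e:
                S.add(x)   # S meets W[e], so the complement misses it
                break      # one element per index keeps the complement thick
    return S

# Toy enumerations playing the role of c.e. sets W_0, W_1, W_2.
W = [[5, 1, 2], [0, 1, 2, 3, 4], [9, 8, 7]]
S = post_simple_set(W)
```

With these toy data, S picks 5 (the first element of W_0 above 0), 3 (the first element of W_1 above 2), and 9 (the first element of W_2 above 4).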
To finish our discussion of Post’s paper, we make some comments on the style of exposition. In general, exposition in degree theory has gone from formal to informal. However, Post’s paper is written in a very informal and easy to read style and has often been cited as a good example of exposition. Post’s paper is the text of an invited talk at the February 1944 New York meeting of the American Mathematical Society. Post states as one of his goals to give an intuitive presentation that can be followed by a mathematician not familiar with the formal basis. This does not mean that Post felt that the formal proofs were not needed. In fact, he assures his listeners that with a few exceptions, all of the results he is reporting have been proven formally, and he indicates that he intends to publish the results with formal proofs. (This publication was never completed.) Post adds “Yet the real mathematics must lie in the informal development. For in every instance the informal ‘proof’ was first obtained; and once gotten, transforming it into the formal proof turned out to be a routine chore.” The next milestone in the history of degree theory was the 1954 paper of Kleene and Post. As mentioned above, this paper introduced the degrees as an upper semi-lattice and defined the jump as an operator on degrees. The paper begins the study of the algebraic properties of this upper semi-lattice and points out



additional questions about the structure which inspired much of the earliest work on it. The idea of writing down conditions which a set to be constructed must meet and then breaking down each condition into infinitely many subconditions, called requirements, appears here for the first time. The paper also introduces the coinfinite extension technique for constructing sets. In this technique, an increasing sequence of coinfinite sets S0 ⊆ S1 ⊆ · · · of natural numbers is constructed along with a sequence of binary-valued functions f0, f1, . . ., where each fi has domain Si and each fi+1 extends fi. fn is defined so that any set whose characteristic function extends fn meets the nth requirement. Any set S of natural numbers whose characteristic function extends all the fn’s (if, as usual, the union of the Si is the set of all natural numbers, then there is only one such set) meets all the requirements. When each set Si is finite, this method is called the finite extension method. The authors also noted that the degree of the sets obtained by their constructions is bounded by the jump of the given sets used in the construction. Using this technique, the authors showed a large number of results, including the following:

• between every degree and its jump, there are countable anti-chains and dense countable chains (so in particular there are incomparable degrees below 0′);

• for every nonzero degree, there is a degree incomparable with the given degree;

• there are countable subsets of the degrees that do not have a least upper bound;

• the degrees do not form a lattice.

All but the last of these results used the finite extension method. The last result introduced another technique that proved to be useful: exact pairs. An ideal of the degrees (i.e., a nonempty subset closed downward and closed under joins) has an exact pair if there is a pair of degrees a0, a1 such that the ideal consists of exactly those degrees below both a0 and a1.
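The finite extension method admits a minimal executable sketch (ours, not from the original text; the total 0/1 “candidate” functions stand in, as an assumption, for the countably many conditions a real construction diagonalizes against). Stage n extends a finite partial characteristic function at a fresh argument so that the set being built differs from the nth candidate, and any total extension of the final finite function meets every requirement.

```python
def finite_extension(candidates, universe_size):
    """Finite extension method, toy version.  f is a finite partial 0/1
    function; stage n extends f at a fresh point x so that the set A being
    built satisfies requirement R_n: A differs from candidates[n]."""
    f = {}
    for C in candidates:
        x = 0
        while x in f:          # pick a fresh argument not yet decided
            x += 1
        f[x] = 1 - C(x)        # diagonalize: A(x) != C(x)
    # complete f to (an initial segment of) a total characteristic function
    return [f.get(x, 0) for x in range(universe_size)]

# Three toy "candidate" sets: empty, full, and the odd numbers.
candidates = [lambda x: 0, lambda x: 1, lambda x: x % 2]
A = finite_extension(candidates, 10)
```

Each stage commits only finitely much of A, mirroring how the real construction leaves the status of infinitely many numbers open for later requirements.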
An exact pair for an ideal with no greatest element is necessarily a pair without a meet, and the paper shows that for every degree a, the ideal consisting of the downward closure of {a, a′, a′′, . . .} has an exact pair. The Kleene-Post paper is significant for many reasons. Perhaps most important is the fact that it introduced the study of the algebraic properties of the upper semi-lattice of the degrees as a legitimate activity. This study is still being pursued vigorously 60 years later. Also very important are the techniques introduced. These include not just the coinfinite extension method and the use of exact pairs, but also a general viewpoint towards constructing sets with desired properties: rather than the structural approach attempted earlier by Post, the Kleene-Post approach is to list the requirements to be met and then construct a set to meet those requirements directly. No attempt is made to find “natural” examples. This approach has characterized the field to this day. More specifically, the finite (or



coinfinite) extension method may be viewed as both a precursor to general forcing arguments and an important step towards the priority method, so that it led the way to the predominant techniques (except for coding) in the field. Also significant were the many questions raised in the paper. These included questions concerning what relationships are possible between the jumps of two degrees given the relationship between the degrees themselves, the question of which degrees are in the range of the jump operator, and whether the degrees are dense. Another question, raised by the following sentence in the paper, was the definability of the jump: “While the operation a ∪ b is characterizable intrinsically from the abstract partially ordered system of the degrees as the l.u.b. of a and b, the operation a′ may so far as we know merely be superimposed upon this ordering.” This question has itself been studied intensely, but the question is also significant for having introduced a program of determining which natural operations and subsets of the degrees are definable from the ordering. This program is still being actively pursued and there have been notable successes (see Section 11 below). Many of the questions raised by Kleene and Post were answered by Spector [1956]. Most of these results were proven using the coinfinite extension technique, but the fact that there are minimal degrees (i.e., minimal nonzero elements of the degree ordering), and hence that the degrees are not dense, needed a new technique. Spector’s technique is best explained using trees (as was done by Shoenfield [1966]), although Spector did not present his method this way. A sequence of total binary trees T0, T1, . . . is constructed, with each tree a subtree of the previous one. The trees are selected so that any set whose characteristic function lies on Tn meets the nth requirement. A set whose characteristic function lies on all the trees meets all the requirements.
Spector also proved that every countable ideal of the degrees has an exact pair (an intermediate result about exact pairs was announced by Lacombe [1954]) and that the degrees below 0′ are not a lattice. Shoenfield [1959] was also clearly inspired by the Kleene-Post paper and proves, among other things, that there are degrees below 0′ which are not computably enumerable. The Kleene-Post paper was significant as well for the style of presentation it introduced. Although motivation and intuition are provided in a readable manner, the actual proofs themselves are very formal, using the T predicate, and by contemporary standards are very hard to read, even though the results would not be considered today to be that difficult. Most papers in the field, including the papers of Spector and Shoenfield cited above, were written in this style for many years after the appearance of the Kleene-Post paper. Two aspects of the legacy of the Kleene-Post paper have come in for criticism: the use of purely computability-theoretic methods to prove results when techniques from other areas could be used, and the explication of proofs in a formal way which



makes them hard to read. Myhill [1961] was probably making both criticisms when he wrote: “The heavy symbolism used in the theory of recursive functions has perhaps succeeded in alienating some mathematicians from this field, and also in making mathematicians who are in this field too embroiled in the details of thier[sic] notation to form as clear an overall picture of their work as is desirable. In particular the study of degrees of recursive unsolvability by Kleene, Post, and their successors [in a footnote, Shoenfield and Spector are mentioned here] has suffered greatly from this defect, so that there is considerable uncertainty even in the minds of those whose specialty is recursion theory as to what is superficial and what is deep in this area.” In the paper, Myhill advocates the use of Baire category methods to prove results in degree theory. Those results which do not have such proofs can be considered “truly ‘recursive’” while those results with such proofs are “merely set-theoretic”. In his paper, Myhill proves Shoenfield’s theorem [Shoenfield, 1960] that there is an uncountable collection of pairwise incomparable degrees using category methods. He also states that a Baire category proof of the Kleene-Post theorem that there are incomparable degrees below 0′ will be given in another publication, but this never appeared (and it is not at all clear how such a proof would look, unless Myhill had in mind some bounded Baire category theory as described below). Baire category methods in degree theory are also investigated in Sacks [1963b], Martin [1967], Stillwell [1972] and Yates [1976]. If the collection of all sets with a certain property is a comeager subset of 2^ω (under the usual topology), then by the Baire category theorem the collection is nonempty and a set with the property exists. Martin showed the existence of a noncomputable set whose degree has no minimal predecessors using this method.
Measure theory has also been proposed as a means to prove theorems about degrees. This was first done in Spector [1958]. This paper mainly concerns hyperdegrees and hyperjumps, but it reproves the Kleene-Post result that there is a countably infinite collection of pairwise incomparable degrees. The measure-theoretic approach was also considered in most of the papers listed above that considered Baire category. One way to use measure theory is to show that the collection of all sets with a desired property has measure 1 (in the Lebesgue measure). Martin’s result on minimal predecessors can be obtained this way as well. Still, extremely few results in degree theory can be obtained by just quoting results about Baire category or measure. There are some close relations between Baire category and the finite extension method of Kleene and Post, however. In this method, one shows that the collection of sets meeting each requirement contains a dense open set. Thus the collection of sets meeting all the requirements is comeager and so nonempty. (It follows that if the collection of all sets meeting all requirements is not comeager, then the finite extension method cannot be used to produce a set meeting all the requirements.) In fact, the standard proof of Baire’s



Theorem may be viewed as a finite extension argument. What distinguishes a finite extension argument from a pure Baire category argument is that, by analyzing the complexity of the extension strategies, one obtains bounds on the complexity of the constructed sets which a category argument does not provide, since many of the complexity classes of interest in computability theory are countable, hence meager. Baire category can be made more suitable for degree theory, however, by effectivizing this concept. Such effectivizations have been considered in terms of forcing notions and, in particular, the typical sets obtained this way, called generic sets, played a significant role in the analysis of the global degrees. Feferman [1965] introduced arithmetically generic sets and Hinman [1969] refined this concept by considering n-generic sets related to the nth level Σn of the arithmetical hierarchy. Roughly speaking, an n-generic set has all properties that can be forced by a Σn-extension strategy. Since the class of n-generic sets is comeager, advantages of the Baire category approach are preserved, but since there are n-generic sets computable in the nth jump ∅(n), at the same time we can obtain results on initial segments of D. For instance, we can show the existence of incomparable degrees below 0′ by observing that the even and odd parts of any 1-generic set are Turing-incomparable. It was Jockusch [1980] who emphasized the applicability of these bounded genericity concepts to degree theory. For a comprehensive survey of genericity in degree theory see Kumabe [1996]. In a similar way, the application of algorithmic randomness concepts, in particular 1-randomness due to Martin-Löf [1966], has made the measure approach more suitable for degree theory. Good overviews of this approach can be found in the monographs by Downey and Hirschfeldt [2010] and Nies [2009].
The arithmetical forcing notions can be viewed as special cases of Cohen forcing [Cohen, 1963], which plays a fundamental role not only in set theory but also in global degree theory. In general, forcing techniques became the most fundamental tool for the study of the structure D. The techniques developed here are not limited to Cohen-style forcing notions based on the Baire category idea, however; there is a variety of other forcing techniques, for instance forcing notions based on trees (Spector–Sacks forcing) and on measure, to name just a few. In his monograph [1983], Lerman introduces a framework for forcing which is tailored to applications in the degrees and which captures the fundamental forcing notions used there. The second criticism of the Kleene-Post legacy, concerning style of presentation, was eventually accepted. Starting around 1965, a more informal style of exposition, as in Post’s 1944 paper, became the norm. We will discuss this in Section 6. The genesis of [Kleene and Post, 1954] was described this way by Kleene in [Crossley, 1975]: This [anyone who does not publish his work should be penalized] is just what I wrote to Emil Post, on construction of incomparable degrees and things like that, and he made some remarks and hinted at having some results and I said (in substance): “Well, when you leave it this



way, you say you have these results, you don’t publish them. The fact that you have them prevents anyone else who has heard of them from doing anything on it.” So he said (in substance): “You have sort of pricked my conscience and I shall write something out”, and he wrote some things out, in a very disorganized form, and he suggested that I give them to a graduate student to turn into a paper. As I recall, I think I did try them on a graduate student, and the graduate student did not succeed in turning them into a paper, and then I got interested in them myself, and the result was eventually the Post-Kleene paper. [...] There were things that Post did not know, like that there was no least upper bound. You see, Post did not know whether it was an upper semi-lattice or a lattice. I was the one who settled that thing. The paper itself does not state which author is responsible for which contribution. Davis, who was Post’s student as an undergraduate, states in [Post, 1994] that Post announced in his 1948 abstract the result that there are incomparable degrees below 0′ and discussed this result with Davis in a reading course. Although it is not true that Post announces this result in the abstract, it is clear from Davis’ recollection that the result is due to Post. A complete understanding of who proved what in this paper will probably never be obtained. Post struggled with manic-depressive disease his whole life and, according to Davis (see [Post, 1994]), died of a heart attack in a mental institution shortly after an electro-shock therapy session. The Kleene-Post paper was his last. For more details on Post’s life see [Post, 1994]. Kleene’s real interests were in generalized recursion theory and [Kleene and Post, 1954] is his only paper in the Turing degrees.

4


Post’s Problem was solved independently by Friedberg [1957c] and Muˇcnik [1956] (see [Muˇcnik, 1958] for an expanded version). Both show that there are incomparable c.e. degrees and therefore that incomplete, noncomputable c.e. sets exist. In his abstract [Friedberg, 1956], Friedberg refers to his solution as making the Kleene-Post construction of incomparable degrees below 0′ “recursive”. The new technique introduced by both papers to solve the problem has come to be known as the priority method. The version used in these papers is specifically known as the finite injury priority method. In the priority method, one has again requirements or conditions which the sets being constructed must meet, as in the finite extension method. Usually when the priority method is used, the set to be constructed must be c.e., so it is constructed as the union of a uniformly computable increasing sequence of finite sets, the ith finite set consisting of those elements enumerated into the set by the end of stage i of the construction. The requirements are listed in some order with requirements earlier in the order having higher priority than ones later in



the order. In a coinfinite extension argument, at stage n action is taken to meet requirement n. This action consists of specifying that certain numbers are in the set being constructed and others are not in the set. The status of infinitely many numbers is left unspecified. Action at all future stages obeys these restrictions. Because the determination of what action to take at a given stage cannot be made effectively, the set constructed by this method is not c.e. In the priority method, at stage n action is taken for whichever is the highest priority requirement R_{i_n} that appears to need attention at the stage. Action consists of adding numbers into the set (which cannot be undone later) and wanting to keep other numbers out of the set. If at a later stage a higher priority requirement acts and wants to put a number into the set which R_{i_n} wanted to keep out, then this number is added and R_{i_n} is injured and must begin again. On the other hand, no lower priority requirement can injure R_{i_n}. In a finite injury priority argument, each requirement only needs to act finitely often to be met, once it is no longer injured. (For the solution to Post’s problem, each requirement needs to act at most twice after it is no longer injured.) By induction, it follows that each requirement is injured only finitely often, is met, and acts only finitely often. In the Friedberg-Mučnik solution to Post’s Problem, requirements are of the form A ≠ {e}^B and B ≠ {e}^A, where A and B are the two c.e. sets being built whose degrees are to be incomparable and {e} is the eth Turing reduction. Action for A ≠ {e}^B consists of choosing a witness x not restrained by any higher priority requirement on which it is desired to obtain A(x) ≠ {e}^B(x) and then waiting for a stage s with the current approximation {e}^{B_s}_s(x) equal to 0. Then x is put into A and numbers less than the use of the computation {e}^{B_s}_s(x) that are not in B_s are restrained from B.
If this restraint is never violated, the requirement is met. A higher priority requirement of the form B ≠ {i}^A may act later and injure the original requirement, but each requirement acts only finitely often after it stops being injured, so all requirements are met. The priority method is fundamental for the study of the computably enumerable degrees and has applications in other areas of computability theory as well. Friedberg’s paper [1958] contains three further applications of the finite injury method. He shows that every noncomputable c.e. set is the union of two disjoint noncomputable c.e. sets (the Friedberg Splitting Theorem), that maximal sets exist, and that there is an effective numbering of the c.e. sets such that each c.e. set occurs exactly once in the numbering. The Friedberg Splitting Theorem is a particularly simple priority argument, as there are no injuries. Priority is just used to decide which requirement to satisfy at a given stage when there is more than one requirement that can be satisfied. In the maximal set construction, there is a set of movable markers {Γ_e}_{e∈ω}. Each marker Γ_e has associated with it a binary string of length e called its e-state. The e-state is determined by the position of Γ_e. The eth requirement is that the e-state of Γ_e be lexicographically at least as great as the e′-state of all markers Γ_{e′} with e′ > e. Once markers Γ_{e′′} with e′′ < e stop moving, Γ_e moves at most 2^e − 1 times. Here is a case where the maximum number of times a requirement R_n can act after higher priority requirements stop



acting depends on n but is still computable. Finite injury constructions can often be combined with a method called the permitting method to push constructions below a nonzero c.e. degree. In the simplest version of the permitting method, two effective enumerations {A_s}_{s∈ω} and {B_s}_{s∈ω} of c.e. sets A and B have the property that for all x, s, x ∈ A_{s+1} − A_s implies (∃y ≤ f(x))(y ∈ B_{s+1} − B_s), where f(x) is a computable function. It follows that A ≤_T B, because if s is a stage such that every number less than or equal to f(x) that belongs to B is already in B_s, then x ∈ A if and only if x ∈ A_s. In many cases, the function f is the identity function. The first argument that uses permitting is in Dekker [1954], but the principle is not stated in a more abstract manner until Yates [1965]. The first theorem in degree theory that can be proven using permitting is the result claimed by Mučnik [1956] and proven by Friedberg [1957a] (after seeing Mučnik’s claim) that below any nonzero c.e. degree there are two incomparable c.e. degrees. When this result is proven using permitting to construct the two incomparable c.e. sets A and B both reducible to a noncomputable c.e. set C, the requirements are as given above, but before a number x can be put into, say, A to meet a requirement, a number y less than or equal to x has to enter C. A single requirement A ≠ {e}^B can now have more than one follower, i.e., number x on which the requirement tries to make A and {e}^B different. A follower x is appointed and if later {e}^{B_s}_s(x) = 0, then the follower is realized. Once the follower is realized, restraint is put on the lower priority requirements to preserve the computation and the follower will be put into A if C permits at some later stage. Meanwhile another follower is appointed and it goes through the same cycle. This action continues until either a follower is never realized or a realized follower is permitted to be put into A.
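The reduction A ≤_T B implicit in permitting (with f the identity) can be sketched concretely. The code below is our illustration, not from the original text; the finite stage lists are assumed toy data satisfying the permitting condition, standing in for genuine effective enumerations.

```python
def decide_A_with_oracle_B(x, A_stages, B_stages, B):
    """Permitting sketch with f(x) = x: if x enters A at stage s+1, some
    y <= x enters B at stage s+1.  So find a stage s by which B has
    settled below x (every member of B that is <= x is already in B_s);
    after that stage no number <= x enters B, hence no new x can enter A.
    Therefore x is in A iff x is in A_s."""
    for s in range(len(B_stages)):
        if all(y in B_stages[s] for y in B if y <= x):
            return x in A_stages[s]
    raise RuntimeError("enumeration too short to settle B below x")

# Toy stage enumerations satisfying the permitting condition:
# 2 enters A at stage 1 while 1 (<= 2) enters B; 4 enters A at stage 2
# while 3 (<= 4) enters B.
A_stages = [set(), {2}, {2, 4}]
B_stages = [set(), {1}, {1, 3}]
B = B_stages[-1]   # the "oracle": the finished set B
```

Only finitely many B-queries (membership of numbers up to x) are needed to decide x ∈ A, which is exactly the Turing reduction the permitting condition buys.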
Each requirement acts only finitely often after it stops being injured because if not, then there are infinitely many followers, all realized. Once a follower x is realized at stage s, no number less than x enters C at a stage greater than s. This makes C computable, contradicting the assumption. Thus, each requirement acts only finitely often after it stops being injured. The requirement is met because either a follower is never realized or a diagonalization is successfully carried out. Note that here we have no computable bound on how often a requirement acts; however, it is only negative action that we cannot bound. Once a requirement stops being injured, it only acts once positively. While the finite injury technique had many successes, it has obvious limitations as well. In general, any construction that involves coding a given noncomputable c.e. set into a set being built will involve infinite injury. As we will discuss in the following sections, more powerful techniques were invented to deal with this type of construction. Nonetheless, important results (for example [Downey and Lempp, 1997]) were proven using the finite injury technique, albeit in sophisticated ways, long after infinite injury techniques were invented. Just as category can be used to help investigate the limits of what can be proven with the finite extension method, Maass [1982], Jockusch [1985] and Nerode and Remmel [1986] introduced some effective genericity concepts for c.e. sets designed to determine what can be


Klaus Ambos-Spies and Peter A. Fejer

proven about a c.e. set with finite injury constructions. Given the ubiquity of the priority method in proving results about the c.e. degrees and the importance of Post's Problem, it is natural to ask if this problem can be solved without the priority method. The two solutions mentioned in the previous section that are in the spirit of Post's Program also use the priority method. However, Kučera [1986] has given a priority-free solution. Kučera obtained his solution from the existence of low fixed-point-free functions (Jockusch and Soare [1972]) by observing that any such function bounds a simple set. The sets constructed by the priority method to solve Post's Problem serve no purpose other than being a solution. One might then ask if there are any natural solutions to Post's Problem. Since naturalness is not a precisely defined notion, this question is rather vague, but it is fair to say that every particular c.e. set of natural numbers that has arisen from nonlogical considerations so far is either computable or complete. (For some of the strong reducibilities there are "natural" examples of incomplete c.e. sets: Kolmogorov [1965] observed that the set of algorithmically compressible strings is simple, hence not btt-complete. As Kummer [1996] has shown, however, this set is tt-complete, hence T-complete.) Thus one could say that the great complexity in the structure of the c.e. degrees arises solely from studying unnatural problems. However, it is true that every c.e. degree can be obtained by a process studied outside of computability theory, even if the particular instances of the process that produce noncomputable, incomplete degrees do not arise in practice. For example, Boone [1965] shows that every c.e. degree contains the word problem for a finitely presented group, while Feferman [1957] shows that every c.e. degree is the degree of a recursively axiomatizable theory.
While, in general, we do not provide biographical details in this chapter, we feel that some background information on the invention of the priority method by Friedberg and Mučnik would be of interest. In a phone conversation of July 1999 we asked Richard Friedberg about the genesis of his work in computability theory. Friedberg was a mathematics and physics major at Harvard. In the summer of 1955 he was looking for a topic for a senior thesis. He was advised by David Mumford to look into metamathematics and so read Kleene's book [1952]. In the book, Kleene asks if there are incomparable degrees and Friedberg solved this problem on his own. When he found out that the solution was in the Kleene-Post paper, he was encouraged because the solution had only been published recently. He next found two degrees neither of which is computable in the jump of the other. Friedberg wrote to Kleene about this and Kleene suggested that Friedberg work on Post's Problem. Friedberg worked on the problem the whole fall of 1955 without making any progress. He was taking a seminar with Hao Wang at Harvard and for his term paper decided to write about his attempts to solve the problem and why they didn't work. In the course of writing this paper, he solved the problem. Friedberg's official advisor at Harvard was Willard Quine, but his real advisor was Hartley Rogers. Friedberg explained his result to Rogers and then sent it to Kleene. There was a mistake in his write-up and he received a skeptical reply from Kleene. He fixed the mistake and resent

Degrees of Unsolvability


his proof. This time Kleene said it was correct. Friedberg then sent in a notice to the Bulletin of the AMS (received January 10, 1956). After this, Friedberg was invited to speak at Princeton University and the Institute for Advanced Study, where he met Kurt Gödel, Georg Kreisel and Freeman Dyson. He gave talks at other universities and met Hilary Putnam and Alfred Tarski among others. After graduating from Harvard, Friedberg went to medical school for two and a half years. During the summers of 1957 and 1958, he worked at IBM with Bradford Dunham. In 1957, Dunham took his group to Cornell University for the AMS meeting in Recursion Theory. There, Putnam told Friedberg about the maximal set problem, which he solved once he got back to IBM. Friedberg's favorite among his theorems is his theorem on numberings, which he believes is his hardest. After leaving medical school, Friedberg went to graduate school in physics at Columbia. He received his PhD in 1962 and is a professor emeritus of physics at Barnard College and Columbia University. According to Al. A. Mučnik's son Andrei A. Mučnik, who was among the leading experts in algorithmic randomness, his father was a Ph.D. student at the Pedagogical Institute in Moscow when he learned about Post's Problem. In 1954 his thesis advisor, Petr Sergeevich Novikov, presented the problem in a seminar talk. Novikov expressed his expectation that this question would be resolved within the next two years. When Mučnik worked on Post's Problem he was familiar with the papers by Post and by Kleene and Post. He solved the problem in 1955 and the solution became the core of his Ph.D. thesis. Mučnik's results were highly appreciated and he presented his work at some of the major mathematics conferences in the USSR. After his Ph.D. Mučnik became a researcher at the Institute of Applied Mathematics at the Academy of Sciences in Moscow.
He continued to work in computability theory and mathematical logic but he did not obtain any further results on the degrees of unsolvability. Thus, like Friedberg, Mučnik left degree theory shortly after he obtained his fundamental result though, unlike Friedberg, he stayed in the field of logic. While Friedberg's work had a deep impact on the further development of computability theory in the United States and Britain, Mučnik's lasting influence on the Russian computability community was much more limited.


A typical requirement that cannot be handled by a finite injury strategy is a Friedberg–Mučnik type requirement A ≠ {e}^B, where B is subject to infinitary positive requirements. It was exactly this type of requirement for which the infinite injury method was first used, by Shoenfield [1961]. For any set X and number e, let X^[e] = {⟨x, e⟩ : ⟨x, e⟩ ∈ X} and call a subset A of a set B a thick subset of B if for all e, B^[e] − A^[e] is finite. Shoenfield's theorem was that if B is a c.e. set with B^[e] finite or equal to ω^[e] for every e, then there is a thick c.e. subset A of B that is not complete. In the proof, the incompleteness of A is shown by constructing



a c.e. set D with D ≰T A. The requirements D ≠ {e}^A have to be met in spite of the infinitary positive requirements on A to be a thick subset of B. Shoenfield applied his theorem to constructing theories, not to degree theory, but one can show the existence of an incomplete high c.e. degree using the theorem. (A c.e. degree a is high if it has the highest possible jump, i.e., a′ = 0′′.) The next step towards using the infinite injury technique in degree theory was the Sacks Splitting Theorem shown in [Sacks, 1963b], the proof of which requires a variant of the finite injury technique more widely applicable than the original one used by Friedberg and Mučnik. The theorem states that if B is a c.e. set and C is a noncomputable set Turing reducible to ∅′, then there are disjoint c.e. sets A_0 and A_1 such that B = A_0 ∪ A_1 and C ≰T A_i for i = 0, 1. The key requirements are of the form C ≠ {e}^{A_i}. These are harder to meet than the requirements in the Friedberg–Mučnik Theorem because C is a given set. Sacks' insight was to use a preservation strategy. By putting restraint on A_i to preserve computations C_s(x) = {e}_s^{A_{i,s}}(x), one forces a difference between {e}^{A_i} and C, since otherwise C would be computable. This construction is finite injury, but there is no computable bound on the negative and positive injuries. The first theorem in degree theory proven using the infinite injury method was the Sacks Jump Theorem [Sacks, 1963c]. This theorem states that a degree c is the jump of a c.e. degree if and only if 0′ ≤ c and c is c.e. in 0′. Furthermore, given such a degree c and a degree b with 0 < b ≤ 0′, one can find a c.e. degree a with a′ = c and b ≰ a. Previously, Shoenfield [1959] had shown that a degree c is the jump of a degree ≤ 0′ if and only if 0′ ≤ c and c is c.e. in 0′, and Friedberg [1957b] had shown a degree c is the jump of another degree if and only if c ≥ 0′. Sacks' proof made use of the preservation strategy.
Infinite injury proofs vary greatly, but they have some common features. The most basic feature is the existence of infinitary positive and/or negative requirements. Infinitary negative requirements put on a restraint that has an infinite lim sup; however, in order for it to be possible for the positive requirements to be met, a method is found to ensure that the restraint for each negative requirement has finite lim inf. Even after this is done, there is a synchronization problem. Two negative requirements, each with finite lim inf of restraint, can still have a combined restraint with infinite lim inf. One way of dealing with this problem is to have followers of positive requirements get past the restraints of negative requirements one at a time. This was Sacks’ approach and it was later formalized in the so-called pinball machine model introduced in [Lerman, 1973]. Another approach is the nested strategies method ([Lachlan, 1966b]) where the restraint of one negative requirement is based on the current restraint of the higher priority negative requirements. In this way, it is sometimes possible to get all the negative restraints to fall back simultaneously. Yet another model is the priority tree model ([Lachlan, 1975b]). Here, each requirement has several strategies. Each strategy makes a guess about the outcomes of the higher priority requirements. Each strategy is assigned a node in a tree. In the simplest case, the strategies for the nth requirement are put on level n of the tree. There is then a true path through the



tree consisting of those strategies whose guess is correct and along this path the action is finitary even though the overall action for a requirement is still infinitary. At any stage of the construction, there is a guess about the true path, called the accessible path, and action is limited to accessible strategies. The accessible paths approximate the true path in the sense that for any given length n, the initial segment of length n of the true path is the lim inf of the initial segments of length n of the accessible paths. This means that a 0′′ oracle can determine both the true path and how each requirement is met. This tree representation can be used to explain the difference between finite and infinite injury. When we model a finite injury argument using a priority tree, due to the fact that every strategy acts only finitely often, the accessible paths approximate the true path more effectively, namely, the true path becomes the limit of the accessible paths, not just the lim inf, so here the true path and the way a requirement is satisfied can be recognized by using 0′ as an oracle. So in modern terminology, the finite injury and infinite injury methods are also called the 0′-priority method and the 0′′-priority method. Shoenfield's original infinite injury construction does not use a tree, but he has strategies that make guesses about the outcomes of higher priority positive requirements, so his proof could be viewed as a forerunner of the tree model for infinite injury constructions. The next significant result in degree theory after the Jump Theorem that used infinite injury was the Density Theorem of Sacks [1964]. This theorem states that the c.e. degrees are dense: given two c.e. sets C, D with C <T D, there is a c.e. set E with C <T E <T D.

For many NP-hard optimization problems there is a constant δ > 1 such that they cannot be approximated within a factor of δ unless P = NP.
Since these initial works on probabilistically checkable proofs, we have seen a large number of outstanding papers improving the proof systems and getting stronger hardness of approximation results. Håstad [1997] gets tight results for some approximation problems. Arora [1998], after failing to achieve lower bounds for traveling salesman in the plane, has developed a polynomial-time approximation algorithm for this and related problems. A series of results due to Cai, Condon, Lipton, Lapidot, Shamir, Feige and Lovász [Cai et al., 1992; Cai et al., 1990; Cai et al., 1991; Feige, 1991; Lapidot and Shamir, 1991; Feige and Lovász, 1992] have modified the protocol of Babai, Fortnow and Lund [1991] to show that every language in NEXP has a two-prover, one-round proof system with an exponentially small error. This problem remained so elusive because running these proof systems in parallel does not have the expected error reduction [Fortnow et al., 1994]. In 1995, Raz [1998] showed that the error does go down exponentially when these proof systems are run in parallel.



If you generate a random number on a computer, you do not get a truly random value, but a pseudorandom number computed by some complicated function on some small, hopefully random seed. In practice this usually works well, so perhaps in theory the same might be true. Many of the exciting results in complexity theory in the 1980s and 1990s consider this question of derandomization: how to reduce or eliminate the number of truly random bits needed to simulate probabilistic algorithms. The first approach to this problem came from cryptography. Blum and Micali [1984] were the first to show how to create randomness from cryptographically hard functions. Yao [1990] showed how to reduce the number of random bits of any algorithm based on any cryptographically secure one-way permutation. Håstad, Impagliazzo, Levin and Luby [1999], building on techniques of Goldreich and Levin [1989] and Goldreich, Krawczyk and Luby [1993], show that one can get pseudorandomness from any one-way function. Nisan and Wigderson [1994] take a different approach. They show how to get pseudorandomness based on a language hard against nonuniform computation. Impagliazzo and Wigderson [1997], building on this result and Babai, Fortnow, Nisan and Wigderson [1993], show that BPP equals P if there exists a language in exponential time that cannot be computed by any subexponential circuit.
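To make the idea of stretching a short seed into many pseudorandom bits concrete, here is a toy sketch in the style of the Blum–Blum–Shub generator, a construction related to (but not identical with) those cited above: repeatedly square modulo N = pq and output the low-order bit. The tiny modulus is purely illustrative and completely insecure.

```python
# Toy pseudorandom generator: square modulo N and emit the low bit.
# N = 7 * 11 is far too small to be secure; real instances use a
# modulus that is a product of two large primes congruent to 3 mod 4.

def bbs_bits(seed, n_bits, N=7 * 11):
    x = seed % N
    out = []
    for _ in range(n_bits):
        x = (x * x) % N       # one squaring step
        out.append(x & 1)     # output the low-order bit
    return out

print(bbs_bits(seed=5, n_bits=16))  # 16 bits from a one-number seed
```

With this toy modulus the bit sequence quickly cycles; the point is only the shape of the construction, a short seed expanded into a long bit string by iterating a hard-to-invert operation.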


Lance Fortnow and Steven Homer

For derandomization of space we have several unconditional results. Nisan [1992] gives general tools for derandomizing space-bounded computation. Among the applications, he gets an O(log^2 n) space construction for universal traversal sequences for undirected graphs. Saks and Zhou [1999] show that every probabilistic logarithmic space algorithm can be simulated in O(log^{3/2} n) deterministic space.


Many of the fundamental concepts and methods of complexity theory have their genesis in mathematical logic, and in computability theory in particular. This includes the ideas of reductions, complete problems, hierarchies and logical definability. It is a well-understood principle of mathematical logic that the more complex a problem's logical definition (for example, in terms of quantifier alternation), the more difficult its solvability. Descriptive complexity aims to measure the computational complexity of a problem in terms of the complexity of the logical language needed to define it. As is often the case in complexity theory, the issues here become more subtle and the measure of the logical complexity of a problem more intricate than in computability theory. Descriptive complexity has its beginnings in the research of Jones, Selman and Fagin [Jones and Selman, 1974; Fagin, 1973; Fagin, 1974] and others in the early 1970s. More recently descriptive complexity has had significant applications to database theory and to computer-aided verification. The groundbreaking theorem of this area is due to Fagin [1973]. It provided the first major impetus for the study of descriptive complexity. Fagin's Theorem gives a logical characterization of the class NP. It states that NP is exactly the class of problems definable by existential second-order Boolean formulas. This result, and others that follow, show that natural complexity classes have an intrinsic logical complexity. To get a feel for this important idea, consider the NP-complete problem of 3-colorability of a graph. Fagin's theorem says there is a second-order existential formula which holds for exactly those graphs which are 3-colorable. This formula can be written as (∃A, B, C)(∀v)[(A(v) ∨ B(v) ∨ C(v)) ∧ (∀w)(E(v, w) → ¬(A(v) ∧ A(w)) ∧ ¬(B(v) ∧ B(w)) ∧ ¬(C(v) ∧ C(w)))]. Intuitively this formula states that every vertex is colored by one of three colors A, B, or C and no two adjacent vertices have the same color.
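The existential second-order quantifier (∃A, B, C) can be checked by brute force on small graphs: try every way of assigning the vertices to the three sets and test the first-order part. The example graphs below are hypothetical.

```python
from itertools import product

# Brute-force evaluation of the 3-colorability formula above: there
# EXIST sets A, B, C covering the vertices such that no edge joins two
# vertices in the same set.

def three_colorable(V, E):
    # Try every partition of V into color classes A, B, C.
    for coloring in product("ABC", repeat=len(V)):
        if all(coloring[v] != coloring[w] for (v, w) in E):
            return True
    return False

C4 = ([0, 1, 2, 3], {(0, 1), (1, 2), (2, 3), (3, 0)})      # a 4-cycle
K4 = ([0, 1, 2, 3],
      {(i, j) for i in range(4) for j in range(4) if i < j})  # complete graph

assert three_colorable(*C4)        # even cycles are 2-colorable
assert not three_colorable(*K4)    # K4 needs 4 colors
```

The exponential search over set assignments mirrors exactly what nondeterminism buys in the NP machine model, which is the content of Fagin's correspondence.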
A graph, considered as a finite model, satisfies this formula if and only if it is 3-colorable. Fagin’s theorem was the first in a long line of results which prove that complexity classes can be given logical characterizations, often very simply and elegantly. Notable among these is the theorem of Immerman and Vardi [Immerman, 1982; Vardi, 1982] which captures the complexity of polynomial time. Their theorem states that the class of problems definable in first order logic with the addition of the least fixed point operator is exactly the complexity class P. Logspace can be characterized along these same lines, but using the transitive closure (TC) operator

Computational Complexity


rather than least fixed point. That is, nondeterministic logspace is the class of problems definable in first order logic with the addition of TC (see Immerman [1988]). And if one replaces first order logic with TC by second order logic with TC, the result is PSPACE (see Immerman [1983]). Other, analogous results in this field go on to characterize various circuit and parallel complexity classes, the polynomial time hierarchy, and other space classes, and even yield results concerning counting classes. The intuition provided by looking at complexity theory in this way has proved insightful and powerful. In fact, one proof of the famous Immerman–Szelepcsényi Theorem [Immerman, 1988; Szelepcsényi, 1988] (that by Immerman) came from these logical considerations. This theorem says that any nondeterministic space class which contains logspace is closed under complement. An immediate consequence is that the context sensitive languages are closed under complement, answering a question which had been open for about 25 years. To this point we have considered several of the most fully developed and fundamental areas of complexity theory. We now survey a few of the more central topics in the field dealing with other models of computation and their complexity theory. These include circuit complexity, communication complexity and proof complexity.



Circuit Complexity

The properties and construction of efficient Boolean circuits are of practical importance as they are the building blocks of computers. Circuit complexity studies bounds on the size and depth of circuits which compute a given Boolean function. Aside from their practical value, such bounds are closely tied to important questions about Turing machine computations. Boolean circuits are directed acyclic graphs whose internal nodes (or "gates") are Boolean functions, most often the "standard" Boolean functions and, or and not. In a circuit, the nodes of in-degree 0 are called input nodes and labeled with input variables. The nodes with out-degree 0 are called output nodes. The value of the circuit is computed in the natural way by giving values to the input variables, applying the gates to these values, and computing the output values. The size, s(C), of a circuit C is the number of gates it contains. The depth, d(C), of a circuit C is the length of the longest path from an input to an output node. A circuit with n inputs can be thought of as a recognizer of a set of strings of length n, namely those which result in the circuit evaluating to 1. In order to consider circuits as recognizing an infinite set of strings, we consider circuit families, which are infinite collections of circuits, C_n, one for each input length. In this way a circuit family can recognize a language just as a Turing machine can. A circuit family is a nonuniform model: the function taking n to C_n may not be computable. A nonuniform circuit family can recognize noncomputable sets. We



can measure the size and depth of circuit families using asymptotic notation. So, for example, we say that a circuit family has polynomial size if s(C_n) is O(p(n)) for some polynomial p(n). Any language in P has polynomial size circuits; that is, it is recognized by a circuit family which has polynomial size. And so proving that some NP problem does not have polynomial size circuits would imply that P ≠ NP. Largely because of many such implications for complexity classes, considerable effort has been devoted to proving circuit lower bounds. However, to this point this effort has met with limited success. In an early paper, Shannon [1949] showed that most Boolean functions require exponential size circuits. This proof was nonconstructive, and proving bounds on specific functions is more difficult. In fact, no non-linear lower bound is known for the circuit size of a concrete function. To get more positive results one needs to restrict the circuit families being considered. This can be done by requiring some uniformity in the function mapping n to C_n, or it can be done by restricting the size or depth of the circuits themselves. For example, the class AC0 consists of those languages recognized by uniform, constant depth, polynomial size circuits with and, or and not gates which allow unbounded fan-in. One early and fundamental result, due to Furst, Saxe and Sipser [1988] and Ajtai [1983], is that the parity function is not in AC0, and in fact requires exponential size AC0-type circuits [Yao, 1990]. This immediately implies that AC0 differs from the class ACC of languages which have circuit families made from AC0 circuits with the addition of Mod_m gates, with m fixed for the circuit family. It can also be shown to imply the existence of an oracle separating the polynomial hierarchy from PSPACE. It is also known that the classes ACC(p) are all distinct, where only Mod_p gates are allowed, for p a prime. This was shown by Smolensky [1987] and Razborov [1998].
ACC itself has resisted all lower bound techniques and in fact it is not even known to be properly contained in NP. Razborov [1985a] showed that clique does not have small monotone circuits, i.e., circuits with just AND and OR gates and no negations. However, this result says more about the limitations of monotone circuits, as Razborov [1985] showed that the matching problem, known to be in P, also does not have small monotone circuits.
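The definitions of circuit size and depth used throughout this section can be made concrete with a minimal evaluator. The circuit, gate names, and inputs below are hypothetical; the circuit computes (x1 AND x2) OR (NOT x1).

```python
# Minimal Boolean-circuit evaluator: a circuit is a DAG of and/or/not
# gates over input variables; size is the number of gates, depth the
# longest input-to-output path.

# Each gate maps its name to (operation, list of operand names);
# anything not listed here is an input variable.
CIRCUIT = {
    "g1": ("and", ["x1", "x2"]),
    "g2": ("not", ["x1"]),
    "out": ("or", ["g1", "g2"]),
}

def evaluate(circuit, inputs, node):
    if node in inputs:                     # an input variable
        return inputs[node]
    op, args = circuit[node]
    vals = [evaluate(circuit, inputs, a) for a in args]
    if op == "and": return all(vals)
    if op == "or":  return any(vals)
    if op == "not": return not vals[0]

def size(circuit):
    return len(circuit)                    # number of gates, s(C)

def depth(circuit, node, inputs):
    if node in inputs:
        return 0
    return 1 + max(depth(circuit, a, inputs) for a in circuit[node][1])

ins = {"x1": True, "x2": False}
assert evaluate(CIRCUIT, ins, "out") == False   # (T and F) or (not T)
assert size(CIRCUIT) == 3 and depth(CIRCUIT, "out", ins) == 2
```

A circuit family in the sense of the text would be one such dictionary C_n for every input length n.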


Communication Complexity

Much of modern computer science deals with the speed and efficiency at which digital communication can take place. Communication complexity is an attempt to model the efficiency and intrinsic complexity of communication between computers. It studies problems which model typical communication needs of computations and attempts to determine the bounds on the amount of communication between processors that these problems require. The basic question of communication complexity is, how much information do two parties need to exchange in order to carry out a computation? We assume both parties have unlimited computational power.



For example, consider the case where both parties have n input bits and they want to determine if there is a position i ≤ n where the two bits in position i match. It is not hard to see that the communication complexity of this problem is n, as the n bits are independent and in the worst case all n bits of one party have to be transmitted to the other. Now consider the problem of computing the parity of a string of bits where half of the bits are given to party 1 and the other half to party 2. In this case, party 1 need only compute the parity of her bits and send this parity to party 2, who can then compute the parity of the whole bit string. So in this case the communication complexity is a single bit. Communication complexity has provided upper and lower bounds for the complexity of many fundamental communication problems. It has clarified the role which communication plays in distributed and parallel computation as well as in the performance of VLSI circuits. It also applies to and has had an impact on the study of interactive protocols. For a good survey of the major results in this field, consult Kushilevitz and Nisan [1996].
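The one-bit parity protocol described above can be written out directly; a minimal sketch with a hypothetical input string:

```python
# Parity with one bit of communication: party 1 sends the parity of her
# half of the input; party 2 combines it with the parity of his half.

def parity_protocol(bits1, bits2):
    message = sum(bits1) % 2           # the single transmitted bit
    return (message + sum(bits2)) % 2  # party 2 announces the answer

x = [1, 0, 1, 1, 0, 1, 0, 0]           # hypothetical shared input
assert parity_protocol(x[:4], x[4:]) == sum(x) % 2
```

The contrast with the matching problem is exactly the point: parity "compresses" to one bit, while in the worst case the matching problem forces one party to send all n bits.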


Proof Complexity

The class NP can be characterized as those problems which have short, easily verified membership proofs. Dual to NP-complete problems like SAT are co-NP-complete problems, such as TAUT (the collection of propositional tautologies). TAUT is not known to have short, easily verified membership proofs, and in fact if it did then NP = co-NP (see Cook and Reckhow [1973]). Proof complexity studies the lengths of proofs in propositional logic and the connections between propositional proofs and computational complexity theory, circuit complexity and automated theorem proving. In the last decade there have been significant advances in lower bounds for propositional proof complexity as well as in the study of new and interesting proof systems. Cook and Reckhow [1973] were the first to make the notion of a propositional proof system precise. They realized that to do this they needed to specify exactly what a proof is and to give a general format for presenting and efficiently verifying a proof p. They defined a propositional proof system S to be a polynomial-time computable predicate R such that for all propositional formulas F, F ∈ TAUT ⇐⇒ ∃p R(F, p). The complexity of S is then defined to be the smallest function f : N → N which bounds the lengths of the proofs of S as a function of the lengths of the tautologies being proved. Efficient proof systems, those with complexity bounded by some polynomial, are called polynomially bounded proof systems. Several natural proof systems have been defined and their complexity and relationships explored. Among the most studied are Frege and extended Frege proof systems [Urquhart, 1987] and [Krajíček and Pudlák, 1989], refutation systems, most notably resolution [Robinson, 1965], and circuit-based proof systems [Ajtai, 1983] and [Buss, 1987]. We briefly discuss the complexity of resolution systems



here, but see Beame and Pitassi [1998] for a nice overview of results concerning these other proof systems. Resolution proof systems are the most well-studied model. Resolution is a very restricted proof system and so has provided the setting for the first lower bound proofs. Resolution proof systems are refutation systems where a statement D is proved by assuming its negation and deriving a contradiction from this negation. In a resolution proof system there is a single rule of inference, resolution, which is a form of cut. In its propositional form it says that if F ∨ x and G ∨ ¬x are true then F ∨ G follows. A restricted form of resolution, called regular resolution, was proved by Tseitin [1968] to have a superpolynomial lower bound on certain tautologies representing graph properties. The first superpolynomial lower bound for general resolution was achieved by Haken [1989], who in 1985 proved an exponential lower bound for the pigeonhole principle. Since then several other classes of tautologies have been shown to require superpolynomially long resolution proofs.
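The single inference rule of resolution is simple enough to state in a few lines of code. Here clauses are sets of integer literals, a hypothetical but standard encoding: a positive integer i stands for the variable x_i and -i for its negation.

```python
# The propositional resolution rule: from F ∨ x and G ∨ ¬x derive F ∨ G.

def resolve(c1, c2, var):
    """Resolve two clauses on variable var, which must occur positively
    in c1 and negatively in c2."""
    assert var in c1 and -var in c2
    return (c1 - {var}) | (c2 - {-var})

# {x1, x2} and {¬x2, x3} resolve on x2 to give {x1, x3}.
assert resolve({1, 2}, {-2, 3}, 2) == {1, 3}

# Resolving the unit clauses {x1} and {¬x1} yields the empty clause,
# i.e., a contradiction -- the goal of a resolution refutation.
assert resolve({1}, {-1}, 1) == set()
```

A resolution refutation of a clause set is then a sequence of such steps ending in the empty clause; the lower bounds discussed above concern the minimum length of such sequences.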


The mark of a good scientific field is its ability to adapt to new ideas and new technologies. Computational complexity reaches this ideal. As we have developed new ideas of probabilistic and parallel computation, the complexity community has not thrown out the previous research; rather, they have modified the existing models to fit these new ideas and have shown how to connect the power of probabilistic and parallel computation to our already rich theory. Most recently complexity theorists have begun to analyze the computational power of machines based on quantum mechanics. In 1982, the physicist Richard Feynman [1982] noted that current computer technology could not efficiently simulate quantum systems. He suggested the possibility that computers built on quantum mechanics might be able to perform this task. David Deutsch [1985] in 1985 developed a theoretical computation model based on quantum mechanics and suggested that such a model could efficiently compute problems not computable by a traditional computer. Two unexpected quantum algorithms have provided the central motivation for studying quantum computation: Shor's [1997] procedure for factoring integers in polynomial time on a quantum computer and Grover's [1996] technique for searching a database of n elements in O(√n) time. We know surprisingly little about the computational complexity of quantum computing. Bernstein and Vazirani [1997] give a formal definition of the class BQP of languages efficiently computable by quantum computers. They show the surprising robustness of BQP, which remains unscathed under variations of the model such as restricting to a small set of rational amplitudes, allowing quantum subroutines, and a single measurement at the end of the computation. Bernstein and Vazirani show that BQP is contained in PSPACE. Adleman, DeMarrais and Huang [1997] show that BQP is contained in the counting class PP. Bennett, Bernstein, Brassard and Vazirani [1997] give a relativized world



where NP is not contained in BQP. We do not know any nonrelativized consequences of NP ⊆ BQP or whether BQP lies in the polynomial-time hierarchy. What about quantum variations of NP and interactive proof systems? Fenner, Green, Homer and Pruim [1999] consider the class consisting of the languages L such that for some polynomial-time quantum Turing machine M, x is in L exactly when M(x) accepts with positive probability. They show the equivalence of this class to the counting class co-C=P. Watrous [1999] shows that every language in PSPACE has a bounded-round quantum interactive proof system. Later Watrous with Jain, Ji and Upadhyay [2011] would show that quantum interactive proofs accept the same languages as PSPACE. We have seen quite a bit of progress on quantum decision tree complexity. In this model we count the number of queries made to a black-box database of size n. Quantum queries can be made in superposition. Deutsch and Jozsa [1992] gave an early example of a simple function that can be solved with one query quantumly but requires Ω(n) queries deterministically or probabilistically with no error. Bernstein and Vazirani [1997] give the first example of a problem that can be solved with a polynomial number of queries quantumly but requires a superpolynomial number of queries probabilistically with bounded error. Simon [1997] gives another example with an exponential gap. Brassard and Høyer [1997] gave a zero-error quantum algorithm for Simon's problem. Shor's factoring algorithm [Shor, 1997] can be viewed as an extension of Simon's problem that finds the period in a periodic black-box function. All of these examples require a promise, i.e., restricting the allowable inputs to be tested.  Fortnow and Rogers [1999] and Beals, Buhrman, Cleve, Mosca and de Wolf [1998] show that a promise is necessary to get a superpolynomial separation.
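The simplest instance of these black-box separations is Deutsch's problem: one quantum query decides whether f : {0,1} → {0,1} is constant or balanced, while any deterministic algorithm needs two queries. The following pure-Python sketch simulates the two-qubit state vector directly; this is a toy simulation, not a rendering of any particular paper's formalism.

```python
import math

# Simulate Deutsch's one-query algorithm on the state over basis
# |x y> in the order 00, 01, 10, 11.

H = 1 / math.sqrt(2)

def hadamard_both(amp):
    # Apply H (x) H to the 4-dimensional state vector.
    a, b, c, d = amp
    return [0.5 * (a + b + c + d), 0.5 * (a - b + c - d),
            0.5 * (a + b - c - d), 0.5 * (a - b - c + d)]

def deutsch(f):
    amp = [0.0, 1.0, 0.0, 0.0]          # start in |0 1>
    amp = hadamard_both(amp)            # H on both qubits
    for x in (0, 1):                    # oracle U_f: |x y> -> |x, y XOR f(x)>
        if f(x) == 1:
            amp[2*x], amp[2*x + 1] = amp[2*x + 1], amp[2*x]
    # H on the first qubit, then measure it.
    out = [H * (amp[y] + amp[2 + y]) for y in (0, 1)] + \
          [H * (amp[y] - amp[2 + y]) for y in (0, 1)]
    prob_one = out[2] ** 2 + out[3] ** 2
    return "balanced" if prob_one > 0.5 else "constant"

assert deutsch(lambda x: 0) == "constant"
assert deutsch(lambda x: 1) == "constant"
assert deutsch(lambda x: x) == "balanced"
assert deutsch(lambda x: 1 - x) == "balanced"
```

Note that the oracle is queried exactly once, in superposition; the Deutsch–Jozsa algorithm cited above generalizes the same phase-kickback trick to n-bit inputs.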


Despite the plethora of exciting results in computational complexity over the last four decades of the 20th century, true complexity class separations have remained beyond our grasp. Tackling these problems, especially showing a separation of P and NP, is our greatest challenge for the future. How will someone prove that P and NP differ? As of this writing, we have no serious techniques that could help separate these classes. What kind of future ideas could lead us to answer this difficult question? Some possibilities:

• An unexpected connection to other areas of mathematics such as algebraic geometry or higher cohomology. Perhaps even an area of mathematics not yet developed. Perhaps someone will develop a whole new direction for mathematics in order to handle the P versus NP question.

• New techniques to prove lower bounds for circuits, branching programs and/or proof systems in models strong enough to give complexity class separations.


Lance Fortnow and Steven Homer

• A new characterization of P or NP that makes separation more tractable.
• A clever twist on old-fashioned diagonalization, still the only technique that has given any lower bounds on complexity classes.
Complexity theory will progress in areas beyond class separation. Quite a few interesting questions remain in many areas; even basic questions in quantum computational complexity remain open. Complexity theorists will continue to forge new ground and find new and exciting results in these directions. As with probabilistic, parallel and quantum complexity, new models of computation will be developed. Computational complexity theorists will be right on top of these developments, leading the way to understanding the inherent efficient computational power of these models. We have seen many books and popular news stories about the other “complexity”: complex systems that occur in many aspects of society and nature, such as financial markets, the internet, biological systems, the weather and debatably even physical systems. This theory suggests that such systems are governed by a very simple set of rules that, when combined, produce quite complex behavior. Computer programs exhibit very similar behavior. We will see computational complexity techniques used to help understand the efficiency of the complex behavior of these systems. Finally, computational complexity will continue to have the Big Surprise. No one can predict the next big surprise, but it will happen as it always does. Let us end this survey with a quote from Juris Hartmanis’ notebook (see [Hartmanis, 1981]), in his entry dated December 31, 1962: “This was a good year.” This was a good forty years, and complexity theory is only getting started.


There have been several articles on various aspects of the history of complexity theory, many of which we have used as source material for this article. We give a small sampling of pointers here:
• [Fortnow, 2013] The first author has a recent popular science book on the P versus NP problem with a chapter on the early history of P and NP.
• [Hartmanis, 1981] Juris Hartmanis reminisces on the beginnings of complexity theory.
• [Trakhtenbrot, 1984] Boris Trakhtenbrot describes the development of NP-completeness from the Russian perspective.
• [Sipser, 1992] Michael Sipser gives a historical account of the P versus NP question, including a copy and translation of Gödel’s historic letter to von Neumann.

Computational Complexity


• [Garey and Johnson, 1979] Michael Garey and David Johnson give a “terminological history” of NP-completeness and a very readable account of the basic theory of NP-completeness.
• The collection of papers edited by Hochbaum [1995] is a good overview of progress made in approximating solutions to NP-hard problems.
• Consult the book by Greenlaw, Hoover and Ruzzo [1995] to learn more about complexity theory within P and for many more P-complete problems.
• The Turing Award lectures of Cook [1983], Karp [1986], Hartmanis [1994] and Stearns [1994] give interesting insights into the early days of computational complexity.
• The textbook of Homer and Selman [2000] contains a careful development of the definitions and basic concepts of complexity theory, and proofs of many central facts in this field.
• The complexity columns of SIGACT News and the Bulletin of the EATCS have had a number of excellent surveys on many of the areas described in this article.
• The two collections Complexity Theory Retrospective [Selman, 1988] and Complexity Theory Retrospective II [Hemaspaandra and Selman, 1997] contain some excellent recent surveys of several of the topics mentioned here.
ACKNOWLEDGMENTS
Earlier versions of this survey have appeared in the Bulletin of the European Association for Theoretical Computer Science (volume 80, June 2003) and as an invited presentation at the 17th Annual Conference on Computational Complexity in 2002. The authors would like to thank their colleagues, far too numerous to mention, with whom we have had many wonderful discussions about complexity over the past few decades. Many of these discussions have shaped various aspects of this article.
BIBLIOGRAPHY
[Adleman et al., 1997] L. Adleman, J. DeMarrais, and M. Huang. Quantum computability. SIAM Journal on Computing, 26(5):1524–1540, 1997.
[Adleman and Huang, 1987] L. Adleman and M. Huang. Recognizing primes in random polynomial time.
In Proceedings of the 19th ACM Symposium on the Theory of Computing, pages 462–469. ACM, New York, 1987.
[Ajtai, 1983] M. Ajtai. Σ¹₁-formulae on finite structures. Annals of Pure and Applied Logic, 24:1–48, 1983.



[Aleliunas et al., 1979] R. Aleliunas, R. Karp, R. Lipton, L. Lovász, and C. Rackoff. Random walks, universal traversal sequences, and the complexity of maze problems. In Proceedings of the 20th IEEE Symposium on Foundations of Computer Science, pages 218–223. IEEE, New York, 1979.
[Agrawal et al., 2002] M. Agrawal, N. Kayal, and N. Saxena. PRIMES is in P. Unpublished manuscript, Indian Institute of Technology Kanpur, 2002.
[Arora et al., 1998] S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and the hardness of approximation problems. Journal of the ACM, 45(3):501–555, May 1998.
[Adleman and Manders, 1977] L. Adleman and K. Manders. Reducibility, randomness, and intractability. In Proceedings of the 9th ACM Symposium on the Theory of Computing, pages 151–163. ACM, New York, 1977.
[Arora, 1998] S. Arora. Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems. Journal of the ACM, 45(5):753–782, September 1998.
[Ambos-Spies, 1989] K. Ambos-Spies. On the relative complexity of hard problems for complexity classes without complete problems. Theoretical Computer Science, 64:43–61, 1989.
[Arora and Safra, 1998] S. Arora and S. Safra. Probabilistic checking of proofs: A new characterization of NP. Journal of the ACM, 45(1):70–122, January 1998.
[Babai, 1985] L. Babai. Trading group theory for randomness. In Proceedings of the 17th ACM Symposium on the Theory of Computing, pages 421–429. ACM, New York, 1985.
[Bennett et al., 1997] C. Bennett, E. Bernstein, G. Brassard, and U. Vazirani. Strengths and weaknesses of quantum computing. SIAM Journal on Computing, 26(5):1510–1523, 1997.
[Beals et al., 1998] R. Beals, H. Buhrman, R. Cleve, M. Mosca, and R. de Wolf. Quantum lower bounds by polynomials. In Proceedings of the 39th IEEE Symposium on Foundations of Computer Science, pages 352–361. IEEE, New York, 1998.
[Borodin et al., 1989] A. Borodin, S. A. Cook, P. W. Dymond, W. L. Ruzzo, and M. Tompa.
Two applications of inductive counting for complementation problems. SIAM Journal on Computing, 18:559–578, 1989.
[Babai et al., 1991] L. Babai, L. Fortnow, and C. Lund. Non-deterministic exponential time has two-prover interactive protocols. Computational Complexity, 1(1):3–40, 1991.
[Babai et al., 1991a] L. Babai, L. Fortnow, L. Levin, and M. Szegedy. Checking computations in polylogarithmic time. In Proceedings of the 23rd ACM Symposium on the Theory of Computing, pages 21–31. ACM, New York, 1991.
[Babai et al., 1993] L. Babai, L. Fortnow, N. Nisan, and A. Wigderson. BPP has subexponential simulations unless EXPTIME has publishable proofs. Computational Complexity, 3:307–318, 1993.
[Ben-Or et al., 1988] M. Ben-Or, S. Goldwasser, J. Kilian, and A. Wigderson. Multi-prover interactive proofs: How to remove intractability assumptions. In Proceedings of the 20th ACM Symposium on the Theory of Computing, pages 113–131. ACM, New York, 1988.
[Baker et al., 1975] T. Baker, J. Gill, and R. Solovay. Relativizations of the P =? NP question. SIAM Journal on Computing, 4(4):431–442, 1975.
[Berman and Hartmanis, 1977] L. Berman and J. Hartmanis. On isomorphisms and density of NP and other complete sets. SIAM Journal on Computing, 6:305–322, 1977.
[Brassard and Høyer, 1997] G. Brassard and P. Høyer. An exact quantum polynomial-time algorithm for Simon’s problem. In Proceedings of the 5th Israeli Symposium on Theory of Computing and Systems (ISTCS’97), pages 12–23. IEEE, New York, 1997.
[Boppana et al., 1987] R. Boppana, J. Håstad, and S. Zachos. Does co-NP have short interactive proofs? Information Processing Letters, 25(2):127–132, 1987.
[Blum, 1967] M. Blum. A machine-independent theory of the complexity of recursive functions. Journal of the ACM, 14(2):322–336, April 1967.
[Blum and Micali, 1984] M. Blum and S. Micali. How to generate cryptographically strong sequences of pseudo-random bits. SIAM Journal on Computing, 13:850–864, 1984.
[Babai and Moran, 1988] L. Babai and S. Moran.
Arthur-Merlin games: a randomized proof system, and a hierarchy of complexity classes. Journal of Computer and System Sciences, 36(2):254–276, 1988.
[Borodin, 1972] A. Borodin. Computational complexity and the existence of complexity gaps. Journal of the ACM, 19(1):158–174, January 1972.
[Beame and Pitassi, 1998] P. Beame and T. Pitassi. Propositional proof complexity: Past, present and future. Bulletin of the EATCS, 65:66–89, 1998.



[Beigel et al., 1995] R. Beigel, N. Reingold, and D. Spielman. PP is closed under intersection. Journal of Computer and System Sciences, 50(2):191–202, 1995.
[Buss, 1987] S. Buss. Polynomial size proofs of the propositional pigeonhole principle. Journal of Symbolic Logic, 52:916–927, 1987.
[Bernstein and Vazirani, 1997] E. Bernstein and U. Vazirani. Quantum complexity theory. SIAM Journal on Computing, 26(5):1411–1473, 1997.
[Cantor, 1874] G. Cantor. Ueber eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen. Crelle’s Journal, 77:258–262, 1874.
[Cai et al., 1990] J. Cai, A. Condon, and R. Lipton. On bounded round multi-prover interactive proof systems. In Proceedings of the 5th IEEE Structure in Complexity Theory Conference, pages 45–54. IEEE, New York, 1990.
[Cai et al., 1991] J. Cai, A. Condon, and R. Lipton. PSPACE is provable by two provers in one round. In Proceedings of the 6th IEEE Structure in Complexity Theory Conference, pages 110–115. IEEE, New York, 1991.
[Cai et al., 1992] J. Cai, A. Condon, and R. Lipton. On games of incomplete information. Theoretical Computer Science, 103(1):25–38, 1992.
[Chandra et al., 1981] A. Chandra, D. Kozen, and L. Stockmeyer. Alternation. Journal of the ACM, 28:114–133, 1981.
[Clay Math. Inst., 2000] Clay Mathematics Institute. Millennium prize problems, 2000.
[Cobham, 1964] A. Cobham. The intrinsic computational difficulty of functions. In Proceedings of the 1964 International Congress for Logic, Methodology, and Philosophy of Science, pages 24–30. North-Holland, Amsterdam, 1964.
[Cook, 1971] S. Cook. The complexity of theorem-proving procedures. In Proceedings of the 3rd ACM Symposium on the Theory of Computing, pages 151–158, 1971.
[Cook, 1973] S. Cook. A hierarchy for nondeterministic time complexity. Journal of Computer and System Sciences, 7(4):343–353, August 1973.
[Cook, 1983] S. Cook. An overview of computational complexity. Communications of the ACM, 26(6):400–408, June 1983.
[Cook and Reckhow, 1973] S. Cook and R. Reckhow.
Time bounded random access machines. Journal of Computer and System Sciences, 7(4):354–375, 1973.
[Deutsch, 1985] D. Deutsch. Quantum theory, the Church-Turing principle and the universal quantum computer. Proceedings of the Royal Society of London A, 400:97, 1985.
[Deutsch and Jozsa, 1992] D. Deutsch and R. Jozsa. Rapid solution of problems by quantum computation. Proceedings of the Royal Society of London A, 439:553, 1992.
[Edmonds, 1965] J. Edmonds. Maximum matchings and a polyhedron with 0,1-vertices. Journal of Research of the National Bureau of Standards (Section B), 69B:125–130, 1965.
[Edmonds, 1965a] J. Edmonds. Paths, trees and flowers. Canadian Journal of Mathematics, 17:449–467, 1965.
[Fagin, 1973] R. Fagin. Contributions to the model theory of finite structures. Ph.D. Thesis, U.C. Berkeley, 1973.
[Fagin, 1974] R. Fagin. Generalized first-order spectra and polynomial-time recognizable sets. In Complexity of Computation (ed. R. Karp), pages 27–41. SIAM-AMS Proc. 7, 1974.
[Feige, 1991] U. Feige. On the success probability of the two provers in one round proof systems. In Proceedings of the 6th IEEE Structure in Complexity Theory Conference, pages 116–123. IEEE, New York, 1991.
[Feldman, 1986] P. Feldman. The optimum prover lives in PSPACE. Manuscript, 1986.
[Feynman, 1982] R. Feynman. Simulating physics with computers. International Journal of Theoretical Physics, 21:467, 1982.
[Fenner et al., 1994] S. Fenner, L. Fortnow, and S. Kurtz. Gap-definable counting classes. Journal of Computer and System Sciences, 48(1):116–148, 1994.
[Feige et al., 1988] U. Feige, A. Fiat, and A. Shamir. Zero knowledge proofs of identity. Journal of Cryptology, 1(2):77–94, 1988.
[Fenner et al., 1999] S. Fenner, F. Green, S. Homer, and R. Pruim. Determining acceptance possibility for a quantum computation is hard for PH. Proceedings of the Royal Society of London, 455:3953–3966, 1999.
[Feige et al., 1996] U. Feige, S. Goldwasser, L. Lovász, S. Safra, and M. Szegedy.
Interactive proofs and the hardness of approximating cliques. Journal of the ACM, 43(2):268–292, March 1996.



[Fürer et al., 1989] M. Fürer, O. Goldreich, Y. Mansour, M. Sipser, and S. Zachos. On completeness and soundness in interactive proof systems. In S. Micali, editor, Randomness and Computation, volume 5 of Advances in Computing Research, pages 429–442. JAI Press, Greenwich, 1989.
[Feige and Lovász, 1992] U. Feige and L. Lovász. Two-prover one-round proof systems: Their power and their problems. In Proceedings of the 24th ACM Symposium on the Theory of Computing, pages 733–744. ACM, New York, 1992.
[Fortnow, 1997] L. Fortnow. Counting complexity. In L. Hemaspaandra and A. Selman, editors, Complexity Theory Retrospective II, pages 81–107. Springer, New York, 1997.
[Fortnow, 2013] L. Fortnow. The Golden Ticket: P, NP, and the Search for the Impossible. Princeton University Press, 2013.
[Fortnow and Rogers, 1999] L. Fortnow and J. Rogers. Complexity limitations on quantum computation. Journal of Computer and System Sciences, 59(2):240–252, 1999.
[Fortnow et al., 1994] L. Fortnow, J. Rompel, and M. Sipser. On the power of multi-prover interactive protocols. Theoretical Computer Science A, 134:545–557, 1994.
[Gill, 1977] J. Gill. Computational complexity of probabilistic complexity classes. SIAM Journal on Computing, 6:675–695, 1977.
[Garey and Johnson, 1979] M. Garey and D. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, San Francisco, 1979.
[Goldwasser and Kilian, 1999] S. Goldwasser and J. Kilian. Primality testing using elliptic curves. Journal of the ACM, 46(4):450–472, July 1999.
[Goldreich et al., 1993] O. Goldreich, H. Krawczyk, and M. Luby. On the existence of pseudorandom generators. SIAM Journal on Computing, 22(6):1163–1175, December 1993.
[Goldreich and Levin, 1989] O. Goldreich and L. Levin. A hard-core predicate for all one-way functions. In Proceedings of the 21st ACM Symposium on the Theory of Computing, pages 25–32. ACM, New York, 1989.
[Goldwasser et al., 1989] S. Goldwasser, S.
Micali, and C. Rackoff. The knowledge complexity of interactive proof-systems. SIAM Journal on Computing, 18(1):186–208, 1989.
[Goldreich et al., 1991] O. Goldreich, S. Micali, and A. Wigderson. Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proof systems. Journal of the ACM, 38(3):691–729, 1991.
[Grover, 1996] L. Grover. A fast quantum mechanical algorithm for database search. In Proceedings of the 28th ACM Symposium on the Theory of Computing, pages 212–219. ACM, New York, 1996.
[Goldwasser and Sipser, 1989] S. Goldwasser and M. Sipser. Private coins versus public coins in interactive proof systems. In S. Micali, editor, Randomness and Computation, volume 5 of Advances in Computing Research, pages 73–90. JAI Press, Greenwich, 1989.
[Hartmanis, 1981] J. Hartmanis. Observations about the development of theoretical computer science. Annals of the History of Computing, 3(1):42–51, 1981.
[Hartmanis, 1982] J. Hartmanis. A note on natural complete sets and Gödel numberings. Theoretical Computer Science, 17:75–89, 1982.
[Hartmanis, 1986] J. Hartmanis. Gödel, von Neumann and the P=?NP problem. In Current Trends in Theoretical Computer Science, pages 445–450. World Scientific Press, New York, 1986.
[Hartmanis, 1994] J. Hartmanis. Turing Award Lecture: On computational complexity and the nature of computer science. Communications of the ACM, 37(10):37–43, October 1994.
[Håstad, 1989] J. Håstad. Almost optimal lower bounds for small depth circuits. In S. Micali, editor, Randomness and Computation, volume 5 of Advances in Computing Research, pages 143–170. JAI Press, Greenwich, 1989.
[Håstad, 1997] J. Håstad. Some optimal inapproximability results. In Proceedings of the 29th ACM Symposium on the Theory of Computing, pages 1–10. ACM, New York, 1997.
[Hartmanis and Baker, 1975] J. Hartmanis and T. Baker. On simple Gödel numberings and translations. SIAM Journal on Computing, 4:1–11, 1975.
[Hartmanis and Berman, 1978] J. Hartmanis and L. Berman.
On polynomial time isomorphisms and some new complete sets. Journal of Computer and System Sciences, 16:418–422, 1978.
[Håstad et al., 1999] J. Håstad, R. Impagliazzo, L. Levin, and M. Luby. A pseudorandom generator from any one-way function. SIAM Journal on Computing, 28(4):1364–1396, August 1999.



[Hochbaum, 1995] D. Hochbaum. Approximation Algorithms for NP-Hard Problems. PWS Publishing Company, Boston, 1995.
[Hartmanis and Stearns, 1965] J. Hartmanis and R. Stearns. On the computational complexity of algorithms. Transactions of the American Mathematical Society, 117:285–306, 1965.
[Hennie and Stearns, 1966] F. Hennie and R. Stearns. Two-tape simulation of multitape Turing machines. Journal of the ACM, 13(4):533–546, October 1966.
[Hemaspaandra and Selman, 1997] L. Hemaspaandra and A. Selman. Complexity Theory Retrospective II. Springer, New York, 1997.
[Homer and Selman, 2000] S. Homer and A. Selman. Computability and Complexity Theory. Springer, 2000.
[Ibarra, 1972] O. Ibarra. A note concerning nondeterministic tape complexities. Journal of the ACM, 19(4):608–612, 1972.
[Immerman, 1982] N. Immerman. Relational queries computable in polynomial time. In Proceedings of the 14th ACM Symposium on the Theory of Computing, pages 147–152. ACM Press, 1982.
[Immerman, 1983] N. Immerman. Languages which capture complexity classes. In Proceedings of the 15th ACM Symposium on the Theory of Computing, pages 760–778. ACM Press, 1983.
[Immerman, 1988] N. Immerman. Nondeterministic space is closed under complementation. SIAM Journal on Computing, 17(5):935–938, 1988.
[Impagliazzo and Wigderson, 1997] R. Impagliazzo and A. Wigderson. P = BPP if E requires exponential circuits: Derandomizing the XOR lemma. In Proceedings of the 29th ACM Symposium on the Theory of Computing, pages 220–229. ACM, New York, 1997.
[Jain et al., 2011] R. Jain, Z. Ji, S. Upadhyay, and J. Watrous. QIP = PSPACE. Journal of the ACM, 58(6):30:1–30:27, December 2011.
[Jones, 1975] N. Jones. Space-bounded reducibility among combinatorial problems. Journal of Computer and System Sciences, 11:68–85, 1975.
[Jones and Selman, 1974] N. Jones and A. Selman. Turing machines and the spectra of first-order formulae. Journal of Symbolic Logic, 39:139–150, 1974.
[Karp, 1972] R. Karp. Reducibility among combinatorial problems.
In Complexity of Computer Computations, pages 85–104. Plenum Press, New York, 1972.
[Karp, 1986] R. Karp. Combinatorics, complexity and randomness. Communications of the ACM, 29(2):98–109, February 1986.
[Kurtz et al., 1987] S. Kurtz, S. Mahaney, and J. Royer. Progress on collapsing degrees. In Proceedings of the Second Annual Structure in Complexity Theory Conference, pages 126–131. Computer Society Press of the IEEE, 1987.
[Kurtz et al., 1989] S. Kurtz, S. Mahaney, and J. Royer. The isomorphism conjecture fails relative to a random oracle (extended abstract). In Proceedings of the 21st ACM Symposium on the Theory of Computing, pages 157–166, 1989.
[Kushilevitz and Nisan, 1996] E. Kushilevitz and N. Nisan. Communication Complexity. Cambridge University Press, Cambridge, 1996.
[Krajíček and Pudlák, 1989] J. Krajíček and P. Pudlák. Propositional proof systems, the consistency of first order theories and the complexity of computation. Journal of Symbolic Logic, 54(3):1063–1079, 1989.
[Lautemann, 1983] C. Lautemann. BPP and the polynomial hierarchy. Information Processing Letters, 17(4):215–217, 1983.
[Levin, 1973] L. Levin. Universal sorting problems. Problems of Information Transmission, 9:265–266, 1973. English translation of original in Problemy Peredachi Informatsii.
[Lund et al., 1992] C. Lund, L. Fortnow, H. Karloff, and N. Nisan. Algebraic methods for interactive proof systems. Journal of the ACM, 39(4):859–868, 1992.
[Lapidot and Shamir, 1991] D. Lapidot and A. Shamir. Fully parallelized multi-prover protocols for NEXP-time. In Proceedings of the 32nd IEEE Symposium on Foundations of Computer Science, pages 13–18. IEEE, New York, 1991.
[Lipton and Zalcstein, 1977] R. Lipton and E. Zalcstein. Word problems solvable in logspace. Journal of the ACM, 24:522–526, 1977.
[Mahaney, 1982] S. Mahaney. Sparse complete sets for NP: solution of a conjecture of Berman and Hartmanis. Journal of Computer and System Sciences, 25:130–143, 1982.
[McCreight and Meyer, 1969] E. McCreight and A. Meyer. Classes of computable functions defined by bounds on computation. In Proceedings of the First ACM Symposium on the Theory of Computing, pages 79–88. ACM, New York, 1969.



[Meyer and Stockmeyer, 1972] A. Meyer and L. Stockmeyer. The equivalence problem for regular expressions with squaring requires exponential space. In Proceedings of the 13th IEEE Symposium on Switching and Automata Theory, pages 125–129. Computer Society Press of the IEEE, 1972.
[Mahaney and Young, 1985] S. Mahaney and P. Young. Orderings of polynomial isomorphism types. Theoretical Computer Science, 39(2):207–224, August 1985.
[Myhill, 1955] J. Myhill. Creative sets. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 1:97–108, 1955.
[Myhill, 1960] J. Myhill. Linear bounded automata. Technical Note 60–165, Wright-Patterson Air Force Base, Wright Air Development Division, Ohio, 1960.
[Nisan, 1992] N. Nisan. Pseudorandom generators for space-bounded computation. Combinatorica, 12(4):449–461, 1992.
[Nisan and Wigderson, 1994] N. Nisan and A. Wigderson. Hardness vs. randomness. Journal of Computer and System Sciences, 49:149–167, 1994.
[Papadimitriou and Yannakakis, 1991] C. Papadimitriou and M. Yannakakis. Optimization, approximation, and complexity classes. Journal of Computer and System Sciences, 43:425–440, 1991.
[Rabin, 1963] M. Rabin. Real time computation. Israel Journal of Mathematics, 1:203–211, 1963.
[Razborov, 1985] A. Razborov. Lower bounds of monotone complexity of the logical permanent function. Mathematical Notes of the Academy of Sciences of the USSR, 37:485–493, 1985.
[Razborov, 1985a] A. Razborov. Lower bounds on the monotone complexity of some boolean functions. Doklady Akademii Nauk SSSR, 281(4):798–801, 1985. In Russian. English translation in [Razborov, 1985b].
[Razborov, 1985b] A. Razborov. Lower bounds on the monotone complexity of some boolean functions. Soviet Mathematics–Doklady, 31:485–493, 1985.
[Raz, 1998] R. Raz. A parallel repetition theorem. SIAM Journal on Computing, 27(3):763–803, June 1998.
[Reingold, 2008] O. Reingold. Undirected connectivity in log-space. Journal of the
ACM, 55(4):17:1–17:24, September 2008.
[Ruby and Fischer, 1965] S. Ruby and P. Fischer. Translational methods and computational complexity. In Proceedings of the Sixth Annual Symposium on Switching Circuit Theory and Logical Design, pages 173–178. IEEE, New York, 1965.
[Greenlaw et al., 1995] R. Greenlaw, H. Hoover, and W. Ruzzo. Limits to Parallel Computation: P-Completeness Theory. Oxford University Press, Oxford, 1995.
[Robinson, 1965] J. A. Robinson. A machine oriented logic based on resolution. Journal of the ACM, 12(1):23–41, 1965.
[Savitch, 1970] W. Savitch. Relationship between nondeterministic and deterministic tape classes. Journal of Computer and System Sciences, 4:177–192, 1970.
[Savitch, 1973] W. Savitch. Maze recognizing automata and nondeterministic tape complexity. Journal of Computer and System Sciences, 7:389–403, 1973.
[Schöning, 1990] U. Schöning. The power of counting. In A. Selman, editor, Complexity Theory Retrospective, pages 204–223. Springer, New York, 1990.
[Selman, 1988] A. Selman. Complexity Theory Retrospective. Springer, New York, 1988.
[Seiferas et al., 1978] J. Seiferas, M. Fischer, and A. Meyer. Separating nondeterministic time complexity classes. Journal of the ACM, 25(1):146–167, 1978.
[Shannon, 1949] C. E. Shannon. Communication in the presence of noise. Proceedings of the IRE, 37:10–21, 1949.
[Shamir, 1992] A. Shamir. IP = PSPACE. Journal of the ACM, 39(4):869–877, 1992.
[Shor, 1997] P. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Journal on Computing, 26(5):1484–1509, 1997.
[Simon, 1997] D. Simon. On the power of quantum computation. SIAM Journal on Computing, 26(5):1474–1483, 1997.
[Sipser, 1983] M. Sipser. A complexity theoretic approach to randomness. In Proceedings of the 15th ACM Symposium on the Theory of Computing, pages 330–335. ACM, New York, 1983.
[Sipser, 1992] M. Sipser. The history and status of the P versus NP question. In Proceedings of the 24th ACM Symposium on the Theory of Computing, pages 603–618.
ACM, New York, 1992.



[Smolensky, 1987] R. Smolensky. Algebraic methods in the theory of lower bounds for boolean circuit complexity. In Proceedings of the 19th ACM Symposium on the Theory of Computing, pages 77–82. ACM Press, 1987.
[Smullyan, 1961] R. Smullyan. Theory of Formal Systems, volume 47 of Annals of Mathematical Studies. Princeton University Press, 1961.
[Solovay and Strassen, 1977] R. Solovay and V. Strassen. A fast Monte-Carlo test for primality. SIAM Journal on Computing, 6:84–85, 1977. See also erratum 7:118, 1978.
[Stearns, 1994] R. Stearns. Turing Award Lecture: It’s time to reconsider time. Communications of the ACM, 37(11):95–99, November 1994.
[Stockmeyer, 1976] L. Stockmeyer. The polynomial-time hierarchy. Theoretical Computer Science, 3:1–22, 1976.
[Saks and Zhou, 1999] M. Saks and S. Zhou. BPHSPACE(S) ⊆ DSPACE(S^{3/2}). Journal of Computer and System Sciences, 58(2):376–403, April 1999.
[Szelepcsényi, 1988] R. Szelepcsényi. The method of forced enumeration for nondeterministic automata. Acta Informatica, 26:279–284, 1988.
[Toda, 1991] S. Toda. PP is as hard as the polynomial-time hierarchy. SIAM Journal on Computing, 20(5):865–877, 1991.
[Trakhtenbrot, 1964] B. Trakhtenbrot. Turing computations with logarithmic delay. Algebra i Logika, 3(4):33–48, 1964.
[Trakhtenbrot, 1984] B. Trakhtenbrot. A survey of Russian approaches to Perebor (brute-force search) algorithms. Annals of the History of Computing, 6(4):384–400, 1984.
[Tseitin, 1968] G. S. Tseitin. On the complexity of derivations in the propositional calculus. In Studies in Constructive Mathematics and Mathematical Logic, Part II. Consultants Bureau, New York–London, 1968.
[Turing, 1936] A. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42:230–265, 1936.
[Urquhart, 1987] A. Urquhart. Hard examples for resolution. Journal of the ACM, 34:209–219, 1987.
[Valiant, 1979] L. Valiant. The complexity of computing the permanent.
Theoretical Computer Science, 8:189–201, 1979.
[Vardi, 1982] M. Vardi. Complexity of relational query languages. In Proceedings of the 14th ACM Symposium on the Theory of Computing, pages 137–146. ACM Press, 1982.
[Watrous, 1999] J. Watrous. PSPACE has constant-round quantum interactive proof systems. In Proceedings of the 40th IEEE Symposium on Foundations of Computer Science, pages 112–119. IEEE, New York, 1999.
[Yamada, 1962] H. Yamada. Real-time computation and recursive functions not real-time computable. IEEE Transactions on Computers, 11:753–760, 1962.
[Yao, 1990] A. Yao. Coherent functions and program checkers. In Proceedings of the 22nd ACM Symposium on the Theory of Computing, pages 84–94. ACM, New York, 1990.

LOGIC PROGRAMMING

Robert Kowalski

Readers: Maarten van Emden and Luis Moniz Pereira


The driving force behind logic programming is the idea that a single formalism suffices for both logic and computation, and that logic subsumes computation. But logic, as this series of volumes proves, is a broad church, with many denominations and communities, coexisting in varying degrees of harmony. Computing is, similarly, made up of many competing approaches and divided into largely disjoint areas, such as programming, databases, and artificial intelligence. On the surface, it might seem that both logic and computing suffer from a similar lack of cohesion. But logic is in better shape, with well-understood relationships between different formalisms. For example, first-order logic extends propositional logic, higher-order logic extends first-order logic, and modal logic extends classical logic. In contrast, in Computing, there is hardly any relationship between, for example, Turing machines as a model of computation and relational algebra as a model of database queries. Logic programming aims to remedy this deficiency and to unify different areas of computing by exploiting the greater generality of logic. It does so by building upon and extending one of the simplest, yet most powerful logics imaginable, namely the logic of Horn clauses. In this paper, which extends a shorter history of logic programming (LP) in the 1970s [Kowalski, 2013], I present a personal view of the history of LP, focusing on logical, rather than on technological issues. I assume that the reader has some background in logic, but not necessarily in LP. As a consequence, this paper might also serve a secondary function, as a survey of some of the main developments in the logic of LP. Inevitably, a history of this restricted length has to omit a number of important topics. In this case, the topics omitted include meta LP, higher-order LP, concurrent LP, disjunctive LP and complexity.
Other histories and surveys that cover some of these topics and give other perspectives include [Apt and Bol, 1994; Brewka et al., 2011; Bry et al., 2007; Ceri et al., 1990; Cohen, 1988; Colmerauer and Roussel, 1996; Costantini, 2002; Dantsin et al., 1997; Eiter et al., 2009; Elcock, 1990; van Emden, 2006; Hewitt, 2009; Minker, 1996; Ramakrishnan and Ullman, 1993]. Perhaps more significantly and more regrettably, in omitting coverage of technological issues, I may be giving a misleading impression of their significance.

Handbook of the History of Logic. Volume 9: Computational Logic. Volume editor: Jörg Siekmann. Series editors: Dov M. Gabbay and John Woods. Copyright © 2014 Elsevier BV. All rights reserved.



Without Colmerauer’s practical insights [Colmerauer et al., 1973], Boyer and Moore’s [1972] structure sharing implementation of resolution [Robinson, 1965a], and Warren’s abstract machine and Prolog compiler [Warren, 1978, 1983; Warren et al., 1977], logic programming would have had far less impact in the field of Computing, and this history would not be worth writing.


The Horn clause basis of logic programming

Horn clauses are named after the logician Alfred Horn, who studied some of their mathematical properties. A Horn clause logic program is a set of sentences (or clauses) each of which can be written in the form:

A0 ← A1 ∧ . . . ∧ An where n ≥ 0.

Each Ai is an atomic formula of the form p(t1, ..., tm), where p is a predicate symbol and the ti are terms. Each term is either a constant symbol, a variable, or a composite term of the form f(t1, ..., tm), where f is a function symbol and the ti are terms. Every variable occurring in a clause is universally quantified, and its scope is the clause in which the variable occurs. The backward arrow ← is read as “if”, and ∧ as “and”. The atom A0 is called the conclusion (or head) of the clause, and the conjunction A1 ∧ ... ∧ An is the body of the clause. The atoms A1, ..., An in the body are called conditions. If n = 0, then the body is equivalent to true, and the clause A0 ← true is abbreviated to A0 and is called a fact. Otherwise, if n ≠ 0, the clause is called a rule. It is also useful to allow the head A0 of a clause to be false, in which case the clause is abbreviated to ← A1 ∧ ... ∧ An and is called a goal clause. Intuitively, a goal clause can be understood as denying that the goal A1 ∧ ... ∧ An has a solution, thereby issuing a challenge to refute the denial by finding a solution. Predicate symbols represent the relations that are defined (or computed) by a program, and functions are treated as a special case of relations, as in relational databases. Thus the mother function, exemplified by mother(john) = mary, is represented by a fact such as mother(john, mary).
The definition of maternal grandmother, which in functional notation is written as an equation:

maternal-grandmother(X) = mother(mother(X))

is written as a rule in relational notation:

maternal-grandmother(X, Y) ← mother(X, Z) ∧ mother(Z, Y)1

Although all variables in a rule are universally quantified, it is often more natural to read variables in the conditions that are not in the conclusion as existentially quantified, with the body of the rule as their scope. For example, the following two sentences are equivalent:

1 In this paper, I use the Prolog notation for clauses: predicate symbols, function symbols and constants start with a lower-case letter, and variables start with an upper-case letter. Numbers can be treated as constants.

Logic Programming


∀XY Z [maternal-grandmother(X, Y) ← mother(X, Z) ∧ mother(Z, Y)]
∀XY [maternal-grandmother(X, Y) ← ∃Z [mother(X, Z) ∧ mother(Z, Y)]]

Function symbols are not used for function definitions, but are used to construct composite data structures. For example, the composite term cons(s, t) can be used to construct a list with first element s followed by the list t. Thus the term cons(john, cons(mary, nil)) represents the list [john, mary], where nil represents the empty list.

Terms can contain variables, and logic programs can compute input-output relations containing variables. However, for the semantics, it is convenient to regard terms that do not contain variables, called ground terms, as the basic data structures of logic programs. Similarly, a clause or other expression is said to be ground if it does not contain any variables.

Logic programs that do not contain function symbols are also called Datalog programs. Datalog is more expressive than relational databases, but is still decidable. Horn clause programs with function symbols have the expressive power of Turing machines, and consequently are undecidable. Horn clauses are sufficient

for many applications in artificial intelligence. For example, and-or trees can be represented by ground Horn clauses.2 See Figure 1.

Figure 1. An and-or tree and corresponding propositional Horn clause program.

2 And-or trees were employed in many early artificial intelligence programs, including the geometry theorem proving machine of Gelernter [1963]. Search strategies for and-or trees were investigated by Nils Nilsson [1968], and in a theorem-proving context by Kowalski [1970].
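The correspondence between and-or trees and ground Horn clauses can be sketched with a naive bottom-up evaluator; the particular clauses below are illustrative, not those of Figure 1:

```python
# A propositional Horn clause program as (head, [body atoms]) pairs.
# Facts have an empty body. "Or" branches are alternative clauses for
# the same head; "and" branches are the conditions of one clause.
program = [
    ("a", ["b", "c"]),   # a ← b ∧ c   (an "and" node)
    ("b", ["d"]),        # b ← d       (two clauses for b give
    ("b", ["e"]),        #              alternative "or" branches)
    ("c", []),           # c           (facts are solved leaves)
    ("d", []),
]

def derivable(program):
    """Bottom-up: repeatedly add heads whose bodies are already derived."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for head, body in program:
            if head not in derived and all(b in derived for b in body):
                derived.add(head)
                changed = True
    return derived

print(sorted(derivable(program)))  # expect: ['a', 'b', 'c', 'd']
```

Note that e is not derivable, so the clause b ← e contributes nothing; b is solved through its other clause, exactly as an or-node needs only one solvable branch.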



Robert Kowalski

Logic programs with negation

Although Horn clauses are the underlying basis of LP, and are theoretically sufficient for all programming and database applications, they are not adequate for artificial intelligence, most importantly because they fail to capture non-monotonic reasoning. For non-monotonic reasoning, it is necessary to extend Horn clauses to clauses of the form:

A0 ← A1 ∧ ... ∧ An ∧ not B1 ∧ ... ∧ not Bm where n ≥ 0 and m ≥ 0.

Each Ai and Bi is an atomic formula, and the prefix not denotes negation. Atomic formulas and their negations are also called literals. Here the Ai are positive literals, and the not Bi are negative literals. Sets of clauses in this form are called normal logic programs, or just logic programs for short.

It can be argued that normal logic programs, with an appropriate semantics for negation, are sufficient to solve the frame problem in artificial intelligence. Here is a candidate solution, using an LP representation of the situation calculus [McCarthy and Hayes, 1969]:

holds(F, do(A, S)) ← poss(A, S) ∧ initiates(A, F, S)
holds(F, do(A, S)) ← poss(A, S) ∧ holds(F, S) ∧ not terminates(A, F, S)

Here holds(F, S) expresses that a fact F (also called a fluent) holds in a state (or situation) S; poss(A, S) that the action A is possible in state S; initiates(A, F, S) that the action A performed in state S initiates F in the resulting state do(A, S); and terminates(A, F, S) that A terminates F. Together, the two clauses assert that a fact holds in a state either if it is initiated by an action, or if it held in the previous state and was not terminated by an action.

This representation of the situation calculus also illustrates meta-logic programming, because the predicates holds, poss, initiates and terminates can be understood as meta-predicates, where the variable F ranges over names of sentences. Alternatively, they can be interpreted as second-order predicates, where F ranges over first-order predicates.
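For ground actions and fluents, the two clauses can be transcribed directly into Python, with negation as failure becoming ordinary Boolean negation. The toy light-switching domain, and the omission of the poss condition, are my own simplifications for the sketch, not from the text:

```python
# A direct transcription of the two situation-calculus clauses for a
# toy domain: "switch_on" initiates the fluent "light", "switch_off"
# terminates it. The poss(A, S) condition is omitted for brevity.
def initiates(action, fluent):
    return action == "switch_on" and fluent == "light"

def terminates(action, fluent):
    return action == "switch_off" and fluent == "light"

def holds(fluent, situation):
    """situation is the list of actions performed from the initial state."""
    if not situation:
        return False  # nothing holds initially in this toy domain
    *earlier, action = situation
    # holds(F, do(A, S)) ← initiates(A, F, S)
    # holds(F, do(A, S)) ← holds(F, S) ∧ not terminates(A, F, S)
    return initiates(action, fluent) or (
        holds(fluent, earlier) and not terminates(action, fluent))

print(holds("light", ["switch_on", "wait"]))        # expect: True  (persists)
print(holds("light", ["switch_on", "switch_off"]))  # expect: False (terminated)
```

The second clause is what solves the frame problem here: the light persists through the irrelevant "wait" action without any explicit frame axiom saying so.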


Logic programming issues

In this article, I will discuss the development of LP and its extensions, their semantics, and their proof theories. We will see that lurking beneath the deceptively simple syntax of logic programs are profound issues concerning semantics, proof theory and knowledge representation.

For example, what does it mean for a logic program P to solve a goal G? Does it mean that P logically implies G, in the sense that G is true in all models of P? Does it mean that some larger theory than P, which includes assumptions implicit in P, logically implies G? Or does it mean that G is true in some natural, intended model of P?



And how should G be solved? Top-down, by using the clauses in P as goal-reduction procedures, to reduce goals that match the conclusions of clauses to sub-goals that correspond to their conditions? Or bottom-up, to generate new conclusions from conditions, until the generated conclusions include all the information needed to solve the goal G in one step?

We will see that these two issues — what it means to solve a goal G, and whether to solve G top-down or bottom-up — are related. In particular, bottom-up reasoning can be interpreted as generating a model in which G is true.

These issues are hard enough for Horn clause programs. But they are much harder for logic programs with negative conditions. In some semantics, a negative condition not B has the same meaning as classical negation ¬B, and solving a negative goal not B is interpreted as reasoning with ¬B. But in most proof theories, not B is interpreted as some form of negation as failure: not B holds if all attempts to show B fail.

In addition to these purely logical problems concerning semantics and proof theory, LP has been plagued by controversies concerning declarative versus procedural representations. Declarative representations are naturally supported by bottom-up model generation. But both declarative and procedural representations can be supported by top-down execution. For many advocates of purely declarative representations, such exploitation of procedural representations undermines the logic programming ideal.

These issues of semantics, proof theory and knowledge representation have been a recurring theme in the history of LP, and they continue to be relevant today. They are reflected, in particular, by the growing divergence between Prolog-style systems that employ top-down execution on one side, and answer set programming and Datalog systems that employ bottom-up model generation on the other.


The discovery of the top-down method for executing logic programs occurred in the summer of 1972, as the result of my collaboration with Alain Colmerauer in Marseille. Colmerauer was developing natural language question-answering systems, while I was developing resolution theorem-provers and trying to reconcile them with procedural representations of knowledge in artificial intelligence.



Resolution

Resolution was developed by John Alan Robinson [1965a] as a technique for automated theorem-proving, with a view to mechanising mathematical proofs. It consists of a single inference rule for proving that a set of assumptions P logically implies a theorem G. The resolution method is a refutation procedure, which does so by reductio ad absurdum, converting P and the



negation ¬G of the theorem into a set of clauses and deriving the empty clause, which represents falsity.

Clauses are disjunctions of literals. In Robinson’s original definition, clauses were represented as sets. In the propositional case:

given two clauses {A} ∪ F and {¬A} ∪ G
the resolvent is the clause F ∪ G.

The two clauses {A} ∪ F and {¬A} ∪ G are said to be the parents of the resolvent, and the literals A and ¬A are said to be the literals resolved upon. If F and G are both empty, then the resolvent of {A} and {¬A} is the empty clause, representing a contradiction or falsity.

In the first-order case, in which all variables are universally quantified with scope the clause in which they occur, it is necessary to unify sets of literals to make them complementary:

given two clauses K ∪ F and L ∪ G
the resolvent is the clause Fθ ∪ Gθ

where θ is a most general substitution of terms for variables that unifies the atoms in K and L, in the sense that Kθ = {A} and Lθ = {¬A}. It is an important property of resolution, which greatly contributes to its efficiency, that if there is any substitution that unifies K and L, then there is a most general such unifying substitution, which is unique up to renaming of variables.

The set representation of clauses (and sets of clauses) builds in the inference rules of commutativity, associativity and idempotency of disjunction (and conjunction). The resolution rule itself generalises modus ponens, modus tollens, disjunctive syllogism, and many other separate inference rules of classical logic. The use of the most general unifier, moreover, subsumes in one operation the infinitely many inferences of the form “derive P(t) from ∀X P(X)” that are possible with the inference rule of universal instantiation. Other inference rules are eliminated (or used) in the conversion of sentences of standard first-order logic into clausal form.

Set notation for clauses is not user-friendly. It is more common to write clauses {A1, ..., An, ¬B1, ..., ¬Bm} as disjunctions A1 ∨ ... ∨ An ∨ ¬B1 ∨ ... ∨ ¬Bm. However, sets of clauses, representing conjunctions of clauses, are commonly written simply as sets. Clauses can also be represented as conditionals of the form:

A1 ∨ ... ∨ An ← B1 ∧ ... ∧ Bm

where ← is material implication → (or ⊃) written backwards.

The discovery of resolution revolutionised research on automated theorem proving, as many researchers turned their hands towards developing refinements of the resolution rule. It also inspired other applications of logic in artificial intelligence, most notably to the development of question-answering systems, which represent



data or knowledge in logical form, and query that knowledge using logical inference. One of the most successful and most influential such systems was QA3, developed by Cordell Green [1969].

In QA3, given a knowledge base and a goal to be solved, both expressed in clausal form, an extra literal answer(X) is added to the clause or clauses representing the negation of the goal, where the variables X represent some value of interest in the goal. For example, to find the capital of the usa, the goal ∃X capital(X, usa) is negated and the answer literal is added, turning it into the clause ¬capital(X, usa) ∨ answer(X). The goal is solved by deriving a clause consisting only of answer literals. The substitutions of terms for variables used in the derivation determine values for the variables X. In this example, if the knowledge base contains the clause capital(washington, usa), the answer answer(washington) is obtained in one resolution step.

Green also showed that resolution has many other problem-solving applications, including robot plan formation. Moreover, he showed how resolution could be used to automatically generate a program written in a conventional programming language, such as LISP, from a specification of its input-output relation written in the clausal form of logic. As he put it:

“In general, our approach to using a theorem prover to solve programming problems in LISP requires that we give the theorem prover two sets of initial axioms:
1. Axioms defining the functions and constructs of the subset of LISP to be used
2. Axioms defining an input-output relation such as the relation R(x, y), which is to be true if and only if x is any input of the appropriate form for some LISP program and y is the corresponding output to be produced by such a program.”

Green also seems to have anticipated the possibility of dispensing with (1) and using only the representation (2) of the relation R(x, y), writing:

“The theorem prover may be considered an ‘interpreter’ for a high-level assertional or declarative language — logic. As is the case with most high-level programming languages the user may be somewhat distant from the efficiency of ‘logic’ programs unless he knows something about the strategies of the system.”

“I believe that in some problem solving applications the ‘high-level language’ of logic along with a theorem-proving program can be a quick programming method for testing ideas.”

However, he does not seem to have pursued these ideas much further. Moreover, there was an additional problem, namely that the resolution strategies of the time behaved unintuitively and were very redundant and inefficient. For example, given a clause of the form L1 ∨ ... ∨ Ln, and n clauses of the form ¬Li ∨ Ci, resolution would derive the same clause C1 ∨ ... ∨ Cn redundantly in n! different ways.
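The most general unifier at the heart of resolution can be sketched as follows. The term encoding (tuples for compound terms and atoms, upper-case strings for variables, following the Prolog convention of footnote 1) is an illustrative choice, and, like many Prolog implementations, the sketch omits the occurs check:

```python
# First-order terms: strings starting with an upper-case letter are
# variables; tuples (functor, arg1, ..., argm) are compound terms;
# other strings are constants.
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings to a representative term."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(s, t, subst=None):
    """Return a most general unifier extending subst, or None on failure."""
    if subst is None:
        subst = {}
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if is_var(s):
        return {**subst, s: t}          # bind variable s to t
    if is_var(t):
        return {**subst, t: s}          # bind variable t to s
    if isinstance(s, tuple) and isinstance(t, tuple) and len(s) == len(t):
        for a, b in zip(s, t):          # same functor, unify argument-wise
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None

# Unify capital(X, usa) with capital(washington, usa), as in the QA3 example:
print(unify(("capital", "X", "usa"), ("capital", "washington", "usa")))
# expect: {'X': 'washington'}
```

The returned substitution is most general in the sense of the text: any other unifier of the two atoms is an instance of it.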




Procedural representations of knowledge

Green’s ideas fired the enthusiasm of researchers working in contact with him at Stanford and Edinburgh, but they also attracted fire from MIT, where researchers were advocating procedural representations of knowledge. Terry Winograd’s PhD thesis gave the most compelling and most influential voice to this opposition. Winograd [1971] argued (page 232):

“Our heads don’t contain neat sets of logical axioms from which we can deduce everything through a ‘proof procedure’. Instead we have a large set of heuristics and procedures for solving problems at different levels of generality.”

He quoted (pages 232-3) Green’s own admission of some of the difficulties:

“It might be possible to add strategy information to a predicate calculus theorem prover, but with current systems such as QA3, ‘To change strategies in the current version, the user must know about set-of-support and other program parameters such as level bound and term depth. To radically change the strategy, the user presently has to know the LISP language and must be able to modify certain strategy sections of the program.’ (p. 236).”3

Winograd’s procedural alternative to purely “uniform” logical representations was based on Carl Hewitt’s language Planner. Winograd [1971] describes Planner in the following terms (page 238):

“The language is designed so that if we want, we can write theorems in a form which is almost identical to the predicate calculus, so we have the benefits of a uniform system. On the other hand, we have the capability to add as much subject-dependent knowledge as we want, telling theorems about other theorems and proof procedures. The system has an automatic goal-tree backup system, so that even when we are specifying a particular order in which to do things, we may not know how the system will go about doing them. It will be able to follow our suggestions and try many different theorems to establish a goal, backing up and trying another automatically if one of them leads to a failure (see section 3.3).”

In contrast (page 215):

“Most ‘theorem-proving’ systems do not have any way to include this additional intelligence. Instead, they are limited to a kind of ‘working in the dark’. A uniform proof procedure gropes its way through the collection of theorems and assertions, according to some general procedure which does not depend on the subject matter. It tries to combine facts which might be relevant, working from the bottom-up.”

3 We will see later that the set-of-support strategy was critical, because it allowed QA3 to incorporate a form of backward reasoning from the theorem to be proved.



Winograd’s PhD thesis presented a natural language understanding system that was a great advance at the time, and its advocacy of Planner was enormously influential. Even Stanford and Edinburgh were affected by these ideas. Pat Hayes and I had been working in Edinburgh on a book [Hayes and Kowalski, 1971] about resolution theorem-proving, when he returned from a second visit to Stanford (after the first visit, during which he and John McCarthy wrote the famous situation calculus paper [McCarthy and Hayes, 1969]). He was greatly impressed by Planner, and wanted to rewrite the book to take Planner into account. I was not enthusiastic, and we spent many hours discussing and arguing about the relationship between Planner and resolution theorem proving. Eventually, we abandoned the book, unable to agree.


Resolution, part two

At the time that QA3 and Planner were being developed, resolution was not well understood. In particular, it was not understood that a proof procedure, in general, is composed of an inference system that defines the space of all proofs, and a search strategy that explores the proof space looking for a solution of a goal. We can represent this combination as an equation:

proof procedure = proof space + search strategy

A typical proof space has the structure of an and-or tree turned upside down. Typical search strategies include breadth-first search, depth-first search, and some form of best-first or heuristic search. In the case of the resolution systems of the time, the proof spaces were horrendously redundant, and most search strategies used breadth-first search.

Attempts to improve efficiency focussed on restricting (or refining) the resolution rule without losing completeness, to reduce the size of the proof space. The best known refinements were hyper-resolution and set of support.

Hyper-resolution [Robinson, 1965b] is a generalised form of bottom-up (or forward) reasoning. In the propositional case, given an input clause:

D0 ∨ ¬B1 ∨ ... ∨ ¬Bm

and m input or derived positive clauses:

B1 ∨ D1, ..., Bm ∨ Dm

where each Bi is an atom and each Di is a disjunction of atoms, hyper-resolution derives the positive clause:

D0 ∨ D1 ∨ ... ∨ Dm

Bottom-up reasoning with Horn clauses is the special case in which D0 is a single atom and each other Di is an empty disjunction, equivalent to false. In this special



case, rewriting disjunctions as conditionals, hyper-resolution derives B0 from the input clause:

B0 ← B1 ∧ ... ∧ Bm

and the input or derived facts B1, ..., Bm.

The problem with hyper-resolution, as Winograd observed, is that it derives new clauses from the input clauses, without paying attention to the problem to be solved. It is “uniform” in the sense that, given a theorem to be proved, it uniformly performs the same inferences bottom-up from the axioms, ignoring the theorem until it generates it, as if by accident.

In contrast with hyper-resolution, the set of support strategy [Wos et al., 1965] focuses on a subset of clauses that are relevant to the problem at hand:

A subset S′ of an input set S of clauses is a set of support for S iff S − S′ is satisfiable.

The set of support strategy restricts resolution so that at least one parent clause belongs to the set of support or is derived by the set of support restriction. If the satisfiable set of clauses S − S′ represents a set of axioms, and the set of support S′ represents the negation of a theorem, then the set of support strategy implements an approximation of top-down reasoning by reductio ad absurdum. It also ensures that any input clauses (or axioms) used in a derivation are relevant to the theorem, in the spirit of relevance logics [Anderson and Belnap, 1962].4

4 Another important case is the one in which S − S′ represents a database (or knowledge base) together with a set of integrity constraints that are satisfied by the database, and S′ represents a set of updates to be added to the database. The set of support restriction then implements a form of bottom-up reasoning from the updates, to check that the updated database continues to satisfy the integrity constraints. Moreover, it builds in the assumption that the database satisfied the integrity constraints prior to the updates, and therefore if there is an inconsistency, the update must be “relevant” to the inconsistency.

The set of support strategy only approximates top-down reasoning. A better approximation is obtained by linear resolution, which was discovered independently by Loveland [1970], Luckham [1970] and Zamov and Sharonov [1969]. Linear resolution addresses the problem of relevance by focusing on a top clause C0, which could represent an initial goal:

Let S be a set of clauses. A linear derivation of a clause Cn from a top clause C0 ∈ S is a sequence of clauses C0, ..., Cn such that every clause Ci+1 is a resolvent of Ci with some input clause in S or with some ancestor clause Cj where j < i.

(It was later realised that ancestor resolution is unnecessary if S is a set of Horn clauses.)

The top clause C0 in a linear derivation can be restricted to one belonging to a set of support. The resulting space of all linear derivations from a given top clause C0 has the structure of a proof tree whose nodes are clauses and whose branches are linear derivations. Using linear resolution to extend the derivation of a clause Ci to the derivation of a clause Ci+1 generates the derived node Ci+1 as a child of



the node Ci. The same node Ci can have different children Ci+1, corresponding to different linear resolutions.

In retrospect, the relationship with Planner is obvious. If the top clause C0 represents an initial goal, then the tree of all linear derivations is a goal tree, and generating the tree top-down is a form of goal reduction. The tree can be explored using different search strategies. Depth-first search, in particular, can be informed by Planner-like strategies that both specify “a particular order in which to do things”, but also “back up” automatically in the case of failure.

The relationship with Planner was not obvious at the time. Even as recently as 2005, Paul Thagard in Mind: Introduction to Cognitive Science, compares logic unfavourably with production systems, stating on page 45:

“In logic-based systems, the fundamental operation of thinking is logical deduction, but from the perspective of rule-based systems, the fundamental operation of thinking is search.”5

But it wasn’t just this lack of foresight that stood in the way of understanding the relationship with Planner: there were still the n! redundant ways of resolving upon n literals in the clauses Ci. This redundancy was recognised and eliminated, without loss of completeness, by Loveland [1972], Reiter [1971], and Kowalski and Kuehner [1971], independently at about the same time. The obvious solution was simply to resolve upon the literals in the clauses Ci in a single order. This order can be determined statically, by ordering the literals in the input clauses and imposing the same order on the resolvents. Or it can be determined dynamically, as in selected linear (SL) resolution [Kowalski and Kuehner, 1971], by selecting a single literal to resolve upon in a clause Ci when the clause is chosen for resolution. Both methods eliminate redundancy, but dynamic selection can lead to smaller search spaces.6

Ironically, both Loveland [1972] and Kowalski and Kuehner [1971] also noted that linear resolution with an ordering restriction is equivalent to Loveland’s [1968] earlier model elimination proof procedure. The original model elimination procedure was presented so differently that it took years even for its author to recognise the equivalence.

The SL resolution paper also pointed out that the set of all SL derivations forms a search space, and described a heuristic search strategy for finding simplest proofs. In the conclusions, with implicit reference to Planner, it claimed:

5 This claim makes more sense if Thagard, like Winograd before him, associates logic exclusively with forward reasoning. As Sherlock Holmes explained to Dr. Watson, in A Study in Scarlet: “In solving a problem of this sort, the grand thing is to be able to reason backward. That is a very useful accomplishment, and a very easy one, but people do not practise it much. In the everyday affairs of life it is more useful to reason forward, and so the other comes to be neglected. There are fifty who can reason synthetically for one who can reason analytically.”

6 Dynamic selection is useful, for example, to solve goals with different input-output arguments. For example, given the clause p(X, Y) ← q(X, Z) ∧ r(Z, Y) and the goal p(a, Y), the subgoal q(a, Z) should be selected before r(Z, Y). But given the goal p(X, b), the subgoal r(Z, b) should be selected before q(X, Z).



“Moreover, the amenability of SL-resolution to the application of heuristic methods suggests that, on these grounds alone, it is at least competitive with theorem-proving procedures designed solely from heuristic considerations.”



The development of various forms of linear resolution with set of support and ordering restrictions brought resolution systems closer to Planner-like theorem-provers. But these resolution systems did not yet have an explicit procedural interpretation.


The representation of grammars in logical form

Among the various confusions that prevented a better understanding of the relationship between logical and procedural representations was the fact that Winograd’s thesis, which so advanced the Planner cause, employed a different procedural language, Programmar, for natural language grammars. To add to the confusion, Winograd’s natural language understanding system was implemented in a combination of Programmar, LISP and micro-Planner (a simplified subset of Planner, developed by Charniak, Sussman and Winograd [Charniak et al., 1971]). So it wasn’t obvious whether Planner (or micro-Planner) was supposed to be a general-purpose programming language, or a special-purpose language for proving theorems, for writing plans, or for some other purpose.

In the theorem-proving group in Edinburgh, where I was working at the time, much of the debate surrounding Planner focused on whether “uniform” resolution proof procedures are adequate for proving theorems, or whether they need to be augmented with Planner-like, domain-specific control information. In particular, I was puzzled by the relationship between Planner and Programmar, and began to investigate whether grammars could be written in a logical form.

This was auspicious, because in the summer of 1971 Alain Colmerauer invited me for a short visit to Marseille. Colmerauer knew everything there was to know about formal grammars and their application to programming language compilers. During 1967-1970, at the University of Montreal, he developed Q-systems [1969] as a rule-based formalism for processing natural language. Q-systems were later used on a daily basis, from 1982 to 2001, to translate English weather forecasts into French for Environment Canada. Since 1970, he had been in Marseille, building up a team working on natural language question-answering, investigating SL-resolution for the question-answering component. 
I arrived in Marseille, anxious to get Colmerauer’s feedback on my preliminary ideas about representing grammars in logical form. My representation used a function symbol to concatenate words into strings of words, and axioms to express



that concatenation is associative. It was obvious that reasoning with such associativity axioms was inefficient. Colmerauer immediately saw how to avoid the axioms of associativity, in a representation that later came to be known as metamorphosis grammars [Colmerauer, 1975] (or definite clause grammars [Pereira and Warren, 1980]). We saw that different kinds of resolution applied to the resulting grammars give rise to different kinds of parsers. For example, forward reasoning with hyper-resolution performs bottom-up parsing, while backward reasoning with SL-resolution performs top-down parsing.7
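The key idea of the representation is that each grammar symbol relates an input word list S0 to a remainder list S, so that concatenation, and hence its associativity axioms, never appears. Top-down parsing with this encoding can be sketched with higher-order functions; the toy grammar and the combinator names (terminal, seq, alt) are illustrative assumptions, not from the text:

```python
# Each nonterminal nt denotes a relation nt(S0, S) between a word list
# S0 and a remainder S. Here a nonterminal is a function from S0 to the
# list of all possible remainders S.
def terminal(word):
    def parse(s):
        return [s[1:]] if s and s[0] == word else []
    return parse

def seq(p, q):
    # nt(S0, S) ← p(S0, S1) ∧ q(S1, S): thread the remainder through
    return lambda s: [s2 for s1 in p(s) for s2 in q(s1)]

def alt(p, q):
    # two clauses for the same nonterminal give alternative parses
    return lambda s: p(s) + q(s)

# sentence ← np vp;  np ← "john" | "mary";  vp ← "runs"
np = alt(terminal("john"), terminal("mary"))
vp = terminal("runs")
sentence = seq(np, vp)

# A word list is a sentence if some remainder is the empty list.
print([] in sentence(["john", "runs"]))   # expect: True
print([] in sentence(["runs", "john"]))   # expect: False
```

Running the functions top-down mirrors backward reasoning with the grammar clauses; a bottom-up chart parser over the same clauses would mirror hyper-resolution.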


Horn clauses and SLD-resolution

It was during my second visit to Marseille, in April and May of 1972, that the idea of using SL-resolution to execute Horn clause programs emerged. By the end of the summer, Colmerauer’s group had developed the first version of Prolog, and used it to implement a natural language question-answering system [Colmerauer et al., 1973]. I reported an abstract of my own findings at the MFCS conference in Poland in August 1972 [Kowalski, 1972].8

The first Prolog system was an implementation of SL-resolution for the full clausal form of first-order logic, including ancestor resolution. But the idea that Horn clauses were an interesting case was already in the air. Donald Kuehner [1969], in particular, had already been working on bi-directional strategies for Horn clauses. However, the first explicit reference to the procedural interpretation of Horn clauses appeared in [Kowalski, 1974]. The abstract begins:

“The interpretation of predicate logic as a programming language is based upon the interpretation of implications: B if A1 and ... and An as procedure declarations, where B is the procedure name and A1 and ... and An is the set of procedure calls constituting the procedure body.”

The theorem-prover described in the paper is a variant of SL-resolution, to which Maarten van Emden later attached the name SLD-resolution, standing for “selected linear resolution with definite clauses”:

A definite clause is a Horn clause of the form B ← B1 ∧ ... ∧ Bn.
A goal clause is a Horn clause of the form ← A1 ∧ ... ∧ An.

Given a goal clause ← A1 ∧ ... ∧ Ai−1 ∧ Ai ∧ Ai+1 ∧ ... ∧ An with selected atom Ai, and a definite clause B ← B1 ∧ ... ∧ Bm, where θ is a most general substitution that unifies Ai and B, the SLD-resolvent is the goal clause:

← (A1 ∧ ... ∧ Ai−1 ∧ B1 ∧ ... ∧ Bm ∧ Ai+1 ∧ ... ∧ An)θ

7 However, Colmerauer [1991] remembers coming up with the alternative representation of grammars, not during my visit in 1971, but after my visit in 1972.

8 In the abstract, I used a predicate val(f(X), Y) instead of a predicate f(X, Y), using Philippe Roussel’s idea of val as “formal equality”. Roussel was Colmerauer’s PhD student and the main implementer of the first Prolog system.



Given a set of definite clauses S and an initial goal clause C0, an SLD-derivation of a goal clause Cn is a sequence of goal clauses C0, ..., Cn such that every Ci+1 is the SLD-resolvent of Ci with some input clause in S. An SLD-refutation is an SLD-derivation of the empty clause.

SLD-resolution is more flexible than SL-resolution restricted to Horn clauses.9 In SL-resolution the atoms Ai must be selected last-in-first-out, but in SLD-resolution there is no restriction on their selection. Both refinements of linear resolution avoid the redundancy of unrestricted linear resolution, and both are complete, in the sense that if a set of Horn clauses is unsatisfiable, then there exists both an SL-resolution refutation and an SLD-resolution refutation in their respective search spaces. In both cases, different selection strategies give rise to different, complete search spaces. But the more flexible selection strategy of SLD-resolution means that search spaces can be smaller, and therefore more efficient to search.

In SLD-resolution, goal clauses have a dual interpretation. In the strictly logical interpretation, the symbol ← in a goal clause ← A1 ∧ ... ∧ An is equivalent to classical negation; the empty clause is equivalent to falsity; and a refutation indicates that the top clause is inconsistent with the initial set of clauses S. However, in a problem-solving context, it is natural to think of the symbol ← in a goal clause ← A1 ∧ ... ∧ An as a question mark ? or command !, and the conjunction A1 ∧ ... ∧ An as a set of subgoals, whose variables are all existentially quantified. The empty clause then represents an empty set of subgoals, and a “refutation” indicates that the top clause has been solved. The solution is represented by the substitutions of terms for variables in the top clause, generated by the most general unifiers used in the refutation — similar to QA3, but without the answer literals. 
As in the case of linear resolution more generally, the space of all SLD-derivations with a given top clause has the structure of a goal tree, which can be explored using different search strategies. From a logical point of view, it is desirable that the search strategy be complete, so that the proof procedure is guaranteed to find a solution if there is one in the search space. Complete search strategies include breadth-first search and various kinds of best-first and heuristic search. Depth-first search is incomplete in the general case, but it takes up much less space than the alternatives. Moreover, it is complete if the search space is finite, or if there is only one infinite branch that is explored after all of the others. Notice that there are two different, but related notions of completeness: one for search spaces, and the other for search strategies. A search space is complete if it contains a solution whenever the semantics dictates that there is a solution; and a search strategy is complete if it finds a solution whenever there is one in the
9 If SL-resolution is applied to Horn clauses, with a goal clause as top clause, then ancestor resolution is not possible, because all clauses in the same SL-derivation are then goal clauses, which cannot be resolved with one another.

Logic Programming


search space. For a proof procedure to be complete, both its search space and its search strategy need to be complete. The different options for selecting atoms to resolve upon in SLD-resolution and for searching the space of SLD-derivations were left open in [Kowalski, 1974], but were pinned down in the Marseille Prolog interpreter. In Prolog, subgoals are selected last-in-first-out in the order in which the subgoals are written, and branches of the search space are explored depth-first in the order in which the clauses are written. By choosing the order in which subgoals and clauses are written, a Prolog programmer can exercise considerable control over the efficiency of a program.


Logic + Control

In those days, it was widely believed that logic alone is inadequate for problem-solving, and that some way of controlling the theorem-prover is needed for efficiency. Planner combined logic and control in a procedural representation that made it difficult to identify the logical component. Logic programs with SLD-resolution also combine logic and control, but make it possible to read the same program both logically and procedurally. I later expressed this as Algorithm = Logic + Control (A = L + C) [Kowalski, 1979a], influenced by Pat Hayes’ [1973] Computation = Controlled Deduction. The most direct implication of the equation is that, given a fixed logical representation L, different algorithms can be obtained by applying different control strategies, i.e. A1 = L + C1 and A2 = L + C2. Pat Hayes [1973], in particular, argued that logic and control should be expressed in separate languages, with the logic component L providing a pure, declarative specification of the problem, and the control component C supplying the problem-solving strategies needed for an efficient algorithm A. Moreover, he argued against the idea, expressed by A1 = L1 + C and A2 = L2 + C, of using a fixed control strategy C, as in Prolog, and formulating the logic Li of the problem to obtain a desired algorithm Ai. This idea of combining logic and control in separate object and meta-level languages has been a recurring theme in the theorem-proving and AI literature. It was a major influence, for example, on the development of PRESS, which solved equations by expressing the rules of algebra in an object language, and the rules for controlling the search for solutions in a meta-language. According to its authors, Alan Bundy and Bob Welham [1981]: “PRESS consists of a collection of predicate calculus clauses which together constitute a Prolog program.
As well as the procedural meaning attached to these clauses, which defines the behaviour of the PRESS program, they also have a declarative meaning - that is, they can be regarded as axioms in a logical theory.” In retrospect, PRESS was an early example of a now common use of Prolog to write meta-interpreters.



But most applications do not need such an elaborate combination of logic and control. For example, the meta-level control program in PRESS does not need a meta-meta-level control program. In fact, for some applications, even the modest control available to the Prolog programmer is unnecessary. For these applications, it suffices for the programmer to specify only the logic of the problem, and to leave it to Prolog to solve the problem without any help. But often, leaving it to Prolog alone can result, not only in unacceptable inefficiency, but even in non-terminating failure to find a solution. Here is a simple example, written in Prolog notation, where :- stands for ← and every clause ends in a full stop:

likes(bob, X) :- likes(X, bob).
likes(bob, logic).
:- likes(bob, X).

Prolog fails to find the solution X = logic, because it explores the infinite branch generated by repeatedly using the first clause, without getting a chance to explore the branch generated by the second clause. If the order of the two clauses is reversed, Prolog finds the solution. If only one solution is desired, then it terminates. But if all solutions are desired, then it encounters the infinite branch, and goes into the same infinite loop. Perhaps the easiest way to avoid such infinite loops in ordinary Prolog is to write a meta-interpreter, as in PRESS.10 Problems and inefficiencies with the Prolog control strategy led to numerous proposals for LP languages incorporating enhanced control features. Some of them, such as Colmerauer’s [1982] Prolog II, which allowed insufficiently instantiated subgoals to be suspended, were developed as extensions of Prolog.
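The behaviour of the likes example under Prolog's control strategy can be simulated with a miniature SLD-interpreter. This is a sketch, not the real Prolog machinery: atoms are flat tuples, variables are capitalised strings, and a depth bound stands in for non-termination (hitting it means the computation would run forever down that branch).

```python
import itertools

# Prolog-style control: subgoals last-in-first-out, clauses tried in
# the order written, branches explored depth-first.

fresh = itertools.count()

class Loop(Exception):
    """The simulated computation would run forever down this branch."""

def is_var(t):
    return t[0].isupper()

def walk(t, s):
    """Dereference a term through the substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Unify two flat atoms; return an extended substitution or None."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    s = dict(s)
    for x, y in zip(a[1:], b[1:]):
        x, y = walk(x, s), walk(y, s)
        if x != y:
            if is_var(x):
                s[x] = y
            elif is_var(y):
                s[y] = x
            else:
                return None
    return s

def rename(clause):
    """Rename the clause's variables apart with a fresh suffix."""
    n = str(next(fresh))
    r = lambda atom: tuple(t + n if is_var(t) else t for t in atom)
    head, body = clause
    return r(head), [r(a) for a in body]

def solve(goals, program, s, depth):
    """Depth-first SLD search; yields solution substitutions."""
    if not goals:
        yield s
        return
    if depth == 0:
        raise Loop()
    for clause in program:
        head, body = rename(clause)
        s2 = unify(goals[0], head, s)
        if s2 is not None:
            yield from solve(body + goals[1:], program, s2, depth - 1)

c1 = (("likes", "bob", "X"), [("likes", "X", "bob")])   # likes(bob, X) :- likes(X, bob).
c2 = (("likes", "bob", "logic"), [])                    # likes(bob, logic).
query = [("likes", "bob", "Q")]                         # :- likes(bob, Q).

try:
    next(solve(query, [c1, c2], {}, 50))
    outcome = "solution found"
except Loop:
    outcome = "infinite loop"
print(outcome)                                   # infinite loop

answer = walk("Q", next(solve(query, [c2, c1], {}, 50)))
print(answer)                                    # logic
```

With the clauses in their original order the interpreter disappears down the infinite branch; with the order reversed, the first solution Q = logic is found immediately, exactly as described above.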
Other proposals that departed more dramatically from ordinary Prolog included the use of coroutining in IC-Prolog [Clark et al., 1972], selective backtracking [Bruynooghe and Pereira, 1984], and meta-level control for logic programs [Gallaire and Lasserre, 1982; Pereira, 1984]. IC-Prolog, in particular, led to the development by Clark and Gregory [1983, 1986] of the concurrent logic programming language Parlog, which led in turn to numerous variants of concurrent LP languages, one of which, KL1, developed by Kazunori Ueda [1986], was adopted as the basis for the systems software of the Fifth Generation Computer Systems (FGCS) Project in Japan. The FGCS Project was a ten-year project beginning in 1982, sponsored by Japan’s Ministry of International Trade and Industry and involving all the major Japanese computer manufacturers. Its main objective was to develop a new generation of computers employing massive parallelism and oriented towards artificial
10 In other cases, much simpler solutions are often possible. For example, to avoid infinite loops with the program path(X, X) and path(X, Y) ← link(X, Z) ∧ path(Z, Y), it suffices to add an extra argument to the path predicate to record the list of nodes visited so far, and to add an extra condition to the second clause to check that the node Z in link(X, Z) is not in this path. For some advocates of declarative programming this is considered cheating. For others, it illustrates a practical application of A = L1 + C1 = L2 + C2.
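The loop-avoidance technique of footnote 10 can be transcribed into a functional sketch, with the extra list argument of the path predicate becoming a visited parameter. The graph below is a hypothetical example containing a cycle.

```python
# Footnote 10's technique: record the nodes visited so far, and refuse
# to follow a link back into the recorded path.

links = {("a", "b"), ("b", "c"), ("c", "a"), ("b", "d")}

def path(x, y, visited=()):
    """path(X, Y): there is a path from x to y along links.
    The check `z not in visited` is the extra condition on the
    node Z in link(X, Z) described in the footnote."""
    if x == y:
        return True
    for (u, z) in links:
        if u == x and z not in visited:
            if path(z, y, visited + (x,)):
                return True
    return False

print(path("a", "d"))  # True, despite the cycle a -> b -> c -> a
print(path("d", "a"))  # False, and the search still terminates
```

Without the visited parameter, the query path("a", "d") could cycle through a, b, c forever; with it, every recursive call lengthens the recorded path, so the search is bounded by the number of nodes.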



intelligence applications. From the start of the project, logic programming was identified as the preferred software technology. The FGCS project did not achieve its objectives, and all three of its main areas of research — parallel hardware, logic programming software, and AI applications — suffered a world-wide decline. These days, however, there is growing evidence that the FGCS project was ahead of its time. In the case of logic programming, in particular, SLD-resolution extended with tabling [Tamaki and Sato, 1986; Sagonas et al., 1994; Chen and Warren, 1996; Tekle and Liu, 2011] avoids many infinite loops, like the one in the example above. Moreover, there also exist alternative techniques for executing logic programs that do not rely upon the procedural interpretation, including the model generation methods of Answer Set Programming (ASP) and the bottom-up execution strategies of Datalog. ASP and Datalog have greatly advanced the ideal of purely declarative representations, relegating procedural representations to the domain of imperative languages and other formalisms of dubious character. However, not everyone is convinced that purely declarative knowledge representation is adequate either for practical computing or for modelling human reasoning. Thagard [2005], for example, claims that the following, useful procedure cannot easily be expressed in logical terms (page 45):

If you want to go home and you have the bus fare, then you can catch a bus.

On the contrary, the sentence can be expressed literally in the logical form:

can(you, catch-bus) ← want(you, go-home) ∧ have(you, bus-fare)

But this rendering requires the use of modal operators or modal predicates for want and can. More importantly, it misses the real logic of the procedure:

go(you, home) ← have(you, bus-fare) ∧ catch(you, bus).

Top-down reasoning applied to this logic generates the procedure, without sacrificing either the procedure or the declarative belief that justifies it.


The earliest influences on the development of logic programming had come primarily from automated theorem-proving and artificial intelligence. But researchers in the School of AI in Edinburgh also had strong interests in the theory of computation, and there was a lot of excitement about Dana Scott’s [1970] recent fixed point semantics for programming languages. Maarten van Emden suggested that we investigate the application of Scott’s ideas to Horn clause programs and that we compare the fixed point semantics with the logical semantics.




What is the meaning of a program?

But first we needed to establish a common ground for the comparison. If we identify the data structures of a logic program P with the set of all ground terms constructible from the vocabulary of P, also called the Herbrand universe of P, then we can view the “meaning” (or denotation) of P as the set of all ground atoms A that can be derived from P,11 which is expressed by:

P ⊢ A.

Here ⊢ can represent any derivability relation. Viewed in programming terms, this is analogous to the operational semantics of a programming language. But viewed in logical terms, this is a proof-theoretic definition, which is not a semantics at all. In logical terms, it is more natural to understand the semantics of P as given by the set of all ground atoms A that are logically implied by P, written:

P ⊨ A.

The operational and model-theoretic semantics are equivalent for any sound and complete notion of derivation – the most important kinds being top-down and bottom-up. Top-down derivations include model-elimination, SL-resolution and SLD-resolution. Model-elimination and SL-resolution are sound and complete for arbitrary clauses. So they are sound and complete for Horn clauses in particular. Moreover, ancestor resolution is impossible for Horn clauses. So model-elimination and SL-resolution without ancestor resolution are sound and complete for Horn clause programs. The selection rule in both SL-resolution and SLD-resolution constructs a linear representation of an and-tree proof. In SL-resolution the linear representation is obtained by traversing the and-tree depth-first. In SLD-resolution the linear representation can be obtained by traversing the and-tree in any order.12 The completeness of SLD-resolution was first proved by Robert Hill [1974].
Moreover, as we soon discovered, hyper-resolution is equivalent to the fixed point semantics.


Fixed point semantics

In Dana Scott’s [1970] fixed point semantics, the denotation of a recursive function is given by its input-output relation. The denotation is constructed by approximation, starting with the empty relation, repeatedly plugging the current approximation of the denotation into the definition of the function, transforming the approximation into a better one, until the complete denotation is obtained in the limit, as the least fixed point.

Applying the same approach to a Horn clause program P, the fixed point semantics uses a similar transformation TP, called the immediate consequence operator, to map a set I of ground atoms representing an approximation of the input-output relations of P into a more complete approximation TP(I):

TP(I) = {A0 | A0 ← A1 ∧ . . . ∧ An ∈ ground(P) and {A1, . . . , An} ⊆ I}.

Here ground(P) is the set of all ground instances of the clauses in P over the Herbrand universe of P. The application of TP to I is equivalent to applying one step of hyper-resolution to the clauses in ground(P) ∪ I. Not only does every Horn clause program P have a fixed point I such that TP(I) = I, but it has a least fixed point, lfp(TP), which is the denotation of P according to the fixed point semantics. The least fixed point is also the smallest set of ground atoms I closed under TP, i.e. the smallest set I such that TP(I) ⊆ I. This alternative characterisation provides a link with the minimal model semantics, as we will see below. The least fixed point can be constructed, as in Scott’s semantics, by starting with the empty set {} and repeatedly applying TP:

If TP0 = {} and TPi+1 = TP(TPi), then lfp(TP) = ∪0≤i TPi.

The result of the construction is equivalent to the set of all ground atoms that can be derived by applying finitely many steps of hyper-resolution to the clauses in ground(P). The equality lfp(TP) = ∪0≤i TPi is usually proved in fixed point theory by appealing to the Tarski-Knaster theorem. However, in [van Emden and Kowalski, 1976], we showed that the equivalence follows from the completeness of hyper-resolution and the relationship between least fixed points and minimal models. Here is a sketch of the argument:

A ∈ lfp(TP) iff A ∈ min(P)    i.e. least fixed points and minimal models coincide.
A ∈ min(P) iff P ⊨ A          i.e. truth in the minimal model and in all models coincide.
P ⊨ A iff A ∈ ∪0≤i TPi        i.e. hyper-resolution is complete.

11 Notice that this excludes programs which represent perpetual processes. Moreover, it ignores the fact that, in practice, logic programs can compute input-output relations containing variables. This is sometimes referred to as the “power of the logical variable”.
12 Note that and-or trees suggest other strategies for executing logic programs, for example by decomposing goals into subgoals top-down, searching for solutions of subgoals in parallel, then collecting and combining the solutions bottom-up. This is like the MapReduce programming model used in Google [Dean and Ghemawat, 2008].
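For the ground case, the operator TP and the iteration to its least fixed point are short enough to run directly. The following Python sketch uses a made-up five-clause ground program (the edge/path names are hypothetical, not from the text); atoms are plain strings.

```python
# Immediate consequence operator for a ground Horn clause program.
# A clause is a pair (head, body), with body a list of atoms;
# facts have empty bodies.

def T(program, I):
    """TP(I) = {A0 | A0 <- A1 /\ ... /\ An in ground(P) and {A1,...,An} subset of I}."""
    return {head for (head, body) in program if set(body) <= I}

def lfp(program):
    """Iterate TP starting from the empty set; for a finite ground
    program the increasing sequence reaches the least fixed point."""
    I = set()
    while True:
        J = T(program, I)
        if J == I:
            return I
        I = J

# A hypothetical ground program: edges and paths between three nodes.
program = [
    ("edge(a,b)", []),
    ("edge(b,c)", []),
    ("path(a,b)", ["edge(a,b)"]),
    ("path(b,c)", ["edge(b,c)"]),
    ("path(a,c)", ["path(a,b)", "path(b,c)"]),
]

M = lfp(program)
print(sorted(M))
# ['edge(a,b)', 'edge(b,c)', 'path(a,b)', 'path(a,c)', 'path(b,c)']

# The least fixed point is a fixed point, hence closed under TP,
# mirroring the characterisation TP(I) <= I of Herbrand models.
assert T(program, M) == M
```

Each pass of T corresponds to one step of hyper-resolution over the ground clauses, so the loop computes exactly the union ∪ TPi of the construction above.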


Minimal model semantics

The minimal model semantics was inspired by the fixed point semantics, but it was based on the notion of Herbrand interpretation. The key idea of Herbrand



interpretations is to identify an interpretation of a set of sentences with the set of all ground atomic sentences that are true in the interpretation. In a Herbrand interpretation, the domain of individuals is the set of ground terms in the Herbrand universe of the language. A Herbrand interpretation is any subset of the Herbrand base, which is the set of all ground atoms of the language. The most important property of Herbrand interpretations is that, in first-order logic, a set of sentences has a model if and only if it has a Herbrand model. This property is a form of the Skolem-Löwenheim-Herbrand theorem.13 Thus the model-theoretic denotation of a Horn clause program:

M(P) = {A | A is a ground atom and P ⊨ A}

is actually a Herbrand interpretation of P in its own right. Moreover, it is easy to show that M(P) is also a Herbrand model of P. In fact, it is the smallest Herbrand model min(P) of P. Therefore:

A ∈ min(P) iff P ⊨ A.

It is also easy to show that the Herbrand models of P coincide with the Herbrand interpretations that are closed under the operator TP, i.e.:

I is a Herbrand model of P iff TP(I) ⊆ I.

This is because the immediate consequence operator mimics, not only hyper-resolution, but also the definition of truth for Horn clauses: A set of Horn clauses P is true in a Herbrand interpretation I if and only if, for every ground instance A0 ← A1 ∧ . . . ∧ An of a clause in P, A0 is true in I if A1, . . . , An are true in I. It follows that the least fixed point and the minimal model are identical:

lfp(TP) = min(P).



The logicians Andréka and Németi visited Edinburgh in 1975, and wrote a report, published in [Andréka and Németi, 1978], proving the Turing completeness of Horn clause logic. Sten-Åke Tärnlund [1977] obtained a similar result independently. It was a great shock, therefore, to learn that Raymond Smullyan [1956] had already published an equivalent result. Here is the complete abstract:
13 The property can be proved in two steps: First, convert S into clausal form by using “Skolem” functions to eliminate existential quantifiers. Although the resulting set S′ of clauses and S are not equivalent, S has a model iff S′ has a model. A set of clauses S′ has a model iff S′ has a Herbrand model M, constructed using the Herbrand universe of S′. Therefore S has a model if and only if it has a Herbrand model M. (Contrary claims in the literature that S may have a model, but no Herbrand model, are based on the assumption that the Herbrand interpretations of S are constructed using the Herbrand universe of S.)



A new approach to recursive enumerability is considered based on the notion of “minimal models”. A formula of the lower functional calculus of the form F1 · F2 · · · Fn−1 · ⊃ ·Fn (or F1 alone, if n = 1) in which each Fi is atomic, and Fn contains no predicate constants, is termed regular. Let A be a finite set of regular formulae; Σ a collection of sets and relations, on some universe U ; I an interpretation of the predicate constants (occurring in A) as elements of Σ. The ordered triple L viz. (A, U, I) is a recursive logic over Σ. A model of L is an interpretation of the predicate variables Pi in which each formula of A is valid. Let Pi∗ be the intersection of all attributes assignable to Pi in some model; these Pi∗ are called definable in L. If each Pi is interpreted as Pi∗ , it can be proved that there is a model — this is the minimal model. Sets definable in some L over Σ are termed recursively definable from Σ. It is proved: (1) the recursively enumerable sets are precisely those which are recursively definable from the successor relation and the unit set {0}; (2) Post’s canonical sets in an alphabet a1 · · · an , are those recursively definable from the concatenation relation and the unit sets {a1 } · · · {an }. Smullyan seems not to have published the details of his proofs. But he investigated the relationship between derivability and computability in his book on the Theory of Formal Systems [Smullyan, 1961]. These formal systems are variants of the canonical systems of Post, with strong similarities to Horn clause programs.


Logic and databases

The question-answering systems of the 1960s and 1970s represented information in logical form, and used theorem-provers to answer questions represented in logical form. It was the application of SL-resolution to such deductive question-answering that led to Colmerauer’s work on Prolog. In the meanwhile, Ted Codd [1970] published his relational model, which represented data as relations in logical form, but used the “non-deductive” algebraic operations of selection, projection, Cartesian product, set union and set difference, to specify database queries. However, he also showed [Codd, 1972] that the relational algebra is equivalent to a more declarative relational calculus, in which relations are defined in first-order logic. I first learned about relational databases in 1974 at a course on the foundations of computer science at the Mathematics Centre in Amsterdam. I was giving a short course of lectures on logic for problem solving, using a set of notes, which I later expanded into my 1979 book [Kowalski, 1979b]. Erich Neuhold was giving a course about formal properties of databases, with a focus on the relational model. It was immediately obvious that the relational model and logic programming had much in common. I organised a five-day workshop at Imperial College London in May 1976, using the term “logic programming” to describe the topic of the workshop. A full day



was devoted to presentations about logic and databases. Hervé Gallaire and Jean-Marie Nicolas presented the work they were doing in Toulouse, and Keith Clark talked about his work on negation as failure. Jack Minker visited Gallaire and Nicolas in 1976, and together they organised the first workshop on logic and databases in Toulouse in 1977. The proceedings of the workshop, published in 1978, included Clark’s results on negation as failure, and Reiter’s paper on closed world databases.



The practical value of extending Horn clause programs to normal logic programs with negative conditions was recognized from the earliest days of logic programming, as was the obvious way to reason with them — by negation as failure (abbreviated as NAF): to prove not p, show that all attempts to prove p fail. Intuitively, NAF is justified by the assumption that the program contains a complete definition of its predicates. The assumption is very useful in practice, but was neglected in formal logic. The problem was to give this proof-theoretic notion a logical semantics. Ray Reiter [1978] investigated NAF in the context of a first-order database D, interpreting it as the closed world assumption (CWA) that the negation not p of a ground atom p holds in D if there is no proof of p from D. He showed that the CWA can lead to inconsistencies in the general case — for example, given the database D = {p ∨ q}, it implies not p and not q; but for Horn databases no such inconsistencies can arise. However, Keith Clark was the first to investigate NAF in the context of logic programs with negative conditions.


The Clark completion

Clark’s solution was to interpret logic programs as shorthand for definitions in if-and-only-if form, as illustrated for the propositional program in figure 2. In the non-ground case, the logic program needs to be augmented with an equality theory, which mimics the unification algorithm, and which essentially specifies that ground terms are equal if and only if they are syntactically identical. An example with a fragment of the necessary equality theory is given in figure 3. Together with the equality theory, the if-and-only-if form of a logic program P is called the completion of P, written comp(P). It is also sometimes called the predicate completion or the Clark completion. As figure 3 illustrates, negation as failure correctly simulates reasoning with the completion in classical logic. Although NAF is sound with respect to the completion semantics, it is not



Figure 2. The logic program of figure 1, and its completion.

Figure 3. A proof of not likes(logic, logic) using negation as failure and backward reasoning, compared with a proof of ¬likes(logic, logic) in classical logic, presented upside down. Notice that the use of classical negation turns the disjunction of alternatives into a logical conjunction.



complete. For example, if P is the program:

p ← q
p ← ¬q
q ← q

then comp(P) implies p. But given the goal ← p, NAF goes into an infinite loop trying, but failing, to show q. The completion semantics does not recognise such infinite failure, because proofs in classical logic are finite. For this reason, the completion semantics is also called the semantics of negation as finite failure. In contrast with the completion semantics, the CWA formalises negation as potentially infinite failure, inferring ¬q from q ← q. Similarly, the minimal model semantics of Horn clauses concludes that ¬q is true in the minimal model of q ← q. Clark did not investigate the relationship between the completion semantics and the various alternative semantics of Horn clauses. Probably the first such investigation was by Apt and van Emden [1982], who showed, among other things, that if P is a Horn clause program then:

I is a Herbrand model of comp(P) iff TP(I) = I.

Compare this with the property that I is a Herbrand model of P iff TP(I) ⊆ I.
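The gap between the completion semantics and negation as finite failure can be observed by running the program above. This is a sketch: a depth bound stands in for the non-terminating attempt to show q, and the bottom-up check only handles the positive clause q ← q.

```python
# The program p <- q, p <- not q, q <- q, run top-down with
# negation as failure. Hitting the depth bound models the
# infinite loop on q.

class NonTerminating(Exception):
    pass

rules = {
    "p": [["q"], ["not q"]],   # p <- q  and  p <- not q
    "q": [["q"]],              # q <- q
}

def show(atom, depth):
    """Try to prove atom top-down, with negation as (finite) failure."""
    if depth == 0:
        raise NonTerminating(atom)
    if atom.startswith("not "):
        return not show(atom[4:], depth - 1)      # negation as failure
    return any(all(show(b, depth - 1) for b in body)
               for body in rules.get(atom, []))

try:
    show("p", 100)
    naf_outcome = "p proved"
except NonTerminating:
    naf_outcome = "loops on q"
print(naf_outcome)   # loops on q, even though comp(P) implies p

# The CWA, by contrast, treats q <- q as potentially infinite failure:
# q has no bottom-up derivation, so not q holds in the minimal model.
q_rules = [("q", ["q"])]
derivable = set()
changed = True
while changed:
    changed = False
    for head, body in q_rules:
        if head not in derivable and all(b in derivable for b in body):
            derivable.add(head)
            changed = True
print("q" in derivable)   # False: the CWA sanctions not q
```

The top-down proof never terminates, while the bottom-up computation immediately reaches a fixed point without q, illustrating finite versus potentially infinite failure.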


The analogy with arithmetic

Clark’s 1978 paper was not the first to propose the completion semantics. [Clark and Tärnlund, 1977] proposed using the completion together with induction schemas on the structure of terms to prove program properties, by analogy with the use of induction in first-order Peano arithmetic. Consider the Horn clause definition of append(X, Y, Z), which holds when the list Z is the concatenation of the list X followed by the list Y:

append(nil, X, X)
append(cons(U, X), Y, cons(U, Z)) ← append(X, Y, Z)

This is analogous to the definition of plus(X, Y, Z), which holds when X + Y = Z:

plus(0, X, X)
plus(s(X), Y, s(Z)) ← plus(X, Y, Z)

Here the successor function s(X) represents X + 1, as in Peano arithmetic. These definitions alone are adequate for computing their denotations. More generally, they are adequate for solving any goal clause (which is an existentially quantified conjunction of atoms). However, to prove program properties expressed in the full syntax of first-order logic, the definitions need to be augmented with their completions and induction axioms. For example, the completion and induction over the natural numbers are both needed to show that the plus relation defined above is functional:

∀X Y U V [plus(X, Y, U) ∧ plus(X, Y, V) → U = V]
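The contrast between computing with the definition and proving properties of it can be illustrated by running plus over a finite set of numerals and checking functionality only on those instances. This is a sketch; the bound of 5 is arbitrary, and the finite check is of course no substitute for the completion-plus-induction proof.

```python
# The Horn clause definition of plus, run over numerals 0, s(0),
# s(s(0)), ... up to a small bound. Numerals are nested tuples.

def s(x):
    return ("s", x)

ZERO = "0"

def numerals(n):
    """The first n numerals 0, s(0), s(s(0)), ..."""
    t, out = ZERO, []
    for _ in range(n):
        out.append(t)
        t = s(t)
    return out

def plus_facts(bound):
    """All plus(X, Y, Z) facts derivable with X, Y among the first
    `bound` numerals, built by the two clauses:
    plus(0, X, X) and plus(s(X), Y, s(Z)) <- plus(X, Y, Z)."""
    facts = set()
    for x in numerals(bound):
        for y in numerals(bound):
            z, t = y, x
            while t != ZERO:      # peel successors off the first argument,
                t = t[1]          # applying the second clause each time
                z = s(z)
            facts.add((x, y, z))
    return facts

facts = plus_facts(5)

# Empirical check of functionality on the generated instances:
# no pair X, Y is related to two different sums.
sums = {}
for (x, y, z) in facts:
    assert sums.setdefault((x, y), z) == z

print(len(facts))  # 25 ground instances, one sum per pair of arguments
```

The definition alone computes the relation; establishing ∀X Y U V [plus(X, Y, U) ∧ plus(X, Y, V) → U = V] for all numerals, rather than for 25 of them, is exactly what requires the completion and induction.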



Similarly, to show, for example, that append is associative, the definition of append needs to be augmented both with the completion and induction over lists. Because many program properties can be expressed in the logic programming sublanguage of first-order logic, it can be hard to distinguish between clauses that are needed for computation, and clauses that are emergent properties. A similar problem arises with deductive databases. As Nicolas and Gallaire [1978] observed, it can be hard to distinguish between clauses that define data, and integrity constraints that restrict data. For real applications, these distinctions are essential. For example, without making these distinctions, a programmer can easily write a program that includes both the definition of append and the property that append is associative. The resulting, “purely declarative” logic program would be impossibly inefficient. The analogy with arithmetic helps to clarify the relationships between the different semantics of logic programs: It suggests that the completion augmented with induction schemas is like the first-order axioms for Peano arithmetic, and the minimal model is like the standard model of arithmetic. The fact that both notions of arithmetic have a place in mathematics suggests that both kinds of “semantics” also have a place in logic programming. Interestingly, the analogy also works in the other direction. The fact that minimal models are the denotations of logic programs shows that the standard model of arithmetic has a syntactic core, which consists of the Horn clauses that define addition and multiplication.
Martin Davis [1980] makes a similar point, but his core is essentially the Horn clause definitions of addition and multiplication augmented with the Clark Equality Theory:

∃x.Z(x)
∀xy.[Z(x) ∧ Z(y) ⊃ x = y]
∀x.∃y.S(x, y)
∀xy.[S(x, y) ⊃ ¬Z(y)]
∀xy.[Z(y) ⊃ A(x, y, x)]
∀xyzuv.[A(x, y, z) ∧ S(y, u) ∧ S(z, v) ⊃ A(x, u, v)]
∀xy.[Z(y) ⊃ P(x, y, y)]
∀xyzuv.[P(x, y, z) ∧ S(y, u) ∧ A(z, x, v) ⊃ P(x, u, v)]

Here Z(x) stands for “x is zero”, S(x, y) for “y is the successor of x”, A(x, y, z) for “x + y = z” and P(x, y, z) for “xy = z”. Arguably, the syntactic core of the standard model of arithmetic explains how we can understand what it means for a sentence of arithmetic to be true, even if it may be impossible to prove that the sentence is true.


Database semantics

In the same workshop in which Clark presented his work, Nicolas and Gallaire [1978] considered related issues from a database perspective. They characterised



the relational database approach as viewing databases as model-theoretic structures (or interpretations), and the deductive database approach as viewing databases as theories. They argued that, in relational databases, both query evaluation and integrity constraint satisfaction are understood as evaluating the truth value of a sentence in an interpretation. But in deductive databases, they are understood as determining whether the sentence is a theorem, logically implied by the database viewed as a theory. Hence the term “deductive”. In retrospect, it is now clear that both kinds of databases, whether relational or “deductive”, can be viewed either as an interpretation or as a theory. A more fundamental issue at the time of the 1978 workshop was the inability of the relational calculus and relational algebra to define recursive relations, such as the transitive closure of a binary relation. Aho and Ullman [1979] proposed to remedy this by extending the relational algebra with fixed point operators. This proposal was pursued by Chandra and Harel [1982], who classified and analysed the complexity of the resulting hierarchy of query languages. Previously, Harel [1980] had published a harsh review of the logic and databases workshop proceedings [Gallaire and Minker, 1979], criticising it for claiming that deductive databases define relations in first-order logic despite the fact that transitive closure cannot be defined in first-order logic. During the 1980s, the deductive database community, with roots mainly in artificial intelligence, became assimilated into a new Datalog community, influenced by logic programming, but with its roots firmly in the database field. In keeping with its database perspective, Datalog excludes function symbols. So all Herbrand models are finite, and are computable bottom-up. But pure bottom-up computation, whether viewed as model generation or as theorem-proving, ignores the query until it derives it as though by accident. 
To make model generation relevant to the query, Datalog uses transformations such as Magic Sets [Bancilhon et al., 1985] to incorporate the query into the transformed database rules. As a consequence of its model generation approach, Datalog ignores the completion semantics in favour of the minimal model and fixed point semantics. For example, the surveys by Ceri, Gottlob and Tanca [1989], and Ramakrishnan and Ullman [1993], and even the more general survey of the complexity and expressive power of logic programming by Dantsin, Eiter, Gottlob and Voronkov [2001] mention the completion only in passing. Minker’s [1996] retrospective on Logic and Databases acknowledges the distinctive character of Datalog, but also includes the completion semantics. In particular, the completion semantics contributed to investigations of the semantics of integrity constraints, which was an important topic in deductive databases, before the field of Datalog fully emerged.


Theoretical investigations of the completion semantics continued, and were highlighted in John Lloyd’s [1985, 1987] influential Foundations of Logic Programming



book, which included results from Keith Clark’s [1980] unpublished PhD thesis. Especially important among the later results were the three-valued completion semantics of Fitting [1985] and Kunen [1987], which give, for example, the truth value undefined to p in the program p ← not p, whose completion is inconsistent in two-valued logic. This and other work on the completion semantics are presented in Shepherdson’s [1988] survey. Much of this work concerns the correctness and completeness of SLDNF-resolution (SLD-resolution extended with negation as finite failure), relative to the completion semantics.



The most significant next step in the investigation of negation was the study of stratified negation in database queries by Chandra and Harel [1985] and Naqvi [1986]. The simplest example of a stratified logic program is that of a deductive database E ∪ I whose predicates are partitioned into extensional predicates, defined by facts E, and intensional predicates, defined in terms of the extensional predicates by facts and rules I. Consider, for example, a network of nodes, some of whose links at any given time may be broken.14 This can be represented by an extensional database, say:

E:

link(a, b)
link(a, c)
link(b, c)
broken(a, c)

Two nodes in the network are connected if there is a path of unbroken links. This can be represented intensionally by the clauses:

I:

connected(X, Y) ← link(X, Y) ∧ not broken(X, Y)
connected(X, Y) ← connected(X, Z) ∧ connected(Z, Y)

The conditions of the first clause in I are completely defined by E. So they can be evaluated independently of I. The use of E to evaluate these conditions results in a set of Horn clauses I′ ∪ E, which intuitively has the same meaning as I ∪ E:

I′:

connected(a, b)
connected(b, c)
connected(X, Y) ← connected(X, Z) ∧ connected(Z, Y)

The natural, intended model of the original deductive database E ∪ I is the minimal model M of the resulting set of Horn clauses I′ ∪ E:

M:

link(a, b)
link(a, c)
link(b, c)
broken(a, c)
connected(a, b)
connected(b, c)
connected(a, c)
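The stratum-by-stratum evaluation just described can be sketched in Python (an illustration of the construction, using the predicate names of the example; the variable names are not from the text):

```python
# Illustrative sketch of stratified evaluation for the example above.
# Stratum 0: the extensional predicates link and broken, given as facts E.
E_link = {("a", "b"), ("a", "c"), ("b", "c")}
E_broken = {("a", "c")}

# Evaluating 'link(X,Y) ∧ not broken(X,Y)' against E yields the facts of I′;
# 'not' is safe here because broken is fully defined in stratum 0.
connected = {(x, y) for (x, y) in E_link if (x, y) not in E_broken}

# Fixed point of the recursive clause
#   connected(X, Y) <- connected(X, Z), connected(Z, Y).
while True:
    new = {(x, z) for (x, y) in connected for (y2, z) in connected if y == y2}
    if new <= connected:
        break
    connected |= new

print(sorted(connected))  # → [('a', 'b'), ('a', 'c'), ('b', 'c')]
```

The result is exactly the connected/2 part of the intended model M: connected(a, c) is derived via b, even though the direct link from a to c is broken.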

14 This example is inspired by the following quote from Hellerstein [2010]: “Classic discussions of Datalog start with examples of transitive closure on family trees: the dreaded anc and desc relations that afflicted a generation of graduate students. My group’s work with Datalog began with the observation that more interesting examples were becoming hot topics: Web infrastructure such as webcrawlers and PageRank computation were essentially transitive closure computations, and recursive queries should simplify their implementation.”



This construction can be iterated if the intensional part of the database is also partitioned into layers (or strata). The further generalisation from databases to logic programs with function symbols was investigated independently by van Gelder [1989] and by Apt, Blair and Walker [1988]. Let P be a logic program, and let Pred = Pred₀ ∪ . . . ∪ Predₙ be a partitioning and ordering of the predicate symbols of P. If A is an atomic formula, let stratum(A) = i if and only if the predicate symbol of A is in Predᵢ. Then P is stratified (with respect to this stratification of the predicate symbols) if and only if for every clause head ← body in P and for every condition C in body:

if C is an atomic condition, then stratum(C) ≤ stratum(head)
if C is a negative condition not A, then stratum(A)