Explainable Neural Networks Based on Fuzzy Logic and Multi-criteria Decision Tools 3030722791, 9783030722791

The research presented in this book shows how combining deep neural networks with a special class of fuzzy logical rules


English Pages 160 [186] Year 2021



Author / Uploaded
József Dombi
Orsolya Csiszár

Table of contents :
Foreword
Preface
Introduction—Aggregation and Intelligent Decision
Contents
List of Figures
List of Tables
Part I: Elements of Nilpotent Fuzzy Logic
1 Connectives: Conjunctions, Disjunctions and Negations
1.1 Introduction
1.2 Preliminaries
1.2.1 Negations
1.2.2 Triangular Norms and Conorms
1.3 Characterization of Strict Negation Operators
1.4 Nilpotent Connective Systems
1.4.1 Structural Properties of Connective Systems
1.4.2 Consistent Connective Systems
1.5 Summary
References
2 Implications
2.1 Introduction
2.2 Preliminaries
2.3 R-Implications in Bounded Systems
2.4 S-Implications in Bounded Systems
2.4.1 Properties of $i_{S_n}$, $i_{S_d}$ and $i_{S_c}$
2.4.2 S-Implications and the Ordering Property
2.5 A Comparison of Implications in Bounded Systems
2.6 Min and Max Operators in Nilpotent Connective Systems
2.7 Summary
References
3 Equivalences
3.1 Introduction
3.2 Preliminaries
3.3 Equivalences in Bounded Systems
3.3.1 Properties of $e_c(x, y)$ and $e_d(x, y)$
3.4 Dual Equivalences
3.4.1 Properties of $\bar{e}_d$ and $\bar{e}_c$
3.5 Arithmetic Mean Operators in Bounded Systems
3.6 Aggregated Equivalences
3.6.1 Properties of the Aggregated Equivalence Operator
3.7 Applications
3.8 Summary
References
4 Modifiers and Membership Functions in Fuzzy Sets
4.1 Introduction
4.2 Unary Operators in Nilpotent Logical Systems
4.2.1 Possibility and Necessity as Unary Operators Derived from Multivariable Operators
4.2.2 Drastic Unary Operators
4.2.3 Composition Rules
4.2.4 Multivariable Operators Derived from Unary Operators
4.2.5 A General Framework: The α, β, γ-Model
4.3 Unary Operators Induced by Negation Operators
4.4 Membership Functions
4.5 Non-membership Functions
4.6 Summary
References
Part II: Decision Operators
5 Aggregative Operators
5.1 Introduction
5.2 Preliminaries
5.3 Shifting Transformations on the Generator Functions – A General Parametric Formula
5.4 The Weighted General Operator
5.5 Properties of the General and the Weighted General Operator
5.5.1 The De Morgan Property
5.5.2 Bisymmetry
5.6 The Two-Variable General and Weighted Aggregative Operator
5.7 Summary
References
6 Preference Operators
6.1 Introduction
6.2 Operators of Nilpotent Systems - A General Framework
6.2.1 Normalization of the Generator Functions
6.2.2 The General Parametric Operator
6.2.3 The Unary Operators: Negation, Modifiers and Hedges
6.3 Preference Modeling
6.4 Properties of the Preference Operator
6.4.1 Basic Properties
6.4.2 Ordering Properties
6.4.3 Preference and Negation
6.4.4 Preference, Conjunction and Disjunction
6.4.5 Preference and Aggregation
6.4.6 Additive Transitivity
6.4.7 Bisymmetry and the Common Base Property
6.4.8 Preference and Unary Operators
6.5 Summary
References
Part III: Learning and Neural Networks
7 Squashing Functions
7.1 Introduction
7.2 Łukasiewicz Operators
7.3 Approximation of the Cutting Function
7.3.1 The Sigmoid Function
7.3.2 The Interval [a,b] Squashing Function
7.3.3 The Error of the Approximation
7.4 Approximation of Piecewise Linear Membership Functions
7.5 Summary
References
8 Learning Rules
8.1 Introduction
8.2 Problem Definition and Solution Outline
8.3 Preliminaries
8.4 The Structure and Representation of the Rules
8.5 The Optimization Process
8.5.1 Rule Optimization by GA
8.5.2 A Gradient-Based Local Optimization of Memberships
8.6 Applications
8.7 Summary
References
9 Interpretable Neural Networks Based on Continuous-Valued Logic and Multi-criteria Decision Operators
9.1 Introduction
9.2 Related Work
9.3 Nilpotent Logical Systems and Multicriteria Decision Tools
9.4 Nilpotent Logic-Based Interpretation of Neural Networks
9.5 Playground Examples
9.5.1 XOR
9.5.2 Preference
9.6 Summary
References
10 Conclusions

Studies in Fuzziness and Soft Computing

József Dombi Orsolya Csiszár

Explainable Neural Networks Based on Fuzzy Logic and Multi-criteria Decision Tools

Studies in Fuzziness and Soft Computing Volume 408

Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

The series “Studies in Fuzziness and Soft Computing” contains publications on various topics in the area of soft computing, which include fuzzy sets, rough sets, neural networks, evolutionary computation, probabilistic and evidential reasoning, multi-valued logic, and related fields. The publications within “Studies in Fuzziness and Soft Computing” are primarily monographs and edited volumes. They cover significant recent developments in the field, both of a foundational and applicable character. An important feature of the series is its short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. Indexed by SCOPUS, DBLP, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.

More information about this series at http://www.springer.com/series/2941

József Dombi Institute of Informatics University of Szeged Szeged, Hungary

Orsolya Csiszár Faculty of Basic Sciences Esslingen University of Applied Sciences Esslingen, Germany Institute of Applied Mathematics Óbuda University Budapest, Hungary

ISSN 1434-9922 ISSN 1860-0808 (electronic)
Studies in Fuzziness and Soft Computing
ISBN 978-3-030-72279-1 ISBN 978-3-030-72280-7 (eBook)
https://doi.org/10.1007/978-3-030-72280-7

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Foreword

In the traditional two-valued logic, each statement is either true or false. However, for imprecise (“fuzzy”) properties like “small” (or “tall”), in many cases, we are not 100% sure that some value is small: we only have some degree of confidence that this value is small and some non-zero degree of confidence that this value is not small. This phenomenon is the main idea behind fuzzy logic. In fuzzy logic, for each property $P$ and for each object $x$, for the statement $P(x)$ (“$x$ has the property $P$”), in addition to possible “true”–“false” values (which in a computer are usually represented by 1 and 0), we also have degrees of confidence that take intermediate values, i.e., values from the interval $[0, 1]$.

In fuzzy logics, the law of contradiction (that $A \,\&\, \neg A$ is always false) and the law of excluded middle (that $A \vee \neg A$ is always true) are, in general, false. However, these two laws are true for some fuzzy “and” and “or” operations, namely for operations which are isomorphic to $f_{\&}(a, b) = \max(a + b - 1, 0)$ and $f_{\vee}(a, b) = \min(a + b, 1)$. These two types of operations are atypical, and, because of this, they are rarely studied and rarely used in applications. On the other hand, the law of contradiction and the law of excluded middle have an intuitive appeal. It is therefore reasonable to study fuzzy logics in which these two laws are satisfied.

Such a study is one of the main foci of this book. It analyzes the triples of “and”, “or”, and “not” operations that satisfy these two laws and that are, in some reasonable sense, consistent with each other. The book provides a full description of all such triples and shows that for most such triples, we can define, in addition to the main negation, several additional negation operations, which is also in good agreement with our intuition, where we usually distinguish between, e.g., the usual negation (such as “not small”) and a strong negation (such as “large”).
The authors also study how the need to be reasonably consistent with the corresponding triple affects implication operations, hedges (like “very”), and different averaging operations, ranging from symmetric ones (that treat all the inputs equally) to weighted ones, where some inputs are given more weight than others. A very interesting (and innovative) part of the book is the study of preference operations $a < b$ that describe to what extent $b$ is preferable to $a$. These operations have many properties in common with implication, as a result of which they are often identified with implication operations, but, as the authors show, there is a subtle but important difference.

These parts are very interesting and important by themselves, but the most interesting part, in my opinion, is the relation with deep neural networks. In deep neural networks, data processing consists of alternately performing linear transformations and the rectified linear transformation $f(x) = \max(0, x)$. Interestingly, both operations $f_{\&}(a, b) = \max(a + b - 1, 0)$ and $f_{\vee}(a, b) = \min(a + b, 1)$ can be easily represented as a composition of such neural-network transformations. This fact helps interpret transformations in a deep neural network in terms of “and” and “or” operations, and thus leads to the possibility to supplement the empirical success of deep neural networks with something which is currently largely missing in machine learning: natural-language interpretation of their results.

And then comes another interesting twist. Remember that the book starts with the idea to limit ourselves to fuzzy logics in which the law of contradiction and the law of excluded middle are satisfied; the resulting formula turned out to be equivalent to what is used in the current deep learning techniques. But what if we use more general fuzzy logics, i.e., replace the original fuzzy operations with the ones for which the law of contradiction and the law of excluded middle are only approximately true? In terms of neural networks, this is equivalent to replacing the current activation function $f(x) = \max(0, x)$ with different ones. On a few examples, the authors show that this idea is also very promising: on these examples, it leads to more efficient learning.

Summarizing: the research presented in this book shows how fuzzy logic can make deep neural networks more interpretable and even, in many cases, more efficient.
These are very interesting and very promising results, and I am saying it with 100% confidence :-).

January 2021

Vladik Kreinovich
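
The two observations made in the Foreword about $f_{\&}(a, b) = \max(a + b - 1, 0)$ and $f_{\vee}(a, b) = \min(a + b, 1)$ can be checked numerically. The following is a minimal sketch and is not taken from the book: the function names (`relu`, `f_not`, `f_and`, `f_or`) are illustrative, and the standard negation $1 - a$ is assumed as the “not” operation.

```python
def relu(x: float) -> float:
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def f_not(a: float) -> float:
    """Standard negation 1 - a (assumed here as the "not" operation)."""
    return 1.0 - a


def f_and(a: float, b: float) -> float:
    """max(a + b - 1, 0), written as ReLU applied to a linear map."""
    return relu(a + b - 1.0)


def f_or(a: float, b: float) -> float:
    """min(a + b, 1), written as 1 - ReLU(1 - a - b)."""
    return 1.0 - relu(1.0 - a - b)


if __name__ == "__main__":
    for a in (0.0, 0.25, 0.5, 0.9, 1.0):
        # Law of contradiction: "a and not a" should have degree 0.
        # Law of excluded middle: "a or not a" should have degree 1.
        print(a, f_and(a, f_not(a)), f_or(a, f_not(a)))
```

Both operations use only a linear map followed by `relu` (plus one further linear map for `f_or`), which is the sense in which they can be composed from standard neural-network layers. Results may differ from 0 and 1 by floating-point rounding.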

Preface

This monograph consists of new research results developed by the authors and their co-authors, Zsolt Gera and Gábor Csiszár, and it focuses on a special class of continuous-valued logic and multi-criteria decision tools. Based on their common theoretical basis, we propose a consistent framework for modeling human thinking by using the tools of both fields: fuzzy logical operators as well as multi-criteria decision tools, such as aggregative and preference operators. Fuzzy logic together with multi-criteria decision-making tools provides very powerful tools for modeling human thinking. Another successful field in this direction is that of artificial neural networks, which were inspired by the biological neural networks that constitute human brains. Deep learning based on neural networks is revolutionizing the business and technology world. However, there is an increasing need to address the problem of interpretability, safety, and transparency. This challenge is closely related to the fact that although deep neural networks have achieved impressive experimental results, especially in image classification, they have been shown to be surprisingly unstable when it comes to adversarial perturbations: Minimal changes to the input image may cause the network to misclassify it. Moreover, although machine learning algorithms are capable of learning from a set of data and of producing a model that can be used to solve different problems, the values of the accuracy or the prediction error are insufficient, since these only provide an incomplete description of most real-world problems. The interpretability of a machine learning model gives an insight into its internal functionality to explain the reasons why it suggests making certain decisions. In a high-risk environment, it is vital to know why a decision was made; the predictive performance on a test dataset is not enough. 
In black-box models, less is known about which influencing variables are actually driving the final decision. The relationship between the input and output is often limited in complexity and local interpretations. Although white-box models such as linear regression and decision trees are significantly easier to explain and interpret, they provide less predictive capacity, and they are not always capable of modeling the inherent complexity of the dataset (i.e., feature interactions).


In this book, we offer a unified framework for logical operators and decision tools with applications in neural computation. We show how combining deep neural networks with structured logical rules and multi-criteria decision tools might help reduce the black-box nature of neural models. We strongly believe that our work is an important step toward better interpretability, transparency, and safety of neural models.

Introduction—Aggregation and Intelligent Decision

Aggregation is the process used to combine several numerical values into a single representative value. The function that performs this process is called an aggregation function. Despite the simplicity of this definition, the range of its applications is very large: applied mathematics (e.g., probability theory, statistics, decision theory), computer science (e.g., artificial intelligence, operations research, pattern recognition, and image processing), economics and finance, and multi-criteria decision aids (see, e.g., [16]). The main factor in determining the structure of the required aggregation function is the relationship among the criteria. At one extreme, there is the case in which all the criteria must be satisfied. At the other extreme is the case in which at least one of the criteria must be satisfied. These two extreme cases lead to the use of “and” and “or” operators. A decision can be interpreted as the intersection (“and” operator) of fuzzy sets when there is no compensation between low and high values. If the decision is interpreted as the union (“or” operator) of fuzzy sets, full compensation is assumed. This means that logical operators can be viewed as decision functions. However, it is obvious that no managerial decision represents either of these extreme situations. Aggregative operators can model compensation to a certain degree.

Another important application of aggregation functions comes from artificial intelligence and fuzzy logic [61, 62]. Pattern recognition and classification, as well as image analysis, are typical examples.

According to Aristotle, in mathematics, it was originally assumed that “the same thing cannot at the same time both belong and not belong to the same object and in the same respect. Of any object, one thing must be either asserted or denied.” The idea of many-valued logic was initiated by Jan Łukasiewicz around 1920.
“Logic changes from its very foundations if we assume that in addition to truth and falsehood there is also some third logical value or several such values” [90]. Many-valued logic was for decades considered a purely theoretical topic. It was the introduction of fuzzy sets by Zadeh in 1965 [159] which opened the way to fuzzy logics. Aggregation functions are inevitably used in fuzzy logic, as a generalization of logical connectives. In artificial intelligence, these techniques are mainly used when a system has to make a decision. It is possible that the system has not just a single criterion for each alternative, but several criteria. This case corresponds to a multi-criteria decision-making problem. Furthermore, if a system needs to have a good representation of an environment, it requires the knowledge supplied by information sources in order to be reliable. However, the information supplied by a single source (by an expert or sensor) is often not reliable enough. This is why the information provided by several sensors (or experts) should be combined to improve data reliability and accuracy and also to include some features that are impossible to perceive with individual sensors.

One of the most significant problems of fuzzy set theory is the proper choice of set-theoretic operations [126, 130]. Here, the fulfillment of the law of contradiction and the excluded middle and the coincidence of the residual and the S-implication are desired [61, 144]. Since the class of nilpotent t-norms and t-conorms has these nice properties, they are helpful in building up logical structures.

This book is organized as follows. In Part I, we focus on fuzzy nilpotent logical systems. First, in Chapter 1, we give an insight into the logical connectives, and then, in Chapters 2 and 3, we examine implications and equivalence operators. Due to the fact that all continuous Archimedean (i.e., representable) nilpotent t-norms are isomorphic to the Łukasiewicz t-norm, the previously studied nilpotent systems were all isomorphic to the well-known Łukasiewicz logic. Here, we show that a consistent logical system generated by nilpotent operators is not necessarily isomorphic to Łukasiewicz logic. We introduce the so-called general nilpotent logical systems, which have the advantage of three naturally derived negations. In Chapter 4, we describe the modifiers and membership functions in fuzzy sets based on the generator function of the logical operators.
Using the same generator function gives a unified framework, where all the operators are connected to each other. In this way, we can describe all the operators using a generator function and only a few parameters.

In Part II, we focus on the decision operators: in Chapter 5 on the aggregative operator and in Chapter 6 on preference modeling, a crucial part of multi-criteria decision-making. Since the modeling is always affected by the presence of different kinds of uncertainty due to the imperfect human knowledge available and the limited capability of observation and/or discrimination, fuzzy set theory can offer a solution on how to handle uncertainty. Here, the decision operators together with the logical operators and modifiers discussed in Part I form a consistent framework, since all the operators are based on the same generator function and they are all connected to each other.

After having presented this rich array of operators, we turn to practical applications in machine learning in Part III. The results presented in this part are the fruits of a collaboration between the authors, Zsolt Gera and Gábor Csiszár. In Chapter 7, we introduce the so-called squashing functions, as differentiable approximations of the cutting function. This approximation will be needed for the gradient-based optimization processes in the applications. In Chapters 8 and 9, we provide two application examples to demonstrate the usefulness of the proposed unified framework and show how combining deep neural networks with structured logical rules and multi-criteria decision tools really contributes to the reduction of the black-box nature of neural models.
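
The role of the squashing functions mentioned for Chapter 7, namely replacing the non-differentiable cutting function by a smooth approximation suitable for gradient-based optimization, can be illustrated with a small example. This is a minimal sketch under stated assumptions, not the book's definition: the cutting function is assumed to be the clamp of $x$ to $[0, 1]$, and the smooth surrogate is a scaled difference of two softplus terms with a hypothetical sharpness parameter `beta`.

```python
import math


def cut(x: float) -> float:
    """Cutting function (assumed form): clamp x to the interval [0, 1]."""
    return min(max(x, 0.0), 1.0)


def softplus(z: float) -> float:
    """Numerically stable evaluation of log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def smooth_cut(x: float, beta: float = 20.0) -> float:
    """Differentiable surrogate for cut(x); it gets sharper as beta grows."""
    return (softplus(beta * x) - softplus(beta * (x - 1.0))) / beta


if __name__ == "__main__":
    for x in (-0.5, 0.0, 0.3, 1.0, 1.5):
        print(x, cut(x), round(smooth_cut(x), 4))
```

Because `softplus(beta * x) / beta` differs from `max(x, 0)` by at most `log(2) / beta`, the surrogate stays within that distance of `cut(x)` everywhere, so a larger `beta` gives a closer but steeper approximation.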

Contents

Part I

Elements of Nilpotent Fuzzy Logic

1

Connectives: Conjunctions, Disjunctions and Negations 1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.1 Negations . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.2 Triangular Norms and Conorms . . . . . . . . . . 1.3 Characterization of Strict Negation Operators . . . . . 1.4 Nilpotent Connective Systems . . . . . . . . . . . . . . . . 1.4.1 Structural Properties of Connective Systems . 1.4.2 Consistent Connective Systems . . . . . . . . . . 1.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

3 3 4 4 6 8 9 12 18 25 28

2

Implications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3 R-Implications in Bounded Systems . . . . . . . . . . . . . . . . 2.4 S-Implications in Bounded Systems . . . . . . . . . . . . . . . . 2.4.1 Properties of iSn ; iSd and iSc . . . . . . . . . . . . . . . . . 2.4.2 S-Implications and the Ordering Property . . . . . . . 2.5 A Comparison of Implications in Bounded Systems . . . . 2.6 Min and Max Operators in Nilpotent Connective Systems 2.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . .

29 29 30 32 34 35 36 38 38 39 41

3

Equivalences . . . . . . . . . . . . . . . . . . . . . . . 3.1 Introduction . . . . . . . . . . . . . . . . . . . . 3.2 Preliminaries . . . . . . . . . . . . . . . . . . . . 3.3 Equivalences in Bounded Systems . . . . 3.3.1 Properties of ec ðx; yÞ and ed ðx; yÞ

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

43 43 45 46 47

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . . . . . . . .

. . . . .

. . . . . . . . . . .

. . . . .

. . . . . . . . . . .

. . . . .

. . . . .

xiii

xiv

Contents

3.4 Dual Equivalences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.4.1 Properties of ed and ec . . . . . . . . . . . . . . . . . . . . 3.5 Arithmetic Mean Operators in Bounded Systems . . . . . . . 3.6 Aggregated Equivalences . . . . . . . . . . . . . . . . . . . . . . . . 3.6.1 Properties of the Aggregated Equivalence Operator 3.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

6

. . . . . .. .. ..

. . . . . . . .

. . . . . . . .

. . . . . . . .

Modifiers and Membership Functions in Fuzzy Sets . . . . . . . . . . . 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2 Unary Operators in Nilpotent Logical Systems . . . . . . . . . . . . . 4.2.1 Possibility and Necessity as Unary Operators Derived from Multivariable Operators . . . . . . . . . . . . . . . . . . . . 4.2.2 Drastic Unary Operators . . . . . . . . . . . . . . . . . . . . . . . . 4.2.3 Composition Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2.4 Multivariable Operators Derived from Unary Operators . . . 4.2.5 A General Framework: The a; b; c- Model . . . . . . . . . . . 4.3 Unary Operators Induced by Negation Operators . . . . . . . . . . . . 4.4 Membership Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.5 Non-membership Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Part II 5

. . . .

49 50 52 53 54 58 58 60 63 63 64 65 68 69 70 71 72 77 79 80 81

Decision Operators

Aggregative Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3 Shifting Transformations on the Generator Functions – A General Parametric Formula . . . . . . . . . . . . . . . . . . . . . . . . . 5.4 The Weighted General Operator . . . . . . . . . . . . . . . . . . . . . . . . 5.5 Properties of the General and the Weighted General Operator . . . 5.5.1 The De Morgan Property . . . . . . . . . . . . . . . . . . . . . . . 5.5.2 Bisymmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.6 The Two-Variable General and Weighted Aggregative Operator . . . 5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Preference Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2 Operators of Nilpotent Systems - A General Framework . . . 6.2.1 Normalization of the Generator Functions . . . . . . . . 6.2.2 The General Parametric Operator . . . . . . . . . . . . . . . 6.2.3 The Unary Operators: Negation, Modifiers and Hedges

... ... ... ... ... ...

85 85 86 87 88 89 89 93 94 99 99 101 101 103 104 105 107

Contents

xv

6.3 Preference Modeling . . . . . . . . . . . . . . . . . . . . . . . 6.4 Properties of the Preference Operator . . . . . . . . . . . 6.4.1 Basic Properties . . . . . . . . . . . . . . . . . . . . . 6.4.2 Ordering Properties . . . . . . . . . . . . . . . . . . . 6.4.3 Preference and Negation . . . . . . . . . . . . . . . 6.4.4 Preference, Conjunction and Disjunction . . . 6.4.5 Preference and Aggregation . . . . . . . . . . . . . 6.4.6 Additive Transitivity . . . . . . . . . . . . . . . . . . 6.4.7 Bisymmetry and the Common Base Property 6.4.8 Preference and Unary Operators . . . . . . . . . 6.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

108 109 109 111 112 113 114 114 115 116 116 117

7

Squashing Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2 Łukasiewicz Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.3 Approximation of the Cutting Function . . . . . . . . . . . . . . . 7.3.1 The Sigmoid Function . . . . . . . . . . . . . . . . . . . . . 7.3.2 The Interval ½a; b Squashing Function . . . . . . . . . . 7.3.3 The Error of the Approximation . . . . . . . . . . . . . . 7.4 Approximation of Piecewise Linear Membership Functions 7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

121 121 122 124 125 125 130 131 133 134

8

Learning Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.2 Problem Definition and Solution Outline . . . . . . . . . . . . . . . 8.3 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.4 The Structure and Representation of the Rules . . . . . . . . . . 8.5 The Optimization Process . . . . . . . . . . . . . . . . . . . . . . . . . 8.5.1 Rule Optimization by GA . . . . . . . . . . . . . . . . . . . . 8.5.2 A Gradient-Based Local Optimization of Memberships 8.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

... ... ... ... ... ... ... ... ... ... ...

135 135 136 137 139 140 140 141 142 144 144

9 Interpretable Neural Networks Based on Continuous-Valued Logic and Multi-criteria Decision Operators
9.1 Introduction
9.2 Related Work
9.3 Nilpotent Logical Systems and Multicriteria Decision Tools
9.4 Nilpotent Logic-Based Interpretation of Neural Networks

Part III Learning and Neural Networks

9.5 Playground Examples
9.5.1 XOR
9.5.2 Preference
9.6 Summary
References

10 Conclusions

List of Figures

Fig. 1.1 Continuous negations with different $\nu_*$ values
Fig. 1.2 [caption missing]
Fig. 1.3 The relationship between $\nu$, $\nu_c$ and $\nu_d$ in consistent rational systems
Fig. 1.4 Conjunction $c[x, y]$ and disjunction $d[x, y]$
Fig. 2.1 $i_c$ and $i_d$ implications for rational generators
Fig. 2.2 $S_n$-implications for rational generators
Fig. 2.3 $S_c$-implications for rational generators
Fig. 3.1 $e_c(x, y)$ and $e_d(x, y)$ for rational generators
Fig. 3.2 $e_c(x, y)$ and $e_d$ with rational generators
Fig. 3.3 The domain of aggregated equivalences
Fig. 3.4 Aggregated equivalences with rational generators with $\nu = 0.3$
Fig. 3.5 Pointwise equivalence of fuzzy numbers with rational generators ($\nu_c = \nu_d = 0.3$)
Fig. 3.6 Pointwise dual equivalence of fuzzy numbers with rational generators ($\nu_c = \nu_d = 0.3$)
Fig. 3.7 Pointwise aggregated equivalence of triangular fuzzy numbers with rational generators ($\nu = 0.6$)
Fig. 4.1 Unary operators generated by $f(x) = x$
Fig. 4.2 Unary operators "not", "impossible", "possible" and "necessary"
Fig. 4.3 $\tau^{N}_{\nu_d,\nu_c}(x)$ and $\tau^{P}_{\nu_c,\nu_d}(x)$ for A = 0.1, 0.4 and 0.75
Fig. 4.4 Membership functions $\delta_\varepsilon^{(\lambda)}(x)$ generated by $f_c(x) = 1 - x$
Fig. 4.5 Membership functions $\delta_\varepsilon^{(\lambda)}(x)$ generated by $f_c(x) = \frac{1}{1 + \frac{1-\nu_c}{\nu_c}\frac{x}{1-x}}$
Fig. 4.6 Non-membership functions $\hat{\delta}_\varepsilon^{(\lambda)}(x)$ generated by $f_d(x) = x$
Fig. 4.7 Non-membership functions $\hat{\delta}_\varepsilon^{(\lambda)}(x)$ generated by $f_d(x) = \frac{1}{1 + \frac{\nu_d}{1-\nu_d}\frac{1-x}{x}}$
Fig. 5.1 The shifting transformation in the linear case, $f_\nu^{-1}(x)$ for $\nu = 0$, $\nu = \nu_*$, $\nu = 1$, where $\nu_* = f^{-1}\left(\frac{1}{2}\right)$
Fig. 5.2 The weighted aggregative operator $a_w$ for $f(x) = \frac{1}{1 + \frac{\nu_d}{1-\nu_d}\frac{1-x}{x}}$, $\nu_d = 0.8$
Fig. 5.3 The uninorm-like property of the weighted aggregative operator $a_1(x, y)$
Fig. 6.1 The preference operator with generator functions $f(x) = x$ and $f(x) = \sqrt{x}$, for $w = 1$ and $w = 0.5$
Fig. 7.1 The truth tables of the nilpotent conjunction, disjunction and implication
Fig. 7.2 Left: Generalized cutting functions for a = 0, b = 1, c = 2, d = 4. Right: a trapezoidal membership function constructed as the conjunction of the former two, with a negation applied to the right one
Fig. 7.3 The cutting function and its approximation
Fig. 7.4 The sigmoid function, with parameters $d = 0$ and $\beta = 4$
Fig. 7.5 The first derivative of the sigmoid function
Fig. 7.6 The integral function of the sigmoid function and another shifted by 1
Fig. 7.7 Left: the interval squashing function with an increasing $\beta$ parameter ($a = 0$ and $b = 2$). Right: the interval squashing function with a zero and a negative $\beta$ parameter
Fig. 7.8 The approximation of the nilpotent conjunction with $\beta$ values 1, 4, 8, and 32
Fig. 7.9 The meaning of $\langle a <_\delta x \rangle_\beta$
Fig. 7.10 The approximation of a trapezoid and a triangular membership function
Fig. 7.11 Unary operators approximated using the squashing function, for a = 4, b = 2
Fig. 7.12 Preference operators for $w = 1$ and $w = 2$, first using the cutting function, then the squashing function for $\beta = 3$
Fig. 8.1 On the left, the squashing function with parameters $a = 0$, $\lambda = 1$ and $\beta = 16$. On the right, soft trapezoidals with various $\beta$ values
Fig. 9.1 Nilpotent conjunction and disjunction followed by their approximations using the squashing function
Fig. 9.2 Approximations using the squashing function versus using sigmoidal functions
Fig. 9.3 Squashing functions for $a = 0.5$, $\lambda = 1$, for different $\beta$ values ($\beta_1 = 1$, $\beta_2 = 2$, $\beta_3 = 5$, and $\beta_4 = 50$)
Fig. 9.4 Nilpotent neural model
Fig. 9.5 Nilpotent perceptron model
Fig. 9.6 Basic types of neural networks with two input values using logical operators in the hidden layer used to find different regions of the plane
Fig. 9.7 Perceptron model of the conjunction and the disjunction
Fig. 9.8 Perceptron model classifying a circle with radius r
Fig. 9.9 Output values for a triangular domain using nilpotent logic and its continuous approximation for different parameter values
Fig. 9.10 Output values for a circular region using nilpotent logic (a), and its differentiable approximation (b)
Fig. 9.11 Nilpotent neural structure representing the expression ((x > 0) AND (y > 0)) OR ((x < 0) AND (y < 0))
Fig. 9.12 A nilpotent neural network block designed for preference modeling, Example 9.3
Fig. 9.13 A nilpotent neural network block designed for preference modeling, Example 9.4
Fig. 9.14 A comparison of the performance of different activation functions in Example 9.4
Fig. 9.15 Nilpotent neural network block designed for modeling XOR
Fig. 9.16 Nilpotent neural network block designed for modeling preference
Fig. 9.17 Nilpotent neural network block designed for finding a concave region

List of Tables

Table 1.1 Power functions as normalized generators
Table 1.2 Power functions as normalized generators
Table 1.3 Exponential functions as normalized generators
Table 1.4 Rational functions as normalized generators
Table 1.5 Rational functions as normalized generators (3 negations)
Table 1.6 Mixed types of normalized generator functions
Table 2.1 Properties of implications in bounded systems
Table 2.2 Rational generator functions
Table 4.1 $x_1$ and $x_2$ values for $\nu = 1$, $\nu = 0$ and $\nu = \nu_*$
Table 4.2 $x_1$ and $x_2$ values for $f(x) = x$
Table 4.3 Special values for $\alpha$, $\beta$ and $\gamma$
Table 4.4 Special values for $\gamma$
Table 4.5 Rational functions as normalized generators (2 natural negations)
Table 6.1 The most important two-variable operators
Table 6.2 The key unary operators $\tau_\nu^{(\lambda)}(x)$
Table 9.1 The most important two-variable operators
Table 9.2 The key unary operators $o_{\alpha,\gamma}(x)$
Table 9.3 Weights and biases for modeling the XOR logical gate
Table 9.4 Performance of the squashing function with $\beta = 10$, $\lambda = 1$, $a = 0.5$, compared to ReLU, Sigmoid and Tanh in Example 9.4, with learning rate 0.003 and a training-to-test-data ratio of 70%
Table 9.5 Weights and biases for modeling the preference operator

Part I

Elements of Nilpotent Fuzzy Logic

Chapter 1

Connectives: Conjunctions, Disjunctions and Negations

Abstract In fuzzy logic, the law of contradiction and the law of excluded middle do not hold in general. However, because of their intuitive appeal, it is reasonable to study fuzzy logical systems in which these laws hold. In this chapter, we give a full description of consistent conjunction, disjunction, and negation triples in such systems. We also introduce additional negation operators that naturally define thresholds for better modeling of human thinking. The results of this chapter form the basis for constructing fuzzy logical systems that can later, in Chap. 9, be represented by neural-network transformations. This representation will assist the natural language interpretation of machine learning methods.

1.1 Introduction

One of the most significant problems of fuzzy set theory is the proper choice of set-theoretic operations [1, 2]. Triangular norms and conorms have been examined thoroughly in the literature [3–6]. The best-characterized class of t-norms is that of the so-called representable t-norms, derived from the solution of the associative functional equation [7]. Representable t-norms and t-conorms are often used as conjunctions and disjunctions in logical structures [8, 9]. Henceforth we refer to representable t-norms as conjunctions ($c(x, y)$) and representable t-conorms as disjunctions ($d(x, y)$). It should also be mentioned that all strict t-norms are isomorphic to the product t-norm, while all nilpotent t-norms are isomorphic to the Łukasiewicz t-norm [6]. Łukasiewicz fuzzy logic [10–13] is the logic where the conjunction is the Łukasiewicz t-norm and the disjunction is the Łukasiewicz t-conorm.

The class of non-strict t-norms has preferable properties that make them more helpful in building up logical structures. Among these properties are the fulfillment of the law of contradiction and the excluded middle, the continuity of the implication, and the coincidence of the residual and the S-implication [14, 15]. Due to the fact that all continuous Archimedean (i.e. representable) nilpotent t-norms are isomorphic to the Łukasiewicz t-norm [6], the previously studied nilpotent systems were all isomorphic to the well-known Łukasiewicz-logic. In this chapter, we will show that a consistent logical system generated by nilpotent operators is not necessarily isomorphic to Łukasiewicz-logic [16]. Of course, this lack of isomorphy is not the result of introducing a new operator family; it simply means that the system itself is built up in a significantly different way, using more than one generator function.

This chapter is organized as follows. First, a characterization of negation operators is given in Sect. 1.3, as negations will have an important role to play in Sect. 1.4. After considering the class of connective systems generated by nilpotent operators, their structural properties are examined in Sect. 1.4. Examples of bounded systems, i.e. consistent nilpotent systems which are not isomorphic to Łukasiewicz-logic, are shown. Necessary and sufficient conditions are given for these systems to satisfy the De Morgan law, the classification property and consistency. A wide range of examples for consistent and non-consistent bounded systems can be found in Sect. 1.5.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 J. Dombi and O. Csiszár, Explainable Neural Networks Based on Fuzzy Logic and Multi-criteria Decision Tools, Studies in Fuzziness and Soft Computing 408, https://doi.org/10.1007/978-3-030-72280-7_1

1.2 Preliminaries

First, we recall some basic notations and results regarding negation operators, t-norms and t-conorms that will be useful in the sequel.

1.2.1 Negations

Definition 1.1. A unary operation $n : [0, 1] \to [0, 1]$ is called a negation if it is non-increasing and compatible with classical logic, i.e. $n(0) = 1$ and $n(1) = 0$. A negation is strict if it is also strictly decreasing and continuous. A negation is strong if it is also involutive, i.e. $n(n(x)) = x$.

Due to the continuity and strict monotonicity of $n$, for continuous negations there always exists some $\nu_*$ for which $n(\nu_*) = \nu_*$ holds. $\nu_*$ is called the neutral value of the negation, and the notation $n_{\nu_*}$ stands for a negation operator with neutral value $\nu_*$. In the literature $\nu_*$ is often denoted by $e$. In Fig. 1.1 we can see some negations with different $\nu_*$ values.

Drastic negations [17] are the so-called intuitionistic and dual intuitionistic negations (denoted by $n_0$ and $n_1$, respectively):

$$n_0(x) = \begin{cases} 1 & \text{if } x = 0 \\ 0 & \text{if } x > 0 \end{cases} \qquad \text{and} \qquad n_1(x) = \begin{cases} 1 & \text{if } x < 1 \\ 0 & \text{if } x = 1. \end{cases}$$

These drastic negations are neither continuous nor strictly decreasing; therefore they are not strict negations, but we can get them as limits of strict negations.

Fig. 1.1 Continuous negations with different $\nu_*$ values

Definition 1.2. A continuous, strictly increasing function $\varphi : [a, b] \to [a, b]$ with boundary conditions $\varphi(a) = a$, $\varphi(b) = b$ is called an automorphism of $[a, b]$.

The well-known representation theorem was obtained by Trillas.

Proposition 1.1. ([18]) $n$ is a strong negation if and only if

$$n(x) = f_n^{-1}(1 - f_n(x)),$$

where $f_n : [0, 1] \to [0, 1]$ is an automorphism of $[0, 1]$.

Remark 1.1. This result also means that $n$ is a strong negation iff

$$n(x) = f_n^{-1}\bigl(n'(f_n(x))\bigr), \tag{1.1}$$

where $f_n : [0, 1] \to [0, 1]$, called the generator function of $n$, is a strictly monotone, continuous function with $f_n(0) = 0$ and $f_n(1) = 1$, and $n'$ is a strong negation.

Example 1.1. For $f_n(x) = x^2$ and $n'(x) = \frac{1-x}{1+x}$ we get $n(x) = \sqrt{\frac{1-x^2}{1+x^2}}$.
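The representation in Proposition 1.1 and Remark 1.1 can be checked numerically. The following is a minimal sketch, not the authors' code: the helper name `negation_from_generator` and the grid are illustrative assumptions. It builds $n(x) = f_n^{-1}(n'(f_n(x)))$ for the data of Example 1.1 and compares it with the closed form given there.

```python
import math

def negation_from_generator(f, f_inv, inner=lambda t: 1.0 - t):
    """Return n(x) = f_inv(inner(f(x))).

    f     : increasing generator on [0, 1] with f(0) = 0 and f(1) = 1
    f_inv : inverse of f
    inner : strong negation applied on the generator scale
            (default 1 - t, which gives the form of Proposition 1.1)
    """
    return lambda x: f_inv(inner(f(x)))

# Example 1.1: f_n(x) = x^2 and n'(t) = (1 - t) / (1 + t)
n = negation_from_generator(
    f=lambda x: x * x,
    f_inv=math.sqrt,
    inner=lambda t: (1.0 - t) / (1.0 + t),
)

def closed_form(x):
    """Closed form stated in Example 1.1."""
    return math.sqrt((1.0 - x * x) / (1.0 + x * x))

grid = [i / 20 for i in range(21)]
# Largest deviation between the generated negation and the closed form
max_gap = max(abs(n(x) - closed_form(x)) for x in grid)
# Largest deviation from involutivity n(n(x)) = x
max_involution_error = max(abs(n(n(x)) - x) for x in grid)
```

Both `max_gap` and `max_involution_error` should be at the level of floating-point rounding, which reflects that the generated negation is strong.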

1.2.2 Triangular Norms and Conorms

A triangular norm (t-norm for short) $T$ is a binary operation on the closed unit interval $[0, 1]$ such that $([0, 1], T)$ is an abelian semigroup with neutral element 1 which is totally ordered, i.e., for all $x_1, x_2, y_1, y_2 \in [0, 1]$ with $x_1 \le x_2$ and $y_1 \le y_2$ we have $T(x_1, y_1) \le T(x_2, y_2)$, where $\le$ is the natural order on $[0, 1]$. Standard examples of t-norms are the minimum $T_M$, the product $T_P$, the Łukasiewicz t-norm $T_L$ given by $T_L(x, y) = \max(x + y - 1, 0)$, and the drastic product $T_D$ with $T_D(1, x) = T_D(x, 1) = x$, and $T_D(x, y) = 0$ otherwise. Clearly, $T_M$ and $T_D$ are the greatest and smallest t-norms, respectively, i.e., for each t-norm $T$ we have $T_D \le T \le T_M$.

A triangular conorm (t-conorm for short) $S$ is a binary operation on the closed unit interval $[0, 1]$ such that $([0, 1], S)$ is an abelian semigroup with neutral element 0 which is totally ordered. Standard examples of t-conorms are the maximum $S_M$, the probabilistic sum $S_P$, the Łukasiewicz t-conorm $S_L$ given by $S_L(x, y) = \min(x + y, 1)$, and the drastic sum $S_D$ with $S_D(0, x) = S_D(x, 0) = x$, and $S_D(x, y) = 1$ otherwise. Clearly, $S_M$ and $S_D$ are the smallest and greatest t-conorms, respectively, i.e., for each t-conorm $S$ we have $S_M \le S \le S_D$.

A continuous t-norm $T$ is said to be Archimedean if $T(x, x) < x$ holds for all $x \in (0, 1)$. A continuous Archimedean $T$ is called strict if $T$ is strictly monotone, i.e. $T(x, y) < T(x, z)$ whenever $x \in (0, 1]$ and $y < z$, and nilpotent if there exist $x, y \in (0, 1)$ such that $T(x, y) = 0$.

From the duality between t-norms and t-conorms, we can easily derive the following properties. A continuous t-conorm $S$ is said to be Archimedean if $S(x, x) > x$ holds for every $x \in (0, 1)$. A continuous Archimedean $S$ is called strict if $S$ is strictly monotone, i.e. $S(x, y) < S(x, z)$ whenever $x \in [0, 1)$ and $y < z$, and nilpotent if there exist $x, y \in (0, 1)$ such that $S(x, y) = 1$. A t-norm is said to be positive if $x, y > 0$ implies $T(x, y) > 0$.

Proposition 1.2. ([19, 20]) A function $T : [0, 1]^2 \to [0, 1]$ is a continuous Archimedean t-norm iff it has a continuous additive generator; i.e. there exists a continuous strictly decreasing function $t : [0, 1] \to [0, \infty]$ with $t(1) = 0$, which is uniquely determined up to a positive multiplicative constant, such that

$$T(x, y) = t^{-1}\bigl(\min(t(x) + t(y), t(0))\bigr), \quad x, y \in [0, 1]. \tag{1.2}$$

Proposition 1.3. ([19, 20]) A function $S : [0, 1]^2 \to [0, 1]$ is a continuous Archimedean t-conorm if and only if it has a continuous additive generator; i.e. there exists a continuous strictly increasing function $s : [0, 1] \to [0, \infty]$ with $s(0) = 0$, which is uniquely determined up to a positive multiplicative constant, such that

$$S(x, y) = s^{-1}\bigl(\min(s(x) + s(y), s(1))\bigr), \quad x, y \in [0, 1]. \tag{1.3}$$

Proposition 1.4. [6] A t-norm $T$ is strict if and only if $t(0) = \infty$ holds for each continuous additive generator $t$ of $T$. A t-norm $T$ is nilpotent if and only if $t(0) < \infty$ holds for each continuous additive generator $t$ of $T$. A t-conorm $S$ is strict if and only if $s(1) = \infty$ holds for each continuous additive generator $s$ of $S$. A t-conorm $S$ is nilpotent if and only if $s(1) < \infty$ holds for each continuous additive generator $s$ of $S$.

In both Propositions 1.2 and 1.3 above, we can allow the generator functions to be strictly increasing or strictly decreasing, which means that they are determined up to a (not necessarily positive) multiplicative constant. For an increasing generator function $t$ of a t-norm and, similarly, for a decreasing generator function $s$ of a t-conorm, min in (1.2) and (1.3) has to be replaced by max. In this case we have $t(0) = \pm\infty$ and $s(1) = \pm\infty$ for strict norms and, similarly, $t(0) < \infty$ or $t(0) > -\infty$ and $s(1) < \infty$ or $s(1) > -\infty$ for the nilpotent ones.

Proposition 1.5. [6] Let $T$ be a continuous Archimedean t-norm. If $T$ is strict, then it is isomorphic to the product t-norm $T_P$; i.e., there exists an automorphism $\varphi$ of the unit interval such that $T_\varphi(x, y) = \varphi^{-1}(T(\varphi(x), \varphi(y))) = T_P(x, y)$. If $T$ is nilpotent, then it is isomorphic to the Łukasiewicz t-norm $T_L$; i.e., there exists an automorphism $\varphi$ of the unit interval such that $T_\varphi(x, y) = \varphi^{-1}(T(\varphi(x), \varphi(y))) = T_L(x, y)$.

From the definitions of t-norms and t-conorms it immediately follows that t-norms are conjunctive, while t-conorms are disjunctive aggregation functions. Therefore, they are widely used as conjunctions and disjunctions in multivalued logical structures. The logical system based on the nilpotent Łukasiewicz t-norm as a conjunction is called Łukasiewicz-logic [10–12]. The use of the so-called cutting function makes the formulae simpler.

Definition 1.3. ([16, 21]) Let us define the cutting operation $[\;]$ by

$$[x] = \begin{cases} 0 & \text{if } x < 0 \\ x & \text{if } 0 \le x \le 1 \\ 1 & \text{if } 1 < x \end{cases}$$

and let the notation $[\;]$ also act as 'brackets' when writing the argument of an operator. Then we can write $f[x]$ instead of $f([x])$.
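The cutting operation of Definition 1.3 and the generator form (1.2) translate directly into code. The sketch below is illustrative only: the function names are assumptions, and the two generators used ($t(x) = 1 - x$ for the Łukasiewicz t-norm and $t(x) = -\ln x$ for the product t-norm) are the standard textbook choices rather than anything fixed by this chapter.

```python
import math

def cut(x):
    """Cutting operation [x] of Definition 1.3: clamp x to [0, 1]."""
    return min(1.0, max(0.0, x))

def t_norm(t, t_inv):
    """Build T(x, y) = t_inv(min(t(x) + t(y), t(0))), as in (1.2)."""
    t0 = t(0.0)
    return lambda x, y: t_inv(min(t(x) + t(y), t0))

# Nilpotent case: t(x) = 1 - x, so t(0) = 1 is finite (Proposition 1.4)
lukasiewicz = t_norm(lambda x: 1.0 - x, lambda u: 1.0 - u)

# Strict case: t(x) = -ln(x), so t(0) is infinite (Proposition 1.4)
product = t_norm(
    lambda x: math.inf if x == 0.0 else -math.log(x),
    lambda u: math.exp(-u),
)

grid = [i / 10 for i in range(11)]
# The generated nilpotent t-norm should coincide with [x + y - 1]
max_luk_gap = max(
    abs(lukasiewicz(x, y) - cut(x + y - 1.0)) for x in grid for y in grid
)
# The generated strict t-norm should coincide with x * y
max_prod_gap = max(abs(product(x, y) - x * y) for x in grid for y in grid)
```

The nilpotent generator reaches its finite bound $t(0)$ for pairs such as $(0.3, 0.4)$, so the result is exactly zero for positive arguments; the strict generator never does this for positive arguments.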

1.3 Characterization of Strict Negation Operators

The main purpose of this section is to present a representation of strict negations with a wide range of examples, since negations will have an important role to play in the next section. First let us see some further examples of negation operators. Hamacher proved in [22] that the only negation having a polynomial form is $1 - x$, the so-called standard negation, introduced by Zadeh in [23]. He also proved that if an involutive negation belongs to the class of rational polynomials, then it has the following form:

$$n_\lambda(x) = \frac{1 - x}{1 + \lambda x}, \quad \text{where } \lambda > -1. \tag{1.4}$$

Sugeno [24] obtained the same result from the concept of fuzzy measures and integrals. In the literature, generally the standard negation $1 - x$ or, infrequently, $\frac{1-x}{1+x}$ ((1.4) for $\lambda = 1$) is used. Here we make suggestions about using different types of negations as well. The negation operators can be characterized by their neutral values. In [25] (see also [26]) Dombi introduced the following negation formula by expressing $n_\lambda(x)$ with the help of its neutral element $\nu_*$:

$$n_{\nu_*}(x) = \begin{cases} \dfrac{1}{1 + \left(\frac{1-\nu_*}{\nu_*}\right)^2 \frac{x}{1-x}} & \text{if } x \ne 1 \\ 0 & \text{if } x = 1. \end{cases} \tag{1.5}$$

Note that if $\nu_* \to 0$, then $\lim n_{\nu_*}(x) = n_0(x)$, if $\nu_* \to 1$, then $\lim n_{\nu_*}(x) = n_1(x)$, and for $\nu_* = \frac{1}{2}$ we get the standard negation. Yager [17] introduced

$$n(x) = (1 - x^\alpha)^{1/\alpha}, \quad \alpha > 0. \tag{1.6}$$

Both this type of negation operator and the above-mentioned $n_\lambda$ reduce to the standard negation when $\alpha = 1$ and $\lambda = 0$, respectively. It is easy to see that the neutral value of the negation operator in (1.6) is $2^{-1/\alpha}$. If we write this negation operator by using its neutral value as a parameter, we get

$$n(x) = \left(1 - x^{-\frac{1}{\log_2 \nu_*}}\right)^{-\log_2 \nu_*}.$$

Note that the representation in Proposition 1.1 is not unique. It is not always easy to find a generator function. The following propositions state that there can be infinitely many generator functions for a negation operator.

Proposition 1.6. Let $\nu_* \in (0, 1)$. The function $f : [0, 1] \to [0, 1]$,

$$f(x) = \begin{cases} \dfrac{1}{1 + \left(\frac{\nu_*}{1-\nu_*} \cdot \frac{1-x}{x}\right)^{\alpha}} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0, \end{cases}$$

is a generator function of the negation $n_{\nu_*}$ (see (1.5)) for any $\alpha \ne 0$.

Proof. It can readily be seen that

$$f^{-1}(x) = \frac{1}{1 + \frac{1-\nu_*}{\nu_*}\left(\frac{1-x}{x}\right)^{1/\alpha}} \qquad \text{and} \qquad 1 - f(x) = \frac{1}{1 + \left(\frac{1-\nu_*}{\nu_*} \cdot \frac{x}{1-x}\right)^{\alpha}},$$

hence

$$f^{-1}(1 - f(x)) = \frac{1}{1 + \left(\frac{1-\nu_*}{\nu_*}\right)^2 \frac{x}{1-x}} = n_{\nu_*}(x).$$

Remark 1.2. Note that in Proposition 1.6, if $f$ is a generator function of $n$, then $f^{-1}$ also generates $n$.

Proposition 1.7. In Proposition 1.1 (Trillas) the generator function can also be decreasing.

Proof. We prove that if $f_n$ is a generator function of $n$, then $g_n(x) = 1 - f_n(x)$ is also a generator function of $n$. If $f_n$ is the generator function of $n$, then $n(x) = f_n^{-1}(1 - f_n(x))$. If $g_n(x) = 1 - f_n(x)$, then $g_n^{-1}(x) = f_n^{-1}(1 - x)$. With this generator function the negation has the following form:

$$g_n^{-1}(1 - g_n(x)) = g_n^{-1}(1 - (1 - f_n(x))) = g_n^{-1}(f_n(x)) = f_n^{-1}(1 - f_n(x)).$$

Since $f_n$ is increasing, $g_n$ is decreasing.

For the neutral element $\nu_*$, using the representation theorem, we get $\nu_* = f^{-1}(1 - f(\nu_*))$, so $\nu_* = f^{-1}\left(\frac{1}{2}\right)$.

For the generator function $g(x) = \frac{a^x - 1}{a - 1}$, where $a > 0$, $a \ne 1$, we get

$$n(x) = \log_a(a + 1 - a^x). \tag{1.7}$$

If we choose the inverse function $g^{-1}(x) = \log_a(x(a - 1) + 1)$ for the generator function, we obtain

$$n(x) = \frac{1 - x}{1 + x(a - 1)}, \tag{1.8}$$

which was mentioned above. In this section three basic families of strict negations generated by rational, power and exponential functions were considered. (See also Tables 1.1, 1.3 and 1.4.)
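The role of the neutral value in (1.4)–(1.6) can be illustrated with a short numerical sketch. The function names are illustrative assumptions. The relation $\lambda = \left(\frac{1-\nu_*}{\nu_*}\right)^2 - 1$ used below is not stated in the text; it follows from rewriting (1.4) as $1 / \bigl(1 + (1+\lambda)\frac{x}{1-x}\bigr)$ and comparing with (1.5).

```python
import math

def dombi_negation(nu):
    """Negation (1.5), parameterized by its neutral value nu in (0, 1)."""
    k = ((1.0 - nu) / nu) ** 2
    def n(x):
        if x == 1.0:
            return 0.0
        return 1.0 / (1.0 + k * x / (1.0 - x))
    return n

def sugeno_negation(lam):
    """Negation (1.4), lam > -1."""
    return lambda x: (1.0 - x) / (1.0 + lam * x)

def yager_negation(nu):
    """Negation (1.6) rewritten with its neutral value nu = 2 ** (-1 / alpha)."""
    alpha = -1.0 / math.log2(nu)
    return lambda x: (1.0 - x ** alpha) ** (1.0 / alpha)

grid = [i / 20 for i in range(21)]
neutral_values = (0.2, 0.5, 0.8)

# n(nu) = nu should hold for both parameterizations
fixed_point_error = max(
    max(abs(dombi_negation(nu)(nu) - nu), abs(yager_negation(nu)(nu) - nu))
    for nu in neutral_values
)

# (1.5) should coincide with (1.4) for lam = ((1 - nu) / nu) ** 2 - 1
family_gap = max(
    abs(dombi_negation(nu)(x) - sugeno_negation(((1.0 - nu) / nu) ** 2 - 1.0)(x))
    for nu in neutral_values
    for x in grid
)

# nu = 1/2 should give the standard negation 1 - x in both families
standard_gap = max(
    max(abs(dombi_negation(0.5)(x) - (1.0 - x)),
        abs(yager_negation(0.5)(x) - (1.0 - x)))
    for x in grid
)
```

All three quantities should be at the level of rounding error, which is consistent with the remarks that $\nu_* = \frac{1}{2}$, $\alpha = 1$ and $\lambda = 0$ all recover the standard negation.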

1.4 Nilpotent Connective Systems

Next, instead of operators by themselves, connective systems are considered.

Definition 1.4. The triple $(c, d, n)$, where $c$ is a t-norm, $d$ is a t-conorm and $n$ is a strong negation, is called a connective system.

Definition 1.5. A connective system is nilpotent if the conjunction $c$ is a nilpotent t-norm, and the disjunction $d$ is a nilpotent t-conorm.

Definition 1.6. Two connective systems $(c_1, d_1, n_1)$ and $(c_2, d_2, n_2)$ are isomorphic if there exists a bijection $\varphi : [0, 1] \to [0, 1]$ such that

$$\varphi^{-1}(c_1(\varphi(x), \varphi(y))) = c_2(x, y),$$
$$\varphi^{-1}(d_1(\varphi(x), \varphi(y))) = d_2(x, y),$$
$$\varphi^{-1}(n_1(\varphi(x))) = n_2(x).$$

In the nilpotent case, the generator functions of the disjunction and the conjunction (both being determined up to a multiplicative constant) can be normalized in the following way:

$$f_c(x) := \frac{t(x)}{t(0)}, \qquad f_d(x) := \frac{s(x)}{s(1)}.$$

Remark 1.3. Thus, the normalized generator functions are uniquely defined. We will use normalized generator functions for conjunctions and disjunctions as well. This means that the normalized generator functions of conjunctions, disjunctions and negations are $f_c, f_d, f_n : [0, 1] \to [0, 1]$. We will suppose that $f_c$ is continuous and strictly decreasing, $f_d$ is continuous and strictly increasing, and $f_n$ is continuous and strictly monotone.

Note that by using Proposition 1.7, there are two special negations generated by the normalized additive generators of the conjunction and the disjunction.

Definition 1.7. The negations $n_c$ and $n_d$ generated by $f_c$ and $f_d$, respectively,

$$n_c(x) = f_c^{-1}(1 - f_c(x)) \qquad \text{and} \qquad n_d(x) = f_d^{-1}(1 - f_d(x)),$$

are called the natural negations of $c$ and $d$, respectively.

This means that for a connective system with normalized generator functions $f_c$, $f_d$ and $f_n$ we can associate three negations, $n_c$, $n_d$ and $n$; see (1.1).

Proposition 1.8. With the help of the cutting operator (see Definition 1.3), we can write the conjunction and disjunction in the following form, where $f_c$ and $f_d$ are decreasing and increasing normalized generator functions, respectively:

$$c(x, y) = f_c^{-1}[f_c(x) + f_c(y)], \tag{1.9}$$

$$d(x, y) = f_d^{-1}[f_d(x) + f_d(y)]. \tag{1.10}$$

Proof. From (1.2), we know that

$$c(x, y) = f_c^{-1}\bigl(\min(f_c(x) + f_c(y), f_c(0))\bigr) = f_c^{-1}\bigl(\min(f_c(x) + f_c(y), 1)\bigr) = f_c^{-1}[f_c(x) + f_c(y)],$$

and similarly, from (1.3),

$$d(x, y) = f_d^{-1}\bigl(\min(f_d(x) + f_d(y), f_d(1))\bigr) = f_d^{-1}\bigl(\min(f_d(x) + f_d(y), 1)\bigr) = f_d^{-1}[f_d(x) + f_d(y)].$$

Remark 1.4. Note that in Proposition 1.8 it is necessary to use normalized generator functions, as the following example shows. This point supports the use of normalized functions.

Example 1.2. Let $f_c(x) = 2 - 2x$. Then

$$c\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = f_c^{-1}\bigl(\min(f_c(x) + f_c(y), f_c(0))\bigr) = f_c^{-1}(2) = 0,$$

while

$$f_c^{-1}\left[f_c\left(\tfrac{1}{2}\right) + f_c\left(\tfrac{1}{2}\right)\right] = f_c^{-1}[2 - 1 + 2 - 1] = f_c^{-1}[2] = f_c^{-1}(1) = \tfrac{1}{2}.$$

Remark 1.5. Note that using the cutting function defined above we can omit applying the min and max operators. In the literature, the use of the pseudo-inverse was replaced by the forms (1.2) and (1.3).

Definition 1.8. A connective system is called a Łukasiewicz system if it is isomorphic to $([x + y - 1], [x + y], 1 - x)$, i.e. if there exists a bijection $\varphi : [0, 1] \to [0, 1]$ such that the connective system has the form

$$\bigl(\varphi^{-1}[\varphi(x) + \varphi(y) - 1],\; \varphi^{-1}[\varphi(x) + \varphi(y)],\; \varphi^{-1}[1 - \varphi(x)]\bigr) \quad \text{for all } x, y \in [0, 1].$$

Proposition 1.9. For nilpotent t-norms and t-conorms, Definition 1.7 is equivalent to the following definition (also denoted by $N_T$ and $N_S$; see [3, 27]):

$$n_c(x) = N_T(x) = \sup\{y \in [0, 1] \mid c(x, y) = 0\}, \quad x \in [0, 1],$$
$$n_d(x) = N_S(x) = \inf\{y \in [0, 1] \mid d(x, y) = 1\}, \quad x \in [0, 1].$$

Proof. For the conjunction, $c(x, y) = f_c^{-1}[f_c(x) + f_c(y)] = 0$ if and only if $f_c(x) + f_c(y) \ge 1$, from which $y \le f_c^{-1}(1 - f_c(x)) = n_c(x)$. For $y = n_c(x)$, $c(x, n_c(x)) = 0$ is trivial. The proof is similar for the disjunction case as well.
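Example 1.2 can be reproduced in a few lines. This is a sketch under the assumption that the inverse of $f_c(x) = 2 - 2x$ is written out by hand as $u \mapsto 1 - u/2$; the helper names are illustrative. It contrasts the min form (1.2) with the cutting form (1.9) for a normalized and a non-normalized generator.

```python
def cut(x):
    """Cutting operation [x] of Definition 1.3."""
    return min(1.0, max(0.0, x))

def conjunction_cut(f, f_inv):
    """c(x, y) = f_inv[f(x) + f(y)], the cutting form (1.9)."""
    return lambda x, y: f_inv(cut(f(x) + f(y)))

def conjunction_min(f, f_inv):
    """c(x, y) = f_inv(min(f(x) + f(y), f(0))), the form (1.2)."""
    return lambda x, y: f_inv(min(f(x) + f(y), f(0.0)))

# Normalized generator f_c(x) = 1 - x, with f_c(0) = 1
norm_cut = conjunction_cut(lambda x: 1.0 - x, lambda u: 1.0 - u)
norm_min = conjunction_min(lambda x: 1.0 - x, lambda u: 1.0 - u)

# Example 1.2: f_c(x) = 2 - 2x is not normalized, since f_c(0) = 2
raw_cut = conjunction_cut(lambda x: 2.0 - 2.0 * x, lambda u: 1.0 - u / 2.0)
raw_min = conjunction_min(lambda x: 2.0 - 2.0 * x, lambda u: 1.0 - u / 2.0)

grid = [i / 10 for i in range(11)]
# For the normalized generator the two forms should agree everywhere
normalized_gap = max(
    abs(norm_cut(x, y) - norm_min(x, y)) for x in grid for y in grid
)
# For the non-normalized generator they differ at (1/2, 1/2)
via_min = raw_min(0.5, 0.5)
via_cut = raw_cut(0.5, 0.5)
```

With the non-normalized generator the cutting operation truncates at 1 instead of at $f_c(0) = 2$, which is why the two values differ.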

1.4.1 Structural Properties of Connective Systems

Definition 1.9. The classification property means that the law of contradiction holds, i.e.

$$c(x, n(x)) = 0, \quad \forall x \in [0, 1], \tag{1.11}$$

and the excluded third principle holds as well, i.e.

$$d(x, n(x)) = 1, \quad \forall x \in [0, 1]. \tag{1.12}$$

Definition 1.10. The De Morgan identity means that

$$c(n(x), n(y)) = n(d(x, y)) \tag{1.13}$$

or

$$d(n(x), n(y)) = n(c(x, y)). \tag{1.14}$$

Remark 1.6. These two forms of the De Morgan law are equivalent if the negation is involutive. The first De Morgan law holds with a strict negation $n$ if and only if the second holds with $n^{-1}$ (see Fodor and Roubens [4], p. 18).

Definition 1.11. A connective system is said to be consistent if the classification property (Definition 1.9) and the De Morgan identity (Definition 1.10) hold.
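As a quick sanity check of Definitions 1.9–1.11, the Łukasiewicz system $([x + y - 1], [x + y], 1 - x)$ of Definition 1.8 can be tested on a grid. This is only a numerical illustration with illustrative names, not a proof; tolerances are used because the sums are evaluated in floating-point arithmetic.

```python
def cut(x):
    """Cutting operation [x] of Definition 1.3."""
    return min(1.0, max(0.0, x))

def c(x, y):
    """Lukasiewicz conjunction [x + y - 1]."""
    return cut(x + y - 1.0)

def d(x, y):
    """Lukasiewicz disjunction [x + y]."""
    return cut(x + y)

def n(x):
    """Standard negation 1 - x."""
    return 1.0 - x

grid = [i / 10 for i in range(11)]

# Law of contradiction (1.11): c(x, n(x)) = 0
contradiction_gap = max(abs(c(x, n(x))) for x in grid)
# Excluded third principle (1.12): d(x, n(x)) = 1
excluded_third_gap = max(abs(d(x, n(x)) - 1.0) for x in grid)
# De Morgan identity (1.13): c(n(x), n(y)) = n(d(x, y))
de_morgan_gap = max(
    abs(c(n(x), n(y)) - n(d(x, y))) for x in grid for y in grid
)
```

All three gaps should be at rounding level, in line with the Łukasiewicz system being consistent in the sense of Definition 1.11.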

1.4.1.1

The Classification Property

Now we will examine the conditions that the connectives and their normalized generator functions in a connective system must satisfy if we want the classification property to hold. Proposition 1.10. (See also [4], [27], 2.3.2.) In a connective system (c, d, n), the classification property holds if and only if n d (x) ≤ n(x) ≤ n c (x),

f or ∀x ∈ [0, 1]

where n c and n d are the natural negations of c and d, respectively. Proof. From the excluded third principle, we have d(x, n(x)) = 1. Using the normalized generator function, f d−1 [ f d (x) + f d (n(x))] = 1. It means that f d (x) + f d (n(x)) ≥ 1, from which f d (n(x)) ≥ 1 − f d (x). f d and its inverse f d−1 are strictly increasing, so we get the left hand side of the inequality: n(x) ≥ f d−1 (1 − f d (x)) = n d (x).

1.4 Nilpotent Connective Systems

13

Similarly, we get the right-hand side from the law of contradiction $c(x, n(x)) = 0$. Using the normalized generator function, we get $f_c^{-1}[f_c(x) + f_c(n(x))] = 0$. From the definition of the cutting function, $f_c(x) + f_c(n(x)) \ge 1$, which means that $f_c(n(x)) \ge 1 - f_c(x)$. Since $f_c$ and $f_c^{-1}$ are strictly decreasing, $n(x) \le f_c^{-1}(1 - f_c(x)) = n_c(x)$. Hence $n_d(x) \le n(x) \le n_c(x)$.

Remark 1.7. Generally, in a consistent system only one negation is used in the literature. The logical connectives are usually generated by a single generator function:

\[ c(x, y) = f^{-1}[f(x) + f(y) - 1], \quad d(x, y) = f^{-1}[f(x) + f(y)], \quad n(x) = f^{-1}(1 - f(x)), \]

where $f : [0, 1] \to [0, 1]$ is a continuous, strictly increasing function. The question immediately arises of whether the use of more than one negation is possible. This possibility will be examined later on (see Sect. 1.4.2.1).

Next we give examples of connective systems in which the classification property holds, but which do not fulfil the De Morgan law. In Sect. 1.5, an overview of all the examples included in the following part of this section is presented. The examples from the rational family will be considered in detail in Sect. 1.4.2.1.

Example 1.3. Let $f_n(x) := x^2$, $f_c(x) := \sqrt{1 - x}$ and $f_d(x) := \sqrt{x}$. This connective system fulfils the classification property, but it does not obey the De Morgan law. (See also Table 1.1.)

Another example can be obtained by using the rational family of normalized generator functions

\[ f_n(x) = \frac{1}{1 + \frac{\nu}{1-\nu}\,\frac{1-x}{x}}, \quad f_n(0) = 0, \]

\[ f_c(x) = \frac{1}{1 + \frac{\nu_c}{1-\nu_c}\,\frac{x}{1-x}}, \quad f_c(1) = 0, \]

\[ f_d(x) = \frac{1}{1 + \frac{\nu_d}{1-\nu_d}\,\frac{1-x}{x}}, \quad f_d(0) = 0, \]

choosing e.g. $\nu_d = 0.3$, $\nu_c = 0.7$ and $\nu = 0.5$. (See Table 1.4.)
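The claim of Example 1.3 can be checked numerically. The following is only a minimal sketch: the helper names (`cut`, `n_c`, `n_d`, `classification_holds`, `de_morgan_gap`) and the grid resolution are illustrative assumptions, while the generator functions are the power functions of Example 1.3.

```python
import math

def cut(t):
    """Cutting operator [t]: clamp t to the unit interval."""
    return min(1.0, max(0.0, t))

# Normalized generators of Example 1.3 and their inverses.
f_n = lambda x: x ** 2
f_n_inv = lambda y: math.sqrt(y)
f_c = lambda x: math.sqrt(1.0 - x)   # strictly decreasing, f_c(1) = 0
f_c_inv = lambda y: 1.0 - y ** 2
f_d = lambda x: math.sqrt(x)         # strictly increasing, f_d(0) = 0
f_d_inv = lambda y: y ** 2

def n(x):
    return f_n_inv(1.0 - f_n(x))     # negation generated by f_n

def n_c(x):
    return f_c_inv(1.0 - f_c(x))     # natural negation of c

def n_d(x):
    return f_d_inv(1.0 - f_d(x))     # natural negation of d

def c(x, y):
    return f_c_inv(cut(f_c(x) + f_c(y)))

def d(x, y):
    return f_d_inv(cut(f_d(x) + f_d(y)))

grid = [i / 100 for i in range(101)]

# Classification property (Proposition 1.10): n_d <= n <= n_c on [0, 1].
classification_holds = all(
    n_d(x) - 1e-12 <= n(x) <= n_c(x) + 1e-12 for x in grid
)

# De Morgan law: c(n(x), n(y)) should equal n(d(x, y)); here it does not.
de_morgan_gap = max(abs(c(n(x), n(y)) - n(d(x, y))) for x in grid for y in grid)

print(classification_holds, de_morgan_gap)
```

On this grid the ordering of the three negations holds everywhere, while the De Morgan identity fails by a clearly nonzero margin, which is what the example asserts.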


The existence of such systems explains why we have to consider the De Morgan law in the following section.

1.4.1.2 The De Morgan Law

Now we will examine the conditions that the connectives and their normalized generator functions have to satisfy if we want the connective system to obey the De Morgan law. Before stating Proposition 1.12, we need to solve the following functional equation.

Lemma 1.1. Let $u : [0, 1] \to [0, 1]$ be a continuous, strictly increasing function with $u(0) = 0$ and $u(1) = 1$. The functional equation

\[ [u(x) + u(y)] = u[x + y] \tag{1.15} \]

(where [ ] stands for the cutting operator defined in Definition 1.3) has a unique solution $u(x) = x$.

Proof.
• First, we prove that $u[0] = 0$. Let us suppose that $u[0] = c$, where $0 \le c \le 1$. Then $c = u[0 + 0] = [2u(0)]$, which means $c = [2c]$, i.e. $c = 1$ or $c = 0$; however, $c = 1$ contradicts $u(0) = 0$.
• Second, we show that $u[1] = 1$. Similarly, let us suppose that $u[1] = c$, where $0 \le c \le 1$. Then $c = u[1 + 1] = [2u(1)]$, which means $c = [2c]$, i.e. $c = 1$ or $c = 0$, but $c = 0$ leads to a contradiction.
• Third, we prove that $u(1/2) = 1/2$. If $x < 1/2$, then $2x < 1$. Since u is strictly increasing, $u(2x) < 1$ as well, so $u[2x] = u(2x) = 2u(x) = [2u(x)]$. Because of the continuity of u, $\lim_{x \to 1/2} u(2x) = u(1)$, so $2 \lim_{x \to 1/2} u(x) = 1$, which implies $u(1/2) = 1/2$.
• Similarly, we can prove that $u(1/2^m) = 1/2^m$.
• Next, we prove that $u(3/4) = 3/4$: $u(3/4) = u(1/2 + 1/4) = u(1/2) + u(1/4) = 1/2 + 1/4 = 3/4$.
• In a similar way, we find that $u(k/2^m) = k/2^m$. Then, for any rational number from [0, 1], we have $u(x) = x$.
• Let r be an arbitrary irrational number from [0, 1]. There exists a sequence of rational numbers $q_n$ such that $q_n \in [0, 1]$ for all n and $q_n \to r$. Because of the continuity of u, we have $u(q_n) \to u(r)$, which implies $u(r) = r$.

Note that the solution of the following general form of the functional equation (1.15) can be found in the paper of Baczyński and Jayaram [19] (Propositions 3.4 and 3.6).


Proposition 1.11. Fix real a, b > 0. For a function $f : [0, a] \to [0, b]$, the following statements are equivalent.
1. The function f satisfies the functional equation $f(\min(x + y, a)) = \min(f(x) + f(y), b)$ for all $x, y \in [0, a]$.
2. Either $f = b$, or $f = 0$, or

\[ f(x) = \begin{cases} 0 & \text{if } x = 0, \\ b & \text{if } 0 < x \le a, \end{cases} \]

or there exists a unique constant $c \in [b/a, \infty)$ such that $f(x) = \min(cx, b)$, $x \in [0, a]$.
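As an illustration of statement 2 of Proposition 1.11, the capped linear functions can be tested on a grid. This is a sketch under assumptions: the helper `satisfies_equation`, the values a = 2 and b = 3, and the tolerance are chosen here for illustration only and do not come from the text.

```python
def satisfies_equation(f, a, b, steps=50, tol=1e-9):
    """Check f(min(x + y, a)) == min(f(x) + f(y), b) on a grid of [0, a]^2."""
    pts = [a * i / steps for i in range(steps + 1)]
    return all(
        abs(f(min(x + y, a)) - min(f(x) + f(y), b)) <= tol
        for x in pts
        for y in pts
    )

a, b = 2.0, 3.0  # illustrative interval bounds, so b / a = 1.5

def capped_linear(c):
    """Candidate solution f(x) = min(c * x, b)."""
    return lambda x: min(c * x, b)

ok_steep = satisfies_equation(capped_linear(2.5), a, b)           # c >= b/a
ok_flat = satisfies_equation(capped_linear(1.0), a, b)            # c <  b/a
ok_square = satisfies_equation(lambda x: min(x * x, b), a, b)     # not linear

print(ok_steep, ok_flat, ok_square)
```

Only the slope with c ≥ b/a passes the check; the flatter slope already fails at x = y = a, and the capped quadratic fails at x = y = 1.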

Remark 1.8. In particular, for a = b = 1 we get the statement of Lemma 1.1.

Proposition 1.12. If $f_c$ is the normalized generator function of the conjunction in a connective system, $f_d$ is the normalized generator function of the disjunction and n is a strong negation, then the following statements are equivalent:
1. The De Morgan law holds in the connective system. That is,

\[ c(n(x), n(y)) = n(d(x, y)). \tag{1.16} \]

2. The normalized generator functions of the conjunction, the disjunction and the negation operator obey the following equations (which are obviously equivalent to each other):

\[ n(x) = f_c^{-1}(f_d(x)) = f_d^{-1}(f_c(x)), \tag{1.17} \]

\[ f_c(x) = f_d(n(x)) \quad \text{or equivalently} \quad f_d(x) = f_c(n(x)). \tag{1.18} \]

Proof. (1.18) ⇒ (1.16) is obvious. (1.16) ⇒ (1.17): Let us write the De Morgan law using the normalized generator functions:

\[ f_c^{-1}[f_c(n(x)) + f_c(n(y))] = n(f_d^{-1}[f_d(x) + f_d(y)]). \]

Applying $f_c$ to both sides of the equation, we get

\[ [f_c(n(x)) + f_c(n(y))] = f_c(n(f_d^{-1}[f_d(x) + f_d(y)])). \]

Let us substitute $f_d^{-1}(x)$ for x and $f_d^{-1}(y)$ for y. Then we have

\[ [f_c(n(f_d^{-1}(x))) + f_c(n(f_d^{-1}(y)))] = f_c(n(f_d^{-1}[f_d(f_d^{-1}(x)) + f_d(f_d^{-1}(y))])). \]


From this, we get the following functional equation:

\[ [f_c(n(f_d^{-1}(x))) + f_c(n(f_d^{-1}(y)))] = f_c(n(f_d^{-1}[x + y])). \]

If we use $u(x) := f_c(n(f_d^{-1}(x)))$, then we get the following form of the functional equation: $[u(x) + u(y)] = u[x + y]$. We can readily see that the function u satisfies the conditions of Lemma 1.1; i.e. it is a continuous, strictly increasing function with $u(0) = 0$ and $u(1) = 1$. This means that by Lemma 1.1, $u(x) = x$. Hence, $f_c(n(f_d^{-1}(x))) = x$.

Remark 1.9. Note that in Proposition 1.12 any two of n, $f_c$, $f_d$ determine the third. However, this does not mean that any two of n, $f_c$, $f_d$ can be chosen arbitrarily. If $f_c$ and $f_d$ are given and we want the De Morgan property to hold, we get n from (1.17). This means that for $f_c$ and $f_d$ the equation in (1.17) has to hold. Hence, in order to get an involutive negation, we must look carefully at the appropriate relationship of the normalized generator functions, as the following example shows.

Example 1.4. Let $f_c(x) = 1 - x^\alpha$ and $f_d(x) = x^\beta$, where $\alpha \ne \beta$. Then

\[ f_c^{-1}(f_d(x)) = \sqrt[\alpha]{1 - x^\beta} \ne \sqrt[\beta]{1 - x^\alpha} = f_d^{-1}(f_c(x)). \]

Proposition 1.13. If the De Morgan property holds in a connective system (c, d, n), then

\[ n_c(n(x)) = n(n_d(x)) \tag{1.19} \]

and similarly,

\[ n_d(n(x)) = n(n_c(x)), \tag{1.20} \]

where $n_c$ and $n_d$ are the natural negations.

Proof. Because of the involutive property of n, it is sufficient to prove (1.19). That is,

\[ n\left(f_c^{-1}(1 - f_c(n(x)))\right) = f_d^{-1}\left(f_c\left(f_c^{-1}\left(1 - f_c\left(f_c^{-1}(f_d(x))\right)\right)\right)\right) = n_d(x). \]

Corollary 1.1. If the De Morgan law holds in a connective system (c, d, n), then

\[ n(x) = n_c(x) \quad \text{if and only if} \quad n(x) = n_d(x), \tag{1.21} \]

where $n_c$ and $n_d$ are the natural negations.


Remark 1.10. Note that if any two of n, $n_d$, $n_c$ are equal, then the third is equal to them as well.

Proposition 1.14. Let h be the transformation for which $h(f_c(x)) = f_d(x)$ in a connective system where the De Morgan property holds. Then h is a (strong) negation.

Proof. By using the involutive property of n, we get

\[ f_d^{-1}(f_c(x)) = f_c^{-1}(f_d(x)), \]
\[ f_d(x) = f_c(f_d^{-1}(f_c(x))), \]
\[ f_c(x) = f_d(f_c^{-1}(f_d(x))) = h(f_d(x)), \]
\[ f_c^{-1}(x) = f_d^{-1}(h^{-1}(x)), \]
\[ f_d(f_c^{-1}(x)) = h^{-1}(x) = h(x). \]

So h is also involutive. It is easy to see that $h(0) = 1$, $h(1) = 0$ and $h(x) = f_d(f_c^{-1}(x))$ is strictly decreasing.

Now we give examples of consistent and non-consistent connective systems where the De Morgan property holds. For examples from the rational family of normalized generator functions, see Propositions 1.18 and 1.19.

Example 1.5. If in a connective system the conjunction, the disjunction and the negation have the following forms

\[ f_n(x) = x, \quad f_c(x) = (1 - x)^\alpha, \quad f_d(x) = x^\alpha, \]

then this connective system is consistent (i.e. the De Morgan law and the classification property hold) if and only if $0 < \alpha \le 1$. (See also Table 1.1.)

Proof. It is easy to see that formula (1.17) of Proposition 1.12 is true for the above-mentioned normalized generator and negation functions: $x^\alpha = (1 - (1 - x))^\alpha$, which means that the De Morgan law holds. It is easy to see that the classification property holds if and only if $x^\alpha + (1 - x)^\alpha \ge 1$, which is only true for $0 < \alpha \le 1$.
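The inequality used at the end of the proof of Example 1.5 can be probed on a grid. The sketch below is illustrative: `classification_margin` is a hypothetical helper that returns the smallest value of $x^\alpha + (1 - x)^\alpha - 1$ over sample points, and the three exponents are chosen only as representatives of the cases α < 1, α = 1 and α > 1.

```python
def classification_margin(alpha, steps=1000):
    """Smallest value of f_c(x) + f_d(x) - 1 on a grid of [0, 1],
    for f_c(x) = (1 - x)**alpha and f_d(x) = x**alpha."""
    return min(
        (1.0 - x) ** alpha + x ** alpha - 1.0
        for x in (i / steps for i in range(steps + 1))
    )

margin_half = classification_margin(0.5)  # alpha < 1: margin stays non-negative
margin_one = classification_margin(1.0)   # alpha = 1: margin is zero (up to rounding)
margin_two = classification_margin(2.0)   # alpha > 1: margin is negative

print(margin_half, margin_one, margin_two)
```

For α = 2 the minimum is attained at x = 1/2, where the sum equals 1/2, so the classification property fails there even though the De Morgan law still holds.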


Remark 1.11. Note that the example above indicates that there exists a system in which the De Morgan property holds, but the classification property does not (for α > 1). (See also Table 1.1.) For an example from the rational family of normalized generator functions (see Propositions 1.18 and 1.19 and also Table 1.4)

\[ f_n(x) = \begin{cases} \dfrac{1}{1 + \frac{\nu}{1-\nu}\,\frac{1-x}{x}} & x \ne 0, \\ 0 & x = 0; \end{cases} \]

\[ f_c(x) = \begin{cases} \dfrac{1}{1 + \frac{\nu_c}{1-\nu_c}\,\frac{x}{1-x}} & x \ne 1, \\ 0 & x = 1; \end{cases} \]

\[ f_d(x) = \begin{cases} \dfrac{1}{1 + \frac{\nu_d}{1-\nu_d}\,\frac{1-x}{x}} & x \ne 0, \\ 0 & x = 0, \end{cases} \]

we can choose e.g. $\nu = 0.6$, $\nu_c = 0.2$ and $\nu_d = 0.36$.

Example 1.6. If we express the normalized generator functions in Example 1.11 in terms of the neutral values of the related negations, we get

\[ f_n(x) = x, \quad f_c(x) = (1 - x)^{\frac{1}{\log_{0.5}(1 - \nu_c)}}, \quad f_d(x) = x^{\log_{\nu_d}(0.5)}. \]

This system fulfils the De Morgan identity if and only if $\nu_c + \nu_d = 1$, and it is consistent if and only if $\nu_d \le \frac{1}{2}$ also holds. (See also Table 1.1.)

1.4.2 Consistent Connective Systems

Next, we consider consistent connective systems (in which the De Morgan property and the classification property hold together).

Proposition 1.15.
1. If the connective system (c, d, n) is consistent, then $f_c(x) + f_d(x) \ge 1$ for any $x \in [0, 1]$, where $f_c$ and $f_d$ are the normalized generator functions of the conjunction c and the disjunction d, respectively.
2. If $f_c(x) + f_d(x) \ge 1$ for any $x \in [0, 1]$ and the De Morgan law holds, then the connective system (c, d, n) satisfies the classification property as well (which now means that the system is consistent).

Proof. By Proposition 1.10, the classification property holds if and only if

\[ f_d^{-1}(1 - f_d(x)) = n_d(x) \le n(x) \le n_c(x) = f_c^{-1}(1 - f_c(x)) \]


and by Proposition 1.12, the De Morgan identity holds if and only if $n(x) = f_d^{-1}(f_c(x)) = f_c^{-1}(f_d(x))$. From the right-hand side of the inequality, we get $f_c^{-1}(f_d(x)) \le f_c^{-1}(1 - f_c(x))$, so $f_c(x) + f_d(x) \ge 1$.

Similarly, we get the same result from the left-hand side of the inequality.

Remark 1.12. Note that, as Example 1.3 shows, $f_c(x) + f_d(x) \ge 1$ does not imply the De Morgan law, even if the classification property holds. Moreover, $f_c(x) + f_d(x) \ge 1$ without the De Morgan law does not imply the classification property either (for a counterexample we can choose $f_n(x) = x^2$ and $\alpha = 0.7$ in Example 1.11).

Next, examples of consistent systems are presented.

Example 1.7. If in a connective system the generator functions of the conjunction, the disjunction and the negation have the following forms

\[ f_c(x) = 1 - x^\alpha, \quad f_d(x) = x^\alpha, \quad f_n(x) = x^\alpha, \]

where $\alpha > 0$, then the De Morgan law and the classification property hold for every value of α. (See also Table 1.1.)

Example 1.8. More generally, the connective system with generator functions

\[ f_c(x) = (1 - x^\alpha)^{\frac{\beta}{\alpha}}, \quad f_d(x) = x^\beta, \quad f_n(x) = x^\alpha, \]

where $\alpha, \beta > 0$, is consistent if and only if $\beta \le \alpha$. (See also Table 1.1.) Note that Example 1.8 reduces to Example 1.11 if $\alpha = 1$ and $0 < \beta \le 1$, and to Example 1.7 if $\alpha = \beta$.

Proposition 1.16. In a connective system the following equations are equivalent:

\[ f_c(x) + f_d(x) = 1, \tag{1.22} \]

\[ n_c(x) = n_d(x), \tag{1.23} \]

where $f_c$, $f_d$ are the normalized generator functions of the conjunction and the disjunction and $n_c$, $n_d$ are the natural negations.


Proof. From $f_d(x) = 1 - f_c(x)$, we have $f_d^{-1}(x) = f_c^{-1}(1 - x)$ and

\[ n_d(x) = f_d^{-1}(1 - f_d(x)) = f_d^{-1}(1 - (1 - f_c(x))) = f_d^{-1}(f_c(x)) = n(x) = f_c^{-1}(1 - f_c(x)) = n_c(x). \]

Remark 1.13. Let us suppose that in a connective system the De Morgan property holds. If condition (1.22) holds, then $n_c(x) = n(x) = n_d(x)$, and therefore the system is consistent.

Remark 1.14. Note that if condition (1.22) holds, we get the classical nilpotent (Łukasiewicz) logic.

1.4.2.1 Bounded Systems

The question arises of whether we can use more than one generator function in our connective system without losing consistency. In the literature only systems generated by one generator function have been considered (see e.g. Baczyński and Jayaram, [27], Theorem 2.3.18). In these systems the natural negations of the conjunction and the disjunction coincide with the negation operator. Next, the case $n_c(x) \ne n_d(x) \ne n(x)$ is examined.

Definition 1.12. A nilpotent connective system is called a bounded system if $f_c(x) + f_d(x) > 1$, or equivalently $n_d(x) < n(x) < n_c(x)$, holds for all $x \in (0, 1)$, where $f_c$ and $f_d$ are the normalized generator functions of the conjunction and disjunction, and $n_c$, $n_d$ are the natural negations.

The following example demonstrates the existence of consistent bounded systems.

Example 1.9. (See also Table 1.1.) The connective system generated by

\[ f_c(x) := 1 - x^\alpha, \quad f_d(x) := 1 - (1 - x)^\alpha, \quad n(x) := 1 - x, \quad \alpha \in (1, \infty), \]

is a consistent bounded system.

Fig. 1.2 Three different types of negations: the conjunctive, the disjunctive and the normal negations

Proof. Applying (1.17) from Proposition 1.12, we get $f_c(n(x)) = 1 - (1 - x)^\alpha = f_d(x)$, which means that the De Morgan law holds. It is easy to see that $n_c(x) = \sqrt[\alpha]{1 - x^\alpha}$ and $n_d(x) = 1 - \sqrt[\alpha]{1 - (1 - x)^\alpha}$, i.e. $n_d(x) < n(x) < n_c(x)$, which means that the classification property is also true (see Fig. 1.2). For the normalized generator functions, we have $f_c(x) + f_d(x) > 1$ for all $x \in (0, 1)$.

Remark 1.15. In Example 1.9 for $\alpha = 1$, we get $n_d(x) = n(x) = n_c(x)$; i.e. $f_c(x) + f_d(x) = 1$.

Proposition 1.17. In a connective system (c, d, n), the following statements are equivalent:

\[ f_c(x) + f_d(x) > 1 \quad \text{for all } x \in (0, 1), \tag{1.24} \]

\[ f_d(f_c^{-1}(x)) > 1 - x \quad \text{for all } x \in (0, 1), \tag{1.25} \]

\[ f_c(f_d^{-1}(x)) > 1 - x \quad \text{for all } x \in (0, 1), \tag{1.26} \]

where $f_c$ and $f_d$ are the normalized generator functions of c and d.

Proof. From $n_d(x) < n(x) < n_c(x)$ we have $f_d^{-1}(1 - f_d(x)) < f_c^{-1}(f_d(x))$. Replacing x by $f_d^{-1}(x)$, we get $f_d^{-1}(1 - x) < f_c^{-1}(x)$, i.e. $f_c(f_d^{-1}(x)) > 1 - x$, which is also equivalent to $f_c(f_d^{-1}(1 - x)) > x$.
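The bounded system of Example 1.9 can be inspected numerically. The following sketch fixes α = 2 purely for illustration; the variable names and the interior grid are assumptions made here, while the closed forms of $n_c$ and $n_d$ are those given in the proof of Example 1.9.

```python
ALPHA = 2.0  # any exponent greater than 1 is covered by Example 1.9

f_c = lambda x: 1.0 - x ** ALPHA
f_d = lambda x: 1.0 - (1.0 - x) ** ALPHA
n = lambda x: 1.0 - x

# Natural negations as given in the proof of Example 1.9.
n_c = lambda x: (1.0 - x ** ALPHA) ** (1.0 / ALPHA)
n_d = lambda x: 1.0 - (1.0 - (1.0 - x) ** ALPHA) ** (1.0 / ALPHA)

interior = [i / 100 for i in range(1, 100)]  # sample points in (0, 1)

# Strict separation of the three negations (Definition 1.12).
strictly_ordered = all(n_d(x) < n(x) < n_c(x) for x in interior)

# Condition (1.24): f_c + f_d > 1 on the open interval.
sum_exceeds_one = all(f_c(x) + f_d(x) > 1.0 for x in interior)

# De Morgan condition (1.18): f_c(n(x)) = f_d(x).
de_morgan_gap = max(abs(f_c(n(x)) - f_d(x)) for x in interior)

print(strictly_ordered, sum_exceeds_one, de_morgan_gap)
```

For α = 2 the sum $f_c(x) + f_d(x)$ equals $1 + 2x(1 - x)$, which makes the strict inequality on (0, 1) easy to see by hand as well.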


Next we consider the case of the rational family of the normalized generator functions introduced by Dombi in [25].

Proposition 1.18. For the Dombi functions (see also Eq. (1.5) and Proposition 1.6),

\[ f_n(x) = \begin{cases} \dfrac{1}{1 + \frac{\nu}{1-\nu}\,\frac{1-x}{x}} & x \ne 0, \\ 0 & x = 0; \end{cases} \]

\[ f_c(x) = \begin{cases} \dfrac{1}{1 + \frac{\nu_c}{1-\nu_c}\,\frac{x}{1-x}} & x \ne 1, \\ 0 & x = 1; \end{cases} \]

\[ f_d(x) = \begin{cases} \dfrac{1}{1 + \frac{\nu_d}{1-\nu_d}\,\frac{1-x}{x}} & x \ne 0, \\ 0 & x = 0, \end{cases} \]

the following statements are equivalent:
1. The connective system generated by the Dombi functions in Proposition 1.18 satisfies the De Morgan law.
2. For parameters $\nu_d$ and $\nu_c$ in the normalized generator functions and for parameter ν in the negation function, the following equation holds:

\[ \left(\frac{1-\nu}{\nu}\right)^2 = \frac{\nu_c}{1-\nu_c}\,\frac{1-\nu_d}{\nu_d}. \tag{1.27} \]

Proof. By Proposition 1.12, the De Morgan law holds if and only if

\[ f_c(n(x)) = f_d(x). \tag{1.28} \]

From Proposition 1.6 for $\alpha = -1$, we know that

\[ n(x) = \frac{1}{1 + \left(\frac{1-\nu}{\nu}\right)^2 \frac{x}{1-x}}, \]

so

\[ f_c(n(x)) = \frac{1}{1 + \frac{\nu_c}{1-\nu_c}\left(\frac{\nu}{1-\nu}\right)^2 \frac{1-x}{x}} = \frac{1}{1 + \frac{\nu_d}{1-\nu_d}\,\frac{1-x}{x}}. \tag{1.29} \]


This means that the equality holds if and only if the parameters on the left- and right-hand sides are equal. That is,

\[ \left(\frac{1-\nu}{\nu}\right)^2 = \frac{\nu_c}{1-\nu_c}\,\frac{1-\nu_d}{\nu_d}. \tag{1.30} \]

Remark 1.16. From (1.30), we get that the De Morgan law holds if and only if

\[ \nu = \frac{1}{1 + \sqrt{\frac{\nu_c}{1-\nu_c}\,\frac{1-\nu_d}{\nu_d}}}. \tag{1.31} \]
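Formula (1.31) is easy to evaluate for concrete parameters. The sketch below is a hedged illustration: the function names (`odds`, `negation`, `de_morgan_nu`, `de_morgan_gap`) are hypothetical, the generators are the rational functions of Proposition 1.18, and the parameter triples are the ones quoted in the text (ν = 0.25 with $\nu_c = 0.5$, $\nu_d = 0.1$; ν = 0.6 with $\nu_c = 0.2$, $\nu_d = 0.36$; and ν = 0.5 with $\nu_c = 0.7$, $\nu_d = 0.3$).

```python
import math

def odds(v):
    """v / (1 - v), the ratio that appears in the rational generators."""
    return v / (1.0 - v)

def f_c(x, nu_c):
    """Rational conjunctive generator with f_c(1) = 0."""
    return 0.0 if x == 1.0 else 1.0 / (1.0 + odds(nu_c) * x / (1.0 - x))

def f_d(x, nu_d):
    """Rational disjunctive generator with f_d(0) = 0."""
    return 0.0 if x == 0.0 else 1.0 / (1.0 + odds(nu_d) * (1.0 - x) / x)

def negation(x, nu):
    """n(x) = 1 / (1 + ((1 - nu)/nu)^2 * x/(1 - x)), with n(1) = 0."""
    return 0.0 if x == 1.0 else 1.0 / (1.0 + x / (odds(nu) ** 2 * (1.0 - x)))

def de_morgan_nu(nu_c, nu_d):
    """Value of nu prescribed by (1.31)."""
    return 1.0 / (1.0 + math.sqrt(odds(nu_c) / odds(nu_d)))

def de_morgan_gap(nu, nu_c, nu_d, steps=200):
    """Largest deviation from f_c(n(x)) = f_d(x) on a grid of (0, 1)."""
    xs = [i / steps for i in range(1, steps)]
    return max(abs(f_c(negation(x, nu), nu_c) - f_d(x, nu_d)) for x in xs)

print(de_morgan_nu(0.5, 0.1))           # close to 0.25
print(de_morgan_nu(0.2, 0.36))          # close to 0.6
print(de_morgan_gap(0.25, 0.5, 0.1))    # essentially zero
print(de_morgan_gap(0.5, 0.7, 0.3))     # clearly nonzero
```

The last call uses parameters that do not satisfy (1.30), so condition (1.28) is visibly violated.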

Proposition 1.19. For the natural negations derived from the Dombi functions defined in Proposition 1.18, the following statements are equivalent for $x \in (0, 1)$:

\[ n_d(x) < n(x) < n_c(x), \tag{1.32} \]

\[ \nu_d < \nu < \nu_c. \tag{1.33} \]

Proof. 1 1+

d 2 x ( 1−ν ) 1−x νd

1

(1.34)

is also equivalent to (1.32) and (1.33).

Proposition 1.20. For the Dombi functions defined in Proposition 1.18, the following statements are equivalent:

\[ f_c(x) + f_d(x) > 1 \quad \text{for all } x \in (0, 1), \tag{1.35} \]

\[ \nu_c + \nu_d < 1. \tag{1.36} \]

Proof.

\[ \frac{1}{1 + \frac{\nu_c}{1-\nu_c}\,\frac{x}{1-x}} > 1 - \frac{1}{1 + \frac{\nu_d}{1-\nu_d}\,\frac{1-x}{x}} = \frac{1}{1 + \frac{1-\nu_d}{\nu_d}\,\frac{x}{1-x}} \]

if and only if

\[ \frac{\nu_c}{1-\nu_c} < \frac{1-\nu_d}{\nu_d}, \]

which is equivalent to $\nu_c + \nu_d < 1$.
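The equivalence in Proposition 1.20 can be spot-checked on a few parameter pairs. This is only a sketch: the helper `sum_exceeds_one`, the grid and the list of $(\nu_c, \nu_d)$ pairs are illustrative assumptions, and borderline pairs with $\nu_c + \nu_d = 1$ are deliberately avoided because of floating-point rounding.

```python
def f_c(x, nu_c):
    """Rational conjunctive generator on (0, 1)."""
    return 1.0 / (1.0 + nu_c / (1.0 - nu_c) * x / (1.0 - x))

def f_d(x, nu_d):
    """Rational disjunctive generator on (0, 1)."""
    return 1.0 / (1.0 + nu_d / (1.0 - nu_d) * (1.0 - x) / x)

def sum_exceeds_one(nu_c, nu_d, steps=100):
    """True if f_c(x) + f_d(x) > 1 at every interior grid point."""
    xs = [i / steps for i in range(1, steps)]
    return all(f_c(x, nu_c) + f_d(x, nu_d) > 1.0 for x in xs)

# Illustrative (nu_c, nu_d) pairs on both sides of nu_c + nu_d = 1.
pairs = [(0.5, 0.1), (0.2, 0.36), (0.3, 0.3), (0.7, 0.6), (0.9, 0.4)]

agreement = all(
    sum_exceeds_one(nu_c, nu_d) == (nu_c + nu_d < 1.0) for nu_c, nu_d in pairs
)
print(agreement)
```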

Remark 1.18. Note that if the De Morgan property holds,

\[ n_d(x) < n(x) < n_c(x) \tag{1.37} \]

is also equivalent to (1.35) and (1.36).

From Propositions 1.19 and 1.20, the relationship between $\nu_c$ and $\nu_d$ can be seen. In Fig. 1.3, we can see the possible values of $\nu_c$ and $\nu_d$ for fixed values of ν. The values of ν as a function of $\nu_c$ and $\nu_d$ can be seen in Fig. 1.3.

Remark 1.19. By using Equations (1.37), (1.36) and (1.31), we find that in a consistent system with $f_c(x) + f_d(x) > 1$, $\nu < \frac{1}{2}$ always holds.

Remark 1.20. For $\nu = \frac{1}{2}$, we get $\frac{\nu_c}{1-\nu_c}\,\frac{1-\nu_d}{\nu_d} = 1$, so $\nu_c = \nu_d = \nu = \frac{1}{2}$.

Example 1.10. For $\nu_c = 0.5$ and $\nu_d = 0.1$, we get $\nu = 0.25$, $\nu_c + \nu_d < 1$ and $n_d(x) < n(x) < n_c(x)$.

Example 1.11. If in a connective system the conjunction, the disjunction and the negation have the following forms

\[ f_n(x) = x, \quad f_c(x) = (1 - x)^\alpha, \quad f_d(x) = x^\alpha, \]

then this connective system is consistent (i.e. the De Morgan law and the classification property hold) if and only if $0 < \alpha \le 1$. (See also Table 1.1.)

In Fig. 1.4, examples of conjunctions and disjunctions are shown for $f_c(x) + f_d(x) = 1$ and for $f_c(x) + f_d(x) > 1$, respectively. Note that the coincidence and the separation of $n_c$ and $n_d$ (see their alternative definition in Proposition 1.9 as well) can easily be seen.

Fig. 1.3 The relationship between ν, $\nu_c$ and $\nu_d$ in consistent rational systems

Fig. 1.4 Conjunction c[x, y] and disjunction d[x, y]

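1.5 Summary

In this chapter, we showed that a consistent logical system generated by nilpotent operators is not necessarily isomorphic to the Łukasiewicz logic, which means that nilpotent logical systems are wider than was previously thought. Using more than one generator function, we examined three naturally derived negations in these systems. It was shown that the coincidence of the three negations leads back to a system which is isomorphic to the Łukasiewicz logic. Examples of consistent nilpotent logical structures with three different negations have been demonstrated. Next, we give an overview of the three families of normalized generator functions used in our examples and propositions, namely power, exponential and rational functions (see Tables 1.1, 1.3 and 1.4). For the power generator functions the logical connectives are also given (see Table 1.2). In the case of the rational functions, and in a special case of the power functions, we give the normalized generators in terms of the neutral values as well (see Table 1.5). Finally, we give some examples of consistent connective systems with mixed types of normalized generator functions (see Table 1.6). In the next chapter, we will focus on the implication operators in nilpotent systems.

Table 1.1 Power functions as normalized generators

| Example | $f_n$ | $f_c$ | $f_d$ | Classification | De Morgan | Remarks |
|---|---|---|---|---|---|---|
| 1.3 | $x^2$ | $\sqrt{1-x}$ | $\sqrt{x}$ | holds | − | |
| 1.11 | $x$ | $(1-x)^\alpha$ | $x^\alpha$ | if and only if $0 < \alpha \le 1$ | holds | |
| 1.6 | $x$ | $(1-x)^{\frac{1}{\log_{0.5}(1-\nu_c)}}$ | $x^{\log_{\nu_d} 0.5}$ | if and only if $\nu_d \le 0.5$ | if and only if $\nu_c + \nu_d = 1$ | 1.11 and 1.11 in terms of the neutral value |
| 1.7 | $x^\alpha$ | $1 - x^\alpha$ | $x^\alpha$ | holds | holds | $\alpha > 0$ |
| 1.8 | $x^\alpha$ | $(1 - x^\alpha)^{\frac{\beta}{\alpha}}$ | $x^\beta$ | holds | holds | $\beta \le \alpha$; $\alpha, \beta > 0$ |
| 1.9 | $x$ | $1 - x^\alpha$ | $1 - (1-x)^\alpha$ | holds | holds | $\alpha \ge 1$; $f_c + f_d > 1$ if and only if $\alpha > 1$ |

Table 1.2 Power functions as normalized generators

| Example | $f_n$ | $f_c$ | $f_d$ | $n(x)$ | $c(x, y)$ | $d(x, y)$ |
|---|---|---|---|---|---|---|
| 1.3 | $x^2$ | $\sqrt{1-x}$ | $\sqrt{x}$ | $\sqrt{1 - x^2}$ | $1 - [\sqrt{1-x} + \sqrt{1-y}]^2$ | $[\sqrt{x} + \sqrt{y}]^2$ |
| 1.11 | $x$ | $(1-x)^\alpha$ | $x^\alpha$ | $1 - x$ | $1 - [(1-x)^\alpha + (1-y)^\alpha]^{\frac{1}{\alpha}}$ | $[x^\alpha + y^\alpha]^{\frac{1}{\alpha}}$ |
| 1.11 | $x$ | $(1-x)^\alpha$ | $x^\alpha$ | $1 - x$ | $1 - [(1-x)^\alpha + (1-y)^\alpha]^{\frac{1}{\alpha}}$ | $[x^\alpha + y^\alpha]^{\frac{1}{\alpha}}$ |
| 1.7 | $x^\alpha$ | $1 - x^\alpha$ | $x^\alpha$ | $\sqrt[\alpha]{1 - x^\alpha}$ | $(1 - [2 - x^\alpha - y^\alpha])^{\frac{1}{\alpha}}$ | $[x^\alpha + y^\alpha]^{\frac{1}{\alpha}}$ |
| 1.8 | $x^\alpha$ | $(1 - x^\alpha)^{\frac{\beta}{\alpha}}$ | $x^\beta$ | $\sqrt[\alpha]{1 - x^\alpha}$ | $\left(1 - \left[(1 - x^\alpha)^{\frac{\beta}{\alpha}} + (1 - y^\alpha)^{\frac{\beta}{\alpha}}\right]^{\frac{\alpha}{\beta}}\right)^{\frac{1}{\alpha}}$ | $[x^\beta + y^\beta]^{\frac{1}{\beta}}$ |
| 1.9 | $x$ | $1 - x^\alpha$ | $1 - (1-x)^\alpha$ | $1 - x$ | $(1 - [2 - x^\alpha - y^\alpha])^{\frac{1}{\alpha}}$ | $1 - [(1-x)^\alpha + (1-y)^\alpha - 1]^{\frac{1}{\alpha}}$ |

Table 1.3 Exponential functions as normalized generators

| $f_n$ | $f_c$ | $f_d$ | De Morgan law | Consistency |
|---|---|---|---|---|
| $\frac{a^x - 1}{a - 1}$ | $\frac{(a + 1 - a^x)^{\log_a b} - 1}{b - 1}$ | $\frac{b^x - 1}{b - 1}$ | holds | Consistent for e.g. a = 0.5, b = 0.7 or a = 0.7, b = 0.85 |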

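Table 1.4 Rational functions as normalized generators

| Example | $f_n$ | $f_c$ | $f_d$ | Classification | De Morgan |
|---|---|---|---|---|---|
| 1.18 and 1.19 | $\frac{1}{1 + \frac{\nu}{1-\nu}\frac{1-x}{x}}$ | $\frac{1}{1 + \frac{\nu_c}{1-\nu_c}\frac{x}{1-x}}$ | $\frac{1}{1 + \frac{\nu_d}{1-\nu_d}\frac{1-x}{x}}$ | $\nu_d < \nu < \nu_c$ | $\left(\frac{1-\nu}{\nu}\right)^2 = \frac{\nu_c}{1-\nu_c}\frac{1-\nu_d}{\nu_d}$; $\nu = \frac{1}{1 + \sqrt{\frac{\nu_c}{1-\nu_c}\frac{1-\nu_d}{\nu_d}}}$ |
| 1.3 | $\nu = 0.5$ | $\nu_c = 0.7$ | $\nu_d = 0.3$ | holds | − |
| 1.11 | $\nu = 0.6$ | $\nu_c = 0.2$ | $\nu_d = 0.36$ | − | holds |
| 1.10 | $\nu = 0.25$ | $\nu_c = 0.5$ | $\nu_d = 0.1$ | holds | holds |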

Table 1.5 Rational functions as normalized generators – 3 negations f (x) (normalized f −1 (x) 1 − f (x) generator) 1 1 1 Negation 1−x x ν 1−x 1+ 1−ν 1+ 1−ν ν x ν 1−x 1 + 1−ν x

Negation n(x) = 1+

1

Conjunction

1+

νc x 1−νc 1−x

1 c 1+ 1−ν νc

1 c 1+ 1−ν νc

x 1−x

1−x x

Disjunction

1+

νd 1−x 1−νd x

1 1+

1

1−νd 1−x νd x

1+

1−νd x νd 1−x

ν

n c (x) = 1+

1

1 1−ν 2

n d (x) = 1+

1

νc 1−νc

1

1−νd νd

x 1−x

2

2

x 1−x

x 1−x

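Table 1.6 Mixed types of normalized generator functions

| Type | $f_n$ | $f_c$ | $f_d$ | De Morgan law | Consistency |
|---|---|---|---|---|---|
| Rational and power | $\frac{1}{1 + \frac{\nu}{1-\nu}\frac{1-x}{x}}$ | $\left(\frac{1}{1 + \left(\frac{1-\nu}{\nu}\right)^2 \frac{x}{1-x}}\right)^\alpha$ | $x^\alpha$ | holds | Consistent for e.g. α = 1, ν = 0.8 or α = 2, ν = 0.9 |
| Power and exponential | $x^\alpha$ | $\frac{a^{(1 - x^\alpha)^{1/\alpha}} - 1}{a - 1}$ | $\frac{a^x - 1}{a - 1}$ | holds | $a > 0$, $a \ne 1$, $\alpha > 0$. Consistent for e.g. α = 1, a = 0.5 |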

References

1. Schweizer, B., Sklar, A.: Probabilistic Metric Spaces. North-Holland, Amsterdam (1983)
2. Siegfried, W.: A general concept of fuzzy connectives, negations and implications based on t-norms and t-conorms. Fuzzy Sets Syst. 11, 115–134 (1983)
3. Klement, E.P., Mesiar, R., Pap, E.: Triangular Norms. Kluwer Academic Publishers, Boston (2000)
4. Fodor, J., Roubens, M.: Fuzzy preference modelling and multicriteria decision support (1994)
5. Fodor, J.: A new look at fuzzy connectives. Fuzzy Sets Syst. 57, 141–148 (1993)
6. Beliakov, G., Pradera, A., Calvo, T.: Aggregation functions: a guide for practitioners. In: Studies in Fuzziness and Soft Computing, page 375. Springer (2007)
7. A. J. Über eine Klasse von Funktionalgleichungen 1(xl):247–252 (1920)
8. A Treatise on Many-Valued Logics, vol. 9. Research Studies Press, Baldock (2001)
9. Novák, V., Perfilieva, I., Močkoř, J.: Mathematical Principles of Fuzzy Logic. Kluwer Academic Publishers, Boston (1999)
10. Hájek, P.: Metamathematics of Fuzzy Logic. Kluwer Academic Publishers, Dordrecht (1998)
11. Łukasiewicz, J.: On Three-Valued Logic. Selected works by Jan Łukasiewicz. North-Holland, Amsterdam (1970)
12. Ono, H.: Trends in Logic: 50 Years of Studia Logica, vol. 20, Chapter Substructural Logics and Residuated Lattices - An Introduction (2003)
13. Rothenberg, R.: Łukasiewicz’s many valued logic as a Doxastic Modal Logic. Ph.D. thesis, University of St. Andrews (2005)
14. Dubois, D., Prade, H.: Fuzzy sets in approximate reasoning. Fuzzy Sets Syst. 40, 143–202 (1991)
15. Trillas, E., Valverde, L.: On some functionally expressable implications for fuzzy set theory. In: Proceedings of the 3rd International Seminar on Fuzzy Set Theory, Linz, Austria, pp. 173–1902 (1981)
16. Dombi, J., Csiszár, O.: The general nilpotent operator system. Fuzzy Sets Syst. 261, 1–19 (2015)
17. Yager, R.R.: On the measure of fuzziness and negation part I: membership in the unit interval. Int. J. General Syst. 5, 221–229 (1979). https://doi.org/10.1080/03081077908547452. ISSN 0308-1079
18. Trillas, E.: On negation functions in the theory of fuzzy sets (1979)
19. Baczyński, M., Jayaram, B.: On the distributivity of fuzzy implications over nilpotent or strict triangular conorms. IEEE Trans. Fuzzy Syst. 17(3), 590–603 (2009). https://doi.org/10.1109/TFUZZ.2008.924201. ISSN 10636706
20. Ling, C.: Representation of associative functions. Publ. Math. Debrecen 12, 189–212 (1965)
21. Sabo, M., Strezo, P.: On reverses of some binary operators. Kybernetika 41, 435–450 (2005)
22. Hamacher, H.: Über logische Aggregationen nicht-binär explizierter Entscheidungskriterien: eine axiomatischer Beitrag zur normativen Entscheidungstheorie. Ph.D. thesis, Frankfurt a. M. (1978). https://publications.rwth-aachen.de/record/64749. Zugl.: Aachen, Techn. Hochsch., Diss. (1978)
23. Zadeh, L.A.: Fuzzy sets. Inf. Control 8, 338–353 (1965)
24. Sugeno, M.: Fuzzy measures and fuzzy integrals—a survey. In: Dubois, D., Prade, H., Yager, R.R. (eds.) Readings in Fuzzy Sets for Intelligent Systems, pp. 251–257. Morgan Kaufmann (1993). https://doi.org/10.1016/B978-1-4832-1450-4.50027-4. http://www.sciencedirect.com/science/article/pii/B9781483214504500274. ISBN 978-1-4832-1450-4
25. Dombi, J.: Towards a general class of operators for fuzzy systems. IEEE Trans. Fuzzy Syst. 16(2), 477–484 (2008). https://doi.org/10.1109/TFUZZ.2007.905910
26. Dombi, J.: Pliant Operator System, vol. 378, pp. 31–58 (2011)
27. Baczyński, M., Jayaram, B.: Fuzzy Implications, 1st edn. Springer (2008). ISBN 3540690808

Chapter 2

Implications

Abstract Fuzzy implications are a generalization of the classical two-valued implication to the multi-valued setting. As one of the main operations in fuzzy logic, they play a vital role both in the theory and applications, such as multivalued mathematical logic, fuzzy logic systems, fuzzy control, approximate reasoning, expert systems, image processing, and data analysis. Here, we focus on nilpotent logical systems and examine two different kinds of implications and the concept of a weak ordering property. Furthermore, we consider both R- and S-implications with respect to the three naturally derived negations from the previous chapter. The formulae and the basic properties of these implications are given which will come in handy when we implement fuzzy logic into neural architecture in Chap. 9.

2.1 Introduction

In Chap. 1, it was shown that a consistent connective system generated by nilpotent operators is not necessarily isomorphic to the Łukasiewicz system. Using more than one generator function, consistent nilpotent connective systems could be obtained in a significantly different way with three naturally derived negations. Those consistent nilpotent connective systems which are not isomorphic to Łukasiewicz logic are called bounded systems. Based on the results of Chap. 1, we will now focus on implications in bounded systems. The results of this chapter can be found in Dombi and Csiszár [1].

Fuzzy implications are among the most important operations in fuzzy logic [2, 3]. Firstly, other basic logical connectives of binary logic can be obtained from the classical implication. Secondly, the implication operator plays a crucial role in the inference mechanisms of any logic, like modus ponens, modus tollens and hypothetical syllogism in classical logic. Fuzzy implications all generalize the classical implication, which takes the two crisp values in {0, 1}, to the fuzzy setting with truth values lying in the unit interval [0, 1] [4]. In classical logic the implication can be defined in several ways. The most well-known implications are the usual material implication from the Kleene algebra, the implication obtained as the residuum of the conjunction in Heyting algebra (also called pseudo-Boolean algebra) in the intuitionistic logic framework, and the implication in the setting of quantum logic. While all these differently defined implications have identical truth tables in the classical case, the natural generalizations of the above definitions in the fuzzy logic framework are not identical. This fact has led to some new intense research on fuzzy implications [5–13]. Next, we focus on residual and S-implication operators in bounded systems [14].

This chapter is organized as follows. After some preliminaries in Sect. 2.2, we examine the residual implication in Sect. 2.3 and S-implications with special attention to the ordering property in Sect. 2.4. In Sect. 2.6 we show that in a bounded system, the minimum and maximum operators can also be expressed in terms of the conjunction, the implication and the negation. Finally, in Sect. 2.5 we show that in a bounded system the implications examined here can never coincide. The formulae and the properties of implications are summarized in Sect. 2.7.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. J. Dombi and O. Csiszár, Explainable Neural Networks Based on Fuzzy Logic and Multi-criteria Decision Tools, Studies in Fuzziness and Soft Computing 408, https://doi.org/10.1007/978-3-030-72280-7_2

2.2 Preliminaries

A mapping $i : [0, 1]^2 \to [0, 1]$ is called an implication operator if and only if it satisfies the boundary conditions $i(0, 0) = i(0, 1) = i(1, 1) = 1$ and $i(1, 0) = 0$. The above conditions are the minimum requirements for an implication operator. Other potentially interesting properties of implication operators are listed in [2, 10, 12, 15, 16]. All fuzzy implications can be obtained by generalizing the implication operator of classical logic. In this sense, Fodor and Roubens [17] established the following definition.

Definition 2.1. A fuzzy implication is a function $i : [0, 1]^2 \to [0, 1]$ that satisfies the following properties:
1. The first place antitonicity: for all $x_1, x_2, y \in [0, 1]$, if $x_1 \le x_2$ then $i(x_1, y) \ge i(x_2, y)$. (FA)
2. The second place isotonicity: for all $x, y_1, y_2 \in [0, 1]$, if $y_1 \le y_2$ then $i(x, y_1) \le i(x, y_2)$. (SI)
3. The dominance of falsity of antecedent: $i(0, y) = 1$ for all $y \in [0, 1]$. (DF)
4. The dominance of truth of consequent: $i(x, 1) = 1$ for all $x \in [0, 1]$. (DT)
5. The boundary condition: $i(1, 0) = 0$ and $i(1, 1) = 1$. (BC)


Other important but usually not required properties of fuzzy implications are defined below (see Baczyński and Jayaram, [2]).

Definition 2.2. A fuzzy implication i satisfies
1. The left neutrality property (the neutrality of truth) if $i(1, y) = y$ for all $y \in [0, 1]$. (LN)
2. The exchange principle if $i(x, i(y, z)) = i(y, i(x, z))$ for all $x, y, z \in [0, 1]$. (EP)
3. The identity principle if $i(x, x) = 1$ for all $x \in [0, 1]$. (IP)
4. The strong negation principle if the mapping $n^*$ defined as $n^*(x) = i(x, 0)$ for all $x \in [0, 1]$ is a strong negation. (SN)
5. The law of contraposition (or in other words, the contrapositive symmetry) with respect to a strong negation n if $i(x, y) = i(n^*(y), n^*(x))$ for all $x, y \in [0, 1]$. (LC)
6. The ordering property if $i(x, y) = 1$ if and only if $x \le y$, for all $x, y \in [0, 1]$. (OP)

Remark 2.1. The negation operator $n^*$ is also called the natural negation of the implication i (see Baczyński and Jayaram, [2]).

A detailed study of possible relations among all these properties can be found in [2, 11, 15]. Notice that other properties can also be found in the literature. In particular, $i(x, n^*(x)) = n^*(x)$ for all $x \in [0, 1]$, where $n^*$ is a strong negation (see Mas et al. [3]). Three well-established classes of implication operators are (S,N)-, QL- and R-implications.

Definition 2.3. (Baczyński and Jayaram, [2], p. 57.) A function $i : [0, 1]^2 \to [0, 1]$ is called an S-implication if there exists a t-conorm S and a strong negation $n^*$ such that $i_S(x, y) = S(n^*(x), y)$, $x, y \in [0, 1]$.


Definition 2.4. (Baczyński and Jayaram, [2], p. 90.) A function $i : [0, 1]^2 \to [0, 1]$ is called a QL-operation if there exists a t-conorm S, a t-norm T and a strong negation $n^*$ such that $i_Q(x, y) = S(n^*(x), T(x, y))$, $x, y \in [0, 1]$.

In general, QL-operations violate property (FA). For the conditions under which (FA) is satisfied, see Fodor, [18]. When a QL-operation is a fuzzy implication, then it is called a QL-implication.

Definition 2.5. (Baczyński and Jayaram, [2], p. 68.) A function $i : [0, 1]^2 \to [0, 1]$ is called an R-implication if there exists a t-norm T such that $i_R(x, y) = \sup\{z \in [0, 1] \mid T(x, z) \le y\}$.

In the case where the given t-norm is left-continuous, we will refer to the R-implication defined above as a residual implication [2, 8, 19]. Note that in this case we have $T(x, y) = \inf\{z \in [0, 1] \mid i(x, z) \ge y\}$. It is easy to see that both S-implications and R-implications satisfy properties (FA)-(BC), regardless of the t-norm T, the t-conorm S and the strong negation $n^*$ types. Hence, they are implications in the Fodor and Roubens sense. Different characterizations of S-implications, QL-implications and R-implications can be found in the literature (for details, see [2, 3, 17]). It is worth mentioning here that new characterizations of R- and S-implications can also be found in Trillas, [20]. Next, implications in bounded systems will be examined.
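To make Definitions 2.3 and 2.5 concrete, the sketch below instantiates them with the Łukasiewicz t-norm and t-conorm and the standard negation $1 - x$; this particular choice of operators, the function names and the grid approximation of the supremum are assumptions made here for illustration and are not taken from the text. For this pair the S-implication and the R-implication are known to coincide with $\min(1, 1 - x + y)$.

```python
def t_norm(x, y):
    """Łukasiewicz t-norm."""
    return max(0.0, x + y - 1.0)

def t_conorm(x, y):
    """Łukasiewicz t-conorm."""
    return min(1.0, x + y)

def neg(x):
    """Standard strong negation."""
    return 1.0 - x

def s_implication(x, y):
    """S-implication of Definition 2.3: S(n*(x), y)."""
    return t_conorm(neg(x), y)

def r_implication(x, y, steps=1000):
    """Grid approximation of sup{z in [0, 1] : T(x, z) <= y} (Definition 2.5).
    The small tolerance absorbs floating-point rounding."""
    return max(
        z for z in (k / steps for k in range(steps + 1))
        if t_norm(x, z) <= y + 1e-12
    )

pts = [i / 10 for i in range(11)]
max_diff = max(
    abs(s_implication(x, y) - r_implication(x, y)) for x in pts for y in pts
)
print(max_diff)
```

Both functions also reproduce the boundary conditions $i(0, 0) = i(0, 1) = i(1, 1) = 1$ and $i(1, 0) = 0$.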

2.3 R-Implications in Bounded Systems

For implications in nilpotent connective systems, the notation $i$ is used. For the residual implication, we easily get the following formula (see Theorem 2.5.21 in Baczyński and Jayaram [2]).

Proposition 2.1. In a nilpotent connective system $(c, d, n)$, the residual implication has the following form:
$$i_R(x, y) = f_c^{-1}[f_c(y) - f_c(x)],$$
where $f_c$ is the additive generator function of $c$, and $[\,\cdot\,]$ is the cutting operator defined in Definition 1.3.

Proof. From the definition of the residual implication, $i_R(x, y) = \max\{z : c(x, z) \le y\}$, where $c(x, z) = f_c^{-1}[f_c(x) + f_c(z)] \le y$. From this, we have $z = f_c^{-1}[f_c(y) - f_c(x)]$.
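The formula of Proposition 2.1 can be checked numerically. The following is a minimal sketch rather than a definitive implementation: it assumes that the cutting operator clamps its argument to $[0, 1]$, and it picks the generator $f_c(x) = 1 - x^2$ purely for illustration. The helper names (`cut`, `i_R`, `residual_by_search`) are hypothetical.

```python
import math

def cut(t):
    """Cutting operator [t], assumed here to clamp t to the unit interval."""
    return max(0.0, min(1.0, t))

# Illustrative normalized generator of the conjunction (an assumption):
# strictly decreasing on [0, 1] with f_c(1) = 0 and f_c(0) = 1.
def f_c(x):
    return 1.0 - x * x

def f_c_inv(u):
    return math.sqrt(1.0 - u)

def c(x, y):
    """Nilpotent conjunction c(x, y) = f_c^{-1}[f_c(x) + f_c(y)]."""
    return f_c_inv(cut(f_c(x) + f_c(y)))

def i_R(x, y):
    """Closed form of the residual implication: f_c^{-1}[f_c(y) - f_c(x)]."""
    return f_c_inv(cut(f_c(y) - f_c(x)))

def residual_by_search(x, y, steps=10000):
    """Brute-force max{z : c(x, z) <= y} on a uniform grid, for comparison."""
    candidates = (k / steps for k in range(steps + 1))
    return max(z for z in candidates if c(x, z) <= y + 1e-12)

if __name__ == "__main__":
    for x, y in [(0.3, 0.7), (0.5, 0.4), (0.9, 0.2)]:
        print(x, y, i_R(x, y), residual_by_search(x, y))
```

For $x \le y$ both computations return 1; for $x > y$ the grid search agrees with the closed form up to the grid resolution.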

Proposition 2.2. We can also express $i_R$ by using the negation operator and the normalized additive generator function of $d$.

Proof. From $n(x) = f_c^{-1}(f_d(x))$, we have $f_c(x) = f_d(n(x))$ and $f_c^{-1}(x) = n^{-1}(f_d^{-1}(x))$, so
$$i_R(x, y) = n^{-1}\left(f_d^{-1}[f_d(n(y)) - f_d(n(x))]\right).$$

The notation $H$ is introduced below for further applications. A new formula for $i_R$ is given in (2.2) by using $H$. Let
$$H(x) = 1 - f_d(n(x)), \tag{2.1}$$
so $n^{-1}(f_d^{-1}(x)) = H^{-1}(1 - x)$. From this we have
$$i_R(x, y) = H^{-1}\left(1 - [(1 - H(y)) - (1 - H(x))]\right) = H^{-1}[H(y) - H(x) + 1]. \tag{2.2}$$

Next, the properties in Definition 2.2 are examined to ascertain whether they are compatible with the R-implication in a nilpotent connective system.

Remark 2.2. Note that the following results regarding the properties of $i_R$ correspond with Section 2.5 in [2].

Proposition 2.3. In a nilpotent connective system, $i_R$ satisfies
1. The left neutrality property (the neutrality of truth) (LN), i.e. $i_R(1, y) = y$ for all $y \in [0, 1]$,
2. The exchange principle (EP), i.e. $i_R(x, i_R(y, z)) = i_R(y, i_R(x, z))$ for all $x, y, z \in [0, 1]$,
3. The identity principle (IP), i.e. $i_R(x, x) = 1$ for all $x \in [0, 1]$,
4. The strong negation principle (SN), since $n^*_R(x) = i_R(x, 0) = n_c(x)$ for all $x \in [0, 1]$ is a strong negation,
5. The law of contraposition (contrapositive symmetry) (LC) with respect to the strong negation in (SN), i.e. $i_R(x, y) = i_R(n_c(y), n_c(x))$ for all $x, y \in [0, 1]$,
6. The ordering principle (OP), i.e. $i_R(x, y) = 1$ if and only if $x \le y$.


Proof. LN, EP, IP and OP always hold for an R-implication derived from a continuous t-norm (see Theorem 2.5.7 in Baczyński and Jayaram [2]). LC follows directly from the definition of $n_c$. EP and OP together always imply SN for continuous implications (see Corollary 1.4.19 in Baczyński and Jayaram [2]).

Remark 2.3. Note that the law of contraposition (contrapositive symmetry) (LC) with respect to the strong negation $n$, i.e. $i_R(x, y) = i_R(n(y), n(x))$ for all $x, y \in [0, 1]$, never holds in a bounded system (see also Corollary 1.5.12 in Baczyński and Jayaram [2]).

Proof. We will prove that $i_R(x, y) = i_R(n(y), n(x))$ holds for all $x, y \in [0, 1]$ if and only if $f_c(x) + f_d(x) = 1$, i.e. the system is a Łukasiewicz logical system. If $x \le y$, then $n(y) \le n(x)$, and therefore from the ordering property we get that both sides are equal to 1. If $x > y$, then the two sides of the equality are equal if and only if $f_c(y) - f_c(x) = f_d(x) - f_d(y)$, i.e. $f_c(x) + f_d(x) = f_c(y) + f_d(y)$ for all $x, y \in [0, 1]$, which means that $f_c(x) + f_d(x)$ is a constant. Since $f_c(0) + f_d(0) = 1$, we get $f_c(x) + f_d(x) = 1$.

A different form of the residual implication is also given in the next section.

2.4 S-Implications in Bounded Systems

In a nilpotent connective system $(c, d, n)$ we can define different types of S-implications.

Definition 2.6.
1. $i_{Sn}(x, y) = d(n(x), y)$, $x, y \in [0, 1]$,
2. $i_{Sd}(x, y) = d(n_d(x), y)$, $x, y \in [0, 1]$,
3. $i_{Sc}(x, y) = d(n_c(x), y)$, $x, y \in [0, 1]$,
where $n_c$ and $n_d$ are the natural negations of $c$ and $d$.

Replacing the disjunction in the definitions above by an appropriate composition of negations and the conjunction leads us to further possible definitions of implications. Since in a bounded system the negations $n$, $n_c$ and $n_d$ never coincide, negations different from $n$ can also be used in a similar way to the De Morgan identity.

Definition 2.7. In a nilpotent connective system $(c, d, n)$
1. $i_{Scn}(x, y) = n(c(x, n(y)))$, $x, y \in [0, 1]$,
2. $i_{Scd}(x, y) = n_d(c(x, n_d(y)))$, $x, y \in [0, 1]$,
3. $i_{Scc}(x, y) = n_c(c(x, n_c(y)))$, $x, y \in [0, 1]$,
where $n_c$ and $n_d$ are the natural negations of $c$ and $d$.


Note that from the De Morgan identity it follows immediately that $i_{Scn}(x, y) = i_{Sn}(x, y)$ and, as the following proposition shows, $i_{Scc}$ is the residual implication.

Proposition 2.4. In a nilpotent connective system $(c, d, n)$,
$$i_{Scc}(x, y) = f_c^{-1}[f_c(y) - f_c(x)] = i_R(x, y),$$
where $f_c$ is the normalized additive generator function of $c$.

Proof.
$$i_{Scc}(x, y) = n_c(c(x, n_c(y))) = n_c\left(f_c^{-1}[f_c(x) + 1 - f_c(y)]\right) = f_c^{-1}[1 - (1 - f_c(y) + f_c(x))] = f_c^{-1}[f_c(y) - f_c(x)].$$

2.4.1 Properties of $i_{Sn}$, $i_{Sd}$ and $i_{Sc}$

First, the formulae for the S-implications defined above will be given.

Proposition 2.5. In a nilpotent connective system $(c, d, n)$,
1. $i_{Sn}(x, y) = f_d^{-1}[f_c(x) + f_d(y)]$,
2. $i_{Sd}(x, y) = f_d^{-1}[1 - f_d(x) + f_d(y)]$,
3. $i_{Sc}(x, y) = f_d^{-1}[f_d(y) + f_d(n_c(x))]$,
where $f_c$ and $f_d$ are the normalized additive generator functions of $c$ and $d$, respectively.

Proof. All three formulae are simple to verify.

Next, the basic properties of the S-implications in a nilpotent connective system are stated. Note that the following results are consistent with those described in Section 2.5 of Baczyński and Jayaram [2].

Proposition 2.6. In a nilpotent connective system, $i_{Sn}$, $i_{Sd}$ and $i_{Sc}$ satisfy
1. The left neutrality property (the neutrality of truth) (LN), i.e. $i(1, y) = y$ for all $y \in [0, 1]$,
2. The exchange principle (EP), i.e. $i(x, i(y, z)) = i(y, i(x, z))$ for all $x, y, z \in [0, 1]$,
3. The identity principle (IP), i.e. $i(x, x) = 1$ for all $x \in [0, 1]$,
4. The strong negation principle (SN), since $i_S(x, 0)$ for all $x \in [0, 1]$ is a strong negation,
5. The law of contraposition (contrapositive symmetry) (LC) with respect to the strong negation in SN.


Proof.
1. LN holds for every S-implication (see Proposition 2.4.3 in Baczyński and Jayaram [2]).
2. EP holds for every S-implication (see Proposition 2.4.3 in Baczyński and Jayaram [2]).
3. IP holds as well, because of the consistency property and the use of nilpotent operators (see Theorem 2.4.17 in Baczyński and Jayaram [2]).
4. For SN,
   (a) $n^*_n(x) = i_{Sn}(x, 0) = d(n(x), 0) = f_d^{-1}[f_d(n(x)) + 0] = n(x)$,
   (b) $n^*_d(x) = i_{Sd}(x, 0) = d(n_d(x), 0) = f_d^{-1}[f_d(n_d(x)) + 0] = n_d(x)$,
   (c) $n^*_c(x) = i_{Sc}(x, 0) = d(n_c(x), 0) = f_d^{-1}[f_d(n_c(x)) + 0] = n_c(x)$.
5. The proof of LC is trivial.

2.4.2 S-Implications and the Ordering Property

First, the so-called weak ordering principle for implications is defined. Although the ordering principle plays an important role, as we will see, only the weak ordering property can be required in general.

Definition 2.8. The implication $i$ satisfies the weak ordering principle (WOP) if the following statement holds: $i(x, y) = 1$ if and only if $x \le \tau(y)$, where $\tau$ is a strictly increasing function $[0, 1] \to [0, 1]$ with $\tau(0) = 0$ and $\tau(1) = 1$.

Remark 2.4. In the terminology of Maes and De Baets, $\tau$ from Definition 2.8 is an affirmation (see Maes and De Baets [21]).

Remark 2.5. Note that for $\tau(x) = x$, we get the original ordering property (OP).

Henceforth we use the following notations for the composition of two negation operators.

Definition 2.9. In a connective system $(c, d, n)$, $\tau_{n,d}(x) := n(n_d(x))$ and $\tau_{c,d}(x) := n_c(n_d(x))$, where $n_c$ and $n_d$ are the natural negations of $c$ and $d$, respectively.


Remark 2.6. Note that in a consistent connective system $\tau_{d,n} = \tau_{n,c}$ and similarly, $\tau_{c,n} = \tau_{n,d}$.

Proposition 2.7. In a nilpotent connective system $i_{Sd}$ satisfies the ordering principle (OP), while $i_{Sn}$ and $i_{Sc}$ satisfy the weak ordering principle (WOP).

Proof. For $i_{Sd}$ we have the following: $i_{Sd}(x, y) = 1$ if and only if $f_d^{-1}[f_d(n_d(x)) + f_d(y)] = 1$, which means that $f_d(n_d(x)) + f_d(y) \ge 1$, from which we get $n_d(x) \ge n_d(y)$, which holds if and only if $x \le y$.

For $i_{Sc}$, let $\tau(x) = \tau_{c,d}(x) = n_c(n_d(x))$. Then $i_{Sc}(x, y) = 1$ if and only if $f_d^{-1}[f_d(n_c(x)) + f_d(y)] = 1$, which means that $f_d(n_c(x)) + f_d(y) \ge 1$, from which we get $n_c(x) \ge n_d(y)$, so $x \le n_c(n_d(y)) = \tau_{c,d}(y)$.

Similarly, for $i_{Sn}$, let $\tau(x) = \tau_{n,d}(x) = n(n_d(x))$. Then $i_{Sn}(x, y) = 1$ if and only if $f_d^{-1}[f_d(n(x)) + f_d(y)] = 1$, which means that $f_d(n(x)) + f_d(y) \ge 1$, from which we get $n(x) \ge n_d(y)$, so $x \le n(n_d(y)) = \tau_{n,d}(y)$.

Next, we give an example of a bounded system illustrating that $i_{Sn}$ does not satisfy the ordering property. For $f_c(x) = 1 - x^2$, $f_d(x) = 1 - (1 - x)^2$ and $n(x) = 1 - x$, there exist an $x$ and a $y$ for which $i_{Sn}(x, y) = 1$ and $y < x$, i.e. the ordering principle does not hold, because $i_{Sn}(x, y) = 1$ if and only if $d(n(x), y) = 1$. For $x = 0.5$ and $y = 0.4$ we get $f_c(0.5) + f_d(0.4) = (1 - 0.5^2) + (1 - (1 - 0.4)^2) = 0.75 + (1 - 0.36) = 1.39$, so $i_{Sn}(0.5, 0.4) = 1$ although $y < x$.

Remark 2.7. Note that the following statements are equivalent:
$$i_{Sc}(x, y) = 1 \text{ if and only if } x \le y, \tag{2.3}$$
$$f_c(x) + f_d(x) = 1 \text{ for all } x \in [0, 1]. \tag{2.4}$$
In other words, the ordering property (OP) never holds for $i_{Sc}$ in a bounded system. We show that the ordering property holds if and only if $f_c(x) + f_d(x) = 1$. By the proof of Proposition 2.7, $i_{Sc}(x, y) = 1$ if and only if $n_c(x) \ge n_d(y)$. This means that the ordering property for $i_{Sc}$ (and also for $i_{Sn}$) is equivalent to the following: $n_c(x) \ge n_d(y)$ if and only if $x \le y$. It is evident that the condition above holds if and only if $n_d(x) = n_c(x)$, i.e. $f_c(x) + f_d(x) = 1$.
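The counterexample used in the proof of Proposition 2.7 can be reproduced with a few lines of code. This is a small sketch under the assumption that the cutting operator clamps to $[0, 1]$; the function names are hypothetical, while the generators $f_c(x) = 1 - x^2$, $f_d(x) = 1 - (1 - x)^2$ and the negation $n(x) = 1 - x$ are the ones given in the text.

```python
import math

def cut(t):
    """Cutting operator, assumed to clamp t to [0, 1]."""
    return max(0.0, min(1.0, t))

def f_c(x):
    return 1.0 - x ** 2

def f_d(x):
    return 1.0 - (1.0 - x) ** 2

def f_d_inv(u):
    return 1.0 - math.sqrt(1.0 - u)

def n(x):
    return 1.0 - x

def n_d(x):
    """Natural negation of the disjunction: f_d^{-1}(1 - f_d(x))."""
    return f_d_inv(cut(1.0 - f_d(x)))

def d(x, y):
    """Nilpotent disjunction d(x, y) = f_d^{-1}[f_d(x) + f_d(y)]."""
    return f_d_inv(cut(f_d(x) + f_d(y)))

def i_Sn(x, y):
    return d(n(x), y)

def tau_nd(y):
    """tau_{n,d}(y) = n(n_d(y)), the bound in the weak ordering principle."""
    return n(n_d(y))

if __name__ == "__main__":
    x, y = 0.5, 0.4
    print(f_c(x) + f_d(y))   # the sum from the text, about 1.39
    print(i_Sn(x, y))        # 1.0 although y < x
    print(tau_nd(y))         # about 0.8, so x <= tau_{n,d}(y) holds
```

The last value illustrates the weak ordering principle: $i_{Sn}(x, 0.4) = 1$ exactly for $x \le \tau_{n,d}(0.4)$, which is larger than 0.4 in this system.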


2.5 A Comparison of Implications in Bounded Systems

Now, we will prove that in a bounded system, the different types of implications considered so far never coincide.

Proposition 2.8. In a connective system $(c, d, n)$, any two of the implications defined so far coincide if and only if $f_c(x) + f_d(x) = 1$, where $f_c$ and $f_d$ are the normalized additive generator functions of $c$ and $d$, respectively.

Proof. It was shown in Proposition 2.4 that in a bounded system, the natural negations of the implications are identical only in the case of $i_R$ and $i_{Sc}$, which means that it is sufficient to examine the equality of these two implications. Since $i_R$ satisfies OP while $i_{Sc}$ for $f_c(x) + f_d(x) \ne 1$ does not (see Table 2.1), we see that in a bounded system they cannot be equal.

Remark 2.8. It is clear that in a Łukasiewicz logical system (where $f_c(x) + f_d(x) = 1$), all the implications considered above coincide.

From the results of Sects. 2.3 and 2.4, we can say that in a bounded system we have two different implications (namely $i_R$ and $i_{Sd}$) that satisfy all of the properties LN to OP (see Table 2.1). Hence, the notations $i_c$ and $i_d$ are used, matching the additive generator functions $f_c$ and $f_d$ that appear in the formulae of the implications (see Table 2.1). Henceforth let us use the following notation for the sake of simplicity:
$$i_d(x, y) := i_{Sd}(x, y) \quad \text{and} \quad i_c(x, y) := i_R(x, y). \tag{2.5}$$

2.6 Min and Max Operators in Nilpotent Connective Systems

In this section, we will show that in a nilpotent connective system, the minimum and maximum operators can be expressed in terms of the conjunction, the disjunction and the negation operator.

Proposition 2.9. $c(x, i_c(x, y)) = \min(x, y)$, $x, y \in [0, 1]$.

Proof. $c(x, i_c(x, y)) = f_c^{-1}[f_c(x) + [f_c(y) - f_c(x)]]$. For $x \le y$, $f_c(x) \ge f_c(y)$, which means that $c(x, i_c(x, y)) = x$. Similarly, for $x \ge y$, $f_c(x) \le f_c(y)$, which means that $c(x, i_c(x, y)) = y$.

Table 2.1 Properties of implications in bounded systems

Implication | Formula | LN | EP | IP | SN | LC | WOP | OP
$i_c = i_R$ | $f_c^{-1}[f_c(y) - f_c(x)]$ | ✓ | ✓ | ✓ | $n_c(x)$ | ✓ | ✓ | ✓
$i_d = i_{Sd}$ | $f_d^{-1}[1 - f_d(x) + f_d(y)]$ | ✓ | ✓ | ✓ | $n_d(x)$ | ✓ | ✓ | ✓
$i_{Sn}$ | $f_d^{-1}[f_c(x) + f_d(y)]$ | ✓ | ✓ | ✓ | $n(x)$ | ✓ | $\tau_{n,d}(x)$ | −
$i_{Sc}$ | $f_d^{-1}[f_d(y) + f_d(n_c(x))]$ | ✓ | ✓ | ✓ | $n_c(x)$ | ✓ | $\tau_{c,d}(x)$ | −

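Table 2.2 Rational generator functions

 | $f(x)$ (generator) | $f^{-1}(x)$ | $1 - f(x)$ | negation
Negation | $\dfrac{1}{1 + \frac{\nu}{1-\nu}\,\frac{1-x}{x}}$ | $\dfrac{1}{1 + \frac{1-\nu}{\nu}\,\frac{1-x}{x}}$ | $\dfrac{1}{1 + \frac{1-\nu}{\nu}\,\frac{x}{1-x}}$ | $n(x) = \dfrac{1}{1 + \left(\frac{1-\nu}{\nu}\right)^2 \frac{x}{1-x}}$
Conjunction | $\dfrac{1}{1 + \frac{\nu_c}{1-\nu_c}\,\frac{x}{1-x}}$ | $\dfrac{1}{1 + \frac{\nu_c}{1-\nu_c}\,\frac{x}{1-x}}$ | $\dfrac{1}{1 + \frac{1-\nu_c}{\nu_c}\,\frac{1-x}{x}}$ | $n_c(x) = \dfrac{1}{1 + \left(\frac{\nu_c}{1-\nu_c}\right)^2 \frac{x}{1-x}}$
Disjunction | $\dfrac{1}{1 + \frac{\nu_d}{1-\nu_d}\,\frac{1-x}{x}}$ | $\dfrac{1}{1 + \frac{1-\nu_d}{\nu_d}\,\frac{1-x}{x}}$ | $\dfrac{1}{1 + \frac{1-\nu_d}{\nu_d}\,\frac{x}{1-x}}$ | $n_d(x) = \dfrac{1}{1 + \left(\frac{1-\nu_d}{\nu_d}\right)^2 \frac{x}{1-x}}$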

Proposition 2.10. $n(c(n(x), i_c(n(x), n(y)))) = \max(x, y)$, $x, y \in [0, 1]$.

Proof. The statement follows immediately from the previous proposition (it can also be proved in a similar way).
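Propositions 2.9 and 2.10 lend themselves to a quick numerical sanity check. The sketch below is illustrative only: it assumes a clamping cutting operator and uses the generator $f_c(x) = 1 - x^2$ with $n(x) = 1 - x$ as an example system; the names `nil_min` and `nil_max` are hypothetical.

```python
import math

def cut(t):
    """Cutting operator, assumed to clamp t to [0, 1]."""
    return max(0.0, min(1.0, t))

# Example generator and negation (assumptions chosen for illustration).
def f_c(x):
    return 1.0 - x * x

def f_c_inv(u):
    return math.sqrt(1.0 - u)

def n(x):
    return 1.0 - x

def c(x, y):
    return f_c_inv(cut(f_c(x) + f_c(y)))

def i_c(x, y):
    return f_c_inv(cut(f_c(y) - f_c(x)))

def nil_min(x, y):
    """Proposition 2.9: c(x, i_c(x, y))."""
    return c(x, i_c(x, y))

def nil_max(x, y):
    """Proposition 2.10: n(c(n(x), i_c(n(x), n(y))))."""
    return n(nil_min(n(x), n(y)))

if __name__ == "__main__":
    grid = [k / 10 for k in range(11)]
    worst = max(
        max(abs(nil_min(x, y) - min(x, y)), abs(nil_max(x, y) - max(x, y)))
        for x in grid for y in grid
    )
    print("largest deviation on the grid:", worst)
```

The deviation is pure floating-point round-off, so the comparison uses a tolerance instead of exact equality.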

2.7 Summary

In Table 2.1, the results concerning the properties of each implication are listed. For rational additive generator functions, the implications have been plotted in Figs. 2.1, 2.2 and 2.3. The formulae of the additive generators and the implications are summarized in Tables 2.1 and 1.4 (the corresponding negations are listed in Table 2.2). In the next chapter, we will focus on the equivalence operators in nilpotent systems.


Fig. 2.1 i c and i d implications for rational generators


Fig. 2.2 Sn -implications for rational generators

Fig. 2.3 Sc -implications for rational generators

References

1. Dombi, J., Csiszár, O.: The general nilpotent operator system. Fuzzy Sets Syst. 261, 1–19 (2015)
2. Baczyński, M., Jayaram, B.: Fuzzy Implications, 1st edn. Springer (2008). ISBN 3540690808
3. Mas, M., Monserrat, M., Torrens, J., Trillas, E.: A survey on fuzzy implication functions. IEEE Trans. Fuzzy Syst. 15(6), 1107–1121 (2007). https://doi.org/10.1109/TFUZZ.2007.896304
4. Zadeh, L.A.: Fuzzy sets. Inf. Control 8, 338–353 (1965)
5. Aguiló, I., Suñer, J., Torrens, J.: A characterization of residual implications derived from left-continuous uninorms. Inf. Sci. 180, 3992–4005 (2010). ISSN 00200255. https://doi.org/10.1016/j.ins.2010.06.023
6. Baczyński, M., Jayaram, B.: On the distributivity of fuzzy implications over nilpotent or strict triangular conorms. IEEE Trans. Fuzzy Syst. 17(3), 590–603 (2009). ISSN 10636706. https://doi.org/10.1109/TFUZZ.2008.924201
7. Bandler, W., Kohout, L.: Fuzzy power sets and fuzzy implication operators. Fuzzy Sets Syst. 4, 13–30 (1980). ISSN 01650114. https://doi.org/10.1016/0165-0114(80)90060-3
8. Jenei, S.: New family of triangular norms via contrapositive symmetrization of residuated implications. Fuzzy Sets Syst. 110(2), 157–174 (2000). ISSN 0165-0114. https://doi.org/10.1016/S0165-0114(97)00374-6, http://www.sciencedirect.com/science/article/pii/S0165011497003746
9. Qin, F., Baczyński, M., Xie, A.: Distributive equations of implications based on continuous triangular norms (I). IEEE Trans. Fuzzy Syst. 20(1), 153–167 (2012). https://doi.org/10.1109/TFUZZ.2011.2171188
10. Shi, Y., Van Gasse, B., Ruan, D., Kerre, E.: On the first place antitonicity in QL-implications. Fuzzy Sets Syst. 159(22), 2988–3013 (2008). ISSN 0165-0114. https://doi.org/10.1016/j.fss.2008.04.012, http://www.sciencedirect.com/science/article/pii/S0165011408002340. Theme: Logic and Algebra
11. Shi, Y., Van Gasse, B., Ruan, D., Kerre, E.: On dependencies and independencies of fuzzy implication axioms. Fuzzy Sets Syst. 161(10), 1388–1405 (2010). ISSN 0165-0114. https://doi.org/10.1016/j.fss.2009.12.003, http://www.sciencedirect.com/science/article/pii/S0165011409005302. Theme: Aggregation functions
12. Trillas, E., Alsina, C.: On the law [p ∧ q → r] = [(p → r) ∨ (q → r)] in fuzzy logic. IEEE Trans. Fuzzy Syst. 10(1), 84–88 (2002). https://doi.org/10.1109/91.983281
13. Valverde, L.: On the structure of F-indistinguishability operators. Fuzzy Sets Syst. 17(3), 313–328 (1985). ISSN 0165-0114. https://doi.org/10.1016/0165-0114(85)90096-X, http://www.sciencedirect.com/science/article/pii/016501148590096X
14. Dombi, J., Csiszár, O.: Implications in bounded systems. Inf. Sci. 283, 229–240 (2014)
15. Bustince, H., Burillo, P., Soria, F.: Automorphisms, negations and implication operators. Fuzzy Sets Syst. 134, 209–229 (2003). ISSN 01650114. https://doi.org/10.1016/S0165-0114(02)00214-2
16. Dubois, D., Prade, H.: Fuzzy sets in approximate reasoning. Fuzzy Sets Syst. 40, 143–202 (1991)
17. Fodor, J., Roubens, M.: Fuzzy preference modelling and multicriteria decision support (1994)
18. Fodor, J.C.: Contrapositive symmetry of fuzzy implications. Fuzzy Sets Syst. 69(2), 141–156 (1995). ISSN 0165-0114. https://doi.org/10.1016/0165-0114(94)00210-X, http://www.sciencedirect.com/science/article/pii/016501149400210X
19. Klement, E.P., Navara, M.: A survey on different triangular norm-based fuzzy logics. Fuzzy Sets Syst. 101, 241–251 (1999)
20. Trillas, E., Alsina, C.: Fuzzy Logic. IEEE Trans. Fuzzy Syst. 10(1), 84–88 (2002)
21. Maes, K.C., De Baets, B.: Negation and affirmation: the role of involutive negators. Soft Comput. 11(7), 647–654 (2007)

Chapter 3

Equivalences

Abstract In this chapter, we focus on equivalences in nilpotent logical systems. We study three different types of equivalence operators and, by aggregating the implication-based equivalence and its dual operator, resolve the paradox that in a non-Boolean setting there is no equivalence relation which fulfills both $e(x, x) = 1$ and $e(x, n(x)) = 0$ for all $x$. We will also show that the aggregated equivalence has some nice properties like associativity, threshold transitivity, and T-transitivity. Finally, for applications in image processing, we define the overall equivalence of two grey level images and give an important semantic meaning to the aggregated equivalences.

3.1 Introduction

The theory of fuzzy relations is a generalization of that of crisp relations on a set. Zadeh introduced the concept of fuzzy relations in [1] and the concept of fuzzy similarity relations in [2]. Since then, many authors have studied fuzzy equivalence relations [3–6], which have proven to be useful in different contexts such as fuzzy control, approximate reasoning and fuzzy cluster analysis. As the research progressed, it became clear that any given relation may or may not satisfy a particular requirement for the fuzzy equivalence relation introduced by Zadeh. As shown by Gupta and Gupta [7], the condition $\mu(x, x) = 1$ for all $x \in X$ is too strong for defining a fuzzy reflexive relation $\mu$ on a set $X$ (see also Yeh [8] and Chon [9]). Therefore, new types of fuzzy reflexive relations had to be introduced. Yeh [8] defined the concept of $\varepsilon$-reflexive fuzzy relations and weakly reflexive fuzzy relations by weakening the standard reflexive fuzzy relation to $\mu(x, x) \ge \varepsilon > 0$. Gupta and Gupta [7] introduced G-reflexive fuzzy relations as a generalization of reflexive fuzzy relations. When discussing fuzzy transitive relations, different approaches have been adopted. The first type of transitivity is that introduced by Zadeh in [2], and the second type is the so-called T-transitivity of fuzzy relations, defined with the help of a t-norm. Fuzzy T-transitivity has been studied intensively in [10–13].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 J. Dombi and O. Csiszár, Explainable Neural Networks Based on Fuzzy Logic and Multi-criteria Decision Tools, Studies in Fuzziness and Soft Computing 408, https://doi.org/10.1007/978-3-030-72280-7_3


Recently, Mesiar et al. [14], noticed that the associativity of a t-norm is superfluous in the above context, especially since we never have to aggregate more than two arguments. Thus, they have introduced a conjunctor instead of a t-norm. An alternative approach based on implications has been considered by Schmechel and Thiele [15, 16]. Jayaram and Mesiar [17], studied I-transitivity, where the implicator I is nothing more than a binary operator satisfying the boundary conditions of an implication. Another type of transitivity, the so-called -fuzzy transitivity, was introduced by Beg and Ashraf [18]. Ali et al. [19], introduced the concept of (α, β)-fuzzy reflexive relations, as a generalization of fuzzy reflexive relations as well as of fuzzy G-reflexive relations. More general types of fuzzy symmetric relations, a (α, β)-fuzzy symmetric relation and (α, β)-fuzzy transitive relations, were also studied. The concepts of (α, β)-fuzzy reflexive, symmetric and transitive relations naturally lead to the concept of (α, β)-fuzzy equivalence relations on a set. De Baets and Mesiar [20], introduced the concept of a T-partition as a generalization of that of a classical partition. Although the above list of authors is by no means complete, it gives us some idea about the importance of the concept of fuzzy equivalence relations in different contexts. Now we resolve a paradox of the equivalence relation by aggregating the implication-based equivalence and its dual operator. In Sect. 1, it was shown that a consistent connective system generated by nilpotent operators is not necessarily isomorphic to the Łukasiewicz system. Using more than one generator function, consistent nilpotent connective systems can be obtained in a different way with three naturally derived negation operators. As the class of nilpotent t-norms has preferable properties that make them useful in constructing logical structures, the advantages of such systems are obvious (see Klement and Navara [21]). 
Due to the fact that all continuous Archimedean (i.e. representable) nilpotent t-norms are isomorphic to the Łukasiewicz t-norm (see Grabisch et al. [22]), the nilpotent systems studied earlier were all isomorphic to the well-known Łukasiewicz logic. Those consistent connective systems which are not isomorphic to Łukasiewicz logic are called bounded systems (see Dombi and Csiszár [23]). Based on the results of Sects. 1 and 5, our focus is now on equivalences in bounded systems [24].

This chapter is organized as follows. After some preliminaries in Sect. 3.2, we define and examine the implication-based equivalences in bounded systems in Sect. 3.3. Next, the so-called dual equivalences are introduced and examined in Sect. 3.4. Using the arithmetic mean operator examined in Sect. 3.5, the aggregated equivalences are introduced and studied in Sect. 3.6. We show that unlike the other two types, the aggregated equivalences are threshold transitive and associative as well. In Sect. 3.7, for further applications in image processing, the overall equivalence of two grey level images is defined, and an important semantic meaning of the aggregated equivalences is given. Lastly, in Sect. 3.8, we summarize the key results. The results of this chapter can be found in [25].


3.2 Preliminaries

There exist several approaches to the definition of equivalences. Equivalences can be viewed as binary relations [3–6, 9–11]. Given a non-empty set $X$, a subset $\sigma$ of $X \times X$ is called a binary relation on $X$. A binary relation $\sigma$ on $X$ is reflexive if $(x, x) \in \sigma$ for all $x \in X$; $\sigma$ is symmetric if $(x, y) \in \sigma$ implies $(y, x) \in \sigma$ for all $x, y \in X$; $\sigma$ is transitive if $(x, y) \in \sigma$ and $(y, z) \in \sigma$ imply $(x, z) \in \sigma$ for all $x, y, z \in X$. A binary relation is called an equivalence relation if it is reflexive, symmetric and transitive. Recall that a fuzzy subset $\mu$ of $X$ is a mapping $\mu : X \to [0, 1]$.

Definition 3.1. A fuzzy binary relation on $X$ and $Y$ is a fuzzy subset $\mu$ of $X \times Y$. A fuzzy binary relation on a set $X$ is a fuzzy subset $\mu$ of $X \times X$, i.e. a function $\mu : X \times X \to [0, 1]$.

Definition 3.2. A fuzzy relation $\mu$ on a set $X$ is said to be reflexive if $\mu(x, x) = 1$ for all $x \in X$, and symmetric if $\mu(x, y) = \mu(y, x)$ for all $x, y \in X$.

Definition 3.3. A fuzzy relation $\mu$ on a set $X$ is said to be fuzzy transitive if
$$\mu(x, z) \ge \sup_{y \in X} \min(\mu(x, y), \mu(y, z)) \quad \forall (x, y), (y, z) \in X \times X.$$

Definition 3.4. A fuzzy relation $\mu$ on $X$ is a fuzzy equivalence relation if it is a reflexive, symmetric and fuzzy transitive relation on $X$.

Now we will consider an equivalence as a connective. We give the definition of an equivalence as a binary operation on the unit interval according to Fodor and Roubens.

Definition 3.5. (Fodor and Roubens [26]) A function $e : [0, 1]^2 \to [0, 1]$ is called an equivalence if it satisfies the following conditions:
1. Symmetry, i.e. $e(x, y) = e(y, x)$ for all $x, y \in [0, 1]$,
2. Compatibility, i.e. $e(0, 1) = e(1, 0) = 0$ and $e(0, 0) = e(1, 1) = 1$,
3. Reflexivity, i.e. $e(x, x) = 1$ for all $x \in [0, 1]$,
4. Monotonicity, i.e. $x \le x' \le y' \le y \Rightarrow e(x, y) \le e(x', y')$.

Definition 3.6. An operator $e : [0, 1]^2 \to [0, 1]$ is said to be
1. T-transitive with respect to a t-norm $T$, if $T(e(x, y), e(y, z)) \le e(x, z)$ for all $x, y, z \in [0, 1]$,
2. threshold transitive with respect to a threshold $\nu$ ($0 < \nu < 1$), if $e(x, y) \ge \nu$ and $e(y, z) \ge \nu$ together imply $e(x, z) \ge \nu$ for all $x, y, z \in [0, 1]$,
3. invariant with respect to a negation $n$, if $e(x, y) = e(n(x), n(y))$ for all $x, y \in [0, 1]$,
4. associative, if $e(x, e(y, z)) = e(e(x, y), z)$ holds for all $x, y, z \in [0, 1]$.


Fig. 3.1 ec (x, y) and ed (x, y) for rational generators

3.3 Equivalences in Bounded Systems

Let us now consider a nilpotent connective system $(c, d, n)$ (see Sect. 1.4) and let us denote the normalized generator functions of $c$ and $d$ by $f_c$ and $f_d$, respectively. Using the above-defined implications $i_c$ and $i_d$, we can define two different types of equivalences (Fig. 3.1).

Definition 3.7. The conjunctive and disjunctive equivalence operators are defined as follows:
$$e_c(x, y) = c(i_c(x, y), i_c(y, x)),$$
$$e_d(x, y) = n_d(d(n_d(i_d(x, y)), n_d(i_d(y, x)))).$$

Proposition 3.1. In a bounded system,
$$e_c(x, y) = f_c^{-1}[|f_c(x) - f_c(y)|]$$
and similarly,
$$e_d(x, y) = f_d^{-1}[1 - |f_d(x) - f_d(y)|].$$

Proof. $e_c(x, y) = f_c^{-1}[[f_c(y) - f_c(x)] + [f_c(x) - f_c(y)]]$. If $x < y$, then $f_c(x) \ge f_c(y)$, which means that we have $f_c^{-1}[f_c(x) - f_c(y)]$. Similarly, if $x > y$, then $f_c(x) \le f_c(y)$ and we get $f_c^{-1}[f_c(y) - f_c(x)]$. Similarly for $e_d$, by using $n_d(i_d(y, x)) = f_d^{-1}[f_d(y) - f_d(x)]$, we obtain
$$n_d(e_d(x, y)) = f_d^{-1}[[f_d(x) - f_d(y)] + [f_d(y) - f_d(x)]] = f_d^{-1}[|f_d(x) - f_d(y)|].$$
Therefore,
$$e_d(x, y) = f_d^{-1}[1 - |f_d(x) - f_d(y)|].$$

Remark 3.1. Since $0 \le |f_c(x) - f_c(y)| \le 1$ and $0 \le 1 - |f_d(x) - f_d(y)| \le 1$, the cutting function can be omitted here. For conceptual reasons, we prefer to leave it in all of the formulae.
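As a plausibility check of Proposition 3.1, the closed forms can be compared with the compositions in Definition 3.7 on a grid. The sketch below is not taken from the source: it assumes a clamping cutting operator and uses the example generators $f_c(x) = 1 - x^2$ and $f_d(x) = 1 - (1 - x)^2$; all function names are hypothetical.

```python
import math

def cut(t):
    """Cutting operator, assumed to clamp t to [0, 1]."""
    return max(0.0, min(1.0, t))

# Example generators (assumptions chosen for illustration).
def f_c(x): return 1.0 - x * x
def f_c_inv(u): return math.sqrt(1.0 - u)
def f_d(x): return 1.0 - (1.0 - x) ** 2
def f_d_inv(u): return 1.0 - math.sqrt(1.0 - u)

def c(x, y): return f_c_inv(cut(f_c(x) + f_c(y)))
def d(x, y): return f_d_inv(cut(f_d(x) + f_d(y)))
def n_d(x): return f_d_inv(cut(1.0 - f_d(x)))
def i_c(x, y): return f_c_inv(cut(f_c(y) - f_c(x)))
def i_d(x, y): return f_d_inv(cut(1.0 - f_d(x) + f_d(y)))

# Definition 3.7: equivalences built from the implications.
def e_c_def(x, y): return c(i_c(x, y), i_c(y, x))
def e_d_def(x, y): return n_d(d(n_d(i_d(x, y)), n_d(i_d(y, x))))

# Proposition 3.1: closed forms.
def e_c_closed(x, y): return f_c_inv(cut(abs(f_c(x) - f_c(y))))
def e_d_closed(x, y): return f_d_inv(cut(1.0 - abs(f_d(x) - f_d(y))))

def max_gap(steps=20):
    """Largest difference between definition and closed form on an interior grid."""
    grid = [k / steps for k in range(1, steps)]
    return max(
        max(abs(e_c_def(x, y) - e_c_closed(x, y)),
            abs(e_d_def(x, y) - e_d_closed(x, y)))
        for x in grid for y in grid
    )

if __name__ == "__main__":
    print("largest gap:", max_gap())
```

The remaining gap is floating-point round-off, which the square roots in these particular generators can amplify slightly, hence the comparison with a tolerance.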

3.3.1 Properties of $e_c(x, y)$ and $e_d(x, y)$

Next, we will examine the chief properties of $e_c(x, y)$ and $e_d(x, y)$ and show that they coincide if and only if the connective system is a Łukasiewicz system.

Proposition 3.2. Let $\nu_c$ and $\nu_d$ be the fixpoints of $n_c$ and $n_d$, respectively. The operators $e_c(x, y)$ and $e_d(x, y)$ have the following properties:
1. Compatibility (see Definition 3.5).
2. Symmetry (see Definition 3.5).
3. Reflexivity (see Definition 3.5).
4. Monotonicity (see Definition 3.5).
5. $e_c$ is T-transitive with respect to the conjunction $c$ (see Definition 3.6) and similarly, $e_d$ is T-transitive with respect to the t-norm generated by $1 - f_d(x)$.
6. $e_c$ and $e_d$ are not threshold transitive (see Definition 3.6) with respect to $\nu_c$ and $\nu_d$.
7. Invariance (see Definition 3.6) with respect to $n_c$ and $n_d$.
8. $e_c(1, x) = e_d(1, x) = x$, $e_d(0, x) = n_d(x)$, and similarly, $e_c(0, x) = n_c(x)$.
9. $e_c(x, y) = 0$ if and only if $x, y \in \{0, 1\}$ and $x \ne y$. Similarly, $e_d(x, y) = 0$ if and only if $x, y \in \{0, 1\}$ and $x \ne y$.
10. $n_d(e_d(x, y)) = e_d(n_d(x), y)$ if and only if $x \in \{0, 1\}$ or $y \in \{0, 1\}$, and $n_c(e_c(x, y)) = e_c(n_c(x), y)$ if and only if $x \in \{0, 1\}$ or $y \in \{0, 1\}$.
11. $e_c(x, \nu_c) \ge \nu_c$ and similarly, $e_d(x, \nu_d) \ge \nu_d$.

Proof.
1. From $f_c^{-1}(0) = 1$, it follows that $e_c(1, 1) = e_c(0, 0) = 1$. From $f_c(1) = 0$, $f_c(0) = 1$ and $f_c^{-1}(1) = 0$, we get that $e_c(0, 1) = e_c(1, 0) = 0$. Similarly, from $f_d^{-1}(1) = 1$, it follows that $e_d(1, 1) = e_d(0, 0) = 1$. From $f_d(1) = 1$, $f_d(0) = 0$ and $f_d^{-1}(0) = 0$, we get that $e_d(0, 1) = e_d(1, 0) = 0$.
2. The proof is trivial.
3. $e_c(x, x) = f_c^{-1}(0) = 1$ and $e_d(x, x) = f_d^{-1}(1) = 1$.
4. We have to show that from $x \le x' \le y' \le y$ it follows that $e_c(x, y) \le e_c(x', y')$. Using the monotonicity of $f_c(x)$ and $f_c^{-1}(x)$, the statement follows immediately. For $e_d$, we have to show that from $x \le x' \le y' \le y$ it follows that $e_d(x, y) \le e_d(x', y')$. Using the monotonicity of $f_d(x)$ and $f_d^{-1}(x)$, the statement follows immediately.
5. By using the decreasing property of $f_c^{-1}$ and the triangle inequality, we find that
$$c(e_c(x, y), e_c(y, z)) = f_c^{-1}[|f_c(x) - f_c(y)| + |f_c(y) - f_c(z)|] \le f_c^{-1}[|f_c(x) - f_c(z)|] = e_c(x, z).$$
The proof is similar for $e_d$ as well.
6. $e_c(x, y) \ge \nu_c$ if and only if $|f_c(x) - f_c(y)| \le \frac{1}{2}$ and similarly, $e_c(y, z) \ge \nu_c$ if and only if $|f_c(y) - f_c(z)| \le \frac{1}{2}$. Obviously, these conditions are not sufficient for $|f_c(x) - f_c(z)| \le \frac{1}{2}$. Similarly, $e_d(x, y) \ge \nu_d$ if and only if $1 - |f_d(x) - f_d(y)| \ge \frac{1}{2}$ and $e_d(y, z) \ge \nu_d$ if and only if $1 - |f_d(y) - f_d(z)| \ge \frac{1}{2}$. Obviously, these conditions are not sufficient for $1 - |f_d(x) - f_d(z)| \ge \frac{1}{2}$.
7. $e_c(n_c(x), n_c(y)) = f_c^{-1}[|f_c(n_c(x)) - f_c(n_c(y))|] = f_c^{-1}[|1 - f_c(x) - (1 - f_c(y))|] = f_c^{-1}[|f_c(y) - f_c(x)|] = e_c(x, y)$. Similarly, $e_d(n_d(x), n_d(y)) = f_d^{-1}[1 - |f_d(n_d(x)) - f_d(n_d(y))|] = f_d^{-1}[1 - |1 - f_d(x) - (1 - f_d(y))|] = f_d^{-1}[1 - |f_d(y) - f_d(x)|] = e_d(x, y)$.
8. Using the fact that $f_c(1) = 0$, we get $e_c(1, x) = f_c^{-1}[|f_c(1) - f_c(x)|] = x$. Similarly, using the fact that $f_c(0) = 1$ and that $0 \le f_c(x) \le 1$ for all $x \in [0, 1]$, we get $e_c(0, x) = f_c^{-1}[|f_c(0) - f_c(x)|] = n_c(x)$. For $e_d$, using the fact that $f_d(1) = 1$ and that $0 \le f_d(x) \le 1$ for all $x \in [0, 1]$, we get $e_d(1, x) = f_d^{-1}[1 - |f_d(1) - f_d(x)|] = x$. From $f_d(0) = 0$, we get $e_d(0, x) = f_d^{-1}[1 - |f_d(0) - f_d(x)|] = n_d(x)$.
9. If $e_c(x, y) = 0$, then $|f_c(x) - f_c(y)| = 1$, from which $x, y \in \{0, 1\}$ and $x \ne y$. Going in the opposite direction is trivial.
10. $n_c(e_c(x, y)) = f_c^{-1}(1 - |f_c(x) - f_c(y)|)$ and $e_c(n_c(x), y) = f_c^{-1}(|1 - f_c(x) - f_c(y)|)$. Considering the four cases and using the monotonicity of $f_c(x)$, we get that $x \in \{0, 1\}$ or $y \in \{0, 1\}$. The proof is similar for $e_d(x, y)$ as well.
11. Using the monotonicity property of $f_c(x)$ and the fact that $f_c(\nu_c) = \frac{1}{2}$, we get
$$e_c(x, \nu_c) = f_c^{-1}[|f_c(x) - f_c(\nu_c)|] = f_c^{-1}\left[\left|f_c(x) - \tfrac{1}{2}\right|\right] \ge \nu_c,$$
since $0 \le |f_c(x) - \frac{1}{2}| \le \frac{1}{2}$. Similarly, using the monotonicity property of $f_d(x)$ and the fact that $f_d(\nu_d) = \frac{1}{2}$, we get
$$e_d(x, \nu_d) = f_d^{-1}[1 - |f_d(x) - f_d(\nu_d)|] = f_d^{-1}\left[1 - \left|f_d(x) - \tfrac{1}{2}\right|\right] \ge \nu_d,$$
since $\frac{1}{2} \le 1 - |f_d(x) - \frac{1}{2}| \le 1$.

Proposition 3.3. If $x, y > \nu_c$ or $x, y < \nu_c$, then $e_c(x, y) > \nu_c$. Similarly, if $x, y > \nu_d$ or $x, y < \nu_d$, then $e_d(x, y) > \nu_d$.

Proof. If $x, y > \nu_c$, then $f_c(x), f_c(y) < \frac{1}{2}$, so $|f_c(x) - f_c(y)| < \frac{1}{2}$, which means that $e_c(x, y) > \nu_c$. Similarly, if $x, y < \nu_c$, then $f_c(x), f_c(y) > \frac{1}{2}$, so $|f_c(x) - f_c(y)| < \frac{1}{2}$, which means that $e_c(x, y) > \nu_c$. For $e_d$, if $x, y > \nu_d$, then $f_d(x), f_d(y) > \frac{1}{2}$, so $|f_d(x) - f_d(y)| < \frac{1}{2}$, which means that $e_d(x, y) > \nu_d$. Similarly, if $x, y < \nu_d$, then $f_d(x), f_d(y) < \frac{1}{2}$, so $|f_d(x) - f_d(y)| < \frac{1}{2}$, which means that $e_d(x, y) > \nu_d$.

Remark 3.2. The equivalences $e_c$ and $e_d$ are not associative.

Proof. A possible counterexample is the case of rational generators with $\nu_c = 0.6$ and $\nu_d = 0.3$, and $x = 0.3$, $y = 0.4$, $z = 0.5$. In this case we get $e_c(x, e_c(y, z)) \approx 0.39$ and $e_c(e_c(x, y), z) \approx 0.62$, while $e_d(x, e_d(y, z)) \approx 0.38$ and $e_d(e_d(x, y), z) \approx 0.64$.
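The counterexample of Remark 3.2 can be recomputed directly. The sketch below assumes the rational generators $f_c(x) = 1 / (1 + \frac{\nu_c}{1-\nu_c}\frac{x}{1-x})$ and $f_d(x) = 1 / (1 + \frac{\nu_d}{1-\nu_d}\frac{1-x}{x})$ as read from Table 2.2, with the endpoint values filled in by continuity; it also uses the fact that this $f_c$ is its own inverse, which is easy to verify by substitution. The factory functions are hypothetical helpers, not part of the source.

```python
def make_f_c(nu_c):
    """Rational conjunction generator (assumed form); it is its own inverse."""
    a = nu_c / (1.0 - nu_c)
    def f_c(x):
        if x >= 1.0:
            return 0.0          # limit value at x = 1
        return 1.0 / (1.0 + a * x / (1.0 - x))
    return f_c

def make_f_d(nu_d):
    """Rational disjunction generator (assumed form) and its inverse."""
    b = nu_d / (1.0 - nu_d)
    def f_d(x):
        if x <= 0.0:
            return 0.0          # limit value at x = 0
        return 1.0 / (1.0 + b * (1.0 - x) / x)
    def f_d_inv(u):
        if u <= 0.0:
            return 0.0
        return 1.0 / (1.0 + (1.0 - u) / (b * u))
    return f_d, f_d_inv

f_c = make_f_c(0.6)
f_d, f_d_inv = make_f_d(0.3)

def e_c(x, y):
    # Proposition 3.1 with f_c^{-1} = f_c for this generator.
    return f_c(abs(f_c(x) - f_c(y)))

def e_d(x, y):
    return f_d_inv(1.0 - abs(f_d(x) - f_d(y)))

x, y, z = 0.3, 0.4, 0.5
values = (
    e_c(x, e_c(y, z)), e_c(e_c(x, y), z),
    e_d(x, e_d(y, z)), e_d(e_d(x, y), z),
)

if __name__ == "__main__":
    print([round(v, 2) for v in values])
```

Rounded to two decimals, the four values agree with those quoted in the remark, so the two bracketings indeed differ for both operators.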

Proposition 3.4. In a connective system the above-defined equivalences $e_c(x, y)$ and $e_d(x, y)$ coincide if and only if $f_c(x) + f_d(x) = 1$ (or equivalently $n_c = n_d$, i.e. in a Łukasiewicz system), where $f_c$ and $f_d$ are the normalized generator functions of the conjunction and disjunction operators, respectively.

Proof.
1. If $f_c(x) + f_d(x) = 1$, then $f_c(x) = 1 - f_d(x)$ and $f_c^{-1}(x) = f_d^{-1}(1 - x)$, from which we get $e_c(x, y) = f_c^{-1}[|f_c(x) - f_c(y)|] = f_d^{-1}[1 - |f_d(x) - f_d(y)|] = e_d(x, y)$.
2. If $e_c(x, y) = e_d(x, y)$, then in particular $e_c(0, x) = e_d(0, x)$, which means that $n_c(x) = n_d(x)$ must hold for all $x \in [0, 1]$.

3.4 Dual Equivalences

In classical logic, the equivalence operator has the following important property as well: $e(x, n(x)) = 0$. As is well known, demanding $e(x, x) = 1$ and $e(x, n(x)) = 0$ at the same time gives rise to a paradox [27].

Lemma 3.1. There is no equivalence relation which fulfils both $e(x, x) = 1$ and $e(x, n(x)) = 0$.

Proof. Let $\nu$ be the fixpoint of the negation $n(x)$. Then $1 = e(\nu, \nu) = e(\nu, n(\nu)) = 0$, which is a contradiction.

However, in practical applications the property $e(x, n(x)) = 0$ might be of even greater importance than reflexivity (see Dombi [27]). Motivated by this demand, we define new types of operators below. First, the so-called dual equivalence is defined, denoted by $\bar{e}$. Let us now consider a nilpotent connective system $(c, d, n)$ and let us denote the normalized generator functions of $c$ and $d$ by $f_c$ and $f_d$, respectively.

Definition 3.8. The dual equivalence operations are defined as follows.
$$\bar{e}_c(x, y) = n_c\big(e_c(x, n_c(y))\big) \quad \text{and} \quad \bar{e}_d(x, y) = n_d\big(e_d(x, n_d(y))\big).$$

Fig. 3.2 $\bar{e}_c(x, y)$ and $\bar{e}_d(x, y)$ with rational generators

Proposition 3.5. In a bounded system, the dual equivalence operators have the form
$$\bar{e}_c(x, y) = f_c^{-1}\big[1 - |f_c(x) + f_c(y) - 1|\big] \quad \text{and} \quad \bar{e}_d(x, y) = f_d^{-1}\big[|f_d(x) + f_d(y) - 1|\big].$$

Proof. The formulae can be derived by direct calculation.

Remark 3.3. Since $0 \le |f_c(x) + f_c(y) - 1| \le 1$ and $0 \le |f_d(x) + f_d(y) - 1| \le 1$, the cutting function can be omitted here. For conceptual reasons, we prefer to leave it in all of the formulae (Fig. 3.2).
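The closed form in Proposition 3.5 can be checked numerically against Definition 3.8. The following is a minimal sketch, not the authors' code: the generator $f_c(x) = 2 - 2^x$ is an assumption chosen only because it is strictly decreasing with $f_c(0) = 1$ and $f_c(1) = 0$, and the function names are hypothetical.

```python
import math


def cut(t):
    """Cutting function [t]: clamp t to the unit interval."""
    return min(1.0, max(0.0, t))


# Assumed example generator (not from the text): f_c(x) = 2 - 2**x,
# strictly decreasing with f_c(0) = 1 and f_c(1) = 0.
def f_c(x):
    return 2.0 - 2.0 ** x


def f_c_inv(t):
    return math.log2(2.0 - t)


def n_c(x):
    """Negation generated by f_c: n_c(x) = f_c^{-1}(1 - f_c(x))."""
    return f_c_inv(1.0 - f_c(x))


def e_c(x, y):
    """Equivalence e_c(x, y) = f_c^{-1}[|f_c(x) - f_c(y)|]."""
    return f_c_inv(cut(abs(f_c(x) - f_c(y))))


def dual_e_c(x, y):
    """Dual equivalence, closed form of Proposition 3.5."""
    return f_c_inv(cut(1.0 - abs(f_c(x) + f_c(y) - 1.0)))


def dual_e_c_by_definition(x, y):
    """Dual equivalence as in Definition 3.8: n_c(e_c(x, n_c(y)))."""
    return n_c(e_c(x, n_c(y)))
```

Under these assumptions, `dual_e_c(x, y)` and `dual_e_c_by_definition(x, y)` agree up to rounding, and `dual_e_c(x, n_c(x))` is zero up to rounding for every `x` in the unit interval.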

3.4.1 Properties of $\bar{e}_d$ and $\bar{e}_c$

Next, the main properties of the dual equivalences are studied.

Proposition 3.6. Let $\nu_c$ and $\nu_d$ be the fixpoints of $n_c$ and $n_d$, respectively. Then the operators $\bar{e}_c(x, y)$ and $\bar{e}_d(x, y)$ have the following properties:
1. Compatibility (see Definition 3.5).
2. Symmetry (see Definition 3.5).
3. $\bar{e}_c(x, y)$ and $\bar{e}_d(x, y)$ are not reflexive, but $\bar{e}_c(x, n_c(x)) = \bar{e}_d(x, n_d(x)) = 0$.
4. $\bar{e}_c(x, y)$ and $\bar{e}_d(x, y)$ are not monotonic.
5. $\bar{e}_c$ is T-transitive with respect to the conjunction $c$ (see Definition 3.6) and similarly, $\bar{e}_d$ is T-transitive with respect to the t-norm generated by $1 - f_d(x)$.
6. $\bar{e}_c(x, y)$ and $\bar{e}_d(x, y)$ are not threshold transitive with respect to $\nu_c$ and $\nu_d$ (see Definition 3.6).
7. Invariance with respect to $n_c$ and $n_d$ (see Definition 3.6).
8. $\bar{e}_c(1, x) = \bar{e}_d(1, x) = x$, $\bar{e}_d(0, x) = n_d(x)$, and similarly, $\bar{e}_c(0, x) = n_c(x)$.
9. $\bar{e}_c(x, y) = 0$ if and only if $x = n_c(y)$ and similarly, $\bar{e}_d(x, y) = 0$ if and only if $x = n_d(y)$.
10. $n_d(\bar{e}_d(x, y)) = \bar{e}_d(n_d(x), y)$ if and only if $x \in \{0, 1\}$ or $y \in \{0, 1\}$, and $n_c(\bar{e}_c(x, y)) = \bar{e}_c(n_c(x), y)$ if and only if $x \in \{0, 1\}$ or $y \in \{0, 1\}$.
11. $\bar{e}_c(x, \nu_c) \le \nu_c$ and $\bar{e}_d(x, \nu_d) \le \nu_d$.

Proof.
1. Using the formulae given in Proposition 3.5, compatibility is trivial.
2. Using the formulae given in Proposition 3.5, symmetry is trivial as well.
3. Follows from direct calculation. Since $\bar{e}_c(x, n_c(x)) = 0$ holds for the fixpoint $\nu_c$ of $n_c$ as well, reflexivity cannot hold. Similarly for $\bar{e}_d$.
4. A counterexample is the case of rational generators with $\nu_c = 0.3$: $\bar{e}_c(0.1, 0.6) \approx 0.75$, while $\bar{e}_c(0.4, 0.5) \approx 0.68$, and similarly $\bar{e}_d(0.4, 0.6) \approx 0.21$, while $\bar{e}_d(0.45, 0.5) \approx 0.19$.
5. By using the decreasing property of $f_c^{-1}$ and the fact that $|a + b - 1| + |b + c - 1| - 1 \le |a + c - 1|$ holds for all $a, b, c \in [0, 1]$, we obtain
$$c(\bar{e}_c(x, y), \bar{e}_c(y, z)) = f_c^{-1}\big[2 - |f_c(x) + f_c(y) - 1| - |f_c(y) + f_c(z) - 1|\big] \le f_c^{-1}\big[1 - |f_c(x) + f_c(z) - 1|\big] = \bar{e}_c(x, z).$$
The proof is similar for $\bar{e}_d$.
6. A possible counterexample for rational generators is $\nu_c = 0.3$, $x = 0.85$, $y = 0.9$ and $z = 0.87$, or $\nu_d = 0.3$, $x = 0.7$, $y = 0.9$ and $z = 0.6$.
7. $\bar{e}_c(n_c(x), n_c(y)) = f_c^{-1}\big[1 - |1 - f_c(x) + 1 - f_c(y) - 1|\big] = \bar{e}_c(x, y)$ and similarly, $\bar{e}_d(n_d(x), n_d(y)) = f_d^{-1}\big[|1 - f_d(x) + 1 - f_d(y) - 1|\big] = \bar{e}_d(x, y)$.
8. Using the fact that $f_c(1) = 0$, we get $\bar{e}_c(1, x) = f_c^{-1}\big[1 - |f_c(1) + f_c(x) - 1|\big] = x$. Similarly, using the fact that $f_c(0) = 1$ and that $0 \le f_c(x) \le 1$ for all $x \in [0, 1]$, we get $\bar{e}_c(0, x) = f_c^{-1}\big[1 - |f_c(0) + f_c(x) - 1|\big] = n_c(x)$. For $\bar{e}_d$, using the fact that $f_d(1) = 1$ and that $0 \le f_d(x) \le 1$ for all $x \in [0, 1]$, we get $\bar{e}_d(1, x) = f_d^{-1}\big[|f_d(1) + f_d(x) - 1|\big] = x$. Using the fact that $f_d(0) = 0$, we get $\bar{e}_d(0, x) = f_d^{-1}\big[|f_d(0) + f_d(x) - 1|\big] = n_d(x)$.
9. Using the fact that $f_c(n_c(x)) = 1 - f_c(x)$ and $f_d(n_d(x)) = 1 - f_d(x)$, we get $\bar{e}_c(x, n_c(x)) = f_c^{-1}(1 - 0) = 0$ and similarly $\bar{e}_d(x, n_d(x)) = f_d^{-1}(0) = 0$. If $\bar{e}_c(x, y) = 0$, then $f_c(x) + f_c(y) = 1$, from which $f_c(x) = 1 - f_c(y)$, i.e. $x = f_c^{-1}[1 - f_c(y)] = n_c(y)$. Similarly, if $\bar{e}_d(x, y) = 0$, then $f_d(x) + f_d(y) = 1$, from which $f_d(x) = 1 - f_d(y)$, i.e. $x = f_d^{-1}[1 - f_d(y)] = n_d(y)$.
10. $n_c(\bar{e}_c(x, y)) = f_c^{-1}\big(|f_c(x) + f_c(y) - 1|\big)$ and $\bar{e}_c(n_c(x), y) = f_c^{-1}\big(1 - |f_c(x) - f_c(y)|\big)$. Considering the four cases and using the monotonicity of $f_c(x)$, we get that $x \in \{0, 1\}$ or $y \in \{0, 1\}$. The proof for $\bar{e}_d(x, y)$ follows in a similar way.
11. Using the strict monotonicity of $f_c$, $f_d$ and their inverse functions, and the fact that $f_c(\nu_c) = f_d(\nu_d) = \frac{1}{2}$, the proof can be found by direct calculation.

Remark 3.4. $\bar{e}_c(x, y)$ and $\bar{e}_d(x, y)$ are not associative.

Proof. It is easy to find a counterexample, e.g. for rational generators with $\nu_c = 0.3$, $\bar{e}_c(0.3, \bar{e}_c(0.4, 0.5)) \approx 0.58$, while $\bar{e}_c(\bar{e}_c(0.3, 0.4), 0.5) \approx 0.16$. Similarly, $\bar{e}_d(0.1, \bar{e}_d(0.5, 0.7)) \approx 0.12$, while $\bar{e}_d(\bar{e}_d(0.1, 0.5), 0.7) \approx 0.03$.

Proposition 3.7. In a connective system, the above-defined dual equivalences $\bar{e}_c(x, y)$ and $\bar{e}_d(x, y)$ coincide if and only if $f_c(x) + f_d(x) = 1$ (or equivalently $n_c = n_d$, i.e. in a Łukasiewicz system), where $f_c$ and $f_d$ are the normalized generator functions of the conjunction and disjunction operators, respectively.

Proof.
1. If $f_c(x) + f_d(x) = 1$, then $f_c(x) = 1 - f_d(x)$ and $f_c^{-1}(x) = f_d^{-1}(1 - x)$, from which we get $\bar{e}_c(x, y) = f_c^{-1}\big[1 - |f_c(x) + f_c(y) - 1|\big] = f_d^{-1}\big[|1 - f_d(x) - f_d(y)|\big] = \bar{e}_d(x, y)$.
2. If $\bar{e}_c(x, y) = \bar{e}_d(x, y)$, then in particular $\bar{e}_c(0, x) = \bar{e}_d(x, 0)$, which means that $n_c(x) = n_d(x)$ must hold for all $x \in [0, 1]$.

3.5 Arithmetic Mean Operators in Bounded Systems

Let us define the so-called arithmetic mean operators in a bounded system.

Definition 3.9. In a connective system $(c, d, n)$,
$$m_c^{(\alpha)}(x, y) := f_c^{-1}\big[\alpha \cdot f_c(x) + (1 - \alpha) \cdot f_c(y)\big]$$
and similarly,
$$m_d^{(\alpha)}(x, y) := f_d^{-1}\big[\alpha \cdot f_d(x) + (1 - \alpha) \cdot f_d(y)\big],$$
where $f_c$ and $f_d$ are the normalized generator functions of the conjunction and disjunction operators, respectively, and $0 < \alpha < 1$. $m_c$ and $m_d$ are called weighted arithmetic mean operators.

Proposition 3.8. $m_c^{(\alpha)}(x, y)$ and $m_d^{(\alpha)}(x, y)$ satisfy the self-De Morgan property with respect to $n_c$ and $n_d$ respectively, i.e.
$$n_c\big(m_c^{(\alpha)}(x, y)\big) = m_c^{(\alpha)}(n_c(x), n_c(y))$$
and similarly,
$$n_d\big(m_d^{(\alpha)}(x, y)\big) = m_d^{(\alpha)}(n_d(x), n_d(y)).$$

Proof.
$$n_c\big(m_c^{(\alpha)}(x, y)\big) = f_c^{-1}\big[1 - (\alpha \cdot f_c(x) + (1 - \alpha) \cdot f_c(y))\big] = f_c^{-1}\big[\alpha \cdot (1 - f_c(x)) + (1 - \alpha) \cdot (1 - f_c(y))\big] = m_c^{(\alpha)}(n_c(x), n_c(y)).$$
For $m_d$, the proof is similar.
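The self-De Morgan property of Proposition 3.8 is easy to test numerically. The sketch below assumes the example generator $f_c(x) = 2 - 2^x$, which is not taken from the text, and uses hypothetical helper names.

```python
import math


# Assumed example generator (not from the text): strictly decreasing,
# f_c(0) = 1 and f_c(1) = 0.
def f_c(x):
    return 2.0 - 2.0 ** x


def f_c_inv(t):
    return math.log2(2.0 - t)


def n_c(x):
    """Negation generated by f_c."""
    return f_c_inv(1.0 - f_c(x))


def mean_c(x, y, alpha=0.5):
    """Weighted arithmetic mean operator of Definition 3.9."""
    return f_c_inv(alpha * f_c(x) + (1.0 - alpha) * f_c(y))


def self_de_morgan_gap(x, y, alpha):
    """|n_c(m(x, y)) - m(n_c(x), n_c(y))|; zero up to rounding."""
    return abs(n_c(mean_c(x, y, alpha)) - mean_c(n_c(x), n_c(y), alpha))
```

For any `alpha` strictly between 0 and 1, `self_de_morgan_gap` should stay at the level of floating-point rounding over the whole unit square.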

3.6 Aggregated Equivalences

Next, we define a new type of operator derived from the equivalences defined above. They are aggregated by using the arithmetic mean operators of Definition 3.9 with $\alpha = \frac{1}{2}$. This new operator is a compromise between the normal and the dual equivalences, i.e. it fulfils neither $e(x, x) = 1$ nor $e(x, n(x)) = 0$, but it has a nice property, namely $e(\nu, \nu) = \nu$. If we recall that the values represent uncertainties, and that $\nu$, as the fixpoint of the negation, means that we hesitate whether the objects A and B have a particular property or not, it is also sensible to remain unsure about their equivalence value. This new operator will be called the aggregated equivalence operator.

Definition 3.10. The aggregated equivalence operators are defined as follows.
$$e_c^*(x, y) = m_c^{(1/2)}\big(e_c(x, y), \bar{e}_c(x, y)\big),$$
$$e_d^*(x, y) = m_d^{(1/2)}\big(e_d(x, y), \bar{e}_d(x, y)\big).$$

Proposition 3.9. The aggregated equivalence operators in a bounded system have the form
$$e_c^*(x, y) = f_c^{-1}\left[\tfrac{1}{2}|f_c(x) - f_c(y)| + \tfrac{1}{2}\big(1 - |f_c(x) + f_c(y) - 1|\big)\right]$$
and
$$e_d^*(x, y) = f_d^{-1}\left[\tfrac{1}{2}\big(1 - |f_d(x) - f_d(y)|\big) + \tfrac{1}{2}|f_d(x) + f_d(y) - 1|\right].$$

Proof. It follows from direct calculation.

Proposition 3.10. The conjunctive aggregated equivalence operator has the following property:
$$e_c^*(x, y) = \begin{cases} n_c(y), & \text{if } x \le y \le n_c(x) \\ x, & \text{if } n_c(y) \le x \le y \\ n_c(x), & \text{if } y \le x \text{ and } y \le n_c(x) \\ y, & \text{if } y \le x \text{ and } y \ge n_c(x). \end{cases}$$

Proof.
1. If $x \le y \le n_c(x)$, then using the monotonicity of $f_c$ and the fact that $n_c(x) = f_c^{-1}(1 - f_c(x))$, we get $f_c(x) \ge f_c(y)$ and $f_c(x) + f_c(y) \ge 1$. In this case it means that $e_c^*(x, y) = n_c(y)$.
2. If $n_c(y) \le x \le y$, then using the monotonicity of $f_c$ and the fact that $n_c(x) = f_c^{-1}(1 - f_c(x))$, we get $f_c(x) \ge f_c(y)$ and $f_c(x) + f_c(y) \le 1$. In this case it means that $e_c^*(x, y) = x$.
3. If $y \le x$ and $y \le n_c(x)$, then we get $f_c(x) \le f_c(y)$ and $f_c(x) + f_c(y) \ge 1$. In this case $e_c^*(x, y) = n_c(x)$ follows.
4. If $y \le x$ and $y \ge n_c(x)$, then $f_c(x) \le f_c(y)$ and $f_c(x) + f_c(y) \le 1$. In this case it means that $e_c^*(x, y) = y$.

Proposition 3.11. The disjunctive aggregated equivalence operator has the following property:
$$e_d^*(x, y) = \begin{cases} n_d(y), & \text{if } x \le y \text{ and } x \le n_d(y) \\ x, & \text{if } n_d(y) \le x \le y \\ n_d(x), & \text{if } y \le x \text{ and } x \le n_d(y) \\ y, & \text{if } y \le x \text{ and } n_d(y) \le x. \end{cases}$$

Proof.
1. If $x \le y$ and $x \le n_d(y)$, then using the monotonicity of $f_d$ and the fact that $n_d(x) = f_d^{-1}(1 - f_d(x))$, we get $f_d(x) \le f_d(y)$ and $f_d(x) + f_d(y) \le 1$. In this case it follows that $e_d^*(x, y) = n_d(y)$.
2. If $n_d(y) \le x \le y$, then we get $f_d(x) \le f_d(y)$ and $f_d(x) + f_d(y) \ge 1$. In this case it follows that $e_d^*(x, y) = x$.
3. If $y \le x$ and $x \le n_d(y)$, then we get $f_d(x) \ge f_d(y)$ and $f_d(x) + f_d(y) \le 1$. In this case it follows that $e_d^*(x, y) = n_d(x)$.
4. If $y \le x$ and $n_d(y) \le x$, then $f_d(x) \ge f_d(y)$ and $f_d(x) + f_d(y) \ge 1$. In this case it follows that $e_d^*(x, y) = y$ (Figs. 3.3 and 3.4).
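The case analysis of Proposition 3.11 can be compared with the closed form of Proposition 3.9 on a grid. This is a sketch under an assumed generator $f_d(x) = 2^x - 1$ (increasing, $f_d(0) = 0$, $f_d(1) = 1$), which does not come from the text; the function names are hypothetical.

```python
import math


# Assumed example generator (not from the text).
def f_d(x):
    return 2.0 ** x - 1.0


def f_d_inv(t):
    return math.log2(1.0 + t)


def n_d(x):
    """Negation generated by f_d."""
    return f_d_inv(1.0 - f_d(x))


def agg_e_d(x, y):
    """Disjunctive aggregated equivalence, closed form of Proposition 3.9."""
    a, b = f_d(x), f_d(y)
    return f_d_inv(0.5 * (1.0 - abs(a - b)) + 0.5 * abs(a + b - 1.0))


def agg_e_d_piecewise(x, y):
    """The same operator via the four cases of Proposition 3.11."""
    if x <= y:
        return n_d(y) if x <= n_d(y) else x
    return n_d(x) if x <= n_d(y) else y
```

With the fixpoint `nu = f_d_inv(0.5)`, the same functions also illustrate that `agg_e_d(nu, nu)` and `agg_e_d(x, nu)` return `nu` up to rounding.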

Fig. 3.3 The domain of aggregated equivalences

3.6.1 Properties of the Aggregated Equivalence Operator

Next, the main properties of the aggregated equivalences are examined. In Propositions 3.12 and 3.14, we will show that unlike the above-mentioned equivalences, the aggregated equivalences are threshold transitive and associative as well.

Proposition 3.12. Let $\nu_c$ and $\nu_d$ be the fixpoints of $n_c$ and $n_d$, respectively. The aggregated equivalences have the following properties:
1. Compatibility (see Definition 3.5).
2. Symmetry (see Definition 3.5).
3. The aggregated equivalences are not reflexive, but $e_c^*(\nu_c, \nu_c) = \nu_c$ and $e_d^*(\nu_d, \nu_d) = \nu_d$ hold. In addition,
$$e_c^*(x, x) = \begin{cases} n_c(x), & \text{if } x \le \nu_c \\ x, & \text{if } x \ge \nu_c \end{cases}$$
and similarly,
$$e_d^*(x, x) = \begin{cases} n_d(x), & \text{if } x \le \nu_d \\ x, & \text{if } x \ge \nu_d. \end{cases}$$
4. Monotonicity (see Definition 3.5).
5. $e_c^*$ is T-transitive with respect to the conjunction $c$ (see Definition 3.6) and similarly, $e_d^*$ is T-transitive with respect to the t-norm generated by $1 - f_d(x)$.
6. The aggregated equivalences are threshold transitive with respect to $\nu_c$ and $\nu_d$ (see Definition 3.6).
7. Invariance with respect to $n_c$ and $n_d$ (see Definition 3.6).
8. $e_c^*(1, x) = e_d^*(1, x) = x$, $e_d^*(0, x) = n_d(x)$, and similarly, $e_c^*(0, x) = n_c(x)$.
9. $e_c^*(x, y) = 0$ if and only if $x, y \in \{0, 1\}$ and $x \ne y$. Similarly for $e_d^*$.
10. $n_c(e_c^*(x, y)) = e_c^*(n_c(x), y)$ if and only if $x \in \{0, 1\}$ or $y \in \{0, 1\}$, and $n_d(e_d^*(x, y)) = e_d^*(n_d(x), y)$ if and only if $x \in \{0, 1\}$ or $y \in \{0, 1\}$.
11. $e_c^*(x, \nu_c) = \nu_c$ and similarly, $e_d^*(x, \nu_d) = \nu_d$.

Proof.
1. It follows from direct calculation.
2. The proof is trivial.
3. The statement follows from Propositions 3.10 and 3.11.
4. We will demonstrate monotonicity for $e_c^*$. For $e_d^*$, the proof is similar. If $x \le x' \le y' \le y$, then by Proposition 3.10 we have to consider two cases.
(a) $y \le n_c(x)$. In this case $e_c^*(x, y) = n_c(y)$.

i. If $y' \le n_c(x')$, then $e_c^*(x', y') = n_c(y')$, which means that $e_c^*(x, y) \le e_c^*(x', y')$.
ii. If $y' \ge n_c(x')$, then $e_c^*(x', y') = x'$ and $n_c(y) \le n_c(y') \le x'$, so $e_c^*(x, y) \le e_c^*(x', y')$.
(b) $y \ge n_c(x)$. In this case $e_c^*(x, y) = x$.
i. If $y' \ge n_c(x')$, then $e_c^*(x', y') = x'$, which means that $e_c^*(x, y) \le e_c^*(x', y')$.
ii. If $y' \le n_c(x')$, then $e_c^*(x', y') = n_c(y')$ and $n_c(y') \ge x' \ge x$, so $e_c^*(x, y) \le e_c^*(x', y')$.
5. By using the decreasing property of $f_c^{-1}$ and the fact that $|a - b| - |a + b - 1| + |b - c| - |b + c - 1| + 1 \ge |a - c| - |a + c - 1|$ holds for all $a, b, c \in [0, 1]$, the statement follows from direct calculation. The proof is similar for $e_d^*$.
6. We will prove the threshold transitivity for $e_c^*$. For $e_d^*$, the proof is similar. The condition $e_c^*(x, y) \ge \nu_c$ is equivalent to the inequality
$$f_c^{-1}\left[\tfrac{1}{2}|f_c(x) - f_c(y)| + \tfrac{1}{2}\big(1 - |f_c(x) + f_c(y) - 1|\big)\right] \ge \nu_c,$$
which means that $|f_c(x) - f_c(y)| \le |f_c(x) + f_c(y) - 1|$. This means that either $f_c(x), f_c(y) \le \frac{1}{2}$, or $f_c(x), f_c(y) \ge \frac{1}{2}$ must hold, i.e. either $x, y \ge \nu_c$, or $x, y \le \nu_c$. Together with the condition $e_c^*(y, z) \ge \nu_c$, we also have that $y, z \ge \nu_c$, or $y, z \le \nu_c$, from which we easily get that either $x, z \ge \nu_c$, or $x, z \le \nu_c$ must hold, i.e. $e_c^*(x, z) \ge \nu_c$.
7. This follows from direct calculation.
8. This follows from the properties of $e_c$, $\bar{e}_c$, $e_d$, and $\bar{e}_d$.
9. The statement follows from Propositions 3.10 and 3.11.
10. This follows from direct calculation.
11. $e_c^*(x, \nu_c) = f_c^{-1}\left[\tfrac{1}{2}\left|f_c(x) - \tfrac{1}{2}\right| + \tfrac{1}{2}\left(1 - \left|f_c(x) - \tfrac{1}{2}\right|\right)\right] = f_c^{-1}\left(\tfrac{1}{2}\right) = \nu_c$. The proof is similar for $e_d^*$.

Fig. 3.4 Aggregated equivalences with rational generators with $\nu = 0.3$

Remark 3.5. Note that from the third property in Proposition 3.12, it follows immediately that $e_c^*(x, x) \ge \nu_c$, and similarly $e_d^*(x, x) \ge \nu_d$.

Proposition 3.13. $e_c^*(x, y) > \nu_c$ if and only if $x, y > \nu_c$ or $x, y < \nu_c$; $e_c^*(x, y) = \nu_c$ if and only if $x = \nu_c$ or $y = \nu_c$; and $e_c^*(x, y) < \nu_c$ otherwise. The analogous statement holds for $e_d^*(x, y)$.

Proof. The statement readily follows from Propositions 3.10 and 3.11.

Remark 3.6. Note that $e_c^*$ and $e_d^*$, considered as fuzzy binary relations on $[0, 1]$, are both c-transitive (see [26]).

Proposition 3.14. $e_c^*$ and $e_d^*$ are associative.

Proof. Let us consider $e_d^*(x, y)$. First, we will show that associativity holds in the case where $f_d(x) = x$. Let us use the following notation for the disjunctive aggregated equivalence for $f_d(x) = x$:
$$L(x, y) := e_d^*(x, y) = \tfrac{1}{2}\big(|x + y - 1| - |x - y| + 1\big).$$
It can be shown that $L(x, y) = \min(\max(1 - x, y), \max(x, 1 - y))$. From this, we get
$$L(x, L(y, z)) = \min\big(\max(x, y, z), \max(x, 1 - y, 1 - z), \max(1 - x, y, 1 - z), \max(1 - x, 1 - y, z)\big) = L(L(x, y), z),$$
which means that $L(x, y)$ is associative. In particular, for an arbitrary generator function $f_d$,
$$f_d^{-1}\big(L(f_d(x), L(f_d(y), f_d(z)))\big) = f_d^{-1}\big(L(L(f_d(x), f_d(y)), f_d(z))\big)$$
also holds. Since
$$e_d^*(x, y) = f_d^{-1}\left[\tfrac{1}{2}\big(1 - |f_d(x) - f_d(y)|\big) + \tfrac{1}{2}|f_d(x) + f_d(y) - 1|\right] = f_d^{-1}\big(L(f_d(x), f_d(y))\big),$$
associativity of $e_d^*(x, y)$ is proved. The proof for $e_c^*$ is similar.
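The two identities used in this proof can be spot-checked on a grid. The following sketch only restates the formulas for $L$ given above; the function names are illustrative.

```python
def L(x, y):
    """L(x, y) = (|x + y - 1| - |x - y| + 1) / 2."""
    return 0.5 * (abs(x + y - 1.0) - abs(x - y) + 1.0)


def L_minmax(x, y):
    """Equivalent min-max form: min(max(1 - x, y), max(x, 1 - y))."""
    return min(max(1.0 - x, y), max(x, 1.0 - y))


def L_three(x, y, z):
    """Symmetric three-argument form appearing in the proof."""
    return min(
        max(x, y, z),
        max(x, 1.0 - y, 1.0 - z),
        max(1.0 - x, y, 1.0 - z),
        max(1.0 - x, 1.0 - y, z),
    )


def associativity_gap(x, y, z):
    """|L(x, L(y, z)) - L(L(x, y), z)|; zero up to rounding."""
    return abs(L(x, L(y, z)) - L(L(x, y), z))
```

A grid check of this kind is not a proof, but it is a quick way to catch a sign error when the formula for $L$ is transcribed.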


Proposition 3.15. In a connective system, the above-defined equivalences $e_c^*(x, y)$ and $e_d^*(x, y)$ coincide if and only if $f_c(x) + f_d(x) = 1$ (or equivalently $n_c = n_d$, i.e. in a Łukasiewicz system), where $f_c$ and $f_d$ are the normalized generator functions of the conjunction and disjunction operators, respectively.

Proof.
1. If $f_c(x) + f_d(x) = 1$, then using the fact that $f_c(x) = 1 - f_d(x)$ and $f_c^{-1}(x) = f_d^{-1}(1 - x)$, we get $e_c^*(x, y) = e_d^*(x, y)$.
2. If $e_c^*(x, y) = e_d^*(x, y)$, then in particular $e_c^*(0, x) = e_d^*(x, 0)$, which means that $n_c(x) = n_d(x)$ must hold for all $x \in [0, 1]$.

3.7 Applications

In signal and image processing, verifying the equivalence of two signals or two images is always of great importance. Let us assume that two grey level images, i.e. two integer-valued functions $f$ and $g$ defined on a subinterval $I$ of $\mathbb{Z}^2$, are given. After normalizing $f$ and $g$, the equivalence of the images can be calculated in each picture element (pixel) of $I$ by using the equivalence operators considered above. For simplicity, let us assume that $I = \{1, \ldots, n\}^2$, and let us use the following notations: $x_{i,j} := f(i, j)$ and $y_{i,j} := g(i, j)$. The overall equivalence of the two images (which measures the overlap) can be calculated by an arithmetic mean in the following way.

Definition 3.11. Let us consider two normalized grey level images, $f, g : I \to [0, 1]$, where $I = \{1, \ldots, n\}^2$. Their overall equivalence $E$ is defined in the following way:
$$E(f, g) := \frac{1}{n^2} \sum_{i,j=1}^{n} e(x_{i,j}, y_{i,j}),$$
where $x_{i,j} = f(i, j)$ and $y_{i,j} = g(i, j)$, and $e$ stands for one of the equivalences considered so far.

The overall equivalence can be defined for one-dimensional signals similarly. Note that for values around the middle grey level, the aggregated equivalences $e_c^*$ and $e_d^*$ give the maximal level of uncertainty, which gives them an important semantic meaning. Therefore, when studying the equivalence of two grey level images, the aggregated equivalences are of great importance.
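As an illustration of Definition 3.11, the sketch below computes the overall equivalence of two small images stored as nested lists. It assumes the simplest generator $f_d(x) = x$, for which $e_d(x, y) = 1 - |x - y|$ and $e_d^*(x, y) = \frac{1}{2}(|x + y - 1| - |x - y| + 1)$; the 8-bit normalization and all function names are assumptions made for the example.

```python
def e_d(x, y):
    """Equivalence for the assumed generator f_d(x) = x."""
    return 1.0 - abs(x - y)


def e_d_agg(x, y):
    """Aggregated equivalence for the assumed generator f_d(x) = x."""
    return 0.5 * (abs(x + y - 1.0) - abs(x - y) + 1.0)


def normalize(img, max_level=255):
    """Map integer grey levels to [0, 1] (8-bit range assumed)."""
    return [[p / max_level for p in row] for row in img]


def overall_equivalence(img_f, img_g, e):
    """Arithmetic mean of the pixelwise equivalences e(x_ij, y_ij)."""
    same_shape = len(img_f) == len(img_g) and all(
        len(r) == len(s) for r, s in zip(img_f, img_g)
    )
    if not same_shape:
        raise ValueError("images must have the same shape")
    values = [
        e(x, y)
        for row_f, row_g in zip(img_f, img_g)
        for x, y in zip(row_f, row_g)
    ]
    return sum(values) / len(values)
```

Averaging over all pixels equals the $\frac{1}{n^2}$ normalization for a square image. With `e_d_agg`, comparing any image with a uniformly mid-grey image (all values 0.5) gives an overall equivalence of 0.5, which matches the remark about maximal uncertainty around the middle grey level.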

3.8 Summary

In this chapter, equivalences in bounded systems were examined. Three different types of operators were studied, and a paradox of the equivalence (i.e. there is no equivalence relation in a non-Boolean setting which fulfils both $e(x, x) = 1$ for all $x$ and $e(x, n(x)) = 0$) is resolved by aggregating the implication-based equivalence and its dual operator. We also showed that the aggregated equivalence has nice properties such as associativity, threshold transitivity and T-transitivity. In Figs. 3.5, 3.6 and 3.7, examples of the pointwise equivalence of two fuzzy numbers are illustrated by means of all the above-mentioned equivalences. In the next chapter, we will focus on the modifiers and membership functions.

Fig. 3.5 Pointwise equivalence of fuzzy numbers with rational generators ($\nu_c = \nu_d = 0.3$)

Fig. 3.6 Pointwise dual equivalence of fuzzy numbers with rational generators ($\nu_c = \nu_d = 0.3$)

Fig. 3.7 Pointwise aggregated equivalence of triangular fuzzy numbers with rational generators ($\nu = 0.6$)

References

1. Zadeh, L.: Fuzzy sets. Inf. Control 8, 338–353 (1965). https://doi.org/10.1016/S0019-9958(65)90241-X. ISSN 0019-9958
2. Zadeh, L.: Similarity relations and fuzzy orderings. Inf. Sci. 3, 177–200 (1971). https://doi.org/10.1016/S0020-0255(71)80005-1. ISSN 0020-0255
3. Chakraborti, M.K., Das, M.: On fuzzy equivalence 1. Fuzzy Sets Syst. 11, 185–193 (1983)
4. Chakraborti, M.K., Das, M.: On fuzzy equivalence 2. Fuzzy Sets Syst. 11, 299–307 (1983)
5. Murali, V.: Fuzzy equivalence relations. Fuzzy Sets Syst. 30, 155–163 (1989)
6. Nemitz, W.C.: Fuzzy relations and fuzzy functions. Fuzzy Sets Syst. 19, 177–191 (1986)
7. Gupta, K., Gupta, R.: Fuzzy equivalence relation redefined. Fuzzy Sets Syst. 79, 227–233 (1996). https://doi.org/10.1016/0165-0114(95)00155-7. ISSN 0165-0114
8. Yeh, R.T.: Toward an algebraic theory of fuzzy relational systems. Technical report, USA (1973)
9. Chon, I.: -fuzzy equivalence relations. Kangweon-Kyungki Math. J. 14(1), 71–77 (2006)
10. Boixader, D., Jacas, J., Recasens, J.: Transitive closure and betweenness relations. Fuzzy Sets Syst. 120, 415–422 (2001). https://doi.org/10.1016/S0165-0114(99)00133-5. ISSN 0165-0114
11. Boixader, D.: On the relationship between T-transitivity and approximate equality. Fuzzy Sets Syst. 133, 161–169 (2003). https://doi.org/10.1016/S0165-0114(02)00241-5. ISSN 0165-0114
12. De Cock, M., Kerre, E.: On (un)suitable fuzzy relations to model approximate equality. Fuzzy Sets Syst. 133(2), 137–153 (2003). https://doi.org/10.1016/S0165-0114(02)00239-7. http://www.sciencedirect.com/science/article/pii/S0165011402002397. ISSN 0165-0114
13. Demirci, M., Recasens, J.: Fuzzy groups, fuzzy functions and fuzzy equivalence relations. Fuzzy Sets Syst. 144, 441–458 (2004). https://doi.org/10.1016/S0165-0114(03)00301-4. ISSN 0165-0114
14. Mesiar, R., Reusch, B., Thiele, H.: Fuzzy equivalence relations and fuzzy partitions. J. Multiple Valued Log. Soft Comput. 12, 167–181 (2006)
15. Schmechel, N.: On lattice-isomorphism between fuzzy equivalence relations and fuzzy partitions. In: Proceedings 25th International Symposium on Multiple-Valued Logic, pp. 146–151 (1995)
16. Thiele, H., Schmechel, N.: On the mutual definability of fuzzy tolerance relations and fuzzy tolerance coverings. In: Proceedings 25th International Symposium on Multiple-Valued Logic, pp. 1383–1390 (1995)
17. Jayaram, B., Mesiar, R.: I-Fuzzy equivalence relations and I-fuzzy partitions. Inf. Sci. 179(9), 1278–1297 (2009). https://doi.org/10.1016/j.ins.2008.12.027. ISSN 0020-0255
18. Beg, I., Ashraf, S.: Fuzzy equivalence relations. Kuwait J. Sci. Eng. 35(1A), 191–206 (2008)
19. Ali, M.I., Feng, F., Shabir, M.: A note on (, ∨ q)-fuzzy equivalence relations and indistinguishability operators. Hacettepe J. Math. Stat. 40, 383–400 (2011). ISSN 2651-477X
20. De Baets, B., Mesiar, R.: T-partitions. Fuzzy Sets Syst. 97(2), 211–223 (1998). https://doi.org/10.1016/S0165-0114(96)00331-4. http://www.sciencedirect.com/science/article/pii/S0165011496003314. ISSN 0165-0114
21. Klement, E.P., Navara, M.: A survey on different triangular norm-based fuzzy logics. Fuzzy Sets Syst. 101, 241–251 (1999)
22. Beliakov, G., Pradera, A., Calvo, T.: Aggregation functions: a guide for practitioners. In: Studies in Fuzziness and Soft Computing, p. 375. Springer, Heidelberg (2007)
23. Dombi, J., Csiszár, O.: The general nilpotent operator system. Fuzzy Sets Syst. 261, 1–19 (2015)
24. Dombi, J., Csiszár, O.: Equivalence operators in nilpotent systems. Fuzzy Sets Syst. 299, 113–129 (2016)
25. Dombi, J., Csiszár, O.: Implications in bounded systems. Inf. Sci. 283, 229–240 (2014)
26. Fodor, J., Roubens, M.: Fuzzy preference modelling and multicriteria decision support (1994)
27. Dombi, J.: Equivalence operators that are associative. Inf. Sci. 281, 281–294 (2014). https://doi.org/10.1016/j.ins.2014.05.027. http://linkinghub.elsevier.com/retrieve/pii/S0020025514005738. ISSN 0020-0255

Chapter 4

Modifiers and Membership Functions in Fuzzy Sets

Abstract In fuzzy theory, modalities (like possibly and necessarily) and hedges (like quite, very, and extremely) are the most commonly examined unary operators. Here, we introduce two reasonable approaches for defining these unary operators: by repeating the arguments of many-variable operators and by using compositions of negations. This way, hedges and also modalities can be viewed as a part of a logical system. We show that membership functions, which play a substantial role in the overall performance of fuzzy representation, can also be defined using a generator function. In the literature, the membership functions are usually chosen independently of the logical operators of the system, and parameters are normally fine-tuned based on purely experimental results. Here, we suggest how modifiers and membership functions can be linked to the logical operators of the system. This unified framework will be useful for improving the interpretability of neural computations in Chap. 9.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. J. Dombi and O. Csiszár, Explainable Neural Networks Based on Fuzzy Logic and Multi-criteria Decision Tools, Studies in Fuzziness and Soft Computing 408, https://doi.org/10.1007/978-3-030-72280-7_4

4.1 Introduction

Negation operators were studied thoroughly in Sect. 1, as they play a significant role in logical systems by building connections between the main operators (De Morgan law) and by characterising their basic properties. Despite the significance of unary operators, the literature on unary operators other than negations is limited compared with that on multivariable operators. In fuzzy theory, modalities (like possibly, necessarily, ...) and hedges (like very, quite, extremely, ...) are the most studied unary operators, which modify the linguistic variables [1–8].

In this chapter, the focus is on the unary operators of a nilpotent logical system [9]. They perform various operations such as incrementing or decrementing a value and they can be widely used to express modalities and hedges in human thinking [10]. Here, our main purpose is to consider the main unary operators of a nilpotent logical system in an integral framework and to uncover the underlying general structure of all the operators considered so far [9]. This allows us to provide a generally applicable system, where all the operators are connected to each other, and the modalities and hedges are operator-dependent. In such a system, only a few parameters are to be given. By fitting the parameter values, the system can be used to model real-life problems.

First, in Sect. 4.2.1, a possible way of constructing unary operators is considered: repeating the argument in multivariable operators, i.e. choosing $x_i = x_j$ ($\forall i, j$) for the arguments of the many-variable operators. This is how it can be guaranteed that the operators are connected. In Sect. 4.2.2, our focus is on the drastic unary operators, in Sect. 4.2.3 on the composition rules, and then in Sect. 4.2.4, we show how the multivariable operators can be derived from unary ones. This result highlights the importance of the unary operators in a logical system. In Sect. 4.2.5, a general framework is given for all the operators discussed so far. In Sect. 4.3, we introduce unary operators derived as a composition of negation operators [11].

Membership functions, which play a substantial role in the overall performance of fuzzy representation, can also be defined by means of a generator function. In the literature, the membership functions are usually chosen independently of the logical operators of the system. Parameters are normally fine-tuned on the basis of empirical results. Now, we will suggest how modifiers and membership functions can be connected to the logical operators of the system. Using operator-dependent membership functions makes it possible to build up a system by using a single generator function and some parameters. Moreover, it can provide a theoretical explanation for the choice of membership functions and modifiers. In Sects. 4.4 and 4.5, we suggest a new way of creating membership and non-membership functions. Lastly, in Sect. 4.6, the main results are summarized.

4.2 Unary Operators in Nilpotent Logical Systems

In the early 1970s, Zadeh [12] introduced a class of powering modifiers, which defined the concept of linguistic variables and hedges (like very, quite, extremely, ...). He proposed computing with words as an extension of fuzzy sets and logic theory and introduced modifier functions of fuzzy sets called linguistic hedges, which change the meaning of the primary terms. As pointed out by Zadeh, linguistic variables and terms are closer to human thinking and therefore, words and linguistic terms can be used to model human thinking systems [13]. Hedges and also modalities (like possibly, necessarily, ...) are the most examined unary operators. From a semantic viewpoint, these unary operators can also be viewed as a part of a logical system.

In this section, two possible ways of extending a nilpotent logical system by defining the necessity and possibility operators are examined. The novelty of these two methods lies in the fact that they provide a logical system where all the operators are connected to each other. The possibility and necessity operators have to satisfy the following equations:
$$\text{impossible}(x) = \text{necessity}(\text{not}(x)) \tag{4.1}$$
and
$$\text{possible}(x) = \text{not}(\text{impossible}(x)). \tag{4.2}$$

4.2.1 Possibility and Necessity as Unary Operators Derived from Multivariable Operators

A possible way of obtaining unary operators is by choosing $x_i = x_j$ ($\forall i, j$) for the arguments of the many-variable operators. Based on the De Morgan property of the conjunction and the disjunction, the unary operators derived from them satisfy the above two equations.

Definition 4.1. Let $k \in \mathbb{N}$, $\lambda \in \mathbb{R}^+$, $\lambda > 1$, let $f : [0, 1] \to [0, 1]$ be an increasing bijection and let us define the so-called necessity operator $\tau_N^{(k)}(x) : [0, 1] \to [0, 1]$ in the following way:
$$\tau_N^{(k)}(x) := c(\underbrace{x, x, \ldots, x}_{k\text{ times}}) = f^{-1}\big[k(f(x) - 1) + 1\big], \tag{4.3}$$
and the generalized necessity operator $\tau_N^{(\lambda)}(x) : [0, 1] \to [0, 1]$ as
$$\tau_N^{(\lambda)}(x) := f^{-1}\big[\lambda(f(x) - 1) + 1\big] = f^{-1}\big[\lambda f(x) - (\lambda - 1)\big], \tag{4.4}$$
where $c$ is the conjunction generated by $f_c(x) = 1 - f(x)$.

Similarly, the so-called possibility operator can also be defined by means of the disjunction operator.

Definition 4.2. Let $k \in \mathbb{N}$, $\lambda \in \mathbb{R}^+$, $\lambda > 1$, let $f : [0, 1] \to [0, 1]$ be an increasing bijection and let us define the so-called possibility operator $\tau_P^{(k)}(x) : [0, 1] \to [0, 1]$ in the following way:
$$\tau_P^{(k)}(x) := d(\underbrace{x, x, \ldots, x}_{k\text{ times}}) = f^{-1}\big[k f(x)\big], \tag{4.5}$$
and the generalized possibility operator $\tau_P^{(\lambda)}(x) : [0, 1] \to [0, 1]$ as
$$\tau_P^{(\lambda)}(x) := f^{-1}\big[\lambda f(x)\big], \tag{4.6}$$
where $d$ is the disjunction generated by $f(x)$.

Next, the so-called sharpness operator is defined, based on the self-duality of the aggregative operator.

Fig. 4.1 Unary operators generated by $f(x) = x$

Definition 4.3. Let $k \in \mathbb{N}$, $\lambda \in \mathbb{R}^+$, $\lambda > 1$, let $f : [0, 1] \to [0, 1]$ be an increasing bijection and let us define the so-called sharpness operator $\tau_S^{(k)}(x) : [0, 1] \to [0, 1]$ in the following way:
$$\tau_S^{(k)}(x) := a(\underbrace{x, x, \ldots, x}_{k\text{ times}}) = f^{-1}\left[k f(x) - \frac{k - 1}{2}\right], \tag{4.7}$$
and the generalized sharpness operator $\tau_S^{(\lambda)}(x) : [0, 1] \to [0, 1]$ as
$$\tau_S^{(\lambda)}(x) := f^{-1}\left[\lambda f(x) - \frac{\lambda - 1}{2}\right], \tag{4.8}$$
where $a$ is the aggregative operator generated by $f(x)$.

The three definitions above can be summarized in a unified formula.

Definition 4.4. Let $\lambda \in \mathbb{R}^+$, $\lambda > 1$, $\nu \in [0, 1]$, and let $f : [0, 1] \to [0, 1]$ be an increasing bijection. Let us define the unary operator $\tau_\nu^{(\lambda)}(x)$ in the following way:
$$\tau_\nu^{(\lambda)}(x) := f^{-1}\big[\lambda(f(x) - f(\nu)) + f(\nu)\big]. \tag{4.9}$$

Remark 4.1. For $\nu = 1$, $\nu = 0$ and $\nu = \nu^*$ (i.e. $f(\nu) = \frac{1}{2}$), we get the necessity, the possibility and the sharpness operators, respectively.

The above-defined unary operators fulfill the following De Morgan identities (see Eqs. 4.1 and 4.2).

Proposition 4.1. Let $f : [0, 1] \to [0, 1]$ be an increasing bijection and let $n(x)$ be the negation generated by $f(x)$. Then
$$n\big(\tau_N^{(\lambda)}(x)\big) = \tau_P^{(\lambda)}(n(x)), \tag{4.10}$$
$$n\big(\tau_P^{(\lambda)}(x)\big) = \tau_N^{(\lambda)}(n(x)), \tag{4.11}$$
$$n\big(\tau_S^{(\lambda)}(x)\big) = \tau_S^{(\lambda)}(n(x)). \tag{4.12}$$

Proof. The proof is similar in all three cases. Let us prove the first statement. Taking into account the fact that $1 - [x] = [1 - x]$,
$$n\big(\tau_N^{(\lambda)}(x)\big) = f^{-1}\big[1 - [\lambda f(x) - (\lambda - 1)]\big] = f^{-1}\big[\lambda(1 - f(x))\big] = \tau_P^{(\lambda)}(n(x)).$$

Proposition 4.2. $\tau_\nu^{(\lambda)}(x)$ is increasing for all $\nu \in [0, 1]$. Let $x = x_1$ be the greatest value for which $\tau_\nu^{(\lambda)}(x) = 0$, and let $x = x_2$ be the lowest value for which $\tau_\nu^{(\lambda)}(x) = 1$. In this case
$$x_1 = f^{-1}\left(\frac{\lambda - 1}{\lambda} f(\nu)\right) \tag{4.13}$$
and
$$x_2 = f^{-1}\left(\frac{\lambda - 1}{\lambda} f(\nu) + \frac{1}{\lambda}\right). \tag{4.14}$$

Proof. The monotonicity follows from the monotonicity of $f(x)$. To find $x_1$ and $x_2$, the following two equations need to be solved: $\lambda(f(x) - f(\nu)) + f(\nu) = 0$ and $\lambda(f(x) - f(\nu)) + f(\nu) = 1$. The solution follows from a direct calculation.

The values $x_1$ and $x_2$ in Proposition 4.2 for $\nu = 1$, $\nu = 0$ and $\nu = \nu^*$ can be found in Table 4.1.

Table 4.1 $x_1$ and $x_2$ values for $\nu = 1$, $\nu = 0$ and $\nu = \nu^*$
Operator | $\nu$ | $x_1$ | $x_2$
$\tau_N^{(\lambda)}(x)$ | $1$ | $f^{-1}\left(1 - \frac{1}{\lambda}\right)$ | $1$
$\tau_P^{(\lambda)}(x)$ | $0$ | $0$ | $f^{-1}\left(\frac{1}{\lambda}\right)$
$\tau_S^{(\lambda)}(x)$ | $\nu^*$ | $f^{-1}\left(\frac{\lambda - 1}{2\lambda}\right)$ | $f^{-1}\left(\frac{\lambda + 1}{2\lambda}\right)$
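The thresholds of Proposition 4.2 can be evaluated for a concrete generator. The sketch below assumes $f(x) = 2^x - 1$, an increasing bijection of $[0, 1]$ chosen purely for illustration and not taken from the text; the function names are hypothetical.

```python
import math


def cut(t):
    """Cutting function [t]: clamp t to the unit interval."""
    return min(1.0, max(0.0, t))


# Assumed example generator (not from the text).
def f(x):
    return 2.0 ** x - 1.0


def f_inv(t):
    return math.log2(1.0 + t)


def tau(x, lam, nu):
    """Unified unary operator of Eq. (4.9)."""
    return f_inv(cut(lam * (f(x) - f(nu)) + f(nu)))


def thresholds(lam, nu):
    """x1 and x2 as given in Eqs. (4.13) and (4.14)."""
    base = (lam - 1.0) / lam * f(nu)
    return f_inv(base), f_inv(base + 1.0 / lam)
```

Calling `thresholds(lam, nu)` with `nu` set to `1.0`, `0.0` and `f_inv(0.5)` reproduces the three rows of Table 4.1 for this generator, and `tau` takes values strictly between 0 and 1 only between the two thresholds.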

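Proposition 4.3. Let $\nu^* = f^{-1}\left(\frac{1}{2}\right)$. Then
$$\tau_N^{(\lambda)}(\nu^*) = f^{-1}\left[1 - \frac{\lambda}{2}\right], \tag{4.15}$$
$$\tau_P^{(\lambda)}(\nu^*) = f^{-1}\left[\frac{\lambda}{2}\right] \tag{4.16}$$
and
$$\tau_S^{(\lambda)}(\nu^*) = \nu^*. \tag{4.17}$$

Proof. The statements follow from direct calculations.

Table 4.2 $x_1$ and $x_2$ values for $f(x) = x$
Operator | $\nu$ | $x_1$ | $x_2$
$\tau_N^{(\lambda)}(x)$ | $1$ | $1 - \frac{1}{\lambda}$ | $1$
$\tau_P^{(\lambda)}(x)$ | $0$ | $0$ | $\frac{1}{\lambda}$
$\tau_S^{(\lambda)}(x)$ | $\nu^*$ | $\frac{\lambda - 1}{2\lambda}$ | $\frac{\lambda + 1}{2\lambda}$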

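Remark 4.2. Note that $\nu^*$ is a fixpoint of the sharpness operator $\tau_S^{(\lambda)}(x)$.

Next, let us consider the case $f(x) = x$.

Remark 4.3. In particular, for $f(x) = x$,

$$\tau_N^{(\lambda)}(x) = \min\left(1, \max\left(0, \lambda x - (\lambda - 1)\right)\right), \tag{4.18}$$

$$\tau_P^{(\lambda)}(x) = \min\left(1, \max\left(0, \lambda x\right)\right), \tag{4.19}$$

$$\tau_S^{(\lambda)}(x) = \min\left(1, \max\left(0, \lambda x - \frac{\lambda - 1}{2}\right)\right). \tag{4.20}$$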

In Fig. 4.1, unary operators generated by $f(x) = x$ are shown. For the values $x_1$ and $x_2$, see Table 4.2.

Remark 4.4. As can be seen, for $f(x) = x$, the unary operators $\tau_I^{(\lambda)}(x)$ ($I \in \{N, P, S\}$) have a value in $(0, 1)$ if and only if $x \in (x_1, x_2)$. Note that the length of this interval is $x_2 - x_1 = \frac{1}{\lambda}$.
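The linear case lends itself to a short numerical illustration. The following Python sketch assumes the identity generator $f(x) = x$ and the general form $\tau_\nu^{(\lambda)}(x) = [\lambda(f(x) - f(\nu)) + f(\nu)]$ used in the proof of Proposition 4.2. The function names (`cut`, `tau`, `transition_interval`) are illustrative choices and do not come from the text.

```python
def cut(t):
    """Cutting function [t] = min(1, max(0, t))."""
    return min(1.0, max(0.0, t))


def tau(x, lam, nu):
    """tau_nu^(lam)(x) for the identity generator f(x) = x."""
    return cut(lam * (x - nu) + nu)


def necessity(x, lam):
    """nu = 1, Eq. (4.18): cut(lam * x - (lam - 1))."""
    return tau(x, lam, 1.0)


def possibility(x, lam):
    """nu = 0, Eq. (4.19): cut(lam * x)."""
    return tau(x, lam, 0.0)


def sharpness(x, lam):
    """nu = nu* = 1/2, Eq. (4.20): cut(lam * x - (lam - 1) / 2)."""
    return tau(x, lam, 0.5)


def transition_interval(lam, nu):
    """(x1, x2) from Eqs. (4.13) and (4.14) with f(x) = x."""
    x1 = (lam - 1.0) / lam * nu
    return x1, x1 + 1.0 / lam


if __name__ == "__main__":
    lam = 4.0
    for name, nu in (("necessity", 1.0), ("possibility", 0.0), ("sharpness", 0.5)):
        x1, x2 = transition_interval(lam, nu)
        print(name, "x1 =", x1, "x2 =", x2, "length =", x2 - x1)
```

For every choice of $\nu$ the printed interval length equals $1/\lambda$, in line with Remark 4.4.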

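4.2.2 Drastic Unary Operators

Let us now define the so-called drastic unary operators in the following way.

Definition 4.5. Let $f : [0, 1] \to [0, 1]$ be an increasing bijection. Let

$$\tau_N^{(\infty)}(x) := \lim_{\lambda \to \infty} \tau_N^{(\lambda)}(x), \tag{4.21}$$

$$\tau_P^{(\infty)}(x) := \lim_{\lambda \to \infty} \tau_P^{(\lambda)}(x), \tag{4.22}$$

and

$$\tau_S^{(\infty)}(x) := \lim_{\lambda \to \infty} \tau_S^{(\lambda)}(x). \tag{4.23}$$

$\tau_N^{(\infty)}(x)$, $\tau_P^{(\infty)}(x)$ and $\tau_S^{(\infty)}(x)$ are called drastic unary operators.

Proposition 4.4.

$$\tau_N^{(\infty)}(x) = \begin{cases} 0 & \text{if } x < 1 \\ 1 & \text{if } x = 1, \end{cases} \tag{4.24}$$

$$\tau_P^{(\infty)}(x) = \begin{cases} 0 & \text{if } x = 0 \\ 1 & \text{if } x > 0, \end{cases} \tag{4.25}$$

and

$$\tau_S^{(\infty)}(x) = \begin{cases} 0 & \text{if } x < \nu^* \\ \nu^* & \text{if } x = \nu^* \\ 1 & \text{if } x > \nu^*. \end{cases} \tag{4.26}$$

Proof. The statement follows from a direct calculation.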

4.2.3 Composition Rules

In human thinking and languages, emphasis is often expressed by repeating modalities and hedges, such as “very-very”. The following proposition shows that the necessity, possibility and sharpness operators are all closed under composition. The parameter of the composition is the product of the input parameters.

Proposition 4.5. Let $f : [0, 1] \to [0, 1]$ be an increasing bijection and let $n(x)$ be the negation generated by $f(x)$. Then

$$\tau_N^{(\lambda_1)}\left(\tau_N^{(\lambda_2)}(x)\right) = \tau_N^{(\lambda_1 \lambda_2)}(x), \tag{4.27}$$

$$\tau_P^{(\lambda_1)}\left(\tau_P^{(\lambda_2)}(x)\right) = \tau_P^{(\lambda_1 \lambda_2)}(x), \tag{4.28}$$

$$\tau_S^{(\lambda_1)}\left(\tau_S^{(\lambda_2)}(x)\right) = \tau_S^{(\lambda_1 \lambda_2)}(x). \tag{4.29}$$

Proof.

1. We need to show that

$$\tau_N^{(\lambda_1)}\left(\tau_N^{(\lambda_2)}(x)\right) = f^{-1}\left[\lambda_1 \left[\lambda_2 f(x) - (\lambda_2 - 1)\right] - (\lambda_1 - 1)\right] = f^{-1}\left[\lambda_1 \lambda_2 f(x) - (\lambda_1 \lambda_2 - 1)\right].$$

(a) For $\lambda_2 f(x) - (\lambda_2 - 1) \le 0$; i.e. for $f(x) \le 1 - \frac{1}{\lambda_2}$, we obtain $\tau_N^{(\lambda_1)}\left(\tau_N^{(\lambda_2)}(x)\right) = 0$. In this case, $\tau_N^{(\lambda_1 \lambda_2)}(x) = 0$ as well, since from $f(x) \le 1 - \frac{1}{\lambda_2}$ it follows that $f(x) \le 1 - \frac{1}{\lambda_1 \lambda_2}$; i.e. $\lambda_1 \lambda_2 f(x) - (\lambda_1 \lambda_2 - 1) \le 0$.

(b) For $0 < \lambda_2 f(x) - (\lambda_2 - 1) \le 1$; i.e. for $f(x) > 1 - \frac{1}{\lambda_2}$, the inner cutting function can be omitted and the statement follows from a direct calculation.

(c) Taking into account the fact that $\lambda_2 > 1$ and $0 \le f(x) \le 1$, $\lambda_2 f(x) - (\lambda_2 - 1) > 1$ is impossible.

2. We need to show that

$$\tau_P^{(\lambda_1)}\left(\tau_P^{(\lambda_2)}(x)\right) = f^{-1}\left[\lambda_1 [\lambda_2 f(x)]\right] = f^{-1}\left[\lambda_1 \lambda_2 f(x)\right].$$

(a) If $f(x) \ge \frac{1}{\lambda_2}$, then $f^{-1}\left[\lambda_1 [\lambda_2 f(x)]\right] = f^{-1}\left[\lambda_1 \lambda_2 f(x)\right] = 1$.

(b) If $0 \le f(x) < \frac{1}{\lambda_2}$, then the inner cutting function can be omitted and the statement is trivial.

3. We need to show that

$$\tau_S^{(\lambda_1)}\left(\tau_S^{(\lambda_2)}(x)\right) = f^{-1}\left[\lambda_1 \left[\lambda_2 f(x) - \frac{\lambda_2 - 1}{2}\right] - \frac{\lambda_1 - 1}{2}\right] = f^{-1}\left[\lambda_1 \lambda_2 f(x) - \frac{\lambda_1 \lambda_2 - 1}{2}\right].$$

(a) If $\lambda_2 f(x) - \frac{\lambda_2 - 1}{2} \le 0$, then taking into account the fact that $\lambda_i > 1$, the left hand side of the equation is 0. Since in this case $f(x) \le \frac{\lambda_2 - 1}{2\lambda_2}$, we have $\lambda_1 \lambda_2 f(x) \le \frac{\lambda_1 \lambda_2 - \lambda_1}{2} \le \frac{\lambda_1 \lambda_2 - 1}{2}$. Therefore, the value in the cutting function on the right hand side is less than or equal to 0, which means that the equation is valid.

(b) If $0 \le \lambda_2 f(x) - \frac{\lambda_2 - 1}{2} \le 1$, then the inner cutting function can be omitted and the statement is trivial.

(c) If $\lambda_2 f(x) - \frac{\lambda_2 - 1}{2} > 1$, then $f(x) > \frac{\lambda_2 + 1}{2\lambda_2}$, which means that the left hand side of the equation is 1. Since in this case $\frac{1 + \lambda_1 \lambda_2}{2} < \frac{\lambda_1 \lambda_2 + \lambda_1}{2} < \lambda_1 \lambda_2 f(x)$, the value in the cutting function on the right hand side is greater than 1, which means that the equation is valid.

Proposition 4.6. For the drastic operators,

$$\tau_I^{(\infty)}\left(\tau_J^{(\infty)}(x)\right) = \tau_J^{(\infty)}(x), \tag{4.30}$$

where $I, J \in \{N, P, S\}$.

Proof. This statement follows from direct calculation.
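The composition rule of Proposition 4.5 can be checked numerically. The sketch below is only an illustration under the assumption $f(x) = x$; the helper names and the grid of test points are hypothetical choices. It compares applying an operator with $\lambda_2$ and then $\lambda_1$ against a single application with $\lambda_1 \lambda_2$.

```python
def cut(t):
    """Cutting function [t] = min(1, max(0, t))."""
    return min(1.0, max(0.0, t))


def tau(x, lam, nu):
    """tau_nu^(lam)(x) with the identity generator f(x) = x."""
    return cut(lam * (x - nu) + nu)


def max_composition_gap(lam1, lam2, nu, steps=100):
    """Largest |tau^(lam1)(tau^(lam2)(x)) - tau^(lam1*lam2)(x)| on a grid."""
    gap = 0.0
    for i in range(steps + 1):
        x = i / steps
        composed = tau(tau(x, lam2, nu), lam1, nu)
        direct = tau(x, lam1 * lam2, nu)
        gap = max(gap, abs(composed - direct))
    return gap


if __name__ == "__main__":
    # nu = 1: necessity, nu = 0: possibility, nu = 1/2: sharpness
    for nu in (1.0, 0.0, 0.5):
        print("nu =", nu, "max gap =", max_composition_gap(2.0, 3.0, nu))
```

The reported gaps are at the level of floating-point rounding, which is what "very-very" being a single, stronger hedge amounts to in this setting.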

4.2.4 Multivariable Operators Derived from Unary Operators

Proposition 4.7 tells us how the conjunction and the disjunction can be expressed in terms of the unary operators and the arithmetic mean operator. First, let us recall the definition of the arithmetic mean operator.

Definition 4.6.

$$m(\mathbf{x}) := f^{-1}\left(\frac{1}{k} \sum_{i=1}^{k} f(x_i)\right), \tag{4.31}$$

where $f : [0, 1] \to [0, 1]$ is an increasing bijection.

Proposition 4.7. The unary operators satisfy the following equation:

$$\tau_\nu^{(k)}(m(\mathbf{x})) = o_\nu(\mathbf{x}). \tag{4.32}$$

In particular,
1. $\tau_P^{(k)}(m(\mathbf{x})) = d(\mathbf{x})$,
2. $\tau_N^{(k)}(m(\mathbf{x})) = c(\mathbf{x})$,
3. $\tau_S^{(k)}(m(\mathbf{x})) = a(\mathbf{x})$.

Proof. The statements follow from a direct calculation.
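As a small worked instance of Proposition 4.7, the following sketch assumes $f(x) = x$, so that the disjunction, conjunction and aggregative operator take the forms $[\sum x_i]$, $[\sum x_i - (k-1)]$ and $[\sum x_i - \frac{k-1}{2}]$. These explicit forms are assumptions taken from the parametric formulas of this book for the identity generator; the function names are illustrative.

```python
def cut(t):
    """Cutting function [t] = min(1, max(0, t))."""
    return min(1.0, max(0.0, t))


def tau(x, lam, nu):
    """Unary operator tau_nu^(lam)(x) for f(x) = x."""
    return cut(lam * (x - nu) + nu)


def mean(xs):
    """Arithmetic mean operator m(x), Eq. (4.31), for f(x) = x."""
    return sum(xs) / len(xs)


def disjunction(xs):
    return cut(sum(xs))


def conjunction(xs):
    return cut(sum(xs) - (len(xs) - 1))


def aggregative(xs):
    return cut(sum(xs) - (len(xs) - 1) / 2.0)


if __name__ == "__main__":
    xs = [0.2, 0.7, 0.9]
    k = len(xs)
    print(tau(mean(xs), k, 0.0), disjunction(xs))   # possibility of the mean
    print(tau(mean(xs), k, 1.0), conjunction(xs))   # necessity of the mean
    print(tau(mean(xs), k, 0.5), aggregative(xs))   # sharpness of the mean
```

Each printed pair agrees up to rounding, which is the content of items 1 to 3 of the proposition in the linear case.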

Proposition 4.8. The necessity operator and the possibility operator have the following property:
1. $d\left(\tau_P^{(k)}(x_1), \tau_P^{(k)}(x_2), \ldots, \tau_P^{(k)}(x_k)\right) = \tau_P^{(k)}(d(\mathbf{x}))$,
2. $c\left(\tau_N^{(k)}(x_1), \tau_N^{(k)}(x_2), \ldots, \tau_N^{(k)}(x_k)\right) = \tau_N^{(k)}(c(\mathbf{x}))$.

Proof.
1. The following statement has to be proven, where $\lambda = k$:

$$f^{-1}\left[\sum_{i=1}^{k} [\lambda f(x_i)]\right] = f^{-1}\left[\lambda \left[\sum_{i=1}^{k} f(x_i)\right]\right].$$

If $\lambda f(x_i) \le 1$ for all $i$, then the statement is trivial. If there exists an $i$ for which $\lambda f(x_i) > 1$, then both sides of the equation have the same value (i.e. a value of 1).
2. This follows from the first statement by applying the De Morgan law.

4.2.5 A General Framework: The α, β, γ - Model

All basic operators discussed so far can be handled in a common framework, since they can all be described by the following parametric form.

Definition 4.7. Let $x, y \in [0, 1]$, $\alpha, \beta, \gamma \in \mathbb{R}$ and let $f : [0, 1] \to [0, 1]$ be a strictly increasing bijection. Let the general parametric operator be

$$o_{\alpha,\beta,\gamma}(x, y) := f^{-1}[\alpha f(x) + \beta f(y) + \gamma]. \tag{4.33}$$

The most commonly used operators for special values of $\alpha$, $\beta$ and $\gamma$ are listed in Table 4.3.

Table 4.3 Special values for $\alpha$, $\beta$ and $\gamma$

$$
\begin{array}{llllll}
 & \alpha & \beta & \gamma & o_{\alpha,\beta,\gamma}(x, y) & \text{Notation} \\ \hline
\text{Disjunction} & 1 & 1 & 0 & f^{-1}[f(x) + f(y)] & d(x, y) \\
\text{Conjunction} & 1 & 1 & -1 & f^{-1}[f(x) + f(y) - 1] & c(x, y) \\
\text{Implication} & -1 & 1 & 1 & f^{-1}[f(y) - f(x) + 1] & i(x, y) \\
\text{Arithmetic mean} & 0.5 & 0.5 & 0 & f^{-1}\left[\frac{f(x) + f(y)}{2}\right] & m(x, y) \\
\text{Preference} & -0.5 & 0.5 & 0.5 & f^{-1}\left[\frac{f(y) - f(x) + 1}{2}\right] & p(x, y) \\
\text{Aggregative operator} & 1 & 1 & -0.5 & f^{-1}\left[f(x) + f(y) - \frac{1}{2}\right] & a(x, y)
\end{array}
$$

Now let us focus on the unary (1-place) case.

Definition 4.8. Let $x \in [0, 1]$, $\alpha, \gamma \in \mathbb{R}$ and let $f : [0, 1] \to [0, 1]$ be a strictly increasing bijection. Then

$$o_{\alpha,\gamma}(x) := f^{-1}[\alpha f(x) + \gamma]. \tag{4.34}$$

For special values of $\gamma$, see Table 4.4.

Table 4.4 Special values for $\gamma$

$$
\begin{array}{lllll}
 & \alpha & \gamma & o_{\alpha,\gamma}(x) & \text{Notation} \\ \hline
\text{Possibility} & \alpha & 0 & f^{-1}[\alpha f(x)] & \tau_P(x) \\
\text{Necessity} & \alpha & 1 - \alpha & f^{-1}[\alpha f(x) - (\alpha - 1)] & \tau_N(x) \\
\text{Sharpness} & \alpha & -\frac{\alpha - 1}{2} & f^{-1}\left[\alpha f(x) - \frac{\alpha - 1}{2}\right] & \tau_S(x)
\end{array}
$$

In this framework it becomes possible to define all the operators by a single generator function and only a few parameters.
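To make the parametric form (4.33) concrete, here is a minimal Python sketch. It assumes the identity generator $f(x) = x$ by default (which yields the Łukasiewicz-type operators) and stores the parameter triples of Table 4.3 in a dictionary. The names `general_operator` and `PARAMS` are illustrative and not part of the original text.

```python
def cut(t):
    """Cutting function [t] = min(1, max(0, t))."""
    return min(1.0, max(0.0, t))


def general_operator(x, y, alpha, beta, gamma,
                     f=lambda t: t, f_inv=lambda t: t):
    """o_{alpha,beta,gamma}(x, y) = f^{-1}[alpha f(x) + beta f(y) + gamma]."""
    return f_inv(cut(alpha * f(x) + beta * f(y) + gamma))


# (alpha, beta, gamma) triples as listed in Table 4.3
PARAMS = {
    "disjunction": (1.0, 1.0, 0.0),
    "conjunction": (1.0, 1.0, -1.0),
    "implication": (-1.0, 1.0, 1.0),
    "arithmetic mean": (0.5, 0.5, 0.0),
    "preference": (-0.5, 0.5, 0.5),
    "aggregative": (1.0, 1.0, -0.5),
}


if __name__ == "__main__":
    x, y = 0.7, 0.6
    for name, (alpha, beta, gamma) in PARAMS.items():
        print(name, general_operator(x, y, alpha, beta, gamma))
```

A different generator can be supplied through the `f` and `f_inv` arguments, provided `f_inv` is the inverse of `f` on $[0, 1]$; the parameter table itself stays unchanged.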

4.3 Unary Operators Induced by Negation Operators

We shall assume that the possibility and necessity operators have to fulfill the following conditions:

$$\text{impossible}(x) = \text{necessity}(\text{not}(x)) \tag{4.35}$$

and

$$\text{possible}(x) = \text{not}(\text{impossible}(x)). \tag{4.36}$$

Fig. 4.2 Unary operators “not”, “impossible”, “possible” and “necessary”

In the previous section we introduced possibility and necessity operators by repeating the arguments of multivariable operators. An alternative way to define unary operators is by means of a suitable composition of negation operators. From a semantic point of view, we can think of the word “impossible” as a stricter (stronger) negation, in the sense that it has a smaller fixpoint (see Fig. 4.2).

If $n_{\nu_2}(x)$ is a negation with a fixpoint $\nu_2$ (with the semantic meaning of “not”) and $n_{\nu_1}(x)$ is a stricter negation with a fixpoint $\nu_1$ (with the semantic meaning of “impossible”), i.e. $\nu_1 < \nu_2$, then the necessity operator can be derived from the following interpretation (see also (4.1)): “impossible” = “necessarily not”; i.e. if we denote the necessity operator by $\tau_N(x)$, then $n_{\nu_1}(x) = \tau_N(n_{\nu_2}(x))$. Based on this interpretation, by plugging $n_{\nu_2}(x)$ into the equation above and taking into account the fact that $n_{\nu_2}(x)$ is involutive, we can define the necessity operator in the following way.

Definition 4.9. Let $n_{\nu_1}(x)$ and $n_{\nu_2}(x)$ be negations with fixpoints $\nu_1$ and $\nu_2$ respectively, where $\nu_1 < \nu_2$. Then

$$\tau^N_{\nu_1,\nu_2}(x) := n_{\nu_1}(n_{\nu_2}(x)) \tag{4.37}$$

is called the necessity operator.

Similarly, interpreting “possible” as “not impossible” (see also (4.2)), the possibility operator can be defined in the following way.

Definition 4.10. Let $n_{\nu_1}(x)$ and $n_{\nu_2}(x)$ be negations with fixpoints $\nu_1$ and $\nu_2$ respectively, where $\nu_1 < \nu_2$. Then

$$\tau^P_{\nu_1,\nu_2}(x) := n_{\nu_2}(n_{\nu_1}(x)) \tag{4.38}$$

is called the possibility operator.

Remark 4.5. Note that the necessity and possibility operators differ only in the order of the negations in the composition. Necessity and possibility can be described by the parameter values $\nu_1$ and $\nu_2$.

Next, we define the duality between the possibility and the necessity operators.

Definition 4.11. $\tau^P_{\nu_1,\nu_2}(x)$ and $\tau^N_{\nu_1,\nu_2}(x)$ are dual if they are defined by means of the same negations $n_{\nu_1}(x)$ and $n_{\nu_2}(x)$ with fixpoints $\nu_1$ and $\nu_2$ respectively, where $\nu_1 < \nu_2$.

Remark 4.6. Drastic necessity and possibility operators can be obtained by using drastic negations. Drastic negations are the so-called intuitionistic and dual intuitionistic negations (denoted by $n_0$ and $n_1$ respectively):

$$n_0(x) = \begin{cases} 1 & \text{if } x = 0 \\ 0 & \text{if } x > 0 \end{cases} \qquad \text{and} \qquad n_1(x) = \begin{cases} 1 & \text{if } x < 1 \\ 0 & \text{if } x = 1. \end{cases}$$

These drastic negations are neither continuous nor strictly decreasing, therefore they are not strict negations, but we can get them as limits of strict negations.

In a bounded system, the natural negations can serve as $n_{\nu_1}$ and $n_{\nu_2}$. In this case, necessity and possibility can also be defined in a natural way, and the parameters of the generator functions of the conjunction, disjunction and negation determine the parameters of the modal operators.

Remark 4.7. If $f_c(x)$ and $f_d(x)$ are the generator functions of a bounded system and $n_c(x)$ and $n_d(x)$ the natural negations of $c$ and $d$ with fixpoints $\nu_c$ and $\nu_d$ respectively, then the possibility and necessity operators can be defined by

$$\tau^N_{\nu_d,\nu_c}(x) = n_d(n_c(x)), \tag{4.39}$$

$$\tau^P_{\nu_c,\nu_d}(x) = n_c(n_d(x)), \tag{4.40}$$

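since $\nu_d < \nu_c$.

Example 4.1. The Dombi functions defined as

$$f_n(x) = \begin{cases} \dfrac{1}{1 + \frac{\nu}{1-\nu} \frac{1-x}{x}} & x \ne 0, \\ 0 & x = 0; \end{cases}$$

$$f_c(x) = \begin{cases} \dfrac{1}{1 + \frac{1-\nu_c}{\nu_c} \frac{x}{1-x}} & x \ne 1, \\ 0 & x = 1; \end{cases}$$

$$f_d(x) = \begin{cases} \dfrac{1}{1 + \frac{\nu_d}{1-\nu_d} \frac{1-x}{x}} & x \ne 0, \\ 0 & x = 0, \end{cases}$$

where $\nu, \nu_d, \nu_c \in (0, 1)$, $\nu_d < \nu < \nu_c$, generate a bounded system if and only if $\nu_c + \nu_d < 1$ [14]. Here,

$$n_c(x) = \frac{1}{1 + \left(\frac{1-\nu_c}{\nu_c}\right)^2 \frac{x}{1-x}} \qquad \text{and} \qquad n_d(x) = \frac{1}{1 + \left(\frac{1-\nu_d}{\nu_d}\right)^2 \frac{x}{1-x}},$$

see Table 4.5.

Table 4.5 Rational functions as normalized generators and the two natural negations

$$
\begin{array}{llll}
f(x) & f^{-1}(x) & 1 - f(x) & \text{Negation} \\ \hline
\dfrac{1}{1 + \frac{1-\nu_c}{\nu_c} \frac{x}{1-x}} & \dfrac{1}{1 + \frac{1-\nu_c}{\nu_c} \frac{x}{1-x}} & \dfrac{1}{1 + \frac{\nu_c}{1-\nu_c} \frac{1-x}{x}} & \dfrac{1}{1 + \left(\frac{1-\nu_c}{\nu_c}\right)^2 \frac{x}{1-x}} \\
\dfrac{1}{1 + \frac{\nu_d}{1-\nu_d} \frac{1-x}{x}} & \dfrac{1}{1 + \frac{1-\nu_d}{\nu_d} \frac{1-x}{x}} & \dfrac{1}{1 + \frac{1-\nu_d}{\nu_d} \frac{x}{1-x}} & \dfrac{1}{1 + \left(\frac{1-\nu_d}{\nu_d}\right)^2 \frac{x}{1-x}}
\end{array}
$$

Proposition 4.9. For the Dombi functions from Example 4.1,

$$\tau^N_{\nu_d,\nu_c}(x) = n_d(n_c(x)) = \frac{1}{1 + A \cdot \frac{1-x}{x}} \tag{4.41}$$

and

$$\tau^P_{\nu_c,\nu_d}(x) = n_c(n_d(x)) = \frac{1}{1 + \frac{1}{A} \cdot \frac{1-x}{x}}, \tag{4.42}$$

where $A = \left(\frac{\nu_c}{1-\nu_c}\right)^2 \left(\frac{1-\nu_d}{\nu_d}\right)^2$.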

In Fig. 4.3, τνNd ,νc (x) and τνPc ,νd (x) have been plotted for the Dombi functions with different values of νc and νd .

Fig. 4.3 $\tau^N_{\nu_d,\nu_c}(x)$ and $\tau^P_{\nu_c,\nu_d}(x)$ for $A = 0.1$, $0.4$ and $0.75$

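Proof. We will just prove the case for the necessity operator. The case for the possibility can be proven in a similar way. Let us use the following notations:

$$C := \left(\frac{1-\nu_c}{\nu_c}\right)^2 \qquad \text{and} \qquad D := \left(\frac{1-\nu_d}{\nu_d}\right)^2.$$

Then

$$\tau^N_{\nu_d,\nu_c}(x) = n_d(n_c(x)) = \frac{1}{1 + D \, \dfrac{\frac{1}{1 + C \frac{x}{1-x}}}{1 - \frac{1}{1 + C \frac{x}{1-x}}}} = \frac{1}{1 + \frac{D}{C} \frac{1-x}{x}} = \frac{1}{1 + A \cdot \frac{1-x}{x}}.$$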
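The closed forms (4.41) and (4.42) can be compared with the explicit composition of the two natural negations. The sketch below assumes the Dombi-type negation $n_\nu(x) = 1 / \left(1 + \left(\frac{1-\nu}{\nu}\right)^2 \frac{x}{1-x}\right)$ given in Example 4.1; the parameter values $\nu_d = 0.3$, $\nu_c = 0.6$ and all function names are arbitrary illustrative choices.

```python
def dombi_negation(x, nu):
    """n_nu(x) = 1 / (1 + ((1 - nu) / nu)^2 * x / (1 - x)), with n_nu(1) = 0."""
    if x >= 1.0:
        return 0.0
    return 1.0 / (1.0 + ((1.0 - nu) / nu) ** 2 * x / (1.0 - x))


def necessity(x, nu_d, nu_c):
    """tau^N(x) = n_d(n_c(x)), Eq. (4.39)."""
    return dombi_negation(dombi_negation(x, nu_c), nu_d)


def possibility(x, nu_d, nu_c):
    """tau^P(x) = n_c(n_d(x)), Eq. (4.40)."""
    return dombi_negation(dombi_negation(x, nu_d), nu_c)


def coefficient_a(nu_d, nu_c):
    """A = (nu_c / (1 - nu_c))^2 * ((1 - nu_d) / nu_d)^2."""
    return (nu_c / (1.0 - nu_c)) ** 2 * ((1.0 - nu_d) / nu_d) ** 2


def necessity_closed_form(x, nu_d, nu_c):
    """Eq. (4.41), valid for 0 < x <= 1."""
    return 1.0 / (1.0 + coefficient_a(nu_d, nu_c) * (1.0 - x) / x)


def possibility_closed_form(x, nu_d, nu_c):
    """Eq. (4.42), valid for 0 < x <= 1."""
    return 1.0 / (1.0 + (1.0 - x) / (x * coefficient_a(nu_d, nu_c)))


if __name__ == "__main__":
    nu_d, nu_c = 0.3, 0.6
    for x in (0.1, 0.5, 0.9):
        print(x, necessity(x, nu_d, nu_c), necessity_closed_form(x, nu_d, nu_c),
              possibility(x, nu_d, nu_c), possibility_closed_form(x, nu_d, nu_c))
```

With $\nu_d < \nu_c$ the coefficient $A$ exceeds 1, so the computed necessity stays below $x$ and the possibility above it.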

Proposition 4.10. $\tau^N_{\nu_1,\nu_2}(x)$ and $\tau^P_{\nu_1,\nu_2}(x)$ satisfy the basic properties of modalities:

N1 $\tau^N_{\nu_1,\nu_2}(1) = 1$
N2 $\tau^N_{\nu_1,\nu_2}(x) \le x$
N3 $x \le y$ if and only if $\tau^N_{\nu_1,\nu_2}(x) \le \tau^N_{\nu_1,\nu_2}(y)$
P1 $\tau^P_{\nu_1,\nu_2}(0) = 0$
P2 $x \le \tau^P_{\nu_1,\nu_2}(x)$
P3 $x \le y$ if and only if $\tau^P_{\nu_1,\nu_2}(x) \le \tau^P_{\nu_1,\nu_2}(y)$.

Proof. We will just prove the case for the necessity operator. The case for the possibility can be proven in a similar way. Here,

N1 $\tau^N_{\nu_1,\nu_2}(1) = n_{\nu_1}(n_{\nu_2}(1)) = 1$.
N2 $n_{\nu_1}(x) \le n_{\nu_2}(x) \implies x \ge n_{\nu_1}(n_{\nu_2}(x)) = \tau^N_{\nu_1,\nu_2}(x)$.
N3 Follows from the monotonicity of the negations.

The following proposition describes all the possible compositions of the negations $n_{\nu_1}$, $n_{\nu_2}$, possibility $\tau^P_{\nu_1,\nu_2}$ and necessity $\tau^N_{\nu_1,\nu_2}$.

Proposition 4.11.
1. “it is not impossible” = “it is possible”; i.e. $n_{\nu_2}(n_{\nu_1}(x)) = \tau^P_{\nu_1,\nu_2}(x)$.
2. “it is not possible” = “it is impossible”; i.e. $n_{\nu_2}\left(\tau^P_{\nu_1,\nu_2}(x)\right) = n_{\nu_1}(x)$.
3. “it is not necessary” = “it is possibly not”; i.e. $n_{\nu_2}\left(\tau^N_{\nu_1,\nu_2}(x)\right) = \tau^P_{\nu_1,\nu_2}(n_{\nu_2}(x))$.
4. “it is impossible that it is not” = “it is necessary”; i.e. $n_{\nu_1}(n_{\nu_2}(x)) = \tau^N_{\nu_1,\nu_2}(x)$.
5. “it is impossible that it is possible” = “it is necessary that it is impossible”; i.e. $n_{\nu_1}\left(\tau^P_{\nu_1,\nu_2}(x)\right) = \tau^N_{\nu_1,\nu_2}(n_{\nu_1}(x))$.
6. “it is impossible that it is necessary” = “it is possible that it is impossible” = “not”; i.e. $n_{\nu_1}\left(\tau^N_{\nu_1,\nu_2}(x)\right) = \tau^P_{\nu_1,\nu_2}(n_{\nu_1}(x)) = n_{\nu_2}(x)$.
7. “it is possible that it is necessary” = “it is necessary that it is possible”; i.e. $\tau^P_{\nu_1,\nu_2}\left(\tau^N_{\nu_1,\nu_2}(x)\right) = \tau^N_{\nu_1,\nu_2}\left(\tau^P_{\nu_1,\nu_2}(x)\right) = x$.

Proof. All these statements follow from direct calculation, taking into account the fact that the negation operators $n_{\nu_1}$ and $n_{\nu_2}$ are involutive.

4.4 Membership Functions

Triangular membership functions, which are among the most commonly applied ones, are formed using straight lines. These straight-line membership functions have the advantage of simplicity. However, triangular membership functions are nondifferentiable at three points, which may lead to problems when classical optimization methods are used. Because of their smoothness and compact notation, Gaussian membership functions are popular for specifying fuzzy sets. These curves have the advantage of being smooth and nonzero at all points.

When it comes to applications, real-life situations have a higher complexity and usually special membership functions are required. Most applications use arbitrary functions that suit the given situation regarding simplicity, convenience, speed and efficiency.

The membership functions defined in this section model the truth value of the statement “x is equal to 0”. Similarly, by means of an adequate translation, such membership functions can easily be obtained for modelling the statement “x is equal to a”, where a is an arbitrary given value. Note that in the following definition, the parameter $\varepsilon$ has the semantic meaning of tolerance.

Definition 4.12. Let $f_c : [0, 1] \to [0, 1]$ be a decreasing bijection, $\nu \in (0, 1)$, $\lambda \in \mathbb{R}$, $\lambda > 1$, $\varepsilon \in [0, 1]$, and let us define the operator-dependent membership function as

$$\delta_\varepsilon^{(\lambda)}(x) = f_c^{-1}\left(f_c(\nu) \left|\frac{x}{\varepsilon}\right|^{\lambda}\right). \tag{4.43}$$

Proposition 4.12.
1. $\delta_\varepsilon^{(\lambda)}(x)$ is an even function,
2. $\delta_\varepsilon^{(\lambda)}(\varepsilon) = \nu$,
3. $\delta_\varepsilon^{(\lambda)}(0) = 1$.

Proof. It follows from direct calculation.

In Figs. 4.4 and 4.5, operator-dependent membership functions are illustrated using the generator function of the Łukasiewicz- and Dombi operators, respectively. For λ = 2, the absolute value function can be omitted, which turns out to be a key step towards differentiability. Remark 4.8. Note that the above-defined construction of membership functions connects the Gauss-curve and probability theory together by providing a Gaussian membership function for λ = 2 and f c (x) = − ln x (the generator function of the product operator, which is part of probabilistic reasoning).
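The construction in (4.43) can be sketched for two generators mentioned in this section. The code below is an illustration under two assumptions: for the product generator $f_c(x) = -\ln x$ the formula reduces to $\nu^{|x/\varepsilon|^\lambda}$, and for the Łukasiewicz generator $f_c(x) = 1 - x$ the inverse is applied together with the cutting function, giving $1 - [(1 - \nu)|x/\varepsilon|^\lambda]$. Function names and parameter values are hypothetical.

```python
def membership_product(x, eps, nu, lam=2.0):
    """delta for f_c(x) = -ln(x): exp(ln(nu) * |x/eps|^lam) = nu ** |x/eps|^lam.

    With lam = 2 this is a Gaussian-shaped curve.
    """
    return nu ** (abs(x / eps) ** lam)


def membership_lukasiewicz(x, eps, nu, lam=1.0):
    """delta for f_c(x) = 1 - x, assuming the inverse includes the cutting function.

    With lam = 1 this is a triangular membership function.
    """
    t = (1.0 - nu) * abs(x / eps) ** lam
    return 1.0 - min(1.0, max(0.0, t))


if __name__ == "__main__":
    eps, nu = 0.3, 0.5
    for x in (-0.6, -0.3, 0.0, 0.3, 0.6):
        print(x, membership_product(x, eps, nu), membership_lukasiewicz(x, eps, nu))
```

Both functions return 1 at $x = 0$ and $\nu$ at $x = \pm\varepsilon$, matching Proposition 4.12; the triangular variant reaches 0 at a finite distance, whereas the Gaussian-shaped one stays positive.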

Fig. 4.4 Membership functions $\delta_\varepsilon^{(\lambda)}(x)$ generated by $f_c(x) = 1 - x$

Fig. 4.5 Membership functions $\delta_\varepsilon^{(\lambda)}(x)$ generated by $f_c(x) = \dfrac{1}{1 + \frac{1-\nu_c}{\nu_c} \frac{x}{1-x}}$

Remark 4.9. Note that for λ = 1 and f c (x) = 1 − x (the generator function of the Łukasiewicz t-norm), the above definition provides a triangular membership function. The following proposition states an important advantage of this approach to membership functions and modifiers.

4.5 Non-membership Functions

A generalization of fuzzy sets was introduced by Atanassov in 1986 as intuitionistic fuzzy sets (IFSs) [15], including both membership and non-membership of the elements. The non-membership functions can be defined naturally by using the generator function of the disjunction operator. These functions can model the truth value of the statement “x is not equal to 0” or, by means of an adequate translation, also the statement “x is not equal to a”, where a is an arbitrary fixed value.

Definition 4.13. Let $f_d : [0, 1] \to [0, 1]$ be an increasing bijection, $\nu \in (0, 1)$, $\lambda \in \mathbb{R}$, $\lambda > 1$, $\varepsilon \in [0, 1]$, and let us define the operator-dependent non-membership function as

$$\hat{\delta}_\varepsilon^{(\lambda)}(x) = f_d^{-1}\left(f_d(\nu) \left|\frac{x}{\varepsilon}\right|^{\lambda}\right). \tag{4.44}$$

Proposition 4.13.
1. $\hat{\delta}_\varepsilon^{(\lambda)}(x)$ is an even function,
2. $\hat{\delta}_\varepsilon^{(\lambda)}(\varepsilon) = \nu$,
3. $\hat{\delta}_\varepsilon^{(\lambda)}(0) = 0$.

Proof. It follows from direct calculation.

In Figs. 4.6 and 4.7 operator-dependent non-membership functions have been plotted using the generator function of the Łukasiewicz- and Dombi operators, respectively.

Fig. 4.6 Non-membership functions $\hat{\delta}_\varepsilon^{(\lambda)}(x)$ generated by $f_d(x) = x$

Fig. 4.7 Non-membership functions $\hat{\delta}_\varepsilon^{(\lambda)}(x)$ generated by $f_d(x) = \dfrac{1}{1 + \frac{\nu_d}{1-\nu_d} \frac{1-x}{x}}$

4.6 Summary

The purpose of this chapter was to consider the main unary operators of a nilpotent logical system in an integral framework and to reveal the underlying general structure of all the previously examined operators in nilpotent logical systems. The unary operators were obtained by repeating the argument in multivariable operators and also by using compositions of negations to provide a widely applicable system, where all the operators are connected to each other, and where the modalities and hedges are operator-dependent. In this way, we can describe all the operators by using a generator function and only a few parameters. By fitting the parameter values, the system can be used to model real-life problems. The possibility, necessity and sharpness operators were examined in detail and we showed how the multivariable operators may be derived from the unary ones. In the next two chapters, we will focus on the decision operators.

References

1. Banks, W.: Mixing crisp and fuzzy logic in applications. In: Idea/Microelectronics, WESCON/1994, pp. 94–97 (1994). https://doi.org/10.1109/WESCON.1994.403621
2. De Cock, M., Bodenhofer, U., Kerre, E.E.: Modelling linguistic expressions using fuzzy relations. In: Proceedings of 6th International Conference on Soft Computing (IIZUKA 2000), pp. 353–360 (2000)
3. Huynh, V.N., Ho, T.B., Nakamori, Y.: A parametric representation of linguistic hedges in Zadeh’s fuzzy logic. Int. J. Approx. Reason. 30(3), 203–223 (2002). https://doi.org/10.1016/S0888-613X(02)00075-0. ISSN 0888-613X
4. Jang, J., Sun, C., Mizutani, E.: Neuro-Fuzzy and Soft Computing-A Computational Approach to Learning and Machine Intelligence [Book Review] (1997). http://ieeexplore.ieee.org/document/633847/. ISSN 0018-9286
5. Türkşen, I.B.: A foundation for CWW: meta-linguistic axioms. In: Annual Conference of the North American Fuzzy Information Processing Society - NAFIPS, vol. 1, pp. 395–400 (2004). https://doi.org/10.1109/nafips.2004.1336315. http://www.scopus.com/inward/record.url?eid=2-s2.0-4544223298&partnerID=40&md5=8c2c1dcce750e5d0721542ecb9f09883
6. Zadeh, L.: The concept of a linguistic variable and its applications to approximate reasoning II. Inf. Sci. 8(4), 199–249 (1975)
7. Zadeh, L.: The concept of a linguistic variable and its applications to approximate reasoning III. Inf. Sci. 8(4), 199–249 (1975)
8. Zadeh, L.A.: The concept of a linguistic variable and its application to approximate reasoning I. Inf. Sci. 8(3), 199–249 (1975). https://doi.org/10.1016/0020-0255(75)90036-5. ISSN 0020-0255
9. Dombi, J., Csiszár, O.: Operator-dependent modifiers in nilpotent logical systems. In: Proceedings of the 10th International Joint Conference on Computational Intelligence - vol. 1: IJCCI, pp. 126–134. INSTICC, SciTePress (2018)
10. Liu, B.-D., Chen, C.-Y., Tsao, J.-Y.: Design of adaptive fuzzy logic controller based on linguistic-hedge concepts and genetic algorithms. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 31(1), 32–53 (2001)
11. Csiszár, O., Dombi, J.: Generator-based modifiers and membership functions in nilpotent operator systems. In: IEEE International Work Conference on Bioinspired Intelligence (iwobi 2019), pp. 99–106 (2019)
12. Zadeh, L.A.: A fuzzy-set-theoretic interpretation of linguistic hedges. J. Cybern. 2(3), 4–34 (1972)
13. Zadeh, L.A.: Quantitative fuzzy semantics. Inf. Sci. 3, 159–176 (1971)
14. Dombi, J., Csiszár, O.: The general nilpotent operator system. Fuzzy Sets Syst. 261, 1–19 (2015)
15. Atanassov, K.: Intuitionistic fuzzy sets. Fuzzy Sets Syst. 20(1), 87–96 (1986)

Part II

Decision Operators

Chapter 5

Aggregative Operators

Abstract In human thinking, averaging operators, where a high input can compensate for a lower one, play a significant role. As in the previous chapters, we focus on nilpotent logical systems and describe aggregative operators in such systems ranging from symmetric ones (that treat all the inputs equally) to weighted ones, where some inputs are given more weight than others. As a starting point, instead of associativity, we focus on the necessary and sufficient condition of the self-dual property. We give a parametric form of the generated operator by using a shifting transformation of the generator function. The parameter here has an important semantical meaning as a threshold of expectancy (decision level). We show that nilpotent conjunctive, disjunctive, aggregative, and negation operators can be obtained by changing this parameter value. This way we can provide a general framework for different types of operators using only one generator function. The formula also contains a parameter with the semantical meaning of the threshold of expectancy. Interestingly, the resulting formula turns out to be equivalent to that used in current deep learning techniques.

5.1 Introduction

In human thinking, averaging operators, where a high input can compensate for a lower one, play a significant role. The aggregative operator was first introduced in 1982 by Dombi [1], by selecting a set of minimal concepts that must be fulfilled by an evaluation-like operator. The concept of uninorms was introduced in [2], as a generalization of both t-norms and t-conorms. Depending on its neutral element ν, a uninorm is a t-norm if ν = 1 and a t-conorm if ν = 0. Uninorms have turned out to be useful in many areas like expert systems [3], aggregation [4, 5] and the fuzzy integral [6, 7].

The main difference in the definition of the uninorms and aggregative operators is that the self-duality requirement does not appear in uninorms, and the neutral element property is not in the definition for the aggregative operators. The representation theorem for strict, continuous on [0, 1] × [0, 1]\{(0, 1), (1, 0)} uninorms (or representable uninorms) was given by Fodor et al. [8] (see also Klement et al. [9]). Such uninorms are called representable uninorms and they were previously introduced as aggregative operators [1]. Recently, a characterization of the class of uninorms with a strict underlying t-norm and t-conorm was presented in [10]. In [11], the authors show that uninorms with nilpotent underlying t-norm and t-conorm belong to $U_{\min}$ or $U_{\max}$. Further results on uninorms with fixed values along their borders can be found in [12].

Our main purpose here is to consider generated nilpotent operators in an integral frame and to examine the nilpotent self-dual generated operators [13]. A general parametric framework for the nilpotent conjunctive, disjunctive, aggregative and negation operators is given and it is demonstrated how the nilpotent generated operator can be applied to preference modelling.

This chapter is organized as follows. In Sect. 5.3, the general parametric operator $o_\nu(\mathbf{x})$ of nilpotent systems is given. Here, the parameter has an important semantical meaning as the threshold of expectancy. In Sect. 5.4, the weighted form of this operator, $a_{\nu,\mathbf{w}}(\mathbf{x})$, is examined. In Sect. 5.5, the properties (De Morgan property, commutativity, self-duality, fulfillment of the boundary conditions, bisymmetry) of the weighted general operator are examined. Here, the formula for the commutative self-De Morgan operator, the so-called weighted aggregative operator, is presented. Then in Sect. 5.6, we focus on the two-variable case, where it is proved that the two-variable operator with weights $w_1 = w_2 = 1$ is conjunctive for low input values, disjunctive for high ones, and averaging otherwise; i.e. a high input can compensate for a lower one. Lastly, in Sect. 5.7, the main results are summarized and a possible direction of future work is mentioned.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 J. Dombi and O. Csiszár, Explainable Neural Networks Based on Fuzzy Logic and Multi-criteria Decision Tools, Studies in Fuzziness and Soft Computing 408, https://doi.org/10.1007/978-3-030-72280-7_5

5.2 Preliminaries

The concept of aggregative operators and uninorms will play an important role in the sequel.

Definition 5.1. (See Dombi [1]) An aggregative operator is a function $a : [0, 1]^2 \to [0, 1]$ with the following properties:
1. Continuous on $[0, 1]^2 \setminus \{(0, 1), (1, 0)\}$;
2. $a(x, y) < a(x, y')$ if $y < y'$, $x \ne 0$, $x \ne 1$; $a(x, y) < a(x', y)$ if $x < x'$, $y \ne 0$, $y \ne 1$;
3. $a(0, 0) = 0$ and $a(1, 1) = 1$ (boundary conditions);
4. There exists a strong negation $\eta$ such that $a(x, y) = \eta(a(\eta(x), \eta(y)))$ (the self-De Morgan identity) if $\{x, y\} \ne \{0, 1\}$;
5. $a(1, 0) = a(0, 1) = 0$ or $a(1, 0) = a(0, 1) = 1$.

Definition 5.2. (See Yager and Rybalov [2]) A mapping $U : [0, 1] \times [0, 1] \to [0, 1]$ is a uninorm, if it is symmetric, associative, nondecreasing and there exists an $e \in [0, 1]$ such that $U(e, x) = x$ for all $x \in [0, 1]$.

The structure of uninorms was first examined by Fodor et al. in [8].

Proposition 5.1. (See Fodor et al. [8]) Let $U : [0, 1]^2 \to [0, 1]$ be a function and $\nu \in \, ]0, 1[$. The following statements are equivalent:
1. $U$ is a uninorm with neutral element $\nu$ which is strictly monotonic on $]0, 1[^2$ and continuous on $[0, 1]^2 \setminus \{(0, 1), (1, 0)\}$.
2. There exists an increasing bijection $u : [0, 1] \to [-\infty, \infty]$ with $u(\nu) = 0$ such that for all $(x, y) \in [0, 1]^2$, we have

$$U(x, y) = u^{-1}(u(x) + u(y)), \tag{5.1}$$

where, in the case of a conjunctive uninorm $U$, we use the convention $\infty + (-\infty) = -\infty$, while, in the disjunctive case, we use $\infty + (-\infty) = \infty$.

If (5.1) holds, the function $u$ is uniquely determined by $U$ up to a positive multiplicative constant, and it is called an additive generator of the uninorm $U$.

5.3 Shifting Transformations on the Generator Functions – A General Parametric Formula

From now on, we shall consider nilpotent logical systems. First we show that by shifting the generator function of a disjunction, we can get a conjunction and also operators that fulfill the self-De Morgan property. We provide a general parametric formula for these operators, where the conjunction, disjunction and the so-called aggregative operator differ only in one single parameter.

Definition 5.3. Let $f : [0, 1] \to [0, 1]$ be an increasing bijection, $\nu \in [0, 1]$, and $\mathbf{x} = (x_1, \ldots, x_n)$, where $x_i \in [0, 1]$, and let us define the general operator by

$$o_\nu(\mathbf{x}) = f^{-1}\left[\sum_{i=1}^{n} (f(x_i) - f(\nu)) + f(\nu)\right] = f^{-1}\left[\sum_{i=1}^{n} f(x_i) - (n - 1) f(\nu)\right]. \tag{5.2}$$

Remark 5.1. Note that $\nu$ is a neutral element of $o_\nu(\mathbf{x})$ and that $o_\nu(\mathbf{x})$ can be generated by $g(x) = f(x) - f(\nu)$, since in this case $g^{-1}(x) = f^{-1}(x + f(\nu))$. Aggregation functions generated in a similar way as (5.2) were also discussed by Kolesarova and Komornikova [14].

Proposition 5.2. The general operator in (5.2)
1. For $\nu = 1$ is $o_1(\mathbf{x}) = c(\mathbf{x})$, a conjunction.
2. For $\nu = 0$ is $o_0(\mathbf{x}) = d(\mathbf{x})$, a disjunction.

Proof. Since $f(1) = 1$ and $f(0) = 0$, the proof is trivial.
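The role of the parameter $\nu$ in (5.2) can be seen in a few lines of Python. This is a sketch under the assumption $f(x) = x$; the function name `general_operator` is an illustrative choice.

```python
def cut(t):
    """Cutting function [t] = min(1, max(0, t))."""
    return min(1.0, max(0.0, t))


def general_operator(xs, nu):
    """o_nu(x) from Eq. (5.2) with f(x) = x: cut(sum(x_i) - (n - 1) * nu)."""
    n = len(xs)
    return cut(sum(xs) - (n - 1) * nu)


if __name__ == "__main__":
    xs = [0.8, 0.9]
    print("conjunction (nu = 1):  ", general_operator(xs, 1.0))
    print("aggregative (nu = 0.5):", general_operator(xs, 0.5))
    print("disjunction (nu = 0):  ", general_operator(xs, 0.0))
    # nu acts as a neutral element: o_nu(x, nu) = x
    print("neutral element check: ", general_operator([0.37, 0.6], 0.6))
```

Raising $\nu$ from 0 to 1 moves the same formula from disjunctive to conjunctive behaviour, which is the sense in which $\nu$ acts as a threshold of expectancy.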

Fig. 5.1 The shifting transformation in the linear case, $f_\nu^{-1}(x)$ for $\nu = 0$, $\nu = \nu^*$, $\nu = 1$, where $\nu^* = f^{-1}\left(\frac{1}{2}\right)$ (curves labelled $f_1^{-1}$, $f_{\nu^*}^{-1}$ and $f_0^{-1}$)

Remark 5.2. A conjunction and a disjunction differ only in one parameter of the general operator in (5.2). The parameter has the semantical meaning of the level of expectancy. Generalized conjunction and disjunction functions (GCD) were also examined by Dujmović and Larsen in [18].

The shifting transformation in the linear case is presented in Fig. 5.1. Next, a more general, weighted form of this operator will be examined.

5.4 The Weighted General Operator

If a weighted operator $o_{\mathbf{w}}(\mathbf{x}) : [0, 1]^n \to [0, 1]$ with $\mathbf{w} = (w_1, \ldots, w_n)$, $w_i > 0$ real parameters is represented by $o_{\mathbf{w}}(\mathbf{x}) = f^{-1}\left[\sum_{i=1}^{n} w_i f(x_i)\right]$, where $f : [0, 1] \to [0, 1]$ is a bijection, then it can also be written as $o_{\mathbf{w}}(\mathbf{x}) = f^{-1}\left[\sum_{i=1}^{n} f(x_i')\right]$, where $x_i'$ is obtained via a so-called weighting transformation: $x_i' = f^{-1}(w_i f(x_i))$. Below, we apply this weighting transformation to the arguments of the operator in (5.2) to get the so-called weighted general operator.

Definition 5.4. Let $\mathbf{w} = (w_1, \ldots, w_n)$ and $w_i > 0$ be real parameters, $f : [0, 1] \to [0, 1]$ an increasing bijection with $\nu \in [0, 1]$. The weighted general operator is defined by

$$a_{\nu,\mathbf{w}}(\mathbf{x}) := f^{-1}\left[\sum_{i=1}^{n} w_i (f(x_i) - f(\nu)) + f(\nu)\right]. \tag{5.3}$$
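A minimal sketch of the weighted general operator follows, again assuming $f(x) = x$; the function names are hypothetical. Expanding the argument gives $\sum_i w_i x_i + (1 - \sum_i w_i)\nu$, that is, a weighted sum plus a constant term that is then clipped to $[0, 1]$; the second function computes this rewritten form for comparison.

```python
def cut(t):
    """Cutting function [t] = min(1, max(0, t))."""
    return min(1.0, max(0.0, t))


def weighted_general_operator(xs, ws, nu):
    """a_{nu,w}(x) with f(x) = x: cut(sum(w_i * (x_i - nu)) + nu)."""
    return cut(sum(w * (x - nu) for x, w in zip(xs, ws)) + nu)


def weighted_sum_with_bias(xs, ws, nu):
    """Same value, written as a clipped weighted sum plus a bias term."""
    bias = (1.0 - sum(ws)) * nu
    return cut(sum(w * x for x, w in zip(xs, ws)) + bias)


if __name__ == "__main__":
    xs, ws, nu = [0.2, 0.9], [0.5, 0.5], 0.3
    print(weighted_general_operator(xs, ws, nu))
    print(weighted_sum_with_bias(xs, ws, nu))
    # unit weights recover the unweighted operator of Eq. (5.2)
    print(weighted_general_operator(xs, [1.0, 1.0], nu))
```

With all weights equal to 1, the bias reduces to $-(n-1)\nu$ and the unweighted formula is recovered.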

5.5 Properties of the General and the Weighted General Operator

89

5.5 Properties of the General and the Weighted General Operator 5.5.1 The De Morgan Property A question that immediately arises is for which parameter values does the abovedefined general operator satisfy for the generalized De Morgan property concerning the negation generated by f (x) (the generator function of o(x)). That is, for which values of ν1 , ν2 the following equation holds for all x ∈ [0, 1]n . n(oν1 (x)) = oν2 (n(x)). For ν1 = 0, ν2 = 1 or ν1 = 1, ν2 = 0, we get the classical the De Morgan law. Proposition 5.3. Let f : [0, 1] → [0, 1] be an increasing bijection, νi ∈ [0, 1] and x = (x1 , . . . , xn ), where xi ∈ [0, 1], n(x) = f −1 (1 − f (x)) and oνi (x) the general operator. Then n(oν1 (x)) = oν2 (n(x)) holds for all x = (x1 , . . . , xn ), where xi ∈ [0, 1] if and only if f (ν1 ) + f (ν2 ) = 1. Proof. Using the fact that n(x) = f −1 (1 − f (x)), we get

f

−1

1−

n

f (xi ) − (n − 1) f (ν1 )

i=1

= f

−1

n

(1 − f (xi )) − (n − 1) f (ν2 ) .

i=1

1. First, we will show that f (ν1 ) + f (ν2 ) = 1 is necessary. n f (xi ), B := −(n − 1) f (ν1 ), we get Using the notations A := i=1

[A − (n − 1) f (ν2 )] = 1 − [A + B].

(5.4)

Let us consider the following cases: (a) First let us assume that ν1 = 0; 1. (5.4) must hold for all x ∈ [0, 1]n , in particular for x = (ν1 , . . . ν1 ). In this case 0 < A + B = f (ν1 ) < 1, so the cutting function can be omitted, and we get B = (1 − n) + (n − 1) f (ν2 ), from which f (ν1 ) + f (ν2 ) = 1. (b) Next, we show that the cutting function can also be omitted, if ν1 = 0 (i.e. B = 0), ν2 = 1. This means that we have to show that i. n − A − (n − 1) f (ν2 ) ≤ 0 and A + B = A ≥ 1, or ii. n − A − (n − 1) f (ν2 ) ≥ 1 and A + B = A ≤ 0 cannot hold for all x ∈ [0, 1]n . For example, for x = (x, . . . x), where x = f −1 n1 = 0, we get A = 1, and (n − 1)(1 − f (ν2 )) > 0.

90

5 Aggregative Operators

(c) Next, we show that the cutting function can also be omitted if $\nu_1 = 1$ (i.e. $B = 1 - n$) and $\nu_2 \neq 0$. This means that we have to show that

i. $n - A - (n-1) f(\nu_2) \le 0$ and $A + B = A + 1 - n \ge 1$, or
ii. $n - A - (n-1) f(\nu_2) \ge 1$ and $A + B = A + 1 - n \le 0$

cannot hold for all $\mathbf{x} \in [0,1]^n$. Since $A \le n$, the condition $A + 1 - n \ge 1$ in 1(c)i holds only for $\mathbf{x} = (1, \dots, 1)$, not for all $\mathbf{x} \in [0,1]^n$. The condition in 1(c)ii does not hold for $\mathbf{x} = (1, \dots, 1)$, where $A = n$, say.

(d) For $\nu_1 = 0, \nu_2 = 1$ or $\nu_1 = 1, \nu_2 = 0$, the De Morgan property trivially holds.

2. Next, we will prove that $f(\nu_1) + f(\nu_2) = 1$ is also sufficient. If $f(\nu_1) + f(\nu_2) = 1$ holds, then $f(\nu_1) = 1 - f(\nu_2)$, so we have to prove the following equation:

$$f^{-1}\left(1 - [A - n + 1 + C]\right) = f^{-1}[n - A - C],$$

where $A := \sum_{i=1}^{n} f(x_i)$ and $C := (n-1) f(\nu_2)$. Since $1 - [A - n + 1 + C] = [1 - A + n - 1 - C]$, the statement is trivial.

Remark 5.3. For $\nu_1 = \nu_2$, the only solution is $\nu_1 = \nu_2 = f^{-1}\left(\frac{1}{2}\right)$; i.e. the self-De Morgan property holds if and only if the parameter $\nu$ is the fixpoint of the negation; i.e. $\nu = f^{-1}\left(\frac{1}{2}\right) = \nu^*$.

Remark 5.4. For $\nu_1 = \nu_2$, we find that the operator $o_\nu(\mathbf{x})$ fulfills the self-De Morgan property if and only if it has the following form:

$$f^{-1}\left[\sum_{i=1}^{n} f(x_i) - \frac{n-1}{2}\right]. \tag{5.5}$$

In particular, for two variables:

$$f^{-1}\left[f(x) + f(y) - \frac{1}{2}\right]. \tag{5.6}$$
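Proposition 5.3 can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the simplest generator $f(x) = x$, and the helper names (`cut`, `general_op`, `de_morgan_gap`) are chosen here for illustration only. It measures the largest violation of $n(o_{\nu_1}(\mathbf{x})) = o_{\nu_2}(n(\mathbf{x}))$ over random inputs.

```python
import random

def cut(x):
    """Cutting function [x]: clamp x to the unit interval."""
    return min(1.0, max(0.0, x))

# Assumption for this sketch: generator f(x) = x, so f and f^{-1} are identities.
def f(x):
    return x

def f_inv(y):
    return y

def neg(x):
    """Negation generated by f: n(x) = f^{-1}(1 - f(x))."""
    return f_inv(1.0 - f(x))

def general_op(xs, nu):
    """General operator o_nu(x) = f^{-1}[sum f(x_i) - (n - 1) f(nu)]."""
    n = len(xs)
    return f_inv(cut(sum(f(x) for x in xs) - (n - 1) * f(nu)))

def de_morgan_gap(nu1, nu2, n=3, trials=2000, seed=0):
    """Largest |n(o_nu1(x)) - o_nu2(n(x))| over random x in [0,1]^n."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        lhs = neg(general_op(xs, nu1))
        rhs = general_op([neg(x) for x in xs], nu2)
        worst = max(worst, abs(lhs - rhs))
    return worst
```

With this generator, a pair such as `(0.3, 0.7)` satisfies $f(\nu_1) + f(\nu_2) = 1$ and should give a gap at rounding level, whereas `(0.3, 0.3)` should not.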

Proposition 5.4. Let $f : [0,1] \to [0,1]$ be an increasing bijection, $\nu \in [0,1]$ and $\mathbf{x} = (x_1, \dots, x_n)$, where $x_i \in [0,1]$, $\mathbf{w} = (w_1, \dots, w_n)$, $w_i > 0$, $n(x) = f^{-1}(1 - f(x))$. The weighted general operator $a_{\nu,w}(\mathbf{x})$ satisfies the self-De Morgan property if and only if $\sum_{i=1}^{n} w_i = 1$ or $\nu = f^{-1}\left(\frac{1}{2}\right) = \nu^*$.

Proof. The self-De Morgan property means that

$$n(a_{\nu,w}(\mathbf{x})) = a_{\nu,w}(n(\mathbf{x}))$$

holds for all $\mathbf{x}$; i.e.

$$f^{-1}\left(1 - \left[\sum_{i=1}^{n} w_i (f(x_i) - f(\nu)) + f(\nu)\right]\right) = f^{-1}\left[\sum_{i=1}^{n} w_i (1 - f(x_i) - f(\nu)) + f(\nu)\right].$$

Let $A := \sum_{i=1}^{n} w_i f(x_i)$ and $B := \sum_{i=1}^{n} w_i$. Since $f(x)$ is strictly increasing, we have to show that

$$1 - [A - f(\nu)(B - 1)] = [B - A - B f(\nu) + f(\nu)].$$

1. First, we show that this condition is sufficient. If $B = 1$, then we get $1 - [A] = [1 - A]$, which always holds. If $f(\nu) = \frac{1}{2}$, then we get

$$1 - \left[A - \frac{B-1}{2}\right] = \left[B - A - \frac{B}{2} + \frac{1}{2}\right].$$

Using the fact that $1 - [x] = [1 - x]$ always holds, we can immediately see that the two sides are equal.

2. Second, we show that this condition is also necessary.

(a) First, let us assume that $\nu \neq 0, 1$. For $\mathbf{x} = (\nu, \dots, \nu)$, $A = f(\nu) B$, so on the left-hand side we get $1 - [f(\nu)]$, which means that the cutting function can be omitted. Thus $2 f(\nu)(B - 1) = B - 1$, from which $B = \sum_{i=1}^{n} w_i = 1$, or $f(\nu) = \frac{1}{2}$.

(b) For $\nu = 0$, we get $1 - [A] = [B - A]$. For $\mathbf{x}_0 = (x_0, \dots, x_0)$, where $0 < x_0 < 1$, $A = f(x_0) B$; i.e. $1 - [f(x_0) B] = [(1 - f(x_0)) B]$, where the cutting function can be omitted, since $f(x_0) B > 0$ and $(1 - f(x_0)) B > 0$. Thus $B = \sum_{i=1}^{n} w_i = 1$.

(c) For $\nu = 1$, we get $1 - [A - B + 1] = [-A + 1]$. For $\mathbf{x}_0 = (x_0, \dots, x_0)$, $0 < x_0 < 1$, $A = f(x_0) B$, so $1 - [f(x_0) B - B + 1] = [-f(x_0) B + 1]$; i.e. $[B - f(x_0) B] = [1 - f(x_0) B]$ must hold.

• If $B = \sum_{i=1}^{n} w_i \le 1$, then the cutting function can be omitted, and we get $B = \sum_{i=1}^{n} w_i = 1$.

• If $B = \sum_{i=1}^{n} w_i \ge 1$, then let $f(x_0) := \frac{1}{\sum_{i=1}^{n} w_i} = \frac{1}{B}$. So we get $B \le 1$, and $B = \sum_{i=1}^{n} w_i = 1$ must hold.
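The statement of Proposition 5.4 can also be probed numerically. The sketch below is only an illustration under the assumption $f(x) = x$ (so $n(x) = 1 - x$ and $\nu^* = 0.5$); the function names are hypothetical and not taken from the book.

```python
import random

def cut(x):
    """Cutting function [x]: clamp x to the unit interval."""
    return min(1.0, max(0.0, x))

def neg(x):
    """Negation for the assumed generator f(x) = x."""
    return 1.0 - x

def weighted_general_op(xs, ws, nu):
    """a_{nu,w}(x) = [sum w_i (x_i - nu) + nu], i.e. the formula with f(x) = x."""
    return cut(sum(w * (x - nu) for w, x in zip(ws, xs)) + nu)

def self_de_morgan_gap(ws, nu, trials=2000, seed=0):
    """Largest |n(a(x)) - a(n(x))| over random inputs x."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        xs = [rng.random() for _ in ws]
        lhs = neg(weighted_general_op(xs, ws, nu))
        rhs = weighted_general_op([neg(x) for x in xs], ws, nu)
        worst = max(worst, abs(lhs - rhs))
    return worst
```

Weights summing to one (any $\nu$) or $\nu = 0.5$ (any positive weights) should give a gap at rounding level; a choice violating both conditions, such as weights `[0.7, 0.9, 1.4]` with `nu = 0.8`, should not.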

Proposition 5.5. The weighted general operator aν,w (x) is commutative, if and only if w1 = w2 = · · · = wn . Proof. Trivial.


Corollary 5.1. A commutative weighted general operator fulfills the self-De Morgan property if and only if $w = \frac{1}{n}$ or $\nu = \nu^*$, where $f(\nu^*) = \frac{1}{2}$; i.e. it has one of the following forms:

$$f^{-1}\left[\frac{1}{n} \sum_{i=1}^{n} f(x_i)\right] \tag{5.7}$$

or

$$f^{-1}\left[w \left(\sum_{i=1}^{n} f(x_i) - \frac{n}{2}\right) + \frac{1}{2}\right]. \tag{5.8}$$

Remark 5.5. Note that (5.7) is a special case of (5.8) for $w = \frac{1}{n}$.

Remark 5.6. If $a_{\nu,w}$ is commutative and satisfies the self-De Morgan property, then it is independent of the parameter $\nu$. Therefore the lower index $\nu$ can be omitted, and we will refer to this case simply as $a_w$.

As we have seen, the weighted general operator of the form $f^{-1}\left[w \left(\sum_{i=1}^{n} f(x_i) - \frac{n}{2}\right) + \frac{1}{2}\right]$ is commutative and satisfies the self-De Morgan property. With such nice properties, it deserves a distinctive name.

Definition 5.5. The operator

$$a_w(\mathbf{x}) = f^{-1}\left[w \left(\sum_{i=1}^{n} f(x_i) - \frac{n}{2}\right) + \frac{1}{2}\right], \tag{5.9}$$

where $w > 0$, is called the weighted aggregative operator.

Proposition 5.6. The weighted general operator $a_{\nu,w}(\mathbf{x})$ satisfies

1. the boundary condition $a_{\nu,w}(\mathbf{0}) = 0$ if and only if $\nu = 0$ or $\sum_{i=1}^{n} w_i \ge 1$ (for a commutative operator: $w \ge \frac{1}{n}$);
2. the boundary condition $a_{\nu,w}(1, \dots, 1) = 1$ if and only if $\nu = 1$ or $\sum_{i=1}^{n} w_i \ge 1$ (for a commutative operator: $w \ge \frac{1}{n}$);
3. both of the above-mentioned boundary conditions if and only if $\sum_{i=1}^{n} w_i \ge 1$ (for a commutative operator: $w \ge \frac{1}{n}$);
4. $a_{\nu,w}(\nu, \dots, \nu) = \nu$.

Proof. Let $B := \sum_{i=1}^{n} w_i$.

1. $a_{\nu,w}(\mathbf{0}) = f^{-1}\left[\sum_{i=1}^{n} w_i (-f(\nu)) + f(\nu)\right] = 0$ if and only if $f(\nu)(1 - B) \le 0$; i.e. $\nu = 0$ or $\sum_{i=1}^{n} w_i \ge 1$.
2. $a_{\nu,w}(1, \dots, 1) = f^{-1}\left[\sum_{i=1}^{n} w_i (1 - f(\nu)) + f(\nu)\right] = 1$ if and only if $(1 - f(\nu)) B + f(\nu) \ge 1$; i.e. $(1 - B)(f(\nu) - 1) \ge 0$, so $\nu = 1$ or $\sum_{i=1}^{n} w_i \ge 1$.
3. It follows from the previous two statements.
4. $a_{\nu,w}(\nu, \dots, \nu) = f^{-1}[f(\nu)] = \nu$.

Remark 5.7. Note that for commutative operators, the condition $\sum_{i=1}^{n} w_i \ge 1$ is equivalent to $w \ge \frac{1}{n}$.

5.5.2 Bisymmetry

An important property of aggregation functions is the grouping character; i.e. whether it is possible to build a partial aggregation for subgroups of input values, and then to get the overall value by combining these partial results. A strong form of such a condition is associativity, which allows us to start the aggregation process before knowing all the inputs to be aggregated. However, associativity is a rather restrictive property. Associativity and idempotency together cancel the effect of repeating arguments in the aggregation procedure, so it is not possible to simulate the presence of weights by repeating arguments. A weaker condition is bisymmetry, which expresses the fact that the aggregation of the elements of any matrix can be performed first on the rows, then on the columns, or conversely. This natural property means that in the case of n judges and m candidates, say, the overall score of the candidates can be calculated by first aggregating the scores of each candidate, and then aggregating these overall values; alternatively, one can first aggregate the scores given by each judge and then aggregate these values. The following propositions characterize bisymmetric and associative functions (see Aczél [15, 16]).

Proposition 5.7. An operator $o : [0,1]^n \to \mathbb{R}$ is continuous, strictly increasing, idempotent, and bisymmetric if and only if it represents a quasi-linear mean; i.e. there is a continuous and strictly monotonic function $f : [0,1] \to \mathbb{R}$ such that

$$o(\mathbf{x}) = f^{-1}\left(\sum_{i=1}^{n} w_i f(x_i)\right),$$

where $w_i > 0$, $\sum_{i=1}^{n} w_i = 1$.

Proposition 5.8. An operator $o : [0,1]^n \to \mathbb{R}$ is continuous, strictly increasing and bisymmetric if and only if it represents a quasi-linear function; i.e. there is a continuous and strictly monotonic function $f : [0,1] \to \mathbb{R}$ such that

$$o(\mathbf{x}) = f^{-1}\left(\sum_{i=1}^{n} w_i f(x_i) + b\right),$$

where $w_i > 0$, $b \in \mathbb{R}$.

If instead of bisymmetry the function satisfies the stronger conditions of commutativity and associativity, then we have the following result, in which $w_i = 1$.

Proposition 5.9. An operator $o : [0,1]^n \to \mathbb{R}$ is continuous, strictly increasing, commutative and associative if and only if it represents a quasi-linear function with $w_i = 1$; i.e. there is a continuous and strictly monotonic function $f : [0,1] \to \mathbb{R}$ such that

$$o(\mathbf{x}) = f^{-1}\left(\sum_{i=1}^{n} f(x_i) + b\right),$$

$b \in \mathbb{R}$.

Proposition 5.10. The weighted aggregative operator with weights $w \le \frac{1}{n}$ is bisymmetric.

Proof. Since $0 \le f(x) \le 1$, $0 \le \sum_{i=1}^{n} f(x_i) \le n$. Therefore, for $w \le \frac{1}{n}$, $0 \le w \left(\sum_{i=1}^{n} f(x_i) - \frac{n}{2}\right) + \frac{1}{2} \le 1$, so in (5.9), the cutting function can be omitted, and the operator has the form of the function stated in Proposition 5.8, which means it is bisymmetric.
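To make the bisymmetry condition concrete, here is a small sketch for the two-variable case. It assumes the generator $f(x) = x$, and the helper names are hypothetical. It compares "rows first" with "columns first" aggregation of a 2×2 matrix of scores.

```python
import random

def cut(x):
    """Cutting function [x]: clamp x to the unit interval."""
    return min(1.0, max(0.0, x))

def a_w(x, y, w):
    """Two-variable weighted aggregative operator for the assumed f(x) = x."""
    return cut(w * (x + y - 1.0) + 0.5)

def bisymmetry_gap(w, m):
    """|rows-then-columns - columns-then-rows| for a 2x2 matrix m."""
    rows_first = a_w(a_w(m[0][0], m[0][1], w), a_w(m[1][0], m[1][1], w), w)
    cols_first = a_w(a_w(m[0][0], m[1][0], w), a_w(m[0][1], m[1][1], w), w)
    return abs(rows_first - cols_first)

def max_gap(w, trials=2000, seed=0):
    """Largest bisymmetry gap over random 2x2 matrices with entries in [0,1]."""
    rng = random.Random(seed)
    return max(
        bisymmetry_gap(w, [[rng.random(), rng.random()],
                           [rng.random(), rng.random()]])
        for _ in range(trials)
    )
```

For $w \le \frac{1}{2}$ the cutting function never becomes active and the two orders agree up to rounding. For $w = 1$ the matrix `[[1.0, 0.8], [0.3, 0.0]]` gives 0.5 one way and 0.6 the other in this sketch, which shows that the cutting function can break bisymmetry once it is active.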

5.6 The Two-Variable General and Weighted Aggregative Operator

Now, we examine the weighted aggregative operator of two variables.

Corollary 5.2. A commutative weighted general operator $a_{\nu,w}$ fulfills the self-De Morgan property if and only if $w = \frac{1}{2}$ or $\nu = \nu^*$, where $f(\nu^*) = \frac{1}{2}$; i.e. the weighted aggregative operator of two variables has the following form:

$$f^{-1}\left[w (f(x) + f(y) - 1) + \frac{1}{2}\right]. \tag{5.10}$$

Proof. It follows directly from Proposition 5.4.

Remark 5.8. Note that for $w = \frac{1}{2}$, (5.10) has the following form:

$$f^{-1}\left[\frac{f(x) + f(y)}{2}\right]. \tag{5.11}$$

This is the so-called general arithmetic mean, where the cutting function can be omitted.

Corollary 5.3. For a two-variable weighted aggregative operator $a_w$, $n(a_w(n(x), x)) = a_w(n(x), x) = \nu^*$, and, in particular, $a_w(0, 1) = a_w(1, 0) = \nu^*$, where $\nu^* = f^{-1}\left(\frac{1}{2}\right)$.

Corollary 5.4. A two-variable commutative weighted general operator $a_{\nu,w}$ satisfies the boundary conditions

1. $a_{\nu,w}(0, 0) = 0$ if and only if $w \ge \frac{1}{2}$ or $\nu = 0$;
2. $a_{\nu,w}(1, 1) = 1$ if and only if $w \ge \frac{1}{2}$ or $\nu = 1$.

Proof. It follows directly from Proposition 5.6.

Corollary 5.5. A two-variable commutative weighted aggregative operator $a_w$ satisfies the boundary conditions $a_w(0, 0) = 0$ and $a_w(1, 1) = 1$ if and only if $w \ge \frac{1}{2}$.

Corollary 5.6. A weighted aggregative operator $a_w(x, y)$ satisfies the boundary conditions $a_w(0, 0) = 0$ and $a_w(1, 1) = 1$ if and only if it has the following form:

$$a_w(x, y) = f^{-1}\left[w (f(x) + f(y) - 1) + \frac{1}{2}\right], \tag{5.12}$$

where $w \ge \frac{1}{2}$.

Proposition 5.11. A weighted aggregative operator $a_w(x, y)$, which satisfies the boundary conditions, has the following property (Fig. 5.2):

1. If $x, y \le \nu$, then $a_w(x, y) \le \nu$,
2. If $x, y \ge \nu$, then $a_w(x, y) \ge \nu$.

Proof. A weighted aggregative operator $a_w(x, y)$, which satisfies the boundary conditions, has the following form:

$$a_w(x, y) = f^{-1}\left[w (f(x) + f(y) - 1) + \frac{1}{2}\right],$$

where $w \ge \frac{1}{2}$.

1. First, we consider the case where $\nu$ is the fixpoint of the negation; i.e. $\nu = \nu^*$, $f(\nu) = \frac{1}{2}$.

(a) If $x, y \le \nu$, then from the increasing property of $f(x)$, we find that $f(x), f(y) \le \frac{1}{2}$; i.e. $w (f(x) + f(y) - 1) + \frac{1}{2} \le \frac{1}{2}$, so $a_w(x, y) \le \nu$.

(b) If $x, y \ge \nu$, then from the increasing property of $f(x)$, we find that $f(x), f(y) \ge \frac{1}{2}$; i.e. $w (f(x) + f(y) - 1) + \frac{1}{2} \ge \frac{1}{2}$, so $a_w(x, y) \ge \nu$.

2. Second, we consider the case where $w = \frac{1}{2}$. From $x, y \le \nu$, it follows that $f(x), f(y) \le f(\nu)$. Therefore, $f^{-1}\left[\frac{f(x) + f(y)}{2}\right] \le f^{-1}[f(\nu)] = \nu$. The case $x, y \ge \nu$ is analogous.

Fig. 5.2 The weighted aggregative operator $a_w$ for $f(x) = \dfrac{1}{1 + \frac{\nu_d}{1 - \nu_d} \cdot \frac{1 - x}{x}}$, $\nu_d = 0.8$

Proposition 5.12. A weighted aggregative operator with $w_1 = w_2 = 1$ has the following properties (see Fig. 5.3):

1. If $x, y \le \nu^*$, then $a_1(x, y)$ is conjunctive; i.e. $a_1(x, y) \le \min(x, y)$.
2. If $x, y \ge \nu^*$, then $a_1(x, y)$ is disjunctive; i.e. $a_1(x, y) \ge \max(x, y)$.
3. If $x \le \nu^* \le y$ or $y \le \nu^* \le x$, then $a_1(x, y)$ is averaging; i.e. $\min(x, y) \le a_1(x, y) \le \max(x, y)$,

where $\nu^* = f^{-1}\left(\frac{1}{2}\right)$.

Proof. The operator has the following form:

$$a_1(x, y) = f^{-1}\left[(f(x) + f(y) - 1) + \frac{1}{2}\right] = f^{-1}\left[f(x) + f(y) - \frac{1}{2}\right].$$

1. Let us assume that $x \le y \le \nu^*$. From the increasing property of $f(x)$, we see that $f(x) \le f(y) \le f(\nu^*) = \frac{1}{2}$; i.e. $a_1(x, y) = f^{-1}\left[(f(x) + f(y) - 1) + \frac{1}{2}\right] \le x = \min(x, y)$.
2. Let us assume that $\nu^* \le x \le y$. From the increasing property of $f(x)$, we see that $\frac{1}{2} = f(\nu^*) \le f(x) \le f(y)$; i.e. $a_1(x, y) = f^{-1}\left[(f(x) + f(y) - 1) + \frac{1}{2}\right] \ge y = \max(x, y)$.
3. Let us assume that $x \le \nu^* \le y$. Then $f(x) \le \frac{1}{2} \le f(y)$, so $\min(x, y) = x \le a_1(x, y) = f^{-1}\left[(f(x) + f(y) - 1) + \frac{1}{2}\right] \le y = \max(x, y)$.
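The three regions of Proposition 5.12 can be verified exhaustively on a grid. The following sketch assumes the generator $f(x) = x$ (so $\nu^* = \frac{1}{2}$) and uses exact rational arithmetic to avoid rounding effects at the region borders; the function names are illustrative only.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # nu* = f^{-1}(1/2) for the assumed generator f(x) = x

def cut(x):
    """Cutting function [x] on exact rationals."""
    return min(Fraction(1), max(Fraction(0), x))

def a1(x, y):
    """Weighted aggregative operator with w = 1 and f(x) = x."""
    return cut(x + y - HALF)

def uninorm_like_on_grid(steps=20):
    """Check the conjunctive / disjunctive / averaging regions on a grid."""
    grid = [Fraction(i, steps) for i in range(steps + 1)]
    for x in grid:
        for y in grid:
            v = a1(x, y)
            lo, hi = min(x, y), max(x, y)
            if hi <= HALF and not v <= lo:         # conjunctive region
                return False
            if lo >= HALF and not v >= hi:         # disjunctive region
                return False
            if lo <= HALF <= hi and not lo <= v <= hi:  # averaging region
                return False
    return True
```

For instance, `a1(Fraction(1, 4), Fraction(1, 4))` is 0, below both inputs, while `a1(Fraction(0), Fraction(1))` equals $\nu^* = \frac{1}{2}$.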


Fig. 5.3 The uninorm-like property of the weighted aggregative operator a1 (x, y)

Remark 5.9. The above-mentioned property holds if and only if $w = 1$. For $w > 1$, the conjunctive and disjunctive properties hold, but the averaging property does not.

Remark 5.10. As we have seen, $a_1(x, y)$ has a uninorm-like property (see Proposition 5.12) and it satisfies the self-De Morgan property as well. However, it is not associative (since $a_1(0, 1) = a_1(1, 0) = f^{-1}\left(\frac{1}{2}\right) = \nu^*$) and therefore it cannot be a uninorm. Note that aggregative operators in the strict case (see Dombi [1]) are always associative and therefore they are special uninorms.

Note that by substituting $n(x)$ and $y$ into the commutative self-De Morgan weighted aggregative operator, the operator $a(n(x), y)$ has certain properties that are similar to those expected of a preference operator. Preference modelling is a fundamental part of several applied fields of decision-making [17]. In the classical theory, preference is a binary relation closely related to implications, with the meaning $x R y \iff$ “y is not worse than x”. Preferences between alternatives can also be described by a valued preference relation $p$, such that the value $p(x, y)$ is normalized, and it is understood as the degree to which the statement “y is not worse than x” is true: $p(x, y) = \text{truth of } (y \ge x)$. Here, $p$ is a continuous function, which is strictly decreasing in the first variable and strictly increasing in the second one, and $p(x, y) = n(p(y, x))$ must also hold. Therefore it is sensible to define preference in the following way:


Definition 5.6. Let $w > 0$ be a real parameter and $f : [0,1] \to [0,1]$ be an increasing bijection. Moreover, let us define the preference operator as

$$p_w(x, y) = a_w(n(x), y) = f^{-1}\left[w (f(y) - f(x)) + \frac{1}{2}\right].$$

Remark 5.11. Note that the negation operator generated by $f(x) : [0,1] \to [0,1]$ can be expressed in the following way:

$$n(x) = f^{-1}\left[(f(\nu^*) - f(x)) + f(\nu^*)\right], \tag{5.13}$$

where $\nu^* = f^{-1}\left(\frac{1}{2}\right)$.
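The preference operator of Definition 5.6 is easy to experiment with. The sketch below is illustrative only: the generator $f(x) = x^2$ and the weight $w = 0.8$ are arbitrary choices made here, and `make_preference` is a hypothetical helper, not part of the book. It builds $n$ and $p_w$ from a generator and its inverse.

```python
import math

def cut(x):
    """Cutting function [x]: clamp x to the unit interval."""
    return min(1.0, max(0.0, x))

def make_preference(f, f_inv, w):
    """Return (negation, preference) built from generator f and weight w > 0."""
    def negation(x):
        # n(x) = f^{-1}(1 - f(x))
        return f_inv(1.0 - f(x))

    def preference(x, y):
        # p_w(x, y) = f^{-1}[w (f(y) - f(x)) + 1/2]
        return f_inv(cut(w * (f(y) - f(x)) + 0.5))

    return negation, preference

# Assumed example generator: f(x) = x^2 on [0, 1], with inverse sqrt.
n, p = make_preference(lambda x: x * x, math.sqrt, w=0.8)
```

With these choices, `p(x, x)` returns $\nu^* = \sqrt{0.5}$ for every `x`, and `p(x, y)` agrees with `n(p(y, x))` up to rounding, as required of a preference operator in the text above.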

Corollary 5.7. We have shown that the general formula

$$a_{\nu,w}(\mathbf{x}) := f^{-1}\left[\sum_{i=1}^{n} w_i (f(x_i) - f(\nu)) + f(\nu)\right] \tag{5.14}$$

for the weighted general operator includes the following special cases:

1. For $f(\nu) = 1$ and $w_i = 1\ \forall i$, it is a conjunction with generator $1 - f(x)$.
2. For $f(\nu) = 0$ and $w_i = 1\ \forall i$, it is a disjunction with generator $f(x)$.
3. For $f(\nu) = \frac{1}{2}$ or $\sum_{i=1}^{n} w_i = 1$, it satisfies the self-De Morgan property.
4. For $f(\nu) = \frac{1}{2}$ and $w_1 = \dots = w_n$, or for $w_i = \frac{1}{n}$, it is a weighted aggregative operator (a commutative self-De Morgan operator).
5. For $\nu = 0$ or $\sum_{i=1}^{n} w_i \ge 1$, it satisfies the boundary condition $a_\nu(\mathbf{0}) = 0$.
6. For $\nu = 1$ or $\sum_{i=1}^{n} w_i \ge 1$, it satisfies the boundary condition $a_\nu(\mathbf{1}) = 1$.
7. In particular for two variables, with $\frac{1}{2} \le w_1 = w_2$ and $f(\nu) = \frac{1}{2}$, it
   • is commutative,
   • satisfies the De Morgan property,
   • satisfies the boundary conditions (i.e. $a(0, 0) = 0$ and $a(1, 1) = 1$),
   • satisfies $a(0, 1) = a(1, 0) = \nu$,
   • if $x, y \le \nu$, then $a(x, y) \le \nu$,
   • if $x, y \ge \nu$, then $a(x, y) \ge \nu$.
8. For two variables, with $w_1 = w_2 = 1$ and $f(\nu) = \frac{1}{2}$, it
   • is commutative,
   • satisfies the De Morgan property,
   • satisfies the boundary conditions (i.e. $a(0, 0) = 0$ and $a(1, 1) = 1$),
   • satisfies $a(0, 1) = a(1, 0) = \nu$,
   • is conjunctive for $x, y \le \nu$,
   • is disjunctive for $x, y \ge \nu$,
   • is averaging for $x \le \nu \le y$ and for $y \le \nu \le x$.
9. For one variable and with $w = -1$, it is a negation operator with generator $f(x)$.
10. $a_w(n(x), y) = f^{-1}\left[w (f(y) - f(x)) + \frac{1}{2}\right] = p_w(x, y)$.

5.7 Summary

To sum up, the weighted general operator (obtained by shifting and weighting the generator function of a disjunction) provides a general framework for different types of operators using only one generator function. The formula contains a parameter with the semantic meaning of the threshold of expectancy. By changing the parameter values, we can obtain conjunctive, disjunctive and self-dual operators with nice properties, and, for one variable, a negation operator. As a starting point, instead of associativity, we focused on the necessary and sufficient condition of the self-dual property. These results may contribute significantly to applications in machine learning, since the parametric formula is easy to learn: using an adequate optimization technique, the parameter with the best fit can be found. For a thorough examination of the preference operator, see Chapter 6. The main disadvantage of the nilpotent operator family is the lack of differentiability, since in many important areas the parameters are learned by a gradient-based optimization method, and the lack of continuous derivatives makes such methods inapplicable. The so-called squashing function (see Chapter 7) solves this problem by providing a continuously differentiable approximation of the cutting function. In the next chapter, we will focus on the preference operator.

References

1. Dombi, J.: Basic concepts for a theory of evaluation: the aggregative operator. Eur. J. Oper. Res. 10, 282–293 (1982). ISSN 03772217. https://doi.org/10.1016/0377-2217(82)90227-2
2. Yager, R.R., Rybalov, A.: Uninorm aggregation operators. Fuzzy Sets Syst. 80(1), 111–120 (1996). ISSN 01650114. https://doi.org/10.1016/0165-0114(95)00133-6
3. De Baets, J., Bernard, F., An alternative proof: Van Melle’s combining function in MYCIN is a representable uninorm. Fuzzy Sets Syst. 104, 133–136 (1999)
4. Beliakov, G., Pradera, A., Calvo, T.: Aggregation Functions: A Guide for Practitioners. Studies in Fuzziness and Soft Computing, p. 375. Springer, Heidelberg (2007)
5. Yager, R.R., Rybalov, A.: Bipolar aggregation using the Uninorms. Fuzzy Optimization and Decision Making 10(December 2010), 59–70 (2011). ISSN 15684539. https://doi.org/10.1007/s10700-010-9096-8
6. Benvenuti, P., Mesiar, R.: Pseudo-arithmetical operations as a basis for the general measure and integration theory. Information Sciences 160, 1–11 (2004). ISSN 00200255. https://doi.org/10.1016/j.ins.2003.07.005
7. Klement, E.P., Mesiar, R., Pap, E.: Integration with respect to decomposable measures, based on a conditionally distributive semiring on the unit interval. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 8(6), 701–717 (2000). ISSN 0218-4885. https://doi.org/10.1016/S02184885(00)00051-4
8. Fodor, J.C., Yager, R.R., Rybalov, A.: Structure of uninorms. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 5(4), 411–427 (1997). ISSN 0218-4885. https://doi.org/10.1142/S0218488597000312
9. Klement, E.P., Mesiar, R., Pap, E.: On the relationship of associative compensatory operators to triangular norms and conorms. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 04(02), 129–144 (1996). https://doi.org/10.1142/S0218488596000081
10. Fodor, J., De Baets, B.: A single-point characterization of representable uninorms. Fuzzy Sets Syst. 202, 89–99 (2012). ISSN 01650114. https://doi.org/10.1016/j.fss.2011.12.001
11. Li, G., Liu, H.-W., Fodor, J.: Single-point characterization of uninorms with nilpotent underlying t-norm and t-conorm. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 22(04), 591–604 (2014). https://doi.org/10.1142/S0218488514500299
12. Csiszar, O., Fodor, J.: On uninorms with fixed values along their border. Annales Univ. Sci. Budapest. Sect. Comp. 42, 93–108 (2014)
13. Dombi, J., Csiszár, O.: Self-dual operators and a general framework for weighted nilpotent operators. Int. J. Approximate Reasoning 81, 115–127 (2017)
14. Kolesárová, A., Komorníková, M.: Triangular norm-based iterative compensatory operators. Fuzzy Sets Syst. 104(1), 109–120 (1999). ISSN 0165-0114. https://doi.org/10.1016/S01650114(98)00263-2
15. Aczél, J.: Über eine Klasse von Funktionalgleichungen 1(xl), 247–252 (1920)
16. Aczél, J., Dhombres, J.: Functional Equations in Several Variables. Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge (1989). https://doi.org/10.1017/CBO9781139086578
17. Fodor, J., Roubens, M.: Fuzzy preference modelling and multicriteria decision support (1994)
18. Dujmović, J.J., Larsen, H.L.: Generalized conjunction/disjunction. Int. J. Approx. Reason. 46(3), 423–446 (2007)

Chapter 6

Preference Operators

Abstract The theories of multi-criteria decision-making (MCDM) and fuzzy logic both seek to model human thinking. In MCDM, aggregation processes and preference modeling play the central role. This chapter presents a consistent framework for modeling human thinking using the tools of both fields: fuzzy logical operators as well as aggregation and preference operators. In this framework, aggregation, preference, and the logical operators are described by the same unary generator function. Similar to the implication being defined as a composition of the disjunction and the negation operator, preference operators that describe to what extent x is preferable to y, are introduced as a composition of the aggregative operator and the negation operator. Although these operators have many properties in common with implications, we show that there is a subtle but important difference. After a profound examination of the main properties of the preference operator, our main goal is the implementation of this operator into neural networks.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. J. Dombi and O. Csiszár, Explainable Neural Networks Based on Fuzzy Logic and Multi-criteria Decision Tools, Studies in Fuzziness and Soft Computing 408, https://doi.org/10.1007/978-3-030-72280-7_6

6.1 Introduction

When it comes to modeling human thinking, two main approaches have received special attention in the last few decades: fuzzy logic and multi-criteria decision analysis (MCDA), or multi-criteria decision-making (MCDM). In real-world applications, a decision-maker, more often than not, faces decision situations where multiple criteria have to be considered simultaneously. Since the modeling is always affected by the presence of different kinds of uncertainty due to imperfect human knowledge, fuzzy set theory, a language that is capable of dealing with uncertainty, has been successful in MCDM modeling as well. Fuzzy sets provide a theoretical framework for quantifying types of uncertainty, such as imprecision and ambiguity, that are inherent in many decision-making processes. The seminal paper by Orlovsky [1] may be regarded as the first attempt to use fuzzy set theory in preference modeling. In his paper, Orlovsky defines the strict preference relation and the indifference relation with the use of Łukasiewicz- and


minimum t-norms. As a consequence, numerous approaches have been proposed to solve fuzzy MCDM problems [2]. A review and comparison of many of these methods can be found in [3]. In multiple criteria decision-making (MCDM), the decision-maker’s preference plays a key role (see e.g. [4]; for a comprehensive taxonomy of the MCDA process characteristics, see [5]), and hence, preference modeling is fundamental. The classical MCDM procedures generally proceed in two steps: aggregation and exploitation. First, the aggregation part defines an outranking relation that indicates the global preference between any ordered pair of alternatives. Second, the exploitation transforms the information into a global ranking, usually by using a ranking method to obtain a score function, as in classical procedures typical of the so-called European (or French) School, such as PROMETHEE [6] and ELECTRE III [7]. The preference operators of these outranking methods can be readily described by the generator-based preference operator introduced here. The aggregation procedures in decision-making often use value functions or preference relations. In the classical theory, preference is a binary relation with the semantic meaning of p(x, y) = truth of (x ≤ y). In order to deal with the preference operator and the logical operators in a consistent framework, the input and output values need to be normalized. As we can easily see, the preference operator does not belong to the logical operators, since logical operators need to be consistent with classical logic; i.e. on the boundaries we need to get crisp values from {0, 1}. However, if the two input values are the same, preference operators should give a neutral output value (greatest uncertainty level); i.e. one different from 0 and 1. This means that for (0, 0) and for (1, 1), preference operators cannot have a crisp value from {0, 1}, and therefore they do not belong to the world of logical operators in the strict sense.
In this chapter, we will suggest how we can still create a theoretical framework that synthesizes the worlds of continuous logic and MCDM, and examine the main properties of the preference operator in nilpotent systems. This consistent framework has potential applications in the field of artificial intelligence, as an important step towards the interpretability of neural models. Recently, intelligent learning methods, especially deep learning models have been revolutionizing the business and technology world. One of the greatest challenges is the increasing need for interpretability, transparency, and safety. Although deep neural networks have achieved impressive experimental results, they may surprisingly be unstable when it comes to adversarial perturbations. For example in image classification, minimal changes to the input image may cause the network to misclassify it [8–11]. In predictive modeling, interpretability (opening the “black box”) is becoming more and more important. In a high-risk environment, we also need to know the reasons why a decision was made. Neural models have also been developed for multiple criteria decision-making [12, 13]. In these models, the motivation is to model the decision-maker’s underlying preference structures by means of supervised learning based on sampled preference data. The recent advances in theory and methodology of neural networks and fuzzy


logic have laid the foundation for developing models based on neural architecture for MCDM in a fuzzy environment. On the one hand, fuzzy systems can deal with uncertainty and linguistic terms, modeling the decision-maker’s preferences using fuzzy rules. On the other hand, neural networks can exhibit learning capability. In this direction, Preference Learning (PL) is emerging as an extended paradigm in machine learning by inducing predictive preference models from experimental data [14–17]. PL is used in various research fields such as knowledge discovery or recommender systems. Although various combinations of neural networks and MCDM have been considered in different contexts, there has been little attempt to combine neural networks with continuous logical systems. The novelty here is to suggest a consistent framework for modeling human thinking by using the tools of all three fields: fuzzy logical operators, MCDM tools (such as aggregation and preference operators), as well as deep learning methods. Beyond the theoretical demand, our objective here is to provide multicriteria decision tools to the nilpotent neural model introduced in [18]. This chapter is organized as follows. In Sect. 6.2, we recall the basic preliminaries for better readability. Section 6.3 introduces the problem of preference modeling in these systems and suggests a definition for the preference operator combining the aggregative operator with the negation operator of the system [19]. The main properties of the preference operator are examined in Sect. 6.4. In Chapter 9, we will show how the nilpotent preference can be modeled by a perceptron and illustrate this with applications in neural networks. To obtain differentiability, the squashing function as a smooth approximation of the cutting function is used in the formulae.

6.2 Operators of Nilpotent Systems - A General Framework First, we shall highlight some of our related preliminary results. Among other families of fuzzy logics, nilpotent fuzzy logic is beneficial from several perspectives. The fulfillment of the law of contradiction and the excluded middle, and the coincidence of the residual and the S-implication [20, 21] make the application of nilpotent operators in logical systems look promising. In [22–27], a wide range of operators were thoroughly examined: in [23], negations, conjunctions and disjunctions, in [24] implications, and in [25] equivalence operators. In [26], the aggregative operators were studied and a parametric form of a general operator oν was given by using a shifting transformation of the generator function. Varying the parameters, nilpotent conjunctive, disjunctive, aggregative (where a high input can compensate for a lower one) and negation operators can all be derived. It was also demonstrated how the nilpotent generated operator can be applied for preference modeling. Moreover, as shown in [27], membership functions, which play a substantial role in the overall performance of fuzzy representation, can be defined using a generator function. In


[18], the authors showed that in the area of continuous logic, nilpotent logical systems are the most suitable for neural computation.

6.2.1 Normalization of the Generator Functions

Let us first consider the most important operators in classical logic, namely the conjunction, the disjunction, and the negation operator. These three basic operators together form a so-called connective system. When extending classical logic to continuous logic, compatibility and consistency are crucial. The negation should also be involutive; i.e. $n(n(x)) = x$ for all $x \in [0,1]$. Involutive negations are called strong negations.

Definition 1. [23] The triple $(c, d, n)$, where $c$ is a t-norm, $d$ is a t-conorm and $n$ is a strong negation, is called a connective system.

Definition 2. [23] A connective system is nilpotent if the conjunction $c$ is a nilpotent t-norm, and the disjunction $d$ is a nilpotent t-conorm.

In the nilpotent case, the generator functions of the conjunction and the disjunction (denoted by $t(x)$ and $s(x)$, respectively) are bounded functions, determined up to a multiplicative constant. This means that they can be normalized in the following way:

$$f_c(x) := \frac{t(x)}{t(0)}, \qquad f_d(x) := \frac{s(x)}{s(1)}. \tag{6.1}$$

Note that the normalized generator functions are now uniquely defined. Next, we recall the definition of the cutting function, to simplify the notations used. The differentiable approximation of the cutting function, the squashing function $S(x)$ introduced and examined in [28], is a ReLU-like bounded activation function in our model. In [26], the authors showed that all the nilpotent operators can be described by using one generator function $f(x)$ and the cutting function.

Definition 3. [23] Let us define the cutting operation $[\ ]$ by

$$[x] = \begin{cases} 0 & \text{if } x < 0, \\ x & \text{if } 0 \le x \le 1, \\ 1 & \text{if } 1 < x. \end{cases}$$

Let $w_i > 0$, $\mathbf{x} = (x_1, \dots, x_n) \in [0,1]^n$, $\boldsymbol{\nu} = (\nu_1, \dots, \nu_n) \in [0,1]^n$ and let $f : [0,1] \to [0,1]$ be a strictly increasing bijection. Let us define the threshold-based nilpotent operator by

$$o_{\nu,w}(\mathbf{x}) = f^{-1}\left[\sum_{i=1}^{n} w_i (f(x_i) - f(\nu_i)) + f(\nu)\right] = f^{-1}\left[\sum_{i=1}^{n} w_i f(x_i) + C\right], \tag{6.12}$$

where

$$C = f(\nu) - \sum_{i=1}^{n} w_i f(\nu_i). \tag{6.13}$$

Remark 6.1. Note that the equation in (6.12) describes the perceptron model in neural computation. Here, the parameters all have semantic meanings, namely importance (weights), decision level and level of expectancy. The most commonly used operators for n = 2 and for special values of wi and C, also for f (x) = x, are listed in Table 6.1.
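The perceptron reading of (6.12) can be illustrated with a short sketch. It assumes the generator $f(x) = x$, so the operator reduces to a weighted sum plus a bias $C$ passed through the cutting function. The three special cases below are derived from the formulas of Corollary 5.7 with all $\nu_i = \nu$ rather than copied from Table 6.1, and the function names are hypothetical.

```python
def cut(x):
    """Cutting function [x]: a clamp to [0, 1], acting as the activation."""
    return min(1.0, max(0.0, x))

def threshold_operator(xs, ws, bias):
    """o(x) = [sum_i w_i x_i + C] for the assumed generator f(x) = x."""
    return cut(sum(w * x for w, x in zip(ws, xs)) + bias)

def conjunction(x, y):
    # w = (1, 1), f(nu) = 1  =>  C = 1 - 2 = -1, i.e. [x + y - 1]
    return threshold_operator([x, y], [1.0, 1.0], -1.0)

def disjunction(x, y):
    # w = (1, 1), f(nu) = 0  =>  C = 0, i.e. [x + y]
    return threshold_operator([x, y], [1.0, 1.0], 0.0)

def arithmetic_mean(x, y):
    # w = (1/2, 1/2)  =>  C = f(nu) - f(nu) = 0, i.e. [(x + y) / 2]
    return threshold_operator([x, y], [0.5, 0.5], 0.0)
```

On crisp inputs the first two functions reproduce the classical truth tables of AND and OR, which is the compatibility with classical logic required in Sect. 6.2.1.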

6.2.3 The Unary Operators: Negation, Modifiers and Hedges

Now let us focus on the unary (1-variable) case, investigated in [27], which also plays a crucial role in the nilpotent neural model. The unary operators are mainly used to construct modifiers and membership functions by using a generator function. The role of membership functions can be viewed as the modeling of an inequality [29]. Note that non-symmetrical membership functions can also be constructed by connecting two unary operators with a conjunction [22, 27]. For the most important unary operators, see Table 6.2.

Definition 8. [27] Let $\lambda \in \mathbb{R}^+$, $\lambda > 1$, $\nu \in [0,1]$, and let $f : [0,1] \to [0,1]$ be an increasing bijection. Let us define the unary operator $\tau_\nu^{(\lambda)}(x)$ in the following way:

$$\tau_\nu^{(\lambda)}(x) := f^{-1}\left[\lambda (f(x) - f(\nu)) + f(\nu)\right]. \tag{6.14}$$

Remark 6.2. For $\nu = 1$, $\nu = 0$ and $\nu = \nu^*$ (i.e. $f(\nu) = \frac{1}{2}$), we get the necessity, the possibility and the sharpness operators, respectively.

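Table 6.2 The key unary operators $\tau_\nu^{(\lambda)}(x)$

Operator | $\nu$ | $\tau_\nu^{(\lambda)}(x)$ | for $f(x) = x$ | Notation
Possibility | $0$ | $f^{-1}[\lambda f(x)]$ | $[\lambda x]$ | $\tau_P(x)$
Necessity | $1$ | $f^{-1}[\lambda f(x) - (\lambda - 1)]$ | $[\lambda x - (\lambda - 1)]$ | $\tau_N(x)$
Sharpness | $\nu_* = f^{-1}\left(\frac{1}{2}\right)$ | $f^{-1}\left[\lambda f(x) - \frac{1}{2}(\lambda - 1)\right]$ | $\left[\lambda x - \frac{1}{2}(\lambda - 1)\right]$ | $\tau_S(x)$
Negation ($\lambda = -1$) | $\nu_* = f^{-1}\left(\frac{1}{2}\right)$ | $f^{-1}[-f(x) + 1]$ | $[-x + 1]$ | $n(x)$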
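The rows of Table 6.2 are all instances of Eq. (6.14) with different values of $\nu$ and $\lambda$. The following sketch, with the hypothetical name `unary_operator` and the generator $f(x) = x$ assumed by default, shows how a single function could cover all four cases; the operator labels follow Remark 6.2.

```python
from typing import Callable


def cut(x: float) -> float:
    """Cutting function [x]: clamp x to the unit interval."""
    return min(max(x, 0.0), 1.0)


def unary_operator(
    x: float,
    lam: float,
    nu: float,
    f: Callable[[float], float] = lambda t: t,
    f_inv: Callable[[float], float] = lambda t: t,
) -> float:
    """tau_nu^(lambda)(x) = f^{-1}[ lambda (f(x) - f(nu)) + f(nu) ], cf. Eq. (6.14)."""
    return f_inv(cut(lam * (f(x) - f(nu)) + f(nu)))


# Special cases for f(x) = x (illustrative input values):
possibility = unary_operator(0.3, lam=2.0, nu=0.0)  # [2 * 0.3]
necessity = unary_operator(0.8, lam=2.0, nu=1.0)    # [2 * 0.8 - 1]
sharpness = unary_operator(0.6, lam=2.0, nu=0.5)    # [2 * 0.6 - 0.5]
negation = unary_operator(0.3, lam=-1.0, nu=0.5)    # [1 - 0.3]
```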

6.3 Preference Modeling

Preference modeling is an indispensable part of several applied fields of decision-making and, at the same time, it has its own intriguing theoretical problems [2]. Since the modeling is always affected by different kinds of uncertainty, the use of soft techniques is sensible. Fuzzy set theory is a language that is capable of dealing with uncertainty. In the classical theory, preference is a binary relation closely related to the implication: $x R y \iff$ “$y$ is not worse than $x$”. Preferences between any two alternatives can also be described by a valued preference relation $p$, such that the value $p(x, y)$ is normalized and interpreted as the degree to which the statement “$y$ is not worse than $x$” is true: $p(x, y) = \text{truth of } (y \ge x)$. Here, $p$ is a continuous function, which is strictly decreasing in the first variable and strictly increasing in the second variable, and $p(x, y) = n(p(y, x))$ must also hold.

In accordance with the case of the implication, defined as a composition of the disjunction and the negation operator, $i(x, y) = d(n(x), y)$, it seems useful to define the preference operator by composing the aggregation and the negation operator, $p(x, y) = a(n(x), y)$. In other words, by substituting $n(x)$ and $y$ into the commutative self-De Morgan weighted aggregative operator, the operator $a(n(x), y)$ has certain properties that are similar to those expected of a preference operator. Consequently, it is sensible to define the preference operator in the following way:

Definition 9. Let $w > 0$ be a real parameter and let $f : [0, 1] \to [0, 1]$ be an increasing bijection. Let us define the preference operator as

\[ p_w(x, y) = a_w(n(x), y) = f^{-1}\left[ w \left( f(y) - f(x) \right) + \frac{1}{2} \right]. \]

Remark 4. Note that for $w = \frac{1}{2}$, $p_{1/2}(x, y) = f^{-1}\left[ \frac{1}{2} \left( f(y) - f(x) + 1 \right) \right]$, henceforth referred to as $p(x, y)$.

Preference operators with different generator functions and weights are illustrated in Fig. 6.1. The fact that the implication and the preference operators can be derived in a similar way, as $o_{\nu,w}(n(x), y)$ with $\nu = 1$ and $\nu = f^{-1}\left(\frac{1}{2}\right)$, respectively, provides a possible explanation for the common misconception about their use. Let us consider the following two examples:

If $x < y$ and $y < z$, then $x < z$.
If $x \to y$ and $y \to z$, then $x \to z$.

The first one is based on the property of the preference relation describing the transitivity of preferences, while the second one is based on the implication (hypothetical reasoning or hypothetical syllogism). In our everyday language, we do not usually distinguish between these two types of reasoning, and we tend to confuse them [30].

Fig. 6.1 The preference operator with generator functions $f(x) = x$ and $f(x) = \sqrt{x}$, for $w = 1$ and $w = 0.5$
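As a small illustration of Definition 9, the preference operator can be evaluated for the two generator functions mentioned in the caption of Fig. 6.1. This is only a sketch: the function name `preference` and the sample inputs are hypothetical.

```python
import math
from typing import Callable


def cut(x: float) -> float:
    """Cutting function [x]: clamp x to the unit interval."""
    return min(max(x, 0.0), 1.0)


def preference(
    x: float,
    y: float,
    w: float = 0.5,
    f: Callable[[float], float] = lambda t: t,
    f_inv: Callable[[float], float] = lambda t: t,
) -> float:
    """p_w(x, y) = f^{-1}[ w (f(y) - f(x)) + 1/2 ], cf. Definition 9."""
    return f_inv(cut(w * (f(y) - f(x)) + 0.5))


# f(x) = x, w = 1/2: p(0.2, 0.6) = (0.6 - 0.2 + 1) / 2
p_linear = preference(0.2, 0.6)

# f(x) = sqrt(x), so f^{-1}(t) = t^2; w = 1/2
p_sqrt = preference(0.25, 0.81, f=math.sqrt, f_inv=lambda t: t * t)

# A larger weight saturates earlier: w = 1 gives [0.4 + 0.5]
p_weighted = preference(0.2, 0.6, w=1.0)
```

A value above $\nu_* = f^{-1}(1/2)$ signals that $y$ is preferred to $x$, and the weight $w$ controls how quickly the operator saturates at 0 or 1.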

6.4 Properties of the Preference Operator In this section, we will give a systematic overview of the main properties of the preference operator defined in Definition 9. First, the basic properties are examined in Sect. 6.4.1. In Sect. 6.4.2, we focus on the ordering properties, which play an outstanding role in preference modeling. In Sects. 6.4.3 to 6.4.5 a wide range of compositions with other operators (namely the negation operator, the conjunction, the disjunction, the aggregation and some other unary operators) are investigated. Lastly, additive transitivity and bisymmetry are studied in Sects. 6.4.6 and 6.4.7.

6.4.1 Basic Properties

Now, we examine some basic properties of the preference operator $p_w(x, y)$. Note the similarities with the properties of implications.

Proposition 2. The preference operator $p_w(x, y)$ has the following properties:

1. Continuity;
2. Self-duality (SD, see also Sect. 6.4.3); i.e. $p_w(x, y) = n(p_w(n(x), n(y)))$; (SD)
3. Neutrality: $p_w(x, x) = \nu_*$;
4. Weak dominance of falsity of antecedent (WDF): $p_w(0, y) \ge \nu_*$ for all $y \in [0, 1]$; (WDF)
5. Weak dominance of truth of consequent (WDT): $p_w(x, 1) \ge \nu_*$ for all $x \in [0, 1]$; (WDT)
6. Boundary conditions ((BC), compatibility): $p_w(0, 0) = p_w(1, 1) = \nu_*$; $p_w(0, 1) \ge \nu_*$, $p_w(1, 0) \le \nu_*$; (BC)
7. In particular, $p_w(0, 1) = 1$ and $p_w(1, 0) = 0$ if and only if $w \ge \frac{1}{2}$;
8. The preference property (PP); i.e. $x < y$ if and only if $p_w(x, y) > \nu_*$; (PP)
9. The threshold transitivity (TT); i.e. $p_w(x, y) > \nu_*$ and $p_w(y, z) > \nu_* \Rightarrow p_w(x, z) > \nu_*$; (TT)

where $\nu_* = f^{-1}\left(\frac{1}{2}\right)$.

Proof.
1. It follows directly from the continuity of $f$.
2. From the commutativity and self-duality of $a_w(x, y)$, we get $p_w(x, y) = a_w(n(x), y) = n(a_w(x, n(y))) = n(p_w(n(x), n(y)))$.
3. It follows from direct calculation.
4. $p_w(0, y) = f^{-1}\left[ w f(y) + \frac{1}{2} \right] \ge f^{-1}\left(\frac{1}{2}\right) = \nu_*$.
5. $p_w(x, 1) = f^{-1}\left[ w (1 - f(x)) + \frac{1}{2} \right] \ge f^{-1}\left(\frac{1}{2}\right) = \nu_*$.
6. $p_w(0, 0) = p_w(1, 1) = f^{-1}\left(\frac{1}{2}\right) = \nu_*$; $p_w(0, 1) = f^{-1}\left[ w (f(1) - f(0)) + \frac{1}{2} \right] = f^{-1}\left[ \frac{1}{2} + w \right] \ge \nu_*$; $p_w(1, 0) = f^{-1}\left[ w (f(0) - f(1)) + \frac{1}{2} \right] = f^{-1}\left[ \frac{1}{2} - w \right] \le \nu_*$.
7. It follows directly from 6.
8. Since $f$ is a strictly increasing function, $x < y$ if and only if $f(x) < f(y)$, which in turn holds if and only if $p_w(x, y) = f^{-1}\left[ w (f(y) - f(x)) + \frac{1}{2} \right] > f^{-1}\left(\frac{1}{2}\right) = \nu_*$.
9. It follows directly from 8.

Remark 5. Note that in the first statement of (BC), $\nu_*$ represents the maximal level of uncertainty.
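Several items of Proposition 2 can be spot-checked numerically. The sketch below assumes the generator $f(x) = \sqrt{x}$ (so $\nu_* = 0.25$) and the generated negation $n(x) = f^{-1}(1 - f(x))$; the helper names and the grid resolution are hypothetical choices, and a passing check is of course not a proof.

```python
import math


def cut(x: float) -> float:
    """Cutting function [x]: clamp x to the unit interval."""
    return min(max(x, 0.0), 1.0)


F = math.sqrt  # assumed generator f(x) = sqrt(x)


def F_INV(t: float) -> float:
    """Inverse of the assumed generator."""
    return t * t


NU_STAR = F_INV(0.5)  # nu_* = f^{-1}(1/2)


def negation(x: float) -> float:
    """Strong negation generated by f: n(x) = f^{-1}(1 - f(x))."""
    return F_INV(1.0 - F(x))


def preference(x: float, y: float, w: float) -> float:
    """p_w(x, y) = f^{-1}[ w (f(y) - f(x)) + 1/2 ]."""
    return F_INV(cut(w * (F(y) - F(x)) + 0.5))


def check_basic_properties(w: float, steps: int = 10, tol: float = 1e-9) -> bool:
    """Check neutrality, self-duality (SD) and the preference property (PP) on a grid."""
    grid = [i / steps for i in range(steps + 1)]
    for x in grid:
        if abs(preference(x, x, w) - NU_STAR) > tol:  # neutrality
            return False
        for y in grid:
            p = preference(x, y, w)
            dual = negation(preference(negation(x), negation(y), w))
            if abs(p - dual) > tol:  # (SD)
                return False
            if (x < y) != (p > NU_STAR):  # (PP)
                return False
    return True
```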

6.4.2 Ordering Properties

Next, we will focus on the ordering properties, which play a major role in preference modeling. Note the similarities with the implications.

Proposition 3. The preference operator $p_w(x, y)$ satisfies:

1. The first place antitonicity: for all $x_1, x_2, y \in [0, 1]$, if $x_1 \le x_2$ then $p_w(x_1, y) \ge p_w(x_2, y)$. (FA)
2. The second place isotonicity: for all $x, y_1, y_2 \in [0, 1]$, if $y_1 \le y_2$ then $p_w(x, y_1) \le p_w(x, y_2)$. (SI)
3. The weak ordering property: $p_w(x, y) = 1$ if and only if $x \le \tau(y)$, $y \ge f^{-1}\left(\frac{1}{2w}\right)$, where $x \in [0, 1]$, and $\tau : [0, 1] \to [0, 1]$ is an increasing function.
4. $p_w(x, y) = 0$ if and only if $x \ge \rho(y)$, $y \le f^{-1}\left(1 - \frac{1}{2w}\right)$, where $x \in [0, 1]$, and $\rho : [0, 1] \to [0, 1]$ is an increasing function. (WOP)

Proof.
1. It follows directly from the monotonicity of $f(x)$:
\[ p_w(x_1, y) = f^{-1}\left[ w (f(y) - f(x_1)) + \frac{1}{2} \right] \ge f^{-1}\left[ w (f(y) - f(x_2)) + \frac{1}{2} \right]. \]
2. It follows directly from the monotonicity of $f(x)$:
\[ p_w(x, y_1) = f^{-1}\left[ w (f(y_1) - f(x)) + \frac{1}{2} \right] \le f^{-1}\left[ w (f(y_2) - f(x)) + \frac{1}{2} \right]. \]
3. $p_w(x, y) = 1$ if and only if
\[ f^{-1}\left[ w (f(y) - f(x)) + \frac{1}{2} \right] = 1; \]
i.e. $f(y) - f(x) \ge \frac{1}{2w}$, which means that $x \le f^{-1}\left[ f(y) - \frac{1}{2w} \right]$ must hold. Therefore,
\[ \tau(y) = f^{-1}\left[ f(y) - \frac{1}{2w} \right], \quad \text{where } y \ge f^{-1}\left(\frac{1}{2w}\right), \]
is an increasing function with the expected property.
4. The same goes for
\[ \rho(y) = f^{-1}\left[ f(y) + \frac{1}{2w} \right]. \]

Remark 6. Note that in 3, for $w = \frac{1}{2}$, $\tau(y) = 0$ for all $y$, which means that $p_{1/2}(x, y) = 1$ if and only if $x = 0$, $y = 1$. For $w = 1$, $p_1(x, y) = 1$ if and only if $y \ge \nu_*$ and $x \le f^{-1}\left[ f(y) - \frac{1}{2} \right] \le \nu_*$. Similarly, in 4, for $w = \frac{1}{2}$, $\rho(y) = 1$ for all $y$, which means that $p_{1/2}(x, y) = 0$ if and only if $x = 1$, $y = 0$. For $w = 1$, $p_1(x, y) = 0$ if and only if $y \le \nu_*$ and $x \ge f^{-1}\left[ f(y) + \frac{1}{2} \right] \ge \nu_*$.
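To make the weak ordering property concrete, consider the following worked case. It assumes the generator $f(x) = x$ and the weight $w = 1$; the values $y = 0.8$ and $y = 0.2$ are arbitrary illustrations.

```latex
% Assumptions: f(x) = x, w = 1, hence \nu_* = 1/2.
\[
  \tau(y) = \left[\, y - \tfrac{1}{2} \,\right], \qquad
  \rho(y) = \left[\, y + \tfrac{1}{2} \,\right].
\]
% Case y = 0.8 (note that 0.8 \ge f^{-1}(1/(2w)) = 0.5):
\[
  p_1(x, 0.8) = \left[\, 0.8 - x + 0.5 \,\right] = 1
  \iff x \le 0.3 = \tau(0.8).
\]
% Case y = 0.2 (note that 0.2 \le f^{-1}(1 - 1/(2w)) = 0.5):
\[
  p_1(x, 0.2) = \left[\, 0.2 - x + 0.5 \,\right] = 0
  \iff x \ge 0.7 = \rho(0.2).
\]
```

In both cases the threshold on $x$ moves upwards as $y$ grows, which is the monotonicity of $\tau$ and $\rho$ claimed in Proposition 3.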

6.4.3 Preference and Negation

Next, we will consider several compositions of the preference operator with the negation operator. Here, we can see again that the preference operator is closely related to the implication operator. The next proposition tells us that $p_w(x, y)$ has similar properties to the law of contraposition for implication operators.

Proposition 4. The preference operator $p_w(x, y)$ has the following properties with respect to a strong negation $n$:

1. $p_w(x, y) = p_w(n(y), n(x))$ for all $x, y \in [0, 1]$;
2. $p_w(x, y) = n(p_w(y, x))$ for all $x, y \in [0, 1]$;
3. $n(p_w(x, y)) = p_w(n(x), n(y))$ for all $x, y \in [0, 1]$.

Proof. From the commutativity and self-duality of $a_w(x, y)$, we find that $p_w(x, y) = a_w(n(x), y) = a_w(y, n(x)) = p_w(n(y), n(x))$. The same goes for the other two statements.

6.4.4 Preference, Conjunction and Disjunction

Now we will examine compositions of the preference operator with the main logical operators (conjunction and disjunction).

Proposition 5. The preference operator $p(x, y)$ has the following properties:

1. Asymmetry: $c(p_w(x, y), p_w(y, x)) = 0$; (AS)
2. S-strong completeness: $d(p_w(x, y), p_w(y, x)) = 1$; (SSC)
3. T-transitivity: $c(p(x, y), p(y, z)) \le p(x, z)$. (TT)

Proof.
1. Let $A := w(f(y) - f(x))$. Then
\[ c(p_w(x, y), p_w(y, x)) = f^{-1}\left[ \left[ A + \frac{1}{2} \right] + \left[ -A + \frac{1}{2} \right] - 1 \right] = f^{-1}(0) = 0. \]
2. Likewise for the disjunction operator.
3. This follows from direct calculation, based on the fact that the terms in square brackets have values between 0 and 1. Hence, the cutting functions can be omitted.
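The three statements of Proposition 5 can be spot-checked on a grid. The sketch below assumes the generator $f(x) = x$, for which the conjunction and disjunction are the Łukasiewicz operators $[x + y - 1]$ and $[x + y]$; the function names and the grid resolution are hypothetical. As in the proposition, (AS) and (SSC) are tested for a general weight $w$, while T-transitivity is tested for $p = p_{1/2}$.

```python
def cut(x: float) -> float:
    """Cutting function [x]: clamp x to the unit interval."""
    return min(max(x, 0.0), 1.0)


def conj(x: float, y: float) -> float:
    """Nilpotent conjunction for f(x) = x: [x + y - 1]."""
    return cut(x + y - 1.0)


def disj(x: float, y: float) -> float:
    """Nilpotent disjunction for f(x) = x: [x + y]."""
    return cut(x + y)


def preference(x: float, y: float, w: float = 0.5) -> float:
    """p_w(x, y) for f(x) = x: [w (y - x) + 1/2]."""
    return cut(w * (y - x) + 0.5)


def check_composition_properties(w: float, steps: int = 10, tol: float = 1e-12) -> bool:
    """Check (AS) and (SSC) for weight w, and T-transitivity for w = 1/2."""
    grid = [i / steps for i in range(steps + 1)]
    for x in grid:
        for y in grid:
            p_xy, p_yx = preference(x, y, w), preference(y, x, w)
            if conj(p_xy, p_yx) > tol:  # (AS)
                return False
            if disj(p_xy, p_yx) < 1.0 - tol:  # (SSC)
                return False
            for z in grid:  # T-transitivity with p = p_{1/2}
                if conj(preference(x, y), preference(y, z)) > preference(x, z) + tol:
                    return False
    return True
```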

6.4.5 Preference and Aggregation

Next, we will examine compositions of the preference operator with the aggregative operator $a(x, y)$.

Proposition 6. The preference operator $p(x, y)$ has the following properties for all $x, y, z \in [0, 1]$:

1. Transitivity: $a(p(x, y), p(y, z)) = p(x, z)$;
2. Common base: $p(x, y) = a(p(y, z), p(z, x))$;
3. Inverse property: $y = a(x, p_1(y, z))$;
4. Neutrality: $\nu_* = a(p(x, y), p(y, x))$.

Proof. All these statements follow from direct calculation, based on the fact that the terms in square brackets have values between 0 and 1; i.e. the cutting functions can be omitted.

6.4.6 Additive Transitivity

In [31], Tanino examined different types of transitivities. Among others, he considered the so-called additive transitivity, where we take $p(x, y) - \frac{1}{2}$ to be an intensity of preference of $y$ over $x$.

Definition 10. $p(x, y)$ is an additive preference, if

\[ \left( p(x, y) - \frac{1}{2} \right) + \left( p(y, z) - \frac{1}{2} \right) = p(x, z) - \frac{1}{2} \tag{6.15} \]

holds.

Proposition 7. $p(x, y)$ is an additive preference if and only if $f(x) = x$ holds for its generator function.

Proof.
1. The sufficiency of the condition follows from direct calculation.
2. To prove the necessity, let $p(x, y)$ be a preference operator generated by $f(x)$. Let us define $g(x) = f^{-1}\left( x + \frac{1}{2} \right) - \frac{1}{2}$, $a = \frac{1}{2}(f(y) - f(x))$ and $b = \frac{1}{2}(f(z) - f(y))$. Here, $p(x, y)$ is an additive preference if and only if

\[ g(a) + g(b) = g(a + b). \tag{6.16} \]

The solution of this functional equation is $g(a) = ca$, $c \neq 0$, $c \in \mathbb{R}$. This means that

\[ f^{-1}(x) = g\left( x - \frac{1}{2} \right) + \frac{1}{2} = c \left( x - \frac{1}{2} \right) + \frac{1}{2}. \tag{6.17} \]

From $f^{-1}(0) = 0$ and $f^{-1}(1) = 1$ it follows that $c = 1$ and $f(x) = x$.

Remark 7. From the above proposition it follows that an additive preference has the form

\[ p(x, y) = \frac{1}{2}(y - x + 1). \tag{6.18} \]

Remark 8. Note that the use of the generator function $f(x) = x$ leads to Łukasiewicz logic.
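The "direct calculation" behind the sufficiency part of Proposition 7 is short enough to write out. It uses only Eq. (6.18) and the observation that, for $x, y \in [0, 1]$, the value $\frac{1}{2}(y - x + 1)$ already lies in $[0, 1]$, so the cutting function plays no role.

```latex
% Sufficiency in Proposition 7, assuming f(x) = x, i.e. p(x, y) = (y - x + 1)/2.
\[
\begin{aligned}
  \left( p(x, y) - \tfrac{1}{2} \right) + \left( p(y, z) - \tfrac{1}{2} \right)
    &= \tfrac{1}{2}(y - x) + \tfrac{1}{2}(z - y) \\
    &= \tfrac{1}{2}(z - x) \\
    &= p(x, z) - \tfrac{1}{2}.
\end{aligned}
\]
```

The intermediate alternative $y$ cancels, which is precisely what additive transitivity (6.15) requires.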

6.4.7 Bisymmetry and the Common Base Property

When it comes to the problem of consistent aggregation, associativity and bisymmetry play an important role [32]. From an aggregation perspective, associativity is an excellent way of extending a binary function to an n-ary one, but, in some cases, bisymmetry can offer an even better way.

Proposition 8. The preference operator $p(x, y)$ is bisymmetric; i.e.

\[ p(p(x_1, y_1), p(x_2, y_2)) = p(p(x_1, x_2), p(y_1, y_2)) \tag{BS} \]

holds for all $x_i, y_i \in [0, 1]$.

Proof. Taking into account the fact that the terms in square brackets all have values between 0 and 1, the cutting functions can be omitted. This way, the statement follows from direct calculation.

Proposition 9. The preference operator $p(x, y)$ has the common base property; i.e.

\[ p(x, y) = p(p(z, x), p(z, y)) \tag{CB} \]

holds for all $x, y, z \in [0, 1]$.

Proof. Taking into account the fact that the terms in square brackets all have values between 0 and 1, the cutting functions can be omitted. This way, the statement follows from direct calculation.
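The bisymmetry property (BS) of Proposition 8 lends itself to a quick numerical spot check. The sketch below assumes the generator $f(x) = x$ and the weight $w = \frac{1}{2}$, i.e. $p(x, y) = \left[\frac{1}{2}(y - x + 1)\right]$; the function name `is_bisymmetric` and the grid resolution are hypothetical.

```python
from itertools import product


def cut(x: float) -> float:
    """Cutting function [x]: clamp x to the unit interval."""
    return min(max(x, 0.0), 1.0)


def preference(x: float, y: float) -> float:
    """p(x, y) = p_{1/2}(x, y) for f(x) = x: [(y - x + 1) / 2]."""
    return cut(0.5 * (y - x + 1.0))


def is_bisymmetric(steps: int = 5, tol: float = 1e-12) -> bool:
    """Check p(p(x1, y1), p(x2, y2)) == p(p(x1, x2), p(y1, y2)) on a grid."""
    grid = [i / steps for i in range(steps + 1)]
    for x1, y1, x2, y2 in product(grid, repeat=4):
        lhs = preference(preference(x1, y1), preference(x2, y2))
        rhs = preference(preference(x1, x2), preference(y1, y2))
        if abs(lhs - rhs) > tol:
            return False
    return True
```

Both sides reduce to $\frac{1}{4}(x_1 - y_1 - x_2 + y_2) + \frac{1}{2}$ once the cutting functions are dropped, so the grid check is expected to pass up to rounding.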

6.4.8 Preference and Unary Operators

Unary operators (see Table 6.2), and hence membership functions, which play a substantial role in the overall performance of fuzzy representation, may also be interpreted as preference operators, as the following proposition asserts. In the literature, membership functions are usually chosen independently from the logical operators of the system. Parameters are normally fine-tuned on the basis of empirical results. As we note in Eq. (9.10), modifiers and membership functions may be connected to the logical operators of the system. Using operator-dependent membership functions allows us to construct a system using a single generator function and a few parameters. Moreover, this can provide a theoretical explanation for the choice of membership functions and modifiers. Now we will show that the unary operators can be interpreted as preferences:

Proposition 10.

\[ \tau_{\nu_*}^{(\lambda)}(x) = p_\lambda(\nu_*, x), \qquad \tau_N^{(1/2)}(x) = p(1, x), \qquad \tau_P^{(1/2)}(x) = p(0, x), \]

where $\nu_* = f^{-1}\left(\frac{1}{2}\right)$.

Proof. It follows from direct calculation.
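For the first identity of Proposition 10, the direct calculation takes one line. It combines Definition 9 with Eq. (6.14) and uses only the fact that $f(\nu_*) = \frac{1}{2}$.

```latex
% First identity of Proposition 10, using f(\nu_*) = 1/2.
\[
  p_\lambda(\nu_*, x)
  = f^{-1}\left[ \lambda \left( f(x) - f(\nu_*) \right) + \tfrac{1}{2} \right]
  = f^{-1}\left[ \lambda \left( f(x) - f(\nu_*) \right) + f(\nu_*) \right]
  = \tau_{\nu_*}^{(\lambda)}(x).
\]
```

In other words, the sharpness operator with parameter $\lambda$ measures how strongly $x$ is preferred to the neutral level $\nu_*$, with $\lambda$ playing the role of the weight.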

6.5 Summary

Similar to the implication operator being defined as a composition of the disjunction and the negation operator, preference operators were introduced as a composition of the aggregative operators and the negation operator. The main properties were examined systematically. In the next part, we will combine machine learning methods with nilpotent systems described in the previous chapters. We will show that nilpotent logical systems offer an appropriate mathematical framework for hybridization of continuous nilpotent logic, multicriteria decision-making and neural models.

References

1. Orlovsky, S.: Decision-making with a fuzzy preference relation. Fuzzy Sets Syst. 1(3), 155–167 (1978). ISSN 0165-0114. https://doi.org/10.1016/0165-0114(78)90001-5. http://www.sciencedirect.com/science/article/pii/0165011478900015
2. Fodor, J., Roubens, M.: Fuzzy preference modelling and multicriteria decision support (1994)
3. Kahraman, C., Onar, S.C., Oztaysi, B.: Fuzzy multicriteria decision-making: a literature review. Int. J. Comput. Intell. Syst. 8(4), 637–666 (2015)
4. Bouyssou, D., Marchant, T., Pirlot, M., Tsoukias, A., Vincke, P.: Evaluation and decision models with multiple criteria: stepping stones for the analyst. In: International Series in Operations Research & Management Science, vol. 86. Springer (2006). ISBN 978-0-387-31098-5
5. Cinelli, M., Kadziński, M., Gonzalez, M., Słowiński, R.: How to support the application of multiple criteria decision analysis? Let us start with a comprehensive taxonomy. Omega 96, 102261 (2020). ISSN 0305-0483. https://doi.org/10.1016/j.omega.2020.102261. http://www.sciencedirect.com/science/article/pii/S0305048319310710
6. Brans, J.P., Vincke, P.: Note-a preference ranking organisation method: the promethee method for multiple criteria decision-making. Manage. Sci. 31(6), 647–656 (1985). ISSN 0025-1909. https://doi.org/10.1287/mnsc.31.6.647
7. Figueira, J.R., Mousseau, V., Roy, B.: Electre methods. In: Greco, S., Ehrgott, M., Figueira, J. (eds.) Multiple Criteria Decision Analysis. International Series in Operations Research & Management Science, vol. 223. Springer, New York (2016)
8. Biggio, B., Corona, I., Maiorca, D., Nelson, B., Šrndić, N., Laskov, P., Giacinto, G., Roli, F.: Evasion attacks against machine learning at test time. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) Machine Learning and Knowledge Discovery in Databases, pp. 387–402, Berlin, Heidelberg (2013). ISBN 978-3-642-40994-3
9. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples (2015)
10. Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks (2014)
11. Thys, S., Ranst, W.V., Goedemé, T.: Fooling automated surveillance cameras: adversarial patches to attack person detection (2019)
12. Wang, J.: A neural network approach to modeling fuzzy preference relations for multiple criteria decision making. Comput. Oper. Res. 21(9), 991–1000 (1994). ISSN 0305-0548. https://doi.org/10.1016/0305-0548(94)90070-1. http://www.sciencedirect.com/science/article/pii/0305054894900701
13. Wang, J., Malakooti, B.: A feedforward neural network for multiple criteria decision making. Comput. Oper. Res. 19(2), 151–167 (1992). ISSN 0305-0548. https://doi.org/10.1016/0305-0548(92)90089-N. http://www.sciencedirect.com/science/article/pii/030505489290089N
14. Elgharabawy, A., Parsad, M., Lin, C.: Preference neural network. IEEE Trans. Neural Networks Learn. Syst. (2019)
15. Fürnkranz, J., Hüllermeier, E.: Preference Learning: An Introduction. In: Greco, S., Ehrgott, M., Figueira, J. (eds.) Preference Learning, pp. 1–17. Springer, Heidelberg (2011)
16. Brafman, R., Domshlak, C.: Preference handling - an introductory tutorial. AI Mag. 30(1), 58 (2009). https://doi.org/10.1609/aimag.v30i1.2114. https://www.aaai.org/ojs/index.php/aimagazine/article/view/2114
17. Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 17(6), 734–749 (2005). https://doi.org/10.1109/TKDE.2005.99
18. Csiszár, O., Csiszár, G., Dombi, J.: Interpretable neural networks based on continuous-valued logic and multicriteria decision operators. Knowl.-Based Syst. 199, 105972 (2020). ISSN 0950-7051. https://doi.org/10.1016/j.knosys.2020.105972
19. Csiszár, O., Csiszár, G., Dombi, J.: How to implement MCDM tools and continuous logic into neural computation? Towards better interpretability of neural networks. Knowl.-Based Syst. (2020)
20. Dubois, D., Prade, H.: Fuzzy sets in approximate reasoning. Fuzzy Sets Syst. 40, 143–202 (1991)
21. Trillas, E., Valverde, L.: On some functionally expressable implications for fuzzy set theory. In: Proceedings of the 3rd International Seminar on Fuzzy Set Theory, Linz, Austria, pp. 173–1902 (1981)
22. Csiszár, O., Dombi, J.: Generator-based modifiers and membership functions in nilpotent operator systems. In: IEEE International Work Conference on Bioinspired Intelligence (IWOBI 2019), pp. 99–106 (2019)
23. Dombi, J., Csiszár, O.: The general nilpotent operator system. Fuzzy Sets Syst. 261, 1–19 (2015)
24. Dombi, J., Csiszár, O.: Implications in bounded systems. Inf. Sci. 283, 229–240 (2014)
25. Dombi, J., Csiszár, O.: Equivalence operators in nilpotent systems. Fuzzy Sets Syst. 299, 113–129 (2016)
26. Dombi, J., Csiszár, O.: Self-dual operators and a general framework for weighted nilpotent operators. Int. J. Approximate Reasoning 81, 115–127 (2017)
27. Dombi, J., Csiszár, O.: Operator-dependent modifiers in nilpotent logical systems. In: Proceedings of the 10th International Joint Conference on Computational Intelligence, IJCCI, vol. 1, pp. 126–134. INSTICC, SciTePress (2018)
28. Dombi, J., Gera, Z.: The approximation of piecewise linear membership functions and Łukasiewicz operators. Fuzzy Sets Syst. 154, 275–286 (2005)
29. Dombi, J.: Membership function as an evaluation. Fuzzy Sets Syst. 35(1), 1–21 (1990). ISSN 0165-0114. https://doi.org/10.1016/0165-0114(90)90014-W. http://www.sciencedirect.com/science/article/pii/016501149090014W
30. Dombi, J., Baczynski, M.: General characterization of implication’s distributivity properties: the preference implication. IEEE Trans. Fuzzy Syst. 1 (2019). https://doi.org/10.1109/TFUZZ.2019.2946517
31. Tanino, T.: Fuzzy preference orderings in group decision making. Fuzzy Sets Syst. 12(2), 117–131 (1984). ISSN 0165-0114. https://doi.org/10.1016/0165-0114(84)90032-0. http://www.sciencedirect.com/science/article/pii/0165011484900320
32. Fodor, J., Kacprzyk, J.: Aspects of Soft Computing. Intelligent Robotics and Control. Springer, Heidelberg (2019)

Part III

Learning and Neural Networks

Chapter 7

Squashing Functions

Abstract In fuzzy logic, the most commonly used membership functions are triangular and trapezoidal ones. A crucial drawback of these functions is the lack of differentiability, a property that would be useful for learning systems. In this chapter, we introduce the so-called squashing functions: a differentiable parametrized family of functions that can be used to approximate not only piecewise linear membership functions but also Łukasiewicz-type logical operators. We show that the derivative of a squashing function is the difference of two sigmoid functions. This fact will prove useful in gradient-based applications.

7.1 Introduction

The construction and the interpretation of fuzzy membership functions have always been crucial in fuzzy theory and practice. Bilgic and Türksen gave a comprehensive overview of the most relevant interpretations in [1]. For the construction of membership functions, Dombi [2] had an axiomatic point of view, Civanlar and Trussel [3] used statistical data, and Bagis [4], Denna et al. [5] and Karaboga [6] applied tabu search. However, most fuzzy applications use piecewise linear membership functions because of their easy handling, such as in embedded fuzzy control applications where the limited computational resources do not allow the use of complicated membership functions. In other areas, where the model parameters are learned by a gradient-based optimization method, they cannot be used because of the lack of continuous derivatives. For example, to fine-tune a fuzzy control system by a simple gradient-based technique, we require that the membership functions be differentiable for every input. There are numerous papers dealing with the concept of fuzzy set approximation and membership function differentiability (see, for example, [7–9]). In this chapter, we will give a different solution to the problem of non-differentiability of piecewise linear functions by approximating the cutting function of the Łukasiewicz operators, and use it to construct continuously differentiable membership functions which approximate the well-known triangular or trapezoidal membership functions [10].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 J. Dombi and O. Csiszár, Explainable Neural Networks Based on Fuzzy Logic and Multi-criteria Decision Tools, Studies in Fuzziness and Soft Computing 408, https://doi.org/10.1007/978-3-030-72280-7_7


This chapter is organized as follows. In Sect. 7.2, we give a brief overview of Łukasiewicz operators, then in Sect. 7.3, we give the basic properties of the sigmoid function which serves as the basis for the approximation and prove the main approximation theorem. Additionally, we examine the derivatives and the convergence of the proposed approximation. In Sect. 7.4, we apply the approximation to triangular and trapezoidal membership functions. The results of this chapter are the fruits of a collaboration between József Dombi and Zsolt Gera [10].

7.2 Łukasiewicz Operators

The Łukasiewicz operator class (see e.g. [11–13]) is commonly used for various purposes (see e.g. [14, 15]). In this well-known operator family, the cutting function (denoted by $[\,\cdot\,]$) plays an important role:

Definition 11. Let the cutting function be

\[ [x] = \min(\max(0, x), 1) = \begin{cases} 0 & \text{if } x \le 0, \\ x & \text{if } 0 < x < 1, \\ 1 & \text{if } 1 \le x. \end{cases} \]

Let the generalized cutting function be

\[ [x]_{a,b} = [(x - a)/(b - a)] = \begin{cases} 0 & \text{if } x \le a, \\ \dfrac{x - a}{b - a} & \text{if } a < x < b, \\ 1 & \text{if } b \le x, \end{cases} \]

where $a, b \in \mathbb{R}$ and $a < b$.

In neural network terminology, this cutting function is called the saturating linear transfer function. All nilpotent operators are constructed using the cutting function. The formulas for the nilpotent conjunction, disjunction, implication and negation are the following:

\[ c(x, y) = [x + y - 1], \quad d(x, y) = [x + y], \quad i(x, y) = [1 - x + y], \quad n(x) = 1 - x, \tag{7.1} \]

where $x, y \in [0, 1]$. The truth tables of the first three can be seen in Fig. 7.1.

We will refer to triangular and trapezoidal membership functions as piecewise linear membership functions. They are popular in fuzzy control because of their easy handling. The generalized cutting function can be used to describe piecewise linear

membership functions. Generally speaking, a trapezoidal membership function can be constructed via the conjunction of two generalized cutting functions like so:

\[ c([x]_{a,b},\, 1 - [x]_{c,d}) = \left[ [x]_{a,b} + 1 - [x]_{c,d} - 1 \right] = \left[ [x]_{a,b} - [x]_{c,d} \right], \tag{7.2} \]

where $a, b, c, d$ are real numbers and $a < b \le c < d$. As a special case, for $b = c$ we get a triangular membership function. For an example of the general case see Fig. 7.2.

Fig. 7.1 The truth tables of the nilpotent conjunction, disjunction and implication

Fig. 7.2 Left: Generalized cutting functions for a = 0, b = 1, c = 2, d = 4. Right: a trapezoidal membership function constructed as the conjunction of the former two, with a negation applied to the right one.

The Łukasiewicz operator family has some nice theoretical properties: the law of non-contradiction (the conjunction of a variable and its negation is always zero) and the law of the excluded middle (the disjunction of a variable and its negation is always one) both hold, and the residual and the material implications coincide. This is why this operator class is widely used in fuzzy logic and is the closest one to classical Boolean logic. However, the lack of differentiability makes the use of gradient-based optimization techniques impossible. The root of this problem is the shape of the cutting function itself.
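The construction in Eq. (7.2) can be sketched in a few lines of Python. The function names are hypothetical, and the parameter values $a = 0$, $b = 1$, $c = 2$, $d = 4$ are taken from the caption of Fig. 7.2.

```python
def cut(x: float) -> float:
    """Cutting function [x] = min(max(0, x), 1)."""
    return min(max(x, 0.0), 1.0)


def generalized_cut(x: float, a: float, b: float) -> float:
    """Generalized cutting function [x]_{a,b} = [(x - a) / (b - a)], with a < b."""
    if not a < b:
        raise ValueError("generalized_cut requires a < b")
    return cut((x - a) / (b - a))


def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Trapezoidal membership function [[x]_{a,b} - [x]_{c,d}], cf. Eq. (7.2)."""
    if not (a < b <= c < d):
        raise ValueError("trapezoid requires a < b <= c < d")
    return cut(generalized_cut(x, a, b) - generalized_cut(x, c, d))


# Rising edge, plateau, falling edge and tail for a = 0, b = 1, c = 2, d = 4:
samples = [trapezoid(x, 0.0, 1.0, 2.0, 4.0) for x in (0.5, 1.5, 3.0, 5.0)]
```

Setting $b = c$ collapses the plateau to a single point and yields a triangular membership function. The kinks at $a$, $b$, $c$ and $d$ are exactly where the function fails to be differentiable.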

7.3 Approximation of the Cutting Function A solution to the problem of differentiability is to replace the original cutting function with a continuously differentiable approximation (see Fig. 7.3). In this section, we will construct an approximating function by means of sigmoid functions. The reason for the choice of the sigmoid function is its significant role in applications such as artificial neural networks [16], optimization methods, economical and biological models [17].

Fig. 7.3 The cutting function and its approximation

Fig. 7.4 The sigmoid function, with parameters d = 0 and β = 4

7.3.1 The Sigmoid Function

The sigmoid function (see Fig. 7.4) is defined as

\[ \sigma_d^{(\beta)}(x) = \frac{1}{1 + e^{-\beta(x - d)}}, \tag{7.3} \]

where the lower index $d$ is omitted if $d = 0$. Let us recall some of its properties that will be useful later on:

• Its derivative can be expressed by itself (see Fig. 7.5):

\[ \frac{\partial \sigma_d^{(\beta)}(x)}{\partial x} = \beta\, \sigma_d^{(\beta)}(x) \left( 1 - \sigma_d^{(\beta)}(x) \right) \tag{7.4} \]

• Its integral has the following form:

\[ \int \sigma_d^{(\beta)}(x)\, dx = -\frac{1}{\beta} \ln \sigma_d^{(-\beta)}(x) \tag{7.5} \]

Since the sigmoid function asymptotically equals 1 as $x$ tends to infinity, the integral function of the sigmoid function is asymptotically $x$ (see Fig. 7.6).
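Equations (7.4) and (7.5) can be cross-checked numerically with central finite differences. This is a minimal sketch with hypothetical function names; it uses the plain formula for the sigmoid, which is adequate for moderate arguments but would overflow in `math.exp` for very large negative values of $\beta(x - d)$.

```python
import math


def sigmoid(x: float, beta: float = 1.0, d: float = 0.0) -> float:
    """sigma_d^(beta)(x) = 1 / (1 + exp(-beta (x - d))), cf. Eq. (7.3)."""
    return 1.0 / (1.0 + math.exp(-beta * (x - d)))


def sigmoid_derivative(x: float, beta: float = 1.0, d: float = 0.0) -> float:
    """Closed-form derivative beta * sigma * (1 - sigma), cf. Eq. (7.4)."""
    s = sigmoid(x, beta, d)
    return beta * s * (1.0 - s)


def sigmoid_integral(x: float, beta: float = 1.0, d: float = 0.0) -> float:
    """Antiderivative -(1 / beta) * ln(sigma_d^(-beta)(x)), cf. Eq. (7.5)."""
    return -math.log(sigmoid(x, -beta, d)) / beta


def central_difference(func, x: float, h: float = 1e-5) -> float:
    """Second-order finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


# Compare the closed forms with numerical derivatives at a sample point.
beta, d, x0 = 4.0, 0.5, 0.3
numeric_slope = central_difference(lambda t: sigmoid(t, beta, d), x0)
numeric_sigmoid = central_difference(lambda t: sigmoid_integral(t, beta, d), x0)
```

Here `numeric_slope` should agree with `sigmoid_derivative(x0, beta, d)` and `numeric_sigmoid` with `sigmoid(x0, beta, d)` up to the finite-difference error.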

Fig. 7.5 The first derivative of the sigmoid function

Fig. 7.6 The integral function of the sigmoid function and another shifted by 1

7.3.2 The Interval [a, b] Squashing Function

In order to get an approximation of the generalized cutting function, let us integrate the difference of two sigmoid functions, which are translated by $a$ and $b$ ($a < b$), respectively:

\[ \begin{aligned} \frac{1}{b - a} \int \left( \sigma_a^{(\beta)}(x) - \sigma_b^{(\beta)}(x) \right) dx &= \frac{1}{b - a} \left( \int \sigma_a^{(\beta)}(x)\, dx - \int \sigma_b^{(\beta)}(x)\, dx \right) \\ &= \frac{1}{b - a} \left( -\frac{1}{\beta} \ln \sigma_a^{(-\beta)}(x) + \frac{1}{\beta} \ln \sigma_b^{(-\beta)}(x) \right). \end{aligned} \]

After simplification, we get the interval $[a, b]$ squashing function:

Definition 12. Let the interval $[a, b]$ squashing function be

\[ S_{a,b}^{(\beta)}(x) = \frac{1}{b - a} \ln \left( \frac{\sigma_b^{(-\beta)}(x)}{\sigma_a^{(-\beta)}(x)} \right)^{1/\beta} = \frac{1}{b - a} \ln \left( \frac{1 + e^{\beta(x - a)}}{1 + e^{\beta(x - b)}} \right)^{1/\beta}. \tag{7.6} \]
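Definition 12 can be turned into a small numerical sketch. The names `softplus`, `squash` and `generalized_cut` are hypothetical; the logarithm $\ln(1 + e^{t})$ is evaluated in a rearranged form to avoid overflow for large $\beta$, which is an implementation choice made here and not part of the definition. The sketch can be used to observe how the function approaches the generalized cutting function for large $\beta$, and to observe the small-$\beta$ and sign-flip behaviour stated in Proposition 7.2.

```python
import math


def softplus(t: float) -> float:
    """Numerically stable ln(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def squash(x: float, a: float, b: float, beta: float) -> float:
    """Interval [a, b] squashing function S_{a,b}^(beta)(x), cf. Eq. (7.6).

    Uses ln((1 + e^{beta(x-a)}) / (1 + e^{beta(x-b)})) / (beta (b - a)).
    """
    if not a < b:
        raise ValueError("squash requires a < b")
    if beta == 0.0:
        raise ValueError("beta must be nonzero")
    return (softplus(beta * (x - a)) - softplus(beta * (x - b))) / (beta * (b - a))


def generalized_cut(x: float, a: float, b: float) -> float:
    """Generalized cutting function [x]_{a,b}, the limit for beta -> infinity."""
    return min(max((x - a) / (b - a), 0.0), 1.0)


# Largest deviation from the cutting function on a few sample points (a = 0, b = 2).
points = (-1.0, 0.0, 0.5, 1.0, 2.0, 3.0)
max_gap = max(abs(squash(x, 0.0, 2.0, 200.0) - generalized_cut(x, 0.0, 2.0)) for x in points)
```

The gap is largest at the corner points $x = a$ and $x = b$, where the cutting function has its kinks, and it shrinks as $\beta$ grows.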

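The parameters $a$ and $b$ affect the position of the interval squashing function, while the $\beta$ parameter determines the precision of the approximation. We now need to prove that $S_{a,b}^{(\beta)}(x)$ is really an approximation of the generalized cutting function.

Proposition 7.1. Let $a, b \in \mathbb{R}$, $a < b$ and $\beta \in \mathbb{R}^+$. Then

\[ \lim_{\beta \to \infty} S_{a,b}^{(\beta)}(x) = [x]_{a,b}, \]

and $S_{a,b}^{(\beta)}(x)$ is continuous in $x$, $a$, $b$ and $\beta$.

Proof. The continuity is easy to prove, since $S_{a,b}^{(\beta)}(x)$ is a simple composition of continuous functions, and the sigmoid function has range $(0, 1)$, which ensures that the quotient is always positive. To calculate the limit, we separate three cases, depending on the relation between $a$, $b$, and $x$.

• Case 1 ($x < a < b$): Since $\beta(x - a) < 0$, we have $e^{\beta(x - a)} \to 0$ and likewise $e^{\beta(x - b)} \to 0$. Hence the quotient converges to 1 if $\beta \to \infty$, and the logarithm of one is zero.

• Case 2 ($a \le x \le b$):

\[ \begin{aligned} \frac{1}{b - a} \ln \left( \lim_{\beta \to \infty} \left( \frac{1 + e^{\beta(x - a)}}{1 + e^{\beta(x - b)}} \right)^{1/\beta} \right) &= \frac{1}{b - a} \ln \left( \lim_{\beta \to \infty} \left( \frac{e^{\beta(x - a)} \left( e^{-\beta(x - a)} + 1 \right)}{1 + e^{\beta(x - b)}} \right)^{1/\beta} \right) \\ &= \frac{1}{b - a} \ln \left( \lim_{\beta \to \infty} \frac{e^{x - a} \left( e^{-\beta(x - a)} + 1 \right)^{1/\beta}}{\left( 1 + e^{\beta(x - b)} \right)^{1/\beta}} \right) \\ &= \frac{1}{b - a} \ln \left( e^{x - a} \lim_{\beta \to \infty} \frac{\left( e^{-\beta(x - a)} + 1 \right)^{1/\beta}}{\left( 1 + e^{\beta(x - b)} \right)^{1/\beta}} \right). \end{aligned} \]

We transform the numerator so that we can factor $e^{x - a}$ out of the limit. The terms $e^{-\beta(x - a)}$ in the numerator and $e^{\beta(x - b)}$ in the denominator stay bounded by 1 (they converge to 0 in the interior of the interval), and therefore the quotient converges to 1 if $\beta \to \infty$. As a result, the limit of the interval squashing function equals $(x - a)/(b - a)$, which is by definition the generalized cutting function.

• Case 3 ($a < b < x$):

\[ \begin{aligned} \frac{1}{b - a} \ln \left( \lim_{\beta \to \infty} \left( \frac{1 + e^{\beta(x - a)}}{1 + e^{\beta(x - b)}} \right)^{1/\beta} \right) &= \frac{1}{b - a} \ln \left( \lim_{\beta \to \infty} \left( \frac{e^{\beta(x - a)} \left( e^{-\beta(x - a)} + 1 \right)}{e^{\beta(x - b)} \left( e^{-\beta(x - b)} + 1 \right)} \right)^{1/\beta} \right) \\ &= \frac{1}{b - a} \ln \left( \lim_{\beta \to \infty} \frac{e^{x - a} \left( e^{-\beta(x - a)} + 1 \right)^{1/\beta}}{e^{x - b} \left( e^{-\beta(x - b)} + 1 \right)^{1/\beta}} \right) \\ &= \frac{1}{b - a} \ln \left( \frac{e^{x - a}}{e^{x - b}} \lim_{\beta \to \infty} \frac{\left( e^{-\beta(x - a)} + 1 \right)^{1/\beta}}{\left( e^{-\beta(x - b)} + 1 \right)^{1/\beta}} \right). \end{aligned} \]

Similar to Case 2, the quotient converges to 1:

\[ \lim_{\beta \to \infty} S_{a,b}^{(\beta)}(x) = \frac{1}{b - a} \ln \frac{e^{x - a}}{e^{x - b}} = \frac{1}{b - a} \ln e^{x - a - (x - b)} = \frac{1}{b - a} \ln e^{b - a} = \frac{b - a}{b - a} = 1. \]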

Fig. 7.7 Left: the interval squashing function with an increasing β parameter (a = 0 and b = 2). Right: the interval squashing function with a zero and a negative β parameter

In Fig. 7.7, the interval squashing function has been plotted with different $\beta$ parameters. The following proposition states some important properties of the interval squashing function.

Proposition 7.2.

\[ \lim_{\beta \to 0} S_{a,b}^{(\beta)}(x) = 1/2, \qquad S_{a,b}^{(-\beta)}(x) = 1 - S_{a,b}^{(\beta)}(x). \]

As another example, the approximation of the nilpotent conjunction is shown in Fig. 7.8. For further use, let us introduce another form of the interval squashing function. Instead of using the parameters $a$ and $b$, which were the “boundaries” on the $x$ axis, from now on we will use $a$ and $\delta$, where $a$ is the center of the squashing function and $\delta$ is its slope.

Definition 13. Let the squashing function be a