Multiple Criteria Decision Making: Beyond the Information Age (Contributions to Management Science) 3030524051, 9783030524050

Data and its processed state 'information' have become an indispensable resource for virtually all aspects of


Table of contents :
Preface
Contents
Heterogeneous Sensor Data Fusion for Target Classification Using Adaptive Distance Function
1 Introduction
2 Literature Review
3 Multi-sensor Target Classification Models
3.1 Ensemble of Classifiers with Modified Distance Function (ECMDF): Overview
3.2 The Relationship Between Correlation Coefficients and Evidence Distances
3.3 Description of ECMDF Algorithm
3.3.1 The Training Phase
3.3.2 The Test Phase
3.4 Ensemble of Classifiers with Modified Distance Function and Sensor Accuracy (ECMDFS): Overview
4 Computational Results
4.1 Datasets
4.2 Parameter Settings
4.3 Performance Measure
4.4 Computational Results
4.5 Paradoxical Situations
5 Hybrid Multi-sensor Target Classification Algorithms
6 Conclusion
References
Selection of Emergency Assembly Points: A Case Study for the Expected Istanbul Earthquake
1 Introduction
2 Disaster Management
3 Post-disaster Resettlement
4 Selection of Emergency Assembly Points
5 Methodology
6 Case Study
6.1 Study Area
6.2 Implementation of the Model and Results
6.2.1 Phase 1: Assessment of the Current EAPs
6.2.2 Phase 2: Assessment of Regional Sufficiency
6.2.3 Phase 3: Selection of New Alternatives
6.2.4 Repetition of Previous Phases
7 Conclusion
References
Assessing Smartness and Urban Development of the European Cities: An Integrated Approach of Entropy and VIKOR
1 Introduction
2 Literature Review
2.1 Smart City Concept
2.2 Ranking Methodologies Approach in Urban Planning and Development
3 Data and Research Methodology
4 Results and Analysis
5 Conclusion
References
An MCDM-Based Health Technology Assessment (HTA) Study for Evaluating Kidney Stone Treatment Alternatives
1 Introduction
2 Literature Review
2.1 MCDM in Healthcare
2.2 Health Technology Assessment Studies
2.3 MCDM Models in HTA
3 Methodology
3.1 Health Technology Assessment
3.2 Preliminaries of Fuzzy Logic
3.3 Hierarchical Fuzzy TOPSIS
4 The Proposed Hierarchical Structure for the Evaluation of Kidney Stone Treatment Methods
4.1 The Proposed Hierarchical Evaluation Structure
4.2 The Main Criteria Considered in the MCDM-Based HTA Framework
4.3 Sub-criteria Considered in the MCDM-Based HTA Framework
5 Case Study: Application of the Proposed MCDM-Based HTA Framework
5.1 Selected Kidney Stone Treatment Methods for the Case Study
5.2 Application of the Proposed MCDM-Based HTA Framework
6 Conclusion
Appendix: A Brief Version of the Survey Used to Collect Verbal Evaluations
1. An Example Question for Linguistic Evaluation of Alternatives with Respect to Sub-criteria
2. The Question Used to Determine Weights of the Main Criteria
3. An Example Question Used to Determine Weights of the Sub-criteria
References
Geographic Distribution of the Efficiency of Childbirth Services in Turkey
1 Introduction
2 Methodology
2.1 Data and Study Procedures
2.2 Stepwise Selection of Study Variables in the DEA Model
2.3 Robustness Tests for DEA Efficiency Scores, Using Jackknife Analysis
2.4 Incorporating DEA Results with Decision-Tree Procedures
3 Application Results
3.1 Descriptive Statistics
3.2 Adding Variables into the DEA Model
3.3 Efficiency Analysis of Provinces in Terms of Childbirth Services
3.4 Testing the Robustness of DEA Results with Jackknife Analysis
3.5 Geographic Distribution of Provincial Efficiency Scores for Childbirth Services
3.6 Incorporating a Classification Tree Method with DEA
4 Discussion
4.1 Key Findings
4.2 What Is Already Known and What this Study Adds
4.3 Implications for Health Policymakers
4.4 Implications for the Public
4.5 Limitations
4.6 Policy and Research Implications
5 Conclusion
References
Multicriteria Methods and the Hydropower Plants Planning in Brazil
1 Introduction
2 Inventory Studies Decision-Making
2.1 Criteria
2.2 Multicriteria Analysis
3 New Multicriteria Approach Proposal
4 Application
4.1 Weighted Sum Method
4.2 VIP Analysis
5 Conclusions
References
Regional Examination of Energy Investments in Turkey Using an Intuitionistic Fuzzy Method
1 Introduction
2 Literature Review
3 Overview of Renewable Energy Resources
4 Intuitionistic Fuzzy Sets (IFS)
5 An Integrated Intuitionistic Fuzzy Multi-criteria Decision-Making Method (IFMCDM)
5.1 Intuitionistic Fuzzy TOPSIS (IFTOPSIS)
6 Case Study
6.1 Main Criteria and Sub-criteria
6.2 Application
6.2.1 Sorting Alternatives by Means of IFTOPSIS Method
7 Analyses
7.1 Sensitivity Analysis
8 Conclusion
References
Small Series Fashion Supplier Selection Using MCDM Methods
1 Introduction
2 Literature Review
2.1 General Overview of Supplier Selection
2.2 Multi-criteria Decision Methods for Supplier Selection
3 Methodology
4 Experimental Results
4.1 Supplier Selection Criteria
4.2 MCDM Methods
4.2.1 AHP for Criteria Ranking
4.2.2 TOPSIS Method for Supplier Ranking (A Single Expert Decision-Making)
4.2.3 Fuzzy-TOPSIS Method for Supplier Ranking (Group Decision-Making)
4.2.4 Sensitivity Analysis
5 Conclusion
Appendix
Survey Questionnaire
References
Enhanced Performance Assessment of Airlines with Integrated Balanced Scorecard, Network-Based Superefficiency DEA and PCA Methods
1 Introduction
2 Literature Review
2.1 DEA-Based Airline Studies
2.2 Dimension Reduction with PCA to Enhance the Discriminatory Power of DEA
3 Proposed Methodology
3.1 Integration of BSC and DEA Approach
3.2 The Use of a Network-Based Superefficiency DEA with a Balanced Scorecard Approach and Social Network Analysis
3.3 The Integration of PCA and DEA
4 Application of the Proposed Performance Assessment Framework
4.1 The Selection of the Input and Output Variables
4.2 The Use of a Network-Based Superefficiency DEA with a Balanced Scorecard Approach
4.3 PCA–DEA Approach for Overall Performance Assessment
5 Managerial Implication and Further Research Opportunities
References
The Effects of Country Characteristics on Entrepreneurial Activities
1 Introduction
2 Literature Review
3 The Global Entrepreneurship Monitor Database
3.1 National Expert Survey
3.2 Adult Population Survey
4 The Proposed Model of Entrepreneurship Success
4.1 Variable Selection and Definition
4.1.1 Governmental Factors
4.1.2 Human Factors
4.1.3 Business Environment
4.1.4 Perceived Behavioral Control
4.1.5 Entrepreneurial Intention and Action
4.1.6 Fear of Failure
4.2 Dataset
5 Methodology
5.1 Structural Equation Modeling
5.2 Partial Least Squares Method
6 Application of the Model
6.1 Kaiser–Meyer–Olkin and Bartlett's Test
6.2 Evaluation of the Model
6.3 Moderating Effect
6.4 Model Fit
6.5 Country Group Analysis
6.5.1 The Country-Specific Models
6.5.2 Multigroup Analysis
7 Conclusion and Recommendations
References
A Geometric Standard Deviation Based Soft Consensus Model in Analytic Hierarchy Process
1 Introduction
2 Preliminaries
3 Consensus Models
4 A New Proposed Soft Consensus Model
4.1 Numerical Experiment of the Convergence of the Model
4.2 Numerical Experiment of the Acceptable Consistency of the Final Group PCM in the New Model
5 Comparison of Consensus Models
6 Case Study
7 Conclusions
Appendix 1
Appendix 2
References
Coherency: From Outlier Detection to Reducing Comparisons in the ANP Supermatrix
1 Introduction
2 Literature Review
2.1 Consistency and Coherency
2.2 Units of Measurement
2.3 Linking Pins
2.4 Cluster Analysis
3 Linking Coherency Index
3.1 Motivating Examples
3.2 Simulation Results
3.3 Calculating the Linking Coherency Index (LCI)
4 Dynamic Clustering
5 Reducing the Number of Comparisons
6 Conclusion
References
Usage of Entropy-Based Objective Weighting in Neutrosophic Multiple Attribute Decision-Making
1 Introduction
2 Preliminaries: Neutrosophic Sets, Numbers, and Operations
3 Literature Review: Weighting in N-MADM
4 Entropy-Based Objective Attribute Weighting in N-MADM
5 Application
5.1 Data Set 1
5.2 Data Set 2
5.3 Wilcoxon Signed-Rank Test Results
6 Conclusions and Future Researches
References
Implementation of Cumulative Belief Degree Approach to Group Decision-Making Problems Under Hesitancy
1 Introduction
2 Preliminaries
3 Implementation of CBD Approach to Multi-criteria Group Decision-Making Problem with HFLTS
3.1 Problem Description
3.2 Proposed Methodology
4 Illustrative Example
5 Conclusion
References
A Literature Survey on Project Portfolio Selection Problem
1 Introduction
2 Classification Scheme
2.1 Type of Study
2.2 Methods
2.2.1 Benefit Measurement Methods
2.2.2 Mathematical Programming Methods
2.2.3 Cognitive Emulation Methods
2.2.4 Simulation and Heuristic Approaches (S&H)
2.2.5 Hybrid Methods
2.2.6 Other Methods
2.3 Types of Projects
3 Analysis of the Literature
3.1 Analysis on Type of Study
3.2 Analysis on Methods
3.3 Analysis on Types of Projects
4 Research Directions
5 Conclusion
References


Contributions to Management Science

Y. Ilker Topcu Özay Özaydın Özgür Kabak Şule Önsel Ekici  Editors

Multiple Criteria Decision Making

Beyond the Information Age

Contributions to Management Science

The series Contributions to Management Science contains research publications in all fields of business and management science. These publications are primarily monographs and multiple-author works containing new research results; selected conference-based publications are also considered. The focus of the series lies in presenting the development of the latest theoretical and empirical research across different viewpoints. This book series is indexed in Scopus.

More information about this series at http://www.springer.com/series/1505

Y. Ilker Topcu • Özay Özaydın • Özgür Kabak • Şule Önsel Ekici Editors

Multiple Criteria Decision Making Beyond the Information Age

Editors Y. Ilker Topcu Industrial Engineering Department, Faculty of Management Istanbul Technical University Istanbul, Turkey

Özay Özaydın Industrial Engineering Department, Faculty of Engineering Doğuş University Istanbul, Turkey

Özgür Kabak Industrial Engineering Department, Faculty of Management Istanbul Technical University Istanbul, Turkey

Şule Önsel Ekici Industrial Engineering Department, Faculty of Engineering Doğuş University Istanbul, Turkey

ISSN 1431-1941 ISSN 2197-716X (electronic) Contributions to Management Science ISBN 978-3-030-52405-0 ISBN 978-3-030-52406-7 (eBook) https://doi.org/10.1007/978-3-030-52406-7 © Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

We dedicate this book to our Professors and the Honorary Chairs of the MCDM 2019 Conference: • Prof. Dr. Atac Soysal • Prof. Dr. Fusun Ulengin We would like to express our heartfelt gratitude to them for guiding us in our academic career and guiding many generations of professionals and professors.

Preface

The current century is experiencing a shift from a production-focused age to an information-focused age, with the help of information and communication technology (ICT). Transition periods increase the vitality of strategic decision making, and we believe that the information age will be followed by a decision age. In this age, multiple criteria decision making (MCDM) will play a crucial role in using ICT for making better decisions. The International Society on MCDM aims to develop, test, evaluate, and apply methodologies for solving multiple criteria decision-making problems, to foster interaction and research in the scientific field of multiple criteria decision making, and to cooperate with other organizations in the study of management from a quantitative perspective. The society has brought together academicians, professionals, researchers, students, and policymakers at a biennial conference since 1975. We, as the organizers of the 25th International Conference on Multiple Criteria Decision Making (MCDM 2019), which took place at Istanbul Technical University, Istanbul, Turkey, from June 16 to June 21, 2019, were very honored to host this latest MCDM conference. The conference gathered 261 participants from 5 continents and 39 countries. Among all participants, 82 were students, 19 were from industry, and 6 were accompanying persons. There were 222 talks in 62 sessions. We had 4 sessions for plenary talks and 3 tutorial sessions. There was a session for the International Society on MCDM awardees' talks, and another session for MCDM doctoral dissertation award finalists' talks. Additionally, we scheduled 3 sponsored special, 12 invited, and 38 contributed sessions in 5 parallel sessions (7 times) and 6 parallel sessions (3 times). Selected abstracts that were presented at the conference were invited to be considered as chapters in this book in the Springer book series Contributions to Management Science. After a maximum of 3 rounds of revisions based on 3 reviewers' comments per chapter, 15 studies were deemed publishable. The authors of these chapters are from various countries, namely Turkey, the USA, Brazil, Portugal, France, Serbia, Kosovo, and Slovenia.


The chapters cover a wide variety of topics like data analysis to urban development and emergency management, healthcare, environmental concerns, supply chain management, airline management, consumer behavior, and others. Methods used in the studies also widely vary, namely data envelopment analysis, TOPSIS, AHP/ANP, fuzzy sets, and others. The first chapter in this book considers the classification phase of an automatic target recognition (ATR) system having heterogeneous sensors. A novel multiple criteria classification method based on modified Dempster–Shafer theory is proposed. The second chapter aims to explain related criteria for selecting emergency assembly points (EAP) and design a DSS for the city policymakers. The proposed model is applied to Istanbul as a case study. The third chapter aims to rank European cities with respect to their smart and urban development indicators. The ranking used was based on a Eurostat Survey, namely Urban Audit Perception Survey. The fourth chapter proposes a hierarchical evaluation structure for kidney stone treatment methods based on Health Technology Assessment (HTA) using hierarchical fuzzy TOPSIS method. The fifth chapter uses input-oriented data envelopment analysis (DEA) and integrates a decision-tree procedure into the DEA process for determining a geographic distribution of the efficiency of childbirth services in Turkish provinces. The sixth chapter proposes a new method for hydropower inventory problems, where the contradiction between energy demand and socio-environmental efficiency has to be resolved. The study considers a case in Brazil. The seventh chapter uses the intuitionistic fuzzy TOPSIS method to evaluate geothermal, solar, biomass, hydroelectric, and wind energy sources, thus deciding if using renewable energy is a viable option. The eighth chapter proposes insights for the fashion industry with respect to supplier selection using group decision making. The study includes a multi-criteria selection problem from a retailer perspective. The ninth chapter evaluates performance of airline companies, incorporating finance, customers, internal processes, learning, and growth dimensions of balanced scorecard approach including financial and non-financial performance measures by utilizing a network-based super-efficient data envelopment analysis (DEA). The tenth chapter compares the developed countries with emerging and developing countries with respect to their entrepreneurial activities, trying to find the effects of country characteristics on creating new businesses. The eleventh chapter investigates questions on consensus models in the analytic hierarchy process (AHP) and proposes a new model based on the weighted geometric mean and geometric standard deviation. The twelfth chapter aims to focus on coherency to help detect outliers and reduce comparisons in the analytic network process (ANP) supermatrix. Their proposed test for coherency and updating the corresponding incoherent priority vector claims that it results in improved accuracy of the decision model.


The thirteenth chapter, after making a brief literature survey on objective prioritization methods for neutrosophic multiple attribute decision-making (NMADM) problems, proposes to use entropy measurement to prioritize attributes for N-MADM problems. The fourteenth chapter introduces a multi-criteria group decision-making problem with various evaluation formats including hesitant fuzzy linguistic terms (HFLT) while proposing a transformation formula for converting HFLTs to cumulative belief degrees (CBD). The final chapter focuses on project portfolio selection (PPS) through a literature survey as PPS is considered an important activity for gaining competitive advantage, increasing profit margins, and accomplishing strategic objectives. This book is a concise collection of various applications and methods used for various topics. It once again shows that MCDM is vital in numerous areas from healthcare management to environmental decisions. We would like to thank all the authors for their contributions to make this book possible and also express our gratitude to the referees for their invaluable comments to make the chapters reach their maximum potential. Y. Ilker Topcu Özay Özaydın Özgür Kabak Sule Önsel Ekici May 2020

Istanbul, Turkey

Contents

Heterogeneous Sensor Data Fusion for Target Classification Using Adaptive Distance Function (Bengü Atıcı, Esra Karasakal, and Orhan Karasakal) ... 1
Selection of Emergency Assembly Points: A Case Study for the Expected Istanbul Earthquake (Sezer Savaş, Şehnaz Cenani, and Gülen Çağdaş) ... 37
Assessing Smartness and Urban Development of the European Cities: An Integrated Approach of Entropy and VIKOR (Jelena J. Stanković, Žarko Popović, and Ivana Marjanović) ... 69
An MCDM-Based Health Technology Assessment (HTA) Study for Evaluating Kidney Stone Treatment Alternatives (Eren Erol, Beyza Özlem Yilmaz, Melis Almula Karadayi, and Hakan Tozan) ... 99
Geographic Distribution of the Efficiency of Childbirth Services in Turkey (Songul Cinaroglu) ... 131
Multicriteria Methods and the Hydropower Plants Planning in Brazil (Igor Raupp, João Clímaco, Fernanda Costa, and Marcelo Miguez) ... 155
Regional Examination of Energy Investments in Turkey Using an Intuitionistic Fuzzy Method (Pınar Darende, Babak Daneshvar Rouyendegh (B. Erdebilli), and Tahir Khaniyev) ... 175
Small Series Fashion Supplier Selection Using MCDM Methods (Nitin Harale, Sebastien Thomassey, and Xianyi Zeng) ... 203
Enhanced Performance Assessment of Airlines with Integrated Balanced Scorecard, Network-Based Superefficiency DEA and PCA Methods (Umut Aydın, Melis Almula Karadayı, Füsun Ülengin, and Kemal Burç Ülengin) ... 225
The Effects of Country Characteristics on Entrepreneurial Activities (Seda Yanık and Nihat Can Sinayiş) ... 249
A Geometric Standard Deviation Based Soft Consensus Model in Analytic Hierarchy Process (Petra Grošelj and Gregor Dolinar) ... 281
Coherency: From Outlier Detection to Reducing Comparisons in the ANP Supermatrix (Orrin Cooper and Idil Yavuz) ... 317
Usage of Entropy-Based Objective Weighting in Neutrosophic Multiple Attribute Decision-Making (Sait Gül) ... 343
Implementation of Cumulative Belief Degree Approach to Group Decision-Making Problems Under Hesitancy (Nurullah Güleç and Özgür Kabak) ... 369
A Literature Survey on Project Portfolio Selection Problem (Özge Şahin Zorluoğlu and Özgür Kabak) ... 387

Heterogeneous Sensor Data Fusion for Target Classification Using Adaptive Distance Function
Bengü Atıcı, Esra Karasakal, and Orhan Karasakal

Abstract Automatic Target Recognition (ATR) systems are used as decision support systems to classify the potential targets in military applications. These systems are composed of four phases, which are selection of sensors, preprocessing of radar data, feature extraction and selection, and processing of features to classify potential targets. In this study, the classification phase of an ATR system having heterogeneous sensors is considered. We propose novel multiple criteria classification methods based on the modified Dempster–Shafer theory. Ensemble of classifiers is used as the first step probabilistic classification algorithm. Artificial neural network and support vector machine are employed in the ensemble. Each non-imaginary dataset coming from heterogeneous sensors is classified by both classifiers in the ensemble, and the classification result that has a higher accuracy ratio is chosen for each of the sensors. The proposed data fusion algorithms are used to combine the sensors’ results to reach the final class of the target. We present extensive computational results that show the merits of the proposed algorithms.

This article is based on an unpublished MS thesis of the first author co-supervised by the second and third authors.
B. Atıcı
ASELSAN A.Ş., Gölbaşı Facilities, Ankara, Turkey
e-mail: [email protected]
E. Karasakal
Industrial Engineering Department, Middle East Technical University, Ankara, Turkey
e-mail: [email protected]
O. Karasakal
Industrial Engineering Department, Çankaya University, Ankara, Turkey
e-mail: [email protected]
© Springer Nature Switzerland AG 2021
Y. I. Topcu et al. (eds.), Multiple Criteria Decision Making, Contributions to Management Science, https://doi.org/10.1007/978-3-030-52406-7_1


1 Introduction

Radar systems have important roles in both military and civilian applications. These systems have changed the way armies fight since their invention during World War II. As the capabilities of these systems in terms of range, sensitivity, and the number of tracks that can be followed increase, the popularity of automatic target recognition (ATR) systems has also increased. ATR systems take readings from radars and then process them to recognize the class of targets. Based on the number of data sources used, ATR systems can be divided into two categories: single data source and multiple data source. Contrary to single data source systems, multiple data source systems can provide complementary knowledge about the target, and hence greater accuracy. According to the measurements produced by the sensors, different types and different numbers of sensors may be used as data sources in ATR systems, such as ultra-high range-resolution radar profiles, synthetic aperture radar, and laser radar (Rogers et al. 1995). Thus, sensor readings in ATR systems are usually heterogeneous. ATR systems are composed of four phases, which are selection of data sources, preprocessing/segmentation of the collected data, feature selection and extraction, and classification (Rogers et al. 1995). This study focuses on the classification phase of ATR systems. Consider a battlefield in which multiple heterogeneous sensor readings come in for the same potential target. It is highly desirable to identify the class of the target as soon as possible, before engaging the target if necessary. To achieve that, after preprocessing the sensor readings and feature selection/segmentation, the sensor readings must be classified. In this study, an ensemble of classifiers is employed. Artificial neural network (ANN) and support vector machine (SVM) are selected as the individual classifiers in the ensemble due to their high classification power (Sagi and Rokach 2018). As the sensor readings have different feature data, they cannot be fused directly. Thus, probabilistic classification is employed for the dataset of each sensor, and the results are fused by the modified Dempster–Shafer Theory (DST). In the ensemble, ANN and SVM are trained for each sensor on its specific dataset, and the classifier with the higher accuracy is chosen. The main motivation of the study is to develop new methods that allow the evidence distances to be calculated in an elastic way in paradoxical and highly conflicting situations. The proposed way of calculating the distances is used in the calculation of the credibility degrees of the sensors. The rest of the study is organized as follows: in Sect. 2, a literature review on classification algorithms and DST is given. In Sect. 3, the proposed models are explained in detail. Details of the computational experiments and discussions are given in Sect. 4. Section 5 introduces the hybrid extension of the proposed algorithms and provides the related computational results. Concluding remarks are given in the last section.
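The per-sensor "train ANN and SVM, keep the more accurate one" idea described above can be sketched as follows. This is a minimal illustration assuming Python with scikit-learn; the estimator settings, the split, and the function name are placeholders chosen for the example, not the authors' configuration:

```python
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def choose_classifier(X, y, test_size=0.3, seed=0):
    """Train an ANN and an SVM on one sensor's data and keep the more accurate one."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=test_size, random_state=seed)
    ann = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=seed)
    svm = SVC(kernel="rbf", probability=True, random_state=seed)  # probability=True gives class probabilities (BPA-like evidence)
    best, best_acc = None, -1.0
    for clf in (ann, svm):
        clf.fit(X_tr, y_tr)
        acc = clf.score(X_tr, y_tr)  # the chapter selects by accuracy on the training data
        if acc > best_acc:
            best, best_acc = clf, acc
    return best, best_acc  # best.predict_proba(...) then supplies the probabilistic evidence per target
```

The selected classifier's probability outputs play the role of the sensor's pieces of evidence that are later fused by the modified DST.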


2 Literature Review Classification problem deals with predicting the class of the observations. It is a well-known problem in machine learning (ML). In ATR systems, the main aim is to identify the targets correctly and quickly so that the required weapons can be directly engaged to them. Thus, speed and classification accuracy may be regarded as the most critical objectives of ATR systems. Kotsiantis (2007) summarizes the most popular classification algorithms in ML, which are decision trees, artificial neural network, naïve Bayes, k-nearest neighbors, SVM, and rule-learners. Among other classifiers, ANN and SVM are the most prominent classifiers in accuracy and speed of classification (Kotsiantis 2007). Single-layer perceptron is introduced by McCulloch and Pitts (1943). Their model is composed of several input nodes that take the feature values of the input data, and output node where the recommendation of the perceptron is received as 0 and 1. However, McCulloch and Pitts’ model (1943) is only applicable to linearly separable datasets. As opposed to the single-layer perceptron, ANN is composed of multiple layers. In addition to the increased number of layers, a multi-layer perceptron uses nonlinear activation function. The aim of ANN is to adjust the weights so that the total error is minimized. This is achieved by training the network. The most popular training algorithm is backpropagation algorithm (Rumelhart et al. 1986). However, the deficiency of backpropagation is the speed of training. Random weights algorithm is another training algorithm (Cao et al. 2018). Other than these, genetic algorithms (Siddique and Tokhi 2001) and Bayesian methods (Vivarelli and Williams 2001) are among the other alternative training algorithms. In ANN, initialization of parameters that will be optimized by the algorithm while learning and selection of hyper parameters that will not be updated by the algorithm are important for a successful implementation. Thus, hyper parameters such as the number of hidden layers, learning rate, momentum, activation function, batch size, epochs, and dropout should be considered for a customized algorithm carefully. Kotsiantis (2007) states that the selection of activation function, network structure, and weights of the connections are three aspects that significantly affect the overall performance of ANN. According to Géron (2018), there are three popular activation functions in terms of performance, which are rectified linear, hyperbolic tangent, and sigmoid activation function. Zhang et al. (2018a) divide the most commonly used network architectures into four categories, which are auto-encoders, deep belief networks, convolutional neural networks, and recurrent neural networks. Determining the number of neurons in hidden layers is another aspect affecting the performance of ANN. Interested readers are referred to Camargo and Yoneyama (2001) for details on this issue. SVM is the most known type of kernel methods, which mainly deals with classification (Chollet 2018). Vapnik and Chervonenkis (1964) introduced SVM as a linear formulation. Then, in 1995, Vapnik and Cortes (1995) proposed a nonlinear formulation of SVM with the concept of soft margin. Training SVM is


achieved by solving a Quadratic Programming Problem (QP) (Zanghirati and Zanni 2003). Yet, QP is NP-hard due to its large computational complexity. Platt (1999) introduced the Sequential Minimal Optimization (SMO) algorithm to overcome the drawbacks of traditional QP. Keerthi and Gilbert (2002) further improved SMO by proving the convergence of the generalized SMO. Hsu and Lin (2002) proposed the decomposition method to speed up the training time of SVM. James et al. (2017) divide SVM into four categories, which are the maximal margin classifier, support vector classifiers, support vector machines, and multi-class support vector machines. The maximal margin classifier is used for linearly separable data with two different classes. However, if the data is not linearly separable, it may give unsatisfactory results. Vapnik and Cortes (1995) proposed the soft margin concept for this problem. The soft margin allows some of the data points to be classified as the wrong class at some user-specified cost, C. Generally, cross-validation is used to find the value of C (Chollet 2018). A great majority of real-world problems have nonlinear class boundaries. In this situation, by assuming that a linear boundary exists, the input data is mapped to a higher dimensional space through the kernel trick so that the separating hyperplane is linear. After the mapping, classical SVM becomes applicable. There are different types of kernel functions. Yet, Géron (2018) states that polynomial and radial basis functions are the most popular ones. Kavzoglu and Colkesen (2009) and Genton (2001) give a comprehensive review of the existing kernel functions. SVM is originally proposed as a binary classification algorithm. SVM is also applicable to multi-class classification through decomposing the existing problem into binary problems. One-versus-one and one-versus-all classification are the two most important algorithms in this category (Jonathan Milgram et al. 2006). Dempster–Shafer Theory (DST), also known as Belief Theory or Evidence Theory, is an effective fusion method to combine separate pieces of evidence coming from different sensors under uncertainty and imprecision. The theory was first introduced by Dempster (1967). Then, Shafer mathematically formalized the theory (Shafer 1976). DST has a wide range of application areas such as target recognition (Chen et al. 2014; Dong and Kuang 2015), decision-making (Leung et al. 2013; Dymova and Sevastjanov 2010), multi-sensor classification (Pal and Ghosh 2001; Foucher et al. 2002), reliability analysis (Zhou et al. 2015), expert systems (Deng and Chan 2011), and fault diagnosis (Fan and Zuo 2006). DST combines separate pieces of evidence and classifies them as the most likely class in the frame of discernment (FOD). FOD is the set of all possible states of the system:

\Theta = \{H_1, H_2, \ldots, H_M\}    (1)


FOD consists of M hypotheses, which are mutually exclusive and exhaustive; H_i represents a possible state of the system. The power set 2^\Theta derived from FOD contains the propositions, that is, the possible predictions of the system:

2^\Theta = \{\emptyset, \{H_1\}, \{H_2\}, \ldots, \{H_M\}, \{H_1, H_2\}, \ldots, \{H_1, H_M\}, \ldots, \{H_1, H_2, \ldots, H_M\}\}    (2)

The power set is composed of 2^M propositions, and every element of the power set is a subset of FOD: if H_1 ⊂ \Theta, then H_1 ∈ 2^\Theta. The initial support degree of proposition H_i, m(H_i), is defined over the power set. These support degrees are called basic probability assignments (BPA). A BPA can be thought of as a function m : 2^\Theta → [0, 1]. BPAs should satisfy the nonnegativity and unity properties given in Eqs. (3) and (4), respectively:

m(\emptyset) = 0    (3)

\sum_{H_i \subseteq \Theta} m(H_i) = 1    (4)
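As a small illustration of Eqs. (2)-(4), the power set of a three-class FOD and the BPA properties can be checked as follows. This is a sketch only; the frozenset representation and the example mass values are assumptions for the example, not taken from the chapter:

```python
from itertools import chain, combinations

FOD = ("H1", "H2", "H3")

def power_set(fod):
    """All 2^M propositions, including the empty set and the full FOD."""
    return [frozenset(c)
            for c in chain.from_iterable(combinations(fod, r) for r in range(len(fod) + 1))]

def is_valid_bpa(m, tol=1e-9):
    """Check nonnegativity, m(empty set) = 0, and unity over the propositions (Eqs. 3-4)."""
    return (all(v >= 0 for v in m.values())
            and m.get(frozenset(), 0.0) == 0.0
            and abs(sum(m.values()) - 1.0) < tol)

m = {frozenset({"H1"}): 0.6, frozenset({"H2"}): 0.3, frozenset(FOD): 0.1}
print(len(power_set(FOD)), is_valid_bpa(m))  # 8 True
```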

By obtaining BPAs from different sensors and combining them, the belief and plausibility of each hypothesis are found. The belief of proposition H_i, Bel(H_i), shows the total belief assigned to proposition H_i. The plausibility of H_i, Pl(H_i), shows the upper bound of the probability that can be assigned to proposition H_i. They are calculated according to Eqs. (5) and (6):

Bel(H_i) = \sum_{H_l \subset H_i} m(H_l)    (5)

Pl(H_i) = \sum_{H_l \cap H_i \neq \emptyset} m(H_l)    (6)
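Belief and plausibility can be computed directly from such a BPA. The following minimal sketch uses the same frozenset-keyed dictionary as above (an assumed representation, not the chapter's code):

```python
def belief(m, H):
    """Bel(H): total mass of focal elements contained in H (Eq. 5)."""
    return sum(v for A, v in m.items() if A and A <= H)

def plausibility(m, H):
    """Pl(H): total mass of focal elements intersecting H (Eq. 6)."""
    return sum(v for A, v in m.items() if A & H)

m = {frozenset({"H1"}): 0.6, frozenset({"H2"}): 0.3, frozenset({"H1", "H2", "H3"}): 0.1}
H = frozenset({"H1"})
print(belief(m, H), plausibility(m, H))  # 0.6 0.7, so the uncertainty of Eq. (7) is 0.1
```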

Uncertainty in DST is represented by the difference between the plausibility and belief degrees of proposition H_i:

\mu(H_i) = Pl(H_i) - Bel(H_i)    (7)

μ(H_i) represents the uncertainty in the evidence. To fuse the evidences coming from different sensors, the DST combination rule given in Eq. (8) is used. Assume that there are two sensors and the FOD is \Theta = \{H_1, H_2, \ldots, H_M\}. Note that DST explicitly assumes that all sensors are independent:

m(H) = \begin{cases} \frac{1}{1-k} \sum_{H_i \cap H_j = H} m_1(H_i) \ast m_2(H_j) & \text{if } H \neq \emptyset \\ 0 & \text{if } H = \emptyset \end{cases}    (8)

The term \frac{1}{1-k} is the normalization factor, which ensures the unity property stated in condition (4), and k is a measure of conflict between the evidences. It is calculated according to Eq. (9):

k = \sum_{H_i \cap H_j = \emptyset} m_1(H_i) \ast m_2(H_j)    (9)
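The two-sensor combination rule and the conflict measure of Eqs. (8)-(9) can be sketched as follows, again assuming frozenset-keyed BPA dictionaries (an illustrative implementation, not the authors' code):

```python
from collections import defaultdict

def combine_dempster(m1, m2):
    """Classical DST combination of two BPAs; returns the fused BPA and the conflict k."""
    joint = defaultdict(float)
    k = 0.0
    for A, a in m1.items():
        for B, b in m2.items():
            inter = A & B
            if inter:
                joint[inter] += a * b   # mass kept by Eq. (8)
            else:
                k += a * b              # Eq. (9): mass assigned to the empty set
    if k >= 1.0:
        raise ValueError("complete conflict: the rule is undefined")
    return {A: v / (1.0 - k) for A, v in joint.items()}, k

m1 = {frozenset({"H1"}): 0.7, frozenset({"H1", "H2"}): 0.3}
m2 = {frozenset({"H1"}): 0.6, frozenset({"H2"}): 0.4}
fused, k = combine_dempster(m1, m2)
print(round(k, 2), {tuple(sorted(A)): round(v, 3) for A, v in fused.items()})
# 0.28 {('H1',): 0.833, ('H2',): 0.167}
```

Fed with Zadeh's classical numbers (the standard textbook illustration, not an example from this chapter): m1(A) = 0.99, m1(B) = 0.01 and m2(C) = 0.99, m2(B) = 0.01 give k = 0.9999 and assign the entire normalized mass to B, even though both sensors consider B almost impossible. This is exactly the counterintuitive high-conflict behaviour discussed next.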

The DST combination rule satisfies both the commutative and the associative law. After obtaining the belief and plausibility of all of the propositions, DST applies a decision-making rule to get the final classification result of the system. There are two common decision-making rules: choosing the proposition with the maximum belief value or with the maximum plausibility value. Unlike probabilistic data fusion, DST does not need prior probabilities. The algorithm recognizes that each piece of evidence may have a different level of detail, and assigns probabilities to pieces of evidence only if there is supporting information. It can efficiently deal with imprecise and uncertain information (Khaleghi et al. 2013). However, due to the complexity of monitoring the environment and the limited accuracy of the sensors, DST may give counterintuitive results. The first criticism of DST belongs to Zadeh (1979). The author shows that when the evidences are conflicting, DST gives insufficient results (Zadeh 1979, 1984, 1986). After his criticism, many researchers focused on this topic. Common paradoxes in DST are categorized into four groups (Li et al. 2016): the complete conflict paradox, the 0 trust paradox, the 1 trust paradox, and the high conflict paradox. When the conflicting degree is equal to 1, the DST combination rule cannot be applied; this paradox is called the "Complete Conflict Paradox." The 0 Trust Paradox is also called the "One Bullet Veto": when one of the evidences is totally denied by one of the sensors, its resulting fused mass is always 0. Under highly conflicting evidences, DST may give the total mass to one of the evidences even if it is poorly justified by all sensors; this situation is called the "1 Trust Paradox." Even if the majority of the sensors justify one of the evidences, DST may give a small combined mass to it due to a high conflict degree; this paradox is called the "High Conflict Paradox." The existing literature is divided into two groups about the main causes of the conflict. Some of the studies state that the conflicting situation is the result of imprecision in the sensors; this type of study focuses on new modified versions of DST based on evidence correction before combination. The second group of studies states that the main problem comes from the normalization step of DST and investigates new conflict redistribution strategies. Apart from these, some of the studies take into account both the complex monitoring environment and


Table 1 Summary of the studies that propose evidence correction algorithms (columns: Paper; Reliability of sensors; Proposed way for assigning reliabilities; Proposed way for evidence correction other than reliability of sensors). Papers covered: Shafer (1976), Murphy (2000), Yong et al. (2004), Mercier et al. (2008), Horiuchi (1998), Fan et al. (2018), Zhang et al. (2017), and Chen et al. (2018). Reliability-assignment approaches listed: simple averaging of the evidences, credibility degrees, contextual discounting, source value, dynamic reliability, evidence distance, and Jousselme distance. Other evidence-correction approaches listed: source variances, intuitionistic fuzzy MCDM, Euclidean distance, Jousselme distance, a hybrid model, and adding uncertainty to the credibility degrees.

the imprecision of sensors, recently. In Table 1, a summary of the most popular studies in evidence correction is given. On the contrary to the first part, some of the researches predicate insufficient result of DST on normalization step. Smets and Kennes (1994) proposed an alternative combination rule to the classical DST combination rule. Their combination rule is actually the non-normalized version of the Dempster’s rule. Yager (1987) stated that conflicting situation is not reliable, and it should be added to the total ignorance, or universal set, as discounting term. Unlike Smets and Kennes (1994), the author does not make open world assumption and gives zero mass to an empty set. Dubois and Prade (1988) states that if two of the sources conflict with each other, one of them is not reliable. Their formulation consists of both conjunctive and disjunctive combination. Inagaki (1991) proposed a general formulation for combination rules and distribution of conflict. His formula distributes the mass of empty set, globally after the conjunctive combination of masses. In addition to Inagaki’s (1991) general formulation, Smarandache and Dezert (2006) proposed another general formulation. Their formula distributes the conflict to involved focal elements. Yet, Leung et al. (2013) criticize the way of distributing the conflict by saying that the proposed general formulations should distribute the conflict all over the sets. Like Smarandache and Dezert (2006), Josang et al. (2003) are inspired by Inagaki (1991). Their algorithm is the same as the Inagaki’s (1991) except for the way of calculating the weights. Florea et al. (2009) proposed a class robust combination rule that distributes the conflict to all over the set. In their formulation, they use the conflict between evidences as the weight. Li et al. (2016) provide a new conflict redistribution and decision-making strategy by taking into account sensor priorities and evidence credibility. In their


approach, sensor priorities are assigned according to the type and the precision of the sensors. The proposed algorithm modifies the weights before combination based on their consistency and reliability indexes calculated according to the algorithm proposed by Yong et al. (2004). Chen et al. (2017) revise two pieces of evidences separately by weighted Minkowski and Betting Commitment distance functions. Then, they combine these two by modified combination rule that assigns conflict locally. Optimal Weighted Dempster–Shafer (OWDS) is developed by Liu et al. (2018). This method optimizes the classifier weights by minimizing the distance between the combination result and the true label of the data. Distance is calculated according to Jousselme’s distance function. Zhang et al. (2018b) modify evidences according to Bhattacharyya distance and combines through new combination rule. However, the proposed algorithm gives reasonable results with small datasets. Ye et al. (2018) modify evidences according to the reliability index and average mass assignment. To calculate the reliability index, Matusita distance is used and the difference between evidences is calculated. Also, to consider the different consistency degrees of evidences, the average mass assignment is calculated. After, they modify the evidences according to the reliability index and average mass assignment. Finally, the weighted mass assignment is employed to combine the modified evidences.

3 Multi-sensor Target Classification Models In this section, two different multi-sensor target classification algorithms are introduced. Both algorithms classify each sensors’ dataset by ensemble and then combine them with modified DST. Proposed algorithms utilize the idea of credibility degree as in Yong et al. (2004) and Chen et al. (2018). However, we propose an adaptive Lp distance metric instead of Jousselme distance used in Yong et al. (2004) and Chen et al. (2018). The proposed algorithms also differ from each other in the modification of the evidences in DST. The first algorithm modifies the evidences by credibility degrees of the sensors only whereas the second one modifies by both credibility degrees and accuracies of the sensors.

3.1 Ensemble of Classifiers with Modified Distance Function (ECMDF): Overview Ensemble of Classifiers with Modified Distance Function (ECMDF) is the first proposed multi-sensor target classification approach. The proposed model is ML based. Thus, it is explained in two parts which are training and test phases. Training Phase of ECMDF The training phase of ECMDF consists of three main steps as shown in Fig. 1. The algorithm starts with reading the multi-sensor datasets

Fig. 1 Flowchart of the training phase of ECMDF

in Step 0. In Step 1, sensor datasets are split as training and test. Based on the training datasets, both ANN and SVM are trained separately for each sensor. After training both of the classifiers for each sensor, the one with the higher classification accuracy based on training datasets is chosen for each sensor. With the chosen classifier, each sensors’ accuracy ratios, probabilistic classification results, and predicted classes are determined based on training datasets by tuned classifiers. In Step 2, training outputs of all sensors are combined through modified DST. In this step, firstly pairwise correlation coefficients between sensors’ predicted classes are calculated over all training datasets. Then, the correlation matrix is built. The resulting matrix’s elements comprise of values between −1 and 1. They are then scaled between user-specified intervals. Scaled correlation coefficient matrix’s elements are used as p parameter of Lp metric while calculating the pairwise distances between evidences of sensors for each target. This issue is explained in detail in the next subsection. Based on distances, pairwise distance matrixes are built for each target. Using the distance matrixes, average credibility degrees of all sensors are calculated for all targets. By using the calculated average credibility degrees, probabilistic classification results of sensors are discounted and combined with the classical DST combination rule. At this stage, the proposed algorithm moves to the beginning of Step 2, and Step 2 is repeated for different values of scaling intervals of correlation coefficients. After Step 2 is repeated for all scaling intervals, the one that maximizes the number of correct predictions is chosen, which ends up the training phase of ECMDF. Test Phase of ECMDF In the training phase, ECMDF identifies the classifier for each sensor, ANN or SVM, and determines sensor accuracies and credibility degrees to be later used in the test phase to discount the evidence. The flowchart of the test phase of ECMDF can be seen in Fig. 2. The test phase starts with Step 3 by probabilistically classifying each sensor test dataset by the classifier determined in the training phase. In this step, if there are i many sensors, then there will be i different evidences for the same target. To find the final classification result, i many number of evidences for each target should be combined. After all the evidences are discounted with calculated credibility degrees in the training phase for each target, they are combined by the classical DST combination rule. After performing the combination, ECMDF gives the final combined evidence of each target. Finally, the proposition with the maximum belief value in combined evidence is chosen for each target.

3.2 The Relationship Between Correlation Coefficients and Evidence Distances

Yong et al. (2004) use credibility degrees to represent the reliability degrees of the sensors. Inspired by Yong et al.'s (2004) work, a new way of calculating credibility degrees to discount the evidence is developed. In the proposed approach,

Fig. 2 Flowchart of the test phase of ECMDF

Fig. 3 Relationship between correlation coefficients and evidence discounting: a high correlation coefficient between sensors implies high consistency between them, so a small distance is used to discount the evidences; a low correlation coefficient implies low consistency, so a large distance is used to discount the evidences

Lp metric is employed, in which p parameter learns its value from the training datasets of sensors through the correlation coefficients between sensors. In the distance function, the correlation between sensors is calculated firstly. Then, they are scaled between user-specified intervals. The logic behind the training of the p parameter is depicted in Fig. 3. If the correlation between two sensors is high, consistency between them is high too. Therefore, the distance between evidences coming from these sensors should be small. On the other hand, if the correlation coefficients between sensors are low, then the distance between evidences coming from them should be higher.


Fig. 4 Change in distance between A and B with increasing p parameter

Given the sample vectors A = [14, 22, 35] and B = [5, 2, 9], Fig. 4 shows how the distance between the vectors changes with a varying p parameter of the Lp distance metric. As p increases, the distance between the vectors decreases. Notice also that as the correlation between sensors increases, the numerical value of the correlation coefficient increases. In order to apply the logic explained above, the correlation coefficients between sensors are used to determine the alternative p values of the Lp metric.
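The effect shown in Fig. 4 can be reproduced with a few lines of code; this is a minimal sketch (assuming Python with NumPy) for the plain, unweighted Lp distance between the two sample vectors:

```python
import numpy as np

def lp_distance(a, b, p):
    """Minkowski (Lp) distance between two vectors."""
    return float(np.sum(np.abs(np.asarray(a) - np.asarray(b)) ** p) ** (1.0 / p))

A = [14, 22, 35]
B = [5, 2, 9]

# As p grows, the Lp distance shrinks toward the maximum coordinate difference,
# which is the behaviour depicted in Fig. 4.
for p in (1, 2, 3, 5, 10):
    print(p, round(lp_distance(A, B, p), 3))
```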

3.3 Description of ECMDF Algorithm

The following assumptions are made in the proposed approach:
• Sensors are heterogeneous, and they provide different types of information about the same potential target. That is, each sensor provides different feature information.
• Sensors are independent.
• The FOD of the sensors is the same.
• Datasets of the sensors are equally long.
• Sensors provide information for only numerical features.
• Targets belong to only one of the classes. They do not have multiple labels or classes.
• There is no open-world assumption. That is, all possible classes for targets are contained in FOD.
The following notations are used for the ECMDF algorithm:

nbSensors: Number of sensors in the multi-sensor system
nbTrainingTargets: Total number of targets in the training dataset of each sensor
nbTestTargets: Total number of targets in the test dataset of each sensor
nbPropositions: Number of propositions, i.e., the total number of potential results/classes of the fusion process
sInterval: List that contains the different intervals of the scaling operation
i, j: Indices for nbSensors, i, j = 1, ..., nbSensors
k, l: Indices for nbTrainingTargets, k, l = 1, ..., nbTrainingTargets
m, n: Indices for nbTestTargets, m, n = 1, ..., nbTestTargets
p, r: Indices for nbPropositions, p, r = 1, ..., nbPropositions
u = (x, y): The scaling interval, where x and y are the lower and upper bounds for scaling, respectively
t: Index for the number of combinations
H_p: Class of proposition p
m_{kip}: BPA of sensor i for proposition p and target k for the training dataset
m_{ki}: Piece of evidence of sensor i for target k for the training dataset
mC_{ki}: Predicted class of target k based on m_{ki} by sensor i
acc_i: Accuracy of sensor i based on the training dataset
CORR: Pairwise correlation matrix between all sensors
corr_{ij}: Correlation coefficient between sensor i's and sensor j's predicted classes over all training evidences
sCorr_{ij}: Scaled correlation coefficient between sensor i's and sensor j's predicted classes over all training evidences
d_{kij}: Distance between the evidences of sensors i and j for target k
DIS_k: Pairwise distance matrix between all sensors' evidences for target k
s_{kij}: Similarity degree between the evidences of sensors i and j for target k
SIM_k: Pairwise similarity matrix between all sensors' evidences for target k
sup_{ki}: Support degree of sensor i's evidence for target k
crd_{ki}: Credibility degree of sensor i's evidence for target k
crd_i: Average credibility degree of sensor i over all targets
m'_{kp}: Discounted BPA for target k and proposition p for the training datasets of all sensors
k_{kt}: Combined conflicting degree of all sensors for target k in combination step t
m'_{kpt}: For target k, BPA of proposition p in the tth combination for the training datasets of all sensors
m'_{kp} (final): For target k, BPA of proposition p in the final combination for the training datasets of all sensors
fCrd_i: Calculated credibility degree of sensor i after training
mt_{mip}: For target m, sensor i's BPA for proposition p for the test dataset
mt_{mi}: Piece of evidence of sensor i for target m for the test dataset
mt'_{mp}: Discounted BPA for target m and proposition p for the test datasets of all sensors
mt'_{mpt}: For target m, BPA of proposition p in the tth combination for the test datasets of all sensors
mt'_{mp} (final): For target m, BPA of proposition p in the final combination for the test datasets of all sensors
aver_{mp}: Average BPA for target m and proposition p
Bel(H_p)_m: Belief degree of proposition p for target m

The overall objective of the system is to maximize the number of correctly predicted targets:

y_m = \begin{cases} 1 & \text{if the class of target } m \text{ is correctly predicted} \\ 0 & \text{otherwise} \end{cases}    (10)

The objective function of ECMDF is as follows:

\max \sum_{m=1}^{n} y_m    (11)

The detailed steps of ECMDF algorithm are given below.

3.3.1 The Training Phase

Step 0 Read the datasets of all sensors.

Step 1.1 Split the datasets of the sensors into training and test sets with a user-specified test size.

Step 1.2 Train both ANN and SVM separately for each sensor i.

Step 1.3 Probabilistically classify the sensors' training datasets with the trained and tuned ANN and SVM.

Step 1.4 For each sensor i, if ANN gives higher classification accuracy on the training dataset of sensor i, then assign its probabilistic classification results for each proposition to the respective BPA of sensor i for target k for the same proposition p, m_{kip}. Assign its probabilistic classification result to sensor i's piece of evidence for target k, m_{ki}. Assign its predicted classes of each target to sensor i's predicted classes for target k, mC_{ki}. Assign its classification accuracy ratio to sensor i's accuracy, acc_i. Otherwise, assign SVM's results to them.

Step 1.5 To penalize the sensors that have lower sensor priorities and to reward the ones that have higher priorities, take the second power of each acc_i and normalize. Do this for each sensor i and update acc_i accordingly.

Step 2.1 By using sensor i's and sensor j's predicted classes for each target in the training datasets, mC_{ki} and mC_{kj}, calculate the pairwise Pearson correlation coefficients, corr_{ij}, and build the correlation matrix, CORR.

corr_{ij} = \frac{\sum_{k=1}^{nbTrainingTargets} (mC_{ki} - \overline{mC}_{ki})(mC_{kj} - \overline{mC}_{kj})}{\sqrt{\sum_{k} (mC_{ki} - \overline{mC}_{ki})^2} \sqrt{\sum_{k} (mC_{kj} - \overline{mC}_{kj})^2}} \quad \forall i, j    (12)

CORR = \begin{bmatrix} corr_{11} & \cdots & corr_{1j} \\ \vdots & \ddots & \vdots \\ corr_{i1} & \cdots & corr_{ij} \end{bmatrix}    (13)
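Step 2.1 can be sketched as follows, assuming that the per-sensor predicted class labels have been encoded as integers and arranged in a targets-by-sensors array (an illustrative layout, not the authors' data structure):

```python
import numpy as np

# rows: training targets, columns: sensors; entries: predicted class labels encoded as integers
predicted = np.array([[0, 0, 1],
                      [1, 1, 1],
                      [2, 2, 0],
                      [0, 1, 0],
                      [2, 2, 2]])

# Pairwise Pearson correlations between the sensors' prediction vectors (Eqs. 12-13)
CORR = np.corrcoef(predicted, rowvar=False)
print(np.round(CORR, 2))
```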

Step 2.2 Scale each corrij in correlation matrix according to the first interval in sInterval and assign its result to each sCorrij for sensor i and sensor j. Step 2.3 For each target k, sensor i and sensor j; by using the scaled correlation coefficients in Step 2.2 as p parameter of Lp metric, calculate the distances between sensors i and j’s evidences, mki and mkj , for each target k by using sensors accuracies, acci and accj , as weight.

d_{kij} = \left( \sum_{l=1}^{nbPropositions} acc_i \ast acc_j \ast \left| m_{kil} - m_{kjl} \right|^{sCorr_{ij}} \right)^{1/sCorr_{ij}} \quad \forall k, i, j    (14)
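A sketch of the adaptive distance in Eq. (14), where the scaled correlation coefficient plays the role of the p parameter and the sensor accuracies act as weights (an assumed NumPy implementation; the variable names and example values are placeholders):

```python
import numpy as np

def evidence_distance(m_ki, m_kj, acc_i, acc_j, s_corr_ij):
    """Weighted Lp distance between two sensors' evidence vectors for one target (Eq. 14)."""
    diff = np.abs(np.asarray(m_ki) - np.asarray(m_kj)) ** s_corr_ij
    return float((acc_i * acc_j * diff.sum()) ** (1.0 / s_corr_ij))

# Two probabilistic evidences over three propositions for the same target
m_ki = [0.7, 0.2, 0.1]
m_kj = [0.5, 0.3, 0.2]
print(evidence_distance(m_ki, m_kj, 0.9, 0.8, s_corr_ij=1.5),
      evidence_distance(m_ki, m_kj, 0.9, 0.8, s_corr_ij=3.0))  # larger p -> smaller distance
```

A higher scaled correlation (larger p) therefore shrinks the computed distance between two consistent sensors, which is exactly the logic of Fig. 3.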

Step 2.4 Build the distance matrix DIS_k for each target k.

DIS_k = \begin{bmatrix} 1 & \cdots & d_{k1j} \\ \vdots & \ddots & \vdots \\ d_{ki1} & \cdots & 1 \end{bmatrix} \quad \forall k    (15)

Step 2.5 Normalize each DIS_k for each target k.

Step 2.6 Calculate the similarity degrees between the evidences of sensors i and j, s_{kij}, for each pair of sensors and target k, and build the similarity matrix for each target k.

s_{kij} = 1 - d_{kij} \quad \forall k, i, j    (16)

SIM_k = \begin{bmatrix} 1 & \cdots & s_{k1j} \\ \vdots & \ddots & \vdots \\ s_{ki1} & \cdots & 1 \end{bmatrix} \quad \forall k    (17)

Step 2.7 Calculate the credibility degree crd_{ki} of each sensor i's evidence for each target k. Then, take the average of the credibility degrees to find the average credibility degree of sensor i over all targets.

sup_{ki} = \sum_{j=1}^{nbSensors} s_{kij} \quad \forall k, i    (18)

crd_{ki} = \frac{sup_{ki}}{\sum_{j=1}^{nbSensors} sup_{kj}} \quad \forall k, i    (19)

crd_i = \frac{\sum_{k=1}^{nbTrainingTargets} crd_{ki}}{nbTrainingTargets} \quad \forall i    (20)
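Steps 2.4 to 2.7 (distances to similarities to support and credibility degrees, Eqs. 15-20) can be sketched per target as follows. This is an assumed implementation; in particular, the max-normalization of Step 2.5 is one possible choice, and the averaging over targets in Eq. (20) is left to the caller:

```python
import numpy as np

def credibility_from_distances(DIS):
    """DIS: (nbSensors x nbSensors) distance matrix for one target, zero diagonal."""
    DIS = (DIS / DIS.max()) if DIS.max() > 0 else DIS   # Step 2.5 normalization (one possible choice)
    SIM = 1.0 - DIS                                     # Eq. (16)
    sup = SIM.sum(axis=1)                               # Eq. (18): support of each sensor's evidence
    return sup / sup.sum()                              # Eq. (19): credibility degrees for this target

DIS_k = np.array([[0.0, 0.2, 0.8],
                  [0.2, 0.0, 0.7],
                  [0.8, 0.7, 0.0]])
print(np.round(credibility_from_distances(DIS_k), 3))  # the most distant sensor gets the lowest credibility
```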

Step 2.8 To penalize the sensors that have lower credibility degrees and to reward the ones that have higher credibility degrees, take the second power of the credibility degrees calculated in Eq. (20) of Step 2.7 and normalize them. Do this for each crd_i for each sensor i and update crd_i accordingly.

Step 2.9 Discount each BPA of sensor i for target k and proposition p, m_{kip}, and sum over all the sensors to get one unified BPA, m'_{kp}, for each target k and proposition p.

m'_{kp} = \sum_{i=1}^{nbSensors} crd_i \ast m_{kip} \quad \forall k, p    (21)

Step 2.10 Calculate the combined conflicting degree, k_kt, for each target k for combination step t = 1.

$$k_{kt} = \sum_{H_p \cap H_r = \emptyset} m'_{kp} * m'_{kr} \quad \forall k, \; t = 1, 2, \ldots, nbSensors - 1 \qquad (22)$$

Step 2.11 Using Eq. (23), combine the discounted BPAs once by using the conflicting degree in Eq. (22). Then, go to Step 2.10, increase t by 1, and calculate the conflicting degree between the same discounted evidence by combining it t times with itself. Through the conflicting degree, combine the discounted BPAs t times. Repeat this step nbSensors − 1 times for each target k and proposition p.

$$m'_{kpt} = \begin{cases} \dfrac{1}{1-k_{kt}} \displaystyle\sum_{H_p \cap H_r = H_p} m'_{kp} * m'_{kr} & \text{if } p \neq \emptyset \\ 0 & \text{if } p = \emptyset \end{cases} \quad \forall k, p, \; t = 1, \ldots, nbSensors - 1 \qquad (23)$$

When t = nbSensors − 1, assign the value of m'_{kpt} to m'_{kp}.

Step 2.12 Calculate the belief degree for each target k and proposition p.

$$Bel\left(H_p\right)_k = \sum_{H_l \subset H_p} m'_{kl} \quad \forall k, p \qquad (24)$$

Step 2.13 Assign the class of the proposition p with the highest belief degree, Bel(H_p)_k, to target k as the result of the fusion process.
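For singleton propositions, the discounting and repeated self-combination of Steps 2.9-2.12 (Eqs. (21)-(23)) take a particularly simple form, because H_p ∩ H_r is non-empty only when p = r. The sketch below is an illustration under that assumption, with hypothetical BPAs and credibility degrees.

```python
import numpy as np

def unify_bpas(bpa, crd):
    """Eq. (21): credibility-weighted sum of the sensors' BPAs for one target.
    bpa has shape (nbSensors, nbPropositions); crd has shape (nbSensors,)."""
    return crd @ bpa

def dempster_self_combine(m, times):
    """Eqs. (22)-(23) for singleton propositions: combine the unified BPA with itself
    'times' times; the conflicting degree is 1 minus the sum of the squared masses."""
    for _ in range(times):
        k = 1.0 - float(np.sum(m ** 2))   # Eq. (22): combined conflicting degree
        m = m ** 2 / (1.0 - k)            # Eq. (23): Dempster's rule for singletons
    return m

# Hypothetical BPAs of three sensors over three singleton propositions for one target
bpa = np.array([[0.7, 0.2, 0.1],
                [0.6, 0.3, 0.1],
                [0.2, 0.5, 0.3]])
crd = np.array([0.4, 0.4, 0.2])          # hypothetical credibility degrees (Step 2.8 output)

m = unify_bpas(bpa, crd)
m = dempster_self_combine(m, times=bpa.shape[0] - 1)   # nbSensors - 1 combinations
print(m, int(m.argmax()))   # Steps 2.12-2.13: the proposition with the highest belief wins
```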


Step 2.14 Return to Step 2.2 and scale the correlation coefficient matrix, CORR, with the next interval in sInterval. Repeat Steps 2.2 to 2.13 until there is no new interval in sInterval.

Step 2.15 Choose the best scaling interval, u, that gives the highest number of correct predictions, to be used in the test phase of ECMDF. Assign the credibility degrees calculated during the best scaling interval to fCrd_i, to be used later in the test phase.

3.3.2 The Test Phase

Step 3.1 For each sensor i, with the classifier selected in Step 1.4, probabilistically classify sensor i's test dataset. Assign its probabilistic classification result for each proposition p to the respective BPA of sensor i for target m, mt_mip.

Step 3.2 With the normalized credibility degrees determined in Step 2.15 of the training phase, fCrd_i, discount each BPA, mt_mip, for each sensor i, proposition p, and target m, and sum over all the sensors to get one unified BPA, mt'_mp, for each proposition p and target m:

$$mt'_{mp} = \sum_{i=1}^{nbSensors} fCrd_i * mt_{mip} \quad \forall m, p \qquad (25)$$

Step 3.3 Calculate the combined conflicting degree, k_mt, for each target m for combination step t = 1.

$$k_{mt} = \sum_{H_p \cap H_r = \emptyset} mt'_{mp} * mt'_{mr} \quad \forall m, \; t = 1, 2, \ldots, nbSensors - 1 \qquad (26)$$

Step 3.4 Using Eq. (27), combine the unified BPAs once by using the conflicting degree calculated in Eq. (26). Then, go to Step 3.3, increase t by 1, and calculate the conflicting degree between the same discounted evidence by combining it t times with itself. Through the conflicting degree, combine the discounted BPAs t times. Repeat this step nbSensors − 1 times for each target m and proposition p.

$$mt'_{mpt} = \begin{cases} \dfrac{1}{1-k_{mt}} \displaystyle\sum_{H_p \cap H_r = H_p} mt'_{mp} * mt'_{mr} & \text{if } p \neq \emptyset \\ 0 & \text{if } p = \emptyset \end{cases} \quad \forall m, p, \; t = 1, \ldots, nbSensors - 1 \qquad (27)$$

When t = nbSensors − 1, assign the value of mt'_{mpt} to mt'_{mp}.


Step 3.5 Calculate the belief degree for each target m and proposition p.

$$Bel\left(H_p\right)_m = \sum_{H_l \subset H_p} mt'_{ml} \quad \forall m, p \qquad (28)$$

Step 3.6 Assign the class of the proposition p with the highest belief degree, Bel(H_p)_m, to target m as the result of the fusion process.

3.4 Ensemble of Classifiers with Modified Distance Function and Sensor Accuracy (ECMDFS): Overview

In ECMDF, BPAs are discounted only by the credibility degrees of the sensors, which represent the similarity of one sensor's evidences to those of the other sensors. This measure helps us to modify the evidences in such a way that if one or more sensors are poorly functioning, then their effect in the final combination of the evidences is decreased due to their low correlation coefficients. However, if the majority of the sensors function poorly, then the correlation between all of them may be high, which may lead to misleading results. To overcome this situation, another discounting factor is needed. In the second approach, the Ensemble of Classifiers with Modified Distance Function and Sensor Accuracy (ECMDFS), the evidences are discounted by both sensor accuracy ratios and credibility degrees. The flowcharts of ECMDF in Figs. 1 and 2 are also valid for the training and test phases of ECMDFS, respectively. The two approaches differ from each other only in how they modify the evidences. They also share the same objective function, Eq. (11).

4 Computational Results

In this section, the computational results of the proposed algorithms are presented and compared with benchmark models. Dataset generation, parameter settings, and the performance measure are discussed before presenting the computational results.

4.1 Datasets

In ATR studies, working with real data is problematic. Thus, one needs to generate artificial data that are as realistic as possible. We generate artificial data to test the proposed algorithms. While generating datasets, the algorithm depicted in Fig. 5 is used. According to the algorithm, firstly, the number of sensors is determined. Then, the classification dataset is generated by adopting Guyon's (2003) algorithm.


Fig. 5 Dataset generation algorithm (flowchart): determine the number of sensors; generate a dataset with predetermined parameters according to Guyon (2003) for the first sensor; replicate the dataset for the remaining sensors; add noise to the sensors' datasets as required; randomly drop some features of each sensor's dataset.

Next, this dataset is replicated as many times as the number of sensors. To make the sensors' datasets different from each other, noise generated according to a normal distribution with mean 0 and variance 0.05 is added to each sensor's dataset. While adding the noise, the algorithm proposed by Xu and Yu (2017) is used. In real ATR systems, each radar may provide different feature readings about the same potential target. The information from each radar is then combined to predict the potential target's class. To apply the same concept, some randomly determined features are dropped from each sensor's dataset to represent the heterogeneous sensors.

The algorithms are tested with small, medium, and large datasets with different levels for the number of samples, which are 100, 500, and 1000. Two different levels for the number of features are used, i.e., 5 and 10. Three different settings for the number of noisy sensors are used, i.e., 0, 1, and 2. When the number of noisy sensors is 0, all sensors are consistent. When the number of noisy sensors is greater than 0, the given number of sensors is not consistent with the rest of the sensors. If the number of noisy sensors is greater than 0, another dataset with additional parameters is generated with Guyon's (2003) algorithm for the noisy sensors. In this dataset, all class labels of the noisy sensors are randomly changed. In addition, the feature values of each noisy sensor are shifted by a given number to make the classification task even harder. For experimental purposes, the feature values of the noisy sensors are shifted by 3.
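A minimal sketch of the generation procedure in Fig. 5, using scikit-learn's make_classification (which is adapted from Guyon's (2003) scheme) for the base dataset. The fixed seed, the way one feature is dropped per sensor, and the omission of the noisy-sensor branch are illustrative simplifications.

```python
import numpy as np
from sklearn.datasets import make_classification

rng = np.random.default_rng(42)
nb_sensors, nb_features, nb_samples, nb_classes = 3, 5, 500, 3

# Base dataset for the first sensor (Guyon's 2003 scheme)
X, y = make_classification(n_samples=nb_samples, n_features=nb_features,
                           n_informative=nb_features, n_redundant=0,
                           n_classes=nb_classes, random_state=42)

sensor_data = []
for _ in range(nb_sensors):
    Xs = X + rng.normal(0.0, np.sqrt(0.05), size=X.shape)            # noise with mean 0, variance 0.05
    keep = rng.choice(nb_features, nb_features - 1, replace=False)   # drop one random feature per sensor
    sensor_data.append((Xs[:, keep], y))
```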


4.2 Parameter Settings

In Table 2, the parameter settings for the generic framework of the computational runs are given. The number of classes and the number of sensors are both taken as 3. To test the algorithms over a range of values representing the full spectrum of the Lp metric, [1,2], [1,5], [1,10], [1,20], and [1,40] are used as scaling intervals. While training both the ANN and the SVM, two-thirds of the generated dataset is used as the training data in each problem setting. The ANN and SVM are trained with the default hyperparameters given in Tables 3 and 4. For setting the hyperparameters of the ANN and the SVM, random search and grid search are used, respectively. To tune both the ANN and the SVM, different combinations of hyperparameters are passed to the models, which are given in Table 5 for the ANN and in Table 6 for the SVM. The final values of these hyperparameters are decided dynamically based on the generated dataset.

Table 2 Parameter settings for all ensembles
Parameters          Value
Number of classes   3
Number of sensors   3
Scaling interval    [1,2], [1,5], [1,10], [1,20], [1,40]
Training size       2/3

Table 3 Default parameters used in the training of each ANN for each sensor
ANN parameters                           Default values
Weights initialization
  Kernel initializer                     Random normal
  Bias initializer                       Random normal
Number of hidden layers                  1
Number of hidden neurons(a)              √(f + s) / L
Activation function                      Rectified linear unit
Learning rate                            0.1
Optimizer                                Stochastic gradient descent
Loss function                            Categorical cross-entropy
Number of epochs                         32
Batch size                               50
Number of iterations for random search   5
(a) f represents the number of features, s represents the number of samples, and L is the number of hidden layers

Table 4 Default parameters used in the training of each SVM for each sensor
SVM parameters            Default values
Kernel function           Radial basis function
C                         1
Gamma                     1 / Number of features
Decision function shape   One versus rest


Table 5 Hyperparameters tuned with random search and their passed combinations for ANN
Hyperparameters       Values/types
Activation function   Softmax, rectified linear unit, hyperbolic tangent, sigmoid
Optimizer             Stochastic gradient descent, root mean square propagation, adaptive gradient, adaptive moment estimation
Batch size            32 or number of samples
Epoch                 [10, 20, 30, 40, 50]

Table 6 Hyperparameters tuned with grid search and their passed combinations for SVM
Hyperparameters                         Values/types
Kernel function                         Radial basis function, polynomial kernel function
C                                       [1, 5, 10]
Degree for polynomial kernel function   [1, 2, 3]

The method of Ke and Liu (2008), which minimizes the mean square error, is used for determining the number of hidden nodes in each layer. In order to obtain probabilistic classification results, the categorical cross-entropy loss function and a softmax activation function in the output layer are used for the ANN. In the computational experiments, the learning rate is taken as 0.1, and 5 iterations are conducted for the random search. In addition, the number of epochs in the random search is restricted to values between 10 and 50.
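As an illustration of the SVM side of this tuning, the sketch below builds an SVC with the Table 4 defaults and runs a grid search over the Table 6 combinations with scikit-learn; the toy data and the 3-fold cross-validation are assumptions made only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=5, n_informative=5,
                           n_redundant=0, n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1/3, random_state=0)

# Defaults from Table 4: RBF kernel, C = 1, gamma = 1 / number of features, one-vs-rest
svm = SVC(kernel='rbf', C=1.0, gamma=1.0 / X_train.shape[1],
          probability=True, decision_function_shape='ovr')

# Grid from Table 6: kernel type, C, and polynomial degree
param_grid = {'kernel': ['rbf', 'poly'], 'C': [1, 5, 10], 'degree': [1, 2, 3]}
search = GridSearchCV(svm, param_grid, cv=3).fit(X_train, y_train)
probs = search.best_estimator_.predict_proba(X_test)   # probabilistic outputs used as BPAs
```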

4.3 Performance Measure

To evaluate the classification quality of the proposed algorithms, the percentage of correct predictions (PCP) is used as the performance measure. This metric measures the percentage of correct predictions over the test dataset. As it gets higher, the classification power of the model increases:

$$y_m = \begin{cases} 1 & \text{if the class of target } m \text{ is correctly predicted} \\ 0 & \text{otherwise} \end{cases} \qquad (29)$$

$$PCP = \frac{\sum_{m=1}^{nbTestTargets} y_m}{nbTestTargets}, \quad \text{where } m = 1, 2, \ldots, nbTestTargets \qquad (30)$$
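The PCP measure of Eqs. (29)-(30) is simply classification accuracy expressed as a percentage; a one-function sketch:

```python
import numpy as np

def pcp(y_true, y_pred):
    """Percentage of correct predictions over the test targets, Eqs. (29)-(30)."""
    return 100.0 * float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

print(pcp([0, 1, 2, 1], [0, 1, 1, 1]))   # 75.0
```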


4.4 Computational Results

In this study, we propose a new way of correcting the evidences that enables us to overcome the problems (i.e., the paradoxical situations) of the classical DST (Dempster 1967; Shafer 1976). The proposed models aim to elastically calculate the evidence distances and assign the credibility degrees of the sensors through them. The proposed models are inspired by Yong et al. (2004) in discounting the evidences by credibility degrees. There are recent studies that generally utilize the idea in the seminal work of Yong et al. (2004). Thus, in order to assess the performance of the proposed models, we use the classical DST and Yong et al. (2004) as our benchmark models. They differ from ECMDF and ECMDFS only in the fusion of the evidences coming from multiple sensors. We name the benchmark models Ensemble of Classifiers with Classical Dempster–Shafer Theory (ECDST) and Ensemble of Classifiers with Yong et al. (2004) (ECY), respectively.

All models are coded in Python, and the computational experiments are conducted on a personal computer with an Intel® Core™ i5-5200U 2.20 GHz processor, 8 GB RAM, and a 64-bit Windows 10 operating system. ECDST, ECY, ECMDF, and ECMDFS are solved for 18 different problem settings. In each problem setting, five different datasets are generated with different seed values, and a summary of the results is presented in Table 7.

Out of the 18 different problem settings, ECY and ECMDF perform worse than ECDST in 16 and 14 problem settings, respectively. ECY gives the same results as ECDST for two of the problem settings, 5/100/0 and 10/100/0. On the other hand, ECMDF gives the same results as ECDST for three of the problem settings, and gives higher results than ECDST for only one of the problem settings, 5/100/1, with a 0.61% difference. When the number of noisy sensors is greater than 0, ECY and ECMDF begin to perform poorly due to the nature of the calculation of the credibility degrees. When the number of noisy sensors in the ensemble is greater than the number of consistent sensors, the credibility degrees of the noisy sensors begin to increase. This situation reduces the effect of the consistent sensors' evidences and increases the effect of the noisy sensors' evidences in the evidence combination phase for each target. Thus, the difference between the PCP values of ECY and ECMDF and those of ECDST is much higher in problem settings 5/100/2, 5/500/2, 5/1000/2, 10/100/2, 10/500/2, and 10/1000/2. The maximum difference between the PCP values of ECY and ECDST occurs in 5/500/2 with −54.76%. For the same problem setting, the difference between the PCP values of ECMDF and ECDST is −20.24%. Also, ECMDF's worst performance occurs in 5/100/2 with −29.09%.

ECMDFS gives better PCP results than ECY and ECMDF when the number of noisy sensors is greater than 0. Out of 18 problem settings, ECMDFS gives better PCP results than ECDST in 6 problem settings and gives the same results in 2 problem settings. The highest difference in PCP values occurs in 10/100/1, which is 1.82%. Also, when the number of features is 5 and the number of noisy sensors is 2, ECMDFS's PCP values are higher than those of ECDST for all dataset sizes.


Table 7 Summary of computational results for PCP values (%)
Problem setting(a)   ECDST   ECY     ECMDF   ECMDFS
5/100/0              81.82   81.82   81.82   81.82
5/500/0              91.15   90.30   90.30   90.42
5/1000/0             88.73   85.64   88.24   88.24
5/100/1              80.61   77.58   81.21   81.82
5/500/1              89.45   84.36   87.88   88.85
5/1000/1             86.91   79.09   85.03   85.76
5/100/2              72.73   39.39   43.64   73.33
5/500/2              85.21   33.45   64.97   85.82
5/1000/2             82.42   35.52   79.03   82.61
10/100/0             77.58   77.58   77.58   77.58
10/500/0             88.97   87.27   88.24   88.73
10/1000/0            89.09   86.97   88.42   88.30
10/100/1             72.12   64.24   72.12   73.94
10/500/1             87.39   80.12   82.42   86.42
10/1000/1            86.36   80.73   83.09   85.21
10/100/2             69.70   47.88   49.70   61.82
10/500/2             81.21   41.58   57.33   80.97
10/1000/2            83.82   37.52   63.58   84.30
Overall              83.07   67.28   75.81   82.55
(a) The number of features, number of samples, and number of noisy sensors are represented as x/y/z in the Problem setting column.

Table 8 p values of hypothesis testing for PCP values between each ensemble and ECDST
Ensemble   ECY      ECMDF    ECMDFS
ECDST      0.0049   0.0563   0.8206

Table 9 p values of hypothesis testing for PCP values between each ensemble and ECY
Ensemble   ECDST    ECMDF    ECMDFS
ECY        0.0049   0.1665   0.0068

The worst performance of ECMDFS occurs in problem setting 10/100/2. For this setting, the difference between the PCP values of ECMDFS and ECDST is −7.88%.

In order to test the statistical significance, a paired t-test is performed. Table 8 shows the p values of the hypothesis tests comparing the PCP values of each ensemble with ECDST. At a 5% significance level, ECDST shows better PCP values than ECY. On the other hand, at a 6% significance level, it also performs better than ECMDF. There is no statistically significant difference between ECDST and ECMDFS. In Table 9, the p values of the paired t-tests between the PCP values of ECY and the other ensembles are given. At a 5% significance level, ECDST and ECMDFS give higher PCP values than ECY. On the other hand, at the same significance level, ECY and ECMDF give similar results.
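Such paired comparisons can be reproduced with scipy's ttest_rel; the sketch below uses only the first six rows of Table 7 as an illustration, so its p value is not the one reported in Table 8.

```python
from scipy import stats

# PCP values of ECDST and ECY for the first six problem settings of Table 7 (illustration only)
pcp_ecdst = [81.82, 91.15, 88.73, 80.61, 89.45, 86.91]
pcp_ecy   = [81.82, 90.30, 85.64, 77.58, 84.36, 79.09]

t_stat, p_value = stats.ttest_rel(pcp_ecdst, pcp_ecy)   # paired t-test
print(round(p_value, 4))
```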


The intervals [1,2], [1,5], [1,10], [1,20], and [1,40] are used as scaling intervals to represent the full spectrum of the Lp metric. The number of times each scaling interval is used is reported in Table 10. Table 10 shows that the adaptive distance idea works in representing the distance to achieve higher PCP values. Also, the average accuracy ratios of the ANN and the SVM for each sensor in each problem setting are reported in Table 11. It is observed that the ANN gives better classification accuracies than the SVM for the noisy sensors. Table 12 shows the computational times. As the number of noisy sensors increases, the computational times of all ensembles increase. However, all models have approximately similar computational times.

Table 10 Number of times each scaling interval is used for ECMDF and ECMDFS
Scaling interval   ECMDF   ECMDFS
[1, 2]             55      61
[1, 5]             10      4
[1, 10]            6       5
[1, 20]            8       7
[1, 40]            11      13

Table 11 Average classification accuracies of ANN and SVM for each problem setting
Problem setting   ANN Sensor 1   ANN Sensor 2   ANN Sensor 3   SVM Sensor 1   SVM Sensor 2   SVM Sensor 3
5/100/0           0.78           0.76           0.74           0.91           0.86           0.84
5/500/0           0.84           0.78           0.76           0.89           0.82           0.86
5/1000/0          0.84           0.77           0.82           0.87           0.83           0.86
5/100/1           0.78           0.76           0.38           0.91           0.86           0.35
5/500/1           0.84           0.78           0.37           0.89           0.82           0.33
5/1000/1          0.84           0.77           0.36           0.87           0.83           0.33
5/100/2           0.78           0.39           0.38           0.91           0.34           0.35
5/500/2           0.84           0.36           0.37           0.89           0.34           0.33
5/1000/2          0.84           0.35           0.36           0.87           0.33           0.33
10/100/0          0.84           0.83           0.81           0.91           0.94           0.90
10/500/0          0.85           0.78           0.82           0.95           0.87           0.89
10/1000/0         0.87           0.79           0.82           0.95           0.87           0.92
10/100/1          0.84           0.83           0.50           0.91           0.94           0.30
10/500/1          0.85           0.78           0.43           0.95           0.87           0.33
10/1000/1         0.87           0.79           0.39           0.95           0.87           0.33
10/100/2          0.84           0.52           0.50           0.91           0.31           0.30
10/500/2          0.85           0.39           0.43           0.95           0.34           0.33
10/1000/2         0.87           0.39           0.39           0.95           0.32           0.33
Overall           0.84           0.66           0.53           0.91           0.69           0.51

Table 12 Computational times of each ensemble in seconds
Problem setting   ECDST Training   ECDST Test   ECY Training   ECY Test   ECMDF Training   ECMDF Test   ECMDFS Training   ECMDFS Test
5/100/0           107              4            139            6          149              10           212               4
5/500/0           201              8            192            12         236              8            243               6
5/1000/0          233              13           223            17         300              11           311               10
5/100/1           192              8            181            9          195              7            203               6
5/500/1           436              10           424            12         473              9            480               7
5/1000/1          933              15           915            19         1003             13           1016              12
5/100/2           195              7            186            8          201              7            207               5
5/500/2           647              10           634            12         689              8            688               8
5/1000/2          1407             12           1394           17         1317             13           1484              11
10/100/0          195              8            181            9          197              8            206               6
10/500/0          218              10           206            13         256              9            265               8
10/1000/0         307              16           292            20         380              13           394               12
10/100/1          195              8            180            9          196              8            206               5
10/500/1          500              11           485            14         539              9            547               7
10/1000/1         1516             13           1499           17         1585             12           1595              10
10/100/2          169              6            184            8          197              7            205               5
10/500/2          763              12           750            12         796              9            809               9
10/1000/2         3201             14           3193           17         3268             12           3281              12
Overall           634              10           625            13         665              10           686               8


4.5 Paradoxical Situations

The datasets generated for each problem setting in the previous subsection do not contain any paradoxical situations, and paradoxical evidences cannot be obtained by only changing the parameters of the problem settings. Thus, the probabilistic classification results of the classifiers are manipulated to generate the paradoxical situations. We explain the data manipulation approach we take for each paradox below.

Complete Conflict Paradox: To make the conflicting degree equal to 1, the true class's BPA is set to 1 for the first sensor. For the second sensor, if the minimum BPA in the FOD belongs to the true class, then the true class's BPA is set to 0 and the maximum BPA in the FOD is set to 1. On the other hand, if the minimum BPA in the FOD does not belong to the true class, it is set to 1, and the maximum BPA is set to 0.

0 Trust Paradox: To generate this paradox, the true class's BPA is set to 0 for one of the sensors. If the true class's BPA is the minimum BPA of the FOD, its mass is added to the maximum BPA. Otherwise, the true class's mass is added to the minimum BPA in the FOD.

1 Trust Paradox: To generate this paradox, the BPAs of two different propositions are manipulated for two different sensors. For example, if the true class of the target is C, its BPA in the first sensor is set to 0 and its value is added to the first proposition's BPA. In addition, the BPA of A in the second sensor's evidence is set to 0 and its value is added to the second proposition's BPA.

High Conflict Paradox: To generate the High Conflict Paradox, the algorithm used to generate the Complete Conflict Paradox is first applied. Then, a very small mass is added to the BPAs whose values are 0. Lastly, the added masses are subtracted from the BPA that has the maximum mass for each evidence and target.

For each of the paradoxical situations, five different datasets with three sensors and three classes, each with 1000 samples, are generated, and the averages of the PCP values are presented in Table 13. In the Complete Conflict Paradox, DST cannot be applied because the conflict degree is equal to 1 and thus the denominator in the combination rule becomes 0. On the other hand, in the 0 Trust Paradox, the BPA of the true class is changed to 0 in one of the sensors for each target.

Table 13 Average PCP values for paradoxical situations*
Paradox                     ECDST            ECY     ECMDF   ECMDFS
Complete conflict paradox   Not applicable   91.94   98.06   98.42
0 trust paradox             0.00             73.45   81.94   82.85
1 trust paradox             0.00             77.82   82.79   83.82
High conflict paradox       87.21            91.94   98.06   98.30
*Values in the table are the averages of the results of five different datasets for each paradox.


Thus, DST gives 0 mass to this class. The same situation is also valid for the 1 Trust Paradox. As a result, DST gives 0% PCP values for both of these paradoxes. For all paradoxes, ECMDFS outperforms the others. In addition, ECMDF gives similar results to ECMDFS. ECY's performance under the paradoxes is satisfactory but not as good as that of the proposed algorithms.

5 Hybrid Multi-sensor Target Classification Algorithms

In this section, the hybrid extensions of both ECMDF and ECMDFS are presented. While the classical DST combination rule gives unsatisfactory results under paradoxical situations, the proposed models handle those cases well. Thus, the hybrid extensions of both ECMDF and ECMDFS are proposed to combine the power of both the classical DST and the proposed models. The hybrid ECMDF is abbreviated as h-ECMDF and the hybrid ECMDFS as h-ECMDFS.

In the hybrid models, the main idea is to use the classical DST when the conflicting degree is low and to use the proposed approaches when the conflicting degree is high. h-ECMDF and h-ECMDFS are inspired by Zhang et al. (2017) in the activation of the proposed approaches. As in Zhang et al. (2017), the conflicting degree is controlled through a threshold value, ξ. If the conflicting degree is smaller than ξ, then the evidences are fused by the classical DST. If the conflicting degree is larger than ξ, then the evidences are fused by the proposed approaches. Figure 6 shows the flowchart of the hybrid models.

To determine a reasonable threshold value, problem settings 10/100/1 and 10/100/2, for which ECMDFS gives its best and worst performances, are solved with different levels of conflicting degrees, k ∈ {0.75, 0.80, 0.85, 0.90, 0.95, 1}. For each of these problem settings, five different datasets are generated. For each problem setting, different PCP values are observed for only one of the seeds. For those seeds, Figs. 7 and 8 show the number of correctly predicted targets versus different values of k for problem settings 10/100/1 and 10/100/2, respectively. Tables 14 and 15 show the number of activation times of each algorithm.

Fig. 6 Flowchart of the hybrid models: the conflicting degree is calculated for each target; if it is smaller than ξ, the evidences are fused with the classical DST, otherwise with ECMDF/ECMDFS.
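A sketch of the dispatch logic in Fig. 6 for singleton propositions; the fusion callables are placeholders (here the classical Dempster rule stands in for both branches), and the helper names are illustrative rather than the authors' implementation.

```python
import numpy as np

def conflicting_degree(bpa1, bpa2):
    """Conflict between two BPAs over singleton propositions: the mass assigned to
    pairs of propositions with an empty intersection."""
    return 1.0 - float(np.dot(bpa1, bpa2))

def classical_dst(bpa1, bpa2):
    """Classical Dempster combination rule for singleton propositions."""
    joint = np.asarray(bpa1) * np.asarray(bpa2)
    return joint / joint.sum()

def hybrid_fusion(bpa1, bpa2, xi, low_conflict_rule, high_conflict_rule):
    """Fig. 6: fuse with the classical DST below the threshold xi, otherwise with ECMDF/ECMDFS."""
    rule = low_conflict_rule if conflicting_degree(bpa1, bpa2) < xi else high_conflict_rule
    return rule(bpa1, bpa2)

fused = hybrid_fusion([0.7, 0.2, 0.1], [0.6, 0.3, 0.1], xi=0.80,
                      low_conflict_rule=classical_dst,
                      high_conflict_rule=classical_dst)   # ECMDF/ECMDFS would go here
print(fused)
```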


Fig. 7 Number of correctly predicted targets versus different k values (conflicting degree thresholds) for problem setting 10/100/1, comparing ECDST and h-ECMDFS.

Fig. 8 Number of correctly predicted targets versus different k values (conflicting degree thresholds) for problem setting 10/100/2, comparing ECDST and h-ECMDFS.

Table 14 Number of activation times of each algorithm for problem setting 10/100/1
k      ECDST   ECMDFS
0.75   29      4
0.80   29      4
0.85   30      3
0.90   30      3
0.95   31      2
1      33      0

In problem setting 10/100/1, ECMDFS gives a better PCP value than ECDST. In Fig. 7, if h-ECMDFS is used with different threshold values ranging from 0.75 to 1, the same PCP values are observed for all conflicting degrees except 1. Also, when the threshold value is equal to 1, h-ECMDFS becomes the same as ECDST.

Table 15 Number of activation times of each algorithm for problem setting 10/100/2
k      ECDST   ECMDFS
0.75   29      4
0.80   31      2
0.85   32      1
0.90   33      0
0.95   33      0
1      33      0

In this situation, all evidences are combined with the classical DST. In Fig. 8, increasing the threshold value also increases the PCP values of h-ECMDFS. When the conflicting degree threshold is increased to 0.80, h-ECMDFS predicts one more target correctly. The same is valid when k = 0.85. Additional improvements are observed when the threshold value is increased to 0.90 or higher. However, for conflicting degrees greater than 0.85, the hybrid algorithm uses the classical DST for combining the evidences. By using hybrid models, we aim to combine the power of both algorithms; thus, both ECMDFS and the classical DST should be used to benefit from the strengths of both of them. A careful investigation of the generated data shows that, if the threshold value is set to 0.90 or higher, ECMDFS is not activated. By considering the number of activation times of ECMDFS in both of these problem settings, 0.80 is chosen as the threshold value, ξ, for experimental purposes. Note that the number of paradoxical situations in the generated data is small. Thus, it is expected that the marginal improvements in the performances of the hybrid models will be limited.

h-ECMDF and h-ECMDFS are solved for the 18 different problem settings. In each problem setting, the same datasets used to test ECMDF and ECMDFS are used, and the average PCP results of ECDST, ECMDF, ECMDFS, h-ECMDF, and h-ECMDFS for each problem setting are presented in Table 16. The last line of Table 16 shows the overall averages over all problem settings for each ensemble. With the hybrid extension of the proposed models, increases of 6.41% and 0.31% in the overall average PCP values are observed for ECMDF and ECMDFS, respectively. Out of the 18 problem settings, h-ECMDF gives better PCP values than ECMDF in 14 settings. Also, h-ECMDFS's PCP values are higher than ECMDFS's in nine settings. The largest performance increase from the hybrid extensions occurs in the problem settings with two noisy sensors for ECMDF: the largest difference between ECMDF and ECDST, which occurs in problem setting 5/100/2, decreases from −29.09% to −4.24%. The largest difference for ECMDFS decreases from −7.88% to −0.61% in problem setting 10/100/2. However, the maximum positive difference between ECMDFS and ECDST decreases from 1.82% to 0.61%.


Table 16 PCP values (%) of the hybrid models
Problem setting   ECDST   ECMDF   ECMDFS   h-ECMDF   h-ECMDFS
5/100/0           81.82   81.82   81.82    81.21     81.21
5/500/0           91.15   90.30   90.42    91.03     90.79
5/1000/0          88.73   88.24   88.24    88.67     88.73
5/100/1           80.61   81.21   81.82    80.61     80.61
5/500/1           89.45   87.88   88.85    88.12     88.73
5/1000/1          86.91   85.03   85.76    85.94     86.30
5/100/2           72.73   43.64   73.33    68.48     73.33
5/500/2           85.21   64.97   85.82    84.85     85.21
5/1000/2          82.42   79.03   82.61    82.42     82.42
10/100/0          77.58   77.58   77.58    76.36     76.97
10/500/0          88.97   88.24   88.73    88.61     88.85
10/1000/0         89.09   88.42   88.30    88.61     88.61
10/100/1          72.12   72.12   73.94    72.12     72.73
10/500/1          87.39   82.42   86.42    86.91     87.52
10/1000/1         86.36   83.09   85.21    84.61     85.45
10/100/2          69.70   49.70   61.82    68.48     69.09
10/500/2          81.21   57.33   80.97    79.76     81.09
10/1000/2         83.82   63.58   84.30    83.21     83.82
Overall           83.07   75.81   82.55    82.22     82.86

Table 17 shows the activation times of the classical DST and of ECMDF/ECMDFS in each problem setting. In general, when the number of samples increases, the number of activations of ECMDF/ECMDFS increases. However, the majority of the evidences for each target in each problem setting are fused with the classical DST. That is why the average PCP values in Table 16 are similar to each other for all ensembles. A paired t-test is performed to assess the statistical significance. Table 18 shows the p values of the hypothesis tests on the PCP values of h-ECMDF, h-ECMDFS, and ECDST. At a 5% significance level, all ensembles give similar results.


Table 17 Activation times of the algorithms in the hybrid approaches in each problem setting
Problem setting   # of DST used (Min / Average(a) / Max)   # of ECMDF/ECMDFS used (Min / Average(a) / Max)
5/100/0           31 / 32.40 / 33                           0 / 0.60 / 2
5/500/0           152 / 157.80 / 162                        3 / 7.20 / 13
5/1000/0          303 / 315.00 / 327                        3 / 15.00 / 27
5/100/1           31 / 32.40 / 33                           0 / 0.60 / 2
5/500/1           152 / 157.80 / 162                        3 / 7.20 / 13
5/1000/1          303 / 315.00 / 327                        3 / 15.00 / 27
5/100/2           26 / 30.40 / 33                           0 / 2.60 / 7
5/500/2           161 / 164.20 / 165                        0 / 0.80 / 4
5/1000/2          330 / 330.00 / 330                        0 / 0.00 / 0
10/100/0          29 / 32.00 / 33                           0 / 1.00 / 4
10/500/0          158 / 161.00 / 163                        2 / 4.00 / 7
10/1000/0         304 / 309.80 / 316                        14 / 20.20 / 26
10/100/1          29 / 32.00 / 33                           0 / 1.00 / 4
10/500/1          158 / 161.00 / 163                        2 / 4.00 / 7
10/1000/1         304 / 309.80 / 316                        14 / 20.20 / 26
10/100/2          26 / 30.80 / 33                           0 / 2.20 / 7
10/500/2          151 / 161.60 / 165                        0 / 3.40 / 14
10/1000/2         321 / 327.00 / 330                        0 / 3.00 / 9
(a) Values in the table are the averages of the results of five different datasets for each problem setting.

Table 18 p values of the paired t-tests
Ensemble   h-ECMDF   h-ECMDFS
ECDST      0.7054    0.9212

6 Conclusion

In this study, the classification phase of ATR systems is addressed. For the classification purpose, an ensemble of classifiers is employed and non-imagery heterogeneous data sources are adopted. In the ensemble, both an ANN and an SVM are trained for each sensor, and the classifier that has the higher accuracy ratio is chosen. The different results coming from each classifier for each sensor are combined through the modified DST. We propose new approaches that elastically calculate the distances between the evidences and modify them to overcome the paradoxical situations in DST. In all proposed algorithms, the Lp distance metric is used. This distance metric enables us to calculate the distances between evidences by taking into account the effects of each sensor. The sensor accuracies calculated in the training phase are used as the weights of the sensors in the distance metric. Also, the pairwise correlation coefficients between the sensors calculated in the training phase are used as the p parameters of the Lp distance metric.


Two different algorithms, namely ECMDF and ECMDFS, are proposed. The first one discounts the evidences by credibility degrees, whereas the second one discounts them based on both credibility degrees and sensor reliabilities. Both algorithms calculate the credibility degrees and sensor reliabilities based on the training datasets. The computational results of ECMDF and ECMDFS are compared with the ensembles that integrate the classical DST and the modified DST proposed by Yong et al. (2004), using generated datasets that represent ATR systems. The results show that ECY and ECMDF perform poorly when the number of noisy sensors is greater than the number of consistent ones. In this situation, the credibility degrees between the noisy sensors are high and thus the effect of the evidences coming from them is high in the evidence combination phase. The results also show that there is no statistically significant difference between ECDST and ECMDFS in the computational experiments. In addition, ECMDFS shows better performance in the problem settings where the number of features is low and the number of noisy sensors is at its maximum. Both ECDST and ECMDFS outperform ECY.

The behavior of all ensembles in the paradoxical situations is also investigated. The probabilistic classification results of each classifier for each sensor are manipulated and artificial datasets are generated. The results show that ECMDF and ECMDFS give the best performances in all paradoxes.

The hybrid extensions of ECMDF and ECMDFS, h-ECMDF and h-ECMDFS, are also presented. In the hybrid algorithms, if the conflicting degree is smaller than the threshold value, the classical DST is activated; otherwise, ECMDF or ECMDFS is activated. Both hybrid models are tested with the same problem settings and datasets as ECMDF and ECMDFS. With the hybrid extension of ECMDF, better performance measures are achieved for the majority of the problem settings.

To the best of our knowledge, there is no study that considers an adaptive distance metric in the Dempster–Shafer Theory for assigning the reliability of the sources. The main contribution of the study is to integrate the classical DST with an adaptive distance metric, the Lp metric, for assigning the credibility degrees of each sensor. The proposed distance metric calculates the distances in a scenario-based manner and is flexible with respect to different types of datasets. Also, the study combines DST with ML and MCDM. The proposed way of combining different pieces of evidence through the Lp metric with the modified DST is shown to be effective in both the computational results and the paradoxical situations.

All ensembles in this study are tested with generated artificial datasets. In real-life applications, finding multi-sensor datasets for sensor data fusion is problematic. A future direction would be testing the proposed algorithms with real datasets.

References

Camargo LS, Yoneyama T (2001) Specification of training sets and the number of hidden neurons for multilayer perceptrons. Neural Comput 13(12):2673–2680
Cao W, Wang X, Ming Z, Gao J (2018) A review on neural networks with random weights. Neurocomputing 275:278–287


Chen Y, Cremers AB, Cao Z (2014) Interactive color image segmentation via iterative evidential labeling. Inf Fusion 20:292–304 Chen J, Ye F, Jiang T, Tian Y (2017) Conflicting information fusion based on an improved DS combination method. Symmetry 9(11):278 Chen L, Diao L, Sang J (2018) Weighted evidence combination rule based on evidence distance and uncertainty measure: an application in fault diagnosis. Math Probl Eng 2018:1–10 Chollet F (2018) Deep learning with Python. Manning Publications, Shelter Island Dempster AP (1967) Upper and lower probabilities induced by a multivalued mapping. Ann Math Stat 38(2):325–339 Deng Y, Chan FT (2011) A new fuzzy dempster MCDM method and its application in supplier selection. Expert Syst Appl 38(8):9854–9861 Dong G, Kuang G (2015) Target recognition via information aggregation through Dempster– Shafers evidence theory. IEEE Geosci Remote Sens Lett 12(6):1247–1251 Dubois D, Prade H (1988) Representation and combination of uncertainty with belief functions and possibility measures. Comput Intell 4(3):244–264 Dymova L, Sevastjanov P (2010) An interpretation of intuitionistic fuzzy sets in terms of evidence theory: decision making aspect. Knowl-Based Syst 23(8):772–782 Fan X, Zuo MJ (2006) Fault diagnosis of machines based on D–S evidence theory. Part 2: application of the improved D–S evidence theory in gearbox fault diagnosis. Pattern Recogn Lett 27(5):377–385 Fan C, Song Y, Lei L, Wang X, Bai S (2018) Evidence reasoning for temporal uncertain information based on relative reliability evaluation. Expert Syst Appl 113:264–276 Florea MC, Jousselme A, Bossé É, Grenier D (2009) Robust combination rules for evidence theory. Inf Fusion 10(2):183–197 Foucher S, Germain M, Boucher J, Benie G (2002) Multisource classification using ICM and Dempster-Shafer theory. IEEE Trans Instrum Meas 51(2):277–281 Genton MG (2001) Classes of kernels for machine learning: a statistics perspective. J Mach Learn Res 2:299–312 Géron A (2018) Hands-on machine learning with Scikit-learn and TensorFlow concepts, tools, and techniques to build intelligent systems. Beijing, OReilly Guyon I (2003) Design of experiments for the NIPS 2003 variable selection benchmark Horiuchi T (1998) Decision rule for pattern classification by integrating interval feature values. IEEE Trans Pattern Anal Mach Intell 20(4):440–448 Hsu CW, Lin CJ (2002) A simple decomposition method for support vector machines. Machine Learning, 46(1-3):291–314 Inagaki T (1991) Interdependence between safety-control policy and multiple-sensor schemes via Dempster-Shafer theory. IEEE Trans Reliab 40(2):182–188 James G, Witten D, Hastie T, Tibshirani R (2017) An introduction to statistical learning: with applications in R. Springer, New York Jonathan Milgram J, Cheriet M, Sabourin R (2006) “One against one” or “one against all”: which one is better for handwriting recognition with SVMs? In: Tenth international workshop on frontiers in handwriting recognition, Université de Rennes 1, Oct 2006, La Baule Josang A, Daniel M, Vannoorenberghe P (2003) Strategies for combining conflicting dogmatic beliefs. In: Proceedings of the 6th international conference of information fusion, 2003 Kavzoglu T, Colkesen I (2009) A kernel functions analysis for support vector machines for land cover classification. Int J Appl Earth Obs Geoinf 11(5):352–359 Ke J, Liu X (2008) Empirical analysis of optimal hidden neurons in neural network modeling for stock prediction. 
In: 2008 IEEE Pacific-Asia workshop on computational intelligence and industrial application, vol 2, pp 828–832 Keerthi SS, Gilbert EG (2002) Convergence of a generalized SMO algorithm for SVM classifier design. Mach Learn 46(1–3):351–360 Khaleghi B, Khamis A, Karray FO, Razavi SN (2013) Multisensor data fusion: a review of the state-of-the-art. Inf Fusion 14(1):28–44


Kotsiantis SB (2007) Supervised machine learning: a review of classification techniques. Informatica 31:249–268 Leung Y, Ji N, Ma J (2013) An integrated information fusion approach based on the theory of evidence and group decision-making. Inf Fusion 14(4):410–422 Li Y, Chen J, Ye F, Liu D (2016) The improvement of DS evidence theory and its application in IR/MMW target recognition. J Sens 2016:1–15 Liu Z, Pan Q, Dezert J, Martin A (2018) Combination of classifiers with optimal weight based on evidential reasoning. IEEE Trans Fuzzy Syst 26(3):1217–1230 Mcculloch WS, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5(4):115–133 Mercier D, Quost B, Denœux T (2008) Refined modeling of sensor reliability in the belief function framework using contextual discounting. Inf Fusion 9(2):246–258 Murphy CK (2000) Combining belief functions when evidence conflicts. Decis Support Syst 29(1):1–9 Pal N, Ghosh S (2001) Some classification algorithms integrating Dempster-Shafer theory of evidence with the rank nearest neighbor rules. IEEE Trans Syst Man Cybern Syst Hum 31(1):59–66 Platt JC (1999) Using analytic QP and sparseness to speed training of support vector machines. In: Neural information processing systems 11. MIT, pp 557–563 Rogers SK, Colombi JM, Martin CE, Gainey JC, Fielding KH, Burns TJ, Oxley MU (1995) Neural networks for automatic target recognition. Neural Netw 8(7–8):1153–1184 Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536 Sagi O, Rokach L (2018) Ensemble learning: a survey. Wiley Interdiscip Rev 8(4):1–18 Shafer G (1976) A mathematical theory of evidence. Princeton University Press, Princeton Siddique M, Tokhi M (2001) Training neural networks: backpropagation vs. genetic algorithms. IJCNN01. International joint conference on neural networks. Proceedings (Cat. No.01CH37222), 2673-2678 Smarandache F, Dezert J (2006) Advances and applications of DSmT for information fusion: collected works. American Research Press, Rehoboth Smets P, Kennes R (1994) The transferable belief model. Artif Intell 66(2):191–234 Vapnik V, Chervonenkis A (1964) A note on one class of perceptrons. Autom Remote Control 25 Vapnik V, Cortes C (1995) Support-vector networks. Mach Learn 20(3):273–297 Vivarelli F, Williams CK (2001) Comparing Bayesian neural network algorithms for classifying segmented outdoor images. Neural Netw 14(4–5):427–437 Xu W, Yu J (2017) A novel approach to information fusion in multi-source datasets: a granular computing viewpoint. Inf Sci 378:410–423 Yager R (1987) On the Dempster-Shafer framework and new combination rules. Inf Sci 41(2):93– 137 Ye F, Chen J, Tian Y (2018) A robust DS combination method based on evidence correction and conflict redistribution. J Sens 2018:1–12 Yong D, Wenkang S, Zhenfu Z, Qi L (2004) Combining belief functions based on distance of evidence. Decis Support Syst 38(3):489–493 Zadeh LA (1979) On the validity of Dempster’s rule of combination of evidence, ERL memo M 79/24. University of California, Berkeley Zadeh LA (1984) Book review: a mathematical theory of evidence. AI Mag 5(3):81–83 Zadeh LA (1986) A simple view of the Dempster-Shafer theory of evidence and its implication for the rule of combination. AI Mag 7(2):85–90 Zanghirati G, Zanni L (2003) A parallel solver for large quadratic programs in training support vector machines. 
Parallel Comput 29(4):535–551 Zhang L, Ding L, Wu X, Skibniewski MJ (2017) An improved Dempster–Shafer approach to construction safety risk perception. Knowl-Based Syst 132:30–46 Zhang Q, Yang LT, Chen Z, Li P (2018a) A survey on deep learning for big data. Inf Fusion 42:146–157


Zhang W, Ji X, Yang Y, Chen J, Gao Z, Qiu X (2018b) Data fusion method based on improved D-S evidence theory. In: 2018 IEEE international conference on big data and smart computing (BigComp) Zhou Q, Zhou H, Zhou Q, Yang F, Luo L, Li T (2015) Structural damage detection based on posteriori probability support vector machine and Dempster–Shafer evidence theory. Appl Soft Comput 36:368–374

Selection of Emergency Assembly Points: A Case Study for the Expected Istanbul Earthquake

Sezer Savaş, Şehnaz Cenani, and Gülen Çağdaş

Abstract Disasters have devastating effects on the socioeconomic status, the built environment, and the infrastructure of cities; they also lead to complex and ambiguous disaster management processes that are hard to manage. Therefore, developing a proper disaster management strategy is crucial. Selecting appropriate emergency assembly points (EAPs) as well as shelter areas and routing inhabitants to the EAPs and shelter areas are essential in disaster management, especially for densely populated cities. These processes need a multidisciplinary approach involving different stakeholders. The aim of this study is to better understand the criteria related to the selection of EAPs and to design a decision support system (DSS) for the policy-makers of the city. As a multi-criteria decision problem, the selection of the most appropriate areas in terms of location, capacity, resources, allocation of victims, and evacuation route principles is formulated in a mixed-method model using a Geographic Information System (GIS) and the Analytic Hierarchy Process (AHP). The Gaziosmanpaşa district within the area of the expected Istanbul earthquake is chosen as a case study to demonstrate the implementation of the proposed model. Results indicated that, compared with the current situation, improvements were achieved in comprehensiveness among the population, the average evacuation distance traveled by the victims, and the reaching times of the affected victims to the emergency assembly points.

S. Savaş
Department of Architecture, Istanbul University, Istanbul, Turkey
e-mail: [email protected]

Ş. Cenani
Department of Architecture, Istanbul Medipol University, Istanbul, Turkey
e-mail: [email protected]

G. Çağdaş
Department of Architecture, Istanbul Technical University, Istanbul, Turkey
e-mail: [email protected]

© Springer Nature Switzerland AG 2021
Y. I. Topcu et al. (eds.), Multiple Criteria Decision Making, Contributions to Management Science, https://doi.org/10.1007/978-3-030-52406-7_2



1 Introduction

The term "disaster" is usually applied to a breakdown in the normal functioning of a community that has a significant adverse impact on people, their works, and their environment, and that exceeds the ability of the affected people to cope using only their own resources (United Nations 1992). This situation may be a result of a natural event or of human activity.

According to the forecasts, it is predicted that by 2050, 2.9 billion more people will live in cities, the total urban population will reach approximately 6 billion people, and 69% of the world population will be living in cities (United Nations 2010). Metropolitan cities account for an important part of this ratio. Even today, a significant proportion (~19%) of the world's population lives in 300 major metropolitan cities, where about half of the global GDP is generated (Istrate and Nadeau 2012). This concentrated urban population also poses new challenges to humanity such as air pollution, congestion, climate change, waste management, human health, economic uncertainties, inequalities, and disaster management (OECD 2012). This situation paves the way for increased use of modern technologies in daily urban life and provides the integration of systems like innovative transportation, infrastructure, logistics, energy, management, economy, and so on, which can collectively be called smart cities. It can briefly be summarized that the concept of a smart city is the use of modern technologies for better living conditions and less environmental impact (IEEE 2014). However, the effects of this intensifying new lifestyle are not only about living standards and environmental impact. The growth of cities also causes a larger population to be exposed to natural and technological disasters (Harrison and Williams 2016).

The incredibly complex and dense structure of cities, resembling a labyrinth of urban systems such as transportation, water supply, sewerage, energy, and housing, makes them among the most vulnerable places to natural disasters. According to a study by the United Nations (2016), almost 890 million people worldwide live with at least one major risk of natural disasters, including floods, droughts, hurricanes, or earthquakes. Figure 1 shows the distribution of reported natural disasters occurring between 1900 and 2011 in the EM-DAT database by years (URL-1). According to this distribution, the situation seems quite worrying. Today, nearly 400 natural disasters occur every year in the world. According to a report published by the OECD (2017), Turkey takes fourth place in terms of the average number of disasters per year (an average of 7.2 disasters per year) and is in seventh place in terms of the financial damage done by disasters (each disaster costs an average of 1.7 billion dollars).

Considering all of these, public transport running on clean energy, solar-powered street lighting, and green buildings with low carbon emissions are important concepts for smart cities; but to be really smart, cities must also consider the impacts of disasters. Smart cities would be stupid without a proper disaster strategy (URL-2). There cannot be a smart city before it is a safe city. When we start talking about the management of cities, disaster risk reduction is one of the key aspects.


Fig. 1 Numbers of reported natural disasters during 1900–2011 from the EM-DAT database of international disasters (URL-1)

A smart city must also be a safe city with effective and appropriate disaster response, disaster detection, disaster prevention, disaster information, and disaster management systems (URL-3).

This chapter mainly focuses on the preparedness phase of disaster management, offering a decision-making tool to support the government in the analysis and selection of emergency assembly points (EAPs). This selection process needs a multidisciplinary approach involving many different knowledge areas, including management, seismology, environment, sociology, planning, architecture, law, transportation, engineering, and so on. The aim of this study is to better understand the criteria related to the selection of emergency assembly points and to design a decision support system (DSS) for the policy-makers of the city. As a multi-criteria decision problem, the selection of the most appropriate areas in terms of location, capacity, resources, allocation of victims, and evacuation principles is formulated in a mixed-method model that uses the Analytic Hierarchy Process (AHP) and a Geographic Information System (GIS). As a case study, the expected Istanbul earthquake is chosen as a what-if scenario of a very disastrous earthquake to demonstrate the implementation of the proposed model. Considering that the difficulty of obtaining data at the whole-city scale and the number of emergency assembly points determined at the Istanbul level (approximately 3000 emergency assembly points exist in Istanbul according to the Disaster and Emergency Management Authority, AFAD) would exceed the scope of this chapter, the case study is restricted to the Gaziosmanpaşa District of


Istanbul Province. At this point, it should be noted that the problems in this example are not specific to the selected area, and it is very likely that similar problems can be found in other districts of Istanbul.

In the following sections, disaster management and the post-disaster resettlement process will be briefly examined first. Then, the term emergency assembly point will be explained and the criteria for selecting these areas will be investigated from the relevant literature. Afterward, a model for the selection of emergency assembly points will be proposed, and analyses related to the sample study area will be conducted and conformity checks of the existing areas will be performed in order to test the functioning of the model.

To sum up the model, selection principles are examined and analyzed from the relevant literature and important criteria are determined. The AHP method is used to determine criterion weights and prioritize alternatives. As Nappi and Souza (2015) indicate, the decision-making process in risk situations may be assessed by a number of models such as simulation models, heuristic models, or multi-criteria models. However, multi-criteria models are the best choice to meet disaster relief goals because these models address decision problems by covering various issues (Nappi and Souza 2015). The AHP method, introduced by Saaty (1988), is one of the most widely used methods in multi-criteria decision-making. AHP is an effective tool for dealing with complex decision-making and may aid the decision-maker in setting priorities and making the best decision (URL-4). By reducing complex decisions to a series of pairwise comparisons, and then producing the results, the AHP helps to capture both the subjective and objective aspects of a decision. Furthermore, the AHP incorporates a useful technique for checking the consistency of the decision-maker's evaluations; in this way it reduces bias in the decision-making process (URL-4).

As a multi-criteria decision problem, the selection of the most suitable areas in terms of location, capacity, resources, victim allocation, and evacuation path principles is determined by a hybrid model using GIS and AHP methods. Subsequently, the selected areas are integrated in such a way as to work together with the existing areas defined by the local governments. As a result, improvements were achieved in comprehensiveness among the population, the average evacuation distance traveled by the victims, and the reaching times of the affected victims to the emergency assembly points, when compared to the current situation.

The remainder of this chapter is organized as follows. The next section explains the fundamentals of disaster management. The details of post-disaster resettlement are explained in Sect. 3. Section 4 describes the selection of emergency assembly points. The methodology and the case study are discussed in Sects. 5 and 6, respectively. Finally, the last section presents a summary of recommendations for future work, and also discusses our main findings and draws some conclusions.
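To illustrate the pairwise comparison and consistency check mentioned above, a minimal numpy sketch of the standard AHP weighting is given below; the three criteria and the judgment matrix are purely hypothetical and do not reflect the values used later in the case study.

```python
import numpy as np

# Saaty's random consistency index for matrix sizes 3..9
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(A):
    """Priority weights (principal eigenvector) and consistency ratio of a pairwise matrix."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    idx = int(np.argmax(eigvals.real))
    w = np.abs(eigvecs[:, idx].real)
    w = w / w.sum()                              # priority vector
    ci = (eigvals[idx].real - n) / (n - 1)       # consistency index
    return w, ci / RI[n]                         # weights and consistency ratio

# Hypothetical pairwise comparisons of three EAP criteria (e.g., safety, accessibility, capacity)
A = [[1, 3, 5],
     [1/3, 1, 2],
     [1/5, 1/2, 1]]
weights, cr = ahp_weights(A)
print(np.round(weights, 3), round(cr, 3))   # a CR below 0.10 is conventionally acceptable
```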


2 Disaster Management

Disasters have devastating effects on the socioeconomic structure, built environment, and infrastructure of cities; they also create complex and ambiguous management processes that are hard to handle. Therefore, developing a proper disaster management strategy is a crucial need. Disasters also require extensive logistical and organizational efforts from the affected country. Using traditional methods while developing this strategy may not be efficient; therefore, it is crucial that the implementers of disaster management adopt contemporary, rapid, precise, and effective methods in the process of successful humanitarian aid and disaster relief.

In general, there are three basic phases of disaster management: (1) preparedness, which consists of the activities of the pre-disaster period; (2) response, which consists of the activities of the disaster period; and (3) recovery, which consists of the activities of the post-disaster period. Some sources divide the preparedness phase into two and add a fourth phase named mitigation to this disaster management life cycle, while others evaluate mitigation as a subprocess of preparedness. Also, it should be noted that mitigation activities may not be applicable to all types of disaster scenarios. The preparedness phase needs proactive approaches, while the other phases need reactive approaches. In the pre-disaster period, preparedness (and, in some cases, mitigation) activities are held within a strategic planning approach. In the disaster period, response activities take place, and so agile principles should be taken into account in this period, as short-term project management, flexibility, and nimbleness are the most important subjects. In the post-disaster period, recovery and (if necessary) resettlement activities take their place (Chandraprakaikul 2010).

One of the hardest things in disaster management is the trade-off between cost and responsiveness; it is also a hard and fragile process that needs quick actions. A disaster places exceptional burdens on the supply chain, logistics capacity, and organizational skills of the affected country. Actions such as mobilizing relief personnel, equipment, and supplies, evacuating the injured people, and performing temporary gathering and sheltering activities for those directly affected by the disaster require optimized logistics, procurement, and management systems, which are the different sub-elements of a disaster management system. Although a disaster cannot be avoided completely, most of the damage can be reduced by taking the necessary steps to create an effective structure. In the case of a newly planned city, the most innovative techniques in disaster management can be integrated from the planning stage; therefore, an integrated Smart Disaster Management system can be implemented right from the concept stage. However, when disaster management activities need to be integrated into an existing city, full integration may not be possible, and unfortunately, the preparedness activities will have limited coverage. Therefore, in such cases, response activities will have a high priority (Rao et al. 2017). Therefore, strategic planning should be done long before the disaster occurs in order to receive effective, efficient, and timely responses


during and immediately after disasters. Selecting appropriate emergency assembly points and shelter areas, as well as routing for the safest evacuation, are the keys to this planning process, especially for densely populated cities. The Disaster and Emergency Management Authority (AFAD), formed in 2009 as an agency under the Ministry of Interior of the Republic of Turkey, has all the authority related to the planning and implementation of disaster management in Turkey. AFAD also has subunits in different cities.

3 Post-disaster Resettlement

The main purpose of emergency response efforts is to provide shelter and assistance to the victims of the disaster as soon as possible (Rawls and Turnquist 2010). Then, the temporary sheltering process must proceed rapidly in order to ensure normality by switching to the recovery phase and to complete the reconstruction after the disaster. Resettlement activities play an important role in the economic and psychological recovery of a disaster-affected society (El-Anwar and Chen 2012). These actions not only give a place to the victims but also meet the vital, functional, and social needs of these people (Hu et al. 2014).

Quarantelli (1995) divides post-disaster resettlement (which is a process spreading across the response and recovery phases of disaster management) into four phases: (1) instant settlement (for the first few hours after the disaster occurs); (2) emergency settlement (for 1 to 2 days after the disaster occurs); (3) temporary resettlement (for a few weeks after the disaster occurs); and (4) permanent resettlement (for several years after the disaster). Hu et al. (2014), in turn, formulate the process by adding a fifth step, transitional resettlement (which fills the gap between temporary resettlement, covering several months after the disaster, and permanent resettlement, covering several years).

To sum up the post-disaster resettlement process according to this formulation, right after the disaster, emergency rescue resources are appropriately distributed and predetermined temporary sheltering areas are immediately established. Shortly after the arrival of the rescue resources, the victims are gathered in the emergency assembly points, which are safe areas very close to the disaster areas, where very short-term sheltering needs are met. However, considering the factors related to water, sanitation, electricity, and other management issues, most homeless and injured victims need to complete their gradual transition to temporary sheltering areas within a few weeks after the disaster. The sheltering areas (or resettlement areas, in other words) at this stage consist of tents or awnings that cannot be used for a long time, because they have social, safety, and comfort problems and are relatively close to the disaster areas. For this reason, disaster victims should be placed in transitional resettlement areas (like container cities), where they can live for several years (until permanent resettlement sites are erected), as soon as possible (Hu et al. 2014). Finally, disaster victims are transferred to newly reconstructed permanent resettlement areas and normality is rapidly ensured. It is important to note again


that in some sources the transitional resettlement step is not treated as a separate resettlement process but is considered part of the temporary resettlement step. When the report published by the Japan International Cooperation Agency (JICA) and the Istanbul Metropolitan Municipality (IMM) in 2002 is examined and cross-checked with the spatial definitions made by AFAD, post-disaster resettlement is categorized as (1) emergency assembly, (2) temporary sheltering, and (3) post-disaster reconstruction (JICA 2002). Compared with the aforementioned literature, emergency assembly covers the instant and part of the emergency settlement processes, while temporary sheltering covers parts of the emergency, temporary, and transitional settlement processes. This three-step classification, rather than the five-step classification, is used in this study to keep up with the current practices performed in Turkey by AFAD. In summary, temporary sheltering areas are the preplanned basic areas that meet the temporary accommodation needs of the victims and provide the necessary humanitarian standards for a comfortable life in the period following the disaster, while emergency assembly points are disaster-free safe areas that victims must reach quickly after the disaster (Çınar et al. 2018). In other words, EAPs are areas that will safely and satisfactorily accommodate disaster victims for hours at short notice (Burtles 2013). As the major purpose of EAPs is to ensure a safe exit and provide an immediate secure location, panic and chaos can spread among the affected victims if these areas are not properly planned. Such a panic atmosphere can set back emergency response operations and increase the potential damage of the disaster arising from unpredicted threats (URL-5). In EAPs, disaster victims are informed, coordinated, and brought together with emergency relief/assistance teams and, if necessary, the short-term sheltering needs of the victims are met and the victims are directed to the temporary sheltering areas in an organized way (Çınar et al. 2018; Amideo et al. 2019; Kar and Hodgson 2008). It is therefore crucial to determine the needs, requirements, attributes, and expectations of victims at the individual and group levels in order to identify the correct EAPs (Burtles 2013). Some prominent characteristics of EAPs are distance/accessibility, safety, and capacity/amount of space (URL-5). To be clear, these points should not be too close to the disaster-affected areas, yet not so far away that they cannot be reached on foot. They should be easily and safely accessible for both victims and emergency personnel. EAPs must be open areas located at a safe distance from potential threats such as existing buildings, power lines, gas lines, vehicles, high-traffic roads, bridges, etc. that could pose further risks to victims. They need to be large enough for the gathering and short-term accommodation of nearby victims as well as for serving them, since emergency personnel will need space for basic first aid treatment and emergency equipment (URL-5, URL-6, and URL-7). Apart from these, ownership and continuity of escape routing are important for EAPs: victims should have freedom of access at any time without notice if the site is public property, or at short notice if it is private property; also, continuity of escape routing to


emergency assembly points should be ensured by considering the roads, passages, and bridges that are at risk of being damaged or closed (Burtles 2013).

4 Selection of Emergency Assembly Points The term emergency assembly point and the selection criteria for these areas are not covered in the disaster management literature as extensively as temporary shelter areas. Some notable studies on the selection criteria of emergency assembly points have been found (Çınar et al. 2018; Burtles 2013; Aksoy et al. 2009; JICA 2002; Tarabanis and Tsionas 1999). When these studies are examined, the prominent criteria for the selection of these points can be compiled, in no particular order, as follows:
1. Capacity: The JICA (2002) report proposes that these places should exist in each neighborhood unit with a gross minimum of 1.5 m2 per person. In the study by Tarabanis and Tsionas (1999), a net usage area of at least 2 m2 per person is proposed. In TAMP-İzmir, the area per capita is set above international standards, at 4 m2 per person.
2. Position/Accessibility: The walking distance from building islands to emergency assembly points should be at most 500 m/15 min, which is the maximum acceptable walking distance within easy reach of each individual.
3. Size: The area should be larger than 500 m2.
4. Connection: The connection of assembly points with main arteries should be established by considering the roads that are at risk of closing, and their continuity with other meeting areas should be ensured. It should also be ensured that road, pedestrian, and disabled access to these areas is possible.
5. Function: Existing active green areas (playgrounds, sports grounds, pocket parks, neighborhood parks, small parks, and district parks), passive green areas (such as carpet pitches), seismically sufficient building gardens (school, mosque, and hospital gardens), empty spaces, and open car parks can be recommended as assembly points.
6. Infrastructure: Priority is given to areas with access to infrastructure systems that provide basic vital needs such as electricity, water, and WC.
7. Support Units: Priority is given to areas that have social reinforcement facilities, such as schools, hospitals, family health centers, medical centers, mosques, active/passive green spaces, and sports facilities belonging to education and health services, within a 500-m walking distance of the emergency assembly point (i.e., within the priority service radius).
8. Ownership: Public lands should be preferred first. Empty areas and private open car parks can be chosen by considering the connection, continuity, accessibility, and size of the area.
Although these criteria are used to assess each EAP individually, the EAPs should also be evaluated regionally for each building island,


neighborhood, district, and city level. Without exception, each of these levels should fulfill the necessary criteria both individually and cumulatively (Çınar et al. 2018).
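To make the screening logic concrete, the following is a minimal Python sketch of how an individual suitability check over the eight criteria above could be encoded, in the spirit of the decision table used later in the methodology. The field names, the thresholds taken from the criteria above, and the specific combinations mapped to R1/R2/R3 are illustrative assumptions, not the authors' exact decision rules.

def screen_eap(eap: dict) -> str:
    """Return 'R1' (not suitable), 'R2' (conditionally suitable) or 'R3' (suitable)."""
    hard_fail = (
        eap["capacity_m2_per_person"] < 1.5      # criterion 1: gross area per person
        or eap["walking_distance_m"] > 500       # criterion 2: accessibility on foot
        or eap["area_m2"] <= 500                 # criterion 3: minimum size
        or not eap["road_connection"]            # criterion 4: connection to main arteries
        or not eap["suitable_function"]          # criterion 5: park, open car park, etc.
    )
    if hard_fail:
        return "R1"
    conditional = (
        not eap["infrastructure"]                # criterion 6: no electricity/water/WC on site
        or not eap["support_units_500m"]         # criterion 7: no school/hospital/mosque nearby
        or eap["ownership"] == "private"         # criterion 8: private land needs an agreement
    )
    return "R2" if conditional else "R3"

candidate = {
    "capacity_m2_per_person": 2.1, "walking_distance_m": 350, "area_m2": 1200,
    "road_connection": True, "suitable_function": True, "infrastructure": True,
    "support_units_500m": False, "ownership": "public",
}
print(screen_eap(candidate))   # -> 'R2' (conditionally suitable in this illustrative sketch)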

5 Methodology The aim of this study is to better understand the criteria related to the selection of emergency assembly points and to design a decision support system that helps policy-makers make effective decisions in disaster planning. As mentioned earlier, AFAD holds all planning, management, and implementation authority related to disaster management in Turkey. This planning authority includes the identification and selection of emergency assembly points. Information about the emergency assembly points designated by AFAD is open to the public and relatively easy to access through most municipalities and the e-government system. The proposed model provides a systematic process for selecting emergency assembly points. It is intended to help question the position and characteristics of the current emergency assembly areas decided by AFAD and, if necessary, to select new assembly areas and integrate them with the current ones in order to resolve any insufficiency. It is a hybrid model that uses the GIS and AHP methods together. The flowchart of the model can be seen in Fig. 2. As shown in Fig. 2, the model has three phases. If the third phase is performed successfully, the second phase is triggered again and repeated. This cycle is repeated either until the model fulfills the criteria and the designated region is labeled as safe, or until it can no longer continue because there is no suitable alternative to choose, in which case the designated region is labeled as unsafe. The first phase is the assessment of the current areas. This is the individual-level assessment of each EAP. At this stage, a suitability analysis of the existing EAPs in the AFAD database is made through the decision table shown in Table 1, based on the criteria of (1) Capacity, (2) Position/Accessibility, (3) Size, (4) Connection, (5) Function, (6) Infrastructure, (7) Support Units, and (8) Ownership. The decision table checks the criteria one by one according to values/attributes drawn from the relevant literature. Decision criteria are in the left column and decision rules are in the other columns. The EAPs that get R1 are not suitable, those that get R2 are conditionally suitable, and those that get R3 are suitable. As a result of this analysis, areas that do not meet the criteria are removed from the list. Those that meet the criteria conditionally or fully continue to the second stage by entering the region-based database. Accordingly, the identification tag of each emergency assembly point is generated as seen in Table 2. The second phase is the assessment of regional sufficiency. This is the cumulative-level assessment of each neighborhood and district in which the EAPs are located. At this stage, it is analyzed whether the determined EAPs are cumulatively sufficient for the designated region. This testing process is based on regional data drawn from GIS and selection criteria from the literature. The test consists of two parts. In the first part, the "capacity" criterion is checked and it is

Fig. 2 Flowchart

Table 1 Decision table for the individual assessment of EAPs: each decision rule combines threshold checks on the eight criteria (capacity above 1.5 m2 per person, area above 500 m2, yes/no checks for connection, function, infrastructure, and nearby support units, and public/private ownership); areas matching rule R1 are not suitable, R2 conditionally suitable, and R3 suitable.

\lambda \cdot A = (\lambda a_1, \lambda a_2, \lambda a_3), \quad \lambda > 0 \text{ and } \lambda \in R \qquad (6)

Distance between two TFNs:

d(A, B) = \sqrt{\tfrac{1}{3}\left[(a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2\right]} \qquad (7)

Another concept utilized in this study is defuzzification. It can be defined simply as the transition from a fuzzy number to a crisp number. To be able to obtain quantifiable results, defuzzification should be conducted where needed. There are many types of defuzzification functions; some examples are center of gravity, first of maximum, adaptive integration, and the quality method. In this study, the graded


mean integration method proposed by Zhao et al. (2019) is used. Its algebraic expression is given in Eq. (8):

M(v_{ij}) = \frac{x_{ij} + 4 y_{ij} + z_{ij}}{6} \qquad (8)
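As an illustration of these fuzzy preliminaries, the following minimal Python sketch implements the TFN scalar multiplication, vertex distance, and graded mean integration of Eqs. (6) to (8). Representing a TFN as a plain (a1, a2, a3) tuple is an implementation choice made here for brevity, not something prescribed by the chapter.

from math import sqrt

def tfn_scale(lam, A):                    # Eq. (6): scalar multiplication, lam > 0
    return tuple(lam * a for a in A)

def tfn_distance(A, B):                   # Eq. (7): vertex distance between two TFNs
    return sqrt(sum((a - b) ** 2 for a, b in zip(A, B)) / 3)

def graded_mean(A):                       # Eq. (8): graded mean integration defuzzification
    a1, a2, a3 = A
    return (a1 + 4 * a2 + a3) / 6

high, very_high = (0.5, 0.7, 0.9), (0.7, 0.9, 1.0)   # the "high"/"very high" terms used later
print(tfn_scale(0.5, high), tfn_distance(high, very_high), graded_mean(high))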

3.3 Hierarchical Fuzzy TOPSIS Hierarchical Fuzzy TOPSIS is a generalized version of TOPSIS in which the method is reorganized to account for the fuzzy mechanism and the hierarchical evaluation structure. This method is used in this study for the evaluation of the proposed hierarchical model. The method consists of six steps. A detailed representation of the methodology is as follows (Kahraman et al. 2007): Consider a problem that consists of m alternatives represented by the index i (i = 1, ..., m), p main criteria represented by the index j (j = 1, ..., p) having a total of n sub-criteria represented by the index s (s = 1, ..., n), and t respondents represented by the index k (k = 1, ..., t) who were questioned via a survey. The hierarchical structure of the problem can be visualized as in Fig. 3.

Step 1: Construct the Decision Matrix (A), where rows represent alternatives and columns represent sub-criteria. Each cell represents a performance/score value.

A = \begin{bmatrix} a_{111} & a_{112} & \cdots \\ a_{211} & a_{212} & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}_{m \times n} \qquad (9)

If there are t respondents, then

a_{ijs} = \frac{\sum_{k=1}^{t} a_{ijsk}}{t} \qquad (10)

where a_{ijsk} is the score of the ith alternative under the jth main criterion and sth sub-criterion evaluated by the kth respondent.

Fig. 3 General representation of the hierarchical structure (the goal at the top, main criteria 1, ..., p below it, and the sub-criteria of each main criterion at the lowest level)


Next, construct the Matrix of Weights (W), where each cell represents the weight of a main criterion with respect to the goal:

W = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_p \end{bmatrix}_{p \times 1} \qquad (11)

If there are t respondents, then

w_j = \frac{\sum_{k=1}^{t} w_{jk}}{t} \qquad (12)

where w_{jk} is the weight assigned to the jth main criterion by the kth respondent. Then, construct the matrix (U), where each cell represents the weight of a sub-criterion with respect to its main criterion. Note that, in this case, the weight of each sub-criterion should be evaluated considering the weight of its parent (main) criterion, i.e., by multiplying these values:

U_{p\cdot} = u_{p\cdot} \times w_p \qquad (13)

The resulting matrix is an n × p block matrix in which the column of each main criterion contains the composite weights of its own sub-criteria and zeros elsewhere:

U = \begin{bmatrix} u_{11} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ u_{1\cdot} & 0 & \cdots & 0 \\ 0 & u_{21} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & u_{2\cdot} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{p1} \\ 0 & 0 & \cdots & u_{p2} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & u_{p\cdot} \end{bmatrix}_{n \times p} \qquad (14)


If there are t respondents, then

u_{ij} = \frac{\sum_{k=1}^{t} u_{ijk}}{t} \qquad (15)

where u_{ijk} is the composite weight of the jth sub-criterion under the ith main criterion assigned by the kth respondent.

Step 2: Obtain the Normalized Decision Matrix (R) using a suitable normalization method; in this case, linear normalization is utilized. If a_{ij} is a fuzzy number, it can be shown in the triangular form a_{ij} = (x_{ij}, y_{ij}, z_{ij}). Let x_j^+ = max x_{ij}, y_j^+ = max y_{ij}, z_j^+ = max z_{ij}, x_j^- = min x_{ij}, y_j^- = min y_{ij}, z_j^- = min z_{ij}.

For benefit criteria: r_{ij} = \left( \dfrac{x_{ij}}{z_j^+}, \; \dfrac{y_{ij}}{y_j^+}, \; \dfrac{z_{ij}}{x_j^+} \right) \qquad (16)

For cost criteria: r_{ij} = \left( \dfrac{x_j^-}{z_{ij}}, \; \dfrac{y_j^-}{y_{ij}}, \; \dfrac{z_j^-}{x_{ij}} \right) \qquad (17)

R = \begin{bmatrix} r_{11} & r_{12} & \cdots \\ r_{21} & r_{22} & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}_{m \times n} \qquad (18)
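A minimal sketch of the Step 2 linear normalization (Eqs. 16 and 17) is given below, assuming that fuzzy scores are (x, y, z) tuples with positive components; the two sample scores are invented purely for illustration.

def normalize_column(column, benefit=True):
    if benefit:
        # Eq. (16): divide by the column maxima, with the components paired in reversed order
        z_plus = max(t[2] for t in column)
        y_plus = max(t[1] for t in column)
        x_plus = max(t[0] for t in column)
        return [(x / z_plus, y / y_plus, z / x_plus) for (x, y, z) in column]
    # Eq. (17): for cost criteria, divide the column minima by the scores
    x_minus = min(t[0] for t in column)
    y_minus = min(t[1] for t in column)
    z_minus = min(t[2] for t in column)
    return [(x_minus / z, y_minus / y, z_minus / x) for (x, y, z) in column]

scores = [(0.3, 0.5, 0.7), (0.5, 0.7, 0.9)]   # two alternatives on one sub-criterion
print(normalize_column(scores, benefit=True))
print(normalize_column(scores, benefit=False))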

Step 3: Obtain the Weighted Normalized Decision Matrix (V) by multiplying each member of the matrix (R) by its corresponding weight. If w_j is a fuzzy number, it can be shown in the triangular form w_j = (\alpha_j, \beta_j, \lambda_j).

For benefit criteria: v_{ij} = \left( \dfrac{x_{ij}}{z_j^+}\,\alpha_j, \; \dfrac{y_{ij}}{y_j^+}\,\beta_j, \; \dfrac{z_{ij}}{x_j^+}\,\lambda_j \right) \qquad (19)

For cost criteria: v_{ij} = \left( \dfrac{x_j^-}{z_{ij}}\,\alpha_j, \; \dfrac{y_j^-}{y_{ij}}\,\beta_j, \; \dfrac{z_j^-}{x_{ij}}\,\lambda_j \right) \qquad (20)

V = \begin{bmatrix} v_{11} & v_{12} & \cdots \\ v_{21} & v_{22} & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}_{m \times n} \qquad (21)


Step 4: Detect the Positive Ideal Solution (v_j^+) and the Negative Ideal Solution (v_j^-) for each column with the formulas given below. For crisp data, they can be detected directly by finding the maximums and minimums, but for fuzzy data an alternative method is needed; in this case, defuzzification is utilized, using the graded mean integration method (Eq. 22):

v_j^+ = \text{the fuzzy number with the largest } M(v_{ij}), \quad v_j^- = \text{the fuzzy number with the smallest } M(v_{ij}), \quad \text{where } M(v_{ij}) = \dfrac{x_{ij} + 4 y_{ij} + z_{ij}}{6} \qquad (22)

Step 5: Calculate the distance of each alternative to the Positive Ideal Solution and to the Negative Ideal Solution. In this case, the distances are calculated using the Euclidean distance. If v_j^+ or v_j^- is a fuzzy number, it can be shown in the triangular form v_j^+ = (x^+, y^+, z^+) and v_j^- = (x^-, y^-, z^-), respectively. The formulas for calculating the Euclidean distances are given below (Karatas et al. 2018):

Distance to the Positive Ideal Solution:
S_i^+ = \sum_{j=1}^{n} \sqrt{\tfrac{1}{3}\left[ (a_{ij} - x_j^+)^2 + (b_{ij} - y_j^+)^2 + (c_{ij} - z_j^+)^2 \right]} \qquad (23)

Distance to the Negative Ideal Solution:
S_i^- = \sum_{j=1}^{n} \sqrt{\tfrac{1}{3}\left[ (a_{ij} - x_j^-)^2 + (b_{ij} - y_j^-)^2 + (c_{ij} - z_j^-)^2 \right]} \qquad (24)

Step 6: Rank the alternatives in decreasing order of C_i and select the alternative with the highest C_i score. C_i is the score in which S_i^+ and S_i^- are combined into a single value using Eq. (25):

C_i = \dfrac{S_i^-}{S_i^- + S_i^+}, \quad i = 1, \ldots, m \qquad (25)
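The following self-contained sketch puts Steps 4 to 6 together: the fuzzy ideal solutions are chosen by their graded mean integration values (Eq. 22), distances are summed with the vertex formula (Eqs. 23 and 24), and the closeness coefficient of Eq. (25) ranks the alternatives. The 2 x 2 matrix at the end is an invented toy example, not data from the case study.

from math import sqrt

def graded_mean(v):                        # Eq. (22)/(8)
    x, y, z = v
    return (x + 4 * y + z) / 6

def tfn_distance(u, v):                    # vertex distance, as in Eq. (7)
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / 3)

def closeness(V):
    # V: weighted normalized matrix, rows = alternatives, cells = (x, y, z) tuples
    cols = list(zip(*V))
    v_plus = [max(c, key=graded_mean) for c in cols]    # positive ideal solution per column
    v_minus = [min(c, key=graded_mean) for c in cols]   # negative ideal solution per column
    result = []
    for row in V:
        s_plus = sum(tfn_distance(v, p) for v, p in zip(row, v_plus))    # Eq. (23)
        s_minus = sum(tfn_distance(v, m) for v, m in zip(row, v_minus))  # Eq. (24)
        result.append(s_minus / (s_minus + s_plus))                      # Eq. (25)
    return result

V = [[(0.2, 0.4, 0.6), (0.5, 0.7, 0.9)],
     [(0.3, 0.5, 0.7), (0.1, 0.3, 0.5)]]
print(closeness(V))   # the alternative with the higher value is preferred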

4 The Proposed Hierarchical Structure for the Evaluation of Kidney Stone Treatment Methods This section provides information about the proposed hierarchical evaluation structure used for the evaluation of kidney stone treatment methods. Firstly, the hierarchical structure is represented, then, the definitions of each criterion are given in the following subsections.


4.1 The Proposed Hierarchical Evaluation Structure In this study, a hierarchical structure consisting of five main criteria and 24 sub-criteria is constructed for the evaluation and comparison of kidney stone treatment alternatives, based on the HTA Core Model® (EUnetHTA 2016). The proposed hierarchical structure is represented in Fig. 4.

Fig. 4 A visual demonstration of the proposed hierarchical evaluation structure (goal: selection of the suitable treatment approach; the five main criteria Technical & Technological, Organizational, Economic, Ethical/Legal and Social, and Clinical, each with its sub-criteria; alternatives: LL and ESWL)


4.2 The Main Criteria Considered in the MCDM-Based HTA Framework The model consists of five main criteria: Technical and Technological; Organizational; Economic; Ethical, Legal, and Social; and Clinical. A detailed explanation of each criterion is given in this section. Technical and Technological Aspects The development of technology diversifies treatment methods, and the use of different technologies also brings different technical characteristics. As a result, a comparison of treatment methods under technical and technological aspects becomes necessary. These aspects include material and device availability, complexity, reliability, sterilization of the devices or materials used, and occupational health and safety, which addresses the working conditions of the medical staff. Occupational health and safety is particularly important for the quality of the service delivered by the medical staff. Organizational Aspects Every treatment method requires an organizational arrangement for each stage of the process. A sound arrangement expedites the treatment process, reduces the burden on stakeholders, and adds value. The organizational aspects include the training requirements for every stage of the treatment, the complexity of the room where the operation will be executed, and the degree of movement of devices, patients, or staff needed to conduct the treatment. Economic Aspects The economic point of view affects all stakeholders in the process. An economic evaluation of a treatment approach reveals various costs, such as the device investment cost, the labor, tool, and device costs for each patient, and the fees that patients have to pay for the treatment. Ethical, Legal, and Social Aspects The legal and ethical aspects of treatment methods are among the most important factors to be addressed, since the quality of human life is directly affected by the stages of a treatment process. In this manner, the existence of rules and regulations that ensure the rights of the patients, as well as the existence of patient approval before a treatment, can be investigated. Social aspects are directly related to the relationship between the patients and the treatment approach; thus, the accessibility of the treatment for every type of patient and the patients' perception of the treatment are significant points. Clinical Aspects Clinical characteristics involve sub-criteria that need to be dealt with from many different aspects. These sub-criteria can be divided into three levels: pre-operative, intra-operative, and post-operative.


• Pre-operation level: Since each treatment has guidelines for implementation, the factors that affect the applicability and requirements of the treatment method should be clearly studied. • Intra-operative level: This level covers all situations that may occur during the operation. • Post-operative level: Includes analysis of all post-operative conditions such as complications that may arise after the operation and the success rate.

4.3 Sub-criteria Considered in the MCDM-Based HTA Framework The hierarchical evaluation model is made up of the main criteria and the relevant sub-criteria, each of which is either a benefit or a cost variable. Benefit criteria require maximization, in contrast to cost criteria, which require minimization. In addition, a criterion can be numerical, expressed in real numbers, or linguistic, in which case fuzzy numbers are needed. Table 1 summarizes the variable type and the definition of each sub-criterion in the proposed framework.

5 Case Study: Application of the Proposed MCDM-Based HTA Framework The case study concerns the selection of the appropriate treatment alternative among the two most commonly used ones in the current healthcare system. In Sect. 5.1, LL and ESWL are briefly explained. Next, the input matrices for the calculations are presented. Then, the values resulting from the application of the Hierarchical Fuzzy TOPSIS methodology are summarized in Sect. 5.2.

5.1 Selected Kidney Stone Treatment Methods for the Case Study (1) Laser Lithotripsy (LL) LL is one of the most popular treatment methods for kidney stones. Kidney stones appear in the urinary tract, which consists of the bladder, kidneys, ureters, and urethra. LL is an effective method for dealing with stones in the urinary tract by utilizing laser technology. The application of LL starts with detecting the location of the stone by passing a ureteroscope, which has illumination, viewing, and working channel ports, through the urinary tract. Then, an optical fiber responsible for the transmission of infrared laser radiation is placed into the working channel port of the

Table 1 Definitions and variable types of study variables
Technical and technological aspects:
Complexity (linguistic): Level of complexity of the device to operate
Reliability (linguistic): Durability of the device against breakdowns
Sterilization (linguistic): Assurance of perfect sterilization of the device for every different patient
Occupational health and safety and risks (linguistic): Ensuring that medical staff is working under suitable conditions
Availability (linguistic): Readiness of the device for any time
Organizational aspects:
Training (linguistic): Degree of training requirements to be able to operate the treatment
Operation room complexity (linguistic): The level of burden that the complexity brings to the organization
Mobility (linguistic): Need of flow of device, patient, or staff to be able to conduct the treatment
Economic aspects:
Investment cost (numerical): Purchasing cost of the device
Patient contribution rate (numerical): The amount to be paid by the patient for the treatment and the device
Unit operation cost (numerical): Sum of labor, tool, and device operation costs for curing a patient
Ethical, legal, and social aspects:
Patient perception (linguistic): How patients perceive the treatment
Equal accessibility (linguistic): Accessibility of treatment for every type of patient
Informed consent (linguistic): Permission a patient gives a doctor to perform a test or procedure
Clinical aspects:
Complications (numerical): Frequency of medical complications occurring after the treatment
Effectiveness (numerical): Number of stone-free operations/number of operations
Minimally invasive (linguistic): Ensuring damage to tissues is as little as possible
Anesthesia requirement (linguistic): Requirement of anesthesia before the treatment
Operation time (numerical): The amount of time (minutes) that the device is used for one patient
Limitations (linguistic): Amount of limitations which prevent the application of the treatment
Hospitalization time (numerical): The time spent by the patient in hospital after the treatment
Size of stones (numerical): Stone-size range for which the treatment method is suitable
Location of stones (linguistic): Location of stones directly affecting the applicability of the treatment
Composition of stones (linguistic): Composition of stones directly affecting the applicability of the treatment


Fig. 5 Visual representation of Laser Lithotripsy procedure (Urologic Surgeons of Washington 2019)

ureteroscope. After this, the stones are fragmented into pieces with the laser. Higher stone-free rates and lower push-back ratios, no need for multiple interventions (in most cases), reliability, effectiveness especially in pediatric patients, and fewer complications observed after the procedure are some of the advantages of LL (Hardy 2018) (Fig. 5). (2) Extracorporeal Shockwave Lithotripsy (ESWL) The other candidate kidney stone treatment alternative is ESWL. It aims at stone fragmentation by converting sound waves into shock waves without damaging muscles, bones, or skin. These shock waves are generated by a lithotripter outside the human body, and the generated energy travels through the body and results in fragmentation. The resulting stone pieces can be completely excreted from the body within a few weeks (Wickham 1985) (Fig. 6). ESWL can be considered the primary treatment approach for kidney stones from the patient's perspective owing to several important properties such as noninvasiveness, low cost, high efficiency of stone disintegration, less exposure to anesthesia, and shorter hospitalization time (Özlük 2012).


Fig. 6 Visual representation of ESWL procedure (Urologist Bhopal 2019)

5.2 Application of the Proposed MCDM-Based HTA Framework In this section, Hierarchical Fuzzy TOPSIS methodology is applied for the evaluation of LL and ESWL. Initially, the determined criteria and their associated sub-criteria were evaluated by seven experts who are urologists from different hospitals. They were questioned via a survey (see Appendix) to obtain their evaluations. When evaluations were completed, it was noticed that five sub-criteria are not suitable for comparison of LL and ESWL. They are Sterilization, Informed Consent, Anesthesia Requirement, Location of Stones, and Composition of Stones. The reason for excluding these data is that the answers for these questions are certain for LL and ESWL and hence are not open to the interpretation of the experts. Therefore, the inclusion of them does not add value to the comparison process. The remaining 19 criteria are taken into consideration in computations. The input parameter, scores matrix (Eq. 9), consists of two different data types which are linguistic and numerical. While linguistic data were derived from the expert evaluations, numerical data were gathered from both literature review and various sources. Linguistic evaluations were either on a five-level scale or two-level scale (yes–no) depending on the characteristic of criterion. The verbal evaluations of the experts in this study should be handled as fuzzy numbers to be able to make computations and obtain a quantifiable result. Verbal expressions are converted into TFNs using a structure consisting of five levels. Table 2 demonstrates corresponding TFNs for each level.

Table 2 Linguistic terms versus corresponding TFN
Very bad (VB) or very low (VL): (0.0, 0.1, 0.3)
Bad (B) or low (L): (0.1, 0.3, 0.5)
Average (A) or moderate (M): (0.3, 0.5, 0.7)
Good (G) or high (H): (0.5, 0.7, 0.9)
Very good (VG) or very high (VH): (0.7, 0.9, 1.0)

Table 3 Fuzzy importance weights of the main criteria (Experts 1-7)
Technical and technological aspects (MC1): H, VH, L, H, H, L, H
Organizational aspects (MC2): L, VH, M, H, M, L, M
Economic aspects (MC3): VH, M, H, VH, M, VL, L
Ethical, legal, and social aspects (MC4): VL, VH, VH, VH, M, VH, H
Clinical aspects (MC5): H, VH, VH, VH, H, VH, H

The experts used the linguistic terms presented in Table 2 to evaluate the importance of the criteria and sub-criteria. The fuzzy importance weights of the main criteria assigned by the seven experts are given in Table 3. The linguistic evaluations are transformed into numeric form as TFNs according to Table 2. The evaluations of the seven experts on each main criterion are combined by the arithmetic mean of Eq. (12), so that the matrix in Eq. (26) is obtained:

W = \begin{bmatrix} (0.414, 0.614, 0.8) \\ (0.329, 0.529, 0.714) \\ (0.371, 0.557, 0.729) \\ (0.514, 0.7, 0.843) \\ (0.614, 0.814, 0.957) \end{bmatrix}_{5 \times 1} \text{ (rows MC1 to MC5)} \qquad (26)

Table 4 shows the weights of each sub-criterion proposed by each expert. These evaluations were used for constructing the matrix in Eq. (14). When the linguistic evaluations for the sub-criteria weights in Table 4 are converted into the corresponding TFNs, the matrix in Eq. (14) can be filled. However, the values cannot be written directly; the weights of the main criteria should also be considered in order to preserve the hierarchical structure. Thus, the matrix in Eq. (27) is filled by considering Eq. (13), and the seven evaluations are averaged according to Eq. (15).

Table 4 Fuzzy importance weights of the sub-criteria (Experts 1-7)
Technical and technological: Complexity: H, H, L, H, M, M, M; Reliability: H, VH, M, H, M, VH, M; Occupational health and safety and risks: M, VH, M, VH, VH, M, M; Availability: M, VH, M, VH, H, VH, M
Organizational: Training: VH, H, H, H, H, H, H; Operation room complexity: M, H, H, H, M, VH, M; Mobility: VL, H, H, H, H, VH, L
Economic: Investment cost: H, L, H, VH, H, L, H; Patient contribution rate: VL, H, H, VH, H, VL, M; Unit operation cost: M, M, H, VH, H, M, M
Ethical, legal, and social: Patient perception: M, VH, H, VH, H, H, H; Equal accessibility: M, VH, VH, VH, M, M, H
Clinical: Complications: H, VH, VH, VH, H, H, H; Effectiveness: H, H, H, VH, VH, VH, H; Minimally invasive: VH, H, VH, H, VH, H, H; Operation time: M, M, VL, H, H, L, M; Limitations: M, H, H, VH, H, H, H; Hospitalization time: L, M, M, H, H, L, L; Size of stones: VH, H, M, H, VH, H, VH

The nonzero (composite) sub-criteria weights of the resulting 22 x 5 block matrix U are:
First column: SC11 (0.357, 0.557, 0.757), SC12 (0.471, 0.671, 0.843), SC13 (0.529, 0.729, 0.886), SC14 (0.471, 0.671, 0.829), SC15 (0.5, 0.7, 0.857)
Second column: SC21 (0.529, 0.729, 0.914), SC22 (0.443, 0.643, 0.829), SC23 (0.4, 0.586, 0.771)
Third column: SC31 (0.414, 0.614, 0.8), SC32 (0.357, 0.529, 0.714), SC33 (0.414, 0.614, 0.8)
Fourth column: SC41 (0.529, 0.729, 0.9), SC42 (0.5, 0.7, 0.857)
Fifth column: SC51 (0.586, 0.786, 0.943), SC52 (0.586, 0.786, 0.943), SC53 (0.586, 0.786, 0.943), SC54 (0.471, 0.671, 0.871), SC55 (0.286, 0.471, 0.671), SC56 (0.529, 0.729, 0.9), SC57 (0.271, 0.471, 0.671), SC58 (0.557, 0.757, 0.914), SC59 (0.557, 0.757, 0.914); all other entries are 0. (27)
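As a small worked check of the weight aggregation, the sketch below converts the Table 3 ratings for MC1 into TFNs using Table 2 and averages them as in Eq. (12); under these inputs it reproduces the first row of W in Eq. (26). The function and variable names are illustrative only.

TERMS = {"VL": (0.0, 0.1, 0.3), "L": (0.1, 0.3, 0.5), "M": (0.3, 0.5, 0.7),
         "H": (0.5, 0.7, 0.9), "VH": (0.7, 0.9, 1.0)}   # Table 2

def aggregate(ratings):
    # Eq. (12)/(15): arithmetic mean of the experts' TFNs, component by component
    tfns = [TERMS[r] for r in ratings]
    t = len(tfns)
    return tuple(round(sum(v[k] for v in tfns) / t, 3) for k in range(3))

mc1 = ["H", "VH", "L", "H", "H", "L", "H"]   # Table 3, row MC1
print(aggregate(mc1))   # -> (0.414, 0.614, 0.8), the first row of W in Eq. (26)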

Scores of the linguistic sub-criteria with respect to the alternatives were evaluated by each expert and recorded in Table 5 (for LL) and Table 6 (for ESWL). Data for investment cost, patient contribution rate, and unit operation cost were obtained from the Turkish Ministry of Health and the urology department of a private hospital. The rest of the numerical data were obtained from the literature. The data in Tables 5 and 6 can be transferred into the decision matrix as in Table 7. Equation (26) is taken as the main criteria weights, Eq. (27) as the sub-criteria weights, and Table 7 as the decision matrix containing the scores of the alternatives. These values are taken as input for the Hierarchical Fuzzy TOPSIS method presented in Sect. 3.3. The resulting closeness coefficient values for both alternatives are given in Table 8. Table 8 shows that LL is the preferred treatment approach, with a Ci value of 0.577. Through the methodology, the closeness coefficients for LL and ESWL are calculated as 0.577 and 0.372, respectively. As a result, LL is preferred over ESWL, since the methodology selects the alternative with the higher Ci value, in other words, the one closest to the positive ideal solution and farthest from the negative ideal solution. The results are consistent with the studies in the literature, as LL is found to be advantageous overall. This is mostly owing to the good performance of LL under the criteria with the highest weights. The results are meaningful within the scope of the Turkish healthcare system. Criteria weights would probably differ between regions, and the results would be directly affected. However, the results are expected to bring insights to all healthcare providers working in the field, and also to future case studies on kidney stone treatment methods.

Table 5 Score matrix for LL (Experts 1-7)
Complexity (SC11): VH, H, VH, H, H, H, H
Reliability (SC12): M, M, H, L, H, L, M
Occupational health and safety and risks (SC13): VB, A, A, A, B, A, A
Availability (SC14): M, M, M, H, H, H, M
Training (SC21): VH, H, VH, M, H, M, H
Operation room complexity (SC22): M, H, VH, H, H, M, H
Mobility (SC23): H, M, M, H, M, M, H
Investment cost (SC31): 16,718 Turkish Lira (TRY)
Patient contribution rate (SC32): 8500 TRY
Unit operation cost (SC33): 12,000 TRY
Patient perception (SC41): G, VG, G, VG, G, G, A
Equal accessibility (SC42): A, A, A, A, G, G, A
Complications (SC51): 9-25% (European Association of Urology 2018)
Effectiveness (SC52): 80% (Özlük 2012)
Minimally invasive (SC53): Yes (all experts)
Operation time (SC54): 28.2 min (Özlük 2012)
Limitations (SC55): M, M, VL, L, L, L, M
Hospitalization time (SC56): 12.4 h (Özlük 2012)
Size of stones (SC57): 5-30 mm (Özlük 2012)


Table 6 Score matrix for ESWL (ratings of the seven experts for each sub-criterion)
Complexity (SC11): M, L, H, H, L, H, L
Reliability (SC12): VH, H, VH, H, M, H, M
Occupational health and safety and risks (SC13): G, G, G, A, A, A, G
Availability (SC14): H, H, VH, H, H, H, H
Training (SC21): M, M, H, M, H, M, M
Operation room complexity (SC22): M, VL, H, M, H, M, L
Mobility (SC23): VL, M, VL, L, L, M, VL
Investment cost (SC31): 12,776 TRY
Patient contribution rate (SC32): 1135 TRY
Unit operation cost (SC33): 4830 TRY
Patient perception (SC41): G, G, G, A, G, G, G
Equal accessibility (SC42): G, VG, G, G, G, A, A
Complications (SC51): 7-22% (European Association of Urology 2018)
Effectiveness (SC52): 68% (Özlük 2012)
Minimally invasive (SC53): No, No, No, Yes, Yes, Yes, Yes
Operation time (SC54): 62.8 min (Özlük 2012)
Limitations (SC55): M, VH, M, VH, H, M, H
Hospitalization time (SC56): 1 h (Özlük 2012)
Size of stones (SC57): 5-16 mm (Özlük 2012)

Table 7 Decision matrix (aggregated TFN scores of the alternatives)
SC11: LL (0.557, 0.757, 0.929); ESWL (0.3, 0.5, 0.7)
SC12: LL (0.3, 0.5, 0.7); ESWL (0.5, 0.7, 0.871)
SC13: LL (0.229, 0.414, 0.614); ESWL (0.414, 0.614, 0.814)
SC14: LL (0.386, 0.586, 0.786); ESWL (0.529, 0.729, 0.914)
SC21: LL (0.5, 0.7, 0.871); ESWL (0.357, 0.557, 0.757)
SC22: LL (0.471, 0.671, 0.857); ESWL (0.286, 0.471, 0.671)
SC23: LL (0.386, 0.586, 0.786); ESWL (0.157, 0.329, 0.529)
SC31: LL (16718, 16718, 16718); ESWL (12776, 12776, 12776)
SC32: LL (8500, 8500, 8500); ESWL (1135, 1135, 1135)
SC33: LL (12000, 12000, 12000); ESWL (4830, 4830, 4830)
SC41: LL (0.529, 0.729, 0.9); ESWL (0.471, 0.671, 0.871)
SC42: LL (0.357, 0.557, 0.757); ESWL (0.471, 0.671, 0.857)
SC51: LL (9, 17, 25); ESWL (7.2, 14.51, 21.8)
SC52: LL (80, 80, 80); ESWL (68, 68, 68)
SC53: LL (1, 1, 1); ESWL (0.571, 0.571, 0.571)
SC54: LL (28.2, 28.2, 28.2); ESWL (62.8, 62.8, 62.8)
SC55: LL (0.171, 0.357, 0.557); ESWL (0.471, 0.671, 0.843)
SC56: LL (12.4, 12.4, 12.4); ESWL (1, 1, 1)
SC57: LL (5, 17, 30); ESWL (5, 10.5, 16)

Table 8 Closeness coefficient values of the alternatives
LL: Ci = 0.577
ESWL: Ci = 0.372


In addition, the sensitivity analysis method proposed by Ayan and Perçin (2012) was applied to check the robustness of the decision results and to identify the most sensitive main criteria. In this context, ten different cases were created by changing the weights of the main criteria. In the first five cases, the weight "very high" was assigned to one criterion while the other weights were kept at "very low." In the last five cases, conversely, one criterion was weighted "very low" and the other criteria "very high." For each case, the resulting closeness coefficients of the alternatives were calculated and are presented in Table 9, and the results of the sensitivity analysis are shown in Fig. 7. According to the sensitivity analysis results, the most sensitive criteria are the organizational aspects (MC2) and the clinical aspects (MC5), since the results are dramatically affected when the weights of these criteria are changed. The weight of the organizational aspects significantly affects the performance of ESWL, and the clinical aspects directly affect the performance of LL. In addition, LL is the preferable alternative in 8 of the 10 cases, which means that LL performs better than ESWL overall.
Table 9 Closeness coefficient values of decision alternatives for different criteria weights

Case 1: MC1 = (0.7, 0.9, 1.0), MC2-MC5 = (0.0, 0.1, 0.3); LL 0.700, ESWL 0.660
Case 2: MC2 = (0.7, 0.9, 1.0), others = (0.0, 0.1, 0.3); LL 0.661, ESWL 0.667
Case 3: MC3 = (0.7, 0.9, 1.0), others = (0.0, 0.1, 0.3); LL 0.708, ESWL 0.663
Case 4: MC4 = (0.7, 0.9, 1.0), others = (0.0, 0.1, 0.3); LL 0.779, ESWL 0.655
Case 5: MC5 = (0.7, 0.9, 1.0), others = (0.0, 0.1, 0.3); LL 0.724, ESWL 0.362
Case 6: MC1 = (0.0, 0.1, 0.3), others = (0.7, 0.9, 1.0); LL 0.557, ESWL 0.357
Case 7: MC2 = (0.0, 0.1, 0.3), others = (0.7, 0.9, 1.0); LL 0.584, ESWL 0.364
Case 8: MC3 = (0.0, 0.1, 0.3), others = (0.7, 0.9, 1.0); LL 0.552, ESWL 0.360
Case 9: MC4 = (0.0, 0.1, 0.3), others = (0.7, 0.9, 1.0); LL 0.515, ESWL 0.362
Case 10: MC5 = (0.0, 0.1, 0.3), others = (0.7, 0.9, 1.0); LL 0.543, ESWL 0.655

Fig. 7 Sensitivity analysis results (closeness coefficients of LL and ESWL for Cases 1-10)
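The ten sensitivity cases described above can be generated mechanically, as in the following sketch: each case sets one main criterion to the "very high" TFN and the others to "very low" (Cases 1-5), or the reverse (Cases 6-10). Each resulting weight vector would replace W of Eq. (26) before re-running the TOPSIS steps; the function name is illustrative.

VH, VL = (0.7, 0.9, 1.0), (0.0, 0.1, 0.3)   # Table 2 terms used for re-weighting

def sensitivity_cases(n_criteria=5):
    cases = []
    for i in range(n_criteria):                       # Cases 1-5: one criterion very high
        cases.append([VH if j == i else VL for j in range(n_criteria)])
    for i in range(n_criteria):                       # Cases 6-10: one criterion very low
        cases.append([VL if j == i else VH for j in range(n_criteria)])
    return cases

for k, weights in enumerate(sensitivity_cases(), start=1):
    print(k, weights)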


6 Conclusion Developments in the healthcare field around the world have increased the importance of the decisions made in this field. In a sustainable and well-functioning health system, decision-making processes are crucial for using resources optimally. At this point, the implementation and importance of MCDM methods, which provide reliable and quantitative outputs, have increased, especially in the last decades. In this study, a hierarchical evaluation model for comparing kidney stone treatment alternatives is proposed. The proposed structure is based on the HTA Core Model®. The hierarchical structure, together with the large number of criteria, necessitates a powerful and effective method. Hence, Hierarchical Fuzzy TOPSIS, an MCDM method, is selected, since it is one of the most suitable methods for the complex structure of the proposed model and works in a fuzzy environment. The need for fuzzy logic in the study arises not only from the use of linguistic variables but also from the need for a methodology that can deal with the computational challenges of the proposed model and with the complexity of the data structure inherent in HTA. In the proposed model, the criteria require either verbal or numerical evaluation. Criteria that can be expressed as numerical values can be obtained from the literature or from the field, whereas linguistic criteria require consultation with experts in the field. To handle verbal expressions in a quantitative manner, the experts' statements are converted into linguistic variables, and fuzzy theory is utilized to accomplish this. In this study, the verbal expert evaluations were collected through a survey (Appendix) from seven urologists working at public and private hospitals located in Turkey. Each question of the survey uses either a five-level or a two-level scale, depending on the characteristics of the criterion. Moreover, numerical data that could not be found in the literature were obtained from one of the largest hospital complexes in Istanbul and from the Turkish Ministry of Health. Since the study contains medical terms and requires expert knowledge, it became a collaborative work between researchers and experts, which improved its reliability. The methodology selects the kidney stone treatment alternative with the higher closeness coefficient (Ci) value; in other words, the best alternative is the one closest to the positive ideal solution and farthest from the negative ideal solution. Through the methodology, the closeness coefficients for LL and ESWL are calculated as 0.577 and 0.372, respectively. Since LL has a higher Ci value than ESWL, LL is preferred over ESWL and is selected as the most suitable alternative for kidney stone treatment. The superiority of LL is mostly owing to its good performance in the highly weighted clinical aspects. The results should be interpreted within the scope of the Turkish healthcare system. As the weights of the criteria may vary from region to region, the results will also be directly affected; for example, cost could be the most important criterion in a less developed country. Nevertheless, the results are expected to bring insights to all experts and healthcare


providers in the field, as well as to be a source for future case studies on kidney stone treatment methods. To the best of our knowledge, there is no study in the published literature concerning an HTA of LL and ESWL yet. The outcomes of this study are HTAs of LL and ESWL individually and also a quantitative comparison of these alternatives. The most important outcome is the proposed MCDM-based hierarchical evaluation structure for kidney stone treatment alternatives. The results of this study can yield insights for all stakeholders in the healthcare field. In future studies, the proposed model can be used to compare other kidney stone treatment alternatives. Moreover, the model can be enhanced by adding further aspects and criteria or by applying all nine domains of the HTA Core Model®.

Appendix: A Brief Version of the Survey Used to Collect Verbal Evaluations
1. An Example Question for Linguistic Evaluation of Alternatives with Respect to Sub-criteria
Table 10 The survey for verbal evaluation of the degree of training required to be able to carry out the treatment
Training, Laser lithotripsy: Very low / Low / Moderate / High / Very high
Training, ESWL: Very low / Low / Moderate / High / Very high

2. The Question Used to Determine Weights of the Main Criteria
Table 11 The survey used to determine the importance levels of the main criteria when treatment alternatives are considered
Each main criterion is rated on the scale Very low / Low / Moderate / High / Very high:
Technical and technological aspects; Organizational aspects; Economic aspects; Ethical, legal, and social aspects; Clinical aspects


3. An Example Question Used to Determine Weights of the Sub-criteria
Table 12 The survey for determining importance levels of each aspect from the technical and technological point of view
Each sub-criterion is rated on the scale Very low / Low / Moderate / High / Very high:
Complexity; Reliability; Sterilization; Occupational health and safety and risks; Availability

References Adunlin G, Diaby V, Xiao H (2015) Application of multicriteria decision analysis in health care: a systematic review and bibliometric analysis. Health Expect 18:1894–1905. https://doi.org/ 10.1111/hex.12287 A˘gaç G, Baki B (2016) Sa˘glık alanında çok kriterli karar verme teknikleri kullanımı: literatür incelemesi. Hacettepe Sa˘glık ˙Idaresi Dergisi 19(3):343–363 Alada˘g Z, Avcı S, Çelik B et al (2017) Özel hastane seçim kriterlerinin analitik hiyerar¸si prosesi ile de˘gerlendirilmesi ve kocaeli ili uygulaması. In: 5th international symposium on innovative Technologies in Engineering and Science, Baku, Azerbaijan, 29–30 Sep 2017 Andersohn F, Bornemann R, Damm O et al (2014) Vaccination of children with a live-atteunated, intranasal influenza vaccine-analysis and evaluation through a health technology assessment. GMS Health Technol Assess 10:Doc03. https://doi.org/10.3205/hta000119 Angelis A, Kanavos P (2014) Applying multiple criteria decision analysis in the context of health technology assessment: an empirical case study. Value Health 17:A552. https://doi.org/ 10.1016/j.jval.2014.08.1804 Angelis A, Kanavos P (2017) Multiple criteria decision analysis (MCDA) for evaluating new medicines in health technology assessment and beyond: the advance value framework. Soc Sci Med 188:137–156. https://doi.org/10.1016/j.socscimed.2017.06.024 Ayan TY, Perçin S (2012) AR-GE projelerinin seçiminde grup kararına dayalı bulanık karar verme yakla¸sımı. Atatürk Üniversitesi ˙Iktisadi ve ˙Idari Bilimler Dergisi 26(2):237–255 Banta D (2003) The development of health technology assessment. Health Policy 63(2):121–132 Belton V, Stewart TJ (2002) Multiple criteria decision analysis: an integrated approach. Kluwer Academic, London Bilekova BK, Gavurova B, Rogalewicz V (2018) Application of the HTA Core Model for complex evaluation of the effectiveness and quality of Radium-223 treatment in patients with metastatic castration resistant prostate cancer. Health Econ Rev 8:27. https://doi.org/10.1186/s13561-0180211-9 Bridges JFP, Jones C (2007) Patient-based health technology assessment: a vision of the future. Int J Technol Assess Health Care 23:30–35. https://doi.org/10.1017/s0266462307051549 Broekhuizen H, Groothuis-Oudshoorn CGM, van Til JA et al (2015) A review and classification of approaches for dealing with uncertainty in multi-criteria decision analysis for healthcare decisions. Pharmacoconomics 33:445–455. https://doi.org/10.1007/s40273-014-0251-x Büyüközkan G, Çiftçi G (2012) A combined fuzzy AHP and fuzzy TOPSIS based strategic analysis of electronic service quality in healthcare industry. Expert Syst Appl 39:2341–2354. https:// doi.org/10.1016/j.eswa.2011.08.061


Chan FTS, Kumar N, Tiwari M et al (2008) Global supplier selection: a fuzzy-AHP approach. Int J Prod Res 46:3825–3857. https://doi.org/10.1080/00207540600787200 Diaby V, Goeree R (2014) How to use multi-criteria decision analysis methods for reimbursement decision-making in healthcare: a step-by-step guide. Expert Rev Pharm Out 14:81–99. https:// doi.org/10.1586/14737167.2014.859525 Drake JI, de Hart JCT, Monleón C et al (2017) Utilization of multiple-criteria decision analysis (MCDA) to support healthcare decision-making. J Mark Access Health Policy 5:1360545. https://doi.org/10.1080/20016689.2017.1360545 Ettinger S, Stanak M, Szyma´nski P et al (2017) Wearable cardioverter defibrillators for the prevention of sudden cardiac arrest: a health technology assessment and patient focus group study. Med Devices (Auckl) 10:257–271. https://doi.org/10.2147/MDER.S144048 EUnetHTA (2016) Joint Action 2, Work Package 8. HTA Core Model ® version 3.0. https:/ /www.eunethta.eu/wp-content/uploads/2018/03/HTACoreModel3.0-1.pdf. Accessed 07 Sep 2019 European Association of Urology (EAU) (2018) EAU guidelines on urolithiasis. https:// uroweb.org/guidelines/. Accessed 20 Nov 2018 Frosini F, Miniati R, Grillone S, Dori F, Gentili GB, Belardinelli A (2016) Integrated HTAFMEA/FMECA methodology for the evaluation of robotic system in urology and general surgery. Technol Health Care 24(6):873–887 Garrido MV, Kristensen FB, Nielsen CP et al (2008) Health technology assessment and health policy-making in Europe: current status, challenges and potential. London Giansanti D, Pochini M, Giovagnoli MR (2014) Integration of tablet technologies in the elaboratory of cytology: a health technology assessment. Telemed e-Health 20(10):909–915 Goetghebeur MM, Wagner M, Khoury H et al (2008) Evidence and value: impact on DEcisionMaking – the EVIDEM framework and potential applications. BMC Health Serv Res 8:270. https://doi.org/10.1186/1472-6963-8-270 Goetghebeur MM, Wagner M, Khoury H et al (2012) Bridging health technology assessment (HTA) and efficient health care decision making with multicriteria decision analysis (MCDA): applying the EVIDEM framework to medicines appraisal. Med Decis Mak 32:376–388. https:/ /doi.org/10.1177/0272989X11416870 Hardy LA (2018) Improving thulium fiber laser lithotripsy efficiency. Dissertation, The University of North Carolina at Charlotte Hasan M, Büyüktahtakın E, Elamin E (2019) A multi-criteria ranking algorithm (MCRA) for determining breast cancer therapy. Omega 82:83–101 Howard S, Scott IA, Ju H et al (2018) Multicriteria decision analysis (MCDA) for health technology assessment: the Queensland health experience. Aust Health Rev. https://doi.org/ 10.1071/AH18042 Ivlev I, Vacek J, Kneppo P (2015) Multi-criteria decision analysis for supporting the selection of medical devices under uncertainty. Eur J Oper 247:216–228. https://doi.org/10.1016/ j.ejor.2015.05.075 Ivlev I, Jablonsky J, Kneppo P (2016) Multiple-criteria comparative analysis of magnetic resonance imaging systems. Int J Med Inform 8:124. https://doi.org/10.1504/ijmei.2016.075757 Jansen TC, van Bommel J, Bakker J (2009) Blood lactate monitoring in critically ill patients: a systematic health technology assessment. Crit Care Med 37:2827–2839. https://doi.org/ 10.1097/CCM.0b013e3181a98899 Kahraman C, Ate¸s NF, Çevik S et al (2007) Hierarchical fuzzy TOPSIS model for selection among logistics information technologies. J Enterp 20:143–168. 
https://doi.org/10.1108/ 17410390710725742 Karadayi MA, Karsak EE (2014) Fuzzy MCDM approach for health-care performance assessment in Istanbul. In: Callaos N, Hashimoto S, Rutkauskas AV, Sanchez B, Zinn CD (eds) The 18th world multi-conference on systemics, cybernetics and informatics proceedings vol ii, Florida, July 2014. SIII



Karatas M, Tozan H, Karacan I (2018) An integrated multi-criteria decision making methodology for health technology assessment. Eur J Ind Eng 12:504. https://doi.org/10.1504/ ejie.2018.10014740 La Torre G, de Waure C, Chiaradia G et al (2010) The health technology assessment of bivalent hpv vaccine cervarix® in Italy. Vaccine 28:3379–3384. https://doi.org/10.1016/ j.vaccine.2010.02.080 Liu HC, Wu J, Li P (2013) Assessment of health-care waste disposal methods using a VIKORbased fuzzy multi-criteria decision making method. Waste Manag 33:2744–2751. https:// doi.org/10.1016/j.wasman.2013.08.006 Mahboub-Ahari A, Hajebrahimi S, Yusefi M et al (2016) EOS imaging versus current radiography: a health technology assessment study. Med J Islam Repub Iran 30:331 Martelli N, Hansen P, van den Brink H et al (2016) Combining multi-criteria decision analysis and mini-health technology assessment: a funding decision-support tool for medical devices in a university hospital setting. J Biomed Inform 59:201–208. https://doi.org/10.1016/ j.jbi.2015.12.002 Miniati R, Dori F, Cecconi G et al (2013) HTA decision support system for sustainable business continuity management in hospitals: the case of surgical activity at the University Hospital in Florence. Technol Health Care 21:49–61. https://doi.org/10.3233/THC-120709 Mitchell MD, Williams K, Brennan PJ, Umscheid CA (2010) Integrating local data into hospitalbased healthcare technology assessment: two case studies. Int J Technol Assess Health Care 26(3):294–300 Mühlbacher AC, Kaczynski A (2016) Making good decisions in healthcare with multi-criteria decision analysis: the use, current research and future development of MCDA. Appl Health Econ Health Pol 14:29–40. https://doi.org/10.1007/s40258-015-0203-4 Nojomi M, Moradi-Lakeh M, Velayati A et al (2016) Health technology assessment of non-invasive interventions for weight loss and body shape in Iran. Med J Islam Repub Iran 30:348 Oliviera MD, Mataloto I, Kanavos P (2019) Multi-criteria decision analysis for health technology assessment: addressing methodological challenges to improve the state of the art. Eur J Health Econ 20:891–918. https://doi.org/10.1007/s10198-019-01052-3 Özlük C (2012) Proksimal üreter ta¸slarının tedavisinde ESWL, pnömotik litotripsi ve lazerle litotripsi metodlarının etkinliklerinin kar¸sıla¸stırılması. Dissertation, Gazi University Öztürk N (2017) Multi criteria decision making model for health technology assessment and an application in dialysis. Dissertation, Marmara University Öztürk N, Tozan H, Vayvay Ö (2016) Comprehensive needs analysis for health technology assessment studies and improvement proposal. Eurasian J Health Technol Assess 1(1):69–76 Özüdoˇgru AG (2018) Determination of biomedical device selection criteria. In: 2018 Medical Technologies National Congress (TIPTEKNO 2018). 2018 Medical technologies National Congress, Magusa, Cyprus, November 2018. IEEE, pp 1–4 Padma T, Balasubramanie P (2011) A fuzzy analytic hierarchy processing decision support system to analyze occupational menace forecasting the spawning of shoulder and neck pain. Expert Syst Appl 38:15303–15309. https://doi.org/10.1016/j.eswa.2011.06.037 Palozzi G, Brunelli S, Falivena C (2018) Higher sustainability and lower opportunistic behaviour in healthcare: a new framework for performing hospital-based health technology assessment. Sustainability 10(10):3550 Perry TS (1995) Lotfi A. Zadeh [fuzzy logic inventor biography]. IEEE Spectr 32:32–35. 
https:// doi.org/10.1109/6.387136 Republic of Turkey Ministry of Health (2019) What is HTA?. http://www.hta.gov.tr/EN/ std_hta.aspx. Accessed 07 Sep 2019 Saarni SI, Anttila H, Saarni SE et al (2011) Ethical issues of obesity surgery—a health technology assessment. Obes Surg 21:1469–1476. https://doi.org/10.1007/s11695-011-0386-1 Thokala P, Devlin N, Marsh K et al (2016) Multiple criteria decision analysis for health care decision making – an introduction: report 1 of the ISPOR MCDA emerging good practices task force. Value Health 19:1–13. https://doi.org/10.1016/j.jval.2015.12.003



Tony M, Wagner M, Khoury H et al (2011) Bridging health technology assessment (HTA) with multicriteria decision analyses (MCDA): field testing of the EVIDEM framework for coverage decisions by a public payer in Canada. BMC Health Serv Res 11:329. https://doi.org/10.1186/ 1472-6963-11-329 Torfi F, Farahan RZ, Rezapour S (2010) Fuzzy AHP to determine the relative weights of evaluation criteria and fuzzy TOPSIS to rank the alternatives. Appl Soft Comput 10:520–528. https:// doi.org/10.1016/j.asoc.2009.08.021 Urologic Surgeons of Washington (2019) Ureteroscopy with laser lithotripsy for the treatment of kidney stones. https://www.dcurology.net/procedures/ureteroscopy-with-laser-lithotripsy.php. Accessed 07 Sep 2019 Urologist Bhopal (2019) ESWL – external shockwave lithotripsy. http://wwwa.urologistbhopal. com/medical-care/surgery-for-kidney-stones/eswl-external-shockwave-lithotripsy/. Accessed 07 Sep 2019 Váchová L, Hajdíkova T (2017) Evaluation of Czech hospitals performance using MCDM methods. In: Proceedings of the world congress on engineering and computer science 2017, vol II, San Francisco, 25–27 Oct 2017 Velmurugan R, Selvamuthukumar S (2012) The analytic network process for the pharmaceutical sector: multi criteria decision making to select the suitable method for the preparation of nanoparticles. J Pharm Sci 20:59. https://doi.org/10.1186/2008-2231-20-59 Wagner M, Khoury H, Bennetts L et al (2017) Appraising the holistic value of Lenvatinib for radio-iodine refractory differentiated thyroid cancer: a multi-country study applying pragmatic MCDA. BMC Cancer 17:272. https://doi.org/10.1186/s12885-017-3258-9 Wang X, Chan HK (2013) A hierarchical fuzzy TOPSIS approach to assess improvement areas when implementing green supply chain initiatives. Int J Prod Res 51:3117–3130. https:// doi.org/10.1080/00207543.2012.754553 Wickham JEA (1985) Extracorporeal shock wave treatment for kidney stones. Br J Urol 290:188– 189 World Health Organization (2019) Health Technology Assessment. https://www.who.int/ medical_devices/assessment/en/. Accessed 07 Sept 2019 Yazdani S, Jadidfard M (2017) Developing a decision support system to link health technology assessment (HTA) reports to the health system policies in Iran. Health Policy Plann 32:504– 515. https://doi.org/10.1093/heapol/czw160 Yi˘git A, Erdem R (2016) Sa˘glık teknolojisi de˘gerlendirme: kavramsal bir çerçeve. Süleyman Demirel Üniversitesi Sosyal Bilimler Enstitüsü Dergisi 1(23):215–249 Zadeh LA (1965) Fuzzy sets. Inf Control 8:338–353. https://doi.org/10.1016/S00199958(65)90241-X Zhao H, Guo S, Zhao H (2019) Comprehensive assessment for battery energy storage systems based on fuzzy-MCDM considering risk preferences. Energy 168:450–461. https://doi.org/ 10.1016/j.energy.2018.11.129

Geographic Distribution of the Efficiency of Childbirth Services in Turkey Songul Cinaroglu

Abstract Medicalization of childbirth services and regional differences are the major obstacles in the improvement of women and child health in Turkey. The present study analyzes the geographic distribution of the efficiency of childbirth services in Turkish provinces. Data was collected from the official statistical records of the 2017 Public Hospitals Statistical Yearbook. Charnes, Cooper, and Rhodes’ (Eur J Oper Res 2(6):429–444, 1978) input-oriented data envelopment analysis (DEA) was applied to determine provincial efficiency scores, using childbirthspecific input and output indicators. Jackknife analysis was used for a robustness check of the DEA scores. Four different DEA models were constructed, and the final model’s efficiency scores were recorded. Finally, a decision-tree procedure was integrated into the DEA results, and predictors of efficient and inefficient provinces were examined. A total of 81 provinces in Turkey, representing seven geographic regions, were included in the analysis. The results showed that 18% of the provinces were efficient in terms of childbirth services. Average efficiency scores were high (0.71) for provinces located in the Southeast Anatolia Region. The most important predictor of efficiency for childbirth services is the number of beds in neonatal intensive care units (Neo_int_n_b). A geographic distribution of the provincial efficiency scores of childbirth services shows that eastern Turkey has the highest score. Neo_int_n_b is the most important determinant of efficiency scores. Ensuring public-health managers’ awareness about and continuous monitoring of childbirth services, while focusing more on regional differences, is essential to improve the status of children’s health in Turkey.

S. Cinaroglu () FEAS, Department of Health Care Management, Hacettepe University, Ankara, Turkey e-mail: [email protected] © Springer Nature Switzerland AG 2021 Y. I. Topcu et al. (eds.), Multiple Criteria Decision Making, Contributions to Management Science, https://doi.org/10.1007/978-3-030-52406-7_5

131

132

S. Cinaroglu

1 Introduction Worldwide, about 140 million women give birth every year. Increasing medicalization of childbirth services is decreasing women’s opportunity to give birth normally. This might negatively affect their birthing experiences (WHO 2018). Economic and social factors influence not only childbirth services in developing countries but also mothers’ preferences (Akhter and Schech 2018). The dominant birthing model in most of the Western world is medicalized childbirth (Benyamini et al. 2017). Additionally, cesarean sections (C-section) are on the rise worldwide (Khan et al. 2017). Between 1990 and 2014, the global average C-section rate increased by 12.4%, with an average annual gain of 4.4% (Betran et al. 2016). The efficiency and quality of childbirth services determine the number of maternal deaths and pose significant risks for the health of both the mother and the child. Lassi et al. (2016) reported that less than 1% of the annual maternal deaths occurred in the developed world, indicating that almost all take place in low- and middle-income countries. According to the World Bank’s country income classification (WB 2018), Turkey is both a middle-income and a developing country. Medicalization of childbirth services in Turkey has been studied by many health-policy researchers (see Santas et al. 2018; Betran et al. 2016; Pınar et al. 2016). Despite the World Health Organization’s “ideal” recommended rate of C-section (between 10% and 15%) (WHO 2015), in Turkey C-section as a percentage of all births rose from 21% to 53% between 2002 and 2016. This is higher than the 32% rise for the 26 OECD countries and the 28% rise for the 24 EU nations in 2016 (MoH 2016). The high percentage of C-section in Turkey questions the efficiency and quality of its childbirth services. For many years, Turkey has seen high rates of infant mortality relative to its economic status (Koc and Eryurt 2017). In the past, adult mortality rates in Turkey were similar to those of other countries with the same socioeconomic status. A low level of life expectancy at birth correlated with a high rate of infant mortality. This inverse situation of population dynamics in Turkey is called the “Turkish puzzle” (Gursoy-Tezcan 1992). In the 2000s, though, mortality rates for infants and children aged less than 5 years started falling rapidly. Since the 1980s, changes in socioeconomic dynamics and successful maternal and child healthcare programs factored into these improvements (Koç et al. 2010). However, this progress has not continued. In 2016, Turkey ranked second-to-last among OECD countries in infant mortality (deaths/1000 live births) (OECD 2018a). Regional differences exist between the western and eastern parts of Turkey (Yardım and Uner 2018). The 2013 Turkish Demographic and Health Survey (TDHS 2013) showed that in addition to high infant mortality, prominent differences were found between regions and rural versus urban areas in terms of infant and child mortality. Turkey’s western region is the most intensely populated as well as the most socioeconomically developed part of the country (TDHS 2013). ˙Istanbul is the most populated Turkish city and its manufacturing and cultural center (Masoumi et al. 2018). The eastern part of Turkey is the least developed, with a low level of industrialization and economic benefits. Policymakers are interested in developing

Geographic Distribution of the Efficiency of Childbirth Services in Turkey

133

eastern Turkey, and development projects have been launched in southeastern Turkey, such as The Southeastern Anatolia Project [Güneydo˘gu Anadolu Projesi (GAP)], which began in 1970. In the future, this project will grow as it becomes a more comprehensive plan for regional modernization. Political, cultural, social, economic, and cultural enhancement of southeastern Turkey are the major aims of this project. Effective political oversight of southeastern Turkey has been hampered by cultural barriers. This ambitious regional development project will also address healthcare reform, including maternal and childbirth services (Bilgen 2018). Health policy, planning, and reform intertwine and commingle with maternal and childbirth services. Turkey’s 2003 Health Transformation Program (HTP) reformed the healthcare sector by providing all citizens with equal healthcare (see Akdag 2003; MoH 2007; Mollahalilo˘glu et al. 2018). HTP tried to enhance the healthcare system’s responsiveness by removing barriers, increasing the accessibility and availability of healthcare services, reducing waiting times, and improving patient satisfaction (see Akdag 2003; Atun et al. 2013; Mollahalilo˘glu et al. 2018). Transforming the healthcare delivery system and strengthening family healthcare services were two significant steps of this reform process. Family physicians provide healthcare services to registered citizens (Akdag 2003). Family healthcare facilities, available to most women, are first-degree, neighborhood health institutions staffed by nurses, midwives, and physicians (Ta¸sçı-Duran and Sevil 2013). Under HTP, Turkey has seen significant improvements in primary and child healthcare services. Hospital births rose from 84.8% in 2007 to 96.8% in 2012. Moreover, the percentage of C-section births grew from 21% in 2012 to 53% in 2016 (Dilli et al. 2016). In private hospitals, C-section accounted for 70.5% of all hospital births in 2016 (MoH 2016). As per recent OECD statistical records, Turkey leads all developed countries in terms of C-section per 1000 live births (OECD 2018b). Policymakers in Turkey need to consider regional differences in healthcare. The percentage of hospital births is higher in the more populated areas of Turkey, such as ˙Istanbul, West Marmara Region, Aegean, West Anatolia Region, and the Mediterranean and West Black Sea regions, compared with the country’s average (Dilli et al. 2016). In other words, eastern Turkey has fewer in-hospital births. Differences between the western and eastern parts of the country have led to unequal access and utilization of healthcare. After more than a decade of health reform in Turkey, regional differences continue to exist (Yardım and Uner 2018), often tied to the levels of development. GAP looks to develop eastern Turkey, particularly to make it economically competitive with eastern Turkey. Launched in 1970, GAP principally hopes to provide sustainable development and to solve socioeconomic problems. After four decades (Bilgen 2018), official 2017 records showed that 74% of the energy projects and 26.4% of the irrigation projects under GAP had been completed (GAP-BK˙I 2016). However, Turkey still has a long way to go to achieve sustainable access to social and healthcare services. GAP regions still rank above the national average in annual population growth, fertility rates, average household size, infant mortality rates, and unemployment (Bilgen 2018). Healthcare and economic reforms are necessary for regional development. 
Improvements in healthcare provide significant benefits for vulnerable groups

134

S. Cinaroglu

and disadvantaged populations who live in less-developed parts of the country. Improvements in primary care services (including maternal and child care) provide significant benefits for the population’s health and decrease geographic differences in healthcare services. Clearly, effective use of maternal and childbirth services is essential for developing countries such as Turkey. Ta¸sçı-Duran and Sevil (2013) have suggested that despite the numerous differences between regions in Turkey, most pregnancy-related deaths can be prevented by better-quality medical care and more effective application of public and private resources. Thus, primary healthcare services and the education of pregnant women play prominent roles in enhancing child healthcare and combating regional differences. Family health centers, which are available to most women, are first-rate, neighborhood health institutions staffed by nurses, midwives, and physicians (Ta¸sçı-Duran and Sevil 2013). Clearly, monitoring patients’ experiences is necessary to get the benefits resulting from an expansion of healthcare services. The effectiveness of specific healthcare services, such as labor and delivery, is essential for improving the population’s general health. Nevertheless, Turkey lacks data about child healthcare services. Numerous studies evaluate the efficiency of general healthcare services. Using data since 1994 from the Ministry of Health (MoH), Ersoy et al. (1997) analyzed 573 hospitals and found that 90.6% of the country’s hospitals were technically inefficient. Another study, by Sahin and Ozcan (2000), concentrated on provincial differences in healthcare services. Results from the input-oriented Charnes, Cooper, and Rhodes (CCR) 1978 model showed that 44 of the 80 Turkish provinces were operating inefficiently and that the distribution of resources among the provinces was unequal. Over time, scholars have considered HTP’s influence on efficiency. Sahin et al. (2011), who wrote the first study about HTP efficiency, used 2005–2008 data from 352 public hospitals. Study results showed that the efficiency of these facilities increased by 12.5%, primarily because of technological improvements. Another study, by Kaçak et al. (2014), which used the same data, indicated that efficient and high-quality hospitals accounted for only 11% of the sample. However, another 11% of hospitals were efficient but classified as poor quality. Inefficient but highquality hospitals totaled 32% while inefficient and poor-quality hospitals occupied the largest category, with 45%. This indicates that even after reforms, more work needs to be done to enhance the efficiency and quality of public hospitals. Public hospitals operate in a highly competitive environment, which can motivate a hospital to improve its efficiency. Ozgen-Narci et al. (2015) examined the efficiency of Turkish hospitals and found that 17% of them were efficient. This suggests that hospital efficiency in Turkey does not seem to be affected by hospital competition. Other studies are concerned with the effect of HTP on efficiency in public healthcare services. A current study on HTP and hospital efficiency emphasized that the number of inputs slightly increased or remained steady over 9 years. Hospitals increased their outputs almost 100% for outpatient visits and inpatient days and by almost 250% from 2001 to 2009. These statistics show that hospital efficiency increased while the quality of healthcare services improved. 
Efficiency improvements attributable to HTP’s ability to mobilize excess inputs

Geographic Distribution of the Efficiency of Childbirth Services in Turkey

135

(which otherwise would cause great inefficiencies) improve service (Mollahalilo˘glu et al. 2018). Understanding provincial differences with regard to childbirth services is essential for developing countries to fight poor maternal and childcare outcomes. One drawback of healthcare development programs is insufficient monitoring and evaluation of the programs’ effectiveness, which could transform the population’s social and health dynamics. In Turkey, although HTP includes comprehensive reforms in terms of childbirth services, there is a lack of monitoring effort with regard to the efficiency of childbirth services. Additionally, the existing literature uses wellknown input and output indicators to measure the efficiency of healthcare services, such as number of beds, nurses, physicians, outpatients, inpatients, and surgeries in Turkey (Ersoy et al. 1997; Sahin and Ozcan 2000; Sahin et al. 2011; Ozgen-Narci, et al. 2015; Mollahalilo˘glu et al. 2018). While prior knowledge provides significant information about the efficiency of healthcare services, childbirth services still need to be examined. On the other hand, regional differences in maternal and childbirth services still remain in Turkey. Turkish women generally submit to medicalized birth, despite the unpleasant experiences of hospital birth (Ozgen-Narci et al. 2015). Thus, increasing medicalization is a major problem of childbirth services in Turkey. However, there is a lack in the number of studies related to childbirth services’ efficiency by considering the medicalization of these services. The objective of this study is to provide a more fine-grained understanding of the efficiency of childbirth services in Turkish provinces. In this chapter, Sect. 2 describes the methods, Sect. 3 presents the application results, Sect. 4 discuss study findings, and the final section is the conclusion.

2 Methodology 2.1 Data and Study Procedures The required data about the 81 Turkish provinces was obtained from the 2017 Public Hospitals Statistical Year Book. Figure 1 presents a brief overview about the study procedure. The procedure starts with a stepwise selection of potential input and output indicators regarding the efficiency of childbirth services. The analysis uses

1. Stepwise selection of variables in the DEA model

Fig. 1 Study procedure

2. Efficiency analysis of childbirth services by incorporating Jackknife analysis

3. Combination of DEA results with the decision tree procedure

136

S. Cinaroglu

the CCR input-oriented data envelopment analysis (DEA) model (Charnes et al. 1978). Jackknife analysis was used in the analysis to test the robustness of the DEA results. DEA results were integrated into a decision-tree procedure, using a Classification and Regression Tree (CART) algorithm, with the most important input and output variables determining and predicting efficient and inefficient provinces. In this study, a critical review and analysis of existing literature was used for the selection of input/output variables. The following common input variables used for the efficiency analysis of healthcare services in Turkey are incorporated into the study model (Mollahalilo˘glu et al. 2018; Ersoy et al. 1997; Sahin and Ozcan 2000; Sahin et al. 2011; Ozgen-Narci et al. 2015): number of physicians, number of nurses/midwives, and number of beds in the neonatal intensive care unit (NICU). Additionally, the output variables of this study are inspired from Ozcan (2014), which provides examples of DEA that speak directly to every significant facet of healthcare services. Moreover, a multidisciplinary investigation workshop document about the preference of birth type of Turkish women was used for the selection of output indicators. This document emphasizes increasing medicalization of childbirth services and the number of operative deliveries and C-sections in Turkey. Further, it focuses on the need to encourage Turkish women to give normal birth in order to enhance the efficiency of childbirth services (MoH 2018). In the light of existing knowledge, the output variables of this study are as follows: total number of normal deliveries, C-sections, and operative deliveries.

2.2 Stepwise Selection of Study Variables in the DEA Model Adding study variables into the DEA model significantly assists the efficiency analysis. The literature suggests limiting the number of variables relative to the number of decision-making units (DMUs). Generally, the total number of input and output variables in a DEA model should be no more than one-third of the number of DMUs in the analysis (Ozcan 2014). Thus, selecting input and output variables and using an appropriate number of input and output variables in a DEA model are essential to achieve optimal efficiency scores (Wagner and Shimshak 2007). Norman and Stoker (1991) suggested a stepwise approach to reduce the number of input and output variables in a DEA model. This process begins with a simple model (Model 1) that includes one input and one output variable. Then, efficiency scores for DMUs in Model 1 are calculated and correlated with all potential input and output variables. High statistical correlations indicate that potential input or output variables will influence the DEA results. Further, a new variable is added into the DEA model, based on the correlation values. This stepwise procedure is repeated until no high correlations or effective variables exist in the dataset. Both variable selection and the number and characteristics of DMUs in DEA models are critical to improve DEA performance. A DEA jackknife analysis was used to gain an optimal number of DMUs and to test the robustness of DEA results.

Geographic Distribution of the Efficiency of Childbirth Services in Turkey

137

2.3 Robustness Tests for DEA Efficiency Scores, Using Jackknife Analysis The literature suggests that DEA results are sensitive to outliers within the dataset (Ozcan 2014; Norman and Stoker 1991). In this study, to assess the impact of outliers on the efficiency analysis, jackknife analysis was performed. This is an iterative procedure that produces a distribution of estimates by removing one efficient observation at a time and observing the difference in efficiency scores (Yang et al. 2017; Efron 1982). In other words, it tests the robustness of DEA results. In this procedure, a limited number of samples are obtained by omitting one observation at a time (Efron 1982; Zere et al. 2006). In other words, efficient DMUs are dropped one at a time from the analysis, and efficiency scores are recalculated. Efficiency rankings between the model, including all DMUs in the analysis and those representing the withdrawal of each efficient DMU, are then tested using the Spearman rank correlation coefficient (Zere et al. 2006; Hernandez and Sebastian 2014). The correlation coefficient equals 1, indicating that the rankings are exactly the same. In other words, the efficiency frontier does not change, whether or not an observation is incorporated into the model. A value of 0 indicates an absence of correlation between the rankings. A reverse ranking is implied by a value of −1 (Zere et al. 2006). Clearly, a value of 0 implies an absence of correlation and indicates that excluding the outlier completely changes the efficiency scores. DEA robustness tests estimate the impact of individual DMUs on overall efficiency scores (Efron 1982; Zere et al. 2006). In this study, after obtaining robustness tests of DEA results and final efficiency scores of DMUs, a decision-tree procedure is incorporated into the DEA analysis to understand predictor variables of efficient and inefficient DMUs for childbirth services.

2.4 Incorporating DEA Results with Decision-Tree Procedures Models integrating DEA have long been discussed and examined in the literature (Seol et al. 2007; Lee 2010; Emrouznejad and Anouze 2010; Chuang et al. 2011). The existing literature emphasizes that DEA explains efficiency scores but not factors related to inefficiency. These limitations can be overcome with decisiontree procedures. This allows a better understanding of results gathered from DEA, by a detailed examination of factors related to efficiencies and inefficiencies. In this study, a decision-tree procedure is integrated into the DEA process. The literature also mentions shortfalls that occur by combining the DEA process with decision-tree procedures. To create a reliable classification tree, large numbers of observations usually are required. However, in most DEA processes reported in the literature, the number of DMUs is not large enough to create a decision tree (Emrouznejad and Anouze 2010).

138

S. Cinaroglu

To solve this problem, ensemble learning techniques, such as bagging and boosting, are advisable optimization techniques to deal with the model’s overfitting problem (Witten and Frank 2005). Random Forests uses an ensemble learning technique and creates numerous trees in the forest to optimize prediction results. Random Forests is a combination of decision trees, and as a result, they vote for the most popular class. It uses a CART algorithm to create decision trees (Breiman 2001). This algorithm is used to create a classification forest to predict efficient and inefficient provinces in terms of childbirth services. During this procedure, each variable, splitting criterion, such as the “Gini index,” is computed for all possible cut points within the range of that variable (Breiman 2001). The variable determined for the next split is the one that creates the highest overall criterion value, i.e., the one with the best cutting point. One advantage of a Random Forests procedure is that it provides prediction results competitive with boosting and bagging but does not progressively change the training set (Cheung-Wai Chan and Paelinckx 2008; Oshiro et al. 2012). In our case, Random Forests and CART algorithm, tree-based ensemble techniques, are used for modeling. In Random Forests, the reason behind creating multiple numbers of trees and creating a forest is to optimize prediction results. However, increasing the number of trees in a forest does not always increase the number of prediction results (Oshiro et al. 2012), but leads to better prediction results. The previous literature emphasizes that the performance of single trees is low compared with the multitude of trees in Random Forests. In this regard, one advantage of the Random Forest procedure is that its results are predictable and competitive with those obtained from boosting and bagging, without progressively changing the training set. Although this procedure is not obvious, it is useful; therefore, Random Forests should be viewed as a Bayesian procedure that improves prediction performance (Breiman 2001; Cheung-Wai Chan and Paelinckx 2008; Oshiro et al. 2012). In this study, classification accuracy (CA), area under the ROC curve (AUC), and F1 results are used as performance measures. CA is defined as the number of correct predictions made, divided by the total number of predictions made, multiplied by 100 (to formulate a percentage). The ROC curve is a measure of classifier performance (Bradley 1997) that measures the difference between the distribution of the two classes (Hand and Till 2001). The value of AUC changes from 0 to 1. Values next to 1 represent high classification performance (Hand and Till 2001). The F measure is another multi-class classification performance measure, whose main focus is the relationship between the data’s positive labels and those given by a classifier, based on the sums of per-text decisions. The F1 score reaches its best score at 1, reflecting perfect precision and recall (Powers 2011).

Geographic Distribution of the Efficiency of Childbirth Services in Turkey

139

Table 1 Descriptive statistics Inputs/outputs Variables

Label

n

Inputs

Phy Nur_mid Neo_int_n_b Nor_del C_sec Oper_del

81 54 81 154 81 4 81 192 81 217 81 0

Outputs

Physicians Nurses/midwives Number of beds in NICU Normal delivery Cesarean section Operative delivery

Min. Max 7796 15, 545 517 46, 588 29, 655 5420

Median Mean Sd. 245 982 26 2752 1789 12

516 1465 47 5316 3295 269

1032 2013 75 7553 4461 902

3 Application Results 3.1 Descriptive Statistics Descriptive statistics of all input and output variables are presented in Table 1, which also provides minimum, maximum, mean, and standard deviation values. Mean values and standard deviation scores for input variables are as follows: total number of physicians’ mean = 516, s.d. = 1032; total number of nurses and midwives’ mean = 1465; s.d. = 2013; and number of beds in NICUs’ mean = 47, s.d. = 75. Descriptive statistics for output variables are normal delivery mean = 5316, s.d. = 7553, C-section mean = 3295, s.d. = 4461, and operative delivery mean = 269, s.d. = 902.

3.2 Adding Variables into the DEA Model In this study, Norman and Stoker’s (1991) stepwise approach is applied to add variables into the model. The number of beds in NICUs is an input variable, and the number of deliveries is an output variable for Model 1. Efficiency scores are recorded for the 81 Turkish provinces. All potential input and output variables are incorporated into the study’s model, and Spearman rank correlations between potential input and output variables and efficiency scores gathered from Model 1 are estimated. Multicollinearity is ruled out because all correlations are lower than 0.30, and all potential input and output variables are included in the study’s model (see Table 2).

3.3 Efficiency Analysis of Provinces in Terms of Childbirth Services In this study, we use an input-oriented variable returns to scale model. During a DEA procedure, jackknife analysis was performed to test the robustness of the efficiency

140

S. Cinaroglu

Table 2 Potential input/output variable correlations with Model 1 Variablesa Nur_mid Phy Oper_del C_sec

Spearman correlation coefficient Spearman correlation coefficient Spearman correlation coefficient Spearman correlation coefficient Spearman correlation coefficient

Eff_Model_1b −0.295** −0.261* −0.033 −0.155

a See

Table 1 for variable labels efficiency scores obtained from Model 1 (input variable is Neo_int_n_b, output variable is: Nor_del); * p < 0.05; ** p < 0.01

b Eff_Model_1;

The initial model (81 DMUs) The second model (75 DMUs) The third model (73 DMUs)

18 efficient DMUs 63 inefficient DMUs

18 iterations for every efficient DMU

16 efficient DMUs 59 inefficient DMUs 14 effiicient DMUs 59 inefficient DMUs

16 iterations for every efficient DMU

14 iterations for every efficient DMU

First 6 iterations have low correlations with initial model First 2 iterations have low correlations with second model

First iteration have low correlation with third model

First 6 efficient DMUs are removed from the initial model

Efficiency scores are recalculated

First 2 efficient DMUs are removed from the second model

Efficiency scores are relcalculated

First efficient DMU is removed from the third model

Efficiency score are recalculated

Final model (Fourth model) (72 DMUs) (13 efficient, 59 inefficient DMUs)

Fig. 2 DEA and Jackknife analysis

scores for the 81 Turkish provinces. After the jackknife analysis, the efficiency scores of a final model are presented for childbirth services in the Turkish provinces.

3.4 Testing the Robustness of DEA Results with Jackknife Analysis The jackknife analysis is performed to test the robustness of DEA efficiency scores, obtained from input and output variables in the 81 Turkish provinces (Fig. 2). The iteration procedure generates four models. The first, called the “initial model,” has 81 DMUs. According to this model, 22% of the provinces are efficient in terms of childbirth services (In this model, 18 DMUs are efficient and 63 are inefficient). The average efficiency score for the first model is 0.69. During the initial model’s jackknifing iteration procedure, all the 18 efficient DMUs are dropped one at a time from the analysis, and efficiency scores are reestimated. The correlations between efficiency scores obtained from the initial model and 18 different iterations are presented in the correlogram. It is seen that efficiency scores obtained from the first six iterations presented low correlations with the initial model. In other words, reestimated efficiency scores of provinces, after dropping the first six efficient provinces one at a time from the analysis, indicate a low level of correlation with the initial model. Obviously, efficiency scores obtained without including the first

Geographic Distribution of the Efficiency of Childbirth Services in Turkey

141

six efficient DMUs one at a time does not give results similar to the initial model’s efficiency scores. Thus, considering the first six efficient provinces of the initial model affects the efficiency frontier. In this regard, to avoid extreme outliers that could affect the robustness of efficiency scores, the first six efficient DMUs in the initial model are removed, and efficiency scores are reestimated (Fig. 2). After this iteration, 75 provinces remain for the second model. The reestimated efficiency scores for the second model show that 16 provinces are efficient and 59 are inefficient. The average efficiency score for this model is 0.71. After a jackknifing iteration procedure of the second model, efficiency scores gathered from the first two iterations present low correlations with the second model. Thus, the first two efficient DMUs of the second model are removed, and efficiency scores are reexamined. For the third model, the recalculated efficiency scores for 73 DMUs show that 14 provinces are efficient and 59 are inefficient (Fig. 2). The average efficiency score for the third model is 0.71. Subsequently, after performing a jackknifing iteration procedure on the third model, efficiency scores from the first iteration present low correlations with the third model. The first efficient observation in the third model is dropped, and efficiency scores are recalculated. Finally, recalculated efficiency scores of the final (fourth) model for 72 provinces show that 13 provinces are efficient and 59 are inefficient. The iteration procedure for the final (fourth) model shows that the reestimated efficiency scores of the final (fourth) model have high correlations with 13 iterations. In other words, incorporating efficient DMUs into the final (fourth) model does not affect the efficiency frontier. The average efficiency score for the final model is 0.71. The DEA model construction procedure that was incorporated with jackknifing is presented in Fig. 2. The correlogram shows Spearman correlation coefficients between the efficiency scores of the initial model and 18 iterations (see Graph 1). The first column of this correlogram presents the degree of similarity between the efficiency scores of the initial model and 18 iterations, which is created by dropping every efficient province in the initial model one at a time and recalculating efficiency scores. Efficiency scores of the initial model have a low level of correlation with the first through sixth iteration scores. The first through sixth efficient provinces are removed from the first model because of their potential to affect the efficiency frontier. Since DEA is a relative efficiency measure, efficiency scores are recalculated for the 75 provinces in the second model. The first column in the correlogram presented in Graph 2 shows a Spearman correlation coefficient of the efficiency scores obtained for 75 provinces in model 2 and 16 iterations. Efficiency scores in the second model have a low level of similarity with the first and second iteration scores. Thus, the first and second efficient provinces are removed from the second model, and efficiency scores are recalculated for 73 DMUs. The first column of Graph 3 shows a Spearman correlation coefficient of efficiency scores obtained for 73 provinces in Model 3 and 14 iterations. Efficiency scores in the third model have a low level of correlation with the first iteration scores. Thus, we infer that the first efficient DMU has the potential to change the efficiency

142

S. Cinaroglu

Graph 1 Correlogram for the efficiency scores of the initial model and 18 iterations for every efficient DMU of the initial model

frontier of the third model. In this regard, the first efficient DMU in the third model is removed from the model, and efficiency scores are recalculated for 72 DMUs. The final (fourth) correlogram presents Spearman correlation coefficients of efficiency scores obtained for 72 provinces in Model 4 and 13 iterations. The efficiency scores of this model correlate with the efficiency scores of 13 iterations. In other words, efficiency scores obtained from the fourth (final) model and the 13 iterations are similar. Thus, the fourth (final) model comprises 72 DMUs, of which 13 are efficient and 59 are inefficient. After the jackknife analysis, 18% of the provinces were found to be efficient in terms of childbirth services (Graph 4).

3.5 Geographic Distribution of Provincial Efficiency Scores for Childbirth Services The final model’s geographic distribution of efficiency scores for 72 provinces is presented in Fig. 3. Among these provinces, 13 are efficient and 59 are inefficient.

Geographic Distribution of the Efficiency of Childbirth Services in Turkey

143

Graph 2 Correlogram for efficiency scores of the second model and 16 iterations for every efficient DMU of the second model

The average general efficiency score is 0.71 and the average efficiency score for inefficient DMUs is 0.64. In this figure, dark red colors represent provinces with high-efficiency scores (>0.91), and light green colors represent provinces with lowefficiency scores S>B>G>H

S>W>H>B>G

G>Nuclear>W>Solar CHP>Natural gas>H

W>B>G>S>H

W>S>B>G>H

W>B>S>CHP>H> Nuclear> Conventional Energy

W>S>B>G>H

Hydro>Wind> Geothermal> Photovoltaic H>S>W=G>B

W>S>B>G>H

184

P. Darende et al.

3 Overview of Renewable Energy Resources Solar, wind, hydroelectric, geothermal, and biomass energy resources were chosen in the scope of this case study. Hydrogen and wave energy resources were not taken into account since the opinions of experts were taken as a reference. Besides, existing and promising power plants in Turkey were also investigated to make a decision. For instance, wave energy is still in the prototype phase in Zonguldak, whereas using hydrogen energy is also in the research state for Turkey. Therefore, why these are not utilized may be explained this way. Five energy alternatives, which are solar, wind, bioenergy, hydroelectric, and geothermal energy, are clarified below. Solar Energy: Solar energy is an attractive alternative because it is sustainable, renewable, and environmentally friendly (Konstantin 2017). Wind Energy: Wind turbines convert the kinetic energy of the wind into mechanical energy, and then, the mechanical energy is transformed into electrical energy (Price 2005). Hydroelectric Energy: The hydroelectric potential of Turkey constitutes 1% of the world’s total and 16% of the European total (Mutlu 2013). Geothermal Energy: Rain and snow water reaches the magma layer from the crust of the world. Bioenergy: All natural substances of vegetable or animal origin, the main components of which are carbohydrate compounds, are defined as biomass energy resources. The advantages and disadvantages of alternative renewable energy resources are shown in Table 3. Table 3 Advantages and disadvantages of alternative renewable energy sources Alternatives Solar

Wind

Advantages Having durable parts Low maintenance Low operating costs Cheaper costs of installations To be environment friendly

Hydroelectric

Natural Energy independence

Geothermal

Economical and safe Multipurpose heating To use existing waste To be predictable

Bioenergy

Disadvantages High investment costs Less energy absorption in winter storage problems Noise and visual pollution Jamming the signals of communication devices High investment cost Long construction period dependence on rain Causes corrosion, oxidation Calcification in equipment Consumes a lot of water Requires technology and effort causes misemployment for agricultural areas and products

Regional Examination of Energy Investments in Turkey Using an Intuitionistic. . .

185

4 Intuitionistic Fuzzy Sets (IFS) The fuzzy sets utilized to define linguistic values are called fuzzy numbers that enable us to digitize qualitative values. Fuzzy sets are a kind of classical sets. In classical clusters, 1 is defined as membership, 0 is defined as non-membership. In a fuzzy set μA , is the degree of membership of an element to the set, 1 – μA, is the degree of non-membership (Zadeh 1987). Thus, the total of the degree of membership and the degree of non-membership equals 1. However, this approach is not an effective method to address uncertainty in real-life applications since the sum of the degrees of membership and not membership may be less than one. Therefore, Atanassov presented the intuitionistic fuzzy set theory, the generalized version of the fuzzy set theory (Atanassov 1986; Rouyendegh et al. 2019). The intuitionistic fuzzy sets are more useful than fuzzy sets in the uncertainty environment, as they allow a more comprehensive assessment of the degree of membership, degree of non-membership, and degree of hesitation (Zhang and Liu 2011). Therefore, in order to meet this need, the fuzzy set theory is generalized, and intuitionistic fuzzy set theory is obtained. This approach is suggested to be used at the points where the decision-maker hesitates in real-life problems and is thought to be more suitable for achieving the correct result. A = {x, μA (x), vA (x) x ∈ X} ,

μA (x) : X → [0, 1]

vA (x) : X → [0, 1]

(1) According to the intuitionistic fuzzy set theory: ⎧ ⎨ x element to set A Degree of non − membership ⎩ Hesitation index πA(x)

⎫ μA (x) ⎬ vA (x) ⎭ π A (x)

In intuitionistic fuzzy set theory, it is shown in: 0 ≤ μA(x) + υA(x) ≤ 1

(2)

This equation also clearly shows that the fuzzy sets and the intuitionistic fuzzy sets are different.

186

P. Darende et al.

The hesitation index indicates the hesitation level of whether an x element belongs to set A and is calculated as given in Eq. (3). πA(x) = 1 − μA(x)–υA(x) 0 ≤ πA (x) ≤ 1

(3) (4)

Intuitionistic fuzzy logic is often used because its integration with multi-criteria decision-making allows for clearer decision-making in uncertain situations and judgments (Özkan Özen and Koçak 2017). For this reason, while counting the qualitative values in this work, an intuitionistic fuzzy multi-criteria decision-making (IFMCDM) method is used.

5 An Integrated Intuitionistic Fuzzy Multi-criteria Decision-Making Method (IFMCDM) In this chapter, it is aimed to choose the most appropriate one for the purposes calculated according to the criteria of the selected renewable energy resource facilities. As a result of the examinations made among the MCDM methods, a solution method was utilized by combining the TOPSIS method with intuitionistic fuzzy sets in terms of its suitability and availability for existing data.

5.1 Intuitionistic Fuzzy TOPSIS (IFTOPSIS) Brief descriptions of the symbols used in the intuitionistic fuzzy TOPSIS method are given below: R(k) λ rij (k) μij (k) ν ij (k) π ij (k) wi (k) A± S± C*

kth decision-maker’s intuitionistic fuzzy decision matrix {λ1 , λ2 , . . . , λ3 } is the weighted vector of kth decision-maker ith alternative of an intuitive value from jth criterion given by kth decision-maker jth criterion membership degree of ith alternatives to kth decision-maker jth criterion non-membership degree of ith alternatives to kth decision-maker The uncertainty degree to kth decision-maker Weight of jth criterion to kth decision-maker Positive- and negative-ideal solution Positive- and negative-ideal separation measures Value of relative closeness to ideal solution

Step 1 Determination of the weight of decision-makers (DMs). The importance of the decision-makers in which the surveys were conducted was determined according to their field of study and their degree of expertise, and these

Regional Examination of Energy Investments in Turkey Using an Intuitionistic. . .

187

levels of importance were expressed in linguistic variables. The significance weights of the decision-makers are determined using the linguistic variables determined intuitionistic fuzzy numbers and Eq. (5).    k μk + πk μkμ+v k λk =   , l   μk μk + πk μk +vk

λk ≥ 0, k = 1, 2, . . . , l ve

l 

λk = 1,

(5)

k=1

k=1

Step 2 Creating a combined intuitionistic fuzzy decision matrix based on the opinions of decision-makers. The unified decision matrix consists of combining the decision-makers’ evaluations about the alternatives. The IFWA method proposed by Xu (2007) is used in Eq. (6) to combine the decision-makers’ evaluations about the alternatives. The formula used to calculate the decision matrix is given in Eq. (6):   rij = IFWAλ rij (1) , rij (2) , . . . , rij (l) = rij (1) λ1 ⊕ rij (2) λ2 ⊕ · · · ⊕ rij (l) λl & % l  l  l  l  λ * λ * λ λ * * = 1− 1 − μij (k) k , vij (k) k , 1 − μij (k) k − vij (k) k k=1

k=1

k=1

       rij = μAi xj , vAi xj , πAi xj

k=1

(i = 1, 2, . . . , m; j = 1, 2, . . . , n) (6)

A combined intuitionistic fuzzy decision matrix R was created based on the opinions of decision-makers: ⎛  r11  ⎜ ⎜ r21  R=⎜ ⎜ . ⎝ ..  r

n1

r12 r13 · · · r1m . r22 r23 · · · .. .. .. . . .. . . . . · · · · · · · · · rnm

⎞ ⎟ ⎟ ⎟ ⎟ ⎠

Step 3 Determination of criterion weights. Each criterion has a different level of importance in the decision-making problem. Criteria weights are calculated using the opinions of individual decisionmakers. Linguistic variables are used to determine these weights. In addition, the intuitionistic fuzzy values given by each decision-maker for the criteria are combined to create the criteria weights. Criterion weights are calculated with Eq. (7):   wj = IFWAλ wj (1) , wj (2) , . . . , wj (l) = wj (1)λ1 ⊕ wj (2) λ2 ⊕ · · · ⊕ wj (l) λl % & l  l  l  l  λk * λk * λk λk * * (k) (k) (k) (k) 1 − μij vj 1 − μj vj = 1− , , − k=1

k=1

k=1

k=1

(7)

188

P. Darende et al.

Step 4 Creating a combined weighted intuitionistic fuzzy decision matrix. Considering both the weights given to the criteria and the combined intuitionistic fuzzy decisions, the combined weighted intuitionistic fuzzy decision matrix is obtained using the multiplication operator of the IFS as shown below (Atanassov 1986):  '. (  R = R ⊕ W = μ ij , V ij = X, μij μj , Vij + Vj − Vij Vj |x ∈ X

(8)

π ij = 1 − vij − vj − μij μj + vij vj

(9)

The combined weighted intuitionistic fuzzy decision matrix is shown below: ⎡   ⎢ ⎢ R = ⎢ ⎣  

    , π , v11 11  μ12 , v12 , π12  μ11 , π , π μ 22 , v22 μ 21 , v21 21 22 .. .. . .    , π , π μm1 , vm1 μ , v m1 m1 m1 m1

  rij = μ ij , vij , πij

  ⎤ , π · · · μ 1n , v1n 1n  , π ⎥ · · · μ 2n , v2n 2n ⎥ ⎥ .. .. ⎦ . .   · · · μmn , vmn , πmn

(i = 1, 2, . . . , m; j = 1, 2, . . . , n)

A weighted intuitionistic fuzzy decision matrix was obtained by using Eqs. (8) and (9). Step 5 Obtaining intuitionistic fuzzy positive-ideal solution and intuitionistic fuzzy negative-ideal solution. With reference to the TOPSIS methodology evaluation criterion, the criteria should be classified under the headings benefit (J1 ) and cost (J2 ). The intuitionistic fuzzy positive-ideal solution A* and the intuitionistic fuzzy negative-ideal solution A− are calculated as follows:  ∗  ∗ ∗ ∗ ∗ ∗ ∗ A∗ = r1 , r2 , . . . , rn ’ , rj ’ = μj ’ , vj ’ , πj ’ , j = 1, 2, . . . , n  −  −   − − − − − A− = r1 , r2 , . . . , rn , rj = μj , vj , πj , j = 1, 2, . . . , n   /  ' ( ' ( ∗ min μij |j ∈ J1 , max μij |j ∈ J2 μj =   i /  i ' ( ' ( ∗ vj = min vij |j ∈ J1 , max vij |j ∈ J2 i   /  i ' ( ' ( ' ( ' ( − πj = 1 − min μij − max vij |j ∈ J1 , 1 − max μij − min vij |j ∈ J2 i i i  / i  ' ( ' ( − μj = min μij |j ∈ J1 , max μij |j ∈ J2   i /  i ' ( ' ( − vj = min vij |j ∈ J1 , max vij |j ∈ J2 i   /  i ' ( ' ( ' ( ' ( − πj = 1 − min μij − max vij |j ∈ J1 , 1 − max μij − min vij |j ∈ J2 i

i

i

i

(10)

Regional Examination of Energy Investments in Turkey Using an Intuitionistic. . .

189

Step 6 Calculation of positive and negative separation measure. The chosen distance is utilized to measure the distance among the intuitionistic fuzzy positive-ideal solution and the intuitionistic fuzzy negative-ideal solution (Szmidt and Kacprzyk 2000). In this study, Hamming distance measure Eq. (11) was utilized to measure the distance among positive and negative intuitionistic fuzzy ideal solution: Si ∗ =

n     1 1  0 μij − μj ∗  + vij − vj ∗  + πij − πj ∗  , i = 1, 2, . . . , m 2 j =1

Si − =

n     1 1  0 μij − μj −  + vij − vj −  + πij − πj −  , i = 1, 2, . . . , m 2 j =1

(11) Step 7 The closeness coefficient of alternatives is determined for each criterion and the numerical values of the criteria are reached. The last step is to find the relative closeness coefficient of each alternative and to sort the alternatives. Equation (12) shown below is used to calculate the relative proximity coefficient: Ci ∗ =

Si − , 0 ≤ Ci ∗ ≤ 1, i = 1, 2, . . . , m Si + Si − ∗

(12)

After determining the relative closeness coefficient of each alternative, the alternatives are ranked in descending order of Ci *. The qualitative values obtained by the IFTOPSIS method are combined with the quantitative values and an alternative ranking is created using the TOPSIS method.

6 Case Study Energy is a significant factor for the social life and sustainable economic development of Turkey. Besides, the importance attached to renewable energy investments is increasing day by day in order to get rid of the image of a foreign-dependent country in terms of energy. Studies on renewable energy investments in Turkey cover the whole country; however, Turkey is divided into seven regions with different dynamics. These regions are demonstrated in the map of the regions in Fig. 4. To use clean energy efficiently, it is essential to invest in renewable energy in accordance with the dynamics and characteristics of each region. The IFTOPSIS method was applied to select alternative renewable energy resources in the regions of Turkey, and the resources to be invested in for each of these regions are listed. The alternatives used in this study may be listed as solar (A1), wind (A2), bioenergy (A3)

190

P. Darende et al.

Fig. 4 Map of the regions of Turkey Table 4 Main criteria and sub-criteria

Technical specifications

Economic

Environment

Social impact

Energy Efficiency Operating Life Installed Power Capacity Installation Cost Operation Cost Support of Government Profit Payback Period Greenhouse Gas Land Use Water Consumption Rate Environmental Damage Social Acceptance Job Opportunities Safety

hydroelectric (A4), and geothermal (A5) energy resources. In order to determine the energy resources to be selected in the regions, four main criteria (social, economic, environmental, and technical characteristics) and 15 sub-criteria were utilized (Darende 2019).

6.1 Main Criteria and Sub-criteria The expert-approved criteria which were used in the selection of alternative resources were determined from among the most frequently used criteria in studies in the literature. The main criteria and sub-criteria that were determined are shown in Table 4.

Regional Examination of Energy Investments in Turkey Using an Intuitionistic. . .

QUALITATIVE DATA

C1

C4

C5

C7

C8

C10

191

QUANTITATIVE DATA

C11

C13

C14

C2

C3

C6

C9

C12

C15

COLLECTED DATA

IFTOPSIS

TOPSIS

Fig. 5 The method process to be applied to each region

6.2 Application In this case study, alternative data for the aforementioned criteria were obtained from ministry data, official reports, or articles, while some data are expressed as fuzzy sets with the help of linguistic variables. Therefore, in our case study, the criteria were examined in two classes as qualitative and quantitative data. While quantitative data were determined using common values in ministry data, official reports, and articles, qualitative data were criteria determined by using linguistic variables. After applying the intuitionistic fuzzy multi-criteria decisionmaking method to these survey results, these qualitative data were transformed into numerical data. Finally, these values were combined with qualitative data, and an alternative order was created for each region by using the TOPSIS method. The method to be applied is shown in Fig. 5.

6.2.1 Sorting Alternatives by Means of IFTOPSIS Method Step 1 Calculation of the weights of decision-makers (DMs). The values in Table 5 were used as the linguistic decision variables during the calculation and the importance levels and weights of decision makers given in Table 6 were kept constant for each region. The calculated weights of DMs are shown in Table 6. Step 2 Creating a combined intuitionistic fuzzy decision matrix based on the opinions of decision-makers. This step was used in the group decision-making process to attain the unified intuitionistic fuzzy decision matrix with using the linguistic decision variables in Table 7. The steps are explained over the tables of the Black Sea region and the evaluations of decision-makers for the qualitative criteria are shown in Table 8.

192 Table 5 Linguistic decision variables used in determining the importance of decision makers and criteria

P. Darende et al. Linguistic term Very important Important Medium Unimportant Very unimportant

IFNs 0.90 0.75 0.50 0.20 0.05

0.05 0.20 0.40 0.75 0.90

0.05 0.05 0.10 0.05 0.05

Table 6 Linguistic decision variables used in evaluating alternatives

Importance levels of decision-makers DM1 DM2 DM3 DM4 0.292 0.172 0.244 0.292

Table 7 Linguistic decision variables for the criteria

Linguistic term Very high High Medium Low Very low

IFNs 0.80 0.60 0.50 0.25 0.15

0.10 0.25 0.50 0.6 0.75

0.10 0.15 0.00 0.15 0.10

Step 3 Determination of criterion weights. The criteria weights were determined using the linguistic decision variables in Table 5 and are shown in Table 9. Step 4 Creating the aggregated weighted intuitionistic fuzzy decision matrix. The weighted decision matrix for the Black Sea region is shown in Table 10. Step 5 Obtaining intuitionistic fuzzy positive-ideal solution (PIS) and intuitionistic fuzzy negative-ideal solution (NIS). Intuitionistic fuzzy PIS and NIS sets of the Black Sea region are demonstrated in Table 11. Step 6 Determination of positive and negative separation measures. Positive and negative separation measures calculated in Table 12 show positive and negative difference measurements (S+ and S−) for the Black Sea region. Step 7 The closeness coefficients of alternatives are calculated for each criterion, and the numerical values of the criteria are reached. The numerical values obtained for the qualitative criteria of the Black Sea region are depicted in Table 13. The data in Table 13 were combined with the data obtained from the Ministry of Energy and Natural Resources, and Table 14 was created for the Black Sea region as a result of the subsequent study. Step 8 Sorting alternatives by using TOPSIS (Table 15). Alternative energy resources to be invested in were listed using the regionally obtained C values. The method explained with the use of the Black Sea region was

Regional Examination of Energy Investments in Turkey Using an Intuitionistic. . .

193

Table 8 Evaluations of decision-makers for the qualitative criteria of the Black Sea region DMs

Alternative

DM-1

Solar Wind Bio Hydro Geothermal Solar Wind Bio Hydro Geothermal Solar Wind Bio Hydro Geothermal Solar Wind Bio Hydro Geothermal

DM-2

DM-3

DM-4

Criteria C1 C4 L L M M M H VH M VL H L M L H M H VH H VL VH L H M H H VH VH H L VH L L L M H H VH H L H

C5 L M H M H M M H M H L M M M H L M H L H

C7 L M M VH VL L L M VH VL L M H VH L L L H VH L

C8 H H M L VH H H M L VH VH M M L VH H H L L H

C10 L L M M M VL L L L L VL L VL L L VL L M L M

C11 L VL VH VH VH VL VL L M M L VL M M M VL VL L VH M

C13 H M M L M L L M M VL M H H VL M H H H L H

C14 H H H H H L L H H L L L L L L L L L L L

Table 9 Weight values of the criteria μ v π μ v π

W1 0.883 0.063 0.054 W9 0.491 0.427 0.083

W2 0.750 0.200 0.050 W10 0.491 0.427 0.083

W3 0.738 0.193 0.069 W11 0.345 0.598 0.057

W4 0.837 0.105 0.058 W12 0.649 0.294 0.057

W5 0.806 0.124 0.069 W13 0.584 0.349 0.067

W6 0.690 0.229 0.081 W14 0.637 0.290 0.073

W7 0.837 0.105 0.058 W15 0.556 0.355 0.089

W8 0.837 0.105 0.058

applied for each region. The calculations were formed by using Microsoft Excel and MATLAB, and IFTOPSIS analysis results were obtained for each region in Table 16. To put across a framework to interpret the results obtained from the regions, it was observed that solar and wind energy prevailed throughout Turkey. While solar energy was dominant in the eastern and southern regions, it was observed that the potential of wind energy was high in the western regions.

194

P. Darende et al.

Table 10 Weighted decision matrix for the Black Sea region Weighted A1 μ aggregated v decision matrix π A2 μ v π A3 μ v π A4 μ v π A5 μ v π

C1 0.221 0.625 0.154 0.350 0.573 0.077 0.491 0.386 0.122 0.706 0.157 0.137 0.181 0.687 0.132

C4 0.334 0.526 0.140 0.455 0.441 0.104 0.554 0.284 0.162 0.479 0.379 0.142 0.586 0.258 0.156

C5 0.242 0.634 0.124 0.403 0.562 0.035 0.466 0.384 0.150 0.352 0.586 0.061 0.484 0.343 0.173

C7 0.209 0.642 0.149 0.332 0.592 0.076 0.465 0.414 0.121 0.669 0.195 0.136 0.172 0.701 0.128

C8 0.554 0.284 0.162 0.483 0.370 0.147 0.366 0.577 0.057 0.209 0.642 0.149 0.632 0.222 0.146

C10 0.151 0.734 0.115 0.209 0.642 0.149 0.326 0.615 0.059 0.279 0.614 0.107 0.342 0.588 0.071

C11 0.172 0.701 0.128 0.125 0.776 0.098 0.450 0.409 0.140 0.592 0.280 0.128 0.517 0.385 0.099

C13 0.443 0.413 0.144 0.439 0.423 0.138 0.465 0.414 0.121 0.233 0.655 0.112 0.407 0.497 0.096

C14 0.314 0.521 0.165 0.314 0.521 0.165 0.368 0.463 0.169 0.368 0.463 0.169 0.314 0.521 0.165

Table 11 Positive and negative intuitionistic fuzzy ideal solution values Positive and negative intuitionistic fuzzy ideal solution r1 * r4 * r5 * r7 * r8 * A+ μ* 0.706 0.334 0.242 0.669 0.209 v* 0.157 0.525 0.633 0.194 0.642 π* 0.136 0.140 0.124 0.136 0.148 r1 − r4 − r5 − r7 − r8 − − − A μ 0.181 0.585 0.483 0.171 0.631 v− 0.686 0.258 0.343 0.700 0.222 π− 0.132 0.156 0.172 0.127 0.14

r10 * 0.151 0.733 0.115 r10 − 0.341 0.587 0.070

r11 * 0.125 0.776 0.098 r11 − 0.591 0.279 0.128

r13 * 0.465 0.412 0.121 r13 − 0.233 0.654 0.112

r14 * 0.367 0.462 0.169 r14 − 0.314 0.520 0.164

Table 12 Positive and negative difference measurements for the Black Sea region Criteria C1 C4 C5 C7 C8 C10 C11 C13 C14

A1 S+ 0.486 0.000 0.000 0.460 0.358 0.000 0.076 0.022 0.058

S− 0.061 0.267 0.290 0.059 0.078 0.191 0.421 0.242 0.000

A2 S+ 0.416 0.121 0.161 0397 0.274 0.092 0.000 0.027 0.058

S− 0.169 0.183 0.219 0.160 0.148 0.132 0.497 0.231 0.000

A3 S+ 0.229 0.241 0.250 0.219 0.156 0.175 0.367 0.001 0.000

S− 0.310 0.032 0.040 0.294 0.355 0.027 0.141 0.241 0.058

A4 S+ 0.000 0.146 0.110 0.000 0.000 0.128 0.497 0.242 0.000

S− 0.275 0.171 0.126 0.269 0.219 0.203 0.156 0.211 0.105

A5 S+ 0.530 0.267 0.290 0.506 0.423 0.191 0.392 0.084 0.058

S− 0.000 0.000 0.000 0.000 0.000 0.000 0.105 0.174 0.000

Regional Examination of Energy Investments in Turkey Using an Intuitionistic. . . Table 13 Relative proximity values of qualitative criteria to the ideal solution for the Black Sea region

C* values of qualitative criteria Criteria A1 A2 A3 C1 0.112 0.288 0.574 C4 1.000 0.601 0.116 C5 1.000 0.576 0.138 C7 0.112 0.287 0.572 C8 0.178 0.351 0.694 C10 1.000 0.590 0.133 C11 0.847 1.000 0.278 C13 0.915 0.896 0.996 C14 0.000 0.000 1.000

195

A4 1.000 0.539 0.533 1.000 1.000 0.613 0.239 0.465 1.000

A5 0.000 0.000 0.000 0.000 0.000 0.000 0.211 0.675 0.000

Table 14 Black Sea region data

Criteria   Solar     Wind      Bio      Hydro      Geo
C1         0.112     0.289     0.575    1.000      0.000
C2         16.300    264.100   53.300   8561.500   0.000
C3         25.000    25.000    20.000   30.000     25.000
C4         1.000     0.602     0.116    0.539      0.000
C5         1.000     0.576     0.139    0.533      0.000
C6         13.300    7.300     13.300   7.300      10.500
C7         0.113     0.287     0.573    1.000      0.000
C8         0.178     0.351     0.694    1.000      0.000
C9         23.000    10.000    26.000   26.000     38.000
C10        1.000     0.591     0.133    0.613      0.000
C11        0.848     1.000     0.278    0.239      0.211
C12        0.040     0.050     20.000   8.100      0.007
C13        0.915     0.896     0.997    0.466      0.675
C14        0.000     0.000     1.000    1.000      0.000
C15        0.530     0.400     1.000    0.330      2.130

Table 15 Relative proximity values with the ideal solution for the Black Sea region

        Solar   Wind    Bio     Hydro   Geothermal
S+      0.119   0.111   0.119   0.103   0.163
S−      0.139   0.119   0.116   0.143   0.103
C*      0.539   0.517   0.495   0.580   0.387
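The relative proximity values in Table 15 follow the usual closeness coefficient C* = S− / (S+ + S−). The short check below, written only for illustration, recomputes them from the tabulated separations and recovers the Black Sea ordering reported in Table 16.

```python
s_plus = {"Solar": 0.119, "Wind": 0.111, "Bio": 0.119, "Hydro": 0.103, "Geo": 0.163}
s_minus = {"Solar": 0.139, "Wind": 0.119, "Bio": 0.116, "Hydro": 0.143, "Geo": 0.103}

# Closeness coefficient of each alternative and the resulting ranking
closeness = {k: s_minus[k] / (s_plus[k] + s_minus[k]) for k in s_plus}
ranking = sorted(closeness, key=closeness.get, reverse=True)

print(closeness)  # approximately Solar 0.539, Wind 0.517, Bio 0.494, Hydro 0.581, Geo 0.387
print(ranking)    # ['Hydro', 'Solar', 'Wind', 'Bio', 'Geo']
```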


Table 16 IFTOPSIS analysis results (C*) for each region

Alternatives   Black Sea   Aegean   Marmara   Mediterranean   Eastern Anatolia   Southeastern Anatolia   Central Anatolia
Solar          0.539       0.566    0.552     0.824           0.756              0.818                   0.792
Wind           0.517       0.601    0.632     0.624           0.537              0.475                   0.547
Biomass        0.495       0.390    0.383     0.460           0.510              0.448                   0.411
Hydro          0.580       0.444    0.457     0.392           0.528              0.493                   0.399
Geothermal     0.387       0.514    0.429     0.391           0.403              0.407                   0.370
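The per-region investment orderings summarized in Fig. 6 follow directly from sorting the closeness values of Table 16; a small sketch for the Black Sea and Aegean columns is shown below.

```python
# C* values taken from Table 16 (two regions shown for brevity)
table16 = {
    "Black Sea": {"Solar": 0.539, "Wind": 0.517, "Biomass": 0.495,
                  "Hydro": 0.580, "Geothermal": 0.387},
    "Aegean":    {"Solar": 0.566, "Wind": 0.601, "Biomass": 0.390,
                  "Hydro": 0.444, "Geothermal": 0.514},
}

for region, scores in table16.items():
    order = sorted(scores, key=scores.get, reverse=True)
    print(region, "->", order)
# Black Sea -> ['Hydro', 'Solar', 'Wind', 'Biomass', 'Geothermal']
# Aegean    -> ['Wind', 'Solar', 'Geothermal', 'Hydro', 'Biomass']
```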

7 Analyses

As can be seen from the results obtained, the first alternative energy resource investment for the Black Sea region was determined to be a hydroelectric power plant. Hydroelectric power plants are perceived very negatively by people in the region because of floods in villages caused by poor implementation practices. Nevertheless, the results reveal that, owing to the high potential of the Black Sea region, a hydroelectric power plant is still the first choice in the investment prioritization. The second investment choice in the region is a solar power plant, whose relative proximity to the ideal solution is closest to that of the hydroelectric power plant. When the relative proximity values of the Central Anatolia, Southeastern Anatolia, Eastern Anatolia, and Mediterranean regions were examined, a solar power plant took first place. When the other investment alternatives in these regions were analyzed, it was concluded that wind power plants, with a high average closeness value, may be evaluated as a second investment alternative. For the Aegean and Marmara regions, located in the western part of the country, it was concluded that investment in wind power plants will be the most appropriate choice. The alternative investment rankings of the regions are shown in Fig. 6.

7.1 Sensitivity Analysis

A sensitivity analysis based on changing the weights of the main criteria was conducted to analyze the effect of these weights on the ranking of the renewable energy resources. A sample region was taken, and 24 different scenarios were obtained by varying the weights of the main criteria. In the first six scenarios, applied for the Aegean region, the technical main criterion was taken as the most important criterion, while the economic main criterion was the most important one in scenarios 7–12, the environmental criterion in scenarios 13–18, and, finally, the social criterion in scenarios 19–24. The results

Fig. 6 C* value graphs of alternatives by regions (bar charts of the relative closeness values of the solar, wind, biomass, hydro, and geothermal alternatives for the Black Sea, Aegean, Marmara, Mediterranean, Eastern Anatolia, Southeastern Anatolia, and Central Anatolia regions; the charted values correspond to those in Table 16)

obtained from these scenarios were utilized to analyze the impact of the criteria weights. The scenarios and the corresponding rankings of the renewable energy resources obtained with the IFTOPSIS method are listed in Table 17. According to the results presented in Table 17, biomass ranked fourth in the first three scenarios and third in scenarios 4–6. Although it was a partially preferable alternative when the weight of the technical criteria was high, biomass was rated as the last alternative in scenarios 7–24 because it is costly, it is not an environment-friendly option, and there are problems in the sustainability and


Table 17 The scenarios and related ranks of renewable energy sources (WC1–WC4 are the weights of the main criteria; C1*–C5* are the closeness coefficients of alternatives A1–A5; Rank lists the alternatives from first to fifth)

Sce.   WC1   WC2   WC3   WC4     C1*   C2*   C3*   C4*   C5*     Rank 1–5
1      0.395 0.232 0.196 0.177   0.579 0.541 0.491 0.476 0.565   A1 A5 A2 A3 A4
2      0.395 0.232 0.177 0.196   0.574 0.547 0.495 0.476 0.569   A1 A5 A2 A3 A4
3      0.395 0.177 0.232 0.196   0.591 0.536 0.491 0.481 0.568   A1 A5 A2 A3 A4
4      0.395 0.177 0.196 0.232   0.580 0.472 0.548 0.480 0.576   A5 A1 A3 A4 A2
5      0.395 0.196 0.232 0.177   0.590 0.497 0.534 0.480 0.564   A1 A5 A3 A2 A4
6      0.395 0.196 0.177 0.232   0.575 0.480 0.551 0.479 0.577   A5 A1 A3 A2 A4
7      0.232 0.395 0.196 0.177   0.569 0.575 0.410 0.442 0.493   A2 A1 A5 A4 A3
8      0.232 0.395 0.177 0.196   0.564 0.565 0.416 0.442 0.497   A2 A1 A5 A4 A3
9      0.177 0.395 0.232 0.196   0.580 0.595 0.369 0.441 0.484   A2 A1 A5 A4 A3
10     0.177 0.395 0.196 0.232   0.561 0.569 0.384 0.440 0.492   A2 A1 A5 A4 A3
11     0.196 0.395 0.232 0.177   0.580 0.597 0.378 0.441 0.484   A2 A1 A5 A4 A3
12     0.196 0.395 0.177 0.232   0.554 0.559 0.399 0.441 0.496   A2 A1 A5 A4 A3
13     0.196 0.232 0.395 0.177   0.668 0.629 0.340 0.473 0.493   A2 A1 A5 A4 A3
14     0.177 0.232 0.395 0.196   0.658 0.667 0.332 0.472 0.493   A2 A1 A5 A4 A3
15     0.232 0.177 0.395 0.196   0.629 0.653 0.370 0.479 0.506   A2 A1 A5 A4 A3
16     0.196 0.177 0.395 0.232   0.668 0.599 0.353 0.478 0.505   A1 A2 A5 A4 A3
17     0.232 0.196 0.395 0.177   0.669 0.612 0.367 0.478 0.502   A1 A2 A5 A4 A3
18     0.177 0.196 0.395 0.232   0.668 0.607 0.340 0.475 0.501   A1 A2 A5 A4 A3
19     0.177 0.232 0.196 0.395   0.576 0.463 0.433 0.466 0.547   A1 A5 A4 A2 A3
20     0.196 0.232 0.177 0.395   0.572 0.455 0.445 0.466 0.551   A1 A5 A4 A2 A3
21     0.196 0.177 0.232 0.395   0.589 0.464 0.436 0.471 0.551   A1 A5 A4 A2 A3
22     0.232 0.177 0.196 0.395   0.578 0.449 0.483 0.472 0.560   A1 A5 A4 A2 A3
23     0.177 0.196 0.232 0.395   0.588 0.469 0.427 0.470 0.547   A1 A5 A4 A2 A3
24     0.232 0.196 0.177 0.395   0.573 0.467 0.466 0.470 0.560   A1 A5 A4 A2 A3

predictability of the resource. Geothermal energy took first place in the scenarios where technical and social criteria gained importance, such as scenarios 4 and 6, and second place in the other technically weighted scenarios; it was the third option in scenarios 7–18 and the second option in scenarios 19–24. However, particularly considering the environmental damage and social acceptance levels of the resources, hydroelectric power plants, like biomass plants, were among the poorer choices: hydro took fourth place in scenarios 7–18 and third place in scenarios 19–24. The main disadvantages of solar energy are that it is not an economical option, since its investment cost is very high and electricity is still not produced efficiently; for this reason, it fell to second place in the scenarios where economic criteria prevailed, while it generally ranked first in the remaining scenarios. Wind energy was the other alternative observed in first place in the region: it ranked first in scenarios 7–15, where the weight of the economic criteria was high. The most important reason for this may be the reduction in turbine costs as a result of shifting to domestic


production. Wind energy ranked second in scenarios 16–18 and third in scenarios 1–3. In the scenarios where social acceptance gained importance, it lost its investment priority and was observed in fourth place.

8 Conclusion

Rapid population growth and developments in technology and industry are increasing the demand for energy resources. The most prominent characteristics of Turkish energy policy are the ever-increasing demand and the foreign dependence in the supply of energy resources. It is possible to reduce this foreign dependency and achieve a more independent energy supply: Turkey is a geographically convenient country, rich in renewable energy resources and in the diversity of those resources, and it is anticipated that most of the energy needed can be produced as a result of investments and the establishment of the necessary infrastructure. This study assessed the renewable energy resources to invest in for the seven geographical regions of Turkey by using different criteria. The assessment was made for each region using the data of that region, and the most suitable renewable energy resources were determined. Because Turkey is a developing country, its energy needs, including electricity and heating, will increase in the coming years, and the government must favor renewable energy resources over time due to the negative effects of fossil fuels on human health and the environment, in addition to the decrease in fossil fuel resources. For this reason, the criteria were determined first, and both quantitative and qualitative criteria were included. The purpose of this study was to present a decision-making procedure to determine which renewable energy resource should be used in the relevant region. In the presented procedure, geothermal, solar, biomass, hydroelectric, and wind energy sources were evaluated by using the intuitionistic fuzzy TOPSIS method, and the investment priorities of the selected facilities were ranked by using the obtained data, the criteria, and expert opinions for the relevant region. According to the results obtained throughout Turkey, wind energy was prevalent in the western regions of the country, while solar power plants were identified as the priority investment advice in the other regions. Despite the negative social views in the Black Sea region, it is anticipated that investments should still be made in hydroelectric power plants. Additionally, in the 24 scenarios examined according to the importance of the criteria, wind energy was determined as the first investment option in the scenarios where economic criteria were important, while solar power plants were preferred in the scenarios where environmental and social acceptance criteria were highly important. For future research, the existing data may be analyzed with other multi-criteria decision-making methods, and more accurate and significant results may be achieved by comparing the outputs.



Small Series Fashion Supplier Selection Using MCDM Methods

Nitin Harale, Sebastien Thomassey, and Xianyi Zeng

Abstract In fashion supply chain management, the selection of suppliers is a critical and integral part of overall business processes. With the advent of e-commerce business models, it is increasingly challenging to select the appropriate criteria, among many conflicting, inconsistent, and contradictory ones, on which the supplier selection problem can be solved. In this chapter, the authors address this challenge by using MCDM methods, namely AHP, TOPSIS, and Fuzzy-TOPSIS. The research problem revolves around the challenge of selecting the best suppliers, considering a range of criteria, from the retailers' point of view. A questionnaire-based survey approach is adopted to evaluate and identify highly important qualitative and quantitative criteria. The respondents of the survey include supply chain managers from four fashion companies in Europe that produce a range of customized fashion and luxury products. Furthermore, the participants evaluated their suppliers on each relevant criterion using linguistic scales. It is demonstrated that the results from this study provide a suitable approach for dealing with complex supplier selection criteria and the pool of competing suppliers in the market. The authors envisage that this chapter will provide valuable insights and contribute to enhanced group decision-making in the fashion industry with regard to supplier selection.

1 Introduction

The fashion industry operates in both mass scale and small series production frameworks. While mass scale fashion production strategies are continually developing, small series production, wherein individual consumer-level customization


is emphasized, is significantly challenging for the development of robust production and supply chain network and configuration strategies (Macchion et al. 2015). Reliable suppliers are the backbone and major drivers of a company's overall business success. For companies to survive in fierce market competition, it is indispensable to choose suppliers in an effective, efficient, and profitable way. Short product life cycles, rapidly changing fashion trends, and ever-changing consumer preferences make it highly challenging for fashion companies to take effective procurement decisions. Fashion companies not only need to ensure that they have the best suppliers but also have to evaluate supplier performance from time to time based on various performance indicators. Moreover, textile companies are constantly required to comply with various environmental and market regulations, owing to which the number of criteria for supplier selection varies. More often than not, these criteria are complex in terms of their relationships with one another and their relative importance.

The fashion industry has been undergoing a significant transformation in recent years, given global competition and the emergence of new markets. In addition, knowledge of consumer choices is, more than ever, becoming critical for the development and growth of the fashion industry as a whole. Supplier selection and supply chain performance evaluation are an integral part of a fashion company's business operations and are complex processes, as they involve careful analysis of a number of key criteria. All these factors make supplier selection a multi-criteria decision-making problem within fashion supply chain management (Yang et al. 2008). Although supplier selection has largely been a major focus of research in fashion supply chain management, the rapid transformation of fashion business models owing to the digital revolution poses significant challenges for the management of fashion retail companies. Many studies point out that the key to surviving fierce market competition is to be more adaptive and flexible in adjusting to changing consumer preferences (Lesisa et al. 2018).

In recent years, there has been continuous growth of research and innovation in information technology, which has opened up many more possibilities than were available a few decades ago. Decision-making in the fashion industry must be done in real time, given the rapid changes in market trends. To make this possible, supply chain managers need access to useful information from the customers' and suppliers' sides. Advanced big data analytics integrated with information management systems allow managers to gauge the market situation and to identify important factors for making decisions, including supplier selection (Banica and Hagiu 2016). As a result, fashion companies are now able to tap into their data repositories to derive significant insights about consumers in order to develop supply chain strategies that enable them to quickly adapt and respond to market forces.

Supplier selection, in general, is a multi-criteria and multi-objective decision-making problem which has been addressed in a plethora of studies in operational research, industrial engineering and management, and production research (Wen et al. 2018). Multi-criteria decision methods (MCDM)


are widely popular and used for solving supplier selection problems, as they can address the conflicts between selected criteria and map their relative importance (Kahraman et al. 2015). However, the conventional criteria and methodologies used for solving supplier selection problems do not perform well given the rapid transformation of the fashion industry due to changing consumer behavior, data-driven business models, and digital market trends. In this chapter, our main objective is to identify new criteria relevant to digital fashion business models and use them in MCDM methods for selecting the best suppliers. In this study, the supplier selection problem is situated in a small series fashion production framework. We conducted a thorough literature study to identify potential criteria and studied expert opinions about their relevance for supplier selection in a digital business framework. This chapter also contributes to the digitalization of supply chains in small series fashion business processes.

2 Literature Review

A vast research literature exists on supplier selection problems, and it has been growing continuously (Karsak and Dursun 2016). Our literature study focuses on supplier selection as a general supply chain management problem across various industries.

2.1 General Overview of Supplier Selection

Supplier selection is one of the major decisions made in fashion supply chain management, and it significantly influences other business decisions and strategies. Supplier selection entails four major stages: (1) choosing the subcontracting method, (2) preliminary screening of possibly suitable suppliers based on various important criteria, (3) ranking of suppliers, and (4) selection of the best suppliers (Weele 2010). A detailed descriptive analysis of the stepwise processes involved in supplier selection in various industries is presented in the study of Taherdoost and Brard (2019). Supplier selection has been one of the most researched subjects in fashion production and supply chain management. A plethora of methods, including optimization methods, has been developed and presented in many studies (e.g., Teng and Jaramillo 2005; Liu et al. 2019; Chai and Ngai 2015) to address supplier selection problems taking various criteria into account. Yildiz (2016) applied interval type-2 Fuzzy-TOPSIS and Fuzzy-TOPSIS methods to select the best supplier among available garment supplier alternatives for a Turkish fashion company. However, there is a significant dearth of literature addressing supplier selection problems using MCDM methods with a main focus on the fashion industry.


Many studies have explored the problem of supplier selection based on multiple qualitative and quantitative criteria (Liao and Kao 2010; Ku et al. 2010). Detailed studies of the criteria to be included in supplier evaluation analysis, not only from an economic aspect but also from environmental and social aspects, can be found in Vieira et al. (2016) and Winter and Lasch (2016). However, the criteria to which multi-criteria decision methods have been applied are traditional ones and do not reflect the changes in the fashion industry due to the advent of new regulatory frameworks and digital technology.

2.2 Multi-criteria Decision Methods for Supplier Selection

Multi-criteria decision-making (MCDM) methods are widely used for evaluating different solutions or alternatives from multiple aspects and for making recommendations based on this evaluation. The supplier selection problem is regarded as an MCDM problem since it involves the evaluation of a finite set of potential suppliers based on a multitude of criteria and their ranking according to score calculations. The most popular MCDM methods are AHP and ANP (known as multi-attribute utility methods); the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) and Multi-criteria Optimization and Compromise Solution (VIKOR); and Elimination and Choice Expressing Reality (ELECTRE). A detailed description of these methods can be found in Lee and Chang (2018). There also exist several integrated MCDM methods, such as Fuzzy-AHP, Fuzzy-TOPSIS, and Fuzzy-ANP, which are primarily based on fuzzy set theory (Kaya et al. 2019). An integrated model combining the Analytic Hierarchy Process (AHP) and a linear physical programming (LPP) model to select suppliers is presented in the study by Kumar et al. (2018). A highly efficient hybrid supplier selection model based on Data Envelopment Analysis (DEA), a mathematical programming based method, is presented by Alikhani et al. (2019); it incorporates both desirable and undesirable criteria defined as risk factors, and the results indicate that analyzing suppliers only on subjective risk criteria, separately from quantitative information, is not effective and leads to wrong decisions. Another interesting study, by Guarnieri and Trojan (2019), employed the Copeland method, which computes aggregated criteria weights; the AHP method to calculate individual criteria weights; and the ELECTRE-TRI method for sorting suppliers based on social, ethical, and environmental criteria. Using a broad supplier selection framework, new methods that have not been applied in traditional supply chains, along with methods used for all the phases of supplier selection in a range of industries, are presented in the form of a detailed literature review by de Boer et al. (2001). Chai et al. (2013) presented a systematic review of the vast literature on multi-criteria decision approaches applied to supplier selection problems in different industries and classified these approaches into seven broad categories based on various uncertainties and risk factors.


A detailed literature review of various multi-criteria decision-making (MCDM) methods for supplier evaluation and selection is presented in Ho et al. (2010). This review covers individual methods, such as Data Envelopment Analysis (DEA), mathematical programming, AHP, case-based reasoning, ANP, fuzzy set theory, the simple multi-attribute rating technique, and genetic algorithms, as well as several hybrid approaches that have been prevalently used in supplier evaluation and selection, and it discusses in detail the prevalent criteria for supplier evaluation and the limitations of the MCDM methods. The major limitation identified in that review is that the criteria selected for supplier evaluation and selection did not have any influence on companies' business strategies and goals. With changing business frameworks and digital technology, it has become indispensable for decision makers to incorporate those criteria which have a high influence on companies' business goals, which is the main focus of our study. A Fuzzy-AHP method is proposed in the study by Lin and Twua (2012) as a multi-criteria decision methodology to select the best fashion trend alternatives based on their evaluation by experts on various selection criteria. Several other MCDM methods, such as ANP, DEMATEL, and VIKOR, have been presented in Lin and Jerusalem (2016) and are used specifically for developing decision solutions for various fashion design schemes. It is evident that the majority of decisions that are integral to the fashion industry as a whole are formulated as multi-criteria decisions, as they involve multiple alternatives, for example, a variety of design schemes, and various criteria for evaluating these alternatives; this makes decision-making in the fashion industry more complex and challenging in nature. In another interesting study, by Amin et al. (n.d.), SWOT analysis is integrated with fuzzy linear programming to solve supplier selection as well as sourcing quantity decision problems. Many optimization-based methods can handle only quantitative information and have difficulty dealing with qualitative criteria; therefore, supplier selection decisions based on conventional methods are not always effective. In the age of digital business models, fashion companies are striving to take advantage of emerging big data tools and technology to improve their business performance and growth. In the context of small series fashion production, problems arising from uncertainties in supply chain network configuration and the increasing pace of change in consumer preferences lead to the need to revisit the supplier selection problem and explore new relevant factors that can drive effective sourcing decision-making. Modern information processing technology enables fashion companies to track the norms that drive consumer shopping patterns as well as market dynamics. There is a dearth of literature that tackles these trends in the fashion industry. This study is aimed at filling this lacuna in the study of supplier selection in the context of data-driven digital fashion business models and small series customized apparel production. One of the key objectives of this study is to identify important criteria, both quantitative and qualitative, from the point of view of sourcing managers working in fashion industries.


As outlined above, our main focus in this study is on sourcing decision-making in terms of selecting the best suppliers for small series fashion production. Through the literature study, we find that conventional approaches are more suitable for mass fashion supply chains and production, and are not quite effective for the problem that we aim to address. Given the rapid changes in the fashion market, our study will help improve flexibility and address uncertainties related to supplier selection and, thereby, general fashion business processes. The outcomes of this study also respond to the growing attention and demand from fashion retailers and supply chain managers for new supplier selection approaches.

3 Methodology

Supplier selection, in general, entails two main steps: first, identifying the potential criteria based on which the best suppliers are to be chosen for sourcing, and second, applying MCDM methods to classify and rank the suppliers, from which the top-ranked suppliers can be selected for making business deals. For the first step, in line with Dickson (1966) and Weber et al. (1991), we adopted a questionnaire-based survey approach to derive opinions and consensus from supply chain experts and managers from four European fashion companies on the most important supplier selection criteria from the perspective of new, emerging digital fashion business models. The questionnaire was formulated in such a way that managers were able to agree on the relevant supplier selection criteria and evaluate their relative importance with respect to each other. For the second step, we implemented the MCDM models AHP, TOPSIS, and Fuzzy-TOPSIS. AHP is one of the most widely applied methods for solving relative measurement problems; its popularity is due to its ability to compare the relative performance of actions or alternatives based on evaluation criteria. The AHP method was introduced by Saaty (1977) and has since been applied to solve a plethora of complex MCDM problems. We employed AHP especially for computing the supplier selection criteria weights. The choice of the TOPSIS and Fuzzy-TOPSIS methods is corroborated by the fact that these models are developed on the basis of fuzzy set theory (Zadeh 1965, 1973) and possess a unique ability to handle the fuzzy human judgments involved in MCDM problems. In our study, we sought industry experts' judgments on important criteria for supplier selection and on the evaluation of suppliers based on these criteria; such judgments are highly likely to suffer from vagueness when relative evaluations are made, which justifies the selection of the aforementioned methods. Moreover, these methods are well capable of aggregating multiple decision makers' evaluations of alternatives and ranking them based on computed scores. There is a vast literature (see, e.g., Munier et al. 2019; Zavadskas et al. 2014) that outlines the theoretical and mathematical formulations of all MCDM


methods, including those applied in this study. Therefore, the detailed mathematical formulations of the chosen MCDM methods are beyond the scope of this chapter.

4 Experimental Results

4.1 Supplier Selection Criteria

In order to select highly relevant criteria for fashion supplier selection from the strategic point of view of current market and customer trends and digital technology, we sought inputs from decision makers composed of sourcing managers from four European fashion companies. We conducted a face-to-face survey involving two supply chain experts from each of the four e-commerce fashion companies, which specialize in small series customized products such as customized shirts, women's bags, all-season multifunctional jackets, and customized luggage trolleys. As a data security principle, the respondents were anonymized. In total, eight respondents were involved in the survey. The main question we incorporated in the survey questionnaire was: What are the most important criteria, from the perspective of your current business strategies and goals and the digital fashion supply chain, that you use for selecting your suppliers? The respondents were presented with the list of 15 criteria shown in Table 1, which we preselected based on our literature study, and they were asked to select those that they consider highly important for their business strategies, needs, and goals. Respondents were

Table 1 Supplier selection criteria from literature

Criterion (C)                  Type           Evaluation score   Selected?
Cost (C1)                      Quantitative   1.375              Yes
Market reputation (C2)         Qualitative    10.043             No
Service efficiency (C3)        Qualitative    11.831             No
Management efficiency (C4)     Qualitative    12.733             No
R&D facilities (C5)            Qualitative    11.927             No
Late delivery (C6)             Qualitative    10.953             No
Lead time (C7)                 Quantitative   3.625              Yes
Quality (C8)                   Qualitative    3.871              Yes
Flexibility (C9)               Qualitative    4.608              Yes
Operational efficiency (C10)   Quantitative   5.272              Yes
Innovation (C11)               Qualitative    5.935              Yes
Trust (C12)                    Qualitative    7.282              Yes
Location (C13)                 Qualitative    9.064              Yes
Digitization (C14)             Qualitative    10.237             No
Sustainability (C15)           Qualitative    9.862              Yes


Table 2 Illustration of mean value computation for Cost and Lead time

Respondent     Number assigned for Cost   Number assigned for Lead time
Respondent 1   1                          4
Respondent 2   1                          2
Respondent 3   3                          2
Respondent 4   2                          3
Respondent 5   1                          3
Respondent 6   1                          6
Respondent 7   1                          5
Respondent 8   1                          4
Mean           11/8 = 1.375               29/8 = 3.625

asked to select all the criteria from the list of 15 that they felt were important for making decisions regarding supplier selection and to number the criteria based on their importance, with "1" signifying the most important and "15" the least important. To identify the top criteria of the list, we used the mean of the numbers assigned by the respondents to each criterion and selected the nine criteria with the smallest means. The criterion with the smallest mean value of all the numbers assigned by the respondents (for example, Cost) is selected as the most important criterion, and, likewise, the criterion with the ninth-lowest mean value (for example, Sustainability) is the ninth most important criterion in the list. The computation of the mean values for the criteria Cost and Lead time is illustrated in Table 2.
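A minimal sketch of this screening step (collect each respondent's 1–15 numbering, average it per criterion, and keep the nine criteria with the smallest means) is shown below; only the Cost and Lead time columns reported in Table 2 are listed, and the remaining criteria would be added in the same way.

```python
rankings = {
    "Cost":      [1, 1, 3, 2, 1, 1, 1, 1],   # eight respondents, Table 2
    "Lead time": [4, 2, 2, 3, 3, 6, 5, 4],
    # ... the other 13 criteria of Table 1 would be listed the same way
}

# Mean rank per criterion
means = {c: sum(v) / len(v) for c, v in rankings.items()}
print(means)  # {'Cost': 1.375, 'Lead time': 3.625}

# Keep the nine criteria with the smallest mean ranks
selected = sorted(means, key=means.get)[:9]
```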

4.2 MCDM Methods

4.2.1 AHP for Criteria Ranking

In the next step, we compute the relative importance of the nine supplier selection criteria using the AHP method, which is widely used for solving multi-criteria decision-making problems in various industries (Vaidya and Kumar 2006). To proceed with AHP, we first asked one of the industry experts to evaluate the relative importance of these criteria with respect to each other, keeping in mind their business strategies and the main goal of selecting the best suppliers. To collect the evaluation inputs, we prepared a questionnaire (shown in the Appendix) in which we asked the respondent to fill in the pairwise comparison matrix using the linguistic scale presented in Table 3, proposed by Saaty (1977), which has been used in many previous studies. The evaluation of the supplier selection criteria given in Table 1 by one of the industry experts constituted the pairwise comparison matrix (PCM) shown in Table 4, which was then checked for consistency as per the AHP steps.

Table 3 Saaty's pairwise comparison scale

Linguistic description    Numerical value
Equally important         1
Moderately important      3
Strongly important        5
Very strongly important   7
Extremely important       9

Finally, we computed the priority vector of the criteria, whose values are normalized values of the PCM calculated by the geometric mean method, one of the several normalization methods used for priority vector calculation. The computed scores of the criteria are given in Table 5. The priority scores lie between 0 and 1 and can be read as percentage weights: for example, the priority score of the criterion Cost is 0.27, which can be interpreted as a 27% weight in terms of importance. Based on the computed priority scores, the criteria are ranked as shown in Fig. 1, from which it can be inferred that the top three most important supplier selection criteria are Cost, Lead time, and Quality.
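As a rough illustration of this computation, the sketch below derives priority weights from the pairwise comparison matrix of Table 4 by the geometric mean method and also reports Saaty's consistency ratio. Since the exact normalization used by the authors is not fully specified, the resulting weights approximate, rather than exactly reproduce, the values in Table 5, although the resulting ordering is very similar.

```python
import numpy as np

criteria = ["Cost", "Lead time", "Quality", "Flexibility", "Op. efficiency",
            "Innovation", "Trust", "Location", "Sustainable ops"]

# Pairwise comparison matrix from Table 4 (row criterion vs column criterion)
A = np.array([
    [1,   1,   7,   7,   5,   7,   7,   9,   5],
    [1,   1,   5,   7,   5,   3,   5,   7,   5],
    [1/7, 1/5, 1,   7,   7,   5,   9,   9,   5],
    [1/7, 1/7, 1/7, 1,   5,   3,   5,   3,   5],
    [1/5, 1/5, 1/7, 1/5, 1,   3,   5,   3,   5],
    [1/7, 1/3, 1/5, 1/3, 1/3, 1,   7,   7,   5],
    [1/7, 1/5, 1/9, 1/5, 1/5, 1/7, 1,   7,   5],
    [1/9, 1/7, 1/9, 1/3, 1/3, 1/7, 1/7, 1,   3],
    [1/5, 1/5, 1/5, 1/5, 1/5, 1/5, 1/5, 1/3, 1],
])

# Geometric-mean prioritization
gm = A.prod(axis=1) ** (1.0 / A.shape[0])
weights = gm / gm.sum()

# Saaty's consistency ratio, with the random index for n = 9 taken as 1.45
lam_max = (A @ weights / weights).mean()
ci = (lam_max - A.shape[0]) / (A.shape[0] - 1)
cr = ci / 1.45

print(dict(zip(criteria, weights.round(3))))
print("CR =", round(cr, 3))
```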

4.2.2 TOPSIS Method for Supplier Ranking (Single Expert Decision-Making)

Based on the criteria evaluation by AHP, we further applied the TOPSIS method to evaluate suppliers against each criterion. We have ten fashion suppliers, denoted alphabetically as Supplier A, Supplier B, ..., Supplier J, to be evaluated based on the supplier selection criteria identified from the result of the AHP method. The candidate suppliers are located in the European zone, and they provide raw materials such as fabrics and other accessories, as well as customization services, for the customized products (mentioned in Sect. 4.1) sold by a small series fashion retail company. For this method, we focused on a fashion retail brand that provides customized shirts to its customers; for this retailer, suppliers provide fabric, fabric cutting, stitching, customized operations on shirts, and so on. We involved only one expert in the evaluation process for ranking the suppliers. The decision matrix, including the criteria scores calculated by AHP (from Table 5) and the evaluation made by one industry expert, is shown in Fig. 2. In this decision matrix, the values in the first two columns, Cost (Euro/meter) and Lead time (days), are simulated data points, while the values in the remaining columns (highlighted in faint blue in the figure) are the evaluation inputs of the industry expert, a supply chain manager, given on the Likert scale shown in Fig. 3. We collected these inputs in a face-to-face interview in which the supply chain manager was asked to fill in the empty decision matrix on an A4 sheet. For the sake of simplification, we do not include the intermediate stepwise results, which comprise the weighted normalization of the TOPSIS decision matrix, the computation of ideal scores (positive ideal solutions) and anti-ideal scores (negative ideal solutions),

Table 4 Pairwise comparison matrix for criteria evaluation (each entry expresses the importance of the row criterion relative to the column criterion)

                          Cost   Lead t.  Qual.  Flex.  Op. eff.  Innov.  Trust  Loc.  Sust.
Cost                      1      1        7      7      5         7       7      9     5
Lead time                 1      1        5      7      5         3       5      7     5
Quality                   1/7    1/5      1      7      7         5       9      9     5
Flexibility               1/7    1/7      1/7    1      5         3       5      3     5
Operational efficiency    1/5    1/5      1/7    1/5    1         3       5      3     5
Innovation                1/7    1/3      1/5    1/3    1/3       1       7      7     5
Trust                     1/7    1/5      1/9    1/5    1/5       1/7     1      7     5
Location                  1/9    1/7      1/9    1/3    1/3       1/7     1/7    1     3
Sustainable operations    1/5    1/5      1/5    1/5    1/5       1/5     1/5    1/3   1


Table 5 Criteria weights

Criteria                  Weights
Cost                      0.271166004
Lead time                 0.224977768
Quality                   0.171336758
Flexibility               0.089177084
Operational efficiency    0.070806166
Innovation                0.076452702
Trust                     0.048976684
Location                  0.024720476
Sustainable operations    0.022386358

Fig. 1 AHP ranking of supplier selection criteria

and, finally, the computation of the closeness coefficient values, based on which the ten suppliers are ranked as shown in Fig. 4.
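Since the decision matrix of Fig. 2 is not reproduced here, the sketch below uses a small hypothetical matrix and hypothetical weights; it only illustrates the standard TOPSIS steps summarized above (vector normalization, weighting, ideal and anti-ideal solutions, and closeness coefficients), not the authors' exact implementation.

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """Rank alternatives with classic TOPSIS.
    matrix : alternatives x criteria, weights : criteria weights,
    benefit: True for benefit criteria, False for cost criteria."""
    m = np.asarray(matrix, dtype=float)
    norm = m / np.sqrt((m ** 2).sum(axis=0))        # vector normalization
    v = norm * np.asarray(weights)                  # weighted normalized matrix
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
    anti = np.where(benefit, v.min(axis=0), v.max(axis=0))
    s_plus = np.sqrt(((v - ideal) ** 2).sum(axis=1))
    s_minus = np.sqrt(((v - anti) ** 2).sum(axis=1))
    return s_minus / (s_plus + s_minus)             # closeness coefficients

# Hypothetical 3-supplier example: columns = Cost (EUR/m), Lead time (days), Quality (1-7)
scores = topsis([[4.0, 10, 5], [5.5, 7, 7], [3.5, 14, 3]],
                weights=[0.5, 0.3, 0.2],            # illustrative weights only
                benefit=np.array([False, False, True]))
print(scores)  # higher closeness coefficient = better ranked supplier
```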

4.2.3 Fuzzy-TOPSIS Method for Supplier Ranking (Group Decision-Making)

The results from the TOPSIS method are based on a single expert's evaluation. We therefore also carried out group decision-making using the Fuzzy-TOPSIS method, involving four industry experts' evaluations of the same suppliers that were evaluated with TOPSIS. As the evaluations of multiple experts are often vague and laden with subjective information, we applied the Fuzzy-TOPSIS method, which draws on fuzzy set theory. For our analysis, we employed


Fig. 2 Decision matrix for TOPSIS

Fig. 3 Likert scale

Linguistic description   Numerical value
Worst                    1
Good                     3
Better                   5
Best                     7

triangular fuzzy numbers (TFNs), which are widely used for group decision-making. For the detailed mathematical formulation of the Fuzzy-TOPSIS method, the comparative study by Ertuğrul and Karakaşoğlu (2008) can be consulted. In the first step of the implementation of Fuzzy-TOPSIS, the group of four industry experts evaluated the supplier selection criteria using the fuzzy evaluation scale (0–1 range) shown in Table 6. The four experts, as decision makers, are denoted by DM1, DM2, DM3, and DM4, respectively, and their evaluation values, expressed as TFNs, form the fuzzy decision matrix for criteria evaluation given in Fig. 5. In the second step, similarly to the criteria evaluation, we asked the decision makers to evaluate the ten suppliers on each criterion. For the supplier rating, we used the fuzzy rating scale (1–10 range) shown in Table 7. The decision matrix shown in Fig. 6 includes the experts' evaluations for the criterion "Cost" in terms of


Fig. 4 Supplier ranking by TOPSIS

Table 6 Linguistic scale for criteria evaluation

Linguistic description   Triangular fuzzy number
Very low                 (0, 0, 0.1)
Low                      (0, 0.1, 0.3)
Medium low               (0.1, 0.3, 0.5)
Medium                   (0.3, 0.5, 0.7)
Medium high              (0.5, 0.7, 0.9)
High                     (0.7, 0.9, 1.0)
Very high                (0.9, 1.0, 1.0)

Fig. 5 Decision matrix for criteria evaluation

Criteria                 DM1              DM2              DM3              DM4
Cost                     (0, 0, 0.1)      (0.5, 0.7, 0.9)  (0.7, 0.9, 1.0)  (0.5, 0.7, 0.9)
Lead time                (0.7, 0.9, 1.0)  (0.9, 1.0, 1.0)  (0, 0, 0.1)      (0.1, 0.3, 0.5)
Quality                  (0, 0.1, 0.3)    (0.1, 0.3, 0.5)  (0.7, 0.9, 1.0)  (0.1, 0.3, 0.5)
Flexibility              (0, 0.1, 0.3)    (0.7, 0.9, 1.0)  (0, 0, 0.1)      (0.5, 0.7, 0.9)
Operational efficiency   (0, 0.1, 0.3)    (0.3, 0.5, 0.7)  (0, 0, 0.1)      (0.9, 1.0, 1.0)
Innovation               (0, 0.1, 0.3)    (0.9, 1.0, 1.0)  (0.3, 0.5, 0.7)  (0, 0.1, 0.3)
Trust                    (0.9, 1.0, 1.0)  (0.5, 0.7, 0.9)  (0.5, 0.7, 0.9)  (0.5, 0.7, 0.9)
Location                 (0, 0.1, 0.3)    (0, 0, 0.1)      (0.3, 0.5, 0.7)  (0.1, 0.3, 0.5)
Sustainable operations   (0.5, 0.7, 0.9)  (0.1, 0.3, 0.5)  (0.5, 0.7, 0.9)  (0.3, 0.5, 0.7)


Table 7 Linguistic scale for supplier rating

Linguistic description   Triangular fuzzy number
Very low                 (0, 0, 1)
Low                      (0, 1, 3)
Medium low               (1, 3, 5)
Medium                   (3, 5, 7)
Medium high              (5, 7, 9)
High                     (7, 9, 10)
Very high                (9, 10, 10)

Fig. 6 Example of decision matrix for supplier rating for the "Cost" criterion

Supplier     DM1          DM2          DM3          DM4
Supplier A   (9, 10, 10)  (7, 9, 10)   (5, 7, 9)    (0, 1, 3)
Supplier B   (3, 5, 7)    (5, 7, 9)    (5, 7, 9)    (0, 0, 1)
Supplier C   (0, 1, 3)    (0, 1, 3)    (1, 3, 5)    (3, 5, 7)
Supplier D   (7, 9, 10)   (9, 10, 10)  (1, 3, 5)    (0, 1, 3)
Supplier E   (0, 1, 3)    (9, 10, 10)  (7, 9, 10)   (5, 7, 9)
Supplier F   (0, 1, 3)    (9, 10, 10)  (1, 3, 5)    (1, 3, 5)
Supplier G   (1, 3, 5)    (9, 10, 10)  (5, 7, 9)    (1, 3, 5)
Supplier H   (1, 3, 5)    (9, 10, 10)  (7, 9, 10)   (9, 10, 10)
Supplier I   (0, 0, 1)    (9, 10, 10)  (5, 7, 9)    (0, 1, 3)
Supplier J   (5, 7, 9)    (1, 3, 5)    (7, 9, 10)   (1, 3, 5)

TFNs. Since we have nine supplier selection criteria, nine decision matrices similar to the one shown in Fig. 6 are formed and fed to the Fuzzy-TOPSIS model. The intermediate steps entailed in the implementation of Fuzzy-TOPSIS are the aggregation of the criteria weights; the computation of the fuzzy weighted normalized decision matrix for the supplier ratings; the computation of the Fuzzy Positive Ideal Solution (FPIS) and the Fuzzy Negative Ideal Solution (FNIS); the computation of the distance of each supplier from the FPIS and FNIS; and, finally, the computation of the closeness coefficient of each supplier. Based on the final closeness coefficient values, we ranked all ten suppliers; the ranking is shown in Fig. 7. Comparing the supplier ranking from Fuzzy-TOPSIS with the TOPSIS supplier ranking shows that the ranking from the single expert evaluation differs significantly from that of the group decision-making. The top three suppliers from the TOPSIS method are Supplier J, Supplier F, and Supplier A, while from the Fuzzy-TOPSIS method the top three are Supplier F, Supplier B, and Supplier C. Moreover, the three worst suppliers from the TOPSIS method are Supplier C, Supplier


Fig. 7 Supplier ranking by Fuzzy-TOPSIS method

E, and Supplier B, while from the Fuzzy-TOPSIS method the three worst suppliers are Supplier D, Supplier J, and Supplier G. This significant variation between the rankings produced by TOPSIS and Fuzzy-TOPSIS can be attributed to the fact that only one decision maker was involved in the supplier evaluation for TOPSIS, while four experts were involved in the evaluation for Fuzzy-TOPSIS. A further, more fundamental difference lies in the different normalization methods that the two methods involve. This is in line with established findings in the MCDM area that different methods can yield different results. Furthermore, the ranking of suppliers may vary further if we increase or decrease the number of decision makers, the number of suppliers to be evaluated, or the number of supplier selection criteria.
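To make the intermediate steps more concrete, the sketch below shows one common way, assumed here since the chapter does not spell out its operators, of aggregating several decision makers' triangular fuzzy ratings and of measuring distances to fuzzy reference values with the vertex method; the example ratings are the four experts' "Cost" ratings of Supplier A from Fig. 6, and the reference TFN is purely illustrative.

```python
import math

def aggregate_tfn(ratings):
    """Aggregate triangular fuzzy ratings (l, m, u) from several decision
    makers: min of lower bounds, mean of modes, max of upper bounds
    (a frequently used rule; the authors' exact operator is not stated)."""
    ls, ms, us = zip(*ratings)
    return min(ls), sum(ms) / len(ms), max(us)

def vertex_distance(a, b):
    """Vertex-method distance between two triangular fuzzy numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3.0)

supplier_a_cost = [(9, 10, 10), (7, 9, 10), (5, 7, 9), (0, 1, 3)]  # DM1-DM4, Fig. 6
agg = aggregate_tfn(supplier_a_cost)
print(agg)                                     # (0, 6.75, 10)
print(vertex_distance(agg, (10, 10, 10)))      # distance to an illustrative reference TFN
```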

4.2.4 Sensitivity Analysis

We performed a sensitivity analysis to check the extent to which the results of the methods can vary depending on changes in their inputs, namely the supplier selection criteria, the number of suppliers, and the number of decision makers. While sensitivity analysis could be carried out for any number of scenarios, we considered only the Fuzzy-TOPSIS method, which allows us to observe how the results change when the number of suppliers, the number of supplier selection criteria, and the number of decision makers are varied. Thus, the inputs of Fuzzy-TOPSIS are changed to check the variation in the results.

Fig. 8 Decision matrices for criteria evaluation using two experts

Criteria                 DM1              DM2
Cost                     (0, 0, 0.1)      (0.5, 0.7, 0.9)
Lead time                (0.7, 0.9, 1.0)  (0.9, 1.0, 1.0)
Quality                  (0, 0.1, 0.3)    (0.1, 0.3, 0.5)
Flexibility              (0, 0.1, 0.3)    (0.7, 0.9, 1.0)
Operational efficiency   (0, 0.1, 0.3)    (0.3, 0.5, 0.7)
Innovation               (0, 0.1, 0.3)    (0.9, 1.0, 1.0)
Trust                    (0.9, 1.0, 1.0)  (0.5, 0.7, 0.9)
Location                 (0, 0.1, 0.3)    (0, 0, 0.1)
Sustainable operations   (0.5, 0.7, 0.9)  (0.1, 0.3, 0.5)

Fig. 9 Example of decision matrix for supplier rating for the "Cost" criterion using two experts

Supplier     DM1          DM2
Supplier A   (9, 10, 10)  (7, 9, 10)
Supplier B   (3, 5, 7)    (5, 7, 9)
Supplier C   (0, 1, 3)    (0, 1, 3)
Supplier D   (7, 9, 10)   (9, 10, 10)
Supplier E   (0, 1, 3)    (9, 10, 10)
Supplier F   (0, 1, 3)    (9, 10, 10)
Supplier G   (1, 3, 5)    (9, 10, 10)
Supplier H   (1, 3, 5)    (9, 10, 10)
Supplier I   (0, 0, 1)    (9, 10, 10)
Supplier J   (5, 7, 9)    (1, 3, 5)

Scenario: two experts, instead of four, are involved in the evaluation, while all other inputs are kept constant (Figs. 8 and 9). If we compare the sensitivity analysis result (Fig. 10), where only two experts' evaluations were used, with the result from the Fuzzy-TOPSIS method with four experts (Fig. 7), it is clearly visible that the supplier ranking varies significantly. For example, the best three suppliers from Fuzzy-TOPSIS with four experts are Supplier F, Supplier B, and Supplier C, while with two experts the best three suppliers are Supplier C, Supplier A, and Supplier B. From the sensitivity analysis it is evident that the supplier ranking produced by the Fuzzy-TOPSIS method depends on the number of decision makers who participate in the evaluation process; in other words, the judgments of different decision makers conflict with each other and significantly influence the supplier ranking.


Fig. 10 Supplier ranking by Fuzzy-TOPSIS method with two experts’ evaluation

5 Conclusion

Selecting the best suppliers is an integral and indispensable process and constitutes a main aspect of fashion supply chain management. By achieving maximum efficiency in the supplier selection process, fashion companies can achieve greater business success and growth. In this paper, we selected nine supplier selection criteria reflecting the main strategies of fashion retailers from the point of view of consumer choices and the changes in fashion business models brought about by the adoption of e-commerce and digital technology. We then ranked these criteria with AHP and concluded that the most important criteria are Cost, Lead time, and Quality. We fed the criteria weights into the TOPSIS model, and one industry expert evaluated ten fashion suppliers against each of the nine supplier selection criteria. As a next step, we implemented Fuzzy-TOPSIS, involving four industry experts' judgments on the importance of the criteria as well as their evaluations of the suppliers on each criterion, and obtained the final ranking of the best suppliers. Furthermore, a sensitivity analysis was performed on the Fuzzy-TOPSIS method to check the influence of the number of decision makers participating in the evaluation process, and it was found that the number of decision makers significantly alters the final ranking of suppliers. Overall, the results achieved in this study are promising and of great significance for today's fashion supply chain management and decision-making. We believe that the approach presented in this study can be implemented for small series fashion production, where reliable and efficient suppliers are critical for business growth. In the context of group decision-making, this approach provides useful


tools to investigate the importance of supplier selection criteria and suppliers. We have successfully aggregated judgments of experts on qualitative and quantitative aspects of criteria and the time of computation for these methods was quite short and therefore, efficient. The results reaffirm the significance and relevance of MCDM methods for supplier selection in the fashion industry. The approach we adopted in this study is unique in terms of identifying key strategically important criteria for the digital small series fashion supply chain management, and therefore, it constitutes a novel and practical tool to solve supplier selection problems. As this paper is based on real industry case study, it could serve several valuable managerial insights for the real industrial decision-making in small series fashion supply chain management. Firstly, decision makers can utilize the methods presented in this paper as a decision-making tool to gage the trade-off between supplier selection criteria in terms of their importance derived from experts’ judgments. This way, it would be possible to remove less important criteria from the evaluation part and include only those which will be deemed important from the perspective of decision makers and companies’ business goals and needs. Secondly, proposed methods would enable decision makers to involve large pool of candidate suppliers for the selection of best among them for business purposes, and multiple experts in the decision-making process. The case study presented in this paper could serve to be a ready-made analytical tool to solve a supplier selection problem in a real life industrial scenario where multiple candidate suppliers are willing to provide their material and services for the products that retailer is selling, and retailer needs to decide as to which ones of these candidate suppliers can be the best suppliers as per his business needs and goals. It is worth highlighting that the retailer, as a user of the results from this study, can benefit from a great deal of flexibility in terms of choosing supplier selection criteria, number of candidate suppliers, and number of decision makers as per his on understanding or discretion. This way, our study could be used as an effective and quick decision-making tool to select suppliers for small series fashion products, especially in the context of ecommerce fashion business framework, where customer demands, product variety, and business strategies constantly evolve and change. This study has two major shortcomings: firstly, as decision-making in an ecommerce business framework is being increasingly consumer centric, we could not directly involve consumers in the evaluation process; and secondly, we could not identify common performance indicator to be able to identify the best performing MCDM methods in terms of ranking best suppliers. In the future, we intend to address these shortcomings and also explore opportunities to include product wise supplier selection aspect in the analysis. Moreover, we also intend to employ customer order data attributes in the supplier selection process, which will constitute a proper big data-oriented approach. Another aspect of future work in this direction could be to develop a more common framework for the easier selection of MCDM method specific to relevant decision-making problems and the agents involved in it. 
Another limitation concerns the variation in the final supplier ranking when the number of decision makers in the evaluation process is altered. A crucial problem that could be explored in this respect is to compare


the performance of suppliers selected from a single expert evaluation with that of suppliers selected from a group evaluation. In the future, this aspect can be thoroughly explored, and it can certainly enhance the implications of our study.

Appendix
Survey Questionnaire
Participant
Company Name: —————————————————
Name of the Decision Maker/Expert: —————————————–
Date: —————————–
Objective: To identify the highly important criteria for selecting the best suppliers for small series products.
Question 1: Keeping the objective in mind, kindly number each of the criteria from the list according to the Likert scale.

Criterion (C) | Type | Number
Cost (C1) | Quantitative |
Market reputation (C2) | Qualitative |
Service efficiency (C3) | Qualitative |
Management efficiency (C4) | Qualitative |
R&D facilities (C5) | Qualitative |
Late delivery (C6) | Qualitative |
Lead time (C7) | Quantitative |
Quality (C8) | Qualitative |
Flexibility (C9) | Qualitative |
Operational efficiency (C10) | Quantitative |
Innovation (C11) | Qualitative |
Trust (C12) | Qualitative |
Location (C13) | Qualitative |
Digitization (C14) | Qualitative |
Sustainability (C15) | Qualitative |

Question 2 Keeping the objective in mind, kindly rate the relative importance of each criterion with respect to each other.


Scale of relative importance
Linguistic description | Numerical value
Equally important | 1
Moderately important | 3
Strongly important | 5
Very strongly important | 7
Extremely important | 9

Example: According to you, how important is Cost with respect to Lead time? Response: If you think that Cost is strongly important compared with Lead time, then you should fill in the value 5 (Strongly important). Kindly fill in your responses in the form of numerical values.
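To make the use of this scale concrete, the short sketch below shows how pairwise judgments collected with Question 2 could be turned into AHP criteria weights via the principal eigenvector, as in Saaty (1977). It is only an illustration: the 3 x 3 matrix, its values, and the restriction to three criteria are invented and do not correspond to the judgments actually collected in the study.

```python
import numpy as np

# Hypothetical pairwise comparison matrix for Cost, Lead time, Quality,
# filled with Saaty-scale values like those collected by Question 2.
# A[i, j] = importance of criterion i relative to criterion j.
A = np.array([
    [1.0, 5.0, 3.0],   # Cost vs (Cost, Lead time, Quality)
    [1/5, 1.0, 1/2],   # Lead time
    [1/3, 2.0, 1.0],   # Quality
])

# The principal right eigenvector of A, normalized to sum to 1, gives the weights.
eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w = w / w.sum()

# Consistency check: consistency index and ratio (random index RI = 0.58 for n = 3).
lambda_max = eigvals.real[k]
ci = (lambda_max - 3) / (3 - 1)
cr = ci / 0.58
print("weights:", np.round(w, 3), "CR:", round(cr, 3))
```

In a full application the same computation would simply be repeated with the complete set of criteria retained after Question 1.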


Enhanced Performance Assessment of Airlines with Integrated Balanced Scorecard, Network-Based Superefficiency DEA and PCA Methods Umut Aydın, Melis Almula Karadayı, Füsun Ülengin, and Kemal Burç Ülengin

Abstract In the last decade, due to the aggressively increasing competition in the airline industry, strategic decisions to improve airline performance have become crucial. However, evaluating airline efficiency is an extremely complex, multidimensional problem and requires the application of Multiple Criteria Decision-Making (MCDM) methods. This study evaluates the performance of 45 airline companies by combining the balanced scorecard (BSC) approach and network-based superefficient data envelopment analysis (DEA). The proposed methodology incorporates the finance, customer, internal processes, and learning and growth dimensions of the BSC into the analysis in order to conduct a comprehensive assessment of airline companies from financial and nonfinancial perspectives of performance. Moreover, the eigenvector centrality concept is used to determine the airlines that should act as a role model (peer) for efficiency in each dimension of the BSC. Rankings

Extension of this study was submitted to the "TÜBİTAK ARDEB 1001 - Support Program for Scientific and Technological Research Projects."
U. Aydın, Faculty of Engineering and Natural Sciences, Department of Transportation Engineering, Bandırma Onyedi Eylül University, Bandırma, Turkey. e-mail: [email protected]
M. A. Karadayı, Faculty of Engineering and Natural Sciences, Department of Industrial Engineering, Istanbul Medipol University, Istanbul, Turkey. e-mail: [email protected]
F. Ülengin, School of Management, Sabanci University, Istanbul, Turkey. e-mail: [email protected]
K. B. Ülengin, Management Faculty, Management Engineering Department, Istanbul Technical University, Istanbul, Turkey. e-mail: [email protected]
© Springer Nature Switzerland AG 2021
Y. I. Topcu et al. (eds.), Multiple Criteria Decision Making, Contributions to Management Science, https://doi.org/10.1007/978-3-030-52406-7_9


of airline companies in each dimension are also presented using the eigenvector centrality values. Additionally, in order to improve the discriminatory power of DEA, principal component analysis (PCA) is first conducted, and, based on the representation of the 14 variables by the seven factors revealed by PCA, a compact model that integrates the four dimensions of the evaluation is obtained. Those factors are named according to their characteristics as Flight Capacity, Profitability, Profitability per Employee, Customer Satisfaction, Operational Profitability, Liquidity, and Operational Performance. Those key performance indicators are used in order to make an overall performance evaluation and reveal the overall rankings. Finally, the significance of the ranking differences between the ranking based on each of the four dimensions and the overall ranking is tested by Spearman rank correlation.

1 Introduction The last decade has witnessed important financial and operational uncertainties which have influenced airline performance. That is why airline managers should focus on policies to improve performance and address weaknesses (Pineda et al. 2018). The studies that analyze airline efficiency are generally based solely on financial performance. However, the problem is multidimensional and should take into account nonfinancial aspects in addition to financial ones. That is why the Balanced Scorecard (BSC), which measures performance from a multidimensional perspective, is used in this study. The BSC takes into account finance, customers, internal processes, and learning and growth in order to measure performance. The finance dimension is evaluated based on ratios from the financial reports. The customer dimension is analyzed using customer-satisfaction-related criteria. The internal process dimension evaluates the processes used in the company. Finally, the learning and growth perspective specifies the processes used by the company in order to grow with new projects and skilled employees (Kaplan and Norton 1996; Basso et al. 2018). In an earlier study, Amado et al. (2012) integrated the BSC method with Data Envelopment Analysis (DEA) and evaluated the performance of equipment maintenance departments based on the four dimensions of the BSC. In our study, an integrated approach that combines network-based superefficiency DEA with the BSC and Social Network Analysis is proposed. The network-based superefficient DEA is employed to increase the discrimination among efficient airline companies even further. For this purpose, the results of the DEA models are converted to a directed and weighted network where the nodes represent the airline companies and the link between each pair of nodes represents their reference relationship (Liu and Lu 2010). Social Network Analysis is used to identify the importance of each airline company within the network with respect to each dimension of the BSC. Moreover, reference network diagrams are drawn and presented based on eigenvector centrality values. Airline companies are ranked according


to eigenvector centrality values in descending order. Hence, the top-performing (top-ranked) airline companies with respect to each dimension of the BSC can be identified as a result of the social network analysis. In addition to evaluating the airlines from a different perspective using the BSC, an overall efficiency calculation is conducted to observe how airline rankings vary from the overall model to the BSC dimensions. At this stage, since there are four input variables and ten output variables for calculating efficiency using superefficient DEA, Principal Component Analysis (PCA) is conducted for a prior reduction of the variables. DEA, even superefficient DEA, fails to discriminate efficient DMUs from inefficient ones when there is a multidimensionality problem (Jothimani et al. 2017). PCA is implemented for inputs and outputs separately before running DEA; two factors, named Flight Capacity and Profitability, are used as inputs, while five factors, named Profitability per Employee, Customer Satisfaction, Operational Profitability, Liquidity, and Operational Performance, are used as outputs. After the calculation of efficiency scores for both the BSC dimensions and the overall model, the Spearman rank correlation is used to evaluate whether the airlines' efficiency rankings vary across the models or not. Section 2 gives an overview of the research conducted on the performance evaluation of airline companies using DEA and PCA models. The third section introduces the proposed methodologies employed in this study. The application of the proposed methodology and the obtained results are presented in the fourth section. Finally, conclusions and further suggestions are given.

2 Literature Review 2.1 DEA-Based Airline Studies There are numerous studies in which DEA has been employed for the performance evaluation of airline companies. This section outlines a review of the DEA-based airline studies of the past decade. There are many region- or country-based airline efficiency studies in the literature. Chow (2010) analyzed the productivity changes of state-owned and non-state-owned Chinese airline companies from 2003 to 2007 using the Malmquist index. The input variables were the number of full-time employees, aircraft fuel usage, and seat capacity. The output variable was revenue from ton-km of passengers and freight. Results revealed that non-state-owned airlines were performing better than the state-owned ones. Ouellette et al. (2010) examined the impact of regulation changes on the efficiency of Canadian airlines between 1960 and 1999 using a DEA approach with quasi-fixed inputs. They concluded that deregulation explains a large part of the inefficiency. Coli et al. (2011) utilized DEA and SFA to measure the operational efficiency of 42 domestic routes of the Italian airline Air One for the year 2007. The number of total seats and the variable direct operating costs were considered as


inputs, and passenger scheduled revenue was defined as an output variable in their study. Pires and Fernandes (2012) used DEA and the Malmquist index to analyze the financial efficiency of 42 airlines from 25 countries for the year 2011. Barros et al. (2013) evaluated the performance of 11 US airlines with the VRS DEA model with a convexity assumption during the period 1998–2010. The study used three inputs, i.e., total cost, number of employees, and number of gallons, while the output variables were total revenue, revenue per mile (RPM), and passenger load factor (PLF). Min and Joo (2016) investigated the impact of airline alliances worldwide on the airlines' comparative operational efficiency using a DEA framework. Saranga and Nagpal (2016) analyzed the operational efficiency of Indian airlines and its impact on market performance. Lately, Sakthidharan and Sivaraman (2018) also examined the efficiency of Indian airlines. Kottas and Madas (2018) used superefficiency DEA to evaluate the effect of strategic alliances on 30 international airlines. Kuljanin et al. (2019) used Fuzzy Theory-based DEA and the Malmquist productivity index to analyze the performance of 17 airlines operating in Central and South East Europe between 2008 and 2012. Some of the DEA-based airline studies focused on environmental performance in addition to technical and financial performance. They considered carbon emissions in the set of outputs. Arjomandi and Seufert (2014) used bootstrapped DEA to investigate the technical and environmental performance of 48 international airlines for the period 2007–2010. The inputs used included the number of full-time equivalent employees and capital. The outputs included available ton kilometers and annual CO2 emissions. Chang et al. (2014) also examined the environmental performance of 27 international airlines using a slacks-based measure of DEA. Zhang et al. (2017) analyzed and compared the energy efficiency of Chinese and American airlines during 2011–2014. The results revealed that the mean energy efficiencies of Chinese airlines lagged behind those of American airlines during the study period. Another group of studies in the literature includes two-stage DEA models which utilize DEA and the bootstrapped truncated regression suggested by Simar and Wilson (2007). Barros and Peypoch (2009) employed the DEA and bootstrapped truncated regression model to evaluate the performance of 28 European airlines from 2000 to 2005 and also determined the drivers of airline efficiency. Lu et al. (2012) applied a two-stage DEA model to measure the production and marketing efficiency of 30 US airlines and also investigated the relationship between efficiency and corporate governance. They found out that low-cost airlines, on average, are more efficient carriers than the full-service ones, but less efficient marketers. Moreover, they concluded that corporate governance significantly influences the operational performance of the firm. Recently, there has been a growing interest in employing network DEA models to evaluate the performance of airlines. Tavassoli et al. (2014) proposed a slacks-based measure network data envelopment analysis (SBM-NDEA) approach to evaluate both the technical efficiency and the service effectiveness of airlines. Li et al. (2015) proposed the Virtual Frontier Network SBM model for analyzing the efficiency of 22 international airlines during the period 2008–2012. The network


structure for airline efficiency evaluation was constructed based on three stages: Operations Stage, Services Stage, and Sales Stage. Mallikarjun (2015) employed an unoriented network DEA approach to determine the peer airlines and the sources of inefficiency of 27 US domestic airlines for the year 2012. Chen et al. (2017) applied a stochastic network DEA to take into consideration undesirable outputs such as flight delays and CO2 emissions for the efficiency analysis of 13 Chinese airlines during the period 2006–2014. As can be seen from the literature, performance evaluation of airlines from a multidimensional perspective is scarce. One of the contributions of this research is to reduce this gap using the BSC. The BSC has become a popular technique for evaluating organizational performance, since financial and nonfinancial variables are both considered to obtain more meaningful results than a traditional performance perspective (Dinçer et al. 2017). The BSC can serve as the complement of DEA, and using the DEA–BSC results, organizations can develop their improvement strategies (García-Valderrama et al. 2009; Wu and Liao 2014). Hence, an integrated DEA–BSC approach has been developed in the past decades for performance evaluation in various fields. Chen et al. (2008) investigated the operating efficiency of banks in Taiwan. García-Valderrama et al. (2009) applied the BSC–DEA model to evaluate the operating efficiency of R&D projects. Asosheh et al. (2010) used the integrated BSC model to investigate information technology projects. Aryanezhad et al. (2011) applied the BSC–DEA method to evaluate relative efficiency in the banking sector. Amado et al. (2012) assessed the performance of an equipment maintenance department of a multinational company using an integrated BSC–DEA model. Basso et al. (2018) utilized the joint use of BSC and DEA to analyze the performance of museums. Additionally, there are a limited number of integrated BSC–DEA studies for assessing performance in the airline industry. Wu and Liao (2014) measured the operational efficiency of 38 major airlines in the world by proposing an integrated BSC and DEA.

2.2 Dimension Reduction with PCA to Enhance the Discriminatory Power of DEA PCA is widely used to increase the discriminatory power of DEA and to handle multidimensional data. The first group of studies utilized PCA as an initial step before running DEA. Adler and Golany (2001) used PCA–DEA, which is a hybrid approach, to handle the curse of dimensionality in evaluating airline performance. Gnewuch and Wohlrabe (2018) used PCA on standardized data to extract the main components to be used as inputs and outputs of the DEA model to evaluate 188 economics departments around the world. Wu et al. (2018) evaluated the eco-efficiency of coal-fired power plants using superefficiency DEA, where PCA was initially used to reduce the multidimensionality and to increase the discriminatory power of the


DEA. There are numerous studies that used PCA in different areas with the same purpose, such as Wu and Li (2017) on the impact of female executives on firm performance, Guo et al. (2018) on energy efficiency, and Liu et al. (2019) on health expenditure. There are also studies that used PCA as a second step following DEA. Ho and Wu (2009) and Stoica et al. (2015) calculated the efficiency scores of banks and then used PCA with the efficiency scores of the DMUs in order to identify different strategic groups of companies. As can be seen from the literature review, BSC and DEA are widely used for the performance evaluation of airline companies. Moreover, PCA is also employed for the proper selection of the input/output variables for DEA. However, their application to airline evaluation is scarce.

3 Proposed Methodology The proposed methodology presented in this study consists of four main stages. Firstly, we determine the input and output variables for each dimension of the BSC, drawing on the literature. After determining the study variables, a network-based superefficient DEA is employed, and a social network analysis is conducted for each perspective of the BSC. At the next stage, eigenvector centrality values are calculated for each airline company using the Pajek software, and the airline companies are ranked according to their eigenvector centrality values in descending order. At the final stage, PCA is applied to reduce the input and output variables in order to evaluate the overall performance of the airline companies. Hence, we may observe how the airline rankings vary from the overall model to the BSC dimensions. Figure 1 summarizes the proposed framework for evaluating the performance of airlines in this study.

3.1 Integration of BSC and DEA Approach BSC provides a framework to understand how each part of the organization contributes to its success through a series of cause-and-effect relationships. In this way, rather than treating the system as a black box, as is the case in DEA, it is possible to get information about the subprocesses to focus on in order to improve the overall performance (Amado et al. 2012). The balanced scorecard focuses on both financial and nonfinancial performance. It includes customers, internal processes, and learning and growth to evaluate the intangible assets and intellectual capital. Kaplan and Norton (1996) underline that there is a causal relationship between the four dimensions and that, if financial results are the final goal of a business enterprise, learning and growth, internal processes, and customers play a leading-indicator role in achieving the financial goal (Chen et al. 2011). The BSC provides a good picture for executives and underlines that the current


Fig. 1 Proposed methodology

good financial performance does not ensure a good future financial performance. Therefore, an appropriate working environment should also be created for the employees, and they should be motivated to be creative and eager to learn so as to contribute to the development of the firm. However, the BSC alone does not provide an opportunity to develop, communicate, and implement strategy in a corporate setting (Wu and Liao 2014). There are only very few studies that examine the pros and cons of DEA and BSC to highlight the importance of their integration for a better evaluation of performance and efficiency (Kádárová et al. 2015). Figure 2 shows these differences. Therefore, one of the contributions of this study is the integration of BSC and DEA in order to reduce their cons and increase their pros.

3.2 The Use of a Network-Based Superefficiency DEA with a Balanced Scorecard Approach and Social Network Analysis DEA measures the relative performance of decision-making units (DMUs) on a 0–1 scale, where efficient units get a score of 1 and the inefficient ones have a score


Characteristics | BSC | DEA
Way of comparison | comparison with an ideal virtual unit | proportional comparison of the same units
View - rating | multiple view perspectives | input/output
Mathematical ranking | weak | strong
Application | performance evaluation | technical efficiency
Accuracy of measurement | unclear | high
Presentation of opportunities for improvement | weak | high
Variety of suitable results | does not support | has
Future view | has | does not have
Relationship to business strategy | has | does not have

Fig. 2 Proposed differences between DEA and BSC method: outputs of deep analysis (Jothimani et al. 2017)

less than 1. The efficient units lie on the efficiency frontier and cannot be compared with respect to each other (Charnes et al. 1978; Cooper et al. 2000). Classical DEA methods may result in multiple efficient solutions. To provide a good discrimination, it is expected that the number of DMUs is at least 2 m × s, where m is the number of inputs and s is the number of outputs. This study makes another contribution to the literature through the use of the network-based eigenvector concept suggested by Liu and Lu (2010) in order to identify the importance of each airline company within the network and to identify the role models for each of the four dimensions of the BSC. The efficiency scores of all airlines are calculated using the input-oriented, constant returns-to-scale super-efficient slacks-based measure of efficiency (Super-SBM-IC) model. An airline company is accepted to be efficient if its superefficiency score is greater than 1 (Andersen and Petersen 1993).
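Before presenting the model, note that solving such a DEA problem amounts to solving one (fractional) linear program per DMU. As a rough, self-contained illustration, the sketch below solves a plain input-oriented CCR envelopment LP with scipy; it is only a simplified stand-in for the Super-SBM-IC model given next (which additionally excludes the evaluated DMU from the reference set and uses a slacks-based ratio objective), and both the helper function and the data are invented for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_input_efficiency(X, Y, k):
    """Input-oriented CCR efficiency of DMU k (a plain envelopment LP,
    not the Super-SBM-IC model used in the study)."""
    n, m = X.shape            # n DMUs, m inputs
    s = Y.shape[1]            # s outputs
    # Decision variables: [theta, lambda_1, ..., lambda_n]
    c = np.r_[1.0, np.zeros(n)]
    # sum_j lambda_j * x_ij <= theta * x_ik  ->  -theta*x_ik + sum_j lambda_j*x_ij <= 0
    A_in = np.hstack([-X[k].reshape(-1, 1), X.T])
    b_in = np.zeros(m)
    # sum_j lambda_j * y_rj >= y_rk  ->  -sum_j lambda_j*y_rj <= -y_rk
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    b_out = -Y[k]
    res = linprog(c, A_ub=np.vstack([A_in, A_out]), b_ub=np.r_[b_in, b_out],
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.fun            # theta* in (0, 1]

# Tiny illustrative data set: 4 DMUs, 2 inputs, 1 output (made-up numbers).
X = np.array([[2.0, 4.0], [3.0, 3.0], [4.0, 2.0], [5.0, 5.0]])
Y = np.array([[1.0], [1.0], [1.0], [1.0]])
print([round(ccr_input_efficiency(X, Y, k), 3) for k in range(4)])
```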


Mathematical representation of the employed DEA model for an efficient DMU k is given as follows (Tone 2002; Tran et al. 2019):

$$
\min \; \delta_k = \frac{\frac{1}{m}\sum_{i=1}^{m} \tilde{x}_i / x_{ik}}{\frac{1}{s}\sum_{r=1}^{s} \tilde{y}_r / y_{rk}}
$$

$$
\text{s.t.} \quad \tilde{x}_i \ge \sum_{j=1,\, j \ne k}^{n} x_{ij}\lambda_j, \quad i = 1, \ldots, m, \qquad
\tilde{y}_r \le \sum_{j=1,\, j \ne k}^{n} y_{rj}\lambda_j, \quad r = 1, \ldots, s, \tag{1}
$$

$$
\tilde{x}_i \ge x_{ik}, \quad i = 1, \ldots, m, \qquad
0 \le \tilde{y}_r \le y_{rk}, \quad r = 1, \ldots, s, \qquad
\lambda_j \ge 0, \quad j = 1, \ldots, n, \; j \ne k,
$$

where $\tilde{x}_i$ (i = 1, ..., m) and $\tilde{y}_r$ (r = 1, ..., s) are decision variables with respect to inputs and outputs, respectively, and λ is a non-negative vector. As is widely preferred in airline performance studies, the input-oriented DEA model with constant returns to scale is used in this study (Chow 2010; Sakthidharan and Sivaraman 2018). In fact, the input-oriented DEA model is appropriate because the airline industry can control its input usage easily. In the next step, all DEA results are transformed into a directed and weighted network where each node represents a DMU (airline company) and the link between a pair of nodes represents the reference relationship between the pair. The corresponding lambda value $\lambda_{jk}$ gives information about an inefficient airline's endorsement of an efficient airline. Moreover, $\lambda^{t}_{jk}$ shows the contribution of the ith input of the kth airline to the jth airline in the reference set with respect to specification t. The next step normalizes the calculated lambda values. The contribution of the ith input of the kth airline to the jth airline in the reference set with respect to specification t can be normalized as follows (Liu and Lu 2010):

$$
\mathrm{IW}^{t,k}_{ij} = \frac{\lambda^{t}_{jk}\, x^{t}_{ij}}{\sum_{j \in E} \lambda^{t}_{jk}\, x^{t}_{ij}}, \qquad 0 < \mathrm{IW}^{t,k}_{ij} \le 1. \tag{2}
$$

In the same way, the contribution of the rth output of the kth airline to the jth airline in the reference set with respect to specification t can be normalized as follows:

$$
\mathrm{OW}^{t,k}_{rj} = \frac{\lambda^{t}_{jk}\, y^{t}_{rj}}{\sum_{j \in E} \lambda^{t}_{jk}\, y^{t}_{rj}}, \qquad 0 < \mathrm{OW}^{t,k}_{rj} \le 1. \tag{3}
$$

In the reference set with respect to DEA specification t, the overall contribution of the kth airline to the jth airline can be calculated as follows:

$$
\mathrm{IOW}^{t}_{jk} = \frac{1}{m+s}\left[\sum_{i=1}^{m} \mathrm{IW}^{t,k}_{ij} + \sum_{r=1}^{s} \mathrm{OW}^{t,k}_{rj}\right] \tag{4}
$$

In the next step, the results of all DEA specifications are combined in one network to obtain the adjacency matrix A:

$$
A = \left[\,\sum_{t=1}^{w} \mathrm{IOW}^{t}_{jk}\right] \tag{5}
$$

where A is a square matrix of order n and w gives the total number of DEA specifications, w = (2^m − 1)(2^s − 1). In step 5, the eigenvector centrality value of each network node (airline company) is computed, and the airline companies are ranked with respect to their eigenvector centrality values in descending order. As mentioned before, in the network-based DEA approach the nodes represent the airline companies and the link between each pair of nodes represents the relationship between them. As a result, an airline company that has a balanced strength will be preferred over a company that is efficient in one parameter but not in the others. The DEA approach cannot discriminate between these companies, while the network-based approach will give a lower rank to the latter. The reference to an efficient airline company is accepted as an endorsement that this company gets from an inefficient airline company. The efficiency of the airline companies is computed based on all possible input and output combinations. As a result, (2^m − 1)(2^s − 1) combinations of DEAs are computed for m inputs and s outputs, with each specification representing one input/output combination. The network-based approach combines the results of these different combinations, and the most efficient airline companies are accepted to be the ones having the highest centrality value. That is why the selection of a suitable centrality measure for the network under evaluation is very important. In fact, different centrality measures such as degree centrality (Freeman 1979; Wasserman 1994), betweenness centrality (Freeman 1979), and closeness centrality (Sabidussi 1966; Freeman 1979; Wasserman 1994) can give different results because they evaluate centrality from different perspectives. On the other hand, in the eigenvector centrality proposed


by Bonacich (1927), the centrality is evaluated not only based on the number of connections that an airline company has but also on the number of connections of its neighbors as well as of its neighbors' neighbors. Due to the fact that, in eigenvector centrality, the power and centrality of each node depend on the power and centrality of the other nodes, in this study the eigenvector centrality method is selected as the most suitable centrality measure. In this approach, the power of a node (i.e., its popularity) is evaluated through benchmarking, by taking into account the number of peers that endorse it and also the importance and the strength of the endorsing peers. A node with a high eigenvector centrality is one connected to highly connected, influential nodes. Connections to nodes that are themselves highly connected will be more influential.
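As a minimal sketch of this idea, the fragment below computes eigenvector centrality by power iteration on a small weighted, directed adjacency matrix. The 4-node matrix is invented purely for illustration; in the study the adjacency matrix A of Eq. (5) is built from the DEA reference relations and analyzed with the Pajek software.

```python
import numpy as np

def eigenvector_centrality(A, iters=200, tol=1e-10):
    """Eigenvector centrality of a weighted (directed) adjacency matrix A
    via power iteration; A[j, k] > 0 means node j endorses node k."""
    n = A.shape[0]
    x = np.ones(n) / n
    for _ in range(iters):
        x_new = A.T @ x                 # each node collects weight from its endorsers
        norm = np.linalg.norm(x_new)
        if norm == 0:
            return x
        x_new = x_new / norm
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Toy 4-airline endorsement network with made-up aggregated weights.
A = np.array([
    [0.0, 0.6, 0.4, 0.0],
    [0.0, 0.0, 0.7, 0.3],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.2, 0.8, 0.0],
])
print(np.round(eigenvector_centrality(A), 3))
```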

3.3 The Integration of PCA and DEA DEA is one of the well-known nonparametric methods used for efficiency measurement, but when there is a large number of variables which are candidates for inputs and outputs, DEA fails to discriminate efficient DMUs from inefficient ones; this situation is also called the curse of dimensionality. To overcome this limitation of DEA, a hybrid approach called PCA–DEA is used (Jothimani et al. 2017). According to Yap et al. (2013), this method is robust to sample size. In general, the simplest way of variable reduction is dropping variables, but this approach discards the information that the dropped variables carry. Since the principal components are linear combinations of the variables, the PCA–DEA approach can instead reduce the variables used for the calculation of efficiency with minimum loss of information. PCA is a multivariate analysis that reduces the variables to uncorrelated dimensions constructed as linear combinations of the variables, and these dimensions generally describe 80–90% of the variance of the variables. Since the first few factors can explain most of the variables' variance, they can substitute the variables (Adler and Golany 2001). Detailed information related to PCA can be found in Adler and Yazhemsky (2010). Using PCA as a prior step before the application of DEA permits reduced dimensions to be obtained from multidimensional variables, and it additionally helps to reduce the computational workload of the analyses, using dimensions that reflect the information that the variables carry (Wu and Li 2017). Initially, PCA is applied separately to the input and output variables until the components of both inputs and outputs capture 80% of the variance of the original variables. Reaching 80% of the variance is crucial in order to avoid the misclassification of efficient and inefficient DMUs when using DEA (Jothimani et al. 2017). On the other hand, after the application of PCA, the factor scores can be negative. However, DEA cannot use variables having negative values. Therefore, the most negative value of each variable can be added to all values of that variable, or any other transformation that makes all values positive can be applied at this step.
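As a minimal sketch of this reduction step (assuming a scikit-learn environment and invented data), the fragment below standardizes a block of candidate variables, keeps enough principal components to reach roughly 80% explained variance, and shifts the component scores so that they are positive before entering a DEA model. It illustrates only the dimensionality-reduction idea, not the exact factor analysis with varimax rotation reported in Sect. 4.3.

```python
import numpy as np
from sklearn.decomposition import PCA

def reduce_for_dea(V, var_target=0.80):
    """Reduce a variable block (inputs or outputs) with PCA, keeping enough
    components to explain at least var_target of the variance, then shift
    the component scores so that all values are strictly positive for DEA."""
    Z = (V - V.mean(axis=0)) / V.std(axis=0)       # standardize the raw variables
    pca = PCA(n_components=var_target, svd_solver="full")
    scores = pca.fit_transform(Z)                  # n_DMUs x n_components
    shifted = scores - scores.min(axis=0) + 1e-3   # make every column positive
    return shifted, pca.explained_variance_ratio_

# Made-up data: 45 airlines with 4 candidate input variables.
rng = np.random.default_rng(0)
inputs = rng.gamma(shape=2.0, scale=3.0, size=(45, 4))
reduced_inputs, evr = reduce_for_dea(inputs)
print(reduced_inputs.shape, np.round(evr, 3))
```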


4 Application of the Proposed Performance Assessment Framework 4.1 The Selection of the Input and Output Variables To the best of our knowledge, there is no research on the specification of the most appropriate input and output variables for the efficiency evaluation of airline companies (Nissi and Rapposelli 2008). In this study, the inputs and outputs for each of the four dimensions of the BSC are initially specified based on a literature survey. The literature survey on the inputs and outputs used for each dimension of the BSC model is presented in Table 1. As can be seen in Table 1, the Return on Equity (ROE), Net Profit Margin (NPM), Debt Ratio, Revenue per Revenue Passenger Kilometers (RRPK), and Current Ratio variables are the financial indicators. ROE reflects the profitability of shareholders' equity, while NPM shows how much net income or profit is generated as

Table 1 The literature survey on the inputs and outputs used for each dimension of the BSC model

Perspective | Input variables | Output variables
Customer | Fleet, Debt ratio, NPM | PLF, OCSS (Max 10), IENT (Max 5)
Finance | Fleet, Cargo carried | ROE, NPM, Debt ratio, Current ratio, RRPK
Internal process | Fleet, Debt ratio, NPM | OTP, Cargo carried, PAX, PAX/employee
Learning and growth | Fleet, Debt ratio, NPM | RPE

Reference studies: Mallikarjun (2015), Barros and Couto (2013); Teker et al. (2016), Law and Breznik (2018); Teker et al. (2016), Law and Breznik (2018); Wang (2008), Panicker and Seshadri (2013), Dinçer et al. (2017); Teker et al. (2016); Feng and Wang (2000), Bigliardi and Ivo Dormio (2010), Dinçer et al. (2017); Wang (2008); Tsionas et al. (2017); Wu and Liao (2014); Zins (2001), Wang et al. (2004), Barros and Dieke (2007), Dinçer et al. (2017); Brulhart and Moncef (2015), Noori (2015)


a percentage of revenue. Debt Ratio is the proportion of the companies' assets that is financed by debt, and Current Ratio shows the companies' ability to meet their short-term obligations. Besides these variables, which are common indicators for all sectors, RRPK is specific to the airline sector and shows how much a passenger pays the airline to travel 1 km. Overall Customer Satisfaction Score (OCSS) and Inflight Entertainment (IENT) are marketing indicators that show how satisfied customers are with the airlines' services. Passenger Load Factor (PLF) shows the percentage of utilization of the capacity. On-Time Performance (OTP) refers to the level of success in taking off at the scheduled time. Cargo Carried and Passengers Carried (PAX) show how much cargo and how many passengers are carried in one year, PAX per Employee shows how many passengers are handled per employee, and Revenue per Employee (RPE) shows how much revenue is generated per employee. Lastly, Fleet is an indicator of the number of commercial aircraft of the company. Airline companies listed in the "Skytrax Top 100 Airlines 2016" are selected for this study, but due to the requirement of complete and accurate data for each airline, 45 airline companies from all over the world are considered and analyzed. Moreover, the study variables of these airlines can be categorized as financial, marketing, and performance-related dimensions. Although it is difficult to collect reliable service-quality-related variables, their inclusion in the model is important in order to obtain a more realistic airline performance analysis (Panicker and Seshadri 2013). For this purpose, the related data are obtained from Skytrax Internet-based surveys. Those surveys measure the satisfaction of customers with airline quality worldwide. The survey results are published online at https://www.airlinequality.com. The remaining data were obtained from the Bloomberg database (Bloomberg 2016) as well as the annual reports of the airlines. When Table 2 is analyzed, it can be seen that nonproportional variables have high standard deviations, while for proportional variables they are below the mean. In fact, it can be seen, in particular, that the 45 evaluated airline companies show similar performance in terms of occupancy rates and on-time departure performance. For each dimension of the BSC, the number of DMUs (airlines) is at least twice the total number of input and output variables and thus conforms to the required standards (Golany and Roll 1989). The efficiency of the airline companies is calculated considering all possible input and output combinations. As a result, (2^m − 1)(2^s − 1) combinations of DEAs are computed for m inputs and s outputs, with each specification representing one input/output combination. Therefore, in this study 253 DEA models are run in total, where 48 models belong to the Customer dimension (considering 3 inputs and 3 outputs), 93 are run for the Finance dimension (considering 2 inputs and 5 outputs), 105 for the Internal Process dimension (considering 3 inputs and 4 outputs) and finally 7 for the Learning and Growth dimension (considering 3 inputs and 1 output).
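As a small illustration of how such DEA specifications can be enumerated in practice, the snippet below generates all non-empty input/output subsets for a dimension with 2 inputs and 5 outputs, matching the count reported above for the Finance dimension; the variable names are generic placeholders, not the study variables.

```python
from itertools import combinations

inputs = ["in1", "in2"]
outputs = ["out1", "out2", "out3", "out4", "out5"]

specs = [
    (i_sub, o_sub)
    for r_i in range(1, len(inputs) + 1)
    for i_sub in combinations(inputs, r_i)
    for r_o in range(1, len(outputs) + 1)
    for o_sub in combinations(outputs, r_o)
]
print(len(specs))   # (2**2 - 1) * (2**5 - 1) = 93 DEA specifications
```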


Table 2 Summary statistics analysis of the data used in the model

Variables | Minimum | Maximum | Mean | Std. Dev.
Debt ratio | 0.79 | 76.68 | 36.89 | 18.73
Net profit margin (NPM) | 0 | 20.17 | 9.60 | 4.71
Return on equity (ROE) | 0 | 169.11 | 47.25 | 29.95
Current ratio | 0.20 | 1.94 | 0.79 | 0.35
Revenue per employee (RPE) | 5.34 | 182918.09 | 5518.43 | 26989.26
Revenue per revenue passenger kilometer (RRPK) | 0.01 | 14.13 | 1.17 | 2.96
Cargo carried (tons) | 36,000 | 9,800,000 | 1,460,145 | 2,249,129
Passengers carried (in millions) (PAX) | 4 | 199 | 47.74 | 44.82
Passenger load factor (PLF) | 69.16 | 91.6 | 80.17 | 4.66
On-time departure performance (OTP) (%) | 60 | 89.87 | 80.98 | 6.66
Size of fleet | 30 | 1536 | 294.76 | 298.17
Passengers/employees (PAX/employee) | 332.07 | 77142.857 | 3485.66 | 11183.60
Inflight entertainment satisfaction score (IENT) | 1 | 4 | 2.96 | 0.85
Overall customer satisfaction score (OCSS) | 3 | 8 | 6.31 | 1.31

4.2 The Use of a Network-Based Superefficiency DEA with a Balanced Scorecard Approach After the evaluation based on the superefficiency DEA model, a social network analysis based on the eigenvector centrality method is applied in order to obtain a better discrimination among the efficient airlines. This centrality measure shows the most popular airlines, i.e., those endorsed by other influential airlines, based on the four dimensions. For this purpose, the eigenvector centrality of each airline is calculated according to the different input–output combinations of each of the four dimensions. The network-based DEA approach converts the DEA results to a weighted directed network using the Kamada–Kawai layout alternative of the Pajek software. Pajek is used for evaluating large networks, and in the Kamada–Kawai layout the efficient units are generally located in the center (de Nooy et al. 2011). When the top section of Fig. 3 is analyzed, it can be seen that Norwegian Air Shuttle, Korean Air, Bangkok Airways, South African Airways, United Airlines, Icelandair, Singapore Airlines, Aegean Airlines, and Virgin Australia are in the center of the network. In addition, when the table related to the figure is investigated, it can be seen that, among these central airlines, Korean Air, Norwegian, South African Airways, and United Airlines are the most central ones. In fact, this can also be seen from the size of the related nodes in the figure. Therefore, they are the role models for the other airline companies in terms of the customer satisfaction dimension. On the other hand, in terms of the finance dimension, Icelandair, Bangkok Airways, Copa Airlines, Aegean Airlines, South African Airways, TAP Air Portugal,

Fig. 3 Eigenvector analysis based on customer and finance dimensions

Customer Dimension
Company | Eigenvector centrality value
Korean Air | 0.3676
Norwegian | 0.3648
South African Airways | 0.3648
United Airlines | 0.3648
Icelandair | 0.3609
Singapore Airlines | 0.3403
Bangkok Airways | 0.3125
Aegean Airlines | 0.2891
Virgin Australia | 0.1953

Finance Dimension
Company | Eigenvector centrality value
Icelandair | 0.6677
Bangkok Airways | 0.4581
Aegean Airlines | 0.4422
South African Airways | 0.2813
TAP Air Portugal | 0.2009
Hawaiian Airlines | 0.1467
Copa Airlines | 0.0872


and Hawaiian Airlines are in the inner periphery of the network (see the bottom part of Fig. 3). Therefore, they are the role models for the other airlines in this dimension. When the airline companies are compared with respect to the internal process dimension, Korean Air, United Airlines, Thai Airways, South African Airways, Asiana Airlines, Icelandair, TAP Air Portugal, Delta, and Lufthansa are located at the center. However, in terms of the eigenvector centrality measure, Korean Air is the most influential role model (see the top part of Fig. 4). Finally, the role models in terms of the learning perspective are Thai Airways and TAP Air Portugal (see the bottom part of Fig. 4); with the highest number of edges connecting to it, Thai Airways has the highest exposure and acts as a gatekeeper and liaison between subgroups in this dimension. Its eigenvector centrality value shows that it is also the airline company that is connected to other influential airlines with high degree. That is why, as can be seen from the figure, it has the largest node.

4.3 PCA–DEA Approach for Overall Performance Assessment In this study, factor analysis is performed on the variables that constitute the input and output data sets. The analysis reveals that the four input variables can be reduced to two factors and the ten output variables can be reduced to five factors. Varimax rotation shows that the first factor of the inputs (Factor 1) involves the Fleet and Cargo variables. Therefore, Factor 1 can be associated with an operational indicator, while Factor 2 can be named a financial indicator since it includes Debt Ratio and Net Profit Margin. The eigenvalues of these two factors are above 1 and they explain about 85% of the variance of the four variables used as inputs at the BSC stage (see Table 3). The same process is also implemented for the output variables, and five factors are obtained from the ten variables. Based on the proportion of variance explained and an eigenvalue close to 1, the 10 variables are reduced to 5 factors. These 5 factors explain nearly 85% of the variance of the 10 variables. Since Factor 1 includes PAX/Employee and RPE, it can be said that this factor is an employee indicator. Factor 2 includes IENT and OCSS, and thus this dimension reflects marketing performance. The third factor, which includes the ROE and PAX variables, can be interpreted as follows: since the airlines' core product is transporting passengers and ROE reflects the profitability of equity, this factor shows the success of airlines in terms of operational profitability. Factor 4 is mainly about the liquidity of the company, and the last factor carries information about airline operational activities since it includes the OTP, PLF, and RRPK variables (see Table 4). PCA–DEA is conducted both to reduce the variables to fewer dimensions and to increase the discrimination power of the DEA (Wu and Li 2017; Jothimani et al. 2017). After completing the PCA stage, the efficiency scores are obtained using the two input-side factor scores as inputs and the five output-side factor scores as outputs

Fig. 4 Eigenvector analysis based on internal and learning and growth dimensions

Internal Processes
Company | Eigenvector centrality value
Korean Air | 0.599
Norwegian | 0.399
United Airlines | 0.379
Lufthansa | 0.285
Asiana Airlines | 0.272
Thai Airways | 0.259
Delta Airlines | 0.228
Singapore Airlines | 0.126
Icelandair | 0.124
Virgin Australia | 0.105
Aegean Airlines | 0.095
South African Airways | 0.079
EasyJet | 0.051
American Airlines | 0.046
TAP Air Portugal | 0.032
China Airlines | 0.017
Bangkok Airways | 0.019
Southwest Airlines | 0.010
Hawaiian Airlines | 0.009
Vietnam Airlines | 0.005
Emirates | 0.002
Air France | 0.001

Learning and Growth
Company | Eigenvector centrality value
Thai Airways | 0.9874
TAP Air Portugal | 0.1584


Table 3 Factor analysis results for input variables. Entries in bold type indicate the highest loading for each variable

Variable | Factor 1 | Factor 2
Fleet | 0.949 | 0.096
Cargo | 0.939 | 0.116
Debt ratio | −0.037 | −0.897
NPM | 0.167 | 0.870
Eigenvalues | 2.093 | 1.303
% of variance | 52.327 | 32.581

Table 4 Factor analysis results for output variables. Entries in bold type indicate the highest loading for each variable

Variable | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5
Pax/employee | 0.993 | 0.039 | −0.051 | −0.012 | −0.042
RPE | 0.987 | 0.115 | −0.061 | 0.026 | 0.074
IENT | 0.113 | 0.877 | −0.016 | −0.150 | −0.063
OCSS | 0.029 | 0.724 | −0.291 | 0.415 | −0.088
ROE | −0.035 | −0.157 | 0.868 | 0.177 | −0.069
PAX | −0.061 | 0.069 | 0.711 | −0.532 | 0.114
CURRENT | −0.015 | 0.063 | 0.049 | 0.885 | 0.289
OTP | −0.068 | −0.258 | 0.070 | 0.126 | 0.788
PLF | −0.192 | −0.241 | 0.518 | −0.093 | −0.627
RRPK | 0.070 | 0.511 | −0.001 | 0.395 | 0.528
Eigenvalues | 2.844 | 1.884 | 1.452 | 1.207 | 0.878
% of variance | 28.436 | 18.843 | 14.519 | 12.072 | 8.776

using the superefficiency DEA; thus, an additional overall evaluation can be performed. The results of the superefficient DEA model after the reduction are presented in Table 5. All inputs and outputs for each dimension listed in Table 2 are used in the superefficiency DEA models. For the analysis of how the airline rankings change between each dimension and the overall model, a Spearman rank correlation analysis is conducted (see Table 6). Table 6 shows the correlations between the superefficiency DEA models, and it can be said that there is no highly significant correlation coefficient; the average correlation is about 0.319. Therefore, the rank of airlines varies with respect to each of the four dimensions as well as with respect to the overall model, and this highlights the importance of evaluating the airlines according to the four dimensions of the BSC model.


Table 5 Results of the reduced superefficient DEA model

DMU | Overall superefficiency score | DMU | Overall superefficiency score
Aegean Airlines | 1.301 | Finnair | 1.024
Aeroflot | 0.634 | Garuda Indonesia | 0.798
Aeromexico | 0.654 | Hainan Airlines | 0.720
Air Canada | 1.195 | Hawaiian Airlines | 1.210
Air China | 0.437 | Icelandair | 1.945
Air France | 1.014 | Japan Airlines | 0.809
Air New Zealand | 0.774 | Jet Airways | 0.632
Alaska Airlines | 0.500 | KLM | 1.008
American Airlines | 1.069 | Korean Air | 0.895
ANA All Nippon Airways | 0.888 | Lufthansa | 0.437
Asiana Airlines | 0.964 | Norwegian | 0.818
Avianca | 0.633 | Qantas Airways | 0.760
Bangkok Airways | 1.203 | SAS Scandinavian | 0.663
British Airways | 0.554 | Singapore Airlines | 0.794
Cathay Pacific | 0.716 | South African Airways | 6.016
China Airlines | 0.730 | Southwest Airlines | 0.499
China Eastern | 0.511 | TAP Air Portugal | 0.728
China Southern | 0.658 | Thai Airways | 1.927
Copa Airlines | 0.528 | Turkish Airlines | 0.553
Delta Air Lines | 0.423 | United Airlines | 0.266
easyJet | 0.638 | Vietnam Airlines | 0.881
Emirates | 0.542 | Virgin Australia | 0.760
EVA Air | 1.250 | |

Table 6 Spearman rank correlation analysis

 | Overall model | Customer | Finance | Internal | Learning and Growth
Overall model | 1 | | | |
Customer | 0.559 | 1 | | |
Finance | 0.55 | 0.542 | 1 | |
Internal | 0.246 | 0.484 | −0.170 | 1 |
Learning and Growth | 0.257 | 0.294 | 0.075 | 0.360 | 1
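For readers who wish to reproduce this kind of rank comparison, the fragment below shows how a Spearman rank correlation between two efficiency rankings can be obtained with scipy. The six scores are invented for illustration and do not correspond to the values in Table 5.

```python
from scipy.stats import spearmanr

# Hypothetical superefficiency scores of six airlines under two models
# (overall PCA-DEA model vs. one BSC dimension); all values are made up.
overall  = [1.30, 0.63, 1.20, 0.44, 1.01, 0.77]
customer = [0.95, 0.70, 1.10, 0.52, 0.60, 0.88]

rho, p_value = spearmanr(overall, customer)
print(round(rho, 3), round(p_value, 3))
```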

5 Managerial Implication and Further Research Opportunities This study proposes a novel approach that integrates the superefficient DEA model with social network analysis. This network-based methodology converts the results of the superefficient DEA model into a weighted directed network where the nodes are the airline companies and the links represent the endorsement relations between each pair of nodes. This additional analysis, based on a centrality measure, permits


to further discriminate the airline companies. The eigenvector centrality measure selected for this purpose shows the airlines that are connected with other influential airlines. The proposed methodology also uses the BSC approach and evaluates the airline companies based on four dimensions, namely finance, customer, internal processes, and learning and growth. Such an approach based on multiple dimensions provides another contribution to the literature, where airline companies are generally evaluated based solely on the financial dimension. The results show that Icelandair, Bangkok Airways, Copa Airlines, Aegean Airlines, South African Airways, TAP Air Portugal, and Hawaiian Airlines act as role models from the financial perspective, with Icelandair being the most central airline. On the other hand, Norwegian, Korean Air, Bangkok Airways, South African Airways, United Airlines, Singapore Airlines, Aegean Airlines, and Virgin Australia are in the inner periphery of the network from the customer perspective, where Korean Air, Norwegian, South African Airways, and United Airlines are the most central airlines and directly influence the other airlines. Korean Air, United Airlines, Thai Airways, South African Airways, Asiana Airlines, Icelandair, TAP Air Portugal, Delta, and Lufthansa act as role models in terms of internal processes, while Korean Air has the highest influence due to its eigenvector centrality. Finally, Thai Airways and TAP Air Portugal can be treated as role models in terms of the learning and growth perspective, where Thai Airways is the most influential due to its eigenvector centrality value. Therefore, the managers of the other airline companies should evaluate in detail the policies adopted by the influential airlines for each of the four dimensions and take active measures to improve their own companies. As a further suggestion, the relative strengths and weaknesses of the superefficient airlines can be specified in detail and the areas of improvement of each airline company can be investigated. Additionally, the method proposed by Edirisinghe and Zhang (2007) can be used to select automatically the inputs and outputs to be used for each dimension. As a final suggestion, the airline companies can be clustered according to their continents and their similarities and differences can be highlighted.

References Adler N, Golany B (2001) Evaluation of deregulated airline networks using data envelopment analysis combined with principal component analysis with an application to Western Europe. Eur J Oper Res 132(2):260–273 Adler N, Yazhemsky E (2010) Improving discrimination in data envelopment analysis: PCA–DEA or variable reduction. Eur J Oper Res 202(1):273–284 Amado CA, Santos SP, Marques PM (2012) Integrating the data envelopment analysis and the balanced scorecard approaches for enhanced performance assessment. Omega 40(3):390–403 Andersen P, Petersen NC (1993) A procedure for ranking efficient units in data envelopment analysis. Manag Sci 39:1261–1264. https://doi.org/10.1287/mnsc.39.10.1261


Arjomandi A, Seufert JH (2014) An evaluation of the world’s major airlines’ technical and environmental performance. Econ Model 41:133–144 Aryanezhad M, Najafi E, Farkoush SB (2011) A BSC-DEA approach to measure the relative efficiency of service industry: a case study of banking sector. Int J Ind Eng Comput 2(2):273– 282 Asosheh A, Nalchigar S, Jamporazmey M (2010) Information technology project evaluation: an integrated data envelopment analysis and balanced scorecard approach. Expert Syst Appl 37(8):5931–5938 Barros CP, Couto E (2013) Productivity analysis of European airlines, 2000-2011. J Air Transp Manag 31:11–13 Barros CP, Dieke PU (2007) Performance evaluation of Italian airports: a data envelopment analysis. J Air Transp Manag 13(4):184–191 Barros CP, Liang QB, Peyboch N (2013) The technical efficiency of US airlines. Transportation Research Part A: Policy and Practice 50:139–148 Barros CP, Peypoch N (2009) An evaluation of European airlines’ operational performance. Int J Prod Econ 122(2):525–533 Basso A, Casarin F, Funari S (2018) How well is the museum performing? A joint use of DEA and BSC to measure the performance of museums. Omega 81:67–84 Bigliardi B, Ivo Dormio A (2010) A balanced scorecard approach for R&D: evidence from a case study. Facilities 28(5/6):278–289 Bloomberg LP (2016) Stock price graph for Airlines. Retrieved April 12, 2017 from Bloomberg terminal Bonacich P (1927) Technique for analyzing overlapping memberships. In: Costner H (ed) Sociological methodology. Jossey-Bass, San Francisco Brulhart F, Moncef B (2015) Causal linkages between supply chain management practices and performance. J Manuf Technol Manag 26(5):678–702 Chang YT, Park HS, Jeong JB, Lee JW (2014) Evaluating economic and environmental efficiency of global airlines: a SBM-DEA approach. Transp Res Part D: Transp Environ 27:46–50 Charnes A, Cooper WW, Rhodes E (1978) Measuring the efficiency of decision making units. Eur J Oper Res 2(6):429–444 Chen TY, Chen CB, Peng SY (2008) Firm operation performance analysis using data envelopment analysis and balanced scorecard: a case study of a credit cooperative bank. Int J Product Perform Manag 57(7):523–539 Chen FH, Hsu TS, Tzeng GH (2011) A balanced scorecard approach to establish a performance evaluation and relationship model for hot spring hotels based on a hybrid MCDM model combining DEMATEL and ANP. Int J Hosp Manag 30:908–932 Chen Z, Wanke P, Antunes JJM, Zhang N (2017) Chinese airline efficiency under CO2 emissions and flight delays: a stochastic network DEA model. Energy Econ 68:89–108 Chow CKW (2010) Measuring the productivity changes of Chinese airlines: the impact of the entries of non-state-owned carriers. J Air Transp Manag 16(6):320–324 Coli M, Nissi E, Rapposelli A (2011) Efficiency evaluation in an airline company: some empirical results. J Appl Sci 11(4):737–742 Cooper WW, Seiford LM, Tone K (2000) Data envelopment analysis. In: Cooper WW, Seiford LM, Zhu J (eds) Handbook on data envelopment analysis, 1st edn. Kluwer Academic, Boston, pp 1–40 de Nooy W, Mrvar A, Batagelj V (2011) Exploratory social network analysis with Pajek, revised and expanded 2nd edn. Cambridge University Press, Cambridge Dinçer H, Hacıo˘glu Ü, Yüksel S (2017) Balanced scorecard based performance measurement of European airlines using a hybrid multicriteria decision making approach under the fuzzy environment. J Air Transp Manag 63:17–33 Edirisinghe NC, Zhang X (2007) Generalized DEA model of fundamental analysis and its application to portfolio optimization. 
J Bank Financ 31(11):3311–3335 Feng CM, Wang RT (2000) Performance evaluation for airlines including the consideration of financial ratios. J Air Transp Manag 6(3):133–142

246

U. Aydın et al.

Freeman LC (1979) Centrality in social networks conceptual clarification in Hawaii nets conferences. Social networks. Int J Struct Anal Lausanne 1(3):215–239 García-Valderrama T, Mulero-Mendigorri E, Revuelta-Bordoy D (2009) Relating the perspectives of the balanced scorecard for R&D by means of DEA. Eur J Oper Res 196(3):1177–1189 Gnewuch M, Wohlrabe K (2018) Super-efficiency of education institutions: an application to economics departments. Educ Econ 26(6):610–623 Golany B, Roll Y (1989) An application procedure for DEA. Omega 17(3):237–250 Guo P, Qi X, Zhou X, Li W (2018) Total-factor energy efficiency of coal consumption: an empirical analysis of China’s energy intensive industries. J Clean Prod 172:2618–2624 Ho CTB, Wu DD (2009) Online banking performance evaluation using data envelopment analysis and principal component analysis. Comput Oper Res 36(6):1835–1842 Jothimani D, Shankar R, Yadav SS (2017) A PCA-DEA framework for stock selection in Indian stock market. J Model Manag 12(3):386–403 Kádárová J, Durkáˇcová M, Teplická K, Kádár G (2015) The proposal of an innovative integrated BSC–DEA model. Proc Econ Financ 23:1503–1508 Kaplan RS, Norton DP (1996) Using the balanced scorecard as a strategic management system. Harvard Business Review 74:75–85 Kottas AT, Madas MA (2018) Comparative efficiency analysis of major international airlines using data envelopment analysis: exploring effects of alliance membership and other operational efficiency determinants. J Air Transp Manag 70:1–17 Kuljanin J, Kali´c M, Caggiani L, Ottomanelli M (2019) A comparative efficiency and productivity analysis: implication to airlines located in Central and South-East Europe. J Air Transp Manag 78:152–163 Law KM, Breznik K (2018) What do airline mission statements reveal about value and strategy? J Air Transp Manag 70:36–44 Li Y, Wang YZ, Cui Q (2015) Evaluating airline efficiency: an application of virtual frontier network SBM. Transp Res Part E: Log Transp Rev 81:1–17 Liu JS, Lu WM (2010) DEA and ranking with the network-based approach: a case of R&D performance. Omega 38:453–464 Liu W, Xia Y, Hou J (2019) Health expenditure efficiency in rural China using the super-SBM model and the Malmquist productivity index. Int J Equity Health 18(1):111 Lu WM, Wang WK, Hung SW, Lu ET (2012) The effects of corporate governance on airline performance: production and marketing efficiency perspectives. Transp Res Part E: Log Transp Rev 48(2):529–544 Mallikarjun S (2015) Efficiency of US airlines: a strategic operating model. J Air Transp Manag 43:46–56 Min H, Joo SJ (2016) A comparative performance analysis of airline strategic alliances using data envelopment analysis. J Air Transp Manag 52:99–110 Nissi E, Rapposelli A (2008) A data envelopment analysis study of airline efficiency. In: Mantri JK (eds) Research methodology on data envelopment analysis. Brown Walker Press, Boca Raton, pp 269–280 Noori B (2015) Prioritizing strategic business units in the face of innovation performance: combining fuzzy AHP and BSC. Int J Bus Manage 3(1):36–56 Ouellette P, Petit P, Tessier-Parent LP, Vigeant S (2010) Introducing regulation in the measurement of efficiency, with an application to the Canadian air carriers’ industry. Eur J Oper Res 200(1):216–226 Panicker S, Seshadri V (2013) Devising a balanced scorecard to determine Standard Chartered Bank’s performance: a case study. Int J Bus Res Dev 2(2) Pineda PJG, Liou JJH, Hsu CC, Chuang YC (2018) An integrated MCDM model for improving airline operational and financial performance. 
J Air Transp Manag 68:103–117 Pires HM, Fernandes E (2012) Malmquist financial efficiency analysis for airlines. Transp Res Part E: Log Transp Rev 48(5):1049–1055 Sabidussi G (1966) The centrality index of a graph. Psychometrika 31(4):581–603

Enhanced Performance Assessment of Airlines with Integrated Balanced. . .

247

Sakthidharan V, Sivaraman S (2018) Impact of operating cost components on airline efficiency in India: a DEA approach. Asia Pac Manag Rev 23(4):258–267 Saranga H, Nagpal R (2016) Drivers of operational efficiency and its impact on market performance in the Indian Airline industry. J Air Transp Manag 53:165–176 Simar L, Wilson PW (2007) Estimation and inference in two stage, semi-parametric models of productive efficiency. J Econ 136:31–64 Stoica O, Mehdian S, Sargu A (2015) The impact of internet banking on the performance of Romanian banks: DEA and PCA approach. Proc Econ Financ 20:610–622 Tavassoli M, Faramarzi GR, Saen RF (2014) Efficiency and effectiveness in airline performance using a SBM-NDEA model in the presence of shared input. J Air Transp Manag 34:146–153 Teker S, Teker D, Güner A (2016) Financial performance of top 20 airlines. Procedia Soc Behav Sci 235:603–610 Tone K (2002) A slacks-based measure of super-efficiency in data envelopment analysis. Eur J Oper Res 143:32–41. https://doi.org/10.1016/S0377-2217(99)00407-5 Tran TH, Mao Y, Nathanail P, Siebers PO, Robinson D (2019) Integrating slacks-based measure of efficiency and super-efficiency in data envelopment analysis. Omega 85:156–165. https:// doi.org/10.1016/j.omega.2018.06.008 Tsionas MG, Chen Z, Wanke P (2017) A structural vector autoregressive model of technical efficiency and delays with an application to Chinese airlines. Transp Res A Policy Pract 101:1– 10 Wang Y (2008) Applying FMCDM to evaluate financial performance of domestic airlines in Taiwan. Expert Syst Appl 34(3):1837–1845 Wang RT, Ho CT, Feng CM, Yang YK (2004) A comparative analysis of the operational performance of Taiwan’s major airports. J Air Transp Manag 10(5):353–360 Wasserman S (1994) Advances in social network analysis: research in the social and behavioral sciences. Sage, Newbury Park Wu H, Li Y (2017) The impacts of female executives on firm performances: based on principle component analysis (PCA) and data envelopment analysis (DEA). In: Proceedings of the tenth international conference on management science and engineering management. Springer, Singapore, pp 223–235 Wu WY, Liao YK (2014) A balanced scorecard envelopment approach to assess airlines’ performance. Ind Manag Data Syst 114(1):123–143 Wu Y, Ke Y, Xu C, Xiao X, Hu Y (2018) Eco-efficiency measurement of coal-fired power plants in China using super efficiency data envelopment analysis. Sustain Cities Soc 36:157–168 Yap GLC, Ismail WR, Isa Z (2013) An alternative approach to reduce dimensionality in data envelopment analysis. J Mod Appl Stat Methods 12(1):17 Zhang J, Fang H, Wang H, Jia M, Wu J, Fang S (2017) Energy efficiency of airlines and its influencing factors: a comparison between China and the United States. Resour Conserv Recycl 125:1–8 Zins AH (2001) Relative attitudes and commitment in customer loyalty models: some experiences in the commercial airline industry. Int J Serv Ind Manag 12(3):269–294

The Effects of Country Characteristics on Entrepreneurial Activities

Seda Yanık and Nihat Can Sinayiş

Abstract Entrepreneurship and start-up companies are today among the most important sources of innovation and economic growth. Ecosystems that foster entrepreneurship are therefore receiving increasing emphasis in almost all economies. Efforts to increase entrepreneurship can only succeed when its root causes and their relationship with entrepreneurial actions are understood. This chapter investigates the main factors behind the creation of new businesses and compares two main country groups: countries with advanced economies and countries with emerging and developing economies are set side by side to understand whether the identified factors generalize across these groups. We hypothesize that entrepreneurial intentions and actions are closely related to four main factors: perceived behavioral control, business environment, human factors, and governmental factors. A model of the interrelationships among these factors is proposed and then estimated using structural equation modeling to test our hypotheses. The partial least squares method is employed, and bootstrapping and multigroup analysis are used to assess the significance of the model relationships. Results show that perceived behavioral control is the most effective factor for entrepreneurial intentions and actions in both advanced and emerging countries, with a slightly stronger effect in emerging countries. In contrast, the business environment has a negative effect on entrepreneurial intentions and actions in emerging countries. Human factors also have a small effect in both country groups, slightly higher in the advanced economies. Moreover, governmental factors act as a second-tier component that significantly affects human factors and the business environment, and this effect is stronger for emerging countries.

S. Yanık () · N. C. Sinayiş
Industrial Engineering Department, Istanbul Technical University, Istanbul, Turkey
e-mail: [email protected]; [email protected]

© Springer Nature Switzerland AG 2021
Y. I. Topcu et al. (eds.), Multiple Criteria Decision Making, Contributions to Management Science, https://doi.org/10.1007/978-3-030-52406-7_10


1 Introduction

Entrepreneurship is defined as the organization constructed by someone who starts a company, arranges business deals, and takes risks in order to make a profit. An entrepreneurial venture has to be distinctive in order not to be considered a regular business: it should introduce brand-new products, or products improved over their former versions, and it should be able to compete in an emerging or an already operating market. An entrepreneurship has to satisfy the following conditions (Davidsson 2016):

• It has to create unusual alternatives for customers and provide them additional value with the capital they have.
• It has to develop market offerings for actors that are operationally related to the corporation's tasks by improving the efficiency factors.
• It should pull new entrants into the entrepreneurship's market and create a new competitive atmosphere.

Nevertheless, entrepreneurship has various definitions around the globe. Countries attach different meanings to entrepreneurship; the meaning even changes among individuals of the same community, although countries commonly have a general understanding of it. Moreover, the academic community defines entrepreneurship from different points of view. Some definitions of entrepreneurship from the academic community can be listed as follows (Davidsson 2016):

• New entry
• The creation of new enterprise
• The creation of new organizations
• A purposeful activity to initiate, maintain, and aggrandize a profit-oriented business
• The process by which individuals, either on their own or inside organizations, pursue opportunities without regard to the resources they currently control
• The occupational choice to work for one's own account and risk
• The junction where venturesome individuals and valuable business opportunities meet
• A specific effort by an existing firm or new entrant to introduce a new combination of resources
• The act by which new firms come into existence

There is no consensus on the definition of entrepreneurship because it is a societal and economic phenomenon that changes across communities owing to their differences in certain aspects; it depends largely on the behavior and outcomes of societies (Davidsson 2016). All in all, the different definitions of entrepreneurship are shaped by cultures and by how those cultures understand entrepreneurship. Thus, we can hypothesize that the pathway leading to entrepreneurial success varies among countries.


This paper investigates the main factors behind the creation of new businesses and compares two main country groups. Countries with advanced economies and countries with emerging and developing economies are set side by side to understand whether the identified factors generalize across these groups. In this study, we hypothesize that entrepreneurial actions are closely related to four main factors: perceived behavioral control, business environment, human factors, and governmental factors. These main factors are investigated as the basis of creating new ventures. A model of the interrelationships among these factors is proposed and then estimated using structural equation modeling to test our hypotheses. The partial least squares method is used because it provides accurate results with medium-sized samples. Bootstrapping and multigroup analysis are applied to assess the significance of the model relationships, and the results are interpreted to draw quantitative and qualitative conclusions.

The data used in this research are obtained from two databases of the Global Entrepreneurship Monitor (GEM): the National Expert Survey (NES) and the Adult Population Survey (APS). The NES provides the national perspective on new businesses, and the APS supplies the viewpoint of individuals toward entrepreneurship. Before the model is run, the collected data are tested with the Kaiser–Meyer–Olkin test and Bartlett's test to check whether they are suitable for the analyses. The model is estimated separately for the data of the country cluster of advanced economies and the data of the country cluster of emerging markets and developing economies. The relationships of the factors that lead to entrepreneurial success are then compared across these two clusters of countries.

The study is organized as follows: In Sect. 2 we review studies with different perspectives. Section 3 describes the Global Entrepreneurship Monitor (GEM) database for the interested reader. In Sect. 4, the entrepreneurship model is proposed. The methodology and the application of the model are presented in Sects. 5 and 6, respectively. In Sect. 7, conclusions are provided.

2 Literature Review

There is a vast literature on entrepreneurial success and failure, their relationship with many different factors, and their effects on various outcomes. We conducted a review of recent studies and report works with different perspectives in the following.

Van Stel et al. (2005) study the effect of entrepreneurial activity on national economic growth and use GEM data to analyze whether total entrepreneurial activity (TEA) has an effect on GDP growth. They include 36 countries in their statistical model and conduct t-tests and regression analysis. They conclude that entrepreneurial activity has a significant influence on economic growth and that this influence is correlated with per capita income; thus, the effect of entrepreneurship varies with the economic situation of a country.


Hechavarria and Ingram (2016) analyze various forms of entrepreneurship and the variation of entrepreneurial activity by gender. They use GEM and World Values Survey data with a specific focus on female individuals, analyzing data from 55 countries for the year 2009 with a logistic multilevel model. They conclude that female entrepreneurs are less likely to found commercial ventures than social ventures, while males tend to create fewer social ventures in hegemonic masculine societies. Maniyalath and Narendran (2016) also use GEM data to examine whether socioeconomic variables measuring the national income component of the Human Development Index (HDI) predict female entrepreneurship rates. They test their propositions with these variables in cross-country regression analyses and conclude that religion has a strong effect on female entrepreneurship depending on the religiosity of the country. Wyrwich et al. (2016) test how entrepreneurial peers come together to form new businesses; the comparison is made by dividing Germany into East and West to analyze cultural differences within a country. Pinillos and Reyes (2011) analyze the relationship between individualist–collectivist culture and entrepreneurial activity using data from 52 countries. They conclude that the entrepreneurship rate is negatively related to individualism in countries with low or medium GDP per capita, whereas when the development level is high, individualism is positively correlated with the entrepreneurship rate. Barazandeh et al. (2015) examine the relationship between entrepreneurs' business performance and the social norms affecting entrepreneurial competencies; data from 125 cases across 59 countries are analyzed with confirmatory factor analysis. They conclude that social norms positively relate to entrepreneurs' competencies, which in turn positively relate to business performance. Stuetzer et al. (2014) use GEM data on Western Germany in a multilevel analysis to examine the relationship between regional characteristics and individual entrepreneurship. Their hypotheses were not supported, as they found no correlation between regional knowledge creation, the economic context, and entrepreneurial culture. They did find, however, that these variables have an indirect effect on individuals' perception of founding opportunities, which in turn creates entrepreneurial intentions. Wach (2015) uses GEM data to analyze the relationship between culture and entrepreneurship in EU countries. He first clusters EU countries into segments and proposes hypotheses using variables such as cultural and social norms, perceived opportunities, perceived capabilities, fear of failure, early-stage entrepreneurial activity (TEA), and entrepreneurship as a good career choice. He concludes that there is no evidence that innovation-driven economies are more entrepreneurial than efficiency-driven economies. He also finds that entrepreneurs from entrepreneurial cultures perceive more entrepreneurial opportunities, which results in a much higher rate of new businesses, and that necessity-based entrepreneurship is rather low in entrepreneurial cultures, as these two variables are negatively correlated.


As summarized above, the GEM database has been widely used in the academic community to analyze entrepreneurship. The analyses commonly center on the individual characteristics of entrepreneurs, the effect of culture, and success. In our study, we take a macro perspective and aim to understand the effect of success factors by comparing two classes of countries, advanced and emerging, and to identify the differences between these two groups in terms of entrepreneurship success factors.

3 The Global Entrepreneurship Monitor Database

The Global Entrepreneurship Monitor (GEM) is the world's leading study on entrepreneurship. It is managed centrally and has an international reach, providing high-quality information, annually prepared detailed reports, and more. GEM began as a joint project of two institutions on opposite sides of the Atlantic: London Business School in the UK and Babson College in the USA. The main objective was to discover why some countries are more "entrepreneurial" than others. GEM provides its data free of charge to all users. Various organizations, research centers, and media outlets use GEM as a reference when analyzing entrepreneurship, and organizations such as the OECD, the United Nations, the World Bank, and the World Economic Forum use GEM data in policy-making processes. The database covers more than 100 countries over an 18-year period, collected via more than 200,000 interviews a year with the help of more than 500 specialists worldwide; it is funded by more than 200 funding institutions and used by at least 300 academic and research institutions.

GEM provides two kinds of surveys which form the whole database: "Entrepreneurial Behavior and Attitudes" and "The Entrepreneurial Ecosystem." Entrepreneurial Behavior and Attitudes is supported by the Adult Population Survey (APS), and The Entrepreneurial Ecosystem is supported by the National Expert Survey (NES). In summary, the national context and how it impacts entrepreneurship are collected in the NES, while the entrepreneurial behavior and attitudes of individuals are collected in the APS. Both databases are organized according to three criteria: country, indicator, and year.

3.1 National Expert Survey

The National Expert Survey (NES) is one of the two main databases of GEM. The NES is considered the indicator of the Entrepreneurial Framework Conditions. With various variables for different countries, it helps users characterize the environment in which a given entrepreneurial venture operates, and it creates a better understanding of the creation and growth processes of new businesses.


Each GEM country must have a minimum of 36 experts to create the national survey. First, the experts are selected. Then, the experts conduct the survey by finding volunteers who can fill out the form. The completed forms are sent to the NES coordinator to be checked, and a quality control is performed to make sure the data are viable. If the data are correct, they are stored and the harmonization process begins.

3.2 Adult Population Survey

The Adult Population Survey (APS) is the other main survey in the GEM database. It captures the entrepreneurship characteristics of individuals in different countries. Unlike the NES, the data collection process in the APS is more decentralized and varies according to the teams in different countries. In countries with over 85% telephone landline coverage, interviews are conducted by phone; in countries lacking landline coverage, they are conducted face-to-face. A minimum of 2000 volunteers per country contribute to the APS data collection.

4 The Proposed Model of Entrepreneurship Success

Considering the different perspectives on the definition of entrepreneurship and the various academic works, we set a new course in approaching and understanding entrepreneurship. The literature review has enabled us to gather various perspectives, and a new model is proposed using these established perspectives together with new factors. The aim of this research is to answer what motivates entrepreneurial intentions and actions and how these factors differ across countries.

This research defines four main factors which affect the success of entrepreneurships: perceived behavioral control, business environment, human factors, and governmental factors. It is assumed that all of these factors have certain impacts on entrepreneurial intentions and actions in different countries, and that there are relationships between these factors. The hypotheses tested in this study are as follows:

H1: The governmental factors of a country positively influence the human factors of the country.
H2: The governmental factors of a country positively influence the business environment of the country.
H3: The human factors of a country have a positive influence on the business environment of the country.
H4: The human factors of a country positively affect entrepreneurial intention and action.
H5: The business environment of a country has a negative effect on entrepreneurial intention and action.


Fig. 1 The conceptual model (paths H1–H7 linking governmental factors, human factors, business environment, perceived behavioral control, fear of failure, and entrepreneurial intention and action)

H6: Perceived behavioral control positively affects entrepreneurial intention and action.
H7: Fear of failure moderates the relationship between perceived behavioral control and entrepreneurial intention and action.

Building these hypotheses is the first step of creating the model shown in Fig. 1. All of the hypotheses are tested for each of the two categories of countries: countries with advanced economies and countries with emerging market and developing economies.

4.1 Variable Selection and Definition

In order to test the presumed hypotheses further, independent variables need to be identified. These independent variables affect their related dependent variables, which in this case are the four main factors and the entrepreneurship factor. The independent variables are selected from the wide range of variables provided in the GEM database.


4.1.1 Governmental Factors

Governmental factors inevitably have a strong influence on the outputs of a country's economy, and their effect on entrepreneurship has been analyzed by many scholars in the literature (Armour and Cumming 2006; Bjørnskov and Foss 2008; Griffiths et al. 2009; Obaji and Olugu 2014). Here we consider governmental factors as a model component affecting the human factors and the business environment of a country, which in turn affect entrepreneurial intention and action. Because of this indirect influence on entrepreneurial intention and action, we refer to it as a second-tier factor.

In this study, we represent the governmental factors using variables from the NES sub-databases. From the NES database, the variables "governmental support and policies," "taxes and bureaucracy," and "governmental programs" are chosen as independent variables to be combined to represent the dependent variable, governmental factors. "Governmental support and policies" is defined as the extent to which public policies support entrepreneurship as a relevant economic issue. "Taxes and bureaucracy" indicates the extent to which public policies support entrepreneurship in the sense that taxes or regulations are either size-neutral or encourage new firms and SMEs. "Governmental entrepreneurship programs" refers to the presence and quality of programs directly assisting SMEs at all levels of government (national, regional, and municipal).

4.1.2 Human Factors

For this dependent variable, the associated indicators are also collected from the NES database. For human factors, the indicators are "cultural and social norms," "basic school entrepreneurial education and training," and "post school entrepreneurial education and training." In the literature, entrepreneurial culture and human capital have been referred to in various studies (Davidsson and Wiklund 1997; Obschonka et al. 2015; Terjessen et al. 2016). "Cultural and social norms" shows the extent to which cultural and social norms endorse actions to create new businesses in order to increase personal wealth. The other two indicators, "basic school entrepreneurial education and training" and "post school entrepreneurial education and training," are defined by the level of educational practice in creating and managing SMEs within the curriculum: primary and secondary education constitute the basic school level, while higher education such as colleges and business schools constitutes the post-school level.

4.1.3 Business Environment

The business environment of a country is one of the most important determinants of economic activities and thus of entrepreneurial activities; it has been referred to in entrepreneurship research many times (Lee and Peterson 2000; Thai and Turkina 2014). The business environment factor is constructed using the indicators "commercial and professional infrastructure," "financing for entrepreneurs," "internal market openness," "physical and services infrastructure," and "R&D transfer." The GEM terminology describes these indicators as follows. "Commercial and professional infrastructure" is the presence of property rights and of commercial, accounting, and other legal and assessment services and institutions that support or promote SMEs. "Financing for entrepreneurs" is the availability of financial resources (equity and debt) for small and medium enterprises (SMEs), including grants and subsidies. "Internal market openness" is the extent to which new firms are free to enter existing markets. "Physical infrastructure" is the ease of access to physical resources (communication, utilities, transportation, land, or space) at a price that does not discriminate against SMEs. "R&D transfer" is the extent to which national research and development leads to new commercial opportunities and is available to SMEs.

4.1.4 Perceived Behavioral Control

Perceived behavioral control is defined as the attractiveness of the proposed behavior, or the degree to which the person positively or negatively evaluates the idea of becoming an entrepreneur; it refers to the perceived ease or difficulty of becoming an entrepreneur (Ajzen 1991; Liñán 2004). The dimensions of perceived opportunities and capability beliefs in this factor have been addressed in entrepreneurship studies (Griffiths et al. 2009). Another dimension, the perceived attractiveness of self-employment, has also been referred to in the eclectic theory of entrepreneurship (Thai and Turkina 2014; Verheul et al. 2002). The perceived behavioral control factor is constructed using the following three indicators from the APS database of GEM: "perceived opportunities," "entrepreneurship as a good career choice," and "high status to successful entrepreneurs." "Perceived opportunities" is the percentage of the 18–64 population (individuals involved in any stage of entrepreneurial activity excluded) who see good opportunities to start a firm in the area where they live. "Entrepreneurship as a good career choice" refers to the percentage of the 18–64 population who agree with the statement that in their country, most people consider starting a business a desirable career choice. "High status to successful entrepreneurs" shows the percentage of the 18–64 population who agree with the statement that in their country, successful entrepreneurs receive high status.


4.1.5 Entrepreneurial Intention and Action

The last and central construct of the model is the "entrepreneurial intention and action" factor. We aim to understand how the other factors foster entrepreneurial intentions and actions and how they interact with each other. Many studies have examined the dynamics of entrepreneurship from different perspectives in the literature (Kuratko et al. 2015; Obaji and Olugu 2014; Griffiths et al. 2009). "Entrepreneurial intention and action" is constructed using four indicators from the APS database. The first is "perceived capabilities," defined as the percentage of the 18–64 population (individuals involved in any stage of entrepreneurial activity excluded) who believe they have the required skills and knowledge to start a business. The second is the "entrepreneurial intentions rate," showing the percentage of the 18–64 population (individuals involved in any stage of entrepreneurial activity excluded) who are latent entrepreneurs and intend to start a business within 3 years. Another variable is the "Total early-stage Entrepreneurial Activity (TEA) rate," which measures the percentage of the 18–64 population who are either a nascent entrepreneur or the owner-manager of a new business. Finally, "established business ownership," reflecting a mature level of entrepreneurial activity, is defined as the percentage of the 18–64 population who own and manage an established business that is at least 42 months old.

4.1.6 Fear of Failure

An additional variable in the model is the "fear of failure rate." We question whether this variable changes the nature of the relationship between the two constructs perceived behavioral control and entrepreneurial intention and action. Fear of failure is taken from the APS database and is defined as the percentage of the 18–64 population (individuals involved in any stage of entrepreneurial activity excluded) who indicate that fear of failure would prevent them from setting up a business.

4.2 Dataset

As mentioned before, the data used in this research are collected from the GEM database (GEM 2019). We use the data to estimate the hypothesized model and to draw further qualitative conclusions for two country groups: countries with advanced economies (AE) and countries with emerging markets and developing economies (EMDE). The country groups are adopted from the categorization of the International Monetary Fund (IMF); countries in the database are matched with the IMF's country database (IMF Data Mapper 2019).

GEM provides data for various countries starting from 2001. However, some of the variables used in this research have no figures for some of the years, and a few of the variables were introduced after the initiation of the GEM project. Therefore, a method is required to handle the missing values in the dataset. We use missing value imputation with the k-nearest neighbor (k-NN) algorithm. The k-NN algorithm uses feature similarity to predict the values of missing data points: a missing value is assigned based on how closely the observation resembles the other points in the dataset. Thus, we first find the k closest neighbors of the observation with the missing value and then impute the value from the non-missing values in the neighborhood. With this imputation approach, there is no missing data when the model is run in SmartPLS, which increases the accuracy of the results reported in this study. A minimal sketch of this imputation step is given after this paragraph.

In total, 828 observations are used in this study: 392 of these observations belong to the countries with advanced economies and 436 of them belong to the countries with emerging markets and developing economies, as can be verified from the country frequencies. The frequency of appearance of these countries in the formed database is given in Table 1.
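As an illustration of the k-NN imputation step described above, the following minimal sketch uses scikit-learn's KNNImputer on a hypothetical indicator matrix. The file name, the column names, and the choice of k = 5 are assumptions made for illustration, not the exact settings used in the study.

```python
import pandas as pd
from sklearn.impute import KNNImputer

# Hypothetical GEM indicator matrix: rows are country-year observations,
# columns are the indicators used in the model (names are illustrative).
data = pd.read_csv("gem_indicators.csv")
indicator_cols = ["Gov.Sup", "Tax.Bur", "Gov.Prog", "Bas.Sch", "PostSch",
                  "Cult.Norm", "Finc.Ent", "Mark.Open", "Per.Opp", "TEA"]

# Impute each missing cell from the k nearest observations,
# measured by similarity on the non-missing indicators.
imputer = KNNImputer(n_neighbors=5, weights="distance")
data[indicator_cols] = imputer.fit_transform(data[indicator_cols])

# The completed matrix can then be exported for use in SmartPLS.
data.to_csv("gem_indicators_imputed.csv", index=False)
```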

5 Methodology

The methodology for generating a well-fitting model involves multiple steps. Figure 2 provides a flowchart of the approach taken in this study. The evaluation process is the most significant phase: applying the appropriate measures for reflective and formative measurement models helps the study arrive at the best outcome.

5.1 Structural Equation Modeling

There are numerous statistical analysis techniques that can be used to clarify the explanatory ability and efficiency of statistical data (Hair et al. 2014). The model built for this study consists of multiple relationships between the defined dependent variables; because of that, multiple regression, factor analysis, or other single-relationship techniques cannot give accurate results. The best option is to apply structural equation modeling (SEM). SEM can investigate multiple dependence relationships at the same time (Hair et al. 2014) and reveals multiple, interrelated, dependent relationships along with the unobserved relationships among the variables. In order to understand the relationships between theoretical concepts, SEM provides a basis for forming a model that is supported with measured data: sets of measured data are used to explain the related constructs, and in that way the relationships between the constructs can be understood with reduced measurement error. It is important to note that SEM is not an exploratory analysis. The objective of this study is to confirm whether the predefined hypotheses are correct; that is why a confirmatory analysis technique such as SEM is appropriate for this research. SEM requires modifications by the researcher throughout its processing.


Table 1 Frequencies of country data of the model database

Advanced economies: Australia 11, Austria 6, Belgium 12, Canada 10, Cyprus 3, Czech Republic 3, Denmark 11, Estonia 6, Finland 16, France 11, Germany 17, Greece 16, Hong Kong 5, Iceland 8, Ireland 17, Israel 12, Italy 15, Japan 8, Latvia 10, Lithuania 4, Luxembourg 6, Netherlands 15, New Zealand 5, Norway 15, Portugal 9, Puerto Rico 7, Singapore 10, Slovakia 8, Slovenia 17, South Korea 12, Spain 18, Sweden 12, Switzerland 14, Taiwan 10, United Kingdom 16, United States 17.

Emerging economies: Algeria 3, Angola 5, Argentina 17, Bangladesh 1, Barbados 5, Belize 2, Bolivia 3, Bosnia and Herzegovina 8, Botswana 4, Brazil 18, Bulgaria 4, Burkina Faso 3, Cameroon 3, Chile 16, China 12, Colombia 12, Costa Rica 3, Croatia 17, Dominican Republic 3, Ecuador 10, Egypt 7, El Salvador 3, Ethiopia 1, Georgia 2, Ghana 3, Guatemala 9, Hungary 13, India 10, Indonesia 7, Iran 10, Jamaica 9, Jordan 2, Kazakhstan 5, Kosovo 1, Lebanon 4, Libya 1, Macedonia 6, Madagascar 2, Malawi 2, Malaysia 10, Mexico 11, Montenegro 1, Morocco 4, Namibia 2, Nigeria 3, Pakistan 3, Palestine 2, Panama 9, Peru 14, Philippines 4, Poland 9, Qatar 4, Romania 5, Russia 11, Saudi Arabia 5, Senegal 1, Serbia 3, South Africa 16, Sudan 1, Suriname 2, Syria 1, Thailand 11, Tonga 1, Trinidad and Tobago 5, Tunisia 4, Turkey 9, Uganda 7, United Arab Emirates 7, Uruguay 12, Vanuatu 1, Venezuela 5, Vietnam 4, Zambia 3.

Fig. 2 Stages of the methodology (flowchart steps: draft the model; collect data and test its suitability; execute the model with PLS-SEM; assess the PLS-SEM results of the complete set; if the model does not pass the tests, evaluate the model of the complete data set, execute the necessary modifications, and re-run it; once it passes, execute bootstrapping and PLS-MGA and report quantitative and qualitative test results)

Results should be analyzed carefully to develop the model and finalize the conclusions of the research. Throughout the application of the model, the required actions are taken to improve the results and minimize the errors.

In SEM terminology, a latent variable or construct is a variable that is explained by multiple indicators, while a measured (observed) variable or indicator is a variable that is provided by the researcher. Indicators are used to form constructs in structural models. An endogenous construct is a construct that depends on one or more other constructs, whereas an exogenous construct is an independent latent variable that is defined only by its indicators. The structural model is the layout of the dependence relationships among the constructs; Fig. 1 can be considered the basic structural model of this study. Another term, the path diagram, is the visual representation of all the relationships of the model (Hair et al. 2014); in that sense, Fig. 3 is the path diagram of the model. Single-headed arrows show the dependence relationships between the indicators and the latent constructs or between the latent constructs.

5.2 Partial Least Squares Method

Since the data set of this research is not vast in size, the partial least squares (PLS) method is employed, among many available methods, for estimating the model. It is important to note that structural equation modeling with the partial least squares method (PLS–SEM) is not the same method as PLS regression, which is also a multivariate data analysis technique. PLS–SEM works well with small samples, and as the sample gets larger, the precision of the estimates increases. Also, there are no restrictions on the number of indicators required to build a construct: as more significant indicators are introduced for each construct, the validity of the model increases. In contrast to many other SEM methods, PLS does not allow circular, bidirectional relationships; only dependence relationships are allowed in building the model. The objective of PLS–SEM is to minimize the unexplained variance of the endogenous constructs by estimating the relationships in the structural model (Hair et al. 2017).


Fig. 3 The model with outer loadings and total effects for the complete data set

An additional benefit of PLS–SEM is multigroup analysis (MGA), which is used in this study.

6 Application of the Model

The application of the model consists of estimating the relationships and their significance. We first introduce the tests and measures used to examine the validity of the data and to evaluate the model relationships, and then investigate the model fit and the multigroup differences in the following subsections.


6.1 Kaiser–Meyer–Olkin and Bartlett's Test

Before the SEM model is initiated, the data collected from GEM need to be validated. To check the validity of the data, two tests, the Kaiser–Meyer–Olkin (KMO) test and Bartlett's test, are applied in IBM's SPSS program. The KMO test measures the suitability of the variables for factor analysis: the correlation matrix of the variables is formed to check whether the data are suitable for further tests. If the KMO measure of sampling adequacy is higher than 0.5, the data are viable for testing the model; higher values mean the data are more suitable for factor analysis. Bartlett's test takes as its null hypothesis that the correlation matrix of the indicators is an identity matrix, i.e., that there are no significant relationships between the indicators. The test is considered successful, and the data viable for factor analysis, if the significance level is less than 0.05. For the model, Bartlett's test p-value is 0.000, which means that the test is significant and the data can be used in PLS–SEM.
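For readers who prefer an open-source route, the same two checks can be run outside SPSS. The sketch below uses the factor_analyzer package on a hypothetical indicator DataFrame; the file name is an assumption made for illustration.

```python
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

# Hypothetical matrix of GEM indicators (observations x indicators).
data = pd.read_csv("gem_indicators_imputed.csv").select_dtypes("number")

# Bartlett's test of sphericity: the null hypothesis is that the correlation
# matrix is an identity matrix; p < 0.05 means factor analysis is warranted.
chi_square, p_value = calculate_bartlett_sphericity(data)

# KMO measure of sampling adequacy: an overall value above 0.5 is acceptable.
kmo_per_variable, kmo_overall = calculate_kmo(data)

print(f"Bartlett chi-square = {chi_square:.1f}, p = {p_value:.4f}")
print(f"Overall KMO = {kmo_overall:.3f}")
```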

6.2 Evaluation of the Model

In order to create a valid and reliable model that maximizes the explained variance of the endogenous variables, the model has to be assessed with a set of measures. Figure 3 shows the model after PLS–SEM is run; the numbers between indicators and constructs are the outer loadings, and the numbers between the constructs are the total effects. According to Hair et al. (2017), the measurement model must pass the following measures, and this evaluation needs to be conducted on the complete data set to obtain a viable model. For reflective measurement models, internal consistency is evaluated with composite reliability, convergent validity with the average variance extracted (AVE), and discriminant validity with the heterotrait–monotrait ratio (HTMT). For formative measurement models, convergent validity, collinearity between the indicators, and the significance of the outer weights are assessed (Hair et al. 2017).

Outer loading relevance testing examines the loadings between the latent variables and the indicators. According to Hair et al. (2017), outer loadings less than 0.4 should be removed from the model, and outer loadings above 0.7 can remain. Indicators with loadings between 0.4 and 0.7 can remain in the model only if composite reliability and average variance extracted do not increase with the removal of the indicator.

The average variance extracted (AVE) measures the convergent validity of the constructs in a model (Hair et al. 2017). It is the mean of the squared loadings of the indicators of each construct (Eq. 1). The minimum AVE value for each construct needs to be 0.5, which is the square of 0.7, the minimum requirement for the outer loadings.

$$\mathrm{AVE} = \frac{\sum_{i=1}^{M} l_i^2}{M} \tag{1}$$

Cronbach's alpha is the measure usually used to evaluate the internal consistency reliability of a model; it assumes that all of the outer loadings of a construct are equal (Hair et al. 2017). In contrast, PLS–SEM estimates a different outer loading for each indicator, so Cronbach's alpha is a conservative measure of the internal consistency reliability of this research's model. For that reason, another measure, composite reliability, is used to assess the reflective measurement model. It is calculated as the squared sum of the loadings of a construct divided by that same quantity plus the sum of the measurement error variances of the construct's indicators, where the error variance of an indicator is one minus its squared loading (Eq. 2). It is recommended that this value be at least 0.7.

$$\rho_c = \frac{\left(\sum_{i=1}^{M} l_i\right)^2}{\left(\sum_{i=1}^{M} l_i\right)^2 + \sum_{i=1}^{M} \operatorname{var}(e_i)} \tag{2}$$
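As a small worked example of Eqs. (1) and (2), the following sketch computes AVE and composite reliability from a set of outer loadings; the loadings used are those reported for the governmental factors construct in Table 3, and the error-variance term is approximated as one minus the squared loading, as described above.

```python
import numpy as np

def ave(loadings: np.ndarray) -> float:
    # Eq. (1): mean of the squared outer loadings.
    return float(np.mean(loadings ** 2))

def composite_reliability(loadings: np.ndarray) -> float:
    # Eq. (2): squared sum of loadings over itself plus summed error variances,
    # with each error variance taken as 1 - loading^2.
    squared_sum = loadings.sum() ** 2
    error_var = np.sum(1.0 - loadings ** 2)
    return float(squared_sum / (squared_sum + error_var))

# Outer loadings of the governmental factors indicators (Gov.Prog, Gov.Sup, Tax.Bur).
gov_loadings = np.array([0.903, 0.873, 0.855])
print(f"AVE = {ave(gov_loadings):.3f}")                                       # ~0.769
print(f"Composite reliability = {composite_reliability(gov_loadings):.3f}")   # ~0.909
```

These values reproduce the figures reported for governmental factors in Table 4, which is a quick way to check a reconstructed loading set.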

The cross loading of an indicator shows the correlation of that indicator with all of the constructs in the model. The cross loadings approach is used to evaluate the discriminant validity of the model. The researcher's main aim in forming a PLS–SEM model should be to create distinct constructs so that the model can provide meaningful results. Cross loadings are the first step in understanding whether the indicators form a distinct construct or are more strongly correlated with other constructs; if the latter is the case, the construct has no unique specification, and the researcher either needs to remove the indicator from the model or assign the indicator to the correct construct.

Another discriminant validity measure used in this research is the heterotrait–monotrait ratio (HTMT). This ratio divides the mean of the correlations of indicators measuring different constructs by the mean of the correlations of the indicators within the same construct (Henseler et al. 2015). In brief, HTMT estimates what the correlation between two constructs would be if they were measured perfectly. It is expected that the HTMT ratio does not exceed 1.

Collinearity tests whether the indicators of the model are highly correlated with each other. When a pair of indicators is highly correlated, PLS–SEM cannot reliably estimate the weights of the indicators, and since PLS–SEM is usable with small samples, the sampling error is larger than in other SEM methods. Low collinearity of the formative indicators is therefore expected in the model. Assessing collinearity starts with estimating the tolerance of an indicator, which is obtained by regressing that indicator on all other indicators.


The proportion of the indicator's variance explained by this regression is subtracted from 1 to obtain the tolerance, and dividing 1 by the tolerance gives the variance inflation factor (VIF). In PLS–SEM, a tolerance level of 0.2 or less, or a VIF of 5 or more, indicates high collinearity (Hair et al. 2017): a VIF of 5 or more means that at least 80% of an indicator's variance is explained by the other indicators of the model.

If the indicators pass the collinearity evaluation, the next step is to test the significance of the outer weights with bootstrap sampling. Bootstrapping provides resampling statistics by drawing samples randomly with replacement; it is suggested that 5000 bootstrap samples provide accurate results for PLS–SEM (Hair et al. 2017). The evaluation requires the indicators to have significant outer weights. If an outer weight is not significant, the outer loading is checked: if it is more than 0.5, the indicator is retained, and if it is below that value, the significance of the outer loading is checked; if it is not significant, the indicator needs to be removed from the model.

Taking the above-mentioned model evaluation measures into account, we make some modifications to the indicators and constructs of the model during the evaluation process in order to obtain a reliable and valid model. Multiple models are tested to finalize the model so that it passes the measures and can be interpreted for the purposes of this research. In Table 2, the constructs, indicators, and their abbreviations are listed. The final model in Fig. 3 provides outer loadings that meet the minimum requirements suggested in the literature; the outer loading details are also given in Table 3. Most of the outer loadings are above 0.7; a few indicators have outer loadings between 0.6 and 0.7, but we keep them in the model because composite reliability and average variance extracted do not increase with their removal.

In order to ensure that the model has convergent validity and internal consistency, the AVE and composite reliability measures are examined in the SmartPLS output. Table 4 shows that for all of the constructs these measures are above the requirements of 0.5 and 0.7, respectively. The constructs in the final model are also found to be distinct: the discriminant validity check on the reflective model shows that every indicator is assigned to the construct with which it is most correlated, and the cross-loading values confirm that no indicator has a higher cross loading on any other construct. In addition, the heterotrait–monotrait ratio for each pair of constructs is below the maximum level of 1; the HTMT results can be observed in Table 5. The model therefore passes the discriminant validity checks as well as the internal consistency and convergent validity checks.

In evaluating the formative measurement model, the collinearity between the indicators and the relevance of the outer weights, along with the significance of the outer loadings, are presented in Table 3. These results are again obtained by applying bootstrapping with 5000 samples, as the PLS–SEM approach requires. All reported p-values are 0, meaning that all relationships between indicators and latent variables are significant. The model is thus ready to be interpreted and taken into the multigroup analysis.
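The collinearity check described above can be reproduced with standard tooling. The sketch below computes the VIF of each indicator with statsmodels, using an assumed DataFrame of the business environment indicators; the file and column names are illustrative.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical matrix of formative indicators (observations x indicators).
indicators = pd.read_csv("gem_indicators_imputed.csv")[
    ["Com.Infr", "Finc.Ent", "Mark.Open", "Phy.Infr", "RD.Tran"]
]

# Add a constant so each indicator is regressed on the others plus an intercept.
X = sm.add_constant(indicators)

# VIF = 1 / tolerance; values of 5 or more signal problematic collinearity.
vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(1, X.shape[1])],
    index=indicators.columns,
)
print(vif)
```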


Table 2 The constructs, indicators, and their abbreviations

Business environment: Commercial and professional infrastructure (Com.Infr.); Financing for entrepreneurs (Finc.Ent.); Internal market openness (Mark.Open); Physical and services infrastructure (Phy.Infr); R&D transfer (RD.Tran.)

Entrepreneurial intention and action: Entrepreneurial intentions rate (Ent.Inten); Established business ownership (Est.Bus); Perceived capabilities (Per.Cap); Total early-stage entrepreneurial activity (TEA) rate (TEA)

Governmental factors: Governmental programs (Gov.Prog); Governmental support and policies (Gov.Sup); Taxes and bureaucracy (Tax.Bur)

Human factors: Basic school entrepreneurial education and training (Bas.Sch); Post school entrepreneurial education and training (PostSch); Cultural and social norms (Cult.Norm)

Perceived behavioral control: Entrepreneurship as a good career choice (Ent.GoodC); Perceived opportunities (Per.Opp); High status to successful entrepreneurs (Status.Ent)

6.3 Moderating Effect

A moderating effect exists when an independent variable or construct changes the strength or the direction of the relationship between two constructs of a model (Hair et al. 2014). Thus, the relationship is not the same under every condition: its nature changes with the existence of a moderator variable. The moderating effect is added to the model as an interaction term, an independent (i.e., exogenous) variable obtained as the non-additive joint product with another independent variable; this interaction term affects a dependent (i.e., endogenous) variable of the model. For example, let us assume X is an independent variable affecting a dependent variable Y in the model, and let A be added as a moderator variable. In this case, the question "do X and A have a joint effect on Y that is different from the individual linear effects of X and A?" is hypothesized.


Table 3 Outer weight and outer loading significance of bootstrapping with the collinearity test result on the model (sample means, with bootstrap p-values in parentheses)

Bas.Sch ← Human factors: outer loading 0.843 (p = 0), outer weight 0.473 (p = 0)
Com.Infr. ← Business environment: outer loading 0.792 (p = 0), outer weight 0.216 (p = 0)
Cult.Norm ← Human factors: outer loading 0.796 (p = 0), outer weight 0.4 (p = 0)
Ent.GoodC ← Perceived behavioral control: outer loading 0.75 (p = 0), outer weight 0.435 (p = 0)
Ent.Inten ← Entrepreneurial intention and action: outer loading 0.874 (p = 0), outer weight 0.33 (p = 0)
Est.Bus ← Entrepreneurial intention and action: outer loading 0.629 (p = 0), outer weight 0.17 (p = 0)
Fear.Fail ← Failure fear: outer loading 1, outer weight 1
Finc.Ent. ← Business environment: outer loading 0.813 (p = 0), outer weight 0.25 (p = 0)
Gov.Prog ← Governmental factors: outer loading 0.903 (p = 0), outer weight 0.413 (p = 0)
Gov.Sup ← Governmental factors: outer loading 0.873 (p = 0), outer weight 0.345 (p = 0)
Mark.Open ← Business environment: outer loading 0.842 (p = 0), outer weight 0.252 (p = 0)
Per.Cap ← Entrepreneurial intention and action: outer loading 0.863 (p = 0), outer weight 0.355 (p = 0)
Per.Opp ← Perceived behavioral control: outer loading 0.824 (p = 0), outer weight 0.567 (p = 0)
Per. Beh. Contr. * Fail. Fear ← Moderating effect 1: outer loading 1.119 (p = 0), outer weight 1
Phy.Infr ← Business environment: outer loading 0.686 (p = 0), outer weight 0.203 (p = 0)
Post.Sch ← Human factors: outer loading 0.764 (p = 0), outer weight 0.37 (p = 0)
RD.Tran. ← Business environment: outer loading 0.871 (p = 0), outer weight 0.314 (p = 0)
Status.Ent ← Perceived behavioral control: outer loading 0.697 (p = 0), outer weight 0.296 (p = 0)
TEA ← Entrepreneurial intention and action: outer loading 0.925 (p = 0), outer weight 0.322 (p = 0)
Tax.Bur ← Governmental factors: outer loading 0.855 (p = 0), outer weight 0.381 (p = 0)

Table 4 Average variance extracted and composite reliability of each construct in the model

Business environment: composite reliability 0.901, AVE 0.646
Entrepreneurial intention and action: composite reliability 0.898, AVE 0.691
Failure fear: composite reliability 1, AVE 1
Governmental factors: composite reliability 0.909, AVE 0.769
Human factors: composite reliability 0.844, AVE 0.643
Moderating effect 1: composite reliability 1, AVE 1
Perceived behavioral control: composite reliability 0.803, AVE 0.576


Table 5 HTMT ratios of the constructs in the model

HTMT ratios with the business environment: entrepreneurial intention and action 0.43, failure fear 0.113, governmental factors 0.835, human factors 0.748, moderating effect 1 0.108, perceived behavioral control 0.223. The remaining pairwise HTMT ratios among entrepreneurial intention and action, failure fear, governmental factors, human factors, moderating effect 1, and perceived behavioral control are 0.345, 0.302, 0.061, 0.149, 0.259, 0.051, 0.145, 0.674, 0.06, 0.047, 0.817, 0.204, 0.179, 0.26, and 0.142. Every HTMT ratio is below the threshold of 1.

Table 6 Bootstrapping results of path coefficients and p-values (original sample, sample mean, p-value)

Business environment → Entrepreneurial intention and action: −0.383, −0.383, p = 0
Failure fear → Entrepreneurial intention and action: −0.178, −0.178, p = 0
Governmental factors → Business environment: 0.57, 0.57, p = 0
Governmental factors → Human factors: 0.533, 0.533, p = 0
Human factors → Business environment: 0.303, 0.303, p = 0
Human factors → Entrepreneurial intention and action: 0.168, 0.167, p = 0
Moderating effect 1 → Entrepreneurial intention and action: −0.096, −0.096, p = 0
Perceived behavioral control → Entrepreneurial intention and action: 0.531, 0.532, p = 0

Thus, the moderating effect is represented as the interaction term "X*A," and its joint effect on Y is examined. In this model, we hypothesize that the relationship between the two constructs "perceived behavioral control" and "entrepreneurial intention and action" is not the same depending on the variable "fear of failure." We therefore inspect whether the interaction term "perceived behavioral control * fear of failure" has a moderating effect on "entrepreneurial intention and action." In Table 6, we see that the moderating effect is significant, with a p-value reported as 0. We can also analyze the size and nature of the moderating effect of fear of failure on the relationship between perceived behavioral control and entrepreneurial intention and action using the graph in Fig. 4. When fear of failure is low (the −1 SD line in the figure), the positive effect of perceived behavioral control on entrepreneurial intention and action is stronger.

Fig. 4 The moderating effect of fear of failure on the relationship between perceived behavioral control and entrepreneurial intention and action (simple-slope lines for fear of failure at −1 SD, at the mean, and at +1 SD)

When fear of failure is high (the +1 SD line in the graph), perceived behavioral control still has a positive effect on entrepreneurial intention and action, but its effect is weaker due to the moderator.
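For readers who want to see the interaction-term idea in isolation, the following sketch fits a simple moderated regression on standardized variables with statsmodels. This is only an illustration of how an X*A product term enters a model, not the PLS–SEM procedure carried out in SmartPLS, and the variable names and simulated data are assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# Hypothetical standardized scores for the constructs.
pbc = rng.standard_normal(n)    # perceived behavioral control (X)
fear = rng.standard_normal(n)   # fear of failure (A, the moderator)
# Simulated outcome: the effect of pbc weakens as fear increases.
eia = 0.5 * pbc - 0.2 * fear - 0.1 * pbc * fear + rng.standard_normal(n)

df = pd.DataFrame({"eia": eia, "pbc": pbc, "fear": fear})

# 'pbc * fear' expands to pbc + fear + pbc:fear; the pbc:fear coefficient
# is the moderating (interaction) effect analogous to "X*A" in the text.
model = smf.ols("eia ~ pbc * fear", data=df).fit()
print(model.summary().tables[1])
```

A negative pbc:fear coefficient in such a model corresponds to the pattern seen in Fig. 4: the slope of perceived behavioral control flattens as fear of failure rises.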

6.4 Model Fit

Similar to the collinearity assessment on the indicators and constructs, a collinearity test is also conducted on the structural model. Then, one absolute fit index and one incremental fit index are used to understand how well the structural model fits the sample, and R-square values are reported to show the predictive accuracy of the model. The first measure is a standard measure of absolute fit used in most SEM techniques. The standardized root mean square residual (SRMR) measures the magnitude of the residual correlations, that is, the discrepancy between the observed and the model-implied correlations. A perfect-fitting model would have an SRMR value of zero (Hair et al. 2014). Henseler et al. (2014) introduce the SRMR as a goodness-of-fit measure for PLS–SEM that can be used to avoid model misspecification. A value below 0.10, or below 0.08 under the more conservative criterion of Hu and Bentler (1999), is considered a good fit. Multiple tests on the model are executed and the final model is the one with the minimum SRMR among all the trials. The estimated SRMR of this model is 0.086, indicating a fit close to the conservative threshold. Another measure used to test model fit is an incremental fit index called the Normed Fit Index (NFI). NFI is calculated from the ratio of the chi-squared value of


the fitted model to the chi-squared value of the null model. The value ranges between 0 and 1 (0 indicating no fit and 1 indicating perfect fit) (Hair et al. 2014). Cutoff recommendations as low as 0.80 have been made (Hooper et al. 2008). The NFI of our model is 0.78. However, NFI does not penalize for adding parameters to the model, so it should be used with caution for model comparisons, and its use is therefore rare (Henseler et al. 2016). It is stated in the literature that PLS path modeling and the traditional covariance-based SEM (CB-SEM) have different objectives. While PLS path modeling provides latent variable scores with beneficial characteristics for prediction, CB-SEM is better suited for model validation, model selection, and model comparisons (Henseler and Sarstedt 2013). In the context of PLS–SEM, measuring the fit by SRMR or RMS_theta does not offer much value. It is even stated that their use can be harmful because researchers may be tempted to sacrifice predictive power to achieve better fit (Hair et al. 2017). Since the objective of PLS is to maximize explained variance rather than fit, prediction-oriented measures such as R-square are used to evaluate PLS models. Thus, we also check the R-square values as a fit measure of the structural model. The R-square values for the dependent constructs “business environment,” “entrepreneurial intention and action,” and “human factors” are 0.60, 0.59, and 0.29, respectively. The acceptable level of R2 depends on the research context, with higher values indicating a higher level of predictive accuracy (Hair et al. 2017). In some disciplines, such as consumer behavior, values of 0.20 are considered high, while in academic research focusing on marketing issues, values of 0.75, 0.50, and 0.25 are described as substantial, moderate, and weak explanations of endogenous constructs, respectively.
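For completeness, the incremental fit index discussed above can be written out explicitly; the following is the usual Bentler and Bonett formulation from the SEM literature and is not an equation reproduced from this chapter:

$$ \mathrm{NFI} = \frac{\chi^2_{\text{null}} - \chi^2_{\text{model}}}{\chi^2_{\text{null}}} = 1 - \frac{\chi^2_{\text{model}}}{\chi^2_{\text{null}}} $$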

6.5 Country Group Analysis

Until now, all evaluation methods and statistical analyses were applied to the complete data set. The main aim of this study is to provide a basis for comparing the predefined country groups: countries with advanced economies (AE) and countries with emerging markets and developing economies (EMDE). From this point on, the analyses are based on the country groups in order to test the assumptions made in the hypotheses section of the study.

6.5.1 The Country-Specific Models

A well-fitting structural model is expected to have high R-square values for its endogenous constructs. The coefficient of determination (R2) measures the explained variance of an endogenous construct and thus reflects the combined effect of the exogenous constructs linked to it. Business environment, entrepreneurial intention and action, and human factors are the endogenous latent variables (constructs) of this model


Fig. 5 “Advanced Economies” model with outer loadings and total effects

since they are dependent on other constructs. Governmental factors, failure fear, perceived behavioral control, and the moderating effect are the exogenous constructs of the structural model. Normally, high R2 values are expected from all the models. Since the study focuses on the comparison of two groups, the model needs to remain the same for both groups; this way, the analyses are more comparable. Figures 5 and 6 present the country group-specific models for the “Advanced Economies” and the “Emerging markets and developing economies” country groups, respectively, showing the path coefficients of the inner model, the loadings of the outer model, and the R-square values of the endogenous constructs. When the two country-specific models are compared, the model fit for the emerging markets and developing economies dataset is better. Besides, the path coefficients and outer loadings differ considerably. First we compare the R-square values of the two models using the bootstrapping results in Table 7. Bootstrapping with 5000 samples is executed for the models. This procedure provides p-values and t-statistics for the relationships introduced in the model. The significance level is set at 0.05 for this study.


Fig. 6 “Emerging markets and developing economies” model with outer loadings and total effects

Table 7 Country group-specific R-square values of the endogenous constructs

Construct | AE countries R-square mean | EMDE countries R-square mean | AE countries p-values | EMDE countries p-values
Business environment | 0.637 | 0.513 | 0 | 0
Entrepreneurial intention and action | 0.219 | 0.609 | 0 | 0
Human factors | 0.264 | 0.282 | 0 | 0

For EMDE countries, the “Entrepreneurial Intention and Action” construct is explained moderately (R-square of 0.609), whereas for AE countries this construct is only weakly explained by the factors of the model (R-square of 0.219). For AE countries, the business environment can be predicted moderately well with the proposed model. The effect size (f2) is an advanced measure which is calculated with the help of the coefficient of determination. It estimates the effect of an exogenous construct when its relationship with the corresponding endogenous construct is


Table 8 Effect sizes for AE and EMDE and their significances

Path | AE countries f-square mean | EMDE countries f-square mean | AE countries p-values | EMDE countries p-values
Business environment → Entrepreneurial intention and action | 0.056 | 0.107 | 0.039 | 0.006
Failure fear → Entrepreneurial intention and action | 0.006 | 0.097 | 0.98 | 0.003
Governmental factors → Business environment | 0.42 | 0.385 | 0 | 0
Governmental factors → Human factors | 0.363 | 0.398 | 0 | 0
Human factors → Business environment | 0.444 | 0.137 | 0 | 0
Human factors → Entrepreneurial intention and action | 0.099 | 0.047 | 0.018 | 0.051
Moderating effect 1 → Entrepreneurial intention and action | 0.006 | 0.016 | 0.848 | 0.295
Perceived behavioral control → Entrepreneurial intention and action | 0.094 | 0.819 | 0.018 | 0

added or removed (Hair et al. 2017). First the R2 value of the endogenous construct with the exogenous construct included is calculated. Then the R2 value of the endogenous construct without the corresponding exogenous construct is calculated. The difference is then divided by one minus the R2 value of the included model (Eq. 3):

$$ f^2 = \frac{R^2_{\text{included}} - R^2_{\text{excluded}}}{1 - R^2_{\text{included}}} \qquad (3) $$
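As an illustration of Eq. (3), the short Python sketch below computes and interprets an effect size from two hypothetical R-squared values; the function names and the example numbers are illustrative assumptions, not values or code from the study.

def f_squared(r2_included: float, r2_excluded: float) -> float:
    """Effect size f^2 of one exogenous construct: (R2_incl - R2_excl) / (1 - R2_incl)."""
    return (r2_included - r2_excluded) / (1.0 - r2_included)

def interpret(f2: float) -> str:
    """Common rule of thumb: 0.02 small, 0.15 medium, 0.35 large (Hair et al. 2017)."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"

# Hypothetical R2 of an endogenous construct with and without one predictor.
f2 = f_squared(0.60, 0.55)
print(round(f2, 3), interpret(f2))  # 0.125 -> "small"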

Effect size values of 0.02 indicate a small effect, 0.15 a medium effect, and 0.35 a large effect; values below 0.02 indicate no effect at all (Hair et al. 2017). Table 8 shows all the effect sizes of the exogenous constructs related to the endogenous constructs. It is seen that Entrepreneurial Intention and Action is significantly affected by Business Environment (0.06, weak effect) and Human Factors (0.10, weak effect) in AE countries. In EMDE countries, we observe that, in addition to Business Environment (0.11, weak effect) and Human Factors (0.05, weak effect), Fear of Failure (0.10, weak effect) and Perceived Behavioral Control (0.82, large effect) also come into play as significant influencing factors of Entrepreneurial Intention and Action. Finally, the relationships between the constructs and indicators are also obtained from the bootstrapping procedure with 5000 samples. The p-values in Table 9 show the significant relationships between the related latent variables and indicators. If the p-value cutoff is set at 0.05, the indicator “High Status to Successful Entrepreneurs”

Table 9 Outer weight and outer loading statistics of both country groups

Construct | Indicator | AE loadings mean | AE weights mean | AE weights p-values | EMDE loadings mean | EMDE weights mean
Human factors | Bas.Sch | 0.817 | 0.445 | 0 | 0.815 | 0.368
Human factors | Cult.Norm | 0.789 | 0.395 | 0 | 0.831 | 0.462
Human factors | Post.Sch | 0.781 | 0.416 | 0 | 0.815 | 0.387
Business environment | Com.Infr. | 0.799 | 0.236 | 0 | 0.664 | 0.208
Business environment | Phy.Infr | 0.706 | 0.229 | 0 | 0.491 | 0.175
Business environment | Finc.Ent. | 0.788 | 0.241 | 0 | 0.77 | 0.304
Business environment | Mark.Open | 0.809 | 0.269 | 0 | 0.771 | 0.281
Business environment | RD.Tran. | 0.814 | 0.297 | 0 | 0.85 | 0.381
Governmental factors | Tax.Bur | 0.868 | 0.423 | 0 | 0.817 | 0.347
Governmental factors | Gov.Prog | 0.883 | 0.386 | 0 | 0.893 | 0.422
Governmental factors | Gov.Sup | 0.868 | 0.336 | 0 | 0.882 | 0.385
Perceived behavioral control | Per.Opp | 0.818 | 0.735 | 0 | 0.863 | 0.623
Perceived behavioral control | Ent.GoodC | 0.52 | 0.477 | 0.001 | 0.745 | 0.316
Perceived behavioral control | Status.Ent | 0.54 | 0.194 | 0.098 | 0.717 | 0.315
Entrepreneurial intention and action | Ent.Inten | 0.545 | 0.271 | 0 | 0.863 | 0.333
Entrepreneurial intention and action | Est.Bus | 0.542 | 0.2 | 0.002 | 0.623 | 0.173
Entrepreneurial intention and action | TEA | 0.897 | 0.541 | 0 | 0.908 | 0.32
Entrepreneurial intention and action | Per.Cap | 0.685 | 0.353 | 0 | 0.854 | 0.368

All outer loading p-values and all EMDE outer weight p-values equal 0.


is not significant for the construct “Perceived Behavior Control” for AE countries. Otherwise, all the other relationships are found to be significant.

6.5.2 Multigroup Analysis

The last analysis performed in this study is the multigroup analysis, which is available in the SmartPLS program. This analysis provides outputs on the differences between data groups; in this study, the differences between the AE and EMDE groups are presented. Comparing the relationships of the constructs across the groups gives better insight for drawing solid conclusions. Table 10 provides the output of the PLS multigroup analysis, namely the differences in relationships and their p-values. The study is based on a 0.05 significance level, so the outputs are analyzed according to this limit. Once again, the bolded p-values indicate significance. In the construct-based comparison, four relationships show significant differences. The effect of human factors on the business environment and on entrepreneurial intention and action differs between the AE and EMDE countries. Moreover, governmental factors also have a significantly different effect on the entrepreneurial intention and action construct. Finally, the effect of failure fear on entrepreneurial intention and action also differs between the groups. The multigroup analysis also exhibits a few differences in the indicator-construct relationships. Table 11 shows the outer loading differences and the p-values of the differences between the two groups. Bolded p-values show the significance of the test results.

Table 10 Total effects differences and p-values for multigroup analysis

Construct → Construct | Total effects difference (AE − EMDE) | p-value
Business environment → Entrepreneurial intention and action | 0.032 | 0.658
Failure fear → Entrepreneurial intention and action | 0.188 | 0.01
Governmental factors → Business environment | 0.022 | 0.285
Governmental factors → Entrepreneurial intention and action | 0.086 | 0.031
Governmental factors → Human factors | 0.017 | 0.629
Human factors → Business environment | 0.163 | 0.001
Human factors → Entrepreneurial intention and action | 0.171 | 0.013
Moderating effect 1 → Entrepreneurial intention and action | 0.108 | 0.061
Perceived behavioral control → Entrepreneurial intention and action | 0.327 | 1


Table 11 Outer loading differences and p-values for group analysis

Indicator ← Construct | Outer loading difference (AE − EMDE) | P value
Bas.Sch ← Human factors | 0.001 | 0.497
Com.Infr. ← Business environment | 0.134 | 0
Cult.Norm ← Human factors | 0.04 | 0.922
Ent.GoodC ← Perceived behavioral control | 0.223 | 0.944
Ent.Inten ← Entrepreneurial intention and action | 0.319 | 1
Est.Bus ← Entrepreneurial intention and action | 0.079 | 0.819
Fear.Fail ← Failure fear | 0 | 0.06
Finc.Ent. ← Business environment | 0.018 | 0.294
Gov.Prog ← Governmental factors | 0.01 | 0.725
Gov.Sup ← Governmental factors | 0.014 | 0.748
Mark.Open ← Business environment | 0.038 | 0.119
Per.Cap ← Entrepreneurial intention and action | 0.159 | 0.993
Per.Opp ← Perceived behavioral control | 0.024 | 0.551
Per. behavioral control * failure fear ← Moderating effect 1 | 0.036 | 0.349
Phy.Infr ← Business environment | 0.215 | 0
Post.Sch ← Human factors | 0.033 | 0.841
RD.Tran. ← Business environment | 0.036 | 0.971
Status.Ent ← Perceived behavioral control | 0.148 | 0.9
TEA ← Entrepreneurial intention and action | 0.004 | 0.536
Tax.Bur ← Governmental factors | 0.05 | 0.024

7 Conclusion and Recommendations

The aim of this study is to create an original model that introduces a unique approach to understanding how entrepreneurial intention and action can be fostered around the world. First, we review how the academic literature approaches the factors of entrepreneurship. Then, a model is created with the acquired knowledge, and the GEM database is used to gather the data. Using data from a recent time period is important, since the world of entrepreneurship has undergone a massive change in the last decade. Accordingly, the study was made using the GEM data between the years 2001 and 2018, and the research was conducted with PLS–SEM using the SmartPLS software. During the study, the original structural model was evaluated with multiple evaluation measures. This evaluation process made it necessary to vary the model in order to construct a well-fitting structural model, so some modifications were made to the model. The effect size results reveal the most important finding: globally, the most effective factor in entrepreneurial action is perceived behavioral control, which is constructed from individual perceptions of entrepreneurship and opportunities. However, when we analyze the AE and EMDE countries separately, the positive effect in EMDE countries is almost two to three times larger


compared to AE countries. The second important finding is that a developed business environment has a negative effect on entrepreneurial action, suggesting that an immature business environment motivates entrepreneurial action. When we analyze the AE and EMDE countries individually, the negative effect of a mature business environment is slightly stronger than the positive effect of perceived behavioral control in AE countries, whereas in EMDE countries the positive effect of perceived behavioral control is much stronger than the negative effect of the business environment on entrepreneurial action. We also conclude that the human factor is a much more important factor for the business environment and entrepreneurial action in AE countries than in EMDE countries. The least effective factor on entrepreneurial action is found to be governmental factors, for both AE and EMDE countries. Using the outcomes of this research, policies for specific country groups can be designed. For example, in EMDE countries the human factors can be enhanced to decrease the negative moderating effect of fear of failure. In AE countries, on the other hand, policies that enhance the perceived behavioral control of individuals would be beneficial in order to motivate entrepreneurial action. Last but not least, the results show that the role of governmental factors is comparatively smaller than the role of individuals in both AE and EMDE country groups. Thus, policies targeting people are the key to fostering entrepreneurship in the world. This research can provide a basis for further investigation of how these country groups achieve national success in starting up new businesses. Further analyses can be done to investigate the main reasons behind the results of this exploration, especially by building and validating models specifically for the AE and EMDE countries. Finally, this study can provide a benchmarking basis for countries with emerging markets and developing economies for achieving better results in the entrepreneurial arena.

References Ajzen I (1991) Theory of planned behavior. Organ Behav Hum Decis Process 50(2):179–211 Armour J, Cumming D (2006) The legislative road to Silicon Valley. Oxf Econ Pap-New Ser 58(4):596–635 Barazandeh M, Parvizian K, Alizadeh M, Khosravi S (2015) Investigating the effect of entrepreneurial competencies on business performance among early stage entrepreneurs Global Entrepreneurship Monitor (GEM 2010 survey data). J Glob Entrep Res 5(1):18 Bjørnskov C, Foss N (2008) Economic freedom and entrepreneurial activity: some cross-country evidence. Public Choice 134(3–4):307–328 Davidsson P (2016) Researching entrepreneurship: conceptualization and design, vol 33. Springer, New York Davidsson P, Wiklund H (1997) Values, beliefs and regional variations in new firm formation rates. J Econ Psychol 18(2–3):179–199 GEM (2019). https://www.gemconsortium.org/data. Accessed 25 Apr 2019 Griffiths MD, Kickul J, Carsrud AL (2009) Government bureaucracy, transactional impediments, and entrepreneurial intentions. Int Small Bus J 27(5):626–646


Hair J, Black W, Babin B, Anderson R (2014) Multivariate data analysis, 7th edn. Harlow, Pearson Education, pp 546–600 Hair JF, Hult GTM, Ringle CM, Sarstedt M (2017) A primer on partial least squares structural equation modeling (PLS-SEM), 2nd edn. Sage, Thousand Oaks Hechavarria DM, Ingram AE (2016) The entrepreneurial gender divide: Hegemonic masculinity, emphasized femininity and organizational forms. Int J Gend Entrep 8(3):242–281 Henseler J, Sarstedt M (2013) Goodness-of-fit indices for partial least squares path modeling. Comput Stat 28:565–580 Henseler J, Dijkstra TK, Sarstedt M, Ringle CM, Diamantopoulos A, Straub DW, Ketchen DJ, Hair JF, Hult GTM, Calantone RJ (2014) Common beliefs and reality about partial least squares: comments on Rönkkö & Evermann (2013). Organ Res Methods 17(2):182–209 Henseler J, Ringle C, Sarstedt M (2015) A new criterion for assessing dicriminant validity in variance-based structural equation modeling. J Acad Mark Sci 43:115–135 Henseler J, Hubona G, Ray PA (2016) Using PLS path modeling in new technology research: updated guidelines. Ind Manag Data Syst 116(1):2–20 Hooper D, Coughlan J, Mullen MR (2008) Structural equation modelling: guidelines for determining model fit. Electron J Bus Res Methods 6(1):53–60 Hu L-T, Bentler PM (1999) Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Struct Equ Model Multidiscip J 6(1):1–55 IMF Data Mapper (2019). https://www.imf.org/external/datamapper/ NGDPDPC@WEO/ OEMDC/WEOWORLD/ADVEC. Accessed 05 Feb 2019 Kuratko DF, Morris MH, Schindehutte M (2015) Understanding the dynamics of entrepreneurship through framework approaches. Small Bus Econ 45:1–13 Lee SM, Peterson SJ (2000) Culture, entrepreneurial orientation, and global competitiveness. J World Bus 35(4):401–416 Liñán F (2004) Intention-based models of entrepreneurship education. Piccola Impresa/Small Bus 2004(3):11–35 Maniyalath N, Narendran R (2016) The human development index predicts female entrepreneurship rates. Int J Entrep Behav Res 22(5):745–766 Obaji NO, Olugu MU (2014) The role of government policy in entrepreneurship development. Sci J Bus Manage 2(4):109–115 Obschonka M, Stuetzer M, Gosling SD, Rentfrow PJ, Lamb ME, Potter J et al (2015) Entrepreneurial regions: do macro-psychological cultural characteristics of regions help solve the “Knowledge Paradox” of economics? PLoS One 10(6):1–21 Pinillos MJ, Reyes L (2011) Relationship between individualist–collectivist culture and entrepreneurial activity: evidence from Global Entrepreneurship Monitor data. Small Bus Econ 37(1):23–37 Stuetzer M, Obschonka M, Brixy U, Sternberg R, Cantner U (2014) Regional characteristics, opportunity perception and entrepreneurial activities. Small Bus Econ 42(2):221–244 Terjessen S, Hessels J, Li D (2016) Comparative international entrepreneurship: a review and research agenda. J Manage 42(1):1–46 Thai MTT, Turkina E (2014) Macro-level determinants of formal entrepreneurship versus informal entrepreneurship. J Bus Ventur 29:490–510 Van Stel A, Carree M, Thurik R (2005) The effect of entrepreneurial activity on national economic growth. Small Bus Econ 24(3):311–321 Verheul I, Wennekers S, Audretsch D, Thurik R (2002) An eclectic theory of entrepreneurship: policies, institutions and culture. In: Audretsch D, Thurik R, Verheul I, Wennekers S (eds) Entrepreneurship: determinants and policy in a European–US comparison. Kluwer Academic, Boston


Wach K (2015) Impact of cultural and social norms on entrepreneurship in the EU: cross-country evidence based on GEM survey results. Zarzadzanie ˛ w Kulturze (1):1529 Wyrwich M, Stuetzer M, Sternberg R (2016) Entrepreneurial role models, fear of failure, and institutional approval of entrepreneurship: a tale of two regions. Small Bus Econ 46(3):467– 492

A Geometric Standard Deviation Based Soft Consensus Model in Analytic Hierarchy Process

Petra Grošelj and Gregor Dolinar

Abstract Consensus building models are widely studied in connection to the multi-criteria group decision making problems. The aim of such models is to enhance the level of agreement between decision makers (DMs). The paper studies several properties and discusses arising questions regarding consensus models in analytic hierarchy process. A new consensus reaching model based on the weighted geometric mean and geometric standard deviation is proposed. A simulation of two scenarios of the novel model is applied to the data from the case study to demonstrate its validity. The results are compared to the three consensus models, selected from the literature. The analysis revealed that the adaptations that DMs accept and make to their judgments in the new consensus model are appropriately high. Consequently, DMs do not have to make subsequent modifications and generally all DMs make minimum one change in the iteration process, which contributes to the comfortable group environment and effective final decisions.

1 Introduction Multi-criteria decision making consists of evaluating and selecting alternatives regarding multiple criteria simultaneously. Criteria in complex real world problems can be conflicting and the construction of the model and its evaluation often demand group decision making approaches with participation of several stakeholders, experts or interest groups. When knowledge and experiences of one decision maker (DM) are limited, a group of DMs can contribute different opinions, comprehension, and perspectives that can stimulate creativity and their synergy can crucially improve the final solution of the problem (Srdjevic et al. 2013). However, conflicts and oppositions can make the decision making process slower and exhausting.

P. Grošelj () · G. Dolinar Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia e-mail: [email protected]; [email protected] © Springer Nature Switzerland AG 2021 Y. I. Topcu et al. (eds.), Multiple Criteria Decision Making, Contributions to Management Science, https://doi.org/10.1007/978-3-030-52406-7_11


The most desired solution is consensual decision. Consensus signifies total and unanimous agreement of all DMs (Herrera-Viedma et al. 2014; Dong et al. 2010; Bezdek et al. 1978). Because it is difficult to achieve it in practice, soft consensus that indicates certain level of agreement between DMs is often used instead of the total consensus. In the paper we studied soft consensus models but we do not distinguish between terms “consensus” and “soft consensus.” Formal consensus reaching models have been widely studied regarding different types of preference relations: linguistic, fuzzy, ordinal,... that express DMs judgments. In our study we focused on the analytic hierarchy process (AHP) and multiplicative preference relations presented as pairwise comparison matrices (PCMs). In group AHP individual PCMs are aggregated into group opinion. According to Dyer and Forman (1992) there are four basic approaches in group AHP: consensus, voting or compromise, and two aggregating methods (Forman and Peniwati 1998): aggregating individual priorities and aggregating individual judgments. Consensus in AHP presumes consent of DMs on the judgments or on the priority vectors (Steele et al. 2007). DMs start with heterogeneous opinions and through the consensual process they consent on a consensual decision (Hartmann et al. 2009). In compromise DMs also have heterogeneous opinions, but because cooperation is important to them, they agree to support the decision that differs from what they think is the best (Steele et al. 2007). According to these definitions we can conclude that aggregating methods are only special cases of consensus or compromise. Generally, it is hard to judge whether the final solution is consensus or compromise. We should also consider that a good decision may not satisfy any of DMs, and a popular decision may not be good (Steele et al. 2007). Formal consensus reaching process has a dynamic and iterative character. It consists of several iterative steps, namely consensus rounds that evolve individual opinions towards the consensual opinion (Herrera-Viedma et al. 2014). In soft consensus models diverse individual opinions are brought closer until a predefined agreement is reached (Wibowo and Deng 2013). The iterative step-by-step process can be attractive for DMs because it increases the participation of DMs, helps managing the conflicts and differences in their opinions and the final decision can be more satisfying for DMs (Wibowo and Deng 2013; Kahraman and Cebi 2009). In each consensus round the degree of consensus is measured. The level of consensus is correlated to the closeness of individual judgments, that can be measured by the distance between the priority vectors, compatibility index or any other consensus measure (Dong and Saaty 2014). These measures can be divided into two groups: the measures based on the distances between DMs evaluations and the group evaluations and the measures based on the distances between DMs assessments (Palomares et al. 2014). In each iteration round the suggestion is given to all DMs or more usually to only one or two the most discordant DMs to make adaptation to their preferences. The change of the most incompatible judgments promises the highest consensus improvement (Dong and Cooper 2016). Frequently a moderator manages the consensus building process. He collects pairwise comparisons from all DMs, defines the level of the consensus, and recommends DMs how to adapt their preferences.


The consensus models differ whether DMs have to follow the advice of the moderator or DMs can decide by themselves about the adjustment of their preferences. If the decision is left to the DMs, this can contribute to the more comfortable and stress-free group environment that can add to the higher value of DMs judgments (Dong and Saaty 2014). On the other hand, DMs may be pressured by the other DMs to follow the moderator’s advice (Herrera-Viedma et al. 2014). The proposed consensus models in literature are very diversified. Chiclana et al. (2008) included consistency into their consensus model. Wibowo and Deng (2013) developed their consensus model on the interactive algorithm. Herrera-Viedma et al. (2014) reviewed soft consensus models in fuzzy environment. Regan et al. (2006) presented an iterative consensus reaching model for aggregating individual priority vectors. Their model is founded on the Lehrer–Wagner model and the philosophy of negotiation (Lehrer and Wagner 1981) and they applied it to the environmental management problem. Srdjevic et al. (2013) transformed Regan et al. model into a two-phase algorithm that divides DMs in subgroups in the first phase and seeks consensus within and between subgroups in the second phase. Kou et al. (2017) optimization model aggregates individual priority vectors. Several authors proposed consensus models that derive a group consensus PCM from the individual PCMs. Yeh et al. (2001) embedded genetic algorithm in their consensus model. Altuzarra et al. (2007) centered their model around Bayesian analysis. Pedrycz and Song (2011) based their consensus model on the information granularity. Dong et al. (2010) presented two consensus models that use row geometric mean method as prioritization method. Their first model uses geometric cardinal consensus index derived from geometric consistency index as a measure of closeness of DMs. Their second model is based on the geometric ordinal consensus index. Wu and Xu (2012) proposed a consensus model with group consensus index based on the compatibility index (Saaty 1994) as a measure of the level of consensus. The most incompatible DM is obligated to adjust his PCM. The consensus model from Dong and Saaty (2014) is similar to Wu and Xu (2012) model, only that in their model DMs can decide whether they want to make the adaptation to their PCMs or not. Dong and Cooper (2016) presented a peer-to-peer adaptive consensus model where the pair of two the most discordant DMs adapt their preferences. Their measure of the consensus level is similar to group consensus index. Dong et al. (2017) proposed a consensus model with a twofold feedback mechanism. The aim of this research is to investigate several questions that emerge when analyzing consensus models: What is the appropriate number of iterations? Should DMs be forced to adapt their preferences or not? How many times are DMs willing to modify their judgments? Are DMs prepared to make subsequent adjustments of their judgments? What is the appropriate threshold of soft consensus measure? Our intention is to propose some possible conclusions. The objective of this study is to employ the findings of the consensus models analysis in a new developed soft consensus model that takes into account all results of the research. To validate the novel model an evaluation of its behavior is carried out. 
We performed a simulation on two scenarios to the data from the environmental study (Grošelj and Zadnik Stirn 2015) and compared the results to three other consensus models from the literature.


The paper is structured as follows. In Sect. 2 a short introduction of AHP is provided. In Sect. 3 consensus models and their properties are discussed and in Sect. 4 the new soft consensus model is explained. In Sect. 5 the outline of comparison of our model and three consensus models from the literature is presented. The simulation on the case study is implemented and the results are discussed in Sect. 6. In Sect. 7 some conclusions are made.

2 Preliminaries

The AHP model has a hierarchical structure consisting of the goal, criteria, subcriteria, and alternatives. Let X = {x1, x2, . . . , xn} be a finite set of objects on the same level of the hierarchy. The core of AHP are pairwise comparisons. Using Saaty's fundamental 1–9 scale (Table 1), pairwise comparisons express the relative dominance of one object over another on the same level of the hierarchy with respect to the objects on the higher level. They are collected in an n × n pairwise comparison matrix (PCM)

$$ A = \left(a_{ij}\right)_{n \times n} \qquad (1) $$

with $a_{ii} = 1$ and $a_{ji} = 1/a_{ij}$. PCM A is consistent if

$$ a_{ij} = a_{ik} a_{kj} \quad \text{for all } i, j, k \in \{1, 2, \ldots, n\}. \qquad (2) $$

Table 1 Saaty's 1–9 fundamental scale for pairwise comparisons in AHP (Saaty 2006)

Intensity of importance | Definition | Explanation
1 | Equal importance | Two objects contribute equally to the objective
2 | Weak |
3 | Moderate importance | Experience and judgment slightly favor one object over another
4 | Moderate plus |
5 | Strong importance | Experience and judgment strongly favor one object over another
6 | Strong plus |
7 | Very strong | One object is favored very strongly over another
8 | Very, very strong |
9 | Extreme importance | The evidence favoring one object over another is of the highest possible order of affirmation


Table 2 The random consistency index (Saaty 2006)

n   | 1 | 2 | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10
RIn | 0 | 0 | 0.52 | 0.89 | 1.11 | 1.25 | 1.35 | 1.40 | 1.45 | 1.49

To measure the level of inconsistency, Saaty (1980) defined the consistency ratio CR:

$$ CR_A = \frac{CI_A}{RI_n}, \qquad (3) $$

where the consistency index is defined by

$$ CI_A = \frac{\lambda_{A,\max} - n}{n - 1}. \qquad (4) $$

The consistency ratio depends on the principal eigenvalue $\lambda_{A,\max}$ of PCM A, its order n, and the random index $RI_n$ (Table 2). In general, PCMs with CR < 0.1 are considered acceptably consistent. One of the most common prioritization methods is the eigenvector method. The priority vector $w = (w_1, w_2, \ldots, w_n)^T$ is the principal eigenvector of PCM A:

$$ A w = \lambda_{A,\max} \, w. \qquad (5) $$

Let DM1, DM2, . . . , DMm be m DMs with weights of importance $\rho = (\rho_1, \rho_2, \ldots, \rho_m)^T$, $0 \le \rho_k \le 1$, $k = 1, \ldots, m$, and $\sum_{k=1}^{m} \rho_k = 1$, that express their importance and power. Their PCMs are $A_k = \left(a_{ij}^{(k)}\right)_{n \times n}$, $k = 1, \ldots, m$, and their derived priority vectors are $w^{(k)} = \left(w_1^{(k)}, \ldots, w_n^{(k)}\right)^T$, $k = 1, \ldots, m$. Aggregation of individual judgments is one of the basic approaches to obtain the group priority vector from the PCMs of the DMs. The elements of the group PCM are equal to the weighted geometric mean (WGM) of the individual pairwise comparisons:

$$ G = \left(g_{ij}\right)_{n \times n}, \quad \text{with } g_{ij} = \prod_{k=1}^{m} \left(a_{ij}^{(k)}\right)^{\rho_k}. \qquad (6) $$

The method is called weighted geometric mean method (WGMM). If all DMs PCMs are acceptably consistent, the group PCM is also acceptably consistent (Grošelj and Zadnik Stirn 2012). The result of aggregation of individual PCMs by WGMM is a compromise and not a consensual result and can be unsatisfactory for DMs because it does not take into account the entire range of the individual judgments. Geometric mean can be identical whether the judgments are all similar to the geometric mean or they are very diverse.
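As a concrete illustration of Eqs. (3)–(6), the following Python sketch aggregates individual PCMs with the weighted geometric mean, checks acceptable consistency, and derives priorities with the eigenvector method. It assumes NumPy, reuses the RI values of Table 2, and is only one possible reading of the formulas; the computations reported later in the chapter were carried out in Matlab.

import numpy as np

RI = {3: 0.52, 4: 0.89, 5: 1.11, 6: 1.25, 7: 1.35, 8: 1.40, 9: 1.45, 10: 1.49}

def consistency_ratio(A: np.ndarray) -> float:
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1), Eqs. (3)-(4)."""
    n = A.shape[0]
    lam_max = max(np.linalg.eigvals(A).real)
    return ((lam_max - n) / (n - 1)) / RI[n]

def wgmm(pcms: list[np.ndarray], rho: np.ndarray) -> np.ndarray:
    """Group PCM G with g_ij = prod_k (a_ij^(k))^rho_k, Eq. (6)."""
    logs = np.tensordot(rho, np.log(np.stack(pcms)), axes=1)
    return np.exp(logs)

def eigenvector_priorities(A: np.ndarray) -> np.ndarray:
    """Priority vector as the normalized principal eigenvector, Eq. (5)."""
    vals, vecs = np.linalg.eig(A)
    w = np.abs(vecs[:, np.argmax(vals.real)].real)
    return w / w.sum()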


3 Consensus Models DMs in group AHP can express their views as pairwise comparisons which include subjectivity and imprecision. The goal of the soft consensus iterative model is to reach a certain level of agreement between DMs. This can increase the effectiveness of the final decision. The degree of consensus is measured by consensus index. The maximal number of iterations should be defined in advance. The number of iterations is an important issue in consensus models. If the convergence process to the consensual opinion is very slow, the number of iterations increases. This can occur in a larger group of DMs (Srdjevic et al. 2013) or when the modifications made by DMs are insignificant. Smaller number of iterations is preferable when the costs of modifying DMs judgments are considered (Ben-Arieh and Easton 2007; Labella et al. 2019). We presume that DMs are unwilling to make more than two or three adaptations of their judgments otherwise their initial judgments get lost in the made changes. It is also important how to define one iteration. Often in one iteration one DM is asked to make the adjustment of his opinion (Dong and Saaty 2014) and he does or does not accept the recommendation. Then the second iteration starts. If the number of DMs that reject the adjustments is high, the final level of agreement between DMs remains low. The possible improvement is that one iteration lasts until one DM is willing to make the correction of his opinion or all DMs reject it. The magnitude of the adaptation does not only influence the number of iterations but also impacts the number of modifications that one DM has to make in succession. It is ideal that when DM accepts one adjustment then in the next iteration another DM is selected as the most disagreeable. We presume that DMs are reluctant to make subsequent adjustments. We can conclude that it is beneficial for the consensus model if the proposed adaptations are not very small. Consensus models often demanded that DMs that rejected the moderator’s advice cannot change their mind in later iterations (Dong and Saaty 2014; Dong et al. 2017). We think that this can be a serious drawback of the model. If DM has already accepted one update of his judgments, then he may be hesitant to accept another adaptation until at least some of the other DMs update their judgments. When all or the majority of the other DMs make one modification of their judgments then he may be prepared to make another adjustment. Soft consensus unifies DMs judgments only to a certain degree. Therefore, the algorithm of the iterative soft consensus process should have integrated several stop conditions. The maximal number of iterations is one of the stop conditions. The iterative process also stops if all DMs reject the modifications of their judgments. The most important stop condition occurs when appropriate degree of consensus is achieved. The level of consensus is measured by consensus index and when the threshold for consensus index is reached the iterative process stops. Simulations of the iterative consensus process can assist to set an appropriate threshold that can be achieved within the maximal number of iterations. To employ all findings we developed a new soft consensus model and compared it to the several other models from the literature.


4 A New Proposed Soft Consensus Model

WGMM is one of the basic approaches to aggregating DMs' judgments. It has been used in many applications (Duke and Aull-Hyde 2002; Ananda and Herath 2008; Cortés-Aldana et al. 2009; Lee et al. 2009; Sun and Li 2009; Wang and Chin 2009; Akaa et al. 2016). Although DMs may disagree with the final result, it can be a good foundation for a consensus model. Several consensus models base the modification of DMs' judgments on the WGM (Wu and Xu 2012; Dong and Saaty 2014; Dong et al. 2017). However, the WGM cannot express the diversity of DMs' opinions. To measure the closeness of DMs' judgments with respect to the WGM, the geometric standard deviation can be used. The geometric standard deviation describes the variability of a set of numbers around their WGM. The geometric standard deviation of the pairwise comparisons $a_{ij}^{(k)}$, i, j = 1, . . . , n, k = 1, . . . , m of the m DMs is defined by Eq. (7):

$$ s_{ij}^{(WGMM)} = \exp\!\sqrt{ \frac{1}{1 - \sum_{k=1}^{m} \rho_k^2} \sum_{k=1}^{m} \rho_k \left( \ln \frac{a_{ij}^{(k)}}{g_{ij}} \right)^{2} }. \qquad (7) $$


The geometric standard deviation is greater than or equal to one, with equality when all pairwise comparisons are identical. Smaller variability between the DMs' pairwise comparisons results in smaller values of the geometric standard deviation. Let

$$ S = \left( s_{ij}^{(WGMM)} \right)_{n \times n} \qquad (8) $$


be the matrix of geometric standard deviations; it is symmetric. We propose a novel soft consensus model based on the geometric standard deviation as a measure of the closeness of the group opinion. The group standard deviation consensus index (GSDCI), which measures the degree of consensus, is defined as the maximal geometric standard deviation over all pairwise comparisons:

$$ GSDCI = \max\left\{ s_{ij}^{(WGMM)}\,;\; i = 1, \ldots, n,\; j = 1, \ldots, n \right\}. \qquad (9) $$


The iterations of the soft consensus model run until the geometric standard deviations of all pairwise comparisons (all elements of matrix S) are smaller than the predefined threshold value C. To identify the most discordant DM, we define the contribution $D_{ij}^{(k)}$ of DM k to the geometric standard deviation $s_{ij}$:

$$ D_{ij}^{(k)} = \rho_k \left( \ln \frac{a_{ij}^{(k)}}{g_{ij}} \right)^{2}. \qquad (10) $$



Then a disagreement value is calculated for every DM (Eq. 11). It represents the average of the contributions to the geometric standard deviations over all pairwise comparisons of DM k. Let D be the vector of disagreement values:

$$ D = \left( D^{(1)}, \ldots, D^{(m)} \right)^{T}, \qquad D^{(k)} = \frac{2}{n(n-1)} \sum_{i=1}^{n} \sum_{j>i} D_{ij}^{(k)}. \qquad (11) $$

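A possible Python reading of Eqs. (7)–(11), reusing the wgmm() helper from the earlier sketch, is given below; it computes the geometric standard deviations, the GSDCI, and the per-DM disagreement values. It assumes NumPy and is an illustrative sketch rather than the authors' implementation.

import numpy as np

def geometric_sd(pcms, rho, G):
    """s_ij = exp( sqrt( (1/(1 - sum rho_k^2)) * sum_k rho_k * ln(a_ij^(k)/g_ij)^2 ) ), Eq. (7)."""
    A = np.stack(pcms)                                   # shape (m, n, n)
    dev = np.log(A / G) ** 2                             # squared log deviations from the WGM
    var = np.tensordot(rho, dev, axes=1) / (1.0 - np.sum(rho ** 2))
    return np.exp(np.sqrt(var))

def gsdci(S):
    """Group standard deviation consensus index, Eq. (9)."""
    return S.max()

def disagreement(pcms, rho, G):
    """D^(k): average contribution of DM k over the upper-triangle pairs, Eqs. (10)-(11)."""
    A = np.stack(pcms)
    n = G.shape[0]
    iu = np.triu_indices(n, k=1)
    contrib = rho[:, None] * (np.log(A[:, iu[0], iu[1]] / G[iu]) ** 2)
    return 2.0 / (n * (n - 1)) * contrib.sum(axis=1)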

Before the soft consensus reaching model starts, DMs should make their pairwise comparisons and the consistency of all PCMs should be checked. The model starts with individual, acceptably consistent PCMs and with the weights of importance of the DMs. In each step the most discordant DM is selected according to the highest disagreement value. The moderator suggests that he adapt his PCM. If he accepts the recommendation, the new geometric standard deviations are calculated, the stop conditions are checked, and the iteration process either stops or the next iteration follows. If he rejects the recommendation, the next most discordant DM is selected. The iteration process stops when one of the stop conditions is met. Several stop conditions are built into the model:

Stop condition 1. If all geometric standard deviations (all elements of matrix S) are smaller than the predefined threshold value, the iterative consensus process stops.
Stop condition 2. If the predefined maximal number of iterations T is reached, the iterative consensus process stops.
Stop condition 3. If all DMs refuse to make an adaptation to their PCMs, the iterative consensus process stops.

When the iteration process stops, the final group weights are derived by the eigenvector method.

Algorithm
Inputs. Acceptably consistent PCMs of m DMs $A_k$, k = 1, . . . , m, the weights of importance of DMs $\rho = (\rho_1, \rho_2, \ldots, \rho_m)^T$, the threshold value of the geometric standard deviations C, and the maximal number of iterations T.
Outputs. The number of iterations t, 0 ≤ t ≤ T, final individual PCMs $A_k^*$, k = 1, . . . , m, final group PCM $G^*$, group weights $w_{group}$, and final matrix of geometric standard deviations $S^*$.

Step 1 Set t = 0 and start with $A_k^0 = \left(a_{ij}^{(k)0}\right)_{n \times n} = \left(a_{ij}^{(k)}\right)_{n \times n}$, k = 1, . . . , m.

Step 2 Calculate the group PCM by WGMM:

$$ G^{t} = \left( g_{ij}^{t} \right)_{n \times n}, \qquad g_{ij}^{t} = \prod_{k=1}^{m} \left( a_{ij}^{(k)t} \right)^{\rho_k}. \qquad (12) $$


Step 3 Calculate the matrix of geometric standard deviations

$$ S^{t} = \left( s_{ij}^{(WGMM)t} \right)_{n \times n} $$

(13)

and the vector of disagreement values $D^{t} = \left( D^{(1)t}, \ldots, D^{(m)t} \right)^{T}$.

Step 4 If all $s_{ij}^{(WGMM)t} \le C$, or t = T, or all DMs have rejected the adaptation of their judgments ($D^{(k)t} = 0$, k = 1, . . . , m), go to Step 6; otherwise continue with the next step.

Step 5 Select the DM h with maximal $D^{(h)t}$. The moderator suggests that he update his PCM from $A^{(h)t}$ to $A^{(h)t+1}$:

$$ a_{ij}^{(h)t+1} = \begin{cases} a_{ij}^{(h)t}, & a_{ij}^{(h)t} = g_{ij}^{t} \\ a_{ij}^{(h)t} \, s_{ij}^{(WGMM)}, & a_{ij}^{(h)t} < g_{ij}^{t} \\ a_{ij}^{(h)t} / s_{ij}^{(WGMM)}, & a_{ij}^{(h)t} > g_{ij}^{t} \end{cases} \qquad (14) $$

If DM h accepts the moderator's advice, he updates his PCM and the PCMs of the other DMs remain unchanged; set t = t + 1 and return to Step 2. If DM h rejects the moderator's advice, set $D^{(h)t} = 0$ and return to Step 4.

Step 6 Let $A_k^* = A_k^t$, k = 1, . . . , m, $G^* = G^t$, and $S^* = S^t$. The output solution is $A_1^*, A_2^*, \ldots, A_m^*, G^*, S^*$, and the number of iterations t.

Step 7 Calculate the group weights $w_{group} = (w_1^{group}, \ldots, w_n^{group})^T$ by the eigenvector method (5).

Step 8 End the algorithm.
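Building on the helpers from the earlier sketches, the following Python fragment illustrates one iteration of the algorithm: the update suggested to the most discordant DM (Eq. 14) and the stop check against the GSDCI threshold C. It is a minimal illustration under the same assumptions as before, not the authors' implementation.

import numpy as np

def suggest_update(A_h, G, S):
    """Adapted PCM for the most discordant DM, Eq. (14). Reciprocity is preserved
    because S is symmetric and a_ij < g_ij exactly when a_ji > g_ji."""
    return np.where(A_h < G, A_h * S, np.where(A_h > G, A_h / S, A_h))

def consensus_step(pcms, rho, C=1.3):
    """One iteration: returns (index of most discordant DM, his suggested PCM, current GSDCI)."""
    G = wgmm(pcms, rho)
    S = geometric_sd(pcms, rho, G)
    if S.max() <= C:                       # stop condition 1
        return None, None, S.max()
    h = int(np.argmax(disagreement(pcms, rho, G)))
    return h, suggest_update(pcms[h], G, S), S.max()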

4.1 Numerical Experiment of the Convergence of the Model

The modification of the most discordant DM's judgments is proposed according to Eq. (14). A numerical experiment has been carried out to examine the convergence of the new model. Convergence of the model indicates total consensus, which results in GSDCI = 1. Acceptably consistent PCMs of sizes 4 × 4, 5 × 5, and 6 × 6 were randomly generated for 2, 3, 4, 5, 6, 7, 8, 9, and 10 DMs. For each PCM of size n × n the top triangle was generated as follows: for all 1 ≤ i < j ≤ n, a_ij is a random element drawn uniformly from the set

{1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/7, 1/8, 1/9, 2, 3, 4, 5, 6, 7, 8, 9}. Elements in the bottom triangle are then simply the reciprocal values of the elements from the top triangle: a_ji = 1/a_ij for all 1 ≤ i < j ≤ n. The consistency ratio is then calculated, and if CR < 0.1, the matrix is accepted; otherwise it is discarded. The procedure is repeated until the requested



number of acceptably consistent PCMs are obtained, where the requested number means the number of DMs. The iteration consensus process with 100 iterations was performed with the assumption that all DMs accept all suggested modifications of PCMs. We investigated whether GSDCI < 1.001 after 100 iterations. We performed 100.000 repetitions of the iteration consensus process for 2, 3, 4, 5, 6, 7, 8, 9, and 10 DMs with PCMs of size 4 × 4, 10.000 with PCMs of size 5 × 5 and 1.000 repetitions with PCMs of size 6 × 6. All calculations were executed in Matlab. The results of the experiment show GSDCI < 1.001 and therefore the convergence of the model in all cases.
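A minimal Python sketch of this generation procedure is shown below; it assumes NumPy and the consistency_ratio() helper from the earlier sketch, and it is only an illustration of the described procedure, not the authors' Matlab code.

import numpy as np

SCALE = [1.0] + [float(k) for k in range(2, 10)] + [1.0 / k for k in range(2, 10)]

def random_consistent_pcm(n: int, rng: np.random.Generator) -> np.ndarray:
    """Draw upper-triangle entries from the Saaty scale, mirror the reciprocals,
    and keep only matrices with CR < 0.1."""
    while True:
        A = np.ones((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                A[i, j] = rng.choice(SCALE)
                A[j, i] = 1.0 / A[i, j]
        if consistency_ratio(A) < 0.1:
            return A

# Example: five acceptably consistent 4 x 4 PCMs for a simulated group of five DMs.
pcms = [random_consistent_pcm(4, np.random.default_rng(seed)) for seed in range(5)]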

4.2 Numerical Experiment of the Acceptable Consistency of the Final Group PCM in the New Model We start the new consensus model with acceptably consistent matrices that assure that the group PCM gained by WGMM is acceptably consistent (Grošelj and Zadnik Stirn 2012). The new proposed consensus model does not guarantee the acceptable consistent group PCMs through iteration process. To examine this issue we carried out another numerical experiment. Acceptably consistent PCMs of sizes 4 × 4, 5 × 5, and 6 × 6 were randomly generated for different numbers of DMs. The iteration consensus process with 20 iterations was performed with the assumption that all DMs accept all suggested modifications of PCMs. We examined whether the final group PCM after 20 iterations is acceptably consistent. We performed 100.000 repetitions of the iteration consensus process for PCMs of size 4×4, 20.000 repetitions of the iteration consensus process for PCMs of size 5 × 5 and 10.000 repetitions of PCMs of size 6 × 6. The results in Fig. 1 present the percentage of unacceptably consistent final PCMs for different number of DMs. The results show that the percentage of unacceptably consistent final group PCMs is low. For fixed size of PCMs the highest percentage is for 3 DMs and then it decreases to zero when the number of DMs increases. For 4 ×4 PCMs the percentage of unacceptably consistent final group PCMs is 1.4% for 3 DMs, then it decreases for more DMs and it essentially reaches zero for 8 DMs. The percentage is also lower for bigger sizes of PCMs. For 5 × 5 PCMs the percentage of unacceptably consistent final group PCMs is 0.3% for 3 DMs, then it falls for more DMs and it practically reaches zero for 6 DMs. For 6 × 6 PCMs we found unacceptably consistent final group PCMs only for 3 DMs. The results show that in the majority cases the final PCM is acceptably consistent.



Fig. 1 Percentage of unacceptably consistent final group PCMs

5 Comparison of Consensus Models

In recent years, comparisons between consensus models have appeared in the literature (Labella et al. 2018, 2019). The authors used A FRamework for the analYsis of Consensus Approaches (AFRYCA) (Palomares et al. 2014). In our study we selected three consensus models from the literature, Model 2 (Dong et al. 2010), Model 3 (Dong and Saaty 2014), and Model 4 (Dong et al. 2017), to compare them to our proposed model (New model). The algorithms of the selected models are similar to the algorithm of the New model. We also decided to compare the results of the consensus models to the results of the compromise method WGMM. The selected consensus models are all based on the WGM. In all models the moderator suggests the modification of the PCM to the most discordant DM. The recommended update to DM h is of the form

$$ a_{ij}^{(h)t+1} = \left( a_{ij}^{(h)t} \right)^{\alpha_t} \left( x_{ij}^{t} \right)^{1-\alpha_t}, \qquad (15) $$

where $\alpha_t$, $0 < \alpha_t < 1$, is a parameter which determines the portion of the DM's PCM that is preserved in iteration t, and $x_{ij}^{t}$ is the part that contributes to the change of the DM's PCM and depends on the model. In Model 2, $x_{ij}^{t} = w_i^{gt} / w_j^{gt}$ is defined as the quotient of the group weights. In Model 3 and Model 4, $x_{ij}^{t} = g_{ij}^{t}$ is defined as the corresponding



element of the WGM. In Model 4, Dong et al. (2017) also introduced a parameter $\beta_t$, which reduces a DM's weight of importance if he rejects the suggested adaptation. To simulate diverse behavior in the consensus reaching process we defined two different scenarios:

Scenario 1. All DMs accept all recommendations made by the moderator.
Scenario 2. All DMs accept the first recommendation for modification made by the moderator. Every DM is prepared to accept a second (and any subsequent) adaptation only when one half of the other DMs have already made their adjustments.

We applied the simulation to the data from the ecological study (Grošelj and Zadnik Stirn 2015). The idea of the first scenario is to examine how the model behaves in the most basic case, when all DMs cooperate all the time or are forced to accept the adjustments. We investigated the magnitude of the changes of the PCMs, how many adaptations DMs have to make, and whether the adjustments occur in subsequent iterations or not. The purpose of the second scenario is to study a more realistic situation in which DMs are not prepared to make subsequent adaptations to their PCMs.

6 Case Study

The simulation of the new proposed model and the selected consensus models was applied to the data from the study of optimal management of Pohorje, a mountainous area in the northeastern part of Slovenia (Grošelj and Zadnik Stirn 2015). One of the goals of that study was to select the optimal strategy for the development of Pohorje by the AHP method. A hierarchy of SWOT groups and factors as criteria and four possible alternatives was established. The SWOT analysis was carried out in workshops with local people and organizations. The alternatives express possible directions of Pohorje's development:
• Going with the flow, with no changes to the current situation in Pohorje, where everyone tries to realize their goals in tourism or attend to their existence in agriculture and forestry, and with no connections between sectors.
• Sustainable development, with sustainable agriculture, timber production and tourism, preserving nature, cultural and natural heritage.
• Intensive sectoral development, with intensive use of natural resources, tourism, and industry.
• Conservation of nature, which protects park Pohorje and focuses on preservation of nature and biodiversity, sustainable tourism, and ecological agriculture.
For our simulation we selected only one SWOT group (strengths) and the comparisons of alternatives with respect to the five strengths factors. The importance of the SWOT factors was established in the workshop and the results are presented in Table 3.



Table 3 Weights of strengths

Strengths | Description | Weights
Cultural heritage | Rich cultural and technical heritage, crafts | 0.246
Natural resources | Water, wood, stone | 0.246
Preservation of nature | Preservation of nature, wildness, remains of virgin forests, undamaged nature | 0.215
Climate | Clean air of Pohorje, climate resorts | 0.154
Natural park | Conditions for natural park and stoppage of big capital and mega projects | 0.138

Table 4 Weights of importance of DMs

DM | Field | Weights
DM1 | Tourism | 0.0814
DM2 | Tourism | 0.0814
DM3 | Tourism | 0.0814
DM4 | Forestry | 0.0759
DM5 | Forestry | 0.0759
DM6 | Forestry | 0.0759
DM7 | Agriculture | 0.0807
DM8 | Agriculture | 0.0807
DM9 | Agriculture | 0.0807
DM10 | Nature protection | 0.0953
DM11 | Nature protection | 0.0953
DM12 | Nature protection | 0.0953






Twelve DMs, local stakeholders, and experts from the fields of forestry, agriculture, tourism, and nature protection participated in the evaluation of alternatives regarding SWOT factors and their PCMs are presented in Appendix 1. In that study CR < 0.15 was allowed because this adaptation helped the stakeholders significantly (Grošelj and Zadnik Stirn 2015). The importance of four fields was evaluated by AHP method. Because there were three DMs from each field, the fields’ weights were divided by three to calculate the DMs weights of importance (Table 4). For the simulation of both scenarios we set the maximal number of iterations to 30. If the maximal number of iterations is reached DMs make on average 2.5 changes to their PCMs. In the new proposed model one iteration implies one modification. We adapted the other three selected models to count the iterations in the same manner to make the models more comparable. We set the threshold value of the geometric standard deviations at C = 1.3. Our analysis showed that this threshold is reachable within 30 iterations. For the other models we set the consensus indexes thresholds and other model parameters as their authors proposed: αt = 0.7 for all iterations t = 1, . . . , T and Models 2–4, GCCI = 0.35 for Model 2 (Dong et al. 2010), GCI = 1.1 for Model 3 (Dong and Saaty 2014), and LGCI = 0.1 and βt = 0.7 for all iterations t = 1, . . . , T , for Model 4 (Dong et al. 2017).

294

P. Grošelj and G. Dolinar

The consensus reaching process for the New model is presented in detail for Scenario 1 for Cultural heritage. In this Scenario all DMs accept the suggested updates of their PCMs. Step 1: We set t = 0. The consensus reaching process started with initial PCMs of twelve DMs that are presented in Appendix 1 in Tables 21, 22, 23, 24, and 25. ⎡ ⎤ 1 0.198 1.473 1.012 ⎢ 5.055 1 5.357 4.211 ⎥ ⎥ Step 2: G0 = ⎢ ⎣ 0.679 0.187 1 0.646 ⎦, CRG0 = 0.0044. 0.988 0.238 1.548 1 ⎤ 1 1.638 2.469 3.138 ⎢ 1.638 1 1.590 1.958 ⎥ ⎥ S0 = ⎢ ⎣ 2.469 1.590 1 2.507 ⎦, 3.138 1.958 2.507 1 ⎡

Step 3:

! D 0 = 0.052 0.057 0.052 0.112 0.040 0.032 0.011 0.054 0.100 0.039 0.011 0.033 .

Step 4: Because several sij(W GMM)0 > 1.3, we continue with Step 5. Step 5: DM 4 has maximal disagreement value D (4)0 = 0.112. The moderator suggests that he updates his PCM according to the Eq. (14) from A04 to ⎡

⎤ 1 0.153 2.835 0.628 ⎢ 6.553 1 5.662 1.958 ⎥ ⎥ A14 = ⎢ ⎣ 0.353 0.177 1 0.501 ⎦ . 1.593 0.511 1.994 1 He accepts the moderator’s advice and updates his PCM. The PCMs of the other DMs remain unchanged. We set t = 1 and return to Step 2. The process is repeated for another 18 iterations. The results of the iterations are presented in Appendix 2. After we set t = 19 and return to Step 2, we get ⎡ ⎤ 1 0.243 1.353 1.183 ⎢ 4.109 1 4.502 3.891 ⎥ ⎥ G19 = ⎢ ⎣ 0.739 0.222 1 0.709 ⎦ and CRG19 = 0.0038.

Step 3:

0.845 0.257 1.410 1 ⎡ ⎤ 1 1.183 1.286 1.296 ⎢ 1.183 1 1.116 1.180 ⎥ ⎥ S 19 = ⎢ ⎣ 1.286 1.116 1 1.240 ⎦, 1.296 1.180 1.240 1

! D 19 = 0.002 0.002 0.001 0.004 0.007 0.003 0.002 0.003 0.001 0.005 0.004 0.004 .

Step 4: Because all sij(W GMM)19 < 1.3, we continue with Step 6 that gives us the output solution: ∗ ∗ t = 19, G∗ = G19 , S ∗ = S 19 , and A∗k = A19 k . A1 , . . . , A12 are presented in Table 26 in Appendix 2. In Step 7 we calculate the group weights by the

A Geometric Standard Deviation Based Soft Consensus Model in AHP

295

Table 5 Scenario 1: group weights of alternatives regarding cultural heritage, number of iterations, the final values of consensus index, and the threshold values of consensus index Cultural heritage Going with the flow Sustainable development Intensive sectoral development Conservation of nature Number of iterations Consensus index Threshold of consensus index

New model 0.156 0.577 0.116 0.150 19 1.296 1.300

Model 2 0.144 0.606 0.103 0.147 22 0.3457 0.350

Model 3 0.142 0.604 0.103 0.151 19 1.100 1.100

Model 4 0.143 0.606 0.102 0.149 30 3.3065 0.100

WGMM 0.139 0.615 0.101 0.146

Table 6 Scenario 1: group weights of alternatives regarding natural resources, number of iterations, the final values of consensus index, and the threshold values of consensus index Natural resources Going with the flow Sustainable development Intensive sectoral development Conservation of nature Number of iterations Consensus index Threshold of consensus index

New model 0.161 0.540 0.121 0.178 21 1.287 1.300

Model 2 0.146 0.574 0.111 0.170 29 0.316 0.350

Model 3 0.149 0.575 0.109 0.167 21 1.092 1.100

Model 4 0.146 0.571 0.110 0.173 30 3.985 0.100

WGMM 0.141 0.546 0.137 0.175

Table 7 Scenario 1: group weights of alternatives regarding preservation of nature, number of iterations, the final values of consensus index, and the threshold values of consensus index

| Preservation of nature         | New model | Model 2 | Model 3 | Model 4 | WGMM  |
| Going with the flow            | 0.149     | 0.127   | 0.127   | 0.128   | 0.121 |
| Sustainable development        | 0.541     | 0.565   | 0.562   | 0.561   | 0.527 |
| Intensive sectoral development | 0.096     | 0.090   | 0.088   | 0.091   | 0.097 |
| Conservation of nature         | 0.214     | 0.218   | 0.223   | 0.220   | 0.254 |
| Number of iterations           | 23        | 30      | 24      | 30      | –     |
| Consensus index                | 1.276     | 0.338   | 1.094   | 3.901   | –     |
| Threshold of consensus index   | 1.300     | 0.350   | 1.100   | 0.100   | –     |

In Step 7 we calculate the group weights by the eigenvector method (5): w_group = (0.156, 0.577, 0.116, 0.150), and in Step 8 the algorithm ends.

The group weights of Scenario 1 are presented in Tables 5, 6, 7, 8, 9, and 10. The results show that the ranking of alternatives is identical for all models. The weights of alternatives regarding the strengths factors differ by at most 4% between the models, and the aggregated weights of alternatives regarding the SWOT group Strengths differ by at most 2% between the models. The results of Models 2–4 are similar because the algorithms of the models are similar. The results of Models 2–4 are also on average closer to the results of the WGMM, which can indicate a smaller influence of the individual judgments and a higher influence of the


Table 8 Scenario 1: group weights of alternatives regarding climate, number of iterations, the final values of consensus index, and the threshold values of consensus index

| Climate                        | New model | Model 2 | Model 3 | Model 4 | WGMM  |
| Going with the flow            | 0.132     | 0.136   | 0.137   | 0.138   | 0.134 |
| Sustainable development        | 0.468     | 0.493   | 0.489   | 0.473   | 0.503 |
| Intensive sectoral development | 0.086     | 0.077   | 0.076   | 0.074   | 0.077 |
| Conservation of nature         | 0.314     | 0.294   | 0.298   | 0.315   | 0.286 |
| Number of iterations           | 24        | 20      | 24      | 30      | –     |
| Consensus index                | 1.279     | 0.337   | 1.097   | 3.551   | –     |
| Threshold of consensus index   | 1.300     | 0.350   | 1.100   | 0.100   | –     |

Table 9 Scenario 1: group weights of alternatives regarding natural park, number of iterations, the final values of consensus index, and the threshold values of consensus index

| Natural park                   | New model | Model 2 | Model 3 | Model 4 | WGMM  |
| Going with the flow            | 0.125     | 0.125   | 0.120   | 0.122   | 0.126 |
| Sustainable development        | 0.549     | 0.570   | 0.580   | 0.576   | 0.564 |
| Intensive sectoral development | 0.078     | 0.077   | 0.076   | 0.077   | 0.077 |
| Conservation of nature         | 0.248     | 0.228   | 0.225   | 0.225   | 0.233 |
| Number of iterations           | 21        | 23      | 16      | 30      | –     |
| Consensus index                | 1.250     | 0.334   | 1.098   | 3.154   | –     |
| Threshold of consensus index   | 1.300     | 0.350   | 1.100   | 0.100   | –     |

Table 10 Scenario 1: aggregated global weights of alternatives regarding strengths

| Global weights                 | New model | Model 2 | Model 3 | Model 4 | WGMM  |
| Going with the flow            | 0.148     | 0.137   | 0.137   | 0.137   | 0.133 |
| Sustainable development        | 0.540     | 0.567   | 0.567   | 0.563   | 0.555 |
| Intensive sectoral development | 0.103     | 0.094   | 0.093   | 0.094   | 0.102 |
| Conservation of nature         | 0.210     | 0.202   | 0.203   | 0.206   | 0.210 |

geometric mean, which occurs in the adaptations in Models 2–4, on the final results. The number of iterations was the lowest in the New model and Model 3. The number of iterations in the New model ranges from 19 to 24, which confirms that the threshold value of 1.3 for the consensus index was suitably selected. The threshold values of the consensus indexes of Models 2 and 3 are appropriate, but the threshold of Model 4 is not, because it is too low and cannot be reached within 30 iterations.

Table 11 presents the DMs that made changes to their PCMs in the iterations. The results show that at the beginning of the iteration process the first few most incompatible DMs are almost identical for all models. In general, DM2 and DM9 were the most disagreeable. The results of the consensus building process for the comparisons of alternatives regarding Cultural heritage show that DMs in the New model had to make at most two adaptations, while in Models 2–4 the most disagreeable DMs made three or four changes. All DMs made at least one adjustment in the New model, while in Models 2–4 DM7 and DM11 made no modifications.

Table 11 Scenario 1: DMs that adapted their PCMs in iterations of consensus models: It—The number of iteration, New—new model, M1—Model 1, M2—Model 2, M3—Model 3, M4—Model 4

It 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Cultural heritage New M2 M3 4 4 4 9 9 9 2 8 2 1 4 1 8 2 9 3 1 4 5 9 8 10 3 3 6 5 5 12 6 6 9 8 10 4 2 4 1 10 2 3 4 8 2 1 3

M4 4 9 8 4 1 9 2 5 3 8 10 4 6 12 1

Natural resources New M2 M3 2 2 2 6 6 6 2 2 2 3 6 6 6 2 3 9 3 2 1 9 9 4 1 1 8 6 6 2 4 4 3 8 8 11 2 3 6 3 2 8 9 9 4 1 1 M4 6 2 6 2 9 1 6 2 3 8 4 9 6 1 2

Preservation of nature New M2 M3 2 2 2 4 2 2 2 4 4 11 2 2 5 11 11 3 4 4 1 1 5 6 3 9 8 6 6 4 9 1 2 5 3 9 8 2 12 2 8 10 4 4 5 12 11 M4 2 2 4 2 11 4 1 3 8 6 5 2 9 11 4

Climate New M2 9 9 11 2 2 11 5 9 8 3 3 5 9 4 4 2 1 8 10 11 7 9 11 1 2 7 9 3 12 4 M3 9 2 11 9 5 3 11 8 9 2 4 7 1 3 5

M4 9 9 11 5 8 2 9 11 3 4 1 5 8 2 10

Natural park New M2 9 9 2 2 6 6 3 9 8 3 1 8 11 2 2 6 5 1 9 11 6 5 2 3 4 9 9 8 6 4

M4 9 9 2 6 8 3 9 5 2 1 6 8 3 11 9

M3 2 9 6 9 2 3 6 8 5 9 3 1 6 2 11



Table 11 (continued)

It 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Cultural heritage New M2 M3 7 12 1 11 9 9 8 3 12 6 5 6 10 8


M4 9 2 5 3 10 8 4 6 12 1 9 2 7 5 3

Natural resources New M2 M3 3 6 6 9 4 4 12 8 8 1 11 3 7 2 2 2 3 9 5 9 10 1 4 12 6 7 8 5 M4 3 8 4 9 11 2 6 1 3 12 7 8 4 9 2

Preservation of nature New M2 M3 6 11 12 2 1 5 9 3 6 4 6 1 1 9 3 7 8 9 11 5 10 3 2 2 10 8 4 1 12 11 3 6 M4 12 1 3 8 6 9 10 5 2 11 4 1 3 8 12

Climate New M2 6 2 1 5 11 8 5 11 7 10 3 1 4 9 11 7 8 3 M3 2 4 8 10 11

M4 9 11 4 3 1 8 5 2 7 10 9 11 4 3 12

Natural park New M2 5 2 10 1 11 6 3 11 7 3 2 5 8 10

M3 8

M4 5 6 2 1 4 3 8 11 6 2 1 4 9 3 8



In the iteration process for Natural resources in the New model, DM2 was the most incompatible, with four changes of his judgments, followed by DM3 and DM6 with three changes; all the other DMs made at least one adjustment. In Models 2–4 the most incompatible DMs had to make five, or in Model 4 even six, alterations. One DM made no changes in Model 2, two DMs in Model 4, and five DMs in Model 3. The results of the Preservation of nature consensus iteration process demonstrate that the modifications of the PCMs in Models 2–4 can be insignificant, so the most discordant DM2 had to make five modifications of his PCM, including consecutive modifications in iterations 1 and 2. That was not the case in the New model, where DM2 had to make four non-consecutive changes. All DMs made adaptations in the New model, while that was not the case for DM7 in Models 2–4. In the results for Climate, DMs had to make at most three (Model 3), four (New model and Model 2), or five (Model 4) adjustments. All DMs made changes in the New model, while DM6 did not make any modifications in Models 2–4 and DM12 made none in Models 2 and 3. Similarly, the results for Natural park show that the most discordant DM had to make three (Models 2 and 3), four (New model), or five (Model 4) adaptations. DM12 did not make any adjustments in any model, DM7 made none in Models 2–4, DM10 made none in Models 3 and 4, and DM4 made none in Model 3.

We can conclude that in the New model DMs have to make a smaller number of modifications on average. When DMs accept all suggested modifications, the most incompatible DMs have to make three to five changes in 30 iterations, while some DMs do not make any adjustments. This can be challenging and can create an uncomfortable group atmosphere, because the DMs that made the most adaptations can experience the consensus building process as disrespectful of their initial judgments. The fact that all DMs make at least one adjustment can contribute to a more constructive group atmosphere. This was the case only in the New model, in four iteration processes out of five, which speaks in favor of the New model. On the other hand, between one and five DMs did not make any adaptations in Models 2–4.

The group weights of Scenario 2 are presented in Tables 12, 13, 14, 15, 16, and 17. The results show that the ranking of alternatives is identical for all models. Comparing the results of Scenario 1 and Scenario 2, the weights of alternatives regarding the strengths

Table 12 Scenario 2: group weights of alternatives regarding cultural heritage, number of iterations, the final values of consensus index, and the threshold values of consensus index

| Cultural heritage              | New model | Model 2 | Model 3 | Model 4 |
| Going with the flow            | 0.156     | 0.144   | 0.142   | 0.143   |
| Sustainable development        | 0.577     | 0.607   | 0.604   | 0.604   |
| Intensive sectoral development | 0.116     | 0.103   | 0.103   | 0.104   |
| Conservation of nature         | 0.150     | 0.146   | 0.151   | 0.149   |
| Number of iterations           | 19        | 23      | 18      | 30      |
| Consensus index                | 1.296     | 0.334   | 1.100   | 4.488   |
| Threshold of consensus index   | 1.300     | 0.350   | 1.100   | 0.100   |


Table 13 Scenario 2: group weights of alternatives regarding natural resources, number of iterations, the final values of consensus index, and the threshold values of consensus index

| Natural resources              | New model | Model 2 | Model 3 | Model 4 |
| Going with the flow            | 0.170     | 0.148   | 0.149   | 0.146   |
| Sustainable development        | 0.504     | 0.562   | 0.575   | 0.590   |
| Intensive sectoral development | 0.149     | 0.117   | 0.109   | 0.094   |
| Conservation of nature         | 0.177     | 0.173   | 0.167   | 0.170   |
| Number of iterations           | 22        | 30      | 29      | 30      |
| Consensus index                | 1.265     | 0.314   | 1.092   | 4.268   |
| Threshold of consensus index   | 1.300     | 0.350   | 1.100   | 0.100   |

Table 14 Scenario 2: group weights of alternatives regarding preservation of nature, number of iterations, the final values of consensus index, and the threshold values of consensus index

| Preservation of nature         | New model | Model 2 | Model 3 | Model 4 |
| Going with the flow            | 0.145     | 0.126   | 0.127   | 0.132   |
| Sustainable development        | 0.521     | 0.559   | 0.562   | 0.586   |
| Intensive sectoral development | 0.105     | 0.093   | 0.088   | 0.087   |
| Conservation of nature         | 0.229     | 0.223   | 0.223   | 0.195   |
| Number of iterations           | 24        | 30      | 24      | 30      |
| Consensus index                | 1.287     | 0.334   | 1.094   | 3.948   |
| Threshold of consensus index   | 1.300     | 0.350   | 1.100   | 0.100   |

Table 15 Scenario 2: group weights of alternatives regarding climate, number of iterations, the final values of consensus index, and the threshold values of consensus index

| Climate                        | New model | Model 2 | Model 3 | Model 4 |
| Going with the flow            | 0.127     | 0.137   | 0.137   | 0.133   |
| Sustainable development        | 0.484     | 0.494   | 0.489   | 0.434   |
| Intensive sectoral development | 0.086     | 0.077   | 0.076   | 0.068   |
| Conservation of nature         | 0.303     | 0.293   | 0.298   | 0.366   |
| Number of iterations           | 23        | 24      | 22      | 30      |
| Consensus index                | 1.293     | 0.342   | 1.097   | 3.415   |
| Threshold of consensus index   | 1.300     | 0.350   | 1.100   | 0.100   |

factors are similar: almost equal in Models 2 and 3, with slight differences in the New model and Model 4. The differences between Scenarios 1 and 2 in the New model are a consequence of the fact that different DMs made adaptations and that the adjustments were larger in the New model than in Models 2 and 3. The number of iterations is very similar in Scenarios 1 and 2, which implies that if DMs are allowed to decline the adaptations, the iteration process is not more time consuming. However, in a time evaluation of the models, the time needed to select another DM when the first one rejects the moderator's suggestion should also be considered.

Tables 18, 19, and 20 present the DMs that changed their PCMs in particular iterations. In all models the same DM has to make several consecutive rejections of modifications. This cannot be avoided, because a DM who refuses to make an adjustment


Table 16 Scenario 2: group weights of alternatives regarding natural park, number of iterations, the final values of consensus index, and the threshold values of consensus index

| Natural park                   | New model | Model 2 | Model 3 | Model 4 |
| Going with the flow            | 0.125     | 0.122   | 0.120   | 0.116   |
| Sustainable development        | 0.551     | 0.573   | 0.580   | 0.595   |
| Intensive sectoral development | 0.082     | 0.077   | 0.076   | 0.074   |
| Conservation of nature         | 0.243     | 0.228   | 0.225   | 0.216   |
| Number of iterations           | 22        | 22      | 17      | 30      |
| Consensus index                | 1.260     | 0.350   | 1.098   | 3.040   |
| Threshold of consensus index   | 1.300     | 0.350   | 1.100   | 0.100   |

Table 17 Scenario 2: aggregated global weights of alternatives regarding Strengths

| Global weights                 | New model | Model 2 | Model 3 | Model 4 |
| Going with the flow            | 0.148     | 0.137   | 0.137   | 0.136   |
| Sustainable development        | 0.529     | 0.563   | 0.562   | 0.569   |
| Intensive sectoral development | 0.112     | 0.096   | 0.095   | 0.088   |
| Conservation of nature         | 0.210     | 0.203   | 0.206   | 0.207   |

remains the most discordant DM in the subsequent iteration. The results show that DMs in Models 2–4 rejected the moderator's recommendations more often than DMs in the New model. In the results for Cultural heritage, the course of the iteration process in the New model was ideal, because all DMs accepted the changes and every DM made at least one and at most two adaptations, separated by at least nine iterations. The iteration process stopped after 19 iterations. In Models 2–4, DM4 made 6–8 rejections and DM9 made 2–3 rejections. Both DMs accepted three or four changes in the iteration process, while DM7 and DM11 did not make any adaptations. In the results for Natural resources, in the New model DM2 declined the modification 16 times and accepted four changes, while in Models 2–4 DM2 and DM6 made 20 refusals and 4–5 acceptances of adjustments. In the New model all DMs made at least one change, while in Models 2–4 one or two DMs made no change. The results for Preservation of nature show that DM2 was the most discordant. He refused adaptations 15 times in the New model and 18–24 times in Models 2–4, and consented to modifications 4 times in the New model and 4–5 times in Models 2–4. Only DM7 made no change in Models 2–4, while all DMs made at least one modification in the New model. In the Climate consensus building process DM9 was the most incompatible. He made 2 rejections and 4 acceptances in the New model, 9 rejections and 4 acceptances in Models 2 and 3, and 22 rejections and 5 acceptances in Model 4. All DMs made adjustments in the New model and Model 4, whereas DM6 and DM12 did not make any adjustments in Models 2 and 3. The iteration process for Natural park exposed three discordant DMs. In the New model DM2 made six refusals, DM6 four refusals, and DM9 three refusals, together 13 refusals, while all of them agreed to


Table 18 Scenario 2: comparison of alternatives regarding cultural heritage and natural resources: DMs that adapted their PCMs in iterations of consensus models: It—the number of iteration, New—new model, M1—Model 1, M2—Model 2, M3—Model 3, M4—Model 4, d—DM declines the modifications It 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Cultural heritage New M2 M3 4 4 4 9 9 9 2 8 2 1 4d, 2 1 8 4d, 1 9d, 4d, 8 3 4d, 9d, 3 4d, 9d, 3 5 4d, 9d, 5 4d, 9d, 5 10 4 4 6 9 9 12 6 6 9 8 10 4 4d, 2 4d, 2 1 4d, 10 4d, 3 3 4d, 1 4d, 1 2 4 4 7 9 8 11 12 9 8 3 12 6 5 6 10 4 8

M4 4 9 8 4d, 1 4d, 9d, 2 4d, 9d, 5 4d, 9d, 3 4 9 8 4d, 6 4d, 10 4d, 12 4d, 1 4 2 9 3 5 8 4d, 10 4 6 12 1 2 9 7 3 5

Natural resources New M2 2 2 6 6 2d, 3 2d, 6d, 3 2d, 9 2d, 6d, 9 2d, 1 2d, 6d, 1 2d, 6d, 4 2d, 6d, 4 2d, 8 2d, 6d, 8 2 2 6 6 2d, 11 2d, 6d, 3 2d, 3 2d, 6d, 9 2d, 4d, 8d, 6d, 7 2d, 6d, 1 2d, 4 2d, 6d, 4 2d, 8 2d, 6d, 8 2 2 2d, 6 6 2d, 10 2d, 6d, 11 2d, 5 2d, 6d, 3 2d, 12 2d, 6d, 9 2d, 9 2d, 6d, 1 2d, 1 2d, 6d, 4 2 2 6 2d, 6d, 12 2d, 6d, 7 2d, 6d, 8 2d, 6d, 11 2d, 6d, 5 2 6

M3 2 6 2d, 6d, 3 2d, 6d, 9 2d, 6d, 1 2d, 6d, 4 2d, 6d, 8 2 6 2d, 6d, 3 2d, 6d, 9 2d, 6d, 1 2d, 6d, 4 2d, 6d, 8 2 6 2d, 6d, 3 2d, 6d, 9 2d, 6d, 11 2d, 6d, 1 2d, 6d, 7 2 6 2d, 6d, 12 2d, 6d, 4 2d, 6d, 8 2d, 6d, 3 2d, 6d, 9 2

M4 6 2 6d, 2d, 9 6d, 2d, 1 6d, 2d, 3 6d, 2d, 8 6d, 2d, 4 6 2 6d, 2d, 9 6d, 2d, 3d, 1 6d, 2d, 3 6d, 2d, 9d, 8 6d, 2d, 9d, 4 6 2 6d, 2d, 9 6d, 2d, 3d, 12 6d, 2d, 3 6d, 2d, 7 6d, 2d, 1 6 2 6d, 2d, 9 6d, 2d, 12 6d, 2d, 11 6d, 2d, 8 6d, 2d, 3 6 2

take three adjustments. In Models 2–4 the total number of rejections varied from 11 to 25, while DM2, DM6, and DM9 made 3–5 modifications each. DM12 made no adjustments in the New model, while the number of DMs that made no adjustments varied from two to four in the other models.

The results show that the number of rejections is significantly smaller in the New model than in Models 2–4. The smaller number of refusals contributes to a better and more comfortable group environment, in which the same DM is less exposed and


Table 19 Scenario 2: comparison of alternatives regarding preservation of nature and climate: DMs that adapted their PCMs in iterations of consensus models: It—the number of iteration, New—new model, M1—Model 1, M2—Model 2, M3—Model 3, M4—Model 4, d—DM declines the modifications It 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Preservation of nature New M2 2 2 4 2d, 4 2d, 11 2d, 11 2d, 5 2d, 4d, 9 2d, 9 2d, 4d, 1 2d, 8 2d, 4d, 3 2d, 1 2d, 4d, 8 2 2 3 2d, 4 6 2d, 6 2d, 4 2d, 5 2d, 12 2d, 4d, 12 2d, 10 2d, 4d, 11 2d, 5 2d, 4d, 1 2 2 6ne 2d, 4 9 2d, 3 6 2d, 6 2d, 12d, 7 2d, 9 2d, 3 2d, 8 2d, 12 2d, 5 2d, 1 2 2 2d, 10 4 2d, 4 11 2d, 1 2d, 10 2d, 3 2d, 3 2d, 12 2d, 11 2 6

M3 2 2d, 4 2d, 11 2d, 9 2d, 5 2d, 4d, 1 2d, 4d, 8 2 2d, 4 2d, 3 2d, 6 2d, 4d, 11 2d, 12 2d, 4d, 9 2 2d, 4 2d, 5 2d, 10 2d, 1 2d, 6 2d, 3 2 8

M4 2 2d, 4 2d, 11 2d, 4d, 5 2d, 4d, 5 2d, 4d, 3 2d, 4d, 8 2 2d, 4 2d, 6 2d, 11 2d, 4d, 9 2d, 1 2d, 4d, 3 2 2d, 4 2d, 6 2d, 8 2d, 12 2d, 5 2d, 10 2 2d, 11 2d, 1 2d, 4 2d, 3 2d, 9 2d, 6 2 8

Climate New M2 9 9 11 2 2 11 5 9d, 3 8 9d, 4 3 9d,2d, 5 9d, 4 9d,2d, 8 9 9 1 2 7 11 10 9d, 1 11 9d, 7 2 9d, 3 9d, 12 9d, 4 9 9 6 2 5 5 1 8 11 11 7 10 3 9d, 1 4 9 11d, 2 7 3

M3 9 2 11 9d, 5 9d, 3 9d, 2d, 8 9d, 2d, 11d, 4 9 2 11 9d, 7 9d, 1 9d, 3 9d, 5 9 2 4 8 10 11 9d, 7 9

M4 9 9d, 11 5 9d, 8 9d, 2 9d, 11d, 3 9d, 11d, 5d, 4 9 11 9d, 5 9d, 10 9d, 8 9d, 11d, 1 9d, 11d, 5d, 2 9 9d, 11 9d, 5 9d, 10 9d, 8 9d, 11d, 12 9d, 11d, 7 9 9d, 11 9d, 5 9d, 6 9d, 10 9d, 4 9d, 3 9 1

the execution time for the iteration process is shorter. These findings confirm the efficiency of the New model. The main reason for fewer rejections is the magnitude of the modifications, which is apparently more suitable in our new model. The next interesting observation was that, irrespective of all the refusals made by the DMs, the same DMs almost always made the same number of adjustments as in Scenario 1, where they accepted all modifications. The only difference was the iteration numbers in which the modifications were made. Nevertheless, this difference can


Table 20 Scenario 2: comparison of alternatives regarding natural park: DMs that adapted their PCMs in iterations of consensus models: It—the number of iteration, New—new model, M1—Model 1, M2—Model 2, M3—Model 3, M4—Model 4, d—DM declines the modifications It 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Natural park New 9 2 6 3 8 1 11 2d, 5 2 9 6 2d, 4 2d, 9d, 6d, 7 2d, 9d, 6d, 3 2d, 9d, 6d, 5 2 6d, 9 9 6 10 11 2d, 5d, 1 2d, 8

M2 9 2 6 9d, 3 9d, 8 9d, 2d, 6d, 1 9d, 2d, 6d, 11 9 2 6 5 9d, 3 9d, 8 9d, 4 9 2 1 6 11 3 5 9

M3 2 9 6 9d, 2d, 3 9d, 2d, 6d, 8 9d, 2d, 6d, 5 9d, 2d, 6d, 1 9d, 2 9 6 9d, 3 9d, 6d, 2d, 11 9d, 6d, 2d, 8 9d, 6d, 2d, 5 9d, 2 9 6

M4 9 9d, 2 9d, 6 9d, 8 9d, 3 9d, 5 9d, 2d, 6d, 1 9 9d, 2 9d, 6 9d, 3d, 11 9d, 3 9d, 8 9d, 5 9 9d, 6d, 2 9d, 6 9d, 1 9d, 3 9d, 4 9d, 8 9 11 6 9d, 5 9d, 1 9d, 3 9d, 2 9 7

crucially affect the group atmosphere and the final decision as well. In the New model all DMs made at least one adjustment except in one iteration process, while the number of DMs that made no changes ranged from one to four in Models 2–4. This observation has a positive impact on the credibility of the New model. We observed that the DMs that made no modifications were almost the same in both scenarios, which indicates that DMs' refusals of changes do not significantly affect the number of DMs that make no change.


7 Conclusions

Consensus models are an important issue in group multi-criteria decision making. The iteration process helps DMs to bring their opinions closer to the consensual group opinion. In this paper we focused on consensus models in the analytic hierarchy process.

The paper examined several properties of consensus models. We discussed the appropriate number of iterations in the consensus building process and how many changes DMs make, or are prepared to make, in the iteration process. We also examined the threshold of the consensus index and concluded that the threshold of the consensus index and the maximal number of iterations are closely related: smaller threshold values of the consensus index demand more iteration rounds to be reached. Analysis of the behavior of the consensus model can contribute to an appropriate setting of the threshold value, one that can be achieved before the maximal number of iterations is reached.

The main part of the study presents a novel soft consensus model, based on the weighted geometric mean and the geometric standard deviation. We analyzed the behavior of the proposed consensus model and three other consensus models from the literature in two scenarios on the data from the case study. The research demonstrated the validity of the novel consensus model. Evaluation of Scenario 1 confirmed the finding that it is beneficial if DMs are allowed to decide whether they want to make a modification to their judgments or not (Dong and Saaty 2014). Investigation of Scenario 2 revealed that if DMs' modifications of their PCMs are large enough, this can influence the number of iterations, the number of modifications and refusals that one DM has to make, and the number of DMs that make at least one change. With respect to the above observations, the analysis demonstrated the favorable properties of our proposed model. Further applications of the new consensus model can confirm its suitability as an AHP group method that can create a satisfactory group decision environment.

Acknowledgments The authors acknowledge the financial support from the Slovenian Research Agency (research core funding No. P4-0059).

Appendix 1

See Tables 21, 22, 23, 24, and 25.


Table 21 PCMs of twelve DMs of alternatives regarding cultural heritage 1

1 8

2

2

1

1 7

1

6

1

1 7

1 3

1 3

1

1 4

7

1 5

8

1

8

9

7

1

8

8

7

1

4

4

4

1

9

1

1 2

1 8

1

3

1

1 8

1

1

3

1 4

1

1

1 7

1 9

1

1 5

1 2

1 9

1 3

1

1 6

1 8

1

1

3

1 4

1

1

5

1

5

1

DM1, CR = 0.07

DM2, CR = 0.148

DM3, CR = 0.014

DM4, CR = 0.136

1

1 3

1 2

1 3

1

1 5

2

1 2

1

1 6

2

2

1

1 9

4

1 2

3

1

3

3

5

1

6

2

6

1

6

6

9

1

9

9

2

1 3

1

1 2

1 2

1 6

1

1 5

1 2

1 6

1

1 2

1 4

1 9

1

1 5

3

1 3

2

1

2

1 2

5

1

1 2

1 6

2

1

2

1 9

5

1

DM5, CR = 0.054

DM6, CR = 0.015

DM7, CR = 0.045

DM8, CR = 0.127

1

1 9

1

7

1

1 3

1 2

1 2

1

1 5

3

1

1

1 2

2

2

9

1

9

9

3

1

3

3

5

1

4

4

2

1

3

3

1

1 9

1

2

1 3

1

1 2

1 3

1 4

1

1 2

1 2

1 3

1

1

1 7

1 9

1 3

2

1 3

1

1 4

1

1 2

1 3

1

1

3 1

DM9, CR = 0.141

2

1

DM10, CR = 0.045

2

DM11, CR = 0.059

DM12, CR = 0.004

Table 22 PCMs of twelve DMs of alternatives regarding natural resources 1

1 7

3

4

1

1 5

1 7

1 5

1

1 7

1 5

1 5

1

1 6

4

1 6

7

1

8

7

5

1

1 7

1 6

7

1

4

3

6

1

9

4

1 3

1 8

1

3

7

7

1

1

5

1 4

1

1 5

1 4

1 9

1

1 5

1 4

1 7

1 3

1

5

6

1

1

5

1 3

5

1

6

1 4

5

1

DM1, CR = 0.121

DM2, CR = 0.147

DM3, CR = 0.146

DM4, CR = 0.145

1

1 3

5 2

1 2

1

2

1 5

4

1

1 3

2

2

1

1 9

2

1 4

3

1

3

3

1 2

1

1 3

3

3

1

4

2

9

1

9

9

2 5

1 3

1

1 2

5

3

1

6

1 2

1 4

1

1 2

1 2

1 9

1

1 5

2

1 3

2

1

1 4

1 3

1 6

1

1 2

1 2

2

1

4

1 9

5

1

DM5, CR = 0.063 1

1 9

1

3

9

1

9

1

1 9

1 3

1 9

DM6, CR = 0.078 1

1 4

9

4

1

1

4

1 2

1 4

1 4

1

2

1 4

DM9, CR = 0.078

2

1 2

4

DM7, CR = 0.044 1

1 7

4

7

1

1

1 2

1 3

1 7

2

1

2

1 7

DM10, CR = 0.045

DM8, CR = 0.123

3

1 2

1

1 3

2

2

7

7

3

1

3

3

1

1 2

1 2

1 3

1

1

2

1

1 2

1 3

1

1

DM11, CR = 0.081

DM12, CR = 0.023


Table 23 PCMs of twelve DMs of alternatives regarding preservation of nature 1

1 8

6

1 2

1

1 5

1 7

1 5

1

1 6

4

1 5

1

1 5

3

1 9

8

1

9

1

5

1

1 7

1 6

6

1

7

1

5

1

9

1 5

1 6

1 9

1

1 9

7

7

1

1

1 4

1 7

1

1 9

1 3

1 9

1

1 9

2 1 9 1 DM1, CR = 0.118

5 6 1 1 DM2, CR = 0.147

5 1 9 1 DM3, CR = 0.054

9 5 9 1 DM4, CR = 0.124

1

1 4

1 2

1 2

1

1 4

2

1 4

1

1 4

3

1 2

1

1 9

4

1 3

4

1

4

4

4

1

6

1 2

4

1

6

3

9

1

9

7

2

1 4

1

2

1 2

1 6

1

1 7

1 3

1 6

1

1 2

1 4

1 9

1

1 6

2

1 4

1 2

1

4

2

7

1

2

1 3

2

1

3

1 7

6

1

DM5, CR = 0.045

DM6, CR = 0.021

DM7, CR = 0.044

DM8, CR = 0.133

1

1 9

2

3

1

1 5

1

1

1

1 9

1 2

1 3

1

1 3

2

2

9

1

8

6

5

1

5

5

9

1

7

7

3

1

3

4

1 2

1 8

1

1 3

1

1 5

1

1

2

1 7

1

2

1 2

1 3

1

1 2

1 3

1 6

3

1

1

1 5

1

1

3

1 7

1 2

1

1 2

1 4

2

1

DM9, CR = 0.142

DM11, CR = 0.062

DM10, CR=0

DM12, CR = 0.050

Table 24 PCMs of twelve decision makers of alternatives regarding climate 1

1 9

6

1 4

1

1 4

3

1 8

1

1 4

4

1 6

1

1 5

6

1 5

9

1

9

1

4

1

5

1 6

4

1

6

1 3

5

1

7

1 2

1 6

1 9

1

1 6

1 3

1 5

1

1 7

1 4

1 6

1

1 8

1 6

1 7

1

1 7

4 1 6 1 DM1, CR = 0.146

8 6 7 1 DM2, CR = 0.136

6 3 8 1 DM3, CR = 0.079

5 2 7 1 DM4, CR = 0.126

1

1 2

1 2

1 2

1

1 3

2

1 2

1

1 3

2

1 2

1

1 9

1

1 5

2

1

2

3

3

1

7

3

3

1

6

1 3

9

1

9

6

2

1 2

1

1 2

1 2

1 7

1

1 5

1 2

1 6

1

1 4

1

1 9

1

1 6

2

1 3

2

1

2

1 3

5

1

2

3

4

1

5

1 6

6

1

DM5, CR = 0.081

DM6, CR = 0.027

DM7, CR = 0.111

DM8, CR = 0.082

1

1 9

1

3

1

1 4

1

1 2

1

1 8

3

3

1

1 2

3

1 2

9

1

9

7

4

1

4

4

8

1

8

8

2

1

4

2

1

1 9

1

3

1

1 4

1

1 2

1 3

1 8

1

1 2

1 3

1 4

1

1 4

1 3

1 7

1 3

1

2

1 4

2

1

1 3

1 8

2

1

2

1 2

4

1

DM9, CR = 0.088

DM10, CR = 0.023

DM11, CR = 0.081

DM12, CR = 0.030


Table 25 PCMs of twelve DMs of alternatives regarding natural park 1

1 7

4

1 2

1

1 6

5

5

1

1 6

3

1 8

1

1 5

4

1 3

7

1

9

8

6

1

7

7

6

1

8

1

5

1

8

6

1 4

1 9

1

1 8

1 5

1 7

1

1 2

1 3

1 8

1

1 9

1 4

1 8

1

1 2

2

1 8

8

1

1 5

1 7

2

1

8

1

9

1

3

1 6

2

1

DM1, CR = 0.144

DM2, CR = 0.126

DM3, CR = 0.041

DM4, CR = 0.143

1

1 4

1 2

1 2

1

1 6

4

1 5

1

1 4

3

1 2

1

1 9

1

1 7

4

1

3

2

6

1

6

1 3

4

1

6

2

9

1

9

7

2

1 3

1

1 2

1 4

1 6

1

1 7

1 3

1 6

1

1 3

1

1 9

1

1 7

2

1 2

2

1

5

3

7

1

2

1 2

3

1

7

1 7

7

1

DM5, CR = 0.017

DM6, CR = 0.134

1

1

1

1

1

1 5

1

1 3

1

1 9

3

1 2

1

1 3

2

1 2

1

1

1

1

5

1

4

4

9

1

9

9

3

1

4

4

1

1

1

1

1

1 4

1

1 3

1 3

1 9

1

1 2

1 2

1 4

1

1 3

1

1

1

1

3

1 4

3

1

2

1 9

2

1

2

1 4

3

1

DM9, CR = 0

DM10, CR = 0.048

DM7, CR = 0.023

DM11, CR = 0.081

DM8, CR = 0.142

DM12, CR = 0.058

Appendix 2

Iterations of the Algorithm

t = 1:

⎡ ⎤ ⎤ 1 0.191 1.375 1.104 1 1.638 2.233 2.855 ⎢ ⎢ ⎥ ⎥ ⎢ 5.248 1 5.171 4.433 ⎥ ⎢ 1.638 1 1.549 1.770 ⎥ G1 = ⎢ ⎥, CRG1 = 0.006, S 1 = ⎢ ⎥ ⎣ 0.727 0.193 1 0.693 ⎦ ⎣ 2.233 1.549 1 2.351 ⎦ 0.906 0.226 1.443 1 2.855 1.770 2.351 1 ! 1 D = 0.048 0.051 0.051 0.021 0.042 0.038 0.010 0.059 0.091 0.040 0.013 0.032 .

The maximal disagreement value D (9)1 = 0.091, ⎡ ⎤ 1 0.182 2.234 2.452 ⎢ 5.496 1 5.811 5.084 ⎥ ⎥ A29 = ⎢ ⎣ 0.448 0.172 1 1.276 ⎦ t = 2: ⎡

0.408 0.197 0.784

1

⎡ ⎤ ⎤ 1 0.198 1.467 1.014 1 1.592 2.244 2.512 ⎢ ⎢ ⎥ ⎥ ⎢ 5.043 1 4.992 4.232 ⎥ ⎢ 1.592 1 1.500 1.700 ⎥ G2 = ⎢ ⎥, CRG2 = 0.006, S 2 = ⎢ ⎥ ⎣ 0.682 0.200 1 0.647 ⎦ ⎣ 2.244 1.500 1 2.127 ⎦ 0.986 0.236 1.547 1 2.512 1.700 2.127 1 ! 2 D = 0.053 0.057 0.051 0.018 0.039 0.033 0.011 0.056 0.020 0.038 0.010 0.032 .


The maximal disagreement value D (2)2 = 0.057, ⎡ ⎤ 1 0.227 2.244 2.389 ⎢ 4.398 1 5.335 4.707 ⎥ ⎥ A32 = ⎢ ⎣ 0.446 0.189 1 0.470 ⎦ 0.419 0.212 2.127 1 t = 3:⎡ ⎤

⎡ ⎤ 1 0.206 1.567 0.941 1 1.575 2.241 2.207 ⎢ ⎢ ⎥ ⎥ ⎢ 4.856 1 4.830 4.053 ⎥ ⎢ 1.575 1 1.461 1.639 ⎥ G3 = ⎢ ⎥, CRG3 = 0.008, S 3 = ⎢ ⎥ ⎣ 0.638 0.207 1 0.608 ⎦ ⎣ 2.241 1.461 1 2.110 ⎦ 1.063 0.247 1.644 1 2.207 1.639 2.110 1 ! 3 D = 0.059 0.015 0.053 0.015 0.038 0.028 0.012 0.053 0.023 0.036 0.008 0.032 .

The maximal disagreement value D (1)3 = 0.059, ⎡ ⎤ 1 0.197 0.892 0.906 ⎢ 5.078 1 5.476 5.491 ⎥ ⎥ A41 = ⎢ ⎣ 1.121 0.183 1 1.422 ⎦ 1.103 0.182 0.703 1 t = 4:⎡ ⎤

⎡ ⎤ 1 0.214 1.468 0.882 1 1.534 2.267 2.130 ⎢ ⎢ ⎥ ⎥ ⎢ 4.679 1 4.684 3.894 ⎥ ⎢ 1.534 1 1.417 1.554 ⎥ G4 = ⎢ ⎥, CRG4 = 0.007, S 4 = ⎢ ⎥ ⎣ 0.681 0.214 1 0.572 ⎦ ⎣ 2.267 1.417 1 1.869 ⎦ 1.134 0.257 1.748 1 2.130 1.554 1.869 1 ! 4 D = 0.017 0.017 0.049 0.015 0.033 0.026 0.015 0.054 0.027 0.031 0.009 0.033 .

The maximal disagreement value D (8)4 = 0.054, ⎡ ⎤ 1 0.170 1.765 1.065 ⎢ 5.867 1 6.353 5.793 ⎥ ⎥ A58 = ⎢ ⎣ 0.567 0.157 1 0.374 ⎦ t = 5:⎡

0.939 0.173 2.675

1

⎡ ⎤ 1 0.221 1.374 0.938 1 1.470 2.141 2.088 ⎢ ⎢ ⎥ ⎥ ⎢ 4.520 1 4.554 3.758 ⎥ ⎢ 1.470 1 1.352 1.463 ⎥ G5 = ⎢ ⎥, CRG5 = 0.005, S 5 = ⎢ ⎥ ⎣ 0.728 0.220 1 0.602 ⎦ ⎣ 2.141 1.352 1 1.740 ⎦ 1.067 0.266 1.662 1 2.088 1.463 1.740 1 ! 5 D = 0.015 0.017 0.048 0.017 0.032 0.028 0.015 0.009 0.026 0.029 0.011 0.030 . ⎤

The maximal disagreement value D (3)5 = 0.048, ⎡ ⎤ 1 0.210 0.714 0.696 ⎢ 4.761 1 5.410 2.734 ⎥ ⎥ A63 = ⎢ ⎣ 1.402 0.185 1 0.575 ⎦ 1.437 0.366 1.740

1


t = 6:⎡

⎡ ⎤ ⎤ 1 0.228 1.462 0.995 1 1.436 1.934 1.958 ⎢ ⎢ ⎥ ⎥ ⎢ 4.381 1 4.667 3.643 ⎥ ⎢ 1.436 1 1.353 1.478 ⎥ G6 = ⎢ ⎥, CRG6 = 0.006, S 6 = ⎢ ⎥ ⎣ 0.684 0.214 1 0.575 ⎦ ⎣ 1.934 1.353 1 1.701 ⎦ 1.005 0.275 1.738 1 1.958 1.478 1.701 1 ! 6 D = 0.018 0.018 0.010 0.016 0.035 0.027 0.014 0.008 0.025 0.032 0.009 0.028 .

The maximal disagreement value D (5)6 = 0.035, ⎡ ⎤ 1 0.232 0.967 0.653 ⎢ 4.307 1 4.060 4.433 ⎥ ⎥ A75 = ⎢ ⎣ 1.034 0.246 1 0.850 ⎦ t = 7:⎡

1.533 0.226 1.176

1

⎡ ⎤ 1 0.222 1.537 1.047 1 1.410 1.809 1.828 ⎢ ⎢ ⎥ ⎥ ⎢ 4.503 1 4.776 3.753 ⎥ ⎢ 1.410 1 1.319 1.476 ⎥ G7 = ⎢ ⎥, CRG7 = 0.007, S 7 = ⎢ ⎥ ⎣ 0.651 0.209 1 0.599 ⎦ ⎣ 1.809 1.319 1 1.715 ⎦ 0.955 0.267 1.700 1 1.828 1.476 1.715 1 ! 7 D = 0.017 0.013 0.012 0.016 0.008 0.029 0.012 0.008 0.022 0.036 0.008 0.027 . ⎤

The maximal disagreement value D (10)7 = 0.036, ⎡ ⎤ 1 0.236 0.904 0.914 ⎢ 4.230 1 3.955 4.428 ⎥ ⎥ A810 = ⎢ ⎣ 1.106 0.253 1 0.858 ⎦ t = 8:⎡

1.094 0.226 1.166

1

⎡ ⎤ 1 0.215 1.626 1.109 1 1.372 1.642 1.738 ⎢ ⎢ ⎥ ⎥ ⎢ 4.653 1 4.903 3.894 ⎥ ⎢ 1.372 1 1.269 1.469 ⎥ G8 = ⎢ ⎥, CRG8 = 0.008, S 8 = ⎢ ⎥ ⎣ 0.615 0.204 1 0.631 ⎦ ⎣ 1.642 1.269 1 1.727 ⎦ 0.901 0.257 1.586 1 1.738 1.469 1.727 1 ! 8 D = 0.016 0.011 0.014 0.016 0.009 0.032 0.010 0.008 0.018 0.009 0.008 0.026 . ⎤

The maximal disagreement value D (6)8 = 0.032, ⎡ ⎤ 1 0.275 1.218 0.869 ⎢ 3.643 1 4.727 2.937 ⎥ ⎥ A96 = ⎢ ⎣ 0.821 0.212 1 0.345 ⎦ 1.151 0.341 2.896

1


t = 9:⎡

⎡ ⎤ ⎤ 1 0.220 1.566 1.157 1 1.381 1.645 1.658 ⎢ ⎢ ⎥ ⎥ ⎢ 4.542 1 4.815 4.010 ⎥ ⎢ 1.381 1 1.260 1.407 ⎥ G9 = ⎢ ⎥, CRG9 = 0.007, S 9 = ⎢ ⎥ ⎣ 0.639 0.208 1 0.657 ⎦ ⎣ 1.645 1.260 1 1.594 ⎦ 0.864 0.249 1.521 1 1.658 1.407 1.594 1 ! 9 D = 0.015 0.011 0.014 0.019 0.009 0.009 0.010 0.008 0.017 0.008 0.009 0.024 .

The maximal disagreement value D (12)9 = 0.024, ⎡ ⎤ 1 0.362 1.216 1.206 ⎢ 2.762 1 3.779 4.220 ⎥ ⎢ ⎥ A10 12 = ⎣ 0.822 0.265 1 0.625 ⎦ t = 10:⎡ G10

0.829 0.237 1.594

1

⎡ ⎤ 1 0.214 1.493 1.103 1 1.274 1.641 1.602 ⎢ ⎢ ⎥ ⎥ ⎢ 4.684 1 4.922 4.142 ⎥ ⎢ 1.274 1 1.208 1.387 ⎥ =⎢ ⎥, CRG10 = 0.007, S 10 = ⎢ ⎥ ⎣ 0.670 0.203 1 0.629 ⎦ ⎣ 1.641 1.208 1 1.559 ⎦ 0.907 0.241 1.591 1 1.602 1.387 1.559 1 ⎤

! D 10 = 0.015 0.012 0.013 0.019 0.008 0.008 0.010 0.007 0.019 0.007 0.010 0.006 .

The maximal disagreement value D (9)10 = 0.019, ⎡ ⎤ 1 0.232 1.361 1.530 ⎢ 4.313 1 4.812 3.667 ⎥ ⎢ ⎥ A11 9 = ⎣ 0.735 0.208 1 0.819 ⎦ t = 11:⎡ G11

0.654 0.273 1.222

1

⎡ ⎤ 1 0.218 1.435 1.061 1 1.269 1.616 1.517 ⎢ ⎢ ⎥ ⎥ ⎢ 4.593 1 4.848 4.034 ⎥ ⎢ 1.269 1 1.199 1.380 ⎥ =⎢ ⎥, CRG11 = 0.006, S 11 = ⎢ ⎥ ⎣ 0.697 0.206 1 0.607 ⎦ ⎣ 1.616 1.199 1 1.487 ⎦ 0.942 0.248 1.649 1 1.517 1.380 1.487 1 ⎤

! D 11 = 0.015 0.013 0.011 0.018 0.007 0.007 0.011 0.007 0.003 0.007 0.010 0.006 .

The maximal disagreement value D (4)11 = 0.018, ⎡ ⎤ 1 0.194 1.755 0.952 ⎢ 5.164 1 4.723 2.702 ⎥ ⎢ ⎥ A12 4 = ⎣ 0.570 0.212 1 0.746 ⎦ 1.050 0.370 1.341

1


t = 12:⎡ G12

⎡ ⎤ ⎤ 1 0.222 1.384 1.096 1 1.242 1.553 1.475 ⎢ ⎢ ⎥ ⎥ ⎢ 4.511 1 4.782 4.134 ⎥ ⎢ 1.242 1 1.192 1.310 ⎥ =⎢ ⎥, CRG12 = 0.006, S 12 = ⎢ ⎥ ⎣ 0.723 0.209 1 0.625 ⎦ ⎣ 1.553 1.192 1 1.487 ⎦ 0.913 0.242 1.600 1 1.475 1.310 1.487 1

! D 12 = 0.014 0.013 0.011 0.004 0.007 0.007 0.011 0.008 0.003 0.006 0.011 0.005 .

The maximal disagreement value D (1)12 = 0.014, ⎡ ⎤ 1 0.245 1.386 1.336 ⎢ 4.088 1 4.595 4.190 ⎥ ⎢ ⎥ A13 1 = ⎣ 0.722 0.218 1 0.957 ⎦ 0.748 0.239 1.045 1 t = 13:⎡ ⎡ ⎤ G13

⎤ 1 0.226 1.434 1.131 1 1.240 1.520 1.473 ⎢ ⎢ ⎥ ⎥ ⎢ 4.432 1 4.714 4.044 ⎥ ⎢ 1.240 1 1.186 1.291 ⎥ =⎢ ⎥, CRG13 = 0.008, S 13 = ⎢ ⎥ ⎣ 0.697 0.212 1 0.605 ⎦ ⎣ 1.520 1.186 1 1.398 ⎦ 0.884 0.247 1.652 1 1.473 1.291 1.398 1

! D 13 = 0.003 0.0117 0.0122 0.004 0.008 0.007 0.011 0.008 0.003 0.006 0.010 0.005 .

The maximal disagreement value D (3)13 = 0.0122, ⎡ ⎤ 1 0.261 1.084 1.026 ⎢ 3.839 1 4.562 3.531 ⎥ ⎢ ⎥ A14 3 = ⎣ 0.922 0.219 1 0.803 ⎦ 0.975 0.283 1.245 1 t = 14:⎡ ⎡ ⎤ G14

⎤ 1 0.230 1.484 1.167 1 1.243 1.449 1.432 ⎢ ⎢ ⎥ ⎥ ⎢ 4.355 1 4.649 4.129 ⎥ ⎢ 1.243 1 1.178 1.259 ⎥ =⎢ ⎥, CRG14 = 0.008, S 14 = ⎢ ⎥ ⎣ 0.674 0.215 1 0.622 ⎦ ⎣ 1.449 1.178 1 1.410 ⎦ 0.857 0.242 1.608 1 1.432 1.259 1.410 1

! D 14 = 0.003 0.011 0.003 0.004 0.008 0.008 0.010 0.008 0.002 0.007 0.010 0.005 .

The maximal disagreement value D (2)14 = 0.0122, ⎡ ⎤ 1 0.283 1.549 1.668 ⎢ 3.538 1 4.523 3.740 ⎥ ⎢ ⎥ A15 2 = ⎣ 0.646 0.221 1 0.633 ⎦ t = 15:⎡ G15

0.600 0.267 1.508

1

⎡ ⎤ 1 0.234 1.440 1.133 1 1.253 1.417 1.358 ⎢ ⎢ ⎥ ⎥ ⎢ 4.279 1 4.587 4.053 ⎥ ⎢ 1.253 1 1.173 1.256 ⎥ =⎢ ⎥, CRG15 = 0.006, S 15 = ⎢ ⎥ ⎣ 0.695 0.218 1 0.640 ⎦ ⎣ 1.417 1.173 1 1.395 ⎦ 0.882 0.247 1.563 1 1.358 1.256 1.395 1 ⎤

! D 15 = 0.003 0.003 0.002 0.004 0.007 0.008 0.0112 0.009 0.002 0.006 0.0105 0.004 .


The maximal disagreement value D (7)15 = 0.0112, ⎡ ⎤ 1 0.209 1.412 1.473 ⎢ 4.789 1 5.116 4.777 ⎥ ⎢ ⎥ A16 7 = ⎣ 0.709 0.196 1 0.697 ⎦ 0.679 0.209 1.434 1 t = 16:⎡ ⎡ ⎤ G16

⎤ 1 0.238 1.400 1.106 1 1.226 1.396 1.305 ⎢ ⎢ ⎥ ⎥ ⎢ 4.201 1 4.528 3.979 ⎥ ⎢ 1.226 1 1.152 1.223 ⎥ =⎢ ⎥, CRG16 = 0.005, S 16 = ⎢ ⎥ ⎣ 0.715 0.221 1 0.657 ⎦ ⎣ 1.396 1.152 1 1.383 ⎦ 0.904 0.251 1.522 1 1.305 1.223 1.383 1

! D 16 = 0.003 0.003 0.002 0.004 0.06 0.008 0.002 0.010 0.002 0.005 0.011 0.004 .

The maximal disagreement value D (11)16 = 0.011, ⎡ ⎤ 1 0.245 1.150 1.305 ⎢ 4.078 1 4.607 3.272 ⎥ ⎢ ⎥ A17 11 = ⎣ 0.465 0.217 1 0.692 ⎦ 0.767 0.306 1.446 1 t = 17:⎡ ⎡ ⎤ G17

⎤ 1 0.243 1.356 1.134 1 1.216 1.300 1.307 ⎢ ⎢ ⎥ ⎥ ⎢ 4.121 1 4.590 3.903 ⎥ ⎢ 1.216 1 1.144 1.233 ⎥ =⎢ ⎥, CRG17 = 0.004, S 17 = ⎢ ⎥ ⎣ 0.738 0.218 1 0.678 ⎦ ⎣ 1.300 1.144 1 1.365 ⎦ 0.882 0.256 1.476 1 1.307 1.233 1.365 1

! D 17 = 0.002 0.003 0.001 0.004 0.06 0.008 0.002 0.011 0.002 0.005 0.004 0.004 .

The maximal disagreement value D (8)17 = 0.011, ⎡ ⎤ 1 0.207 1.358 1.392 ⎢ 4.827 1 5.551 4.697 ⎥ ⎢ ⎥ A18 8 = ⎣ 0.736 0.180 1 0.510 ⎦ t = 18:⎡ G18

0.718 0.213 1.960

1

⎡ ⎤ 1 0.247 1.328 1.159 1 1.186 1.283 1.314 ⎢ ⎢ ⎥ ⎥ ⎢ 4.056 1 4.540 3.838 ⎥ ⎢ 1.186 1 1.115 1.200 ⎥ =⎢ ⎥, CRG18 = 0.004, S 18 = ⎢ ⎥ ⎣ 0.753 0.220 1 0.695 ⎦ ⎣ 1.283 1.115 1 1.308 ⎦ 0.863 0.261 1.439 1 1.314 1.200 1.308 1 ⎤

! D 18 = 0.002 0.002 0.001 0.004 0.06 0.008 0.002 0.003 0.002 0.005 0.004 0.003 .

The maximal disagreement value D (6)18 = 0.008, (Table 26) ⎡ ⎤ 1 0.232 1.563 1.142 ⎢ 4.320 1 4.238 3.521 ⎥ ⎢ ⎥ A19 6 = ⎣ 0.640 0.236 1 0.452 ⎦ 0.876 0.284 2.215

1


Table 26 Final PCMs of twelve DMs of alternatives regarding cultural heritage 1 0.245 1.386 4.088 1 4.595 0.722 0.218 1 0.748 0.239 1.045 DM1, CR = 0.003 1 0.194 1.755 5.164 1 4.723 0.570 0.212 1 1.050 0.370 1.341 DM4, CR = 0.023 1 0.209 1.412 4.789 1 5.116 0.709 0.196 1 0.679 0.209 1.434 DM7, CR = 0.011 1 0.236 0.904 4.230 1 3.955 1.106 0.253 1 1.094 0.226 1.166 DM10, CR = 0.003

1.336 4.190 0.957 1 0.952 2.702 0.746 1 1.473 4.777 0.697 1 0.914 4.428 0.858 1

1 0.283 1.549 3.538 1 4.523 0.646 0.221 1 0.600 0.267 1.508 DM2, CR = 0.012 1 0.232 0.967 4.307 1 4.060 1.034 0.246 1 1.533 0.226 1.176 DM5, CR = 0.008 1 0.207 1.358 4.827 1 5.551 0.736 0.180 1 0.718 0.213 1.960 DM8, CR = 0.021 1 0.245 2.150 4.078 1 4.607 0.465 0.217 1 0.767 0.306 1.446 DM11, CR = 0.016

1.668 3.740 0.633 1 0.653 4.433 0.850 1 1.392 4.697 0.510 1 1.305 3.272 0.692 1

1 0.261 1.084 3.839 1 4.562 0.922 0.219 1 0.975 0.283 1.245 DM3, CR = 0.001 1 0.232 1.563 4.320 1 4.238 0.640 0.236 1 0.876 0.284 2.215 DM6, CR = 0.021 1 0.232 1.361 4.313 1 4.812 0.735 0.208 1 0.654 0.273 1.222 DM9, CR = 0.016 1 0.362 1.216 2.762 1 3.779 0.822 0.265 1 0.829 0.237 1.594 DM12, CR = 0.014

1.026 3.531 0.803 1 1.142 3.521 0.452 1 1.530 3.667 0.819 1 1.206 4.220 0.627 1

References

Akaa OU, Abu A, Spearpoint M, Giovinazzi S (2016) A group-AHP decision analysis for the selection of applied fire protection to steel structures. Fire Safety J 86:95–105. ISSN 0379-7112. https://doi.org/10.1016/j.firesaf.2016.10.005
Altuzarra A, Moreno-Jiménez JM, Salvador M (2007) A Bayesian priorization procedure for AHP-group decision making. Eur J Oper Res 182(1):367–382. ISSN 0377-2217. https://doi.org/10.1016/j.ejor.2006.07.025
Ananda J, Herath G (2008) Multi-attribute preference modelling and regional land-use planning. Ecological Economics 65(2):325–335. ISSN 0921-8009. https://doi.org/10.1016/j.ecolecon.2007.06.024
Ben-Arieh D, Easton T (2007) Multi-criteria group consensus under linear cost opinion elasticity. Decis Support Syst 43(3):713–721. ISSN 0167-9236. https://doi.org/10.1016/j.dss.2006.11.009
Bezdek JC, Spillman B, Spillman R (1978) A fuzzy relation space for group decision theory. Fuzzy Sets Syst 1(4):255–268. ISSN 0165-0114. https://doi.org/10.1016/0165-0114(78)90017-9
Chiclana F, Mata F, Martinez L, Herrera-Viedma E, Alonso S (2008) Integration of a consistency control module within a consensus model. Int J Uncertainty Fuzziness Knowledge Based Syst 16(supp01):35–53. ISSN 0218-4885
Cortés-Aldana FA, García-Melón M, Fernández-de-Lucio I, Aragonés-Beltrán P, Poveda-Bautista R (2009) University objectives and socioeconomic results: A multicriteria measuring of alignment. Eur J Oper Res 199(3):811–822. ISSN 0377-2217. https://doi.org/10.1016/j.ejor.2009.01.065


Dong Q, Cooper O (2016) A peer-to-peer dynamic adaptive consensus reaching model for the group AHP decision making. Eur J Oper Res 250(2):521–530. ISSN 0377-2217. https://doi.org/10.1016/j.ejor.2015.09.016
Dong Q, Saaty TL (2014) An analytic hierarchy process model of group consensus. J Syst Sci Syst Eng 23(3):362–374. ISSN 1861-9576. https://doi.org/10.1007/s11518-014-5247-8
Dong Q, Zhü K, Cooper O (2017) Gaining consensus in a moderated group: A model with a twofold feedback mechanism. Expert Syst Appl 71:87–97. ISSN 0957-4174. https://doi.org/10.1016/j.eswa.2016.11.020
Dong Y, Zhang G, Hong W-C, Xu Y (2010) Consensus models for AHP group decision making under row geometric mean prioritization method. Decis Support Syst 49(3):281–289. ISSN 0167-9236. https://doi.org/10.1016/j.dss.2010.03.003
Duke JM, Aull-Hyde R (2002) Identifying public preferences for land preservation using the analytic hierarchy process. Ecological Economics 42(1–2):131–145. ISSN 0921-8009. https://doi.org/10.1016/s0921-8009(02)00053-8
Dyer RF, Forman EH (1992) Group decision support with the analytic hierarchy process. Decis Support Syst 8(2):99–124. ISSN 0167-9236. https://doi.org/10.1016/0167-9236(92)90003-8
Forman E, Peniwati K (1998) Aggregating individual judgments and priorities with the analytic hierarchy process. Eur J Oper Res 108(1):165–169. ISSN 0377-2217. https://doi.org/10.1016/s0377-2217(97)00244-0
Grošelj P, Zadnik Stirn L (2012) Acceptable consistency of aggregated comparison matrices in analytic hierarchy process. Eur J Oper Res 223(2):417–420. ISSN 0377-2217
Grošelj P, Zadnik Stirn L (2015) The environmental management problem of Pohorje, Slovenia: A new group approach within ANP – SWOT framework. J Environ Manag 161:106–112. ISSN 0301-4797. https://doi.org/10.1016/j.jenvman.2015.06.038
Hartmann S, Martini C, Sprenger J (2009) Consensual decision-making among epistemic peers. Episteme 6(2):110–129
Herrera-Viedma E, Cabrerizo FJ, Kacprzyk J, Pedrycz W (2014) A review of soft consensus models in a fuzzy environment. Information Fusion 17:4–13. ISSN 1566-2535. https://doi.org/10.1016/j.inffus.2013.04.002
Kahraman C, Cebi S (2009) A new multi-attribute decision making method: Hierarchical fuzzy axiomatic design. Expert Syst Appl 36(3, Part 1):4848–4861. ISSN 0957-4174. https://doi.org/10.1016/j.eswa.2008.05.041
Kou G, Chao X, Peng Y, Xu L, Chen Y (2017) Intelligent collaborative support system for AHP-group decision making. Stud Inf Control 26(2):131–142. ISSN 1220-1766
Lee AHI, Chang H-J, Lin C-Y (2009) An evaluation model of buyer-supplier relationships in high-tech industry – the case of an electronic components manufacturer in Taiwan. Comput Ind Eng 57(4):1417–1430. ISSN 0360-8352. https://doi.org/10.1016/j.cie.2009.07.012
Lehrer K, Wagner C (1981) Rational consensus in science and society. Reidel, Dordrecht
Labella Á, Liu Y, Rodríguez RM, Martínez L (2018) Analyzing the performance of classical consensus models in large scale group decision making: A comparative study. Appl Soft Comput 67:677–690. ISSN 1568-4946. https://doi.org/10.1016/j.asoc.2017.05.045
Labella Á, Liu H, Rodríguez RM, Martínez L (2019) A cost consensus metric for consensus reaching processes based on a comprehensive minimum cost model. Eur J Oper Res 281(2):316–331. ISSN 0377-2217. https://doi.org/10.1016/j.ejor.2019.08.030
Palomares I, Estrella FJ, Martínez L, Herrera F (2014) Consensus under a fuzzy context: Taxonomy, analysis framework AFRYCA and experimental case of study. Information Fusion 20:252–271. ISSN 1566-2535. https://doi.org/10.1016/j.inffus.2014.03.002
Pedrycz W, Song M (2011) Analytic hierarchy process (AHP) in group decision making and its optimization with an allocation of information granularity. IEEE Trans Fuzzy Syst 19(3):527–539. ISSN 1063-6706
Regan HM, Colyvan M, Markovchick-Nicholls L (2006) A formal model for consensus and negotiation in environmental management. J Environ Manag 80(2):167–176. ISSN 0301-4797. https://doi.org/10.1016/j.jenvman.2005.09.004


Saaty TL (1994) A ratio scale metric and the compatibility of ratio scales: The possibility of Arrow's impossibility theorem. Appl Math Lett 7(6):51–57. ISSN 0893-9659. https://doi.org/10.1016/0893-9659(94)90093-0
Saaty TL (1980) The analytic hierarchy process. McGraw-Hill, New York
Saaty TL (2006) Fundamentals of decision making and priority theory with the analytic hierarchy process. RWS Publications, Pittsburgh
Srdjevic B, Srdjevic Z, Blagojevic B, Suvocarev K (2013) A two-phase algorithm for consensus building in AHP-group decision making. Appl Math Modell 37(10–11):6670–6682. ISSN 0307-904X. https://doi.org/10.1016/j.apm.2013.01.028
Steele K, Regan HM, Colyvan M, Burgman MA (2007) Right decisions or happy decision-makers? Soc Epistemol J Knowl Cult Pol 21(4):349–368. ISSN 0269-1728
Sun J, Li H (2009) Financial distress early warning based on group decision making. Comput Oper Res 36(3):885–906. ISSN 0305-0548. https://doi.org/10.1016/j.cor.2007.11.005
Wang Y-M, Chin K-S (2009) A new data envelopment analysis method for priority determination and group decision making in the analytic hierarchy process. Eur J Oper Res 195(1):239–250. ISSN 0377-2217. https://doi.org/10.1016/j.ejor.2008.01.049
Wibowo S, Deng H (2013) Consensus-based decision support for multicriteria group decision making. Comput Ind Eng 66(4):625–633. ISSN 0360-8352
Wu Z, Xu J (2012) A consistency and consensus based decision support model for group decision making with multiplicative preference relations. Decis Support Syst 52(3):757–767. ISSN 0167-9236. https://doi.org/10.1016/j.dss.2011.11.022
Yeh J-M, Kreng B, Lin C (2001) A consensus approach for synthesizing the elements of comparison matrix in the analytic hierarchy process. Int J Syst Sci 32(11):1353–1363. ISSN 0020-7721. https://doi.org/10.1080/00207720110052012

Coherency: From Outlier Detection to Reducing Comparisons in the ANP Supermatrix

Orrin Cooper and Idil Yavuz

Abstract The process of coming to a decision with the Analytic Network Process (ANP) has many similarities to the process followed by statistical approaches. Testing the data is crucial in both frameworks to avoid either using a biased model or obtaining inaccurate predictions. Cross validation is one example of a step used in regression to give an accurate measure of a model's predictive power. In the ANP there was no way to check the priority vectors in a Supermatrix for outliers or to perform any cross validation. The Linking Coherency Index is a way to perform cross validation at the level of the weighted Supermatrix and, in other words, to check for "Super-Consistency", that is, to identify outliers among the priority vectors in the weighted Supermatrix. Testing for coherency and updating incoherent priority vectors is an important step toward improving the accuracy of the decision model. After identifying incoherent priority vectors, decision makers can return to the decision making process and update the incoherent priority vector, or use a dynamic clustering algorithm to provide a suggested update using data from the linking estimates that were obtained when calculating the Linking Coherency Index. Coherent linking estimates can also be used to reduce the number of needed comparisons in an ANP model, which would likely increase its application. Coherency testing is a worthwhile and crucial step in the ANP and should become the norm anytime a decision will be made using the framework.

O. Cooper () Fogelman College of Business and Economics, University of Memphis, Memphis, TN, USA e-mail: [email protected] I. Yavuz Department of Statistics, School of Science, Dokuz Eylul University, Izmir, Turkey e-mail: [email protected] © Springer Nature Switzerland AG 2021 Y. I. Topcu et al. (eds.), Multiple Criteria Decision Making, Contributions to Management Science, https://doi.org/10.1007/978-3-030-52406-7_12



1 Introduction

The steps of obtaining a decision using the ANP can be summarized as collecting data (obtaining priorities), testing the data (checking consistency), building a model from the data (obtaining the Supermatrix), testing the model, and coming to a decision. This flow is very similar to the ones employed by many statistical approaches; but when practicing the ANP, there are a few crucial steps for which there is no equivalent, which can potentially result in biased decisions.

In statistics, when data is collected, before proceeding to modeling the data for the hypothesis of interest, an exhaustive data screening process is usually carried out to determine potential outliers or to figure out whether there are any patterns that can explain the occurrences of missing values. There is an immense body of literature dealing with such checks, and new approaches keep emerging with the rise of artificial intelligence and machine learning techniques (Hodge and Austin 2004; Domingues et al. 2018). Without these checks, it is inevitable that poor-quality data in the set, if not properly dealt with, would lead to biased models and, in the end, inaccurate predictions. Similar consequences will surely follow in the ANP if data checking is not done properly. Until recently, the principal data quality check considered by practitioners of the ANP was checking the consistency of a pairwise comparison matrix, which is crucial, but fails to identify whether there are any priority vectors that stand out from the others when the entire set of priorities is considered. Much like outlier detection in statistics, there needs to be an "outlier detection" check for ANP data, i.e. the entire set of priority vectors in the weighted Supermatrix.

An example can be given to emphasize how checking consistency is necessary but not sufficient. Assume a data set from a survey of 20 questions is filled out by 100 people. A statistician would want to screen this data for two things. First, by adding "check" questions to the survey, the statistician would make sure that a person taking the survey did not just guess but gave meaningful answers. This can be thought of like the consistency check for the priorities, as this check determines whether a person is self-consistent at the level of the pairwise comparison matrix. The statistician would then look at the complete data to determine whether a person's answering pattern is quite different from the rest, i.e. whether that person is an outlier or not. For the ANP, this would mean looking at all the priority vectors as a whole, the weighted Supermatrix, and determining whether any of them stand out. Until recently, there were no tests that would serve this very important purpose, but the newly developed coherency testing is proposed as one way to look at the entire set of priority vectors in a weighted Supermatrix.

Another crucial step that many statistical approaches benefit from is model checking. Whether the method considered is a regression, time-series, or clustering model, before proceeding to conclusions or predictions the model's fit needs to be tested. Cross validation, with all its variations, has served this purpose for a long time, and validation has proven to be even more important recently with the rise of machine learning methods on big data. An example can be given by assuming


a regression setting. The data set at hand can be split into k parts, k different models can be built by excluding each one of the k parts in turn, and the predictive performance of the model built on the remaining (k − 1) parts can be tested on the data that was excluded when building the model. Averaging the performance metric values from the k runs can give a very accurate measure of the model's predictive power when new data arrives; a small code sketch of this procedure is given at the end of this section. The whole purpose of cross validating is to ensure that the statistical models can be counted on before actually using them for making predictions and drawing conclusions. Such a step is currently missing in the ANP literature. The Supermatrix can be thought of as the model that will be used for decision making, but there are no tests that can provide insight into how the model will perform. Coherency testing can serve this purpose, as a weighted Supermatrix labeled as coherent will more likely lead to reliable decisions.

Even though making a trustworthy decision is the main goal when using an ANP model, one cannot overlook the difficulties encountered when building the model. For this reason, while keeping in mind the goal of collecting high quality data and obtaining accurate decisions, reducing the number of necessary comparisons would be very valuable to practitioners. Coherency testing can also lead to a reduction of the needed comparisons in an ANP decision model, as explained in Sect. 5.

To detect outliers among the priority vectors in a weighted Supermatrix, to perform cross validation, and to be able to reduce the number of needed comparisons, coherency testing is a worthwhile, beneficial, and crucial step in the ANP and should become the norm anytime a decision will be made using the framework.
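For readers less familiar with the procedure, the k-fold scheme sketched above can be written down in a few lines. The fit/predict pair below is a generic placeholder for whatever regression model is being validated, and the ordinary least squares example at the end is only illustrative.

```python
import numpy as np

def k_fold_cv(X, y, fit, predict, k=5, seed=0):
    """Generic k-fold cross validation: average out-of-sample error over the k held-out parts."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)      # split the row indices into k parts
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])                      # build the model on the remaining k-1 parts
        errors.append(np.mean((predict(model, X[test]) - y[test]) ** 2))
    return np.mean(errors)                                   # estimate of predictive power on new data

# Illustrative model: ordinary least squares via numpy
fit_ols = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
predict_ols = lambda beta, X: X @ beta
```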

2 Literature Review

The ANP is a popular multi-criteria decision making method used to evaluate complex decisions (Saaty 1980, 1977; Chen et al. 2019). The following references serve as great sources for the theory, axioms, and applications of the ANP (Saaty 1980, 1994, 2005; Lee et al. 1996). Consistency is an important way to test the pairwise comparisons to increase the likelihood of making a good decision. If the weighted Supermatrix could be represented as a large pairwise comparison matrix (PCM), then coherency could be thought of, in a simple sense, as testing the consistency of the weighted Supermatrix in order to perform cross validation and test for outlying priority vectors. One crucial difference between a PCM and the weighted Supermatrix is the unit, or units, of measurement in each respective matrix. Understanding the units of measurement in a Supermatrix will not only initially present a potential challenge in measuring some form of consistency at the level of the Supermatrix, but, when this understanding is integrated with the concept of linking pins, it will also provide clarity as to how it is possible to calculate linking estimates (LE). The linking estimates are pivotal to calculating coherency indices and can also be used in a clustering technique to identify incoherent priority vectors in a weighted

320

O. Cooper and I. Yavuz

Supermatrix. The literature regarding consistency, the units of measurement, linking pins, and clustering is explored in greater detail in this section.

2.1 Consistency and Coherency

While consistency tests and coherency tests share much in common, they are also distinct tests that should be used simultaneously in ANP decisions, since they provide outlier detection at different levels, as explained in the Introduction. Beginning with the Analytic Hierarchy Process (AHP), consistency is a way to use the relationships in a pairwise comparison matrix as one measure of how consistent a decision maker was in the judgments that were provided (Saaty 1980; Vargas 1982; Kwiesielewicz and Van Uden 2004). Similarly, coherency can be defined as self-consistent and non-contradictory with respect to a particular system (Hastings and Gross 2012; VIM 2004), where, instead of a single PCM, the "system" is an aggregation of PCMs within a weighted Supermatrix. Coherency is a form of model testing and cross validation that looks at all the priority vectors as a whole.

In a PCM, comparisons are made among elements with respect to a single unit of measurement (Wedley and Choo 2011). For example, if one were to say that an element A contains more of criterion Z than element B, and that element B contains more of criterion Z than element C, then a consistent response would be that element A contains more of criterion Z than element C. One might even go further and test not only the direction but also the strength of the relationship (Kou et al. 2016). This oversimplified example overlooks more precise definitions of consistency, like ordinal and cardinal consistency (Kou et al. 2016), because it is the relationships that will help us understand coherency. In a similar way that the relationship between A and B, and between B and C, could be used to infer an expected relationship between A and C in a PCM, coherency uses the information aggregated from the PCMs across a weighted Supermatrix, an entire network, to measure consistency and infer values and relationships within a given weighted Supermatrix. Consistency and coherency can be directly calculated from the PCM and the weighted Supermatrix, respectively, and do not require access to any other data.

With both consistency and coherency, the elicited relationships are not expected to be perfect; in fact, much as with legal testimony, perfect consistency or coherency across multiple priority vectors would be a possible sign of a fabricated story or model. At the same time, 0.1 has been established as a reasonable upper bound for levels of inconsistency (Saaty 1994) and 1.1 is a reasonable upper bound for coherent data (Cooper and Yavuz 2016). Chen et al. (2015) address incomplete PCMs and suggest that the literature can be organized into two categories: improving inconsistencies and estimating missing judgments. A connecting path method is shown to be able to address both issues within a PCM. Coherency can also both improve incoherencies and estimate missing judgments in a weighted Supermatrix.
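For reference, the consistency check mentioned above, Saaty's consistency ratio for a single PCM with its conventional 0.1 upper bound, can be computed as in the following sketch; the numerical judgments in the example matrix are invented for illustration.

```python
import numpy as np

# Saaty's random index (RI) for matrix orders 3..9
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def consistency_ratio(A):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1); values below 0.1 are conventionally acceptable."""
    n = A.shape[0]
    lam_max = np.max(np.real(np.linalg.eigvals(A)))
    ci = (lam_max - n) / (n - 1)
    return ci / RI[n]

# A consistent direction with nearly consistent strengths: A > B > C with respect to criterion Z
A = np.array([[1.0, 2.0, 5.0],
              [0.5, 1.0, 3.0],
              [0.2, 1/3, 1.0]])
print(consistency_ratio(A))   # roughly 0.003, well below the 0.1 bound
```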


2.2 Units of Measurement

Recognizing and understanding the units of measurement in priority vectors and Supermatrices is essential to perform model testing and identify outliers. First, it is important to recognize that ratio scales are not unitless, but actually have a unit of measurement that must be taken into consideration. The unit of measurement is both important and useful; but it is also frequently overlooked or misunderstood (Wedley and Choo 2001, 2011). An unweighted Supermatrix will commonly have many priority vectors that each sum to unity, even within the same column. Second, by considering the unit of measurement and recognizing that a 1 in one column does not necessarily equal a 1 in the same or in another column (Wedley 2013; Zahir 2007), we acknowledge that each ratio scale/priority vector most likely has a different unit of measurement. One exception would be when a rating scale is used multiple times. Even when alternatives are compared to criteria in the same cluster, unless the alternatives are identical, a 1 from one priority vector will not equal a 1 in another priority vector. To an outsider, an unweighted Supermatrix might look like an almost random conglomeration of different units of measurement. Herein lie two potential challenges: (1) how can the priority vectors in different units be combined in the Supermatrix? And, once they are in a weighted Supermatrix, (2) how can the priority vectors that are in different units be compared together if a 1 in one column does not necessarily equal a 1 in the same or in another column?

First, pairwise comparisons are commonly used to compare and weight the priority vectors from different clusters in order to obtain a column stochastic Supermatrix (Saaty 2005). One way this can be done is by applying criteria cluster weights to all of the priority vectors within a given cluster (Saaty 2005). However, one should not assume that a priority vector from an alternative in a given criteria cluster will have a contribution equal to the priority vector of another alternative from the same criteria cluster, or that a priority vector for any other element in a cluster will have an equal contribution to any other element in the same cluster. This relationship usually applies both with respect to providing the same contribution to the entire system and with respect to the other priority vectors in each respective column. Comparing the priority vectors within each column individually, and not in aggregate by cluster, will lead to a fully-dependent ANP model (Cooper and Liu 2017), which is also likely to be more coherent than a semi-dependent ANP model. Understanding and using the units of measurement is also crucial to being able to calculate the coherency. Finally, Wedley and Choo (2011) explain that, when comparing the resulting priority vectors, focusing on the ratios rather than solely the rank will improve the efficacy of the ANP. Wedley and Choo (2011) conclude "Therein lie both the advantage and dilemma of AHP. We do not need explicit knowledge of the underlying unit of measure to derive a ratio scale, yet the derived scale has a unit."
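A minimal sketch of the weighting step described above, scaling each cluster's block of priority vectors by the cluster weights so that every column of the Supermatrix becomes stochastic, is given below. The two-cluster block structure, the numbers, and the final column renormalization are illustrative assumptions rather than the exact procedure of any particular ANP implementation.

```python
import numpy as np

def weight_supermatrix(unweighted, cluster_slices, cluster_weights):
    """Scale each cluster's block of rows by its cluster weight, then renormalize columns to sum to 1."""
    W = unweighted.copy()
    for rows, cw in zip(cluster_slices, cluster_weights):
        W[rows, :] *= cw
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0           # leave empty columns untouched
    return W / col_sums

# Illustrative network: cluster 1 = two criteria (rows 0-1), cluster 2 = two alternatives (rows 2-3)
U = np.array([[0.0, 0.0, 0.6, 0.3],
              [0.0, 0.0, 0.4, 0.7],
              [0.5, 0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0, 0.0]])
W = weight_supermatrix(U, [slice(0, 2), slice(2, 4)], cluster_weights=[0.5, 0.5])
print(W.sum(axis=0))   # every column of the weighted Supermatrix now sums to 1
```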


2.3 Linking Pins

After weighting the priority vectors within each column of the unweighted Supermatrix, each column is essentially converted into a single ratio vector that sums to unity within the weighted Supermatrix. Each ratio vector, while summing to unity, is in a unit of measurement that is still generally unique to each respective column. If a relationship could be identified between an element in one column of the weighted Supermatrix and the element for the same criterion or alternative in another column, then in essence one would have identified the relationship between a 1 in one column and a 1 in the other column, and the elements in each column could then be linked. The linking pin method proposed by Schoner et al. (1993) is a normalization process that takes advantage of the structural dependence that exists in the ANP Supermatrix. The alternatives and the criteria in an ANP network are structurally dependent on each other. The structural dependence within the Supermatrix is a result of the relationships that exist among the elements within the Supermatrix and provides the information needed to convert all of the entries within the weighted Supermatrix into ratios of a single unit of measurement. This conversion is the essential step that leads to outlier detection and cross validation of a Supermatrix. As linking pins are used across vectors of different units, it is important to recognize that the conversions cannot be arbitrary, because they are dependent on the specific units involved in each column of the weighted Supermatrix. When converted correctly, linking estimates can then be obtained to calculate a coherency index. Refer to Sect. 3.3 for more details about calculating the coherency index.

2.4 Cluster Analysis

Cluster analysis is a collection of techniques used for detecting hidden groups or patterns in a data set, and it has been widely applied in many fields including machine learning, pattern recognition, information retrieval, bioinformatics, and decision science. For example, Kou and Lou (2012) propose a hierarchical clustering method that combines multiple factors to identify clusters of web pages and help users find relevant web pages faster. In the literature, clustering and AHP have been integrated to improve decision making. Liu and Shih (2005) integrate data mining, group AHP decision making, and clustering to provide product recommendations for customers. Ho and Hung (2008) use surveys and the AHP to develop effective marketing strategies for school selection. There is a large body of research surrounding clustering methods and algorithms. Jain (2010) and Popat and Emmanuel (2014) provide reviews of clustering and many of the most well-known clustering methods. Clustering has been used extensively in group decision making (Liu et al. 2016; Zhang et al. 2018; Liu et al. 2014). By building on the concept of clustering, defining a new distance measure, and using a predetermined minimum number of


elements, one can use the information in the weighted Supermatrix and a cluster of "good" priority vectors to identify and suggest updates for incoherent priority vectors (Yavuz and Cooper 2017). This is yet another contribution of coherency testing, since it can naturally lead to a reduction in the number of comparisons needed when building an ANP model.

3 Linking Coherency Index

The initial idea behind the Linking Coherency Index (LCI) originated from modeling a Supermatrix in variable form and recognizing how different columns could be used to convert the values in one part of the weighted Supermatrix into the values from another part of the Supermatrix. As Schoner et al. (1993) explained, there is structural dependence within an ANP weighted Supermatrix. The ability to use this structural dependence allows a decision maker to obtain "k-parts" with which to perform a type of cross validation in an ANP decision model. The relationships and structural dependence can be identified in an example. For the sake of simplicity, assume you are using the ANP to make what turns out to be a very boring decision. Nonetheless, it will serve to illustrate the structural dependence in an ANP weighted Supermatrix that can be used to test for coherency and reduce the number of needed comparisons in an ANP decision. Assume you are choosing between two alternatives which will be evaluated against two criteria. You first compare the two alternatives to each other with respect to how much they influence each of the criteria; in your opinion the two alternatives equally influence the first criterion (.5, .5) and also equally influence the second criterion (.5, .5). Next, you compare the influence of the first criterion on the two alternatives and, in your opinion, the influence again is equal (.5, .5). These priority vectors are entered into the Supermatrix in (1).

$$
\begin{array}{c|cccc}
 & C_1 & C_2 & A_1 & A_2 \\ \hline
C_1 & 0 & 0 & 0.5 & ? \\
C_2 & 0 & 0 & 0.5 & ? \\
A_1 & 0.5 & 0.5 & 0 & 0 \\
A_2 & 0.5 & 0.5 & 0 & 0
\end{array}
\qquad (1)
$$

It is important to highlight at this point that there is nothing wrong with the comparisons made so far; likewise, another individual could have assigned different priorities for the criteria and alternatives in this problem. It is also not necessary that the criteria be tangible/measurable, or that we have access to some "true" underlying values, to move forward. If the weighted Supermatrix were a pairwise comparison matrix where we had said a = b and b = c, then to be self-consistent we would also assume that a = c. Any other value would be an outlier to some degree or another. In a similar way, from the cross validation, we can infer


what the priorities should be if we are to be self-consistent at the level of the entire Supermatrix, or in other words, "coherent." This form of model checking through cross validation will allow us to develop a measure of coherency and to provide predictions of coherent priority vectors to update incoherent priority vectors or reduce the number of needed comparisons in an ANP Supermatrix. Because the two alternatives were determined to influence each of the criteria equally, and the influence of the first criterion between the two alternatives is also equal, then, in order to be coherent, or consistent at the level of the Supermatrix, the second criterion must also equally influence each alternative. The coherent priority vector for A2 should be (.5, .5). With only two elements in each cluster, by default all of the pairwise comparisons will be perfectly consistent. This is important to recognize for two reasons: (1) In practice, the comparisons will seldom be both perfectly consistent and coherent. Just as there are acceptable levels of inconsistency, there are also acceptable levels of incoherency. (2) Any priority vector other than (.5, .5) for A2 would have been perfectly consistent at the level of the PCM, but the weighted Supermatrix would not be coherent. Therefore, we can see that consistency is a necessary but not a sufficient condition in a valid ANP decision. In the survey example given in the Introduction, this would be like adding the check questions to the survey but not checking whether the individual was an outlier. Current tests alone would not have identified this "inconsistency" at the level of the weighted Supermatrix. Without the concept of coherency there would have been no way to identify an incoherent priority vector in this example. Just because it was determined that each alternative equally influences each criterion, that in and of itself is insufficient to determine that the alternatives are equal, because, as Zahir (2007) explained, a 1 here does not necessarily equal a 1 there. However, when, through the structural dependence in the ANP Supermatrix, the ratio of the first criterion's contribution was determined to be equal among the two alternatives, we could see through the other relationships that the priority vector for A2 should be (.5, .5) in order to be coherent at the level of the Supermatrix. In larger and more realistic decisions the logic is the same, and the coherency of a Supermatrix can be calculated. Now, with a basic idea of what coherency is and how it works, the motivation to test for coherency will be provided through two examples and generalized with the results from large scale simulations. In order to make a meaningful comparison in the examples and simulations, in each case we begin with a representative weighted Supermatrix that we assume is the correct or intended weighted Supermatrix. From the intended Supermatrix, errors that decision makers may commit are introduced at the level of the pairwise comparisons and reflected in another weighted Supermatrix. The resulting priority vector from each Supermatrix is obtained from the respective limit matrix. Like Wedley and Choo (2011), we will focus on both the final priority vectors and the rank when comparing the results. This will provide a method to demonstrate the value of testing for and improving the coherency of a weighted Supermatrix. Then the formulas to calculate the LCI of a weighted Supermatrix to test for coherency will be provided.
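Since the limit matrix is used throughout the examples that follow, a minimal sketch of one common way to compute it is given below. The chapter does not prescribe an implementation, so Python/NumPy and the averaging of two consecutive powers (to cope with the cyclic two-cluster structure of these Supermatrices) are our assumptions.

import numpy as np

def limit_priorities(weighted, max_iter=500, tol=1e-10):
    """Average two consecutive powers of a column-stochastic weighted Supermatrix
    until that average stabilizes, then return one column of the limit."""
    S = np.asarray(weighted, dtype=float)
    power = S.copy()
    avg = (power + power @ S) / 2.0
    for _ in range(max_iter):
        power = power @ S
        new_avg = (power + power @ S) / 2.0
        if np.max(np.abs(new_avg - avg)) < tol:
            break
        avg = new_avg
    return avg[:, 0]

# The small example in (1), completed with the coherent (.5, .5) vector for A2:
S1 = np.array([[0.0, 0.0, 0.5, 0.5],
               [0.0, 0.0, 0.5, 0.5],
               [0.5, 0.5, 0.0, 0.0],
               [0.5, 0.5, 0.0, 0.0]])
p = limit_priorities(S1)
alternatives = p[2:] / p[2:].sum()   # limiting priorities of A1 and A2 -> [0.5, 0.5]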


3.1 Motivating Examples

It is useful to look at two examples of larger and more realistic weighted Supermatrices to recognize the need to test for coherency. In the first example, (2), the pairwise comparisons were perfectly consistent. The only "error" in this decision is that the decision maker confused the order of the priorities in C1.3. This might happen when comparing a criterion like cost, which, if not placed in its own network, should be compared differently; in other words, the most expensive alternative should receive the smallest priority. Regardless of whether the "error" resulted from that type of mistake, a bias, or something else, the most important point is that even with one single incoherent priority vector the final ranking changed to the point where the best alternative would not be chosen. Table 1 shows the final priority vectors for the intended weighted Supermatrix and for the weighted Supermatrix in (2): what should have been the third most preferred alternative became the most preferred alternative. Without checking the coherency of the weighted Supermatrix, this decision maker would have chosen the third most preferred alternative. As the priorities were rounded to three decimal places, they may not exactly sum to unity; we hope this will not confuse the reader.

$$
\begin{array}{c|cccccccccc}
 & C_{1.1} & C_{1.2} & C_{1.3} & C_{2.1} & C_{2.2} & C_{2.3} & A_1 & A_2 & A_3 & A_4 \\ \hline
C_{1.1} & 0 & 0 & 0 & 0 & 0 & 0 & 0.204 & 0.203 & 0.092 & 0.223 \\
C_{1.2} & 0 & 0 & 0 & 0 & 0 & 0 & 0.253 & 0.197 & 0.323 & 0.063 \\
C_{1.3} & 0 & 0 & 0 & 0 & 0 & 0 & 0.037 & 0.198 & 0.116 & 0.216 \\
C_{2.1} & 0 & 0 & 0 & 0 & 0 & 0 & 0.100 & 0.151 & 0.283 & 0.162 \\
C_{2.2} & 0 & 0 & 0 & 0 & 0 & 0 & 0.094 & 0.119 & 0.120 & 0.244 \\
C_{2.3} & 0 & 0 & 0 & 0 & 0 & 0 & 0.312 & 0.131 & 0.066 & 0.090 \\
A_1 & 0.241 & 0.281 & 0.386 & 0.131 & 0.141 & 0.467 & 0 & 0 & 0 & 0 \\
A_2 & 0.338 & 0.309 & 0.054 & 0.278 & 0.251 & 0.278 & 0 & 0 & 0 & 0 \\
A_3 & 0.098 & 0.323 & 0.408 & 0.332 & 0.162 & 0.089 & 0 & 0 & 0 & 0 \\
A_4 & 0.323 & 0.087 & 0.152 & 0.259 & 0.447 & 0.166 & 0 & 0 & 0 & 0
\end{array}
\qquad (2)
$$

Table 1 Single incoherent priority vector final results

Alternatives   Intended priority   Rank   Limiting priority   Rank
A1             .220                3      .271                1
A2             .311                1      .260                2
A3             .199                4      .236                3
A4             .270                2      .232                4

The second example, (3), represents another decision where there is no systematic bias or mistake in how the pairwise comparison questions were answered. The consistency index of each pairwise comparison matrix was below .03. However, there is some error in all the pairwise comparisons, as could be expected in a real decision. And once again the best alternative is ranked third (Table 2).


Table 2 Common weighted Supermatrix final results

Alternatives   Intended priority   Rank   Limiting priority   Rank
A1             .277                2      .280                1
A2             .224                3      .271                2
A3             .190                4      .184                4
A4             .310                1      .266                3

Just as in statistics, where data screening, "check" questions, and cross validation are used to improve the likelihood of making a good decision, the same should be done in ANP models. In this case the decision maker would choose the second most preferred alternative as the best alternative if he does not check for outliers among the priority vectors at the level of the weighted Supermatrix, that is, by checking the coherency of the weighted Supermatrix and updating any incoherent priority vectors.

$$
\begin{array}{c|cccccccccc}
 & C_{1.1} & C_{1.2} & C_{1.3} & C_{2.1} & C_{2.2} & C_{2.3} & A_1 & A_2 & A_3 & A_4 \\ \hline
C_{1.1} & 0 & 0 & 0 & 0 & 0 & 0 & 0.216 & 0.240 & 0.177 & 0.167 \\
C_{1.2} & 0 & 0 & 0 & 0 & 0 & 0 & 0.199 & 0.225 & 0.112 & 0.242 \\
C_{1.3} & 0 & 0 & 0 & 0 & 0 & 0 & 0.169 & 0.080 & 0.145 & 0.122 \\
C_{2.1} & 0 & 0 & 0 & 0 & 0 & 0 & 0.122 & 0.314 & 0.206 & 0.145 \\
C_{2.2} & 0 & 0 & 0 & 0 & 0 & 0 & 0.148 & 0.071 & 0.319 & 0.136 \\
C_{2.3} & 0 & 0 & 0 & 0 & 0 & 0 & 0.146 & 0.069 & 0.041 & 0.188 \\
A_1 & 0.303 & 0.301 & 0.396 & 0.156 & 0.257 & 0.312 & 0 & 0 & 0 & 0 \\
A_2 & 0.355 & 0.380 & 0.136 & 0.287 & 0.120 & 0.255 & 0 & 0 & 0 & 0 \\
A_3 & 0.149 & 0.070 & 0.094 & 0.283 & 0.406 & 0.077 & 0 & 0 & 0 & 0 \\
A_4 & 0.194 & 0.249 & 0.375 & 0.274 & 0.216 & 0.357 & 0 & 0 & 0 & 0
\end{array}
\qquad (3)
$$

3.2 Simulation Results

Specific examples can serve important purposes, but they also lead to the question of how often these kinds of problems will arise in general. In addition, is it really worth borrowing these assumptions from statistics to check for outlying priority vectors and performing cross validation at the level of the Supermatrix? To answer these questions, 100,000 decision models were simulated under various assumptions. The number of alternatives (Alts), the number of criteria (Crit), the consistency ratio limit (CR), and the maximum allowed perturbation in the pairwise comparisons were varied. Table 3 contains the results from these simulations. In terms of the maximum allowed perturbation, with a 0.3 perturbation index the pairwise comparisons will on average be off by 0.75 units and at most 2.7 units when using a continuous distribution on the standard 1–9 scale. The most important column in this table is the Error rate.

Table 3 Simulation results

# Alts   # Crit   CR limit   Perturbation index   Error rate
3        8        0.05       0.40                 19.2%
3        18       0.05       0.40                 14.2%
4        12       0.03       0.40                 17.7%
4        12       0.05       0.40                 17.2%
4        12       0.10       0.40                 16.4%
4        12       0.03       0.50                 29.1%
4        12       0.05       0.50                 28.8%
4        12       0.10       0.50                 27.5%
5        6        0.05       0.40                 23.7%
5        9        0.05       0.40                 17.7%
5        15       0.05       0.40                 10.0%

The Error rate is the percentage of times when the ratios of the limiting priority vectors changed enough that the rank of at least two of the alternatives changed. The original priorities were required to differ by at least .02, so that the change in priorities was large enough for the change in rankings to be meaningful. This type of change in the priorities would lead to choosing less preferred alternatives as the most preferred alternative. Clearly, from Table 3, the Supermatrices that were simulated using consistent PCMs led to inaccurate decisions quite often when the coherency of the weighted Supermatrix was not checked. Therefore, adapting outlier testing and cross validation from statistics, by testing weighted Supermatrices for coherency and updating incoherent priority vectors, can greatly reduce the frequency of these errors.
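To convey the flavor of such an experiment without reproducing the authors' design, the deliberately simplified sketch below perturbs the priority vectors of a weighted Supermatrix directly (rather than the underlying pairwise comparisons, as the chapter's simulations do) and counts how often the alternative ranking changes. The noise model, function names, and parameter values are illustrative assumptions; the figures in Table 3 come from the authors' own, more careful design.

import numpy as np

rng = np.random.default_rng(7)

def limiting_alternative_priorities(S, n_alt, steps=200):
    """Average two consecutive high powers of the column-stochastic Supermatrix
    (to cope with its cyclic two-cluster structure) and read off the alternatives."""
    P = np.linalg.matrix_power(S, steps)
    column = (P + P @ S)[:, 0] / 2.0
    alt = column[-n_alt:]
    return alt / alt.sum()

def perturb(S, sd=0.15):
    """Multiply the nonzero entries by lognormal noise and renormalize each column,
    a crude stand-in for perturbing the underlying pairwise comparisons."""
    noisy = S * np.exp(rng.normal(0.0, sd, S.shape)) * (S > 0)
    return noisy / noisy.sum(axis=0, keepdims=True)

def rank_error_rate(S, n_alt, trials=2000, sd=0.15):
    """Share of perturbed models whose alternative ranking differs from the intended one."""
    base = np.argsort(-limiting_alternative_priorities(S, n_alt)).tolist()
    flips = 0
    for _ in range(trials):
        rank = np.argsort(-limiting_alternative_priorities(perturb(S, sd), n_alt)).tolist()
        flips += rank != base
    return flips / trials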

3.3 Calculating the Linking Coherency Index (LCI)

Assume the weighted Supermatrix is

$$
\hat{S} =
\begin{array}{c|ccccccc}
 & C_{1.1} & C_{1.2} & \cdots & C_{2.3} & A_1 & A_2 & A_3 \\ \hline
C_{1.1} & 0 & 0 & \cdots & 0 & x_{A_1,1.1} & x_{A_2,1.1} & x_{A_3,1.1} \\
C_{1.2} & 0 & 0 & \cdots & 0 & x_{A_1,1.2} & x_{A_2,1.2} & x_{A_3,1.2} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\
C_{2.3} & 0 & 0 & \cdots & 0 & x_{A_1,2.3} & x_{A_2,2.3} & x_{A_3,2.3} \\
A_1 & x_{C_{1.1},1} & x_{C_{1.2},1} & \cdots & x_{C_{2.3},1} & 0 & 0 & 0 \\
A_2 & x_{C_{1.1},2} & x_{C_{1.2},2} & \cdots & x_{C_{2.3},2} & 0 & 0 & 0 \\
A_3 & x_{C_{1.1},3} & x_{C_{1.2},3} & \cdots & x_{C_{2.3},3} & 0 & 0 & 0
\end{array}
\qquad (4)
$$


To look at the entire set of priority vectors in the weighted Supermatrix, every column vector in the original weighted Supermatrix is converted to a new ratio vector by dividing the components of each column vector by a common, nonzero, element within its respective column vector. In each column the resulting ratio vector now reflects each element's contribution as a ratio of the contribution of the common element in each column. The first element in each column vector will be used herein as the common component. The resulting matrix will now have the form:

$$
\begin{array}{c|ccccccc}
 & C_{1.1} & C_{1.2} & \cdots & C_{2.3} & A_1 & A_2 & A_3 \\ \hline
C_{1.1} & 0 & 0 & \cdots & 0 & \frac{x_{A_1,1.1}}{x_{A_1,1.1}}=1 & 1 & 1 \\
C_{1.2} & 0 & 0 & \cdots & 0 & \frac{x_{A_1,1.2}}{x_{A_1,1.1}} & \frac{x_{A_2,1.2}}{x_{A_2,1.1}} & \frac{x_{A_3,1.2}}{x_{A_3,1.1}} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\
C_{2.3} & 0 & 0 & \cdots & 0 & \frac{x_{A_1,2.3}}{x_{A_1,1.1}} & \frac{x_{A_2,2.3}}{x_{A_2,1.1}} & \frac{x_{A_3,2.3}}{x_{A_3,1.1}} \\
A_1 & \frac{x_{C_{1.1},1}}{x_{C_{1.1},1}}=1 & 1 & \cdots & 1 & 0 & 0 & 0 \\
A_2 & \frac{x_{C_{1.1},2}}{x_{C_{1.1},1}} & \frac{x_{C_{1.2},2}}{x_{C_{1.2},1}} & \cdots & \frac{x_{C_{2.3},2}}{x_{C_{2.3},1}} & 0 & 0 & 0 \\
A_3 & \frac{x_{C_{1.1},3}}{x_{C_{1.1},1}} & \frac{x_{C_{1.2},3}}{x_{C_{1.2},1}} & \cdots & \frac{x_{C_{2.3},3}}{x_{C_{2.3},1}} & 0 & 0 & 0
\end{array}
\qquad (5)
$$

Since a 1 in A1 is most likely not equal to a 1 in A2, the challenge becomes how to use the structural dependence in a weighted Supermatrix to determine the relationship between the common elements in each column. The entries from one column in the lower left-hand side of the Supermatrix in (5) are the values of the relationships between the common elements in each column; they are used as the links to convert the entries in the upper right-hand side of the Supermatrix into ratios of a single unit, and the converted upper right-hand side will have the form:

$$
\begin{array}{c|ccc}
 & A_1 & A_2 & A_3 \\ \hline
C_{1.1} & \frac{x_{A_1,1.1}}{x_{A_1,1.1}} \cdot \frac{x_{C_{1.1},1}}{x_{C_{1.1},1}} = T_{1.1,1} & \frac{x_{A_2,1.1}}{x_{A_2,1.1}} \cdot \frac{x_{C_{1.1},2}}{x_{C_{1.1},1}} = T_{1.1,2} & \frac{x_{A_3,1.1}}{x_{A_3,1.1}} \cdot \frac{x_{C_{1.1},3}}{x_{C_{1.1},1}} = T_{1.1,3} \\
C_{1.2} & \frac{x_{A_1,1.2}}{x_{A_1,1.1}} \cdot \frac{x_{C_{1.1},1}}{x_{C_{1.1},1}} = T_{1.2,1} & \frac{x_{A_2,1.2}}{x_{A_2,1.1}} \cdot \frac{x_{C_{1.1},2}}{x_{C_{1.1},1}} = T_{1.2,2} & \frac{x_{A_3,1.2}}{x_{A_3,1.1}} \cdot \frac{x_{C_{1.1},3}}{x_{C_{1.1},1}} = T_{1.2,3} \\
\vdots & \vdots & \vdots & \vdots \\
C_{2.3} & \frac{x_{A_1,2.3}}{x_{A_1,1.1}} \cdot \frac{x_{C_{1.1},1}}{x_{C_{1.1},1}} = T_{2.3,1} & \frac{x_{A_2,2.3}}{x_{A_2,1.1}} \cdot \frac{x_{C_{1.1},2}}{x_{C_{1.1},1}} = T_{2.3,2} & \frac{x_{A_3,2.3}}{x_{A_3,1.1}} \cdot \frac{x_{C_{1.1},3}}{x_{C_{1.1},1}} = T_{2.3,3}
\end{array}
\qquad (6)
$$

With each entry now represented in the units of a particular, yet identical, ratio as in (6), they can be aggregated and combined to obtain a new estimate of $\hat{S}$, which we call a linking estimate (LE). This new Supermatrix will be notated by $S^{L}_{C_{1.1}}$ since


the criterion $C_{1.1}$ was used as the link. This estimate can be obtained by performing the following calculations:

$$
S^{L}_{C_{1.1}} =
\begin{array}{c|ccccccc}
 & C_{1.1} & C_{1.2} & \cdots & C_{2.3} & A_1 & A_2 & A_3 \\ \hline
C_{1.1} & 0 & 0 & \cdots & 0 & \frac{T_{1.1,1}}{T_{..,1}} & \frac{T_{1.1,2}}{T_{..,2}} & \frac{T_{1.1,3}}{T_{..,3}} \\
C_{1.2} & 0 & 0 & \cdots & 0 & \frac{T_{1.2,1}}{T_{..,1}} & \frac{T_{1.2,2}}{T_{..,2}} & \frac{T_{1.2,3}}{T_{..,3}} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\
C_{2.3} & 0 & 0 & \cdots & 0 & \frac{T_{2.3,1}}{T_{..,1}} & \frac{T_{2.3,2}}{T_{..,2}} & \frac{T_{2.3,3}}{T_{..,3}} \\
A_1 & \frac{T_{1.1,1}}{T_{1.1,.}} & \frac{T_{1.2,1}}{T_{1.2,.}} & \cdots & \frac{T_{2.3,1}}{T_{2.3,.}} & 0 & 0 & 0 \\
A_2 & \frac{T_{1.1,2}}{T_{1.1,.}} & \frac{T_{1.2,2}}{T_{1.2,.}} & \cdots & \frac{T_{2.3,2}}{T_{2.3,.}} & 0 & 0 & 0 \\
A_3 & \frac{T_{1.1,3}}{T_{1.1,.}} & \frac{T_{1.2,3}}{T_{1.2,.}} & \cdots & \frac{T_{2.3,3}}{T_{2.3,.}} & 0 & 0 & 0
\end{array}
\qquad (7)
$$

where $T_{i.j,.} = \sum_{n=1}^{3} T_{i.j,n}$ and $T_{..,n} = \sum_{i=1}^{2} \sum_{j=1}^{3} T_{i.j,n}$. The same process can be repeated $n + m$ times, where $n$ is the number of alternatives and $m$ is the number of criteria; a different criterion or alternative is chosen as the link each time, resulting in $n + m$ linking estimates (LEs). The $n + m$ linking estimates can be related to the $k$ regression models resulting from a $k$-fold cross validation performed on a data set to measure the predictive power of a regression model. If every column vector had been measured without any error, then the LEs would all be identical. In practice, however, each priority vector will have some bias and imprecision. Hence the LEs will not be equal. When a priority vector, i.e., a column in $\hat{S}$, is used as the link to create a linking estimate, the more accurate it is, the more the estimate will look like $S^{L}_{*}$; the less accurate it is, the less it will look like $S^{L}_{*}$, and hence it can be identified as a potential outlier. The Linking Coherency Index (LCI) will serve as a measure of how coherent the weighted Supermatrix is. The idea is to compare the LEs to each other and determine which LE is the most different from the others. This tells us that the column of $\hat{S}$ that was used as the link to obtain that particular LE is problematic and needs to be updated. This process is carried out iteratively, dealing with the most-off column in each iteration. In respect to garbage in-garbage out, we assume that at least half of the priority vectors in $\hat{S}$ are reasonably valid, so that when we identify an LE as problematic and update the linking priority vector that yielded that particular LE, we are not by mistake changing a valid part of $\hat{S}$. Let the sets containing all the criteria and alternatives be $\Omega_1 = \{C_{1.1}, C_{1.2}, \ldots, C_{2.3}\}$ and $\Omega_2 = \{A_1, A_2, A_3\}$. Then, the LCI between two LEs can be defined as:

$$
LCI_{bc} =
\begin{cases}
\dfrac{1}{n \cdot m} \sum\limits_{i=m+1}^{n+m} \sum\limits_{j=1}^{m} \dfrac{\max\left(S^{L}_{b,(ij)},\, S^{L}_{c,(ij)}\right)}{\min\left(S^{L}_{b,(ij)},\, S^{L}_{c,(ij)}\right)}, & \text{if } b, c \in \Omega_1, \\[2ex]
\dfrac{1}{n \cdot m} \sum\limits_{i=1}^{m} \sum\limits_{j=m+1}^{m+n} \dfrac{\max\left(S^{L}_{b,(ij)},\, S^{L}_{c,(ij)}\right)}{\min\left(S^{L}_{b,(ij)},\, S^{L}_{c,(ij)}\right)}, & \text{if } b, c \in \Omega_2,
\end{cases}
\qquad (8)
$$


where $S^{L}_{*,(ij)}$ represents the entry in the $i$th row and $j$th column of the corresponding matrix. Here $LCI_{bc}$ measures the closeness between two LEs $S^{L}_{b}$ and $S^{L}_{c}$, and for any $b, c \in \Omega_k$, $k = 1, 2$: (1) $LCI_{bc} \geq 1$; (2) $LCI_{bc} = LCI_{cb}$ (symmetry); (3) $LCI_{bb} = 1$ (reflexivity). The iterations are then carried out as follows. For a given estimated Supermatrix $\hat{S}$, the LEs are first calculated as explained above. Then an $n \times n$ LCI matrix $L_{alt}$ is obtained by comparing all the LEs obtained by using alternatives as links with each other, as explained in Eq. (8). Similarly, an $m \times m$ LCI matrix $L_{crit}$ is obtained for the LEs obtained by using criteria as links. Then LCI-scores for all the criteria and alternatives are calculated by averaging the LCI values in each column of both LCI matrices, excluding the comparison of an LE against itself, and are notated by $\overline{LCI}$:

$$
\overline{LCI}_X =
\begin{cases}
\dfrac{1}{m-1} \sum\limits_{i=1,\, i \neq j}^{m} L_{crit(ij)}, & \text{if } X \in \Omega_1, \\[2ex]
\dfrac{1}{n-1} \sum\limits_{i=1,\, i \neq j}^{n} L_{alt(ij)}, & \text{if } X \in \Omega_2,
\end{cases}
\qquad (9)
$$

where $j$ represents the column of criterion/alternative $X$ in the corresponding LCI matrix and $L_{*(ij)}$ represents the entry in the $i$th row and $j$th column of the corresponding matrix. If the largest LCI-score calculated as in (9) is greater than a pre-specified threshold (which is at least 1), then the column in $\hat{S}$ belonging to the criterion or alternative with that LCI-score needs to be updated by the decision maker. This process is continued until the largest LCI-score obtained from the updated $\hat{S}$ is smaller than the threshold. Choosing an appropriate threshold will also improve the performance of the method significantly, and based on simulated models we suggest using a threshold below 1.1. This method, which uses the mean to calculate the LCI-score, is effective in determining the most incoherent priority vector. However, by using the mean, the LCI-scores of coherent priority vectors can be strongly influenced by incoherent priority vectors, which can be thought of as outliers. Since the LCIs are essentially distances between the LEs, having an outlier in the set of LEs will artificially increase all the other LCI-scores, since that particular LE will be further away than all the others when the mean is used as the aggregation method to combine the information in the LCI matrices. Better approaches can be proposed for this purpose, for example using the median for aggregation, to avoid this artificial increase in the LCI-scores of non-outlier LEs and to better distinguish the LEs causing the spread. This approach is discussed in detail in Sect. 4. The examples and simulation results showed that calculating the LCI to identify incoherent priority vectors is a worthwhile effort to improve the validity of the decision, and it acts like a form of cross validation performed on a Supermatrix. Decision makers can return to the PCMs of the incoherent priority vectors and update comparisons to obtain coherent priority vectors and improve the likelihood of making a valid decision.
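To make the mechanics of (5)-(9) concrete, the sketch below builds linking estimates for criterion links and computes the pairwise LCI of Eq. (8) and the column-averaged LCI-scores of Eq. (9). It is an illustrative reading of the formulas rather than the authors' implementation: Python/NumPy, the block layout (the first m rows and columns are criteria, the remaining n are alternatives), a fully populated, strictly positive weighted Supermatrix, and the generalization of the C1.1 link in (6) to an arbitrary link criterion are all our assumptions, and alternative links are omitted for brevity.

import numpy as np
from itertools import combinations

def linking_estimate_from_criterion(S, m, link):
    """Linking estimate of a weighted Supermatrix S (criteria in the first m
    rows/columns, alternatives in the rest) using criterion `link` as the
    linking pin, following (6)-(7)."""
    S = np.asarray(S, dtype=float)
    B = S[:m, m:]    # criteria-by-alternatives block (priority vectors of the alternative columns)
    C = S[m:, :m]    # alternatives-by-criteria block (priority vectors of the criterion columns)
    # T[i, a]: contribution of criterion i within alternative a, expressed in the link's unit
    T = (B / B[link, :]) * (C[:, link] / C[0, link])
    LE = np.zeros_like(S)
    LE[:m, m:] = T / T.sum(axis=0)                      # column-normalize: upper-right block of (7)
    LE[m:, :m] = (T / T.sum(axis=1, keepdims=True)).T   # row-normalize and transpose: lower-left block of (7)
    return LE

def lci_between(LE_b, LE_c, m, links_are_criteria=True):
    """Pairwise LCI of Eq. (8): mean of max/min ratios over one off-diagonal block."""
    block = (slice(m, None), slice(0, m)) if links_are_criteria else (slice(0, m), slice(m, None))
    X, Y = LE_b[block], LE_c[block]
    return float(np.mean(np.maximum(X, Y) / np.minimum(X, Y)))

def lci_scores_for_criteria(S, m, aggregate=np.mean):
    """LCI-score of Eq. (9) for every criterion link; pass aggregate=np.median to
    obtain the median-based variant motivated above and used in Sect. 4."""
    LEs = [linking_estimate_from_criterion(S, m, j) for j in range(m)]
    L = np.ones((m, m))
    for b, c in combinations(range(m), 2):
        L[b, c] = L[c, b] = lci_between(LEs[b], LEs[c], m)
    off_diagonal = ~np.eye(m, dtype=bool)
    return np.array([aggregate(L[:, j][off_diagonal[:, j]]) for j in range(m)])

# Usage, once a 10 x 10 weighted Supermatrix S with m = 6 criteria is available:
# scores = lci_scores_for_criteria(S, m=6)
# most_incoherent = int(scores.argmax())   # candidate column to revisit if its score exceeds the threshold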


It may also be difficult to get an updated priority vector. What if a common statistical method could be adapted to predict coherent priority vectors to replace the incoherent priority vectors? Or, since a common complaint about the ANP is the number of comparisons that are required, what if the incoherent priority vectors were incoherent simply because they were incomplete comparisons? Then the number of required comparisons could be reduced. Returning to the original example, where there were only two alternatives and two criteria, because the relationships were very simple, as a reader you could have identified that a priority vector like (.9, .1) for A2 would be incoherent and need to be updated, and you could even have provided the priority vector (.5, .5) as the coherent update. Also, because the other comparisons were perfectly coherent, not only could an updated vector be provided for A2, but the column could have been left blank to reduce the number of needed comparisons, as it was in the example, and the priority vector (.5, .5) would still be obtained as the coherent update. In general, decisions will have more than two alternatives and criteria and will almost never be perfectly coherent; how, then, could one determine how to provide an updated priority vector?

4 Dynamic Clustering

When testing for coherency, the LEs are created and can be used not only to calculate the LCI but also, through a clustering algorithm, to provide the information needed to suggest an update for incoherent priority vectors. Cluster analysis is a technique to detect hidden groups or patterns in a data set. A weighted Supermatrix poses some unique challenges: how to measure similarities among the LEs; the fact that, even with a small sample size, regular clustering methods would require a large number of clustering variables; and the need to limit the number of elements required to be considered a "good" cluster. A dynamic clustering algorithm can address each of these challenges and reduce the burden on decision makers to provide coherent priority vectors. LEs are large matrices which cannot efficiently be represented graphically, so we will simply use an asterisk symbol in Fig. 1. We further assume that if the original weighted Supermatrix were perfectly accurate, then all the LEs would be the same and lie exactly in the middle of the graph. In practice, the LEs will not all lie in the same place because of imprecision in the measurements. Figure 1 represents the LEs from a weighted Supermatrix that could be updated with dynamic clustering. In the chart there is a large coherent cluster in the middle with random errors in some priority vectors, and there is another small cluster that exhibits a common bias. There is another LE, representing a single column vector, that is incoherent enough that it should be revised. With the large and small clusters, a key assumption is that the largest cluster of LEs is the accurate cluster, which happens to be the case in this scenario. Both the smaller cluster and the priority vector with too much error could be identified and updated to obtain a coherent weighted Supermatrix.



Fig. 1 Visualization of the LEs from an updatable Supermatrix

If the LEs come from a weighted Supermatrix where the priority vectors have sufficient error, they will be widely dispersed and there will not be enough of a pattern to reliably suggest meaningful updates; in that case the weighted Supermatrix would be identified as such and returned to the user to update before testing for coherency again. In the case that the LEs come from a systematically biased weighted Supermatrix, it is possible the largest cluster would not be centered in the middle of the graph but biased in one direction or another. Assuming the largest cluster of LEs is the cluster with the most coherent measurements is not unreasonable, but it can also be considered the greatest weakness of dynamic clustering. In practice, without access to the "true" data there is no way to know which priority vectors are accurate and which are not. In the event that a coherent bias systematically occurs both in the majority of the priority vectors and in a way that is consistent, i.e., coherent, across the different scales of each individual priority vector, then the biased cluster would be used to update the rest of the weighted Supermatrix. Such a bias would likely lead to an incorrect ranking in the first place, and hence using dynamic clustering will not make the decision less optimal. The dynamic clustering algorithm begins with a weighted Supermatrix, and the user can set a limit for the desired level of coherency (Yavuz and Cooper 2017). The default is 1.05 but can be changed to require the weighted Supermatrix to be more or less coherent. Setting the limit follows the notion of Type-1 and Type-2 errors: if the limit is set very close to 1, this may lead to updating a Supermatrix when not necessary, while moving the limit away from 1 may lead to labeling an incoherent Supermatrix as coherent. Depending on the number of alternatives and criteria, the minimum number of LEs that are needed in the "good" cluster is determined. Then the LEs and their respective LCI-scores are calculated to identify the most incoherent priority vector (MIPV) and to check if the "good" cluster is large enough


to proceed. If the Supermatrix is coherent, then the limit matrix can be calculated to obtain the final priorities. If the weighted Supermatrix is not coherent and the LEs are too incoherent, then the weighted Supermatrix will be returned to the user for external updating. If the weighted Supermatrix is incoherent and a good cluster exists, then the dynamic clustering algorithm will predict the recommended priority vector to replace the MIPV. The approach summarized here differs in two fundamental ways from the approach of averaging the LCIs to determine the most incoherent priority vector and having the user update it until the Supermatrix can be labeled as coherent. The first is how the most incoherent priority vector is determined. In this approach the LCI-scores in an LCI matrix are not averaged; instead, the assumption is made that a "good" cluster consisting of at least half of the criteria or alternatives exists (depending on which part of the Supermatrix is being studied), and each LE is assigned an LCI-score which is essentially the worst relationship that particular LE would have had with the other LEs in the good cluster. This can be achieved simply by taking the median instead of the mean in each row of the LCI matrix. Using the median allows the truly coherent priority vectors to stand out as such, since their LCI-scores become immune to the false inflation caused by the incoherent priority vector(s). This leads to the second fundamental difference of the dynamic clustering: the coherent priority vectors can now be labeled as such and used for replacing the incoherent ones without having to go back to the user, provided a large enough coherent cluster exists. The user can accept or modify the suggested priority vector, and the weighted Supermatrix is updated with the replacement priority vector. The process then repeats until the weighted Supermatrix is coherent. The limit matrix is calculated and a summary of the actions is provided. The steps are summarized in Table 4, and details about these steps can be found in Yavuz and Cooper (2017).

Table 4 Dynamic clustering summary

Step 1  Initial calculations
        a. Set the initial parameters and the number of LEs needed in a "good" cluster
        b. Obtain the LEs and calculate the Linking Coherency Index (LCI) and LCI-score for each alternative and criterion
        c. Identify the most incoherent priority vector (MIPV)
        d. Evaluate if the MIPV is a candidate for updating
        e. Classify the Supermatrix as updatable (U) or non-updatable (NU)
Step 2  Update and report
        a. Update the most incoherent priority vector
        b. Repeat steps 1b-1d using the updated weighted Supermatrix
        c. Continue updating until no further improvements can be made
        d. Report the results and a summary of the actions performed
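The sketch below shows one way the loop in Table 4 can be wired together. It continues the Sect. 3.3 sketch (reusing linking_estimate_from_criterion and lci_scores_for_criteria defined there) and, for brevity, only updates criterion columns. The prediction step, which averages the relevant column across the linking estimates of the "good" links, is a plausible simplification of the prediction described in Yavuz and Cooper (2017), not a reproduction of it.

import numpy as np

def suggest_update_for_criterion(S, m, bad_column, good_links):
    """Predict a replacement priority vector for an incoherent criterion column by
    averaging that column across the LEs produced by the coherent ("good") links."""
    columns = [linking_estimate_from_criterion(S, m, g)[m:, bad_column] for g in good_links]
    vector = np.mean(columns, axis=0)
    return vector / vector.sum()

def dynamic_clustering(S, m, threshold=1.05, max_rounds=10):
    """Iteratively replace the most incoherent criterion column until the largest
    median-based LCI-score falls below the chosen limit (default 1.05, as above)."""
    S = np.asarray(S, dtype=float).copy()
    for _ in range(max_rounds):
        scores = lci_scores_for_criteria(S, m, aggregate=np.median)
        bad_column = int(scores.argmax())
        if scores[bad_column] < threshold:
            break                 # Supermatrix labeled coherent
        good = [j for j in range(m) if j != bad_column and scores[j] < threshold]
        if len(good) < m // 2:
            break                 # no large enough "good" cluster: return for external updating
        S[m:, bad_column] = suggest_update_for_criterion(S, m, bad_column, good)
    return S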


In large scale simulations, Yavuz and Cooper (2017) showed that the dynamic clustering algorithm is very effective in identifying and updating incoherent priority vectors in weighted Supermatrices. The simulations were carried out under different scenarios that varied the number of alternatives and criteria in the Supermatrix, the number of incoherent priority vectors, the location of the incoherent priority vectors, and the amount of error introduced into the pairwise comparisons. Results from the different combinations are presented in tables that show how many Supermatrices were updatable, how often the mean absolute error between the predicted and actual priority vectors was reduced, and how often more error was introduced into the rank.

5 Reducing the Number of Comparisons

Coherency testing serves as an outlier detection and model testing practice for the ANP, but another advantage it can bring is that the number of comparisons needed to fill the weighted Supermatrix can be reduced. Let us begin with a representative weighted Supermatrix, (10), with four alternatives evaluated with respect to six criteria, where the decision maker(s) will not need to provide all of the pairwise comparisons. Most of the priority vectors have been provided, and each contains some error to represent a realistic decision. Additionally, there is a bias in the priority vector that represents how C2.1 is distributed among the alternatives, which we should expect the dynamic clustering algorithm to identify and update. The decision maker felt confident enough with the pairwise comparisons that one cluster of comparisons for A3 and both clusters of comparisons for A4 were not completed, to reduce the number of comparisons needed to make this decision. In this example we will begin with an unsophisticated "guess" of 1/n for the missing comparisons so that sample values of the LCI-scores can be reported in the subsequent tables.

$$
\begin{array}{c|cccccccccc}
 & C_{1.1} & C_{1.2} & C_{1.3} & C_{2.1} & C_{2.2} & C_{2.3} & A_1 & A_2 & A_3 & A_4 \\ \hline
C_{1.1} & 0 & 0 & 0 & 0 & 0 & 0 & 0.101 & 0.209 & 0.190 & 0.000 \\
C_{1.2} & 0 & 0 & 0 & 0 & 0 & 0 & 0.223 & 0.140 & 0.143 & 0.000 \\
C_{1.3} & 0 & 0 & 0 & 0 & 0 & 0 & 0.162 & 0.215 & 0.119 & 0.000 \\
C_{2.1} & 0 & 0 & 0 & 0 & 0 & 0 & 0.230 & 0.126 & 0.000 & 0.000 \\
C_{2.2} & 0 & 0 & 0 & 0 & 0 & 0 & 0.132 & 0.215 & 0.000 & 0.000 \\
C_{2.3} & 0 & 0 & 0 & 0 & 0 & 0 & 0.154 & 0.096 & 0.000 & 0.000 \\
A_1 & 0.165 & 0.477 & 0.227 & 0.176 & 0.158 & 0.311 & 0 & 0 & 0 & 0 \\
A_2 & 0.374 & 0.352 & 0.382 & 0.059 & 0.322 & 0.318 & 0 & 0 & 0 & 0 \\
A_3 & 0.316 & 0.105 & 0.067 & 0.294 & 0.268 & 0.133 & 0 & 0 & 0 & 0 \\
A_4 & 0.146 & 0.067 & 0.324 & 0.471 & 0.252 & 0.238 & 0 & 0 & 0 & 0
\end{array}
\qquad (10)
$$

In many decisions or forms of analysis, more [data] is better. And of course, in a weighted Supermatrix the algorithm gets better with greater amounts of coherent


data. In statistics, increasing the sample size is one of the most common ways to increase the power of a test. When dealing with intangible data, in particular, this can be more complicated. The study of reducing the number of comparisons in a PCM has received considerable attention. Harker (1987) could be used as a great source of options to reduce the required number of pairwise comparisons in the PCMs in an ANP decision. This can also be applied to a weighted Supermatrix, because a weighted Supermatrix can be transformed into a large pairwise comparison matrix. Wedley et al. (1993) suggest starting rules to reduce the number of comparisons and also to reduce decision maker fatigue, because more comparisons can introduce more error. The decision maker can approach the decision about which comparisons not to complete either as an a priori or an ad hoc decision. In practice, the comparisons can be reduced both within the individual PCMs and by leaving some PCMs out entirely. It is important to recognize how many of the priority vectors in a weighted Supermatrix are essentially redundant. Let n be the number of elements in a given cluster. The minimum number of comparisons needed to calculate the final priorities would be n − 1 in each of the PCMs used to calculate the priority vectors on one side of the weighted Supermatrix, plus a single priority vector in one column from the other side of the weighted Supermatrix. The rest of the comparisons are redundant. One of the reasons for completing at least some of the redundant comparisons is to be able to calculate consistency in the PCMs and coherency in the weighted Supermatrix, much like gathering a large enough sample to use part of it for cross validation. Additionally, we suggest that at least half of the priority vectors on the other side of the Supermatrix be completed: first, because the algorithm uses the median; and second, because simulation results showed that using fewer than half of the priority vectors was not enough to counter the bias within small samples. The decision about how many and which specific priority vectors to eliminate cannot be generalized across every ANP model because of important variations among different models. For example, choosing between cell phones likely has the potential for fewer adverse implications than choosing a less than ideal plan to immunize a population against a transmittable disease. Similarly, the size of a PCM and weighted Supermatrix determines how many redundant comparisons there are. More precisely, the things to consider when deciding how many and which priority vectors to eliminate can be organized under four titles: size, cost, impact, and error. The information each of these "variables" carries and why they are significant for the decision are given in the following discussions.

Size: The size of the PCMs and weighted Supermatrix determines how much redundancy exists in the ANP decision. A weighted Supermatrix for a decision with two alternatives evaluated with respect to two clusters with two elements in each cluster has some, but very little, redundancy that can be reduced. However, a weighted Supermatrix with five alternatives evaluated with respect to four criteria clusters that each contain four to five elements has a lot of redundancy and is a good candidate where many redundant comparisons could be reduced.


Cost: At the outset one can determine which priority vectors are the most difficult and/or expensive to obtain. The decision maker can then choose to obtain the priority vectors in the other half of the weighted Supermatrix and use them to infer the most difficult and expensive priority vectors.

Impact: While the overall impact of a decision or of specific clusters does not change the number of redundant comparisons, it is important in determining how many of the redundant comparisons will be left unfilled.

Error: This can be related to the amount of inconsistency in other PCMs and/or to determining which comparisons the respective decision makers are most confident they can measure correctly. In certain cases, it can be easier to determine the allocation of a single element among the alternatives than the allocation of an element with respect to another element within a single alternative, and vice versa. This will determine which side of the weighted Supermatrix to leave unfilled.

Now that these "variables" are defined, the number of priority vectors that can be eliminated, which can be denoted by $n_e$, can be thought of as a function of them, i.e.,

$$
n_e = f(\text{size}, \text{cost}, \text{impact}, \text{error}).
\qquad (11)
$$
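Before discussing the form of f, it can help to see how quickly the redundancy that these factors weigh against grows. The back-of-the-envelope count below contrasts a fully elicited two-cluster model (complete PCMs on both sides) with the theoretical minimum described earlier: spanning-tree comparisons for every PCM on one side plus a single priority vector on the other. The function and its assumptions (full PCMs over the n alternatives and m criteria) are ours and purely illustrative.

def comparison_counts(n_alt, m_crit):
    """Pairwise comparisons in a fully elicited two-cluster ANP model versus
    the theoretical minimum discussed above (illustrative counting only)."""
    full = m_crit * n_alt * (n_alt - 1) // 2 + n_alt * m_crit * (m_crit - 1) // 2
    minimal = m_crit * (n_alt - 1) + (m_crit - 1)
    return full, minimal

# The 4-alternative, 6-criterion model of Sect. 5: 96 comparisons if everything
# is elicited, 23 if only the theoretical minimum is provided.
print(comparison_counts(4, 6))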

The form of the function f in Eq. (11) will be different for every decision, and the importance of these variables in f will vary greatly from decision to decision. From another point of view, the number of comparisons provided can be viewed as the sole input for the information gathered for a decision, and then the relationship between the number of comparisons and the information will be of interest. For illustrative purposes, how this relationship can take form under different scenarios is visualized in Fig. 2. With no other variable affecting the relationship, as the number of comparisons provided increases to the point of filling the whole Supermatrix, the information should increase linearly or almost linearly to its maximum attainable value. This case is visualized in part "a" of Fig. 2. If this is the case for a decision, then the decision maker should not be interested at all in reducing the number of comparisons. If, however, in a decision, the same amount of information can be obtained using fewer comparisons, like in part "b," or the same amount of information can be obtained even faster, like in part "c," then the decision maker should be very interested in making fewer comparisons. There may be cases where the variable of interest is very hard to measure and prone to errors. Then the relationship can take a form like in part "d," where making more comparisons after a certain point starts damaging the decision making process. In that case, finding the optimum number of comparisons will become even more critical for that decision maker. Of course, various relationships can be envisioned beyond the ones discussed here, but the takeaway is that the number of comparisons to make and which ones to eliminate is a choice that will be specific to each decision making problem according to (11). These decisions could be made upfront or could be determined by checking in at certain points during the decision making process. For example, consider a PCM with at least 5 elements where all the comparisons except the last two rows have been completed and are very consistent; then what is the point of the



last comparisons? In the same way, one could determine that enough of the weighted Supermatrix has been filled and is coherent enough to begin reviewing and potentially using suggested updates. In summary, it is important to emphasize that this is not a replacement for good work and analysis; there is no such thing as a free lunch. It is not a black box that can generate meaningful data from thin air. The decision maker should review each of the suggestions for the intended priorities, whether for complete or incomplete pairwise comparisons. However, it can significantly reduce the number of redundant pairwise comparisons required to build an ANP model.

Fig. 2 Visualization of possible relationships between the number of comparisons and information

From Steps 1b–1d the LCI-scores are calculated and C2.1 is identified as the MIPV, as shown in Table 5. In this example, the decision maker was indeed coherent enough in the other pairwise comparisons that a "good" cluster exists and the weighted Supermatrix is classified as "Updatable." Even though the empty comparisons for A3 and A4 have raw LCIs of 2.84 and 3.54, because of the size of the clusters the LCI-score for C2.1 is 2.561 (Table 5).

Table 5 Missing comparisons round 1

Alternatives LCI-scores (A1–A4)          1.392   1.392   1.958   1.958
Criteria LCI-scores (C1.1–C2.3)          1.611   1.178   1.624   2.561   1.624   1.587
Most incoherent priority vector (MIPV)   C2.1 (.176, .058, .294, .470)
Suggested update                         C2.1 (.338, .245, .186, .231)


According to Step 2a, the suggested update for C2.1 is (.338, .245, .186, .231). We will assume the decision maker agrees with this update, the weighted Supermatrix is updated, and we proceed to Step 2b. From Steps 1b–1d, as shown in Table 6, A4 is identified as the MIPV with an LCI-score of 1.898. Because the other pairwise comparisons were sufficiently coherent, the relationships identified in the other LEs can be used to predict the priority vector for the incomplete comparisons: (.116, .039, .279, .186, .255, .125). The decision maker can evaluate the predicted priority vector and adjust it as they see fit. If adjustments are made, the new priority vector will be tested in the next round, and it is possible that the adjusted priority vector will be identified as the MIPV with a proposed revision. In this case the suggested priority vector is accepted and included in the updated weighted Supermatrix. In Table 7 the LCI-scores are improving, but, as we should expect with incomplete comparisons, the weighted Supermatrix is still not coherent. However, it is worth noting that, by using the median, we can see from the LCI-scores for the alternatives that the "good" cluster is clearly coherent and the values for the coherent priority vectors are not as biased by the LE generated by the incomplete A3 priority vector. The set of incomplete comparisons in A3 is identified as the MIPV with an LCI-score of 1.674, and the predicted priority vector is (.291, .071, .067, .174, .315, .081). This suggestion is accepted and included in the updated weighted Supermatrix. As the dynamic clustering algorithm begins another iteration, the new LCI-scores for the criteria and alternatives in Table 8 have improved significantly. Depending on the cutoff set in Step 1a, the algorithm could stop at this point and calculate the limit matrix and final priorities. However, with the limit set at 1.05, C2.3 is identified as the MIPV with an LCI-score of 1.079 and is a candidate for updating.

Table 6 Missing comparisons round 2

Alternatives LCI-scores (A1–A4)          1.101   1.101   1.674   1.898
Criteria LCI-scores (C1.1–C2.3)          1.322   1.1782  1.357   1.316   1.321   1.364
Most incoherent priority vector (MIPV)   A4 (.167, .167, .167, .167, .167, .167)
Suggested update                         A4 (.116, .039, .279, .186, .255, .125)

Table 7 Missing comparisons round 3

Alternatives LCI-scores (A1–A4)          1.048   1.048   1.674   1.048
Criteria LCI-scores (C1.1–C2.3)          1.162   1.110   1.116   1.244   1.256   1.116
Most incoherent priority vector (MIPV)   A3 (.190, .143, .119, .167, .167, .167)
Suggested update                         A3 (.291, .071, .067, .174, .315, .081)

Table 8 Missing comparisons round 4

Alternatives LCI-scores (A1–A4)          1.047   1.049   1.000   1.000
Criteria LCI-scores (C1.1–C2.3)          1.040   1.020   1.012   1.014   1.014   1.079
Most incoherent priority vector (MIPV)   C2.3 (.311, .319, .133, .238)
Suggested update                         C2.3 (.358, .271, .133, .237)


It appears, from looking at the difference between the MIPV and the suggested update for C2.3 in Table 8, that, based on the other pairwise comparisons in the weighted Supermatrix, the decision maker identified a larger difference between A1 and A2 with respect to C2.3 than he recognized when comparing them together. The decision maker should go back and look at the inconsistency in the respective pairwise comparison matrices and revisit those specific comparisons to reevaluate whether the two alternatives are indeed not almost identical with respect to C2.3. If the two alternatives are indeed different with respect to C2.3, then the decision maker can accept this suggested update. In this example we assume the decision maker still feels strongly that the contribution of A1 and A2 with respect to C2.3 is almost identical and accepts the LCI-score of 1.079. The updated weighted Supermatrix will then be evaluated again. At this point, while there is still some incoherency among the priority vectors in the updated weighted Supermatrix, all the LCI-scores except that of C2.3 are below the cutoff (Table 9). Any additional suggested updates are likely to introduce more error than they would correct. The dynamic clustering algorithm will now proceed to Step 2d, calculate the limit matrix, and provide a summary of the process. The final priorities from the limit matrix were calculated from the updated weighted Supermatrix (12) and are provided in Table 10 as the "Algorithm" priority. The correct priority vectors for C2.1 and C2.3 and the pairwise comparisons for the missing comparisons in A3 and A4 were inserted in the Supermatrix in (10), along with the other intended priorities, to show the intended weighted Supermatrix (13) for comparison purposes. The final priorities for the intended weighted Supermatrix were also calculated and are provided in Table 10 as the "Intended" priority. The final priorities from the updated weighted Supermatrix are within .015 of the final priorities from the intended weighted Supermatrix, and the ranking is the same. By using the dynamic clustering algorithm, the decision maker was able to reduce the number of needed comparisons in the ANP decision, identify and update incoherent priority vectors, and correctly identify A2 as the preferred alternative.

Table 9 Missing comparisons final LCI

Alternatives LCI-scores (A1–A4)    1.047   1.049   1.000   1.000
Criteria LCI-scores (C1.1–C2.3)    1.040   1.020   1.012   1.014   1.014   1.079

Table 10 Final priorities with missing comparisons

Alternatives   Algorithm priority   Rank   Intended priority   Rank
A1             .261                 2      .276                2
A2             .332                 1      .339                1
A3             .188                 4      .181                4
A4             .218                 3      .205                3


$$
\begin{array}{c|cccccccccc}
 & C_{1.1} & C_{1.2} & C_{1.3} & C_{2.1} & C_{2.2} & C_{2.3} & A_1 & A_2 & A_3 & A_4 \\ \hline
C_{1.1} & 0 & 0 & 0 & 0 & 0 & 0 & 0.101 & 0.209 & 0.291 & 0.116 \\
C_{1.2} & 0 & 0 & 0 & 0 & 0 & 0 & 0.223 & 0.140 & 0.071 & 0.039 \\
C_{1.3} & 0 & 0 & 0 & 0 & 0 & 0 & 0.162 & 0.215 & 0.067 & 0.279 \\
C_{2.1} & 0 & 0 & 0 & 0 & 0 & 0 & 0.230 & 0.126 & 0.174 & 0.186 \\
C_{2.2} & 0 & 0 & 0 & 0 & 0 & 0 & 0.132 & 0.215 & 0.315 & 0.255 \\
C_{2.3} & 0 & 0 & 0 & 0 & 0 & 0 & 0.154 & 0.096 & 0.081 & 0.125 \\
A_1 & 0.165 & 0.477 & 0.227 & 0.338 & 0.158 & 0.358 & 0 & 0 & 0 & 0 \\
A_2 & 0.373 & 0.351 & 0.382 & 0.245 & 0.322 & 0.272 & 0 & 0 & 0 & 0 \\
A_3 & 0.315 & 0.105 & 0.067 & 0.186 & 0.268 & 0.133 & 0 & 0 & 0 & 0 \\
A_4 & 0.146 & 0.066 & 0.324 & 0.231 & 0.252 & 0.237 & 0 & 0 & 0 & 0
\end{array}
\qquad (12)
$$

$$
\begin{array}{c|cccccccccc}
 & C_{1.1} & C_{1.2} & C_{1.3} & C_{2.1} & C_{2.2} & C_{2.3} & A_1 & A_2 & A_3 & A_4 \\ \hline
C_{1.1} & 0 & 0 & 0 & 0 & 0 & 0 & 0.114 & 0.209 & 0.348 & 0.154 \\
C_{1.2} & 0 & 0 & 0 & 0 & 0 & 0 & 0.223 & 0.140 & 0.087 & 0.038 \\
C_{1.3} & 0 & 0 & 0 & 0 & 0 & 0 & 0.143 & 0.209 & 0.087 & 0.270 \\
C_{2.1} & 0 & 0 & 0 & 0 & 0 & 0 & 0.229 & 0.117 & 0.043 & 0.115 \\
C_{2.2} & 0 & 0 & 0 & 0 & 0 & 0 & 0.143 & 0.209 & 0.347 & 0.269 \\
C_{2.3} & 0 & 0 & 0 & 0 & 0 & 0 & 0.143 & 0.117 & 0.087 & 0.154 \\
A_1 & 0.160 & 0.471 & 0.217 & 0.471 & 0.172 & 0.313 & 0 & 0 & 0 & 0 \\
A_2 & 0.360 & 0.353 & 0.391 & 0.294 & 0.310 & 0.313 & 0 & 0 & 0 & 0 \\
A_3 & 0.320 & 0.118 & 0.087 & 0.059 & 0.276 & 0.125 & 0 & 0 & 0 & 0 \\
A_4 & 0.160 & 0.059 & 0.304 & 0.176 & 0.241 & 0.250 & 0 & 0 & 0 & 0
\end{array}
\qquad (13)
$$

6 Conclusion

The ANP has been used to make decisions across many disciplines. Its flow, or steps, is similar to that of many statistical approaches. In statistics there are data screening processes to reduce bias and improve the accuracy of predictions. Until recently, the principal data quality check in the ANP has been the consistency of the PCM. Checking the coherency in a weighted Supermatrix is an important data screening test that can detect incoherent priority vectors. Incoherent priority vectors can be thought of as priority vectors that stand out from the others when the entire set of priority vectors in a weighted Supermatrix is considered. Another form of model checking used in regression and machine learning is cross validation. The linking estimates obtained while calculating the LCI can be used to cross validate the priority vectors in the weighted Supermatrix. Checking the coherency in the weighted Supermatrix is a beneficial and crucial step in the ANP. Additional motivation to check the coherency in a weighted Supermatrix was provided by the examples and the simulation results. The consistency of a PCM is a necessary but not a sufficient condition. Conservative estimates from the simulations


show that between 10% and 29% of Supermatrices with very consistent PCMs were not coherent and led to the final rankings changing to the extent that less preferred alternatives would be chosen as the most preferred alternative. In the first example, a simple scenario provided the logic behind how coherency takes advantage of the structural dependence that exists among the priority vectors in the weighted Supermatrix. It is crucial to understand the units of measurement in a weighted Supermatrix to take advantage of the structural dependence. The method to do so was provided in Sect. 3.3. After identifying the most incoherent priority vector (MIPV), the linking estimates can be used in a dynamic clustering algorithm to predict a coherent priority vector to update the MIPV. Reducing the number of comparisons that are required in ANP decision models would likely increase the application of the ANP. It was shown in an example how the linking estimates can be used to predict priority vectors for missing comparisons. This is an added benefit of checking the coherency of a weighted Supermatrix. Coherency, data screening, and model checking in the ANP are still in their infancy. In group decision making there has been a great deal of research about how to develop group consensus. In one sense, the LEs used to calculate the LCI could be thought of as a group of Supermatrices. If the LEs are treated as a "group," this opens up not only additional ways to aggregate the LEs but also different measures or dimensions of coherency, like many of the consistency or proximity measures that have been developed to measure and improve group consensus. Advances in this area will allow decision makers to ensure their ANP models can be counted on before using them to make decisions or draw conclusions.

References

Chen K, Kou G, Tarn JM, Song Y (2015) Bridging the gap between missing and inconsistent values in eliciting preference from pairwise comparison matrices. Ann Oper Res 1–21
Chen Y, Qiuxia J, Hui F, Hui L, Jiarui H, Yanqi W, Jie C, Cheng W, Yuehua W (2019) Analytic network process: academic insights and perspectives analysis. J Clean Prod 235(1):1276–1294
Cooper O, Liu G (2017) Achieving the desired level of dependency in ANP decision models. Int J Anal Hierarchy Process 9(1):2–26
Cooper O, Yavuz I (2016) Linking validation: a search for coherency within the supermatrix. Eur J Oper Res 252(1):232–245
Domingues R, Filippone M, Michiardi P, Zouaoui J (2018) A comparative evaluation of outlier detection algorithms: experiments and analyses. Pattern Recognition 74:406–421
Harker PT (1987) Incomplete pairwise comparisons in the analytic hierarchy process. Mathematical Modelling 9(11):837–848
Hastings A, Gross LJ (2012) Encyclopedia of theoretical ecology, vol 4. Univ of California Press
Ho HF, Hung CC (2008) Marketing mix formulation for higher education: an integrated analysis employing analytic hierarchy process, cluster analysis and correspondence analysis. Int J Educ Manag 22(4):328–340
Hodge V, Austin J (2004) A survey of outlier detection methodologies. Artif Intell Rev 22(2):85–126
Jain AK (2010) Data clustering: 50 years beyond k-means. Pattern Recogn Lett 31(8):651–666
Kou G, Lou C (2012) Multiple factor hierarchical clustering algorithm for large scale web page and search engine clickstream data. Ann Oper Res 197(1):123–134
Kou G, Ergu D, Lin C, Chen Y (2016) Pairwise comparison matrix in multiple criteria decision making. Technol Econ Dev Econ 22(5):738–765
Kwiesielewicz M, Van Uden E (2004) Inconsistent and contradictory judgements in pairwise comparison method in the AHP. Comput Oper Res 31(5):713–719
Lee H, Shi Y, Nazem SM (1996) Supporting rural telecommunications: a compromise solutions approach. Ann Oper Res 68(1):33–45
Liu BS, Shen YH, Chen XH, Chen Y, Wang XQ (2014) A partial binary tree DEA-DA cyclic classification model for decision makers in complex multi-attribute large-group interval-valued intuitionistic fuzzy decision-making problems. Information Fusion 18:119–130
Liu DR, Shih YY (2005) Integrating AHP and data mining for product recommendation based on customer lifetime value. Inf Manag 42(3):387–400
Liu Y, Fan ZP, Zhang X (2016) A method for large group decision-making based on evaluation information provided by participators from multiple groups. Information Fusion 29:132–141
Popat SK, Emmanuel M (2014) Review and comparative study of clustering techniques. Int J Comput Sci Inf Technol 5(1):805–812
Saaty TL (1977) A scaling method for priorities in hierarchical structures. J Math Psychol 15(3):234–281
Saaty TL (1980) The analytic hierarchy process: planning, priority setting, resources allocation. McGraw-Hill
Saaty TL (1994) How to make a decision: the analytic hierarchy process. Interfaces 24(6):19–43
Saaty TL (2005) Theory and applications of the analytic network process: decision making with benefits, opportunities, costs, and risks. RWS Publications
Schoner B, Wedley WC, Choo EU (1993) A unified approach to AHP with linking pins. Eur J Oper Res 64(3):384–392
Vargas LG (1982) Reciprocal matrices with random coefficients. Mathematical Modelling 3(1):69–81
VIM I (2004) International vocabulary of basic and general terms in metrology (VIM). International Organization 2004:09–14
Wedley W (2013) AHP/ANP before, present and beyond. ISAHP 2013
Wedley WC, Choo EU (2001) A unit interpretation of multi-criteria ratios. In: Proceedings of the sixth international symposium on the analytic hierarchy process, Berne, Switzerland, pp 561–569
Wedley WC, Choo EU (2011) Multi-criteria ratios: what is the unit? J Multi-Criteria Decis Anal 18(3-4):161–171
Wedley WC, Schoner B, Tang TS (1993) Starting rules for incomplete comparisons in the analytic hierarchy process. Math Comput Modell 17(4-5):93–100
Yavuz I, Cooper O (2017) A dynamic clustering method to improve the coherency of an ANP supermatrix. Ann Oper Res 254(1-2):507–531
Zahir S (2007) A new approach to understanding and finding remedy for rank reversals in the additive analytic hierarchy process. ASAC 2007
Zhang HJ, Dong YC, Herrera-Viedma E (2018) Consensus building for the heterogeneous large-scale GDM with the individual concerns and satisfactions. IEEE Trans Fuzzy Syst 26:864–898

Usage of Entropy-Based Objective Weighting in Neutrosophic Multiple Attribute Decision-Making

Sait Gül

Abstract Since the evaluation of alternatives is based on the importance of the attributes, which is represented by weights, decision analysts should choose a proper weighting method. There are two basic types of weighting. The first one is subjective weighting, which consults the expertise of decision-makers (DMs). The second type is called objective weighting; it ignores the DMs' prioritization/preferences and just looks at the alternatives' scores, which are measured with respect to the attributes. In this study, we propose the usage of entropy measurement in determining the objective attribute weights for neutrosophic multiple attribute decision-making (N-MADM) problems. Seven entropy methods for interval-valued neutrosophic sets are considered for solving two different hypothetical decision-making problems in order to present the proposition's usability and efficiency. A nonparametric statistical test, namely the Wilcoxon signed-rank test, is performed for the comparison of the weight sets generated by the seven different entropy-based methods. The results show that entropy-based objective weighting is suitable for N-MADM.

1 Introduction Multiple Attribute Decision-Making (MADM) is a rational and quantitative tool for identifying the most preferred alternative(s) among potential alternatives with respect to predetermined attributes. In order to describe a problem as a MADM problem, there are two requirements: "the presence of multiple alternatives" and "differing results of each alternative" (İnan et al. 2017). Tzeng and Huang (2011) mentioned that the historical origins of MADM can be traced back to the correspondence between N. Bernoulli and P.R. de Montmort on the St. Petersburg paradox, a line of thought that generated the fundamentals of utility theory.

S. Gül ()
Faculty of Engineering and Natural Sciences, Management Engineering Department, Bahçeşehir University, İstanbul, Turkey
e-mail: [email protected]
© Springer Nature Switzerland AG 2021
Y. I. Topcu et al. (eds.), Multiple Criteria Decision Making, Contributions to Management Science, https://doi.org/10.1007/978-3-030-52406-7_13



After centuries, we now possess a scientific field on decision-making and a huge literature including many methodologies such as AHP, TOPSIS, VIKOR, ELECTRE, PROMETHEE, ORESTE, COPRAS, etc. MADM methods aim to provide a methodology for evaluating, sorting, or classifying alternatives by considering mostly conflicting attributes, and they have been widely used in economics, management, and engineering. When the number of alternatives and attributes increases, the complexity of the MADM problem increases. Moreover, the existence of multiple decision-makers (DMs) multiplies this complexity. Sodenkamp et al. (2018) noted that the complexity is further augmented if the process involves qualitative and quantitative evaluations of the possible alternatives. These evaluations are usually vague and imprecise, and they significantly complicate the construction of decision-making procedures. Given the immense complexity of the decision setting and the intrinsic ambiguity of human intelligence, DMs have difficulty stating their judgments with traditional crisp numbers in real cases (Liang et al. 2017).

As Russell (1923) stated, "All traditional logic habitually assumes that precise symbols are being employed. It is therefore not applicable to this terrestrial life but only to an imagined celestial existence," the relationship between the certain and the ambiguous has occupied an important place in the minds of scholars for centuries. Jan Lukasiewicz wrote pioneering works on multi-valued logics by extending the spectrum of certainty to the range [0, 1]. Max Black introduced simple fuzzy sets for the first time in the literature and presented the fundamentals of operations on this new reality representation concept. Finally, Zadeh (1965) rediscovered the concept of fuzzy sets and developed an explicit system for its mathematical background (Peng and Dai 2018a).

Zadeh (1965) promoted fuzzy sets (FS) to represent ambiguity by a membership value that is assigned to each element and ranges between 0 and 1. Atanassov (1986) described the intuitionistic fuzzy set (IFS) as a new representation of reality with a membership degree and a non-membership degree whose sum is less than or equal to 1. It is more expressive than classical FS but still lacks a comprehensive representation of the vague information that exists in real applications (Peng and Dai 2018a). Smarandache introduced neutrosophic set theory (NS) by assigning an independent degree to the indeterminacy or hesitancy within a decision and propounded the terms "neutrosophy" and "neutrosophic" (Şahin and Liu 2016). Smarandache (1998) wrote that "Neutrosophy is a new branch of philosophy which studies the origin, nature, and scope of neutralities, as well as their interactions with different ideational spectra." Peng and Dai (2018a) claimed that NS is more convenient for modeling human judgments since it characterizes the vagueness of knowledge in a more extensive manner. The realism of neutrosophy can be exemplified by everyday social outcomes such as sports (win, tie, or defeat) and voting (yes, abstention, or no).


In case the DM evaluations are considered as neutrosophic numbers or linguistic terms, the current MADM problem is neutrosophicated. So, the problem at hand can be called neutrosophic MADM, or N-MADM for short. Each occurrence in neutrosophy has definite and independent degrees of truth, falsity, and indeterminacy. For example, when a DM is asked for his/her thought regarding an issue, he/she may say that the possibility degree of the truthiness of the issue is 70%, the possibility degree of the falseness is 40%, and the possibility degree of being unsure is 10%. This judgment statement is expressed as the NS (0.7, 0.1, 0.4) (Abdel-Basset et al. 2018). This representation is beyond the modeling capacity of FS or IFS. For the same example, while FS uses only 0.7 as the membership degree in modeling the opinion with numbers, IFS will utilize 0.7 as the membership degree and a degree smaller than or equal to 0.3 (=1 − 0.7) as the non-membership degree.

Another important issue that should be carefully considered in MADM problems is the calculation of the weights of attributes. Since the evaluation of alternatives is based on the importance of the attributes, which is represented by weights, decision analysts should choose a proper weighting method. There are two basic types of weighting. The first one is subjective weighting (for instance, the pairwise comparisons of AHP), which consults the expertise of decision-makers (DMs) with the aim of revealing the hidden information in their minds and expressing it with numbers. The second one is called objective weighting. This type ignores the DMs' attribute prioritization and just looks at the alternative scores which were measured with respect to the attributes (Koksalmis and Kabak 2019). In the literature, some drawbacks are specified for subjective weighting, such as risks originating from self-seeking DMs or the long time periods required for data collection from DMs. To overcome them, objective weighting methods have been developed (Malhotra and Malhotra 2002; Gül et al. 2018).

In this study, there are two basic aims. The first one is to make a brief literature overview of the objective weighting approaches utilized in N-MADM. Secondly, we propose the usage of entropy-based weighting in determining the objective attribute weights by extending the current approaches proposed for single-valued NS (SVNS) to the other main type of NS, namely interval-valued NS (IVNS). There are six sections in this chapter. Section 2 gives the preliminaries regarding NS and some basic operations. A literature review on objective weighting in N-MADM is given in Sect. 3. Section 4 summarizes the entropy concept and the seven entropy measures considered in the study. The proposed approach and the Wilcoxon Signed-Rank Test for comparing the results generated by the entropy measures are explained in Sect. 5. Then, the application and comparison results are shared at the end of Sect. 5. Section 6 explains the conclusions of the study and presents the future research agenda.


2 Preliminaries: Neutrosophic Sets, Numbers, and Operations This section introduces some notions related to single and interval-valued NSs. Definition 1 (Smarandache 1998) Let X be a universe of discourse, then an NS is defined as follows: A = {< x, TA (x), IA (x), FA (x) >}

(1)

where TA : X → ]0−, 1+[ is the truth-membership function, IA : X → ]0−, 1+[ is the indeterminacy-membership function, and FA : X → ]0−, 1+[ is the falsity-membership function. There is no restriction on the sum of the mentioned membership degrees, so 0− ≤ sup TA(x) + sup IA(x) + sup FA(x) ≤ 3+. The general definition of NS is not easy to apply in real applications, so Wang et al. (2010) developed SVNS and Wang et al. (2005) defined IVNS as instances of NS to make this concept more understandable and operational. Definition 2 (Wang et al. 2010) Let X be a universe of discourse, then an SVNS is defined as follows: A = {< x, TA(x), IA(x), FA(x) >}

(2)

where TA : X → [0, 1], IA : X → [0, 1], and FA : X → [0, 1] with 0 ≤ TA(x) + IA(x) + FA(x) ≤ 3.

Definition 3 (Wang et al. 2005) Let X be a universe of discourse, then an IVNS is defined as follows:

A = {< x, [T_A^l(x), T_A^u(x)], [I_A^l(x), I_A^u(x)], [F_A^l(x), F_A^u(x)] >}    (3)

where the superscript l represents the infimum (inf) and u the supremum (sup) of the mentioned membership functions. There is a restriction on the supremum values of the intervals: 0 ≤ T_A^u(x) + I_A^u(x) + F_A^u(x) ≤ 3.

Definition 4 (Wang et al. 2010) The complement of an SVNS (Eq. 2) is denoted by AC and defined as given below:

AC = {< x, FA (x), 1 − IA (x), TA (x) >}

(4)


Definition 5 (Wang et al. 2005) The complement of an IVNS (Eq. 3) is denoted by AC and defined as given below:

AC = {< x, [F_A^l(x), F_A^u(x)], [1 − I_A^u(x), 1 − I_A^l(x)], [T_A^l(x), T_A^u(x)] >}    (5)

There are algebraic rules that can be used for making numerical operations with NSs, but they are not given in this study due to page limitations. Section 4 shows all the formulas required for the current entropy analysis.
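For concreteness, the sketch below shows one possible in-code representation of an IVNS element together with the complement operation of Eq. (5); the class and field names are illustrative choices of this sketch, not notation from the chapter.

```python
# A minimal sketch of an interval-valued neutrosophic element and its complement
# (Definitions 3 and 5); names are illustrative, not the chapter's notation.
from dataclasses import dataclass
from typing import Tuple

Interval = Tuple[float, float]

@dataclass
class IVNSElement:
    t: Interval  # [T^l, T^u] truth-membership interval
    i: Interval  # [I^l, I^u] indeterminacy-membership interval
    f: Interval  # [F^l, F^u] falsity-membership interval

    def complement(self) -> "IVNSElement":
        # Eq. (5): swap truth and falsity, reflect the indeterminacy interval.
        il, iu = self.i
        return IVNSElement(t=self.f, i=(1 - iu, 1 - il), f=self.t)

# Example: the evaluation <[0.4, 0.5], [0.2, 0.3], [0.3, 0.4]> and its complement.
a = IVNSElement(t=(0.4, 0.5), i=(0.2, 0.3), f=(0.3, 0.4))
print(a.complement())  # IVNSElement(t=(0.3, 0.4), i=(0.7, 0.8), f=(0.4, 0.5))
```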

3 Literature Review: Weighting in N-MADM Peng and Dai (2018a) presented a very comprehensive literature review on NS from five different perspectives: extensions, aggregations, information measures, decision-making tools, and image processing. That paper covered more than 200 studies published in Web of Science indexed journals between 1998 and 2017. However, it did not cover objective weighting approaches, which can be very informative for MADM researchers. So, we tried to fill this gap with a survey on the objective weighting issue. The most important part of MADM approaches is the aggregation of the evaluations which are determined with respect to the attributes. Mostly, these attributes have different importance on decisions, and the importance possessed by each attribute is defined with a weight value. Similarly, group MADM problems may require another weight consideration for handling the importance of the decision-makers (DMs) who are consulted for the evaluations of alternatives or attributes. The expertise or knowledge level of each DM can be different, so the weights of the decision-makers can represent these differences in expertise levels. There are two basic weighting procedures in the literature: subjective methods consider individual judgments on the attribute or expert weights, while objective methods only work with the scores of the alternatives which are evaluated with respect to the attributes. The former needs DMs' assessments, especially for determining the attribute weights, and a group self-assessment process that can be performed by the members of the decision group for determining the DMs' weights in group MADM. The latter does not require any additional information or evaluation other than the alternative scores because it does not consider the preferences or judgments of experts. In the literature, some drawbacks are specified for subjective weighting, such as risks originating from self-seeking DMs or the long time periods required for data collection from DMs. To overcome them, objective weighting methods have been developed. In this study, the aim is to make an overview of the objective weighting of attributes and DMs in N-MADM. After an extensive research and filtering process


on the Web of Science scientific database, 26 papers dealing with objective weighting issues in the neutrosophic environment were found. Through a detailed analysis, the brief inferences given below are made. These findings can provide guidance for future research on objective decision-making.

It is found that 25 papers proposed an objective attribute weighting method, and eight different approaches were found (Table 1). The most utilized one is Maximizing Deviation, used by 11 papers (44%). This approach considers the deviation within the performance scores of the alternatives for each attribute. The basic idea is that when an attribute generates little deviation over all alternatives, this attribute possesses a limited importance for the whole process; conversely, when very clear differences are shown by an attribute, it can be stated that this attribute has to be considered as one of the leading attributes in obtaining the most desired alternative. Entropy takes the second place with five papers (20%); a detailed explanation is given in Sect. 4. Table 1 presents the other seven different methods.

Deriving DMs' weights constitutes a relatively new domain in the group MADM field. In the N-MADM literature, while 13 papers used a weighting procedure for experts, only 3 of them performed an objective procedure (Table 2). TOPSIS-based approaches were used by two papers (Pouresmaeil et al. 2017; Peng et al. 2018), and one of them used the variation coefficient method simultaneously (Peng et al. 2018). Mondal and Pramanik (2014) utilized a hybrid score-function-based approach. While five papers accepted the DMs' weights as given, four papers appealed to qualitative group consensus. As a result, it is seen that the literature requires more objective methodologies.

As the third part of the survey, 14 MADM approaches were found in N-MADM (Table 3). The most used ones are TOPSIS with 9 papers (36%) and GRA with 7 (28%). MULTIMOORA, ELECTRE, Cognitive Maps, DEMATEL, AHP, QUALIFLEX, EDAS, MABAC, etc. are other methods which were neutrosophicated. Among the three papers that considered the weights of DMs, Pouresmaeil et al. (2017) selected VIKOR, Peng et al. (2018) chose an integration of TOPSIS and QUALIFLEX, and Mondal and Pramanik (2014) used a score function-based method. In terms of the two most used objective attribute weighting methods, we can make an additional inference. Among the 11 articles using Maximizing Deviation weighting, Chi and Liu (2013), Zhang and Wu (2014), Broumi et al. (2015), Pramanik et al. (2015), Dey et al. (2016a), and Peng et al. (2018) used TOPSIS. Among the five articles using Entropy weighting, Biswas et al. (2014a), Pramanik and Mondal (2015), and Mondal and Pramanik (2015a) utilized GRA. It is seen that the literature needs different methodological perspectives.

In the N-MADM literature, the studies used various types of information as instances of NS (Table 4). Ten papers (40%) worked with SVNS, and six papers (24%) utilized IVNS. The other types are linguistic, simplified, bi-polar, rough, etc.; their usage rates range between 3% and 8%.

Table 1 Papers proposing objective attribute weighting (rows: the 25 reviewed papers; columns: Maximizing deviation, Entropy, Mean-squared deviation, TOPSIS, Variation coefficient, Grey system, Multi-objective optimization, Correlation coefficient and standard deviation)


Table 2 Papers dealing with DMs' weights (columns: Qualitative group consensus, TOPSIS, Variation coefficient, Hybrid score function, Given; rows: Mondal and Pramanik 2014; Pouresmaeil et al. 2017; Tian et al. 2016; Şahin and Liu 2016; Tian et al. 2017; Liang et al. 2017; Xiong and Cheng 2018; Liang et al. 2018; Peng and Dai 2018b; Peng et al. 2018; Ji et al. 2018; Abdel-Basset et al. 2019)

Each paper performed an application of its proposed methodology, but only four of them (Liang et al. 2017, 2018; Ji et al. 2018; Abdel-Basset et al. 2019) worked on real problems (15%). The remaining 22 papers (88%) only introduce illustrative applications. It is obvious that the potential of NSs to be applicable in real-life problems still needs to be demonstrated.

In the literature, it is seen that:

• Entropy and Maximizing Deviation objective weighting methods are common in the N-MADM literature. We chose to focus on the entropy-based methodology because there is no proposition on the usage of this method for IVNS.
• The literature needs more propositions for the weighting of DMs. The entropy-based methodology can be used for this purpose.
• TOPSIS and GRA have the majority in the N-MADM literature. Different methodologies should be neutrosophicated to extend the field.
• There are many studies working with SVNS. So, there is a need to improve the methodologies that work with IVNS, rough, simplified, or any other kind of neutrosophic sets. We preferred to propose a weighting and comparison procedure for IVNS.
• Numerical examples are common in the literature, but researchers should apply the N-MADM approaches to real industry cases. After sufficient and adequate improvement, the real-life application of the proposed procedure will be in our future research agenda.

Table 3 MADM methods used in the reviewed N-MADM papers (rows: the 26 reviewed papers; columns: DEMATEL, AHP, TOPSIS, VIKOR, QUALIFLEX, Cognitive maps, MABAC, GRA, MULTIMOORA, EDAS, ELECTRE, Entropy-weighted ranking, Score function, Weighted average)

Table 4 Neutrosophic set types used in the reviewed papers (rows: the 26 reviewed papers; columns: Single-valued NS, Interval-valued NS, Linguistic NS, Simplified NS, Simplified linguistic NS, Single-valued neutrosophic soft sets, Interval neutrosophic uncertain linguistic variables, Bi-polar NS, Rough NS, Probability multi-valued NS)


4 Entropy-Based Objective Attribute Weighting in N-MADM "Entropy," developed and introduced by Rudolf Clausius in the nineteenth century, describes the degree of uniformity of energy distributed in space. Shannon carried this concept into information science in order to measure the dispersion within a dataset (Shannon and Weaver 1947) and recommended entropy as a measure of ambiguity in an information problem (Shemshadi et al. 2011). In the decision-making domain, entropy is an efficient mathematical device for quantifying uncertain alternative information while determining the weights of attributes. Entropy-based objective weighting is based on this idea: an attribute will have a higher importance if a greater dispersion occurs in the evaluations of the alternatives. According to this definition, the dispersion of the data under the same attribute can be a measure of its importance. There are many entropy measurement approaches in the N-MADM literature, and selected examples are given below. First, it is helpful to summarize Shannon's entropy approach (Liu et al. 2015).

Step 1: Normalization of the alternatives' scores for each attribute:

p_ij = x_ij / Σ_{i=1}^{m} x_ij    (6)

where p_ij is the normalized performance score of alternative i with respect to attribute j and m is the number of alternatives.

Step 2: Calculation of the entropy En_j for each attribute j:

En_j = −(1/ln m) Σ_{i=1}^{m} p_ij ln p_ij    (7)

Step 3: Definition of the divergence through div_j = 1 − En_j, where div_j is the divergence degree of the intrinsic information of attribute j.

Step 4: Determination of the objective attribute weights as follows:

w_j = div_j / Σ_j div_j    (8)
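As a small, hedged illustration of Steps 1–4, the sketch below computes Shannon-entropy-based objective weights for a crisp score matrix; the function name and the numerical scores are made up for the example and are not data from the chapter.

```python
# A minimal sketch of Shannon-entropy-based objective weighting (Eqs. 6-8) for a
# crisp score matrix with rows = alternatives and columns = attributes.
import numpy as np

def shannon_entropy_weights(scores: np.ndarray) -> np.ndarray:
    m = scores.shape[0]                          # m = number of alternatives
    p = scores / scores.sum(axis=0)              # Eq. (6): column-wise normalization
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(p > 0, p * np.log(p), 0.0)
    en = -plogp.sum(axis=0) / np.log(m)          # Eq. (7): entropy of each attribute
    div = 1.0 - en                               # Step 3: divergence degrees
    return div / div.sum()                       # Eq. (8): normalized weights

scores = np.array([[7.0, 3.0, 9.0],
                   [5.0, 8.0, 4.0],
                   [6.0, 6.0, 5.0]])
print(shannon_entropy_weights(scores))           # one weight per attribute, summing to 1
```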

In the following application section, seven different entropy-based attribute weighting methodologies are used and evaluated. Before introducing them, Aydoğdu (2015)'s definition of entropy for IVNSs, which is based on the entropy measurement proposition for interval-valued intuitionistic fuzzy sets (Wei et al. 2015), is given, because any entropy measure should satisfy these axioms.


Definition 6 Let N(X) be the set of all IVNSs on X and A ∈ N(X). An entropy on IVNSs is a function En: N(X) → [0, 1] which satisfies the following axioms:

1. En(A) = 0 if A is a crisp set.
2. En(A) = 1 if [T_A^l(x), T_A^u(x)] = [F_A^l(x), F_A^u(x)] and I_A^l(x) = I_A^u(x) for all x ∈ X.
3. En(A) = En(A^C) for all A ∈ N(X), where A^C is the complement of A.
4. En(A) ≥ En(B) if B is a subset of A when I_A^u(x) − I_A^l(x) < I_B^u(x) − I_B^l(x) for all x ∈ X.

Now, we can introduce the entropies considered in this study. It has been proved that the entropies developed by the various scholars satisfy these axiomatic requirements.

Definition 7 (Ye and Du 2019) Let D_k (k = 1, 2, 3, 4) be four different distance measures; then, for any A ∈ N(X), Y_k(A) = En_k(A) = 1 − 2D_k for k = 1, 2, 3, 4 is an entropy measure of the IVNS A. The Y_k(A) measures are given below.

Y1(A) = 1 − 2D1 = 1 − (1/(3n)) Σ_{j=1}^{n} [ |T_A^l(x_j) − 0.5| + |T_A^u(x_j) − 0.5| + |I_A^l(x_j) − 0.5| + |I_A^u(x_j) − 0.5| + |F_A^l(x_j) − 0.5| + |F_A^u(x_j) − 0.5| ]    (9)

Y2(A) = 1 − 2D2 = 1 − 2 { (1/(6n)) Σ_{j=1}^{n} [ (T_A^l(x_j) − 0.5)^2 + (T_A^u(x_j) − 0.5)^2 + (I_A^l(x_j) − 0.5)^2 + (I_A^u(x_j) − 0.5)^2 + (F_A^l(x_j) − 0.5)^2 + (F_A^u(x_j) − 0.5)^2 ] }^{1/2}    (10)

Y3(A) = 1 − 2D3 = 1 − (2/(3n)) Σ_{j=1}^{n} [ max(|T_A^l(x_j) − 0.5|, |T_A^u(x_j) − 0.5|) + max(|I_A^l(x_j) − 0.5|, |I_A^u(x_j) − 0.5|) + max(|F_A^l(x_j) − 0.5|, |F_A^u(x_j) − 0.5|) ]    (11)

Y4(A) = 1 − 2D4 = 1 − (2/n) Σ_{j=1}^{n} max{ (1/2)(|T_A^l(x_j) − 0.5| + |T_A^u(x_j) − 0.5|), (1/2)(|I_A^l(x_j) − 0.5| + |I_A^u(x_j) − 0.5|), (1/2)(|F_A^l(x_j) − 0.5| + |F_A^u(x_j) − 0.5|) }    (12)


In each formula, there is a distance consideration: the normalized Hamming distance in Eq. (9) and the normalized Euclidean distance in Eq. (10). Ye and Du (2019) developed the last two distance measures in Eqs. (11) and (12) by considering the normalized Hamming distance generated by the Hausdorff metric.

Definition 8 (Majumder and Samanta 2014) The entropy of an IVNS A is as follows for all x ∈ X:

Y5(A) = 1 − (1/(2n)) Σ_{j=1}^{n} [ (T_A^l(x_j) + F_A^l(x_j)) |I_A^l(x_j) − (I_A^l(x_j))^c| + (T_A^u(x_j) + F_A^u(x_j)) |I_A^u(x_j) − (I_A^u(x_j))^c| ]    (13)

where the superscript c denotes the complement of the corresponding indeterminacy degree, i.e., (I)^c = 1 − I.

Definition 9 (Aydoğdu 2015) The entropy of an IVNS A is as follows for all x ∈ X:

Y6(A) = (1/n) Σ_{j=1}^{n} [ (2 − |T_A^l(x_j) − F_A^l(x_j)| − |T_A^u(x_j) − F_A^u(x_j)| − |I_A^l(x_j) − I_A^u(x_j)|) / (2 + |T_A^l(x_j) − F_A^l(x_j)| + |T_A^u(x_j) − F_A^u(x_j)| + |I_A^l(x_j) − I_A^u(x_j)|) ]    (14)

Aydoğdu (2015) developed this entropy by extending the entropy measure developed by Wei et al. (2015) for interval-valued intuitionistic fuzzy sets.

Definition 10 (Ye and Cui 2018) Let A be an IVNS in a universal set X. Then, its exponential entropy measure is formulated as follows:

Y7(A) = (1/(6n(√e − 1))) Σ_{j=1}^{n} [ T_A^l(x_j) e^{1−T_A^l(x_j)} + (1 − T_A^l(x_j)) e^{T_A^l(x_j)} − 1
        + I_A^l(x_j) e^{1−I_A^l(x_j)} + (1 − I_A^l(x_j)) e^{I_A^l(x_j)} − 1
        + F_A^l(x_j) e^{1−F_A^l(x_j)} + (1 − F_A^l(x_j)) e^{F_A^l(x_j)} − 1
        + T_A^u(x_j) e^{1−T_A^u(x_j)} + (1 − T_A^u(x_j)) e^{T_A^u(x_j)} − 1
        + I_A^u(x_j) e^{1−I_A^u(x_j)} + (1 − I_A^u(x_j)) e^{I_A^u(x_j)} − 1
        + F_A^u(x_j) e^{1−F_A^u(x_j)} + (1 − F_A^u(x_j)) e^{F_A^u(x_j)} − 1 ]    (15)

Ye and Cui (2018) extended the fuzzy exponential entropy introduced by Pal and Pal (1989). Now, we can clarify the proposition of using the entropy measures for objective attribute weighting in N-MADM problems. There are two general steps, given below. Note that our proposition is made for IVNSs, but it can be extended to other kinds of performance score representations such as SVNS, rough NS, and bi-polar NS.

Step 1. Calculation of the entropy for each attribute by considering the alternative evaluation scores or linguistic terms that were assigned by the DMs for this attribute.


One of Eqs. (9)–(15) can be used. Depending on the DMs' preferences, any other kind of entropy measure may also be chosen; for the current study, we considered only the given entropy measures for comparison purposes. En_j represents the entropy measure of attribute j.

Step 2. Determination of the objective weights by normalizing the divergence values of the attributes (div_j = 1 − En_j). For simplicity, the weight formulation in Eq. (8) is rewritten by replacing the div_j term with 1 − En_j:

w_j = (1 − En_j) / Σ_{j=1}^{m} (1 − En_j)    (16)
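To make the two-step procedure concrete, the sketch below implements one of the entropy options, Eq. (14), together with the divergence-based weight normalization of Eq. (16); the function names and the encoding of an IVNS evaluation as ((Tl, Tu), (Il, Iu), (Fl, Fu)) are choices of this sketch, not notation from the chapter.

```python
# A minimal sketch, assuming each IVNS evaluation is encoded as ((Tl, Tu), (Il, Iu), (Fl, Fu)).
def entropy_y6(column):
    # Eq. (14): one term per alternative, averaged over the n alternatives in the column.
    total = 0.0
    for (tl, tu), (il, iu), (fl, fu) in column:
        d = abs(tl - fl) + abs(tu - fu) + abs(il - iu)
        total += (2.0 - d) / (2.0 + d)
    return total / len(column)

def entropy_based_weights(columns, entropy_fn):
    # Step 1: entropy of every attribute column; Step 2 / Eq. (16): normalize the divergences.
    divergences = [1.0 - entropy_fn(col) for col in columns]
    total = sum(divergences)
    return [d / total for d in divergences]

# Any other entropy function built from Eqs. (9)-(15) can be passed in the same way, e.g.:
# weights = entropy_based_weights([column_A1, column_A2, column_A3], entropy_y6)
```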

The pairwise differences between the methodologies based on the different entropy measures are tested by conducting a nonparametric test, the Wilcoxon Signed-Rank Test. This test is appropriate for comparing data sets (here, weight sets) that do not satisfy the assumptions required by parametric statistical tests, such as normality or sample size. Nonparametric procedures are useful when there are serious doubts about these assumptions. The literature has demonstrated that nonparametric tests have impressive outcomes in identifying population differences when the assumptions are not satisfied (Mendenhall et al. 1986). The Wilcoxon Signed-Rank Test uses the signs of the differences between samples and also their magnitudes (how large they are). Generally, the null hypothesis claims that the samples have come from populations following the same probability distribution. The associated null and alternative hypotheses are:

H0: There is no statistically significant difference between the variables.
H1: There is a statistically significant difference between the variables.

The test statistics are based on rankings of the differences between the pairs. There are eight steps, as follows (Triola 1997). The pairs of data are the importance weights in the current MADM problem.

Step 1. For each pair of weights, compute the difference d by subtracting the second weight from the first one. Record the signs, and ignore any pairs for which d = 0.
Step 2. Neglect the signs, then rank the d values in increasing order. When d values are the same, the mean rank of the tie is assigned.
Step 3. Attach to each rank the sign of the difference from which it came; that is, insert those signs.
Step 4. Find the sum of the absolute values of the negative ranks and the sum of the positive ranks.
Step 5. Represent the smaller of the two sums by T.
Step 6. Let n be the number of attributes.


Step 7. Compute the test statistic and the critical value based on the sample size, as shown in Eq. (17):

Test statistic = T,  for n ≤ 30;    z = (T − n(n+1)/4) / sqrt( n(n+1)(2n+1)/24 ),  for n > 30    (17)

Critical values can be found from appropriate statistical tables. If n ≤ 30, critical T value is found from a dedicated table for the Wilcoxon Signed-Rank Test by considering the significance level and n. For n > 30, critical z value is found from the standard normal distribution table. Step 8. When reaching a conclusion, reject H0 if the sample data has a test statistic in the critical region, i.e., the test statistic is less than or equal to the critical value. Otherwise, fail to reject it. This test can only be used for matched data. It is appropriate while comparing the weight sets generated by different entropy-based weighting formulas because the weights are matched in terms of attributes—that is, the different weights of the same attribute will be compared with the aim of discovering the possible significant differences between entropy-based weighting formulas.
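For readers who prefer a scripted check over SPSS, the brief sketch below runs the same kind of pairwise comparison with SciPy's Wilcoxon signed-rank test; the two weight vectors are placeholders, and SciPy's handling of ties and zeros may differ from a manual or SPSS computation, so this is only an illustration of the test, not a reproduction of the results reported later.

```python
# A hedged sketch of comparing two attribute-weight sets with the Wilcoxon
# signed-rank test; the weight values below are placeholders, not chapter data.
from scipy.stats import wilcoxon

weights_method_a = [0.36, 0.34, 0.30]
weights_method_b = [0.345, 0.330, 0.325]

statistic, p_value = wilcoxon(weights_method_a, weights_method_b)
print(statistic, p_value)  # fail to reject H0 when p_value exceeds the significance level
```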

5 Application For illustration purposes, the entropy-based objective weighting proposition for IVNSs is used on two different hypothetical data sets which were taken from other studies. Data set 1 was directly taken from Lupianez (2009). Data set 2 was originally built by Kulak and Kahraman (2005), but the neutrosophicated version involving IVNSs was developed by Kour et al. (2014). The details of the MADM problems involving these hypothetical data sets are explained in the associated subsections.

5.1 Data Set 1 A firm has to make a decision about project selection. It determined four potential projects: automotive (AU), meal (ME), software (SO), and defense (DE). The decision will be reached by considering three different attributes: the potential risk (A1), the sustainable growth (A2), and the impact on nature (A3). The four projects are evaluated by the DMs using IVNSs, and the performance scores are determined as given in Table 5.


Table 5 Data set 1

     A1                              A2                              A3
     Tl   Tu   Il   Iu   Fl   Fu     Tl   Tu   Il   Iu   Fl   Fu     Tl   Tu   Il   Iu   Fl   Fu
AU   0.4  0.5  0.2  0.3  0.3  0.4    0.4  0.6  0.1  0.3  0.2  0.4    0.7  0.9  0.2  0.3  0.4  0.5
ME   0.6  0.7  0.1  0.2  0.2  0.3    0.6  0.7  0.1  0.2  0.2  0.3    0.3  0.6  0.3  0.5  0.8  0.9
SO   0.3  0.6  0.2  0.3  0.3  0.4    0.5  0.6  0.2  0.3  0.3  0.4    0.4  0.5  0.2  0.4  0.7  0.9
DE   0.7  0.8  0    0.1  0.1  0.2    0.6  0.7  0.1  0.2  0.1  0.3    0.6  0.7  0.3  0.4  0.8  0.9

To give an example, the calculation detail of the first entropy measure, Y1(Aj), is given as follows (j = 1, 2, 3). The scores of the alternatives with respect to A1 can be given as an NS:

A1 = {<[0.4, 0.5], [0.2, 0.3], [0.3, 0.4]>, <[0.6, 0.7], [0.1, 0.2], [0.2, 0.3]>, <[0.3, 0.6], [0.2, 0.3], [0.3, 0.4]>, <[0.7, 0.8], [0, 0.1], [0.1, 0.2]>}

The first entropy measure for the first attribute, Y1(A1) in Eq. (9), is computed as given below:

Y1(A1) = En1 = 1 − (1/(3·4)) [|0.4 − 0.5| + |0.5 − 0.5| + |0.2 − 0.5| + |0.3 − 0.5| + |0.3 − 0.5| + |0.4 − 0.5| + |0.6 − 0.5| + |0.7 − 0.5| + · · · + |0.1 − 0.5| + |0.2 − 0.5|] = 0.5333

By using the same equation, En2 and En3 are found to be 0.5667 and 0.6000, respectively. Then, the weight formula given in Eq. (16) is used to calculate the objective weights of the three attributes as follows:

w1 = (1 − 0.5333) / [(1 − 0.5333) + (1 − 0.5667) + (1 − 0.6000)] = 0.3590
w2 = (1 − 0.5667) / [(1 − 0.5333) + (1 − 0.5667) + (1 − 0.6000)] = 0.3333
w3 = (1 − 0.6000) / [(1 − 0.5333) + (1 − 0.5667) + (1 − 0.6000)] = 0.3077
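The hand calculation above can also be checked with a short script. The sketch below encodes Table 5, applies Eq. (9) and Eq. (16), and should print weights matching the values reported here up to rounding; the function and variable names are illustrative choices of this sketch.

```python
# A sketch reproducing the Y1-based weights for data set 1 (Table 5); each IVNS
# evaluation is encoded as ((Tl, Tu), (Il, Iu), (Fl, Fu)).
def entropy_y1(column):
    # Eq. (9): normalized Hamming distance of every membership bound from 0.5.
    total = 0.0
    for triplet in column:
        for low, up in triplet:
            total += abs(low - 0.5) + abs(up - 0.5)
    return 1.0 - total / (3.0 * len(column))

data_set_1 = {
    "A1": [((0.4, 0.5), (0.2, 0.3), (0.3, 0.4)), ((0.6, 0.7), (0.1, 0.2), (0.2, 0.3)),
           ((0.3, 0.6), (0.2, 0.3), (0.3, 0.4)), ((0.7, 0.8), (0.0, 0.1), (0.1, 0.2))],
    "A2": [((0.4, 0.6), (0.1, 0.3), (0.2, 0.4)), ((0.6, 0.7), (0.1, 0.2), (0.2, 0.3)),
           ((0.5, 0.6), (0.2, 0.3), (0.3, 0.4)), ((0.6, 0.7), (0.1, 0.2), (0.1, 0.3))],
    "A3": [((0.7, 0.9), (0.2, 0.3), (0.4, 0.5)), ((0.3, 0.6), (0.3, 0.5), (0.8, 0.9)),
           ((0.4, 0.5), (0.2, 0.4), (0.7, 0.9)), ((0.6, 0.7), (0.3, 0.4), (0.8, 0.9))],
}

entropies = {a: entropy_y1(col) for a, col in data_set_1.items()}   # 0.5333, 0.5667, 0.6000
divergences = {a: 1.0 - en for a, en in entropies.items()}
total = sum(divergences.values())
weights = {a: round(d / total, 4) for a, d in divergences.items()}
print(weights)   # approximately {'A1': 0.359, 'A2': 0.3333, 'A3': 0.3077}
```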

These weights are found by using the first entropy measure. Table 6 summarizes all the weight sets determined by using different entropy measures. The columns depict the entropy measure number as Yk (k = 1, . . . , 7) and rows represent the attributes A1 , A2 , and A3 .


Table 6 Different weight sets for the attributes of data set 1

     Y1      Y2      Y3      Y4      Y5      Y6      Y7
A1   0.3590  0.3520  0.3434  0.3377  0.3644  0.3046  0.3387
A2   0.3333  0.3298  0.3333  0.3247  0.3461  0.3391  0.3322
A3   0.3077  0.3182  0.3232  0.3377  0.2895  0.3564  0.3291

The weight orders of the attributes for the different entropy measures are quite similar. As given below, five out of the seven approaches gave the same order. The fourth approach produced a different order by giving the same weight to the first and third attributes.

• For Y1, Y2, Y3, Y5, Y7: A1 ≻ A2 ≻ A3
• For Y4: A1 ~ A3 ≻ A2
• For Y6: A3 ≻ A2 ≻ A1

The most important difference belongs to the sixth approach since it gave the reverse order of the attributes. As seen in Table 6, the weights are very close to each other. However, possible significant pairwise differences between the approaches are tested by conducting a Wilcoxon test, and the results are shared in Sect. 5.3.

5.2 Data Set 2 Let us consider an international company which is in need of a freight transportation service. This company determined four potential transporters and called them B1, B2, B3, and B4. The attributes considered in the evaluation process are transportation cost (A1), defective rate (A2), lateness rate (A3), flexibility (A4), and documentation efficiency (A5). Kulak and Kahraman (2005) took A1, A2, and A3 as crisp ratio variables and A4 and A5 as linguistic variables. Kour et al. (2014) neutrosophicated these values by representing the evaluation scores as IVNSs. The aim is to find the best transporter company. The IVNS-based evaluation scores are given in Table 7. Due to page limitations, the details of the calculations are not presented. Let us go directly to the weights, which are summarized in Table 8. Here, there are four different orderings, which are depicted below. Only three approaches (the first, second, and seventh) gave the same order. The third one only changed the order of A3 and A4; the fourth and fifth approaches changed the order of A4 and A5. Similar to the findings of the analysis made on data set 1, the most different approach is again the sixth entropy measure, but it did not give a reverse order: it changed the order of A5 and A1, and of A3 and A4, when compared with the first ordering.

• For Y1, Y2, Y7: A1 ≻ A5 ≻ A4 ≻ A3 ≻ A2
• For Y3: A1 ≻ A5 ≻ A3 ≻ A4 ≻ A2
• For Y4, Y5: A1 ≻ A4 ≻ A5 ≻ A3 ≻ A2
• For Y6: A5 ≻ A1 ≻ A3 ≻ A4 ≻ A2


Table 7 Data set 2: the IVNS evaluation scores (Tl, Tu, Il, Iu, Fl, Fu) of the four transporter alternatives B1–B4 with respect to the five attributes A1–A5, as neutrosophicated by Kour et al. (2014)

Table 8 Different weight sets for the attributes of data set 2

     Y1      Y2      Y3      Y4      Y5      Y6      Y7
A1   0.2272  0.2212  0.2233  0.2248  0.3003  0.2232  0.2096
A2   0.1702  0.1776  0.1800  0.1708  0.1450  0.1533  0.1905
A3   0.1973  0.1942  0.1964  0.1831  0.1805  0.1964  0.1972
A4   0.1981  0.2018  0.1934  0.2113  0.1902  0.1935  0.2006
A5   0.2071  0.2052  0.2069  0.2101  0.1840  0.2337  0.2021

Another finding is regarding the order of attribute 1 (A1 ) and 2 (A2 ). A1 is mostly placed at the beginning of the orderings and A2 takes a constant place at the end of each ordering. So, it can be concluded that while A1 is the most important attribute of this N-MADM problem, A2 is the least important one.

5.3 Wilcoxon Signed-Rank Test Results Even though the weights are very close to each other for both data sets, a Wilcoxon Signed-Rank Test can be conducted to investigate possible significant differences between the weight sets determined by the different entropy-based objective weighting approaches. The methodological details were given in Sect. 4. Since seven entropy measures are compared, the number of pairwise comparisons is equal to (7 × 6)/2 = 21. Rather than applying the mathematical formulation for each comparison, we used the SPSS statistical software to conduct the test. The significance level was selected as 5%. In the tables, the test statistic is compared with the critical value, which is determined by considering the selected significance level. Tables 9 and 10 give the Wilcoxon test results for data sets 1 and 2. The critical T value is equal to 1 for both data sets when the significance level is 5%.

Table 9 Wilcoxon test results for data set 1

Comparison   Negative ranks   Positive ranks   T
Y2 − Y1      3                1.5              1.5
Y3 − Y1      1.5              3                1.5
Y4 − Y1      3                1.5              1.5
Y5 − Y1      1.5              3                1.5
Y6 − Y1      3                1.5              1.5
Y7 − Y1      3                1.5              1.5
Y3 − Y2      1.5              3                1.5
Y4 − Y2      2                1                1
Y5 − Y2      1.5              3                1.5
Y6 − Y2      3                1.5              1.5
Y7 − Y2      3                1.5              1.5
Y4 − Y3      3                1.5              1.5
Y5 − Y3      1.5              3                1.5
Y6 − Y3      3                1.5              1.5
Y7 − Y3      3                1.5              1.5
Y5 − Y4      1.5              3                1.5
Y6 − Y4      3                1.5              1.5
Y7 − Y4      3                1.5              1.5
Y6 − Y5      3                1.5              1.5
Y7 − Y5      3                1.5              1.5
Y7 − Y6      1.5              3                1.5

Table 10 Wilcoxon test results for data set 2

Comparison   Negative ranks   Positive ranks   T
Y2 − Y1      3.5              2.67             2.67
Y3 − Y1      3.5              2.67             2.67
Y4 − Y1      2.67             3.5              2.67
Y5 − Y1      4                2.33             2.33
Y6 − Y1      2.5              5                2.5
Y7 − Y1      2.67             3.5              2.67
Y3 − Y2      2.33             4                2.33
Y4 − Y2      2.5              5                2.5
Y5 − Y2      3.5              2.67             2.67
Y6 − Y2      2.5              5                2.5
Y7 − Y2      2.5              5                2.5
Y4 − Y3      5                2.5              2.5
Y5 − Y3      4                2.33             2.33
Y6 − Y3      2.5              5                2.5
Y7 − Y3      3.5              2.67             2.67
Y5 − Y4      3.5              2.67             2.67
Y6 − Y4      2.5              5                2.5
Y7 − Y4      2.25             2.75             2.25
Y6 − Y5      2.5              5                2.5
Y7 − Y5      2.67             3.5              2.67
Y7 − Y6      5                2.5              2.5

As seen in Tables 9 and 10, the T values of all pairwise comparisons for both data sets are higher than or equal to 1. So, the null hypothesis cannot be rejected for any of the comparisons; there is no significant difference between the approaches. All the different entropy-based weighting approaches give similar results, so we can use any of them in applications. It is noted that there are few attributes in the current data sets because of the extent of the current N-MADM problems. Applications involving more attributes may offer more comprehensive results regarding the differences among the entropy-based weighting approaches.

6 Conclusions and Future Researches The neutrosophic set is a relatively new member of the family of three-dimensional membership representations that have become widespread in recent years. In this study, we first made and shared a focused N-MADM literature survey concentrating on the objective weighting approaches for two basic weighting purposes: the weighting of attributes and the weighting of DMs. Maximizing Deviation and Entropy approaches are common in the literature for the determination of attribute weights. For the expert weights, there are few studies in the literature, so no general conclusions can be inferred. Another finding is about the common instances of neutrosophic sets: SVNS and IVNS. As a result of the literature survey, we chose to propose an entropy-based objective weighting procedure for attributes in the IVNS environment.

For the mentioned goal, we propose the usage of entropy measures for weighting attributes. Seven different entropy measures are considered in this study in order to find the most appropriate one and suggest it for general usage by MADM researchers. There are two steps in our proposition: first, the entropy of each attribute is found and transformed into a divergence value; then, the divergences are normalized. The normalized divergence values are the weights of the attributes. Different entropy measures can potentially give several weight sets for the same application, so they should be evaluated by a comparison. The Wilcoxon Signed-Rank Test is proposed to make this evaluation, and the SPSS software package is utilized for this purpose. To operationalize our proposition, we selected two data sets from the literature and performed the weighting procedure with them. The orderings of the attribute weights were determined and compared by the Wilcoxon test. The results show that there are no significant differences between the orderings, so the DM can choose to use any of them. They give very close attribute weights.

In future studies, we plan to begin with the real-life application of our proposed procedure, as the literature survey indicates. The findings should be tested and approved by conducting real-life applications on industrial decision-making cases involving many conflicting attributes. Secondly, it is required to develop more sophisticated and responsive entropy formulas. Finally, the literature needs more comprehensive scientific approaches for weighting DMs.


References Abdel-Basset M, Mohamed M, Chang V (2018) NMCDA: a framework for evaluating cloud computing services. Fut Gen Comput Syst 86:12–29 Abdel-Basset M, Gunasekaran M, Mohamed M, Chilamkurti N (2019) A framework for risk assessment, management and evaluation: economic tool for quantifying risks in supply chain. Fut Gen Comput Syst 90:489–502 Atanassov KT (1986) Intuitionistic fuzzy sets. Fuzzy Sets Syst 20(1):87–96 Aydo˘gdu A (2015) On entropy and similarity measure of interval valued neutrosophic sets. Neutrosophic Sets Syst 9:47–49 Biswas P, Pramanik S, Giri BC (2014a) Entropy based grey relational analysis method for multi attribute decision making under single valued neutrosophic assessments. Neutrosophic Sets Syst 2:102–110 Biswas P, Pramanik S, Giri BC (2014b) A new methodology for neutrosophic multi-attribute decision making with unknown weight information. Neutrosophic Sets Syst 3:42–50 Broumi S, Ye J, Smarandache F (2015) An extended TOPSIS method for multiple attribute decision making based on interval neutrosophic uncertain linguistic variables. Neutrosophic Sets Syst 8:22–31 Chi P, Liu P (2013) An extended TOPSIS method for the multiple attribute decision making problems based on interval neutrosophic set. Neutrosophic Sets Syst 1:63–70 Dey PP, Pramanik S, Giri BC (2015) An extended grey relational analysis based interval neutrosophic multi attribute decision making for weaver selection. J New Theory 9:82–93 Dey PP, Pramanik S, Giri BC (2016a) An extended grey relational analysis based multiple attribute decision making in interval neutrosophic uncertain linguistic setting. Neutrosophic Sets Syst 11:21–30 Dey PP, Pramanik S, Giri BC (2016b) Neutrosophic soft multi-attribute decision making based on grey relational projection method. Neutrosophic Sets Syst 11:98–106 Gül S, Kabak Ö, Topcu Y˙I (2018) An OWA operator-based cumulative belief degrees approach for credit rating. Int J Intell Syst 33:998–1026 ˙Inan UH, Gül S, Yılmaz H (2017) A multiple attribute decision model to compare the firms’ occupational health and safety management perspectives. Safety Sci 91:221–231 Ji P, Zhang HY, Wang JQ (2018) Selecting an outsourcing provider based on the combined MABAC-ELECTRE method using single-valued neutrosophic linguistic sets. Comp Ind Eng 120:429–441 Koksalmis E, Kabak Ö (2019) Deriving decision makers’ weights in group decision making: an overview of objective methods. Inf Fusion 49:146–160 Kour D, Mukherjee S, Basu K (2014) Multi-attribute decision making problem for transportation companies using entropy weights-based correlation coefficients and TOPSIS method under interval-valued intuitionistic fuzzy environment. Int J of Comp Appl Math 9(2):127–138 Kulak O, Kahraman C (2005) Fuzzy multi-attribute selection among transportation companies using axiomatic design and analytic hierarchy process. Inf Sci 170(2–4):191–210 Liang R, Wang J, Zhang H (2017) Evaluation of e-commerce websites: an integrated approach under a single-valued trapezoidal neutrosophic environment. Knowl-Based Syst 135:44–59 Liang WZ, Zhao GY, Hong CS (2018) Performance assessment of circular economy for phosphorus chemical firms based on VIKOR-QUALIFLEX method. J Clean Prod 196:1365–1378 Liu HC, You JX, You XY, Shan MM (2015) A novel approach for failure mode and effects analysis using combination weighting and fuzzy VIKOR method. Appl Soft Comput 28:579–588 Lupianez FG (2009) Interval neutrosophic sets and topology. 
Kybernetes 38(3/4):621–624 Majumder P, Samanta SK (2014) On similarity and entropy of neutrosophic sets. J Intell Fuzzy Syst 26(3):1245–1252 Malhotra R, Malhotra DK (2002) Differentiating between good credits and bad credits using neurofuzzy systems. Eur J Oper Res 136(1):190–211


Mendenhall W, Reinmuth JE, Beaver RJ (1986) Statistics for management and economics, 5th edn. PWS, Boston Mondal K, Pramanik S (2014) Multi-criteria group decision making approach for teacher recruitment in higher education under simplified neutrosophic environment. Neutrosophic Sets Syst 6:28–34 Mondal K, Pramanik S (2015a) Rough neutrosophic multi-attribute decision-making based on grey relational analysis. Neutrosophic Sets Syst 7:8–17 Mondal K, Pramanik S (2015b) Neutrosophic decision making model for clay-brick selection in construction field based on grey relational analysis. Neutrosophic Sets Syst 9:64–71 Nirmal NP, Bhatt MG (2016) Selection of automated guided vehicle using single valued neutrosophic entropy based novel multi attribute decision making technique. In: Smarandache F, Pramanik S (eds) New trends in neutrosophic theory and applications, Pons Editions, Brussels, p105–114 Pal NR, Pal SK (1989) Object background segmentation using new definitions of entropy. IEE Proc E – Comput Dig Tech 136(4):284–295 Peng X, Dai J (2018a) A bibliometric analysis of neutrosophic set: two decades review from 1998 to 2017. Artif Intell Rev 1–57 Peng X, Dai J (2018b) Approaches to single-valued neutrosophic MADM based on MABAC, TOPSIS and new similarity measure with score function. Neural Comp Appl 29(10):939–954 Peng X, Dai J, Yuan H (2017) Interval-valued fuzzy soft decision making methods based on MABAC, similarity measure and EDAS. J Intell Fuzzy Syst 152(4):373–396 Peng X, Zhang HY, Wang JQ (2018) Probability multi-valued neutrosophic sets and its application in multi-criteria group decision-making problems. Neural Comp Appl 30(2):563–583 Pouresmaeil H, Shivanian E, Khorram E, Fathabadi HS (2017) An extended method using TOPSIS and VIKOR for multiple attribute decision making with multiple decision makers and single valued neutrosophic numbers. Adv Appl Statist 4:261–292 Pramanik S, Mondal K (2015) Interval neutrosophic multi-attribute decision-making based on grey relational analysis. Neutrosophic Sets Syst 9:13–22 Pramanik S, Dey PP, Giri BC (2015) TOPSIS for single valued neutrosophic soft expert set based multi-attribute decision making problems. Neutrosophic Sets Syst 10:88–95 Russell B (1923) Vagueness. Australas J Psych Philos 1:84–92 Sahin ¸ R, Liu P (2016) Maximizing deviation method for neutrosophic multiple attribute decision making with incomplete weight information. Neural Comput Appl 27(7):2017–2029 Shannon CE, Weaver W (1947) The mathematical theory of communication. The University of Illinois Press, Urbana Shemshadi A, Shirazi H, Toreihi M, Tarokh MJ (2011) A fuzzy VIKOR method for supplier selection based on entropy measure for objective weighting. Expert Syst Appl 38:12160–12167 Smarandache F (1998) Neutrosophy: neutrosophic probability, set, and logic. American Research Press, Rehoboth Sodenkamp MA, Tavana M, Di Caprio D (2018) An aggregation method for solving group multicriteria decision-making problems with single-valued neutrosophic sets. Appl Soft Comput 71:715–727 Tian ZP, Zhang HY, Wang JQ, Chen XH (2016) Multi-criteria decision-making method based on a cross-entropy with interval neutrosophic sets. Int J Syst Sci 15:3598–3608 Tian ZP, Wang J, Wang JQ, Zhang HY (2017) An improved MULTIMOORA approach for multicriteria decision-making based on interdependent inputs of simplified neutrosophic linguistic information. Neural Comput Appl 28(Suppl 1):585–597 Triola MF (1997) Elementary statistics, 7th edn. 
Pearson/Addison Wesley, Boston Tzeng GH, Huang JJ (2011) Multiple attribute decision making: methods and applications. CRC Press, Boca Raton Wang H, Smarandache F, Zhang YQ, Sunderraman R (2005) Interval neutrosophic sets and logic: theory and applications in computing. Hexis, Phoenix


Wang H, Smarandache F, Zhang YQ, Sunderraman R (2010) Single valued neutrosophic sets. In: Smarandache F (ed) Multispace & multistructure. Neutrosophic transdisciplinarity, vol IV. North-European Scientific, Hanko, pp 410–413 Wei CP, Wang P, Zhang YZ (2015) Entropy, similarity measure of interval valued intuitionistic sets and their applications. Inf Sci 181(19):4273–4286 Xiong W, Cheng J (2018) A novel method for determining the attribute weights in the multiple attribute decision-making with neutrosophic information through maximizing the generalized single-valued neutrosophic deviation. Information 9(6):137 Ye J, Cui W (2018) Exponential entropy for simplified neutrosophic sets and its application in decision making. Entropy 20:357 Ye J, Du SG (2019) Some distances, similarity and entropy measures for interval-valued neutrosophic sets and their relationship. Int J Mach Learn Cybern 10(2):347–355 Zadeh LA (1965) Fuzzy sets. Inf Control 8(3):338–353 Zhang Z, Wu C (2014) A novel method for single-valued neutrosophic multi-criteria decision making with incomplete weight information. Neutrosophic Sets Syst 4:35–49

Implementation of Cumulative Belief Degree Approach to Group Decision-Making Problems Under Hesitancy

Nurullah Güleç and Özgür Kabak

Abstract In multi-criteria group decision-making (GDM) problems, decision-makers (DMs) may have hesitancy in their evaluations when assigning linguistic terms. Recently, the use of Hesitant Fuzzy Linguistic Term Sets (HFLTSs) has increased rapidly due to the flexibility they provide to DMs for representing the hesitancy in their evaluations. Using HFLTSs, DMs can assign more than one linguistic term in their evaluations. There are many studies in the literature for solving GDM problems with HFLTSs. However, when HFLTSs and other evaluation formats such as direct value assignment, classical fuzzy sets, linguistic terms, etc. are used in the same problem, the methods in the literature are limited. The Cumulative Belief Degree (CBD) approach, based on fuzzy linguistic terms and the belief structure, is a solution method applied in many complex multi-criteria GDM problems under different assessment methods. In this study, a multi-criteria GDM problem in which the evaluations are provided in different evaluation formats including HFLTSs is considered. A method based on the CBD approach has been developed. Specifically, the transformation formula for converting HFLTSs to CBDs is proposed. The proposed method has been applied to a sales manager selection problem from the literature. The method has been shown to be convenient for GDM problems with HFLTSs.

N. Güleç
Faculty of Engineering and Natural Sciences, Industrial Engineering Department, Ankara Yildirim Beyazit University, Ankara, Turkey
e-mail: [email protected]
Ö. Kabak ()
Faculty of Management, Industrial Engineering Department, Istanbul Technical University, Istanbul, Turkey
e-mail: [email protected]
© Springer Nature Switzerland AG 2021
Y. I. Topcu et al. (eds.), Multiple Criteria Decision Making, Contributions to Management Science, https://doi.org/10.1007/978-3-030-52406-7_14


1 Introduction As the complexity of decision-making problems increases, applying new methods both in identifying and in solving problems is becoming necessary. In group decision-making (GDM) problems, alternatives are evaluated by more than one decision-maker (DM). Alternatives in the feasible set are ranked, sorted, or selected according to the DMs' evaluations based on the assessments of multiple quantitative or qualitative criteria (Wan et al. 2017). There are many studies in the literature about GDM problems (Wan et al. 2017; Galo et al. 2018; Liu and Rodríguez 2014). Uncertainty and hesitancy in DMs' evaluations are among the fundamental issues in multi-criteria decision-making problems. Some of the reasons for this issue are the nature of the judgment, the vagueness intrinsic to the evaluation of qualitative criteria, and the complete lack of information (Galo et al. 2018). To increase the quality of the evaluations and improve the quality of the solution, the flexibility in the assessments of DMs should be represented appropriately.

Along with the complexity and uncertainty of decision-making problems, real numbers have become incapable of fully reflecting the hesitancy and uncertainty in decisions. In some cases, using exact numerical values to express preferences may be difficult for DMs. In such a case, a more appropriate way to express DMs' assessments may be utilizing linguistic information (Song and Hu 2017). In 1975, Zadeh (1975) proposed the fuzzy linguistic approach to represent evaluations as linguistic variables. In this framework, the values of a linguistic variable are represented by single or simple linguistic terms (Zadeh 1975). Using linguistic information instead of quantitative values in the decision-making evaluation process helps to reduce uncertainty and complexity. However, using linguistic terms is not adequate to represent humans' more comprehensive cognition (Liao and Xu 2015). It is not easy for DMs to provide just a single term to evaluate alternatives due to the complexity and uncertainty. To deal with this issue, Torra (2010) proposed the hesitant fuzzy set (HFS), which allows DMs to consider several hesitant values simultaneously to present an evaluation.

In decision-making problems defined in a linguistic context with a high degree of uncertainty, DMs might hesitate among different linguistic terms and need richer linguistic expressions to express their assessments (Liu and Rodríguez 2014). Furthermore, single fuzzy terms have become insufficient to evaluate different criteria (Wan et al. 2017). In complex problems with uncertainty or a lack of information, it is difficult for DMs to express their opinions using a single linguistic term. In such problems, DMs may be hesitant between more than one linguistic term and may need more expressions to express themselves better (Zhang et al. 2018). Rodriguez et al. (2011) extended HFS to hesitant fuzzy linguistic term sets (HFLTSs) to let DMs express themselves better and to overcome their hesitancy due to the uncertainty in problems. An HFLTS is defined as an ordered finite subset of consecutive linguistic terms. Using HFLTSs, the hesitancy of DMs can be expressed with more than one linguistic term. The main reason why the use of HFLTSs is suggested in GDM problems is that HFLTSs can be structured to include all the DMs' opinions (Galo et al. 2018). The HFLTSs have


been applied to decision-making problems many times in recent years (Chang 2017; Chen and Hong 2014; Liao et al. 2014; Liu et al. 2014; Montes et al. 2015; Wei et al. 2013). There are various kinds of methods used in the literature for GDM problems with HFLTSs. The Cumulative Belief Degree (CBD) approach, first applied to a nuclear safeguards problem (Kabak and Ruan 2011), is a solution approach built on fuzzy linguistic terms and the belief structure for multi-criteria GDM problems (Kabak et al. 2014). In this method, the belief structure is applied to represent the belief of the experts in their evaluations of the criteria. The CBD approach has also been used in GDM problems where different assessment formats such as the HFS and the intuitionistic fuzzy set are used (Ervural and Kabak 2016). In a recent study, Ervural and Kabak (2019) propose a CBD approach for GDM problems with heterogeneous information. They develop transformation formulae from several preference representation scales to the belief structure, including the 2-tuple representation, classical fuzzy sets, hesitant fuzzy sets, and intuitionistic fuzzy sets. However, considering all the CBD literature including this recent study, to the best of our knowledge, there is no study that uses the CBD approach for GDM problems with HFLTSs.

In this respect, the main contribution of this study is to develop a new methodology for GDM problems with HFLTSs using the CBD approach, which is an effective approach for dealing with various multiple attribute GDM problems. In the problem defined in this study, DMs express their opinions with HFLTSs. These expressions are then processed using an original transformation formula for converting evaluations in HFLTSs to belief structures. The proposed methodology is applied to a sales manager selection problem for illustration purposes.

The rest of the chapter is organized as follows. In Sect. 2, Preliminaries, we briefly provide the definitions of HFLTSs, the belief structure, and CBD. In Sect. 3, we define the multi-criteria GDM problem under hesitancy. The proposed methodology based on CBD is described step by step. Section 4 provides an illustrative example in which the proposed methodology is applied. Finally, some concluding remarks are given in Sect. 5.

2 Preliminaries

Before presenting the proposed methodology, we first introduce the concepts used in it. Although a single linguistic expression is generally used, in real-life problems DMs often hesitate between multiple linguistic terms when evaluating alternatives. The concept of the HFS was introduced by Torra (2010), and the concepts regarding HFLTSs were presented by Rodriguez et al. (2011).

Definition 1 Hesitant fuzzy linguistic term sets (HFLTSs) (Rodriguez et al. 2011).


Let S = {s_k | k = 0, . . ., K} be a linguistic term set. The cardinality of this set is K + 1, and the set satisfies the ranking rule s_i > s_j if i > j. An HFLTS H_S is an ordered finite subset of the consecutive linguistic terms of S.

Definition 2 Belief structure (Kabak and Ruan 2011).

B = \{(\beta_{ik}, s_k),\ k = 0, \ldots, K\}, \ \forall i, \qquad \sum_{k=0}^{K} \beta_{ik} \le 1, \ \forall i \qquad (1)

where s_k and β_ik represent the kth linguistic term and the belief degree for alternative i at level s_k, respectively.

Definition 3 Cumulative belief structure (Kabak and Ruan 2011).

C = \{(\gamma_{ik}, s_k),\ k = 0, \ldots, K\}, \ \forall i, \qquad \gamma_{ik} = \sum_{l=k}^{K} \beta_{il} \qquad (2)

In Eq. (2), s_k indicates the kth linguistic term and γ_ik represents the CBD for alternative i at the threshold level s_k.
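To make Definitions 2 and 3 concrete, the following minimal Python sketch (not part of the original chapter; the list-based representation and the function name are illustrative assumptions) stores a belief structure as a list of belief degrees indexed by the level k and derives the corresponding cumulative belief degrees as in Eq. (2).

```python
from typing import List

def cumulative_belief(beliefs: List[float]) -> List[float]:
    """Turn belief degrees (beta_i0..beta_iK) into cumulative belief degrees.

    gamma_ik is the sum of beta_il for l = k..K (Definition 3 / Eq. (2)).
    """
    K = len(beliefs) - 1
    return [round(sum(beliefs[k:]), 4) for k in range(K + 1)]

# Belief structure over s0..s6 with 0.5 at s2 and 0.5 at s3 (total mass <= 1).
beta = [0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0]
print(cumulative_belief(beta))  # [1.0, 1.0, 1.0, 0.5, 0.0, 0.0, 0.0]
```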

3 Implementation of CBD Approach to Multi-criteria Group Decision-Making Problem with HFLTS

Transformation formulations of the CBD approach are available in the literature for different assessment formats, such as numerical values, interval values, linguistic values, 2-tuples, intuitionistic fuzzy values, and HFSs (Ervural and Kabak 2016, 2019). Here, however, the CBD approach is applied for the first time to a problem in which the DMs make their assessments with HFLTSs. In this section, the formulations necessary for applying the CBD approach to the multi-criteria GDM problem with HFLTSs are given. The proposed methodology is then applied to an illustrative example problem.

3.1 Problem Description

In GDM problems, multiple DMs evaluate the same set of alternatives in order to choose the best one. Rodriguez et al. (2011) developed a multi-criteria linguistic decision-making model in which DMs evaluate alternatives by using linguistic expressions. In this study, a multi-criteria GDM problem is assumed in which the DMs assess the alternatives by using HFLTSs. In this problem,


more than one DM evaluates the alternatives using HFLTSs under multiple criteria. The aim is to rank I alternatives (A_1, A_2, . . ., A_I) that E DMs (DM_1, DM_2, . . ., DM_E) evaluate according to J criteria (C_1, C_2, . . ., C_J). D^e denotes the decision matrix of DM_e:

D^{e} = \begin{bmatrix} S_{11}^{e} & \cdots & S_{1J}^{e} \\ \vdots & \ddots & \vdots \\ S_{I1}^{e} & \cdots & S_{IJ}^{e} \end{bmatrix}

In the proposed method, we assume that the S_ij^e values are HFLTSs. For example, if DM_1 evaluates A_2 with respect to C_3 as s_4 and s_5, the related representation is S_23^1 = {s_4, s_5}.

Each DM assigns a weight w_j^e to criterion C_j to measure the significance of the criterion for that DM. The weights are assumed to be normalized, Σ_j w_j^e = 1. Furthermore, a weight λ_e is assigned to every DM with respect to the DM's expertise or experience. Finally, the notation used to describe the problem is as follows:

A_i: alternative i (i = 1, . . ., I; I: number of alternatives)
DM_e: decision-maker e (e = 1, . . ., E; E: number of DMs)
C_j: criterion j (j = 1, . . ., J; J: number of criteria)
λ_e: weight of DM_e
w_j^e: weight of C_j for DM_e
s_k: linguistic term k (k = 0, . . ., K)
S_ij^e: evaluation of DM_e for alternative i with respect to criterion j, given as an HFLTS
x_ij^{ek}: evaluation of DM_e for alternative i with respect to criterion j at level s_k
X_ij^e: evaluation vector of DM_e for alternative i with respect to criterion j
β_ij^{ek}: belief degree of DM_e for alternative i with respect to criterion j at level s_k
B_ij^e: belief structure of DM_e for alternative i with respect to criterion j
δ_i^e: similarity degree of DM_e for alternative i
D^e: decision matrix of DM_e

3.2 Proposed Methodology

In order to solve the GDM problem defined above, we developed a seven-step methodology, described as follows.

Step 1: Identify the Problem and Collect the DMs' Evaluations
In this step, the DMs are asked to evaluate the alternatives according to the different criteria using fuzzy linguistic terms. As a result, a decision matrix is created for each DM.


Step 2: Transform DM Evaluations to the Belief Structures
In the CBD approach, the assessments of the DMs are analyzed by transforming them into belief structures. One of the important capabilities of the CBD approach is that all preference representation structures, such as real numbers, interval value assignments, linguistic terms, 2-tuples, and extensions of fuzzy sets, can all be transformed into belief structures without any loss of information (Ervural and Kabak 2019). Transformation formulas for linguistic terms, interval values, and numerical values are presented by Kabak and Ruan (2011), and for intuitionistic fuzzy sets and HFSs by Ervural and Kabak (2016). In this study, the transformation of HFLTSs into the belief structure is defined as follows:

X_{ij}^{e} = \{x_{ij}^{ek}\}, \qquad x_{ij}^{ek} = \begin{cases} 1, & \text{if } s_k \text{ exists in } S_{ij}^{e} \\ 0, & \text{otherwise} \end{cases}, \quad k = 0, \ldots, K, \ \forall e, \forall i, \forall j \qquad (3)

\beta_{ij}^{ek} = x_{ij}^{ek} \times \frac{1}{\sum_{k=0}^{K} x_{ij}^{ek}}, \qquad \forall e, \forall i, \forall j, \forall k \qquad (4)

For example, for S_23^1 = {s_2, s_3} and K = 6:

X_23^1 = {0, 0, 1, 1, 0, 0, 0}
β_23^{12} = 0.5, β_23^{13} = 0.5
B_23^1 = {(0.5, s_2); (0.5, s_3)}

where B_23^1 represents the belief structure of DM_1 for A_2 with respect to C_3.
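The transformation in Eqs. (3) and (4) can be sketched in a few lines of Python; representing an HFLTS as a set of term indices and the function name hflts_to_belief are illustrative assumptions, not code from the chapter.

```python
from typing import List, Set

def hflts_to_belief(hflts: Set[int], K: int) -> List[float]:
    """Convert an HFLTS (set of term indices in 0..K) into belief degrees.

    Eq. (3): x_k = 1 if s_k belongs to the HFLTS, else 0.
    Eq. (4): beta_k = x_k / sum(x), i.e. the belief mass is split evenly.
    """
    x = [1 if k in hflts else 0 for k in range(K + 1)]
    total = sum(x)
    return [xk / total for xk in x]

# S_23^1 = {s2, s3} with K = 6 gives 0.5 at s2 and 0.5 at s3, as in the text.
print(hflts_to_belief({2, 3}, K=6))  # [0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0]
```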

Step 3: Aggregate the Criteria Scores for Each DM
At this step, the DM assessments transformed into belief structures in the previous step are aggregated for each DM and each alternative. The weights w_j^e assigned by the DMs to the criteria are used for the aggregation: the weighted sum of the belief degrees over the criteria is calculated for each s_k. The result is the total belief degree of the alternative for each DM at each s_k level.

\beta_{i}^{ek} = \frac{\sum_{j} w_{j}^{e} \times \beta_{ij}^{ek}}{\sum \left\{ w_{j}^{e} \mid e, j, k,\ \beta_{ij}^{ek} > 0 \right\}} \qquad (5)

where β_i^{ek} represents the belief degree of DM_e for A_i at the s_k level.

B_{i}^{e} = \{(\beta_{i}^{ek}, s_k),\ k = 0, \ldots, K\}, \ \forall e, \forall i \qquad (6)
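A minimal sketch of the criteria-level aggregation in Eqs. (5) and (6), assuming the criterion weights of a DM are normalized so that the denominator of Eq. (5) equals their sum (the function and variable names are illustrative, not from the chapter):

```python
from typing import Dict, List

def aggregate_over_criteria(beliefs: Dict[str, List[float]],
                            weights: Dict[str, float]) -> List[float]:
    """Weighted sum of one DM's belief degrees over the criteria (Eq. (5)).

    beliefs maps a criterion name to a belief-degree vector over levels s0..sK;
    weights maps the same criterion names to that DM's criterion weights.
    """
    levels = len(next(iter(beliefs.values())))   # number of levels, K + 1
    total_w = sum(weights.values())              # denominator of Eq. (5)
    return [round(sum(weights[c] * beliefs[c][k] for c in beliefs) / total_w, 4)
            for k in range(levels)]
```

Applied with the exact belief degrees of Eq. (4), this reproduces, for instance, the vector (0, 0, 0.10, 0.10, 0.05, 0.30, 0.45) computed for DM1 and A1 in the illustrative example of Sect. 4.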

Step 4: Calculate the Cumulative Belief Degrees
After calculating the total belief degree of each alternative for each DM at each s_k level, the cumulative belief structure for DM_e's assessments of A_i can be defined as follows:

C_{i}^{e} = \{(\gamma_{i}^{ek}, s_k),\ k = 0, \ldots, K\}, \ \forall e, \forall i \qquad (7)

where C_i^e represents the CBD of DM_e for A_i, and

\gamma_{i}^{ek} = \sum_{p=k}^{K} \beta_{i}^{ep}, \qquad \forall e, \forall i, \forall k \qquad (8)

Step 5: Aggregate DMs' Assessments for Each Alternative
The DMs' evaluations of each alternative are aggregated to find the final CBD, from which the alternatives can be ranked. A weight λ_e for every DM can be determined with respect to expertise or experience. By using these weights and the cumulative belief structures of the DMs, the total performance of each alternative at each linguistic term level is found as follows:

C_{i} = \{(\gamma_{ik}, s_k),\ k = 0, \ldots, K\}, \ \forall i \qquad (9)

\gamma_{ik} = \sum_{e=1}^{E} \lambda_{e} \times \gamma_{i}^{ek}, \qquad \forall i, \forall k \qquad (10)
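The aggregation over DMs in Eqs. (9) and (10) is a weighted sum of the DMs' cumulative belief degrees; the sketch below (names and the list-of-lists layout are illustrative assumptions) makes this explicit.

```python
from typing import List

def aggregate_over_dms(cbds: List[List[float]], lam: List[float]) -> List[float]:
    """Combine the DMs' CBDs for one alternative into a single CBD (Eq. (10)).

    cbds[e][k] is gamma_i^{ek} for DM e at level s_k; lam[e] is the DM weight.
    """
    levels = len(cbds[0])
    return [round(sum(lam[e] * cbds[e][k] for e in range(len(cbds))), 4)
            for k in range(levels)]

# With lam = [0.25] * 4 and the s3 values 0.70, 1.00, 0.95, 0.93 from the
# example in Sect. 4, the aggregated CBD at s3 is 0.895 (0.90 in Table 10).
```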

Step 6: Calculate Consensus Degree
In GDM problems, a consensus process is necessary to achieve a final result with a certain level of agreement between the DMs, and it is desirable that the DMs reach a high degree of consensus before the final decision is made. For the proposed approach, δ_i^e, the similarity degree between DM_e's assessment of A_i and the aggregated performance of A_i, is calculated as follows:

\delta_{i}^{e} = 1 - \frac{\sum_{k=0}^{K} \left| \gamma_{ik} - \gamma_{i}^{ek} \right|}{K}, \qquad \forall e, \forall i \qquad (11)

\delta^{e} = \frac{\sum_{i=1}^{I} \delta_{i}^{e}}{I}, \qquad \forall e \qquad (12)

where δ^e represents the consensus level of DM_e. All DMs' consensus levels are then evaluated together, and a general consensus level is calculated for the problem. The calculated value is compared with a threshold value to determine whether the results obtained represent the common decision of the DMs at the specified level.


\delta = \frac{\sum_{e=1}^{E} \delta^{e}}{E} \qquad (13)

where δ represents the general consensus degree for the problem. For the proposed approach, the threshold value is set as follows:

\delta^{\min} = 1 - \frac{1}{K} \qquad (14)
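A sketch of the consensus computation in Eqs. (11)-(14); the structure of the inputs (per-DM and aggregated CBD vectors) and the function names are assumptions made only for illustration.

```python
from typing import List

def similarity(agg: List[float], dm: List[float]) -> float:
    """Similarity of a DM's CBD to the aggregated CBD for one alternative (Eq. (11))."""
    K = len(agg) - 1
    return 1 - sum(abs(a - d) for a, d in zip(agg, dm)) / K

def consensus_threshold(K: int) -> float:
    """Threshold delta_min = 1 - 1/K (Eq. (14))."""
    return 1 - 1 / K

# delta^e (Eq. (12)) averages similarity over the alternatives of one DM;
# delta (Eq. (13)) averages delta^e over the DMs and is compared with
# consensus_threshold(K).
```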

By comparing the general consensus degree with the threshold value, it is tested whether the results meet the agreed level of consensus. If δ ≥ δ^min, the process continues to the next step; otherwise, the DMs whose δ^e is less than δ^min are asked whether they would prefer to change their evaluations.

Step 7: Find the Collective Preferences
Two approaches are proposed to compare the final CBDs of the alternatives: the aggregated score approach and the linguistic-cut approach. The linguistic-cut approach ranks the alternatives according to a certain s_k level that represents the satisfaction level of the DM. In the aggregated score approach, the aggregated score AS_i of each alternative, i.e., its total expectation, is found by the following formula:

AS_{i} = \sum_{k=0}^{K-1} v_{k} \left( \gamma_{ik} - \gamma_{i,k+1} \right) + v_{K}\, \gamma_{iK}, \qquad \forall i \qquad (15)

where v_k indicates an expectation value for the linguistic term s_k.
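The aggregated score of Eq. (15) can be computed directly from an alternative's final CBD vector; the sketch below (names are illustrative) uses v_k = k, which is the set of expectation values appearing in the expanded calculations of Sect. 4.

```python
from typing import List

def aggregated_score(gamma: List[float], v: List[float]) -> float:
    """Aggregated score AS_i from a final CBD vector gamma (Eq. (15)).

    AS_i = sum_{k=0}^{K-1} v_k * (gamma_k - gamma_{k+1}) + v_K * gamma_K.
    """
    K = len(gamma) - 1
    score = sum(v[k] * (gamma[k] - gamma[k + 1]) for k in range(K))
    return score + v[K] * gamma[K]

# Expectation values v_k = k for a seven-term scale (s0..s6).
v = list(range(7))
# aggregated_score(gamma, v) can then be applied to each alternative's CBD row.
```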

4 Illustrative Example

In this section, a GDM problem concerned with selecting the best sales manager for a manufacturing company (Yu et al. 2017) is used to illustrate the proposed model. In the problem, as a result of the research, it was decided that four candidates (A_1, A_2, A_3, A_4) were suitable for the sales manager position. To select the best candidate, the opinions of four human resources experts (DM_1, DM_2, DM_3, DM_4) were taken for the final evaluation. All four DMs have the same level of expertise and experience; that is, the weights assigned to the experts are equal (λ = [0.25, 0.25, 0.25, 0.25]^T). The experts evaluated the candidates according to the six criteria shown in Table 1, with the criteria weighting vector w = (0.2, 0.3, 0.15, 0.1, 0.15, 0.1)^T. The experts used the seven linguistic terms shown in Table 2 to evaluate the candidates with respect to the criteria. The assessments provided by the four experts are HFLTSs defined on S, as shown in Table 3.

Table 1 Criteria and descriptions

Criterion  Description
C1         Oral communication skills
C2         Past experience
C3         General aptitude
C4         Willingness
C5         Self-confidence
C6         First impression

Table 2 Linguistic expression and descriptions

Linguistic expression  Description
s0                     Poor
s1                     Slightly poor
s2                     Fair
s3                     Slightly good
s4                     Good
s5                     Very good
s6                     Extremely good

Table 3 Decision matrix of the experts

          C1          C2       C3          C4          C5          C6
DM1  A1   s5,s6       s5,s6    s6          s2,s3       s2,s3,s4    s5,s6
     A2   s4,s5       s5,s6    s5          s5,s6       s4,s5       s4,s5
     A3   s3,s4       s4,s5    s3,s4       s4,s5       s5,s6       s5,s6
     A4   s5,s6       s1,s2    s3,s4,s5    s6          s4,s5,s6    s4
DM2  A1   s3,s4       s5       s5,s6       s4,s5       s3,s4       s3,s4
     A2   s4,s5,s6    s5       s4          s5,s6       s4,s5       s4
     A3   s1,s2       s6       s4,s5       s3,s4       s5,s6       s4
     A4   s5,s6       s4,s5    s3,s4       s4          s5,s6       s5
DM3  A1   s2,s3,s4    s3,s4    s4,s5       s4,s5       s5,s6       s2,s3
     A2   s5,s6       s4,s5    s4          s5,s6       s4,s5       s4,s5
     A3   s4          s5,s6    s4          s1,s2       s5,s6       s3
     A4   s3,s4       s4,s5    s2,s3,s4    s3          s5,s6       s5
DM4  A1   s2,s3       s4       s5,s6       s3,s4       s3          s4
     A2   s3,s4,s5    s6       s3,s4       s4,s5,s6    s3,s4       s5
     A3   s0,s1,s2    s5,s6    s5,s6       s5,s6       s3,s4       s5
     A4   s2,s3,s4    s4,s5    s4          s5          s5,s6       s3
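As a small illustration of how the assessments in Table 3 could be encoded for the transformation of Step 2, the snippet below (a sketch; the dictionary layout and variable names are assumptions, not part of the chapter) represents DM1's evaluations of A1 as sets of term indices and converts them with the split-evenly rule of Eq. (4).

```python
# DM1's HFLTS evaluations of A1 from Table 3, keyed by criterion.
dm1_a1 = {"C1": {5, 6}, "C2": {5, 6}, "C3": {6},
          "C4": {2, 3}, "C5": {2, 3, 4}, "C6": {5, 6}}

K = 6
for crit, hflts in dm1_a1.items():
    share = 1 / len(hflts)                      # Eq. (4): belief mass split evenly
    beliefs = [share if k in hflts else 0.0 for k in range(K + 1)]
    print(crit, [round(b, 3) for b in beliefs])
# C1 and C2 give 0.5 at s5 and s6, C3 gives 1.0 at s6, C5 gives 0.333 at s2-s4, etc.
```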

Step 2: Transform DM Evaluations to the Belief Structures
By using Eqs. (3) and (4), the DMs' assessments are transformed into belief structures. The belief structures of the DMs are presented in Tables 4, 5, 6 and 7.

Step 3: Aggregate Criteria Scores for Each DM
By using Eqs. (5) and (6), the belief structures obtained in the previous step are aggregated for each DM and each alternative. To obtain the belief degree of each alternative for each DM at each s_k level, the weighted sum of the belief degrees over the criteria is calculated. As an example, the calculation for DM_1 and A_1 is

β_1^{1k} = [0.2, 0.3, 0.15, 0.1, 0.15, 0.1] × (belief structures of DM_1 for A_1 in Table 4) / (0.2 + 0.3 + 0.15 + 0.1 + 0.15 + 0.1) = [0, 0, 0.10, 0.10, 0.05, 0.30, 0.45]

for k = 0, . . ., 6. The belief degrees of all DMs are presented in Table 8.


Table 4 Belief structures of DM1 (the belief degrees of DM1's assessments in Table 3 at levels s0-s6, obtained with Eqs. (3) and (4): two-term HFLTSs give 0.5 at each of their terms, three-term HFLTSs give 0.333, and single terms give 1)

Step 4: Calculate the CBDs
The belief degrees of each DM for the alternatives at each s_k level were calculated in the previous step. At this step, the CBDs for DM_e's assessments of each A_i are calculated by using Eqs. (7) and (8).


Table 5 Belief structures of DM2 (the belief degrees of DM2's assessments in Table 3, obtained in the same way as Table 4)

As an example, the calculation for DM_1 and A_4 at the s_3 level is

γ_4^{13} = 0.25 + 0.2 + 0.2 + 0.05 = 0.7

The CBDs of the DMs are presented in Table 9.

Step 5: Aggregate the DM Evaluations for Each Alternative
In order to rank the alternatives, it is necessary to combine the CBDs of the DMs into a single CBD structure for each alternative at each s_k level. The weights assigned to the DMs are used for this calculation; in this problem, the DMs' weights are equal. The aggregated CBD structures are calculated by using Eqs. (9) and (10), and the results are presented in Table 10. As an example, the calculation for A_4 at the s_3 level is

γ_43 = 0.25 × 0.7 + 0.25 × 1 + 0.25 × 0.95 + 0.25 × 0.93 = 0.9


Table 6 Belief structures of DM3 (the belief degrees of DM3's assessments in Table 3, obtained in the same way as Table 4)

Step 6: Calculate Consensus Degree
The CBD scores obtained above are given for each alternative at each s_k level. To show how acceptable the results are for the DMs, it is important to determine how close these scores are to the scores (the CBDs of the DMs) given by the DMs to the alternatives. We first calculate the threshold value:

δ^min = 1 − 1/6 = 0.83

In the second step, the closeness between the score of each DM and the aggregated score is calculated. As an example, the calculation of the consensus degree of DM_1 for A_3 is shown below:

δ_3^1 = 1 − (|1 − 1| + |1 − 0.98| + |1 − 0.93| + |1 − 0.88| + |0.83 − 0.78| + |0.45 − 0.52| + |0.13 − 0.25|) / 6 = 0.92
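As a quick check (a sketch, not part of the chapter), the value δ_3^1 = 0.92 can be reproduced in a few lines of Python from the DM1 and aggregated CBD vectors of A3 given in Tables 9 and 10:

```python
dm1_a3 = [1.00, 1.00, 1.00, 1.00, 0.83, 0.45, 0.13]   # CBD of DM1 for A3 (Table 9)
agg_a3 = [1.00, 0.98, 0.93, 0.88, 0.78, 0.52, 0.25]   # aggregated CBD of A3 (Table 10)
K = 6
delta = 1 - sum(abs(a - b) for a, b in zip(agg_a3, dm1_a3)) / K   # Eq. (11)
print(round(delta, 2))  # 0.92
```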


Table 7 Belief structures of DM4 (the belief degrees of DM4's assessments in Table 3, obtained in the same way as Table 4)

This calculation is made for every DM and every alternative; the results are presented in Table 11. As shown in the table, the average consensus level of every DM is above the threshold, which means that all DMs are satisfied with the outcome scores of the alternatives.

Step 7: Find the Collective Preferences
There are two methods for interpreting the results obtained. First, the results are evaluated with the aggregated score approach. The calculations for all alternatives using Eq. (15) are presented below:

AS_1 = 0 × (1 − 1) + 1 × (1 − 1) + 2 × (1 − 0.92) + 3 × (0.92 − 0.7) + 4 × (0.7 − 0.42) + 5 × (0.42 − 0.17) + 6 × (0.17) = 3.49

AS_2 = 0 × (1 − 1) + 1 × (1 − 1) + 2 × (1 − 1) + 3 × (1 − 0.95) + 4 × (0.95 − 0.62) + 5 × (0.62 − 0.2) + 6 × (0.2) = 4.61

Table 8 Belief degrees of DMs

          s0    s1    s2    s3    s4    s5    s6
DM1  A1  0.00  0.00  0.10  0.10  0.05  0.30  0.45
     A2  0.00  0.00  0.00  0.00  0.23  0.58  0.20
     A3  0.00  0.00  0.00  0.18  0.38  0.33  0.13
     A4  0.00  0.15  0.15  0.05  0.20  0.20  0.25
DM2  A1  0.00  0.00  0.00  0.23  0.28  0.43  0.08
     A2  0.00  0.00  0.00  0.00  0.39  0.49  0.12
     A3  0.00  0.10  0.10  0.05  0.23  0.15  0.38
     A4  0.00  0.00  0.00  0.08  0.33  0.43  0.18
DM3  A1  0.00  0.00  0.12  0.27  0.34  0.20  0.08
     A2  0.00  0.00  0.00  0.00  0.43  0.43  0.15
     A3  0.00  0.05  0.05  0.10  0.35  0.23  0.23
     A4  0.00  0.00  0.05  0.25  0.30  0.33  0.08
DM4  A1  0.00  0.00  0.10  0.30  0.45  0.08  0.08
     A2  0.00  0.00  0.00  0.22  0.25  0.20  0.33
     A3  0.07  0.07  0.07  0.08  0.08  0.38  0.28
     A4  0.00  0.00  0.07  0.17  0.37  0.33  0.08

Table 9 CBDs of DMs

          s0    s1    s2    s3    s4    s5    s6
DM1  A1  1.00  1.00  1.00  0.90  0.80  0.75  0.45
     A2  1.00  1.00  1.00  1.00  1.00  0.78  0.20
     A3  1.00  1.00  1.00  1.00  0.83  0.45  0.13
     A4  1.00  1.00  0.85  0.70  0.65  0.45  0.25
DM2  A1  1.00  1.00  1.00  1.00  0.78  0.50  0.08
     A2  1.00  1.00  1.00  1.00  1.00  0.61  0.12
     A3  1.00  1.00  0.90  0.80  0.75  0.53  0.38
     A4  1.00  1.00  1.00  1.00  0.93  0.60  0.18
DM3  A1  1.00  1.00  1.00  0.88  0.62  0.28  0.08
     A2  1.00  1.00  1.00  1.00  1.00  0.58  0.15
     A3  1.00  1.00  0.95  0.90  0.80  0.45  0.23
     A4  1.00  1.00  1.00  0.95  0.70  0.40  0.08
DM4  A1  1.00  1.00  1.00  0.90  0.60  0.15  0.08
     A2  1.00  1.00  1.00  1.00  0.78  0.53  0.33
     A3  1.00  0.93  0.87  0.80  0.73  0.65  0.28
     A4  1.00  1.00  1.00  0.93  0.77  0.40  0.08

Table 10 CBD of alternatives

      s0    s1    s2    s3    s4    s5    s6
A1  1.00  1.00  1.00  0.92  0.70  0.42  0.17
A2  1.00  1.00  1.00  1.00  0.95  0.62  0.20
A3  1.00  0.98  0.93  0.88  0.78  0.52  0.25
A4  1.00  1.00  0.96  0.90  0.76  0.46  0.14

Table 11 Consensus degrees

          DM1   DM2   DM3   DM4
A1       0.88  0.94  0.94  0.92
A2       0.97  0.97  0.97  0.94
A3       0.92  0.95  0.97  0.93
A4       0.91  0.92  0.95  0.96
Average  0.92  0.95  0.96  0.94

Fig. 1 Result of linguistic-cut approach (the CBDs of alternatives A1-A4 plotted over the levels s0-s6)

AS3 = 0 × (1 − 0.98) + 1 × (0.98 − 0.93) + 2 × (0.93 − 0.88) + 3 × (0.88 − 0.78) +4 × (0.78 − 0.52) + 5 × (0.52 − 0.25) + 6 × (0.25) = 3.96

AS4 = 0 × (1 − 1) + 1 × (1 − 0.96) + 2 × (0.96 − 0.9) + 3 × (0.9 − 0.76) +4 × (0.76 − 0.46) + 5 × (0.46 − 0.14) + 6 × (0.14) = 3.74

According to the results of the calculation, we can say that the ranking order of the four candidates is A2 ≻ A3 ≻ A4 ≻ A1. Secondly, we can evaluate the alternatives with the linguistic-cut approach; the result of the linguistic-cut approach is presented in Fig. 1. In this approach, different rankings can be obtained for different s_k levels, so it should first be decided which s_k level is satisfactory. For example, the ranking at s_4 is A2 ≻ A3 ≻ A4 ≻ A1.

The illustrative example problem is taken from Yu et al. (2017), as previously mentioned. Yu et al. (2017) obtained the ranking A2 ≻ A4 ≻ A3 ≻ A1 with their proposed methodology. The results indicate that the best alternative selected


by both methods is the same; however, the places of A4 and A3 are reversed in the ranking. A cause of the difference between their ranking and ours is that, due to the nature of decision-making problems, there is no single best decision. For example, in the proposed CBD-based method, if the alternatives are evaluated at s6 instead of s4 in the linguistic-cut approach, the ranking becomes A3 ≻ A2 ≻ A1 ≻ A4. Another reason why Yu et al. (2017) obtain a different ranking may be that their methodology is based on an unbalanced linguistic term set.

5 Conclusion

In multi-criteria GDM problems, flexibility in the assessments of the DMs increases the quality of the evaluations and improves the quality of the solution. In complex decision-making problems, DMs may not be certain in their evaluations and may hesitate about which linguistic term to indicate, and this hesitancy may negatively affect the quality of the solution. In cases where there are doubts among several linguistic terms, HFLTSs provide the flexibility of indicating more than one linguistic term: using HFLTSs, DMs can assign more than one linguistic term in their evaluations.

In recent years, the number of studies involving HFLTSs has increased rapidly, and various kinds of methods have been used in the literature for GDM problems with HFLTSs. The CBD is a solution approach built on fuzzy linguistic terms and the belief structure for multi-criteria GDM problems, and it has been applied to many decision-making problems with different evaluation formats. In this study, a GDM problem in which the DMs evaluate the alternatives using HFLTSs has been solved using the CBD approach. A transformation formula for converting HFLTSs to belief degrees is introduced, and the applicability of the approach is shown by implementing it on a sales manager selection problem.

As future research, problems including HFLTSs together with other evaluation formats, such as direct value assignment, classical fuzzy sets, and linguistic terms, can be tackled, and the performance of the proposed method can be evaluated. Furthermore, the proposed method can be applied to complex real-life problems where multiple stakeholders exist.

References

Chang KH (2017) A more general reliability allocation method using the hesitant fuzzy linguistic term set and minimal variance OWGA weights. Appl Soft Comput 56:589–596
Chen SM, Hong JA (2014) Multi-criteria linguistic decision making based on hesitant fuzzy linguistic term sets and the aggregation of fuzzy sets. Inf Sci 286:63–74
Ervural B, Kabak Ö (2016) A novel group decision making approach based on the cumulative belief degrees. IFAC-PapersOnLine 49(12):1832–1837
Ervural B, Kabak Ö (2019) A cumulative belief degree approach for group decision-making problems with heterogeneous information. Expert Syst e12458
Galo NR, Calache LDDR, Carpinetti LCR (2018) A group decision approach for supplier categorization based on hesitant fuzzy and ELECTRE TRI. Int J Prod Econ 202:182–196
Kabak Ö, Ruan D (2011) A cumulative belief degree-based approach for missing values in nuclear safeguards evaluation. IEEE Trans Knowl Data Eng 23(10):1441–1454
Kabak Ö, Ülengin F, Önsel Ş, Özaydin Ö, Aktaş E (2014) Cumulative belief degrees approach for analyzing the competitiveness of the automotive industry. Knowl-Based Syst 70:15–25
Liao H, Xu Z (2015) Approaches to manage hesitant fuzzy linguistic information based on the cosine distance and similarity measures for HFLTSs and their application in qualitative decision making. Expert Syst Appl 42(12):5328–5336
Liao H, Xu Z, Zeng XJ (2014) Distance and similarity measures for hesitant fuzzy linguistic term sets and their application in multi-criteria decision making. Inf Sci 271:125–142
Liu H, Rodríguez RM (2014) A fuzzy envelope for hesitant fuzzy linguistic term set and its application to multicriteria decision making. Inf Sci 258:220–238
Liu H, Cai J, Jiang L (2014) On improving the additive consistency of the fuzzy preference relations based on comparative linguistic expressions. Int J Intell Syst 29(6):544–559
Montes R, Sánchez AM, Villar P, Herrera F (2015) A web tool to support decision making in the housing market using hesitant fuzzy linguistic term sets. Appl Soft Comput 35:949–957
Rodriguez RM, Martinez L, Herrera F (2011) Hesitant fuzzy linguistic term sets for decision making. IEEE Trans Fuzzy Syst 20(1):109–119
Song Y, Hu J (2017) A group decision-making model based on incomplete comparative expressions with hesitant linguistic terms. Appl Soft Comput 59:174–181
Torra V (2010) Hesitant fuzzy sets. Int J Intell Syst 25(6):529–539
Wan SP, Qin YL, Dong JY (2017) A hesitant fuzzy mathematical programming method for hybrid multi-criteria group decision making with hesitant fuzzy truth degrees. Knowl-Based Syst 138:232–248
Wei C, Zhao N, Tang X (2013) Operators and comparisons of hesitant fuzzy linguistic term sets. IEEE Trans Fuzzy Syst 22(3):575–585
Yu W, Zhang Z, Zhong Q, Sun L (2017) Extended TODIM for multi-criteria group decision making based on unbalanced hesitant fuzzy linguistic term sets. Comput Ind Eng 114:316–328
Zadeh LA (1975) The concept of a linguistic variable and its application to approximate reasoning—I. Inf Sci 8(3):199–249
Zhang B, Liang H, Zhang G (2018) Reaching a consensus with minimum adjustment in MAGDM with hesitant fuzzy linguistic term sets. Inf Fusion 42:12–23

A Literature Survey on Project Portfolio Selection Problem

Özge Şahin Zorluoğlu and Özgür Kabak

Abstract This study focuses on Project Portfolio Selection (PPS) that is an important activity for organizations in order to get a competitive advantage, increase their profit, and accomplish their objectives. There are numerous articles in the literature for PPS. However, the literature survey studies analyzing these articles are limited. The aim of this study is to conduct a literature survey and develop a classification scheme for PPS problem to support the researchers and practitioners working in the field of PPS. We present a classification scheme based on three categories: type of the study, methods used for PPS problem, and types of projects. In this study, 253 articles are deeply investigated and categorized according to the classification scheme. Critical analyses are conducted and various results are obtained. According to the results, although the PPS problem has multiple criteria and group decision-making characteristics, these methods are used in only 33% of the studies. Therefore, we present four future research directions for PPS problem in the context of multiple criteria and group decision-making: (1) importance of utilizing multiple criteria and group decision-making approaches for PPS problem, (2) need for methods that can handle information in different formats, missing values, and uncertainty, (3) need for decision support systems for PPS, (4) need for new methods specific to different types of projects.

1 Introduction

Project Portfolio Selection (PPS) is a process of determining a suitable portfolio of projects by evaluating a great number of project proposals with respect to many factors specific to the given case and conflicting criteria under scarce resources. According to Archer and Ghasemzadeh (1999), “PPS is the periodic activity involved in selecting a portfolio of projects, that meets an organization’s stated objectives

Ö. Şahin Zorluoğlu · Ö. Kabak
Faculty of Management, Industrial Engineering Department, Istanbul Technical University, Istanbul, Turkey
e-mail: [email protected]; [email protected]

© Springer Nature Switzerland AG 2021
Y. I. Topcu et al. (eds.), Multiple Criteria Decision Making, Contributions to Management Science, https://doi.org/10.1007/978-3-030-52406-7_15


without exceeding available resources or violating other constraints.” PPS is a multi-dimensional problem. In most problems, there are numerous projects to be evaluated, and different sets of criteria are considered in different problems. Some organizations also have more than one goal, and these goals might be conflicting. The dependency between projects is another feature of PPS problems: some projects may depend on other projects in a precedence or succession manner. Besides, in some PPS problems, the information may be uncertain or missing. In the PPS process, there are specific and scarce resources and constraints that should be taken into account. Therefore, structured and systematic approaches are needed to deal with such a complex problem.

Performing the right portfolio of projects is a crucial activity for organizations in order to survive in the competitive environment, achieve success, increase their profit, grow stronger in the market, and accomplish their objectives. Therefore, organizations should effectively determine the portfolio of projects to be carried out. Although PPS is an important problem that is frequently studied in the literature, there is a lack of studies that investigate the literature in depth, conduct detailed analyses, and guide future research at the same time. Among the few literature review studies on PPS, Heidenberger and Stummer (1999) and Verbano and Nosella (2010) conduct literature reviews on research and development (R&D) PPS to reveal the methods used for R&D PPS in the literature. There are also conference papers by Iamratanakul et al. (2008) and Elbok and Berrado (2017) on PPS, in which the authors define PPS and summarize the methods used for the PPS problem in the literature.

PPS problems have specific features that match multiple criteria decision-making (MCDM) and group decision-making (GDM) characteristics. In a PPS problem, there are multiple and/or conflicting criteria to be considered, and these criteria can be subjective or objective. Subjective criteria could be the quality of the project (Liu et al. 2010), the innovation degree of the project (Liu et al. 2010), and/or brand image enhancement (Elbok and Berrado 2017). Objective criteria could be the cost of the project and/or labor cost (Erdoğmuş et al. 2005). The number of decision-makers (DMs) who participate in the PPS process changes according to the structure of the organization. In some organizations, a single DM, such as the manager in charge of the PPS process or the top manager, makes the decision. In most organizations, the decision is made by stakeholders including top managers, senior and junior managers, and employees related to the projects. If several DMs take part in the evaluation of the projects, the problem becomes a GDM problem. In most cases, PPS is a complex problem with a high number of alternatives to be evaluated with respect to multiple criteria by multiple DMs. Therefore, MCDM and GDM methods are suitable approaches to deal with the PPS problem.

The aim of this study is to carry out a literature survey on the PPS problem and suggest future studies specifically in the context of MCDM and GDM. In order to shed light for researchers who work on PPS and support future studies, we make a detailed search of papers published since 1972 in the field of PPS


and develop a classification scheme for this literature review. In the classification scheme, categorizations are done based on the type of study, the methods existing in the literature, and the types of projects. Critical analyses are conducted and various results are presented with regard to these categorizations. Finally, four interesting research directions are determined for future PPS studies in the context of MCDM and GDM.

The paper is organized as follows. In Sect. 2, a classification scheme related to the PPS problem is proposed. Section 3 presents a detailed literature survey on the PPS problem. Future research directions are provided in Sect. 4. The last section contains the conclusion.

2 Classification Scheme

A classification scheme related to the PPS problem is developed for the literature survey. We classify the PPS problem based on three aspects: the type of the study, the methods used for the PPS problem, and the types of projects. The developed classification scheme is presented in Fig. 1. In this section, we explain the classification scheme, providing examples from the literature.

2.1 Type of Study

According to the type of study, the articles are categorized as applied (A), theoretic (T), and others (O). “Applied” articles propose or use a method with data obtained from a real-life problem (R) or with hypothesized data (H); therefore, if an article has both an applied and a theoretic contribution, it is classified as “applied.” For instance, Dutra et al. (2014) develop an economic–probabilistic model, and the proposed model is applied to a real-life investment PPS problem at a power distribution company. Fliedner and Liesiö (2016) propose methods for PPS problems considering uncertainty, and an application is conducted at an international semiconductor manufacturer for selecting a portfolio of projects. “Theoretic” (T) articles utilize a method without an application or propose a new method for a PPS problem. For example, Carlsson et al. (2007) propose a hybrid method for real option valuation and a fuzzy mixed-integer programming model for the R&D PPS problem in which future cash flows are predicted by trapezoidal fuzzy numbers. Articles that do not suggest any method or provide an application, such as literature or systematic reviews, are classified under the “others” (O) title. The article by Archer and Ghasemzadeh (1999) can be categorized as “others”: they introduce a framework for the PPS problem consisting of many stages, where users are allowed to select the most suitable method for each stage and to make changes in a stage if the changes will improve the PPS process.


A classification scheme for PPS problem

1. Type of study
   1.1 Applied (A)
       1.1.1 Real case based application (R)
       1.1.2 Hypothetical application (H)
   1.2 Theoretic (T)
   1.3 Others (O) (systematic review, literature review, research article etc.)
2. Methods
   2.1 Benefit measurement methods
       2.1.1 Multi-criteria decision making methods (MCDM)
       2.1.2 Economic models (EM)
       2.1.3 Group decision making approaches (GDM)
   2.2 Mathematical programming methods
       2.2.1 Linear programming (LP)
       2.2.2 Non-linear programming (NLP)
       2.2.3 Integer programming (IP)
       2.2.4 Multi-objective programming (MOP)
       2.2.5 Dynamic programming (DP)
       2.2.6 Stochastic programming (SP)
       2.2.7 Fuzzy mathematical programming (FMP)
       2.2.8 Robust mathematical programming (RMP)
   2.3 Cognitive emulation methods
       2.3.1 Decision tree approach (DT)
       2.3.2 Statistical approaches (SA)
       2.3.3 Expert systems (ES)
   2.4 Simulation and heuristic approaches (S&H)
   2.5 Hybrid methods
   2.6 Other methods
3. Types of projects
   3.1 General projects (G)
   3.2 R&D projects
   3.3 IT/IS projects
   3.4 Investment projects (INP)
   3.5 Construction projects (CP)
   3.6 New product development projects (NPD)
   3.7 Transportation or railway infrastructure projects (TIP/RIP)
   3.8 Energy projects (EP)
   3.9 Lean and/or lean 6 sigma projects (L/6S)
   3.10 Oil and gas industry projects (OGI)
   3.11 Maintenance projects (MP)
   3.12 Performance development projects (PDP)
   3.13 Software projects (SOP)
   3.14 High technology projects (HTP)
   3.15 Power distribution projects (PODP)
   3.16 Smart city projects (SCP)
   3.17 Scientific research projects (SRP)
   3.18 Weapon selection projects (WSP)

Fig. 1 Classification scheme for PPS problem


2.2 Methods

We benefitted from literature review studies (Heidenberger and Stummer 1999; Iamratanakul et al. 2008; Elbok and Berrado 2017) to classify the articles according to the methods. Methods that are used in the PPS problem could be classified into six categories: benefit measurement methods, mathematical programming methods, cognitive emulation methods, simulation and heuristic approaches, hybrid methods, and other methods.

2.2.1 Benefit Measurement Methods

Benefit measurement methods provide a composite score for each project as a result of the evaluation process. These scores are used to determine a portfolio of projects. Benefit measurement methods are categorized as multi-criteria decision-making methods (MCDM), economic models (EM), and group decision-making (GDM) approaches.

MCDM can be defined as the evaluation of a set of alternatives with respect to multiple and conflicting criteria. Multi-attribute decision-making (MADM) and multi-objective decision-making (MODM) are the subsets of MCDM (Kabak and Ervural 2017); the number of alternatives is discrete in MADM, whereas it is continuous in MODM. Accordingly, there are numerous MCDM methods in the literature that are used for PPS. These methods are the Analytic Hierarchy Process (AHP), the analytic network process (ANP), the technique for order preference by similarity to ideal solution (TOPSIS), the decision-making trial and evaluation laboratory (DEMATEL), VIKOR (VIšekriterijumsko Kompromisno Rangiranje), utility theory, multi-attribute value theory (MAVT), data envelopment analysis (DEA), multi-attributive border approximation area comparison (MABAC), and fuzzy and gray set theory-based methods. Among these methods, AHP, developed by Saaty (1980), is one of the most frequently used methods in PPS studies. AHP is a useful method for handling complex real-world MADM problems; in AHP, the decision-maker evaluates criteria and alternatives by pairwise comparison to obtain a final score for each alternative. There are also some extensions of AHP in the literature. For instance, Özkır and Demirel (2012) use fuzzy AHP with entropy in order to deal with uncertainty and develop a fuzzy linear programming model for the PPS problem. Furthermore, Relich and Pawlewski (2017) integrate a fuzzy weighted averaging method and artificial neural networks to evaluate projects and to predict project performance, respectively, while considering uncertainty. Altuntas and Dereli (2015) develop a new approach that represents the perspective of the government using DEMATEL and patent citation analysis for prioritizing a portfolio of investment projects supported by the government depending on two criteria: reducing the foreign trade deficit and bringing new investments. Ravanshadnia et al. (2010) present a new approach called the Fuzzy MADM project selection model (FMPS), which integrates AHP and simple additive weighting (SAW) methods with


fuzzy theory to obtain weights of criteria and reach a final score for each project, respectively.

Economic models include economic indexes, the discounted cash flow method, net present value analysis (NPV), internal rate of return (IRR), real options, cost-benefit analysis, and fuzzy set theory and cost-benefit analysis-based economic models such as fuzzy real options and fuzzy net present value analysis. Luo and Sheu (2010) use a real options pricing approach for R&D PPS; they avoid risks and create managerial flexibility by considering risk management activities in their study. Dutra et al. (2014) conduct an economic-probabilistic analysis of the expected return on projects for PPS with a complete set of criteria; the model is applied to the investment PPS of a power distribution company to show the efficiency of the proposed approach.

Some studies in the literature approach the PPS problem from the perspective of group decision-making. Hauc et al. (2010) determine macroeconomic and infrastructure criteria with various sub-criteria and apply the SMART method and swing technique with the participation of multiple decision-makers to prioritize railway infrastructure construction projects. In addition, Wei and Chang (2011) develop a method that integrates fuzzy set theory and multi-criteria group decision-making for the portfolio selection of new product development projects: fuzzy group decisions are combined and converted into classical numbers and then formulated as a fuzzy linear programming model. Carazo et al. (2012) propose a two-step approach in which decision-makers participate in the second step of the process. Furthermore, Jeng and Huang (2015) develop a hybrid approach for R&D PPS where there are dependency and feedback between alternatives and criteria; in this approach, they utilize modified Delphi, DEMATEL, and ANP methods. Cluzel et al. (2016) seek an answer to how innovative environmental R&D projects are produced and selected in complex industrial systems. Dobrovolskienė and Tamošiūnienė (2016) claim that decision-makers should take into account the sustainability of projects as well as risk and return, and they propose a sustainability index for construction PPS and resource allocation. Finally, Debnath et al. (2017) use the Delphi method, a process-oriented multi-criteria group decision-making method, in project portfolio selection.

2.2.2 Mathematical Programming Methods

Mathematical programming methods are widely used in the literature for modeling the PPS problem. The most frequently used mathematical programming methods for the PPS problem are linear programming (LP), nonlinear programming (NLP), integer programming (IP), multi-objective programming (MOP), dynamic programming (DP), stochastic programming (SP), fuzzy mathematical programming (FMP) and robust mathematical programming (RMP). For instance, Padhy and Sahu (2011) propose an integer linear programming model for selecting a portfolio of six sigma projects. Yu et al. (2012) develop a nonlinear programming model for the PPS problem considering the interactions between projects and propose a genetic algorithm to solve the problem. Khalili-Damghani et al. (2012) formulate a MOP


for the PPS problem considering four objective functions: maximizing profit and total internal rate of return, and minimizing total cost and total unused resources. Çağlar and Gürel (2017) suggest two models (IP with DP and a chance-constrained SP) for two cases of the PPS problem with project cancellations. Moreover, for the R&D PPS problem, Wang and Hwang (2007) utilize fuzzy set theory and propose a fuzzy integer programming model to handle uncertainty and flexibility in the parameters. Fliedner and Liesiö (2016) develop a robust portfolio model for PPS to handle uncertainty, which enables decision-makers to select the level of conservatism used to determine dominance relations between project portfolios.

2.2.3 Cognitive Emulation Methods

According to Heidenberger and Stummer (1999), cognitive emulation methods aim to emulate the real decision-making process in an organization; they consider experiences in the past to predict consequences in the future. Cognitive emulation methods can be classified as decision tree approaches (DT), statistical approaches (SA), and expert systems (ES). Statistical approaches include mean-Gini analysis, hypothesis testing, multivariate statistical analysis, the chi-square test, and regression. Expert systems include artificial neural networks, Kohonen neural networks, fuzzy expert systems, and gray theory-based artificial neural networks. There are a few studies using cognitive emulation methods for determining a portfolio of projects. For instance, considering uncertainty in the parameters, Barucke Marcondes et al. (2017) utilize both mean-Gini analysis and stochastic dominance to find an optimal set of R&D projects in the portfolio. Oh et al. (2012) introduce a fuzzy expert system embracing three tools for PPS: a strategic bucket for sharing scarce resources effectively, a scoring method for the assessment of NPD projects, and matrix analysis for providing balance between projects.

2.2.4 Simulation and Heuristic Approaches (S&H)

Simulation approaches used for PPS include Monte Carlo simulation and system dynamics, while heuristic approaches include genetic algorithms and Pareto ant colony optimization. For example, Iniestra and Gutierrez (2009) utilize the non-dominated sorting genetic algorithm (NSGA-II), a multi-objective evolutionary algorithm, to solve their proposed MOP model and determine the Pareto solutions. Doerner (2004) proposes Pareto ant colony optimization to solve PPS and makes comparisons with Pareto simulated annealing and the non-dominated sorting genetic algorithm to show the efficiency of the proposed metaheuristic. Mavrotas and Pechak (2013) work on handling the uncertainty in the PPS problem; they introduce an integrated approach using MCDA, integer programming, and Monte Carlo simulation, where the Monte Carlo simulation is used for generating parameters for the MCDA and integer programming.


2.2.5 Hybrid Methods

In the literature, some studies use more than one method for the PPS problem. Khalili-Damghani et al. (2012) integrate multi-objective optimization with TOPSIS. Tavana et al. (2015) propose a hybrid PPS approach including DEA, TOPSIS, and integer programming. Razi and Shariat (2017) introduce a PPS model using a gray-based artificial neural network and a regression tree. Yang et al. (2016) integrate DEMATEL and ANP with zero-one goal programming for transport infrastructure PPS. Jafarzadeh et al. (2018) develop a hybrid approach including fuzzy Quality Function Deployment (QFD) and DEA.

2.2.6 Other Methods

There are a number of methods that are not included in the categories given above. Examples of the other methods are technology road maps (Lee et al. 2008), critical success factors analysis (Costantino et al. 2015), weighted network model (Wang et al. 2017), diamond model (Ahn et al. 2010), stochastic multi-attribute acceptability analysis (SMAA) (Song et al. 2019a, b), geographic information system (GIS) (Hashemizadeh and Ju 2019), De Novo approach (Fiala 2018), graph theory and technology mining (Azimi et al. 2019), and uncertain programming (Yan and Ji 2018).

2.3 Types of Projects

The third and last category in the classification scheme is the type of project evaluated in the PPS process. There are numerous types of projects in the literature. One of the most studied project types is R&D projects. Since technological innovations are crucial for success, R&D projects are tools for the continuous improvement of organizations, and organizations should conduct and maintain R&D-based projects to sustain themselves in the competitive business environment and to reach their goals. For instance, Fang et al. (2008) propose a bi-objective mixed-integer stochastic programming model for both R&D PPS and securities portfolio selection. IT/IS projects are also frequently studied in the literature. IT/IS projects can be defined as complicated, interdependent, and unstructured projects that need a high level of coordination and communication across several fields. For example, Cho and Shaw (2013) set up an IT PPS model utilizing the mean-variance efficient frontier to balance portfolio return and risk. The third most studied project type is investment projects (INP). Investment PPS is the process of allocating the capital on hand to a portfolio of projects that are expected to return the maximum profit. Zhang et al. (2011) propose triangle and interval fuzzy return models for investment PPS, together with improved heuristic rules based on a genetic algorithm and a traversal algorithm. Some studies do not differentiate between project types and


analyze the projects in general; we classify these studies as general projects (G). For instance, Guo et al. (2018) use fuzzy real options and introduce a multi-objective genetic algorithm for selecting a portfolio of projects in an uncertain environment. Albano et al. (2019) introduce a mixed-integer nonlinear programming model for the PPS problem taking into account some performance measures. There are some other project types included in the classification scheme: construction projects (CP) (e.g., Dobrovolskienė and Tamošiūnienė 2016; Tsai et al. 2017), new product development projects (NPD) (e.g., Killen et al. 2008; Oh et al. 2012), transportation or railway infrastructure projects (TIP/RIP) (e.g., Gurgur and Morley 2008; Yang et al. 2016), lean and/or lean 6 sigma projects (L/6S) (e.g., Padhy and Sahu 2011; Kornfeld et al. 2013), energy (EP) (e.g., Wu et al. 2019), maintenance (MP) (e.g., Carnero 2015; Mild et al. 2015), oil and gas industry (OGI) (e.g., Lopes and de Almeida 2015; Yan and Ji 2018), performance development (PDP) (e.g., Cho and Moon 2006), software (SOP) (e.g., Kremmel and Biffl 2011), high technology (HTP) (e.g., Tavana et al. 2013), power distribution (PODP) (e.g., Mussoi and Teive 2013), scientific research (SRP) projects (e.g., Ribeiro and Alves 2017), and weapon selection projects (WSP) (e.g., Xiong et al. 2017). All of these types are included in the classification scheme.

3 Analysis of the Literature

A two-stage literature review is conducted to reveal the current studies in the literature related to the PPS problem and to examine the studies of high importance in detail. First, we use the keywords “Project Portfolio Selection,” “Project Portfolio Evaluation,” “Project Portfolio Prioritization,” “Project Portfolio Ranking,” and “Project Portfolio” in the title, abstract, or keywords fields of the Scopus (www.scopus.com) database to search for articles written in English. As a result of this search, more than 600 articles published since 1972 are encountered. We screen these articles to find the articles within the scope, which is limited to articles related to project portfolio selection, prioritization, and evaluation. For this, we initially screen the titles and abstracts of the articles. Second, we examine the more than 300 articles obtained in the initial stage. As a result, we determine 253 articles within the given scope. We examine these articles deeply and categorize them according to the classification scheme presented in Fig. 1. To give an idea of the classification of the articles, the PPS studies in the literature published in 2019 are given in Table 1. We provide the results of the classification in the following sub-sections. In Table 2, the articles obtained from the literature review are categorized according to the scientific journals in which they are published. It is seen that the “European Journal of Operational Research” has the highest number of articles, with 14 articles. Other important journals are the International Journal of Project Management and Annals of Operations Research. Table 2 shows the distribution in the number of articles according to the scientific journals which published three or more articles.

Table 1 PPS studies in the literature published in recent years (2019). The table classifies each study by type of study (A with R/H, T, O), methods (benefit measurement: MCDM, EM, GDM; mathematical programming: LP, NLP, IP, MOP, DP, SP, FMP, RMP; cognitive emulation: DT, SA, ES; S&H; hybrid; others), the scientific journal, and the type of project. The studies covered are Liu and Zhang (2019), Wu et al. (2019), Zhang et al. (2019), Song et al. (2019a, b), Çağlar and Gürel (2019), Storch de Gracia et al. (2019), Dong and Wan (2019), Li et al. (2019), Naldi et al. (2019), Zhong et al. (2019), Hashemizadeh and Ju (2019), Dixit and Tiwari (2019), Albano et al. (2019), Hosseini et al. (2019), Azimi et al. (2019), Artemkina et al. (2019), Hessami et al. (2019), and Wu and Chen (2019).

Table 2 Distribution in number of articles according to scientific journals

The Scientific Journal
European Journal of Operational Research
International Journal of Project Management
Annals of Operations Research
Applied Soft Computing
Expert Systems with Applications
Journal of the Operational Research Society
IEEE Transactions on Engineering Management
Computers and Industrial Engineering
Decision Support Systems
Operations Research
International Journal of Industrial Engineering
The Engineering Economist
Interfaces
International Journal of Quality and Reliability Management
Research-Technology Management
Information Sciences
Arabian Journal for Science and Engineering
Pesquisa Operacional
Sustainability
Soft Computing
Other 133 Journals (number of articles