Decision Economics: Complexity of Decisions and Decisions for Complexity (Advances in Intelligent Systems and Computing, 1009) 9783030382261, 9783030382278, 3030382265

This book is based on the International Conference on Decision Economics (DECON 2019).


Table of contents:
Preface
Decision Economics: A novel discipline.
References
Organisation
General Chairs
International Program Committee
Organising Committee
Contents
The Editors
Calibrating Methods for Decision Making Under Uncertainty
1 Introduction
2 Decision Making Under Uncertainty
2.1 Clairvoyance
3 Three Utility Functions
3.1 Constant Absolute Risk Aversion, CARA
3.2 Constant Relative Risk Aversion, CRRA
3.3 The Dual-Risk-Profile DRP Function from Prospect Theory
4 The Experiments, by Simulation
5 Results
6 Discussion
7 Conclusion
References
Coordination and Search for New Solutions
1 Introduction
2 Outline of the Simulation Model
3 Simulation Experiments
4 Results and Discussion
5 Conclusion
References
Appeasement or Radicalism: A Game Between Intruders and Occupiers
1 Introduction
2 Model
2.1 Setting
2.2 Solution
3 Further Discussion
4 Conclusion
References
Coping with Bounded Rationality, Uncertainty, and Scarcity in Product Development Decisions: Experimental Research
Abstract
1 Introduction
2 Experiment Setup
2.1 Puzzle Game
2.2 Variables
2.3 Experiments
3 Results
3.1 Evidence of Learning
4 Conclusions and Discussion
Acknowledgement
References
What Next for Experimental Economics? Searching for an Alternative Pathway Founded on Mathematical Methods
Abstract
1 Introduction
2 The Scientific Method in Both Economic Theory and Experimental Economics: The Centrality of Mathematics
3 Decomposing Price-Effect: The DP-E Algorithm
4 Conclusions and Future Research
Appendix to Sect. 3
References
Generating Neural Archetypes to Instruct Fast and Interpretable Decisions
1 Introduction
2 GH-ARCH
3 Experimental Results
3.1 Understanding Archetypes
3.2 Making Sense of Archetypes and Decisions in Economics
4 Conclusions
References
Strength Investing - A Computable Methodology for Capturing Strong Trends in Price-Time Space in Stocks, Futures, and Forex
Abstract
1 Strength Investing versus Value Investing and Trend Following
2 Plausibility of Strength Investing - Source of Profitability
3 Strength Investing in Stock Markets
4 Tactical Strategies of Trend Trading
References
A Deep Learning Framework for Stock Prediction Using LSTM
Abstract
1 Introduction
2 Methodology
2.1 Training and Test Sets
2.2 Features and Target
2.3 LSTM Networks
2.4 Benchmark Models
2.4.1 Linear Regression
2.4.2 KNN
2.4.3 ARIMA
2.5 Data
3 Empirical Results
3.1 Main Results
4 Conclusion
Acknowledgments
References
Testing Fiscal Solvency in Macroeconomics
Abstract
1 Introduction
2 Testing for Fiscal Solvency
3 Conclusions
References
CEO Gender and Financial Performance: Empirical Evidence of Women on Board from EMEA Banks
Abstract
1 Introduction
2 Literature Review About Board Diversity
3 Methodology
3.1 Endogeneity
3.2 The GMM model
4 Empirical models
5 Conclusions
References
The Minimum Heterogeneous Agent Configuration to Realize the Future Price Time Series Similar to Any Given Spot Price Time Series in the AI Market Experiment
1 AI Market Behaviors Hinted by FICA's Behaviors
2 Our New Remarkable Results
2.1 Future Price Formations by the Default Agent Configuration
2.2 Future Price Formations in the New Special Environment
3 The Importance of Interacting Price Formation Among the Strategies
3.1 Self-correlation and the Partial Correlation to Lag 10 with 95% White-Noise Confidence Bands
3.2 Analyzing the Self-referencing Factor in View of Some Agent Behaviors
References
A Lab Experiment Using a Natural Language Interface to Extract Information from Data: The NLIDB Game
Abstract
1 Introduction
2 Some Foundational Issues
3 Designing a Natural Language User Interface
4 Designing a Lab Experiment: The NLIDB Game
5 Results and Discussion
6 Conclusions and Future Research
References
The Interdependence of Shared Context: Autonomous Human-Machine Teams
Abstract
1 Introduction
1.1 Introduction to Interdependence
1.2 Literature on Interdependence
1.3 Our Previous Findings on Interdependence
1.4 New Theory
1.5 Conclusions
References
Games with Multiple Alternatives Applied to Voting Systems
1 Introduction
2 Preliminaries
2.1 Simple Games with Multiple Alternatives
2.2 Power Indices with Multiple Alternatives
3 Real Voting Systems
3.1 Nassau County Board of Supervisors of New York
3.2 Voting Alternatives in the Spanish Parliament
3.3 House of Commons of the United Kingdom
4 Conclusions and Future Work
References
Measuring the Macroeconomic Uncertainty Based on the News Text by Supervised LDA for Investor's Decision Making
1 Introduction
1.1 Measurement of the Macroeconomic Uncertainty
1.2 Our Contributions
2 Related Works
3 Datasets
3.1 Text Data
3.2 Numeric Data
4 Process
4.1 Text Data Preprocessing
4.2 Numeric Data Prepossessing
4.3 Topic Classification
4.4 Uncertainty Measurement
5 Topic Model
5.1 Supervised Latent Dirichlet Allocation
6 Results
6.1 Topic Classification
6.2 Macro Economic Uncertainty Index
6.3 Correlation with the Volatilities of Other Market Indices
7 Conclusions and Future Works
References
Mapping the Geography of Italian Knowledge
Abstract
1 Introduction and Foundational Issues
2 Data and Methods
2.1 The Neural Tool and the Database Used
2.2 The Knowledge Complexity Index (KCI)
3 The Geography of Italian Complex Knowledge
4 Conclusions
References
Hot Trading and Price Stability Under Media Supervision in the Chinese Stock Market
Abstract
1 Introduction
2 Empirical Design and Data
2.1 Turnover Calculation
2.2 Composite Transparency
2.3 Media Coverage
2.4 Momentum Calculations
2.5 Data Description
3 Empirical Outcomes
3.1 The Two-Way Sorted Conditional Turnover-Momentum Portfolio
3.2 The Two-Way Sorted Conditional Transparency-Momentum Portfolio
3.3 The Two-Way Sorted Conditional Coverage-Momentum Portfolio
4 Robustness Checks
4.1 Extensive Formation and Holding Periods for Port (O, R) Portfolio
4.2 Extensive Formation and Holding Periods for Port (T, R) Portfolio
4.3 Extensive Formation and Holding Periods for Port (C, R) Portfolio
5 Conclusions
References
Does Cultural Distance Affect International Trade in the Eurozone?
Abstract
1 Introduction
2 Cultural Distance
2.1 Factors Influencing Trade Between Countries: A Brief Summary
3 Methods: Cultural Distance Index and Impact on International Trade
4 Data
5 Results
5.1 Cultural Distance Among European Countries
5.2 The Effect of International Trade
6 Conclusions
References
Making Sense of Economics Datasets with Evolutionary Coresets
1 Introduction
2 Background
2.1 Machine Learning and Classification
2.2 Coreset Discovery
2.3 Multi-objective Evolutionary Algorithms
3 EvoCore
4 Experimental Results and Discussion
4.1 Meta-analysis
4.2 Making Sense of Coresets and Decisions in Economics
5 Conclusions
References
Understanding Microeconomics Through Cognitive Science: Making a Case for Early Entrepreneurial Decisions
Abstract
1 Introduction and Motivation
2 Epistemological Foundation: Defining Cognitive Agents
3 The Problem Space: Analytical Formalisation
4 The Solution Space: Searching for Satisficing Alternatives
5 Towards a Meta-Cognitive Generalisation
6 Conclusions and Future Research
References
Concept Cloud-Based Sentiment Visualization for Financial Reviews
1 Introduction
2 CSCV
2.1 Extension of Word-Level Sentiment Using the LRP Method
2.2 Word-Cloud Based Text-Visualization
3 Experimental Evaluation
3.1 Text Corpus
3.2 Original Word Sentiment Assignment Property
3.3 Contextual Word Sentiment Assignment Property
3.4 Other Experimental Settings
3.5 Result
4 Text Visualization
5 Related Works
6 Conclusion
References
Population Aging and Productivity. Does Culture Matter? Some Evidence from the EU
Abstract
1 Introduction
2 Aging, Productivity and Culture
3 Method and Results
4 Conclusion
References
Complex Decision-Making Process Underlying Individual Travel Behavior and Its Reflection in Applied Travel Models
Abstract
1 Evolution of Approaches for Modeling Travel Choices
2 Limited Knowledge and Learning from Individual Experience
3 Assessment of Potential for Machine Learning (ML) Methods in Transportation Modeling
4 Important Future Directions
References
Learning Uncertainty in Market Trend Forecast Using Bayesian Neural Networks
1 Introduction
2 Data
2.1 Dataset
2.2 Prediction Task
2.3 Feature Engineering
3 Method
3.1 Convolutional Neural Networks
3.2 Dropout for Bayesian Neural Networks
3.3 Proposed Bayesian CNN
3.4 Prediction with Bayesian CNN Model
4 Results and Discussion
4.1 Comparison Method
4.2 Training
4.3 Modeling Results
4.4 Probability Prediction
5 Conclusion
References
Amortizing Securities as a Pareto-Efficient Reward Mechanism
1 The Model
1.1 Innovation-Backed Securities (IBS)
1.2 More Modeling Elements
2 Designing the Shape of Amortizing Securities
2.1 The Decentralized Equilibrium Innovation Rate
2.2 The Socially Optimal Innovation Rate
2.3 The Socially Optimal Shape of Amortizing Securities
3 Concluding Remarks
References
A Micro-level Analysis of Regional Economic Activity Through a PCA Approach
1 Introduction
2 The Economy of the Province: An Overview
3 An over-Time Analysis of Economic Macro-categories
4 A Principal Component Analysis of the Provincial Economy
5 Conclusions and Future Research
References
Can a Free Market Be Complete?
1 Introduction
2 Contracts as Algorithms
3 The Question of Market Completeness
4 A Complete World
5 To Own or Not to Own ...
6 Conclusion
References
The Self-Organizing Map: An Methodological Note
Abstract
1 Introduction and Fundamental Background
2 The Self-Organizing Algorithm
2.1 Applications and Extensions
3 Conclusions and Future SOM Development
References
The Study of Co-movement Mechanisms Between the Mainland Chinese and Hong Kong Capital Markets
Abstract
1 Introduction
2 Problem Statement
3 Macroeconomic Factors
4 Microeconomic Factors
5 Conclusion – Economic Meaning and Policy Recommendations
References
In Favor of Multicultural Integration: An Experimental Design to Assess Young Migrants’ Literacy Rate
Abstract
1 Introduction
2 Context Analysis
2.1 Foreign Students in the Italian Educational System
3 Two Research Perspectives
4 The Importance and Necessity of Language Assessment in Multicultural Educational Settings
5 Experimental Design
5.1 Experimental Task
5.2 Bioprofile
5.3 Achievement Test
5.4 Placement Test
5.5 Proficiency Test BICS - CALP
6 Conclusion and Future Research
References
Coping with Long-Term Performance Industrial Paving: A Finite Element Model to Support Decision-Making Structural Design
Abstract
1 Introduction
2 Purpose of the Work
3 Finite Element Model Description
3.1 Basic Logical Scheme of DTFEM
3.2 DTFEM Construction
4 DTFEM Processing
4.1 Case 1: Results
4.2 Case 2: Results
4.3 Summary
5 Conclusions and Future Research
Acknowledgments
References
Does Trust Create Trust? A Further Experimental Evidence from an Extra-Laboratory Investment Game
Abstract
1 Introduction
2 From Social Psychology to Economics Experimentation
3 Testing Norms of Trust and Cooperation
3.1 Experimental Procedures
3.2 Preliminary Results
4 Conclusions
References
Evolution of Business Collaboration Networks: An Exploratory Study Based on Multiple Factor Analysis
Abstract
1 Introduction
2 Inter-organizational Networks and Collaboration
3 Methods and Data
3.1 Multiple Factor Analysis (MFA)
3.2 Networks and Data Preparation
4 Results
4.1 Intra-structure
4.2 Inter-structure
4.3 Clustering
5 Conclusions and Final Considerations
References
Ethics and Decisions in Distributed Technologies: A Problem of Trust and Governance Advocating Substantive Democracy
Abstract
1 Ethics and Decision in Distributed Technology
1.1 Philosophy as Pilot of Technology? the Moralizing of Things and the Informatization of Morality
1.2 Distributed Technologies: Strengths and Weaknesses of an Announced Technological Revolution
1.3 “Distributed”, the Quality Beyond the Border of Technology: The Case of PERSONA Project
1.4 The Importance of Ethics and Decision-Making to Guide the Environmental Architecture of Technology
2 Distributed Technologies, Trust, and Governance of Democracy
2.1 Trust, a Complex Human Feeling
2.2 Are We Experiencing a Real “Crisis of Trust”?
2.3 The Crisis as Change of Trust-in-Governance Model
2.4 Democracy for Trusting Distributed Technology
3 Conclusions
Acknowledgements
References
The Ascending Staircase of the Metaphor: From a Rhetorical Device to a Method for Revealing Cognitive Processes
Abstract
1 Introduction
2 The Ascending Staircase Paradigm of the Metaphor
3 A Cognitive Approach to Metaphor
3.1 Metaphor: A Growing Polygonal Chain
3.2 Metaphor: Ornatus Element vs. Cognitive Element
3.3 Metaphor: Terms and Definitions
4 Metaphor as a Cognitive Device in Educational Activity
5 Conclusion and Future Research
References


Advances in Intelligent Systems and Computing 1009

Edgardo Bucciarelli Shu-Heng Chen Juan Manuel Corchado   Editors

Decision Economics: Complexity of Decisions and Decisions for Complexity

Advances in Intelligent Systems and Computing Volume 1009

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong

The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. ** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink **

More information about this series at http://www.springer.com/series/11156

Edgardo Bucciarelli · Shu-Heng Chen · Juan Manuel Corchado



Editors

Decision Economics: Complexity of Decisions and Decisions for Complexity


Editors Edgardo Bucciarelli Department of Philosophical, Pedagogical and Economic-Quantitative Sciences, Section of Economics and Quantitative Methods University of Chieti-Pescara Pescara, Italy

Shu-Heng Chen AI-ECON Research Center - Department of Economics National Chengchi University Taipei, Taiwan

Juan Manuel Corchado Edificio Parque Científico Universidad de Valladolid AIR Institute Salamanca, Spain

ISSN 2194-5357 ISSN 2194-5365 (electronic) Advances in Intelligent Systems and Computing ISBN 978-3-030-38226-1 ISBN 978-3-030-38227-8 (eBook) https://doi.org/10.1007/978-3-030-38227-8 © Springer Nature Switzerland AG 2020 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

Dedicated to Shu-Heng Chen and Juan Manuel Corchado: For all that has been and all that will be together with you both, in the name of education, research, and intellectual social responsibility. E.B.

Decision Economics: A novel discipline. Three years ago, very much inspired by the legacy of Herbert A. Simon (1916– 2001), we organised a special event in commemoration of the hundredth anniversary of his birth under the umbrella of the 13th International Symposium on Distributed Computing and Artificial Intelligence (DCAI) in the University of Seville, Spain. This was also the first time that we attempted to introduce decision economics as a new branch of economics formally. In the past, from a strictly scientific point of view, the term “decision economics” was occasionally used in conjunction with managerial economics, mainly as an application of neoclassical microeconomics. However, given the increasingly interdisciplinary nature of decision-making research, it is desirable to have a panoramic view that is much broader and much more inclusive than the conventional standard view. Therefore, in our first edition of decision economics, we have provided a tentative definition of “decision economics” so as to register this neologism as a discipline in economics (Bucciarelli, Silvestri and Rodriguez, 2016, p. vii), and we have further added some remarks to elaborate on the proposed definition in subsequent editions (Bucciarelli, Chen and Corchado, 2017, 2019). Our efforts over the last three years have successfully aroused a new wave of interest in decision economics, and the special sessions run over the last three years have now expanded into an autonomous conference, beyond DCAI. This remarkable growth is manifested by a total of 35 chapters that are included in this volume, which is almost double the size of our previous edition.


We are certainly grateful to our collaborating partner DCAI for its determination to support this new series. In addition to that, since this is the first international conference on “decision economics” ever held in the world of economics, we would like to highlight the significance of this milestone. For us, this milestone denotes and corresponds to two important recent developments related to decision economics. The first one is the changing and expanding domain of decision economics, not just in terms of its methodology but also in its ontology. The second one, while also related to the first one, is concerned with AI possibilities and their implications for decision-making in economics and finance. Not only do these two developments shape the structure of the fourth edition of this series, but they also give rise to the subtitle of the volume, namely Complexity of Decisions and Decisions for Complexity. Before leaving room for the individual chapters of this volume, all inspired by a shared framework based on decision economics, let us elaborate on the two underpinning developments. First, the study of choice-making and decision-making is one of the most well-received inner definitions of economics as well as one of its cutting-edge research areas, pursued ever since the magnum opus of Lionel Robbins (1898–1984), namely An Essay on the Nature and Significance of Economic Science (Robbins, 1932). Robbins states, “For rationality in choice is nothing more and nothing less than choice with complete awareness of the alternatives rejected. And it is just here that Economics acquires its practical significance.” (Ibid., p. 136; Italics added). With this spirit, the neoclassical economics that followed and strengthened later tended to frame decision problems in an ideal form of logic, endowed with a ‘sufficient’ degree of knowledge, descriptions and transparency; in this way, complete awareness of the alternatives rejected was ensured. This formulation may have helped economics become a formal – but impersonal and detached from human experience – science (e.g., Stigum, 1990); in any case, an unyielding devotion to it without reservation has also alienated economics itself substantially from many realistic aspects of decision making, from both normative and positive perspectives. In reality, few non-trivial decision problems have a complete description or are completely describable. Gerd Gigerenzer opens his book Gut Feelings (Gigerenzer, 2007) with the following sarcastic story: A professor from Columbia University (New York) was struggling over whether to accept an offer from a rival university or to stay. His colleague took him aside and said, “Just maximize your expected utility–you always write about doing this.” Exasperated, the professor responded, “Come on, this is serious.” (Ibid, p. 3; Italics added).

Given the aforementioned kind of problem, which is fraught with what Michael Polanyi (1891–1976) coined as the tacit dimension (Polanyi, 1966), the decision is usually made with the involvement of a set of bounded cognitive abilities, critical skills, knowledge and experience, as well as imagination, gut feeling, affection, emotion, social conformity, cultural routines, and so on and so forth. Unfortunately, this list of added elements has not been sufficiently dealt with in mainstream economics. Decision economics, as a new discipline in economics, acknowledges the interdisciplinary nature of decision-making.


Even though the two countervailing forces—pleasure and pain—had already been brought into economic theory back in the days of Jeremy Bentham (1748–1832) and Stanley Jevons (1835–1882), it may still be too optimistic to think that these two forces can be properly understood in a purely mathematical and quantifiable manner. Had not neoclassical economics been generally oblivious to the complexity of decision problems, a pluralistic approach to decision-making might have already been appreciated. In this regard, decision economics attempts to serve as a bridge between economics and other related disciplines, from biology and the social sciences to the humanities and the computer and cognitive sciences, and also to broaden and to deepen the connection between economics and other disciplines. This scientific pluralism has been pursued in our previous three editions and is also reiterated in this edition. After all, decision economics aims neither to endorse exclusively nor to resemble exactly the hard sciences, since its subject is considerably different from them and continually changing, starting with its immanent ontology (and ethics) and the underlying plurality of paradigms which should, therefore, be regarded and welcomed. And it is precisely this that makes economics − and decision economics therein − a fascinating science, not at all dismal, but rather worthy of being further explored, studied, and taught not only in the sacred groves of academe. Now that more and more social scientists are approaching and realising the emergence of boundedly rational agents − by the way, is there still any doubt remaining?! − if not even irrational agents, and despite the existence of possible invariants of human behaviour (Simon, 1990), the following quotation from Kurt W. Rothschild (1999) describes the idea concisely: “A plurality of paradigms in economics and in social sciences in general is not only an obvious fact but also a necessary and desirable phenomenon in a very complex and continually changing subject.” (Ibid., p. 5). In the final analysis, accordingly, if a different normative standard were to be introduced, it cannot elude a rational theory of heuristics this time (Gigerenzer, 2016). The second point we would like to emphasise is the relationship between AI possibilities and decision economics. Within the cognitive sciences, AI has long been regarded as a tool for building—among others—decision support systems to cope with economic decision-making in increasingly complex business and IT environments. This is why AI is a fulcrum of decision economics, and its functionality has been well demonstrated in Gigerenzer and Selten (2001) and in our previous edition (Bucciarelli, Chen and Corchado, 2019). Nevertheless, with the third wave of the Digital Revolution (e.g. Gershenfeld, Gershenfeld and Cutcher-Gershenfeld, 2017), AI is more than just a toolkit for decision economics; it can go further to redirect the entire research trend and the emerging paradigm of decision economics, supported in the first place by a different structure of thought and cognitive representations than that of neoclassical economics. As we have emphasised in the last three editions, decision economics—or Simonian economics—is built upon a broad notion of boundedly rational agents characterised by limited cognitive capabilities, and hence the limited capability to search and generate alternatives, as well as to process data and thus extract information and knowledge.
With this inescapable constraint, the search rules and the decision rules that are implementable are naturally required to be bounded in either their computational or their algorithmic complexity (e.g. Velupillai, 2010, 2018).


That being the case, the criterion based on complete awareness of the alternatives rejected, the premise suggested by Lionel Robbins, is difficult to comply with. And not only difficult to comply with, but difficult to convey helpfully, too. All things considered, in these prolific days of AI with the Internet of everything (e.g. Lawless, Mittu and Sofge, 2019), people are anxious to know what the meaning of decision-making is, should AI in the end be able to take care of all decisions made by humans, after first handling those routine ones with great success. In fact, today, once in a while, we may have been amazed by the machine intelligence demonstrated by the intelligent assistants or chatbots residing in our smartphones, just to mention one example. People start to wonder again about when and to what extent the Turing test will be passed (e.g. Saygin, Cicekli and Akman, 2000). In 1950, Alan Turing surmised that it may take a century to flag this triumph (Turing, 1950), which is about 30 years from now. To give a timely review of this progress, we would like to make a pre-announcement that Decision Economics 2020 plans to have a celebration on “Turing Tests: 70 Years On” as the main track of the conference, under the aegis of the United Nations. To conclude this Preface, according to Minsky and Papert (1988), the human’s pursuit of machine intelligence has gone through roughly two stages. The first one is more ambitious in that it aims to design machines that can do what humans can do. The McCulloch–Pitts neural network is an example of this (McCulloch and Pitts, 1943). After being in the doldrums for a long while, the second one is more humble; it aims to design machines that can learn what humans can do, entering the era of learning machines or autonomous machines. Rosenblatt’s perceptron (Rosenblatt, 1962) is an example, too, but many other interesting examples have only occurred in recent years. This second-stage machine intelligence substantiates the so-called connectionism initiated by Donald Hebb (1904–1985) (Hebb, 1949) and Friedrich Hayek (1899–1992) (Hayek, 1952), and hence can effectively absorb the tacit knowledge from human experts. Furthermore, armed with the Internet of everything, it can get access to an essentially infinitely large space to retrieve historical data and to carry out extensive analysis based on similarity. Long before, the philosopher and economist David Hume (1711–1776) had already given the greatest guide to modern AI, namely the authority of experience, as we quoted from his An Enquiry Concerning Human Understanding in 1748: In reality all arguments from experience are founded on the similarity which we discover among natural objects, and by which we are induced to expect effects similar to those which we have found to follow from such objects. […] and makes us draw advantage from that similarity which nature has placed among different objects. From causes which appear similar we expect similar effects. This is the sum of all our experimental conclusions. (Ibid, 1748, Section IV, Italics added).

Unfortunately, back in the eighteenth, nineteenth or even most of the twentieth centuries, our bounded rationality, constrained by (very) limited memory or archives and further crippled by the limited search capability, non-trivially restricted our ability to take advantage of those similarities.


Today, when learning machines become an extended part of humans, as the idea of the cyborg portrays, one may wonder not only how decisions will be made, but what the decisions will actually be. Would we be able to have smarter or better decisions? Given this possible future, how should we decide the route leading towards it, that is to say, our decisions for the complexity of our future (Helbing, 2019)? With these two highlights regarding the present and the future of decision economics, let us also make a brief remark on our chosen subtitle. “Complexity of Decisions” shows this volume as a continuation of our second edition of the series (Bucciarelli, Chen and Corchado, 2017), referring to the complex ontology of decisions. If decisions can be arranged in a hierarchy of complexity, then those non-trivial and consequential decisions are expected to locate themselves at the higher levels of the hierarchy, which are often cognitively demanding, radically uncertain, imprecise, vague and incomplete. On the other hand, “Decisions for Complexity” refers to the methods used to make complex decisions and relates this volume to our previous edition (Bucciarelli, Chen and Corchado, 2019). The two subtitles are then further illustrated by the 35 chapters collected in this volume. Our final remarks are for those scholars who have contributed to the success of DECON 2019. We have had the good fortune of working with an outstanding group of scientists and scholars from several disciplines, starting from the members of the International Program Committee: Beautiful minds and beautiful people, all dedicated to our common cause, and their hard work has made our efforts easier. In particular, without Sara Rodríguez González and Fernando De la Prieta, it is hard to imagine how we would have completed this year’s conference on decision economics and, of course, this special book. Our greatest debt of gratitude is both to the members of the International Program Committee and to each of the contributors in this volume. Among the latter, the winners of the two international awards “Decision Economics 2019” are Robert E. Marks (University of New South Wales, School of Economics, Sydney), for the best paper entitled “Calibrating Methods for Decision Making Under Uncertainty”, and Friederike Wall (University of Klagenfurt, Department of Management Control and Strategic Management), for the best application paper entitled “Coordination and Search for New Solutions: An Agent-based Study on the Tension in Boundary Systems”. DECON 2019 was organised by the University of Chieti-Pescara (Italy), the National Chengchi University of Taipei (Taiwan), and the University of Salamanca (Spain), and was held at the Escuela Politécnica Superior de Ávila, Spain, from 26th to 28th June, 2019. We acknowledge the sponsors: IEEE Systems Man and Cybernetics Society, Spain Section Chapter, and IEEE Spain Section (Technical Co-Sponsor), IBM, Indra, Viewnext, Global Exchange, AEPIA-and-APPIA, with the funding support of the Junta de Castilla y León, Spain (ID: SA267P18-Project co-financed with FEDER funds).
Edgardo Bucciarelli, Shu-Heng Chen, Juan Manuel Corchado


References
Bucciarelli, E., Silvestri, M., & González, S. R. (Eds.): Decision Economics, In Commemoration of the Birth Centennial of Herbert A. Simon 1916–2016 (Nobel Prize in Economics 1978). Springer, Cham (Switzerland) (2016).
Bucciarelli, E., Chen, S. H., & Corchado, J. M. (Eds.): Decision Economics: In the Tradition of Herbert A. Simon’s Heritage. Springer, Cham (Switzerland) (2017).
Bucciarelli, E., Chen, S. H., & Corchado, J. M. (Eds.): Decision Economics. Designs, Models, and Techniques for Boundedly Rational Decisions. Springer-Nature, Cham (Switzerland) (2019).
Gershenfeld, N., Gershenfeld, A., & Cutcher-Gershenfeld, J.: Designing Reality: How to Survive and Thrive in the Third Digital Revolution. Basic Books, New York (2017).
Gigerenzer, G.: Gut Feelings: The Intelligence of Unconsciousness. Penguin Books, London (2007).
Gigerenzer, G.: Towards a Rational Theory of Heuristics. In: R. Frantz, L. Marsh (eds.) Minds, Models and Milieux. Commemorating the Centennial of the Birth of Herbert Simon, pp. 34−59. Palgrave Macmillan, London (2016).
Gigerenzer, G., & Selten, R. (Eds.): Bounded Rationality: The Adaptive Toolbox. The MIT Press, Cambridge (MA) (2001).
Hayek, F. A.: The Sensory Order: An Inquiry into the Foundations of Theoretical Psychology. University of Chicago Press, Chicago (1952).
Hebb, D. O.: The Organization of Behavior. John Wiley and Sons, New York (1949).
Helbing, D. O. (Ed.): Towards Digital Enlightenment: Essays on the Dark and Light Sides of the Digital Revolution. Springer-Nature, Cham (Switzerland) (2019).
Hume, D.: An Enquiry Concerning Human Understanding. In: E. Steinberg (Ed.) Hackett Publishing Company, Indianapolis (1977) [1748].
Lawless, W. F., Mittu, R., & Sofge, D.: Artificial Intelligence for the Internet of Everything. Elsevier, Cambridge (MA) (2019).
McCulloch, W. S., & Pitts, W.: A Logical Calculus of the Ideas Immanent in Nervous Activity. The Bulletin of Mathematical Biophysics, 5(4), 115–133 (1943).
Minsky, M., & Papert, S.: Perceptrons: An Introduction to Computational Geometry. The MIT Press, Cambridge (MA) (1988) [1969].
Polanyi, M.: The Tacit Dimension. University of Chicago Press, Chicago (1966).
Robbins, L. C.: An Essay on the Nature and Significance of Economic Science. McMillan & Co, London (1932).
Rosenblatt, F.: Principles of Neurodynamics. Spartan, Washington DC (1962).
Rothschild, K. W.: To Push and to Be Pushed. American Economist, 43(1), 1−8 (1999).
Saygin, A. P., Cicekli, I., & Akman, V.: Turing Test: 50 Years Later. Minds and Machines, 10(4), 463–518 (2000).
Simon, H. A.: Invariants of Human Behaviour. Annual Review of Psychology, 41(1), 1−19 (1990).
Stigum, B. P.: Toward a Formal Science of Economics: The Axiomatic Method in Economics and Econometrics. The MIT Press, Cambridge (MA) (1990).
Turing, A. M.: Computing Machinery and Intelligence. Mind, 59(236), 433–460 (1950).
Velupillai, K. V.: Computable Foundations for Economics. Routledge, Abingdon (UK) (2010).
Velupillai, K. V.: Models of Simon. Routledge, Abingdon (UK) (2018).

Organisation

General Chairs
Edgardo Bucciarelli – University of Chieti-Pescara, Italy
Shu-Heng Chen – National Chengchi University, Taipei, Taiwan
Juan Manuel Corchado – University of Salamanca, Spain

International Program Committee
Federica Alberti – University of Portsmouth, UK
José Carlos R. Alcantud – University of Salamanca, Spain
Barry Cooper† – University of Leeds, UK
Sameeksha Desai – Indiana University, Bloomington, USA
Zhiqiang Dong – South China Normal University, China
Felix Freitag – Universitat Politècnica de Catalunya, Spain
Jakob Kapeller – Johannes Kepler University of Linz, Austria
Amin M. Khan – IST, University of Lisbon, Portugal
Alan Kirman – Aix-Marseille Université, France
Alexander Kocian – University of Pisa, Italy
William Lawless – Paine College, Augusta, USA
Nadine Levratto – Université Paris Ouest Nanterre La Défense, France
Roussanka Loukanova – Stockholm University, Sweden
Nicola Mattoscio – University of Chieti-Pescara, Italy
Elías Moreno – University of Granada, Spain
Giulio Occhini – Italian Association for Informatics and Automatic Calculation, Milan, Italy
Lionel Page – University of Technology Sydney, Australia
Paolo Pellizzari – Ca’ Foscari University of Venice, Italy
Enrico Rubaltelli – University of Padua, Italy
Anwar Shaikh – The New School for Social Research, New York, USA
Pietro Terna – University of Turin, Italy
Katsunori Yamada – Osaka University, Japan
Ragupathy Venkatachalam – Goldsmiths, University of London, UK
Peter Vovsha – INRO Research, New York, USA
Stefano Zambelli – University of Trento, Italy

Organising Committee
Juan Manuel Corchado Rodríguez, Javier Parra, Sara Rodríguez González, Roberto Casado Vara, Fernando De la Prieta, Sonsoles Pérez Gómez, Benjamín Arias Pérez, Javier Prieto Tejedor, Pablo Chamoso Santos, Amin Shokri Gazafroudi, Alfonso González Briones, José Antonio Castellanos, Yeray Mezquita Martín, Enrique Goyenechea, Javier J. Martín Limorti, Alberto Rivas Camacho, Ines Sitton Candanedo, Daniel López Sánchez, Elena Hernández Nieves, Beatriz Bellido, María Alonso, Diego Valdeolmillos, Sergio Marquez, Guillermo Hernández González, Mehmet Ozturk, Luis Carlos Martínez de Iturrate, Ricardo S. Alonso Rincón, Niloufar Shoeibi, Zakieh Alizadeh-Sani, Jesús Ángel Román Gallego, Angélica González Arrieta, José Rafael García-Bermejo Giner, Pastora Vega Cruz, Mario Sutil, Ana Belén Gil González, and Ana De Luis Reboredo – all at the University of Salamanca, Spain (several members also with the AIR Institute, Salamanca).

Contents

Calibrating Methods for Decision Making Under Uncertainty – Robert E. Marks – 1
Coordination and Search for New Solutions – Friederike Wall – 10
Appeasement or Radicalism: A Game Between Intruders and Occupiers – Yehui Lao and Zhiqiang Dong – 18
Coping with Bounded Rationality, Uncertainty, and Scarcity in Product Development Decisions: Experimental Research – Ben Vermeulen, Bin-Tzong Chie, Andreas Pyka, and Shu-Heng Chen – 24
What Next for Experimental Economics? Searching for an Alternative Pathway Founded on Mathematical Methods – Carmen Pagliari and Edgardo Bucciarelli – 36
Generating Neural Archetypes to Instruct Fast and Interpretable Decisions – Pietro Barbiero, Gabriele Ciravegna, Giansalvo Cirrincione, Alberto Tonda, and Giovanni Squillero – 45
Strength Investing - A Computable Methodology for Capturing Strong Trends in Price-Time Space in Stocks, Futures, and Forex – Heping Pan – 53
A Deep Learning Framework for Stock Prediction Using LSTM – Yaohu Lin, Shancun Liu, Haijun Yang, and Harris Wu – 61
Testing Fiscal Solvency in Macroeconomics – Paolo Canofari and Alessandro Piergallini – 70
CEO Gender and Financial Performance: Empirical Evidence of Women on Board from EMEA Banks – Stefania Fensore – 77
The Minimum Heterogeneous Agent Configuration to Realize the Future Price Time Series Similar to Any Given Spot Price Time Series in the AI Market Experiment – Yuji Aruka, Yoshihiro Nakajima, and Naoki Mori – 85
A Lab Experiment Using a Natural Language Interface to Extract Information from Data: The NLIDB Game – Raffaele Dell’Aversana and Edgardo Bucciarelli – 93
The Interdependence of Shared Context: Autonomous Human-Machine Teams – W. F. Lawless – 109
Games with Multiple Alternatives Applied to Voting Systems – Joan Blasco and Xavier Molinero – 117
Measuring the Macroeconomic Uncertainty Based on the News Text by Supervised LDA for Investor’s Decision Making – Kyoto Yono, Kiyoshi Izumi, Hiroki Sakaji, Takashi Shimada, and Hiroyasu Matsushima – 125
Mapping the Geography of Italian Knowledge – Daniela Cialfi and Emiliano Colantonio – 134
Hot Trading and Price Stability Under Media Supervision in the Chinese Stock Market – Hung-Wen Lin, Jing-Bo Huang, Kun-Ben Lin, and Shu-Heng Chen – 142
Does Cultural Distance Affect International Trade in the Eurozone? – Donatella Furia, Iacopo Odoardi, and Davide Ronsisvalle – 154
Making Sense of Economics Datasets with Evolutionary Coresets – Pietro Barbiero and Alberto Tonda – 162
Understanding Microeconomics Through Cognitive Science: Making a Case for Early Entrepreneurial Decisions – Edgardo Bucciarelli and Fabio Porreca – 171
Concept Cloud-Based Sentiment Visualization for Financial Reviews – Tomoki Ito, Kota Tsubouchi, Hiroki Sakaji, Tatsuo Yamashita, and Kiyoshi Izumi – 183
Population Aging and Productivity. Does Culture Matter? Some Evidence from the EU – Emiliano Colantonio, Donatella Furia, and Nicola Mattoscio – 192
Complex Decision-Making Process Underlying Individual Travel Behavior and Its Reflection in Applied Travel Models – Peter Vovsha – 201
Learning Uncertainty in Market Trend Forecast Using Bayesian Neural Networks – Iwao Maeda, Hiroyasu Matsushima, Hiroki Sakaji, Kiyoshi Izumi, David deGraw, Hirokazu Tomioka, Atsuo Kato, and Michiharu Kitano – 210
Amortizing Securities as a Pareto-Efficient Reward Mechanism – Hwan C. Lin – 219
A Micro-level Analysis of Regional Economic Activity Through a PCA Approach – Giulia Caruso, Tonio Di Battista, and Stefano Antonio Gattone – 227
Can a Free Market Be Complete? – Sami Al-Suwailem and Francisco A. Doria – 235
The Self-Organizing Map: An Methodological Note – Daniela Cialfi – 242
The Study of Co-movement Mechanisms Between the Mainland Chinese and Hong Kong Capital Markets – Xianda Shang, Yi Zhou, and Jianwu Lin – 250
In Favor of Multicultural Integration: An Experimental Design to Assess Young Migrants’ Literacy Rate – Cristina Pagani and Chiara Paolini – 258
Coping with Long-Term Performance Industrial Paving: A Finite Element Model to Support Decision-Making Structural Design – Oliviero Camilli – 268
Does Trust Create Trust? A Further Experimental Evidence from an Extra-Laboratory Investment Game – Edgardo Bucciarelli and Assia Liberatore – 279
Evolution of Business Collaboration Networks: An Exploratory Study Based on Multiple Factor Analysis – Pedro Duarte and Pedro Campos – 292
Ethics and Decisions in Distributed Technologies: A Problem of Trust and Governance Advocating Substantive Democracy – Antonio Carnevale and Carmela Occhipinti – 300
The Ascending Staircase of the Metaphor: From a Rhetorical Device to a Method for Revealing Cognitive Processes – Cristina Pagani and Chiara Paolini – 308

The Editors

Edgardo Bucciarelli is an Italian economist. Currently he is an associate professor of Economics at the University of Chieti-Pescara (Italy), where he earned his PhD in Economics (cv SECS/P01). His main research interests lie in the areas of complexity and market dynamics, decision theory, design research, experimental microeconomics, classical behavioural economics, and economic methodology. His main scientific articles appeared, among others, in the Journal of Economic Behavior and Organization, Journal of Post Keynesian Economics, Metroeconomica, Applied Economics, Computational Economics, and other international journals. Several key contributions appeared as book chapters with Physica-Verlag and in the Springer Lecture Notes in Economics and Mathematical Systems. At present, he teaches Experimental Economics, Cognitive Economics and Finance, and Economic Methodology at the University of Chieti-Pescara. He is one of the Directors of the Research Centre for Evaluation and Socio-Economic Development, and the co-founder of the academic spin-off company “Economics Education Services”. He is the co-founder, organising chair, and programme committee chair of a number of international conferences. Shu-Heng Chen is a Taiwanese economist. He earned his PhD in Economics at the University of California (UCLA, Los Angeles, United States) in 1992. Currently, he is a distinguished professor of Economics at the Department of Economics and also the vice President of the National Chengchi University (Taipei, Taiwan). Furthermore, he is the founder and director of the AI-ECON Research Center at the College of Social Sciences of the National Chengchi University and the coordinator of the Laboratory of Experimental Economics in the same University. He is unanimously considered one of the most influential and pioneering scholars in the world in the field of applied research known as Computational Economics. His scientific contributions have been directed at establishing the computational approach to the interpretation of theoretical issues and applied economic problems that remain unresolved today, from a perspective more connected to reality and therefore different from the dominant neoclassical paradigm.


In particular, his most decisive contributions concern the approach based on heterogeneous-agent models and genetic programming in socio-economic studies. His work as a scholar is interdisciplinary and has focused from the beginning on methodologies related to bounded rationality and Herbert A. Simon’s contributions. Shu-Heng Chen holds the position of Editor of prestigious international economic journals and is the author of more than 150 publications including scientific articles, monographs and book chapters. Juan Manuel Corchado is the vice President for Research and Technology Transfer at the University of Salamanca (Spain), where he holds the position of full professor of Computer Science and AI. He is the Director of the Science Park and Director of the Doctoral School of the University of Salamanca, in which he has been elected twice as Dean of the Faculty of Science. In addition to a PhD in Computer Sciences earned at the University of Salamanca, he holds a PhD in Artificial Intelligence from the University of the West of Scotland. Juan Manuel Corchado has been Visiting Professor at the Osaka Institute of Technology since January 2015 and at Universiti Teknologi Malaysia since January 2017, and is a Member of the Advisory Group on Online Terrorist Propaganda of the European Counter Terrorism Centre (EUROPOL). Juan Manuel Corchado is the Director of the BISITE Research Group (Bioinformatics, Intelligent Systems and Educational Technology), created by him in 2000. He is also editor or editor-in-chief of specialised journals such as the Advances in Distributed Computing and Artificial Intelligence Journal, the International Journal of Digital Contents and Applications, and the Oriental Journal of Computer Science and Technology.

Calibrating Methods for Decision Making Under Uncertainty
Robert E. Marks(B)
Economics, University of New South Wales Sydney, Sydney, NSW 2052, Australia
[email protected]
http://www.agsm.edu.au/bobm

Abstract. This paper uses simulation (written in R) to compare six methods of decision making under uncertainty: the agent must choose one of eight lotteries where the six possible (randomly chosen) outcomes and their probabilities are known for each lottery. Will risk-averse or risk-preferring or other methods result in the highest mean payoff after the uncertainty is resolved and the outcomes known? Methods include max-max, max-min, Laplace, Expected Value, CARA, CRRA, and modified Kahneman-Tversky. The benchmark is Clairvoyance, where the lotteries’ outcomes are known in advance; this is possible with simulation. The findings indicate that the highest mean payoff occurs with risk neutrality, contrary to common opinion.

Keywords: Decision making · Uncertainty · Utility functions · Simulation · Clairvoyance · Risk neutrality

1 Introduction

This is not a descriptive paper. It does not attempt to answer the positive question of how people make decisions under uncertainty. Instead, it attempts to answer the normative question of how best to make decisions under uncertainty. How best to choose among lotteries. We must first define “best” and “uncertainty”. By “best” we mean decisions that result in the highest payoffs, where the payoffs are the sum of the prizes won across a series of lotteries. The experimental set-up is that each period the agent is presented with eight lotteries, each with six possible known outcomes or prizes (chosen in the range ±$10). No uncertainty about possible payoffs. But there is uncertainty in each lottery about which payoff or prize will occur. The best information the agent has is the probabilities of the six possible prizes or payoffs in each lottery. Each lottery has six possible payoffs, but the values of these payoffs and their probabilities vary across the eight distinct lotteries. Choosing among these is what we mean by “decision making under uncertainty”.
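To make the platform concrete, the following R sketch (R being the language the paper says its simulation is written in) generates one period’s eight lotteries, each with six prizes drawn uniformly in ±$10 and a randomly drawn probability vector. The function name, the uniform draw for the probabilities, and their normalisation are illustrative assumptions, not the paper’s actual code.

    # Sketch: one period of the experimental platform described above.
    # Eight lotteries, each with six prizes drawn uniformly in [-10, 10] and
    # probabilities drawn at random and normalised to sum to one (an assumption;
    # the paper does not specify how its probabilities are generated).
    make_lotteries <- function(n_lotteries = 8, n_prizes = 6) {
      lapply(seq_len(n_lotteries), function(i) {
        prizes <- runif(n_prizes, min = -10, max = 10)
        w      <- runif(n_prizes)
        list(prizes = prizes, probs = w / sum(w))
      })
    }
    set.seed(1)
    lotteries <- make_lotteries()
    str(lotteries[[1]])   # one lottery: six prizes and their probabilities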

© Springer Nature Switzerland AG 2020. E. Bucciarelli et al. (Eds.): DECON 2019, AISC 1009, pp. 1–9, 2020. https://doi.org/10.1007/978-3-030-38227-8_1


2 Decision Making Under Uncertainty

We model agents as possessing various approaches to this problem.
– A simple approach (the Laplace method) is to ignore any information about the probabilities of payoffs and instead just choose the lottery with the highest average or mean payoff, by calculating the mean of each lottery’s six possible payoffs.
– Another method (modelling an optimistic agent) is to choose the lottery with the highest possible best payoff, the max-max method.
– Modelling a pessimistic agent, another method is to choose the lottery with the highest possible worst payoff, the max-min method. Neither of these methods uses the known probabilities, or even five of the six payoffs.
– A fourth method is to use the known probabilities to choose the lottery with the highest expected payoff, weighting each possible payoff by the probability of its occurring, the Expected Value method.
– Three different families of utility functions (described in Sect. 3).
A code sketch of the first four, payoff-based rules follows this list.
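A minimal sketch of the four payoff-based rules just listed, assuming each lottery is a list with prizes and probs as in the earlier sketch (the function names are illustrative, not the paper’s):

    # Each rule returns the index (1..8) of the chosen lottery.
    choose_laplace  <- function(lots) which.max(sapply(lots, function(l) mean(l$prizes)))
    choose_maxmax   <- function(lots) which.max(sapply(lots, function(l) max(l$prizes)))
    choose_maxmin   <- function(lots) which.max(sapply(lots, function(l) min(l$prizes)))
    choose_expected <- function(lots) which.max(sapply(lots, function(l) sum(l$probs * l$prizes)))

The utility-based methods of Sect. 3 differ mainly in replacing the payoffs inside the expectation with their utilities.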

2.1 Clairvoyance

The so-called Clairvoyant decision maker [1] knows the realisation of any uncertainty, so long as this requires no judgment by the Clairvoyant, and the realisation does not depend on any future action of the Clairvoyant. Here, with simulation of probabilistic outcomes, we can model a Clairvoyant who knows the realised outcome (among the six random possibilities) of each of the eight lotteries, while other decision makers remain ignorant of this. We simulate each outcome as occurring with its (known) probability: only one realised outcome per lottery. The Clairvoyant chooses the lottery with the highest realised outcome of the eight. We can say something of this: if A_1, ..., A_n are i.i.d. uniform on (0, 1), then M_n = max(A_1, ..., A_n) has the expectation n/(n+1). Here, n = 6 and the expected maximum outcome for any lottery must be 6/7 × 20 − 10 = $7.14 (footnote 1). But the realisation of any lottery is in general less than its maximum outcome, and its simulated realised outcome is generated from the weighted random probability distribution of the six possible outcomes. The Clairvoyant is faced by eight lotteries, and chooses the lottery with the highest simulated realised outcome (which the Clairvoyant knows). It turns out (from the simulation) that the expected maximum of these eight realised outcomes is $7.788 (footnote 2). This is the best on average that any decision maker can achieve, given our experimental platform. It is our benchmark.

Footnotes:
1. The lottery outcomes fall randomly in the range ±$10; see Sect. 4.
2. With 48 outcomes, the expected maximum outcome across the eight lotteries is $9.59; the expected maximum of the eight simulated realised outcomes is 81.2% of this maximum.


3 Three Utility Functions

The remaining methods map the known possible payoffs to “utilities”, where the utilities are monotone (but not in general linear) in the dollar amounts of the possible payoffs. These methods vary in how the utilities are mapped from the payoffs. By definition, the utility of a lottery L is its expected utility, or

U(L) = Σ_i p_i U(x_i),   (1)

where each (discrete) outcome x_i occurs with probability p_i, and U(x_i) is the utility of outcome x_i. Risk aversion is the curvature (U″/U′): if the utility curve is locally
– linear (say, at a point of inflection, where U″ = 0), then the decision maker is locally risk neutral;
– concave (its slope is decreasing – Diminishing Marginal Utility), then the decision maker is locally risk averse;
– convex (its slope is increasing), then the decision maker is locally risk preferring.
We consider three types of utility function:
1. those which exhibit constant risk preference across all outcomes (so-called wealth-independent utility functions, or Constant Absolute Risk Aversion CARA functions; see Eq. (2) below);
2. those where the risk preference is a function of the wealth of the decision maker (the Constant Relative Risk Aversion CRRA functions; see Eq. (5) below); and
3. those in which the risk profile is a function of the prospect of gaining (risk averse) or losing (risk preferring): the DRP Value Functions from Prospect Theory. See Eqs. (6) and (7) below.
Since the utility functions are monotone transformations of the possible payoffs, it would be pointless to consider the max-max, max-min, or Laplace methods using utilities instead of payoff values.

3.1 Constant Absolute Risk Aversion, CARA

Using CARA, the utility U of payoff x is given by

U(x) = 1 − e^(−γx),   (2)

where U(0) = 0 and U(∞) = 1, and where γ is the risk aversion coefficient:

γ = −U''(x)/U'(x).   (3)

When γ is positive, the function exhibits risk aversion; when γ is negative, risk preferring; and when γ is zero, risk neutrality, which is identical with the Expected Value method.
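A hedged sketch of how a CARA agent could rank the eight lotteries follows. The normalisation by γ is our own choice (it keeps the utility increasing in the payoff for negative, risk-preferring γ and recovers the Expected Value ranking as γ → 0); it is not spelled out in Eq. (2).

```python
import numpy as np

def cara_utility(x, gamma):
    """CARA utility in the spirit of Eq. (2).

    Normalised form (1 - exp(-gamma*x)) / gamma: increasing in x for any gamma,
    tending to the linear, risk-neutral utility as gamma -> 0.  The division by
    gamma is our assumption for this sketch.
    """
    x = np.asarray(x, dtype=float)
    if abs(gamma) < 1e-12:
        return x
    return (1.0 - np.exp(-gamma * x)) / gamma

def choose_by_cara(payoffs, probs, gamma):
    """Index of the lottery with the highest expected CARA utility."""
    expected_u = (probs * cara_utility(payoffs, gamma)).sum(axis=1)
    return int(np.argmax(expected_u))
```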


3.2 Constant Relative Risk Aversion, CRRA

The Arrow–Pratt measure of relative risk aversion (RRA) ρ is defined as

ρ(w) = −w U''(w)/U'(w) = wγ.   (4)

This introduces wealth w into the agent's risk preferences, so that lower wealth can be associated with higher risk aversion. The risk aversion coefficient γ is as in (3). The Constant Elasticity of Substitution (CES) utility function

U(w) = w^(1−ρ)/(1 − ρ),   (5)

with positive wealth, w > 0, exhibits constant relative risk aversion CRRA, as in (4). In the CRRA simulations, we use the cumulative sum of the realisations of payoffs won (or lost, if negative) in previous lotteries chosen by the agent, plus the possible payoff in this lottery, as the wealth w in (5). It can be shown that with w > 0, ρ > 0 is equivalent to risk aversion. With w > 0 and ρ = 1, the CES function becomes the (risk-averse) logarithmic utility function, U(w) ≈ log(w). With w > 0 and ρ < 0, it is equivalent to risk preferring.
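Analogously, a CRRA agent can be sketched as follows. The treatment of wealth (cumulative winnings plus the candidate payoff) follows the description above; the function names and the handling of the logarithmic limit ρ = 1 are ours.

```python
import numpy as np

def crra_utility(w, rho):
    """CES/CRRA utility of Eq. (5); valid for positive wealth w > 0."""
    w = np.asarray(w, dtype=float)
    if abs(rho - 1.0) < 1e-12:          # limiting (risk-averse) logarithmic case
        return np.log(w)
    return w ** (1.0 - rho) / (1.0 - rho)

def choose_by_crra(payoffs, probs, rho, wealth):
    """Rank lotteries by expected CRRA utility of (current wealth + possible payoff).

    `wealth` is the agent's cumulative winnings so far, as described in the text;
    the caller must keep wealth + payoff positive for Eq. (5) to apply.
    """
    candidate_wealth = wealth + payoffs            # wealth if each outcome occurred
    expected_u = (probs * crra_utility(candidate_wealth, rho)).sum(axis=1)
    return int(np.argmax(expected_u))
```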

3.3 The Dual-Risk-Profile DRP Function from Prospect Theory

From Prospect Theory [2], we model the DRP Value Function, which maps from quantity x to value V with the following two-parameter equations (with β > 0 and δ > 0):

V(x) = (1 − e^(−βx))/(1 − e^(−100β)),   0 ≤ x ≤ 100,   (6)

V(x) = −δ (1 − e^(βx))/(1 − e^(−100β)),   −100 ≤ x < 0.   (7)

The parameter β > 0 models the curvature of the function, and the parameter δ > 0 the asymmetry associated with losses. The DRP function is not wealth independent (see footnote 3). Three DRP functions in Fig. 1 (with three values of β, and δ = 1.75, for prizes between ±$100) exhibit the S-shaped asymmetry postulated by Kahneman and Tversky [2]. The DRP function exhibits risk seeking (loss aversion) when x is negative with respect to the reference point x = 0, and risk aversion when x is positive. We use here a linear probability weighting function (hence no weighting for smaller probabilities). As Fig. 1 suggests, as δ → 1 and β → 0, the value function asymptotes to a linear, risk-neutral function (in this case with a slope of 1).

3. This does not require that we include wealth w in the ranking of the lotteries, as in the CRRA case; instead we choose a reference point at the current level of wealth, and consider the prospective gains and losses of the eight lotteries.
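Equations (6) and (7), together with the linear probability weighting mentioned above, translate almost literally into code; this sketch is illustrative and the function names are ours.

```python
import numpy as np

def drp_value(x, beta, delta):
    """Prospect-theory (DRP) value function of Eqs. (6)-(7), defined on [-100, 100]."""
    x = np.asarray(x, dtype=float)
    scale = 1.0 - np.exp(-100.0 * beta)                  # normalising constant
    gains = (1.0 - np.exp(-beta * x)) / scale            # concave over gains
    losses = -delta * (1.0 - np.exp(beta * x)) / scale   # convex and steeper over losses
    return np.where(x >= 0, gains, losses)

def choose_by_drp(payoffs, probs, beta, delta):
    """Rank lotteries by expected DRP value (linear probability weighting, as in the text)."""
    expected_v = (probs * drp_value(payoffs, beta, delta)).sum(axis=1)
    return int(np.argmax(expected_v))
```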

[Figure 1 plots Value against X over the range −100 to 100 (vertical axis from −2 to 1), for curves labelled 0.05 and 0.019 (values of β).]

Fig. 1. A prospect theory (DRP) value function ([3])

4 The Experiments, by Simulation

The experimental set-up is to generate eight lotteries, each with six possible outcomes, each outcome with its own probability of occurrence. The outcomes are chosen from a uniform distribution between +$10 and −$10; the probabilities are chosen at random so that they add to unity for each lottery. The agent has complete information about the outcomes and their probabilities. Then the agent chooses the "best" lottery, based on the method of choice. The actual realisation of one of the six possibilities from the chosen lottery is simulated, using the generated probabilities: a payoff with probability p will be realised on average in 100p% of iterations. The realised outcome of the chosen lottery is the agent's score (in dollars, say). In each iteration, payoff realisations are derived for each of the eight lotteries. Agents are presented with n iterations of this choice, and each iteration generates new lotteries with new possible payoffs and new probabilities of the payoffs. The mean payoff over these n choices is the score of the specific decision method being tested (see footnote 4). General opinion is that firms, at least, are better served by slightly risk-averse behaviour. Too risk averse, and attractive prospects are ignored ("nothing ventured, nothing gained"), but too risk preferring is the same as gambling, with the risk of losing heavily. What do our simulations tell us about the best method of decision making under uncertainty?

4. See the R [4] code at http://www.agsm.edu.au/bobm/papers/riskmethods.r.
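The authoritative implementation is the R script in footnote 4; the stripped-down Python version of one experiment below is only meant to make the set-up concrete, and the way the random probabilities are generated (uniform draws renormalised to one) is our assumption.

```python
import numpy as np

def run_experiment(choose, n_iter=10_000, n_lotteries=8, n_outcomes=6, seed=0):
    """Mean realised payoff of a decision rule `choose(payoffs, probs) -> index`."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_iter):
        payoffs = rng.uniform(-10.0, 10.0, size=(n_lotteries, n_outcomes))
        probs = rng.random((n_lotteries, n_outcomes))
        probs /= probs.sum(axis=1, keepdims=True)        # probabilities add to unity
        k = choose(payoffs, probs)                       # agent picks a lottery
        total += rng.choice(payoffs[k], p=probs[k])      # one outcome is realised
    return total / n_iter

# Example: the risk-neutral Expected Value rule
ev_rule = lambda payoffs, probs: int(np.argmax((payoffs * probs).sum(axis=1)))
print(run_experiment(ev_rule))   # close to the $3.87 of Table 1, up to how the probabilities are drawn
```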

5 Results

Table 1 presents the mean results of 10,000 iterations (independent samples) of the eight-lottery/six-prize experimental platform, with results for:
1. the benchmark Clairvoyant method
2. the Expected Value method
3. the Laplace method
4. the max-max method
5. the max-min method
6. random choice among the eight lotteries.

Table 1. Mean payoffs by method.
Method           Payoff ($)    % Clairvoyant   % EV
Clairvoyant      7.787999      100             –
Expected value   3.87175       49.71431        100
Laplace          3.359935      43.14247        86.78
Max-max          1.391732      17.87021        35.95
Max-min          2.427924      31.1752         62.71
Random           0.02162699    0               0

The Clairvoyant would have won $7.79 with perfect foresight. The other methods, of course, cannot see the future, which is the essence of decision making under uncertainty. Expected Value (the risk-neutral decision maker) is second, with 49.7% of the Clairvoyant's score; Laplace is third, with 43.1%. Surprisingly, the (pessimist's) max-min, at 31.2%, is almost twice as good as the (optimist's) max-max, at 17.9%. Unsurprisingly, choosing among the eight lotteries randomly is worst, with effectively a zero mean payoff (of 2.16 cents, or 0.56% of EV). Table 2 presents the mean results of 10,000 iterations of the CARA method with different values of the risk-aversion coefficient γ: the results show that the best decisions are made when γ ≈ 0, that is, when the method is risk neutral and approximates the Expected Value method. Table 3 presents the mean results of 10,000 iterations of the CRRA method with different values of the RRA parameter ρ and reveals that, with a CRRA decision maker, again the best profile (the value of ρ that results in the highest expected payoff) is close to zero. That is, as with the CARA method, there is in this set-up no advantage to being risk averse or risk preferring (even a little): the best profile is risk neutrality, as reflected in the Expected Value method. Note that the logarithmic utility method (with ρ = 1.0) performs at only 98.88% of the Expected Value method. Table 4 presents the mean results of 10,000 iterations of twelve DRP functions, combinations of three values of δ and four values of β; the results are reported as percentages of the EV method.

Table 2. CARA mean payoffs, varying γ.
gamma γ      Payoff ($)   % Clairvoyant   % EV
−0.2         3.471413     44.57387        89.66004
−0.16        3.611061     46.367          93.2669
−0.12        3.700547     47.51602        95.57816
−0.08        3.819605     49.04476        98.65321
−0.04        3.858212     49.54048        99.65034
1 × 10^−4    3.871811     49.7151         100.0016
0.04         3.832964     49.2163         98.99824
0.08         3.783976     48.58727        97.73297
0.12         3.72903      47.88175        96.31381
0.16         3.653434     46.91108        94.36131
0.20         3.561462     45.73013        91.98584

Table 3. CRRA, mean payoffs, varying ρ.
rho ρ        Payoff ($)   % Clairvoyant   % EV
−2.5         3.756992     48.24079        97.03601
−2.0         3.811378     48.93912        98.4407
−1.5         3.835013     49.2426         99.05116
−1.0         3.848999     49.42218        99.41239
−0.5         3.866546     49.64749        99.8656
1 × 10^−4    3.87175      49.71431        100
0.5          3.85773      49.5343         99.6379
1.0          3.828434     49.15812        98.88123
1.5          3.805642     48.86547        98.29256
2.0          3.777273     48.5012         97.55984
2.5          3.752105     48.17804        96.90979

Table 4. DRP, % of EV, varying δ and β.
beta β   δ = 1.001   δ = 1.2    δ = 1.4
0.001    100         99.80686   99.44209
0.1      99.59886    98.58363   98.60167
0.2      98.23075    97.98895   97.22377
0.4      96.98482    95.91222   95.2202

From the mean result for Random in Table 1 (which is +0.56% of EV), we can conclude that the errors in Table 4 are about 1.12% (±0.56%) of EV. Again we see that risk-neutral behaviour, here with δ → 1 and β → 0, is the best method for choosing among risky lotteries.

6 Discussion

Whereas there has been much research into reconciling actual human decision making with theory [5], we are interested in seeing what is the best (i.e. most profitable) risk profile for agents faced with risky choices. Rabin [6] argues that loss aversion [2], rather than risk aversion, is a more realistic explanation of how people actually behave when faced with risky decisions. This is captured in our DRP function, which nonetheless favours risk neutrality as a method. An analytical study of Prospect Theory DRP Value Functions [7] posits an adaptive process for decision making under risk such that, despite people being seen to be risk averse over gains and risk seeking over losses with respect to the current reference point [2], the agent eventually learns to make risk-neutral choices. Their result is consistent with our results. A simulation study [8] examines the survival dynamics of investors with different risk preferences in an agent-based, multi-asset, artificial stock market and finds that investors' survival is closely related to their risk preferences. Examining eight possible risk profiles, the paper finds that only CRRA investors with relative risk aversion coefficients close to unity (log-utility agents) survive in the long run (up to 500 simulations). This is not what we found (see Table 3 with ρ = 1). Our results here are consistent with earlier work on this topic [3,9], in which we used machine learning (the Genetic Algorithm) to search for agents' best risk profiles in decision making under uncertainty. Our earlier work was in response to [10], which also used machine learning in this search, and which wrongly concluded that risk aversion was the best profile.

7 Conclusion

As economists strive to obtain answers to questions that are not always amenable to calculus-based results, the use of simulation is growing, and answers are being obtained. This paper exemplifies this: the question of which decision-making method gives the highest payoff in cases of uncertainty (where the possible payoffs and their probabilities are known) is not, in general, amenable to closed-form solution. The answer is strongly that risk-neutral methods are best, as exemplified by the Expected Value method. We believe that exploration of other experiments in decision making under uncertainty (with complete information) will confirm the generality of this conclusion. Will relaxing our assumptions of complete information about possible outcomes and their probabilities result in different conclusions? This awaits further research.


Acknowledgments. The author thanks Professor Shu-Heng Chen for his encouragement, and discussants of the previous papers [3, 9] in this research program. An anonymous reviewer’s comments have improved the paper.

References
1. Howard, R.A.: The foundations of decision analysis. IEEE Trans. Syst. Sci. Cybern. SSC-4, 211–219 (1968)
2. Kahneman, D., Tversky, A.: Prospect theory: an analysis of decision under risk. Econometrica 47, 263–291 (1979)
3. Marks, R.E.: Searching for agents' best risk profiles. In: Handa, H., Ishibuchi, M., Ong, Y.-S., Tan, K.-C. (eds.) Proceedings of the 18th Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES 2014), Chapter 24. Proceedings in Adaptation, Learning and Optimization, vol. 1, pp. 297–309. Springer (2015). http://www.agsm.edu.au/bobm/papers/marksIES2014.pdf
4. R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2013). http://www.R-project.org/
5. Arthur, W.B.: Designing economic agents that act like human agents: a behavioral approach to bounded rationality. Am. Econ. Rev. Papers Proc. 81, 353–360 (1991)
6. Rabin, M.: Risk aversion and expected-utility theory: a calibration theorem. Econometrica 68, 1281–1292 (2000)
7. DellaVigna, S., LiCalzi, M.: Learning to make risk neutral choices in a symmetric world. Math. Soc. Sci. 41, 19–37 (2001)
8. Chen, S.-H., Huang, Y.-C.: Risk preference, forecasting accuracy and survival dynamics: simulation based on a multi-asset agent-based artificial stock market. J. Econ. Behav. Organ. 67(3–4), 702–717 (2008)
9. Marks, R.E.: Learning to be risk averse? In: Serguieva, A., Maringer, D., Palade, V., Almeida, R.J. (eds.) Proceedings of the 2014 IEEE Computational Intelligence for Financial Engineering and Economics (CIFEr), London, 28–29 March, pp. 1075–1079. IEEE Computational Intelligence Society (2015)
10. Szpiro, G.G.: The emergence of risk aversion. Complexity 2, 31–39 (1997)

Coordination and Search for New Solutions
An Agent-Based Study on the Tension in Boundary Systems
Friederike Wall
Alpen-Adria-Universitaet Klagenfurt, 9020 Klagenfurt, Austria
[email protected]
http://www.aau.at/csu

Abstract. Boundary systems, setting constraints for managerial decision-makers, incorporate a tension between granting flexibility to decision-makers for finding new options and aligning managerial choices with respect to overall objectives. This study employs a computational approach to examine configurations of search strategy and coordination mechanisms in boundary systems for their effects on organizational performance. The results suggest that the complexity of the organizational decision problem subtly shapes the effectiveness of the configurations – suggesting search strategies that provide flexibility to managerial decision-makers when complexity is low, and tight coordination combined with exploitative or ambidextrous strategies for higher levels of complexity.
Keywords: Agent-based simulation · Complexity · Coordination · Management control systems · NK fitness landscapes · Search strategy

1 Introduction

According to the prominent "Levers of Control" (LOC) framework [1], organizations employ boundary systems to constrain the behavior of managerial decision-makers and, by this, to affect decision-making in the direction of the overall objective. It is well recognized that the boundary system incorporates a certain tension: shaping – or even enforcing – decision-makers' search for novel solutions via exploitation or exploration on the one hand, and restricting decision-makers in favor of coordination towards solutions superior for the overall firm's objective on the other [1–3]. This tension, in particular, occurs under behavioral assumptions on decision-makers in the spirit of Simon [4,5]. Several, mostly empirical, studies were conducted in order to figure out the interrelations of the boundary system with other control systems of the LOC framework or to identify contingent factors which may affect the effectiveness of the boundary system (e.g., task complexity) (for overviews see [2,3]). This


study seeks to contribute to this body of research – though focusing on the balance of components within the boundary system – and, in particular, relates to the growing research emphasizing the internal fit of components of management controls [6]. In particular, the study addresses the following research question: Which effects on overall organizational performance result from certain combinations of search strategy and coordination mechanisms, taking the complexity of the decision problem to be solved into account as a contingent factor? For investigating the research question, the paper makes use of an agent-based simulation. A simulation-based research method appears appropriate to capture search processes, and an agent-based simulation allows us to consider the collaboration of various interacting parties (e.g., units) within an organization (with further references, [7]). In the model, the task environment of the organizations is represented according to the framework of NK fitness landscapes [8,9], which was originally introduced in the domain of evolutionary biology and has since been broadly employed in managerial science [7]. A key feature of the NK framework is that it allows one to easily control for the complexity of the decision problem [10,11]. The model captures different search strategies (exploitative, explorative or ambidextrous, [12,13]) and two mechanisms of coordination.

2 Outline of the Simulation Model

Organizational Decision Problem: In the simulations, artificial organizations are observed while searching for superior solutions to a decision problem which is modeled according to the framework of NK fitness landscapes: At time step t, the organizations face an N-dimensional binary decision problem, i.e., d_t = (d_1t, ..., d_Nt) with d_it ∈ {0, 1}, i = 1, ..., N, out of 2^N different binary vectors possible. Each of the two states d_it ∈ {0, 1} provides a contribution C_it to the overall performance V(d_t), where the C_it are randomly drawn from a uniform distribution with 0 ≤ C_it ≤ 1. The parameter K (with 0 ≤ K ≤ N − 1) reflects the number of those choices d_jt, j ≠ i, which also affect the performance contribution C_it of choice d_it and, thus, captures the complexity of the decision problem in terms of the interactions among decisions. Hence, contribution C_it may not only depend on the single choice d_it but also on K other choices:

C_it = f_i(d_it; d_i1,t, ..., d_iK,t),   (1)

with {i1, ..., iK} ⊂ {1, ..., i − 1, i + 1, ..., N}. In case of no interactions among choices, K equals 0, and K is N − 1 for the maximum level of complexity, where each single choice i affects the performance contribution of each other binary choice j ≠ i. The overall performance V_t achieved in period t results as the normalized sum of contributions C_it:

V_t = V(d_t) = (1/N) Σ_{i=1}^{N} C_it.   (2)
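One conventional way to realise Eqs. (1) and (2) is to tabulate each contribution C_i over its own bit and its K interaction partners. The Python sketch below draws the interaction partners at random, whereas the paper fixes particular decomposable and non-decomposable structures (Table 1), so it is an illustration rather than the paper's exact model.

```python
import numpy as np

def make_nk_landscape(N=12, K=3, seed=0):
    """Random NK fitness landscape: returns a function V(d) for binary vectors d."""
    rng = np.random.default_rng(seed)
    # For each decision i, pick K other decisions that co-determine its contribution.
    partners = [rng.choice([j for j in range(N) if j != i], size=K, replace=False)
                for i in range(N)]
    # Contribution table: one uniform value for each of the 2^(K+1) sub-configurations.
    tables = rng.random((N, 2 ** (K + 1)))

    def V(d):
        d = np.asarray(d, dtype=int)
        total = 0.0
        for i in range(N):
            bits = np.concatenate(([d[i]], d[partners[i]]))
            idx = int("".join(map(str, bits)), 2)      # index of this sub-configuration
            total += tables[i, idx]
        return total / N                               # normalised sum, Eq. (2)
    return V

V = make_nk_landscape()
print(V(np.zeros(12, dtype=int)), V(np.ones(12, dtype=int)))
```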


Departmental Preferences and Boundaries by Search Strategy: The N-dimensional decision problem is partitioned into M disjoint partial problems of, for the sake of simplicity, equal size N^r. Each of these sub-problems is delegated to one department r – with the particular competencies of department r's head being subject to the boundary system. Department heads seek to maximize compensation, which is merit-based and, for the sake of simplicity, results from linear compensation functions based on department r's contribution P_t^r(d_t^r) to overall performance V_t (see Eq. 2), as given by

P_t^r(d_t^r) = (1/N) Σ_{i=1+w}^{w+N^r} C_it   (3)

with w = Σ_{p=1}^{r−1} N^p for r > 1 and w = 0 for r = 1. In every time step t, each manager r seeks to identify the best – in terms of compensation – configuration for the "own" choices d_t^r out of the currently available options, which are shaped according to the search strategy as part of the boundary system:
Search Strategies: In line with Simon's [4,5] behavioral assumptions, our decision-makers are not able to survey the entire search space and, hence, they cannot "locate" the optimal solution of their decision problem "at once". Rather, they search stepwise for superior solutions. In each time step t, each manager r discovers two alternative solutions d_t^{r,a1} and d_t^{r,a2} for the partial decision problem, compared to the status quo d*_{t−1}^r. For these alternatives, boundaries are set by the headquarter in terms of the – required as well as allowed – distance to the status quo. In particular, a prescribed search strategy may be exploitative, explorative or ambidextrous. In the former case, the Hamming distances of the alternative options to the status quo equal 1 (i.e., h(d^{r,a1}) = Σ_{i=1}^{N^r} |d*_{i,t−1} − d_{i,t}^{r,a1}| = 1; h(d^{r,a2}) = 1); in a purely explorative strategy the Hamming distances of the two alternatives are higher than 1, i.e., h(d^{r,a1}), h(d^{r,a2}) ≥ 2, allowing for more or less "long jumps". Moreover, the simulations are run for ambidextrous strategies, capturing cases of h(d^{r,a1}) = 1 and h(d^{r,a2}) ≥ 2.
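The discovery of alternatives at a prescribed Hamming distance from a department's status quo can be sketched as random bit flips; this is our simplified reading of the strategies "1–1", "1–3", etc., not the paper's code.

```python
import numpy as np

def discover_option(own_status_quo, distance, rng):
    """Return a copy of the department's partial vector with `distance` bits flipped."""
    option = np.array(own_status_quo, dtype=int)
    flip = rng.choice(len(option), size=distance, replace=False)
    option[flip] = 1 - option[flip]
    return option

rng = np.random.default_rng(1)
status_quo = np.array([0, 1, 1, 0])          # N^r = 4 choices of one department
# Ambidextrous strategy "1-3": one incremental and one long-jump alternative
a1 = discover_option(status_quo, 1, rng)
a2 = discover_option(status_quo, 3, rng)
print(a1, a2)
```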

Formation of Expectations: The decision-makers show some further cognitive limitations: (1) The head of department r cannot anticipate the other departments' (q ≠ r) choices and assumes that they will stay with the status quo, i.e., opt for d*_{t−1}^q. (2) The department heads are not able to perfectly ex-ante evaluate the effects of their newly discovered options d_t^{r,a1} and d_t^{r,a2} on their actual value base for compensation P_t^r(d_t^r) (see Eq. 3). Rather, ex-ante evaluations are afflicted with noise which is, for the sake of simplicity, a relative error imputed to the true performance [14]. The error terms follow a Gaussian distribution N(0; σ) with expected value 0 and standard deviations σ^r for each r; errors are independent from each other. Hence, the perceived performance P̃_t^r(d_t) of manager r – i.e., the perceived value base for compensation – is given by:

P̃_t^r(d_t^r) = P_t^r(d_t^r) + e^{r,own}(d_t^r)   (4)


With this, each manager r has a distinct partial and imperfect "view" of the true fitness landscape. However, for the status quo option, we assume that department head r remembers the compensation from the last period and, with this, knows the actual performance P_t^r of the status quo if implemented again. Based on the evaluation of options, each department head r compiles a list of preferences L_t^r = (d_t^{r,p1}, d_t^{r,p2}, d_t^{r,p3}), where d_t^{r,p1} indicates the most preferred option out of d*_{t−1}^r, d_t^{r,a1} and d_t^{r,a2} (and so forth).

Boundaries Set by the Coordination Mechanism: The next step within each period t is to determine the solution for the organization's overall decision problem d_t. For this, as a part of the boundary system, the model captures two, in a way, extreme modes of coordination in the spirit of Sah and Stiglitz [15]:
Decentralized Mode: The highest level of autonomy is granted to the M departments if each of them is allowed to choose its most preferred option. Then, the overall configuration d_t results from d_t = (d_t^{1,p1}, ..., d_t^{r,p1}, ..., d_t^{M,p1}). The headquarter does not intervene in decision-making directly, and its role is limited to registering the achieved performances P_t^r(d_t^r) at the end of each period t and to compensating the department heads accordingly.
Hierarchical Mode: Each department transfers its list L_t^r of preferences to the headquarter, which compiles a composite vector d^C = (d_t^{1,p1}, ..., d_t^{r,p1}, ..., d_t^{M,p1}) from the first preferences and then seeks to evaluate the overall performance V(d^C) (see Eq. 2) this solution promises. However, the headquarter is also not capable of perfectly ex-ante evaluating new options, i.e., solutions other than the status quo: the headquarter's evaluations are also afflicted with a relative error following a Gaussian distribution with expected value 0 and standard deviation σ^cent, resulting in a perceived overall performance Ṽ(d^C). The headquarter decides in favor of the composite vector, i.e., d_t = d^C, if d^C promises the same or a higher performance than the status quo d_{t−1}, i.e., if Ṽ(d^C) ≥ V(d*_{t−1}). If this condition is not satisfied, the headquarter evaluates a vector composed from the departments' second preferences. If this also does not, at least, promise the performance of the status quo, then the status quo is kept, i.e., then d_t = d_{t−1}.
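A condensed sketch of the headquarter's veto in the hierarchical mode is given below; the multiplicative form of the relative evaluation error and the function signature are our simplifications of the description above.

```python
import numpy as np

def hierarchical_choice(V, status_quo, first_prefs, second_prefs, sigma_cent, rng):
    """Headquarter accepts a composite proposal only if its (noisy) perceived
    performance does not fall below the known performance of the status quo."""
    v_status_quo = V(status_quo)                       # actual performance is remembered
    for proposal in (np.concatenate(first_prefs), np.concatenate(second_prefs)):
        perceived = V(proposal) * (1.0 + rng.normal(0.0, sigma_cent))  # relative error
        if perceived >= v_status_quo:
            return proposal
    return status_quo                                  # otherwise keep the status quo
```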

3 Simulation Experiments

The simulation experiments (Table 1) are intended to provide some findings on the configuration of the boundary system as given by the search strategy and the mode of coordination employed. The simulation experiments are conducted for six search strategies where, for example, a search strategy named "1–1" briefly denotes the "exploitation only" case with h(d^{r,a1}) = h(d^{r,a2}) = 1; for the other strategies see Table 1. Since the complexity of the underlying search problem shapes the need for coordination, the experiments distinguish four levels of complexity of the decision problem as well as of the interactions among the M = 3 departments. For this, two parameters are employed (see Table 1).

Table 1. Parameter settings
Observation period: T = 250
Number of choices: N = 12
Number of departments: M = 3, with d^1 = (d_1, d_2, d_3, d_4), d^2 = (d_5, d_6, d_7, d_8), d^3 = (d_9, d_10, d_11, d_12)
Interaction structures: Decomposable: K = 2, K^ex = 0; Near-decomposable: K = 3, K^ex = 1; Non-decomposable, intermediate: K = 5, K^ex = 3; Non-decomposable, high: K = 8, K^ex = 5
Search strategy: "exploitation only": "1–1"; "exploration only": "2–2", "2–3", "3–3"; "ambidextrous": "1–2", "1–3"
Modes of coordination: Decentralized; hierarchical
Precision of ex-ante evaluation: σ^r = 0.05 ∀ r ∈ {1...M}; σ^cent = 0.1 (headquarter)
Simulation runs: Per scenario 2,500 runs, with 10 runs on 250 distinct fitness landscapes

Parameter K depicts the complexity of the entire problem according to the NK framework, and K^ex denotes the level of interactions across sub-problems and, with that, also across departments. The experiments distinguish four different interaction structures (Table 1): (1) In the perfectly decomposable structure the overall search problem is decomposed into M = 3 disjoint parts with maximally intense intra-sub-problem interactions, but no cross-sub-problem interactions (i.e., K^ex = 0). (2) In the nearly decomposable structure with K^ex = 1, only slight cross-sub-problem interactions occur, in that every performance contribution C_i in the primary control of unit r is affected by only one choice made by another unit q ≠ r. In the non-decomposable cases with (3) intermediate or (4) high interactions, a single choice d_i affects the performance contributions of K^ex = 3 or K^ex = 5 choices, respectively, which are in the primary control of other departments. For each combination of interaction structure, search strategy and coordination mode, 2,500 simulations are run.

4 Results and Discussion

Figure 1 displays condensed results for the simulation experiments: For each interaction structure, configuration of search strategy and coordination mode, the final performance V_{t=250}, averaged over 2,500 simulation runs, is displayed (see footnote 1).
1. Confidence intervals at the 99.9% level of V_{t=250} show the following ranges: decomposable: ±0.002 to ±0.004 in decentralized and hierarchical coordination; near-decomposable: dec. ±0.004 to ±0.005, hierar. ±0.003 to ±0.004; non-decomposable intermediate: dec. ±0.005 to ±0.01, hierar. ±0.004 to ±0.005; non-decomposable high: dec. ±0.005 to ±0.01, hierar. ±0.004 to ±0.005.


Fig. 1. Final performances Vt=250 and average alterations per period for different search strategies, coordination modes and levels of complexity of decision problems. Each mark represents the averages of 2,500 simulation runs. For parameter settings see Table 1

This final performance may be regarded as an indicator of the effectiveness of search. Moreover, the average number of single choices altered per period over 250 periods, averaged over the 2,500 runs, is reported – coarsely informing about the average scope of change in the configuration d_t. In the decomposable and near-decomposable interaction structures, the coordination need is low since there are no or only few interdependencies, respectively. Hence, the two coordination modes should lead to rather similar results. This conjecture is broadly confirmed by the results, as can be seen in Fig. 1a and b. For the near-decomposable structure the hierarchical mode performs slightly better than the decentralized mode for all search strategies, which is in line with intuition, since here some interactions requiring coordination are effective. It is worth mentioning that, for both interaction structures, "mixed" strategies, i.e., strategies allowing some flexibility in terms of two different distances of search, provide the highest performance levels. However, the average changes per period are at a rather low level, increasing from a level of around 0.1 to 0.3 – apart from the level of 0.8 in the "2–3" strategy for the decentralized mode, which is discussed in more detail subsequently.


With an increasing level of cross-problem interactions – as in the non-decomposable intermediate and high structures – things clearly change (Fig. 1c and d). The decentralized mode performs remarkably worse than the tighter coordination by hierarchy, which is in line with intuition: the coordination need is at a medium or high level, and the decentralized mode, in fact, does not provide any coordination to balance the effects of interactions. In particular, the performance levels obtained with the decentralized mode are rather sensitive to the search strategy, and the average changes in the d_it per period vary between 0.1 and 2.9 (intermediate) and 0.2 and 3.8 (high). In contrast, the hierarchical mode stabilizes the search processes: the performance appears to be relatively insensitive to the search strategies, and even the average changes per period are at a relatively similar and low level between 0.1 and 0.45 across all search strategies simulated. Moreover, it is worth mentioning that especially those strategies which allow the decision-makers to (also) conduct exploitative behavior – i.e., strategies "1–1", "1–2" and "1–3" – show high performance levels. Hence, an interesting question is what could drive the destabilizing effects of the decentralized mode. An explanation may lie in the subtle interference of interactions and imperfect information on the agents' side: a departmental decision-maker r – forming its preferences in t + 1 without knowing about the intentions of the fellow agents – may not only be surprised by the actual performance P^r achieved in t (see Eqs. 3 vs. 4) but also by the fellow agents' choices in t, which – due to interactions – have affected r's performance. Thus, this eventually lets agent r adapt the "own" choices in t + 1, and so forth – leading to frequent time-delayed mutual adjustments. With long jumps prescribed by the search strategy, these frequent adjustments result in large average switches, as the plots show. In this sense, the "2–3" strategy, providing more flexibility in terms of allowing exploration to be "tuned", is more prone to mutual adjustments than the less flexible "3–3" strategy – which is supported by the average number of alterations per period differing by a factor of 2 between the two strategies. However, the average number of changes may be relevant in a further sense: whenever alterations cause some "switching costs", the average alterations per period also provide some indication of the costs of a certain search strategy. Hence, an interesting trade-off shows up: with the hierarchical mode, alterations – and, in this sense, costs – decrease, while employing the hierarchy for coordination also does not come without costs of its own.

5 Conclusion

This study examines the effects of the combined use of a prescribed search strategy and the coordination mechanism – both regarded as part of an organization's boundary system – on overall organizational performance, taking the complexity of decision problems into account as a contingent factor. The results of the agent-based simulation suggest that for low complexity the coordination mode employed is of relatively low relevance, while mixed search strategies, i.e., letting decision-makers shape the scope of novelty, provide the highest levels of organizational performance. In contrast, when the decision problems are of higher


levels of complexity, tight coordination clearly appears more effective – particularly if combined with search strategies allowing (also) for exploitative search. The results suggest that tight coordination stabilizes the alterations in the solutions implemented, even in highly complex environments, which could be of particular interest when alterations cause some switching costs. This points towards further extensions of this research endeavor. For example, the effects of costs for search and coordination, which were precluded so far, could be analyzed. Moreover, the model does by far not represent the full range of coordination modes, and interrelations with further types of management control, such as performance evaluations and incentive systems, could be studied. This relates to the growing body of research emphasizing the balance between the various types of management controls, which may be studied by more comprehensive computational models of firms' management control systems.

References
1. Simons, R.: Levers of Control: How Managers Use Innovative Control Systems to Drive Strategic Renewal. Harvard Business Press, Boston (1994)
2. Widener, S.K.: An empirical analysis of the levers of control framework. Acc. Organ. Soc. 32, 757–788 (2007)
3. Kruis, A.-M., Speklé, R.F., Widener, S.K.: The levers of control framework: an exploratory analysis of balance. Manag. Acc. Res. 32, 27–44 (2016)
4. Simon, H.A.: A behavioral model of rational choice. Quart. J. Econ. 69, 99–118 (1955)
5. Simon, H.A.: Theories of decision-making in economics and behavioral science. Am. Econ. Rev. 49, 253–283 (1959)
6. Bedford, D.S., Malmi, T., Sandelin, M.: Management control effectiveness and strategy: an empirical analysis of packages and systems. Acc. Organ. Soc. 51, 12–28 (2016)
7. Wall, F.: Agent-based modeling in managerial science: an illustrative survey and study. RMS 10, 135–193 (2016)
8. Kauffman, S.A., Levin, S.: Towards a general theory of adaptive walks on rugged landscapes. J. Theor. Biol. 128, 11–45 (1987)
9. Kauffman, S.A.: The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press, Oxford (1993)
10. Li, R., Emmerich, M.M., Eggermont, J., Bovenkamp, E.P., Bäck, T., Dijkstra, J., Reiber, J.C.: Mixed-integer NK landscapes. In: Parallel Problem Solving from Nature IX, vol. 4193, pp. 42–51. Springer, Berlin (2006)
11. Csaszar, F.A.: A note on how NK landscapes work. J. Organ. Design 7, 1–6 (2018)
12. Siggelkow, N., Levinthal, D.A.: Temporarily divide to conquer: centralized, decentralized, and reintegrated organizational approaches to exploration and adaptation. Organ. Sci. 14, 650–669 (2003)
13. Lazer, D., Friedman, A.: The network structure of exploration and exploitation. Adm. Sci. Q. 52, 667–694 (2007)
14. Wall, F.: The (beneficial) role of informational imperfections in enhancing organisational performance. In: Lecture Notes in Economics and Mathematical Systems, vol. 645, pp. 115–126. Springer, Berlin (2010)
15. Sah, R.K., Stiglitz, J.E.: The architecture of economic systems: hierarchies and polyarchies. Am. Econ. Rev. 76, 716–727 (1986)

Appeasement or Radicalism: A Game Between Intruders and Occupiers
Yehui Lao and Zhiqiang Dong
South China Normal University, Guangzhou 510631, Guangdong, China
[email protected]

Abstract. This paper extends the theory of Dong and Zhang (2016) by allowing the occupier to compromise. It suggests that the security of natural property rights cannot be traded for the occupier's compromise, and that appeasement will fail.
Keywords: Decision making · Conflict · Compromise

1 Introduction

The existing literature discusses why there is costly disagreement in bilateral bargaining. One view attributes it to irrational negotiators or to incomplete information. Schelling (1956, 1960, 1966) gives another reason: rational negotiators are likely to force compromise from the counterparty by visibly committing themselves to an aggressive bargaining stance in order to enhance their share of the surplus. If both parties make such strategic commitments at the same time, conflict happens. Crawford (1982) improves on Schelling's model by focusing on both parties' decisions of whether, and how, to attempt commitment. Crawford concludes that, in the case of symmetric information, whether compromise equilibria – which are often many – exist depends on the cost of revoking commitment. Dong and Zhang (2016) construct a sequential game with an endowment effect, showing that the exogenous availability of resources, the inborn contestability of the intruder, and the perceived endowment effect of the incumbent jointly determine the security of natural property rights. This paper aims to find the Nash equilibrium in order to predict the probability of conflict.

2 Model

2.1 Setting

It is a game model, based on the model of Dong and Zhang (2016), with complete information between the potential intruder (denoted as player 1) and the occupier (denoted as player 2). In this game, the two parties may fight for the occupied


property. Assume the irreplaceable property has an initial value V, where V > 0, for player 1, and a subjective value of αV, where α ≥ 1 due to the endowment effect, for player 2. Let t denote the period, with t ∈ [1, +∞). When the game runs, player 1 decides whether he attacks. His attack costs c1 > 0; when c1 = 0, the game ends. The cost c1 is also the signal for player 2 when deciding whether she defends (radicalism) or not (appeasement). Given c1 > 0, the game likewise ends in a fight when she chooses to defend (c2 > 0). When she chooses appeasement, c2 = 0 and she pays R ∈ (0, V) to player 1 for peace. If player 2 fights player 1, Nature decides the winner with a Tullock contest (Tullock 1980):

p_i = c_i/(c1 + c2)   (1)

Here p_i represents player i's probability of winning the game. Equation (1) implies that a player is more likely to win when she or he pays more than the counterparty does. The winner takes the property, the loser exits, and the game ends.

2.2 Solution

The payoffs of player 1 are as follows:

π1 = p1 V − c1, if c1 > 0;  0, if c1 = 0;  R, if c1 > 0 and c2 = 0.   (2)

In this case, player 1 will attack under the condition p1 V − c1 > 0. Thus, we have Lemma 1.

Lemma 1. The intruder's decision whether or not to attack is based on the difference between the payoff of attacking and the opportunity cost.

The payoffs of player 2 are as follows:

π2 = α p2 V − c2, if c1 > 0 and c2 > 0;  V − R, if c1 > 0 and c2 = 0;  V, if c1 = 0.   (3)

In period t, both player 1 and player 2 choose c_it to maximise π_it; at the optimal levels c*_it we have:

c*_1 = αV/(1 + α)²   (4)

c*_2 = α²V/(1 + α)²   (5)

[Game tree: player 1 chooses Silence, giving payoffs (0, V), or Attack; after Attack, player 2 chooses Defends, giving (p1 V − c1, α p2 V − c2), or Compromise; after Compromise, player 1 chooses Accept, giving (R, V − R), or Not Accept, giving (p1 V − c1, α p2 V − c2).]
Fig. 1. The gaming tree

Therefore, we simplify the model to binary strategies for player i in period t, namely c_it ∈ {0, c*_it}. The game tree is shown in Fig. 1. According to Fig. 1, we first consider the decision of player 1 when player 2 compromises by paying R.

Proposition 1. Player 1 will accept R from player 2 when the ratio of R to V is higher than the minimum acceptable ratio.

Proof of Proposition 1. In stage 3, the condition for acceptance is

p1 V − c1 < R   (6)

Substituting Eqs. (4) and (5) into expression (6), we have the minimum R that player 1 is willing to accept:

R(V, α) = V/(1 + α)²   (7)

We define the ratio 1/(1 + α)² as the minimum acceptable ratio, which is influenced negatively by the endowment effect. It is an interesting feature of this result that the occupier's endowment effect has a negative impact on the intruder's decision. An occupier with a strong endowment effect is willing to pay a higher price to maintain the property she owns (Morewedge and Giblin 2015). In a Tullock contest, this means the occupier has a higher probability of winning a fight. The occupier's willingness to defend the property is thus a factor in the intruder's minimum acceptable ratio.

Proposition 2. A player 2 with a strong endowment effect will fight player 1.


Proof of Proposition 2. In stage 2, the condition for pay-for-peace is

α p2 V − c2 < V − R   (8)

Substituting Eqs. (4) and (5) into expression (8), we have the maximum R that player 2 is willing to pay:

R(V, α) = (1 + 2α + α² − α³) V/(1 + α)²   (9)

The ratio (1 + 2α + α² − α³)/(1 + α)² is the maximum acceptable ratio for the occupier, which depends on the endowment effect. After calculation, the threshold value of α is 2.15 (see footnote 1). A player 2 with an endowment effect less than 2.15 is defined as a weak-endowment occupier, while one with an endowment effect higher than 2.15 is defined as a strong-endowment occupier. When the endowment effect is higher than 2.15, the condition for pay-for-peace is not satisfied: the strong-endowment occupier will pay nothing and fight.
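The closed forms above are easy to check numerically. In this sketch (ours, in Python) the threshold endowment effect is obtained as the real root of 1 + 2α + α² − α³ = 0.

```python
import numpy as np

def min_acceptable_R(V, alpha):
    """Eq. (7): smallest transfer the intruder accepts instead of fighting."""
    return V / (1 + alpha) ** 2

def max_acceptable_R(V, alpha):
    """Eq. (9): largest transfer the occupier is willing to pay for peace."""
    return (1 + 2 * alpha + alpha ** 2 - alpha ** 3) * V / (1 + alpha) ** 2

# Threshold endowment effect: the real root of -a^3 + a^2 + 2a + 1 = 0 with a >= 1
roots = np.roots([-1, 1, 2, 1])
alpha_star = [r.real for r in roots if abs(r.imag) < 1e-9 and r.real >= 1][0]
print(round(alpha_star, 4))               # ~2.1479, as in footnote 1

# Above the threshold, the occupier's maximum falls below zero: she fights.
for alpha in (1.5, 2.0, 2.5):
    print(alpha, min_acceptable_R(1.0, alpha), max_acceptable_R(1.0, alpha))
```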

Proposition 3. Silence is a strictly dominated strategy in stage 1.

Proof of Proposition 3. According to Eq. (2), player 1's payoff is 0 when he chooses Silence. Substituting Eqs. (4) and (5) into expression (2), player 1's payoff is αV/(1 + α)² when he chooses to attack. Since α ≥ 1, the payoff of attacking is always higher than the payoff of Silence. In addition, the payoff of accepting R is higher than the payoff of Silence (0) as well. Therefore, Silence is a strictly dominated strategy in stage 1.

3 Further Discussion

So far, this paper has focused on the one-period game; the following discussion extends the model to multiple periods, and we try to find the Nash equilibrium in different situations. We assume that player 1 will restart stage 1 of the game when he chose to accept the pay-for-peace in the last period. In this period, the remaining value of the irreplaceable property equals the original value minus the sum of payments in prior periods. For the sake of simplicity, we assume player 2 pays a constant R in each period. Thus, we modify the payoffs of player 1 and player 2 as follows:

π_1t = p_1t V_t − c_1t, if c_1t > 0;  0, if c_1t = 0;  R, if c_1t > 0 and c_2t = 0.   (10)

1. There are three roots of 1 + 2α + α² − α³ = 0. Because α ≥ 1, the only valid root is 2.1479.


π_2t = α p_2t V_t − c_2t, if c_1t > 0 and c_2t > 0;  V_t − R, if c_1t > 0 and c_2t = 0;  V_t, if c_1t = 0.   (11)

The dynamic value model is:

V_t = V_1 − (t − 1)R   (12)

Similar to Sect. 2, we have the optimal levels c*_1t and c*_2t:

c*_1t = αV_t/(1 + α)² = α[V_1 − (t − 1)R]/(1 + α)²   (13)

c*_2t = α²V_t/(1 + α)² = α²[V_1 − (t − 1)R]/(1 + α)²   (14)

Proposition 4. The minimum acceptable R of player 1 decreases period by period.

Proof of Proposition 4. Similar to the proof of Proposition 1, the condition for acceptance in stage 3 of period t is

p_1t V_t − c_1t < R   (15)

Substituting Eqs. (12), (13) and (14) into expression (15), we have the minimum R that player 1 is willing to accept:

R(V, α, t) = V/(α² + 2α + t)   (16)

In other words, t has a negative effect on the minimum acceptable R. A possible explanation for this might be that the remaining value of the property decreases period by period; to avoid conflict, player 1 has to decrease the minimum acceptable R.

Proposition 5. The maximum acceptable R of player 2 decreases period by period.

Proof of Proposition 5. Similar to the proof of Proposition 2, the condition for pay-for-peace is

α p_2t V_t − c_2t < V_t − R   (17)

Substituting Eqs. (12), (13) and (14) into expression (17), we have the maximum R that player 2 is willing to accept:

R(V, α, t) = (1 + 2α + α² − α³) V/((1 − t)α³ + tα² + 2tα + t)   (18)
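A short numerical illustration of Propositions 4 and 5, using the closed forms (16) and (18); the parameter values below are arbitrary and the code is ours.

```python
def min_R(V1, alpha, t):
    """Eq. (16): intruder's minimum acceptable transfer in period t."""
    return V1 / (alpha ** 2 + 2 * alpha + t)

def max_R(V1, alpha, t):
    """Eq. (18): occupier's maximum acceptable transfer in period t."""
    num = (1 + 2 * alpha + alpha ** 2 - alpha ** 3) * V1
    den = (1 - t) * alpha ** 3 + t * alpha ** 2 + 2 * t * alpha + t
    return num / den

V1, alpha = 1.0, 1.5
for t in (1, 2, 3, 4):
    print(t, round(min_R(V1, alpha, t), 4), round(max_R(V1, alpha, t), 4))
# Both sequences shrink with t, in line with Propositions 4 and 5.
```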

To derive the maximum acceptable ratio a = (1 + 2α + α² − α³)/((1 − t)α³ + tα² + 2tα + t), we have:

∂a/∂t = −(1 + 2α + α² − α³)²/((1 − t)α³ + tα² + 2tα + t)²

[Flowchart: the original training set is clustered hierarchically; when a classifier trained on the current archetypes reaches an accuracy of at least Amin, the list of archetype sets is returned.]

Fig. 1. Scheme of GH-ARCH for archetype extraction. Amin is a user-defined parameter.

is a data-driven (self-organization) and incremental approach, in the sense that both the number of neurons and their position in the feature space are automatically estimated from data. The technique described in this section directly derives from the Growing Hierarchical EXIN (GH-EXIN) algorithm [5], a neural network for hierarchical clustering. In both cases, neurons can be seen as representative prototypes of clusters, as they are placed in such a way as to provide the best topological representation of the data distribution. Therefore, once the positions of such prototypes are estimated from data, they can be interpreted as an archetype set of representative virtual samples. Figure 1 visually describes how GH-ARCH is used for archetype discovery. The major difference between the two algorithms lies in the final goal and in the indices to be minimized: while GH-EXIN is a network which focuses on finding biclusters and minimizes a biclustering quantization index, GH-ARCH attempts to minimize the heterogeneity and maximize the purity of the clusters, in order to group points which are both close to each other and belonging to the same class. The way in which data is divided at deeper and deeper levels, on the other hand, follows the same algorithm, and it is explained in the following. For each father neuron, a neural network is trained on its corresponding Voronoi set (the set of data represented by the father neuron). The children nodes are the neurons of the associated neural network, and determine a subdivision of the father Voronoi set. For each leaf, the procedure is repeated. The initial structure of the neural network is a seed, i.e. a pair of neurons, which are linked by an edge whose age is set to zero. Multiple node creation and pruning determine the correct number of neurons of each network. For each epoch (presentation in a random way of the whole training set to the network) the basic iteration starts at the


presentation of a new data point, say x_i. All neurons are ranked according to the Euclidean distances between x_i and their weights. The neuron with the shortest distance is the winner w_1. If its distance is larger than the scalar threshold of the neuron (novelty test), a new neuron is created with weight vector given by x_i. The initial weight vectors are heuristically defined as the average feature values of the points in the Voronoi set, and the neural thresholds are given by the mean distance among the same points. Otherwise, there is a weight adaptation and the creation of an edge. The weight computation (training) is based on the Soft Competitive Learning (SCL) [6] paradigm, which requires a winner-takes-most strategy: at each iteration, both the winner and its neighbors change their weights, but in different ways: w_1 and its direct topological neighbors are moved towards x_i by fractions α_1 and α_n (learning rates), respectively, of the vector connecting the weight vectors to the datum. This law requires the determination of a topology (neighbors), which is achieved by the Competitive Hebbian Learning (CHL) rule [6], used for creating the neuron connections: each time a neuron wins, an edge is created, linking it to the second-nearest neuron, if the link does not exist yet. If there was an edge, its age is set to zero and the same age procedure as in [7] is used, as follows. The age of all other links emanating from the winner is incremented by one; during this process, if a link age is greater than the agemax scalar parameter, it is eliminated (pruning). The thresholds of the winner and second winner are recomputed as the distance to their farthest neighbor. At the end of each epoch, if a neuron remains unconnected (no neighbors), it is pruned, but the associated data points are analyzed by a new ranking of all the neurons of the network (i.e. also the neurons of the neural networks of the other leaves of the hierarchical tree). If a point is outside the threshold of the new winner, it is labeled as an outlier and pruned. If, instead, it is inside, it is assigned to the winner's Voronoi set. Each leaf neural network is controlled by the purity, calculated as

P = max_{i∈C} c_i   (1)

where c_i is the number of elements belonging to class i within the Voronoi set of the leaf and C is the number of classes; and by the heterogeneity, calculated as the sum of the Euclidean distances between the neuron (w_γ) and the N data composing its Voronoi set (x_i):

H = Σ_{i=1}^{N} ||w_γ − x_i||   (2)
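Both leaf-quality indices are one-liners in NumPy; the function names below are ours and the snippet is only meant to make Eqs. (1) and (2) concrete.

```python
import numpy as np

def purity(labels):
    """Eq. (1): size of the largest single-class group within the leaf's Voronoi set."""
    _, counts = np.unique(labels, return_counts=True)
    return counts.max()

def heterogeneity(w_gamma, X):
    """Eq. (2): sum of Euclidean distances between the neuron w_gamma and its Voronoi data."""
    diffs = np.asarray(X, dtype=float) - np.asarray(w_gamma, dtype=float)
    return float(np.linalg.norm(diffs, axis=1).sum())

X = np.array([[0.1, 0.2], [0.0, 0.1], [0.9, 0.8]])
labels = np.array([0, 0, 1])
print(purity(labels), heterogeneity(X.mean(axis=0), X))
```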

In particular, the training epochs are stopped when the estimated value of these parameters falls below a percentage of the value for the father leaf. This technique creates a vertical growth of the tree. The horizontal growth is generated by the neurons of each network. However, a simultaneous vertical and horizontal growth is possible. At the end of a training, the graphs created by the neuron edges are checked. If connected subgraphs are detected, each sub-graph is considered as a father, by estimating the centroid of the cluster (vertical growth) and the associated neurons as the corresponding sons (horizontal growth). This


last step ends the neural clustering on the Voronoi set of one leaf to be expanded. The decision on whether to expand a leaf is again based on the purity and the heterogeneity of that leaf. In case either the purity of the leaf is lower than Pmin or the heterogeneity is higher than Hmax (user-defined parameters), the node is labelled as a parent node and a further neural clustering is run on its Voronoi set. After repeating this procedure for all the leaves of a single layer, recalling Fig. 1, a given classifier is trained on the weight vectors of the leaves found by GH-ARCH so far. In case the accuracy obtained on a test set is higher than Amin, the current list of archetypes is returned. Nonetheless, the algorithm also stops in case there are no leaves left to be expanded: this may occur when the purity and heterogeneity of all leaves are already satisfactory. Lastly, if the points grouped by a leaf do not belong to the same class, the label of the archetype of that leaf is assigned by means of a majority voting procedure.

3 Experimental Results

Both the experiments presented in this section can be reproduced using our code, freely available on Bitbucket (see footnote 1). All the classifiers used are implemented in the scikit-learn [8] Python module and use default parameters. For the sake of reproducibility, a random seed is set for all the algorithms exploiting pseudo-random elements.

3.1 Understanding Archetypes

In order to understand at a glance the importance of archetypes, GH-ARCH is first applied to a synthetic data set called Blobs. It is composed of three isotropic 2-dimensional Gaussian distributions, each one representing a different class. In Fig. 2, four archetype sets are shown at different resolution levels, corresponding to the 2nd and the 3rd layer of GH-ARCH. These virtual sets of samples are used to train RandomForest [9] and Ridge [10] classifiers in place of the whole training set. Observe how, even using 3–4 samples, the accuracy of predictions is comparable with the one obtained exploiting the whole training set. Besides, archetypes in deeper layers of GH-ARCH represent the data set distribution in more detail. Finally, notice how archetypes seem to have a regularization effect on tree-based classifiers like RandomForest, as the corresponding decision boundaries are smoother than the ones obtained with the whole training set.

3.2 Making Sense of Archetypes and Decisions in Economics

In order to show how to exploit GH-ARCH in a real setting, the proposed approach is applied to the Car data set [11], containing the information of 406 cars produced in the USA or in Europe/Japan. Cars are characterized considering both architectural features (e.g. number of cylinders and weight) and the production year.
1. https://bitbucket.org/neurocoreml/archetypical-neural-coresets/.


Fig. 2. GH-ARCH on the Blobs dataset using RandomForest (left) and Ridge (right) classifiers. Archetypes in the first and the second row correspond to the ones of the 2nd and the 3rd hierarchical level of GH-ARCH, respectively. The last row shows the decision boundaries obtained using all the training samples to fit model parameters.

Table 1. Cars data set. Training set size, classification accuracy on an unseen test set and running time (in seconds) for different classifiers exploiting both GH-ARCH and state-of-the-art algorithms for core set discovery.
Algorithm       RandomForest              Bagging                   LogisticRegression        Ridge
                Size  Accuracy  Avg time  Size  Accuracy  Avg time  Size  Accuracy  Avg time  Size  Accuracy  Avg time
All samples     270   0.8824    –         270   0.9338    –         270   0.8971    –         270   0.8750    –
GH-ARCH (2L)    4     0.7868    0.06      4     0.7721    0.07      4     0.8235    0.03      4     0.8162    0.03
GH-ARCH (5L)    76    0.8971    0.20      76    0.8456    0.17      66    0.8309    0.15      76    0.8382    0.14
GIGA            27    0.6985    0.15      27    0.7426    0.15      27    0.6838    0.15      27    0.6250    0.15
FW              35    0.7206    0.63      35    0.7279    0.63      35    0.6912    0.63      35    0.6618    0.63
MP              26    0.6250    0.57      26    0.6250    0.57      26    0.6324    0.57      26    0.6324    0.57
FS              17    0.6250    0.62      17    0.6250    0.62      17    0.6250    0.62      17    0.6250    0.62
OP              6     0.6324    0.05      6     0.6176    0.05      6     0.6544    0.05      6     0.6471    0.05
LAR             8     0.6250    0.01      8     0.6250    0.01      8     0.6912    0.01      8     0.6324    0.01

Four ML classifiers are trained on GH-ARCH archetypes to predict the region of origin (USA vs not-USA): Bagging [12], RandomForest [9], Ridge [10], and LogisticRegression [13]. The results obtained exploiting


Table 2. Archetype set extracted in the 2nd layer of GH-ARCH for RandomForest.

Fig. 3. Boxplots displaying the mean and the standard deviation of the training distribution for each feature. On top of boxplots, swarmplots show the archetype set extracted in the 2nd layer of GH-ARCH for RandomForest.

Features        A1       A2       A3       A4
mpg             26.4     19.7     14.3     26.4
Cylinders       1.4      3.0      3.9      1.4
Displacement    130.7    236.2    364.5    130.7
Horsepower      86.9     115.9    170.0    86.9
Weight          2438.5   3314.9   4144.8   2438.5
Acceleration    16.0     15.1     11.7     16.0
Model year      75.4     75.1     72.1     75.3
Target          EU/JP    USA      USA      EU/JP

GH-ARCH are then compared against the 6 coreset discovery algorithms GIGA [14], FW [15], MP [16], OMP [16], LAR [17,18], and FSW [19]. The comparison is performed on three metrics: (i) coreset size (lower is better); (ii) classification accuracy on the test set (higher is better); (iii) running time of the algorithm (lower is better). Table 1 summarizes the results obtained for each ML classifier. With regard to the accuracy on an unseen test set, classifiers trained using archetypes extracted in the 5th layer of GH-ARCH are comparable with the ones trained using the whole training set. In order to show how archetypes can be useful in interpreting model decisions, we manually analyzed the archetypes extracted in the 2nd hierarchical layer of GH-ARCH. Figure 3 and Table 2 show in two different ways (graphical and tabular) the archetype set found. Observe how the American archetypes A2 and A3 are very different from the European/Japanese ones. More in detail, EU/JP vehicles seem more ecological, as they have a better miles-per-gallon (mpg) ratio. American cars, instead, appear to be more powerful, as they have higher values for number of cylinders, engine displacement (in cubic centimeters), horsepower, and weight (in lbs.). Summarizing, the use of archetypes allows ML agents, such as RandomForest, to make fast and accurate predictions and allows human experts to make sense of such decisions by analyzing a few important samples.

4 Conclusions

Coreset discovery is a research line of utmost practical importance, and several techniques are available to find the most informative data points in a given training set. Limiting the search to existing points, however, might impair the final objective, that is, finding a set of points able to summarize the information contained in the original dataset. In this work, hierarchical clustering, based on a novel neural network architecture (GH-EXIN), is used to find meaningful


archetype sets, virtual but representative data points. Results on a real economic dataset show how archetypes may be useful in explaining the decisions taken by machine learning classifiers.

References
1. Doshi-Velez, F., Kim, B.: Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608 (2017)
2. Mitchell Waldrop, M.: News feature: what are the limits of deep learning? Proc. Natl. Acad. Sci. 116(4), 1074–1077 (2019)
3. Bachem, O., Lucic, M., Krause, A.: Practical coreset constructions for machine learning. arXiv preprint arXiv:1703.06476 (2017)
4. Ciravegna, G., Barbiero, P., Cirrincione, G., Squillero, G., Tonda, A.: Discovering hierarchical neural archetype sets. In: The International Joint Conference on Neural Networks (IJCNN), July 2019
5. Cirrincione, G., Ciravegna, G., Barbiero, P., Randazzo, V., Pasero, E.: The GH-EXIN neural network for hierarchical clustering. Neural Networks (2019, under peer review)
6. Haykin, S.S.: Neural Networks and Learning Machines, vol. 3. Pearson, Upper Saddle River (2009)
7. Cirrincione, G., Randazzo, V., Pasero, E.: The growing curvilinear component analysis (GCCA) neural network. Neural Netw. 103, 108–117 (2018)
8. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
9. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
10. Tikhonov, A.N.: On the stability of inverse problems. Dokl. Akad. Nauk SSSR 39, 195–198 (1943)
11. Vanschoren, J., van Rijn, J.N., Bischl, B., Torgo, L.: OpenML: networked science in machine learning. SIGKDD Explorations 15(2), 49–60 (2013)
12. Breiman, L.: Pasting small votes for classification in large databases and on-line. Mach. Learn. 36(1–2), 85–103 (1999)
13. Cox, D.R.: The regression analysis of binary sequences. J. Roy. Stat. Soc.: Ser. B (Methodol.) 20, 215–242 (1958)
14. Campbell, T., Broderick, T.: Bayesian coreset construction via greedy iterative geodesic ascent. In: International Conference on Machine Learning (ICML) (2018)
15. Clarkson, K.L.: Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm. ACM Trans. Algorithms 6, 63 (2010)
16. Pati, Y.C., Rezaiifar, R., Krishnaprasad, P.S.: Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In: Proceedings of 27th Asilomar Conference on Signals, Systems and Computers, pp. 40–44 (1993)
17. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Stat. 32(2), 407–451 (2004)
18. Boutsidis, C., Drineas, P., Magdon-Ismail, M.: Near-optimal coresets for least-squares regression. Technical report (2013)
19. Efroymson, M.A.: Multiple regression analysis. Math. Methods Digital Comput. (1960)

Strength Investing - A Computable Methodology for Capturing Strong Trends in Price-Time Space in Stocks, Futures, and Forex

Heping Pan1,2,3

1 Chinese Cooperative Innovation Center for Intelligent Finance, Institute for Advanced Study, Chengdu University, 2025 Chenglu Road, Longquanyi District, Chengdu 201016, China ([email protected])
2 Intelligent Finance Research Center, Chongqing Institute of Finance, Chongqing Technology and Business University (parallel first), 19 Xuefu Avenue, Nan'an District, Chongqing 400067, China
3 Swingtum Prediction, Delacombe, Australia

Abstract. This paper proposes Strength Investing (SI) as a new computable methodology for investing and trading in global macro financial markets. SI is in fact the foremost mainstream investing methodology among practitioners in developed economies as well as in China. Insisting that practice is the main criterion for testing the truth of financial theory, SI goes beyond modern portfolio theory, which emphasizes diversification, and instead concentrates capital, with minimal diversification, on strong market forces. SI extends value investing in that it offers a complete top-down procedure for the optimal selection of markets (assets) with comparative relative strength, with market timing included. It overcomes the weakness of trend following by dynamically shifting capital to markets with stronger trends, so that frequent stop-loss triggering can be largely avoided. SI mainly aims at international stock index markets and the most liquid stocks, while its principles and techniques can be adapted to commodity futures and foreign exchange (forex).

Keywords: Strength Investing · Value investing · Trend following · Comparative relative strength · Portfolio theory · Trading strategies



1 Strength Investing versus Value Investing and Trend Following

(Supported by the National Social Science Foundation of China with Grant 17BGL231.)

Financial investment in the most liquid global macro financial markets - stocks, commodity futures and foreign exchanges (forex) in developed and leading developing economies - has always been a focus of attention for both financial academics and Wall Street professionals. In the last six or more decades, at least since the 1950s,


numerous methodologies or schools of thought for financial investment have emerged, been developed and been practiced. However, apart from the high-frequency trading of some of the most successful cutting-edge quantitative funds, three general methodologies of financial investment and trading have stood the test of time: (1) modern portfolio theory, originally made computable by financial academics; (2) value investing, including growth investing, mainly for stocks and practiced by stock-centric investors; (3) trend following, mainly for commodity futures and practiced by commodity trading advisors (CTAs). Modern portfolio theory originates in western developed economies, typically the USA and the UK, whose markets have not been interrupted by the world wars over the last two centuries. The theory of financial investment, with financial markets as the main object of study, has developed quantitative financial theories and models, of which the three most important fundamental innovations are portfolio theory, equilibrium pricing theory (the Capital Asset Pricing Model and the Arbitrage Pricing Model) and option pricing models. Modern financial theory and its practice worldwide, in both developed and developing market economies, have faced two difficult problems all along: the first is the implementation of the theories, as pointed out by [1], and the other is how the theories should keep pace with the times, as considered by [2]. It is fair to say that modern financial theory aims more at the pricing of capital assets, mainly stocks, and of derivative products, options and futures. The theory is built on the efficient market theory (EMT), assuming that financial prices follow a random walk, modeled as geometric Brownian motion (GBM) and its variations such as Levy processes with jump-diffusion. The theory has not provided a complete computable methodology for financial investment apart from passive index tracking and market-neutral hedging using derivatives. The modern portfolio theory started by Harry Markowitz provides an investment diversification model, while asset selection and market timing were not considered. Although the CAPM has later been extended to multi-factor models, the selection of factors and factor combinations involves an explosion of complexity, and the EMT can no longer apply. The quantitative portfolio theories have always faced two critical problems: the first is the dynamic selection of assets or markets (each traded asset is itself a trading market); the second is the timing of portfolio adjustment (portfolio rebalancing is a part of it). Value investing is an investment paradigm that advocates buying securities that appear underpriced according to fundamental analysis of company valuation. Its origin goes back to the investment philosophy first taught by Graham [3], who identified value investment opportunities including stocks of public companies trading at discounts to book value or tangible book value, those with high dividend yields, and those with low price-to-earnings multiples or low price-to-book ratios. The emergence of value investing as the most respected investment methodology is largely attributed to high-profile proponents and successful practitioners, represented by Warren Buffett. His methodology is interpreted as arguing that the essence of value investing is buying stocks at less than their intrinsic value.
The discount of the market price to the intrinsic value is what Graham called the “margin of safety”. For the last


28 years, under the influence of Charlie Munger, Buffett has expanded the value investing concept with a focus on finding an outstanding company at a sensible price rather than generic companies at a bargain price. Value stocks do not always beat growth stocks, as demonstrated in the late 1990s. Growth investing is a style of investment strategy focused on capital appreciation. Growth investors invest in companies that exhibit signs of above-average growth, even if the stock price appears expensive in terms of metrics such as price-to-earnings or price-to-book ratios. However, there is no theoretical difference between the concepts of value and growth, as growth is always a component in the calculation of value. It is generally recognized that value investing, including growth investing, involves two difficulties: one is the lack of top-down market timing from the market through sectors to stocks; the other is the lack of a standard valuation method generally applicable to different countries with more or less different financial-economic regimes and evolutionary paths. Trend following is a general type of trading strategy that trades in the direction of the current prevalent trend, i.e., going long when the price trend goes up, or going short when the price trend goes down, expecting the price movement to continue. Trend following is used by commodity trading advisors (CTAs) as the predominant strategy of technical traders. Trend following strategies typically limit possible losses by using stop-loss orders. The underlying tenet of trend followers is that they do not aim to forecast or predict specific price levels; they simply jump on the trend and ride it until their stop-loss orders are triggered. Due to the different time frames and signals employed to identify trends, trend followers on the same market are, as a group, not necessarily correlated with one another. Trend following is known to be vulnerable to price whipsaws that can trigger stop-loss orders, resulting in capital loss in the form of severe drawdowns of the trading equity. The difficulty of setting stop-loss orders together with take-profit targets in multiple time frames makes trend following hard to use even though it looks simple. Strength Investing: Considering the shortcomings of portfolio theory, value investing and trend following, we propose a new computable methodology for financial investment, called Strength Investing (SI). As a general methodology, SI focuses attention on leading market forces and concentrates investment on those markets or assets which show stronger trends than the majority of the other markets or assets in a given asset universe. The tenet here is to concentrate, with minimal diversification, on strong trends of stronger trending markets, balancing the concentrated investment styles of market masters and diversified portfolio management. The source of the strength may come from fundamental value or growth, technical price behavior, or catalyst events. To portfolio theory, SI provides a force and an engine to drive the intelligent portfolio management process. This driving engine has to start with asset selection. Asset selection is never a static task; it is always entangled with investment methodology and market timing. To value investing, SI provides a top-down dynamic selection of markets, sectors and stocks, with market timing included. To trend following, SI dynamically switches to markets or assets showing stronger trends, so that stop-loss orders are triggered less frequently or not at all.
The rest of the paper is organized as follows: Sect. 2 establishes the plausibility of strength investing, clarifying the source of profitability of this methodology; Sect. 3 provides a framework of strength investing in stock markets, which is the main part of


this paper; Sect. 4 describes the tactical design considerations of trend trading strategies, including passive trend following and active trend trading.

2 Plausibility of Strength Investing - Source of Profitability

Strength investing as a general methodology is based on four assumptions (A):

A1: There is a complete universe of investable assets (or markets).
A2: The totality of attention of the market participants in this asset universe is always preserved. This assumption is referred to as attention preservation.
A3: At any time there is always at least one asset in a bull or bear market.
A4: Technically it is always possible to shift investment into the leading bull or bear markets without waiting for stop-losses to be triggered on other invested assets (or markets).

In reality, we can only accept A1 as relatively true, because there is no absolutely complete universe of assets; rather, the attention of the market participants is bounded, so investors only consider putting their capital within a given universe. Examples of such universes are many. Chinese A-shares are typical, as Chinese stock investors only consider the universe of A-shares without distraction to commodity futures or forex markets. On the other hand, futures or forex traders also form their own trading universes. Based on the four assumptions, the methodology of strength investing proceeds as a dynamical mechanism consisting of the following steps (see Fig. 1 for stocks):

Step 1: Define a computable metric for the strength of trend for an asset (or market).
Step 2: Take a top-down procedure: starting from the highest level of the asset universe, calculate the trend strength of each candidate sector of assets, rank-order the candidates according to the trend strength metric, and select a minority of the candidates with the strongest trends. This procedure is called strength selection on a single level.
Step 3: Walk through all the intermediate levels, running the single-level strength selection for each of the sectors or assets selected on the previous higher level. This top-down strength selection is run until the investable or tradable assets are reached on the final level.
Step 4: Construct a portfolio of the selected assets and implement this asset portfolio on the trading market.
Step 5: If the new asset portfolio is no different from the previously implemented portfolio, do nothing; otherwise, implement the new portfolio by adjusting the existing one.
Step 6: Go back to Step 2 by monitoring the comparative relative strengths of the candidate sectors or assets.

Note that the adjustment of the asset portfolio depends on which portfolio theory and techniques are used. Two practical approaches are the mean-variance portfolio model of Markowitz and a predefined weighting scheme for the selected assets, rank-ordered according to their trend strength. The simplest way is to treat each of the selected assets equally, so the new portfolio assigns an equal weighting to each selected asset, while those assets


included in the previous portfolio but not selected in the current portfolio will be closed out from the market. The concepts of trend and strength are logically and inherently related. First of all, the emergence of a trend in a given market (asset) is a matter of phase transition from a trivial random walk to a prevalent bias of market opinions, as described by Soros [4]. Going one step further, the strength of a trending market shows up in a price trend that is stronger than those of the other markets in the given market universe. To differentiate it from the relative strength index used in technical analysis, we use the term comparative relative strength of a market or asset, in comparison with other markets or assets, to refer to the strength of that market or asset. The source of the strength of a given market or asset may be attributed to fundamental value or growth appreciation, technical behavioral dynamics, some strategic action or game play, or market-calendar events or unexpected surprises. The foremost advantage of strength investing, in contrast to value investing and growth investing, is that it is not bound by requiring the strength to come from value or growth. In fact, one is often unable to identify the reason in time when a trend breaks out. In the academic literature, momentum trading and factor investing are the two closely related concepts that have been incorporated into modern portfolio theory. Momentum as a concept in finance [5, 6] refers to the empirically observed tendency for rising asset prices to rise further and falling prices to keep falling, so it essentially means the sustainability of the trend. Strength in our terminology includes the momentum of the price trend, but it requires the price trend to be well supported by a resonance of fundamental value and growth together with the price momentum. In a stock market, if a stock exhibits price momentum and it is also a constituent of a strong industrial sector in a strong economy, then there is strength in this stock with better reliability and sustainability. The existence of momentum is considered a market anomaly, which finance theory struggles to explain [7].
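To make the single-level strength selection and the equal-weighting rule concrete, the sketch below ranks a toy universe of assets by a simple trend-strength metric and allocates equal weights to the strongest few. The metric (lookback return scaled by volatility), the lookback window and top_k are illustrative assumptions, not the definitions used in this paper.

```python
# Minimal sketch of single-level strength selection with equal weighting.
import numpy as np
import pandas as pd

def trend_strength(prices: pd.DataFrame, lookback: int = 60) -> pd.Series:
    """Cumulative return over the lookback window scaled by daily volatility."""
    window = prices.iloc[-lookback:]
    ret = window.iloc[-1] / window.iloc[0] - 1.0
    vol = window.pct_change().std()
    return ret / vol

def select_strong(prices: pd.DataFrame, top_k: int = 3) -> pd.Series:
    """Equal-weight portfolio over the top_k strongest assets; the rest get 0."""
    strength = trend_strength(prices).sort_values(ascending=False)
    weights = pd.Series(0.0, index=prices.columns)
    weights[strength.index[:top_k]] = 1.0 / top_k
    return weights

# Toy example with random-walk prices for six hypothetical assets.
rng = np.random.default_rng(0)
prices = pd.DataFrame(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=(250, 6)), axis=0)),
    columns=[f"asset_{i}" for i in range(6)],
)
print(select_strong(prices))
```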

3 Strength Investing in Stock Markets

A computable methodology of strength investing in a national stock market can be formulated as a dynamical mechanism as follows:

Step 1: Stock classification or clustering with one, two or three approaches: (1) conceptual sectors such as blue chips, growth and small-cap stocks; (2) standard industrial classification such as mining, telecom and real estate; (3) stock clustering through statistical or data mining tools, typically using price correlation or cointegration (see the sketch after these steps).
Step 2: Tracing leaders (or laggards) in the conceptual sectors, industrial sectors and/or stock clusters respectively, in terms of the price strength of each sector relative to a common reference time, which should be a market bottom. Note that in principle there can be one to three levels of sector classification or clustering.
Step 3: Selecting leaders (or laggards) among the stocks of the selected sectors or clusters, still in terms of price strength relative to a common reference time.
Step 4: Grouping the selected stocks with price strength into a multi-asset portfolio.


Step 5: Constructing a multi-strategy portfolio for each selected stock, so that, in total, a multi-asset multi-strategy portfolio is constructed.
Step 6: As the market moves forward, the comparative strength of each sector or cluster and of each stock will change, and even the clustering may generate new, different clusters. Therefore, the whole process needs to be reiterated by going back to Step 1, Step 2, or Step 3.
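The clustering option mentioned in Step 1 can be sketched as follows; the correlation-to-distance transform, the linkage method and the number of clusters are illustrative assumptions rather than the paper's prescription.

```python
# Minimal sketch: group stocks by the correlation of their returns.
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(1)
returns = pd.DataFrame(rng.normal(0, 0.01, size=(250, 8)),
                       columns=[f"stock_{i}" for i in range(8)])

corr = returns.corr()
dist = np.sqrt(0.5 * (1.0 - corr))                 # correlation -> distance
condensed = squareform(dist.values, checks=False)  # condensed distance vector
labels = fcluster(linkage(condensed, method="average"), t=4, criterion="maxclust")
clusters = pd.Series(labels, index=returns.columns)
print(clusters.sort_values())
```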

Fig. 1. A strength investing framework for stocks. The flowchart proceeds from market situation assessment, through tracing strong conceptual sectors, strong industrial sectors and strong stock clusters, to selecting strong stocks, building a multi-stock portfolio, and finally a multi-stock multi-strategy portfolio. Notes in the figure: this strength investing system is applicable mainly to bull market periods, so bear markets should be avoided; when tracing strong industrial sectors, there can be up to 3 levels of industry classification according to S&P; the strength of a stock or a sector can be measured by the comparative relative strength of the price, supported by value and growth appreciation, liquidity and stock concentration, etc.; according to the Intelligent Portfolio Theory, capital allocation on each asset (stock) can be re-diversified onto a portfolio of trading strategies [2]; finally an intelligent portfolio is constructed, which consists of an asset portfolio in which the capital allocation on each asset is implemented as a strategy portfolio.


This procedure is hierarchical in nature, so the stocks with strength, i.e. those showing stronger trends, are selected with higher quality and reliability of trend, because the multi-level strengths from sectors or clusters down to individual stocks confirm the trend of the selected stocks with larger background market forces, which may come from value appreciation, growth prospects, the strategic intent of strategic investors, or herding. The principle here is to stay always with the stronger market forces, so that big mistakes or gross errors can be avoided. This mechanism of hierarchical selection of strong stocks and construction of an intelligent portfolio is illustrated in Fig. 1. [2] provides more details on the intelligent portfolio theory supporting strength investing. To find the possible sources of strength in the stock market, factor investing is a computable approach that involves targeting quantifiable fundamental, technical and strategic characteristics, from the firm level through sectors to the market and economy level, that contribute to or can explain differences in stock returns. An investing strategy of this type tends to tilt equity portfolios towards or away from specific factors in an attempt to generate long-term investment returns in excess of benchmarks. The theory of factor investing traces back to the Capital Asset Pricing Model, which is a one-factor model, and the Arbitrage Pricing Theory, which provides a general form for multi-factor models. Fama and French [8] developed a three-factor model, from which a plethora of multi-factor models have subsequently emerged. A systematic study by Frazzini et al. [9] used a six-factor model to explain the long-term performance of Warren Buffett, decomposing it into components due to leverage, stocks in publicly traded equity, and wholly-owned companies. The six factors include the monthly returns of the standard size, value and momentum factors, the Betting-Against-Beta factor, and the Quality-Minus-Junk factor.

4 Tactical Strategies of Trend Trading

On the strategic level, strength investing overcomes the shortcomings of value investing and trend following with two mechanisms: (1) it shifts investment dynamically to markets (or assets) with stronger trends, so the stop-loss orders on existing positions may be discarded before being triggered, and losses due to triggered stop-losses are often avoided; (2) it always concentrates capital on the market leaders with a small portfolio of minimal diversification, so a superior return on investment can be mechanically realized. On the tactical level, a number of trend trading strategies are still needed, each operating independently on a given market (asset). Strength investing technically takes the form of stronger trend following. Apart from strength comparison and selection, it still trades the selected strong markets with trend trading strategies, except that the stop-loss, though used as a risk control measure, is triggered much less frequently, because capital may be shifted to stronger trending assets before a stop-loss is triggered. Note that trend following has been the main strategy of trend trading. The first decision in trend following is the choice of the time frame. The totality of all sensible time frames defines the scale space of price-time as described by [10]. Trends in different time frames may be different or even contradictory. The two most popular technical analytics for determining the trend are the moving average (MA) and the price channel.


Two basic price channels are the Bollinger band, defined by an MA and a standard deviation, and the Donchian channel, defined by the price highs and lows of the most recent periods. A trend following strategy is designed around the start, continuation, acceleration, slowdown or reversal of an emerging or existing trend, trying to capture or follow the trend with entry, position management and exit rules. We have identified four major types of trend following strategies: (1) rule-based passive trend following, such as the Turtle traders; (2) model-based passive trend following, such as GBM [11]; (3) prediction-based active trend following, using AI [12] or econophysics such as log-periodic power laws (LPPL) [13]; (4) event-based active trend following, driven by market calendar events or unexpected surprises. The latter two types belong to the category of expectation-based active strategies of trend following, while the former two belong to the category of passive trend following. Antonacci [14] provides an empirical approach to dual momentum trading with both cross-asset relative momentum and single-asset absolute momentum, which is close to our theoretical methodology of strength investing. We have also observed the applicability of strength investing in commodity futures and forex markets.
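A minimal sketch of these two tools, a moving-average trend filter combined with a Donchian-channel breakout, is given below. The window lengths are illustrative assumptions, and the snippet is not a complete trading strategy (no position sizing, stop-loss or exit management).

```python
# Minimal sketch: moving-average filter plus Donchian-channel breakout signals.
import numpy as np
import pandas as pd

def trend_signals(close: pd.Series, ma_window: int = 50, channel: int = 20) -> pd.DataFrame:
    ma = close.rolling(ma_window).mean()
    upper = close.rolling(channel).max().shift(1)   # Donchian upper band
    lower = close.rolling(channel).min().shift(1)   # Donchian lower band
    long_entry = (close > upper) & (close > ma)     # breakout in an up-trend
    short_entry = (close < lower) & (close < ma)    # breakdown in a down-trend
    return pd.DataFrame({"ma": ma, "upper": upper, "lower": lower,
                         "long_entry": long_entry, "short_entry": short_entry})

# Toy price series.
rng = np.random.default_rng(2)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0005, 0.01, 500))))
print(trend_signals(close).tail())
```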

References
1. Elton, E.J., Gruber, M.J., Brown, S.J., Goetzmann, W.N.: Modern Portfolio Theory and Investment Analysis, 9th edn. Wiley, Hoboken (2014)
2. Pan, H.: Intelligent portfolio theory - a new paradigm for invest-in-trading. In: Proceedings of 2019 IEEE International Conference on Cyber-Physical Systems (ICPS 2019), Special Session on Financial Big Data for Intelligent Systems, 6–9 May 2019, Taipei (2019)
3. Graham, B., Dodd, D.: Security Analysis: Principles and Technique. McGraw-Hill, New York (1934)
4. Soros, G.: The Alchemy of Finance. Simon & Shuster, New York (1987)
5. Jegadeesh, N., Titman, S.: Returns to buying winners and selling losers: implications for stock market efficiency. J. Finance 48(48), 65–91 (1993)
6. Low, R.K.Y., Tan, E.: The role of analysts' forecasts in the momentum effect. Int. Rev. Financ. Anal. 48, 67–84 (2016)
7. Crombez, J.: Momentum, rational agents and efficient markets. J. Psychol. Financ. Mark. 2(2), 190–200 (2001)
8. Fama, E.F., French, K.R.: Common risk factors in the returns on stocks and bonds. J. Financ. Econ. 33, 3–56 (1993)
9. Frazzini, A., Kabiller, D., Pedersen, L.H.: Buffett's Alpha. NBER Working Paper 19681 (2013). http://www.nber.org/papers/w19681
10. Pan, H.: Yin-Yang volatility in scale space of price-time - a core structure of financial market risk. China Finance Rev. Int. 2(4), 377–405 (2012)
11. Dai, M., Zhang, Q., Zhu, Q.J.: Optimal trend following trading rules (2011). http://ssrn.com/abstract=1762118
12. Pan, H., Zhang, C.: FEPA - EPA: an adaptive integrated prediction model of financial time series. Chin. Manag. Sci. 26(6), 26–38 (2018)
13. Sornette, D.: Why Stock Markets Crash - Critical Events in Complex Financial Systems. Princeton University Press, Princeton (2003)
14. Antonacci, G.: Dual Momentum Investing: An Innovative Approach for Higher Returns with Lower Risk. McGraw-Hill Education, New York (2014)

A Deep Learning Framework for Stock Prediction Using LSTM

Yaohu Lin1, Shancun Liu1,2, Haijun Yang1,3, and Harris Wu4

1 School of Economics and Management, Beihang University, Beijing, China ({linyaohu,Liushancun,navy}@buaa.edu.cn)
2 Key Laboratory of Complex System Analysis, Management and Decision, Beihang University, Ministry of Education, Beijing, China
3 Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing, China
4 Strome College of Business, Old Dominion University, Norfolk, VA, USA ([email protected])

Abstract. In order to test the predictive power of the deep learning model, several machine learning methods are introduced for comparison. Empirical results for the period from 2000 to 2017 show the forecasting power of deep learning technology. Using a series of regression error and accuracy measures, we find that LSTM networks outperform traditional machine learning methods, i.e., Linear Regression, Auto ARIMA and KNN.

Keywords: Prediction · Deep learning · Stock · LSTM · Machine learning

1 Introduction

Stock market prediction is notoriously difficult due to the high degree of noise and volatile features [1, 2]. Technical analysis is one of the most common methods used to predict financial markets. Candlestick charting is distinct from other technical indicators in that it simultaneously utilizes open-high-low-close prices, which reflect not only the changing balance between supply and demand [3], but also the sentiment of the investors in the market [4]. In recent years, machine learning models, such as artificial neural networks, combined with technical analysis, have been widely used to predict financial time series [5–10]. Generally speaking, three main deep learning approaches are widely used: convolutional neural networks [11], deep belief networks [12] and stacked autoencoders [13]. LeCun et al. [14] report that deep learning neural networks have dramatically improved performance in speech recognition, visual object recognition, object detection and many other domains. Cavalcante et al. [15] regard importing deep learning technology into the complexity of financial markets for prediction as one of the most appealing research topics. For example, Ding et al. [16] predict the short-term and long-term influences of events on stock price movements using a neural tensor network and deep convolutional neural networks. Dixon et al. [17] introduce deep neural networks into the futures market to predict price movements. Sirignano [18] applies deep neural


networks to limit order book data to predict stock price movements. Bao et al. [19] predict financial time series using stacked autoencoders and long short-term memory. Cao et al. [20] combine EMD and long short-term memory to improve the accuracy of stock prediction. However, this field still remains relatively unexplored. The remainder of this paper is organized as follows. Section 2 outlines the design of our study. Section 3 presents the empirical results. Section 4 concludes the paper.

2 Methodology

Our methodology consists of several steps. First, we split the raw data into training and test sets; the training set is composed of a sub-training set and a validation set. Second, we discuss the features and the target. Third, we introduce the LSTM deep learning model. Finally, we present several traditional machine learning benchmark models.

2.1 Training and Test Sets

We design our training and test periods according to Krauss [21]. Our study therefore uses a training set of about 3,000 daily observations (approximately thirteen years) and a test set of about 1,200 daily observations (approximately five years). The training and test windows are rolled forward with a sliding window.
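A sliding training/test split of the kind described above might be sketched as follows; the window lengths and step size are assumptions for illustration, since the paper only states the approximate sizes of the two sets.

```python
# Minimal sketch of a sliding-window training/test split over daily data.
import numpy as np

def sliding_splits(n_obs: int, train_len: int = 3000, test_len: int = 1200,
                   step: int = 1200):
    """Yield (train_idx, test_idx) pairs of consecutive index windows."""
    start = 0
    while start + train_len + test_len <= n_obs:
        train_idx = np.arange(start, start + train_len)
        test_idx = np.arange(start + train_len, start + train_len + test_len)
        yield train_idx, test_idx
        start += step

for tr, te in sliding_splits(4228):
    print(f"train {tr[0]}-{tr[-1]}, test {te[0]}-{te[-1]}")
```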

2.2 Features and Target

The features used in this work are Open, High, Low, Close, Volume, Turnover, Sigma, Beta, Relative return, PE and PB. Let $O^s = \{O^s_t;\ t \in T\}$ denote the open price of stock $s$ at time $t$, where $T$ is the collection of trading dates; $H^s = \{H^s_t;\ t \in T\}$ the highest price of stock $s$ at time $t$; $L^s = \{L^s_t;\ t \in T\}$ the lowest price; $C^s = \{C^s_t;\ t \in T\}$ the close price; $V^s = \{V^s_t;\ t \in T\}$ the volume; $Tu^s = \{Tu^s_t;\ t \in T\}$ the turnover; and $TR^s = \{TR^s_t;\ t \in T\}$ the turnover rate of stock $s$ at time $t$. Furthermore, $\sigma$ denotes systematic risk, $\beta$ denotes risk, $RR$ represents the rate of return relative to the overall market, $PE$ denotes the dynamic price-to-earnings ratio of stock $s$ at time $t$ and $PB$ denotes the dynamic price-to-book ratio of stock $s$ at time $t$.

2.3 LSTM Networks

Long short-term memory (LSTM) is a recurrent neural network (RNN) architecture [22, 23]. The RNN is a type of deep neural network architecture [11]. RNN models are hard to train on long-term dependencies because of inherent problems, i.e., vanishing and exploding gradients [24, 25]. Hochreiter and Schmidhuber proposed an effective solution based on memory cells [26]. The memory cell consists of three components: an input gate, an output gate and a forget gate. The gates control the interactions between neighboring memory cells and the memory cell itself. The input gate controls the input state, while the output gate controls the output state, which is the


input to other memory cells. The forget gate can choose to remember or forget its previous state. Figure 1 shows the memory cell in LSTM networks.

Fig. 1. Memory cell in LSTM

For the training of the LSTM network, we use Keras. In the empirical stage, we set the number of hidden neurons to 60 and the dropout rate to 0.1. Mean squared error is chosen as the loss and Adam is selected as the optimizer. The input layer contains 12 features and 60 timesteps. The two main LSTM processing layers each contain 64 hidden neurons with a dropout value of 0.2. This configuration yields 19,456 parameters for the first layer, 33,024 parameters for the second layer and 65 parameters for the output layer.
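A minimal Keras sketch of the configuration described above is given below; it is not the authors' code. The number of input features (n_features) is an assumption, chosen so that the first-layer parameter count matches the 19,456 stated in the text, and the use of separate Dropout layers is one possible reading of "a dropout value of 0.2".

```python
# Minimal sketch: two-layer LSTM regressor with MSE loss and the Adam optimizer.
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

n_timesteps, n_features = 60, 11   # 11 features gives 19,456 first-layer parameters

model = Sequential([
    LSTM(64, return_sequences=True, input_shape=(n_timesteps, n_features)),
    Dropout(0.2),
    LSTM(64),
    Dropout(0.2),
    Dense(1),                      # next-day close price
])
model.compile(loss="mean_squared_error", optimizer="adam")
model.summary()                    # per-layer parameter counts

# Dummy data with the right shapes; in practice X holds sliding windows of
# the features and y the next-day close.
X = np.random.rand(32, n_timesteps, n_features)
y = np.random.rand(32, 1)
model.fit(X, y, epochs=1, batch_size=16, verbose=0)
```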

2.4 Benchmark Models

2.4.1 Linear Regression
Linear regression is the most basic machine learning algorithm. Cakra and Trisedya [27] use a linear regression model to predict the Indonesian stock market. The linear regression model returns an equation which determines the relationship between the independent variables and the dependent variable.

2.4.2 KNN
K nearest neighbors (KNN) is another machine learning algorithm, which measures the similarity between new data points and old data points based on the independent variables. Nie and Song [28] show that the structure of the financial KNN network is related to the properties of the correlation matrix.

2.4.3 ARIMA
The Autoregressive Integrated Moving Average (ARIMA) model is a very popular statistical method for time series forecasting [29]. ARIMA models take past values into account to predict future values.

2.5 Data

For the empirical application, we obtain daily stock data from CCER, a local Chinese data provider. To eliminate survivorship bias, we randomly choose one of the main board stocks on the Shanghai Stock Exchange, one of the middle and small capital stocks on the


Shenzhen Stock Exchange and one of the volatile stocks on the Growth Enterprises Market. We download the data for these specific stocks from Jan 4, 2000 to Dec 31, 2017. For stock code 000001 on the Shenzhen Stock Exchange, a total of 4,184 valid daily observations remained after data cleaning. For stock code 600000, a total of 4,228 valid daily observations remained after removing non-trading days, corresponding to a feature set of 50,736 values. Because the first trading year of the Growth Enterprises Market was 2009, we obtain a total of 1,886 valid daily observations for stock code 300001. Table 1 displays the summary of the case stocks.

Table 1. Summary of case stocks

Stock    No. of daily observations   Features
000001   4,184                       50,208
600000   4,228                       50,736
300001   1,886                       22,632

3 Empirical Results

3.1 Main Results

In this study, we use daily stock data from the Chinese stock market over the 18-year period from 2000 to 2017. For the Shanghai and Shenzhen Stock Exchange stocks, we use the first 3,000 daily observations to train the model and keep the remaining data as the test set; for the Growth Enterprises Market stock, we use the first 1,440 daily observations as training data and keep the remaining data as the test set.

Fig. 2. Prediction result of 000001: actual close price versus the LSTM, LR, ARIMA and KNN predictions.


Figure 2 shows the prediction results for stock 000001. The training period for this stock runs from Jan 4, 2000 to Jan 31, 2013, covering both bull and bear market stages in China. The test period begins on Feb 1, 2013, when this stock had been rising for a while. The ARIMA model overestimates the future trend of this stock and has the worst performance. The KNN model overestimates at the beginning and quickly adjusts to become broadly consistent with the actual trend. The Linear Regression model overestimates the future trend and is not sensitive to the changes. The long short-term memory model gives the best performance: LSTM is basically consistent with the actual trend of the stock, which illustrates the time series forecasting power of deep learning technology.

Table 2. Prediction performance of 000001

                       LSTM    LR      ARIMA    KNN
MSE                    0.56    81.68   355.01   32.20
RMSE                   0.75    9.03    18.84    5.67
MAE                    0.59    8.53    17.60    4.25
Directional accuracy   58.1%   49.6%   51.6%    50.2%

The prediction performance results are shown in Table 2. From the table we can see that LSTM performs best in terms of Mean Squared Error (MSE), Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). In terms of predicting the direction of rises and falls, LSTM is also more accurate than the other prediction methods, with a correct rate of 58.1%.
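The metrics reported in Tables 2-4 can be computed as in the following sketch; the toy arrays stand in for actual and predicted closing prices, and the directional-accuracy definition (matching signs of day-over-day changes) is one common convention.

```python
# Minimal sketch: MSE, RMSE, MAE and directional accuracy for price predictions.
import numpy as np

def evaluate(actual: np.ndarray, predicted: np.ndarray) -> dict:
    err = predicted - actual
    mse = np.mean(err ** 2)
    direction_hit = np.sign(np.diff(predicted)) == np.sign(np.diff(actual))
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "Directional accuracy": np.mean(direction_hit),
    }

actual = np.array([10.0, 10.2, 10.1, 10.4, 10.6])
predicted = np.array([10.1, 10.15, 10.2, 10.35, 10.7])
print(evaluate(actual, predicted))
```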

Fig. 3. Prediction result of 600000: actual close price versus the LSTM, LR, ARIMA and KNN predictions.

The prediction results for stock 600000 are shown in Fig. 3. The training period for this stock runs from Jan 4, 2000 to Oct 12, 2012, covering both bull and bear market stages in China. The test period begins on Oct 15, 2012, when this stock had been


going down for a long time; the market was clearly in a bear state at this point. Extrapolating from the poor long-run performance of this stock, the ARIMA model underestimates the future trend and predicts that it will keep falling. The KNN model overestimates throughout, expecting the stock to rebound since it had fallen for a long time. The Linear Regression model overestimates the future trend and is not sensitive to the changes. The prediction of the long short-term memory model is basically consistent with the actual trend of the stock and gives the best performance.

Table 3. Prediction performance of 600000

                       LSTM    LR      ARIMA    KNN
MSE                    0.63    24.22   506.71   440.71
RMSE                   0.79    4.92    22.51    20.99
MAE                    0.72    3.89    19.53    17.78
Directional accuracy   59.6%   46.5%   57.6%    54.4%

Table 3 shows the prediction performance for stock 600000. We can see that LSTM performs best in terms of MSE, RMSE and MAE. The directional accuracy of LSTM is 59.6%, which is higher than that of the other prediction methods.

Fig. 4. Prediction result of 300001: actual close price versus the LSTM, LR, ARIMA and KNN predictions.

Figure 4 shows the prediction case from the Growth Enterprises Market. The training period for this stock runs from Oct 10, 2009 to Feb 24, 2016, covering both bull and bear market stages in China. The test period begins on Feb 25, 2016, after this stock had experienced both a sharp rise and a plunge. Extrapolating from this history, the ARIMA model overestimates the future trend of the stock and predicts that it will rise. The KNN model overestimates throughout, expecting the stock to skyrocket again as before. The Linear Regression model underestimates the future trend. The prediction of the long short-term memory model again gives the best performance.


Table 4. Prediction performance of 300001

                       LSTM    LR      ARIMA    KNN
MSE                    0.22    15.82   157.33   93.54
RMSE                   0.47    3.98    12.54    9.67
MAE                    0.38    3.49    10.52    6.97
Directional accuracy   63.4%   53.3%   45.1%    48.1%

The prediction performance results are shown in Table 4. From the table we can see that LSTM performs best in terms of MSE, RMSE and MAE. In terms of directional prediction, LSTM is also more accurate than the other methods, with a correct rate of 63.4%.

4 Conclusion

In this paper, we present an empirical study of LSTM for stock market prediction in China. Eighteen years of data, from 2000 to 2017, suggest that LSTM gives the best prediction performance. This study makes contributions in three respects. First, we introduce risk factors into the prediction setting. Second, the representative stocks are carefully selected to represent different markets, and the empirical study validates the prediction power of LSTM. Finally, several traditional machine learning benchmark models are evaluated. LSTM networks outperform the traditional machine learning methods, i.e., Linear Regression, Auto ARIMA and KNN. The KNN model easily magnifies changes, resulting in an overestimated future trend for the stocks. The Linear Regression model is not sensitive to the complex changes of the stock market. The ARIMA model is easily affected by recent observations and amplifies predictions. A series of indicators were introduced to measure performance. LSTM gives the best performance in our case study, which supports the use of deep learning technology for time series forecasting.

Acknowledgments. This research was partially supported by the National Natural Science Foundation of China (Grant Nos. 71771006 and 71771008) and the Fundamental Research Funds for the Central Universities.

References
1. Fama, E.F.: Efficient capital markets: a review of theory and empirical work. J. Finance 25(2), 383–417 (1970)
2. Wang, B., Huang, H., Wang, X.: A novel text mining approach to financial time series forecasting. Neurocomputing 83(6), 136–145 (2012)
3. Caginalp, G., Laurent, H.: The predictive power of price patterns. Appl. Math. Finance 5(3–4), 181–205 (1998)
4. Marshall, B.R., Young, M.R., Rose, L.C.: Candlestick technical trading strategies: can they create value for investors? J. Bank. Finance 30(8), 2303–2323 (2006)


5. Burr, T.: Pattern recognition and machine learning. J. Am. Stat. Assoc. 103(482), 886–887 (2008)
6. Das, S.P., Padhy, S.: Support vector machines for prediction of futures prices in indian stock market. Int. J. Comput. Appl. 41(3), 22–26 (2013)
7. Lu, C.J., Lee, T.S., Chiu, C.C.: Financial time series forecasting using independent component analysis and support vector regression. Decis. Support Syst. 47(2), 115–125 (2009)
8. Refenes, A.N., Zapranis, A., Francis, G.: Stock performance modeling using neural networks: a comparative study with regression models. Neural Netw. 7(2), 375–388 (1994)
9. Guo, Z., Wang, H., Liu, Q., Yang, J.: A feature fusion based forecasting model for financial time series. PLoS One 9(6), 1–13 (2014)
10. Zhu, C., Yin, J., Li, Q.: A stock decision support system based on DBNs. J. Comput. Inf. Syst. 10(2), 883–893 (2014)
11. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2012)
12. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
13. Bengio, Y., Lamblin, P., Dan, P., Larochelle, H.: Greedy layer-wise training of deep networks. Adv. Neural. Inf. Process. Syst. 19, 153–160 (2007)
14. Lecun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)
15. Cavalcante, R.C., Brasileiro, R.C., Souza, V.L.F., Nobrega, J.P., Oliveira, A.L.I.: Computational intelligence and financial markets: a survey and future directions. Expert Syst. Appl. 55, 194–211 (2016)
16. Ding, X., Zhang, Y., Liu, T., Duan, J.: Deep learning for event-driven stock prediction. In: International Conference on Artificial Intelligence, pp. 2327–2333 (2015)
17. Dixon, M.F., Klabjan, D., Bang, J.: Implementing deep neural networks for financial market prediction on the Intel Xeon Phi, vol. 101, no. 8, pp. 1–6. Social Science Electronic Publishing (2015)
18. Sirignano, J.: Deep Learning for Limit Order Books. Social Science Electronic Publishing, Rochester (2016)
19. Bao, W., Yue, J., Rao, Y.: A deep learning framework for financial time series using stacked autoencoders and long-short term memory. PLoS One 12(7), e0180944 (2017)
20. Cao, J., Li, Z., Li, J.: Financial time series forecasting model based on CEEMDAN and LSTM. Phys. A 519, 127–139 (2019)
21. Krauss, C., Xuan, A.D., Huck, N.: Deep neural networks, gradient-boosted trees, random forests: statistical arbitrage on the S&P 500. Eur. J. Oper. Res. 259(2), 689–702 (2016)
22. Graves, A., Mohamed, A.R., Hinton, G.: Speech recognition with deep recurrent neural networks. In: Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6645–6649 (2013)
23. Olah, C.: Understanding LSTM networks. http://colah.github.io/posts/2015-08-Understanding-LSTMs/. Accessed 31 Mar 2019
24. Palangi, H., Ward, R., Deng, L.: Distributed compressive sensing: a deep learning approach. IEEE Trans. Signal Process. 64(17), 4504–4518 (2016)
25. Sak, H., Senior, A., Beaufays, F.: Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition. Comput. Sci., 338–342 (2014)
26. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)


27. Cakra, Y.E., Trisedya, B.D.: Stock price prediction using linear regression based on sentiment analysis. In: International Conference on Advanced Computer Science & Information Systems, pp. 147–153 (2015)
28. Nie, C.-X., Song, F.-T.: Analyzing the stock market based on the structure of KNN network. Chaos, Solitons Fractals 113, 148–159 (2018)
29. Pai, P.F., Lin, C.S.: A hybrid ARIMA and support vector machines model in stock price forecasting. Omega 33(6), 497–505 (2005)

Testing Fiscal Solvency in Macroeconomics

Paolo Canofari1 and Alessandro Piergallini2

1 Luiss University, Rome, Italy ([email protected])
2 University of Tor Vergata, Rome, Italy ([email protected])

Abstract. This paper presents a literature review regarding the assessment of government solvency and fiscal policy. While indicators for fiscal sustainability are forward looking and based on expected future fiscal policies, tests are backward looking and based on information about the past values of fiscal variables. In this paper, we describe the main tests for fiscal solvency.

Keywords: Fiscal sustainability · Fiscal indicators · Tests of government solvency · Structural change

1 Introduction

Assessing the sustainability of budgetary policies is a controversial issue in macroeconomics. Fiscal sustainability can be analyzed using two methodologies. The former is based on indicators, following Miller (1983), Buiter (1985, 1987), Blanchard (1990), and Buiter et al. (1993). The other possibility provided by the literature is represented by tests for fiscal solvency, along the lines suggested by Hamilton and Flavin (1986), Trehan and Walsh (1988, 1991), Bohn (1998), Balassone and Franco (2000) and D'Erasmo et al. (2016). In the literature, indicators are mainly based on current information and published forecasts, or model-based projections of future fiscal policies (Canofari et al. 2018; Polito and Wickens 2011, 2012). Tests for government solvency involve unit root tests (Hamilton and Flavin 1986; Wilcox 1989), cointegration analysis of fiscal statistics (Trehan and Walsh 1988, 1991; Daniel and Shiamptanis 2013), or model-based sustainability approaches (Bohn 1998, 2008; Mendoza and Ostry 2008; Ghosh et al. 2013; Piergallini and Postigliola 2013; Mauro et al. 2015; D'Erasmo et al. 2016; Leeper and Li 2017). This paper is organized as follows. Section 2 describes the most commonly used tests to assess fiscal sustainability. Section 3 concludes.

2 Testing for Fiscal Solvency

Tests for solvency based on the present value budget constraint were originally proposed by Hamilton and Flavin (1986). The authors emphasize the similarity with the model of self-fulfilling hyperinflations first proposed by Flood and Garber (1980). It is


possible to test whether creditors perceive the government as solvent. This can be done by testing the hypothesis that agents evaluate the interest-bearing public debt $d_t$ as tending to:

$$d_t = \sum_{i=1}^{\infty} (1+r-g)^{-i}\, E\{b_{t+i} \mid \Omega_t\}, \qquad (1a)$$

where $r$ is the real rate of return on public debt, $g$ is the rate of growth of real output, $d_t$ is the interest-bearing public debt at the end of period $t$, $b_t$ is the primary surplus inclusive of seigniorage revenues and $\Omega_t$ denotes the information set available at time $t$. Both $b_t$ and $d_t$ are ratios to GDP. The assumption of a constant growth-adjusted real rate of interest is primarily made for expositional simplicity. It is consistent with the standard empirical literature and does not affect the essence of the present analysis. Equation (1a) implies that agents expect the present value of the public debt-to-GDP ratio to approach zero as $T$ tends to infinity, that is:

$$\lim_{T \to \infty} (1+r-g)^{-T}\, E\{d_{t+T} \mid \Omega_t\} = 0. \qquad (1b)$$

Equation (1b) is formally equivalent to the condition for ruling out bubbles in Flood and Garber's model, and can be tested along the same lines. In particular, a direct test for debt sustainability which does not require a priori assumptions on interest rates can usefully be derived by employing the methodology proposed by West (1987) for testing the existence of speculative bubbles. Consider the following autoregressive representation of the surplus:

$$b_{t+1} = a_0 + a_1 b_t + a_2 b_{t-1} + \ldots + a_q b_{t-q+1} + v_{t+1}, \qquad (2)$$

where the $a$'s are parameters and $v_{t+1}$ is a white noise process (see Footnote 1). Substituting the forecasts based on Eq. (2) into Eq. (1a) and applying the formula derived by Hansen and Sargent (1981) to solve (1a) for the past and present values of the surplus, one obtains (see Footnote 2):

$$d_{t+1} = \delta_0 + \delta_1 b_{t+1} + \delta_2 b_t + \ldots + \delta_q b_{t-q} + Z_{t+1}, \qquad Z_{t+1} = \sum_{i=1}^{\infty} f^i e_{t+1+i}, \qquad (3)$$

where the $\delta$'s are parameters resulting from highly non-linear combinations of the $a$'s and the discount factor $f$, and $\Phi_t$ is the subset of $\Omega_t$ including only present and lagged values of the surplus (see Footnote 3). An unrestricted version of the model (2)–(3) can be estimated

Footnote 1: Eq. (2) should be specified either in levels or in first differences, according to what is necessary to achieve stationarity.
Footnote 2: When the surplus equation is specified in differences, it is also convenient to specify the debt equation in the same way.
Footnote 3: The exact form of the non-linear constraints can be found in West (1987).


using OLS to yield, under the null hypothesis of sustainability, consistent and efficient estimates of the parameters in the form

$$d_t = f \cdot (d_{t+1} + b_{t+1}), \qquad (4)$$

where it is also possible to obtain an instrumental variables estimate of $f$. This estimate is consistent under both the null hypothesis of sustainability and the alternative, and can be combined with the OLS estimates of the parameters in (2)–(3) to perform a Wald test of the non-linear restriction implied by (1a)–(1b). Tests for fiscal sustainability based on (1a)–(1b) have also been derived by analyzing the cointegration properties of fiscal variables. The seminal work by Trehan and Walsh (1988) demonstrates that if revenues, spending and debt have unit roots, a sufficient condition for sustainability is the stationarity of the with-interest deficit. Equivalently, it is sufficient that the primary surplus and public debt are cointegrated with cointegrating vector $[1, -(r-g)]$. Trehan and Walsh (1991) generalize the cointegration analysis of fiscal data, incorporating the possibility of a non-stationary with-interest deficit. Specifically, the authors show that the intertemporal budget constraint holds if a quasi-difference of debt, $d_t - \lambda d_{t-1}$, is stationary when

$$\lambda \in [0,\ 1+r-g) \qquad (5)$$

and the primary surplus and debt are cointegrated. As pointed out by Bohn (2008), Trehan and Walsh (1991)'s conditions imply that a strictly positive relationship between surplus and debt is sufficient for sustainability. Cointegration between surplus and debt implies that

$$b_t - \alpha\, d_{t-1} = u_t, \qquad (6)$$

with $\alpha \neq 0$, is stationary. Substituting (6) into the flow budget constraint yields:

$$d_t = \lambda\, d_{t-1} + u_t, \qquad (7)$$

where $\lambda = 1 + r - g - \alpha$. Hence, Trehan and Walsh's condition (5) requires that $\alpha > 0$. This ensures sustainability because it guarantees that the growth rate of debt is strictly lower than the (growth-adjusted) real rate of interest (McCallum 1984). As shown by Bohn (1995, 1998, 2008), government solvency can also be tested using a model-based approach (see also Mauro et al. 2015, and D'Erasmo et al. 2016). Assuming the private agents' transversality condition:

$$\lim_{T \to \infty} E\{W_{t,T}\, d_{t+T} \mid \Omega_t\} = 0, \qquad (8)$$

where $W_{t,T}$ is the pricing kernel for contingency claims on period $t+T$. Combining the Euler equations, characterizing the agents' intertemporal optimality conditions on the consumption-saving decision, with the government's flow budget constraint (expressed


with variable interest and growth rates) and applying the transversality condition yield the following intertemporal budget constraint (Bohn 1995):

$$d_t = \sum_{T=0}^{\infty} E\{W_{t,T}\, b_{t+T} \mid \Omega_t\}, \qquad (9)$$

where $d_t = (1 + r_t - g_t)\, d_{t-1}$ denotes debt at the beginning of period $t$. Bohn (1998) demonstrates that sustainability can be tested by estimating the following class of policy rules:

$$b_t = \beta\, d_t + \mu_t, \qquad (10)$$

where $\beta > 0$ and $\mu_t$ is a bounded set of other determinants of the primary surplus to GDP ratio (see Footnote 4). A positive reaction coefficient $\beta$, in fact, implies:

$$E\{W_{t,T}\, d_{t+T} \mid \Omega_t\} \leq (1-\beta)^T\, d_{t+T} \rightarrow 0. \qquad (11)$$

Note that if the public debt and primary surplus ratios to GDP both have unit roots and $\mu_t$ is stationary, Eq. (10) ensures cointegration between the two fiscal variables, thus satisfying the sustainability conditions obtained by Trehan and Walsh (1991). As a result, Trehan and Walsh's cointegration condition linking debt to primary surpluses implies an error-correction mechanism that can be interpreted as a fiscal reaction function (Bohn 2007). In synthesis, according to Bohn (1998, 2008), sustainability surely holds if, in the estimated fiscal policy rule, one can detect a positive feedback reaction of the primary surplus to the debt-GDP ratio. A policy response to debt compatible with fiscal solvency can possibly be nonlinear, due for example to stabilization postponements related to political-economy reasons (Alesina and Drazen 1991; Bertola and Drazen 1993), as shown by a voluminous empirical literature (Sarno 2001; Chortareas et al. 2008; Cipollini et al. 2009; Arghyrou and Fan 2013; Ghosh et al. 2013; Legrenzi and Milas 2013; Piergallini and Postigliola 2013). According to Ghosh et al. (2013), in particular, Bohn's requirement identifies a "weak" sustainability criterion, because it does not rule out the possibility of an ever-increasing debt-to-GDP ratio. A "strict" sustainability criterion prescribes that the responsiveness of the primary surplus should be greater than the interest rate-growth rate differential. Alternative approaches to Bohn (1998, 2008)'s criterion are based upon the existence of a "fiscal limit", which recognizes that there is an upper bound to the primary surplus that can be raised because of distortionary taxes (Daniel and Shiamptanis 2013), and upon the role of monetary policy within general-equilibrium forward-looking relationships (Leeper and Li 2017).

4 The main advantages of this sustainability test are that it does not require any assumptions on interest rates and that it maintains its validity under conditions of uncertainty.
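To see how such a rule-based test works in practice, the following sketch simulates a debt path under the flow budget constraint and estimates the reaction coefficient of Eq. (10) by OLS with heteroskedasticity- and autocorrelation-robust standard errors. It is a minimal illustration on simulated data, not the authors' estimation code, and all parameter values are assumptions.

```python
# Illustrative sketch of a Bohn-type fiscal reaction function test (Eq. 10):
# regress the primary surplus/GDP ratio on beginning-of-period debt/GDP and check beta > 0.
# Data are simulated here; in applications b_t and d_t come from fiscal statistics.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 200
r, g, beta_true = 0.04, 0.02, 0.05           # interest rate, growth rate, true reaction coefficient

d = np.zeros(T)                              # end-of-period debt/GDP ratio
b = np.zeros(T)                              # primary surplus/GDP ratio
d[0] = 0.6
for t in range(1, T):
    mu = 0.01 * rng.standard_normal()        # bounded shock: other surplus determinants
    b[t] = beta_true * d[t - 1] + mu         # fiscal policy rule, Eq. (10)
    d[t] = (1 + r - g) * d[t - 1] - b[t]     # flow budget constraint in GDP ratios

X = sm.add_constant(d[:-1])                  # beginning-of-period (lagged) debt ratio
ols = sm.OLS(b[1:], X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
beta_hat = ols.params[1]
print(f"estimated reaction coefficient: {beta_hat:.3f} (p-value {ols.pvalues[1]:.3f})")
print("Bohn criterion satisfied (beta > 0 and significant)?",
      beta_hat > 0 and ols.pvalues[1] < 0.05)
```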


3 Conclusions

The evolution of public debt in OECD countries requires some form of monitoring. The literature shows how indicators are forward looking, in the sense that they are based on published forecasts, thereby reacting to a set of current and expected future conditions in fiscal policy. Tests, on the other hand, are backward looking, in the sense that they are based on a sample of past data. In the present paper, we have described tests of government solvency that can inform the decision-making process about the (un)sustainability of fiscal policy.

References

Alesina, A., Drazen, A.: Why are stabilizations delayed? Am. Econ. Rev. 81, 1170–1188 (1991)
Andrews, D.W.K.: Tests for parameter instability and structural change with unknown change point. Econometrica 61, 821–856 (1993)
Arghyrou, M.G., Fan, J.: UK fiscal policy sustainability, 1955–2006. Manchester Sch. 81, 961–991 (2013)
Balassone, F., Franco, D.: Assessing fiscal sustainability: a review of methods with a view to EMU. In: Banca d'Italia, Fiscal Sustainability Conference, Rome, pp. 21–60 (2000)
Baldacci, E., Petrova, I., Belhocine, N., Dobrescu, G., Mazraani, S.: Assessing fiscal stress. IMF Working Paper N. 11/100 (2011)
Bertola, G., Drazen, A.: Trigger points and budget cuts: explaining the effects of fiscal austerity. Am. Econ. Rev. 83, 11–26 (1993)
Blanchard, O.J.: Suggestions for a new set of fiscal indicators. OECD Economics Department Working Paper N. 79 (1990)
Bohn, H.: The sustainability of budget deficits in a stochastic economy. J. Money Credit Bank. 27, 257–271 (1995)
Bohn, H.: The behavior of U.S. public debt and deficits. Q. J. Econ. 113, 949–963 (1998)
Bohn, H.: Are stationarity and cointegration restrictions really necessary for the intertemporal budget constraint? J. Monet. Econ. 54, 1837–1847 (2007)
Bohn, H.: The sustainability of fiscal policy in the United States. In: Neck, R., Sturm, J. (eds.) Sustainability of Public Debt. MIT Press, Cambridge (2008)
Buiter, W.H.: A guide to public sector debt and deficits. Econ. Policy 1, 14–79 (1985)
Buiter, W.H.: The current global economic situation, outlook and policy options, with special emphasis on fiscal policy issues. CEPR Discussion Paper N. 210 (1987)
Buiter, W.H., Corsetti, G., Roubini, N.: Excessive deficits: sense and nonsense in the Treaty of Maastricht. Econ. Policy 8, 57–100 (1993)
Canofari, P., Piergallini, A., Piersanti, G.: The fallacy of fiscal discipline. Macroecon. Dyn. (2018, forthcoming)
Chalk, N., Hemming, R.: Assessing fiscal sustainability in theory and practice. IMF Working Paper N. 00/81 (2000)
Chortareas, G., Kapetanios, G., Uctum, M.: Nonlinear alternatives to unit root tests and public finances sustainability: some evidence from Latin American and Caribbean countries. Oxford Bull. Econ. Stat. 70, 645–663 (2008)
Chouraqui, J.-C., Hagemann, R.P., Sartor, N.: Indicators of fiscal policy: a reassessment. OECD Economics Department Working Paper N. 78 (1990)


Cipollini, A., Fattouh, B., Mouratidis, K.: Fiscal readjustments in the United States: a nonlinear time-series analysis. Econ. Inq. 47, 34–54 (2009)
Congressional Budget Office: Menu of Social Security Options. CBO, Washington (2005)
Croce, M.E., Juan-Ramon, M.V.H.: Assessing fiscal sustainability: a cross country comparison. IMF Working Paper N. 03/145 (2003)
D'Erasmo, P., Mendoza, E.G., Zhang, J.: What is a sustainable public debt? In: Taylor, J.B., Uhlig, H. (eds.) Handbook of Macroeconomics 2. Elsevier Press, Amsterdam (2016)
Daniel, B.C., Shiamptanis, C.: Pushing the limit? Fiscal policy in the European Monetary Union. J. Econ. Dyn. Control 37, 2307–2321 (2013)
European Commission: The Long-term Sustainability of Public Finances in the European Union. European Economy N. 4 (2006)
European Commission: 2009 Ageing Report: Economic and Budgetary Projections for the EU27 Member States (2008–2060)—Statistical Annex. European Economy N. 2 (2009)
European Commission: Specifications on the Implementation of the Stability and Growth Pact and Guidelines on the Format and Content of Stability and Convergence Programmes (2016). http://ec.europa.eu/economy_finance/economic_governance/sgp/pdf/coc/code_of_conduct_en.pdf
Flood, R.P., Garber, P.M.: Market fundamentals versus price level bubbles: the first tests. J. Polit. Econ. 88, 745–770 (1980)
Giammarioli, N., Nickel, C., Rother, P., Vidal, J.-P.: Assessing fiscal soundness: theory and practice. ECB Occasional Paper Series N. 56 (2007)
Ghosh, A., Kim, J., Mendoza, E., Ostry, J., Qureshi, M.: Fiscal fatigue, fiscal space, and debt sustainability in advanced economies. Econ. J. 123, F4–F30 (2013)
Gramlich, E.M.: Fiscal indicators. OECD Economics Department Working Paper N. 80 (1990)
Hamilton, J.D., Flavin, M.A.: On the limitations of government borrowing: a framework for empirical testing. Am. Econ. Rev. 76, 808–819 (1986)
Hansen, B.E.: Approximate asymptotic p values for structural-change tests. J. Bus. Econ. Stat. 15, 60–67 (1997)
Hansen, L.P., Sargent, T.J.: Formulating and estimating dynamic linear rational expectations models. In: Lucas, R.E., Sargent, T.J. (eds.) Rational Expectations and Econometric Practice. University of Minnesota Press, Minneapolis (1981)
Horne, J.: Indicators of fiscal sustainability. IMF Working Paper N. 91/5 (1991)
International Monetary Fund: The State of Public Finances Cross-Country Fiscal Monitor. IMF Staff Position Note N. 09/25 (2009a)
International Monetary Fund: World Economic Outlook October 2009, Sustaining the Recovery. IMF, Washington (2009b)
Larch, M., Nogueira Martins, J.: Fiscal Indicators. European Economy – Economic Papers N. 297 (2007)
Leeper, E.M., Li, B.: Surplus-debt regressions. Econ. Lett. 151, 10–15 (2017)
Legrenzi, G., Milas, C.: Modelling the fiscal reaction functions of the GIPS based on state-varying thresholds. Econ. Lett. 121, 384–389 (2013)
Mauro, P., Romeu, R., Binder, A., Zaman, A.: A modern history of fiscal prudence and profligacy. J. Monet. Econ. 76, 55–70 (2015)
McCallum, B.: Are bond-financed deficits inflationary? A Ricardian analysis. J. Polit. Econ. 92, 125–135 (1984)
Mendoza, E.G., Ostry, J.D.: International evidence on fiscal solvency: is fiscal policy responsible? J. Monet. Econ. 55, 1081–1093 (2008)
Miller, M.: Inflation adjusting the public sector financial deficit. In: Kay, J. (ed.) The 1982 Budget. Basil Blackwell, London (1983)


Polito, V., Wickens, M.: Assessing the fiscal stance in the European Union and the United States, 1970–2011. Econ. Policy 26, 599–647 (2011)
Polito, V., Wickens, M.: A model-based indicator of the fiscal stance. Eur. Econ. Rev. 56, 526–551 (2012)
Piergallini, A., Postigliola, M.: Non-linear budgetary policies: evidence from 150 years of Italian public finance. Econ. Lett. 121, 495–498 (2013)
Sargan, D.: Wages and prices in the United Kingdom: a study in econometric methodology. In: Hart, P.E., Mills, G., Whitaker, J.K. (eds.) Econometric Analysis of National Economic Planning. Butterworth, London (1964)
Sarno, L.: The behavior of US public debt: a nonlinear perspective. Econ. Lett. 74, 119–125 (2001)
Trehan, B., Walsh, C.: Common trends, the government budget constraint, and revenue smoothing. J. Econ. Dyn. Control 12, 425–444 (1988)
Trehan, B., Walsh, C.: Testing intertemporal budget constraints: theory and applications to U.S. federal budget and current account deficits. J. Money Credit Bank. 23, 210–223 (1991)
West, K.D.: A specification test for speculative bubbles. Q. J. Econ. 102, 553–580 (1987)
Wilcox, D.W.: The sustainability of government deficits: implications of the present-value borrowing constraint. J. Money Credit Bank. 21, 291–306 (1989)

CEO Gender and Financial Performance: Empirical Evidence of Women on Board from EMEA Banks Stefania Fensore(&) DSFPEQ, University “G. d’Annunzio” of Chieti-Pescara, Viale Pindaro 42, 65127 Pescara, Italy [email protected]

Abstract. We investigate the relation between the CEO gender and the performance of banks operating in the EMEA countries. There is a long debate about the risk aversion of female directors sitting on the board and the consequences this behavior leads to in terms of financial performance. We estimate a dynamic model from a panel of 101 EMEA banks over the period 2000–2016 by using the generalized method of moments. Our findings of a positive relation between a female CEO and the bank performance are supported by well-known statistical tests.

Keywords: CEO gender · Dynamic models · EMEA banks · Financial performance · GMM estimator

1 Introduction

The study of the link between the composition of the board of directors and the bank or, more generally, firm outcomes has become more and more attractive for many authors in the last decade. The economic consequences of a growing number of female directors seem not to be univocally understood, although this is a crucial topic in which legislators around the world are very interested. The policy implications mainly concern the promotion of guidelines aimed at regulating the composition of the board and the female quotas. Government research offices have performed no adequate studies on gender inequality and its impact on microeconomic and financial performance. Therefore, increased availability of information and results of gender-related studies would play an important role in influencing decision makers, such as banks' or firms' managers and policymakers. The existing literature pays more attention to the composition of the board of directors, also known as board diversity. [7] argued that, despite the extensive research conducted in the last decades in the area of gender and leadership, challenges in the empowering of women to more leader or senior positions are still present and need to be further analyzed. The aim of this work is to contribute to the debate by examining the dependence of the bank risk on the Chief Executive Officer (CEO) gender. In general, board characteristics are not exogenous random variables, therefore classical statistical methods, such as OLS or GLS, are not suitable to study the relation between the variables at hand.


We analyze a sample of 101 EMEA (Europe, Middle East, and Africa) banks from 28 countries over the period 2000–2016 by using the generalized method of moments, whose estimator contains individual effects, lagged dependent variables and no strictly exogenous variables. We perform serial correlation tests, a test for the restrictions on the coefficients and a specification test for the validation of the instruments. The paper is organized as follows. In Sect. 2 we briefly summarize the literature about the board diversity. In Sect. 3 we discuss the problem of endogeneity affecting these data and recall some basic facts about the GMM estimator. The empirical analysis is shown in Sect. 4, and in Sect. 5 we end up with some considerations.

2 Literature Review About Board Diversity Although the literature on board diversity has become substantial in the last years, it is quite controversial about the amount of focus on financial and non-financial firms. Some authors argue that little has been studied about the financial sector (see [3] and [11]), while many others state the opposite (see, for instance, [14]). As for the non-financial sector, most of the studies highlight a positive influence of board diversity on financial performance, mostly in the US (see, for example, [9] and [13]), while, on the other side, [1] finds an overall negative relationship between these variables. Concerning the link between gender diversity and bank risk [2] and [10] state that female directors are not risk-averse compared with their male counterparts. Specifically, [2] argues that female and male directors have substantial different risk attitude, due to the general more benevolent female behavior and the belief that women may not perform well in competitive environment. According to [12] the proportion of female directors in boardrooms is higher for lower-risk banks, and female directors denote a high growth orientation. Moreover, [11] find evidence that (i) the degree of risk-taking for female directors may vary according to their non-executive or executive roles, (ii) female and male executive directors may have the same risk-taking behavior, (iii) there is a positive and significant relationship between the proportion of female directors and financial performance. The above discussion shows that there is no real agreement on the nexus between board diversity and bank risk or performance. Our goal is to contribute to this debate providing an empirical analysis involving 28 EMEA countries aimed to explore the relationship between the bank financial performance and the CEO gender.

3 Methodology

3.1 Endogeneity

The main issue that concerns the relationship between CEO gender and bank risk or financial performance is the endogeneity. This is a typical concern affecting the board characteristics studies. Specifically, there are two sources of endogeneity: the unobserved heterogeneity and the reverse causality, also known as simultaneity.


Usually the first kind of endogeneity, which is due, for example, to omitted variables, is controlled for by using a fixed-effects model. However, this model does not represent a good tool to face simultaneity, which indeed requires a dynamic model, such as the generalized method of moments.

3.2 The GMM model

The generalized method of moments (GMM) is a linear dynamic model generally used for analyzing panel data. In time series regression models, if the covariates are not exogenous, which is the case of most economic phenomena, least squares-based inference methods (fixed effects or random effects estimators) produce biased and inconsistent parameter estimates, therefore some adjustments are required. The GMM estimator allows for endogenous covariates, and it shows optimal asymptotic properties for a finite number of time periods, T, and a large cross-sectional dimension, N (see [4, 5], and [8]). In a GMM model, for the ith individual observed at time t, the relation between the dependent variable yit and a set of covariates xit can be modeled by the following dynamic autoregressive specification:

$y_{it} = \alpha \, y_{i,t-1} + \beta' x_{it} + \nu_i + \varepsilon_{it}, \qquad i = 1, \ldots, n; \; t = 1, \ldots, T,$  (1)

where $\alpha$ and the vector $\beta$ refer to the parameters that need to be estimated, $\nu_i$ stands for the individual-specific effect, i.e. the set of unobserved variables which may be correlated with the covariates (unobserved time-invariant heterogeneity), and $\varepsilon_{it}$ is the i.i.d. random error term having zero mean and variance $\sigma^2_{\varepsilon}$. Also, $\nu_i$ and $\varepsilon_{it}$ are assumed to be independent for each i over all t, and the $\varepsilon_{it}$ are serially independent. The fundamental condition of model (1) is the strict exogeneity of some of the explanatory variables, or the availability of strictly exogenous instrumental variables, conditional on the unobservable individual effects. As a consequence of no serial correlation in the error term, the model can be first differenced to get rid of the individual effect:

$\Delta y_{it} = \alpha \, \Delta y_{i,t-1} + \beta' \Delta x_{it} + \Delta \varepsilon_{it}.$

Now, because of the correlation between $\Delta y_{i,t-1}$ and $\Delta \varepsilon_{it}$, $y_{i,t-2}$ can be used as a further valid instrument. In general, the number of instruments in the GMM estimator grows with t. Therefore, past and present values of the dependent, exogenous and other non-exogenous variables can be employed as instruments once the permanent effects have been differenced out. However, lagged levels of the regressors could turn out to be poor instruments, with the consequence that the variance of the coefficients increases and the estimates can be biased, especially in relatively small samples. In this case, to increase the efficiency, one should use the system GMM (see [5] and [8]) that combines in a system the equation in first differences with the same equation expressed in levels.
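As a minimal numerical illustration of the logic just described, the sketch below first-differences Eq. (1) and instruments the endogenous regressor with its second lag, in the spirit of a simple Anderson–Hsiao-type IV estimator. The full Arellano–Bond and Blundell–Bond (system GMM) estimators cited above extend this idea with the complete set of lagged instruments; the code is only a schematic stand-in on simulated data, not the estimator used in this paper.

```python
# Simplified illustration of the logic behind difference GMM: first-difference Eq. (1)
# to eliminate the individual effect, then use y_{i,t-2} as an instrument for the
# endogenous regressor Delta y_{i,t-1}.
import numpy as np

rng = np.random.default_rng(1)
N, T = 101, 17                      # cross-section and time dimensions (as in the panel used here)
alpha, beta = 0.5, 0.3

# simulate a dynamic panel: y_it = alpha*y_{i,t-1} + beta*x_it + nu_i + eps_it
nu = rng.standard_normal(N)
x = rng.standard_normal((N, T))
y = np.zeros((N, T))
for t in range(1, T):
    y[:, t] = alpha * y[:, t - 1] + beta * x[:, t] + nu + 0.5 * rng.standard_normal(N)

# first-differenced variables for t = 2, ..., T-1
dy  = (y[:, 2:] - y[:, 1:-1]).ravel()        # Delta y_it
dy1 = (y[:, 1:-1] - y[:, :-2]).ravel()       # Delta y_{i,t-1} (endogenous)
dx  = (x[:, 2:] - x[:, 1:-1]).ravel()        # Delta x_it (treated as exogenous)
z   = y[:, :-2].ravel()                      # instrument: y_{i,t-2}

# two-stage least squares by hand: instruments Z = [z, dx], regressors W = [dy1, dx]
Z = np.column_stack([z, dx])
W = np.column_stack([dy1, dx])
PZ = Z @ np.linalg.solve(Z.T @ Z, Z.T)       # projection onto the instrument space
theta = np.linalg.solve(W.T @ PZ @ W, W.T @ PZ @ dy)
print(f"IV estimates: alpha ≈ {theta[0]:.3f}, beta ≈ {theta[1]:.3f}")
```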


4 Empirical models

The sample collected for the empirical analysis is an unbalanced panel of 101 EMEA banks over the period 2000–2016. By unbalanced panel data we refer to a sample where consecutive observations on the individuals are available, but the number of time periods available may vary from unit to unit. For each individual, we have observed different kinds of variables: (i) governance variables (e.g. Board gender diversity, Independent board members, Board size, Board tenure, CEO/chair duality); (ii) financial variables (e.g. ROA, Common equity or Tier 1 on risk weighted assets, Price volatility, NPL ratio); (iii) country variables (e.g. GDP). A brief description of the variables involved in the empirical analysis is reported in Table 1. Descriptive statistics are shown in Table 2 and the correlation matrix is depicted in Fig. 1.

Table 1. Description of the variables included in the empirical analysis.

Variable   Description
BSIZE      Number of board directors
CEODUA     Dummy variable: 1 if the CEO and the Chair of the board are the same person; 0 otherwise
CERWA      Common Equity on Risk Weighted Assets
EXPB       Average number of years each member has been on the board
FB         Percentage of female directors sitting on the board of directors
FCEO       Dummy variable: 1 if the CEO of the bank is a female; 0 otherwise
FCEOCH     Dummy variable: 1 if either the CEO or the Chair of the board is female; 0 otherwise
FMANAGE    Percentage of female directors sitting on the management board
FNE        Percentage of non-executive female board members
ln(GDP)    Natural logarithm of Gross Domestic Product
INDD       Percentage of board independent directors
NPLR       NPL ratio (percentage of non-performing loans over total loans)
PV         Price volatility
QUOH       Hard female quotas
QUOR       Female quotas
ln(TA)     Natural logarithm of Total Assets (proxy of the bank size)
TLTA       Total Loans on Total Assets
T1RWA      Tier 1 on Risk Weighted Assets
T1TA       Tier 1 on Total Assets
ROA        Return (Net Income) on Total Assets

Table 2. Descriptive statistics.

Variable   Mean     Median   SD       Min       Max      N
BSIZE      13.52    13.00    4.6147   4.000     44.00    1306
CEODUA     0.6340   1.0000   0.4820   0.0000    1.0000   1310
CERWA      0.1443   0.1335   0.0618   0.0089    0.4200   3175
EXPB       8.0530   7.6350   3.8687   0.2500    19.170   1026
FB         0.1530   0.1430   0.1185   0.0000    0.6000   1305
FCEO       0.0225   0.0000   0.1482   0.0000    1.0000   4811
FCEOCH     0.0324   0.0000   0.1771   0.0000    1.0000   4811
FMANAGE    0.3600   0.3600   0.1249   0.0210    0.6380   480
FNE        0.1230   0.1110   0.1098   0.0000    0.6250   1299
ln(GDP)    10.571   10.699   0.4741   7.7220    11.770   4801
INDD       0.5430   0.6250   0.3072   0.0000    1.0000   1261
NPLR       0.0269   0.0121   0.0475   0.0000    0.6407   3656
PV         0.2419   0.2299   0.0903   0.0000    0.7050   3615
QUOH       0.0451   0.0000   0.2076   0.0000    1.0000   4811
QUOR       0.3689   0.0000   0.4826   0.0000    1.0000   4811
ln(TA)     16.080   15.710   2.1135   11.530    21.890   4107
TLTA       0.6400   0.6787   0.1738   0.0000    1.0426   3768
T1RWA      0.1305   0.1220   0.0524   0.0014    0.4500   3128
T1TA       0.0843   0.0844   0.0406   0.0100    0.4700   3689
ROA        0.0125   0.0117   0.0179   −0.2608   0.2772   3600

Fig. 1. Correlation matrix. Bold figures denote significance at the 5% level or below.
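A correlation matrix with significance flags of the kind shown in Fig. 1 can be computed as in the sketch below; the DataFrame and its column names are illustrative assumptions, not the authors' dataset.

```python
# Sketch of a pairwise correlation matrix with 5%-level significance flags,
# using pairwise deletion of missing values. The data are simulated for illustration.
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

rng = np.random.default_rng(2)
df = pd.DataFrame(rng.standard_normal((500, 4)), columns=["FB", "ROA", "NPLR", "PV"])

cols = df.columns
corr = pd.DataFrame(np.eye(len(cols)), index=cols, columns=cols)
signif = pd.DataFrame(False, index=cols, columns=cols)
for i, a in enumerate(cols):
    for b in cols[i + 1:]:
        pair = df[[a, b]].dropna()                         # pairwise deletion of missing values
        r, p = pearsonr(pair[a], pair[b])
        corr.loc[a, b] = corr.loc[b, a] = r
        signif.loc[a, b] = signif.loc[b, a] = p <= 0.05    # flag for the 5% level
print(corr.round(3))
print(signif)
```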


To carry out the empirical analysis we used the system GMM estimator described in Sect. 3.2. After selecting five variables measuring bank risk and financial outcomes as dependent variables, for each of them we have specified the following models:

$PV_{it} = \alpha \, PV_{i,t-1} + \beta' x_{it} + \nu_i + \varepsilon_{it}$  (2)
$CERWA_{it} = \alpha \, CERWA_{i,t-1} + \beta' x_{it} + \nu_i + \varepsilon_{it}$  (3)
$T1RWA_{it} = \alpha \, T1RWA_{i,t-1} + \beta' x_{it} + \nu_i + \varepsilon_{it}$  (4)
$T1TA_{it} = \alpha \, T1TA_{i,t-1} + \beta' x_{it} + \nu_i + \varepsilon_{it}$  (5)
$NPLR_{it} = \alpha \, NPLR_{i,t-1} + \beta' x_{it} + \nu_i + \varepsilon_{it}$  (6)

where $\nu_i$ and $\varepsilon_{it}$ respectively denote the unobserved heterogeneity and the random error term described in the previous section, while $x_{it}$ refers to the governance, country, and control variables included in the models, as specified in Table 3. The subscript t − 1 denotes the dependence of each response variable on past values. To control for the simultaneity we have used lagged variables as instruments, such as the percentage of female directors sitting on the board of directors or on the management board, the percentage of independent directors, the board size, etc. We have also assessed the validity of models (2), (3), (4), (5) and (6) with some statistical tests. The Sargan specification test is used to check the validity of the instruments: the model is correctly specified if the null hypothesis of exogenous instruments is not rejected. Due to the lagged dependent term, in the Arellano–Bond test AR(1) first-order serial correlation is expected, therefore the null hypothesis of no serial correlation in the residuals should be rejected. The test for AR(2) in first differences detects the autocorrelation in levels; in this case the null hypothesis of absence of second-order serial correlation in the disturbances should not be rejected. Finally, the Wald test, based on a robust estimator of the coefficient covariance matrix, is used. The results of the system GMM estimator are shown in Table 3. There is evidence of a positive, significant relationship between the proportion of female directors and the three considered measures of financial performance and solidity (CERWA, T1RWA, T1TA). Also, the variable controlling for the gender of both the CEO and the chair has a positive effect on the common equity on risk weighted assets, which is the most important indicator to assess the company's equity solidity. As for the risk measures, we find that the proportion of female directors has a negative impact on the non-performing loans ratio, and the variable controlling for the gender of both the CEO and the chair has a negative effect on the price volatility. Therefore our results mainly document that banks with a female CEO or chair and, in general, banks with more women sitting on the board of directors appear to perform better than banks with fewer female board members.
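The Sargan over-identification statistic mentioned above can be computed, in its single-equation linear IV form, as in the following sketch; this is only the schematic analogue of the test reported in Table 3, not the exact system-GMM implementation.

```python
# Schematic Sargan over-identification statistic for a linear IV regression:
# S = u'Z (Z'Z)^{-1} Z'u / sigma^2  ~  chi^2(L - K) under the null that the
# instruments are exogenous (L instruments, K estimated coefficients).
import numpy as np
from scipy.stats import chi2

def sargan_test(u, Z, k_params):
    """u: IV residuals (n,), Z: instrument matrix (n x L), k_params: number of coefficients."""
    n, L = Z.shape
    sigma2 = u @ u / n
    Pu = Z @ np.linalg.solve(Z.T @ Z, Z.T @ u)   # projection of residuals on the instruments
    stat = (u @ Pu) / sigma2
    dof = L - k_params
    return stat, 1 - chi2.cdf(stat, dof)

# toy usage with random residuals and instruments
rng = np.random.default_rng(3)
Z = rng.standard_normal((300, 6))
u = rng.standard_normal(300)
stat, pval = sargan_test(u, Z, k_params=3)
print(f"Sargan statistic {stat:.2f}, p-value {pval:.3f} (do not reject exogeneity if p > 0.05)")
```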


Table 3. System GMM regressions of risk and performance measures (PV, CERWA, T1RWA, T1TA, NPLR) on proportion of women and other control variables, along with robust standard errors. Statistical significance at 10%, 5% and 1% are respectively denoted by *, ** and ***. PV lag(PV, 1)

CERWA

T1RWA

T1TA

lag(CERWA, 1)

0.424*** (0.108)

lag(T1RWA, 1)

0.693*** (0.093)

lag(T1TA, 1)

0.540*** (0.126)

lag(NPLR, 1)

0.838*** (0.039) −0.012 (0.009)

CEODUA

−0.0007* (0.0003)

EXPB FB

0.298* (0.132)

0.117*** (0.026)

0.023* (0.011)

−0.021* (0.009)

−0.558* (0.243)

FB2 FCEOCH

−0.0163** (0.006) 0.023* (0.011)

FNE

0.021 (0.034)

BSIZE

−0.007*** (0.002)

−0.049 (0.045)

−0.003 (0.003) −0.031* (0.014)

log(TA)

−0.0007 (0.018)

−0.057*** (0.013) −0.005** (0.002)

ROA

−0.598 (0.307)

1.221** (0.380)

log(GDP)

−0.130** (0.046)

QUOH

0.032** (0.011)

−0.004*** (0.001)

0.464** (0.168)

0.804*** (0.153)

0.011*** (0.003)

0.009*** (0.002)

−0.016*** (0.003) −0.007* (0.003)

QUOR INDD

NPLR

0.296*** (0.090)

0.094** (0.032)

−0.063* (0.026)

−0.590** (0.209)

−0.008* (0.003) 0.025* (0.012)

NPLR

0.090*** (0.026)

TLTA

0.033*** (0.009)

                              PV       CERWA    T1RWA    T1TA     NPLR
Sargan test: p-value          0.1273   0.1498   0.2954   0.1079   0.0777
Autocorrelation test (1)      0.0418   0.0369   0.0029   0.0016   0.0176
Autocorrelation test (2)      0.1333   0.6917   0.6314   0.3867   0.7599
Wald test: p-value            0.0000   0.0000   0.0000   0.0000   0.0000

5 Conclusions

In this work we investigate how the bank CEO gender affects the financial performance and solidity of the company. Our findings are the result of an empirical analysis carried out by means of a dynamic panel model that allows us to take into account the endogeneity affecting the variables involved. As many authors argue, investigating the risk attitude of board members and banks' outcomes through various corporate governance mechanisms (unitary governance and dual board mechanism) could prove crucial. As highlighted by [10], the proportion of female directors may vary between boards of directors under the unitary or the dual governance mechanism. Also, taking into account the gender leadership approaches of the transformational (TFL) and transactional (TSL) styles of decision-making could be important. [6] claimed that TFL is the most effective style because it leads to better qualitative results, whereas TSL is more focused on performance. Therefore possible extensions of this study could use data not pooled from these points of view.


References

1. Adams, R., Ferreira, D.: Women in the boardroom and their impact on governance and performance. J. Financ. Econ. 94, 291–309 (2009)
2. Adams, R.B., Funk, P.: Beyond the glass ceiling: does gender matter? Manag. Sci. 58(2), 219–235 (2012)
3. Adams, R.B., Mehran, H.: Bank board structure and performance. Evidence for large holding companies. J. Financ. Intermediation 21, 243–267 (2012)
4. Arellano, M., Bond, S.: Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Rev. Econ. Stud. 58(2), 277–297 (1991)
5. Arellano, M., Bover, O.: Another look at the instrumental variable estimation of error-components models. J. Econ. 68, 29–51 (1995)
6. Bass, B.M., Avolio, B.J.: Developing transformational leadership: 1992 and beyond. J. Eur. Ind. Train. 14(5), 21–27 (1990)
7. Broadbridge, A., Simpson, R.: 25 years on: reflecting on the past and looking to the future in gender and management research. Br. J. Manag. 22(3), 470–483 (2011)
8. Blundell, R., Bond, S.: Initial conditions and moment restrictions in dynamic panel data models. J. Econ. 87, 115–143 (1998)
9. Erhardt, N.L., Werbel, J.D., Shrader, C.B.: Board of director diversity and firm financial performance. Corp. Gov.: Int. Rev. 11(2), 102–111 (2003)
10. Farag, H., Mallin, C.: The influence of CEO demographic characteristics on corporate risk-taking: evidence from Chinese IPOs. Eur. J. Finance (2016). https://doi.org/10.1080/1351847X.2016.1151454
11. Farag, H., Mallin, C.: Board diversity and financial fragility: evidence from European banks. Int. Rev. Financ. Anal. 49, 98–112 (2017)
12. Mateos de Cabo, R.M., Gimeno, R., Nieto, M.J.: Gender diversity on European banks' board of directors. J. Bus. Ethics 109, 145–162 (2012)
13. Miller, T., Triana, M.: Demographic diversity in the boardroom: mediators of the board diversity–firm performance relationship. J. Manag. Stud. 46, 755–786 (2009)
14. Sila, V., Gonzalez, A., Hagendorff, J.: Women on board: does boardroom gender diversity affect firm risk? J. Corp. Finance 36, 26–53 (2016)

The Minimum Heterogeneous Agent Configuration to Realize the Future Price Time Series Similar to Any Given Spot Price Time Series in the AI Market Experiment Yuji Aruka1(B) , Yoshihiro Nakajima2 , and Naoki Mori3 1

Chuo University, Hachioji, Tokyo 192-0393, Japan [email protected] 2 Osaka City University, Sumiyoshi, Osaka, Japan 3 Osaka Prefecture University, Sakai, Osaka, Japan http://yuji-aruka.jp

Abstract. We employ an AI market simulator called the U-Mart system to elucidate the following propositions: (1) As the bodies of each strategy are simultaneously increased, the possibility to match current orders to settle them may be much bigger; (2) The discovery of the minimum agent configuration brings a scale free property to the NakajimaMori agent configuration of the so-called traditional technical analytical agents to realize the future price time series similar to any given spot price time series in the AI market experiment. In this sense, we will be able to employ this special configuration as a gravitational mediator to identify what kind of agent configuration dominates a current future price formation; (3) The earning property is also scale free from the absolute number of agents. The Nakajima-Mori special agent configuration usually provides us with an almost fixed mixture also of the earning structure, irrespective of the absolute number of participating agents. Keywords: Interactive CA · Turing’s rule selection · Minimum agent configuration · U-Mart system · Acceleration experiment · Self-correlation to lag · Moving Average Strategy · Earning distribution

1 AI Market Behaviors Hinted by FICA's Behaviors

In the AI market experiment run on the U-Mart system, Aruka et al. [1] have demonstrated that one can identify market performances with redundancies and a deeper logic of complexity motivated by Turing's rule selection. In Cook [2], the cellular automaton of Rule 110 is well known as a complete Turing machine, in the sense that any calculation or computer program can be simulated using this automaton. In this context, we also regard the market system as able to generate any price time series. On the other hand, in the


lineage of Wolfram [5]: A new kind of science, we are interested in the properties of Class 1–4 on interactive cellular automaton (ICA): Cellular automata (CA) can be classified according to the complexity and information produced by the behavior of the CA pattern: Class 1: Fixed; all cells converge to a constant black or white set; Class 2: Periodic; repeats the same pattern, like a loop; Class 3: Chaotic; pseudo-random; Class 4: Complex local structures; exhibits behaviors of both class 2 and class 3; likely to support universal computation [3]. In particular, the property of Class 4 may be described as follows: “Nearly all initial patterns evolve into structures that interact in complex and interesting ways, with the formation of local structures that are able to survive for long periods of time.” ([4], 13). The rule 110 proves itself to reproduce such structures. Much more interestingly, Wolfram’s research group has also discovered these from the behaviors of Fully Random Iterated Cellular Automata (FRICA) with multiple rules.1 Thus, we can then detect some critical conditions that may make a certain local structure collapse by changing a selection of the rules. This new observation may bring us a new perspective, at least, a useful hint, on how to measure the market performances. Because we can regard a market performance as one similar to the property of Class 4. Here we replace the rule with the strategy. In the field of market transaction, usually, participating agents will behave by following their own strategy to either ask or bid. Initially given the participating set of agents to implement their own strategy, then, the market system will run under any allocation of strategy configuration that is always exposed to an internally or externally changing environment. It is noted that there is not always secured the equality of ask and bid any time but also some unsettled balance incurred even in the realization of a contract. There may often occur equilibria with redundancies or excesses which are to be carried over next date/session. Settlements at the exchange do not necessarily mean clearances on orders. So there are usually something like redundancies in the market. Hence, we can say that the market field is always affected by randomness and redundancies. It has been already long verified by the U-Mart system, our unique market simulator, that a specifically selected strategy configuration cited in Aruka et al. [1] could always internally generate any price time series that coincides with a given spot price time series, whatever its pattern is extrapolated. This special configuration may be regarded as a core structure locally generated during the market transaction procedure, on the one hand. On the other hand, we can measure any divergence from the special reference. Referring to the idea of Class 4, during fully random iterated transactions held at the exchange, we will be able to grope for identifying market performances with redundancies and a deeper logic of complexity. The U-Mart system is an artificial intelligent futures transaction system of long-run lifetime initiated by Japanese computer scientists since 1998 (See Kita 1

FRICA is employed to access how damaging the inclusion of a given rule is to the universal behavior of rule 110, for instance.

Heterogeneous Agent Strategy

87

et al. [6]. The development of the U-Mart system was mainly engineer-driven2 , the source code of the project is open for the public, and is now internationally recognized as a good platform for AI markets. This system is compatible with both types of batch and continuous double auction. It is interesting to know in the case of Sake Brewing that there are two ways of brewing: batch and continuous polymerization. The quality and taste of Sake become different when a different brewing process is used. It is also found in the market auction that there are two ways of matching: batch and continuous double auction. The different auction methods apparently bring different results. In our experiment, we will apply the batch auction. Moreover, either human agent or algorithm agent can join in the U-Mart system. The two eminent properties were equipped with the U-Mart system at the beginning. One is the participation system of hybrid agents. After the U-Mart system was released, the reality rather was closer to the U-Mart. The other is the implementation of the acceleration experiment tool3 which was indicative of the dominance fo the HFT. It is specially noted that our system has been actually the same as “Equity Index Futures” at Osaka Exchange, a branch of JPX.

2

Our New Remarkable Results

Aruka et al. [1] showed a special set to realize the future price time series similar to any given spot price time series in the AI market experiment. Firstly, it is stated that this special property does hold, as far as the relative weights are approximately fixed either after reducing or increasing the absolute number of strategies. 2.1

Future Price Formations by the Default Agent Configuration

In order to compare the special strategy set with the default one, we firstly summarize the behaviors in the default environment. In the future market field of the U-Mart system, agents, either way whether algorithmic or human, are directed to have their orders by referencing the spot price time series externally given. The U-Mart system provides us with the default configuration of agent strategies of the form of Table 1. But any agent strategy configuration will not necessarily generate the future time series continuously. As shown in Fig. 1, the future pricing are interrupted almost everywhere, if a spot time series of oscillating type is given. Keeping the similar weights constant, however, it will be verified that the more the number of agents/strategies, the bigger matching ratio among them will be induced. This may be easily verified by U-Mart simulation. Proposition 1 (The number increased effect). As the bodies of each strategy are simultaneously increased, the possibility to match current orders to settle them may be much bigger. 2 3

See the web site of the U-Mart Organizationhttp://www.u-mart.org/html/ index.html. The rule in our acceleration experiment is the batch auction.

88

Y. Aruka et al. Table 1. The default composition

Agent strategy

No. agents

TrendStragey

2

AntiTrendStragey

2

Random

3

SRandom

3

SRsiStrategy

2

RsiStrategy

2

MovingAverageStrategy

2

Fig. 1. A future price does not follow the given spot time series continuously, i.e., a future price is incessantly not formed almost everywhere

SMovingAverageStrategy 2 SFSpreadStrategy

2

DayTradeStrategy

2

MarketRandomStrategy

2

2.2
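Proposition 1 can be illustrated with a toy batch (call) auction: as the number of order-submitting agents grows, the share of orders that find a counterparty rises and a clearing price is formed more reliably. The sketch below is a stylized stand-in, not the U-Mart matching engine, and the order-price distributions are assumptions.

```python
# Toy batch (call) auction illustrating Proposition 1: with more agents submitting
# limit orders around a reference spot price, the share of matched orders tends to rise.
import numpy as np

rng = np.random.default_rng(4)

def batch_auction(n_agents, spot=1000.0):
    # half the agents bid somewhat below the spot, half ask somewhat above (stylized spread)
    n_b = n_agents // 2
    bids = spot - 3 + 5 * rng.standard_normal(n_b)
    asks = spot + 3 + 5 * rng.standard_normal(n_agents - n_b)
    # evaluate executable volume at every quoted price and pick the maximizing one
    grid = np.unique(np.concatenate([bids, asks]))
    volume = [min((bids >= p).sum(), (asks <= p).sum()) for p in grid]
    matched = max(volume)
    return matched, 2 * matched / n_agents       # contracts and share of orders settled

for n in (12, 24, 120, 240):                     # scaling up the composition of Table 1
    results = [batch_auction(n) for _ in range(500)]
    priced = np.mean([m > 0 for m, _ in results])
    fill = np.mean([f for _, f in results])
    print(f"{n:4d} agents: price formed in {priced:.0%} of sessions, mean fill ratio {fill:.2f}")
```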

Future Price Formations in the New Special Environment

On the other hand, we can compare a special heterogeneous agent configuration to realize the future price time series similar to any given spot price time series. The latter can, with high precision, realize any future time series corresponding to any given spot time series of either the ascending (upward), descending (downward), oscillating (triangular) or reversal (inverting) type. We reproduce Figs. 2 and 3, which correspond to the cases of the oscillating and reverse spot time series respectively. Here it is noted that both correlation coefficients are 0.960526 and 0.974454. The parameter estimations carried above the captions of the both figures are fitted by the generalized linear model. Table 2. The Nakajima-Mori agent composition of any number any scaled number and the mini number Agent strategy

Original Any scaled Min bodies

RandomStrategy

8.1

4

1

AntiTrendStrategy

7.2

3

1

DayTradeStrategy

8.2

4

1

MovingAverageStrategy

26.4

12

3

RsiStrategy

9.3

4

1

TrendStrategy

13.2

6

2

SRandomStrategy

50.8

23

6

SFSpreadStrategy

45

20

5

SMovingAverageStrategy 32.8

15

4

SRsiStrategy

10

3

22.6

Heterogeneous Agent Strategy

Fig. 2. A future price evolution in the minimum agent composition the case of the oscillating spot time series; correlation coefficient = 0.960526

89

Fig. 3. A future price evolution in the minimum agent composition in the case of the reverse spot time series; correlation coefficient = 0.974454

According to Proposition 1 in the default agent environment, the agent number increase is indispensable to raise up the matching ratio among agent trades. In the vicinity that the spot and the future prices are closely formed, on the contrary, the precision of matching may be irrelevant to the agent number. This will be a good reason why the Nakajima-Mori special configuration could be the reference configuration to identify any volatility based on any other configuration of strategies. Next, we must focus on how this special agent composition is playing a role as a gravitational mediator to realize almost the same spot time series by inducing not all but many agents to be matched mutually. As shown in Table 2, there is certainly found an approximate proportionality among the columns vectors whose elements are not precisely proportional. Even in this approximate proportionality, interestingly, we are guaranteed to observe the equality between the spot and futures price with high reliance. The minimum number of an agent strategy is generally set q = 1. We may construct the minimum set by taking the minimal number m(=8.1) among the elements of the left edge column and divides it by the largest integer among m∗ . In general, the remainder r of the minimum number must be taken within a certain moderate error by choosing q. Thus it holds m = qm∗ + r for |r| < q. This will apply for each element i. This rough transformation will a minimum configuration placed in the right edge that (1, 1, 1, 3, 1, 2, 6, 5, 4, 3). This configuration provides a least one to fulfill our special property. Definition 1. The minimum configuration is given by applying a rough transformation mi = qi m∗ + ri to the given sets to fulfill the Nakajima-Mori special configuration. Proposition 2 (The scale free property). The discovery of the minimum agent configuration brings a scale free property to the Nakajima-Mori agent configuration to realize the future price time series similar to any given spot price time series. In this sense, we will be able to employ this special configuration as a gravitational mediator to identify what kind of agent configuration dominate a current future price formation.

90

3

Y. Aruka et al.

The Importance of Interacting Price Formation Among the Strategies

There are many scholars who are often enthusiastic exclusively to analyze any self-correlation of a time series. However, such a one-sided challenge may be useless, unless there are few strong self-correlation on the time series. We need to examine the self-correlation of the spot and future time series employed in our simulation. The market prices generated at the stock exchange are not simply a product of purely self-referencing by each agent. In the U-Mart system, the future market is internally formed by a number of algorithmic agents by reference of the spot price time series extrapolated outside the future market. Without overestimating the self-referencing factor, it is rather natural to believe that there are currently working a continuous interaction and coordination among heterogeneous agents (strategies) to generate various volatilities in the price time series. 3.1

Self-correlation and the Partial Correlation to Lag 10 with 95% White-Noise Confidence Bands

In fact, there are not necessarily obtained many useful result by prolonging the delay of time, because many actual time series may not always be discriminated from any white-noise process. This does not mean that the self correlated series does not exist. But it rather means that there are not always big long lag effects. In our AI market experiment, in fact, our price time series, whether actually collected or intelligently generated, usually retain merely a self-correlation at lag 1 and/or lag 2. The partial correlation to lag 10 with 95% white-noise confidence bands will be shown in Figs. 4 and 5. These figures then conclude that the effect of either a spot or a future price change is mainly positive from the last price of lag 1. The simulation result supports that indicates the spot data adopted in the U-Mart system is mainly subject to the process of positive correlation with lag 1 only. Discussing more in details, as Fig. 5 shows, there is found negatively correlated in the future price series part of lag 2 in this case of simulation, although the correlation of lag 2 usually tends to be fitted under the white noise range. Given a discernible negative correlation from lag 2, the positive correlation from lag 2 will be lessened, although the absolute size of the effect from lag 1 is greater than them of lag 2. It is noted that the spot time series is purely extrapolated from the data externally in the U-Mart system. This data cannot be affected by a sequential interactive force coming from the matching process of transactions. Thus we can limit our analysis of the interactive effects due of the matching process among different strategies only in the future market field. We may interpret the self-correlations of the two contradictory signs between the item lag 1 and the item lag 2 as the result of the so-called counter reactions against the actions due to the last positively correlated actions due to lag 1. So we shall analyze the self-referencing factor in view of some agent behaviors.

Heterogeneous Agent Strategy

Fig. 4. The partial correlation to lag 10 with 95% white-noise confidence bands in the spot time series

3.2

91

Fig. 5. The partial correlation to lag 10 with 95% white-noise confidence bands in the future time series

Analyzing the Self-referencing Factor in View of Some Agent Behaviors

The average of the previous prices is affected largely by the last price. In fact, we have several agent strategies that exclusively refer to the previous memories of the prices. Such a representative agent may be Moving Average Strategy (MAV). Thus, as long as MAV behaves in the range that spot is almost equal to future, MAV may contribute to keep the equality by cancelling out sell and buy. In this sense, MAV will behave a loss-cut agent to reinforce the equality of the spot and the future. In the vicinity where the spot and future prices are almost equal, the earning of MAV will, on the average, be compressed to be almost zero. Near here we have little advantage between MAV and SMAV. SMAV is a strategy to be decided based on the spot price, while MAV is a strategy to be decided based on the future price. Some of other agent strategies other than MVA may also contribute to the future price formation as a loss-cut agent. In the following, we examine the earning distribution pattern of the special agent strategies. Characteristics of the Earning Distribution. We finally characterize the earning profit distribution in the environment that randomness and redundancies are imbedded. Remarkably, through many simulations applied to all 4 patterns of spot price time series, we obtained a basic property commonly from our simulation results. Our finding may provide us with the reason why we compare this result to Class 4 of FRICA. The graphs are arranged to be almost zero profits in the mid part of the horizontal axis. We immediately discern almost Moving Average Strategy, whether SMVA or MVA, located around the zero level of profit, everywhere of Figs. 6, 7. More interestingly, it may be also found that the same kind of strategy may be symmetrically divided both into the positive and negative part of profit. This phenomenon may be interpreted as a collective loss-cut behavior, if such an expression is allowed.

92

Y. Aruka et al.

Fig. 6. The earning profit distribution in the environment of the oscillating spot price given the minimum special configuration

Fig. 7. The earning profit distribution in the environment of the reverse spot price given the minimum special configuration

Proposition 3 (The earning property). The earning property is also scale free from the absolute number of agents. The Nakajima-Mori special agent configuration usually provides us with an almost fixed mixture also of the earning structure, irrespective of the absolute number of participating agents.

References 1. Aruka, Y., Nakajima, Y., Mori, N.: An examination of market mechanism with redundancies motivated by Turing’s rule selection. Evol. Inst. Econ. Rev. (2018). https://doi.org/10.1007/s40844-018-0115-8 2. Cook, M.: Universality in elementary cellular automata. Complex Syst. 15, 1–40 (2004) 3. Carvalho, D.S.: Classifying the complexity and information of cellular automata (2011). http://demonstrations.wolfram.com/ClassifyingTheComplexityAndInform ationOfCellularAutomata/ 4. Ilachinski, A.: Cellular Automata: A Discrete Universe. World Scientific (2001). ISBN 9789812381835 5. Wolfram, S.: A New Kind of Science. Wolfram Media, Champaign (2002) 6. Kita, H., Taniguchi, T., Nakajima, Y.: Realistic Simulation of Financial Markets Analyzing Market Behaviors by the Third Mode of Science. Evolutionary Economics and Social Complexity Science, vol. 4. Springer, Tokyo (2016)

A Lab Experiment Using a Natural Language Interface to Extract Information from Data: The NLIDB Game Raffaele Dell’Aversana(&) and Edgardo Bucciarelli(&) University of Chieti-Pescara, Viale Pindaro, n. 42, 65127 Pescara, Italy {raffaele.dellaversana,edgardo.bucciarelli}@unich.it

Abstract. This paper makes a case for the challenge of an inductive approach to research in economics and management science focused on the use of a natural language interface for action-based applications tailored to businessspecific functions. Natural language is a highly dynamical and dialectical process drawing on human cognition and, reflexively, on economic behaviour. The use of natural language is ubiquitous to human interaction and, among others, permeates every facet of companies’ decision-making. Therefore, we take up this challenge by designing and conducting a lab experiment – conceived and named by us as NLIDB game – based on an inductive method using a novel natural language user interface to database (NLIDB) query application system. This interface has been designed and developed by us in order both (i) to enable managers or practitioners to make complex queries as well as ease their decision-making process in certain business areas, and thus (ii) to be used by experimental economists exploring the role of managers and business professionals. The long-term goal is to look for patterns in the experimental data, working to develop a possible research hypothesis that might explain them. Our preliminary findings suggest that experimental subjects are able to use this novel interface more effectively with respect to the more commons graphical interfaces company-wide. Most importantly, subjects make use of cognitive heuristics during the treatments, achieving pragmatic and satisficing rather than theoretically oriented optimal solutions, especially with incomplete or imperfect information or limited computation capabilities. Furthermore, the implementation of our NLIDB roughly translates into savings of transaction costs, because managers can make queries without recurring to technical support, thus reducing both the time needed to have effective results from business decisions and operating practices, and the costs associated with each outcome. JEL codes: C91

 D23  L21

Keywords: Human problem solving  Heuristics  Laboratory experimentation  Business exploratory research  Natural language interface game  Cognition

© Springer Nature Switzerland AG 2020 E. Bucciarelli et al. (Eds.): DECON 2019, AISC 1009, pp. 93–108, 2020. https://doi.org/10.1007/978-3-030-38227-8_12

94

R. Dell’Aversana and E. Bucciarelli

1 Introduction One of the success factors of contemporary world companies is the ability to extract information from data, particularly when it comes to huge amounts of data, and access it in a timely and accurate manner, so that it can be used by managers or practitioners in decision-making and, thus, in operational processes and practices, and ultimately in order to achieve pragmatic and satisficing rather than theoretically oriented optimal solutions. By and large, data are collected and stored in data warehouses for a range of different purposes, usually based on relational databases, so as to facilitate retrieval and to support decision-making through appropriate business tools directly related to human problem solving tasks. Thereby, managers or practitioners who intend to exploit that data employ technical personnel (i.e., technicians) capable of performing, among other activities, highly permeating data queries aimed at meeting specific business needs. It is worth mentioning a general point here: the vast majority of managers express their information needs – that is, information requests – in natural language on which the technicians build more or less complex data queries typically using SQL (i.e., structured query language) or OLAP tools (i.e., on-line analytical processing) and, after that, they prepare reports for managers thereupon. However, two major problems emerge from this operational process: (i) the correspondence between an information request and the related outcome is not always accurate; (ii) being able to build queries through SQL or OLAP tools properly is not effortless and can lead to local and systemic errors. With regard to (i), furthermore, a relationship of trust between the managers and technicians is needed, while with regard to (ii) technicians’ ability to correctly understand and implement managers’ (complex) information requests becomes a critical imperative. It would, therefore, be desirable for managers to be able to query data on their own, that is, without having to learn computer programming and formal languages as well as without having to learn to use sophisticated tools, such as OLAP-type query building interfaces. These formal languages, in fact, are somewhat awkward for domain specialists to understand and may cause a cognitive distance to application domains that are not inherent in natural language. A possible way to bridge the gap between natural and formal languages is to use controlled natural languages (CNLs) which mediate between the first two languages. Indeed, CNLs are designed as subsets of natural languages whose grammar and vocabulary are narrowly-defined with the aim to reduce both ambiguity and complexity of common, full natural languages (for an overview see [1–4]). In this paper, we present an experimental economics study based on the use of a CNL – embedded in a computerised decision support system or CNL-based editing interface – designed for representation of specific business knowledge and cognitive skills. More specifically, we design and implement a CNL to represent the automatic semantic analysis and the generation of queries starting from requests formulated in natural language straightforwardly by experimental subjects in the shoes of business managers dealing with the interplay of IT and operational decision-making (without any specialised technical support in the same query production). The layout of the paper is as follows. In Sect. 
2, we present some foundational issues concerning research on human problem solving conducted by means of natural language. Section 3, presents relevant work in designing and programming to



develop a natural language interface to database (NLIDB) query applications, preparatory to subsequent experimental economics activities. In Sect. 4, the design of a lab experiment to explore management decisions in the course of business specific functions is provided. Section 5 presents both the preliminary results of the experiment conducted and discussion based on the analysis of the experimental data. In Sect. 6, finally, we draw interim conclusions and outline how future research will draw on further experiments in line with the inductive approach pursued.

2 Some Foundational Issues The body of knowledge relating to natural language is intertwined especially with cognitive and computer science [25, 26] such as linguistics, cognitive psychology, and artificial intelligence, as well as, in turn, is essential to areas such as economics and management studies both in terms of theoretical results and practical applications. Along these lines, as does much of the literature on information processing language, human problem solving represents a major research advance. Human problem solving is a complex yet cognitive process [5]. As a cognitive process, among others, it is considered with emphasis on situations involving human-machine interaction: “Whereby language is in the context of interpersonal symbol processing. […] To relate language to an IPS [Ed., information processing system] then it is assumed that processing of symbol structures are of the same form with the strings of natural language and is essential to human problem solving. And internal symbols structures that represent problems and information about problems are synonymous with linguist’s deep structure.” Newell & Simon, [5] pp. 65–66. Not only that, but according to [6] the pragmatic view of the relationship between language and cognition allows language to affect the cognitive process, playing a significant role in the cognitive process themselves. Nevertheless, as most conscious-level reasoning carried out by humans, human problem solving is attributable to the use of natural language. To this effect, the logical form of sentences in natural language is represented by expressing knowledge, experience, and beliefs in a general problem space such as business or scientific applications, where series of these sentences express pieces of reasoning. Basically, in human problem solving the sentences expressing accessible-and-relevant knowledge, experience, and beliefs are bonded to series of these same sentences, developing arguments and counterarguments to arrive at reasoned conclusions and, therefore, to a possible problem solution. Involving knowledge from several sources, the problem solving process may involve several substeps – or roles (rational and nonrational) – depending on the nature of the specific problem space and its degree of definition (well- or ill-defined). As it permeates most of the everyday business activities, successful natural language is ubiquitous for human interaction. However, in many problem- and solution-spaces where human reasoning is applied as in many business domains, it is possible to ascertain that for the problem under consideration only assumptions or incomplete information are actually available. In this regard, as displayed by a computer system designed as automated support for human analysis of



specific functions, the regular use of natural language may be very useful in exploiting databases and information gathering, as well as in contributing effectively to decisionmaking and problem solving. More concretely, starting with Yngve’s, Bobrow’s, and Winograd’s pioneering research developed at MIT’s laboratories [7, 8, 14], designing natural language models and interfaces allows the emergence of natural correspondences with human linguistic foundational aspects and practice, and inferential limitation in various fields such as business environments [see also 9]. Here it is important to point out that we are aiming for computerised support of business-specific functions, and not computerised decision-making alone. According to [5, 10, 11], we maintain that standard formal theories of computer problem solving do not consider environmental and human aspects. Therefore, scientists’ agenda should, in a fashion, not only take into account issues that are directly relevant to computer problem solving but, mostly, the wider process of human problem solving using the computer and cognitive sciences. This shifts the perspective from computers to humans, that is, from research on abstract concerns – such as proof and reducibility – to research on more concrete situated activities reflecting the dynamic nature of reality and human cognitive processes. The main research question, then, is to explore real options for a general computer end-user to solve a problem (e.g., a business-specific problem) using computerised decision aid supporting better decision-making or, in other words, what support can computers provide to humans in problem solving? How can computers provide support to unstructured – or ill-defined – decisions in human problem-solving? As scholars concerned about human problem solving, why should this question matter? This work advances an answer focused on controlled natural language and, more specifically, on a natural language interface to database (NLIDB) query application system used to study certain business decisions and operating practices (for more details on NLIDBs, see [12]). For this purpose, we combine experimental-andcomputational economics and, in so doing, we design controlled, incentive compatible tasks that provide an opportunity both to discuss support for human problem solving within a business environment and to evaluate the impact of human decision-making therein. Accordingly, inductive experimentation through the use of NLIDB helps us to gain experimental evidence on viewing both a natural language interface as a human problem solving support system and natural language as composed of basic cognitive processes. By designing and running a lab experiment, this work intends to contribute to improve models of natural language for business applications and, as a consequence, to develop an instrument for appraising the quality of human decision-making and the underlying microeconomic processes that might account for it. Anyway, the ability to reflect on the natural language put forward during the experimental treatments is essential for the construction of arguments and successful problem solving. This relies on natural language, where successful natural language is based on human cognitive (if not even metacognitive) abilities and capabilities of reflection, by pointing to the problem solving technique of systematical state space exploration as the essential basis of human problem solving [13].



3 Designing a Natural Language User Interface

As mentioned in Sect. 1, querying a data warehouse is a critical task in many research fields such as experimental and computational economics, especially for scholars working on applied microeconomics, software engineering, and business applications. However, querying a data warehouse cannot rely solely on predefined queries; it often requires writing more elaborate ones, hence the growing need for manageable, end-user-friendly querying interfaces. Controlled natural languages, capable of being used as high-level interface languages to various kinds of knowledge systems, have gradually emerged. In this Section, we discuss the design and implementation of one such end-user interface based on a controlled natural language, an engineered and simple subset of English (with reduced ambiguity and complexity compared to full natural English) used in business environments for specific business functions. A controlled natural language interface represents an effective way to translate end-users’ requests into SQL/temporal (a query language for temporal databases) and is generally aimed at a specific purpose, bound to a specific context [15]. In this work, given the increasing importance of experimental methods in microeconomics, the context concerns business-specific functions, among which are sales and marketing with their customer data platforms (CDPs). The latter are software platforms that collect big data on a company’s customers from multiple sources and combine them to provide, for each customer, a unified view of the actions taken and to be taken, creating rich and useful customer profiles for analysis and for use in other systems (e.g., to implement business processes aimed at increasing sales as well as strengthening and consolidating the company’s positioning in key markets). In this context, managers’ requests for information typically concern the identification of customer segments with specific characteristics, considered interesting for business processes aimed at increasing sales. Suppose, for the sake of argument, that one interacts with a straightforward database schema with only two tables: one for contacts, called contacts, and one for orders, called orders. There is a 1:n relationship between contacts and orders, using the customer_id key (see the schema in Fig. 1).

Fig. 1. Prototype database schema.
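For concreteness, the schema in Fig. 1 can be thought of along the lines of the following minimal sketch; only the table names, the customer_id key, and the 1:n relationship are taken from the text, while the remaining column names and types are illustrative assumptions:

    -- Minimal sketch of the prototype schema in Fig. 1.
    -- Column names other than customer_id (and an order date) are assumptions.
    CREATE TABLE contacts (
        customer_id  INTEGER PRIMARY KEY,
        full_name    VARCHAR(255),
        email        VARCHAR(255)
    );

    CREATE TABLE orders (
        order_id     INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES contacts(customer_id),  -- 1:n with contacts
        order_date   DATE NOT NULL,
        amount       DECIMAL(10, 2)
    );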

Targeting the customer level, let us now turn to the case of a manager who might be interested in finding good customers according to some criteria (i.e., contacts in Fig. 1), involving retail or wholesale marketing of products or services, in order not only to increase sales but also to earn and retain customers’ loyalty by recognising their needs and rewarding them accordingly. Rather than trying to sell everything to everybody all



the time, managers can zero in on the best customers or prospects (i.e., the most profitable or those most strategically aligned with company objectives) or, in other words, managers can implement campaign management to drive sales with their targeted affinity groups. Here, it is supposed that the most promising customers are those who placed at least six orders dated within the past three months, except those who placed at least two orders within the past two weeks, the rationale being that it is more appropriate to direct campaign management towards good customers who are not currently buying. The manager generally sends her information request to technicians, who translate it into a database query. The request itself is quite simple, for instance: “find contacts placing at least six orders within the past three months, except those placing at least two orders in the past two weeks”. Such a simple request translates into a rather complex SQL query: the manager needs to find all customers (by customer_id) having a certain property, excluding customers having other properties. Here follows one way of writing this query in SQL against the above schema:
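The query below is a sketch of such an SQL formulation, assuming the illustrative column names introduced above and PostgreSQL-style date arithmetic; it is not the authors’ original query.

    -- Contacts with at least 6 orders in the last 3 months,
    -- excluding those with at least 2 orders in the last 2 weeks.
    SELECT o.customer_id
    FROM orders o
    WHERE o.order_date >= CURRENT_DATE - INTERVAL '3 months'
    GROUP BY o.customer_id
    HAVING COUNT(*) >= 6
    EXCEPT
    SELECT o.customer_id
    FROM orders o
    WHERE o.order_date >= CURRENT_DATE - INTERVAL '2 weeks'
    GROUP BY o.customer_id
    HAVING COUNT(*) >= 2;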

While there are different ways to write the above database query, it might not be easy to understand for non-technicians, including many managers. Moreover, one can imagine that with more complex requests the query becomes much more complex to write (and to maintain in case the request changes). The main difficulty lies in the fact that most common interfaces used to extract information from data represent the information extraction process through a process-based abstraction, where one has to figure out the steps and implement the filters. Figure 2 shows a simple process-based graph implementing the same kind of SQL query as above. Note that every task needs to be edited by a technician, who must configure it and fill in the technical details necessary to implement it:

Fig. 2. Algorithmic process representing managers’ decision-making: query of the data source.



Natural language interfaces, instead, allow requests to stay as close as possible to managers’ natural language, completely avoiding both the technical translation (with its related costs and waiting times) and possible errors in writing the database query. The language we develop in this work has a very simple syntax, which allows the same information request to be expressed within the bounds of a strict predicate grammar (i.e., the CNL).

contacts having at least 6 orders within last 3 months, except contacts having at least 2 orders within last 2 weeks

Fig. 3. NLIDB-style queries.

As can be noticed, the manager’s information request coincides with its implementation. The sentence in Fig. 3 can be parsed and interpreted by the machine thanks to our software implementation and, while typing, an advanced editor also provides suggestions on how to continue the sentence (e.g., see Fig. 4).

Fig. 4. Editing with suggestions.

The editor also highlights any errors (the end-user may have misspelled some words or typed a sentence that cannot be adequately interpreted). The grammar used allows a large range of requests, and combinations of them, to be recognised. Each request is subjected to lexical and syntactic analysis to produce an abstract syntax tree which, together with the semantics associated with the constructs, is used by the software to produce a correct query on the database model chosen for the experiment (Fig. 5).

Fig. 5. From the manager’s information request using NLIDBs to SQL generation (pipeline: manager request via NLIDB → lexical analysis → syntax analysis → abstract syntax tree → semantic analysis → SQL generation).

Here follows an extract of the language grammar, written in BNF syntax, capable of parsing queries like the ones in Figs. 3 and 4:
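A grammar fragment of this kind, reconstructed from the query examples in Figs. 3 and 4, might look roughly as follows; the rule names and tokenisation are assumptions for illustration, not the authors’ actual grammar:

    <query>      ::= <request> { "," <request> }
    <request>    ::= [ "except" ] "contacts having" <quantity> <order-word> [ <period> ]
    <quantity>   ::= "at least" <number> | <number>
    <order-word> ::= "order" | "orders"
    <period>     ::= "within last" [ <number> ] <time-unit>
    <time-unit>  ::= "week" | "weeks" | "month" | "months" | "year" | "years"
    <number>     ::= <digit> { <digit> }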



The CNL is built using Xtext, a state-of-the-art development facility for programming languages and domain-specific languages. The SQL generator automatically produces queries like the one mentioned above, starting from the natural language query and based on a database schema as in Fig. 1. The language structure allows multiple requests to be concatenated, separated by commas. Requests can be affirmative or start with the except keyword. All affirmative requests are evaluated through a logical AND (i.e., all must be true), and all except requests are evaluated as a subtraction from the affirmative ones. For instance, consider the following request:
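Such a request (reconstructed here from the semantics described below, rather than quoted verbatim) would read, for instance:

    contacts having at least 6 orders within last 3 months, contacts having at least 1 order within last week, except contacts having at least 2 orders within last 2 weeks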

Semantically, this request means that the contacts must have six orders in the last three months and at least one order in the last week (both conditions must be true) but must not have two orders in the last two weeks. These semantics, and their reference frame, are used to design and conduct the economics experiment proposed in the next section.
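To illustrate the AND/except composition, a request of this kind could be compiled into SQL along the following lines; this is again a sketch under the same schema and dialect assumptions as above, not the system’s actual output:

    -- Affirmative requests are intersected; 'except' requests are subtracted.
    SELECT customer_id FROM orders
    WHERE order_date >= CURRENT_DATE - INTERVAL '3 months'
    GROUP BY customer_id HAVING COUNT(*) >= 6
    INTERSECT
    SELECT customer_id FROM orders
    WHERE order_date >= CURRENT_DATE - INTERVAL '1 week'
    GROUP BY customer_id HAVING COUNT(*) >= 1
    EXCEPT
    SELECT customer_id FROM orders
    WHERE order_date >= CURRENT_DATE - INTERVAL '2 weeks'
    GROUP BY customer_id HAVING COUNT(*) >= 2;

In standard SQL, INTERSECT binds more tightly than EXCEPT, so the affirmative conditions are combined first and the exclusion is applied to their intersection, matching the semantics described above.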

4 Designing a Lab Experiment: The NLIDB Game

Recently, there seems to be growing interest among economists in issues related to language (for a survey, see [16, 17]). In this respect, first Rubinstein [18, 19] and then Blume [20] demonstrated theoretically how optimisation principles might affect the structure of natural languages. From the experimental perspective, Weber & Camerer [22] focus on how organisations elaborate private codes based on natural languages, while Selten & Warglien [23] investigate the costs and benefits of linguistic communication in shaping the emergence of a straightforward language with which to address issues stemming from a coordination task. Other experimental economists classify natural language messages [24] or find that large efficiency losses in communication emerge when subjects differ in their language competence and these differences are private information [21]. To our knowledge, this work is the first to apply the experimental economics methodology to the study of natural language as a means of extracting information from data in the business domain. In this regard, we characterise our research approach according to inductive logic, where by induction we basically mean drawing general conclusions from individual, exploratory observations,



after finding robust empirical regularities in different regions of the world over time. In this paper, in particular, we focus on the area of experimental economics that overlaps with management and organisational sciences and deals with the cognitive aspects of management decision-making. An experimental between-subjects (or between-groups) study was therefore conducted at the University of Chieti-Pescara to ascertain the effectiveness of the laboratory in developing managers’ ability to use and benefit from a natural language interface as part of specific business functions. We named this experiment the NLIDB game, since its purpose is similar to that of a game consisting of correctly answering questions by typing queries in natural language in order to extract business information from data. Guidance is given through a series of questions that prompt the experimental subject to reflect on human problem solving, with the opportunity to interact autonomously with the interface developed for this purpose. We randomly recruited a total of 108 subjects, 54 for each treatment, both from the general student population at the University of Chieti-Pescara and from practitioners working in various industries in Abruzzo with a variety of professional backgrounds and different professional experience. Subjects were seated at visually separated computer terminal desks, where they performed the experimental task independently. Before starting, subjects were informed of the experimental rules (subjects held the role of managers dealing with the extraction of information from data) and were provided with a transcript of the instructions for their treatment. We used a training phase to ensure that everyone understood the experimental task. Each subject was told she would earn €5,00 for participating in the experiment (fixed stake). First of all, each subject was presented with both the database schema in Fig. 1 and some examples of natural language queries (viz., the problem space). Afterwards, the objective of the experiment (viz., the solution space) was illustrated, consisting of being able to type queries based on ten randomly proposed questions. The questions were asked as in the following examples:

• Find all customers who never placed orders during the last six months.
  – Expected result based on CNL: ‘contacts having 0 orders within last 6 months’.
• Let us define promising customers as those who placed a few orders recently (of which at least one last week) and even more over a wider time range. Type a query to find them.
  – Expected result based on CNL: ‘contacts having at least 1 order within last week, contacts having at least 3 orders within last month’.

In the first example above, the task is prescriptive: the experimental subject’s cognitive effort lies in how to type the information request in her natural language so that the NLIDB accepts it. In the second example, the subject is given a free choice of time ranges and quantities, leaving her to decide the values to be entered independently. Subjects knew that the questions would be randomly proposed, and they earned an additional €1,00 for each correct query. The experiment took about half an hour for each treatment, and median earnings were €8,00 (in addition to the fixed stake). In more detail, each experimental subject receives an explanation of the data structure (as in Fig. 1) and of the experimental task required, that is, to query the data using natural language in order to identify sets of information about customers that meet certain requirements. The subject is then shown three examples with related



solutions, which serve as training. After the training, the subject performs the experimental task in one of two treatments, each consisting of 10 different questions. After reading each question, the subject is required to enter her answer in the box provided (see Fig. 7). In the answer box, the software checks that the syntax is correct, so the subject can proceed only after entering a syntactically correct answer. The semantic verification of the answer is performed by the experimenters, who assess whether the query entered is relevant to the experimental question or not. Treatment 1 is of a prescriptive nature: each question has a compelling character, explicitly indicating the objective to be achieved; the experimental subject is required only to formulate her query in the natural language expected. The questions have an increasing level of difficulty, from simple ones, whose answer involves entering a single time period, to more complex ones for which several time periods and several except clauses have to be entered in order to formulate the correct answer. Treatment 2 also presents questions of increasing difficulty but leaves more freedom to the subject, who can independently choose both the time periods and the quantities (see Fig. 6 for an overview of sample questions).

Fig. 6. Overview of sample questions in each of the two treatments.
Experimental treatment no. 1 (undergraduates):
• Find all contacts who placed one or more orders in the last week.
• Find all contacts having six orders both in the last three months and last month.
• Find all contacts having at least one hundred orders during the last two years, at least fifty orders last year, except those having more than twenty orders during the last six months.
Experimental treatment no. 2 (managers):
• Find all contacts who placed a few orders recently.
• Find all contacts having the same order quantity in two different periods (one more recent, one less recent).
• Find all contacts having at least one hundred orders during last two years, at least fifty orders during last year, except those having more than twenty orders during last six months.
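As an illustration (our own, not part of the experimental materials), a CNL answer the interface would accept for the second question of treatment 1 could be:

    contacts having 6 orders within last 3 months, contacts having 6 orders within last month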

Fig. 7. NLIDB game: the experimental interface.



Figure 7 shows the complete interface of the NLIDB game, with a menu of choices on the left side:
• A sub-menu called ‘Introduction’, where subjects can access (both before and during the execution of their experimental task) a description of the purpose of the experiment, the instructions, and some sample questions already solved, in order to familiarise themselves with the task to be performed.
• A ‘play’ button that allows subjects to start the game (or to return to the game after viewing the instructions again).
• An ‘exit’ button that allows subjects to quit the game.
The question is shown in the center of the interface, together with a box where subjects type their answers. A red ‘x’ placed to the left of the typing line reveals the presence of any syntactic errors. At the bottom, a ‘next question’ button allows subjects to move on to the next question when they consider their answer complete.

5 Results and Discussion

In this section, we present the results and a discussion based on the analysis of the experimental data. We administered the experimental tasks to 108 human subjects across two different groups randomly assigned to treatments, so that the two treatment groups contained, respectively: (i) 54 undergraduates enrolled in a degree course in economics and management (34 females and 20 males), with an age range of 19–23 years; and (ii) 54 business and account managers holding a university degree (28 females and 26 males), with an age range of 36–42 years. As indicated above, the questions administered to experimental subjects mimic the questions that business managers ask in order to request information and data analysis and to make decisions accordingly, for example, identifying those customers who have reduced purchase volumes in order to encourage them to make new purchases through targeted actions. As already mentioned, this type of analysis is normally not carried out by managers, who merely pose questions (similar to those proposed to our experimental subjects) to technicians, who then have to translate them into database queries and reports. Therefore, we considered it appropriate to adopt a different approach which, through a controlled natural language, allows non-technicians to formulate the information request and obtain the desired output directly. This approach has at least four advantages: (i) the reduction of waiting times between the manager’s information request and the related response, as the request is formulated directly by the manager herself without the need for technicians; (ii) the opportunity for the manager to expand her information requests in terms of quantities and parameters and, thus, to study the data autonomously and more completely; (iii) the reduction of possible, and otherwise inevitable, errors both in the information request and in the translation of the request and the related query generation (as in the SQL example shown above); (iv) the implementation of a natural language interface roughly translates into savings in transaction costs, because managers can make queries without resorting to



technicians, thus reducing both the time needed to have effective results from business decisions and operating practices, and the costs associated with each outcome.

Fig. 8. Effective human-machine interaction: the effectiveness of the NLIDB.

Figure 8 shows the average trend of some performance measures that attest to the effectiveness of the interface developed. In fact, many human factors, such as usability, users’ cognitive characteristics, and domain knowledge, contribute as much to the effectiveness of the NLIDB as sophisticated algorithms do. In both treatments, as can be seen from the figure above, experimental subjects adapted very quickly to the interface, finding it easy to understand. Cognitive fatigue increased at the beginning and then decreased and stabilised once the subjects had become accustomed to using the interface. Similarly, the average response time was initially variable and then became stable over time.

Fig. 9. The sequence of percentages of correct answers for the two treatment groups (two panels, ‘Treatment 1: % of correct answers’ and ‘Treatment 2: % of correct answers’; vertical axis: 0%–100% of correct answers; horizontal axis: question number 1–10).

Regarding the correctness of the answers provided by the experimental subjects, very similar results were observed for the two treatments (see Fig. 9). Subjects were reasonably accurate in formulating their queries at first; then everyone experienced some difficulty with questions no. 4 and no. 6 in both treatments. Subjects took the most



time to answer these two questions, providing the largest number of incorrect answers. This is also confirmed by the percentage of instructions viewed per question (see Fig. 10). In general, we can argue that, faced with the difficulties of performing their task, subjects gradually improved their skills and were then more accurate in answering the remaining questions. As mentioned in Sect. 4, this first experiment ultimately revealed a basic accuracy of the subjects in formulating semantically correct requests, even though they had no immediate checks on the semantic correctness of their queries. This confirms the validity of the proposed game and leads us to assume that, with a short training period, managers can become precise enough to formulate correct queries independently, merely using an interface similar to ours. Furthermore, according to the number of correct answers given during the two treatments, we classified subjects as FAIR (i.e., just sufficient, less than 70% of correct answers), HIGH (between 70% and 90% of correct answers), and BEST (at least 90% of correct answers). More than half of the experimental subjects fall into the ‘BEST’ category with 9 or 10 correct answers, while the remaining part (except for three subjects in both treatments) achieves a number of correct answers between 7 and 8 (see Table 1).

Table 1. The number of subjects classified as FAIR, HIGH, and BEST over the two treatments (based on the percentage of correct answers). FAIR (

70) for example, in the Piemonte, Veneto and Emilia Romagna regions. These regions tend to develop a number of technologies that can only be replicated in a small number of other Italian regions. Knowledge production is moderately high (60 < KCI < 80) in the Lazio and Emilia-Romagna regions. It is also clear from the table that the leading regions in terms of complex knowledge production are dispersed across Italy.


Table 2. The Knowledge Complexity Index (years: 2004–2012).

Region: 2004 (High-tech patents, ICT patents, Biotechnology patents); 2012 (High-tech patents, ICT patents, Biotechnology patents)
Piemonte: 100.00, 0.00, 100.00; 100.00, 0.00, 100.00
Valle d’Aosta: 100.00, 99.67, 0.00; 100.00, 0.00, 65.14
Liguria: 100.00, 99.67, 0.00; 100.00, 0.00, 65.14
Lombardia: 100.00, 80.18, 0.00; 100.00, 0.00, 39.07
Trentino: 0.00, 100.00, 100.00; 100.00, 0.00, 100.00
Veneto: 100.00, 100.00, 0.00; 100.00, 0.00, 88.49
Friuli VG: 99.99, 0.00, 100.00; 0.00, 0.00, 0.00
Emilia R: 100.00, 0.00, 75.34; 99.99, 0.00, 100.00
Toscana: 100.00, 100.00, 0.00; 81.10, 100.00, 0.00
Umbria: 100.00, 0.00, 100.00; 0.00, 0.00, 0.00
Marche: 0.00, 100.00, 100.00; 0.00, 0.00, 0.00
Lazio: 66.66, 0.00, 100.00; 100.00, 0.00, 100.00
Abruzzo: 100.00, 100.00, 0.00; 100.00, 0.00, 75.83
Molise: 100.00, 79.13, 0.00; 100.00, 75.83, 0.00
Campania: 100.00, 79.13, 0.00; 100.00, 75.83, 0.00
Puglia: 0.00, 100.00, 56.82; 0.00, 0.00, 0.00
Basilicata: 0.00, 0.00, 0.00; 100.00, 100.00, 0.00
Calabria: 0.00, 0.00, 0.00; 100.00, 100.00, 0.00
Sicilia: 100.00, 0.00, 71.74; 0.00, 0.00, 0.00
Sardegna: 100.00, 0.00, 100.00; 0.00, 0.00, 0.00

4 Conclusions

Knowledge is an increasingly critical dimension of competitive advantage. The present work shows how the knowledge cores of Italian regions can be differentiated using patent data and a measure of technological relatedness between patents in different classes. In this paper, we have presented an empirical application of the method of reflection to Italian patent data, describing the geographic patterns and shifts in knowledge complexity originating in the Italian regions from 2004 to 2012. What emerges from our analysis is the existence of geographic variations in knowledge complexity, in the sense that only a few Italian regions, especially Northern ones, produce the most complex new technologies. This suggests that complex knowledge is not spatially widespread: it tends to be produced in a few places, which also suggests that it is difficult to move. How Italian regions could transform their knowledge bases into more complex ones is a fundamental question, especially if we consider the future implementation of the smart specialization strategy introduced in Italy through the Regional Innovation System.




Hot Trading and Price Stability Under Media Supervision in the Chinese Stock Market

Hung-Wen Lin (Nanfang College of Sun Yat-sen University, Guangzhou, China), Jing-Bo Huang (Sun Yat-sen University, Guangzhou, China), Kun-Ben Lin (Macau University of Science and Technology, Macau, China), and Shu-Heng Chen (National Chengchi University, Taipei, Taiwan) [email protected]

Abstract. The Chinese stock market is one of the markets in the world where price reversal most easily takes place, and its opening has been accelerating. However, this market has plenty of shortcomings, some of which involve China’s special corporate culture. Artificial manipulation of financial reports is prevalent, which gives rise to investment uncertainty. We bring turnover, financial transparency, and media coverage into the discussion of Chinese momentum and, in so doing, investigate the reasons for price reversal. We find that portfolios with high turnover usually have high financial transparency but easily encounter price reversal. By contrast, portfolios with high media coverage are free of price reversal even though these stocks have relatively low transparency and turnover.

Keywords: Turnover · Transparency · Media · Momentum

1 Introduction

To study media coverage, turnover, transparency, and momentum in the Chinese stock market, we engender a series of two-way sorted conditional momentum portfolios by dependent classifications (e.g., Kim and Suh 2018; Chen et al. 2014).¹ Additionally, we extend the formation and holding periods to dissect whether our results are sensitive to the time length. In so doing, the investigations of media coverage, turnover, transparency, and momentum are more comprehensive.² A filtering mechanism exists in the Chinese media. Controlled media will convey incomplete information and harm investments. Every coin has two sides: the filtering mechanism may also make media news reliable to a certain extent, as some rumors impeding market operations will not be diffused widely.

¹ In detail, we first classify all stocks in the market by certain factors into several groups. In each group, we select winner and loser stocks by stock returns.
² If stock prices behave stably and exhibit very low volatilities, it is possible for prices to maintain past trends and thus momentum occurs. By contrast, when stock prices vary frequently and have evident volatilities, price reversals are likely to take shape and contrarian appears.




There are a large number of retail investors in the Chinese stock market, and they have deficient investment knowledge and skills. Their investment decisions always depend on what they believe rather than on professional and rational principles. In particular, because retail investors lack knowledge, they tend to trust media news and share their information with each other. In the literature, media coverage can be regarded as a proxy for information dissemination and has impacts on stock prices and returns (Peress 2014). High media coverage stocks grab more investor attention, and retail investors are net buyers of these stocks (Barber and Odean 2008). Thus, due to this trading inclination, higher-coverage stocks will have higher returns. In the Chinese stock market, increases in media coverage bring positive price pressure, and stocks with more media coverage also have higher returns than those with less media coverage (e.g., Huang 2018).

Turnover generally represents the liquidity of stocks (e.g., Bathia and Bredin 2018; Hauser and Kedar-Levy 2018). In detail, liquidity reflects how easily and quickly stocks are transferred among different investors. In short, turnover indicates the trading frequency of stocks. As the largest emerging capital market in the world, the Chinese stock market usually has high turnover (Dey 2005), and stock prices may have high volatilities under high-turnover conditions. Regarding the relationship between turnover and momentum, Lee and Swaminathan (2000) and Avramov et al. (2016) have carried out related analyses. Concretely, Lee and Swaminathan (2000) discovered that turnover is negatively correlated with future stock returns, while Avramov et al. (2016) found that momentum profits are larger in liquid states. In terms of liquidity and momentum, turnover will therefore be an important research variable in this paper.

Jin and Myers (2006) provide evidence that transparency helps stock prices incorporate firm-specific information and can predict stock price crash risk. Yuan et al. (2016) found similar results to Jin and Myers (2006). Kim et al. (2014) also point out that transparency significantly explains stock price informativeness. These findings suggest that transparency stands for stock price informativeness. Besides, transparency relates to stock price volatility through its predictability of crash risk. Based on these relationships with stock prices, we investigate transparency in this paper.

The remainder of this paper is organized as follows. Section 2 describes the empirical design and data. Empirical outcomes are reported in Sect. 3. We check robustness in Sect. 4 and conclude the paper in Sect. 5.

2 Empirical Design and Data

In this section, we introduce our turnover calculation, composite transparency, momentum calculation, media coverage model, and data description. The composite transparency consists of earnings aggressiveness, earnings smoothing, and loss avoidance. The momentum and media calculations follow Jegadeesh and Titman (1993) and Hillert et al. (2014), respectively.


2.1 Turnover Calculation

The turnover rate of each stock is defined as the ratio of the number of traded tradable shares to the number of shares outstanding. From its computation, it indicates the degree of trading frequency of a stock.

TR_{it} = \frac{\#(TTS_{it})}{\#(SO_{it})}    (1)

where TTS_{it} is the traded tradable shares of stock i in period t, SO_{it} is the shares outstanding, and \# denotes the number of data.

2.2 Composite Transparency

Following Bhattacharya et al. (2003), we compute earnings aggressiveness (Dechow et al. 1995), earnings smoothing (McInnis 2010) and loss avoidance (Ball and Shivakumar 2005) to construct the composite transparency (Trans). Our transparency index is a relative concept as it is based on decile rankings.

Trans_{it} = \frac{Deciles(EA_{it}) + Deciles(ES_{it}) + Deciles(LA_{it})}{3}    (2)

where EA_{it}, ES_{it} and LA_{it} respectively denote earnings aggressiveness, earnings smoothing and loss avoidance.

2.3 Media Coverage

According to Hillert et al. (2014), the media coverage of every stock is calculated as follows; we adopt some adjustments to their model for the Chinese market.

\ln(1 + no.art.)_i = \alpha + \sum_{n=1}^{4} \beta_n Indep_{n,i} + \varepsilon_{i,media}    (3)

where \ln(1 + no.art.)_i is the natural log of the number of reported articles regarding stock i, and \sum_{n=1}^{4} \beta_n Indep_{n,i} represents the independent variables and their coefficients, namely: \ln(cap.)_i, the natural log of market capitalization; CSI_i, a dummy taking the value 1 when a stock belongs to the CSI300 index and 0 otherwise; SZS_i, a dummy taking the value 1 when a stock is listed on the Shenzhen stock exchange and 0 otherwise; and \ln(1 + esti.)_i, the natural log of the number of earnings estimates. The regression residual is the media coverage of each stock.

2.4 Momentum Calculations

The formation and holding periods are 6 months each (6 months are standard in the momentum literature). In each month t − 1, we sort the stocks into x groups using



stock returns during the formation period (past t − 7 to t − 2 months). The stock group with the highest formation returns constitutes the winner portfolio, and the group with the lowest formation returns is the loser portfolio. A long position and a short position are respectively created for the winner and loser portfolios in each month t − 1. During the holding period (future t to t + 5 months), we maintain the positions engendered in month t − 1 and calculate the average returns of the winner portfolio, R_{W,p}^{t,t+5}, and the loser portfolio, R_{L,p}^{t,t+5}. The rolling period is p = 1, 2, \ldots, M. Thus, the momentum profit is:

MP = \frac{1}{M} \sum_{p=1}^{M} \left( R_{W,p}^{t,t+5} - R_{L,p}^{t,t+5} \right)

2.5 Data Description

We collect the data on stock prices, financial transparency, and turnover from the CSMAR database. The media data are collected from the China Infobank database. The seasonal data period ranges from 2005 to 2006. Table 1 reports the descriptive statistics of the seasonal and total mean values. For the seasonal mean values, based on the rolling procedure, we rank stocks by certain factors and calculate the corresponding values in each season. After obtaining the time series from the rolling procedure, we calculate the mean value of the time series.

Table 1. Descriptive statistics
Panel A: 3-group classification by turnover
     HO       MO       LO
T    5.556    5.407    5.355
C    −0.017   −0.029   0.031
Panel B: 3-group classification by transparency
     HT       MT       LT
O    146.488  142.443  139.342
C    −0.034   0.001    −0.036
Panel C: 3-group classification by media coverage
     HC       MC       LC
O    145.882  166.427  149.815
T    5.268    5.295    5.196
Panel D: Total mean
O    54.906
T    5.479
C    0.004

Reported in Table 1 are the seasonal mean values of transparency, media coverage, and turnover under 3-group classifications. H, M, and L stand for high, median and low level respectively. O, T, and C stand for turnover, transparency and media coverage respectively. We also calculate the total mean value of each variable’s time series in this table. From Table 1, we can discover some findings. First, transparency and turnover are positively correlated, suggesting that high transparency stocks are always traded



frequently and vice versa. Second, median turnover stocks have the lowest media coverage. Third, median transparency stocks have the highest media coverage. Fourth, median coverage stocks have the highest turnover and transparency.

3 Empirical Outcomes

In this section, we show the outcomes of the two-way sorted conditional momentum portfolios.³ We use a one-sided t-test to check the significance of the momentum profit when we respectively consider transparency, media coverage, and turnover (H0: MP ≥ 0; H1: MP < 0).

3.1 The Two-Way Sorted Conditional Turnover-Momentum Portfolio

With a 2-season formation and holding period, we construct conditional momentum portfolios by turnover and stock returns. Some notations are used to represent the momentum portfolios: Port (O, R) denotes the conditional momentum portfolio created by turnover and stock returns. To produce the Port (O, R) portfolio, we classify stocks into 3 groups, 4 groups, and 5 groups. By this procedure, we can obtain more insights into turnover and momentum. The momentum profits and their t statistics are in Table 2.

Table 2. Two-way sorted conditional turnover-momentum portfolios, Port (O, R) portfolio
Panel A: 3-group classifications
         HO          MO        LO
MP       −0.026      −0.014    −0.013
t-stat   −2.593***   −1.431*   −1.061
Panel B: 4-group classifications
         HO          3         2         LO
MP       −0.025      −0.014    −0.010    −0.013
t-stat   −2.351***   −1.431*   −1.009    −1.009
Panel C: 5-group classifications
         HO          4         3          2         LO
MP       −0.027      −0.021    −0.022     −0.012    −0.011
t-stat   −2.247**    −2.092**  −2.129**   −1.118    −0.819

Reported in Table 2 are the momentum profits of two-way sorted conditional turnover-momentum portfolios. The formation and holding period are both 2 seasons. In formation period, we classify stocks first by turnover into several groups. Then, in each group, stocks are classified by stock returns to engender the momentum portfolios. We calculate the momentum profits in holding period. H, M, and L stand for high, median and low level respectively. O is for turnover. *, **, *** represent the significance at 10%, 5%, and 1% level.

³ In the literature, there are also other types of momentum portfolios. For example, the interactive momentum portfolio classifies stocks by stock returns and certain factors independently. The break-down momentum portfolio first constructs the winner and loser portfolios by stock returns; within the winner and loser portfolios, the stocks are then classified by certain factors. Owing to the length limit of the paper, we only show the results of conditional momentum (e.g., Lee and Swaminathan 2000; Grinblatt and Han 2005).



As Table 2 shows, the market definitely has significantly negative momentum profits with high turnover, suggesting that high turnover elicits contrarian in the Chinese stock market. For example, a negative momentum profit of −0.026 arises with high turnover under 3-group classifications (t-stat = −2.593). These high turnover stocks also have high transparency, but are not accompanied by high media coverage, as Table 1 reports; they still cannot avoid price reversal. High turnover indicates that trading activities are frequent and the stocks remain in hot trading, a situation that leads to high price volatility (Blau and Griffith 2016). Frequent trading makes stock prices vary continuously, and thus high volatility occurs. Consequently, when high volatility takes shape, stock prices are less stable and may exhibit price reversal (Kang et al. 2018) instead of inertia. Finally, price reversal and contrarian emerge.

3.2 The Two-Way Sorted Conditional Transparency-Momentum Portfolio

With a 2-season formation and holding period, we classify stocks by transparency and stock returns to construct conditional momentum portfolios. In constructing the two-way sorted conditional momentum portfolios, we classify stocks into 3 groups, 4 groups, and 5 groups to offer complete results. The momentum profits and their t statistics of Port (T, R) are reported in Table 3.

Table 3. Two-way sorted conditional transparency-momentum portfolios, Port (T, R) portfolio
Panel A: 3-group classifications
         HT         MT       LT
MP       −0.018     0.000    −0.006
t-stat   −1.509*    0.014    −0.505
Panel B: 4-group classifications
         HT         3        2         LT
MP       −0.018     0.000    −0.006    −0.006
t-stat   −1.509*    0.014    −0.505    −0.619
Panel C: 5-group classifications
         HT         4         3        2         LT
MP       −0.016     −0.011    0.009    −0.014    −0.004
t-stat   −1.371*    −1.008    0.767    −1.154    −0.337
Reported in Table 3 are the momentum profits of two-way sorted conditional transparency-momentum portfolios. The formation and holding period are both 2 seasons. In the formation period, we classify stocks first by transparency into several groups. Then, in each group, stocks are classified by stock returns to engender the momentum portfolios. We calculate the momentum profits in the holding period. H, M, and L stand for high, median and low level respectively. T is for transparency. *, **, *** represent the significance at 10%, 5%, and 1% level.

In Table 3, with high transparency, the market is certain to make significantly negative momentum profits. For instance, a significantly negative momentum profit



of −0.018 appears with high transparency under 3-group classifications (t-stat = −1.509). Although high transparency induces decent disclosure, mismanagement in corporate operations is also revealed in financial reports in a timely manner. Eventually, when stock prices incorporate the corresponding information, price reversal and contrarian possibly arise. This finding also suggests that the information of high transparency stocks is easier for investors to trust. Hence, they are more willing to take part in trading activities, and we also show that these stocks have high turnover in Table 1.

3.3 The Two-Way Sorted Conditional Coverage-Momentum Portfolio

We analyze the conditional coverage-momentum portfolio over 2 seasons. Given in Table 4 are the momentum profits and their t statistics of the Port (C, R) momentum portfolio. In producing the momentum portfolios, the classifications are 3 groups, 4 groups, and 5 groups.

Table 4. Two-way sorted conditional coverage-momentum portfolios, Port (C, R) portfolio
Panel A: 3-group classifications
         HC        MC        LC
MP       −0.026    −0.022    −0.046
t-stat   −1.074    −1.246    −2.536***
Panel B: 4-group classifications
         HC        3         2          LC
MP       −0.006    −0.020    −0.024     −0.037
t-stat   −0.288    −0.927    −1.283*    −2.284**
Panel C: 5-group classifications
         HC        4         3         2         LC
MP       −0.017    −0.014    −0.017    −0.010    −0.020
t-stat   −1.064    −0.693    −0.992    −0.525    −1.484*
Reported in Table 4 are the momentum profits of two-way sorted conditional coverage-momentum portfolios. The formation and holding period are both 2 seasons. In the formation period, we classify stocks first by media coverage into several groups. Then, in each group, stocks are classified by stock returns to engender the momentum portfolios. We calculate the momentum profits in the holding period. H, M, and L stand for high, median and low level respectively. C is for media coverage. *, **, *** represent the significance at 10%, 5%, and 1% level.

We can see from the table that Port (C, R) is sure to make significantly negative momentum profits with low media coverage. For instance, with 3-group classifications, the portfolio makes a momentum profit of −0.046 (t-stat = −2.536). Low media coverage makes information deficient and less widely transmitted, and a no-news circumstance will elicit price reversal (Tetlock 2010). Moreover, investors know little about the listed corporations and are uncertain about the future. They are likely to encounter unexpected news and very likely to overreact to information (DeBondt and Thaler 1985). Hence, price reversal and contrarian emerge with low media coverage.



However, with high media coverage, regardless of the number of group classifications, the stocks do not exhibit significant price reversal. For instance, the insignificant momentum profit is −0.017 (t-stat = −1.064) under 5-group classifications. Let us recall that high media coverage stocks do not have the most outstanding turnover and transparency in Table 1. More media coverage exposes the listed corporations to the public and suppresses artificial manipulations. The information is also diffused widely, so investors are less likely to experience information beyond their expectations. Thus, this circumstance restricts the activation of price reversal and contrarian. These findings imply that even without ideal transparency and turnover, high coverage stocks are still free of price reversal to a certain extent.

4 Robustness Checks

For the sake of investigating whether our results are sensitive to the length of time, we calculate momentum profits and related t statistics in extensive periods. The formation and holding period are extended from 2 seasons to 12 seasons.

4.1 Extensive Formation and Holding Periods for Port (O, R) Portfolio

We prolong the formation and holding period of Port (O, R) portfolio from 2 seasons to 12 seasons. As an example, we adopt the classifications of 3 groups and show the momentum profits with high and low turnover.

Fig. 1. Port (O, R) Portfolio in Extensive Periods with High Turnover. Plotted are momentum profits and t-statistics of Port (O, R) portfolio.

As Fig. 1 shows, all the momentum profits in the extensive periods are significantly negative. In the short run, price reversal occurs and, as time goes by, the negative momentum profits become more significant. For example, with a 10-season formation and 8-season holding period, the Port (O, R) portfolio has a negative momentum profit of −0.78 (t-stat = −7). In sum, the high turnover portfolio certainly makes contrarian in China regardless of the length of the period.



Fig. 2. Port (O, R) Portfolio in Extensive Periods with Low Turnover. Plotted are momentum profits and t-statistics of Port (O, R) portfolio.

In Fig. 2, most of the momentum profits are significantly negative. In the short run, the t-statistics are close to 0, but they become larger in magnitude over longer periods. That is, price reversal does not appear in the short run. The surface of t-statistics goes down as the periods are lengthened, implying that information updates promote price correction and cause price reversal with low turnover as time elapses.

4.2 Extensive Formation and Holding Periods for Port (T, R) Portfolio

We lengthen the formation and holding period of the Port (T, R) portfolio from 2 seasons to 12 seasons. Figures 3 and 4 depict the momentum profits with high and low transparency, respectively, from the 3-group classifications.

Fig. 3. Port (T, R) Portfolio in Extensive Periods with High Transparency. Plotted are momentum profits and t-statistics of Port (T, R) portfolio.

Most of the momentum profits with high transparency are significantly negative. Even in the short run of 2 seasons, the momentum profit is significantly negative (t-stat = −1.509). As the periods are increasingly lengthened, the negative momentum profits become more significant. Similarly, the high transparency portfolio is also sure to make contrarian in China regardless of the length of the period (Fig. 4).



Fig. 4. Port (T, R) Portfolio in Extensive Periods with Low Transparency. Plotted are momentum profits and t-statistics of Port (T, R) portfolio.

With low transparency, though all the momentum profits are negative, they tend to be insignificant in the short run. The surface of t-statistics is very close to −1 in the short run, but it goes down in the longer periods, suggesting that price reversal and contrarian tend to appear as time goes by. As time passes, some underlying problems in the corporate operations of low transparency corporations will be perceived. Hence, this provokes price reversal and contrarian.

4.3 Extensive Formation and Holding Periods for Port (C, R) Portfolio

The extensive periods are also applied to Port (C, R) portfolio. We plot the extensive momentum profits and t-statistics in Figs. 5 and 6.

Fig. 5. Port (C, R) Portfolio in Extensive Periods with High Coverage. Plotted are momentum profits and t-statistics of Port (C, R) portfolio.

As Fig. 5 depicts, although all the momentum profits are negative, many of them are insignificant. When the formation and holding periods are extended, the surface of t-statistics does not definitely go down. High media coverage enables information to disseminate widely, so there is less uncertainty about the future and investors are less likely to confront unexpected news. Therefore, high media coverage will restrain price reversal and contrarian to a certain extent in China.



Fig. 6. Port (C, R) Portfolio in Extensive Periods with Low Coverage. Plotted are momentum profits and t-statistics of Port (C, R) portfolio.

The momentum profits are still all negative with low media coverage. In the short run of 2 seasons, the t-statistic is smaller than −2. Besides, most of the t-statistics in the other periods are also below −2, suggesting that the low coverage portfolio is sure to make price reversal in China.

5 Conclusions

We use turnover as a proxy for the degree of trading frequency. In the Chinese stock market, we perceive that high trading frequency stocks always have higher transparency. However, without high media coverage, this kind of portfolio runs a higher risk of encountering price reversal. Meanwhile, we empirically found that high media coverage stocks tend not to exhibit significant price reversal. In other words, these stocks are less likely to present price risk under broad information diffusion with strict official supervision, even though they do not have high transparency and turnover. In particular, high transparency stocks are not necessarily less risky, which indicates only that Chinese retail investors are very willing to participate in the related trading activities. However, this phenomenon also implies that even the better financial reporting systems of some listed corporations in China still have buried defects. The Chinese capital market is dominated by retail investors. Due to the lack of sufficient knowledge, retail investors do not refer to financial reports but rely on media news to make decisions, suggesting that financial reports may not play an extremely important role in China. Therefore, the Chinese stock market cannot be analyzed in the same way as those in the US and Europe. We propose that future research shed light on the effects of retail investors’ preferences on the stock market.

References
Avramov, D., Cheng, S., Hameed, A.: Time-varying liquidity and momentum profits. J. Financ. Quant. Anal. 51, 1897–1923 (2016)
Ball, R., Shivakumar, L.: Earnings quality in UK private firms: comparative loss recognition timeliness. J. Account. Econ. 39, 83–128 (2005)
Barber, B.M., Odean, T.: All that glitters: the effect of attention and news on the buying behavior of individual and institutional investors. Rev. Financ. Stud. 21, 785–818 (2008)
Bathia, D., Bredin, D.: Investor sentiment: does it augment the performance of asset pricing models? Int. Rev. Financ. Anal. 59, 290–303 (2018)
Bhattacharya, U., Daouk, H., Welker, M.: The world price of earnings opacity. Account. Rev. 78, 641–678 (2003)
Blau, B.M., Griffith, T.G.: Price clustering and the stability of stock prices. J. Bus. Res. 69, 3933–3942 (2016)
Chen, H.Y., Chen, S.S., Hsin, C.W., Lee, C.F.: Does revenue momentum drive or ride earnings or price momentum? J. Bank. Finance 38, 166–185 (2014)
DeBondt, W., Thaler, R.: Does the stock market overreact? J. Finance 3, 793–805 (1985)
Dechow, P.M., Sloan, R.G., Sweeney, A.P.: Detecting earnings management. Account. Rev. 70, 193–225 (1995)
Dey, M.K.: Turnover and return in global stock markets. Emerg. Mark. Rev. 6, 45–67 (2005)
Grinblatt, M., Han, B.: Prospect theory, mental accounting, and momentum. J. Financ. Econ. 78, 311–339 (2005)
Hauser, S., Kedar-Levy, H.: Liquidity might come at cost: the role of heterogeneous preferences. J. Financ. Mark. 39, 1–23 (2018)
Hillert, A., Jacobs, H., Müller, S.: Media makes momentum. Rev. Financ. Stud. 27, 3467–3501 (2014)
Huang, T.L.: The puzzling media effect in the Chinese stock market. Pac.-Basin Financ. J. 49, 129–146 (2018)
Jegadeesh, N., Titman, S.: Returns to buying winners and selling losers: implications for stock market efficiency. J. Finance 48, 65–91 (1993)
Jin, L., Myers, S.C.: R2 around the world: new theory and new tests. J. Financ. Econ. 79, 257–292 (2006)
McInnis, J.: Earnings smoothness, average returns, and implied cost of equity capital. Account. Rev. 85, 315–341 (2010)
Kang, M., Khaksari, S., Nam, K.: Corporate investment, short-term return reversal, and stock liquidity. J. Financ. Mark. 39, 68–83 (2018)
Kim, B., Suh, S.: Sentiment-based momentum strategy. Int. Rev. Financ. Anal. 58, 52–68 (2018)
Kim, J.B., Zhang, H., Li, L., Tian, G.: Press freedom, externally-generated transparency, and stock price informativeness: international evidence. J. Bank. Finance 46, 299–310 (2014)
Lee, C.M.C., Swaminathan, B.: Price momentum and trading volume. J. Finance 5, 2017–2069 (2000)
Peress, J.: The media and the diffusion of information in financial markets: evidence from newspaper strikes. J. Finance 5, 2007–2043 (2014)
Tetlock, P.C.: Does public financial news resolve asymmetric information? Rev. Financ. Stud. 23, 3520–3557 (2010)
Yuan, R., Sun, J., Cao, F.: Directors’ and officers’ liability insurance and stock price crash risk. J. Corp. Finance 37, 173–192 (2016)

Does Cultural Distance Affect International Trade in the Eurozone?

Donatella Furia, Iacopo Odoardi, and Davide Ronsisvalle
University of Chieti–Pescara, Pescara, Italy
{donatella.furia, iacopo.odoardi, davide.ronsisvalle}@unich.it

Abstract. International trade is one of the main economic forces on which to base the post-crisis economic recovery. The motivations that induce companies towards specific foreign markets, and governments to propose commercial agreements, are well known. However, another, often underestimated factor profoundly influences import and export decisions: cultural affinity. We calculate the cultural distance between the Euro countries, which are homogeneous in terms of development paths, trade agreements and even a single currency, in addition to spatial proximity. Our analysis computes the cultural distance and the exchange relations of each country with all the other members of the Euro area. Our results allow us to observe how cultural proximity acts on international trade, with particular reference to the post-crisis period. Our analysis suggests the presence of two homogeneous groups of countries, Western and Eastern Europe. The lower cultural distance within each of the two (relatively) homogeneous groups is less of a hindrance to the economic agents of each country when deciding to enter another member’s market.

Keywords: Cultural distance · International trade · Eurozone

1 Introduction

The European Union (EU) bases its existence on the EEC (European Economic Community), established in 1958 between six countries that had the objective of intensifying economic relations to foster common growth. In addition to the objectives of international cooperation, the founding countries had common objectives in the fields of social and labor policies, but also in favoring the movement of workers and goods and thus boosting international trade [2]. International trade is in fact one of the most recognized strengths of all western economies [12], and customs barriers, together with material and immaterial obstacles, can weaken this important resource, especially for those countries that are particularly export-oriented [24]. A further step toward the formation of a cohesive federation of countries is observable in the construction of the so-called Eurozone in 1999, born of the desire to adopt a common currency and common monetary policies. The reduction of constraints on trade and the adoption of a common currency have been important steps in fostering trade between member states [5]. These countries had a common history, similar cultural roots and, in many cases, geographical proximity. These aspects have certainly fostered common growth, and for the more virtuous


countries, international trade has been one of the economic aspects on which to focus to encourage recovery after the 2007–2008 crisis [23]. In this article, we ask whether the cultural distance between countries can play a role in favoring trade within the Euro area. Cultural proximity can affect the decisions and opportunities of economic agents, favoring relations in general, and trade in particular, with those countries that are more similar. In the first step of the analysis, we calculate a cultural distance index among Euro countries. The group of 19 countries has been chosen for specific characteristics: a common currency, the same commercial agreements, and geographical proximity. These characteristics make the population of countries more homogeneous. The second step of the analysis consists in determining, for each country, the effect of cultural distance on international trade with respect to all other countries of the Euro group. The period of analysis is 2009–2014, with the aim of considering the effects of the recent international economic crisis. The objective of the analysis is therefore twofold. First, to study whether the long-run cultural similarities between countries are still valid. Second, to observe whether this distance plays an effective role in favoring trade, and therefore deserves special attention from European and national policy makers. We expect that the socioeconomic effects of globalization have contributed to the cultural similarity between countries with a similar level of income and well-being. Furthermore, we expect the countries of central continental Europe to be closer to each other culturally, a further aspect which strengthens their economies through close trade interconnections.

2 Cultural Distance
The distance between business partners has long been a focus of the international trade literature. Over the years, however, scholars have moved from examining the effects of physical distance (geographic positioning) between countries towards a sort of "cultural distance". In 1956, Beckerman [4] introduced the idea that the so-called cultural distance acts on international trade, but only with the Uppsala model [19] did empirical research verify how the variable of cultural distance acts on the foreign trade of different countries. For a definition of cultural distance, we can consider that of Mueller: "The concept of cultural distance has its origins in early international trade theory as an explanation for why trade tended to be concentrated in foreign markets most similar to domestic markets" ([6] p. 109). This definition sought to explain why international trade could tend to focus on those markets most similar to that of the exporting country. The definition of Inglehart and Baker ([17] p. 21) on how the culture of one country differs from another is: "Despite the cultural change going in a common direction, countries have a unique historical past that continues to shape their national cultures". For the authors, although there are continuous changes in cultures that lead to standardization, each country has its own history that influences its national culture. We consider the definition of Geert Hofstede to list the variables that must be analyzed to evaluate cultural distance: "Distance between cultures, referred to as cultural distance (CD), is defined as the


degree to which the shared norms and values in one society differ from those of another society" ([14] p. 24). In summary, according to Hofstede, two countries are culturally distant when the main social and cultural norms and values in the two countries are very different. Several methods have been used to calculate the cultural distance between countries [13, 16]. While Hofstede tries to give information on the dimensions that must be used to calculate the values and variations of a country's culture, Inglehart presents a dynamic theory of how cultural change in a country acts on its trade at the international level. In this work, we use the Inglehart model of 2004, with which we calculate the cultural distance between the European countries.
2.1 Factors Influencing Trade Between Countries: A Brief Summary

International trade regards the exchange of goods, services and capital between people and companies in different countries. Usually, factors of a political nature, economic proximity and commercial agreements are considered influential on import and export decisions. In particular, exchange rates, level of competitiveness, growing globalization effects, tariffs, trade barriers, transportation costs, languages, culture, and various trade agreements affect companies and their decision to target certain foreign markets. The 2013 WTO report [26] provides a list of the main factors that influence the international trade of a country and suggests the most important ones: demographic changes, investments, technologies, the presence of raw materials, transport costs, political institutions and cultural institutions [26]. According to Krugman and Obstfeld [21], the variables considered to explain international trade are: market size, geographical distance, cultural affinities, geographic location, business model and the presence of borders. Regarding cultural affinity, the central theme of this work, several studies affirm that it has a positive impact on bilateral trade between countries [9–11], and even that the lack of cultural affinity between two countries can completely inhibit trade between them [8].

3 Methods: Cultural Distance Index and Impact on International Trade
Several models exist to estimate the cultural distance between two countries [13, 16]. By using World Value Survey (WVS) data we measure the cultural distance between countries in the period 2009–2014. To calculate the value of the cultural distance, we consider the factorial analysis of Inglehart et al. [16], in which the respondents are classified with respect to two different dimensions of culture: traditional authorities vs. rational authority of the new century (TSR) and survival values vs. self-expression (SSE). Following Inglehart et al. [16], the formula for calculating the cultural distance between the countries of the Euro area is:

$$CD_{ij} = (TSR_i - TSR_j)^2 + (SSE_i - SSE_j)^2 \qquad (1)$$


The second objective of this work is to calculate the influence of the distance in (1) on international trade for each country with each partner market. Following Anderson [3] and Jeroen [18], we consider the effect of cultural distance on trade:

$$v_{i,j,t} = Y_{i,t}^{a} \cdot Y_{j,t}^{b} \cdot D_{i,j}^{c} \cdot CD_{i,j,t}^{h} \qquad (2)$$

In Eq. (2), $v_{i,j,t}$ indicates the volume of exports from country i to country j at time t, $Y_{i,t}$ and $Y_{j,t}$ are the incomes Y of country i and country j at time t, $D_{i,j}$ is the geographical distance between countries i and j, and $CD_{i,j,t}$ is the cultural distance between countries i and j. Equations (1) and (2) are used to estimate the values in Tables 1 and 2.
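As an illustration only, Eqs. (1) and (2) can be evaluated as in the minimal sketch below; the TSR/SSE scores, incomes, distance and exponent values are hypothetical, not the estimates used in the paper.

```python
# Minimal numerical sketch of Eqs. (1)-(2); all figures below are hypothetical.
TSR = {"IT": 0.1, "DE": 1.2}   # traditional vs. rational authority scores
SSE = {"IT": 0.3, "DE": 0.9}   # survival vs. self-expression scores

def cultural_distance(i, j):
    """Eq. (1): squared Euclidean distance on the two Inglehart dimensions."""
    return (TSR[i] - TSR[j]) ** 2 + (SSE[i] - SSE[j]) ** 2

def trade_volume(Y_i, Y_j, D_ij, CD_ij, a=1.0, b=1.0, c=-1.0, h=-1.0):
    """Eq. (2): gravity-style export volume; c and h carry negative signs."""
    return (Y_i ** a) * (Y_j ** b) * (D_ij ** c) * (CD_ij ** h)

cd = cultural_distance("IT", "DE")
print(cd, trade_volume(Y_i=2_000, Y_j=3_500, D_ij=1_200, CD_ij=cd))
```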

4 Data
The model of Inglehart et al. [16] and data from the World Value Survey (WVS) are used to estimate cultural distance, as done by Tadesse and White [25]. We therefore consider the value of TSR, which indicates the relationship between traditional authorities vs. rational authority of the new century, thus showing whether a country is more bound to traditions or open to social change. In this sense, the first type includes subjects linked to religious values, to the family (as a traditional institution), and to community values, while those who are open to novelties tend to give greater importance to individual achievements and economic self-realization. The SSE value indicates survival values vs. self-expression, i.e. the differences between societies that emphasize hard work and self-denial and those that place more emphasis on quality of life, intended as women's emancipation, equal opportunities for different ethnic groups and sexual freedom. To estimate the effect of cultural distance on international trade, we use the model of Anderson [3], also used by Jeroen [18]. In particular, in Eq. (2), $v_{i,j,t}$ indicates the specific export flows. The right-hand side of (2) is a Cobb-Douglas function of the income Y of country i at time t, of the income Y of country j at time t, of the geographical distance D between i and j, and of their cultural distance. The exponents a and b weight the income of each country depending on exports [3]; c and h have negative signs, as trade flows decrease when countries are more geographically and culturally distant [3].

5 Results
5.1 Cultural Distance Among European Countries
Our estimates of cultural distance between Euro countries in the period 2009–2014 are shown in Table 1.

Table 1. Cultural distance among the countries of the Eurozone (2009–2014). [Pairwise distance matrix over AT, B, CY, EE, FI, FR, D, EL, IE, IT, LV, LT, LU, MT, NL, P, SK, SL, ES; values are shown in the original as shaded bins: from 0.01 to 0.50, from 0.51 to 1.00, from 1.01 to 1.50, from 1.51 to 2.00, from 2.00 to 3.00.] Source: Authors' elaborations on World Value Survey data.

From Table 1, we note that two distinct groups coexist in the Euro area. On the one hand, we find the Eastern European group, with values close to each other and distant from the other countries; on the other, a second, quite homogeneous group formed by Central Europe (also characterized by lower average values). The countries that form the second group – Central Europe – are approximately the EEC founding countries, historically more similar to each other and probably more influenced by the socioeconomic globalization that tends to harmonize cultural differences. In contrast with all other countries, we find Cyprus and Malta, which can be considered outlier cases.
5.2 The Effect of International Trade
Table 2 shows the results of the gravity model of Anderson [3], in which we consider the cultural distance, the international trade between countries (exports and imports), the GDP of the two countries and the geographical distance.


Table 2. Influence of cultural distance on international trade (2009–2014). [Pairwise coefficient matrix over the same 19 Eurozone countries; values are shown in the original as shaded bins: from −0.01 to −1.00, from −1.01 to −2.00, from −2.01 to −3.00, from −3.01 to −4.00, and from +0.00 to +1.00.] Source: Authors' elaborations on World Value Survey data and Eurostat.

A smaller cultural distance corresponds to a smaller incidence of this variable on international trade. The second analysis confirms a distinction between Eastern Europe and Central Europe. In the latter countries, which are historically more similar, cultural distance has less influence on trade, and this consequently leads to greater exchanges between them, as is also the case within the group of Eastern European countries. Obviously, geographical distance also plays an important role in this sense. As expected, the relationship between the two extreme cases is anomalous and is the only one to show a positive sign (0.98).

6 Conclusions
Our article can contribute to addressing a shortcoming in the literature on decision-making processes in international trade, as explained, for example, by Crouch [7] with reference to demand in international tourism. As in the model of Akerlof [1], in which estimates of social distance are used to observe the influence on social decisions, in the field of international trade too the decisions of economic agents can be influenced by the information available [15, 20]. Our results suggest that cultural distance restrains international trade even in a relatively homogeneous area with similar trade agreements, such as


the Eurozone, for which our analysis shows an average coefficient of cultural diversity on trade of −1.7063. We note that the countries of the original EU nucleus are less affected (in their trade relations with each other) by cultural distance, probably because they are historically and structurally more similar to each other (first analysis). Our article, in line with that of Lien and Lo [22], emphasizes that the trust derived from cultural familiarity is an important factor in trade decisions, as the literature also suggests for foreign direct investment [16]. In addition, we observe that the possibilities and opportunities for trade between Central Europe and the East area, already more "culturally distant" as demonstrated in Sect. 5.1, would be more affected by an increase in "cultural distance". However, this consideration deserves further investigation, in the light of the homogenization effects of social and economic globalization. The East and West groups would benefit from a greater similarity in the immaterial aspects that regulate numerous socioeconomic factors characterizing the behaviors and decisions of businesses and consumers. We must consider that the aspects useful for defining the very concept of "cultural distance" derive both from historical and traditional aspects (mainly social capital) and from specific investments resulting from long-term policy interventions (e.g., investments in human capital).

References
1. Akerlof, G.A.: Social distance and social decisions. Econometrica 65(5), 1005–1027 (1997)
2. Albareda, L., Lozano, J., Ysa, T.: Public policies on corporate social responsibility: the role of governments in Europe. J. Bus. Ethics 74(4), 391–407 (2007)
3. Anderson, J.E.: The gravity model. Annu. Rev. Econ. 3(1), 133–160 (2011)
4. Beckerman, W.: Distance and the pattern of intra-European trade. Rev. Econ. Stat. 38, 31–40 (1956)
5. Berger, H., Nitsch, V.: Zooming out: the trade effect of the euro in historical perspective. CESifo Working Paper, No. 1435. CESifo GmbH (2005)
6. Caillat, Z., Mueller, B.: The influence of culture on American and British advertising: an exploratory comparison of beer advertising. J. Advert. Res. 36(3), 79–88 (1996)
7. Crouch, G.I.: The study of international tourism demand: a survey of practice. J. Travel Res. 32(4), 41–55 (1994)
8. Deardorff, A.V.: Local comparative advantage: trade costs and the pattern of trade. Research Seminar in International Economics Discussion Paper No. 500, University of Michigan (2004)
9. Disdier, A.C., Tai, S.H., Fontagné, L., Mayer, T.: Bilateral trade of cultural goods. Rev. World Econ. 145, 575–595 (2010)
10. Dreher, A., Gaston, N., Martens, P.: Measuring Globalisation—Gauging Its Consequences. Springer, New York (2008)
11. Felbermayr, G., Toubal, F.: Cultural proximity and trade. Eur. Econ. Rev. 54, 279–293 (2010)
12. Gilpin, R., Gilpin, J.M.: Global Political Economy: Understanding the International Economic Order. Princeton University Press, Princeton (2001)
13. Hofstede, G.: Culture and organizations. Int. Stud. Manag. Organ. 10(4), 15–41 (1980)
14. Hofstede, G.: Culture's Consequences: Comparing Values, Behaviours, Institutions and Organizations across Nations. Sage Publications, Thousand Oaks (2001)


15. Hsieh, C.-T., Lai, F., Shi, W.: Information orientation and its impacts on information asymmetry and e-business adoption: evidence from China's international trading industry. Ind. Manag. Data Syst. 106(6), 825–840 (2006)
16. Inglehart, R., Basanez, M., Diez-Medrano, J., Halman, L., Luijkx, R.: Human Beliefs and Values: A Cross-Cultural Sourcebook Based on the 1999–2002 Values Surveys. Siglo Veintiuno Editores, Mexico City (2004)
17. Inglehart, R., Baker, W.E.: Modernization, cultural change, and the persistence of traditional values. Am. Sociol. Rev. 65(1), 19–51 (2000)
18. Jeroen, J.: Cultural distance and trade within Europe. Economics (2016). http://hdl.handle.net/2105/34267
19. Johanson, J., Vahlne, J.E.: The internationalization process of the firm: a model of knowledge development and increasing foreign market commitments. J. Int. Bus. Stud. 8, 23–32 (1977)
20. Kandogan, Y.: Economic development, cultural differences and FDI. Appl. Econ. 48(17), 1545–1559 (2016)
21. Krugman, P., Obstfeld, M.: International Economics: Theory and Policy, 8th edn. Pearson/Addison-Wesley, Boston (2009)
22. Lien, D., Lo, M.: Economic impacts of cultural institutes. Q. Rev. Econ. Finance 64(C), 12–21 (2017)
23. Nanto, D.K.: The Global Financial Crisis: Foreign and Trade Policy Effects. Congressional Research Service, Report for Congress (2009)
24. Savolainen, R.: Approaches to socio-cultural barriers to information seeking. Libr. Inf. Sci. Res. 38(1), 52–59 (2016)
25. Tadesse, B., White, R.: Does cultural distance hinder trade in goods? A comparative study of nine OECD member nations. Open Econ. Rev. 21(2), 237–261 (2010)
26. WTO: Fundamental economic factors affecting international trade. In: World Trade Report 2013, Factors Shaping the Future of World Trade, pp. 112–219 (2013)

Making Sense of Economics Datasets with Evolutionary Coresets
Pietro Barbiero1 and Alberto Tonda2
1 Politecnico di Torino, Torino, Italy. [email protected]
2 Université Paris-Saclay, INRA, UMR 782 GMPA, 78850 Thiverval-Grignon, France. [email protected]

Abstract. Machine learning agents learn to take decisions by extracting information from training data. When similar inferences can be obtained using a small subset of the same training set of samples, the subset is called a coreset. Coreset discovery is an active line of research, as it may be used to reduce training time as well as to allow human experts to gain a better understanding of both the phenomenon and the decisions, by reducing the number of samples to be examined. For classification problems, the state of the art in coreset discovery is EvoCore, a multi-objective evolutionary algorithm. In this work EvoCore is exploited both on synthetic and on real data sets, showing how coresets may be useful in explaining decisions taken by machine learning classifiers.
Keywords: Classification · Coreset discovery · EvoCore · Evolutionary algorithms · Explain AI · Machine learning · Multi-objective
1 Introduction

Machine learning (ML) algorithms have recently emerged as an extremely effective set of technologies for addressing real-world problems, both on structured and unstructured data [1]. Such progress may be explained as a combination of several factors, including an increasing availability of data and the diffusion of high-performance computing platforms. The main advantage of such models consists in their capacity, as they are composed of thousands or even millions of parameters. Such a peculiarity makes it possible for ML models to fit almost any kind of data distribution [2], thus providing effective solutions to problems that could not have been tackled before. Addressing complex problems requires both a detailed and possibly large set of data and a model with sufficient capacity. However, even if accurate, ML decisions are often completely incomprehensible even for human experts, as they involve a combination of large sets of (i) variables, (ii) samples, and (iii) model parameters. As the amount of available data grows, it takes more time for algorithms to be trained, and it becomes harder for domain experts to make sense of the data itself. One possible solution to such issues is


coreset discovery, extracting a subset of the original samples that can approximate the original data distribution. Termed a coreset [3], such a set of samples can be used both to speed up training and to help human experts make sense of large amounts of data. With regards to coreset discovery for classification, the state of the art is represented by EvoCore [4,5], an evolutionary approach to classification tasks exploiting a state-of-the-art multi-objective evolutionary algorithm, NSGA-II [6]. EvoCore finds the best trade-offs between the number of samples in the coreset (to be minimized) and the classifier error (to be minimized), for a specific classification algorithm. The resulting Pareto front includes different coresets, each one representing an optimal compromise between the two objectives. A human expert would then be able not only to select the coreset most suited to their needs, but also to obtain extra information on the ML algorithm's behavior, by observing its degradation in performance as the number of coreset points in Pareto-optimal candidate solutions decreases. Alternatively, a candidate coreset on the Pareto front can be automatically selected depending on its performance with respect to an unseen validation set. In this work, EvoCore is employed (i) on a toy problem, to provide the reader with an intuitive assessment of its capabilities, and (ii) on a dataset related to credit risk. A short analysis of the samples found in the coresets for the credit risk dataset shows how such points can be extremely informative for a human expert, probably representing different typologies of ideal/non-ideal customers.

2 Background
2.1 Machine Learning and Classification

ML algorithms are able to improve their performance on a given task over time through experience [7]. Such techniques automatically create models that, once trained on user-provided (training) data, can then provide predictions on unseen (test) data. In essence, ML consists in framing a learning task as an optimization task and finding a near-optimal solution for the optimization problem, exploiting the training data. Popular ML algorithms range from decision trees [8], to logistic regression [9], to artificial neural networks [10]. Classification, a classic ML task, consists in associating a single instance of measurements of several features, called a sample, to one (or more) pre-defined classes, representing different groups. ML algorithms can position hyperplanes (often called decision boundaries) in the feature space, and later use them to decide the group a given sample belongs to. The placement of decision boundaries is set to maximize technique-specific metrics, whose value depends on the efficacy of the boundary with respect to the (labeled) training data. Decision boundaries inside a classifier can be represented explicitly, for example as a linear or non-linear combination of the features, or implicitly, for example as the outcome of a group of decision trees or other weak classifiers.

2.2 Coreset Discovery

In computational geometry, coresets are defined as a small set of points that approximates the shape of a larger point set. The concept of coreset in ML is extended to mean a subset of the (training) input samples, such that a good approximation to the original input can be obtained by solving the optimization problem directly on the coreset, rather than on the whole original set of input samples [3]. Finding coresets for ML problems is an active line of research, with applications ranging from speeding up the training of algorithms on large datasets [11] to gaining a better understanding of the algorithm's behavior. Unsurprisingly, a considerable number of approaches to coreset discovery can be found in the specialized literature. Often these algorithms start from the assumption that the single best coreset for a given dataset will be independent of the ML pipeline used, but this premise might not always be correct. Moreover, the problem of finding the coreset, given a specific dataset and an application, can be naturally expressed as multi-objective: on the one hand, the user wishes to identify a set of core samples as small as possible; on the other hand, the performance of the algorithm trained on the coreset should not differ from its starting performance, when trained on the original dataset. For this reason, multi-objective optimization algorithms could be well suited to this task.
2.3 Multi-objective Evolutionary Algorithms

Optimization problems with contrasting objectives have no single optimal solution. Each candidate represents a different compromise between the multiple conflicting aims. Yet, it is still possible to search for optimal trade-offs, for which an objective cannot be improved without degrading the others. The set of such optimal compromises is called the Pareto front. Multi-objective evolutionary algorithms (MOEAs) currently represent the state of the art for problems with contradictory objectives, and are able to obtain good approximations of the true Pareto front in a reasonable amount of time. One of the most effective MOEAs is the Non-dominated Sorting Genetic Algorithm II (NSGA-II) [6].

3 EvoCore

Starting from the intuition that coreset discovery can be framed as a multi-objective problem, and that the results could be dependent on the target ML algorithm, a novel evolutionary approach to coreset discovery for classification has recently been proposed in [4,5]. Given a training set Tr and an ML classifier, a candidate solution in the framework represents a coreset, a subset of the original training set. Candidate solutions are internally represented as bit strings, of length equal to the size of the training set, where a 1 in position i means that the corresponding sample si is retained in the coreset, while a 0 means that the sample is not considered. The classifier is then trained on the candidate coreset, and an evaluation is performed on two conflicting objectives: the number of


samples in the coreset (to be minimized), and the resulting error of the classifier on the original training set (to be minimized). NSGA-II is then set to optimize the coreset, finding a suitable Pareto front consisting of the best compromises with respect to the two objectives. In case the user wishes to obtain a single solution, the original training set Tr can be split into a training set to be used internally, Tr', and a validation set V. At the end of the evolutionary optimization, each candidate coreset on the Pareto front is evaluated on the validation set V (unseen by the evolutionary procedure), to find the compromise with the best generality.
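To make the encoding and the two objectives concrete, the following is a minimal sketch (not the authors' implementation) of how a bit-string candidate can be scored with scikit-learn; the NSGA-II loop itself, run in the paper with the inspyred module, is omitted, and the dataset and classifier below are placeholders.

```python
# Minimal sketch of EvoCore-style candidate evaluation (assumed, not the paper's code).
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import RidgeClassifier
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

def evaluate(bits, clf, X_tr, y_tr):
    """Return the two objectives for a candidate coreset (both to be minimized)."""
    mask = np.asarray(bits, dtype=bool)
    if mask.sum() == 0 or len(np.unique(y_tr[mask])) < 2:
        return len(y_tr), 1.0                       # degenerate candidate: worst values
    model = clone(clf).fit(X_tr[mask], y_tr[mask])  # train only on the candidate coreset
    error = 1.0 - model.score(X_tr, y_tr)           # error on the full training set
    return int(mask.sum()), error

# toy usage on a Moons-like dataset
X, y = make_moons(n_samples=400, noise=0.2, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=42)
rng = np.random.default_rng(0)
candidate = rng.integers(0, 2, size=len(y_tr))      # one random bit-string individual
print(evaluate(candidate, RidgeClassifier(), X_tr, y_tr))
```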

4 Experimental Results and Discussion

In order to prove its efficacy, EvoCore is employed to extract coresets from two benchmark datasets: (i) Moons, a synthetic data set composed of two interleaving distributions with a half-circle shape (2 classes, 400 samples, 2 features); (ii) Credit, a dataset evaluating credit risk (2 classes, 1000 samples, 20 features). In both cases EvoCore is used to find the coreset for four ML algorithms, representative both of classifiers with explicit hyperplanes (Ridge [12], SVC Support Vector Machines [13]) and of ensemble, tree-based classifiers (Bagging [14], RandomForest [15]). All classifiers are implemented in the scikit-learn [16] Python module and use default parameters. For the sake of comparison, it is important that the classifiers follow the same training steps, although under different conditions; therefore, a fixed seed has been used for all algorithms that exploit pseudo-random elements in their training process. All the necessary code for the experiments has been implemented in Python, relying upon the inspyred module [17]. The code is freely available in a BitBucket public repository (see footnote 1). NSGA-II uses the default parameters of the inspyred module, with the exception of μ = 200, λ = 400, a stop condition of 200 generations, and the evolutionary operators bit-flip mutation (probability pbf = 0.5) and 1-point crossover (probability pc = 0.5). Parameter values have been defined after a set of preliminary runs. For each case study, samples are randomly split between the original training set Tr (66%) and the test set (33%). Features of the datasets are normalized using a column statistical scaling (z-score) learned on the training set Tr and then applied to the test set. The results obtained by EvoCore are then compared against the well-known coreset discovery algorithms GIGA [18], FW [19], MP [20], OMP [20], LAR [21,22], and FS [23]. The comparison is performed on three metrics: (i) coreset size (lower is better); (ii) classification accuracy on the test set (higher is better); (iii) running time of the algorithm (lower is better). Results of the comparison are presented in Tables 1 and 2 (see footnote 2). In the original typeset tables, bold text highlights the highest accuracy for each classifier on the test set.

1 Evolutionary Discovery of Coresets, https://bitbucket.org/evomlteam/evolutionarycore-sets/src/master/.
2 The accuracy of EvoCore is related to the coreset along the Pareto front having the highest accuracy on an unseen validation set.
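A minimal sketch of the evaluation protocol described above (66/33 split, z-score scaling learned on the training set only, accuracy on the unseen test set); the dataset and classifier are placeholders and this is not the authors' code.

```python
# Sketch of the comparison protocol (assumed); dataset is a stand-in for a real one.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=0)

scaler = StandardScaler().fit(X_tr)                 # column z-scores learned on Tr only
clf = SVC().fit(scaler.transform(X_tr), y_tr)       # default-parameter classifier
print(clf.score(scaler.transform(X_te), y_te))      # accuracy on the unseen test set
```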

4.1 Meta-analysis

With regards to test accuracy, EvoCore usually bests the other techniques, and is sometimes able to outperform the same classifier trained with the whole original training set. This means that the decision boundaries generated using the evolved core set may generalize even better than those generated using the whole training set. Figure 1 shows the decision boundaries obtained using the whole training set (left column) or the best candidate core set (right column) to train classifiers, on the Moons data set. This result suggests that the performance of ML classifiers would not be a function of the size of the training set (as Big Data and Deep Learning often claim) but a function of the mutual position of the training samples in the feature space. More generally, for artificial intelligence agents it may be more important to have samples with the right mutual position than to have a huge data set. Table 1. Moons data set. Training set size, classification accuracy on an unseen test set and running time (in seconds) for different classifiers exploiting both EvoCore and state-of-the-art algorithms for core set discovery.

Algorithm   | RandomForest (Size / Accuracy / Avg time) | Bagging (Size / Accuracy / Avg time) | SVC (Size / Accuracy / Avg time) | Ridge (Size / Accuracy / Avg time)
All samples | 266 / 0.9328 / –      | 266 / 0.9254 / –      | 266 / 0.9179 / –      | 266 / 0.8134 / –
EvoCore     | 10 / 0.9403 / 440.2 s | 30 / 0.9478 / 415.9 s | 24 / 0.9403 / 126.6 s | 2 / 0.8209 / 139.8 s
GIGA        | 2 / 0.4254 / 0.01 s   | 2 / 0.2463 / 0.01 s   | 2 / 0.4701 / 0.01 s   | 2 / 0.4701 / 0.01 s
FW          | 6 / 0.6493 / 3.6 s    | 6 / 0.6493 / 3.6 s    | 6 / 0.5299 / 3.6 s    | 6 / 0.6866 / 3.6 s
MP          | 3 / 0.5149 / 4.6 s    | 3 / 0.5821 / 4.6 s    | 3 / 0.5896 / 4.6 s    | 3 / 0.6642 / 4.6 s
FS          | 2 / 0.5149 / 4.3 s    | 2 / 0.2313 / 4.3 s    | 2 / 0.6119 / 4.3 s    | 2 / 0.6119 / 4.3 s
OP          | 2 / 0.5149 / 0.01 s   | 2 / 0.2463 / 0.01 s   | 2 / 0.6493 / 0.01 s   | 2 / 0.6493 / 0.01 s
LAR         | 3 / 0.5149 / 24.2 s   | 3 / 0.2388 / 24.2 s   | 3 / 0.5224 / 24.2 s   | 3 / 0.5896 / 24.2 s

Table 2. Credit data set. Training set size, classification accuracy on an unseen test set and running time (in seconds) for different classifiers exploiting both EvoCore and state-of-the-art algorithms for core set discovery.
Algorithm   | RandomForest (Size / Accuracy / Avg time) | Bagging (Size / Accuracy / Avg time) | SVC (Size / Accuracy / Avg time) | Ridge (Size / Accuracy / Avg time)
All samples | 666 / 0.7275 / –    | 666 / 0.7066 / –    | 666 / 0.7635 / –    | 666 / 0.7695 / –
EvoCore     | 223 / 0.7395 / 735  | 241 / 0.7006 / 538  | 94 / 0.7335 / 173   | 11 / 0.7485 / 161
GIGA        | 137 / 0.7305 / 0.11 | 137 / 0.7066 / 0.11 | 137 / 0.7096 / 0.11 | 137 / 0.7156 / 0.11
FW          | 537 / 0.6886 / 0.87 | 537 / 0.6856 / 0.87 | 537 / 0.7635 / 0.87 | 537 / 0.7635 / 0.87
MP          | 528 / 0.6856 / 1.99 | 528 / 0.7036 / 1.99 | 528 / 0.7515 / 1.99 | 528 / 0.7605 / 1.99
FS          | 74 / 0.7006 / 1.90  | 74 / 0.7186 / 1.90  | 74 / 0.7305 / 1.90  | 74 / 0.7455 / 1.90
OP          | 20 / 0.7066 / 0.09  | 20 / 0.7036 / 0.09  | 20 / 0.6617 / 0.09  | 20 / 0.6587 / 0.09
LAR         | 21 / 0.6707 / 0.68  | 21 / 0.6886 / 0.68  | 21 / 0.6407 / 0.68  | 21 / 0.6677 / 0.68


Fig. 1. Decision boundaries on the Moons data set using all the samples in the training set (Left) and only the core set (Right) for training the classifier. Train samples are represented by squares, test samples by crosses, core samples by black diamonds and test errors by ‘x’-shapes.

Figure 2 reports a meta-analysis of all the Pareto-optimal candidate core sets found by EvoCore, divided by dataset, considering all classifiers. A few samples clearly appear very often among all candidate core sets, while others almost never do, but overall there is a considerable number of samples that are included with low but non-negligible frequency, indicating that different classifiers indeed exploit core sets of different shape.

Fig. 2. Frequency of appearance of samples in the Pareto front solutions of all the classifiers.

4.2 Making Sense of Coresets and Decisions in Economics

Focusing on the Credit dataset, a concise analysis of the samples most frequently included in the coresets found by the proposed approach is presented in the following. Notice from Fig. 2 that there are four training points that appear in every core set on the Pareto fronts of all the classifiers. Table 3 lists the features of the most frequent core samples. Interestingly, such samples represent four different customer profiles. C1 is a 58-year-old married female. She is asking the bank for a relatively small credit for a new radio/tv. Currently, she has 4 existing credits at this bank, but her economic status seems stable (long-term employment, home and real estate ownership), despite her low disposable income (less than 1,000 DMs). C2 is an aged single male asking for a credit for a


new car. He is skilled and well paid (∼2,000 DMs). He has recently changed his job, but he lives in his own house and has a life insurance. Overall, he looks like a responsible customer. C3 is a young and skilled man. He has rented the house where he lives (with his wife, probably) for the last three years. Despite his young age, he owns a real estate, he has already paid off previous credits with the bank and his disposable income is very high (more than 15,000 DMs). He is currently asking for a remarkable credit of more than 3,000 DMs, but he is very rich and the bank will reserve him a kid-glove treatment. C4 is 32 years old and wants to buy a new radio/tv. He has had the same employment for the past four years and he owns the house where he lives. However, his saving account is nearly empty and he must provide maintenance for two dependants. Despite their differences, the low-income aged woman (C1), the middle-class aged man (C2), and the wealthy young man (C3) turn out to be good customers for the bank. They probably represent three "ideal" profiles of good customers, as their characteristics suggest economic stability. On the contrary, the middle-class young (and single) man with two dependants to support and low liquidity seems a representative sample of a risky customer. Table 3. Comparison of the fundamental samples found in the credit risk dataset.

[Table 3 in the original compares the profiles C1–C4 (columns) across a set of features (rows), beginning with the checking account balance in DMs; the remaining cell values are not recoverable from this extraction.]

0) = 0, the payout ratio θ satisfies: (i) ∂θ/∂δ < 0 for δ, θ ∈ (0, ∞), ∂θ/∂δ = 0 for δ → ∞, and ∂θ/∂δ → ∞ for δ → 0; (ii) θ → θ_min ≡ a[ρ + (1 + ψ)ḡ] / [L − (1 + ψ)ḡ] > 0 for δ → ∞; and (iii) θ → ∞ for δ → 0⁺. These properties imply the following proposition:


Proposition 1 (Iso-innovation mix (δ, θ)). Using the proposed innovation-backed securities to reward innovation in a decentralized economy, there exist infinitely many combinations of payout term δ and payout ratio θ that are substitutable to attain the same steady-state equilibrium innovation rate, ḡ > 0. From this proposition, to target a specific innovation rate, government has a high degree of freedom in choosing the shape of our proposed amortizing securities. However, we should notice that product innovation proceeds with three distinct externalities in the current context. These externalities refer to the favorable diffusion of knowledge for future innovation, the favorable expansion of variety for consumers, and the adverse hazard of creative destruction against existing products, respectively. Therefore, although the decentralized equilibrium innovation rate is not associated with monopolistic distortions, it is still suboptimal. Thus, we need to inform the government about how to enable a decentralized economy to achieve the first-best allocation by selecting the optimal shape of our proposed securities.
2.2 The Socially Optimal Innovation Rate

To obtain the socially optimal steady state, we assume a social-planning economy whose social planner is to maximize the current-value Hamiltonian,

$$\max_{g(t)} \; H \;\equiv\; \frac{1-\alpha}{\alpha}\,\log n(t) \;+\; \log\!\big[L-(1+\psi)g(t)\big] \;+\; \mu(t)\,n(t)\,g(t) \qquad (10)$$

subject to the transversality condition

$$\lim_{t\to\infty} e^{-\rho t}\,\mu(t)\,n(t) \;=\; 0 \qquad (11)$$

where g(t) is a control variable, n(t) is a state variable, and μ(t) is the costate variable measuring the shadow value of a new variety under the social-planning regime. The social planner is to find the socially optimal allocation of labor to innovate designs for new products so as to maximize economic well-being on behalf of households. Using the first-order conditions of the social planning problem subject to the transversality condition (11), we can obtain the socially optimal innovation rate ḡ^SP by equating the marginal social value of a new variety to the associated marginal social cost as follows:

$$\bar V^{SP} \;\equiv\; \bar n^{SP}(t)\,\bar\mu^{SP}(t) \;\equiv\; \frac{1}{\rho}\left(\frac{1-\alpha}{\alpha}\right) \;=\; (1+\psi)a\left[\frac{1}{L-(1+\psi)a\,\bar g^{SP}}\right] \qquad (12)$$

where $\bar V^{SP} \equiv \frac{1}{\rho}\frac{1-\alpha}{\alpha}$ is the normalized marginal social value of a new variety and $(1+\psi)a\big[\frac{1}{L-(1+\psi)a\,\bar g^{SP}}\big]$ is the normalized marginal social cost. The centralized economy also sees ongoing falls in an innovation's marginal social value and marginal social cost. Therefore, as in the decentralized economy, a new variety's shadow value μ̄^SP(t) is scaled by the mass of available varieties n̄^SP(t), and so is its marginal cost. Solving (12), we can obtain the socially optimal innovation rate,

$$\bar g^{SP} \;=\; \frac{L}{(1+\psi)a} \;-\; \rho\left(\frac{\alpha}{1-\alpha}\right) \qquad (13)$$
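For clarity, the step from (12) to (13) is a direct rearrangement; written out explicitly:

$$\frac{1}{\rho}\,\frac{1-\alpha}{\alpha} \;=\; \frac{(1+\psi)a}{L-(1+\psi)a\,\bar g^{SP}}
\;\Longrightarrow\;
L-(1+\psi)a\,\bar g^{SP} \;=\; \rho\,\frac{\alpha}{1-\alpha}\,(1+\psi)a
\;\Longrightarrow\;
\bar g^{SP} \;=\; \frac{L}{(1+\psi)a} \;-\; \rho\,\frac{\alpha}{1-\alpha}.$$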


The Pareto-optimal innovation rate results from the first-best allocation of resources (labor), subject to the model parameters (L, ψ, a, α, ρ). As implied by Eq. (13), a larger labor force (i.e. larger L) or a higher research productivity (i.e. smaller a) can sustain a larger Pareto-optimal innovation rate, reflecting the model's scale-effect feature. However, the Pareto-optimal innovation rate becomes smaller if there is a larger hazard of creative destruction (i.e. larger ψ), if there is a larger degree of product similarity (i.e. larger α), or if households have a stronger degree of time preference (i.e. larger ρ). All these relationships make logical sense from the social perspective.
2.3 The Socially Optimal Shape of Amortizing Securities

By forcing the decentralized equilibrium innovation rate ḡ to match the socially optimal level ḡ^SP, we can compute any of the infinitely many combinations of a typical amortizing security's payout ratio and term based on (9). Using a benchmark parameter set (ρ = 0.07, α = 0.8, L = 1, a = 1.5 and ψ = 1), we compute the optimal shape of innovation-backed securities, as shown in Fig. 1, where the middle locus corresponds to the benchmark coefficient of creative destruction (ψ = 1.0) and the two other loci to alternative scenarios used as robustness checks on this coefficient. Some interesting observations are in order:

Fig. 1. The socially optimal loci of payout term δ and payout ratio θ

First, when product innovation comes with a more significant externality of creative destruction, government should set a smaller payout ratio for any predetermined payout term in order to achieve social optimality. Second, the payout ratio and payout term are quite substitutable in maintaining social optimality when payout terms are shorter, but not so when payout terms are 30 years or more.
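As a quick numerical illustration (not reported in the paper), plugging the benchmark parameter set into Eq. (13) as reconstructed above gives the targeted socially optimal innovation rate:

```python
# Numerical check of Eq. (13) under the benchmark parameters used for Fig. 1.
rho, alpha, L, a, psi = 0.07, 0.8, 1.0, 1.5, 1.0

g_SP = L / ((1 + psi) * a) - rho * alpha / (1 - alpha)  # socially optimal innovation rate
print(round(g_SP, 4))  # ~0.0533 with these parameter values
```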

3 Concluding Remarks

This paper proposes a novel public reward system whose advantages are threefold. First, it can ensure perfectly competitive diffusion of innovative products while maintaining a pro-innovation mechanism for sustainable macroeconomic growth. Second, unlike what [4] proposes, the prize for innovation is an innovation-backed security rather than a lump-sum prize, thereby precluding the need to incur any up-front cost to taxpayers as soon as a successful innovation arrives. Third, since payouts are distributed based on a product's market performance, the risk of miscalculating the value of a new innovation as a lump-sum prize is eliminated. For purposes of exposition, the proposed reward system relies on two key assumptions. First, it assumes no time lag between the arrival of an innovation and the formation of a perfectly competitive market. If this assumption is violated in practice, the system must enforce compulsory marginal-cost pricing. Second, it assumes a one-to-one mapping between innovation and product. In practice, however, it is common that a product results from the ideas of multiple innovators. If that is the case, the government can split the amortizing security for the innovative product into shares and reward each of these innovators with a distinct number of shares, depending on their ex ante negotiated contract.

References
1. Hopenhayn, H., Llobet, G., Mitchell, M.: Rewarding sequential innovators: prizes, patents, and buyouts. J. Polit. Econ. 114, 1041–1068 (2006)
2. Jones, C.I., Williams, J.C.: Too much of a good thing? The economics of investment in R&D. J. Econ. Growth 5, 65–85 (2000)
3. Judd, K.L.: On the performance of patents. Econometrica 53, 567–585 (1985)
4. Wright, B.D.: The economics of invention incentives: patents, prizes, and research contracts. Am. Econ. Rev. 73, 691–707 (1983)

A Micro-level Analysis of Regional Economic Activity Through a PCA Approach
Giulia Caruso, Tonio Di Battista, and Stefano Antonio Gattone
University G. d'Annunzio, Chieti-Pescara, 66100 Chieti, Italy. [email protected]
Abstract. This paper addresses regional studies at a micro-local level from both the perspective of institutional governance and that of economic analysis, by examining a case study in Central Italy. Our analysis concludes that the past policy approach in the area under investigation offers a potential pathway for academics to work with policy-makers in moving towards the realization of local growth policies. This kind of analysis has great potential for local institutions, since a deeper understanding at the local scale can also support more effective decision-making at the global scale.
Keywords: Regional economics · Principal Component Analysis · Local decision-making

1 Introduction

This work is focused on regional economics, considered as the study of economic issues that have, traditionally, a dominant local dimension. In particular, the paper provides general knowledge regarding the economic behavior of businesses, consumers, and government, as well as how these behaviors, in turn, affect local growth dynamics and policy. After World War II, the economic activities of the Province of Pescara were concentrated in the upper valley of the river Pescara, forming an industrial area. Households started to move from the countryside to this industrialized area, which gradually emerged as a growing regional economy and community, especially due to its strategic position both regionally and nationally. Afterward, the economic activities, but not those of an industrial nature, shifted from the hinterland towards the coast, giving rise to the development of the city of Pescara [7,8]. This paper aims to provide a key for understanding the economic dynamics underlying the development of the Province of Pescara through an analysis of the Census data collected by the Italian National Institute of Statistics (ISTAT) from 1951 to 2011. In the remainder of this paper, we first provide an overview of the economy of the Province. Afterward, we implement an over-time analysis of economic macro-categories and then a PCA of the provincial economy, followed by our conclusions and goals for future research.

2 The Economy of the Province: An Overview

In 1927 Pescara became the capital of the Province and its surrounding area became part of the new territorial entity. The industrial settlements of the upper valley, related to the sectors of energy, extraction and construction, were responsible for a first, significant development of the young province of Pescara, both from an economic and a demographic point of view [8–10]. This area gradually switched from family-run activities, especially related to agriculture and sheep-farming, to others aimed at supporting the manufacturing industry and the sale of agricultural products on a regional and national scale. Furthermore, in the first half of the twentieth century, the development of these municipalities was enhanced by the presence of the chemical industry and of that related to building materials, which contributed to boosting the employment of the entire area. The upper valley of the river Pescara gradually became a pole of attraction for all the surrounding area, due to the railway and to its convergence with the state roads, the main routes to the cities of Chieti, Pescara, Sulmona, Ancona, and Rome. Most households gradually left the countryside to gather in and around this area, giving an impulse to the construction industry and to substantial demographic growth. Despite an early exodus to the United States of America, during the 1950s the migration balance was positive in the upper valley of Pescara. As for the city of Pescara, it switched from businesses dependent upon fisheries to industrial and commercial activities, gradually losing its industrial character. In the late 1920s the transport sector was strengthened in order to tackle the expanding economic development of the city; examples include the transport companies, both urban and regional, the electric railway connecting Pescara to the rich agricultural municipalities of the Tavo valley, and finally the airport of Pescara, which progressively increased passenger traffic. The situation further improved during the 1930s, after Pescara became the capital of the Province; the harbor and the railroad were strengthened to tackle the rising traffic, becoming important poles of urban expansion. At the end of the 1930s, Pescara was the municipality with the highest number of industrial plants within the urban area in the whole region, becoming not only the administrative and infrastructural epicenter of the local economy, but also a strong pole of attraction for regional and extra-regional entrepreneurs. Essentially, the overall picture of Pescara was positive, considering the poor socio-economic development of the time. Within a few decades, Pescara became the most important area of the Province and of the Region, attracting entrepreneurs and companies from outside Abruzzo.

3 An over-Time Analysis of Economic Macro-categories

In order to better understand the main features of the economic growth of the Province of Pescara over the last sixty years, we analyzed the data on productive activities collected through the ISTAT Censuses carried out in the following


years: 1951, 1961, 1971, 1981, 1991, 1996, 2001 and 2011. In particular, these activities have been grouped into the following 8 economic macro-sectors:
– trade;
– services;
– credit and insurance;
– construction industry;
– extractive industry;
– manufacturing industry;
– energy industry (electricity, gas and water supply);
– transports and communication.

For each macro-category and for each municipality in the Province, the number of local units (activities)1 per 1000 inhabitants has been considered. Our analysis aims, on the one hand, to identify the economic characterization of Pescara in comparison to the rest of the Province; on the other, to evaluate its dynamic material progress over time, allowing us to understand its local economic growth and the motivations at the basis of the radical metamorphosis we are witnessing today. To this aim, we implemented an over-time analysis for each macro-category, on the basis of the residents in each of the following areas:
– the city of Pescara;
– the whole Province;
– the whole Province, except the city of Pescara.
The Province of Pescara was very lively in various economic sectors. With regards to trade (Fig. 1) and services (Fig. 2), Pescara performs better, showing, however, a growing trend which is substantially in line with that of the rest of the province, as shown in more detail below.

Fig. 1. Trade.   Fig. 2. Services.
1 "Local unit" shall mean a physical place in which an economic unit (a private enterprise, a public or a non-profit institution) performs one or more activities. The local unit corresponds to an economic unit or a part thereof situated in a geographically identified place. At or from this place an economic activity is carried out for which, save for certain exceptions, one or more persons work for the same company [6].
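As an illustration of how the indicator is built (municipality names, populations and counts below are hypothetical), the number of local units per 1000 inhabitants can be computed and aggregated over the three areas as follows:

```python
# Sketch of the indicator construction; all figures are hypothetical placeholders.
import pandas as pd

census = pd.DataFrame({
    "municipality": ["Pescara", "Torre de' Passeri", "Bolognano"],
    "population":   [120_000, 3_000, 1_200],
    "trade":        [4_800, 160, 30],   # local units in the 'trade' macro-sector
})
census["trade_per_1000"] = 1000 * census["trade"] / census["population"]

areas = {
    "city of Pescara":        census["municipality"] == "Pescara",
    "whole Province":         census["municipality"].notna(),
    "Province excl. Pescara": census["municipality"] != "Pescara",
}
for name, mask in areas.items():
    sub = census[mask]
    print(name, round(1000 * sub["trade"].sum() / sub["population"].sum(), 1))
```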


With regards to credit and insurance (Fig. 3), the provincial municipalities remain below the average of Pescara, while maintaining a slightly increasing trend. The construction industry (Fig. 4) has a greater weight in the provincial municipalities, also showing steady growth over time, whereas Pescara registers a weaker increasing trend. The extractive industry (Fig. 5) is very relevant in the provincial municipalities, although with a negative trend, whereas its impact is rather limited in Pescara.

Fig. 3. Credit and insurance.   Fig. 4. Construction industry and installations of plants.

With regards to the manufacturing industry (Fig. 6), instead, the decreasing trend of the provincial municipalities is substantially in line with that of Pescara. As far as the energy industry is concerned (Fig. 7), the provincial trend is much higher than that of the city of Pescara, at least until the 1990s. The upper valley of Pescara benefited, indeed, from the presence of power plants in the towns of Bussi and Piano d'Orta. Furthermore, the offices of the Italian Public Power Corporation (ENEL) were located in Torre de' Passeri until their closure, due to the law on the free energy market, which caused the proliferation of small and large private enterprises.

Fig. 5. Extractive industry.

Fig. 6. Manufacturing industry.

The sector of transports and communication (Fig. 8) shows an opposite trend in the city of Pescara compared to the rest of the Province. The latter registers


a growing trend from the post-war years until the 1960s, then a decrease until the 1990s, followed by a new pick-up.

Fig. 7. Energy industry.   Fig. 8. Transports and communication.
4 A Principal Component Analysis of the Provincial Economy

In order to further explore the underlying dynamics of the economic activity in the Province of Pescara, we implemented a Principal Component Analysis on the above-mentioned macro-sectors. It has been carried out on the basis of the ISTAT Censuses conducted in the following years: 1951, 1961, 1971, 1981, 1991, 1996, 2001 and 2011. However, only the most representative graphs are shown in this paper (Fig. 9), namely those referring to 1951, 1971, 1991 and 2011, which well reflect the economic evolution of the entire Province. The analysis has been carried out on the eight macro-sectors mentioned above; we identified a substantially reduced number of components which succeed in interpreting the economic performance of the entire Province. Consequently, examining the 1951 Census data, the two principal factors concerning the main economic activities in the Province were, on the one hand, trade together with credit and insurance; on the other, the extractive and the energy industries. Concerning the axis related to trade, the town of Torre de' Passeri was the municipality with the highest prevalence of commercial activities compared to the rest of the province. It is positioned, indeed, in the positive quadrant, at the greatest distance from the origin. On the other axis, Turrivalignani, San Valentino, Bolognano and Serramonacesca turn out to be the towns with the highest characterization of both the extractive and the energy industry, whereas the city of Pescara was weak on both fronts. In 1961, the leading sectors were, on one axis, services and trade, and on the other the energy industry and the transport sector. Concerning the first one, the town of Torre de' Passeri was the leading one, whereas Bolognano was the town to emerge on the second axis.
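A minimal sketch (not the authors' code, with hypothetical data) of the kind of PCA applied here, with municipalities as rows and the eight macro-sectors as columns, using scikit-learn; the component loadings indicate which sectors characterize each axis.

```python
# Sketch of a PCA on a (municipalities x macro-sectors) matrix; data are simulated.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

sectors = ["trade", "services", "credit_insurance", "construction",
           "extractive", "manufacturing", "energy", "transport_comm"]
rng = np.random.default_rng(1951)
X = pd.DataFrame(rng.gamma(2.0, 3.0, size=(40, len(sectors))),  # one row per municipality (count hypothetical)
                 columns=sectors)

pca = PCA(n_components=2)
scores = pca.fit_transform(StandardScaler().fit_transform(X))   # municipality coordinates on the two axes
loadings = pd.DataFrame(pca.components_.T, index=sectors, columns=["PC1", "PC2"])
print(pca.explained_variance_ratio_)  # share of variance captured by each axis
print(loadings.round(2))              # which sectors characterize each axis
```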


Fig. 9. Characterization of municipalities in terms of economic activities for each Census date.

In 1971, the propulsive sectors were services and energy. With regards to the first one, Torre de' Passeri and Salle were the most active towns; whereas, for what concerns the energy industry, Bolognano was the most active municipality. In 1981, the driving sectors were services and trade on one hand and, on the other, both the energy and the extractive industry. With regards to the first axis, the town of Torre de' Passeri stood out once again, followed by other municipalities of the upper valley of Pescara, such as Salle, Cappelle sul Tavo and Catignano. With regards to the second axis, Torre de' Passeri confirms its supremacy, followed by Caramanico and Catignano. In 1991 Torre de' Passeri and Pescara compete for the pole position in both the services and the credit and insurance sectors. With regards to the extractive and to the energy industry, instead, the most active towns were Bolognano, Abbateggio and Scafa. In 1996, the driving sectors were trade and services on one hand and the manufacturing industry on the other. With regards to the first one, the capital of the Province finally assumes the dominant role, followed by the town of Torre de' Passeri, Città Sant'Angelo and Montesilvano. With regards to the second axis, instead, Salle, Cappelle and Elice were the most active towns. In 2001, the most important sectors were trade and services on one hand, and on the other the manufacturing industry. With regards to the first one, Pescara maintains its predominant role, followed by Caramanico, Città Sant'Angelo and Montesilvano. For what concerns the second axis, Salle and Elice were the driving municipalities.


In 2011, Pescara finally obtained the primacy in both the trade and the services sectors, followed by Città Sant'Angelo and Montesilvano, whereas, as far as the manufacturing industry is concerned, the most active municipality was Carpineto.

5 Conclusions and Future Research

From the analysis above, it appears that the upper valley of Pescara played a pivotal role in the economic growth of the entire Province in the last century. It represented, indeed, a pole of attraction for neighboring areas, due both to its strategic geographic position and to the presence of power plants and important industrial settlements. This area was particularly active until the mid-1970s, when the political and administrative shortsightedness of local governments caused a slow but inexorable decline. The construction of the A24 Rome-Pescara highway was seen as a threat of a further reduction in the already limited territory of the upper valley, underestimating the potential for modernization and economic development that it could have generated. In the General Regulatory Plan, indeed, the areas around the highway exit were assigned to an urban residential destination, the hilly ones to agriculture, and the commercial sites turned out to be those most difficult to reach by heavy vehicles. Consequently, the historical commercial activities started suffering and the capability to create employment for the inhabitants of the valley considerably decreased. The upper valley of Pescara underwent a radical transformation also from a demographic point of view: since the population began to find employment in other municipalities, with a gradual exodus towards Pescara, Sulmona and Chieti, house prices in the valley fell, becoming particularly attractive for foreign migrants. From the 1990s until the most recent Census, the beating heart of the economy shifted completely in favor of the provincial capital Pescara and of the municipalities on the Adriatic coast, such as Città Sant'Angelo and Montesilvano, which experienced an increase in the services sector, mainly due to tourism development. With regards to our future research, we would like to implement a further in-depth analysis on this issue through a cluster analysis, in the wake of our previous works [1–5].

References
1. Caruso, G., Gattone, S.A.: Waste management analysis in developing countries through unsupervised classification of mixed data. Soc. Sci. (2019)
2. Caruso, G., Gattone, S.A., Balzanella, A., Di Battista, T.: Cluster analysis: an application to a real mixed-type data set. In: Flaut, C., Hošková-Mayerová, Š., Flaut, D. (eds.) Models and Theories in Social Systems. Studies in Systems, Decision and Control, vol. 179, pp. 525–533. Springer, Heidelberg (2019)
3. Caruso, G., Gattone, S.A., Fortuna, F., Di Battista, T.: Cluster analysis as a decision-making tool: a methodological review. In: Bucciarelli, E., Chen, S., Corchado, J.M. (eds.) Decision Economics: In the Tradition of Herbert A. Simon's Heritage. Advances in Intelligent Systems and Computing, vol. 618, pp. 48–55. Springer, Heidelberg (2018)


4. Di Battista, T., De Sanctis, A., Fortuna, F.: Clustering functional data on convex function spaces. In: Studies in Theoretical and Applied Statistics, Selected Papers of the Statistical Societies, pp. 105–114 (2016)
5. Di Battista, T., Fortuna, F.: Clustering dichotomously scored items through functional data analysis. Electron. J. Appl. Stat. Anal. 9(2), 433–450 (2016)
6. Istat. https://www.istat.it/it/metodi-e-strumenti/glossario
7. Montanari, A., Staniscia, B.: An overview of territorial resources and their users in the Rome and Chieti-Pescara areas. In: Khan, A., Le Xuan, Q., Corijn, E., Canters, F. (eds.) Sustainability in the Coastal Urban Environment: Thematic Profiles of Resources and their Users, pp. 35–60. Casa Editrice Università La Sapienza (2013)
8. Montanari, A., Staniscia, B.: Chieti-Pescara metropolitan area: international migrations, residential choices and economic deconcentration. Migracijske i etničke teme 22(1–2), 137–161 (2006). https://hrcak.srce.hr/5047
9. Montanari, A., Staniscia, B., Di Zio, S.: The Italian way to deconcentration. Rome: the appeal of the historic centre. Chieti-Pescara: the strength of the periphery. In: Razin, E., Dijst, M., Vázquez, C. (eds.) Employment Deconcentration in European Metropolitan Areas: Market Forces versus Planning Regulations, pp. 145–178. Springer, Dordrecht (2007)
10. OECD: Linking Renewable Energy to Rural Development. OECD Green Growth Studies. OECD Publishing (2012). https://doi.org/10.1787/9789264180444-en

Can a Free Market Be Complete?

Sami Al-Suwailem1(B) and Francisco A. Doria2

1 Islamic Research and Training Institute, 8111 King Khaled St., Jeddah 22332-2444, Saudi Arabia
[email protected]
2 Advanced Studies Research Group, PEP/COPPE, Federal University at Rio de Janeiro, Rio de Janeiro 21941, Brazil
[email protected]

Abstract. We adopt an algorithmic approach in characterizing complete markets and complete contracts. We define a free market to be a market where private ownership is positively valued by market members. In an economy with zero transaction costs, if the market is algorithmically complete, ownership will be worthless. Accordingly, the market can be either free or complete, but not both. For the same reason, the market can be either efficient or complete, but not both.

Keywords: Complete contracts · Finite Automata · Ownership · Turing Machines

1 Introduction

In his survey of incomplete contracts, Nobel laureate Jean Tirole [8] remarks: Incomplete contracting arguably underlies some of the most important questions in economics and some other social sciences ... For all its importance, there is unfortunately no clear definition of "incomplete contracting" in the literature. While one recognizes one when one sees it, incomplete contracts are not members of a well-circumscribed family. Moore [5], moreover, notes: "We've had 25 years to come up with a watertight theory of contractual incompleteness and we haven't succeeded yet."
We argue here that it is possible, in principle, to provide a precise definition of complete contracts, or at least of a large class of contracts. This definition fits nicely with the major results obtained in the literature, and points to a promising research program.
This short paper is organized as follows: Sect. 2 defines complete contracts based on principles of logic and computability theory. Section 3 shows why contracts cannot be algorithmically complete. Section 4 shows that, if contracts were in fact complete, then (1) economic agents would be naively rational, and (2) the market would be provably inefficient. Section 5 shows that a complete market cannot be a free market. Section 6 concludes the paper.

2 Contracts as Algorithms

A contract, in essence, is a set of "if–then" statements. If we start from that perspective, we may view contracts as algorithms or computer programs. This characterization is not only practically useful; it also illustrates conceptually, as we shall see, what it means for a contract to be complete.
To be more precise, we define a contract as a computer program satisfying certain properties. Specifically, a contract is a Turing Machine (TM); in order to encompass all possible contract tasks, we can use a TM that is capable of implementing any computable task. That is, any task that can be specified in well-defined procedural steps can be encoded as a Turing Machine.1 All computer programs can be implemented as Turing machines.
Every Turing machine is coded by a Gödel number; that is, we represent the Turing machine Mm = {m} by its Gödel number m. Given a Universal Turing Machine U(m) parametrized by the Gödel number m, we have that Mm(x) = U(m, x); that is to say, machine Mm is emulated by U(m). So:
Definition 2.1. A contract is a family of particular instances of a universal Turing machine U(m), for some m.
We can add, provisionally:
Definition 2.2. A complete contract is a total Mm.2 Otherwise it is an incomplete contract.
One problem remains: Can we tell upfront whether a given TM is total or not?
The Halting Problem
It turns out, however, that, in general, we cannot tell whether a given TM is total or not. This follows from the result established by Alan Turing, known as the Halting Problem. In brief, given a Turing Machine, is it possible to tell in advance if the program will terminate (or halt) upon executing some computation? The answer is: in general, this is not possible. That is, there is no computer program that can tell us whether an arbitrary program will halt upon execution or will get locked into an infinite loop. This result applies to any non-trivial property of the code, as established by another well-known result, Rice's theorem. In other words, we cannot tell in advance all the interesting properties of the output of the code when executed.
Accordingly, a contract cannot specify in advance all the possible consequences of its execution. There will be many instances whereby some important aspects of the outcome of the contract are not foreseeable at the time of contracting.

1. This is in accordance with the Church–Turing thesis, which states that any intuitively computable task can be performed by a suitable TM.
2. A Turing Machine is total if it executes the code and halts over a "halting state," i.e. it neither comes to an abrupt halt nor goes into an infinite computation [7].


One way to show why this is impossible is to prove that it leads to a contradiction. Here is a simple argument based on the well-known Russell's paradox. Russell's paradox arises when we try to classify all sets into two groups: sets that include themselves, and sets that don't. A short tale conceived by Bertrand Russell himself leads to a paradox: in a small town, the local barber shaves all and only those men who do not shave themselves. Now consider the barber himself: to which set does he belong? If he shaves himself, then he is shaved by the barber, who shaves only men who do not shave themselves; if he does not shave himself, then the barber must shave him, so he shaves himself. Either way we reach a contradiction.
Now, suppose we are able to create these two sets:
– A = {x : x ∈ x}, the set of all sets that are members of themselves, and
– B = {x : ¬(x ∈ x)}, the set of all sets that are not members of themselves.
The sets A and B must be mutually exclusive by construction, so that A ∉ B and B ∉ A. Take B. Do we have B ∈ B? If so, ¬(B ∈ B). . . However, if ¬(B ∈ B), then B ∈ B!
Another argument: suppose we are able to classify all sets into either A or B based on a function m as follows:
– A = {x : m(x) = 1}
– B = {x : m(x) = 0}
Since the set B is not supposed to belong to A, then B should not be a member of itself. But this means that B must include itself, because it satisfies its own definition. This means that the characteristic function for B is m(B) = 1. This implies that B ∈ A. But this contradicts the assumption that A and B are mutually exclusive. It follows that there is no systematic procedure to classify sets into classes A and B. Thus, the function m cannot exist.
Following the same reasoning, we can see why it is impossible to identify halting programs algorithmically. Suppose we can. Thus we have:
– A = {x : h(x) = 1}, the set of all programs that halt upon execution, and
– B = {x : h(x) = 0}, the set of all programs that do not.
The function h(x) is a program that is able to identify which programs will halt and which won't. But notice that the sets A and B are programs themselves. Moreover, program B must halt, or else we would not be able to distinguish the two sets from each other. This implies that h(B) = 1 and thus B ∈ A. But this is a contradiction, as the two sets are supposed to be mutually exclusive. It follows that the program h(x) cannot exist. The same argument applies to any non-trivial property of Turing Machines.
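To make the diagonal argument above concrete, here is a minimal, purely illustrative Python sketch (not part of the original paper): it assumes, for contradiction, that a hypothetical function halts(program, argument) exists, and builds a program on which any such function must give the wrong answer.

```python
# Illustrative sketch of the halting-problem diagonal argument.
# `halts` is a HYPOTHETICAL oracle assumed to exist for contradiction;
# no such total, always-correct function can actually be written.

def halts(program, argument) -> bool:
    """Assumed decision procedure: True iff program(argument) halts."""
    raise NotImplementedError("No such procedure can exist.")

def diagonal(program):
    """Do the opposite of what the oracle predicts about program(program)."""
    if halts(program, program):
        while True:          # loop forever if the oracle says "halts"
            pass
    return "halted"          # halt if the oracle says "loops forever"

# The contradiction: consider diagonal(diagonal).
# - If halts(diagonal, diagonal) is True, diagonal(diagonal) loops forever.
# - If it is False, diagonal(diagonal) halts.
# Either way the oracle is wrong, so `halts` cannot exist.
```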

3 The Question of Market Completeness

Are markets complete? If by completeness we mean that any object can be traded, then it is obvious that the market cannot be complete in this sense (we cannot trade love, for


example). But it would be too restrictive to define market completeness based on such a criterion. Instead, we consider a less restrictive one: a market is complete if it is capable of identifying which objects can be traded in the market and which cannot.
Suppose that we are able to systematically identify all contractable objects. By being contractable, an object becomes priceable and therefore tradable in the market. So suppose that we have:
– A = {x : k(x) = 1}, the set of all contractable objects, and
– B = {x : k(x) = 0}, the set of all non-contractable ones.
The function k(x), if it exists, decides which objects are contractable and thus tradable in the market, and which are not. A market cannot be complete if it is not possible to decide such an important question. The function k, however, must itself be contractable, or else it will not be possible to write complete contracts. That is, a complete contract must specify, for example, that "the counterparty must deliver the contractable object x1 such that k(x1) = 1. Delivery of a non-contractable object is not acceptable." This requires the set B to be contractable, which implies that B ∈ A—a contradiction. Accordingly, the function k cannot exist. Thus, there is no systematic way to identify which objects are contractable and which are not. Contracts thus cannot be complete, since it is impossible to systematically identify the subject matter of the contract. It follows that markets cannot be complete.
Do Generalizations Help?
Does incompleteness result from a weakness of our theories' framework? If we add new axioms, will that make those contracts complete? For the sake of our argument, suppose that we extend (in some adequate way, by adding new axioms, or a richer language) our background theory, which uses the first-order classical predicate calculus. We also require that the set of theorems in our extended theory be a recursively enumerable set—that it is the output of some algorithm.3 Define a contract in the extended theory to be a first-order predicate P(x). Then:
Proposition 3.1. Given the conditions above, and if P(x) is non-trivial, then there is an x0 such that one proves in the extended theory neither P(x0) nor ¬P(x0). That is, P(x), seen as a contract, is incomplete. (P(x) is nontrivial if neither ∀x P(x) nor ∀x ¬P(x) is proved in the extended theory.)
This is a generalized Rice's theorem.

3. We of course require that our formal background, represented by our theory, include enough arithmetic. Axioms for + and × plus the so-called trichotomy axiom suffice.

4 A Complete World

We now reverse the question and ask: suppose that we are able to write algorithmically complete contracts. What would these contracts look like?
An algorithmically complete contract is a contract whose output is completely decidable. That is, we are able to identify all non-trivial properties of the outcome of the contract. This can be possible if the language of the contract is less powerful than a Turing Machine. Finite Automata (FA) are computing structures that are unable to encode and execute all computable tasks. Some tasks that can be organized mechanically in a step-by-step procedure, and are thus computable by a TM, will be beyond the scope of FAs. The upshot of this limited computing ability is completeness: almost every question about (deterministic) FA is decidable [1,6]. Hence, if contracts were in the form of Finite Automata, they would be complete.
But in this case the resulting market, although complete, would be provably inefficient. That is, there will be many computable opportunities that such a market systematically fails to exploit. An economy of FAs will be inefficient in the sense that there will be many Pareto-improving arrangements for trade and production that the FAs will repeatedly fail to perform. Turing Machines, on the other hand, are able, according to the Church–Turing thesis, to perform all computable tasks. Thus, a complete market in this sense does not correspond to the efficient competitive market familiar from economics textbooks (Table 1).
There is another hindrance if we wish to restrict our contracts to the FA kind: there is no general algorithm that selects, from an arbitrary set of computer programs, those which are FAs and those which are not. So, we can exhibit explicit sets of FAs, but we cannot (in the general situation) select, from an arbitrary set of computer programs, those which are FAs.

Table 1. Turing Machines vs. Finite Automata

                        Turing Machines   Finite Automata
Computable tasks        All*              Some
Decidable questions     Some              All

* According to the Church–Turing thesis.
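As an illustration of why "contracts as Finite Automata" are decidable, the sketch below (not from the paper; the clause names, states and alphabet are invented for the example) encodes a toy delivery contract as a deterministic finite automaton and answers, by exhaustive state exploration, a question that would be undecidable for arbitrary programs: whether a "breach" state is reachable at all.

```python
# A toy "contract" as a deterministic finite automaton (DFA).
# States and events are hypothetical; the point is that questions such as
# "can the contract ever reach the breach state?" are decidable by a
# finite search over states, unlike for general Turing-machine contracts.

from collections import deque

STATES = {"signed", "delivered", "paid", "breach"}
EVENTS = {"deliver", "pay", "miss_deadline"}

TRANSITIONS = {
    ("signed", "deliver"): "delivered",
    ("signed", "miss_deadline"): "breach",
    ("delivered", "pay"): "paid",
    ("delivered", "miss_deadline"): "breach",
}

def step(state: str, event: str) -> str:
    """Missing transitions leave the state unchanged (a design choice)."""
    return TRANSITIONS.get((state, event), state)

def reachable(start: str) -> set:
    """Breadth-first search over the finite state space."""
    seen, frontier = {start}, deque([start])
    while frontier:
        s = frontier.popleft()
        for e in EVENTS:
            t = step(s, e)
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen

print("breach" in reachable("signed"))  # True: the question is decidable
```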

Probably the most important aspect of a competitive market is that, well, it is competitive, i.e. it is a decentralized system. Agents make their own decisions and follow their preferred choices without centralized control. If the market consisted of Finite Automata, where all questions about choice are decidable, then there would be no value in having a decentralized system. In fact, for a system of FAs, a centralized order is likely to be more efficient than a decentralized one. There is no edge for a free market. This, it turns out, has a major consequence for one of the most important market institutions: private ownership.

5 To Own or Not to Own ...

What is ownership? According to Nobel laureate Oliver Hart, ownership is the set of residual rights. Specifically, "ownership of an asset goes together with the possession of residual rights of control over the asset; the owner has the right to use the asset in any way not inconsistent with a prior contract, custom, or any law" [3].
What does "residual" mean? It means those rights that are not specified in the contract. According to Hart, ownership is valuable because it fills in the gaps in incomplete contracts. If the contract is complete, ownership will be worthless. In a complete world, residual rights of control are irrelevant, since all decisions are specifiable in the contract [4].
A complete market is a market where all contracts are complete. In such a system, ownership has no value or role to play. Private ownership will be indistinguishable from state ownership. If we think that private ownership is an indispensable property of a free market, then a complete market cannot be a free market, since in a complete world ownership is dispensable. In fact, in a complete world, ownership must be dispensable, because it is a constraint on the system. If the market is complete, such a constraint should result in inefficient or suboptimal performance. In contrast, such an institution results in efficient allocation in incomplete markets (Table 2).

Table 2. Ownership in complete & incomplete markets

                     Complete market   Incomplete market
With ownership       Inefficient       Optimal
Without ownership    Optimal           Inefficient

An incomplete market is also inefficient, but not in a systematic manner. That is, there will be opportunities to achieve Pareto-superior allocations, but these allocations will not be generally decidable within the system. However, with proper institutions like private ownership, such inefficiencies will be substantially reduced.
It is common to view the incomplete world we live in as an approximation of the complete world idealized in textbook economics. But this "approximation argument" cannot hold if we take the problem of ownership into account. A real-world market might approximate a complete one in the sense of moving towards a world with zero transaction costs. But the same cannot be said of ownership: we are not moving towards a world of valueless ownership. If anything, it is the opposite: we are moving towards a world of full private ownership. Since ownership is irrelevant in a complete market, real-world markets cannot be an approximation to such a hypothetical world.4

4. For similar theorems see [2].

6 Conclusion

We propose to define complete contracts in accordance with the algorithmic decidability of their outcomes. A complete market, in turn, is a market in which all aspects of all contracts are decidable. Complete contracts correspond to Finite Automata, which provably fail to capture all computable opportunities for efficient allocation of resources. Moreover, with complete contracts, private ownership is irrelevant; in fact, it will be an inefficient constraint. If private ownership and decentralized decision–making are indispensable in a free market, then a free market cannot be complete.

References
1. Bernhardt, C.: Turing's Vision: The Birth of Computer Science. MIT Press, Cambridge (2016)
2. Chaitin, G., da Costa, N., Doria, F.A.: Gödel's Way. CRC Press, Boca Raton (2012)
3. Hart, O.: An economist's perspective on the theory of the firm. Columbia Law Rev. 89, 1757–1774 (1989)
4. Hart, O.: Incomplete contracts and control. Prize lecture, www.nobelprize.org (2016)
5. Moore, J.: Introductory remarks on Grossman and Hart (1986). In: Aghion, P., Dewatripont, M., Legros, P., Zingales, L. (eds.) The Impact of Incomplete Contracts on Economics. Oxford University Press, Oxford (2016)
6. Reiter, E., Johnson, C.M.: Limits to Computation: An Introduction to the Undecidable and the Intractable. CRC Press, Boca Raton (2013)
7. Singh, A.: Elements of Computation Theory. Springer, Heidelberg (2009)
8. Tirole, J.: Incomplete contracts: where do we stand? Econometrica 67, 741–781 (1999)

The Self-Organizing Map: A Methodological Note

Daniela Cialfi(&)

Department of Philosophical, Pedagogical and Economic-Quantitative Sciences, University G. d'Annunzio of Chieti-Pescara, Viale Pindaro 42, 65127 Pescara, Italy
[email protected]

Abstract. Nowadays, the Self-Organizing Map (SOM) is one of the most widely used algorithms for visualising and exploring high-dimensional data: this visual formation is derived from the learning process of the map, in which neighbouring neurons on the input lattice surface are connected to their respective neighbouring neurons on the output lattice surface. The present methodological note introduces the Self-Organizing Map (SOM) algorithm, discussing, in the first part, its background, properties, applications and extensions and, in the second part, its evolution: the formation of new types of topographic maps used for categorical data, such as time series and tree-structured data. In particular, this new type of map could be useful for further micro-data analysis applications, such as document analysis or web navigation analysis, going beyond the limitations of the kernel-based topographic maps or creating new types of kernels, as detailed in the Support Vector Machine literature.

Keywords: Self-Organizing Map · Methodological note · Neural network · Topographic map

1 Introduction and Fundamental Background

For years, Artificial Neural Networks (ANNs) have been used to model information processing systems. Nowadays, the Kohonen Self-Organizing Map (SOM) is one of the most widely used algorithms for visualising and exploring high-dimensional data, because it implements a pattern recognition process in which inter- and intra-pattern relationships between stimuli and responses are learnt without external influence.1

1. In the neural network approach, there are two types of architecture: the feed-forward architecture, in which neurons are concatenated (e.g. the Multi-Layer Perceptron), and the recurrent architecture, in which neurons receive feedback from themselves and from others (e.g. the Hopfield network). There are also two principal types of learning paradigm: supervised and unsupervised learning. The first is based on input–output (input–target) relationships, while the second relies on self-organization and interrelationship/association processes: there is no direct adviser to determine how large the output errors are, but they are evaluated as 'positive' or 'negative' in terms of proximity to or distance from the goal.



The first mathematical form of the self-organizing map was developed by Von der Malsburg and Willshaw [13], based on the projection of two-dimensional presynaptic sheets onto two-dimensional postsynaptic sheets, as in Fig. 1 below.

Fig. 1. Von der Malsburg’s self-organizing map model

Notably, this architecture supposes the existence of two sets of neurons, ordered in two-dimensional layers or a lattice2: using short-range connections, neighbouring cells become mutually reinforcing. Mathematically, the postsynaptic activities are expressed by (1):

\frac{\partial y_i(t)}{\partial t} + c\, y_i(t) = \sum_j w_{ij}(t)\, x_j(t) + \sum_k e_{ik}\, y_k(t) - \sum_{k'} b_{ik'}\, y_{k'}(t) \qquad (1)

where c is the membrane constant, w_{ij}(t) is the synaptic strength between i and j, x_j(t) is the state of the presynaptic cells, and e_{ik}, b_{ik'} are the short- and long-range inhibition constants, respectively. This visual formation is derived from the learning process of the map, in which neighbouring neurons on the input lattice surface are connected to their respective neighbouring neurons on the output lattice surface.
The present methodological note is structured as follows. Section 2 presents the basic version of the SOM algorithm, describing the two stages of which it is composed, the competitive and the cooperative, and the resulting topographic ordering properties. Then the applications and extensions of the SOM algorithm are discussed. In Sect. 3, important future evolutions of these types of topographic maps, such as document and web navigation analyses, are presented, together with a brief conclusion of this methodological note.

2. The lattice is an undirected graph in which every non-border vertex has the same fixed number of incident edges. Its common representation is an array with a rectangular/simplex morphology.


2 The Self-Organizing Algorithm

The Kohonen Self-Organizing Map (SOM) [4], the theme of the present methodological note, is a mathematical abstraction of the aforementioned self-organizing learning principles and incorporates Hebbian learning and lateral interconnection rules.3 Its aim is to transform an incoming signal pattern of arbitrary dimension into a one- or two-dimensional discrete map and to perform this transformation adaptively in a topologically ordered fashion. In this framework, neurons become selectively 'tuned' to various input patterns (stimuli), or classes of input patterns, during the course of competitive learning. As a result, the locations of the neurons become ordered and a map of input 'features' is created on the lattice. As in most self-organization processes, the Kohonen Self-Organizing Map (SOM) presents both competitive and correlative learning stages: when stimuli are presented, neurons enter into competition among themselves in order to possess these stimuli. As a result, the winning neurons strengthen their weights towards these stimuli.4
Following the Von der Malsburg and Willshaw approach, in the Kohonen approach the postsynaptic activities are similar to Eq. (1), and a nonlinear function is applied to each postsynaptic activity to rule out negative solutions:

y_j(t+1) = \varphi\Big[ w_j^T x(t) + \sum_i h_{ij}\, y_i(t) \Big] \qquad (2)

where h_{ij} plays the role of e_{ik} and b_{ik'}. A standard structure is shown in Fig. 2 below.

Fig. 2. Kohonen's self-organizing map example.

As can be seen from the previous figure, the formed bubble is centred on the postsynaptic cell that best matches the input, i.e. the first term of Eq. (2).

3. These rules are the most common rules for unsupervised or self-organizing learning algorithms. Mathematically, the Hebbian learning rule can be written as

\frac{\partial w_{ij}(t)}{\partial t} = \alpha\, x_i(t)\, y_j(t)

where \alpha is the learning rate (0 < \alpha < 1) and x and y are the input and output of the neural system. In particular, to prevent the weights from growing or shrinking without bound, it is necessary to insert a 'forgetting term', as is done in the SOM process.
4. In this methodological note, this is taken to be the minimum Euclidean distance.


More precisely, in the competitive stage, for each input v ∈ V, the neuron with the smallest Euclidean distance to it, called the winner, is selected according to Eq. (3) below:

i^* = \arg\min_i \lVert w_i - v \rVert \qquad (3)

By the minimum Euclidean distance rule, a Voronoi tessellation of the input space is obtained (the grey shaded area in Fig. 3).

Fig. 3. Definition of quantization region in the Self-Organizing Map, Van Hulle [12].

As can be seen from the previous figure, each neuron has a corresponding region in the input space whose boundaries are the perpendicular bisector planes of the lines joining pairs of weight vectors (the grey shaded area). In the cooperative stage, the formation of topographically ordered maps, in which the neuron weights are not modified independently of each other, is crucial. During the learning process, not only the weight vector of the winning neuron is updated but also those of its lattice neighbours. This is achieved with the neighborhood function, centred on the winning neuron and decreasing with the lattice distance from the winning neuron. The weights are updated incrementally, following Eq. (4):

\Delta w_i = \eta\, K(i, i^*, \sigma_K(t))\, (v - w_i), \quad \forall i \in A \qquad (4)

where K is the neighborhood function, centred on the winner i^*, and σ_K(t) is its range at time t. An example of the effect of the neighborhood function can be seen in Fig. 4 below.

Fig. 4. The effect of the neighborhood function in the SOM algorithm, Van Hulle [12].


The behaviour of the neighborhood function gives rise to what is called the two-phased convergence process, which is an important property of the algorithm. Mathematically, this property can be described in the following terms:
• The formation of the topology-preserving mapping: the Peano curve, which is an infinitely and recursively convoluted fractal curve;
• The convergence of the weights (energy function minimization): this aims to perform a gradient descent converging to a local minimum. However, the SOM approach can only be viewed as performing gradient descent once the neighborhood function is eliminated.

Self-Organizing Algorithm

Repeat
  1. At each time t, present an input x(t) and select the winner,

     v(t) = \arg\min_{k \in \Omega} \lVert x(t) - w_k(t) \rVert \qquad (5)

  2. Update the weights of the winner and its neighbors,

     \Delta w_k(t) = \alpha(t)\, \eta(v, k, t)\, [\, x(t) - w_k(t)\,] \qquad (6)

Until the map converges
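A minimal Python sketch of the incremental procedure of Eqs. (5)–(6) is given below; it is purely illustrative and not from the original note. The Gaussian neighborhood, the exponentially decaying learning-rate and range schedules, and the toy two-dimensional data are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data and a small 10x10 lattice of weight vectors in input space.
X = rng.uniform(0.0, 1.0, size=(2000, 2))            # inputs x(t)
rows, cols, dim = 10, 10, 2
W = rng.uniform(0.0, 1.0, size=(rows, cols, dim))     # weights w_k
grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                            indexing="ij"), axis=-1)  # lattice coordinates

T = 20000
alpha0, sigma0 = 0.5, max(rows, cols) / 2.0

for t in range(T):
    x = X[rng.integers(len(X))]                       # present an input x(t)
    alpha = alpha0 * np.exp(-t / T)                   # assumed decay schedule
    sigma = sigma0 * np.exp(-t / T)

    # Competitive stage: winner = neuron with smallest Euclidean distance.
    dists = np.linalg.norm(W - x, axis=-1)
    win = np.unravel_index(np.argmin(dists), dists.shape)

    # Cooperative stage: Gaussian neighborhood on the lattice, centred
    # on the winner and shrinking with sigma(t).
    lat_d2 = np.sum((grid - np.array(win)) ** 2, axis=-1)
    K = np.exp(-lat_d2 / (2.0 * sigma ** 2))

    # Incremental update, Eq. (6): Delta w_k = alpha * K * (x - w_k).
    W += alpha * K[..., None] * (x - W)

print("final quantization error:",
      np.mean([np.min(np.linalg.norm(W - x, axis=-1)) for x in X[:200]]))
```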

2.1 Applications and Extensions

From the application point of view, the geographical map generated by the SOM algorithm is easily understandable. For that reason, the SOM algorithm has found thousands of applications in a wide range of areas, from automatic speech recognition to cloud classification and micro-array data analysis. Depending on how the neuron weights are interpreted, different views of the training set are derived:
• Vector quantization;
• Regression;
• Clustering.
Vector Quantization: the training set is modelled in such a way that the average discrepancy between the data points and the neuron weights is minimized. In fact, there is a connection between the batch version of unsupervised competitive learning (UCL) and the SOM approach, and with the generalised Lloyd algorithm for vector quantization (see [6]).


Regression: in this case, the training set is interpreted in terms of a non-parametric regression, because no prior knowledge is assumed about the nature of the function. From this point of view, this type of process can be considered the first statistical application of the SOM algorithm (see [7]), because the resulting topographic map tries to capture the principal dimensions of the input space.5
Clustering:6 the clustering SOM approach is widely used. Previously, clusters and their boundaries were defined by the user; now, in order to visualise them in a direct way, an additional technique is necessary: the computation of the mean Euclidean distance between a neuron's weight vector and the weight vectors of its nearest neighbors in the lattice. When the maximum and minimum distances have been found for all neurons, and after scaling the distances between 0 and 1, the lattice becomes a grayscale image, called the U-Matrix (see [11]). In particular, this tool was suggested in order to impress the distance information onto the map so as to visualise data structures and distributions. An example of this type of application is the WEBSOM (see Kaski [3]) and, for high-dimensional data visualisation, the Emergent Self-Organizing Map (ESOM). In the first type of application, the SOM is used for document mapping, where each document is represented by a vector of keyword occurrences: similar documents will be linked together in the same cluster. In the second type, on the other hand, emergence is considered as the "[..] ability of a system to produce a phenomenon on a new, higher level. In order to achieve emergence, the existence and cooperation of a large number of elementary processes is necessary. […]" (Van Hulle [12], pp. 14–15). An important difference between this approach and the traditional SOM is the number of neurons used: the Emergent SOM uses a very large number of neurons, at least a few thousand.
From the extension point of view, many adapted versions have emerged over the years: for some, the motivation was the improvement of the original algorithm; for others, the development of new types of topographic map formation. The fundamental reason behind all these new SOM approaches, however, is the need to develop a new learning rule that performs gradient descent on an energy function. Along these lines we have the maximum information process, the growing topographic map algorithms and the kernel-based topographic maps. In the maximum information process, instead of applying a minimisation criterion, Linsker [5] proposed the maximum information preservation principle, in which the processing stage has the property that the output signals discriminate, in an optimal way, among any possible sets of input signals. Based on this principle, he proposed a learning rule for topographic map formation.7 In the growing map case, in contrast to the SOM, the map has a dynamically defined topology, with the aim of capturing as well as possible the distribution of the input structure, in an attempt to overcome the topology mismatches present in the original SOM approach. More specifically, this approach tries to optimise the usage of the neurons (the so-called 'dead units').

5. From the visualisation point of view, the lattice coordinate system can be considered a global coordinate system for different types of data.
6. Clustering is defined as the partitioning of the dataset into subsets of 'similar' data without using prior knowledge about the subsets.
7. He maximised the average mutual information between the output and the input signal components.


In order to do this, it is necessary for the lattice geometry to change: as can be seen from Fig. 5 below, a lattice is gradually generated by successive insertions of neurons and of connections between them.

Fig. 5. Neural Gas algorithm, combined with competitive Hebbian learning, applied to a data manifold consisting of a right parallelepiped, a rectangle and a circle connecting a line.

An intermediate approach is that of Recurrent Topographic Maps, born of the growing need to model data sources that present temporal characteristics, such as a correlation structure, which are impossible to capture with the original SOM algorithm (e.g. the Temporal Kohonen Map (see [1]) and the Merge SOM (see [10])). In the kernel-based approach, instead, the neurons possess overlapping activation regions, usually in the form of kernel functions, because of the density estimation properties of topographic maps: in this way it is possible to combine the unique visualization properties of the SOM with a model of the clusters in the data.

3 Conclusions and Future SOM Development

As we have seen in the previous section, one of the future developments could be to go beyond the limitations of the kernel-based topographic maps, since the neuron input should be a vector, as in the extension used for categorical data,8 or to introduce new types of kernels into the Support Vector Machine literature, as Shaw [8] and Jin [2] suggest in the biochemical field. In the present methodological note, the Self-Organizing Map (SOM) algorithm was introduced, and its properties, applications and some types of extensions and future developments were discussed.

8. It should be stressed that the SOM algorithm has so far been extended only in terms of strings and trees (e.g. Steil [9]).


References
1. Chappell, G., Taylor, J.: The temporal Kohonen map. Neural Netw. 6, 441–445 (1993)
2. Jin, B., Zhang, Y.-Q., Wang, B.: Evolutionary granular kernel trees and applications in drug activity comparisons. In: Proceedings of the 2005 IEEE Symposium on Computational Intelligence (2005)
3. Kaski, S., Honkela, T., Lagus, K., Kohonen, T.: WEBSOM - self-organizing maps of document collections. Neurocomputing 21, 101–117 (1998)
4. Kohonen, T.: Self-Organization and Associative Memory. Springer, Heidelberg (1984)
5. Linsker, R.: Self-organization in a perceptual network. Computer 21, 105–117 (1988)
6. Luttrell, S.P.: Derivation of a class of training algorithms. IEEE Trans. Neural Netw. 1, 229–232 (1990)
7. Mulier, F., Cherkassy, V.: Self-organization as an interactive kernel smoothing process. Neural Comput. 7, 1165–1177 (1995)
8. Shaw, J., Cristianini, N.: Kernel Methods in Computational Biology. MIT Press, Cambridge (2004)
9. Steil, J.J., Sperduti, A.: Indices to evaluate self-organizing maps for structures. In: WSOM07, Bielefeld, Germany (2007)
10. Strickert, M., Hammer, B.: Merge SOM for temporal data. Neurocomputing 64, 39–72 (2005)
11. Ultsch, A., Siemon, H.P.: Kohonen's self-organizing feature maps for explanatory data analysis. In: Proceedings International Neural Networks, pp. 305–308. Kluwer Academic Press, Paris (1990)
12. Van Hulle, M.M.: Faithful Representation and Topographic Maps: From Distortion to Information-Based Self-organization. Wiley, New York (2000)
13. Von der Malsburg, C., Willshaw, D.J.: Self-organizing of orientation sensitive cells in the striate cortex. Kybernetik 4, 85–100 (1973)

The Study of Co-movement Mechanisms Between the Mainland Chinese and Hong Kong Capital Markets

Xianda Shang1, Yi Zhou2, and Jianwu Lin2(&)

1 HSBC Business School, Peking University, Shenzhen 518000, China
2 Graduate School at Shenzhen, Tsinghua University, Shenzhen 518000, China
[email protected]

Abstract. China is gradually opening its secondary markets through many channels. Among all the channels, one of the most important is through Hong Kong. QFII, QDII, RQFII, RQDII, the dual listing of A-H stocks, the Shanghai-Hong Kong Stock Connect and the Shenzhen-Hong Kong Stock Connect are all boosting the co-movement between the Mainland and Hong Kong capital markets. We find that economic dependency, arbitrage strategies and information flow constitute the transmission mechanism of the co-movement between the Mainland China and Hong Kong capital markets and make the pricing of financial instruments in these markets much more efficient. While the co-movement effect is strengthened by increasing capital investment and commodity imports and exports, some fluctuations are caused by policy shocks and the life cycle of arbitrage.

Keywords: Co-movement · Macroeconomic factors · Microeconomic factors · Life cycle of arbitrage strategy · Information flow

1 Introduction

Ever since economic globalization and trade liberalization swept across the globe in the late 1980s, economic and financial integration has increased considerably as many countries gradually opened up their economies and eased their financial restrictions. International capital thus began to move around the world at an increasing speed in pursuit of profit. On the one hand, economic globalization and international trading contribute to the integration and development of the global market. On the other hand, they also cause capital markets to be more vulnerable to financial shocks, and co-movement and risk-spreading become more prevalent. In particular, the number of studies on stock market linkage has surged due to the recurring global financial crises since the 1980s.
Ever since then, China has been gradually opening up its stock markets through many channels. Among all the channels, one of the most important is through Hong Kong. QFII, QDII, RQFII, RQDII, the dual listing of A-H shares, the Shanghai-Hong Kong Stock Connect and the Shenzhen-Hong Kong Stock Connect are all boosting the co-movement between the Mainland and Hong Kong capital markets. Co-movement is defined as the same direction of changes for major stock indices in each market. In the case of the China and Hong Kong markets, the co-movement is


calculated as the correlation between the Shanghai Stock Exchange Composite Index and the Hang Seng Index. Both markets picked up in 2005, reached their peaks simultaneously in 2008, and together crashed later in 2009. They then both rose again to a record high in 2015 before diving again. Ever since then, global financial market integration has been an important topic in finance for China and the world, and the degree to which financial markets are interconnected has an important bearing on information shock transmission and financial stability.
Although international capital markets are perceived to have become more highly integrated, Bekaert and Harvey [1] find time-varying integration for several countries. Some countries appear to integrate more with the world while others do not exhibit such tendencies. However, market co-movement may change depending on different market conditions. Beine et al. [2] measured stock market coexceedances and showed that macroeconomic variables asymmetrically impact stock market co-movement across the return distribution. Financial liberalization significantly increases left-tail co-movement, whereas trade integration significantly increases co-movement across all quantiles. Nguyen et al. [3] employed Chi-plots, Kendall (K)-plots and three different copula functions to empirically examine the tail dependence between the US stock market and the stock markets in Vietnam and China, and found no change in the dependence structure, but there exists stronger left-tail dependence between the US and Vietnamese stock markets. As for the Chinese market, it is more independent of the US stock market, which suggests that US investors have significant potential to invest in the Chinese market due to risk diversification after the 2008 global financial crisis. Hearn and Man [4] examined the degree of price integration between the aggregate equity market indices of Hong Kong, as well as the Chinese Shanghai and Shenzhen A and B share markets. They found that, in the long term, Chinese markets are more affected by domestic shocks than by external effects.
Many studies show that several macroeconomic factors affect market co-movement. Johnson and Soenen [5] found that the equity markets of Australia, China, Hong Kong, Malaysia, New Zealand, and Singapore are highly integrated with the stock market of Japan. They found that a higher import share, as well as a greater differential in inflation rates, real interest rates, and gross domestic product growth rates, has negative effects on stock market co-movement between country pairs. Conversely, a higher export share from Asian economies to Japan and greater foreign direct investment from Japan in other Asian economies contribute to greater co-movement. Lin and Cheng [6] employed the multinomial logit model to determine the economic factors that affect the co-movement relationships in the stock markets of Taiwan and four major trading partners (Mainland China, United States, Japan and Hong Kong), using daily data covering the period from 1994 to 2004. Their empirical results indicate that the volatility of stock market returns and the rate of change in the exchange rate are both important factors that affect co-movement. Additionally, interest rate differentials play an increasingly important role in the periods after the 2008 global financial crisis. Chen [7] found that the degree of a market's co-movement with international stock markets is closely related to that of its country's integration into the global economy.
Besides macroeconomic factors, many microeconomic market factors also impact the co-movement of markets, especially between the Mainland China and Hong Kong capital markets. The profitability of hedge fund arbitrage strategies may reflect the level of the co-movement. The segmented nature of the Chinese stock market has attracted much attention from researchers. Many hedge funds leverage this feature to profit through arbitrage strategies and further impact the co-movement between these two markets. Another important micro market factor which impacts market co-movement is the information flow between the Chinese A-share market and H-share market. In 2004, Chen et al. [8] studied whether informed traders choose to trade in the stock market or the options market due to information flow. In 2008, Kwon and Yang [9] applied the entropy transfer model to the stock market, in which the information flow between the composite stock index and individual stocks replaces entropy transfer. In 2012, Kwon and Oh [10] applied this model to the global market: they conclude that the difference between the information flow from the stock index to individual stocks and that from individual stocks to the stock index is greater in mature markets than in emerging markets. In 2017, Huang et al. [11] studied how the information flows among the five Chinese markets (the Shanghai A-share market, Shanghai B-share market, Shenzhen A-share market, Shenzhen B-share market, and Hong Kong market) impacted each other from 1995 to 2014.
In sum, recent studies have overlooked the factors influencing stock market co-movement at both the macroeconomic and microeconomic levels simultaneously. Our empirical analysis complements these studies, as we simultaneously investigate stock market co-movement at the macroeconomic level as well as the microeconomic level. Furthermore, we investigate the differences between the co-movement of the Chinese market and the G20 markets and the co-movement of the Chinese market and the Hong Kong market, providing insights into further understanding the roles played by microeconomic determinants at times of stable economic environments.
The rest of this paper is organized as follows. Section 2 defines the co-movement measurement used in this paper. Section 3 discusses the relationship between the co-movement and macroeconomic factors. Section 4 proposes two micro market factors and uses them to explain the residuals left from the main linear macroeconomic factor regression model. Section 5 concludes this research and proposes our policy recommendations.

2 Problem Statement

We first examine to what extent Hong Kong's and the G20 countries' stock markets are integrated with the Chinese stock markets. We choose these countries because they are the major economies of the world and also the major origins and destinations of trade activities and capital flows. South Africa, Malaysia and Saudi Arabia are dropped due to lack of data. All the data are obtained from WIND. We use the Shanghai Composite Index as a proxy for the mainland stock market and the Hang Seng Index for the Hong Kong stock market, because these were two of the few major indices in existence throughout the data period under observation. The relatively large size of the Chinese economy and the relatively large demand for commodities from developing countries suggest that an exogenous shock to the Chinese stock market may negatively affect these developing countries' stock markets if China's economy and stock market are not performing well. Similarly, the fact that China has a larger trade surplus with developed countries may imply a highly integrated trade relationship between China and these


trading partners. Similarly, a stock market crash in those countries with a large trade deficit with China may adversely impact the Chinese stock markets as well.
In this paper, we first regress the co-movement between the G20 and Chinese stock markets mainly to identify the major macroeconomic factors. Due to the interconnection between the Chinese and Hong Kong markets, we believe that there are also other microeconomic factors that help to better explain the co-movement effect between these two markets. Therefore, the macroeconomic factors determined from the G20 countries' panel data are in turn used as control variables when we regress the co-movement effect between the Chinese and Hong Kong markets to find the microeconomic factors, which are the focus and key point of our research. In particular, we are interested in the co-movement between China and Hong Kong's equity markets, which is defined as the correlation between the weekly closing stock index values for the Shanghai Composite Index (SHC) and the Hang Seng Index (HSI).
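As an illustration of this co-movement measure, the sketch below computes a rolling correlation between two weekly index series with pandas. It is not from the paper; the window length, the synthetic data, and the column names are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Synthetic weekly closing levels standing in for the SHC and HSI series.
rng = np.random.default_rng(1)
weeks = pd.date_range("1995-03-31", periods=500, freq="W")
shc = pd.Series(3000 * np.exp(np.cumsum(rng.normal(0, 0.02, len(weeks)))),
                index=weeks, name="SHC")
hsi = pd.Series(15000 * np.exp(np.cumsum(rng.normal(0, 0.02, len(weeks)))),
                index=weeks, name="HSI")

# 52-week rolling correlation of the weekly closing values, following the
# paper's stated definition; correlating log returns is a common variant.
levels = pd.concat([shc, hsi], axis=1)
rolling_corr = levels["SHC"].rolling(window=52).corr(levels["HSI"])

print(rolling_corr.dropna().tail())
```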

3 Macroeconomic Factors

Previous studies have examined the roles of macroeconomic factors that may impact the level of economic integration, such as interest rate differentials, exchange rate expectations, GDP, imports and exports, FDI and ODI. Most of the literature agrees that trade volume, capital flow, and GDP are the most significant variables. Therefore, we choose to include in our model macroeconomic factors such as Gross Domestic Product (GDP), commodity exports (Exports), commodity imports (Imports), Foreign Direct Investment (FDI) and Overseas Direct Investment (ODI). The exchange rate is not included because, for the time being, China is mostly pegged to the US dollar, while Hong Kong maintains a "Linked Exchange Rate System" which is in fact a proxy for the US dollar. Also, the exchange rate is under the tight control of the Central Bank of China. The interest rate differential is not used because China imposes a strict capital control policy that severely limits arbitrage of the interest rate differential.
We collect the major stock market index values for Hong Kong and eighteen economies in the G20 from 1995/3/31 to 2017/12/31 at a weekly frequency. To analyze international trade and financial linkage, we construct our four-factor panel data regression model as follows:

Y_t^i = a_1 M_t^i + a_2 X_t^i + a_3 F_t^i + a_4 O_t^i + \varepsilon_t^i \qquad (1)

where:
Y_t^i = correlation between China's SHC and country i's stock market index;
M_t^i = imports of country i from China, as a percentage of i's GDP;
X_t^i = exports of country i to China, as a percentage of i's GDP;
F_t^i = FDI of country i from China, as a percentage of i's GDP;
O_t^i = ODI of country i to China, as a percentage of i's GDP;
\varepsilon_t^i = residual.
Our linear regression model does not include a constant term because all the x variables are already normalized to a standard distribution. They are treated this way because we want to focus on the effect of a percentage change of an independent variable on the correlation between the two countries' stock markets.
We first examine to what degree equity markets in the G20 are integrated with China's equity market. We next examine the extent to which macroeconomic variables that are usually associated with economic integration explain the changes in the degree of stock market integration. Using weekly returns from 1995 to 2017, Table 1 shows that Exports and FDI seem to be the major economic factors that contribute to the co-movement phenomenon. Since the degree of economic integration varies over time for a given pair of countries, we expect the extent of equity market integration to vary systematically. As Table 1 shows, we find that increased exports from China and greater FDI into China contribute to greater co-movement. By using this same model for the data of the Hong Kong market, these effects are significant and evident in the results shown in Table 1.

Table 1. Regression results for the China market

Factor    Coefficient
Imports   0.000151
Exports   0.197225***
FDI       0.067576***
ODI       0.013816

***Significant at the 1% level, **Significant at the 5% level, *Significant at the 10% level.
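A hedged sketch of how a no-intercept pooled regression of this form could be estimated in Python is shown below; the column names, the synthetic panel, and the use of statsmodels OLS are illustrative assumptions, not the authors' actual code.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Synthetic stand-in for the panel: one row per (country, week) with the
# co-movement measure Y and the four standardized macro regressors.
rng = np.random.default_rng(2)
n = 19 * 100                      # e.g. 19 economies x 100 periods
panel = pd.DataFrame({
    "Imports": rng.standard_normal(n),
    "Exports": rng.standard_normal(n),
    "FDI": rng.standard_normal(n),
    "ODI": rng.standard_normal(n),
})
true_beta = np.array([0.0, 0.2, 0.07, 0.01])
panel["Y"] = panel[["Imports", "Exports", "FDI", "ODI"]] @ true_beta \
             + 0.1 * rng.standard_normal(n)

# Pooled OLS without a constant term, mirroring Eq. (1): Y on M, X, F, O.
X = panel[["Imports", "Exports", "FDI", "ODI"]]      # no sm.add_constant
model = sm.OLS(panel["Y"], X).fit()
print(model.params)
print(model.pvalues)
```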

However, the coefficients for the China and Hong Kong model register a different sign for the effect of the previously econometrically significant macro factors, Exports and FDI (Table 2). This raises the question: in addition to the macroeconomic determinants, what else drives the co-movement between the two markets? Since a large amount of the residuals is still not explained by the macroeconomic factors, we then look beyond the long-term economic impact factors and investigate the microeconomic features that contribute to this effect. Through empirical examination, we find that the life cycle of the arbitrage strategy and the information flows, which serve as proxies for sentiment indicators between the Chinese and Hong Kong markets, help to explain the discrepancy between the Hong Kong market and the rest of the world.

Table 2. Regression results for the Hong Kong market

Factor    Coefficient
Imports   −0.05731
Exports   0.55474***
FDI       −0.28967
ODI       0.240094

***Significant at the 1% level, **Significant at the 5% level, *Significant at the 10% level.


4 Microeconomic Factors

The results show that the signs for Imports and FDI change, which indicates that there might be an omitted-variable-bias problem and that we need to look beyond macroeconomic factors to explain the discrepancy between Hong Kong and the rest of the world. Thus, we introduce two microeconomic factors, the life cycle of arbitrage and the information flow, to account for the changes at more frequent intervals.
To better approximate an A-H-share-market arbitrageur's actual performance, we use three classical quantitative arbitrage strategies and the A-share and H-share trading data to simulate the arbitrage performance. For the stocks co-listed in the A and H markets, we choose the stocks that have: the greatest price spread (Strategy 1), the largest rolling standard deviation between the two prices from the two markets (Strategy 2), and the largest residual from a regression between the two prices from the two markets (Strategy 3). Subsequently, we calculate the daily and weekly return rates of the three strategies. Combining the outcomes of the three strategies, we can approximate the weekly return of the A-H arbitraging activities, which we name the A-H stock arbitrage factor.
To approximate the sentiment indicators that quantify the stock market sentiment of the A-share market and the H-share market, we combine five sentiment factors and give them equal weights: the turnover ratio, ETF redemption strength, the futures index premium, the percentage of component stocks rising, and the market's relative strength index. Then, we use Liang's method [12] to calculate the causality relationship between the two markets: the Information Flow.

F_{h \to a} = \frac{C_{aa} C_{ah} C_{h,da} - C_{ah}^2 C_{a,da}}{C_{aa}^2 C_{hh} - C_{aa} C_{ah}^2} \qquad (2)
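A small illustrative Python sketch of this computation is given below (not from the paper): it estimates the sample covariances C_aa, C_ah, C_hh and the covariances with the forward-difference series (C_{a,da}, C_{h,da}) from two series and plugs them into Eq. (2). The synthetic data and the Euler forward difference with a unit time step are assumptions.

```python
import numpy as np

def liang_information_flow(h: np.ndarray, a: np.ndarray) -> float:
    """Estimate F_{h->a} of Eq. (2) from two equally spaced series.

    Uses sample covariances and the forward difference of the receiving
    series `a` (unit time step assumed), following Eq. (2) literally.
    """
    da = a[1:] - a[:-1]                  # forward difference of the A series
    a0, h0 = a[:-1], h[:-1]
    c = np.cov(np.vstack([a0, h0, da]))  # 3x3 sample covariance matrix
    c_aa, c_ah, c_hh = c[0, 0], c[0, 1], c[1, 1]
    c_a_da, c_h_da = c[0, 2], c[1, 2]
    num = c_aa * c_ah * c_h_da - c_ah ** 2 * c_a_da
    den = c_aa ** 2 * c_hh - c_aa * c_ah ** 2
    return num / den

# Toy example: two correlated weekly series, and both flow directions.
rng = np.random.default_rng(3)
h_series = rng.standard_normal(500)
a_series = 0.6 * h_series + 0.8 * rng.standard_normal(500)
print(liang_information_flow(h_series, a_series))   # F_{h -> a}
print(liang_information_flow(a_series, h_series))   # F_{a -> h}
```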

Even more so than the correlation coefficient and the Granger causality test, the information flow can tell us about the causality relationship, including the causal direction and its extent, which allows us to obtain the two-way information flow factors. To avoid multicollinearity, we regress the residuals of Hong Kong's model on the additional factors (Table 3):

\varepsilon_t^{HK} = b_1\, Arb_t + b_2\, F_{h2a,t} + b_3\, F_{a2h,t} + e_t \qquad (3)

where:
Arb_t = the A-share and H-share arbitrage weekly return;
F_{h2a,t} = the H-share market's information flow to the A-share market;
F_{a2h,t} = the A-share market's information flow to the H-share market.

Table 3. Regression results for the microeconomic factors

Factor    Coefficient
Arb       −0.09693**
F_h2a     0.137752**
F_a2h     0.207606***

***Significant at the 1% level, **Significant at the 5% level, *Significant at the 10% level.

From the regression results, we can see that the arbitrage factor has a negative relationship with co-movement. This makes sense, since low co-movement provides more arbitrage opportunities. The two information flow factors have a positive relationship with co-movement, which shows that both directions of information flow contribute positively to the co-movement. All three of these factors help explain the co-movement beyond the low-frequency macroeconomic factors.

5 Conclusion – Economic Meaning and Policy Recommendations

In our research, we collect data on macroeconomic and microeconomic factors from the A-share and H-share markets over the past 15 years. Through the computation of correlations between Mainland China and Hong Kong, we find that, in addition to the macroeconomic factors that account for the long-term economic integration between the two markets, there is strong evidence of microeconomic determinants that help to explain this co-movement. This is, first, because there are companies cross-listed on both the A-share market and the H-share market. Secondly, information flow passes back and forth across the border relatively easily, since the two adjacent regions are in constant mutual communication and are closely tied economically, culturally and politically. During stable macroeconomic environments, it is the microeconomic factors that mostly affect the financial integration and interconnection between the two markets.
Moreover, institutional changes that accompany integration with the global market can lead to positive spill-overs into the real economy by improving corporate governance and reducing the cost of capital, which can in turn spur investment and economic growth. Since the early 1990s, the Chinese authorities have implemented various reforms in their stock markets in order to achieve the benefits of integration. The existence of dual listing of stocks, namely the once-restricted A-shares (which became available to foreign investors in 2002) and unrestricted B-shares (available to foreign investors throughout the period under study), facilitates the evaluation of the effects of capital controls, providing important policy implications for other emerging markets pondering the financial opening-up of their own economies. As a point of advice for financial regulators, it is suggested that the authorities should step up efforts for financial reform in line with economic reform and trade liberalization, in order to tap into the benefits of trade globalization, industrial specialization and financial integration.


References 1. Bekaert, G., Harvey, C.R.: Time-varying world market integration. J. Financ. 50(2), 403– 444 (1995) 2. Beine, M., Cosma, A., Vermeulen, R.: The dark side of global integration: increasing tail dependence. J. Banking Financ. 34(1), 184–192 (2010) 3. Nguyen, C., Bhatti, M., Henry, D.: Are vietnam and chinese stock markets out of the us contagion effect in extreme events? Phys. A Stat. Mech. Appl. 480, 10–21 (2017) 4. Hearn, B., Man, S.: An examination of price integration between stock market and international crude oil indices: evidence from China. Appl. Econ. Lett. 18(16), 1595–1602 (2011) 5. Johnson, R., Soenen, L.: Asian economic integration and stock market comovement. J. Financ. Res. 25(1), 141–157 (2002) 6. Lin, C., Cheng, W.: Economic determinants of comovement across international stock markets: the example of Taiwan and its key trading partners. Appl. Econ. 40(9), 1187–1205 (2008) 7. Chen, P.: Understanding international stock market comovements: a comparison of developed and emerging markets. Int. Rev. Econ. Financ. 56, 451–464 (2018) 8. Chen, C., Lung, P., Tay, N.: Information flow between the stock and option markets: where do informed traders trade? Rev. Financ. Econ. 14(1), 1–23 (2005) 9. Kwon, O., Yang, J.: Information flow between composite stock index and individual stocks. Phys. A: Stat. Mech. Appl. 387(12), 2851–2856 (2008) 10. Kwon, O., Oh, G.: Asymmetric information flow between market index and individual stocks in several stock markets. Europhys. Lett. 97(2), 28007 (2012) 11. Huang, W., Lai, P., Bessler, D.: On the changing structure among Chinese equity markets: Hong Kong, Shanghai, and Shenzhen. Eur. J. Oper. Res. 264(3), 1020–1032 (2018) 12. Liang, X.: Unraveling the cause-effect relation between time series. Phys. Rev. E 90(5–1), 052150 (2014)

In Favor of Multicultural Integration: An Experimental Design to Assess Young Migrants' Literacy Rate

Cristina Pagani1 and Chiara Paolini2

1 I.T.St. Aterno-Manthonè, Pescara, Italy
[email protected]
2 Universiteit Utrecht, Utrecht, The Netherlands
[email protected]

Abstract. According to the UN Refugee Agency, several migratory flows are involving 65.6 million people worldwide, among them refugees, those seeking asylum, those who migrate for economic reasons, and stateless people [1]. In this regard, knowledge is the key to integration in contemporary society: in fact, learning the language of the country of migration (i.e., destination country) is crucial for beginning integration and taking part in interactive activities, such as educational training opportunities. Nevertheless, learning the language of the destination country is also mandatory in order to obtain the residence permit (i.e. visa) or to become a naturalized citizen. A practical application related to the above addresses the opportunity to assess young migrants’ L2 learning competence, focusing on the need to carry out a diversified and reliable achievement test, a placement test, and a proficiency test. The profiles of L2 young learners outline the priority of testing instruments adapted to specifically measure the language proficiency of the treatment and control groups, due to a variety of native languages, levels of school degrees, and type of migration. The aim of this paper is to present an experimental design concerning how to assess young migrants’ literacy rate and, therefore, how to formulate an initial laboratory diagnosis of their starting language competencies, as well as to monitor their progress. More specifically, we propose three tests – achievement test, placement test, proficiency test BICS-CALP – sharing the bioprofile which is both a background analysis tool as well as an assessment tool for the linguistic and cultural skills of the L2 young migrants. The experimental design proposes the integration of the performance-based assessment tests that, using also bioprofile, lead to a gradual construction of the learners’ competence profile. L2 is a key element to be adequately integrated into receiving societies. For that reason, early laboratory diagnosis avoids newly arrived migrants leaving school early, as well as individual skills diagnosis which allows for the enhancement of an intercultural linguistic integration process. Keywords: Design research  Experimental treatment  L2 research Migrants  Intercultural competences  Human development

© Springer Nature Switzerland AG 2020 E. Bucciarelli et al. (Eds.): DECON 2019, AISC 1009, pp. 258–267, 2020. https://doi.org/10.1007/978-3-030-38227-8_30




1 Introduction The characteristics of language diffusion among foreigners represent a phenomenon that over time has been analyzed from multiple perspectives with respect to different models and survey tools such as sociolinguistics, neurolinguistics, glottodidactics, language acquisition. The idea that the Italian language moves within a linguistic space not limited to the territory of a State, assuming a global dimension over time [2], concerns the tools with which it is implemented for Italian as a second language. In order to investigate the role of the literacy skills assessment of second language learners for research purposes, we need more reliable and more suitable measures. A practical application related to the above addresses the opportunity to assess young migrants’ L2 learning competence. The overarching aim of this paper is to propose an experimental design to assess young migrants’ literacy rate. The main goal is to highlight the role played by linguistic competence tests, namely, those tools that aim to verify, measure and, evaluate young migrants’ linguistic and communicative L2 competences.

2 Context Analysis

According to the UN Refugee Agency, migratory flows involve 65.6 million people worldwide, among them refugees, asylum seekers, economic migrants, and stateless people [1]. The world's diasporas of the last thirty years have created such diversified migratory flows that the profiles of L2 migrants, as well as their alphabetic skills, are extremely heterogeneous in terms of cultures and languages of origin, degrees of education, biographical paths, types of immigration, and gender differences [3]. A prompt, individual diagnosis of L2 skills makes it possible to foster an intercultural linguistic integration process that encourages an active approach to global citizenship. Forced displacement, cultural dislocation, limited education in the home or destination country, and family background all have a negative impact on young migrants' L2 Italian learning process [4]. Consequently, the increasing number of low-literate young migrants who have recently arrived in Italy requires a rethinking of traditional L2 Italian teaching methods, approaches, and assessment strategies.

2.1 Foreign Students in the Italian Educational System

The multicultural composition of the Italian educational system shows that schools now face the challenge of including numerous immigrant children of diverse ethnicities, often with educational disadvantages and language difficulties. According to [5], the analysis conducted indicates a transition from a situation of “normal diversity” to a “different kind of normality” in multicultural schools, which offers a great opportunity to build an intercultural school based on dialogue and common growth. The data confirm a steady and significant increase in the enrollment of foreign students in Italian schools: their presence quadrupled between the school years 2003/04 and 2015/16. Enrollment rose from 196414 foreign students in


school-year 2003/04 (2.2% of the total population) to 826091 in school-year 2016/17 (9.4% of the total). In addition, children born from migrant parents and without Italian citizenship now represent the majority of foreign students in Italian schools.

3 Two Research Perspectives The study reflects two specific perspectives which support the need to carry out a diversified and reliable achievement test, placement test, and proficiency test. They could represent the common basis to decrease performance gaps and to increase linguistic integration. The first perspective takes into account that foreign students enrolled in Italian schools are about 826091 in the 2016/2017 school year: they increased by 11000 units compared to 2015/2016 school year (+1.38%). Over the past five years the number of foreign students has increased by 59000 from 756000 to 815000. Foreign students with disabilities amounted to 26626 in the school-year 2013/14. The second perspective focuses on the weaknesses which could prevent the linguistic and educational integration of children and adolescents from migrant backgrounds. The 40% of foreign 14-year-olds experiences an educational delay and this usually happens because of the misplacement in lower grades compared to age. (from 43.4% to 57% in 2016/2017). European Early Leaving from Education and Training data show that pupils with non-Italian citizenship are the ones with the highest risk of abandoning school, with 32.8% compared to a national average of 13.8%. In 2016, young people with non-Italian citizenship represented 16.8% of the total NEET (Not in Employment, Education and Training) population in Italy, with greater effect compared to Italians of the same age.

4 The Importance and Necessity of Language Assessment in Multicultural Educational Settings In the field of education, “some form of assessment is inevitable; it is inherent in the teaching – learning process” [6]. Evaluation is a fundamental trait of teaching, as are all educational activities. However, teachers are constantly doubtful about the assessment, especially when it involves foreign students. They complain about the lack of wellgrounded tests able to describe entry levels and record progress in L2 skills of both low and high literate learners. In essence, we believe that a solid assessment should be composed of at least two assessment tests, including a need analysis tool like the one we propose here, the Bioprofile. On the one hand, a single test, although built by numerous activities, does not allow for the construction of a sufficiently detailed linguistic profile. On the other hand, linguistic needs and basic personal features (i.e. age, work) are usually not adequately taken into account within existing testing system. Moreover, class teachers and assistant teachers tend not to compare and discuss their experiences and results in using different kinds of evaluation criteria. In doing so, they create a communication gap between them, and prevent the construction of a shared platform for the exchange


of good practice and mutual learning in the assessment of specific target groups. Finally, the analysis of young migrants’ scholastic pathways highlights a high rate of school dropouts and some other serious issues, mainly caused by the absence of an accurate form of entry assessment. For these reasons, our proposal is based on a performance-based assessment model: this approach seeks to measure pupil learning on the basis of how well the learner can perform a practical real-life task, i.e. the ability to write a composition or carry out small conversation. The performance-based assessment focuses on problem-solving, decision-making, and analyzing and interpreting information. It is a student-centered approach that moves away from traditional methods. In the following section, we introduce the experimental design, followed by the description of the four treatment conditions.

5 Experimental Design Motivated by these considerations and the context analysis data, we propose an experimental design with an inductive approach to assess migrants’ literacy rate composed of four treatment conditions, namely a background analysis tool and three assessment tests. The inductive approach is motivated by the necessity of searching for patterns from observations and, in doing so, developing accurate explanations for those regularities through a series of hypotheses [7]. Our proposal considers the integration of performance-based assessment tests that lead to a gradual construction of a learners’ competence profile. In this sense, the first treatment condition, namely bioprofile, acts as an icebreaker before continuing with the other treatments which represent the very assessment procedure. Moreover, it seeks to find shared linking elements which connect the language training of experimental subjects (i.e. migrant students) with the L2 language skills assessment in order to promote a high-equity system and to meet lifelong learning goals. 5.1

Experimental Task

The target population of our study consists of young, foreign students from six different migration areas: Romania, Albania, Morocco, Senegal, Nigeria, and China. The treatment group is composed of 350 foreign students enrolled in 20 secondary schools in the Abruzzo region, between the ages of 13–16, while the control group is composed of 50 native Italian students of the same age and the same school profile. The experimental design involves the experimental subjects in lab-based activities; except for the first treatment condition, which consist of a one-to-one dialog, and the last part of the last treatment condition, which consists of an oral production task, subjects carry out the task individually using Macintosh computers. Our research uses a within-subjects design, that is all experimental subjects are exposed to every treatment condition in a fixed order. Each subject is involved in four treatment conditions:


1. The drafting of the bioprofile is the first treatment condition in which the experimental subject is involved. It consists of a background analysis tool which includes an initial interview to gain an accurate picture of the experimental subject's cultural and linguistic background.
2. The second treatment condition, the achievement test, measures competence in writing and reading with seven tasks of increasing difficulty for each area. The reading tasks are: the identification of single letters, the reading of plain disyllabic words, the reading of plain trisyllabic words, the reading of disyllabic words with consecutive vowels and consonants, the reading of words of three or four syllables of average complexity, the reading of words which are difficult to spell, and the reading of sentences (fluency). The writing tasks are: the copying of words, the copying of an address, the writing of plain disyllabic words, the writing of plain trisyllabic words, the writing of disyllabic words with consecutive vowels and consonants, the writing of four-syllable words, and the writing of sentences which are not difficult to spell.
3. The third treatment condition, the placement test, consists of three tasks for each level, with increasing difficulty tied to CEFR levels A1, A2, A2+, B1, and B1+ [8]. The level reached by the student is recorded on the basis of the number of tasks achieved and completed. The tasks are: reading comprehension, written comprehension, and written production.
4. The fourth and last treatment condition, the proficiency test BICS-CALP, differs from the others because it also takes oral production in Italian into consideration. It consists of: oral and listening comprehension, written and reading comprehension, oral production, and written production. The tasks increase in difficulty according to CEFR levels A2, A2+, B1, B1+, B2, and B2+.

Each test has its own rubric, that is, a tool which provides the scoring criteria, a rating scale, and descriptors for assessing experimental subjects' L2 level. Rubrics are compiled in order to clarify expectations for subjects, to provide them with formative feedback, and/or to plan detailed courses and programs. The items from different scales were mixed up as much as possible to create a sense of variety and to prevent respondents from simply repeating previous answers. The items are short and simple, use natural language, and avoid ambiguous or loaded words, negative constructions, and double-barreled questions. This on-going research is designed to: diagnose both low-literate and high-literate learners, develop explicit statements of what experimental subjects should learn (student learning outcomes), assess how well experimental subjects perform each outcome, use the results to improve both evaluation and teaching strategies, map the improvement in L2 acquisition with solid and reliable data, and finally identify assessment standards for the treatment groups.

5.2 Bioprofile

Bioprofile (see Fig. 1) is both a background analysis tool as well as an assessment tool for the linguistic and cultural background of L2 learners, and it consists of a one-to-one dialog [9]. The term “bioprofile” was coined on purpose by the authors, indicating the


processes related to establish individual cultural identities of newly arrived immigrants (NAI). The principal advantage of bioprofile is its role as an icebreaker in order to elicit authentic communication or fresh talk. Moreover, it could reveal the necessity to adopt compensatory and additional measures both in teaching and evaluation. The questions explore individual cultural identity of NAI. We use the word “culture” broadly to refer to all the ways the individual understands his or her identity and experience it in terms of groups or communities, including national or geographic origin, ethnic community, religion, and language. Bioprofile rubric has to be filled in as an informal debriefing: it requires an adequate amount of time, so it is not possible to involve large numbers of students at a time. The rubric is designed to assess skills and dispositions involved in lifelong learning, which are curiosity, transfer, independence, initiative, and reflection (Table 1).

Table 1. An extract from the first part of the 1st treatment condition (readapted from the original).

Macro thematic area: Presentation
Social communication skills:
  A0/A1: What's your name? Where are you from? How old are you? Where do you live? Are you married? How long have you been in Italy? Have you been to other countries before coming to Italy? What is your phone number?
  A2/A2+: Introduce yourself.

Macro thematic area: Family
Social communication skills:
  A0/A1: Are you married? Do you have children? Do you have brothers and/or sisters? With whom do you live here in Italy? Where does your family live?
  A2/A2+: Describe your family.

5.3 Achievement Test

The achievement test [10] (see Fig. 1) allows for the detection of basic alphabetical skills, and it refers to Pre Alpha A1, Alpha A1 and Pre A1 levels as given by [11]. The recipients are learners with no literacy in L1 or L2, those who are weakly educated in L1 and those who are weakly literate in logographic writing systems or alphabetic systems other than the Latin system. This test is useful for NAI: it allows to distinguish immediately low-literate learners (0–8 years of schooling in the home country) from a high-literate groups (9–18 years). The test is built using the descriptors of the most significant pre-alphabetic and basic literacy skills. For this reason, we followed the criteria of validity and reliability proposed in [11] and [12]. It is composed of three parts: global reading, analytical-synthetic reading and writing.


Both global and analytical-synthetic reading items are elaborated using the highest available words in school and daily life context. In evaluating writing skills, the test reveals the ability to copy one’s own data or to write short and simple words of personal interest up to short routine sentences in the lower levels. At the highest level the test requires to compose a very short text inherent to personal linguistic domain. The primary goal of this test is not to complete all activities, which is unthinkable for those who are not able to speak Italian yet, but to allow the experimental subject to understand and implement the learning strategies they acquired in previous schooling. The levels of reading and writing competencies are evaluated separately using score criteria: in fact, the learner does not reach both competences at the time, the ability to read usually precedes that of writing. A differentiated score was assigned for each item, in relation to the three levels: precisely, 0.5 for the Pre-alfa A1, 1 for the Alfa A1, 2 for the Pre-A1. When it is possible, the items for the discrimination of the three levels could be merged in order to make the test shorter and facilitate its use; the level to be assessed is chosen based on the information collected in the bioprofile, which precedes the testing phases. The time required depends on the level and individual cultural background of the candidates. It usually takes about 30 mins for the Pre-Alpha A1 level, 20 mins for both the Alfa A1 and the Pre A1.
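To make the item weighting described above concrete, here is a minimal scoring sketch (0.5 per Pre-Alpha A1 item, 1 per Alpha A1 item, 2 per Pre-A1 item); the function name and data layout are illustrative assumptions, not part of the test specification.

```python
# Per-item weights for the achievement test, as described above.
ITEM_WEIGHTS = {"pre_alpha_A1": 0.5, "alpha_A1": 1.0, "pre_A1": 2.0}

def achievement_score(items):
    """items: list of (level, passed) pairs, e.g. ("alpha_A1", True)."""
    return sum(ITEM_WEIGHTS[level] for level, passed in items if passed)

# Example: two Pre-Alpha A1 items and one Alpha A1 item passed, one Pre-A1 item failed.
print(achievement_score([("pre_alpha_A1", True), ("pre_alpha_A1", True),
                         ("alpha_A1", True), ("pre_A1", False)]))  # -> 2.0
```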

Reading - 1 PASSIVE IDENTIFICATION OF SINGLE LETTERS. The teacher explains the task to the experimental subjects: The teacher pronounces the phoneme, and the subjects should identify it. Scores: 0.25 for each letter.

A   E   F   D   Z   P   R   I   V   N   T   U   M   S   O   V

Fig. 1. Written and oral comprehension from the 2nd treatment condition (readapted from the original).

5.4 Placement Test

The placement test [13] (see Table 2) assesses subjects who are literate in logographic writing systems or alphabetic systems different from the Latin system. As the previous test, it is divided into three parts: global reading, analytical-synthetic reading, and writing. The placement test allows to establish the state of interlanguage [14] of each experimental subject in order to build appropriate custom pathways, and to monitor his/her progress in learning L2. The rating scale, rubric scores, and descriptors provide


four fundamental aspects: the linguistic skills and the knowledge levels, the interlanguage stages, the most recurrent errors, and the pre-knowledge to be enhanced. The first aspect, as well as the interlanguage level, are functional to clarify which is the most suitable L2 learning-path to undertake. The third and fourth aspects are functional to the error analysis which detects both the L2 system that the learner is building, and interlanguage. It is a transitory and autonomous system, in which the learner is operating linguistically, does not comply either with the rules of the L2 system, nor with the L1 language framework. Furthermore, the analytical comparison of experimental subjects’ results allows us to organize quickly and effectively the learning level groups. Especially for NAI pupils, a well-timed diagnostic test is useful in order to choose the most suitable classroom for their abilities and their knowledge, as well as the educational personalized pathways, which lead teaching actions and their consequent evaluation. The abilities shown in the assessment scales are independent of general educational levels and the age of the experimental subject. The placement test assessment directs subjects to the development of their own learning. The assessment scales, aligned to CEFR standards, are: A1 - Breakthrough; A2 - Waystage; B1 - Threshold; B2 - Vantage. Each level is marked using two assessment scales: [8] and [12]. A level is considered to be achieved when all the abilities foreseen by this level have been achieved. Placement test takes 50 mins.
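Since a level counts as achieved only when all of its tasks are completed, the placement decision can be sketched as follows; the level ordering and the sequential-achievement rule are illustrative assumptions, not prescribed by the test.

```python
# Ordered CEFR levels used by the placement test (assumed ordering).
LEVELS = ["A1", "A2", "A2+", "B1", "B1+"]

def placement_level(results):
    """results: dict mapping level -> list of booleans (one per task).
    Returns the highest level whose tasks are all completed."""
    achieved = None
    for level in LEVELS:
        if results.get(level) and all(results[level]):
            achieved = level
        else:
            break  # assumption: a level is reached only if all lower levels are too
    return achieved

print(placement_level({"A1": [True, True, True],
                       "A2": [True, True, True],
                       "A2+": [True, False, True]}))   # -> "A2"
```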

Table 2. An extract from the 3rd treatment condition (readapted from the original).

Written Production 2 You have two tickets for a play and you want to ask Alex to go together. Send an sms to Alex. Write when, where and at what time you will meet.

Score:___/8

5.5 Proficiency Test BICS - CALP

Proficiency L2 BICS - CALP [16] (see Fig. 2) test evaluates both Basic Interpersonal Communication Skills (BICS) and Cognitive Academic Language Proficiency (CALP) [16] and it aims at assessing both social and academic language skills. This means that evaluation tasks addressed to cognitive skills, academic content, and critical language awareness. Listening, speaking, reading, and writing about curricular contents are also


included in the tasks. Tasks are developed with explicit reference to CEFR A2, B1, B2. The Waystage Level (A2) assesses all areas of language abilities such as reading, writing, listening and speaking. In this regard, the student should be able to produce different pieces of writing, understand the meaning of a range of spoken material, practice in skimming and scanning texts. The speaking section (2 examiners and 1 candidate) offers the opportunity to demonstrate fluency and accuracy in L2. The Threshold (B1) and the Vantage (B2) levels include a fifth dedicated task called “the use of Italian”. The experimental subject has the chance to prove his/her abilities in understanding realistic usages of language and knowledge of the language system by completing a number of tasks. All the areas, based on realistic tasks and situations, define the candidate’s overall communicative language ability at these levels. BICS and CALP competencies are increasingly necessary as knowledge has been weakened and outdated in the information society.

Oral Production – B1 level One of your best friends, who still lives in your country of origin, asks you some information about Italy. Look at the map and talk about Italy.

- The shape, regions and climate;
- Capital city: Rome;
- Language: Italian;
- Predominant religion: Catholicism;
- Currency: Euro;
- Food, most famous attractions, Italian habits;
- What do you like the most of Italy?

Scores: ___/9

Fig. 2. An extract from the fourth treatment condition (readapted from the original).

6 Conclusion and Future Research The high number of foreign students who migrate or are already enrolled in Italian schools highlights the urgency of carrying out differentiated and reliable language assessments. The experimental design presented in this paper has been developed with the aim of providing a clear, individual entry evaluation of the linguistic and cultural background of NAI. In doing so, it allows teachers to give young migrant students the key competencies necessary to be adequately integrated into the receiving society. However, we need a clear understanding of how the assessment works in more realistic settings in order to calculate with regards to student’s needs. In this sense, the process of test validation is on-going in several Italian secondary schools which serve as


treatment groups, involving a collection of evidences supporting the use of tests from various perspectives. Future investigations are necessary to validate the kind of conclusions that can be drawn from this study.

References 1. UN High Commissioner for Refugees (UNHCR): Global Trends: Forced Displacement in 2016, https://www.unhcr.org/globaltrends2016/. Accessed 04 Apr 2019 2. Vedovelli, V.: Guida all’italiano per stranieri. Dal Quadro Comune Europeo per le lingue alla Sfida salutare. Carocci, Roma (2010) 3. Cohen, R.: Global Diasporas. An Introduction, 2nd edn. Routledge, New York (2008) 4. European Commission/EACEA/Eurydice: Key Data on Teaching Languages at School in Europe – 2017 Edition. Eurydice Report. Publications Office of the European Union, Luxembourg (2017) 5. Ministry of Education, University and Research (MIUR): Focus: gli alunni con cittadinanza non italiana. A.S. 2016/2017. https://www.miur.gov.it/documents/20182/0/FOCUS+16-17_ Studenti+non+italiani/be4e2dc4-d81d-4621-9e5a-848f1f8609b3?version=1.0. Accessed 04 Apr 2019 6. Hopkins, K., Stanley, J., Hopkins, B.R.: Educational and Psychological Measurement and Evaluation. Prentice Hall, Upper Saddle River (1990) 7. Bernard, H.R.: Research Methods in Anthropology, 5th edn. AltaMira Press, Lanham (2011) 8. Council of Europe: Common European Framework of Reference for Languages (CEFR). https://www.coe.int/en/web/common-european-framework-reference-languages. Accessed 04 Apr 2019 9. Custodio, B.: How to Design and Implement a Newcomer Program. Pearson Education, London (2010) 10. Kaulfers, W.V.: Wartime development in modern-language achievement testing. Mod. Lang. J. 28(2), 136–150 (1994) 11. Borri, A., Minuz, F., Rocca, L., Sola, C.: Italiano L2 in contesti migratori. Sillabo e descrittori dall’alfabetizzazione all’A1. Loescher Editore, Torino (2014) 12. Spinelli, B., Parizzi, F.: Profilo della lingua italiana. Livelli di riferimento del QCER A1, A2, B1, B2 con CD-ROM. La Nuova Italia, Firenze (2010) 13. Brown, H.D.: Principles of Language Learning and Teaching. Prentice Hall, Englewood Cliffs (1994) 14. Selinker, L.: Interlanguage. Prod. Inf. Int. Rev. Appl. Linguist. Lang. Teach. 10, 209–241 (1972) 15. Carroll, J.B.: Fundamental considerations in testing for English language proficiency of foreign students. In: Testing the English proficiency of foreign students (eds.) Center for Applied Linguistics, Washington DC, pp. 30–40 (1961) 16. Cummins, J.: Bilingual education. In: Bourne, J., Reid, E. (eds.) World Yearbook of Education: Language Education. Kogan, London (2003)

Coping with Long-Term Performance Industrial Paving: A Finite Element Model to Support Decision-Making Structural Design

Oliviero Camilli

Business Network International, Area Europe, Charlotte, NC, USA
[email protected]

Abstract. Industrial paving in reinforced concrete is the basis of many production and logistic processes in every field of economic activity, especially as industrial research adapted to the needs of the economy. Indeed, the subject of industrial paving design has pushed many organizations around the world to review their design methods in order to move towards empirical analysis and modelling. More specifically, designing an industrial paving is a complex decision problem that aims to support the management evidence-informed decision-making process and evaluation in both the private and public sectors. There is a growing interest in evidence-informed decision-making within many applied sciences as well as within management science. The purpose of this work is to propose a finite element model for industrial paving – referred to as Dual Thickness Finite Element Model (DTFEM) – that allows to integrate the phenomenon of curling in the structural design of the continuous paving made of reinforced concrete. The results obtained are consistent with the phenomena observed on the structures in operation with dynamic and static loads. Furthermore, applying this model in the design practice of industrial concrete paving implies considerable advantages especially from the point of view of reducing repair and maintenance costs. Keywords: Industrial applications  Curling  Computational engineering Stage analysis  Satisfying  Human problem solving



© Springer Nature Switzerland AG 2020
E. Bucciarelli et al. (Eds.): DECON 2019, AISC 1009, pp. 268–278, 2020. https://doi.org/10.1007/978-3-030-38227-8_31

1 Introduction

Industrial paving in reinforced concrete is widely used in many industrial and commercial contexts, both logistic and productive. Paving performance is of fundamental importance for the correct performance of the overlying activities; as a result, an inappropriate design choice might compromise the entire production or logistic process, with enormous economic consequences. Among the phenomena that can compromise the functionality of industrial pavement, curling macroscopically evidences the effects of the variation in the characteristics of concrete over time. It is known that curling depends on (i) the temperature gradient through the slab [2, 19], (ii) the built-in temperature gradient [5, 19], (iii) the moisture gradient through the slab [7, 9], (iv) the differential irreversible shrinkage [7, 10, 17, and 18], and (v) creep [1, 13, and 15]. In this work, we consider the effects of shrinkage and creep, also taking into account the


variation in concrete strength during maturation. With regard to the concrete maturation process, the widespread application of finite element design in the study of industrial paving is lacking. In part, this is because the effects of maturation, creep, and shrinkage are not taken into account in the finite element models that are widely used in structural design. For particularly important structures, the effects of the maturation process are considered (although not as part of the structural design), and the appropriate measures are chosen accordingly (e.g., concrete mix-design, curing, and joint layout). The choices that must be made at the design stage are dictated case-by-case by the needs of the logistic or production process that will take place on the paving and, therefore, by the required performance level [4]. In general, however, decisions must always be made regarding the following:

• mechanical performance (i.e., structural design),
• geometric performance (e.g., planarity, deformability), and
• economic performance (e.g., construction costs, durability, and maintenance).

The above points are interdependent, so that no decision can be considered right in itself; each must be implemented together with all the others until the final result is achieved, given by the combination of decisions made insofar as they are satisfying solutions [16].

2 Purpose of the Work The purpose of this work is to propose a finite element model referred to as the Dual Thickness Finite Element Model (hereafter, DTFEM) that allows for the integration of ageing effects (such as pavement curling) at the same time as the structural design of the industrial paving, as well as by taking into account maturation, creep, and shrinkage of the concrete. This finite element model is designed for future use in design practice, so that the model can be implemented using commercial software, which can perform automatic post-processing of the results in accordance with industry standards and, hence, can immediately provide all information on solicitations, displacements, and quantities of armour. For this reason, consolidated knowledge of the maturation, creep, and shrinkage [6] has been implemented in the DTFEM. In Sect. 3, we describe the basic logical scheme of the DTFEM and its construction. In Sect. 4 we show the results of DTFEM processing using MIDAS/Gen software [3]. In Sect. 5 we give a quick review of future goals for our research.

3 Finite Element Model Description

3.1 Basic Logical Scheme of DTFEM

In industrial paving, the extensions in horizontal dimensions prevail over vertical dimension (i.e., thickness). For this reason, plate theory is utilized in structural problem solutions, and the numerical calculation is usually processed by Discrete KirchhoffMindlin Quadrilateral (hereafter, DKMQ) elements [8]. Many commercial software


packages execute this model with good reliability, processing the results of the numerical calculation by automatically calculating the required armour amount. Nevertheless, if curling is to be considered, it is essential to evaluate the differential irreversible shrinkage across the paving thickness. However, modelling with three-dimensional finite elements, which would be indicated for this purpose, is not suitable for practical structural design. Then, let us consider an industrial paving of thickness S as if it were composed of n layers of finite thickness S_1, S_2, …, S_n (so that, obviously, \sum_{L=1}^{n} S_L = S), each connected to the adjacent layers and having its own homogeneous rheological characteristics. A rigorous approach to this scheme involves the use of multilayer plate theories [20, 21]. In this case, however, we are not looking for mathematical optimization but rather for a finite element model that is both representative and practical to use for industrial paving design. Therefore, we decide to analyse each layer following widely used plate theory, taking the adjacent layers into account through appropriate boundary conditions. In this way, while we detail the paving thickness field, we can use common two-dimensional finite elements for the related numerical calculation. We refer to this logical scheme as the multi-thickness finite element model (hereafter, MTFEM), which represents a novel approach to industrial paving design [14] (Fig. 1).

Fig. 1. Basic logical scheme for MTFEM

3.2 DTFEM Construction

In this section, we build an MTFEM that breaks the industrial paving down into just two layers, and we refer to this model as the DTFEM. The upper layer, which we call the surface layer (SL), with a thickness S_SL equal to the depth of the contraction joints, represents the portion of the paving that most closely reflects the effects of exposure to environmental agents. The lower layer, which we call the ground layer (GL), with a thickness S_GL = (S − S_SL), represents the portion of the paving in which environmental factors have a marginal influence on the concrete maturation process (Fig. 2). Each layer can be represented using DKMQ elements. The DKMQ elements of the two layers are connected to each other, node by node. In this DTFEM, we use a rigid body connection by means of the rigid link function [11]. According to [11], the rigid link function constrains geometric, relative movements of a structure. Geometric constraints


of relative movements are established at a particular node to which one or more nodal degrees of freedom (d.o.f.) are subordinated. The particular reference node is called a Master Node, and the subordinated nodes are called Slave Nodes. Rigid Body Connection constrains the relative movements of the master node and slave nodes as if they are interconnected by a three dimensional rigid body (Fig. 3).

Fig. 2. Logical scheme for DTFEM

In this case, relative nodal displacements are kept constant, and the geometric relationships for the displacements are expressed by the following equations:

UX_s = UX_m + RY_m\,\Delta Z - RZ_m\,\Delta Y
UY_s = UY_m + RZ_m\,\Delta X - RX_m\,\Delta Z
UZ_s = UZ_m + RX_m\,\Delta Y - RY_m\,\Delta X
RX_s = RX_m, \quad RY_s = RY_m, \quad RZ_s = RZ_m

where \Delta X = X_m - X_s, \Delta Y = Y_m - Y_s, \Delta Z = Z_m - Z_s.

The subscripts m and s represent a master node and slave nodes, respectively. UX, UY, and UZ are displacements in the global coordinate system (GCS) X, Y, and Z directions, respectively, and RX, RY, and RZ are rotations about the GCS X, Y, and Z axes, respectively. Xm, Ym, and Zm represent the coordinates of the master node, and Xs, Ys, and Zs represent the coordinates of a slave node. Below, we describe how boundary conditions are assigned. Boundary conditions are represented with nodal springs, which are of the linear type in the x and y directions and of the compression-only type in the z direction. The modulus of the sub-grade reaction is specified in each direction. The soil property is then applied to the effective areas of the individual nodes to produce the nodal spring stiffness as a boundary condition [11]. Below, we describe how material properties are assigned. We define two distinct materials. For each material, we assign a certain behaviour concerning creep,


shrinkage, and maturation. The same material is assigned to each DKMQ element of the same Layer. Note that the DTFEM does not consider the distribution of shrinkage within the thickness of each Layer. The variation in shrinkage along the overall thickness of the paving is the consequence of different shrinkage values that develop in each layer, based on the assigned material and surrounding conditions. In the DTFEM, the shrinkage function along the overall thickness of the paving (unknown a priori) interpolates the shrinkage values for SL and GL.

Fig. 3. DTFEM
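A small numerical sketch of the rigid-link relation reported in Sect. 3.2, mapping the master node's displacements and rotations onto a slave node under the small-rotation assumption; the coordinates and numeric values used here are purely illustrative.

```python
import numpy as np

def slave_dofs(master_dofs, master_xyz, slave_xyz):
    """Apply the rigid-link constraint of Sect. 3.2.
    master_dofs = (UX, UY, UZ, RX, RY, RZ) of the master node in the GCS."""
    ux, uy, uz, rx, ry, rz = master_dofs
    dx, dy, dz = np.asarray(master_xyz) - np.asarray(slave_xyz)  # ΔX, ΔY, ΔZ
    return (ux + ry * dz - rz * dy,   # UXs
            uy + rz * dx - rx * dz,   # UYs
            uz + rx * dy - ry * dx,   # UZs
            rx, ry, rz)               # rotations are copied unchanged

# Illustrative case: slave node (GL) 1 cm below the master node (SL), metre units.
print(slave_dofs((0.0, 0.0, -0.02, 0.001, 0.002, 0.0),
                 master_xyz=(0.0, 0.0, 0.10), slave_xyz=(0.0, 0.0, 0.09)))
```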

4 DTFEM Processing

The DTFEM is processed using MIDAS/Gen (Build 6.20.2012). Two cases are examined:

• Case 1: a single paving field of dimensions 4.00 m × 4.00 m. DKMQ elements are 25 cm × 25 cm;
• Case 2: paving of dimensions 20.02 m × 20.02 m composed in turn of fields of dimensions 4.00 m × 4.00 m interspersed with contraction joints of 0.5 cm width. DKMQ elements are 25 cm × 25 cm (25 cm × 0.5 cm and 0.5 cm × 0.5 cm for the joints, modelled on GL).

In Sects. 4.1 and 4.2 we show the results for Case 1 and Case 2, respectively. What we present below is common to both cases. The following characteristics are assumed:

• S_SL = 3 cm;
• S_GL = 7 cm;


• load: self weight only;
• strength, creep, and shrinkage considered according to the functions shown below (Fig. 4).


Fig. 4. Detailed prospect and axonometric views of the DTFEM in MIDAS/Gen.

We show significant settings for the DTFEM process below (Tables 1, 2, 3, 4 and 5).

Table 1. Load combination.

No | CBC name      | Type | Active          | Load combination
1  | SHR + CR      | Add  | Strength/stress | Shrinkage and creep
2  | SHR + CR + pp | Add  | Strength/stress | Shrinkage, creep, and self weight
3  | SHR           | Add  | Strength/stress | Shrinkage only
4  | CR            | Add  | Strength/stress | Creep only

Table 2. Material.

ID | Material name  | Type     | Standard | DB     | Use mass density
1  | C25/30 surface | Concrete | EN(RC)   | C25/30 | X
2  | C25/30 ground  | Concrete | EN(RC)   | C25/30 | X

Table 3. Material.

ID | Elasticity (kgf/cm2) | Poisson | Thermal (1/[C]) | Density (kgf/cm3) | Mass density (kgf/cm3/g)
1  | 3.1072e+005          | 0.2     | 1.0000e−005     | 2.4000e−003       | 2.4473e−006
2  | 3.1072e+005          | 0.2     | 1.0000e−005     | 2.4000e−003       | 2.4473e−006

Table 4. Thickness.

ID | Type  | In = out | Thick-In (cm) | Offset | Material type
1  | Value | Yes      | 7.0000        | No     | Isotropic
2  | Value | Yes      | 3.0000        | No     | Isotropic

Table 5. Boundary conditions.

Type             | Kx (kgf/cm3) | Ky (kgf/cm3) | Kz (kgf/cm3)
Linear           | 1            | 1            | 0.01
Compression-only | –            | –            | 7

To model the layers, the DKMQ elements in the standard MIDAS/Gen library are used [11, 12]. For both layers, the same law of maturation is considered, although for the SL we must also consider the effects of creep and shrinkage, which are neglected for the GL. The structural calculation takes place in construction stage analysis mode [11], in which temporal steps are defined so that the structural analysis can be carried out according to the characteristics assumed by the material at that time. For the material properties, we use the following parameters [6]:

• Characteristic compressive cylinder strength at 28 days: fck = 250 kgf/cm2;
• Mean compressive cylinder strength at 28 days: fcm = 330 kgf/cm2;
• Relative humidity of the ambient environment: RH = 40%;
• Notional size: h = 6 cm;
• Type of cement: class R; and
• Age of the concrete at the beginning of shrinkage: 0.2 day.

The creep, shrinkage, and strength functions [6] are shown below (Fig. 5).

Fig. 5. Strength function, creep coefficient function, and shrinkage function.
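The strength curve in Fig. 5 follows the Eurocode 2 hardening law; a minimal sketch of that function is given below with s = 0.20 for class R cement. The time grid, and the idea of tabulating fcm(t) for the stage analysis, are illustrative assumptions rather than a description of the actual MIDAS/Gen input.

```python
import math

FCM_28 = 330.0   # mean compressive strength at 28 days [kgf/cm^2], from the list above
S_CEMENT = 0.20  # coefficient for class R cement, EN 1992-1-1

def fcm(t_days):
    """Mean compressive strength at age t, EN 1992-1-1 Eqs. (3.1)-(3.2)."""
    beta_cc = math.exp(S_CEMENT * (1.0 - math.sqrt(28.0 / t_days)))
    return beta_cc * FCM_28

# Illustrative construction-stage time grid (days).
for t in (1, 3, 7, 28, 100, 1000):
    print(f"t = {t:5d} d   fcm(t) = {fcm(t):6.1f} kgf/cm^2")
```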

We compose the construction stage as follows:

• Duration: 1000 days (step 1: 10 days; step 2: 100 days);
• Element activation at age 0: SL group, GL group;
• Boundary activation (deformed): substratum boundary conditions, rigid link;
• Load activation on the first day: creep, shrinkage, and self-weight.

4.1 Case 1: Results

We show the displacement contour in the z direction, where curling is emerging (Fig. 6).



Fig. 6. Case 1. CBC: SHR + CR + pp.

4.2 Case 2: Results

We show the DTFEM details and displacement contour in the z direction, where curling is emerging (Figs. 7, 8, 9 and 10).

Fig. 7. DTFEM for Case 2. See the detail corresponding to the contraction joints (width 0.5 cm)


Fig. 8. Case 2. CBC: SHR + CR + pp.


Fig. 9. Case 2. External edge of the paving. CBC: SHR + CR + pp.

Fig. 10. Case 2. Central area of the paving. CBC: SHR + CR + pp.

4.3 Summary

The DTFEM does not implement curling directly, but it incorporates functions that describe concrete strength, creep, and shrinkage. The results obtained from the numerical processing show that curling emerges in the DTFEM, consistent with the phenomena observed on structures in operation under dynamic and static loads.

5 Conclusions and Future Research

In this paper, a finite element model for industrial paving is proposed. This model allows the phenomenon of curling to be integrated into the structural design. The results obtained from the numerical processing show that curling emerges in the DTFEM. The model makes it possible to take decisions both on strictly structural aspects and on the concrete mix-design, the curing stage, and the layout of the joints; the latter are neglected in widely used finite element structural models. The simplicity of the model and its parsimonious computational burden allow it to be implemented in calculation procedures carried out with commercial software, providing data that can be immediately translated into design specifications. Therefore, the DTFEM brings significant benefits over the entire industrial paving life cycle (i.e., construction, maintenance, and disposal). Future research will include a comparison with experimental results and


DTFEM refinement (as for improving predictability) to obtain a useful tool for industrial paving design. Our future research also concerns benefits in terms of the environmental impact related to paving durability and maintenance. Acknowledgments. We would like to thank Angelo Antonacci, Edgardo Bucciarelli, Andrea Oliva, and Giovanni Plizzari for their invaluable contributions to earlier versions of this paper.

References 1. Altoubat, S.A., Lange, D.A.: Creep, shrinkage, and cracking of restrained concrete at early age. ACI Mater. J. 98, 323–331 (2001) 2. Armaghani, J.M., Larsen, T.J., Smith, L.L.: Temperature response of concrete pavements. J. Transp. Res. Board 1121, 23–33 (1987) 3. CSPFea. http://www.cspfea.net/. Accessed 30 Mar 2019 4. Italian Research Council (CNR): Technical document no. 211/2014. https://www.cnr.it/it/ node/2631. Accessed 29 Mar 2019 5. Eisenmann, J., Leykauf, G.: Effect of paving temperatures on pavement performance. In: 2nd International Workshop on the Design and Evaluation of Concrete Pavements, Siguenza (ES), pp. 419–431 (1990) 6. European Union Per Regulation: Eurocode 2 - EN 1992-1-1: Design of concrete structures (2004). https://www.phd.eng.br/wp-content/uploads/2015/12/en.1992.1.1.2004.pdf. Accessed 05 Apr 2019 7. Janssen, D.J.: Moisture in Portland cement concrete. J. Transp. Res. Board 1121, 40–44 (1987) 8. Katili, I.: New discrete Kirchhoff-Mindlin element based on Mindlin-Reissner plate theory and assumed shear strain fields. Part II: an extended DKQ element for thick-plate bending analysis. Int. J. Numer. Methods Eng. 36, 1859–1883 (1993) 9. Lim, S., Jeong, J.-H., Zollinger, D.G.: Moisture profiles and shrinkage in early-age concrete pavements. Int. J. Pavement Eng. 10, 29–38 (2009) 10. Mather, B.: Reports of the Committee on durability of concrete—physical aspects—drying shrinkage. Highw. Res. News 3, 26–29 (1963). 11, 34–38 (1964) 11. MIDAS Analysis guide. http://en.midasuser.com/training/technical_read.asp?idx=673&pg= 1&so=&sort=&bid=12&nCat=267&nCat2=General&bType=&totCount=37. Accessed 30 Mar 2019 12. MIDAS Plate element. http://en.midasuser.com/training/technical_read.asp?idx=667&pg= 1&so=&sort=&bid=12&nCat=272&nCat2=Element&bType=&totCount=25. Accessed 30 Mar 2019 13. Neville, A.M., Meyers, B.L.: Creep of concrete: influencing factors and prediction. In: Proceedings, Symposium on Creep of Concrete, pp. 1–33. American Concrete Institute, Detroit (1964) 14. Newell, A., Simon, H.A.: Human Problem Solving. Prentice-Hall Inc., Englewood Cliffs (1972) 15. Rao, C., Barenberg, E.J., Snyder, M.B., Schmidt, S.: Effects of temperature and moisture on the response of jointed concrete pavements. In: Proceedings, 7th International Conference on Concrete Pavements, Orlando, FL (2001) 16. Simon, H.A.: Rational choice and the structure of the environment. Psychol. Rev. 63(2), 129–138 (1956)


17. Suprenant, B.A.: Why slabs curl—part I. Concr. Int. 3, 56–61 (2002) 18. Suprenant, B.A.: Why slabs curl—part II. Concr. Int. 4, 59–64 (2002) 19. Yu, H.T., Khazanovich, L., Darter, M.I., Ardani, A.: Analysis of concrete pavement responses to temperature and wheel loads measured from instrumented slabs. J. Transp. Res. Board 1639, 94–101 (1998) 20. Ugrumov, S.V.: Generalized theory of multilayer plates. Int. J. Solid Struct. 39, 819–839 (2002) 21. Wenbin, Y.: Mathematical construction of a Reissner-Mindlin plate theory for composite laminates. Int. J. Solid Struct. 42, 6680–6699 (2005)

Does Trust Create Trust? A Further Experimental Evidence from an Extra-Laboratory Investment Game

Edgardo Bucciarelli and Assia Liberatore

University of Chieti-Pescara, Viale Pindaro, n. 42, 65127 Pescara, Italy
{edgardo.bucciarelli,assia.liberatore}@unich.it

Abstract. This paper falls under the conceptual framework of social dilemmas in understanding and interpreting ethical choices commonly found in studies examining value and decision-making, such as in economics and finance. In the paper, special attention is paid to the Investment Game which is a stylised social dilemma game according to which Player A (i.e., the lender or trustor) may take a costly action – and hence possibly construable as an investment – that may generate social returns, while Player B (i.e., the borrower or trustee) may decide how to distribute the proceeds between herself/himself and the Player A. Experimental works on Trust and Investment Games showed systematic and reliable patterns of inconsistency concerning the standard equilibrium prediction underlying the assumption of complete information on selfish preferences. Along these lines, we outline in broad terms what is intended as social dilemmas; then, we shift the psychological framework of social dilemmas to experimental economics and modern behavioural economics. Finally, we conduct an Investment Game in the context of an extra-lab experiment, calculating two indices, namely, the propensity-to-trust index and the reciprocity index, for twenty-nine pairs of subjects randomly assigned to one of the two groups of subjects concerned. In finding components of trust and reciprocation, we can confirm expectations that a cooperative approach would lead to trusting and trustworthy behaviour, whereas a competitive approach makes experimental subjects behave suspiciously and untrustworthy. This should come as no surprise since components of trust are economic ‘primitives’ and, thus, relevant to every economic transaction. Accordingly, we study the basic empirical structure of social relations and associations emerging from experimental data, which may be expressed in network form. By using the concept of networks in the analysis of social behaviour, social network analysis is performed both to better understand the interactions between the two types of players involved through their decisions, and to gauge levels of trust and reciprocity. Keywords: Social dilemmas  Behavioural economics  Extra-laboratory experimentation  Propensity-to-trust index  Reciprocity index  Social network analysis JEL codes: C71

 C99  D01  D91

© Springer Nature Switzerland AG 2020 E. Bucciarelli et al. (Eds.): DECON 2019, AISC 1009, pp. 279–291, 2020. https://doi.org/10.1007/978-3-030-38227-8_32


1 Introduction In recent decades, an empirically grounded rubric for interdisciplinary studies has implemented basic features of social dilemmas such as common dilemmas and team games [see 1, henceforth abbreviated as VJPV in the plural]. The current research on social dilemmas analysis still helps to shed light on the nature and the magnitude of various real-world problems, including commuting decisions [2, 3] and organisational citizenship behaviours [4, 5]. VJPV focus on psychological variables that impact cooperation in social dilemma situations such as trust, individual differences, decision framing, priming, and affect. Unlike VJPV, we attempt to consider these variables focused on economic issues. Therefore, starting from VJPV’s review, this paper provides a general discussion of social dilemmas in the first place. According to [6], furthermore, the discussion follows the interdisciplinary nature of modern behavioural economics that is part of a broader landscape concerning research in social and behavioural sciences, including cognitive and social psychology [7]. The first part of the discussion (see Sect. 2) contributes to reaffirm some economic underpinnings of social dilemmas. The second part (see Sect. 3) penetrates the experimental area of social dilemmas, reporting on the preliminary findings of an extra-lab study concerning an Investment Game. Sections 4 concludes the paper with a summary.

2 From Social Psychology to Economics Experimentation More than ten years ago, Ariely and Norton pointed out that “the lack of communication between psychology and economics is particularly unfortunate because the fields share interest in similar topics that are of clear importance to public policy and social welfare; at the same time, however, the gaps in approach are substantial and epistemological, so bridging them is not trivial.” Ariely and Norton [8] p. 338. Experimental economics and social psychology abstract phenomena from the real world differently, distilling them in the context of laboratory experiments and similar or related settings. Psychologists involve deception, which translates into cover stories to manipulate people’s goals in the sense that actual circumstances might alter real-world goals [37]. They ensure that people’s decision-making adheres to contextual factors, while experimental economists use stakes to motivate subjects by refusing deception. Nevertheless, the possible use of deception by psychologists and the use of incentives by economists frequently reveal methodological choices made to pursue the same goal. In the words of Tyler and Amodio, “from our perspective, it appears that economists are becoming social psychologists in their questions, but still need to consider how to develop the statistical and methodological tools needed to study such questions in a scientifically valid way.” Tyler and Amodio [9] p. 188. Focusing on social-psychology, additionally, we note that: “personality differences in social values, trust, consideration of future consequences, framing, priming, heuristics, and affect represent a long list of variables that are important to understanding the psychological processes that are activated in social dilemmas.” VJPV [1] p. 134. These authors overcome classical theoretical limits of psychology and, thus, compose an interdisciplinary conceptual framework. They stress standard and extended versions of interdependence theory [10,


11, and 12], the appropriateness [13], and evolutionary theorising, such as reciprocal altruism, indirect reciprocity, and costly signaling. Following VJPV’s conceptual framework, we consider the empirical research in social dilemmas as a whole to test economics hypothesis and explain human behaviour. However, we introduce a behavioural economics approach towards human decision-making that, differently from VJPV, is referred only to economic literature [14]. Indeed, contrary to the so-called homo oeconomicus’ view – and its reduction of being a representative agent along with ‘its’ microeconomic structure projected towards macroeconomics – of human motivation and decision-making, modern behavioural economics assumes that human decision processes are shaped by social forces and embedded in social environments. In short, “we are social animals with social preferences, such as those expressed in trust, altruism, reciprocity, and fairness, and we have a desire for self-consistency and a regard for social norms.” Samson [7] p. 1. An important distinction in social preferences experiments stems between public and private decisions. Social dilemmas are often considered part of the first group [among others, see 15] (i.e., public decisions) but, drawing on VJPV, we envisage considering them even as private decisions. In this vein, Beckenkamp uses a taxonomy organising experimental social dilemmas in two major groups based on what incentive structures are involved [16]. On the one hand, social dilemmas with fixed quotas are proposed, with quotas defined exogenously, i.e., regardless of the decision of the subject in question, including (i) N-person prisoners’ dilemma; (ii) public-goods games with a threshold, which are not social dilemmas in the strict sense, but a hybrid between a social dilemma and a coordination game; and (iii) prisoners’ dilemmas and public-goods games with quadratic payoff functions [17]. On the other, social dilemmas with proportional quotas are accordingly considered, where the share of the common payoff function is given by distributions depending on the relative amount of an experimental subject’s contribution. These are settings in which self-interest and group-interest lead to behave differently. Two-principal paradigms – the prisoner’s dilemma and social preferences for voluntary contributions to a public good – are usually used in the literature [18, 19]. The dominant multi-person social dilemma paradigm in experimental economics is the public goods game, according to which the conventional experiment conducted to study human decisionmaking in the presence of public goods is based on the voluntary contribution mechanism. Individuals choose between cooperation and defection: the decision is allor-nothing in the prisoner’s dilemma, while intermediate decisions are possible in voluntary contributions. It is worth noting that public decisions are the basis of social dilemmas experiments, which include experiments founded upon altruism [20] or reciprocity [21, 22, and 23]. Reciprocity experiments include voluntary contribution experiments, dictator, ultimatum, bargaining, and trust games [24]. The rationale of the gift-exchange wage-effort relationship can be expressed by supposing outcome-based social preferences [25, 26], intention-based reciprocal behaviours [21, 22], and a mixture of the two [23, 27]. Tan and Bolle [28] consider reciprocity as an effective practice and investigate the effect of altruism and fairness in a single stage dictator game. 
In experimental economics, furthermore, subjects’ trust represents a prerequisite for enabling experimental control, provided that experimenters do not resort to any form of deception [29]. In this regard, Krawczyk [30] explores the effect of an assurance of no deception. A tendency towards altruistic decisions is manifested both in entrepreneurship and in social preferences experiments for different research objectives. Tendencies towards trust and fairness are also studied in consumer decisions. Trust, cost, and social influence can be used to predict consumers’ intention to adopt m-commerce, as in Chong et al. [31], who ran survey experiments in Malaysia and China to extend Wei et al.’s study [32] on consumers’ trust in performing mobile commercial transactions. The next section focuses on an experiment to study trust and reciprocity in an investment setting.

3 Testing Norms of Trust and Cooperation

The concept of trust is articulated along three basic dimensions: (i) interdependence between the trustor’s and the trustee’s behaviour; (ii) risk, when a positive outcome for one person necessarily leads to a worse outcome for another person; and (iii) lack of coercion in the trust decision [33, 34]. Trust has traditionally been linked to the so-called ‘betrayal aversion’: people accept higher risks when facing a given probability of misfortune than when facing the same probability of being cheated by another person [35]. In social-dilemma experiments, stated beliefs might well be used as a justification for selfish behaviour. As is common knowledge, standard economics considers deceitfulness a natural by-product of self-interested agents, while in human relationships deception is often considered a violation of trust. The prohibitionist case, indeed, distinguishes deception as a ‘public bad’ from truthfulness as a ‘public good’ [36]. Not surprisingly, economics experiments preserve their reputation in the academic community by upholding trust. Since it was designed [38], the trust game – and the investment game derived from it – has been conducted countless times. This contribution to the cause of trust research is a further confirmation, which serves the experimental purpose, especially in the case of economics.

3.1 Experimental Procedures

We randomly recruited 58 undergraduates from economics courses at the University of Chieti-Pescara, completely unaware of the type of experimental activity that would involve them. The experiment was conducted on Wednesday 11th May 2016 in rooms no. 33 and no. 35 of the University mentioned above – and was video recorded for later analysis – as part of a research project (DSFPEQ 2016–2018), whose principal investigator was Edgardo Bucciarelli. In order to split the subject pool fairly, experimental subjects were gathered in front of the two rooms, where each of them drew one number from an urn of ninety, as in bingo. Based on the number extracted, subjects were randomly assigned to room A (no. 35) or room B (no. 33). From now on, for simplicity, we refer to those players who were randomly assigned to room A as subjects A (i.e., lenders or trustors), and to those players who were randomly assigned to room B as subjects B (i.e., borrowers or trustees). The same instructions were read – and delivered individually to the subjects – in both rooms by two moderators coordinated by the authors of this paper, one for each room. The instructions were read aloud to ensure that the description of the game was common information. The instructions and the experimental task were very similar to those used in [39]. The individual identity of

the ‘pair member’ was not revealed, and the experiment was conducted in double-blind mode. In a nutshell, subjects A were endowed with real money that could be earned (each subject A received a monetary endowment of €10.00) and had to decide how much of that money to send/lend to subjects B. To maintain discretion and confidentiality of subjects’ personal data, we used envelopes and alphanumeric codes at all times. Whatever money each subject A sent/lent to each subject B was tripled in value by the experimenters, so that each subject B received the amount sent by her/his pair member multiplied by three. Then, subjects B decided how much of that increased amount to return to their pair members, after which the game ended. In this sense, the investment game – as well as the trust game – is a vivid example of a social dilemma, that is, a state of affairs in which each subject’s autonomous maximisation of self-utility leads to an inefficient outcome.
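To make the monetary mechanics of the task concrete, the following minimal Python sketch – ours, and not part of the original experimental material – encodes the payoff rules just described: subject A keeps whatever she/he does not send, the amount sent is tripled for subject B, and subject B keeps whatever she/he does not return.

```python
def payoffs(sent, returned, endowment=10.0, multiplier=3.0):
    """Payoffs of the investment (trust) game as described above.

    sent:     amount subject A sends/lends (0 <= sent <= endowment)
    returned: amount subject B returns (0 <= returned <= multiplier * sent)
    """
    earning_a = endowment - sent + returned   # what A keeps plus what B gives back
    earning_b = multiplier * sent - returned  # tripled transfer minus what B returns
    return earning_a, earning_b

# Pair 16 in the data below: A sends the full 10.00 euros, B returns 15.00, so both earn 15.00.
print(payoffs(10, 15))   # (15.0, 15.0)
# Benchmark of rational choice theory for selfish players: send and return nothing.
print(payoffs(0, 0))     # (10.0, 0.0)
```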

3.2 Preliminary Results

We collected data from 29 pairs of observations corresponding to 29 pairs of experimental subjects, each pair consisting of a subject A and a subject B. Among the experimental data, it is worth pointing out that the earnings of subjects B range from €0.00 to €20.00, while the earnings of subjects A fall between €4.00 and €26.00. Most importantly, there is a significant, positive correlation between the variables ‘decision of subjects A’ and ‘earning of subjects A’ (see Table 1). Basically, this means that the more subjects A trust subjects B, sending/lending them a relatively high monetary amount, the higher the profit subjects A derive for themselves. Instead, there is no correlation between the variables ‘decision of subjects B’ and ‘earning of subjects B’; the two variables are essentially unrelated. Furthermore, there is a positive correlation between the ‘decision of subjects A’ and the ‘earning of subjects B’. In other words, this implies that the earning of subjects B depended on how much subjects A trusted subjects B and, at the same time, on how much subjects B were ‘grateful’ to their pair members. Finally, the variable ‘decision of subjects B’ mirrors the previous case, being positively correlated with the ‘earning of subjects A’. Table 2 shows the calculations of the propensity-to-trust index (PTI) or index of trust (i.e., the ratio between the amount of money sent/lent and the endowment received by subject A) and the reciprocity index (RI) or index of gratitude-or-indignation (i.e., the ratio between the amount of money returned and the endowment received by subject B). Table 2 also shows that no experimental subject behaves according to rational choice theory, which predicts that subjects A should not send/lend any monetary amount to their pair members B. Most frequently (11 subjects out of 29), subjects A trusted enough to send/lend about half of their endowment; others trusted little, or very little, while some trusted very much (2 subjects A out of 29 sent/lent their entire endowments). Analysing the overall data (see also Fig. 1), we find that 17 subjects A reported a high propensity-to-trust index (> 0.3), while the other 12 subjects A showed, on average, a low propensity-to-trust index (≤ 0.3). It is worth mentioning that while all subjects A received the same initial monetary endowment assigned by the experimenters, subjects B, conversely, received an endowment that depended exclusively on the decisions of subjects A: this determines a different interpretation of the behaviour of subjects A with respect to subjects

B. Regarding the 29 pairs of subjects A and B, one particular case is worth discussing. Only one subject A (ID = 2) trusted almost not at all, registering the lowest propensity-to-trust index value and sending the smallest monetary amount – equal to €1.00 – to the corresponding subject B. The subject B in question, however, decided to return the entire amount received – equal to €3.00, as a result of the multiplication applied by the experimenters. This particular interaction, in which the lowest index of trust is combined with the highest value of the reciprocity index, can be translated into a feeling of ‘indignation’ felt by that subject B, who would in all likelihood have expected to receive a more substantial monetary amount from subject A. Generally, the distribution of the reciprocity index is concentrated on medium-low values (15 subjects out of 29, see Fig. 2). Considering the higher values, instead, we note that six subjects B returned half, or slightly more, of what they received from the corresponding subjects A: among these six subjects B, we find the two subjects who received the entire endowments of their pair members A, and a very positive response can be noticed from both of them. Essentially, these results reinforce the idea that trust-based relationships tend to bring benefits to both subjects involved.

Table 1. Correlations between experimental subjects’ decisions and earnings

                         Earning of subjects A   Earning of subjects B
Decision of subjects A   0.444**                 0.605**
Decision of subjects B   0.921**                 −0.006
(*, **, *** refer to a significance level at 10%, 5%, 1%)
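The indices and correlations reported above can be reproduced with a few lines of pandas; the sketch below is our own illustration (not the authors’ analysis code) and, for brevity, uses only four of the 29 pairs from Table 2 as sample data; column names such as decision_A are ours.

```python
import pandas as pd

# Four pairs taken from Table 2 (IDs 1, 2, 16, 21); the full analysis uses all 29.
data = pd.DataFrame({
    "decision_A": [2, 1, 10, 7],   # amount sent/lent by subject A (endowment = 10)
    "decision_B": [3, 3, 15, 1],   # amount returned by subject B
})
data["endow_A"] = 10
data["endow_B"] = 3 * data["decision_A"]            # tripled by the experimenters
data["PTI"] = data["decision_A"] / data["endow_A"]  # propensity-to-trust index
data["RI"] = data["decision_B"] / data["endow_B"]   # reciprocity index
data["earning_A"] = data["endow_A"] - data["decision_A"] + data["decision_B"]
data["earning_B"] = data["endow_B"] - data["decision_B"]

# Pearson correlations between decisions and earnings, as in Table 1.
print(data[["decision_A", "decision_B", "earning_A", "earning_B"]].corr())
```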

[Chart omitted: Subjects A – the Propensity-to-Trust Index (PTI) or index of trust]

Fig. 1. Distribution of the Propensity-to-Trust Index (values by number of subjects A).

[Chart omitted: Subjects B – the Reciprocity Index (RI) or index of gratitude-or-indignation]

Fig. 2. Distribution of the Reciprocity Index (values and value ranges by number of subjects B).

Table 2. Propensity-to-Trust Index (PTI) or index of trust; Reciprocity Index (RI).

ID  Endowment of A (€)  Decision of A (€)  PTI  PTI (%)  Endowment of B (€)  Decision of B (€)  RI    RI (%)
1   10                  2                  0.2  20       6                   3                  0.50  50.00
2   10                  1                  0.1  10       3                   3                  1.00  100.00
3   10                  3                  0.3  30       9                   4                  0.44  44.44
4   10                  5                  0.5  50       15                  5                  0.33  33.33
5   10                  4                  0.4  40       12                  2                  0.17  16.67
6   10                  2                  0.2  20       6                   2                  0.33  33.33
7   10                  4                  0.4  40       12                  2                  0.17  16.67
8   10                  4                  0.4  40       12                  2                  0.17  16.67
9   10                  4                  0.4  40       12                  3                  0.25  25.00
10  10                  9                  0.9  90       27                  25                 0.93  92.59
11  10                  2                  0.2  20       6                   1                  0.17  16.67
12  10                  4                  0.4  40       12                  3                  0.25  25.00
13  10                  5                  0.5  50       15                  8                  0.53  53.33
14  10                  3                  0.3  30       9                   1                  0.11  11.11
15  10                  3                  0.3  30       9                   4                  0.44  44.44
16  10                  10                 1.0  100      30                  15                 0.50  50.00
17  10                  10                 1.0  100      30                  15                 0.50  50.00
18  10                  2                  0.2  20       6                   4                  0.67  66.67
19  10                  2                  0.2  20       6                   2                  0.33  33.33
20  10                  2                  0.2  20       6                   5                  0.83  83.33
21  10                  7                  0.7  70       21                  1                  0.05  4.76
22  10                  4                  0.4  40       12                  6                  0.50  50.00
23  10                  5                  0.5  50       15                  5                  0.33  33.33
24  10                  3                  0.3  30       9                   1                  0.11  11.11
25  10                  3                  0.3  30       9                   5                  0.56  55.56
26  10                  5                  0.5  50       15                  6                  0.40  40.00
27  10                  4                  0.4  40       12                  2                  0.17  16.67
28  10                  6                  0.6  60       18                  13                 0.72  72.22
29  10                  6                  0.6  60       18                  5                  0.28  27.78

In the final analysis, through data from investment game experiments – even with all the caveats that come with such data [40] – we can explore whether some specific types of networks have developed. To some extent, indeed, the data collected in this type of experiment can be useful for designing and constructing social network representations. Specifically, the sociograms below are built and intended to provide the reader with insight into the preferences and choices that emerged within and between each group of subjects. Figures 3 and 4 are complementary and contain the decisions made by experimental subjects A and B characterised by a reference threshold of €3.00 (set on the basis of PTI), while Fig. 5 shows the earnings obtained by the experimental subjects considering a reference threshold of €10.00. Each green node indicates a pair of subjects, whereas each blue node stands for the specific objective of the analysis (i.e., decisions or earnings), and each arrow indicates a connection relation regarding the objective of the network analysis. In each sociogram, two smaller subsets (i.e., cliques) of a larger network emerge, highlighting mutually reciprocated relationships, while a subset of isolates does not participate in the dynamics of the network and, therefore, is not considered a clique in the strict sense. The objective of this analysis is to extract unambiguous network trends from the data on node types, which determine whether trust creates trust between the subjects involved in this study. The analysis of the social network confirms that subjects A tend to trust, although they are typically willing to risk only about one third of their initial endowment, thus obtaining a low monetary response from subjects B. On the other hand, when subjects A completely trust their pair members, sending/lending their entire initial endowment (€10.00), subjects B respond positively, and both subjects earn the same high amount (€15.00), as for pairs 16 and 17. The cases relating to pairs 2 and 21 are noteworthy. In pair 2, subject A sends/lends €1.00, while her/his pair member returns €3.00 (i.e., a case of indignation). In pair 21, subject A sends/lends €7.00, while her/his pair member returns €1.00 (i.e., the only case in which trust does not create trust) (Table 3).
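As a hedged illustration of how such sociogram-like representations can be assembled from the experimental data (this is our own sketch, not the software used to produce Figs. 3–5, and it assumes the networkx library), the snippet below links a handful of pairs from Table 2 to the Fig. 3 objective ‘decision ≤ €3.00’ whenever both subjects of the pair stay within the threshold.

```python
import networkx as nx

# Decisions of a few pairs taken from Table 2 (pair id -> amount in euros).
decisions_A = {1: 2, 2: 1, 6: 2, 11: 2, 14: 3, 16: 10}
decisions_B = {1: 3, 2: 3, 6: 2, 11: 1, 14: 1, 16: 15}

G = nx.Graph()
G.add_node("decision <= 3", role="objective")   # blue node (objective of the analysis)
for pair in decisions_A:
    G.add_node(f"pair {pair}", role="pair")     # green nodes (pairs of subjects)
    # connect a pair to the objective when both of its subjects stay within the threshold
    if decisions_A[pair] <= 3 and decisions_B[pair] <= 3:
        G.add_edge(f"pair {pair}", "decision <= 3")

# Pairs left unconnected play the role of isolates (here, pair 16).
print(sorted(nx.isolates(G)))
```

Drawing the resulting graph (e.g. with nx.draw) then yields a picture analogous to the sociograms discussed here.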

Fig. 3. Decisions made in the two groups of subjects with a decision threshold ≤ €3.00. Both subjects of pairs 1, 2, 6, 11, 14, 19, and 24 remain within the threshold; all subjects belonging to isolates exceed the threshold; in all other pairs, at least one subject exceeds the threshold.

Fig. 4. Decisions made in the two groups of subjects with a decision threshold > €3.00. Both subjects of pairs 4, 10, 13, 16, 17, 22, 23, 26, 28, and 29 remain within the threshold; all subjects belonging to isolates are below the threshold; in all other pairs, at least one subject is below the threshold.

Fig. 5. Earnings obtained by the experimental subjects with an earning threshold ≤ €10.00. Both subjects of pairs 4, 5, 6, 7, 8, 9, 11, 12, 14, 19, 23, 24, and 27 are within the threshold; subjects of pairs 16 and 17 earn > €10.00; in all other pairs, at least one subject is within the threshold.

Table 3. Descriptive statistics: endowments, decisions, and earnings of subjects A and B (€).

                N   Min  Max  Mean    SD
Endowment of A  29  10   10   10      0
Decision of A   29  1    10   4.276   2.3436
Endowment of B  29  3    30   12.828  7.0309
Decision of B   29  1    25   5.276   5.3778
Earning of B    29  0    20   7.241   4.7106
Earning of A    29  4    26   11      3.9097

4 Conclusions

In the context of social dilemmas, the investment game is the most significant experiment of choice for measuring components of trust in economic decisions. Despite initial criticism about the measurement of trust, trustworthiness, and reciprocity, and although it has been replicated many times since its introduction – both in the form of one-shot and repeated treatments – the increasing importance of experimental methods in economics urges us to continue to stay focused on verifying its boundary conditions and the generalisability of its results. For all intents and purposes, the ability to trust other individuals and to reciprocate this trust with trustworthy behaviour is rooted in everyday aspects of social and economic life. In this regard, the experimental evidence described in this paper does not support the theoretical prediction of rational choice theory. The above results, although preliminary, definitely deviate from the Nash

equilibrium outcome that perfectly rational and selfish players would have reached. Actually, human behaviour is not always – or maybe never – aimed at rational maximisation and self-interested calculation; indeed, our results support studies that maintain the reciprocity hypothesis. As noted by some scholars, high levels of trust (thus, low ‘risk’ levels) tend to be established mainly among friends [e.g., 41] and people who are socially integrated [42]. Besides, long social relationships have the power to strengthen cooperation between individuals, who positively experience trust over a prolonged period of time characterised by repeated interactions. In line with other trust and investment games conducted under controlled conditions, however, our experiment is still incomplete in terms of external validity [43]. Therefore, it would be appropriate to extend the administration of the experimental task both to other undergraduates, possibly with different schooling and educational paths so as to study possible differences, and to subjects other than students, such as a company’s workers or other types of subjects and organisations of different geographical and cultural origin. Finally, by studying the empirical structure of social relations and associations emerging from the experimental data, we expressed these relationships in the form of a network. We entrust this further experimental evidence to future generations of scholars in order to help understand human decisions and build more reliable hypotheses and models, especially in economics.

References 1. Van Lange, P.A.M., Joireman, J., Parks, C.D., Van Dijk, E.: The psychology of social dilemmas: a review. Organ. Behav. Hum. Dec. Process. 120, 125–141 (2013) 2. Joireman, J., Van Lange, P.A.M., Van Vugt, M.: Who cares about the environmental impact of cars? Those with an eye toward the future. Environ. Behav. 36, 187–206 (2004) 3. Van Vugt, M., Meertens, R.M., Van Lange, P.A.M.: Car versus public transportation? The role of social value orientations in a real-life social dilemma. J. Appl. Soc. Psychol. 25, 258– 278 (1995) 4. Joireman, J., Kamdar, D., Daniels, D., Duell, B.: Good citizens to the end? It depends: empathy and concern with future consequences moderate the impact of a short-term time horizon on organizational citizenship behaviors. J. Appl. Psychol. 91, 1307–1320 (2006) 5. Balliet, D., Ferris, D.L.: Ostracism and prosocial behavior: a social dilemma perspective. Organ. Behav. Hum. Dec. Process. 120, 298–308 (2013) 6. Camerer, C., Loewenstein, G., Prelec, D.: Neuroeconomics: how neuroscience can inform economics. J. Econ. Lit. 43, 9–64 (2005) 7. Samson, A.: Introduction to behavioral economics. In: Samson, A. (ed.) The Behavioral Economics Guide, 1st edn. (2014). http://www.behavioraleconomics.com 8. Ariely, D., Norton, M.I.: Psychology and experimental economics: a gap in abstraction. Curr. Dir. Psychol. Sci. 16, 336–339 (2007) 9. Tyler, T., Amodio, D.: Psychology and economics: areas of convergence and difference. In: Frechette, G.R., Schotter, A. (eds.) Handbook of Experimental Economic Methodology, pp. 181–195. Oxford University Press, New York (2015) 10. Kelley, H.H., Thibaut, J.W.: Interpersonal Relations: A Theory of Interdependence. Wiley, New York (1978) 11. Kelley, H.H., Holmes, J.W., Kerr, N.L., Reis, H.T., Rusbult, C.E., Van Lange, P.A.M.: An Atlas of Interpersonal Situations. Cambridge University Press, Cambridge (2003)

12. Van Lange, P.A.M., Rusbult, C.E.: Interdependence theory. In: Van Lange, P.A.M., Kruglanski, A.W., Higgins, E.T. (eds.) Handbook of Theories of Social Psychology, pp. 251–272. Sage, New Delhi (2012) 13. Weber, J.M., Kopelman, S., Messick, D.M.: A conceptual review of decision making in social dilemmas: applying a logic of appropriateness. Pers. Soc. Psychol. Rev. 8, 281–307 (2004) 14. Eaton, B.C.: The elementary economics of social dilemmas. Can. J. Econ. 37, 805–829 (2004) 15. Linde, J., Sonnemans, J.: Decisions under risk in a social and individual context: the limits of social preferences? J. Behav. Exp. Econ. 56, 62–71 (2015) 16. Beckenkamp, M.: A game-theoretic taxonomy of social dilemmas. Cent. Eur. J. Oper. Res. 14, 337–353 (2006) 17. Apesteguia, J., Maier-Rigaud, F.: The role of rivalry. Public goods versus common-pool resources. J Conflict Resolut. 50, 646–663 (2006) 18. Gibson, K., Bottom, W., Murnighan, K.: Once bitten: defection and reconciliation in a cooperative enterprice. Bus. Ethics Q. 9, 69–85 (1999) 19. Grund, C., Harbring, C., Thommes, K.: Public good provision in blended groups of partners and strangers. Econ. Lett. 134, 41–44 (2015) 20. Andreoni, J., Miller, J.: Giving according to GARP: an experimental test of the consistency of preferences for altruism. Econometrica 70, 737–753 (2002) 21. Rabin, M.: Incorporating fairness into game-theory and economics. Am. Econ. Rev. 83, 1281–1302 (1993) 22. Dufwenberg, M., Kirchsteiger, G.: A theory of sequential reciprocity. Game Econ. Behav. 47, 268–298 (2004) 23. Falk, A., Fischbacher, U.: A theory of reciprocity. Game Econ. Behav. 54, 293–315 (2006) 24. Hoffman, E., McCabe, K., Smith, V.: Behavioral foundation of reciprocity: experimental economics and evolutionary psychology. Econ. Inq. 36, 335–352 (1998) 25. Fehr, E., Schmidt, K.: A theory of fairness, competition, and cooperation. Q. J. Econ. 114, 817–868 (1999) 26. Bolton, G.E., Ockenfels, A.: A theory of equity, reciprocity, and competition. Am. Econ. Rev. 90, 166–193 (2000) 27. Levine, D.: Modeling altruism and spitefulness in experiments. Rev. Econ. Dynam. 1, 593– 622 (1998) 28. Tan, J., Bolle, F.: On the relative strengths of altruism and fairness. Theor. Decis. 60, 35–67 (2006) 29. Wilson, R.K., Isaac, R.M.: Political economy and experiments. The Political Economist XIV (1), 1–6 (2007) 30. Krawczyk, M.: Trust me, I am an economist. A note on suspiciousness in laboratory experiments. J. Behav. Exp. Econ. 55, 103–107 (2015) 31. Chong, A., Chan, F.T.S., Ooi, K.B.: Predicting consumer decisions to adopt mobile commerce: cross country empirical examination between China and Malaysia. Decis. Support Syst. 53, 34–43 (2012) 32. Wei, T.T., Marthandan, G., Chong, A., Ooi, K.B., Arumugam, S.: What drives Malaysian mcommerce adoption? An empirical analysis. Ind. Manag. Data Syst. 109, 370–388 (2009) 33. Righetti, F., Finkenauer, C.: If you are able to control yourself, I will trust you: the role of perceived self-control in interpersonal trust. J. Pers. Soc. Psychol. 100, 874–886 (2011) 34. Balliet, D., Van Lange, P.A.M.: Trust, conflict, and cooperation: a meta-analysis. Psychol. Bull. 139, 1090–1112 (2013)

35. Bohnet, I., Greig, F., Herrmann, B., Zeckhauser, R.: Betrayal aversion. Evidence from Brazil, China, Oman, Switzerland, Turkey, and the United States. Am. Econ. Rev. 98(1), 294–310 (2008) 36. Bonetti, S.: Experimental economics and deception. J. Econ. Psychol. 19, 377–395 (1998) 37. McDaniel, T., Starmer, C.: Experimental economics and deception: a comment. J. Econ. Psychol. 19, 403–409 (1998) 38. Camerer, C., Weigelt, K.: Experimental tests of a sequential equilibrium reputation model. Econometrica 56, 1–36 (1988) 39. Berg, J., Dickhaut, J., McCabe, K.: Trust, reciprocity and social history. Game Econ. Behav. 10, 122–142 (1995) 40. Chen, S.-H., Chie, B.-T., Zhang, T.: Network-based trust games: an agent-based model. JASSS 18(3), 5 (2015) 41. Leider, S., Möbius, M., Rosenblat, T., Do, Q.-A.: Directed altruism and enforced reciprocity in social networks. Q. J. Econ. 124, 1815–1851 (2009) 42. Brañas-Garza, P., García-Muñoz, T., Neuman, S.: The big carrot: high-stakes incentives revisited. J. Behav. Decis. Mak. 23, 288–313 (2010) 43. Loewenstein, G.: Experimental economics from the vantage point of behavioral economics. Econ. J. 109, 25–34 (1999)

Evolution of Business Collaboration Networks: An Exploratory Study Based on Multiple Factor Analysis Pedro Duarte1 and Pedro Campos1,2(&) 1 Faculty of Economics, University of Porto, R. Dr. Roberto Frias, 4200-464 Porto, Portugal [email protected] 2 Laboratory of Artificial Intelligence and Decision Support, LIAAD INESC TEC, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal

Abstract. Literature on the analysis of inter-organizational networks mentions the benefits that collaboration networks can provide to firms in terms of managerial decision-making, although such networks are rarely analysed in terms of their overall performance. This paper aims to identify the existence of common factors of evolutionary patterns in the networks that determine their performance and evolution, through a Multiple Factor Analysis (MFA). Subsequently, a hierarchical clustering procedure was performed on the factors that determine these networks, trying to find similarities in their evolutionary behavior. Data were collected on twelve real collaboration networks, characterized by four variables: Operational Result, Stock of Knowledge, Operational Costs and Technological Distance. The hierarchical clustering allowed the identification and distinction of the networks with the worst and best performances, as well as the variables that characterize them, making it possible to recognize poorly defined strategies in the constitution of some networks.

Keywords: Inter-organizational networks · Evolution · Multiple Factor Analysis · Hierarchical clustering

1 Introduction

Firms in corporate or inter-organizational networks interact with the objective of exchanging or sharing resources to achieve mutual benefits. Managerial decision-making processes involving cooperation between different types of organizations with the aim of improving innovation capacity have been unanimously regarded as important [1]. It has been demonstrated that a collaborative business strategy is superior to strategies of traditional economies of scale in obtaining competitive advantages and general efficiency in the implementation of management processes. The need for effective intercultural communication is also seen as a crucial factor in the design of innovative business practices [2]. Inter-organizational networks are sets of companies that interact with each other through inter-organizational relationships [3]. The literature on organizational decision processes has addressed the impact of collaborative networks and concluded that these

can improve organizational performance [4–6]. Some authors claim that the success or failure of firms depends on their direct or indirect interactions with other entities [7, 8]. Collaborative strategies encourage companies’ specialization in the more critical activities where they can be most proficient, leaving other activities to be carried out by other members of the network. Collaboration can be understood as a complementary action among the components of a network, namely in terms of efforts towards technological innovation [9]. In this work we study the evolutionary behaviour of collaboration networks, in order to identify the existence of common factors of evolutionary patterns in networks that determine their performance, through a Multiple Factor Analysis (MFA). Subsequently, a hierarchical clustering procedure is performed on the factors that determine these networks, in order to find similarities/differences in this evolutionary behavior. The results obtained from the MFA revealed three factors that explain most of the data variation. The hierarchical clustering method allowed the identification and distinction of the networks with the worst and best performances, as well as the variables that characterize them, making it possible to recognize poorly defined strategies in the constitution of some networks. This research is innovative because, in the literature on the evolutionary analysis of enterprise networks, networks as organizational structures are rarely analysed in terms of their overall performance [9, 10]. The paper is structured as follows: Sect. 2 contains the literature review concerning inter-organizational networks and collaboration. In Sect. 3, we report the Methods and Data; in Sect. 4 we provide the Results; and finally, Conclusions are presented in Sect. 5.

2 Inter-organizational Networks and Collaboration

Inter-organizational networks are groups of companies that interact with each other through inter-organizational relationships [3, 11]. They may be the result of strategic alliances, industrial districts, consortia, joint ventures, social networks, and other more or less formal forms of relationship. In other words, this concept consists of a strategic decision, adopted by two or more independent organizations, aiming at exchanging or sharing resources to achieve mutual benefits, constituting a system of relations based on the division of labour within the network [12]. Inter-organizational interaction can enhance skills, reduce resource constraints, promote combinations of knowledge and creativity, and promote the exploration of new business units. These benefits, in turn, lead to economic growth and increased competitiveness [13, 14]. With market globalization, [7] assume that companies that work together, and that collaborate with each other, find it easier to adapt their products, services and operational processes to the demands of the markets. As a result, the need to connect more quickly and more effectively with others through collaborative networks, to expand internal capabilities and problem solving, is crucial, particularly in SMEs [15]. Collaboration environments promote the transfer and exchange of knowledge and resources, which can provide a competitive advantage to firms [16]. However, companies of different sizes may benefit from collaboration more than others. Technological

capacity, financial autonomy, and human capital constraints in small businesses are detrimental to innovation processes, since the implementation of innovation projects is complex and costly. Therefore, collaborative networks can reduce these limitations through resource sharing and knowledge transfer. Small firms can benefit from their presence in these networks to leverage their business capabilities, innovation processes, environmental understanding and market reactivity [17, 18]. Network cooperation/collaboration strategies cover situations where several companies agree to form multiple partnerships to achieve joint objectives [19]. One of the advantages of collaboration is the fact that companies obtain access to the partners of their partners [20]. These authors see strategic alliances as the main type of cooperative strategy and define them as alliances in which firms combine some of their resources and capabilities to create a competitive advantage.

3 Methods and Data

3.1 Multiple Factor Analysis (MFA)

Traditionally, multivariate analysis examines data obtained by measuring more than two variables over a set of objects or individuals, represented in a two-way structure called a matrix or data frame. Multiple Factor Analysis (MFA) was proposed by [21] with the objective of ascertaining the existence of a possible structure common to several data matrices. It involves the simultaneous treatment of a succession of tables in which the same individuals/objects are characterized by equal or different sets of variables, quantitative or qualitative, associated with a third dimension (e.g., time). According to [22], MFA is an extension of Principal Component Analysis (PCA), adapted to handle multiple data tables (for different time instances, for example) that measure sets of variables collected for the same observations, or alternatively (in dual-MFA) multiple data tables where the same variables are measured in different sets of observations. PCA is usually applied in order to reduce a large space of variables to a smaller space. Although this is not the case here, the motivation for MFA is that it helps reveal the internal structure of the data from an evolutionary perspective. After performing the Multiple Factor Analysis, a network clustering procedure was applied – more specifically, a hierarchical clustering of the factors, resulting from the MFA, that determine these networks – in order to find similarities in their evolutionary behaviour.
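As a purely illustrative sketch of the procedure just outlined – not the software actually used for this study – the following Python/NumPy code implements the core MFA idea: each period’s table is standardised, weighted by its first singular value so that no single period dominates, and a global PCA is then run on the concatenated table. Dedicated implementations (e.g. the MFA function of the R package FactoMineR) additionally provide the partial representations and inter-structure diagnostics used in Sect. 4.

```python
import numpy as np

def mfa(tables):
    """Minimal Multiple Factor Analysis sketch.

    tables: list of (n_obs x n_vars) arrays, one per time period, describing
            the same observations (here: 12 networks, 4 variables per period).
    Returns the global factor scores, eigenvalues and explained inertia (%).
    """
    weighted = []
    for X in tables:
        Z = (X - X.mean(axis=0)) / X.std(axis=0)    # standardised variables per table
        s1 = np.linalg.svd(Z, compute_uv=False)[0]  # first singular value of the table
        weighted.append(Z / s1)                     # weight the whole table by 1 / s1
    G = np.hstack(weighted)                         # global (concatenated) table
    U, S, _ = np.linalg.svd(G, full_matrices=False)
    eigenvalues = S ** 2
    inertia = 100 * eigenvalues / eigenvalues.sum()
    return U * S, eigenvalues, inertia              # scores, eigenvalues, inertia (%)

# Hypothetical example: 7 periods, 12 networks, 4 variables per period.
rng = np.random.default_rng(0)
scores, eig, inertia = mfa([rng.normal(size=(12, 4)) for _ in range(7)])
print(inertia[:3])   # explained inertia of the first three global factors
```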

3.2 Networks and Data Preparation

Information for this study was collected from the Amadeus database [28], on twelve real collaboration networks1 also known as strategic alliances or partnerships (industrial,

1 The collaboration networks identified for this paper are the following: Continental Lemmerz (Portugal); Comportest; Nokia Siemens Networks U.S; Valindo - Têxteis, S.A; Renault-Nissan; Renault-Nissan-Daimler; Sony Mobile Communications Ab; Trützschler-Marzoli; Spairliners Gmbh; Lufthansa Bombardier Aviation Services Gmbh; N3 Engine Overhaul Services Gmbh & Co. Kg; Vanpro - Assentos, Lda; for more details concerning the data and the variables involved, see [28].

aeronautical and service sectors), with a set of four attributes: Operating Result, Stock of Knowledge, Operational Costs and Technological Distance. We used these four attributes to characterize the evolution of each network over a span of several time periods. Collaboration networks have been identified based on a minimum of 7 periods (years) of collaboration within the same inter-organizational structure. As mentioned previously, each network is described by four variables: Operational Result (V1), Stock of Knowledge (V2), Operational Costs Indicator (V3) and Technological Distance (V4). For this type of analysis, a data table D was constructed for each time period, in which the n observations in each period t are described by a group of variables V = {V1, …, V4}, as represented in Fig. 1. We analyse the intra-structure, to identify the main components for the analysis as a whole, and the inter-structure, which allows for the identification of the evolution of the factors.

Fig. 1. Representation of the several data tables as the input for the MFA

4 Results

4.1 Intra-structure

Seven different periods (T1, …, T7) were characterized by four variables, for a total of 7 × 4 = 28 variables. Twelve observations were collected, each corresponding to a collaborative network. After performing the main stage of the MFA, it is thus possible to obtain the consensus space, represented by the global eigenvalues, as well as the percentages of the variance or inertia explained and accumulated (Table 1).

Table 1. Representation of the consensus space (intra-structure – first four components)

PCA  Eigenvalue  Explained inertia  Cumulated inertia
1    6.731       48.953             48.953
2    3.106       22.591             71.544
3    1.595       11.603             83.147
4    1.254       9.119              92.266
…    …           …                  …
11   0.000       0.001              100.000

It is possible to conclude that the first three factors are those of greatest importance, explaining about 83.15% of the total variance of the model, and these will be the main factors to be taken into account in this study. In MFA, the first three dimensions and their associated variables explain most of the inertia, serving as a criterion for discriminating the groups within the clusters. Of these factors, factor or axis 1 (F1) is the one with the highest preponderance in the analysis, explaining about 49% of the variance, while the other factors (F2 and F3) explain about 22.6% and 11.6% of the variance, respectively. It can be argued that Factor 1 can adopt a description that involves the financial performance of the network and all its accumulated investment in goods and human resources, and more specifically a positive evolution of them over time. Factor 1 can thus be denominated “Investment and Operational Performance of the Network”. Factor 2, in turn, is characterized by negative correlations with the set of Technological Distance variables. These negative correlations indicate that the factor moves in the opposite direction to the evolutionary movement of this set of variables. Finally, Factor 3 is mainly described by the variable ICop (Operational Costs Indicator), the correlation with this variable being negative; that is, Factor 3 and the Operational Costs Indicator “move” in opposite directions.

4.2 Inter-structure

The quality of the simultaneous representation of the clouds is evaluated by the ratio of inter-structure inertia to total inertia. The higher this quotient for each axis/factor of the intra-structure, the higher the quality of the cloud representation.

Table 2. Quotient (inter-structure inertia / total inertia)

Factor 1  Factor 2  Factor 3
0.978     0.815     0.519

Table 2 shows the quotient (inter-structure inertia)/(total inertia). With respect to this table, it can be concluded that Factor 1, with a value of 0.978, indicates, on the one hand, a high degree of similarity between the different periods studied and, on the other hand, confirms the common character of this factor. The remaining two factors do not have a character as common as Factor 1, although Factor 2 conveys a great degree of similarity between the periods.

4.3 Clustering

After clustering the networks, it is possible to identify which variables contribute most to each class. These variables are determined by considering whether the differences between the mean of each class and the overall mean are statistically significant, according to a specific statistical test [21]. For the list of variables corresponding to the partition into three classes, it is possible to verify that, in class 1, consisting of nine networks, the variables that characterize this set of observations are mainly the sets of Technological Distance, Operating Result, and Stock of Knowledge variables. They are

associated mainly with negative mean values of Network Operating Result, with average values of Stock of Knowledge considerably below the general average of the observations (except for the first three periods), and with a below-average Technological Distance of the class. It can be argued that this class is represented by the worst-performing networks, with low levels of investment in goods and human resources, and the technological difference between the companies in these networks is almost non-existent. Class 2 is assigned to one network, and its most relevant variable is Operational Costs. Class 3, consisting of only two networks, is associated not only with average values of Stock of Knowledge higher than the general average, but also with above-average values of Network Operating Result. One can argue that this cluster is made up of networks of high financial performance, which is then reflected in the amount invested in goods and human resources. According to the individual analysis of Class 3, it can be seen that it is composed of two networks of large companies in the automotive sector, which present not only average levels of Operating Results considerably above the general average of the sample, but also above-average values of Stock of Knowledge. The high Operating Results can arise as a positive consequence of the investment made by these companies, in line with the study carried out by [27], in which there is an increase in investment in technology to combat competition pressures, environmental legislation and consumer requirements.
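A hedged sketch of the clustering step is given below (our own illustration, not the authors’ code): Ward hierarchical clustering is applied to the networks’ coordinates on the retained MFA factors and the dendrogram is cut into three classes, mirroring the partition discussed above; the factor scores here are simulated placeholders standing in for the output of an MFA such as the sketch in Sect. 3.1.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Placeholder factor scores: 12 networks x 3 retained MFA factors.
rng = np.random.default_rng(1)
scores = rng.normal(size=(12, 3))

Z = linkage(scores, method="ward")                # agglomerative (Ward) clustering
classes = fcluster(Z, t=3, criterion="maxclust")  # cut the dendrogram into 3 classes
for c in sorted(set(classes)):
    members = (np.where(classes == c)[0] + 1).tolist()
    print(f"Class {c}: networks {members}")
```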

5 Conclusions and Final Considerations

The study of business networks is a complex theme. Part of this complexity is related to the lack of consensus on the definition of this multifaceted phenomenon. Its definition therefore lies in a continuous and multidisciplinary debate, still evolving. By identifying the factors underlying evolving inter-organizational networks and grouping them according to their characteristics, it was possible to identify imperfections in the performance of companies and, from there, to improve, as well as to identify the indicators that best reflect the performance of a company/network of companies, and to try to see whether it is possible to get out of such a situation and boost performance. In addition, there are some limitations to the work done. The study of the performance and evolutionary behaviour of business collaboration networks can provide high-quality information about them, provided a relatively well-formed sample is considered. This turned out to be one of the main difficulties throughout this study. The lack of existing information on business collaboration networks made it difficult to compile a considerable sample, both in collecting the necessary information for the variables (financial information) and information on the establishment of networks, and in combining the two. For future research, it would be interesting to study collaborative networks with a larger sample, either in terms of the number of networks (observations) or over a longer time frame. In addition, it would be recommended to carry out an analysis of business networks focusing on a specific sector of activity, using not only the Multiple Factor Analysis method but also other data analysis techniques, such as regression trees and panel data models.

References 1. Bianchi, C., Gras, N., Sutz, J.: Cooperation understood as knowledge exchanges driven by people: consequences for the design and analysis of innovation surveys. University Research Council Universidad de la República. Uruguay (2009) 2. Milovanovic, S.: Balancing differences and similarities within the global economy: towards a collaborative business strategy. Proc. Econ. Financ. 23, 185–190 (2015) 3. Eiriz, V.: Dinâmica de relacionamento entre redes interorganizacionais. Inovação Organizacional, Nº 2, pp. 121–153 (2004) 4. Combs, J.G., Ketchen, D.J.J.: Explaining interfirm cooperation and performance: toward a reconciliation of predictions from the resource-based view and organizational economics. Strateg. Manag. J. 20(9), 867–888 (1999) 5. Sarkar, M.B., Echambadi, R., Harrison Jeffrey, S.: Alliance entrepreneurship and firm market performance. Strategic Manag. J. 22(6), 701–715 (2001) 6. Zaheer, A., Geoffrey, G.B.: Benefiting from network position: firm capabilities, structural holes, and performance. Strategic Manag. J. 26(9), 809–825 (2005) 7. Wilkinson, I., Young, L.: On cooperating firms, relations and networks. J. Bus. Res. 55, 123–132 (2002) 8. Håkansson, H., Waluszewski, A.: Path dependence: restricting or facilitating development? J. Bus. Res. 55, 561–570 (2002) 9. Campos, P.: Organizational Survival and the Emergence of Collaboration Networks: a Multi-Agent Approach. Tese de Doutoramento, Faculdade de Economia da Universidade do Porto, Porto (2007) 10. Wang, C., Rodan, S., Fruin, M., Xu, X.: Knowledge networks, collaboration networks, and exploratory innovation. Acad. Manag. J. 57(2), 484–514 (2014) 11. Popp, J., Milward, H., MacKean, G., Casebeer, A., Lindstrom, R.: IBM Center for the Business of Governenment, Inter-organizational Networks, A Review of Literature to Inform Practice, Collaborating Across Boundaries Series (2014) 12. Johanson, J., Mattsson, L.G.: Interorganizational relations in industrial systems: a network approach compared with transaction-costs approach. Int. Stud. Manag. Organ. XVII(1), 34– 48 (1987) 13. Daugherty, P.J., Richey, R.G., Roath, A.S., Min, S., Chen, H.: Is collaboration paying off for firms? Bus. Horiz. 49(1), 61–67 (2006) 14. Hewitt-Dundas, N.: Resource and capability constraints to innovation in small and large plants. Small Bus. Econ. 26, 257–277 (2006) 15. Michaelides, R., Morton, S.C., Michaelides, Z., Lyons, A.C., Liu, W.: Collaboration networks and collaboration tools: a match for SMEs? Int. J. Prod. Res. 51(7), 2034–2048 (2013) 16. Mowery, D.C., Oxley, J.E., Silverman, B.S.: Strategic alliances and interfirm knowledge transfer. Strategic Manag. J. 17, 77–91 (1996) 17. Cohen, W., Levinthal, D.: Absorptive capacity: a new perspective on learning and innovation. Adm. Sci. Q. 35, 128–152 (1990) 18. Lane, P., Salk, J.E., Lyles, M.A.: Absorptive capacity, learning and performance in international joint ventures. Strategic Manag. J. 22, 1139–1161 (2001) 19. Hitt, M., Ireland, R.R., Hoskisson, R.E.: Strategic Management: Competitiveness and Globalization (Concepts and Cases). Thomson South-Western (2005) 20. Cline, R.: Partnering for strategic alliances. Lodging Hospitality 57(9), 42 (2001) 21. Escoufier, B., Pagès, J.: Mise en oeuvre de l’AFM pour les tableaux numériques, qualitatifs ou mixtes, 429 p. Publication interne de l’IRISA (1985)

22. Abdi, H., Williams, L., Valentin, D.: Multiple factor analysis: principal component analysis for multitable and multiblock data sets. Comput. Stat. 5(2), 149–179 (2013) 23. Jaffe, A.B.: Real effects of academic research. Am. Econ. Rev. 79(5), 957–970 (1989) 24. Jaffe, A.B., Trajtenberg, M., Henderson, R.: Geographic localization of knowledge spillovers as evidenced by patent citations. Q. J. Econ. 108(3), 577–598 (1993) 25. Audretsch, D.B., Feldman, M.P.: R&D spillovers and the geography of innovation and production. Am. Econ. Rev. 86(3), 630–640 (1996) 26. Duarte, P.: Evolução de Redes de Colaboração Empresariais: um estudo segundo Análise Fatorial Múltipla. Faculdade de Economia da Universidade do Porto, Tese de Mestrado (2016) 27. MacNeill, S., Chanaron, J.-J.: Trends and drivers of change in the European automotive industry: (I) mapping the current situation. Int. J. Autom. Technol. Manag. 5(1), 83–106 (2005) 28. van Dijk, B.: AMADEUS, a database of comparable financial information for public and private companies across Europe (2017)

Ethics and Decisions in Distributed Technologies: A Problem of Trust and Governance Advocating Substantive Democracy Antonio Carnevale(&) and Carmela Occhipinti CyberEthics Lab, Corso Cesare Battisiti, n. 69, 80024 Cardito, NA, Italy [email protected]

Abstract. The distributed architecture of applications such as blockchains, wireless sensor networks, multi-agent platforms, and the Internet of Things charges technological development with facing, by default, two aspects of responsibility that were usually accorded to human beings: the “decision” (Who is enabled to make decisions in a decentralized system? What about the mechanism for deciding? Authorized by whom? With what kind of consensus?) and “ethics” (To which principles must the decision-making mechanism respond? And, if decisions are distributed, what is the role of ethics? To guide or to laissez-faire? Permissive or restrictive? For everyone or only for those who are authorized?). Responding to these epochal questions can lead to rethinking distributed technologies in view of a game-changing transformation of models of trust-in-governance. From a model marked by the centralization of trust, we have come to progressively more decentralized forms, up to the contemporary “distributed trust”. This paper constitutes an endeavor to introduce and address the main philosophical foundations of this historical passage.

Keywords: Ethics · Philosophy of technology · Blockchain · Governance · Trust · Decision-making · Relation of economics to social values

1 Ethics and Decision in Distributed Technology

1.1 Philosophy as Pilot of Technology? The Moralizing of Things and the Informatization of Morality

The title may seem a provocation: How can technology be guided by philosophy, given that they are two forms of rationality so distant from each other? Philosophy means meditation and contemplation; technology instead means pragmatism, speed, anticipation, and consumption. How can philosophy guide technological development? The answer starts with observing technological development. For centuries technique has been understood as the human instrument par excellence to forge nature and replicate its strength and capacity for regeneration. Think of the myth of Prometheus and fire. Today, technologies appear to shape a societal power far beyond nature and instrumentality [1]. The so-called “4th

Revolution” [2] has transformed ICTs into environmental forces that create and transform our realities. Data technologies are changing the links between sociality and privacy: in the past they were contrasting concepts, each representing a groundless intrusion into the other’s domain [3]; today they are slowly becoming adjoined and aligned [4], so much so that some hypothesize the birth of a “group privacy” [5]. In the past, many philosophers and thinkers were concerned for decades with understanding the nature of this power and with establishing norms and methods to regulate it: John Dewey, Martin Heidegger, Herbert Marcuse, Günther Anders, just to name a few. It can be assumed that the main anxiety worrying those thinkers is that, if one day technology were to hold a teleological power – the capacity to establish its own ends for itself – this would not only be a danger limiting human responsibility, but would undermine the entire foundations of human rationality. The trouble with these accounts of the dangerousness of technology lies in a split vision of human beings and technology struggling over the world. Humans and technologies are figured as superpowers that contend for hegemony over a world made up of dumb and unarmed things. The novelty is that technological development pushes the world toward a disruptive scenario where things can communicate. The number of smart objects keeps growing. Different studies forewarn that by 2020 around 34 billion smart devices will be connected to the Internet (24 billion IoT objects and 10 billion more traditional devices such as smartphones, tablets, smartwatches, etc.) [6]. A world in which even things contribute, with their M2M (machine-to-machine) communications, to shifting meanings and making decisions is a world that is no longer human-centered in a philosophical sense. This implies a set of imperative queries: What is the moral consistency of an M2M communication? When is it really true? In more exhaustive words: To which logical and ethical criteria must this communication respond? Have human beings already embedded, through history or experience, the conditions of thinkability of these criteria or, otherwise, do we have to learn novel ways of thinking the truth? It therefore seems evident to envisage that digital technologies and ICTs will place philosophy once again at the center of knowledge. This is the background thesis of this paper: a legacy of philosophy with a supplementary role. No longer investigating separately the construction of human subjectivity and the being of the world, but rather quite the opposite, intersectively: the construction of the world through the being of the agents operating such a construction. There are trends observable in society that seem to authorize this thesis:
• The moralizing of things. Objects are increasingly designed with value-laden settings already incorporated [7, 8].
• The informatization of reality. ICT have populated not only the devices of technology but also language and its “metaphors we live by” [9], which is the key vehicle for constructing the meaning of the world and assembling a shared social imaginary. Our experiences are formed in a reality that is conditioned by computer technology in a consistent manner [10]. This substrate filters through cognitive and emotional processes into the ways in which we elaborate visions of the world, especially in all those types of technologies that are based on relationships in order to be performative (machine learning, AI, blockchain, IoT, total no-gate solutions).

1.2 Distributed Technologies: Strengths and Weaknesses of an Announced Technological Revolution

According to many experts, the blockchain and, more generally, distributed technologies (DTs) represent the most suitable candidate technologies capable of maintaining an immutable log of the transactions happening in a distributed network, building a truly decentralized, trustless and secure environment for the IoT [11] and representing the infrastructure of reference for the operation of this “spiderweb of intelligent objects” [12]. The best-known DTs are represented by the blockchain, namely a digital ledger that allows for verification without having to depend on third parties. When the term blockchain first appeared, in an article of 1991, it referred to an abstract description of “a cryptographically secured chain of blocks” [12]. However, Nakamoto, an anonymous person (or group of persons), is universally recognized as the father of the blockchain, having formally theorized the blockchain technology [13] and implemented it (in 2008 and 2009, respectively) as a core component of the cryptocurrency Bitcoin. From this starting point, the theoretically disruptive impact of DTs has been addressed as a response that offers benefits in several other domains of application. But, along with the benefits, the worries have also grown. In Table 1 we summarise the most evident potentialities as well as weaknesses of DTs.

Table 1. Most evidence-based strengths and weaknesses of DTs.

Strengths:
• Validating identity management, without the use of an independent third party, mainly in education and training fields [14, 15]
• Storing trade-able information records and transactions to foster smarter business supply chains [16]
• E-governance tool for creating decentralized platforms for storing, sharing and verifying qualified public services [17]
• Enabling green energy technology for low-carbon transition and sustainability [18]

Weaknesses:
• Globally undefined personal data protection due to the “eternal” feature of DTs; privacy issue [19]
• Very slow process to certify that all nodes in the network come to an agreement that a transaction is valid; double-spend issue [20]
• Disproportion between the growth of the distribution network and the quality and amount of the nodes that constitute it; mining issues [20]
• Expensive energy consumption [21]
• Futuristic and still intangible technology [21]

1.3 “Distributed”, the Quality Beyond the Border of Technology: The Case of the PERSONA Project

Despite their many qualities and faults, DTs cannot be totally considered either as hoaxes or, conversely, as revolutionary, and not only because they are technologies too new to be assessed. Rather, the crux of the matter concerns the ontological transformation – if we may say so – of the “border of technology” that DTs help to promote. Thinking of technology as a rationality that designs smart and intelligent “agents” (robots, autonomous vehicles, artificial intelligences) means failing to see an important facet. For these devices to work best, technology must be rethought beyond the border that previously separated the agents from their environment. This distinction has been reconceptualized, and we have to assume it in the form of a flow that relates agent and environment within enabling architectures. Against this backdrop, “distributed technology” is no longer a defined type of technology – such as blockchain or distributed ledger technology – rather, it becomes an interpretative paradigm of technological development. “Distributed” is the environmental characteristic that will characterize the future integration between moral agents, technological architectures, and security systems to overcome the current practices of the “border”. From this point of view, an interesting case study is represented by the EU project PERSONA, of which the authors of this paper are partners [22]. The project does not deal with DTs as commonly understood. Its main aim is to design and establish effective methods in order to carry out an impact assessment of no-gate border-controlling technology. Nevertheless, the distributed nature of the no-gate technologies, theoretically, favours both easier crossing of the border (i.e., the no-gate solution) and the control of people (i.e., intensive use of technology to handle personal data, with the risk of data manipulation). Precisely this new possibility of combination makes the role of decision-making and ethics important in guiding distributed technology.

1.4 The Importance of Ethics and Decision-Making to Guide the Environmental Architecture of Technology

For many years we have been accustomed to physical technological artefacts that constituted a “presence” in the world. They needed to be crossed, to be installed, and to be “power supplied” by switching or physical operations. In this sense, DTs aim to be no-gate, flowing, environmentally installed, monitored by other technology and, as much as possible, designed for self-supply. Philosophically speaking, it seems a move from presence to absence. Against this backdrop, such a configuration also needs to be accompanied in its progressive development according to two guidelines:
• Decision-making. The rules of decision represent the main technical and regulatory aspect for the good functioning of the system and its elements. Therefore, both the decision-making rules relating to the experience of the agents (transparency, security, privacy, immutability) and the decision-making rules relating to the maintenance of the distributed nature of the environment (adaptability, resilience, level of consensus, trust) are important. Agents well educated to distribution but living in an environment without rules and poorly protected end up succumbing to external attacks. On the contrary, a too top-down and vertically managed environment that protects itself from everyone, including its own elements, loses the essential quality of distribution in the long run.
• Ethics. To guarantee an accurate balance between the instances of the agents and those of the environment, the decision-making process needs to be guided by rules that respond to ethical principles. Ethics gives the rules the value-laden contents that allow rules not only to be observed as “commands”, but to be chosen as an intrinsic and rational aspect of a more universal vision [23] or, in the case of an artificial intelligence, as an agent performing different levels of abstraction [2].

2 Distributed Technologies, Trust, and Governance of Democracy

2.1 Trust, a Complex Human Feeling

It is commonly believed that trust has to do with moral certainty and security, but it also includes the idea of risk [24]. Trust therefore moves between two opposing poles: security on the one hand, vulnerability on the other. It plays an important role at both the interpersonal and the impersonal level. As the classical theorists of sociological thought – among others Georg Simmel and Émile Durkheim – taught, trust is a pre-contractual element of social life, that is, it coincides with that basic solidarity and implicit cooperative agreement (both moral and cognitive) that allow society to ‘hold together’. In impersonal systems such as economics or law, trust develops into a general social bond that leads to the internalization of common values, that is, an active adherence to the normative order [25], or into a blind mechanism of adhesion to society released from the motivational structures of the actors [24]. Finally, at the political level, trust is the moral basis of the consensus that allows the legitimacy of institutions [26].

2.2 Are We Experiencing a Real “Crisis of Trust”?

Today, many facts of reality speak of a “crisis of trust”. Rachel Botsman, in a recent book [27] analyzing the 2017 Trust Barometer report, underlines how trust in the intermediate bodies of society is at its lowest point in recent decades. The most interesting aspect is that the people interviewed said they had lost confidence not only in governments and private companies – a consideration perfectly explainable in view of the ongoing economic and political crisis – but also in the media and NGOs, that is, in those agencies of society that should act as critical markers of democratic values such as truth, cooperation, solidarity, etc. In 82% of the contexts surveyed, the media are considered part of the elite. This misrepresentation has caused an implosion of trust in participatory processes and generated a parallel tendency to prefer self-referential truths and to rely on one’s own peers. It seems that people try to confirm the beliefs they already have, often turning to people they already know. Today, trust and influence are directed more towards “people” – family members, friends, classmates, colleagues – than towards hierarchical elites, experts, and authorities. However, social behaviors that seem to disconfirm this negative picture are observable in society. An interesting study is the Science Barometer, a representative survey of German citizens on science and research: the 2018 report shows that, despite the mentioned hostility towards elites, public trust in science and research remains stable. Above all, the phenomenon of greatest empirical relevance is the fortune of global companies that operate in the online marketplace. In a society apparently closed to hierarchical elites, where individuals’ preferences count more than abstract institutions, it happens, inexplicably, that millions of people share billions of pieces of sensitive personal data on remote and de-materialized platforms, sharing images, memories, and emotions with virtual (often unknown) contacts on social forums. Is there really a crisis of trust?

2.3 The Crisis as a Change of the Trust-in-Governance Model

It is evident that trust, even today, remains the same complex feeling poised between risk and inclination, egoism and reliance. The novelty is rather that we are faced with a profound transformation of the models that link trust and governance. From an era of centralization of trust, we have moved through progressively more decentralized forms until today, when we have entered an era of distributed trust. The main advantage of the distributed trust model is that it brings the community of experts closer to the community of stakeholders; its flaw is that it remains a model of computer engineering [28].

2.4 Democracy for Trusting Distributed Technology

Nevertheless, while these models show how trust has changed, they do not explain why trust changes in this particular way, nor do they offer counteractive tools to intervene in case human development is not directed as desired. Although these models fit into an appreciable and shareable interpretive line – i.e., the environmental character of DTs favors transparent and secure decision-making processes and, in theory, respect for fundamental ethical principles – they alone are not enough to guarantee democracy. More distributed decisions and a deeper ethical load are not in themselves a sign of better democracy. From the point of view of distribution, as the economist Amartya Sen has argued, a more transparent and equal distribution has little meaning if we do not understand the weight that the distributed good has in the capacity of individuals to realize their life plans [29]. On the other hand, if we understand ethics as respect for fundamental human values, this is not enough to make those values a real practice. There is no proportional automatism between more distributed technology and better ethical quality of life. Such automatism is but one possible version of how the two can be governed together. Therefore, the definition of decisions and ethical values is a necessary condition for trust to be a constituent of DT architectures, but it is not a sufficient one. Sufficiency is reached when decisions and ethical values are “problematized” rather than merely “defined” – problematized in the sense of being subjected to the democratic scrutiny of public discussion. Problematization is the best distributed form of the governance of democracy. It does not mean inventing difficulties where there are none, but rather enhancing democracy as an epistemic [30] and communicative [31] space in which to offer and take reasons. Democracy is a problematization of facts in view of better, collectively participated solutions. In so doing, DTs will help people to regain trust in institutions, that is, by providing forms of political decentralization that give citizens or their elected representatives more power in public decision-making [32].


3 Conclusions

No-gate border-controlling technology, the IoT, blockchain, multi-agent platforms, and any other DTs can increase our perception of reality, but this does not in itself imply an increase in the democratic quality of empirical experience. The positive philosophical aspects of DTs – placing decision-making and ethical principles at the center of the policy agenda – are just the initial step (and not the solution) of the slow integration of the digital revolution with human development and environmental eco-sustainability. The return of philosophy as an insightful discipline in constructing the world does not simplify things, nor complicate them; it simply questions them in a positive sense, that is, it makes them more democratic. In this sense, DTs and democracy will converge if the former provide the latter with an informational and socialized space in which to trust the democratization of decision-making mechanisms. We need more distributed technological media to enhance the dialectical game of democracy: giving and requesting public reasons to discuss socially relevant facts, involving citizens in public consultations, and deliberating solutions that solve actual problems while, at the same time, creating responsibility and accountability for future sustainability.

Acknowledgements. This work is supported by the project PERSONA funded by the European Union’s Horizon 2020 programme EU.3.7.6. under grant agreement No. 787123.

References

1. Kirkpatrick, G.: Technology and Social Power. Palgrave Macmillan, Hants (2008)
2. Floridi, L.: The 4th Revolution: How the Infosphere is Reshaping Human Reality. Oxford University Press, Oxford (2014)
3. Warren, S.D., Brandeis, L.D.: The right to privacy. Harvard Law Rev. 4(5), 193–220 (1890)
4. Garcia-Rivadulla, S.: Personalization vs. privacy: an inevitable trade-off? IFLA J. 42(3), 227–238 (2016)
5. Taylor, L., Floridi, L., van der Sloot, B. (eds.): Group Privacy. Springer, Berlin (2017)
6. Number of IoT devices. https://iot-analytics.com/state-of-the-iot-update-q1-q2-2018-number-of-iot-devices-now-7b/. Accessed 06 Apr 2019
7. Verbeek, P.P.: Moralizing Technology: Understanding and Designing the Morality of Things. Chicago University Press, Chicago (2011)
8. Brey, P.: The technological construction of social power. Soc. Epistemology 22(1), 71–95 (2007)
9. Lakoff, G., Johnson, M.: Metaphors We Live By. University of Chicago Press, Chicago (2003)
10. Halpin, H., Monnin, A. (eds.): Philosophical Engineering: Toward a Philosophy of the Web. Wiley-Blackwell, Oxford (2014)
11. Rauchs, M., Glidden, A., Gordon, B., Pieters, G., Recanatini, M., Rostand, F., Vagneur, K., Zhang, B.: Distributed Ledger Technology Systems: A Conceptual Framework. Cambridge Centre for Alternative Finance, University of Cambridge, Cambridge, UK (2018). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3230013. Accessed 06 Apr 2019
12. Haber, S., Stornetta, W.S.: How to time-stamp a digital document. J. Cryptol. 3, 99–111 (1991)
13. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system. http://bitcoin.org/bitcoin.pdf. Accessed 06 Apr 2019
14. Pilkington, M.: Blockchain technology: principles and applications. In: Olleros, X.F., Zhegu, M. (eds.) Research Handbook on Digital Transformations. Edward Elgar, UK (2015)
15. Grech, A., Camilleri, A.F.: Blockchain in Education. In: dos Santos, A.I. (ed.). http://publications.jrc.ec.europa.eu/repository/bitstream/JRC108255/jrc108255_blockchain_in_education%281%29.pdf. Accessed 06 Apr 2019
16. Steiner, J.: Blockchain Can Bring Transparency to Supply Chains. The Business of Fashion (2015). http://www.businessoffashion.com/articles/opinion/op-ed-blockchain-can-bringtransparency-to-supply-chains. Accessed 06 Apr 2019
17. Kokkinakos, P., Koussouris, S., Panopoulos, D., Askounis, D., Ramfos, A., Georgousopoulos, C., Wittern, E.: Citizens collaboration and co-creation in public service delivery: the COCKPIT project. Int. J. Electron. Govern. Res. 8(3), 33–62 (2012)
18. Andoni, M., Robu, V., Flynn, D., Abram, S., Geach, D., Jenkins, D., McCallum, P., Peacock, A.: Blockchain technology in the energy sector: a systematic review of challenges and opportunities. Renew. Sustain. Energy Rev. 100, 143–174 (2019)
19. Finck, M.: Blockchains and Data Protection in the European Union. SSRN Scholarly Paper. Social Science Research Network, Rochester (2017)
20. UK Government Office for Science: Distributed Ledger Technology: Beyond Block Chain. A report by the UK Government Chief Scientific Adviser (2016)
21. Böhme, R., Christin, N., Edelman, B., Moore, T.: Bitcoin: economics, technology, and governance. J. Econ. Perspect. 29(2), 213–238 (2015)
22. PERSONA project. http://persona-project.eecs.qmul.ac.uk/. Accessed 06 Apr 2019
23. Anscombe, G.E.M.: Modern moral philosophy. Philosophy 33, 1–19 (1958)
24. Pettit, P.: The cunning of trust. Philos. Public Aff. 24, 202–225 (1995)
25. Parsons, T.: Politics and Social Structure. Free Press, New York (1969)
26. Walker, M.U.: Moral Repair: Reconstructing Moral Relations After Wrongdoing. Cambridge University Press, Cambridge (2006)
27. Botsman, R.: Who Can You Trust? How Technology Brought Us Together and Why It Might Drive Us Apart. PublicAffairs, New York (2017)
28. Abdul-Rahman, A., Hailes, S.: A distributed trust model. In: New Security Paradigms Workshop, Langdale, Cumbria, UK (1997)
29. Sen, A.: Equality of What? In: McMurrin, S. (ed.) Tanner Lectures on Human Values, vol. 1, pp. 353–369. Cambridge University Press, Cambridge (1980)
30. Cohen, J.: An epistemic conception of democracy. Ethics 97, 26–38 (1986)
31. Habermas, J.: Between Facts and Norms: Contributions to a Discourse Theory of Law and Democracy. MIT Press, Cambridge (1996)
32. European Blockchain Observatory and Forum, a European Commission initiative to accelerate blockchain innovation and development. https://www.eublockchainforum.eu/. Accessed 06 Apr 2019

The Ascending Staircase of the Metaphor: From a Rhetorical Device to a Method for Revealing Cognitive Processes

Cristina Pagani1 and Chiara Paolini2

1 I.T.St. Aterno – Manthonè, Pescara, Italy
[email protected]
2 Universiteit Utrecht, Utrecht, The Netherlands
[email protected]

Abstract. This paper is concerned with advances in cognitive research through a comparative study of metaphor within educational contexts. By focusing on the role of metaphorical processes in reasoning, we investigate metaphor over a period of time running from 334 B.C. to 2014: specifically, from Aristotle’s seminal rhetorical theory – tradition acknowledges him as the founding father of metaphor as a research method and as a scientific tool – up to Lakoff and Johnson’s [1] and Gola’s [2] arguments. The pivotal role of metaphor in the evolution of linguistics and neuroscience is represented through three diagrams in a Cartesian reference system, highlighting its ascending staircase paradigm: the undisputed star of many essays and theories, as stated by Eco and Paci [3], either despised or cherished, as happens with any star. The abscissas (i.e., x-axis) and the ordinates (i.e., y-axis) draw the biography of metaphor: since its birth, when Aristotle describes its embellishment qualities in the linguistic labor limae, it has also been exalted as a sign of ingeniousness that opens up different research perspectives. This paper aims to clarify the development path of metaphor: until the seventeenth century, after losing the cognitive quality detected in Aristotle’s founding writings, it was diminished as a similitudo brevior, a “will-o’-the-wisp”, or sentenced to a sort of “linguistic deceit”. Furthermore, the paper aims to share a theoretical and methodological approach that releases metaphor from the rhetorical cage in which it has been confined by some ancient and modern authors of the rhetorical tradition. Indeed, we embrace the idea that metaphor is not merely a part of language, but reflects a primordial part of people’s knowledge and cognition. In so doing, we show who has pointed out, and how, that the pervasiveness of metaphor cannot be overlooked in human understanding and life, although, among the mysteries of human cognition, metaphor remains one of the most baffling.

Keywords: Educational research · Rhetorical theory · Framing strategies · Cognitive science · Polygonal chain

1 Introduction

The present paper describes the life path of metaphor, better known as the queen of tropes: starting from a philological and linguistic point of view, through a rigorous exploration of theories of its cognitive value and cognitive functions, it aims to address one of the most recurring questions of the XX century: is metaphor a real cognitive structure, or is it just a simple linguistic element? The main goal of this study is to contribute to the resolution of the ancient enigma that wrongly considers metaphor a linguistic deceit, in order to demonstrate its cognitive role in language, as already depicted by Aristotle in his seminal works, the Poetics and the Rhetoric, where he defines metaphor as the act of “giving the thing a name that belongs to something else” [4] p. 25.

2 The Ascending Staircase Paradigm of the Metaphor

In this section, we provide a general background for reconstructing the arduous life path of metaphor from a philological and linguistic perspective. We represent its evolution as an ascending staircase paradigm: all the studies dedicated to the queen of tropes are represented within an ideal system of Cartesian axes which describes the crucial milestones of the evolving path of the metaphor. The one-to-one relationship between the Aristotelian tradition and the main concepts of cognitive linguistics acts as a link between then and now. Aristotle is the starting point of our ascending staircase paradigm, as he is unanimously acknowledged as the father of metaphor. In his two seminal works dated from the 4th century B.C., the Poetics and the Rhetoric, he acknowledges the cognitive role of metaphor as a meaning transfer. Petrilli and Welby [5] emphasize the Aristotelian link between knowledge and metaphor, describing it as a useful device for the acquisition and increase of individual knowledge, a systematic procedure that expands the field of speakable language through the use of the linguistic tools available. The first major divergence occurred with the publication of the Rhetorica ad Herennium, attributed to Cornificius, in the late 80s B.C., followed by the Orator by Cicero in 46 B.C. Cicero carries out the real devaluation of the Aristotelian reading: in fact, he argues that metaphor is not a transfer, but a meaning substitution. In that sense, in the early 90s A.D., Quintilian’s definition of metaphor [6] as a shorter form of simile (metaphora brevior est similitudo) is rooted in Cicero’s mindset, and all the following rhetorical tradition is influenced by him. The Middle Ages represent the golden age of metaphor seen as a mere discourse embellishment: Mark Johnson observes how this trope, “treated traditionally under rhetoric, becomes a stylistic device divorced from serious philosophical arguments” [7] p. 9. The process is favored by the fact that in the Middle Ages rhetoric starts to detach itself from philosophy, specializing in matters of style and artistic creation. It is interesting to observe how pervasive Cicero’s mindset on metaphor remained, even throughout the Renaissance and into the early Modern age. As a matter of fact, both English philosophers Thomas Hobbes and John Locke argue for the condemnation of the queen of tropes, denying “figurative language a positive role in truth-oriented communication or thought” Musolff [8] p. 96. The Neapolitan philosopher and politician Giambattista Vico, in the heat of the age of Enlightenment, marked a turning point in the life path of metaphor: in the Scienza Nuova (1752), he defines it as “the most luminous and therefore the most necessary and frequent [trope]” [9] p. 132. According to Vico, this trope is the one that acts first in human communication. Humans have an extraordinary and natural talent for creating metaphors in order to talk about things they could not otherwise explain. In the XX century, metaphor became one of the trending topics in linguistic research. As demonstrated by Ivor Armstrong Richards, “thinking is radically metaphoric” [10] p. 49, because “we cannot get through three sentences of ordinary fluid discourse without [metaphor] […] we do not eliminate it or prevent it without great difficulty” [11] p. 92. In 1980, the American linguist George Lakoff and the philosopher Mark Johnson made a breakthrough by publishing the seminal essay “Metaphors we live by”, where, for the first time, metaphor is conceived within a theoretical framework. This crucial work introduces conceptual metaphor theory, in which metaphor is no longer seen as a part of language, but as a constituent of the conceptual level of mind, where it works as a semantic mapping across different meaning domains [1]. In the last decade, the Italian philosopher Elisabetta Gola has drawn attention to the role played by metaphor in natural language processing, describing the queen of tropes as a “particular cognitive operation” [12] p. 21.

3 A Cognitive Approach to Metaphor

The research tool that we adopt in this paper is not new; rather, it is the very tool inherent to the cognitive approach to metaphor. The cognitive view behind this approach is conducive both to the conceptualization of human language and to neuroscience [13]. Indeed, reconsidering the two-dimensional graphical analytical scope of a Cartesian coordinate system, we represent the arduous life path of metaphor using two-dimensional spaces. The evolution of metaphor is represented through three diagrams which highlight the ascending staircase paradigm of the star of the tropes. For their construction, we refer to [14].

3.1 Metaphor: A Growing Polygonal Chain

In Fig. 1, the abscissa (i.e., x-axis) and the ordinate (i.e., y-axis) show the biography of metaphor: from its birth, when Aristotle describes its embellishment qualities in the linguistic labor limae, to the authors who exalted metaphor as a sign of ingeniousness, able to open other worlds. The y-axis represents the authors who detected metaphor’s cognitive aspects in their studies. The x-axis shows, chronologically, the years of the authors’ publications. The points of the blue polygonal chain are obtained by uniting each author’s y-axis coordinate with the x-axis coordinate of the corresponding publication. The blue polygonal path of the first graph highlights the biographical stages of metaphor: its origin is fixed in Aristotle, and it is represented by the point 0. The growing blue polygonal chain shows how the frequency of authors between 334 B.C. and 1750, nearly two thousand years, is lower than the frequency from 1750 to 2014. Overall, over the last three hundred years the frequency has increased intensively. From Aristotle’s times to the present, the queen of tropes has revealed its true identity and cognitive skills, developing along an intensive and difficult road: it has gone through metaphorical linguistic studies, essays on rhetoric, semiotic treatises, and publications in the cognitive sciences that have finally “crowned” it as a real cognitive event and an inevitable process of human thought.
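For readers who wish to reproduce this kind of plot, the following is a minimal sketch of how such a growing polygonal chain can be drawn with Python and matplotlib. The handful of authors and dates listed in the code is a hypothetical illustrative subset, not the full set of milestones used for Fig. 1; only the construction principle described above (publication year on the x-axis, cumulative author index on the y-axis, points joined into a polygonal chain) is mirrored here.

# Illustrative sketch (hypothetical data): a growing polygonal chain with
# publication years on the x-axis and a cumulative author index on the y-axis.
import matplotlib.pyplot as plt

# Hypothetical subset of milestones; negative years stand for B.C.
milestones = [
    ("Aristotle", -334),
    ("Vico", 1752),
    ("Richards", 1936),
    ("Lakoff & Johnson", 1980),
    ("Gola", 2014),
]

years = [year for _, year in milestones]
indices = list(range(len(milestones)))  # Aristotle is point 0, then 1, 2, ...

# Join the (year, index) points into a blue polygonal chain with markers.
plt.plot(years, indices, "b-o")
for (name, year), idx in zip(milestones, indices):
    plt.annotate(name, (year, idx), textcoords="offset points", xytext=(5, 5))

plt.xlabel("Year of publication (negative values = B.C.)")
plt.ylabel("Cumulative index of authors")
plt.title("Growing polygonal chain of metaphor's cognitive milestones")
plt.show()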


Fig. 1. The growing polygonal chain is composed of the authors from 334 B.C. to 2014 who acknowledge metaphor’s cognitive aspect in their studies.

3.2 Metaphor: Ornatus Element vs. Cognitive Element

We provide a second Cartesian graph (see Fig. 2) which highlights the steady fluctuation of the ascending staircase paradigm of the metaphor. The y-axis is labeled with the authors’ names in chronological order: Aristotle is point 0. They are divided between those who lived from 334 B.C. to A.D. 1 and those who lived from A.D. 90 to A.D. 2014. The x-axis is labeled with the period of time, in particular the dates when the authors published their works about the queen of tropes. The graph shows a piecewise linear curve characterized by two colors: in the top quadrant, the blue piecewise linear curve connects all the most relevant authors who emphasize that metaphor is a powerful device to increase our knowledge. The above-mentioned authors recognize the cognitive quality detected in Aristotle’s writings. This positive view of metaphor was interrupted by classical rhetoric, which codified metaphor as a language embellishment. In fact, in the bottom right quadrant, the red piecewise linear curve connects the authors who diminish the queen of tropes as ornatus orationis (language ornamentation), among the exornationes (rhetorical figures). The red and blue piecewise lines clearly represent the tortuous life travelled over the centuries by the queen of tropes: the undisputed star of many essays and theories, sometimes denigrated, sometimes loved, as happens with stars.


Fig. 2. The piecewise linear curve represents the metaphor’s tortuous life, travelling over the centuries: the blue polygonal chain connects the most relevant authors who consider metaphor a powerful cognitive device; the red polygonal path connects the authors who diminish metaphor as ornatus orationis (language ornamentation).

3.3 Metaphor: Terms and Definitions

The third Cartesian graph (see Fig. 3) shows that the history and epistemology of metaphor are marked by a continuous tension between two opposite stances: on the one hand, a decorative view conceives of metaphor as a mere element of ornatus, while on the other hand, a cognitive perspective assigns metaphor a primary function in thought and knowledge [15].


Fig. 3. The two linear curves indicate two opposite stances: the definitions on the red polygonal chain present metaphor as a mere element of ornatus; the appellatives on the blue polygonal path assign metaphor a primary function in thought and knowledge.

In the Cartesian plane, the ordinate (y-axis) collects most of the authors who have investigated the use of metaphor, and on the abscissa (x-axis) their major studies, dating from 334 B.C. to 2014, are pointed out. In the bottom right quadrant, the red polygonal path is labeled with the heterogeneous appellatives that diminish and downgrade the queen of tropes: it is named translatio, similitudo brevior, discourse embellishment. This ornamental view, initiated by Latin rhetoric and widely accepted for centuries, deprives metaphor of the cognitive quality detected in Aristotle’s writings. Medieval thought and theology, and above all the Renaissance, generally consider metaphor as nothing more than an exclusive privilege of the very learned or an optional embellishment. The terms ignes fatui, perfect cheats, and deviant phenomenon describe how metaphor is stigmatized by some scholars of the seventeenth century, who consider deviations from literal language not only misleading but also deliberately deceitful.


Unfortunately, when considering the appellatives of the red polygonal path, we find no trace of a cognitive function of metaphor in Aristotle’s sense. Through the appellatives labeled on the blue polygonal chain, the cognitive value assigned by Aristotle becomes clearer. Indeed, terms such as “clothe words themselves with concepts” or “tools to understand the human encyclopedia” highlight the power of connecting language, thought, and reality. In the past few decades, metaphor has been exalted and praised as a sign of both talent and ingeniousness able to open other worlds, until, in 2014, it was crowned as an inevitable process of human thought and reasoning.

4 Metaphor as a Cognitive Device in Educational Activity

The core tenet of this paper can be found in Richards’s claim: “Metaphor is the omnipresent principle of language” [11] p. 92. His theory constitutes the theoretical and practical framework for the implementation of a teaching/learning pathway called “Teaching and learning metaphors 2.0”, in which the three diagrams are the reference learning object. The wide potential of metaphor for application in didactics is highlighted by the diagrams: the first and the second diagram throw light on ancient writers and contemporary theories which ascribe a cognitive power to metaphor; their curves are sharply increasing, following new cognitive and neuroscientific research. In the third diagram, the definitions conferred on metaphor by scholars point out its value as a cognitive and knowledge device, beyond its function as embellishment. The graphs become useful didactic tools for presenting metaphor as a powerful cognitive device and, in doing so, for overcoming the traditional didactics which anachronistically consider the queen of tropes a linguistic anomaly or an ornamental trait of discourse [16]. The aim of “Teaching and learning metaphors 2.0” is to characterize the complex interaction between metaphor and deductive reasoning in the development of cognitive processes. The educational approach at the basis of our student-centered didactic pathway considers metaphor a key cognitive device: in this sense, it focuses attention on the linguistic and reasoning details that “allow people to understand each other and negotiating meanings in everyday contexts” Ervas et al. [17] p. 649.

5 Conclusion and Future Research

The systematic devaluation of metaphor as a mere ornatus stands in stark opposition to the acknowledgment of metaphor as an actual cognitive event and an inescapable neuroscientific process. When metaphors are not understood as easily as literal expressions, a possible contributing factor to this effect is suggested by research in cognitive science. In this regard, we use a Cartesian reference system to show the tension between these two opposite stances in metaphorical studies, as captured by the term conceptual metaphor, which assumes that metaphor is not a mere linguistic phenomenon but rather is capable of shaping human thought on the basis of shared embodied experience. In so doing, we bring to light the cognitive emergence of metaphor through a philological and philosophical analysis of its history and epistemology. In this paper, we provide some aspects of the theory and application of cognitive linguistics and, more particularly, we investigate cognitive processes in educational contexts in order to implement a teaching/learning pathway called “Teaching and learning metaphors 2.0”, by which metaphor is seen as a key cognitive device that lets students become more aware of their learning process. At the same time, metaphor is “the only skill that cannot be acquired from others […] it is a sign of a gifted mind” Aristotle [4] pp. 58–61. We consider this to be a promising starting point for developing our educational model and for future research in this field. Among the latter, we are designing an experimental study to investigate Gibbs’s [18] findings.

References

1. Lakoff, G., Johnson, M.: Metaphors We Live By. University of Chicago Press, Chicago/London (1980)
2. Gola, E.: Metaphor and reasoning: Aristotle’s view revisited. In: Ervas, F., Sangoi, M. (eds.) Isonomia – Epistemologica (5), 25–38 (2014)
3. Eco, U., Paci, C.: The scandal of metaphor: metaphorology and semiotics. Poetics Today 4(2), 217–257 (1983)
4. Bywater, I. (trans./ed.): Aristotle, On the Art of Poetry. CreateSpace Independent Publishing Platform (2014)
5. Petrilli, S., Welby, V.: Signifying and Understanding: Reading the Works of Victoria Welby and the Signific Movement. De Gruyter Mouton, Berlin (2009)
6. Quintiliano: Institutio Oratoria. Utet, Torino (1968)
7. Johnson, M.: Philosophical Perspectives on Metaphor. University of Minnesota Press, Minneapolis (1981)
8. Musolff, A.: Ignes fatui or apt similitudes? The apparent denunciation of metaphor by Thomas Hobbes. Hobbes Stud. 18(1), 96–112 (2005)
9. Vico, G., Bergin, T.G., Fisch, M.H.: The New Science of Giambattista Vico. Cornell University Press, Ithaca (1948)
10. Richards, I.A.: Interpretation in Teaching. Routledge & Kegan Paul, London (1938)
11. Richards, I.A.: The Philosophy of Rhetoric. Oxford University Press, New York (1936)
12. Gola, E.: Metafora e mente meccanica. Creatività linguistica e processi cognitivi. Cuec Editrice, Cagliari (2005)
13. Indurkhya, B.: Metaphor and Cognition – An Interactionist Approach. Kluwer Academic Publishers, Dordrecht (1992)
14. Cheng, P.C.-H., Cupit, J., Shadbolt, N.: Supporting diagrammatic knowledge acquisition: an ontological analysis of Cartesian graphs. Int. J. Hum.-Comput. Stud. 54, 457–494 (2001)
15. Dirven, R., Frank, R., Pütz, M.: Cognitive Models in Language and Thought – Ideology, Metaphors and Meanings. De Gruyter, Berlin (2003)
16. Roth, W.-M.: Reading graphs: contributions to an integrative concept of literacy. J. Curriculum Stud. 34(1), 1–24 (2002)
17. Ervas, F., Gola, E., Rossi, M.G.: Metaphors and Emotions as Framing Strategies in Argumentation. EAPCogSci (2015)
18. Gibbs, R. (ed.): The Cambridge Handbook of Metaphor and Thought. Cambridge University Press, Cambridge (2008)