Business Modeling and Software Design: 7th International Symposium, BMSD 2017, Barcelona, Spain, July 3–5, 2017, Revised Selected Papers (Lecture Notes in Business Information Processing, 309) 9783319784274, 9783319784281, 3319784277

This book contains revised and extended versions of selected papers from the 7th International Symposium on Business Modeling and Software Design, BMSD 2017, held in Barcelona, Spain, in July 2017.


Table of contents :
Preface
Organization
Contents
A Visionary Way to Novel Process Optimizations
1 Introduction
2 Underlying Concepts
2.1 Process Domain
2.2 Knowledge Representation
2.3 Artificial Neuronal Networks
3 Objectives of an ANN Process Domain
4 Design of an ANN Process Domain
4.1 Neuronal Process Modeling
4.2 Neuronal Process Simulation
4.3 Neuronal Process Optimization
5 Demonstration of an ANN Process Domain
5.1 Theoretical Example Models
5.2 Practical Example Models
5.3 Practical Example Simulation
5.4 Practical Example Optimization
6 Evaluation
7 Conclusion
References
Microflows: Leveraging Process Mining and an Automated Constraint Recommender for Microflow Modeling
Abstract
1 Introduction
2 Related Work
3 Solution Approach
3.1 Microflow Principles
3.2 Microflow Lifecycle
4 Realization
4.1 BPMN to Microflow Specification Transformation
4.2 Microflow Constraint Mining
4.3 Recommender Service
4.4 Microflow Modeler and the Recommender Service
4.5 Microflow Error Recovery
5 Evaluation
5.1 BPMN Transformation
5.2 Microflow Constraint Mining
5.3 Microflow Modeler and Recommender Service Case Study
5.3.1 Recommender Service Training Set
5.3.2 Recommender Service Usage
5.3.3 Recommender Service Technical Evaluation
5.4 Microflow Error Recovery
6 Conclusion
Acknowledgments
References
IT Systems in Business: Model or Reality?
Abstract
1 Introduction
2 Early Years
2.1 Business Value and Data Quality
3 Covering Processes
3.1 Business Viewpoints
4 Years of Renewal
4.1 Business Models
5 Heterogeneity
5.1 Process Logic and Real Business
6 Types, Meaning and Use of Models
6.1 Implicit Modelling
6.2 Modelling for Software Design
6.3 High Level Business Modelling
6.4 Modelling Process Logic
6.5 Reference Models
6.6 OED and Stachowiak on Models
7 Conclusion: IT Systems as Models
References
Combining Business Process Variability and Software Variability Using Traceable Links
1 Introduction
2 Related Work
3 Background
3.1 Software Product Line Engineering
3.2 Business Process Modeling
3.3 Managing Variability in Business Process Modeling
4 Combined Variability Modeling
5 Industrial Case Study
5.1 Exemplary Sample Process
5.2 Software Product Line Engineering
5.3 Traceability
6 Evaluation
7 Conclusion
References
Enforcing Context-Awareness and Privacy-by-Design in the Specification of Information Systems
Abstract
1 Introduction
2 Basic Concepts
3 Problem Conceptualization
4 Background and Related Work
4.1 Context-Awareness
4.2 Privacy
5 SDBC
5.1 Justification
5.2 Relevance to Design Science
5.3 Outline
6 Weaving in Context-Awareness and Privacy
7 Illustrative Example
7.1 Case Briefing
7.2 Modeling the Border Security System
8 Conclusions
References
Towards an Integrated Architecture Model of Smart Manufacturing Enterprises
Abstract
1 Introduction
2 Methodology
3 Analysis
3.1 Excluding Non-architectural Concepts from ISA-95
3.2 Mapping ISA-95 to ArchiMate 3.0
3.3 Classifying Deficiencies in ArchiMate 3.0
3.4 Addressing the Deficiencies Found
4 Validation
4.1 Modelling
4.2 Impact Analysis
4.3 Performance Analysis
4.4 Effectiveness of Patterns
5 Related Work
6 Future Work
7 Conclusions
References
A Model Driven Systems Development Approach for NOMIS – From Human Observable Actions to Code
Abstract
1 Introduction
2 The NOMIS Modelling Approach
2.1 NOMIS Philosophical Foundations
2.2 NOMIS Vision
2.3 NOMIS Representation
3 Model Driven Systems Engineering
3.1 Modelling Languages
3.2 Model Transformations
4 A MDSD Approach for NOMIS
4.1 NOMIS Domain Specific Language
4.2 Transformation 1: Deriving System Services
4.3 Transformation 2: Deriving the Interface System
4.4 Transformation 3: Persisting Business Data
4.5 Transformation 4: Controlling System Services with State Machines
4.6 Transformation 5: Creating a Normbase
4.7 NOMIS Information Systems
4.8 Summary of the NOMIS MDSD Approach
5 Related Work
5.1 Diplans in TOA
5.2 Ontology Charts in OS
5.3 Aspect Models in EO
6 Conclusions and Future Work
References
Value Switch for a Digital World: The BPM-D® Application
Abstract
1 Improving Business Process Management
2 BPM for Strategy Execution and Digitalization
2.1 Discipline of Strategy Execution
2.2 Value-Switch for Digitalization
2.3 The Process of Process Management
3 Objectives of the Digitalization of the Process of Process Management
3.1 Focus on What Matters Most
3.2 Don’t Re-invent the Wheel - Integrate
3.3 Make Process Management Fun
4 Approach of the Digitalization of the Process of Process Management
4.1 Design of the BPM-D Application
4.2 Implementation of the BPM-D Application
5 Experiences with the First Pilot
5.1 Pilot Client Overview
5.2 Leveraging the BPM-D Application
5.3 Learnings and Further Development of the BPM-D Application
6 First Impact is Visible
References
A Systematic Review of Analytical Management Techniques Applied to Competition Analysis Modeling Towards a Framework for Integrating them with BPM
Abstract
1 Introduction
2 An Overview of Suitable Analytical Management Techniques for Competition Analysis that Could Be Integrated in the BPM Methodology
2.1 A Markov Chain Business Competition Modelling Analysis
2.2 The Game Theoretic Modelling Analysis Applied to Business Competition Analysis
2.3 The Cognitive Maps Approach in Modelling Analysis
3 Discussion - Conclusions
References
A Privacy Risk Assessment Model for Open Data
Abstract
1 Introduction
2 Previous Work
3 Privacy Threats and Opening Data
3.1 Disclosure of Real Identities
3.2 Personal Information Discovery Through Linking Data
3.3 Personal Information Discovery Through Data Mining
3.4 Data Utilization Versus Privacy
4 Proposed Privacy Risks Assessment Model
4.1 Open Data Attributes
4.2 Decision Engine
4.3 Privacy Risk Indicator (PRI)
4.4 Privacy Risk Mitigation Measures (PRMM)
5 PRI Model Implementation
5.1 Functional Components
5.2 Technical Implementation
6 Illustration Scenarios
6.1 Scenario S1: Open Crime Data Usage and Provisioning
6.2 Scenario S2: Open Social Data Provisioning
6.3 Scenario S3: Use of Restricted Archaeology Data
6.4 Scenario S4: Use of Physically Restricted Statistics Data
6.5 Scenario S5: Use of Physically Restricted Agency Data
7 Conclusions
References
Author Index


LNBIP 309

Boris Shishkov (Ed.)

Business Modeling and Software Design 7th International Symposium, BMSD 2017 Barcelona, Spain, July 3–5, 2017 Revised Selected Papers


Lecture Notes in Business Information Processing
Series Editors: Wil M. P. van der Aalst, RWTH Aachen University, Aachen, Germany; John Mylopoulos, University of Trento, Trento, Italy; Michael Rosemann, Queensland University of Technology, Brisbane, QLD, Australia; Michael J. Shaw, University of Illinois, Urbana-Champaign, IL, USA; Clemens Szyperski, Microsoft Research, Redmond, WA, USA


More information about this series at http://www.springer.com/series/7911


Editor Boris Shishkov Institute of Mathematics and Informatics (IMI), Bulgarian Academy of Sciences Sofia Bulgaria and Interdisciplinary Institute for Collaboration and Research on Enterprise Systems and Technology (IICREST), Bulgarian Academy of Sciences Sofia Bulgaria

ISSN 1865-1348 ISSN 1865-1356 (electronic) Lecture Notes in Business Information Processing ISBN 978-3-319-78427-4 ISBN 978-3-319-78428-1 (eBook) https://doi.org/10.1007/978-3-319-78428-1 Library of Congress Control Number: 2018937386 © Springer International Publishing AG, part of Springer Nature 2018 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Printed on acid-free paper This Springer imprint is published by the registered company Springer International Publishing AG part of Springer Nature The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

How did enterprises look 40 years ago today? What were the business process automation limitations without computers? How much harder was it to exchange information, not counting on global telecommunications and digital multimedia? Was it possible to externalize business processes without Web services and cloud infrastructures? Was it possible to develop truly adaptable information systems, not counting on sensor technology? Answering these questions would bring us to a conclusion that enterprises have been shifting to experience a growing dependency on IT over the past several decades. For this reason, it is not surprising that software engineering is becoming increasingly relevant with regard to enterprise developments. Hence, even though enterprise engineering (EE) and software engineering (SE) have developed separately as disciplines, it is currently important to bring together enterprise modeling and software specification; this would allow enterprises to adequately utilize current technology. Applying a modeling approach for closing that gap would assume establishing a common enterprise–software conceptual foundation; abstract models can essentially capture entities, processes, and regulations, no matter if this concerns software or an enterprise. Such a foundation would be useful in achieving enterprise– software alignment and traceability. Nevertheless, challenges would arise, related to the numerous enterprise–software-modeling perspectives: (a) In addressing an (enterprise) system, one would need to be able to model structure, dynamics, data and still keep all as a coherent whole. (b) In considering enterprise information systems, one would face the EE vs. SE viewpoints, needing nevertheless to keep the software under development consistent with its surrounding enterprise environment. (c) In modeling an enterprise/software system, one may take a black-box (functional) or a white-box (operational) perspective but it is necessary to keep the white-box models consistent with regard to the corresponding black-box models. (d) In considering context-awareness with regard to an enterprise information system, one should decide whether the goal is to optimize internal processes or to maximize the user-perceived effectiveness; this relates to the challenge of harmonizing the perspective of the (software) system and the perspective of the user. (e) In specifying software, one may need to weave in public values (such as privacy, transparency, accountability, etc.) that are essentially nonfunctional and thus need to be operationalized, in order to be actually reflected in the system’s functionality. Thus, we have many modeling viewpoints but we need overall consistency in order to be able to effectively bring together enterprise modeling (that is mainly rooted in social theories) and software specification (that is mainly rooted in computing paradigms), as discussed during the panel of the seventh edition of BMSD – the International Symposium on Business Modeling and Software Design. Referring to the LNBIP 275 preface: bringing together all those enterprise engineers and software engineers who are inspired to search for solutions on further bridging business modeling and software design is of key importance to the BMSD community.


BMSD (http://www.is-bmsd.org) is an annual international symposium that brings together researchers and practitioners who are considering these challenges. Since 2011, we have enjoyed seven successful BMSD editions. The first BMSD edition (2011) took place in Sofia, Bulgaria, and the theme of BMSD 2011 was: “Business Models and Advanced Software Systems.” The second BMSD edition (2012) took place in Geneva, Switzerland, and the theme was: “From Business Modeling to Service-Oriented Solutions.” The third BMSD edition (2013) took place in Noordwijkerhout, The Netherlands, under the theme: “Enterprise Engineering and Software Generation.” The fourth BMSD edition (2014) took place in Luxembourg, Grand Duchy of Luxembourg, and the theme was: “Generic Business Modeling Patterns and Software Re-Use.” The fifth BMSD edition (2015) took place in Milan, Italy, with the theme: “Towards Adaptable Information Systems.” The sixth BMSD edition (2016) took place in Rhodes, Greece, under the theme: “Integrating Data Analytics in Enterprise Modeling and Software Development.” The seventh BMSD edition (2017) took place in Barcelona, Spain, and the theme of BMSD 2017 was: “Modeling Viewpoints and Overall Consistency.” In 2018, BMSD is going to Vienna, Austria, with the theme: “Enterprise Engineering and Software Engineering – Processes and Systems for the Future.” We are proud to have attracted distinguished guests as keynote lecturers, who are renowned experts in their fields: Norbert Gronau, University of Potsdam, Germany (2017), Oscar Pastor, Polytechnic University of Valencia, Spain (2017), Alexander Verbraeck, Delft University of Technology, The Netherlands (2017), Paris Avgeriou, University of Groningen, The Netherlands (2016), Jan Juerjens, University of Koblenz-Landau, Germany (2016), Mathias Kirchmer, BPM-D, USA (2016), Marijn Janssen, Delft University of Technology, The Netherlands (2015), Barbara Pernici, Politecnico di Milano, Italy (2015), Henderik Proper, Public Research Centre Henri Tudor, Luxembourg (2014), Roel Wieringa, University of Twente, The Netherlands (2014), Kecheng Liu, University of Reading, UK (2013), Marco Aiello, University of Groningen, The Netherlands (2013), Leszek Maciaszek, Wroclaw University of Economics, Poland (2013), Jan L.G. Dietz, Delft University of Technology, The Netherlands (2012), Ivan Ivanov, SUNY Empire State College, USA (2012), Dimitri Konstantas, University of Geneva, Switzerland (2012), Marten van Sinderen, University of Twente, The Netherlands (2012), Mehmet Aksit, University of Twente, The Netherlands (2011), Dimitar Christozov, American University in Bulgaria – Blagoevgrad, Bulgaria (2011), Bart Nieuwenhuis, University of Twente, The Netherlands (2011), and Hermann Maurer, Graz University of Technology, Austria (2011). The BMSD 2018 keynote lectures will be delivered by Jan Mendling, WU Vienna, Austria, and Roy Oberhauser, Aalen University, Germany. The Barcelona edition of BMSD demonstrated for a seventh consecutive year a high quality of papers and presentations as well as a stimulating discussion environment. In 2017, the scientific areas of interest to the symposium were: (a) Business Processes and Enterprise Engineering; (b) Business Models and Requirements; (c) Business Models and Services; (d) Business Models and Software; (e) Information Systems Architectures and Paradigms; and (f) Data Aspects in Business Modeling and Software Development.


BMSD 2017 received 57 submissions from which 28 papers were selected for publication in the symposium proceedings. Of these papers, 15 were selected for a 30-minute oral presentation (full papers), leading to a full-paper acceptance ratio of 26% (compared with 29% in 2016) – an indication of our intention to preserve a high-quality forum for the next editions of the symposium. The BMSD 2017 keynote lecturers and authors were from: Algeria, Austria, Bulgaria, Colombia, Greece, Germany, Japan, Kazakhstan, Luxembourg, Morocco, The Netherlands, Poland, Portugal, Spain, Sweden, Taiwan, Tunisia, Turkey, the UK, and USA (listed alphabetically) – a total of 20 countries (vs. 16 in 2016, 21 in 2015, 21 in 2014, 14 in 2013, 11 in 2012, and 10 in 2011) to justify a strong international presence. Four countries have been represented at all seven BMSD editions to date: Bulgaria, Germany, The Netherlands, and the UK, indicating a strong European influence. The high quality of the BMSD 2017 program was enhanced by three keynote lectures delivered by outstanding guests (see above), which were greatly appreciated by the participants, helping them get deeper insight particularly in process optimization, conceptual (goal) modeling, and inter-enterprise collaborations. Further, Oscar’s participation (together with other professors) in the BMSD 2017 panel was of additional value. BMSD 2017 was organized and sponsored by the Interdisciplinary Institute for Collaboration and Research on Enterprise Systems and Technology (IICREST), being technically co-sponsored by BPM-D. Cooperating organizations were Aristotle University of Thessaloniki (AUTH), Delft University of Technology (TU Delft), the UTwente Center for Telematics and Information Technology (CTIT), the Institute of Mathematics and Informatics (IMI) of the Bulgarian Academy of Sciences, the Dutch Research School for Information and Knowledge Systems (SIKS), and AMAKOTA Ltd. This book contains revised and extended versions of ten BMSD 2017 papers (selected out of the full papers), addressing a large number of BMSD-relevant research topics: Grum and Gronau capture complex relations within business processes, inspired by neural network techniques and the current data processing capabilities, and coming through process modeling and process simulation, for achieving business process optimizations in the end. Acknowledging the complexity of most current business processes, leading to difficulties in modeling all possible process variations and inspired by the microservices software architectural style (assuming fine-grained services accessible via lightweight protocols), Oberhauser and Stigler propose an approach for modeling agile business processes, which features Microflows – lightweight workflow planning and enactment of microservices. Inspired by a 35-year experience in the food processing industry, Suurmond analyzes in his paper the changing role of business process models and their relevance to supportive IT, no matter if an IT system is supposed to only support the business processes or to execute them. Aiming to enforce a link between the variability of business processes and the variability of their corresponding product platforms, Sinnhofer et al. propose a combined variability modeling in which the requirements for the organization as well as for the development of the product platform are identified together in an integrated fashion. 
Inspired by the challenge of bringing together enterprise modeling and software specification, Shishkov and Janssen present a design approach that allows for weaving context-awareness and privacy-by-design into the specification of information systems. Acknowledging the challenge of building information systems that are connected to the shop floor and aligned with the business needs of smart manufacturers, Franck et al. consider in their paper an enterprise-architecture-modeling approach concerning smart manufacturing companies and analyze ArchiMate 3.0 in terms of its coverage of the manufacturing domain. Furthering his work on human-centered information systems modeling, Cordeiro presents a related model-driven system development approach. Inspired by previous studies on business process management, Kirchmer et al. propose an approach for the digitalization of process management. Taking a business process modeling perspective, Karras and Papademetriou present an overview of some important analytical management computational techniques. Finally, Ali-Eldin et al., touching upon the data aspects in business modeling and software development, propose a privacy risk assessment model for open data structures. We hope that you will find the current LNBIP volume interesting. We believe that the ten selected papers will be a helpful reference with regard to the aforementioned topics.

February 2018

Boris Shishkov

Organization

Chair

Boris Shishkov - Bulgarian Academy of Sciences / IICREST, Bulgaria

Program Committee

Hamideh Afsarmanesh - University of Amsterdam, The Netherlands
Marco Aiello - University of Groningen, The Netherlands
Mehmet Aksit - University of Twente, The Netherlands
Paulo Anita - Delft University of Technology, The Netherlands
Paris Avgeriou - University of Groningen, The Netherlands
Dimitar Birov - Sofia University St. Kliment Ohridski, Bulgaria
Frances Brazier - Delft University of Technology, The Netherlands
Ruth Breu - University of Innsbruck, Austria
Barrett Bryant - University of North Texas, USA
Cinzia Cappiello - Politecnico di Milano, Italy
Kuo-Ming Chao - Coventry University, UK
Samuel Chong - Capgemini, UK
Dimitar Christozov - American University in Bulgaria – Blagoevgrad, Bulgaria
Jose Cordeiro - Polytechnic Institute of Setubal, Portugal
Claudio Di Ciccio - WU Vienna, Austria
Jan L. G. Dietz - Delft University of Technology, The Netherlands
Teduh Dirgahayu - Universitas Islam Indonesia, Indonesia
John Edwards - Aston University, UK
Hans-Georg Fill - University of Vienna, Austria/University of Bamberg, Germany
Chiara Francalanci - Politecnico di Milano, Italy
J. Paul Gibson - T&MSP – Telecom & Management SudParis, France
Rafael Gonzalez - Javeriana University, Colombia
Norbert Gronau - University of Potsdam, Germany
Clever Ricardo Guareis de Farias - University of Sao Paulo, Brazil
Jens Gulden - University of Duisburg-Essen, Germany
Ilian Ilkov - IBM, The Netherlands
Ivan Ivanov - SUNY Empire State College, USA
Marijn Janssen - Delft University of Technology, The Netherlands
Gabriel Juhas - Slovak University of Technology, Slovak Republic
Dmitry Kan - AlphaSense Inc., Finland
Stefan Koch - Johannes Kepler University Linz, Austria
Michal Krcal - Masaryk University, Czech Republic
Kecheng Liu - University of Reading, UK


Leszek Maciaszek - Macquarie University, Australia/University of Economics, Poland
Jelena Marincic - ASML, The Netherlands
Hermann Maurer - Graz University of Technology, Austria
Heinrich Mayr - Alpen Adria University Klagenfurt, Austria
Nikolay Mehandjiev - University of Manchester, UK
Jan Mendling - WU Vienna, Austria
Michele Missikoff - Institute for Systems Analysis and Computer Science, Italy
Dimitris Mitrakos - Aristotle University of Thessaloniki, Greece
Ricardo Neisse - European Commission Joint Research Center, Italy
Bart Nieuwenhuis - University of Twente, The Netherlands
Olga Ormandjieva - Concordia University, Canada
Mike Papazoglou - Tilburg University, The Netherlands
Marcin Paprzycki - Polish Academy of Sciences, Poland
Oscar Pastor - Universidad Politecnica de Valencia, Spain
Prantosh K. Paul - Raiganj University, India
Barbara Pernici - Politecnico di Milano, Italy
Doncho Petkov - Eastern Connecticut State University, USA
Gregor Polancic - University of Maribor, Slovenia
Henderik Proper - Luxembourg Institute of Science and Technology, Grand Duchy of Luxembourg
Ricardo Queiros - Polytechnic of Porto, Portugal
Jolita Ralyte - University of Geneva, Switzerland
Stefanie Rinderle-Ma - University of Vienna, Austria
Werner Retschitzegger - Johannes Kepler University Linz, Austria
Wenge Rong - Beihang University, China
Ella Roubtsova - Open University, The Netherlands
Irina Rychkova - University of Paris 1 Pantheon Sorbonne, France
Shazia Sadiq - University of Queensland, Australia
Andreas Sinnhofer - Graz University of Technology, Austria
Valery Sokolov - Yaroslavl State University, Russia
Richard Starmans - Utrecht University, The Netherlands
Hans-Peter Steinbacher - FH Kufstein Tirol University of Applied Sciences, Austria
Coen Suurmond - RBK Group, The Netherlands
Bedir Tekinerdogan - Wageningen University, The Netherlands
Ramayah Thurasamy - Universiti Sains Malaysia, Malaysia
Roumiana Tsankova - Technical University – Sofia, Bulgaria
Marten van Sinderen - University of Twente, The Netherlands
Alexander Verbraeck - Delft University of Technology, The Netherlands
Barbara Weber - Technical University of Denmark, Denmark
Roel Wieringa - University of Twente, The Netherlands
Dietmar Winkler - Vienna University of Technology, Austria
Shin-Jer Yang - Soochow University, Taiwan
Benjamin Yen - University of Hong Kong, SAR China
Fani Zlatarova - Elizabethtown College, USA


Invited Speakers

Norbert Gronau - University of Potsdam, Germany
Oscar Pastor - Polytechnic University of Valencia, Spain
Alexander Verbraeck - Delft University of Technology, The Netherlands


Contents

A Visionary Way to Novel Process Optimizations: The Marriage of the Process Domain and Deep Neuronal Networks
Marcus Grum and Norbert Gronau - 1

Microflows: Leveraging Process Mining and an Automated Constraint Recommender for Microflow Modeling
Roy Oberhauser and Sebastian Stigler - 25

IT Systems in Business: Model or Reality?
Coen Suurmond - 49

Combining Business Process Variability and Software Variability Using Traceable Links
Andreas Daniel Sinnhofer, Peter Pühringer, Felix Jonathan Oppermann, Klaus Potzmader, Clemens Orthacker, Christian Steger, and Christian Kreiner - 67

Enforcing Context-Awareness and Privacy-by-Design in the Specification of Information Systems
Boris Shishkov and Marijn Janssen - 87

Towards an Integrated Architecture Model of Smart Manufacturing Enterprises
Thijs Franck, Maria-Eugenia Iacob, Marten van Sinderen, and Andreas Wombacher - 112

A Model Driven Systems Development Approach for NOMIS – From Human Observable Actions to Code
José Cordeiro - 134

Value Switch for a Digital World: The BPM-D® Application
Mathias Kirchmer, Peter Franz, and Rakesh Gusain - 148

A Systematic Review of Analytical Management Techniques Applied to Competition Analysis Modeling Towards a Framework for Integrating them with BPM
Dimitrios A. Karras and Rallis C. Papademetriou - 166

A Privacy Risk Assessment Model for Open Data
Amr Ali-Eldin, Anneke Zuiderwijk, and Marijn Janssen - 186

Author Index - 203

A Visionary Way to Novel Process Optimizations: The Marriage of the Process Domain and Deep Neuronal Networks

Marcus Grum and Norbert Gronau

University of Potsdam, 14482 Potsdam, Germany
[email protected]

Abstract. Modern process optimization approaches build on various qualitative and quantitative tools, but are mainly limited to simple relations in different process perspectives such as cost, time or stock. In this paper, a new approach is presented which focuses on techniques from the area of Artificial Intelligence to capture complex relations within processes. Hence, a fundamental value increase is intended to be gained. Existing modeling techniques and languages serve as basic concepts and try to realize the junction of apparently contradictory approaches. This paper therefore draws a vision of promising future process optimization techniques and presents an innovative contribution.

Keywords: Process modeling · Artificial Intelligence · Machine learning · Deep neuronal networks · Knowledge Modeling Description Language · KMDL · Process simulation · Simulation process building · Process optimization

1 Introduction

The great potential of Artificial Neural Networks (ANN) has been well known for nearly four decades. In general, those techniques copy the capabilities and working behavior of the brain by simulating a network of simple nerve cells. Early ANN architectures go back to the 1940s; numerous improvements can be found between the late 1980s and 2000 [34]. Because of their ability to learn non-linear relations, to generalize correctly and to build biologically motivated, efficiently working structures, ANN have been applied successfully in various contexts such as music composition, banking issues, medicine, etc. Even simple processes have been modeled by means of ANN [4]. Nowadays, in times of big data, enormous amounts of data are available and computing power has increased immensely, and with this the possibility to create bigger and more complex networks. Although the collection of process data has become easy, a neuronal modeling and decoding of complex processes has not been realized.

© Springer International Publishing AG, part of Springer Nature 2018. B. Shishkov (Ed.): BMSD 2017, LNBIP 309, pp. 1–24, 2018. https://doi.org/10.1007/978-3-319-78428-1_1


Hence, the following research will focus on deep learning with ANN, with the intention to answer the following research question: "How can the capability of ANN to create efficiently working structures be used for process optimizations?" This paper does not intend to draw an all-embracing description of concrete, technical realizations of those novel process optimization techniques. It intends to set a first step towards realizing the conjunction of the process modeling, simulation and optimization domain on the one hand and the ANN domain on the other hand. Hence, the sub research questions are:

1. "How can a process modeling language be transported to a neuronal level?"
2. "How can neuronal processes be modeled?"
3. "How can neuronal models be used in process simulations?"
4. "How can neuronal networks be used in order to optimize processes?"

Fig. 1. The ANN process domain as intersection of process and ANN domain.

As Fig. 1 visualizes, the ANN Process Domain is built on the following definitions: A Neuronal Process Modeling (NPM) is referred to as the modeling of processes on a neuronal level with a common process modeling language, the reinterpretation of common process modeling based on that understanding, as well as their difference quantity. The Neuronal Process Simulation (NPS) is referred to as the process simulation of common process models considering ANN as knowledge models of process participants (persons and machines), the simulation of common process models reinterpreted as deep neuronal networks, the simulation of neuronal processes reinterpreted as organizational processes, and their difference quantity. The Neuronal Process Optimization (NPO) is referred to as common process optimization techniques that are realized on a neuronal level (e.g. double-loop learning on a neuronal level), process optimizations that can be realized because of the learning capabilities of ANN in the domain of common process models, as well as their difference quantity. The research approach is intended to be design-oriented, as Peffers proposes [25,26], such that the paper is structured as follows: Sect. 2 presents underlying concepts, Sect. 3 derives objectives for NPM, NPS and NPO, and Sect. 4 provides corresponding designs, followed by their demonstration and evaluation. Sect. 7 concludes the paper.

2 Underlying Concepts

Subsection 2.1 starts with the selection of a modeling approach and the question of how processes can be simulated and optimized, Subsect. 2.2 refers to underlying knowledge generation concepts, and a further subsection introduces ANN.

2.1 Process Domain

Following the fundamental procedure model for simulation studies of Gronau [9], a model creation is realized after the modeling purpose has been defined, analyzed and corresponding data has been collected. Hence, the following starts with modeling issues. Afterwards, as the model is valid, simulation studies are realized and simulation results collected, analyzed and interpreted. As changes or optimizations are required, adjustments are defined and simulations tested until a sufficient solution has been identified. Best solution options will be implemented, of course. The following characterizes required fields of the process domain following this procedural logic. Process Modeling. Starting from a basic definition of models, which refer to simpler mappings of relevant attributes of the real world with the intention to reduce the complexity of the real world with respect to modeling objectives, process models can be understood as a homomorphous, time-based mapping of a real-world system focusing a sequence-based, plausible visualization [9]. According to Krallmann et al. [21], a system to be modeled consists of an amount of system elements, that are connected with an amount of system relations. As it is limited by a system border, the system environment and the system are connected with an interface to exchange system input and system output. For the modeling of process systems, several process modeling languages can be used. Considering organizational, behavior-oriented, informational and knowledge-oriented perspectives, Sultanow et al. identify the Knowledge Modeling Description Language (KMDL) to be superior in comparison to twelve common modeling approaches [37]. Because of the analogy with a human brain as knowledge processing unit, especially a knowledge process modeling is focused. Here, Remus gives an overview of existing modeling methods and a comparison of their ability to represent knowledge [28, p. 216f]. ARIS, EULE2, FORWISS, INCOME, PROMOTE and WORKWARE are only some representatives. Again, the KMDL can be identified to be superior because of its ability to overcome lacks in visualizations and analyses through the combination of several views such as the process view, activity view and communication view [14]. This language has been developed over more than ten years. Having collected experiences in numerous projects of various application areas such as software engineering,


product development, quality assurance, investment and sales, the evolution of the KMDL can be found in [12]. The current version refers to KMDL version 2.2, but the development of version 3.0 is in progress [10]. In addition to the modeling language, the KMDL offers a fully developed research method [11]. With its strengths in visualization and its focus on knowledge generation, the KMDL seems attractive for a transfer to the neuronal level. To the best of our knowledge, such a transfer has not been realized yet for any other process modeling language. With its intention to focus on the generation of knowledge following Nonaka and Takeuchi and the intention to transfer the learning potential of ANN, the KMDL enables the modeling of tacit knowledge bases and single or numerous knowledge transfers besides common processing issues. Hence, the KMDL is selected as the modeling language for the demonstration in Sect. 5. The current paper builds on the widespread KMDL version 2.2 [14].

Process Simulation. Once a valid process model has been created, a dynamic sequence and variations of this process can be simulated. Aiming to gain insights within a closed simulation system, the intention is to transfer insights to reality. For this, the following pre-conditions have to be fulfilled: process models have to provide completeness. This includes the registration of input data such as time, costs, participants, etc. Further, process models have to provide interpretability of decisions. Here, values of variables, state change conditions and transfer probabilities are included. Further, meta information has to be considered, for example the number of process realizations within a simulation. Besides further objectives, the following simulation scenarios can be evaluated quickly and at low costs: current sequences of operations, plans and process alternatives. Those evaluations can be realized before expensive adjustments to current process models (so-called as-is models) are implemented in the real world [9].

Process Optimization. As processes are adjusted with the intention to optimize them in regard to a certain objective, one speaks of process optimization. All activities and decisions that lead to a desired optimization of business processes are designated as business process optimization [9]. The success of an optimization is measured by key performance indicators, such as production time, failure and success rates, produced components, etc. Two basic approaches for business process optimization can be found, which are reflected in various methods and variations: (1) an approach called Continuous Improvement Process and (2) an approach called Business Process Reengineering.

(1) Originally inspired by a Japanese living and working philosophy called Kaizen, the management concept realizing a never-ending improvement of processes and products in small steps is referred to as Continuous Improvement Process (CIP). Following Imai, key principles are a feedback mentality and the continuous reflection of processes. The everlasting search for efficiency demands the identification and improvement of suboptimal processes, such that waste is reduced and eliminated. Further, the emphasis lies on continual steps rather than giant changes, which is connected to the key principle of evolution [18]. Corresponding management concepts follow cyclic procedures and can be found in numerous variations: Shewhart Cycle [36], Deming Wheel [5], a second Shewhart Cycle [6], PDSA Cycle [7], PDCA Cycle [23] and a second PDCA Cycle [19]. In general, those concepts contain planning activities (plan). Afterwards, the process is implemented and carried out (do). Feedback is collected and compared to a planned output (check). Then changes are implemented constantly or revised, before this cyclic procedure is started again. The feedback collection can gather process data coming either from a process realization in the real world or from a simulated process. Since improvement ideas are generated during the process realization and the focus lies on single processes, improvements are carried out bottom-up.

(2) The concept of Business Process Reengineering (BPR) refers to the fundamental rethinking of as-is processes [15]. This is mostly connected with far-reaching changes up to a completely new design of processes and the organization itself. Process improvements are designed as if the organization were built anew, and current knowledge and state-of-the-art techniques are considered additionally. Here, improvements are carried out top-down and optimization results provide the following characteristics: decisions are decentralized, process step sequences are reorganized and different process variations can be considered. Further, the localization of working content is organized meaningfully, the need for control and required coordination efforts are reduced, and centralized contact points (e.g. for customer requests) are established.

In conclusion, within the field of process modeling, the modeling language KMDL is suitable for use as the basis for NPM. Since the KMDL does not yet provide simulation structures, ANN simulation capabilities can be used for an enhancement of the KMDL and enable NPS. Further, the learning capabilities of ANN shall be used for model creation (NPM purposes) and optimization (NPO purposes). Here, a CIP is attractive for first steps, and a BPR can realize further potentials.

2.2 Knowledge Representation

Nonaka and Takeuchi distinguish between explicit knowledge and tacit knowledge [24]. While the first can be verbalized and externalized easily, the second is hard to detect. The following four knowledge conversion types can be distinguished: – An internalization is the process of integration of explicit knowledge in tacit knowledge. Here, experiences and aptitudes are integrated in existing mental models. – A socialization is the process of experience exchange. Here, new tacit knowledge such as common mental models or technical ability are created. – An externalization is the process of articulation of tacit knowledge in explicit concepts. Here, metaphors, analogies or models can serve to verbalize tacit knowledge.


– A combination is the process of connection of available explicit knowledge, such that new explicit knowledge is created. Here, a reorganization, reconfiguration or restructuring can result in new explicit knowledge.

With the intention to focus on the potential of the human brain and its generation of knowledge, the knowledge generation concepts of Nonaka and Takeuchi seem attractive for modeling on a neuronal level. Since the KMDL is the only modeling language which builds on this concept, the KMDL is selected for demonstration purposes.

2.3 Artificial Neuronal Networks

Originally, neural networks were designed as mathematical models to copy the functionality of biological brains. Early research was done by Rosenblatt [32], Rumelhart et al. [33] and McCulloch and Pitts [22]. As the brain connects several nerve cells, so-called neurons, by synapses, those mathematical networks are composed of several nodes, which are related by weighted connections. As the real brain sends electrical activity typically as a series of sharp spikes, the mathematical activation of a node represents the average firing rate of these spikes. As human brains show very complex structures and are confronted with different types of learning tasks (unsupervised, supervised and reinforcement learning), various kinds of network structures have been established, which all have advantages for a certain learning task. There are, for example, Perceptrons [31], Hopfield Nets [17], Multilayer Perceptrons [1,33,38], Radial Basis Function Networks [2] and Kohonen maps [20]. Networks containing cyclic connections are called feedback or recurrent networks. The following focuses on Multilayer Perceptrons and recurrent networks being confronted with supervised learning tasks. Here, input and output values are given and learning can be carried out by minimizing a differentiable error function through adjusting the ANN's weighted connections. For this, numerous gradient descent methods can be used, such as backpropagation [1,27], PROP [29], quickprop [8], conjugate gradients [16,35], L-BFGS [3], RTRL [30] and BPTT [39]. As the weight adjustment can be interpreted as a small step in a direction of optimization, the fixed step size can be varied to reduce great errors quickly. A learning rate decay can be used to reduce small errors efficiently, and a momentum can be introduced to avoid local optima. In this stepwise optimization, analogies to continuous process optimizations can be found (see Sect. 4.3). Since neuronal networks model human brains and the knowledge of a certain learning task, the following refers to neuronal networks as neuronal knowledge models. Those represent a current state of knowledge, the capability to generate new knowledge through their activation and interaction, and the possibility to transfer knowledge or further process-relevant objects within process simulations.
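To make the supervised learning setting concrete, the following is a minimal sketch (not taken from the paper) of a small multilayer perceptron trained by gradient descent on a toy task; the XOR example, the network size and all variable names are illustrative assumptions.

```python
# Minimal sketch: a one-hidden-layer perceptron trained by gradient descent on a
# toy supervised task, illustrating the kind of "neuronal knowledge model"
# discussed here. All names and settings are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs
T = np.array([[0], [1], [1], [0]], dtype=float)               # targets (XOR)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden weights
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output weights
alpha = 0.5                                     # learning rate

for epoch in range(5000):
    # forward pass (activation of the network)
    H = np.tanh(X @ W1 + b1)                    # hidden activations
    Y = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))    # output activations
    E = 0.5 * np.sum((Y - T) ** 2)              # differentiable error function

    # backward pass (error propagation)
    dY = (Y - T) * Y * (1.0 - Y)
    dH = (dY @ W2.T) * (1.0 - H ** 2)

    # weight adjustment: one small step against the error gradient
    W2 -= alpha * H.T @ dY;  b2 -= alpha * dY.sum(axis=0)
    W1 -= alpha * X.T @ dH;  b1 -= alpha * dH.sum(axis=0)

print(E, Y.round(2).ravel())   # error typically shrinks; outputs approach 0, 1, 1, 0
```

The same scheme (activate, compare with the target output, propagate the error, adjust the weights) reappears in the neuronal process circles of Sect. 4.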

3 Objectives of an ANN Process Domain

As one assumes to have a given process model and aims to consider a neuronal network as a process participant's knowledge model within the simulation of that process model, the following objectives have to be considered coming from a modeling side:

1. Neuronal knowledge models have to be integrated within existing process models.
2. The same neuronal knowledge models have to be able to be integrated several times within a process model.
3. Neuronal knowledge models have to be integrated within process simulations.
4. Modeled environmental factors (material as well as non-material objects) have to be integrated with considered knowledge models.
5. Outcomes (materialized as well as non-materialized) of considered knowledge models have to be considered within the process model.

Further, objectives have to be considered coming from a neuronal techniques side:

6. Neuronal tasks have to be considered while neurons follow biological models. This includes both the neuron's everyday business and learning processes.
7. Parallel neuronal task realizations have to be considered within neuronal networks.
8. Time-dependent neuronal behaviors have to be considered within neuronal networks.
9. Sequential neuronal task realizations have to be considered within neuronal networks.
10. Different levels of neuronal task abstraction have to be considered in the neuronal process modeling and simulation.
11. Sensory information and knowledge flows have to be considered within the modeled neuronal network.
12. Actuator information and knowledge have to be considered as outcomes of neuronal networks.

Each identified objective of those domains is relevant for the transfer of a process modeling language and serves as input for the following sections.
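As a rough illustration of objectives 1-3, a neuronal knowledge model could be referenced from several elements of a process model and later handed to a simulation. The following is a hypothetical sketch, not the authors' implementation; all class and field names are assumptions.

```python
# Hypothetical sketch: attaching a neuronal knowledge model to process model
# elements so that the same model can be reused several times (objectives 1-3).
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class KnowledgeModel:
    name: str
    predict: Callable[[list], list]          # wraps a trained ANN's activation

@dataclass
class ProcessTask:
    label: str
    performer: str
    knowledge_model: KnowledgeModel          # the same instance may appear in many tasks

@dataclass
class ProcessModel:
    tasks: List[ProcessTask] = field(default_factory=list)

# one shared knowledge model, integrated at two different process steps (objective 2)
qa_model = KnowledgeModel("quality-check", predict=lambda x: [sum(x) > 1.0])
process = ProcessModel(tasks=[
    ProcessTask("incoming inspection", "worker A", qa_model),
    ProcessTask("final inspection", "worker B", qa_model),
])
```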

4 Design of an ANN Process Domain

The visionary way to a novel process optimization is drawn with the help of the following subsections. First, a design for neuronal process modeling is given. Then, a neuronal process simulation design follows. Finally, the neuronal process optimization is designed. All designs refer to the neuronal process circles of Fig. 2. Explanations can be found in the corresponding subsections.

Fig. 2. Neuronal process circles: (a) Neuronal Process Modeling Circle, (b) Neuronal Process Simulation Circle, (c) Neuronal Process Optimization Circle, each running through Plan, Do, Check and Act phases.

4.1 Neuronal Process Modeling

The following gives definitions of the concept of neuronal modeling. For this, basic definitions are given first; definitions based on them are given afterwards. Neuronal knowledge objects are defined to be neuronal patterns that evolve as currents over a certain period of time and cause a specific behavior of consecutive neurons. Those patterns can range from single time steps to long periods of time. Neuronal information objects are defined to be neuronal currents that serve as an interface from and to the environment, such as incoming sensory information and outgoing actuator information. Here, stored information is included as well. Considering those objects, a neuronal conversion is defined to be the transfer of neuronal input objects to neuronal output objects. In accordance with Nonaka and Takeuchi [24], the following neuronal conversion types can be distinguished:

– A neuronal internalization is the process of integration of explicit knowledge (neuronal information objects) in tacit knowledge. Here, experiences and aptitudes are integrated in existing mental models.
– A neuronal socialization is the process of experience exchange between neurons within a closed ANN. Here, new tacit knowledge such as common mental models or technical abilities is created.
– A neuronal externalization is the process of articulation of tacit knowledge (neuronal knowledge objects) in explicit concepts (neuronal information objects). Here, patterns can serve to verbalize tacit knowledge.
– A neuronal combination is the process of connection of available explicit knowledge (neuronal information objects), such that new explicit knowledge is created. Here, a reorganization, reconfiguration or restructuring can result in new explicit knowledge.

Neuronal input objects are defined to be sensory information objects and knowledge objects. Neuronal output objects are defined to be actuator information objects and knowledge objects. An atomic neuronal conversion is defined to be a neuronal conversion considering only one input object and only one output object. Complex neuronal conversions are defined to be neuronal conversions considering at least three neuronal objects of one neuron. Pure complex neuronal conversions consider only one neuronal conversion type, while impure complex neuronal conversions consider several neuronal conversion types such that one is not able to distinguish them. Abstract neuronal conversions are defined to be neuronal conversions considering neuronal objects of more than one transferring neuron, such that one is not able to identify atomic knowledge flows of participating neurons.

In conclusion, those definitions are the basis for the transfer of process modeling languages to the neuronal level. The logic behind this, which refers to the creation of practicable neuronal models, is inspired by standard learning procedures (such as [1,27] describe). The neuronal process modeling circle of Fig. 2(a) intends to visualize this. First, training and test data have to be prepared on the basis of as-is process input and as-is process output data (plan). Then, current ANN are activated by the available data (do). During the check, an activation result is compared to as-is process output data, such that neuronal errors can be generated. Weight adjustments are carried out during the act phase. Since this process is repeated until a stopping criterion is reached, a cyclic proceeding is established and results in neuronal process models. Since those can be used within neuronal process simulations, this is the basis for NPS and NPO. With this design, an artifact for the first two sub research questions was presented.
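As a rough illustration of this plan-do-check-act cycle, the creation of a neuronal process model from recorded as-is process data could look as sketched below. This is a hedged sketch under assumed interfaces, not the authors' tooling: activate(), error() and adjust_weights() stand for any trainable ANN implementation.

```python
# Sketch (hypothetical interfaces) of the neuronal process modeling circle of
# Fig. 2(a): plan, do, check, act, repeated until a stopping criterion is reached.
def build_neuronal_process_model(model, as_is_input, as_is_output,
                                 max_cycles=1000, tolerance=1e-3):
    # plan: prepare training and test data from recorded as-is process data
    split = int(0.8 * len(as_is_input))
    train_x, test_x = as_is_input[:split], as_is_input[split:]
    train_t, test_t = as_is_output[:split], as_is_output[split:]

    for _ in range(max_cycles):
        prediction = model.activate(train_x)        # do: neuronal activation
        error = model.error(prediction, train_t)    # check: compare with as-is output
        if error < tolerance:                       # stopping criterion
            break
        model.adjust_weights(prediction, train_t)   # act: neuronal weight adjustment

    # held-out data indicates whether the resulting knowledge model generalizes
    return model, model.error(model.activate(test_x), test_t)
```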

4.2 Neuronal Process Simulation

The following gives definitions of the concept of neuronal process simulation. For this, basic definitions are given and a simulation framework is drawn. For the simulation, the common views of the KMDL are brought into a strict hierarchy considering a 1:m relationship from lower-granularity views to higher-granularity views. This is visualized in Fig. 3 by a neuronal process pyramid.

Fig. 3. Neuronal process pyramid: the KMDL v2.2 views (communication view, process views, activity views) are extended (KMDL extension) by complex, abstract and atomic neuronal views down to a cortex with externalized information objects, together forming the Neuronal KMDL; granularity increases from low (top) to high (bottom).

Here, one can see at the top that the available views of the KMDL (version 2.2) are extended with further neuronal views, as they have been defined in Sect. 4.1.


Since all of them are supplemented with a neuronal simulation logic, they make up the Neuronal KMDL. The previously mentioned strict 1:m hierarchy refers to process views at the very top, with knowledge-intensive process steps being concretized by a hierarchy of activity views. Although in this figure only one hierarchy is visualized, each process step and its corresponding activities are broken down to atomic neuronal views at the very bottom via various abstract and complex neuronal views. Each of these represents a collection of connected neurons and is referred to as a neuronal subnet from here on. A further definition can be found in the realization of a single, discrete simulation step. This considers the time-dependent activation of all participating neuronal subnets. Some are activated by former subnets, some by an input of the simulation environment and some by a certain initial input (e.g. for tests by the simulation instance). The realization of a simulation sequence follows the underlying process model of lower granularity levels. Hence, some subnets are not activated at all, while some are activated simultaneously and some only under conditions. As some subnets are activated repeatedly during the simulation sequence, their current knowledge state can be reused during later activations. The production of realistic and plausible simulation results requires a successful training of the participating subnets, each showing sufficient approximation results and generalization characteristics. Systematic simulation runs can then be controlled by a simulation control framework of a higher order. Here, various simulation scenarios can be realized and compared easily. The simulation realization is visualized with the neuronal process simulation circle of Fig. 2(b). It includes the following phases: First, simulation scenarios have to be prepared in order to reflect as-is processes and to-be processes (plan). Then, current neuronal models are used on the basis of corresponding data (do). During the check, the neuronal behavior is compared to expected results, such that insights can be generated. If more simulations are required, different or optimized neuronal process models need to be used in the NPS, or the scenarios of the next simulation need to be adjusted; such scenario changes are carried out during the act phase. Since this process can be repeated until a stopping criterion is reached, a cyclic proceeding is established and results in neuronal process simulations. In conclusion, those definitions are the basis for knowledge transfers and show knowledge generation and forgetting processes during a neuronal process simulation, which can be compared to a company's intentional behavior. Further, this is the foundation for neuronal process optimizations. With this design, an artifact for the third sub research question was presented.
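A single discrete simulation step, as described above, could be sketched as follows. This is illustrative only; the scheduling interface, the subnet objects and the state handling are assumptions, not part of the paper.

```python
# Sketch (hypothetical names): one discrete neuronal process simulation step.
# Every subnet scheduled by the underlying process model is activated, either
# from the simulation environment or from the outputs of preceding subnets,
# and its state is kept so it can be reused in later activations.
def simulate_step(process_model, subnets, environment_input, state):
    outputs = {}
    for step in process_model.steps_for_current_tick():     # scheduled by the process model
        subnet = subnets[step.subnet_id]                     # trained neuronal subnet
        incoming = [outputs[p] for p in step.predecessors if p in outputs]
        stimulus = incoming or [environment_input.get(step.subnet_id)]
        outputs[step.subnet_id] = subnet.activate(stimulus, state.get(step.subnet_id))
        state[step.subnet_id] = subnet.current_state()       # reused in later activations
    return outputs, state
```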

4.3 Neuronal Process Optimization

Focusing on CIP corresponding to Kaizen, the optimization of organizational processes and learning processes with ANN have the following in common. Both require a set of input factors. The ANN is activated with a selected set of parameters, which mostly is a codification of real-world meanings on the basis of simulated currents. Organizational processes are fed with input parameters, which are required during the realization of that process. Further, both produce a set of output factors. ANN build on input activations, which are transferred and manipulated in various ways, such that a codification of real-world meaning can be generated. Organizational processes combine, manipulate and transform a given input, such that an output is produced. Following the idea of CIP, both kinds of outputs can serve as environmental feedback and indicate a fit between planned and achieved performances. Measured by key performance indicators, the performance of a process is improved in small steps. This reflects the CIP idea of evolution as follows. Organizational processes are improved by a change of any process parameters: a process can be realized on behalf of better production components, a change in process order, a better qualification of process participants, etc. This all results in a better process performance. Following Plaut et al. [27] and Bishop [1], the process of learning with neuronal networks is realized through the adjustment of the network's weights w(n) in dependence of a prediction error E, as Eq. 1 intends to clarify:

Δw(n) = m · Δw(n−1) − α · ∂E/∂w(n),  where 0 ≤ m ≤ 1.   (1)
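A minimal sketch of this update rule (gradient descent with momentum, with α the learning rate and m the momentum, as explained below) follows; the error function, its gradient and all names are illustrative placeholders, not from the paper.

```python
# Sketch of the weight update in Eq. (1): gradient descent with momentum.
def momentum_update(w, prev_delta, grad_E, alpha=0.1, m=0.9):
    delta = m * prev_delta - alpha * grad_E(w)   # Eq. (1): new weight adjustment
    return w + delta, delta                      # updated weight and stored adjustment

# toy usage: minimize E(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w, delta = 0.0, 0.0
for _ in range(200):
    w, delta = momentum_update(w, delta, lambda v: 2.0 * (v - 3.0))
print(round(w, 3))   # converges to 3.0, the minimum of E
```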

Here, α is standing for the learning rate, m is standing for the momentum and n is standing for the current training interval. As the performance of the current neuronal model is improved, the error E is reduced stepwise. Here, the following two types of interpretations can be drawn and be connected to CPI-related issues of the efficiency: (1) Interpretations of organizational processes as ANN. (2) Interpretations of biological processes under economic constraints. (1) As the improvement of organizational processes is interpreted as a kind of human version of gradient descent method, the desired performance of organizational processes can be interpreted as discrepancy or error E, and process changes can follow biological plausible techniques. Hence, an ANN training procedure can be used either to establish required models for neuronal process simulations, improve as-is-process models during the neuronal process optimization or establish new process models in the sense of BPR during the neuronal process optimization. During those optimizations, the following error environments can follow this interpretation plausibly. Figure 4 shows three commonly known error plateaus, which either can stand for the prediction error reduction of ANN during the training process or for discrepancy reduction of a desired performance of organizational processes during the improvement. The current optimization level is visualized by a black ball rolling metaphorically to the global optima. The optimal direction of that moment is indicated by an arrow. As the learning rate α is adjusted intelligently, numerous cost intensive training runs can be avoided. Figure 4(a) shows that the increase of a small step size

Fig. 4. Error environment examples showing heterogeneous characteristics (prediction error E over weights w): (a) increase step size, (b) avoid traps, (c) fast convergence.

Figure 4(a) shows that increasing a small step size can speed up the search for the global optimum, which can be found at the very right of the diagram. A reduction of continuous process improvement runs can be achieved with the help of the expertise and experience of process experts, who consider better changes and a greater number of changes in one run. If the momentum m is increased intelligently, an oscillation between local error plateaus can be avoided because recent weight adjustments are taken into account. Figure 4(b) shows that the use of a momentum can help to avoid traps during the search for the global optimum, because several local optima can be overcome thanks to this moment of inertia. Such traps can be avoided through the use of standards in the context of continuous process improvement runs, which help to reasonably disregard current trends, irregularities and invalid runs. If the learning rate and momentum are adjusted intelligently during the learning process, an effective optimization can be carried out. Figure 4(c) shows that a large step size is reasonable at the beginning but has to be decreased towards the end, such that the global optimum can be identified and the search does not oscillate around it. An efficient reduction of continuous process improvement runs can be realized with the help of patterns. Here, best practices, routines and procedure models can help to identify the best process options quickly and to implement attractive change behaviors reasonably. (2) If the improvement of ANN processes is interpreted as a kind of economic version of organizational mechanisms, the human brain is restricted by the same economic constraints, such as time, quality and cost. The desired performance of organizational processes can be injected as E in backpropagation approaches, and process changes can follow industry-specific techniques. If organizational processes are optimized with regard to time, either the throughput time is reduced (e.g. in production processes) or the number of processes realized in a given period of time is increased (e.g. by a production date). Analogously, the human brain tries to map a time-based behavior as it is relevant for a corresponding task, e.g. fishing with a spear; here, a complex sequence of actuator realizations within specified time frames is essential. Optimizing organizational processes with regard to quality refers to the improvement of quality factors that are connected to the process outcomes. As an example, one can consider a better production surface, measured by the rate of broken surfaces per month. Analogously, the human brain tries to shrink prediction errors, such that the rate of correct predictions within the corresponding task can be increased and qualitatively better results can be realized.

If organizational processes are optimized with regard to cost, the cost-intensive use of resources during the process realization is reduced. This can relate to the use of fewer and cheaper materials if available, the reduction of required space if possible, the realization of further tasks in parallel, etc. Since the learning process of the human brain is limited in space (by the size of the human head), the positioning of neurons representing a certain task and the creation of further connections are relevant as well. Additionally, learning processes and the working of the human brain are cost-intensive: proteins and transmitters are required, whose availability is limited, too. Further, they cannot be substituted, since cheaper materials are not available at all and cost-efficient working is essential. In conclusion, the approximation of tasks with neuronal networks can be seen as a trade-off between the maximal number of learned tasks, their approximation accuracy, and time- and cost-based constraints. Since the meaning of each element of the neuronal network can be mapped to an interpretation in the real world, changes in the neuronal network during the learning process can be interpreted directly within the corresponding context of their process representation. Inspired by these analogies, various tools from both sides, the process domain and the ANN domain, show promising application possibilities within the neuronal process optimization. The neuronal process optimization circle of Fig. 2(c) intends to underline this. The preparation of input data coming from as-is processes and of to-be process output data coming from to-be simulations is realized during the plan phase. Then, the current neuronal models are activated by the available data of selected scenarios (do). During the check, an activation result is compared with the to-be process output data, such that neuronal errors can be generated. Weight adjustments are carried out during the act phase. Since this process is repeated until a stopping criterion is reached, a cyclic procedure is established that results in optimized neuronal process models. With this design, an artifact for the fourth sub-research question was presented.
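Read as pseudocode, the neuronal process optimization circle corresponds to a simple training loop. The following Java sketch uses hypothetical names to show this plan-do-check-act cycle: activate the current model (do), compare the result with the to-be output (check), adjust the weights (act), and repeat until a stopping criterion is reached.

// Hypothetical sketch of the neuronal process optimization cycle of Fig. 2(c).
interface NeuronalProcessModel {
    double[] activate(double[] asIsInput);                      // do
    double error(double[] actual, double[] toBeOutput);         // check
    void adjustWeights(double[] actual, double[] toBeOutput);   // act, e.g. via backpropagation
}

class NeuronalProcessOptimization {
    void optimize(NeuronalProcessModel model, double[] asIsInput,
                  double[] toBeOutput, double targetError, int maxRuns) {
        for (int run = 0; run < maxRuns; run++) {                // plan: scenario data prepared beforehand
            double[] actual = model.activate(asIsInput);         // do
            if (model.error(actual, toBeOutput) <= targetError) {
                break;                                           // stopping criterion reached
            }
            model.adjustWeights(actual, toBeOutput);             // act
        }
    }
}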

5 Demonstration of an ANN Process Domain

The following subsections show the realization of the neuronal process modeling by means of the KMDL. For this, theoretical examples and corresponding process models are given that visualize the basic definitions. Then, practical examples follow. Five examples for neuronal process simulations shall visualize the interpretation of process models as deep neuronal networks and clarify the interplay of simulation output and expectable results. Further, two examples demonstrate the realization of neuronal process optimization.

5.1 Theoretical Example Models

The definitions from Sect. 4 are visualized in the following three theoretical examples: Firstly, atomic knowledge conversions on a neuronal level can be found in Fig. 5.

Fig. 5. Atomic neuronal conversions. (Color figure online)

In this figure, one can see a neuronal socialization in the top left, a neuronal externalization in the top right, a neuronal combination in the bottom right and a neuronal internalization in the bottom left. All of them are visualized in the activity view of the KMDL. The entity of persons as process participants (yellow) was mapped to neurons that interact on a neuronal level. In consequence, the entity of tacit knowledge objects (purple) is connected to neurons. The entity of the conversion (green) was mapped to the activity of a neuron that generates new knowledge based on the transfer of its input objects. The environment, as well as interaction possibilities with the environment, is modeled with the entity of a database (white rectangle). Further, neuronal information objects are stored within a database. In consequence, the shape of information objects (red) is connected to those databases. Secondly, complex neuronal conversions are visualized in Fig. 6.

Fig. 6. Complex neuronal conversions (pure). (Color figure online)

Again, in this figure, one can see a neuronal socialization in the top left, a neuronal externalization in the top right, a neuronal combination in the bottom right and a neuronal internalization in the bottom left. All of them are visualized in the activity view of the KMDL. Following the KMDL, conversions of the activity view can be repeated without control flow. Hence, each neuron can develop several neuronal knowledge objects or neuronal information objects over time, and the modeled neuronal objects represent the identified current knowledge of a certain neuron. Strict sequence modeling can therefore be realized with the help of the listener concept or the process view. Thirdly, an abstract neuronal conversion can be found in Fig. 7.

Fig. 7. Abstract neuronal conversion. (Color figure online)

In this figure, one can see several impure complex conversions simultaneously, which is why the visualized arrows are black, as the KMDL requires. Since more than one neuron (B1 and B2) is considered in that process model, an abstract level of neuronal conversions has been visualized. With those examples, the first sub-research question was answered.

5.2 Practical Example Models

Using the basic definitions of a neuronal process modeling, their transfer to practical examples from industry is intended. The following gives four practical examples. All of them serve as a fruitful domain for visualizing neuronal modelings, simulations and optimizations. Their modeling has been carried out on the basis of the neuronal modeling circle of Fig. 2(a). The required views look similar to the examples of Sect. 5.1, but the labels show concrete meanings. A first example focuses on the organization of goods depots. These can follow various strategies: for example, fixed places can be reserved for certain goods, or, alternatively, goods can be assigned an arbitrary place based on currently free spaces. Here, the human brain's strategies for storing memories can serve as biological inspiration for optimizing the depot organization of goods. A second example focuses on production processes. Here, goods are not needed constantly. In the meantime, they can be stored in goods depots and storage areas.

Once they are needed, they can be brought to the corresponding process step with the help of transportation elements [13]. While they are not required, a transportation element pauses and buffers the currently not needed goods. Alternatively, materials can be considered as just-in-time inventory, such that they do not have to be stored in expensive goods depots; here, the velocity of the transportation elements is adjusted depending on the production order. Analogies can be found in the human brain: the storage of memories can be organized like the storage of goods, or vice versa. A short-term memory (current currencies) deals with neuronal knowledge objects similarly to just-in-time inventory; here, neuronal knowledge objects are used at consecutive neurons as they are needed. Buffered goods are stored within long-term memories similar to goods depots; here, currencies are unlocked as they are needed within the current process. A third example focuses on specializations of production machines. As production processes can be considered as a single process network, machines are part of them. Since machines can show high specialization, the organization of production processes can be inspired by the organization of the human brain, where certain areas are responsible for a certain task and show high specialization as well. For example, the auditory cortex deals mainly with acoustic information, the visual cortex mainly with optical information, etc. A fourth example focuses on the outsourcing of tasks: often, an efficient task realization does not include the realization of all process steps within one's own company. As parts can be outsourced to external parties, analogies can be found in the human brain as well. Here, speed-relevant actions can be initiated by reflexes. This is efficient since a full cognitive task processing would be too slow. As an example, one can imagine the start of a sprinkler system: in case of a fire, it would not be effective to create action alternatives, evaluate them and select the best option; instead, firefighting starts immediately, like a reflex. With those examples, the second sub-research question was answered.

5.3 Practical Example Simulation

Based on the neuronal process simulation circle of Fig. 2(b), the following example focuses on verifying the spiral of knowledge of Nonaka and Takeuchi. Their model refers to the broadening of an epistemological knowledge base through the repetitive use of conversions between ontological entities [24]. Firstly, the simulation scenario is prepared during the plan phase. For this, neuronal process models have been created, as shown in Fig. 8. Within a triangle representing the system border of an example company, one can find a three-level hierarchy of divisions and corresponding processes, built by the positioning of several neuronal process pyramids as presented in Fig. 3. Within the example, the first level represents operational divisions, such as sales, production, marketing, etc. The second level stands for the processes of mid-level managers, who are responsible for the operational divisions. The top management can be found on the third level. Within this system border, the simulation scenario realizes the knowledge transfer of an innovation idea that is generated by a person during the production of goods, following the company's innovation process. Hence, corresponding simulation parameters are prepared.

Fig. 8. Neuronal process simulation for spiral of knowledge proof. (Legend: knowledge spiral; hierarchical border; system border; neuronal conversions: 1. inner-person specific, 2. person specific, 3. team specific, 4. company-wide, 5. cross-company-wide.)

Secondly, the simulation is carried out during the do phase. Here, five forms of knowledge transfer can be detected; they are visualized with the help of spirals in Fig. 8. Inner-person specific neuronal conversions can be found at the first ontological entity: persons. As production is realized by the manual work of a single person, an idea for a production process change is generated. Here, a simulation can carry out the neuronal conversions of this individual via the available views of the corresponding neuronal process pyramid, such that the generation of that idea becomes visible. Person specific neuronal conversions can be found at the first ontological entity as well. Here, the production process innovator follows the underlying process model and presents the intended process change to a colleague. Their conversation and the corresponding conversions can be simulated on the basis of their individual, inner-person specific process pyramids, which intersect at the process level. With this, an interaction can be simulated with the help of two person specific neuronal networks; the interaction itself can be controlled by an interaction network characterizing that process step. Further forms of transfer refer to team specific neuronal conversions. Collaborating in teams, the innovator could convince further divisions of his idea in group discussions. Since each participant provides his or her inner-person specific neuronal pyramid, a group-wise conversion builds on a collective knowledge base as well. This team knowledge is only accessible in this specific cultural circle. As several teams and divisions are interacting, company-wide neuronal conversions can be found at the fourth ontological entity, which refers to the entire company. Following the underlying process model, a suggestion is prepared for the corresponding mid-level manager. This person weighs suggestions and prepares contracts for the top-level manager. Here, their interaction and the production of corresponding knowledge and information objects can be simulated by means of neuronal process simulations.

Further forms of transfer can be identified at the ontological entity of companies as well; these are called cross-company-wide neuronal conversions from here on. As interactions go beyond the company-wide system border, e.g. because of the integration of external consultants in contractual negotiations with the top management and mid-level managers concerning cross-company cooperation or open innovation projects, a neuronal simulation can carry out knowledge transfers beyond that border and identify outgoing and incoming objects. With this, an optimal trade-off between chances and risks can be identified. Thirdly, the simulation results are compared with the expected behavior during the check phase. Through the simultaneous activation of several neuronal subnets that are connected by networks representing the underlying current process models, various processes and their interactions can be simulated via deep neuronal network techniques. These can show surprising side effects, such that a plausibility check is essential. In this example, this refers to the question whether the initial innovation idea can be transferred through the repetitive use of conversions between the selected ontological entities. If the epistemological knowledge base is broadened in this way, the spiral of knowledge can be proven with neuronal process simulations. Fourthly, if the simulation results are not clear and plausible, simulation parameters can be changed during the act phase, such that the knowledge spiral can be proved or disproved in the next simulation runs. This example shows that organizational working can be considered as a deep neuronal network. The results of such neuronal process simulations serve as a reference scenario and build the foundation for neuronal process optimizations. With this example, the third sub-research question was answered.

5.4 Practical Example Optimization

The following shows an optimization example that is limited by given space constraints. This underlines the relevance of a physically interpretable neuron positioning, which is best situated within real-world space. Here, Augmented Reality visualizations become attractive because they intersect the real and the augmented world. Further, without the possibility of situating models within the real world, 3D models can give a spatial impression of those examples as well. So, imagine a shopping mall with several floors that are connected by escalators. Here, the relevant real-world objects can serve for neuronal process optimizations. Among them, one can find neurons representing shops; those are placed inside the building, modeling their real physical positions. Further, one can find neurons for the escalators, which are located in the building center. Hence, neuronal subnets can stand for the available routing points. The movement of customers can be represented by currencies routed from neuron to neuron. Since the shopping mall is limited by walls, the physical size of the single shops within the building and the number of routing neurons are strictly limited. A visualization of the building and of the positioning and connecting of routing neurons can be found in Fig. 9. Here, three sub-figures visualize an as-is neuronal
process model (NPM) (a) and two neuronal process optimization (NPO) results (b) and (c). These have been realized following the neuronal process optimization circle of Fig. 2(c) with different optimization objectives. Neuronal input and output objects have not been visualized in Fig. 9 for the sake of clarity. Since the optimization shall focus on three types of customers, a different color has been used for each type.

Fig. 9. Neuronal process optimization in shopping mall. (Color figure online)

In Fig. 9(a), one can see an as-is model, which was created with the help of real-world process data and an ANN training process. Here, one can see the positioning of specific shops as intended by the shopping mall organizers. The simulation performance of this organization serves as a reference point for a neuronal process optimization: the position of the real-world shops within the shopping mall is questioned, and the adjustment of the shop sizes is addressed. In Fig. 9(b), one can see the result of a first NPO focusing on the maximization of the shopping mall's profit. During the plan phase, the training data was manipulated with regard to the desired profit increase. Hence, the corresponding data output represents the intended behavior of the to-be processes. Since the current neuronal process models did not reflect the desired performance, which became clear through their activation during the do phase, discrepancies were detected during the check phase. Then, process optimizations were carried out during the act phase through the adjustment of the network's weights. These led to a clearing of the network's connections: since not all connections were required, the learning and adaptation process led to a deletion of unattractive connections. The different connection sizes of the result indicate that some shops are visited more frequently than others and lead to more profit. In general, a cluster for each customer type can be identified. Customer type 1 can mainly be found on the left of the building, shops on the third floor mainly attract customer type 2, and floors 1 and 2 on the right are attractive for the third customer type. The real-world shop sizes can be adjusted correspondingly. In Fig. 9(c), one can see the second result of an NPO focusing on the attraction of new customers. Here, the training data was manipulated with regard to the desired customer increase during the plan phase. Further training runs taking that data into account led to a different clearing of the network's connections (phases do, check, act). The positioning of the most frequently visited shops
(showing the largest connections) can be found on the ground floor, and their real-world shop sizes can be optimized. Hence, the shopping mall satisfies most of the customers quickly and attracts new customers efficiently. Further shops that do not attract a large number of customers have been spread around the entire building, such that bargains can be provoked. The example describes a neuronal process optimization on the basis of biologically inspired learning procedures of the human brain. With regard to a specified process improvement dimension, an as-is process was optimized in two directions. A change within the deep neuronal network leads to directly interpretable changes in the real world. With those examples, the fourth sub-research question was answered.
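A small sketch may clarify one possible technical reading of the connection clearing mentioned above: after training, connections whose learned weight stays below a threshold are removed, and the surviving weights are interpreted as indicators for shop placement and size. The class and field names are assumptions for illustration only.

import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: pruning weak connections after a neuronal process optimization.
class Connection {
    String fromShop, toShop;
    double weight;               // learned connection strength (customer flow)
}

class ConnectionPruning {
    // Removes connections below a threshold; the surviving weights can be
    // interpreted as recommended shop adjacency and relative shop size.
    void prune(List<Connection> connections, double threshold) {
        for (Iterator<Connection> it = connections.iterator(); it.hasNext();) {
            if (it.next().weight < threshold) {
                it.remove();
            }
        }
    }
}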

6 Evaluation

In light of the demonstration artifacts of the previous section, the objectives of Sect. 3 have been addressed as follows. Objective 1 can be fulfilled by modeling neuronal knowledge models within the activity view characterizing a certain person. Here, a decomposition such as the one the neuronal process pyramid provides raises the process model granularity of the selected activity and connects all neuronal process models with common process models. Since the common activity view characterizes a corresponding process task of the process view, neuronal knowledge models are integrated within existing process models. Since a neuronal network characterizes entities of persons, a trained neuronal network can be reused in any activity (objective 2). As neuronal knowledge models can be activated and can evolve over time, they can be integrated within discrete process simulations easily (objective 3). Environmental factors modeled in a common activity view (material as well as non-material objects) serve as the interface for the activity view on a neuronal level; hence, objective 4 and objective 5 are considered as well. Further, objectives coming from the neuronal technique side have been considered as follows: As learning with neuronal networks is not affected by the concepts presented here, neuronal tasks can follow the neurons' biological models (objective 6), both in neuronal process simulations, as Sect. 5.3 shows in a simulation example, and in neuronal process optimizations, as Sect. 5.4 shows in two optimization examples. A parallel neuronal task realization within neuronal networks has been considered (objective 7), as can be seen in Fig. 6 (neuronal socialization and neuronal externalization) and Fig. 7 (abstract neuronal conversion); here, at least two neurons realize a parallel task processing. Objective 8 can be met as soon as recurrent connections are considered within the neuronal process models; then, time-dependent neuronal behaviors are considered within the neuronal networks. A sequential neuronal task realization within neuronal networks can be considered within the neuronal process modeling (objective 9), as the presented activity views characterize corresponding tasks of the process view. Since logical control-flow operators can be used here, sequential neuronal task processing can be modeled easily, as can be seen in the examples of Sects. 5.3 and 5.4. Further, a time-dependent
behavior of a network modeled within the activity view can result in sequential task processing. Objective 10 has been met, as can be seen in Fig. 7: here, the task “Neuronal Perception of Neuron Group B” models the activity of “Neuron B1” and “Neuron B2” on an abstract level. Further, knowledge objects, information objects, neurons and databases can be grouped and visualized on an abstract level. Sensory information and knowledge flows can be considered within the modeled neuronal network, as can be seen in Figs. 5 and 6: in both figures, possible sensory information flows can be seen at the bottom (neuronal internalization and neuronal combination), and possible knowledge flows can be seen at the top (neuronal socialization and neuronal externalization). Objective 12 can be met as follows: actuator information and knowledge have been considered as outcomes of neuronal networks, as can be seen in Figs. 5 and 6. In both figures, possible actuator information flows can be seen on the right (neuronal externalization and neuronal combination), and possible knowledge flows can be seen on the left (neuronal socialization and neuronal internalization). Considering the evaluation of the given objectives presented here, it becomes clear that an idea for every objective has been identified. This supports the feasibility of transferring the KMDL to a neuronal level, such that a neuronal process modeling, a neuronal process simulation and a neuronal process optimization can be built on this basis.

7 Conclusion

In this paper, a visionary way to novel process optimization techniques has been drawn, and its foundation has been realized by means of the KMDL. The main contributions and scientific novelties are the following: Definitions of a neuronal process modeling, a neuronal process simulation and a neuronal process optimization have been created. Objectives for transferring a common process modeling language to a so-called ANN process domain have been identified. Further, definitions for those concepts have been created and a modeling language has been transferred to the neuronal world, which includes the reinterpretation of existing shapes of the KMDL. On this basis, theoretical examples have been visualized by means of the KMDL. Further, analogies for the use of the concepts presented here in an industry context have been identified. With this, the intended transfer has been applied and proven, and hence the first sub-research question was answered. The second sub-research question was answered with the design of the neuronal process modeling circle. Its application was demonstrated in the second example, such that industry analogies could be identified. Further, it was required for the creation of the neuronal models used in the NPS in the third example. Lastly, it was used for the model creation of the as-is process in the fourth example. Further, the neuronal process simulation circle and the neuronal process optimization circle were designed, and analogies between neuronal learning procedures and process optimization procedures were drawn. Each application was demonstrated
in the third and fourth example, such that the third and fourth sub-research question were answered. Hence, the following potentials are suitable as next steps: The functional concretization of the previously presented concepts will be realized. Then, these will be realized as quantitative neuronal process modelings, simulations and optimizations. Further, a comparison of the concepts presented here with traditional results would be attractive as well. The application of the concepts presented here is assumed to cause a radical value increase. As simple and complex relations in different process perspectives like cost, time or stock can be considered, the prediction quality of process simulations can improve strongly, going beyond the prediction quality of simple regression models or of humans. Further, common optimization potentials can be estimated efficiently. Additionally, new optimization approaches and optimization potentials can be identified.

References

1. Bishop, C.: Neural Networks for Pattern Recognition. Oxford University Press Inc., Oxford (1995). ISBN 0198538642
2. Broomhead, D., Lowe, D.: Multivariate functional interpolation and adaptive networks. Complex Syst. 2, 321–355 (1988)
3. Byrd, R.H., Lu, P., Nocedal, J., Zhu, C.Y.: A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 16(6), 1190–1208 (1995)
4. Chambers, M., Mount-Campbell, C.A.: Process optimization via neural network metamodeling. Int. J. Prod. Econ. 79, 93–100 (2000)
5. Deming, W.: Elementary Principles of the Statistical Control of Quality. JUSE, Tokyo (1950)
6. Deming, W.: Out of the Crisis. MIT Press, Cambridge (1986)
7. Deming, W.: The New Economics. MIT Press, Cambridge (1993)
8. Fahlman, S.: Faster learning variations on back-propagation: an empirical study. In: Touretzky, D., Hinton, G., Sejnowski, T. (eds.) Proceedings of the 1988 Connectionist Models Summer School, pp. 38–51. Morgan Kaufmann, San Mateo (1989)
9. Gronau, N.: Geschäftsprozessmanagement in Wirtschaft und Verwaltung, 2nd edn. Gito (2017)
10. Gronau, N., Thiem, C., Ullrich, A., Vladova, G., Weber, E.: Ein Vorschlag zur Modellierung von Wissen in wissensintensiven Geschäftsprozessen. Technical report, University of Potsdam, Department of Business Informatics, esp. Processes and Systems (2016)
11. Gronau, N.: Process Oriented Management of Knowledge: Methods and Tools for the Employment of Knowledge as a Competitive Factor in Organizations (Wissen prozessorientiert managen: Methode und Werkzeuge für die Nutzung des Wettbewerbsfaktors Wissen in Unternehmen). Oldenbourg Verlag, München (2009)
12. Gronau, N.: Modeling and Analyzing Knowledge Intensive Business Processes with KMDL - Comprehensive Insights into Theory and Practice. GITO mbH Verlag, Berlin (2012)
13. Gronau, N., Grum, M., Bender, B.: Determining the optimal level of autonomy in cyber-physical production systems. In: Proceedings of the 14th International Conference on Industrial Informatics (INDIN) (2016)
14. Gronau, N., Maasdorp, C.: Modeling of Organizational Knowledge and Information: Analyzing Knowledge-Intensive Business Processes with KMDL. GITO mbH Verlag, Berlin (2016)
15. Hammer, M., Champy, J.: Reengineering the Corporation: A Manifesto for Business Revolution. Harper Business, New York (1993)
16. Hestenes, M.R., Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. 49(6), 409–436 (1952)
17. Hopfield, J.J.: Neural networks and physical systems with emergent collective computational abilities. PNAS 79(8), 2554–2558 (1982)
18. Imai, M.: Kaizen: The Key to Japan's Competitive Success. McGraw-Hill/Irwin, New York City/Huntersville (1986)
19. Ishikawa, K.: What is Total Quality Control? The Japanese Way. Prentice-Hall Inc., Upper Saddle River (1985)
20. Kohonen, T.: Self-Organization and Associative Memory, 3rd edn. Springer, New York (1989). https://doi.org/10.1007/978-3-642-88163-3. ISBN 0-387-51387-6
21. Krallmann, H., Frank, H., Gronau, N.: Systemanalyse im Unternehmen. Oldenbourg Wissenschaftsverlag (2001)
22. McCulloch, W.S., Pitts, W.: A Logical Calculus of the Ideas Immanent in Nervous Activity, pp. 15–27. MIT Press, Cambridge (1988). ISBN 0-262-01097-6
23. Moen, R., Norman, C.: Evolution of the PDCA. Google Scholar (2006)
24. Nonaka, I., Takeuchi, H.: The Knowledge-Creating Company: How Japanese Companies Create the Dynamics of Innovation. Oxford University Press, Oxford (1995)
25. Peffers, K., Tuunanen, T., Gengler, C.E., Rossi, M., Hui, W., Virtanen, V., Bragge, J.: The design science research process: a model for producing and presenting information systems research. In: 1st International Conference on Design Science in Information Systems and Technology (DESRIST), vol. 24, no. 3, pp. 83–106 (2006)
26. Peffers, K., Tuunanen, T., Rothenberger, M.A., Chatterjee, S.: A design science research methodology for information systems research. Manag. Inf. Syst. 24(3), 45–78 (2007)
27. Plaut, D.C., Nowlan, S.J., Hinton, G.E.: Experiments on learning backpropagation. Technical report CMU-CS-86-126, Carnegie-Mellon University, Pittsburgh, PA (1986)
28. Remus, U.: Process-oriented knowledge management. Design and modelling. Ph.D. thesis, University of Regensburg (2002)
29. Riedmiller, M., Braun, H.: A direct adaptive method for faster backpropagation learning: the RPROP algorithm. In: Proceedings of the IEEE International Conference on Neural Networks, San Francisco, pp. 586–591 (1993)
30. Robinson, A.J., Fallside, F.: The utility driven dynamic error propagation network. Technical report CUED/F-INFENG/TR.1, Cambridge University Engineering Department (1987)
31. Rosenblatt, F.: The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386–408 (1958)
32. Rosenblatt, F.: Principles of Neurodynamics. Spartan, New York (1963)
33. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning Internal Representations by Error Propagation, pp. 318–362. MIT Press, Cambridge (1986). ISBN 0-262-68053-X
34. Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015)
35. Shewchuk, J.R.: An introduction to the conjugate gradient method without the agonizing pain. Technical report, Carnegie Mellon University, Pittsburgh, PA, USA (1994)
36. Shewhart, W.: Statistical Method from the Viewpoint of Quality Control. Dover Publications Inc., New York (1939). Edited by W. Edwards Deming
37. Sultanow, E., Zhou, X., Gronau, N., Cox, S.: Modeling of processes, systems and knowledge: a multi-dimensional comparison of 13 chosen methods. Int. Rev. Comput. Softw. (IRECOS) 6, 3309–3319 (2012)
38. Werbos, P.J.: Generalization of backpropagation with application to a recurrent gas market model. Neural Netw. 1, 339–356 (1988)
39. Williams, R.J., Zipser, D.: Gradient-based learning algorithms for recurrent networks and their computational complexity. In: Chauvin, Y., Rumelhart, D.E. (eds.) Back-propagation: Theory, Architectures and Applications, pp. 433–486. Lawrence Erlbaum Publishers, Hillsdale (1995)

Microflows: Leveraging Process Mining and an Automated Constraint Recommender for Microflow Modeling

Roy Oberhauser and Sebastian Stigler

Computer Science Department, Aalen University, Aalen, Germany
{roy.oberhauser,sebastian.stigler}@hs-aalen.de

Abstract. Businesses and software development processes alike are being challenged by the digital transformation and agility trend. Business processes are increasingly being automated yet are also expected to be agile. Current business process modeling is typically labor-intensive and results in rigid process models. For larger processes it becomes arduous to consider all possible process variations and enactment circumstances. Contemporaneously, in software development microservices have become a popular software architectural style for partitioning business logic into fine-grained services accessible via lightweight protocols which can be rapidly and individually developed by small teams and flexibly (re)deployed. This results in an increasing number of available services and a much more dynamic IT service landscape. Thus, a more dynamic form of modeling, integration, and orchestration of these microservices with business processes is needed. This paper describes agile business process modeling with Microflows, an automatic lightweight declarative approach for the workflow-centric orchestration of semantically-annotated microservices using agent-based clients, graph-based methods, and the lightweight semantic vocabularies JSON-LD and Hydra. A graphical modeling tool supports Microflow modeling and provides dynamic constraint and microservice recommendations via a recommender service using machine learning of domain-categorized Microflows. To be able to utilize existing process model knowledge, a case study shows how Microflow constraints can be automatically extracted from existing Business Process Modeling Notation (BPMN) process files and transformed into flexible Microflow constraints, which can then be used to train the recommendation service. Further, it describes process mining of Microflow execution logs to automatically extract BPMN models and automated recovery for errors occurring during enactment.

Keywords: Business process modeling · Workflow management systems · Microservices · Service orchestration · Agent systems · Semantic technology · Declarative programming · Recommenders · Recommendation engines · Business process mining · Business Process Modeling Notation

1 Introduction

Congruent with and related to the digital transformation sweeping across businesses and industry, there is a growing emphasis on business agility and process automation. One key automation area is business processes (BP) or workflows, evidenced by $2.7
billion in spending on Business Process Management Systems (BPMS) (Gartner 2015). The automation of a BP according to a set of procedural rules is known as a workflow (WF) (WfMC 1999). A workflow management system (WfMS) defines, creates, and manages the execution of workflows (WfMC 1999). However, with regard to agility, these workflows are often rigid, and while adaptive WfMS can handle certain adaptations, they usually involve manual intervention to determine the appropriate adaptation. Business Process Model and Notation (BPMN) (OMG 2011) supports business process modeling (BPM) with a common notation standard. However, past BP models have hitherto not been accessible or leveraged in an automated way to support BPM. To support digital automation, WFs often utilize software services. One trend supporting agility in the development and deployment of software is the now popular application of the microservice architecture style (Fowler and Lewis 2014). It provides an agile and loosely-coupled partitioning of business capabilities into fine-grained services individually evolvable, deployable, and accessible with lightweight mechanisms. However, as the dynamicity of the service world increases, the need for a more automated and dynamic approach to service orchestration becomes evident. As the IT landscape becomes more complex and agile, it is evident that manual modeling will increasingly be handled by automation. Approaches have included service orchestration, where a single executable process uses a flow description (such as WS-BPEL) to coordinate service interaction orchestrated from a single endpoint. In contrast, service choreography involves a decentralized collaborative interaction of services (Bouguettaya et al. 2014), while service composition involves the static or dynamic aggregation and binding of services into some abstract composite process. While automated dynamic workflow planning could potentially remove the manual overhead involved in workflow modeling, a fully automated semantic integration process remains challenging, with one study indicating that it is achieved by only 11% of Semantic Web applications (Heitmann et al. 2012). Thus, in our view constraint-based declarative approaches toward process modeling provide maximum flexibility when searching the solution space for an optimal process model solution in an automated fashion. Hence, rather than pursue the heavyweight service-oriented architecture (SOA) and semantic web, we chose a pragmatic lightweight bottom-up approach. Analogous to the microservices principles, we use the term Microflow to mean lightweight workflow planning and enactment of microservices, i.e. a lightweight service orchestration of microservices. In our prior work, we described our declarative approach called Microflows for automatically planning and enacting lightweight dynamic workflows of semantically annotated microservices (Oberhauser 2017) using cognitive agents and investigated its resource usage and viability (Oberhauser 2016). In Oberhauser and Stigler (2017), we extended our work to transform existing BPMN models to Microflow constraints, as well as enabling bi-directional support for graphical modeling in BPMN tools via automated constraint extraction and BPMN generation from the execution log of a Microflow. Furthermore, automated error handling and on-the-fly replanning capabilities were extended to address the dynamic microservice landscape.

This paper contributes a graphical Microflow Modeler with drag-and-drop support that automatically generates the equivalent textual JSON Microflow constraints, provides a catalog of microservices currently available for a Microflow, and utilizes a Recommendation Service that suggests microservices or constraints based on prior Microflows. This can make Microflow modelers aware of constraints or microservices that occur frequently in Microflows within a certain domain. Note that the Microflow approach is not intended to address all facets of BPMS support, but focuses on a narrow area towards addressing the automatic orchestration of dynamic workflows given a multitude of microservices, and does so by using a pragmatic lightweight approach rather than a theoretical treatise. This paper is organized as follows: the next section discusses related work. Section 3 presents the solution approach, while Sect. 4 describes its realization. The solution is evaluated in Sect. 5, which is followed by a conclusion.

2 Related Work

In IBM Business Process Manager terminology, microflow is used to mean a transient non-interruptible BPEL (Web Services Business Process Execution Language) process (IBM 2015), whereas in our terminology a microflow is independent of any BPMS, choreography, or orchestration language. As to the combination of BPM with microservices, while Alpers et al. (2015) mention BPM with microservices, their focus is on collaborative BPM tool support services, presenting an architecture that groups them according to editor, management, analysis functionality, and presentation. Singer (2016) proposes a compiler-based actor-centric approach to directly compile Subject-oriented Business Process Management (S-BPM) models into a set of executable processes called microservices that coordinate work through the exchange of messages. In contrast, we assume our microservices preexist. With regard to the orchestration of microservices, related work includes Rajasekar et al. (2012), who describe the integrated Rule Oriented Data System (iRODS) for large-scale data management, which uses a distributed event-condition-action rule engine to orchestrate micro-services into conditional chain-oriented workflows, maintaining transactional properties through recovery micro-services. Alpers et al. (2015) describe a microservice architecture for BPM tools, highlighting a Petri Net editor to support humans with BPM. Sheng et al. (2014) survey research prototypes and standards in the area of web service composition. Although web service composition using the workflow technique (Rao and Su 2004) can be viewed as similar, our approach does not explicitly create an abstract composite service; rather, it can be viewed as automated dynamic web service orchestration using the workflow technique. Declarative approaches for process modeling include DECLARE (Pesic et al. 2007). A DECLARE model is mapped onto a set of LTL formulas that are used to automatically generate automata that support enactment. Adaptations with verification during enactment are supported, typically via GUI interaction with a human, whereby the changed model is reinitiated and its entire history replayed. As to inputs, DECLARE facilitates the definition of different constraint languages such as ConDec and DecSerFlow.

For combining multi-agent systems (MAS) and microservices, Florio (2015) proposes a MAS for decentralized self-adaptation of autonomous distributed components (Docker-based microservices) to address scalability, fault tolerance, and resource consumption. These agents, known as selfLets, mediate service decisions using partial knowledge and by exchanging messages. Toffetti et al. (2015) provide a position paper focusing on microservice monitoring and proposing an architecture for scalable and resilient self-management of microservices by integrating management functions into the microservices, wherein service orchestration is cited to be an abstraction of deployment automation (Karagiannis et al. 2014); microservice composition and orchestration are not addressed. Related standards include OWL-S (Semantic Markup for Web Services), an ontology of services for automatic web service discovery, invocation, and composition (Martin et al. 2004). Combining semantic technology with microservices, Anderson et al. (2015) present an OWL-centric framework to create context-aware applications, integrating microservices to aggregate and process context information. For a more lightweight semantic description of microservices, JSON-LD (Lanthaler and Gütl 2012) and Hydra (Lanthaler 2013; Lanthaler and Gütl 2013) provide a lightweight vocabulary for hypermedia-driven Web APIs and enable the creation of generic API clients. Kluza et al. (2013) provide a survey of recommendation techniques for BPM and discuss machine learning (ML) approaches, but do not address the use of neural networks and state that feature extraction from BPMN diagrams is still an unsolved task. According to their classification, we provide subject-based, position-based, and structural recommendations. As to BP recommenders, Chan et al. (2011) use a process fragment model and composition context graph with context matching to determine the requested service's behavior and implicitly infer the service's functionality to recommend a related or alternative service. They do not show an actual prototype user interface or a case study of it in use. Barba et al. (2013) propose a constraint-based BP recommendation system focused on planning and scheduling BP activities for performance goal optimization. Schobel and Reichert (2017) utilize machine learning to analyze historic BP performance, determine diverging processes, and recommend changes. Bobek et al. (2013) base recommendations on a Bayesian Network model that is built manually based on a configurable process. Our approach is to extract constraints and then use supervised learning via pre-classification by domain to automatically train an artificial neural multi-layer network to make recommendations. In general, in contrast to the above work, our contribution specifically focuses on microservices with an automatic lightweight declarative approach for the workflow-centric orchestration of microservices using agent-based clients, graph-based methods, and lightweight semantic vocabularies like JSON-LD and Hydra. The extraction of goals and constraints from existing BPM in conjunction with modelling recommendations is supported, and error handling permits dynamic recovery and replanning.

3 Solution Approach

Referencing the solution architecture of Fig. 1, the principles and process constituting the solution approach are elucidated below and are based on Oberhauser (2016, 2017). One primary difference of our solution approach compared to typical BPM is the
reliance on goal- and constraint-based agents using automated planners to navigate semantically-described microservices; thus, the workflow is dynamically constructed, reducing the overall labor involved in the manual modeling of rigid workflows that cannot automatically adapt to changes in the microservice landscape, analogous to the benefits of declarative over imperative programming.

Fig. 1. Solution concept.

3.1 Microflow Principles

The solution approach consists of the following principles: Microservice Semantic Self-description Principle: Microservices provide sufficient semantic metadata to support autonomous client invocation, such that the client state at the point of invocation contains the semantic inputs required for the microservice invocation. Our realization uses JSON-LD/Hydra. Client Agent Principle: For the client agent of Fig. 1, intelligent agents exhibit reactivity, proactiveness, and social ability; they manage a model of their environment and can plan their actions and undertake goal-oriented behavior (Wooldridge 2009). Nominal WfMS are typically passive, executing a workflow according to a manually determined plan (workflow schema). Because of the expected scale in the number of possible microservices, the required goal-oriented choices in workflow modeling and planning, and the autonomous goal-directed action required during enactment, agent technology seems appropriate. Specifically, we chose Belief-Desire-Intention (BDI) agents (Bratman et al. 1988) for the client realization, providing belief (knowledge), desire via goals, and intention utilizing generated plans that are the workflow. Graph of Microservices Principle: Microservices are mapped to nodes in a graph and can be stored in a graph database (see Fig. 1). Nodes in the graph are used to represent any workflow activity, such as a microservice. Nodes are annotated with properties. Directed edges depict the directed connections (flows) between activities, annotated via properties. To reduce redundant resource usage via multiple database instances, the graph database could be shared by the clients as an additional microservice.

Microflow as Graph Path Principle: A directed path of nodes in the graph corresponds to a workflow, a sequence of operations on those microservices, and is determined by an algorithm applied to the graph, such as shortest path. The enactment of the workflow involves the invocation of microservices, with inputs and outputs retained in the client and corresponding to the client state. Declarative Principle: Any workflow requirement specifications take the form of declarative goal and constraint modelling statements, such as the starting microservice type, the end microservice type, and constraints such as sequencing or branch logic constraints. As shown under Models in Fig. 1, these specifications may be (automatically) extracted from an existing BPM should one exist, or (partially) discovered via process execution log mining. Microservice Discovery Service Principle (Optional): We assume a microservice landscape to be much more dynamic, with microservices coming and going, in contrast to more heavyweight services. A microservice registry and discovery service (a type of Meta Service in Fig. 1) can be utilized to manage this and could be deployed in various ways, including centralized, distributed, or client-embedded, with voluntary microservice-triggered registration or multicast-triggered mechanisms. For security purposes, there may be a desire to avoid discovery (of undocumented microservices) and thus maintain a whitelist. Clients thus may or may not have a priori knowledge of a particular microservice. Abstract Microservices Principle (Optional): Microservices with similar functionality (search, hotel booking, flight booking, etc.) can be grouped behind an abstract microservice (a type of Meta Service in Fig. 1). This simplifies constraints, allowing them to be based on a group rather than having to be individually based. It also provides an optional level of hierarchy to allow concrete microservices to only provide a client with a link to the logical next abstract microservice(s) without having to know the actual concrete ones, since the actual concrete microservice followers can be numerous and rapidly change, while determining exactly which ones are appropriate can perhaps best be decided by the client in conjunction with the abstract microservice. Path Weighting Principle (Optional): Any follower of a service, be it abstract or concrete, can be weighted with a potentially dynamic cost that helps in quantifying and comparing one path with another in the form of relative cost. This also permits the navigation from one service to another to be dynamically adjusted should that path incur issues such as frequent errors or slow responses. The planning agent can determine a minimal cost path. Path Constraint Logic Principle (Optional): If the path weighting is insufficient and more complex logic is desired for assessing branching or error conditions, this logic can be provided in the form of constraints referencing scripts that contain the logic needed to determine the branch choice. Note that the Data Repository and Graph Database could readily be shared as a common service, and need not be confined to the Client.
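To illustrate the graph path and path weighting principles, the following Java sketch computes a cheapest path between a start and a goal microservice type over a weighted follower graph, here with a Dijkstra-style search. It is a simplified stand-in for the planning behavior described above; the data structures are assumptions, and a real client would query the graph database rather than an in-memory map.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch: Microflow planning as a cheapest-path search over the
// microservice graph (nodes = microservice types, weighted edges = follower links).
class MicroflowPlanner {
    // graph: microservice type -> (follower type -> path cost)
    List<String> plan(Map<String, Map<String, Double>> graph, String start, String goal) {
        Map<String, Double> cost = new HashMap<>();
        Map<String, String> previous = new HashMap<>();
        PriorityQueue<String> queue = new PriorityQueue<>(
                Comparator.comparingDouble((String n) -> cost.getOrDefault(n, Double.MAX_VALUE)));
        cost.put(start, 0.0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.equals(goal)) {
                break;
            }
            for (Map.Entry<String, Double> edge
                    : graph.getOrDefault(current, new HashMap<>()).entrySet()) {
                double candidate = cost.get(current) + edge.getValue();
                if (candidate < cost.getOrDefault(edge.getKey(), Double.MAX_VALUE)) {
                    cost.put(edge.getKey(), candidate);
                    previous.put(edge.getKey(), current);
                    queue.remove(edge.getKey());   // re-insert so the priority is refreshed
                    queue.add(edge.getKey());
                }
            }
        }
        if (!cost.containsKey(goal)) {
            return new ArrayList<>();              // no path found: planning fails
        }
        LinkedList<String> microflow = new LinkedList<>();
        for (String node = goal; node != null; node = previous.get(node)) {
            microflow.addFirst(node);              // reconstruct the ordered Microflow
        }
        return microflow;
    }
}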

3.2 Microflow Lifecycle

The Microflow lifecycle involves five stages as shown in Fig. 2. In the Microflow Modeling stage, Microflow specifications (goal and constraints), retained as JSON files in a repository (Fig. 1), are modeled: (1) graphically (in our Microflow Modeler and Constraint Recommender), (2) textually, (3) extracted and transformed via tools from existing BPMN process models, or (4) via process mining of execution logs (e.g., Microflow logs) and transformed to BPMN and then Microflows.

Fig. 2. Microflow lifecycle.

The Microservice Discovery stage involves utilizing a microservice discovery service to build a graph of nodes containing the properties of the microservices and links (followers) to other microservices, analogous to mapping the landscape. In the Microflow Planning stage, an agent takes the goal and other constraints and creates a plan known as a Microflow, finding an appropriate start and end node and using an algorithm such as shortest path to determine a directed path. In our opinion, a completely dynamic enactment without any planning (no schema) could readily lead to dead-end or circular paths, causing a waste of unnecessary invocations that do not lead to the desired goal and can potentially not be undone. This is analogous to following hyperlinks without a plan, which does not lead to the goal and requires backtracking. Alternatively, replanning after each microservice invocation involves planning resource overhead (CPU, memory, network), and since this is unlikely to dynamically change within the enactment lifecycle, we chose a pragmatic and lightweight approach from a resource utilization perspective: plan once and then enact until an exception occurs, at which point a necessary replanning is triggered. Further advantages of our approach, in contrast to a thoroughly ad-hoc approach, are that the client is assured that there is at least one path to the goal before starting, and that validation of various structural, semantic, and syntactic aspects can be readily performed. In the Microflow Enactment stage, the Microflow is executed by invoking each microservice in the order of the plan, typically sequentially, but it could involve parallel invocations. A replanning of the remaining Microflow can be performed if an exception occurs or if notified by the discovery service of changes to the set of microservices. A client should retain the Microflow model (plan) and be able to utilize the service interfaces and thus have sufficient semantic knowledge for enactment. The Microflow Analysis stage involves the monitoring, analysis, and mining of execution logs in order to improve future planning. This could be local, in a trusted environment, or it could be distributed. Thus, if invocation of a microservice has often resulted in exceptions, future planning for this client or other clients could avoid this troublesome microservice. Furthermore, the actual latency incurred for usage of a microservice could be tracked and shared between agents and taken into account as a type of cost in the graph algorithm.
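The plan-once, enact, and replan-on-exception strategy can be summarized in a few lines of Java. The sketch below is an assumption about how such a loop could be organized and is not the Jadex-based agent implementation itself; a real implementation would, for instance, bound the number of replanning attempts.

import java.util.List;

// Hypothetical sketch of Microflow enactment with replanning on exceptions.
interface Microservice { Object invoke(Object input) throws Exception; }
interface Planner { List<Microservice> plan(Object goalAndConstraints, Object currentState); }

class MicroflowEnactor {
    Object enact(Planner planner, Object goalAndConstraints, Object initialState) {
        Object state = initialState;
        List<Microservice> microflow = planner.plan(goalAndConstraints, state); // plan once
        int position = 0;
        while (position < microflow.size()) {
            try {
                state = microflow.get(position).invoke(state);   // enact the next step
                position++;
            } catch (Exception e) {
                // exception: replan the remaining Microflow from the current client state
                // (a real implementation would limit the number of replanning attempts)
                microflow = planner.plan(goalAndConstraints, state);
                position = 0;
            }
        }
        return state;  // goal reached
    }
}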

4 Realization

As various details of our Microflow realization and lifecycle were previously detailed in (Oberhauser 2016, 2017), a short summary is provided and the rest of this section details the newer extensions. Implementations of microservices are assumed to be REST compliant using JSON-LD and Hydra descriptions. For our prototype testing, REST (REpresentational State Transfer) and HATEOAS support (Fielding 2000) were integrated with Spring-boot-starter-web v. 1.2.4, which includes Spring Boot 1.2.4, Spring-core and Spring-web v. 4.1.6, and Embedded Tomcat v. 8.0.23; Hydra-spring v. 0.2.0-beta3 and Spring-hateoas v. 0.16 are also integrated. For JSON (de)serialization Gson v. 2.6.1 is used. Unirest v. 1.3.0 is used to send HTTP requests. As a REST-based discovery service, Netflix's open source Eureka v. 1.1.147 is used. The Microflow client uses the BDI agent framework Jadex v. 3.0-SNAPSHOT (Pokahr et al. 2005). Jadex's BDI nomenclature consists of Goals (Desires), Plans (Intentions), and Beliefs. Beliefs can be represented by attributes like lists and maps. Three agents were created: the DataAgent is responsible for providing and maintaining the data repository, the PlanningAgent generates a path through the graph as a Microflow, while the ExecutionAgent communicates directly with microservices to invoke them according to the Microflow. Neo4j and Neo4j-Server v. 2.3.2 are used as a client Data Repository. Microflow specifications (goals and constraints) are referred to as PathParameters and consist of the startServiceType, the endServiceType, and constraint tuples. Each constraint tuple consists of the target of the constraint (the service type affected), the constraint, and a constraint type (required, beforeNode, afterNode). For instance, target = “Book Hotel”, constraint = “Search Hotel”, and constraint type = “afterNode” would be read as: “Book Hotel” is after node “Search Hotel”, implying the Microflow sequencing must ensure that “Search Hotel” precedes “Book Hotel” (but does not require that it must be directly before it). During Microflow Planning, constraint tuples are analyzed, whereby any AfterNode is converted to a BeforeNode by swapping target and constraint, RequiredNode constraints are also converted to BeforeNode constraints, redundant constraints are removed, and the constraints are then ordered.
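The constraint handling described above can be condensed into a short Java sketch that converts afterNode and required constraints into beforeNode constraints and drops duplicates. The ConstraintTuple class and its fields are simplified assumptions for illustration, not the actual PathParameters implementation.

import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified sketch of normalizing Microflow constraint tuples before planning.
class ConstraintTuple {
    String target;      // the service type affected
    String constraint;  // the related service type
    String type;        // "required", "beforeNode", or "afterNode"

    ConstraintTuple(String target, String constraint, String type) {
        this.target = target; this.constraint = constraint; this.type = type;
    }
}

class ConstraintNormalizer {
    List<ConstraintTuple> normalize(List<ConstraintTuple> input) {
        Set<String> seen = new LinkedHashSet<>();   // used to drop redundant constraints
        List<ConstraintTuple> result = new ArrayList<>();
        for (ConstraintTuple c : input) {
            ConstraintTuple normalized;
            if ("afterNode".equals(c.type)) {
                // "target after constraint" becomes "constraint before target"
                normalized = new ConstraintTuple(c.constraint, c.target, "beforeNode");
            } else {
                // required and beforeNode constraints are treated as beforeNode here
                // (simplifying assumption)
                normalized = new ConstraintTuple(c.target, c.constraint, "beforeNode");
            }
            if (seen.add(normalized.target + "<" + normalized.constraint)) {
                result.add(normalized);
            }
        }
        return result;
    }
}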

4.1 BPMN to Microflow Specification Transformation

To leverage existing process model knowledge, a BPMN-to-Microflow (B2M) transformation tool is implemented in Java. It parses BPMN 2.0 files, automatically extracting the start and end node (goal) and any constraints, and generates a Microflow JSON specification file (left corner of Fig. 1) for the Microflow repository. The Java libraries camunda-bpmn-model and camunda-xml-model v. 7.6.0 are utilized for parsing. The tool supports the following BPMN elements: activities, events, gateways, and connections. Currently unsupported in the implementation for automated extraction are swimlanes, artifacts, and event subprocesses (throwing, catching, and interrupting events). Some of these can be manually modelled in the Microflows using scripting.
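
A minimal sketch of the parsing step, assuming the camunda-bpmn-model API, is shown below. It only lists the elements a transformation such as B2M works from; the derivation of constraints and the JSON output are omitted, and the file name is hypothetical.

```java
import org.camunda.bpm.model.bpmn.Bpmn;
import org.camunda.bpm.model.bpmn.BpmnModelInstance;
import org.camunda.bpm.model.bpmn.instance.EndEvent;
import org.camunda.bpm.model.bpmn.instance.SequenceFlow;
import org.camunda.bpm.model.bpmn.instance.StartEvent;

import java.io.File;

// Minimal sketch of reading a BPMN 2.0 file with camunda-bpmn-model and
// listing the elements a transformation like B2M works from. It is not the
// actual B2M tool; constraint derivation and JSON output are omitted.
public class BpmnExtractionSketch {
    public static void main(String[] args) {
        // hypothetical input file
        BpmnModelInstance model = Bpmn.readModelFromFile(new File("travelBooking.bpmn"));

        // Start and end events give the planner its start node and goal node
        for (StartEvent start : model.getModelElementsByType(StartEvent.class)) {
            System.out.println("start: " + start.getName());
        }
        for (EndEvent end : model.getModelElementsByType(EndEvent.class)) {
            System.out.println("end: " + end.getName());
        }

        // Sequence flows indicate ordering, from which before/after
        // constraints between activities could be derived
        for (SequenceFlow flow : model.getModelElementsByType(SequenceFlow.class)) {
            System.out.println(flow.getSource().getName()
                    + " -> " + flow.getTarget().getName());
        }
    }
}
```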

4.2 Microflow Constraint Mining

A MicroflowLog-BPMN mining tool is implemented in Java that automatically parses a Microflow execution log file (taking the Log in Fig. 1 as input) and generates a BPMN 2.0 file, which could in turn be automatically converted to a Microflow specification file, for instance if constraint extraction is desired. Since the tool generates a direct sequence of the actual path taken, the result is a simple sequence of tasks. Nevertheless, this can be helpful for providing a graphical depiction for human analysis and comparison, for determining issues, for debugging constraints, and as a reference or starting point for more complex models.
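
Generating a simple sequential BPMN model from a mined task sequence could look roughly as follows with the camunda-bpmn-model fluent builder. The log parsing of the actual MicroflowLog-BPMN tool is omitted, and the mined sequence shown is a made-up example.

```java
import org.camunda.bpm.model.bpmn.Bpmn;
import org.camunda.bpm.model.bpmn.BpmnModelInstance;
import org.camunda.bpm.model.bpmn.builder.AbstractFlowNodeBuilder;

import java.io.File;
import java.util.List;

// Sketch of turning a mined invocation sequence into a simple sequential
// BPMN model with the camunda-bpmn-model fluent builder. The actual
// MicroflowLog-BPMN tool also parses the execution log; that part is omitted.
public class LogToBpmnSketch {
    public static void main(String[] args) {
        // Hypothetical sequence as it might be mined from an execution log
        List<String> minedSequence =
                List.of("Preferences", "Search Hotel", "Book Hotel", "Payment");

        AbstractFlowNodeBuilder<?, ?> builder =
                Bpmn.createExecutableProcess("minedMicroflow").startEvent();
        for (String task : minedSequence) {
            builder = builder.serviceTask().name(task); // one task per log entry
        }
        BpmnModelInstance model = builder.endEvent().done();

        Bpmn.writeModelToFile(new File("mined.bpmn"), model);
    }
}
```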

4.3 Recommender Service

A Microflow modeler is confronted with many options and decisions: the number of possible microservices to consider can be quite large, and there are various constraints that may or may not be appropriate, yet if certain essential constraints or microservices are missing, the Microflow may be problematic. To assist the modeler and raise awareness of possibly relevant constraints as well as microservices, a Microflow Recommender Service (shown in Fig. 1) utilizing machine learning (ML) was realized. It creates an artificial multi-layer neural network using DeepLearning4J (DL4J) 0.7.2 and provides a global recommendation of microservice types frequently used together with an input type, as well as frequent before and after constraints for a selected microservice. Three layers were used: Layer 0 takes the different constraints as input and has 200 fixed outputs, Layer 1 has 200 inputs and 200 outputs, and Layer 2 takes 200 inputs and produces as output a probability distribution over the available constraints, with as many outputs as there are constraint inputs. Further relevant parameters are: a layer size of 200, 100 training epochs in total, a training batch size of 8 constraints, stochastic gradient descent optimization, a learning rate of 0.1 (the degree to which correct suggestions improve certainty), an RMS decay of 0.95 (the degree to which unused suggestions lose relevance), and truncated backpropagation, with the remaining parameters left at their defaults. Supervised learning using classification (pre-classification by domain) is used to train the network from a repository of Microflow specifications, one network per domain. All Microflow constraints within a domain are aggregated, and during training the ordering of the available constraint inputs is randomized. To provide a recommendation, a single time step is run to feed the current constraints through the network, and the network output is then converted into a JSON constraint recommendation.

The REST interface for the service (DELETE omitted) consists of:
• GET /domains – returns the list of available domains, with POST and DELETE to add and remove specific domains, respectively.
• POST /recommendation/{domain} – given a Microflow as JSON input and a selected node, returns a recommendation consisting of a list of global microservice types, a list of predecessor constraints, and a list of follower constraints.
• POST /train/{domain} – trains the system with an additional list of constraints.
• GET /train/{domain} – returns the list of constraints used to train the system.
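
For concreteness, a recurrent network with the stated hyperparameters could be configured in DL4J roughly as sketched below. This is an approximation only: the layer types and exact builder methods of the actual Recommender Service may differ between DL4J versions, and nConstraints stands for the size of the domain's constraint vocabulary.

```java
import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.BackpropType;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.Updater;
import org.deeplearning4j.nn.conf.layers.GravesLSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Rough sketch of a three-layer recurrent network with the hyperparameters
// reported above (layer size 200, SGD, learning rate 0.1, RMS decay 0.95,
// truncated backpropagation). The actual Recommender Service network and the
// DL4J 0.7.x builder details may differ; nConstraints is assumed to be the
// size of the constraint vocabulary of the domain.
public class RecommenderNetworkSketch {
    public static MultiLayerNetwork build(int nConstraints) {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                .learningRate(0.1)                        // degree to which correct suggestions improve certainty
                .updater(Updater.RMSPROP).rmsDecay(0.95)  // degree to which unused suggestions lose relevance
                .list()
                .layer(0, new GravesLSTM.Builder()
                        .nIn(nConstraints).nOut(200).activation("tanh").build())
                .layer(1, new GravesLSTM.Builder()
                        .nIn(200).nOut(200).activation("tanh").build())
                .layer(2, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                        .nIn(200).nOut(nConstraints).activation("softmax").build())
                .backpropType(BackpropType.TruncatedBPTT)
                .build();
        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        // training would use net.fit(...); a single-step recommendation would
        // feed the current constraints with net.rnnTimeStep(...)
        return net;
    }
}
```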

4.4 Microflow Modeler and the Recommender Service

The Microflow Modeler Tool provides a graphical and text-based editor to assist with the Microflow modeling process (see Fig. 3). It is implemented using Node-RED, a web-based tool for flow-based programming, in which nodes or building blocks can be connected to each other to direct the information flow or execute functionality. A Microflow constraint is in essence a description of a relation between microservices and can thus be depicted via a connection. Figure 3c shows the Microflow modeling workspace, where nodes representing microservices can be inserted and moved and relations drawn to indicate before and after constraints. Figure 3e shows the generated textual equivalent of the graphical model on the workspace, which can be edited directly with changes dynamically reflected in the graphical model (round-trip). Figure 3b shows Microservices, a catalog of the available microservices, any of which can be pulled into the workspace via drag-and-drop. Figure 3d shows a drop-down menu to select the modeling domain (e.g., Travel, Business, Health); this controls the domain (category) of the Microflow repository used for constraining recommendations and for categorizing this Microflow. Nearby is the Deploy button, which saves the Microflow specification to the repository and permits execution. The tool includes recommendations based on mining historical Microflows: utilizing machine learning, it recommends microservices and constraints included in past workflows (BPMN or Microflows). Any change to the constraints causes an update request for recommendations. Figure 3a shows the recommendation area, which includes a Global Recommendation for a suggested microservice that was most frequently included in this domain or is based on the currently selected microservice type. Figure 3f starts a Recommender Service training session with the given Microflow. The background processing performed in the Microflow Modeler is shown in Fig. 4.
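
Such an update request could, for example, be issued with the Unirest client mentioned above, posting the current Microflow to the recommendation endpoint; the host, port, and JSON payload below are illustrative assumptions.

```java
import com.mashape.unirest.http.HttpResponse;
import com.mashape.unirest.http.JsonNode;
import com.mashape.unirest.http.Unirest;
import com.mashape.unirest.http.exceptions.UnirestException;

// Sketch of the update request the modeler could send whenever constraints
// change: POST the current Microflow to /recommendation/{domain}. Host, port,
// and the exact JSON payload shape are assumptions for illustration only.
public class RecommendationRequestSketch {
    public static void main(String[] args) throws UnirestException {
        String currentMicroflow = "{ \"selectedNode\": \"Preferences\", \"constraints\": [] }";

        HttpResponse<JsonNode> response = Unirest
                .post("http://localhost:8080/recommendation/Travel")
                .header("Content-Type", "application/json")
                .body(currentMicroflow)
                .asJson();

        // Expected (per Sect. 4.3): global microservice types plus
        // predecessor and follower constraint suggestions
        System.out.println(response.getBody());
    }
}
```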

Fig. 3. Microflow Modeler Tool.


Fig. 4. Microflow Modeler background processing.

4.5 Microflow Error Recovery

To support enactment error recovery, the Microflow client now supports data versioning of its state, integrating the javersion data versioning toolkit v. 0.14. The algorithm is shown in Fig. 5 and referred to by line. At each abstract node, the current client state (the JSON data outputs from microservices) is committed (Line 11). If the execution of a microservice is not successful, the transition is penalized by adding to its cost, so that any replanning does not necessarily continue to include a microservice with recurring issues (Line 22); the node index is set to the last node where a commit was performed (Line 24) (ultimately the start node if none) and the state at that node is restored (analogous to a rollback); a replanning is then initiated from that node (Line 25). Thus, Microflow clients support an automated recovery and replanning mechanism. This is in contrast to a standard BPMS, where an unhandled exception typically results in the process terminating. In contrast to basic HATEOAS client implementations, the client state can be rolled back to the last known good service and a replanning enables the client to seek an alternative path to its goal. This error recovery technique can be used to support the Microflow equivalent of BPMN subprocess transactions.
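
A simplified sketch of this commit/rollback/penalize/replan behaviour is given below. It is not the algorithm of Fig. 5: for simplicity it commits after every successful invocation rather than only at abstract nodes, and StateStore merely stands in for the javersion-based state versioning; all names are illustrative.

```java
import java.util.List;

// Simplified sketch of the enact-with-recovery behaviour described above.
// It is not the algorithm of Fig. 5; StateStore stands in for the javersion
// data versioning toolkit and all other names are illustrative.
interface StateStore {
    void commit(int nodeIndex);      // snapshot client state at this node
    void rollbackTo(int nodeIndex);  // restore the last committed snapshot
}

interface Planner {
    List<String> replanFrom(String node, String goal);   // new plan starting at node
    void penalize(String fromNode, String toNode);       // raise edge cost after a failure
}

interface Invoker {
    boolean invoke(String microservice); // false on exception / HTTP error
}

class MicroflowEnactorSketch {
    void enact(List<String> plan, String goal,
               StateStore store, Planner planner, Invoker invoker) {
        int lastCommitted = 0;
        store.commit(0); // commit at the start node
        int i = 1;
        while (i < plan.size()) {
            String node = plan.get(i);
            if (invoker.invoke(node)) {
                store.commit(i);        // good state: remember it
                lastCommitted = i;
                i++;
            } else {
                // penalize the failed transition, roll back, and replan
                planner.penalize(plan.get(i - 1), node);
                store.rollbackTo(lastCommitted);
                plan = planner.replanFrom(plan.get(lastCommitted), goal);
                lastCommitted = 0;      // index 0 of the new plan is the rollback node
                i = 1;                  // an empty plan means no alternative path remains
            }
        }
    }
}
```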


Fig. 5. Microflow execution algorithm.

5 Evaluation

Case studies are used to evaluate the solution, first considering the extraction of constraints from BPMN models, then the mining of BPMN models from a Microflow execution log, the usage of the Microflow Modeler, and finally error recovery.

5.1 BPMN Transformation

As an illustrative example, we created our own travel booking process, shown in Fig. 6, in which both a hotel and a flight should be found, a booking (reservation) of each is then performed, and payment is then collected. Virtual microservices are used during enactment that differentiate themselves semantically but provide no real invocation functionality. The BPMN model of Fig. 6, created with Camunda Modeler, resulted in an XML file of 209 lines and 11372 characters. In contrast, the Microflow constraint JSON file generated from this model by our BPMN-Microflow transformation tool contains 14 lines and 460 characters (Fig. 9a). To determine to what extent the spectrum of BPMN 2.0 is supported, and whether any issues are a result of the approach or of limitations of the implementation, the BPMN files from the OMG BPMN Examples (OMG 2010) were tested.


Fig. 6. Our travel booking example as BPMN.

Fig. 7. Collapsed SubProcess BPMN model from OMG (2010).

Fig. 8. Expanded SubProcess BPMN model from OMG (2010).

Both the collapsed SubProcess (Fig. 7) and the Expanded SubProcess (Fig. 8) BPMN models consist of 222 lines and 13996 characters of BPMN XML and were automatically transformed to constraint files of 19 lines and 622 characters in Microflow JSON, as shown in Fig. 9b. Both BPMN files contain the subprocess information, which is hidden in the collapsed graphical representation (Fig. 7) but shown in the expanded one (Fig. 8). Assessing the subset of the OMG BPMN example transformations that were unsuccessful, which included portions of Incident Management, Nobel Prize Process, Procurement Process with Error Handling, Travel Booking, Pizza Order Process, Hardware Retailer, and Email Voting, we identified the following issues:
• Multiple start events: this implies that multiple processes are enacted concurrently, resulting in issues with planning and merging state as well as potential race conditions. These issues, however, are due to limitations of our prototype implementation, not of the approach. Future work will consider concurrent enactment and synchronization.
• Multiple end or terminate events: in this case, the planner cannot identify the goal node for the Microflow. One current implementation workaround is to create an abstract final node or a final common end node, which can be inserted into our internal graph with the appropriate additional relations.
• Missing start and end events: these are optional in BPMN and result in no clear start and end goal for the planner. One workaround for our implementation is to assume that these are implied by activities having no predecessor or no successor.
• Event subprocess: the prototype does not automatically map exception areas, yet this would be feasible by adding a conditional before constraint to each contained node, whereby a new path is dynamically replanned from this relation on error.
• Swim lanes: currently only isolated swim lanes are supported, but future work will consider a mapping to abstract nodes and possible communication and synchronization support.
• Artifacts: our implementation cannot map BPMN inputs, since in these models they lack sufficient semantic detail. One workaround would be to provide a manually created map of BPMN types to JSON-LD types (see the sketch after this list).
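
A sketch of such a manually maintained mapping, with purely illustrative entries and placeholder IRIs, is shown below.

```java
import java.util.Map;

// Sketch of the manually created mapping mentioned in the last bullet:
// BPMN input/artifact names (as they appear in the model) to JSON-LD types.
// All entries are illustrative; the IRIs are placeholders.
public class BpmnToJsonLdTypeMap {
    static final Map<String, String> TYPE_MAP = Map.of(
            "Hotel Reservation",  "http://example.org/vocab#LodgingReservation",
            "Flight Reservation", "http://example.org/vocab#FlightReservation",
            "Invoice",            "http://example.org/vocab#Invoice");

    static String toJsonLdType(String bpmnName) {
        return TYPE_MAP.getOrDefault(bpmnName, "http://example.org/vocab#Thing");
    }
}
```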

Fig. 9. Extracted constraints from the BPMN of (a) travel booking and (b) SubProcess examples.

5.2 Microflow Constraint Mining

From a Microflow execution log (Fig. 10a) of the Travel example of Figs. 6 and 9a, into which an automated error recovery condition was injected, our MicroflowLog-BPMN mining tool extracted a BPMN file (Fig. 10c shows an excerpt from its BPMN XML and Fig. 10d its graphical equivalent). As explained in Sect. 4.2, this can assist human analysis or serve as a starting point for further modeling. To demonstrate the feasibility of a full cycle (roundtrip) from an execution log back to a Microflow specification, this BPMN was transformed to the Microflow constraints shown in Fig. 10b. These constraints could, for example, then be reduced by a human to only those truly required and adjusted for the requisite sequencing in order to optimize the dynamic planning capability.


Fig. 10. Travel booking example (a) microflow process log file output with recovery elements highlighted in bold; (b) extracted Microflow constraints; (c) extracted BPMN XML; and (d) BPMN graphical equivalent.

5.3 Microflow Modeler and Recommender Service Case Study

A case study is used to demonstrate the Microflow Modeler and Constraint Recommender capabilities.

5.3.1 Recommender Service Training Set

In searching for available BPMN diagrams for testing we faced various difficulties. The OMG (2010) BPMN examples lack significant variants within a domain. Few companies are willing to share their internal processes, for various reasons and potential risks, as these processes are often seen as a competitive advantage in which significant business process modeling investments were made.


Fig. 11. Sample of BPMN travel domain training workflow variants.

Fig. 12. Sample of BPMN health domain training workflow variants.

The ones we did find publicly available were difficult to categorize (service names vary tremendously between organizations and no obvious semantic equivalence was available based on the service name itself), and they also lacked sufficient variation within a domain. Specifically, we were looking for recurring service use across various workflows. We thus chose to develop a synthetic Microflow repository dataset to ensure that certain recurring sequences that map to constraints would occur, and to use semantically equivalent service names within three different domains. These are fairly basic Microflows: ten travel Microflows, ten health Microflows, and eight business Microflows, samples of which are shown in Figs. 11, 12, and 13 respectively. This is not ideal for training ML, and one would wish there were large accessible repositories with clarity on the semantic mapping between services; nevertheless, they provide an initial starting point for the evaluation. For instance, in the travel domain the recurring pattern is Preferences before other services and Payment after all other services; likewise, Search occurs before Book. In the health domain the pattern is that Patient Information comes first and Treatment comes only after checking various other information. Thus, domain-specific knowledge in the form of constraints is now made readily available for ML training without an expert being available, by automatically extracting constraints from the process knowledge held within the available process models.
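
The recurring travel-domain patterns could, for example, be expressed as the following before-node constraint tuples in the (target, constraint, type) form of Sect. 4; the concrete values are illustrative and not the actual synthetic dataset.

```java
import java.util.List;

// Illustrative training input for the travel domain, expressing the recurring
// patterns described above ("Preferences" before other services, "Payment"
// after all others, "Search" before "Book") as before-node constraint tuples.
// The values are examples only, not the actual synthetic dataset.
public class TravelTrainingConstraints {
    record Constraint(String target, String constraint, String type) { }

    static final List<Constraint> TRAVEL = List.of(
            new Constraint("Preferences",   "Search Hotel",  "beforeNode"),
            new Constraint("Preferences",   "Search Flight", "beforeNode"),
            new Constraint("Search Hotel",  "Book Hotel",    "beforeNode"),
            new Constraint("Search Flight", "Book Flight",   "beforeNode"),
            new Constraint("Book Hotel",    "Payment",       "beforeNode"),
            new Constraint("Book Flight",   "Payment",       "beforeNode"));
}
```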


Fig. 13. Sample of BPMN business domain training workflow variants.

5.3.2 Recommender Service Usage

In the following case study, the Recommender Service is used in conjunction with the Microflow Modeler after having been trained with the Microflow specifications mentioned in Sect. 5.3.1. Figure 14 shows the initial state with the domain Travel selected. A global recommendation to include the Book Flight microservice is made on the basis of the frequency of occurrence of this microservice in the models of this domain. In Fig. 15a, a Preferences microservice was added and is currently selected (outlined in red). Recommendations for its followers ('After Selection') are 'Search Hotel' and 'Search Flight', with a global recommendation to include 'Book Hotel' somewhere in the Microflow. In Fig. 15b, 'Search Hotel' was dragged and dropped behind Preferences and connected as its follower. No microservice is selected, so the Before and After recommendations are not applicable; Book Hotel is still recommended globally. In Fig. 15c, 'Search Flight' was added after 'Search Hotel'. In Fig. 16, the Microflow has been further modeled to include Payment, which is currently selected. No 'After' microservice is recommended (because no microservice in the training set came after it); as Before recommendations, 'Cancel Hotel' and 'Change Hotel Booking' are shown, and also 'Book Flight' since it could come directly before Payment. 'Change Flight' is now a global recommendation.

Fig. 14. Microflow Modeler initial state for the travel domain.


Fig. 15. Microservices with constraints added to a Microflow in the travel domain. (Color figure online)

Fig. 16. Microflow Modeler showing a completed Microflow in the travel domain.

Figure 17 shows that the recommendations differ when the selected domain is health care: for the selected node 'Check Insurance', a Before suggestion is 'Patient Information', the After suggestions are Payment and Symptoms, and Treatment is given as a global recommendation to be included somewhere. Figure 18 shows a business domain example in which 'Confirm Vacation' is selected, with 'Accept Vacation Confirmation' suggested under After; this also happens to be the top global recommendation.


Fig. 17. Recommendations for a selected node in the health care domain. (Color figure online)

Fig. 18. Recommendations for a selected node in the business domain. (Color figure online)

5.3.3 Recommender Service Technical Evaluation

Performance measurements were made to assess the practicality of usage on typical PCs. For the measurements, the hardware configuration consisted of an Intel i5-4460 @ 3.20 GHz and 8 GB of RAM. The software configuration was Windows 10 Pro, Java 1.8.0_144, Node-RED 0.17.5, and DeepLearning4J 0.7.2. Training the system took 22.4 s on average (over five invocations). The initial recommendation request involves initialization and took 221 ms in total. Thereafter, recommendation requests took 6.9 ms on average (over five invocations).


Based on these results, the system seems fluid and viable for dynamic modeling and use of recommendations. While training incurs a noticeable delay (which is also why it is triggered via an explicit button, shown in red), training is likely to occur less frequently in usage and could be performed in the background or at night without hampering modeling work, or the recommender microservice could be placed on a high-performance server. No explicit tuning was performed.

5.4 Microflow Error Recovery

To demonstrate the automated error recovery capability, the Flight Booking service was modified to return an HTTP 500 status code, and a Recovery for Flight Booking microservice (which could, for example, attempt to restart the failing service) was added with a path cost higher than that of the normal Flight Booking, simply to demonstrate the ability of replanning to adjust and take a different path after receiving an error; it does not imply that recovery microservices are needed. Figure 19 includes this recovery microservice (green). In the execution log file of Fig. 10a, after receiving an error the execution returns to the Abstract Booking Service. The client state (shown in Fig. 20) is restored to its value at the last commit, retaining ItemList, Hotel, and Flight (Lines 5–10) and discarding LodgingReservation and FlightReservation (Lines 1–4). The relation between Abstract Booking and Flight Booking is penalized, resulting in a replanning from Abstract Booking that now includes Recovery for Flight Booking, since it is now the path with the least cost. This is seen in Fig. 10a in the difference in the planning sequence from [CAN_CALL,9] to [CAN_CALL,12] –> (10) –> [CAN_CALL,13].

Fig. 19. Travel Booking example as Neo4J graph (error recovery shown in green). (Color figure online)


Fig. 20. Output of client state in JSON.

6 Conclusion

In this paper, we described business process mining for constraints, Microflow modeling tool support, constraint recommendations based on machine learning, and the automatic lightweight declarative workflow-based orchestration of semantically-annotated microservices using agent-based clients, graph-based methods, and lightweight semantic vocabularies. The solution principles of the Microflow approach and its lifecycle were elucidated, and details of its realization were provided. The evaluation showed that Microflow constraints can be automatically extracted from existing BPMN files, that process mining of Microflow execution log files can be used to extract BPMN models, that these can be used to train a constraint recommender service using machine learning, and that certain types of client error recovery can be automated with client state rollback, path cost penalization, and dynamic replanning during enactment. The Microflow constraint specification files were found to be much smaller than the equivalent BPMN files. With the Microflow approach, only the essential rigidity is specified via constraints, permitting a greater degree of agility in the business process models, since the remaining unspecified areas of the workflow are automatically determined and planned (and thus remain dynamically adaptable). This significantly reduces business process modeling labor and permits a higher degree of reuse in a dynamic microservice world, reducing the total cost of ownership. Since the workflow (or plan) is not completely ad hoc and dynamic, validation and verification checks can be performed before execution begins, and one is assured that the workflow is executable as planned. However, enhanced support for verification and validation of the correctness of the Microflow is still required for users to entrust the automatic planning. By integrating Microflow constraint recommendations in the Microflow Modeler, modeling mistakes due to lack of awareness or forgetfulness can be reduced as the constraint complexity increases.


Future work includes expanded support for BPMN 2.0 elements in our implementation, integrating advanced verification and validation techniques, integrating semantic support in the discovery service, supporting compensation and long-running processes, enhancing the declarative and semantic support and capabilities, tuning the recommender service, and empirical and industrial usage studies.

Acknowledgments. The authors thank Florian Sorg and Tobias Maas for their assistance with the design, implementation, evaluation, and diagrams.

References

Alpers, S., Becker, C., Oberweis, A., Schuster, T.: Microservice based tool support for business process modelling. In: IEEE 19th International Enterprise Distributed Object Computing Workshop (EDOCW), pp. 71–78. IEEE (2015)
Anderson, C., Suarez, I., Xu, Y., David, K.: An ontology-based reasoning framework for context-aware applications. In: Christiansen, H., Stojanovic, I., Papadopoulos, G.A. (eds.) CONTEXT 2015. LNCS (LNAI), vol. 9405, pp. 471–476. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25591-0_34
Barba, I., Weber, B., Del Valle, C., Jiménez-Ramírez, A.: User recommendations for the optimized execution of business processes. Data Knowl. Eng. 86, 61–84 (2013)
Bobek, S., Baran, M., Kluza, K., Nalepa, G.J.: Application of Bayesian networks to recommendations in business process modeling. In: Proceedings of the Workshop AI Meets Business Processes 2013, pp. 41–50. CEUR-WS.org (2013)
Bouguettaya, A., Sheng, Q., Daniel, F.: Web Services Foundations. Springer, New York (2014). https://doi.org/10.1007/978-1-4614-7518-7
Bratman, M.E., Israel, D.J., Pollack, M.E.: Plans and resource-bounded practical reasoning. Comput. Intell. 4(3), 349–355 (1988)
Chan, N.N., Gaaloul, W., Tata, S.: Context-based service recommendation for assisting business process design. In: Huemer, C., Setzer, T. (eds.) EC-Web 2011. LNBIP, vol. 85, pp. 39–51. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23014-1_4
Fielding, R.T.: Architectural styles and the design of network-based software architectures. Doctoral dissertation, University of California, Irvine (2000)
Florio, L.: Decentralized self-adaptation in large-scale distributed systems. In: Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, pp. 1022–1025. ACM (2015)
Fowler, M., Lewis, J.: Microservices: a definition of this new architectural term (2014). http://martinfowler.com/articles/microservices.htm. Accessed 31 Jan 2018
Gartner: Gartner Says Spending on Business Process Management Suites to Reach $2.7 Billion in 2015 as Organizations Digitalize Processes (2015, press release). https://www.gartner.com/newsroom/id/3064717. Accessed 31 Jan 2018
Heitmann, B., Cyganiak, R., Hayes, C., Decker, S.: An empirically grounded conceptual architecture for applications on the web of data. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 42(1), 51–60 (2012)
IBM: IBM Business Process Manager V8.5.6 documentation (2015). http://www.ibm.com/support/knowledgecenter/SSFPJS_8.5.6/com.ibm.wbpm.wid.bpel.doc/topics/cprocess_transaction_micro.html. Accessed 31 Jan 2018
Karagiannis, G., et al.: Mobile cloud networking: virtualisation of cellular networks. In: 21st International Conference on Telecommunications (ICT), pp. 410–415. IEEE (2014)


Kluza, K., Baran, M., Bobek, S., Nalepa, G.J.: Overview of recommendation techniques in business process modeling. In: Proceedings of 9th Workshop on Knowledge Engineering and Software Engineering (KESE9), pp. 46–57. CEUR-WS.org (2013)
Lanthaler, M.: Creating 3rd generation web APIs with hydra. In: Proceedings of the 22nd International Conference on World Wide Web (WWW 2013 Companion), pp. 35–38. ACM, New York (2013). https://doi.org/10.1145/2487788.2487799
Lanthaler, M., Gütl, C.: On using JSON-LD to create evolvable RESTful services. In: Proceedings of the Third International Workshop on RESTful Design, pp. 25–32. ACM (2012)
Lanthaler, M., Gütl, C.: Hydra: a vocabulary for hypermedia-driven web APIs. In: Proceedings of the 6th Workshop on Linked Data on the Web (LDOW 2013) at the 22nd International World Wide Web Conference (WWW 2013), vol. 996. CEUR-WS.org (2013)
Martin, D., et al.: OWL-S: semantic markup for web services. W3C member submission, W3C (2004)
OMG: BPMN 2.0 by Example Version 1.0. OMG (2010)
OMG: Business Process Model and Notation (BPMN) Version 2.0. OMG (2011)
Oberhauser, R.: Microflows: lightweight automated planning and enactment of workflows comprising semantically-annotated microservices. In: Proceedings of the Sixth International Symposium on Business Modeling and Software Design (BMSD 2016), pp. 134–143. SCITEPRESS (2016)
Oberhauser, R.: Microflows: automated planning and enactment of dynamic workflows comprising semantically-annotated microservices. In: Shishkov, B. (ed.) BMSD 2016. LNBIP, vol. 275, pp. 183–199. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57222-2_9
Oberhauser, R., Stigler, S.: Microflows: enabling agile business process modeling to orchestrate semantically-annotated microservices. In: Proceedings of the Seventh International Symposium on Business Modeling and Software Design (BMSD 2017), pp. 19–28. SCITEPRESS (2017)
Pesic, M., Schonenberg, H., van der Aalst, W.M.: DECLARE: full support for loosely-structured processes. In: 11th IEEE International Enterprise Distributed Object Computing Conference (EDOC 2007), p. 287. IEEE (2007)
Pokahr, A., Braubach, L., Lamersdorf, W.: Jadex: a BDI reasoning engine. In: Bordini, R.H., Dastani, M., Dix, J., El Fallah Seghrouchni, A. (eds.) Multi-agent Programming, pp. 149–174. Springer, Boston (2005). https://doi.org/10.1007/0-387-26350-0_6
Rajasekar, A., Wan, M., Moore, R., Schroeder, W.: Micro-Services: a service-oriented paradigm for data intensive distributed computing. In: Challenges and Solutions for Large-Scale Information Management, pp. 74–93. IGI Global (2012)
Rao, J., Su, X.: A survey of automated web service composition methods. In: Cardoso, J., Sheth, A. (eds.) SWSWPC 2004. LNCS, vol. 3387, pp. 43–54. Springer, Heidelberg (2005). https://doi.org/10.1007/978-3-540-30581-1_5
Schobel, J., Reichert, M.: A predictive approach enabling process execution recommendations. In: Grambow, G., Oberhauser, R., Reichert, M. (eds.) Advances in Intelligent Process-Aware Information Systems. ISRL, vol. 123, pp. 155–170. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-52181-7_6
Sheng, Q.Z., et al.: Web services composition: a decade's overview. Inf. Sci. 280, 218–238 (2014)
Singer, R.: Agent-based business process modeling and execution: steps towards a compiler-virtual machine architecture. In: Proceedings of the 8th International Conference on Subject-Oriented Business Process Management. ACM (2016)


Toffetti, G., Brunner, S., Blöchlinger, M., Dudouet, F., Edmonds, A.: An architecture for self-managing microservices. In: Proceedings of the 1st International Workshop on Automated Incident Management in Cloud, pp. 19–24. ACM (2015)
WfMC: Workflow Management Coalition Terminology & Glossary, WFMC-TC-1011, Issue 3.0. Workflow Management Coalition (1999)
Wooldridge, M.: An Introduction to Multiagent Systems. Wiley, Hoboken (2009)

IT Systems in Business: Model or Reality?

Coen Suurmond

RBK Group, Keulenstraat 18, 7418 ET Deventer, Netherlands
[email protected]

Abstract. This paper reflects on the developments over the past decades in business modelling and software design in the context of a specific niche market. In the first part, the paper describes the changing role of models and modelling in customer projects and in the development of standard software solutions through the past 35 years. In the second part, the paper analyses different types of models and the relation between the model as an artefact and the modelled reality. The paper concludes with a short discussion of the IT system itself as a necessarily reduced and incomplete model of business processes.

Keywords: Business modelling · Software design · Business processes

1 Introduction

A persistent theme in our company for nearly four decades of building systems for the food processing industry has been an orientation on business value, on data quality and on the fit of our solutions to the practical circumstances. This orientation did not originate from marketing considerations, but from our company's background in both designing production facilities and building control and registration systems. The founder of the company, Hans Kortenbach, had a strong drive to combine his technical acumen and business knowledge to design innovative solutions for production processes, and our IT solutions had to make good on his promises to the customer and his intentions at design time. In other words: the solutions just had to work under practical (and often tough) circumstances.

During this time IT technology changed a lot, but understanding of information and its role in business processes changed much less. The rapid developments both in hardware and in software development tools were accompanied by a more and more reductive approach to information and information systems. My thinking moved in the opposite direction: information system development should start from real-world business processes and their information needs, should accommodate heterogeneity of information carriers, and should be able to deal with irregularities and possibly inconsistencies. Above all, the development of information systems should take care of all kinds of information involved in business processes, and should use IT systems only where those systems have added value.

Regarding the role of modelling in the development of information systems over time, things changed on more than one dimension. Obviously, the role of explicit modelling as a basis for both communication and development became increasingly important over the decades.


A second dimension was the shift from software modelling via information modelling to process modelling. A third dimension was a broadening of the information concept from just the information in the IT systems to all kinds of information used in business processes. The first half of the paper describes the evolution in developing and using models from a historic perspective; the second half delves deeper into different kinds of models and their meaning in the development of a business information system.

2 Early Years

In 1978 I built my first commercial-use software for a tulip grower/trader. This was an invoicing program which used manually entered order header data, customer number and order line data (item number – quantity – unit price) to produce a paper invoice. Master data for the customers and items were retrieved from a mini-cassette tape. This invoicing program saved administrative work and reduced the risk of errors. The paper invoice was processed further in the traditional workflow of paper-based accounting and record keeping. Invoices were not stored in the system: the two Philips P300 Office Computers used here had working memories of 2 kB and 6 kB, respectively, and they had only the mini-cassette drive for external memory. Disk storage was not an affordable option for this kind of computer. This was the automation of one part of a process, one step on from the ‘smart’ electronic typewriters; it was not an information system.

Within RBK we built dedicated stand-alone systems from the early eighties, but by then external ‘mass’ storage was available on floppy disks (360 kB storage capacity). Hard disks were available but still very pricey (and they had a capacity of 10 or 20 MB). Filleting systems were the most important software product: registration of the input and output of individual filleters in a production line, with a weekly payment based on the volume produced and the efficiency achieved. Later this was extended with quality registration (e.g. fish bones found), giving a penalty in pay when the number of quality errors exceeded a norm. These programmes operated in a fixed weekly cycle which finished on Friday afternoon with the payment calculation for each filleter. A connection to the further financial processes was realised by generating the weekly results with a lay-out such that the data could be checked quickly and easily, and typed over quickly and reliably by accounting. In some individual cases an intermediate file was created that could be read in by accounting. After all procedures had been carried out at the end of the week, the floppy disks were erased and the system was prepared for the upcoming week. We did not keep history inside the system. These were our first information systems, although rather limited ones. The system would provide the users with information about yields and productivity, both on screen and on paper.

From the early eighties we built control systems for cooling/freezing processes in production lines and for cold storage warehousing. For the former, the most important goals were to minimise the product quality loss and to optimise the energy efficiency of the cooling/freezing process. For example, in belt freezers for fish the product is frozen to a product temperature of −18 °C and each individual product is encapsulated in a thin layer of ice. Improving the control accuracy results in significant gains; an improvement by just 0.5° can generate an additional annual revenue of EUR 50 000.


The same principle applies to other cooling and freezing processes in production lines. Optimal product quality and reduced weight loss are the key parameters. A similar approach to the cooling of cold stores resulted in power consumption savings of up to 30%. All these systems were based on fundamental and innovative thinking by Hans Kortenbach, the founder of our company. Taking into consideration (1) the physical temperature processes and their effects on the products, (2) the characteristics and possibilities of industrial refrigeration systems (both equipment and control), (3) issues of product value and business value in relation to the markets at that time, and (4) the interdependence between these three aspects, he designed the installations, and we (the software developers) built the control systems.

2.1 Business Value and Data Quality

For us as software developers, building and implementing our software solutions was rather straightforward at the time. The relation between the business and the software was not problematic: either our software “created” reality, as in the technical control systems for freezing and chilling products, or our software was “just” representing the quantity, quality and yields of one production line. Such was the naiveté of our world as programmers. From a broader point of view, however, the combination of the design of the physical system by Hans Kortenbach and our control systems represented a kind of business process reengineering, without thinking in those terms. There was no business modelling, only technical designs that reflected new ways of thinking about business and that took all aspects of the business into account.

A fundamental issue in this early phase was the emphasis on data quality. The importance of getting reliable data into the system was impressed upon us as a prerequisite for the control quality and for the quality of calculations and decisions based on information from our systems. In particular, we learned the hard way how much effort goes into creating reliable registration systems for the shop floor. In the years since, we have often been astonished by the neglect of this subject and by the sloppiness in thinking about data quality.

3 Covering Processes

Starting in 1989 we developed new software for slaughterhouses and for the control of industrial refrigeration systems. All shop floor production processes in slaughterhouses were covered by our systems, as well as the commercial processes of livestock invoicing and sales. The invoicing system for livestock was very specific and very advanced, reflecting the competitive market in the Netherlands at that time. The commercial systems were AS/400 based, the shop floor systems were based on PCs in a Novell network, and the industrial control systems were based on the combination of industrial control hardware with a stand-alone PC for the user interface and data management. Exchange of information between systems was standard. We were specialising in a very specific niche market where we were acquainted with all key processes. A few years after the start of this development, products for meat processors and for producers of pre-packaged meat for retailers were added to our software solutions.


Short lead times, perishable products, and variability of qualities and quantities formed the main characteristics of our markets. In subsequent decades, quality control requirements generated by food safety concerns and market requirements set by the big retailers could be added to the list of key requirements. Technically, our shop floor systems were event-driven, (semi-)real time with a response time of less than 0.5 s, and provided with multi-tasking mechanisms within the application. Events were either generated by peripheral equipment (weighing systems, hardware contacts) or generated via the keyboard (input from users). We developed our software in Borland Pascal in an MS-DOS environment with text-based user interfaces. All data was stored in binary files. Each registration would open/modify/close the relevant files, so the risk of losing operational data was very low. In later years, our programming concepts precluded an easy move to the Windows/SQL platform. We needed access to the physical world in our systems, and we needed our guaranteed response times for our real-time tasks; the Windows environment would shield the hardware from our software. Apart from that issue, we did not trust the response times of the databases in those years.

Concepts for individually identifiable crates and containers were developed in that time, partially as an instrument to facilitate registration processes and production management, and partially as a method for tracking and tracing. Each container would have a fixed identification number, and the tare of the container was kept in the master data, which improved the collection of weight data. We could capture a lot of information connected to the physical unit of handling, and scanning a barcode is a reliable and fast way of registration. The origins of this concept dated back to 1988, and it paid off very well. We did our first experiments with RF-tags for identification in 1987, but this concept has one very important drawback: it provides information just for systems, not for people. And on the shop floor, visual information is important. Information flows for the shop floor must be designed taking all kinds of information for the shop floor into account, and must not deal only with information in computer systems.

3.1 Business Viewpoints

The development of our systems was based on very close cooperation with our users. The customer would express a problem or wish; we would look into it and together we would discuss and try solutions. Sometimes this resulted in a prolonged iterative process: finding out what was really the case, finding out about side effects, adapting the problem, and so on. In a few cases we would get the feeling that creating a satisfactory solution was a kind of random walk through possible problems and possible solutions. Fortunately, more often than not it was a more linear exploration through different viewpoints and different contexts. As a specialist and supplier of standardised software in this specific market we would try to satisfy two objectives: (1) the separation and expression of the general problem, abstracted from the concrete questions and the specific circumstances of the individual customer, and (2) the identification of the proper interests of the different stakeholders. The term ‘proper’ is important here: a lot of people will express many interests (and sometimes provide detailed directions for solutions), but only interests related to the constructive role of the person in the business process are to be considered proper interests.


In a nutshell, the steps in finding and verifying solutions were: identification of the business processes, identification of the roles of people and systems in the respective processes, identification of the specific questions, generalisation of the questions, finding solutions to the generalised questions, making the solutions available to the specific context, and finding out whether every proper information need is met.

Explicit modelling of business was not an issue at the time, but for myself, thinking about businesses and organisations was very much an issue in these years. During my education and in the first 10 years of my work I was strongly oriented towards theories of the organisation as a means to analyse and understand the functioning of an organisation. Simon with his model of bounded rationality [1], Galbraith with the concept of slack in processes [2], and Mintzberg with his Structured Organisation [3] were important sources. In studies of organisations the difference between the formal organisation and the informal organisation was of course a major theme. To me this was an easy and a-theoretical escape explanation: organisational theory would provide explanations about and the rationale behind the formal structures of organisations, while any deviation could be explained away by referring to the irrationality of human behaviour and to the informal social structures in an organisation. This was an unsatisfying state of affairs for me.

A project for a producer of pre-packaged meat products changed my thinking about business and organisation. The company was led by a dominant owner/director and had a very flat organisation. The management positions were held either by trusted old hands with a fair amount of leeway to make decisions, or by employees who had a more or less token position without discretionary powers. Old hands would be overruled occasionally, the others frequently. Operational decisions regarding production and distribution were based more on experience and organisational patterns than on organisational roles. Operationally, this was a sound company with a good reputation and with good financial results. It was a well-running and responsive organisation that allowed the director/owner to realise his commercial vision.

The breakthrough was partly triggered by a question a colleague asked me during the project: why do you think that this company gives us all this money? What do you think the company wants to achieve with our systems? I then started to think in a different way about how the functioning of an enterprise should be understood. The starting point should not be the organisation of an enterprise, but its markets and products. The characteristics of the markets and products determine the behaviour and the business processes of an enterprise, and the (formal) organisation is a means to stabilise the business processes. I started to think and analyse from the opposite direction, outside-in instead of inside-out. And, as a consequence, I started to look for the foundations for information systems in the business processes, instead of in the organisational structures.

4 Years of Renewal

The period started around 2001 with the conversion of our software environment from MS-DOS to Windows, which brought changes in programming language, data management, and user interface. It also meant the replacement or abolishment of most of our software patterns, established and fine-tuned over a decade.


These patterns might be either ‘just habits’ or skilfully engineered fundamental solutions for basic problems. Patterns for dealing with the multi-tasking and real-time aspects of our systems belonged to the latter category. We had to think of new ways for the coordination between our systems and the physical world, and for the coordination between processes in our software. Apart from dealing with the more technical software issues, we used this transition to rethink our fundamental concepts for representing business processes in our software.

On the shop floor, we used two basic concepts: the production order and the individual container. Stock management was problematic in our software, a problem which we could neglect for a long time because (1) in the production of fresh food, stocks are a minor issue in the business processes and (2) we had made some nice and creative work-arounds for representing fresh stock in production orders or in containers. We also wanted to solve two conceptual problems in representing the physical flows in our new software; one conceptual problem is specific to our kind of industry, the other is generic. The specific problem is exemplified best by the curing process. The curing of products, whereby the products are biochemically changing over time, can take a few hours (tumbling), a few days (brining) or up to a few weeks (dry sausages). In the processing of herring, for example, the product is successively graded, filleted, cured, frozen, packed, and stored. In curing, the herring is put in a sour bath for two or three days, while being stirred every 12 h. The curing process has characteristics of both a production order (semi-finished products are transformed into other semi-finished products) and stocks (products in a storage area for several days). This leads us to the generic problem of a real-time representation of a production flow as a concatenation of stocks and production orders: the products are ‘lost’ between the input and the output on a production order.

Due to our background, and driven by our motivation to represent an uninterrupted flow of goods in our systems, we decided to replace our basic concept of the production order by the basic concepts of stock, lot and location. Locations are either “storage” or “process”. An input on a production order is represented as a stock movement from storage stock to process stock, and an output as a stock movement from process stock to storage stock. At the end of the production process, the resulting stock balance on the process stock represents the loss of materials in the production process. These few very simple concepts allowed us to represent any flow of goods, and give us a lot of freedom to model the flow of goods in a concrete project.
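
Purely for illustration (the original systems were of course not written this way), the stock/lot/location representation can be sketched as follows; all names and quantities are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the stock/lot/location representation described above:
// a production input is a movement from a storage location to a process
// location, an output is the reverse, and the remaining balance on the
// process location is the material loss. All values are illustrative.
class StockModelSketch {
    enum LocationKind { STORAGE, PROCESS }

    record Location(String name, LocationKind kind) { }

    // A movement of a quantity of a lot from one location to another
    record StockMovement(String lot, double quantityKg, Location from, Location to) { }

    static double processBalance(List<StockMovement> movements, Location process) {
        double balance = 0.0;
        for (StockMovement m : movements) {
            if (m.to().equals(process))   balance += m.quantityKg(); // input to the process
            if (m.from().equals(process)) balance -= m.quantityKg(); // output from the process
        }
        return balance; // at the end of production: loss of materials in the process
    }

    public static void main(String[] args) {
        Location coldStore = new Location("Cold store", LocationKind.STORAGE);
        Location curing = new Location("Curing bath", LocationKind.PROCESS);
        List<StockMovement> movements = new ArrayList<>(List.of(
                new StockMovement("Herring lot 4711", 1000.0, coldStore, curing),
                new StockMovement("Cured herring 4711", 985.0, curing, coldStore)));
        System.out.println("Loss in curing: " + processBalance(movements, curing) + " kg");
    }
}
```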

4.1 Business Models

At this time we would start projects with a descriptive and informal model of the business processes at the customer, supported by a few generic business models for typical process patterns. It was more or less a model-based approach along the lines of Weber's concept of the ideal type. ‘Ideal’ in his concept does not denote how the world should be; it does not mean perfection. “Ideal” in ideal type refers to a construct of the mind, a logically coherent idea (model) about some part of reality [4]. An ideal type, therefore, can be a useful instrument for looking at specific business processes by comparing the ideal type with the actual processes. Differences between them should be analysed to find the causes or reasons.


Sometimes they are caused by unchangeable circumstances, sometimes they are there for a good reason, and sometimes they represent patterns evolved over time, either better left in place or detrimental to the process and to be erased. But the first step is always to try to find the possible rationale behind the specific practices.

My way of thinking about firms changed further in these years. Not the organisation, but the markets and products would now be my starting point in the analysis of a firm. An understanding of the markets and the products of the firm provides both background and norms for the analysis, understanding and evaluation of its business processes. The formal organisation was increasingly side-lined as a peripheral phenomenon. This approach was supported by the study of works about the theory of the firm by Coase [5], Kay [6] and De Geus [7], and about knowledge in organisations by Weick [8], Patriotta [9], and Boisot [10]. The study Thought and Choice in Chess by De Groot [11], about human problem solving and especially about the role of the perceptual processes of the expert, was important for understanding the role of intangible patterns in business processes. Another line of study was the semiotic analysis of signs, sign systems and interpretation processes (in theory), together with the practical analysis of how individuals work with information in business processes and how emerged patterns give stability to working practices. How do individuals deal with regularities (day-to-day patterns) and with irregularities (both recurring and truly one-off incidents)? A lot of relevant information in business processes is either background routine or background knowledge, both for regular situations and for unforeseen situations.

Increasingly I became aware of the limitations and drawbacks of rational-mechanistic approaches to business processes and information systems. Of course, rational models are necessary as a means for understanding and communicating. Models are important for analysis and can be useful instruments for change. Software systems incorporate models of reality. The danger lies in the inversion of the relation between model and reality. At the start of the project the model is a representation of reality, and at the end reality is considered to be an implementation of the model. Misfits between model and reality are at the end of the day regarded as problems of reality to be corrected; the model is rational and “true”. Incidentally, this kind of problem is of course very old. Many discussions between accounting departments (‘bean counters’) and operational departments can be traced back to this type of argument.

5 Heterogeneity

In what can be considered the fourth phase of our information systems we moved into heterogeneous system landscapes (to borrow a term from German). A first example is the replacement of our systems for slaughtering and grading processes. Our first system in this field dated back as far as 1987, and from that starting point it gradually grew into an octopus-like structure. Two decades of meeting a variety of information demands in one monolithic system will result in a lot of add-ons. The old system was (and still is) very stable and dependable, but increasingly difficult to adapt and maintain.


A further major drawback was the dependence on the one programmer who had originally developed the system and adapted it since, and who was the only one with the knowledge and experience to support it. The objectives for our new system were: (1) replacing the old monolithic and entangled system serving heterogeneous needs (physical inputs/outputs, real-time aspects, user interface, data management, decision rules) by a heterogeneous landscape with dedicated, single-function subsystems; (2) independence from support by individual persons with special knowledge; (3) a clear overarching model, understandable to business people without technical knowledge; and (4) full specification of all information flows, their effects in related processes and their origins in the production lines. The latter objective is not realistic in general, but in this case it was attainable because the business domain is highly specific. Further, it is important because the use of terms in this domain can be highly confusing and coloured by local habits.

This resulted in a model with four different subsystems. The first subsystem handles all physical aspects and tracks the movements of all individual pieces of meat in the conveyors; the second subsystem handles the user interfaces (touch screens) in the production lines; the third subsystem is responsible for data management and decouples the real-time world from database actions (making response times independent of possible lateness of database transactions); and the fourth subsystem connects the other three and handles all business rules. The very knowledgeable employee who had developed the system from its origins some decades ago to the existing system would be the developer of the first subsystem, with its real-time and physical issues. He could share his knowledge in this field with his technically oriented colleagues. Informational aspects would be handled by other people, and this was made possible by the full specification of all information flows.

In another interesting recent project we developed a control system for individualised deboning processes. Each individual piece of meat produced on the new production lines would be traceable to the original animal. Depending on the demand for finished products and on the individual characteristics of the raw material, the system decides which finished products are to be produced out of which raw material, and it shows individualised instructions to the people in the production line. At the start of this project the customer knew exactly what he wanted to achieve (full traceability to improve his market position, and improvement of yields to earn back his investments) and had general ideas about how to achieve it. The translation of the general ideas into working solutions was up to the main contractors for all physical equipment and the IT system (us). In this kind of project the customer is catapulted from a situation under the direct control of foremen, with a lot of flexibility, a lot of room for making (and correcting) errors, and a lot of buffers and internal transport, into a world that is computer-controlled, very straightforward and rigid, and with a very efficient throughput. Preparing the customer for this change is a big challenge for several reasons. The customer is used to making snap decisions on the work floor; in the new situation this is no longer possible. Once the quality grade is assigned to a piece of meat (before processing), all decisions are computer-controlled.
The only human decision during the processing and packaging of the meat is to reject meat and take it out of the line, which should rarely happen.


Designing the system and preparing its configuration asks for a lot of information from the customer about his current processes, which comes mostly in a highly unstructured way, with a lot of exceptions, a lot of imprecise terms, and a lot of qualifiers such as “normally”, “basically”, “mostly” or “at least, that should be the case”. Finding understandable and verifiable models in such a project asks for both creativity and background knowledge. The latter is important not only for asking the right questions and understanding the answers, but also for listening to what the customer is not saying.

5.1 Process Logic and Real Business

In the projects mentioned above the modelling phase was not only based on direct observation of visible processes or on interviewing the customer about his processes. Rather, it was based on a two-step analysis: in the first step a generic model of the underlying and invariant processes was obtained by logical analysis and background knowledge, and in the second step the actual customer’s business processes were modelled. The invariances of the model in the first step are partially determined by the characteristics of the products, partially by the characteristics of market conventions in dealing with customers, and partially by the social and legal norms belonging to that kind of markets and products. The first kind of invariance is more stable over time than the second and third. Markets, norms and regulations will change over time, but at any given time any company that serves a certain market (in a certain country) must obey the rules and conventions of that market.

The generic model is most stable in its ontology and static structures. In a market-oriented company, demand expectations will always be captured and translated into quantities to be produced, and via production planning be translated into demand for raw materials and resources. Also belonging to this generic model, but possibly less stable over time, are its dynamic aspects. The market for pre-packaged fresh food has typical lead times and a typical planning/production cycle that can be observed by every company in that market. Lead times may gradually shift a bit over time due to market expectations and due to new packaging methods that prolong shelf life, but the general dynamics of the planning/production cycle will remain unaffected and stable.

6 Types, Meaning and Use of Models

The sections above described in broad outline the developments through more than 30 years of IT projects in the food processing industry. In the following sections the various modelling concepts that were developed and used in the projects will be discussed. Different types of modelling will be discussed individually, followed by a classification of models according to the three main meanings of ‘model’ as given by the Oxford dictionary: for representation, for design, and for imitation. This section also includes a short discussion of the idea and the use of reference models. The last section will discuss how the different models are meant to be used in our software development and in our software design.


6.1 Implicit Modelling

In our projects, explicit modelling came only in the later years, with the growing complexity of our systems and with the embedding of our systems in heterogeneous system landscapes. Implicit modelling we did all the time, as any IT system is implicitly a model of its domain. Our software development environment was Turbo Pascal, where the programming code was organised in units. Each unit is meant to contain a logically and functionally coherent set of functions and data. Each unit has an interface part with data declarations and function calls that are visible to the software outside the unit, and an implementation part that contains the private parts of the unit and is hidden from the software outside the unit. The organisation of the software in units implies modelling of the software. Well-chosen names for data and functions would self-document the software, making it readable and understandable for other programmers. At least, that is what I and my colleagues liked to believe in those days.

We did not differentiate explicitly between the functional model, the information model, and the model of the outside world. However, when we were increasingly confronted with different business viewpoints on the same physical flow of goods, we were also increasingly engaged in finding adequate and neutral representations of the data in our software, allowing for multiple interpretations. Interpretation of the data according to the different viewpoints should be on the functional layer or presentation layer of the software. In other words: modelling the outside world took precedence. We tried to model the peculiarities and irregularities of the outside world in our software and to avoid the reduction of the data to ‘logical’ and ‘rational’ structures that allow for neat software structures but do not represent the reality of the shop floor.

Implicit modelling and using the model for imitation is a practice that governed our software development in the old days and still (partially) governs our implementations today. For new projects existing software would be copied and adapted: take an existing and operational system as a starting point, copy it to the new project, adapt the system where required, and implement it. Marketers may call this approach “best practices”, but without any explication or argumentation of what “best” means for either the customer or the IT supplier this is an empty formula (my suspicion is that this approach often is a best practice for IT suppliers, not for the customer). Whatever one may think of “best practices”, the copy-and-adapt approach is a pretty safe and reliable approach on a project-by-project basis, but it will lead to a gradual increase in complexity that is difficult to cope with. It is hard to keep track of the small adaptations, especially because they are often based on ad hoc criteria and made to solve specific problems.
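To illustrate the kind of implicit model a unit embodies, here is a minimal sketch. The original code was Turbo Pascal; the sketch below uses Python purely for brevity, and the names are invented, not taken from our actual software. The public functions play the role of the unit’s interface part, the underscore-prefixed helpers the role of its hidden implementation part.

```python
# stock_unit.py -- sketch of a "unit" for stock registration.
# Public functions form the interface part; _helpers are the hidden
# implementation part, analogous to a Turbo Pascal unit.

_stock = {}  # private data: lot id -> quantity in kg

def register_receipt(lot_id: str, quantity_kg: float) -> None:
    """Interface: record received goods for a lot."""
    _validate_quantity(quantity_kg)
    _stock[lot_id] = _stock.get(lot_id, 0.0) + quantity_kg

def available_quantity(lot_id: str) -> float:
    """Interface: current registered quantity for a lot."""
    return _stock.get(lot_id, 0.0)

def _validate_quantity(quantity_kg: float) -> None:
    """Implementation detail, not visible to callers of the unit."""
    if quantity_kg <= 0:
        raise ValueError("quantity must be positive")
```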

6.2 Modelling for Software Design

In redesigning our software with the change from the DOS to the Windows environment we wanted to radically reconstruct our software structures, taking into account what we had learned over the years about the normal varieties and irregularities of the shop floor. At the same time, we wanted to restrict the core structures of our software to a bare minimum of fundamental concepts. In order to achieve both aims, we explicitly detached the meaning of the core concepts from their use/interpretation in customer projects. We also extended the meaning of structurally fundamental concepts such as “stock” and “lot” well beyond their common sense meanings in order to create a simple, consistent and complete concept for the representation of the flow of goods through production and distribution (a key element for both traceability and production control). In doing this, we designed a base information model to be realised in the core software and to be applied and interpreted in customer projects.

The same approach to design modelling was applied later on in the two other examples of software design mentioned in the sections above. Both in the project for redesigning our software for slaughtering lines and in the project for the innovative deboning processes, our aim was to construct a stable model that would (1) represent the essential and invariant structures of the business processes involved, (2) provide an appropriate architecture for the development of the software, and (3) be a useful instrument for understanding and communication with software developers, consultants and practitioners alike. In this approach, the model itself is a designed artefact, representing an abstract reconstruction of the outside world and to be realised by the software to be developed (the software being another artefact).

The last two examples of explicit modelling for software design also exemplify how in such modelling conventional terms and concepts are used in a non-conventional way. In the model the terms have a formal meaning, which allows for logical checks on completeness and consistency. At the same time, because of the commonness of the terms, outsiders will read common sense meanings with vague boundaries into these terms. To avoid the risk of misunderstanding and unjustified interpretation of the modelling terms, it can be useful to invent new terms for the modelling concepts. The new terms should at the same time appeal to the conventional meaning for the sake of recognition, and be sufficiently “strange” to keep readers aware of the specificity of the term in use. An example from each of the three discussed software design models: (1) for the core model of our software we use “base-lot” for the modelling concept ‘lot’, (2) in the design model for the slaughtering line we use “token” for the identification by the system of the individual carcass, and (3) in the deboning system we use “anatomical item” for the individual parts of carcasses, and “sub-individual” for any individually identified major part of a carcass. These model-specific terms avoid the use of the conventional terms lot (“anything that belongs together according to some criterion”), carcass-ID (at least five different valid carcass-IDs are commonly used, none of them sufficiently consistent and reliable for our software purposes), or item (item coding is unreliable in any meat processing company, mostly based on simple principles that are corrupted in practical application).

Normally, software design models are not directly used in customer projects. However, in projects where processes are directly controlled by the IT systems and have a strong ‘mechanical’ and repetitive character, the software design models can be used directly in the customer project. Examples are the projects in the slaughtering line and in deboning.
In these cases the mapping of the process onto the software structures is immediate (and therefore recognisable by the customer), and the software directly controls both the physical process and the information exchange with users in the production line (and therefore the customer must know about the software structures).
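As a purely illustrative sketch (in Python, with invented attribute names; not the actual data model), making the model-specific terms explicit types keeps their formal meaning separate from the everyday meanings of ‘lot’, ‘carcass-ID’ or ‘item’:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BaseLot:
    """Core-model concept 'base-lot': the modelling notion of a lot,
    deliberately distinct from the everyday word 'lot'."""
    base_lot_id: str
    origin: str          # e.g. supplier or production line

@dataclass(frozen=True)
class Token:
    """Slaughtering-line model: system-internal identification of an
    individual carcass, independent of the various external carcass-IDs."""
    token_id: int

@dataclass(frozen=True)
class AnatomicalItem:
    """Deboning model: an individual part of a carcass."""
    item_id: int
    parent_token: Token  # traceability back to the individual carcass
```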


6.3 High Level Business Modelling

Independent of this modelling work for software redesign, and starting somewhat earlier, we created a representational model for the business processes of meat processing/packaging/pricing for retail chains. The intention of this process model was to find common ground between us and our customers and to create standardised solutions for this kind of business. Because of the clear distinguishing characteristics of this market, with very high service levels, daily fresh production, and high sensitivity of demand to discounts and weather, we could model the processes in a simple and straightforward model of about 15 base processes, and define the information flows between the processes in a very compact list that could be printed on a single A4 sheet. This specialised model defined, for example, the finished-product identification as a combination of the item-id, the consumer price on the label, and a ‘discount-id’ (the latter identifying discount markings on the package such as a discount sticker). This standardised composite identification was subsequently used in demand planning, production planning, production line control, ordering and warehousing. The software model had basically a one-to-one relation to the information model and the process model, as exemplified by the finished-product identification.

Later we developed this model further into a general business process model for food processing. This new model was intended as an aid to discuss heterogeneous system landscapes with our customers and fellow IT suppliers. The customer wants to know how his processes are covered by the IT systems, and in a heterogeneous system landscape he especially wants to be sure that there are no gaps, no overlaps, and above all no misunderstandings and fights about interfaces between the systems. IT suppliers are inclined to use common terms in a rather specific way when communicating about their software. For the customer this is confusing or unclear, especially when various IT suppliers are using the same words but with different meanings. Having a model representing the relevant business processes and their relations has proven to be helpful.

In a customer project the high level business model is nowadays also used as a background to create a descriptive representation of the company at hand (the description being itself a model). It helps in the communication with the customer to use such a general and encompassing scheme to pinpoint where processes are located in the customer’s organisation. The model also helps to check for completeness and to demarcate the project scope by specifying which processes are outside the project.
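A minimal sketch of such a composite identification, purely illustrative and with invented field names (the actual model was defined in the process documentation, not in code):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FinishedProductId:
    """Composite finished-product identification: item-id plus consumer
    price on the label plus a discount-id for discount markings."""
    item_id: str
    consumer_price_cents: int   # price printed on the label
    discount_id: str            # e.g. "NONE", "STICKER_30_PCT"

# The same composite key can then be used consistently in demand planning,
# production planning, line control, ordering and warehousing, e.g.:
planned_quantities = {
    FinishedProductId("CHK-FILLET-400G", 399, "NONE"): 1200,
    FinishedProductId("CHK-FILLET-400G", 399, "STICKER_30_PCT"): 150,
}
```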

6.4 Modelling Process Logic

Modelling for software design and modelling for understanding of the business are at the two far ends of the modelling steps for designing a business information system. The former way of modelling is situated primarily in the system world, is highly formalised and is capable of being tested for completeness and consistency. Meaning in the model is rigid. Although the model may use terms that are known from everyday speech, in the context of the software model these terms lack the breadth and dynamics of meaning of normal natural language. Business modelling, on the other hand, is situated in the social world, is formulated in natural language and lacks formal rigidity. Our way of bridging this gap is the modelling of the process logic of the business processes. The process logic is meant to model the invariant structures underlying the business processes of different companies in the same markets, and it should therefore be understandable for people from the business. At the same time the process logic must be formulated in precise language, and be capable of checks for completeness and consistency. Invariant dynamic elements must also be modelled in the process logic, for example by state transition diagrams or interaction diagrams.

Let us take the issue of quarantine as a simple example. It is a concept of the social world, and quarantine can be defined by two basic pragmatic rules: (1) when (a part of) a product lot is in quarantine, its use in normal processes is not allowed; and (2) quarantined product can be either (a) unconditionally released for general use, (b) conditionally released for specific use, or (c) removed and destroyed. The above holds true for any instantiation of quarantine. The terms in these two rules are “quarantine”, “product lot”, “normal use”, “conditionally released”, “specific use”, and “destruction”. All terms are perfectly understandable by business people, and all terms except “quarantine” are liable to highly flexible interpretation in business practice.

The two basic rules can be implemented in the physical world by creating a dedicated quarantine storage room with a lock on the door, by a designated quarantine area in a general storage room, or by labelling (the part of) the product lot with big red labels with appropriate texts. In combination with organisational procedures all variants can do the job, and a combination of safety and practical considerations will determine how it is done in practice. The second rule can again be implemented in the physical world by moving (the part of) the product lot from one dedicated or designated area to another. In a computer system the information carried by the terms “quarantine”, “normal process” and so on can be conveyed by the description of storage locations or processes, or by an attribute on the product lot. In the first case the rules cannot be checked by the system, in the second case the system can check them. In the first case interpretation and judgement is a matter of human interpretation (checked by organisational procedures for accountability), in the second case the system can prohibit attempted registrations that contradict the rules. Different companies can make different choices here, but all choices will have to implement the two basic rules in their business processes. The two basic rules represent the process logic, the actual business processes can be mapped onto the process logic, and the analysis of the way the processes are mapped onto the rules is an essential part of the preparation phase for the design and implementation of a new information system.

Company-specific dynamic aspects of the business processes (sequences, timing, coordination, iterations) are not represented in the process logic model, and neither are the “centres of experience” and the “centres of conflict solving”. However, these non-represented idiosyncratic elements are vital for understanding a company and its processes. They determine the way an individual company executes its normal processes and how the company deals with irregularities and conflicts of norms. Some will be absorbed by operational decisions in the processes.
Others will be solved by decisions of people who have authority because of their organisational position, their competence, or both. Exchange and interpretation of relevant information in these kinds of irregular processes are often dependent on tacit knowledge and hard to identify. At the same time, to some extent the competitive power of a company will depend on its ability to deal with non-standard situations, and this ability must not be hampered or eroded by the implementation of new information systems focusing solely on the structured flows of information.
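The two quarantine rules discussed above lend themselves to a compact formal check. The following is a minimal, hypothetical sketch in Python of how a system could enforce them via an attribute on the product lot; it is illustrative only and not taken from any of the systems described here.

```python
from enum import Enum, auto

class LotStatus(Enum):
    NORMAL = auto()
    QUARANTINED = auto()
    CONDITIONALLY_RELEASED = auto()  # released for specific use only
    DESTROYED = auto()

class ProductLot:
    def __init__(self, lot_id: str):
        self.lot_id = lot_id
        self.status = LotStatus.NORMAL

    def quarantine(self) -> None:
        self.status = LotStatus.QUARANTINED

    def use_in_normal_process(self) -> None:
        # Rule 1: quarantined product may not be used in normal processes.
        if self.status != LotStatus.NORMAL:
            raise PermissionError(f"lot {self.lot_id} is not released for normal use")

    def release(self, unconditional: bool) -> None:
        # Rule 2(a)/(b): unconditional or conditional release from quarantine.
        if self.status != LotStatus.QUARANTINED:
            raise ValueError("only quarantined lots can be released")
        self.status = LotStatus.NORMAL if unconditional else LotStatus.CONDITIONALLY_RELEASED

    def destroy(self) -> None:
        # Rule 2(c): removal and destruction.
        self.status = LotStatus.DESTROYED
```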

6.5 Reference Models

Especially in the German literature quite a lot has been written about the development and use of reference models for business information systems. Nonnenmacher defined in 1994 the concept of a reference model as related to the concepts of recommendation, ground, and relationship [12]. According to this author, the relationship of a reference model to an individual model must be that the reference model provides the ground for the individual model. What holds true for the reference model must also hold true for the individual model (Nonnenmacher: “generally valid statements for individual models should be derivable from the reference model”, my translation). The individual model would be a specific interpretation of the reference model. Later on, Becker et al. defined reference models as information models that are created in one context and used in another [13]. Their reference model is considered to be a normative model that prescribes how things should be done. The latter definition seems to presuppose a lateral relationship between reference model and individual model (created in one context and used in another), where Nonnenmacher clearly declared this relationship to be hierarchical.

In 2007 Fettke and Loos edited a reference book with a collection of articles about reference models. In their introductory chapter the editors write that the term “reference model” is popular in both academia and practice, but that the term is used to designate different objects. The editors then distinguish three different features often attributed to a reference model: (1) best practices, (2) universal applicability, and (3) reusability. According to Fettke and Loos, these features have “an appealing intuitive meaning”, but a satisfactory definition of a reference model was lacking at the time of writing [14]. The general idea is that a reference model is used as a platform to generate an individual model. Vom Brocke [15] and Kirchmer [17] discuss how reference models can be instruments in customer projects to develop customer models in less time and with higher quality. To elaborate this point Vom Brocke categorises reference models as designed to be used by analogy, by specialisation, by aggregation, by instantiation, or by configuration. In applying a reference model in a customer project, a reference model designed to be used by analogy has the lowest development costs and brings the lowest savings in the customer project; configurable reference models have the highest development costs but bring the highest savings. It seems to me that in this approach the term “standard solution” would be more applicable. But, then again, perhaps the commercial appeal of “standard solution” is much less than that of “reference model” or “best practices”.

Quite another meaning of reference model is found, for example, in the OSI reference model (published in 1984), which specifies a framework for human communication about computer communication [18]. It defines the basic elements of the framework as well as the relations between the elements. This reference model differs in two important aspects from the reference models discussed above: the OSI reference model is formulated in precise and formalised language, and the model is used as a reference (not as a model that can be copied and adapted, or configured). Other examples of such reference models are the Reference Model for Open Distributed Processing [19] and the ISA-95 model for Manufacturing Operations Management [20]. These kinds of models precisely define frameworks for computer communication in their respective domains, as well as for human communication in developing the applications. They are to be used as a background for developing solutions, and their meaning conforms to the definition of Nonnenmacher discussed above.

The high level business models and the process logic models could be regarded as reference models in the latter meaning. They are meant to provide a background for communication in their respective domains. They are not meant to be blueprints to be applied directly (in the sense of configuration or adaptation) in customer projects, and neither do they reflect ‘best practices’. The models are created by logical analysis of both the business world of markets and products and the process world, and not by empirical generalisation of successful practices.

6.6 OED and Stachowiak on Models

The OED gives the following major semantic division for the meanings of the noun “model”: (1) a representation of structure, (2) a type of design, and (3) an object of imitation [21]. For convenience, I will label these meanings as representational models, design models, and imitation models. In the first meaning the world is a given and the model a representation of this given world; in the second meaning the artefact does not yet exist and the model represents something in the world that will be. In the third meaning the model specifies something in the world that will be replicated. In the first two meanings the model is a representation of aspects of the world, in the third meaning the model coincides with something in the world.

These three main groups of meaning correspond with three different uses of modelling in the development of an information system. In analysis we use our high level business models as instruments to understand the outside world and to communicate our understanding. In software design we create design models as essential constructs of an artefact-to-be. Process logic models are to be considered primarily representational models, as they are meant to represent the underlying structure of the business world “out there”. Reference models such as the OSI reference model are design models. Reference models as discussed in the context of business information systems are primarily imitation models, because they lack a rigid foundation and seem to be meant to “apply and adapt” or “apply and configure”.

In his General Theory of Models Stachowiak defines three essential aspects of models: Representation (“Abbildung”), Reduction (“Verkürzung”) and Pragmatics [22]. According to this approach, a model represents a choice of elements of something in the world for a certain purpose in a certain context. For our discussion it is important to distinguish between two different kinds of worlds: the systems world and the business world. The systems world is ultimately represented in a rigid formal sign system (the computer with its formal rules for manipulating structured data); the business world is ultimately represented in a social sign system with conventional meaning and interpretation in a social context. Another useful distinction is the degree of formality in modelling. In Table 1 these criteria are applied to different kinds of models:

Table 1. Types of models and their characteristics

Implicit models: Are used for imitation, are related to both the business world and the system world, have no clear and explicit criteria for what to include, are not explicitly represented.

Software models: Are used for design, are related to the systems world, are highly formalised, represent explicitly the formal structures of the software-to-be.

High level business models (generic): Are representative, are related to the business world, represent generic business process structures, are explicitly represented in informal but generic business language.

Descriptive business models (for an individual company): Are representative, are related to the business world, represent specific business process structures in the individual company, are explicitly represented in informal and company-specific business language.

Process logic models: Are representative, are primarily related to the business world and secondarily to the systems world, represent the generic structure of individual business processes in semi-formal language (with precise definitions).

Reference models: Are meant to be copied and adapted (“best practices”, “reusability”), are related to the business world and to the system world, are represented in informal language and schemata.

7 Conclusion: IT Systems as Models

Any IT system used in a company for either supporting or executing its business processes embodies ipso facto a model of the business. Where an IT system executes a business process autonomously and without human intervention, the IT model and business process reality coincide. (That is: when we consider only the process in isolation; the input or output of the isolated process might well be manipulated by other systems or by human intervention!) Where an IT system provides information for other process actors (either humans or automated systems) for their actions and decisions, the IT model does not coincide with business process reality. Following the General Model Theory of Stachowiak, the IT model can be viewed as a reduced representation of business process “somethings”. In developing the IT system, the business world is reduced to the IT model. In doing that, the IT system must be regarded as part of an encompassing information system that structures and defines all relevant information flows, both inside and outside the IT system.

In developing the information system, the process logic model must provide a stable, valid and invariant structure for the IT system, thereby providing a first ground for the applicable reductions. The analysis of the differences between the process logic and the actual processes, the subsequent analysis of the causes and reasons for the specific ways a company executes its business processes, and finding out about the impalpability of certain business processes and the blurred boundaries between certain processes should provide a framework for deciding which information can be processed effectively and efficiently by an IT system, and which information should be made available by other means (background knowledge, informal notes, human consultation, markings on products). The beginning and ending of processes in IT systems should be clearly and precisely defined.

A nice example of the troubled relation between structured and formalised IT systems on the one hand and impalpable processes with informal information exchanges on the other hand can be found in Kirchmer [17]. He describes the problem of business processes executed by knowledge workers, such as can be found in product development. The problem for Kirchmer is that the processes involved have a high degree of unpredictability and are difficult to model in an IT system. I consider this an example of a general line of thinking in the IT world, where the IT system is not viewed as a model of the business world, but as itself being the business world (or as somehow being obliged to represent the business world without gaps). However, if these difficult processes are isolated as black boxes, and if the interactions between the black box and the normal operational business processes are clearly defined (as any company would do, to avoid the risk of ad hoc disturbances of its normal processes), the ‘difficult’ processes can be both isolated from and embedded in the normal operational processes. In line with what was discussed above, inside the black box processes can be organised by a combination of structured meetings and ad hoc consultations between the people involved. Kirchmer writes about his personal experiences in this field and mentions organisational mechanisms like continuous discourse, updating knowledge maps, bringing ‘unrelated’ people into contact, and personal leadership instead of rigid rules. From my experience on the shop floor I do not consider the process described by Kirchmer an exception or an isolated and rare phenomenon. Any IT system will process information according to a reductive model of the business with fixed and deterministic rules. However, business processes and their outcomes are almost never completely predictable. Each and every person who is not acting fully mechanically is partly a knowledge worker whose job it is to interpret the environmental information and to adapt to circumstances.

The most important difference between much conventional IT thinking and the approach described in the sections above is that I view IT systems as instruments incorporating reduced models of the business, and not as encompassing systems that more or less coincide with the business processes. The role of the modelling for software design, the modelling of the process logic, and the high level business modelling is to find the right place for the powerful but reductive IT systems.

References

1. Simon, H.A.: Administrative Behavior. The Free Press, New York (1976)
2. Galbraith, J.R.: Designing Complex Organizations. Addison-Wesley, Boston (1973)
3. Mintzberg, H.: The Structuring of Organizations. Prentice Hall, Englewood Cliffs (1979)
4. Weber, M.: Economy and Society. University of California Press, Berkeley (1968)
5. Coase, R.H.: The nature of the firm. In: Williamson, O.E., Winter, S.G. (eds.) The Nature of the Firm: Origins, Evolution, and Development, pp. 18–33. Oxford University Press, Oxford (1993)
6. Kay, J.: Foundations of Corporate Success. Oxford University Press, Oxford (1998)
7. De Geus, A.: The Living Company. Nicholas Brealey Publishing, London (1997)
8. Weick, K.E., Sutcliffe, K.M.: Managing the Unexpected. Jossey-Bass, San Francisco (2001)
9. Patriotta, G.: Organizational Knowledge in the Making. Oxford University Press, Oxford (2003)
10. Boisot, M.H.: Knowledge Assets. Oxford University Press, Oxford (1998)
11. De Groot, A.D.: Thought and Choice in Chess. Mouton Publishers, The Hague (1978)
12. Nonnenmacher, M.G.: Informationsmodellierung unter Nutzung von Referenzmodellen. Peter Lang, Frankfurt am Main (1994)
13. Becker, J., Niehaves, B., Knackstedt, R.: Bezugsrahmen zur epistemologischen Positionierung der Referenzmodellierung. In: Becker, J., Delfmann, P. (eds.) Referenzmodellierung: Grundlagen, Techniken und domänenbezogene Anwendung. Physica Verlag, Heidelberg (2004)
14. Fettke, P., Loos, P.: Perspectives on reference modeling. In: Fettke, P., Loos, P. (eds.) Reference Modeling for Business Systems Analysis. Idea Group, London (2007)
15. Vom Brocke, J.: Design principles for reference modelling: reusing information models by means of aggregation, specialisation, instantiation and analogy. In: Halpin, T., Krogstie, J., Proper, E. (eds.) Innovations in Information Systems Modeling. IGI Global, London (2009)
16. Boisot, M.H.: Knowledge Assets. Oxford University Press, Oxford (1998)
17. Kirchmer, M.: High Performance Through Business Process Management, 3rd edn. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-51259-4
18. Open Systems Interconnection – Basic Reference Model. ISO/IEC 7498-1. ISO/IEC, Genève (1994)
19. RM-ODP, Information Technology – Open Distributed Processing – Reference Model: Overview. ISO/IEC 10746-1. ISO/IEC, Genève (1998)
20. ANSI/ISA-95.00.01-2010 (IEC 62264-1 Mod) Enterprise-Control System Integration Part 1: Models and Terminology. ISA, Durham (2010)
21. OUP: The Oxford English Dictionary. Oxford University Press, Oxford (1989)
22. Stachowiak, H.: Allgemeine Modelltheorie. Springer, Wien (1973)

Combining Business Process Variability and Software Variability Using Traceable Links

Andreas Daniel Sinnhofer1(B), Peter Pühringer3, Felix Jonathan Oppermann2, Klaus Potzmader2, Clemens Orthacker2, Christian Steger1, and Christian Kreiner1

1 Institute of Technical Informatics, Graz University of Technology, Graz, Austria
[email protected], {steger,christian.kreiner}@tugraz.at
2 NXP Semiconductors, Gratkorn, Austria
{felix.oppermann,klaus.potzmader,clemens.orthacker}@nxp.com
3 Gratkorn, Austria
[email protected]

Abstract. Nowadays, domains like Cyber-Physical Systems (CPS) and the Internet of Things (IoT) are highly affected by short product cycles and high pricing pressure. Business process oriented organizations are known to perform better in such flexible environments. However, especially industries which are focused on delivering low-cost systems face big challenges if the according business processes are not aligned with the capabilities of the product. Consequently, development effort is spent on features which are never addressed by any business goal. With this work, we propose a combined variability management in order to create an integrated view on the product variability from an organizational as well as from a technical point of view. Using this approach helps to identify business drivers and to establish a mature product line development.

Keywords: Software product lines · Feature oriented modelling · Business processes · Tool configuration · Variability modeling

1 Introduction

We are living in an ever changing and interconnected world. The dawn of the Internet of Things (IoT) further increases the pressure on organizations to deliver feature-rich systems in high quantities and at low costs. Further, the market demands an increasing number of new features with every product release, leading to very tight development cycles. Due to the pricing pressure and the shortened development time, methods have to be investigated which allow modular and highly configurable systems, such that the products can be adapted and easily extended to the current requirements of the market. Business Process (BP) oriented organizations are known to perform better regarding highly flexible demands of the market and fast production cycles [1–4].


These goals are achieved through the introduction of a management process in which business processes are modeled, analyzed and optimized in iterative improvement cycles. In recent years, business process management has further been coupled with workflow management in order to monitor the correct execution of the processes and to integrate responsibilities into the process models. Additionally, context-aware business process modeling techniques were introduced in order to cope with fast changing requirements [5]. Such context-aware systems gain flexibility by analyzing the context states of the environment and by mapping the according processes to their related software systems. One of the problems of this approach is that such software systems are often developed independently from each other, although they share a similar software architecture.

Software Product Lines (SPL) have proven to be essential for the development of flexible product architectures which can be adapted to the current requirements [6]. Thus, Software Product Line Engineering promises to create diverse, high-quality software products of a product family in short time and at low cost. This is achieved through the use of a common architecture and reusable product features. The most critical phase during the design and the implementation of a product line is the identification of the variable parts and the common parts of the product family [6]. Consequently, a lot of effort is invested to identify the domain requirements of the final product portfolio. Equally important is the selection of the according features during application engineering: it has to be guaranteed that the customer requirements are fully met; further, all unnecessary features need to be excluded in order to ensure low production costs of the final product. Since the identification of the domain requirements is usually carried out by developers, an integrated view on the organizational goals is often missing or incomplete. This means that a stable feature architecture is only achieved after a few iterations. Consequently, the efficiency of the product line is reduced, since additional effort needs to be invested in order to create a product line adhering to the current requirements.

This work focuses on the development of a framework which aims to enforce a link between the variability of the business processes and the variability of the product platform. As such, we propose a combined variability modeling in which the requirements for the organization as well as for the development of the product platform are identified together in an integrated fashion. Consequently, developers and business process experts gain insight into the respective other domain, leading to a more mature development process. After identifying the requirements, order processes are designed which reflect the possible product configurations that can be ordered by a customer. These variable order processes are further used to automatically trigger the product customization process in order to reduce the production costs of the final product. This work is based on our previous works, in which we already defined systems for modeling the variability of business process models (see [7]) as well as a framework for generating software configurations based on order processes [8–10].

This work is structured in the following way: We present related work in Sect. 2. Section 3 summarizes basic concepts about business process modeling as well as software product line engineering.


Section 4 summarizes our approach to link variable order process models to variable software architectures in an automatic way. In Sect. 5 we describe how we applied the introduced concepts in an industrial use case and present a simplified example for illustration purposes. Further, we give an insight into some implementation details. Since the identification of business drivers is essential for an organization to survive in a competitive market, we show in Sect. 6 how we were able to identify improvement opportunities by analyzing the results of our framework. Finally, we conclude this work in Sect. 7.

2 Related Work

Traditionally, business process modeling languages do not explicitly support the representation of families of process variants [11]. As a consequence, a lot of work can be found which tries to extend traditional process modeling languages with notations to build adaptable process models. Such adaptable process models can be customized according to domain requirements by adding or removing fragments and by explicitly transforming the model into dedicated process variants which can be executed in the field. This promises to increase the flexibility of business process oriented organizations with respect to highly flexible requirements of the market. Having such a variability modeling for business process models builds the foundation of this work. Thus, related work utilizing similar modeling concepts is presented in the following.

Derguech [12] presents a framework for the systematic reuse of process models. In contrast to this work, it captures the variability of the process model at the business goal level and describes how to integrate new goals/sub-goals into the existing data structure. The variability of the process itself is not addressed in his work. Gimenes et al. [13] present a feature-based approach to support e-contract negotiation based on web services (WS). A meta-model for WS-contract representation is given, and a way is shown how to integrate the variability of these contracts into the business processes to enable process automation. It does not address the variability of the process itself but enables the reuse of business processes for different e-contract negotiations.

While the framework we use to model process variability reduces the overall process complexity by splitting the process into layers with increasing detail, the PROVOP project [14–16] focuses on the concept that variants are derived from a basic process definition through well-defined change operations (ranging from the deletion, addition and moving of model elements to the adaptation of an element attribute). In fact, the basic process expresses all possible variants at once, leading to a big process model. Their approach could be beneficial considering that cross-functional requirements can be located in a single process description, but having one huge process is also counterproductive (e.g. the exchange of parts of the process is difficult). The work of Gottschalk et al. [17] presents an approach for the automated configuration of workflow models within a workflow modelling language.


The term workflow model is used for the specification of a business process which enables its execution in an enterprise and workflow management system. The approach focuses on the activation or deactivation of actions and is thus comparable to the PROVOP project for the workflow model domain. La Rosa et al. [18] extend the configurable process modelling notation developed in [17] with notions of roles and objects, providing a way to address not only the variability of the control flow of a workflow model but also of the related resources and responsibilities. The Common Variability Language (CVL [19]) is a language for specifying and resolving variability independently of the domain of the application. It facilitates the specification and resolution of variability over any instance of any language defined using a MOF-based meta-model. A CVL-based variability modeling and a BPM model with an appropriate model transformation could lead to similar results as presented in this paper. The work of Zhao and Zou [20] shows a framework for the generation of software modules based on business processes. They use clustering algorithms to analyze dependencies among data and tasks captured in business processes. Further, they group the strongly dependent tasks and data into software components.

3 Background

This section summarizes the basic concepts of Business Process Modeling and Software Product Line Engineering which are applied in this work. Further, our previous publications – which form the foundation of this work – are briefly summarized.

3.1 Software Product Line Engineering

Software Product Line Engineering (SPLE) applies the concept of product lines to software products. As a consequence, SPLE promises to create diverse, high-quality software products of a product family in short time and at low costs [6]. Instead of writing software for every individual system, a Software Product Line (SPL) is used to automatically generate software products by combining the required domain artifacts. The principal concept can be split into two main phases: Domain Engineering and Application Engineering [6,21].

During Domain Engineering, the domain artifacts, the variabilities and the commonalities of the according domain are identified and implemented. The modeling of the domain is usually carried out using Feature-Oriented Domain Analysis (FODA) [22] to explicitly state all dependencies. Domain artifacts are reusable development artifacts like the software architecture or software components, including their corresponding unit tests [6]. The domain engineering is split into five sub-processes:

1. Product Management: The first process; it deals with the economic aspects of the product line and defines the product roadmap.
2. Requirements Engineering: During this process, reusable and model-based requirements are defined. Hence, the requirements are not specified for one particular application, but contain the common and variable requirements for all products of the SPL.
3. Domain Design: During this process, the variable software architecture is designed and the variability model that was created in the previous process is refined. The key outcome is the so-called architectural design, which defines common ways to deal with variability in the Application Design and Application Realization of the Application Engineering phase.
4. Domain Realization and Domain Testing: These are closely tied together and deal with the implementation of the reusable software components and their according component tests.

In the Application Engineering phase, the final products are created by combining the domain artifacts which were implemented in the previous phase. Similar to the Domain Engineering, the Application Engineering is split into the four phases Requirements Engineering, Application Design, Application Realization and Testing. In contrast to the domain engineering, the application engineering mainly focuses on reusing domain artifacts. Based on the current requirements of the product, specific domain artifacts are chosen and assembled into the final product. Ideally, the application engineering makes use of software generators to automatically derive product variants without the need of implementing any new logic. This enables the rapid creation of high-quality products within a defined product family. The amount of reused domain artifacts greatly depends on the application requirements. Hence, a major concern of the application engineering is the detection of deltas between the application requirements and the available capabilities of the SPL.
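To make the idea of a feature model concrete, here is a minimal, hypothetical sketch in Python (the feature names and constraints are invented for illustration and do not come from the paper's case study): features are declared as mandatory, optional or alternative, and a product configuration is checked against these constraints before artifacts are assembled.

```python
# Minimal feature-model sketch: mandatory/optional/alternative features
# plus a validity check for a concrete product configuration.

FEATURE_MODEL = {
    "mandatory": {"core_runtime"},
    "optional": {"remote_update", "audit_logging"},
    # exactly one variant must be chosen from each alternative group
    "alternatives": {"crypto_backend": {"software_crypto", "hardware_crypto"}},
}

def is_valid_configuration(selected: set[str]) -> bool:
    """Check a feature selection against the feature model."""
    if not FEATURE_MODEL["mandatory"] <= selected:
        return False  # a mandatory feature is missing
    for group in FEATURE_MODEL["alternatives"].values():
        if len(selected & group) != 1:
            return False  # alternative groups require exactly one choice
    known = (FEATURE_MODEL["mandatory"] | FEATURE_MODEL["optional"]
             | set().union(*FEATURE_MODEL["alternatives"].values()))
    return selected <= known  # no unknown features

# Example: a valid variant of the (hypothetical) product family
print(is_valid_configuration({"core_runtime", "hardware_crypto", "audit_logging"}))
```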

3.2 Business Process Modeling

Business Processes (BP) are sequences of activities or (sub-)processes which are executed in a dedicated order to produce output with value to the customer [2,9]. In this work, we use the concept defined by [24] to model BPs: BPs are modeled in different layers, where the top level (macroscopic level) is a highly abstract description of the overall process and the lower levels (microscopic level) are more detailed descriptions of the sub-processes. A reasonable level of detail is reached if the process description on the lowest levels can be used as work instructions for the responsible employees. This leads to the fact that the higher levels of the process description are usually independent of the production facility and the supply chains, while the lower levels are highly dependent on the production facility and its capabilities. As a consequence, the macroscopic level is more stable with respect to changes and can be reused in different contexts and production environments. The microscopic levels need to be updated in order to reuse them in different contexts.

An exemplary process showing a bank account creation request is illustrated in Fig. 1. As shown in the figure, the top level is a highly abstract description of the request process, while each sub-process is further described in the lower levels. For example, the Credit Assessment process can be refined with an additional process description such that it can be directly executed by the responsible assessor.

Fig. 1. Exemplary layered business process showing a simplified bank account creation request [23] (steps shown: New Account Request, Credit Assessment, Create Account, Create User Account, Send Card, Send PIN; alternative outcome: Request rejected)

Variability of such process structures can be modeled through a variable process structure (i.e. by adding/removing activities in a process) or by replacing process refinements with different sub-processes. In general, three main types of business processes can be distinguished (cf. [25]):

– Primary Processes: Each of the process activities adds a specific amount of value to the value chain. Consequently, such processes are also often referred to as Core Processes, since they cover the essential value creation of a company, that is, the production of goods and services for which customers pay [26].
– Support Processes: Processes which are designed to support the Primary Processes, like managing resources or infrastructure. Such processes do not directly add value for the customer but are essential to ensure the proper execution of the Primary Processes. Examples are human resources, procurement, but also research and development processes [26].
– Management Processes: Designed to monitor and schedule business activities like the execution of Primary Processes or Support Processes. While Management Processes do not directly add value for the customer, they are designed to increase the efficiency of the business activities.

Domain-specific modeling languages are usually used to model all the activities, resources and responsibilities within a business process. In the scope of this work, the Business Process Model and Notation (BPMN, [27]) is used to model processes, but the general concept of this work is not limited to this notation. The key concepts which are used in this work are summarized below [9,26,27]:


Events: Occur during the execution of a process and may affect the flow of the process. Events happen atomically, which means that they do not have a duration. For example, the start or the completion of an Activity are typical events that occur in every process. According to the BPMN specification [27], events are only used for those types which affect the sequence or timing of the activities of a process.

Activities: An Activity is a specific amount of work that has to be performed by the company – or another organization – during the execution of a process. Two different types of activities can be distinguished: an activity that is rather simple and can be seen as one single unit of work is called an atomic activity (also referred to as a task); the other type are non-atomic activities (e.g. sub-processes). For example, if checking received goods from a supplier only covers the activity of checking the amount of received goods, it can be modeled as a single task. If, on the other hand, the process of checking received goods contains a sequence of activities, like checking that the goods are within a defined specification or are in a usable state, it is called a non-atomic activity or sub-process.

Gateways: Are used to control how the process flows through different sequences of activities. Each gateway can have multiple input and/or output paths. One example is a decision, where out of many possibilities only one path is selected based on specific decision(s). The decision can be coupled to conditions or events which are triggered during the execution of the process. For example, if the inspection of the received goods was negative, the responsible worker may decide to return the goods to the supplier.

Data: Data objects represent the information flow through the process. Two types of data objects can be distinguished: Input Data that is required to start a specific activity, and Output Data which is produced after the completion of an Activity.

Pool and Lane: Are used to model responsibilities for specific activities in a process. Responsibilities can usually be assigned to an organization, to specific roles or even to dedicated employees.
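As an illustration only (a simplified, hypothetical representation, not the BPMN metamodel and not the tooling used in this work), these concepts can be captured in a few data structures:

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class NodeKind(Enum):
    EVENT = auto()       # affects sequence or timing, has no duration
    TASK = auto()        # atomic activity
    SUBPROCESS = auto()  # non-atomic activity, refined on a lower layer
    GATEWAY = auto()     # controls the flow, e.g. a decision

@dataclass
class Node:
    name: str
    kind: NodeKind
    lane: str                                          # responsibility (pool/lane)
    inputs: list[str] = field(default_factory=list)    # input data objects
    outputs: list[str] = field(default_factory=list)   # output data objects

@dataclass
class Process:
    name: str
    nodes: list[Node]
    flows: list[tuple[str, str]]                       # sequence flows between node names

# Hypothetical fragment of the goods-receipt example from the text:
receipt = Process(
    name="Check received goods",
    nodes=[
        Node("Goods arrived", NodeKind.EVENT, lane="Warehouse"),
        Node("Check quantity", NodeKind.TASK, lane="Warehouse",
             inputs=["Delivery note"], outputs=["Check result"]),
        Node("Accept or return?", NodeKind.GATEWAY, lane="Warehouse"),
    ],
    flows=[("Goods arrived", "Check quantity"),
           ("Check quantity", "Accept or return?")],
)
```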

3.3 Managing Variability in Business Process Modeling

It is common practice for organizations to maintain multiple variants of business processes which are based on a common template [11]. This leads to the situation that similar process variants are created through a copy-and-clone strategy. As a consequence, maintaining these process variants is a time-consuming task, since every single process variant has to be manually updated by the according process designer. Besides the additional maintenance effort, using copy-and-clone strategies also has a negative influence on the process documentation. To solve these issues, we proposed a Software Product Line approach for the derivation of process variants from business process models (see [7–9]). The concept can be split into four different phases:


Process modeling: During the process modeling, process designers are responsible for designing process templates. The process templates are designed using the BPMN notation, and additional artifacts are integrated, like documentation templates, responsible roles, resource allocations, etc. The process templates are designed in an appropriate BPM tool to fully support the process designers during the design process. The process of designing the process templates and the process of creating the according domain model go hand in hand in an iterative manner, to ensure that the created templates can be reused in many different contexts.

Domain modeling: During this process, the created templates are imported into a Software Product Line tool and translated into a so-called feature model (see Sect. 3.1). During the creation of the feature model, it has to be decided which parts of the process are designed to be variable and which parts are static. For illustration purposes, the following example is given: A company creates car parts for two major car manufacturers. While the overall process for creating the car parts is identical for both customers, different production planning strategies are used to optimize the material usage (e.g. stock size, etc.). E.g., for customer X, the company employs event-driven Kanban, and for customer Y, the company uses Kanban with a quantity signal. Thus, the domain engineer chooses the production planning strategy to be variable by defining a Variation Point (VP). He deposits the different Kanban implementations as possible variants for the VP so that two processes can be generated based on the same model (a sketch of such a variation point is given at the end of this subsection). The definition of variable parts and static parts happens in close cooperation with the according process designers and may even lead to a re-iteration of the first phase if some process templates need to be adapted. Not every combination of variants may create meaningful process variants. As a result, a comprehensive list of restrictions and rules has to be designed as well, to guarantee that only valid and meaningful process variants can be created by the product line. The list of rules and restrictions has to be kept flexible as well, since not every restriction may be identified in advance when the initial process model is created. Consequently, re-iterations of the restriction model are common after collecting evaluation data from the execution of the processes.

Feature selection: Based on the current requirements of the organization, process variants are created using the created feature model. This is done by selecting the required features from the model and by translating this feature selection into a valid business process structure. To ensure an automatic transformation, generators have to be developed which are able to translate between the business process model and the feature model. The defined rules and restrictions are enforced during this process to guide the domain expert in selecting a meaningful set of options. To continue the example from above, two process variants may be created for the two customers. The only difference between the processes is the production planning strategy.


Maintenance and Evolution: One of the most important phases is the maintenance and evolution phase. To be highly flexible and adaptive to the current requirements of the market, processes and their according models have to be continuously improved and adapted. As such, the derived processes are monitored by production experts during their time in use and evaluated against the requirements. Based on the collected data, the experts can either improve the feature selection of the used process (i.e. iterate back to phase 3), or issue a process improvement process (i.e. iterate back to phase 1). During a process improvement process, process templates are updated or created by the process designers and integrated into the existing feature model. For example, during the production for customer X, it was observed that event-driven Kanban was too slow to react to the customer needs. As a consequence, the production experts changed the production planning strategy to quantity-based Kanban to tackle these problems. Another possibility could be the observation that quantity-based Kanban was too general, since only one bin was recognized as being valuable for the production process. Consequently, a new process is designed and integrated into the existing feature model. Through the capabilities of the Software Product Line tool, it is possible to automatically propagate the changes of the process templates to every instance. As a consequence, no time-consuming and error-prone manual maintenance process is necessary to adapt all the existing process variants. Since it may happen that some of the process variants shall not be updated in case of changes, version control can be used to explicitly state which version of a template shall be used.

Today's business environment is focused on creating sustainable value by increasing the revenue of business drivers. The identification of such business drivers, or the identification of the drivers which are able to destroy value, is an essential step for an organization [28]. Otherwise, staying competitive or even surviving in a flexible market is not possible. The combination of business variability and software variability is a promising way to improve the identification of such drivers: linking domain artifacts and their according implementation/maintenance efforts to the respective business processes can reveal improvement opportunities. Further, having a combined view of the requirements helps to increase the overall efficiency of the product line.
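The variation point from the car-part example could look roughly as follows. This is a hypothetical sketch (the tooling described here is a commercial SPLE tool, not Python code); it only illustrates how a VP with deposited variants and a simple restriction can drive the generation of two process variants from one model.

```python
# Sketch of a variation point with deposited process variants and a
# restriction, resolved per customer to derive a concrete process variant.

VARIATION_POINT = {
    "name": "production_planning",
    "variants": ["event_driven_kanban", "quantity_signal_kanban"],
}

def resolve(selection: dict[str, str]) -> list[str]:
    """Restriction: exactly one deposited variant must be selected for the VP."""
    chosen = selection[VARIATION_POINT["name"]]
    if chosen not in VARIATION_POINT["variants"]:
        raise ValueError(f"invalid variant: {chosen}")
    # Common process steps plus the variant-specific planning step.
    return ["receive_order", chosen, "produce_car_part", "ship_to_customer"]

print(resolve({"production_planning": "event_driven_kanban"}))     # customer X
print(resolve({"production_planning": "quantity_signal_kanban"}))  # customer Y
```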

4 Combined Variability Modeling

The goal of this work is the automatic generation of software products based on the product order. To achieve this goal, an integrated view is necessary in which the variability of the software product is reflected in a variable order process. Since usually only a few configuration options are exposed to the end-customer, all internal processes also need to be covered by the variability management process. The overall concept of the resulting process is highlighted in Fig. 2. As illustrated, the Process Variability Framework described in Sect. 3.2 is used as a foundation to model variable order process models. Based on these order process models, order entry forms are automatically generated, which need to be filled in by internal or external customers.

[Fig. 2 elements: Process Variability Framework (Process Model, Feature Model, Generators, Feature Selection, Feature Transformation, Maintenance/Evolve); generated Order Entry (e.g. Web-Interface); Customer and Internal Customer (configure); Domain Experts (verify); Process Variant; Order Data; Product.]

Fig. 2. Overview of the concept for combining order process variability and software variability to automatically derive software products [10].

The provided data is used as input for a product line which maps the provided order data to the customization options of the software product. Due to the feature mapping, the final product can be derived automatically without any manual step besides verification steps, which may be required by certification requirements. In order to achieve a binding between the order process models and the generated order entry forms, the following type model was introduced [9]:

– None: No special data needs to be submitted. Thus, a process node marked with None will not appear as a setting in the order entry form.
– Inputs: The abstract concept for the different input types described below. Each Input is mapped to a specific input type, defining the format of the input. For example, input data could be delivered in the form of a file, while configuration settings could be delivered as strings or integer values. Depending on the application domain, non-functional properties may also need to be modeled in the Input type. For example, if a security-critical product is developed, a customer may be asked to provide a cryptographic key which is used to authenticate the customer to the device. Besides providing this key, some kind of specification is also required, stating in which format this data is provided (e.g. PGP-encrypted).
– Customer Input: Specific data that has to be added by an external customer. A process activity marked with this type will generate an entry of a specific type in the order entry form. For example, a drop-down list will appear if a customer can select between different options.
– Internal Input: Specific data that needs to be added by an internal stakeholder. A process activity marked with this type will not generate an entry in the external customer order interface, but will create a separate order entry for the according internal stakeholder.
– Input Group: A set of inputs which are logically linked together. As a consequence, all of these inputs will be highlighted as a group in the generated order entry, and all of them are required for a single customization feature of the final software product.
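A minimal sketch of this type model is given below. The class and field names are illustrative assumptions and do not reflect the actual implementation of the described tooling.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class InputKind(Enum):
    NONE = auto()            # node produces no entry in any order form
    CUSTOMER_INPUT = auto()  # entry in the external customer's order form
    INTERNAL_INPUT = auto()  # separate entry for an internal stakeholder

@dataclass
class Input:
    name: str
    kind: InputKind
    data_format: str = "string"   # e.g. "string", "integer", "file"

@dataclass
class InputGroup:
    """Inputs that together realize one customization feature of the final product."""
    feature: str
    inputs: list = field(default_factory=list)

    def form_entries(self, internal: bool) -> list:
        wanted = InputKind.INTERNAL_INPUT if internal else InputKind.CUSTOMER_INPUT
        return [f"{self.feature} / {i.name} ({i.data_format})"
                for i in self.inputs if i.kind is wanted]

group = InputGroup("Configure Encryption Key", [
    Input("Encryption key", InputKind.CUSTOMER_INPUT, "string"),
    Input("Key format specification", InputKind.CUSTOMER_INPUT, "string"),
])
print(group.form_entries(internal=False))
```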


The type information has to be added to the process feature model of the SPLE tool. To support the domain experts in creating the according mappings, the following default rules are automatically applied by the SPLE tool based on the BPMN types [9]:

– Activities: Non-atomic activities are used to group a specific set of input parameters into a single feature. For example, a process designed for customizing an application may require several input parameters (such as user name, password, and license files). As a consequence, non-atomic activities will appear as an Input Group for all inputs defined by the according sub-process(es). Any atomic activity is automatically tagged with the input type "None". The input type "None" is also applied automatically if a non-atomic activity does not contain any data; consequently, "empty" non-atomic activities will not appear in the generated order entry form.
– Gateways: Are used to define the structure of the generated form. For example, for a decision node a drop-down selection will appear such that the customer can choose between different customization paths. For decisions it is further enforced that the customer can only select and submit the data for one single path.
– Data: Data to be provided by any entity involved in the process(es). With respect to our case study, "String" turned out to be a meaningful default value.
– Pools and Lanes: Are used to define the source of the input data. For example, a data node which is part of a company-internal lane will automatically be tagged as an "Internal Input", while data in an external lane will be marked as an "External Input". If pools and lanes are not used, "Internal Input" should be used as the default value to circumvent accidental exposure of internal configuration settings to the end-customer.

All default mapping rules can be manually overwritten by the Domain Expert during the creation of the process model. Changes to the process model (e.g. adding/removing/changing activities) are traced via unique identifiers and illustrated as a diff-model such that they can be reviewed by the Domain Experts. After the order process model has been successfully tagged, the according order entry forms can be generated. With respect to this work, we have chosen web-based forms since they are commonly used in practice. As illustrated in Fig. 2, the provided customer data is used to create the feature selection of the final product. A manual verification step is advisable in order to ensure that no mistake was made during the development of the translation logic. Additionally, for certification purposes it may also need to be proven that a verification was done to ensure that no customer-related data is confused with other products. The grouping information of the order entry is used to automatically select the required features. After the selection has been approved by the Domain Experts, it is automatically processed by the product-specific code generators of the product line, which utilize the provided order data to actually generate the according product.
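The default mapping rules above can be summarized as a small decision function. The sketch below uses a plain dictionary to stand in for a BPMN node; the keys, values, and returned labels are assumptions chosen for illustration.

```python
def default_input_type(node: dict) -> str:
    """Heuristic default mapping from BPMN node properties to order-entry types."""
    kind = node["kind"]
    if kind == "activity":
        # atomic or empty activities produce no entry; non-atomic ones group their sub-inputs
        if node.get("atomic", True) or not node.get("has_data", False):
            return "None"
        return "Input Group"
    if kind == "gateway":
        return "Drop-down selection"   # decision nodes become a single-choice control
    if kind == "data":
        # the lane decides the source; default to internal to avoid exposing settings
        lane = node.get("lane", "internal")
        return "Internal Input" if lane == "internal" else "Customer Input"
    return "None"

print(default_input_type({"kind": "activity", "atomic": False, "has_data": True}))  # Input Group
print(default_input_type({"kind": "data", "lane": "external"}))                     # Customer Input
```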


The result of this process strongly depends on the use case: It could be a binary file that is loaded onto the integrated circuits during production, a configuration script which is executed on the final product, or any other approach. We will discuss a script-based approach in Sect. 5 in more detail. Especially for new types of products it is very likely that new knowledge is gained on how to increase the efficiency of the whole process(es). A maintenance process for the product customization system is only required if changes to the generated order entry forms are necessary. The maintenance costs for the product line can be kept low, since the code generators and the model transformation logic only have to be updated once but can be reused for the whole product line.

5 Industrial Case Study

For illustration purposes, we will discuss our industrial case study in more detail, showing how process models are translated and how the final product is automatically derived. The implemented business processes of our industrial partner are controlled by an SAP infrastructure and are designed with the BPM tool Aeneis. Further, we are using the SPLE tool pure::variants to manage the variability of the business processes as well as the variability of the final product configurations. A more detailed description of the developed tool plugins can be found in our previous publications (see [7,8]).

5.1 Exemplary Sample Process

We will consider the following (simplified) example: A company is developing small embedded systems which are used as sensing devices for the Internet of Things (IoT). The devices are sold to distributors (referred to as customers in the following) in high quantities, which means that it is economically infeasible to configure each device manually. Furthermore, establishing customization toolchains for dedicated versions of the product may solve the issue of manually configuring products, but leads to high maintenance costs with respect to the involved tools. Thus, a flexible system has to be implemented that ensures that variable parts of the process are reflected by variable parts of the software architecture of the customization toolchain. The device is offered in three different variants with the following features (based on [9]):

Version 1: Senses the data in a given time interval and sends the recorded signal to a customer-operated web server which is used for post-processing the data. In the first version, the communication channel between the web server and the device is unprotected. During the order process, the customer is responsible for providing the connection string of the web server to the company.

Version 2: In addition to the basic features of the first version, this version allows encryption of the communication channel between the server and the node using symmetric encryption algorithms; as such, third parties are prevented from reading the data in plain text.


For simplicity of this example, it can be assumed that the encryption key is provided by the customer in plain text and is used in every product. This means that the customer has to specify only a single value, which is then loaded to all products during the production process.

Version 3: In addition to the basic features of the second version, this version allows customer applications to be run on the system. This requires that the customer submits a binary file which is loaded to the device during production.

[Fig. 3 panels: Version 1 (IP Address → Configure Web Server); Version 2 (IP Address → Configure Web Server, Key → Configure Encryption Key); Version 3 (IP Address → Configure Web Server, Encryption Key → Configure Encryption Key, Binary → Load Customer Applications).]

Fig. 3. Exemplary order processes for the three different versions of the IoT device, based on [9].

Traditionally, this would result in three different order processes which are formed via a copy-and-clone strategy (see Fig. 3): The order process of the first version is copied and extended for the second version, while the third version is an extended copy of the second version. This means that changes to the basic version would result in the maintenance of two other processes as well. Using our developed framework, all three process variants are derived from one common process model. As such, the same result is achieved as with manual preparation, but the maintenance costs can be reduced substantially since all variants are updated automatically. For illustration purposes, we assume that the final product can execute basic scripts in order to load data or install applications, which means that the result of the Software Product Line is a configuration script which loads/installs the required data onto the devices.
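The contrast between copy-and-clone and single-model derivation can be made concrete in a few lines: the three versions become three feature selections over one common model instead of three separately maintained processes. The feature names follow the case study; the constraint encoding and helper function are illustrative assumptions.

```python
# One common variant model for the three device versions.
FEATURES = {"Configure Web Server", "Configure Encryption Key", "Load Customer Application"}
MANDATORY = {"Configure Web Server"}
REQUIRES = {"Load Customer Application": "Configure Encryption Key"}

def derive(selection: set) -> set:
    """Complete and validate a feature selection against the common model."""
    unknown = selection - FEATURES
    if unknown:
        raise ValueError(f"Unknown features: {unknown}")
    selection = selection | MANDATORY
    for feature, needed in REQUIRES.items():
        if feature in selection and needed not in selection:
            raise ValueError(f"'{feature}' requires '{needed}'")
    return selection

version_1 = derive(set())
version_2 = derive({"Configure Encryption Key"})
version_3 = derive({"Configure Encryption Key", "Load Customer Application"})
```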

5.2 Software Product Line Engineering

As described in Sect. 3.1, the main goal of the Domain Engineering phase includes the identification and implementation of reusable Domain Artifacts. As such, we apply the transformation rules defined in Sect. 4 to identify the required artifacts. The default order entry (i.e. the one automatically generated by the process variability framework without applying any manual actions) is generated with the following settings: a string input box for defining the IP address of the web server and a decision whether to send the data encrypted or not. If encryption is selected, a string input box can be used to specify the plain-text encryption key; if this is set, an option appears that allows the customer to load custom applications, again as a string input box. The only problem with the default order entry is the type of the input 'Binary'. Thus, the Domain Expert adapts the process model by adding the hint that a binary file input should be used. Consequently, a file upload button appears in the order entry with which binary files can be added. Additionally, the Domain Expert could refine the types of the other inputs such that runtime checks can be performed when the customer fills out the order form. For example, rules can be applied so that the IP address has to be valid, or that the key has to have a correct length and format (e.g. Base64-encoded). These additional hints are directly mapped to the processes using the feature model of the SPLE tool: Each node in the BPM tool has a unique identifier which is used for the mapping. Thus, reusing parts of processes in other processes will automatically propagate this additional information. The generated order entry is illustrated in Fig. 4.

After pressing the submit button on the order entry form, the provided data is converted into an XML file and zipped together with all provided files into an archive. The XML file is necessary to ensure that the product configuration product line is able to automatically interpret the given zip file. Further, an XML file has the positive side effect of being human-readable, which is essential for manual verification steps. Additional data such as identifiers and time-stamps can be included in the archive to have a traceable link from placing an order to the actual manufacturing of the product.

Based on the identified structure of the order entry, the features of the Feature Model can be derived automatically. The resulting Feature Model is depicted in Fig. 5. For illustration purposes, a "requires" relationship between the web-server configuration and the data protection configuration is not highlighted, since the 'Configure Web Server' feature is a mandatory feature; consequently, this configuration is always part of the final product and does not need to be explicitly modeled. As highlighted in the figure, 'Configure Web Server' is modeled as a mandatory feature, while the 'Configure Encryption Key' and 'Configure Customer Application' features are modeled as optional. Furthermore, 'Configure Customer Application' requires that the 'Configure Data Protection' feature is selected. The 'IP Address', 'Encryption Key' and 'Executable' nodes are used to symbolize the configuration dependency of the feature they are attached to. These configuration dependencies, however, are not real features in the classical sense as defined by Kang et al. [22].
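One possible shape of the XML order file mentioned above is sketched below using Python's standard library. All element and attribute names and the example values are assumptions for illustration, not the actual schema of the described tooling.

```python
import xml.etree.ElementTree as ET

# Assemble a hypothetical order XML file as it could be zipped together with the uploads.
order = ET.Element("order", id="ORD-0042", timestamp="2017-07-03T10:15:00Z")
ws = ET.SubElement(order, "feature", name="Configure Web Server")
ET.SubElement(ws, "input", name="IP Address", type="string").text = "192.0.2.10"
enc = ET.SubElement(order, "feature", name="Configure Encryption Key")
ET.SubElement(enc, "input", name="Encryption Key", type="string",
              encoding="base64").text = "c2VjcmV0LWtleQ=="
app = ET.SubElement(order, "feature", name="Load Customer Application")
ET.SubElement(app, "input", name="Binary", type="file").text = "customer_app.bin"  # file inside the archive

print(ET.tostring(order, encoding="unicode"))
```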


Fig. 4. Exemplary order entry form that is automatically generated from the exemplary process model [10].

[Fig. 5 elements: Order Process; features Configure Web Server, Configure Encryption Key, Load Customer Application; attached inputs IP Address, Encryption Key, Binary; legend: Mandatory, Optional, Requires.]

Fig. 5. Exemplary feature model for the customization of the IoT device in three different flavors [10].

Finally, the three identified main features have to be implemented in the form of domain artifacts. For example, scripts can be developed which are used during Application Engineering. In principle, this would yield the desired output, since the scripts for whole product configurations could be generated automatically. The problem with this approach is that every individual feature would result in a dedicated script designed for a specific hardware. Thus, a lot of different scripts would need to be developed although they share a lot of commonalities. As a consequence, a flexible system is required which can be configured for the current requirements. Additionally, it has to be runtime-configurable to circumvent the implementation and deployment of new software releases, which may demand great effort.

[Fig. 6 elements: Function Description Language (XSD) with Language Primitives and Operations; Function Description (FD); FDL Interpreter; Library of Product Template (PT) scripts; Abstract Class Hierarchy; Submission (Sub); generated Perso. Script.]

Fig. 6. Framework for a flexible, runtime-configurable script generation system [10]

To overcome these issues, the framework illustrated in Fig. 6 was developed. Basically, it consists of three main components: The first component is the so-called Function Description Language (FDL), a domain-specific language designed for the creation of configuration scripts. The second component is the FDL Interpreter, which interprets the given submission files (Sub) and executes the functions of a given Function Description (FD) to generate the product configuration script. The third component is a library consisting of small reusable Product Template (PT) scripts which are designed to work with a wide variety of different platforms. Consequently, the maintenance costs for the script library can be kept low, since the scripts should not contain any platform-specific functionality. In the context of our industrial research, we used XML as the basis for the domain-specific language since it is easy to read for humans and easy to process for computer systems.
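To make the interplay of the three components concrete, the following minimal sketch interprets an FDL-like XML function description against submitted order data and a small product template library. The XML vocabulary ("function", "call", "param"), the placeholder syntax, and the template contents are assumptions for illustration and do not reproduce the actual language.

```python
import xml.etree.ElementTree as ET

# Stand-in for the Product Template (PT) script library.
TEMPLATES = {
    "set_server": "set_config --key server.ip --value {value}",
    "load_key":   "write_secure_element --slot key --value {value}",
}

def interpret(fd_xml: str, submission: dict) -> str:
    """Execute the calls of a Function Description against the submitted order data."""
    lines = []
    for call in ET.fromstring(fd_xml).iter("call"):
        template = TEMPLATES[call.get("template")]
        lines.append(template.format(value=submission[call.get("param")]))
    return "\n".join(lines)

fd = """<function name="configure-device">
          <call template="set_server" param="IP Address"/>
          <call template="load_key" param="Encryption Key"/>
        </function>"""
print(interpret(fd, {"IP Address": "192.0.2.10", "Encryption Key": "c2VjcmV0LWtleQ=="}))
```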

5.3 Traceability

Traceability is an important topic for an organization which is supposed to deliver high-quality products in an automated fashion. Through the use of our framework, a traceable link is established between the Order Process Model and the Feature Model of the Product Line. This means that every feature of the product can be linked to the according implementation artifacts of the product line. Through the use of requirements engineering tools, it is also possible to further create a link to the according test cases, which may be a requirement of specific certifications (e.g. functional safety). Having a traceable link between the order process and the according implementation artifacts also helps to find the business drivers or, equally important, the drivers which destroy value. This can be achieved by monitoring the development/maintenance costs of a product feature and by analyzing the product orders and the according revenue.
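A rough sketch of such an analysis is shown below: order frequency per customization option is contrasted with the recorded development effort. The order list, effort figures, and thresholds are made-up placeholders, not the data of the case study reported in Sect. 6.

```python
from collections import Counter

orders = [["opt_01", "opt_12"], ["opt_01", "opt_12", "opt_24"], ["opt_01"]]
effort_hours = {"opt_01": 40, "opt_12": 300, "opt_24": 250, "opt_25": 220}

usage = Counter(option for order in orders for option in order)
for option, hours in sorted(effort_hours.items()):
    count = usage.get(option, 0)
    if count == 0 and hours > 100:
        print(f"{option}: candidate for removal (high effort, never ordered)")
    elif count > 0 and hours / count > 100:
        print(f"{option}: candidate for improvement (high effort per order)")
```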

6 Evaluation

First results were already compiled in our previous works [8,9], in which we compared the development effort of "traditional software development" techniques with the overhead of the developed framework. We use the term "traditional software development" for software development with an ad-hoc (or near to ad-hoc) software architecture, which means that multiple different systems are designed almost independently but make use of copy-and-adapt strategies. Consequently, the maintenance efforts for such systems are rather high. We found that the economic break-even point of the developed framework is at around three to four systems. Further, the robustness of the customization process was increased, since automatic methods were used for the feature selection and thus configuration errors could be reduced significantly. Through the use of automatic methods, it was also possible to generate log files for certification purposes, which are used to verify that the provided customer data was loaded and not confused or manipulated.

In this work, we investigate other aspects of the developed framework, namely the identification of business drivers: We analyzed the development efforts of individual product features and contrasted them with the revenue that was earned by selling the according product configuration. The development efforts were extracted from the time recordings of the responsible workers and should give a reasonable estimate of the real development efforts. In total, 106 different product orders were analyzed, which provided 30 individual configuration settings. The results are illustrated in Fig. 7. The first 10 product configuration options are internal, system-specific configuration settings which are mandatory for every product. A business decision was taken to reduce the costs for the base system to a minimum level to ensure a low-cost base product. As a consequence, the revenue earned by the basic product configuration is rather low compared to other customization options. An interesting finding was that a lot of development effort was invested in complex features which were never used by any customer or are only rarely used. Due to this finding, it was decided that some of these features will be removed from the product in a future release, in order to reduce the overall costs. As illustrated in the figure, feature 12 required a lot of development effort, but is also frequently ordered by customers. Consequently, an improvement process was triggered in order to reduce the costs of this feature. After a first improvement process, current estimates showed that the overall revenue could be increased by a few percent after addressing the bad performance of features 12, 24 and 25. The increase in revenue is highlighted in the right part of Fig. 7. The analysis of the business drivers was integrated into a regular performance analysis in order to continuously work on process improvements and maximize the revenue.

Additionally, the combination of variability modeling for business process models and the according software product lines can help to increase the overall awareness of the involved teams: For example, non-functional aspects which need to be addressed in several stages of a product development get more attention and automatically link the involved developers/experts. For instance, security-related topics are not only covered as a side activity but are already

[Fig. 7 axes: Number of Products vs. Customization Options (1–30); left panel: Number of orders and Quantitative costs; right panel: Quantitative Revenue before and after the first improvement process.]

Fig. 7. Analysis of the development efforts and revenue to identify business drivers. The revenue and the costs are illustrated in a quantitative manner [10].

addressed during the early stages: on a business process level (e.g. how customers are able to securely share confidential data) and on the implementation level (e.g. how the data is securely processed/handled). After having discussed the positive aspects of the developed framework, we also want to address some limitations of the current implementation: While we were able to fully generate the required customization scripts for simple product configurations, we could only partially generate the scripts for complex product configurations, due to the high number of inter-feature constraints of the product features. This is not a technical problem of the approach, but achieving a complete coverage of all inter-feature constraints is a time-consuming and iterative process. Further, modeling all the constraints in advance is usually not possible for complex systems. As a result, we decided to model only basic constraints in advance and to update the constraint model with every product order. Based on this semi-automatic generation, we managed to reduce the time to release a complex product by 50%.

7 Conclusion

We are living in an ever-changing and interconnected world. Especially in the domain of IoT and CPS, short production cycles and highly flexible market requirements need to be addressed in order for a company to survive on the market. We proposed a way to use software product line engineering techniques for the modeling of business process models as well as to combine the variability models with a product line for product customization. Using this approach establishes an integrated view of the product variability from a business perspective as well as from a technical perspective, which can be used to raise the efficiency of the overall business. As a result, the development costs and the time required to react to changes of the market can be reduced significantly. Moreover, using the proposed techniques supports Domain Experts in identifying business drivers and thus raises the overall efficiency of the organization.


In its current state, the presented framework is focused on covering the variability of order processes for similar types of products. Consequently, future work will address the extension of the developed framework to other processes. Further, we are currently investigating methods for binding non-functional requirements, such as security requirements, to the variability models in order to enforce specific properties throughout the whole process in an automatic and systematic way.

Acknowledgements. The project is funded by the Austrian Research Promotion Agency (FFG). We want to gratefully thank pure::systems for their support and especially Danilo Beuche.

References

1. McCormack, K.P., Johnson, W.C.: Business Process Orientation: Gaining the E-Business Competitive Advantage. Saint Lucie Press, Boca Raton (2000)
2. Hammer, M., Champy, J.: Reengineering the Corporation - A Manifesto for Business Revolution. Harper Business, New York (1993)
3. Valença, G., Alves, C., Alves, V., Niu, N.: A systematic mapping study on business process variability. Int. J. Comput. Sci. Inf. Technol. (IJCSIT) 5(1), 1–21 (2013)
4. Willaert, P., Van den Bergh, J., Willems, J., Deschoolmeester, D.: The process-oriented organisation: a holistic view developing a framework for business process orientation maturity. In: Alonso, G., Dadam, P., Rosemann, M. (eds.) BPM 2007. LNCS, vol. 4714, pp. 1–15. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-75183-0_1
5. Saidani, O., Nurcan, S.: Towards context aware business process modelling. In: 8th Workshop on Business Process Modeling, Development, and Support (BPMDS 2007), vol. 7, p. 1. CAiSE (2007)
6. Pohl, K., Böckle, G., van der Linden, F.J.: Software Product Line Engineering: Foundations, Principles and Techniques. Springer, Heidelberg (2005). https://doi.org/10.1007/3-540-28901-1
7. Sinnhofer, A.D., Pühringer, P., Kreiner, C.: varBPM - a product line for creating business process model variants. In: Proceedings of the Fifth International Symposium on Business Modeling and Software Design, BMSD, vol. 1, pp. 184–191 (2015)
8. Sinnhofer, A.D., Pühringer, P., Potzmader, K., Orthacker, C., Steger, C., Kreiner, C.: A framework for process driven software configuration. In: Proceedings of the Sixth International Symposium on Business Modeling and Software Design, BMSD, vol. 1, pp. 196–203 (2016)
9. Sinnhofer, A.D., Pühringer, P., Potzmader, K., Orthacker, C., Steger, C., Kreiner, C.: Software configuration based on order processes. In: Shishkov, B. (ed.) BMSD 2016. LNBIP, vol. 275, pp. 200–220. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57222-2_10
10. Sinnhofer, A.D., Höller, A., Pühringer, P., Potzmader, K., Orthacker, C., Steger, C., Kreiner, C.: Combined variability management of business processes and software architectures. In: Proceedings of the Seventh International Symposium on Business Modeling and Software Design, BMSD, vol. 1, pp. 36–45. INSTICC, SciTePress (2017)
11. Rosa, M.L., Van Der Aalst, W.M.P., Dumas, M., Milani, F.P.: Business process variability modeling: a survey. ACM Comput. Surv. 50(1), 2:1–2:45 (2017)
12. Derguech, W.: Towards a framework for business process models reuse. In: The CAiSE Doctoral Consortium (2010)
13. Gimenes, I., Fantinato, M., Toledo, M.: A product line for business process management. In: International Software Product Line Conference, pp. 265–274 (2008)
14. Hallerbach, A., Bauer, T., Reichert, M.: Guaranteeing soundness of configurable process variants in Provop. In: 2009 IEEE Conference on Commerce and Enterprise Computing, CEC 2009, pp. 98–105. IEEE (2009)
15. Hallerbach, A., Bauer, T., Reichert, M.: Issues in modeling process variants with Provop. In: Ardagna, D., Mecella, M., Yang, J. (eds.) BPM 2008. LNBIP, vol. 17, pp. 56–67. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-00328-8_6
16. Reichert, M., Hallerbach, A., Bauer, T.: Lifecycle management of business process variants. In: vom Brocke, J., Rosemann, M. (eds.) Handbook on Business Process Management 1. IHIS, pp. 251–278. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-642-45100-3_11
17. Gottschalk, F., Van Der Aalst, W.M.P., Jansen-Vullers, M.H., La Rosa, M.: Configurable workflow models. Int. J. Coop. Inf. Syst. 17, 177–221 (2008)
18. La Rosa, M., Dumas, M., ter Hofstede, A.H.M., Mendling, J., Gottschalk, F.: Beyond control-flow: extending business process configuration to roles and objects. In: Li, Q., Spaccapietra, S., Yu, E., Olivé, A. (eds.) ER 2008. LNCS, vol. 5231, pp. 199–215. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-87877-3_16
19. Haugen, O., Wasowski, A., Czarnecki, K.: CVL: common variability language. In: Proceedings of the 17th International Software Product Line Conference, SPLC 2013 (2013)
20. Zhao, X., Zou, Y.: A business process-driven approach for generating software modules. Softw. Pract. Exp. 41(10), 1049–1071 (2011)
21. Weiss, D.M., Lai, C.T.R.: Software Product-Line Engineering: A Family-Based Software Development Process. Addison-Wesley Longman Publishing Co. Inc., Boston (1999)
22. Kang, K.C., Cohen, S.G., Hess, J.A., Novak, W.E., Peterson, A.S.: Feature-oriented domain analysis (FODA) feasibility study. Technical report, Carnegie-Mellon University Software Engineering Institute (1990)
23. Sinnhofer, A.D.: Advances of the pre-personalization process for secure embedded systems. Ph.D. thesis, Graz University of Technology - Institute of Technical Informatics (2017)
24. Österle, H.: Business Engineering Prozeß- und Systementwicklung. Springer, Heidelberg (1995). https://doi.org/10.1007/978-3-662-06188-6
25. Association of Business Process Management Professionals: Guide to the Business Process Management Common Body of Knowledge: ABPMP BPM CBOK®. Association of Business Process Management Professionals (2009)
26. Dumas, M., La Rosa, M., Mendling, J., Reijers, H.A.: Fundamentals of Business Process Management. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-33143-5
27. Object Management Group: Business process model and notation (BPMN). Version 2.0, pp. 1–538 (2011). http://www.omg.org/spec/BPMN/2.0/
28. Strnadl, C.F.: Aligning business and IT: the process-driven architecture model. Inf. Syst. Manag. 23(4), 67–77 (2006)

Enforcing Context-Awareness and Privacy-by-Design in the Specification of Information Systems

Boris Shishkov1,3 and Marijn Janssen2

1 Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Sofia, Bulgaria
2 Faculty of Technology, Policy, and Management, Delft University of Technology, Delft, The Netherlands
[email protected]
3 Institute IICREST, Sofia, Bulgaria
[email protected]

Abstract. Networked physical devices, vehicles, home appliances, and other items embedded with electronics, software, sensors, actuators, and connectivity allow for run-time acquisition of user data. This in turn can enable information systems which capture the "current" user state and act accordingly. The use of this data would result in context-aware applications that get fueled by user data (and environmental data) to adapt their behavior. Yet the use of data is often restricted by privacy regulations and norms; for example, the location of a person cannot be shared without given consent. In this paper we propose a design approach that allows for weaving context-awareness and privacy-by-design into the specification of information systems. This is to be done from the very early stages of software development, while the enterprise needs are captured (and understood) and the software features are specified on that basis. In addition to taking into account context-awareness and privacy-sensitivity, these two aspects will be balanced, especially if they are conflicting. The presented approach extends the "Software Derived from Business Components" (SDBC) approach. We partially demonstrate our proposed way of modeling by means of a case example featuring land border security. Our proposed way of modeling would allow developers to smoothly reflect context and privacy features in the application design, supported by methodological guidelines that span enterprise modeling and software specification. Those features are captured as technology-independent societal demands and are in the end reflected in technology-specific (software) solutions. Traceability between the two is possible, as well as re-use of modeling constructs.

Keywords: Enterprise modeling · Context-awareness · Privacy · Software specification

1 Introduction

We observe increasing public demands for resilient Enterprise Information Systems (EIS) that are not only effective and efficient but also compliant with societal (legislation-related) demands, in the fields of privacy, security, transparency, and so on [28].


EIS in turn often count on run-time environmental information, such that they are capable of adapting their behavior accordingly. Helpful in this respect are the latest IoT (Internet-of-Things) developments: networked physical devices, vehicles, home appliances, and other items embedded with electronics, software, sensors, actuators, and connectivity allow for run-time acquisition of user data [15]. Hence, empowered by sensor technology and connectivity, EIS can "know" what is going on around them and use this information for optimizing their internal processes, for maximizing the user-perceived effectiveness, and for analyzing relevant data. For this, EIS need support from context-aware applications (which are EIS-internal) for the sake of delivering to the end user services that are adequate with regard to his or her situation at the moment [30]; context-aware applications get fueled by user data from emails, chat messages, sensors, and so on. Nevertheless, those innovative features lead to an increased complexity with regard to the underlying software; this in turn often brings new risks [15], including risks that concern privacy [28]: for example, the European data protection act requires that the location of a person cannot be shared without given consent.

Hence, more advanced modeling methods and techniques may be necessary, especially in the area of EIS, such that: (i) enterprise needs are aligned with software specifications; (ii) context-awareness is achieved but also balanced with privacy (it is to be noted that, because of the limited scope of this paper, we only focus on context-awareness as an optimization strategy and on privacy as a relevant societal demand). Thus, we have opted for an explicit consideration of context-awareness [30] and privacy [14].

We propose a design approach for weaving context-awareness and privacy-by-design into the specification of information systems. This is to be done from the very early stages of software development, while the enterprise needs are captured (and understood) and the software features are specified on that basis. In addition to considering context-awareness and privacy-sensitivity, these two aspects will be balanced, especially if they are conflicting.

In order to avoid starting from scratch, we have looked for a relevant existing software specification approach to extend for the sake of accommodating context-awareness and privacy-by-design. Nevertheless, we could only consider an approach that effectively brings together enterprise modeling and software specification, because context issues and privacy issues are to be captured from the enterprise environment but need to be reflected in software solutions. Actually, enterprise engineering alone is insufficiently capable of grasping the technical complexity of an EIS (and its reach outside through software services [31]), while a purely software engineering perspective would assume only superficial enterprise-specific domain knowledge [27]. We need a common modeling ground for this, allowing us to properly align enterprise modeling and software specification. Such a common ground can be co-created by enterprise engineers and software engineers, featuring: (a) technology-independent enterprise models rooted in social theories; (b) technology-specific software models rooted in computing paradigms [26]. Further, we would only consider an approach that is consistent with the Model-Driven Architecture (MDA) [22], which can be considered a de facto standard.
Finally, we consider it important that the approach of choice has sound underlying theories.


In this regard, we have opted for considering the SDBC approach [27] reflected in previous work ("SDBC" stands for "Software Derived from Business Components"), noting that: (i) SDBC effectively brings together enterprise modeling and software specification (in a component-based way); (ii) SDBC is consistent with MDA; (iii) SDBC considers several underlying social theories, such as Enterprise Ontology [7] and Organizational Semiotics [21], as well as computing paradigms, such as Service-Oriented Computing [32]; also, SDBC brings this all together through its modeling guidelines and notations, such that adequate model generations and transformations are possible. This means that, taking as input unstructured business information, we should be able to usefully apply a modeling and design process, such that we come through enterprise models and reach as far as the specification and implementation of software. Since we consider SDBC as an approach with such capabilities, we adopt SDBC in the current research.

Further, staying consistent with MDA (see above), we assume a development process starting with computation-independent modeling and ending up with code generation. Nevertheless, for the sake of brevity, we are limiting our focus to the CIM generation (Computation-Independent Models (CIM) point to the highest level of abstraction in MDA), noting that:

• SDBC is capable of adequately reflecting a CIM input into lower-level software specifications;
• It is at this highest level of abstraction where context-awareness and privacy are to be woven in, bringing together both an enterprise perspective and a software perspective.

That is how an SDBC-rooted, enterprise-modeling-driven software specification is improved, by weaving in context-awareness and privacy enforcement. We partially demonstrate our proposed way of modeling by means of a case example featuring land border security. Our proposed way of modeling would allow developers to reflect context and privacy features in the application design, supported by methodological guidelines that span enterprise modeling and software specification. Those features are captured as technology-independent societal demands and are in the end reflected in technology-specific (software) solutions. Traceability between the two is possible, as well as re-use of modeling constructs.

The remainder of the current paper is organized as follows: In Sect. 2, we present several basic concepts. In Sect. 3, we provide the problem conceptualization. Section 4 features background information on context-awareness and privacy, also considering related work. In Sect. 5 we briefly outline the SDBC approach (also justifying its choice) and in Sect. 6 we present a proposal on how to weave context-awareness and privacy into the software specification. In Sect. 7, we present a motivating application scenario in the public security domain, based on which we partially demonstrate our proposed way of modeling. Finally, in Sect. 8, we present the conclusions.


2 Basic Concepts

In order to effectively address the enterprise-software alignment and consider, on top of that, context-awareness and privacy, we need a common conceptual background. Hence, we present several relevant basic concepts in the current section, noting that there are numerous concepts and modeling constructs underlying SDBC. For the sake of brevity, however, we will only address some of them here, especially those that are considered relevant to the challenge of weaving context-awareness and privacy enforcement into land-border-security-related software specifications. For more related information on SDBC, interested readers are referred to [26]. Taking this into account, we firstly present the system definition inspired by Bunge [3], which has fundamental importance in SDBC modeling:

Definition 1. Let T be a nonempty set. Then the ordered triple σ = ⟨C, E, S⟩ is a system over T if and only if C (standing for Composition) and E (standing for Environment) are mutually disjoint subsets of T (i.e. C ∩ E = ∅), and S (standing for Structure) is a nonempty set of active relations on the union of C and E. The system is conceptual if T is a set of conceptual items, and concrete (or material) if T ⊆ H is a set of concrete entities, i.e. things.

Inspired by the system definition, we focus particularly on enterprise systems, since a (border-security) software system would inevitably operate in an enterprise surrounding (comprising (organizational) entities, business processes, regulations, and so on), and we consider an enterprise system as being composed of human entities collaborating among each other through actions, driven by the goal of delivering products/services to entities belonging to the environment of the system. As for an EIS, it is also composed of human entities (often backed by ICT (Information and Communication Technology) applications as well as by technical and technological facilities), but the EIS goal is to support informationally a corresponding enterprise system. This is functionally reflected in the collection, storage, processing, and exchange (or distribution) of data among users within or between enterprises, or among people within wider society [26].

Further, it is important to present the SDBC units of modeling and, in this regard, it is to be noted that essentially SDBC is focusing on the ENTITIES to be considered and their INTER-RELATIONS. It is desired to be able to model entities and relations abstractly (no matter if enterprise entities or software entities are concerned), and also to be able to specialize such models accordingly, in an enterprise direction or in a software direction. For this:

• We consider actors (a combination of the actor-role and the entity fulfilling the role), since often one entity can fulfil many roles and one role can be fulfilled by many entities [26];
• We consider a generic interaction pattern (featuring the transaction concept – see Definition 2) that is claimed to be helpful in modeling any real-life interaction in an enterprise/software context:


Definition 2. A transaction is a finite sequence of coordination acts between two actors, concerning the same production fact. The actor who starts the transaction is called the initiator. The general objective of the initiator of a transaction is to have something done by the other actor, who therefore is called the executor [7].

Hence, enterprise modeling and software specification are both approached through those two essential concepts: ACTOR and TRANSACTION. A business process is thus viewed as a structure of (connected) transactions that are executed in order to fulfil a starting transaction, and a business component is viewed as an enterprise sub-system that comprises exactly one business process. Further, a complete (by this we mean elaborated in terms of structure, dynamics, and data) model of a business component is called a business coMponent. The identification of business coMponents (featured in terms of actors and transactions) is hence considered an essential enterprise modeling task within SDBC. Further elaboration of other relevant concepts will be presented in Sect. 5, when introducing the SDBC approach.
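For readers less familiar with the transaction concept, the sketch below encodes Definition 2 as a small state machine. The concrete ordering of coordination acts (request, promise, state, accept around one production act) follows the basic transaction pattern of Enterprise Ontology [7]; the actor and fact names are illustrative assumptions inspired by the border-security case.

```python
from dataclasses import dataclass, field

HAPPY_PATH = ["request", "promise", "execute", "state", "accept"]

@dataclass
class Transaction:
    """A finite sequence of coordination acts between initiator and executor
    concerning one production fact (Definition 2)."""
    initiator: str
    executor: str
    production_fact: str
    acts: list = field(default_factory=list)

    def perform(self, act: str):
        expected = HAPPY_PATH[len(self.acts)]
        if act != expected:
            raise ValueError(f"expected '{expected}', got '{act}'")
        actor = self.initiator if act in ("request", "accept") else self.executor
        self.acts.append((actor, act))

t = Transaction("traveller", "border officer", "border passage is authorized")
for step in HAPPY_PATH:
    t.perform(step)
print(t.acts)
```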

3 Problem Conceptualization

As suggested by the Introduction, the problem we are facing in the current paper concerns modeling situations in which context-awareness and/or privacy need to be adequately reflected in the (software) system functionalities. This points to two problem "components" (context-awareness and privacy), as visualized in Fig. 1.

Fig. 1. Problem conceptualization

If we have a system that is delivering services to a user belonging to the system environment, then we may have the following three situations:


(i) Situation "service0", as labelled in the figure: this is the typical service provisioning, when the (software) system is delivering a service to the user regardless of context and privacy demands. An example of this is a ticket machine service: regardless of the user situation and of any specific privacy demands, the ticket machine issues tickets in the same way to any user in any situation.

(ii) Situation "service1" is when the service is delivered in "versions", in the sense that depending on the situation of the user, a corresponding service version is instantiated; the situation of the user is captured through sensors (see the black disk with "s" in the figure). An example of this is an intelligent music playing service that may be adjusted not to play while the user is sleeping, to play tender music during morning hours, to play rhythmic music while the user is driving, and so on, assuming the possibility for capturing the user situation either through sensing (sensing that the user is in the car, for instance) and/or through a timer, or in another way.

(iii) Situation "service2" features a privacy-driven adaptation of the service delivery, assuming that the system receives privacy demands from Society (see the black disks with "p" in the figure) that are to be "translated" into functional solutions. For example, security monitoring may need to be updated, driven by public demands, such that any captured visual information is to be destroyed if after some period of time no incident has occurred.

Hence, if we assume that (i) is covered by current software specification approaches, such as SDBC, we consider it challenging to achieve (ii) and/or (iii) and, if both are to be achieved, to resolve possible tensions. Our proposed way of tackling this will be explained in the following sections.
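The three situations can be contrasted in a small dispatch sketch: privacy demands filter the usable context, and the remaining context selects a service version. The sensed attributes, the consent rule, and the threshold are simplified stand-ins for illustration, not the paper's actual design artifacts.

```python
from enum import Enum

class Situation(Enum):
    NORMAL = "normal"
    EMERGENCY = "emergency"

PRIVACY_RULES = {"location": "requires_consent"}   # a societal demand ("p" in Fig. 1)

def deliver_service(sensor_data: dict, consents: set) -> str:
    # service2: drop context attributes the user has not consented to share
    usable = {k: v for k, v in sensor_data.items()
              if PRIVACY_RULES.get(k) != "requires_consent" or k in consents}
    # service1: pick a service version matching the sensed situation
    situation = Situation.EMERGENCY if usable.get("heart_rate", 0) > 150 else Situation.NORMAL
    # service0 would ignore `usable` entirely and always return the same version
    return f"service version for {situation.value} state (context used: {sorted(usable)})"

print(deliver_service({"heart_rate": 170, "location": "52.01,4.36"}, consents=set()))
```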

4 Background and Related Work

As mentioned in the Introduction, in this section we address context-awareness and privacy, by providing brief introductions and considering related work.

4.1 Context-Awareness

The advances in wireless telecommunications and sensor technology, in combination with the capabilities of smart devices, have empowered IT systems to “know” what is going on with the end user while (s)he is utilizing corresponding services – this represents a user perspective in service delivery. Hence, the service delivered to the user is to be adapted to the situation of the user. For example, a person wearing a body-area network [1] through which body vital signs are captured, may appear to be at “normal state” and then, for example, vital signs are captured and recorded as archival information, or the person may appear to be in an “emergency state” and then help would need to be urgently arranged. Thus, one kind of service would be needed at normal state and another kind of service would be needed at emergency state. For this reason, the system should be able to: (i) identify the situation of the user; (ii) deliver a service to the user, which is suited for the particular situation. This is illustrated in Fig. 2.


Fig. 2. A schematic representation of a context-aware system

As seen in the figure, a service is delivered to the user, and the user is considered within his or her context, such that the service is adapted on the basis of the context state (or situation) the user finds himself/herself in. That state is to be somehow sensed, and often technical devices, such as sensors, are used for this purpose. Context-aware systems actually deliver services to the user by means of ICT applications ("applications", for short). Hence, unlike "traditional" applications assuming that users would have common requirements independent of their context, context-aware applications are capable of adapting their behavior to the situation of the user. This is especially relevant to services delivered via mobile devices. Such applications are, to a greater or lesser extent, aware of the user context situation (for example: user is at home, user is traveling) and provide the desirable services corresponding to the situation at hand. This also points to another related characteristic, namely that context-aware applications must be able to capture, or be informed about, information on the context of users, preferably without effort and conscious acts on the user's part.

Developing context-aware applications is hence not a trivial task and, as suggested above, the following related challenges have been identified:

(i) properly deciding what physical context to sense and what high-level context information to pass to an application, and also bridging the gap between raw context data and high-level context information;
(ii) deciding which potential end-user context situations to consider and which ones to ignore;
(iii) modeling context-aware application behavior, including switching between alternative behaviors [30].

The basic assumption underlying the development of context-aware applications is that user needs are not static, but partially dependent on the particular situation the user finds himself/herself in, as already mentioned. For example, depending on his/her current location, time, activity, social environment, environmental properties, or physiological properties, the user may have different interests, preferences, or needs with respect to the services that can be provided by applications. Context-aware applications are thus primarily motivated by their potential to increase user-perceived effectiveness, i.e. to provide services that better suit the needs of the user, by taking account of the user situation. We refer to the collection of parameters that determine the situation of a user, and which are relevant for the application in pursuit of user-perceived effectiveness, as user context, or context for short, in accordance with definitions found in the literature [6].


As mentioned above, context-awareness also implies that information on the user context must be captured, preferably without conscious or active involvement of the user. Although in principle the user could also provide context information by directly interacting with the application, one can assume that in practice this would be too cumbersome, if not impossible; it would require deep expertise to know the relevant context parameters and how they are correctly defined, and it would furthermore be very time-consuming and error-prone to provide the parameter specifications as manual input [30].

In studying RELATED WORK, we have considered context-aware application practices. Due to the complexity and importance of handling context-awareness, many studies have tried to investigate different ways of developing context-aware applications. Many context modeling techniques have been created to enumerate and represent context information [37]. Methodologies for architectural design have been proposed by researchers, such as the Context Toolkit, which aggregates context information [6], the Context Modeling Language [11] combined with Model-Driven Development (MDD), and UML-based approaches [2, 35], which mainly describe the key steps and activities for modeling context-aware applications; next to that, Contextual Elements Modeling and Management through Incremental Knowledge Acquisition (CEManTIKA) supports the development of context-aware applications. Further, Vom Brocke et al. have proposed a framework which consists of four dimensions of factors to be considered in the design of context-aware applications, including (1) application goals, (2) characteristics of the process, (3) internal organizational specifications where context-aware applications are implemented, and (4) the broader or external environment in which context-aware applications are built [38]; those factors can be used as guidelines when designing a context-aware application. In general, many current research projects are focusing on the development of context-aware applications, touching upon concepts, networking aspects, middleware aspects, user-interface-related concerns, services, and so on. Still, even such a wide consideration of context-aware applications has not yet inspired (in our view) a widely accepted agreement on the development of such applications. Hence, it is still a question how to weave context-awareness into the specification of software, and the current paper offers some contribution in this direction.

4.2 Privacy

As mentioned already, with regard to the (software) system-to-be, we are not only aiming at context-awareness but are also willing to weave in values, such as privacy and transparency. In this paper, we focus on privacy not only because it is one of the key values (e.g. [14]) but also because it is highly relevant with regard to the land-border security application domain addressed in the paper. Hence, in the remainder of the current sub-section, we will firstly discuss privacy in general (still assuming a border security focus) and then focus on privacy enforcement practices (related work) that are to be taken into account with regard to our case-driven modeling approach. Although the boundaries and specific contents of privacy vary significantly in different countries, the main definition of information privacy includes the right to be left alone and control of information about ourselves [24].


Data can have various privacy needs: whereas some information should always be open to create transparency, other information should not be shared without proper authorization. Although there is much information claimed to be privacy-sensitive, we consider the following information concerning border control as privacy-sensitive, referring to Pearson's privacy information classification [24]:

• Personally identifiable information: information that can be used to identify an individual
  – Data from records: name, date of birth, bio-metrics, address, social security number, and so on;
  – Surveillance data: images, video, voice, and so on;
  – Secondary data: bank account number, credit card number, phone number, social media network ID, and so on;
  – Demographical information: sex, age group, race, health status, religion, education, and so on;
• Usage data
  – Networking-related data: mobile phone history data, Internet access point data, computer log files, and so on;
  – Recorded online activities: messenger records, contribution to social websites, and so on;
  – Travel data: ticketing/boarding pass data, reservations, cancellations, and so on;
  – Unique device identities: any information that might be uniquely traceable to a device, e.g. IP address, device fabric number, Radio Frequency Identity (RFID) tags, and so on.

In studying RELATED WORK, we acknowledge that privacy enforcement is often difficult. ICT enables the creation of systems that ensure the privacy of data, which is called privacy-by-design [14]. Privacy-by-design has received attention within organizations as a way to always ensure that privacy is protected. It suggests integrating privacy requirements into the design specifications of systems, business practices, and physical infrastructures. In the ideal situation, data is collected in such a way that privacy cannot be violated. This requires that both governance aspects (such as data updating processes and procedures, access rights, decision-making responsibilities, and so on) and technical aspects (such as encryption, access control, anonymization, and so on) are covered. Since privacy enforcement solutions differ in different contexts, some general principles to guide privacy-by-design are to be (adapted and) used. For instance, the principles stated in Article 5 of the EU General Data Protection Regulation need to be carefully considered, including: lawfulness, fairness and transparency, purpose limitation, data minimization, accuracy, storage limitation, integrity and confidentiality, and accountability. However, some principles would often be in conflict with the characteristics of implemented border control information systems - for instance, the continuous collection of surveillance image data is against the principle of purpose limitation. Therefore, technical solutions should be a trade-off between privacy and (border-control-related) benefits [18].


Technical solutions regarding privacy enforcement generally refer to PETs – Privacy-Enhancing Technologies. These technologies cover secure communication and data storage through encryption, access control and auditing, anonymization of online activity, detection of privacy violators, and so on [25, 40]. Since PETs can only partially address privacy-related problems, they need to be combined with information governance features in order to create comprehensive privacy-enforcement mechanisms. Besides PETs, PITs (Privacy-Invasive Technologies) and privacy threats are also frequently examined in various domains [4, 13, 17, 34, 39]. Nevertheless, there is still limited insight into how enterprises can reduce privacy violation risks for open data in particular, and there is no uniform approach for privacy protection [16].

5 SDBC

In considering the SDBC approach in the current section, we will firstly provide justification with regard to our choice to base our modeling on that approach, secondly we will discuss the Design Science relevance of SDBC, and finally we will briefly outline the approach.

5.1 Justification

As studied by Shishkov [26], there are many other approaches/modeling languages, some of them widespread and widely used. What justifies our choosing SDBC in particular is the following:
• SDBC addresses neither enterprise modeling alone nor software specification alone; instead, the approach brings both together, which is important if one needs to reflect sophisticated (legislative) requirements in complex software architectures.
• SDBC is not limited to general guidelines and related modeling notations; it is also a method in the sense that different modeling activities are carried out in a specific order – this is to ensure that the software system being modeled is well aligned with the business needs.
• SDBC supports re-usability and traceability, which are considered essential with regard to software development in general.
• SDBC is aligned with the UML notations, representing a de facto standard for specifying software [36], and is consistent with MDA.
• In previous work, SDBC has been considered particularly in the border security application domain [29].
For these reasons, we have opted for adopting SDBC in the current work; in the following sub-section we will also justify the Design Science relevance of the approach.


5.2 Relevance to Design Science

In Design Science research, the information systems research framework proposed by Hevner et al. [12] has been widely accepted and applied in many IT artefact designs [23]. According to that framework, researchers develop an IT artefact by considering the business needs and limits within an appropriate environment, which consists of the involved people, organizations, and available technologies [12]. In such a design process, for the sake of supporting the design, researchers use: (i) existing knowledge bases (that include theories, models, and methods) as knowledge foundations and (ii) data analysis, measures, and validation criteria as methodologies. After having been developed, the IT artefact is to be evaluated and justified via analysis, case studies, experiments and/or simulation. Newly developed artefacts can also contribute to the accumulation of the knowledge base(s). Hence, referring to Design Science, we acknowledge that SDBC is essentially oriented towards goal-driven modeling that relates to corresponding user needs, and the modeling itself is justified by the capabilities (and limitations) of the corresponding entities contributing to the service deliveries. For this reason, we consider SDBC as relevant in general with regard to Design Science. Nevertheless, SDBC lacks powerful goal generation mechanisms and for that it needs support from other tools – for example, tools related to Artificial Intelligence [27]; this, however, is beyond the scope of the current paper.

5.3 Outline

SDBC is a software specification approach (consistent with MDA) that covers the early phases of the software development life cycle and is particularly focused on the derivation of software specification models on the basis of corresponding (re-usable) enterprise models. SDBC is based on three key ideas:
(i) The software system under development is considered in its enterprise context, which not only means that the software specification models are to stem from corresponding enterprise models but also that a deep understanding is needed of real-life (enterprise-level) processes, corresponding roles, behavior patterns, and so on.
(ii) By bringing together two disciplines, namely enterprise engineering and software engineering, SDBC pushes for applying social theories in addressing enterprise-engineering-related tasks and for applying computing paradigms in addressing software-engineering-related tasks, and also for integrating the two by means of sound methodological guidelines.
(iii) Acknowledging the essential value of re-use in current software development, SDBC pushes for the identification of re-usable (generic) enterprise engineering building blocks whose models could be reflected accordingly in corresponding software specification models.
We refer to [26] for information on SDBC and we reflect the SDBC outline in Fig. 3. As the figure suggests, there are two SDBC modeling milestones, namely enterprise modeling (first milestone) and software specification (second milestone). The first milestone has as input a case briefing (the initial (textual) information based on which the software development is to start) and the so-called domain-imposed requirements (the domain regulations to which the software system-to-be should conform).


Based on such input, an analysis should follow, aiming at structuring the information, identifying missing information, and so on. This is to be followed by the identification (supported by corresponding social theories) of enterprise modeling entities and their inter-relations. Then, the causalities concerning those inter-relations need to be modeled, such that we know what is required in order for something else to happen [32]. On that basis, the dynamics (the entities' behavior) is to be considered, featured by transactions (see Definition 2). All this leads to the creation of enterprise models that are elaborated in terms of composition, structure, and dynamics (pointing also to corresponding data aspects) – they could either feed further software specifications and/or be "stored" for further use by enterprise engineers. Such enterprise models could possibly be reflected in corresponding business coMponents (see Sect. 2). Next to that, re-visiting such models could possibly inspire enterprise re-design activities, as shown in Fig. 3.

Fig. 3. SDBC - outline (Source: [28], p. 48)

Furthermore, the second milestone uses as input the enterprise model (see above) and the so-called user-defined requirements (those requirements reflect the demands of the (future) users of the software system-to-be towards its functioning). That input feeds the derivation of a use case model featuring the software system-to-be. Such a software specification starting point is not only consistent with the Rational Unified Process – RUP [19] and the Unified Modeling Language – UML [36] but is also considered to be broadly accepted beyond RUP-UML [5, 8, 26]. The use cases are then elaborated, inspired by studies of Cockburn [5] and Shishkov [27], such that software behavior models and classification can be derived accordingly. The output is a software specification model adequately elaborated in terms of statics and dynamics. Applying de-composition, such a model can be reflected in corresponding software components, as shown in the figure. Such an output could inspire software engineers to propose, at a future moment, software re-designs, possibly addressing new requirements.

Further, in bringing together the first milestone of SDBC and the second one, we need to be aware of possible granularity mismatches. The enterprise modeling features business processes and corresponding business coMponents, but this is not necessarily the level of granularity concerning the software components of the system-to-be. With this in mind, an ICT application is considered as matching the granularity level of a business component – an ICT application is an implemented software product realizing a particular functionality for the benefit of entities that are part of the composition of an enterprise system and/or a (corresponding) EIS. Thus, the label software specification model, as presented in Fig. 3, corresponds to a particular ICT application being specified. Software components in turn are viewed as implemented pieces of software, which represent parts of an ICT application and which collaborate with each other, driven by the goal of realizing the functionality of the application (functionally, a software component is a part of an ICT application which is self-contained, customizable, and composable, possessing a clearly defined function and interfaces to the other parts of the application, and which can also be deployed independently). Hence, a software coMponent is a conceptual specification model of a software component. Said otherwise, the second SDBC milestone is about the identification of software coMponents and corresponding software components. In this paper, we will only address the business coMponent identification and its reflection in a use case model featuring the specification of the ICT application-to-be, weaving in context-awareness and privacy-enforcement accordingly – this will be considered in the following section.

6 Weaving in Context-Awareness and Privacy

Considering the problem conceptualization (see Fig. 1), we derive three key demands with regard to the desired weaving of context-awareness and privacy:
• We need to be able to capture user context, to be used by the (software) system for adapting the services delivered to the user;
• We need to be able to capture public-values-related demands (such as privacy) and "translate" them into functional (software) solutions;
• If fulfilling both the privacy demands and context-awareness leads to tensions (because of conflicting requirements), we need to be able to resolve those tensions in a socially responsible way.
All those demands originate from the (enterprise) environment of the (software) system-to-be but require technology-specific (software) solutions. For this reason, we can neither position those demands as relevant to the enterprise modeling SDBC milestone nor position them as relevant to the software specification SDBC milestone; thus, they have to be positioned in between, as suggested by Fig. 4.


Fig. 4. Extending SDBC – weaving in context-awareness and privacy

Hence, as shown in the figure, the difference with the common SDBC modeling (with no specific need for weaving context-awareness and privacy into the design) is that the OUTPUT of Milestone 1 (Enterprise Modeling) is not the INPUT for Milestone 2 (Software Specification). Instead, the Milestone 1 output has to undergo some transformations to become a Milestone 2 input: this is presented in the right part of Fig. 4, using the notations of the UML Activity Diagram [36] – the "start" point relates to the Milestone 1 output while the "end" point relates to the Milestone 2 input. What is going on between those two points and how is it justified?
• Two processes flow in parallel, one related to the desired context-awareness enforcement (left part of the Activity Diagram) and the other one to the desired privacy enforcement (right part of the Activity Diagram).
• Those two parallel processes hold an a-priori equal importance but, in the end, they reach a synchronization bar where BALANCING NORMS are to be implemented, as a final parameterization of the Milestone 2 input. For example, in the case of a security camera video-recording (the security system is assumed to be adapting to the monitoring circumstances as well as to be privacy-sensitive and not distribute facial information), if there is an indication of real-time criminal activities, then, according to a balancing norm, context-awareness should prevail over privacy and no privacy sensitivity would be observed towards those persons being monitored (a schematic sketch of this balancing logic is given below).
• As for the context-awareness process, it follows from the context-aware service delivery features, as introduced and discussed in the Introduction and in Subsect. 4.1, also assuming that for each user state type that is of high occurrence probability, there is a corresponding system behavior type that is prepared at design time. Then the first thing to be done is to capture the user context; if there is a system behavior type (prepared at design time) that corresponds to the user state type (to which the captured context is pointing), then that behavior type is instantiated accordingly. Otherwise, "auto-pilot" behavior would have to be triggered, guiding the system based on rules that are applied at run time.
• As for the privacy process, it follows from the privacy-related features as introduced and discussed in the Introduction and in Subsect. 4.2, also assuming that for each situation type of high occurrence probability there is a corresponding system behavior type specified at design time, and there are corresponding privacy-related "instructions". Then it is only necessary to position the "current" system behavior with regard to a corresponding behavior type (for example: "camera surveillance while Police are chasing a criminal", "camera surveillance while a person is walking in a public area", and so on), such that it is known how the behavior instance would need to be refined.
Those methodological guidelines have been considered firstly by Shishkov et al. [28] and also in the current paper, mainly inspired by the relevance of weaving context-awareness and privacy-by-design into the (SDBC-driven) specification of software, considering dedicated studies addressing context-awareness and privacy. Still, our guidelines are not yet exhaustive and need further elaboration, especially as concerns some modeling transformations (as will be seen in the following section). Nevertheless, the partial validation (realized in terms of a case example) clearly demonstrates the adequacy of the proposed guidelines and their relevance. This will inspire further related research activities that would not only address the model transformations with regard to the context-privacy perspective considered in the current paper but would also broaden the public demands perspective, reaching as far as Value-Sensitive Design [9]. The above-mentioned illustrative case example will be considered in the following section.
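The balancing logic described above can be sketched roughly as follows (a minimal, hypothetical Python illustration based on the security-camera example; the state types, behavior types and rule names are our own assumptions and not part of the SDBC guidelines):

from typing import Dict, Optional, Tuple

# Design-time system behavior types, one per high-probability user state type (assumed names).
BEHAVIOR_TYPES: Dict[str, str] = {
    "person_in_public_area": "record_with_face_blurring",
    "police_chasing_criminal": "record_and_distribute_footage",
}

# Privacy-related "instructions" attached to each behavior type at design time.
PRIVACY_RESTRICTIONS: Dict[str, str] = {
    "record_with_face_blurring": "do_not_distribute_facial_information",
    "record_and_distribute_footage": "none",
}

def classify_user_state(context: dict) -> Optional[str]:
    """Capture the user context and map it to a user state type (None if no type matches)."""
    if context.get("criminal_activity_detected"):
        return "police_chasing_criminal"
    if context.get("person_detected"):
        return "person_in_public_area"
    return None

def decide_behavior(context: dict) -> Tuple[str, str]:
    """Run the context-awareness and privacy flows and apply a balancing norm at the end."""
    state = classify_user_state(context)
    if state is None:
        # No design-time behavior type corresponds to the captured context:
        # fall back to rule-based "auto-pilot" behavior applied at run time.
        return "auto_pilot", "apply_default_privacy_rules"
    behavior = BEHAVIOR_TYPES[state]
    restriction = PRIVACY_RESTRICTIONS[behavior]
    # Balancing norm: with an indication of real-time criminal activity,
    # context-awareness prevails over privacy for the persons being monitored.
    if context.get("criminal_activity_detected"):
        restriction = "none"
    return behavior, restriction

print(decide_behavior({"person_detected": True}))
print(decide_behavior({"criminal_activity_detected": True}))
print(decide_behavior({}))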

7 Illustrative Example

As mentioned in the Introduction, we partially validate our proposed way of modeling (that touches upon the weaving of context-awareness and privacy) by means of an illustrative example featuring land border security. In this section, we will firstly present the case briefing and then we will proceed with the security system modeling.

7.1 Case Briefing

Border control is one of Europe's biggest recent challenges, in the light of severe sea border problems in Greece and Italy in 2015–2017 [10] and land border problems in Bulgaria and Croatia, for example. This not only leads to deadly incidents for numerous migrants who undertake illegal sea/land border crossings in severe (weather) conditions but also allows terrorists (mixed with regular migrants) to reach Europe's territory. According to many reports of the European Union – EU (www.europa.eu), this uncontrolled migration to Europe is causing societal tensions and is stimulating extreme political views. Further, even though illegal migration to Europe is mainly fueled by smuggling channels, it is partially 'facilitated' by technical/organizational weaknesses at the EU external borders. In this paper, we abstract from the former and focus on the latter. Such a focus is justified by numerous EU efforts aiming at improving security at the external borders of the European Union – for example: new border facilities are constructed along those borders, Police officers from some Western EU countries are sent to the South-Eastern EU borders to physically help, new organizational approaches and technical solutions are developed, and so on. According to the European Union, all those efforts are directed towards stopping illegal migration to the European Union, and it is widely agreed that any migrant should legally approach an EU border point where (s)he would be treated according to the laws and values of the EU. In that sense, we take an application scenario which concerns EU land border control (our focus is particularly on the external EU borders), and this is about monitoring and reaction to violations. Fulfilling this assumes human actions because security-related decisions are always human-centric [20]. Still, in what they are doing, border police officers receive useful technical support through various channels: infrared images, visible images, proximity sensors, and so on, followed by some kind of intelligent data fusion algorithms [29]. We acknowledge this "duality" – human entities vs. technical entities – and acknowledge as well the need to orchestrate this "whole" in a sound way, allowing for objectivity and capability with regard to any situation that may occur. Hence, we approach typical situations in this regard, and also the corresponding desirable reactions to those situations. Hence, context-awareness is relevant with respect to land border security. Further, realizing that the above-mentioned technology requires, among other things, IT-based services to recognize people (i.e., biometrics), we acknowledge the need for a special treatment of those issues as far as privacy is concerned, because it is justified to distribute personal details of a terrorist but it is not justified to distribute personal details of just anybody. We thus identify and approach some privacy-sensitive situation types accordingly. In realizing all this, we take as an example the situation at the Bulgarian-Turkish land border [29]; nevertheless, we abstract from many location-specific details in order to reach findings that are generic and widely applicable. Monitoring the land border is a continuous process where:
(i) There is a (wired) border fence that is supposed to prevent illegal migrants from getting in; still, such a facility can be overcome by using a ladder or by just destroying the wire.
(ii) There are border police officers who are patrolling (possibly using vehicles); still, no matter how many border police officers are deployed in the border area, it would be physically impossible to guarantee police presence at any time anywhere along the border, over hundreds of kilometers.
(iii) There are sensors and other (smart) devices, as mentioned above; they are realizing surveillance; we assume the possibility that a device would perform local processing plus artificial reasoning; based on this, it may generate contentful messages to be transmitted to corresponding human agents.


Taking the above into account, we argue that there are two main situation types at any point along the border, namely: (a) Normal Situation (NS); (b) Alarm Situation. We realize that both context-awareness and privacy enforcement are "under control" with regard to (a) because:
• Within NS, everything is progressing according to pre-defined rules – hence, there is no need to adapt the system behavior to the surrounding context;
• Following pre-defined rules would also assume adequate treatment of privacy-sensitive data (for example: the border police officers are also monitored but it is not allowed to distribute their facial information).
What is more interesting is thus what is done in the case of (b), where context-awareness and related privacy enforcement are crucial. Approaching (b) and taking into account the case information, we define in turn three situation types concerning migrants possibly attempting to illegally cross the land border outside an official border crossing point:
1. Human-Triggered Alarm Situation (HTAS): when a border police officer faces an attempt of one or more persons to illegally cross the border. Then the officer can do ONE of three things, namely:
1.1. Try to physically stop the persons from crossing, following the corresponding EU regulations;
1.2. Connect to colleagues and ask for help;
1.3. Activate particular devices for taking pictures and video of the violators.
It is important to note that in this situation, the person in charge has full decision-making capacity.
2. Device-Triggered Alarm Situation (DTAS): when a device is "alarmed" by anything and there is no border police officer on the spot. Then, there are two possibilities:
2.1. The detecting device is "passive" in the sense that the (video) information it is transmitting is received at run time and straightforwardly "used" by a distant officer who intervenes, generating a decision and corresponding actions;
2.2. The detecting device is "active" in the sense that, based on information coming from at least several sensing units, the information gets filtered and automated reasoning is performed, based on which a "hypothesis" on what is happening is generated by the device and sent to corresponding human agents.
3. Outage Situation (OS): when any unexpected (power, performance, or other) outage occurs, not necessarily assuming an illegal border crossing at the same time. This calls for urgent system recovery in both human and technical respect.

7.2 Modeling the Border Security System

A logical starting point in our case-driven modeling is the "translation" of the case briefing into better structured information that would feature the original business reality and the corresponding domain-imposed requirements. As is well known, this often assumes (partial) enterprise re-design (re-engineering), needed for the sake of making the considered enterprise system adequately supportable by ICT applications [7]. Because of the limited scope of this paper, we do not go into detail on how we analyze the case briefing and how we conduct such a partial enterprise re-design. Moreover, this is not directly related to the main challenges addressed in the current paper, namely: the enterprise-IT alignment, with incorporation of context-awareness and privacy-by-design. Hence, we move directly to the textual reflection of the case briefing, holding in itself re-design-driven and requirements-driven updates:

Hence, this refined case briefing, appropriately reflecting the business needs, is our starting point. SDBC has particular strengths in further structuring such information: actor-roles are methodologically identified, as well as corresponding transactions, and so on. For the sake of brevity, we do not go into further detail here; still, for more information on those issues, interested readers are referred to [26]. The entities (featuring actor-roles) are:
• S (Sensor); S is capturing the occurring situations (situation instances), for example: "all looks normal during night time", "two persons are hanging over the border fence", "one person is running next to the patrolling vehicle", and so on, to give just several examples; in this, S is supported by sensing devices, sensor networks, cameras, data fusion engines, and so on.
• PE (Pattern Engine); PE is linked to two pattern banks, namely 'sp' and 'pp' – they hold the subclass specifications ('sp' featuring situations and 'pp' featuring privacy-driven restrictions). Hence, PE is capable of providing such information as reference.
• MM (Match-Maker); MM is matching an instance to a subclass, for example: matching a situation instance captured by S to a subclass from Bank 'sp'.
• TE* (Task Engine); TE is generating a desired system behavior description (a task), by instantiating accordingly a behavior subclass (the bank that holds the subclass specifications featuring behaviors is 'bp') corresponding to a respective situation subclass.
• PrE (Privacy Engine); PrE delivers a refined behavior recommendation accordingly.
• C (Customer); C is hence fulfilled by the corresponding border police officer(s) and/or other team member(s) using such a task specification (as RECOMMENDATION) in order to establish their actions accordingly.
* For the sake of enforcing privacy, it is necessary to match each prescribed desired system behavior to corresponding privacy-driven restrictions stored in Bank 'pp'; thus, MM should do a match based on a prescribed behavior instance (delivered by TE) and privacy patterns (delivered by PE).


Thus, next to identifying entities (featuring actor-roles [7, 26]), we are to also identify corresponding transactions (see Definition 2): this we present as the Border Security Business Entity Model, expressed using notations inspired by DEMO [7] – see Fig. 5:

Fig. 5. Business entity model for the border security case

In the figure, the identified entities are presented in named boxes, while the small grey boxes, one at the end of each connection, indicate the executor entity [26]. The connections indicate the need for interactions between entities in order to achieve the business objective of recommendation generation – in our case, those interactions reflect transactions. Hence, with each connection we associate a single transaction (t): C-PrE (t1), PrE-MM (t2), and so on. As for the delimitation, C is positioned in the environment of the recommendation generation system, and PrE, MM, TE, PE, and S together form the system, where we have included as well the three data banks mentioned above, namely 'bp', 'pp', and 'sp'. Further, we have to make explicit the causal relationships among the transactions, and given the business entity model, we establish that in order for PrE to deliver a refined task specification as a recommendation to C, it needs input from MM, which in turn needs input from TE and PE. Further, in order for TE to deliver a desired system behavior description, it needs input from MM, which in turn needs input from S and PE. Those causal relationships are presented in Fig. 6, using the notations of the UML Activity Diagram [36]. As can be seen from the figure:
(a) capturing a situation instance and considering corresponding situation patterns (viewed as subclasses) go in parallel firstly;
(b) secondly goes a match between the two that establishes the relevant subclass (featuring situations) corresponding to a respective behavior pattern;
(c) the behavior specification and the consideration of relevant privacy-driven restrictions go in parallel thirdly;
(d) fourthly goes a match between the two that establishes the relevant privacy-driven restrictions with regard to the considered behavior;
(e) finally, the refined behavior specification is delivered to C in the form of a recommendation.
Hence, context-awareness and privacy are incorporated through corresponding modeling "building blocks" featuring transactions 6 + 7 and 3 + 4, respectively, as suggested by Fig. 6 (a schematic sketch of this flow is given after the figure).


Fig. 6. Modeling the causal relationships among transactions
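As a rough, non-normative illustration of the flow of Fig. 6, the following Python sketch chains the actor-roles introduced above into a recommendation-generation pipeline; the bank contents and key names are invented for the example and merely echo the situation types of Sect. 7.1:

from typing import Dict

# Illustrative pattern banks (their contents are assumptions, not taken from the case briefing).
SP_BANK: Dict[str, str] = {          # situation patterns ('sp')
    "fence_breach_detected": "device_triggered_alarm",
    "officer_reports_crossing": "human_triggered_alarm",
    "power_loss": "outage",
}
BP_BANK: Dict[str, str] = {          # behavior patterns ('bp'), one per situation subclass
    "device_triggered_alarm": "dispatch_patrol_and_record_video",
    "human_triggered_alarm": "support_officer_with_imaging",
    "outage": "trigger_system_recovery",
}
PP_BANK: Dict[str, str] = {          # privacy patterns ('pp'), one per behavior
    "dispatch_patrol_and_record_video": "blur_faces_unless_violation_confirmed",
    "support_officer_with_imaging": "restrict_distribution_to_border_police",
    "trigger_system_recovery": "no_personal_data_involved",
}

def capture_situation(raw_event: str) -> str:          # S: capture a situation instance
    return raw_event

def match_situation(instance: str) -> str:             # MM + PE: match instance to a subclass from 'sp'
    return SP_BANK[instance]

def generate_behavior(situation_subclass: str) -> str: # TE: instantiate a behavior subclass from 'bp'
    return BP_BANK[situation_subclass]

def refine_with_privacy(behavior: str) -> str:         # MM + PE ('pp') and PrE: refine the behavior
    return f"{behavior} [{PP_BANK[behavior]}]"

def deliver_recommendation(raw_event: str) -> str:     # recommendation delivered to C
    situation = capture_situation(raw_event)           # step (a)
    subclass = match_situation(situation)               # step (b)
    behavior = generate_behavior(subclass)               # step (c)
    return refine_with_privacy(behavior)                 # steps (d) and (e)

print(deliver_recommendation("fence_breach_detected"))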

Further, with regard to the SDBC modeling process, we have identified the entity model and the causality relations. What goes next are transactions (see Fig. 3) and with regard to this, we use the SDBC interpretation of the transaction concept – see Fig. 7. SDBC interprets the transaction concept as centered around a particular production fact (see Definition 2). The reason is that the actual output of any enterprise system represents a set of production facts related to each other. They actually bring about the useful value of the business operations to the outside world, and the issues connected with their creation are to be properly modeled in terms of structure, dynamics, and data. However, considering also the corresponding communicative aspects is important. Although they are only indirectly related to the production facts, they are to be positioned around them. SDBC realizes this through its interpretation of the transaction concept. As is seen from Fig. 7, the transaction concept has been adopted, with a particular stress on the transaction's output – the production fact. The order phase (left side of the figure) is looked upon as input for the production act, while the result phase (right side of the figure) is considered to be the production act's output.

Fig. 7. The SDBC interpretation of the transaction concept (Source: [27], p. 70). Legend: r – request, I – Initiator, p – promise, E – executor, s – state, a – accept


The dashed line (P-fact) shows that a transaction could be successful (which means that a production fact has been successfully created) only if the initiator (the one who is initiating the transaction) has accepted the production act of the other party (called the executor). As for the (coordination) communicative act types grasped by an SDBC transaction, they are also depicted in the figure. The initiator expresses a request attitude towards a proposition (any transaction should concern a proposition – for example, a shoe to be repaired by a particular date and at a particular price, and so on). Such a request might trigger either a promise or a decline – the executor might either promise to produce the requested product (or service) or express a decline attitude towards the proposition. An expressed decline attitude triggers a discussion (negotiation), for example: "I cannot repair the shoe today, is tomorrow fine?", and so on. The discussion might lead to a compromise (meaning that the executor is going to express a promise attitude towards an updated version of the proposition) or might lead to the transaction's cancellation (meaning that no production fact will be created). If the executor has expressed a promise attitude regarding a proposition, then (s)he must bring about the realization of the production act. Then the result phase follows, which starts with a statement expressed by the executor that, in his/her opinion, the requested proposition has been successfully realized. The initiator could either accept this (expressing an accept attitude) or reject it (expressing a decline attitude). Expressing a decline attitude leads to a discussion which might lead to a compromise (meaning that finally the initiator is going to express an accept towards the realized production act, resulting from the negotiations that have taken place and the compromise reached) or might lead to the transaction's cancellation (meaning that no production fact will be created). Once the realized production act is accepted, the corresponding production fact is considered to have appeared in the (business) reality. Hence, one could "zoom in" on any of the transactions depicted in Fig. 6 and elaborate each transaction using the transaction pattern presented in Fig. 7. This actually means modeling transactions at two different abstraction levels. At the highest abstraction level, the transaction is represented as a single action which models the production fact that is enabled. At a lower abstraction level, the transaction's communicative aspects are modeled conforming to the transaction pattern. The transaction's request (r), promise (p), state (s), accept (a), decline, and the production act are modeled as separate actions. This is illustrated in Fig. 8 (abstracting from declines and cancellations), featuring only part of the model depicted in Fig. 6, namely focusing only on transactions 5, 6, and 7 (a schematic sketch of the underlying transaction pattern follows Fig. 8):

Fig. 8. Detailed behavior aspect model featuring transactions
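For readers who prefer an operational reading of the transaction pattern of Fig. 7, the following simplified Python sketch (our own, collapsing the two negotiation possibilities into a single 'compromise' flag) walks one transaction through its order and result phases:

from enum import Enum, auto

class Outcome(Enum):
    PRODUCTION_FACT_CREATED = auto()
    CANCELLED = auto()

def run_transaction(executor_promises: bool, initiator_accepts: bool,
                    compromise_on_decline: bool = False) -> Outcome:
    """Walk one transaction through the order and result phases of the pattern.

    Order phase: r(I) -> p(E) or d(E); on decline, negotiation may yield a compromise
    (the executor finally promises an updated proposition) or cancellation.
    Result phase, after the production act: s(E) -> a(I) or d(I); on decline, negotiation
    may yield a compromise (the initiator finally accepts) or cancellation.
    """
    # Order phase.
    if not executor_promises and not compromise_on_decline:
        return Outcome.CANCELLED

    # The production act is performed by the executor here.

    # Result phase.
    if not initiator_accepts and not compromise_on_decline:
        return Outcome.CANCELLED
    return Outcome.PRODUCTION_FACT_CREATED

print(run_transaction(executor_promises=True, initiator_accepts=True))
print(run_transaction(executor_promises=False, initiator_accepts=True))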


Fig. 9. Partial use case model for the border security case

As is seen from the figure, in order for t5 to be realized, both the realization of t6 and the realization of t7 are to be fulfilled. Hence, upon requesting t5 and before the promise, it is necessary that t6 and t7 are initiated. If realized successfully, both transactions' output is necessary for the delivery of the production act of t5 (the production acts are depicted as black boxes in the figure). That is how transactions are elaborated. In summary, such enterprise modeling, featuring entities (and data aspects) and corresponding causal relationships as well as the behavior elaboration of respective transactions, represents an adequate basis for specifying software on top of it. We now move to the specification of software: the derivation of use cases is the first challenge – see Fig. 3. For detailed information concerning the derivation of use cases from transactions, interested readers are referred to [26]; for the sake of brevity, we go directly to a partial use case model, derived on the basis of the 7 transactions (see Figs. 5 and 6). The model is depicted in Fig. 9. As is seen from the figure, all use cases, except for the ones backgrounded in black and grey, correspond to respective transactions: the SYSTEM's DELIVERY OF RECOMMENDATION (assuming behavior refinement) to CUSTOMER includes MATCHING between (i) BEHAVIOR SPECIFICATION and (ii) PRIVACY RESTRICTIONS. In turn, (i) includes MATCHING between (iii) CAPTURED SITUATION and (iv) A SITUATION PATTERN (this matching allowing to identify the right behavior pattern to consider). Those are the so-called essential use cases – the ones straightforwardly reflecting transactions [26, 33]. Those use cases usefully drive the alignment between enterprise modeling and software specification, guaranteeing that the software system-to-be stems from corresponding enterprise models. Nevertheless, next to the essential use cases, we also have: (a) informational use cases, reflecting informational issues (not essential); (b) use cases reflecting user-defined requirements with regard to the software system-to-be [26]. An example of (a) is the use case APPLY SEARCH – delivering situation patterns and generating privacy restrictions are essential business tasks requiring in turn informational activity, namely searching through the corresponding data banks. An example of (b) is the use case CHECK DATA ACCURACY – it may be required by the user that upon match-making, the accuracy of the corresponding data is checked. Those two use cases are only to illustrate (a) and (b). Because of the limited scope of this paper, we have only considered a partial use case model, aiming at being explicit on the enterprise-software alignment that in turn builds upon the weaving of context-awareness and privacy at the enterprise modeling level. For this reason, we are not going to address in the current paper the elaboration of use cases, nor the further software specification reflected in behavior + states modeling and classification. Interested readers are referred to [27], where this is considered and justified by means of a case study. Still, we will consider (in future research) those issues with regard to the land border case example.

8 Conclusions

The contribution of the current paper concerns a proposed design approach that allows for smoothly reflecting context and privacy features (and tackling possible tensions between the two) in the application specification, supported by methodological guidelines that span enterprise modeling and software specification, fueled by the SDBC approach. We have partially demonstrated our way of modeling by means of a case example featuring the domain of land border security. Hence, we have not only contributed to the enterprise-software alignment research (addressing also the challenge of weaving context-awareness and privacy-by-design into the software specification) but we have also delivered a useful domain-specific study featuring an application domain where context-awareness and privacy have essential impact. As future research, we plan to consider a large-scale border security case study assuming software development activities, as well as the consideration of other public values (next to privacy), such as transparency and accountability.

References

1. AWARENESS. Freeband AWARENESS Project (2008). http://www.freeband.nl
2. Ayed, D., Delanote, D., Berbers, Y.: MDD approach for the development of context-aware applications. In: Kokinov, B., Richardson, D.C., Roth-Berghofer, T.R., Vieu, L. (eds.) CONTEXT 2007. LNCS (LNAI), vol. 4635, pp. 15–28. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74255-5_2
3. Bunge, M.A.: Treatise on Basic Philosophy. A World of Systems, vol. 4. D. Reidel Publishing Company, Dordrecht (1979)
4. Burghardt, T., Buchmann, E., Böhm, K.: Why do privacy-enhancement mechanisms fail, after all? A survey of both, the user and the provider perspective. In: Workshop W2Trust, in Conjunction with IFIPTM (2008)
5. Cockburn, A.: Writing Effective Use Cases. Addison-Wesley, Boston (2000)
6. Dey, A.K.: Understanding and using context. Pers. Ubiquit. Comput. 5(1), 4–7 (2001)
7. Dietz, J.L.G.: Enterprise Ontology, Theory and Methodology, 1st edn. Springer, Heidelberg (2006). https://doi.org/10.1007/3-540-33149-2


8. Dietz, J.L.G.: Generic recurrent patterns in business processes. In: van der Aalst, W.M.P., Weske, M. (eds.) BPM 2003. LNCS, vol. 2678, pp. 200–215. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-44895-0_14
9. Friedman, B., Hendry, D., Borning, A.: A survey of value sensitive design methods. Int. J. Found. Trends. Hum. Comput. Interact. 11, 63–125 (2017)
10. FRONTEX: The website on the European Agency, FRONTEX (2018). http://frontex.europa.eu
11. Henricksen, K., Indulska, J.: Developing context-aware pervasive computing applications: models and approach. Perv. Mob. Comput. 2, 37–64 (2006)
12. Hevner, A.R., March, S.T., Park, J., Ram, S.: Design science in information systems research. MIS Q. 28(1), 75–105 (2004)
13. Huberman, B.A., Franklin, M., Hogg, T.: Enhancing privacy and trust in electronic communities. In: 1st International ACM Conference on Electronic Commerce, EC 1999. ACM (1999)
14. Hustinx, P.: Privacy by design: delivering the promises. Identity Inf. Soc. 3(2), 253–255 (2010)
15. IoTDI: 2nd International Conference on Internet-of-Things Design and Implementation. ACM/IEEE (2017)
16. Janssen, M., Van den Hoven, J.: Big and open linked data (BOLD) in government: a challenge to transparency and privacy? Gov. Inf. Q. 32(4), 363–368 (2015)
17. Johnston, A., Wilson, S.: Privacy compliance risks for Facebook. IEEE Technol. Soc. Mag. 31(2), 59–64 (2012)
18. Könings, B., Schaub, F., Weber, M.: Privacy and trust in ambient intelligent environments. In: Ultes, S., Nothdurft, F., Heinroth, T., Minker, W. (eds.) Next Generation Intelligent Environments, pp. 133–164. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-23452-6_4
19. Kruchten, P.: The Rational Unified Process, An Introduction. Addison-Wesley, Boston (2003)
20. LBS: LandBorderSurveillance, the EBF, LandBorderSurveillance Project (2012). http://ec.europa.eu
21. Liu, K.: Semiotics in Information Systems Engineering. Cambridge University Press, Cambridge (2000)
22. MDA: The OMG Model Driven Architecture (2018). http://www.omg.org/mda
23. Offermann, P., Blom, S., Schönherr, M., Bub, U.: Artifact types in information systems design science – a literature review. In: Winter, R., Zhao, J.L., Aier, S. (eds.) Global Perspectives on Design Science Research. DESRIST 2010. LNCS, vol. 6105, pp. 77–92. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13335-0_6
24. Pearson, S.: Taking account of privacy when designing cloud computing services. In: International Workshop on Software Engineering Challenges of Cloud Computing, ICSE 2009 (2009)
25. Seničar, V., Jerman-Blažič, B., Klobučar, T.: Privacy-enhancing technologies approaches and development. Comput. Stand. Interfaces 25(2), 147–158 (2003)
26. Shishkov, B.: Enterprise Information Systems, A Modeling Approach, 1st edn. IICREST, Sofia (2017)
27. Shishkov, B.: Software specification based on re-usable business components (Ph.D. thesis), 1st edn. TU Delft, Delft (2005)
28. Shishkov, B., Janssen, M., Yin, Y.: Towards context-aware and privacy-sensitive systems. In: 7th International Symposium on Business Modeling and Software Design, BMSD 2017. SCITEPRESS (2017)


29. Shishkov, B., Mitrakos, D.: Towards context-aware border security control. In: 6th International Symposium on Business Modeling and Software Design, BMSD 2016. SCITEPRESS (2016)
30. Shishkov, B., van Sinderen, M.: From user context states to context-aware applications. In: Filipe, J., Cordeiro, J., Cardoso, J. (eds.) ICEIS 2007. LNBIP, vol. 12, pp. 225–239. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88710-2_18
31. Shishkov, B., van Sinderen, M.J., Tekinerdogan, B.: Model-driven specification of software services. In: IEEE International Conference on e-Business Engineering, ICEBE 2007. IEEE (2007)
32. Shishkov, B., van Sinderen, M.J., Quartel, D.: SOA-driven business-software alignment. In: IEEE International Conference on e-Business Engineering, ICEBE 2006. IEEE (2006)
33. Shishkov, B., Dietz, J.L.G.: Deriving use cases from business processes, the advantages of DEMO. In: 5th International Conference on Enterprise Information Systems, ICEIS 2003. SCITEPRESS (2003)
34. Seigneur, J.-M., Jensen, C.D.: Trading privacy for trust. In: Jensen, C., Poslad, S., Dimitrakos, T. (eds.) iTrust 2004. LNCS, vol. 2995, pp. 93–107. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-24747-0_8
35. Simons, C., Wirtz, G.: Modeling context in mobile distributed systems with the UML. Vis. Lang. Comput. 18(4), 420–439 (2007)
36. UML: The Unified Modeling Language (2017). http://www.uml.org
37. Vieira, V., Tedesco, P., Salgado, A.C.: Designing context-sensitive systems: an integrated approach. Expert Syst. Appl. 38(2), 1119–1138 (2011)
38. Vom Brocke, J., Zelt, S., Schmiedel, T.: On the role of context in business process management. Inf. Manag. 36(3), 486–495 (2016)
39. Weber, R.H.: The digital future – a challenge for privacy? Comput. Law Secur. Rev. 31(2), 234–242 (2015)
40. Zhu, N., Zhang, M., Feng, D., He, J.: Access control for privacy protection for dynamic and correlated databases. In: International IEEE SmartCity Conference, SmartCity 2015. IEEE (2015)

Towards an Integrated Architecture Model of Smart Manufacturing Enterprises

Thijs Franck1, Maria-Eugenia Iacob2, Marten van Sinderen2, and Andreas Wombacher1

1 Aurelius Enterprise, Amsterdam, Netherlands
{thijs.franck,andreas.wombacher}@aureliusenterprise.com
2 University of Twente, Enschede, Netherlands
{m.e.iacob,m.j.vansinderen}@utwente.nl

Abstract. With the introduction of smart manufacturing, the scope of IT expands towards physical processes on the shop floor. Enterprise architects, software engineers and process engineers will have to work closely together to build the information systems that are connected to the shop floor and aligned with the business needs of smart manufacturers. However, it is unclear whether they have the means to do so. This research aims to provide enterprise architecture modelling support for smart manufacturers by investigating ArchiMate 3.0's fitness for this purpose. The ArchiMate 3.0 meta-model is compared to the ISA-95 standard for enterprise systems and control systems integration. Modelling patterns are introduced, along with some new modelling concepts, to compensate for the deficiencies found. The patterns proposed are validated as part of a case study.

Keywords: ArchiMate 3.0 · Enterprise architecture · ISA-95 · Smart manufacturing · Industry 4.0

1 Introduction

Manufacturing companies worldwide are facing the need to improve productivity and quality, as well as to implement new products, while shortening innovation cycles. To this end, the manufacturing industry is currently in the process of adopting the new Smart Manufacturing paradigm, also known as the Industry 4.0 paradigm. Smart Manufacturing promises smart machine line operations, high-fidelity models of production processes and improved decision-making support [3]. For the benefits of Smart Manufacturing to materialize, manufacturers will need some way to maintain alignment between their business needs and the information systems that permeate increasingly through all levels of their operations [4, 13]. Maintaining alignment between a company's strategy (the business domain) and its supporting IT is one of the main benefits of enterprise architecture (EA) [1]. The management of processes at the shop floor and the systems used to operate the industrial control devices have traditionally fallen under the Operations Technology (OT) domain of process engineers. As OT increasingly starts to overlap with IT, it makes sense to consider the physical domain from an IT perspective. As a result, the dichotomy between IT and OT fades, in favour of a single EA for the manufacturing domain.


To make this integration between business, IT and OT successful, enterprise architects and process engineers must have a shared modelling language that can express all concepts required for modelling the EA of the manufacturing domain. One of the major requirements introduced by Smart Manufacturing is the modelling of cyber-physical systems (CPS). A CPS is a type of information system that integrates computational and physical processes and allows these processes to interact [9]. For example, an oven may report its temperature curve in real time. If this curve is sub-optimal, the oven wastes energy. Such an insight could be used as input for operational excellence programs, or for preventive maintenance. The modelling of such systems will involve not just viewpoints and concepts related to applications and IT infrastructure, but also to the physical environment (i.e., conditions on the shop floor) [10].

For this research, we adopt the international open standard ArchiMate as our EA modelling language of choice. The most recently published version of the standard, ArchiMate 3.0 [11], already includes several concepts for modelling the physical environment of enterprises. Being a new release, however, it has not been seriously validated or applied in the manufacturing domain. To ensure that ArchiMate enables the modelling of a smart manufacturer's EA, the standard needs to be validated for that particular purpose. We adopt a process framework and a common object model published as part of the standards suite ANSI/ISA-95 [6, 7] (alternatively, ISO/IEC 62264), or ISA-95 for short, to represent the manufacturing domain. The ISA-95 common object model [7] describes entities at the shop floor level, where IT and OT interact, whereas the ISA-95 process framework describes exactly this interaction. Conversely, while ISA-95 describes the physical domain, it does not describe the business or IT domains very well, nor was it intended to model EAs in the first place. Thus, to be capable of modelling the EA of a smart manufacturer, ArchiMate 3.0's meta-model needs to be able to express all architectural concepts from ISA-95. To that end, this paper tries to answer the following questions:

RQ1. To what extent can ArchiMate 3.0 express the EA of any smart manufacturer per ISA-95?
RQ2. If ArchiMate 3.0 cannot fully express the EA of any smart manufacturer per ISA-95, what changes to the meta-model of ArchiMate 3.0 are necessary to make this possible?

Thus, the contribution of this research concerns an analysis of whether the meta-model of ArchiMate 3.0 is expressive enough to model an EA in the manufacturing domain. Secondly, we propose a set of modelling patterns describing how ISA-95 concepts can be expressed in ArchiMate. These patterns can be simple direct mappings, or may involve a grouping of ArchiMate concepts. Finally, to enhance ArchiMate's expressiveness and enable the modelling of certain smart manufacturing concepts, some change suggestions are made.

The remainder of the paper is organised as follows. In Sect. 2 we explain the methodology we followed to define a mapping from ISA-95 to ArchiMate, and to analyse the expressiveness of ArchiMate. Section 3 describes the results of the analysis, and contains the main contribution of the paper. Section 4 gives an account of how we validated our findings. We conclude the paper with a discussion of the related work in Sect. 5 and with conclusions and some pointers to future work in Sect. 6.

2 Methodology

To define a mapping from ISA-95 to ArchiMate and answer the research questions, we followed a four-step approach. Firstly, we derived a subset of architectural concepts from the concepts defined by ISA-95. ISA-95 was written with IT/OT integration in mind. To apply its concepts to architecture modelling, an assessment is necessary to find out which concepts qualify as architectural. For this assessment, the same criteria that were used to define the current set of concepts in ArchiMate are applied to each concept in ISA-95. These criteria are explained in Sect. 3.1. Secondly, we make a comprehensive mapping of the architectural ISA-95 concepts onto ArchiMate 3.0. The criteria used for the mapping are the similarity of concept definitions, as well as the similarity of direct relations to other concepts (depth = 1). Thirdly, ArchiMate's expressiveness concerning the smart manufacturing domain is investigated by identifying semantic deficiencies in terms of the types defined by Wand and Weber [14] (see Sect. 3.3). We assume that the ISA-95 common object model is a complete representation of entities at the shop floor level. Given our goal of representing this same domain in ArchiMate 3.0, the ISA-95 common object model should fully map onto ArchiMate 3.0. Whether ISA-95 can fully express ArchiMate is not of interest. Thus, we only consider deficiencies of type construct overload, where several ISA-95 constructs map to one ArchiMate construct, and of type construct deficit, where an ISA-95 construct does not map to any ArchiMate construct. The deficiencies identified are subsequently analysed and, if necessary, addressed. In the case of construct overload, an assessment is made concerning critical expressiveness loss as a result of the higher abstraction level. In the case of construct deficit, it must be determined whether the intended meaning of the ISA-95 concept can be expressed using a combination, or 'pattern', of constructs currently present in ArchiMate 3.0's meta-model. If the current meta-model is found insufficiently expressive, we suggest a pattern that includes new constructs (i.e., new relations or concepts). Finally, the identified patterns are validated as part of a case study at SteelCorp. The validation aims to prove the usefulness of the patterns in modelling the EA of a manufacturer, as well as to demonstrate the usefulness of such a model through two common manufacturing use cases: an impact of change analysis and an operational excellence analysis.

3 Analysis

The results of several parts of the analysis have been summarized in a spreadsheet (from here on referred to as 'the spreadsheet'), which is made available online via http://bit.ly/2amGJqi.

3.1 Excluding Non-architectural Concepts from ISA-95

To determine the architectural concepts in the ISA-95 common object model, it is necessary to perform a 'normalization' of the ISA-95 concepts to a level of abstraction that coincides with that of ArchiMate concepts. The criteria for normalization are the same as those originally used to determine the ArchiMate concepts. ArchiMate uses for this a layered structure [8]. Starting at the lowest specialization level, concepts are defined in a highly abstract manner as simply entities and relationships between them. At the next level, concepts are specialized as either passive structure concepts, behaviour concepts or active structure concepts, corresponding to the basic structure of the ArchiMate language (dynamic system level). Concepts are then further specialized as EA concepts, which are the ones used to build architecture models. ArchiMate defines implementations of concepts in architecture models as its lowest level of abstraction. At each specialization step, the utility of the specialization must be argued based on the modelling goals that the modeller has in mind. Following this structure, any ISA-95 concept that is architectural will need a specialization relation to one of the concept types at ArchiMate's dynamic system level. The concepts at the dynamic system level are listed in Table 1.

Table 1. Dynamic system level concept types [8].
Active structure concept – An entity that is capable of performing behaviour
Behaviour concept – A unit of activity performed by one or more active structure elements
Passive structure concept – An object on which behaviour is performed

By eliminating all ISA-95 concepts that do not have a specialization relationship to one of these concepts, we end up with a normalized set of architectural concepts. The normalization analysis reveals that 66% of ISA-95 concepts are architectural. The remaining 33% are non-architectural. For example, 'person' qualifies as an architectural concept since a person can perform behaviour. Properties describing that person are non-architectural concepts. To review specifically which concepts classify as architectural, please refer to the spreadsheet.

3.2 Mapping ISA-95 to ArchiMate 3.0

To define a mapping from ISA-95 concepts to ArchiMate we follow a two-step approach: Firstly, for each architectural ISA-95 concept, a comparison is made between its definition and the definition of every ArchiMate concept. Secondly, if there is a fit with one or more definitions, a further comparison is made. In this comparison, each direct relation (depth = 1) of the ISA-95 concept is compared to each of the concepts directly surrounding the ArchiMate concept. This includes both the definition of the surrounding object and the definition of the connecting relationship. If these relations are also in alignment, an ISA-95 concept maps to ArchiMate.
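A minimal Python sketch of this two-step check is given below; the similarity test and the concept records are placeholders of our own (in the actual analysis the comparison was performed manually and recorded in the spreadsheet):

from dataclasses import dataclass, field
from typing import Set

@dataclass
class Concept:
    name: str
    definition: str
    relations: Set[str] = field(default_factory=set)  # names of directly related concepts (depth = 1)

def definitions_fit(a: Concept, b: Concept) -> bool:
    """Placeholder similarity check between two concept definitions (here: shared words)."""
    return bool(set(a.definition.lower().split()) & set(b.definition.lower().split()))

def maps_to(isa95: Concept, archimate: Concept) -> bool:
    """Step 1: the definitions must fit. Step 2: every depth-1 relation must find a counterpart."""
    if not definitions_fit(isa95, archimate):
        return False
    return isa95.relations.issubset(archimate.relations)

# Hypothetical, simplified concept records for illustration only.
material_lot = Concept("Material Lot", "identifiable quantity of material",
                       relations={"Material Definition"})
material = Concept("Material", "physical material used or produced in production",
                   relations={"Material Definition", "Equipment"})
print(maps_to(material_lot, material))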


Of the architectural concepts in ISA-95, 12% map to ArchiMate completely, 75% have a fit based on definition but have one or more relations that cannot be mapped, and 13% have no matching definition to start with. For a reference on which specific ISA-95 concept maps to which specific ArchiMate 3.0 concept, please refer to the spreadsheet.

N-to-M mappings. In some cases, it turns out that several concepts from ISA-95 map to several other concepts from ArchiMate. These mappings are ambiguous, causing uncertainty with regard to which concept to use. According to the mapping, several concepts would be correct. These n-to-m mappings need to be addressed before moving forward. In particular, this concerns the following two mapping scenarios.

Process Segment. Process Segment, Process Segment Dependency, Operations Segment and Operations Segment Dependency map to Business Process, Business Function, Business Interaction and Business Event. There appears to be an n-to-m mapping in this scenario. However, strictly comparing the definitions of the ISA-95 concepts, as well as the relations they share with surrounding concepts (depth = 1), the ISA-95 concepts turn out to be synonymous. This resolves the n-to-m mapping to concept redundancy, which will be addressed in Sect. 3.3. This case shall be further referred to as Process Segment.

Equipment. Equipment Class and Equipment map to Business Role, Location, Equipment and Facility. In this second scenario, Equipment and Equipment Class are not synonymous per the ISA-95 meta-model. However, given that ArchiMate does not distinguish between classes and instances, Equipment Class and Equipment can safely be abstracted to mean the same thing. This, again, resolves the n-to-m mapping to concept redundancy, which will be further discussed in Sect. 3.3. This case shall be further referred to as Equipment.

3.3 Classifying Deficiencies in ArchiMate 3.0

Based on the previously established mapping of ISA-95 onto ArchiMate, several deficiencies in ArchiMate 3.0 can be identified. Classifying each deficiency will help find a suitable solution at a later stage. Four types of deficiency exist [14]. Table 2 describes each type. We assume that the ISA-95 common object model is a complete representation of the entities on the shop floor. Thus, if ArchiMate is capable of modelling the EA of a smart manufacturer, its meta-model should be capable of expressing ISA-95. Based on this analysis, several cases of construct overload, as well as construct deficit, are uncovered. The following sections discuss the occurrences of each type.


Table 2. Types of deficiencies [14].
Type | Description
Construct overload | Several ontological constructs map to one grammatical construct
Construct redundancy | Several grammatical constructs map to one ontological construct
Construct excess | A grammatical construct might not map to any ontological construct
Construct deficit | An ontological construct might not map to any grammatical construct

Cases of construct overload. Construct overload (i.e., several ISA-95 concepts map onto one ArchiMate 3.0 concept) occurs in the case of the following ArchiMate concepts.

Business Object is used to represent information objects that are used on the shop floor and may serve as a placeholder for more complex entities like a schedule or a bill of materials. Table 3 lists the objects that map to Business Object. Where a business object is used, the model will depend on relationships to other entities to provide the expressiveness needed to model the meaning that the user intends. If this level of expressiveness cannot be achieved, this causes a construct deficit.

Personnel Class and Equipment map to Business Role. This happens specifically in the case where Equipment refers to an automated production unit. This abstraction loses the direct distinction between a manual and an automated role. However, depending on whether a given role depends on an actor or not, this distinction can still be derived.

Material Class, Material Definition, Material Lot and Material Sublot map to Material in ArchiMate. Because of this, the distinction between a class of material and a specific type of material used as part of a process is lost. Furthermore, the difference between a class of material and an identifiable instance (or group of instances) is also lost.

Table 3. Construct overload to business object (ISA-95 objects mapping to Business Object).
Qualification test specification; Equipment capability test specification; Physical asset capability test specification; Material test specification; Material assembly; Material definition assembly; Material class assembly; Personnel segment specification; Equipment segment specification; Material segment specification; Material segment specification assembly; Physical asset specification; Operations material bill; Personnel specification; Equipment specification; Physical asset specification; Material specification; Material specification assembly; Operations schedule; Segment requirement; Personnel requirement; Equipment requirement; Physical asset requirement; Material requirement


Cases of construct deficit. Several deficits have been identified as part of the mapping analysis. When a deficit occurs, the ISA-95 concept cannot be expressed in ArchiMate. Each deficit is explained in the paragraphs below.

Various concepts in ISA-95 are related to a test specification that is used to test certain properties of said concepts. A Test Specification maps to a Business Object. The ArchiMate meta-model only allows for an association relationship between Active Structure concepts and a Business Object, whereas the dependency in ISA-95 is stronger.

An assembly is a collection or set of related elements. In ISA-95, assemblies are represented as classes related to aggregation relationships between elements. In ArchiMate, every element can also have an aggregation relation with an element of the same type. There is, however, no class that represents information about this relation.

A process segment (maps to business process) in ISA-95 is a collection of several concepts, including specific parameters that do not fall into the category of personnel, equipment, physical asset or material. These ‘other’ parameters are known as process segment parameters. ArchiMate allows only well-defined concepts to be related to a business process.

While an ISA-95 Material can be directly mapped to an ArchiMate material, a problem occurs when attempting to map a Material Lot. A requirement for a Material Lot is that it should be possible to determine its current state based on the lot ID. This requires traceability to an information object, i.e., a Business Object. While it is possible to relate a Material Lot to a Business Object through an association, the relationship between a physical and an information object is deemed more meaningful.

The operations definition describes the relation between a production, maintenance, inventory or quality operation, the way in which it is implemented and the resources that are needed. A framework for these kinds of manufacturing operations is defined by the first part of ISA-95 [6]. ArchiMate only loosely defines business processes, independent of their context.

ISA-95 defines a schedule concept. It is implemented as a set of operations requests, which directly relate to an operations definition. There is no similar concept in ArchiMate.

ISA-95 makes a distinction between the definition of a process, the planned process and the actual process. Once executed, Operations Responses are returned for every Operations Request (which make up the schedule). In ArchiMate, an Operations Response can be represented as either a Business Object or a Data Object, depending on whether this information is collected digitally or not. The actual production information is, however, much too volatile to model as part of the architecture.

3.4 Addressing the Deficiencies Found

Now that several deficiencies have been identified, solutions can be defined that allow ArchiMate to express all the architectural concepts in ISA-95, thus making the language suited to model the shop floor and, by extension, the EA of a smart manufacturer. The solutions to the deficiencies identified will be discussed below as modelling patterns. A pattern is a set of constructs from ArchiMate that expresses a certain aspect of ISA-95. Preferably, only existing constructs will be included in these patterns. If a new construct must be introduced, it will conform to the requirements for constructs in ArchiMate [11]. The following paragraphs discuss the solutions per deficiency.

Test Specifications. Various concepts in ISA-95 are related to a test specification that is used to test certain properties of said concepts. Often, these concepts are mapped to active structure concepts in ArchiMate. For example, a Person (maps to Actor) relates to a Qualification Test Specification (maps to Business Object). A Business Object is, however, a passive structure concept, and the ArchiMate meta-model only allows for an association relationship between Active Structure concepts and a Business Object. As discussed in Sect. 3.3, we must rely on the context of the ArchiMate model to define the meaning of a Business Object. For a Test Specification, which has a very specific purpose in ISA-95, we deem an association relationship insufficient, since this association without context can be interpreted in many ways. A stronger relation [2] between an Active Structure concept and a Business Object can only be established via a Behaviour concept, specifically the assigned to relations (for Active Structure concepts) and accesses relations (for Business Objects) to Business Service, Business Event and Business Process. Since relations from the physical layer are only allowed to Business Process, Business Function and Business Interaction (and not Business Service or Business Event), this leaves Business Process as the only option. Given this limitation, we define the Test Specification Pattern as shown in Fig. 1.

Fig. 1. Test specification pattern for ArchiMate 3.0.
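The pattern in Fig. 1 chains an active structure element, a Business Process and the test specification Business Object. As a non-normative sketch only, the triples below record that chain as data; the tuple encoding is ours, not an ArchiMate exchange format.

```python
# Test specification pattern from Fig. 1, written out as (source, relation, target)
# triples. The encoding is illustrative; concept and relation names follow the text.
test_specification_pattern = [
    ("Active structure element (e.g. Business Actor)", "assigned to", "Business Process"),
    ("Business Process", "accesses", "Business Object (test specification)"),
]
```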

Assemblies. An assembly is a collection or set of related elements. In ISA-95, assemblies are represented as classes related to aggregation relationships between elements. In ArchiMate, every element can also have an aggregation relation with an element of the same type, but there is no class that represents information about this relation. For example, to express the size of an assembly in ArchiMate, it would be necessary to model an entity for each element in the collection. This makes sense in a scenario where each instance of a class is meaningfully different. For example, since every person has different qualifications, it is meaningful to model people separately as part of a team. However, in the case where the elements of a collection are not meaningfully different, e.g. a set of materials used for the production of a batch (bill of materials), it makes more sense to model each material as a class rather than as separate instances. When modelling only a class, however, the quantity of the material used for the production of, e.g., a batch is still meaningful information. Both alternatives below present a solution that makes use of a parameter attached to a relationship to express this meaning. Such a pattern can also be used to express the Operations Material Bill Item concept per ISA-95.


Alternative 1. To model such information relevant to an assembly, parameters for the relation between the class (e.g. a material) and the assembly (e.g. a bill of materials) are proposed. While ISA-95 defines assemblies broadly, in the specification they only occur in relation to materials. A placeholder mapping for an assembly would be a business object. Currently, there exists an indirect relation between Business Object and Material through Business Process. The information relevant to an assembly could be attached to the relation between Material and Business Process as a (set of) parameter(s).

Alternative 2. This implementation eliminates the need for a separate Business Object by modelling the bill of materials implicitly through the set of relations between said Business Process and the Materials used. Figure 2 illustrates the proposed pattern.

Fig. 2. Implicit bill of materials pattern for ArchiMate.

However, the solution presented in alternative 1 does not allow for a bill of materials to be modelled explicitly. A bill of materials is quite common in manufacturing, so the capability to include this concept explicitly may be desirable. To do so, a direct relation between Business Object and Material is necessary. An association relationship is currently available between Material and Business Object. However, as explained in Sect. 3.3, we feel that the use of an association relationship in this case is not sufficiently expressive. Instead, an aggregation relationship is proposed. An aggregation relationship indicates that a concept (the bill of materials) groups a number of other concepts (materials). While Materials are meaningful independent of one another, the bill of materials groups them for the purposes of use in a production process. The proposed parameters would be attached to this relationship. This solution is, however, not perfect either. The relation between Business Object and Material makes the relation between Business Process and Material redundant, since the Bill of Materials will always be related to a production process (Business Process). Figure 3 shows a pattern for the modelling of an explicit bill of materials. There are two major differences between this pattern and the pattern for an implicit bill of materials. Firstly, this pattern includes a Business Object that denotes the bill of materials. This Business Object is related to the Business Process via an accesses relation. This relation currently exists in ArchiMate. The bill of materials lists one or more Materials via an aggregation relation. This aggregation relation is newly introduced for this purpose. Secondly, the information describing the assembly is related to the aggregation relation between Material and Business Object, as denoted by the dotted line.

Fig. 3. Explicit bill of materials pattern for ArchiMate.
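As a sketch only, the explicit pattern of Fig. 3 can be held as plain data, with the assembly information attached to the aggregation relation between the bill of materials and each material. The process name and quantities below are hypothetical.

```python
# Sketch of the explicit bill-of-materials pattern (Fig. 3) as plain data.
# Quantities are illustrative; attaching them to the aggregation relation is
# what the proposed relationship parameters are intended to express.
bill_of_materials = {
    "type": "Business Object",
    "name": "Bill of materials - batch annealing",
    "accesses": "Batch annealing (Business Process)",
    "aggregates": [  # new aggregation relation, with parameters per material
        {"material": "Steel coil", "quantity": 3},
        {"material": "Protective gas", "quantity": "per batch"},
    ],
}
```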

Process Segment Parameter. A process segment (maps to business process) in ISA-95 is a collection of several concepts, including specific parameters that do not fall into the category of personnel, equipment, physical asset or material. The ‘other’ parameters are known as process segment parameters. For a production process, an example might include the known lead time of a process step (e.g. the steel coil needs to be in the oven for 10 min). For a quality process, a parameter might be the size of the sample to be pulled (e.g. 1 coil). ArchiMate allows only well-defined concepts to be related to a business process. The only concepts that fit with the description of Process Segment Parameter (i.e. related to business process and not a person, equipment or material) are Business Service and Business Event (behavior), or Business Object (passive structure). A timer like in the oven example would typically be modeled as an event, but a parameter like sample size cannot be expressed formally in ArchiMate. If needed, such information can be included as part of the sub-process name (e.g. take a quality sample, size 1). Modelling this information as such works as a way to capture it informally, e.g. for presentation purposes. However, for analysis purposes, a more formal approach will be required, since information stored in a concept name cannot be queried easily. The proposed solution is to introduce parameters related to a business process. This is similar to the solution introduced to model assemblies, with the difference being that the parameters are related to a concept rather than a relation. Examples of parameters are average duration, sample size or temperature. This parameter pattern can also be used to model other manufacturing object parameters, per the ISA-95 object properties. The parameter pattern for concepts is illustrated in Fig. 4.


Fig. 4. Process parameter pattern for ArchiMate 3.0.
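To make the concept-parameter idea of Fig. 4 concrete: the parameters live next to the business process element instead of being buried in its name, so they remain queryable. The sketch below uses illustrative names and values only.

```python
# Process parameters kept as structured data attached to the concept (Fig. 4)
# rather than encoded in the concept name. Values are illustrative.
process = {
    "type": "Business Process",
    "name": "Take quality sample",
    "parameters": {"sample size": 1, "average duration (min)": 10},
}

# Because the parameters are structured, they can be queried, unlike a name string:
sampled = [p for p in [process] if p["parameters"].get("sample size", 0) >= 1]
print([p["name"] for p in sampled])  # ['Take quality sample']
```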

Material Lot. While an ISA-95 Material can be directly mapped to an ArchiMate Material, a problem occurs when attempting to map a Material Lot. The current state of a Material Lot should be accessible via its ID. This requires traceability to an information object, i.e. a Business Object. It is possible to associate a Material with a Business Process and a Business Object with a Business Process. It is even possible to draw an association between Material and Business Object. In the case of a Material Lot, however, the relationship between Physical Object and Information Object is more meaningful than an association. The relationship should describe how the informational object reflects the state of the physical object it represents. To add this expressiveness, a realization relationship is proposed. A realization relationship links a logical entity with a more concrete entity that realizes it. Thus, a realization relation could describe how a physical object is represented by an information object. Furthermore, a Data Object may realize a Business Object. This Data Object can, by means of an indirect relation, be considered as the digital representation of said Material stored in some information system. This extrapolation would not be valid if a weaker relation were used between the physical object and the Business Object. Finally, by linking the data model of said Data Object to the architecture, it becomes possible to perform analyses of a material's production lifecycle. The same logic also applies to other physical elements. For example, a piece of equipment may be used as part of a business process, causing it to change state (e.g. from ‘idle’ to ‘in use’). Per ISA-95, entities associated with processes include materials, as well as physical assets, equipment and people. Because of this relation in ISA-95, the same realization relation that is proposed for ArchiMate between Material and Business Object is also proposed as a relation between Business Object and Business Role, Business Actor, Equipment and Facility (see Table 4).

Table 4. Proposed relations.
ArchiMate concept | ISA-95 concept | Relation | Concept
Material | Material lot | Realizes | Business object
Business role | Personnel class | Realizes | Business object
Business actor | Person | Realizes | Business object
Equipment | Equipment class | Realizes | Business object
Facility | Physical asset | Realizes | Business object


Finally, the Business Process concept is included to show that the newly added realization relation is only intended for those concepts that have an accesses or assigned to relation with Business Process. Figure 5 illustrates the proposed extension. The newly added realization relation is marked with a red circle. For the sake of legibility, the ‘Physical Elements’ block serves as a placeholder for the ArchiMate concepts listed in Table 4. The figure also shows how an indirect realization relation between Data Object and a Physical element can be derived using the realization relation between Data Object and Business Object.

Fig. 5. Informational representation of a material. (Color figure online)

Operations Definition. The operations definition describes the relation between a production, maintenance, inventory or quality operation, the way in which it is implemented and the resources that are needed to carry out the process. A framework for these kinds of manufacturing operations is defined by the first part of ISA-95. ArchiMate only loosely defines business processes, independent of their context. However, the ISA-95 process framework [6] can be implemented in ArchiMate. It can then provide structure through composition relations from framework processes to processes that are company-specific. Figure 6 shows a pattern for how to apply the ISA-95 process framework to company-specific business processes. Such processes are modelled as sub-processes (hence the composition relation) of processes from the ISA-95 process framework. Since both ISA-95 processes and their sub-processes have flow relations between them, a sub-process cannot be composed in more than one ISA-95 process. If a process in a currently existing model cannot fulfil this requirement (e.g. Batch Annealing in Fig. 6), that process needs to be decomposed such that each sub-process only has one relation to an ISA-95 process. ISA-95 also explicitly defines a Bill of Materials in relation to an Operations Definition. A business object best fits this definition, but a business object cannot have a relation to a material (except through a business process). ArchiMate does implicitly define a bill of materials through the accesses relation between business process and material. The pattern introduced for Assemblies solves this issue.


Fig. 6. Operations definition pattern (incl. example).

Operations Schedule. ISA-95 defines a schedule concept. It is implemented as a set of operations requests, which directly relate to an operations definition. There is no particular ordering (time sequence) to the set. There is no similar concept in ArchiMate. While the schedule itself could be modelled as a business object, another issue arises with regard to the relation between a business object and a business process. A business process is typically modelled as a class in ArchiMate, while the schedule must relate to instances to be meaningful. It would either be necessary to model each instance of the process separately, or to model no relations to business processes at all, effectively making the schedule a placeholder object. The first is preferable from an analysis standpoint, while the second is preferable from a complexity standpoint. A compromise between these two options is to, rather than model each instance as part of the architecture, include a reference to the data model used to store each instance (Fig. 7). This data model can then serve as the basis for a query. The way in which this query is structured will depend on the viewpoint for which the information is required. For example, a query based on product ID may reveal which execution path was followed for the production of that unit.

Fig. 7. Operations schedule pattern.
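To make the idea of querying by the referenced data model concrete, the snippet below shows one possible query over a hypothetical instance table; the table and column names are assumptions, not part of ISA-95 or ArchiMate.

```python
# Hypothetical query over the data model referenced by the schedule pattern
# (Fig. 7). Table and column names are illustrative assumptions.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE operations_request (product_id TEXT, process_step TEXT, seq INTEGER)")
con.executemany("INSERT INTO operations_request VALUES (?, ?, ?)",
                [("coil-42", "Pickling", 1), ("coil-42", "Batch annealing", 2)])

# Which execution path was followed for the production of one unit?
path = con.execute(
    "SELECT process_step FROM operations_request WHERE product_id = ? ORDER BY seq",
    ("coil-42",),
).fetchall()
print([step for (step,) in path])  # ['Pickling', 'Batch annealing']
```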

Operations Performance. ISA-95 makes a distinction between the definition of a process (operations definition), the planned process (operations schedule) and the actual process (operations performance). Once executed, Operations Responses are returned for every Operations Request (which make up the schedule). An Operations Response is made up of ‘actuals’, which represent the people, equipment, materials and physical assets that were utilized. In ArchiMate, an Operations Response can be represented as either a Business Object or a Data Object, depending on whether this information is collected digitally or not. The actual production information, such as the actual execution of the process and any errors that may have occurred, is, however, much too volatile to model as part of the architecture. Instead, it is recommended to relate an Operations Response object to a specification of the data model, describing how the data can be obtained externally (e.g. an E/R-diagram). Based on this specification and the relation to a data object accessed by some application, it will be possible to generate a query for analysis purposes. The proposed pattern is shown in Fig. 8.

Fig. 8. Performance actual pattern.

4 Validation

A case study has been done to validate whether ArchiMate (plus the ISA-95 based modelling patterns) effectively introduces EA modelling capability to manufacturers. The case concerned a large steel manufacturer (named SteelCorp for the sake of anonymity) that intends to make a change in one of its production processes. We show not only how the models created help manufacturers in their transformation efforts, but also how they can be used as a basis for further analysis applicable to a smart manufacturing landscape. The process modelled as part of this case study concerns a batch annealing process for steel coils at one of SteelCorp's factories. During batch annealing, a group of three coils is placed into an oven. Heat is applied over a period of time to change certain qualities of the steel. SteelCorp is looking to optimize this process and to harmonize its surrounding application landscape. Firstly, proposed optimizations include the integration into the production planning system (PPS) of information used in several activities preparatory to production; the PPS is used to manage the utilization of the ovens. Secondly, to increase oven utilization, SteelCorp plans to generate optimized batches of coils from the PPS, rather than having employees combine each batch manually. Thirdly, to minimize waiting times once a batch leaves the oven, SteelCorp wants to know how long it takes for a coil to cool down in inventory. For this reason, they will install thermometers that monitor each coil periodically. Finally, actual oven temperature curves will be recorded and stored in the data warehouse with the intent of using this data to optimize energy efficiency.

4.1 Modelling

Creating a model of this process involved establishing the batch annealing process formally as part of the business domain, as well as modelling its relation to the physical, digital and IT infrastructure domains. Notable physical objects include the steel coils and the ovens. Information is associated with these objects at several stages in the process and this information moves through several systems throughout the production lifecycle. An as-is process model was made. This model could successfully be used to demonstrate the challenges SteelCorp was facing and to motivate the proposed changes. The as-is model included first and foremost a process model describing each step in the batch annealing process. We modelled this primarily using business processes. Further, we modelled materials, equipment and facilities and linked them directly to the business processes. This helps to show the flow of goods and the use of physical assets across the shop floor, as well as how the process depends on these assets. We think this is a very common manufacturing use-case. Additionally, we modelled the dataflow that occurs in the process. This helped us show the dependencies between the process and the applications running in the plant. This model allowed us to show how the data generated from the physical assets was being handled, and how it was being used in the process. Next, a to-be model including the same elements as the as-is model was created to show how the proposed changes would contribute to the goals set forth by SteelCorp. The holistic overview provided by the model, including the process, application landscape, data flow and physical perspective, was of particular interest to them as their analyses so far had been mostly domain-specific.

4.2 Impact Analysis

We performed an impact analysis of the proposed changes. An impact analysis shows how the changing dependencies between processes, applications, data and physical assets affect the flow of the process, data and physical goods across the shop floor. We find it important to note that, in the process of adding processes, applications and physical assets to our model, a lot of (implicit) dependencies are created that, while adding a lot of useful details, could at times make the model hard to use and maintain.

4.3 Performance Analysis

We used our architecture model as the basis for some analyses using shop floor data to predict future performance of the process. Since we believe this is a use-case that is specifically applicable to an industry 4.0 scenario, we will describe it in more detail. To make the best use of our architecture model in this case, we needed to add the structure of our data objects to the model. Since ArchiMate lends itself poorly to this (and is not intended for this use-case in the first place), we chose to model our data structures in UML instead and link them to our ArchiMate model.


As a metric for the performance of the batch annealing process, we chose operational excellence. An operational excellence score is based on three factors: quality, availability and utilization. Each of these three factors is scored based on their conformance to some norm. For example, the goal may be to achieve 1 defect in 100,000,000 products, which would correspond to a 10 (on a scale of 1–10). To determine the overall operational excellence score, the average of the three scores is taken. We illustrate this use case with another example from the batch annealing process. Currently, when steel coils leave the oven, they are placed into inventory to cool down. Operators assume this process takes around a day for every coil. Thus, every coil stays in inventory for a day. This one-day cooling down time will serve as the norm for the availability score of this process step. However, SteelCorp expects that the cooling down time of coils is actually variable, depending on the type of steel used, the thickness of the coil and the temperature at which the oven was running. Thus, by determining the cooling down time of a certain type of coil, it may be possible to improve operational excellence by reducing the time a coil spends in inventory. By moving a coil out more quickly, that coil can be delivered to the customer sooner, improving availability. Furthermore, inventory space is freed up for the next coil, improving utilization. To determine the actual cooling down time of coils, the company wishes to change the current process in several ways. Firstly, an operator will be assigned to manually measure the temperatures of every coil as part of a pilot phase. During this phase, temperature data will be collected to establish a baseline cooling down time for a certain type of coil. At a later stage, this measurement process will be automated, allowing operators to determine in real time when each individual coil is cooled down. The factor that determines whether a coil can be moved out of inventory is its current temperature. In the current process, there is no way to tell the temperature of a coil. In the to-be process, however, data will be collected which can be used to tell whether a coil is sufficiently cooled down. The simplified to-be architecture is shown in Fig. 9.

Fig. 9. Operational excellence example architecture.

The details of the to-be process and the new data services are shown in Fig. 10. The elements highlighted in red will be impacted by the change. There are a couple of major changes made to the application landscape. Most notably, the Excel spreadsheet is replaced entirely by the PPS. The PPS now stores all production planning information. The data warehouse is now also directly linked to the production process, since oven temperature data is now stored for temperature curve analysis. An undetermined database solution will be used to store cooling down temperature data.

Fig. 10. Impact analysis on the to-be architecture.

Prominent entities in the cooling down process include the coil and a business object describing the temperature of the coil. The coil temperature is realized by the steel coil, as well as a data object that stores the information. The structure of the temperature data, while currently not yet determined, might be as shown in Fig. 11.

Fig. 11. Operational excellence example data structure.

Using the temperature data, a cooling down curve can be derived (see Fig. 12). This example uses mocked-up historical data to illustrate how this might work. The example assumes a sample of the current temperature will be recorded every hour. Temperature may also be monitored in real time for more accuracy. The temperature curve shows how a coil cools down over a period of time. The coil enters the warehouse at 12:35:27. At that time, the coil temperature is 1400 °C. Over time, the coil cools down at a linear rate of 55 °C per hour. The target temperature of 180 °C is reached at approximately 10:45:00. This means the coil has had a cooling down period of around 22 h and 10 min, which is one hour and fifty minutes less than the normal duration of 24 h. Using the actual cooling down times of coils, certain patterns may be revealed regarding the cooling down time of a type of coil. This way, rather than reactively moving a coil out of the warehouse as soon as it is cool enough, planning can take into account the expected cooling down time of a coil.


Fig. 12. Example cooling down curve of a steel coil.
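The arithmetic behind the curve can be reproduced directly from the figures quoted in the text (1400 °C start, 180 °C target, 55 °C per hour, 24 h norm); the snippet below is only a re-computation of those numbers.

```python
# Re-computation of the cooling-down figures quoted in the text.
start_temp, target_temp = 1400.0, 180.0   # degrees C
cooling_rate = 55.0                       # degrees C per hour
norm_hours = 24.0                         # the assumed one-day cooling norm

actual_hours = (start_temp - target_temp) / cooling_rate  # ~22.18 h (about 22 h 10 min)
saved_hours = norm_hours - actual_hours                   # ~1 h 50 min
inefficiency = saved_hours / norm_hours                   # ~0.076 -> the 7.6% in the text

# Reactive use: in the worst case one hourly measurement is wasted, so at most
# 50 minutes of the saving is lost (~3.4 percentage points are still gained).
reactive_gain = (saved_hours - 1.0) / norm_hours          # ~0.034
print(round(inefficiency * 100, 1), round(reactive_gain * 100, 1))  # 7.6 3.4
```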

Score calculation. Taking the above temperature curve as the expected temperature curve for this type of coil indicates that the current process is inefficient in terms of both availability and utilization by 7.6%. Assuming every other aspect of the process to be perfect, the operational excellence score calculation for the as-is situation is as shown in Table 5.

Table 5. Operational excellence score (no data use).
Factor | Score
Quality | 10
Availability | 9.24
Utilization | 9.24
Operational excellence score | 8.53

In the to-be situation, availability and utilization will be improved. The magnitude of improvement will depend on how the data is used. If the temperature data is used reactively, notifying operators that a coil is ready based on the hourly measurement, the actual gains will be somewhat variable based on what time a coil has entered the warehouse. At most, the time wasted will be an hour. Based on the most pessimistic scenario, this means an improvement to availability and utilization of 3.4%, leading to the scores shown in Table 6.

Table 6. Operational excellence score (reactive data use).
Factor | Score
Quality | 10
Availability | 9.58
Utilization | 9.58
Operational excellence score | 9.18


Finally, if the temperature data is used to its fullest, by using the data to compute average cooling down times based on a coil's cooling down temperature curve, the amount of time wasted can be minimized, leading to a perfect operational excellence score (of course, assuming every other task is performed fully efficiently as well), as shown in Table 7.

Table 7. Operational excellence score (temperature curve use).
Factor | Score
Quality | 10
Availability | 10
Utilization | 10
Operational excellence score | 10

4.4 Effectiveness of Patterns

The proposed modelling patterns proved useful in several instances. Patterns based on existing ArchiMate concepts were enough to model most of the case. However, some aspects of the case could not be modelled and required the use of modelling patterns that make use of new elements. For example, the current utilization of an oven (and the discrepancy between the perceived utilization of an oven and its actual utilization) could not be modelled. This required a realization relation to a business object per the pattern introduced in Sect. 3.4 (Table 4). Another example of this is the temperature data related to a steel coil that is monitored at regular intervals during the process.

5 Related Work

Urbaczewski and Mrdalj [12] reviewed the EA frameworks available at that time. They identified DoDAF as the only framework that allows for the modelling of physical elements. In another literature review, Hermann et al. [5] identify CPS as a major principle behind smart manufacturing/industry 4.0. Furthermore, Sacala and Moisescu [10] argue that modelling a CPS as part of an overall enterprise systems landscape requires a physical entity, an association with a business entity and an application with interfaces to both the business and the physical entity. Finally, in 2016 The Open Group released ArchiMate 3.0 [11], which introduced (among other things) several modelling concepts to describe physical elements and how these elements relate to applications and business entities. The current research draws upon all of the above and relates ArchiMate to ISA-95, exploring its current modelling capabilities for smart manufacturing.

6 Future Work

Since publishing the first version of this paper, we have had several more opportunities to apply architecture models in the smart manufacturing/industry 4.0 domain. We have observed that, while many manufacturers use models to design their products and formalize their processes, most are lacking a holistic architectural overview of their plants. SteelCorp remains quite representative of a typical manufacturing environment in this sense. At the same time, there is a great need for insights derived from data gathered by the manufacturers. Our current and future work remains focussed on using architecture models based on languages like ArchiMate and UML to provide not just the IT department, but also the other stakeholders in manufacturing, with the insights they require to make the industry 4.0 transition. In this light, we have observed several areas with potential for additional contributions. Firstly, the models that manufacturers are currently using are often not based on ArchiMate; in fact, they are often not based on any formal modelling language. In practice, this means that much manual labor is required to interpret existing documentation and transform it into a structured language like ArchiMate. This creates a high barrier to entry for manufacturers. Further, since existing documentation will often have been composed independently of other, similar documents, it can be difficult to integrate each of these sources into a coherent model. One problem is that authors will often use a level of abstraction in their descriptions that fits the use-case they were addressing with their document. However, when attempting to create a cohesive model, this source material will often either leave unfilled gaps or contain too many details as compared to the rest of the model. Furthermore, the use of terms and definitions is often inconsistent. To eliminate such inconsistencies, we need a way to discern different levels of abstraction in a model. When a model is at a single abstraction level, it may be possible to do pattern matching in the model and thereby eliminate many of the inconsistencies we are currently experiencing. Finally, we continue to look into using the model to define interesting perspectives on the data that is gathered from the shop floor. Based on the dependencies in an architecture model, it may, for example, be possible to derive dimensions that are of interest to a particular stakeholder working on a certain part of the production process.

7 Conclusions

With the introduction of smart manufacturing (or Industry 4.0), IT and operations technology increasingly intertwine. For large manufacturers, this means increasing digitization of the shop floor and, consequently, the need to control the information flowing from the physical domain and to manage changes from a multidisciplinary (IT and OT) perspective. This is where EA helps, but existing EA frameworks and languages were not designed with this type of requirement in mind. This research provides EA modelling support for smart manufacturing companies. Based on the ISA-95 standard for the integration of enterprise systems and control systems in the manufacturing industry [6, 7], this research has presented an analysis of ArchiMate 3.0 [11] in terms of its coverage of the manufacturing domain. The results of the analysis lead to the following answers to the research questions formulated in the introduction:


RQ1: Since ISA-95 was written at a different abstraction level than ArchiMate, not all of its concepts may be of an architectural nature. To determine which concepts are architectural, the ISA-95 concepts were normalized using the criteria used to determine which concepts should be part of the ArchiMate language [8]. The normalization revealed that only 66% of ISA-95 concepts qualify as such. Given the set of architectural concepts identified, a mapping was made of each architectural ISA-95 concept to ArchiMate 3.0. To be able to express the EA of any smart manufacturer, ArchiMate should be able to express each architectural ISA-95 concept. The mapping analysis revealed that, while 12% of concepts can be mapped one-to-one, construct overload or deficit [14] occurs in 75% of cases. Solving these issues requires the use of modelling patterns based on either indirect relations or on new constructs.

RQ2: When a concept from the manufacturing domain cannot be mapped to ArchiMate, this will invariably cause issues when attempting to model the architecture of a manufacturing enterprise. Thus, this second question asks for a solution to the mapping difficulties uncovered as part of the mapping analysis. For each identified issue, a pattern has been proposed that resolves the problem by using some combination of ArchiMate concepts to express the intended meaning of the ISA-95 concept, and/or by introducing some new constructs if ArchiMate's meta-model does not have sufficient expressive power. The following concepts are introduced:

1. Concept Parameter and Relationship Parameter. These concepts describe information about a concept (e.g. a steel coil) or a relation (e.g. an item on a bill of materials), respectively.
2. An aggregation relation between Material and Business Object is proposed to enable the modelling of an explicit bill of materials.
3. A realization relation between Business Object and Business Actor, Business Role, Material, Equipment and Facility will allow for both the current physical and informational state of a physical object to be modelled.

The proposed modelling patterns enhance ArchiMate 3.0's coverage of ISA-95 architectural concepts from 12% to 92%, and were validated as part of a case study. They proved useful in modelling part of the production process at a steel manufacturer. The models could also effectively be used to perform two common analysis use-cases: impact analysis and operational excellence analysis. Note that the proposed modelling patterns are applicable to ArchiMate only. Furthermore, the patterns should be further validated by testing them in more cases, also covering discrete and continuous processes, since the SteelCorp case concerned a batch process.

References

1. Boucharas, V., van Steenbergen, M., Jansen, S., Brinkkemper, S.: The contribution of enterprise architecture to the achievement of organizational goals: a review of the evidence. In: Proper, E., Lankhorst, M.M., Schönherr, M., Barjis, J., Overbeek, S. (eds.) TEAR 2010. LNBIP, vol. 70, pp. 1–15. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-16819-2_1


2. van Buuren, R., Jonkers, H., Iacob, M.-E., Strating, P.: Composition of relations in enterprise architecture models. In: Ehrig, H., Engels, G., Parisi-Presicce, F., Rozenberg, G. (eds.) ICGT 2004. LNCS, vol. 3256, pp. 39–53. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30203-2_5
3. Davis, J., Edgar, T., Graybill, R., Korambath, P., Schott, B., Swink, D., Wang, J., Wetzel, J.: Smart manufacturing. Annu. Rev. Chem. Biomol. Eng. 6, 141–160 (2015)
4. Henderson, J.C., Venkatraman, H.: Strategic alignment: leveraging information technology for transforming organizations. IBM Syst. J. 31(1), 472–484 (1993)
5. Hermann, M., Pentek, T., Otto, B.: Design principles for Industrie 4.0 scenarios: a literature review. Technische Univ. Dortmund, No. 01, pp. 4–16 (2015)
6. International Society of Automation: Enterprise-Control System Integration. ANSI/ISA standard 95.00.01-2010 (2010a)
7. International Society of Automation: Enterprise-Control System Integration. ANSI/ISA standard 95.00.02-2010 (2010b)
8. Lankhorst, M.M., Proper, H.A., Jonkers, H.: The anatomy of the ArchiMate language. Int. J. Inf. Syst. Model. Des. 1(1), 1–32 (2010)
9. Lee, E.A.: Cyber physical systems: design challenges. In: Proceedings of the 2008 11th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing (ISORC), pp. 363–369 (2008)
10. Sacala, I.S., Moisescu, M.A.: The development of enterprise systems based on cyber-physical systems principles. Rom. Stat. Rev. 4, 29–39 (2014)
11. The Open Group: ArchiMate 3.0 Specification (2016)
12. Urbaczewski, L., Mrdalj, S.: A comparison of enterprise architecture frameworks. Issues Inf. Syst. 7(2), 18–23 (2006)
13. Wagner, H.-T., Weitzel, T.: Operational IT business alignment as the missing link from IT strategy to firm success. In: 12th Americas Conference on Information Systems (AMCIS 2006), AISeL, pp. 570–578 (2006)
14. Wand, Y., Weber, R.: Research commentary: information systems and conceptual modelling – a research agenda. Inf. Syst. Res. 13(4), 363–376 (2002)

A Model Driven Systems Development Approach for NOMIS – From Human Observable Actions to Code

José Cordeiro

Setúbal School of Technology, Polytechnic Institute of Setúbal, Campus do IPS, Setúbal, Portugal
[email protected]

Abstract. NOMIS is a human centred information systems modelling approach based on human observable actions. It models a business domain using a number of views relating human actions and interactions, context for actions and information. These models are represented by a set of tables and diagrams using the NOMIS graphical notation and are formalized with a metamodel. The NOMIS metamodel and graphical notation are a first step towards automating the implementation of computer applications. In this paper, we propose to develop NOMIS applications using a Model Driven System Development approach. This suggested approach will formally define NOMIS models and notation and, using model transformations, will derive a code structure to be used by a computerized information system. Additionally, other components of a specific application middleware will be created, including a database schema for business data. It is expected that this approach will be flexible enough to cope with frequent requirement changes.

Keywords: Information systems · Information systems modelling · Human-centred information systems · NOMIS · NOMIS Vision · NOMIS models · NOMIS modelling notation · NOMIS metamodel · Model Driven System Engineering · Metamodelling · Model transformations · Domain specific languages

1 Introduction

NOMIS [1] is a human centred information systems modelling approach based on human observable actions that aims to provide the necessary objectivity and precision for information systems design and development. To achieve these goals, NOMIS relies on a new philosophical stance: Human Relativism [2]. According to Human Relativism, reality is subjective, dependent on the observer, but there is an observable part that can be seen as objective. Therefore, the observability concept, used in NOMIS, is a starting point to model an information system precisely. NOMIS models a business domain using different views relating human actions and interactions, context for actions and information, applying some ideas and perspectives from the Theory of Organized Activity [3], Enterprise Ontology [4] and Organisational Semiotics [5]. An additional view – the information view – is added in NOMIS to model separately information and derived data requirements. For representing these views, NOMIS proposes a set of diagrams and a graphical notation. Despite these modelling formalisms, there is no formal use of NOMIS models that permits going from model to code when developing an information system. For example, in [6] an e-learning prototype is modelled and developed according to NOMIS; however, a model-based strategy is used instead of a model-driven one. Therefore, in this paper, we propose the development of NOMIS applications using a Model Driven System Development (MDSD) approach. We suggest creating a domain specific language to represent NOMIS models based on the NOMIS elements metamodel, for the abstract syntax, and the NOMIS notation, for the concrete syntax. We also suggest using different transformations producing: (1) a code structure to be used by a computerized information system, (2) a schema for a database supporting business data and (3) a rule based system to store business rules. This paper is organized as follows: Sect. 2 gives a brief overview of NOMIS, Sect. 3 presents a brief overview of MDSD, Sect. 4 introduces and describes the details of a NOMIS MDSD based solution, Sect. 5 refers to related work and Sect. 6 concludes and points out some future research directions.

2 The NOMIS Modelling Approach

2.1 NOMIS Philosophical Foundations

NOMIS – NOrmative Modelling of Information Systems – is a human centred modelling approach to information systems development [1]. Recognizing the difficulty of defining Information Systems requirements precisely, NOMIS proposes a solution based on the observability concept: “what we observe is more consensual, precise and, therefore more appropriate to be used by scientific methods” [2]. In NOMIS, an Information System is a human activity (social) system which may or may not involve the use of computer systems. In these systems, what is observable are “human physical actions” and the “physical things” they involve; what is not observable are: (1) human mental actions such as decisions, intentions, judgements and goals that are not externalized and (2) conceptual or informational features of physical things, such as a category of things, a name, a price or a qualitative aspect of a specific good. Observability is a key concept in Human Relativism [2], the philosophical stance on which NOMIS is based. With a focus on using observable elements, NOMIS models information systems by means of human observable actions, including both material and language actions. Closely connected to human actions is information. In fact, information is used as an input, an auxiliary element, an output and even as a target element of human actions. Like human actions, information also has observable and non-observable parts: data is an observable material support for information, whereas information itself is considered, in NOMIS, immaterial, as “a meaning extracted by humans from data”. NOMIS understands information as the result of an interpretation process applied after perceiving the observed reality. Information is only available from data after being interpreted by a human; therefore, there is no information without a human interpreter.


Fig. 1. NOMIS Vision – its views and foundational theories [1]

2.2 NOMIS Vision

Based on human observable actions, NOMIS proposes a vision of an information system composed of four views – Interaction View, Physical View, State View and Information View – addressing, respectively, human interactions, action processes, context for actions, and information. NOMIS views form a coherent and consistent vision of an information system from a human observable action perspective. Considering the unpredictable nature of human actions, NOMIS adds Norms to its vision as human behaviour regulators in order to achieve some predictability. Norms address and regulate groups of human actions; in this case, behavioural norms are used. Behavioural norms are represented analytically in a semi-formal way as defined in Organisational Semiotics [7]:

IF condition THEN agent ADOPTS attitude TOWARD something

where agent is always a human actor and attitude is related to human behaviour. Expected (human) behaviour is derived from systems of norms or information fields as they are defined in Organisational Semiotics [7]. Within an information field, people tend to behave in a certain, expected and controlled way. Information Fields and Norms are the glue connecting human actions and information. The NOMIS Vision, including its four views, the theories behind each view and their broader scope theories, plus norms and information fields, is shown in Fig. 1.
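One possible machine-readable rendering of such a behavioural norm keeps the four slots of the analytical form explicit. The encoding below is our own sketch, not part of NOMIS, and the example norm is hypothetical.

```python
# Sketch: a behavioural norm in the analytical form
# "IF condition THEN agent ADOPTS attitude TOWARD something".
from dataclasses import dataclass

@dataclass
class BehaviouralNorm:
    condition: str
    agent: str      # always a human actor
    attitude: str   # related to human behaviour
    toward: str

norm = BehaviouralNorm(
    condition="a steel coil reaches its target temperature",
    agent="warehouse operator",
    attitude="is obliged to move",
    toward="the coil out of inventory",
)
```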

2.3 NOMIS Representation

NOMIS uses tables and diagrams to represent NOMIS Views according to NOMIS Vision. A complete list of diagrams and tables used in NOMIS is presented and briefly described in Table 1. The elements represented in NOMIS diagrams correspond to key concepts in NOMIS, namely:

• Human Actions
• Actors – human performers
• Bodies – physical things
• Information Items – without physical support
• Language Actions (or Coordination-acts)
• Environmental States

NOMIS elements and their relationships are formalized in a metamodel (shown in Fig. 2). In this metamodel, besides the key elements described before, we find some simple composite elements, such as Activity, CompositeInformationItem, CompositeBody and CompositeActor, to group similar elements, and a complex composite element, the EnvironmentalState, grouping different key elements. There are also some specializations of elements, such as Interactions and LanguageActions for actions, Context for bodies, and BodyStates for states (see [1] for a complete description). To represent NOMIS elements, a modelling notation is also provided [6].

Table 1. NOMIS modelling artefacts
Diagram | Content | Used in… | Observations
HID Human Interaction Diagram | Actors and their interactions | Interaction view | A kind of construction model as used in Enterprise Ontology [4]
ASD Action Sequence Diagram | Sequences of actions | Interaction view and physical view | A kind of UML activity diagram
BSD Body State Diagram | The different states of a body | State view | A kind of UML state diagram
EDD Existential Dependencies Diagram | Environmental States and their existential dependencies | State view | A kind of Ontology Charts as in Organisational Semiotics [5]
ESD Environmental State Diagram | Details each environmental state showing its elements | State view | A kind of an Ontology Chart as used in Organisational Semiotics
AVD Action View Diagram | Show all the elements related to a single action | Physical view | Shows information items, bodies, and actors related to an action
ABD Action Body Diagram | Sequences of actions related to body state changes | Physical view | A kind of Diplan as used in the theory of organized activity [3]
ICD Information Connection Diagram | Information items related to bodies and actors | Information view | Shows information interpreters and information supporting bodies
Table | Content | Used in… | Observations
HAT Human Action Table | Human actions, actors, bodies and information items | Interaction view and physical view | Collects human actions, actors, bodies and information items
HADT Human Actions Dependency Table | Human actions, their dependencies on bodies, information and context | State view | Collects dependencies for human actions
IIT Information Items Table | Information items | Information view | Collects details of information items

Fig. 2. NOMIS elements metamodel [1]

3 Model Driven Systems Engineering

Model Driven System Engineering (MDSE) is a software engineering approach that aims to derive software systems from models. Its core concepts are models and transformations. For specifying these models, MDSE uses modelling languages. At a higher abstraction level, metamodels are used to model models themselves, and meta-metamodels to model metamodels. Usually, meta-metamodels are the highest abstraction level. Metamodels are useful, especially to define new modelling languages; in this case, modelling languages are themselves specified by a model (their metamodel). The other core concept – transformations – is used to transform a model into another model according to a set of transformation rules. Ultimately, a transformation can be used to transform a model into code. Although there are other applications, code generation is perhaps the most important application of MDSE. MDSE has a broader scope than Model Driven System Development (MDSD), which considers only the development process that uses models as its primary artefact. A good introduction to MDSE can be found in [8].

3.1 Modelling Languages

As stated before, modelling languages are used to specify models, and metamodels can be used to specify modelling languages. Modelling languages can be classified as general-purpose modelling languages or domain-specific modelling languages. A well-known example of a general-purpose modelling language is UML, which can be applied in different domains. A domain-specific modelling language, on the other hand, is used to model a particular domain. Examples are HTML markup for creating Web pages or SQL for database access; these are text-based modelling languages. A modelling language is defined by (1) an abstract syntax that describes the language structure, its primitives and the way they are combined, (2) a concrete syntax describing its visual representation and appearance and (3) its semantics, the meaning of its elements.

3.2 Model Transformations

Model transformation is the engine of MDSD. It can be used to transform models into code, to refine or refactor models, to translate models, etc. Usually, model transformations are applied either between two graphical models, known as Model-to-Model (M2M) transformations, or between a graphical model and a text model, known as Model-to-Text (M2T) transformations. M2T transformations are typically used to produce source code from graphical models.
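As a minimal illustration of an M2M transformation (not tied to any particular MDSD tool), the sketch below maps toy source-model elements onto target-model elements using one rule per element kind; all element kinds and names are illustrative.

```python
# Minimal Model-to-Model (M2M) illustration: a rule maps source-model elements
# onto target-model elements. No specific transformation language is implied.
source_model = [
    {"kind": "HumanAction", "name": "Store document"},
    {"kind": "Actor", "name": "Teacher"},
]

def m2m(element):
    """One transformation rule per source element kind."""
    if element["kind"] == "HumanAction":
        return {"kind": "ServiceInterface", "name": element["name"].title().replace(" ", "")}
    if element["kind"] == "Actor":
        return {"kind": "UserRole", "name": element["name"]}
    return element  # default rule: copy the element unchanged

target_model = [m2m(e) for e in source_model]
print(target_model)
# [{'kind': 'ServiceInterface', 'name': 'StoreDocument'}, {'kind': 'UserRole', 'name': 'Teacher'}]
```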

4 A MDSD Approach for NOMIS

NOMIS allows us to model a business domain and gives us guidelines for a computerized support system implementation. Effectively, in [6] an eLearning system is modelled using NOMIS models and implemented according to the NOMIS vision; a system infrastructure for NOMIS applications was also proposed in this work. However, in that eLearning system there is no clear connection between NOMIS models and system implementation: models were used just for reference and implementation validation. Therefore, in this section we propose a Model Driven System Development approach to derive part of the system implementation from NOMIS models.

4.1 NOMIS Domain Specific Language

As mentioned before, with NOMIS a business domain is modelled using a set of tables and diagrams showing the key elements of NOMIS together with their relationships. What is shown in NOMIS diagrams is a representation of its vision and of the way NOMIS sees the information system reality. This representation of reality is the NOMIS language, which is formalized by a metamodel (see Fig. 2) and a notation [6]. These formal elements are also the elements required for creating a Domain Specific Language (DSL). A DSL requires, for its construction, an abstract syntax and a concrete syntax. The abstract syntax describes the language structure and the combination of its primitives, independently of any particular representation, while the concrete syntax describes a specific representation covering its visual appearance [8]. It is therefore natural to use the NOMIS metamodel for the abstract syntax of this new DSL and to adapt the NOMIS notation for its concrete syntax. Besides creating a new DSL, it is also necessary to provide the transformations needed for deriving part of the system implementation. Having a DSL for NOMIS together with the necessary model transformations is the basis of a MDSD approach for NOMIS.
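To give a rough impression of what an abstract syntax captures, the following plain-Java sketch approximates a few NOMIS metamodel concepts. The attributes and relationships shown are illustrative assumptions only; the authoritative abstract syntax is the NOMIS metamodel itself (Fig. 2), which in the intended tooling would be formalized as an Ecore model rather than as hand-written classes:

import java.util.ArrayList;
import java.util.List;

// Plain-Java approximation of a few NOMIS metamodel concepts, intended only
// to illustrate what an abstract syntax captures (elements and how they may
// be combined), independently of any visual notation.
public class NomisAbstractSyntaxSketch {

    static class Body { String name; }                      // a physical element
    static class Actor extends Body { String role; }        // a human actor, possibly in a role (state)
    static class InformationItem { String concept; }        // information tied to a business concept

    static class HumanAction {                              // the central NOMIS element
        String name;
        Actor performer;
        List<Body> bodiesInvolved = new ArrayList<>();
        List<InformationItem> informationUsed = new ArrayList<>();
    }

    static class Activity {                                 // a composite of human actions
        String name;
        List<HumanAction> actions = new ArrayList<>();
    }

    public static void main(String[] args) {
        Actor member = new Actor();
        member.name = "Ana"; member.role = "library member";
        HumanAction lend = new HumanAction();
        lend.name = "lend a book"; lend.performer = member;
        Activity lending = new Activity();
        lending.name = "Book lending"; lending.actions.add(lend);
        System.out.println(lending.name + ": " + lending.actions.size() + " action(s)");
    }
}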

4.2 Transformation 1: Deriving System Services

The central element of NOMIS is the human observable action. In a computerized information system, these actions correspond to functionalities to be provided by the system to a user. For example, if a user wants to "store a document", the system will provide this functionality, usually through a user interface element. The system acts as a tool enabling and supporting certain human actions. In general, NOMIS human actions can be implemented as services by a system using a kind of Service Oriented Architecture (SOA). This can be done by a model transformation in which actions in models are transformed into service software interfaces to be used in the implementation of the corresponding functionalities. "Store a document" is thus transformed into a software function that is part of a software interface. The proper interface in which to deploy this function or service depends on the context. In NOMIS, contexts are usually related to states, so it remains a research problem to find an appropriate grouping of functions according to NOMIS states such that the grouping does not become a rigid solution when changes are needed. Interactions between two actors are also seen in NOMIS as a specific type of human action, involving a request action and a delivery action. In this case, they can also be modelled as system services: a request service from a first actor is delivered as a receive service by a second actor.
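A hypothetical sketch of what Transformation 1 could emit for the "store a document" example is shown below; the interface and operation names are assumptions made for illustration only:

// Illustrative output of Transformation 1 for the "store a document" human
// action: the action becomes an operation of a generated service interface.
// Names and parameters are hypothetical; how operations are grouped into
// interfaces (e.g. by NOMIS state/context) is the open design question
// mentioned above.
public interface DocumentHandlingService {
    void storeDocument(String documentId, byte[] content);
}

// An interaction between two actors (request action + delivery action) can
// likewise be mapped to a pair of services, one on each side.
interface RequestService {
    void requestDocument(String documentId, String requestingActor);
}
interface DeliveryService {
    void deliverDocument(String documentId, String receivingActor);
}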

4.3 Transformation 2: Deriving the Interface System

Human actors do not interact directly with a computer system; this is done through a peripheral device, such as a mouse, a keyboard, a computer monitor or another type of input/output device. These human interactions can be modelled and implemented as part of a specific system component – the Interface System – that is separated from the rest of the system. This component would access the system services in response to


any required system functionality. For example, the user action "store a document" can be triggered by pressing an interface button on a monitor screen or by a gesture using an appropriate haptic device. The interface system component is then responsible for calling the appropriate system service in response to this action. This solution makes it possible to design and program the interface separately from the rest of the system: designers can design interfaces using different screen widgets, or design other kinds of interaction elements using different physical means, while programmers create the required connections to system services in the interface system component.

Fig. 3. The interface system component

Some human actions may not need corresponding system services. For example, in an application forum component, a user usually wants to check received messages. A system service could retrieve these messages, but the interface system component could be responsible for implementing user-requested actions such as sorting them, freeing the main system from this interface task. Separating the interface system from the system services provides flexibility in using or adapting different designs and interfaces with the same services, possibly in line with user preferences. A diagram of the interface system component is shown in Fig. 3. In this diagram, user or human actions received and treated within the Interface System Component are forwarded as system services to the NOMIS Middleware or, alternatively, treated as technical actions by a separate technical application middleware. Besides the separation between interface functionalities and system services, an interface system can also make use of activity elements from NOMIS models. An activity in NOMIS corresponds to a composite of human actions, so some activities can be mapped or transformed into an application window or a browser page containing all the components necessary to trigger the human actions present in that activity.
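The following Java sketch illustrates, under assumed names, how an Interface System component could forward some user actions to a system service while handling purely presentational actions locally:

import java.util.Comparator;
import java.util.List;

// Illustrative Interface System component (names hypothetical): user events
// are either forwarded to a NOMIS system service or handled locally when no
// business functionality is involved (e.g. sorting already-retrieved messages).
public class ForumWindow {

    // Assumed service interface exposed by the NOMIS middleware.
    interface MessageService {
        List<String> retrieveMessages(String userId);
    }

    private final MessageService messageService;
    private List<String> cachedMessages = List.of();

    public ForumWindow(MessageService messageService) {
        this.messageService = messageService;
    }

    // Triggered by a button press, gesture, etc.: delegates to a system service.
    public void onCheckMessages(String userId) {
        cachedMessages = messageService.retrieveMessages(userId);
    }

    // Pure interface-side behaviour: no system service call needed.
    public List<String> onSortMessages() {
        return cachedMessages.stream().sorted(Comparator.naturalOrder()).toList();
    }
}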

4.4 Transformation 3: Persisting Business Data

Information is a central element of NOMIS that is represented by an information item element in NOMIS models. Each information item is associated with a business domain


concept and is detailed within an Information Items Table. In [6], the content of information items is stored as records in relational database tables. This is part of a NOMIS Middleware responsible for storing all business-related information. That work proposes a flexible database schema allowing business terms and concepts to evolve and change without being attached to a specific structure. Application data, on the other hand, is kept separate from business data, outside the NOMIS Middleware. Another model transformation is applied to information items and used to persist the business domain data of a NOMIS application; this transformation will generate, or reuse, the provided database structure. When modelling a business domain using NOMIS, information items are not the only elements that need to be persisted: information about activities, actions, states and bodies also needs to be stored, and the corresponding transformations must again be provided. A research problem, in this case, is how to track and store changes as well, as they may have an impact on the business.
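One possible flexible layout for persisting information items is an attribute-value style schema, sketched below in Java with an in-memory database assumed purely for illustration; the actual schema proposed in [6] may differ:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// One possible flexible layout for persisting information items (an
// attribute-value style schema): business concepts can gain or lose
// attributes without changing the table structure.
public class InformationItemStore {

    private static final String DDL = """
        CREATE TABLE IF NOT EXISTS information_item (
            item_id     BIGINT,
            concept     VARCHAR(100),  -- business domain concept, e.g. 'Book'
            attribute   VARCHAR(100),  -- e.g. 'title', 'author', 'state'
            item_value  VARCHAR(500),
            PRIMARY KEY (item_id, attribute)
        )""";

    public static void main(String[] args) throws Exception {
        // In-memory H2 database used only for this sketch (assumed dependency).
        try (Connection c = DriverManager.getConnection("jdbc:h2:mem:nomis");
             Statement s = c.createStatement()) {
            s.execute(DDL);
            s.execute("INSERT INTO information_item VALUES (1, 'Book', 'title', 'Organized Activity')");
            s.execute("INSERT INTO information_item VALUES (1, 'Book', 'state', 'for lend')");
        }
    }
}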

4.5 Transformation 4: Controlling System Services with State Machines

States and Environmental States in NOMIS are used as context or conditions for actions. A simple state can be applied to a single body (a physical element in NOMIS) or to a human actor; in the case of a human actor, a state is understood as a role. Different states, or roles, give access to different types of actions. For example, a book in a library can be in a state "for lend" or "not for lend"; lending is only available for books in the "for lend" state. "Lending a book" is also only available to library members – a human actor in the "library member" role. Environmental States (ES) are a more complex form of state, modelled as a composite of NOMIS elements – actions, bodies, actors and information items – where actors and bodies may exhibit a particular state. Environmental States are extremely important elements in NOMIS: they constitute the information system anchors that give the system its desirable stability. Using the same example, in a library information system a "membership" ES is a required condition for lending a book (see [9]). This ES is composed of an actor in the "library member" state, plus his/her membership data, a "paid membership fee" condition and the book. This ES, with all its elements, is required for a "lend a book" action in this library information system. Within an information system, Environmental States depend on information about their elements; this information should be stored or obtained by the information system. For our MDSD solution, NOMIS ES and states need to be translated into system features using model transformations. This can be done by relating those states to business information stored in the NOMIS Middleware. A kind of state machine controlling the availability of system services can be derived from ES, where the information, stored or obtained, acts as the required input to each state.
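A minimal sketch of such a derived availability check, using the library example and illustrative type names, could look as follows:

// Minimal sketch of deriving a service-availability check from NOMIS states
// and the "membership" Environmental State of the library example. The type
// and field names are illustrative; a generated state machine would be
// driven by state information stored in the NOMIS middleware.
public class LendBookGuard {

    enum BookState { FOR_LEND, NOT_FOR_LEND }

    record Member(String name, boolean membershipFeePaid) {}
    record Book(String title, BookState state) {}

    // The "lend a book" system service is only made available when the
    // required states (the Environmental State) hold.
    static boolean lendBookAvailable(Member actor, Book book) {
        return actor != null                       // actor in the "library member" role
                && actor.membershipFeePaid()       // "paid membership fee" condition
                && book.state() == BookState.FOR_LEND;
    }

    public static void main(String[] args) {
        Book book = new Book("Organized Activity", BookState.FOR_LEND);
        Member member = new Member("Ana", true);
        System.out.println(lendBookAvailable(member, book));  // true
    }
}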

4.6 Transformation 5: Creating a Normbase

Fig. 4. NOMIS application system

In NOMIS, action sequences are regulated by (behavioural) norms. Norms are composed of a condition (an Environmental State), an agent (a human actor) and an attitude towards something (a human action). Sequences of actions can be defined using norms. Contrary to business rules, norms are not mandatory, as they depend on human behaviour: they express expected behaviour, and a human performer may decide not to follow a norm. Consequently, sequences of actions defined by norms cannot be directly translated into business processes, as they should not be hard-wired into the system. It is nevertheless possible to derive an implementation of a sequence of actions regulated by norms as a sequence of system services, or a service orchestration, as long as this sequence is allowed to be broken by a human performer. A possible solution, mentioned in [5], is a rule-based system to store norms, similar to a Normbase [10]. Norms are not shown directly in NOMIS models; instead, they can be attached as notes to some NOMIS elements. Norms are written in text and therefore require an M2T transformation to derive a corresponding rule to be stored in the Normbase.
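As a rough illustration (with hypothetical names and a deliberately simple rule check), a Normbase entry derived from such a transformation could be represented as follows:

import java.util.List;

// Illustrative Normbase entry: a behavioural norm combines a condition
// (an Environmental State), an agent and an expected action. Norms guide
// rather than force behaviour, so the check below only reports whether the
// expected action was taken; it does not block other actions.
public class NormbaseSketch {

    record Norm(String environmentalState, String agentRole, String expectedAction) {}

    static boolean followed(Norm norm, String agentRole, String performedAction) {
        return norm.agentRole().equals(agentRole)
                && norm.expectedAction().equals(performedAction);
    }

    public static void main(String[] args) {
        // Hypothetical norm derived (via an M2T transformation) from a note
        // attached to a NOMIS model element.
        List<Norm> normbase = List.of(
                new Norm("membership", "library member", "return book within 30 days"));
        System.out.println(followed(normbase.get(0), "library member", "return book within 30 days"));
    }
}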

4.7 NOMIS Information Systems

Following our proposed approach, the resulting information systems will have a clear separation between the technical and the business domain. A diagram of such a system is shown in Fig. 4. In this figure, the available functionalities, triggered by human actions, are exposed by the application middleware, which includes the Interface System Component. This application middleware does not deal with any directly business-related information; this is the task of the NOMIS Middleware, accessible through system services. In this system, application data is kept separate from business data. The connection between both domains is derived from NOMIS models through the definition of (human action) services, the schema of a business information database (the middleware database) and a special database for NOMIS norms (the Normbase).

4.8 Summary of the NOMIS MDSD Approach

NOMIS models provide a comprehensive, coherent and consistent view of an information system from a business domain perspective. By using an MDSD approach it will be possible to derive business software elements to be used by a technical system.

Table 2. Summary of the MDSD approach for NOMIS

Transformations (NOMIS element -> MDSD approach):
Metamodel -> DSL abstract syntax
Notation -> DSL concrete syntax
Human observable actions -> System services
Activities and actions -> Interface System
Information Items -> Database Schema
States and Environmental States -> State Machine
Norms -> Normbase

Implementing a MDSD approach for NOMIS will start by creating a new Domain Specific Language (DSL) for expressing NOMIS models. The abstract syntax of this DSL will use a formalization of the NOMIS metamodel, while its concrete syntax will be based on the NOMIS notation. Next, model transformations will be applied to create part of a computerized information system. Table 2 summarizes the different transformations to be created for this approach. With this approach, the part of the system related to the technical domain still needs to be modelled and developed separately, but the boundary between the technical domain and the business domain will be clear.

5 Related Work

NOMIS has not been formalized before as proposed in this work; hence, there is no known directly comparable previous work. However, some elements and diagrams in NOMIS have similar concepts in the underlying theories on which NOMIS is based. Indeed, except for the Information View, the NOMIS views are inspired by the theories of Enterprise Ontology (EO) [4], Organisational Semiotics (OS) [5] and the Theory of Organized Activity (TOA) [3]. Some diagrams in NOMIS are adaptations, improvements or extensions of the diagrams used in these theories. Therefore, related work can be found in the research literature on these theories, as discussed in the following sections.

5.1 Diplans in TOA

The Theory of Organized Activity (TOA), which is related to the Physical View in NOMIS, uses a diagrammatic language – Diplans [11] – to show human actions, bodies, states and their relationships. It is a language similar to Petri nets but applied to a business environment. In NOMIS, a similar representation is possible with ABD diagrams. In [12] there is a proposal to formalize Diplans with UML profiles. Unfortunately, UML was found unsuitable for this task due to several extension issues, such as UML metaclasses with underlying features that did not match the proposed extension classes, limited relationship types, and limited UML element combinations defined in its metamodel. These difficulties led us to look for an


alternative, namely a new DSL for expressing Diplans. To the best of the author's knowledge, there is no other related research work on Diplans.

5.2 Ontology Charts in OS

Organisational Semiotics (OS) is behind the NOMIS State View, where some diagrams inspired by OS Ontology Charts are used to show states and their existential dependencies. This is the case of the EDD and ESD diagrams used in the State View. In [12], Ontology Charts are also modelled with UML profiles; as with Diplans, similar adaptation problems were found. [13] proposes some heuristic rules for deriving class diagrams from Ontology Charts. That work gives only some hints on how to obtain (and translate) OC elements into UML elements; the UML elements used are limited to classes and to association, composition and generalization relationships between them. [15] proposes the generation of a prototype system from Ontology Charts. The solution uses a database structure to store information from the elements in the OC; from the NOMIS point of view, this solution is not consistent with the proposed theoretical framework. [14] later suggested the generation of UML 2 use cases from Ontology Charts, mapping agents to actors and communication acts to use cases. This transformation is not suitable for NOMIS either, as it does not cover the required level of detail. [16] made an extensive review of the OS literature from 2011 to 2015, covering conferences, journals and book chapters, with 91 publications found. We could not find any related research in those publications.

5.3 Aspect Models in EO

Enterprise Ontology (EO) uses aspect models to model a business system. These aspect models use a set of diagrams, textual rules and tables for modelling purposes. Of the EO diagrams, NOMIS only has equivalents for Actor Transaction Diagrams (ATD) and Process Structure Diagrams (PSD), namely the HID and ASD diagrams used in the Interaction View. In [17] there is also a proposal for a UML profile for ATD, PSD and Actor Bank Diagrams. Again, there were issues in extending UML with specific profiles for these diagrams, which excludes UML profiles as a suitable solution. EO is the most studied and researched of the foundational theories of NOMIS. We therefore mention only some relevant research related to this work that could inspire a MDSD approach. First, [18] suggests the transformation of EO metamodels into an XML schema; this could be an interesting transformation to use in a MDSD approach for NOMIS. [19] provides a complete MDSD-based approach for EO, using a SOA architecture with a process engine to execute EO models, and provides an overall view of such an approach. [20] describes an EO processor that fully automates EO development, also using a MDSD approach.


6 Conclusions and Future Work

This paper proposes and presents a MDSD approach for implementing NOMIS-based computerized information systems. The proposed solution advocates the creation of a new DSL for representing NOMIS models and establishes guidelines for the model transformations to be developed. As a result of these transformations, part of the implementation code, together with a persistence system for business information and business norms, will be created. Using this approach, the technical and the business parts will be modelled separately, but the connection points between them will be established and derived from NOMIS models. As future work, we intend to create a DSL for this proposal using the Eclipse Modelling Framework [21] and the Ecore metamodel. This DSL will include a concrete syntax created using the Graphical Modelling Framework, model validation and model persistence. A prototype of a simple application will be used to validate the approach.

References

1. Cordeiro, J.: Normative approach to information systems modelling. Ph.D. thesis, The University of Reading, UK (2011)
2. Cordeiro, J., Filipe, J., Liu, K.: Towards a human oriented approach to information systems development. In: Proceedings of the 3rd International Workshop on Enterprise Systems and Technology, Sofia, Bulgaria (2009)
3. Holt, A.: Organized Activity and Its Support by Computer. Kluwer Academic Publishers, Dordrecht (1997)
4. Dietz, J.: Enterprise Ontology, Theory and Methodology. Springer, Heidelberg (2006). https://doi.org/10.1007/3-540-33149-2
5. Liu, K.: Semiotics in Information Systems Engineering. Cambridge University Press, Cambridge (2000)
6. Cordeiro, J.: Applying NOMIS - modelling information systems using a human centred approach. In: Shishkov, B. (ed.) BMSD 2016. LNBIP, vol. 275, pp. 27–45. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57222-2_2
7. Stamper, R.: Signs, norms, and information systems. In: Holmqvist, B., et al. (eds.) Signs of Work. Walter de Gruyter, Berlin (1996)
8. Brambilla, M., Cabot, J., Wimmer, M.: Model-Driven Software Engineering in Practice, 2nd edn. Morgan & Claypool, San Rafael (2017). ISBN 978-1627057080
9. Cordeiro, J.A.M.: A new way of modelling information systems and business processes - the NOMIS approach. In: Shishkov, B. (ed.) Business Modeling and Software Design, pp. 102–118. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-319-20052-1_6. ISBN 978-3-319-20051-4
10. Stamper, R., Liu, K., Klarenberg, P., Van Slooten, F., Ades, Y., Van Slooten, C.: From database to normbase. Int. J. Inf. Manag. 11, 67–84 (1991)
11. Holt, A.: Diplans: a new language for the study and implementation of coordination. ACM Trans. Inf. Syst. (TOIS) 6(2), 109–125 (1988)
12. Cordeiro, J., Liu, K.: UML 2 profiles for ontology charts and diplans - issues on meta-modelling. In: Proceedings of the 2nd International Workshop on Enterprise Modelling and Information Systems Architectures, St. Goar, Germany (2007)


13. Bonacin, R., Baranauskas, M., Liu, K.: From ontology charts to class diagrams - semantic analysis aiding systems design. In: Proceedings of the 6th International Conference on Enterprise Information Systems, Porto, Portugal, vol. 1, pp. 389–395 (2004)
14. Tsaramirsis, G., Yamin, M.: Generation of UML2 use cases from MEASUR's ontology charts: a MDA approach. In: Lano, K., Zandu, R., Maroukian, K. (eds.) Model-Driven Business Process Engineering, pp. 67–76. Bentham Science Publishers Ltd., Shariqah, United Arab Emirates (2014). ISBN 978-1-60805-893-8
15. Tsaramirsis, G., Poernomo, I.: Prototype generation from ontology charts. In: Fifth International Conference on Information Technology: New Generations, pp. 1177–1178. Las Vegas (2008)
16. de Souza Santos, M.C., da Silva Magalhães Bertãozini, B., Neris, V.: Studies in organisational semiotics: a systematic literature review. In: Baranauskas, M.C.C., Liu, K., Sun, L., Neris, V., Bonacin, R., Nakata, K. (eds.) ICISO 2016. IAICT, vol. 477, pp. 13–24. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-42102-5_2
17. Cordeiro, J., Liu, K.: A UML profile for enterprise ontology. In: Proceedings of the 2nd International Workshop on Enterprise Systems and Technology, Enschede, The Netherlands (2008)
18. Wang, Y., Albani, A., Barjis, J.: Transformation of DEMO metamodel into XML schema. In: Albani, A., Dietz, J.L.G., Verelst, J. (eds.) EEWC 2011. LNBIP, vol. 79, pp. 46–60. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21058-7_4
19. den Haan, J.: An enterprise ontology based approach to model-driven engineering. Master's thesis, Delft University of Technology (2009)
20. van Kervel, S., Dietz, J., Hintzen, J., van Meeuwen, T., Zijlstra, B.: Enterprise ontology driven software engineering. In: Hammoudi, S., van Sinderen, M., Cordeiro, J. (eds.) ICSOFT, pp. 205–210. SciTePress (2012). ISBN 978-989-8565-19-8
21. Steinberg, D., Budinsky, F., Paternostro, M., Merks, E.: Eclipse Modeling Framework, 2nd edn. Addison-Wesley Professional, Boston (2008). ISBN 978-0321331885

Value Switch for a Digital World: The BPM-D® Application

Mathias Kirchmer1,2, Peter Franz1, and Rakesh Gusain1

1 BPM-D, West Chester, USA
{Mathias.Kirchmer,Peter.Franz,Rakesh.Gusain}@bpm-d.com
2 University of Pennsylvania, Philadelphia, USA

Abstract. Business Process Management (BPM) has become a management discipline that translates strategy into people- and technology-based execution – fast and at minimal risk. It helps organizations realize the full potential of their digitalization and enterprise transformation initiatives. BPM is implemented through the "process of process management" (PoPM). To assure continuous improvement of the PoPM, an appropriate digitalization approach for the PoPM itself is essential. However, little work has been done in this field, and companies are failing to recognise the importance of an integrated digitalization of the PoPM. This paper presents a successful approach for the digitalization of the PoPM to enable a powerful BPM-Discipline. It includes experiences from a first pilot implementation of the developed prototype, the BPM-D Application.

Keywords: BPM · Business process management · Digitalization · Process of process management · Execution · Strategy · Process design · Strategy execution · Business processes · Digital technology · BPM-discipline · Implementation · Enterprise architecture · Process modelling · Value-driven BPM

1 Improving Business Process Management

Business Process Management (BPM) is increasingly seen as a management discipline that has significant impact on an organization (von Rosing et al. 2015). It provides significant value by transforming strategy into people- and technology-based execution – at pace with certainty (Franz and Kirchmer 2012). BPM plays a key role in realizing the full potential of digitalization initiatives (Kirchmer and Franz 2016; Kirchmer 2017) and enterprise transformations. The discipline of process management enables ongoing strategy execution and digitalization in our volatile business environment. The BPM-Discipline is itself implemented through a process of its own, the process of process management (PoPM). The increasing importance of the BPM-Discipline for the success of an organization requires an appropriate performance improvement of the PoPM. The first progress in this area has been made through the identification and design of the PoPM (Kirchmer 2015). In order to achieve the next performance level, we apply digitalization systematically to the PoPM itself – leveraging the relevant process management approaches, methods and tools.


2 BPM for Strategy Execution and Digitalization

In a recent research study, the Gartner Group explained that only 13% of organizations reach their yearly strategic goals (Cantara 2015). This situation can get even worse as more and more businesses start their digitalization journey, increasing the requirements and pace of change. According to the same study, only 1% of businesses have their processes sufficiently under control to realize the full potential of digitalization. So the gap between the expectations of digitalization and reality grows even further. This is where the BPM-Discipline helps: it closes the gap between expectation and reality.

2.1 Discipline of Strategy Execution

BPM has become the management discipline that enables effective strategy execution across the organization (Swenson and von Rosing 2015). It operationalizes strategy so that it can be executed through the appropriate combination of people and technology, fast and at minimal risk (Franz and Kirchmer 2012). This is visualized in the BPM-D® Framework shown in Fig. 1. This patent-pending framework summarizes key aspects of a comprehensive definition of BPM and operationalizes them through an appropriate management of the process lifecycle, from design and implementation to execution and control of the process.

Fig. 1. BPM-D® value framework

It is possible to leverage the BPM-Discipline for enterprise-wide strategy execution mainly because of the transparency it creates, as well as its organization-wide, customer- and outcome-oriented approach. The discipline of BPM enables cross-departmental initiatives to achieve values such as quality and efficiency, agility and compliance, integration into enterprise networks and internal alignment, as well as innovation and conservation of existing practices (Kirchmer 2015). The typical values that the discipline of BPM delivers are shown in the BPM-D Value-Framework in Fig. 2.


Fig. 2. The BPM-D value-framework

The transparency also enables the alignment between business and information technology, which becomes more and more important in a digital world. The listed values, or a sub-set of them, are systematically combined through the BPM-Discipline to make strategy happen. The supporting methods and tools enable an efficient and effective approach to strategy execution.

2.2 Value-Switch for Digitalization

A rapidly increasing number of organizations are making digitalization a part of their strategy. Digitalization is defined as the integration of physical products, people and processes through the internet of things (IoT) and related information technology (IT) (McDonald 2012; Scheer 2015). This definition is visualized in Fig. 3.

Fig. 3. Definition of digitalization

Businesses normally have a solid management discipline around the products they produce or buy, e.g. as equipment; examples are product or asset management disciplines. They normally also have a good discipline around their people and their information technology. However, in many cases the discipline around their business processes is missing (Cantara 2015). The BPM-Discipline closes this gap. It uses the opportunities of digitalization to create new or improved business processes which realize the strategy of the organization.


BPM provides the answers to the main issues businesses struggle with in their digitalization initiatives. Figure 4 shows key challenges organizations encounter – all of them addressed through the BPM-Discipline.

Fig. 4. Key challenges of digitalization initiatives

2.3 The Process of Process Management

The BPM-Discipline is implemented just as any other management discipline: through appropriate business processes. We refer to the processes realizing the BPM-Discipline as the "process of process management" (PoPM) (Franz and Kirchmer 2012). The PoPM consists of project-related sub-processes, focused on improving the organization and realizing the targeted value, and asset-related sub-processes, enabling efficient and effective improvements. In both groups we can distinguish planning-related and realization-related sub-processes. A definition of the PoPM is described in the BPM-D Process Framework, represented in Fig. 5 (Kirchmer 2015; Kirchmer 2017).

Fig. 5. The BPM-D process framework

To support the implementation and continuous improvement of this PoPM we have described this business process from all relevant views (Scheer 1998): organization, functions, data, deliverables and control view (Kirchmer 2015). In over 50 business transformation and improvement initiatives we have proven that this PoPM definition


delivers significant value – adjusted to and applied in the specific business context (Kirchmer 2016) – and that it is sufficiently complete and consistent. The PoPM helps to focus on what really matters, improves or transforms processes in the specific context of an organization and sustains those improvements. The high importance of the BPM-Discipline for strategy execution and digitalization requires and justifies an even more accelerated improvement of the PoPM and of its application to specific organizations. This can be achieved by digitalizing the PoPM itself, as illustrated in Fig. 6. Especially the "focus" and "sustain" effects of the PoPM are often underestimated and underdeveloped in traditional companies, so the BPM-Discipline helps here to move existing practices to the next level of performance. It becomes the key means that helps the "Chief Process Officer" (Kirchmer and Franz 2014a) guide the journey of ongoing strategy execution and digitalization.

Fig. 6. Digitalization of the process of process management

3 Objectives of the Digitalization of the Process of Process Management

There is a large number of digital tools supporting the PoPM, such as process modelling and repository tools, process automation and workflow engines, robotic process automation, blockchain, or process analytics and mining tools (Scheer 2017). Most of them target the execution or design of processes or some other small component of the PoPM, and these digital enablers are often not, or only loosely, integrated. In order to get the best possible results, the digitalization of the PoPM needs to be more comprehensive. We have identified three core objectives:

• Focus on what matters most
• Don't re-invent the wheel - integrate
• Make process management fun.

These objectives are realized here on the basis of the BPM-D Process Framework as an example; they can be applied in the same way to other PoPM reference models and frameworks.

3.1 Focus on What Matters Most

An analysis of the different sub-processes of the BPM-D Process Framework, based on over 200 process initiatives, has shown that there are nine areas which are currently not well covered by digital tools. The operationalization of a company strategy through an appropriate process strategy is one important area that is not well supported. An organization only competes with about 15–20% of its processes (Franz and Kirchmer 2012); all the others are commodity processes that do not really impact the competitive positioning as long as they are performed at least at an industry-average level. It is key for an organization to know its high impact processes, align its process management capabilities with those and define a BPM agenda or roadmap consistent with these findings (Kirchmer and Franz 2014b). The systematic support of the development of such a value-driven process strategy is crucial for a successful BPM-Discipline and must be adjusted with every major change of strategy or market. We have not identified any existing focused digital tools supporting this part of the PoPM; hence, it should be part of a new, more holistic digitalization approach. While the management of improvement projects is normally well captured through project management systems, the value realization after the project and the related process and data governance are not sufficiently covered. This is another area where an enhancement of digital support can lead to significant improvements of the PoPM. In practice, the whole "people dimension" of process management is also not given adequate digital support in many BPM approaches. In most process transformation and improvement approaches the challenges are less on the technology side and more on the people side (Spanyi 2003). Since only some processes can be fully automated, people and their skills are often the bottleneck. While good progress has been made with digitally enabled change management and education approaches (Scheer 2017; Ewenstein et al. 2015), such as the use of eLearning or various communication tools, the active management of process communities and their integration with change management is still not sufficiently covered. Hence, this is another area for an improved digitalization of the PoPM. Figure 7 shows all the focus areas for an advanced digitalization of the PoPM.

Fig. 7. Functional focus of PoPM digitalization


In addition, there is a lack of integration between the different existing digital tools. Hence, a next-generation digitalization of the PoPM needs to address this and deliver the right degree of integration, enabling the best possible performance of the overall PoPM.

Don’t Re-invent the Wheel - Integrate

This clearly defined functional focus of the PoPM digitalization initiative also prepares for the second objective: existing digital process management tools and applications need to be re-used and integrated into the new digital BPM environment. This saves time and cost, which is key in our fast-changing business environment. In addition, it makes adoption easier for organizations that have often already made significant investments in existing process management tools. An important aspect is to re-use data available in other applications. The knowledge about processes stored in a process repository as part of an enterprise architecture application, for example, is excellent master data for other digital tools. This data can be used to identify high impact processes, support the value realization of a process improvement or guide the management of process communities. The new PoPM digitalization needs to be complementary to existing tools and provide an integration environment that optimizes the overall support of the PoPM – as efficiently as possible.

3.3 Make Process Management Fun

The acceptance of a PoPM with a significantly higher degree of digitalization again depends on the people who have to use it. To motivate them and make the PoPM part of a positive process-oriented culture, it is important that the new digital components are fun for the users to deal with. This requires a simple and pleasant user interface. It needs to make people feel familiar and comfortable by copying behaviours from existing, widely used applications. On the other hand, it must also bring innovations to the table that make it interesting to migrate, for example from the use of a spreadsheet, to the new PoPM application. The integration of gamification, self-learning and data analytics components is another way to get people excited and make dealing with the new set of tools fun while improving PoPM performance. This is especially important when it comes to community management and application functionality that is used on a daily basis. For the use of the tool to be fun, its administration also has to be efficient; hence, a cloud-based approach is required. The cloud has become a main driver of digitalization (Abolhassan 2016), and the PoPM digitalization initiative is no exception.

4 Approach to the Digitalization of the Process of Process Management

In line with these objectives, work has progressed on the design and implementation of an integrated BPM-D Application that aims to properly support and digitalize the Process of Process Management (PoPM). The approach, initial implementation and early pilots demonstrate considerable progress towards the defined objectives.

4.1 Design of the BPM-D Application

To effectively digitalize the PoPM and achieve the defined objectives, appropriate application software must be developed. We call it the BPM-D Application. In order to meet the objectives, the following functional requirements have been identified, using a design thinking approach (Nixon 2013):

1. Digitally manage strategy execution:
• Centrally document the business strategy and the processes impacted by it
• Translate strategy into executable, value-driven work packages using the required process and BPM capabilities
• Define, manage, track and improve the maturity level of BPM capability in an organization
• Track and continuously manage the business process impact of projects
• Define, track and manage role-based controls, metrics and measurable outcomes of past project activities.

2. Apply analytics to a process and its execution:
• Analyze the maturity of the PoPM and of operational processes, and visualize the results in dashboards
• View, analyze and manage selected process knowledge
• Leverage process knowledge to support various use case scenarios, for example the enforcement of process standards.

3. Enable gamification-based collaboration of the BPM community:
• Set up and manage the required process and data governance
• Enable and encourage collaboration across the BPM community
• Support focused training of the BPM community.

4. Integration with existing technologies:
• Track and manage a portfolio of business process-related technologies
• Identify integration scenarios and solutions.

4.2 Implementation of the BPM-D Application

The BPM-D Application is an intuitive tool that is being developed in an agile approach to meet these requirements (Sims and Johnson 2014). It is a web-based platform delivering the defined objectives; hence, it becomes an enabler of ongoing strategy execution and digitalization for the next-generation enterprise. The BPM-D Application provides the functionality in the key priority areas identified above in Fig. 7 and then integrates, where appropriate, with a series of other tools that currently digitalize other functional areas of the PoPM. The integration of the application with other modules is enabled by the prevalence of XML as a standard for data communication. Hence, the BPM-D Application supports, from the first prototype on, focused integration with existing tools, enhancing the value that those tools deliver and avoiding re-inventing existing digital solutions.


The application consists of a set of modules, as shown in Fig. 8. In a possible subsequent commercialization phase of this prototype, those modules could be licensed separately.

Fig. 8. BPM-D application modules

The key development tenets of the BPM-D Application are:

• Cloud-based for easy access
• Mobility enabled (access through mobile phones and tablets)
• Intuitive user interface
• Open source architecture to facilitate ongoing development and improvement
• Service-based architecture enabling integration, and a layered modular architecture that supports plug-and-play approaches for agile implementation.

The Application modules are based on the BPM-D framework, which segments into six main sections:

• BPM-D Strategy
• BPM-D Assets
• BPM-D Project Execution
• BPM-D People Enablement
• BPM-D Management
• BPM-D Technology Enablement.

The overall architecture of the BPM-D Application is shown in Fig. 9. This is a high-level view, stressing the importance of the integration into an existing PoPM-related software environment. The basis of the BPM-D Application is the effective management of process knowledge. The definition of business processes in the form of process models is typically well supported through modelling and repository tools; the BPM-D Application therefore focuses on contextual and management information about the processes, as shown in the Process Master module in Fig. 10. This has been developed using the comprehensive BPM-D Data Framework (Kirchmer 2015), which describes the data view of the PoPM.


Fig. 9. BPM-D application architecture

Fig. 10. BPM-D process master module

Initial reviews of the Application with seasoned stakeholders identified the need to view much of the data in two ways: a more engineering-oriented view, as in Fig. 10, used to maintain the information in the most effective manner, and a second, more executive-oriented view, required to communicate the information in a "value chain" format, as shown in Fig. 11.

Fig. 11. BPM-D process master module (value chain view)

On the basis of this master data, the BPM-D Application systematically fills the PoPM gaps identified earlier. The starting point is the connection of the business strategy to the process hierarchy using the Value-Driver Tree and the Process Impact Assessment (Kirchmer 2015).


The easy-to-use value-tree creation page is shown in Fig. 12. It creates the fundamental link between the organization's business strategy and a series of quantifiable value drivers, as a basis for establishing the process agenda (Franz and Kirchmer 2012).

Fig. 12. BPM-D application value tree creation
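Conceptually, a value-driver tree can be represented as a simple tree data structure linking strategy, value drivers and impacted processes; the following Java sketch only illustrates the concept and is not the BPM-D Application's actual data model:

import java.util.ArrayList;
import java.util.List;

// Illustrative value-driver tree: the strategy at the root, quantifiable
// value drivers as nodes, and impacted processes attached to the nodes.
public class ValueDriverTree {

    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        final List<String> impactedProcesses = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    public static void main(String[] args) {
        Node strategy = new Node("Grow market share");        // hypothetical strategy
        Node driver = new Node("Reduce order lead time");     // hypothetical value driver
        driver.impactedProcesses.add("Order-to-Cash");
        strategy.children.add(driver);
        System.out.println(strategy.name + " -> " + driver.name
                + " -> " + driver.impactedProcesses);
    }
}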

These value drivers are then connected to the processes through the Process Impact Assessment. This intuitive interface helps to gather information relevant to process impact and maturity in a very collaborative manner and then identifies the high impact and low maturity processes. Based on the ever-changing strategy, these priorities will also change. The BPM-D Application offers the process professional the ability to react to these changes in an agile manner, being well informed about the possible impacts of a strategy change. The tabular engineering view of the Process Impact Assessment is shown in Fig. 13.

Fig. 13. BPM-D application process impact assessment (table view)

For the high impact processes, the application then enables the collation of current process capability through surveys to identify the current and targeted level of maturity for each process. This is then available for review and amendment by process owners using interactive sliders as shown in Fig. 14.


Fig. 14. BPM-D application process maturity assessment

For interaction with business stakeholders, this process impact and maturity information is then summarized onto the process value chain (Fig. 11), identifying the high impact processes (by the colour of the process) and the maturity of each process, indicated by the green bar at the base of each process, as shown in Fig. 15. It is these high impact, low maturity processes that require priority intervention in the context of the chosen strategy (Kirchmer and Franz 2014a, b).

Fig. 15. BPM-D application process impact and maturity assessment (value chain view) (Color figure online)
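One simple way such a ranking of high impact, low maturity processes could be computed is sketched below; the scoring formula is illustrative only, since the BPM-D Application's actual scheme is not described in this paper:

import java.util.Comparator;
import java.util.List;

// Illustrative prioritization of processes by strategic impact and current
// maturity: high impact combined with low maturity ranks first.
public class ProcessPrioritization {

    record ProcessAssessment(String process, double impact, double maturity) {}  // scores 0..5

    static double priority(ProcessAssessment p) {
        return p.impact() * (5.0 - p.maturity());   // larger value = more urgent
    }

    public static void main(String[] args) {
        List<ProcessAssessment> assessed = List.of(
                new ProcessAssessment("Order-to-Cash", 4.5, 1.5),
                new ProcessAssessment("Travel Expenses", 1.0, 2.0));
        assessed.stream()
                .sorted(Comparator.comparingDouble(ProcessPrioritization::priority).reversed())
                .forEach(p -> System.out.printf("%s: %.1f%n", p.process(), priority(p)));
    }
}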

Another key component of the BPM-D Application is the Process Governance module. Identifying process performance gaps is only useful if it is clear who has responsibility and accountability for taking any process improvement action. Process governance is multi-dimensional, as it needs to reflect three key organizational realities:

• Functional responsibility: Which processes can I touch?
• Organizational responsibility: What actions am I entitled to execute?
• Process management responsibility: What BPM role do I play?

These realities need to be applied to all modules of the BPM-D Application to enable effective support of the PoPM; this is a pre-condition for a holistic, integrated digitalization approach. Translating the identified process performance gaps (high impact, low maturity processes) into improvement actions is achieved through the definition of work packages in the Process Agenda module. Here, the responsible process owner can


review the work packages that are already in progress and check how well they address the performance gaps. As shown in Fig. 16, a graphical interface assists in identifying how many work packages are currently in progress in support of each process, clearly showing where there are misalignments in the focus of interventions. Where a number of work packages in progress impact lower priority processes, these can be assessed and possibly stopped. High impact, low maturity processes with no active work packages indicate the need to initiate new action, and where there are a number of overlapping work packages, these can be assessed for consolidation opportunities.

Fig. 16. BPM-D application work package analysis.

Again, this can be overlaid as additional detail on the more intuitive value-chain view, as shown in Fig. 17. This is a very powerful tool for communicating priorities with management, showing a clear line of sight to the business strategy. The application caters for a drill-down into lower levels of detail in the areas of focus.

Fig. 17. BPM-D application work package impact assessment (value chain view)

Once the portfolio of current and potential work packages is identified and documented, these are then analyzed in terms of effort and value (as measured against the strategy) and displayed on the prioritization matrix shown in Fig. 18. Here, simple drag-and-drop functionality enables the allocation of work packages into phases.


Fig. 18. BPM-D application work package portfolio analysis

In early discussions, a number of organizations that are evaluating the use of the BPM-D Application consistently mention that this approach has numerous benefits in better focusing and aligning the portfolio of improvement initiatives in an organization. It also provides the capability to much better identify and manage the value realization of initiatives. Each work package is assessed in terms of its impact on delivering process improvements; then, through the process impact assessment, KPIs can be identified. The impact that work packages have on these KPIs can be quantified into a much more representative business case. This also provides the basis for an effective value realization approach after improvement projects have concluded. This support of value realization has also already been implemented for selected scenarios, such as the management of process standardization to meet compliance requirements. This functionality is discussed in the context of a pilot client scenario below.

5 Experiences with the First Pilot

The beta version of the BPM-D Application is being developed with pilot customers, and some features are already live and used by these clients.

5.1 Pilot Client Overview

One of the early adopters of the BPM-D Application is a large shipping company headquartered in Europe with offices globally. They manage over 100 vessels and operate under a very robust regulatory and control environment. Their finance organization is structured in a hybrid way, combining corporate oversight, individuals in each business unit who support their management, and a centralized global business services team executing many of the transactional and reporting tasks. Alignment of processes and the necessary controls across these finance entities is important to ensure that actions are not overlooked and that there is proper segregation of responsibilities. These controls were managed in a very manual way and were thus not as robust as required. The business processes were mapped in a diagramming tool that provided little more than a pictorial representation of the workflow. The controls were highlighted on the workflow, and a combination of a worksheet and email was used to manage the compliance and audit of these controls.


Changes to the processes and the controls were also difficult to implement, as they were kept on a local server and not integrated. The controls team attempted to keep these up to date and then needed to distribute changes through email notifications. The organization required a much more integrated and accessible solution to achieve its controls objectives effectively.

5.2 Leveraging the BPM-D Application

The organization therefore embarked on a program of implementing a cloud-based, fully functional process modelling and repository tool. All of their financial processes were duly converted into this tool and verified through a collaborative online process. This proved to be a great opportunity for them to bring their process models up to date and to ensure that all of the globally dispersed finance team had access to the same process information. However, this provided only half the solution, and they recognized the need for controls and compliance management that was more tightly coupled with these processes. An "enlightened" process professional helped them recognize that this was the first step towards more effectively digitalizing their process of process management, with controls simply being one of many management requirements. They therefore agreed to be one of the pilot adopters of the BPM-D Application. The financial process hierarchy was loaded into the application, with integrated references back into the process repository to the detailed process information. The process models were developed in BPMN 2.0 notation (Fisher 2012) and included references to the required controls. This is shown in Fig. 19.

Fig. 19. Process model in BPMN 2.0 with controls marked in red (Color figure online)

These controls are accessed through the BPM-D Application and managed against the control objectives hierarchy. All of the context information related to the controls and their management responsibility was then managed using the BPM-D Application governance module. The controls could thus be seamlessly managed by the controls administrator, as shown in Fig. 20. The control-related information is then instantly available through the cloud-based environment to finance users globally.


Fig. 20. BPM-D application process controls management in the process governance module

Finance users are then assigned control-related tasks that need to be performed periodically. These tasks are simply added into the BPM-D Application task management module alongside all other process management tasks. The application filters the tasks based on the users' governance profiles and makes it easy for them to display their tasks and capture their actions against these tasks. This is shown in Fig. 21.

Fig. 21. BPM-D application task management
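The following sketch illustrates, with hypothetical field names, how tasks could be filtered against a user's governance profile; it is not the BPM-D Application's actual implementation:

import java.util.List;

// Hypothetical sketch of governance-based task filtering: each control task
// belongs to a process and a BPM role, and a user only sees the tasks that
// match their governance profile.
public class TaskFiltering {

    record Task(String description, String process, String requiredRole) {}
    record GovernanceProfile(String userId, List<String> processes, List<String> roles) {}

    static List<Task> tasksFor(GovernanceProfile profile, List<Task> allTasks) {
        return allTasks.stream()
                .filter(t -> profile.processes().contains(t.process()))
                .filter(t -> profile.roles().contains(t.requiredRole()))
                .toList();
    }

    public static void main(String[] args) {
        var tasks = List.of(
                new Task("Review bank reconciliation control", "Record-to-Report", "Control Owner"),
                new Task("Approve vendor master change", "Purchase-to-Pay", "Process Owner"));
        var profile = new GovernanceProfile("user-1", List.of("Record-to-Report"), List.of("Control Owner"));
        System.out.println(tasksFor(profile, tasks));  // only the matching task is returned
    }
}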

A central control manager then manages their area of responsibility and checks on the progress of the periodic tasks. The BPM-D Application provides a graphical representation of the controls status and the ability to easily identify and act on delayed or outstanding actions, as shown in Fig. 22.

Fig. 22. BPM-D application task status monitoring


While the examples shown here for this pilot project are specific to controls management, the BPM-D Application simply treats control compliance with standards as one of numerous process management tasks, and the same modules cater for effective process management activity across the organization for a range of other PoPM usage scenarios.

5.3 Learnings and Further Development of the BPM-D Application

The integrated and intuitive nature of the BPM-D Application proved to be very popular with the pilot organization's finance users. The compliance activities now require less time to execute and are thus performed more diligently. The fact that the user community works online ensures that they are executing the latest version of the controls, and there is excitement about applying the same approach in other parts of the group. A very exciting by-product of the implementation is that the related process models now more accurately reflect the business operations, and there is an incentive to ensure that they are properly understood and kept current. The ownership of these models has moved from a single, lonely process owner to being much more effectively managed in a collaborative way by the broader stakeholder community. This has made the finance team much more aware of the benefits of value-driven process management. They are looking to extend their capability while simultaneously extending their adoption of the BPM-D Application functionality. In the next steps of the agile development of the BPM-D Application, value realization in different business scenarios will be developed further, including the definition of projects based on the value packages, their execution (managed through external project management systems) and the control of the realization of the defined benefits after project conclusion. With that, the entire path from strategy to execution will be covered. These developments will be combined with the launch of the integrated support for people change management and process-oriented community management. In this way, more and more of the identified PoPM gaps will be closed, while benefits are already being created through existing BPM-D Application components.

6 First Impact is Visible

The first step of the digitalization of the PoPM has confirmed the initial hypothesis that it significantly increases the performance of the process management discipline. The continued development of the BPM-D Application will lead to a more efficient and far more effective approach to establishing a value-driven BPM-Discipline in an organization. The permanent change of our business environment also affects the PoPM; hence, this process changes continuously, and with it the requirements for the BPM-D Application. Therefore, an agile, ongoing development approach is required.


Ongoing research about the change of the PoPM in our digital world needs to deliver the requirements for this ongoing development. This makes the BPM-Discipline the execution engine for strategy execution and business digitalization, delivering fast results at minimal risk.

References

Abolhassan, F. (ed.): The Drivers of Digital Transformation – Why There is No Way Around the Cloud. Springer, New York (2016). https://doi.org/10.1007/978-3-319-31824-0
Cantara, M.: Start up your business process competency center. In: Documentation of The Gartner Business Process Management Summit, National Harbor (2015)
Ewenstein, B., Smith, W., Sologar, A.: Changing change management. McKinsey Digital (2015)
Fisher, L. (ed.): BPMN 2.0 Handbook – Methods, Concepts, Case Studies and Standards in Business Process Modelling Notation (BPMN). Future Strategies, Lighthouse Point (2012)
Franz, P., Kirchmer, M.: Value-Driven Business Process Management: The Value-Switch for Lasting Competitive Advantage, 1st edn. McGraw-Hill, New York (2012)
Kirchmer, M.: High Performance through Business Process Management – Strategy Execution in a Digital World, 3rd edn. Springer, New York (2017). https://doi.org/10.1007/978-3-319-51259-4
Kirchmer, M.: Strategy execution in a consumer goods company: achieving immediate benefits while building lasting process capabilities for the digital world. In: Proceedings of BPM 2016, Rio de Janeiro, 18–22 September 2016
Kirchmer, M., Franz, P., Lotterer, A., Antonucci, Y., Laengle, S.: The value-switch for digitalisation initiatives: business process management. BPM-D Whitepaper, Philadelphia, London (2016)
Kirchmer, M.: The process of process management – mastering the new normal in a digital world. In: BMSD Proceedings, Milano, July 2015
Kirchmer, M., Franz, P.: Chief process officer – the value scout. BPM-D Whitepaper, Philadelphia, London (2014a)
Kirchmer, M., Franz, P.: Targeting value in a digital world. BPM-D Whitepaper, Philadelphia, London (2014b)
McDonald, M.P.: Digital strategy does not equal IT strategy. Harvard Business Review (2012)
Nixon, N.: Viewing ascension health from a design thinking perspective. J. Organ. Des. (2013)
Scheer, A.W.: Performancesteigerung durch Automatisierung von Geschaeftsprozessen. Whitepaper, AWS-Institute fuer Digitale Produkte und Prozesse gGmbH, Saarbruecken, Germany (2017)
Scheer, A.W.: Industry 4.0: from vision to implementation. Whitepaper Number 5, August-Wilhelm Scheer Institute for Digital Products and Processes, Scheer GmbH, Saarbruecken, Germany (2015)
Scheer, A.W.: ARIS – Business Process Frameworks, 2nd edn. Springer, Berlin (1998). https://doi.org/10.1007/978-3-642-58529-6
Sims, C., Johnson, H.L.: Scrum: A Breathtakingly Brief and Agile Introduction (2014)
Spanyi, A.: Business Process Management is a Team Sport – Play it to Win! Anclote Press, Tampa (2003)
Swenson, K.D., von Rosing, M.: What is business process management. In: von Rosing, M., Scheer, A.-W., von Scheel, H. (eds.) The Complete Business Process Handbook – Body of Knowledge from Process Modeling to BPM, vol. 1, pp. 79–88. Amsterdam, Boston (2015)
von Rosing, M., Scheer, A.W., von Scheel, H. (eds.): The Complete Business Process Handbook – Body of Knowledge from Process Modeling to BPM, vol. 1. Amsterdam, Boston (2015)

A Systematic Review of Analytical Management Techniques Applied to Competition Analysis Modeling Towards a Framework for Integrating them with BPM

Dimitrios A. Karras and Rallis C. Papademetriou

Automation Department, Sterea Hellas Institute of Technology, 34400 Psachna, Evoia, Greece
[email protected]
Faculty of Technology, School of Engineering, University of Portsmouth, Anglesea Building, Anglesea Road, Portsmouth PO1 3DJ, UK

Abstract. The understanding of business process modelling is essential for an organization or enterprise to achieve its objectives and improve its operations. Recent developments have shown the importance of representing processes in order to carry out continuous improvement. One important aspect of enterprise modelling is its involvement in competition. The modelling and simulation of business processes can show business analysts and managers where bottlenecks exist in the system, how to optimize the business process to reduce the cost of running the organization, and which resources an organization requires. Although large-scale organizations are already involved in such BPM applications, small and medium enterprises (SMEs) have so far received much less attention in this respect. SMEs need more practical tools for modelling and analysis, at minimum expense if possible. One approach to making BPM more applicable to SMEs, but also to larger-scale organizations, is to properly integrate it with analytical management computational techniques, including game-theoretic analysis, Markov-chain modelling and the cognitive maps methodology. In BPM research the Petri net methodology has already been incorporated in theory, applications and BPM software tools; this is not the case for the previously mentioned and other analytical management techniques. It is therefore important for BPM research to take such techniques into account while focusing on specific modelling requirements. One such requirement is the modelling of market share competition. This paper presents an overview of important analytical management computational techniques, such as the above, that could be integrated in the BPM framework, based on the market share competition analysis paradigm. It provides an overview, along with examples of market share competition analysis, of the applicability of such methods in the BPM field. The major goal of this systematic overview is to propose steps for the integration of such analytical techniques in the BPM framework so that they can be widely applied.

Keywords: Business Process Modelling · Competition analysis of markets · Modelling requirements · Analytical management techniques · Game-theory modelling · Markov-chain modelling · Cognitive maps modelling

© Springer International Publishing AG, part of Springer Nature 2018. B. Shishkov (Ed.): BMSD 2017, LNBIP 309, pp. 166–185, 2018. https://doi.org/10.1007/978-3-319-78428-1_9


1 Introduction

Small and medium-sized enterprises (SMEs) account for more than 90% of the world's enterprises and 50–60% of employment. Their contribution to national and regional economic development and gross domestic product growth is well recognized (Morsing and Perrini 2009). In fact, SMEs are often characterized as fostering enhanced local productive capacities, innovation and entrepreneurship, and increased foreign direct investment in both developed and developing countries (Raynard and Forstater 2002). Hence, while SMEs account for more than 60% of employment in developing countries, and although they are sometimes portrayed as key vehicles in the struggle against poverty (Luetkenhorst 2004), there is still a critical lack of knowledge about the extent to which these firms may contribute to the achievement of broader objectives of sustainable and equitable development (Fox 2005; Jeppesen et al. 2012). In order to understand the possibility of such a contribution, it is important to investigate how SMEs are employing analytical management techniques to better explore their possibilities and systematically optimize their performance in a complex financial world and global market.

The focus on and interest in complex data management, including big data analytics, has increased over recent years in the world of SME firms. Several research reports attempt, through questionnaires, to understand the use of analytical management and planning tools and techniques in SMEs operating in different countries. According to these studies, the most commonly used tools and techniques are strategic planning, human resources analysis, total quality management, customer relationship management, outsourcing, financial analysis for firm owners, vision/mission, PEST, financial analysis for competitors, benchmarking, STEP analysis, Porter's five forces analysis and analysis of critical success factors. According to Gunn and Williams (2007), reporting on research conducted in the UK, SWOT analysis is the most widely applied strategic tool among the organizations surveyed, with benchmarking ranked second in terms of usage by all but manufacturing organizations. However, it is important to perform a meta-analysis of all these and more recent reports on the use of management tools and techniques in SMEs in order to establish clearly, in detailed tables, to what extent each technique is used by SMEs depending on their sector of the economy, their country or continent, and other crucial meta-analysis factors.

Moreover, it is frequently observed that the value of raw data alone has decreased significantly in the recent past. There are two main factors and open issues to consider:

(a) There is an overdose of data, and it is hard for a resource-strapped SME to digest it.
(b) There is an overdose of technology solutions, and it is equally hard for SMEs to understand this landscape and pick the right solution.

Actionable insights from data are what everyone, including SMEs, wants: something with which, on a daily basis, they can uncover new opportunities to grow their business within a complex world, while understanding their true performance completely.


The above two questions have not yet been answered by the research reports on SMEs. They are also highly correlated with the issue of to what extent the different analytical management tools are really used by SMEs in the optimization of their performance. However, for an SME or a larger-scale organization to apply such analytical techniques, and for the research community to answer the above questions, modelling of the business processes involved (BPM) is absolutely necessary in order to establish a common language: a well-defined framework for the application of analytical management techniques. Therefore, even more critical than the meta-analysis previously discussed on the use of data by SMEs and other larger-scale organizations is to review, discuss and provide a framework for the proper integration of BPM methodologies with analytical management techniques that are worthwhile to utilize in SMEs and beyond.

The major goal of the paper is, therefore, to discuss suitable analytical management techniques that could be integrated in the BPM framework and, through examples, to discuss the feasibility of establishing a well-defined framework for the application of these techniques to enterprises. It is an extension of the paper originally presented in (Karras and Papademetriou 2017). In this respect we discuss and give examples of game-theoretic analysis, probabilistic/stochastic methodology, Markov-chain analysis and the cognitive maps methodology in business modelling and analysis, towards assessing the feasibility of a well-defined framework for applying these techniques to SMEs and larger-scale enterprises through the BPM approach.

2 An Overview of Suitable Analytical Management Techniques for Competition Analysis that Could Be Integrated in the BPM Methodology

Most attempts to describe and classify business models in the academic and practice literatures have been taxonomic, that is, developed by abstracting from observations, typically of a single industry. With only a few exceptions, these attempts rarely deal fully and properly with all the dimensions of customers, internal organization and monetization; see, for instance, Rappa (2004) and Wirtz et al. (2010). So far, the literature lacks clear typological classifications that are robust to changing context and time (Hempel 1965). A typology has been proposed that considers four elements (Baden-Fuller et al. 2010, 2013): identifying the customers (the number of separate customer groups); customer engagement (or the customer proposition); monetization; and value chain and linkages (governance, typically concerning the firm internally). In order to define a framework for the application of analytical management techniques through the BPM methodology, such a typology of business process models is important for establishing the ontologies, the conceptual links and the application paradigms. The present systematic review attempts to describe the aforementioned techniques within this context.

2.1 A Markov Chain Business Competition Modelling Analysis

Many real-world systems, including enterprise functions and operations, contain uncertainty and evolve over time. Stochastic processes (and Markov chains) are probability models for such systems. A discrete-time stochastic process is a sequence of random variables X0, X1, X2, …, typically denoted by {Xn}. The state space of a stochastic process is the set of all values that the Xn can take; here we are concerned with stochastic processes with a finite number of states. Time is indexed by n = 0, 1, 2, …, and a state is a v-dimensional vector s = (s1, s2, …, sv). In general there are m states, s1, s2, …, sm (or s0, s1, …, sm−1), and Xn takes one of these m values.

A stochastic process {Xn} is called a Markov chain if

Pr{Xn+1 = j | X0 = k0, …, Xn−1 = kn−1, Xn = i} = Pr{Xn+1 = j | Xn = i}

for every i, j, k0, …, kn−1 and every n; these are the transition probabilities, and discrete time means n ∈ N = {0, 1, 2, …}. The future behaviour of the system depends only on the current state i and not on any of the previous states. If, in addition,

Pr{Xn+1 = j | Xn = i} = Pr{X1 = j | X0 = i} for all n,

the transition probabilities do not change over time and the chain is stationary; normally, stationary Markov chains are considered. The one-step transition matrix for a Markov chain with states S = {0, 1, 2} is

P = [ p00  p01  p02
      p10  p11  p12
      p20  p21  p22 ]

where pij = Pr{X1 = j | X0 = i}. If the state space is S = {0, 1, …, m − 1}, then Σj pij = 1 for all i, and pij ≥ 0 for all i, j.

Potential studies of business modelling based on the Markov chain methodology:
– Predict market shares at specific future points in time when different business strategies are applied.
– Assess rates of change in market shares over time.
– Predict market share equilibria.
– Evaluate the process of introducing new products.

In short, business competition analysis for market share prediction can benefit from the Markov chain approach. A relevant example of Markov chain modelling in the field of SMEs or larger-scale organizations, concerning the number of products, and thus the relevant market share, switching from enterprise to enterprise, is as follows (adapted from https://www.analyticsvidhya.com/blog/2014/07/markov-chain-simplified/):


Let's analyze the competition of three brands producing the same product (devices, e.g. smartphones), examining the number of these products (e.g. the smartphones) switching from brand i in, for instance, week 17 to brand j in week 18:

Smartphone producer brand (i)   (j) 1    2      3    Total
1                                  90     7      3    100
2                                   5   205     40    250
3                                  30    18    102    150
Total                             125   230    145    500

This is called the contingency table of the Markov chain and is used to construct the transition probabilities. Calculation of the empirical transition probabilities for the smartphone brand-switching example, P = (pij), i, j = 1, 2, 3:

Smartphone producer brand (i)   (j) 1           2               3
1                               90/100 = 0.90    7/100 = 0.07    3/100 = 0.03
2                                5/250 = 0.02  205/250 = 0.82   40/250 = 0.16
3                               30/150 = 0.20   18/150 = 0.12  102/150 = 0.68
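As a quick computational check, the empirical transition matrix can be obtained by normalizing each row of the contingency table by its row total. The following minimal Python sketch (using NumPy) reproduces the probabilities above; the variable names are ours and not part of the original example.

```python
import numpy as np

# Week 17 -> week 18 brand-switching counts
# (rows: from brand i, columns: to brand j), taken from the contingency table above.
counts = np.array([[90,   7,   3],
                   [ 5, 205,  40],
                   [30,  18, 102]])

# Empirical transition probabilities: divide each row by its row total.
P = counts / counts.sum(axis=1, keepdims=True)
print(P.round(2))
# [[0.9  0.07 0.03]
#  [0.02 0.82 0.16]
#  [0.2  0.12 0.68]]
```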

The state variable is Xn = brand of device purchased in week n. {Xn} represents a discrete-state, discrete-time stochastic process, where S = {1, 2, 3} and N = {0, 1, 2, …}. If {Xn} has the Markovian property and P is stationary, then a Markov chain should be a reasonable representation of aggregate consumer brand-switching behaviour. We approximate qi(0) by dividing the total number of customers using brand i in week 18 by the total sample size of 500, which we take to be the total market in that week:

q(0) = (qi(0), for i = 1, 2, 3) = (125/500, 230/500, 145/500) = (0.25, 0.46, 0.29)

To predict market shares for, say, week 20 (that is, 2 weeks into the future), we simply apply the prediction equation with n = 2:

q(2) = q(0)P(2)

     = (0.25, 0.46, 0.29) [ 0.90 0.07 0.03 ]²
                          [ 0.02 0.82 0.16 ]
                          [ 0.20 0.12 0.68 ]

     = (0.327, 0.406, 0.267) = the expected market shares of brands 1, 2 and 3 in week 20.


Property 1: Let {Xn : n = 0, 1, …} be a Markov chain with state space S and state-transition matrix P. Then for i and j ∈ S, and n = 1, 2, …,

Pr{Xn = j | X0 = i} = pij(n),

where the right-hand side represents the (i, j)th element of the n-step matrix P(n).

Property 2: Let π = (π1, π2, …, πm) be the m-dimensional row vector of steady-state (unconditional) probabilities for the state space S = {1, …, m}. To find the steady-state probabilities, solve the linear system

π = πP,  Σj=1..m πj = 1,  πj ≥ 0, j = 1, …, m.

For the brand-switching example:

(π1, π2, π3) = (π1, π2, π3) [ 0.90 0.07 0.03 ]
                            [ 0.02 0.82 0.16 ]
                            [ 0.20 0.12 0.68 ]

with π1 + π2 + π3 = 1 and π1, π2, π3 ≥ 0, that is,

π1 = 0.90π1 + 0.02π2 + 0.20π3
π2 = 0.07π1 + 0.82π2 + 0.12π3
π3 = 0.03π1 + 0.16π2 + 0.68π3
π1 + π2 + π3 = 1,  π1, π2, π3 ≥ 0.

With simple substitutions we obtain π1 = 0.474, π2 = 0.321, π3 = 0.205, which are the steady-state values. Recalling that q1(0) = 0.25, q2(0) = 0.46, q3(0) = 0.29, we can see the difference between the steady-state calculation, i.e. the market share equilibrium in the competition between the different brands, and the market share prediction while competition has not yet stabilized.

1. Steady-state predictions of the competition between different brands are never achieved in actuality, due to a combination of (i) errors in estimating P, (ii) changes in P over time, and (iii) changes in the nature of the dependence relationships among the states.
2. Nevertheless, the use of steady-state values is an important diagnostic tool for the decision maker.
3. Steady-state probabilities might not exist unless the Markov chain is ergodic.
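The two-step prediction and the steady-state shares above can be reproduced numerically. The NumPy sketch below is ours and only mirrors the example figures; it is not part of the original paper.

```python
import numpy as np

# One-step transition matrix from the brand-switching example
P = np.array([[0.90, 0.07, 0.03],
              [0.02, 0.82, 0.16],
              [0.20, 0.12, 0.68]])

# Initial market shares q(0), estimated from week 18
q0 = np.array([0.25, 0.46, 0.29])

# Two-step prediction q(2) = q(0) P^2 (week 20)
q2 = q0 @ np.linalg.matrix_power(P, 2)
print(q2.round(3))                      # -> [0.327 0.406 0.267]

# Steady-state shares: solve pi = pi P together with sum(pi) = 1
A = np.vstack([P.T - np.eye(3), np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi.round(3))                      # -> approximately [0.474 0.321 0.205]
```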

2.2 The Game Theoretic Modelling Analysis Applied to Business Competition Analysis

Every game has players (usually two), strategies (usually two, but sometimes more) and payoffs (the payoffs to each player are defined for each possible pair of strategies in a two-person game). There are also rules for each game which will define how much information each player knows about the strategy adopted by the other player, when


this information is known, whether only pure strategies or mixed strategies may be adopted, and so on. Game theory is used to help us think about the strategic interaction between firms in an imperfectly competitive industry. It is particularly helpful for looking at pricing, advertising and investment strategies, and for looking at the decision to enter an industry (and the strategies that can be adopted to deter a firm from entering an industry – entry deterrence), as well as for formulating the outcomes of different strategies of specific business processes.

There is a lot of terminology to absorb when someone is first introduced to game theory. For instance, games can be co-operative or non-co-operative. A co-operative game is one in which the players can form lasting agreements on how to behave. We focus our attention, however, on non-co-operative games, in which such binding agreements are not possible and players are always tempted to cheat on any temporary agreement if they can gain an advantage by cheating. Such games are well suited to modelling different strategies for specific business processes.

Games can be "pure strategy" games or they can allow for "mixed" strategies. Most of the time we will discuss only pure strategy games (for example: if a firm has two strategies for a business process, which are to charge $50 and to charge $100, then a pure strategy game allows for only these two possibilities). However, we could consider some examples of mixed strategies (for example: if the firm has the two pricing strategies described above, it would also have the option of charging $50 thirty percent of the time and charging $100 seventy percent of the time – i.e., a probabilistic move).

Games can be single-period games or many-period games (many-period games are also called repeated-play or multi-period games). A single-period game is played only once, and no one thinks about possible future replays of the game when deciding on the best strategy. However, many of life's strategic decisions (for business firms as well as individuals) require us to think about the payoffs that will occur if a game is played over and over again. Results in a one-period game can be overturned once repeated effects are taken into account.

Games can also be described as simultaneous or sequential. In a simultaneous game, the two players know what their possible strategies are, they know the identity of the other player, and they know the payoffs for both players from any combination of strategies, but neither player knows what move the other player has decided to make. In other words, each player knows the incentives, but not the actual strategy adopted. In a sequential game, on the other hand, one player moves first and the other player moves second; the second player already knows what strategy the first player has adopted when making his or her decision.

What constitutes a dominant strategy? A dominant strategy is one that gives you the best result no matter what the other person chooses to do. For example, consider the following game, adapted from the brand market share example above (note: in all the games discussed here, the payoff of the business process for the first brand, Brand#1, is always listed first):


                                    Brand#2
                                    Business process A    Business process B
Brand#1   Business process Y        (10, 5)               (7, 6)
          Business process Z        (4, 3)                (6, 1)

(Each entry shows the market share increment of Brand#1 followed by the market share increment of Brand#2.)

For Brand#1, Y is a dominant strategy, because Brand#1 always ends up with a higher payoff for the enterprise by choosing this business process. For Brand#2 there is no dominant strategy, because Brand#2 does better by choosing A if #1 chooses Z, but does better by choosing B if #1 chooses Y. A Nash equilibrium occurs when neither party has any incentive to change his or her strategy, given the strategy adopted by the other party. Clearly, the existence of a dominant strategy will result in a Nash equilibrium: in the game above, the enterprise following process 1 always chooses strategy Y, and the enterprise following process 2 then chooses B; (Y, B) is a Nash equilibrium. However, games without any dominant strategies also often have Nash equilibria. A game may have no Nash equilibrium, a single Nash equilibrium, or multiple Nash equilibria.

In order for such a methodology to be applied, it is important to completely define the business processes, the payoffs (which, in the competition analysis, are the increments or decrements of the relevant market shares) and, of course, the players, which in our competition analysis case are simply the different brands. Here the players are different competitive processes within an enterprise, but they could equally be within two different firms. The payoffs could even be the number of customers attracted by the different strategies. Therefore, the applicability of this analytical management technique should be discussed within the BPM framework in order to establish it for wide use within SMEs or larger enterprises.
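The dominance and equilibrium checks for this small bimatrix game can also be carried out mechanically. The following Python sketch encodes the payoff table above (the helper names are ours) and searches for dominant strategies of Brand#1 and for pure-strategy Nash equilibria.

```python
import numpy as np

# Payoff matrices: rows = Brand#1 plays Y or Z, columns = Brand#2 plays A or B.
# Entries are the market-share increments from the table above.
u1 = np.array([[10, 7],
               [ 4, 6]])        # Brand#1's payoffs
u2 = np.array([[ 5, 6],
               [ 3, 1]])        # Brand#2's payoffs
rows, cols = ["Y", "Z"], ["A", "B"]

# A row r is (strictly) dominant for Brand#1 if it beats the other row in every column.
for r in range(2):
    other = 1 - r
    if all(u1[r, c] > u1[other, c] for c in range(2)):
        print(f"Dominant strategy for Brand#1: {rows[r]}")

# Pure-strategy Nash equilibria: neither player gains by deviating unilaterally.
for r in range(2):
    for c in range(2):
        if u1[r, c] >= u1[1 - r, c] and u2[r, c] >= u2[r, 1 - c]:
            print(f"Nash equilibrium: ({rows[r]}, {cols[c]}) "
                  f"with payoffs ({u1[r, c]}, {u2[r, c]})")
# Output: Y is dominant for Brand#1, and (Y, B) is the unique Nash equilibrium.
```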

2.3 The Cognitive Maps Approach in Modelling Analysis

Cognitive maps (Axelrod 1976; Eden 1992) are a collection of nodes linked by arcs or edges. The nodes represent concepts or variables relevant to a given domain. The causal links between these concepts are represented by the edges, which are directed to show the direction of influence. Apart from the direction, the other attribute of an edge is its sign, which can be positive (a promoting effect) or negative (an inhibitory effect). Cognitive maps can thus be pictured as a form of signed directed graph. Figure 1 shows a cognitive map representing a scenario involving, say, competition analysis of seven brands BP1–BP7 of a specific product P (a device, say a smartphone), following a brand-switching analysis such as the one of Sect. 2.1. The construction of a cognitive map requires the involvement of a knowledge engineer and one or more experts in a given problem domain. Methods for constructing a cognitive map for a relatively recent real-world application are discussed in (Tsadiras 2003; Jetter Antonie and Kok 2014).


[Figure 1 is a signed directed graph over the nodes BP1–BP7, with edges labelled + or − to indicate promoting or inhibitory causal influence.]

Fig. 1. Cognitive map concerning causal relations in a generalization of the Brand Switching example of Sect. 2.1.

Basic structure of an FCM:
– Each node Ci in an FCM represents a concept.
– Each arc (Ci, Cj) is directed as well as weighted, and represents the causal link between concepts, showing how concept Ci causes concept Cj.

These structures are shown in the following example of a business causal model.


The adjacency matrix related to this causal business model is

W =                             C1    C2    C3    …
    C1 = Customer satisfaction   0    VL     0    .
    C2 = Product defects         0     0     H    .
    C3 = Sales volumes          VH     0     0    .
    …                            .     .     .    .

where

wij > 0 expresses positive causality,
wij = 0 expresses no causality,
wij < 0 expresses negative causality.

Typical threshold (transfer) functions are

fsign(x) = 1 if x > 0, and 0 if x ≤ 0;

ftri(x) = 1 if x > 0, 0 if x = 0, and −1 if x < 0;

f(x) = tanh(x) = (e^(2x) − 1) / (e^(2x) + 1).

[Figure 2 illustrates how applying the basic update rule with such a transfer function takes the map from its current state to a new state.]

Fig. 2. The evolution of the cognitive map.

Figure 2 shows the evolution of the cognitive map under the basic state-update equation, using transfer functions such as those given above.

FCM Inference Algorithm
Step 1: Define the initial vector A that corresponds to the elements (concepts) identified from experts' suggestions and available knowledge.

Step 2: Multiply the initial vector A with the matrix W defined by the experts.
Step 3: The resultant vector A at time step k is updated using the threshold function f.
Step 4: This new vector is taken as the initial vector in the next iteration.
Step 5: Steps 2–4 are repeated until the change falls below epsilon (where epsilon is a residual describing the minimum error difference among the subsequent concept values).
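Steps 2–5 amount to repeatedly multiplying the state vector by the weight matrix and squashing the result until the map settles. The sketch below is our own Python illustration of that loop; the four-concept weight matrix and the concept labels are assumptions for demonstration, not taken from the paper, and the binary threshold is the fsign function defined earlier.

```python
import numpy as np

def fcm_inference(A0, W, f, max_iter=50, eps=1e-6):
    """Iterate A(k+1) = f(A(k) W) and report a fixed point or a limit cycle."""
    A = np.asarray(A0, dtype=float)
    seen = [A]
    for _ in range(max_iter):
        A = f(A @ W)                          # Steps 2-3: multiply, then threshold
        for k, prev in enumerate(seen):       # Step 5: stop when a state repeats
            if np.max(np.abs(A - prev)) < eps:
                kind = ("fixed point" if k == len(seen) - 1
                        else f"limit cycle of period {len(seen) - k}")
                return A, kind
        seen.append(A)                        # Step 4: feed the new state back in
    return A, "no attractor detected (possibly chaotic)"

f_sign = lambda x: (x > 0).astype(float)      # binary threshold function

# Illustrative 4-concept map (hypothetical weights):
# C1 = marketing campaign, C2 = product defects, C3 = brand awareness, C4 = sales
W = np.array([[0.0, 0.0, 1.0, 1.0],    # campaign promotes awareness and sales
              [0.0, 0.0, 0.0, -1.0],   # defects inhibit sales
              [0.0, 0.0, 0.0, 1.0],    # awareness promotes sales
              [0.0, 0.0, 1.0, 0.0]])   # sales reinforce awareness
state, kind = fcm_inference([1.0, 0.0, 0.0, 0.0], W, f_sign)
print(state, kind)   # -> [0. 0. 1. 1.] fixed point: awareness and sales stay 'on'
```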

The main objective of building a cognitive map around a problem is to be able to predict the outcome by letting the relevant issues interact with one another. These predictions can be used to find out whether a decision made by someone is consistent with the whole collection of stated causal assertions. Such use of a cognitive map is based on the assumption that a person whose belief system is accurately represented in a cognitive map can be expected to make predictions, decisions and explanations that correspond to those generated from the cognitive map. This leads to the significant question: is it possible to measure a person's beliefs accurately enough to build such a cognitive map? The answer, according to Axelrod and his co-researchers, is a positive one. Formal methods for analysing cognitive maps have been proposed and different methods for deriving cognitive maps have been tried in (Axelrod 1976).

In a cognitive map, the effect of a node A on another node B, linked directly or indirectly to it, is determined by the number of negative edges on the path between the two nodes: the effect is positive if the path has an even number of negative edges, and negative otherwise. More than one such path may exist. If the effects of these paths are a mix of positive and negative influences, the map is said to have an imbalance and the net effect of node A on node B is indeterminate. This calls for the assignment of some sort of weight to each inter-node causal link, and a framework for evaluating combined effects using these numerically weighted edges. Fuzzy cognitive maps (FCM) (Caudill 1990; Brubaker 1996a, b) were proposed as an extension of cognitive maps to provide such a framework.

Fuzzy Cognitive Maps
The term fuzzy cognitive map (FCM) was coined in (Kosko 1986) to describe a cognitive map model with two significant characteristics: (1) causal relationships between nodes are fuzzified – instead of only using signs to indicate positive or negative causality, a number is associated with the relationship to express the degree of relationship between two concepts; and (2) the system is dynamic and involves feedback, where the effect of a change in a concept node affects other nodes, which in turn can affect the node initiating the change. The presence of feedback adds a temporal aspect to the operation of the FCM. The FCM structure can be viewed as a recurrent artificial neural network, where concepts are represented by neurons and causal relationships by weighted links or edges connecting the neurons. Using Kosko's conventions, the interconnection strength between two nodes Ci and Cj is eij, taking on any value in the range −1 to 1. The values −1 and 1 represent, respectively, full negative and full positive causality, zero denotes no causal


effects, and all other values correspond to different fuzzy levels of causal effects. In general, an FCM is described by a connection matrix E whose elements are the connection strengths (or weights) eij. The element in the ith row and jth column of matrix E represents the connection strength of the link directed out of node Ci and into Cj. If the links take on only discrete values in the set {−1, 0, 1}, the map is called a simple FCM. The concept values of nodes C1, C2, …, Cn (where n is the number of concepts in the problem domain) together represent the state vector C. An FCM state vector at any point in time gives a snapshot of the events (concepts) in the scenario being modelled. In the example FCM shown in Fig. 2, node C2 relates to the 2nd component of the state vector, and the state [0 1 0 0 0 0 0] indicates that the event "migration into city" has happened. To let the system evolve, the state vector C is passed repeatedly through the FCM connection matrix E. This involves multiplying C by E and then transforming the result as follows:

C(k + 1) = T[C(k) E],

where C(k) is the state vector of concepts at some discrete time k, T is the thresholding or nonlinear transformation function, and E is the FCM connection matrix. With a thresholding transformation function, the FCM reaches one of two states after a number of passes: it settles down to a fixed pattern of node values – the so-called hidden pattern or fixed-point attractor – or it keeps cycling between a number of fixed states, known as a limit cycle. With a continuous transformation function, a third possibility, known as the chaotic attractor (Elert 1999), exists: instead of stabilising, the FCM continues to produce different state vector values in each cycle.

Extensions of FCMs
A number of researchers have developed extended versions of the FCM model described above. Tsadiras (2003) and Jetter Antonie and Kok (2014) describe the extended FCM, in which concepts are augmented with memory capabilities and decay mechanisms. The new activation level of a node depends not only on the sum of the weighted influences of other nodes but also on the current activation of the node itself. A decay factor in the interval [0, 1] causes a fraction of the current activation to be subtracted from itself at each time step. Park (1995) introduces the FTCM (Fuzzy Time Cognitive Map), which allows a time delay before a node xi has an effect on a node xj connected to it through a causal link. The time lags can be expressed in fuzzy relative terms such as "immediate", "normal" and "long" by a domain expert, and these terms can be assigned numerical values such as 1, 2, 3. If the time lag on a causal link eij is m (m ≥ 1) delay units, then m − 1 dummy nodes are introduced between node i and node j.

Decision makers often find it difficult to cope with significant real-world systems (Fig. 3). These systems are usually characterised by a number of concepts or facts interrelated in complex ways. They are often dynamic, i.e., they evolve through a series of interactions among related concepts, and feedback plays a prominent role by propagating causal influences along complicated pathways. Formulating a quantitative


[Figure 3 is the fuzzified version of the map in Fig. 1: the same nodes, here labelled F-BP1–F-BP7, connected by edges carrying fuzzy weights such as +0.1, +0.6, +0.7, +0.8, +0.9, −0.3 and −0.9.]

Fig. 3. Fuzzified version of the cognitive map shown in Fig. 1.

mathematical model for such a system may be difficult or impossible due to lack of numerical data, its unstructured nature, and dependence on imprecise verbal expressions. FCMs provide a formal tool for representing and analysing such systems with the goal of aiding decision making. Given an FCM’s edge matrix and an input stimulus in the form of a state vector, each of the three possible outcomes mentioned above can provide an answer to a causal “what if” question. The inference mechanism of FCMs works as follows. The node activation values representing different concepts in a problem domain are set based on the current state. The FCM nodes are then allowed to interact (implemented through the repeated matrix multiplication mentioned above). This interaction continues until: (1) The FCM stabilises to a fixed state (the fixed-point attractor), in which some of the concepts are ‘on’ and others are not. (2) A limit cycle is reached. (3) The FCM moves into a chaotic attractor state instead of stabilising as in (1) and (2) above. The usefulness of the three different types of outcomes depends on the user’s objectives. A fixed-point attractor can provide straightforward answers to causal “what if” questions. The equilibrium state can be used to predict the future state of the system being modelled by the FCM for a particular initial state. As an example based on Fig. 2, the state vector [0 1 0 0 0 0 0], provided as a stimulus to the FCM, may cause it to equilibrate to the fixed-point attractor at [0 0 0 1 0 0 0]. Such an equilibrium state would indicate that an increase in “migration into city” eventually leads to the increase of “garbage per area”. A limit cycle provides the user with a deterministic behaviour of the real-life situation being modelled. It allows the prediction of a cycle of events that the system


will find itself in, given an initial state and a causal link (edge) matrix. For FCMs with continuous transformation functions and concept values, a resulting chaotic attractor can assist in simulation by feeding the simulation environment with endless sets of events so that a realistic effect is obtained.

Development of FCMs for decision modelling
FCMs can be based on textual descriptions given by an expert on a problem scenario, or on interviews with the expert. The steps followed are:

Step 1: Identification of key concepts/issues/factors influencing the problem.
Step 2: Identification of causal relationships among these concepts/issues/factors.

Experts give qualitative estimates of the strengths associated with the edges linking nodes. These estimates are translated into numeric values in the range −1 to 1. For example, if an increase in the value of concept A causes concept B to increase significantly (a strong positive influence), a value of 0.8 may be associated with the causal link leading from A to B. Experts themselves may be asked to assign these numerical values. The outcome of this exercise is a diagrammatic representation of the FCM, which is converted into the corresponding edge matrix.

Learning in FCMs
FCM learning involves updating the strengths of the causal links. Combining multiple FCMs is the simplest form of learning. An alternative learning strategy is to improve the FCM by fine-tuning its initial causal link (edge) strengths through training similar to that in artificial neural networks. Both approaches are outlined below.

Multiple FCMs constructed by different experts can be combined to form a new FCM. FCM combination can provide the following advantages:
1. It allows the expansion of an FCM by incorporating new knowledge embodied in other FCMs.
2. It facilitates the construction of a relatively bias-free FCM by merging different FCMs representing the belief systems of a number of experts in the same problem domain.

The procedures for combining FCMs are outlined in (Kosko 1988). Generally, combination of FCMs involves summing the matrices that represent the different FCMs; the matrices are augmented to ensure conformity in the addition. Each FCM drawn by a different expert may be assigned a credibility weight. The combined FCM is given by

E = Σ_{k=1}^{N} Wk Ek,

where E is the edge matrix of the new combined FCM, Ek is the edge matrix of FCM k, Wk is the credibility weight assigned to FCM k, and N is the number of FCMs to be combined. Siegel and Taber (1987) outline procedures for assigning credibility weights in FCMs.
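A weighted combination of expert maps is straightforward to compute once the matrices are augmented onto a common concept set. The Python sketch below is our own illustration; the two expert maps, their concepts and the credibility weights are invented for demonstration.

```python
import numpy as np

def combine_fcms(edge_matrices, concept_lists, credibility_weights):
    """Combined FCM E = sum_k W_k E_k, after augmenting each expert matrix
    onto the union of all concepts so that the matrices conform."""
    all_concepts = sorted({c for concepts in concept_lists for c in concepts})
    index = {c: i for i, c in enumerate(all_concepts)}
    E = np.zeros((len(all_concepts), len(all_concepts)))
    for Ek, concepts, wk in zip(edge_matrices, concept_lists, credibility_weights):
        Ek = np.asarray(Ek, dtype=float)
        for a, ca in enumerate(concepts):
            for b, cb in enumerate(concepts):
                E[index[ca], index[cb]] += wk * Ek[a, b]
    return all_concepts, E

# Two small expert maps over partially overlapping concepts (illustrative values)
E1 = [[0.0, 0.7], [0.0, 0.0]]     # expert 1: satisfaction -> sales = +0.7
E2 = [[0.0, -0.6], [0.0, 0.0]]    # expert 2: defects -> satisfaction = -0.6
concepts, E = combine_fcms(
    [E1, E2],
    [["satisfaction", "sales"], ["defects", "satisfaction"]],
    credibility_weights=[0.6, 0.4])
print(concepts)
print(E.round(2))
```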


McNeill and Thro (1994) discuss the training of FCMs for prediction. A list of state vectors is supplied as historical data, and an initial FCM is constructed with arbitrary weight values. It is then trained to make predictions of the future average value in a stock market using historical stock data. The FCM runs through the historical data set one state at a time; for each input state, the 'error' is determined by comparing the FCM's output with the expected output provided in the historical data, and weights are adjusted when an error is identified. The data set is cycled through until the error has been reduced sufficiently that no more changes in weights occur. If a correlated change between two concepts is observed, then a causal relation between the two is likely, and the strength of this relationship should depend on the rate of the correlated change. This proposition forms the basis of Differential Hebbian Learning (DHL). Kosko (1992) discusses the use of DHL as a form of unsupervised learning for FCMs. DHL can simplify the construction of FCMs by allowing the expert to enter approximate values (or even just the signs) for causal link strengths; DHL can then be used to encode some training data to improve the FCM's representation of the problem domain and consequently its performance.

Business Models as Cognitive Maps
Drawing on the insights of the cognitive mapping approach in strategic management, we argue that the causal structures embedded in business models can be usefully conceptualized and represented as cognitive maps (Furnari 2015). From this perspective, a business model's cognitive map is a graphical representation of an entrepreneur or top manager's beliefs about the causal relationships inherent in that business model (Furnari 2015). By emphasizing the causal nature of business models, this definition is consistent with previous studies viewing business models as sets of choices and the consequences of those choices (e.g. Casadesus-Masanell and Ricart 2010), and with studies that explicitly highlight the importance of cause-effect relationships in business models' cognitive representations (e.g. Baden-Fuller and Haefliger 2013; Baden-Fuller and Mangematin 2013). Business models' cognitive maps can be derived from the texts that entrepreneurs and top managers use in designing their business models or in pitching their projects to various audiences (including investors, customers and policy makers), or they can be derived from primary interviews with entrepreneurs and top managers (Furnari 2015). Thus, the content of a business model's cognitive map can be idiosyncratic, depending on the particular individual's cognitive schemas and on the language they use. The raw concepts that entrepreneurs and top managers use in their causal statements identify the elements of a business model's cognitive map that are induced empirically (Furnari 2015). At the same time, such maps may include elements deduced theoretically from extant theories about business models, i.e. the conceptual categories developed in such theories (such as "value proposition" or "monetization mechanisms"), which can be useful for classifying the raw concepts used by entrepreneurs and top managers, providing a basis for comparing different individuals' cognitive maps. Thus, business models' cognitive maps include both inductive and deductive elements, as do other types of cognitive maps (e.g. Axelrod 1976; Bryson et al. 2004).

For the sake of illustrating examples of business models' cognitive maps, we focus particularly on the business model representation developed by Baden-Fuller and Mangematin (2013; Furnari 2015). Among the several business model representations


suggested in the literature, we adopt this typological representation because it strikes a balance between parsimony and generality, thus meeting the criteria typically recommended for solid theory-based typologies (e.g. Doty and Glick 1994; Delbridge and Fiss 2013). Specifically, this typology includes the essential building blocks of the business model as covered by other business model representations, thus having a general scope in terms of content. At the same time, it uses a more parsimonious set of categories than other business model representations in covering this general scope. For this reason, in the cognitive maps’ illustrations provided below, we used the four constructs characterizing this business model representation (“customer identification”, “customer engagement (or value proposition)”, “value chain” and “monetization”) as organizing categories (Furnari 2015). Although we use this specific business model representation here for illustrating business models’ cognitive maps, the cognitive mapping approach developed in this paper can be used, more generally, with any other business model representation, depending on the analyst’s preferences and research objectives (Furnari 2015) (Fig. 4).

Fig. 4. A model for integrating FCM or Cognitive Maps approach in BPM.

3 Discussion - Conclusions

In this study we have attempted to present and analyse some important analytical management techniques that might be of value in business process modelling. We have argued, through examples relevant to the competition analysis of different brands sharing the same market, that each such technique could be involved in business process analysis via specific formalisms, and that in order for these techniques to be widely utilized by enterprises, a common, well-defined framework should be established based on BPMN. BPMN could provide the representation schemes to be integrated with the associated formalisms, as shown in the next diagram.


[Diagram: the basic elements of BPMN (including BPMN 2.0) flow objects – a proposal towards a framework for integrating analytical management techniques.]

An enriched Activity in BPMN 2.0 is a possible conveyor for integrating analytical management techniques into business process modelling, in combination with Gateways for decisions based on the results of such enriched activities. To this end, our presentation is a first step. Each analytical management technique presented here should be analysed in depth in order to be integrated with the BPM methodology into a common, useful and well-organized application framework, which could subsequently be employed in real-world scenarios, even when managing big data of the associated enterprises.
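As a rough illustration of this idea (and only as an assumption of ours about how an "enriched activity" might behave, not a construct defined in BPMN 2.0 or in this paper), the sketch below pairs an activity that runs an analytical model from Sect. 2.1 with a gateway-style decision on its result.

```python
import numpy as np

def market_share_forecast_activity(P, q0, weeks):
    """Hypothetical 'enriched activity': runs a Markov-chain forecast
    (Sect. 2.1) and returns the predicted market shares."""
    return q0 @ np.linalg.matrix_power(P, weeks)

def exclusive_gateway(own_share, threshold=0.30):
    """Hypothetical exclusive gateway: routes the process based on the
    forecast produced by the enriched activity."""
    return ("launch promotion subprocess" if own_share < threshold
            else "continue standard subprocess")

P = np.array([[0.90, 0.07, 0.03],
              [0.02, 0.82, 0.16],
              [0.20, 0.12, 0.68]])
forecast = market_share_forecast_activity(P, np.array([0.25, 0.46, 0.29]), weeks=2)
print(forecast.round(3), "->", exclusive_gateway(forecast[0]))
```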

References

Ashcroft, M.: Bayesian networks in business analytics. In: Ganzha, M., Maciaszek, L., Paprzycki, M. (eds.) Proceedings of the Federated Conference on Computer Science and Information Systems FedCSIS 2012, Polskie Towarzystwo Informatyczne, pp. 955–961. IEEE Computer Society Press, Warsaw (2012) Anderson, R.D., Mackoy, R.D., Thompson, V.B., Harrell, G.: A Bayesian network estimation of the service-profit chain for transport service satisfaction. Decis. Sci. 35(4), 665–689 (2004) Axelrod, R.M. (ed.): Structure of Decision: The Cognitive Maps of Political Elites. Princeton University Press, Princeton (1976) Baden-Fuller, C., Morgan, M.: Business models as models. Long Range Plan. 43(2–3), 156–171 (2010) Baden-Fuller, C., Haefliger, S.: Business models and technological innovation. Long Range Plan. 46(6), 419–426 (2013) Baden-Fuller, C., Mangematin, V.: Business models: a challenging agenda. Strateg. Organ. 11(4), 418–427 (2013) Barr, P.S., Stimpert, J.L., Huff, A.S.: Cognitive change, strategic action, and organizational renewal. Strateg. Manag. J. 13(S1), 15–36 (1992) Bhargava, H.K., Herrick, C., Sridhar, S.: Beyond spreadsheets: tools for building decision support systems. IEEE Expert 32(3), 31–39 (1999) Brubaker, D.: Fuzzy Cognitive Maps. EDN (1996a). http://www.ednmag.com. Accessed 11 Apr 1996 Brubaker, D.: More on Fuzzy Cognitive Maps. EDN (1996b). http://www.ednmag.com. Accessed 25 Apr 1996


Bryson, J.M., Ackermann, F., Eden, C., Finn, C.B.: Visible Thinking: Unlocking Causal Mapping for Practical Business Results. Wiley, Hoboken (2004) Calori, R., Johnson, G., Sarnin, P.: CEOs’ cognitive maps and the scope of the organization. Strateg. Manag. J. 15, 437–457 (1994) Carley, K., Palmquist, M.: Extracting, representing, and analyzing mental models. Soc. Forces. 70(3), 601–636 (1992) Casadesus-Masanell, R., Ricart, J.E.: From strategy to business models and onto tactics. Long Range Plan. 43(2–3), 195–215 (2010) Caudill, M.: Using neural nets: fuzzy cognitive maps. AI Expert 5, 49–53 (1990) Chakraborty, S., et al.: A Bayesian network-based customer satisfaction model: a tool for management decisions in railway transport. Decis. Anal. 3, 4–5 (2016). https://doi.org/10. 1186/s40165-016-0021-2 Chesbrough, H., Rosenbloom, R.S.: The role of the business model in capturing value from innovation: evidence from Xerox Corporation‘s technology spin-off companies. Ind. Corp. Change 11(3), 529–555 (2002) Chong, A., Khan, S., Gedeon, T.: Differential Hebbian learning in fuzzy cognitive maps: a methodological view in the decision support perspective. In: Proceedings of the Third Australia-Japan Joint Workshop on Intelligent and Evolutionary Systems, 23–25 November 1999, Canberra (1999a, to be published) Chong, A.: Development of a fuzzy cognitive map based decision support systems generator. Honours dissertation, School of Information Technology, Murdoch University (1999b) Clarke, I., Mackaness, W.: Management ‘intuition’: an interpretative account of structure and content of decision schemas using cognitive maps. J. Manag. Stud. 38(2), 147–172 (2001) Daly, R., Shen, Q., Aitken, S.: Learning Bayesian networks: approaches and issues. Knowl. Eng. Rev. 26(02), 99–157 (2011) Davis, G.F., Marquis, C.: Prospects for organization theory in the early twenty-first century: institutional fields and mechanisms. Organ. Sci. 16(4), 332–343 (2005) Delbridge, R., Fiss, P.C.: Editors’ comments: styles of theorizing and the social organization of knowledge. Acad. Manag. Rev. 38(3), 325–331 (2013) Doganova, L., Eyquem-Renault, M.: What do business models do? Innovation devices in technology entrepreneurship. Res. Policy 38(10), 1559–1570 (2009) Doty, D.H., Glick, W.H.: Typologies as a unique form of theory building: toward improved understanding and modeling. Acad. Manag. Rev. 19(2), 230–251 (1994) Doz, Y.L., Kosonen, M.: Embedding strategic agility: a leadership agenda for accelerating business model renewal. Long Range Plan. 43(2–3), 370–382 (2010) Eden, C., Ackermann, F., Cropper, S.: The analysis of cause maps. J. Manag. Stud. 29(3), 309– 324 (1992) Eisenmann, T., Parker, G., Van Alstyne, M.W.: Strategies for two-sided markets. Harv. Bus. Rev. 84(10), 92 (2006) Elert, G.: The Chaos Hypertexbook (1999). http://hypertextbook.com/chaos/about.shtml Elster, J.: A plea for mechanisms. In: Hedström, P., Swedberg, R. (eds.) Social Mechanisms: An Analytical Approach to Social Theory. Cambridge University Press, Cambridge (1998) Fiol, C.M., Huff, A.S.: Maps for managers: where are we? Where do we go from here? J. Manag. Stud. 29(3), 267–285 (1992) Fox, T.: Small and Medium-Sized Enterprises (SMEs) and Corporate Social Responsibility – A Discussion Paper. International Institute for Environment and Development, London (2005) Furnari, S.: A cognitive mapping approach to business models: representing causal structures and mechanisms. In: Baden-Fuller, C., Mangematin, V. (eds.) Business Models and Modelling. 
Advances in Strategic Management, chap. 8, vol. 33. Emerald Press (2015)


Gavetti, G.M., Henderson, R., Giorgi, S.: Kodak and the Digital Revolution (A). Harvard Business School Case #705-448 (2005) Goldman, I.: Poor decisions that humbled the Kodak giant’. Financial Times, 13 April 2012 Grandori, A., Furnari, S.: A chemistry of organization: combinatory analysis and design. Organ. Stud. 29(2), 315–341 (2008) Gunn, R., Williams, W.: Strategic tools: an empirical investigation into strategy in practice in the UK. Strateg. Change 16, 201–216 (2007). www.interscience.wiley.com, https://doi.org/10. 1002/jsc.799 Heckerman, D., Geiger, D., Chickering, D.M.: Learning Bayesian networks: the combination of knowledge and statistical data. Mach. Learn. 20, 197–243 (1995) Hempel, C.G.: Fundamentals of taxonomy, and typological methods in the natural and social sciences. In: Aspects of Scientific Explanation, chap. 6, pp. 137–72. Macmillan, New York (1965) Hodgkinson, G.P., Bown, N.J., Maule, A.J., Glaister, K.W., Pearman, A.D.: Breaking the frame: an analysis of strategic cognition and decision making under uncertainty. Strateg. Manag. J. 20, 977–985 (1999) Hodgkinson, G.P., Healey, M.P.: Cognition in organizations. Ann. Rev. Psychol. 59, 387–417 (2008) Hodgkinson, G.P., Maule, A.J., Bown, N.J.: Causal cognitive mapping in the organizational strategy field: a comparison of alternative elicitation procedures. Organ. Res. Methods 7(1), 3–26 (2004) Huff, A.S.: Mapping Strategic Thought. Wiley, Chichester (1990) Huff, A.S., Narapareddy, V., Fletcher, K.E.: Coding the causal association of concepts. In: Mapping Strategic Thought. Wiley, New York (1990) Jenkins, M., Johnson, G.: Entrepreneurial intentions and outcomes: a comparative causal mapping study. J. Manag. Stud. 34, 895–920 (1997) Jensen, F.V., Nielsen, T.D.: Bayesian Networks and Decision Graphs, 2nd edn. Springer, Berlin (2007). https://doi.org/10.1007/978-0-387-68282-2 Jeppesen, S., Kothuis, B., Tran, A.N.: Corporate Social Responsibility and Competitiveness for SMEs in Developing Countries: South Africa and Vietnam, Focales 16, Agence Francaise de Development, Paris (2012) Jetter Antonie, J., Kok, K.: Fuzzy cognitive maps for futures studies—a methodological assessment of concepts and methods. Futures J. 61, 45–57 (2014). https://doi.org/10.1016/j. futures.2014.05.002 Johnson, S., Fielding, F., Hamilton, G., Mengersen, K.: An integrated Bayesian network approach to Lyngbya majuscula bloom initiation. Mar. Environ. Res. 69(1), 27–37 (2010) Kaplan, S., Sawhney, M.: B2B E-Commerce hubs: towards a taxonomy of business models. Harv. Bus. Rev. 79(3), 97–100 (2000) Kardaras, D., Karakostas, B.: The use of fuzzy cognitive maps to simulate information systems strategic planning process. Inf. Softw. Technol. 41(4), 197–210 (1999) Karras, D.A., Papademetriou, R.C.: A systematic review of analytical management techniques in business process modelling for smes beyond what-if-analysis and towards a framework for integrating them with BPM. In: Proceedings of the 7th International Symposium on Business Modeling and Software Design, BMSD 2017 (2017) Klang, D., Wallnöfer, M., Hacklin, F.: The business model paradox: a systematic review and exploration of antecedents. Int. J. Manag. Rev. 16, 454–478 (2014) Kosko, B.: Fuzzy cognitive maps. Int. J. Man-Mach. Stud. 24, 65–75 (1986) Kosko, B.: Neural Networks and Fuzzy Systems. Prentice Hall, Englewood Cliffs (1992) Lee, K.C., Kim, H.S.: A fuzzy cognitive map-based bi-directional inference mechanism: an application to stock investment analysis. Intell. Syst. Account. Fin. Manag. 
6, 41–57 (1997)


Lee, K.C., Lee, W.J., Kwon, O.B., Han, J.H., Yu, P.I.: Strategic planning simulation based on fuzzy cognitive map knowledge and differential game. Simulation 71, 316–327 (1998) Laukkanen, M.: Comparative cause mapping of organizational cognitions. Organ. Sci. 5(3), 322– 343 (1994) Luetkenhorst, W.: Corporate social responsibility and the development agenda. Intereconomics 39(3), 157–166 (2004) McNeill, M.F., Thro, E.: Fuzzy Logic: A Practical Approach. AP Professional, Boston (1994) Morsing, M., Perrini, F.: CSR in SMEs: do SMEs matter for the CSR agenda? Bus. Ethics: Eur. Rev. 18(1), 1–6 (2009) Nadkarni, S., Shenoy, P.P.: A Bayesian network approach to making inferences in causal maps. Eur. J. Oper. Res. 128(3), 479–498 (2001) Nadkarni, S., Shenoy, P.P.: A causal mapping approach to constructing Bayesian networks. Decis. Support Syst. 38(2), 259–281 (2004) Park, K.S.: Fuzzy cognitive maps considering time relationships. Int. J. Hum.-Comput. Stud. 42, 157–167 (1995) Pearl, J.: Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann Publishers Inc., San Francisco (1988) Power, D.J.: What is DSS? DS*. On-Line Exec. J. Data-Intensive Decis. Support 1(3) (1997). http://dss.cba.uni.edu/papers/whatisadss Rappa, M.A.: The utility business model and the future of computing services. IBM Syst. J. 43 (1), 32–42 (2004) Raynard, P., Forstater, M.: Corporate social responsibility: implications for small and medium enterprises in developing countries. UNIDO, Geneva (2002). www.unido.org/fileadmin/ import/29959_CSR.pdf. Accessed 27 Sept 2016 Render, B., Stair, M.S.: Quantitative Analysis for Management. Allyn & Bacon, Boston (1988) Salini, S., Kenett, R.S.: Bayesian networks of customer satisfaction survey data. J. Appl. Stat. 36 (11), 1177–1189 (2009) Taber, R.: Knowledge processing with fuzzy cognitive maps. Expert Syst. Appl. 2(1991), 83–87 (1991) Tsadiras, A.K.: Using fuzzy cognitive maps for e-commerce strategic planning. In: Proceedings of the 9th Panhellenic Conference on Informatics (EPY 2003) (2003) Turban, E.: Decision Support and Expert Systems Management Support Systems, 3rd edn. Macmillan, New York (1993) VidHya Analytics: Introduction to Markov chain: simplified! (2017). https://www.analytics vidhya.com/blog/2014/07/markov-chain-simplified/. Accessed 1 2017 Wirtz, B.W., Schilke, O., Ullrich, S.: Strategic development of business models: implications of the web 2.0 for creating value on the internet. Long Range Plan. 43(2–3), 272–290 (2010)

A Privacy Risk Assessment Model for Open Data

Amr Ali-Eldin, Anneke Zuiderwijk, and Marijn Janssen

Leiden Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands
[email protected]
Computer and Control Systems Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
Faculty of Technology, Policy and Management, Delft University of Technology, Delft, The Netherlands

Abstract. While the sharing of information has become common practice for governments and organizations, many datasets are still not openly published because they may violate users' privacy. The risk of privacy violation is a factor that regularly hinders the publication of data and results in push-back from governments and organizations. Moreover, even published data that may appear safe can violate user privacy through the disclosure of users' identities. This paper proposes a privacy risk assessment model for open data structures to analyse and reduce the risks related to the opening of data. The key components are privacy attributes of open data, reflecting the privacy risk versus benefit trade-offs related to the intended usage scenarios of the data to be opened. These attributes are evaluated by a decision engine into a privacy risk indicator value and a privacy risk mitigation measure. The privacy risk indicator expresses the anticipated level of privacy risk related to opening the data, and the privacy risk mitigation measure expresses the measures that should be applied to the data to avoid the expected privacy risks. The model is exemplified through five real scenarios concerning open datasets.

Keywords: Open data · Privacy risks · Personally identifiable information (PII) · Data mining · Scoring systems

1 Introduction

Governments and publicly funded research organizations are urged to disclose their data and to make these data available without restrictions and for free [1]. Opening public and private data is a complex activity that may bring about benefits yet may also run into risks [2]. An essential risk that may hinder the publication of data is that organizations may violate the privacy of citizens when opening data about them [3]. In addition, when opening data, organizations lose control over who is using the data and for what purpose: once data are published, there is no control over who will download, use and modify them.

© Springer International Publishing AG, part of Springer Nature 2018. B. Shishkov (Ed.): BMSD 2017, LNBIP 309, pp. 186–201, 2018. https://doi.org/10.1007/978-3-319-78428-1_10


To avoid privacy violations, data publishers and owners can remove sensitive data from datasets; however, this makes the datasets less useful. Furthermore, even published data that may appear privacy-compliant can violate user privacy through the leakage of real user identities when different datasets and different resources are linked to each other [4]. The possibility of subsequently mining the data to obtain meaningful conclusions can lead to leakage of private information or of users' real identities. Even though organisations remove personally identifiable information (PII) from a dataset before publishing it, several studies show that anonymized data can be de-anonymized and thus real identities can be revealed [4]. Various existing studies have pointed at the risks and challenges of privacy violation in publishing and using open data [3–6]. Some studies have identified privacy risks or approaches for organisations in collecting and processing data [7, 8], some have given decision support for opening data in general [2], and some have concentrated on releasing data and information at the individual level [9]. Nevertheless, there is still limited insight into how organisations can reduce privacy violation risks for open data specifically, and there is no uniform approach for privacy protection [5]. From existing studies it has not become clear which open data frameworks can be used to reduce the risk of open data privacy violation. An open data design is required that helps in making decisions on opening data and that provides insight into whether the data may violate users' privacy.

The goal of this paper is to propose a model to analyse the privacy violation risks of publishing open data. To do so, a new set of what we call open data attributes is proposed. Open data attributes reflect the privacy risk versus benefit trade-offs related to the expected usage scenarios of the data to be opened. These attributes are evaluated by a decision engine into a privacy risk indicator (PRI) and a privacy risk mitigation measure (PRMM). Specifically, this can determine whether to open data or keep it closed. This paper is organized as follows: Sect. 2 discusses related work, while Sect. 3 presents privacy violation risks associated with open data. Section 4 introduces the proposed model, which helps to identify the risks and highlights possible alternatives to reduce them. Section 5 highlights how the proposed model can be implemented in practice. Section 6 exemplifies the model by providing some scenarios and preliminary results. Section 7 discusses the key findings and concludes the paper.
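To make the idea of a decision engine concrete, the sketch below shows one possible way such an engine could map open data attributes to a PRI and a PRMM. The attribute names, weights, thresholds and mitigation measures are purely illustrative assumptions of ours; the actual attributes and decision logic are introduced in Sect. 4.

```python
# Hypothetical open data attributes and weights (illustrative only, not the paper's model)
ATTRIBUTE_WEIGHTS = {
    "contains_pii": 0.4,
    "linkable_to_other_datasets": 0.3,
    "sensitive_domain": 0.2,
    "individual_level_records": 0.1,
}

def assess(dataset_attributes):
    """Map attribute scores in [0, 1] to a privacy risk indicator (PRI)
    and a privacy risk mitigation measure (PRMM)."""
    score = sum(ATTRIBUTE_WEIGHTS[name] * value
                for name, value in dataset_attributes.items())
    if score < 0.3:
        return "PRI: low", "PRMM: publish as is"
    if score < 0.6:
        return "PRI: medium", "PRMM: aggregate or anonymize before publishing"
    return "PRI: high", "PRMM: keep closed or publish under restricted access"

print(assess({"contains_pii": 1.0, "linkable_to_other_datasets": 0.5,
              "sensitive_domain": 0.0, "individual_level_records": 1.0}))
```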

2 Previous Work

Public bodies are considered the largest producers of data in society, in what is known as open data. Open data may range from data on procurement opportunities, weather, traffic, tourism, energy consumption and crime statistics to data about policies and organizations [1, 2]. Data can be classified into different levels of confidentiality, including highly private, classified, restricted and open. Open data that have no relation to data about citizens are considered outside the scope of this work.


Anonymized data about citizens can be shared to understand societal issues, for example crime or disease. An example of citizen data is the sharing of patient data to initiate collaboration among healthcare providers, which is expected to benefit both patients and researchers. The highly anticipated benefit of such data sharing is an improved understanding of particular diseases and, consequently, better treatments. It can also help professionals to become more efficient; for instance, a general practitioner can diagnose and prescribe medication more quickly. However, this sharing of patients' data should be accompanied by information security policies and privacy controls.

A variety of data protection directives has been created and implemented. Building on the Data Protection Directive [2], a comprehensive reform of data protection rules in the European Union was proposed [3]. Additionally, the Organization for Economic Co-operation and Development (OECD) has formulated privacy principles [4], including principles such as "There should be limits to the collection of personal data" and "Personal data should not be disclosed, made available or otherwise used for purposes other than those specified in accordance with Paragraph 9 except (a) with the consent of the data subject; or (b) by the authority of law." In addition, the ISO/IEC 29100 standard defines 11 privacy principles [5]. It has become a trend that organizations pay more attention to privacy, since data are considered a central asset of any business. The data protection directives are often defined at a high level of abstraction and give limited guidance on translating them into practice. Despite these directives and other data protection arrangements, organisations are still prone to privacy violations when publishing open data. In the following sections, we elaborate on the main risks of privacy violations associated with open data.

There has been increasing interest in designing privacy protection into technologies from the outset, known as privacy by design (PbD). PbD is a proactive approach to privacy protection that considers the privacy implications of new technologies during the design stage rather than as an afterthought [6]. Privacy awareness is increasingly being raised, and many privacy-related studies are being conducted, but they either concentrate on legal aspects, such as [7], or on conducting a formal Privacy Impact Assessment (PIA), such as [8]. Most work on privacy impact assessment aims at audits or surveys that evaluate how organizations handle personal data according to regulatory frameworks and moral or ethical values, in what is known as a PIA. According to [9], a PIA is a process that should start at the earliest possible stage, when there are still opportunities to influence the outcome of a project, and that should continue until and even after the project has been deployed. A PIA has often been described as an early warning system, as it provides a way to detect potential privacy problems [9]. The General Data Protection Regulation (GDPR) (Regulation (EU) 2016/679), adopted on 27 April 2016 as a replacement of Directive 95/46/EC [2], becomes enforceable from 25 May 2018 and does not require national governments to pass any enabling legislation [10].
The regulation primarily aims to give citizens and EU residents control over their personal data and to simplify the regulatory environment for international business by unifying the regulation within the EU.


The GDPR requires organizations that process personal data of EU citizens to include privacy protection activities in the development lifecycle of software and business processes. In the context of open data, however, such frameworks for assessing privacy risks cannot be used directly, since the data to be published should, as required by law, contain no identifying information. Consequently, conventional methods for assessing privacy risks cannot be applied, and new approaches are needed that weigh the benefits of sharing the data against the expected privacy risks of leakage of personally identifiable information.

3 Privacy Threats and Opening Data

In this section, we elaborate on the privacy threats associated with making data openly accessible.

3.1 Disclosure of Real Identities

Privacy can be characterized as the need to manage information and the interactions associated with it [11]. Privacy risks are caused mainly by the risks associated with anonymizing data and making them available for re-use. Privacy legislation and data protection policies oblige organisations and governments not to publish private data. In this context, organisations are required to remove any identifying information from the data before making them available online. However, research on anonymization techniques shows that anonymized data can be de-anonymized and identifying information can thus be recovered. For instance, Narayanan and Shmatikov [12] showed that an adversary with very little information about a user could recognize his or her record in the openly published Netflix dataset of 500,000 anonymized subscribers. Likewise, removing real names, birth dates and other sensitive data from datasets may not always have the desired effect. For example, the police may publish open data about car and bicycle thefts after removing the real names of the people involved. Although excluding these names may sound sufficient for privacy and security protection, more research is needed to examine whether user identities are actually safeguarded and whether this is a robust approach.
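To make this concern concrete, the following minimal sketch checks whether combinations of potentially identifying attributes occur often enough to hide individuals (a k-anonymity style audit). The records, column names and threshold are invented for illustration and are not taken from the study.

# Python sketch: minimal k-anonymity check over potentially identifying columns.
# The records, columns and threshold k are hypothetical.
from collections import Counter

records = [
    {"age": 34, "sex": "F", "postcode": "1012", "offence": "bike theft"},
    {"age": 34, "sex": "F", "postcode": "1012", "offence": "car break-in"},
    {"age": 51, "sex": "M", "postcode": "2517", "offence": "bike theft"},
]

identifying_columns = ("age", "sex", "postcode")
k = 2  # every attribute combination should occur at least k times

groups = Counter(tuple(r[c] for c in identifying_columns) for r in records)
risky = [combo for combo, count in groups.items() if count < k]

# Combinations occurring fewer than k times single out individuals and are
# candidates for generalisation or suppression before publication.
print(risky)  # [(51, 'M', '2517')]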

3.2 Personal Information Discovery Through Linking Data

The combination of variables from different datasets can result in the identification of individuals and reveal identities [13]. Data attributes referred to as 'semi-identifiers' (also known as quasi-identifiers) can be linked to external data sources and can thus lead to the disclosure of hidden identities [14]. Examples of semi-identifiers are a person's age, sex and address. An attacker may identify an individual, say John, in a dataset: by linking this dataset to another one containing sex, origin and the city where John lives, John's records may be identified in the open dataset.


Hence, such data types need to be hidden as well. Furthermore, sensitive data, such as diseases, should be excluded from the datasets. However, data providers often cannot predict in advance which combination of variables will lead to privacy violations [13], which makes this prediction a complex task. Some domains, such as healthcare, use a linkage unit to connect different datasets through a linkage id instead of personally identifiable information (PII) in order to reduce privacy risks. Often, the linkage unit relies on a combination of obscured PII (for example, the first letters of names plus the birth date) [15]. In the same way, however, an attacker can mimic a linkage unit and retrieve the actual records of users.
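As an illustration of such a linkage attack, the sketch below joins an "anonymised" open dataset with an external source on shared semi-identifiers; all datasets, names and values are hypothetical.

# Python sketch: linking an "anonymised" open dataset to an external source
# on semi-identifiers (age, sex, city). All data are invented.
open_thefts = [  # published without names
    {"age": 29, "sex": "M", "city": "Delft", "item": "bicycle"},
    {"age": 63, "sex": "F", "city": "Leiden", "item": "scooter"},
]

external_register = [  # e.g. a leaked or public membership list with names
    {"name": "John", "age": 29, "sex": "M", "city": "Delft"},
    {"name": "Anna", "age": 45, "sex": "F", "city": "Delft"},
]

semi_identifiers = ("age", "sex", "city")

# A naive join on the semi-identifiers is enough to re-attach a name to the
# "anonymous" theft record.
linked = [
    {**theft, "name": person["name"]}
    for theft in open_thefts
    for person in external_register
    if all(theft[q] == person[q] for q in semi_identifiers)
]

print(linked)  # [{'age': 29, 'sex': 'M', 'city': 'Delft', 'item': 'bicycle', 'name': 'John'}]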

3.3 Personal Information Discovery Through Data Mining

Open data makes data available online for researchers and organizations. Organizations use data mining techniques to extract meaningful information from these datasets to support their business. In doing so, they can violate user privacy, since mining the data can derive private information. To help overcome these issues, privacy-preserving data mining techniques should be used to reduce privacy risks [14].

3.4 Data Utilization Versus Privacy

Once a dataset has been transferred from the data owner to the data publisher, the data owner is no longer in control of his or her data. Control has been transferred to the data publisher, who is legally responsible for protecting individuals' privacy. Before publishing the data online, the data publisher anonymizes the data and removes any sensitive data that make it possible to identify individuals. Most of the time, the data publisher does not know who will receive the data and for what purpose they will be accessed. Further, the data publisher does not know which mining techniques will be used by the data recipient and how much sensitive information can be derived from the anonymized data. If the data publisher removes all identifying information, modifies the related semi-identifiers and removes sensitive data, the published data can lose their value. Consequently, there should be a balance between what can be published, so that users are able to derive useful information, and the guarantee of privacy protection. Complete privacy protection may result in no use of the data at all, in which case the published data turn out to have no value.

4 Proposed Privacy Risk Assessment Model

The uncertainty associated with the disclosure of data makes it hard to find a good way to protect user privacy. Once data are published, unknown third-party organisations and other users can access sensitive information. Sharing data under uncertainty while still being able to guarantee privacy protection is one of the challenges in these situations [16]. Since evaluating privacy and security risks is essential for enterprises [17], we assume that the same is needed for open data environments in order to protect user data.


The key components of the proposed model are presented in the following subsections (see Fig. 1).

Fig. 1. Proposed privacy risk assessment model: open data attributes are evaluated by a decision engine that produces a privacy risk indicator (PRI) and privacy risk mitigation measures (PRMM)

4.1 Open Data Attributes

The privacy attributes represent factors that influence the open publication of data. They are inspired by the work of [18], who defined a number of factors influencing users' willingness to share their private information, and [19], who proposed a decision engine to evaluate privacy risks associated with sharing information on social networks. In this context, five attributes are distinguished:

– Criticality level: this factor expresses the significance of the data, analogous to the significance of the benefit of publishing the data to the community. The criticality level can be measured by running a privacy impact assessment.
– Openness: this refers to the need for publishing the data openly. This factor expresses the benefit of data publication from public and business perspectives. If the data criticality level is high and the need for openness is high, a trade-off exists, and the need for openness can outweigh the high criticality level or vice versa.
– Risk of attacks: this refers to the level of the expected cyber security threat alert. If the threat of an attack is high, this can affect the decision to publish the data and make them available to others. This is similar to the terrorist threat level in the Netherlands, which has four levels: minimal, meaning attacks are unlikely; limited, meaning an attack is less likely; substantial, meaning an attack is likely; and critical, meaning there are very strong indications that an attack will happen [20]. Here, we use a similar notation, reflecting the status of security threats at the organisation publishing the data, in its neighbourhood or its country.


– Trust: this corresponds to how the data publisher/owner is evaluated by others with respect to his or her reliability. A poor reputation of the data publisher affects the perceived quality of the data and the way privacy is handled.
– Restrictions of use: this represents the access privileges allowed on the data. We distinguish four types of restriction:
  • None: no restriction is applied.
  • Role of user: a restriction is applied on the basis of the user role.
  • Purpose of use: different restrictions may apply depending on the purpose for which the data are needed.
  • Physical presence: data access depends on the physical location from which the data are accessed.

4.2 Decision Engine

The decision engine is responsible for determining the perceived privacy risks and recommends an appropriate privacy risk mitigation measure. This is done on the basis of a scoring scheme and a decision algorithm that take the scores of the open data attributes as input. The scores are specified by domain experts and by analysis of related work. For simplicity, a scoring mechanism is used in which attribute values are given scores on a scale from 1 to 5 according to their risk to privacy, normalised so that each attribute value is assigned a score $s$ with $0 < s \le 1$. These scores are based on assumptions about the privacy risks associated with each attribute value. Each attribute category $A_i$ has a weight $w_i$ ($0 < w_i \le 1$) associated with it, and the weighted scores are aggregated as follows:

\[
\mathrm{PRI} = \frac{1}{n} \sum_{i=1}^{n} w_i \cdot \max(s_i), \qquad \mathrm{PRI} \le 1 \tag{1}
\]

Max(s_i) means that if more than one score is possible within one attribute category, because more than one attribute value applies (for example, two types of use), the maximum score is selected to reflect the value with the highest risk. The advantage of using weights is to introduce some flexibility, so that the influence of each attribute category can be updated over time according to lessons learned from collected data and previously discovered privacy threats.
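A minimal sketch of this aggregation is given below. The category names, example scores and the use of equal weights are illustrative assumptions; the model itself leaves the weights to be tuned over time.

# Python sketch of Eq. (1): PRI as a weighted average of the highest score
# per attribute category. Scores and weights here are illustrative only.
def privacy_risk_indicator(scores_per_category, weights=None):
    """scores_per_category: dict mapping a category name to the list of
    scores of its applicable attribute values (each in (0, 1])."""
    categories = list(scores_per_category)
    n = len(categories)
    if weights is None:
        weights = {c: 1.0 for c in categories}  # equal weights by default
    return sum(weights[c] * max(scores_per_category[c]) for c in categories) / n

example = {
    "type of user": [0.6],         # e.g. citizen
    "purpose of use": [0.2, 0.8],  # two purposes apply: the maximum is taken
    "type of data": [1.0],         # real-time
    "data criticality": [0.25],
    "restrictions of use": [0.25],
    "need for openness": [1.0],
    "trust": [1.0],
}

print(round(privacy_risk_indicator(example), 2))  # 0.7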

4.3 Privacy Risk Indicator (PRI)

The PRI represents the predicted level of privacy risk associated with opening the data. The PRI can take four values: low, low-medium, medium-high and high.


A high PRI means that the threat of a privacy violation is expected to be high. The PRI is determined by the decision engine on the basis of the scoring matrix and its associated rules.

4.4 Privacy Risk Mitigation Measures (PRMM)

Based on the decision engine, a privacy risk indicator score is predicted together with a privacy risk mitigation measure. For example, what should be done if there is a risk that the identity of the owner of a stolen bike can be tracked down when stolen-bike records are published online? The following measures are used in our framework (a minimal mapping sketch follows this list):

– Level 1: Abandon identifiers. This is the minimal measure a data publisher should take when the risk indicator shows a low risk. By doing so, the publisher adheres to the European directives and the law. The use of database anonymization software is required in order to anonymize the data [21–23].
– Level 2: Change semi-identifiers. Modifying semi-identifier data can help reduce identity capturing. Semi-identifiers are data constructs which, if linked with other datasets, can reveal user identities; examples are age, sex and postal district [16, 24]. Researchers in this area have developed mechanisms that can detect and locate semi-identifiers [25–27]. To meet PRMM level 2, the PRMM level 1 activities must be completed as well.
– Level 3: Exclude delicate data. In some cases there are data items, for example medical diseases, that are considered delicate and should be protected when the privacy risk of publication is high. The type of data that is considered delicate or sensitive differs from one dataset to another, which makes it complex to reliably identify and remove. Some commercial tools, such as Nessus [28], could be used for this purpose. To meet PRMM level 3, the PRMM level 1 and 2 activities must be completed as well.
– Level 4: Discard publication. If the threat is high, it is advised not to publish the data at all, and the recommended measure is therefore to reject publication.
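The following minimal sketch maps a PRI score to a risk band and its mitigation measure, using the score bands later listed in Table 2; the function names and the handling of band boundaries are assumptions made for illustration.

# Python sketch: mapping a PRI score to a risk band and a PRMM level,
# following the score bands of Table 2. Names and boundary handling are
# illustrative assumptions.
PRMM_BY_BAND = {
    "low":         "Level 1: abandon identities",
    "low-medium":  "Level 2: remove semi-identifiers",
    "medium-high": "Level 3: exclude delicate data",
    "high":        "Level 4: discard publishing",
}

def pri_band(score):
    if score <= 0.25:
        return "low"
    if score <= 0.50:
        return "low-medium"
    if score <= 0.75:
        return "medium-high"
    return "high"

def recommend(score):
    band = pri_band(score)
    return band, PRMM_BY_BAND[band]

print(recommend(0.61))  # ('medium-high', 'Level 3: exclude delicate data')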

5 PRI Model Implementation

In this section, we elaborate on the implementation of the PRI model and introduce the functional requirements needed to implement the proposed model in an open data platform.

5.1 Functional Components

For such a platform, privacy obviously represents a critical requirement that must be met. The proposed functionalities are inspired by those proposed in [29] and are as follows (see Fig. 2):


Fig. 2. PRI model functional components

– Search datasets: via this functionality, users can search the available open data by specifying keywords.
– Publish datasets: data can be published so that they can be used by other researchers or practitioners. Before publication, the dataset is reviewed by the management, and a decision on the privacy measures has to be taken by the privacy model first, to guarantee users' privacy without affecting the quality of the data. The privacy model is where the proposed privacy assessment model is implemented (see the sketch after this list).
– Discuss and evaluate: through this functionality, researchers can discuss particular data elements with other experts who have had similar experiences. This component also implements data mining tools to evaluate data properties.
– Access management and privacy model: access control is needed in order to restrict access to the platform to authorized persons only, such as registered professionals, technicians and decision makers. Furthermore, sensitive data and privacy-identifying information are removed before publication.
– Evaluation engine: this component is responsible for evaluating the privacy mitigation decision using the PRI model.
– Request for download of dataset: via this functionality, professionals can download the data they want after obtaining the necessary approvals.


– User roles: three types of users are distinguished: management, authorized users and operators. The management is responsible for granting access to users; it also approves the datasets uploaded by the operators before they are published on the platform.
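The sketch below illustrates how the publish functionality could consult the privacy model before a dataset is released; all function and parameter names are invented, as the paper does not prescribe a concrete API.

# Python sketch of the "publish dataset" flow: the privacy model is consulted
# before a dataset is released. All names are invented for illustration.
def publish_dataset(dataset, evaluate_pri, apply_prmm, notify_management):
    """dataset: dict with open data attribute scores and the records themselves."""
    score = evaluate_pri(dataset["attributes"])      # decision engine, Eq. (1)
    if score > 0.75:                                 # high risk: discard publication
        notify_management(dataset, score, "discard publication")
        return None
    cleaned = apply_prmm(dataset["records"], score)  # apply level 1-3 mitigation
    notify_management(dataset, score, "published after mitigation")
    return cleaned

# Example wiring with stub implementations:
published = publish_dataset(
    {"attributes": {"trust": [0.25], "type of user": [0.6]}, "records": [{"item": "bicycle"}]},
    evaluate_pri=lambda attrs: sum(max(v) for v in attrs.values()) / len(attrs),
    apply_prmm=lambda records, score: records,
    notify_management=lambda *args: None,
)
print(published)  # [{'item': 'bicycle'}]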

5.2 Technical Implementation

A common platform for sharing data has only become feasible thanks to recent developments in service technology and computer networks [29]. Such technical arrangements have existed for more than two decades, since the introduction of web services technology by Microsoft [30]. Early implementations of this approach relied on the high flexibility offered by the World Wide Web. Later, with the development of web 2.0 technology, enterprises invested in large IT transformation projects to achieve business and public strategic goals [31]. This led to the introduction of web 2.0 technology assets such as the Representational State Transfer (REST) architectural style and RESTful web services [32]. In this work, RESTful web services are recommended because REST gives better performance than SOAP web services and its implementation is simpler than that of SOAP [32].
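As an illustration only, the following sketch exposes the PRI evaluation as a RESTful endpoint. The use of Flask, the route and the JSON payload shape are assumptions by the editor and are not prescribed by the paper; equal attribute weights are assumed.

# Python sketch: a minimal RESTful endpoint for the PRI evaluation.
# Flask, the route and the payload shape are assumptions.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/pri", methods=["POST"])
def evaluate_pri():
    scores = request.get_json()  # e.g. {"trust": [1.0], "type of data": [0.33]}
    pri = sum(max(v) for v in scores.values()) / len(scores)  # equal weights assumed
    return jsonify({"pri": round(pri, 2)})

if __name__ == "__main__":
    app.run()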

6 Illustration Scenarios

In this section, we describe five scenarios that illustrate the proposed model. The scenarios are based on the authors' experience with recent open government projects. For each scenario we evaluate the PRI and the PRMM. Table 1 shows the scoring approach based on the authors' experience, and Table 2 shows how the PRI is mapped to a privacy risk mitigation measure level (PRMM). The scenarios involve different types of actors who conduct different activities: some actors upload datasets, others use them, and some both upload and use them. The type of data provided varies between the scenarios; some of the opened data are updated regularly, while others are static with or without updates. The criticality of the data ranges from low to high, and the use of the data is restricted in different ways: the use of some datasets is not limited, while for other datasets the restriction depends on the purpose of use and the type of user. The level of trust in data quality also differs per scenario, ranging from very limited trust (low) to no trust issues (high).

6.1 Scenario S1: Open Crime Data Usage and Provisioning

A resident of a European city wants to know how many crimes occur in her neighbourhood compared with other neighbourhoods in the city. She searches various open data infrastructures for the data that she is looking for. When she finds up-to-date open crime data, she downloads and analyses them. According to the licence, the data may be used in various forms, both non-commercially and commercially. Data visualisations help the citizen to understand the data. However, she has only limited information about the quality of the dataset and about the provider of the data, which reduces her trust in the data. The open data infrastructure that the citizen uses does not only allow governmental organisations to open datasets, but offers this capability to any user of the infrastructure.

Table 1. Open data attributes matrix

Attribute            Attribute value                Score (s)
Type of user         Government                     0.2
                     Researcher                     0.4
                     Citizen                        0.6
                     Student                        0.8
                     Company                        1.0
Purpose of use       Information                    0.2
                     Research                       0.4
                     Commercial                     0.6
                     Sharing                        0.8
                     Unknown                        1.0
Type of data         Static                         0.33
                     Updated                        0.67
                     Real-time                      1.0
Data criticality     Low                            0.25
                     Low-medium                     0.50
                     Medium-high                    0.75
                     High                           1.00
Restrictions of use  None                           0.25
                     Type of user/purpose of use    0.50
                     Restricted by country          0.75
                     Restricted by network          1.00
Need for openness    Low                            0.33
                     Medium                         0.67
                     High                           1.00
Trust                Low                            0.25
                     Low-medium                     0.50
                     Medium-high                    0.75
                     High                           1.00

Table 2. Mapping PRI to PRMM

PRI          Score        PRMM
Low          0.00–0.25    Level 1: Abandon identities
Low-medium   0.25–0.50    Level 2: Remove semi-identifiers
Medium-high  0.50–0.75    Level 3: Exclude delicate data
High         0.75–1.00    Level 4: Discard publishing


This citizen also wants to share some data herself. She has collected observations of theft in the shop that she owns, and she submits these data online as open data. This means that the citizen both downloads and uploads open data. An overview of the open data characteristics for all scenarios is given in Table 3.

Using the proposed model and the scores in Table 1, the PRI can be calculated with Eq. (1): PRI = 0.61. According to Table 2, this PRI is medium-high, meaning a relatively high privacy risk, with the associated PRMM set at level 3: exclude delicate data. The data publisher should therefore remove identifying information, semi-identifiers and sensitive data from the published data to avoid this expected privacy risk. A worked sketch of this calculation is given after Table 3.

Table 3. Scenarios overview

S1: Type of user: citizen; Purpose of use: use and upload open data about neighbourhood; Type of data: real-time; Data criticality: low; Restrictions of use: none; Openness: high; Trust level: high.
S2: Type of user: governmental archivist; Purpose of use: upload open social data; Type of data: static; Data criticality: low; Restrictions of use: none; Openness: high; Trust level: low-medium.
S3: Type of user: student; Purpose of use: use open data for study; Type of data: static; Data criticality: low-medium; Restrictions of use: purpose of use, type of user; Openness: medium; Trust level: medium-high.
S4: Type of user: researcher; Purpose of use: use open data for research; Type of data: static, updated frequently; Data criticality: medium-high; Restrictions of use: physical presence, type of user; Openness: low; Trust level: low.
S5: Type of user: civil servant; Purpose of use: use data provided by own organization; Type of data: real-time and static, updated frequently; Data criticality: high; Restrictions of use: physical presence, type of user; Openness: low; Trust level: low.
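For concreteness, the sketch below recomputes the scenario S1 indicator from the Table 1 scores. Equal weights and scoring the purpose of use as "information" (0.2) are assumptions made here, since the paper does not spell out the exact weighting; under these assumptions the result falls in the reported medium-high band.

# Python sketch: scenario S1 scored with Table 1 values. Equal weights and
# the "information" purpose score are assumptions made for illustration.
s1_scores = {
    "type of user": 0.6,          # citizen
    "purpose of use": 0.2,        # scored here as "information" (assumption)
    "type of data": 1.0,          # real-time
    "data criticality": 0.25,     # low
    "restrictions of use": 0.25,  # none
    "need for openness": 1.0,     # high
    "trust": 1.0,                 # high
}

pri = sum(s1_scores.values()) / len(s1_scores)
print(round(pri, 2))  # 0.61 -> medium-high band, PRMM level 3 (Table 2)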

6.2 Scenario S2: Open Social Data Provisioning

An archivist working for a governmental agency maintains the open data infrastructure of this agency. Datasets cannot be uploaded by anyone, but only by an employee of the governmental organisation. The archivist has the task of making various social datasets, which the agency's employees have found suitable for publication, available to the public. The archivist uploads static datasets that are non-sensitive, so that the risk of privacy breaches is limited. The datasets can be reused by anyone; there are no restrictions regarding the type of user or the purpose of use. Since the datasets are provided online with extensive metadata, including information about the quality of the dataset, the trust issues that users may have in using the dataset are reduced. An overview of the open data characteristics of this scenario is given in Table 3. Analogous to scenario S1, PRI = 0.39, corresponding to a low-medium privacy risk, and the PRMM is set at level 2: remove semi-identifiers. This implies removing identifying information as well.

6.3 Scenario S3: Use of Restricted Archaeology Data

A student conducts research in the area of archaeology. To obtain access to the data, the student has to submit a request to the organisation that owns the data. In this request, the student has to provide information about himself, his research and the purpose for which he wants to use the data. Based on this information, the governmental organisation can decide to provide more sensitive data than the data that it offers with open access. More sensitive data can be disclosed to this single user under the condition that he will not pass the data on to others. Since the user can directly contact the data provider, trust issues are much lower than they may be for other (open) datasets. An overview of this scenario is given in Table 3. Using the proposed model, PRI = 0.54 and the PRMM is at level 3: exclude delicate data. A special contractual agreement can be put in place with this particular student before delicate data are shared with him; otherwise, these data have to be removed.

6.4 Scenario S4: Use of Physically Restricted Statistics Data

A researcher would like to use statistics provided by a governmental statistics agency. The agency has been opening data for many years and has a good reputation in this area, since it offers high-quality data. The researcher therefore trusts the data of the statistics agency and believes that he can reuse these data for his own research. While the researcher can access various open datasets on the web, some datasets are provided in a more restricted form. To access the more sensitive datasets, the researcher has to go to the statistics office in person. The agency does not open these sensitive data, since this may lead to privacy breaches. The researcher can analyse the data at the location of the statistics office, but he is not allowed to take any data with him or to publish these data as open data. Since the researcher has to travel to the statistics office physically, the office can gain insight into the purposes for which the researcher wants to use the data and, based on this purpose, approve or reject the use of its data. An overview of this scenario is given in Table 3. Using the proposed model, PRI = 0.51 and the PRMM is at level 3: exclude delicate data. This means that, before sharing these data with the researcher, all sensitive data have to be removed, together with identifying information and semi-identifiers.

6.5 Scenario S5: Use of Physically Restricted Agency Data

A civil servant may not only be involved in opening datasets, but may also reuse datasets provided by her own organisation. The agency's data can only be accessed internally by employees who are present at the office, and access is therefore restricted by type of user and by physical boundaries. The datasets are both real-time and static, and they are updated frequently. The agency's data are highly sensitive, since they have not been anonymized and sensitive data have not been removed. The data cannot be used by everyone and are not open.


The trust of the data user is high, since the user knows the context in which the data have been created and has access to colleagues who can answer questions about the data if necessary. An overview of this scenario is given in Table 3. Using the proposed model, PRI = 0.68 and the PRMM is at level 3: exclude delicate data. In all of the cases above, the risk of attack was assumed to be low and was therefore not included in the evaluations.

From the above we see that, for different scenarios involving the same dataset, we may obtain different privacy risks and therefore need to consider applying different measures to mitigate these risks (see Table 4). Applying the proposed model has given insight into this relationship between the datasets and the scenarios, based on the privacy risk scores associated with these cases. This insight helps in applying the appropriate privacy risk mitigation measure (PRMM) before publishing the data openly.

Table 4. Overview of scenarios evaluation

Scenario   PRI           PRMM
S1         Medium-high   Level 3: exclude delicate data
S2         Low-medium    Level 2: remove semi-identifiers
S3         Medium-high   Level 3: exclude delicate data
S4         Medium-high   Level 3: exclude delicate data
S5         Medium-high   Level 3: exclude delicate data

7 Conclusions

The opening and sharing of data is regularly hindered by security and privacy concerns. Most work on privacy assesses privacy breaches by evaluating organizations' methods and procedures for handling personal data, and their maturity in doing so, against benchmarks and common practices. These frameworks cannot be used directly for open data platforms, because the published data should not contain personally identifiable information (PII) in the first place. In this paper, however, we showed that PII can still be revealed in various ways, even after it has been removed. We also argued for the need to assess the different scenarios associated with the use of a dataset before deciding whether to open the data.

References

1. Janssen, M., van den Hoven, J.: Big and Open Linked Data (BOLD) in government: a challenge to transparency and privacy? Gov. Inf. Q. 32(4), 363–368 (2015)
2. European Parliament and the Council of the European Union: Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data (1995)


3. European Commission: Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions. Towards better access to scientific information: boosting the benefits of public investments in research (2012). Accessed 6 Oct 2013
4. OECD: OECD Recommendation of the Council for enhanced access and more effective use of Public Sector Information (2008). http://www.oecd.org/dataoecd/41/52/44384673.pdf. Accessed 8 Nov 2011
5. ISO/IEC 29100: International Standard ISO/IEC Information technology - Security techniques - Privacy framework (2011)
6. Kroener, I., Wright, D.: A strategy for operationalizing privacy by design. Inf. Soc. 30(5), 355–365 (2014)
7. ISACA AICPA/CICA: Privacy Maturity Model (2011)
8. Revoredo, M., et al.: A privacy maturity model for cloud storage services. In: Proceedings of the 7th International Conference on Cloud Computing (2014)
9. Wright, D.: The state of the art in privacy impact assessment. Comput. Law Secur. Rev. 28(1), 54–61 (2012)
10. Blackmer, W.S.: GDPR: getting ready for the new EU General Data Protection Regulation. In: Information Law Group (2016)
11. James, T.L., Warkentin, M., Collignon, S.E.: A dual privacy decision model for online social networks. Inf. Manag. 52, 893–908 (2015)
12. Narayanan, A., Shmatikov, V.: Robust de-anonymization of large sparse datasets. In: Proceedings of the IEEE Symposium on Security and Privacy, pp. 111–125 (2008)
13. Zuiderwijk, A., Janssen, M.: Towards decision support for disclosing data: closed or open data? Inf. Polit. 20(2), 103–117 (2015)
14. Xu, L., et al.: Information security in big data: privacy and data mining. IEEE Access 2, 1149–1176 (2014)
15. Randall, S.M., et al.: Privacy-preserving record linkage on large real world datasets. J. Biomed. Inform. 50, 205–212 (2014)
16. Eldin, A., Wagenaar, R.: Towards autonomous user privacy control. Int. J. Inf. Sec. Priv. 1(4), 24–46 (2007)
17. Jones, J.A.: An Introduction to Factor Analysis of Information Risk (FAIR) (2005). http://www.fairinstitute.org/. Accessed 13 Dec 2016
18. Ali-Eldin, A., Wagenaar, R.: A fuzzy logic based approach to support users self control of their private contextual data retrieval. In: European Conference on Information Systems (ECIS). Association for Information Systems (AISeL), Turku (2004)
19. Ali-Eldin, A., van den Berg, J., Ali, H.: A risk evaluation approach for authorization decisions in social pervasive applications. Comput. Electr. Eng. 55, 59–72 (2016)
20. Government of the Netherlands: Risk of an attack (threat level). https://www.government.nl/topics/counterterrorism-and-national-security/risk-of-an-attack-threat-level. Accessed 28 Jan 2018
21. Anonymizer. http://www.eyedea.cz/image-data-anonymization/. Accessed 1 Mar 2017
22. ARX: Data Anonymization Tool. http://arx.deidentifier.org/. Accessed 1 Mar 2017
23. Camouflage's CX-Mask. https://datamasking.com/products/static-masking/. Accessed 1 Mar 2017
24. Fung, B.C., et al.: Privacy preserving data publishing: a survey of recent developments. ACM Comput. Surv. 42(4) (2010)
25. Shi, P., Xiong, L., Fung, B.: Anonymizing data with quasi-sensitive attribute value. In: Proceedings of the 19th ACM International Conference (2010)


26. Motwani, R., Xu, Y.: Efficient algorithms for masking and finding quasi-identifiers. In: Proceedings of the Conference on Very Large Data Bases (VLDB) (2007)
27. Shadish, W.R., Cook, T.D., Campbell, D.T.: Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Houghton-Mifflin, Boston (2002)
28. Nessus: Nessus Vulnerability Scanner. https://www.tenable.com/products/nessus-vulnerability-scanner. Accessed 1 Mar 2017
29. Ali-Eldin, A.M.T., Hafez, E.A.: Towards a universal architecture for disease data models sharing and evaluation. In: 2017 International Symposium on Networks, Computers and Communications (ISNCC) (2017)
30. Josuttis, N.M.: SOA in Practice: The Art of Distributed System Design. O'Reilly, Sebastopol (2007)
31. Ali-Eldin, A.M.T.: Towards a shared public electronic services framework. Int. J. Comput. Appl. 93(14), 48–52 (2014)
32. Abeysinghe, S.: RESTful PHP Web Services. Packt Publishing, Birmingham (2008)

Author Index

Ali-Eldin, Amr 186
Cordeiro, José 134
Franck, Thijs 112
Franz, Peter 148
Gronau, Norbert 1
Grum, Marcus 1
Gusain, Rakesh 148
Iacob, Maria-Eugenia 112
Janssen, Marijn 87, 186
Karras, Dimitrios A. 166
Kirchmer, Mathias 148
Kreiner, Christian 67
Oberhauser, Roy 25
Oppermann, Felix Jonathan 67
Orthacker, Clemens 67
Papademetriou, Rallis C. 166
Potzmader, Klaus 67
Pühringer, Peter 67
Shishkov, Boris 87
Sinnhofer, Andreas Daniel 67
Steger, Christian 67
Stigler, Sebastian 25
Suurmond, Coen 49
van Sinderen, Marten 112
Wombacher, Andreas 112
Zuiderwijk, Anneke 186