Transactions on Engineering Technologies: International MultiConference of Engineers and Computer Scientists 2018 [1st ed. 2020] 978-981-32-9807-1, 978-981-32-9808-8

This book contains revised and extended research articles written by prominent researchers, selected from presentations at the International MultiConference of Engineers and Computer Scientists 2018 (IMECS 2018), held in Hong Kong, March 14-16, 2018.


English · Pages ix, 364 [371] · Year 2020


Table of contents:
Front Matter ....Pages i-ix
An Ensemble Kalman Filtering Approach for Discrete-Time Inverse Optimal Control Problems (Andrea Arnold, Hien Tran)....Pages 1-12
Deep Learning Based Structural Health Monitoring Framework with Electromechanical Impedance Method (Alex W. H. Choy, Daniel P. K. Lun)....Pages 13-24
Flexible Control Schemes for Grid-Tied Inverters Under Unbalanced Grid Voltage Conditions (Chao-Tsung Ma)....Pages 25-38
Multiphase DC-DC Boost Converter: Introduction to Controller Design (Vaishali Chapparya, Prakash Dwivedi, Sourav Bose)....Pages 39-53
Evaluation of the Visibility of Color Representation for Cell-Based Evacuation Guidance Simulation (Toshihiro Naka, Tomoko Izumi, Takayoshi Kitamura, Yoshio Nakatani)....Pages 54-68
Identifying Prophetic Bloggers Based on Prediction Ability of Buzzwords and Categories (Jianwei Zhang, Yoichi Inagaki, Reyn Nakamoto, Shinsuke Nakajima)....Pages 69-81
Query Generation for Web Search Based on Spatio-Temporal Features of TV Program (Honoka Kakimoto, Yuanyuan Wang, Yukiko Kawai, Kazutoshi Sumiya)....Pages 82-92
End to End Internet Traffic Measurement Model Based on Compressive Sampling (Indrarini Dyah Irawati, Andriyan Bayu Suksmono, Ian Joseph Matheus Edward)....Pages 93-104
Approach to the Segmentation of Buttons from an Elevator Inside Door Image (Yung-Sheng Chen, Yu-Ching Hsu)....Pages 105-118
A Hybrid Visual Stochastic Approach to Dairy Cow Monitoring System (Thi Thi Zin, Pyke Tin, Ikuo Kobayashi, Hiromitsu Hama)....Pages 119-129
Pre- and Post-survey of the Achievement Result of Novice Programming Learners - On the Basis of the Scores of Puzzle-Like Programming Game and Exams After Learning the Basic of Programming - (Tomoya Iwamoto, Shimpei Matsumoto, Shuichi Yamagishi, Tomoko Kashima)....Pages 130-142
A Proposal for an Impatience of Scoring Method for a Text-Based Smartphone Emergency Report (Yudai Higuchi, Takayoshi Kitamura, Tomoko Izumi, Yoshio Nakatani)....Pages 143-152
Gazing Point Comparison Between Expert and Beginner DJs for Acquisition of Basic Skills (Kazuhiro Minami, Takayoshi Kitamura, Tomoko Izumi, Yoshio Nakatani)....Pages 153-164
A Comparative Analysis of Image Segmentation Methods with Multivarious Background and Intensity (Erwin, Saparudin, Diah Purnamasari, Adam Nevriyanto, Muhammad Naufal Rachmatullah)....Pages 165-181
Communication Interruption Between a Game Tree and Its Leaves (Toshio Suzuki)....Pages 182-193
Intermittent Snapshot Method for Data Synchronization to Cloud Storage (Yuichi Yagawa, Mitsuo Hayasaka, Nobuhiro Maki, Shin Tezuka, Tomohiro Murata)....Pages 194-206
Extraction and Graph Structuring of Variants By Detecting Common Parts of Frequent Clinical Pathways (Muneo Kushima, Yuichi Honda, Hieu Hanh Le, Tomoyoshi Yamazaki, Kenji Araki, Haruo Yokota)....Pages 207-218
Important Index of Words for Dynamic Abstracts Based on Surveying Reading Behavior (Haruna Mori, Ryosuke Yamanishi, Yoko Nishihara)....Pages 219-232
New Chebyshev Operational Matrix for Solving Caputo Fractional Static Beam Problems (Thanon Korkiatsakul, Sanoe Koonprasert, Khomsan Neamprem)....Pages 233-246
Stability Analysis and Solutions of a Fractional-Order Model for the Glucose-Insulin Homeostasis in Rats (Natchapon Lekdee, Sekson Sirisubtawee, Sanoe Koonprasert)....Pages 247-261
On Asymptotic Stability Analysis and Solutions of Fractional-Order Bloch Equations (Sekson Sirisubtawee)....Pages 262-275
A Hybrid Delphi Multi-criteria Sorting Approach for Polypharmacy Evaluations (Anissa Frini, Caroline Sirois, Marie-Laure Laroche)....Pages 276-290
Risk Averse Scheduling for a Single Operating Room with Uncertain Durations (Mari Ito, Fumiya Kobayashi, Ryuta Takashima)....Pages 291-306
A Proposal of Diseases Words Classifying Method for Medical Hospitality (Hiroki Kozu, Yukio Maruyama, Tusyoshi Yuyama, Tomoya Hasegawa)....Pages 307-317
CFD Modelling of Rotating Annular Flow Using Wall y+ (Andrew A. Davidson, Salim M. Salim)....Pages 318-330
Risk Analysis of Danger Sources and Humanitarian Aid Supply Chains Due to Emergencies (Lorenzo Damiani, Roberto Revetria)....Pages 331-337
An Optimization Model of Sugarcane Harvesting with Fixed and Variable Costs Approximated by Fourier and Cubic Functions (Wisanlaya Pornprakun, Surattana Sungnul, Chanakarn Kiataramkul, Elvin J. Moore)....Pages 338-353
The Relationships Among Attitude, Perceived Learning and Perceived Engagement of the Use of Tablet PCS for Students’ Learning in Hong Kong Higher Education (Hon Keung Yau, Yuk Fung Leung)....Pages 354-361
Back Matter ....Pages 363-364


Sio-Iong Ao · Haeng Kon Kim · Oscar Castillo · Alan Hoi-shou Chan · Hideki Katagiri, Editors

Transactions on Engineering Technologies International MultiConference of Engineers and Computer Scientists 2018


Editors

Sio-Iong Ao, International Association of Engineers, Hong Kong, Hong Kong
Haeng Kon Kim, Department of Computer and Communication, Dae Gu Catholic University, Daegu, Korea (Republic of)
Oscar Castillo, Graduate Division, Tijuana Institute of Technology, Tijuana, Mexico
Alan Hoi-shou Chan, Department of SEEM, City University of Hong Kong, Kowloon, Hong Kong
Hideki Katagiri, Department of Industrial Engineering and Management, Kanagawa University, Yokohama, Japan

ISBN 978-981-32-9807-1 · ISBN 978-981-32-9808-8 (eBook) · https://doi.org/10.1007/978-981-32-9808-8

© Springer Nature Singapore Pte Ltd. 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

A large international conference on Advances in Engineering Technologies and Physical Science was held in Hong Kong, March 14-16, 2018, under the International MultiConference of Engineers and Computer Scientists 2018 (IMECS 2018). IMECS 2018 was organized by the International Association of Engineers (IAENG), a non-profit international association for engineers and computer scientists, founded in 1968 and undergoing rapid expansion in recent years. The IMECS conference serves as a good platform for the engineering community to meet and exchange ideas, and it strikes a balance between theoretical and application development. The conference committees were formed with over three hundred members, who are mainly research center heads, faculty deans, department heads, professors, and research scientists from over 30 countries; the full committee list is available at the conference Web site (http://www.iaeng.org/IMECS2018/committee.html). The conference is truly an international meeting with a high level of participation from many countries, and the response we received was excellent. There were more than 600 manuscript submissions for IMECS 2018. All submitted papers went through peer review, and the overall acceptance rate is 50.16%.

This volume contains 28 revised and extended research articles written by prominent researchers participating in the conference. Topics covered include electrical engineering, communications systems, engineering mathematics, and industrial applications. The book offers the state of the art of tremendous advances in engineering technologies and physical science and applications, and also serves as an excellent reference work for researchers and graduate students working on engineering technologies and physical science and applications.

Sio-Iong Ao
Haeng Kon Kim
Oscar Castillo
Alan Hoi-shou Chan
Hideki Katagiri


An Ensemble Kalman Filtering Approach for Discrete-Time Inverse Optimal Control Problems

Andrea Arnold (Department of Mathematical Sciences, Worcester Polytechnic Institute, Worcester, MA 01609, USA; [email protected]) and Hien Tran (Department of Mathematics, North Carolina State University, Raleigh, NC 27695, USA; [email protected])

Abstract. Solving the inverse optimal control problem for discrete-time nonlinear systems requires the construction of a stabilizing feedback control law based on a control Lyapunov function (CLF). However, there are few systematic approaches available for defining appropriate CLFs. We propose a method that utilizes nonlinear Bayesian filtering to parameterize a quadratic CLF. In particular, we use the ensemble Kalman filter (EnKF) to estimate parameters used in defining the CLF within the control loop of the inverse optimal control problem formulation. Using the EnKF in this setting provides a natural link between uncertainty quantification and optimal design and control, as well as a novel and intuitive way to find the one control out of an ensemble that stabilizes the system the fastest. Results of the EnKF CLF procedure are demonstrated on both a linear and nonlinear test problem.

Keywords: Bayesian inference · Control Lyapunov function · Ensemble Kalman filter · Inverse optimal control · Nonlinear filtering · Stabilizing feedback control

1 Introduction

The aim of optimal control [7, 21] is to determine a control law for a given system that minimizes a cost functional relating the state and control variables. For a linear dynamical system with the associated cost functional that is quadratic in the state and control, the optimal control is a linear state feedback law where the control gain is obtained by solving a differential/algebraic Riccati equation [2]. The widespread applicability of this linear-quadratic regulator (LQR) problem is a consequence of the successful development of robust and efficient algorithms for solving the Riccati equation. However, for a nonlinear dynamical system, the optimal state feedback control law is described in terms of the solution to the Hamilton-Jacobi-Bellman (HJB) equation, which is very difficult to solve analytically for general nonlinear systems [14, 20]. This has led to many computational methods proposed in the literature to obtain an approximate solution to


the HJB equation as well as to obtain a suboptimal feedback control for general nonlinear dynamical systems (see, e.g., [6] and the references therein). An alternate approach is to find a stabilizing feedback control law first, then establish that it optimizes a specified cost functional – this is known as the inverse optimal control problem. Solving the inverse optimal control problem for discrete-time nonlinear systems requires the construction of a stabilizing feedback control law based on a control Lyapunov function (CLF). However, there are few systematic approaches available for defining appropriate CLFs. Available methods parameterize quadratic CLFs using a recursive speed-gradient algorithm [17], particle swarm optimization [19] or, more recently, the extended Kalman filter [1]. This work develops a novel approach employing nonlinear Bayesian filtering methodology to parameterize a quadratic CLF. In particular, we use the ensemble Kalman filter (EnKF) to estimate the parameters used in defining the CLF within the control loop of the inverse optimal control problem. Using the EnKF in this setting provides a natural link between uncertainty quantification and optimal design and control, as well as an intuitive way to find the one control out of an ensemble that drives the system to zero the fastest. For preliminary results of this work, we refer the interested reader to [5].

In the Bayesian framework, unknown parameters are modeled as random variables with probability density functions representing distributions of possible values. The EnKF is a nonlinear Bayesian filter which uses ensemble statistics in combination with the classical Kalman filter equations for state and parameter estimation [3, 8, 10]. The EnKF has been employed in many settings, including weather prediction [12, 15] and mathematical biology [3, 4]. To the authors' knowledge, this is the first proposed use of the EnKF in inverse optimal control problems. The novelty of using the EnKF in this setting allows us to generate an ensemble of control laws, from which we can then select the control law that drives the system to zero the fastest. While the nonlinear problem has no guarantee of a unique control, we use the control ensemble to find the best solution starting from a prior distribution of possible controls.

The paper is organized as follows. We review the main ideas behind optimal control and inverse optimal control in Sect. 2 and nonlinear Bayesian filtering and the EnKF in Sect. 3. In Sect. 4, we describe the application of the EnKF to parametrizing the CLF for the inverse optimal control problem. The results in Sect. 5 demonstrate the effectiveness of the EnKF CLF procedure on both a linear and nonlinear test problem. We conclude in Sect. 6.

2 Optimal and Inverse Optimal Control

In this section we describe the optimal control problem and inverse optimal control problem for discrete-time nonlinear systems, using similar notation to that in [1, 18]. For details on feedback control methodology for nonlinear dynamic systems, see, e.g., [6]. Consider the discrete-time affine nonlinear system

$$x_{k+1} = f(x_k) + g(x_k)\,u_k, \qquad x_0 = x(0), \tag{1}$$

where $x_k \in \mathbb{R}^n$ is the state of the system at time $k$, $u_k \in \mathbb{R}^m$ is the control input at time $k$, and $f : \mathbb{R}^n \to \mathbb{R}^n$ and $g : \mathbb{R}^n \to \mathbb{R}^{n \times m}$ are smooth mappings with $f(0) = 0$ and $g(x_k) \neq 0$ for all $x_k \neq 0$. The nonlinear optimal control problem is to determine a control law $u_k$ that minimizes the associated cost functional

$$V(x_k) = \sum_{n=k}^{\infty} \left( L(x_n) + u_n^T E u_n \right), \tag{2}$$

where $V : \mathbb{R}^n \to \mathbb{R}^+$ has $V(0) = 0$, $L : \mathbb{R}^n \to \mathbb{R}^+$ is positive semidefinite, and $E$ is a real, symmetric positive definite $m \times m$ weighting matrix. The boundary condition $V(0) = 0$ is necessary so that $V(x_k)$ can be used as a CLF. The cost functional (2) can be rewritten as

$$V(x_k) = L(x_k) + u_k^T E u_k + V(x_{k+1}). \tag{3}$$

For an infinite horizon control problem, the time-invariant function $V^*(x_k)$ satisfies the discrete-time Bellman equation

$$V^*(x_k) = \min_{u_k} \left\{ L(x_k) + u_k^T E u_k + V^*(x_{k+1}) \right\}. \tag{4}$$

Taking the gradient of (4) with respect to $u_k$ yields the optimal control

$$u_k^* = -\frac{1}{2} E^{-1} g^T(x_k) \frac{\partial V^*(x_{k+1})}{\partial x_{k+1}}, \tag{5}$$

which, when substituted into (3), yields the discrete-time Hamilton-Jacobi-Bellman (HJB) equation

$$V^*(x_k) = L(x_k) + V^*(x_{k+1}) + \frac{1}{4} \frac{\partial V^{*T}(x_{k+1})}{\partial x_{k+1}} \, g(x_k)\, E^{-1} g^T(x_k) \, \frac{\partial V^*(x_{k+1})}{\partial x_{k+1}}. \tag{6}$$

Since solving the discrete-time HJB Eq. (6) is very difficult for general nonlinear systems, an alternative approach is to consider the inverse optimal control problem. In inverse optimal control, the first step is to construct a stabilizing feedback control law, then to establish that the control law optimizes a given cost functional. By definition, the control law

$$u_k^* = -\frac{1}{2} E^{-1} g^T(x_k) \frac{\partial V^*(x_{k+1})}{\partial x_{k+1}} \tag{7}$$

is inverse optimal if it satisfies the following two criteria [18]:

1. It achieves (global) exponential stability of the equilibrium point $x_k = 0$ for the system (1).
2. It minimizes the defined cost functional (2), for which $L(x_k) = -\bar{V}$ with

$$\bar{V} := V(x_{k+1}) - V(x_k) + u_k^{*T} E u_k^* \leq 0, \tag{8}$$

where $V(x_k)$ is positive definite. The inverse optimal control is, therefore, characterized by the function $V(x_k)$. A control law satisfying the above definition can be defined using a quadratic control Lyapunov function (CLF) of the form

$$V(x_k) = \frac{1}{2} x_k^T P x_k, \tag{9}$$


where the matrix $P \in \mathbb{R}^{n \times n}$ is symmetric positive definite (i.e., $P = P^T > 0$). Once an appropriate CLF (9) has been selected, the state feedback control law (7) becomes

$$u_k^* = -\frac{1}{2} \left( E + \frac{1}{2}\, g^T(x_k)\, P\, g(x_k) \right)^{-1} g^T(x_k)\, P\, f(x_k). \tag{10}$$

It is noted that since E and gT (xk )Pg(xk ) are symmetric positive definite matrices, the inverse matrix in the optimal feedback control law (10) is guaranteed to exist. Therefore, the problem at hand is to select an appropriate matrix P to achieve stability and minimize a meaningful cost function. As noted in the introduction, currently proposed methods to estimate the entries of the matrix P in the CLF (9) include a recursive speed-gradient algorithm [17], particle swarm optimization [18], and, more recently, use of the extended Kalman filter [1]. In this work, we propose use of nonlinear Bayesian filtering techniques, in particular the EnKF, to estimate the entries of the matrix P from a distribution of possible values, which allows us to find the best control out of an ensemble.
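To make Eq. (10) concrete, the following minimal Python sketch evaluates the feedback law for given f, g, E, and a candidate P. The function name and calling convention are illustrative assumptions, not code from the paper.

```python
import numpy as np

def inverse_optimal_control(x, f, g, E, P):
    """Evaluate the state feedback law of Eq. (10):
    u* = -1/2 (E + 1/2 g^T P g)^{-1} g^T P f(x).
    x: state (n,), f(x) -> (n,), g(x) -> (n, m), E: (m, m), P: (n, n)."""
    gx, fx = g(x), f(x)
    M = E + 0.5 * gx.T @ P @ gx      # m x m, symmetric positive definite
    return -0.5 * np.linalg.solve(M, gx.T @ P @ fx)
```

As the text notes, M is guaranteed invertible because E and g^T P g are symmetric positive definite, so the solve never fails for a valid P.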

3 Nonlinear Bayesian Filtering and the EnKF

We approach the solution to the inverse optimal control problem from the Bayesian statistical framework, using nonlinear Bayesian filtering methodology to parameterize the quadratic CLF. In the Bayesian framework, the quantities of interest (such as the system states or parameters) are treated as random variables with probability distributions, and their joint posterior density is assembled using Bayes' theorem. In particular, if $x$ denotes the states of a system and $y$ some partial, noisy system observations, then Bayes' theorem gives

$$\pi(x \mid y) \propto \pi(y \mid x)\,\pi(x), \tag{11}$$

where the likelihood function $\pi(y \mid x)$ indicates how likely it is that the data $y$ are observed if the state values were known, and the prior distribution $\pi(x)$ encodes any known information on the states before taking the data into account.

Bayesian filtering methods rely on the use of discrete-time stochastic equations describing the model states and observations to sequentially update the joint posterior density. Assuming a time discretization $t_k$, $k = 0, 1, \ldots, T$, with the observations $y_k$ occurring possibly in a subset of the discrete time instances (where $y_k = \emptyset$ if there is no observation at $t_k$), we can write an evolution-observation model for the stochastic state estimation problem using discrete-time Markov models. The state evolution equation

$$X_{k+1} = F(X_k) + V_{k+1}, \qquad V_{k+1} \sim \mathcal{N}(0, Q_{k+1}), \tag{12}$$

where $F$ is a known propagation model and $V_{k+1}$ is an innovation process, computes the forward time propagation of the state variables, while the observation equation

$$Y_{k+1} = G(X_{k+1}) + W_{k+1}, \qquad W_{k+1} \sim \mathcal{N}(0, R_{k+1}), \tag{13}$$

where $G$ is a known operator and $W_{k+1}$ is the observation noise, predicts the observation at time $t_{k+1}$ based on the current state (and parameter) values.


Letting $D_k = \{ y_1, y_2, \ldots, y_k \}$ denote the set of observations up to time $t_k$, the stochastic evolution-observation model allows us to sequentially update the posterior distribution $\pi(x_k \mid D_k)$ using a two-step, predictor-corrector-type scheme:

$$\pi(x_k \mid D_k) \;\rightarrow\; \pi(x_{k+1} \mid D_k) \;\rightarrow\; \pi(x_{k+1} \mid D_{k+1}). \tag{14}$$

The first step (the prediction step) employs the state evolution Eq. (12) to predict the values of the states at time $t_{k+1}$ without knowledge of the data. The second step (the analysis step or observation update) then uses the observation Eq. (13) to correct the prediction by taking into account the data at time $t_{k+1}$. If there is no data observed at $t_{k+1}$, then $D_{k+1} = D_k$ and the prediction density $\pi(x_{k+1} \mid D_k)$ is equivalent to the posterior $\pi(x_{k+1} \mid D_{k+1})$. Starting with a prior density $\pi(x_0 \mid D_0)$, $D_0 = \emptyset$, this updating scheme is repeated until the final posterior density is obtained when $k = T$.

3.1 Ensemble Kalman Filter

As noted in the introduction, the ensemble Kalman filter (EnKF) is a nonlinear Bayesian filter that uses ensemble statistics in combination with the classical Kalman filter equations to accommodate nonlinear models [8, 10]. While there are versions of the EnKF that perform joint state and parameter estimation [3, 11], for our purposes we need only consider the standard EnKF for state estimation, which will be adapted in the following section for the inverse optimal control problem. To avoid confusion with the states of the control system (1), here we denote the states in the filter as $a_k$, $k = 0, \ldots, T$, as opposed to the typical $x_k$ notation.

The EnKF algorithm for state estimation is outlined as follows. Assume the current density $\pi(a_k \mid D_k)$ at time $t_k$ is represented in terms of a discrete ensemble of size $N$, denoted as

$$S_{k|k} = \left\{ a^1_{k|k},\; a^2_{k|k},\; \ldots,\; a^N_{k|k} \right\}. \tag{15}$$

In the prediction step, the states at time $t_{k+1}$ are predicted using the state evolution Eq. (12) to form a state prediction ensemble, given by

$$a^j_{k+1|k} = F(a^j_{k|k}) + v^j_{k+1}, \qquad j = 1, \ldots, N, \tag{16}$$

where $v^j_{k+1} \sim \mathcal{N}(0, Q_{k+1})$ represents error in the model prediction. Ensemble statistics yield the prediction ensemble mean

$$\bar{a}_{k+1|k} = \frac{1}{N} \sum_{j=1}^{N} a^j_{k+1|k} \tag{17}$$

and prediction ensemble covariance matrix

$$\Gamma_{k+1|k} = \frac{1}{N-1} \sum_{j=1}^{N} \left( a^j_{k+1|k} - \bar{a}_{k+1|k} \right)\left( a^j_{k+1|k} - \bar{a}_{k+1|k} \right)^T. \tag{18}$$
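As a quick illustration, with the ensemble stored as a NumPy array (one member per row, an assumed layout), Eqs. (17)-(18) are one-liners; note the 1/(N-1) normalization in (18):

```python
import numpy as np

# Illustrative ensemble: N = 1000 members of an n = 2 dimensional state
ensemble = np.random.randn(1000, 2)

a_mean = ensemble.mean(axis=0)                    # prediction mean, Eq. (17)
gamma = np.cov(ensemble, rowvar=False, ddof=1)    # covariance with 1/(N-1), Eq. (18)
```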


When an observation $y_{k+1}$ arrives, an artificial observation ensemble is generated around the true observation, such that

$$\tilde{y}^j_{k+1} = y_{k+1} + w^j_{k+1}, \qquad j = 1, \ldots, N, \tag{19}$$

where $w^j_{k+1} \sim \mathcal{N}(0, R_{k+1})$ represents the observation error. The observation ensemble is compared to the observation model predictions

$$\hat{y}^j_{k+1} = G(a^j_{k+1|k}), \qquad j = 1, \ldots, N, \tag{20}$$

which are computed using the observation function $G$ as defined in (13). The posterior ensemble at time $t_{k+1}$ is then computed by

$$a^j_{k+1|k+1} = a^j_{k+1|k} + K_{k+1} \left( \tilde{y}^j_{k+1} - \hat{y}^j_{k+1} \right) \tag{21}$$

for each $j = 1, \ldots, N$, where the Kalman gain $K_{k+1}$ is defined as

$$K_{k+1} = \Sigma^{ay}_{k+1} \left( \Sigma^{y}_{k+1} + R_{k+1} \right)^{-1}. \tag{22}$$

In the Kalman gain formula (22), $\Sigma^{ay}_{k+1}$ denotes the cross covariance of the state predictions $a^j_{k+1|k}$ and observation predictions $\hat{y}^j_{k+1}$, $\Sigma^{y}_{k+1}$ the forecast error covariance of the observation prediction ensemble, and $R_{k+1}$ the observation noise covariance. This formulation of the Kalman gain straightforwardly allows for nonlinear observations, as opposed to the more familiar formula for linear observation models [16]. Use of the artificial observation ensemble (19) ensures that the resulting posterior ensemble in (21) does not have too low a variance [8]. The posterior means and covariances for the states are then computed using posterior ensemble statistics, and the process repeats.

4 EnKF CLF Procedure for Inverse Optimal Control

To apply the EnKF to the inverse optimal control problem, we treat the entries of the symmetric positive definite $P$ defining the quadratic CLF in (9) as the states of the filter and apply the following updating procedure to find the control that drives the system to zero the fastest. At time $k$, assume a discrete ensemble of $P$ matrices

$$P^j_{k|k} = \begin{bmatrix} P^j_{1,1} & \cdots & P^j_{1,n} \\ \vdots & \ddots & \vdots \\ P^j_{1,n} & \cdots & P^j_{n,n} \end{bmatrix} \in \mathbb{R}^{n \times n}, \qquad j = 1, \ldots, N, \tag{23}$$

where each matrix $P^j_{k|k}$ is symmetric positive definite. Using symmetry to our advantage, we need only update the upper triangular entries of the $P$ matrices, which we place into the vectors

$$p^j_{k|k} = \begin{bmatrix} P^j_{1,1} \\ \vdots \\ P^j_{1,n} \\ \vdots \\ P^j_{n,n} \end{bmatrix} \in \mathbb{R}^{\bar{n}}, \qquad \bar{n} = \frac{n(n+1)}{2}. \tag{24}$$


As in the prediction step of the filter, we generate a prediction ensemble

$$p^j_{k+1|k} = p^j_{k|k} + v^j_{k+1}, \qquad v^j_{k+1} \sim \mathcal{N}(0, Q), \tag{25}$$

where here the propagation function in Eq. (16) is multiplication by the identity matrix and the covariance of the innovation term $v^j_{k+1}$ is some constant matrix $Q$. Prediction ensemble statistics can be computed as in (17)-(18); however, they are not needed for the remaining computations.

Reformulating the prediction ensemble vectors $\{p^j_{k+1|k}\}_{j=1}^N$ back into matrices $\{P^j_{k+1|k}\}_{j=1}^N$, we can compute the corresponding predicted controls, states, and root mean square error (RMSE) values for each ensemble member using the following formulas. The predicted controls are given by

$$u^j_{k+1|k} = -\frac{1}{2} \left( E + \frac{1}{2}\, g^T(x^j_{k|k})\, P^j_{k+1|k}\, g(x^j_{k|k}) \right)^{-1} g^T(x^j_{k|k})\, P^j_{k+1|k}\, f(x^j_{k|k}) \tag{26}$$

as in (10) for each $j = 1, \ldots, N$, which are then used to generate the state prediction ensemble

$$x^j_{k+1|k} = f(x^j_{k|k}) + g(x^j_{k|k})\, u^j_{k+1|k}, \qquad j = 1, \ldots, N, \tag{27}$$

as in the nonlinear system (1).

For the analysis step of the filter, we interpret as "observations" the RMSE values of the states as we drive them to zero. Since the aim is to find a control that drives the RMSE to zero, we treat RMSE = 0 as the true "observation" and generate an observation ensemble using the prescribed observation noise covariance matrix $R$ as follows:

$$\mathrm{RMSE}^j_{\mathrm{obs}} = w^j_{k+1}, \qquad w^j_{k+1} \sim \mathcal{N}(0, R). \tag{28}$$

We then compare the "observed" RMSEs to the RMSEs of the predicted states, given by

$$\mathrm{RMSE}^j_{k+1|k} = \sqrt{ \frac{ (x^j_{k+1|k})_1^2 + (x^j_{k+1|k})_2^2 + \cdots + (x^j_{k+1|k})_n^2 }{ n } }, \tag{29}$$

and compute the posterior ensemble as in (21) using the formula

$$p^j_{k+1|k+1} = p^j_{k+1|k} + K_{k+1} \left( \mathrm{RMSE}^j_{\mathrm{obs}} - \mathrm{RMSE}^j_{k+1|k} \right), \tag{30}$$

where the Kalman gain is defined as in (22), with $\Sigma^{ay}_{k+1}$ denoting the cross covariance of the predictions $p^j_{k+1|k}$ and RMSE predictions $\mathrm{RMSE}^j_{k+1|k}$, $\Sigma^{y}_{k+1}$ the forecast error covariance of the RMSE prediction ensemble, and $R$ the observation noise covariance. Posterior control law, state, and RMSE ensembles can be computed after reformulating the posterior ensemble of entry vectors $\{p^j_{k+1|k+1}\}_{j=1}^N$ into their corresponding matrices $\{P^j_{k+1|k+1}\}_{j=1}^N$, and ensemble statistics can be computed. This process is repeated for each successive time step until an appropriate control is found, based on some prescribed stopping criterion. In particular, if we want to find the control that drives the system to zero the fastest, we can stop when the minimum RMSE of all ensemble members is less than some prescribed tolerance. We refer to this process as the EnKF CLF procedure, and the resulting EnKF CLF corresponds to the control law with minimum RMSE that stops the algorithm, i.e., drives the system to zero the fastest.
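The whole procedure fits in a short loop. The sketch below instantiates Eqs. (23)-(30) on the linear test system of Sect. 5.1; the packing helper, loop bounds, and the sizing of Q over the three packed entries of P are our own illustrative assumptions, not the authors' code, and a uniform prior on the entries does not by itself guarantee each sampled P is positive definite.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.9974, 0.0539], [-0.1078, 1.1591]])   # system matrix, Eq. (31)
B = np.array([[0.0013], [0.0539]])                    # input matrix, Eq. (31)
f = lambda x: A @ x
E = np.array([[0.05]])
Q = 1e-4 * np.eye(3)       # innovation covariance over the 3 packed P entries
R = 1e-3                   # scalar RMSE observation noise variance

unpack = lambda p: np.array([[p[0], p[1]], [p[1], p[2]]])   # inverse of Eq. (24)

N = 1000
p_ens = rng.uniform(0.05, 0.2, (N, 3))     # uniform prior on entries of P
x = np.tile([2.0, 1.0], (N, 1))            # x0 = [2, 1] for every member

for k in range(200):
    p_pred = p_ens + rng.multivariate_normal(np.zeros(3), Q, N)   # Eq. (25)
    x_pred = np.empty_like(x)
    for j in range(N):                                            # Eqs. (26)-(27)
        P, xj = unpack(p_pred[j]), x[j]
        M = E + 0.5 * B.T @ P @ B
        u = (-0.5 * np.linalg.solve(M, B.T @ P @ f(xj)))[0]
        x_pred[j] = f(xj) + (B * u).ravel()
    rmse = np.sqrt((x_pred ** 2).mean(axis=1))                    # Eq. (29)
    rmse_obs = rng.normal(0.0, np.sqrt(R), N)                     # Eq. (28)
    d = rmse - rmse.mean()
    gain = ((p_pred - p_pred.mean(0)).T @ d) / (d @ d + (N - 1) * R)
    p_ens = p_pred + np.outer(rmse_obs - rmse, gain)              # Eq. (30)
    x = x_pred
    if rmse.min() < 0.5:        # stopping rule from Sect. 5.1
        break
```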


5 Results

We illustrate the effectiveness of the proposed EnKF CLF method on two numerical examples, one involving a linear system and the other a nonlinear system.

5.1 Numerical Example: Linear System

For our first numerical example, we consider the discrete-time linear system

$$x_{k+1} = \begin{bmatrix} 0.9974 & 0.0539 \\ -0.1078 & 1.1591 \end{bmatrix} x_k + \begin{bmatrix} 0.0013 \\ 0.0539 \end{bmatrix} u_k \tag{31}$$

with initial point $x_0 = [2, 1] \in \mathbb{R}^2$. The goal is to minimize the performance measure

$$J = \frac{1}{2} \sum_{k=0}^{N-1} \left[ 0.25\,(x_1)_k^2 + 0.05\,(x_2)_k^2 + 0.05\,u_k^2 \right]$$

as described in [13]. We set up the EnKF CLF estimator by letting

$$E = 0.05, \qquad Q = q_0 I_2, \qquad R = r_0, \tag{32}$$

where $q_0 = 1 \times 10^{-4}$ and $r_0 = 1 \times 10^{-3}$. We generate a uniform prior of $N = 1{,}000$ ensemble members on the upper-triangular entries of the $P$ matrix, with minimum value 0.05 and maximum value 0.2. We set the stopping criterion such that the filter stops when min(RMSE) < 0.5.

Multiphase DC-DC Boost Converter: Introduction to Controller Design

Vaishali Chapparya, Prakash Dwivedi, and Sourav Bose

The input and output design requirements and the calculated inductor and capacitor values to achieve the desired performance are given in Table 2. Substituting these values in Eq. 26, the transfer function of the IBC is obtained as given in Eq. 27:

$$G(s) = \frac{\hat{v}_o(s)}{\hat{d}(s)} = \frac{82.5 \left(1 + s/(17.85 \times 10^6)\right)\left(1 - s/1238.15\right)}{\left(1 + s/2377.45\right)\left(1 + s/2624.49\right)} \tag{27}$$

It has been found that the IBC possesses non-minimum phase (NMP) behaviour, thus making the design of a stabilizing controller for the IBC a difficult task.

Table 2. Design requirements of the interleaved boost converter

Parameter description        Notation  Value    Units
Source voltage               Vs        30       V
Output voltage               Vo        50       V
Source current               Is        1.6667   A
Current through inductor L1  il1       0.833    A
Current through inductor L2  il2       0.833    A
Output current               Io        1        A
Duty ratio                   D/d       0.4      -
Switching frequency          fs        50       kHz
Inductor L1 ripple current   dIl1      0.0083   A
Inductor L2 ripple current   dIl2      0.0083   A
Output voltage ripple        dVo       0.5      V
Inductor                     L1        28.91    mH
Inductor                     L2        28.91    mH
Capacitor                    C         4        uF
Load resistance              R         50       Ohm
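The Table 2 values are mutually consistent with textbook boost-converter relations. The quick check below uses the standard ideal boost gain and per-phase ripple-based inductor sizing; these formulas are assumptions on our part, since the chapter's own design equations (Eq. 26 and its surrounding context) are not included in this excerpt.

```python
# Hedged sanity check of Table 2 using assumed textbook boost relations
Vs, D, fs = 30.0, 0.4, 50e3     # source voltage, duty ratio, switching frequency
dI = 0.0083                     # per-phase inductor current ripple, A

Vo = Vs / (1 - D)               # ideal boost gain -> 50.0 V (matches Table 2)
L = Vs * D / (dI * fs)          # ripple sizing -> ~0.0289 H, i.e. ~28.9 mH
print(Vo, L)                    # Table 2 lists 28.91 mH per phase
```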

5 Controller Design

Various controllers have been reported in the literature to improve the performance of the IBC, such as the PI controller [24], SiC MOSFET with fuzzy logic controller [25], PSO-based controller [26], robust model predictive control [27], and Type-III controller [28]. The Type-III controller is a lead compensator, which is equivalent to a PD controller. This type of controller is sensitive to noise due to its derivative action; further, parameter tuning is also required in the Type-III controller. The 2DOF digital controller mentioned in [29] involves microcontroller programming, and microcontrollers once programmed cannot easily be reprogrammed. Therefore loop shaping, a graphical technique, is suggested for designing the controller for this NMP system. Loop shaping has proved to be one of the most influential techniques in the development of the various multivariable control design methodologies [30-33]. The main aim of the technique is to draw the loop transfer function L(s) graphically so as to accomplish the robust performance criterion $\| |W_1 S| + |W_2 T| \|_\infty < 1$; then, via the relationship C = L/P, the controller transfer function C can be realized [33]. In the NMP case, the designer should keep in mind that the new gain crossover frequency should lie below the right-half-plane zero frequency, to take care of internal stability [23, 33]. In this study, the same concept is used to design a controller for the plant transfer function G(s) given in Eq. 27. This system has a right-half-plane zero at ω = 1238 rad/s. The plant becomes closed-loop unstable with unity feedback. Therefore a controller is required such that this zero is not cancelled by the controller transfer function, to take care of internal stability. The right-half-plane zero also limits the achievable bandwidth; therefore the crossover region of the compensated IBC should be below ω = 1238 rad/s [33].

Fig. 3. Bode plot of G(s)

The Bode plot of the plant G(s) is drawn in Fig. 3. The gain crossover frequency is 400,000 rad/s. As mentioned above, the crossover frequency should be less than ω = 1238 rad/s; therefore, in order to obtain the desired gain crossover frequency, the magnitude plot is reshaped to obtain the desired magnitude plot of L(s). To fulfill this design requirement, a line of -20 dB/dec is plotted over five decades, as shown in Fig. 4. The new magnitude plot of L(s), which comprises plant and controller, is obtained by drawing a line parallel to the above-mentioned -20 dB/dec line. This slope is drawn from 1 rad/s to 10^5 rad/s, and after that it follows the same slope as G(s), as shown in Fig. 4. The transfer function of L(s) is computed from its magnitude plot and is given in Eq. 28. The controller transfer function K(s) is computed as L(s)/G(s) and is given in Eq. 29.

$$L(s) = \frac{206.2 \left(1 + s/(1.78 \times 10^7)\right)\left(1 - s/(1.2 \times 10^3)\right)}{s \left(1 + s/(1.2 \times 10^3)\right)} \tag{28}$$

$$K(s) = \frac{2.5 \left(1 + s/(2.3 \times 10^3)\right)\left(1 + s/(2.6 \times 10^3)\right)}{s \left(1 + s/(1.2 \times 10^3)\right)} \tag{29}$$

6 Stability Analysis

This section deals with the stability assessment of the IBC. The assessment has been carried out in the frequency and time domains using Bode plots and step responses. The effect of gain variation on the system has also been studied.

Fig. 4. Bode magnitude plot with loop shaping

The uncompensated open-loop IBC has two poles and two zeros, with one of its zeros in the right half of the s-plane. In Fig. 3, G(s) shows an unstable Bode response with GM = -38.4 dB and PM = -87.8°. The Bode response of the compensated IBC L(s) is shown in Fig. 5, which is stable with GM = 15.8 dB and PM = 71.6° at K = 2.5. The step responses for G(s) and L(s) are given in Figs. 6 and 7. Figure 6 shows the unstable behaviour. It is observed from Fig. 7 that the system shows a stable response with settling time 0.0126 s. To further improve the transient parameters, the controller gain (Kc) can be varied. The respective Bode plots and step responses for variation in Kc are shown in Figs. 5 and 7. The time response and frequency response values for different values of Kc are tabulated in Tables 3 and 4.

Fig. 5. Effect of variation of Kc on frequency response
Fig. 6. Step response of G(s)
Fig. 7. Effect of variation of Kc on time response

Table 3. Time response characteristics

Compensator gain  Peak overshoot (Mp)  Settling time (ts)  Rise time (tr)
K = 1             0                    0.0424              0.0230
K = 2.5           0                    0.0126              0.0067
K = 5             11.87                0.0089              0.0023

Table 4. Frequency response characteristics

Compensator gain  Gain margin (dB)  Phase margin (deg)
K = 1             23.6              82.4
K = 2.5           15.8              71.6
K = 5             9.58              53.3

It is to be noted from Fig. 5 that at K = 1 the corresponding GM and PM are 23.6 dB and 82.4°, which shows that the system is stable at K = 1 as well. When the gain is increased to K = 5, the GM and PM reduce to 9.58 dB and 53.3°, respectively. It is also to be noted from Fig. 5 that increasing the compensator gain drives the system towards instability. The pragmatic relation PM/100 = ζ establishes a link between the time and frequency responses: if PM ≥ 100°, then ζ ≥ 1, which indicates that the system will be over-damped and sluggish. The same can be verified from Figs. 5 and 7.
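The margins in Tables 3 and 4 can be spot-checked numerically. The sketch below builds G(s) and K(s) from Eqs. (27)-(29) and evaluates the gain and phase margins of L(s) = (Kc/2.5)·K(s)·G(s) for the three gains, using the python-control package; treat it as an independent check under our reading of the equations, not the authors' tooling.

```python
import numpy as np
import control  # python-control package

def lin(w, sign=1.0):
    """First-order factor (1 + sign*s/w) as descending-power coefficients."""
    return np.array([sign / w, 1.0])

# G(s), Eq. (27)
numG = 82.5 * np.polymul(lin(17.85e6), lin(1238.15, -1.0))
denG = np.polymul(lin(2377.45), lin(2624.49))
G = control.tf(numG, denG)

# K(s), Eq. (29); denominator is s * (1 + s/1200)
numK = 2.5 * np.polymul(lin(2.3e3), lin(2.6e3))
denK = np.polymul([1.0, 0.0], lin(1.2e3))
K = control.tf(numK, denK)

for Kc in (1.0, 2.5, 5.0):
    L = (Kc / 2.5) * K * G          # rescale the built-in 2.5 baseline gain
    gm, pm, _, _ = control.margin(L)
    print(Kc, 20 * np.log10(gm), pm)  # compare against Table 4 rows
```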

7 Simulation Results

This section deals with the experimental validation of the control strategy for the IBC. The proposed loop shaping technique has been used to control the output voltage under variable input voltage. A Typhoon HIL-402 has been used to implement the control technique. The experimental setup of the Typhoon HIL-402 is shown in Fig. 12, which contains the HIL-402, a storage oscilloscope, and the power circuit model in the Typhoon HIL software. Figure 8 shows the variation of the input voltage to the IBC. In open loop, the output voltage varies in proportion to the input voltage for a particular duty ratio, as shown in Fig. 9.

Fig. 8. Open loop input voltage (voltage in V versus time in sec)

Fig. 9. Open loop output voltage (voltage in V versus time in sec)

A closed-loop controller needs to be designed to maintain a constant output voltage under fluctuating input voltage. A closed-loop controller has been designed using the loop shaping technique to regulate the output voltage. Figure 10 shows the closed-loop output voltage characteristic, which is almost constant. The duty ratios of the switches of the IBC are varied continuously to obtain the desired output voltage, and are controlled using the PWM technique. The modulating signal input to the PWM chip for the present case is shown in Fig. 11.

Fig. 10. Closed loop output voltage

Figure 13 shows the emulation results of the IBC in open loop. Channel 1 shows the input voltage, while Channel 2 shows the output voltage. It is observed that the output voltage changes in proportion to the variable input voltage, as the duty ratios of the switches are fixed in open loop. The desired output of 50 V is achieved only when the input voltage is 30 V. In the present study, the IBC consists of two inductors at the input side. The current characteristics of both inductors (Channel 1 shows il1 while Channel 2 shows il2) are shown in Fig. 14. Figure 15 shows the emulation results of the IBC in closed loop. Channel 1 shows the input voltage, while Channel 2 shows the output voltage. It is observed that the output voltage remains constant irrespective of the variable input voltage. The duty ratios of the switches of the IBC are varied continuously by the loop shaping controller to maintain the desired constant output voltage of 50 V. Figure 16 shows the inductor currents for the closed loop; Channel 1 shows the average current of inductor L1 while Channel 2 shows the average current of inductor L2.


Fig. 11. Modulating signal

Fig. 12. Hardware in-the-loop experimental setup

Fig. 13. HIL result of input and output voltage for open loop

Fig. 14. HIL result of inductor currents for open loop


Fig. 15. HIL result of input and output voltage for closed loop

Fig. 16. HIL result of inductor currents for closed loop

References

1. Denicia, E.P., Luqueno, F.F., Ayala, D.V., Zetina, M., Nikhar, A.R., Lopez, L.A.M.: Renewable energy sources for electricity generation in Mexico: a review. In: Renewable and Sustainable Energy Reviews, pp. 597-613 (2017)
2. Krishnan, M.S., Ramkumar, M.S., Sownthara, M.: Power management of hybrid renewable energy system by frequency deviation control. Int. J. Innov. Res. Sci. Eng. Technol. 3(3), 763-769 (2014)
3. Deshmukh, M.K., Deshmukh, S.S.: Modeling of hybrid renewable energy systems. Renew. Sustain. Energy Rev. 12(1), 235-249 (2008)
4. Singh, R.S.S., Abbod, M., Balachandran, W.: Low voltage hybrid renewable energy system management for energy storages charging-discharging. IEEE (2016)
5. Kwon, J.M., Kwon, B.H.: High step-up active-clamp converter with input-current doubler and output-voltage doubler for fuel cell power systems. IEEE Trans. Power Electron. 24(1), 108-115 (2009)
6. Zhu, L.: A novel soft-commutating isolated boost full-bridge ZVS-PWM DC-DC converter for bidirectional high power applications. IEEE Trans. Power Electron. 21(2), 422-429 (2006)
7. Hwu, K.I., Yau, Y.T.: An interleaved AC-DC converter based on current tracking. IEEE Trans. Ind. Electron. 56(5), 1456-1463 (2009)
8. Balogh, L., Redl, R.: Power-factor correction with interleaved boost converters in continuous-inductor-current mode. In: Proceedings Eighth Annual Applied Power Electronics Conference and Exposition, APEC 1993, pp. 168-174 (1993)
9. Swamy, H.M.M., Guruswamy, K.P., Singh, S.P.: Design, modeling and analysis of two level interleaved boost converter. In: 2013 International Conference on Machine Intelligence and Research Advancement (2013)
10. Newton, A., Green, T.C., Andrew, D.: AC/DC power factor correction using interleaving boost and Cuk converters. In: IEEE Power Electronics and Variable Speed, Conference Publication, No. 475, pp. 293-298 (2000)
11. Miwa, B.A., Otten, D.M., Schlecht, M.F.: High efficiency power factor correction using interleaving techniques. In: IEEE Applied Power Electronics Conference and Exposition, pp. 557-568 (1992)
12. Apte, S.M., Somalwar, R., Nikhar, A.R.: Review of various control techniques for DC-DC interleaved boost converters. In: 2016 International Conference on Global Trends in Signal Processing, Information Computing and Communication, pp. 432-437 (2016)
13. Revathi, B.S., Prabhakar, M.: Non isolated high gain DC-DC converter topologies for PV applications, a comprehensive review. Renew. Sustain. Energy Rev. 66, 920-933 (2016)
14. Kabalo, M., Blunier, B., Bouquain, D., Miraoui, A.: State-of-the-art of DC-DC converters for fuel cell vehicles. In: 2010 IEEE Vehicle Power and Propulsion Conference (VPPC), pp. 1-6. IEEE (2010)
15. Crews, R., Nielson, K.: Interleaving is good for boost converters too. Power Electron. Technol. 34(5), 24-29 (2008)
16. Khadmuna, W., Subsinghaa, W.: High voltage gain interleaved DC boost converter application for photovoltaic generation system. Energy Procedia 34, 390-398 (2013)
17. Khoucha, F., Benrabah, A., Herizi, O., Kheloui, A., Benbouzid, M.: An improved MPPT interleaved boost converter for solar electric vehicle application. In: IEEE POWERENG Conference, pp. 1076-1081 (2013)
18. Seyezhai, R., Mathur, B.L.: Design and implementation of interleaved boost converter for fuel cell systems. Int. J. Hydrog. Energy 37(43), 3897-3903 (2012)
19. Jang, Y., Jovanovic, M.M.: Interleaved boost converter with intrinsic voltage-doubler characteristic for universal-line PFC front end. IEEE Trans. Power Electron. 22(4), 1394-1401 (2007)
20. Liu, C., Johnson, A., Lai, J.-S.: A novel three-phase high-power soft-switched DC/DC converter for low-voltage fuel cell applications. IEEE Trans. Ind. Appl. 41(6), 1691-1697 (2005)
21. Krein, P.T., Bentsman, J., Bass, R.M., Lesieutre, B.L.: On the use of averaging for the analysis of power electronic systems. IEEE Trans. Power Electron. 5, 182-190 (1990)
22. Sun, J., Mitchell, D.M., Greuel, M.F., Krein, P.T., Bass, R.M.: Averaged modeling of PWM converters operating in discontinuous conduction mode. IEEE Trans. Power Electron. 16, 482-492 (2001)
23. Chapparya, V., Murali Krishna, G., Dwivedi, P., Bose, S.: Loop shaping controller design for constant output interleaved boost converter using real-time hardware in-the-loop (HIL). In: Proceedings of The International MultiConference of Engineers and Computer Scientists 2018. Lecture Notes in Engineering and Computer Science, Hong Kong, 14-16 March 2018, pp. 659-664 (2018)
24. Cisneros, R., Pirro, M., Bergna, G., Ortega, R., Ippoliti, G., Molinas, M.: Global tracking passivity-based PI control of bilinear systems: application to the interleaved boost and modular multilevel converters. Control Eng. Pract. 43, 109-119 (2015)
25. Karthika, P., Basha, A.M., Ayyapan, P.: PV based speed control of DC motor using interleaved boost converter with SiC MOSFET and fuzzy logic controller. In: International Conference on Communication and Signal Processing, pp. 1826-1830 (2016)
26. Banerjee, S., Ghosh, A., Rana, N.: An improved interleaved boost converter with PSO-based optimal type-III controller. IEEE J. Emerg. Sel. Top. Power Electron. 5, 323-337 (2017)
27. Sartipizadeh, H., Harirchi, F.: Robust model predictive control of DC-DC floating interleaved boost converter under uncertainty, pp. 320-327 (2017)
28. Banerjee, S., Ghosh, A., Rana, N.: Design and fabrication of closed loop two-phase interleaved boost converter with type-III controller, pp. 3331-3336 (2016)
29. Adachi, Y., Mochizuki, Y., Higuchi, K.: Approximate 2DOF digital controller for interleaved PFC boost converter. In: Lecture Notes in Electrical Engineering, vol. 282, pp. 135-144 (2014)
30. Doyle, J.C., Stein, G.: Multivariable feedback design: concepts for a classical/modern synthesis. IEEE Trans. Autom. Control 26, 4-16 (1981)
31. Agamennoni, O., Figueroa, J.L., Desages, A.C., Palazoglu, A., Romagnoli, J.A.: A loop-shaping technique for feedback control design. Comput. Chem. Eng. 20, 27-37 (1996)
32. Rahim, A.H.M.A., Kandlawala, M.F.: Robust STATCOM voltage controller design using loop-shaping technique. Electr. Power Syst. Res. 68, 61-74 (2004)
33. Skogestad, S., Postlethwaite, I.: Multivariable Feedback Control: Analysis and Design (2001)

Evaluation of the Visibility of Color Representation for Cell-Based Evacuation Guidance Simulation

Toshihiro Naka (Graduate School of Information Science and Engineering, Ritsumeikan University, Shiga, Japan; [email protected]), Tomoko Izumi (Faculty of Information Science and Technology, Osaka Institute of Technology, Osaka, Japan; [email protected]), Takayoshi Kitamura and Yoshio Nakatani (College of Information Science and Engineering, Ritsumeikan University, Shiga, Japan; [email protected], [email protected])

Abstract. This study provides an overview of the development of a computer simulator that quantitatively evaluates the effectiveness of evacuation guidance methods for tourists. The majority of previous studies have focused intensively on the movements of targeted residents in disaster sites, with only a very limited investigation being conducted regarding visitors such as tourists and business people. This study focuses on a simulation support system for disaster places of refuge in large tourist areas. In a simulation for supporting evacuation guidance targeting tourists, simply simulating a large number of tourist movements is difficult because their actions are very varied and the area is wide. Therefore, we propose a simulator based on cellular automata in which colors are assigned to the cells to present the simulation results in a visually comprehensible way. Moreover, in this simulation, we conducted an evaluation regarding the assignment and optimal number of colors. In this study, we also present our interviews with experts about our system.

Keywords: Cell-based simulation · Color representation · Disaster prevention · Interface · Simulation of evacuees · Tourists

1 Introduction

Introduction

Terrible large-scale earthquakes occur now and then worldwide, and more largescale earthquakes are expected to occur in the very near future. In areas where earthquakes occur rather frequently, people constantly pay much attention to news related to these earthquakes, and we witness the efforts of various local governments in their planning for natural disasters. Some of these areas, such c Springer Nature Singapore Pte Ltd. 2020  S.-I. Ao et al. (Eds.): IMECS 2018, Transactions on Engineering Technologies, pp. 54–68, 2020. https://doi.org/10.1007/978-981-32-9808-8_5

Evaluation of the Visibility of Color Representation

55

as Japan, are popular destinations for tourists [4]. However, disaster measures targeting tourists are rarely implemented. These tourists may not have sufficient knowledge for evacuation, such as the geographical information of the area they visit and the availability of evacuation facilities. Consequently, it is possible that chaos will occur during a disaster. Therefore, designing all necessary guidelines for tourist refuges and evaluating the evacuation guidance methods for them in advance are important. In Kyoto City, one of Japan's most famous tourist spots worldwide, evacuation guidance based on a gradual evacuation method has been established to avoid the confusion of tourists and local residents mixing at a station [3]. However, it is difficult to predict the type of situation that will occur if a large number of tourists in any of the tourist spots scattered over a wide area follow the gradual evacuation guidance during a disaster. Therefore, we focused on tourists as the target evacuees and proposed a system to support the development of evacuation guidance methods by simulating evacuation behaviors over a wide area [2]. Nevertheless, there is a problem in this system: the behavior can be simulated only on limited paths. Hence, we proposed a support system for developing the tourism evacuation guidance method via a simulator using a cellular automata method [8]. In this study, we divided the target areas into several square cells and expressed each cell with appropriate colors to show the evacuation conditions therein. In this sense, displaying the simulation results scattered over a wide area is essential for a better understanding by a reviewer. In this study, we focused on the evaluation of color representation in a cell-based evacuation guidance simulation. Specifically, we considered the colors assigned to the cells and the number of colors in a simulation representing the evacuation behavior of tourists. This study shows the evaluation results regarding the visibility of the simulation and presents our interviews with experts regarding our system.

2 Related Studies

2.1 Behavior Models for Evacuation

Many studies on evacuation behavior models are available [12]. Lee [5] reports that evacuation behaviors can be divided into two groups: a walking crowd, and following a model. The walking speed of the walking crowd group is determined by the crowd density. When the crowd density is 1.5 people/m² or more, overtaking within the crowd is impossible; walking speed in this case is determined by collective conditions. Meanwhile, the walking speed of the group following a model is determined by the relative relationship with the predecessor: people in this group walk after the preceding evacuees. Morimoto et al. [6] proposed to introduce the psychology of pedestrians changing exits in response to the congestion that occurs in building evacuations. Although only the behavior model inside a building is regarded in that study, Osaragi et al. considered evacuees not only in buildings but also outside, such


as pedestrians and those riding trains and automobiles [11]. Muraki et al. considered geographic locality and proposed behavior models for a multi-agent simulation [7]. Although many studies have been conducted on the modeling of evacuation behaviors for disasters, research related to a support system that considers tourists as the target and considers guidance methods has hardly been conducted to date.

2.2 Evacuation Behavior Models for Tourists

Generally, tourists stay only for a short period of time in a visited city, so their behaviors during disasters are often considered to be different from those of local residents. Emori et al. indicated from the reports of the local government that the general behavioral characteristics of tourists during disasters in Kyoto City are as follows [1]:

• Tourists have no idea of the geographical structure of an area; hence, mastering the evacuation location, direction, time required, etc., is very challenging for them;
• They tend to gather in the vicinity of railway stations, among other places, to escape from the afflicted area or to gather information;
• They have insufficient knowledge regarding the potential hazards of a place;
• The number of visitors is considerably massive, and they are mixed with local residents. Thus, there will be difficulties when the two groups have to move simultaneously.

In addition, Nishino et al. [9] conducted a survey of disaster awareness, assuming evacuation from an earthquake with fires, among tourists sightseeing in the Kiyomizu area of Kyoto City. The results of their study revealed that the evacuation behavior of tourists can be classified into four large categories: namely, intention-oriented, exploratory, directional, and other behaviors. The intention-oriented types have clear, specific destinations, such as stations and sightseeing spots, in mind, and they head there with clarity. The exploratory types prefer to explore for a safe place without having a clear destination. The directional types aim for a safe direction but without a clear destination. Moreover, in the investigation of Nishino et al., the following facts were obtained:

• Evacuation behavior after an earthquake is strongly influenced by the dwelling history of the tourists.
• Tourists proceed to sightseeing-related facilities, such as transportation and travel spots, for evacuation.
• In the event of fire, many people do not or cannot follow the evacuation guidance. Hence, the evacuation guidance of tourism operators becomes less likely to function well, unlike in a situation after an earthquake without fires.

In this study, we constructed a simulation system based on cellular automata to evaluate the visibility of the colors representing evacuee situations. In our simulation, we followed the characteristics of tourist evacuation behavior mentioned above and fitted them into the tourist model.

2.3 Simulator for Developing the Evacuation Guidance

Many existing studies involve evacuation behavior simulation, but several problems arise when such simulators are examined for use by local governments. One problem is that the area to be simulated is occasionally limited to a single building or a very small region. Given that a large amount of calculation is required for a wider-range simulation, a high-performance computer is needed to avoid very time-consuming calculations, and such a computer is rarely available for repeated use in the examination field. In addition, because general users are not assumed to operate such computers, setting the parameters related to evacuation behaviors is difficult for amateur users. With regard to this problem, Emori et al. [2] proposed a support system for reviewing evacuation guidance methods for tourists that can simulate gradual evacuation guidance, with Kyoto City as the target area. In this system, the examiner designates on an electronic map the number of tourists at each point, the relative evacuation routes, and the evacuation destinations. The evacuation status can then be displayed on the map at various time intervals. Moreover, the degree of congestion at each point is visually displayed by drawing a circle whose size corresponds to the degree of congestion. Thus, the congestion caused by the specified guidance method is presented to the examiner. However, the simulator proposed by Emori et al. has a problem. Although the evacuation of visitors on each road is visualized, only the situation on limited roads is shown, namely from the initial points of the evacuees entered by the reviewer to their evacuation centers or stations. Hence, no calculation is performed in areas that are not covered, and the examiner cannot determine the evacuation and congestion situation beyond the designated escape routes. In this research, to address this problem, we propose a method that divides the target area into cells and performs a simulation to express the evacuee situation in each cell with colors sufficient for showing the congestion situation.

Table 1. Colors used in the first experiment

Color name | RGB
Red | R:255 G:0 B:0
Yellow | R:255 G:255 B:0
Green | R:0 G:128 B:0
Light blue | R:0 G:255 B:255
Blue | R:0 G:0 B:255
Purple | R:128 G:0 B:128
Pink | R:255 G:20 B:147
Brown | R:165 G:42 B:42
White | R:255 G:255 B:255
Black | R:0 G:0 B:0

3 Evaluation of Colors Assigned to Each Cell

In this research, we plan to present the evacuation status via simulation to investigators by expressing the state of each cell in color. Specifically, the map shown on a display (see Fig. 1) is divided into small squares, called cells. The color of each cell shows the attributes of the cell (e.g., shelters, damaged areas) and the evacuee movements. Therefore, the choice of colors is assumed to have a large influence on the recognition of the evacuation situation, and the colors should be easy for the reviewer to recognize. Thus, we performed experiments based on the impressions received when various colors were drawn in cells on the electronic map. In this evaluation, we selected 10 colors to be assigned to the cells and examined which of them are highly visible. Table 1 lists the names and RGB values of the colors used in this evaluation.

Fig. 1. Example of the initial map where several cells are colored.

Table 2. Attributes of the first experiment collaborators

Number of collaborators | 10
Sex | Male: 7; female: 2
Age | 21–25

Table 3. Experimental results for color evaluation
[Top-three visibility ranks answered by collaborators A–J (with their sex and age) for each of the ten colors; only red, blue, pink, and black received ranks. The per-color rank columns could not be recovered from the flattened layout.]
We used an 11.6-inch display with a resolution of 1366 × 768, on which the square electronic map is presented. The side length of the map is 1275 px. This map is of Kyoto City, Japan, and was obtained from Google Maps. It is divided into 2,500 cells, i.e., 50 cells horizontally and 50 cells vertically. The settings of the simulator in this evaluation are as follows. Initially, five cells for each color in Table 1 were randomly selected without any overlap; that is, 50 of the 2,500 cells are painted with colors. Then, 10 cells were randomly selected as shelters. Note that no colored cell is assigned as a shelter. Figure 1 shows an example of the initial maps. The colored cells, each using one of the colors listed in Table 1, indicate that evacuees are present in those areas. The evacuees move at a uniform speed from the initial position to the nearest shelter via the shortest path. In each step, the evacuees on cell s select one of the neighboring cells s', namely the one nearest to the target shelter. If cell s is colored with color x at time step t, then the neighboring cell s' selected in the above manner is colored with x at time step t + 1. At time step t + 1, cell s is assigned no color if no evacuee selects s as the subsequent cell. When the evacuees reach the shelter cell, they no longer move; at this time, the shelter cell is colored with color x, the color assigned to the evacuees arriving at the cell (a minimal sketch of this coloring rule is given at the end of this section).

Table 2 lists the details of the 10 experimental collaborators in this evaluation. The age of the collaborators ranges from 21 to 25, with seven males and two females. We requested the collaborators to sit in front of the display and watch the output of the simulation on the display for 1 min. Thereafter, they were required to rank the top three colors by visibility. The evaluation results are summarized in Table 3. In the table, the values for each collaborator are his/her answered ranks. The results show that the most visible colors in the simulation are, in order, black, red, blue, and pink. Note that no other colors were selected by the collaborators. We consider that these results are caused by the colors of the electronic map drawn in the background. Given that the roads on the map are gray or white, black, the darkest color, is the most visible for the collaborators. Red and blue were selected because they are two of the three primary colors and are not used on the map. Although yellow is also one of the three primary colors, it was not selected, because a shade of yellow is used on the map to indicate some roads and special areas.
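The coloring rule above can be summarized in a short sketch. The following Python fragment is a minimal illustration under our own assumptions: cell coloring is held in a dictionary, and next_cell_toward is a hypothetical helper standing in for the shortest-path lookup, which the paper does not specify.

    def step(colored, shelters, next_cell_toward):
        """One time step of the coloring rule: `colored` maps cell -> color
        of the evacuees currently there; returns the coloring at t + 1."""
        next_colored = {}
        for cell, color in colored.items():
            if cell in shelters:
                # Evacuees that reached a shelter stop; the shelter keeps their color.
                next_colored[cell] = color
                continue
            # Move one cell along the shortest path toward the nearest shelter;
            # the vacated cell stays uncolored unless another group moves into it.
            next_colored[next_cell_toward(cell, shelters)] = color
        return next_colored

If two groups happen to move into the same cell, the later assignment wins in this sketch; the paper does not state how such collisions are displayed.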

4 Evaluation Based on the Number of Colors

In this section, we consider the evaluation of the number of colors used in the evacuation simulation for tourists. First, we explain the details of the simulations, and then we present the evaluation results and interviews with the experts regarding our simulation.

4.1 Simulation of the Evacuation Behaviors of Tourists

As mentioned in Sect. 2.3, many problems exist in the use of evacuation simulators by local governments. Given that this study focuses on a simulator with which local governments or general users can examine the effectiveness of evacuation guidance, the simulator used in our evaluation should satisfy several conditions to address these problems. In this section, we present the outline of the simulation in our evaluation.

4.1.1 Cellular Automata Simulation
To simulate the evacuation situation not only on a specific target road but also over a wide target area, simplifying the calculation is necessary. The research of Emori et al. [2] demonstrated a simulation in which all agents on all designated roads were simulated. However, applying this method to all roads was difficult owing to problems of computational complexity. Therefore, a cellular automaton method was used for the calculation in this study. In this method, the space represented by a two-dimensional plane is partitioned into lattices (cells), and states are allocated to the cells. The number of states of each cell is finite. The subsequent state of a cell is determined by a state transition rule defined in advance from the current state of the cell S_{i,j} and the current states of the eight adjacent cells S_{i-1,j-1}, ..., S_{i+1,j+1} (Fig. 2).

Fig. 2. States of the current and neighboring cells of the cellular automaton

In existing evacuation behavior simulations based on the cellular automata method, agents are placed in the cells, and each agent decides which cell and state to shift to next from the states of its own cell and the neighboring cells [10]. However, when agents are generated in this manner for each evacuee, the calculation load of a wide-area simulation is extremely high. Therefore, instead of treating the evacuees in a cell as individual agents, only the number of evacuees for each attribute in the cell is regarded as data (a minimal sketch of this aggregated update is given after the attribute list below).

4.1.2 Tourist Modeling
In this section, we explain the factors that alter the behavior of tourists. To define the behavior model, we set specific attributes of cells, given that the behavior of tourists has been shown to be affected by tourism facilities, such as transportation and sightseeing spots. Hence, we considered the following attributes for cells in this simulation:
• Sightseeing spots
• Shelters
• Open spaces (e.g., parks)
• Accommodation
• Transportation facilities (e.g., stations)
• Fireplace
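The aggregated cell update mentioned at the end of Sect. 4.1.1 can be sketched as follows; the dictionary layout and the choose_next_cell helper are our assumptions, not the paper's implementation.

    from collections import defaultdict

    def update(cells, choose_next_cell):
        """`cells` maps (i, j) -> {attribute: evacuee count}; whole groups
        move together, so cost grows with occupied cells, not evacuees."""
        nxt = defaultdict(lambda: defaultdict(int))
        for (i, j), counts in cells.items():
            for attribute, count in counts.items():
                ni, nj = choose_next_cell(i, j, attribute)
                nxt[(ni, nj)][attribute] += count
        return {cell: dict(c) for cell, c in nxt.items()}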

Regarding the fireplace, we considered this site as an attribute of a cell because evacuation behavior changes when a fire occurs [9]. First, guides who could influence the evacuation behavior of tourists were introduced. The examiner considers the types of evacuation routes and the manner of executing efficient and safe guidance; in this case, the examiner must consider how to arrange a limited number of guides. An arrangement function for guides was also proposed in the captioned research of Emori et al. [2]. Therefore, in our system, which is also based on cellular automata, a guide is placed in the cell designated by the examiner. The guide does not move to the neighboring cells and stays in his/her specified cell.

Second, we considered the attributes of tourist evacuees. A rough flow of the behavior of evacuees is shown in Fig. 3. In this study, we considered evacuees who follow guidance and those who do not, as also observed in the study of Emori et al. [2]. The behavior model of evacuees who follow guidance, called obedient, is as follows: they move according to the evacuation routes (in this evaluation, the shortest path to the nearest shelter) entered by the examiner. However, a certain percentage of the obedient evacuees in each cell do not follow the guidance; this percentage becomes higher if sightseeing spots, open areas, accommodation, transport facilities, or fire placement sites are present in the resident or neighboring cells. When the evacuees arrive at the cell of the specified evacuation center, the movement ends.

Third, we introduced the behavior models of the evacuees who do not follow guidance. We classified them into three types, namely, intention-oriented, explorative, and directional, as indicated in the research of Nishino et al. [9]; the classification follows the ratio indicated in that study. The intention-oriented types move toward the nearest sightseeing spot, shelter, open space, or transportation facility, selecting at each step the neighboring cell that follows the shortest path to the destination. The explorative types randomly select a neighboring cell as their subsequent cell; however, if sightseeing spots, shelters, open spaces, or transportation facilities are present in the neighboring cells, they move there and stay. The directional types continue to move in a randomly decided direction; however, if sightseeing spots, shelters, open spaces, or transportation facilities are present in the neighboring cells, they move there and stay. In all cases, including the obedient type, if the destination cell is a place where a fire occurred, the evacuees move to one of the cells adjacent to the destination cell. Moreover, when evacuees who do not follow the guidance reach a cell where a guide is assigned, some of them, at a certain percentage, become the obedient type and follow the guidance; that is, they select a neighboring cell following the shortest path to the nearest shelter. (A minimal sketch of these movement rules is given after the input description below.)

4.1.3 Input/Output Interface
In this section, we describe the input/output interface presented to the examiner. A prototype system was developed using Artisoc 4.0 by Structural Planning Laboratory [13]. The examiner can input the following details:

• Evacuation starting points
• Number of people at the evacuation starting points
• Evacuation routes to shelters
• Cells where the guides are assigned
• Cells with fire occurrence

Each point can be set by selecting the corresponding cell. The number of people at the evacuation starting point is entered in a specified field. For evacuation routes, the cells corresponding to the routes are consecutively selected.
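The movement rules of Sect. 4.1.2 can be sketched as one dispatch function. All helper names (neighbors, attrs_of, toward) and the ATTRACTIONS set are hypothetical illustrations of the rules described above, not the authors' code.

    import random

    ATTRACTIONS = {"sightseeing", "shelter", "open_space", "transport"}

    def choose_next_cell(cell, kind, neighbors, attrs_of, toward):
        if kind == "obedient":
            return toward(cell, "designated_route")    # follow the guidance
        if kind == "intention":
            return toward(cell, "nearest_attraction")  # clear destination
        # Explorative and directional types stop at an adjacent facility.
        attractive = [n for n in neighbors(cell) if attrs_of(n) & ATTRACTIONS]
        if attractive:
            return attractive[0]
        if kind == "explorative":
            return random.choice(neighbors(cell))      # wander randomly
        return toward(cell, "fixed_direction")         # directional type

Fire avoidance (diverting to a cell adjacent to a burning destination) and the probabilistic type change at guide cells would wrap around this function.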

Fig. 3. Flow of the behavior of evacuees.

Fig. 4. Buttons in the prototype simulator.

While executing the simulation, the examiner can adjust the execution speed of the simulation. The following buttons are set on the screen of the system (Fig. 4):
• "Execute" button: Start the simulation
• "Step execution" button: Execute the simulation one step at a time
• "Pause" button: Pause the simulation
• "Stop" button: Terminate the simulation
• "Execute wait" button: Adjust the simulation execution speed with the analog scale

The status of the evacuees of each attribute in each cell is output on the electronic map using colors. However, when the state of each cell is output in color on a screen targeting a wide area, the choice of colors strongly affects how easily the evacuation situation can be recognized. Hence, the choice and number of colors output by the simulator significantly influence the results.

4.2 Evaluation Results

As explained in the previous section, drawing multiple colors is necessary to represent the evacuee behaviors because the evacuees have several attributes (i.e., types).

Table 4. Colors assigned to each cell

Attribute | Color name | RGB
Evacuee | Red | R:255 G:0 B:0
Evacuee | Yellow | R:255 G:255 B:0
Evacuee | Blue | R:0 G:0 B:255
Park | Lime | R:0 G:255 B:0
Guide | Purple | R:128 G:0 B:128
Fire | Gray | R:123 G:123 B:123
Destination | Black | R:0 G:0 B:0

In this experiment, we examine how many colors representing the evacuees are readily and visually recognized on the simulation screen. Specifically, three colors (i.e., red, yellow, and blue) were selected to express the evacuee states, and four colors were set to represent the attributes of cells as special facilities.

4.2.1 Colors Assigned to Each Attribute
Table 4 lists the names of the colors used in this experiment, their RGB values, and the actual display of the colors. The table presents seven colors for five attributes: evacuee, park, guide, fire, and destination. Herein, a required destination (e.g., an evacuation shelter) and a non-required destination were each set in one cell. The required one is Kitano Tenmangu (in black), whereas the non-required one is Umekoji Park (in lime green). Some of the cells were randomly selected and initially assigned gray. The cells with assigned guides were input by the examiners (in purple). Initially, the evacuees were assigned to a cell input by the examiner. Note that the initial type of an evacuee is neither obedient nor non-obedient; evacuees of the initial type move to a randomly selected neighboring cell and only select their type when they meet a guide. After selecting their type, they follow the behaviors defined in the previous section. That is, the obedient evacuees move to the neighboring cells following the shortest path to Kitano Tenmangu, whereas the destination of the non-obedient evacuees is Umekoji Park or Kitano Tenmangu. If they visit cells where a guide is assigned, they change their type, either obedient or non-obedient, with a certain probability designated by the examiner. We performed three experiments within this experiment by changing the number of colors that represent the evacuees. If we used only one color, all of the evacuees were colored red. If we used two colors, all of the evacuees were initially colored red; in this case, when they met a guide and changed their type, the obedient evacuees were colored blue, whereas the non-obedient evacuees remained red. If we used three colors, the evacuees of the initial type were assigned red, the obedient ones blue, and the non-obedient ones yellow.
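The three display conditions just described can be written as a single lookup; this is a sketch of the scheme above (the function name is ours):

    def evacuee_color(evacuee_type, num_colors):
        """evacuee_type is 'initial', 'obedient' or 'non_obedient'."""
        if num_colors == 1:
            return "red"                               # everyone is red
        if num_colors == 2:                            # obedient turn blue
            return "blue" if evacuee_type == "obedient" else "red"
        return {"initial": "red", "obedient": "blue",
                "non_obedient": "yellow"}[evacuee_type]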

Table 5. Attributes of the second experiment collaborators

Number of collaborators | 9
Sex | Male: 7; female: 2
Age | 20–54
Job experience (years) | 1–10

Herein, we drew the colors on a square electronic map with vertical and horizontal sizes of 3,000 px on a 49-inch display with a resolution of 3840 × 2160. One example of the screen is shown in Fig. 5. The target range of the map was divided into 900 (30 vertical and 30 horizontal) cells.

4.2.2 Experimental Collaborators and Procedure
In December 2017, to inspect the effectiveness of the system, we showed a usage example of this system to nine experts, in the age range of 20–54 with seven males and two females, from the Disaster and Crisis Management Office of Kyoto. Table 5 presents the details of the experimental collaborators. The collaborators were provided with the evaluation settings and applied them to our system (Table 6). The collaborators also set four cells as guides (in purple), located on Kawaramachi Dori. The experts set the obedience rates of the evacuees; the rates were 91% and 69%, based on the research of Nishino et al. [9]. In addition, the evacuees had three types of colors. In other words, we performed six experiments, that is, two obedience rates times three numbers of colors. We asked the experts to sit in front of the display and watch the simulation for 10 min. Thereafter, we administered an evaluation questionnaire.

Fig. 5. Example of the screen in the evaluation experiment

Table 6. Evacuation setting
The evacuation guidance guides the evacuees from Gion-shijo to Kitano Tenmangu. Moreover, the non-obedient evacuees continue to Umekoji Park. The guides are located on Kawaramachi Dori. The two obedience rates are 91% and 69%.

4.2.3 Experimental Results and Discussion
We presented the system screen to the experts. The results were evaluated on the basis of the following questions:
• Can you confirm the evacuation change when you change the arrangement of guides?
• Can you confirm the evacuation change when you change the rate of obedience?
• Is it easy to determine the evacuation visually when the evacuee color is only one?
• Is it easy to determine the evacuation visually when the evacuee colors are two?
• Is it easy to determine the evacuation visually when the evacuee colors are three?
• How many evacuee colors are best for you?
Table 7 presents the results. The evaluations range from 1 for Excellent, 2 for Good, 3 for Average, and 4 for Below average, to 5 for Poor. We obtained good evaluation results for the question "Can you confirm the evacuation change?". In contrast, poor evaluation results were obtained in the case of using only one color: the evaluators reported that, with one color, it was difficult to determine the motion of each cell visually. However, we obtained good evaluation results for two and three evacuee colors, and the best number of colors was three.

Table 7. Questionnaire results

ID | Sex | Age | Experience | Arrangements of guides | Guidance rate | One color | Two colors | Three colors | Best number of colors
A | M | 35–39 | 3 | 1 | 1 | 4 | 3 | 1 | Three
B | M | 20–24 | 1 | 1 | 1 | 5 | 3 | 1 | Three
C | F | 25–29 | 3 | 2 | 2 | 5 | 1 | 4 | Two
D | M | 50–54 | 6 | 1 | 1 | 3 | 3 | 1 | Three
E | M | 45–49 | 1 | 1 | 1 | 5 | 3 | 1 | Three
F | M | 35–39 | 10 | 1 | 4 | 5 | 1 | 1 | Three
G | M | 45–49 | 2 | 1 | 1 | 5 | 2 | 1 | Three
H | M | 35–39 | 1 | 2 | 2 | 3 | 3 | 2 | Three
I | F | 20–24 | 1 | 4 | 4 | 5 | 3 | 2 | Three


Furthermore, in this evaluation, we found that the experts required some improvements to the system. They requested the following functions:
1. A setting for the places, seasons, and time zones in which many tourists come.
2. A setting for changes in the conditions during a disaster (morning, daytime, evening, and nighttime).
3. A situation setting in which the evacuees can become guides.
4. An evaluation and visualization setting for when the guides work together.
5. Clarification of the difference in representation between an actual guide and a direction board.

5 Conclusion

In this research, we investigated a support system to be used when an evaluator discusses an evacuation guidance method. We proposed a cellular-automaton-based simulator and suggested that its color representation should be considered carefully. In particular, the recognition of the color representation was examined by expert members. Regarding the colors, the collaborators judged black, blue, and red to be visible in the simulation. Regarding the number of colors representing the attributes of the evacuees, the evaluation compared one, two, and three colors as the states of the evacuees moved on the screen. Consequently, a representation with three colors was best for visual recognition. Our future work is to meet the requirement specifications pointed out by the experts. If we can meet these requirements, this system will become increasingly effective. In addition, we can contribute to more specific evacuation plans through using other simulators.

Acknowledgements. This work was supported by JSPS KAKENHI Grant Number JP16K21484.

References
1. Emori, N., Izumi, T., Nakatani, Y.: Support system for developing evacuation guidance for tourists: visualization of the number of evacuees in a space. In: Proceedings of the International MultiConference of Engineers and Computer Scientists 2015, Lecture Notes in Engineering and Computer Science, pp. 534–539 (2015)
2. Emori, N., Izumi, T., Nakatani, Y.: A support system for developing tourist evacuation guidance. In: Transactions on Engineering Technologies (International MultiConference of Engineers and Computer Scientists 2015), pp. 15–28 (2016)
3. Kyoto City Fire Department: A guideline for support people who are unable to return home after disasters (2014). http://www.city.kyoto.lg.jp/shobo/page/0000162218.html. Accessed 15 Jan 2018
4. Kyoto City Industrial Tourism Department: General survey of tourism in Kyoto (2017). http://www.city.kyoto.lg.jp/sankan/page/0000222031.html. Accessed 15 Jan 2018
5. Lee, J.: Development of support model for evacuation advisory for large crowd and motion of pedestrian for leading crowd. Dissertation thesis, Graduate School of System and Information Engineering, University of Tsukuba (1992)
6. Morimoto, Y., Kurita, O., Tanaka, K.: An evaluation model for evacuation behavior in congested buildings considering the pedestrians' choice behavior of exit. J. City Plan. Inst. Jpn. 50(3), 636–643 (2015)
7. Muraki, Y., Kanoh, H.: Multiagent model for wide-area disaster-evacuation simulations with local factors considered. Trans. Jpn. Soc. Artif. Intell. 22, 416–424 (2007)
8. Naka, T., Kitamura, T., Izumi, T., Nakatani, Y.: A study of color representation for interface of cell-based evacuation guidance simulation for tourists. In: Proceedings of the International MultiConference of Engineers and Computer Scientists 2018, Lecture Notes in Engineering and Computer Science, pp. 416–421 (2018)
9. Nishino, T., Ohashi, K., Hokugo, A.: Tourists behavioral tendency expected in post-earthquake fire evacuation. J. Arch. Inst. Jpn. 81(719), 1–8 (2016)
10. Ohi, F., Onogi, M.: A simulation of evacuation dynamics of pedestrians in case of emergency by a 2-dimensional cellular automaton method. Trans. Oper. Res. Soc. Jpn. 51, 94–111 (2008)
11. Osaragi, T., Morisawa, T.: Simulation model of evacuation behavior in large-scale earthquake considering various states and attributes of people in large city. J. Arch. Plan. (Trans. AIJ) 76(660), 389–396 (2011)
12. Schreckenberg, M., Sharma, S.D. (eds.): Pedestrian and Evacuation Dynamics, vol. 1. Springer, Boston (2002)
13. Yamagata, S.: Introduction to Multi-agent Simulation by Artisoc. Hayama Book Studio, Tokyo (2007)

Identifying Prophetic Bloggers Based on Prediction Ability of Buzzwords and Categories

Jianwei Zhang1(B), Yoichi Inagaki2, Reyn Nakamoto2, and Shinsuke Nakajima3

1 Iwate University, 4-3-5 Ueda, Morioka, Iwate 020-8551, Japan
[email protected]
2 Kizasi Company, Inc., 20-14 Hakozaki-Cho, Nihonbashi, Chuo-ku, Tokyo 103-0015, Japan
{inagaki,reyn}@kizasi.jp
3 Kyoto Sangyo University, Motoyama, Kamigamo, Kita-ku, Kyoto 603-8555, Japan
[email protected]

Abstract. Finding important users from social media is a challenging and significant task. In this paper, we focus on the users in the blogosphere and propose an approach to identify prophetic bloggers by estimating bloggers' prediction ability on buzzwords and categories. We conduct a time-series analysis on large-scale blog data, which includes categorizing a blogger into knowledgeable categories, identifying past buzzwords, analyzing a buzzword's peak time content and growth period, and estimating a blogger's prediction ability on a buzzword and on a category. Bloggers' prediction ability on a buzzword is evaluated considering three factors: post earliness, content similarity, and entry frequency. Bloggers' prediction ability on a category is evaluated considering the buzzword coverage in that category. For calculating bloggers' prediction ability on a category, we propose multiple formulas and compare their accuracy through experiments. Experimental results show that the proposed approach can find prophetic bloggers on real-world blog data.

Keywords: Buzzword detection · Expert finding · Prediction ability · Prophetic blogger · Social media · Time-series analysis

1 Introduction

Finding important users from social media is a challenging and significant task. Past research has two main directions: finding knowledgeable users by measuring expertise levels or finding influential users by estimating influence degrees. The former is usually based on textual content analysis, while the latter also utilizes link structure in social networks. We focus on the users in the blogosphere and consider the users' prediction ability, which has not been investigated in previous works.


The blogosphere is a platform for bloggers to issue posts, share ideas, and exchange opinions. The data in the blogosphere are dynamic, reflecting information change over time. Potentially knowledgeable bloggers with prior awareness of future popular trends may exist in the blogosphere. Identifying these bloggers can bring great value: for example, analysis of their blog entries may help find future trends, or communication with them may even help foresee things that will become popular.

We propose an approach to identify important bloggers based on their prediction ability on buzzwords and categories. Buzzwords are the terms or phrases describing topics or events that have become well known to the general population. We call bloggers who are knowledgeable and have high prediction ability "prophetic bloggers". For identifying prophetic bloggers, we conduct a time-series analysis on real-world blog data consisting of 150 million entries from 11 million bloggers. A blogger's prediction ability on a buzzword is evaluated considering three factors: post earliness, content similarity, and entry frequency. The general idea is that (a) the earlier a blogger posted blog entries containing a buzzword, the better prediction ability on the buzzword he may have; (b) the more similar the contents of his past entries to the peak time content of a buzzword at its popularity peak, the more accurate his prediction ability on the buzzword; and (c) the larger the quantity of early and similar blog entries containing the buzzword, the better a prophetic blogger he may be.

Our previous work [1] did not fully discuss bloggers' prediction ability on a category. In this paper, prediction ability on a category is evaluated by making use of prediction scores on the buzzwords in that category and considering buzzword coverage. The general idea is that the more buzzwords related to a category he can predict well, the better a prophetic blogger on the category he may be. Note that this paper is the extended and revised version of the paper accepted at IMECS 2018 [2]. Our contributions are summarized as follows:
• We introduce a method for categorizing a blogger into his appropriate potential communities, called knowledgeable categories (Sect. 2).
• We develop a method for automatically identifying past buzzwords from historical blog data based on their persistence (Sect. 3).
• We analyze a buzzword's properties by identifying its peak time content and calculating its growth period (Sect. 4).
• We integrate the necessary factors for evaluating a blogger's prediction ability on a buzzword (Sect. 5).
• We propose multiple formulas for estimating a blogger's prediction ability on a category (Sect. 6).

2 Categorizing a Blogger into Knowledgeable Categories

We extract potential communities of bloggers called knowledgeable categories (kc) and automatically categorize bloggers into their appropriate kcs. A potential community in our research is a group of bloggers who are knowledgeable in a kc. For example, the "politics" community is the group of bloggers who are knowledgeable in the "politics" category. Potential communities of bloggers are objectively identified by analyzing the entries that bloggers posted. Even if a blogger does not declare his interest in a category explicitly, if he has posted many blog entries related to the category, our method can categorize him into the appropriate kcs automatically.

2.1 Extracting Knowledgeable Categories and Constructing Co-occurrence Dictionaries

Each kc is represented by a keyword that is often mentioned in the blogosphere; this keyword becomes the name of the kc. The keywords are extracted by performing a regular Web search with search keywords such as "expert in *" and "fan of *". We manually remove inappropriate ones and categorize the keywords into 122 categories, ending up with a list of 122 kc names (e.g., "politics", "economy", "IT"). For each kc, a co-occurrence dictionary is automatically constructed: for each keyword representing the kc, we extract the top n words that have the highest co-occurrence degrees from all blog entries. Specifically, n is 400 in our current implementation. The co-occurrence words and their co-occurrence degrees are stored in the co-occurrence dictionary for each kc.

2.2 Calculating a Blogger's Knowledge Score

A blogger’s knowledge score for a kc is calculated by analyzing how often as well as how in-depth he has posted blog entries related to the kc. If a blogger has an extensive use of co-occurrence words of a kc, a high score is attached to him. We first calculate Relevancekc (ei )–the relevance score of a blog entry ei for a kc–as follows: n  Relevancekc (ei ) = αj · βj · γj (1) j=1

where n is the number of the co-occurrence words (n = 400), \alpha_j = (n - j + 1)/n is the weight of the jth co-occurrence word, which decreases as j increases, \beta_j is the co-occurrence degree of the jth co-occurrence word, and \gamma_j is a binary value that indicates whether the entry e_i contains the jth co-occurrence word or not. We next calculate Knowledge_kc(blg), the knowledge score of a blogger blg for a kc, as follows:

Knowledge_kc(blg) = \frac{l}{n} \cdot \frac{\log(m)}{m} \cdot \sum_{i=1}^{m} Relevance_kc(e_i)    (2)

where e_i is an entry that blogger blg posted, m is the number of entries that blg posted during the analysis period, n is the number of the co-occurrence words, and l is the number of the co-occurrence words that occurred in all entries posted by blg. l/n indicates the coverage ratio of the co-occurrence words that blg has used. log(m)/m reduces the effect when a blogger frequently posts a large number of entries but most of them are unrelated to the kc. A blogger is categorized into a kc if his knowledge score is larger than a given threshold. Moreover, a blogger may be categorized into two or more kcs and thus may have two or more knowledge scores for different categories. For example, if a blogger belongs to both "politics" and "economy", he has one knowledge score representing his expertise in "politics" and another representing his expertise in "economy". Through the above process, we have a list of knowledgeable bloggers for each kc.
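A minimal Python sketch of formulas (1) and (2) follows; it assumes each entry is available as a set of words and each co-occurrence dictionary as a list of (word, degree) pairs sorted by descending co-occurrence degree (this data layout is our assumption).

    import math

    def relevance(entry_words, cooc):                       # formula (1)
        n = len(cooc)
        return sum(((n - j) / n) * degree                   # alpha_j * beta_j
                   for j, (word, degree) in enumerate(cooc)
                   if word in entry_words)                  # gamma_j = 1

    def knowledge(entries, cooc):                           # formula (2)
        m, n = len(entries), len(cooc)
        cooc_words = {word for word, _ in cooc}
        l = len({w for e in entries for w in e} & cooc_words)
        total = sum(relevance(e, cooc) for e in entries)
        return (l / n) * (math.log(m) / m) * total

Note that with zero-based enumerate, (n - j)/n reproduces alpha_j = (n - j + 1)/n for one-based j.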

3 Identifying Past Buzzwords

Before evaluating a blogger’s buzzword prediction ability, buzzwords need to be first detected. We identify past buzzwords by analyzing real-world blog data. 3.1

Determining Buzzword Candidates

We start with the top-ranked keywords in the daily topic ranking list provided by kizasi Company. These are the keywords that have the highest ratios of the number of bloggers who mentioned them in the past two days to the number of bloggers who mentioned them in the past two years. We take the top-k (k = 100) keywords from each day and then exclude repeated words and periodical words. The remaining keywords become buzzword candidates.

Fig. 1. Buzzword candidates’ persistence


In our approach, we evaluate a blogger's prediction ability for a kc based on his prediction scores on the buzzwords that belong to the kc. In order to associate buzzword candidates (bwc) with kcs, we calculate the similarity between a bwc and each kc. A bwc is associated with a kc if they share many co-occurrence words. For example, bwc "Abenomics" (the economic policies advocated by Shinzo Abe, the Prime Minister of Japan) and kc "politics" have many common co-occurrence words such as "Abe", "premier", and "party", and thus "Abenomics" can be categorized into "politics". Each bwc is categorized into the top-k (k = 5) kcs with the highest similarities. Consequently, given a kc, the set of similar bwcs can also be identified. This categorization result is used for the subsequent processes in Sects. 3.2 and 6.
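A minimal sketch of this association step, assuming each co-occurrence dictionary is available as a set of words (the data layout and the overlap count as the similarity measure are our reading of the description above):

    def top_categories(bwc_words, kc_dicts, k=5):
        """bwc_words: co-occurrence words of a buzzword candidate;
        kc_dicts: kc name -> set of its co-occurrence words."""
        ranked = sorted(kc_dicts.items(),
                        key=lambda kv: len(bwc_words & kv[1]),
                        reverse=True)
        return [name for name, _ in ranked[:k]]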

3.2 Determining Buzzwords Based on Their Persistence

Among the buzzword candidates, there are also some burst words that disappear immediately after the peak. This kind of word is not a buzzword, since it is forgotten by the public soon after the peak. We extract influential words as buzzwords from the buzzword candidates based on their persistence (Fig. 1). A buzzword candidate's persistence is evaluated by counting the total number of blog entries containing it during a specified duration period T_d (e.g., six months) after the peak. If the number of entries containing a buzzword candidate during T_d is small, it is of low persistence. In contrast, if a buzzword candidate has a large number of entries containing it during T_d, it has high persistence. From each kc, we select the top-k (k = 10) buzzword candidates with the highest persistence as the buzzwords representing the kc.

Fig. 2. Determining peak time content words

4 Analyzing Past Buzzwords' Properties

We identify the peak time content of a buzzword, represented by a set of its peak time content words, and determine each buzzword's growth period by analyzing the content similarity between the content at each period (e.g., at intervals of one week) before the peak and the peak time content.

4.1 Extracting Peak Time Content Words

The co-occurrence words of the buzzword around the peak time are the candidates for its peak time content words. However, not all of them are appropriate as peak time content words. Figure 2 shows the idea of extracting the peak time content words: we select the co-occurrence words whose time-series variation is the most similar to the buzzword's to represent its peak time content. In Fig. 2, co-occurrence word xx is more appropriate as a peak time content word than aa, since xx has a variation curve much more similar to that of buzzword bw.
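The paper does not specify the time-series similarity measure, so the sketch below assumes Pearson correlation between weekly entry-frequency series as one plausible choice:

    import statistics

    def correlation(xs, ys):
        mx, my = statistics.mean(xs), statistics.mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy) if sx and sy else 0.0

    def peak_time_content_words(bw_series, cooc_series, k=20):
        """cooc_series: co-occurrence word -> its weekly frequency series."""
        return sorted(cooc_series,
                      key=lambda w: correlation(bw_series, cooc_series[w]),
                      reverse=True)[:k]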

4.2 Calculating Growth Period Based on Content Similarity

A buzzword’s growth period dates from its peak back to the time point when the contents of blog entries start to be similar to the peak time content. For example, if buzzword “iP hone 6” starts to be mentioned in unspecific entries

Fig. 3. Determining a growth period

Identifying Prophetic Bloggers

75

such as “I really want to buy an iP hone 6.”, the growth period has not begun. Since it only contains ordinary words, this period is inappropriate for analyzing bloggers’ prediction ability on the buzzword. If some blog entries such as “iP hone 6 may adopt new chip and larger display.” begin to appear and some content words from the popularity peak such as “chip” and “display” are mentioned, the growth period may begin. Figure 3 shows the idea of identifying the growth period. For determining the starting point of the growth period, we calculate the content similarity between each period before the peak (at intervals of one week) and the peak time. Specifically, for each period ti before the peak we extract the set of co-occurrence words (COWti ) from the blog entries containing the buzzword posted during each ti and calculate its similarity with the set of peak time content words (CT Wpeak ) as follows: Similarity(ti , peak) =

|COWti ∩ CT Wpeak | min(|COWti |, |CT Wpeak |)

(3)

Then, we calculate the average of Similarity(t_i, peak) before the peak and specify the starting point of the growth period by the following criterion: after accumulating the differentials between the average Similarity(t_i, peak) and each interval's Similarity(t_i, peak), the time point at which the cumulative sum takes its largest value is specified as the starting point of the growth period. As shown in Fig. 3, there are cases where the similarity curve slightly surpasses (t_1) and subsequently falls below the average (t_2). If we were to use the simple intersection of the similarity curve and the average line, the starting point would be set too early (t_1 or t_2). Instead, we adopt the cumulative sum of the differentials between the average and each interval and thus avoid this problem: the starting point is the time when the cumulative sum becomes the highest (t_3). Note that different buzzwords have different growth periods and that the growth period of a buzzword is analyzed on the entries posted by all bloggers, independent of any individual blogger.
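One plausible reading of this criterion is sketched below; accumulating average-minus-similarity so that the cumulative sum peaks at the transition is our assumption, as the paper leaves the sign convention implicit.

    def growth_start(similarities):
        """similarities: Similarity(t_i, peak) per weekly interval before
        the peak, in chronological order; returns the start index."""
        avg = sum(similarities) / len(similarities)
        cum, best_cum, start = 0.0, float("-inf"), 0
        for i, s in enumerate(similarities):
            cum += avg - s          # grows while the curve is below average
            if cum > best_cum:
                best_cum, start = cum, i
        return start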

5 Calculating a Blogger's Prediction Score on a Buzzword

A blogger’s prediction score on a buzzword is calculated based on post earliness, content similarity and the quantity of his blog entries containing the buzzword during its growth period. We assign a score of post earliness to each entry containing the buzzword posted during its growth period. All entries containing the buzzword during its growth period are sorted according to their post dates. An entry posted at the starting point of the growth period should receive the highest earliness score and an entry posted at the end of the growth period (i.e., the popularity peak of the

76

J. Zhang et al.

buzzword) should receive the lowest earliness score. Thus, we devise the formula for post earliness of entry ei for buzzword bw as follows: Earlinessbw (ei ) = −log

order(ei ) |ETg |

(4)

where E_{T_g} is the set of all entries containing buzzword bw during the growth period T_g, and order(e_i) is the appearance order of entry e_i in the set. For example, if there are 100 entries containing a buzzword during its growth period, the earliness scores are 2, 1.698, 1.522, ..., 0.008, 0.004, 0, respectively. If a blogger posted many blog entries containing a buzzword similar to its peak time content at the early stage of its growth period, he can be regarded as a good predictor of this buzzword. Thus, we devise the formula for the prediction score of blogger blg for buzzword bw as follows:

Pdt_bw(blg) = \sum_{i=1}^{m} Earliness_bw(e_i) \cdot Similarity(e_i, ptc)    (5)

where e_i is one of the m entries containing buzzword bw that blogger blg posted during its growth period, Earliness_bw(e_i) is e_i's earliness score, and Similarity(e_i, ptc) is its content similarity to the peak time content ptc of buzzword bw. The content similarity between entry e_i and the peak time content ptc is calculated as follows:

Similarity(e_i, ptc) = \frac{|D(e_i) \cap CTW_{peak}|}{\min(|D(e_i)|, |CTW_{peak}|)}    (6)

where D(e_i) is the set of words appearing in e_i and CTW_{peak} is the set of peak time content words of the buzzword.
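A minimal sketch of formulas (4)-(6); entries are assumed to be given as (order, word-set) pairs, where order is the entry's position among all entries containing the buzzword during its growth period (our assumed layout):

    import math

    def earliness(order, total):                            # formula (4)
        return -math.log10(order / total)

    def similarity(words, peak_words):                      # formula (6)
        return len(words & peak_words) / min(len(words), len(peak_words))

    def prediction_score(blogger_entries, total, peak_words):   # formula (5)
        return sum(earliness(order, total) * similarity(words, peak_words)
                   for order, words in blogger_entries)

The base-10 logarithm matches the worked example above (100 entries give earliness scores 2, 1.698, ..., 0).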

Fig. 4. Evaluating a blogger’s prediction ability on a category

6 Calculating a Blogger's Prediction Score on a Category

A blogger’s prediction ability on a category is evaluated considering his prediction scores on the buzzwords that belong to that category. The knowledgeable bloggers with high prediction ability on a category are identified as prophetic bloggers. As prophetic blogger candidates for a category, we first select the top-k (k = 300) knowledgeable bloggers with the highest knowledge scores in this category calculated in Sect. 2. Then, we find the buzzwords in this category shown in Sect. 3. Each knowledgeable blogger’s prediction score on each buzzword can be calculated by the method described in Sect. 5. Using the prediction scores on buzzwords, we propose five methods for estimating a blogger’s prediction ability on a category. The first method counts the numbers of buzzwords that a blogger can predict. Concretely, for each buzzword we can prepare a top-k (k = 5) blogger list in which the bloggers have the highest prediction scores on it. We regard the bloggers who appear in multiple top blogger lists as prophetic bloggers on that category. In Fig. 4, blg2 is the best prophetic blogger in that category since he has successfully predicted three buzzwords in that category. blg6 , blg7 and blg8 are the next best prophetic bloggers since they predicted the next highest number of buzzwords after blg2 . By this method blg6 , blg7 and blg8 have the same rankings since the numbers of buzzwords that they can predict are identical. In order to meticulously distinguish bloggers’ prediction ability on categories, we further propose four calculation formulas. Formula 7 sums up a blogger’s prediction scores (P dtbw (blg)) on all buzzwords (bw) that belong to a category (m is the number of blog entries (C). Formula 8 introduces a factor log(m+1) m+1 containing buzzword bw), which intends to reduce the effect that a blogger posts a large number of entries only related to a specific buzzword. Formula 9 considers the buzzword coverage by introducing l/n where n is the number of buzzwords in a category and l is the number of buzzwords that the blogger can predict. Formula 10 integrates all the factors of the above formulas.  P dtbw (blg) (7) P dtC (blg) = bw∈C

 log(m + 1) P dtbw (blg) m+1 bw∈C l  P dtbw (blg) P dtC (blg) = n

P dtC (blg) =

(8) (9)

bw∈C

P dtC (blg) =

l  log(m + 1) P dtbw (blg) n m+1

(10)

bw∈C
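The four formulas differ only in two optional factors, so they can share one sketch. The data layout and the reading of l as the number of buzzwords with a positive score are our assumptions for illustration.

    import math

    def pdt_category(scores, m_of, n, damp=False, coverage=False):
        """scores: buzzword -> Pdt_bw(blg); m_of: buzzword -> number of the
        blogger's entries containing it; n: buzzwords in the category."""
        total = sum(s * (math.log(m_of[bw] + 1) / (m_of[bw] + 1) if damp else 1.0)
                    for bw, s in scores.items())
        if coverage:
            l = sum(1 for s in scores.values() if s > 0)
            total *= l / n
        return total

Formula 7 is pdt_category(scores, m_of, n); Formula 8 adds damp=True; Formula 9 adds coverage=True; Formula 10 sets both.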

7 Experimental Evaluation

In the experiment, we select three categories: Movie, TV program, and Smartphone. For each category, ten buzzwords are manually listed. Based on the method described in Sect. 4.2, we calculate the growth period for each buzzword. The growth periods of different buzzwords differ, varying from about six months to more than one year. For each of the three categories, the top 300 bloggers with the highest knowledge scores are first extracted. For each of the ten buzzwords in each category, the prediction scores of the 300 bloggers on the buzzword are calculated using the method described in Sect. 5, and the ranking list of the top five bloggers with the highest prediction scores on each buzzword is generated. We investigate whether there exist bloggers who appear in more than two buzzwords' top blogger lists for each category. We find that the proposed approach detects eight, seven, and six bloggers who appear in more than two top blogger lists for the three categories, respectively. We ask two evaluators to browse the entries posted by these bloggers and judge whether they are prophetic bloggers. The judgment criterion is whether the bloggers have posted some entries that contain buzzwords' peak time content words before the peak. The bloggers who are regarded as prophetic bloggers by both evaluators are used as the true prophetic bloggers for the evaluation of identification accuracy.

We compare the accuracy of the top-k bloggers ranked by the proposed approach with two baseline methods, one based on the numbers of entries containing any of the ten buzzwords in each category and the other based on bloggers' knowledge scores. Table 1 shows the accuracies of the baseline methods. Since the two methods based on entry numbers and knowledge scores do not take temporal features and prediction abilities into account, their accuracies for identifying prophetic bloggers, averaged over the three categories, are low (25.6% and 15.1%). From Table 2, we can observe that our five proposed methods achieve average accuracies from 42.9% to 52.6%, outperforming the two baseline methods. Among the five proposed methods, the proposal using Formula 7, which sums up the prediction scores on all buzzwords in a category, does not work better than the other proposals. The proposal using Formula 8, which reduces the effect of the numbers of blog entries, performs better than the other proposals for one category, TV program. The proposal using Formula 9, which considers the buzzword coverage, performs better than Formulas 7 and 8. The highest accuracies are observed for the basic proposal, which considers the numbers of buzzwords that a blogger can predict, and the proposal using Formula 10, which considers both the effect of the numbers of blog entries and the buzzword coverage. Although the basic proposal and the proposal using Formula 10 provide the same accuracies in this experiment, the proposal using Formula 10 can distinguish bloggers more meticulously than the basic proposal. The basic proposal only considers the number of top blogger lists a blogger appears in, and these numbers are usually small integers, which may give many bloggers the same rankings. In contrast, the proposal using Formula 10 calculates bloggers' prediction scores on a category in real numbers, which can rank bloggers better by avoiding many identical rankings.


Table 1. Accuracy of baseline methods

Category | # of bloggers | # of entries | Knowledge score
Movie | 8 | 12.5% (1/8) | 0% (0/8)
TV program | 7 | 14.3% (1/7) | 28.6% (2/7)
Smartphone | 6 | 50.0% (3/6) | 16.7% (1/6)
AVG | 7 | 25.6% | 15.1%

Table 2. Accuracy of proposed methods

Category | # of bloggers | Basic | Formula 7 | Formula 8 | Formula 9 | Formula 10
Movie | 8 | 62.5% (5/8) | 50.0% (4/8) | 37.5% (3/8) | 62.5% (5/8) | 62.5% (5/8)
TV program | 7 | 28.6% (2/7) | 28.6% (2/7) | 42.9% (3/7) | 14.3% (1/7) | 28.6% (2/7)
Smartphone | 6 | 66.7% (4/6) | 50.0% (3/6) | 50.0% (3/6) | 66.7% (4/6) | 66.7% (4/6)
AVG | 7 | 52.6% | 42.9% | 43.5% | 47.8% | 52.6%

8 Related Work

Identification of important users has been widely studied. [3] provided a survey on expert finding within an organization. [4] addressed the problem of expertise retrieval in a bibliographic network. There is also research aimed at finding important users from social media; we classify it into two types: one that extracts knowledgeable users [5,6] and the other that identifies influential users [7-9]. Different from the previous works, which focus on the expertise degree and influence degree of users, we attempt to find important users by analyzing users' buzzword prediction ability. Topic or event detection [10-13] is closely related to our work. These works motivate us to analyze the lifespan of buzzwords: the starting point of a buzzword, its peak, and the duration period after the peak. Another related line of research is popularity prediction. Future popularity is predicted for different types of data such as events [14], videos [15-17], news [18], search [19-21], tweets [22,23], and unrestricted user generated contents [24]. Although future popularity has been noticed in these studies, it is not used for finding important users. We link buzzword popularity analysis results to finding prophetic bloggers.

9 Conclusions

In this paper, we proposed an approach to find prophetic bloggers. We focused on the temporal and content features of blog data and analyzed bloggers' prediction ability on buzzwords and categories. Bloggers were evaluated on how early, how related, how often, and how in-depth they posted blog entries containing the buzzwords in a category. Multiple formulas for estimating bloggers' prediction ability were compared. The experimental results showed that our approach could extract prophetic bloggers. In the future, we will try to develop methods for identifying future buzzwords from the blog entries posted by prophetic bloggers and implement a practical system that can extract future buzzwords.

Acknowledgements. This work was partially supported by JSPS KAKENHI Grant Number #26330351.

References
1. Zhang, J., Tomonaga, S., Nakajima, S., Inagaki, Y., Nakamoto, R.: Prophetic blogger identification based on buzzword prediction ability. IJWIS 12(3), 267–291 (2016)
2. Zhang, J., Inagaki, Y., Nakamoto, R., Nakajima, S.: Estimating bloggers' prediction ability on buzzwords and categories. In: Proceedings of the International MultiConference of Engineers and Computer Scientists 2018, Hong Kong, 14–16 March 2018, Lecture Notes in Engineering and Computer Science, pp. 393–398 (2018)
3. Balog, K., Fang, Y., Rijke, M., Serdyukov, P., Si, L.: Expertise retrieval. Found. Trends Inf. Retr. 6(2–3), 127–256 (2012)
4. Hashemi, S.H., Neshati, M., Beigy, H.: Expertise retrieval in bibliographic network: a topic dominance learning approach. In: CIKM, pp. 1117–1126 (2013)
5. Bozzon, A., Brambilla, M., Ceri, S., Silvestri, M., Vesci, G.: Choosing the right crowd: expert finding in social networks. In: EDBT, pp. 637–648 (2013)
6. Guy, I., Avraham, U., Carmel, D., Ur, S., Jacovi, M., Ronen, I.: Mining expertise and interests from social media. In: WWW, pp. 515–526 (2013)
7. Bakshy, E., Hofman, J.M., Mason, W.A., Watts, D.J.: Everyone's an influencer: quantifying influence on Twitter. In: WSDM, pp. 65–74 (2011)
8. Wu, S., Hofman, J.M., Mason, W.A., Watts, D.J.: Who says what to whom on Twitter. In: WWW, pp. 705–714 (2011)
9. Singer, Y.: How to win friends and influence people, truthfully: influence maximization mechanisms for social networks. In: WSDM, pp. 733–742 (2012)
10. Asur, S., Huberman, B.A., Szabo, G., Wang, C.: Trends in social media: persistence and decay. In: ICWSM 2011 (2011)
11. Becker, H., Naaman, M., Gravano, L.: Beyond trending topics: real-world event identification on Twitter. In: ICWSM 2011 (2011)
12. Yin, H., Cui, B., Lu, H., Huang, Y., Yao, J.: A unified model for stable and temporal topic detection from social media data. In: ICDE, pp. 661–672 (2013)
13. Spina, D., Gonzalo, J., Amigo, E.: Learning similarity functions for topic detection in online reputation monitoring. In: SIGIR, pp. 527–536 (2014)
14. Zhang, X., Chen, X., Chen, Y., Wang, S., Li, Z., Xia, J.: Event detection and popularity prediction in microblogging. Neurocomputing 149, 1469–1480 (2015)
15. Figueiredo, F., Benevenuto, F., Almeida, J.M.: The tube over time: characterizing popularity growth of Youtube videos. In: WSDM, pp. 745–754 (2011)
16. Pinto, H., Almeida, J.M., Goncalves, M.A.: Using early view patterns to predict the popularity of Youtube videos. In: WSDM, pp. 365–374 (2013)
17. Li, H., Ma, X., Wang, F., Liu, J., Xu, K.: On popularity prediction of videos shared in online social networks. In: CIKM, pp. 169–178 (2013)
18. Bandari, R., Asur, S., Huberman, B.A.: The pulse of news in social media: forecasting popularity. In: ICWSM 2012 (2012)
19. Kairam, S.R., Morris, M.R., Teevan, J., Liebling, D.J., Dumais, S.T.: Towards supporting search over trending events with social media. In: ICWSM 2013 (2013)
20. Golbandi, N., Katzir, L., Koren, Y., Lempel, R.: Expediting search trend detection via prediction of query counts. In: WSDM, pp. 295–304 (2013)
21. Radinsky, K., Svore, K.M., Dumais, S.T., Shokouhi, M., Teevan, J., Bocharov, A., Horvitz, E.: Behavioral dynamics on the web: learning, modeling, and prediction. ACM Trans. Inf. Syst. 31(3), 16 (2013)
22. Hong, L., Dan, O., Davison, B.D.: Predicting popular messages in Twitter. In: WWW (Companion Volume) 2011, pp. 57–58 (2011)
23. Bian, J., Yang, Y., Chua, T.: Predicting trending messages and diffusion participants in microblogging network. In: SIGIR, pp. 537–546 (2014)
24. Ahmed, M., Spagna, S., Huici, F., Niccolini, S.: A peek into the future: predicting the evolution of popularity in user generated content. In: WSDM, pp. 607–616 (2013)

Query Generation for Web Search Based on Spatio-Temporal Features of TV Program Honoka Kakimoto1(B) , Yuanyuan Wang2(B) , Yukiko Kawai3(B) , and Kazutoshi Sumiya1(B) 1

Kwansei Gakuin University, Sanda-shi, Hyogo, Japan {dmi91695,sumiya}@kwansei.ac.jp 2 Yamaguchi University, Ube-shi, Yamaguchi, Japan [email protected] 3 Kyoto Sangyo University, Kita-ku, Kyoto, Japan [email protected]

Abstract. Recently, people have begun searching for relevant information of each scene of TV program videos with other devices such as smartphones and tablets. While they view TV programs, users’ interests change by each scene of the video. When they try to get information related to the content of the scenes, users have to input appropriate query keywords for a Web search. However, it takes users time and effort to find their requested information. Although some data-casting services suggest related information to TV programs, the related information does not synchronize enough with each scene of the videos. To solve this problem, our system proposes a novel query keyword extraction method for Web searches, based on spatio-temporal features of videos using location names in the video caption data. We first extract all location names from the closed caption, and classify them into two types: main location name and sub-location names based on the occurrence frequency and average of the interval. Next, it determines subtopics from all nouns in TV program based on the same way as the classification of location names to generate web search queries. Therefore, suitable web pages for each scene can be found based on the generated query keywords through our system. Keywords: Closed caption · Geographical metadata · Geographical relationships · Recommender system · Spatio-temporal feature · Topic extraction

1

Introduction

Recently, people have begun searching for relevant information for each scene of TV programs with other devices such as smartphones and tablets. While viewing TV programs, users' interests change with each scene of the video.

When users want to get information related to the contents of the scenes, they have to input appropriate query keywords for a Web search. However, it takes users time and effort to find the web pages they requested until they reach the relevant information. Moreover, there are certain users, including children and elderly people, who are not able to input appropriate query keywords. Further, it is difficult to search various types of information through the Web at once. Some data-casting services such as NHK Hybridcast [7] and other viewer-participation program services recommend information related to TV programs on their interfaces. However, the recommended information does not synchronize with each scene of the videos. Therefore, it is necessary to recommend information related to each scene and to users' concerns. TV programs are often associated with closed captions, and many researchers have proposed systems utilizing topics in the closed caption data of videos. In this work, we develop an automatic location-based recommendation system, using the concept of the automatic location-based image viewing system synchronized with video clips [13], by query keyword generation based on spatio-temporal features of TV programs related to tourism and travel.

2 System Overview and Related Work

2.1 System Overview

Figure 1 shows the system flow of our proposed method for generating web search queries based on location names in closed caption data. First, the system extracts location names in closed captions from an MPEG file of a TV program based on the method described in the related work section. This method classifies location names, deletes unnecessary words as outliers, and sets the maximum area that the system recommends based on the location names in the closed caption data. Second, the system selects the main keywords and sub-keywords from the TV program in two analyses, based on the time length. It then creates a web search query by combining the main keywords and sub-keywords. Third, the system searches appropriate web pages for the scenes with the search query. Finally, it recommends several web pages on the user interface that contain detailed and related information for each scene. Web pages are recommended in three tabs on the user interface, namely: go, eat, and buy. An example of the system flow is described in Fig. 2. A TV program related to the tourist spots along the Hokuriku Shinkansen Line in Fig. 3 [10] has the following flow:

(1) Extract location names: Niigata, Kanazawa, etc.
(2) Max range: Hokuriku area.
(3) Select "Hokuriku" as the main location name.
(4) Select "Hot Spring", "Skiing," etc. as subtopics.
(5) Create search query: (Skiing or Hiking) and Hokuriku.
(6) Search web pages with the queries.
(7) Recommend web pages about skiing grounds or hot springs in the Hokuriku area on the "go" tab of the user interface.


Fig. 1. System flow

2.2 Related Work

Nishizawa et al. [13] extracted the semantic structure of location names in closed caption data by utilizing Wikipedia categories, and detected relevant topics of location names within the semantic structure. In our study, we extract the semantic structure of location names in closed caption data based on their method. In addition, they proposed a location-based image viewing system synchronized with video clips, in which the system recommends images and map information related to the scenes based on location names in the closed caption data of a travel video clip. To recommend information more suited to users' interests in each scene, we recommend contents from the web pages of e-commerce, travel, and restaurant search sites. Wang et al. [14] proposed a novel automatic video reinforcing system with a media synchronization mechanism and a video reconstruction mechanism, based on a popularity rating of scenes and level-of-detail control of scenes derived from closed caption data. Their proposed system recommends web content such as YouTube video clips and images related to the scene in a video


Fig. 2. Example of system flow

clip. To recommend suitable information for a travel TV program, our proposed system recommends web pages related to tourism and local specialties. Son et al. [4] proposed a system that segments broadcasting content into semantic units, called scenes, based on its multiple characteristics. In their work, they analyzed scenes and generated keywords, topics, and stories. We focus on the keywords and topics of TV programs because we analyze TV programs related to tourism, which do not have stories. Ma et al. [9] proposed a system in which web pages related to the TV-program content are retrieved automatically in real time. Herein, we propose an automatic recommendation system for information related to a TV program based on keywords concerning tourism within the program. Broder [1] classified web search queries into three types: informational queries for acquiring some information from a web page, navigational queries for reaching a particular site, and transactional queries for performing some web-mediated activity. To generate web search queries and recommend information for viewers of a TV program, our system mainly generates informational queries. Our preliminary experiment in Sect. 4 shows that users tend to make informational queries while watching a TV program related to tourism. Okamoto et al. [6] proposed a system to bookmark scenes of a TV program and recommend web pages that describe topics in the scenes. Their proposed system generates a web search query with the term "Where is" followed by the location name. Although this is useful for users who want detailed information regarding the location, recommending web pages with various topics is better for users who do not have a specific purpose in watching a TV program. In our system, web pages with detailed information and relevant information are recommended for these users. In the system of Ercolessi et al. [8], scenes of a TV drama are analyzed based on automatic speaker diarization and automatic speech recognition. TV programs are often broadcast with closed captions, and the spelling of text data is


more accurate than automatic speech recognition because closed caption data is written by a human. Therefore, our proposed method utilizes closed caption data to generate web search queries. Yamada et al. [3] analyzed typical scenes of a TV program based on the features of the closed caption data. When a TV program is related to travel and tourism, typical scenes appear in the introductory part describing a location. It is possible to extract location names from the introduction and calculate their frequency. However, the introduction does not contain the temporal features of the TV program. Our proposed method calculates the importance of location names based on an analysis of the whole TV program.

3 Query Keyword Extraction Based on Spatio-Temporal Features

3.1 Extraction and Classification of Location Names

Our proposed system extracts location names and other keywords from all nouns in the closed captions from an MPEG file of a TV program based on the method of related work [13]. Then, it classifies location names based on two factors: the geographical distance between locations and the semantic distance between location names. First, it deletes the location names that deviate extremely from the group of other location names in a closed caption. Subsequently, we set a maximum range of recommendation with the remaining names. Second, to recommend relevant web pages effectively, the system sets a maximum range for the recommendation of web pages. We measure the semantic distances between locations by using the semantic structure based on the Wikipedia category structure for the creation of a web search query, as described in detail in Sect. 3.3.

Fig. 3. Location names and keywords related to tourism in closed caption data

3.2 Determination of Main Location Name, Sub-location Names, and Subtopics

With the keywords in closed captions, this system selects the main location name, sub-location names, and subtopics in the TV program to generate the web search query. Figure 3 shows the occurrences of location names and keywords related to tourism extracted from the closed caption data of a 20-min TV program [10]. Here, closed captions of TV commercials are not extracted. The first row of Fig. 3 shows the time sequence. The black lines show the keywords that appear periodically in the TV program, and the gray lines show other keywords. The keywords that appear periodically in closed captions are determined to be the main and sub-location names, and other keywords are determined to be subtopics based on the length of time. In this study, we rank the location names in the overall analysis. The main objective of the overall analysis is to define the location name that appears most frequently in the closed captions as the main location name, based on the frequency and the average of the interval. Furthermore, it is potentially the word that presents the main theme or the atmosphere of the TV program. Based on this analysis, the system selects one location name for recommendation. In addition, other location names in high ranks are defined as sub-location names. The main location name is defined based on the tf value by the following equation. Here, l is a location name and p is a TV program; n_{l,p} is the number of times location name l appears in the TV program p, and \sum_{s \in p} n_{s,p} is the total number of appearances of all location names in the TV program p. F is the set of tf values of all location names. The location name with the highest tf value is defined as the main location name.

tf_{l,p} = \frac{n_{l,p}}{\sum_{s \in p} n_{s,p}}  (1)

F_p = \{tf_{1,p}, tf_{2,p}, tf_{3,p}, \cdots, tf_{s,p}\}  (2)

Main\ location\ name = \max(F_p)  (3)

When the tf values of multiple location names are the same, the location name that has the highest average of the interval is defined as the main location name by the following Eq. (4). Here, C_p is the number of cuts of the TV program p, L_{l,p} is the number of cuts that contain location name l, and I_{l,p} is the number of interval cuts of location name l. Similarly, the sub-location names are defined based on the ranking of location names.

Average\ of\ interval_{l,p} = \frac{C_p - L_{l,p}}{I_{l,p}}  (4)

Subtopics are defined based on the idf value by the following equation. Here, t is a topic, p is a TV program, and C_p is the number of all cuts of the TV program p; one cut has 60 s. df_{t,p} is the number of cuts containing topic t. D is the set of idf values of each topic. The topics with the highest idf values are defined as subtopics.

idf_{t,p} = \log \frac{C_p}{df_{t,p}}  (5)


D_p = \{idf_{1,p}, idf_{2,p}, idf_{3,p}, \cdots, idf_{m,p}\}  (6)

Subtopic = \max(D_p)  (7)
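To make Eqs. (1)–(7) concrete, the following Python sketch selects the main location name and the top subtopics from per-cut keyword lists. The input format and the names cuts and location_names are our own assumptions, and the interval-based tie-breaking of Eq. (4) is omitted.

import math
from collections import Counter

def select_main_location(cuts, location_names):
    # cuts: one list of extracted nouns per 60-s cut of the TV program
    counts = Counter(w for cut in cuts for w in cut if w in location_names)
    total = sum(counts.values())
    tf = {l: n / total for l, n in counts.items()}     # Eq. (1)
    return max(tf, key=tf.get)                         # Eqs. (2)-(3)

def select_subtopics(cuts, location_names, k=3):
    C = len(cuts)                                      # number of cuts, C_p
    nouns = {w for cut in cuts for w in cut} - set(location_names)
    idf = {t: math.log(C / sum(t in cut for cut in cuts)) for t in nouns}  # Eq. (5)
    return sorted(idf, key=idf.get, reverse=True)[:k]  # Eqs. (6)-(7)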

We weight the proper nouns, location names, landmarks, and food items based on the results of the preliminary experiment (Sect. 4). Therefore, our proposed system increases the priority of the main location name to enhance the accuracy of the web search. This means that the semantic interpretation of the main location name changes based on each scene of the video. Thus, the system creates various queries for a web search. In addition, it deals with keywords that appear for a short time in the local analysis. To generate web search queries that recommend detailed information, it combines the overlapping topics. As in the overall analysis, main topics are also prioritized in the local analysis.

3.3 Generation of Web Search Query

To search web pages related to the scenes, this system creates a query in two ways based on the location names and other keywords that are classified, namely, AND search and AND-OR search. First, it generates an AND search query with the main location name and some subtopics based on the local analysis, to search for and recommend detailed information.

Main\ location\ name \wedge Subtopic  (8)

(Main\ location\ name \wedge Sub\ location\ name) \wedge Subtopic  (9)

Next, it generates an AND-OR search query with the main location name, sub-location names, and some subtopics based on the overall analysis. Here, sub-location names are in parallel relationships based on the Wikipedia category structure.

(Main\ location\ name \vee Sub\ location\ name) \wedge Subtopic  (10)
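Purely as an illustration, the queries of Eqs. (8) and (10) could be serialized into search strings as follows; the AND/OR operator syntax is an assumption and depends on the search engine being used.

def and_query(main_loc, subtopic):
    return f"{main_loc} AND {subtopic}"            # Eq. (8)

def and_or_query(main_loc, sub_locs, subtopic):
    locs = " OR ".join([main_loc] + list(sub_locs))
    return f"({locs}) AND {subtopic}"              # Eq. (10)

For example, and_or_query("Hokuriku", ["Niigata", "Kanazawa"], "Skiing") yields "(Hokuriku OR Niigata OR Kanazawa) AND Skiing".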

Finally, the system searches web pages related to each scene of a TV program from travel, e-commerce, and restaurant search sites based on the search queries. Then, the results of the search are recommended.

3.4 Recommendation of Web Pages

Web pages found with a search query are recommended on three tabs of the user interface, as shown in Fig. 4. At least three web pages are recommended on each tab. The go tab displays the web pages related to tourism and events. The buy and eat tabs display the web pages that are searched from e-commerce and restaurant search sites. Users can change the tabs based on their interests. To enhance operability, this work assumes the use of the system on a tablet. Users tap on a thumbnail image and browse the web pages in a browsing window while watching the TV program in a video player window. An example of user operation is shown in Fig. 5.


Fig. 4. User interface

Fig. 5. Example of operation

4 Preliminary Experiment

4.1 Questionnaire Survey

We conducted a questionnaire survey with several short video clips of TV programs related to tourism [5,11,12] and the group of all nouns in the closed caption data. The purpose of the survey is to analyze the trend of combinations of keywords that viewers use while watching TV programs. Each video is divided based on scene changes. To analyze the trend of search queries for information relevant to the scenes, viewers add at least one keyword to make a search query. After respondents watch several videos, they answer the following questions:

Q1: Please circle all keywords you are interested in.
Q2: Please make some web search keywords using the keywords that you circled in Q1 and at least 1 keyword that you want to add.


Fig. 6. Result of Q1

Fig. 7. Result of Q2

4.2 Result and Discussion

The result of Q1 in Fig. 6 shows that 40% of all circled nouns were location names, landmarks, and food items. In addition, more than a quarter of them were proper nouns. This shows that viewers are likely to be interested in these types of nouns. Therefore, we weight the proper nouns, location names, landmarks, and food items in the determination of location names and subtopics in Sect. 3.2. In addition, to analyze the effects of advance knowledge, we asked respondents whether or not they knew the locations and landmarks in the scenes in advance. The result shows that viewers who had prior knowledge tended to search for relevant information rather than detailed information, while respondents who did not have prior knowledge tended to search for detailed information. To recommend suitable web pages to each viewer, we reflect these results in the recommendation method. The result of Q2 in Fig. 7 shows that over 50% of the generated web search queries are for detailed search, which shows viewers' explicit needs. In addition, almost all user queries can be classified as informational queries, and only 17% of web search keywords for relevant search contain location names and landmarks. This shows the implicit needs of respondents, and suggests that recommending relevant information searched with location names and landmarks that do not appear in the closed captions is effective. This is because the aforementioned manner of making web search queries is not representative of viewers' thought processes. Therefore, we consider that recommending relevant information with location names and landmarks that


do not appear in closed captions can expand viewers' interests in the scenes. However, there is a possibility that a user may not conduct a relevant search with location names that do not appear in a program because the user is not very interested in them.

5 Conclusion

In this paper, we proposed a query keyword extraction method for Web search based on location names in the closed caption data of videos. First, our proposed system extracts subtopics related to tourism and the location names, and classifies the location names into two types, main and sub-location names, based on the appearance frequency and the average of the interval. Next, it generates a web search query by combining location names and subtopics, and suitable web pages for the scenes are found based on the generated web search query. Thus, it recommends detailed information and related information through the user interface of our proposed system. As future work, we are planning to evaluate our system with a questionnaire survey. For this, we plan to make several demonstration videos with TV programs about travel and tourism. In addition, we plan to construct a dictionary to extract appropriate keywords for TV programs related to travel and tourism. Another future direction is to recommend information with various types of media, such as SNS, videos, and reviews, to expand the users' interests.

Acknowledgements. This work was supported in part by JSPS KAKENHI Grant Number 16H01722 from the Ministry of Education, Culture, Sports, Science, and Technology of Japan.

References

1. Broder, A.: A taxonomy of web search. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Montréal, Canada, 22–27 April 2006, pp. 711–720. https://doi.org/10.1145/1124772.1124878
2. Kakimoto, H., Hayashi, T., Wang, Y., Kawai, Y., Sumiya, K.: Query keyword extraction from video caption data based on spatio-temporal features. In: Lecture Notes in Engineering and Computer Science: Proceedings of The International MultiConference of Engineers and Computer Scientists, Hong Kong, 14–16 March 2018, pp. 405–408 (2018)
3. Yamada, I., Nakada, Y., Matsui, A., Matsumoto, T., Miura, K., Sumiyoshi, H., Shibata, M., Yagi, N.: Scene detection using a large number of text features. ITE Trans. Media Technol. Appl. 1(2), 157–166 (2013). https://doi.org/10.3169/mta.1.157
4. Son, J.-W., Park, W., Lee, S.-Y., Kim, J., Kim, S.-J.: Smart media generation system for broadcasting contents. In: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Tokyo, Japan, 7–11 August 2017, pp. 1297–1300 (2017)


5. NHK Educational TV Osaka: Ability of the Ancestor, Wisdom Spring, The Third Generation Iemitsu Tokugawa, How He Inherited the Composition Well, September 2017
6. Okamoto, M., Kikuchi, M., Yamasaki, T.: One-button search extracts wider interests: an empirical study with video bookmarking search. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2008), pp. 779–780. ACM, New York (2008). https://doi.org/10.1145/1390334.1390500
7. NHK Hybridcast, in NHK Hybridcast, NHK Online. http://www.nhk.or.jp/hybridcast/online/. Accessed 7 Sept 2017
8. Ercolessi, P., Bredin, H., Sénac, C.: StoViz: story visualization of TV series. In: Proceedings of the 20th ACM International Conference on Multimedia (MM 2012), pp. 1329–1330. ACM, New York. https://doi.org/10.1145/2393347.2396468
9. Ma, Q., Tanaka, K.: WebTelop: dynamic TV content augmentation by using web pages. In: Proceedings of IEEE International Conference on Multimedia and Expo, Baltimore, USA, 6–9 July 2003, vol. 2, pp. 173–176 (2003)
10. TV TOKYO: Let's Go Somewhere. It Is Not Only Kanazawa! The Best Hokuriku Shinkansen Line Stations You Should Get Off, April 2015
11. Yomiuri Telecasting Corporation: Ten, Ichiban! The Charming Otona Trip! Sneak into the Popular Tour in Kobe, September 2017
12. Yomiuri Telecasting Corporation: Ten, Travel Concierge! Selected Gourmets of Gion, Kyoto, September 2017
13. Wang, Y., Nishizawa, M., Kawai, Y., Sumiya, K.: Location-based image viewing system synchronized with video clips. In: Proceedings of the 13th International Conference on Location Based Services, Vienna, Austria, 14–16 November 2016, pp. 233–238 (2016)
14. Wang, Y., Kawai, Y., Sumiya, K., Ishikawa, Y.: An automatic video reinforcing system based on popularity rating of scenes and level of detail controlling. In: Proceedings of the 2015 IEEE International Symposium on Multimedia, Miami, USA, 14–16 December 2015, pp. 529–534 (2015)

End to End Internet Traffic Measurement Model Based on Compressive Sampling

Indrarini Dyah Irawati¹,², Andriyan Bayu Suksmono², and Ian Joseph Matheus Edward²

¹ Telkom Applied Science School, Telkom University, Bandung, Indonesia [email protected]
² School of Electrical and Informatics, Institut Teknologi Bandung, Bandung, Indonesia {suksmono,ian}@stei.itb.ac.id

Abstract. This paper proposes a modeling technique for the measurement of end-to-end internet traffic based on Compressive Sampling (CS). Because of the increase in network capacity, direct measurement in the network becomes ineffective. Actually, we only need a small number of direct measurements to get traffic information from certain nodes; with these data, we can then estimate the entire traffic on the network using CS methods. This paper explains in detail the models and processes involved in the modeling, both mathematically and by simulations. We use the results of our previous research to support this study, where Singular Value Decomposition (SVD) is used to sparsify the traffic signal. A Gaussian distribution is employed for the measurement matrix. Finally, we reconstruct the entire traffic flows on the network using an ℓ1 algorithm. Simulation results show that the sparse vector model can be used to reconstruct end-to-end internet traffic more accurately than the sparse matrix model.

Keywords: Compressive sampling · End-to-end traffic · Internet traffic · Sparse matrix model · Sparse vector model · Traffic measurement · Traffic reconstruction

1 Introduction

Today, the growth of the Internet increases network traffic, which causes problems in network management, for example when one performs network tomography. Network tomography is network measurement using information from end-to-end nodes that is collected by routers on the network for monitoring. The quantities usually measured include delay, packet loss, and traffic. Traffic represents the amount of data distributed in the network [1]. The traffic measurements on each node are normally carried out periodically at intervals of 5–15 min. On large-scale internet networks such as Wide Area Network (WAN) topologies, measurement at each node becomes ineffective, due to the increase in resource requirements such as processor, memory, and hard disk, and can cause additional delay and congestion. This is the main reason why we approximate internet traffic on the network rather than measure it on site.



Traffic estimation techniques have been developed for a long time. In [2], a traffic estimation scheme is proposed using a Poisson distribution. The Poisson random distribution is independent, so it cannot always represent the actual traffic between sources and destinations on the network; in fact, every source and destination might have a correlation. In 2005, M. Roughan proposed a simple gravity model to estimate traffic based on Newton's law of gravity [3]. This model assumes that all of the traffic flows across the entire network, and it overcomes the problem in the previous research that there are correlations between sources and destinations when the nodes share information. However, this model has a weakness because it supposes that the traffic flowing between two nodes is proportional to the incoming and outgoing traffic on these nodes. This assumption does not agree with real conditions on the network, where packets may be lost along the network and internet routing is asymmetrical. Gravity models were then further developed into generalized gravity and tomo-gravity models [4]. In 2005, Lakhina proposed a traffic matrix estimation method using Principal Component Analysis (PCA) to overcome the problem of large traffic matrix dimensions. PCA only works with spatial structures, does not consider temporal correlations, and is not adaptive to changes in the network [1, 5]. In 2006, Donoho [6], Candes [7], and Baraniuk [8] introduced Compressive Sampling (CS) as an acquisition and reconstruction technique, provided that the sampled signal is sparse. Several studies on CS for estimating traffic on the Internet have been developed, among others those presented in [9–15]. In 2012, Nie et al. [9] proposed the Flow Sensing Reconstruction (FSR) algorithm to reconstruct end-to-end network traffic using CS. This experiment uses a random walk method to create a measurement matrix that meets the requirements of the Restricted Isometry Property (RIP). The results showed that FSR had a smaller relative error than the SRSVD reconstruction algorithm and tomo-gravity, both spatially and temporally. Roughan et al. [10] proposed a spatio-temporal model of the traffic matrix by investigating the nature of the traffic matrix using a low-rank matrix approach. This research leads to the use of the K-Nearest Neighbors (KNN) method for local interpolation on incomplete rows and the proposed Sparsity Regularized Matrix Factorization (SRMF) technique. KNN is suitable for estimating traffic matrices if the missing value is small, while SRMF is more suitable for large missing values. The combination of the SRMF and KNN algorithms is capable of reconstructing the traffic matrix with an error of less than 22% in a random missing scenario of 90%. In [11], Huibin et al. considered a reconstruction algorithm, namely Self-Similarity and Temporal Compressive Sensing (SSTCS). This method can reconstruct the traffic matrix with errors of less than 32% at a missing value of 98% for scenarios with randomly missing rows and columns. We have examined the use of CS in internet data traffic applications. A series of studies have obtained results, including comparisons between CS and non-CS methods, i.e., interpolation, for estimating internet traffic [12]. In [13], we explored CS for reconstructing missing traffic and detecting link sensitivity and time sensitivity. We also compared various sparsity techniques that are suitable for internet traffic data [14].
In [15], we combined an interpolation method and CS to fix missing traffic in order to increase accuracy. For reducing processing time, we have studied the use of serial CS and parallel CS techniques [16]. In this paper, we study a model to estimate end-to-end traffic on an internet network using CS. By using less information, we can approximate the entire


network traffic. CS has three requirements, i.e., the traffic signal must be sparse, the measurement matrix must meet the Restricted Isometry Property (RIP), and a reconstruction algorithm is needed for recovering the sparse signal. Based on the learning outcomes in [14], Singular Value Decomposition (SVD) is the most appropriate sparsifying technique for internet traffic. We use a measurement matrix generated from a binomial distribution. In this experiment, Orthogonal Matching Pursuit (OMP) is used as a reconstruction algorithm. To gain further understanding of the CS modeling scheme for estimating end-to-end internet traffic, this paper explains the processes step by step and describes simulation results using traffic information taken from the Abilene network [17].

2 Problem Statement

2.1 Internet Traffic Matrix (TM)

Internet traffic is the amount of information that flows between nodes on the network. The traffic that runs from source node i to destination node j is denoted w_{ij}. If a network consists of n nodes, there are n × n traffic values, which form the traffic matrix W = [w_{ij}], i = 1, 2, ..., n and j = 1, 2, ..., n. The TM can be expressed as a three-dimensional matrix when traffic is measured over time T, X ∈ R^{n×n×T}. We can simplify the equations by reshaping each snapshot TM into a column vector x_N, N = n × n. The three-dimensional matrix is thereby manipulated into a two-dimensional matrix X ∈ R^{N×T}, where the columns represent different times, while the rows represent the connections between nodes.
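A minimal numpy sketch of this flattening step, with dummy data in place of real measurements:

import numpy as np

n, T = 12, 288                  # placeholder sizes (12 nodes, 288 snapshots)
X3 = np.random.rand(n, n, T)    # three-dimensional TM with dummy data
X = X3.reshape(n * n, T)        # two-dimensional TM: N = n*n rows, T columns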

2.2 Compressive Sampling (CS)

The compressive sensing theorem is a new paradigm that utilizes signal sparsity in a transform domain so that the signal can be sampled below the Nyquist rate [6–8]. The main idea in CS is that the information signal shows some structure or redundancy, which can be used for the acquisition and reconstruction of the signal simultaneously. There are three conditions that allow a signal to be sampled and reconstructed perfectly using the CS technique; data structure and redundancy are often identified with sparsity. The first condition is that the signal to be sampled is sparse; the second condition is that sampling is done with a measurement matrix that meets the RIP requirements; and the third condition is that the compressed result can be recovered back to the sparse signal using a certain reconstruction algorithm. A signal is called sparse if it contains many zero elements and a few non-zero elements. A non-sparse signal can be converted into a sparse signal by a transformation. For example, traffic signals that are normally not sparse are sparse under the SVD transformation [14]. The projection or sensing matrix, also called the measurement matrix, has the function of reducing the number of samples from the sparse signal: if the sparse signal s consists of n elements, the measurement matrix reduces the number of samples into a compressed signal y consisting of m elements, with m ≪ n. As a linear equation, the relationship between the compressed signal y, the sparse signal s, and the measurement matrix A is expressed in Eq. (1).


y = A \cdot s  (1)

where y denotes a compressed signal of size m × 1, A a measurement matrix of size m × n, and s a sparse signal of size n × 1. In reference [18], it is explained that A must have the RIP property, as expressed in the following equation:

(1 - \delta_s)\,\|s\|_2 \le \|A \cdot s\|_2 \le (1 + \delta_s)\,\|s\|_2  (2)

where 0 < \delta_s \le 1. In previous studies, the elements of the measurement matrix were random values obtained from Gaussian, Bernoulli, binary, and uniform distributions [6, 19]. In other research, Roughan et al. and Huibin et al. used a routing matrix as the measurement matrix [10, 11]. The reconstruction process aims to restore the sparse signal s given the known compressed signal y and measurement matrix A. Because the size of A is m × n with m ≪ n, Eq. (1) is an under-determined linear system, which admits many solutions. In [20], Tropp provides a solution with \ell_1 minimization, which is better known as Basis Pursuit (BP). Mathematically, the BP solution is stated as follows:

\hat{s} = \arg\min \|s\|_1, \ \text{subject to} \ A \cdot s = y  (3)

One method of solving Eq. (3) is convex optimization. The solution with convex optimization is obtained by rewriting Eq. (3) as a linear program and then applying the Interior Point Method (IPM) to produce the optimum solution. Candes et al. provided the \ell_1-magic program as a solver [21].
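The authors rely on the \ell_1-magic package [21]; purely as an illustration, and not as the solver used in the paper, Eq. (3) can also be rewritten as a linear program and solved with SciPy:

import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve min ||s||_1 subject to A s = y via the standard LP reformulation."""
    m, n = A.shape
    # variables [s; t]: minimize sum(t) subject to -t <= s <= t and A s = y
    c = np.concatenate([np.zeros(n), np.ones(n)])
    I = np.eye(n)
    A_ub = np.block([[I, -I], [-I, -I]])   # s - t <= 0 and -s - t <= 0
    b_ub = np.zeros(2 * n)
    A_eq = np.hstack([A, np.zeros((m, n))])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * n + [(0, None)] * n)
    return res.x[:n]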

3 Model and Algorithms

In this experiment, we used the internet traffic data from the Abilene backbone network to analyze the performance of our model. The network comprises 12 routers and 54 links. Traffic measurements are carried out for one day every 5 min; hence, the TM size is 144 × 288. To prove that CS is able to predict all traffic values on the network, the rows in X that represent traffic between certain links are deleted, and the resulting matrix is denoted X_c. The traffic information in these rows is replaced with zero values, and it is assumed that the cut length satisfies 0 < c < N. Finally, X_c is the information available for estimation using CS, from which all network traffic values are obtained. Consider the sparse matrix X_c, which can be decomposed by SVD. The equation can be represented as follows:

X_c = U S V^T  (4)

SVD divides X_c into three matrices: U ∈ R^{N×N} and V ∈ R^{T×T} are orthogonal matrices containing the singular vectors, while S ∈ R^{N×T} is a diagonal matrix containing the singular values. The elements of S are a set of diagonal values arranged as


\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_k > 0, where k \le \min(N, T) is the rank of the matrix. Most of the elements of S are zero, so it is called a sparse matrix. To make the matrix more sparse, we can choose a rank r that satisfies r \le k \le \min(N, T). The rank reduction produces matrices with smaller dimensions, i.e., S_r ∈ R^{r×r}, U_r ∈ R^{N×r} (N ≫ r), and V_r^T ∈ R^{r×T}. The sparse S_r meets the requirements for CS. The elements of the measurement matrix A are obtained from a Gaussian distribution. The matrix A ∈ R^{m×r} consists of m rows, the number of measurements, and r columns, the rank of the matrix S_r. A is a set of column vectors A = \{a_1, \cdots, a_r\}.
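A numpy sketch of Eq. (4) and the rank reduction described above; the function name sparsify is ours:

import numpy as np

def sparsify(Xc, r):
    """Decompose Xc by SVD (Eq. 4) and keep only rank r."""
    U, sigma, Vt = np.linalg.svd(Xc, full_matrices=False)
    Sr = np.diag(sigma[:r])          # r x r diagonal (sparse) matrix
    Ur, Vtr = U[:, :r], Vt[:r, :]    # N x r and r x T, kept for reconstruction
    return Ur, Sr, Vtr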

3.1 Sparse Matrix Representation

CS resolves problems with incomplete measurements as follows [12–16]:

Y = A \cdot S  (5)

where Y is the compressed matrix of size m × r, A is the measurement matrix of size m × r, and S is the sparse matrix of size r × r. The following are the compression and reconstruction algorithm steps for the sparse matrix representation.

Compression Algorithm Steps for 2 Dimensions (Matrix)

INPUT: Rank r; an N × T traffic data matrix X; an m × r orthonormal measurement matrix A with m < r.

OUTPUT: Compression ratio CR; an m × r compressed matrix Y.

PROCEDURE:
1. Generate traffic data with lost nodes in X, obtaining X_c with size (N × T).
2. Decompose X_c by SVD, yielding U, S, and V.
3. Reduce S with rank r, obtaining S_r, with S_r ∈ R^{r×r}. For reconstruction needs, also crop U and V into U_r and V_r^T.
4. Generate A and make it orthonormal, with size m × r.
5. Only if A and S_r fulfil the RIP requirement, do the next step; otherwise, repeat the previous step.
6. Do CS acquisition on S_r: Y = A · S_r, obtaining Y with size m × r.
7. Calculate the compression ratio (CR) between Y and X.
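The following numpy sketch follows the matrix-representation steps above. The Gaussian seed, the QR-based orthonormalization, and the definition of CR as a size ratio are our assumptions, and the RIP check of step 5 is omitted.

import numpy as np

def compress_matrix(Xc, r, m, seed=0):
    rng = np.random.default_rng(seed)
    U, sigma, Vt = np.linalg.svd(Xc, full_matrices=False)   # step 2
    Sr = np.diag(sigma[:r])                                 # step 3: r x r
    Q, _ = np.linalg.qr(rng.standard_normal((r, m)))        # step 4
    A = Q.T                                                 # m x r, orthonormal rows
    Y = A @ Sr                                              # step 6: m x r acquisition
    CR = Y.size / Xc.size                                   # step 7 (assumed definition)
    return Y, A, CR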

3.2 Sparse Vector Representation

In this study, we represent the diagonal matrix S_r = [s_{ij}], with i = j = 1, \cdots, r, as a column vector s of size r² × 1. The CS process then runs according to Eq. (1). The details of our proposed compression and reconstruction algorithms for the sparse vector representation are shown as follows:

Compression Algorithm Steps for 1 Dimension (Vector)

INPUT: Rank r; an N × T traffic data matrix X; an m × r² orthonormal measurement matrix A with m < r².

OUTPUT: Compression ratio CR; an m × 1 compressed vector y.

PROCEDURE:
1. Generate traffic data with lost nodes in X, obtaining X_c with size (N × T).
2. Decompose X_c by SVD, yielding U, S, and V.
3. Reduce S with rank r, obtaining S_r, with S_r ∈ R^{r×r}. For reconstruction needs, also crop U and V into U_r and V_r^T.
4. Modify S_r into a vector s with size (r² × 1).
5. Generate A and make it orthonormal, with size m × r².
6. Only if A and s fulfil the RIP requirement, do the next step; otherwise, repeat the previous step.
7. Do CS acquisition on s: y = A · s, obtaining y with size m × 1.
8. Calculate the compression ratio (CR) between y and X.
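A corresponding sketch for the vector representation, under the same assumptions as the matrix version (QR orthonormalization, size-ratio CR, RIP check omitted):

import numpy as np

def compress_vector(Xc, r, m, seed=0):
    rng = np.random.default_rng(seed)
    U, sigma, Vt = np.linalg.svd(Xc, full_matrices=False)   # steps 2-3
    s = np.diag(sigma[:r]).reshape(-1, 1)                   # step 4: r^2 x 1 vector
    Q, _ = np.linalg.qr(rng.standard_normal((r * r, m)))    # step 5
    A = Q.T                                                 # m x r^2
    y = A @ s                                               # step 7: m x 1
    CR = y.size / s.size                                    # step 8 (assumed definition)
    return y, A, CR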


4 Simulation Results and Analysis

In this section, we evaluate the performance of the sparse matrix representation and sparse vector representation models. We compare the reconstruction results with the actual traffic. We choose the traffic on the 15th node and the 81st node for testing, with rank parameter r = 88. Figure 1 illustrates the reconstructed traffic on the 15th node without removed information. In Fig. 2, we cut 50% of the traffic information on the nodes, so that only 50% of the prior information is available for the approximation. Simulation results show that both methods can estimate the dynamic traffic changes over time. However, the estimation results of the sparse vector representation method are closer to the actual traffic than those of the sparse matrix representation method. This proves that our method is more accurate for approximating end-to-end internet traffic with limited information. In Fig. 3, the X-axis shows the number of measurements m, the right Y-axis shows the compression ratio (CR), and the left Y-axis shows the NMSE. The green line indicates the minimum CR, which is 1. The red lines illustrate NMSE and CR for vector input, while the blue lines illustrate NMSE and CR for matrix input. It can be concluded that the greater the number of measurements, the lower the CR and NMSE. Moreover, at the same measurement value, the vector representation method is superior to the matrix representation method in both the CR and NMSE parameters. For this case, we can use a number of measurements between 10 and 80 because the NMSE is still below the NMSE reference (1). Figure 4 shows the effect of the amount of information on missing nodes on the NMSE, with the X-axis indicating the number of nodes with missing information and the Y-axis illustrating the NMSE. The parameters used are rank r = 64 and m = 0.5r.
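The text does not spell out the NMSE; the sketch below assumes the common normalized-mean-squared-error definition.

import numpy as np

def nmse(x_true, x_hat):
    # normalized mean squared error (assumed definition)
    return np.sum((x_hat - x_true) ** 2) / np.sum(x_true ** 2)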


Fig. 1. Reconstructed traffic on the 15th node

Fig. 2. Reconstructed traffic on the 81st node


Fig. 3. Trade-off between compression ratio and NMSE at different numbers of measurements

Fig. 4. The influence of the number of nodes with lost information on the NMSE


The more nodes that lose information, the higher the NMSE value. Both methods can estimate end-to-end internet traffic with minimum prior information of 15%. The vector representation method has a smaller NMSE value than the matrix representation method. Figure 5 shows the effect of the number of measurements on the processing time. The X-axis indicates the number of measurements, while the Y-axis illustrates the processing time. Simulation results show that the greater the number of measurements, the longer the processing time. The sparse vector representation method has a faster processing time than the sparse matrix representation method.

Fig. 5. The influence of the number of measurements on processing time

5 Conclusion and Future Work

In this paper, we have shown that the sparse vector representation method is a novel approach that can estimate end-to-end internet traffic based on compressive sampling. This method is more accurate than the sparse matrix representation, and it also takes less time. The experimental results show that the proposed method is reliable for fixing end-to-end internet traffic. In the future, we will investigate the amount of computation and compare it with other reconstruction methods.

Acknowledgment. This work was supported by the Telkom Foundation, the Indonesia Ministry of Higher Education, and LPPM ITB. We also thank our colleagues from Telkom University and the Radio Telecommunication and Microwave Laboratory (ITB), who provided expertise and insights for this research.


References

1. Soule, A., Lakhina, A., Taft, N., Papagiannaki, K., Salamatian, K., Nucci, A., Crovella, M., Diot, C.: Traffic matrices: balancing measurements, inference and modeling. In: Proceedings of ACM SIGMETRICS International Conference on Measurement 2005, Banff, Canada, 6–10 June 2005, pp. 362–373 (2005)
2. Vardi, Y.: Network tomography: estimating source destination traffic intensities from link data. J. Am. Stat. Assoc. 91(433), 365–377 (1996)
3. Roughan, M.: First order characterization of internet traffic matrices. Invited paper at the 55th Session of the International Statistics Institute, Sydney, Australia, 5–12 April 2005. University of Adelaide (2005)
4. Zhang, Y., Roughan, M., Duffield, N., Greenberg, A.: Fast accurate computation of large-scale IP traffic matrices from link loads. ACM SIGMETRICS Perform. Eval. Rev. 31, 206–217 (2003)
5. Lakhina, A., Papagiannaki, K., Crovella, M., Diot, C., Kolaczyk, E., Taft, N.: Structural analysis of network traffic flows. In: Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems 2004, New York, USA, 10–14 June 2004, pp. 61–72 (2004)
6. Donoho, D.L.: Compressed sensing. IEEE Trans. Inf. Theory 52(4), 1289–1306 (2006)
7. Candes, E.J., Wakin, M.B.: An introduction to compressive sampling. IEEE Signal Process. Mag. 25(2), 21–30 (2008)
8. Baraniuk, R.: Compressive sensing, lecture notes. IEEE Signal Process. Mag. 24(4), 118–121 (2007)
9. Nie, L., Jiang, D., Guo, L.: A compressive sensing-based reconstruction approach to end-to-end network traffic. In: Proceedings of IEEE International Conference on Wireless Communications, Networking and Mobile Computing 2012, Shanghai, China, 21–23 September 2012, pp. 1–4 (2012)
10. Roughan, M., Zhang, Y., Willinger, W., Qiu, L.: Spatio-temporal compressive sensing and internet traffic matrices (extended version). IEEE/ACM Trans. Netw. 20(3), 662–676 (2012)
11. Zhou, H., Zhang, D., Xie, K., Wang, X.: Data reconstruction in internet traffic matrix. China Commun. 11(7), 1–12 (2012)
12. Irawati, I.D., Suksmono, A.B., Edward, I.J.M.: Low-rank internet traffic matrix estimation based on compressive sampling. Adv. Sci. Lett. 23(5), 3934–3938 (2017)
13. Irawati, I.D., Suksmono, A.B., Edward, I.J.M.: Missing internet traffic reconstruction using compressive sampling. Int. J. Commun. Netw. Inf. Secur. (IJCNIS) 9(1), 57–66 (2017)
14. Irawati, I.D., Edward, I.J.M., Suksmono, A.B.: Low-rank representation for internet traffic reconstruction using compressive sampling. J. Telecommun. Electron. Comput. Eng. 10(4), 147–152 (2018)
15. Irawati, I.D., Suksmono, A.B., Edward, I.J.M.: Local interpolated compressive sampling for internet traffic reconstruction. In: Proceedings of IEEE Regional Conference on Computer and Information Engineering (RCCIE), Vietnam, 29 November–1 December 2017, pp. 93–98 (2017)
16. Irawati, I.D., Suksmono, A.B., Edward, I.J.M.: Comparing serial and parallel compressive sensing for internet traffic matrix. In: Proceedings of The International MultiConference of Engineers and Computer Scientists, Hong Kong, 14–16 March 2018, pp. 387–392 (2018)
17. The Abilene Research Network. http://abilene.internet2.edu/. Accessed 21 Oct 2015
18. Candes, E.J.: The restricted isometry property and its implications for compressed sensing. Comptes Rendus de l'Académie des Sciences 1(9–10), 589–592 (2008)


19. Candes, E.J., Romberg, J., Tao, T.: Stable signal recovery from incomplete and inaccurate measurements. Commun. Pure Appl. Math. 59(8), 1207–1223 (2006)
20. Tropp, J.A.: Just relax: convex programming methods for identifying sparse signals in noise. IEEE Trans. Inf. Theory 52(3), 1030–1051 (2006)
21. Candes, E.J., Romberg, J.: l1-magic: Recovery of Sparse Signals via Convex Programming (2005). http://www.acm.caltech.edu/l1magic/downloads/l1magic.pdf. Accessed 13 Oct 2015

Approach to the Segmentation of Buttons from an Elevator Inside Door Image

Yung-Sheng Chen and Yu-Ching Hsu

Department of Electrical Engineering, Yuan Ze University, Taoyuan, Taiwan R.O.C. [email protected], [email protected]

Abstract. For a robot, even though its navigation in indoor environments has been studied well, its movement between different floors is still a challenging topic, since the robot should possess the ability to recognize and control the buttons on the elevator control panel for taking the elevator. In this paper, an automatic approach for segmenting buttons from an elevator inside door (EID) image is presented as a step toward the goal of recognition and control. Due to the various styles of buttons used on elevator control panels and the reflection phenomenon existing in an EID image, button segmentation is not easy. To overcome this problem, based on the edge information of the EID image, a kernel function called projection-and-checking (PAC) and some refining processes are developed for the button segmentation, which will be useful for robot vision applications. Our experiments confirm the feasibility of the proposed approach.

Keywords: Computer vision · Contour · Edge · Elevator inside door (EID) · Image segmentation · Projection and checking (PAC)

1 Introduction

Computer vision plays an important role in a robotic system, in particular considering the interaction between a robot and the environment [1]. Besides the study of the interaction between a robot and the real world, it is also very interesting to explore that between a robot and the digital world, such as the CUBot developed by Chen and Lin [2]. The CUBot possesses the abilities of perceiving the information on a computer screen [3] and manipulating the computer mouse to operate the computer like a human being; it is very suitable for investigating the eye-hand coordination mechanism. In addition, the recognition of the keyboard was also studied for its further integration into the CUBot in the near future [4]. Along this research direction, another interesting topic, the button recognition of the elevator door control, is opened here and presented in this study. Due to the prompt progress of robot technology, various robots are coming into our daily life. Consider a scenario in which a mobile robot wants to take an elevator as we usually do in our daily life; one of the most important key functions is that the buttons of the elevator should be perceived and then operated


by the robot, such as in the work performed by Klingbeil et al. [5]. The elevator inside door (EID) image containing buttons is therefore studied. Segmentation of buttons from the EID image is a necessary preprocessing step for a human-like robot to operate the elevator, and thus the goal of this study is to propose an automatic method for dealing with this topic. Part of this work was presented at IMECS 2018 [6].

Fig. 1. Elevator inside door (EID) image with (a) original (color), (b) gray, and (c) edge domain.

Given an EID-image as shown in Fig. 1(a), some properties can be observed as follows. The color of the door panel and that of the buttons are very similar and not easily distinguished. In addition, there is a serious reflection over the EID-image. Therefore, a traditional method, e.g., SIFT [7], will not be helpful in such a situation with unobvious features. Note here that the SIFT technology has been successfully applied to keyboard recognition [4], where the keyboard image possesses obvious features on each key. Besides the SIFT consideration, it is also possible to use a thinning method [8] to extract the stroke information of the characters inside a button and then recognize the button. However, before performing such a scheme, a good thresholding preprocessing should be available, which is not easy due to the mentioned issues of the reflection phenomenon and similar colors. Fortunately, the edge information appearing in the EID image provides us a cue for designing a feasible approach for locating the buttons, which will be presented in this paper. The proposed approach is depicted in Fig. 2, which serves as a guideline for our method presentation.


Fig. 2. Flowchart of the proposed approach.

2 Segmentation of Button Columns

2.1 Observation of an EID Image in Edge Domain

As mentioned previously, there exist the inevitable problems of the reflection phenomenon and similar colors in the considered EID image. By observing the original (color) EID image given in Fig. 1(a), the reflection phenomenon is very serious, and the metal color makes the button and its symbol difficult to differentiate. Fortunately, the edge information in the EID image provides us a cue for designing our approach. Therefore, the original EID image is first converted into a gray-scale one as shown in Fig. 1(b), and then the Canny edge detection method [9] is used to extract the wanted edge information as shown in Fig. 1(c), which is named the E-image for the convenience of presenting our method. Even though some unwanted edge information appears in the E-image, the appearance of edge information in each button region can be used for the button segmentation. The more the collected edge information, the higher the possibility of being a button region. Let e(x, y) be a bi-level pixel belonging to the E-image of size W × H pixels, and let x and y be the coordinates along the horizontal and vertical directions, respectively. The origin (0, 0) is located at the top-left corner of the E-image. We then have

e(x, y) = \begin{cases} 1, & \text{edge pixel (called 1-pixel)}, \\ 0, & \text{otherwise (called 0-pixel)}. \end{cases}  (1)

In order to reduce the noisy pixels in the edge image and make the appearance information sufficient, a morphological operation of three erosions and dilations with a 3 × 3 structuring element is applied to the E-image, reshown in Fig. 3(a). Along the current illustration, the obtained morphology result is named the M-image, as shown in Fig. 3(b), in which a pixel is denoted by m(x, y). Additionally, the morphological process strengthens the connections between the edge information as well. Hence, it can be easily observed that the lines close to the wanted area are more highlighted than those in the original E-image.
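A sketch of the E-image and M-image computation with OpenCV; the Canny thresholds, the input file name, and the order of the dilations and erosions (a closing here) are assumptions, since the text does not specify them.

import cv2
import numpy as np

img = cv2.imread("eid.jpg")                    # hypothetical input file
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # gray-scale image, Fig. 1(b)
E = cv2.Canny(gray, 100, 200)                  # E-image; thresholds assumed
kernel = np.ones((3, 3), np.uint8)             # 3 x 3 structuring element
M = cv2.erode(cv2.dilate(E, kernel, iterations=3), kernel, iterations=3)  # M-image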


Fig. 3. By applying the morphological process to (a) the E-image, we can obtain (b) the M-image, which not only strengthens the connections between edge information but also highlights the wanted buttons.

2.2 Projection and Checking Process

The idea adopted in constructing our algorithms is to project the accumulation of 1-pixels in the M-image onto the horizontal and vertical axes (the x-axis and y-axis, respectively) iteratively, followed by some checking process to identify a useful range (R_x = [x_1 : x_2] or R_y = [y_1 : y_2]), which can be used to form a region-of-interest (ROI) image cropped from the M-image and denoted as ROI(R_x, R_y) = ROI([x_1 : x_2], [y_1 : y_2]). For example, in this way, the original M-image can be represented as ROI([0 : W − 1], [0 : H − 1]). The kernel function in this study is thus named projection-and-checking (PAC for short).
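A numpy sketch of the two primitives on which PAC builds, the 1-pixel projection and the ROI cropping, assuming the M-image is stored as a 2-D array:

import numpy as np

def Px(M):
    # vertical projection: accumulate 1-pixels of the M-image along each column
    return (M > 0).sum(axis=0)

def crop_roi(M, Rx, Ry):
    # ROI(Rx, Ry) = ROI([x1:x2], [y1:y2]) cropped from the M-image
    (x1, x2), (y1, y2) = Rx, Ry
    return M[y1:y2 + 1, x1:x2 + 1]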


The PAC function can be further denoted as PAC_x (including P_x, as illustrated in Fig. 4, and C_x) and PAC_y (including P_y and C_y), depending on whether the process is performed along the x-axis or y-axis, respectively. An EID image is usually composed of several buttons arranged in multiple columns, such as the two columns shown in Fig. 1. Therefore, the regions of the button columns should be located first, before segmenting the buttons. In this step of our PAC processing, the input ROI-image (ROI_M) is the original M-image as given in Fig. 3(b). Since button columns are considered here, only the process PAC_x will be performed in this stage. By performing P_x, the projection plot can be obtained as depicted in Fig. 4(a). However, because the information in Fig. 4(a) is too redundant for deciding the main columns, we apply a median filter with a sliding window of size 7 × 7 to reduce the computing complexity without lowering the accuracy of localization, as Fig. 4(b) shows. The distinction among the main columns is obviously very easy for human visual perception. However, it is hard for a computer to make the right decision in the same situation. Thus, the significance of automatically deciding the threshold becomes more apparent. The following is our checking process C_x, composed of thresholding and range detection.

Thresholding: In order to remove the 1-pixel noise and preserve the main button column information, a threshold TH should be determined first and is derived as follows. Let \bar{p}_x be the mean value of p_x(i), i = 0, ..., W − 1, and let the searching scheme start at the value y = \bar{p}_x downward in the vertical direction. By incrementing y (say y') for each scan, the amount of 1-pixels (say A(y')) is accumulated and used to compute the difference \Delta w = \frac{A(y') - A(\bar{p}_x)}{A(\bar{p}_x)}. The searching process stops if \Delta w > 0.2. The threshold TH can thus be determined by the following expression.


Fig. 4. The height of the black graph represents the accumulated amount of 1-pixels in the M-image with vertical projection along the x-axis (P_x). (b) is the result of applying the median filter to P_x in (a).

TH = \frac{1}{4}(\bar{p}_x + \Delta w), \ \text{where} \ \Delta w > 0.2,  (2)

and then the thresholding process can be expressed by

P_x(i) = \begin{cases} 0, & \text{if } P_x(i) < TH, \\ P_x(i), & \text{otherwise}. \end{cases}  (3)
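A sketch of the median filtering and the thresholding of Eqs. (2) and (3); the search for \Delta w described above is omitted and its value is passed in as a parameter, and applying the 7 × 7 window as a 1-D kernel of length 7 is our assumption.

import numpy as np
from scipy.signal import medfilt

def threshold_projection(px, delta_w):
    px = medfilt(px, kernel_size=7)     # median filtering of the 1-D projection
    TH = (px.mean() + delta_w) / 4.0    # Eq. (2)
    return np.where(px < TH, 0, px)     # Eq. (3)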

Based on this formulation, the obtained TH may be depicted as in Fig. 5(a), and the thresholded result is shown in Fig. 5(b).

Range Detection: By observing the plot in Fig. 5(b), we can find some, say N_x, ranges with "black" zones, whose widths are denoted as W_x(i), i = 1, ..., N_x. Let \bar{w}_x be the mean of the W_x(i), i = 1, ..., N_x. A range having width greater than \bar{w}_x/2 is retained and used to locate the corresponding button column; the others are removed. In the current illustration, two wanted ranges, R_x(1) and R_x(2), are found as depicted in Fig. 6(a), where the number N_x is reduced to 2. Based on these two ranges, the two button columns can be cropped from the M-image as shown in Fig. 6(b) and (c), respectively.
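A numpy sketch of this range detection, treating the black zones as nonzero runs of the thresholded projection:

import numpy as np

def detect_ranges(px):
    # keep only the runs wider than half the mean run width
    nz = np.flatnonzero(px > 0)
    if nz.size == 0:
        return []
    runs = np.split(nz, np.flatnonzero(np.diff(nz) > 1) + 1)
    widths = np.array([run[-1] - run[0] + 1 for run in runs])
    return [(run[0], run[-1]) for run, w in zip(runs, widths) if w > widths.mean() / 2]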


Fig. 5. Illustrations of (a) the determined TH in red, \bar{p}_x in blue, and \Delta w in green, and (b) the thresholded result showing the candidates of the main columns.

Fig. 6. (a) Two detected ranges. (b) and (c) show the cropped button columns (ROI^1_BC and ROI^2_BC) from the M-image.

3 Segmentation of Buttons

3.1 Preliminary Segmentation

After locating the main button columns (refer to Fig. 2), we can obtain the corresponding ROI-images, ROI^1_BC and ROI^2_BC for the current illustration, as shown in Fig. 6(b) and (c). It can be found that all buttons are confined in these ROIs. Therefore, the same procedure described in Sect. 2 can be applied to these ROIs one by one for locating the wanted buttons. The projection direction is changed from vertical to horizontal. That is, the PAC_y functions, including P_y and C_y, will be performed on the ROI^i_BC, ∀i. By performing the PAC_y, the corresponding projection plots can be obtained as depicted in Fig. 7(a) and (b). For ROI^1_BC, we obtain N^1_y = 6 wanted ranges, i.e., R^1_y(j), j = 1, ..., N^1_y. For ROI^2_BC, we have N^2_y = 5 wanted ranges, i.e., R^2_y(j), j = 1, ..., N^2_y. Thus all the buttons, ROI^k_B, k = 1, ..., \sum_{i=1}^{N_x} N^i_y, can be generally located with the following loops. This refers to the block of preliminary button segmentation in the flowchart indicated in Fig. 2.

Fig. 7. Projection results of performing P_y for (a) ROI^1_BC and (b) ROI^2_BC, and (c) the corresponding zones boxed in red in the M-image.


The 11 results of the preliminary button segmentation, i.e., ROI^k_B, k = 1, ..., 11, for Fig. 7(c) are buttons '5', '4', '3', '2', '1', 'B1', '10', '9', '8', '7', and '6', respectively.

3.2 Refining Process with Frame Included

Since the x coordinates of the preliminarily located buttons are based on the main button columns as given in Fig. 7(a) and (b), they may not fit the wanted x coordinates. Such a phenomenon may also appear for the y coordinates. Therefore, a refining process should be further applied to adjust both the x and y coordinates of each preliminary button. By means of the PAC process presented in Sect. 2.2, let a preliminary ROI be the input; then the output will be the refined ROI, where TH = \bar{p}_x/2 is used in this step. For example, consider the three preliminary button ROIs, ROI^6_B (button '10'), ROI^7_B (button '9'), and ROI^8_B (button '8'), depicted in Fig. 7(c); after performing the refining process, their y projections and x projections can be obtained as shown in Fig. 8(a)–(c) and (d)–(f), respectively. Note here that the red portions are removed and thus the button ROI is modified in this refining process. Figure 9(b) shows the final refined results for all button ROIs. This refers to the block of refining by PAC in the flowchart indicated in Fig. 2.

3.3 Refining Process Based on Contour Analysis

In order to facilitate the future robot application, the character inside the button frame should be focused on. In our earlier study, such a button image was found to be similar to a Chinese seal image [10], for which a contour analysis was developed to perform seal image registration. Therefore, in this study, the contour information of a button image boxed in Fig. 9(b) is further used for analysis, so that the button frame can be removed and the character part retained for future recognition. Following the current illustrations of buttons ‘10’, ‘9’ and ‘8’ indicated in Fig. 9(b), their contours (plotted in blue) are displayed in Fig. 10.


Fig. 8. (a) and (d) are the projection results of ROI_B^6 (button ‘10’). (b) and (e) are the projection results of ROI_B^7 (button ‘9’). (c) and (f) are the projection results of ROI_B^8 (button ‘8’). Note here that the red portions are removed and thus the button's ROI is modified in this refining process.

Fig. 9. (a) shows the result after the refining process with y projections (in vertical). (b) shows the result after the refining process with x projections (in horizontal).

Let C_k be the k-th contour and cc_k be the center of C_k. In addition, let the outer frame of the button ROI be regarded as the largest contour C_0, where w and h represent its width and height, respectively. Thus a distance d_k between cc_k and C_0 may be defined as

d_k = min_{∀p∈C_0} D_E(cc_k, p),    (4)

where D_E represents the Euclidean distance between two points. By observing the distribution of all contours, the contours belonging to the characters are usually located near the center of the button ROI. In this study, a threshold is experimentally defined as TH_c = min(w/4, h/4). If a contour C_k satisfies d_k > TH_c, it belongs to the character part and will be retained. Such a refining process by contour analysis may be detailed as the procedure sketched below.
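The procedure can be illustrated with the following Python sketch. This is a hedged reconstruction, not the authors' code: OpenCV (≥ 4) is assumed for contour extraction, the ROI border is taken as C_0 for simplicity, and the contour center cc_k is approximated by the mean of the contour points.

import cv2
import numpy as np

def keep_character_contours(button_roi):
    """Keep the contours whose center-to-frame distance d_k exceeds
    TH_c = min(w/4, h/4), following Eq. (4); button_roi is a binary
    (0/255) edge image of one refined button ROI."""
    h, w = button_roi.shape
    th_c = min(w / 4.0, h / 4.0)
    # Sample the outer frame C_0 as the set of ROI border points.
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    border = (xs == 0) | (xs == w - 1) | (ys == 0) | (ys == h - 1)
    frame_pts = np.stack([xs[border], ys[border]], axis=1).astype(float)
    contours, _ = cv2.findContours(button_roi, cv2.RETR_LIST,
                                   cv2.CHAIN_APPROX_SIMPLE)
    kept = []
    for c in contours:
        cc = c.reshape(-1, 2).mean(axis=0)                  # center cc_k
        d_k = np.linalg.norm(frame_pts - cc, axis=1).min()  # Eq. (4)
        if d_k > th_c:          # near the ROI center -> character part
            kept.append(c)
    return kept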

After performing the above procedure, Fig. 10 shows the related results, where the green color represents the contour center; the yellow color marks the point ∈ C_0 nearest to the corresponding contour center; and the white color represents the final contours belonging to the character inside the button ROI. The whole segmentation result of locating the characters inside the buttons for the current illustration is finally obtained, as shown in Fig. 11(a).

Fig. 10. (a)–(c) show the contour analysis results for ROI_B^6 (button ‘10’), ROI_B^7 (button ‘9’) and ROI_B^8 (button ‘8’), respectively. The green points represent the centers of the selected contours (illustrated in white color), and their paired nearest points ∈ C_0 are highlighted in yellow color.

Fig. 11. Final segmentation results of locating characters inside button for (a) the current illustration, and (b)-(d) the other demonstrated examples.

4 Result and Discussion

Consider an EID-image. Besides the segmentation result of locating the characters inside the buttons shown in Fig. 11(a), the other examples shown in Fig. 11(b)–(d) also confirm the feasibility of the proposed approach. However, there still exist some limitations in our current approach. Firstly, the buttons in an EID-image should be arranged in proper columns or rows and not be skewed too much, as the demonstrated examples display. Secondly, the EID-image should not involve apparent textures around the buttons, since such texture information will yield unwanted edges and thus degrade the effect of applying our PAC algorithm. Figure 12 demonstrates such a failure example. In addition, the unwanted edge information will also influence the accuracy of locating the characters inside the buttons. As a result, even though under the current constraints the proposed method can segment the button columns and locate the buttons as well as the characters inside them, there is still room for improvement in the near future.

Fig. 12. Button segmentation will fail if there exist apparent textures around the buttons in an EID-image.

5 Conclusion

Based on the scheme of projection and checking, an automatic approach for button segmentation from an EID-image has been presented. In addition, to facilitate further character recognition, the range of each located button has also been refined to that of the characters inside the button by means of contour analysis and its distance computation. Even though the experiments have demonstrated the feasibility of the proposed approach, several aspects can be further improved. For example, more types of button arrangements in an EID-image should be investigated to make the current method more robust to various cases. A recognition mechanism or machine learning may be embedded into the system to accurately identify the character of a button and thus make the robot vision more practicable. It is expected that along this line of research a home-care mobile robot will be able to take the elevator from one floor to another for good service in the near future.


References
1. Chen, S., Li, Y., Kwok, N.: Active vision in robotic systems: a survey of recent developments. Int. J. Robot. Res. 30, 1343–1377 (2011)
2. Chen, Y.-S., Lin, K.-L.: CUBot: computer vision on the eye-hand coordination with a computer-using robot and its implementation. Int. J. Pattern Recognit. Artif. Intell. 32, 1855005 (2018)
3. Chen, Y.-S., Lin, K.-L.: Screen image segmentation and correction for a computer display. Int. J. Pattern Recognit. Artif. Intell. 28, 1454002 (2014)
4. Chao, M.-T., Chen, Y.-S.: Keyboard recognition from scale-invariant feature transform. In: Proceedings of IEEE International Conference on Consumer Electronics-Taiwan, pp. 207–208, Taipei, Taiwan (2017)
5. Klingbeil, E., Carpenter, B., Russakovsky, O., Ng, A.Y.: Autonomous operation of novel elevators for robot navigation. In: Proceedings of 2010 IEEE International Conference on Robotics and Automation, pp. 751–758, Anchorage, AK, USA (2010)
6. Hsu, Y.-C., Chen, Y.-S.: Button segmentation from an elevator inside door image. In: Proceedings of The International MultiConference of Engineers and Computer Scientists, Lecture Notes in Engineering and Computer Science, Hong Kong, pp. 343–347 (2018)
7. Lowe, D.G.: Object recognition from local scale-invariant features. In: Proceedings of IEEE International Conference on Computer Vision, pp. 1150–1157, Kerkyra, Greece (1999)
8. Chen, Y.-S., Chao, M.-T.: Pattern reconstructability in fully parallel thinning. J. Imaging 3, 29 (2017)
9. Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8, 679–698 (1986)
10. Chen, Y.-S.: Registration of seal images using contour analysis. In: Proceedings of 13th Scandinavian Conference on Image Analysis, LNCS, vol. 2749, pp. 255–261, Göteborg, Sweden (2003)

A Hybrid Visual Stochastic Approach to Dairy Cow Monitoring System

Thi Thi Zin¹, Pyke Tin², Ikuo Kobayashi³, and Hiromitsu Hama⁴

¹ Faculty of Engineering, University of Miyazaki, 1-1 Gakuen Kibanadai-Nishi, Miyazaki, Japan
² Center for International Relations, University of Miyazaki, Miyazaki, Japan
³ Field Science Center, Faculty of Agriculture, University of Miyazaki, Miyazaki, Japan
⁴ Osaka City University, Osaka, Japan

Abstract. In the era of the fourth industrial revolution, or Industry 4.0, together with the second information and communication technology era, the way we human beings live and act is constantly challenged by waves of new technologies such as the Internet of Things, Artificial Intelligence, Cloud Computing and so forth. These new technologies also drive the life science, academic, business and production industries toward better outcomes. Among them, one of the most challenging and innovative technologies in the life science industry is precision dairy farming, which has been pushed into a frontline topic among academia and industry farm managers. Generally speaking, precision dairy farming can be defined as the use of technologies, advanced or simple, to analyze the physical and mental behaviors of individual cows for specifying health and profitability indicators so that overall management and farm performance can be improved. In other words, precision dairy farming focuses on welfare and health care for making the farm returns optimal through the use of technologies. Here, technologies may range from daily milk recording to automatic body condition scoring for individual cows, accurate prediction of calving time, and the reporting of unusual occurrences during delivery times. In order to realize the objectives of precision dairy farming, as a fundamental step, a hybrid visual stochastic approach to a dairy cow monitoring system, which is a fusion of image technology and statistical methods, is introduced in this paper. The proposed system investigates four key areas for precision dairy farming, namely: cow identification, body condition scoring, detection of estrus behavior, and prediction of the time of the calving event. In doing so, the combination of image technology, statistical methods and stochastic models is utilized. Specifically, image processing methods are performed to detect cow activities such as standing, lying and walking in association with time, space and frequencies. The collected data are then transformed into a stochastic model of a special type of Markov Chain for the decision making process. Some experimental results are shown by using both image and statistical data collected from real life environments and some available datasets.

Keywords: Body condition scoring · Calving time prediction · Cow monitoring system · Estrus detection · Industry 4.0 · Precision dairy farming · Visual and analytical approach



1 Introduction

The era of the fourth generation of industrial revolution is beginning and the second generation of the information and communication revolution is coming in. Advances in technologies in these eras are changing the way we value, face and challenge life for a better and smoother living in each and every aspect of the world. This trend affects not only human beings but also animals, especially in the dairy farming industry, where human labor is in shortage while farm sizes are increasing, so that more technology assistance is needed. In such a situation, farm management experts are talking about precision dairy farming, which is the use of technologies, ICT in particular, for dairy farms [1–3]. Since the use of video surveillance cameras has been effective and popular for monitoring human actions and interactions, for detecting suspicious people and suspicious objects in public areas, it is worthwhile to look into their use in dairy farming for better operation and profitable management. To have successful dairy farming, the welfare and health care of each individual cow is to be monitored and analyzed. Like human beings, dairy cows also perform actions and interactions such as standing, lying, rumination, eating, drinking and mounting. Through these actions and interactions, it can be observed whether individual cows are in heat, with or without mounting, whether they are at the stage of calving, and what their body condition score patterns look like. Actually, the cow body condition score is a key indicator for the whole dairy farm to analyze individual cow health, milk production, the time for insemination, the prediction of calving time and so on. Even though there have been some studies on dairy cow monitoring systems, they still need to be improved for practical purposes [4–8]. Moreover, from the profitable dairy farming aspect, in order to optimize the lactation of individual cows, the dry-off periods need to be minimized, which requires accurately identifying when a milk cow is in heat [9]. The costs of poor performance in this aspect of the farming operation are high because of later calving, lost milk production and fewer artificially bred replacement heifers. Recent reports have estimated the annual costs to the industry at around $65 million for missed heats alone, with additional costs incurred as a result of inseminating cows when they are not in estrus. In addition, the quantity and quality of labor required for successful heat detection is an important factor for productivity gains, especially on larger farms. Such a cattle monitoring system could support cattle reproduction by improving estrus detection. This will lead to success in breeding as well as an increase in milk production. Therefore, in this paper a video-based monitoring system for cow behavior analysis is proposed by using a Stochastic Model of Special Type. In order to describe the proposed system, the rest of the paper is organized as follows. In Sect. 2, some related works are described, followed by an overview of the system architecture in Sect. 3. Then some illustrative simulation results are presented in Sect. 4. Finally, we conclude the paper in Sect. 5.


2 Related Works

Recognizing individual cow behaviors as key indicators for moving toward precision dairy farming, a tremendous amount of research has been done by various researchers on the topic of cow monitoring systems [10, 11]. However, these are still far from what we expect for precision dairy farming. For example, there remain some open and challenging problems in the area of reproduction, such as the optimal timing of artificial insemination in order to meet a one-year calving cycle. An equally important research problem is the automatic detection of a cow in heat, not only by mounting behavior but also by other activities such as standing, lying, rumination, etc. Generally, farmers tend to see a reduced duration of estrus, since milk production increases near the time of estrus, leaving a shorter time window for detecting estrus visually. In addition, the percentage of cows expressing standing or mounting heat becomes smaller, silent ovulations are difficult to detect, and the expression of estrous behavior is reduced due to confinement, leading to difficult situations for detecting estrus behavior accurately. Thus an efficient way of detecting estrus is essential to narrow down the time interval between the calving time and the next first artificial insemination (AI) time. On the other hand, it will also increase the average interval between AI services, thereby limiting the rate at which cows become pregnant. As a consequence, the problems of estrus detection and the impact of AI service need to be focused on. These problems indicate that there should be a visual intelligent monitoring system that incorporates activity monitoring as a means to associate increased physical activity with estrous behavior in cattle. The monitoring system is not only for estrus detection, but also for various functions of precision dairy farming such as cow identification, cow body condition scoring and monitoring of calving processes to avoid the loss of calves to death or disease. It should also monitor the feeding pattern of each individual cow so that its energy needs are fulfilled to ensure maximum milk yields and healthy conditions. Moreover, the collected data can be analyzed for detecting unusual behavior or unhealthy patterns such as lameness, calving difficulties and abnormal cow body condition scores [11, 12].

3 Architecture of Proposed Dairy Cow Monitoring System

The architecture of the proposed dairy cow monitoring system consists of three units, namely, Unit 1: Visual Data Collection and Preprocessing Unit, Unit 2: Image Processing and Stochastic Modeling Unit, and Unit 3: Decision Making and Output Producing Unit. The block diagram of the proposed dairy cow monitoring system is described in Fig. 1. Unit 1 is composed of three sub-units for monitoring cow identification and body condition scoring, estrus detection and the calving process. The camera for body condition scoring is set at the rotary milking parlor, that for estrus detection in the housing or field, and that for the calving process in the delivery barn. In all cases, background modeling and noise removal are performed in the preprocessing step. Then Unit 2 operates on the output results of Unit 1 to extract geometric shape features and motion features consisting of activities of standing,


lying, lying bouts, rumination, mounting and gazing. For body condition scoring, some key landmarks or anatomical points at the pin bone, hook, thigh, and tail head areas are extracted and their geometric shape features, such as angles and areas, are computed. These features are then modelled as a Markov Chain by deriving transition probabilities from the Euclidean distances among the extracted features. Alternatively, we employ a Gamma-based Markov Chain model to investigate the cow behaviors. Then, the model parameter estimation module systematically learns the set of parameters involved in the behavior model (e.g., linear regression on feature points), making use of the labeled dataset. An illustration of the Gamma probability density function is shown in Fig. 2.

Fig. 1. Architecture of the monitoring system (Unit 1: Visual Data Collection and Preprocessing Unit — cow identification, estrus detection, calving process; Unit 2: Image Processing and Stochastic Modeling Unit — geometric shape and motion feature extraction, cow behavior investigation using a Gamma-based Markov Chain model; Unit 3: Decision Making and Output Producing Unit)

Fig. 2. A sample gamma probability density function graph (μ = E(X) = 0.6667, σ = SD(X) = 0.3333, σ² = Var(X) = 0.1111)

Once we have obtained the transition matrix of the Markov Chain, a Markov classification process is performed so that we can learn which feature is contained in which class of body condition scores. By using the learned model, the body condition score of a new testing feature can be estimated. So much for the body condition scoring system. Let us consider the estrus detection process by analyzing the actions and interactions of a cow. Let N be the number of activities such as standing, lying, lying bouts, rumination, with or without bite, and so


on. Each activity is divided into three levels, high, middle and low, depending on the frequency of the activity. We can define a probability matrix C = [c_ij], where c_ij represents the probability that the i-th activity is at the j-th level, for i = 1, 2, …, N and j = 1, 2, 3. Define two square matrices A and B such that A = CCᵀ and B = CᵀC, where Cᵀ is the transpose of C. By computing Aⁿ and Bⁿ we obtain the n-step transition probabilities for each activity and each level. The value of n at which the difference between the low- and high-level probabilities is maximum is determined as the time at which the cow is in heat.

The last part of the monitoring system is to monitor the cow calving process, in order to predict the delivery time as accurately as possible and to observe whether unusual events, such as difficulties in the calving process, occur. For this purpose, the proposed monitoring system analyzes the behaviors of expectant cows during the 14 days prior to the expected due date for calving. During these days, the system collects the standing time, lying time, lying bouts and the number of transitions in every hour. This process can be modelled as a two-state Markov Chain with standing state S(t) and lying state L(t) for t = 1, 2, …. By observing the collected sequence of events, we can find the co-occurrence matrix. Let d_ij = #{n | (S(t) = i, L(t) = j)} for i, j = 1, 2. Then the transition matrix is P = [P_ij], where

P_ij = d_ij / Σ_{j=1}^{2} d_ij.

Since the transition matrix P is regular, Pⁿ reaches an equilibrium state for large n. This gives the estimated time at which the calving event occurs.
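As an illustration of this two-state model, the following Python sketch builds the co-occurrence counts, row-normalizes them into P, and powers P until it settles; the function name and the convergence tolerance are our own choices, not from the paper.

import numpy as np

def calving_transition_model(states):
    """states: hourly observations coded as 0 (standing) or 1 (lying).
    Returns P, its equilibrium P^n, and the step n at which it settled."""
    d = np.zeros((2, 2))
    for a, b in zip(states[:-1], states[1:]):
        d[a, b] += 1                          # co-occurrence counts d_ij
    P = d / d.sum(axis=1, keepdims=True)      # P_ij = d_ij / sum_j d_ij
    Pn, n = P.copy(), 1
    while n < 1000:                           # iterate P^n
        nxt = Pn @ P
        if np.abs(nxt - Pn).max() < 1e-9:     # equilibrium reached
            break
        Pn, n = nxt, n + 1
    return P, Pn, n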

4 Some Simulated Experimental Results

In precision dairy farming, monitoring cow behaviors plays an important role in establishing many different types of prediction curves, such as lactation curve patterns, the optimal time of artificial insemination, estrus behavior patterns, the changing patterns of body condition scores, and calving behaviors. These curves provide key information needed by dairy farmers. For example, the lactation curve tells us about milk yield patterns; likewise, knowing the optimal time at which a cow shows a high estrus behavior pattern, at which artificial insemination should be made, is very important for a dairy farm too. In a similar fashion, the changes of body condition scores and calving behavior are key information for establishing a modern precision dairy farm.


Before we perform the real experimental work, it is worthwhile to do some simulation work so that we can have some idea about the real nature of the experiments and what we should expect. This reduces costs, can lead to effective real life experiments, and can avoid unnecessary situations and unexpected outcomes. In such cases our illustrative simulation models would be beneficial and could be used as dairy management tools. In order to do so, we shall use three special types of Markov Models for the simulated experimental results.

A. Some Simulation Results for Cow Body Condition Scores (BCS)

Dairy cow body condition scoring is key and important in nearly every aspect of modern precision dairy farms: for the management of welfare and healthcare, milk yields and feeding, awareness of heat and reproduction peaks, smooth calving and calf saving, and, all in all, for profitability. However, the majority of dairy farmers do not do body condition scoring on a regular basis due to the lack of automation, which makes it time-consuming and very subjective. Apart from simulation work, we have done some real life environment experiments for body condition scoring systems [13–16]. There are two types of scales for body condition scoring: a 1 to 5 point scale with increment 0.25 and a 1 to 9 point system with increment 0.5. In both scaling systems, 1 represents thin cows and 4.5 (or 8) and above represents fat cows. According to these scale systems, the possible number of scores can vary from 1 to 17. In this simulation, we assume the pattern of BCS curves follows the generalized gamma probability distribution described in Eq. (1):

G(t; a, b) = b · t^(a−1) e^(−t) / s^a.    (1)

In this model, G(t; a, b) represents the weekly BCS on week t, s is the average weekly score for a particular lactation, a stands for the increment rate and b stands for the decrement rate. These parameters are estimated by using generated gamma-based random numbers. The parameters a and b are obtained from random samples of a uniform distribution on the interval [0, 1]. The simulation results are shown in Figs. 3 and 4. It can be seen that the body condition score drops below 4 but then increases until the dry period. Again, due to lactation, the BCS drops down a little and goes up again until the next calving. But we must be careful if the BCS increases over 4.5; in such a case the health condition of the cow should be checked.

Fig. 3. Conceptual graph for body condition scores (y-axis: Body Condition Scores; x-axis: Time (t))

Fig. 4. Graph of body condition scores with variable parameter (Gamma Distribution Chart; y-axis: Gamma Probability Distribution; x-axis: Time (t))

B. Visual Estrus Detection for Dairy Cows

In precision dairy farming, accurate and early detection of a dairy cow in heat is one of the most desirable concepts to be realized. In order to achieve efficient reproductive performance in a modern dairy farm, timely and accurate detection of estrus is very important. There has been a tremendous amount of research work on this topic; however, a more accurate and practical method is still required. Detecting estrus or heat in cows relies on observing the changes in behavior that occur due to coming into or being in standing heat. Although standing heat, shown through mounting by another cow, is the clearest and most obvious sign, these days farm experts have observed that a cow can be in the state of heat without a sign of mounting or being mounted. This observation is very beneficial for deciding which other behaviors to look at for estrus detection. For example, a cow in heat may be more active than in the normal situation. Typically, standing heat lasts for about 12–18 h, but some cows may stand for as short as four hours or as long as 24 h. The term “estrous cycle” refers to the whole sequence of hormonal and reproductive changes that take place from one heat period to the next. The length of the estrous cycle averages 21 days, but may vary among individuals, with cycle lengths of 17–24 days being common. For the purpose of illustration, we shall build a simulation model for estrus detection by using the cow activities of lying, standing, rumination and chewing. For each activity, we shall assume high-level, medium-level and low-level states. Specifically, we consider 8 activities during an estrus cycle. They are:

A1: Rumination and chewing with occasional bite while the cow is lying (RBL)
A2: Rumination and chewing without bite while the cow is lying (RNBL)
A3: Bite and rumination chewing at standing (RBS)
A4: No bite and rumination chewing at standing (RNBS)
A5: Bite and rumination chewing while walking (RBW)
A6: No bite and rumination chewing while walking (RNBW)
A7: Single interactions (SI)
A8: Multiple interactions (MI)

In order to create a synthetic dataset, we use gamma-probability-based random number generation with varying shape and scale parameters. We then have an 8 × 3 matrix, say C = [c_ij], where c_ij is a random number generated from the gamma probability density function. From this matrix C we deduce two matrices, of sizes 8 × 8 and 3 × 3, to form Markov transition matrices, A = CCᵀ (8 × 8) and B = CᵀC (3 × 3), after normalization. In the usual way, we calculate the stationary distributions of the Markov Chains. As an example, we have the following two stationary distributions p(A) and p(B) of A and B, respectively, for shape parameter 1 and scale parameter 1:

p(A) = [0.2135 0.1851 0.1381 0.1381 0.1033 0.0829 0.0722 0.0669]
p(B) = [0.5158 0.3034 0.1808].

By varying the shape and scale parameters we obtain the following graphs for the eight actions and for the three levels of high, medium and low, respectively. Figures 5 and 6 show that the difference between the high and low activities is large at day 1 and day 20, at which the decision can be made that the cow is in heat and that it is the optimal time for insemination, so that we can continue the process when the next heat may arise. In our previous research we detected estrus by observing mounting actions [17–19].
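For reproducibility, the construction just described can be sketched as follows; the row normalization and the eigenvector route to the stationary distributions are assumed details, since the text only states “after normalization” and “in the usual way”.

import numpy as np

def estrus_level_model(shape=1.0, scale=1.0, rng=None):
    """Build the 8x8 chain A = C·Cᵀ and the 3x3 chain B = Cᵀ·C from an
    8x3 gamma random matrix C and return their stationary distributions."""
    rng = np.random.default_rng() if rng is None else rng
    C = rng.gamma(shape, scale, size=(8, 3))

    def row_normalize(M):
        return M / M.sum(axis=1, keepdims=True)

    def stationary(P):
        # Left eigenvector for the dominant eigenvalue (1), summing to 1.
        vals, vecs = np.linalg.eig(P.T)
        v = np.real(vecs[:, np.argmax(np.real(vals))])
        return v / v.sum()

    A = row_normalize(C @ C.T)   # activity chain
    B = row_normalize(C.T @ C)   # level chain
    return stationary(A), stationary(B)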

Fig. 5. Activity graph for 8 actions (analysis by changing the shape parameter; series A1–A8; y-axis: Activities; x-axis: Time (t))

Fig. 6. Activity graph for 3 levels (analysis by changing the shape parameter; high, medium and low levels; y-axis: Activities; x-axis: Time (t))

C. Simulation for Cow Calving Time Prediction

Here we briefly outline a simulation model for cow calving time prediction. To do so, we observe the behavior patterns of an expectant cow during the 14 days before the expected due date. For simulation purposes, we assume that the counting system records the number of standing states and the number of lying states; the numbers of transitions from standing to lying and vice versa are also counted. By using the sequence of states, we establish an occurrence matrix among the states. Then a Markov Chain transition matrix is deduced. From this matrix we can calculate the n-step probability of each state. The value of n at which the transition probability is maximum is decided as the predicted time for calving [20, 21]. This will be done in future work.

5 Conclusion and Future Work

In this paper we have proposed a framework for a dairy cow monitoring system as a foundation of future precision dairy farming. We recognize that this framework is only at its infant stage. Much research on this topic needs to be done for practical purposes, in line with the fourth generation industrial revolution, or Industry 4.0. We have also pointed out some key research areas, such as: (i) an individual cow identification system, (ii) automatic estrus detection monitoring, (iii) an image-technology-based cow body condition scoring analyzer, and (iv) a stochastic model for cow calving time prediction. Despite the widespread availability of advanced technologies in engineering and other industrial fields, the adoption of these technologies in the dairy industry has been relatively slow so far. To be specific, the majority of information management systems available and utilized by many dairy farms still need to be improved. In particular, further investigation is needed of various analyses, such as estrus or heat detection, to find the optimal time at which artificial insemination could be done, and of how the body condition score pattern changes with respect to time, body weight, milk yields and so on. This proposed framework is only in its infancy; more work needs to be done theoretically as well as experimentally. These are challenging and promising research problems in dairy science and modern precision dairy farming.

Acknowledgment. This work was supported in part by SCOPE: Strategic Information and Communications R&D Promotion Program (Grant No. 172310006) and JSPS KAKENHI Grant Number 17K08066.

References
1. Clercq, M.D., Vats, A., Biel, A.: Agriculture 4.0: The Future of Farming Technology (2018)
2. Bewley, J.M.: Precision Dairy Farming: Advanced Analysis Solutions for Future Profitability (2010). https://www.researchgate.net/publication/267711814
3. Schwab, K.: The Fourth Industrial Revolution. World Economic Forum, 91–93 route de la Capite, CH-1223 Cologny/Geneva, Switzerland (2016). www.weforum.org
4. Bruijnis, M.R.N., Beerda, B., Hogeveen, H., Stassen, E.N.: Assessing the welfare impact of foot disorders in dairy cattle by a modeling approach. Animal 6(6), 962–970 (2012)
5. Smith, K., Martinez, A., Craddolph, R., Erickson, H., Andresen, D., Warren, S.: An integrated cattle health monitoring system. In: 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS06), pp. 4659–4662 (2006)
6. Rushen, J., de Passillé, A.M., von Keyserlingk, M.A.G., Weary, D.M.: The Welfare of Cattle, vol. 5. Springer Science & Business Media, Berlin (2007)
7. Jónsson, R., Blanke, M., Poulsen, N.K., Caponetti, F., Højsgaard, S.: Oestrus detection in dairy cows from activity and lying data using on-line individual models. Comput. Electron. Agric. 76(1), 6–15 (2011)
8. Roelofs, J.B., Graat, E.A.M., Mullaart, E., Soede, N.M., Voskamp-Harkema, W., Kemp, B.: Effects of insemination-ovulation interval on fertilization rates and embryo characteristics in dairy cattle. Theriogenology 66(9), 2173–2181 (2006)
9. Lopez, H., Satter, L.D., Wiltbank, M.C.: Relationship between level of milk production and estrous behavior of lactating dairy cows. Anim. Reprod. Sci. 81(3), 209–223 (2004)
10. Brizuela, C.: Use of a monitor to predict dairy cow health and estrus. Harper Adams University College Report 052, 1–2 (2010)
11. Kokin, E., Praks, J., Veermäe, I., Poikalainen, V., Vallas, M.: IceTag3D™ accelerometric device in cattle lameness detection. Agron. Res. 12(1), 223–230 (2014)
12. Pearson, R.E., Fulton, L.A., Thompson, P.D., Smith, J.W.: Three times a day milking during the first half of lactation. J. Dairy Sci. 62(12), 1941–1950 (1979)
13. Imamura, S., Zin, T.T., Kobayashi, I., Horii, Y.: Automatic evaluation of cow's body-condition-score using 3D camera. In: Proceedings of IEEE 6th Global Conference on Consumer Electronics (GCCE 2017), Nagoya, Japan, pp. 104–105, October 2017 (Student Paper Award: First Prize)
14. Lynn, N.C., Kyu, Z.M., Zin, T.T., Kobayashi, I.: Estimating body condition score of cows from images with the newly developed approach. In: Proceedings of 2017 18th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), Kanazawa, Japan, June 2017
15. Lynn, N.C., Zin, T.T., Kobayashi, I.: Automatic assessing body condition score from digital images by active shape model and multiple regression technique. In: Proceedings of The 2017 International Conference on Artificial Life and Robotics (ICAROB 2017), pp. 311–314, January 2017
16. Zin, T.T., Tin, P., Hama, H., Kobayashi, I., Horii, Y.: An automatic estimation of dairy cow body condition score using analytic geometric image features. In: Proceedings of IEEE 7th Global Conference on Consumer Electronics (GCCE 2018), Nara, Japan, pp. 740–741, October 2018
17. Zin, T.T., Kai, H., Sumi, K., Kobayashi, I., Hama, H.: Estrus detection for dairy cow using a laser range sensor. In: Proceedings of 2016 Third International Conference on Computing Measurement Control and Sensor Network (CMCSN), pp. 162–165, May 2016
18. Hirata, T., Zin, T.T., Kobayashi, I., Hama, H.: A study on estrus detection of cattle combining video image and sensor information. In: Zin, T.T., Lin, J.C.-W. (eds.) ICBDL 2018, AISC, vol. 744, pp. 267–273. Springer, Singapore (2019). https://doi.org/10.1007/978-981-13-0869-7_30
19. Mizobuchi, T., Zin, T.T., Kobayashi, I., Hama, H.: A study on detection and tracking of estrous behaviors for cattle using laser range sensor and video camera. In: Proceedings of IEEE 7th Global Conference on Consumer Electronics (GCCE 2018), Nara, Japan, pp. 742–743, October 2018
20. Zin, T.T., Tin, P., Hama, H., Kobayashi, I.: Markov chain techniques for cow behavior analysis in video-based monitoring system. In: Lecture Notes in Engineering and Computer Science: Proceedings of The International Multi-Conference of Engineers and Computer Scientists 2018, 14–16 March 2018, Hong Kong, pp. 339–342
21. Sumi, K., Zin, T.T., Kobayashi, I., Horii, Y.: A study on cow monitoring system for calving process. In: Proceedings of IEEE 6th Global Conference on Consumer Electronics (GCCE 2017), Nagoya, Japan, pp. 379–380, October 2017

Pre- and Post-survey of the Achievement Result of Novice Programming Learners - On the Basis of the Scores of Puzzle-Like Programming Game and Exams After Learning the Basic of Programming -

Tomoya Iwamoto¹, Shimpei Matsumoto¹, Shuichi Yamagishi¹, and Tomoko Kashima²

¹ Faculty of Applied Information Science, Hiroshima Institute of Technology, 2-1-1 Miyake, Saeki-ku, Hiroshima 731-5193, Japan
{md18002,s.matsumoto.gk,s.yamagishi.if}@cc.it-hiroshima.ac.jp
² Faculty of Engineering, Kindai University, 1 Takaya Umenobe, Higashi-Hiroshima, Hiroshima 739-2116, Japan

Abstract. Some research on programming education has reported that the aptitude for programming, which determines the achievement result after its learning, is strongly influenced by the learner's previous ability. In this paper, we analyze the relationship between the pre- and post-states of programming learning. Concretely, in order to estimate the pre-state of programming learning, we focus on a puzzle-like programming game and discuss the learner's ability based on its scores. The new contribution of this paper is to clarify whether the learner's pre-state can be observed in the programming game. This paper takes up the puzzle-like programming game "Algologic", which aims to let inexperienced programming learners experience the concept of algorithmic thinking. This is a simple puzzle game whose aim is to solve a given task by automatically controlling a robot. The game player designs an autonomous robot by selecting some of the instruction blocks, arranging the blocks in an appropriate order, and giving them to the robot while considering the concept of an algorithm. Before students learned the basics of programming, we conducted a test, utilizing Algologic, to determine whether the algorithms each student gave to the robot were correct or not. Likewise, after students had learned the basics of programming, we conducted a comprehension test to clarify the reachability. This paper reports the investigation result of the relationship between the comprehension of Algologic and the achievement result after learning programming. The analysis revealed that the results of the Algologic test and the achievement results after learning programming were significantly in a positive relationship.

Keywords: Algologic · Education · Game · Learning analysis · Novice learners · Pre- and post-survey · Programming



1 Introduction

Programming has been considered an essential skill for software development. In fact, most higher education faculties associated with computer science, mainly at colleges and universities, have been devoting huge effort to programming education. Faculty staff and researchers have examined various attempts to improve the learning effect of programming, and we can see reports of these efforts in academic journals. In addition, the Japanese government is now considering making programming education part of the curriculum of compulsory education. These would be firm evidence that the importance of programming education is widely recognized by society. However, usual programming instructional methods cannot afford to neglect a critical issue: bimodality depending on the learner's comprehension level. The bimodality consists of two groups: one is a group of learners who readily accept what they learn, and the other is a group of learners who reject any concept. This bimodality is mentioned as a critical problem in programming education by some researchers, and some previous research suggested possible causes of this problem [1–5]. However, it keeps occurring repeatedly even though programming education has been promoted and various learning materials have been actively created. In order to relieve the bimodality, getting learners interested is considered one of the most important things in introductory education. In fact, introductory education for programming beginners aims at three primary goals: conveying the concept of programming, conveying the fun of programming, and giving learners experience with things that are important for learning programming. As an approach to realize these goals, puzzle-like programming games or visual programming languages have been actively adopted in introductory programming education. The difference between programming games and visual programming languages is whether the goal is given in advance or not. In general, programming games provide a definite goal and constrain the means of problem-solving. On the other hand, visual programming languages only provide a means to make programming easier, and learners themselves can set goals freely. Programming games are easy to use in a lesson because their goals are clear, so they have often been used to convey the essence of programming in a short time [6, 7]. The authors have also been promoting the use of a programming game in introductory education. The programming game the authors have been introducing is "Algologic", one of the famous Japanese programming games, developed by the Japan Electronics and Information Technology Industries Association. Algologic is a puzzle game for experiencing algorithmic thinking, intended for inexperienced programmers, and the main target of this game is junior high school and high school students who are interested in computer science and want to learn its details. The authors have assumed that getting used to thinking about algorithms has a good influence on programming learning, and so have utilized Algologic at the early stages of an introductory programming class at university. Also, to check the progress of the exercise, the authors have carried out a test featuring Algologic, given in the programming lecture right after the exercise with Algologic in this class, to grasp the learners' comprehension degree and overall tendency.
There are many practical examples using programming games for programming education [6, 7], but there are few efforts to gather data on learning activities when playing a programming game and to utilize these data for the analysis of programming learners.


Investigating what kind of learning activities are performed in a programming game at the early stage of learning, and analyzing the relationship between the score of a programming game and the achievement result after learning the basics of programming, are considered to be valuable. Such research will provide useful data for enhancing programming education, such as the design of instructional methods and lecture planning, the estimation of the difficulty of an algorithm, and the extraction of learners who may have difficulty keeping up with the lecture. The aim of this paper is to investigate the relationship between the achievement degrees of Algologic and an actual programming lecture. As mentioned above, the authors made online-based quizzes featuring the rules of Algologic, gave them to students before they touched programming, and collected learning data that are useful for estimating the pre-state before learning programming. After the students learned the basic concepts of programming, the authors performed an achievement test to clarify the post-state of each student. Based on the scores of the pre- and post-tests, this paper conducted a pre- and post-survey of each student's programming level. Specifically, the authors examined each student's score on the pre-test according to the score on the post-test. The analysis result in this paper showed a positive relationship between the scores of the pre-test and the post-test. This finding suggests that students who find programming learning difficult may not have enough computational thinking skill [8–10] or enough motivation to design a procedure of problem-solving [11] at the point before learning programming.

2 Related Works

In recent years, expectations and requests for programming education have increased more than ever. Along with the growth of interest in programming, various programming games have been developed and used in lessons at colleges and private schools. RoboCode has been famous for a long time, and Minecraft and Elevator Saga have been drawing much attention recently¹. For the same reason, the development and research of visual programming languages have also been advanced [12–16]. Squeak [17] and Scratch [18–23] are representative visual programming languages. Since visual programming languages can avoid grammatical errors, they let programming learners concentrate on building and comprehending algorithms. Therefore, they would be easy-to-use programming languages for beginners. In particular, visual programming languages are very effective for the programming learning of non-English-speaking children. There are some efforts to train programming skill and also the ability to build algorithms, mainly for college students; representative examples are BlockEditor [24] and oPEN² [25]. These use a visual programming library named OpenBlocks [26] and realize a smooth transition from a conventional block-type language to a text-type language.

¹ http://www.businessinsider.com/15-free-games-that-will-help-you-learn-how-to-code2017-4/
² https://github.com/xDNCL/oPEN


Whether programming games are effective for training programming skills or not is unclear at this time. Hour of Code³ notes that many programming games are not developed for the purpose of enhancing programming skills, creativity, or problem-solving skill. In other words, general programming games aim to show players that learning computer science is approachable, useful, simple and exciting. Several examples of using programming games for programming education have been reported [6, 7, 27], but they seem to be practiced under the same policy as Hour of Code.

3 Skill Check Test Using Programming Game

We can find many studies on programming education showing the effectiveness of introducing a programming game on the basis of learners' opinions/comments collected by a questionnaire. However, there have been few studies analyzing the comprehension level of programming games themselves, and of course, we have hardly found studies discussing the condition of achievement in programming with the score of a programming game. Therefore, the authors focus on clarifying the relationship between the score of a programming game and the achievement degree of programming for college students.

3.1 Programming Game "Algologic"

Algologic is a puzzle-like game, whose player aims to find a proper procedure of processes which can bring a robot to the goal automatically by using predetermined instruction blocks. In Algologic, there are two kinds of goals: one asks the player to collect all flags set in the stage, as shown in Fig. 1, and the other asks the player to control the robot along the indicated course, as shown in Fig. 2.

³ https://hourofcode.com/us

Fig. 1. A question asking a player to collect all flags set in the stage

Fig. 2. A question asking a player to control the robot along with the indicated course

For each stage, the player picks up some of the instruction blocks given in advance, arranges them in the appropriate order, and gives them to the robot, which is then controlled automatically. The player can clear each stage if the robot satisfies its requirement. The following items give some explanations of Algologic.

• The player puts an instruction block onto the bottom of the start block by dragging and dropping. Algologic provides the following instructions as instruction blocks: front, back, left, right, rotation, repeat, and branch (only in Algologic 2). The player can set the amount of movement and the number of repetitions, from 1 to X, on an instruction block by changing the number indicated on the block with a tap operation.
• When the player clicks the start button after arranging the instruction blocks, the robot moves in accordance with the given order of the instruction blocks. If the start button is clicked while the robot is moving, the robot pauses.

Fig. 3. An example of a question in the comprehension check test


• When the player satisfies the requirement of a stage with the smallest number of instruction blocks, Algologic tells the player that his/her algorithm is the best solution. The player can solve each stage without the optimal solution, but does not receive the decoration showing that he/she designed the best solution.

3.2 Instruction with Algologic

In the authors' college, 1st-grade students learn the essence of programming without coding by using Algologic at the beginning of a basic programming class. Students do the Algologic exercises in two 90-min lectures. The authors carefully checked whether each student understood the rules of Algologic properly, and individually explained the rules over and over again if his/her understanding was inadequate. After confirming that all students understood the rules well enough, the authors conducted a test to confirm their comprehension of Algologic. This test includes some alternative quizzes featuring the rules of Algologic, and here we call it the Algologic test. The Algologic test does not require students to design and construct an algorithm from scratch, but asks them to answer "yes" or "no" as to whether a given algorithm for each stage is effective or not. An example of a question is shown in Fig. 3. Specifically, using Moodle's Quiz module, each student sent his/her answer in a two-alternative form to each question: "You can clear the given stage if the robot can take all the flags / if the robot can trace the indicated course by controlling the robot as instructed. Answer whether the given algorithm on the right side is appropriate to solve the problem or not. The algorithm does not have to be optimal." The test includes 10 quizzes, and the time limit is 10 min. To obtain data as accurate as possible, the test did not allow students to talk to each other or to use personal belongings.

4 Analysis Results

The analysis target of this paper was 1st-grade college students majoring in informatics with little programming experience. The authors collected learning log data of basic

Fig. 4. Histograms of Algologic test

Table 1. The average GPA

          1st term  2nd term
1st year    2.12      2.35
2nd year    2.03      2.22
3rd year    2.24      2.29

programming classes over three years and analyzed these data: n = 124 in the 1st year's classes, n = 109 in the 2nd year, and n = 111 in the 3rd year. These data excluded second- and higher-grade students and also students who declined to attend from the middle of the class. The 1st-grade students took the Algologic test at the beginning of this class, held in the 1st term, which is used to grasp the pre-state of the students. In their 2nd term, there was an advanced programming class, and this paper used the score of this class's examination to grasp their post-state. These two lectures targeted the C programming language. For the analysis of the relationship between the pre- and post-states of students, this paper analyzed only students who took both the 1st and 2nd term classes. Therefore, the analysis targets were n = 113 for the 1st year's class, n = 99 for the 2nd year, and n = 103 for the 3rd year. The evaluation criteria of all years are considered equivalent because the same lecturer was responsible for all years' classes, and the content of the lessons and the difficulty level of the Algologic test were comparable. There are five grade levels from 0 to 5, and this paper adopted the average of the grade levels as the GPA, excluding students who declined to attend from the middle of the class. The average GPAs of the 1st and 2nd terms in the three years are shown in Table 1. Although we can see some differences between the three years in Table 1, there was no significant difference between the averages of the GPA by Welch's t-test. Therefore, the programming ability of each group can be regarded as almost the same. Figure 4 shows histograms of the scores of the Algologic tests. The vertical axis is the relative frequency, and the horizontal axis is the score of the test, where the perfect score is 10 points. There was no significant difference between the average scores: the 1st year was 7.38, the 2nd year was 7.31, and the 3rd year was 6.88. As shown in Fig. 4, the histograms of all three years showed similar trends. Therefore, based on the values of the average GPA mentioned above, the three years' variations of academic level are considered to be almost the same. As an important point here, these histograms unveiled that about 30% of students could not obtain 7 points or more. This test did not require designing and building algorithms, but instead only predicting the behavior resulting from processing, by interpreting the predetermined instructions in order. Designing and constructing an algorithm is an essential task of programming before coding the source code. Therefore, the result of Fig. 4 suggests that about 30% of students did not have enough knowledge necessary for progressing in the intrinsic learning of programming. Figure 5 shows a box plot which denotes Q_{1/4} as the lower quartile, Q_{2/4} as the median, Q_{3/4} as the upper quartile, and values beyond Q_{1/4} − 1.5·IQR or Q_{3/4} + 1.5·IQR as outliers, where IQR = Q_{3/4} − Q_{1/4}. In Fig. 5, "game" represents the Algologic test, and "1st exam."–"3rd exam." represent the scores of the three periodic tests conducted after the Algologic test, respectively.
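The Welch's t-test used here (for the GPA comparison and for the group comparisons below) can be reproduced with SciPy as follows; the score lists are made-up stand-ins, not the study's data.

from scipy import stats

gpa_year1 = [2.1, 2.4, 1.9, 2.6, 2.2]   # illustrative values only
gpa_year2 = [2.0, 2.3, 1.8, 2.5, 2.1]

# equal_var=False selects Welch's t-test (unequal variances).
t_stat, p_value = stats.ttest_ind(gpa_year1, gpa_year2, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")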


Table 2. Correlation coefficient between Algologic test and programming exams

      Algologic  1st exam.  2nd exam.  3rd exam.
GPA     0.12       0.27       0.21       0.29

Fig. 5. Box plots of the score of 3 years

The 1st exam. and the 2nd exam. were conducted in the 1st semester, and only the 3rd exam. was conducted in the 2nd semester. These values were normalized to 10 levels. From Fig. 5, the distributions tended to be lower for the later tests in all years. Since the later exams contained advanced techniques and were more difficult, these trends would be reasonable. There were no significant differences in the distributions of the scores for all years, so we can assume that the characteristics of the students of all years, for example the level of academic ability, were the same. Here we analyzed the relationship between the achievement result after learning programming and the Algologic test. Concretely, we examined the scores of each exam according to the levels of the Algologic test by dividing all students into two groups: one, represented as the "upper group", consists of students whose Algologic test score is 8 or more, and the other, represented as the "lower group", consists of students whose Algologic test score is 7 or less. The scores of all tests were normalized to 10 grades. Figures 6, 7 and 8 show the analysis results, where the vertical axis is the score, the horizontal axis shows the exams, and each exam has two groups. As shown in Figs. 6, 7 and 8, similar trends appeared in all years, and the averages of all the lower groups were significantly lower than those of the upper groups, where the differences were statistically verified by Welch's t-test. Regarding the fact that the performance of the students in the lower group remained low, there are several possible reasons for these trends. The first possibility is that some kind of sense is indispensable to acquire the knowledge of programming [2–4], and a learner without this sense could not obtain the knowledge. The next possibility is that a learner who stumbled at the early stage of the programming class could not learn the concept of programming smoothly. A major cause of this might be that the instructional method was not suitable for them. Or perhaps they were not interested in learning itself. As shown in Table 2, the influence of basic academic ability would be low, because the correlation between the GPA and each test is low.


Fig. 6. Relationship between the scores of programming exam and the score of Algologic test (1st year)

Fig. 7. Relationship between the scores of programming exam and the score of Algologic test (2nd year)

In any case, students who lacked the ability required by the Algologic test did not obtain good performance in the full programming lessons. The results in this paper suggest that the ability to properly grasp an algorithm or a procedure of instructions, and to imagine the output for a given input, would be more important for learning programming than memorizing the programming language specification and coding source code. Of course, since the analysis results here are only from a global perspective, we cannot ignore exceptions. Concretely, there were some students whose Algologic test score was good but whose programming score was not good, and vice versa. Therefore, the authors have to perform further detailed analyses, for example, classification of students under various conditions, or analysis of students' patterns based on the kind of questions, such as simple branch, simple loop, multiple control structures, and nested structures.


Fig. 8. Relationship between the scores of programming exam and the score of Algologic test (3rd year)

5 Discussion

Firstly, the analysis results in this paper showed that about 30% of students did not have enough knowledge necessary for progressing in the intrinsic learning of programming, and this trend was the same every year. Some researchers have focused on the failure rates in introductory programming courses and conducted surveys on a world scale [29, 30]. Their surveys revealed that there were no substantial differences in the failure rates based on the programming language, grade level, country, or class size. The first experimental result, that a certain percentage of students may not understand the basic concepts of programming every year, supports the findings of their research and can be said to be a reasonable trend. In addition, the programming exam scores of students whose Algologic test score was 7 or less tended to be lower than those of the other students. Since the Algologic test evaluates the ability to properly grasp an algorithm or a procedure of instructions and to imagine the output for a given input, the analysis result suggests that the skills required by the Algologic test are essential before obtaining the knowledge to design an algorithm or a data structure. As various problems experienced by novice programming learners were identified by Robins et al. [28], it can be said that this paper was able to clarify the first barrier for novices. From the analysis result showing that students with low Algologic test scores had low performance in the programming exams, we suspected, as one possibility, that they might not have been interested in learning itself. Luxton-Reilly said that unrealistic expectations make a programming course difficult [31]. However, the Algologic exercise, which is just a game played before such expectations are formed, already showed a difference in achievement. Therefore, we cannot ignore the possibility that the learning of programming strongly depends on interest. As Kinnunen reports, the lack of motivation might be a major factor behind our experimental results [11]. As programming has been regarded as a particularly important subject in the specialized fields of higher education institutions such as universities, it has long been known that students who can never understand programming exist in every class. Such a problem is called bimodality. There are positive opinions [32] and negative opinions [33] about the bimodality. Sagisaka and Watanabe classified learners more strictly and revealed the characteristics of learners who find programming difficult [34].


strictly and revealed the characteristics of learners who find programming difficult [34]. Specifically, they confirmed that most of the students in the lowest group of programming comprehension could not understand the most basic terms and grammar of programming. In addition, they revealed that most learners in the group one level above the bottom could not fully understand conditional branches, loops, and functions, and also lacked sufficient skill in tracing source code. The existence of students who are not good at programming will become an especially serious problem when programming is integrated into the compulsory education curriculum. Therefore, it is important to investigate when and why beginning programming learners give up, and programming education needs to establish an appropriate new teaching method for them. Considering these reports, it is suggested that the programming learners in the lowest group lack even the minimum knowledge needed to properly read simple programs, let alone write general ones. The results of this paper should contribute to clarifying the characteristics and the origins of learners in the lowest group.

6 Conclusion

This paper investigated the relationship between the achievement degrees of Algologic and an actual programming lecture and reported the analysis results. The analysis showed a positive relationship between the pre-state of learning programming (the Algologic test score) and the post-state (the achievement result after learning the basics of programming). This suggests that the skills required by the Algologic test are essential before acquiring the knowledge to design an algorithm or a data structure. In addition, the analysis showed that about 30% of the students did not have enough knowledge to progress to intrinsic learning of programming, and this trend was the same every year.

Acknowledgements. This work was partly supported by the Japan Society for the Promotion of Science, KAKENHI Grant-in-Aid for Scientific Research (C), No. 16K01147 and No. 17K01164.

References

1. Matsumoto, S., Yamagishi, S., Kashima, T.: Relationship analysis between puzzle-like programming game and achievement result after learning the basic of programming. In: Lecture Notes in Engineering and Computer Science: Proceedings of The International MultiConference of Engineers and Computer Scientists 2018, 14–16 March, 2018, Hong Kong, pp. 168–171 (2018)
2. Bornat, R., Dehnadi, S.: Mental models, consistency and programming aptitude. In: Conferences in Research and Practice in Information Technology Series, vol. 78, pp. 53–61 (2008)
3. Dehnadi, S., Bornat, R.: The camel has two humps. Internal report, School of Computing, Middlesex University, UK (2006)
4. Bornat, R.: Camels and humps: a retraction (2014). http://www.eis.mdx.ac.uk/staffpages/r bornat/papers/camel hump retraction.pdf. Accessed Nov 2017
5. Lahtinen, E., Ala-Mutka, K., Jarvinen, H.: A study of the difficulties of novice programmers. ACM SIGCSE Bull. 37(3), 14–18 (2005)


6. Gomes, A., Correia, F., Abreu, P.: Types of assessing student-programming knowledge. In: Frontiers in Education Conference (FIE), 2016 IEEE, pp. 1–8 (2016)
7. Gomes, A., Santos, A., Paris, C., Martins, N.: Gamification-based e-learning strategies for computer programming education. In: Playing with Programming: A Serious Game to Start Programming, pp. 261–277. IGI Global, Pennsylvania (2017)
8. Wing, J.: Computational thinking. Commun. ACM 49(3), 33–35 (2006)
9. Brennan, K., Resnick, M.: New frameworks for studying and assessing the development of computational thinking. In: AERA 2012 (2012)
10. Google: Google's Exploring Computational Thinking. http://www.google.com/edu/computational-thinking/. Accessed 16 Nov 2017
11. Kinnunen, P., Malmi, L.: Why students drop out CS1 course? In: Proceedings of the Second International Workshop on Computing Education Research, pp. 97–108 (2006)
12. Cooper, S., Dann, W., Pausch, R.: Teaching objects-first in introductory computer science. In: Proceedings of the 34th SIGCSE Technical Symposium on Computer Science Education, pp. 191–195 (2003)
13. Cheung, J.C., Ngai, G., Chan, S.C., Lau, W.W.: Filling the gap in programming instruction: a text-enhanced graphical programming environment for junior high students. In: Proceedings of the 40th ACM Technical Symposium on Computer Science Education (2009)
14. Pasternak, E.: Visual Programming Pedagogies and Integrating Current Visual Programming Language Features. Master's thesis, Carnegie Mellon University, Robotics Institute (2009)
15. Warth, A., Yamamiya, T., Ohshima, Y., Scott, W.: Toward a more scalable end-user scripting language. In: Proceedings of the 2nd International Conference on Creating, Connecting and Collaborating through Computing 2008, pp. 172–178 (2008)
16. Google Inc.: Blockly: a visual programming editor. https://developers.google.com/blockly/. Accessed 16 Nov 2017
17. Ingalls, D., Kaehler, T., Maloney, J., Wallace, S., Kay, A.: Back to the future: the story of Squeak, a practical Smalltalk written in itself. In: Proc. ACM OOPSLA 1997, p. 318 (1997)
18. Fal, M., Cagiltay, N.: How Scratch programming may enrich engineering education. In: Proceedings of the 2nd International Engineering Education Conference, pp. 107–113 (2012)
19. Maloney, J., Burd, L., Kafai, Y., Rusk, N., Silverman, B., Resnick, M.: Scratch: a sneak preview. In: Proceedings of the 2nd International Conference on Creating, Connecting and Collaborating Through Computing, pp. 104–109 (2004)
20. Harvey, B., Monig, J.: Bringing no ceiling to Scratch: can one language serve kids and computer scientists? In: Constructionism 2010 (2010)
21. Lewis, C.: How programming environment shapes perception, learning and goals: Logo vs. Scratch. In: Proceedings of the 41st ACM Technical Symposium on Computer Science Education, pp. 346–350 (2010)
22. Ozoran, D., Cagiltay, N., Topalli, D.: Using Scratch in introduction to programming course for engineering students. In: Proceedings of the 2nd International Engineering Education Conference, pp. 125–132 (2012)
23. Scratch Team, Lifelong Kindergarten Group, MIT Media Lab: Scratch - imagine, program, share. https://scratch.mit.edu. Accessed 16 Nov 2017
24. Matsuzawa, Y., Tanaka, Y., Sakai, S.: Measuring an impact of block-based language in introductory programming. In: Brinda, T., Mavengere, N., Haukijarvi, I., Lewin, C., Passey, D. (eds.) Stakeholders and Information Technology in Education, SaITE 2016. IFIP Advances in Information and Communication Technology, vol. 493. Springer (2016)
25. Nishida, T., Harada, A., Yoshida, T., Nakamura, R., et al.: PEN: a programming environment for novices - overview and practical lessons. In: EdMedia: World Conference on Educational Media and Technology, pp. 4755–4760 (2008)


26. Roque, R.V.: OpenBlocks: An Extendable Framework for Graphical Block Programming Systems. Master's thesis, Electrical Engineering and Computer Sciences (2007)
27. Saga, T.: Learning programming with Algologic and Programin. J. Wakkanai Hokuseigakuen College 12, 99–111 (2012). In Japanese
28. Robins, A., Rountree, J., Rountree, N.: Learning and teaching programming: a review and discussion. Comput. Sci. Educ. 13, 137–172 (2003)
29. Bennedsen, J., Caspersen, M.: Failure rates in introductory programming. ACM SIGCSE Bull. 39(2), 32–36 (2007)
30. Watson, C., Li, F.: Failure rates in introductory programming revisited. In: Proceedings of the 2014 Conference on Innovation & Technology in Computer Science Education, pp. 39–44 (2014)
31. Luxton-Reilly, A.: Learning to program is easy. In: Proceedings of the 2016 ACM Conference on Innovation and Technology in Computer Science Education, pp. 284–289 (2016)
32. Hook, L., Eckerdal, A.: On the bimodality in an introductory programming course: an analysis of student performance factors. In: Proceedings of the 2015 International Conference on Learning and Teaching in Computing and Engineering, pp. 79–86 (2015)
33. Basnet, R., Payne, L., Doleck, T., Lemay, D., Bazelais, P.: Exploring bimodality in introductory computer science performance distributions. Eurasia J. Math. Sci. Technol. Educ. 14(10), em1591 (2018). https://doi.org/10.29333/ejmste/93190
34. Sagisaka, T., Watanabe, S.: Development and evaluation of a web-based diagnostic system for beginners programming course. J. Jpn Soc. Inf. Syst. Educ. 27(1), 29–38 (2010). In Japanese

A Proposal for an Impatience of Scoring Method for a Text-Based Smartphone Emergency Report

Yudai Higuchi1, Takayoshi Kitamura1(B), Tomoko Izumi2, and Yoshio Nakatani1

1 Graduate School of Information Science and Engineering, Ritsumeikan University, Shiga, Japan
[email protected], [email protected], [email protected]
2 Faculty of Information Science and Technology, Osaka Institute of Technology, Osaka, Japan
[email protected]

Abstract. Japan is prone to earthquakes. However, other types of disasters such as typhoons, floods, volcanic eruptions, and landslides also cause significant damage in affected areas. During severe large-scale disasters, the fire departments of affected areas are expected to provide assistance. At the time of a disaster, a huge number of reports are sent to fire departments via telephone, which saturates the telephone lines. Reporting by text through social networking services (SNSs) such as Twitter and LINE has been considered as a potential solution to this problem. However, while fire brigades who respond to a call are able to judge the urgency of the situation from the tone of the person reporting it, this cannot be accomplished from a text report. Therefore, we propose a method for judging the impatience of a caller from their behaviour while generating a text message on a smartphone.

Keywords: Communication · Disaster · Emergency call · Human behavior · Panic · Triage

1 Introduction

According to a report published in 2015 by the Central Disaster Prevention Council of the Cabinet of Japan, there is a high probability of the occurrence of a Nankai megathrust earthquake and of an earthquake that directly hits the Tokyo area. It has also been emphasized that such an earthquake would cause enormous damage to a vast area from the Kanto region (eastern area) to the Kyushu region (western area) [13]. A recent example of such a disaster is the Great East Japan Earthquake that occurred on March 11, 2011. In addition to the physical damage to the public communication network equipment caused by the earthquake and tsunami, the enormous damage that occurred over a wide area caused a surge in the number of emergency calls. It was reported that the fire department could not deal with these situations [1]. In recent years, social networking services (SNSs) have been utilized during such large-scale disasters. SNSs are services


used to build social connections and communities on the Internet. LINE [11] and Twitter [14] are examples of SNSs widely used in Japan. Internet connections are said to be relatively resilient during disasters [1]; even in the Great East Japan Earthquake, many emergency requests were submitted using SNSs, and their usefulness has been emphasized in previously published research [1]. However, voiceless emergency reports using SNSs cannot convey users' emotions such as tension and impatience. In interviews with the fire department conducted by the authors of this study, it was determined that the fire department sometimes judges the urgency of a situation from the tone of voice of the reporting person [7]. Therefore, in addition to the information conveyed by text, a system is required that can estimate the reporter's tension level from both the text and the situation at the time of input. In this study, we considered a system for judging the degree of impatience of a caller from the caller's behavior and the message creation process in an SNS. Furthermore, we examined whether this information is useful for estimating the impatience of the reporting person, by considering the shaking of their hands and body, the types of typing mistakes and their occurrence, soliloquy, and typing speed.

2 Related Studies

2.1 Method to Estimate Emotions from Input Text

Matsubayashi et al. [9] proposed a method to classify emotions in sentences posted on SNSs by using reference tweets posted on Twitter. The recorded emotions were divided into five categories: delight, anger, sorrow, pleasure, and apathy. To estimate these emotions from sentences posted on Twitter using an emotional expression dictionary, the authors created a set of training data by assigning these five emotion labels to tweets. This was done by generating feature vectors using Word2Vec [5] and classifying them with a random forest, which is suitable for large amounts of data. Although this method estimates the emotion of the author from the contents of the text, only the text information is analysed; the method does not consider whether the person intentionally input a segment of text.

2.2 Research on Acquiring Information from Input Text

Much effort has been devoted to extracting information about the editing process of sentences such as those found in e-mails. Kadono et al. [10] focused on the behavior of the sender in the writing process, extracting information that seems to be equivalent to nonverbal cues in various situations in which the sentences were created, and inferring the type of situation. The extracted information examined in their work is listed below.

• Total message creation time: The time from opening the message creation window to pressing the message send button
• Total number of characters: Number of characters included in the sent message


• Total number of keystrokes: Number of keystrokes during message creation
• Total number of keystrokes of the deletion key: Total number of keystrokes of the delete key and backspace key during message creation

Based on this information, the authors obtained the message creation time, typing speed, and correction rate, and noted that these could help infer the psychological state and situation of the sender. Although research exists on extracting information from the editing process of an e-mail at the time of text entry, no research has been done on deducing the situation of an emergency reporter.

2.3 Method to Estimate Emotions from Motions

Tamura et al. [8] proposed a system that distinguishes four emotions (neutral, sadness, joy, and anger) from walking by using a single acceleration sensor and biological motion data (movement of the body's joints). In the evaluation, participants wore a single acceleration sensor that recorded time and the three-dimensional force exerted while performing walking motions expressing neutral, sad, joyful, or angry emotions. To differentiate between these four emotions, the authors used feature quantities such as spatiality and dynamics data.

2.4 Measurement of Time Anxiety

Seiwa et al. [6] used the following sentences as a measure of time anxiety.

• It will be confusing if things do not proceed as they always do
• I feel very impatient if my job does not go well
• When tackling something, there is not enough time to panic
• I am worried that I will be in an unpredictable situation
• I get confused when my work is interrupted
• I feel upset if there is a sudden schedule change
• I cannot handle other things if I do not finish the thing that I am currently doing
• I cannot get to work unless I schedule it
• I am a person who is being blown away by time more than others
• When unexpected things happen, I do not know what to do

The authors of the study attempted to quantitatively measure anxiety and fear about time, as detailed in [6].

3 Experiments

3.1 Outline

In this study, we propose a method to estimate the degree of impatience from behaviors at the time of inputting text for an emergency report, such as the shaking of the hands and body, typing mistakes, the number of corrections, soliloquy, and typing speed. For this reason, we


conducted experiments using smartphones and determined what kind of information obtained from a smartphone can be effectively used to estimate the impatience of the reporter. We assumed a scenario in which the user uses a smartphone text-based reporting system in an emergency situation. For this purpose, an application created using Swift 3.0 [4] was installed on an Apple iPhone 6 [3]. To display emergency notifications, texts and moving images were projected on the screen of the smartphone.

3.2 Measured Data

This research uses only data from an iPhone 6; information obtained by other devices is not considered. Table 1 shows the types of data obtained and their descriptions.

Table 1. Measured data
• Shakes of the user: Acceleration information acquired from the acceleration sensor
• Total message creation time: Time from when the message creation window is opened until the send button is pressed
• Total number of characters: Total number of readable characters contained in the sent message
• Total number of keystrokes of the deletion key: Total number of keystrokes of the delete key or the backspace key during message creation
• Number of wrong characters: Number of unreadable characters in the sent message
• Achievement rate: Percentage indicating how much of the specified message content was entered
• User's own soliloquy: The content of vocalizations such as "wow" or "oh" that the user murmured during message creation, their volume, and the number of utterances

In this research, the measured data were used to calculate the character input speed, the wrong character transmission rate, the correction rate, and the soliloquy rate, as expressed by (1) to (4), together with the achievement rate. These values were calculated to eliminate variations caused by the number of created sentences. For example, when preparing a report message that conveys a lot of content, the total creation time, the total number of characters, the number of keystrokes of the deletion key, and the number of soliloquies will naturally increase, so it is not possible to determine which information is useful for deducing the degree of impatience by merely comparing these raw values.

\text{Character Input Speed} = \frac{\text{Total Number of Characters}}{\text{Total Message Creation Time}} \quad (1)

\text{Wrong Character Transmission Rate} = \frac{\text{Number of Wrong Characters}}{\text{Total Number of Characters}} \quad (2)

\text{Correction Rate} = \frac{\text{Number of Keystrokes of the Deletion Key}}{\text{Total Number of Characters}} \quad (3)

\text{Soliloquy Rate} = \frac{\text{Soliloquy Talk Time}}{\text{Total Message Creation Time}} \quad (4)
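For concreteness, the following is a minimal sketch (not taken from the paper) of how the four rates in Eqs. (1)-(4) can be computed from the measured quantities of Table 1; the function and variable names are hypothetical.

```python
# A minimal sketch computing the rates in Eqs. (1)-(4) from the
# measured quantities of Table 1; names are hypothetical.
def impatience_metrics(total_chars, creation_time, wrong_chars,
                       deletion_keystrokes, soliloquy_time):
    return {
        "input_speed": total_chars / creation_time,            # Eq. (1)
        "wrong_char_rate": wrong_chars / total_chars,          # Eq. (2)
        "correction_rate": deletion_keystrokes / total_chars,  # Eq. (3)
        "soliloquy_rate": soliloquy_time / creation_time,      # Eq. (4)
    }

# Example with illustrative values.
print(impatience_metrics(120, 60.0, 3, 8, 5.0))
```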

Furthermore, by using the difference in the degree of impatience of the reporter between the normal and the emergency process, we examine which part of the editing process is useful for judging the degree of impatience. In a normal process, the participant inputs text as usual; in an emergency process, they input text in an impatient state. We obtain useful editing process information by analyzing the differences in the character input speed, the wrong character transmission rate, the correction rate, the achievement rate, and the degree of impatience reported by the participant after inputting text in a normal and in an emergency process. To record data about the text, we used screen capture software manufactured by Apowersoft Ltd [2]. To record the content and rate of occurrence of soliloquy, we used the C930e camera manufactured by Logicool [12]. In the preliminary study, the acceleration information representing the shakes of the user during text input, as well as the soliloquy rate, could not be confirmed for each user, so they were not considered in this experiment.

3.3 Outline of Experiment 1

Figures 1 and 2 show examples of screens of the smartphone system used in Experiment 1. The input dialog on the screen is a general smartphone text dialog and does not depend on the interface of any SNS system. The test users were 15 university students majoring in informatics (13 males, 2 females). They were assigned the task of typing the contents of sentences shown on a projector into the text input area of the screen. The users were first asked to input the text without a time limit, and then they performed the same task with a time limit set based on the completion time of the first task, but longer than two hours. Also, to catch the users' attention, the screen background shifted from that in Fig. 1 to that in Fig. 2 (i.e., the yellow outer frame region) when the time limit approached. The users were asked to estimate their impatience as a percentage after each task.

3.4 Results of Experiment 1

The average difference between the degrees of impatience obtained from the normal and emergency processes was 35.67%. The average difference in character input speed was 0.03 [characters/second]. The average difference in wrong character transmission rate was −0.001 [number of wrong characters/total number of characters]. The average difference in correction rate was 0.01 [total number of keystrokes of the deletion key/total number of characters], and the average achievement rate was 97.03%. In addition, a multiple regression analysis was performed using the difference in the degree of impatience as the dependent variable and


Fig. 1. Screenshot of the system used in Experiment 1 (Neutral status)

Fig. 2. Screenshot of the system used in Experiment 1 (Emergency status)

the differences in character input speed, wrong character transmission rate, and correction rate as explanatory variables. As shown in Fig. 3, we found a relationship between the difference in the degree of impatience between the two processes and both the character input speed and the correction rate of the message. From these results, we can conclude that as the degree of impatience increases, the correction rate and the character input speed decrease.
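As an illustration of this kind of analysis, the following is a minimal sketch of an ordinary least squares multiple regression with NumPy; the numbers are illustrative placeholders, not the experimental values, and the paper does not state which software was used.

```python
# A minimal OLS sketch: regress the impatience difference on the
# per-participant differences in input speed and correction rate.
# All data below are placeholders for illustration only.
import numpy as np

impatience_diff = np.array([40.0, 25.0, 35.0, 50.0, 30.0])      # dependent variable
input_speed_diff = np.array([-0.10, 0.02, -0.05, -0.20, 0.01])
correction_rate_diff = np.array([-0.02, 0.01, -0.01, -0.04, 0.00])

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(input_speed_diff),
                     input_speed_diff, correction_rate_diff])
coef, *_ = np.linalg.lstsq(X, impatience_diff, rcond=None)
print("intercept, speed coef, correction coef:", coef)
```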


Fig. 3. Result of the multiple regression analysis for Experiment 1

3.5 Outline of Experiment 2

Figures 4 and 5 show examples of screens of the smartphone system used in Experiment 2. Compared to the system used in Experiment 1, this system is more similar to the text input screen of a generic smartphone. In Experiment 2, in order to create an experience more akin to that in which emergency reports are made, the participants freely described the contents of the report while watching a video within a specified time limit. The video reproduced a situation in which a person became trapped by furniture and could not move. To increase the users' impatience, an alert indicating that the remaining battery level was decreasing was displayed twice during the experiment. Furthermore, in order to restrict the degree of freedom of the text dialog contents, an interactive format in which an artificial agent interprets the situation was adopted. The questions posed by the artificial agent were as follows.

Fig. 4. Screenshot of the system used in Experiment 2 (reporting)


Fig. 5. Screenshot of the system used in Experiment 2 (battery alert)

• Is it a fire or an emergency?
• What's wrong?
• How many people need help?
• Please tell me more about the symptoms
• What are the ages of those who want help?
• Where are you?
• On which floor are you?

In the normal task, each message entered by the user did not have a time limit. As in Experiment 1, the users were asked to estimate their impatience as a percentage after each task. The participants were 20 university students majoring in informatics (17 males and 3 females).

3.6 Results of Experiment 2

The average difference between the impatience reported in the normal and the emergency process was 26.78%. The average difference in character input speed was −0.33 [characters/second]. The average difference in wrong character transmission rate was 0.040 [number of wrong characters/total number of characters]. The average difference in correction rate was 0.15 [total number of keystrokes of the deletion key/total number of characters]. Figure 6 shows the difference between the users' impatience in the normal and the emergency process. We performed a multiple regression analysis using the difference in the degree of impatience as the dependent variable and the character input speed, the wrong character transmission rate, and the correction rate as explanatory variables, as shown in Fig. 7. We found a correlation between the difference in the degree of impatience between the processes and both the character input speed and the wrong character transmission rate. From these results, we concluded that as the degree of impatience rises, the wrong character transmission rate and the character input speed decrease. We presume


Fig. 6. Chart showing the level of impatience in a normal and an emergency process

Fig. 7. Result of multiple regression analysis in Experiment 2

that the reason we found no correlation with the correction rate is that, when typing freely, the user is less likely to press the backspace key than when reproducing a specified sentence.

4 Conclusion

In this paper, we have described experiments conducted to obtain data useful for evaluating the degree of impatience, in order to create a method for judging this value from the text typing process. Through two experiments, we have found that there exists a measurable difference in the degree of impatience between emergency and normal situations, and that there exists a relationship between this value and both the character input speed and the wrong character transmission rate. In future work, we plan to increase the number of test participants to improve our analysis.

Acknowledgements. This work was supported by JSPS KAKENHI Grant Number JP16K21484.


References

1. A study meeting on the possibility of emergency calling by social networking service at the time of large-scale disaster: the report of a study meeting on the possibility of emergency calling by social networking service at the time of large-scale disaster. Available via DIALOG. http://www.fdma.go.jp/neuter/topics/houdou/h25/2503/250327 1houdou/02 houkokusho.pdf. Accessed 4 Oct 2018
2. Apowersoft Ltd.: iPad/iPhone recorder. Available via DIALOG. https://www.apple.com/jp/iphone/. Accessed 4 Oct 2018
3. Apple Inc.: iPhone. Available via DIALOG. https://www.apple.com/jp/iphone/. Accessed 4 Oct 2018
4. Apple Inc.: Swift. Available via DIALOG. https://www.apple.com/jp/swift/. Accessed 4 Oct 2018
5. Google Inc.: Word2Vec. Available via DIALOG. https://code.google.com/archive/p/word2vec/. Accessed 4 Oct 2018
6. Seiwa, H., Uchida, N.: Measurement of anxiety about time. Memoirs of the Faculty of Integrated Arts and Sciences III, Studies in Information and Behavior Sciences (1992). https://doi.org/10.15027/30262
7. Higuchi, Y., Kitamura, T., Izumi, T., Nakatani, Y.: A proposal of a scoring method of impatience with a text-based emergency call to fire department. In: Lecture Notes in Engineering and Computer Science: Proceedings of The International Multi Conference of Engineers and Computer Scientists 2018, 14–16 March, 2018, Hong Kong, pp. 202–206 (2018)
8. Tamura, H., Maeda, T., Ishii, M., Tanno, K., Tang, Z.: A study of the emotion discrimination of the walking style using biological motion. In: FSS (2010). https://doi.org/10.14864/fss.26.0.127.0
9. Matsubayashi, K., et al.: An emotion estimation method from Twitter tweets and the application. 78th National Conv. IPSJ 50(1), 254–267 (2016)
10. Kadono, K., Nishimoto, K.: A mail system that conveys information on editing process of a mail as implied messages. Trans. IPSJ 50(1), 254–267 (2009)
11. LINE Corporation: LINE. Available via DIALOG. https://line.me/ja/. Accessed 4 Oct 2018
12. Logicool Co Ltd.: C930E. Available via DIALOG. https://www.logicool.co.jp/ja-jp/product/c930e-webcam. Accessed 4 Oct 2018
13. The Central Disaster Prevention Council of Japan: Measures for Nankai megathrust earthquake. Available via DIALOG. http://www.bousai.go.jp/jishin/nankai/taisaku wg/pdf/20130528 honbun.pdf. Accessed 4 Oct 2018
14. Twitter, Inc.: Twitter. Available via DIALOG. https://twitter.com/?lang=ja. Accessed 4 Oct 2018

Gazing Point Comparison Between Expert and Beginner DJs for Acquisition of Basic Skills

Kazuhiro Minami1, Takayoshi Kitamura2, Tomoko Izumi3, and Yoshio Nakatani2

1 Graduate School of Information Science and Technology, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu-shi, Shiga 525-8577, Japan
[email protected]
2 College of Information Science and Technology, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu-shi, Shiga 525-8577, Japan
[email protected], [email protected]
3 Information Science and Technology, Osaka Institute of Technology, 1-79-1 Kitayama, Hirakata, Osaka 573-0196, Japan
[email protected]

Abstract. In recent years, the number of people interested in club music has increased, and the demand for equipment and schools for new disc jockeys (DJs) has grown. However, in the small Japanese DJ industry there are only a few skilled instructors who can help beginners improve their skills. In this research, we propose that beginners can improve their skills effectively if they can understand the intentions and actions of experts. We created a system that records an observer's viewpoint in a situation similar to an actual DJ performance, using a head mounted display with a high sense of immersion. We found differences between the gazing points of experts and beginners.

 Learning support  Music

1 Introduction In the fields of sports and music, beginners commonly imitate and repeat the behavior of experts in order to acquire some skills. However, in some fields that are not well known in Japan, the opportunities to develop teaching methods are also low because the population of stakeholders is small. This applies to acquiring skills on DJ playing. DJs have the skills to select music from sound sources (i.e., CDs or records) and perform smooth and exhilarating transitions between different tracks. Club music gained popularity in recent years in Japan. In addition, the demand for equipment and DJ schools has increased as well. DJs have become familiar to the taste of people in Japan in the past few years. However, many beginners have to learn basic skills on their own because DJ schools are located only in urban areas. In the field of DJs, there are some distinct terms and concepts required. Understanding “DJing” through self-study is a difficult task for beginners, therefore, beginners often encounter a harsh environment © Springer Nature Singapore Pte Ltd. 2020 S.-I. Ao et al. (Eds.): IMECS 2018, Transactions on Engineering Technologies, pp. 153–164, 2020. https://doi.org/10.1007/978-981-32-9808-8_13

154

K. Minami et al.

when trying to improve their skills. Beginners who obtain equipment and begin DJing as a hobby often quit soon after. It is therefore difficult for companies to expand the market and sell DJ equipment in Japan. We propose a system in which beginners are able to proceed learning effectively if they can understand intentions and actions of experts as well as their gazing points. To verify this, we recorded multiple subjects’ gaze points and compared them to the gazing points of experts.

2 Related Research It is often said that efficiency is poor when expert guidance lacks the verbal skills to teach beginners. Hence, many learning support methods specialize in specific instruments. Research is being carried out, in which gaze-measuring instruments are used to support the reading of musical scores in order to improve the guidance efficiency for piano performances [1]. For drum performances, there is a research study in which an animation of the performer’s viewpoint is created using augmented reality. In addition, the performer’s gaze information by changing the colors of the instruments playing in order to draw attention to important objects [2]. However, these systems aim to teach beginners a basic way to play. In DJ mixing, however, there are no routines, and the sensibility of the individual is regarded as an important asset in discovering the original way to play, for instance by showcasing the DJ’s various experiences of by using senses of vision and touch. In addition to the DJ trend in Japan, there has been research on DJ skill acquisition assistance using Information and Communication Technology. Ishisaki et al. [3] have proposed an automated DJ mixing system which requires expertise and a considerable amount of equipment. They proposed the system that allows inexperienced users to enjoy mixing songs that the users provided themselves. This system is designed for users who have never performed DJ mixing. In addition, there are other research studies that target people who already have experience as DJs. Tomibayashi et al. [4] have developed a wearable DJ system that uses sensors. Normally, DJs cannot move from the DJ booth where the equipment is installed, and their performance movement is thus limited. The wearable system enables DJs to perform easily even while they are away from the DJ booth through wearable sensors and gesture recognition technology that assigns arbitrary functions to the DJ equipment. The work described above has often been researched. However, with the exception of our research, there have been no studies that focus on assisting beginner DJs, who have little experience [5]. In our research, we compared the gazing points of experts and beginners DJs and proposed a system that reproduces a DJ’s point of view in virtual reality.

Gazing Point Comparison Between Expert and Beginner DJs

155

3 System Overview Over multiple repetitions, we recorded the gazing point information of a subject and created a measurement system using a head mounted display (HMD) with a strong sense of immersion. According to Sakata [6], obtaining gaze measurement during exercise is difficult because the DJs are unable to fix their eyes on a specific point, given their limited range of movement. In many cases, DJs often use their entire bodies during their performances, which poses a problem for gaze measurement. We constructed an environment similar to an actual DJ performance in virtual reality (VR). The measurement of a user’s gaze can be accomplished without any specialized eye tracking equipment. As shown in Fig. 1, there are two main videos recorded: the video taken directly from the expert DJ’s point of view and the operation of the software used by the DJ. By placing these videos one on top of the other and displaying them on the screen of an HMD), we created a situation similar to a DJ’s actual point of view in VR. The position of the mouse cursor (center of the field of view), which is linked to the movement of the HMD, is measured. The upper left corner of the system screen (x, y) is set to (0, 0) by default. The measured coordinates and the playback time of the video are recorded in the database every second. In addition, the movement of the user’s natural line of sight is measured and the mouse cursor which may hinder visibility, is not displayed. A total of 13 areas are set between the two videos in the system and the color of the area where the mouse cursor is positioned changes to white in order to show the user’s current viewing position. As shown in Fig. 2, areas are named depending their roles, and their names are recorded in the database every second along with the coordinates and the reproduction time (Mixer 2ch). For example, in Fig. 1, the area “g” is the current viewing position.

Fig. 1. Screen displayed on HMD.

156

K. Minami et al.

Fig. 2. Name of each area.

4 Experiment Using the gaze point recording system proposed in the previous chapter, we conducted experiments to evaluate the differences between 5 beginner DJs and 5 expert DJs. We instructed the DJs to wear the HMD as shown in Fig. 3 and to use the system while standing up. We recorded the areas they watched closely and their timing, as well as the system screen. Mixing is an operation in which specific parts of a song are connected to parts of a different song. In our experiment an experienced DJ performed total of 3 mixing operations. We then analyzed the DJ’s gazing point tendency. In general, a DJ performs mixing operations repeatedly. In the most basic mixing operation (Fig. 4), the song to be played next is selected first, and the position at which the song starts is determined. Then, the DJ uses the headphones to control the pitch of both songs and to adjust the volume to shift to the next song while fading out the preceding song. In some cases, the DJ also uses headphones to jump to a specific point in a song. There are various methods for this type of mixing operation, and they differ from DJ to DJ, and also depend on the genre of music played. In order to make an accurate comparison in this experiment we adopted a basic mixing operation in which any skilled DJ would perform the extract same procedure.

Gazing Point Comparison Between Expert and Beginner DJs

157

Fig. 3. User with a gazing point analysis system.

Fig. 4. Basic DJ mixing.

5 Results

The results of this experiment were compared and analyzed using two evaluation methods.

(1) Analysis of gazing points
• Extracting the areas that users watched (using n-gram analysis)
• Obtaining the most viewed area
• Recording the time at which each area was gazed at


(2) Comparison of gazing points in an actual performance
• Obtaining the number of matches between the reference data and the users' gazing point data
• Verifying significant differences between expert DJs and beginner DJs (using the chi-square test)

In our experiments, we took 3 basic mixing operations as a point of reference that any skilled DJ would perform in exactly the same way. This reference data was created in collaboration with several expert DJs.

A. Gazing Point Extraction Using N-Gram Analysis

We used a gazing point extraction method based on n-gram analysis of line-of-sight movement [7]. Originally, n-gram analysis is a method for finding the frequency of the appearance of substrings in a character string (Fig. 5). We replaced each of the 13 area names in our system with a letter of the alphabet. In the database, the attention areas of the users were recorded every second, forming a string of characters. We defined a gazing point as an area that was watched for two seconds or more by the DJ (a substring of two or more of the same character). Using n-gram analysis of the gazing points, it was possible to obtain their frequency; in our experiment, we used the top 2 gazing points with the highest frequency. A minimal code sketch of this extraction is given after Fig. 5.

Fig. 5. Flow chart of gazing point extraction using n-gram analysis.
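The following is a minimal Python sketch of the extraction described above (not the authors' implementation): it collapses the per-second area string into runs of identical letters, keeps runs of two seconds or more as gazing points, and counts how often each area forms one. The function names and the example string are hypothetical.

```python
# Extract gazing points (runs of >= 2 identical area labels) and
# count the most frequently gazed areas.
from itertools import groupby
from collections import Counter

def extract_gazing_points(area_string, min_duration=2):
    """Collapse the per-second labels into runs; keep runs of at least
    min_duration seconds as gazing points, as (area, duration) pairs."""
    gazing_points = []
    for area, run in groupby(area_string):
        length = len(list(run))
        if length >= min_duration:
            gazing_points.append((area, length))
    return gazing_points

def top_gazed_areas(area_string, n=2, min_duration=2):
    """Return the n areas that most often appear as gazing points."""
    counts = Counter(a for a, _ in extract_gazing_points(area_string, min_duration))
    return counts.most_common(n)

# Example: one label per second; 'e' and 'f' form gazing points.
print(top_gazed_areas("eeefffgeeffee"))  # -> [('e', 3), ('f', 2)]
```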

B. Analysis Results of Gazing Points

N-gram analysis was used to find the areas receiving a high degree of attention in each of the 3 mixing operations performed. From the analysis, we determined the differences between experts and beginners.


For the first mixing operation, an analysis was carried out on the two areas that received the highest DJ attention time. The results are shown in Table 1. In this mixing operation, it was necessary to match the tempo of the current song with the timing and tempo of the next song. Therefore, experts E1, E2, E4, and E5, who were familiar with the procedure, closely watched the areas showing the waveforms of the two songs when selecting songs and determining their starting positions. In addition, area "f" (Fig. 6) was used for mixing the two songs: by operating several knobs in this area, it was possible to change the song volume to high, medium, or low. Experts E2, E4, and E5 and beginner B3 watched area "f" in scenes where the knobs were operated in the video. Expert E3 frequently watched outside the studied areas, while 3 beginners often switched their gaze between areas during the mixing operation; as a result, these beginners had no frequent gazing points.

Table 1. Most frequently gazed areas in the first mixing operation.
Experts:   E1: e | E2: d, f | E3: None | E4: e, f | E5: e, f
Beginners: B1: e | B2: None | B3: f | B4: None | B5: None

We then analyzed the areas receiving the most attention in the second mixing operation. In contrast to the first mixing operation, this operation was performed on a turntable, and the role of each area in the software also changed. For this reason, area "d", which this time showed the waveform of the next song, attracted the users' attention. Experts E1, E2, E4, and E5 gazed at this area before playing the next song in order to confirm the match in tempo between the two songs. Beginners B1 and B5 tended to shift their gaze between area "d" and other areas during this mixing operation. As in the first mixing operation, expert E3 frequently gazed outside the defined areas, and beginners B2 and B4 again shifted frequently between multiple areas, so they had no frequent gazing points (Table 2).

Table 2. Most frequently gazed areas in the second mixing operation.
Experts:   E1: d, e | E2: d, e | E3: None | E4: d | E5: d, g
Beginners: B1: d | B2: None | B3: f | B4: None | B5: d, e

The third mixing operation was similar to the first and second mixing operations, but because the speed differed between the songs, an additional matching process


was necessary. By operating the pitch control located at the top of each turntable, the song speed could be changed; in the software, display area "b" corresponds to the pitch. Experts E1, E4, and E5 and beginners B1 and B5 gazed at area "b", where information such as the speed of the song is displayed. In addition, in this mixing operation, many DJs closely watched the area that displayed the next song to be played. Figure 6 shows the most gazed areas in the three mixing operations (Table 3).

Table 3. Most frequently gazed areas in the third mixing operation.
Experts:   E1: b, e | E2: d, e | E3: e | E4: b | E5: b, e
Beginners: B1: b, e | B2: None | B3: None | B4: b, f | B5: d, e

Fig. 6. Areas receiving the most attention during three mixing operations.

C. Comparison of Gazing Points During an Actual Performance

In addition to the n-gram analysis, we compared the users' gazing points at each step of the mixing operation with their gazing points while actually performing. We created a reference dataset of the basic mixing operations with 5 experts, in which the areas to be watched are set for each basic mixing operation. As shown in Fig. 7, we compared the gazing points of users at each step with the reference data. In addition, we examined whether there was a significant difference between experts and beginners using the chi-square test; a code sketch of this test is given after Table 6. Tables 4, 5, and 6 show the total numbers of user gazing points that did and did not match the reference data.


Fig. 7. Comparing users’ gazing points with the reference data.

Our results show that, when actually performing, experts have different gazing points than when using our system. Moreover, the number of gazing points of beginners was not consistent with the number of gazing points in the reference data. When comparing experts and beginners, it is conceivable that their experience has some influence on their gazing tendencies when using the system. A chi-square test was used to confirm whether there is a difference between experts and beginners; a significant difference at the 5% level was found for the first and third mixing operations.

Table 4. Results of the chi-squared test for the first mixing operation.
            Number of matches | Number not matched
5 Experts:          95        |        95
5 Beginners:        64        |       126
p value = 0.03431 (p < 0.05): Significant difference

Table 5. Results of the chi-squared test for the second mixing operation.
            Number of matches | Number not matched
5 Experts:         103        |        92
5 Beginners:        78        |       117
p value = 0.1684 (p > 0.05): No significant difference

Table 6. Results of the chi-squared test for the third mixing operation.
            Number of matches | Number not matched
5 Experts:         133        |        92
5 Beginners:        80        |       145
p value = 4.938e−05 (p < 0.05): Significant difference
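As an illustration, the following is a minimal sketch of a 2 × 2 chi-square test of independence on the counts of Table 4 using SciPy. The paper does not state the exact test settings (e.g., whether a continuity correction was applied), so the computed p value may differ from the reported one.

```python
# 2x2 chi-square test of independence on the Table 4 counts.
from scipy.stats import chi2_contingency

table4 = [[95, 95],    # 5 experts: matched, not matched
          [64, 126]]   # 5 beginners: matched, not matched

chi2, p, dof, expected = chi2_contingency(table4)
print(f"chi2 = {chi2:.3f}, p = {p:.5f}, dof = {dof}")
if p < 0.05:
    print("Significant difference between experts and beginners")
```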


In the n-gram analysis of gazing points, the experts watched the same areas at the same times as each other, while many of the beginners often moved from area to area during the mixing operation, so their gazing points were recorded in several places. Upon studying the gazing points in an actual performance, we found that experts also performed different movements when playing live than when using our system. However, in the first evaluation, there were areas often watched by the experts at the same time during the mixing operation; since these matched the reference data, we can conclude that their gazing points were similar to the points considered important in the mixing operation. In addition, the chi-square test for the second mixing operation did not show a significant difference. We also experienced a problem with one of the experts in the experiment: this expert gazed by moving his eyes only, whereas users need to move their heads with the HMD in our system, so his gazing areas were often registered as outside the defined areas and could not be measured adequately. We therefore performed an additional analysis without this lowest-performing expert (Tables 7, 8 and 9), and the results improved significantly.

Table 7. Results of the chi-squared test for the first mixing operation without one expert.
            Number of matches | Number not matched
4 Experts:          88        |        64
5 Beginners:        64        |       126
p value = 0.000489 (p < 0.05): Significant difference

Table 8. Results of the chi-squared test for the second mixing operation without one expert.
            Number of matches | Number not matched
4 Experts:          97        |        59
5 Beginners:        78        |       117
p value = 0.001887 (p < 0.05): Significant difference

Table 9. Results of the chi-squared test for the third mixing operation without one expert.
            Number of matches | Number not matched
4 Experts:         114        |        66
5 Beginners:        80        |       145
p value = 3.181e−06 (p < 0.05): Significant difference

D. Post-Experiment Questionnaires

Questionnaires were given to the users after the experiment, asking the following questions.

(1) Gazing areas in which the users were particularly interested during the experiment (multiple choice; multiple answers possible)


(2) How the users learned to DJ
(3) What was the hardest thing when the users started DJ playing

We also asked the experts to describe the following topics:

(4) The extent of their experience teaching beginners
(5) The points that they found difficult to teach

We compared the users' answers to Question 1 (Fig. 8) with the areas that they closely watched, based on the gazing point information recorded while using our system. The results showed that for the five experts, 16 out of 22 answers matched; in contrast, only 7 out of 20 answers matched in the case of the 5 beginners. From this, we conclude that experts have more intentional gazing points than beginners when using this system. Additionally, the most frequent answer to Question 1 was the waveform information of the song, which is consistent with the result of the first evaluation using n-gram analysis; results 3 and 5 in Fig. 8 correspond to those parts.

Fig. 8. Answers to Question 1.

In answer to Question 2, the experts responded that they learned to DJ by watching other DJs play at events, through self-study, or by learning from experienced people. Two beginners answered that they were learning on their own, and the other 3 beginners answered that they were learning from experienced people. Comparing these answers with the attributes of each user, many users had acquired experience by participating in club events, dance events, and so on; they often saw experienced DJs and found themselves in environments that offered guidance from experts. Although some of the DJs had the motivation to improve, two beginners did not have basic skills because they were not in an environment where they could easily receive guidance. This shows the need for learning support for DJs. For Question 3, 7 out of 10 users responded "during the mixing operation": they did not understand the bar and beat of the song, and as a result they had difficulty with the timing necessary to synchronize the next song. In response to Question 5, most experts felt that they had difficulty in teaching beginners using musical sheets.


The results of these questionnaires showed the importance of gazing at waveform information. They also suggest that beginners would be able to engage in efficient self-study if they were able to develop a sense of rhythm and the use of musical sheets.

6 Conclusion

In this research, we compared the gazing points of expert and beginner DJs watching another DJ's performance, developed a gazing point measurement system using an HMD with a strong sense of immersion, and conducted experiments with 10 subjects. We carried out a comparative analysis using two evaluation methods and, as a result, made several discoveries worth examining in the development of future support methods. In the future, we will consider developing a support system aimed at teaching basic skill acquisition to beginners.

References

1. Toma, W., Nakahira, K.T.: Descriptive analysis of process for acquiring piano playing skill by means of eye-movement tracking while reading music scores. In: The 11th Forum on Information Technology (2012)
2. Hayakawa, K., Hasegawa, D., Sakuma, H.: Performer's viewpoint on drum performance support: effect of visualization of AR animation and gaze information. In: The 76th Information Processing Society of Japan (2014)
3. Ishizaki, H., Hoashi, K., Yasuhiro, T.: Automatic DJ mixing system based on measurement function of user discomfort on music. Forum Inf. Technol. J. 52(2), 890–900 (2011)
4. Tomibayashi, Y., Takekawa, Y., Terakawa, T., Tsukamoto, M.: Development and actual use of a wearable DJ system using wearable sensors. In: IPSJ Special Interest Group on Music and Computer (MUS), pp. 39–44 (2008)
5. Minami, K., Kitamura, T., Izumi, T., Nakatani, Y.: Gazing point analysis of experts and beginners DJ for acquiring basic skills. In: Proceedings of The International MultiConference of Engineers and Computer Scientists 2018, Hong Kong, 14–16 March 2018. Lecture Notes in Engineering and Computer Science, pp. 214–218 (2018)
6. Sakata, M.: One's eyes are often more eloquent than one's mouth: study cases on eye movement analysis. SCBDMMT Kobe Univ. 6(1), 103–116 (2006)
7. Takagi, H.: Recognizing user's uncertainty on the basis of eye movement patterns: a step toward an effective task assistance system. Forum Inf. Technol. J. 41(5), 1317–1327 (2000)

A Comparative Analysis of Image Segmentation Methods with Multivarious Background and Intensity

Erwin1, Saparudin2, Diah Purnamasari1, Adam Nevriyanto1, and Muhammad Naufal Rachmatullah2

1 Department of Computer Engineering, Universitas Sriwijaya, Indralaya, Indonesia
[email protected], [email protected], [email protected]
2 Department of Informatics Engineering, Universitas Sriwijaya, Indralaya, Indonesia
[email protected], [email protected]

Abstract. Segmentation is the process of separating an image into parts, generally the foreground and the background. The techniques considered here approach segmentation in different ways and address problems in the background pattern and intensity of the image. Ten techniques are used: Adaptive Threshold, Region Growing, Watershed, YCbCr-HSV, K-Means, Fuzzy C-Means, Mean Shift, Grab Cut, Skin Color HSV, and Otsu Gaussian. The images used are American Sign Language (ASL) fingerspelling images from ASL University, primary fingerspelling images, and retinal images from STARE. ASL fingerspelling is the standard fingerspelling in sign language, so applying these ten segmentation techniques can help maximize applications of pattern recognition; the retinal images are used to separate the blood vessels. Segmentation quality is measured using the Root Mean Square Error (RMSE) and Peak Signal to Noise Ratio (PSNR). The experimental results show that all tested techniques produce an average PSNR above 40 dB, meaning the segmentation techniques work well on both datasets. On the ASL fingerspelling dataset, the technique that generates the highest PSNR is Skin Color, whereas for segmentation of the vessels on the retinal dataset it is the Adaptive Threshold technique.

Keywords: Adaptive Threshold · Biometric image · Fuzzy C-Means · K-Means · Mean Shift · Otsu Gaussian · Region Growing · Segmentation · Skin color · Watershed · YCbCr-HSV

1 Introduction

In recognition and identification systems, digital image data acquisition is often susceptible to interference (noise), which in this research concerns brightness and the complexity of the background. In this research, the luminance noise is


divided into two cases: too dark or too bright. Background complexity arises from differences in color and shape (pattern) that affect the separation of foreground and background. The challenge is therefore to overcome this problem by distinguishing foreground from background so that objects can be recognized. Image segmentation is the process of dividing an image into regions by their characteristics [1]; each pixel in the image is allocated to a category. Segmentation is said to be good if it meets the following criteria: pixels in the same category have similar values and form a connected region, while neighboring pixels in different categories have different values. According to [2–4], there are three categories of methods: intensity-based segmentation (thresholding), region-based segmentation (region growing, split and merge), and other methods (texture, edge, motion, or color). Thresholding segmentation groups pixels into classes divided by a threshold value. A region-based segmentation algorithm operates iteratively by grouping neighboring pixels that have similar values and separating groups of pixels whose values differ. Meanwhile, in edge-based segmentation, an edge filter is applied to the image, pixels are classified as edge or non-edge depending on the filter output, and pixels that are not separated by edges are allocated to the same category.

2 Datasets

The techniques are tested using fingerspelling datasets from American Sign Language University (ASLU), a primary fingerspelling dataset, and the STARE retinal image dataset. We used fingerspelling images for the numbers 1 to 10 of American Sign Language, and the retinal images are grouped by the type of disease. There are 8 different background patterns and intensities for the ASL fingerspelling images and 1 background pattern and intensity for the retinal images. The images in the datasets have dimensions of 200 × 200 pixels.

3 Intensity Based: Adaptive Threshold

The value T(x, y) is the threshold in

h(a, b) = \begin{cases} 0, & \text{if } c(a, b) \le T(a, b) \\ 1, & \text{otherwise} \end{cases}

where f(x, y) is the intensity of the input image at pixel (x, y), with f(x, y) \in [0, 1]. If the value of T(x, y) is the same for the whole input image, the process is called global thresholding; when the value of T(x, y) differs depending on statistical parameters computed around (x, y), the process is called adaptive (local) thresholding. The adaptive thresholding algorithm is as follows.
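As an illustration only (a minimal OpenCV sketch of mean-based local thresholding, not the authors' exact listing), where the file name and parameter values are assumptions:

```python
# Mean-based adaptive (local) thresholding: T(x, y) is the mean of the
# blockSize x blockSize neighborhood around (x, y) minus a constant C.
import cv2

img = cv2.imread("hand.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
binary = cv2.adaptiveThreshold(
    img, 255,
    cv2.ADAPTIVE_THRESH_MEAN_C,
    cv2.THRESH_BINARY,
    blockSize=11,   # size of the local neighborhood (odd)
    C=2)            # constant subtracted from the local mean
cv2.imwrite("hand_segmented.png", binary)
```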


4 Region Based: Region Growing

Region Growing is an approach to image segmentation that starts from a few pixels (seeds) representing different image regions and grows them to form wider regions in the image. To use this method, rules are required that explain the growth mechanism and the homogeneity of each grown region [5, 6].
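As an illustration (a minimal sketch of seeded region growing with 4-connectivity, not the authors' exact rules), where the threshold, seed, and placeholder image are assumptions:

```python
# Seeded region growing: starting from one seed, add 4-connected
# neighbors whose gray values lie within a homogeneity threshold
# of the seed value.
import numpy as np
from collections import deque

def region_grow(img, seed, threshold=10):
    h, w = img.shape
    grown = np.zeros((h, w), dtype=bool)
    seed_value = float(img[seed])
    queue = deque([seed])
    grown[seed] = True
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not grown[ny, nx]
                    and abs(float(img[ny, nx]) - seed_value) <= threshold):
                grown[ny, nx] = True
                queue.append((ny, nx))
    return grown

img = np.random.randint(0, 255, (200, 200), dtype=np.uint8)  # placeholder image
mask = region_grow(img, seed=(100, 100), threshold=15)
```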


5 Watershed

The watershed transformation is based on morphological transformation functions for image segmentation. The watershed is recognized as a powerful method because it has the advantages of speed and simplicity [7], and it builds on developments in edge detection [8, 9]. The watershed algorithm in [10] shows how the watershed transformation is applied to image segmentation.
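The listing itself is a figure in the original and is not reproduced here. A standard marker-based watershed recipe in OpenCV, under our own choices of kernel size and distance threshold, is sketched below:

```python
import cv2
import numpy as np

img = cv2.imread("input.png")                      # BGR image (path illustrative)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Sure background via dilation, sure foreground via distance transform.
kernel = np.ones((3, 3), np.uint8)
sure_bg = cv2.dilate(thresh, kernel, iterations=3)
dist = cv2.distanceTransform(thresh, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, 0)
sure_fg = sure_fg.astype(np.uint8)
unknown = cv2.subtract(sure_bg, sure_fg)

# Label the markers and let the watershed flood from them.
_, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1
markers[unknown == 255] = 0
markers = cv2.watershed(img, markers)              # boundaries become -1
```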


6 Color Based: YCbCr-HSV

The YCbCr-HSV technique in this study uses the following algorithm (presented in the original as a figure, which is not reproduced here):
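Since the authors' exact thresholds are not stated in the text, the sketch below only illustrates the general shape of a combined YCbCr/HSV mask; every bound here is an assumption:

```python
import cv2

img = cv2.imread("input.png")
ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)   # OpenCV orders Y, Cr, Cb
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Illustrative skin-like chroma range in Cr/Cb, intersected with an
# HSV hue/saturation gate; all bounds are assumed, not the authors'.
mask_ycbcr = cv2.inRange(ycrcb, (0, 135, 85), (255, 180, 135))
mask_hsv = cv2.inRange(hsv, (0, 40, 60), (25, 255, 255))
mask = cv2.bitwise_and(mask_ycbcr, mask_hsv)
segmented = cv2.bitwise_and(img, img, mask=mask)
```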

7 Clustering Based: K-Means

K-Means is an unsupervised clustering technique that aims to minimize the sum of squared distances between all points and their cluster centers. The procedure follows a simple and easy way to classify a given data set into a number of previously defined clusters. The K-Means algorithm is as follows:
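The listing is a figure in the original; a minimal color-clustering sketch with OpenCV, where K = 2 (foreground/background) and the termination criteria are our assumptions, is:

```python
import cv2
import numpy as np

img = cv2.imread("input.png")
pixels = img.reshape(-1, 3).astype(np.float32)

# Cluster pixel colors into K = 2 groups to separate foreground
# from background (an assumed choice for this illustration).
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, labels, centers = cv2.kmeans(pixels, 2, None, criteria, 10,
                                cv2.KMEANS_RANDOM_CENTERS)
segmented = centers[labels.flatten()].astype(np.uint8).reshape(img.shape)
```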


8 Clustering Based: Fuzzy C-Means

Fuzzy C-Means (FCM) is a development of low-level segmentation techniques; the original approach used the concept of region growing and a pyramid data structure in the hierarchical analysis of aerial imagery. The FCM approach obtains a higher-level image by averaging four lower-level images, and blocking effects can be observed in the segmentation process; even so, the technique is a good initial approach and is often developed further in research [11]. The Fuzzy C-Means algorithm is as follows:
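The FCM listing is likewise a figure. A self-contained NumPy sketch of the standard FCM iteration, in which the fuzzifier m, cluster count c, and convergence settings are our assumptions, is:

```python
import numpy as np

def fcm(data, c=2, m=2.0, n_iter=100, eps=1e-5, seed=0):
    """Minimal fuzzy C-means on an (N, d) array: alternate between
    updating cluster centers and fuzzy memberships with fuzzifier m."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(data), c))
    u /= u.sum(axis=1, keepdims=True)          # memberships sum to 1
    for _ in range(n_iter):
        um = u ** m
        centers = (um.T @ data) / um.sum(axis=0)[:, None]
        dist = np.linalg.norm(data[:, None, :] - centers[None], axis=2) + 1e-10
        new_u = 1.0 / (dist ** (2.0 / (m - 1.0)))
        new_u /= new_u.sum(axis=1, keepdims=True)
        done = np.abs(new_u - u).max() < eps
        u = new_u
        if done:
            break
    return centers, u

# Usage on gray pixels: labels = fcm(gray.reshape(-1, 1), c=2)[1].argmax(1)
```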


9 Clustering Based: Mean Shift

Mean Shift is one of the methods categorized as unsupervised clustering segmentation. The termination of segmentation is based on several region-merging strategies applied to the filtered image, and the number of regions in the segmented image is determined primarily by the minimum number of pixels in a region, denoted M (i.e., a region containing fewer than M pixels is eliminated and merged into a neighboring region).
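As a rough stand-in for the missing listing, mean-shift filtering in the joint spatial-color domain is available directly in OpenCV; the radii below are our assumptions, and merging regions smaller than M would follow as a post-step:

```python
import cv2

img = cv2.imread("input.png")
# Mean-shift filtering: sp and sr are the spatial and color window
# radii (assumed values); the result is a piecewise-flat image whose
# flat patches correspond to segmentation regions.
shifted = cv2.pyrMeanShiftFiltering(img, sp=21, sr=40)
```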


10 Grab Cut

Grab Cut works based on the pixel color (intensity) distribution, so the technique is able to delete interior pixels that are not part of the object. The Grab Cut algorithm can be described as follows:
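As a stand-in for the missing listing, the usual rectangle-initialized Grab Cut call in OpenCV looks like the sketch below; the rectangle placement and iteration count are our assumptions:

```python
import cv2
import numpy as np

img = cv2.imread("input.png")
mask = np.zeros(img.shape[:2], np.uint8)
bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)

# Initialize from a rectangle that loosely contains the object; the
# placement below is an assumed choice for a 200x200 image.
rect = (10, 10, 180, 180)
cv2.grabCut(img, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)

# Keep definite/probable foreground; interior non-object pixels drop out.
fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)
segmented = img * fg.astype(np.uint8)[:, :, None]
```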


11 Skin Color: HSV

The HSV values set for the ASL fingerspelling and retinal datasets in this study are as follows:

Lower bound (RGB converted to HSV): R: 80, G: 255, B: 20, i.e. H: 105°, S: 92.2%, V: 100%
Upper bound (RGB converted to HSV): R: 255, G: 255, B: 0, i.e. H: 60°, S: 92.2%, V: 100%
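On OpenCV's scales (H in [0, 179], S and V in [0, 255]), 105° maps to about 52, 60° to 30, and 92.2% to about 235. How the authors applied these exact endpoints is not recoverable from the text, so the sketch below uses an elementwise-ordered box with relaxed S and V floors; those floors are our assumptions:

```python
import cv2
import numpy as np

img = cv2.imread("input.png")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Hue band spanning the paper's two endpoints (30..52 on OpenCV's
# scale); the S and V lower bounds are assumed relaxations, since a
# box with fixed S and V would select almost nothing.
lower = np.array([30, 100, 100])
upper = np.array([52, 235, 255])
mask = cv2.inRange(hsv, lower, upper)
segmented = cv2.bitwise_and(img, img, mask=mask)
```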

12 Otsu Gaussian

This method uses a Gaussian blur to suppress noise in the image, after which the Otsu method is applied as the segmentation technique. We set the kernel to 3 × 3 to define the Gaussian standard deviation so that we obtained consistent results [12].
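A minimal sketch of this pipeline in OpenCV, with the stated 3 × 3 kernel and the sigma left for OpenCV to derive from the kernel size, is:

```python
import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)
# 3x3 Gaussian blur suppresses noise before Otsu picks the global
# threshold that best separates the gray-level histogram.
blurred = cv2.GaussianBlur(gray, (3, 3), 0)
_, binary = cv2.threshold(blurred, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```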

13 Result and Analysis

The methods are tested using 10 images from American Sign Language University (ASLU) with one type of background and intensity, and 71 images from the primary dataset with 7 kinds of background and different intensities. For the retinal data, there are 84 diseased images covering 6 types of blood-vessel pattern, with intensities within the same range. The aim is to test the 10 segmentation methods: Adaptive Threshold, Region Growing, Watershed, YCbCr-HSV, Fuzzy C-Means, K-Means, Mean Shift, Grab Cut, Skin Color, and Otsu Gaussian (Table 1).

From the graph in Fig. 1, it can be hypothesized that intensity affects the segmented result. In testing the ASLU dataset and the primary dataset, an intensity in the range of 57–86 gray levels corresponds to a fairly dark image, while an intensity in the range of 138–154 corresponds to a fairly bright image. The retinal images have an average intensity range of 50–60, where 50 is quite dark and 60 is quite bright. Besides the intensity, another factor also has an effect: the pattern of the background. If the background is quite complicated, such as text or other imagery, and we only perform background-foreground separation (segmentation), the background pattern and shadows are identified as joined to the foreground (Tables 2a, 2b, 3a and 3b).


Table 1. Illustration of ASL fingerspelling and retinal images with 8 backgrounds and different patterns

[Table of sample images, not reproduced here. ASL fingerspelling columns: Background 1 through Background 8. Retinal STARE (disease names): Diabetic Retinopathy, Branch Retinal Vein Occlusion, Choroidal Neovascularization, Coats, Histoplasmosis, Myelinated Nerve Fibers.]


[Figure: bar chart of average intensity, y-axis 0–200, x-axis background/pattern types 1–8; series "Dataset Retina STARE (Σ)" labeled "The Diseased Retina" and "Dataset ASL Fingerspelling ASLU and Primary (Σ)" labeled "ASL Fingerspelling".]

Fig. 1. Average intensity levels of the 8 ASL fingerspelling background types and the 6 retinal disease pattern types

The RMSE is a parameter for determining the difference between the original image and the segmented image: the closer it is to 0, the better the object is identified (Figs. 2 and 3). On background 1, with an intensity around 75.2 and a black background, the best technique is Mean Shift; on background 2, with an intensity around 154 and a plain white background, the best technique is YCbCr. The Skin Color technique produces the best segmentation for backgrounds 3, 5, and 8: background 3 has a yellow background patterned with text and an intensity around 138, background 5 is plain white with an intensity around 149, and background 8 has an intensity around 146 and a white, slightly yellow background. On the retinal images, the Adaptive Thresholding technique recognizes the blood vessels well, with a PSNR of 53.19 dB.
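For reference, the two quality measures follow their standard definitions for 8-bit images; a small sketch (ours, with illustrative names) is:

```python
import numpy as np

def rmse(ref, test):
    """Root mean square error between two same-sized uint8 images."""
    diff = ref.astype(np.float64) - test.astype(np.float64)
    return np.sqrt(np.mean(diff ** 2))

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means a closer match."""
    e = rmse(ref, test)
    return float("inf") if e == 0 else 20.0 * np.log10(peak / e)
```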

Table 2a. The results of the ASL fingerspelling images using the first 5 methods

[Table of segmented images, not reproduced: rows Background 1 through Background 8; columns Adaptive Thresholding, Region Growing, Watershed, YCbCr-HSV, K-Means.]


Table 2b. The results of the ASL fingerspelling images using the other 5 methods

[Table of segmented images, not reproduced: rows Background 1 through Background 8; columns Fuzzy C-Means, Mean Shift, Grab Cut, Skin Color, Otsu Gaussian.]

Table 3a. The results of the retinal images using 5 methods

[Table of segmented images, not reproduced: rows Diabetic Retinopathy, Branch Retinal Vein Occlusion, Choroidal Neovascularization, Coats, Histoplasmosis, Myelinated Nerve Fibers; columns Adaptive Thresholding, Region Growing, Watershed, YCbCr-HSV, K-Means.]

Table 3b. The results of the retinal images using the other 5 methods

[Table of segmented images, not reproduced: rows as in Table 3a; columns Fuzzy C-Means, Mean Shift, Grab Cut, Skin Color, Otsu Gaussian.]


[Figure: bar chart of average RMSE, y-axis 0–0.9; series "Fingerspelling ASL" and "The Diseased Retina".]

Fig. 2. Comparison of average RMSE on the ASL fingerspelling datasets (ASLU and primary) and the STARE retinal dataset

[Figure: bar chart of average PSNR, y-axis 0–70 dB; series "Fingerspelling ASL" and "The Diseased Retina".]

Fig. 3. Comparison of average PSNR on the ASL fingerspelling datasets (ASLU and primary) and the STARE retinal dataset


14 Conclusion

From the test results and the data representation above, it can be concluded that for backgrounds with mono-color, fairly bright intensity, the techniques that segment well are Skin Color and K-Means, which successfully segment four background types each (backgrounds 1, 2, 4, 5 and backgrounds 2, 3, 5, 8, respectively). The Otsu Gaussian and YCbCr-HSV techniques successfully segment three background types each (backgrounds 1, 2, 8 and backgrounds 1, 2, 5, respectively). Meanwhile, the Adaptive Thresholding technique produces the best segmentation for identifying blood vessels compared to the other techniques.

References

1. Glasbey, C.A., Horgan, G.W.: Image Analysis for the Biological Sciences. Wiley, New York (1995)
2. Marques, O.: Practical Image and Video Processing Using MATLAB®. Wiley, Hoboken (2011)
3. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 3rd edn. Prentice-Hall, Upper Saddle River (2007)
4. Erwin, Saparudin, Nevriyanto, A., Purnamasari, D.: Performance analysis of comparison between region growing, adaptive threshold, and watershed methods for image segmentation. In: Proceedings of the International MultiConference of Engineers and Computer Scientists 2018, IMECS 2018, Hong Kong, 14–16 March 2018. Lecture Notes in Engineering and Computer Science, pp. 157–163 (2018)
5. Sambandam, R.K., Jayaraman, S.: Self-adaptive dragonfly based optimal thresholding for multilevel segmentation of digital images. J. King Saud Univ. Comput. Inf. Sci. 30(4), 449–461 (2016)
6. Rouhi, R., Jafari, M., Kasaei, S., Keshavarzian, P.: Benign and malignant breast tumors classification based on region growing and CNN segmentation. Expert Syst. Appl. 42(3), 990–1002 (2015)
7. Sinha, A.: A new approach of watershed algorithm using distance transform applied to image. Int. J. Innov. Res. Comput. Commun. Eng. 1(2), 185–189 (2013)
8. Beucher, S.: The watershed transformation applied to image segmentation. In: Scanning Microscopy International, pp. 1–26 (1991)
9. Tarabalka, Y., Chanussot, J., Benediktsson, J.A.: Segmentation and classification of hyperspectral images using watershed transformation. Pattern Recogn. 43(7), 2367–2379 (2010)
10. Saini, S., Arora, K.: A study analysis on the different image segmentation. Int. J. Inf. Comput. Technol. 4(14), 1445–1452 (2014)
11. Lim, Y.W., Lee, S.U.: On the color image segmentation algorithm based on the thresholding and the fuzzy c-means techniques. Pattern Recogn. 23(9), 935–952 (1990)
12. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. SMC-9(1), 62–66 (1979)

Communication Interruption Between a Game Tree and Its Leaves Toshio Suzuki(B) Department of Mathematical Sciences, Tokyo Metropolitan University, Hachioji, Tokyo, Japan [email protected]

Abstract. We introduce a successor model of an AND-OR tree. Leaves are connected to internal nodes via communication channels that possibly have a high probability of interruption. By depth-first communication we mean the following protocol: if a given algorithm probes a leaf, it continues to make queries to that leaf until an answer is returned. For each such tree, we give a concrete example of an interruption probability setting with the following property: for any independent and identical distribution on the truth assignments (probability is assumed to be neither 0 nor 1), any depth-first search algorithm that performs depth-first communication is not optimal. This result makes a sharp contrast with the counterpart on the usual AND-OR tree (Tarsi), where an optimal and depth-first algorithm exists. Our concrete example is based on the Riemann zeta function. We also present a generalized framework.

Keywords: AND-OR tree · Depth-first algorithm · Directional algorithm · Optimal algorithm · Communication channel · Communication interruption · Riemann zeta function · Independent and identical distribution

1 Introduction

An AND-OR tree is a mini-max tree whose evaluation function is Boolean-valued; in other words, each value is 1 (true) or 0 (false). The root is labeled by AND, and the child nodes of an AND-gate (an OR-gate, respectively) are labeled by OR (AND, respectively). Each leaf has a Boolean value, and these values are hidden. An algorithm probes leaves to find the Boolean value of the root, and during computation the algorithm skips a leaf if it is unnecessary. The cost of computation is measured by the number of leaves probed during computation. Given a probabilistic distribution on the truth assignments to the leaves, cost means the expected value of the above-mentioned cost. Computational complexity issues on AND-OR trees have been studied since the early days of artificial intelligence ([2, 4, 5, 8, 14]).

This work was supported by JSPS KAKENHI JP16K05255.


On the other hand, current artificial intelligence systems often consist of many devices that communicate with each other, and interruption of communication is one of the potential risks in such systems. We propose a successor model of an AND-OR tree in which each leaf is connected to an internal node via a communication channel. We are interested in the case where each channel has a high probability of interruption. Figure 1 shows an example of such a tree: circles are internal nodes, squares are leaves, solid lines are usual wires, and broken lines are communication channels. In our mind, the main body of the tree is in our local computer, but the leaves are on remote devices. The simplest type of probability distribution on an AND-OR tree is an independent and identical distribution (IID for short). More precisely, an IID is a distribution such that there is a fixed positive real number p ≤ 1 and each leaf independently has value 0 with probability p.

Fig. 1. Broken lines are communication channels

A tree is balanced (in the sense of Tarsi [14]) if (1) any two internal nodes of the same depth (the distance from the root) have the same number of child nodes and (2) all leaves have the same depth. An algorithm A is depth-first if, for every internal node x, once A probes a leaf that is a descendant of x, A does not probe leaves that are non-descendants of x until A finds the value of x. A is directional if there is a fixed linear order of the leaves such that, for any truth assignment to the leaves, the order of probing by A is consistent with that linear order [4]. The above (standard) definition of depth-first is not exact for our purpose: in our computation model, "algorithm A makes a query to leaf x" is merely a necessary condition for "A finds the value of x". Thus, we redefine the concept of depth-first and, in addition, introduce the concept of depth-first communication.


Definition 1. Let A be an algorithm on a tree.

1. A performs depth-first search (or simply, A is depth-first) if for every internal node x, once A finds the value of a leaf that is a descendant of x, A does not make queries to leaves that are non-descendants of x until A finds the value of x.
2. A (which possibly does not perform depth-first search) performs depth-first communication if, once A makes a query to a leaf, A consecutively makes queries to that leaf until an answer is returned. □

Regarding an IID on an AND-OR tree, the following result of Tarsi is important and well known: if 0 < p < 1, there is an optimal algorithm (one whose cost achieves the minimum among all algorithms) that is depth-first and directional [14].

Suppose that T is a balanced AND-OR tree and that the attached distribution is an IID with 0 < p < 1 (p is the probability of a leaf having value 0). For any leaf v and any positive integer k, assume that the probability of interruption at the kth query to v depends only on k, not on v, and let f(k) be that probability. We give a particular example of a function f with the following property: any depth-first algorithm that performs depth-first communication is not optimal (the main theorem). This result and the above result of Tarsi contrast sharply. Our f is f(x) = x²/(x + 1)². In the proof, a key tool is the Riemann zeta function.

In Sect. 3, we observe the cost of getting the value of a leaf via consecutive access through a communication channel. In Sect. 4, we introduce our interruption probability setting on a tree of height 1. In Sect. 5, we give a framework that generalizes the example of Sect. 4. In Sect. 6, we investigate a tree of general height and show our main result; the proof works under the generalized framework of Sect. 5. This paper is a revised and extended version of our conference paper [10], presented at the International MultiConference of Engineers and Computer Scientists 2018; Section 5 is the newly added passage in this revised and extended version.

2 Preliminaries

As usual, $\sum$ and $\prod$ denote sum and product, respectively. Throughout the paper, an expression of the form $\sum_{i=k}^{k-1}[\cdots]$ denotes 0, and $\prod_{i=k}^{k-1}[\cdots]$ denotes 1. We denote the Riemann zeta function [1, Chapter 23] by ζ. Thus, for each s > 1, $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$. In particular, ζ(2) = π²/6 = 1.6449⋯ and ζ(4) = π⁴/90 = 1.0823⋯. For two events E₁ and E₂, we denote the conditional probability of E₂ under E₁ by prob[E₂|E₁].

The paper [9] is a concise survey on the complexity and equilibria of AND-OR trees, through which the reader can overview the earlier research [3] and its subsequent developments [12, 13] and [7]. For more recent works on this line, see the papers [6] and [11].

3 Consecutive Queries to a Particular Leaf

In this section, we investigate a single leaf x0 with a communication channel (Fig. 2). A procedure P consecutively makes queries to x0 until return of an answer.

Fig. 2. A single leaf with a communication channel

For each positive integer n, we look at the following events E_{n,0} and E_{n,1}.

E_{n,0}: "For each j such that 1 ≤ j < n, the jth query to x₀ is interrupted (that is, P does not receive an answer)."
E_{n,1}: "The nth query to x₀ is interrupted."

Lemma 1 (Cost of getting 1-bit information). Let k be a positive integer. Let α_k be the expected cost for P to get an answer, under the assumption that (1) holds for all n.

$$\mathrm{prob}[E_{n,1} \mid E_{n,0}] = \left(\frac{n+k-1}{n+k}\right)^{2} \tag{1}$$

Then, we have the following.

$$\alpha_k = k^{2}\Big(\zeta(2) - \sum_{j=1}^{k-1} j^{-2}\Big) \tag{2}$$

Recall that by our convention in the notation section, $\sum_{j=1}^{0} j^{-2} = 0$. Thus, in particular, the following holds.

$$\alpha_1 = \zeta(2) \tag{3}$$

Proof. Let f(x) = [x/(x + 1)]².

$$\alpha_k = \sum_{j=1}^{\infty} \mathrm{prob}[E_{j,0} \wedge \neg E_{j,1}] \times j = \sum_{j=1}^{\infty} \Big(\prod_{i=k}^{k+j-2} f(i)\Big)\big(1 - f(k+j-1)\big)\, j$$

Here, we have the following.

$$\begin{aligned}
\sum_{j=1}^{n} \Big(\prod_{i=k}^{k+j-2} f(i)\Big)(1 - f(k+j-1))\, j
&= \sum_{j=1}^{n} \Big[ j \prod_{i=k}^{k+j-2} f(i) - j \prod_{i=k}^{k+j-1} f(i) \Big] \\
&= \sum_{j=1}^{n} \prod_{i=k}^{k+j-2} f(i) \;-\; n \prod_{i=k}^{k+n-1} f(i) \\
&= \sum_{j=1}^{n} \Big(\frac{k}{k+j-1}\Big)^{2} - n\Big(\frac{k}{k+n}\Big)^{2} \\
&= k^{2} \Big[ \sum_{j=1}^{n} (k+j-1)^{-2} - \frac{n}{(k+n)^{2}} \Big] \\
&= k^{2} \Big[ \sum_{j=1}^{k+n-1} j^{-2} - \sum_{j=1}^{k-1} j^{-2} - \frac{n}{(k+n)^{2}} \Big] \\
&\to k^{2} \Big(\zeta(2) - \sum_{j=1}^{k-1} j^{-2}\Big) \quad (n \to \infty)
\end{aligned} \tag{4}$$

Hence, (2) holds. □
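As a cross-check (our addition, not part of the paper), the closed form (2) can be compared against a direct Monte Carlo simulation of the interruption process (1); all function and variable names here are ours:

```python
import math
import random

def f(x):
    """Interruption probability [x/(x+1)]^2 from the paper's setting."""
    return (x / (x + 1.0)) ** 2

def alpha_closed(k):
    """Closed form (2): k^2 (zeta(2) - sum_{j=1}^{k-1} j^{-2})."""
    return k * k * (math.pi ** 2 / 6 - sum(j ** -2 for j in range(1, k)))

def alpha_sim(k, trials=100_000, seed=0):
    """Average number of consecutive queries until an answer, where the
    n-th overall query is interrupted with probability f(n), starting
    from query index k as in Lemma 1."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        n = k
        while rng.random() < f(n):   # interrupted; query again
            n += 1
        total += n - k + 1           # number of queries issued
    return total / trials

for k in (1, 2, 3):
    print(k, alpha_closed(k), alpha_sim(k))
# alpha_closed(1) = zeta(2) = 1.6449..., matching (3).
```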

4 Height 1 Binary Tree

In this section, we investigate a binary OR-tree of height 1 with a communication channel for each leaf (Fig. 3). We are going to compare depth-first communication and non-depth-first communication.

Fig. 3. Height 1, binary case

Example 1 (Cost of depth-first communication). Let p be a real number such that 0 < p < 1, and let d_p be an IID such that the probability of a leaf having value 0 is p. Assume that the interruption probabilities are given by (1) with k = 1.

$$\mathrm{prob}[E_{n,1} \mid E_{n,0}] = [n/(n+1)]^{2} \tag{5}$$

In this model, we investigate the following algorithm L^ω R^ω: make queries to x₀ until x₀ returns an answer; if the value is 0, then make queries to x₁ until x₁ returns an answer. Let cost(L^ω R^ω, d_p; [n/(n+1)]²) denote the expected cost of L^ω R^ω under d_p and the interruption probabilities (5). Then, by Lemma 1, we have the following.

$$\mathrm{cost}(L^{\omega}R^{\omega}, d_p; [n/(n+1)]^{2}) = \alpha_1 + p\alpha_1 = (1+p)\,\zeta(2) \tag{6}$$ □

Let (LR)^ω denote the following algorithm: repeat "make a query to x₀, then make a query to x₁" while neither x₀ nor x₁ returns an answer; if x_i returns an answer, break the loop, and if the value of x_i is 0, make queries to x_{1−i} until x_{1−i} returns an answer. Let p and d_p be those of Example 1, and assume that the interruption probabilities are given by (5). Let cost((LR)^ω, d_p; [n/(n+1)]²) denote the expected cost of (LR)^ω under d_p and these interruption probabilities.

We are going to define a sequence {β_k}_{k=1,2,⋯} so that cost((LR)^ω, d_p; [n/(n+1)]²) = Σ_{k=1}^∞ β_k. In the following, f(x) denotes [x/(x+1)]², and α_k is as defined in Lemma 1. For each n ≥ 1, let E_n be the event "the first return from a leaf happens at the nth access to the leaves". Let β_n = prob[E_n] × (n + p α_{(n+1)/2}) if n is odd, and β_n = prob[E_n] × (n + p α_{(n+2)/2}) if n is even. Then the following hold.

$$\beta_1 = (1 - f(1))(1 + p\alpha_1), \tag{7}$$
$$\beta_2 = f(1)(1 - f(1))(2 + p\alpha_2), \tag{8}$$
$$\beta_{2k+1} = \Big(\prod_{j=1}^{k} f(j)^{2}\Big)(1 - f(k+1))(2k + 1 + p\alpha_{k+1}), \tag{9}$$
$$\beta_{2k+2} = \Big(\prod_{j=1}^{k} f(j)^{2}\Big)\, f(k+1)(1 - f(k+1))(2k + 2 + p\alpha_{k+2}) \tag{10}$$

Lemma 2.

$$\mathrm{cost}((LR)^{\omega}, d_p; [n/(n+1)]^{2}) = 2\zeta(2) + (1-p)(\zeta(4) - 3) \tag{11}$$

Proof. It is easy to verify the following.

$$\beta_{2k+1} + \beta_{2k+2} = (k+1)^{-4} + \frac{2k + p\alpha_{k+1}}{(k+1)^{4}} - \frac{2(k+1) + p\alpha_{k+2}}{(k+2)^{4}} + \frac{1}{(k+1)^{2}(k+2)^{2}} + \frac{p(\alpha_{k+2} - \alpha_{k+1})}{(k+1)^{2}(k+2)^{2}} \tag{12}$$

Hence, we have the following.

$$\sum_{j=1}^{2k+2} \beta_j = \sum_{j=1}^{k+1} j^{-4} + p\alpha_1 - \frac{2k + 2 + p\alpha_{k+2}}{(k+2)^{4}} + \sum_{j=0}^{k} \frac{1}{(j+1)^{2}(j+2)^{2}} + \sum_{j=0}^{k} \frac{p(\alpha_{j+2} - \alpha_{j+1})}{(j+1)^{2}(j+2)^{2}} \tag{13}$$

Here, the following holds.

$$\sum_{j=1}^{k+1} j^{-4} \to \zeta(4) \quad (k \to \infty) \tag{14}$$

Throughout the rest of the proof, let σ_x denote $\sum_{j=1}^{x} j^{-2}$. By Lemma 1, the following hold.

$$p\alpha_1 = p\,\zeta(2), \tag{15}$$
$$\frac{2k + 2 + p\alpha_{k+2}}{(k+2)^{4}} = O(k^{-3}) + \frac{p(\zeta(2) - \sigma_{k+1})}{(k+2)^{2}} \to 0 \quad (k \to \infty) \tag{16}$$

The third term of (13) is estimated as follows.

$$\sum_{j=0}^{k} \frac{1}{(j+1)^{2}(j+2)^{2}} = \sum_{j=0}^{k} \Big(-\frac{2j+1}{(j+1)^{2}} + \frac{2j+3}{(j+2)^{2}} + \frac{2}{(j+2)^{2}}\Big) = -3 + \frac{2k+3}{(k+2)^{2}} + 2\sigma_{k+2} \to -3 + 2\zeta(2) \quad (k \to \infty) \tag{17}$$

Again by Lemma 1, we get the following.

$$\frac{\alpha_{j+2} - \alpha_{j+1}}{(j+1)^{2}(j+2)^{2}} = \big[((j+1)^{-2} - (j+2)^{-2})\zeta(2) - (j+1)^{-4}\big] - ((j+1)^{-2} - (j+2)^{-2})\,\sigma_j \tag{18}$$

Here, the sum of the bracketed terms [⋯] has the following limit.

$$\sum_{j=0}^{k} \big[((j+1)^{-2} - (j+2)^{-2})\zeta(2) - (j+1)^{-4}\big] = (1 - (k+2)^{-2})\zeta(2) - \sum_{j=1}^{k+1} j^{-4} \to \zeta(2) - \zeta(4) \quad (k \to \infty) \tag{19}$$

We are going to show that $\sum_{j=0}^{\infty} ((j+1)^{-2} - (j+2)^{-2})\sigma_j = -3 + 2\zeta(2)$. Let a be the left-hand side. Here, we have σ₀ = 0, thus we may ignore the term for j = 0.

$$\begin{aligned}
a &= \sum_{j=1}^{\infty} \Big[(j+1)^{-2}\sum_{k=1}^{j} k^{-2} - (j+2)^{-2}\sum_{k=1}^{j+1} k^{-2}\Big] + \sum_{j=1}^{\infty} (j+1)^{-2}(j+2)^{-2} \\
&= 1/4 - \lim_{n\to\infty}(n+2)^{-2}\sum_{k=1}^{n+1} k^{-2} + \sum_{j=1}^{\infty} (j+1)^{-2}(j+2)^{-2} \\
&= \sum_{j=0}^{\infty} (j+1)^{-2}(j+2)^{-2} \\
&= -3 + 2\zeta(2) \quad \text{[by (17)]}
\end{aligned} \tag{20}$$

By (18), (19) and (20), we can evaluate the fourth term of (13).

$$\sum_{j=0}^{\infty} \frac{p(\alpha_{j+2} - \alpha_{j+1})}{(j+1)^{2}(j+2)^{2}} = p\,(3 - \zeta(2) - \zeta(4)) \tag{21}$$

By (13), (15), (16), (17) and (21), we find the cost.

$$\mathrm{cost}((LR)^{\omega}, d_p; [n/(n+1)]^{2}) = \sum_{j=1}^{\infty} \beta_j = 2\zeta(2) + (1-p)(\zeta(4) - 3) \tag{22}$$

In other words, (11) holds. □

Corollary. Suppose that 0 < p < 1. Then cost((LR)^ω, d_p; [n/(n+1)]²) is less than cost(L^ω R^ω, d_p; [n/(n+1)]²).

Proof. By Example 1 and Lemma 2, we have the following.

$$\mathrm{cost}(L^{\omega}R^{\omega}, d_p; [n/(n+1)]^{2}) - \mathrm{cost}((LR)^{\omega}, d_p; [n/(n+1)]^{2}) = (1-p)(-\zeta(2) - \zeta(4) + 3) > 0 \tag{23}$$ □
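The gap in (23) is small but strictly positive for every p in (0, 1). The following lines (ours, for illustration only) evaluate both closed-form costs:

```python
import math

zeta2 = math.pi ** 2 / 6    # 1.6449...
zeta4 = math.pi ** 4 / 90   # 1.0823...

def cost_LwRw(p):           # eq. (6): depth-first communication
    return (1 + p) * zeta2

def cost_LRw(p):            # eq. (11): alternating communication
    return 2 * zeta2 + (1 - p) * (zeta4 - 3)

for p in (0.1, 0.5, 0.9):
    print(p, cost_LwRw(p), cost_LRw(p), cost_LwRw(p) - cost_LRw(p))
# The difference equals (1 - p)(3 - zeta(2) - zeta(4)), about 0.273 (1 - p).
```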

5 Generalization

In Sects. 3 and 4, we observed the particular interruption probability setting f(n) = [n/(n+1)]². In this section, we investigate the situation where we substitute a real-valued function g for f. We put the following hypotheses on g.

1. The domain of g is the set of all positive integers, and for each n it holds that 0 < g(n) < 1.
2. The infinite product $\prod_{n=1}^{\infty} g(n)$ converges to 0.
3. For each positive integer k, the following summation converges.
$$\alpha_k^g := \sum_{j=1}^{\infty} \Big(\prod_{i=k}^{k+j-2} g(i)\Big)\big(1 - g(k+j-1)\big)\, j \tag{24}$$
4. For each integer k ≥ 2, we have α₁^g < α_k^g.
5. For each positive integer k and each real number p such that 0 ≤ p ≤ 1, we define the real numbers β_k^g as follows.
$$\beta_1^g := (1 - g(1))(1 + p\alpha_1^g), \tag{25}$$
$$\beta_2^g := g(1)(1 - g(1))(2 + p\alpha_2^g), \tag{26}$$
$$\beta_{2k+1}^g := \Big(\prod_{j=1}^{k} g(j)^{2}\Big)(1 - g(k+1))(2k + 1 + p\alpha_{k+1}^g), \tag{27}$$
$$\beta_{2k+2}^g := \Big(\prod_{j=1}^{k} g(j)^{2}\Big)\, g(k+1)(1 - g(k+1))(2k + 2 + p\alpha_{k+2}^g) \tag{28}$$
Then for each p (0 ≤ p ≤ 1), the following summation converges.
$$\beta_{\infty}^g := \sum_{k=0}^{\infty} \big(\beta_{2k+1}^g + \beta_{2k+2}^g\big) \tag{29}$$

Throughout the rest of this section, we assume the following two conditions: the function g satisfies all five hypotheses above, and the tree is a binary OR-tree of height 1. Then cost(L^ω R^ω, d_p; g) = (1 + p) α₁^g, and cost((LR)^ω, d_p; g) = β_∞^g.

Lemma 3. It holds that cost((LR)^ω, d_p; g) ≤ cost(L^ω R^ω, d_p; g). The equality holds only in the case where p = 1.

Proof. We consider the two costs as functions of p. Let cost((LR)^ω, d_p; g) = a₀p + b₀. We have cost(L^ω R^ω, d_p; g) = α₁^g p + α₁^g. In the case where p = 1, the values of the leaves are 0, so any algorithm has to probe both leaves to determine the value of the root; hence, for any algorithm, the expected cost is 2α₁^g. This means the following.

$$a_0 \times 1 + b_0 = 2\alpha_1^g = \alpha_1^g \times 1 + \alpha_1^g \tag{30}$$

We are going to show that a₀ > α₁^g. Recall that a₀ is the coefficient of p in β_∞^g. By hypothesis 4, the following holds.

$$\begin{aligned}
a_0 &> \alpha_1^g \sum_{k=0}^{\infty} \Big(\prod_{j=1}^{k} g(j)^{2}\Big)(1 + g(k+1))(1 - g(k+1)) \\
&= \alpha_1^g \lim_{n\to\infty} \sum_{k=0}^{n} \Big[\prod_{j=1}^{k} g(j)^{2} - \prod_{j=1}^{k+1} g(j)^{2}\Big] \\
&= \alpha_1^g \lim_{n\to\infty} \Big[\prod_{j=1}^{0} g(j)^{2} - \prod_{j=1}^{n+1} g(j)^{2}\Big]
\end{aligned} \tag{31}$$

By the convention $\prod_{j=1}^{0} = 1$ and hypothesis 2, the last formula equals α₁^g(1 − 0) = α₁^g. Therefore, the two positive real numbers a₀ and α₁^g satisfy a₀ > α₁^g. Hence, by (30), a₀x + b₀ ≤ α₁^g x + α₁^g for x in the closed interval [0, 1], and the equality holds exactly when x = 1. □

In the case where g(n) is f(n) = [n/(n+1)]², the above five hypotheses surely hold. Hypothesis 4 is verified as follows. Given a positive integer n, the real-valued function x ↦ x²/(x+n)² has a positive derivative for x > 0; therefore, for each integer k ≥ 2, it holds that 1/(1+n)² < k²/(k+n)². Thus it is not difficult to see that α₁ = ζ(2) = Σ_{n=0}^∞ 1/(1+n)² < Σ_{n=0}^∞ k²/(k+n)² = α_k. In the same way, the five hypotheses hold in the case where g(n) = [n/(n+1)]³.

6 The Theorem

Now, we investigate a tree of arbitrary height. Let T be a balanced AND-OR tree or a balanced OR-AND tree of height h (≥ 1). Suppose that p is a real number such that 0 < p < 1, and let d_p be an IID such that at each leaf the probability of having value 0 is p. Assume that g is a function satisfying the five hypotheses of Sect. 5 and that the interruption probability setting is given by g(n). Given an algorithm A on T and a real number p, let cost(A, d_p; g) denote the expected cost of A under d_p and the above interruption probabilities.

Theorem. In addition to the above setting, suppose that A is a depth-first algorithm on T and A performs depth-first communication (see Definition 1 in the Introduction). Then A is not optimal.

Proof. We investigate the case where the nodes just above the leaves are OR-gates; the other case (AND-gates) is treated in a similar way. The case where T is a binary OR-tree of height 1 is shown by Lemma 3, so in the following, T is assumed to have more than 2 leaves.

Since the distribution d_p is an IID and 0 < p < 1, there exists an initial segment γ of a computation path of A on T such that γ has positive probability and, after γ, A performs in the same way as L^ω R^ω. More precisely, γ (of length, say, k ≥ 1) consists of ordered pairs ⟨x^(i), a^(i)⟩ (i = 0, …, k − 1) of leaves x^(i) and truth values a^(i) ∈ {0, 1}; in addition, there are leaves x^(k) and x^(k+1), and the following hold.

1. There exists an OR-gate (say, x_u) such that x^(k) and x^(k+1) are its children.
2. At the beginning of the computation, A makes queries to x^(0) until an answer is returned.
3. For each i < k, if the answer of x^(i) is a^(i), then A makes queries to x^(i+1) until an answer is returned.
4. In the presence of d_p and the interruption probabilities given by g, A performs the move γ (until getting the answer a^(k−1)) with positive probability.


5. If the answer of x^(k−1) is a^(k−1), then A makes queries to x^(k) until an answer is returned.
6. If the answer of x^(k) is 0, then A makes queries to x^(k+1) until an answer is returned. Then A finds the value of the root.
7. If the answer of x^(k) is 1, then A finds the value of the root.

Figure 4 illustrates the OR-gate x_u and its child leaves. For example, x^(k) and x^(k+1) are possibly x_{u,m−2} and x_{u,m−1}, respectively: the last two leaves reached with positive probability.

Fig. 4. General case

Now, let B be the following algorithm. B simulates A; however, if the history γ (including the instant of getting the answer a^(k−1)) happens, then B performs as (x^(k) x^(k+1))^ω. In other words, B obeys the following instructions: repeat "make a query to x^(k), then make a query to x^(k+1)" while neither x^(k) nor x^(k+1) returns an answer; if x^(k+i) (i ∈ {0, 1}) returns an answer, break the loop, and then make queries to x^(k+1−i) until x^(k+1−i) returns an answer. By Lemma 3, B has a lower cost than A. □

7 Summary and Future Directions

We investigated multi-branching balanced AND-OR trees with communication channels between the leaves and the main body. For each such tree, we showed a concrete example of an interruption probability setting with the following property: for any independent and identical distribution on the truth values of the leaves (probability assumed to be neither 0 nor 1), depth-first algorithms with depth-first communication are not optimal. Our main tool is the Riemann zeta function. The following are future directions.

• Characterization of optimal algorithms in the presence of the above interruption probability setting.
• Study of words of infinite length as algorithms of our model. For example, L^ω R^ω is an infinite sequence LL⋯RR⋯, and (LR)^ω is (LR)(LR)⋯.


• Application to computation models under emergency conditions where the batteries of devices are losing power.
• Application to neural networks.

Acknowledgement. We are grateful to the anonymous referees of the previous version for their helpful advice. We wish to thank the attendees of IMECS 2018 for valuable discussion.

References

1. Abramowitz, M., Stegun, I.A. (eds.): Handbook of Mathematical Functions: With Formulas, Graphs, and Mathematical Tables, Tenth Printing. U.S. Government Printing Office, Washington, D.C. (1972)
2. Knuth, D.E., Moore, R.W.: An analysis of alpha-beta pruning. Artif. Intell. 6, 293–326 (1975)
3. Liu, C.G., Tanaka, K.: Eigen-distribution on random assignments for game trees. Inform. Process. Lett. 104, 73–77 (2007)
4. Pearl, J.: Asymptotic properties of minimax trees and game-searching procedures. Artif. Intell. 14, 113–138 (1980)
5. Pearl, J.: The solution for the branching factor of the alpha-beta pruning algorithm and its optimality. Commun. ACM 25, 559–564 (1982)
6. Peng, W., Peng, N.N., Ng, K.M., Tanaka, K., Yang, Y.: Optimal depth-first algorithms and equilibria of independent distributions on multi-branching trees. Inform. Process. Lett. 125, 41–45 (2017)
7. Peng, W., Okisaka, S., Li, W., Tanaka, K.: The uniqueness of eigen-distribution under non-directional algorithms. IAENG Int. J. Comput. Sci. 43, 318–325 (2016)
8. Saks, M., Wigderson, A.: Probabilistic Boolean decision trees and the complexity of evaluating game trees. In: Proceedings of 27th IEEE FOCS, pp. 29–38 (1986)
9. Suzuki, T.: Kazuyuki Tanaka's work on AND-OR trees and subsequent developments. Ann. Jpn. Assoc. Philos. Sci. 25, 79–88 (2017)
10. Suzuki, T.: An AND-OR-tree connected to leaves via communication channels. In: Lecture Notes in Engineering and Computer Science: Proceedings of The International MultiConference of Engineers and Computer Scientists, Hong Kong, 14–16 March 2018, pp. 185–189 (2018)
11. Suzuki, T.: Non-depth-first search against an independent distribution on a balanced AND-OR tree. Inform. Process. Lett. 139, 13–17 (2018)
12. Suzuki, T., Nakamura, R.: The eigen distribution of an AND-OR tree under directional algorithms. IAENG Int. J. Appl. Math. 42, 122–128 (2012)
13. Suzuki, T., Niida, Y.: Equilibrium points of an AND-OR tree: under constraints on probability. Ann. Pure Appl. Logic 166, 1150–1164 (2015)
14. Tarsi, M.: Optimal search on some game trees. J. ACM 30, 389–396 (1983)

Intermittent Snapshot Method for Data Synchronization to Cloud Storage

Yuichi Yagawa¹, Mitsuo Hayasaka¹, Nobuhiro Maki¹, Shin Tezuka¹, and Tomohiro Murata²

¹ Research & Development Group, Hitachi, Ltd., 1-280, Higashi-koigakubo, Kokubunji-shi, Tokyo 185-8601, Japan {yuichi.yagawa.bh,mitsuo.hayasaka.hu,nobuhiro.maki.vg,shin.tezuka.xs}@hitachi.com
² Graduate School of Information, Production and Systems, Waseda University, Kitakyusyu-shi, Fukuoka 808-0315, Japan [email protected]

Abstract. We developed a configuration that places a cache storage called Cloud on-Ramp (CoR), with small capacity, between the client terminals at remote offices and a cloud storage connected over a wide area network. A CoR is placed in each remote office, and applications access it via a local area network, guaranteeing access performance. A CoR synchronizes its data with the cloud storage by copying data updated by the client terminals at regular intervals. The window in which this bulk-copy function runs must be shortened and repeated many times to keep the latest data in the cloud storage, and a method of maintaining the consistency of data at a CoR is necessary even if the data are updated by a client terminal during the copy. Therefore, we propose the "intermittent snapshot method", with which a snapshot is taken during bulk-copy execution and released as soon as the bulk copy is over. We evaluated the proposed method from the perspective of an implementation design: we formalized the method using a stochastic Petri-net model and, through simulation, considered the proper size of the bulk-copy window that optimizes both the synchronization delay and application-access performance.

Keywords: Cloud storage · Cache storage · Cloud on-Ramp (CoR) · Data synchronization · Intermittent snapshot · Stochastic Petri-net

1 Introduction

Cloud storage [1, 2] has begun to be used for collaboration between the remote offices of a company. Because a delay problem arises when client terminals in a distributed environment connect to cloud storage through a wide area network (WAN), a configuration that places a cache storage with small capacity between the client terminals and the cloud storage has been suggested [3–5]. A cache storage is placed in each remote office, and applications access it via a local area network (LAN), guaranteeing access performance. We call such a cache storage Cloud on-Ramp (CoR) [3–5]. A CoR synchronizes its data with the cloud storage by copying updated data from the client terminals to the cloud storage in a lump at regular intervals, based on the write-back cache algorithm.


We call this the "bulk-copy" function. This function should not affect the access performance of the client terminals and their applications; therefore, it is usually executed when work in the remote office has finished, including at night.

The demand for placing the latest data into the cloud storage has recently increased. For example, there is a need to share the data that have accumulated in the CoRs of many remote offices and to refer to them through the cloud storage immediately. It is also necessary to place as much of the latest data into the cloud storage as possible, because the data must be recovered from the cloud storage when data-access errors occur at a CoR. We call the difference between the synchronization time at the cloud storage and that at a CoR the "synchronization delay". The shorter the synchronization delay, the fresher the data users can access through the cloud storage.

To shorten the synchronization delay, the interval at which the bulk-copy function is executed must be shortened and the function repeated many times. We call this interval the "bulk-copy window". However, the time required to execute a bulk copy is prolonged when files are updated during the bulk-copy window, because updated files must be re-copied [6]. To shorten the bulk-copy window and repeat it many times, a method is needed for maintaining the consistency of the data being copied even if the data are updated by a client terminal during the copy. A general method of maintaining the consistency of copied data is a snapshot [7]: in a CoR, a snapshot of the data updated by the client terminals is taken and copied to the cloud storage.

In this study, we assume office applications and data-processing applications sharing data among remote offices; in the input/output (IO) of such applications there are more read commands (reading data from a CoR) than write commands (writing data to a CoR). Therefore, we decided to adopt the Copy-On-Write (COW) snapshot method, whose effect on read performance is zero [7]. However, COW has the problem that the effective performance of writes from the applications degrades. Therefore, we propose the "intermittent snapshot method", with which a snapshot is taken only during bulk-copy execution and released as soon as the bulk copy is over. The performance penalty on writes from the applications is reduced by limiting the time during which the snapshot is taken.

There remains the implementation problem of appropriately determining the bulk-copy window. Users request a short window so that the latest data are placed in the cloud storage; however, the number of bulk-copy executions then increases, in other words, it becomes easier for the applications to incur a performance penalty because the total snapshot time increases. There are also cases in which a bulk-copy window grows when the preceding bulk-copy function has not finished before the next one is due to start. Conventionally, this bulk-copy window has been determined from experience. For this technical problem, we formalize our intermittent snapshot method as a stochastic Petri-net model and calculate a suitable bulk-copy window through simulation.


2 Bulk-Copy Function and Its Problem in CoR

In this chapter, we give an overview of the bulk-copy function and its problem in a CoR.

2.1 System Architecture

The system architecture we assumed for this study is based on that of the CoR we previously proposed [3–5]. As shown in Fig. 1, the system is composed of CoRs placed in remote offices, applications in client terminals accessing the CoRs, and a cloud storage connected from the CoRs.

[Figure: two remote offices, each with clients/applications connected over a LAN to a CoR (a file system holding files and stubs); the CoRs copy, refer to, and recall files over a WAN to/from the cloud storage, which holds per-tenant namespaces.]

Fig. 1. Assumed system architecture

End users operate the application programs in the client terminals, which generate file data. Write requests to store data in a CoR and read requests to obtain data stored in the CoR are issued using general file-sharing protocols such as CIFS or NFS. In addition, the client terminals are connected to a CoR by a high-speed LAN.

A CoR is a cache storage that temporarily stores file data. It handles the requests received from the client terminals and stores or provides the data. Furthermore, the data are regularly copied in bulk to a cloud storage to protect them. For file data on a CoR whose bulk copy has completed, the usage is checked regularly.


Because a CoR has small capacity, data with a low access frequency are deleted from the CoR, leaving only a reference (stub) to the replica written in the cloud storage. When a read request for such data is received, the data are first recovered from the cloud storage to the CoR and then returned to the client terminals; see our previous study [5] regarding this process.

The cloud storage is an object storage that permanently stores file data. It handles the replication requests received from a CoR and stores the data; HTTP/HTTPS, which is suitable for communication over a wide area, is used for the requests. The data are reproduced on several of the nodes constituting the cloud storage and thereby protected. In addition, the network connecting the CoRs and the cloud storage is assumed to be a WAN, which is slower than a LAN, because such a network is used in a system applied to remote offices located over a wide geographic area.

The bulk-copy function of a CoR is executed regularly, based on the bulk-copy window. A system manager can set the bulk-copy window, which is determined by the demands of the end users; for example, the window is kept short if the end users want the latest data kept shared in the cloud storage.

2.2 Bulk-Copy Function

The bulk-copy function is composed of the following three processes.

In the pre-process, the data for the bulk-copy function are extracted: the CoR finds updated data and outputs a copy-entity list. To shorten this process, the CoR records the operation history of the clients on files and directories beforehand, using an operating system function [8].

In the replication process, the CoR transfers the data of the files and directories listed in the copy-entity list to the cloud storage, and the cloud storage receives and stores them.

In the post-process, the result of each repeatedly executed bulk-copy function is recorded, for example when a process does not finish within a bulk-copy window because a large quantity of data must be copied. In this case, the current iteration continues and the next iteration is skipped, but the results are recorded; with reference to this record, the manager can confirm whether the bulk-copy window the end users require can be kept.

The bulk-copy function must be executed while maintaining the data consistency of the IO requests from the applications. Therefore, a snapshot method is used: the applications access a primary volume, and the bulk-copy function accesses a snapshot volume, maintaining data consistency.

2.3 Conventional Snapshot Methods and Their Challenges

There are generally two methods of taking a snapshot: the redirection-on-write method (ROW) and the copy-on-write method (COW) [7]. An overview of the two methods and the challenges when they are applied to the bulk copy follows.

2.3.1 Redirection on Write Snapshot Method (ROW)

ROW is a snapshot method using log-structured data management. Every chunk of new data is appended to a physical volume as a log entry, and a logical address refers to the address in the physical volume. When an application writes to a file or directory, the chunk is first written as an appended entry in the physical volume. The primary volume manages the physical address of each chunk logically, in a B+tree manner, and the snapshot volume equally manages the snapshot as an aggregate of physical addresses. The snapshot is resolved into the data array that should be read, together with the appended data, by converting the logical address of the chunk the application requires into a physical address.

When ROW is applied to the bulk-copy function, there are performance problems. In both the primary and snapshot volumes, read performance always degrades because of the address translation of the chunks; in particular, the data of one file easily end up at distant positions on the physical disk, so the read-time penalty is large. Write performance to the primary volume also degrades: because address translation and chunk allocation must be performed regardless of the write size, write performance always suffers.

2.3.2 Copy on Write Snapshot Method (COW)

COW is implemented as a function of the logical volume manager (LVM) and takes a snapshot by using a snapshot volume and a COW table of references, prepared separately from the primary volume. When a client overwrites a file or directory, the LVM first detects the chunk of data to be overwritten and copies it to the snapshot volume; the address of the copy is recorded in the COW table. COW then overwrites the chunk with the new data, and the COW table is stored at a fixed position in the snapshot volume.

There are also performance problems when COW is applied to the bulk-copy function. Write performance to the primary volume decreases greatly while a snapshot is being taken (the snapshot period), because the number of volume read and write requests increases compared with normal IO: in addition to the write of the new chunk, there occur the read of the chunk to be overwritten, the write of that chunk to the snapshot volume, and the write of the COW table. In addition, the read performance of the snapshot degrades, because data in the snapshot must be located via the COW table and fetched from the snapshot volume. On the other hand, the read performance of the primary volume seen by the application does not degrade.

2.4 Problems in Implementing the Bulk-Copy Function

As explained above, ROW and COW cause problems in read/write performance. To reflect the latest data in the cloud storage, the frequency of the bulk-copy function increases and the synchronization delay is expected to shorten; however, it then becomes easier for an application to incur a read or write performance penalty when the conventional snapshot methods are used, because the snapshot period increases. Also, a bulk-copy window grows when the processing of the preceding bulk-copy function does not finish before the next one starts. Choosing the appropriate snapshot method for the assumed use environment and the appropriate copy-window size (time width) are therefore systematic problems. However, the bulk-copy window size has conventionally been determined from experience, and an appropriate bulk-copy window size, and the performance limit of the bulk-copy function based on that size, have not been studied.

3 Intermittent Snapshot Method

We developed the intermittent snapshot method to limit the snapshot period. In other words, a snapshot starts to be taken just before the actual replication processing, i.e., the file-data transfer, starts, and finishes just after the replication process ends. This makes it possible to avoid performance degradation in the remaining time of the bulk-copy window after the snapshot. This process is repeated at every bulk-copy execution. General office applications and data-processing applications sharing data among remote offices are assumed, and COW is used because its effect on read performance is zero, considering that there are more read requests than write requests in the IO of such applications.

3.1 Overview of the Intermittent Snapshot Method

The method involves the bulk-copy process, the write process from applications while a snapshot is being taken, and the normal write process without a snapshot. Though the bulk-copy function is executed repeatedly, these processes alternate as explained in the following paragraphs; other processes, including reads from applications, are unchanged from the conventional ones.

In the bulk-copy process, the CoR repeatedly replicates the collective file data in the copy-entity list to the cloud storage, where the file data are received and stored. At replication start, a snapshot starts to be taken, and the write process with snapshot replaces the normal write process: when a write request occurs, the CoR first reads the current data, writes the current data to the snapshot volume, and updates the COW table; the new data are then written to the primary volume. At this time, a penalty of one read and three writes is incurred in comparison with a normal write. At replication end, the snapshot stops being taken, and the normal write process replaces the write process with snapshot: when a write request occurs from the application, the CoR simply writes the request data to the primary volume.

The impact on application-write performance can be reduced by limiting the time during which the snapshot is taken. The method is particularly effective with applications composed mainly of reads, because the effect of COW on read performance is zero. It is also possible to reduce the cost of the cache storage, because only the domain of a single-generation snapshot is needed in the CoR, whereas a snapshot domain of many generations is always necessary in ordinary network-attached storage for data backup.
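The following sketch (our illustration, not the authors' implementation; all names are ours) shows the switching between the two write paths around a bulk copy, with a dictionary standing in for the COW table:

```python
import threading

class IntermittentSnapshotVolume:
    """Schematic CoR write path: COW is active only while a bulk copy runs."""

    def __init__(self):
        self.primary = {}          # block id -> current data
        self.cow_table = {}        # block id -> pre-snapshot data
        self.in_bulk_copy = False
        self.lock = threading.Lock()

    def write(self, block, data):
        with self.lock:
            if (self.in_bulk_copy and block in self.primary
                    and block not in self.cow_table):
                # COW penalty path: preserve the old data first.
                self.cow_table[block] = self.primary[block]
            self.primary[block] = data   # normal write path otherwise

    def bulk_copy(self, replicate):
        with self.lock:
            self.in_bulk_copy = True     # snapshot starts with the copy
            blocks = list(self.primary)  # copy-entity list at snapshot time
        for b in blocks:
            with self.lock:              # read through the snapshot:
                # pre-update data if the block was overwritten meanwhile
                data = self.cow_table.get(b, self.primary[b])
            replicate(b, data)           # transfer to the cloud storage
        with self.lock:
            self.in_bulk_copy = False    # snapshot released immediately
            self.cow_table.clear()
```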

3.2 Design Criteria for Implementing the Intermittent Snapshot Method

With the proposed method, the relationship between the size of the bulk-copy window and the synchronization delay must be addressed. For example, each bulk-copy function is started every hour or every 30 min when the bulk-copy window is set to one hour or 30 min. Each snapshot starts in sync with the bulk-copy function and completes when all data replications in the copy-entity list have finished. This snapshot period is the same as the bulk-copy execution time, which expresses the synchronization delay, and it must be shorter than the bulk-copy window. In particular, the snapshot period should be further shortened because the performance penalty for write requests from applications to a CoR occurs with COW. The period depends on the number of data replications and the WAN transfer bandwidth, and the number of transfers is determined by the number of files updated by the applications, and their data sizes, during the preceding bulk-copy window.

The synchronization delay is expected to decrease because the latest data should be placed in the cloud storage. It is therefore necessary to shorten the bulk-copy window, but the number of bulk-copy starts increases at the same time; as a result, it becomes easier for the applications to incur a performance penalty because the total snapshot period increases. In addition, a bulk-copy window promised to the end users cannot be kept when the current bulk-copy function does not complete before the next one begins. In an implementation design, it becomes necessary to determine appropriately how much the bulk-copy window can be shortened given the statistical frequency of read and write requests under the assumed application environment. Specifically, under the assumed write performance and WAN-transfer bandwidth, the smallest bulk-copy window for which 100% of snapshots complete is required.

4 Implementation Design

For the design criteria explained in Sect. 3, we discuss a model of our intermittent snapshot method and a simulation experiment in this section.

4.1 Stochastic Petri-Net Model of the Intermittent Snapshot Method

We first formalize our intermittent snapshot method using a stochastic Petri-net model to solve the implementation problem, because the structure and behavior of a system can be expressed visually with it. The stochastic Petri-net model can also express the random characteristics of write requests from applications and the uncertainty of WAN data transmission. The model consists of a main process and a timing control, as shown in Fig. 2. Each component is explained as follows.


[Figure: Petri net with places P0–P7 and transitions T0–T7 and T12, spanning the application/client, CoR, and cloud storage layers; X and Y annotate transition T3. Legend: inhibitor arc, permission arc, immediate transition, timed transition, stochastic transition.]

Fig. 2. Stochastic Petri-net model of proposed intermittent snapshot method

4.1.1 Main Process

On the application side, stochastic transition T0 expresses write requests and issues write-request tokens according to a probability distribution; the randomness of the write requests is assumed to obey an exponential distribution. Place P0 expresses the divergence to T1, which expresses normal write processing without a snapshot, and to T12, which expresses write processing while a snapshot is taken. In the CoR, T1 and T12 are connected from P6 by a permission arc and an inhibitor arc, respectively, so either T1 or T12 is chosen based on the condition of P6.


Transition T1 requires a processing time equal to the normal write-request performance, while transition T12 requires a processing time based on the performance penalty of a COW snapshot. The write-request tokens from T1 or T12 are collected in P1. Place P2 expresses a wait buffer for a bulk copy from the CoR to the cloud storage, and T2 controls the flow of tokens from P1 to P2. Transition T2 is connected by an inhibitor arc from P7, which expresses the state of a snapshot during bulk-copy execution, so T2 is inhibited during the bulk-copy execution time. When the bulk-copy execution is completed (i.e., all data replications complete), T2 fires, and the write-request tokens that have become candidates for the next bulk copy are collected in P2.

In T3, write-request tokens to the same files are coalesced. The mean probability of unique files updated among all write requests is denoted by X. The number of unique files is then multiplied by a data-set conversion ratio Y and converted into the data-set tokens to be replicated, where Y is calculated by Formula (1); the values needed to calculate X and Y in the simulation are determined from a real environment. In addition, T3 is connected by a permission arc from P7, so T3 fires during the bulk-copy execution, and the data-set tokens for the bulk-copy execution are collected in P3.

$$Y = \text{average file size} \,/\, \text{data-set size} \tag{1}$$

In the cloud storage, stochastic transition T4 expresses the WAN transfer delay, and its condition is the same as in our previous study [5]. Place P4 is where all data-set tokens that have been transferred to the cloud storage arrive.

4.1.2 Timing Control

Transitions T5–T7 and places P5–P7 control the timing of the main process by delivering tokens to it. Timed transition T5 expresses the bulk-copy window and issues a token at the regular period of the window. Place P5 expresses waiting for the completion of the bulk-copy function: if the previous bulk-copy function is not completed, a token remains in P5 and inhibits the issue of the token from T5. Transition T6 expresses the bulk-copy start when it fires. Place P6 expresses the normal-write state; it permits T1 and inhibits T12 when it holds a token, and inhibits T1 and permits T12 when it does not. When there is a token in both P5 and P6, T6 fires and the tokens are removed from P5 and P6; then T12 can fire, and taking the snapshot starts (T1 is inhibited, and normal writes are suspended).

Transition T7 expresses the bulk-copy completion when it fires, and it is connected to P3 by an inhibitor arc; in other words, T7 does not fire until all transfer tokens in P3 have disappeared. Place P7 expresses the state of bulk-copy execution, and its token waits until all transfer tokens in P3 disappear and T7 fires. Transition T2 is inhibited while there is a token in P7, so write-request tokens accumulate in P1; on the contrary, T3 is permitted to fire while the token is in P7, so all tokens in P2 are transferred to P3 for replication.

In the conventional methods using ROW or COW all the time, the stochastic Petri-net model has a configuration without T1 and without the permission and inhibitor arcs from P6.


The write-request tokens then always go through T12 with the snapshot penalty. The rest of the configuration of these conventional methods is the same as in Fig. 2.

4.2 Simulation Experiments and Results

We developed a simulator based on the model in Fig. 2. With the simulator, the synchronization delay was measured while changing the bulk-copy window (timed transition T5) to evaluate a suitable bulk-copy window size for the intermittent snapshot method.

4.2.1 Experimental Conditions

The control settings of the experiment were as follows. We first measured the mean file size, mean file update rate, and mean write-request performance on an actual system in our workplace. The measurement period was 8:00 to 20:00 on weekdays from June 22nd through September 5th, 2017. The mean file size was 1 MB; the mean file update rate (X) was 33% (the rate excluding write requests issued to the same file); and the mean write-request performance (T0) was 6 operations per second (OPS), out of 381 OPS of all IO requests. We also assumed that the write requests (T0) occurred at random with exponentially distributed inter-arrival times.

Regarding the WAN data-transmission delay, stochastic transition T4 was parameterized by a Pareto distribution based on a study published in 2005 [9]. We assumed that the WAN data-transmission delay has not changed since 2005, so we used the same average and dispersion as that study. However, the bandwidth of a WAN varies with the use environment; we therefore assumed that the data-set size in Formula (1) was 36 times larger than the 9 KB used in 2005, based on our environment, and Y was calculated as 3 tokens. The delay times of a normal write (T1) and a write with a COW snapshot (T12) were set to 8 ms and 17 ms, respectively, as determined in a previous study [7]; note that the previous study assumed a 100% write-request ratio.

Applications were assumed to run from 8:00 to 20:00, so the measurement time was set to 12 h. The bulk-copy window size was changed from 1 s to 15360 s. The number of simulation executions was 10, and we averaged the results over all executions.

4.2.2 Calculations in the Simulation

We investigated the relationship between the size of the bulk-copy window and the synchronization delay. Specifically, we checked how much the bulk-copy window can be shortened while maintaining both the effective performance of write requests and a 100% copy-success ratio (finishing the bulk-copy execution within the copy window). The following quantities were calculated using the stochastic Petri-net model shown in Fig. 2.
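As an illustration of the two stochastic inputs (exponential write arrivals at 6 OPS and Pareto-distributed WAN delay), the sampling could look like the sketch below; the Pareto shape and scale are placeholders, since the parameters taken from [9] are not reproduced in the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# T0: write-request inter-arrival times, exponential with mean 1/6 s (6 OPS).
interarrival_s = rng.exponential(scale=1.0 / 6.0, size=10_000)

# T4: WAN transfer delay per data-set token, Pareto-distributed.
# The shape and scale are assumed placeholders, not the values from [9].
shape, scale_ms = 2.5, 20.0
wan_delay_ms = scale_ms * (1.0 + rng.pareto(shape, size=10_000))

print(interarrival_s.mean(), wan_delay_ms.mean())
```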


Effective performance of application-write requests (%)
= (T1 number of fire times + T12 number of fire times) / (T0 number of fire times) × 100   (2)

All tokens fired by T0 should go through either T1 or T12 within the measurement time, and no tokens should remain in P0.

Copy-success ratio (%)
= (1 − (Number of inhibited times in T5) / (assumed number of bulk-copy-execution times)) × 100   (3)

Transition T5 should fire regularly to meet the bulk-copy window. However, T5 cannot fire if there is a token left in P5.

Synchronization delay
= Σ(T7 fire time − T5 fire time) / (T7 number of fire times)   (4)

The fire of T5 means the start of taking a snapshot as well as the start of executing the bulk-copy function, and the fire of T7 means the end of taking the snapshot. We assume that T5 and T6 fire at the same time if there is a token in P6 at the regular fire time of time transition T5.

4.2.3 Experimental Results
As a result, the effective performance of application-write requests was maintained at 100% even when the bulk-copy window was changed from 1 s to 15360 s. The copy-success ratio was also maintained at 100%, except for the case in which the bulk-copy window was 1 s. The synchronization delay was measured as shown in Fig. 3. The synchronization delay increased rapidly from a bulk-copy window of 1920 s, which indicates the limit of the bulk-copy window under the given conditions. An appropriate bulk-copy window should be set below this limit. For example, if we choose 480 s as the bulk-copy window, the average synchronization delay becomes 45 s, which will meet end users' requests in our workplace. However, further study is required to define an optimal bulk-copy window.
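For concreteness, the following sketch shows how Eqs. (2)–(4) could be computed from per-transition fire-time logs such as a simulator might emit; all values below are invented toy numbers, not measurements from this experiment.

# Hedged sketch: computing Eqs. (2)-(4) from fire-count and fire-time logs.
# The logs below are made-up toy values, not results from the paper.

t0_fires = 100        # write requests issued
t1_fires = 64         # normal writes
t12_fires = 36        # writes with COW snapshot
t5_times = [0.0, 480.0, 960.0]      # bulk-copy window starts (s)
t7_times = [42.0, 530.0, 1003.0]    # corresponding bulk-copy completions (s)
t5_inhibited = 0
assumed_bulk_copies = len(t5_times)

effective_perf = (t1_fires + t12_fires) / t0_fires * 100        # Eq. (2)
copy_success = (1 - t5_inhibited / assumed_bulk_copies) * 100   # Eq. (3)
sync_delay = sum(e - s for s, e in zip(t5_times, t7_times)) / len(t7_times)  # Eq. (4)

print(f"effective performance: {effective_perf:.1f}%")   # 100.0%
print(f"copy-success ratio:    {copy_success:.1f}%")     # 100.0%
print(f"synchronization delay: {sync_delay:.1f} s")      # 45.0 s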


Fig. 3. Bulk-copy window and synchronization delay

5 Conclusion and Future Work

We proposed the intermittent snapshot method for a CoR to regularly synchronize data with cloud storage. The method adopts a COW snapshot to maintain the consistency of data at the CoR even if the data are updated during the copy by client terminals, but limits the snapshot period: the snapshot is taken only during the bulk-copy execution and released as soon as the execution is over. We also evaluated the proposed method from the perspective of an implementation design, in which the bulk-copy window must be shortened and repeated many times to keep the latest data in cloud storage. We formalized the method using a stochastic Petri-net model and, through simulation, considered the proper size of the bulk-copy window that optimizes both synchronization delay and application-access performance. We also found that there is a limit of the bulk-copy window under given conditions.

However, we need to consider the following points as future work. (a) The overheads of both the pre-processing and post-processing of the bulk-copy function may impact the overall throughput of the method. We assume that there is a tradeoff between synchronization delay and overhead, and that an optimal bulk-copy window can be determined from this tradeoff. (b) The intermittent snapshot method should be compared with the conventional ROW and COW methods that always take snapshots. We expect our method to perform better from the application's perspective.

Acknowledgment. We thank Mr. Yoshiyuki Fujita and Mr. Tatsuya Matsumoto of Hitachi Ltd., IT Services Division for their support during the experiments.


References

1. Alam, M., Shakil, K.A.: Recent developments in cloud based systems: state of art. arXiv:1501.01323 (2015)
2. Venkatesakumar, V., Yasotha, R., Subashini, A.: A brief survey on hybrid cloud storage and its applications. World Sci. News 46, 219–232 (2016)
3. Yagawa, Y., Sutoh, A., Matsuzawa, K., Fujita, Y., Matsuo, O., Murata, T.: A cloud storage cache system to improve data access performance through WAN. In: The 29th International Technical Conference on Circuits/Systems, Computers and Communications (2014)
4. Yagawa, Y., Sutoh, A., Malamura, E., Murata, T.: Modeling and performance evaluation of Cloud on-Ramp by utilizing a stochastic Petri-net. In: Proceedings of the 5th IIAI International Congress on Advanced Applied Informatics, Kumamoto, pp. 995–1000, June 2016
5. Yagawa, Y., Sutoh, A., Malamura, E., Murata, T.: Implementation design and performance evaluation of partial recall method. IEEJ Trans. Electron. Inf. Syst. 137(10), 1414–1421 (2017)
6. Nemoto, J., Sutoh, A., Iwasaki, M.: File system backup to object storage for on-demand restore. In: Proceedings of the 5th IIAI International Congress on Advanced Applied Informatics, Kumamoto, Japan, pp. 946–952, June 2016
7. Xiao, W., Yang, Q., Ren, J., Xie, C., Li, H.: Design and analysis of block-level snapshots for data protection and recovery. IEEE Trans. Comput. 58, 1615–1625 (2009)
8. Takata, M., Sutoh, A.: Event-notification-based inactive file search for large-scale file systems. In: Proceedings of Asia-Pacific Magnetic Recording Conference, TA-3 (2012)
9. Kashima, T., Kato, S.X., Akiyama, T., Nozaki, K., Matsumoto, Y.M., Shimojo, S.: A method for the estimation of collective communication time using probability distribution of communication latency in grid environment. IPSJ J. Comput. Syst. (ACS) 46(SIG16)(ACS12), 43–55 (2005)
10. Yagawa, Y., Hayasaka, M., Maki, N., Tezuka, S., Murata, T.: Intermittent snapshot method for data synchronization to cloud storage. In: Proceedings of the International MultiConference of Engineers and Computer Scientists, Lecture Notes in Engineering and Computer Science, Hong Kong, 14–16 March 2018, pp. 237–242 (2018)

Extraction and Graph Structuring of Variants By Detecting Common Parts of Frequent Clinical Pathways

Muneo Kushima1(&), Yuichi Honda2, Hieu Hanh Le2, Tomoyoshi Yamazaki1, Kenji Araki1, and Haruo Yokota2

1 Faculty of Medicine, University of Miyazaki Hospital, 5200 Kihara, Kiyotake-cho, Miyazaki-shi 889-1692, Japan
{muneo_kushima,yama-cp,taichan}@med.miyazaki-u.ac.jp
2 Department of Computer Science, Tokyo Institute of Technology, 2-12-1 Oookayama, Meguro-ku, Tokyo 152-8552, Japan
{honda,hanhlh}@de.cs.titech.ac.jp, [email protected]
http://mit.med.miyazaki-u.ac.jp/
http://www.de.cs.titech.ac.jp/

Abstract. In this research, common parts of frequent sequences were detected from the medical orders for catheter ablation described in the clinical pathways of University of Miyazaki Hospital; deformed patterns (variants) of the medical orders were then detected and represented graphically. The experiments showed that the correspondence between typical clinical pathways in catheter ablation surgery can be grasped.

Keywords: Clinical pathways · Catheter ablation · Electronic medical records · Sequential pattern mining · T-PrefixSpan · Visualization

1 Introduction

Medical workers including doctors, nurses, and technicians currently use clinical pathways. A clinical pathway is a guideline for a typical sequence of medical orders for a disease, traditionally generated by the medical workers themselves based on their medical experience. Human verification and modification of clinical pathways are time-consuming for the workers. It would be helpful if medical workers could verify the correctness of the existing clinical pathways, or modify them, by comparing them with the frequent sequential patterns of medical orders extracted from Electronic Medical Record (EMR) logs [1].

In our previous work, Le et al. [2] proposed to speed up clinical pathway generation and deployed an occurrence check that adds only closed sequential patterns to the results during mining while considering time intervals between events. Experiments on real data sets showed that the proposed system can be more than 13 times faster than our earlier method and can significantly improve the decision-making process for medical actions at large hospitals.


Honda et al. [3] proposed a method for detecting variants in clinical pathways with treatment-time information from EMR logs, combining typical clinical pathways to find the differences between the similar pathways of TUR-Bt (transurethral resection of bladder tumor) and ESD (endoscopic submucosal dissection). In this study, common-part detection is performed on the typical sequences obtained by T-PrefixSpan from the medical-order histories, so that a variant, which is a deformation pattern of the medical orders, is represented as a variant pattern. The experiments with the proposed method showed that the correspondence between typical clinical pathways in catheter ablation can be grasped.

2 Variant in Clinical Pathways

A clinical pathway, also known as a care pathway, integrated care pathway, critical pathway, or care map, is one of the main tools used to manage quality in healthcare concerning the standardization of care processes. It has been shown that their implementation reduces variability in clinical practice and improves outcomes. Clinical pathways aim to promote organized and efficient patient care based on evidence-based medicine, and aim to optimize outcomes in settings such as acute care and home care. A single clinical pathway may refer to multiple clinical guidelines on several topics in a well-specified context.

A variance is a process that differs from the process expected in the clinical pathway, i.e., an achievement target (outcome) that has not been achieved. For example, when an examination that was not in the clinical path is added, or when the discharge date is extended due to the circumstances of the patient, a patient variance occurs. By collecting and analyzing variances and improving the path, care can be provided that suits each patient, which leads to the evaluation and improvement of medical care.

3 Catheter Ablation

Catheter ablation is a procedure that uses energy to make small scars in heart tissue to prevent abnormal electrical signals from moving through the heart. Radiofrequency (RF) ablation uses high-energy, locally delivered RF signals to make the scars. Cryoablation uses extremely cold temperatures to make the scars. Sometimes, laser light energy is used. Catheter ablation is used to treat certain types of arrhythmias, or irregular heartbeats, that cannot be controlled by medicine, or for patients at high risk of ventricular fibrillation (v-fib), sudden cardiac arrest, or atrial fibrillation. Cardiologists, doctors who specialize in the heart, perform catheter ablation in a hospital.


4 Method

4.1 Extracting Typical Clinical Pathways

First, we extract typical clinical pathways from actual EMRs by sequential pattern mining. As a side note, these clinical pathways do not have variants. In medical care, it is important to consider time intervals between medical treatments; we therefore employ T-PrefixSpan [4], which considers the time intervals between items.

As in our previous study, we define medical treatments to have the four values Class, Description, Code, and Name: (1) Class denotes the classification of a medical treatment, (2) Description denotes its detailed diagnostic record, (3) Code is the medical code that represents the unique efficacy of the medicine considered, and (4) Name is the name of the medicine. For treatments without medicine, Code and Name are set to "null" to represent a blank value. For example, assume a medical treatment designated as "injection" is described as an "intravenous injection" with medical code "331" and the name "Lactec injection" appears in a medical log. The item is then represented in the form (injection; intravenous injection; 331; Lactec injection); in this example, Code 331 indicates "blood substitute." In another example, when the medical treatment is the "nursing task" of "changing the sheets," the item is represented in the form (nursing task; changing the sheets; null; null). Medical treatments thus have four values, but we execute mining by focusing on Name; that is, we do not use the efficacy of medicines (Code). Efficacy is used to detect variants, as explained in the next section.

The following are the important definitions introduced for T-PrefixSpan, which we employ in this work.

Definition 1. T-item (i, t). Let I be a set of items and let t be the time when an item i occurred. We define a T-item (i, t) as a pair of i and t.

Definition 2. T-sequence s and O-sequence Os. A T-sequence s is a sequence of T-items, denoted by s = ⟨(i1, t1), (i2, t2), …, (in, tn)⟩. T-items that occur at the same time are arranged in alphabetical order. Let n be the length of T-sequence s, and let the O-sequence of s be the sequence Os = ⟨i1, i2, …, in⟩.

Definition 3. Time interval TIk. Given a T-sequence s = ⟨(i1, t1), …, (in, tn)⟩, the time interval TIk is defined as TIk ≡ tk+1 − tk (k = 1, 2, …, n − 1).


Definition 4. T-sequential database D and O-sequential database OD. Given a set of T-sequences S, the T-sequential database D is defined as D ≡ {(sid, s) | s ∈ S}, where the identifier sid of an element of D has a unique value for each sequence. Let the O-sequential database OD be the sequential database that consists of the O-sequences configured from all the T-sequences in D, and let Size(OD) be the number of sequences in OD.

Definition 5. T-frequent sequential pattern P. Let MinSup (0 ≤ MinSup ≤ 1) be a minimum support and let D be a T-sequential database. Given P = ⟨i1, X1, i2, X2, …, Xn−1, in⟩ (∀j, ij is an item; ∀k, Xk is a set of five values (mink, modk, avek, medk, maxk)), we can configure a sequence OP = ⟨i1, i2, …, in⟩. We define P as a T-frequent sequential pattern if OP is a frequent sequential pattern in the O-sequential database configured from D, i.e., Sup(P) = |{Seq | OP ⊑ Seq, (sid, Seq) ∈ OD, where sid is an identifier of Seq}| ≥ Size(OD) × MinSup. Let OP be the O-pattern of P.

The set of five values is defined as follows. Given all the T-sequences whose O-sequences contain OP in D, let S′ = ⟨(i′1, t′1), (i′2, t′2), …, (i′m, t′m)⟩ be one of them. By using j1, j2, …, jn, which satisfy

(1) 1 ≤ j1 < j2 < … < jn−1 < jn ≤ m, and
(2) ik = i′jk (k = 1, 2, …, n),

we can configure sets of time intervals SetTI1, SetTI2, …, SetTIn−1, where TIk = t′jk+1 − t′jk. Moreover, in Xk = (mink, modk, avek, medk, maxk), we define the five values as follows:

(1) mink = min SetTIk
(2) modk = the most frequent value in SetTIk
(3) avek = the average of the values in SetTIk
(4) medk = the median of the values in SetTIk
(5) maxk = max SetTIk

Given a time interval Xj = (minj, modj, avej, medj, maxj) (1 ≤ j < n), if minj = maxj holds, then the time interval between item ij and item ij+1 is consistent; in particular, if minj = maxj = 0 holds, then these two items occurred at the same time.

Definition 6. T-closed frequent sequential pattern A. Given a T-sequential database D, let R be the set of T-frequent sequential patterns extracted from D and let A be a T-frequent sequential pattern of R. A is a T-closed frequent sequential pattern if no Z satisfying the following exists in R \ A:

(1) If we let A′ and Z′ be the O-patterns of A and Z, respectively, then A′ ⊑ Z′.
(2) Sup(A) = Sup(Z), where we define the support of a T-frequent sequential pattern as Sup(A) ≡ |{s | s ⊑ S, (sid, S) ∈ D, where sid is the identifier of S in D}|.


(3) If we let A and Z be ⟨a1, T1, a2, T2, …, an−1, Tn−1, an⟩ and ⟨z1, T′1, z2, T′2, …, zm−1, T′m−1, zm⟩, respectively, then j1, j2, …, jn exist and satisfy (1) 1 ≤ j1 < j2 < … < jn ≤ m and (2) ak = zjk. Thus, for all Tk = (mink, modk, avek, medk, maxk) and T′jk = (min′jk, mod′jk, ave′jk, med′jk, max′jk), the inequalities (1) mink ≥ min′jk and (2) maxk ≤ max′jk hold.

T-PrefixSpan outputs the set of T-frequent sequential patterns P, with a T-sequential database D and minimum support MinSup as input. More detailed definitions are given in Uragaki [5].
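As an illustration of the five-value time-interval statistics in Definition 5, the sketch below computes (min, mod, ave, med, max) for one set of observed intervals; the function name and the sample intervals are our own choices, not part of T-PrefixSpan itself.

from statistics import mean, median, mode

def interval_stats(set_ti):
    """Five-value summary X_k = (min_k, mod_k, ave_k, med_k, max_k)
    for one set of observed time intervals SetTI_k (Definition 5)."""
    return (min(set_ti), mode(set_ti), mean(set_ti),
            median(set_ti), max(set_ti))

# Hypothetical intervals (in days) between two medical orders observed
# across the T-sequences that support a pattern.
set_ti = [1, 1, 2, 1, 3]
print(interval_stats(set_ti))  # (1, 1, 1.6, 1, 3)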

4.2 Grouping the Clinical Pathways

We have explained how to extract typical pathways without variants from EMR logs. In this section, we explain how to group the clinical pathways so as to detect more practical variants from combinations of similar pathways. For example, two medical treatments in a variant may have the same Class, Description and Code but not the same Name. To achieve this, we group the pathways so that, in each group, the number of medical treatments is equal for each relative treatment day. The reference date of the relative treatment day is the day on which the main medical treatment is performed; for example, the relative treatment day of a treatment done the day before the main medical treatment is "–1", and of one done the day after it is "1". This date is determined from the time intervals. The method in [5] defines time intervals with five values (minimum, most frequent, average, median, and maximum), but we use only the most frequent value to calculate relative days, because typical sequential patterns are the frequent patterns of the EMR logs, as illustrated in the sketch below.
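The following sketch illustrates the relative-day calculation: relative treatment days are assigned from the most frequent (mod) time intervals, with day 0 at the main medical treatment. The interval values and the main-treatment index here are hypothetical.

# Hedged sketch: assigning relative treatment days from the mode (most
# frequent) time intervals of a pattern; day 0 is the main treatment.

modes = [1, 1, 2, 1]   # mod_k between consecutive items (days), toy values
main_index = 2         # index of the main medical treatment (assumption)

days = [0]
for m in modes:
    days.append(days[-1] + m)                  # absolute day of each item
relative_days = [d - days[main_index] for d in days]
print(relative_days)   # [-2, -1, 0, 2, 3]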

4.3 Detecting Clinical Pathways Containing Variants

We have explained grouping the pathways. In this section, we explain how to detect clinical pathways containing variants, while keeping the time information, from the typical pathways without variants in each group. First, we define the concepts required to introduce our method before explaining the algorithm.

Definition 7. T-block B. A T-block B is a set of T-items with the same time of occurrence, denoted by B = {(i1, t1), (i2, t2), …, (in, tn) | t1 = t2 = … = tn}. Furthermore, let n be the number of elements of T-block B, and let tB be the time at which these items occur, that is, t1 = t2 = … = tn = tB.


Definition 8. Variant pattern V. A variant pattern V is a sequence of T-blocks, denoted by V = ⟨B1, B2, …, Bn⟩. T-blocks that occur at the same time are arranged in alphabetical order. Let n be the length of variant pattern V. When the number of elements of every block is 1, the variant pattern is equal to a T-sequence.

We developed a method for detecting the pathway with variants from the patterns without variants; the algorithm is described in Algorithm 1. A variant pattern is detected by finding the difference between the clinical pathways in each group.

Algorithm 1 Detecting clinical pathways containing variants
Input: P′: the set of T-closed frequent sequential patterns
Output: V: the set of variant patterns
 1: v ← {s | s ∈ S}
 2: for p ∈ P′ \ v do
 3:   k = 1, j = 1
 4:   while k < length(v) and j < length(p) do
 5:     Bv = the k-th T-block of v
 6:     Bp = the j-th T-block of p
 7:     if tBv == tBp then
 8:       for {i | i ∈ Bp} do
 9:         if i does not match any element of Bv then
10:           Add (i, tBp) to Bv
11:         end if
12:       end for
13:       k = k + 1, j = j + 1
14:     else if tBv > tBp then
15:       Insert Bp just before Bv
16:       k = k + 1, j = j + 1
17:     else
18:       k = k + 1
19:     end if
20:   end while
21:   while j < length(p) do
22:     Add Bj, Bj+1, …, Blength(p) to the end of v
23:   end while
24: end for
25: V ← v
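A runnable Python transcription of Algorithm 1 may make the merge clearer. Here a pattern is modeled as a list of (time, items) pairs, a representation we chose for this sketch rather than one given in the paper.

# Hedged Python sketch of Algorithm 1: merge T-closed frequent sequential
# patterns into one variant pattern. Each T-block is (time, items).

def merge_patterns(patterns):
    variant = [[t, set(items)] for t, items in patterns[0]]
    for p in patterns[1:]:
        k, j = 0, 0
        while k < len(variant) and j < len(p):
            t_v, items_v = variant[k]
            t_p, items_p = p[j]
            if t_v == t_p:
                items_v |= set(items_p)     # add items missing from Bv
                k += 1; j += 1
            elif t_v > t_p:
                variant.insert(k, [t_p, set(items_p)])  # Bp before Bv
                k += 1; j += 1
            else:
                k += 1
        # append remaining blocks of p to the end of the variant
        variant.extend([t, set(items)] for t, items in p[j:])
    return variant

# Two similar pathways: they agree on days 0 and 2 but differ on day 1.
p1 = [(0, ["admission"]), (1, ["injection A"]), (2, ["discharge"])]
p2 = [(0, ["admission"]), (1, ["injection B"]), (2, ["discharge"])]
print(merge_patterns([p1, p2]))
# [[0, {'admission'}], [1, {'injection A', 'injection B'}], [2, {'discharge'}]]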

4.4 Representing Variant Patterns

When the order of items cannot be uniquely determined, the variant cannot be expressed as a simple sequence. As a solution, we define the representation of variants by a nested structure, with reference to the graphical notation in [5].


Definition 9. Nested branched sequence. The items at the odd levels of a list indicate a sequential pattern, while the items at the even levels indicate a parallel pattern: [L1,1, L1,2, …, L1,n] is a sequential pattern, L1,i = [L2,i,1, L2,i,2, …, L2,i,m] is a parallel pattern, L2,i,j = [L3,i,j,1, L3,i,j,2, …, L3,i,j,k] is a sequential pattern, and so on. For example, when seven T-items (a, b, c, d, e, f, g) are written as v = [[a], [b], [c], [d, e], [f], [g]], the variant pattern is shown in Fig. 1.
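The nesting rule of Definition 9 can be illustrated with plain Python lists; the walk function below, which prints the pattern type at each level, is our own illustration and not part of the proposed method.

# Illustration of Definition 9 with nested Python lists: odd depths are
# sequential patterns, even depths are parallel patterns.

v = [["a"], ["b"], ["c"], ["d", "e"], ["f"], ["g"]]  # the Fig. 1 example

def walk(node, depth=1):
    if isinstance(node, list):
        kind = "sequential" if depth % 2 == 1 else "parallel"
        print("  " * (depth - 1) + f"{kind}: {node}")
        for child in node:
            walk(child, depth + 1)

walk(v)  # the block [d, e] at even depth is a parallel branch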

Fig. 1. Example of variant representation.

4.5 Visualizing Variant Patterns

The problem with the representation in the previous section is that, when the structure becomes complicated, the information can no longer be understood intuitively, and the amount of information each person can read differs. To provide the same information to everyone, visualizing the data in an interactive graphical interface is important.

4.6 Deriving the Set of Nodes and Edges from the Variant Pattern

For visualization, we must create sets of nodes and edges from the variant patterns. The algorithm is described in Algorithm 2.

Algorithm 2 Derivation of the set of nodes and edges
Input: Variant pattern v
Output: the set of nodes of v (V) and the set of edges of v (E)
 1: k = 1
 2: count = 1
 3: for k < length(v) do
 4:   startBk = count
 5:   for {i | i ∈ Bk} do
 6:     icount = i
 7:     Add icount to V
 8:     for k ≥ 2 and startBk−1 ≤ j ≤ endBk−1 do
 9:       Add (ij, icount) to E
10:     end for
11:     count = count + 1
12:   end for
13:   endBk = count − 1
14:   k = k + 1
15: end for


The set of nodes contains all the T-items in the variant pattern, and the set of edges contains all edges from the T-items belonging to Bk−1 to the T-items belonging to Bk. In the visualization, the graph is created from these sets of nodes and edges.
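The sketch below is a hedged Python transcription of Algorithm 2: it derives node and edge lists from a variant pattern, connecting every item of one T-block to every item of the next. The node-ID scheme is our own choice.

# Hedged Python sketch of Algorithm 2: derive node and edge sets from a
# variant pattern (list of T-blocks). Node IDs follow insertion order.

def nodes_and_edges(variant):
    nodes, edges = [], []
    prev_ids = []                       # node ids of the previous T-block
    for t, items in variant:
        cur_ids = []
        for item in sorted(items):
            node_id = len(nodes)
            nodes.append((node_id, item, t))
            # connect every node of B_{k-1} to every node of B_k
            edges.extend((p, node_id) for p in prev_ids)
            cur_ids.append(node_id)
        prev_ids = cur_ids
    return nodes, edges

variant = [(0, {"admission"}), (1, {"injection A", "injection B"}),
           (2, {"discharge"})]
nodes, edges = nodes_and_edges(variant)
print(nodes)   # [(0, 'admission', 0), (1, 'injection A', 1), ...]
print(edges)   # [(0, 1), (0, 2), (1, 3), (2, 3)]

The resulting lists could then be serialized, e.g., to JSON, for an interactive graphical tool such as the D3.js used in Sect. 5.3.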

5 Experiment

5.1 Experimental Data

We used target medical treatment data based on clinical pathways recorded from November 19, 1991, to October 4, 2015, in the EMRs at the Faculty of Medicine, University of Miyazaki Hospital. These medical data were acquired with the EMR system WATATUMI [6] employed by the hospital; the total data size of the EMR system is 49 GB. For personal-information protection, the data we used did not include information that could identify a patient uniquely: when extracting the medical treatment data, we used anonymous patient IDs that are impossible to associate with real people. The data we extracted from the EMRs were described previously in [7], and they can be accessed via the website of the University of Miyazaki and the Research Ethics Review Committee of Tokyo Institute of Technology.

Our target data consisted of medical treatments based on one clinical pathway included in the EMRs: catheter ablation. We chose this pathway because the clinical pathways of catheter ablation are relatively fixed. In the experiment, we confirm that a variant pattern v can be constructed for the catheter ablation case. The patient and treatment statistics for the catheter ablation data set are shown in Table 1. The reason for limiting the hospitalization period is to exclude exceptional pathways; for example, there are pathways in which the treatment should end in several days but the hospitalization period exceeded one year.

Table 1. Target dataset.

Dataset: Catheter ablation
Number of patients: 21
Min days of hospitalization period: 7
Max days of hospitalization period: 7
Average treatments per patient: 70.10
Maximum treatments per patient: 121
Minimum treatments per patient: 38


5.2 Results and Discussion

Confirming variant patterns: the numbers of typical clinical pathways extracted by T-PrefixSpan are shown in Table 2.

Table 2. Numbers of typical clinical pathways by T-PrefixSpan.

Dataset: Catheter ablation
Threshold: 0.4
Number of typical pathways: 12636
Average treatments per pathway: 7.48
Maximum treatments per pathway: 12
Minimum treatments per pathway: 3

In the results for catheter ablation, there are quite a few cases where a variant pattern is formed from many pathways. The reason is that many small differences in medical treatments between pathways were detected and there are few similar pathways, because the clinical pathway of catheter ablation is fixed. A part of the clinical pathways of catheter ablation at threshold 0.4 for T-PrefixSpan is shown in Figs. 2a, 2b, 2c and 2d, together with the derived variant patterns. The original character outputs are not colored, but for readability we give the same color to the medical treatments of the same T-block in Figs. 2a, 2b, 2c and 2d.

5.3 Visualization

It is difficult for medical workers to grasp the variant patterns from character strings. Therefore, we visualize the variant patterns using the interactive graphical output tool D3.js [8]. We use circles with arrows between them to make the entire clinical flow easier to grasp, while detailed information is given interactively by user clicks. Above each circle node, Type and Treatment day are always displayed, because they are necessary for grasping the entire pathway, and nodes of the same Type of medical treatment have the same color. When a circle node is clicked, the detailed information, i.e., Explain, Code, and Name, is displayed in a square node; this square node toggles between hide and show so the user can focus on important portions of the pathway.

In Fig. 2a, it is necessary to consider carefully the difference between the internal medicine, the Labona tablet, and the intravenous injection, and furthermore the differences in how the medicines are given and the meanings of their ingredients and purposes. Similarly, contemplation is required for each of the cases visualized in Figs. 2b, 2c and 2d. By contemplating the visualized output diagrams, it is possible to discover combinations of new medical orders. This was easier to understand and handle than the conventional string output.


Fig. 2a. Visualization example.

Fig. 2b. Visualization example.

Fig. 2c. Visualization example.


Fig. 2d. Visualization example.

As a result, it is possible to grasp both the detailed information of each medical order and the detailed differences between medical orders: the entire flow can be grasped by following nodes and arrows, and the information of each medical-order unit can be grasped through its supplemental information. Furthermore, corresponding medical orders can be compared easily, and new medical knowledge may be obtained from the differences between visualization diagrams.

6 Conclusion

In this study, we described a method to represent a variant, which is a deformation pattern of the medical orders for catheter ablation, as a variant pattern. The experiments with the proposed method showed that the correspondence between typical clinical paths in catheter ablation surgery can be grasped. However, confirmation of the visualization effect of clinical-path variants from a medical point of view remains to be done; at present, the visualization method is at the stage of basic functionality.

7 Future Work

At present, we have implemented the visualization method and confirmed that it functions. As future development of this research, we will discuss how to utilize the merits obtained by visualizing variants and how the visualizations should be used. Furthermore, since only catheter ablation was examined this time, the same experiment needs to be conducted for other clinical paths. It is then necessary to evaluate the proposed method by ascertaining to what extent the outputs, including information on clinical-path variants, are beneficial to medical staff.

Acknowledgment. In this research, the use of the electronic medical record data of University of Miyazaki Hospital for medical order support is described on University of Miyazaki Hospital's web page [7].


This research has been approved by the Ethics Review Committee of University of Miyazaki Hospital and Tokyo Institute of Technology. We appreciate the cooperation of everyone involved.

References

1. Kushima, M., Honda, Y., Le, H.H., Yamazaki, T., Araki, K., Yokota, H.: Visualization and analysis of variants in catheter ablation's clinical pathways from electronic medical record logs. In: Lecture Notes in Engineering and Computer Science: Proceedings of the International MultiConference of Engineers and Computer Scientists, 14–16 March 2018, Hong Kong, pp. 271–275 (2018)
2. Le, H.H., Edman, H., Honda, Y., Kushima, M., Yamazaki, T., Araki, K., Yokota, H.: Fast generation of clinical pathways including time intervals in sequential pattern mining on electronic medical record systems. In: Proceedings of the 4th International Conference on Computational Science and Computational Intelligence (CSCI 2017), December 2017
3. Honda, Y., Kushima, M., Yamazaki, T., Araki, K., Yokota, H.: Detection and visualization of variants in typical medical treatment sequences. In: Proceedings of the 3rd International Workshop on Data Management and Analytics for Medicine and Healthcare (DMAH 2017), in conjunction with the 43rd International Conference on Very Large Data Bases (VLDB 2017), pp. 88–101, September 2017
4. Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U., Hsu, M.: PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth. In: Proceedings of the 2001 International Conference on Data Engineering, pp. 215–224 (2001)
5. Achar, A., Laxman, S., Raajay, V., Sastry, P.S.: Discovering general partial orders from event streams. Technical report. arXiv:0902.1227v2 [cs.AI]. http://arxiv.org
6. Denshi Karte System WATATUMI (EMR system "WATATUMI"). http://www.corecreate.com/0201izanami.html
7. Miyazaki Daigaku Igaku Bu Fuzoku Byouin Iryo Jyoho Bu (Medical Informatics Division, Faculty of Medicine, University of Miyazaki Hospital). http://www.med.miyazaki-u.ac.jp/home/jyoho/
8. D3.js. https://d3js.org

Important Index of Words for Dynamic Abstracts Based on Surveying Reading Behavior

Haruna Mori1, Ryosuke Yamanishi2(B), and Yoko Nishihara2

1 Graduate School of Information Science and Engineering, Ritsumeikan University, 1-1-1 Noji-higashi, Kusatsu, Shiga 525-8577, Japan
[email protected]
2 College of Information Science and Engineering, Ritsumeikan University, 1-1-1 Noji-higashi, Kusatsu, Shiga 525-8577, Japan
[email protected], [email protected]

Abstract. With the wide spread of digital devices such as smartphones, digital books have become more popular. This research investigated the abstract required before resuming reading. The survey suggested that words that have a climax just before the bookmark are important for the abstract. Based on the results of the survey, we propose an elemental method to generate dynamic abstracts for each reading progress. The proposed method focuses on the local variation of word importance, whereas some existing criteria for summarization focus on overall word importance. We prepared four types of local variation and compared their effectiveness with each other. An experiment to detect words accepted to manually-generated dynamic abstracts was conducted with each type of the proposed method, using a general word-importance criterion (tf-idf) as the comparative method. Through the discussion of the results, it was confirmed that some types of the proposed method were more effective for detecting the words accepted to dynamic abstracts than the comparative method.

Keywords: Dynamic abstract generation · E-books · Features of abstract · Index of word importance · Local variation · Natural language processing · Reading behaviors

1 Introduction

Many people enjoy reading novels on their own smart devices, i.e., e-books. We believe that the advantages of e-books should be provided by applying information-processing technologies, though current e-books are simply digitalized books of paper media. For non-series contents such as long story novels, the reading pace differs for each reader, so a different abstract is required for each reader. As one of the advantages of e-books, our research


is focusing on the generation of dynamic abstracts to be read just before resuming a novel. An abstract is constituted by extracting information related to specific words and summarizing it, so generating the abstract of a novel can be regarded as a kind of general Query-Focused Summarization (QFS). On the other hand, the abstract changes depending on the reading progress, so it can also be regarded as a type of Update Summarization (US) [9]. As research concerning the generation of abstracts, Bamman et al. related the sentences in an abstract to the sentences in the novel [1]. However, what kind of information is required for such abstracts is not clear. We survey the features of manually-generated abstracts, focusing on the positions of the sentences selected for the dynamic abstract corresponding to each reading progress. We also discuss the relationships between the abstracts and reading activity, such as the amount of text and time in each reading. Based on the survey, we propose an elemental method to generate dynamic abstracts for each reading progress. We prepared four types of local variation and compared their effectiveness with each other.

2 Survey of Reading Behavior

At first, we have to figure out what can serve as a dynamic abstract for smoothly resuming reading. Experiments to create manually-generated abstracts were conducted, and based on the analysis of the features of these abstracts, we discuss the relationships between dynamic abstracts and reading behavior.

2.1 The Experiments to Create Manually-Generated Abstract

Novels with over 20,000 characters were used in the experiment. It is considered that readers may interrupt their reading many times when they read a novel with many words, and then an abstract is expected to be

Table 1. Metadata of novels used in the experiment and the number of participants in the experiment for each novel. The amount columns (Characters, Words, Sentences, Paragraphs) are counted in Japanese.

Novel ID | Author | Title | Characters | Words | Sentences | Paragraphs | # of participants
N1 | Natsume, Soseki | Kokoro (the first part) | 49,103 | 32,415 | 1,779 | 282 | 7
N2 | Miyazawa, Kenji | Night on the Galactic Railroad (Shincho Bunko ver.) | 38,212 | 22,963 | 1,100 | 158 | 8
N3 | Akutagawa, Ryunosuke | Hell Screen | 26,103 | 18,313 | 480 | 98 | 7


Fig. 1. The concept of how to create abstract in the experiment.

useful to remember the story of the novel. It also seems that a novel with many words has many characters, so organizing information with an abstract should be all the more important. Text data of "Kokoro," "Night on the Galactic Railroad (Shincho Bunko ver.)," and "Hell Screen" were used in the experiment; all three novels are publicly available on Aozora Bunko1. Novel IDs and metadata are shown in Table 1; the data were calculated from each novel excluding metadata such as title, author, footnotes, copy-text information, and headlines. In the case of N1, the first part was extracted and used so that the number of characters would be almost the same as in the other novels. These three novels were ranked second, 11th, and 58th, respectively, in the access ranking of Aozora Bunko2, and their authors are the top three in total accesses to their novels; these should thus be common and popular novels.

The experiment to create abstracts was conducted with 22 participants in their late teens to twenties who did not know the content of the assigned novel. The number of participants for each novel is shown in Table 1. The participants were instructed not to obtain information that could become a spoiler, such as an outline of the novel. The experiment was conducted by repeating the following three procedures:

1. Participants read the novel.
2. Participants answer questions to verify whether they understand the contents of the novel.
3. Participants create the abstract.

The number of sentences selected for an abstract was determined from 8.17647, the average number of sentences in the abstracts of young-adult novels with different titles and authors. The following information was given to the participants.

1 http://www.aozora.gr.jp
2 The ranking of the XHTML version in the whole year of 2016; there are 13,969 novels in total, as of 2016.12.31.


• An abstract means the information that is required by the reader when resuming reading.
• The abstract created by the participant should be read just before reading the text for this time.
• The abstract is created by extracting sentences from the text between the beginning of the novel and the bookmark of the last reading.

Figure 1 shows the range from which sentences for the abstract are selected. When a participant has read the novel up to the position where he/she interrupted reading this time (the position of the current bookmark), the sentences for the abstract should be selected from the beginning of the novel to the position of the previous bookmark. There is no previous bookmark after the first reading, so no abstract is created at that time only. Here, we also focus on the section p as a target of the analysis; that section can contain many spoiler words, and several studies [2–4, 8] have reported the effectiveness of spoilers as important information providing the pleasure of reading novels.

All 22 participants showed a correct answer rate of 80% or more on the questions asking about the contents of each novel. According to this result, it was considered that all participants understood the contents of each novel, and the abstracts created by all 22 participants were assumed to be reasonable reference information.

2.2 Analytics of Manually-Generated Abstract

The relationships between the features of the abstracts created by the participants and the reading behaviors of each participant are analyzed and discussed below. The results are discussed from five viewpoints: the trends in the positions of bookmarks, the positions from which sentences for the abstract were extracted, the relationships between the reading interval and the abstract, the semantic information in the abstract, and the relationships between the appearance rate of each word and the abstract.

2.2.1 Positions of Bookmark
Table 2 shows the positions at which the participants interrupted their reading and inserted the bookmark; duplication was not allowed. Where the bookmark was inserted was checked in the order: boundary of sections,

Table 2. Positions where the bookmark is inserted by the participants for each novel.

Novel ID | Boundary of sections | Blank line | Between paragraphs | Between sentences | Total number of bookmarks
N1 | 20 | 0 | 0 | 0 | 20
N2 | 14 | 1 | 5 | 4 | 24
N3 | 13 | 0 | 1 | 0 | 14


Table 3. Positions from which sentences for abstract were extracted.

Novel ID | Only from the part just before the bookmark | From the beginning part of the novel and the part just before the bookmark | Exhaustively from all of the read parts
N1 | S2, S3, S4, S6, S7 | S1 | S5
N2 | S9, S10, S11, S12, S14, S15 | S13 | S8
N3 | S17, S18, S19, S20 | | S16, S21, S22

Table 4. Selection probability of the sentence just before the bookmark for the abstract, for each participant.

Novel ID | pi = 0.00 | 0.00 < pi ≤ 0.50 | 0.50 < pi ≤ 1.00
N1 | (S1: 0.00), (S3: 0.00), (S4: 0.00), (S5: 0.00), (S6: 0.00) | (S7: 0.50) | (S2: 1.00)
N2 | (S11: 0.00), (S12: 0.00), (S15: 0.00) | (S10: 0.50), (S13: 0.33) | (S8: 0.75), (S9: 0.60), (S14: 0.67)
N3 | (S16: 0.00), (S17: 0.00), (S18: 0.00), (S19: 0.00), (S20: 0.00) | (S21: 0.50) | (S22: 1.00)

lines, between paragraphs, and between sentences. When the corresponding position was confirmed, the checking was finished. More than 50% of the bookmarks was inserted at the positions where the boundary of sections in every three novels. It was considered that 83% of the bookmarks was inserted between paragraphs, because the boundary of sections and the blank line were included in between paragraphs. In generic texts or documents, a paragraph is a group of sentences concerning the same one topic. The point that the reader interrupts reading is the boundary of the story; a boundary of paragraphs is a boundary of the story. 2.2.2 Positions from Which Sentences for Abstract Were Extracted Regarding positions of sentences selected for the abstracts, there were three patterns of the trend: the sentences selected from only the part just before the bookmark, from the beginning part of the novel and the part just before the bookmark, and exhaustively from all of the read parts. The beginning part of the novel means the part that each participant read first. Table 3 shows these three patterns and trend of selecting sentences for an abstract by each participant. In Table 3, the participants are represented by using participant ID: Si . And, S1 to


Table 5. The results of the t-test.

Novel | Cq,q−1 | # of data | Average | Standard deviation | t-value
N1 | Cq,q−1 > 1 | 13 | 14.77 | 5.59 | 0.0015
N1 | Cq,q−1 < 1 | 13 | 7.54 | 2.27 |
N2 | Cq,q−1 > 1 | 16 | 20.50 | 5.47 | 0.0000
N2 | Cq,q−1 < 1 | 16 | 11.25 | 3.63 |
N3 | Cq,q−1 > 1 | 7 | 36.57 | 7.31 | 0.0001
N3 | Cq,q−1 < 1 | 7 | 14.00 | 2.56 |

S7, S8 to S15, and S16 to S22 are the participants of the experiments with N1, N2, and N3, respectively.

All of the participants required information included in the part just before the bookmark for the abstract; it is considered that the contents just before interrupting the reading are required for the abstract. Though most participants considered it enough to have the information of the read part just before the bookmark, some participants required information from the beginning of the novel or from other read parts. From these results, it was suggested that it is necessary to prepare an abstract appropriate to each participant. It is also inferred that the influence of the differences between novels is small in the patterns of positions from which sentences are selected.

There is a possibility that the sentence just before the bookmark contains information strongly related to the contents of the story after the bookmark, and it seemed that this sentence could be important for reminding readers of the contents just before interrupting reading. We investigated the probability pi that the sentence just before the bookmark was selected for the abstract by each participant Si. Equation (1) calculates pi:

pi = Abstlast / N,   (1)

where Abstlast and N are the number of abstracts including the sentence just before the bookmark and the number of abstracts created by each participant, respectively. In Table 4, the results are shown as (Si: pi), the participant ID Si and the value of pi. The majority of participants did not select the sentence just before the bookmark for the abstract, so that sentence should not be important for the abstract. However, some participants selected it, and two participants selected it every time. It was suggested that whether the sentence just before the bookmark is important or not depends on each participant.


2.2.3 Relationships Between Appearance Rate of Each Word and Abstract
Basically, it is known that the more a word appears, the more important the word should be. If the information just before the bookmark is important, the appearance of words in the read part just before the bookmark might be relatively higher than after the bookmark. On the other hand, if information related to the contents of the next reading is required for the abstract, words that appear more often in the next reading part should be included in the abstract. As mentioned in Sect. 1, we have already suggested a method based on this idea [6]. To verify the idea, we analyzed the relationships between the appearance rate of each word and the abstract. The appearance rate of a word xi selected by each participant for the abstract is calculated by Eq. (2):

ε(xi) = f(xi) / Σ_{i=0}^{M} f(xi),   (2)

where word x is only a noun, verb, or unknown word whose total number of appearances in the novel is five or more, and M is the total number of words x. Words that have no meaning by themselves, such as suffixes, were excluded. Let q be the current part that the reader starts to read; then q − 1 means the part that the reader read last time. The increasing rates of word x for parts q and q − 1 are calculated by Eqs. (3) and (4):

R_q(xi) = ε_q(xi) / ε_{q−1}(xi),   (3)

R_{q−1}(xi) = ε_{q−1}(xi) / ε_{q−2}(xi).   (4)

Then, Eq. (5) calculates the contrast between R_q(xi) and R_{q−1}(xi):

C_{q,q−1} = R_{q−1}(xi) / R_q(xi).   (5)

We compared the words for which Cq,q−1 is higher than 1.0 with those for which it is lower, using a one-sided t-test at the 5% significance level (the significance threshold becomes 0.0167 after Bonferroni correction). Table 5 shows the result. The t-values for each novel were N1: 0.0015, N2: 0.0000, and N3: 0.0001. All of these values are smaller than the 5% significance threshold; moreover, N2 and N3 meet the 1% level. From these results, it was suggested that the abstract should include the words whose appearance rate in q − 1 is higher than that in q − 2 and q. It thus seemed that words with a climax in the part just before the bookmark are important for the abstract.
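The following sketch shows how Eqs. (2)–(5) could be computed; the word counts are invented toy values, not data from the survey.

# Hedged sketch of Eqs. (2)-(5): appearance rate per section, increasing
# rates, and the contrast C_{q,q-1}. Counts below are toy values.

def appearance_rate(counts):
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# Hypothetical frequency of the word "captain" in three consecutive parts.
sections = {
    "q-2": {"captain": 2, "ship": 10},
    "q-1": {"captain": 8, "ship": 10},   # climax just before the bookmark
    "q":   {"captain": 3, "ship": 10},
}
eps = {s: appearance_rate(c) for s, c in sections.items()}

w = "captain"
R_qm1 = eps["q-1"][w] / eps["q-2"][w]   # Eq. (4): rise into part q-1
R_q = eps["q"][w] / eps["q-1"][w]       # Eq. (3): change into part q
C = R_qm1 / R_q                         # Eq. (5)
print(f"C_q,q-1 = {C:.2f}")             # > 1 suggests a climax on q-1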


Table 6. Concept of each index. We propose and compare four types of indices focusing on the variation of word frequency from section p − 2 through p; the previous index focused on the variation of word frequency from section p − 1 to p.

 | Variation of word frequency from p − 2 to p − 1 | Variation of word frequency from p − 1 to p
Proposed methods: Index IN | Increase | –
Proposed methods: Index II | Increase | Increase
Proposed methods: Index IF | Increase | Flat
Proposed methods: Index ID | Increase | Decrease
Former method: Index NI | – | Increase

3 Proposed Methods

From the results described in Sect. 2.2.3, it is considered that focusing on the local variation of word frequency is important for dynamic abstracts. Based on the above survey, we propose numerical statistics that weight words by the local variation of word frequency as word importance for the dynamic abstract. We define as section p the sentences that the reader is estimated to read this time, based on the average amount of each reading progress; the section of the previous reading is then defined as section p − 1.

Motivations

In our previous study [6], we focused on the increase of word frequency from section p − 1 to p. Because, we considered that the information would become more important in section p should be provided to the reader as a dynamic abstract. Reflecting such information before resuming the reading, the reader should smoothly start to read the subsequent story. Based on this concept, we have proposed an index to weight words based on the variation during section p − 1 to p: index N I. Through the analysis of reading behavior and manually-generated dynamic abstract [7], we have found that the important information in section p − 1 also should be notable. We propose an index based on the variation of word frequency from section p − 2 to p − 1: index IN . That is to say, we believe that the word which frequency increases from section p − 2 to p − 1 should be the important information on section p − 1. Then, the variation of word frequency from section p − 1 to p, which is the point of the previous index, should be taken into the consideration. While focusing on the increase of word frequency from section p − 2 to p − 1, we prepared three patterns for variation of word frequency from section p − 1 to p: increase, stay-flat, and decrease. The index with each pattern can be each defined as index II, index IF and index ID, respectively. Table 6 shows the concepts of the indices.

Important Index of Words for Dynamic Abstracts

227

Fig. 2. The general idea of the method weighting words focusing on local variation of word frequency.

Maeda et al. stated that most all of spoiler words in the story is nouns and verbs [5]. The spoiler words can be assumed as the important words in the story. In a dynamic story, nouns and verbs are related to the important information in the story and should be the keywords to grasp the past story. This research focuses on nouns and study the effectiveness of the proposed method. 3.2

Definition

Figure 2 shows the general idea of the definition of the proposed method. Sections p − 2, p − 1 and p are extracted based on the position of bookmark. Let each frequency of word x on section s be tfs (x). The local variation of word frequency during two consecutive sections for word x is defined as IRss+1 (x), and can be calculated as follows; IRss+1 (x) =

tfs+1 (x) , tfs (x)

(6)

where, the options for s should be p − 2 and p − 1. The index to represent local increase, stay-flat and decrease of IR(x) are each represented as Incs+1 (x), F lats+1 (x) and Decs+1 (x), respectively. The indices s s s can be obtained as the following formulas; Incs+1 (x) = IRss+1 (x), s 1 , (x) = F lats+1 s |1 − IRss+1 (x)| 1 Decs+1 . (x) = s s+1 IRs (x)

(7) (8) (9)

The word importance for word x with each index can be each defined as WIN (x), WII (x), WIF (x), WID (x), WN I (x), respectively. As assigning each section to s, those can be calculated as follows;

228

H. Mori et al. Table 7. Metadata of novels used in the experiment.

Novel ID Author

Title

The amount (in Japanese) Characters Words Sentences Paragraphs

N4

Dazai, Osamu

No Longer Human (excerpted)

37,633

24,109

588

202

N5

Yumeno, Kyusaku

Dogra Magra (excerpted)

47,878

31,347

923

297

N6

Kobayashi, Takiji

The Crab Cannery Ship (excerpted)

39,373

27,045 1,421

299

N7

Tanizaki, Junichiro A Portrait of Shunkin

46,179

29,613

283

27

N8

Dickens, Charles

A Christmas Carol (excerpted)

38,163

25,344

992

127

N9

Mori, Ogai

Vita Sexualis (excerpted)

42,139

29,487 1,637

275

N10

Kafka, Franz

Die Verwandlung 56,069

33,845 1,205

91

W_IN(x) = Inc_{p−2}^{p−1}(x),   (10)

W_II(x) = Inc_{p−2}^{p−1}(x) × Inc_{p−1}^{p}(x),   (11)

W_IF(x) = Inc_{p−2}^{p−1}(x) × Flat_{p−1}^{p}(x),   (12)

W_ID(x) = Inc_{p−2}^{p−1}(x) × Dec_{p−1}^{p}(x),   (13)

W_NI(x) = Inc_{p−1}^{p}(x).   (14)

These indices can be used as the likelihood of each variation type of word frequency during p − 2 through p: the higher the value of an index, the more likely the frequency of the word varies with the corresponding pattern.
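A minimal sketch of Eqs. (6)–(14) follows; the epsilon guard against division by zero for words absent from a section is our own choice, not part of the paper's definition, and the counts are toy values.

# Hedged sketch of Eqs. (6)-(14): the five word-importance indices from
# term frequencies in sections p-2, p-1, and p.

EPS = 1e-9  # guard for zero frequencies (assumption, not from the paper)

def ir(tf_a, tf_b):
    """IR: local variation of word frequency between two sections (Eq. 6)."""
    return (tf_b + EPS) / (tf_a + EPS)

def indices(tf_pm2, tf_pm1, tf_p):
    inc01 = ir(tf_pm2, tf_pm1)            # Inc over p-2 -> p-1
    inc12 = ir(tf_pm1, tf_p)              # Inc over p-1 -> p
    flat12 = 1.0 / (abs(1.0 - inc12) + EPS)
    dec12 = 1.0 / inc12
    return {"IN": inc01,                  # Eq. (10)
            "II": inc01 * inc12,          # Eq. (11)
            "IF": inc01 * flat12,         # Eq. (12)
            "ID": inc01 * dec12,          # Eq. (13)
            "NI": inc12}                  # Eq. (14)

# A word that climaxes in section p-1: frequency 2 -> 8 -> 3.
print(indices(2, 8, 3))  # ID is largest: rise into p-1, then decrease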

4 Evaluation

We conducted an experiment to evaluate the proposed method described in Sect. 3. With each proposed index, we focused on the difference between the values for the words accepted to manually-generated dynamic abstracts and the words rejected from them. By comparing the difference, we investigate which index shows a large difference between the words used and not used in manually-generated dynamic abstracts; if a large difference is observed, the index can be considered effective for detecting words accepted to dynamic abstracts. We use tf-idf, a general index that weights words based on their importance in the whole document, as the comparative method, and discuss the difference of tf-idf values between used and unused words in the same way.

4.1 Novels Used in the Experiment

We used 10 titles, all available on Aozora Bunko, as the target novels for generating dynamic abstracts: "No Longer Human," "Dogra Magra," "The Crab Cannery Ship," "A Portrait of Shunkin," "A Christmas Carol," "Vita Sexualis," "Die Verwandlung," and the three titles shown in Table 1. Novel IDs and the metadata are shown in Tables 1 and 7; the numerical data for each novel were calculated excluding metadata such as title, author, footnotes, copy-text information, and headlines. In the case of IDs N1, N4, N5, N6, N8 and N9, texts were partially extracted from the beginning so that the number of characters would be almost the same as in the other novels. These novels were all ranked higher than 100th place in the access ranking of Aozora Bunko3; they should thus be common and popular novels.

4.2 Manually-Generated Dynamic Abstracts

The abstracts generated in Sect. 2.1 and newly generated abstracts were used as the set of correct dynamic abstracts. The new abstracts were manually generated in the same way as described in Sect. 2.1 by 26 participants in their late teens to twenties who did not know the content of the assigned novel. Of these, 25 participants showed a correct answer rate of 80% or more on the questions asking about the contents of each novel; it was therefore considered that these 25 participants understood the contents, while one participant did not. Only the abstracts generated by the 25 participants whose correct answer rate was over 80% were taken as reasonable reference information and used as the set of correct manually-generated abstracts. From these procedures, 51 correct dynamic abstracts were obtained.

4.3 Results and Discussions

For each of the 51 manually-generated abstracts, the nouns that existed in the abstract were extracted. The same number of nouns were also randomly extracted from the sentences that were not used in the abstracts; the number of nouns differed for each abstract. Accordingly, we obtained 1,890 nouns that existed in the dynamic abstracts and 1,890 that did not, in total, from the 51 dynamic abstracts. Even if the noun itself was the same, nouns were treated as different words when the abstracts in which they appeared were different. We calculated the word importance for each of the 3,780 nouns using each index.

3 The ranking of the XHTML version in the whole year of 2016; there are 13,969 novels in total, as of 2016.12.31.


Table 8. The average rank of words accepted to and rejected from manually-generated dynamic abstracts.

 | Index IN | Index II | Index IF | Index ID | Index NI | tf-idf
rank_acpt | 1697.00 | 1712.59 | 1773.33 | 1809.80 | 1792.97 | 1768.04
rank_rej | 2084.00 | 2068.41 | 2007.67 | 1971.20 | 1988.03 | 2012.96
rank_acpt − rank_rej | −387.00 | −355.82 | −234.34 | −161.40 | −195.05 | −244.93

The value of the word importance differed for each index, so we used the rank instead of the value itself for the discussion: the words with the highest and lowest values were ranked 1 and 3,780, respectively. Let the words accepted to and rejected from the dynamic abstracts be acpt and rej, and let their average ranks be rank_acpt and rank_rej. The difference "rank_acpt − rank_rej" was also calculated. Table 8 shows rank_acpt, rank_rej, and "rank_acpt − rank_rej". A lower value of "rank_acpt − rank_rej" means that the word importance of the words accepted to dynamic abstracts was higher than that of the rejected words; that is, the smaller "rank_acpt − rank_rej" is, the more effective the index should be for detecting the words accepted to dynamic abstracts.

Both index IN and index NI focus on an increase of word frequency, but the focused periods differ: IN focuses on section p − 2 to p − 1, whereas NI focuses on p − 1 to p. Index IN showed a higher rank_acpt than index NI and approximately 1.98 times the "rank_acpt − rank_rej" of index NI. Therefore, index IN weighted the words accepted to dynamic abstracts more heavily than index NI and was suggested to be the more effective criterion. Based on this result, it seems that we should focus on the increase of word frequency during section p − 2 to p − 1 rather than p − 1 to p.

The indices II, IF and ID incorporate the variation of word frequency from section p − 1 to p. Among them, index II showed the highest rank_acpt and the lowest rank_rej; over all indices, index II showed a high rank_acpt second only to index IN. Index II also showed the lowest "rank_acpt − rank_rej" of the three, 1.52 times that of index IF. It was suggested that words whose frequency increases during section p − 1 to p are also effective for detecting the words accepted to dynamic abstracts; this increase was the concept of index NI, the previous method.

The indices IN and II showed approximately 1.58 and 1.45 times the "rank_acpt − rank_rej" of tf-idf, respectively. It was suggested that the local variation of word frequency was more effective than the overall word frequency for the generation of dynamic abstracts.
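The rank comparison itself is straightforward to reproduce; the sketch below, with invented toy scores, shows how rank_acpt and rank_rej could be computed from index values and acceptance flags.

# Hedged sketch of the Table 8 evaluation: rank all words by an index
# value (rank 1 = highest) and compare the average ranks of words that
# were accepted to vs. rejected from the abstracts. Scores are toy values.

def average_ranks(scored):
    """scored: list of (value, accepted_flag). Returns (rank_acpt, rank_rej)."""
    ranked = sorted(scored, key=lambda s: -s[0])
    acpt = [r for r, (_, a) in enumerate(ranked, start=1) if a]
    rej = [r for r, (_, a) in enumerate(ranked, start=1) if not a]
    return sum(acpt) / len(acpt), sum(rej) / len(rej)

words = [(4.0, True), (2.5, True), (2.0, False), (1.0, False)]
r_acpt, r_rej = average_ranks(words)
print(r_acpt - r_rej)  # negative: accepted words rank higher (better)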

5

Conclusion

The purpose of this study is to generate the abstract depending on reading progress. The experience to create manually-generated abstracts was conducted

Important Index of Words for Dynamic Abstracts

231

and we analyzed the features of the abstract to investigate the relationship between reading behaviors and dynamic abstracts. Through the discussion, we have obtained the following findings; • The position where the bookmark is inserted is biased to a boundary of the story. • There are three patterns to create abstracts: – Sentences are extracted from only the part just before the bookmark. – Sentences are extracted from the beginning part of the novel and the part just before the bookmark. – Sentences are extracted exhaustively from all of the read parts. • Whether the sentence just before the bookmark is included in the abstract or not depends on the attribute of each reader. • The important information for the abstracts is changeable depending on each reading progress. • There is no relation between the reading intervals and which sentences are selected for the abstract. • The information of character name and place may work effectively to generate the abstracts. • It seems that the words that have a climax on the part just before the bookmark is important for the abstract. Based on this survey, we proposed novel word weighting methods focusing on the local variation of word frequency for generating dynamic abstracts. We focused on two periods surrounding the bookmark: p − 2, p − 1 and p, and the local variation of word frequency from sections p − 2 to p − 1 and p − 1 from p. The index IN and N I focused on the increase either of p − 2 to p − 1 and p − 1 from p. The indices II, IF and ID considered the combination of variation on p − 2 to p − 1 and p − 1 from p: increase, stay-flat and decrease. As the results, the index IN that focused on the increase of word frequency during p − 2 to p − 1 showed effective result to detect the words accepted to dynamic abstracts. That is to say, we should focused on the words which frequency increases from the two times before the bookmark to the last time. For dynamic abstracts, it was suggested that the reader should reflect what words are active in the last time before resuming the reading. Also, we believe that the words related to the future event which will be happened in the following story would be effective to smooth reading because the index II showed enough effective results too. Using the proposed methods, we will develop a system to generate dynamic abstract corresponding to the reading progress. Then we have to consider how to combine the proposed method that is for weighting words and method to evaluate sentences to be used for abstracts. In our future, we will compare the combination of the word weighting and sentence evaluation methods with each other. Acknowledgements. This work was supported in part by JSPS Grant-in-Aid for Young Scientists B #16K21482.

232

H. Mori et al.

References 1. Bamman, D., Smith, N.A.: New alignment methods for discriminative book summarization. CoRR abs/1305.1319 (2013). http://arxiv.org/abs/1305.1319. Accessed 2 June 2013. 20:48:21 +0200 2. Boyd-Graber, J., Glasgow, K., Zajac, J.S.: Spoiler alert: machine learning approaches to detect social media posts with revelatory information. In: Proceedings of the American Society for Information Science and Technology, vol. 50, no. 1, pp. 1–9 (2013) 3. Green, M.C., Brock, T.C., Kaufman, G.F.: Understanding media enjoyment: the role of transportation into narrative worlds. Commun. Theory 14(4), 311–327 (2004) 4. Guo, S., Ramakrishnan, N.: Finding the storyteller: automatic spoiler tagging using linguistic cues. In: Proceedings of the 23rd International Conference on Computational Linguistics, 23–27 August 2010, Beijing, pp. 412–420 (2010) 5. Maeda, K., Hijikata, Y., Nakamura, S.: A basic study on spoiler detection from review comments using story documents. In: IEEE/WIC/ACM International Conference on Web Intelligence (WI), 13–16 October 2016, Omaha, pp. 572–577 (2016) 6. Mori, H., Yamanishi, R., Nishihara, Y., Fukumoto, J.: The difference of word importance before and after bookmark for novel abstract in each reading progress. In: Proceedings of the 21th International Conference on Knowledge Based and Intelligent Information and Engineering Systems, 6–8 September 2017, Marseille, pp. 1246–1253 (2017) 7. Mori, H., Yamanishi, R., Nishihara, Y., Fukumoto, J.: Relationship between features of reading behaviors and dynamic abstract of novel. In: Lecture Notes in Engineering and Computer Science: Proceedings of The International MultiConference of Engineers and Computer Scientists, 14–16 March 2018, Hong Kong, pp. 254–259 (2018) 8. Tsang, A.S.L., Yan, D.: Reducing the spoiler effect in experiential consumption. In: Advances in Consumer Research, vol. 36, pp. 708–709 (2009) 9. Wang, D., Li, T.: Document update summarization using incremental hierarchical clustering. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, 26–30 October 2010, Toronto, pp. 279–288 (2010)

New Chebyshev Operational Matrix for Solving Caputo Fractional Static Beam Problems Thanon Korkiatsakul1 , Sanoe Koonprasert1,2(B) , and Khomsan Neamprem1,2 1

Department of Mathematics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand [email protected], {sanoe.k,khomsan.n}@sci.kmutnb.ac.th 2 Centre of Excellence in Mathematics, Bangkok 10400, Thailand

Abstract. The main objectives of this work are present a new Chebyshev operational matrix for Caputo fractional derivative to solve approximate analytical solutions of the nonlinear fourth order Caputo fractional integro-differential equations (0 < α ≤ 1) of the static beam problem. The analytical solutions of this problem can be written by a Chebyshev series that can compute the unknown coefficient of Chebyshev polynomials by using the new Chebyshev operational matrices with nonlinear algebraic system. With our results, the Chebyshev polynomials is a powerful method for solving a simply supported beam due to this method is good and simple. The validity and accuracy of our method have been shown through analytical results and absolute error. Some examples are given to show the simplicity and accuracy of this method. Keywords: Chebyshev operational matrix · Chebyshev polynomials Caputo fractional derivative · Static beam problem · Nonlinear algebraic equations · Fractional integro-differential equations

1

·

Introduction

Beam structures are one of the most used elements in structural engineering and it consists of a core that serves to support the vertical weight taken into the support base. The forces that act the beam can produce bending moment and shear forces along the beam which can cause strains, defections and internal stress. The problem of static beams can determine a horizontal structure which has a load point along the length of the beam, causing vertical shear forces. The beam is used for resistance against vertical shear strength and bending moment as shown in Fig. 1. Woinowsky-Krieger [1] determined deflection of an extensible beam where L is length of hinged ends, H is the tension at rest bending and defection of a beam of hinged end with length L, E is the Young elasticity modulus, ρ is density, c Springer Nature Singapore Pte Ltd. 2020  S.-I. Ao et al. (Eds.): IMECS 2018, Transactions on Engineering Technologies, pp. 233–246, 2020. https://doi.org/10.1007/978-981-32-9808-8_19

234

T. Korkiatsakul et al.

Fig. 1. Bending of simple supported beam

I is cross-sectional moment of inertia and A is cross-sectional area. The static definition of a beam can be determined the partial differential equation   L  2  2  ∂u  E ∂ 2 u EI ∂ 4 u H   dx ∂ u = 0. + + − (1)  ∂x  ∂t2 ρA ∂x4 ρ 2ρL ∂x2 0

In 2016, Ren and Tian [2] applied the partial differential Eq. (1) by generalization of the stationary problem as 2   d4 u(x) d2 u(x) 2 d2 u(x) L du(x) − ε − dx = f (x), 0 < x < L, (2) dx4 dx2 L dx2 dx 0 d2 u(L) d2 u(0) = =0 dx2 dx2 and solving numerical solution by Bernoulli collocation method. The main objective of our work, we modify static beam problems (2) to the fourth order nonlinear fractional integro-differential equation in Caputo sense. After that we provide a new operational matrix method to solve the problem which depends on the first kind shifted Chebyshev polynomial. The fourth order Caputo fractional nonlinear integro-differential equation (0 < α ≤ 1) of a static beam is in the form:  2 2 L  (α) Da u(x) dx Da(2α) u(x) = f (x) (3) Da(4α) u(x) − εDa(2α) u(x) − L 0 u(0) = u(L) =

0 < x < L, with boundary conditions u(0) = u(L) = Da(2α) u(0) = Da(2α) u(L) = 0,

0 < x < L,

(4)

where u(x) is the static deflection of the beam at the point x and ε is a positive constant. The model bending equilibrium of a beam (length L) which is simply supported at x = 0 and attached to a fixed nonlinear at x = L. The given function, f (x), represents the external force. The mathematical model for the elastic

New Chebyshev Operational Matrix

beam with non-local terms

235

2

L  (α) (2α) Da u(x) dx Da u(x) can be found in [3,4]. 0

Due to the presence of the integral over [0, L], the problem is not modeled by a point-wise equation and therefore is a non-local problem. The results of Chebyshev method for solving the problem (3) already presented at the international multiconference of engineers and computer scientists 2018 [5].

2

Preliminaries

In this section, we suggest some definitions of the Caputo fractional derivative and properties of Chebyshev polynomials. 2.1

Definition of Caputo Fractional Derivative

Definition 1. The Caputo fractional derivative of u(x) ∈ AC n [a, b] is defined by [6] Daα u(x)

1 = Γ (n − α)



x

a

u(n) (τ ) dτ, (x − τ )α−n+1

where the order α ∈ R+ and n = α which is the smallest integer greater ∞ than or equal to α. The gamma function Γ (z) = 0 e−t tz−1 dt. Under natural condition on the function u(x), if α = 0, then Daα u(x) = u(x) and if α → n, n u(x) it is obvious that the Caputo fractional derivative is a then Daα u(x) = d dx n linear operator similar to integer order differential operators. Some properties of Caputo fractional derivatives are as follows [6]. Daα C = 0, and Daα xβ

2.2

=

where C is a constant

(5)

β < α, β ≥ α.

(6)

0, Γ (β+1) β−α , Γ (β+1−α) x

Properties of Chebyshev Polynomials

We summarize some elementary formulae for the manipulation of Chebyshev polynomials. Definition 2. The Chebyshev polynomials of the first kind Tn (s), n = 1, 2, . . ., N are orthogonal polynomials degree n in s defined on [−1, 1] by [7,8] Tn (s) = cos(nθ), where s = cos(θ), θ ∈ [0, π] and s ∈ [−1, 1].

236

T. Korkiatsakul et al.

For convenience, we first transform s ∈ [−1, 1] to the interval x ∈ [0, 1] by using the transformation x = 12 (s + 1) and obtain the first kind of shifted Chebyshev polynomials in the form Tn∗ (x) = Tn (2x − 1). The shifted polynomials can be generated by using the recurrence relation ∗ ∗ (x) − Tn−2 (x), Tn∗ (x) = 2(2x − 1)Tn−1

with T0∗ (x) = 1 and T1∗ (x) = 2x − 1 so the first kind of shifted Chebyshev polynomials defined by: TN∗ (x) = N

N

(−1)N −k

k=0

22k (N + k − 1)! k x , (2k)!(N − k)!

N = 2, 3, 4, . . .

The zeroes solution of TN∗ +1 (x) on [0, 1] is given by

 [2(N − n) + 1]π 1 1 xn = + cos , 2 2 2N + 2

(7)

(8)

where n = 0, 1, 2, . . . , N . 2.3

The First Kind Shifted Chebyshev Polynomial Expansion

From (7), we introduce the first kind of shifted Chebyshev vector as T∗ (x) = [T0∗ (x) T1∗ (x) T2∗ (x) . . . Tn∗ (x)] . T

(9)

¨ urk and G¨ In 2015, Ozt¨ ulsu [8] constructed the matrix that represent the first kind of shifted Chebyshev polynomial, T∗ (x), so Eq. (9) can be written in matrix form as: (10) T∗ (x) = D−1 Y(x), where ⎡

20 C0,0 ⎢ 2−2 C2,1 ⎢ D=⎢ .. ⎣ . 2−2N C2−2N +1 ,N

where Cn,r =

n! (n−r)!r!

0 2−1 C2,0 .. . 2−2N +1 C2−2N +1 ,N −1

⎤ ··· 0 ⎥ ··· 0 ⎥ ⎥, .. .. ⎦ . . −2N +1 ··· 2 C2−2N +1 ,0

(11)

and T  Y(x) = 1 x x2 . . . xn .

(12)

In general, any function u(x) can be written as a linear combination of the shifted Chebyshev polynomial as: u(x) =

∞ n=0

an Tn∗ (x).

(13)

New Chebyshev Operational Matrix

237

For solving by the shifted Chebyshev method, we first approximate a Chebyshev solution (uN (x)) of the problem as uN (x) =

N

an Tn∗ (x) = AT∗ (x)

n=0

= AD−1 Y(x),

(14)

where A = [a0 a1 ... aN ] is an unknown vector, the square matrix D in (11) and the column vector Y(x) in (12). 2.4

Chebyshev Operational Matrix for Caputo Fractional Derivatives

In this section, we want to construct the first kind of shifted Chebyshev operational matrix that represents the Caputo fractional derivatives. In 2018, Baseri, Abbasbandy and Babolian [9] already proved that the Caputo fractional derivatives of the first kind of shifted Chebyshev polynomials uN (x), (14) can be written in terms of the power series as the theorem. Theorem 1. Let the approximated uN (x) of the first kind of shifted Chebyshev polynomials as in (14) and also suppose α > 0. Then the Caputo fractional derivative with order α of the first kind of shifted Chebyshev polynomials is given by [9] Dα (uN (x)) =

N

n

(α)

an wn,k xk−α ,

(15)

n=α k=α

where (α)

wn,k = (−1)n−k

22k n(n + k − 1)!Γ (k + 1) . (2k)!(n − k)!Γ (k + 1 − α)

(16)

Next we develop a new Chebyshev operational matrix for Caputo fractional derivatives, Dα (uN (x)) as     Dα uN (x) = Dα AD−1 Y(x) = AD−1 Dα Y(x). From (6), we show that the Caputo fractional derivative of the vector, Y(x), of order α is given by the matrix form Dα Y(x) = Bα (x)Y(x), where

⎡ 0 ⎢0 ⎢ ⎢ −α ⎢0 Bα (x) = x ⎢ ⎢ .. ⎣. 0

0 Γ (2) Γ (2−α)

0 .. . 0

(17)

0 0

··· ···

0 0

Γ (3) Γ (3−α)

··· .. . ···

0 .. .

.. . 0

Γ (N +1) Γ (N +1−α)

⎤ ⎥ ⎥ ⎥ ⎥, ⎥ ⎥ ⎦

(18)

238

T. Korkiatsakul et al.

so the Caputo fractional derivatives is   Dα uN (x) = AD−1 Bα (x)Y(x),

(19)

where the (N + 1) × (N + 1) operational matrix D−1 Bα (x) of Caputo fractional derivatives order α which depends on x variable. Moreover we also construct the j th α Caputo fractional derivative matrix with order j th α is given by   D(jα) uN (x) = AD−1 Bjα (x)Y(x), (20) where the (N + 1) × (N + 1) operational matrix D−1 Bjα (x) of Caputo fractional derivatives order jα which depends on x variable and the matrix Bjα (x) defines by ⎡ ⎤ 0 0 0 ··· 0 ⎢0 0 0 · · · ⎥ 0 ⎢ ⎥ ⎢ ⎥ ··· 0 −jα ⎢0 0 0 ⎥, (21) Bjα (x) = x ⎢ . . . Γ (j) ⎥ . .. ⎢ .. .. .. ⎥ ⎣ ⎦ Γ (j−jα) Γ (N +1) 0 0 0 · · · Γ (N +1−jα) and j = 1, 2, 3, . . . , N .

3

Chebyshev Solutions of the Static Beam

Consider the nonlinear Caputo Fractional Integro-Differential Static Beam (3) D(4α) u(x) − εD(2α) u(x) −

2 L



L



2 D(α) u(x) dx D(2α) u(x) = f (x),

(22)

0

with the boundary conditions u(0) = u(L) = D(2α) u(0) = D(2α) u(L) = 0,

0 < x < L.

(23)

First, we assume that the approximate solution, uN (x), in (22) can be written as an expansion in Chebyshev polynomials of the first kind in the form: uN (x) =

N

an Tn∗ (x) = AD−1 Y(x),

(24)

n=0

where the coefficient A = [a0 a1 ... aN ] is an unknown vector. Now (20), the operational matrix D−1 Bjα (x) of Caputo fractional derivatives jα is   D(2α) uN (x) = AD−1 B2α (x)Y(x),   D(4α) uN (x) = AD−1 B4α (x)Y(x),

using order (25) (26)

New Chebyshev Operational Matrix

239

where the matrices D−1 , B2α and B4α are given in (11) and (21). Substituting (19), (24), (25) and (26) into (22), we obtain the following matrix equation AD−1 B4α (x)Y(x) − εAD−1 B2α (x)Y(x)  2 2 L AD−1 Bα (x)Y(x) dx AD−1 B2α (x)Y(x) = f (x). − L 0

(27)

The matrices of the boundary conditions in (23) are then AD−1 Y(0) = 0,

(28)

B2α (0)Y(0) = 0, AD−1 Y(L) = 0, AD−1 B2α (L)Y(L) = 0.

(29) (30) (31)

AD

−1

Next, we solve each coefficient of the row vector A = [a0 , a1 , . . . , aN ] of the solution uN (x). First, we select the real zeros of the shifted Chebyshev polynomial xn in (8), xn , n = 0, 1, 2, . . . , N into (27) and obtain (N + 1) system of algebraic equations. Pick x = x0 and substitute into (27) we have AD−1 B4α (x0 )Y(x0 ) − εAD−1 B2α (x0 )Y(x0 ) 

2 2 L AD−1 Bα (x0 )Y(x0 ) dxAD−1 B2α (x0 )Y(x0 ) = f (x0 ). − L 0

(32)

Similarly, we substitute x = x1 then AD−1 B4α (x1 )Y(x1 ) − εAD−1 B2α (x1 )Y(x1 ) 

2 2 L AD−1 Bα (x1 )Y(x1 ) dxAD−1 B2α (x1 )Y(x1 ) = f (x1 ), − L 0

(33)

x = x2 , the equation is AD−1 B4α (x2 )Y(x2 ) − εAD−1 B2α (x2 )Y(x2 ) 

2 2 L AD−1 Bα (x2 )Y(x2 ) dxAD−1 B2α (x2 )Y(x2 ) = f (x2 ), − L 0

(34)

.. . x = xN −1 , the equation is −1

AD −

2 L



−1

B4α (xN −1 )Y(xN −1 ) − εAD B2α (xN −1 )Y(xN −1 ) 2 L −1 −1 AD Bα (xN −1 )Y(xN −1 ) dxAD B2α (xN −1 )Y(xN −1 ) = f (xN −1 ),

0

(35)

x = xN , the last equation is AD−1 B4α (xN )Y(xN ) − εAD−1 B2α (xN )Y(xN ) 

2 2 L AD−1 Bα (xN )Y(xN ) dxAD−1 B2α (xN )Y(xN ) = f (xN ). − L 0

(36)

240

T. Korkiatsakul et al.

With combining Eqs. (32)–(36) to the system N +1 equations, the solution of system must satisfy the boundary conditions (28)–(31). So, we replace the Eqs. (32), (33) by the boundary conditions x = 0 in (28), (29), the Eqs. (35) and (36) are replaced by the conditions x = L in (30), (31). We applied Newton’s iterative method and Maple program to solve the system of (N + 1) nonlinear algebraic equations to obtain the coefficient unknown vector A. So the vector A which is coefficient of approximate solution uN (x) (24) can obtain an analytical solution of the static beam problem.

4

Error Bound of the Shifted Chebyshev Approximate Solution

In this section, we derive an error bound for the approximate solution u(x) in (22). Diethelm [10] has proved solutions of Caputo fractional differential equations and shown that very good results can be obtained by differentiation of the solution in the interval [0, L] [11]. N Lemma 1. If u(x) is the exact solution and uN (x) = n=0 an Tn∗ (x) is the best shifted Chebyshev approximate solution of (22), then the error bound is given by n2+3

h 2 R √ u(x) − u (x) ≤ , (n + 1)! 2n + 3 N

where R = maxx∈[xi ,xi+1 ] |u(n+1) (x)|,

x ∈ [xi , xi+1 ] ⊆ [0, 1],

(37)

h = xi+1 − xi .

Proof. Applying Taylor’s expansion, we set u1 (x) = u(xi ) −

dun (xi ) (x − xi )n du(xi ) du2 (xi ) (x − xi )2 (x − xi ) + + ... + . 2 dx dx 2! dxn n!

Then

dun+1 (xi ) (x − xi )n+1 , | dxn+1 (n + 1)! N where ξ ∈ [xi , xi+1 ]. Since uN (x) = n=0 an Tn∗ (x) is the best approximation of u(x), we have |u(x) − u1 (x)| ≤ |

u(x) − uN (x)2 ≤ u(x) − u1 (x)2  xi+1 = |u(τ ) − u1 (τ )|2 dτ xi xi+1

 ≤

xi xi+1

|u(n+1) (ξ)|2

 ≤

xi

R2

(τ − xi )2(n+1) dτ (n + 1)!2

(τ − xi )2(n+1) dτ (n + 1)!2

New Chebyshev Operational Matrix

241

where R = maxx∈[xi ,xi+1 ] |u(n+1) (x)|, (τ − xi )2n+3 xi+1 R2 |xi (n + 1)!2 2n + 3 h2n+3 R2 , where h = xi+1 − xi . ≤ (n + 1)!2 2n + 3 =

Taking the square root of both sides, we have 2n+3

u(x) − uN (x) ≤

h 2 R √ , (n + 1)! 2n + 3

which is the desired result for each sub interval [xi , xi+1 ], i = 1, 2, . . . , n. Then, 2n+3 2n+1 O(h 2 ) has a local error bound of the solution u(x), and O(h 2 ) has a global error bound on the interval [0, 1].

5

Numerical Results

In this section, we give some examples to present the applicability and preciseness of the proposed method. All numerical computations were operated using the Maple program. Example 1. Consider the fourth order Caputo fractional nonlinear integrodifferential equation of the static beam for order α = 1 2   d2 u(x) 1 du(x) d4 u(x) d2 u(x) − −2 dx = 2 sin(πx)π 4 + sin(πx)π 2 , dx4 dx2 dx2 dx 0 0 < x < 1, with the boundary conditions d2 u(0) d2 u(1) = = 0, 2 dx dx2 which has the exact solution u(x) = sin(πx). u(0) = u(1) =

By applying the Chebyshev method, we obtain the approximate solution from the shifted Chebyshev expansion with seven terms (N = 7): u(7) (x) =

7

an Tn∗ (x) = AD−1 Y(x),

n=0

where the coefficient matrix A = [a0 a1 ... a7 ]T  T 1 x x2 . . . x7 , and ⎤ ⎡ 1 0 0 ... 0 ⎢ −1 2 0 ... 0 ⎥ ⎥ ⎢ ⎢ 1 −8 8 . . . 0 ⎥ −1 D =⎢ ⎥. ⎢ .. .. .. .. ⎥ .. ⎣ . . . . ⎦ . −1 98 −1568 . . . 8192

and Y(x)

=

242

T. Korkiatsakul et al.

Using the real zeroes of shifted Chebyshev polynomial T7∗ (x) in (8), we obtain x0 = 0.0096, x4 = 0.5975,

x1 = 0.0843, x5 = 0.7778,

x2 = 0.2222, x6 = 0.9157,

x3 = 0.4025, x7 = 0.9904.

For each x, we calculate all matrices Y(x) in (12), Bα (x) in (18), B2α (x) and B4α (x) in (21) as follows:       0 02×2 01×6 04×4 01×4 0 = = Bα = 1×1 1×7 , B , B , 2α 4α 07×1 Bα 06×1 B2α 04×1 B4α 7×7 6×6 4×4 

Bα 7×7 B2α 6×6 B4α 4×4

1 = diag x  2 = diag 2 x  24 = diag 4 x

2 6 x x 6 x2 120 x4

 12 20 30 42 , x x x x  24 60 120 210 , x2 x2 x2 x2  360 840 . x4 x4

We then construct the system of 8 algebraic equations including all boundary conditions. After that, we use the Maple program to solve for the coefficient matrix A = [a0 a1 ... a7 ]T which gives ⎡

⎤ 4.7109 × 10−1 −12 ⎢ 5.4476 × 10 ⎥ ⎢ ⎥ ⎢ −4.9860 × 10−1 ⎥ ⎢ ⎥ ⎢ −6.0979 × 10−12 ⎥ ⎥ A=⎢ ⎢ 2.8110 × 10−2 ⎥ . ⎢ ⎥ −13 ⎢ 6.2246 × 10 ⎥ ⎢ ⎥ ⎣ −6.0570 × 10−4 ⎦ 2.7878 × 10−14

Fig. 2. Graphs of solution and absolute error of the static deflection of the beam for Example 1

New Chebyshev Operational Matrix

243

Therefore, the approximate shifted Chebyshev solution for N = 7 is given by u(N ) (x) = −1.893 × 10−11 + 3.133 x + 5.563 × 10−10 x2 + . . . + 2.284 × 10−10 x7 and the graph of the solution is as shown in Fig. 2. In numerical results, we also apply absolute errors, |uexact (x) − uN (x)| to show the accuracy of this method by comparing the approximate Chebyshev solution with the exact solution N = 14, 15, 16 in Table 1 and graphs of absolute errors in Fig. 2. Table 1. Comparing between exact solutions and approximate Chebyshev solutions & absolute error of Example 1 x

Exact App. Cheb. solutions Absolute error N = 14 N = 15 N = 16 N = 14 N = 15

0.1 0.3082 0.3090

0.3090

0.3090

0.0000 −10

N = 16

0.0000

2.0 × 10−10

0.0000

4.0 × 10−10

0.2 0.5865 0.5878

0.5878

0.5878

1.0 × 10

0.3 0.8076 0.8090

0.8090

0.8090

1.0 × 10−10 1.0 × 10−10 2.0 × 10−10

0.4 0.9495 0.9511

0.9511

0.9511

3.0 × 10−10 1.0 × 10−10 4.0 × 10−10

0.5 0.9984 0.9999

1.0000

0.9999

2.0 × 10−10 0.0000

0.6 0.9495 0.9511

0.9511

0.9511

5.0 × 10

0.0000 −9

0.7 0.8076 0.8090

0.8090

0.8090

1.0 × 10

0.8 0.5865 0.5878

0.5878

0.5878

1.1 × 10−9

0.9 0.3082 0.3090

0.3090

0.3090

2.5 × 10

−9

1.7 × 10−10 −10

5.0 × 10−10

−10

6.0 × 10−10

1.0 × 10

7.0 × 10−10 1.1 × 10−9 1.0 × 10−10 6.0 × 10−10

Considering an equation Lu = f , we can define an absolute residual error for the equation given by Er = |Lu − f |. In the following example, we apply an absolute residual error to show the performance of the shifted Chebyshev method. Example 2. The following nonlinear fourth order Caputo fractional integrodifferential static beam problem:  1 2 (4α) (2α) D(α) u(x) dx D(2α) u(x) = −x, u(x) − D u(x) − 2 0 < x < 1, D 0

and the boundary conditions u(0) = u(1) = D(2α) u(0) = D(2α) u(1) = 0. Applying Chebyshev method, we use the operational matrices of the first kind of shifted Chebyshev to obtain the approximate solutions for α = 1 and values N = 7, 10, 16 respectively as follows.

244

T. Korkiatsakul et al.

u(7) (x) = −3.21 × 10−13 − 0.02 x − 7.61 × 10−12 x2 + . . . − 0.19 × 10−3 x7 u(10) (x) = −7.59 × 10−13 − 0.02 x + 3.96 × 10−12 x2 + . . . − 1.23 × 10−7 x10 u(16) (x) = 1.09 × 10−12 − 0.02 x + 1.01 × 10−11 x2 + . . . − 1.83 × 10−8 x16 . We define an absolute residual error given by Er = |Lu − f |, 2

L where Lu = u(4α) − εu(2α) − L2 0 u(α) dxu(2α) and f (x) = −x. The results of the errors for different values of N = 7, 10, 16 are shown in Table 2. Table 2. Absolute residual errors (Er ) of Example 2 x

N =7

N = 10

N = 16

0.1 1.92 × 10−4 1.55 × 10−8

3.00 × 10−11

0.2 1.67 × 10−5 3.60 × 10−9

2.00 × 10−10

−5

−9

2.00 × 10−10

0.4 5.81 × 10−7 1.70 × 10−9

1.00 × 10−10

−5

2.00 × 10−10

0.3 1.92 × 10 0.5 1.37 × 10

2.50 × 10

−10

2.00 × 10

0.6 6.33 × 10−7 1.40 × 10−9

7.00 × 10−10

−5

−9

1.00 × 10−10

0.8 2.17 × 10−5 3.60 × 10−9

1.00 × 10−10

−4

2.00 × 10−10

0.7 2.29 × 10 0.9 2.72 × 10

2.28 × 10

−8

1.67 × 10

Graphs of the approximate Chebyshev solutions with N = 7 and different values of fractional order α = 0.6, 0.7, 0.8, 0.9, 1 for Caputo fractional derivative are shown in Fig. 3.

Fig. 3. Graphical solutions for deflection of the static beam for fractional order α = 0.6, 0.7, 0.8, 0.9, 1

New Chebyshev Operational Matrix

245

Example 3. Caputo non-local fractional integro-differential equation of the static beam on elastic bearings:  1 2 1 D(α) u(x) dx D(2α) u(x) = , 0 α, where g(t) is continuous in [0, ∞). The function is also said to be in the space Cαm if f (m) ∈ Cα , m ∈ N (for further details see [16]). Definition 1 [16]. The Riemann-Liouville fractional integral operator of order α > 0 of a function f ∈ Cα with a ≥ 0 is defined as  t 1 α J f (t) = (t − τ )α−1 f (τ )dτ, t > a, (3) RL a Γ (α) a where Γ (·) is the gamma function. Definition 2 [16]. For a positive real number α, the Caputo fractional derivative of order α with a ≥ 0 is defined in terms of the Riemann-Liouville fractional integral, i.e., C Daα f (t) =RL Jam−α f (m) (t), or it can be expressed as α C Da f (t)

1 = Γ (m − α)



t

a

f (m) (τ )

α−m+1 dτ,

(t − τ )

(4)

m where m − 1 < α < m, t ≥ a and f ∈ C−1 , m ∈ N.

An important property of the Riemann-Liouville fractional integral and the Caputo fractional derivative of the same order γ can be written as [16] α α RLJa C Da f (t)

= f (t) −

m−1 

f (k) (a)

k=0

(t − a)k , k!

(5)

where m − 1 < α < m, f ∈ Cαm for m ∈ N and α ≥ −1. The Laplace transforms of a fractional derivative in the Caputo sense and of some types of the Mittag-Leffler functions [16] are as follows. Lemma 1 [16]. The Laplace transform of the Caputo fractional derivative of order m − 1 < α < m is L {C Daα f (t)} = sα F (s) −

m−1  k=0

where F (s) = L {f (t)}.

sα−k−1 f (k) (0),

(6)

250

N. Lekdee et al.

Definition 3 [16]. Given α, β > 0, and z ∈ C. The one parameter Mittag-Leffler function Eα is defined as Eα (z) =

∞  j=0

zj , Γ (jα + 1)

(7)

and the Mittag-Leffler function with two parameters is defined as Eα,β (z) =

∞  j=0

zj . Γ (jα + β)

(8)

Lemma 2 [17]. The Laplace transforms for several Mittag-Leffler functions are given by sα−1 , sα + λ sα−β , L {tβ−1 Eα,β (−λtα )} = α s +λ L {Eα (−λtα )} =

(9) (10)

provided that s > |λ|1/α , where λ is a constant parameter. Definition 4. Consider the following high-dimension fractional-order system Dq x(t) = f (x(t)),

(11)

where x(t) = (x1 (t), x2 (t), . . . , xn (t))T , f (x(t)) = (f1 (x(t)), f2 (x(t)), . . . , fn (x(t)))T . The equilibrium solutions x∗ = (x∗1 , x∗2 , . . . , x∗n ) of system (11) are defined by f (x∗ ) = 0. Lemma 3 [18]. The equilibrium point (x∗ , y ∗ , z ∗ ) of the fractional differential system Dq x(t) = f1 (x, y, z), x(0) = x0 ,

Dq y(t) = f2 (x, y, z),

y(0) = y0 ,

Dq z(t) = f3 (x, y, z)

q ∈ (0, 1],

z(0) = z0

is locally asymptotically stable if all of the eigenvalues of the Jacobian matrix evaluated at the equilibrium point (x∗ , y ∗ , z ∗ ), which is expressed as ⎛ ∂f1 ∂f1 ∂f1 ⎞

∂x ∂y ∂z

⎜ ∂f2 ∂f2 ∂f2 ⎟

A = ⎝ ∂x ∂y ∂z ⎠

, (12)

∂f3 ∂f3 ∂f3

∂x

∂y

∂z

(x∗ ,y ∗ ,z ∗ )

satisfy the following condition |arg(eig(A))| >

qπ . 2

(13)

Stability Analysis and Solutions of a Fractional Model

3

251

Algorithms of the Used Methods

In this section, we will describe the algorithms of the Laplace Adomian Decomposition Method and the Adams-Bashforth-Moulton predictor-corrector scheme so that we can use them to solve the fractional-order system (1) equipped with (2) for analytical and numerical solutions, respectively. 3.1

The Laplace Adomian Decomposition Method

The Laplace Adomian decomposition method (LADM) [19,20] for solving FDEs or a system of FDEs can be described as follows. Consider the following fractional-order initial value problem: α C Da u(t)

+ R(u) + N (u) = g(t),

(14)

where m − 1 < α < m, m ∈ N and the solution u(t) satisfies some given initial conditions. In Eq. (14), C Daα denotes the Caputo fractional derivative of order α with respect to t, R and N are linear and nonlinear operators of u, respectively, and g is a source term. Taking the Laplace transform of both sides of Eq. (14) and then applying the formula (1) to the resulting equation, we obtain L {u(t)} =

m−1 1 1  α−k−1 (k) s u (0) + α L {g(t)} sα s k=0

1 1 − α L {R(u)} − α L {N (u)}. s s

(15)

In the LADM, we define the solution u(t) as an infinite series u(t) =

∞ 

ui (t),

(16)

i=0

and represent the nonlinear term N by an infinite series of Adomian polynomials N (u) =

∞ 

Ai ,

(17)

i=0

where the Ai polynomials can be determined by the following formula



 1 di

k Ai = N λ u , i ≥ 0. k

i! dλi k=0

(18)

λ=0

Substituting (16) and (17) into (15), we then have the following Adomian recursion scheme L {u0 } =

m−1 1  α−k−1 (k) 1 s u (0) + α L {g(t)}, sα s k=0

1 1 L {un+1 } = − α L {R(un (t))} − α L {An }, n ≥ 0. s s

(19)

252

N. Lekdee et al.

Taking the inverse Laplace transform to Eq. (19), we can compute the solution components un (n ≥ 0). Then the n-term approximation of the solution is ϕn (t) =

n−1 

ui (t),

(20)

i=0

which the limn→∞ of the approximation yields the exact solution of Eq. (14), i.e., u(t) = lim ϕn (t).

(21)

n→∞

Sometimes the exact solution u(t) in Eq. (21) may be formulated in a closed form. If the exact solution u(t) in Eq. (21) can be written as a power series in which an independent variable t is raised to fractional powers and the radius of convergence of the series is quite small, then the solution might not be valid for the entire domain of interest. Hence, a technique of analytical continuation to obtain a solution, which is valid on the domain of interest, is required. The Pad´e approximant method constructs a rational function in t as an approximation for a slowly converging or diverging power series in t. It is one of the well-known convergence acceleration techniques, which can be applied to an n-term polynomial approximation φn (t). We denote the [m/m] diagonal Pad´e approximant of φn (t) in t by [m/m] {φn (t)}, i.e., Pad´e[m/m] {φn (t)} = [m/m] {φn (t)}, where m = (n − 1)/2 if n = 3, 5, 7, . . ., and m = n/2 if n = 4, 6, 8, . . .. However, if each variable t in the n-term approximation φn (t) has a fractional power, then we must change such fractions to new integer powers using a transformation before applying the Pad´e approximants. The LADM accelerated by the Pad´e approximants is called the Laplace-Adomian-Pad´e method (LAPM). 3.2

Adams-Bashforth-Moulton Predictor-Corrector Scheme

Recently, the Adams-Bashforth-Moulton type predictor-corrector scheme or the PECE [12] method is extensively used to numerically solve FDEs. In this work, we will use this method to obtain approximate numerical solutions for system (1). The general idea and the associated formulas of this method are as follows. Consider the following fractional-order initial value problem (FIVP) α C Da u(t)

u

(k)

(0) =

= f (t, u(t)), (k) u0 , k

0 ≤ t ≤ T,

= 0, 1, . . . m − 1,

α ∈ (m − 1, m),

(22)

where f is a nonlinear function and m is a positive integer. The FIVP (22) can be transformed to the following Volterra integral equation u(t) =

m−1  k=0

(k) t

u0

k

k!

+

1 Γ (α)



t

(t − τ )α−1 f (τ, u(τ ))dτ.

(23)

0

To approximate the integral in (23), we discretize the entire time T as the uniform grid {tn = nh : n = 0, 1, . . . N } for some integer N and the step size

Stability Analysis and Solutions of a Fractional Model

253

h := T /N . Let uh (tn ) denote the approximation of u(tn ). Suppose that we have already calculated approximations uh (tj ), j = 1, 2, . . . , n, then the approximation uh (tn+1 ) of the FIVP (22) can be calculated using the PECE method as follows: uh (tn+1 ) =

m−1  tkn+1 k=0

+ where

aj,n+1

k!

(k)

u0 +

hα f (tn+1 , uP h (tn+1 )) Γ (α + 2)

n

 hα aj,n+1 f (tj , uh (tj )), Γ (α + 2) j=0

⎧ α+1 n − (n − α)(n + 1)α , if j = 0, ⎪ ⎪ ⎨ (n − j + 2)α+1 + (n − j)α+1 = −2(n − j + 1)α+1 , if 1 ≤ j ≤ n, ⎪ ⎪ ⎩ 1, if j = n + 1.

(24)

(25)

The predictor uP h (tn+1 ) in Eq. (24), i.e., an initial approximation, is given by uP h (tn+1 ) =

m−1  tkn+1 k=0

k!

(k)

u0 +

n

1  bj,n+1 f (tj , uh (tj )), Γ (α) j=0

(26)

where bj,n+1 =

4

hα ((n + 1 − j)α − (n − j)α ). α

(27)

Stability Analysis and Obtained Solutions

In this section, we firstly perform the study of the asymptotic stability of the fractional-order model for glucose-insulin homeostasis in healthy rats by calculating arguments of the eigenvalues of the Jacobian matrix. Secondly, we obtain the exact solutions of the model via using the Laplace transform. Finally, we confirm our analytical results by obtaining analytical and numerical solutions using the proposed methods. The equilibrium point of system (1) is obtained by solving the following algebraic system: C α a Dt i(t) C α a Dt g(t) C α a Dt d(t)

= c1 g(t) − cα 6 i(t) = 0, = −c4 (i(t) − c5 ) − c2 i(t) − c3 + c0 d(t) = 0, =

−cα 7 d(t)

= 0.

Then the equilibrium point of the above system is  c c − c cα  c c − c   4 5 3 4 5 3 ,0 , , 6 E∗ = c2 + c4 c1 c2 + c4 provided that c4 c5 − c3 > 0.

(28)

(29)

254

N. Lekdee et al.

Theorem 1. The equilibrium point E ∗ of the fractional-order system (1) defined in Eq. (29) is asymptotically stable. Proof. The Jacobian matrix of system (1) is as follows ⎡ ⎤ −cα c1 0 6 J = ⎣−(c4 + c2 ) 0 c0 ⎦ . 0 0 cα 7

(30)

The characteristic polynomial equation of the matrix J is 2 α (λ + cα 7 )(λ + c6 λ + c1 (c2 + c4 )) = 0.

(31)

It is not difficult to see that the eigenvalues of the matrix J are λ1 = −cα 7, λ2,3 =

−cα 6 ±



2 (cα 6 ) − 4c1 (c2 + c4 ) . 2

(32)

2 Separating the discriminant value (cα 6 ) − 4c1 (c2 + c4 ) into three cases, we can examine the asymptotic stability of the equilibrium point E ∗ as follows. 2 Case I: If (cα 6 ) − 4c1 (c2 + c4 ) = 0, then

λ1 = −cα 7 < 0,

λ2,3 = −cα 6 < 0.

Thus, for α ∈ (0, 1], we obtain |arg(λi )| = π > απ 2 , ∀i = 1, 2, 3. 2 Case II: If (cα 6 ) > 4c1 (c2 + c4 ), then we have  2 (cα −cα 6 ± 6 ) − 4c1 (c2 + c4 ) α λ1 = −c7 < 0, < 0. λ2,3 = 2 Thus, we obtain |arg(λi )| = π > απ 2 where i = 1, 2, 3, for α ∈ (0, 1]. 2 ) < 4c (c + c ), then the eigenvalues can be written as Case III: If (cα 1 2 4 6  2 |(cα −cα 6 ± 6 ) − 4c1 (c2 + c4 )| i α λ1 = −c7 < 0, . λ2,3 = 2 For α ∈ (0, 1], we thus obtain |arg(λ1 )| = π > απ 2 and λ2,3 are located on the left hand complex plane. So, |arg(λ2,3 )| > π2 ≥ απ 2 . From Case I, II, III, we obviously obtain that for α ∈ (0, 1], |arg(λi )| >

απ , 2

∀i = 1, 2, 3.

By Lemma 3, the equilibrium point E ∗ defined in Eq. (29) is therefore asymptotically stable for all α ∈ (0, 1]. 

Stability Analysis and Solutions of a Fractional Model

255

Next, we demonstrate formulas of solutions of the FIVP in Eqs. (1)–(2) using the proposed methods. More details of the derivation of the formulas can be found in [21]. Using the Laplace transform method and the formula (9) and (10), the exact solutions of the FIVP in Eqs. (1)–(2) can be written in terms of the Mittag-Leffler functions as follows (more details in [21]). φ1 φ2 α Eα,α+1 (β1 tα ) − Eα,α+1 (β2 tα ) − φ3 Eα,α+1 (−cα 7 t ), β1 − β2 β1 − β2 (33) ω1 ω2 0 α α α α g(t) = g + Eα,α+1 (β1 t ) − Eα,α+1 (β2 t ) − ω3 Eα,α+1 (−c7 t ), β1 − β2 β1 − β2 (34) i(t) = i0 +

α d(t) = d0 Eα (−cα 7 t ),

(35)

where c1 c0 d0 β1 c1 c0 d0 β2 , φ2 = β22 i0 + c1 (g 0 β2 + μ2 ) + , α β1 + c 7 β2 + c α 7     2 2 cα cα − 4μ1 c1 − 4μ1 c1 −cα −cα c1 c0 d0 cα 6 + 6 − 6 6 7 , β1 = , β2 = , φ3 = α α (β1 + c7 )(β2 + c7 ) 2 2

φ1 = β12 i0 + c1 (g 0 β1 + μ2 ) +

α 0 ω1 = g 0 β12 + cα 6 μ2 + β 1 μ2 + c 6 g +

0 (β1 + cα 6 )c0 β1 d − μ1 β1 i0 , α β1 + c 7

α 0 ω2 = g 0 β22 + cα 6 μ2 + β 2 μ2 + c 6 g +

0 (β2 + cα 6 )c0 β1 d − μ1 β2 i0 , α β2 + c 7

ω3 =

4.1

α α 0 (cα 6 − c7 )c0 c7 d , μ1 = c 2 + c 4 , α (β1 + c7 )(β2 + cα 7)

and

μ2 = c 4 c 5 − c 3 .

The Employed Methods: LAPM and PECE

The current section demonstrates the use of the Laplace-Adomian-Pad´e method (LAPM) and the predictor-corrector scheme to obtain an analytical and numerical solutions of the FIVP in Eqs. (1)–(2). We begin by using the Laplace Adomian Decomposition method (LADM) to obtain a series solution of the FIVP. We then show that the radius of convergence of the series is very small and that the series diverges over a large region of the domain of interest. Thus, the LAPM is required to obtain a solution which can be used over the whole domain by replacing the series solution by a Pad´e approximant, i.e., by a rational function. More derivations for the following results can be found in [21]. Using the following initial conditions and parameter values [1] i0 = 100, g 0 = 150, d0 = 50, c0 = 0.1, c1 = 0.7, c2 = 0.0003, c4 = 0.05, c6 = 0.25, c7 = 0.14, c5 = 250,

(36)

256

N. Lekdee et al.

and the symbolic algebra package MATHEMATICA, the LADM approximating solutions of the FIVP (1)–(2) for α = 1 are I20 (t) = 100 + 65t − 10.56t2 . . . + 3.549 × 10−28 t20 , G20 (t) = 150. + 6.97t − 1.672t2 . . . + 6.632 × 10−29 t20 ,

(37)

−32 19

D20 (t) = 50 − 7.5t + 0.5625t . . . − 9.119 × 10 2

t ,

and the corresponding solutions using the LAPM are 100 + 119297t . . . + 2.917 × 10−10 t10 , 1 + 1192.32t . . . + 4.487 × 10−13 t10 150 + 187164t . . . + 9.83 × 10−11 t10 Pad´e[10/10] {G20 (t)} = , 1 + 1247.71t . . . + 4.31 × 10−13 t10 50 + 18228.8t . . . − 3.97 × 10−14 t10 Pad´e[10/10] {D20 (t)} = . 1 + 364.725t . . . + 7.95 × 10−16 t10

Pad´e[10/10] {I20 (t)} =

(38)

The corresponding exact solutions in Eqs. (33)–(35) for α = 1, the approximate numerical solutions using the RK4 method, the LADM, and the LAPM are compared in Fig. 1. It is obvious from Fig. 1 that the solutions for i(t), g(t) obtained using the LADM are different from those obtained using the other methods when t is approximately close to t = 25 and that the LADM solution diverges for t > 25. This divergence is due to the fact that the infinite series solution for the LADM diverges for t > 25. Due to this divergence of the LADM compared with the LAPM, it is clear that the LAPM will be a better method than the LADM for other values of α. Next, we apply the PECE method, i.e., the Adams-Bashforth-Moulton predictor-corrector scheme, in Eqs. (23)–(27) [12] to the FIVP (1)–(2). The discretization of a time interval with points {tn } to obtain the numerical solution formulas in = i(tn ), gn = g(tn ), dn = d(tn ) of the problem can be found in [21]. The numerical simulations of the FIVP using the resulting discretized formulas will be obtained in the next section. 4.2

Simulation Results

Here we will show the simulation results of the FIVP (1)–(2) with α = 34 , 12 using the exact solution formulas in Eqs. (33), (34), and (35) and the function mlf for the Mittag Leffler functions, which is implemented by [22]. The approximating solutions of the problem generated by the PECE method, the LAPM, and the LADM are also computed. However, the absolute errors of the numerical results obtained by the LAPM and the PECE method when compared to those obtained using the exact solution formulas can be found in [21]. The following simulation results are for α = 34 , 12 for which more details of the computations can be found in [21].

Stability Analysis and Solutions of a Fractional Model 190

500 Exact Rk4 LADM LAPM

450 400

EXACT RK4 LADM LAPM

180

170

g(t)

350

i(t)

257

160

300 250

150

200 140

150 100

0

5

10

15

20

25

130

30

0

5

10

t

15

20

25

30

t 50 Exact RK4 LADM LAPM

45 40 35

d(t)

30 25 20 15 10 5 0

0

5

10

15

20

25

30

t

Fig. 1. Simulation comparisons of the solutions i(t), g(t), d(t) for the FIVP (1)–(2) with α = 1 using the exact solutions, the RK4 method, the LADM, and the LAPM. The simulation results i(t), g(t) obtained from the LADM are diverging for t ≥ 25.

For α = 34 : Applying the LADM to the problem using the recursion scheme, the 20-term approximations of the solutions are I20 (t) = 100 + 59.52t3/4 . . . + 1.216 × 10−18 t15 , G20 (t) = 150 + 7.584t3/4 . . . + 1.460 × 10−19 t15 ,

(39)

D20 (t) = 50 − 13.113t3/4 . . . − 5.328 × 10−22 t57/4 . Substituting with an appropriate variable for t in Eq. (39) and calculating the Pad´e[10/10] of the resulting solutions via the command ‘PadeApproximant’ in Mathematica and then changing the new variable back to the variable t, so the LAPM leads us to the following approximating solutions 1

Pad´e[10/10] I20 (t) =

5

100 − 2.54 × 10−10 t 4 . . . + 7.67 × 10−12 t 2 1

5

1 − 2.54 × 10−12 t 4 . . . + 4.08 × 10−14 t 2 1 5 150 + 1.61 × 10−8 t 4 . . . − 1.41 × 10−9 t 2 Pad´e[10/10] G20 (t) = 1 5 , 1 + 7.74 × 10−11 t 4 . . . − 8.91 × 10−12 t 2 1 5 50 + 2.7 × 10−11 t 4 . . . + 2.33 × 10−12 t 2 Pad´e[10/10] D20 (t) = 1 5 . 1 + 5.38 × 10−13 t 4 . . . + 1.45 × 10−13 t 2

, (40)

The simulation results of the problem for α = 34 using all of the methods, i.e., the exact formulas (35), (33), and (34), the PECE method with the step size h = 10−3 , the LAPM in Eq. (40), and the LADM in Eq. (39), are shown in Fig. 2. The numerical simulations for i(t), g(t) and d(t) obtained by the PECE method

258

N. Lekdee et al. 350

180 Exact PECE LAPM LADM

300

Exact PECE LAPM LADM

175

170

i(t)

g(t)

250 165

200 160 150

100

155

0

5

10

15

150

20

0

5

10

t

15

20

t 50 Exact PECE LAPM LADM

45 40

d(t)

35 30 25 20 15 10 5

0

5

10

15

20

t

Fig. 2. Simulation comparisons of the solutions i(t), g(t), d(t) for the FIVP (1)–(2) using the exact solutions, the PECE method, the LAPM, and the LADM for α = 34 .

and the LAPM are in very good agreement with the exact solutions, however, the solution curves i(t), g(t) constructed by the LADM are diverging when t ≈ 15. The absolute errors between numerical solutions, which are computed using the LAPM and the PECE method, and the exact solutions can be found from Table I in [21]. It can be numerically concluded from such a Table that the PECE method achieves a higher degree of accuracy than the LAPM when t is larger. For α = 12 : Applying the LADM to the problem using the recursion scheme, the 20-term approximations of the solutions are as follows I20 (t) = 100 + 47.115t1/2 . . . + 1.028 × 10−10 t10 , G20 (t) = 150 + 7.865t1/2 . . . + 9.059 × 10−12 t10 ,

(41)

D20 (t) = 50 − 21.851t1/2 . . . − 6.57 × 10−13 t19/2 . Replacing with an appropriate variable for t in Eq. (41) and calculating the Pad´e[10/10] of the resulting solutions via the command ‘PadeApproximant’ in Mathematica and then changing the new variable back to the variable t, so the LAPM gives us the following approximating rational solutions

Stability Analysis and Solutions of a Fractional Model

259

1

Pad´e[10/10] I20 (t) = Pad´e[10/10] G20 (t) = Pad´e[10/10] D20 (t) =

100 + 1.08 × 109 t 2 . . . + 25.106 × t5

, 1 1 + 1.08 × 107 t 2 . . . + 0.11t5 1 150 − 1.855 × 101 0t 2 . . . − 286.91 × t5 1

1 − 1.237 × 108 t 2 . . . − 1.40t5 1 50 − 1.80 × 109 t 2 . . . − 1.04 × 10−3 t5 1

1 + 3.583 × 107 t 2 . . . 0.285t5

,

.

(42)

The numerical results of the problem for α = 12 using all of the methods are illustrated in Fig. 3. The solution curves i(t), g(t) obtained using the LADM are diverging when t ≈ 10 and the numerical simulations obtained by the PECE method and the LAPM are in very good agreement with the exact solutions. The absolute errors between numerical solutions, which are computed using the LAPM and the PECE method, and the exact solutions can be found from Table II in [21]. It is not difficult to observe from such a Table that the PECE method provides a better accuracy than the LAPM when t is far away from the initial point. 180

350 Exact PECE LAPM LADM

300

Exact PECE LAPM LADM

175

170

i(t)

g(t)

250 165

200 160 150

100

155

0

5

10

150

15

0

5

t

10

15

t 50 45

Exact PECE LAPM LADM

40

D(t)

35 30 25 20 15

0

5

10

15

t

Fig. 3. Graphical comparisons of the solutions i(t), g(t), d(t) for the FIVP (1)–(2) using the exact solutions, the PECE method, the LAPM, and the LADM for α = 12 .

5

Conclusion

In this chapter, we have proved the asymptotic stability of the equilibrium point of the fractional-order model for glucose-insulin homeostasis in healthy rats. We

260

N. Lekdee et al.

have obtained the exact solutions of the fractional-order initial value problem (1)-(2) by using the Laplace transform. In addition, we have obtained approximating analytical solutions of the problem via the LADM and the LAPM and the numerical solutions of the problem simulated via the PECE method have been computed. The simulations of the solutions calculated by the above approaches have been compared for the integer order α = 1 and the fractional orders α = 34 , 12 . The comparisons have shown that the infinite series solutions generated by the LADM become divergent as the time increases, whereas the approximate solutions generated via the LAPM and the PECE are in very good agreement with the exact solutions of the problem. Acknowledgements. This research is supported by the Department of Mathematics, King Mongkut’s University of Technology North Bangkok, Thailand and the Centre of Excellence in Mathematics, the Commission on Higher Educaton, Thailand.

References 1. Lombarte, M., Lupo, M., Campetelli, G., Basualdo, M., Rigalli, A.: Mathematical model of glucose-insulin homeostasis in healthy rats. Math. Biosci. 245(2), 269–277 (1950) 2. Cho, Y., Kim, I., Sheen, D.: A fractional-order model for minmod millennium. Math. Biosci. 262(Supplement C), 36–45 (2015) 3. Asgari, M.: Numerical solution for solving a system of fractional integro-differential equations. IAENG Int. J. Appl. Math. 45(2), 85–91 (2015) 4. Zhang, S., Zhou, Y.Y.: Multiwave solutions for the Toda lattice equation by generalizing Exp-function method. IAENG Int. J. Appl. Math. 44(4), 177–182 (2014) 5. Song, H., Yi, M., Huang, J., Pan, Y.: Numerical solution of fractional partial differential equations by using legendre wavelets. Eng. Lett. 24(3), 358–364 (2016) 6. Gepreel, K.A., Nofal, T.A., Al-Sayali, N.S.: Exact solutions to the generalized hirota-satsuma kdv equations using the extended trial equation method. Eng. Lett. 24(3), 274–283 (2016) 7. Kataria, K., Vellaisamy, P.: Saigo space-time fractional poisson process via adomian decomposition method. Stat. Probab. Lett. 129, 69–80 (2017) 8. Haq, F., Shah, K., ur Rahman, G., Shahzad, M.: Numerical solution of fractional order smoking model via laplace adomian decomposition method. Alexandria Eng. J. 57(2), 1061–1069 (2017) 9. Sirisubtawee, S., Kaewta, S.: New modified adomian decomposition recursion schemes for solving certain types of nonlinear fractional two-point boundary value problems. Int. J. Math. Math. Sci. 2017, 1–20 (2017) 10. Liu, Q., Liu, J., Chen, Y.: Asymptotic limit cycle of fractional van der pol oscillator by homotopy analysis method and memory-free principle. Appl. Math. Model. 40(4), 3211–3220 (2016) 11. Freihat, A., Momani, S.: Application of multi-step generalized differential transform method for the solutions of the fractional-order chua’s system. Discrete Dyn. Nat. Soc. 2012, 1–15 (2012) 12. Diethelm, K., Ford, N.J., Freed, A.D.: A predictor-corrector approach for the numerical solution of fractional differential equations. Nonlinear Dyn. 29(1), 3– 22 (2002)

Stability Analysis and Solutions of a Fractional Model

261

13. Wang, H., Yang, D., Zhu, S.: A petrov-galerkin finite element method for variablecoefficient fractional diffusion equations. Comput. Methods Appl. Mech. Eng. 290, 45–56 (2015) 14. Lu, H., Bates, P.W., Chen, W., Zhang, M.: The spectral collocation method for efficiently solving PDEs with fractional laplacian. Adv. Comput. Math. 44(3), 861–878 (2017) 15. Khan, N.A., Ara, A., Khan, N.A.: Fractional-order riccati differential equation: analytical approximation and numerical results. Adv. Difference Equ. 2013(1), 185 (2013) 16. Podlubny, I.: Fractional Differential Equations: An Introduction to Fractional Derivatives, Fractional Differential Equations, to Methods of their Solution and Some of Their Applications, vol. 198. Academic press, Cambridge (1998) 17. Magin, R., Feng, X., Baleanu, D.: Solving the fractional order bloch equation. Concepts Magn. Reson. Part A 34(1), 16–23 (2009) 18. Kou, C.H., Yan, Y., Liu, J.: Stability analysis for fractional differential equations and their applications in the models of HIV-1 infection. Comput. Model. Eng. Sci. 39(3), 301 (2009) 19. Momani, S., Odibat, Z.: Numerical comparison of methods for solving linear differential equations of fractional order. Chaos, Solitons Fractals 31(5), 1248–1255 (2007) 20. Ahmed, H., Bahgat, M.S., Zaki, M.: Numerical approaches to system of fractional partial differential equations. J. Egypt. Math. Soc. 25(2), 141–150 (2017) 21. Lekdee, N., Sirisubtawee, S., Koonprasert, S.: Exact Solutions and Numerical Comparison of Methods for Solving Fractional Order Differential Systems. In: Lecture Notes in Engineering and Computer Science: Proceedings of The International MultiConference of Engineers and Computer Scientists, Hong Kong, pp. 459–466, 14–16 March 2018 22. Podlubny, I.: Calculates the mittag-leffler function with desire accuracy (2005)

On Asymptotic Stability Analysis and Solutions of Fractional-Order Bloch Equations Sekson Sirisubtawee(B) Department of Mathematics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, Thailand [email protected]

Abstract. The Bloch equations are a model for nuclear magnetic resonance (NMR), which is a physical phenomenon arising in engineering, medicine, and the physical sciences. The main contributions of this work are to obtain an asymptotic stability condition of Caputo fractional-order Bloch equations and to provide analytical solutions and numerical solutions of the mentioned fractional-order model. The asymptotic stability analysis is generalized for the incommensurate fractional-order Bloch equations. The standard two methods employed to solve the fractional-order Bloch equations are a revised variational iteration method and an Adams-Bashforth-Moulton type predictor-corrector scheme. The first method gives analytical solutions while the second scheme generates numerical solutions for the problem. Comparisons of the two types of obtained solutions are demonstrated via varying fractional orders of the model. Keywords: Asymptotic stability analysis · Adams-Bashforth-Moulton predictor-corrector scheme · Caputo fractional-order Bloch equations · Incommensurate fractional-order model · Revised variational iteration method · Absolute error

1 Introduction Recently, fractional-order differential equations (FDEs) have played an important role in applied mathematics and engineering. Especially, they have been used to model phenomena in many fields. FDEs are based on various definitions of fractional derivatives. They are generalizations of integer-order differential equations and they have been of interest to mathematicians, scientists and engineers for more than thirty years. Because the memory and hereditary properties of various physical processes can be explained using FDEs [1]. FDEs have been efficiently used to model many real phenomena in many research fields such as engineering [2], physics [3], applied mathematics [4], and disease models [5]. Most nonlinear FDEs do not generally have exact solutions, hence approximate analytic solutions and numerical solutions are usually required for S. Sirisubtawee—A Researcher in the Centre of Excellence in Mathematics, Bangkok, 10400, Thailand. c Springer Nature Singapore Pte Ltd. 2020  S.-I. Ao et al. (Eds.): IMECS 2018, Transactions on Engineering Technologies, pp. 262–275, 2020. https://doi.org/10.1007/978-981-32-9808-8_21

On Asymptotic Stability Analysis and Solutions

263

such cases. Some techniques for obtaining approximate analytic solutions of FDEs include the homotopy analysis method (HAM) [8], the Laplace-Adomian decomposition methods (LADM) [6], the Laplace-variational iteration method (LVIM) [9], and the Duan-Rach modified decomposition method [7]. The LADM and HAM are based on the assumption that the analytical solutions of FDEs are in the form of infinite series, whereas the LVIM constructs a correction functional by a generalized Lagrange multiplier method. Numerical methods for obtaining approximate solutions of FDEs are based on discretization of the independent variable and include the Adams-BashforthMoulton type predictor-corrector or PECE (Predict, Evaluate, Correct and Evaluate) method [10], the finite element method [12], and the simulink model [11]. The objective of this work is to construct and compare analytical and numerical solutions of the following Caputo fractional-order Bloch system Mx (t) , T2 My (t) α , C Da My (t) = −ω0 Mx (t) − T2 M0 − Mz (t) α , C Da Mz (t) = T1

(1)

Mx (a) = Mx0 , My (a) = My0 , Mz (a) = Mz0 .

(2)

α C Da Mx (t) = ω0 My (t) −

with initial conditions

The fractional-order system (1) is developed from the classical system of first-order Bloch equations [13] with initial conditions given at t = a. Replacing the first-order time derivatives in the classical one with the Caputo fractional derivatives of order α ∈ (0, 1] denoted by C Dαa and maintaining a consistent set of units for both sides of each equation in the system via fractional time constants, we finally obtain the generalized system (1). The applications of system (1) in some specific fields can be found in [13– 15]. In Eq. (1), the states Mx (t), My (t), and Mz (t) represent the system magnetization in x, y and z components, respectively. The meanings of the parameters in the system are as follows: ω0 = 2π f0 is the resonant frequency, T1 is the spin-lattice relaxation time, T2 is the spin-spin relaxation time, and M0 is the equilibrium magnetization. In this work, we provide a stability condition for which an equilibrium point of the incommensurate system of (1) is asymptotically stable. Moreover, we obtain an approximate analytical solution of the initial value problem in Eqs. (1)–(2) using the revised variational iteration method (RVIM) [16] and a numerical solution using the AdamsBashforth-Moulton type predictor-corrector scheme (PECE) [17]. These two methods are currently regarded as the most reliable and efficient analytical and numerical methods for solving FDEs.

2 Relevant Definitions and Theorems In this section, we provide necessary definitions and important properties of fractionalorder operators including the Riemann-Liouville fractional integral and the Caputo fractional derivative. In addition, the vital asymptotic stability theorems for fractional-order systems are described.

264

S. Sirisubtawee

A function f (t) (t > 0) is said to be in the space Cα (α ∈ R) if it can be written as f (t) = t p g(t) for some p > α , where g(t) is continuous in [0, ∞). The function is also said to be in the space Cαm if f (m) ∈ Cα , m ∈ N (for more details see [1]). Definition 1 [1]. The Riemann-Liouville fractional integral operator of order α > 0 of a function f ∈ Cα with a ≥ 0 is defined as α RLJa f (t) =

1 Γ (α )

 t a

(t − τ )α −1 f (τ )d τ , t > a,

(3)

where Γ (·) is the gamma function. If α = 0, then RLJaα f (t) = f (t). Definition 2 [1]. Given a positive real number α , the Caputo fractional derivative of order α with a ≥ 0 is defined in terms of the Riemann-Liouville fractional integral, i.e., α m−α f (m) (t), where m − 1 < α < m, m ∈ N, or it can be expressed as C Da f (t) =RL Ja α C Da f (t) =

1 Γ (m − α )

 t a

f (m) (τ )

(t − τ )α −m+1

dτ ,

(4)

m . If α = m, then Dα f (t) = f (m) (t). where t ≥ a and f ∈ C−1 C a

Remark 1 [1]. For f (t) ∈ C1n , α , β ≥ 0, m−1 < α ≤ m, α + β ≤ n, where m, n ∈ N, a ≥ 0 and γ ≥ −1, we have the following important properties 1. RLJaα

β RLJa

f (t) = RLJaβ

2. RLJaα (t − a)γ =

α RLJa

f (t) = RLJaα +β f (t),

Γ (γ + 1) (t − a)γ +α , Γ (γ + α + 1)

3. RLJaα C Dαa f (t) = f (t) −

m−1



k=0

f (k) (a)

(t − a)k . k!

Theorem 1 [18–20]. The equilibrium point x∗ = [x1∗ , x2∗ ..., xn∗ ]T of the following commensurate fractional-order system of n differential equations α

C Da

x(t) = f(t, x(t)),

x(0) = x0 ,

(5)

T

where α = [α1 , α2 , ..., αn ] and α1 = α2 = ... = αn = α ∈ (0, 1] is locally asymptotically stable if all the eigenvalues of the Jacobian matrix evaluated at x∗ , i.e., J(x∗ ), satisfy the following condition απ |arg(eig(J(x∗ )))| > . (6) 2 Theorem 2 [21]. Consider the following autonomous system α

C Da

x(t) = Ax(t), x(0) = x0 ,

(7)

T

with α = [α1 , α2 , ..., αn ] and its n-dimensional representation: α1 C Da x1 (t) =a11 x1 (t) + a12 x2 (t) + . . . a1n xn (t), α2 C Da x2 (t) =a21 x1 (t) + a22 x2 (t) + . . . a2n xn (t),

... αn C Da xn (t) =an1 x1 (t) + an2 x2 (t) + . . . ann xn (t),

(8)

On Asymptotic Stability Analysis and Solutions

265

where all αi ’s are rational numbers between 0 and 2. Assume m to be the least common multiple of the denominators ui of αi ’s, where αi = vi /ui , vi , ui ∈ Z + for i = 1, 2, . . . , n and we set γ = 1/m. Define ⎛ mα ⎞ λ 1 − a11 −a22 . . . −a1n ⎜ −a21 λ mα2 − a22 . . . −a2n ⎟ ⎟ = 0. (9) det ⎜ ⎝ ⎠ ... m α n −an2 ... λ − ann −an1 The characteristic equation Eq. (9) can be transformed to an integer order polynomial equation if all αi ’s are rational numbers. Then the zero solution of system (8) is globally asymptotically stable if all roots λi ’s of the characteristic (polynomial) equation (9) satisfy π |arg(λi )| > γ f or all i. (10) 2

3 Descriptions of the Methods

In this section, we describe the RVIM and the PECE methods that we will use to solve the IVP (1)–(2).

3.1 Revised Variational Iteration Method

In order to describe the RVIM for solving a system of FDEs, we first recall the general principle of the variational iteration method (VIM) [16, 22] for a single fractional-order differential equation. Consider

{}_{C}D_a^{\alpha} u(t) + N(u(t)) = f(t), \quad 0 < \alpha \le 1,    (11)

where N is a nonlinear operator with respect to u(t) and f(t) is a source function. According to the VIM, the correction functional for Eq. (11) is constructed as follows:

u_{n+1}(t) = u_n(t) + {}_{RL}J_a^{\alpha}\left[\lambda\left({}_{C}D_a^{\alpha} u_n(t) + N(u_n(t)) - f(t)\right)\right]
           = u_n(t) + \frac{1}{\Gamma(\alpha)} \int_a^t (t-\tau)^{\alpha-1} \lambda(\tau) \left({}_{C}D_a^{\alpha} u_n(\tau) + N(u_n(\tau)) - f(\tau)\right) d\tau,    (12)

where λ is the Lagrange multiplier, which can be optimally determined via variational theory [23]. Some approximations are required to identify the Lagrange multiplier. The correction functional (12) can be approximated by

u_{n+1}(t) = u_n(t) + \int_a^t \lambda(\tau)\left(u_n'(\tau) + N(\tilde{u}_n(\tau)) - f(\tau)\right) d\tau.    (13)

Applying the restricted variation \tilde{u}_n to the term N(u), we can easily determine the multiplier λ. If we assume that the aforementioned functional is stationary, i.e., δ\tilde{u}_n = 0, then we obtain

δu_{n+1}(t) = δu_n(t) + δ\int_a^t \lambda(\tau)\left(u_n'(\tau) - f(\tau)\right) d\tau.    (14)


This yields the Lagrange multiplier λ = −1. Substituting λ = −1 into the functional (12), we obtain the following iteration formula:

u_{n+1}(t) = u_n(t) - {}_{RL}J_a^{\alpha}\left({}_{C}D_a^{\alpha} u_n(t) + N(u_n(t)) - f(t)\right).    (15)

We must choose the initial approximation u_0(t) to satisfy the initial conditions of the problem. Finally, we can approximate the solution u(t) = \lim_{n\to\infty} u_n(t) by the Nth term approximation u_N(t).

Next, we describe the RVIM for solving a system of fractional-order differential equations. Consider the following incommensurate system:

{}_{C}D_a^{\alpha_i} u_i(t) + N_i(u_1(t), ..., u_m(t)) = f_i(t), \quad 0 < \alpha_i \le 1, \quad i = 1, 2, ..., m,    (16)

where the N_i are operators of the u_j(t), j = 1, 2, ..., m, and the f_i(t) are known functions. The correction functionals for this case are as follows:

u_{i,n+1}(t) = u_{i,n}(t) - {}_{RL}J_a^{\alpha_i}\left({}_{C}D_a^{\alpha_i} u_{i,n}(t) + N_i(u_{1,n}(t), ..., u_{m,n}(t)) - f_i(t)\right), \quad i = 1, 2, ..., m.    (17)

In the same manner, the initial approximations u_{i,0}(t), i = 1, 2, ..., m, can be independently selected as long as they satisfy the initial conditions of the system. The Nth iterates u_{i,N}(t), i = 1, 2, ..., m, can then be used as approximations of the solutions u_i(t) = \lim_{n\to\infty} u_{i,n}(t) of the system.

The iterative scheme of the RVIM is constructed by modifying the iteration formula (17) of the standard VIM: in the formula for u_{i,n+1}(t), the terms u_{1,n}(t), ..., u_{i-1,n}(t) are replaced with the updated values u_{1,n+1}(t), ..., u_{i-1,n+1}(t), respectively. Consequently, the recursive formula of the RVIM for system (16) can be expressed as

u_{i,n+1}(t) = u_{i,n}(t) - {}_{RL}J_a^{\alpha_i}\left({}_{C}D_a^{\alpha_i} u_{i,n}(t) + N_i(u_{1,n+1}(t), ..., u_{i-1,n+1}(t), u_{i,n}(t), ..., u_{m,n}(t)) - f_i(t)\right), \quad i = 1, 2, ..., m.    (18)

This modification can accelerate the convergence of the iterative approximations compared with those obtained by the standard VIM: the corrected iterate u_{i,n+1}(t) of the RVIM is more accurate than the corresponding VIM iterate because updated values are employed in its computation.

3.2 Predictor-Corrector Scheme

The Adams-Bashforth-Moulton type predictor-corrector scheme, or PECE method [10], is widely used to solve FDEs, and we use this technique to numerically solve the IVP (1)–(2). The main ideas of the method are briefly given as follows. Consider the fractional-order initial value problem

{}_{C}D_0^{\alpha} u(t) = f(t, u(t)), \quad u^{(k)}(0) = u_0^{(k)}, \quad k = 0, 1, ..., m-1, \quad 0 \le t \le T, \quad \alpha \in (m-1, m),    (19)


where f is a nonlinear function and m is a positive integer. The IVP (19) can be converted into the Volterra integral equation

u(t) = \sum_{k=0}^{m-1} u_0^{(k)} \frac{t^k}{k!} + \frac{1}{\Gamma(\alpha)} \int_0^t (t-\tau)^{\alpha-1} f(\tau, u(\tau))\,d\tau.    (20)

In order to estimate the integral in (20), we discretize the interval [0, T] by the uniform grid {t_n = nh : n = 0, 1, ..., N} for some integer N with step size h := T/N. Let u_h(t_n) denote the approximation to u(t_n). Suppose that we have already calculated the approximations u_h(t_j), j = 1, 2, ..., n; the approximation u_h(t_{n+1}) of the IVP (19) can then be computed with the PECE method as follows:

u_h(t_{n+1}) = \sum_{k=0}^{m-1} u_0^{(k)} \frac{t_{n+1}^k}{k!} + \frac{h^{\alpha}}{\Gamma(\alpha+2)} f(t_{n+1}, u_h^P(t_{n+1})) + \frac{h^{\alpha}}{\Gamma(\alpha+2)} \sum_{j=0}^{n} a_{j,n+1} f(t_j, u_h(t_j)),    (21)

where

a_{j,n+1} = \begin{cases} n^{\alpha+1} - (n-\alpha)(n+1)^{\alpha}, & \text{if } j = 0, \\ (n-j+2)^{\alpha+1} + (n-j)^{\alpha+1} - 2(n-j+1)^{\alpha+1}, & \text{if } 1 \le j \le n, \\ 1, & \text{if } j = n+1. \end{cases}    (22)

The initial approximation u_h^P(t_{n+1}) in Eq. (21) is called the predictor and is given by

u_h^P(t_{n+1}) = \sum_{k=0}^{m-1} u_0^{(k)} \frac{t_{n+1}^k}{k!} + \frac{1}{\Gamma(\alpha)} \sum_{j=0}^{n} b_{j,n+1} f(t_j, u_h(t_j)),    (23)

where

b_{j,n+1} = \frac{h^{\alpha}}{\alpha}\left((n+1-j)^{\alpha} - (n-j)^{\alpha}\right).    (24)

4 Main Results

In this section, we first establish the condition under which an equilibrium of an incommensurate fractional-order Bloch equation is locally asymptotically stable; in particular, a condition for the asymptotic stability of the FIVP in Eqs. (1)–(2) is derived. Second, we exhibit the use of the RVIM and the PECE method as described above to solve the FIVP in Eqs. (1)–(2) with the starting point t = a = 0. Consider the following incommensurate fractional-order Bloch equation:

{}_{C}D_0^{\alpha_1} M_x(t) = \omega_0 M_y(t) - \frac{M_x(t)}{T_2},
{}_{C}D_0^{\alpha_2} M_y(t) = -\omega_0 M_x(t) - \frac{M_y(t)}{T_2},    (25)
{}_{C}D_0^{\alpha_3} M_z(t) = \frac{M_0 - M_z(t)}{T_1},


where α_i ∈ (0, 1] for i = 1, 2, 3, with the initial conditions

M_x(0) = M_{x0}, \quad M_y(0) = M_{y0}, \quad M_z(0) = M_{z0}.    (26)

The asymptotic stability of the system in Eqs. (25) and (26) can be investigated according to Theorem 2 as follows.

Theorem 3. The characteristic polynomial equation of Eq. (25) is

T_1 T_2^2 \lambda^{m(\alpha_1+\alpha_2+\alpha_3)} + \omega_0^2 T_1 T_2^2 \lambda^{m\alpha_3} + T_1 T_2 \lambda^{m(\alpha_1+\alpha_3)} + T_1 T_2 \lambda^{m(\alpha_2+\alpha_3)} + T_1 \lambda^{m\alpha_3} + T_2^2 \lambda^{m(\alpha_1+\alpha_2)} + \omega_0^2 T_2^2 + T_2 \lambda^{m\alpha_1} + T_2 \lambda^{m\alpha_2} + 1 = 0,    (27)

where m is the least common multiple of the denominators u_i of the α_i's, with α_i = v_i/u_i, v_i, u_i ∈ ℤ^+ for i = 1, 2, 3. The equilibrium point x* = [0, 0, M_0]^T of the system in Eqs. (25) and (26) is globally asymptotically stable if all roots λ_i of Eq. (27) satisfy the condition

|\arg(\lambda_i)| > \frac{\pi}{2m} \quad \text{for all } i.    (28)

Proof. The asymptotic stability condition is determined from the characteristic polynomial equation

\det \begin{pmatrix} \lambda^{m\alpha_1} + \frac{1}{T_2} & -\omega_0 & 0 \\ \omega_0 & \lambda^{m\alpha_2} + \frac{1}{T_2} & 0 \\ 0 & 0 & \lambda^{m\alpha_3} + \frac{1}{T_1} \end{pmatrix} = 0.    (29)

Equation (29) is equivalent to Eq. (27). According to Theorem 2 above, the asymptotic stability of the equilibrium x* is obtained if the condition |\arg(\lambda_i)| > \frac{\pi}{2m} holds for every root λ_i of Eq. (27). □

The following initial conditions and parameter values are employed in our simulations:

M_{x0} = 0, \quad M_{y0} = 100, \quad M_{z0} = 0, \quad \omega_0 = 60\pi \text{ rad/s}, \quad T_1 = 1 \text{ s}, \quad T_2 = 20 \times 10^{-3} \text{ s}, \quad M_0 = 100.    (30)

The following two examples use Theorem 3 to show that the equilibrium point of the incommensurate and commensurate fractional-order Bloch equations is asymptotically stable.

Example 1. Consider the incommensurate fractional-order Bloch equation (25) with α_1 = 1, α_2 = 0.9, α_3 = 0.8. Using the parameter values in Eq. (30), we obtain m = 10 and the characteristic polynomial equation

\lambda^{27} + \lambda^{19} + 50\lambda^{18} + 50\lambda^{17} + 50\lambda^{10} + 50\lambda^{9} + (3600\pi^2 + 2500)\lambda^{8} + 3600\pi^2 + 2500 = 0.    (31)

All roots λ_i of Eq. (31), i = 1, 2, ..., 27, are shown in Table 1; they are easy to compute numerically, as in the sketch below.
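As a sanity check (a sketch added for illustration, not part of the paper), the roots of Eq. (31) and the argument condition (32) can be verified with NumPy:

```python
import numpy as np

# coefficients of Eq. (31): degree 27, highest power first
c = np.zeros(28)
K = 3600*np.pi**2 + 2500          # the constant 3600*pi^2 + 2500
for k, coef in [(27, 1), (19, 1), (18, 50), (17, 50), (10, 50), (9, 50), (8, K), (0, K)]:
    c[27 - k] = coef              # c[27 - k] multiplies lambda**k
roots = np.roots(c)
print(np.min(np.abs(np.angle(roots))))             # ~0.1936 > pi/20 ~ 0.1571
print(np.all(np.abs(np.angle(roots)) > np.pi/20))  # condition (32): prints True
```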


Table 1. All eigenvalues λ_i, i = 1, 2, ..., 27, of the incommensurate fractional-order Bloch equation (25) with α_1 = 1, α_2 = 0.9, α_3 = 0.8.

−1.7555             −0.4005 ± 1.7063i   −0.3827 ± 0.9239i   0.9239 ± 0.3827i    1.7082 ± 0.3349i
−0.9239 ± 0.3827i   −0.9655 ± 1.4361i   1.6381 ± 0.5541i    −1.3748 ± 1.0905i   1.1428 ± 1.3074i
0.7407 ± 1.5848i    0.1076 ± 1.7296i    1.5572 ± 0.7880i    0.3827 ± 0.9239i

After a short algebraic computation of the arguments of the λ_i's, we obtain

|\arg(\lambda_i)| > \frac{\pi}{20}, \quad i = 1, 2, ..., 27.    (32)

By Theorem 2, the equilibrium point x* = [0, 0, 100]^T of system (25) with α_1 = 1, α_2 = 0.9, α_3 = 0.8 is asymptotically stable. □

Example 2. Consider the commensurate fractional-order Bloch equation in Eq. (25) with equal fractional orders α_1 = α_2 = α_3 = 0.9. In the same manner as Example 1, the corresponding characteristic polynomial equation is

\lambda^{27} + 101\lambda^{18} + (3600\pi^2 + 2600)\lambda^{9} + 3600\pi^2 + 2500 = 0.    (33)

All roots λ_i of Eq. (33), i = 1, 2, ..., 27, are shown in Table 2.

Table 2. All eigenvalues λ_i, i = 1, 2, ..., 27, of the commensurate fractional-order Bloch equation (25) with α_1 = α_2 = α_3 = 0.9.

−1.0000             −0.0518 ± 1.7959i   −1.1940 ± 1.3424i   1.7596 ± 0.3628i    −1.5294 ± 0.9427i
0.9397 ± 0.3420i    −0.5656 ± 1.7052i   −0.1736 ± 0.9848i   −1.7776 ± 0.2609i   −0.7660 ± 0.6428i
1.1147 ± 1.4090i    0.5000 ± 0.8660i    0.6629 ± 1.6699i    1.5811 ± 0.8531i

After a short algebraic computation of the arguments of the λ_i's, we again obtain

|\arg(\lambda_i)| > \frac{\pi}{20}, \quad i = 1, 2, ..., 27.    (34)

By Theorem 2, the equilibrium point x* = [0, 0, 100]^T of system (25) with α_1 = α_2 = α_3 = 0.9 is asymptotically stable. □

Next, we exhibit the use of the RVIM and the PECE method as described above to solve the FIVP in Eqs. (1)–(2) with the starting point t = a = 0. The exact solutions of the integer-order Bloch equation, i.e., Eq. (25) with α_1 = α_2 = α_3 = 1, are given in [13] and in Eq. (19) of [24], so the results obtained using the two methods can be compared against them in the integer-order case.


4.1 Applications of the RVIM and the PECE Method

The present section is devoted to the use of the RVIM and the PECE method to obtain analytical and numerical solution formulas for the IVP (1)–(2). Firstly, we apply the iteration formula (18) of the RVIM with λ(τ) = −1 to the problem and obtain the following iteration formulas:

M_x^{n+1}(t) = M_x^n(t) + {}_{RL}J_a^{\alpha}\left[\lambda(\tau)\left({}_{C}D_a^{\alpha} M_x^n(\tau) - \omega_0 M_y^n(\tau) + \frac{M_x^n(\tau)}{T_2}\right)\right]
             = M_x^n(a) - {}_{RL}J_a^{\alpha}\left[-\omega_0 M_y^n(\tau) + \frac{M_x^n(\tau)}{T_2}\right],    (35)

M_y^{n+1}(t) = M_y^n(t) + {}_{RL}J_a^{\alpha}\left[\lambda(\tau)\left({}_{C}D_a^{\alpha} M_y^n(\tau) + \omega_0 M_x^n(\tau) + \frac{M_y^n(\tau)}{T_2}\right)\right]
             = M_y^n(a) - {}_{RL}J_a^{\alpha}\left[\omega_0 M_x^{n+1}(\tau) + \frac{M_y^n(\tau)}{T_2}\right],    (36)

M_z^{n+1}(t) = M_z^n(t) + {}_{RL}J_a^{\alpha}\left[\lambda(\tau)\left({}_{C}D_a^{\alpha} M_z^n(\tau) - \frac{M_0 - M_z^n(\tau)}{T_1}\right)\right]
             = M_z^n(a) + {}_{RL}J_a^{\alpha}\left[\frac{M_0 - M_z^n(\tau)}{T_1}\right], \quad n = 0, 1, 2, ...,    (37)

in which the initial approximations are chosen to be M_x^0(t) = M_{x0}, M_y^0(t) = M_{y0}, and M_z^0(t) = M_{z0}.
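Since every iterate generated by (35)–(37) is a finite sum of power terms c t^γ, the fractional integrals can be evaluated term by term with property 2 of Remark 1. The following symbolic sketch (an illustration under the parameter values (30) with a common order α = 0.9 and a = 0, not the author's code) reproduces the leading coefficients of Eq. (45) below:

```python
import sympy as sp

t = sp.symbols('t', positive=True)
al = sp.Rational(9, 10)                                # common order alpha = 0.9
w0, T1, T2, M0 = 60*sp.pi, 1, sp.Rational(1, 50), 100  # parameter values of Eq. (30)

def J(expr):
    # RL J_0^alpha of a sum of c*t**g terms, term by term
    # via property 2 of Remark 1 (with a = 0)
    out = sp.Integer(0)
    for term in sp.Add.make_args(sp.expand(expr)):
        c, g = term.as_coeff_exponent(t)
        out += c * sp.gamma(g + 1) / sp.gamma(g + al + 1) * t**(g + al)
    return out

Mx, My, Mz = sp.Integer(0), sp.Integer(100), sp.Integer(0)  # initial approximations
for _ in range(3):                                          # three RVIM sweeps of (35)-(37)
    Mx = 0 - J(-w0*My + Mx/T2)     # Eq. (35)
    My = 100 - J(w0*Mx + My/T2)    # Eq. (36), uses the already-updated Mx
    Mz = 0 + J((M0 - Mz)/T1)       # Eq. (37)

# leading coefficient of Mx: 6000*pi/Gamma(1.9) = 19598.9..., as in Eq. (45)
lead = sum(c for term in sp.Add.make_args(sp.expand(Mx))
           for (c, g) in [term.as_coeff_exponent(t)] if g == al)
print(sp.N(lead, 6))
```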

Secondly, we apply the PECE method in Eqs. (21)–(24) to the IVP (1)–(2) by discretizing the time interval with points {t_n}. We then obtain the numerical schemes for the approximations M_x^{h,n} = M_x^h(t_n), M_y^{h,n} = M_y^h(t_n), M_z^{h,n} = M_z^h(t_n) as follows:

M_x^{h,n+1} = M_{x0} + \frac{h^{\alpha}}{\Gamma(\alpha+2)}\left(\omega_0 M_y^{P,h,n+1} - \frac{M_x^{P,h,n+1}}{T_2}\right) + \frac{h^{\alpha}}{\Gamma(\alpha+2)} \sum_{j=0}^{n} a_{1,j,n+1}\left(\omega_0 M_y^{h,j} - \frac{M_x^{h,j}}{T_2}\right),    (38)

M_y^{h,n+1} = M_{y0} + \frac{h^{\alpha}}{\Gamma(\alpha+2)}\left(-\omega_0 M_x^{P,h,n+1} - \frac{M_y^{P,h,n+1}}{T_2}\right) + \frac{h^{\alpha}}{\Gamma(\alpha+2)} \sum_{j=0}^{n} a_{2,j,n+1}\left(-\omega_0 M_x^{h,j} - \frac{M_y^{h,j}}{T_2}\right),    (39)

M_z^{h,n+1} = M_{z0} + \frac{h^{\alpha}}{\Gamma(\alpha+2)}\left(\frac{M_0 - M_z^{P,h,n+1}}{T_1}\right) + \frac{h^{\alpha}}{\Gamma(\alpha+2)} \sum_{j=0}^{n} a_{3,j,n+1}\left(\frac{M_0 - M_z^{h,j}}{T_1}\right),    (40)

in which

M_x^{P,h,n+1} = M_{x0} + \frac{1}{\Gamma(\alpha)} \sum_{j=0}^{n} b_{1,j,n+1}\left(\omega_0 M_y^{h,j} - \frac{M_x^{h,j}}{T_2}\right),    (41)

M_y^{P,h,n+1} = M_{y0} + \frac{1}{\Gamma(\alpha)} \sum_{j=0}^{n} b_{2,j,n+1}\left(-\omega_0 M_x^{h,j} - \frac{M_y^{h,j}}{T_2}\right),    (42)

M_z^{P,h,n+1} = M_{z0} + \frac{1}{\Gamma(\alpha)} \sum_{j=0}^{n} b_{3,j,n+1}\left(\frac{M_0 - M_z^{h,j}}{T_1}\right),    (43)


where a_{l,j,n+1} and b_{l,j,n+1} for l = 1, 2, 3 are defined in (22) and (24), respectively. We will use the discretized formulas (38)–(43) to obtain the numerical solutions of the IVP (1)–(2) in the following section.

4.2 Simulation Results

In this section, the simulation results obtained using the formulas in Sect. 4.1 are graphically demonstrated. Some numerical results are not shown here; they can be found in [24]. Given the parameter values in Eq. (30) and the starting point a = 0, we exhibit the simulation results of the IVP (1)–(2) obtained using the exact solution formulas, the RVIM in Eqs. (35)–(37), and the PECE method in Eqs. (38)–(43) for α = 1, 0.9, 0.8. In particular, the absolute errors of the numerical results generated by the RVIM and the PECE method, compared with the exact solution, are calculated for α = 1. Moreover, the absolute differences between the numerical results of the RVIM and the PECE method are measured for α = 0.9, 0.8.

The following results are for α = 1. Using the iteration formulas (35)–(37), we obtain the 80th iterates M_x^{80}(t), M_y^{80}(t), and M_z^{80}(t) as follows:

M_x^{80}(t) = 18849.6t - 942478t^2 - 8.80607 \times 10^{7} t^3 + \cdots - 7.11155 \times 10^{82} t^{158} - 2.0116 \times 10^{81} t^{159},
M_y^{80}(t) = 100 - 5000t - 1.65153 \times 10^{6} t^2 + \cdots + 8.48415 \times 10^{82} t^{159} + 2.36986 \times 10^{81} t^{160},    (44)
M_z^{80}(t) = 100t - 50t^2 + 16.6667t^3 + \cdots + 1.1178 \times 10^{-115} t^{79} - 1.39724 \times 10^{-117} t^{80}.
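For comparison, the scheme (38)–(43) is just the vector form of the scalar PECE sketch given after Eq. (24). A compact implementation (again an illustration, not the author's code), assuming a common order α for the three equations and the parameter values of Eq. (30):

```python
import numpy as np
from math import gamma

w0, T1, T2, M0 = 60*np.pi, 1.0, 0.02, 100.0
F = lambda M: np.array([w0*M[1] - M[0]/T2,        # right-hand sides of system (25)
                        -w0*M[0] - M[1]/T2,
                        (M0 - M[2])/T1])

def bloch_pece(alpha, T, N, M_init=(0.0, 100.0, 0.0)):
    """PECE scheme (38)-(43) with a common order alpha for all three equations."""
    h = T / N
    M = np.empty((N + 1, 3)); M[0] = M_init
    rhs = np.empty((N + 1, 3)); rhs[0] = F(M[0])
    c = h**alpha / gamma(alpha + 2)
    for n in range(N):
        j = np.arange(n + 1)
        b = (h**alpha / alpha) * ((n + 1 - j)**alpha - (n - j)**alpha)
        Mp = M[0] + b.dot(rhs[:n + 1]) / gamma(alpha)          # predictors (41)-(43)
        a = (n - j + 2.0)**(alpha+1) + (n - j)**(alpha+1) - 2.0*(n - j + 1.0)**(alpha+1)
        a[0] = n**(alpha + 1) - (n - alpha)*(n + 1)**alpha
        M[n + 1] = M[0] + c*F(Mp) + c*a.dot(rhs[:n + 1])       # correctors (38)-(40)
        rhs[n + 1] = F(M[n + 1])
    return M

M = bloch_pece(alpha=1.0, T=0.15, N=1500)   # step size h = 1e-4, as used for Fig. 1
```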

The simulation results of the problem for α = 1 using all of the methods, i.e., the exact formulas, the RVIM in Eq. (44), and the PECE method in Eqs. (38)–(43) with step size h = 10^{-4}, are shown in Fig. 1. It can easily be observed that the numerical simulations obtained by the RVIM and the PECE method are in very good agreement with the exact solutions. The absolute errors between the numerical solutions of the RVIM and the PECE method and the exact solutions are shown in Table I and Table II of [24], respectively. The approximate solution component M_z(t) obtained by the two methods is the most accurate when compared with its corresponding exact solution. For larger t, the approximate components M_x(t) and M_y(t) achieved by the RVIM become significantly less accurate compared with their exact counterparts, whereas the fluctuation of the absolute errors of the PECE method remains appreciably lower than that of the RVIM as t increases. Judging from these comparisons with the exact solutions, the two methods are reliable and efficient tools for obtaining approximate solutions of the problem for α = 0.9, 0.8 as well.

Next, we simulate numerical results of the problem for α = 0.9, 0.8. Applying the RVIM to the problem via the iteration formulas (35)–(37), the 80th iterates of the solutions for α = 0.9 are expressed as

M_x^{80}(t) = 19598.9t^{0.9} - 1.12435 \times 10^{6} t^{1.8} - 1.26686 \times 10^{8} t^{2.7} + \cdots - 1.81327 \times 10^{117} t^{142.2} - 9.3589 \times 10^{115} t^{143.1},
M_y^{80}(t) = 100 - 5198.77t^{0.9} - 1.97022 \times 10^{6} t^{1.8} + \cdots + 3.94721 \times 10^{117} t^{143.1} + 2.01309 \times 10^{116} t^{144},    (45)
M_z^{80}(t) = 103.975t^{0.9} - 59.6484t^{1.8} + 23.9771t^{2.7} + \cdots + 7.67141 \times 10^{-101} t^{71.1} - 1.63307 \times 10^{-102} t^{72}.

Fig. 1. Simulation comparisons of the solutions M_x(t), M_y(t), M_z(t) for the IVP (1)–(2) using the exact solution formulas, the RVIM, and the PECE method for α = 1.

The 80th iterates obtained using the RVIM for α = 0.8 are

M_x^{80}(t) = 20238.2t^{0.8} - 1.3185 \times 10^{6} t^{1.6} - 1.77232 \times 10^{8} t^{2.4} + \cdots - 8.01025 \times 10^{150} t^{126.4} - 7.46033 \times 10^{149} t^{127.2},
M_y^{80}(t) = 100 - 5368.36t^{0.8} - 2.31044 \times 10^{6} t^{1.6} + \cdots + 3.14647 \times 10^{151} t^{127.2} + 2.89747 \times 10^{150} t^{128},    (46)
M_z^{80}(t) = 107.367t^{0.8} - 69.9484t^{1.6} + 33.5435t^{2.4} + \cdots + 2.19822 \times 10^{-86} t^{63.2} - 7.88103 \times 10^{-88} t^{64}.

In a similar fashion, the numerical simulation results of the problem for α = 0.9, 0.8, utilizing the RVIM in Eqs. (45)–(46) and the PECE method in Eqs. (38)–(43) with step size h = 10^{-4}, are graphically portrayed in Figs. 2 and 3, respectively. It is not difficult to observe from these figures that the numerical results from the RVIM and the PECE method are still in good agreement for α = 0.9, 0.8. The absolute differences between the results obtained by the two methods are shown numerically in Table III and Table IV of [24] for α = 0.9 and α = 0.8, respectively.

Fig. 2. Simulation comparisons of the solutions M_x(t), M_y(t), M_z(t) for the IVP (1)–(2) using the RVIM and the PECE method for α = 0.9.

Fig. 3. Simulation comparisons of the solutions M_x(t), M_y(t), M_z(t) for the IVP (1)–(2) using the RVIM and the PECE method for α = 0.8.


We can conclude from these two tables that the two methods provide numerical data that are quite close to each other at the specified values of t. In particular, the solution component M_z(t) obtained via the two methods has the lowest discrepancy.

5 Concluding Remarks

In this work, we have proved the asymptotic stability of the equilibrium point of the FIVP (1)–(2) for the NMR Bloch equations. Moreover, we have obtained approximate solutions, including approximate analytical solutions computed via the RVIM and approximate numerical solutions computed via the PECE method. The simulations of the solutions calculated by the two methods have been compared for the integer order α = 1 and the fractional orders α = 0.9, 0.8. The exact solutions of the problem for α = 1 have been used to measure the accuracy of the solutions obtained by the two approximate methods. The comparison of the simulation results of the two approximate methods with the exact solution for α = 1 shows that the approximate methods provide quite accurate solutions. The comparisons of the approximate analytical and numerical results have shown that the approximate solutions are in good agreement for α = 0.9, 0.8 as well. Furthermore, the RVIM and PECE methods could be reliably and efficiently applied to solve other fractional-order differential equation systems arising in engineering and applied science problems.

Acknowledgements. The authors would like to thank the editors and the anonymous referees for their valuable suggestions on the improvement of this work. The present work is extended and revised from the corresponding conference paper [24], which was financially supported by the Faculty of Applied Science, King Mongkut's University of Technology North Bangkok (contract no. 5942106).

References

1. Podlubny, I.: Fractional Differential Equations. Academic Press, New York (1989)
2. Obembe, A.D., Al-Yousef, H.Y., Hossain, M.E., Abu-Khamsin, S.A.: Fractional derivatives and their applications in reservoir engineering problems: a review. J. Petrol. Sci. Eng. 157, 312–327 (2017)
3. Guner, O., Bekir, A.: The Exp-function method for solving nonlinear space-time fractional differential equations in mathematical physics. J. Assoc. Arab Universities Basic Appl. Sci. 24, 277–282 (2017)
4. Sirisubtawee, S., Koonprasert, S., Khaopant, C., Porka, W.: Two reliable methods for solving the (3 + 1)-dimensional space-time fractional Jimbo-Miwa equation. Math. Probl. Eng. (2017)
5. Carvalho, A., Pinto, C.M.: A delay fractional order model for the co-infection of malaria and HIV/AIDS. Int. J. Dyn. Control 5(1), 168–186 (2017)
6. Dogan, N.: Numerical solution of chaotic Genesio system with multi-step Laplace Adomian decomposition method. Kuwait J. Sci. 40(1) (2013)
7. Sekson Sirisubtawee, S.K., Kaewta, S.: Duan-Rach modified decomposition method for solving some types of nonlinear fractional multi-point boundary value problems. J. Electr. Eng. 102(10), 2143–2176 (2017)


8. Ghazanfari, B., Veisi, F.: Homotopy analysis method for the fractional nonlinear equations. J. King Saud University Sci. 23(4), 389–393 (2011)
9. Martin, O.: A modified variational iteration method for the analysis of viscoelastic beams. Appl. Math. Model. 40(17), 7988–7995 (2016)
10. Diethelm, K., Ford, N.J., Freed, A.D.: A predictor-corrector approach for the numerical solution of fractional differential equations. Nonlinear Dyn. 29(1), 3–22 (2002)
11. Petráš, I.: Fractional-order feedback control of a DC motor. J. Electr. Eng. 60(3), 117–128 (2009)
12. Zhao, E., Chao, T., Wang, S., Yang, M.: Finite-time formation control for multiple flight vehicles with accurate linearization model. Aerosp. Sci. Technol. 71, 90–98 (2017)
13. Magin, R., Feng, X., Baleanu, D.: Solving the fractional order Bloch equation. Concepts Magn. Reson. Part A 34(1), 16–23 (2009)
14. Qin, Y., Lu, C., Li, L.: Multi-scale cyclone activity in the Changjiang River-Huaihe River valleys during spring and its relationship with rainfall anomalies. Adv. Atmos. Sci. 34(2), 246–257 (2017)
15. Baleanu, D., Magin, R.L., Bhalekar, S., Daftardar-Gejji, V.: Chaos in the fractional order nonlinear Bloch equation with delay. Commun. Nonlinear Sci. Numer. Simul. 25(1), 41–49 (2015)
16. Ünlü, C., Jafari, H., Baleanu, D.: Revised variational iteration method for solving systems of nonlinear fractional-order differential equations. Abstr. Appl. Anal. (2013)
17. Zayernouri, M., Matzavinos, A.: Fractional Adams-Bashforth/Moulton methods: an application to the fractional Keller-Segel chemotaxis system. J. Comput. Phys. 317, 1–14 (2016)
18. Matignon, D.: Stability properties for generalized fractional differential systems. In: Proceedings of Fractional Differential Systems: Models, Methods and Applications, vol. 5, pp. 145–158 (1998)
19. Tavazoei, M.S., Haeri, M.: A necessary condition for double scroll attractor existence in fractional-order systems. Phys. Lett. A 367, 102–113 (2007)
20. Tavazoei, M.S., Haeri, M.: Limitations of frequency domain approximation for detecting chaos in fractional order systems. Nonlinear Anal. 69, 1299–1320 (2008)
21. Deng, W., Li, C., Lu, J.: Stability analysis of linear fractional differential system with multiple time delays. Nonlinear Dyn. 48, 409–416 (2007)
22. He, J.-H.: Approximate analytical solution for seepage flow with fractional derivatives in porous media. Comput. Methods Appl. Mech. Eng. 167(1–2), 57–68 (1998)
23. Inokuti, M., Sekine, H., Mura, T.: General use of the Lagrange multiplier in nonlinear mathematical physics. In: Variational Method in the Mechanics of Solids, vol. 33, no. 5, pp. 156–162 (1978)
24. Sirisubtawee, S.: Comparison of analytical and numerical solutions of fractional-order Bloch equations using reliable methods. In: Lecture Notes in Engineering and Computer Science: Proceedings of the International MultiConference of Engineers and Computer Scientists, Hong Kong, 14–16 March 2018, pp. 467–472 (2018)

A Hybrid Delphi Multi-criteria Sorting Approach for Polypharmacy Evaluations

Anissa Frini1, Caroline Sirois2, and Marie-Laure Laroche3

1 University of Quebec at Rimouski, Lévis, Canada, [email protected]
2 Faculty of Medicine, Laval University, Quebec, Canada, [email protected]
3 Faculty of Medicine, University of Limoges, Limoges, France, [email protected]

Abstract. With the intensification of chronic disease among older people, the concurrent use of different drugs (polypharmacy) is becoming increasingly frequent. However, there is no established manner to determine whether a polypharmacy is appropriate or not. We propose an original method of classifying polypharmacy using the results of a Delphi survey and multi-criteria decision-aid methods. To do this, we provided clinicians with a list of drugs that could potentially be prescribed to a typical elderly person suffering from three diseases (diabetes, chronic obstructive pulmonary disease, and heart failure). Clinicians expressed their opinions on a 5-point Likert scale, allowing for hesitation between two or more answers. They evaluated the risks, benefits, and impacts of each drug on the patient's quality of life. We then aggregated these evaluations in order to obtain, for each drug, a multi-criteria evaluation vector representing the collective opinion of the clinicians consulted. Subsequently, the ELECTRE Tri-C and ELECTRE Tri multi-criteria sorting methods were used to evaluate the polypharmacy and assign it to one of the following three categories: appropriate, more or less appropriate, or not appropriate. The proposed approach is innovative and enables the integration of a variety of conflicting criteria in the evaluation of polypharmacy quality. It also allows clinicians to express their opinion, and their hesitation where relevant, linguistically.

Keywords: Decision aid · Delphi · ELECTRE Tri methods · Polypharmacy · Quality evaluation · Multi-criteria sorting

1 Introduction

As the population ages and chronic diseases increase, a growing number of elderly people take a large quantity of drugs. However, the benefits of the concurrent use of drugs, named polypharmacy, are scarcely reported in the literature. On the contrary, polypharmacy has been associated with a number of adverse effects, such as increased risk of hospitalization, geriatric syndromes, and inappropriate prescribing. Sorting out appropriate and inappropriate polypharmacy is a complex decision problem in view of the multiple conflicting criteria that require simultaneous consideration (for example, benefits, risks, improvement of quality of life, age, interaction between diseases, individuals' and professionals' preferences, and the diversity of person-specific damages).


Polypharmacy remains a little-known and complex phenomenon. No consensus has been reached as to how to measure it [1, 2]. The distinction between the notions of appropriate and inappropriate polypharmacy is also insufficiently clear [3]. In sum, there is no specific procedure for sorting out appropriate and inappropriate polypharmacy. It should also be noted that evaluating polypharmacy quality requires an analysis of the interactions between the component drugs as well; this analysis is based on the guidelines and medical documents related to interactions.

A review of the literature reveals that multi-criteria decision support is becoming increasingly popular in healthcare decision-making problems. Application contexts vary and cover a number of specialties. However, within this range of applications, polypharmacy quality evaluation is an unexplored field of research, hence the originality and innovative nature of this paper. Given the conflicting decision criteria (benefits, risks, and improvement of quality of life of polypharmacies), the heterogeneity of measuring scales, the multiple and varied viewpoints of experts, and the problem of sorting into predefined categories (appropriate, more or less appropriate, and inappropriate), we believe it is relevant to analyze the extent to which multi-criteria classification methods can be used to evaluate polypharmacy quality.

The general objective of this article is to propose a hybrid Delphi multi-criteria sorting method, taking into consideration hesitation in evaluations, in order to distinguish appropriate from inappropriate polypharmacy. Section 2 presents the literature review of multi-criteria classification methods. Section 3 puts forward a novel approach to polypharmacy evaluation and classification. Section 4 offers an illustration of the proposed approach.

2 Previous Work

Multi-criteria classification methods provide an opportunity to deal with the issue of sorting alternatives into pre-defined and non-pre-defined categories. In the early 1990s, multi-criteria sorting methods were designed either for classification into pre-defined categories (ordinal sorting) or into non-pre-defined categories (nominal sorting). Later on, Perny [4] introduced the idea of filtering on the basis of comparing alternatives with respect to reference points in order to determine their category or class. Shortly afterwards, Henriet [5] distinguished multi-criteria assignment focused on building an assignment function and taking into consideration the decision-maker's preferences. Existing multi-criteria sorting methods derive either from the single-criterion synthesis approach (UTADIS, M.H.DIS, etc.) or from the outranking approach (ELECTRE Tri, ELECTRE Tri-C, ELECTRE SORT, SMAA-Tri, PROAFTN, etc.). One of the popular single synthesizing criterion approaches is the UTADIS (UTilités Additives DIScriminantes) method developed by Jacquet-Lagrèze [6] and improved by Zopounidis and Doumpos [7]. UTADIS is an ordinal sorting method based on utility functions: it assigns an overall utility to each action and to the profile limits, and classifies actions by comparing their utilities with the profile limits. Although this method has a solid theoretical foundation, it excludes incomparability, allows for compensation, and takes into account only criteria measured on cardinal scales. The M.H.DIS method, another single synthesizing criterion approach, was designed for a multi-group context [8].


Among outranking sorting methods, the ELECTRE Tri method [9, 10] has been widely used for ordinal sorting. With this method, reference actions are used in order to segment the criteria space into categories; each category has two limits (higher and lower) defined by two reference actions. In order to compare actions with profile limits, ELECTRE Tri builds an outranking relation using concordance, discordance, and veto indices. More recently, a number of variants of the ELECTRE Tri method were put forward, such as SMAA-TRI [11], ELECTRE Tri-C [12], and ELECTRE SORT [13]. Other sorting methods have been proposed as part of the outranking approach, such as the PROMETHEE-based classification method [14, 15] and PROAFTN for nominal sorting [16–18], which are based on a fuzzy assignment procedure. Recent advances in MCDA sorting and classification include UTADIS-GMS [21], MAUT/MAVT [22], interactive sorting [23, 24], and supervised machine learning [25]. For a recent review of outranking MCDA sorting and clustering methods, see [26–28].

3 Proposed Approach for Polypharmacy Evaluation

The proposed approach for polypharmacy quality evaluation is structured in four stages (Fig. 1). Stage 1 consists in formulating the clinical case. Stage 2 focuses on collecting information regarding the drugs: a Delphi process enables experts to express their opinions regarding the benefits, risks, and improvement of the quality of life of each listed drug, and stops once a 70% agreement is reached on each evaluation of the drug. The experts' viewpoints are then aggregated into a common position concerning each drug. Stage 3 relates to the study of drug interactions, in particular major ones, and to evaluating each polypharmacy by aggregating the evaluations of the individual drugs that compose it. Finally, the objective of stage 4 is to assign the polypharmacy to one of the categories: appropriate, more or less appropriate, or inappropriate.

3.1 Stage 1: Formulating a Clinical Case

The clinical case is that of a man of 73 years old or more suffering from type 2 diabetes, heart failure, and chronic obstructive pulmonary disease. This choice was motivated, first, by the fact that chronic diseases are frequent in elderly people and, second, by the fact that a number of drugs that are beneficial in the treatment of one of these diseases are harmful in the treatment of the others.

3.2 Stage 2: Data Collection on Drugs

Fig. 1. Polypharmacy evaluation approach [29] (flowchart: Stage 1, describe a clinical case; Stage 2, data collection on drugs, via e-Delphi and aggregation of experts' opinions for each drug; Stage 3, polypharmacy evaluation, comprising the study of interactions between drugs and the evaluation of the polypharmacy's risk, benefit, and impact on quality of life; Stage 4, polypharmacy classification as appropriate or not appropriate.)

Delphi Study

We conducted a Delphi study, a well-established technique designed to collect expert opinions with the purpose of finding consensus for defined clinical problems. A panel of international experts in pharmacy, pharmacology, or geriatrics was invited to participate in the Delphi survey, which aimed at finding consensus within a pre-defined maximum of three rounds. We aimed to recruit 30 individuals. We constructed a list of 181 potential participants based on their expertise in clinical practice or in research for the treatment of the targeted chronic diseases or geriatrics in general. Experts were recruited across different countries to ensure international representation. Of the 181 experts contacted, 37 agreed to participate, and a panel of 16 experts completed the three rounds of the Delphi.

The three rounds of the Delphi survey were conducted from September 2017 to January 2018 using the LimeSurvey® web-based platform. The expert panelists were asked to rate the benefits, risks, and positive impact on quality of life of different drugs


that could be used by our clinical case (a 73-year-old man with the three chronic conditions previously mentioned: type 2 diabetes, heart failure, and chronic obstructive pulmonary disease). The participants rated each dimension of each drug (or, in some instances, the absence of the drug) using a 5-level Likert scale (very high to very low). They necessarily and implicitly took into account potential interactions between drugs, contraindications, and precautions depending on comorbidity and predispositions. Expert panelists could also comment on each question of every round.

Consensus was reached on 95% of items (166/174). Only two drug classes were associated with both the highest category of benefits and positive impact on quality of life and the lowest risk category. Nine other drugs/classes of drugs were categorized within the highest benefits level. Fifteen drugs were included in the highest level of risk. The results of this Delphi survey suggest that drugs with evidence-based effectiveness that are recommended in clinical guidelines for single conditions are generally considered positive for multimorbid older patients. Nevertheless, a non-negligible number of medications were considered negative or very negative by our panelists.

Aggregation of Experts' Opinions

In the Delphi, each expert is given a five-point Likert scale (l_1 = very low, l_2 = low, l_3 = neutral, l_4 = high, l_5 = very high). Let E = (E_1, E_2, ..., E_n) be the set of linguistic evaluations provided by the n consulted specialists for a given drug. The common view E_c is calculated as follows:

E_c = Average(d(E_1, l_5), d(E_2, l_5), ..., d(E_n, l_5)),

where d is the measure of distance developed in [19], which specifies the distance between each pair of adjacent linguistic expressions (Fig. 2); the distance between two non-adjacent evaluations is the sum of the distances along the shortest path between them. In these distances, a and b are two constants associated with imprecision. The constant a penalizes the distance (imprecision) as the degree of hesitation rises; for example, we add a for going from [l_4, l_5] to [l_3, l_5]. The constant b penalizes the distance according to the remoteness of the linguistic expression from l_5, which here represents maximum precision; for example, we add 2b for going from [l_3, l_5] to [l_2, l_5] and 3b for going from [l_2, l_5] to [l_1, l_5], because we are even farther from l_5. The constants a and b belong to the set T_g, where g is the number of levels of the linguistic scale:

T_g = \left\{ (a, b) \ge 0 \,\middle|\, a + b(g-1) < \frac{1}{2}\cdot\frac{1}{g-2} \right\}, \quad \text{if } g \text{ is odd}.    (1)

Once the linguistic evaluations have been aggregated, we obtain a multi-criteria evaluation vector for each drug, which provides the aggregated level of benefit, risk, and improvement of quality of life for that drug.

Illustration: Let us consider four specialists providing the following evaluations in favor of a drug administered to a patient: E_1 = l_4, E_2 = l_5, E_3 = [l_4, l_5], E_4 = [l_3, l_5]. We take a = b = 0.1; to illustrate the aggregation, the values of a and b are chosen arbitrarily from the set of values that satisfy Eq. (1).


Fig. 2. Distances between adjacent linguistic expressions [19]

d(E_1, l_5) = d(l_4, l_5) = 2; d(E_2, l_5) = d(l_5, l_5) = 0; d(E_3, l_5) = d([l_4, l_5], l_5) = 1 + a = 1.1; d(E_4, l_5) = d([l_3, l_5], l_5) = (1 + a + b) + (1 + a) = 2.3.

E_c = Average[d(E_1, l_5), d(E_2, l_5), d(E_3, l_5), d(E_4, l_5)] = 1.35.

This means that the consensus is at a distance of 1.35 from the best score, "very high," and is therefore situated between "very high" and "high."
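A small Python sketch of this aggregation (an illustration, not the authors' code). The distances for exact labels and for intervals of the form [l_k, l_5] are generalized from the two worked values above, d([l_4, l_5], l_5) = 1 + a and d([l_3, l_5], l_5) = (1 + a) + (1 + a + b); this generalization is an assumption, and general intervals would require the full distance graph of Fig. 2 [19]:

```python
def dist_to_top(e, a=0.1, b=0.1, g=5):
    """Distance d(E, l_g) for an evaluation E = (lo, hi) on an l_1..l_g scale.
    Exact labels are encoded as (k, k); each precise step costs 2 (e.g.
    d(l4, l5) = 2), and the i-th widening step of an interval [l_{g-w}, l_g]
    costs 1 + a + (i - 1)*b -- a rule generalized from the worked example."""
    lo, hi = e
    if lo == hi:                      # precise label l_k
        return 2.0 * (g - lo)
    if hi == g:                       # interval of the form [l_{g-w}, l_g]
        w = g - lo
        return sum(1 + a + (i - 1)*b for i in range(1, w + 1))
    raise ValueError("general intervals need the full graph of [19]")

evals = [(4, 4), (5, 5), (4, 5), (3, 5)]          # E1..E4 from the illustration
Ec = sum(dist_to_top(e) for e in evals) / len(evals)
print(Ec)   # 1.35, i.e. between "very high" and "high"
```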

3.3 Stage 3: Polypharmacy Evaluation

Study of Interactions Between Drugs

Before delving into polypharmacy evaluation as such, we analyze the interactions between the drugs that compose it. Information about medication-medication interactions was collected; the analysis was based on Micromedex® Solutions data. The evaluations were restricted to contraindicated interactions as defined by Micromedex, as they represent the most serious interactions that must be avoided because of their significant likelihood of important clinical consequences. Should there be a major interaction between two or more drugs that are part of the polypharmacy, the evaluation approach stops at this point and the polypharmacy is systematically assigned to the category "not appropriate."

Evaluation of the Polypharmacy

When the analysis does not reveal a major drug interaction, stage 3 consists of evaluating the polypharmacy based on the assessments of its drugs according to the criteria (benefit, risk, and improvement of quality of life). Let us consider a polypharmacy A = (M_1, M_2, ..., M_n), where the M_j, j = 1, ..., n, are its component drugs, and let B_1, B_2, B_3, B_4, B_5 be five reference profiles, where B_1 (respectively B_2, B_3, B_4, B_5) is a polypharmacy in which the benefit of all the component drugs is very low (respectively low, neutral, high, very high):


B_1 = [VL, VL, ..., VL]^T, \quad B_2 = [L, L, ..., L]^T, \quad B_3 = [N, N, ..., N]^T, \quad B_4 = [H, H, ..., H]^T, \quad B_5 = [VH, VH, ..., VH]^T.

As well, R_1, R_2, R_3, R_4, R_5 are five reference profiles where R_1 (respectively R_2, R_3, R_4, R_5) is a polypharmacy in which the risk presented by all the component drugs is very high (respectively high, neutral, low, very low); the risk criterion is one that we have to minimize. Likewise, (QL_1, QL_2, QL_3, QL_4, QL_5) are five reference profiles where QL_1 (respectively QL_2, QL_3, QL_4, QL_5) is a polypharmacy in which the improvement of quality of life is very low for all component drugs (respectively low, neutral, high, very high). VL, L, N, H, and VH are the acronyms used for very low, low, neutral, high, and very high, respectively.

In the following, we expose the polypharmacy evaluation process for the criterion "benefit"; the same method is applied to the criteria "risk" and "improvement of quality of life." Let a_j and b_{kj} denote the benefit of drug j and the benefit of the central profile B_k on drug j, respectively. In order to evaluate the benefit of the polypharmacy A, we propose the use of the ELECTRE Tri-C assignment method [20]. This method assigns one of the five levels (very low, low, neutral, high, or very high) to the benefit of the polypharmacy A. The first step is to calculate a concordance index c_j(A, B_k), which measures the extent to which drug M_j supports the assertion "polypharmacy A is at least as good as B_k with regard to the 'benefit' criterion." Let us introduce Δ_j as follows:

\Delta_j = \begin{cases} a_j - b_{kj} & \text{if evaluations have to be maximized,} \\ b_{kj} - a_j & \text{if evaluations have to be minimized.} \end{cases}    (2)

Equation (3) expresses the concordance index, which takes a value between 0 and 1 according to the position of Δ_j with regard to the indifference and preference thresholds:

c_j(A, B_k) = \begin{cases} 1 & \text{if } \Delta_j \ge -q_j, \\ 0 & \text{if } \Delta_j < -p_j, \\ \dfrac{\Delta_j + p_j}{p_j - q_j} & \text{otherwise,} \end{cases}    (3)

where Δ_j is calculated according to Eq. (2), q_j is the indifference threshold, and p_j is the preference threshold (p_j ≥ q_j ≥ 0). The concordance indices of the individual drugs are then aggregated, taking into account the weights w_j of the drugs; the weights are provided by clinicians depending on the importance of the drug for the patient:

c(A, B_k) = \sum_{j=1}^{n} w_j\, c_j(A, B_k).    (4)

Then, we calculate the discordance index, which measures the degree to which drug j counters the assertion "polypharmacy A is at least as good as B_k with regard to the 'benefit' criterion":

d_j(A, B_k) = \begin{cases} 0 & \text{if } \Delta_j > -p_j, \\ \dfrac{\Delta_j + p_j}{p_j - v_j} & \text{if } -v_j \le \Delta_j < -p_j, \\ 1 & \text{if } \Delta_j < -v_j, \end{cases}    (5)

where v_j is the veto threshold and Δ_j is given by Eq. (2). The credibility index σ(A, B_k) is then computed to measure the extent to which "polypharmacy A is at least as good as B_k with regard to the 'benefit' criterion" (Eq. (7)); it shows whether the outranking hypothesis is plausible or not, where "outranking" means that polypharmacy A is at least as good as B_k. Equation (6) defines the expression T_j(A, B_k) used in the credibility index (7):

T_j(A, B_k) = \begin{cases} \dfrac{1 - d_j(A, B_k)}{1 - c(A, B_k)} & \text{if } d_j(A, B_k) > c(A, B_k), \\ 1 & \text{otherwise,} \end{cases}    (6)

\sigma(A, B_k) = c(A, B_k) \prod_{j=1}^{n} T_j(A, B_k).    (7)

Subsequently, we use the exploitation method of ELECTRE Tri-C with a majority threshold 0.5 ≤ λ ≤ 1. Starting with k = 5, we stop at the first k such that σ(A, B_k) − λ ≥ 0. If σ(A, B_k) − λ > σ(A, B_{k+1}) − λ, then the kth category is assigned to the polypharmacy evaluation on the considered criterion (benefit, risk, or quality of life); otherwise the (k+1)th category is assigned. This process is repeated for each criterion. Finally, we obtain an overall evaluation of the benefit, risk, and improvement of quality of life of the polypharmacy on the linguistic scale (very low, low, neutral, high, very high).
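The computations in Eqs. (2)–(7) and a simplified version of this descending rule can be sketched in a few lines of Python (an illustration, not the authors' code; the thresholds q, p, v, the weights, and the numeric 1–5 coding of the VL–VH scale are placeholder assumptions, and the refinement comparing σ(A, B_k) with σ(A, B_{k+1}) is omitted):

```python
import numpy as np

def concordance(delta, q, p):
    """Per-drug concordance c_j, Eq. (3): 1 if delta >= -q, 0 if delta < -p,
    linear in between (assumes p > q >= 0)."""
    return float(np.clip((delta + p) / (p - q), 0.0, 1.0))

def discordance(delta, p, v):
    """Per-drug discordance d_j, Eq. (5): 0 if delta > -p, 1 if delta < -v,
    linear in between (assumes v > p)."""
    return float(np.clip((delta + p) / (p - v), 0.0, 1.0))

def credibility(deltas, w, q, p, v):
    """Credibility sigma of 'A is at least as good as B_k', Eqs. (4), (6), (7)."""
    c = float(np.dot(w, [concordance(d, q, p) for d in deltas]))   # Eq. (4)
    sigma = c
    for d in deltas:
        dj = discordance(d, p, v)
        if dj > c:                     # Eq. (6): T_j < 1 only when d_j > c
            sigma *= (1.0 - dj) / (1.0 - c)
    return sigma

def tri_c_level(a, w, lam=0.7, q=0.0, p=1.0, v=4.0):
    """Simplified descending exploitation: drug evaluations a_j and the central
    profiles B_k = (k, ..., k) live on the numeric scale 1..5 (VL..VH); criteria
    to be minimized should enter with flipped sign in the deltas."""
    for k in range(5, 0, -1):
        deltas = [aj - k for aj in a]                 # Eq. (2)
        if credibility(deltas, w, q, p, v) - lam >= 0:
            return k
    return 1

# toy usage: three drugs of equal weight with benefits (4, 5, 4) -> level 4 ("high")
print(tri_c_level([4, 5, 4], w=[1/3, 1/3, 1/3]))
```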

3.4 Stage 4: Polypharmacy Assignment

Once the polypharmacy has been evaluated in stage 3 of the method, this stage consists of assigning it to one of the following categories (inappropriate, more or less appropriate, or appropriate) using ELECTRE Tri. The reference profiles (r_0, r_1, r_2, r_3) used here with ELECTRE Tri define the boundaries of each category: the category "inappropriate" is limited by the profiles r_0 and r_1, the category "more or less appropriate" by r_1 and r_2, and the category "appropriate" by r_2 and r_3. With the components ordered as (benefit, risk, improvement of quality of life),

r_0 = [VL, VH, VL]^T, \quad r_1 = [L, H, L]^T, \quad r_2 = [H, L, H]^T, \quad r_3 = [VH, VL, VH]^T.

ELECTRE Tri assigns the polypharmacy to one of the three categories by comparing its benefit, risk, and improvement of quality of life with those of the reference profiles, based on the concordance, discordance, and credibility-of-outranking indices presented in Eqs. (3) to (7). The credibility index σ(A, r_k) (Eq. (7)) expresses the extent to which "polypharmacy A outranks r_k," taking into account the concordance and discordance indices. Once the credibility indices are computed, polypharmacy A is assigned to one of the categories with either the pessimistic or the optimistic assignment.

Pessimistic assignment: compare polypharmacy A to the reference profiles, starting with the profile of the highest category; assign A to the (k+1)th category, where r_k is the first profile such that σ(A, r_k) − λ ≥ 0.

Optimistic assignment: compare polypharmacy A to the reference profiles, starting with the profile of the lowest category; assign A to the kth category, where r_k is the first profile such that σ(r_k, A) − λ ≥ 0.

Cross-referencing the optimistic and pessimistic assignments leads to a final classification.
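Reusing the credibility function sketched above, the two assignment rules can be written as follows (again an illustration; σ(A, r_k) and σ(r_k, A) denote the credibilities of "A outranks r_k" and "r_k outranks A", respectively):

```python
def pessimistic(sigma_a_r, lam=0.7):
    """sigma_a_r[k] = sigma(A, r_k), k = 0..3. Scan r_2, r_1, r_0 downwards and
    assign category k+1 at the first profile r_k that A outranks."""
    for k in (2, 1, 0):
        if sigma_a_r[k] - lam >= 0:
            return k + 1   # 1 = inappropriate, 2 = more or less appropriate, 3 = appropriate
    return 1

def optimistic(sigma_r_a, lam=0.7):
    """sigma_r_a[k] = sigma(r_k, A). Scan r_1, r_2, r_3 upwards and assign
    category k at the first profile r_k that outranks A."""
    for k in (1, 2, 3):
        if sigma_r_a[k] - lam >= 0:
            return k
    return 3

# with the credibilities of Table 8 below, sigma(A, r_0..r_3) = (1.00, 1.00, 0.67, 0.00):
print(pessimistic([1.00, 1.00, 0.67, 0.00]))   # 2 -> "more or less appropriate"
```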

4 Illustration

We consider the clinical case described in Sect. 3. In order to illustrate the approach, we consider that the polypharmacy administered to the patient is composed of the following 10 classes of drugs: M1: beta-blockers, M2: angiotensin-converting enzyme inhibitors, M3: loop diuretics, M4: HMG-CoA reductase inhibitors, M5: metformin, M6: other sulfonylureas, M7: antiplatelet drugs, M8: long-acting anticholinergic agents, M9: short-acting anticholinergic agents, M10: short-acting beta-agonists. To simplify the illustration, we consider five experts' opinions on the evaluation of the benefits, risks, and improvement of quality of life of each drug (Tables 1, 2, and 3). The aggregation process of the experts' opinions presented in Sect. 3 is applied here for the three criteria, with parameters a = 0.1 and b = 0.1. Table 4 presents the aggregation results. For example, the aggregated benefit of angiotensin-converting enzyme inhibitors is 0.22, which corresponds to the distance with respect to the level "very high," showing that the aggregated evaluation is very close to the evaluation "very high." This is confirmed by the fact that this class of drugs is indeed very beneficial for heart failure while not being contraindicated for diabetes and chronic obstructive pulmonary disease. The next step is the application of the multi-criteria ELECTRE Tri-C method for the polypharmacy's evaluation with regard to benefits, risks, and improvement of quality of life. The concordance, discordance, and credibility indices are computed using the formulas of Eqs. (2) to (7).


Table 1. Experts' opinions on the benefits of each drug

No   Expert 1  Expert 2  Expert 3  Expert 4  Expert 5
M1   L4        L4        [L3,L4]   [L3,L5]   [L1,L2]
M2   L5        L5        [L4,L5]   L5        L5
M3   L4        L5        [L3,L4]   [L3,L4]   [L3,L4]
M4   L3        L2        [L3,L4]   [L4,L5]   L2
M5   L4        L5        [L4,L5]   L5        [L4,L5]
M6   L2        L4        L3        L4        [L3,L4]
M7   L4        L5        [L4,L5]   L5        [L4,L5]
M8   L3        L4        L4        [L4,L5]   [L3,L5]
M9   L2        L4        [L2,L4]   [L3,L4]   [L3,L4]
M10  L4        L5        [L2,L4]   [L4,L5]   [L3,L4]

Table 2. Experts' opinions on the risks of each drug

No   Expert 1  Expert 2  Expert 3  Expert 4  Expert 5
M1   [L3,L4]   L4        [L2,L3]   L3        [L4,L5]
M2   L4        L3        [L2,L3]   L1        [L3,L4]
M3   L4        L3        [L2,L3]   L2        [L2,L3]
M4   L4        L4        [L2,L4]   [L2,L4]   L2
M5   L4        L2        [L2,L4]   [L1,L3]   L2
M6   L4        L2        L4        [L3,L4]   L4
M7   L3        L5        [L3,L4]   [L3,L5]   [L2,L4]
M8   L4        L3        L2        L3        [L2,L3]
M9   L4        L3        L2        [L1,L2]   L2
M10  L3        L4        L2        [L1,L2]   [L2,L3]

Table 5 presents the credibility indices. These indices make it possible to draw a conclusion regarding the levels of benefit, risk, and improvement of quality of life of the polypharmacy. We apply the descending ELECTRE Tri-C assignment with a majority threshold λ = 0.7. According to this assignment, the evaluations for the polypharmacy A are the following: benefit = high, risk = neutral, improvement of quality of life = high. We chose 0.7 as the majority threshold, but the calculations can be redone with higher values if a higher level of credibility is desired. The last stage of the method consists in classifying the polypharmacy into one of the three categories, appropriate, more or less appropriate, or inappropriate, using the ELECTRE Tri method. Tables 6, 7, and 8 present, respectively, the concordance, discordance, and credibility indices deduced from the comparison of polypharmacy A with the profile limits r_0 to r_3 as defined in Sect. 3.


Table 3. Experts' opinions on the improvement of quality of life induced by each drug

No   Expert 1  Expert 2  Expert 3  Expert 4  Expert 5
M1   L4        L5        [L3,L4]   [L2,L4]   [L1,L2]
M2   L5        L2        [L3,L4]   L5        [L3,L4]
M3   L3        L5        [L3,L4]   [L4,L5]   [L2,L3]
M4   L2        L2        [L2,L4]   [L1,L2]   L2
M5   L4        L5        [L3,L4]   L4        [L2,L4]
M6   L3        L3        L2        L3        [L2,L4]
M7   L3        L3        [L3,L4]   L3        [L2,L3]
M8   L3        L5        L4        [L4,L5]   [L4,L5]
M9   L2        L4        L3        L4        [L3,L4]
M10  L4        L5        [L2,L4]   [L4,L5]   L3

Table 4. Mean distances of the aggregated opinions from L5

No   Benefit  Risk  Quality of life
M1   0.22     4.44  2.44
M2   3.30     3.06  3.30
M3   2.26     4.44  2.66
M4   4.04     3.72  5.88
M5   0.84     2.74  4.04
M6   0.84     4.92  2.28
M7   3.42     3.02  4.46
M8   2.10     4.84  2.28
M9   2.28     4.22  1.64
M10  3.70     5.02  3.42

Table 5. Credibility indices

                 σ(A,B1)  σ(A,B2)  σ(A,B3)  σ(A,B4)  σ(A,B5)
Benefit          0.00     0.00     0.30     0.83     1.00
Risk             1.00     1.00     0.91     0.17     0.00
Quality of life  0.00     0.00     0.59     1.00     1.00

Then, we apply the ELECTRE Tri optimistic and pessimistic assignment processes and cross-reference the results. Both the optimistic and the pessimistic classification indicate that A outranks r_1 (A is at least as good as r_1). The polypharmacy is therefore assigned to the second category, "more or less appropriate."


Table 6. Concordance indices

                 C(A,r0)  C(A,r1)  C(A,r2)  C(A,r3)
Benefit          1.00     1.00     1.00     0.00
Risk             1.00     1.00     0.00     0.00
Quality of life  1.00     1.00     1.00     0.00
Global           1.00     1.00     0.67     0.00

Table 7. Discordance indices

                 D(A,r0)  D(A,r1)  D(A,r2)  D(A,r3)
Benefit          0.00     0.00     0.00     0.00
Risk             0.00     0.00     0.00     0.67
Quality of life  0.00     0.00     0.00     0.00

Table 8. Credibility indices

σ(A,r0)  σ(A,r1)  σ(A,r2)  σ(A,r3)
1.00     1.00     0.67     0.00

A S r1 (A outranks r_1)

Discussion

This paper proposes a novel approach for evaluating polypharmacies and classifying them as appropriate, more or less appropriate, or inappropriate. The proposed method is original and provides interesting results for evaluating the quality of polypharmacies; in particular, it allows clinicians to express their opinions, and their hesitation, linguistically. The method has a number of advantages. On the basis of an individual assessment of each therapy, it is possible to create different polypharmacies with a smaller or greater number of drugs and to evaluate them. The subsequent stages of our research will focus in particular on the optimal composition of a polypharmacy in order to obtain optimal health results. The method could also enable the integration of the patient's vision in addition to the opinions of healthcare professionals, a stage that will be developed by our team in the future.

The proposed approach presents some limitations. First, it assumes from the outset that the presence of a major interaction immediately makes a polypharmacy inappropriate; at the same time, minor interactions may render polypharmacies less pertinent. This aspect will be developed in future stages of this research. In addition, we believe that the results can vary depending on the clinical expertise of the healthcare professional. It will be important in future work to involve a variety of specialists (cardiologists, endocrinologists, pulmonologists, and so on) and generalists (general practitioners, geriatricians, and pharmacists, among others) in order to take a variety of opinions into account. Clinicians' personal experiences are also highly likely to influence their opinions in regard to risks, benefits, and improvement of quality of life.


While developing the current version of the method, and in order to foster more impartiality in the process, we added clinical information derived from clinical practice guidelines or randomized trials. Also, the 70% threshold of agreement required for the Delphi ensures a certain degree of uniformity of results. On another note, the data obtained from the Delphi apply only to the specific clinical case described in Sect. 3: the experts' assessment of the benefits, risks, and improvement of quality of life of each drug can vary depending on the treated population. For instance, data concerning a polypharmacy for an elderly person may differ from data concerning a population of multimorbid young adults. Consequently, with the data available now, the proposed approach cannot be used for contexts other than our clinical case. Should data on the benefit, risk, and quality-of-life impact of drugs become available for other clinical contexts, the proposed method could then be used.

5 Conclusion

The possibility of sorting polypharmacy into the categories appropriate and inappropriate can be very useful in fostering the most beneficial combinations of drugs. The proposed approach is innovative and enables the integration of a variety of conflicting criteria in the evaluation of polypharmacy quality. It allows clinicians to express their opinion, and their hesitation where relevant, linguistically. In addition, it evaluates each polypharmacy taking into consideration the impact of the drugs that compose it in terms of risks, benefits, and improvement of quality of life. The proposed approach is based on the ELECTRE Tri and ELECTRE Tri-C methods and demonstrates their applicability in conjunction with linguistic variables.

The evaluation of the quality of polypharmacies and their classification as appropriate, more or less appropriate, or inappropriate is very useful for clinicians, as it allows them to promote the use of more beneficial drug combinations. For example, algorithms can be developed and integrated into pharmaceutical management software in drug stores and hospitals in order to rapidly identify potentially problematic polypharmacies that need to be examined more closely by a pharmacist. Algorithms can also be included in smartphone applications so as to enable healthcare professionals to use them in the course of their clinical activities. Furthermore, because polypharmacy mainly affects the elderly, we based our example thereon; however, the principles developed can be extended to a number of other population groups and clinical situations. For example, the concomitant use of several drugs is frequent in the treatment of psychiatric conditions. The method proposed in this paper is general and can also be applied in these cases in order to distinguish appropriate from inappropriate polypharmacies.

Acknowledgment. This work was supported by University of Quebec at Rimouski.


References

1. Hovstadius, B., Petersson, G.: Factors leading to excessive polypharmacy. Clin. Geriatr. Med. 28, 159–172 (2012)
2. Sirois, C., Émond, V.: La polypharmacie: enjeux méthodologiques à considérer. J. Population Ther. Clin. Pharmacol. 22, 285–291 (2015)
3. Patterson, S.M., Hughes, C., Kerse, N., Cardwell, C.R., Bradley, M.C.: Interventions to improve the appropriate use of polypharmacy for older people. Cochrane Database Syst. Rev. 5, CD008165 (2012). https://doi.org/10.1002/14651858.cd008165.pub2
4. Perny, P.: Multicriteria filtering methods based on concordance and non-discordance principles. Ann. Oper. Res. 80, 137–165 (1998)
5. Henriet, L.: Systèmes d'évaluation et de classification multicritères pour l'aide à la décision, construction de modèles et procédures d'affectation. Université Paris-Dauphine, Thèse de doctorat (2000)
6. Jacquet-Lagrèze, E.: An application of the UTA discriminant model for the evaluation of R&D projects. In: Pardalos, P.M., Siskos, Y., Zopounidis, C. (eds.) Advances in Multi-Criteria Analysis, Kluwer Academic Publishers, Dordrecht, pp. 203–211 (1995)
7. Zopounidis, C., Doumpos, M.: Multicriteria classification and sorting methods: a literature review. Eur. J. Oper. Res. 138, 229–246 (2002)
8. Zopounidis, C., Doumpos, M.: Building additive utilities for multi-group hierarchical discrimination: the M.H.Dis method. Optim. Methods Softw. 14, 219–240 (2000)
9. Yu, W.: ELECTRE TRI: Aspects méthodologiques et guide d'utilisation. Document du LAMSADE, Université Paris-Dauphine, 74 (1992)
10. Roy, B., Bouyssou, D.: Aide multicritère à la décision, Economica, Paris (1993)
11. Tervonen, T., Figueira, J.R., Lahdelma, R., Dias, J.A., Salminen, P.: A stochastic method for robustness analysis in sorting problems. Eur. J. Oper. Res. 192, 236–242 (2009)
12. Almeida-Dias, J., Figueira, J.R., Roy, B.: Electre Tri-C: a multiple criteria sorting method based on characteristic reference actions. Eur. J. Oper. Res. 204, 565–580 (2010)
13. Ishizaka, A., Nemery, P.: Assigning machines to incomparable maintenance strategies with ELECTRE-SORT. Omega 47, 45–59 (2014)
14. Doumpos, M., Zopounidis, C.: A multicriteria classification approach based on pairwise comparisons. Eur. J. Oper. Res. 158, 378–389 (2004)
15. Hu, Y.-C., Chen, C.-J.: A PROMETHEE-based classification method using concordance and discordance relations and its application to bankruptcy prediction. Inf. Sci. 181, 4959–4968 (2011)
16. Belacel, N.: Multicriteria assignment method PROAFTN: methodology and medical application. Eur. J. Oper. Res. 125, 175–183 (2000)
17. Belacel, N., Boulassel, M.R.: Multicriteria fuzzy assignment method: a useful tool to assist medical diagnosis. Artif. Intell. Med. 21, 201–207 (2001)
18. Belacel, N., Raval, H.B., Punnen, A.P.: Learning multicriteria fuzzy classification method PROAFTN from data. Comput. Oper. Res. 34, 1885–1898 (2007)
19. Falcò, E., Garcìa-Lapresta, J.L., Rosellò, L.: Aggregating imprecise linguistic expressions. In: Human-Centric Decision-Making Models for Social Sciences, Studies in Computational Intelligence, vol. 502, pp. 97–113 (2013)
20. Dias, J., Figueira, J., Roy, B.: ELECTRE Tri-C: a multiple criteria sorting method based on central reference actions (2008). https://hal.archives-ouvertes.fr/hal-00281307v2
21. Greco, S., Kadzinski, M., Slowinski, R.: Selection of a representative value function in robust multiple criteria sorting. Comput. Oper. Res. 38(11), 1620–1637 (2011)

290

A. Frini et al.

22. Kadzinski, M., Ciomek, K., Slowinski, R.: Modeling assignment-based pairwise comparisons within integrated framework for value-driven multiple criteria sorting. Eur. J. Oper. Res. 241(3), 830–841 (2015) 23. Koksalan, M., Bilgin Ozpeynirci, S.: An interactive sorting method for additive utility functions. Comput. Oper. Res. 36(9), 2565–2572 (2009) 24. Zhang, J., Shi, Y., Zhang, P.: Several multi-criteria programming methods for classification. Comput. Oper. Res. 36(3), 823–836 (2009) 25. Benabbou, L.: Contributions à la classification supervisée multi-classes et multicritère en aide à la décision. Université Laval, Thèse de doctorat (2009) 26. Broekhuizen, H., Groothuis-Oudshoorn, C.G.M., van Til, J.A., Hummel, J.M., Ijzerman, M. J.: A review and classification of approaches for dealing with uncertainty in multi-criteria decision analysis for healthcare decisions. PharmacoEconomics 33(5), 445–455 (2015) 27. Figueira, J.R., Greco, S., Roy, B., Slowinski, R.: An overview of ELECTRE methods and their recent extensions. J. Multi Criteria Decis. Anal. 20(1–2), 61–85 (2013) 28. Zare, M., Pahl, C., Rahnama, H., Nilashi, M., Mardani, A., Ibrahim, O., Ahmadi, H.: Multicriteria decision making approach in E-learning: a systematic review and classification. Appl. Soft Comput. 45, 108–128 (2016) 29. Frini, A., Sirois C., Laroche, M-L.: A linguistic multi-criteria classification approach for the evaluation of polypharmacy quality. In: Proceedings of the International Multi-Conference of Engineers and Computer Scientists 2018, Lecture Notes in Engineering and Computer Science, Hong Kong, 14–16 March 2018, pp. 944–949 (2018)

Risk Averse Scheduling for a Single Operating Room with Uncertain Durations

Mari Ito, Fumiya Kobayashi, and Ryuta Takashima

Department of Industrial Administration, Tokyo University of Science, 2641 Yamazaki, Noda-shi, Chiba 278-8510, Japan
{mariito,takashima}@rs.tus.ac.jp

Abstract. We introduce stochastic programming models for scheduling a single operating room using the variance, value-at-risk (VaR), and conditional value-at-risk (CVaR) as risk criteria. These criteria express the scheduler's aversion to the risk that a surgery finishes considerably later than the end time expected by the surgeon. One important advantage of the CVaR is that the resulting stochastic programming problem can be treated as a linear programming problem. The CVaR is thus more practical than the variance or the VaR as a risk measure for single operating room scheduling. This paper evaluates the effectiveness of the proposed models through numerical experiments. The results are useful for managing schedules for a single operating room. We recommend that the scheduler consider both the expected total delay and the CVaR when scheduling a single operating room. To avoid the risk of delayed surgery, the scheduler should order surgeries from the smallest to the largest variance of duration.

Keywords: Operations research in health service · Operating room · Surgery scheduling · Stochastic programming · Value-at-risk · Conditional value-at-risk

1 Introduction

Operating room management is essential for improving patient treatment quality and reducing hospital costs. Fujiwara [4] suggested that efficient operating room management is an emergent issue for improving medical services in Japanese hospitals, where many patients experience long wait times before undergoing operations because of an insufficient number of surgeons, anesthesiologists, and operating rooms. Waiting list problems are also being experienced in Europe due to aging populations, making efficient medical services increasingly important [2]. Many hospitals remain at a low level of efficiency, yet there have been few analytical studies of efficient medical services. Surgeries are considered the most vital source of hospital revenue; they have been estimated to generate approximately two-thirds of total revenues [5]. Thus, operating room


management and effectiveness must be improved, as operating rooms are a crucial hospital resource. Indeed, Macario et al. [9] argued that operating rooms account for 40% of general hospital costs, making them the most significant single cost source. In operating room management, the quality of operating room schedules is one of the most important factors. Improved operating room scheduling not only reduces patient waiting times but also reduces the workload of surgeons and anesthesiologists and the required overtime. It also increases the effectiveness of operating room utilization, which may partially resolve the problem of operating room shortages. However, the schedules provided are not always suitable for operating room management and are often not closely followed. One reason for this is that the planned durations of surgeries are sometimes underestimated. This delays surgeries and makes the operating room unavailable for subsequent scheduled surgeries.

The operating room scheduling problem has been studied by numerous researchers. For instance, Lamiri et al. [8] formulated operating room scheduling to minimize overtime costs using a stochastic model, proposing a Monte Carlo optimization method that combines Monte Carlo simulation and mixed-integer programming. Denton et al. [3] proposed a two-stage stochastic programming model for operating room scheduling. There are also many case studies that consider hospital characteristics. Blake and Donald [1] solved the operating room scheduling problem at Mount Sinai Hospital using an integer programming model. This approach greatly influenced the scheduling process at the hospital; it allocates surgery durations to surgeons more fairly, reduces operating room manager workloads, and avoids conflict between surgeons and operating room managers. Ito et al. [6] proposed an estimation method for calculating surgery durations using regression analysis, together with a mixed-integer programming model for the operating room scheduling problem. Moreover, they developed an operating room scheduling system to apply the proposed method automatically and applied it to operating room scheduling at Aichi Medical University Hospital in Japan.

It is important to prepare a detailed schedule and to operate the operating room efficiently. In particular, stochastic approaches capture the uncertainties that are frequently encountered. The stochastic programming problem for operating room scheduling is generally formulated to minimize the expected value of the total delay of surgeries from their expected end times. However, this approach cannot adequately address the risk of a surgery with a very large delay from the expected end time. In other words, minimizing the expected value cannot express the scheduler's aversion to the risk that surgeries are delayed far beyond the end times desired by surgeons. To date, there have been no examples of applying risk measures to the operating room scheduling problem.

In this study, we incorporate the variance, VaR, and CVaR as criteria into the operating room scheduling problem. Our purpose is to decide the optimal sequence of surgeries in a single operating room considering the risk that surgeries are delayed far beyond the end times expected by surgeons. We evaluate the effectiveness of our models by numerical experiments. As for the


practical upshot of this study, our results are useful when managing a single operating room. From the viewpoint of CPU time, we recommend the use of the CVaR as the risk measure for single operating room scheduling. We recommend that the scheduler consider both the expected total delay and the CVaR when scheduling a single operating room. To avoid the risk of delayed surgery, the scheduler should order surgeries from the smallest to the largest variance of duration.

The remainder of this paper is organized as follows. Section 2 explains the characteristics of the variance, VaR, and CVaR. Section 3 introduces the problem setting and the models for single operating room scheduling. In Sect. 4, we show the results obtained by numerical experiments. We conclude the paper in Sect. 5 with some comments about future areas for research.

2 Risk Measures

The variance is one of the most widely used risk measures. Here, it represents the degree of variation in the loss, which is a random variable $\tilde\eta$. The variance is defined as follows.

$$E_{\tilde\eta}\left[\left(\tilde\eta - E_{\tilde\eta}[\tilde\eta]\right)^2\right]. \qquad (1)$$

Using the variance as a risk measure has some disadvantages. First, cases where the loss is lower than its expected value are also counted as risk. Second, a stochastic programming model that incorporates the variance into the mathematical programming model becomes nonlinear.

The VaR is another widely used risk measure. The VaR is defined using the $\alpha$-quantile, the value that divides the probability distribution into proportions $\alpha$ and $1-\alpha$. This criterion can adequately address the risk of facing a very large loss. Let $F_{\tilde\eta}(z) = P(\tilde\eta \le z)$ denote the cumulative distribution function of the random variable $\tilde\eta$, with $\alpha \in (0,1)$. A value $z$ satisfying $P(\tilde\eta \le z) \ge \alpha$ and $P(\tilde\eta \ge z) \ge 1-\alpha$ is the $\alpha$-quantile. The quantile function $F_{\tilde\eta}^{-1} : (0,1) \to \mathbb{R}$ can be defined as follows.

$$F_{\tilde\eta}^{-1}(\alpha) = \min_z \{z \mid P(\tilde\eta \le z) \ge \alpha\} \qquad (2)$$
$$\phantom{F_{\tilde\eta}^{-1}(\alpha)} = \min_z \{z \mid F_{\tilde\eta}(z) \ge \alpha\}. \qquad (3)$$

The VaR can then be defined as follows.

$$\mathrm{VaR}_\alpha(\tilde\eta) = F_{\tilde\eta}^{-1}(\alpha) = \min_z \{z \mid F_{\tilde\eta}(z) \ge \alpha\}. \qquad (4)$$

A stochastic programming model incorporating the VaR into the mathematical programming model becomes nonlinear because the VaR lacks convexity.

The CVaR is a coherent risk measure and can be regarded as a remedy for the disadvantages of the VaR. The CVaR is convenient to incorporate into mathematical programming problems because coherent risk measures are convex risk measures. When the cumulative distribution function $F_{\tilde\eta}(z)$ of the loss $\tilde\eta$ is continuous, the CVaR can be written as follows.

$$\mathrm{CVaR}_\alpha(\tilde\eta) = E_{\tilde\eta}\left[\tilde\eta \mid \tilde\eta \ge \mathrm{VaR}_\alpha(\tilde\eta)\right] \qquad (5)$$
$$\phantom{\mathrm{CVaR}_\alpha(\tilde\eta)} = \frac{1}{1-\alpha} \int_{\mathrm{VaR}_\alpha(\tilde\eta)}^{\infty} z \, dF_{\tilde\eta}(z), \qquad (6)$$

where $\alpha \in (0,1)$. The CVaR thus gives the expected loss over the range exceeding the VaR. Figure 1 illustrates the two measures. Specifically, the CVaR can be formulated as the optimal value of the following minimization problem, whose optimal solution $v$ is the VaR.

$$\mathrm{CVaR}_\alpha(\tilde\eta) = \min_v \left\{ v + \frac{1}{1-\alpha} E_{\tilde\eta}\left[(\tilde\eta - v)^+\right] \right\}. \qquad (7)$$

When the CVaR is adopted as the risk measure, the problem can be formulated as follows using a weighting factor $\beta \in (0,1)$.

$$\text{Minimize}_x \quad (1-\beta)\, E_{\tilde\xi}\left[V(x, \tilde\xi)\right] + \beta\, \mathrm{CVaR}_\alpha\!\left(V(x, \tilde\xi)\right) \qquad (8)$$
$$\text{subject to} \quad Ax = b, \qquad (9)$$
$$\phantom{\text{subject to}} \quad x \ge 0, \qquad (10)$$

where $V(x, \tilde\xi)$ is the cost function with decision variable $x$, an $n_1$-dimensional vector, and random variable $\tilde\xi$; $E_{\tilde\xi}[V(x, \tilde\xi)]$ is the expected cost; $A$ is an $m_1 \times n_1$ matrix; and $b$ is an $m_1$-dimensional vector.

Fig. 1. Image showing the VaR and CVaR. (Probability density function of the loss, with the VaR marked at the $\alpha$-quantile and the CVaR as the mean of the tail beyond it.)
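To make these definitions concrete, the following Python sketch (our illustration, not part of the original paper) computes the empirical variance, VaR, and CVaR of a discrete loss sample; the loss values are hypothetical. Note that for a discrete distribution with atoms, the conditional-expectation form (5) used here can differ slightly from the minimization form (7).

```python
import numpy as np

def var_cvar(losses, probs, alpha=0.8):
    """Empirical VaR and CVaR of a discrete loss distribution (Eqs. (4)-(5))."""
    order = np.argsort(losses)
    losses, probs = np.asarray(losses, float)[order], np.asarray(probs, float)[order]
    cum = np.cumsum(probs)
    k = np.searchsorted(cum, alpha)        # first index with F(z) >= alpha
    var = losses[k]                        # the alpha-quantile, Eq. (4)
    tail = losses >= var                   # tail at and beyond the VaR
    cvar = np.sum(losses[tail] * probs[tail]) / np.sum(probs[tail])  # Eq. (5)
    return var, cvar

losses = np.array([100., 120., 150., 300., 900.])  # hypothetical delays (min.)
probs = np.full(5, 0.2)                            # equiprobable scenarios
print(np.var(losses, ddof=1))                      # sample variance, Eq. (1)
print(var_cvar(losses, probs, alpha=0.8))
```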

Suppose that the random variable follows a discrete probability distribution: $\tilde\xi$ has finite support $\Theta = \{\xi^1, \ldots, \xi^K\}$, and probability $p^k$ is assigned to scenario $\xi^k$. To simplify the expressions, we introduce new variables $u(\xi^k) \ge 0$, $k = 1, \ldots, K$. Using the definition (6) of the CVaR, the problem can then be expressed as follows.

$$\min_{x,\, y(\xi^1), \ldots, y(\xi^K),\, v,\, u(\xi^1), \ldots, u(\xi^K)} \; (1-\beta)\left(c^T x + \sum_{k=1}^{K} p^k q(\xi^k)^T y(\xi^k)\right) + \beta\left(v + \frac{1}{1-\alpha} \sum_{k=1}^{K} p^k u(\xi^k)\right) \qquad (11)$$

$$\text{subject to} \quad c^T x + q(\xi^k)^T y(\xi^k) - v \le u(\xi^k), \quad k = 1, \ldots, K, \qquad (12)$$
$$u(\xi^k) \ge 0, \quad k = 1, \ldots, K, \qquad (13)$$
$$Ax = b, \qquad (14)$$
$$T(\xi^k)x + W y(\xi^k) = h(\xi^k), \quad k = 1, \ldots, K, \qquad (15)$$
$$x \ge 0, \qquad (16)$$
$$y(\xi^k) \ge 0, \quad k = 1, \ldots, K, \qquad (17)$$

where $c$ is an $n_1$-dimensional parameter vector, $q(\xi^k)$ is an $n_2$-dimensional parameter vector, $y(\xi^k)$ is an $n_2$-dimensional variable vector, $T(\xi^k)$ is an $m_2 \times n_1$ matrix, and $W$ is an $m_2 \times n_2$ matrix. Hence, incorporating the CVaR as the risk measure in a two-stage programming problem under a discrete probability distribution results in a linear programming problem.

Sarin et al. [10] introduced the use of the CVaR as a criterion for stochastic scheduling problems. They demonstrated its application to the single and parallel machine scheduling problems and exhibited the use and effectiveness of minimizing the CVaR in that context.
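The linear-programming structure can be illustrated on the tail-risk part alone. The sketch below (our illustration, not the paper's CPLEX implementation) solves the discrete version of minimization (7) with scipy.optimize.linprog; the loss values are the same hypothetical sample as above.

```python
import numpy as np
from scipy.optimize import linprog

def cvar_lp(losses, probs, alpha=0.8):
    """Solve min_v v + (1/(1-alpha)) * sum_k p_k u_k, s.t. u_k >= L_k - v, u_k >= 0.

    This is the LP form of Eq. (7) for a discrete distribution: the optimal v
    is the VaR and the optimal objective value is the (Rockafellar-Uryasev) CVaR,
    which for distributions with atoms can differ slightly from the simple
    conditional average in the previous sketch."""
    K = len(losses)
    # decision vector: [v, u_1, ..., u_K]
    c = np.concatenate(([1.0], np.asarray(probs, float) / (1.0 - alpha)))
    A_ub = np.hstack((-np.ones((K, 1)), -np.eye(K)))   # -v - u_k <= -L_k
    b_ub = -np.asarray(losses, float)
    bounds = [(None, None)] + [(0, None)] * K          # v free, u_k >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[0], res.fun                           # (VaR, CVaR)

print(cvar_lp([100., 120., 150., 300., 900.], [0.2] * 5, alpha=0.8))
```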

3 Single Operating Room Scheduling

3.1 Problem Setting

In most hospitals in Japan, each department has limited operating room time available, depending on the day of the week and the time slot. The surgeon and the patient decide on the desired surgery date and starting time within the available operating room time. Surgeons propose the durations of their own surgeries and desired starting times to the operating room scheduler. The scheduler then creates


the schedule using the proposed durations. These operating room schedules are often created by hand. Generally, when several departments use one operating room, the scheduler assigns the surgeries of the same department consecutively. The reason is that surgeries of the same department are likely to use the same medical equipment; additionally, adjusting the times of surgeries within the same department is easier than across departments. Figure 2 shows the stages of a single surgery. We define the duration of a surgery as the length of time from preparing the hospital for the surgery to cleaning up the operating room (i.e., from stage a to stage f in Fig. 2). The patient receives several treatments during stages b, c, d, and e. Figure 3 shows the stages in a single operating room. We define the delay of a surgery as the length of time from the expected end time desired by a surgeon to the finishing time. In Fig. 3, surgery 1 finishes exactly at its expected end time, whereas for surgery 2 the expected end time and the finishing time differ.

Fig. 2. Stages in a single surgery. (a: time to prepare the hospital for the surgery; b: time to prepare the surgery; c: time to administer the anesthesia; d: time to perform the surgery; e: time to awaken from anesthesia; f: time to clean up the operating room. The duration of the surgery spans stages a through f.)

Fig. 3. Stages in a single operating room. (Surgery 1 finishes at its expected end time; surgery 2 finishes after its expected end time, producing a delay.)

3.2 Model

We consider models of a single operating room scheduling problem that determine the sequence of surgeries in a single operating room. It is assumed that the durations of the surgeries are the only random variables in this problem. Consider the following notation:

Notation

Index sets.
J: set of surgeries
S: set of scenarios
D: set of departments
E_d: set of surgeries belonging to the same department d, d ∈ D

Parameters.
w_j: weight of surgery j (∀j ∈ J)
p_{js}: duration of surgery j under scenario s (∀j ∈ J, ∀s ∈ S)
d_j: expected end time of surgery j desired by a surgeon, defined as $d_j = b_j + E_s[p_{js}]$, where b_j is the starting time of surgery j desired by a surgeon (∀j ∈ J)
π_s: probability of scenario s (∀s ∈ S)
α: probability level, α ∈ (0, 1)
β: weighting factor, β ∈ (0, 1)

Variables.
c_{js}: finishing time of surgery j under scenario s (∀j ∈ J, ∀s ∈ S)
t_{js}: delay of surgery j from its expected end time under scenario s (∀j ∈ J, ∀s ∈ S)
η: threshold value (equal to the VaR when an optimal solution is obtained)
μ_s: amount by which the total weighted delay exceeds the threshold η under scenario s (∀s ∈ S)
z_{ij}: surgery precedence binary variable, where z_{ij} = 1 if surgery i is processed before surgery j and z_{ij} = 0 otherwise (∀i, j ∈ J, i ≠ j)

The CVaR model is formulated as the following stochastic programming model [7].

Formulation of the CVaR.

$$\text{Minimize} \quad (1-\beta)\, E_s\!\left[\sum_{j \in J} w_j t_{js}\right] + \beta\left(\eta + \frac{1}{1-\alpha} \sum_{s \in S} \pi_s \mu_s\right) \qquad (18)$$

$$\text{subject to} \quad \eta + \mu_s \ge \sum_{j \in J} w_j t_{js}, \quad \forall s \in S, \qquad (19)$$
$$\sum_{i \in J \setminus \{j\}} p_{is} z_{ij} + p_{js} \le c_{js}, \quad \forall s \in S,\ j \in J, \qquad (20)$$
$$t_{js} + d_j \ge c_{js}, \quad \forall s \in S,\ j \in J, \qquad (21)$$
$$z_{ij} + z_{ji} = 1, \quad \forall i \ne j \in J, \qquad (22)$$
$$z_{ij} + z_{jk} + z_{ki} \le 2, \quad \forall i \ne j \ne k \in J, \qquad (23)$$
$$\left|\sum_{j \in J} z_{ij} - \sum_{j \in J} z_{i'j}\right| = 1, \quad \forall i \ne i' \in E_d,\ \forall d \in D, \qquad (24)$$
$$c_{js} \ge 0, \quad \forall s \in S,\ j \in J, \qquad (25)$$
$$t_{js} \ge 0, \quad \forall s \in S,\ j \in J, \qquad (26)$$
$$\mu_s \ge 0, \quad \forall s \in S, \qquad (27)$$
$$z_{ij} \in \{0, 1\}, \quad \forall i \ne j \in J. \qquad (28)$$

In the formulation above, the objective function (18) comprises two terms, with weights 1−β and β. The first term minimizes the expected total weighted delay; the second term minimizes the CVaR. For each scenario s, constraint (19) determines μ_s as the amount of total weighted delay that exceeds the threshold value η (if at all). Constraint (20) bounds the surgery finishing times according to the surgery sequencing relationships. Constraint (21) determines the delay of each surgery. Constraints (22) and (23) ensure feasibility of the surgery sequence by eliminating cyclic sequences. Constraint (24) allocates surgeries i and i′ of the same department to consecutive positions. Constraints (25), (26), and (27) are non-negativity constraints, and constraint (28) is a binary constraint. The scenarios are derived either from a discrete approximation of the underlying distributions of the problem parameters or from a scenario generation procedure, with probability π_s associated with scenario s, ∀s ∈ S.

The VaR model is formulated as the following stochastic programming model.

Notation

Additional parameter.
M: a big number

Additional variables.
θ_s: auxiliary binary variable (∀s ∈ S)

Formulation of the VaR.

$$\text{Minimize} \quad (1-\beta)\, E_s\!\left[\sum_{j \in J} w_j t_{js}\right] + \beta \eta \qquad (29)$$

$$\text{subject to} \quad \text{Equations (20), (21), (22), (23), (24), (25), (26), (28),}$$
$$\sum_{j \in J} w_j t_{js} - \eta \le M \theta_s, \quad \forall s \in S, \qquad (30)$$
$$\sum_{s \in S} \pi_s \theta_s \le 1 - \alpha, \qquad (31)$$
$$\theta_s \in \{0, 1\}, \quad \forall s \in S. \qquad (32)$$


In this formulation, objective function (29) comprises two terms, with weights 1−β and β. The first term minimizes the expected total weighted delay; the second term minimizes the VaR. For each scenario s, constraint (30) determines the threshold value η, constraint (31) determines θ_s, and constraint (32) is a binary constraint.

The variance model is formulated as the following stochastic programming model.

Formulation of the variance.

$$\text{Minimize} \quad (1-\beta)\, E_s\!\left[\sum_{j \in J} w_j t_{js}\right] + \beta\, \frac{\sum_{s \in S}\left(\sum_{j \in J} w_j t_{js} - E_s\!\left[\sum_{j \in J} w_j t_{js}\right]\right)^2}{|S| - 1} \qquad (33)$$

$$\text{subject to} \quad \text{Equations (20), (21), (22), (23), (24), (25), (26), (28).}$$

In the formulation above, objective function (33) consists of two terms, with weights 1−β and β. The first term minimizes the expected total weighted delay; the second term minimizes the variance.
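Although the authors solve these models with a MILP solver (see Sect. 4.2), an instance with only five surgeries can be cross-checked by enumeration. The Python sketch below is our illustration of that check, assuming equiprobable scenarios, using the empirical quantile in place of the exact VaR, and taking surgeries 1 and 2 (indices 0 and 1) as the same-department pair; it is not the authors' implementation.

```python
import numpy as np
from itertools import permutations

def scenario_delays(seq, p, d, w):
    """Per-scenario total weighted delay for a fixed sequence.
    Surgeries run back-to-back (constraint (20)); delays follow (21).
    p: (n_surgeries, n_scenarios) durations p_js; d, w: length-n vectors."""
    finish = np.cumsum(p[list(seq), :], axis=0)           # c_{js}
    delay = np.maximum(finish - d[list(seq), None], 0.0)  # t_{js} = (c_{js} - d_j)^+
    return (w[list(seq), None] * delay).sum(axis=0)

def objective_cvar(delays, alpha=0.8, beta=0.5):
    """Objective (18) with equiprobable scenarios and an empirical CVaR."""
    var = np.quantile(delays, alpha)        # empirical alpha-quantile (approx. VaR)
    cvar = delays[delays >= var].mean()     # average of the tail beyond it
    return (1 - beta) * delays.mean() + beta * cvar

def best_sequence(p, d, w, same_dept=(0, 1), alpha=0.8, beta=0.5):
    """Exhaustive search (feasible for 5 surgeries); constraint (24) is
    enforced by requiring the same-department surgeries to be adjacent."""
    feasible = (seq for seq in permutations(range(p.shape[0]))
                if abs(seq.index(same_dept[0]) - seq.index(same_dept[1])) == 1)
    return min(feasible,
               key=lambda seq: objective_cvar(scenario_delays(seq, p, d, w),
                                              alpha, beta))
```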

4 Numerical Experiments

4.1 Data

In this section, we describe the numerical experiments conducted to evaluate the performance of our proposed formulations. Generally, from 2 to 5 surgeries are scheduled in a single operating room [6]. As an example, we consider a five-surgery problem with 100 scenarios (i.e., |J| = 5, |S| = 100). The durations of surgeries are assumed to follow left-truncated log-normal distributions (truncated at zero to ensure nonnegativity), and scenario-wise values of p_j were generated via Monte Carlo sampling; a sketch of this step is given below. Figure 4 shows histograms of the scenario-wise values of the duration of surgery j (p_j). The surgery parameter values are summarized in Table 1 and are based on the parameters of a previous study [6]. w_j is 1 for all j. Surgeries 1 and 2 belong to the same department, and surgeries 3, 4, and 5 each belong to a different department (i.e., |D| = 4, E_1 = {1, 2}, E_2 = {3}, E_3 = {4}, E_4 = {5}). M is 10,000. We solved the single operating room scheduling problem by minimizing the variance, VaR, and CVaR with α = 0.8.
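The paper does not spell out the sampling step. One standard realization, matching a log-normal to the mean and variance of Table 1 (shown below) by moment matching, is sketched here; this is our assumption, not necessarily the authors' exact procedure. Log-normal samples are already positive, which matches the truncation at zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def lognormal_scenarios(mean, var, n_scenarios):
    """Sample durations from log-normals matched to the given means/variances."""
    mean, var = np.asarray(mean, float), np.asarray(var, float)
    sigma2 = np.log(1.0 + var / mean**2)    # log-space variance (moment matching)
    mu = np.log(mean) - 0.5 * sigma2        # log-space mean
    return rng.lognormal(mu[:, None], np.sqrt(sigma2)[:, None],
                         size=(mean.size, n_scenarios))

# Parameters from Table 1
E_p = np.array([334, 215, 207, 309, 107])
Var_p = np.array([7046, 4700, 2586, 4845, 1850])
p = lognormal_scenarios(E_p, Var_p, 100)    # p_{js}, a 5 x 100 array
```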


Fig. 4. Histograms of the scenario-wise values of the duration of surgery j (p_j), for surgeries 1–5.


Table 1. Parameters of an example problem.

Surgery j         1     2     3     4     5
E[p_j] (min.)     334   215   207   309   107
Var[p_j] (min.)   7046  4700  2586  4845  1850
b_j (min.)        1     50    100   150   1
d_j (min.)        335   265   307   459   108

4.2 Results

The computer used to generate the schedules was equipped with an Intel Core 2.30 GHz processor (i5-6200U) and 8 GB of RAM. We solved the stochastic programming problems using the IBM ILOG CPLEX 12.6.3 solver. In the generated test instances, there were 2,729 constraints and 1,130 variables when the model used the CVaR in the objective function, 2,225 constraints and 1,126 variables when the model used the VaR, and 2,125 constraints and 1,025 variables when the model used the variance. We varied the weighting factor β from 0 to 1 to assess its effect on the optimal solution.

Table 2 shows the results under different β when the model used the CVaR in the objective function. The case β = 0 minimizes only the expected value of the total weighted delay without considering risk; in contrast, the case β = 1 minimizes risk alone. Minimizing the expected value (β = 0) resulted in the optimal sequence δ0 = 5, 3, 2, 1, 4. The expected total weighted delay is small in this case, at 1811.19, but the value of the CVaR is high, at 3,018; a large delay can occur in this situation. Minimizing the CVaR (β = 1) resulted in the optimal sequence δ1 = 5, 3, 4, 2, 1. The CVaR is small in this case, at 2,957, but the expected total weighted delay is high, at 2955.83.

Table 3 shows the results under different β when the model uses the VaR in the objective function. Minimizing the expected value (β = 0) resulted in the optimal sequence δ0 = 5, 3, 2, 1, 4. The expected total weighted delay is small, at 1811.19, but the value of the VaR is high, at 3,051; a large delay can also occur in this situation. Minimizing the VaR (β = 1) resulted in the optimal sequence δ1 = 5, 3, 4, 2, 1. The VaR is small, at 3,010, but the expected total weighted delay is high, at 3000.5.

Table 4 shows the results under different β when the model uses the variance in the objective function. Minimizing the expected value (β = 0) resulted in the optimal sequence δ0 = 5, 3, 2, 1, 4. The expected total weighted delay is small, at 1811.19, but the variance is high, at 135904.98, meaning that a large delay may occur. Minimizing the variance (β = 1) resulted in the optimal sequence δ1 = 5, 3, 4, 2, 1. The variance is small, at 0, but the expected total weighted delay is high, at 3,137.

Table 5 lists the CPU times corresponding to each value of β. Optimization in terms of the variance returned different CPU times depending on the value of β, whereas the CVaR and VaR returned solutions with a CPU time of 5 s regardless of β.

Table 2. Results under different β, CVaR.

β    Optimal sequence δβ   Expected total weighted delay (min.)   CVaR
0    5, 3, 2, 1, 4         1811.19                                3018
0.1  5, 3, 2, 1, 4         1811.19                                3018
0.2  5, 3, 2, 1, 4         1811.19                                3018
0.3  5, 3, 2, 1, 4         1811.19                                3018
0.4  5, 3, 4, 2, 1         1845.70                                2957
0.5  5, 3, 4, 2, 1         1845.70                                2957
0.6  5, 3, 4, 2, 1         1845.59                                2957
0.7  5, 3, 4, 2, 1         1845.70                                2957
0.8  5, 3, 4, 2, 1         1845.70                                2957
0.9  5, 3, 4, 2, 1         1845.70                                2957
1    5, 3, 4, 2, 1         2955.83                                2957

Table 3. Results under different β, VaR.

β    Optimal sequence δβ   Expected total weighted delay (min.)   VaR
0    5, 3, 2, 1, 4         1811.19                                3051
0.1  5, 3, 2, 1, 4         1850.46                                3051
0.2  5, 3, 2, 1, 4         1850.46                                3051
0.3  5, 3, 2, 1, 4         1850.46                                3051
0.4  5, 3, 2, 1, 4         1850.46                                3051
0.5  5, 3, 4, 2, 1         1884.97                                3010
0.6  5, 3, 4, 2, 1         1884.97                                3010
0.7  5, 3, 4, 2, 1         1884.97                                3010
0.8  5, 3, 4, 2, 1         1884.97                                3010
0.9  5, 3, 4, 2, 1         1884.97                                3010
1    5, 3, 4, 2, 1         3000.5                                 3010

However, the CPU time for the VaR changes greatly depending on the value of M, which is difficult to set. From the viewpoint of CPU time, we therefore recommend the use of the CVaR.

The cumulative distribution functions of the total weighted delay under both sequences are plotted in Fig. 5. The cumulative distribution functions for β = 1 and β = 0.5 reach probability 1 at a delay value of 3,000, while for β = 0 there is a substantial probability of exceeding this value. This example illustrates the risk-averse nature of the CVaR and its effectiveness in reducing variability. Figure 6 shows the histograms of the total weighted delay of surgeries for β = 0, β = 0.5, and β = 1.


Table 4. Results under different β, variance.

β    Optimal sequence δβ   Expected total weighted delay (min.)   Variance
0    5, 3, 2, 1, 4         1811.19                                135904.98
0.1  5, 3, 4, 2, 1         2696.45                                920.93
0.2  5, 3, 4, 2, 1         2801.97                                261.22
0.3  5, 3, 4, 2, 1         2843.15                                132.25
0.4  5, 3, 4, 2, 1         2883.74                                54.76
0.5  5, 3, 4, 2, 1         2908.49                                24.01
0.6  5, 3, 4, 2, 1         2924.33                                10.89
0.7  5, 3, 4, 2, 1         2936.21                                4.41
0.8  5, 3, 4, 2, 1         2945.12                                1.44
0.9  5, 3, 4, 2, 1         2952.05                                0.25
1    5, 3, 4, 2, 1         3137.00                                0

Table 5. CPU times corresponding to values of β.

     CPU time (s)
β    CVaR  VaR  Variance
0    5     5    5
0.1  5     5    20
0.2  5     5    20
0.3  5     5    20
0.4  5     5    20
0.5  5     5    20
0.6  5     5    19
0.7  5     5    20
0.8  5     5    20
0.9  5     5    21
1    5     5    19

The variance for β = 0 is approximately 134,545, whereas the variance for β = 0.5 is approximately 143,632 and the variance for β = 1 is approximately 135. The median for β = 0 is 1,823, whereas the median for β = 0.5 is 1,789 and for β = 1 is 2,957. It is important to consider both the expected value of the total weighted delay and the CVaR when creating a single operating room schedule. In summary, for all risk measures the optimal sequence was δ0 = 5, 3, 2, 1, 4 when the weight assigned to delay risk was small, and δ1 = 5, 3, 4, 2, 1 when the risk of delay was weighted prominently.


Fig. 5. Cumulative distribution functions of the total weighted delay of surgeries for β = 0, β = 0.5, and β = 1.

Fig. 6. Histograms of the total weighted delay of surgeries for β = 0, β = 0.5, and β = 1.


When scheduling a single operating room, the surgery with the largest variance was scheduled last in the order. To minimize delay, we therefore recommend that the schedule begin with the surgeries that have the smallest variance of duration. We also analyzed the effect on the end time of the last surgery if the worst-case scenario occurs. The worst-case scenario assumes that every surgery lasts for its longest possible duration: 766 min for surgery 1, 485 min for surgery 2, 362 min for surgery 3, 509 min for surgery 4, and 221 min for surgery 5. We arranged the optimal sequence of surgeries and checked the difference between the maximum and expected values (the delay). Considering the delay risk can prevent timing deviations of about 270 min compared with a schedule that does not consider the delay risk. To create a single operating room schedule, we therefore recommend that the scheduler consider both the expected total delay and the CVaR.

5 Conclusion

We developed stochastic programming models for single operating room scheduling. We derived optimal sequences of surgeries that minimize combinations of the expected value of the total weighted delay with the variance, VaR, and CVaR, and we analyzed the effects of the weighting factor on these quantities. The numerical experiments illustrate the risk-averse nature of the CVaR and its effectiveness in reducing variability. From the viewpoint of CPU time, we recommend that the CVaR be implemented as the risk measure when scheduling a single operating room. To avoid delay, we recommend that the scheduler begin the schedule with the surgeries that have the smallest variance of duration, and that the scheduler consider both the expected total delay and the CVaR. Future work will expand this approach to models for scheduling multiple operating rooms.

Acknowledgements. This work was supported by JSPS KAKENHI Grant Number JP16H07226.

References

1. Blake, J.T., Donald, J.: Mount Sinai Hospital uses integer programming to allocate operating room time. Interfaces 32(2), 63–73 (2002)
2. Cardoen, B.E., Demeulemeester, B.J.: Operating room planning and scheduling: a literature review. Eur. J. Oper. Res. 201(3), 921–932 (2010)
3. Denton, B.J., Viapiano, A.V.: Optimization of surgery sequencing and scheduling decisions under uncertainty. Health Care Manage. Sci. 10(1), 13–24 (2007)
4. Fujiwara, Y.: Data analysis of health care service focusing on acute care medicine. Commun. Oper. Res. Soc. Jpn. 58(11), 651–656 (2013). (in Japanese)
5. Jackson, R.: The business of surgery. Health Manage. Technol. 23(7), 20–22 (2002)
6. Ito, M., Suzuki, A., Fujiwara, Y.: A prototype of operating rooms scheduling system: a case study in Aichi Medical University Hospital. Jpn. Ind. Manage. Assoc. 67(2E), 202–214 (2016)
7. Ito, M., Kobayashi, F., Takashima, R.: Minimizing conditional-value-at-risk for a single operating room scheduling problems. In: Proceedings of the International MultiConference of Engineers and Computer Scientists 2018, 14–16 March 2018, Hong Kong, pp. 968–973 (2018)
8. Lamiri, M., Xie, X., Dolgui, A., Grimaud, F.: A stochastic model for operating room planning with elective and emergency demand for surgery. Eur. J. Oper. Res. 185, 1026–1037 (2008)
9. Macario, A., Vitez, T.S., Dunn, B., McDonald, T.: Where are the costs in perioperative care?: analysis of hospital costs and charges for inpatient surgical care. Anesthesiology 83(6), 1138–1144 (1995)
10. Sarin, C.S., Sherali, D.H., Liao, L.: Minimizing conditional-value-at-risk for stochastic scheduling problems. J. Sched. 17(1), 5–15 (2014)

A Proposal of Diseases Words Classifying Method for Medical Hospitality

Hiroki Kozu¹, Yukio Maruyama², Tusyoshi Yuyama³, and Tomoya Hasegawa³

¹ Data Analysis Research Center, KAZU technica Co., Ltd., 1-9-18 Chuo, Chuo-ku, Sagamihara-shi, Kanagawa 252-0239, Japan
[email protected]
² Faculty of Advanced Engineering, Nippon Institute of Technology, 4-1 Gakuendai, Miyashiro-machi, Minami-Saitama-gun, Saitama 345-8501, Japan
[email protected]
³ Data Solution Div., KAZU technica Co., Ltd., 1-9-18 Chuo, Chuo-ku, Sagamihara-shi, Kanagawa 252-0239, Japan
{yuyama,hasegawa}@kazu-technica.co.jp

Abstract. In recent years, hospitality in health care has become a very important service for patients and their families, who often have anxieties regarding diseases. However, it is difficult for medical staff to grasp the potential anxiety factors of patients and their families during medical examination and treatment. The purpose of this study is to propose methods of classifying the potential anxiety factors of patients and their families that are not recognized by medical staff. In this paper, several methods of classifying these potential anxiety factors are examined using content written by Japanese users with diseases on the "Yahoo! Answers" website. First, words are extracted from the content regarding each disease by morpheme analysis. Next, classifications of the extracted words are obtained by (1) counting the occurrences of each word, (2) computing the appearance ratio of each word, (3) using correlation analysis, (4) counting the questions containing each word, and (5) counting the combinations of each word in the questions. Moreover, the obtained correlation between each word and each disease is judged against the knowledge of medical staff. As a result, the method of classifying words based on the number of combinations of each word in the questions shows a tendency to agree with the knowledge of medical staff when compared with the words classified by the other methods.

Keywords: Medical hospitality · Potential anxiety factors · Knowledge-search service · Proposal methods · Diseases words classifying · Morpheme analysis

1 Introduction

In recent years, the perspective of healthcare services has been changing from "curing" to "looking after" in order to improve quality of life (QOL) [1]. This concept of "looking after" means developing hospitality, such as a cooperative medical system within the


medical team and a priority medical system for patients. Hospitality in health care is therefore a very important service for patients and their families, who have anxieties related to diseases. In previous studies, questionnaire surveys were conducted to determine anxiety factors for patients and their families [2–4], and the role of nurses and clinical psychologists in alleviating patients' anxiety was reported [5, 6]. In addition, the background factors affecting the QOL of patients have been measured with a multiple-regression model [7]. Such approaches can capture the superficial anxieties of patients and their families, but it is difficult for them to recognize potential (latent) anxieties. This study focuses on the potential anxiety factors related to diseases [8]. The purpose of this study is to propose methods of classifying the potential anxiety factors of patients and their families that are not recognized by medical staff. In this paper, several such methods are examined using content written by Japanese users with diseases on the website "Yahoo! Answers", the most famous knowledge-search service website in Japan. Moreover, the classified potential anxiety factors are evaluated against the observations of medical staff.

2 Outline of Knowledge-Search Service

2.1 Overview of "Yahoo! Answers"

A knowledge-search service website such as "Yahoo! Answers" is a collection of many potential anxiety factors. The service is built from the various questions that users post about their anxieties and the answers provided by many other users. As of January 2015, "Yahoo! Answers" had accumulated 13,000,000 questions, more than any other knowledge-search service.

2.2 Data from "Yahoo! Answers"

In this study, written questions related to diseases on "Yahoo! Answers" are used; the answers corresponding to these questions are not used. The three diseases considered are high blood pressure, kidney disease, and diabetes mellitus. The questions were collected from the "Yahoo! Answers" dataset over the following periods:

• High blood pressure: August 19, 2016 to October 18, 2016
• Kidney disease: November 20, 2014 to November 19, 2017
• Diabetes mellitus: March 20, 2017 to November 19, 2017

3 Proposal Method of Classifying Words with Diseases

In this section, the procedure of the proposed method is described. As shown in Fig. 1, the proposed method consists of four steps, as follows.


Fig. 1. Analytical procedure

3.1 Dataset of Disease (Step 1)

In Step 1, the questions for the three diseases (high blood pressure, kidney disease, and diabetes mellitus) are collected as a dataset from "Yahoo! Answers".

3.2 Morpheme Analysis and Extraction of Words (Step 2 and Step 3)

In this subsection, the procedure of Steps 2 and 3 is described. Words are extracted from the content regarding each disease by morpheme analysis; the result of extracting words, such as nouns, is shown in Table 1. For high blood pressure, 4,530 words, such as "symptom", "blood pressure", and "hospital", are extracted from 1,000 questions. For kidney disease, 4,248 words, including "disease", "inspection", and "hospital", are extracted from the content of 631 questions. For diabetes mellitus, 4,291 words, such as "meal", "blood glucose", and "inspection", are extracted from 1,000 questions. In this study, the words are extracted using KH Coder 2.00f (third-party software).

Table 1. Result of morpheme analysis (excerpt).

High blood pressure   Kidney disease   Diabetes mellitus
Symptom               Disease          Meal
Blood pressure        Inspection       Blood glucose
Hospital              Hospital         Inspection
Inspection            Kidney           Hospital
Diagnosis             Insurance        Pregnancy
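The paper performs this extraction with KH Coder. Purely as an illustration of the step, a comparable noun extraction can be scripted in Python with the janome morphological analyzer; this substitution is our assumption, since KH Coder itself is a standalone tool.

```python
from janome.tokenizer import Tokenizer

tokenizer = Tokenizer()

def extract_nouns(text):
    """Return the nouns in a Japanese question text via morpheme analysis."""
    return [tok.surface for tok in tokenizer.tokenize(text)
            if tok.part_of_speech.startswith('名詞')]  # '名詞' = noun
```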

3.3 Methods of Classifying Potential Anxiety Factors (Step 4)

In Step 4, the classification of an extracted word is obtained from the number of occurrences of each word (Method 1), the appearance ratio of each word (Method 2), correlation analysis (Method 3), the number of questions containing each word (Method 4), and the number of combinations of each word in the questions (Method 5). The procedure of Method 1 is shown in Fig. 2: the number of occurrences of each word extracted by morpheme analysis is calculated. For example, when "blood pressure" is used 3 times, the number of occurrences of "blood pressure" is 3. In Method 2, the appearance ratio of each word extracted using morpheme analysis is defined by the following equation.

(Appearance ratio) = (Number of occurrences of word in each question) / (Total number of words in each question)   (1)

In Method 3, the correlation coefficient for each word is calculated using the appearance ratio of Method 2. The procedure of Method 4 is shown in Fig. 3, in which the number of questions containing each word extracted by morpheme analysis is calculated. For example, when there are four questions that include "inspection", the number of questions is 4. The procedure of Method 5 is shown in Fig. 4: the number of combinations of words in the questions is calculated. For example, when "hospital" and "inspection" appear in both questions No. 3 and No. 4, the number of combinations is 2. A sketch of Methods 1, 2, 4, and 5 is given after Fig. 4.

Fig. 2. Number of occurrences of each word. (Example: "high blood pressure" occurring 3 times, "blood pressure" 3 times, and "hospital" twice.)

Fig. 3. Number of questions containing each word. (Example: of six questions, "inspection" appears in four, while "high blood pressure" and "hospital" each appear in two.)

Fig. 4. Number of word pairings in questions. (Example: "hospital" and "inspection" both appear in questions No. 3 and No. 4, so the count for this pairing is 2.)
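The counting schemes above can be summarized in a short script. The Python sketch below is our illustration of Methods 1, 2, 4, and 5 on tokenized questions; Method 3 (correlation analysis over the per-question appearance ratios) is omitted, and the aggregation of the per-question ratios in Method 2 — which the paper does not specify — is done here by summation.

```python
from collections import Counter
from itertools import combinations

def classify_words(questions):
    """questions: list of word lists (output of morpheme analysis, Step 3)."""
    occurrences = Counter()     # Method 1: total occurrences of each word
    appearance = Counter()      # Method 2: summed per-question ratios, Eq. (1)
    question_count = Counter()  # Method 4: number of questions containing the word
    pair_count = Counter()      # Method 5: co-occurrence counts of word pairs
    for words in questions:
        counts = Counter(words)
        occurrences.update(counts)
        total = sum(counts.values())
        for w, c in counts.items():
            appearance[w] += c / total            # Eq. (1) for this question
        uniq = sorted(counts)                     # canonical order for pairing
        question_count.update(uniq)
        pair_count.update(combinations(uniq, 2))  # each pair counted once per question
    return occurrences, appearance, question_count, pair_count
```

Ranking the words by each returned counter reproduces the candidate lists that the tables in Sect. 4 compare against the knowledge of medical staff.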

4 Evaluation of Proposed Method

4.1 Observation by Medical Staff

The words selected according to the knowledge of medical staff are shown in Table 2. As shown in Table 2, each word extracted using morpheme analysis is classified into two levels (strong relationship and weak relationship) according to the knowledge of the


medical staff. In the case of high blood pressure, words such as "low", "glycosuria", and "pregnancy" have a strong relationship with the disease, while "inspection", "high", and "diagnosis" have a weak relationship with it. In the case of kidney disease, words such as "protein", "blood", and "glycosuria" have a strong relationship with the disease, while "chronic", "meal", and "bladder" have a weak relationship with it. In the case of diabetes mellitus, words such as "blood glucose", "life", and "kidney" have a strong relationship with the disease, while "dialysis", "family", and "stress" have a weak relationship with it. Words strongly related to a disease are defined as those that allow people other than medical staff to imagine the disease and its relevance; weakly related words are defined as those whose relationship with the disease is difficult for people other than medical staff to understand. The medical staff who provided the opinions here are specialists in high blood pressure, kidney disease, and diabetes mellitus.

Table 2. Classifying words according to knowledge of medical staff (excerpt).

High blood pressure            Kidney disease              Diabetes mellitus
Strong      Weak               Strong      Weak            Strong         Weak
Low         Inspection         Protein     Chronic         Blood glucose  Dialysis
Glycosuria  High               Blood       Meal            Life           Family
Pregnancy   Diagnosis          Glycosuria  Bladder         Kidney         Stress
Heart       Hospitalization    Dialysis    Blood vessel    Insulin        Rice
Hospital    Reason             Bleeding    Water           Weight         Snack foods

4.2 Comparison of Each Extracted Word with the Observation of Medical Staff

The comparison results between the words extracted in relation to each disease and the knowledge of medical staff are shown in Tables 3, 4, and 5. For each disease, the words classified by each proposed method were compared with the knowledge of medical staff. For every disease, Method 4 shows a characteristic tendency compared with the other proposed methods (Methods 1, 2, 3, and 5). As shown in Table 3, in the case of high blood pressure, the words "high" and "low" are included in the top ranks under Method 4. As shown in Table 4, for kidney disease, "kidney" and "insufficiency" are not included in its top ranks. As shown in Table 5, in the case of diabetes mellitus, the phrase "blood glucose" is not in its top ranks, but "life" is. For the top 50 words classified by each method, the content ratios of words considered by the medical staff to have a strong or weak relationship are defined by Eqs. (2) and (3) below.


(Content ratio of strongly related words) = (Number of strongly related words in the top 50 words) / (Number of strongly related words)   (2)

(Content ratio of weakly related words) = (Number of weakly related words in the top 50 words) / (Number of weakly related words)   (3)
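Eqs. (2) and (3) amount to a simple set intersection; a minimal Python sketch (our illustration) is:

```python
def content_ratio(ranked_words, labeled_words, top_n=50):
    """Eqs. (2)-(3): fraction of the staff-labeled words (strongly or weakly
    related) that appear in the top-n words produced by a method."""
    top = set(ranked_words[:top_n])
    labeled = set(labeled_words)
    return len(top & labeled) / len(labeled)
```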

The result of the comparison of each extracted word with the knowledge of the medical staff is shown in Table 6. As shown in Table 6, in the case of high blood pressure, the highest content ratio of strongly related words is found with Methods 4 and 5, and the highest content ratio of weakly related words with Methods 1, 2, and 4. In the case of kidney disease, the highest content ratio of strongly related words is found with Methods 1, 2, 3, and 5, and that of weakly related words with Method 4. Finally, in the case of diabetes mellitus, the highest content ratio of strongly related words is found with Methods 3, 4, and 5, and that of weakly related words with Methods 1, 2, 3, and 4. Therefore, the proposed Method 4 is shown to be the most consistent with the knowledge of medical staff.

Table 3. Comparison result between words extracted in relation to high blood pressure and knowledge of medical staff (excerpt). (In the original table, shading marks the words judged strongly or weakly related to the disease.)

Rank  Method 1             Method 2             Method 3             Method 4        Method 5
1     High blood pressure  High blood pressure  High blood pressure  High            Blood pressure
2     Blood pressure       Blood pressure       Blood pressure       Low             High blood pressure
3     Hospital             Hospital             Hospital             Blood           Hospital
4     Inspection           Inspection           Request              Inspection      Request
5     Request              Request              Inspection           Confinement     Inspection
6     Symptom              Symptom              Diagnosis            Pregnancy       Diagnosis
7     Diagnosis            Diagnosis            Self                 Work            Factor
8     Self                 Self                 Symptom              Mentalis        Glycosuria
9     Hospitalization      Hospitalization      Factor               Blood pressure  Self
10    Glycosuria           Glycosuria           Disease              Child           Question


Table 4. Comparison result between words extracted in relation to kidney disease and knowledge of medical staff (excerpt). (In the original table, shading marks the words judged strongly or weakly related to the disease.)

Rank  Method 1         Method 2         Method 3       Method 4       Method 5
1     Patient          Patient          Patient        Custom         Request
2     Inspection       Inspection       Request        Life           Patient
3     Hospital         Hospital         Insufficiency  Function       Insufficiency
4     Kidney           Kidney           Inspection     Lowering       Inspection
5     Insufficiency    Insufficiency    Hospital       Cancer         Hospital
6     Insurance        Insurance        Kidney         Lifelong       Kidney
7     Hospitalization  Hospitalization  Function       Medical        Function
8     Life             Life             Dialysis       Insufficiency  Lowering
9     Meal             Meal             Answer         Chronic        Patient
10    Self             Self             Lowering       Albuminoid     Life

Table 5. Comparison result between words extracted in relation to diabetes mellitus and knowledge of medical staff (excerpt). (In the original table, shading marks the words judged strongly or weakly related to the disease.)

Rank  Method 1       Method 2       Method 3        Method 4           Method 5
1     Glycosuria     Glycosuria     Glycosuria      Seeking diagnosis  Inspection
2     Blood glucose  Blood glucose  Request         Life               Glycosuria
3     Inspection     Inspection     Hospital        Kidney             Blood glucose
4     Hospital       Hospital       Blood glucose   Grandmother        Hospital
5     Request        Request        Inspection      Parents            Request
6     Self           Self           Meal            Work               Symptom
7     Meal           Meal           Self            Condition          Pregnancy
8     Family         Family         Family          Internet           Calorie
9     Relationship   Relationship   Medical doctor  Feeling            Talks
10    Stress         Stress         After meals     Prescription       Relationship


Table 6. Comparison of each word extracted and knowledge of medical staff.

         High blood pressure    Kidney disease         Diabetes mellitus
Method   Strong    Weak         Strong    Weak         Strong    Weak
1        0.636     0.609        0.778     0.500        0.733     0.800
2        0.636     0.609        0.778     0.500        0.733     0.800
3        0.636     0.478        0.778     0.400        0.800     0.800
4        0.682     0.609        0.556     0.600        0.800     0.800
5        0.682     0.522        0.778     0.400        0.800     0.600

4.3 Evaluation of Extracted Words Using Cluster Analysis

This study focuses on the number of occurrences of each word (Method 1) and the number of questions including each word (Method 4) for extracting the words known to medical staff. These two variables were evaluated using cluster analysis for each disease. The words for high blood pressure were classified into 5 clusters; Cluster 2 includes words related to the medical condition, such as "inspection" and "symptoms", and Cluster 5 includes words related to family, such as "husband" and "grandmother". Next, the words for kidney disease were classified into 5 clusters; Cluster 2 includes words related to medical expenses, such as "insurance" and "medical". The words for diabetes mellitus were classified into 4 clusters; Cluster 1 includes words related to medical conditions, such as "blood glucose" and "hospital", and another cluster includes words related to family, such as "husband" and "grandmother". The classification of the words by cluster analysis was then compared with the words judged based on the knowledge of the medical staff in Sect. 4.1. For the words classified by cluster analysis, the content ratios of words considered by medical staff to have a strong or weak relationship are defined by Eqs. (4) and (5) below.

(Content ratio of strongly related words in each cluster) = (Number of strongly related words in each cluster) / (Number of strongly related words)   (4)

(Content ratio of weakly related words in each cluster) = (Number of weakly related words in each cluster) / (Number of weakly related words)   (5)
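The paper does not name the clustering algorithm. As one plausible realization (our assumption only), k-means on the two standardized features could be sketched as follows; the cluster-wise content ratios of Eqs. (4) and (5) can then be computed per label with the content_ratio idea above.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_words(words, occurrences, question_count, n_clusters=5):
    """Cluster words on the two features the paper uses (Method 1 and Method 4).
    k-means on standardized features is an assumption, not the paper's method."""
    X = np.array([[occurrences[w], question_count[w]] for w in words], float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize both features
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
    return dict(zip(words, labels))
```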

The comparison of the extracted words in each cluster with the knowledge of medical staff is shown in Table 7. In this study, only clusters containing more than 10 words were evaluated. As shown in Table 7, in the case of high blood pressure, Cluster 4 contains many words that are judged to be strongly related to the disease, and Cluster 3 contains many words that are judged to be weakly related to it. In the case of kidney disease, Cluster 5 contains many words judged to be strongly related to the disease, and it also contains many words judged to be weakly related to it.


In the case of diabetes mellitus, Cluster 2 contains many words that are judged to be strongly related to the disease, and Cluster 3 contains many words that are judged to be weakly related to it.

Table 7. Comparison of each word extracted and knowledge of medical staff (–: not subject to evaluation).

          High blood pressure    Kidney disease         Diabetes mellitus
Cluster   Strong    Weak         Strong    Weak         Strong    Weak
1         1.000     0.000        0.500     0.000        0.333     0.000
2         1.000     0.000        0.188     0.063        0.389     0.000
3         0.200     0.467        0.125     0.125        0.125     0.250
4         0.333     0.259        0.000     0.071        0.143     0.086
5         0.205     0.227        0.214     0.500        –         –

5 Conclusion

The purpose of this research was to propose a classification method for the potential anxiety factors of patients and their families that medical staff do not recognize. In this paper, several classification methods for potential anxiety factors were analyzed using the text content written by Japanese users with diseases on the website "Yahoo! Answers", and the classified potential anxiety factors were judged based on the knowledge of medical staff. In this study, the following two points were obtained from the examination results:

1. Method 4 tends to produce more words matching the knowledge of medical staff, for both strong and weak relationships.
2. The agreement between the words classified by the proposed methods and the knowledge of medical staff tends to differ depending on the disease.

"Strong relationship" indicates that the relevance of a word to the disease is apparent even to those who are not medical staff. "Weak relationship" indicates a word whose relation to the disease is difficult to understand for those outside the medical field. In this research, words with a weak relationship were observed to be important for understanding potential anxiety factors; therefore, Method 4 can be considered a good classification method. On the other hand, words such as "insurance" or "lifelong" for kidney disease are not included in the knowledge of medical staff, but are considered important for indirectly understanding the potential anxiety factors of patients and their families. This shows that the classified words contain important terms that medical workers do not regularly use. Furthermore, in the evaluation using cluster analysis, the words judged as strongly or weakly related to a disease by medical staff tended to fall into specific clusters. In the cases of diabetes mellitus and high blood pressure, the clusters formed on the number of occurrences of each word and the number of questions containing each word lie at medium positions overall.


In the case of kidney disease, the clusters formed on the number of occurrences of each word and the number of questions containing each word lie at low positions. Future tasks are as follows:

1. Consider words that are not included in the knowledge of medical staff.
2. Classify potential anxiety factors using component analysis and factor analysis.
3. Consider the effects of the differences among diseases.

Acknowledgment. We wish to thank Dr. T. Akiyama at KAZU technica Co., Ltd. for advice on data extraction and text mining methods, and the medical staff who provided their medical knowledge.

References

1. Mono, T.: Medical Marketing (Japan edition). Japan Hyoron-Sha, Tokyo (2011)
2. Tanaka, M., Ooi, K., Yanagisawa, K., Yoshioka, T., Gomi, T., Miyashita, T., Takamizawa, S.: Benefits of pharmacists' assessment of depression and anxiety in outpatients undergoing chemotherapy. J. Pharm. Health Care Sci. 34(12), 1086–1090 (2008)
3. Takei, A., Itou, T., Kanou, T., Onozeki, J., Maeda, M., Tutumi, S., Asao, T., Kuwano, H., Kanda, K.: Analysis of anxiety in cancer patients receiving outpatient chemotherapy. Kitakanto Med. J. 55, 133–139 (2005)
4. Hisamatsu, M., Niwa, S.: Support factors of coping with anxiety in families of patients with terminal cancer. J. Jpn. Acad. Nurs. Sci. 31(1), 58–67 (2011)
5. Yasuda, N., Ito, H., Mori, H., Ushiro, M., Miura, M., Odaka, Y., Ishimaru, Y., Hashimoto, K., Miyazaki, M.: Nursing elderly hemodialysis patients following surgery under general anesthesia. J. Jpn. Soc. Dial. Ther. 24(8), 1167–1170 (1991)
6. Nagai, N., Nomura, T., Morimoto, T., Sasaki, Y.: A role of clinical psychologists in terminal care for cancer patients. Palliat. Care Res. 11(3), 534–537 (2016)
7. Sano, H., Asao, K., Matsushima, M., Agata, T., Kusaka, M., Sasaki, T., Tanishima, Y., Yamamoto, I., Shimizu, H., Tajima, N.: Measurement of quality of life in patients with type 2 diabetes (2): the influence of patients' profiles and diabetic complications on quality of life. J. Japan Diab. Soc. 44(1), 57–62 (2001)
8. Kozu, H., Maruyama, Y., Yuyama, T., Hasegawa, T.: A proposal of method of classifying words with diseases for development of hospitality in health care. In: Proceedings of the International MultiConference of Engineers and Computer Scientists 2018, Lecture Notes in Engineering and Computer Science, Hong Kong, 14–16 March 2018, pp. 860–864 (2018)

CFD Modelling of Rotating Annular Flow Using Wall y+

Andrew A. Davidson¹,² and Salim M. Salim²

¹ SSE plc, Perth, UK
[email protected]
² University of Dundee, Dundee DD1 4HN, UK
[email protected]

Abstract. This project establishes a strategy for accurately modelling the rotating annular flow of drilling fluid to improve the numerical prediction of pressure loss in an annulus. Pressure loss is vital in several engineering applications, from HVAC design to oil & gas drilling; being able to predict it accurately through numerical methods creates the potential for innovation and efficiency. The project builds on the previous wall y+ recommendations of Salim et al. [1], which examined high Reynolds number turbulent flow for the prediction of wall-bounded flow. A strategy was established, with the aid of the wall y+ value, to investigate the most suitable turbulence model in ANSYS FLUENT and thereby create a method that will reduce time and costs in the development of drilling tools. Of the 5 turbulence approaches tested, the k–ω model was found to be the most accurate for a wall y+ of less than 5. The k–ε model performed least well, and it was observed that there was a direct link between the turbulent intensity found in the annulus and the performance of the turbulence model. The k–ε model was found to overpredict the turbulent kinetic energy for the mesh set-up and thus contributed to inaccurate results regarding the pressure loss in the annulus. This project therefore suggests that a structured mesh with y+ < 5 and the k–ω turbulence model will provide sufficiently accurate data in the investigation of pressure loss in an annulus. This will benefit the industry and researchers who wish to model this flow situation where experimental data are not available. The strategy can be used by design engineers to create drilling tools, allowing them to try more experimental designs without the need to build expensive and time-consuming prototypes. It may also be used as an investigation tool for researchers wishing to gain a greater understanding of the complex fluid flow that occurs during rotating annular flow.

Keywords: Annular flow · ANSYS · CFD · Drilling · Rotating flow · Wall y+

1 Introduction

With the recent downturn in the oil & gas industry [2, 3], innovation and efficiency are needed more than ever to reduce costs. An area where time and money can be significantly saved is the development of drilling tools, which are used to improve overall drilling performance. While tools are intended to enhance drilling efficiency, any negative impact they have must also be known and investigated during the development stage of the tool. Maintaining downhole pressure within the required window is currently a major challenge for engineers, especially in horizontal and extended reach (ERD) wells [4]. It is, therefore, vital that the effect a drilling tool will have on the pressure loss within the annulus is known during the development stage of the product, so that its performance in actual drilling operations is better understood. This can be done by creating a prototype of the product and testing it in an experimental set-up; however, this can be costly and time-consuming. Computational Fluid Dynamics (CFD) is a method to overcome this problem by computationally modelling the product and its effect on the flow properties. By using the wall y+ as a tool for selecting a suitable mesh and turbulence model combination, the need for validation through experimental data is removed. This allows many different flow situations to be modelled without the concern of obtaining suitable data to verify the results, and in turn makes CFD an attractive option, not only for product design and development, but also for future research.

One of the key issues within the industry today is the effective management of the Equivalent Circulating Density (ECD) of drilling fluid [4]. ECD is the density exerted by a circulating fluid against the formation, taking into account the pressure drop above the point being considered. It is vital in avoiding kicks, especially in wells with a narrow window between the fracture gradient and the pore-pressure gradient [5]. Annular frictional pressure loss strongly affects ECD, and hence if it can be controlled it will aid ECD management (a worked illustration is sketched at the end of this section). Maintaining control of downhole pressure is also vital to the safety and successful drilling of a well; this is therefore a significant area within the oil & gas industry.

Experimental studies have been carried out to evaluate annular pressure loss, and it has been found that the rotation of the drill pipe has a significant effect on the pressure loss [6, 7]. While experiments can effectively investigate the effects of rotation, some may be time-consuming to create and costly to operate, depending on the flow scenario. CFD could allow simulations to be carried out at reduced time and cost compared with experiments. CFD also provides the added benefit of allowing many different flow scenarios to be examined by simply changing the computational domain and set-up. One of the reasons CFD may not be implemented is the time taken to set up a numerical study and the difficulty of validating it; the need to verify CFD simulations removes much of the appeal of the technology. Therefore, there is a need for a fast, reliable modelling strategy that allows efficient analysis of drilling tools and removes the need for validation. Such a strategy gives confidence in predicting accurate numerical results for this flow situation.

Annular frictional pressure loss occurs due to the movement of the drilling fluid and cuttings through the annular space. This, combined with the rotation of the drill pipe that would be experienced during drilling operations, creates a complex flow situation due to the secondary flow and the formation of Taylor vortices [8, 9]. This project, which expands on the publication by Davidson and Salim [10], uses CFD to model a horizontal section of a well and investigate the effects drill pipe rotation has on a non-Newtonian power-law fluid. A strategy is then suggested for selecting a mesh configuration and turbulence model, with the aid of the wall y+, when experimental data is not available.
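To make the relationship between annular pressure loss and ECD concrete, here is a minimal sketch in oilfield units (Python; the relation is the standard field-unit form and the input values are hypothetical, not taken from this study):

```python
def ecd_ppg(mud_weight_ppg, annular_dp_psi, tvd_ft):
    """Equivalent circulating density (ppg): static mud weight plus the
    annular frictional pressure loss above the depth of interest,
    converted with the 0.052 psi/(ppg.ft) field-unit factor."""
    return mud_weight_ppg + annular_dp_psi / (0.052 * tvd_ft)

# Hypothetical inputs: 9.14 ppg mud, 250 psi annular loss, 8,000 ft TVD
print(f"ECD = {ecd_ppg(9.14, 250.0, 8000.0):.2f} ppg")  # ~9.74 ppg
```

The sketch simply makes visible why an accurate prediction of the annular pressure loss feeds directly into ECD management.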
This will be achieved by replicating the experimental data of McCann [11] in ANSYS FLUENT and investigating the following five turbulence models: k–ω, k–ε, k–ε enhanced wall, Spalart–Allmaras and RSM. The most time-consuming part of a CFD study can be generating a suitable mesh that captures the correct resolution, especially due to the impact of walls in wall-bounded flow [12]. The most conventional method, known as a grid independence test, is to run many simulations with different mesh sizes and configurations until the results match experimental data. This removes one of the main advantages of CFD over experiments by increasing the time taken to complete a numerical simulation. The y+ can be used as guidance for developing a reliable mesh and turbulence model strategy. By removing the need and time for validation and a grid independence test, the main advantages of CFD are restored.

2 Previous Work

CFD is used within many industries to make improvements or provide analysis that would be impossible to achieve by other means, and the oil & gas industry is no different. CFD is widely used in several key areas such as flow assurance and the investigation of cuttings transport [13–15]. While CFD is unquestionably being applied to the development of drilling tools (see Centraflow for the development of the CEBond tool), this paper focuses on creating a strategy that provides validation for flow scenarios where experimental data is difficult or impossible to obtain. This may occur when tools are being designed for a specific well for which no previous data is available, or when a variation of a drilling fluid is being investigated.

Salim and Cheah [1] investigated a strategy for dealing with 2-D wall-bounded turbulent flows using the wall y+ as guidance for the mesh configuration and the most suitable turbulence model. Walls have a substantial impact on turbulent flow, and hence the mesh in this area must be refined sufficiently to obtain an acceptable solution. The quality of the mesh at a wall can be checked by the y+ value, a dimensionless number representing the distance from the wall to the centre of the nearest cell. The main application of using the computed wall y+ as guidance in selecting the appropriate mesh density and corresponding turbulence model is in situations where reliable experimental data are not available to validate CFD models. The investigation found that a wall y+ value in the range of 30–60 provided acceptable results for relatively high turbulent flows. They also suggest the mesh should not lie within the buffer region, as neither the near-wall treatments nor the wall functions are able to solve it accurately, and thus the overall solution is inaccurate. This paper shows the effectiveness of using the wall y+ as a tool to assist in selecting a suitable mesh and turbulence model combination. The paper highlighted the time taken to carry out a grid independence study and therefore the need for a meshing strategy that reduces this time. The results show that the best combination of mesh structure and turbulence model depends on the flow scenario and the property being investigated. This work has been useful within several different industries, including wear on turbine components and heat transfer of refrigerant, and has been cited over 165 times in research and investigation studies.

Ariff et al. [16] built on this work and extended the investigation to 3-D turbulent flows over a cube, using the y+ as guidance. The study was divided into two parts, for low and high Reynolds numbers. For the low Reynolds number study, a y+ > 30 could not be investigated as it gave a poor mesh resolution. For this part of the study, the Spalart–Allmaras model was found to be the most suitable for predicting the reattachment when compared to the theoretical data. The second part of the study was for a higher Reynolds number of 40,000. The y+ value was approximately 33, thus solving in the log-law region of the turbulent boundary layer. It was found that most of the RANS turbulence models provide results of adequate accuracy, with some performing better in certain situations such as separation and reattachment.

The work on wall y+ from both these papers served as guidance for this project. This project differs from the previous work by investigating the flow of a non-Newtonian fluid in a rotating annulus, using the wall y+ value to obtain accurate results by selecting the most suitable turbulence model and mesh set-up.
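To make the y+ check concrete, a minimal post-processing sketch is given below (Python; the inputs are hypothetical stand-ins for solver-reported values, not results from this study):

```python
import math

def wall_yplus(tau_w, rho, mu, y_c):
    """y+ = rho * u_tau * y_c / mu, with friction velocity
    u_tau = sqrt(tau_w / rho); y_c is the wall distance of the
    first cell centre."""
    u_tau = math.sqrt(tau_w / rho)
    return rho * u_tau * y_c / mu

# Hypothetical stand-ins: wall shear stress of 2 Pa, water-like fluid,
# first cell centre 0.05 mm from the wall
print(round(wall_yplus(tau_w=2.0, rho=1000.0, mu=1.0e-3, y_c=5.0e-5), 2))  # ~2.24
```

In practice the solver reports the wall y+ field directly; the sketch only shows where the number comes from.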

3 Methodology

For this project, a 3D numerical study was carried out using ANSYS FLUENT v14.5 to replicate the experimental conditions of McCann [11], who investigated the effect of drill pipe rotation on the pressure loss within the annular space. Five combinations of RANS turbulence models and near-wall treatments are tested to identify the most appropriate pairs.

Table 1. Turbulence models investigated

Turbulence model     Near-wall treatment
k–ε                  Standard
k–ε                  Enhanced wall
k–ω                  Standard
Spalart–Allmaras     Standard
RSM                  Standard

The results are then compared to the experimental results using wall y+ values to aid in the selection of the most suitable model and mesh configuration.

3.1 Experimental Data

Fig. 1. Experiment schematic


The CFD model is validated against the experiment conducted by McCann [11]. The drilling fluid enters the annulus at the mud inlet, where the required flow rate is achieved using pumps. The flow rate was constant, corresponding to a bulk velocity of 0.905 m.s−1. The annulus is made up of a 31.37 mm diameter stainless steel shaft to replicate the drill pipe, while the wellbore is represented by an acrylic tube of 38.1 mm diameter. This set-up, Fig. 1, allows the annulus to be concentric or fully eccentric, while the motor can rotate the drill pipe at speeds of up to 900 rpm. The pressures are obtained by two pressure taps, 1.22 m apart, at each end of the annulus. A variety of fluids, pipe rotation speeds and annulus alignments were tested in that work. The focus of this project is on increasing pipe rotation for the concentric annulus with a non-Newtonian fluid, as this relates to drilling conditions. Rotating annular flow is not only applicable to oil & gas drilling but could also be applied to turbines and other components where fluid is used as a lubricant. The fluid is denoted as Fluid B in the paper and its properties are detailed in Table 2.

Table 2. Fluid B properties

Symbol   Quantity                Value (units)
ρ        Mud weight              9.14 ppg
n        Power-law index         0.697
K        Flow-consistency index  0.1398 lbf·s^n/100 ft^2
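Since Table 2 gives the power-law parameters in field units, a minimal conversion sketch may help when defining the fluid model (Python; the conversion factors are standard and the shear rate chosen is hypothetical):

```python
# Fluid B from Table 2, converted to SI (standard conversion factors assumed)
rho = 9.14 * 119.826        # ppg -> kg/m^3, ~1095 kg/m^3
n = 0.697                   # power-law index (dimensionless)
K = 0.1398 * 0.4788         # lbf.s^n/100 ft^2 -> Pa.s^n, ~0.0669 Pa.s^n

def apparent_viscosity(shear_rate):
    """Power-law (Ostwald-de Waele) model: mu_app = K * gamma_dot**(n - 1)."""
    return K * shear_rate ** (n - 1)

# Hypothetical shear rate of 100 1/s
print(f"{apparent_viscosity(100.0):.4f} Pa.s")  # ~0.0166 Pa.s
```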

3.2 Computational Domain and Boundary Conditions

Figure 2 portrays the computational domain used for the CFD study. As only the fluid flow in the annular space is being modelled, this is the only region that needs to be created; the outer face of the cylinder represents the stationary wellbore, while the inner face represents the rotating drill pipe. The computational domain replicates the experimental set-up. The length of the annulus is chosen to ensure fully developed flow. Following recommendations by Sorgun et al. [13], the length required for fully developed flow is found through Eq. (1).

Fig. 2. Computational domain

Le,turbulent = 4.4 (Do − Di) RePL^(1/6)    (1)

where RePL is the Reynolds number for a power-law fluid, which is found by Eq. (2):

RePL = [ρ V^(2−n) (Do − Di)^n] / [8^(n−1) K ((3n + 1)/(4n))^n]    (2)
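As a quick check, Eqs. (1) and (2) can be evaluated with the Fluid B properties and the annulus geometry (Python sketch; the SI conversions follow the earlier sketch, and the entrance-length correlation is assumed in its standard RePL^(1/6) form):

```python
rho, n, K = 1095.2, 0.697, 0.0669   # Fluid B in SI units (see Table 2 sketch)
Do, Di = 0.0381, 0.03137            # wellbore and drill pipe diameters (m)
V = 0.9055                          # bulk axial velocity (m/s)

# Eq. (2): generalized Reynolds number for a power-law fluid
Re_PL = (rho * V ** (2 - n) * (Do - Di) ** n) / (
    8 ** (n - 1) * K * ((3 * n + 1) / (4 * n)) ** n)

# Eq. (1): entrance length for fully developed flow
Le = 4.4 * (Do - Di) * Re_PL ** (1 / 6)

print(f"Re_PL ~ {Re_PL:.0f}, Le ~ {Le * 1000:.0f} mm")  # ~770 and ~90 mm
```

The low resulting Reynolds number is consistent with the low-Reynolds-number behaviour reported for the mesh in Sect. 4, and the short entrance length confirms that the domain comfortably accommodates fully developed flow between the pressure taps.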

The boundary conditions are set to match the experiment, with the flow rate converted to a bulk velocity of 0.9055 m.s−1. The boundary conditions are summarized in Table 3.

Table 3. Boundary conditions

Named selection      Boundary condition
Inlet                Velocity inlet – 0.9055 m.s−1
Outlet               Pressure outlet – 101325 Pa
Wall – Drill pipe    No-slip rotating wall – 0–800 rpm
Wall – Wellbore      No-slip stationary wall

4 Mesh

Fig. 3. Mesh refinement at both walls of the annulus

As was the case in the study by Ariff et al. [16], the y+ values available for investigation were determined by the Reynolds number. The low Reynolds number found in this project restricted the usable y+ values to less than 5. For this reason, the five turbulence models listed in Table 1 were all investigated to determine the most suitable for a mesh of this type. To create a mesh with y+ < 5, the mesh was refined at each of the two walls (a sizing sketch is given below). A structured mesh was created with the use of the various tools in ANSYS FLUENT to give greater refinement at the walls, the region where the fluid is strongly affected by the boundary layer. Figure 3 displays the mesh cross section of the annulus.
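As a pre-meshing estimate, the first-cell height needed for a target y+ can be approximated and then checked against the solver-reported wall y+ after the run. The sketch below (Python) assumes a Blasius-type skin-friction estimate, which strictly applies to turbulent Newtonian pipe flow, so for this non-Newtonian, low-Reynolds-number case it is only a starting guess:

```python
def first_cell_height(y_plus, rho, mu, V, Dh):
    """Estimate the first-cell-centre wall distance for a target y+.
    Wall shear is approximated with the Blasius skin-friction estimate
    (turbulent, Newtonian pipe flow), so this is only a starting guess."""
    Re = rho * V * Dh / mu
    c_f = 0.079 * Re ** -0.25
    tau_w = 0.5 * rho * V ** 2 * c_f
    u_tau = (tau_w / rho) ** 0.5
    return y_plus * mu / (rho * u_tau)

# Hypothetical inputs: apparent viscosity 0.017 Pa.s, hydraulic gap Do - Di
print(first_cell_height(1.0, 1095.2, 0.017, 0.9055, 0.00673))  # ~1.8e-4 m
```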


5 Results and Discussion

The following section presents the results of the CFD simulations produced by each turbulence model listed in Table 1, Sect. 3: k–ω, k–ε, k–ε enhanced wall, Spalart–Allmaras and RSM. The accuracy of each model was analysed by comparing the results against the experimental data of McCann [11]. The results are displayed for rotation speeds ranging from 0 to 800 rpm. Upon analysis of the results from each model, the most and least effective models were investigated further to determine the likely reasons for their performance, after which a turbulence model and meshing strategy is proposed.

5.1 Wall y+ < 5

The generated mesh produced small y+ values due to the low Reynolds number associated with the current boundary conditions. As can be seen from Fig. 4, the y+ value begins at around 1.2 at the inlet but decreases to less than 0.2 for the remainder of the pipe length. The structured mesh produces identical y+ values for the walls of the wellbore and the drill pipe.

The Relationships Among Attitude, Perceived Learning and Perceived Engagement

Hon Keung Yau and Yuk Fung Leung

Scales with a Cronbach's alpha value higher than 0.7 [20] can be said to be reliable. However, Cronbach's alpha is sensitive to the number of items in a scale, so the resulting values are likely to fall below the acceptable value (0.7) if the scale contains only a small number of items. Hence, Briggs and Cheek [3] proposed a more appropriate approach, which is to report the mean inter-item correlation of the items, with an optimal range of 0.2 to 0.4. The Pearson product-moment correlation coefficient (r) is used to determine whether there is a relationship between two factors [13].

The Cronbach's alpha for the attitude attributes is 0.799, which is higher than the acceptable value of 0.7 [20], implying that this scale is reliable. Meanwhile, the corrected item-total correlation values of all items in this scale are higher than the acceptable value of 0.3 [10], indicating that all the questions on the attitude attributes are relevant to the overall scale.

The Cronbach's alpha for the perceived learning attributes is 0.913, which is higher than the acceptable value of 0.7 [20], indicating that this scale is reliable. Meanwhile, the corrected item-total correlation values of all items in this scale are higher than the acceptable value of 0.3 [10], implying that all the questions on the perceived learning attributes are relevant to the overall scale.

The Cronbach's alpha for the perceived engagement attributes is 0.841, which is higher than the acceptable value of 0.7 [20], indicating that this scale is reliable. Meanwhile, the corrected item-total correlation values of all items in this scale are higher than the acceptable value of 0.3 [10], implying that all the questions on the perceived engagement attributes are relevant to the overall scale.
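For reference, the two reliability statistics used here can be computed with a short script. The sketch below (Python with NumPy; the response matrix is hypothetical, not the study's data) implements Cronbach's alpha and the mean inter-item correlation:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def mean_inter_item_correlation(items):
    """Mean of the off-diagonal pairwise item correlations."""
    r = np.corrcoef(np.asarray(items, dtype=float), rowvar=False)
    return r[np.triu_indices_from(r, k=1)].mean()

# Hypothetical 5-point Likert responses: 6 respondents x 4 items
scores = [[4, 5, 4, 4], [3, 3, 4, 3], [5, 5, 5, 4],
          [2, 3, 2, 3], [4, 4, 5, 5], [3, 4, 3, 3]]
print(cronbach_alpha(scores), mean_inter_item_correlation(scores))
```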

4 Results and Discussion

In this research, questionnaires were collected from 187 respondents. Among all respondents, about 43.3% are male and 56.7% are female. The major age range of the participants is between 18 and 23, which accounts for about 88.2%, while 7.5% of participants are aged from 24 to 30 and the other age ranges account for only about 4% in total. Regarding educational level, 49.2% of respondents are Year 3 students, 25.7% are Year 2, and 17.6% are postgraduate students. Year 1 and Year 4 students each account for 3.2%, and the others account for 1.1% of respondents. Most of the participants are engineering students, who account for 64.7%. There are 4.3% of respondents majoring in business and 3.7% majoring in science. Participants majoring in social science and creative media account for 2.7% and 1.1% respectively, while 23.5% of respondents are studying other majors such as nursing. Moreover, the majority of respondents are full-time students, who account for 90.9%, while part-time and exchange students account for about 8% and 1.1% respectively. Most of the participants have a GPA in the range of second honours, in which 47.1% are 3.0–3.49 and 29.9% are 2.5–2.99. About 8.0% of respondents have a GPA in the range of first honours and about 9.1% in the range of third honours. In addition, 65.2% of respondents have their own tablet PCs while 34.8% do not.

The Pearson correlation coefficient between attitudes and perceived learning is +0.699 (p = 0.000 < 0.01), which implies a significant and positive correlation. As a result, there is a high correlation between the two variables and their relationship is positive. Thus, H1 is supported. H1: Attitude towards the use of tablet PCs in learning is positively related to perceived learning.

The Pearson correlation coefficient between attitudes and perceived engagement is +0.619 (p = 0.000 < 0.01), which indicates a significant and positive correlation. Therefore, it shows a high correlation between attitudes and perceived engagement, and the two variables are positively related. Hence, H2 is supported. H2: Attitude towards using tablet PCs in learning is positively associated with perceived engagement.

The Pearson correlation coefficient between perceived learning and perceived engagement is +0.711 (p = 0.000 < 0.01), indicating that the correlation is statistically significant and positive. It shows a high correlation and a positive relationship between the two variables. Thus, H3 is supported. H3: Perceived learning is positively related to perceived engagement.

The findings have revealed that students' attitudes towards using tablet PCs for learning are positively associated with both perceived learning and perceived engagement. Similar findings have been demonstrated in other studies [5–7, 24]. Students who have more positive attitudes towards the use of tablet PCs tend to perceive more positively their value beliefs about the effects of using tablet PCs for learning. Meanwhile, they are also more likely to engage more in learning activities with the use of tablet PCs, as they are inclined to utilize the functionalities of the tablet PC fully in order to enrich their learning experience [24]. On the other hand, a positive relationship between perceived learning and perceived engagement has also been found. Vekiri and Chronaki [26] suggested that students with more positive perceptions of learning with a technology are likely to have higher engagement in using the technology for learning. Therefore, students who have more positive perceived learning regarding the use of tablet PCs have higher perceived engagement.
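For reference, the correlation test used above can be reproduced with a short script outside SPSS. The sketch below uses SciPy, with hypothetical score vectors standing in for the composite scale scores (one value per respondent):

```python
from scipy import stats

# Hypothetical composite scale scores, one value per respondent
attitude = [4.2, 3.1, 4.8, 2.9, 4.0, 3.5]
perceived_learning = [4.0, 3.3, 4.6, 2.7, 4.1, 3.2]

r, p = stats.pearsonr(attitude, perceived_learning)
print(f"r = {r:+.3f}, p = {p:.4f}")  # H1 supported if r > 0 and p < 0.01
```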


5 Conclusion

This study has evaluated the role of the tablet PC in students' learning in higher education by identifying the relationships among students' attitude, perceived learning and perceived engagement towards using tablet PCs for learning. A questionnaire was designed and distributed to university students in order to collect relevant data from respondents. The collected data were analysed and statistical results were generated with the Statistical Package for the Social Sciences (SPSS).

The main limitation of this study is the uneven sample of respondents. In the survey, the majority of respondents are engineering students and most are studying for a bachelor degree. Since the learning behaviour and environment of students of different majors and education levels can vary considerably, the findings of this research are of limited generalizability to students in higher education. On the other hand, the majority of respondents are from the university at which the researchers were studying, owing to the convenience of questionnaire distribution. Thus, the sample is not evenly distributed and is not representative of the general opinions of students in Hong Kong higher education. A random sample should be used instead of a convenience sample in order to achieve a more representative study. This project focused mainly on the student perspective regarding the relationships among attitude, perceived learning and perceived engagement towards using tablet PCs for learning. Future research could also involve students in primary and secondary schools.

References

1. Anderson, J., Gerbing, D.: Structural equation modelling in practice: a review and recommended two-step approach. Psychol. Bull. 103(3), 411–423 (1988)
2. Bansavich, J.C., Yoshioka, K.: The iPad: implications for higher education. In: 2011 EDUCAUSE Annual Conference, vol. 19, October 2011
3. Briggs, S.R., Cheek, J.M.: The role of factor analysis in the development and evaluation of personality scales. J. Pers. 54(1), 106–148 (1986)
4. Carter, C.P., Reschly, A.L., Lovelace, M.D., Appleton, J.J., Thompson, D.: Measuring student engagement among elementary students: pilot of the student engagement instrument–elementary version. Sch. Psychol. Q. 27(2), 61–73 (2012)
5. El-Gayar, O.F., Moran, M.: College students' acceptance of tablet PCs: an application of the UTAUT model. Dakota State University, 820 (2006)
6. El-Gayar, O.F., Moran, M.: Examining students' acceptance of tablet PC using TAM. Issues Inf. Syst. 8(1), 167–172 (2007)
7. Enriquez, A.G.: Enhancing student performance using tablet computers. Coll. Teach. 58(3), 77–84 (2010)
8. Farnsworth, B.J., Shaha, S.H., Bahr, D.L., Lewis, V.K., Benson, L.F.: Preparing tomorrow's teachers to use technology: learning and attitudinal impacts on elementary students. J. Instr. Psychol. 29(3), 121–138 (2002)
9. Field, A.: Discovering Statistics Using SPSS. Sage Publications, London (2009)
10. Fornell, C., Larcker, D.: Structural equation models with unobservable variables and measurement error: algebra and statistics. J. Mark. Res. 18(3), 382–388 (1981)


11. Fraenkel, J.R., Wallen, N.E.: How to Design and Evaluate Research in Education, 5th edn. McGraw-Hill Companies, New York (2003)
12. Fredericksen, E., Pickett, A., Shea, P., Pelz, W., Swan, K.: Student satisfaction and perceived learning with on-line courses: principles and examples from the SUNY learning network. J. Asynchronous Learn. Netw. 4(2), 7–41 (2000)
13. Ho, R.: Handbook of Univariate and Multivariate Data Analysis and Interpretation with SPSS. CRC Press, Boca Raton (2006)
14. Landis, R.N., Reschly, A.L.: Reexamining gifted underachievement and dropout through the lens of student engagement. J. Educ. Gift. 36(2), 220–249 (2013)
15. Liaw, S.S.: Investigating students' perceived satisfaction, behavioral intention, and effectiveness of e-learning: a case study of the Blackboard system. Comput. Educ. 51(2), 864–873 (2008)
16. Lim, K.Y., Toto, R., Nguyen, H., Zappe, S.E., Litzinger, T., Wharton, M., Cimbala, J.: Impact of instructors' use of the tablet PC on student learning and classroom attendance (2008). http://search.asee.org/search/fetch;jsessionid=1ofr39vucpdwq?url=file%3A%2F%2Flocalhost%2FE%3A%2Fsearch%2Fconference%2F17%2FAC%25202008Full616.pdf&index=conference_papers&space=129746797203605791716676178&type=application%2Fpdf&charset=
17. Mager, R.F.: Developing Attitude Toward Learning. ERIC (1968)
18. Muijs, D.: Doing Quantitative Research in Education with SPSS. SAGE Publications, Inc., Thousand Oaks (2004)
19. Lowe, M.: Beginning Research: A Guide for Foundation Degree Students. Routledge, London (2006)
20. Nunnally, J.C.: Psychometric Theory. McGraw-Hill, New York (1978)
21. O'Malley, C., Vavoula, G., Glew, J.P., Taylor, J., Sharples, M.: Guidelines for Learning/Teaching/Tutoring in a Mobile Environment (2005). http://www.mobilearn.org/download/results/public_deliverables/MOBIlearn_D4.1_Final.pdf
22. Partin, M.L., Haney, J.J., Worch, E.A., Underwood, E.M., Nurnberger-Haag, J.A., Gerhardt, M.W., Brown, K.G.: Individual differences in self-efficacy development: the effects of goal orientation and affectivity. Learn. Individ. Differ. 16(1), 43–59 (2006)
23. Rossing, J.P., Miller, W.M., Cecil, A.K., Stamper, S.E.: iLearning: the future of higher education? Student perceptions on learning with mobile tablets. J. Scholarsh. Teach. Learn. 12(2), 1–26 (2012)
24. Tenhet Jr., T.O.: An Examination of the Relationship between Tablet Computing and Student Engagement, Self-Efficacy, and Student Attitude Toward Learning. California State University, Fresno (2013)
25. Toto, R., Wharton, M., Cimbala, J., Wise, J.: One Step Beyond: Lecturing with a Tablet PC (2006). http://search.asee.org/search/fetch?url=file%3A%2F%2Flocalhost%2FE%3A%2Fsearch%2Fconference%2F12%2F2006Full1599.pdf&index=conference_papers&space=129746797203605791716676178&type=application%2Fpdf&charset=
26. Vekiri, I., Chronaki, A.: Gender issues in technology use: perceived social support, computer self-efficacy and value beliefs, and computer use beyond school. Comput. Educ. 51(3), 1392–1404 (2008)
27. Yang, S.H.: Exploring college students' attitudes and self-efficacy of mobile learning. Turk. Online J. Educ. Technol. TOJET 11(4), 148–154 (2012)
28. Yau, H.K., Leung, Y.F.: The relationship between self-efficacy and attitudes towards the use of technology in learning in Hong Kong higher education. In: Proceedings of the International MultiConference of Engineers and Computer Scientists 2018, IMECS 2018. Lecture Notes in Engineering and Computer Science, Hong Kong, 14–16 March, pp. 832–834 (2018)

Author Index

A
Araki, Kenji, 207
Arnold, Andrea, 1

B
Bose, Sourav, 39

C
Chapparya, Vaishali, 39
Chen, Yung-Sheng, 105
Choy, Alex W. H., 13

D
Damiani, Lorenzo, 331
Davidson, Andrew A., 318
Dwivedi, Prakash, 39

E
Edward, Ian Joseph Matheus, 93
Erwin, 165

F
Frini, Anissa, 276

H
Hama, Hiromitsu, 119
Hasegawa, Tomoya, 307
Hayasaka, Mitsuo, 194
Higuchi, Yudai, 143
Honda, Yuichi, 207
Hsu, Yu-Ching, 105

I
Inagaki, Yoichi, 69
Irawati, Indrarini Dyah, 93
Ito, Mari, 291
Iwamoto, Tomoya, 130
Izumi, Tomoko, 54, 143, 153

K
Kakimoto, Honoka, 82
Kashima, Tomoko, 130
Kawai, Yukiko, 82
Kiataramkul, Chanakarn, 338
Kitamura, Takayoshi, 54, 143, 153
Kobayashi, Fumiya, 291
Kobayashi, Ikuo, 119
Koonprasert, Sanoe, 233, 247
Korkiatsakul, Thanon, 233
Kozu, Hiroki, 307
Kushima, Muneo, 207

L
Laroche, Marie-Laure, 276
Le, Hieu Hanh, 207
Lekdee, Natchapon, 247
Leung, Yuk Fung, 354
Lun, Daniel P. K., 13

M
Ma, Chao-Tsung, 25
Maki, Nobuhiro, 194
Maruyama, Yukio, 307
Matsumoto, Shimpei, 130
Minami, Kazuhiro, 153
Moore, Elvin J., 338
Mori, Haruna, 219
Murata, Tomohiro, 194


N
Naka, Toshihiro, 54
Nakajima, Shinsuke, 69
Nakamoto, Reyn, 69
Nakatani, Yoshio, 54, 143, 153
Neamprem, Khomsan, 233
Nevriyanto, Adam, 165
Nishihara, Yoko, 219

P
Pornprakun, Wisanlaya, 338
Purnamasari, Diah, 165

R
Rachmatullah, Muhammad Naufal, 165
Revetria, Roberto, 331

S
Salim, Salim M., 318
Saparudin, 165
Sirisubtawee, Sekson, 247, 262
Sirois, Caroline, 276
Suksmono, Andriyan Bayu, 93
Sumiya, Kazutoshi, 82
Sungnul, Surattana, 338
Suzuki, Toshio, 182

T
Takashima, Ryuta, 291
Tezuka, Shin, 194
Tin, Pyke, 119
Tran, Hien, 1

W
Wang, Yuanyuan, 82

Y
Yagawa, Yuichi, 194
Yamagishi, Shuichi, 130
Yamanishi, Ryosuke, 219
Yamazaki, Tomoyoshi, 207
Yau, Hon Keung, 354
Yokota, Haruo, 207
Yuyama, Tsuyoshi, 307

Z
Zhang, Jianwei, 69
Zin, Thi Thi, 119