Behaviour Monitoring and Interpretation - BMI : Smart Environments [1 ed.] 9781607504597, 9781607500483

Focuses on behaviour monitoring and interpretation with regard to two main areas: the investigation of motion patterns and Ambient Assisted Living.


English Pages 368 Year 2009


Copyright © 2009. IOS Press, Incorporated. All rights reserved.

BEHAVIOUR MONITORING AND INTERPRETATION – BMI

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

Ambient Intelligence and Smart Environments

The Ambient Intelligence and Smart Environments (AISE) book series presents the latest research results in the theory and practice, analysis and design, implementation, application and experience of Ambient Intelligence (AmI) and Smart Environments (SmE).

Coordinating Series Editor: Juan Carlos Augusto
Series Editors: Emile Aarts, Hamid Aghajan, Michael Berger, Vic Callaghan, Diane Cook, Sajal Das, Anind Dey, Sylvain Giroux, Pertti Huuskonen, Jadwiga Indulska, Achilles Kameas, Peter Mikulecký, Daniel Shapiro, Toshiyo Tamura, Michael Weber

Volume 3

Recently published in this series:

Vol. 2. V. Callaghan et al. (Eds.), Intelligent Environments 2009 – Proceedings of the 5th International Conference on Intelligent Environments: Barcelona 2009
Vol. 1. P. Mikulecký et al. (Eds.), Ambient Intelligence Perspectives – Selected Papers from the First International Ambient Intelligence Forum 2008

ISSN 1875-4163


Behaviour Monitoring and Interpretation – BMI Smart Environments

Edited by

Björn Gottfried
Centre for Computing Technologies, Universität Bremen, Germany

and

Hamid Aghajan
Department of Electrical Engineering, Stanford University, USA

Amsterdam • Berlin • Tokyo • Washington, DC


© 2009 The authors and IOS Press. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-60750-048-3
Library of Congress Control Number: 2009932690

Publisher
IOS Press BV
Nieuwe Hemweg 6B
1013 BG Amsterdam
Netherlands
fax: +31 20 687 0019
e-mail: [email protected]

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

Distributor in the UK and Ireland
Gazelle Books Services Ltd.
White Cross Mills
Hightown
Lancaster LA1 4XS
United Kingdom
fax: +44 1524 63232
e-mail: [email protected]

Distributor in the USA and Canada
IOS Press, Inc.
4502 Rachael Manor Drive
Fairfax, VA 22032
USA
fax: +1 703 323 3668
e-mail: [email protected]

LEGAL NOTICE
The publisher is not responsible for the use which might be made of the following information.

PRINTED IN THE NETHERLANDS


Behaviour Monitoring and Interpretation – BMI
B. Gottfried and H. Aghajan (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.


Preface

The workshop on Behaviour Monitoring and Interpretation (BMI) was launched in 2007, co-located with the German conference on Artificial Intelligence. The first two editions of the workshop indicated a significant interest in research related to BMI and motivated the creation of the current volume, with contributions by a number of leading researchers in this emerging field. Although the book covers a broad spectrum of topics concerned with behaviour monitoring and interpretation, two prominent directions are in focus here, reflecting particular interest in ongoing research: the investigation of motion patterns and the area of Ambient Assisted Living. This volume aims to offer state-of-the-art contributions along these directions of research.

The first chapter is an introduction to the area of BMI by Björn Gottfried and Hamid Aghajan; it explains what this field signifies and how it relates to other research areas.

In the first part of this volume, a number of chapters discuss recent developments in monitoring and representing behaviour, with a particular focus on movement-based behaviour. Alexandra Millonig, Norbert Brändle, Markus Ray, Dietmar Bauer, and Stefan van der Spek provide an overview of methods for monitoring and analysing pedestrian motion behaviours. The subsequent chapter by Patrick Laube also considers movement behaviour; however, the focus is on which typical patterns can be distinguished, patterns that are not restricted to human beings and also involve groups of objects. Similarly, groups and their movement patterns are investigated by Zena Wood and Antony Galton, who provide a classification scheme for collectives. Yohei Kurata and Max Egenhofer provide a qualitative spatial representation for relating directed line segments to their topological context; in this way they characterise movement patterns of individuals in relation to their topologically described context. Another qualitative representation is provided by Frank Dylla, who considers ordinal relations; basic movement patterns are described between pairs of objects.

The second part of the volume includes chapters that are more application driven. Tim Adlam, Bruce Carey-Smith, Nina Evans, Roger Orpwood, Jennifer Boger, and Alex Mihailidis present case studies about the monitoring and support of people with dementia in smart environments. Sylvain Giroux, Tatjana Leblanc, Abdenour Bouzouane, Bruno Bouchard, Hélène Pigot, and Jérémy Bauchet report on AI techniques applied in smart environments, in particular for providing inhabitants with cognitive impairment assistance in their everyday life. Joyca Lacroix, Yasmin Aghajan, and Aart van Halteren discuss another approach that makes environments smarter: ambient assisted physical activity systems are presented that aid in increasing the engagement of seniors in physical activities.

The third part presents a number of investigations which show how monitored behaviours can be interpreted in smart environments. Peter Kiefer, Klaus Stein, and Christoph Schlieder give a survey on knowledge-intensive methods for intention recognition; in particular, they look at how environments are spatially structured and take context-specific background knowledge into account. Albert Hein, Christoph Burghardt, Martin Giersich, and Thomas Kirste discuss an approach for the detection of high-level activities, in particular for interpreting team behaviours. Asier Aztiria,


Alberto Izaguirre, Rosa Basagoiti, and Juan Carlos Augusto present a model of how ambient intelligence systems can automatically discover patterns of user behaviour; they also discuss how users' interaction with the system can improve its performance.

The two final chapters are devoted to the infrastructure of smart environments. Matt Duckham and Rohan Bennett investigate decentralised spatiotemporal algorithms that optimise the support of spatially distributed systems of smart environments, for example in order to monitor environmental changes even at large geographical scales. More related to middleware technologies is the contribution by Alvaro Marco, Roberto Casas, Gerald Bauer, Rubén Blasco, Ángel Asensio, Bruno Jean-Bart, and Miriam Ibañez; they present a framework for enabling the interoperability and handling the heterogeneity of components found in ambient assisted living systems.

We hope you will find the material presented in this volume of interest to your research. A further motivation for this book has been to encourage interdisciplinary interaction among researchers working in the various fields related to BMI. We hope the state of the art presented in this volume will offer a glimpse of the potential ahead.


June 2009
Björn Gottfried and Hamid Aghajan


Contents

Preface
  Björn Gottfried and Hamid Aghajan  v

Behaviour Monitoring and Interpretation: Smart Environments
  Björn Gottfried and Hamid K. Aghajan  1

Monitoring and Analysing Movement Behaviours

Pedestrian Behaviour Monitoring: Methods and Experiences
  Alexandra Millonig, Norbert Brändle, Markus Ray, Dietmar Bauer and Stefan van der Spek  11

Progress in Movement Pattern Analysis
  Patrick Laube  43

Representing and Reasoning About Movement Behaviours

Interpretation of Behaviours from a Viewpoint of Topology
  Yohei Kurata and Max J. Egenhofer  75

Qualitative Spatial Reasoning for Navigating Agents: Behavior Formalization with Qualitative Representations
  Frank Dylla  98

Classifying Collective Motion
  Zena Wood and Antony Galton  129

Well-Being and Assisted Living

Implementing Monitoring and Technological Interventions in Smart Homes for People with Dementia: Case Studies
  Tim Adlam, Bruce Carey-Smith, Nina Evans, Roger Orpwood, Jennifer Boger and Alex Mihailidis  159

The Praxis of Cognitive Assistance in Smart Homes
  Sylvain Giroux, Tatjana Leblanc, Abdenour Bouzouane, Bruno Bouchard, Hélène Pigot and Jérémy Bauchet  183

Avatar Communication for a Social In-Home Exercise System: A User Study
  Joyca Lacroix, Yasmin Aghajan and Aart van Halteren  212

Behaviour Interpretation in Smart Environments

Rule-Based Intention Recognition from Spatio-Temporal Motion Track Data in Ambient Assisted Living
  Peter Kiefer, Klaus Stein and Christoph Schlieder  235

Model-Based Inference Techniques for Detecting High-Level Team Intentions
  Albert Hein, Christoph Burghardt, Martin Giersich and Thomas Kirste  257

Learning About Preferences and Common Behaviours of the User in an Intelligent Environment
  Asier Aztiria, Alberto Izaguirre, Rosa Basagoiti and Juan Carlos Augusto  289

Infrastructures for Smart Environments

Ambient Spatial Intelligence
  Matt Duckham and Rohan Bennett  319

Common OSGi Interface for Ambient Assisted Living Scenarios
  Alvaro Marco, Roberto Casas, Gerald Bauer, Rubén Blasco, Ángel Asensio, Bruno Jean-Bart and Miriam Ibañez  336

Author Index  359

Behaviour Monitoring and Interpretation – BMI
B. Gottfried and H. Aghajan (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-048-3-1


Behaviour Monitoring and Interpretation: Smart Environments

Björn GOTTFRIED a,1 and Hamid K. AGHAJAN b
a Centre for Computing Technologies, University of Bremen, Germany
b Department of Electrical Engineering, Stanford University, USA

Abstract. This chapter provides an overview of the area of Behaviour Monitoring and Interpretation, BMI for short. It outlines this research direction and gives examples of current research. In a nutshell, BMI is about monitoring the behaviour of humans or other entities and interpreting the observed behaviour. The chapter then briefly discusses how BMI compares to related areas such as Ambient Intelligence and ubiquitous computing, and finally outlines future challenges.

Keywords. Behaviour Monitoring; Behaviour Interpretation; Smart Environments; Ambient Assisted Living; Pervasive Computing; Ambient Intelligence.


Introduction

Monitoring what occurs in the environment, what people do, and how they interact with their surroundings is of interest in several areas. Smart environments, the fields of healthcare and security, and, more generally, Ambient Intelligence applications and mobile services are just a few examples. Independent of the specific application, the purpose is always to learn how humans behave. Substantial technological advances in recent years have greatly improved the monitoring of activities and matured the field of behaviour monitoring. Additionally, the employment of techniques developed in Computer Science, and in particular in Artificial Intelligence, has enabled novel ways of interpreting activities. This has motivated workshops such as BMI, short for Behaviour Monitoring and Interpretation [9,10]. As a consequence, a more precise definition of BMI is timely.

The following introduction to the field of BMI outlines a research direction that is well established from the point of view of behavioural science, somewhat newer within the subfield of geography that analyses spatiotemporal phenomena, and that has evolved only in the past two decades in the context of ubiquitous computing and smart environments. The latter direction is what distinguishes BMI from behavioural science, namely the employment of new technologies for behaviour monitoring, analysis, and even behaviour interpretation. We will present the latest investigations contributing

1 Corresponding Author: Björn Gottfried, Centre for Computing Technologies, University of Bremen, 28359 Bremen, Germany; E-mail: [email protected].


B. Gottfried and H.K. Aghajan / Behaviour Monitoring and Interpretation: Smart Environments

  Abstraction level            | Description                                       | BMI level
  Semantics                    | Mapping analysis results to something meaningful  | Interpretation
  Property of interest         | Some software for extracting specific properties  | Analysis
  Necessary & sufficient data  | A specific representation for the acquired data   | Representation
  Observation                  | A sensory system for monitoring the behaviour     | Measurement tools
  Reality                      | Some actual behaviour                             | Object of observation

Figure 1. BMI layers.

to this field and will discuss specific directions the BMI community places particular emphasis on. We will conclude with an outlook on future research within the field of BMI.


1. Behaviour Monitoring and Interpretation

Fig. 1 illustrates the different layers that may be involved in a BMI system. At the bottom layer are the behaviours one may want to interpret. For this purpose, monitoring techniques based on different sensor technologies are employed; the second layer consists of such measurement modules. The third layer is about the representation of the data captured by the sensors, indicating which pieces of data may be necessary and sufficient for the interpretation task at hand. The represented data is further processed at the analysis layer, in order to detect how it relates to background and context knowledge, or to detect specific relations within the acquired data. The top layer eventually represents the semantics, and hence the interpretation results, of the monitored behaviour.

In the following, each of these layers is discussed in turn, and a few case studies representing current research directions involving the mentioned BMI layers are presented. These case studies indicate that not all layers are necessarily part of each research project or BMI system. Instead, emphasis is placed on different layers depending on whether the application aims only to observe and report on an event, involves algorithms for interpreting the observations, or is expected to provide an appropriate representation enabling reasoning about the observed behaviour. In this sense, the layered design notion helps to envision how the different research directions within this field may work together. Furthermore, the combined design also involves how the layers connect to each other. Altogether, this framework enables easier comparison of approaches.
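To make the layered design concrete, the five layers can be sketched as a small processing pipeline. This is only an illustrative sketch: the class, the callables, and the 1.4 m/s threshold are hypothetical and not taken from any system described in this book.

```python
# Hypothetical sketch of the five BMI layers as a processing pipeline.
# Each stage maps onto one layer of Figure 1; all names are illustrative.

from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class BMIPipeline:
    sense: Callable[[], List[Any]]          # Observation: measurement tools
    represent: Callable[[List[Any]], Any]   # Representation: necessary & sufficient data
    analyse: Callable[[Any], Any]           # Analysis: extract the property of interest
    interpret: Callable[[Any], str]         # Interpretation: map results to semantics

    def run(self) -> str:
        raw = self.sense()                  # reality -> raw sensor readings
        model = self.represent(raw)
        feature = self.analyse(model)
        return self.interpret(feature)

# Toy instantiation: label a walker from position samples (metres, 1 Hz).
positions = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (4.5, 0.0)]
pipeline = BMIPipeline(
    sense=lambda: positions,
    represent=lambda pts: [b[0] - a[0] for a, b in zip(pts, pts[1:])],  # speeds
    analyse=lambda speeds: sum(speeds) / len(speeds),                   # mean speed
    interpret=lambda v: "swift" if v > 1.4 else "sauntering",
)
print(pipeline.run())  # mean speed 1.5 m/s -> "swift"
```

Because each stage is an independent callable, a project that emphasises only some layers, as noted above, can swap a stage without touching the others.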


  Semantics (Interpretation): swift, convenient, passionate shopper (Millonig 2008); riding, curving, sauntering, standing (Kiefer 2008); approach, go away from, go towards, stop (Shi 2008); normal vs. abnormal behaviour patterns (Adlam 2009); activities of daily living, e.g. preparing meals (Giroux 2009, Adlam 2009)
  Property of interest (Analysis): trajectory segmentation w.r.t. region, speed, curvature (Kiefer 2008); learning: mining for frequent sequences (Aztiria 2009); clustering and speed histograms (Millonig 2008); learning (Hein 2008); lattice-based, HMM, Bayesian networks (Giroux 2009)
  Necessary & sufficient data (Representation): ordinal (Dylla 2007); topologic (Kurata 2007); finite state machines (Goshorn 2008); grammars (Kiefer 2009); probabilistic representation (Hein 2009)
  Observation (Measurement tools): localisation technologies (Millonig 2009); geosensor networks (Duckham 2009); smart floors (Steinhage 2008); video monitoring (Lacroix 2008); services for heterogeneous monitoring techniques (Lacroix 2008)
  Reality (Object of observation): hand postures (Goshorn 2008); pedestrians (Millonig 2008); cyclists (Kiefer 2008); sailors (Dylla 2009); collectives (Wood 2009); movement patterns (Laube 2009); nursing home residents (Adlam 2009, Giroux 2009)

Figure 2. BMI layers and some examples different authors deal with.


1.1. Reality: Object of Interest

We are interested in the observation of behaviour and the attempt to make sense of these observations. This concerns primarily the behaviour of people, but might in principle also concern other entities such as animals or machines. Behaviour patterns might be of interest in their own right in specific research fields, or in the context of specific applications such as smart environments. The range of possible behaviours is wide and includes examples like hand movements controlling a computer mouse, facial gestures and expressions, hand gestures and body movements, postures and poses, spatial movements, and everything observable that concerns the behaviour of individuals or groups. Conclusions drawn from these behaviours include intentions, desires, goals, and anything else that might be the purpose of the observation.

Recent studies on behaviour have appeared in related workshops and other publications. In [28] sensor plates are used to detect the number of people in a place and where they move, either indoors or outdoors. Knowing where people stay and for how long enables applications in smart environments, including modelling normal and abnormal behaviour. In [26] shopping patterns of pedestrians are investigated and three types of shoppers are identified based on the observed movement behaviours. The behaviour of cyclists is considered in [16] to distinguish behaviours such as riding, curving, slow curving, sauntering, and standing. Concepts of human motion are investigated in [27], where directional movement behaviours such as go towards, approach, go away from, and stop are detected. A basis for research on group behaviour is provided in [33], which defines a model of collective phenomena, enabling the classification of groups according to the flexibility of individual memberships, the variability of their locations, their coherence, and the kinds of roles members might have in relation to their group. Also related to group behaviour are investigations in [15], which analyse the space use of students on a campus in order to find out where they tend to work as opposed to socialise. Soccer players and their behaviour are considered in several studies [31,11,23]. Daily life behaviour in smart environments is considered in [4,14,21], including applications in daily routine activities, medication, and common habits typical of an individual. Hand posture behaviour is analysed in [8], where hand poses are assembled into hand gestures which are then


used as input commands to control electronic devices. Yet another kind of behaviour has been investigated in [18]: tracking mice in a cage to find out where they prefer to stay and which paths they take in their cage labyrinth.

1.2. Observation: Measurement Tools

The second layer is about the deployment of sensor technologies in order to sense behaviour patterns. A number of methods for behaviour monitoring can be found in [25]. The paper focuses on techniques to monitor motion behaviour and discusses the state of the art, including such technologies as GPS, GSM/UMTS, Bluetooth, WLAN, RFID, and video. Vision as the mode of monitoring is investigated in [1,29]; video technology is also used for monitoring users during exercising [21]. Sometimes single measurement tools are not sufficient for the monitoring task at hand. Instead, a network of sensors may become necessary, in particular in a distributed geographic space [5]. Additionally, standards for interconnecting middleware in smart environments are important for connecting different sensing devices over networks [24]. Such middleware may also become necessary when the monitoring task cannot be accomplished with a single modality but only with a heterogeneous set of sensors, including infrastructure sensors in the environment and wearable sensors carried by users.

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

1.3. Necessary & Sufficient Data: Representation

Instead of direct quantitative interpretation of the data, an alternative approach consists of generating formal representations which are subsequently used for reasoning. That is, in the third layer a representation of the data suitable for the application is created. For example, such a representation might be based on topological relations [20,19,12] or on ordinal relations [6,30]. While [20] relates the motion behaviour of single moving objects to their environment, [6] relates object pairs and describes their relative motion behaviour. Other techniques may be required when dealing with collections of moving objects [34]. The spaces in which objects are monitored can be modelled as well. Among others, [22] considers motion patterns in Euclidean space, constrained as opposed to unconstrained spaces, the space-time aquarium, heterogeneous field spaces, irregular tessellations, and networks.

1.4. Property of Interest: Analysis

The fourth layer is about further data processing, e.g. by clustering quantitative data after employing preprocessing operations such as outlier rejection or smoothing [25]. Techniques related to visual analytics are proposed in [2] to enable humans to evaluate complex data sets more easily. In [17] formal grammars are applied to map the observed behaviour to intentions, allowing crossing dependencies between the different behaviours in a temporal sequence to be considered. In [7] AI techniques are applied to analyse the activities of inhabitants in smart environments, while non-technical modes of communication are also taken into account in the design.
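As a toy illustration of the representation layer, the relative motion of an object pair (in the spirit of the ordinal relations of [6] and the directional concepts of [27]) can be abstracted to the sign of the change in inter-object distance. The function names and the epsilon threshold below are assumptions for this sketch, not the formalisms of those chapters.

```python
# Toy qualitative abstraction of relative motion between two tracked objects:
# only the sign of the change in distance is kept, discarding quantitative
# detail. The eps threshold for "stable" is an illustrative assumption.
import math

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def relative_motion(track_a, track_b, eps=0.05):
    """Map two synchronised tracks to a sequence of qualitative relations."""
    labels = []
    dists = [distance(a, b) for a, b in zip(track_a, track_b)]
    for d0, d1 in zip(dists, dists[1:]):
        if d1 - d0 > eps:
            labels.append("go away from")
        elif d0 - d1 > eps:
            labels.append("approach")
        else:
            labels.append("stable")
    return labels

a = [(0, 0), (1, 0), (2, 0), (3, 0)]   # a moves right at constant speed
b = [(4, 0), (4, 0), (4, 0), (8, 0)]   # b stands still, then moves away
print(relative_motion(a, b))           # ['approach', 'approach', 'go away from']
```

Such a symbolic sequence can then feed the analysis layer directly, e.g. as input to the grammar- or automaton-based techniques mentioned above.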


  Abstraction level            | Example                             | BMI level
  Semantics                    | Convenient vs. passionate shopper   | Interpretation
  Property of interest         | Speed histogram                     | Analysis
  Necessary & sufficient data  | Coordinates of smoothed trajectory  | Representation
  Observation                  | GPS                                 | Measurement tools
  Reality                      | Actual walking behaviour            | Object of observation

Figure 3. BMI layers filled with an example.


1.5. Semantics: Interpretation

The top layer encompasses methods of data interpretation. Several methods of behaviour interpretation of moving objects based on formal grammars are investigated in [16]. In [14] learning techniques for high-level activities such as daily routines are studied. The behaviour clues are processed until they are mapped to specific concepts at this top layer. Team behaviours are considered in [13] by employing probabilistic reasoning. For example, pedestrian behaviour is mapped to the concepts of swift, convenient, or passionate shoppers [26] by means of clustering techniques. The behaviours of cyclists are mapped to concepts such as riding or searching [16] by using formal grammars. Movement behaviours of objects are categorised differently in [27], which describes them with the aid of a specific qualitative representation.

1.6. Summary

BMI focuses on observing behaviours and analysing or interpreting them. Accordingly, methodologies investigated in the BMI field range from the monitoring of behaviours, via their analysis and representation, up to their interpretation. Instead of being a self-contained area, BMI should be conceived of as a subfield of several different areas. Some investigations focus on the monitoring, others on the interpretation, and yet others discuss the whole process.

Fig. 3 summarises the layers BMI distinguishes by adapting the example from [26]. That is, walking behaviours are observed (bottom layer), for example by collecting positions outdoors with GPS (second layer). The position coordinates are taken and smoothed (third layer); they form the trajectories individual people have covered. Speed histograms can be computed, since the GPS coordinates carry temporal information (fourth layer). Eventually, clustering techniques identify typical behaviours, which are annotated with semantic concepts characterising those behaviours (top layer).
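The walk-through above can be sketched end to end as follows. This is a minimal illustration, not the procedure of [26]: coordinates are treated as planar metres sampled at 1 Hz, and the mapping from the dominant histogram bin to a shopper type is an invented placeholder.

```python
# Sketch of the Figure 3 pipeline: positions -> smoothing -> speeds ->
# speed histogram -> semantic label. Planar coordinates in metres at 1 Hz;
# the shopper-type threshold is an illustrative assumption.
from collections import Counter

gps_track = [(0, 0), (0.6, 0), (1.1, 0.1), (1.9, 0.1), (2.6, 0.2), (3.1, 0.2)]

def smooth(track):
    """Third layer: average each position with its two neighbours."""
    out = [track[0]]
    for a, b, c in zip(track, track[1:], track[2:]):
        out.append(((a[0] + b[0] + c[0]) / 3, (a[1] + b[1] + c[1]) / 3))
    out.append(track[-1])
    return out

def speeds(track):
    """Per-second speeds from consecutive positions (timestamps implicit)."""
    return [((b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2) ** 0.5
            for a, b in zip(track, track[1:])]

def speed_histogram(vs, bin_width=0.5):
    """Fourth layer: bucket speeds into histogram bins of 0.5 m/s."""
    return Counter(int(v / bin_width) for v in vs)

hist = speed_histogram(speeds(smooth(gps_track)))
dominant_bin = hist.most_common(1)[0][0]
# Top layer: an invented placeholder mapping from speed profile to concept.
label = "passionate shopper" if dominant_bin == 0 else "convenient shopper"
print(hist, label)
```

In a realistic system the final step would be a clustering over many trajectories rather than a fixed threshold, as in the pedestrian study cited above.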


2. Related Research

To get a better idea of the aims and scope of BMI, it helps to outline its relationship to related areas and their topics. Two closely related areas are Ambient Intelligence (AmI) and Smart Environments (SmE). Given the multidisciplinary nature of these areas, they cover diverse topics such as multimodal sensing, knowledge representation and reasoning, application-oriented efforts in human-centred services, robotics, networking, HCI, and mobile, collaborative, and ubiquitous computing. Overview articles on various key topics of these areas can be found in [3]. As these areas grow broader and more complex, it is important to identify subareas that can be investigated more thoroughly.

BMI can be conceived of as a subfield of AmI and SmE in that both fields require the monitoring and interpretation of behaviours. While AmI and SmE provide a more holistic view, BMI is a proper subconcept of AmI that covers the more specific problem of how to sense and interpret behaviours. The complementary subfield of AmI and SmE concerns the environment automatically adapting to users' needs, or responding and reacting to the observed behaviours. The field of ubiquitous computing [32], or pervasive computing, can also be regarded as a direction concerned with BMI, though only in settings where services are provided based on observed behaviour.


3. Future Research

Future developments in the field of BMI may focus on improving data quality, i.e. accuracy and precision, or on broadening the range of behaviours that can be captured automatically. Furthermore, the employment of computational approaches to interpretation will significantly advance data analysis methods. That is, tools will provide more sophisticated help in analysing observations, relating them to a broader context, and supporting predictions of behaviours. Such results will enable environments to become smarter by providing support through automatic behaviour interpretation and even prediction.

More specifically, the layered design shown above helps to envision how the different research directions within the field of BMI may work together. It also provides a framework facilitating better comparison of approaches. Future research will have to investigate whether the individual layers themselves can provide general sub-frameworks to aid in standardising modules or in opening opportunities for interdisciplinary research.

Finally, efforts are necessary to integrate BMI with related areas such as those mentioned in Section 2. For this purpose, BMI needs to be identified as an individual subarea within those fields, sufficiently separated from other components of complex systems. This would help such systems to develop better control methods for their BMI modules. Eventually, as soon as the technology reaches maturity, an investigation will be needed into whether a general way exists for plugging BMI components into specific applications, devices, and smart environments. The ability to use BMI modules as building blocks for specific purposes and contexts is the ultimate goal of this research direction.


References

[1] H. Aghajan and C. Wu. From Distributed Vision Networks to Human Behavior Interpretation. In B. Gottfried, editor, 1st Workshop on Behaviour Monitoring and Interpretation (BMI’07), volume 296, pages 129–143. CEUR Workshop Proceedings, 2007.
[2] N. Andrienko and G. Andrienko. Extracting patterns of individual movement behaviour from a massive collection of tracked positions. In B. Gottfried, editor, 1st Workshop on Behaviour Monitoring and Interpretation (BMI’07), volume 296, pages 1–16. CEUR Workshop Proceedings, 2007.
[3] J. C. Augusto and H. Aghajan. Editorial: Inaugural issue. Journal of Ambient Intelligence and Smart Environments, 1(1):1–4, 2009.
[4] A. Aztiria, A. Izaguirre, R. Basagoiti, and J. C. Augusto. Learning of Users’ Preferences and Common Behaviours in Intelligent Environments. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.
[5] M. Duckham and R. Bennett. Ambient Spatial Intelligence. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.
[6] F. Dylla. Qualitative Spatial Reasoning for Navigating Agents. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.
[7] S. Giroux, T. Leblanc, A. Bouzouane, B. Bouchard, H. Pigot, and J. Bauchet. The Praxis of Cognitive Assistance in Smart Homes. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.
[8] R. Goshorn, D. Goshorn, and M. Kölsch. The Enhancement of Low-Level Classifications for Ambient Assisted Living. In B. Gottfried and H. Aghajan, editors, 2nd Workshop on Behaviour Monitoring and Interpretation (BMI’08), volume 396, pages 87–101. CEUR Workshop Proceedings, 2008.
[9] B. Gottfried, editor. Behaviour Monitoring and Interpretation, volume 296. CEUR Workshop Proceedings, 2007.
[10] B. Gottfried and H. Aghajan, editors. Behaviour Monitoring and Interpretation, volume 396. CEUR Workshop Proceedings, 2008.
[11] B. Gottfried and J. Witte. Representing spatial activities by spatially contextualised motion patterns. In G. Lakemeyer, E. Sklar, D. G. Sorrenti, and T. Takahashi, editors, RoboCup 2006 International Symposium, Bremen, Germany, June 19–20, volume 4434 of LNAI, pages 329–336. Springer, 2006.
[12] P. Hallot and R. Billen. Spatio-temporal configurations of dynamics points in a 1D space. In B. Gottfried, editor, 1st Workshop on Behaviour Monitoring and Interpretation (BMI’07), volume 296, pages 77–90. CEUR Workshop Proceedings, 2007.
[13] A. Hein, C. Burghardt, M. Giersich, and T. Kirste. Model-based Inference Techniques for Detecting High-Level Team Intentions. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.
[14] A. Hein and T. Kirste. Towards recognizing abstract activities: An unsupervised approach. In B. Gottfried and H. Aghajan, editors, 2nd Workshop on Behaviour Monitoring and Interpretation (BMI’08), volume 396, pages 102–114. CEUR Workshop Proceedings, 2008.
[15] T. Heitor, A. Tomé, P. Dimas, and J. P. Silva. Measurability, Representation and Interpretation of Spatial Usage in Knowledge-Sharing Environments – A Descriptive Model Based on WiFi Technologies. In B. Gottfried, editor, 1st Workshop on Behaviour Monitoring and Interpretation (BMI’07), volume 296, pages 43–61. CEUR Workshop Proceedings, 2007.
[16] P. Kiefer and K. Stein. A Framework for Mobile Intention Recognition in Spatially Structured Environments. In B. Gottfried and H. Aghajan, editors, 2nd Workshop on Behaviour Monitoring and Interpretation (BMI’08), volume 396, pages 28–41. CEUR Workshop Proceedings, 2008.
[17] P. Kiefer, K. Stein, and C. Schlieder. Rule-based intention recognition from spatio-temporal motion track data in ambient assisted living. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.
[18] M. Kritzler, L. Lewejohann, and A. Krüger. Analysing movement and behavioural patterns of laboratory mice in a semi-natural environment based on data collected via RFID technology. In B. Gottfried, editor, 1st Workshop on Behaviour Monitoring and Interpretation (BMI’07), volume 296, pages 17–28. CEUR Workshop Proceedings, 2007.
[19] Y. Kurata and M. Egenhofer. The 9+-Intersection for Topological Relations between a Directed Line Segment and a Region. In B. Gottfried, editor, 1st Workshop on Behaviour Monitoring and Interpretation (BMI’07), volume 296, pages 62–76. CEUR Workshop Proceedings, 2007.
[20] Y. Kurata and M. Egenhofer. Interpretation of Behaviours from a Viewpoint of Topology. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.
[21] J. Lacroix, Y. Aghajan, and A. Van Halteren. Avatar communication for home exercise: user preferences in a social setting. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.
[22] P. Laube. Progress in Movement Pattern Analysis. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.
[23] P. Laube, S. Imfeld, and R. Weibel. Discovering relative motion patterns in groups of moving point objects. International Journal of Geographical Information Science, 19(6):639–668, 2005.
[24] A. Marco, R. Casas, G. Bauer, L. Lain, M. Binas, and B. Jean-Bart. A Common OSGi Interface for Ambient Assisted Living Scenarios. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.
[25] A. Millonig, N. Brändle, M. Ray, D. Bauer, and S. Van Der Spek. Pedestrian Behaviour Monitoring: Methods and Experiences. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.
[26] A. Millonig and G. Gartner. Shadowing – Tracking – Interviewing: How to Explore Human ST-Behaviour. In B. Gottfried and H. Aghajan, editors, 2nd Workshop on Behaviour Monitoring and Interpretation (BMI’08), volume 396, pages 42–56. CEUR Workshop Proceedings, 2008.
[27] H. Shi and Y. Kurata. Modeling Ontological Concepts of Motions with Two Projection-Based Spatial Models. In B. Gottfried and H. Aghajan, editors, 2nd Workshop on Behaviour Monitoring and Interpretation (BMI’08), volume 396, pages 42–56. CEUR Workshop Proceedings, 2008.
[28] A. Steinhage and C. Lauterbach. Monitoring movement behaviour by means of a large-area proximity sensor array in the floor. In B. Gottfried and H. Aghajan, editors, 2nd Workshop on Behaviour Monitoring and Interpretation (BMI’08), volume 396, pages 15–27. CEUR Workshop Proceedings, 2008.
[29] K. Terzic, L. Hotz, and B. Neumann. Division of Work During Behaviour Recognition – The SCENIC Approach. In B. Gottfried, editor, 1st Workshop on Behaviour Monitoring and Interpretation (BMI’07), volume 296, pages 144–159. CEUR Workshop Proceedings, 2007.
[30] N. Van De Weghe, P. Bogaert, A. G. Cohn, M. Delafontaine, L. De Temmerman, T. Neutens, P. De Maeyer, and F. Witlox. How to Handle Incomplete Knowledge Concerning Moving Objects. In B. Gottfried, editor, 1st Workshop on Behaviour Monitoring and Interpretation (BMI’07), volume 296, pages 91–101. CEUR Workshop Proceedings, 2007.
[31] T. Wagner, T. Bogon, and C. Elfers. Incremental Generation of Abductive Explanations for Tactical Behavior. In B. Gottfried, editor, 1st Workshop on Behaviour Monitoring and Interpretation (BMI’07), volume 296, pages 117–128. CEUR Workshop Proceedings, 2007.
[32] M. Weiser. The Computer for the Twenty-First Century. Scientific American, 265:94–110, 1991.
[33] Z. Wood and A. Galton. Collectives and how they move: A tale of two classifications. In B. Gottfried and H. Aghajan, editors, 2nd Workshop on Behaviour Monitoring and Interpretation (BMI’08), volume 396, pages 57–71. CEUR Workshop Proceedings, 2008.
[34] Z. Wood and A. Galton. Collectives and their Movement Patterns. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Smart Environments. IOS Press, 2009.


Monitoring and Analysing Movement Behaviours


Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-048-3-11


Pedestrian Behaviour Monitoring: Methods and Experiences

Alexandra MILLONIG a,b,1, Norbert BRÄNDLE b, Markus RAY b, Dietmar BAUER b, and Stefan VAN DER SPEK c

a Department of Geoinformation and Cartography, Vienna University of Technology, Austria
b Dynamic Transportation Systems, arsenal research, Austria
c Faculty of Architecture, Urbanism and Building Sciences, Delft University of Technology, The Netherlands

Abstract. The investigation of pedestrian spatio-temporal behaviour is of particular interest in many different research fields. Disciplines like travel behaviour research, tourism research, the social sciences, artificial intelligence, geoinformation and many others have approached this subject from different perspectives. Depending on the particular research questions, various methods of data collection and analysis have been developed and applied in order to gain insight into specific aspects of human motion behaviour and the determinants influencing spatial activities. In this contribution, we provide a general overview of the most commonly used methods for monitoring and analysing human spatio-temporal behaviour. After discussing frequently used empirical methods of data collection and emphasising their advantages and limitations, we present seven case studies concerning the collection and analysis of human motion behaviour serving different purposes.


Keywords. Spatio-temporal behaviour, pedestrian monitoring, dataset generation, data analysis

1. Introduction

Modern societies are characterised by a clear tendency towards individualism and independence. Strategies for promoting independence and quality of life for people of every age are especially important in the field of mobility. Mobility allows people to perform essential functions, including engaging in social and recreational activities when desired and reaching business and social services when needed. Individuals who are restricted in physical mobility due to physical-neuromuscular handicaps, or due to limited or missing sensory perception, particularly need special support. Pedestrians without physical constraints can also benefit from navigational and environmental information services when walking through unfamiliar environments. In this respect, applied research has produced a number of emerging technologies and technological services, such as navigation aids implemented on mobile devices that respect individual needs, in order to support self-determined mobility for completing basic daily tasks without personal assistance.

Advances in this field depend strongly on broad knowledge about people's motion behaviour, the underlying decision processes and related influencing factors. Efficient assistance and technological services can only be developed on the basis of a comprehensive investigation of spatio-temporal behaviour and its underlying determinants. Researchers from different disciplines, e.g. sociology, tourism and travel behaviour research, artificial intelligence, or ubiquitous geotechnology and geoinformation, therefore apply various methods in order to examine, analyse and interpret pedestrian behaviour.

The topics of this chapter include several key terms and expressions that are defined as follows: a trajectory or a track is the path a moving object follows through space; Position Determination Technologies (PDT) comprise technologies providing the location of an object or a person by using a wireless device; video-based data collection is defined as the process of capturing visual data with one or multiple cameras and analysing the captured video content automatically with methods from the domain of computer vision; in the context of pedestrian monitoring, observation means watching and registering pedestrian spatial activities; and survey techniques are self-report instruments like questionnaires or interviews, where information concerning spatio-temporal behaviour is based on the participants' self-assessments of habits or preferences.

The chapter provides an overview of common methods for monitoring and analysing human spatio-temporal behaviour. It comprises two main sections. The first part focuses on dataset generation (Section 2): commonly used methods for data collection are presented and discussed with respect to their specific strengths and limitations.

1 Corresponding Author: Alexandra Millonig, Department of Geoinformation and Cartography, Vienna University of Technology, Gusshausstr. 20, 1040 Vienna, Austria; E-mail: [email protected].
The second part focuses on data analysis (Section 3): several case studies are presented, describing the methods used for analysing specific datasets. The chapter concludes with a comparison of the presented empirical methods with respect to several crucial criteria (e.g. positioning accuracy, covered region, main cost factor), providing a concise overview of the applicability of specific methods for different research foci on human spatio-temporal behaviour (Section 4).

2. Dataset Generation

This section focuses on empirical methods of motion data collection. The first examples present common methods of tracking individuals by means of mobile devices, starting with satellite-based localisation (GPS), which is described in Section 2.1. Section 2.2 describes the potential of positioning based on mobile phone cells. Data collection with Bluetooth is described in Section 2.3. The following section focuses on video-based data collection (Section 2.4). Observational research and survey techniques are described in Sections 2.5 and 2.6, respectively. At the end of this part of the chapter, a brief overview of additional data collection methods (laser scanning, sensor mats, RFID, WLAN) is provided in Section 2.7.

2.1. GPS Data Collection

Global Navigation Satellite System (GNSS) is the general term for satellite-based Position Determination Technologies (PDT). In the mid-1990s, GPS (Global Positioning System), controlled by the USA, became operative as the first worldwide available satellite-based positioning technology. GPS is based on 24 non-geostationary satellites which are geographically distributed in such a way that at any spot on earth one can receive signals from at least four satellites. The technology is widespread due to the low prices of commercial GPS devices and free system usage [37]. GPS devices receive the emitted satellite signals and calculate the current position by measuring signal propagation times. GPS receivers can be used for real-time positioning applications, providing three-dimensional locations together with accurate timing information. Other GNSS currently under development include GALILEO in Europe and GLONASS in Russia.


2.1.1. Advantages and Limitations of GPS Tracking

The accuracy of GPS can reach up to three meters, and positioning is feasible at high frequencies, e.g. every second. Standardised interfaces and communication protocols (serial communication interface and the NMEA-0183 format) allow easy storage of positioning data. Powerful GPS chipsets are increasingly integrated in low-cost mobile phones. Accurate positioning requires unobstructed satellite signals. Consequently, GPS is not suited for indoor environments. Furthermore, in urban regions GPS performance decreases considerably due to shadowing and multipath effects. Depending on the GPS device, the first position after starting the device is received after a few minutes. This so-called time-to-first-fix (TTFF) describes the maximum time required for determining the first position, given by cold, warm and hot starts [37,54]. Assisted GPS (A-GPS) is often used to reduce the TTFF by providing GPS chipsets with prior knowledge about the satellites' current positions. This service is mainly available for mobile phones, where the relevant information is transmitted over the telecommunication network. Note that A-GPS does not improve position accuracy. Accuracy can be improved by Differential GPS (D-GPS), which considers additional information about orbit errors provided by terrestrial communication networks. The accuracy of GNSS can also be improved by increasing the number of satellites, as planned for GALILEO [26]. New generations of GPS devices use improved chipsets with highly sensitive receivers such as MTK and SiRFstar III; these devices benefit in position quality, size, weight and power consumption.
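GPS loggers typically expose fixes as NMEA-0183 sentences over their serial interface. The following minimal Python sketch converts the coordinate fields of a $GPGGA sentence (latitude as ddmm.mmmm, longitude as dddmm.mmmm) into decimal degrees; it is an illustration only, the example sentence is fabricated, and a robust logger would additionally validate the checksum and the fix-quality field:

```python
def parse_gga(sentence):
    """Parse a NMEA-0183 $GPGGA sentence into (lat, lon) in decimal degrees.

    Minimal sketch: no checksum validation, no fix-quality handling.
    """
    fields = sentence.split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    # Latitude field is ddmm.mmmm, longitude field is dddmm.mmmm
    lat = float(fields[2][:2]) + float(fields[2][2:]) / 60.0
    if fields[3] == "S":
        lat = -lat
    lon = float(fields[4][:3]) + float(fields[4][3:]) / 60.0
    if fields[5] == "W":
        lon = -lon
    return lat, lon

# Fabricated example sentence (roughly the Vienna region):
lat, lon = parse_gga("$GPGGA,120000,4812.50,N,01622.50,E,1,08,0.9,170.0,M,46.0,M,,*47")
```

Dedicated NMEA parsing libraries exist for most platforms; this sketch only illustrates why the raw fields cannot be read directly as decimal degrees.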

2.1.2. Potential Fields of Application

GPS enables collecting long-term geo-referenced trajectory data of individuals equipped with a GPS device. GPS tracking can replace the post-hoc travel diary [37] and supports travel choice behaviour and activity pattern research. For example, Janssens et al. [22] use GPS in combination with a handheld device for collecting information about travel choice behaviour in Flanders, Belgium. Bohte et al. [8] used GPS devices for tracking families in three Dutch cities for one week. De Bois [12] equipped 15 families in three neighbourhoods in Almere with GPS units in order to evaluate spatial usage (see also Section 3.2). Shoval [46,45] used GPS to track tourists. Hovgesen [21] and Nielsen [33] use GPS to track pedestrians in parks in Aalborg and in schools in Copenhagen, respectively. Millonig and Gartner [30] employ a combination of shadowing (see Section 2.5) and GPS for investigating pedestrian motion behaviour. Van der Spek [55] used GPS devices for tracking visitors of three historic city centres in the Spatial Metro project (see Section 3.1). Apart from the fields of urban design, spatial planning and human geography, the collected data is also useful for the social sciences, simulation purposes and prediction models, such as Space Syntax [59].

2.2. Cell-based Positioning

Cell-based positioning is a PDT relying on mobile telecommunication technology, mainly the Global System for Mobile Communications (GSM) and the Universal Mobile Telecommunications System (UMTS). The limited localisation capabilities of GPS (see Section 2.1.1) have led to the idea of using GSM/UMTS as an alternative positioning technology for Location Based Services (LBS). A comprehensive study in Italy and the USA showed that the assumption that phones connect to the closest antenna held in only 57% of the experiments [53]. This makes localisation and tracking with mobile phones a non-trivial task. This section provides an overview of the network architecture (Section 2.2.1), describes passive and active data collection methods (Section 2.2.2), and discusses potential applications (Section 2.2.3).


2.2.1. Network Architecture and Location Techniques

Figure 1 (a) shows a realistic view of the GSM/UMTS cell network structure, especially for urban areas. Cell coverage of base transceiver stations (BTS) is simplified using ellipse models and sharp bounds, and is classified into three different types: (1) cells covering a large area with low connection capacity, supplying a basic level of service in case of technical outages; (2) strategically distributed cells covering the whole area, adjusted to the required capacity; and (3) additional cells covering shadowed regions (e.g. caused by high-rise buildings). This results in a complex network structure where one place may be covered by multiple cells. Cell sizes also vary strongly depending on the environment. In densely populated regions, cell coverage decreases to approx. 50 m (high network complexity); in rural areas, cell size may increase to 3–30 km (low network complexity). This correlation between cell size and population density usually allows good approximations, though exceptions exist. This raises the question of which cell is selected for a place covered by more than one cell. In many cases the strongest cell in the vicinity is selected, which in general need not be the one associated with the closest cell centre. Cell selection depends on many – partly still unknown – factors, including the Signal to Noise Ratio (SNR), network load balancing, the mobile phone's cell selection algorithm and previously selected cells. Figure 1 (b) shows the results of an investigation in the region of Vienna (Austria). Four separate places ('work', 'home 1', 'home 2' and 'weekend') with corresponding cell positions (Centre Of Cell-coverage) are shown. This result is based on a half-year continuous observation of one volunteer and confirms that the probability that one place is composed of multiple cells is higher in urban areas than in rural areas (for details see [38]).
The presented results are based on the Cell Of Origin (COO) positioning technique. All cells in the telecommunication network are identified by a unique ID (Location Area Code and Cell Identification Number). Geographic coordinates of the BTS or of the theoretical centres of cell coverage – depending on the network provider – are assigned to these unique cell IDs. COO is the most commonly provided and used localisation technique, because it does not require additional costly network equipment.
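In practice, a COO fix therefore reduces to a table lookup: the network reports a Location Area Code and cell ID, and the operator-supplied cell centre stands in for the user's position. The sketch below uses entirely hypothetical cell IDs and coordinates; the second function illustrates one naive way of smoothing the multiple-cells-per-place effect described above by averaging the centres of all cells observed at one place:

```python
# Hypothetical operator table: (Location Area Code, cell ID) -> Centre
# Of Cell-coverage in decimal degrees. Real tables come from the provider.
CELL_CENTRES = {
    (1030, 4711): (48.20, 16.37),  # urban cell, coverage approx. 50 m
    (1030, 4712): (48.21, 16.38),
    (7200, 120): (48.15, 16.60),   # rural cell, coverage several km
}

def coo_position(lac, cell_id):
    """Return the Centre Of Cell-coverage for a COO fix, or None if the
    cell is unknown. Positional accuracy equals the cell size."""
    return CELL_CENTRES.get((lac, cell_id))

def place_estimate(observed_cells):
    """Average the centres of all cells observed at one place."""
    pts = [CELL_CENTRES[c] for c in observed_cells if c in CELL_CENTRES]
    if not pts:
        return None
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))
```

The averaging step is only a plausible heuristic; the cell-selection factors listed above make the true relationship between observed cells and position considerably more complex.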


Figure 1. (a) Cell-network view: basic coverage of an area (1 BTS), strategic distribution in an area (2–3 BTS), and additional cells reducing shadowing effects (4–5 BTS). (b) (Multiple) cells dedicated to places in urban ('work'), suburban ('home 1', 'home 2') and rural ('weekend') areas of one person (see [38]).

Alternative techniques like Enhanced Observed Time Difference (E-OTD), Time Of Arrival (TOA) and Timing Advance (TA) are based on signal propagation delay measurements, which is challenging with respect to clock synchronisation. Angle Of Arrival (AOA) techniques use knowledge about the orientation of antennas in the mobile phone's vicinity to narrow down the possible positioning area. The major drawback of these alternative GSM/UMTS techniques is that they require costly additional network equipment on the provider side, limiting their availability in practice. An overview of alternative GSM/UMTS localisation techniques can be found in [62].


2.2.2. Data Collection Methods

The mobile telecommunication device receives information about the selected cell from the network. The unique cell identification number and the geographical position (if supported) can be retrieved either from the device (client-based) or from the cell network (server-based).

Client-based collection. Client-based data collection has to cope with the fact that cell information is mainly processed in the GSM/UMTS modem unit of a mobile device and is often not directly accessible. Due to the growing LBS market and the mandatory emergency services E911 and E112, mobile phones and cell networks increasingly provide interfaces to position information. The Java 2 Micro Edition location API JSR 179, for example, provides an abstract interface to location information, independent of the underlying positioning technology. Hence, a location request using JSR 179 automatically checks locally available positioning modules like GPS and GSM. The provided data – position and quality information – can be stored locally or transmitted via GPRS/UMTS.
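Whether collected on the device or retrieved from the network, the result is a time-stamped log of cell IDs. With a fixed sample rate, even simple aggregation over such a log approximates the share of time a person spends within each cell's coverage; a sketch over an entirely fabricated log (cell names and timestamps are invented):

```python
from collections import Counter

# Fabricated cell-ID log: (timestamp in s, cell ID), sampled every 5 s.
LOG = ([(t, "cell_home") for t in range(0, 100, 5)]
       + [(t, "cell_work") for t in range(100, 300, 5)]
       + [(t, "cell_home") for t in range(300, 400, 5)])

def dwell_shares(log):
    """Fraction of samples per cell: with a fixed sample rate this
    approximates the share of time spent in each cell's coverage."""
    counts = Counter(cell for _, cell in log)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}
```

Analyses like the half-year observation in [38] build on far richer processing of such logs, but the basic input has exactly this shape.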


Server-based collection. Cell network providers do not store all position logs of mobile phones [10]. However, many activities in the cell network, such as opening a data connection, phoning someone or changing the location area, implicitly cause a network location update. Location updates are stored by the network providers. Hence, location information is available, but its level of completeness depends on the users' activities. Mobile office systems like BlackBerry are permanently online; hence their location is updated at every cell change (due to uninterrupted handover processes). The market penetration of mobile online systems might increase in the future, and such location update log files could contain high-quality position information. Privacy issues are important in this respect, and may restrict collection of and access to such information. Alternatively, some network providers give access to a location interface. This interface allows performing an explicit location update and retrieving the position of a given mobile subscriber number, hence allowing people tracking. Due to privacy issues, the located person must agree to such a service and usually has the opportunity to switch off the service at any time. Each location query causes processing time within the network system, limiting the number of requests and the sample rate. In the course of the investigations described in [38], ten persons were observed using a location interface for six months with a sample rate of five minutes.

2.2.3. Potential Fields of Application

Cell-based positioning primarily benefits from indoor availability and the high market penetration of mobile phones. Taking into account the restricted position quality (at least in rural areas) and some network-based phenomena (e.g. multiple cells using COO), this technology provides a reliable framework for Location Based Services.

2.3. Data Collection with Bluetooth

Bluetooth is a short-distance radio communication technology used for data exchange in e.g. digital cameras, printers, handhelds, laptops and mobile phones. Bluetooth uses a broadcast signal for device identification which can be received by any Bluetooth device in the near environment (up to 100 meters). Each Bluetooth device carries a unique identifier (MAC address). This identification process can be used for (1) passive tracking and (2) active tracking. In 'passive tracking', a network of Bluetooth devices is distributed in a given environment. Each network device periodically scans for Bluetooth devices in the vicinity using the mentioned broadcast signal. Hence, other Bluetooth devices within the observation area can be traced by the network using timestamps, unique identification numbers and the related network nodes. As a precondition, passive tracking is only possible if Bluetooth visibility is activated on the tracked device. Passive tracking has been applied in city centres (e.g. Norwich), on campus areas (e.g. Koblenz-Landau) and in shopping malls [15]. In 'active tracking', the tracked device itself periodically scans for other Bluetooth devices in the vicinity. In this case, 'unintelligent' and low-cost Bluetooth devices (Bluetooth beacons) are distributed in a given environment. Hence, special software including pre-knowledge of the network infrastructure is required on the tracked device for tracing purposes, and the tracing information is stored locally on the device. An example of this type of research was carried out by Millonig and Gartner [30] in a shopping mall: programmed cell phones/handhelds with Bluetooth were distributed at the entrance, collecting the IDs of 50 Bluetooth beacons. In Aalborg Zoo, the same Bluetooth beacons were provided to children to determine their location in case they got lost (see [48]).
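The sighting logs produced by a passive-tracking network can be condensed into a per-device sequence of visited network nodes; a minimal sketch over a fabricated log (node names and MAC addresses are invented):

```python
from collections import defaultdict

# Fabricated scan log from a network of fixed Bluetooth nodes:
# (timestamp in s, node name, MAC address of the sighted device)
SIGHTINGS = [
    (10, "entrance", "AA:BB:CC:00:11:22"),
    (95, "atrium", "AA:BB:CC:00:11:22"),
    (97, "atrium", "DD:EE:FF:33:44:55"),
    (210, "exit", "AA:BB:CC:00:11:22"),
]

def node_sequences(sightings):
    """Group passive-tracking sightings into a time-ordered sequence of
    network nodes per device, collapsing repeated sightings at one node."""
    per_device = defaultdict(list)
    for _, node, mac in sorted(sightings):
        seq = per_device[mac]
        if not seq or seq[-1] != node:
            seq.append(node)
    return dict(per_device)
```

The spatial resolution of such a sequence is exactly the node spacing, which is the accuracy limitation discussed below.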


2.3.1. Advantages and Limitations of Bluetooth Tracking Approaches

Using Bluetooth for monitoring people's behaviour has two major limitations. Firstly, the number of people with activated Bluetooth is rather small (experience indicates a share of less than 5%), and therefore the covered sample often does not represent the target group. Secondly, the accuracy depends on the distribution density of the beacons. There are two main advantages of applying Bluetooth technology in tracking studies: Bluetooth is one of the technologies allowing position determination in indoor environments, and, concerning privacy, participants in Bluetooth tracking studies can fully control whether they can be tracked.

2.3.2. Potential Fields of Application

In general, Bluetooth tracking applications require a network composed of a number of Bluetooth devices. Hence, hardware and installation costs increase considerably depending on the required tracking accuracy and the covered infrastructure. For this reason, this technology is mainly applied on small to medium scales (e.g. buildings or blocks of buildings). In many cases, Bluetooth devices are deployed just at the entrances of infrastructures to gather information about the number of people within an infrastructure (carrying visible Bluetooth devices). In principle, however, Bluetooth can extend GPS with respect to indoor tracking; here, customised software with pre-knowledge of the observed environment is required.

2.4. Video-based Data Collection

Research in the field of automatic video surveillance with computer vision has matured considerably over the past 10 years, with technical publications reporting significant progress. At the same time, the abilities of computer vision are often hyped and exaggerated by industry and the media; benefits are glamorised and dangers dramatised in movies and politics [17]. Digital video footage is composed of temporal sequences of two-dimensional pixel arrays having a fixed number of rows and columns, called video frames. There are three key steps in the automated analysis of digital video data.


1. Detection of interesting objects in individual video frames. The objective of this step is to identify pixel sets belonging to object classes like pedestrians and cars. Large variations in human pose and clothing, as well as varying backgrounds and environmental conditions (lighting conditions, shadows), make the problem of pedestrian detection particularly challenging from a computer vision perspective. Many interesting pedestrian classification approaches have been proposed in the literature; an overview is given in [32]. An overview of human pose estimation can be found in [36].

2. Tracking of objects. The objective of this step is to associate detected objects between video frames (and between multiple cameras) in order to obtain trajectories. Difficulties in tracking can arise due to abrupt object motion, changing appearance patterns, occlusions (people-to-objects, people-to-people, people-to-scene) and camera motion. The survey in [61] categorises and provides detailed descriptions of tracking methods and examines their advantages and disadvantages.

3. Analysis of people tracks to recognise their behaviour. The objective of this step is to automatically extract semantics out of spatio-temporal trajectory data. Trajectory dynamics analysis provides a medium between tracking and high-level analysis. Typical motion, for example, is repetitive, while interesting events occur rarely. This repetition enables event analysis in the context of learned motion. The survey in [31] provides an overview of activity analysis in surveillance video based on object tracking. Note that many trajectory-based analysis methods could also be applied to GPS or related data.

2.4.1. Advantages and Limitations of Video-based Data Collection

Video-based data collection enables obtaining pedestrian trajectories with high spatial and temporal resolution, both in outdoor and indoor environments covered by cameras.
The biggest advantage of sensed visual data is that the raw data contains richer information than mere spatio-temporal coordinates. Automatically extracting this information enables monitoring of the environment as well as everything within the scene, such as pedestrians, animals and vehicles. Such high-level analysis remains, however, an extremely challenging problem: the most complex behaviours can only be understood in the correct context, and it is difficult to imagine general procedures capable of working over a wide range of scenarios [31].

The performance of automatic monitoring is heavily influenced by the imaging setup. For example, one way to avoid severe occlusions and the resulting problems with pedestrian detection and tracking is to capture the scene from a bird's-eye view. Most of today's commercially available video-based pedestrian counting sensors rely on such configurations. A top-view configuration, however, severely limits the area which can be captured by a camera. Wide-angle or fish-eye lenses can increase the covered area, but increasing the viewing angle again introduces occlusions [29].

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

A. Millonig et al. / Pedestrian Behaviour Monitoring: Methods and Experiences



Figure 2. Transformation between video frame and infrastructure ground plane

An important issue is whether the cameras are calibrated, i.e. whether the transformation between coordinates in the three-dimensional world and the pixel coordinates of the video frames has been identified. Combining calibration information with a site map or site model (even a simple ground-plane model) enables the system to use the absolute size and speed of the detected objects. Figure 2 illustrates the transformation of pixel coordinates of a video frame (upper right part) into two-dimensional metric coordinates of a ground plane (lower left part). While such a plane-to-plane calibration technique enables synthetic top views of an environment, one must keep in mind that the accuracy of world coordinates decreases with increasing distance of world points from the camera. This is illustrated in Figure 2 by the increasingly blurred regions towards the left-hand side of the transformed image.

Many built environments have an architecture which does not necessarily allow full video coverage: too many cameras might be required to cover an entire building. Additionally, ceilings are often low (for example in subway environments), leading to shallow viewing angles and the related severe mutual occlusions. A large number of cameras might demand an impractical hardware setup. Smart cameras combine video sensing, processing and communication in a single embedded device and can therefore avoid video transmission and dedicated hardware resources for video processing. An overview of distributed smart cameras can be found in [40].

The majority of existing automatic video surveillance techniques can only claim robustness and reliability for limited scenarios – with limited sensor networks, limited video footage, limited fault tolerance and small variability of scenes [39]. Furthermore, it is widely agreed that the object-based trajectory approach only works up to a certain complexity and density of people.
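Assuming a plane-to-plane homography H has already been estimated (e.g. from four or more known point correspondences between the image and a site plan), mapping a pixel to metric ground coordinates is a matrix multiplication followed by perspective division. A minimal sketch in pure Python, with H hypothetical:

```python
def apply_homography(H, u, v):
    """Map pixel coordinates (u, v) to ground-plane coordinates (x, y)
    using a 3x3 plane-to-plane homography H (row-major nested lists)."""
    xs = H[0][0] * u + H[0][1] * v + H[0][2]
    ys = H[1][0] * u + H[1][1] * v + H[1][2]
    w  = H[2][0] * u + H[2][1] * v + H[2][2]
    return xs / w, ys / w  # perspective division
```

The division by w also hints at why accuracy degrades with distance from the camera: for far-away ground points, small pixel errors translate into large metric errors after the perspective division.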
For crowded scenes with many individuals, the mutual occlusions become so severe that no current tracking algorithm can handle them effectively, even with a multi-camera approach. The sub-domain of crowd analysis deals with information extraction without relying on individual object tracking [63].

2.4.2. Fields of Application

Video-based pedestrian tracking currently works best for scenarios with isolated individuals and loose groups of people; see [43] for a real system and an overview of other systems. Good tracking quality can be expected if a steep camera angle can be assured. Outdoor applications will clearly only work when the environmental conditions allow good imaging quality – darkness or fog, for example, will likely not produce interpretable video content.


2.5. Observation-based Data Collection

One of the most conventional ways to explore human motion behaviour is to observe human activities and the physical settings in which such activities take place. Observation techniques belong to the most fundamental methods in the social and behavioural sciences and have a long tradition. They focus on the investigation and interpretation of visible motion behaviour. Usually they are of an explorative or descriptive character and hence commonly used in research fields aiming at the generation of hypotheses rather than testing or confirming a hypothesis. First attempts to investigate human spatio-temporal behaviour therefore mainly employed observation methods (sometimes in combination with questionnaires or interviews, see Section 2.6) in order to describe existing phenomena and look for patterns and potential hypotheses. Observations of pedestrian motion behaviour, also known as behavioural mapping or "tracking", have primarily been used for collecting data concerning the movements of visitors at museums and exhibitions [57,60,5]. Subsequent research has focused on several different fields of interest, e.g. routes and activities of pedestrians in urban environments [20] or tourism research [19,25].

2.5.1. Types of Observations


Observing human motion behaviour involves following the subject at a distance and recording the movements by drawing a line corresponding to the subject's activities on a map of the investigation area. This can be done in a very simple way by just using a paper map and a pencil. Recently, technology-enhanced approaches have also been applied, e.g. by using a digital map (see Section 3.3). There are several types of observation techniques which can be applied:

• Direct (Reactive) Observation: The researchers identify themselves as researchers and explain the purpose of their observations.
• Unobtrusive Observation: The researchers do not identify themselves. Either they mix in with the subjects undetected, or they observe from a distance.
• Participatory Observation: The researcher participates in what is being observed so as to get a finer appreciation of the phenomena.

Which type of observation technique to use is mainly determined by the purpose of the study. Each technique bears specific advantages and drawbacks that strongly influence the quality of the collected data. For further information about observational research and related studies see [13,1].

2.5.2. Advantages and Limitations in Observational Research

Observations are frequently used in explorative and descriptive research, as they are usually flexible and do not necessarily need to be structured around a hypothesis. Another positive aspect is that observational research findings are considered to be strong in validity, because the observer is able to collect extensive information about a particular (though mainly just visible) behaviour. However, in terms of reliability and generalisability there are some negative effects. Replicating or generalising observed behaviour patterns may not be easily achieved, especially when behaviour patterns have been observed in the "natural" loci of activities rather than in laboratories under controlled conditions. Moreover, the sample of observed individuals may not be representative of the population, or the observed behaviours may not be representative of the individual.

Unobtrusive observations are useful if a particular field of interest has not yet been studied in depth. Another advantage of observing individuals without their knowledge is that so-called "observer effects" can be avoided. It has been shown that people who know that they are participating in a study tend to adapt their behaviour – consciously or subconsciously – to what they expect to be socially desired behaviour [34]. Hence, unobtrusive observation techniques are the only way to gain insight into "natural" behaviour patterns. A major drawback in this respect, however, is ethical concern, as people cannot freely choose to participate in such a study. In non-disguised or participatory studies with subjects who agree to take part, researchers have to be aware that observer effects may occur, and especially in participatory studies researchers may lose their objectivity.

2.5.3. Fields of Application

As stated above, observation techniques are mainly useful in explorative and descriptive research. They are particularly helpful for identifying behavioural patterns and determining basic research hypotheses. However, comprehensive insight into human motion behaviour and underlying cause-and-effect relations cannot be achieved by observation techniques alone. Therefore, a combination of observations with other methods (e.g. interviews) is often beneficial. Section 3.3 gives an example of such a combination of several complementary methods for analysing pedestrian behaviour styles.


2.6. Data Collection based on Survey Techniques

Along with observation methods (see Section 2.5), interviews and questionnaires belong to the first methods of data collection applied in human spatio-temporal behaviour research [20,25]. Regarding the exploration of the factors influencing human route decision processes and related preferences, survey studies still represent one of the most important data collection techniques in transportation studies. They are relatively cheap and allow the collection and analysis of data taken from comparatively large samples. Inquiries are commonly used to gather information concerning route choice preferences, individual habits, motives, and intentions [7]. However, as spatio-temporal behaviour is largely based on subliminal decisions, responses may be incorrect and constructed ex post. Human behaviour is never fully determined by verbalised structures [34], and people tend to modify their behaviour when they know they are being watched: they portray their "ideal self" rather than their true self [14], so the accuracy of the results gathered from questionnaires may suffer. Consequently, studies relying solely on questionnaire data will have to accept a certain degree of inaccuracy [20].

2.6.1. Common Survey Techniques

Questioning participants about their habits and preferences can be done in various ways. Questionnaires, interviews and trip diaries are among the most commonly used survey techniques in spatio-temporal behaviour research. In the following, a rough overview of these techniques and their advantages and limitations is provided.


Questionnaires

The use of questionnaires is very popular, as they can be easily distributed (among pedestrians, via mail or the web) and therefore allow the collection and analysis of data taken from comparatively large samples at low cost. Standardised questionnaires are self-report instruments which provide a written text comprising the exact wording of the questions and listing the possible answers. This has the positive effect that standardised questionnaires provide valid and reliable data which can then be analysed and generalised (depending on the chosen sample). However, standardisation also brings two disadvantages: firstly, questions that are unclear to the participant may be answered incorrectly, and secondly, as participants are forced to choose among sets of prefabricated answers, the provided options may be incomplete and certain explanations may be missing. Furthermore, the provided set of answers can subconsciously influence the participants' responses.

Interviews

Personal interviews (either face-to-face or via telephone) can be structured in different ways. In a structured type of interview, only the questions are standardised; the answers are expressed freely. A non-structured interview technique is applied when neither questions nor answers are standardised and the interviewer just follows a predefined field manual. The analysis of data collected in personal interviews is more complex than for data collected by standardised questionnaires, as answers have to be categorised. However, this technique is especially useful if a survey aims at including persons who belong to so-called "hard-to-reach" groups, such as individuals who usually refuse to participate, do not understand all the questions, or cannot fit into categories of answers designed for the average citizen [11].


Trip Diaries

Another frequently used method is the time-space budget technique, including recall diaries and self-administered diaries [52,46]. Recall diaries and interviews depend strongly on the participant's memory, which results in a lower degree of accuracy. Self-administered diaries are written in real time and can therefore provide very detailed information. However, they demand considerable effort on the part of the subjects; consequently, only few people are willing to participate in these kinds of studies, and significant variation in the quality of the information must be expected.

2.6.2. Fields of Application

Due to the limitations mentioned above, exclusively employing survey techniques may not be sufficient for the detection of human spatio-temporal behaviour patterns. However, as questionnaires and interviews offer the only chance to reveal relevant decision processes, certain research questions require the use of these methods for understanding the factors influencing spatio-temporal behaviour. A combination of different data collection methods including survey techniques is often recommended. Further details concerning such techniques, sampling, and interview types (e.g. structured/semi-structured/unstructured) can be found in [1,11,13].


2.7. Other Methods

In addition to the data collection methods described above, a number of alternative options for pedestrian tracking have been explored in the past. These measurement techniques are still in the early stages of their application to pedestrian tracking and have not yet matured. We therefore provide only a brief overview hinting at their potential and their disadvantages. The discussion below focuses on two alternatives for pedestrian tracking mostly in indoor settings (laser range scanners and sensor mats), and two alternatives for localising pedestrians (RFID tags and WLAN).

A laser range scanner measures the location of surrounding objects by inferring the distance of the closest object in a high-resolution angular grid from the travel time of the reflected laser beam. This results in a point cloud in the local coordinate system of the scanner. Using a number of laser range scanners simultaneously and converting the raw measurements into a joint coordinate system yields a combined point cloud. Laser range scanners are typically able to perform a 360° scan rapidly (e.g. with a frequency of 10 Hertz). Figure 3(a) shows an experimental environment using laser range scanners, the PAMELA walking platform in London.² In 2007 this platform was equipped with two laser range scanners located on two opposing corners of the platform, approximately at hip height. Figure 3(b) shows a snapshot of one raw data set in which the handrail is clearly visible, as well as a number of pedestrians walking on the platform.

Figure 3. (a) The platform on the PAMELA facility in London. (b) Snapshot of the raw data of one walking experiment on the platform (units: meters). Each dot represents one raw measured obstacle location.

The second step is to detect objects in the raw data. Subsequently, the objects can be tracked over time, for example with Hidden Markov Model tracking, as described in [3]. Laser range scanners can provide reliable pedestrian trajectories with a precision of a few centimetres ([3] find a mean deviation from the true path of approx. 3 cm). However, they share the same occlusion problems as video-based methods. Hence, tracking a large number of persons requires a large number of laser range scanners, which is costly. In order to reduce occlusion problems, laser range scanners have been mounted close to the floor to detect legs. This incurs the problem of matching pedestrians with their corresponding pairs of legs. Nevertheless, a number of publications report the successful application of the technology also in dense situations, see e.g. [64,44].

² We thank Kei Kitazawa from University College London for making the datasets available to us.
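Converting a raw scan into the joint coordinate system amounts to a polar-to-Cartesian transformation plus the scanner's (calibrated) pose; merging several scanners is then just concatenating their point lists. A sketch under these assumptions, with a made-up pose format:

```python
import math

def scan_to_points(ranges, angle_start, angle_step, pose):
    """Convert one laser scan to points in a joint world frame.

    ranges:      list of measured distances (metres), one per beam
    angle_start: angle of the first beam in the scanner frame (radians)
    angle_step:  angular resolution between beams (radians)
    pose:        (x0, y0, theta), the scanner's calibrated position and
                 heading in the joint frame
    """
    x0, y0, theta = pose
    pts = []
    for i, r in enumerate(ranges):
        a = theta + angle_start + i * angle_step  # beam angle in world frame
        pts.append((x0 + r * math.cos(a), y0 + r * math.sin(a)))
    return pts
```

With two scanners, the combined point cloud is simply `scan_to_points(r1, ...) + scan_to_points(r2, ...)`, each with its own pose.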


Sensor mats are a promising alternative to computer-vision-based methods and laser range scanners, as they avoid occlusion problems. The two main problems in this respect are the detection of footprints on the mat (which is related to the spatial resolution of the sensors in the mat) and the tracking (complicated by the fact that the footprint disappears during each step). First results and some references can be found in [51,50]. No large-scale applications are known.

Laser scanning and sensor mats do not require the tracked subject to carry measurement devices, and they deliver tracking information. If subjects can be equipped with devices, or existing devices are used for localisation purposes, additional ways to localise pedestrians become possible. Radio frequency identification (RFID) tags have been used for this purpose. RFID tags are popular in the area of logistics for identifying objects. Two different approaches can be distinguished: active tags require a power supply, which increases the costs significantly, while passive tags obtain their power from signals sent by the RFID reader, which poses strong restrictions on the capabilities of the tags and limits the distance from which they can be read. On the other hand, passive tags do not require special maintenance.

The signal strengths of one active RFID tag received at a number of RFID readers (at least three are necessary) are compared to tabulated 'fingerprints' in order to estimate the position of the tag (see e.g. [41]). To the best of our knowledge, the accuracy achievable with such a system in real-world applications is insufficiently known, although with very dense RFID-reader distributions an accuracy of a few metres is achievable [23]. A passive tag, on the other hand, is registered if it is sufficiently close to the card reader (typically within a few centimetres). This allows pedestrians to be tracked in the vicinity of the readers, but not in between.
Besides access controls commonly used at offices, museums and the like, where typically the timing is the main point of interest, tracking information is already used in large-scale real-life applications (see e.g. [49], which logs the ski lifts used by a person over the course of a day, including the timing and the elevation skied). Publications on RFID technology are legion; recent introductory surveys (not limited to RFID) can be found in [28,16].

Finally, an emerging technology which poses similar challenges as RFID and Bluetooth (see Section 2.3) is wireless local area network (WLAN); see e.g. [56] for a comparison of RFID, Bluetooth and WLAN technology. The same principles are used for WLAN localisation: either only the specific reader closest to the WLAN device records a position, or fingerprints are compared to signal strength patterns in order to estimate the location of the WLAN device. This technology is promising, since an ever increasing number of mobile devices (such as computers and smart phones) are equipped with WLAN capabilities.
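The fingerprinting idea shared by RFID and WLAN localisation can be sketched as a nearest-neighbour search over tabulated signal-strength patterns. The reader IDs, locations and dBm values below are made up for illustration; real systems interpolate between fingerprints and smooth estimates over time.

```python
def locate(observed, fingerprints, floor=-100):
    """Nearest-neighbour fingerprint localisation.

    observed:     dict reader_id -> received signal strength (dBm)
    fingerprints: dict location -> dict reader_id -> tabulated strength
    floor:        strength assumed for readers that heard nothing
    Returns the location whose stored pattern is closest (least squares).
    """
    def err(pattern):
        # Squared deviation between the live reading and one fingerprint.
        return sum((observed.get(r, floor) - s) ** 2 for r, s in pattern.items())
    return min(fingerprints, key=lambda loc: err(fingerprints[loc]))
```

The coarse spatial resolution of such systems follows directly from this scheme: the estimate can never be finer than the grid at which fingerprints were tabulated.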

3. Data Analysis/Experiences

This part introduces selected case studies in order to illustrate applicable analysis methods for specific datasets. Section 3.1 describes a project focussing on using GPS for mapping pedestrian activities in historic city centres. Section 3.2 presents the results of tracking families in a suburban residential area. Section 3.3 describes a multi-method approach for categorising pedestrian spatio-temporal behaviour in shopping environments. Section 3.4 describes a method to extract habits from motion histories based on GPS measurements. Section 3.5 presents an approach for identifying prominent places using cell-based positioning. Section 3.6 describes the development of a location-aware pilot system for city tourists employing camera phones and GPS positioning. Section 3.7 presents a study using vision-based analysis to find stopping pedestrians in a railway station.


3.1. Spatial Metro: GPS Tracks of Pedestrians in Historic City Centres

Urban development requires careful reconciliation of different demands, including protecting historical and cultural heritage, creating attractive retail areas and future retail strategies, and sustaining and improving the quality of life in the central area. The Spatial Metro project aimed at investigating public space and subsequently improving city centres for pedestrians in Norwich (UK), Rouen (France) and Koblenz (Germany), cities with approximately 100,000 inhabitants and a historical centre. TU Delft has developed a method using GPS for monitoring pedestrian movement to measure the effects of city investments such as city beautification, street furniture, lighting and information systems [55].

In each of the three cities, GPS tracking devices were distributed to volunteers at two parking facilities for one week. GPS positions were stored at an interval of five seconds from 10am to 6pm. After the observation phase, the volunteers filled in questionnaires; only trip-related and general non-sensitive demographic information was asked for. In total, 1300 pedestrians were tracked and interviewed. On average only 60% of the spatio-temporal files were valid, due to issues with GPS performance in urban areas, GPS fix times, power consumption, and outliers: GPS performance decreases vastly in urban areas (see Section 2.1.1), people tend to enter buildings where GPS does not deliver position information, and high time-to-first-fix rates also result in localisation gaps [37].

The collected data has been preprocessed with respect to cleaning and validation. Subsequently, the valid data has been visualised in a Geographic Information System (GIS) by (1) drawing the collected tracks on different GIS layers (e.g. on a map layer, see Figure 4(a)) and (2) plotting a density analysis of the questionnaire data. The first type of visualisation combines spatio-temporal data with e.g. aerial images, access routes and arrival points, commercial activities, points of interest and investments. This enables analysing spatial conditions in relation to the actual behaviour of the tracked samples.

Figure 4. GPS tracking results. (a) Norwich: pedestrians tracked from two access points for one week. (b) Koblenz: layering of commercial activities and trajectories (all track points at 5 seconds frequency).

The second type of visualisation delivers a set of specific spatial patterns based on origin, familiarity, purpose and duration, as well as age, gender and group type. Hence spatial patterns for specific groups of participants could be compared. Both ways of visualisation offer tremendous insight into pedestrian behaviour, leading to conclusions and opportunities for application in practice (see also Figures 4(b) and 5).

Figure 5. GPS tracking results in Rouen: pedestrians tracked from Haute Vieille Tour (3D density analysis).
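One simple form of the cleaning-and-validation preprocessing mentioned above is to discard fixes that imply physically implausible walking speeds. The sketch below assumes metric coordinates and a maximum-speed threshold that is illustrative, not a value from the study:

```python
import math

def drop_speed_outliers(fixes, vmax=3.0):
    """Remove GPS fixes implying implausible pedestrian speeds.

    fixes: chronological list of (t, x, y), time in seconds, metric coords.
    A fix is dropped if the speed from the last kept fix exceeds vmax (m/s).
    """
    kept = [fixes[0]]
    for t, x, y in fixes[1:]:
        t0, x0, y0 = kept[-1]
        dt = t - t0
        # Keep the fix only if reaching it from the last kept fix is plausible.
        if dt > 0 and math.hypot(x - x0, y - y0) / dt <= vmax:
            kept.append((t, x, y))
    return kept
```

Comparing against the last *kept* fix (rather than the immediate predecessor) prevents a single multipath jump from invalidating all subsequent fixes.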


3.2. Tracking Families in Almere

Almere is a polynuclear new town in the "Flevopolder" composed of many suburbs. Over the last decade, TU Delft has developed and applied several methods to analyse the available network between the city and the related suburbs. GPS-based behaviour analysis offers new perspectives for contradicting or confirming the developed theory of city planning for that region.

Within three suburbs, families with one or two children (aged between 16 and 18) were approached by the municipality to participate in the investigation. Fifteen families agreed, leading to approx. 50 participants. Questionnaires were filled in and instructions were given at the participants' homes. Each participant was provided with one GPS tracking device and a battery charging unit for one week. The spatio-temporal data has been collected at an interval of 2 seconds, and several hundred trips have been collected [12]. After the observation phase, the volunteers were interviewed again.

The tracks have been preprocessed (e.g. outlier removal) and split into trips based on activities (e.g. stays or entering a building). Tracks that were incomplete due to low GPS performance in urban areas (see Sections 2.1.1 and 3.1) have been removed. Subsequently, interrelations between trips and questionnaires have been sought for advanced behaviour analysis. A GIS tool was used to visualise these query-based results (see Figure 6). Either individual or combined activities of volunteers have been analysed. The data provided information about network usage by different modes of transportation and the areas of activities.
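Splitting a track into trips at stays can be sketched as a dwell-time test: whenever the track lingers within a small radius for long enough, the current trip is cut. The radius and dwell thresholds below are illustrative, not the project's actual parameters:

```python
import math

def split_into_trips(fixes, stop_radius=25.0, stop_time=300):
    """Split a GPS track into trips at stops.

    fixes: chronological list of (t, x, y); a stop is detected when the
    track stays within stop_radius metres of an anchor fix for at least
    stop_time seconds.  Returns a list of trips (each a list of fixes).
    """
    trips, current = [], [fixes[0]]
    anchor = fixes[0]
    for fix in fixes[1:]:
        t, x, y = fix
        ta, xa, ya = anchor
        if math.hypot(x - xa, y - ya) > stop_radius:
            anchor = fix           # still moving: slide the anchor forward
        elif t - ta >= stop_time:  # dwelling near the anchor: cut a trip
            if len(current) > 1:
                trips.append(current)
            current, anchor = [], fix
        current.append(fix)
    if len(current) > 1:
        trips.append(current)
    return trips
```

During a long stay the cut repeats, but the resulting empty or single-fix fragments are discarded, so only genuine movement phases survive as trips.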



Figure 6. GPS tracking results in Almere-Haven: mapping of activity patterns of 13 males (a) and 13 females (b) for one week, resulting in a map based on the actual use of the urban tissue.

3.3. Multi-method Approach to the Interpretation of Pedestrian Motion Behaviour

This subsection describes the currently ongoing project UCPNavi, which aims at the classification of pedestrian walking behaviour and related influence factors. In this study we apply a multi-method approach comprising several complementary data collection techniques. For more details about the project, methodology and preliminary outcomes see [30].

3.3.1. Methodology

The selection of appropriate methods had to fulfil several conditions. Firstly, we aimed at collecting data of sufficient quality and accuracy in larger environments (indoor and outdoor). Secondly, as it is assumed that people might change their behaviour if they know that they are being observed, an unobtrusive form of monitoring was to be included. Thirdly, visible behaviour patterns were to be combined with interview data in order to allow the identification of relevant underlying intentions, preferences, and lifestyle-related factors. The following three methods have been chosen in order to achieve an optimal combination of empirical data collection techniques under these preconditions:


1. Unobtrusive Observation (Shadowing): Observation of the "natural", uninfluenced spatio-temporal behaviour of pedestrians; only visible behaviour, no insight into intentions and motives.
2. Non-disguised Observation (Tracking): Continuous observation over a long period in combination with standardised interviews (data from both the structural and the agent-centred perspective); observer effects possible.
3. Inquiry (Interviews): Motivations, self-assessments of individual motion patterns; responses can be incorrect and constructed ex post.

The execution of unobtrusive and non-disguised forms of observation was not feasible in parallel. Hence, a two-step approach was designed, which also offers the chance of using preliminary results to improve the methods and the selected features of investigation in the second empirical phase. The unobtrusive observation method applied in the first phase of the study was used to collect anonymous data on people walking in public areas who did not know that they were being observed. The process consisted of randomly selecting an unaccompanied walking person and following the individual as long as possible while mapping her path on a digital map. In total, trajectories of 111 individuals with a balanced gender and age ratio have been collected (57 observations on a shopping street, 54 in a shopping mall).

The collected datasets have been analysed according to the velocity computed between each marked point in the observed path; additionally, the locations and durations of stops within each trajectory have been detected. Subsequently, speed histograms of each trajectory have been compiled, showing the proportional amount of time an individual walked at a velocity within a specific speed interval. Figure 7 shows diagrams consisting of all histograms compiled from the indoor and outdoor observations (speed intervals: 0.1 m/s steps, 30 intervals).
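The per-trajectory speed histograms can be sketched as follows (pure Python; the bin layout mirrors the 0.1 m/s steps and 30 intervals used in the study, while the fix format is an assumption). Clustering the resulting vectors would then be done with standard hierarchical or k-means routines:

```python
import math

def speed_histogram(track, bin_width=0.1, n_bins=30):
    """Proportion of observed time spent in each speed interval.

    track: chronological list of (t, x, y) fixes with time in seconds and
    metric coordinates.  Segments faster than the range go to the last bin.
    """
    hist = [0.0] * n_bins
    total = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        v = math.hypot(x1 - x0, y1 - y0) / dt
        hist[min(int(v / bin_width), n_bins - 1)] += dt  # weight by duration
        total += dt
    # Normalise to proportions of total observed time.
    return [h / total for h in hist] if total else hist
```

Each trajectory thus becomes a fixed-length 30-dimensional vector, which is exactly the representation the clustering step operates on.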
In order to compile initial classes of behaviour, the histograms of each investigation area have subsequently been classified using clustering algorithms (hierarchical clustering and k-means algorithm). For each class, characteristic attributes have been identified to describe preliminary types of spatio-temporal behaviour. 3.3.2. Preliminary Results The classification process resulted in three homogeneous clusters for the indoor datasets (containing 10, 14, and 30 individual observations per cluster). The outdoor dataset analyses produced eight clusters, with a vast majority (86%) of observations belonging to the first four classes. As an example the results of the indoor analysis are now explained in more detail. In total datasets of 54 observations have been classified. The three clusters of motion behaviour can be interpreted as “swift shoppers”, “convenient shoppers”, and “passionate shoppers” (see Figure 8). Four of the eight clusters resulting from the analysis of outdoor datasets solely comprise a number of one to three subjects. Among the other four clusters, all three indoor

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

A. Millonig et al. / Pedestrian Behaviour Monitoring: Methods and Experiences
Figure 7. Histograms of all indoor (left) and outdoor (right) observations. Rows present individual observations, columns present speed intervals of 0.1 m/s ranging from 0 to 3 m/s. Higher intensities represent higher histogram bin values.

clusters can be identified to a certain extent. Additionally, a cluster of specific behaviour patterns was identified: these "discerning shoppers" were mainly female, walked at comparatively high speed, stopped rather often but briefly, and showed (unlike individuals belonging to the other clusters) a slight tendency towards specialised and exclusive shops.

The indoor clusters can be summarised as follows (gender and age distributions are shown graphically in Figure 8):

Cluster                     Swift Shoppers   Convenient Shoppers   Passionate Shoppers
No. of subjects             10               14                    30
Average age                 25-30            30-35                 35-40
Average speed               1.19 m/s         0.61 m/s              0.24 m/s
Av. no. of stops (max.)     0.3 (3)          1.36 (2)              3.57 (13)
Av. duration of stops       6.97 s           2.58 min              4.66 min
Fashion style               casual or        casual or             casual/trendy
                            convenient       conservative          or elegant
Visited shops/facilities    food store       no main focus         fashion, specialities,
                                                                   drugstore, bookshop

Figure 8. Behaviour clusters in the indoor environment.
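The clustering that produced these behaviour classes (k-means over the per-trajectory speed histograms, as described in Section 3.3.1) can be sketched as follows. The two-bin toy histograms, the fixed initial centres and the iteration count are illustrative assumptions, not the study's actual settings.

```python
def kmeans(vectors, centres, iters=20):
    """Plain k-means: assign each vector to its nearest centre, then move
    each centre to the mean of its group; repeat for a fixed number of
    iterations. Returns the final centres and one label per vector."""
    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    for _ in range(iters):
        groups = [[] for _ in centres]
        for v in vectors:
            nearest = min(range(len(centres)), key=lambda k: sqdist(v, centres[k]))
            groups[nearest].append(v)
        centres = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centres)
        ]
    labels = [min(range(len(centres)), key=lambda k: sqdist(v, centres[k]))
              for v in vectors]
    return centres, labels
```

The hierarchical clustering mentioned in the text differs only in how groups are formed; the histogram vectors serve as the same input in both cases.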

3.3.3. Conclusion

Preliminary results of the first empirical phase indicate that a number of homogeneous behaviour patterns can be observed, especially in consistent context situations. Currently ongoing investigations using a non-disguised form of observation combined with detailed interviews include and test the basic findings of the first analysis. The combination of several complementary empirical techniques is a very promising approach to gain
comprehensive insight into human spatio-temporal behaviour patterns, even though some limitations have to be accepted.

3.4. Extracting Habits from Motion History Based on GPS Measurements

This section describes the experiences from the analysis of a long-term motion history measured using GPS technology. The emphasis here lies on presenting the main experiences, and in particular the pitfalls, in the analysis phase; full details can be found in [4]. The aim of the study was to investigate the possibilities of extracting commuting habits from the motion history of one individual alone, without any prior knowledge. To this end, a commuter was equipped with a GPS tracking device which was operated on the commute to and from work on a total of 27 days within a timespan of approximately seven weeks in 2006. GPS tracking was not always successful due to sensor failures of various causes, including the failure to turn on the tracking device (the quality of GPS sensors has increased in the meantime, making other causes of sensor failure less of an issue). The GPS tracker recorded a (location, timestamp) pair every 2 seconds, providing a detailed motion history. In total, 70,000 (location, timestamp) pairs have been collected. The intention of the data analysis is to obtain knowledge on the commuting habits, covering information on:


• Points of interest (POIs), somewhat loosely defined as points which are visited frequently and where the commuter either pauses or changes mode of transport.
• The routes most frequently used.
• The mode choice on these routes.
• Temporal information, such as the probability of route and mode choice for any given day of the week, the distribution of the starting times of trips, and the distribution of travel times.

Corresponding to these goals, the analysis can be partitioned into a number of phases, which are detailed as follows.

Data preprocessing aims at increasing data quality. This is done by the detection of outlying observations and signal losses, as well as the application of smoothing operations to reduce the noise level. While this operation is performed as a first step in the data analysis, data quality can also be increased at later stages: after mode detection has been performed, the knowledge of the transportation mode can be used to apply more efficient outlier detection and noise reduction methods. To give just one example, unreliable observations on walked routes can be detected and omitted by monitoring the travel speed implied by the measurements. Inaccurate measurements tend to imply improbably high speeds, a phenomenon frequently encountered in urban canyons, where low signal quality leads to the measurements slowly drifting away from the true locations, which is corrected rapidly once strong signals are subsequently received. For more details on GPS reliability in weak signal environments see [27] and [24]. The preprocessing results in a number of trajectories of uninterrupted measurements; that is, each trajectory is a sequence of (location, timestamp) pairs such that the distance between two consecutive timestamps is less than a prespecified margin. This set of trajectories is subject to stop detection according to the algorithm proposed in [18].
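The segmentation into uninterrupted trajectories and the speed-based removal of unreliable fixes described above can be sketched as follows. The 10-second gap margin and the 3 m/s walking-speed cap are assumed values chosen only for illustration, not the study's actual parameters.

```python
from math import hypot

def split_trajectories(fixes, max_gap=10.0):
    """Split a list of (x, y, t) fixes into trajectories wherever two
    consecutive timestamps are more than max_gap seconds apart."""
    trajectories, current = [], [fixes[0]]
    for prev, cur in zip(fixes, fixes[1:]):
        if cur[2] - prev[2] > max_gap:
            trajectories.append(current)
            current = []
        current.append(cur)
    trajectories.append(current)
    return trajectories

def drop_implausible_fixes(trajectory, v_max=3.0):
    """For a walked route, drop fixes whose speed implied from the last
    kept fix exceeds v_max (improbably fast, so likely GPS drift)."""
    kept = [trajectory[0]]
    for x, y, t in trajectory[1:]:
        px, py, pt = kept[-1]
        if t > pt and hypot(x - px, y - py) / (t - pt) <= v_max:
            kept.append((x, y, t))
    return kept
```

Filtering against the last *kept* fix, rather than the immediately preceding raw fix, prevents a single drifted point from making all subsequent good points look implausible.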
The stops found in this way are combined with the places of signal losses (potentially the signal is lost due to entering a building) in order to search for POIs. This set of locations is clustered
and the cluster centres are taken to be the POIs. For this procedure the starting points of the trajectories are not included: due to the relatively long time-to-first-fix (TTFF) of the GPS tracker used (up to five minutes), the first observed location does not closely match the trip start location. This problem occurs because the GPS tracker is turned on at the trip starting time without delaying the trip until the location is found for the first time. For a TTFF of less than ten seconds (which is achieved by state-of-the-art GPS trackers) this is less of an issue. With the POIs identified, the starts of the trajectories are matched to the nearest POI and the trajectories are extrapolated backwards in time to start at a POI. This results in a set of trajectories which all start and end at POIs. Next, trajectories are dissected into shorter segments by breaking up trajectories passing the vicinity of POIs. This leads to a new set of trajectories starting and ending at POIs and not coming close to other POIs. This step is necessary since some mode changes do not result in stops in one direction but do so in the opposite direction. One example is walking away from a subway station: while exiting, one typically does not stop, whereas when entering the subway, at least a short period of waiting is often necessary. The output of this step is a collection of trajectories which connect the POIs and where ideally each trajectory corresponds to only one mode. The next step is the detection of the main routes used by the commuter. To this end the algorithm of [2] is used, resulting in a number of main routes. The main function of this step is to eliminate infrequently used routes, for which clear usage patterns cannot be detected due to too small sample sizes. Next, for each trajectory, indicators are calculated which are used to detect the transportation mode used for this trajectory.
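The dissection step described above can be sketched as follows. The 25-metre POI vicinity radius and the `(x, y, t)` point format are assumptions for illustration.

```python
from math import hypot

def dissect_at_pois(trajectory, pois, radius=25.0):
    """Break a trajectory into pieces wherever it passes within `radius`
    metres of a POI, so that each piece starts and ends at a POI passage
    and does not come close to any other POI in between."""
    pieces, current = [], [trajectory[0]]
    for point in trajectory[1:]:
        current.append(point)
        x, y = point[0], point[1]
        if any(hypot(x - px, y - py) <= radius for px, py in pois):
            pieces.append(current)
            current = [point]  # the next piece starts at the POI passage
    if len(current) > 1:
        pieces.append(current)
    return pieces
```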
The indicators comprise a number of factors, mainly built using speed and speed variation. The classification uses a tree-based classifier trained on manually labelled example trajectories. This simple approach works surprisingly well in this case. In a final step, temporal information is collected separately for each distinct route/transportation-mode configuration. The case study showed that it is possible, using only the motion history, to correctly identify the main routes used by the commuter. Furthermore, mode detection was able to identify the transportation mode used with high precision. The only errors made were misclassifications between walking and biking (due to high measurement noise and inappropriate outlier detection for the walked routes) and confusion between public transport and inner-city car usage. The second point in particular might be due to the lack of a significant number of trajectories documenting inner-city car usage. This is a topic for further research. Additionally, the indications of the case study need to be validated using a larger set of users, which is the topic of an ongoing research project.

3.5. Finding Prominent Places

Most of today's Location Based Services (LBS) provide information based solely on a user's location, not taking into account context knowledge about the user's current situation and needs. This often results in low-quality and inappropriate information for the user. Hence, in order to provide user-oriented services, an improvement of the response quality of information requests is required. This section outlines a methodology for finding and classifying places where the user regularly stays in her daily life, denoted in the following as 'prominent places'. A detailed description of this work has been published in [38].


3.5.1. Collecting cell-data

In order to draw meaningful conclusions about the motion behaviour of individuals, a sufficiently large amount of localisation data is required. 250,000 cell-based position measurements from ten volunteers have been collected during half a year of permanent observation using a constant sample rate of five minutes. On average, 25,000 positions have been obtained from each volunteer in cooperation with the biggest Austrian mobile phone provider.

3.5.2. Analysing cell-data

The analysis is split into two steps. First, the collected cell-data are used to find places where the volunteers spend most of their time. The found places are subsequently annotated automatically with semantics by labelling them with, e.g., 'home' or 'work' (see Figure 9).

[Figure 9 workflow boxes: finding potential cell candidates; grouping cells by linkages; grouping cells based on visiting frequencies; computing the individual cell-network; computing the place sequence; defining the model sequence; comparing and classifying.]

Figure 9. Workflow of analysing cell-data for (a) finding and (b) classifying prominent places.

a) Finding prominent places

Prominent places are defined as places where the user spends most of her time. In general, such places will mainly be 'home' and 'work' locations. Hence, cells in which a volunteer has been located more often than in others (using a constant sample rate) must correspond to her prominent places. Cell-candidates are therefore first identified by filtering out cells exceeding a high dwell time. In some cases there is a one-to-one relation between a cell-candidate and a prominent place. However, it often happens that one prominent place is assigned to multiple cells (see Section 2.2 for details on Cell Of Origin data collection). We have therefore developed an approach based on an individual cell-network graph. Nodes of the cell-network graph represent cells and links represent cell changes. Cell-candidates are grouped if the topological distance between them is less than or equal to a predefined number of links. Due to unknown network characteristics, it might happen that not all expected cell-candidates representing one prominent place are linked, and correct grouping will therefore fail. To handle this case, a further approach is used to add missing cells to related prominent places by comparing time series of visiting frequencies.
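The grouping of cell-candidates by topological distance described above can be sketched as follows. The breadth-first distance computation and union-find merging are a straightforward reading of the description; the link data and the two-hop threshold are illustrative assumptions.

```python
from collections import deque

def hop_distance(adj, start, goal):
    """Breadth-first topological distance (number of links) in the
    cell-network graph; None if the cells are not connected."""
    if start == goal:
        return 0
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        for neighbour in adj.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return None

def group_candidates(candidates, links, max_hops=2):
    """Merge cell-candidates whose topological distance is at most
    max_hops links, using a tiny union-find."""
    adj = {}
    for a, b in links:  # links are observed cell changes (undirected)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    parent = {c: c for c in candidates}

    def find(c):
        while parent[c] != c:
            c = parent[c]
        return c

    cand = list(candidates)
    for i, a in enumerate(cand):
        for b in cand[i + 1:]:
            d = hop_distance(adj, a, b)
            if d is not None and d <= max_hops:
                parent[find(a)] = find(b)
    groups = {}
    for c in candidates:
        groups.setdefault(find(c), []).append(c)
    return [sorted(g) for g in groups.values()]
```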


b) Classifying prominent places

After grouping is finished, we can compute a sequence of prominent places ordered by visits throughout a work day, based on the visiting frequencies. At the same time, we can manually define a daily routine for such work days (here ['home', 'work', 'spare time', 'home']). By comparing these two sequences, we can label the computed prominent places and thereby give them semantics.

3.5.3. Experimental results


The presented methodology has been applied to the dataset of 250,000 cell-based positions collected by the ten volunteers during half a year of permanent observation. Eleven of twelve home locations (92%) and nine of ten work locations (90%) have been found and correctly classified. (Two volunteers moved house during the observation phase, which is why there are twelve home locations for ten volunteers.) Each volunteer has validated the result based on her provided cell-based positioning data with respect to the correctness of the found and classified prominent places. All found prominent places are close to the real locations. The geographical accuracy of the found places mainly depends on the cell-network distribution in the surrounding area and cannot be influenced by the method used. Hence, no quantitative validation of the localisation quality was performed. Figure 10 shows an example of visualising the results in Google Earth. This visualisation was used to validate the results together with the volunteers. The demonstrated grouping and classification results are promising and can be used as a basis for improved LBS.

Figure 10. Example of visualising prominent places in Google Earth. Highlighted rectangles indicate the composition of cell-based positions for prominent places.


3.6. Mobile City Explorer

Mobile City Explorer (MCE) is a project implementing the concept of an innovative mobile guide for personalised city tours. The concept was derived from the user perspective, integrating features for guiding the tourist according to her preferences and actual behaviour, identifying objects using object recognition, and collecting pictures and route information in an automated travel diary. A location-aware pilot system for city tourists employing camera phones and GPS positioning technology was developed in 2006. The mobile travel assistant guides tourists through the city, suggests Points of Interest (POIs) matching the tourists' interests, and recommends appropriate routes. If the tourist deviates from the recommended route, the tour is recalculated according to her personal interests and time constraints. At the same time, the personal multimedia travel diary collects pictures, videos, text comments and acoustic impressions captured by the user along the route. The overall concept of the MCE project has been published in [58]. This section introduces the results of the GPS-based motion behaviour analysis component of the MCE [42].

3.6.1. Motion behaviour analysis


Here, the sequence of visited POIs constitutes the central source of information for learning the interests of city tourists. Motion behaviour analysis intends to find out in real time which POIs a user is visiting, where a visit is characterised by spending some time at one place. An online stop detection algorithm is used to find stays in the motion data of the tourist. Detected stays are classified into 'short stays' and 'long stays' depending on the dwell time: a short stay close to a POI (at least two minutes) may indicate user interest, and a long stay (at least five minutes) definitely indicates user interest in that POI. The extracted information is used for learning the user's interests (based on POI pre-knowledge). Hence, automatic POI suggestions and on-trip adaptations based on pre-selected interests and individual tourist motion behaviour are provided by the system.

3.6.2. Experimental results

The evaluation took place in Vienna with seven city tourists (unfamiliar with the city) exploring the first district using the MCE system. A total of 21,846 GPS data points were collected. At the start of the city tour, the tourists were prompted to select initial interests for the user profile initialisation. Then interesting POIs were assembled into a city tour and the tour map was provided to the tourist. Each time the tourist came close to a suggested POI, a notification containing a short description and a link to detailed information was sent to the tourist. Each stop at a POI resulted in an adaptation of the personal interest profile and subsequently led to a reassembled city tour. Figure 11 shows strong deviations between the suggested tour after visiting 6 POIs (a) and after visiting 12 POIs (b). Based on five initially selected interests (History and sub-categories), 14 new interests were learned on tour, and the new main interest was Art & Culture rather than History.
Figure 12 shows the adapted tour maps (a) after visiting 6 POIs and (b) after visiting 12 POIs. Both maps show the path and the suggested POI sequence. The results of the Mobile City Explorer project are promising, and some of its concepts might be used in future tourist applications.
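The stay classification of Section 3.6.1 uses the two- and five-minute thresholds given in the text; the renormalised profile update below is an assumed simplification of the MCE interest-learning component, sketched only to illustrate the idea.

```python
def classify_stay(dwell_seconds):
    """Stay classification: at least five minutes definitely indicates
    interest ('long stay'), at least two minutes may indicate interest
    ('short stay'), anything shorter is ignored."""
    if dwell_seconds >= 300:
        return "long stay"
    if dwell_seconds >= 120:
        return "short stay"
    return None

def update_interests(profile, poi_categories, weight=1.0):
    """Naive interest learning (assumed scheme): raise the weight of each
    category of the visited POI, then renormalise so weights sum to one."""
    for category in poi_categories:
        profile[category] = profile.get(category, 0.0) + weight
    total = sum(profile.values())
    return {c: w / total for c, w in profile.items()}
```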


Figure 11. Example of learned user interests (a) after visiting 6 POIs and (b) after visiting 12 POIs. The two bar-chart panels ("User 153 Interest Profile") show interest weights over categories ranging from Art & Culture through Architecture, History, Religion, and Science & Technique to Economy.

Figure 12. The city tour map (a) after visiting 6 POIs and (b) after visiting 12 POIs.

3.7. Video Analysis of a Train Station Hall

This section describes some results of a case study on obtaining and analysing long-term video-based pedestrian track data within a large hall of an Austrian railway station [9]. One major objective was to identify places within the railway hall where pedestrians frequently stop or walk slowly. Identifying such stops can provide insights into possible congestion areas, waiting areas and areas showing an aggregation of stops. Furthermore, places where pedestrians stop potentially coincide with areas where they reorient themselves in order to find specific targets. Automatic video-based analysis of an infrastructure can provide rich information about the spatio-temporal motion behaviour of pedestrians. Being aware of the limits of current vision-based people tracking systems (see Section 2.4), we wanted to get an idea of what information could be obtained with a set of temporarily installed digital surveillance cameras and video-based tracking software.



Figure 13. Column (a): Simulated camera views via CAD modelling; Column (b): Real camera views with intermediate tracking results. Row 2 shows a grey information booth in front of the information display with circular layout; this booth was built days after the CAD modelling and is therefore not represented in column (a); Column (c): Spatial distribution of detected stops with R = 1 meter and T = 3 seconds; coordinates refer to the ground plane; the dashed polygon indicates the camera's region of interest.


3.7.1. Camera Positioning

A strategic placement of cameras is essential to guarantee good coverage of the two levels of the hall with a limited number of cameras. The goal was to determine the fields of view of the cameras clearly in advance in order to select the necessary camera positions and lenses. This prevented ad-hoc and inadequate settings during the limited installation time window. The imaging system was not intended to be permanently installed. This transient nature of the data acquisition prohibited arbitrary installation of power supplies for sensor or repeater locations, and the equipment for the video recording had to be placed in lockable rooms not accessible to the public. We decided to simulate the field of view of each camera with 3D modelling in the spirit of [35]. A total of seven sensors has been defined, and two virtual camera views are provided in column (a) of Figure 13.

3.7.2. Recording and Tracking

The infrastructure was imaged over 13 days, producing 15-minute video samples at selected times of the day. The total recording time was 100 hours per camera. The pedestrian tracking software described in [6] was applied to the digital video streams. The pedestrian tracks in the video frames have been transformed to world coordinates of the respective ground floor, as illustrated in Figure 2 of Section 2.4.1. The tracking algorithm delivers a sequence of timestamp/location states which have been permanently stored. A snapshot of the tracking results is shown in column (b) of Figure 13. Note that for algorithmic reasons the tracking algorithm interprets any person standing still for more than 100 video frames (4-5 seconds) as part of the background and hence loses the track.


3.7.3. Preprocessing and Preliminary Analysis

We have analysed 18 video samples from one day obtained from five cameras, amounting to a total of 22.5 hours of video material. The output of the pedestrian tracking software is spatially and temporally fragmented due to the issues described above and in Section 2.4. In a first step, obvious outliers have been removed: trajectories with fewer than six states, as well as all points whose world coordinates lie outside of the building due to tracking errors, have been discarded. The latter phenomenon results mainly from tracking errors in the vicinity of the camera's vanishing line. The remaining tracks have been smoothed in order to reduce the noise level.
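The outlier removal described above can be sketched as follows. The rectangular footprint check is an assumed stand-in for the real building outline; the six-state minimum comes from the text.

```python
def clean_tracks(tracks, bounds, min_states=6):
    """Discard out-of-building points, then drop trajectories that are
    left with fewer than min_states (x, y, t) states."""
    xmin, ymin, xmax, ymax = bounds
    cleaned = []
    for track in tracks:
        inside = [(x, y, t) for x, y, t in track
                  if xmin <= x <= xmax and ymin <= y <= ymax]
        if len(inside) >= min_states:
            cleaned.append(inside)
    return cleaned
```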


3.7.4. Results for Stopping Pedestrian Detection

In the algorithm of [18], a trajectory is said to contain a stop at position (x, y) if the trajectory enters a circle of radius R meters centred at (x, y) and leaves this circle only after spending more than T seconds inside it. This does not define a unique stopping point; we have therefore averaged all positions within the circle to obtain a unique one. We have set R = 1 meter, and for the duration T we have alternatively used two and three seconds. Here, two seconds implies that any person moving with a speed of less than a meter per second will be classified as 'stopping'. This speed corresponds to slow walking as well as to actual stopping, allowing us to identify places where people stop or are slowed down. Column (c) of Figure 13 illustrates the results of the algorithm applied to the track data obtained during one day, using R = 1 meter and T = 3 seconds. The plots have been produced using a standard kernel density estimator [47], with brighter areas corresponding to regions with frequent stops. The analysis is based on the output of the tracking software; hence tracking failures are not visible in the data, and the plots should therefore always be interpreted with care. The results for the first camera (covering a part of the lower level of the hall) show a concentration of stopping places at the areas around the ticket machine, at the exit (where newspapers are sold), in the area in front of the escalators, and in front of a shop entrance. The largest concentration occurs where the main pedestrian streams intersect. Some of the found stopping locations closely match expectations, reconfirming a priori knowledge. Other locations, however, were surprising: in particular, the area just outside the shop was unexpectedly identified as a place where people frequently stopped.
This can be explained by the fact that shoppers buying provisions for the journey are often accompanied by fellow travellers who wait in front of the shop to look after the luggage. The results for the second camera (covering a part of the upper floor) show stopping areas at the information booth on the left part of the plots, underneath both sides of the information display, and around the wagon information stand. Again, these findings are in agreement with expectations. More results of the stopping detection, including velocity calculations from the track data, can be found in [9]. The findings of the video-based analysis have been used to support the design of a guidance system.
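The stop criterion of [18] as used above (circle of radius R, dwell time above T, stop position averaged over the points inside the circle) can be sketched as follows. The anchor-point scan is one possible reading of the criterion, not the reference implementation.

```python
from math import hypot

def detect_stops(trajectory, R=1.0, T=3.0):
    """Report a stop wherever the trajectory stays within a circle of
    radius R metres around an anchor point for more than T seconds; the
    stop position is the average of the points inside the circle."""
    stops, i = [], 0
    while i < len(trajectory):
        ax, ay, at = trajectory[i]
        j = i
        while (j + 1 < len(trajectory)
               and hypot(trajectory[j + 1][0] - ax,
                         trajectory[j + 1][1] - ay) <= R):
            j += 1
        if trajectory[j][2] - at > T:
            inside = trajectory[i:j + 1]
            stops.append((sum(p[0] for p in inside) / len(inside),
                          sum(p[1] for p in inside) / len(inside)))
            i = j + 1  # continue after the detected stop
        else:
            i += 1
    return stops
```

The averaged positions are exactly what the kernel density estimates in column (c) of Figure 13 are computed from.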

4. Conclusion: Applicable Methods for Specific Research Foci

The presented projects and case studies exemplify that pedestrian spatio-temporal behaviour is interesting for a variety of different research fields. The complexity of spatial


Table 1. Comparison of data collection methods for pedestrian monitoring. For each method: covered region; indoor/outdoor use; accuracy (experienced values); counts / tracks / determinants; main cost factor; comments.

GPS: medium, large; outdoor; approx. 3-50 m; tracking; technical equipment. Accuracy depending on environment (performance decreases in urban areas); world-wide availability; georeferenced data; GPS modules integrated in mobile devices; cheap high-sensitivity receivers available.

Cell-based positioning, passive: large; indoor & outdoor; approx. 100 m - 3 km; tracking; provider licences. Accuracy depending on provider network density; central data collection using network provider interfaces.

Cell-based positioning, active: large; indoor & outdoor; approx. 100 m - 3 km; tracking; technical equipment. Accuracy depending on provider network density; software on mobile client is required.

Bluetooth, passive: small, medium; indoor & outdoor; approx. 5-10 m; tracking; technical equipment. Infrastructure equipment is required; observable mode has to be activated on clients; central data collection.

Bluetooth, active: small, medium; indoor & outdoor; approx. 5-10 m; tracking; technical equipment. Infrastructure equipment and software on mobile client are required.

Video analysis: small, medium; indoor & outdoor; varying; tracking; technical equipment. Accuracy depending on frame size, camera perspective and calibration; outdoor performance strongly depending on environmental conditions.

Observations, unobtrusive: medium, large; indoor & outdoor; varying; tracking; manpower. Information about "natural" behaviour; accuracy depending on conditions, technique and observer skills; potential ethical issues; laborious analysis.

Observations, non-disguised: medium, large; indoor & outdoor; varying; tracking; manpower. Strong observer effects; accuracy depending on conditions and technique; laborious analysis.

Questionnaires: medium, large; indoor & outdoor; low; determinants; manpower. Large samples possible; strong observer effects; information limited due to predefined categories.

Interviews: medium, large; indoor & outdoor; low; determinants; manpower. Detailed information; strong observer effects; accuracy depending on participants' memory and awareness of their own behaviour; laborious analysis.

Trip diaries: medium, large; indoor & outdoor; medium, low; determinants; manpower. Detailed information; real-time diaries labour-intensive for subjects; recall diaries dependent on subjects' memory (less accuracy).

Laser-scans: small; indoor & outdoor; few centimetres; tracking; technical equipment. High accuracy; crowded scenes require many scanners due to occlusion problems (costly).

Sensor mats: small; indoor & outdoor; accuracy unknown; counting; technical equipment. Counting accuracy depending on conditions and technology; mostly limited to single lanes; tracking applications are being investigated.

RFID, passive: large; indoor & outdoor; few centimetres; tracking; RFID readers. Length of tracking gaps depending on costly RFID-reader distribution; infrastructure equipment is required; cheap RFID tags.

RFID, active: large; indoor & outdoor; approx. 10-100 m; tracking; technical equipment. Accuracy depending on environment conditions and positioning technology; infrastructure equipment is required.

WLAN: large; indoor & outdoor; approx. 30-300 m; tracking; technical equipment. Accuracy depending on environment conditions and positioning technology; infrastructure equipment is required.


activities and the interdependencies between those activities and the surrounding physical settings raise numerous questions which can only be tackled by accurately investigating pedestrian motion behaviour and the related influencing factors. A great number of empirical methods and technologies have been developed for the purpose of collecting and analysing data in order to understand, describe, model and predict pedestrian spatio-temporal behaviour. In the field of ambient assisted living, the sub-domains of people tracking and event detection are particularly important, because they provide motion trajectory data of people and its semantic interpretation, respectively. The applicability of particular data collection methods strongly depends on the specific focus of the research questions. Therefore, suitable methods and techniques need to be carefully selected. Table 1 juxtaposes the empirical data collection methods described in this chapter in a qualitative manner according to the following criteria:


• Covered region: refers to the dimension of the investigation area: "small" (e.g. a room), "medium" (e.g. a building, a street), "large" (e.g. a city, a county).
• Indoor / outdoor: indicates whether the method can be used in indoor or outdoor areas (or both).
• Accuracy: refers to the positioning accuracy level a method can achieve under normal conditions.
• Counts / tracks / determinants: refers to the kind of collected data. Methods which can be used for tracking are usually applicable for counting as well. "Determinants" relates to factors influencing spatial behaviour (e.g. individual preferences).
• Main cost factor: the most significant expense factor with growing sample size.

In several cases the investigation of a topic by applying one particular empirical method may not be sufficient, as each method implies specific limitations. Hence, there have been several approaches combining two or more methods in order to overcome their drawbacks and maximise their benefits. Early examples can be found in [19], using video and behavioural mapping techniques for analysing city tourists' behaviour, and in [25], combining unobtrusive tracking methods and inquiries to analyse urban tourism. More recently, localisation technologies have been applied in projects aiming at investigating mobility patterns, e.g. for the development of activity-based transportation models by collecting data with the help of GPS-enhanced self-administered diaries recorded on PDAs [22]. With the rapidly growing technical advances in the field of localisation technologies, further developments of more accurate, more reliable and cheaper technologies can be expected. Still, especially regarding the analysis and interpretation of particular datasets, many problems remain unsolved, and achieving profound and comprehensive knowledge about human spatio-temporal behaviour is still challenging.

A. Millonig et al. / Pedestrian Behaviour Monitoring: Methods and Experiences

References

[1] E.R. Babbie. The Practice of Social Research. Thomson/Wadsworth, 2006.
[2] D. Bauer, N. Brändle, and S. Seer. Finding Highly Frequented Paths in Video Sequences. In Proc. 18th International Conference on Pattern Recognition (ICPR2006), Hong Kong, 2006.
[3] D. Bauer and K. Kitazawa. Using laser scanner data to calibrate certain aspects of microscopic pedestrian motion models. In Proceedings 4th Intl. Conf. on Pedestrian and Evacuation Dynamics (PED2008), Wuppertal, Germany, 2008.
[4] D. Bauer, M. Ray, N. Brändle, and H. Schrom-Feiertag. On Extracting Commuter Information from GPS Motion Data. In Proceedings International Workshop on Computational Transportation Science (IWCTS08), 2008.
[5] R. Bechtel. Environmental Psychology, chapter Human movement in architecture, pages 642–645. Holt, Rinehart & Winston, New York, 1967.
[6] C. Beleznai, B. Frühstück, and H. Bischof. Human Tracking by Mode Seeking. In Proc. 4th International Symposium on Image and Signal Processing and Analysis, 2005.
[7] S. Blivice. Pedestrian Route Choice: A Study of Walking to Work in Munich. PhD thesis, University of Michigan, 1974.
[8] W. Bohte, K. Maat, and W. Quak. Urbanism on Track, chapter A Method for Deriving Trip Destinations and Modes for GPS-based Travel Surveys, pages 129–145. IOS Press, Amsterdam, 2008.
[9] N. Brändle, D. Bauer, and S. Seer. Track-based Finding of Stopping Pedestrians – A Practical Approach for Analyzing a Public Infrastructure. In Proceedings 9th IEEE International Conference on Intelligent Transportation Systems (ITSC2006), Toronto, Canada, 2006.
[10] N. Caceres, J.P. Wideberg, and F.G. Benitez. Review of traffic data estimations extracted from cellular networks. IET Intelligent Transport Systems, 2:179–192, 2008.
[11] P. Corbetta and B. Patrick. Social Research: Theory, Methods and Techniques. SAGE, 2003.
[12] P. de Bois and R. de Haan. GPS experiment in Almere. To be published, 2009.
[13] N.K. Denzin and Y.S. Lincoln. The SAGE Handbook of Qualitative Research. SAGE, 2005.
[14] H. Esser. Befragtenverhalten als ”rationales Handeln” – Zur Erklärung von Antwortverzerrungen in Interviews. Arbeitsbericht 85/01, ZUMA, 1985.
[15] U. Furbach, M. Marun, and K. Read. Street Level Desires, chapter Information systems for Spatial Metro, pages 74–79. Urbanism: Delft, 2008.
[16] B. Glover and H. Bhatt. RFID Essentials. O’Reilly, 2006.
[17] N. Haering, P.L. Venetianer, and A. Lipton. The evolution of video surveillance: an overview. Machine Vision and Applications, 19:279–290, 2008.
[18] R. Hariharan and K. Toyama. Project Lachesis: Parsing and Modeling Location Histories. In Proceedings of the Third International Conference on GIScience, Adelphi, MD, USA, 2004.
[19] R. Hartmann. Combining Field Methods in Tourism Research. Annals of Tourism Research, 15:88–105, 1988.
[20] M. Hill. Stalking the Urban Pedestrian: A Comparison of Questionnaire and Tracking Methodologies for Behavioral Mapping in Large-Scale Environments. Environment and Behavior, 16:539–550, 1984.
[21] H.H. Hovgesen, T.A.S. Nielsen, P. Bro, and N. Tradisauskas. Urbanism on Track, chapter Experiences from GPS tracking of visitors in Public Parks in Denmark based on GPS technologies, pages 65–77. IOS Press, Amsterdam, 2008.
[22] D. Janssens, E. Hannes, and G. Wets. Urbanism on Track, chapter Tracking Down the Effects of Travel Demand Policies, pages 147–159. IOS Press, Amsterdam, 2008.
[23] T. Kanda, M. Shiomi, L. Perrin, H. Ishiguro, and N. Hagita. Analysis of People Trajectories with Ubiquitous Sensors in a Science Museum. In Proc. IEEE Int. Conf. on Robotics and Automation, 2007.
[24] M.D. Karunanayake, M.E. Cannon, and G. Lachapelle. Evaluation of Assisted GPS (AGPS) in Weak Signal Environments Using a Hardware Simulator. In ION GNSS, 2004.
[25] A. Keul and A. Kühberger. Tracking the Salzburg Tourist. Annals of Tourism Research, 24:1008–1024, 1997.
[26] F. Kleijer, D. Odijk, and E. Verbree. Location Based Services and TeleCartography II, chapter Prediction of GNSS Availability and Accuracy in Urban Environments: Case Study Schiphol Airport, pages 387–406. Springer, Berlin, Heidelberg, 2009.
[27] H. Kuusniemi and G. Lachapelle. GNSS Signal Reliability Testing in Urban and Indoor Environments. In ION NTM, 2004.
[28] A. LaMarca and E. DeLara. Location systems: An introduction to the technology behind location awareness. Synthesis Lectures on Mobile and Pervasive Computing, 4:1–122, 2008.
[29] A.J. Lipton, J.I.W. Clark, B. Thompson, G. Myers, S.R. Titur, Z. Zhang, and P.L. Venetianer. The Intelligent Vision Sensor: Turning Video into Information. In Proceedings IEEE International Conference on Advanced Video and Signal based Surveillance (AVSS2007), 2007.
[30] A. Millonig and G. Gartner. Shadowing – Tracking – Interviewing: How to Explore Human Spatio-Temporal Behaviour Patterns. In Technical Report 48: Workshop on Behaviour Monitoring and Interpretation, pages 1–15, 2008.
[31] B.T. Morris and M.M. Trivedi. A Survey of Vision-Based Trajectory Learning and Analysis for Surveillance. IEEE Transactions on Circuits and Systems for Video Technology, 18(8):1114–1127, 2008.
[32] S. Munder and D.M. Gavrila. An Experimental Study on Pedestrian Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(3):1–6, 2006.
[33] T.S. Nielsen and H.H. Hovgesen. GPS in pedestrian and spatial behaviour surveys. In Conference Proceedings 5th International Conference on Walking in the 21st Century: Cities for People, 2004.
[34] R.E. Nisbett and T.D. Wilson. Telling more than We can Know: Verbal Reports on Mental Processes. Psychological Review, 84:231–259, 1977.
[35] I. Pavlidis, V. Morellas, P. Tsiamyrtzis, and S. Harp. Urban Surveillance Systems: From the Laboratory to the Commercial World. Proceedings of the IEEE, 89(10):1478–1497, 2001.
[36] R. Poppe. Vision-based human motion analysis: an overview. Computer Vision and Image Understanding, 108:4–18, 2007.
[37] J. Raper, G. Gartner, H. Karimi, and C. Rizos. A critical evaluation of location based services and their potential. Journal of Location Based Services, 1:5–45, 2007.
[38] M. Ray and H. Schrom-Feiertag. Cell-based Finding and Classification of Prominent Places of Mobile Phone Users. In The 4th International Symposium on Location Based Services & TeleCartography, 2007.
[39] P. Remagnino, S.A. Velastin, G.L. Foresti, and M. Trivedi. Novel concepts and challenges for the next generation of video surveillance systems. Machine Vision and Applications, 18:135–137, 2007.
[40] B. Rinner and W. Wolf. An Introduction to Distributed Smart Cameras. Proceedings of the IEEE, 96(10):1565–1575, 2008.
[41] S. Schneegans, P. Vorst, and A. Zell. Using RFID Snapshots for Mobile Robot Self-Localization. In 3rd European Conference on Mobile Robots (ECMR 2007), 2007.
[42] H. Schrom-Feiertag and M. Ray. Learning Interests for a Personalised City Tour based on Motion Behaviour. In The 4th International Symposium on Location Based Services & TeleCartography, 2007.
[43] M. Shah, O. Javed, and K. Shafique. Automated Visual Surveillance in Realistic Scenarios. IEEE Multimedia, 14(1):30–39, 2007.
[44] X. Shao, H. Zhao, K. Nakamura, K. Katabira, R. Shibasaki, and Y. Nakagawa. Detection and tracking of multiple pedestrians by using laser range scanners. In IEEE International Conference on Intelligent Robots and Systems, 2007.
[45] N. Shoval. Tracking technologies and urban analysis. Cities, 25(1):21–28, 2008.
[46] N. Shoval and M. Isaacson. Tracking tourists in the digital age. Annals of Tourism Research, 34(1):141–159, 2007.
[47] B.W. Silverman. Density Estimation for Statistics and Data Analysis. Chapman and Hall, London, 1986.
[48] D. Singelée and B. Preneel. Enabling Location Privacy in Wireless Personal Area Networks. COSIC internal report, Katholieke Universiteit Leuven, 2007.
[49] Skiline Marketing GmbH. Skiline. http://skiline.cc/. Last accessed 2009/04/02.
[50] A. Steinhage and C. Lauterbach. Monitoring Movement Behaviour by means of a Large Area Proximity Sensor Array in the Floor. In B. Gottfried and H. Aghajan, editors, 2nd Workshop on Behaviour Monitoring and Interpretation (BMI08), pages 15–27. CEUR Workshop Proceedings, 2008.
[51] J. Suutala and J. Röning. Towards the Adaptive Identification of Walkers: Automated Feature Selection of Footsteps Using Distinction-Sensitive LVQ. In Int. Workshop on Processing Sensory Information for Proactive Systems (PSIPS 2004), 2004.
[52] P. Thornton, A. Williams, and W.G. Shaw. Revisiting Time-Space Diaries: An Exploratory Case Study of Tourist Behavior in Cornwall, England. Environment and Planning A, 29:1847–1867, 1997.
[53] E. Trevisani and A. Vitaletti. Cell-ID technique, limits and benefits: an experimental study. In Sixth IEEE Workshop on Mobile Computing Systems and Applications, 2004.
[54] S.C. van der Spek. Urbanism on Track, chapter Tracking Technologies, pages 25–33. IOS Press, Amsterdam, 2008.
[55] S.C. van der Spek. Urbanism on Track, chapter Spatial Metro: Tracking pedestrians in historic city centres, pages 79–101. IOS Press, Amsterdam, 2008.
[56] P. Vorst, J. Sommer, C. Hoene, P. Schneider, C. Weiss, T. Schairer, W. Rosenstiel, A. Zell, and G. Carle. Indoor Positioning via Three Different RF Technologies. In 4th European Workshop on RFID Systems and Technologies, 2008.
[57] R.S. Weiss and S. Boutourline Jr. A summary of fairs, pavilions, exhibits, and their audiences. Technical report, Cambridge, Mass., 1962.
[58] S. Wiesenhofer, H. Feiertag, M. Ray, L. Paletta, P. Luley, A. Almer, M. Schardt, J. Ringert, and P. Beyer. Mobile City Explorer: An innovative GPS and Camera Phone Based Travel Assistant for City Tourists. Lecture Notes in Geoinformation and Cartography, pages 557–573. Springer, Berlin, Heidelberg, 2007.
[59] A. Willis, N. Gjersoe, C. Havard, J. Kerridge, and R. Kukla. Human movement behaviour in urban spaces: implications for the design and modelling of effective pedestrian environments. Environment and Planning B: Planning and Design, 31:805–828, 2004.
[60] G. Winkel and R. Sasanoff. Environmental Psychology, chapter An approach to an objective analysis of behavior in architectural space, pages 619–631. Holt, Rinehart & Winston, New York, 1966.
[61] A. Yilmaz, O. Javed, and M. Shah. Object Tracking: A Survey. ACM Computing Surveys, 38(4):1–45, 2006.
[62] V. Zeimpekis, G.M. Giaklis, and G. Lekakos. A Taxonomy of Indoor and Outdoor Positioning Techniques for Mobile Location Services. ACM, 2003.
[63] B. Zhan, D.N. Monekosso, P. Remagnino, S.A. Velastin, and L. Xu. Crowd analysis: a survey. Machine Vision and Applications, 19:345–357, 2008.
[64] H. Zhao and R. Shibasaki. A novel system for tracking pedestrians using multiple single-row laser-range scanners. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 35:283–291, 2005.

Behaviour Monitoring and Interpretation – BMI
B. Gottfried and H. Aghajan (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-048-3-43


Progress in Movement Pattern Analysis

Patrick LAUBE
Geomatics Department, The University of Melbourne, 3010 Parkville VIC, Australia

Abstract. This chapter reports on progress made in the development of techniques and tools to formalize, describe, detect and understand movement patterns in spatio-temporal data. The chapter is presented as a critical review of the relevant literature in the fields of geographical information science, data mining and knowledge discovery, and computational geometry. After discussing the nature of movement patterns, several conceptual space models accommodating movement are discussed as a preliminary for the formation of movement patterns. Then typical types of movement patterns and their signature applications are portrayed. The chapter concludes with a series of limitations to movement pattern analysis, an outlook, and a research agenda for future work.

Keywords. Movement patterns, moving point objects, spatio-temporal data mining, trajectories.


1. Flock – a quintessential movement pattern

Imagine a time-lapse movie of grazing sheep on a luscious meadow, shot in a wide-angle perspective. Asked what you see in the scene, you might very well answer with “flocking sheep”. Having done so, you would have abstracted the essence of a rather complex dynamic process and labeled it with a commonsense term. That is exactly what this chapter is about. Movement is inherently spatio-temporal, and its monitoring inevitably produces massive volumes of low-level observational data. Any deeper understanding of movement processes, however, needs “high-level conceptual schemes through which we humans interpret, understand, and use that (raw movement) data” [38, p. 300]. Movement patterns provide such high-level schemes and allow human interpreters to monitor, analyze, and ultimately understand movement. Growing volumes of monitoring data pour out of our networked cyber-infrastructure at previously unseen temporal and spatial granularities. New techniques and tools are required to understand and exploit this new source of insight into the movement (behavior) of people, animals, or any other kind of monitored moving object. In this context, the metaphor “flock” has inspired a whole new branch of spatio-temporal modeling and analysis in spatial computing and geographical information science (GIScience). The movement pattern flock illustrates how spatial scientists structure raw spatio-temporal data, aiming to condense overwhelming amounts of raw monitoring data into valuable process knowledge. The flock is the quintessential movement pattern, so this introductory section discusses the concept of movement patterns by postulating five theses gained from an analysis of the movement pattern flock.




Figure 1. Eight moving sheep on a meadow. Sheep s1 to s4 flock from tx until tx+dx. Typically, moving entities are modeled as moving point objects and their location is fixed as tuples of (x, y, (z), t) (see s3). For data capture and implementation, movement paths are typically discretized (see s1). Discretization furthermore allows defining the movement pattern flock as an instance of n = 3 consecutive time steps for which there exists a disc of radius p containing at least k = 4 moving entities. The movement pattern flock in itself does not disclose whether individual sheep choose to stay together for protection (second order effect) or must congregate in order to cross a creek (first order effect).
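The flock criterion from the caption (at least k entities inside a disc of radius p for n consecutive time steps) translates directly into a brute-force detector. The sketch below is illustrative only: the function name `flocks` and the data layout are my own, the disc test is approximated by requiring every member to lie within p of the group's centroid (a true smallest-enclosing-disc test would be slightly more permissive), and enumerating all k-subsets is exponential, fine for eight sheep but hopeless for real data sets.

```python
from itertools import combinations
from math import dist

def flocks(trajs, k, n, p):
    """Brute-force flock detection on discretized trajectories.

    trajs maps an entity id to its list of (x, y) fixes, one per time step.
    A k-subset of entities is reported as a flock if, for n consecutive
    time steps, every member lies within distance p of the subset's
    centroid (a crude stand-in for "fits in a disc of radius p").
    Returns (member_ids, start_step) pairs.
    """
    ids = sorted(trajs)
    steps = len(next(iter(trajs.values())))
    found = []
    for group in combinations(ids, k):
        run = 0                                # length of current "together" run
        for t in range(steps):
            pts = [trajs[g][t] for g in group]
            cx = sum(x for x, _ in pts) / k
            cy = sum(y for _, y in pts) / k
            run = run + 1 if all(dist((cx, cy), q) <= p for q in pts) else 0
            if run == n:                       # first n-step run found for this group
                found.append((group, t - n + 1))
                break
    return found

# Four sheep grazing side by side for three steps, one straggler far away.
herd = {
    "s1": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    "s2": [(0.2, 0.0), (1.2, 0.0), (2.2, 0.0)],
    "s3": [(0.0, 0.2), (1.0, 0.2), (2.0, 0.2)],
    "s4": [(0.2, 0.2), (1.2, 0.2), (2.2, 0.2)],
    "s5": [(10.0, 10.0), (11.0, 10.0), (12.0, 10.0)],
}
print(flocks(herd, k=4, n=3, p=1.0))
```

On this toy herd only s1 to s4 qualify, reported with start step 0; every group containing the straggler s5 fails the disc test at every step.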

1. Movement patterns focus on moving points. The majority of research into monitoring and analyzing movement behaviors focuses on moving point objects. Acknowledging that handling space-time is already difficult enough, moving entities are mostly treated as moving points even if they have a spatial extent (as, for example, moving cyclones). At any time, the location of a moving (point) object can then be fixed as a simple tuple (x, y, (z), t), with x, y, z referring to spatial coordinates and t referring to the corresponding time. Figure 1 illustrates eight roaming sheep, modeled as moving point objects s1 to s8.

2. Movement patterns are discrete. Once reduced to a moving point object, the path of a moving object can easily be thought of as a continuous curve in space-time. But when it comes to implementation, usually only discrete representations of this path are stored and manipulated in computers. “Abstract models are simple, but only discrete models can be implemented” [31, p. 281]. Movement paths, such as the ones depicted in Figure 1, are therefore typically discretized into trajectories consisting of a set of time-stamped fixes connected by straight segments. Note that localization techniques (such as GPS receivers) produce discrete fixes anyway and hence comply with a discrete data model in the first place. As can be seen from s1, the discrete trajectory model is only an approximation of the real path. Depending on the sampling rate, the discretization clearly deviates from the real path and can significantly underestimate the actual distance traveled.
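The underestimation mentioned in thesis 2 is easy to verify numerically: summing straight chords between fixes can never exceed the true path length, and the gap widens as the sampling rate drops. A small self-contained illustration (my own example, not from the chapter) using a quarter-circle path:

```python
from math import cos, dist, pi, sin

def path_length(fixes):
    """Length of a discretized trajectory: sum of straight segments between fixes."""
    return sum(dist(a, b) for a, b in zip(fixes, fixes[1:]))

# "True" path: a quarter circle of radius 100 m, densely sampled every 1 degree.
dense = [(100 * cos(i * pi / 180), 100 * sin(i * pi / 180)) for i in range(91)]

# A low-rate tracker keeping only every 30th fix (0, 30, 60, 90 degrees).
coarse = dense[::30]

# Dense sampling comes close to the true arc length of 100 * pi / 2 (about 157.1 m),
# while the three long chords of the coarse trajectory fall noticeably short.
print(path_length(dense), path_length(coarse))
```

Dropping the sampling rate by a factor of 30 here loses roughly 1.8 m of the true 157.1 m arc; sparser fixes or more tortuous paths widen the gap further.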




3. Movement patterns are intrinsically spatio-temporal. When thinking of sheep, the metaphor flock also suggests that movement patterns must extend in space and time. A flock is more than just a momentary congregation of sheep; it carries the intuitive notion that the involved sheep must stick together for some time in order to constitute a flock. There are already many techniques for describing and analyzing static point patterns (outlier detection, density measures, clustering). Much less work has been done on point patterns that persist and/or develop over time, such as movement patterns. Sheep s1 to s4 in Figure 1 not only aggregate at some point in time but move together over an extended period of time (bold sections of the paths from tx until tx+dx).


4. Movement patterns are elusive. There is surprisingly little consensus about what movement patterns are and how they should be defined. Do our sheep have to stay together for an extended period of time, or do they have to move together, so that a resting group of sheep would not be a flock? And what about the number of sheep in a flock? Does this number have to be constant? What happens if some sheep leave the group and are replaced by others: is it then still the same flock? Whereas neat definitions exist for most spatial patterns, there is hardly any definition of spatio-temporal movement patterns that two authors could agree on, at least in the details. Be it through different interests, diverging methodological backgrounds, or incompatible data sets, almost every paper on movement pattern analysis defines its own unique patterns and then gives methods to visualize, quantify or detect them. The same holds for the movement pattern flock. As Benkert et al. illustrate, flock has experienced some evolution in its definition [5], and other authors still prefer referring to herds in a similar context [49].

5. Movement patterns are context dependent. Movement is typically embedded in some given context, and so are movement patterns. Whereas the sheep in Figure 1 seem to be free to move wherever they like on the meadow, it appears they prefer the bridge to get from the right side to the left side, perhaps changing from their morning site to their afternoon site. In this case their evident movement pattern would be due to the variability of the embedding environment. At the same time, if there were no bridge, the same flock could also have emerged, since sheep prefer to stick together in order to be protected against predators. This example illustrates that movement patterns are context dependent. Knowing about that context is crucial, as it may influence how patterns are modeled, formalized, and detected [39].
These five theses, derived from the specific pattern flock, may be oversimplified and may in some cases diverge quite significantly from other, equally legitimate notions of movement patterns. However, it is not the goal of this chapter to present an exhaustive inventory of the variability of movement pattern approaches. Instead, the chapter aims to provide an introductory overview of a research field that is quickly gaining attention in various disciplines, including geographical information science and computational geometry, data mining, information visualization, and exploratory data analysis. This chapter introduces the concept of movement patterns and reflects on lessons learned from initial research on defining and detecting such patterns. The chapter is presented as a critical literature review that revolves around an extended stock of research articles, visiting and revisiting many articles from different perspectives.

The remainder of this chapter is structured as follows. Section 2 tackles the difficulties of getting a grip on the elusive nature of movement patterns and framing them into formal definitions. Section 3 provides the necessary basis for movement patterns, as it illustrates how movement itself can be modeled conceptually in a range of different spaces with quite different characteristics. Section 4 then presents a set of typical movement patterns in their application contexts. Section 5 asks what movement patterns could be used for, as there is surprising variability in the motives and purposes of the application fields analyzing movement patterns. Section 6 lists a series of limitations to the analysis of movement patterns. Finally, Section 7 outlines possible routes for future research and Section 8 concludes the chapter with the formulation of a research agenda for movement pattern analysis.

2. What is this thing called “movement pattern”?

The definition of movement patterns is a slippery issue and in most cases context- and hence application-dependent. Watanabe (1985) has a valid point when he argues that a pattern basically is “the opposite of chaos; it is an entity, vaguely defined, that could be given a name – i.e. a something” [89]. There is a huge variety of movement “somethings” – movement events, episodes and processes – that can be given the label of a movement pattern. Here are some examples:


• the “L” or “7” shaped movement pattern of the chess knight;
• the waggle dance of a honey bee;
• the daily commute of an office clerk between home and work;
• a racing pigeon following a prominent landscape element, like a river or a highway, for navigation purposes;
• the recurring path of hurricanes through the Caribbean sea;
• a near-miss of two airplanes;
• sheep flocking on a paddock;
• the seasonal migration of caribou;
• the symmetric V-shaped flight formation of a gaggle of geese (skein);
• a traffic jam;
• a marching band;
• a defending football team trapping a striker in off-side.

The list describes patterns referring to individual movement paths in their entirety (the commuter), parts of individual movement paths (the waggle dance), and coordination among multiple moving objects (flocking sheep, skein). Some patterns capture singular events (near-miss), others refer to recurring processes (hurricane), and some even show cyclical phenomena (commuter). Some patterns only make sense when movement is related to the underlying space, whereas others seem to be constituted independently of an embedding (geo)space. For example, the movement pattern of the navigating pigeon is linked to the river (just as the bridge in Figure 1). In a spatial statistics reading [64], such patterns are first order effects in that they emerge due to variability in the underlying properties of the local environment. By contrast, the structured movement of a marching band develops because every musician follows rules of maintaining a given spatial relation with their colleagues. The pattern is thus a second order effect, as it mirrors local interaction effects between moving objects.





In some cases the number of involved objects is inherent in the definition of the movement pattern. For instance, a near-miss of airplanes will always involve two airplanes; a marching band has a fixed number of members; the defending row of a football team setting the off-side trap typically features four defenders. In other cases patterns are constituted explicitly without an indication of the number of involved objects. Examples of such open movement patterns are the migration of caribou or a traffic jam, where the power of the pattern lies in the abstraction that we do not need to specify the number of caribou or cars. In yet other cases it is not a priori clear whether the pattern has a defined size. For instance, does a flock have a given number of members? Is it still a flock if some sheep wander away? Is it then still the same flock? Antony Galton calls this issue participation in the large-scale phenomenon (flock) by its constituent small-scale elements (sheep) and illustrates it with the example of a string quartet [38]. We would probably agree that a string quartet remains the same string quartet if one musician had to be replaced. But what if more and more members are replaced and the quartet ends up with none of its original members? Is it then still the same quartet?1

With that much diversity, the question arises: what is it that unifies all the patterns in the above list? What constitutes a movement pattern? As a first and obvious characteristic, all moving entities in the above list move in one way or the other. Frank (2001) describes motion as one of two separate forms of change: “Change of the objects of interest and change in the position or geometric form of these. For the first we use the heading life of objects: objects may appear and disappear (for example, a forest or a residential zone), two objects may merge (for example, two parcels or two towns), and objects may split. For the second we use the heading motion: objects may move or may appear to move, with or without changing their form at the same time” [36, p. 22]. In this chapter I shall define movement as change of the spatial location of an object over time. Hence, this chapter focuses on (geo)spatial movement and does not discuss other forms of movement, such as the movement patterns of the human body in biomechanical applications (as, for example, in “walking”).

Apart from describing movement phenomena, all the items listed above, in one form or another, capture the essence of a specific observation of a movement process. Be it through the characteristics of a movement path, through relating movement to the embedding environment, or through some form of coordination among moving objects, all examples describe a phenomenon such that we could identify other cases of the same kind. It is here that an identifiable pattern emerges: when we are able to associate the present particular case with other cases of waggle dances, hurricane paths, flocks, or off-side traps, recognizing their similarity [89]. Recognizing such a unifying pattern allows the building of a group of similar things and placing them in a “box” [89]. Movement patterns provide the essential high-level concepts we humans need to discuss, interpret, and ultimately understand movement data [38]. In a data mining and knowledge discovery context, Fayyad et al. define patterns as “an expression in some language describing a subset of the data or a model applicable to the subset. Hence, [...] extracting a pattern also designates fitting a model to data; finding structure from data; or, in general, making any high-level description of a set of data” [32].

Given the variability and the elusive nature of movement phenomena and the diversity of researchers interested in them, it is not surprising that so far most research fields foster their own definitions of movement patterns, typically adapted to their specific problems. However, this diversity makes it difficult to compare and thus evaluate the various techniques and procedures in movement pattern analysis. The community has acknowledged this problem, and there is discussion on whether or not there is, or can be, a set of generic movement patterns. At a recent interdisciplinary workshop at Schloss Dagstuhl, Germany, there were intensive discussions on that question, without reaching a conclusion yet [10]. On a very similar note, other authors work on the ongoing research strand of classifying movement patterns [25], and Section 4 will return to this issue in more detail. In this chapter I will not engage in this difficult discussion but rather adhere to a rather open definition of movement patterns. Following the open pattern definition given in Fayyad et al. (1996), I use the following definition:

Definition 1. A movement pattern is any high-level description of the movement of an individual or a group of individuals. This description may, but need not, relate the movement to the underlying space.

Before I can discuss signature movement patterns and their typical applications in Sections 4 and 5, I will in the next section lay the basis accommodating such patterns: the conceptual space models underlying movement.

1 The classic Greek Theseus paradox raises the same question: whether an object that has all its components replaced remains the same object; in the Greek legend, specifically a ship having all its planks replaced.


3. In what spaces can movement patterns be found and modelled?

Modeling movement means modeling the moving entities, but equally modeling the space they move in. Being spatial in nature, both building blocks could exploit the full spectrum of spatial conceptual data models and data structures: that is, use anything from entities (points, lines, and polygons) to fields (rasters). Squaring such variability results in a plethora of options for modeling movement. And indeed the modeling of movement has seen a wide variety of underlying data models, ranging from point entities moving in homogeneous 2D space in computational geometry [43] to lifelines in a 3D space-time aquarium in time geography [52], and extending from agents roaming through heterogeneous field spaces in behavioral ecology [17,61] to complex traffic models in network spaces [60] and using cellular automata [62]. As the underlying movement model has a profound influence on the nature of emerging patterns, and on the procedures suitable for their analysis, it is worthwhile to first discuss conceptual models of movement. Even though many moving entities in the physical world do have a spatial extent – think of a low pressure zone, a bush fire or a flock of sheep – the vast majority of methods shrink moving entities to points. I argue that there are two good reasons for this common generalization. First, moving points are in many circumstances perfectly sufficient to capture all properties of a moving object required for movement pattern analysis. Second, modeling moving linear or areal objects is rather cumbersome, and people were reluctant to tackle it for good reason. Where, for instance, is the conceptual difference between a moving polygon and a static polygon shrinking on one side and expanding on the other?

Hence, this section briefly introduces the most common ways of modeling the spaces accommodating moving point objects, discusses the lessons learned from using such models for movement pattern analysis, and gives respective signature examples. Figure 2 gives an overview of the discussed movement spaces and Table 1 summarizes their advantages and disadvantages for embedding movement of point objects.

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,


P. Laube / Progress in Movement Pattern Analysis

Figure 2. Six basic space models accommodating point movement. (a) Euclidean homogeneous space, (b) constrained Euclidean space, (c) space-time aquarium, (d) heterogeneous field space, (e) irregular tessellation, (f) network space.


3.1. Homogeneous Euclidean space

The simplest space to accommodate moving point objects is a homogeneous two-dimensional Euclidean space R2. In such a Euclidean space, point objects can occupy any arbitrary location. At any given time t this location is typically indicated as a tuple (x, y, t) and referred to as a fix. The term homogeneous refers to the assumption that the space R2 is featureless and hence neither obstructs nor influences where point objects are located or move to. Distances in such spaces, for instance between fixes of different individuals at a given time or fixes of the same individual at different times, are expressed as Euclidean distances. In combination with the trajectory model, which represents movement as a series of discrete linear steps connecting fixes, such Euclidean spaces allow the building of a basic and simple-to-implement movement model (see Figure 2a). Featureless Euclidean spaces are popular when inter-object proximity is important, as for instance in data mining for patterns of coordinated movement. For instance, the field of computational geometry has in the last decade seen a growing interest in movement patterns and algorithms for their efficient detection [43].

Lesson learned. Featureless two-dimensional Euclidean spaces are simple, a plausible proxy for many geographic and other spaces, and straightforward to implement for analysis and simulation. Furthermore, most tracking technologies produce movement data as discrete trajectories and are thus a natural fit. This holds true for the established global positioning system (GPS) as well as for emergent wireless sensor technologies for indoor environments such as radio-frequency identification (RFID). However, hardly any moving object in the physical world moves in a featureless space; most move in some embedded environment that is likely to influence the observed movement. Although simple and straightforward, modeling moving entities in featureless spaces risks an emphasis on second order effects (inter-object relations) and potentially neglects important first order effects (environment-object relations).

3.2. Constrained Euclidean space


Constrained spaces are a special case of homogeneous Euclidean spaces. They are often used in monitoring and modeling movement in indoor settings. With advances in image analysis and RFID tagging, indoor monitoring of individuals and crowds rapidly gains importance in surveillance and security applications (public spaces such as airports, train stations, sports venues) [44] and in consumer behavior studies (shoppers in malls, grocery stores) [53]. In the same application fields, the simulation of movement processes may also be of interest [51]. Here, rather than finding specific movement patterns, the task is reproducing certain patterns in order to better understand a dynamic multi-object process. The underlying space model must constrain where the moving objects can and cannot go. Walls and obstacles constrain an otherwise homogeneous featureless space (see Figure 2b). What matters is the spatial layout of the constraints: bottlenecks, corridors, dead ends. Proximity can be measured as beeline Euclidean distances as long as no barrier is in the way; otherwise distances must be evaluated around obstacles and corners. Consequently, such constrained spaces are often generalized into network spaces (see Section 3.6), representing rooms and corridors of a floor plan as nodes and edges of a topological network. Constrained spaces also gain importance in robotics applications, for example in indoor map discovery and industrial maintenance and repair [19].

Lesson learned. Similar to featureless Euclidean spaces, constrained spaces are simple to use. Such spaces allow a straightforward implementation of moving agents, moving through rooms and corridors and bouncing off obstacles.

3.3. The space-time cube

The space-time cube is a three-dimensional space-time model featuring two spatial axes (x, y) and one perpendicular temporal axis t (Figure 2c).
Whereas the vertical time axis could model any type of time (ordinal or continuous, linear or cyclic) [35], the most common choice is a continuous linear time (though discretized for implementation). Trajectories of moving point objects take the form of space-time paths, ascending with advancing time. Standstill produces vertical life lines, moving produces life lines deflecting from the vertical, with the azimuth reflecting movement direction and the inclination angle representing speed. The space-time cube, sometimes called space-time aquarium, is popular for visualization purposes and in time geography. Movement visualization in the space-time cube exploits the versatility of arbitrary perspectives opening up in three dimensions [52]. The concept of time geography dates back to the seminal work of Hägerstrand and the Lund
School [45], investigating temporal aspects of spatial human activities. Adopting certain constraints on an object's movement, the space-time aquarium provides a model of the object's potential locations: whereas a life line shows where an object is in space-time, the space-time prism shows where the object could be. There is ample research on space-time cubes in GIScience and database research, mainly concentrating on activity and accessibility analysis [59,33,31].

Lesson learned. Space-time cubes offer an intuitive conceptual approach for representing movement data for visual inspection. The space-time cube, the toolbox of exploratory data analysis, and the human observer's ability to recognize complex patterns form a powerful combination for the identification of salient movement patterns. On a more negative note, densely populated space-time cubes rapidly lose clarity as space-time paths increasingly occlude each other. Consequently, space-time cubes are best suited for analyzing the behavior of individuals or small groups. In short, the space-time cube is an elegant concept for a few individual cases; combinations with space-time prisms are rather cumbersome to implement [33].
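Reading a space-time path (azimuth for direction, inclination for speed) reduces to simple geometry on consecutive fixes. The following Python sketch, with hypothetical names and using the discrete trajectory model of Section 3.1, derives per-segment azimuth and speed from a list of (x, y, t) fixes; a standstill segment, i.e. a vertical life line, has no defined azimuth:

```python
import math

def segment_descriptors(fixes):
    """For each step between consecutive (x, y, t) fixes return an
    (azimuth_deg, speed) pair; a zero-length step (standstill, i.e. a
    vertical life line) yields azimuth None."""
    out = []
    for (x1, y1, t1), (x2, y2, t2) in zip(fixes, fixes[1:]):
        dx, dy, dt = x2 - x1, y2 - y1, t2 - t1
        dist = math.hypot(dx, dy)
        # geographic azimuth: clockwise from north (the positive y axis)
        azimuth = math.degrees(math.atan2(dx, dy)) % 360 if dist > 0 else None
        out.append((azimuth, dist / dt))
    return out

path = [(0, 0, 0), (0, 10, 10), (0, 10, 20), (10, 10, 25)]
print(segment_descriptors(path))
# a due-north move, a standstill, then a faster due-east move
```

Such instantaneous descriptors are the raw material that summary measures and context operators (Section 4.1) aggregate over larger temporal extents.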


3.4. Heterogeneous field space

Whereas all previous movement models are based on a continuous underlying space, a second set of approaches models objects moving in discrete underlying spaces. The conceptual data model field and its corresponding data structure raster offer a straightforward discrete movement space (see Figure 2d). Discrete field spaces are appropriate for modeling agents that react to or interact with a heterogeneous environment. Moving agents may choose their next step depending on some given neighborhood function (for instance, choosing the path of least resistance in a digital elevation model). Moving agents may also change their environment through some activity and then move according to the changed environment (for instance, agents modeling roaming animals grazing spatially distributed food resources [4]). Consequently, such movement spaces are widespread in behavioral ecology, which researches the ecological (and evolutionary) basis of animal behavior. Similarly, cellular automata simulating pedestrian movement often use discrete field spaces [23,15].

Lesson learned. A field space offers a simple and straightforward implementation of a heterogeneous movement space and allows the modeling of agent-environment feedback. For many movement processes, however, the discretization (allowing only fixed step lengths given by the grid granularity and a fixed number of cardinal movement azimuths) may blur fine-scale movement patterns.

3.5. Irregular tessellations

Mobile phones are arguably the largest producer of individual movement data at this time, although for the time being access to such data is restricted for privacy reasons. Excluding legitimate privacy considerations for the moment (Section 6.5 will get back to this important issue), mobile phone data present yet another challenging form of movement data.
Individual time-resolved movement paths can be reconstructed from the sequence of the towers routing the communication of individual mobile phone users [41]. The approximate reception areas of the towers partition the movement space into an
irregular tessellation, typically a Voronoi diagram. Movement is then modeled as a sequence of moves between cells (sometimes called "zones"), with objects hopping from cell to cell (see Figure 2e). Consequently, rather than providing precise location fixes for arbitrary times, such cell spaces create events such as "an object enters a cell at a given time", "an object stays in a cell for a given time interval", or "an object leaves a cell at a given time" [26].

Lesson learned. Offering access to the movement patterns of potentially millions of mobile phone users is the biggest asset of this otherwise limiting movement space. Such cell spaces are suited for coarse-level comparison of trajectories, and consequently for their aggregation and classification. Obviously, movement patterns relying on precise location information cannot be explored with this movement model. Furthermore, the discretization granularity varies in space, as cells of telecommunication networks typically vary in shape and size, ranging from very small in metropolitan areas to very large in rural areas. What provides a fairly precise location estimate in a central business district can become completely useless in the hinterland.
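The event view of cell spaces falls out of a single run-length pass over a time-ordered observation sequence. A minimal sketch (function and data names hypothetical) turning (timestamp, cell) observations into enter/leave events in the spirit of [26]:

```python
from itertools import groupby

def cell_events(obs):
    """Turn a time-ordered list of (timestamp, cell_id) observations
    into enter/leave events, one pair per contiguous stay in a cell;
    the stay interval is implied by the two timestamps."""
    events = []
    for cell, run in groupby(obs, key=lambda o: o[1]):
        run = list(run)
        events.append(("enter", cell, run[0][0]))
        events.append(("leave", cell, run[-1][0]))
    return events

track = [(0, "A"), (1, "A"), (2, "F"), (3, "G"), (4, "G")]
print(cell_events(track))
# one enter/leave pair per visited cell: A, then F, then G
```

Note that `groupby` groups only consecutive equal cells, so a later revisit of the same cell correctly produces a new enter/leave pair.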


3.6. Network space

The majority of movement performed or planned by humans is in some form network-bound (see Figure 2f). Humans (and also some animals) rationalize movement by building infrastructure that facilitates and channels movement. Obvious examples are street networks, public transport networks, air traffic networks, and shipping and cargo networks. Other human movement spaces can easily be generalized into a network, as was mentioned earlier for building floor plans. Network space models allow for detailed modeling of the properties of edges and nodes [13]. Edges may be directed (one-way streets), have maximum speeds (speed limits) or maximum capacities (cars per hour), or restrict movement modalities (highways, bicycle lanes, footpaths). Nodes may model stopping times (red lights) and transfer or layover properties (air traffic, cargo). All these network properties can furthermore be time dependent (peak hours). Naturally, network spaces are used for modeling, managing, and analyzing any sort of individual or public traffic [68]. The network model is also well suited for queries in moving object databases [13,22]. Network spaces furthermore allow location extrapolation and estimation, paramount to moving object databases (MOD) that manage the positions of constantly moving objects in real time. So-called future queries refer to database states that are extrapolated from the last known update [80,84].

Lesson learned. Networks allow at the same time the generalization of movement spaces and precise control of the movement they accommodate. For that reason, network spaces are popular when modeling and analyzing complex movement systems involving many interacting moving agents. However, network spaces are a possible source of bias for movement pattern analysis when handled carelessly; patterns assuming unconstrained movement should not be applied to objects moving on a network.
For example, the predominant movement azimuths of a driver moving on a metropolitan grid network mirror the network properties but may reveal little about the driver's intentions.
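Network distance, rather than beeline Euclidean distance, is the natural proximity measure in such spaces. A minimal sketch (toy graph and names hypothetical) computing network distance with Dijkstra's algorithm over a weighted adjacency-list model:

```python
import heapq

def network_distance(graph, source, target):
    """Shortest-path distance on a directed, weighted network given as
    {node: [(neighbour, edge_cost), ...]} (Dijkstra's algorithm)."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(queue, (nd, neighbour))
    return float("inf")

# toy street network with per-edge travel costs
g = {"A": [("F", 2)], "F": [("G", 1), ("A", 2)],
     "G": [("K", 3), ("F", 1)], "K": [("G", 3)]}
print(network_distance(g, "A", "K"))  # → 6, via F and G
```

Edge attributes such as one-way restrictions or time-dependent costs (peak hours) amount to editing the adjacency lists before, or while, running the search.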


Table 1. Summary comparison of movement space models.

| space model          | + advantages                                                                           | − disadvantages                                                                                   |
|----------------------|----------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|
| homogeneous 2D       | simple, straightforward for implementation, corresponds with tracking data format      | danger of negligence of first order effects                                                       |
| constrained 2D       | as above, suited for agent-based modeling                                              |                                                                                                   |
| space-time cube (3D) | elegant and figurative, allows change of perspectives                                  | limited to small sample sizes                                                                     |
| regular raster       | simple, straightforward for implementation, allows simple agent-environment interaction | not suited for movement patterns at a granularity below raster granularity                        |
| irregular partition  | allows coarse level comparison, aggregation and classification of movement paths       | disparities in cell sizes and shapes limit comparability between areas of different granularities |
| network space        | space generalization, precise modeling of spatially heterogeneous movement constraints | limits range of meaningful movement patterns                                                      |


4. What types of movement patterns are there?

The previous sections discussed the difficulty of defining movement patterns and portrayed a set of space models accommodating movement (patterns). This section presents a selection of movement patterns that shall offer an overview of the possibilities of movement pattern analysis. There is a growing interest in categorizing movement patterns in the GIScience community. These are important initiatives, and this chapter will briefly report on progress in that area. However, this introductory chapter does not aim at categorizing movement patterns, but rather presents a number of archetypal patterns, their typical applications, and the lessons that can be learned for future work. Instead of structuring the presentation of movement patterns according to a (yet to be established) taxonomy, I structure my presentation functionally, following the well-established functionalities of data mining. Even though data mining is not the only area using the notion of movement patterns (see Section 5), it is a prominent one, and the discipline has run a long-lasting scientific discourse on what kinds of patterns there could be and how they should be formalized, detected, and evaluated [47,8,46]. Given the various application fields, it is not surprising that the structure and the detailed terminology of the basic data mining tasks vary marginally between different sources. However, for the sake of simplicity, and because it offers a nice fit for movement patterns, I have chosen the structure presented in Han and Kamber [46] for my presentation of movement patterns: 4.1 class/concept description, 4.2 finding frequent patterns, rules, associations and correlations, 4.3 classification and prediction, and 4.4 clustering.² Figure 3 illustrates a selection of signature movement patterns and Table 2 summarizes the findings of this section.

But first, what has been achieved with the community's important pursuit of a taxonomy of generic movement patterns? Considerable efforts organizing movement patterns in a systematic way emerge from the rapidly growing field of visual analytics (see 5.2). This latest form of combining human and computer capabilities for the exploratory analysis of massive data repositories requires a profound systematic understanding of the types of patterns that may exist and how they can best be extracted through transformation, computation, and visualization techniques [2]. In their attempt to present a set of visual analytics tools suitable for large collections of movement data, the authors distinguish individual movement behavior (IMB, characterized by a trajectory, its shape, length, direction, speed, etc.), momentary collective behavior (MCB, movement characteristics of a set of objects at a given time), and dynamic collective behavior (DCB, movement characteristics of multiple objects over a time period). They then give similarity patterns for IMBs referring to similarity of descriptive measures, co-location, co-incidence, and synchronization. Examples of MCB patterns are constancy, change, fluctuation and trend. Finally, correlation (statistical correlation, co-occurrence of behaviors), influence (phenomena influencing each other), and structure (composition of complex patterns from simple ones) are listed for DCB patterns. Many of these terms can be found again in Dodge et al., aiming specifically at a taxonomy of movement patterns [25]. Their taxonomy groups movement patterns into generic patterns and behavioral patterns. Generic patterns are described as lower-level building blocks, further subdivided into primitive patterns (e.g. co-location, concentration, incidents, sequence) and compound patterns (e.g. symmetry, convergence vs. divergence, trend). Behavioral patterns, on the other hand, include the notion of a specific context and particular objects performing the patterns (e.g. flock, foraging, congestion). Both examples aiming at a categorization of movement patterns mirror a vivid and sound discussion in the community.

² Han and Kamber [46] also list outlier analysis (objects that do not comply with the general behavior or model of the data) and evolution analysis (modeling objects whose behavior changes over time). Little research has been done with such patterns, so that these two functionalities could be dropped with little loss.
However, there is still more work to be done towards a widely accepted taxonomy of generic movement patterns. The participatory process and the corresponding website initiated by [25] are an excellent platform for the interested reader to follow the progress of this community consensus initiative.³


4.1. Class/concept description

Movement data can be associated with classes or concepts like any data. Designers of location-based services may need to identify travel modalities such as car, train, tram, bicycle, or pedestrian. Anthropologists may want to describe the sex classes male and female of tracked orangutans by classifying their movement trajectory characteristics. Behavioral scientists may want to identify prominent behaviors such as foraging or resting by discriminating the respective marks they leave in geographic space. The aim of class/concept description is the characterization or discrimination of classes or concepts by summarizing the data of the target classes under study. The classes are user-defined and their labels are typically known in advance. Queries to moving object databases are used to collect the required data. The output patterns are presented as summary statistics, tables, charts or generalized relations. Figure 3c shows six space-time paths in a space-time cube. Let's assume it is furthermore known that these trajectories represent the morning commute of cyclists and train users. In order to describe the two classes, one could now investigate summary statistics of speed, acceleration, or trajectory sinuosity, allowing an unambiguous description of the classes cyclists {c2, c3, c6} and train users {c1, c4, c5} respectively.

³ http://movementpatterns.pbwiki.com/
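The class description sketched for Figure 3c reduces to summary statistics per trajectory. A minimal sketch (all names, trajectories and speed figures hypothetical) summarizing user-labelled classes by their mean travel speed:

```python
import math

def mean_speed(fixes):
    """Mean travel speed over a trajectory of (x, y, t) fixes."""
    dist = sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1, _), (x2, y2, _) in zip(fixes, fixes[1:]))
    return dist / (fixes[-1][2] - fixes[0][2])

def describe_classes(labelled):
    """Characterize each user-defined class by the average of its
    member trajectories' mean speeds."""
    speeds = {}
    for label, trajectory in labelled:
        speeds.setdefault(label, []).append(mean_speed(trajectory))
    return {label: sum(v) / len(v) for label, v in speeds.items()}

data = [("cyclist", [(0, 0, 0), (0, 5, 1)]),
        ("train", [(0, 0, 0), (0, 40, 1), (0, 40, 2), (0, 80, 3)])]
print(describe_classes(data))
# the train class mixes fast legs with a layover, yet still
# averages far above the cyclist class
```

In practice one would summarize several such measures (speed, acceleration, sinuosity) per class rather than a single number.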



Figure 3. Examples of movement patterns. (a) Patterns in featureless Euclidean space: space-use pattern home range (a1) and arrangement patterns leadership (a2), flock (a3), single file (a4); (b) descriptive patterns characterizing a trajectory: distance traveled dt vs. beeline distance db; different parts of the trajectory show different sinuosity (b1.1, b1.2); (c) trajectory similarity and trajectory clustering: origin-destination (from H to K vs. from K to A) vs. travel modality models (cyclists show steady low speed, train users move fast between layovers); (d) correlation pattern: d1 and d2 show highly synchronous movement, d3 is an outlier; (e) sequence pattern: two trajectories both featuring a sequence I, F, G; (f) trajectory similarity and clustering in a network space: f1 and f2 are more similar than f1 and f3; all three trajectories build an origin-destination cluster (from A to K).

Measures characterizing movement are paramount to class/concept description. Such measures typically describe the spatial and temporal properties of some sort of trajectory. Descriptive measures can either characterize entire trajectories (e.g. distance traveled), subsections of trajectories (e.g. average direction of an office clerk’s morning commute), or give momentary movement descriptions (e.g. current speed). Laube et al. gave four context operators for deriving trajectory descriptors at different granularities [54]. Similar to Tomlin’s map algebra, they proposed instantaneous (“local” in a spatial context), interval (“zonal”), episodal (“regional”), and total (“global”) context operators applicable to a continuous stream of movement descriptors along a trajectory. They illustrated their conceptual framework by applying it to movement properties such as speed, sinuosity, and movement azimuth. The key to many ecological questions lies in describing the movement patterns of the species under study. Developed in a pre-GPS era when coarse observation data and mark-recapture data were the norm, many established ecological movement descriptors are of a summary nature. For example, the mobility indices mean daily movement or
maximal distance describe the daily movement of an animal [6]. From data sets covering many individuals over many days, measures are derived that describe the characteristic movement pattern of a given species. Such measures include the daily activity radius, the minimal convex polygon (around all observations), and, of course, the popular home range [6]. A home range describes the space-use of an animal species, "that area traversed by the individual in its normal daily activities [...]" [16]. Over the years more concise quantitative definitions of home range were put forward, but at their core they all aim at aggregating observation fixes into shapes mirroring an individual's space-use pattern for some given interval (see a1 in Figure 3a).

Lessons learned. Summary parameters and statistics collapse the spatial and temporal detail of movement data into a single number. As neat as this can be with respect to data compression, the spatio-temporal footprint characterizing movement is completely lost. For the sake of argument, let's assume a gazelle and a lion both had similar daily travel distances. This summary measure would hardly be suited to adequately classify the trajectories of a grazing animal and a hunter. Space-use patterns such as the convex polygon or the more sophisticated (but less transparent) home range only collapse time and offer a spatial footprint pattern for class description. Such aggregation patterns offer a quick fix for coping with excessive movement data volumes. However, they promote a static world view and hence remain a blunt tool for analyzing dynamic movement. Furthermore, even though home range is an established concept for describing space use, there is little agreement on how to apply and parameterize the various home range methods [11].
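Two of the descriptors above are easy to make concrete: sinuosity as the ratio of distance traveled dt to beeline distance db (Figure 3b), and the minimal convex polygon as the convex hull of all observation fixes. A sketch with hypothetical names; the hull uses Andrew's monotone chain algorithm:

```python
import math

def sinuosity(fixes):
    """Distance traveled dt over beeline distance db (always >= 1)."""
    d_t = sum(math.hypot(b[0] - a[0], b[1] - a[1])
              for a, b in zip(fixes, fixes[1:]))
    d_b = math.hypot(fixes[-1][0] - fixes[0][0], fixes[-1][1] - fixes[0][1])
    return d_t / d_b

def convex_hull(points):
    """Minimal convex polygon around observation fixes
    (Andrew's monotone chain), returned counterclockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def half(seq):
        h = []
        for p in seq:
            # pop while the last turn is clockwise or collinear
            while len(h) >= 2 and ((h[-1][0] - h[-2][0]) * (p[1] - h[-2][1])
                                   - (h[-1][1] - h[-2][1]) * (p[0] - h[-2][0])) <= 0:
                h.pop()
            h.append(p)
        return h[:-1]
    return half(pts) + half(pts[::-1])

fixes = [(0, 0), (2, 1), (4, 0)]
print(sinuosity(fixes))  # wiggly path: ratio > 1
print(convex_hull([(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]))
# interior point (2, 2) is dropped
```

As noted in the text, both measures collapse time entirely; they describe the spatial footprint of movement, not its dynamics.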


4.2. Finding frequent patterns, rules, associations, correlations

Frequent patterns are itemsets, subsequences or substructures that occur frequently in data [46]. Whereas itemsets in a conventional transactional data set refer to items that frequently appear together (as in the classic market basket association rule "customers buying coffee frequently also buy milk"), in movement data "appearing together" is modeled as spatio-temporal proximity. Frequent sequences and subsequences in movement can refer to locations or space partitions that are visited in some specific order or frequency. The high spatio-temporal granularity of movement data offers insights into trajectory substructures, such as specific forms or shapes (for example, unusually wiggly moves in an otherwise smooth path, or sudden direction changes). Furthermore, correlation between pairs of moving objects can be investigated, and larger groups can express frequent arrangement patterns (e.g. flocking). Examples of frequent patterns are illustrated in Figure 3a: leadership (a2), flocking (a3), and single file (a4). The trajectory in Figure 3b illustrates an association rule of the form "objects that wiggle for some time (b1.1) often turn left thereafter and move straight for some time (b1.2)". Figure 3d exemplifies how correlation between trajectories d1 and d2 can be computed by measuring the direction changes of concurrent moves (t1 to t2, t2 to t3, and so forth). Sequences of visited cells may also build patterns: it appears from Figure 3e that the sequence I, F, G appears often in the (admittedly small) data set of two trajectories.

The last decade has seen a growing interest of computational geometry in developing algorithms for the efficient detection of movement patterns. Most approaches adhere to pattern matching: specific patterns are first predefined (modeled and formalized)
and then algorithms are developed for their efficient detection. Gudmundsson et al. [43, p. 196] defined movement patterns as "large enough subgroups of moving point objects that exhibit similar movement in the sense of direction, heading for the same location, and/or proximity". In Andrienko and Andrienko's terminology [2], such patterns can also be referred to as arrangement patterns, as they typically capture geometric or topological relations amongst moving objects. Prominent examples of arrangement patterns are flock, leadership, or single file. A flock refers to a group of moving objects that move in spatial proximity for some time (a bit like a moving cluster; see a3 in Figure 3a). After defining flock as a geometric arrangement, Benkert et al. present a series of algorithms detecting flocks, exploiting quadtrees as well as range and nearest neighbor queries [5]. This is a typical example of movement pattern analysis in computational geometry, as the data mining problem at hand suits a well-known data structure and the procedures applied to it. A flock could have a leader, defined as an object that spatially leads the way for a group of followers for some time [1]. Here, the spatial relation in front is modeled geometrically, before algorithms and data structures that identify leaders and followers are presented (see a2 in Figure 3a). A group of objects may furthermore build a single file pattern if they are following each other, one behind the other (see a4 in Figure 3a) [14]. This work exploits well-understood concepts from computational geometry for movement pattern mining, most prominently the Fréchet distance and the free space diagram (a geometric data structure for computing the Fréchet distance of polygonal curves). Whereas most arrangement patterns put a strong focus on the geometric properties of trajectories in featureless 2D space, other embedding spaces may reveal patterns in sequences of movement events.
For example, Du Mouza and Rigaux [26] present a set of mobility patterns as sequences of events of point objects moving in a discrete space. Directly modeled on mobile phone users traveling in an antenna cell network, trajectories are perceived as strings of the labels of the visited cells, and patterns are frequently found substrings (see Figure 3e). Such approaches open up links to pattern matching in strings and molecular genetics. Verhein and Chawla offer an example of spatio-temporal association rules, again for movement in an irregularly tessellated space [87]. The authors present Spatio-Temporal Association Rules (STARs) that describe how objects move between regions over time. Their STAR miner searches for objects that "have visited region F for some time and later appear in region G" (see Figure 3e). With respect to regions, the frequency of inbound and outbound moves allows the discrimination of high traffic regions (frequent inbound and outbound moves) and stationary regions (frequent extended stays). When comparing the trajectories of synchronously moving objects, patterns of correlation can emerge. Such correlation patterns do not assess the similarity of the geometric shape of trajectories, but rather refer to the synchronism of instantaneous movement properties. For example, two objects could move with the same or similar movement azimuth or with the same speed at a given time [56]. When extended over longer time intervals, such measures can be used to evaluate the correlation between trajectories or parts of trajectories [76]. For example, trajectories d1 and d2 in Figure 3d are positively correlated, whereas d3 breaks ranks.

Lessons learned. Frequent arrangement patterns deserve credit for pioneering much of the current enthusiasm for movement pattern analysis because they are simple, illustrative, and can often be given a catchy label (e.g. flock). However, despite many conference talks and published papers on the topic, such patterns remain an elegant concept lacking convincing applications of social and economic relevance. I identify three major reasons for this shortcoming. First, there is still a shortage of large enough accessible movement data sets allowing a thorough benchmarking of often theoretically presented approaches. This limitation will fall in the near future, and many approaches will have to pass the test. Second, movement pattern definitions and their respective detection algorithms tend to be ruled by the methodological toolbox of the researchers developing the tools rather than by the actual needs of the users applying the tools. Third, the evaluation of the significance (relevance, interestingness) of found patterns is difficult [57]. Furthermore, most work on frequent movement patterns still defines its own specific problems and the patterns that come with them. Given the wide range of problems addressed and research and development fields involved, comparison of approaches remains (for the time being) difficult.
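For concreteness, a brute-force check for one possible formalization of the flock pattern: all objects pairwise within distance r for at least k consecutive timesteps. Published definitions typically require the group to fit in a disk of radius r instead, and the algorithms cited above replace the quadratic scan with quadtrees and range queries; all names here are hypothetical:

```python
import math
from itertools import combinations

def is_flock(trajectories, r, k):
    """True if all objects stay pairwise within distance r for at least
    k consecutive timesteps. trajectories: equal-length, time-synchronized
    lists of (x, y) positions, one list per object."""
    run = 0
    for positions in zip(*trajectories):
        close = all(math.hypot(ax - bx, ay - by) <= r
                    for (ax, ay), (bx, by) in combinations(positions, 2))
        run = run + 1 if close else 0  # reset on any dispersal
        if run >= k:
            return True
    return False

a = [(0, 0), (1, 0), (2, 0)]
b = [(0, 1), (1, 1), (2, 1)]
c = [(5, 5), (1, 2), (2, 2)]
print(is_flock([a, b, c], r=3, k=2))  # → True: close at the last two timesteps
```

The sketch makes the evaluation problem noted above tangible: the result depends entirely on the chosen r and k, for which there is rarely an obvious principled value.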

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

4.3. Classification and prediction

Conventional data mining classification and prediction aims to find a model or function that describes data classes or concepts [46]. Once this model is found, it can be used to predict the class of other objects whose label is unknown. The model is typically derived from a training set, that is, from objects whose class label is known. Adopted for movement pattern analysis, classification and prediction attempt to find models that allow reproducing and predicting trajectories, parts of trajectories, or certain moves of humans, animals and other moving objects. Classification and prediction patterns can take the form of classification rules, decision trees, mathematical formulas or neural networks.

An example of a classification rule could be: “if the speed is below 10 km/h then the travel modality is walking”. Once this rule is derived from a set of known walker trajectories, other trajectories whose travel modality is unknown can be classified accordingly. Let’s assume that the means of transport is known to be bicycle for trajectories c2 and c6 and train for c1 and c5 in Figure 3c. Hypothesizing a derived classification model based on speed and layover times, the remaining trajectories could easily be labeled bicycle in the case of c3 and train in the case of c4. Similarly, assuming trajectory b in Figure 3b tracks a building cleaner, then its sub-trajectories might be classified according to an imaginary sinuosity model as vacuuming (b1.1) and walking off to coffee break (b1.2).

With GPS tracks of individual human beings becoming increasingly accessible, an obvious movement classification task is the identification of activity mirrored in trajectories or subsets of trajectories. Dykes and Mountain suggest the identification of episodes of spatio-temporal behavior [29]. Once identified, such episodes may correspond to distinct activities, which then in turn could express different information demands in an LBS context.
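The speed rule quoted above can be written as a toy rule-based classifier. This is a sketch only: the bicycle/train thresholds and the layover feature are assumptions of mine, not values from the chapter or the cited work:

```python
def classify_modality(avg_speed_kmh, layover_fraction):
    """Classify travel modality with hand-derived rules.

    avg_speed_kmh: mean speed over the trajectory.
    layover_fraction: share of total time spent stationary (e.g. at stations).
    All thresholds are illustrative, not empirically derived.
    """
    if avg_speed_kmh < 10:
        return "walking"
    if avg_speed_kmh < 30 and layover_fraction < 0.1:
        return "bicycle"
    return "train"

# A slow trajectory is labeled walking regardless of layovers
print(classify_modality(4.5, 0.0))  # walking
```

In the Figure 3c example, trajectories with moderate speed and few layovers would fall into the bicycle branch, fast or layover-rich ones into the train branch.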
Exploiting a combination of variables (speed, sinuosity, spatial range, freedom of movement), they conjecture activity classes on foot, low-speed motor, high-speed motor and flight. Similarly, Smyth [81] mines stored trajectories of motorists for repeated behavior in order to predict future mobile behavior. This work is motivated by the thought that it is possible “to aid travelers in their mobile activities by deriving new information from the traces of their past activities” (for example, a motorist repeatedly visiting the same address may be given simplified wayfinding instructions after a while) [81, p. 346]. In behavioral ecology there is an abundance of work aiming at fitting known movement models to observed trajectories. Examples range from random walk for copepods


[72] and caribou [7], and from Lévy walk patterns for spider monkeys [70] to hidden Markov models, again for caribou [37]. Since movement is thought to be the primary mechanism coupling animals to their environment [7], the identification of a model best representing observed movement (patterns) is assumed to shed light on key biological processes such as migration or population spread. Finally, Wentz et al. provide an example of classification for prediction, filling gaps in fragmentarily recorded trajectories of primates. Characteristics derived from observed trajectory sections were thereafter used to create continuous tracks through a random walk model [90]. Lately, scientists even attempt to find similar models for human mobility in trajectories of anonymized mobile phone users [41].

Lessons learned. Classifying episodes of human activity from their spatio-temporal footprint is difficult. In many cases the geographic context is likely to reveal more about the actor’s behavior than the properties of the trajectory it carves. As an extreme example, a person standing still at a tram stop obviously waits for a tram – an activity difficult to infer from the trajectory properties alone. When classification is difficult, then prediction of human (and perhaps less so also animal) behavior is extremely difficult. However, since simulation of trajectories is a popular strategy for overcoming the lack of suitable movement data repositories, finding adequate models reproducing realistic human (and animal) trajectories remains a noble ambition.
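As an illustration of such movement models, a correlated random walk of the kind fitted in the caribou study [7] can be generated in a few lines. The parameter values below are arbitrary, chosen only to show the mechanism:

```python
import math
import random

def correlated_random_walk(n_steps, step_len=1.0, turn_sd=0.4, seed=None):
    """Generate a trajectory where each new heading is the previous heading
    plus a Gaussian turning angle; the heading correlation between
    consecutive steps is what makes the walk 'correlated'."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    heading = rng.uniform(0.0, 2.0 * math.pi)
    path = [(x, y)]
    for _ in range(n_steps):
        heading += rng.gauss(0.0, turn_sd)  # small turn_sd -> persistent direction
        x += step_len * math.cos(heading)
        y += step_len * math.sin(heading)
        path.append((x, y))
    return path

# e.g. a 100-step synthetic trajectory for testing a pattern detector
track = correlated_random_walk(100, seed=7)
```

Fitting such a model means estimating step length and turning-angle distributions from observed fixes and comparing, for instance, the predicted and observed net displacement.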


4.4. Clustering

Clustering aims at grouping observations into sets of observations that are similar to each other (maximizing intraclass similarity) and relatively different from other sets of observations (maximizing interclass difference) [64,46]. In contrast to classification and prediction, labels are unknown in the first place and cannot be used for clustering. However, labels may be assigned to derived clusters later on. Clusters in movement analysis group similar moves, parts of trajectories, or entire trajectories. Given that trajectories can vary significantly in their general appearance, their spatial and temporal extent, and also their granularity, defining and evaluating trajectory similarity is not straightforward and is therefore especially addressed in this section.

Figure 3c illustrates two distinct clusters of trajectories. The space-time paths could have been clustered according to different trajectory properties (“angular” vs. “smooth”) or referring to their origin-destination differences (gray cluster moves from region H to K, black cluster from K to A). The trajectories in Figure 3c could also be grouped into an origin-destination cluster, here moving from region A through G to K. Whereas the notion of trajectory similarity may be obvious for d1 and d2 in Figure 3d, a comparison of either two with d3 is less obvious: d3 is recorded at a later time, is much shorter in space and spans a shorter time.

Trajectory clustering has been proposed for various reasons in several application fields and for movement modeled in many quite different ways. Developers of location-based services (LBS), traffic, and mobility management applications are interested in clusters of similar trajectories of mobile phone users, drivers or commuters [21].
In an environmental health context, Sinha and Mark propose a lifeline distance (dissimilarity) to detect clusters of people with similar environmental exposure history – “providing a basis for revealing possible regions in space-time where environmental hazards might



Table 2. Signature movement patterns summary.

description
  signature patterns: patterns in summary movement descriptors/statistics, such as speed, sinuosity, daily displacement
  lessons learned: Descriptive patterns reflect a static world view and tend to disregard the dynamic aspect of movement

frequent patterns
  signature patterns: arrangement patterns (flock, leadership, single file); move sequences; STARs; trajectory correlation
  lessons learned: Frequent patterns offer elegant concepts that have yet to deliver convincing applications and relevance measures

classification/prediction
  signature patterns: travel modalities; episodes; activities; predictive models (e.g. random walk, hidden Markov models)
  lessons learned: Classification/prediction of human (animal) activity is difficult without context knowledge

clustering
  signature patterns: origin-destination clusters; clusters of geometrically similar trajectories
  lessons learned: Trajectory clustering is first and foremost a powerful exploratory data analysis concept
have existed in the past” [79, p.117]. In marketing, shoppers’ paths through stores are clustered in order to identify distinctive types of grocery store travel [53].

Whereas similarity for conventional clustering uses the Euclidean distance between observation points in attribute space, the computation of similarity between trajectories that extend in space and time is less straightforward. Trajectories can vary in spatial length and temporal duration; they can cover a wide range of granularities and have various sampling properties; and they may evolve simultaneously or with a time lag [88]. Consequently, there are many different routes to evaluating trajectory similarity for clustering. In a transportation context, trajectories of commuters are often clustered with respect to similar origins, similar destinations or similar origin-destination combinations [12,71]. Other approaches focus on geometric properties of trajectories. Some use summary statistics about path length, move length, move duration, speed, turning angle, turning rate, and net displacement [72]. Others focus on the coordinates of the fixes and compute distances between concurrent or otherwise corresponding observation fixes (first fix, last fix, fix at 0.1 length, 0.2 length, etc.) [21,79]. Still other similarity measures assess the similarity between models describing trajectories. First, the geometry of the trajectories is replaced with models describing them. Second, the similarity of those placeholder models is evaluated [69,37].

Lessons learned. Trajectory clustering is primarily used as a tool for exploratory data analysis, where the actual clusters are less important than the process of finding them. A reason for this frequent deprecation of the actual patterns may lie in the problem of assessing pattern relevance that is notoriously associated with data mining. Again, any data set allows the building of arbitrary clusters – but do they really mean anything?
Also, in many cases the analysis interest will not be the resulting clusters themselves, but the relationship of the clusters with the underlying geospace, for instance when the clustering exercise aims to identify a bottle-neck in a traffic network [71].
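The "corresponding fixes" route to similarity mentioned above can be sketched as follows. This is a simplified stand-in for the measures in [21,79]: resampling by index position rather than by traversed path length is an assumption made here for brevity:

```python
import math

def resample(traj, n):
    """Pick n fixes at (approximately) equal index fractions along the trajectory."""
    return [traj[round(i * (len(traj) - 1) / (n - 1))] for i in range(n)]

def trajectory_distance(a, b, n=10):
    """Mean Euclidean distance between corresponding resampled fixes of
    two trajectories given as lists of (x, y) tuples."""
    ra, rb = resample(a, n), resample(b, n)
    return sum(math.dist(p, q) for p, q in zip(ra, rb)) / n

line = [(x, 0.0) for x in range(5)]
shifted = [(x, 1.0) for x in range(5)]
print(trajectory_distance(line, shifted))  # 1.0
```

A pairwise distance matrix computed this way can be fed to any standard clustering routine (hierarchical, k-medoids, etc.) to obtain trajectory clusters.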


5. Application fields

Finding movement patterns in large data sets was the focus of the previous section. However, such data mining is not the only purpose of movement patterns. This section puts the obvious data mining task in perspective with a set of alternative application fields of movement patterns.

5.1. Data mining: pattern detection and pattern recognition

The sheer amount of movement data pouring out of mobile and location-aware systems makes it increasingly difficult to follow the complex dynamic processes these systems monitor. Therefore automated tools are needed that scan massive data volumes for patterns revealing high-level structure. Data mining provides such tools. In pattern detection it is typically not known in advance what patterns the data mining tools will come up with. Extracting the elusive patterns is an integral part of the data mining process. In pattern recognition, well defined ideas of the patterns to be found are given and the task is to find these patterns efficiently. Typical results are quantifiable classifications, frequent patterns, rules, or clusters.

Mobility data mining is still a relatively young research field. The complex dynamics of objects in motion challenge conventional data mining techniques and require the development of new concise and useful abstractions of large volumes of mobility data. Even though large sections of this chapter report on substantial progress in mobility data mining, the community still intensively discusses what patterns can be extracted from trajectories and the best ways to do so [39].


5.2. Exploratory data analysis, visualization, and visual analytics

In exploratory data analysis (EDA) and visualization, movement patterns play a different role. Here, movement patterns are not necessarily the ultimate results, but rather stepping stones for supporting skilled analysts in their understanding of movement processes. EDA relies on the capability of humans to recognize complex visual patterns, as well as on intuition and trained judgment [67]. Movement pattern analysis enables analysts to detect the expected and discover the unexpected. Analysts may experiment with different flavors of patterns, systematically varying pattern parameters, shapes and granularities in order to understand the dynamic phenomenon under study. Such analytical reasoning is at the core of visual analytics, the science of analytical reasoning facilitated by interactive visual interfaces [83]. Visual analytics differs from EDA in that the data to be analyzed consist of massive data streams of multiple data types and from multiple sources, with the data typically even being conflicting and incomplete. The focus of research in visual analytics lies in the development of tools that “prepare and visualize the data so that the human analyst can detect various types of patterns by looking at the visual displays” [2, p. 117]. Movement pattern analysis in this context enables the creation of hypotheses and scenarios. The patterns support the analysts in examining these hypotheses and scenarios in the light of the available evidence [83].


5.3. Moving Object Databases

Conventional database management systems (DBMS) are not a priori capable of handling moving objects, but have been adapted and extended to do so [91]. The last decade has seen ample research on moving object databases (MOD), especially for real-time applications such as systems for air traffic control, taxicab fleet management or emergency response. Recent years have seen an amplified interest in MOD due to further technological advancements and emerging new mobile electronic commerce applications (e.g., location-based services). Movement patterns are an important ingredient for MODs as they allow the formalization and (real-time) detection of interesting events and processes, such as a dangerous near-miss incident in air traffic control. In this context, movement patterns are typically formalized as query statements, for example, in specifically developed derivatives of SQL allowing the inclusion of movement relations [82].
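In a MOD such a pattern would be phrased declaratively; the Python sketch below only emulates the logic a near-miss query encodes. The data layout (per-object dictionaries of timestamped fixes) and the distance threshold are illustrative assumptions, not an actual MOD interface:

```python
import math

def near_misses(fixes, dist_threshold):
    """fixes: {object_id: {timestamp: (x, y)}}.
    Report every pair of objects that comes closer than dist_threshold
    at any timestamp both objects were observed at."""
    events = []
    ids = sorted(fixes)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            for t in fixes[a].keys() & fixes[b].keys():  # shared timestamps
                if math.dist(fixes[a][t], fixes[b][t]) < dist_threshold:
                    events.append((t, a, b))
    return sorted(events)

obs = {"p1": {0: (0.0, 0.0), 1: (5.0, 0.0)},
       "p2": {0: (10.0, 0.0), 1: (5.4, 0.0)}}
print(near_misses(obs, 1.0))  # [(1, 'p1', 'p2')]
```

A real MOD would evaluate the same predicate incrementally over a stream of position updates rather than over stored dictionaries.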


5.4. Surveillance and motion imagery analysis

Video feeds are a further source of massive volumes of data about the movement of people, vehicles, or animals. Applications include surveillance and security, sports performance analysis, traffic management, and animal behavior science. The first challenge facing movement analysis from imagery data is the identification of individual trajectories, which is difficult due to possible occlusion and trajectory intersection [66]. Once trajectories are (re-)constructed, their clustering is an obvious pattern of interest (“Find all objects in a video feed whose movement trajectories are similar”) [75]. Classification of identifiable activities is of interest in surveillance applications (“single out car theft patterns in a car park video feed”, [58]), and ontologies for video event representation have been developed [63]. Video feed analysis in biomechanics and medicine also searches for patterns of human motion (here referring to the movement of the human body in three dimensions). Insights gained through such motion/movement pattern analysis aim to improve rehabilitation, diagnostics and sports [73]. However, such applications and the respective literature reach beyond the focus of this chapter and are not considered any further.

5.5. Behavioral ecology

In another strand of research, the movement patterns may be well known and evident. What is not known are the processes, perhaps the behavioral rules of the moving agents, that produce the patterns. A flock of foraging animals sometimes suddenly stops foraging and the individuals relocate altogether to another spot. Whereas the relocation pattern may be known, the rules governing or triggering such a group decision may be elusive [18]. Here, agent-based modeling may be used to “reproduce” a movement pattern, and thereby shed light on the behavioral rules controlling observed movement patterns.
The classification and prediction patterns discussed in section 4.3 typically follow such reasoning [7,6,72].

6. Limitations to movement pattern analysis

So far, the chapter has presented movement patterns as an appropriate means to cope with ever increasing volumes of movement data. Despite all the progress made in movement pattern analysis, there remain limitations to the use and application of movement patterns. This section gives a brief overview of six limitations.

6.1. Pattern definitions

The definition of movement patterns remains difficult. The commonsense idea of many movement patterns masks the difficulty of translating them into formalizable and detectable patterns. The definition of a pattern will often be more influenced by the tools at hand than by the actual needs of the users. A computational geometer will conceptualize a flock as a geometric structure, while the same flock is perceived by a database specialist as a query statement. For many dynamic processes involving human or animal decisions, mechanistic pattern definitions may fail to model complex behavior. Algorithms not “detecting the expected” are not necessarily faulty, but might just be based on inappropriate pattern definitions.

Furthermore, developers of new methods and tools may fall into the pitfall of “seductive sugarcoating”, labeling per se value-free movement patterns with a behavioral connotation. Such labeling (as in flock, trend-setter, or leadership) may lead to misinterpretation or over-interpretation of found patterns. The notion of flock can imply an intrinsic impetus of the agents involved to stick together that, in fact, may not be there; an individual appearing to be a trend-setter may just happen to move ahead of others in a dynamic process without actually anticipating or setting the trend at all.


6.2. First order vs. second order effects

For behavioral ecologists, it is indispensable that movement (behavior) is tightly knit to the environment embedding the movement [7]. Whereas the spatial information community makes good progress in promoting metadata for spatial information, much less has been achieved for spatio-temporal movement data, with typical trajectory data sets hardly containing more than time-stamped point coordinates. Neglecting the embedding geography when investigating patterns bears the danger of excluding important environmental factors that influence the observed movement. In many cases the observed movement patterns indeed are second order effects solely emerging from inter-object behavioral rules. In many other cases, however, first order effects due to the variability of the underlying environment are equally or even more important (see again the initial example with the flock on the bridge in Figure 1). By contrast, geography, as the study of the human-environment relationship, has a fair point in generally refusing “geodeterministic” reasoning, i.e. the idea that people (and their actions) are determined by their environment. Irrespective of perceiving movement patterns as first or second order effects, it is fair to say that the spatially explicit movement behavior of conscious agents is most likely a combination of internal and external stimuli – be it people, animals or computational agents. Excluding either impetus in movement pattern analysis risks “mismodeling” the phenomenon.

6.3. Pattern relevance

It has been recognized in the literature that data mining can produce a large number of obvious or irrelevant patterns [77,78,65]. Objective interestingness measures have been proposed that depend solely on the structure of a pattern and the data it is detected in.


Support and confidence are the most prominent such objective interestingness measures, reflecting the usefulness and the certainty of a discovered rule [46]. One could derive the movement rule “Objects that visited F afterward visited G” from the trajectories moving in the cell space in Figure 3e. The support of this rule is the number of trajectories that follow the rule (2 out of 4, or 1/2). The confidence is the number of trajectories that follow the rule as a fraction of the trajectories that could have followed the rule, i.e. all trajectories visiting F in the first place (2 out of 3, that is 2/3). Verhein and Chawla go one step further with their spatial support measures for such cell sequence trajectories, adapting the support of a rule with respect to the size of the visited cells [87].

Objective interestingness measures are useful in many contexts. However, it has been acknowledged that objective interestingness measures should be complemented with subjective measures assessing the patterns’ relevance to a potential user [78]. Hence, patterns are subjectively interesting to a user if the patterns are unexpected and the user can act on them to his/her advantage. Laube and Purves, for example, investigate the notion of unexpectedness of movement patterns, comparing the emergence of some given patterns in real movement data and corresponding Monte Carlo simulated movement data [57]. Be it objective or subjective relevance, the inclusion of some notion of relevance can only strengthen the impact of a pattern-based analytical approach for movement data.
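The support and confidence computation for the rule above takes only a few lines over cell-label strings. The four example trajectories below are made up to reproduce the counts in the text (3 of 4 visit F, 2 of those later visit G), they are not the actual trajectories of Figure 3e:

```python
def support_confidence(trajectories, antecedent, consequent):
    """Support: fraction of all trajectories that visit `antecedent` and
    later `consequent`. Confidence: same count relative to the
    trajectories visiting `antecedent` at all."""
    visited = [t for t in trajectories if antecedent in t]
    # consequent must occur strictly after the first antecedent visit
    follow = [t for t in visited if consequent in t[t.index(antecedent) + 1:]]
    return len(follow) / len(trajectories), len(follow) / len(visited)

trajs = ["AFG", "BFG", "CFH", "CDE"]
print(support_confidence(trajs, "F", "G"))  # (0.5, 0.6666666666666666)
```

Adapting this to Verhein and Chawla's spatial support would additionally weight each trajectory's contribution by the size of the cells it visits.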


6.4. Scale and granularity effects

Movement patterns can cover a wide range of spatial and temporal granularities, ranging from honeycombs for dancing bees (apidology, [74]) to continents for migrating caribou (behavioral ecology, [7,37,56]), and happen within the blink of an eye (eye tracking, [25]) just as well as last for a human’s lifetime (environmental health, [79]). It is generally agreed that movement pattern analysis is best performed at a granularity similar to the sampling rate [85,56]. However, in the light of the latest tracking technologies producing trajectories at millimeter and millisecond granularities, aggregation and generalization of movement data become vital. Even though we might have GPS logs in milliseconds, one might be interested in commuting or migration patterns covering daily or even seasonal rhythms. Even though trajectories are in principle open to any approach developed for line generalization, the simplification of trajectories certainly is a sensitive issue, as the characteristics of the captured movement should not be blurred (e.g. line simplification is ill-suited when sinuosity patterns are to be detected). Also, in many cases the movement patterns of individuals may be of less interest than an aggregate movement picture of a whole population. Apart from trajectory clusters, concepts such as flows or taxels (time voxels in the space-time cube) [33,34] offer alternatives for the aggregation of movement trajectories.

Often it will not a priori be evident at which granularity a certain movement phenomenon expresses its relevant patterns. Hence, shifting granularities may be vital for movement pattern analysis. For instance, moving back and forth among temporal granularities has been identified as a necessary knowledge discovery routine for spatio-temporal data mining. Kathleen Stewart Hornsby has introduced the notion of temporal zooming to do so [48].
A completely different, but rather elegant, idea for tackling the scale problem is the detection of scale-invariant patterns. Ramos-Fernández et al. [70] suggest that the foraging patterns of spider monkeys resemble the scale-invariant Lévy walk patterns known in physics.
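A crude form of the granularity shifting discussed above is to aggregate fine-grained fixes into one averaged fix per coarser time interval. The sketch below is an illustration of this idea, not Hornsby's temporal zooming formalism; the interval is assumed to be in the same time unit as the fixes:

```python
def temporal_zoom(fixes, interval):
    """Aggregate (t, x, y) fixes into one averaged fix per time interval.
    Returns a coarser trajectory, one fix per non-empty interval."""
    buckets = {}
    for t, x, y in fixes:
        buckets.setdefault(int(t // interval), []).append((x, y))
    return [(k * interval,
             sum(x for x, _ in pts) / len(pts),
             sum(y for _, y in pts) / len(pts))
            for k, pts in sorted(buckets.items())]

fine = [(0, 0, 0), (1, 2, 0), (10, 4, 4), (11, 6, 4)]
print(temporal_zoom(fine, 10))  # [(0, 1.0, 0.0), (10, 5.0, 4.0)]
```

Zooming out then simply means increasing the interval; zooming back in means returning to the stored fine-grained fixes.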


6.5. Data availability and privacy

There remain limitations with respect to movement data availability, even though the cost of reliably tracking individuals is ever decreasing and more and more movement data becomes available. First, privacy concerns limit, for good reasons, the general availability of individual movement data emerging from telecommunication applications [24,27,39]. Privacy protection with movement data is difficult, as precise location information is a quasi-identifier, allowing re-identification of previously anonymized data [9]. Revealing when an individual was where and for how long can reveal sensitive behaviors. Second, many available movement data sets do not represent the population of interest but rather only a subset of a population, sampled with more or less bias. For instance, GPS logs of users of location-aware devices may only mirror subscribed users of a certain service [41], potentially a special subset of people interested in technology. Any inference about general behavior patterns from such biased subsets is difficult. In behavioral ecology, attaching a GPS collar to an animal may significantly change its behavior, and again inference about that species’ general movement patterns should be made with care [86]. Finally, many of today’s movement pattern analysis techniques have been developed only for the small data sets available at the time. These approaches should be revisited and their performance reassessed in the light of growing data sets.

Since real observation data is often lacking or just not suited to a research question, there is ample research relying on synthetic trajectories generated through some simulation process. Simulations are certainly handy as they allow tight control of the moving agents involved, and the produced data can be tailored to the problem at hand. However, there is the danger of creating a simulation environment that is too simple, or in other ways misrepresents reality.


7. The Road Ahead

The availability of movement data will continue to increase with GPS navigation devices and GPS-enabled phones currently conquering the mass market. Movement data availability will further increase with emerging technologies for monitoring dynamic processes, such as geosensor networks or radio frequency identification (RFID) tracking technologies. RFID tags and related technologies may be the key to a better understanding of indoor movement, shedding light on movement spaces of high relevance in our everyday lives but largely unexplored so far due to tracking technologies constrained to the outdoor world. The MIT reality mining project pioneers this development [30]. Data from Bluetooth-enabled phones was used to recognize social patterns of daily activity of a set of office workers, infer their relationships, identify significant social locations, and model organizational rhythm [30]. Geosensor networks – ad-hoc wireless networks of sensor-enabled miniature computing platforms monitoring geospace [93] – may push movement pattern analysis beyond the analysis of moving points, as such systems are currently being equipped to monitor dynamic fields and detect their topological changes [28,92].

The spatial side of Web 2.0 offers further sources of novel movement data. Geotagged pictures reveal the photographer’s location; sequences of time-stamped and geotagged pictures allow the construction of individual trajectories. Whereas geotagging


so far remains largely a manual process, in the future geotags might be a ubiquitous byproduct of digital pictures and be widely available for movement pattern mining. Girardin et al. offer a first glimpse of this fascinating new source of movement data by reconstructing the digital footprints of tourists on their way around Rome from Flickr imagery data [40]. Without doubt such user-generated content, so-called volunteered geographic information [42], provided by private citizens largely without formal mandate, will increasingly complement conventional sources of movement data.

Movement pattern analysis will furthermore embrace the current trend of blurring the boundary between data capture and data processing in knowledge discovery [50]. Growing data volumes, limited communication bandwidth, and privacy concerns dictate that data collected at different nodes be analyzed in a decentralized fashion, without collecting everything at a central site [20]. Consequently, the discipline of Distributed Data Mining (DDM) has evolved to find patterns and rules from distributed and heterogeneous data using minimal communication [50]. With my work with co-authors on the decentralized detection of movement patterns amongst the roaming nodes of a mobile geosensor network, I aimed to introduce the concept of distributed data mining to the problem of movement pattern detection [55].

The chapter concludes with a vision of ambient spatial intelligence (AmSI), extending the concept of ambient intelligence (AmI) in ubiquitous computing [3]. Imagine a world of distributed and mobile spatial computing, where the days of spatial data processing in a monolithic GIS are numbered. Highly mobile human users will interact in dynamic networks and profit from ubiquitous access to AmSI.
Furthermore, spatially distributed autonomous computing nodes will increasingly invade various technology fields of immense socio-economic significance, including environmental monitoring, health and medicine, and industrial repair and maintenance. AmSI envisions distributed and mobile spatial computing systems that allow highly dynamic human and non-human users to ubiquitously access spatial data and the required analysis capability. Some applications of movement patterns in such pervasively instrumented natural and built environments include:

• vehicle board computers collaborating in an ad-hoc and distributed way and issuing traffic jam warnings to their drivers before the congestion actually builds up;
• fire fighters who are automatically alerted when one team member falls behind in hostile environments with poor visibility; or
• smart farming applications where flocks of animal stock are automatically stimulated to relocate such that some desired grazing or manuring pattern can be achieved.

The interested reader is referred to the respective chapter in this volume where the idea of AmSI is elaborated upon in more detail.
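Several of these scenarios (the fire-fighter alert, the livestock application) rest on detecting flock-like proximity among tracked nodes. A minimal centralized sketch of the basic geometric test, simplified relative to the flock definitions cited earlier in this chapter by naively centering the disc on each object in turn, might look like:

```python
import math

def flock_at(positions, radius, min_size):
    """positions: {object_id: (x, y)} for one timestamp.
    Return groups of at least min_size objects that all lie within
    `radius` of some member (that member acting as the disc center)."""
    flocks = []
    for center in positions.values():
        members = {i for i, p in positions.items()
                   if math.dist(center, p) <= radius}
        if len(members) >= min_size and members not in flocks:
            flocks.append(members)
    return flocks

pos = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.0), "d": (10.0, 10.0)}
print(flock_at(pos, 2.0, 3))  # one flock containing a, b and c
```

A decentralized variant, as pursued in [55], would let each node evaluate only its own neighborhood from local radio contacts instead of a global position table.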


8. Conclusions

Recent years have seen a rapid increase of interest in movement pattern analysis in the fields of GIScience, data mining and knowledge discovery, computational geometry, exploratory data analysis, and visual analytics. Given the wide range of involved disciplines and applications and the relative youth of the research field, progress in movement pattern analysis is still a matter of individual contributions, and the development of an underlying theory remains fragmentary. Furthermore, many methods are developed for, and tested on, particular or even simulated data. Movement pattern analysis has yet to demonstrate convincing commercial and social benefits [10]. Finally, a concluding research agenda lists several key research and development frontiers:

• Develop a general theory of movement patterns. This includes further efforts in defining generic movement patterns as well as taxonomic efforts.
• Intensify the metadata discussion for movement data (underlying movement space models, geographic context).
• Identify benchmark data sets for movement pattern analysis. The benchmark data sets should be annotated and the benchmark tasks clearly delineated.
• Close the gap between research developing movement analysis techniques and the users of such methods.
• Complement the notion of crisp patterns (pattern vs. no pattern) with fuzzy notions of movement patterns or probabilistic approaches (found pattern with probability p).
• Develop movement pattern analysis approaches that respect and protect the privacy of the users of mobile ICT applications.

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

Acknowledgements

Large parts of section 6 are based on a paper entitled “Pitfalls in the analysis of moving object data” that Patrick gave with Ross Purves, The University of Zürich, at Seminar 08451, “Representation, Analysis and Visualization of Moving Objects”, held from 2 to 7 November 2008 at Schloss Dagstuhl, Leibniz-Zentrum für Informatik, Germany [10]. Patrick would furthermore like to acknowledge the organizers and the attendees of the Seminar, as the vivid discussions at beautiful Schloss Dagstuhl helped to shape the research agenda in section 8. Patrick’s work on DeSC is a collaboration with Matt Duckham, The University of Melbourne, and was partly funded by the Australian Research Council (ARC), Discovery grant DP0662906, and the ARC Research Network on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP). Finally, Patrick wishes to thank Matt Duckham and an anonymous reviewer for their invaluable comments.

References

[1] M. Andersson, J. Gudmundsson, P. Laube, and T. Wolle. Reporting leadership patterns among trajectories. In 22nd Annual ACM Symposium on Applied Computing, pages 3–7, Seoul, Korea, 2007.





[2] N. V. Andrienko and G. L. Andrienko. Designing visual analytics methods for massive collections of movement data. Cartographica, 42(2):117–138, 2007.
[3] J. C. Augusto and D. Shapiro, editors. Advances in Ambient Intelligence, volume 164 of Frontiers in Artificial Intelligence and Applications. IOS Press, 2007.
[4] J. A. Beecham and K. D. Farnsworth. Animal foraging from an individual perspective: an object oriented model. Ecological Modelling, 113(1–3):141–156, 1998.
[5] M. Benkert, J. Gudmundsson, F. Hübner, and T. Wolle. Reporting flock patterns. Computational Geometry, 41(3):111–125, 2008.
[6] U. Berger, G. Wagner, and W. F. Wolff. Virtual biologists observe virtual grasshoppers: an assessment of different mobility parameters for the analysis of movement patterns. Ecological Modelling, 115(12):119–127, 1999.
[7] C. M. Bergman, J. A. Schaefer, and S. N. Luttich. Caribou movement as a correlated random walk. Oecologia, 123(3):364–374, 2000.
[8] M. J. A. Berry and G. Linoff. Mastering Data Mining: The Art and Science of Customer Relationship Management. John Wiley and Sons, Inc., New York, 2000.
[9] C. Bettini, X. Wang, and S. Jajodia. Protecting privacy against location-based personal identification. In W. Jonker and M. Petkovic, editors, Secure Data Management, volume 3674 of Lecture Notes in Computer Science, pages 185–199. Springer, Heidelberg, 2005.
[10] W. Bitterlich, J.-R. Sack, M. Sester, and R. Weibel. 08451 summary report – representation, analysis and visualization of moving objects. In W. Bitterlich, J.-R. Sack, M. Sester, and R. Weibel, editors, Representation, Analysis and Visualization of Moving Objects, Dagstuhl Seminar Proceedings, Dagstuhl, Germany, 2009. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany.
[11] L. Borger, N. Franconi, G. De Michele, A. Gantz, F. Meschi, A. Manica, S. Lovari, and T. Coulson. Effects of sampling regime on the mean and variance of home range size estimates. Journal of Animal Ecology, 75(6):1393–1405, 2006.
[12] M. Bottai, N. Salvati, and N. Orsini. Multilevel models for analyzing people’s daily movement behavior. Journal of Geographical Systems, 8(1):97–108, 2006.
[13] T. Brinkhoff. A framework for generating network-based moving objects. GeoInformatica, 6(2):153–180, 2002.
[14] K. Buchin, M. Buchin, and J. Gudmundsson. Detecting single file movement. In Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Irvine, California, 2008. ACM.
[15] C. Burstedde, K. Klauck, A. Schadschneider, and J. Zittartz. Simulation of pedestrian dynamics using a two-dimensional cellular automaton. Physica A, 295(3-4):507–525, 2001.
[16] W. H. Burt. Territoriality and home range concepts as applied to mammals. Journal of Mammalogy, 24:346–352, 1943.
[17] J. Carter and J. T. Finn. MOAB: a spatially explicit, individual-based expert system for creating animal foraging models. Ecological Modelling, 199(1):29–41, 1999.
[18] L. Conradt and T. J. Roper. Group decision-making in animals. Nature, 421(6919):155–158, 2003.
[19] N. Correll and A. Martinoli. Collective inspection of regular structures using a swarm of miniature robots. In Jr. Ang, H. Marcelo, and O. Khatib, editors, Experimental Robotics IX, The 9th International Symposium on Experimental Robotics (ISER), Singapore, June 18-21, volume 21 of Springer Tracts in Advanced Robotics, pages 375–385. Springer, Heidelberg, 2006.
[20] S. Datta, K. Bhaduri, C. Giannella, H. Kargupta, and R. Wolff. Distributed data mining in peer-to-peer networks. IEEE Internet Computing, 10(4):18–26, 2006.
[21] M. D’Auria, M. Nanni, and D. Pedreschi. Time-focused density-based clustering of trajectories of moving objects. In Workshop on Mining Spatio-temporal Data (MSTD-2005), Porto, 2005.
[22] V. T. de Almeida and R. H. Güting. Indexing the trajectories of moving objects in networks. GeoInformatica, 9(1):33–60, 2005.
[23] J. Dijkstra, A. J. Jessurun, and H. J. P. Timmermans. A multi-agent cellular automata system for visualising simulated pedestrian activity. In S. Bandini and T. Worsch, editors, Theoretical and Practical Issues on Cellular Automata, Proceedings of the Fourth International Conference on Cellular Automata for Research and Industry, pages 29–36. Springer, Heidelberg, 2000.
[24] J. E. Dobson and P. F. Fisher. Geoslavery. IEEE Technology and Society Magazine, 22(1):47–52, 2003.
[25] S. Dodge, R. Weibel, and A.-K. Lautenschütz. Towards a taxonomy of movement patterns. Information Visualization, 7(3-4):240–252, 2008.





[26] C. Du Mouza and P. Rigaux. Mobility patterns. GeoInformatica, 9(4):297–319, 2005.
[27] M. Duckham and L. Kulik. Simulation of obfuscation and negotiation for location privacy. In Spatial Information Theory (COSIT 2005), volume 3693 of Lecture Notes in Computer Science, pages 31–48. Springer, Heidelberg, 2005.
[28] M. Duckham, S. Nittel, and M. Worboys. Monitoring dynamic spatial fields using responsive geosensor networks. In C. Shahabi and O. Boucelma, editors, ACM GIS, pages 51–60. ACM Press, 2005.
[29] J. A. Dykes and D. M. Mountain. Seeking structure in records of spatio-temporal behaviour: visualization issues, efforts and application. Computational Statistics and Data Analysis, 43(4):581–603, 2003.
[30] N. Eagle and A. Pentland. Reality mining: sensing complex social systems. Personal and Ubiquitous Computing, 10(4):255–268, 2006.
[31] M. Erwig and R. H. Güting. Spatio-temporal data types: An approach to modeling and querying moving objects in databases. GeoInformatica, 3(3):269–296, 1999.
[32] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth. From data mining to knowledge discovery in databases. AI Magazine, 17(3):37–54, 1996.
[33] P. Forer. Geometric approaches to the nexus of time, space, and microprocess: Implementing a practical model for mundane socio-spatial systems. In M. J. Egenhofer and R. G. Golledge, editors, Spatial and Temporal Reasoning in Geographic Information Systems, pages 171–190. Oxford University Press, Oxford, UK, 1998.
[34] P. Forer, H. F. Chen, and J. F. Zhao. Building, unpacking and visualising human flows with GIS. In Proceedings of the GIS Research UK 12th Annual Conference, pages 334–336, Norwich, UK, 2004. University of East Anglia.
[35] A. U. Frank. Different types of times in GIS. In M. J. Egenhofer and R. G. Golledge, editors, Spatial and Temporal Reasoning in Geographic Information Systems, pages 40–62. Oxford University Press, Oxford, UK, 1998.
[36] A. U. Frank. Socio-economic units: Their life and motion. In A. U. Frank, J. Raper, and J. P. Cheylan, editors, Life and Motion of Socio-economic Units, volume 8 of GISDATA, pages 21–34. Taylor & Francis, London, UK, 2001.
[37] A. Franke, T. Caelli, and R. J. Hudson. Analysis of movements and behavior of caribou (Rangifer tarandus) using hidden Markov models. Ecological Modelling, 173(4):259–270, 2004.
[38] A. Galton. Dynamic collectives and their collective dynamics. In A. G. Cohn and D. M. Mark, editors, Spatial Information Theory, Proceedings, volume 3693 of Lecture Notes in Computer Science, pages 300–315. Springer, Heidelberg, 2005.
[39] F. Giannotti and D. Pedreschi. Mobility, data mining and privacy: A vision of convergence. In F. Giannotti and D. Pedreschi, editors, Mobility, Data Mining and Privacy, pages 1–11. Springer, Heidelberg, 2008.
[40] F. Girardin, F. Calabrese, F. Dal Fiore, C. Ratti, and J. Blat. Digital footprinting: Uncovering tourists with user-generated content. IEEE Pervasive Computing, 7(4):36–43, 2008.
[41] M. C. Gonzalez, C. A. Hidalgo, and A. L. Barabasi. Understanding individual human mobility patterns. Nature, 453(7196):779–782, 2008.
[42] M. F. Goodchild. Citizens as sensors: the world of volunteered geography. GeoJournal, 69(4):211–221, 2007.
[43] J. Gudmundsson, M. van Kreveld, and B. Speckmann. Efficient detection of patterns in 2D trajectories of moving points. GeoInformatica, 11(2):195–215, 2007.
[44] L. J. Guibas. Sensing, tracking and reasoning with relations. IEEE Signal Processing Magazine, 19(2):73–85, 2002.
[45] T. Hägerstrand. What about people in regional science? Papers of the Regional Science Association, 24:7–21, 1970.
[46] J. Han and M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, Amsterdam, 2nd edition, 2006.
[47] D. J. Hand, H. Mannila, and P. Smyth. Principles of Data Mining. MIT Press, Cambridge, MA, 2001.
[48] K. Hornsby. Temporal zooming. Transactions in GIS, 5(3):255–272, 2001.
[49] Y. Huang, C. Chen, and P. Dong. Modeling herds and their evolvements from trajectory data. In T. J. Cova, K. Beard, M. F. Goodchild, and A. U. Frank, editors, Geographic Information Science, GIScience 2008, volume 5266 of Lecture Notes in Computer Science, pages 90–105. Springer, Heidelberg, 2008.
[50] H. Kargupta and P. Chan. Distributed and parallel data mining: A brief introduction. In H. Kargupta and P. Chan, editors, Advances in Distributed and Parallel Knowledge Discovery, pages xv–xxvi. AAAI



Press / The MIT Press, Menlo Park, CA, 2000.
[51] S. Y. Kim, R. Maciejewski, K. Ostmo, E. J. Delp, T. F. Collins, and D. S. Ebert. Mobile analytics for emergency response and training. Information Visualization, 7(1):77–88, 2008.
[52] M. P. Kwan. Interactive geovisualization of activity-travel patterns using three-dimensional geographical information systems: a methodological exploration with a large data set. Transportation Research Part C, 8(1-6):185–203, 2000.
[53] J. S. Larson, E. T. Bradlow, and P. S. Fader. An exploratory look at supermarket shopping paths. International Journal of Research in Marketing, 22(4):395–414, 2005.
[54] P. Laube, T. Dennis, M. Walker, and P. Forer. Movement beyond the snapshot – dynamic analysis of geospatial lifelines. Computers, Environment and Urban Systems, 31:481–501, 2007.
[55] P. Laube, M. Duckham, and T. Wolle. Decentralized movement pattern detection amongst mobile geosensor nodes. In T. J. Cova, K. Beard, M. F. Goodchild, and A. U. Frank, editors, Geographic Information Science, GIScience 2008, volume 5266 of Lecture Notes in Computer Science, pages 199–216. Springer, Heidelberg, 2008.
[56] P. Laube, S. Imfeld, and R. Weibel. Discovering relative motion patterns in groups of moving point objects. International Journal of Geographical Information Science, 19(6):639–668, 2005.
[57] P. Laube and R. S. Purves. An approach to evaluating motion pattern detection techniques in spatio-temporal data. Computers, Environment and Urban Systems, 30(3):347–374, 2006.
[58] D. Mahajan, N. Kwatra, S. Jain, P. Kalra, and S. Banerjee. A framework for activity recognition and detection of unusual activities. In Proceedings of the Fourth Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2004), pages 15–21, 2004.
[59] H. J. Miller. Modelling accessibility using space-time prism concepts within geographical information systems. International Journal of Geographical Information Systems, 5(3):287–301, 1991.
[60] H. J. Miller and Y. H. Wu. GIS software for measuring space-time accessibility in transportation planning and analysis. GeoInformatica, 4(2):141–159, 2000.
[61] J. M. Morales and S. P. Ellner. Scaling up animal movements in heterogeneous landscapes: The importance of behavior. Ecology, 83(8):2240–2247, 2002.
[62] K. Nagel, J. Esser, and M. Rickert. Large-scale traffic simulations for transportation planning. Annual Reviews of Computational Physics VII, pages 151–202, 2000.
[63] R. Nevatia, J. Hobbs, and B. Bolles. An ontology for video event representation. In Conference on Computer Vision and Pattern Recognition Workshop, 2004. CVPRW ’04, pages 119–119, 2004.
[64] D. O’Sullivan and D. J. Unwin. Geographic Information Analysis. John Wiley and Sons, Hoboken, NJ, 2003.
[65] B. Padmanabhan. The interestingness paradox in pattern discovery. Journal of Applied Statistics, 31(8):1019–1035, 2004.
[66] P. Partsinevelos, P. Agouris, and A. Stefanidis. Reconstructing spatiotemporal trajectories from sparse data. ISPRS Journal of Photogrammetry and Remote Sensing, 60(1):3–16, 2005.
[67] D. J. Peuquet. Representation of Space and Time. The Guilford Press, London, UK, 2002.
[68] D. Pfoser and C. S. Jensen. Indexing of network constrained moving objects, 2003.
[69] F. Porikli. Trajectory distance metric using hidden Markov model based representation. In IEEE European Conference on Computer Vision, Workshop on PETS, Prague, 2004.
[70] G. Ramos-Fernández, J. L. Mateos, O. Miramontes, G. Cocho, H. Larralde, and B. Ayala-Orozco. Lévy walk patterns in the foraging movement of spider monkeys (Ateles geoffroyi). Behavioral Ecology and Sociobiology, 55(3):223–230, 2004.
[71] S. Rinzivillo, D. Pedreschi, M. Nanni, F. Giannotti, N. V. Andrienko, and G. L. Andrienko. Visually driven analysis of movement data by progressive clustering. Information Visualization, 7(3-4):225–239, 2008.
[72] F. G. Schmitt and L. Seuront. Multifractal random walk in copepod behaviour. Physica A, 301(1-4):375–396, 2001.
[73] S. Sclaroff, G. Kollios, M. Betke, and R. Rosales. Motion mining. In Multimedia Databases and Image Communication: Second International Workshop, MDIC 2001, Amalfi, Italy, September 17-18, 2001, volume 2184 of Lecture Notes in Computer Science, page 16. Springer, 2001.
[74] T. D. Seeley and P. K. Visscher. Group decision making in nest-site selection by honey bees. Apidologie, 35(2):101–116, 2004.
[75] C. B. Shim and J. W. Chang. Efficient similar trajectory-based retrieval for moving objects in video databases. In Image and Video Retrieval, Proceedings, volume 2728 of Lecture Notes in Computer Science, pages 163–173. Springer, Heidelberg, 2003.



[76] T. Shirabe. Correlation analysis of discrete motions. In Geographic Information Science, Proceedings, volume 4197 of Lecture Notes in Computer Science, pages 370–382. Springer, Heidelberg, 2006.
[77] A. Silberschatz and A. Tuzhilin. On subjective measures of interestingness in knowledge discovery. In Proc. of the 1st Int. Conf. on Knowledge Discovery and Data Mining, Montreal, pages 275–281, 1995.
[78] A. Silberschatz and A. Tuzhilin. What makes patterns interesting in knowledge discovery systems. IEEE Transactions on Knowledge and Data Engineering, 8(6):970–974, 1996.
[79] G. Sinha and D. M. Mark. Measuring similarity between geospatial lifelines in studies of environmental health. Journal of Geographical Systems, 7(1):115–136, 2005.
[80] A. P. Sistla, O. Wolfson, S. Chamberlain, and S. Dao. Querying the uncertain position of moving objects. In O. Etzion, S. Jajodia, and S. Sripada, editors, Temporal Databases – Research and Practice, volume 1399 of Lecture Notes in Computer Science, pages 310–337. Springer, Heidelberg, 1998.
[81] C. S. Smyth. Mining mobile trajectories. In H. J. Miller and J. Han, editors, Geographic Data Mining and Knowledge Discovery, pages 337–361. Taylor and Francis, London, UK, 2001.
[82] K. Stewart Hornsby and K. King. Modeling motion relations for moving objects on road networks. GeoInformatica, 12(4):477–495, 2008.
[83] J. J. Thomas and K. A. Cook. A visual analytics agenda. IEEE Computer Graphics and Applications, 26(1):10–13, 2006.
[84] G. Trajcevski, O. Wolfson, K. Hinrichs, and S. Chamberlain. Managing uncertainty in moving objects databases. ACM Transactions on Database Systems (TODS), 29(3):463–507, 2004.
[85] P. Turchin. Quantitative Analysis of Movement: Measuring and Modelling Population Redistribution in Animals and Plants. Sinauer Publishers, Sunderland, MA, 1998.
[86] F. A. M. Tuyttens, D. W. Macdonald, and A. W. Roddam. Effects of radio-collars on European badgers (Meles meles). Journal of Zoology, 257(1):37–42, 2002.
[87] F. Verhein and S. Chawla. Mining spatio-temporal patterns in object mobility databases. Data Mining and Knowledge Discovery, 16(1):5–38, 2007.
[88] M. Vlachos, D. Gunopulos, and G. Kollios. Robust similarity measures for mobile object trajectories. In 13th International Workshop on Database and Expert Systems Applications, pages 721–728. IEEE Computer Society, 2002.
[89] S. Watanabe. Pattern Recognition: Human and Mechanical. John Wiley & Sons, New York, NY, 1985.
[90] E. A. Wentz, A. F. Campell, and R. Houston. A comparison of two methods to create tracks of moving objects: linear weighted distance and constrained random walk. International Journal of Geographical Information Science, 17(7):623–645, 2003.
[91] O. Wolfson and E. Mena. Applications of moving objects databases. In Y. Manolopoulos, A. Papadopoulos, and M. Vassilakopoulos, editors, Spatial Databases: Technologies, Techniques and Trends. Idea Group Co., 2004.
[92] M. Worboys and M. Duckham. Monitoring qualitative spatiotemporal change for geosensor networks. International Journal of Geographical Information Science, 20(10):1087–1108, 2006.
[93] F. Zhao and L. J. Guibas. Wireless Sensor Networks – An Information Processing Approach. Morgan Kaufmann Publishers, San Francisco, CA, 2004.






Representing and Reasoning About Movement Behaviours





Behaviour Monitoring and Interpretation – BMI
B. Gottfried and H. Aghajan (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-048-3-75


Interpretation of Behaviours from a Viewpoint of Topology

Yohei KURATA a,1 and Max J. EGENHOFER b
a SFB/TR 8 Spatial Cognition, Universität Bremen, Germany
b National Center for Geographic Information and Analysis, University of Maine, USA

Abstract. Behavioural monitoring often concerns the interpretation of the motion of an agent with respect to an area of interest. Geometrically, the trajectory of the agent is represented by a directed line segment (DLine) around/over a region. Topological relations between a DLine and a region concern how the DLine intersects with the region and, therefore, these relations are useful for characterizing the motion in association with an area of interest. In this chapter, we introduce a formal model of topological DLine-region relations and the application of these relations to the characterization of motions. We start from a model of topological relations between a non-directed line and a region, called the 9-intersection, reviewing how these line-region relations are associated with spatial predicates. Then, we introduce the 9+-intersection, which distinguishes 26 topological DLine-region relations in R2, and with it we explore several approaches to associating these relations with human motion concepts. Finally, we introduce two future research questions: (i) the simplification of complex trajectories by segmentation and (ii) the mapping between non-planar motion concepts and topological DLine-region relations in R3.


Keywords. motion, motion concepts, topological relations, 9-intersection, 9+-intersection, conceptual neighborhood graphs, spatial predicates

Introduction

Imagine that you are a lifeguard watching a swimming pool for children. Some children swim from one side to another side. Some may wander around the center of the pool. Some may jump into the pool, or enter it calmly. Some may hang on the edge and creep along it. Some may cautiously approach the edge, but then turn back. Now, how many kinds of motions can we distinguish? To simplify the discussion, let us ignore metric details, such as where each child enters the pool or how far the child swims. Instead, we focus on the transition of the child’s location―how he/she moves between the inside, outside, and edge of the pool. Geometrically, the trajectory of a child with respect to a pool is represented by a directed line segment (for short, DLine [19]) around/over a region. For instance, when a child swims from one side to another side of the pool, his/her trajectory is represented by the DLine-region configuration in Figure 1a. Similarly, Figures 1b-1f illustrate the different patterns of motions introduced before.

1 Corresponding Author: Yohei Kurata, SFB/TR 8 Spatial Cognition, Universität Bremen, Postfach 330 440, 28334 Bremen, Germany; [email protected].



Y. Kurata and M.J. Egenhofer / Interpretation of Behaviours from a Viewpoint of Topology


Figure 1. Examples of motion patterns with respect to a swimming pool: (a) swimming from one side to another side, (b) wandering around the center, (c) jumping into the pool, (d) entering the pool calmly, (e) creeping along the edge, and (f) turning back at the edge.

Here we would like to think about the topological relations between the DLine and the region (in short, topological DLine-region relations) in these configurations. Topological relations are the spatial relations between two objects that are distinguished by the properties of the spatial arrangement that do not change under topological transformations (i.e., continuous transformations such as translation, rotation, and scaling) [4]. Intuitively speaking, topological relations concern how the objects intersect with each other. Consequently, topological DLine-region relations highlight whether the DLine starts from, passes through, and ends at the inside, outside, or border of the region. Naturally, we can expect that these topological DLine-region relations can be used for capturing some fundamental characteristics of motions [20]. In this chapter, we apply the topological DLine-region relations to characterizing the motion of an agent with respect to an area of interest. Behavioural monitoring often concerns such motions―for instance, the motion of an impaired person in/around a physically demanding area, that of a suspicious individual in/around a security-enforced area, and that of an animal in/around its territory. In this chapter, we identify the set of topological DLine-region relations in a formal way. Then, we discuss the semantics of the motions represented by these topological DLine-region relations. Naturally, this discussion concerns the expressions of motions in our natural language, such as enter and go through. Even though not all DLine-region relations are assigned such expressions in a simple way, the set of topological DLine-region relations is expected to serve as a set of basic primitives with which we can characterize the motions of an agent.
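The location-transition view can be made concrete with a small sketch. The following Python code is not from the chapter: it assumes a disc-shaped area of interest and a trajectory given as sampled points, and all function names are our own.

```python
import math

def locate(point, center, radius, eps=1e-9):
    """Classify a point as 'inside', 'border', or 'outside' a disc region."""
    d = math.dist(point, center)
    if abs(d - radius) <= eps:
        return "border"
    return "inside" if d < radius else "outside"

def transitions(trajectory, center, radius):
    """Collapse a sampled trajectory into its sequence of location labels,
    dropping consecutive repeats; only the topological transitions remain."""
    seq = []
    for p in trajectory:
        label = locate(p, center, radius)
        if not seq or seq[-1] != label:
            seq.append(label)
    return seq

# A swimmer crossing a pool of radius 2 centred at the origin:
path = [(-3.0, 0.0), (-2.0, 0.0), (0.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(transitions(path, (0.0, 0.0), 2.0))
# ['outside', 'border', 'inside', 'border', 'outside']
```

With sparse samples a border crossing can fall between two consecutive points, so in practice the trajectory must be sampled densely enough (or intersected analytically with the region's border) for the border label to appear.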
Before discussing the use of topological DLine-region relations, we review various approaches to the qualitative characterization of motions and explain why such characterization is necessary (Section 1). Then, we start the discussion about motion characterization from a topological viewpoint. As a basis for the later discussion, we first look at topological relations between non-directed lines and regions (in short, topological line-region relations)―how they can be modeled and associated with spatial predicates (Section 2). Based on this review, Section 3 develops a formal model of topological DLine-region relations and discusses how we can associate these DLine-region relations with the concepts of motions. Sections 4 and 5 raise two open questions: how we should simplify complex trajectories by appropriate segmentation, and how we characterize non-planar motions, such as jumping into the pool, using three-dimensional topological DLine-region relations. Finally, Section 6 concludes with a discussion of future problems.


1. Characterization of Motions

The development of tracking devices, such as GPS and RFID, and the resulting expansion of tracking data increase the necessity to convert trajectory data into human-understandable forms of representation, such that people can analyze and reason with the data. To realize such conversions (i.e., qualitative characterizations of motions), various different approaches have been taken. One key difference is the number of agents whose motion is observed: a single agent, a pair of agents, or a collective of agents. Single-agent approaches are relatively straightforward―they try to describe the motion of an agent in a similar way as people do [18,16,26,17,10,20,32]. The pair-based approaches typically consider the relative motion of an agent with respect to the partner, such as going toward, against, rightward, or leftward of the partner [35,3,11]. The collective-based approaches may compare the motions of agents as individuals and as a mass, through which individual-mass dynamisms [37,2] or the concurrency of similar motions [22] (Chapter 3) are analyzed. A number of studies featuring a single agent characterize the motion of the agent in association with the environment (although the motion of a single agent itself can be characterized without context [26]). For instance, Krüger et al. [18] and Kray et al. [17] model concepts, such as (go) along and past, which characterize the path in association with a spatially-extended landmark. Kray and Blocher [16] consider the modeling of a greater variety of motion concepts based on the increase/decrease/stability of distance and angle against the landmark. Gottfried and Witte [10] defined topologically contextualized motion patterns, where the motions are described as a sequence of combinations of the topological location (e.g., leftPenaltyArea, leftHalf, etc.) and the motion state (e.g., in motion or stable). Kurata and Egenhofer [20] associated the topological DLine-region relations with the characterization of motions. Shi and Kurata [32] modeled various ontologically-defined motion concepts using a double-cross-like frame of spatial reference [7]. In addition, some robotics groups have tried to make mobile robots understand human-made natural route instructions [34,31,23], and their studies are tightly related to the characterization of motions.

In this chapter, we will target the movement of a single agent (or possibly a group that can be regarded as a single entity because of its spatial concentration and stability) in association with a two-dimensional area of interest, following the approach by Kurata and Egenhofer [20]. The agent is modeled as a moving point around/over a region that models the area of interest. The trajectory of the moving point forms a non-branching directed line over/around the region. To simplify the discussion, we consider that the region is simple (i.e., a single-component region without holes, disconnected interiors, spikes, or cuts [29]), even though the presence of holes in regions does not matter for the discussion. In the remainder of this chapter, non-branching non-directed lines, non-branching directed lines, and simple regions are referred to simply as lines, DLines, and regions. Let us assume that we can identify the start point and the end point of each DLine, even though they may not be distinct if the DLine forms a loop. Why do we have to characterize motions in a qualitative way? One answer is the application in human-machine interfaces. We want to make it possible for ordinary people




to communicate an action plan to a mobile robot or a spatial information system through natural language. We also want to make it possible for a robot or a spatial information system to describe an action plan or report the motion of a certain agent observed by sensors. Another important application is the support of rule specifications by human domain experts. For instance, in a security monitoring system, a security manager may want to input rules for controlling security devices, such as goAlongEdge(x, zoneA) → startCameraRecording(zoneA). Similarly, in a smart home environment, the room designer may want to specify a rule like goInto(x, roomA) ∧ time(night) → turnOn(lightL). Without motion predicates like goAlongEdge and goInto, however, the specification of these rules is not a trivial task. Rule specification is also important for analyzing spatial trajectory data. Trajectory data easily become overwhelming when the observation continues for a long period or the number of observed targets is large. Thus, computational aid is necessary for analyzing such data. For instance, REMO [22] allows the rule-based detection of continuing or concurrent actions on multi-object trajectory data. As an example set of primitive actions, Laube et al. [22] used eight-directional movements. Alternatively, we can use a set of topologically-defined actions and analyze the trajectory data in terms of how moving agents interact with specific areas―for instance, to detect the situation when a certain number of agents goes into a certain area at the same time. In this study, we will provide a basic set of spatial actions from a topological point of view. These actions are closely related to the human conceptualization of motions and, accordingly, are useful for the specification of rules by people.
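To illustrate the rule-specification idea, here is a hypothetical Python sketch. It assumes that motions have already been reduced to sequences of inside/border/outside labels; the predicate and action names mirror the examples in the text, but the implementation is our own and not part of the chapter.

```python
def go_into(seq):
    """The trajectory starts outside the area and ends inside it."""
    return seq[0] == "outside" and seq[-1] == "inside"

def go_along_edge(seq):
    """The trajectory touches the border without entering the interior."""
    return "border" in seq and "inside" not in seq

def evaluate(seq, time_of_day):
    """A tiny rule table mapping motion predicates (plus context) to actions."""
    actions = []
    if go_along_edge(seq):
        actions.append("startCameraRecording(zoneA)")
    if go_into(seq) and time_of_day == "night":
        actions.append("turnOn(lightL)")
    return actions

print(evaluate(["outside", "border", "outside"], "day"))    # ['startCameraRecording(zoneA)']
print(evaluate(["outside", "border", "inside"], "night"))   # ['turnOn(lightL)']
```

A real system would replace the string-valued actions with callbacks into the device layer, but the point is that rules become declarative once the motion predicates exist.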

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

2. Topological Line-Region Relations and Spatial Predicates

Topological line-region relations capture how a line intersects with a region. For instance, imagine a simple map that illustrates a road and a park by a line and a region, respectively. In natural language we can describe their arrangement in various ways; for instance, the road goes into the park, goes across the park, goes along the park, and so on. As these expressions indicate, the presence of a linear entity, like a road, evokes a virtual image of movement, even though the entity itself is static. Talmy [33] called this fictive motion, using examples like "the mountain range goes from Canada to Mexico." This is why we believe topological line-region relations are relevant for the modeling of motion concepts. So, before considering topological DLine-region relations, we start the discussion from topological line-region relations.

2.1. Representation of Topological Line-Region Relations

In spatial database studies, topological relations are often distinguished by the 9-intersection [5] or its extensions (e.g., [20,27,9]). In the 9-intersection, topological relations between two spatial objects A and B are characterized by the 3 × 3 pairwise intersections between A's three topological parts (interior, boundary, and exterior) and B's three topological parts. The interior, boundary, and exterior of each object are defined based on point-set topology [1]. The interior of a spatial object X, denoted X°, is the union of all open sets contained in X; X's boundary ∂X is the difference of X's closure


(the intersection of all closed point sets that contain X) and X°, and X's exterior X⁻ is the complement of X's closure. The 9-intersection matrix in Eq. (1) concisely represents the 3 × 3 types of intersections between A and B. In the most basic approach, topological relations are distinguished by the presence or absence of these 3 × 3 intersections; in other words, topological relations are distinguished by the emptiness/non-emptiness patterns of the 9-intersection matrix.

$$M(A, B) = \begin{pmatrix} A^{\circ} \cap B^{\circ} & A^{\circ} \cap \partial B & A^{\circ} \cap B^{-} \\ \partial A \cap B^{\circ} & \partial A \cap \partial B & \partial A \cap B^{-} \\ A^{-} \cap B^{\circ} & A^{-} \cap \partial B & A^{-} \cap B^{-} \end{pmatrix} \quad (1)$$

Topological relations between a line L and a region R are characterized by the 9-intersection matrix in Eq. (2), where L°, ∂L, and L⁻ are L's interior, boundary, and exterior, while R°, ∂R, and R⁻ are R's interior, boundary, and exterior, respectively. By definition, the line's boundary refers to the set of the line's two endpoints, while the region's boundary refers to the region's looped edge. Based on the emptiness/non-emptiness patterns of the 9-intersection matrix, we can distinguish 19 topological line-region relations in the two-dimensional Euclidean space R² [5]. Figure 2 shows these 19 line-region relations. In Figure 2, the emptiness/non-emptiness patterns of the 9-intersection matrix are represented visually by bitmap-like icons with 3 × 3 cells, in which white and black cells indicate the emptiness and non-emptiness of the corresponding elements in the 9-intersection matrix, respectively [25]. More detailed distinctions of topological line-region relations would be possible if further criteria were employed, for instance, the number of intersections and how each intersection extends over the space (i.e., zero-, one-, or two-dimensionally).


$$M(L, R) = \begin{pmatrix} L^{\circ} \cap R^{\circ} & L^{\circ} \cap \partial R & L^{\circ} \cap R^{-} \\ \partial L \cap R^{\circ} & \partial L \cap \partial R & \partial L \cap R^{-} \\ L^{-} \cap R^{\circ} & L^{-} \cap \partial R & L^{-} \cap R^{-} \end{pmatrix} \quad (2)$$
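The emptiness/non-emptiness pattern of Eq. (2) can be approximated computationally from a sampled line and a point classifier for the region. The sketch below is a simplification under stated assumptions: `locate` is a hypothetical classifier (here encoding a unit-square region) that returns 'I', 'B', or 'E' for the region's interior, boundary, and exterior; the line's interior is approximated by its non-endpoint samples; and the exterior row is set to non-empty throughout, since a simple line can never exhaust any topological part of a two-dimensional region:

```python
# Approximate the 9-intersection emptiness/non-emptiness pattern of a
# sampled line against a region. `locate` is a hypothetical classifier
# for a unit-square region, used only for illustration.

def locate(p):
    x, y = p
    on_edge = (x in (0, 1) and 0 <= y <= 1) or (y in (0, 1) and 0 <= x <= 1)
    return "B" if on_edge else ("I" if 0 < x < 1 and 0 < y < 1 else "E")

def nine_intersection_pattern(samples, locate):
    idx = {"I": 0, "B": 1, "E": 2}          # columns: R°, ∂R, R⁻
    M = [[False] * 3 for _ in range(3)]     # rows: L°, ∂L, L⁻
    for p in samples[1:-1]:                 # line interior (approximation)
        M[0][idx[locate(p)]] = True
    for p in (samples[0], samples[-1]):     # line boundary: the two endpoints
        M[1][idx[locate(p)]] = True
    # A simple line never exhausts the region's interior, boundary, or
    # exterior, so the L⁻ row is always non-empty in the plane.
    M[2] = [True, True, True]
    return M

# A line with one endpoint outside, one inside, sampled at E, B, I points.
line = [(-0.5, 0.5), (0.0, 0.5), (0.5, 0.5)]
print(nine_intersection_pattern(line, locate))
```

With denser sampling, the interior row would also register the exterior and interior cells the line actually traverses; the sketch only reflects whatever points are sampled.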

Note that in AI fields topological relations are often modeled by the RCC theory [28] instead of the 9-intersection. The RCC theory, however, does not explicitly distinguish the sorts of spatial objects. The 9-intersection, on the other hand, distinguishes different sets of topological relations depending on the sorts of spatial objects (e.g., 8 region-region relations in R², 19 line-region relations in R²). Thus, we expect that the characteristics of line-region configurations are captured more specifically by the 9-intersection.

2.2. Schematization of Topological Line-Region Relations

We schematize the 19 topological line-region relations into a conceptual neighborhood graph [7]. In this graph, pairs of 'similar' relations are linked to each other. Conceptual neighborhood graphs are a powerful tool for schematizing a set of relations and analyzing its characteristics [25,24,30]. The shape of the conceptual neighborhood graph depends on the definition of 'similar' relations. Figure 3 shows one of the conceptual neighborhood graphs of the 19 topological line-region relations, in which two relations are considered 'similar' and linked to each other if an instance of one relation may switch to


Figure 2. 19 topological line-region relations in R2 , which are distinguished by the iconic patterns of the 9-intersection [5].

another relation by transforming the line continuously without gaining or losing more than one intersection. Note that the resulting conceptual neighborhood graph is highly schematic, forming a lattice-like structure. Alternatively, we can consider unrestricted transformations of lines, where the line may gain or lose multiple intersections simultaneously, but the resulting conceptual neighborhood graph becomes much more complicated and non-planar [6]. Here we introduced the former graph, since our primary purpose is to develop a visual schema of the 19 line-region relations rather than to identify all possibilities of smooth transformations between relations. This conceptual neighborhood graph is later compared with that of the DLine-region relations, through which we find a close correspondence between the two graphs (Section 3.2).

2.3. Qualitative Interpretation of Topological Line-Region Relations

With the 19 topological line-region relations we can categorize numerous patterns of line-region configurations in R². A question is whether this categorization matches the way people categorize line-region configurations. To answer this question, Mark and Egenhofer [25] conducted an experiment in which human subjects were given 40 simple maps of a road and a park (essentially line-region configurations) and asked to group together the maps they would describe the same way in their native language. The 40 maps consist of 19 pairs that correspond to the 19 topological line-region relations and two extras. The result shows that the 19 most frequently grouped pairs are exactly the 19 pairs of maps with the same topological relations. This result indicates that it is natural to categorize line-region configurations based on their topology. They also found that some topological line-region relations are grouped more often than others. Figure 3 shows the sets of relations that more than half of the human subjects grouped together. These are considered qualitatively similar relations. Interestingly, each group of line-region relations always forms a cluster in the conceptual neighborhood graph.


Figure 3. A conceptual neighborhood graph of the 19 line-region relations in Figure 2. The graph also shows the consensus grouping of line-region relations in [25], in which the line-region relations were categorized into the same group by more than half of the human subjects.

Since topological line-region relations evidently match our categorization schema of line-region configurations, these relations are expected to have a certain correspondence with the natural language expressions we use for describing line-region configurations. Mark and Egenhofer [24] analyzed the correspondence between the 19 topological line-region relations and such spatial predicates as cross, go across, go through, and enter, based on the human subjects' agreement to associate each predicate with each of 60 road-park maps. They found that some predicates are assigned to multiple topological line-region relations (for instance, cross is associated with the relations in Figures 2d, 2g, and 2q). Then, by a regression analysis, they found that the topological relation of a line-region configuration mostly explains how well each predicate fits this configuration, but metric information may also be influential for some predicates. For instance, go across is used less than go through when the road enters and soon leaves the region. Shariff et al. [30] analyzed the correspondence between the 19 topological line-region relations and 59 spatial predicates in English. In their experiment, human subjects were asked to sketch a road on/around a park illustrated on paper, such that the illustration exemplified each predicate. Based on these illustrations, they modeled the spatial characteristics of the 59 predicates using the 19 topological line-region relations and 15 additional metric parameters. The result shows that each predicate is characterized by topological information plus only a few metric parameters specific to the predicate. Then, by a clustering analysis, the 59 spatial predicates were categorized into ten clusters (Table 1). The predicates in each group have similar spatial characteristics. Some predicates belong to multiple groups because of their semantic ambiguity. Among the ten clusters, six are characterized by a unique prototypical topological pattern.
Clusters 5 and 9, on the other hand, share the same prototypical topological patterns. They are distinguished by additional metric properties: how long the line goes along the region's edge (outer approximate perimeter/line alongness) is an important factor for the spatial predicates in Cluster 5 (e.g., goes by), but not for those in Cluster 9 (e.g., ends near). Similarly, Clusters 7 and 8 are characterized by the same prototypical topological patterns but different metric properties. It appears that the ten clusters of Table 1 can be used as a set of primitives for characterizing the motions around an area of interest, as each cluster is associated with specific spatial predicates that evoke an image of movement. Unfortunately, however, this categorization of spatial predicates lacks information about the line's direction. Accordingly, some direction-sensitive predicates, such as goes into and comes out of, belong to the same cluster despite their semantic difference. As a natural consequence, it is expected that a refinement of this categorization that takes the line's direction into account will yield a useful mapping between basic motion patterns and motion predicates.

3. Topological Relations between a Directed Line and a Region

The previous section revealed that line-region relations seem useful for characterizing motions, even though directional information is still necessary for distinguishing the motion patterns associated with direction-sensitive motion predicates. This section, therefore, develops a formal model of topological DLine-region relations, extending the previous approach. We then apply these relations to characterize basic motion patterns.


3.1. Representation of Topological DLine-Region Relations

How can we incorporate the line's directional information into the 9-intersection? One simple but sufficient solution is to split the line's boundary into the start point and the end point (recall that the line's boundary consists of two endpoints). Accordingly, the 9-intersection matrix is modified into the nested matrix in Eq. (3), where LD is the directed line (DLine) and ∂s LD and ∂e LD are its start and end points, respectively. With this nested matrix, we can deal with the intersections related to the DLine's start point and those related to the DLine's end point separately. This nested matrix is called the 9+-intersection matrix for topological DLine-region relations [20]. We can consider similar refinements of the 9-intersection whenever we need to distinguish subparts of the interior, boundary, or exterior of spatial objects [21].

$$M^{+}(L_D, R) = \begin{pmatrix} L_D^{\circ} \cap R^{\circ} & L_D^{\circ} \cap \partial R & L_D^{\circ} \cap R^{-} \\ \partial_s L_D \cap R^{\circ} & \partial_s L_D \cap \partial R & \partial_s L_D \cap R^{-} \\ \partial_e L_D \cap R^{\circ} & \partial_e L_D \cap \partial R & \partial_e L_D \cap R^{-} \\ L_D^{-} \cap R^{\circ} & L_D^{-} \cap \partial R & L_D^{-} \cap R^{-} \end{pmatrix} \quad (3)$$

Topological DLine-region relations are distinguished by the emptiness/non-emptiness patterns of the 9+-intersection matrix in Eq. (3). Such patterns are also represented by bitmap-like icons [20]. Each icon basically has equal-size 3 × 3 cells, but the three cells in the middle row are further partitioned, following the nested structure of the 9+-intersection matrix in Eq. (3) (Figures 4b-4c). White and black cells/cell partitions indicate the emptiness and non-emptiness of the corresponding elements in the 9+-intersection matrix, respectively. In Figures 4b-4c, we can clearly see that exiting and entering patterns are captured as different topological DLine-region relations.
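The effect of splitting the boundary row can be illustrated with a small sketch in the same spirit as before: classify sampled trajectory points against a region, but record the start point and the end point in separate rows. Reversing the trajectory then swaps those two rows, so an entering motion and an exiting motion yield distinct patterns, which the plain 9-intersection cannot distinguish. The unit-square `locate` classifier is an illustrative assumption:

```python
# Sketch of a 9+-intersection pattern with separate start/end rows, so
# that "enter" and "exit" motions get distinct patterns.

def locate(p):                 # unit-square region, illustrative only
    x, y = p
    on_edge = (x in (0, 1) and 0 <= y <= 1) or (y in (0, 1) and 0 <= x <= 1)
    return "B" if on_edge else ("I" if 0 < x < 1 and 0 < y < 1 else "E")

def nine_plus_pattern(samples):
    idx = {"I": 0, "B": 1, "E": 2}          # columns: R°, ∂R, R⁻
    rows = {"interior": [False] * 3, "start": [False] * 3,
            "end": [False] * 3, "exterior": [True] * 3}
    for p in samples[1:-1]:                 # DLine interior (approximation)
        rows["interior"][idx[locate(p)]] = True
    rows["start"][idx[locate(samples[0])]] = True
    rows["end"][idx[locate(samples[-1])]] = True
    return rows

entering = [(-0.5, 0.5), (0.0, 0.5), (0.5, 0.5)]   # exterior -> boundary -> interior
enter = nine_plus_pattern(entering)
exit_ = nine_plus_pattern(entering[::-1])           # same path, reversed direction
print(enter["start"], enter["end"])
print(exit_["start"], exit_["end"])
```

Under the plain 9-intersection of Eq. (2), both trajectories would collapse into the same boundary row and become indistinguishable.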


Table 1. Ten clusters of 59 spatial predicates, each characterized by one or two topological relations and a small set of metric parameters. Predicates with an * are considered direction-sensitive. The table is developed from the result of [30].

Significant metric parameters

Predicates used exclusively for each cluster

Other predicates

Cluster 1

Interior splitting

area

bisects, comes through, crosses, cuts, cuts across, cuts through, goes across, goes through, intersects, splits, transects, traverses

connected to, goes out of, ends just outside, intersect, run across, runs into, spans

Cluster 2

Exterior splitting

area

Cluster 3

Perimeter/line alongness

connects

along edge, runs along, connected to

Cluster 4 Cluster 5 Cluster 6

along edge, runs along

Outer nearness + Outer approximate perimeter/line alongness Exterior splitting

Cluster 7

avoids, bypasses, entirely outside, goes by, near, passes

Interior/exterior travese splitting + Inner closeness

Inner nearness + Inner approximate perimeter/line alongness

contained in edge

Cluster 8

connected to*, ends at*, enters*, ends just outside*, ends outside*, goes out of*, goes to*, goes up to*, intersect, runs into*, starts in*, starts just outside*, starts outside*

Inner nearness

contained, within, enclosed by, encloses, in, inside, starts and ends in

connects, starts in

Cluster 9

comes from*, comes into*, comes out of*, ends in*, ends just inside*, ends outside*, exits*, goes into*, leaves*, starts just inside*

Outer nearness

ends near*, outside, passes, runs along boundary, starts near*

connected to*, ends just outside*, ends outside*, goes away from*, starts outside*

Cluster 10


Prototypical topological relation

Outer closeness

area

runs across, spans

connected to*, ends at*, enters*, goes out of*, goes up to*, goes to*


Figure 4. (a) Iconic representation of a topological line-region relation under the 9-intersection and (b-c) those of topological DLine-region relations under the 9+-intersection.

Based on the emptiness/non-emptiness patterns of the 9+-intersection matrix, we can distinguish 26 topological DLine-region relations in R² (Figure 5). These 26 relations can also be derived from the previous 19 topological line-region relations, simply by giving the lines a direction. Among the 19 relations, twelve (Figures 2a-2l) are not sensitive to the line's direction, because the line's two endpoints are located within the same topological part of the region. From each of the remaining seven relations (Figures 2m-2s), two DLine-region relations are derived. As a result, 12 + 7 × 2 = 26 topological DLine-region relations are naturally derived. We can formally prove that there are no other DLine-region relations in R² under the 9+-intersection matrix [20].


3.2. Schematization of Topological DLine-Region Relations

These 26 DLine-region relations are schematized into a conceptual neighborhood graph, as we did for the line-region relations. The conceptual neighborhood graph is useful for grasping the structure of the identified relations, especially in comparison with the line-region relations identified before. In addition, the conceptual neighborhood graph is later used for visual reasoning (Section 3.3). We use a definition of neighbors equivalent to that for the line-region relations (Section 2.2); i.e., two DLine-region relations are linked if an instance of one relation may switch to the other by transforming the DLine continuously without gaining or losing more than one intersection. Under this definition, we can identify 46 pairs of neighbors. By linking these neighbors, the conceptual neighborhood graph of the 26 DLine-region relations in Figure 6 was developed [20]. The graph is drawn on a V-shaped tube instead of a 2D plane, such that links do not cross. Figure 7 shows the upper and lower halves of the conceptual neighborhood graph in Figure 6. The relations with gray background in Figure 7 are those located at the top or bottom of the graph in Figure 6. A remarkable point is that these two sub-graphs are isomorphic to the conceptual neighborhood graph of the 19 line-region relations in Figure 3. Actually, these two sub-graphs can be derived from the graph in Figure 3 by giving the lines a direction. Since seven of the 19 line-region relations are sensitive to the line's direction, two different graphs are derived. Synthesizing these two graphs naturally yields the conceptual neighborhood graph in Figure 6.

The conceptual neighborhood graph in Figure 6 has several unique characteristics; for instance:

• Pairs of relations located at the top and bottom of the graph are derived from each other by reversing the DLine's direction.

Figure 5. 26 topological DLine-region relations in R2 , which are distinguished by the iconic patterns of the 9+-intersection [20].

Figure 6. A conceptual neighborhood graph of the 26 topological DLine-region relations [20] (some relations are hidden behind the V-shaped tube).


Figure 7. Upper and lower halves of the conceptual neighborhood graph in Figure 6.

• Considering a line that penetrates the V-shaped tube from the front center to the back center, pairs of relations located symmetrically across this line are derived from each other by exchanging the region's interior and exterior.

For more characteristics of this conceptual neighborhood graph, see [20].

3.3. Qualitative Interpretation of Topological DLine-Region Relations

Our next step is to associate the 26 topological DLine-region relations with motion concepts. There are several potential approaches. One approach is to model motion-related spatial predicates with the aid of the topological DLine-region relations (and possibly other metric criteria). This approach follows the work by Mark and Egenhofer [24] and Shariff et al. [30] reviewed in Section 2.3. They associated a number of spatial predicates with the topological line-region relations. Most of these predicates are actually related to motion concepts, even though they are given to static line-region relations. Some predicates require directional information to capture their nuance correctly. This implies the potential value of topological DLine-region relations for capturing human concepts and expressions of motions. To a certain degree, we can reuse the results in [24,30]. For instance, in Table 1, we can divide Clusters 4, 9, and 10 into two sub-clusters each, considering the line's direction that each predicate presumes.

Let us go back to the swimming pool scenarios in the Introduction (Figures 1a-1f). The motion illustrated in Figure 1a (swimming from one side to another) is represented by the topological DLine-region relation (g) in Figure 5. Its non-directed correspondent (line-region relation (g) in Figure 2) is the prototypical relation of Cluster 6 in Table 1. Thus, the motion illustrated in Figure 1a can be associated with the motion concept underlying Cluster 6, which is represented by such predicates as runs across and spans. Similarly, the motion illustrated in Figure 1b (wandering around the center) can be associated with the motion concept underlying Cluster 8, represented by such predicates as (goes) inside, even though this time we explicitly need additional metric information to exclude the possibility that this motion corresponds to Cluster 7. The motion illustrated in Figure 1d (entering the pool calmly) corresponds to a refinement of Cluster 4, represented by such predicates as comes into and goes into, because the DLine-region relation in Figure 1d is direction-sensitive.

The motions illustrated in Figures 1c, 1e, and 1f are problematic cases. The motion in Figure 1c (jumping into the pool) does not correspond to any cluster in Table 1, because Table 1 supports only two-dimensional motions. Similarly, the motion in Figure 1e (creeping along the edge) does not correspond to any cluster, because Table 1 is developed from the line-region sketches drawn by human subjects for 59 predicates and, consequently, misses such motion concepts as go on the edge.
The motion in Figure 1f (turning back at the edge) corresponds to line-region relation (e), which is Cluster 3's prototypical relation, but the motion concept in Cluster 3, represented by such predicates as along edge and runs along, does not fit this motion. Since turning back at the edge or similar predicates were not used for the development of Table 1, Table 1 misses such a 'turning-back' concept. In this way, Table 1 still has room for improvement as a model of human motion concepts. Thus, to achieve a more precise and comprehensive modeling of motion predicates, it would be better to conduct another human-subject experiment using DLine-region configurations instead of line-region configurations.

A similar but slightly different approach is to interpret each topological DLine-region relation directly, regarding it as a motion pattern. For instance, Gao et al. [9] proposed a systematic schema that categorizes the topological relations between a DLine and another simple object (Figure 8a). This schema categorizes the 26 topological DLine-region relations into 10 groups (Figure 8b). Each group is given a name, such as pass-by and get-out, which highlights the qualitative characteristics of the group members when they are regarded as motion patterns. Interestingly, each group forms a connected sub-graph in the conceptual neighborhood graph. Looking at the swimming pool scenarios again, the DLine-region relations in Figures 1a, 1b, 1d, 1e, and 1f belong to the classes named within/along, within/along, enter, pass-by, and pass-by, respectively. As this result indicates, this classification may be too coarse to be used as a model of motion concepts.

Kurata and Egenhofer [20] proposed another sort of grouping, in which relations are grouped if they satisfy certain motion-related conditions. The members of each group are represented by an icon (Figures 9a-9j). This icon has a graph structure, which corresponds to the sub-graphs of the conceptual neighborhood graph in Figure 7. Each node is marked if the corresponding DLine-region relation satisfies the given condition. Some nodes are partitioned, such that their upper and lower halves correspond to the different relations in the upper and lower sub-graphs (i.e., the relations located at the top and bottom of the V-shaped tube in Figure 6). An interesting point of this representation is that through simple manipulations of the icons in Figure 9 we can derive the DLine-region relations that satisfy complex conditions. For instance, Figure 10a shows the intersection of the icons in Figures 9a and 9f, whose result indicates that only one DLine-region relation satisfies "start from inside and end at outside" (i.e., "exit"). Similarly, Figures 10b-10d show the union of two icons, the difference of two icons, and the complement of an icon, whose results indicate the sets of DLine-region relations that satisfy "start from inside or end at outside", "start from inside but not end at outside", and "not start from inside", respectively. Such computation is particularly useful when we have to integrate characterizations of an agent's motion reported by multiple observers.
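The icon manipulations in Figure 10 amount to ordinary set operations over relation labels, which makes them easy to sketch in code. The group memberships below are illustrative assumptions covering only a handful of the relations labelled (a)-(s2) in Figure 5, not the complete groups from [20]:

```python
# Set operations on groups of DLine-region relations (cf. Figure 10).
# Memberships are assumed, partial illustrations only: relations whose
# start point lies in R° and relations whose end point lies in R⁻.

STARTS_INSIDE = {"a", "b", "m1", "n1", "o1", "p1"}
ENDS_OUTSIDE = {"e", "f", "p1", "q1", "r1"}

exit_like = STARTS_INSIDE & ENDS_OUTSIDE   # start inside AND end outside
either = STARTS_INSIDE | ENDS_OUTSIDE      # start inside OR end outside
not_exit = STARTS_INSIDE - ENDS_OUTSIDE    # start inside but NOT end outside

print(exit_like)   # with these assumed memberships: {'p1'}
```

Complement works the same way against the full 26-relation universe, mirroring Figure 10d's "not start from inside".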


Figure 8. (a) Classification schema of topological relations between a directed line L and a simple object X by Gao et al. [9] and (b) its application to the 26 topological DLine-region relations (only eight groups are illustrated on the upper half of the conceptual neighborhood graph).

Figure 9. Iconic representation of groups of topological DLine-region relations that satisfy respective qualitative conditions [20].


Recently, Hois and Kutz [12,13] proposed a formal framework for associating spatial relations defined in spatial calculi, such as the 9+-intersection [20] and Double Cross Calculus [8], with the motion concepts specified in a linguistically-motivated ontology. Such developments of calculi-ontology associations will be highly profitable for enriching human-machine interfaces.

4. Segmentation of Directed Lines


So far we have implicitly considered only simple motions, for instance, motions where the agent may cross or touch the boundary once or twice, but no more. Of course, we may have to deal with more complicated motions, especially when we observe the moving agent for a long period. In such cases, people naturally subdivide the motion of the agent into several steps and describe the motion as a sequence of characterizations given to the individual segments. Our question is, then, how to subdivide the motion of the agent (i.e., the DLine) in an appropriate way.

In order to describe the topological properties of complicated motions more precisely, we introduce an alternative notation for DLine-region configurations, called the IBE-sequence [20]. In this notation, a DLine-region configuration is represented by a three-tuple, such as (I, IBE, E), which represents the transition of the location of an agent over a certain period. The first and third elements indicate the agent's initial and final locations. I, B, and E stand for the region's interior, boundary, and exterior, respectively. With this notation, the 26 topological DLine-region relations are rewritten as shown in Table 2. As this table indicates, some topological relations correspond to multiple (actually an infinite number of) configurations with different topologies (e.g., Figure 11). This highlights the aspect of the 26 DLine-region relations as a categorization of DLine-region configurations based on some (but not all) topological properties. We now propose five potential strategies for segmenting complex trajectories.
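The IBE-sequence of an observed trajectory can be computed by classifying each sample point as I, B, or E and collapsing consecutive repeats. The sketch below assumes a hypothetical region classifier `locate` (here a unit square) and sampling fine enough that every boundary crossing produces a B-classified sample; in practice a snapping step would ensure this:

```python
# Compute the IBE-sequence three-tuple (initial location, collapsed
# I/B/E string, final location) of a sampled trajectory.

def locate(p):                 # unit-square region, illustrative only
    x, y = p
    on_edge = (x in (0, 1) and 0 <= y <= 1) or (y in (0, 1) and 0 <= x <= 1)
    return "B" if on_edge else ("I" if 0 < x < 1 and 0 < y < 1 else "E")

def ibe_sequence(samples):
    labels = [locate(p) for p in samples]
    collapsed = [labels[0]]
    for lab in labels[1:]:
        if lab != collapsed[-1]:           # drop consecutive repeats
            collapsed.append(lab)
    seq = "".join(collapsed)
    return (seq[0], seq, seq[-1])

# Crossing the region from left to right: E -> B -> I -> B -> E.
crossing = [(-0.5, 0.5), (0.0, 0.5), (0.5, 0.5), (1.0, 0.5), (1.5, 0.5)]
print(ibe_sequence(crossing))   # ('E', 'EBIBE', 'E'), matching relation (d)
```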

Figure 10. Set operations on the icons representing a set of DLine-region relations.

Figure 11. Three DLine-region configurations with different topologies, which correspond to the same topological DLine-region relation in Figure 5i.


Table 2. 26 topological DLine-region relations represented by IBE-sequences, where [X] is X or empty, Y* is an arbitrary number of Y, and Z|W is either Z or W, but not both.

Direction-Invariant Relations:
(a) (I, I, I)
(b) (I, IB[IB]*I, I)
(c) (I, IB[IB|EB]*E[BI|BE]*BI, I)
(d) (E, EB[IB|EB]*I[BI|BE]*BE, E)
(e) (E, EB[EB]*E, E)
(f) (E, E, E)
(g) (B, I, B)
(h) (B, [B]IB[IB]*, B) ∨ (B, [BI]*BI[B], B)
(i) (B, [B][IB|EB]*(IBE|EBI)[BI|BE]*[B], B)
(j) (B, [B]EB[EB]*, B) ∨ (B, [BE]*BE[B], B)
(k) (B, E, B)
(l) (B, B, B)

Direction-Variant Relations:
(m1) (I, I, B)    (m2) (B, I, I)
(n1) (I, IB[IB]*[I], B)    (n2) (B, [I][BI]*BI, I)
(o1) (I, IB[IB|EB]*E[BI|BE]*[B], B)    (o2) (B, [B][IB|EB]*E[BI|BE]*BI, I)
(p1) (I, IB[IB|EB]*E, E)    (p2) (E, E[BI|BE]*BI, I)
(q1) (B, [B][IB|EB]*I[BI|BE]*BE, E)    (q2) (E, EB[IB|EB]*I[BI|BE]*[B], B)
(r1) (B, [E][BE]*BE, E)    (r2) (E, EB[EB]*[E], B)
(s1) (B, E, E)    (s2) (E, E, B)

Strategy 1: Equal-Length Segmentation. In this strategy, the trajectory is divided into n segments of equal spatial length (Figure 12a). This strategy is simple, but of course there is no guarantee that the configuration of each segment and the region becomes simple.


Strategy 2: Equal-Interval Segmentation. In this strategy, we partition the trajectory at the points where the moving agent is located at time nt0 (t0: constant) (Figure 12b). Again, there is no guarantee that each segment-region configuration is simple. This strategy is reasonable when the behaviour of a moving agent must be reported periodically at certain intervals.

Strategy 3: Boundary-Based Segmentation. In this strategy, we partition the trajectory at every point where the trajectory crosses, merges with, or leaves the region's boundary (Figure 12c). This strategy was proposed in [36] for capturing topological line-region relations more precisely than the 9-intersection. The merit of this method is that any motion is characterized by a unique sequence of nine primitives: (E, E, E), (E, E, B), (B, B, B), (B, E, E), (B, E, B), (B, I, B), (B, I, I), (I, I, I), and (I, I, B).

Strategy 4: Primitive-Based Segmentation. Just like Strategy 3, we prepare a set of primitives and parse the given IBE-sequence into a sequence of primitives (Figure 12d). These primitives should correspond to prototypes of motion concepts/predicates, such that the parsed result naturally fits human conceptualization of motions. A given IBE-sequence may be parsed in multiple ways; thus, how to identify the most reasonable parse by considering the priorities of the primitives is left as a challenging question.

Strategy 4': Temporally-Adjusted Primitive-Based Segmentation. A problem with Strategy 4 is that we cannot tell from the IBE-sequence whether the moving agent instantly crosses the region's boundary or stays on the boundary for a long period. Thus, we modify Strategy 4, such that if the time over which a moving agent stays on the region's boundary exceeds a certain threshold, we partition the trajectory at the points where it merges with and leaves the region's boundary (Figure 12e).
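Strategy 3 can be sketched directly on the collapsed IBE string: cut at every boundary symbol, and each resulting piece is one of the nine primitives. This simplified sketch ignores temporal information and assumes the string has no immediate repeats (as produced by collapsing):

```python
# Boundary-based segmentation (Strategy 3): split an IBE string at
# every boundary symbol; each piece is one of the nine primitives
# (E,E,E), (E,E,B), (B,B,B), (B,E,E), (B,E,B), (B,I,B), (B,I,I),
# (I,I,I), (I,I,B).

def boundary_segments(ibe):
    pieces, start = [], 0
    for i, c in enumerate(ibe):
        if c == "B" and i > start:
            pieces.append(ibe[start:i + 1])
            start = i                      # adjacent segments share the cut point
    if start < len(ibe) - 1 or not pieces:
        pieces.append(ibe[start:])
    # Convert each piece to (start, travelled part, end); for a
    # two-symbol piece the travelled part is the non-boundary symbol.
    return [(s[0], s[1:-1] or (s[0] if s[-1] == "B" else s[-1]), s[-1])
            for s in pieces]

# A crossing motion splits into: reach the edge, traverse the
# interior, leave the region.
print(boundary_segments("EBIBE"))
```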

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

Y. Kurata and M.J. Egenhofer / Interpretation of Behaviours from a Viewpoint of Topology


Figure 12. Five strategies for segmenting a DLine in a complicated DLine-region configuration. In (c)-(e), each segment is assigned a typical motion predicate that characterizes the motion on the segment.


Figure 13 shows an example of primitives for Strategies 4 and 4', together with their priorities. Each primitive corresponds to a specific concept of motion, which is represented by a typical motion predicate. By applying these primitives, a given motion is characterized by a sequence of predicates (Figures 12d and 12e). Note that these primitives are not pairwise disjoint; that is, the primitives with higher priority can be represented by combinations of the primitives with lower priority. This reflects the fact that some motion concepts (e.g., go through) are equivalent to a sequence of other motion concepts (e.g., enter and then exit).
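A greedy, priority-first parse in the spirit of Strategy 4 can be sketched as follows. The primitive inventory and its priorities are illustrative assumptions of this sketch (they do not reproduce Figure 13 exactly); note how the high-priority "cross" absorbs what could otherwise be read as "enter" followed by "exit".

```python
# Parsing an I/B/E label sequence into motion primitives by priority.
# Consecutive primitives share one label (e.g. "enter" then "exit" share the
# interior part), so matches overlap by one position.

PRIMITIVES = [            # ordered from high to low priority
    ("cross",  ("E", "B", "I", "B", "E")),
    ("enter",  ("E", "B", "I")),
    ("exit",   ("I", "B", "E")),
    ("touch",  ("E", "B", "E")),
]

def parse_ibe(seq):
    result, i = [], 0
    while i < len(seq) - 1:
        for name, pattern in PRIMITIVES:    # try high-priority first
            n = len(pattern)
            if tuple(seq[i:i + n]) == pattern:
                result.append(name)
                i += n - 1                  # overlap: reuse the last label
                break
        else:
            raise ValueError(f"no primitive matches at position {i}")
    return result

print(parse_ibe(["E", "B", "I", "B", "E"]))            # one "cross", not enter+exit
print(parse_ibe(["E", "B", "I", "B", "E", "B", "I"]))  # "cross" followed by "enter"
```

This greedy scheme is one concrete way to resolve the ambiguity mentioned above; a full solution would weigh alternative parses against each other rather than always committing to the highest-priority match.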

Figure 13. An example of motion primitives, which have different priorities.

5. Modeling Non-planar Motions

So far we have considered two-dimensional motions. In a 2D space, if an agent moves between the interior and exterior of a region, it always passes through the region's boundary. In a 3D space, however, direct transitions from the interior to the exterior, or vice versa, are possible. For instance, in the swimming pool scenario, children may jump into the pool without passing through the pool's edge. How can we characterize such non-planar motions? Topological DLine-region relations in R3 will probably be a key to this characterization. The 9+-intersection distinguishes 45 topological DLine-region relations in R3 [21]. Among these 45 relations, the 19 relations in Figure 14 correspond to non-planar motion patterns, while the remaining 26 relations are equivalent to the topological DLine-region relations in R2 (Figure 5). These 45 relations will serve as a formal foundation for characterizing three-dimensional motions in association with a two-dimensional area of interest. For instance, we can capture the movement of a flying bird with respect to a pond with these topological DLine-region relations. The topological relations in Figures 14C, 14E, and 14J2 may be mapped to such spatial predicates as "jump within", "touch-and-go", and "dive into". Completing such mappings between the additional 19 relations and the linguistic expressions of motions will be highly important for the qualitative characterization of motions, just like the previous 26 topological DLine-region relations in R2 (Section 3). We tried to schematize the 45 topological DLine-region relations into a conceptual neighborhood graph (Figure 15). This time, however, the graph becomes much more complicated than the previous conceptual neighborhood graph in Figure 6. Thus, icons like those in Figure 9 are no longer available.
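The swimming-pool example can be made concrete with a small sketch: a 3D trajectory is sampled against a 2D region (here an illustrative disc in the z = 0 plane, an assumption of this sketch), and the direct exterior-interior transitions that cannot occur in planar motion are flagged.

```python
# Detecting a non-planar ("jump into") transition: in 3D, an agent can pass
# from the exterior directly into the region's interior without ever touching
# the region's boundary. The region is an illustrative disc of radius 2.

import math

def classify(p, radius=2.0, eps=1e-9):
    x, y, z = p
    if abs(z) > eps:
        return "E"                  # off the region's plane: exterior
    r = math.hypot(x, y)
    if abs(r - radius) <= eps:
        return "B"                  # on the disc's boundary circle
    return "I" if r < radius else "E"

def direct_transitions(points):
    """Return indices where the label jumps E<->I with no B in between,
    a transition that cannot occur for planar (2D) motion."""
    labels = [classify(p) for p in points]
    return [i for i in range(len(labels) - 1)
            if {labels[i], labels[i + 1]} == {"E", "I"}]

# A child jumping into the pool: airborne (E), then in the water (I),
# without ever touching the pool's edge (B).
path = [(0.0, 0.0, 1.0), (0.5, 0.0, 0.5), (1.0, 0.0, 0.0)]
print(direct_transitions(path))
```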

Figure 14. 19 topological DLine-region relations, which are peculiar to R3 [21].

6. Conclusion and Future Problems

Behavioural monitoring often concerns the interpretation of agents' motions with respect to a certain area of interest. In this chapter, we introduced a formal model of topological relations between a directed line and a region, called the 9+-intersection, and explored its application to the characterization of motions. The 9+-intersection distinguishes 26 DLine-region relations in R2 and 45 DLine-region relations in R3. These relations are expected to serve as a formal foundation for modeling the motion concepts used by humans. The results of human subject tests in [25,24,30] show that topological line-region relations are already a highly influential factor in the choice of spatial predicates. The use of DLine-region relations enables us to capture the nuances of direction-sensitive predicates more precisely. For characterizing complicated motions where the moving agent crosses/touches the border of the area of interest repeatedly, cognitively-adequate segmentation of the trajectory is crucial. How to achieve such segmentation is our ongoing work.

A set of primitive motion patterns, represented by topological DLine-region relations, potentially serves as a foundation for mobile intention recognition [15]. For instance, Kiefer and Schlieder [15,14] segment the trajectory of an agent into a sequence of basic motion primitives and parse this sequence in order to deduce the agent's intention (Chapter 10). The use of topologically distinguished primitives will be interesting for intention recognition, because behavioural monitoring often concerns motions with respect to an area of interest. Similarly, topological DLine-region relations may be used as the set of primitive motion patterns in REMO [22] for analyzing large-scale trajectory data.

The limitation of the topology-based approach arises when the trajectory does not intersect with the area of interest. In this situation, the topological DLine-region relation only tells us that the agent stays outside of the area of interest, even though people may describe this situation in various ways: for instance, the agent is going toward, leaving away from, or passing by the area (Figures 16a-16c). Distinguishing such disjoint configurations may be crucial for some scenarios, for instance robot navigation via natural language. Thus, we need a complementary model of motions which takes non-topological properties (say, direction and distance relations) into account. One potential complementary model is RfDL (Region-in-the-frame-of-a-DLine) by Shi and Kurata [32]. In this model, we take a straight segment of a trajectory and project a double-cross-like frame of spatial reference [7] over this segment (Figure 17a). This frame defines fifteen 2D/1D/0D partitions around/over the segment (Figure 17b). Then, the spatial arrangement of this segment and the area of interest is represented by the partitions with which this area intersects (Figure 17c).
Shi and Kurata [32] discuss the mapping between such segment-area arrangements and motion concepts, including going toward, leaving away from, and passing by.

Figure 15. A conceptual neighborhood graph of the 45 topological DLine-region relations in R3.

Figure 16. Examples of three motion patterns, which cannot be distinguished by our topological approach.

The 9+-intersection introduced in this chapter provides a flexible and universal framework for modeling a large variety of topological relations [21]. The application of the 9+-intersection to other DLine-relevant topological relations (e.g., DLine-line relations) is another interesting topic, because such DLine-relevant relations can be applied to the categorization of motions with respect to landmarks of various geometric types [9].


Figure 17. Modeling of a DLine-region arrangement by RfDL [32]: (a) projection of a double-cross-like frame of spatial reference, (b) distinction of fifteen fields, and (c) the iconic representation of the DLine-region arrangement.


References

[1] Alexandroff, P.: Elementary Concepts of Topology. Dover Publications, Mineola, NY, USA (1961)
[2] Dodge, S., Weibel, R., Lautenschütz, A.-K.: Taking a Systematic Look at Movement. In: Andrienko, G., Andrienko, N., Dykes, J., Fabrikant, S., Wachowicz, M. (eds.): AGILE Workshop on GeoVisualization of Dynamics, Movement, and Change (2008)
[3] Dylla, F., Wallgrün, J.: On Generalizing Orientation Information in OPRA_m. In: Freksa, C., Kohlhase, M., Schill, K. (eds.): 29th Annual German Conference on AI (KI 2006), Lecture Notes in Computer Science, vol. 4314, pp. 274-288. Springer, Berlin/Heidelberg, Germany (2006)
[4] Egenhofer, M.: A Formal Definition of Binary Topological Relationships. In: Litwin, W., Schek, H.-J. (eds.): 3rd International Conference on Foundations of Data Organization and Algorithms, Lecture Notes in Computer Science, vol. 367, pp. 457-472. Springer, Berlin/Heidelberg, Germany (1989)
[5] Egenhofer, M., Herring, J.: Categorizing Binary Topological Relationships between Regions, Lines and Points in Geographic Databases. In: Egenhofer, M., Herring, J., Smith, T., Park, K. (eds.): NCGIA Technical Reports 91-7. National Center for Geographic Information and Analysis, Santa Barbara, CA, USA (1991)
[6] Egenhofer, M., Mark, D.: Modeling Conceptual Neighborhoods of Topological Line-Region Relations. International Journal of Geographical Information Systems 9(5), 555-565 (1995)
[7] Freksa, C.: Temporal Reasoning Based on Semi-Intervals. Artificial Intelligence 54, 199-227 (1992)
[8] Freksa, C.: Using Orientation Information for Qualitative Spatial Reasoning. In: Frank, A., Campari, I., Formentini, U. (eds.): International Conference GIS - From Space to Territory: Theories and Methods of Spatio-Temporal Reasoning in Geographic Space, Lecture Notes in Computer Science, vol. 639, pp. 162-178. Springer, Berlin/Heidelberg, Germany (1992)
[9] Gao, Y., Zhang, Y., Tian, Y., Weng, J.: Topological Relations between Directed Lines and Simple Geometries. Science in China Series E: Technological Sciences 51, Supplement 1, 91-101 (2008)
[10] Gottfried, B., Witte, J.: Representing Spatial Activities by Spatially Contextualized Motion Patterns. In: Lakemeyer, G., Sklar, E., Sorrenti, D., Takahashi, T. (eds.): RoboCup 2006, Lecture Notes in Artificial Intelligence, vol. 4434, pp. 330-337. Springer, Berlin/Heidelberg, Germany (2006)
[11] Gottfried, B.: Representing Short-Term Observations of Moving Objects by a Simple Visual Language. Journal of Visual Languages and Computing 19, 321-342 (2008)
[12] Hois, J., Kutz, O.: Counterparts in Language and Space: Similarity and S-Connection. In: Eschenbach, C., Grüninger, M. (eds.): 5th International Conference on Formal Ontology in Information Systems. IOS Press, Amsterdam, Netherlands (2008)
[13] Hois, J., Kutz, O.: Natural Language Meets Spatial Calculi. In: Freksa, C., Newcombe, N., Gärdenfors, P., Wölfl, S. (eds.): Spatial Cognition VI, Lecture Notes in Artificial Intelligence, vol. 5248, pp. 266-282. Springer, Berlin/Heidelberg, Germany (2008)


[14] Kiefer, P., Schlieder, C.: Exploring Context-Sensitivity in Spatial Intention Recognition. In: Gottfried, B. (ed.): 1st Workshop on Behavioural Monitoring and Interpretation, TZI-Bericht, vol. 42, pp. 102-116. Technologie-Zentrum Informatik, Universität Bremen, Germany (2007)
[15] Kiefer, P.: Spatially Constrained Grammars for Mobile Intention Recognition. In: Freksa, C., Newcombe, N., Gärdenfors, P., Wölfl, S. (eds.): Spatial Cognition VI, Lecture Notes in Artificial Intelligence, vol. 5248, pp. 361-377. Springer, Berlin/Heidelberg, Germany (2008)
[16] Kray, C., Blocher, A.: Modeling the Basic Meanings of Path Relations. In: 16th International Joint Conference on Artificial Intelligence, pp. 384-389. Morgan Kaufmann (1999)
[17] Kray, C., Baus, J., Zimmer, H., Speiser, H., Krüger, A.: Two Path Prepositions: Along and Past. In: Montello, D. (ed.): COSIT '01, Lecture Notes in Computer Science, vol. 2205, pp. 263-277. Springer, Berlin/Heidelberg, Germany (2001)
[18] Krüger, A., Maaß, W.: Towards a Computational Semantics of Path Relations. In: Workshop on Language and Space at the 14th National Conference on Artificial Intelligence (1997)
[19] Kurata, Y., Egenhofer, M.: The Head-Body-Tail Intersection for Spatial Relations between Directed Line Segments. In: Raubal, M., Miller, H., Frank, A., Goodchild, M. (eds.): GIScience 2006, Lecture Notes in Computer Science, vol. 4197, pp. 269-286. Springer, Berlin/Heidelberg, Germany (2006)
[20] Kurata, Y., Egenhofer, M.: The 9+-Intersection for Topological Relations between a Directed Line Segment and a Region. In: Gottfried, B. (ed.): 1st Workshop on Behavioural Monitoring and Interpretation, TZI-Bericht, vol. 42, pp. 62-76. Technologie-Zentrum Informatik, Universität Bremen, Germany (2007)
[21] Kurata, Y.: The 9+-Intersection: A Universal Framework for Modeling Topological Relations. In: Cova, T., Miller, H., Beard, K., Frank, A., Goodchild, M. (eds.): GIScience 2008, Lecture Notes in Computer Science, pp. 181-198. Springer, Berlin/Heidelberg, Germany (2008)
[22] Laube, P., Imfeld, S., Weibel, R.: Discovering Relative Motion Patterns in Groups of Moving Point Objects. International Journal of Geographical Information Science 19(6), 639-668 (2005)
[23] Mandel, C., Frese, U., Röfer, T.: Robot Navigation Based on the Mapping of Coarse Qualitative Route Descriptions to Route Graphs. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2006), pp. 205-120 (2006)
[24] Mark, D.: Calibrating the Meanings of Spatial Predicates from Natural Language: Line-Region Relations. In: Waugh, T., Healey, R. (eds.): 6th International Symposium on Spatial Data Handling, pp. 538-553. Taylor & Francis, London, UK (1994)
[25] Mark, D., Egenhofer, M.: Modeling Spatial Relations between Lines and Regions: Combining Formal Mathematical Models and Human Subjects Testing. Cartography and Geographic Information Systems 21(3), 195-212 (1994)
[26] Musto, A., Stein, K., Eisenkolb, A., Röfer, T., Brauer, W., Schill, K.: From Motion Observations to Qualitative Motion Representation. In: Freksa, C., Habel, C., Wender, K. (eds.): Spatial Cognition II, Lecture Notes in Computer Science, pp. 115-126. Springer, Berlin/Heidelberg, Germany (2000)
[27] Nedas, K., Egenhofer, M., Wilmsen, D.: Metric Details of Topological Line-Line Relations. International Journal of Geographical Information Science 21(1), 21-48 (2007)
[28] Randell, D., Cui, Z., Cohn, A.: A Spatial Logic Based on Regions and Connection. In: Nebel, B., Rich, C., Swartout, W. (eds.): 3rd International Conference on Knowledge Representation and Reasoning, pp. 165-176. Morgan Kaufmann, San Francisco, CA, USA (1992)
[29] Schneider, M., Behr, T.: Topological Relationships between Complex Spatial Objects. ACM Transactions on Database Systems 31(1), 39-81 (2006)
[30] Shariff, A., Egenhofer, M., Mark, D.: Natural-Language Spatial Relations between Linear and Areal Objects: The Topology and Metric of English-Language Terms. International Journal of Geographical Information Science 12(3), 215-246 (1998)
[31] Shi, H., Tenbrink, T.: Telling Rolland Where to Go: HRI Dialogues on Route Navigation. In: Coventry, K., Bateman, J., Tenbrink, T. (eds.): Workshop on Spatial Language and Dialogue (2005)
[32] Shi, H., Kurata, Y.: Modeling Ontological Concepts of Motions with Two Projection-Based Spatial Models. In: Gottfried, B., Aghajan, H. (eds.): 2nd Workshop on Behaviour Monitoring and Interpretation, CEUR Workshop Proceedings, vol. 396, pp. 42-56. CEUR-WS.org (2008)
[33] Talmy, L.: Fictive Motion in Language and "Ception". In: Bloom, P., Peterson, M., Nadel, L., Garrett, M. (eds.): Language and Space, pp. 211-276. MIT Press, Cambridge, MA, USA (1996)
[34] Tschander, L., Schmidtke, H., Eschenbach, C., Habel, C., Kulik, L.: A Geometric Agent Following Route Instructions. In: Freksa, C. (ed.): Spatial Cognition III, Lecture Notes in Artificial Intelligence, vol. 2685, pp. 89-111. Springer, Berlin/Heidelberg, Germany (2003)


[35] Van de Weghe, N., Cohn, A., De Maeyer, P.: A Qualitative Trajectory Calculus as a Basis for Representing Moving Objects in Geographical Information Systems. Control and Cybernetics 35, 97-120 (2006)
[36] Wang, X., Luo, Y., Xu, Z.: SOM: A Novel Model for Defining Topological Line-Region Relations. In: Laganà, A., Gavrilova, M., Kumar, V., Mun, Y., Tan, C., Gervasi, O. (eds.): ICCSA 2004, Lecture Notes in Computer Science, vol. 3045, pp. 335-344. Springer, Berlin/Heidelberg, Germany (2004)
[37] Wood, Z., Galton, A.: Collectives and How They Move: A Tale of Two Classifications. In: Gottfried, B., Aghajan, H. (eds.): 2nd Workshop on Behaviour Monitoring and Interpretation, CEUR Workshop Proceedings, vol. 396, pp. 57-71. CEUR-WS.org (2008)


Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-048-3-98

Qualitative Spatial Reasoning for Navigating Agents
Behavior Formalization with Qualitative Representations

Frank DYLLA
Universität Bremen, SFB/TR 8 – Spatial Cognition

Abstract. In this chapter we show how qualitative representations can be applied to the formalization of agent behavior. First, we give an introduction to several aspects of space relevant to agent motion and agent control, especially orientation, location, and distance. Based on these preliminaries, we explain which characteristics a qualitative spatial representation and its operations, called a qualitative calculus, should have so that knowledge can be manipulated in an adequate manner. Afterwards, we present the two main categories of reasoning with qualitative calculi: constraint-based reasoning and (action-augmented) neighborhood-based reasoning. As an example, we sketch how this structure can be utilized for representing agent behavior. Additionally, we apply the approaches to agent control in the context of right-of-way rules in sea navigation.

Keywords. qualitative spatial reasoning, qualitative spatial representations for moving objects


1. Introduction

A considerable part of everyday human interaction is guided by regulations, for example regulations on how to behave in traffic scenarios, recommendations on how to use escalators, rules on how to enter subways and buses, rules of politeness at bottlenecks in traffic situations, in sports, in games, in expert recommendation systems, and so on. Generally speaking, behavior is restricted by conditions and constraints.

Nowadays, we find more and more robotic systems and autonomous agent systems in everyday life. Representatives of autonomous agent systems are, for example, robots supporting elderly care or housekeeping, toys for playing with children, and driver assistance systems in traffic vehicles. A general task that all these systems need to cope with is interacting with the environment, including other agents. This requires an adequate representation of agent behavior, especially including action in space and communication about space.

Agents that have to solve navigational tasks need to consider aspects that go far beyond single-agent goal-directed deliberation: what an agent does in a specific situation often interferes with what other agents do at the same time. In order to avoid conflicts or even collisions, agents must be aware that situations in space are governed by laws, rules, and agreements between the involved agents. Therefore, artificial agents and BMI (Behavior Monitoring and Interpretation) systems interacting with humans need to be aware of these regulations. Artificial agents acting autonomously must also be able to process them. Examples of such sets of rules are traffic regulations: if two cars meet at an intersection, the car from the right has the right of way and the car from the left has to obey it. Another example are the collision regulations in vessel navigation: if two motor boats are in a head-on or nearly head-on position, one navigational rule prescribes that both boats need to turn starboard1 and pass on the port side of the other.

These rules have in common that they are usually formulated in natural language and hence extensively use qualitative terms like 'to the left', 'turn right' or 'in danger of collision' to describe spatial situations and actions. For example, in traffic laws qualitative concepts are used to describe relevant situations and also the "correct" behavior of agents in these situations. In a right-of-way road traffic scenario, the correct behavior of the vehicle giving way is partially given by 'taking clear action to show that she will be waiting, especially by moderate speed'. In addition, which behavior is considered correct for a certain agent may not only depend on the spatial situation at hand, but also on the current role of the agent in a particular situation. What an agent is allowed to do may depend on whether she is a pedestrian or whether she is using a vehicle, and if so, which kind of vehicle. In contrast, quantitative data are values measured in quantities, e.g. 90°, turn 270°, or 250 meters. Although robot systems rely on metric data concerning sensory input and motor control, implementing such rule sets in agent systems based on pure metric data is in most cases a complicated and error-prone undertaking. Providing agent and BMI system developers with auxiliary methods and tools for making such tasks less complicated, and thus less error-prone, is an expedient objective.
The problem of formalizing agent behavior regarding a specific set of regulations must be considered from two different standpoints. First, what are adequate qualitative spatial representations for the formalization, and how can we reason with these representations? Second, what actions should the agent perform to achieve behavior which is in compliance with the regulations? From a BMI perspective we may ask the other way round: which actions has an agent performed to reach the current state with respect to a previous state, and are these actions in compliance with the given regulations?

In the remainder of this chapter we give an overview of the field of Qualitative Spatial Reasoning (QSR) and of how its representations and reasoning techniques can be applied for controlling agents and for interpreting agent behavior. The overview of QSR includes considerations on different aspects of space, namely orientation, location, distance, and motion, as well as an introduction to qualitative spatial calculi, i.e. specific representations and their elementary operations. Based on a calculus, in general two kinds of reasoning can be performed: constraint-based reasoning and neighborhood-based reasoning. As the representation of orientation is the most important aspect in our context, we present several orientation calculi. Finally, we apply the introduced techniques to represent right-of-way rules in a qualitative manner in so-called rule transition systems, and exploit these systems to derive rule-compliant agent behavior.

1 Starboard is the nautical term that refers to the right side of a vessel with respect to its bow (front); port refers to the left-hand side, stern to the back.


2. Qualitative Spatial Representation and Reasoning


Although the world is infinitely complex and our knowledge of the world is limited, i.e. incomplete, biological systems, especially humans, function quite well within this world without understanding it completely [31]. Humans understand physical mechanisms such as bathtubs, indoor and outdoor navigation, bicycling, microwave ovens, and so on. Qualitative Reasoning (QR) is concerned with capturing such everyday commonsense knowledge of the physical world with a limited set of symbols, and allows for dealing with this knowledge without numerical values [8]. In addition, qualitative approaches are considered to be closer to how humans deal with commonsense knowledge than quantitative approaches [47]. Although knowledge about the world is always incomplete, computers can be enabled to monitor, diagnose, predict, plan, or explain the behavior of physical systems in a qualitative manner, provided that appropriate qualitative representations make the relevant knowledge explicit and corresponding reasoning methods are available. Quantitative models describing complex commonsense knowledge, in contrast, may be intractable or even unavailable; one reason for the intractability of a quantitative model is that an infinite number of potential states or configurations is given. With qualitative representations, conclusions and inferences can still be drawn even in the absence of complete knowledge, without applying any probabilistic or fuzzy-based techniques.

Space is a fundamental concept for humans to navigate in familiar or unfamiliar environments or to reason about properties of object configurations within such an environment. The subfield of qualitative reasoning that is concerned with representations of space is called Qualitative Spatial Reasoning (QSR). In the remainder of this section we introduce the different aspects of space which are relevant in our context, namely orientation, distance, and motion.
We present the essence of qualitative spatial representations and introduce constraint-based reasoning techniques as well as (action-augmented) neighborhood-based reasoning techniques. Finally, we introduce several calculi representing orientation information, as this kind of knowledge is most important for the task of modeling right-of-way rules.

2.1. Qualitative Spatial Representations

A qualitative spatial description captures distinctions between objects that make an important qualitative difference but ignores others. In general, objects are abstracted to geometric primitives, e.g. points, lines, or regions. The ability to focus on the important distinctions and ignore the unimportant ones is an excellent way to cope with incomplete knowledge [31]. Cohn and Hazarika [8] summarize that the essence of qualitative spatial reasoning is to find ways to represent continuous properties of the world, also called continuities, by discrete systems of symbols, i.e. a finite vocabulary. These symbols describe the relationships between objects in a specific domain and are therefore called relations. The domain is given by the set of objects, i.e. geometric primitives, considered. Quantization is the process of summarizing indistinguishable perceptions or values into an equivalence class. The result of quantization is a (generally finite) set of quantities, called a quantity space. Continuities can always be quantized, but it is a question whether the properties chosen are reasonable with respect to the given problem and the applied reasoning techniques. This means not all quantizations are equally useful regarding a certain problem. Forbus [15] summarizes this in the principle of relevance: the distinctions made by a quantization must be relevant to the kind of reasoning performed.

Relations may describe different aspects of space, such as topology (e.g. 'outside' or 'inside'), orientation (e.g. 'right', 'left', 'ahead' or 'behind'), location (e.g. 'here', 'on the market place', or 'Bremen'), distance (e.g. 'close' or 'distant'), size (e.g. 'small' or 'large'), or shape (e.g. 'cube', 'circle', etc.). However, an important characteristic of perceptual precision is that a series of small changes, each imperceptible, may combine to form a perceptible change. The indistinguishability relation, though reflexive and symmetric, is then not transitive, which can lead to a paradox of perception [8].

In general, quantity spaces possess an inherent ordering which may be partial or total. Implications can be drawn by exploiting the transitivity of this ordering information. Two such implications are, for example, a shift in perspective (converse operation) and the integration of local knowledge about two overlapping sets of objects into survey knowledge (composition operation). Simplified, if we know, for example, that object B is right of A, a change in perspective from A to B reveals that A is left of B (converse). From knowing that B is left of A and C is left of B, we can infer that C is also left of A (composition).

A complete model for a certain domain is called a qualitative calculus. It consists of the set of relations between objects from this domain and the operations defined on these relations. The first prominent calculus presented was Allen's Temporal Interval Algebra (IA) [1] in the temporal domain. For details on the IA and other temporal calculi we refer to [55]. Temporal calculi can be interpreted in the spatial domain as well. This led to the development of, for example, the cardinal direction calculus [35] and the Rectangle Calculus [3].
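The idea of quantization into a quantity space can be illustrated with a tiny sketch: continuous relative bearings are summarized into four qualitative orientation relations. The four 90-degree sectors are an arbitrary illustrative choice; by the principle of relevance, a different task might call for a finer or coarser quantization.

```python
# Quantization sketch: each continuous heading is mapped to the equivalence
# class (qualitative relation) of all headings in the same sector.

def quantize_heading(angle_deg):
    """Map a relative bearing in degrees (counter-clockwise, 0 = straight
    ahead) onto one of four qualitative orientation relations."""
    a = angle_deg % 360
    if a >= 315 or a < 45:
        return "front"
    if a < 135:
        return "left"
    if a < 225:
        return "back"
    return "right"

print(quantize_heading(10))    # a heading of 10 degrees counts as "front"
print(quantize_heading(90))    # 90 degrees counts as "left"
print(quantize_heading(-30))   # -30 degrees, i.e. 330 degrees, is "front"
```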
Freksa and Röhrig [23] classify approaches to qualitative spatial reasoning, roughly, into two groups: topological and positional calculi. Topological calculi are, for example, the region-based calculi RCC-5 and RCC-8 [46] or the Cyclic Interval Calculus [4]. Positional calculi, i.e. calculi dealing with orientation or distance information, are for example the Double Cross Calculus [20], the FlipFlop Calculus [34], the Dipole Relation Algebras [12], or the Oriented Point Relation Algebra [40].

2.1.1. A Simple Example: Boat Race

For a more vivid comprehension of what qualitative calculi are and how they can be used to formulate constraints (cf. 2.2.2), we give a simple example based on a boat race scenario borrowed from [36]. The underlying representation is the spatial version of the Point Algebra (PA) [58]. Primitive entities are 1D points on an oriented line, with pairs of points taking one of the relations behind, ahead, or same. Imagine a friend telling us on the phone about a boat race on a river. We can try to understand the story by modeling the river as an oriented line and the boats of the five participants A, B, C, D, E as points moving along the line (see Fig. 1). Thus, our domain is the set of all 1D points on the oriented line. According to the relations from the Point Algebra, we can now distinguish for each pair of boats whether one boat is ahead of the other boat, behind it, or on the same level. Using these relations to formulate the current situation in the race might lead to the following description of the scene:

1. A is behind B
2. E is ahead of B
3. A is behind C
4. D is on the same level as C
5. A is ahead of D

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,


F. Dylla / Qualitative Spatial Reasoning for Navigating Agents


Figure 1. A possible situation in a boat race which can be modeled by 1D points on an oriented line and can be described by qualitative relations from the Point Algebra.

From this information we are able to conclude that our friend must have made an error, probably confusing the names of the participants: We know that A is behind C (sentence 3) and D is behind A (conversion of sentence 5). From composing these two facts it follows that C and D cannot be on the same level, which contradicts sentence 4. On the other hand, taking only the first four sentences into account, we can conclude that E is also ahead of A by composing the facts A is behind B (sentence 1) and B is behind E (conversion of sentence 2). However, this information is not sufficient to derive the exact relation between C and E, as C can be ahead of, behind, or on the same level as E.
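This derivation can be reproduced mechanically. The following minimal sketch (the tables and identifiers are mine, following the relations described in the text) encodes the converse and composition of the three Point Algebra base relations and re-derives the contradiction; sets of base relations encode disjunctions, i.e. partial knowledge:

```python
# Converse: shifting perspective from one boat to the other.
CONVERSE = {'behind': 'ahead', 'ahead': 'behind', 'same': 'same'}

# Composition: x R1 y and y R2 z entail x (R1 o R2) z.
COMPOSE = {
    ('behind', 'behind'): {'behind'},
    ('behind', 'same'):   {'behind'},
    ('behind', 'ahead'):  {'behind', 'same', 'ahead'},   # undetermined
    ('same',   'behind'): {'behind'},
    ('same',   'same'):   {'same'},
    ('same',   'ahead'):  {'ahead'},
    ('ahead',  'behind'): {'behind', 'same', 'ahead'},   # undetermined
    ('ahead',  'same'):   {'ahead'},
    ('ahead',  'ahead'):  {'ahead'},
}

# Sentence 5 says 'A ahead of D'; its converse gives 'D behind A'.
d_rel_a = CONVERSE['ahead']
assert d_rel_a == 'behind'
# Composing 'D behind A' with sentence 3, 'A behind C': D must be behind C,
# so sentence 4 ('D on the same level as C') cannot hold.
assert COMPOSE[(d_rel_a, 'behind')] == {'behind'}
```

Note how the two undetermined entries return the full set of base relations: composition does not always yield new knowledge, exactly as in the C/E case above.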

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

2.1.2. Qualitative Orientation and Location

Orientation and location are important qualitative concepts because communicating about and dealing with orientation and location information plays a major role in everyday life. Orientation information is closely related to location. A location is a specific place or position of an object, e.g. ’here’ or ’in the garden’. Orientation describes a position (location) or alignment relative to specific directions. These directions may be defined relative to one or more other objects – including locations – (e.g. left of the road), or absolutely, i.e. independently of specific entities in the domain (e.g. cardinal directions). Further considerations on the dependence between location and orientation are, for example, given in [26, 52].

While we deal with binary relations in topology, orientation is a ternary relation. Orientation relationships are formed from the object, also called the primary or located object, a reference object, and a frame of reference. A frame of reference (FoR) defines the context in which a spatial statement, e.g. a spatial utterance or an object configuration, is to be understood. Thus, the FoR defines a framework for how objects are embedded in space, so that the position and direction of the objects involved can be resolved unambiguously. Levinson [33] introduces the terms origin, relatum, and referent. The origin denotes the center of the reference system, and the relatum is the reference object with respect to which the referent is to be located. If the frame of reference is fixed, orientation can be represented as a binary relationship with respect to the FoR [47]. This implies that the relatum is equal for all relations, and thus the relatum can be neglected. Nevertheless, orientation remains a ternary relation in general. Many approaches dealing with orientation are based on relationships between points in R².
Frank [16], for example, introduced two approaches to modeling cardinal directions. He partitions the given Euclidean plane P into regions with respect to a reference point R and an absolute west-east/south-north reference frame. Any located point L ∈ P then stands in one of the nine base relations North, NorthEast, East, SouthEast, South, SouthWest, West, NorthWest, or Equal to R.

In [17] two different partition schemas are introduced: the cone-shaped and the projection-based approach (cf. Fig. 2). In the cone-based model all relations, except Eq, represent planar cones; the linear borders are not individual relations but are assigned to one of the neighboring cones. In the projection-based model, also called the Cardinal (Direction) Algebra, NE, SE, SW, and NW form planar cones. The linear borders between these four relations are relations in their own right and are denoted by N, E, S, and W. The author argues that the projection-based approach is cognitively more adequate than the cone-based approach. For discussions on cognitive adequacy we refer to [20, 24]. The orientation ranges in the Cardinal Algebra are fixed. The Star Calculus extends the projection-based approach to representing absolute orientation in such a way that the segmentation of the orientation ranges can be chosen arbitrarily [49].


Figure 2. The cone-based relation model for points (a) and the projection-based variant (b).
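The projection-based partition can be sketched by comparing coordinates axis by axis; the function name and coordinate convention below are mine, not from the chapter:

```python
# Sketch of the projection-based (Cardinal Algebra) partition: each axis is
# compared separately, and the combination of the two per-axis results
# yields one of the nine relations (four cones, four borders, Equal).

def cardinal(L, R):
    (lx, ly), (rx, ry) = L, R
    ns = 'N' if ly > ry else 'S' if ly < ry else ''   # south-north axis
    ew = 'E' if lx > rx else 'W' if lx < rx else ''   # west-east axis
    return (ns + ew) or 'Eq'

assert cardinal((2, 3), (0, 0)) == 'NE'   # planar cone
assert cardinal((0, 5), (0, 0)) == 'N'    # linear border relation
assert cardinal((0, 0), (0, 0)) == 'Eq'
```

In the cone-based variant, by contrast, the borders N, E, S, and W would not appear as results but would be absorbed into a neighboring cone.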

Prominent calculi representing relative orientation information between point objects are the FlipFlop Calculus (FFC) [34] and the Double Cross Calculus (DCC) by Freksa [20]. Based on the classification of ternary point configurations into one of the three categories ’clockwise’, ’anti-clockwise’, and ’collinear’, Schlieder [52] developed the Dipole Calculus (DC) for reasoning about pairs of oriented line segments (dipoles). This approach was further extended to different variants of the Dipole Relation Algebra (DRA) in [12, 42]. The calculus of bipartite arrangements [26] can be seen as a mixture of DCC and DRA. We go into detail on the FFC, the DCC, and the DC and DRA in Sec. 2.3. Based on Schlieder’s basic classification, Röhrig developed a ternary representation where triples of points are classified on the basis of the CYCORD(x, y, z) function, which evaluates to true if x, y, and z are in clockwise orientation [50]. He also showed how other (not only orientation) calculi can be translated into CYCORD terms.

Orientation information for extended objects is much more uncertain and ambiguous than for points, e.g. due to complex shapes, and thus calculus development is much more involved. To avoid most of the problems arising from extended objects, a popular approach is to approximate, or prototype, the objects involved, for example by circles or by rectangles whose sides are aligned with the reference frame axes. Such approaches were considered, for example, in [2, 27]. But even if such uniformly shaped objects are given, there may exist several reasonable linguistic propositions for describing the configuration sufficiently [47]. A prominent example is the Rectangle Algebra (in R²), where each dimension is considered separately with respect to the 13 Allen relations, which leads to 13 × 13 base relations [2, 3]. The Rectangle Algebra not only captures the orientation between rectangles, but also their topological relation. Therefore this approach can be regarded as a unifying approach to orientation and topology [47].


A calculus combining cardinal direction relations and relative orientation relations is presented in [29]. Because of disadvantages concerning the computational properties of Freksa’s Double Cross Calculus², a coarsened variant of the DCC is used. It turns out that this variant is equal to the FlipFlop Calculus [34, 54], which has been shown to have much better reasoning properties. Isli et al. [29] present an example where the configuration is considered consistent if evaluated with each calculus separately, but is found to be inconsistent with the combined calculus.


2.1.3. Qualitative Distance

Distance is also a fundamental concept in everyday life. Contrary to topology and orientation, distance is a scalar entity. In most approaches to representing distance, points are chosen as basic entities. Generally, two main categories can be distinguished: absolute and relative distance. Absolute distance represents a direct comparison between two entities and can be represented quantitatively as well as qualitatively. Absolute quantitative measurements are, for example, 100 meters and 200 kilometers, which may be represented qualitatively by ’close’ and ’far’. Relative distance compares the distance between two spatial entities with regard to the distance to some third object. Representations of relative distance are generally qualitative; categories might be ’closer’ or ’farther’.

Reasoning with distance poses problems. In [47] two general problems are sketched. First, assume a set of n points p1, ..., pn on a line, where each pair (pi, pi+1) of points is regarded as ’close’. The question to ask is: for which n is pn far from p1? This is a variant of the indistinguishability problem stated earlier (cf. Sec. 2.1), which describes that several imperceptible changes may sum up to a perceptible one. In addition, combining distance information between objects dist(A, B) and dist(B, C) (composition) is problematic as well. The overall distance depends not only on the distances themselves, but also on the orientation between the two legs AB and BC. If both distances are ’far’ and the angle between the legs is close to π, the result might be something like ’very far’; but if the angle is small, i.e. close to zero, the resulting relation might be ’close’. Therefore, it is reasonable to regard distance in conjunction with orientation. Calculi dealing with orientation and distance conjoined are called positional calculi.

2.1.4. Representing Positional Information

By combining cardinal directions and relative distance, Frank [17] demonstrates the limitations of reasoning with orientation and distance information separately. A qualitative framework for combining relative orientation and relative distance is presented in [20] by determining the relative position of a point with respect to an oriented line segment between two other points³. A more sophisticated approach was taken in [61] by combining orientation with the Δ-Calculus. The Δ-Calculus is based on a ternary relation x(>, d, y) denoting that x is larger than y by an amount of d. Clementini et al. [7] propose a cone-based positional calculus with an absolute distance measure, regarding different variants of composition. They present algorithms for deriving the composition of relations with same, orthogonal, and opposite orientation, and also suggest an algorithm for the composition of relations with arbitrary orientation.

² The determination of satisfiability of constraint systems over Double Cross relations is NP-hard [53].
³ The basic idea is that the oriented line segment is determined by the start and end points of the motion of a single object.
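The dependence of distance composition on the angle between the legs can be illustrated quantitatively via the law of cosines; the category names and thresholds in this sketch are invented for illustration:

```python
# Illustrative sketch: compose two quantitative distances under a given
# angle at B and map the result back to qualitative categories, showing
# why composing 'far' with 'far' depends on the orientation of the legs.
import math

def qualitative(d):                      # assumed category boundaries
    return 'close' if d < 10 else 'far' if d < 100 else 'very far'

def compose_dist(d_ab, d_bc, angle_at_b):
    # law of cosines: |AC|^2 = |AB|^2 + |BC|^2 - 2|AB||BC|cos(angle at B)
    d_ac = math.sqrt(d_ab**2 + d_bc**2 - 2 * d_ab * d_bc * math.cos(angle_at_b))
    return qualitative(d_ac)

# Both legs 'far' (60 units). Legs nearly opposite (angle close to pi):
assert compose_dist(60, 60, math.pi) == 'very far'
# Legs nearly aligned (small angle): A and C end up close together:
assert compose_dist(60, 60, 0.1) == 'close'
```

A purely qualitative composition table for distance alone could not distinguish these two cases, which is exactly the argument for positional calculi.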


For calculi with finer granularity, i.e. making finer distinctions than same, opposite, and orthogonal (left/right), the composition is approximated by combining the results of the algorithms for these three prototypes. A similar approach to combining orientation and distance was taken in [13]. Liu [38] presents a qualitative abstraction of orientation and distance and derives a set of inference rules for combining the two sources of knowledge; the approach is called qualitative trigonometry and qualitative arithmetic. In [30] the combination of different orientation and relative distance approaches at different levels of granularity is investigated. Based on these results the projection-based Ternary Point Configuration Calculus (TPCC) with two distance categories and 12 orientation categories was developed [41]. TPCC was further investigated in [11].

2.1.5. Qualitative Change and Qualitative Motion

Muller [44] emphasizes that change is a central concept in spatio-temporal domains, as very often the configuration of the represented entities changes over time. According to Galton [25], change and time are two sides of the same coin, as one cannot have one without the other⁴. The concept of change can be regarded from different viewpoints. Worboys [59] distinguishes between the action of change and the results or effects of the change. Thus, two different definitions of change arise:


• process of change: an object o changes if and only if there exists a property P of o and distinct times t and t′ such that o has property P at t and o does not have property P at t′ (Greek philosophy, e.g. Aristotle).
• effect of change: a change occurs if and only if there exists a proposition Π and distinct times t and t′ such that Π is true at t but false at t′ [51].

Several communities stress the distinction between continuous and discrete (discontinuous) change, e.g. [25, 45]. Intuitively, discontinuous change happens from one moment to the next, i.e. property values change instantaneously. Continuous change, on the other hand, appears over time, i.e. property values vary continuously with time. For example, the position of a moving object changes continuously over time, but the property of being at the final position is discrete. One abstraction of change is the approach of conceptual neighborhood [18] of the relations of a qualitative calculus. Intuitively, conceptual neighborhood is a model of how the world can evolve in terms of transitions between qualitative relations. For further details we refer to Sec. 2.2.3.

Regarding change in spatial environments leads to the crucial concept of motion. Even though an object is perceived at two different locations at different time points, i.e. as two discrete perceptions, it is clear that due to object persistence some sort of continuous motion must have taken place [25]. Generally, motion is represented by trajectories; a trajectory is the path a moving body follows through space. Moving objects have been studied from multiple perspectives, e.g. spatio-temporal reasoning, autonomous agents, database modeling, or video analysis. An overview is given in [57]. Regarding the domain of autonomous agent control, Bennett [5] remarks that spatial processing in robot control systems relies on algorithms that are rather ad hoc from a logical point of view. In general, qualitative spatial representations are designed for representation purposes and thus do not make use of qualitative reasoning techniques. Although qualitative representations are incorporated in the world models, in many cases reasoning is achieved by calculating with quantitative prototypes and mapping the result back into the qualitative representation, e.g. [14].

⁴ Galton remarks that some philosophers take the standpoint that time is in principle possible without change.

2.2. Qualitative Spatial Reasoning

Based on qualitative representations and corresponding reasoning methods, computers can be enabled to monitor, diagnose, predict, plan, or explain the behavior of physical systems. In general, two categories of reasoning based on qualitative spatial representations can be distinguished: constraint-based reasoning and neighborhood-based reasoning. In constraint-based reasoning, relations of a spatial calculus are used to formulate constraints about a spatial configuration. This results in the specification of a spatial constraint satisfaction problem (CSP), which can be solved with specific reasoning techniques. In neighborhood-based reasoning, the inherent connectivity structure of the relations of a calculus is exploited. This method is based on the assumption of persistent, continuously moving objects. We introduce these two approaches below, after defining the structure of calculi and their operations on relations.

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

2.2.1. Qualitative Calculi and Operations on Relations

A qualitative spatial calculus defines operations on a finite set R of spatial relations, e.g. front, left, and others. The spatial relations are defined over a usually infinite set of spatial objects, the domain D. The entities represented by D might be points, regions, or other spatial primitives (cf. Sec. 2.1). For a start we consider binary calculi, in which R consists of binary relations R ⊆ D × D = D². Where necessary, we account for definitions for ternary or n-ary relations as well (R ⊆ D × ... × D = Dⁿ).

The set of relations R of a spatial calculus is typically derived from a jointly exhaustive and pairwise disjoint (JEPD) set of base relations (BR). Every relation in R is a union of a subset of the base relations. Since spatial calculi are typically used for constraint reasoning, and unions of relations correspond to disjunctions of relational constraints, it is common to speak of disjunctions of relations as well and to write them as sets {B1, ..., Bn} of base relations with Bi ∈ BR. Using this convention, R is then either taken to be the power set 2^BR of the base relations (all unions of base relations) or a subset of the power set. In order to be usable for constraint reasoning, R should contain at least the base relations Bi (∀i = 1, ..., n), the empty relation ∅, the universal relation U = D × D, and the identity relation Id = {(x, x) | x ∈ D}. R also needs to be closed under the operations defined below.

We assume R1 and R2 are n-ary relations from R and r is an n-tuple over the domain D, e.g. r = (x, y) with x, y ∈ D in the binary case. As the relations are subsets of tuples from the same Cartesian product, the set operations union (R1 ∪ R2 = { r | r ∈ R1 ∨ r ∈ R2 }), intersection (R1 ∩ R2 = { r | r ∈ R1 ∧ r ∈ R2 }), and complement (R̄ = U \ R = { r | r ∈ U ∧ r ∉ R }) can be applied directly. In addition, two more operations are defined which allow the derivation of new facts from given information: conversion and composition. In contrast to the set operations above, these two operations depend on the relation’s arity. Conversion can be interpreted as shifting perspective from one entity to another. Regarding the Point Algebra in the boat race example (cf. Sec. 2.1.1), the converse of ’A is behind B’ yields ’B is ahead of A’. For the binary case conversion is defined

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

F. Dylla / Qualitative Spatial Reasoning for Navigating Agents

107

as R˘ = { (y, x) | (x, y) ∈ R }. While there is only one way to permute the two objects of a binary relation (ignoring the identical permutation), which corresponds to the converse operation, there exist five such permutations for the three objects of a ternary relation (n! − 1 for n-ary relations). Therefore, in [62] five separate operations were introduced: inverse (Inv(R) = { (y, x, z) | (x, y, z) ∈ R }), shortcut (Sc(R) = { (x, z, y) | (x, y, z) ∈ R }), inverse shortcut (Sci(R) = { (z, x, y) | (x, y, z) ∈ R }), homing (Hm(R) = { (y, z, x) | (x, y, z) ∈ R }), and inverse homing (Hmi(R) = { (z, y, x) | (x, y, z) ∈ R }).

Composition allows for the integration of two relations if they share one common entity (in the case of binary relations). The composition of two relations may result in additional knowledge. For example, from knowing that ’E is ahead of B’ and ’B is ahead of A’ we can derive that ’E is ahead of A’ as well. The composition operation is often given in the form of look-up tables called composition tables. Composition is defined as R1 ◦ R2 = { (x, z) | ∃y ∈ D : ((x, y) ∈ R1 ∧ (y, z) ∈ R2) }. This kind of composition is also called strong composition. However, for most calculi no finite set of relations exists that includes the base relations and is closed under strong composition. In this case, a weak composition is defined instead: it takes the union of all base relations that have a non-empty intersection with the result of the strong composition. This yields the definition R1 ◦w R2 = { r | r ∈ BR ∧ r ∩ (R1 ◦ R2) ≠ ∅ }. Composition for ternary calculi is adapted from the binary definition: R1 ◦ R2 = { (w, x, z) | ∃y ∈ D : ((w, x, y) ∈ R1 ∧ (x, y, z) ∈ R2) }. We especially note that two entities must be shared by both relations, e.g. x and y in the definition above, and not only one as in the binary case. For considerations on composition with n-ary relations we refer to [9].
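The five ternary permutation operations can be sketched directly as tuple permutations (the implementation is mine; only the definitions come from the text):

```python
# Each operation from [62] permutes the (origin, relatum, referent)
# components of every tuple in a ternary relation R.

def inv(R):  return {(y, x, z) for (x, y, z) in R}   # inverse
def sc(R):   return {(x, z, y) for (x, y, z) in R}   # shortcut
def sci(R):  return {(z, x, y) for (x, y, z) in R}   # inverse shortcut
def hm(R):   return {(y, z, x) for (x, y, z) in R}   # homing
def hmi(R):  return {(z, y, x) for (x, y, z) in R}   # inverse homing

R = {('A', 'B', 'C')}
assert sc(R) == {('A', 'C', 'B')}
assert inv(inv(R)) == R        # inv is an involution
assert sci(hm(R)) == R         # hm and sci are mutually inverse cyclic shifts
```

Together with the identical permutation, these are exactly the 3! = 6 orderings of a triple, matching the n! − 1 count given above.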

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

2.2.2. Constraint-Based Reasoning and Consistency

The relations R of a spatial calculus are often used to formulate constraints about the spatial configuration of objects from the domain of the calculus. The resulting spatial constraint satisfaction problem (CSP) then consists of a set of variables V = {v1, ..., vn} (one for each spatial object considered) and a set of constraints C1, ..., Cm with Ci ∈ R. Each variable vi can take values from the domain of the utilized calculus. CSPs are often described as constraint networks (CN), which are completely labeled graphs CN = <V, l> where the node set is the set of variables of the CSP and the labeling function l : V × V → R labels each edge with the constraining relation from the calculus. A CSP is called atomic if all edges are labeled with base relations. Fig. 3(a) shows a CN for the boat race example based on the Point Algebra (cf. Sec. 2.1.1). The labels a, b, and s abbreviate the relations ahead, behind, and same. The variables vi stand for the individual boats A to E, abstracted as points. The edges not shown correspond to the universal relation U and are thus unconstrained.

A CSP is called consistent if an assignment of all variables to objects of the domain can be found that satisfies all the constraints. However, spatial CSPs usually have infinite domains, and thus backtracking over the domains cannot be used to determine consistency. Therefore, special techniques for CSPs with relational constraints have been developed [32]. Besides consistency, weaker forms of consistency called local consistencies are of interest in QSR, as they can be used to decide or approximate consistency under specific conditions. Roughly, they can be employed as a forward checking technique, reducing the CSP to a smaller equivalent one with the same set of solutions. Furthermore, in some cases they can be proven to be not only necessary but also sufficient for global


(a) The initial CSP. (b) Inconsistency detected after applying RDC = RDC ∩ (RDA ◦ RAC), because RDC = ∅.


Figure 3. A constraint network over Point Algebra relations regarding the boat race example in Sec. 2.1.1. The direction of the arrows denotes how to read specific relations, e.g. E is ahead of B. The edges not shown correspond to the universal relation U and thus are unconstrained.

consistency for the set R of relations of a given calculus. If this is only the case for a certain subset S of R, and this subset exhaustively splits R (which means that every relation from R can be expressed as a disjunction of relations from S), this at least allows one to formulate a backtracking algorithm that determines global consistency by recursively splitting the constraints and using the local consistency as a decision procedure for the resulting CSPs with constraints from S [32].

One important form of local consistency is path-consistency, which means that for every triple of variables (in binary CSPs) each consistent evaluation of the first two variables can be extended to the third variable so that all constraints are satisfied. Path-consistency can be enforced syntactically based on the composition and intersection operations, for instance with the algorithm by van Beek [56] in O(n³) for binary relations (O(n⁴) for ternary relations [11]), where n is the number of variables. Within the binary algorithm for determining path-consistency, two constraining relations over three variables are successively composed and intersected with the relation already known for the remaining pair of variables until a fixpoint is reached. For example, one step is to compose the constraining relations for AB and BC and to intersect the result with the constraining relation for AC. If Rik denotes the label of the edge between the nodes vi and vk, the operation for each step can be defined as follows: Rik = Rik ∩ (Rij ◦ Rjk). If the empty relation is deduced by this operation, the given configuration is definitely inconsistent. In Fig. 3(b) we determine the inconsistency of the boat race example in this way. However, this syntactic procedure does not necessarily yield the correct result with respect to path-consistency as defined above, e.g. if only weak composition is available. The same holds for syntactic procedures that compute other kinds of consistency.
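The syntactic fixpoint procedure just described can be sketched for the Point Algebra as follows (the identifiers and the network encoding are mine); on the boat race network it detects the inconsistency:

```python
# Sketch of syntactic path-consistency enforcement over the spatial
# Point Algebra: repeatedly refine R_ik with R_ik ∩ (R_ij ∘ R_jk).
from itertools import permutations

BASE = frozenset({'b', 's', 'a'})                 # behind, same, ahead
CONV = {'b': 'a', 's': 's', 'a': 'b'}
COMP = {('b', 'b'): {'b'}, ('b', 's'): {'b'}, ('b', 'a'): set(BASE),
        ('s', 'b'): {'b'}, ('s', 's'): {'s'}, ('s', 'a'): {'a'},
        ('a', 'b'): set(BASE), ('a', 's'): {'a'}, ('a', 'a'): {'a'}}

def compose(R1, R2):
    return {c for b1 in R1 for b2 in R2 for c in COMP[(b1, b2)]}

def path_consistent(variables, net):
    # net maps ordered pairs to sets of base relations;
    # missing pairs are unconstrained (universal relation).
    R = {(i, j): set(net.get((i, j), BASE))
         for i in variables for j in variables if i != j}
    for (i, j), rel in list(R.items()):           # converse-closure
        R[(j, i)] &= {CONV[b] for b in rel}
    changed = True
    while changed:                                # fixpoint iteration
        changed = False
        for i, j, k in permutations(variables, 3):
            refined = R[(i, k)] & compose(R[(i, j)], R[(j, k)])
            if not refined:
                return False                      # empty relation: inconsistent
            if refined != R[(i, k)]:
                R[(i, k)] = refined
                R[(k, i)] = {CONV[b] for b in refined}
                changed = True
    return True

# Sentences 1-5 of the boat race: inconsistent, as derived in the text.
net = {('A', 'B'): {'b'}, ('E', 'B'): {'a'}, ('A', 'C'): {'b'},
       ('D', 'C'): {'s'}, ('A', 'D'): {'a'}}
assert path_consistent('ABCDE', net) is False
# Dropping sentence 5 yields a consistent description:
del net[('A', 'D')]
assert path_consistent('ABCDE', net) is True
```

Note that for the Point Algebra this syntactic procedure happens to behave well; as stated above, with only weak composition it does not in general coincide with path-consistency.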
Whether syntactic consistency coincides with semantic consistency with respect to the domain needs to be investigated for each calculus individually. For an in-depth discussion we refer to [37, 48].

2.2.3. Neighborhood-Based Reasoning and Action

The notion of conceptual neighborhood was introduced by Freksa [18]. Conceptual neighborhood extends static qualitative representations by interrelating the discrete set of base relations through the temporal aspect of transformations of the basic entities.


Two spatial relations of a qualitative spatial calculus are conceptually neighbored if they can be continuously transformed into each other without passing through a third relation in between [18]. The definition of conceptual neighborhood originates from work on time intervals, and thus only continuous transformations of intervals (shortening, lengthening, and shifting) were considered. Later, the definition was also interpreted spatially. For moving objects we can say that two relations are conceptual neighbors if continuous motion of the objects can cause an immediate transition between these two relations. Similar considerations are proposed by Ligozat [34] and Galton [25]. For instance, imagine two of the boats in the boat race example. The possible relations in our representation are behind, same, and ahead. The vessels are able to move forward with changing speed. In the configuration shown in Fig. 1, vessel A is behind B. Observing the scene a few minutes later shows that A is now ahead of B. But this is not the whole story: assuming continuous motion, it is not possible for A to overtake B without passing B at some time, i.e. being on the same level. Therefore, ahead and behind are both conceptual neighbors of same, but ahead and behind are not conceptual neighbors of each other.


Figure 4. The relations of the Point Algebra arranged as a conceptual neighborhood graph CNG.

The conceptual neighborhood relation, denoted by ∼, between the base relations BR of a qualitative calculus is often described in the form of a conceptual neighborhood graph (or diagram, or structure) CNG = <BR, ∼>, as illustrated in Fig. 4 for the Point Algebra. A set of base relations which is connected in the CNG is called a conceptual neighborhood. For convenience, we introduce a function cn : BR → 2^BR which yields all conceptual neighbors of a given base relation b ∈ BR: cn(b) = { b′ | b ∼ b′ }.

The term continuous transformation is a central concept in the definition of conceptual neighborhood. Detailed investigations of different aspects of continuity are, for example, presented in [6, 25, 43]. Conceptual neighborhood on the qualitative level corresponds to continuity on the geometric or physical level: continuous processes map onto identical or neighboring classes of descriptions [21]. However, the term continuous with regard to transformations needs a grounding in spatial change over time. We define continuous transformation as continuous motion of a moving agent, e.g. a robot R. This can be described by the function pos(R) : T → P, where T is a set of times and P is a set of possible positions of R. Assuming that T and P are topological spaces, the motion of R is continuous if the function pos(R) is continuous [25].

Conceptual neighborhoods and neighborhood-based reasoning are suitable models of how the world can evolve in terms of transitions between qualitative relations. Nevertheless, for tasks like navigation, action planning, and behavior monitoring and interpretation, it is crucial that the CNGs reflect the properties and capabilities of the represented agents, so that neighborhood induces direct reachability in the physical world. In its general form a CNG represents arbitrary dynamics of the objects involved.
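For the Point Algebra, the CNG of Fig. 4 and the function cn can be sketched directly (the encoding is mine):

```python
# behind ~ same ~ ahead, but behind and ahead are not neighbors:
# continuous overtaking must pass through 'same'.

CNG_EDGES = {('behind', 'same'), ('same', 'ahead')}   # undirected ~ edges

def cn(b):
    """All conceptual neighbors of the base relation b."""
    return ({y for (x, y) in CNG_EDGES if x == b} |
            {x for (x, y) in CNG_EDGES if y == b})

assert cn('same') == {'behind', 'ahead'}
assert cn('behind') == {'same'}           # no direct jump to 'ahead'
```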


If two objects are in relation r, then the conceptual neighborhood only states that for any r′ ∈ cn(r) there exists some action causing a transition from r to r′. Many of these changes are not applicable at all, or are most unlikely to occur, for agents or robots in the real world. Thus, conceptual neighborhood in its original definition is not sufficient from an agent control perspective. In [10] the notion of conceptual neighborhood was extended to the action-augmented conceptual neighborhood (ACN), which includes an explicit representation of the actions causing a change in the relation between two objects. Overall, three main aspects affect the action-augmented conceptual neighborhood graph (ACNG) for a given spatial calculus in the context of robot navigation: 1) the robot kinematics (motion capabilities), 2) whether the objects may move simultaneously, and 3) whether objects may coincide in position (superposition). For example, if we reconsider the boat race example with ’A is behind B’ and assume that B is definitely faster than A, it will never happen that ’A is on the same level as B’.

For representing the conceptual neighbors of a relation with regard to specific actions we introduce a refined neighborhood function acn:

acn(r, a1, a2) = { r′ | O1 r O2 ∧ r ∼ r′ ∧ r′ is possible if [ object O1 performs action a1 ∧ object O2 performs action a2 ] },

where r is the current relation between the two objects O1 and O2, a1 is the action performed by O1, and a2 is the action performed by O2. acn(r, a1, a2) returns the set of neighboring relations with regard to actions a1 and a2. Considering random actions (∗), which deliver arbitrary motion behavior for both objects, acn(r, ∗, ∗) is equal to cn(r).
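A toy instantiation of acn for the boat race can be sketched as follows; the action model (speed classes) and the transition conditions are invented for illustration and are not from the chapter:

```python
# Actions are speed classes; a neighboring relation is reachable only if
# the relative speed of O1 with respect to O2 permits the transition.

CN = {'behind': {'same'}, 'same': {'behind', 'ahead'}, 'ahead': {'same'}}
SPEED = {'slow': 0, 'medium': 1, 'fast': 2}

def acn(r, a1, a2):
    """Conceptual neighbors of r reachable if O1 performs a1 and O2 performs a2."""
    dv = SPEED[a1] - SPEED[a2]        # relative speed of O1 wrt O2
    reachable = set()
    for n in CN[r]:
        if r == 'behind' and n == 'same' and dv > 0:
            reachable.add(n)          # O1 only catches up if it is faster
        elif r == 'ahead' and n == 'same' and dv < 0:
            reachable.add(n)
        elif r == 'same' and ((n == 'ahead' and dv > 0) or
                              (n == 'behind' and dv < 0)):
            reachable.add(n)
    return reachable

# If B is definitely faster than A, 'A on the same level as B' is never reached:
assert acn('behind', 'slow', 'fast') == set()
assert acn('behind', 'fast', 'slow') == {'same'}
```

The restriction acn ⊆ cn is visible here: the action arguments only filter the neighbors that unconstrained motion would allow.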


2.3. Qualitative Calculi Representing Relative Orientation Information

In this section we give examples of several important calculi representing relative orientation information between point objects. We describe the FlipFlop Calculus [34], the Single Cross Calculus and the Double Cross Calculus [20], the Dipole Relation Algebra [42, 52], and the Extended Oriented Point Relation Algebra (OPRAm) [10]. In Sec. 3 we show that only the latter calculus is sufficient to formalize agent behavior adequately.

2.3.1. The FlipFlop Calculus and the Single/Double Cross Calculus

The FlipFlop Calculus (FFC), the Single Cross Calculus (SCC), and the Double Cross Calculus (DCC) are all ternary calculi that describe the orientation of a point C (the referent) with respect to a point B (the relatum) as seen from a third point A (the origin). The differences lie in the frames of reference, and thus in how many different orientations are distinguishable, i.e. in the number of base relations. A ternary relation rel holding between A, B, and C is written as A, B rel C. In the FFC, proposed in [34], a point C can be to the left or to the right of the oriented line going through A and B, or C can be placed on the line, resulting in one of the five relations inside, front, back, start (C = A), or end (C = B) (cf. Fig. 5(a)). Because of their position on the reference line these relations are called linear relations, whereas r and l are planar relations. Relations for the case where A and B coincide were not included in Ligozat’s original definition. This was done with the LR refinement [54]

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,


F. Dylla / Qualitative Spatial Reasoning for Navigating Agents

that introduces the relations dou (A = B ≠ C) and tri (A = B = C)^5 as additional relations, resulting in a total of 9 base relations. Fig. 5(a) depicts A, B r C.

Figure 5. The reference frames of the FlipFlop Calculus (a: A, B r C), the Single Cross Calculus (b: A, B 5 C), and the Double Cross Calculus (c: the two SCC reference frames resulting in the overall DCC reference frame, A, B 9 C), with exemplary relations.
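The left/right dichotomy underlying the FFC can be computed with a single cross product. The following sketch is our illustration, not the authors' code; in particular, the orientation convention that "f" lies beyond B and "b" behind A is an assumption.

```python
def flipflop(a, b, c):
    """LR-refined FlipFlop relation of referent c w.r.t. origin a and
    relatum b.  Points are (x, y) tuples; returns one of
    l, r, f, b, i, s, e, dou, tri."""
    if a == b:
        return "tri" if c == a else "dou"
    if c == a:
        return "s"                      # start: C = A
    if c == b:
        return "e"                      # end: C = B
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    if cross > 0:
        return "l"                      # left of the oriented line A -> B
    if cross < 0:
        return "r"                      # right of it
    # C lies on the line: distinguish front, back, inside
    dot = (bx - ax) * (cx - ax) + (by - ay) * (cy - ay)
    if dot < 0:
        return "b"                      # behind A (assumed meaning of 'back')
    if dot > (bx - ax) ** 2 + (by - ay) ** 2:
        return "f"                      # beyond B (assumed meaning of 'front')
    return "i"                          # inside the segment
```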

In the Single Cross Calculus (SCC) the plane is partitioned into regions by the line going through A and B and the perpendicular line through B [20]. This results in eight distinct orientations. Freksa introduced linguistic prepositions as well as the numbers 0 to 7 for denoting the relations: straight-front (0), left-front (1), left-neutral (2), left-back (3), straight-back (4), right-back (5), right-neutral (6), and right-front (7). Relations 0, 2, 4, 6 are linear ones, while relations 1, 3, 5, 7 are planar. The additional relation bc denotes the case where B = C. With the relations dou and tri this yields 9 relations overall. Fig. 5(b) depicts A, B 5 C. The DCC, also proposed in [20], is the combination of two Single Cross relations, the first describing the position of C wrt. B as seen from A and the second wrt. A as seen from B (cf. Fig. 5(c) (left)), resulting in a partition distinguishing 13 relations (7 linear and 6 planar) and four special cases, A = C ≠ B (a), B = C ≠ A (b), A = B ≠ C (dou), and A = B = C (tri), resulting in 17 base relations overall. Fig. 5(c) depicts the relation A, B 9 C^6.

2.3.2. Dipole Calculus and Dipole Relation Algebras (DRA)

Schlieder [52] proposes the Dipole Calculus (DC), a two-dimensional generalization of Allen's interval relations that permits the representation of the relative position of two directed line segments. Schlieder thereby exploits that linear ordering is the one-dimensional specialization of generalized n-dimensional point ordering. The directed line segments, also called dipoles, are used for representing spatial objects with an intrinsic orientation. A dipole A is defined by two points, the start point sA and the end point eA. Schlieder assumes four pairwise different points in general position, i.e. that no three points lie on one line. Thereby a left/right dichotomy is specified for each point with regard to the dipole the point is not contained in, which leads to 14 base relations^7.
Each base relation is a quaternary tuple (r1, r2, r3, r4) of, so to speak, restricted FlipFlop relations relating a point from one of the dipoles with the other dipole. r1 describes the relation of sB with respect to the dipole A, r2 of eB with respect to A, r3 of sA with respect to B, and r4 of eA with respect to B. Due to the restrictions on the point positions only left and right are possible for each ri with i ∈ {1, ..., 4}. Dipole relations are usually written without commas and parentheses, e.g. rrll. The example in Fig. 6 shows the relation A rlll B.

^5 In [54] the relations were originally named e12 (dou) and eq (tri).
^6 Originally the relation names were derived from the relation tuples defined by the two individual reference frames, e.g. 5_3 corresponds to 9 or 4_a corresponds to a.
^7 Out of the 2^4 = 16 configurations only 14 are geometrically realizable.

Figure 6. A dipole configuration: A rlll B in the Dipole Calculus (DC), Dipole Relation Algebra respectively.

Figure 7. A bipartite arrangement: A 72 B, A FOmr B respectively^8.
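Since each component of a DC base relation is just a left/right test, the quaternary tuple can be computed directly. A small sketch (our illustration, assuming points in general position so that only "l" and "r" occur):

```python
def side(p, q, x):
    """'l' if x lies left of the oriented line p -> q, else 'r'
    (general position assumed, so x is never exactly on the line)."""
    (px, py), (qx, qy), (xx, xy) = p, q, x
    return "l" if (qx - px) * (xy - py) - (qy - py) * (xx - px) > 0 else "r"

def dipole_relation(sa, ea, sb, eb):
    """DC base relation of dipole B w.r.t. dipole A as the string r1 r2 r3 r4:
    sB and eB against dipole A, then sA and eA against dipole B."""
    return (side(sa, ea, sb) + side(sa, ea, eb) +
            side(sb, eb, sa) + side(sb, eb, ea))
```

For a configuration like Fig. 6 (A from (0,0) to (2,0), B from (3,−1) to (3,1)) this yields "rlll".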

Several more fine-grained variants, called Dipole Relation Algebras (DRA), with stepwise relaxation of the general position constraint have been proposed in [12, 42]. Combining ideas from Allen, Schlieder, and Freksa (cf. Sec. 2.3.1), Gottfried [26] introduced the calculus of bipartite arrangements to represent positional information in a very sophisticated manner (cf. Fig. 7). The Qualitative Trajectory Calculus, another approach for representing relative orientation between line segments, is presented by van de Weghe [57].


2.3.3. The Oriented Point Relation Algebra (OPRAm) and its Refinement

The domain of the Oriented Point Relation Algebra (OPRAm) [39, 40] is the set of oriented points (points in the plane with an additional direction parameter). The calculus relates two oriented points with respect to their relative orientation towards each other. An oriented point O can be described by its Cartesian coordinates xO, yO ∈ R and a direction φO ∈ [0, 2π] with respect to a reference direction, and thus D = R² × [0, 2π]. The OPRAm calculus is suited for dealing with objects that have an intrinsic front or move in a particular direction and can be abstracted as points. The exact set of base relations distinguished in OPRAm depends on the granularity parameter m ∈ N. For each of the two related oriented points, m lines are used to partition the plane into 2m planar and 2m linear regions. Fig. 8 shows the partitions for the cases m = 2 (a) and m = 4 (b). The orientation of the two points is depicted by the arrows starting at A and B, respectively. The regions are numbered from 0 to 4m − 1; region 0 always coincides with the orientation of the point. An OPRAm base relation consists of a pair (i, j) where i is the number of the region of A which contains B, while j is the number of the region of B that contains A. These relations are usually written as A m∠_i^j B with i, j ∈ Z_4m^9. Thus, the examples in Fig. 8 depict the relations A 2∠_7^1 B and A 4∠_13^3 B. Additional base relations called 'same' relations describe situations in which both oriented points

^8 Relation FOmr is derived from Allen's relations [1] directly: B is in the Front of A and Overlaps. Additionally, the points of B are in the middle and to the right of A.
^9 Z_4m defines a cyclic group with 4m elements ({0, ..., 4m − 1}).


coincide. In these cases, the relation is determined by the number s of the region of A in which the orientation arrow of B is positioned (as illustrated in Fig. 8(c)). These relations are written as A m∠s B (A 2∠1 B in the example). The total number of base relations with respect to granularity m is (4m)² + 4m.

Figure 8. Two oriented points related at different granularities: (a) m = 2: A 2∠_7^1 B; (b) m = 4: A 4∠_13^3 B; (c) the case where A and B coincide: A 2∠1 B.


Regarding navigation purposes OPRAm is not expressive enough to define reasonable behavior in some situations. For example, if we represent moving agents by oriented points, relation 2∠_7^3 (cf. Fig. 9(a)) subsumes configurations in which the agents approach, move in parallel, or depart. Therefore, OPRAm was refined by splitting all relations similar to the one above [10]. The o-points can be aligned parallel (P), mathematically positive (+) or negative (−), or opposite-parallel (O). Refined relations are written as α m∠_i^j with alignment α ∈ {P, +, −, O}. Overall, only 4m relations need to be split into three refined relations each ({−, P, +} or {+, P, −}), as for the other relations the alignment is fixed. The function align(r) = O_r returns the set O_r of valid alignments for the OPRA relation r. For example, the OPRAm relation 2∠_7^3 can be refined into the three relations +2∠_7^3, P2∠_7^3, and −2∠_7^3 (cf. Fig. 9(b)).

Figure 9. The OPRA relation 2∠_7^3 (a) and its refinement into +2∠_7^3, P2∠_7^3, and −2∠_7^3 (b).
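The alignment component itself is only a classification of the heading difference. A sketch (our illustration; which half of the heading differences is called "+" and which "−" is an assumption):

```python
import math

def alignment(heading_a, heading_b, tol=1e-9):
    """P (parallel), O (opposite-parallel), + or - for the heading
    difference of two o-points; the sign convention is assumed."""
    d = (heading_b - heading_a) % (2.0 * math.pi)
    if math.isclose(d, 0.0, abs_tol=tol) or math.isclose(d, 2.0 * math.pi, abs_tol=tol):
        return "P"
    if math.isclose(d, math.pi, abs_tol=tol):
        return "O"
    return "+" if d < math.pi else "-"
```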

3. Agent Control Based on Qualitative Calculi

In this section we describe why qualitative calculi are a reasonable means for describing agent behavior. As an example we apply the OPRAm calculus to define rule-compliant behavior in sea traffic scenarios as given by the international collision regulations^10

^10 International Collision Regulations (COLREGs) by the International Maritime Organization (IMO), http://www.imo.org/Conventions/contents.asp?doc_id=649&topic_id=257 (October 2008)


(COLREGs) [28]. Representations of rule-compliant behavior, of course, are not limited to navigation. Examples of rule sets guiding the behavior of agents can also be found in sports, in games, in expert recommendation systems, and so on. Rule sets need to be made explicit and formalized at different stages when artificial agents or multiagent systems are specified or implemented. To do so, we first need to derive the action-augmented neighborhood structure for OPRAm relations. Afterwards, based on the ACNG, the rules have to be transferred into rule transition systems (RTSs) which can be applied for vessel control. In general, the same considerations apply if it is intended to describe people's behavior which is monitored within an environment. In such a context the ACNG does not serve as a means for controlling behavior but as a means for explaining and evaluating behavior. Everyday life of humans is in many situations guided by regulations and recommendations. Rule systems, like traffic regulations, have in common that they are generally formulated in natural language and thus contain qualitative terms for describing spatial situations and the actions which are expected to be adequate in these situations, i.e. rule-compliant behavior. Especially qualitative orientation terms like 'from the left' or 'behind' and qualitative action terms like 'turn left' constitute a fundamental part of such rule sets. In many cases no specifications are given on speed, distance, or temporal limitations. It is implicitly assumed that the actions given in regulations are safe for the purpose of the desired effect. For example, if two vessels are in a head-on situation and distant from each other regarding their current speed, a small turn is sufficient. In contrast, if the agents are close to one another, a hard turn might be necessary. The COLREGs are given in natural language and all rules are defined for pairs of vessels.
For example, Rule 14 states: "When two power-driven vessels are meeting on reciprocal or nearly reciprocal courses (i.e. head-on or nearly head-on) so as to involve risk of collision each shall alter her course to starboard (right) so that each shall pass on the port side of the other (left side)." Artificial cognitive agents that interact with humans must be able to process such rule sets. This entails that an agent must be able not only to localize itself in physical space, but also to classify itself in the normative space of regulations and laws. Considering the integration of an agent into the normative space, not only spatial relations are important. Most of the rules to be followed by an agent in some situation depend on the types of agents involved in the situation. Assuming different types in a specific context might lead to different decisions about what an agent is allowed to do or which behavior to show according to a rule set, i.e. a different rule has to be applied. For example, in vessel navigation the behavior of a vessel depends on the types of vessels involved in the current situation. A motor vessel has to show different behavior if it meets another motor vessel than if it meets a sailing vessel. Thus, an agent must perceive its current spatial situation, identify rules that might be relevant in this spatial situation with respect to the types of agents involved, and finally select appropriate, but nevertheless rule-compliant, actions. Qualitative representations allow us to mediate between the metrical information perceivable to the agents and the abstract conceptual knowledge expressed in the rule set. Each rule from such a rule set comprises some particular qualitative configurations (meet on reciprocal or nearly reciprocal courses so as to involve risk of collision) and allows specific actions for the agents involved (shall alter her course to starboard so that ...).
We combine constraint- and neighborhood-based reasoning methods to infer admissible actions from a set of "physically" possible actions. Admissible actions are actions


which are not only physically possible but at the same time also in compliance with the rule set. For a start we restrict ourselves to considering orientation information primarily; distance plays a subordinate role only. As the underlying qualitative representation we use OPRAm together with its composition and action-augmented neighborhood structures, which we derive in Sec. 3.1. In our application we neglect the spatial extent of the objects, as we are interested in deriving general strategies for navigation. If the objects are in immediate proximity one could model the local environment with additional relations representing the extents of the objects.

3.1. Action-Augmented Conceptual Neighborhood for OPRAm


We now derive exemplary parts of the action-augmented conceptual neighborhood graph (ACNG) for OPRA2, which can easily be generalized to m = 2k with k ∈ N. The orientations of the o-points represent the intrinsic fronts of the objects represented. An ACNG is influenced by the kinematic capabilities of the objects involved, the number of objects moving simultaneously, and whether objects can take the same position in our representation or not. In diagrams a possible neighborhood transition is visualized by an arrow. As we only refer to o-points in the following, we will simply write A for an o-point. With respect to distance, translational and rotational velocity of the objects involved, an infinite number of different actions is possible. To systematically derive the ACNG we restricted our investigations to the following primitive actions, assuming that motion is performed relative to the intrinsic front of the object, i.e. to the orientation of the o-point: no movement, rotation to the left/right on the spot, straight forward/backward motion, sidewards motion, and circular motion. More complex motion patterns might be approximated by the primitive ones. To sketch the process of deriving the ACNG for OPRAm relations we restrict ourselves to actions where agents do not move (Stable) or rotate on the spot (Left, Right). To exemplify the limitations of OPRAm relations we consider circular motion. For further details and other action primitives we refer to [10].

3.1.1. Single Rotating Object

First, we derive the ACNG for cases where only one object rotates and the other is stable. Imagine a robot R standing in a room together with a stable object with a fixed intrinsic front, for example, a locker (L). Rotating on the spot will lead to a change in the relative position of the locker compared to the robot's own intrinsic front, but the robot's relative position to the locker does not change.
Therefore, considering the OPRAm relation representing the situation (R x m∠_i^j L), only changes in the orientation i or the alignment x need to be considered. In general, rotating right by R results in i' = i + 1 (cf. Fig. 10) and rotating left in i' = i − 1. If x m∠_i^j is a refined relation, i may also stay unchanged while the alignment changes: from + to O to − or from − to P to +. Reversing the roles of L and R entails the same changes in j and the inverse changes in alignment.

3.1.2. Both Objects Rotating Simultaneously

If both objects are allowed to rotate simultaneously, additional neighborhood transitions are possible. For determining the additional neighborhood relations we need to consider the speed of rotation, although it is not represented in the calculus. If both objects rotate simultaneously on the spot, we need to consider three cases regarding x m∠_i^j: 1) A rotates

Figure 10. Exemplary part of the neighborhood structure if R is rotating to the right and L is not moving: +2∠_1^2 → P2∠_2^2 → −2∠_3^2 → … → −2∠_5^2 → −2∠_6^2 → −2∠_7^2 → … → +2∠_1^2.

faster than B, 2) A and B rotate with the same speed, and 3) A rotates slower than B. For example, let us assume A O2∠_0^0 B with both objects rotating to the right. We illustrate the potential results in Fig. 11. As both objects rotate to the right, both zero values increase in all cases. In the first case, we end up in −2∠_1^1, because the angle traversed by A is greater than the angle traversed by B. In the second case, the angles rotated by A and B are equal and so the alignment stays unchanged (O2∠_1^1). In the third case, the angle is lower and so we end up in +2∠_1^1. This schema is applicable to any relation x m∠_i^i with i denoting a linear region:

acn(x m∠_i^i, Right, Right) = { y m∠_{i+1}^{i+1} | y ∈ align(m∠_{i+1}^{i+1}) }.
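The rotation schemata just derived can be written down compactly. The sketch below is our illustration; relations are encoded as triples (x, i, j), and the align function (the set of alignments valid for a sector pair, fully tabulated in [10]) is passed in as a parameter because it is not reproduced here.

```python
def acn_single_rotation(m, x, i, j, direction):
    """Neighbors when only the first o-point rotates on the spot:
    its own sector i changes by +/-1 modulo 4m (the alignment boundary
    crossings described in the text are omitted in this simplified sketch)."""
    step = 1 if direction == "right" else -1
    return {(x, (i + step) % (4 * m), j)}

def acn_both_right_linear(m, i, align):
    """The schema for a linear relation x m∠_i^i with both objects rotating
    right: all alignment variants of m∠_{i+1}^{i+1}."""
    n = (i + 1) % (4 * m)
    return {(y, n, n) for y in align(n, n)}
```

With a toy align function returning {−, O, +} for every sector pair, acn_both_right_linear(2, 0, ...) reproduces the three outcomes shown in Fig. 11.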

Figure 11. The action neighborhood for O2∠_0^0 if both objects rotate to the right on the spot. If A rotates faster than B, the result is −2∠_1^1; if both agents rotate with the same speed, O2∠_1^1; and if A rotates slower than B, the result is +2∠_1^1.

Figure 12. The action neighborhood for O2∠_1^1 if both objects rotate to the right on the spot. If A rotates faster than B, the result is −2∠_2^1; if both agents rotate with the same speed, O2∠_2^2; and if A rotates slower than B, the result is +2∠_1^2.

For x m∠_i^i with x ∈ align(m∠_i^i) and i denoting planar regions we obtain different neighborhood relations. For these cases three different alignments are possible: negative, opposite-parallel, and positive. Considering O2∠_1^1 (cf. Fig. 12) yields −2∠_2^1, O2∠_2^2, and +2∠_1^2. In contrast to linear relations the action outcome is not unique with respect to the ratios of rotational velocities. The results for −2∠_1^1 are −2∠_2^1, O2∠_2^2, and O2∠_1^1; and the results for +2∠_1^1 are O2∠_1^1, +2∠_2^2, and +2∠_1^2. Again, the schema can be applied to any x m∠_i^i with x ∈ align(m∠_i^i) and i denoting planar regions. The action-augmented neighbors for other relations as well as for other action tuples containing rotation actions, e.g. acn(r, Left, Right), can be derived in a similar manner. The neighborhood structure regarding translational actions can likewise be deduced systematically by considering only one or both objects moving. Similar to rotation actions, additional relational transitions are possible depending on the speed ratio of the vessels. Other straight movements like sidewards motion orthogonal to the orientation of the object can be derived by assuming a rotation on the spot towards the direction of movement, a translational movement, and a final rotation back. Additionally, the ACNG


for configurations where point positions may coincide needs to be derived. A complete overview is given in [10].

3.1.3. Circular Motion

As the qualitative representation abstracts from any distance or speed information, there are motion patterns that cannot be represented within the ACNG. One of these patterns is circular motion. We define circular motion as motion around a center point along the boundary of a circle with a fixed diameter. In Fig. 13 (top row) we depict the neighborhood trace for A rotating right on the spot. The bottom row shows that the same trace is induced if A is stable and B moves in a circle around A. This means that for a given sequence of qualitative perceptions the interpretation of which actions were executed is not unique if the individual motion of the objects is unknown or not considered.

Figure 13. Rotation on the spot to the right by A induces circular motion of B to the left with its center in A.

The action neighborhood for circular motion strongly depends on the relative distance of A and B to the center point of the circular motion. For linear relations we can distinguish seven different classes of center points in m = 2 which influence the action neighborhood if one object moves. We illustrate them in Fig. 14, assuming that object B moves. The first center is object B itself, which we already investigated as rotation on the spot in Sec. 3.1.1. The second possible center is A, which bears the same neighborhood structure as if A rotates on the spot (cf. Fig. 13). The next distinctive point is the middle point between A and B, such that the neighborhood trace contains a 'same' relation. The fourth class of points is also on the line between A and B but closer to B; thus B will always stay on the same side with regard to A (e.g. in Fig. 14 on the right side of A). The same holds for center points which are behind B as seen from A. The sixth class of points is between A and B but closer to A. The characteristic of these neighborhood traces is that the moving object is already turned towards the surrounded object when the other object's front is passed, and consequently the distance between both objects decreases. For the last category of center points, on the line behind A as seen from B, it is the opposite, i.e. the distance still increases when the other half plane is entered. For relations containing planar regions several subclasses can be determined which depend on much finer distinctions between the relative distances of A and B to the center point. For granularities larger than two, additional classes arise if finer distance distinctions are considered (cf. Fig. 15). For center points which are behind B as seen from A it makes a difference at granularity m = 4 how far behind B the point is.
Because the number of potential neighbors concerning circular motion is very high, it is not reasonable to consider circular motion in applications which use the OPRAm ACNG for action planning or similar tasks. So, if circular motion is essentially needed

Figure 14. Different circular trajectories of an object B moving around A with respect to different center points.

Figure 15. The importance of distance information regarding neighborhood traces for circular motion.

and cannot be approximated by straight linear translation and rotation, from our point of view it is explicitly necessary to integrate distance into the underlying representation.

3.2. The Qualitative Framework

For demonstrating how qualitative spatial representations and techniques can be applied to agent control in the domain of vessel navigation we developed a qualitative framework for a demonstrator application, in the following also called SailAway. Most of the techniques applied, though, carry over to other navigation and BMI scenarios. We first give a general overview of SailAway. Afterwards, we present how so-called rule transition systems are derived from the natural language description in a stepwise manner. Finally, we give examples of how these transition systems can be utilized for agent control.


3.2.1. General Overview

In Fig. 16 we depict the overall architecture of SailAway. The environment is an open sea scenario with vessels moving towards different targets. The environment is represented to the agent by a qualitative scene description that contains qualitative information about the current relative position for each pair of vessels in sight of each other. In the context of vessel navigation, position information, i.e. information about direction and distance, is essential. In particular, orientation information is required to differentiate spatial constellations as described by navigation rules. Currently, distance information plays only a subordinate role in our approach. We use such information only to distinguish those vessels that are close enough to other vessels that they need to be considered when navigation rules are evaluated. Navigation rules restrict the possibilities of agents to act in space. For representing possible actions and their effects in a formal model, spatial and temporal information needs to be combined. The qualitative rule representation encodes relevant parts of the collision regulations, e.g. how two power-driven vessels have to behave if they are in danger of collision. On the basis of rule formalizations, configurations can be classified and admissible actions for an agent can be deduced. An appropriate formalization is key to an accurate modeling of the rules and essential for empowering effective reasoning. The formalization serves as a double link; it links continuous real-world configurations to discrete classes of configurations, and it links rule descriptions to symbolic representations. These rule representations are derived from the ACNG as developed in Sec. 3.1. Due to the definitions in the COLREGs, m = 4 is a suitable granularity for the OPRAm relations.

Figure 16. The qualitative methods underlying SailAway: a qualitative scene description of the environment and a qualitative rule representation are related by symbolic reasoning, which yields action primitives.

Finally, the current scene description and the rule representations which apply to the current situation are related by symbolic reasoning, such that an action primitive can be selected. The symbolic reasoning comprises the extrapolation of the current scene description based on the effect of rule-compliant actions, and the deduction of a consistent scenario in terms of constraint-based reasoning from the set of extrapolated scene descriptions. The action primitives for controlling the vessels can be derived directly from the consistent scenario. The actions are executed immediately and affect the qualitative scene description perceived by the agents.

3.2.2. Rule Transition Systems


Navigation rules restrict the possibilities of agents to act in space. The basic idea underlying the rule representations is to consider rule-specific transition systems. These transition systems can be derived from the ACNG. In contrast to a complete ACNG, such a system first contains only actions that are physically possible, i.e. executable by the agents, and second represents rule-compliant (or nearly rule-compliant) behavior of the agents. To simplify the building process of transition systems, first a coarse model of the rule is derived (idealized thread). Thereafter, the idealized thread is refined by means of the ACNG (neighborhood expansion) to determine the complete rule transition system (RTS).

The Idealized Thread. The starting point for defining RTSs is to identify an idealized transition sequence, which may be considered a coarse prototypical rule-compliant plan of maneuvers from a dangerous into a safe configuration, i.e. from a start to an end configuration, containing important decision points, if we observe the vessels at each point in time. Consider the example in Fig. 17. First the vessels are head-on, then both must turn starboard. When they are not head-on anymore, they can go straight ahead (midships), and when they are just about side by side they can turn port, heading for their original courses. The rule must be triggered if two motor vessels (MV1, MV2) are in head-on position, i.e. MV1 O4∠_0^0 MV2 or its inverse. Because of the symmetry we restrict ourselves to the first case here. If both vessels pass each other on the port side (MV1 O4∠_4^4 MV2) the vessels are in a safe situation, i.e. they are in no danger of a collision anymore, and so the rule terminates. If MV1 O4∠_0^0 MV2 holds, both vessels must turn starboard. Under the assumption that both vessels travel with the same translational and rotational velocity, the execution of starboard commands results in MV1 O4∠_1^1 MV2.
Because we defined the steering commands as safe for the purpose of the desired effect (the turn is powerful and fast enough), both vessels can now move straight on. After a while the vessels have almost passed each other (MV1 O4∠_3^3 MV2) and they can start turning back towards their


original course. When the vessels pass each other on the port side (MV1 O4∠_4^4 MV2) the rule is processed successfully and the vessels can go on to follow their route. We illustrate the idealized thread for this rule in Fig. 18. A box defines a start configuration and a double circle a safe configuration, denoting that the rule is processed and the boats are in no danger of a collision anymore.

O4∠_0^0 —(S,S)→ O4∠_1^1 —(M,M)→ O4∠_3^3 —(P,P)→ O4∠_4^4

Figure 17. Two motor vessels (MVs): both have to alter their course starboard to pass each other on port side.

Figure 18. A coarse model for the rule shown in Fig. 17 (the transition sequence above): If the vessels are head-on, then both must turn starboard. When they are not head-on anymore they can go straight ahead, and when they are just about side by side they can turn port, heading for their original course.
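The idealized thread of Fig. 18 can be represented directly as a small transition system. The following sketch is our illustration; the state and action encodings are assumptions:

```python
# Idealized thread of the head-on rule (Fig. 18).  States are OPRA_4
# relations written as (alignment, i, j); actions are S(tarboard),
# M(idships), P(ort) for each of the two vessels.
HEAD_ON_RULE = {
    "start": ("O", 0, 0),
    "safe": {("O", 4, 4)},
    "transitions": {
        ("O", 0, 0): {("S", "S"): ("O", 1, 1)},
        ("O", 1, 1): {("M", "M"): ("O", 3, 3)},
        ("O", 3, 3): {("P", "P"): ("O", 4, 4)},
    },
}

def admissible_actions(rule, relation):
    """Action pairs the rule allows in the given qualitative relation."""
    return set(rule["transitions"].get(relation, {}))

def step(rule, relation, actions):
    """Prototypical successor relation, or None if the action pair is not
    rule-compliant in this relation."""
    return rule["transitions"].get(relation, {}).get(actions)
```

In the head-on configuration the only admissible pair is (S, S), and following the admissible pairs from the start state reaches a safe state, mirroring the idealized thread.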

A Complete Rule Transition System. The idealized thread is not yet a suitable formalization of rule-compliant actions, as it abstracts from alternative action effects that need to be considered: depending on the precise position of the vessels, the same action may lead to different change-overs with respect to the qualitative relations as defined by the ACNG. In particular, we can hardly observe transitions from region to line relations as considered in OPRA4. Additionally, as we abstract extended objects to point-like objects in our example, not only linear cases, e.g. O4∠_0^0, are interesting, but also their conceptual neighbors. Therefore, the idealized thread is extended to a transition system that also includes neighboring configurations if they are still within the scope of the traffic rule at hand. Incorporating the neighboring relations makes our formalization robust against noise in perception and execution. Analogously, we apply this method to start and end configurations. For each rule, a specific rule transition system is derived that contains rule-compliant actions only. We specifically note that RTSs may further vary if different vessel types are involved, e.g. due to different kinematic capabilities. First, the formalization is incomplete as the relations of the idealized thread are not necessarily conceptual neighbors, e.g. O4∠_1^1 and O4∠_3^3. Thus, we need to add O4∠_2^2 to complete the thread. As we consider the same type of vessel in this rule model we assume the same velocity for both in the rule transition system. But as soon as these actions are executed with different velocities, or the vessels start turning at different points in time, the effects of executing the actions do not necessarily lead to perceiving the relation predicted on the idealized assumption. If, for example, O4∠_3^3 holds, both vessels should turn port.
Already in the case of slight differences in velocity we cannot expect − 3 + 4 4 the prototypical effect resulting in relation O 4 ∠4 , it is more likely that 4 ∠4 or 4 ∠3 holds. Assuming a velocity being just about the same for both vessels we expect the resulting perceived relation r after the vessels executed their actions (a1 , a2 ) has a maximal neighboring distance of one from the prototypical result rp , i.e. r ∈ cn(rp ) in our model. For models concerning different types of vessels with different prototypical assumptions on velocity we need to generalize: If the velocity proportion between two vessels is just about the same as assumed in the prototypical model, r ∈ cn(rp ) holds.
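The one-step tolerance r ∈ cn(rₚ) can be made concrete in a few lines. The sketch below is not from the chapter: the encoding of a relation as a bare sector pair and the neighbor function cn() are simplified, hypothetical stand-ins for the ACN of OPRA₄ (16 sectors, arithmetic modulo 16, the −/O/+ markers ignored).

```python
# Hypothetical encoding of an OPRA_4 relation as a pair of sector
# indices (i, j), each in 0..15. The real ACN also distinguishes line
# and region sectors and carries a -/O/+ marker; this sketch does not.

def cn(rel):
    """Conceptual neighbors of (i, j): relations reachable by moving
    either sector index one step around the 16-sector circle."""
    i, j = rel
    nbrs = set()
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di, dj) != (0, 0):
                nbrs.add(((i + di) % 16, (j + dj) % 16))
    return nbrs

def matches_prototype(perceived, prototypical):
    """Accept the perceived relation if it equals the prototypical
    effect or is one of its conceptual neighbors (distance <= 1)."""
    return perceived == prototypical or perceived in cn(prototypical)

# After both vessels turn, the prototypical effect is (4, 4); slight
# velocity differences may yield e.g. (3, 4) or (4, 3) instead.
assert matches_prototype((4, 4), (4, 4))
assert matches_prototype((3, 4), (4, 4))
assert not matches_prototype((0, 8), (4, 4))
```

Under this simplification, every relation has exactly eight conceptual neighbors; the real neighborhood structure is sparser because line relations only border their adjacent region relations.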

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,


F. Dylla / Qualitative Spatial Reasoning for Navigating Agents

So, relations neighboring the ones in the coarse model also need to be part of the fine-grained model. We call this method neighborhood expansion based on the ACN G. Fig. 19 shows the fine rule model derived from the coarse model in Fig. 18. The prototypical behavior is highlighted by the (red) shaded boxes. For each of the relations added, we derive admissible actions that lead the vessels closer to the idealized thread. Given two vessels V₁ and V₂ in relation rᵢ, the edge between relations rᵢ and rⱼ in the rule transition system is labeled with an action pair (a₁, a₂), where a₁ is performed by V₁ and a₂ by V₂. rⱼ denotes the prototypical effect of the actions executed in rᵢ, derived from the ACN G of OPRAₘ. Unfortunately, we cannot apply the ACN G as derived in Sec. 3.1 naively. The problem is that we only derived a detailed neighborhood structure for rotation on the spot and straight translational motion, but not a detailed structure for circular motion. Therefore, we approximated the effects of the vessels' actions by combinations of translation and rotation. Currently, no automatic method is available for this process. So, the admissible actions must be selected by hand.


Figure 19. The fine model for the coarse rule model in Fig. 18.
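A rule transition system of this kind is essentially a labeled graph and can be encoded directly. The sketch below is illustrative only: the relation names (plain strings loosely echoing the ᴼ-relations of the idealized thread) and the tiny excerpt of transitions are hypothetical, and the actions S, M, P abbreviate turn starboard, keep midships (straight), and turn port.

```python
# Excerpt of a rule transition system (RTS) as an adjacency map:
# relation -> list of (action_pair, successor_relation).
# Relation names and transitions are a hypothetical excerpt;
# actions: "S" = turn starboard, "M" = keep midships, "P" = turn port.
RTS = {
    "O0.0": [(("S", "S"), "O1.1")],   # head-on: both turn starboard
    "O1.1": [(("S", "S"), "O2.2")],
    "O2.2": [(("M", "M"), "O3.3")],
    "O3.3": [(("M", "M"), "O4.4"), (("P", "P"), "O5.5")],
    "O4.4": [(("P", "P"), "O5.5")],
}
END = {"O5.5"}

def run(start, choices):
    """Follow the RTS from `start`, taking the transition whose action
    pair equals the next entry of `choices`; return the final relation,
    or None if some chosen action pair is not admissible."""
    rel = start
    for actions in choices:
        step = dict(RTS.get(rel, []))
        if actions not in step:
            return None
        rel = step[actions]
    return rel

plan = [("S", "S"), ("S", "S"), ("M", "M"), ("M", "M"), ("P", "P")]
assert run("O0.0", plan) in END
assert run("O0.0", [("P", "P")]) is None  # turning port head-on is not rule-compliant
```

Monitoring an observed maneuver then amounts to replaying the perceived relation sequence against this graph and checking that an end configuration is reached.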

Rule 14 defines the rule applicable if two vessels are on "reciprocal or nearly reciprocal course", i.e. on a head-on or nearly head-on course. As ᴼ⁰₄∠₀ only represents situations in which two vessels are exactly head-on, we need additional relations which cover "nearly head-on" situations. These are covered by the relations over sectors 15..1 shown in Fig. 20. Thus, all these relations but ⁻¹⁵₄∠₁₅ and ⁺¹₄∠₁ are additional start relations where the rule has to be triggered¹¹. ⁻¹⁵₄∠₁₅ and ⁺¹₄∠₁ are exceptions, as the vessels move away from each other and thus are not "in danger of a collision". Having a closer look at the relations considered as "nearly head-on", we note that they are all conceptually neighbored to ᴼ⁰₄∠₀.

Figure 20. Relations representing "nearly head-on" situations: (a) ⁻¹⁵₄∠₁, (b) ⁻¹⁵₄∠₀, (c) ⁻¹⁵₄∠₁₅, (d) ᴼ¹⁵₄∠₁₅, (e) ⁺¹⁵₄∠₁₅, (f) ⁺⁰₄∠₁₅, (g) ⁺¹₄∠₁₅, (h) ⁻⁰₄∠₁, (i) ⁻¹₄∠₁, (j) ᴼ¹₄∠₁, (k) ⁺¹₄∠₁, (l) ⁺¹₄∠₀.

For deriving reasonable actions we can distinguish relations into two categories: safe relations, where no collision is possible without an additional action, and unsafe relations, where a collision is possible, e.g. the start relations. A conservative strategy for collision-free navigation is then to stay in safe relations. Given a potential collision, both vessels must turn starboard. However, in case of ᴼ¹⁵₄∠₁₅ and ⁺¹⁵₄∠₁₅ a turn to port seems more reasonable, but it is not in compliance with the rules. In relation ⁻¹₄∠₁ it is still necessary to turn starboard, whereas for ⁺¹₄∠₁ and ᴼ¹₄∠₁ it is sufficient to move straight ahead in order not to be in danger of collision. For relations ᴼ²₄∠₂ and ᴼ³₄∠₃ we add the neighboring relations as well. Here, the vessels should continue straight motion according to the actions in the idealized thread. Only in case of ⁺³₄∠₃ is it already safe for both vessels to turn port, i.e. to turn towards the original course without getting into an unsafe relation. But it is also an alternative to move straight on or even to turn starboard. Finally, the neighboring relations for the end configuration ᴼ⁴₄∠₄ are added. ⁻³₄∠₄ and ⁺⁴₄∠₃ are no end configurations, as not both vessels have passed at each other's port side. Again, the vessels may choose between the alternatives of moving straight on or turning port towards the original course. Relations ⁻⁵₄∠₅, ᴼ⁵₄∠₅, and the neighboring relations over sectors 3..5 are defined as end configurations, i.e. the rule has been successfully applied in the given start configuration. If a relation not defined within the rule model is perceived, processing is also terminated, but the rule is not regarded as having been applied correctly, because something unexpected happened during execution, e.g. one of the agents did not act in accordance with the rules. For keeping the fine-grained rule model in Fig. 19 clear and simple, we presented a very conservative and strict variant regarding the idealized thread, i.e. only a few action alternatives are defined. We applied different models with a varying number of alternatives in each relation, as long as they are in compliance with the rules. For example, in ᴼ³₄∠₃ it is also safe if both vessels keep midships or turn starboard. We also considered rule transition systems with additional, more fine-grained action alternatives. For example, in configuration ⁺¹₄∠₁ it may be safe as well to turn "a little to port" without running into a collision.

¹¹ The reader might argue that some physical situations represented by start relations, e.g. ⁻¹⁵₄∠₁, are on no "nearly reciprocal" course anymore. This can be improved with finer granularity of OPRAₘ.
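The distinction between safe and unsafe relations suggests a simple conservative action filter: among the admissible actions of the current relation, keep only those whose prototypical effect is a safe relation. The relation sets and the effect table in this sketch are hypothetical placeholders, not the chapter's actual model.

```python
# Conservative action selection: keep only those actions whose
# prototypical effect is a safe relation. SAFE and EFFECTS are
# illustrative placeholders (relation names are invented strings).
SAFE = {"O3.3", "O4.4", "O5.5"}
EFFECTS = {  # (relation, action_pair) -> prototypical successor
    ("O2.2", ("M", "M")): "O3.3",
    ("O3.3", ("P", "P")): "O5.5",
    ("O3.3", ("S", "S")): "O2.2",
}

def conservative_actions(rel):
    """Admissible actions in `rel` whose prototypical effect is safe."""
    return [act for (r, act), succ in EFFECTS.items()
            if r == rel and succ in SAFE]

assert conservative_actions("O2.2") == [("M", "M")]
assert conservative_actions("O3.3") == [("P", "P")]  # (S,S) leads back to unsafe O2.2
```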

3.3. Constraint-Based Rule Integration


Transition systems formalize rule-compliant actions for pairs of agents and hence allow a pair of agents to avoid collisions by performing the actions linked to their current relation. But this procedure may not suffice in situations involving more than two vessels governed by multiple rules. Therefore, we apply constraint-based reasoning methods to check whether actions according to the two-vessel transition systems are compatible from a global point of view as well. Additionally, constraint-based reasoning enables us to select a globally admissible action when a transition system allows alternative actions. By checking for compatibility we can consistently integrate rules that pose constraints on the configuration of objects. Given qualitative rule representations that locally constrain the configurations of objects, a configuration is globally consistent with respect to the rules if the combined constraint network is satisfiable. CSP-based qualitative reasoning offers a sound method for testing the compatibility of configuration descriptions. For this, we generate a constraint network (CN) that encodes all spatial relations between vessel positions that may result from admissible actions applied to the current configuration. Afterwards, we check the CN for consistent scenarios, i.e. only atomic relations are assigned to the constraints between pairs of vessels, in order to deduce a single admissible action per vessel to execute.

Examples In Fig. 21 we depict an example with three motor vessels represented by the o-points A, B, and C. The relations of the qualitative scene description are: A ⁺¹₄∠₁₅ B, B ⁺¹₄∠₁₅ C, and A ⁺¹⁵₄∠₃ C. The pair (A, B) and the pair (B, C) are in a head-on situation, and so both vessels should turn starboard. In contrast, A and C are in a crossing situation¹², in which C perceives A on its starboard side, and so C must turn starboard and A must keep its course (Rule 15 of COLREGs). In Fig. 22(a) we depict the current scene description in terms of a constraint network (A, B, and C) and the extrapolated network regarding the transition systems (A⁺, B⁺, and C⁺). If possible, a solution of the extrapolated constraint network is computed, identifying globally consistent spatial relations among the agents. In other words, if a solution exists, the individual rules for pairs of objects are compatible. The result is then repropagated to determine the suitable actions for the individual vessels that will lead to this particular constellation. In Fig. 22(b) we depict one possible solution for our scenario. This process ensures that the selected actions are admissible with respect to the individual rules (by construction of the constraint network) and with respect to the global scene (by global constraint satisfaction). If vessel B in Fig. 21 were a sports vessel instead of a motor vessel, different behavior would have to be shown. In this case C must turn port regarding B and concurrently must turn starboard regarding A. So, this example yields an inconsistent extrapolated constraint network, reflecting that there is no action which is in compliance with all rules for two vessels. As a consequence, at least one vessel must violate standard rules. To generate such behavior, additional methods have to be applied, e.g. organizing the vessels involved into a hierarchy with respect to their importance in the current situation.

¹² A and C are in relation A ⁺¹⁵₄∠₃ C, which is not considered to be "nearly head-on".
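The global compatibility check can be illustrated as a tiny constraint problem over the vessels' next actions: each pairwise rule admits certain action pairs, and a globally admissible assignment exists iff all pairs can be satisfied simultaneously. This brute-force sketch is a hypothetical stand-in for the actual method, which checks consistency of the extrapolated qualitative constraint network; all tables below are invented for illustration.

```python
from itertools import product

# Each vessel picks one of three actions; pairwise constraints list the
# action pairs admitted by the rule governing that ordered vessel pair.
ACTIONS = ("S", "M", "P")  # starboard, midships, port

def solve(vessels, allowed):
    """Brute-force search for a globally consistent action assignment.
    `allowed` maps an ordered vessel pair to the set of admitted
    (action_of_first, action_of_second) pairs."""
    for assignment in product(ACTIONS, repeat=len(vessels)):
        a = dict(zip(vessels, assignment))
        if all((a[x], a[y]) in pairs for (x, y), pairs in allowed.items()):
            return a
    return None

# Three motor vessels: head-on pairs (A,B) and (B,C) demand (S,S);
# for the crossing pair (A,C) we allow A to keep course or turn starboard
# while C turns starboard (illustrative encoding).
allowed = {("A", "B"): {("S", "S")},
           ("B", "C"): {("S", "S")},
           ("A", "C"): {("M", "S"), ("S", "S")}}
assert solve(["A", "B", "C"], allowed) == {"A": "S", "B": "S", "C": "S"}

# If B were a sports vessel, C would need to turn port regarding B but
# starboard regarding A: no globally consistent assignment exists.
allowed_conflict = {("A", "C"): {("M", "S"), ("S", "S")},
                    ("B", "C"): {("S", "P")}}
assert solve(["A", "B", "C"], allowed_conflict) is None
```

The inconsistent case mirrors the sports-vessel scenario: the search returns no assignment, signalling that at least one vessel must deviate from the standard rules.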


Figure 21. A situation with three motor vessels: A ⁺¹₄∠₁₅ B, B ⁺¹₄∠₁₅ C, and A ⁺¹⁵₄∠₃ C.

Figure 22. Reasoning with CSPs for rule-compliant actions of vessels A, B, and C: (a) deriving a CSP network from the original configuration (A, B, and C) and the rule models (A⁺, B⁺, and C⁺); (b) consistent scenario deduced from (a).


4. Summary

In this chapter we gave an introduction to the domain of qualitative spatial reasoning (QSR) and showed how reasoning techniques from this domain can be applied for controlling, describing, and judging agent behavior. QSR is concerned with capturing everyday commonsense knowledge about space with a limited set of symbols and allows for reasoning without numerical values. The essence of QSR is to find ways to represent continuous properties of the world by discrete systems of symbols, so-called spatial calculi. These calculi may deal with different aspects of space like orientation, size, shape, etc. We specifically discussed orientation, location, distance, and motion. Based on the definition of qualitative calculi we introduced two reasoning techniques: constraint-based reasoning and (action-augmented) neighborhood-based reasoning. As relative orientation is the most important aspect for the navigation task at hand, we introduced different calculi dealing with this aspect, e.g. the Double Cross Calculus and OPRAₘ. We derived the action-augmented conceptual neighborhood structure (ACN G) for OPRAₘ and utilized it for generating rule transition systems (RTSs) in the domain of collision regulations in vessel navigation. The RTSs are derived in a stepwise manner from natural language descriptions. First, a coarse prototypical rule-compliant plan of maneuvers from a dangerous into a safe configuration is modeled (the idealized thread). Second, the coarse model is refined by neighborhood expansion to capture all relevant configurations in the RTS. By constraint-based reasoning, specific actions can be deduced if the RTS allows for alternative actions, and we can check whether actions according to the two-vessel transition systems are compatible from a global point of view as well.

Acknowledgements

The work was carried out in the DFG Transregional Collaborative Research Center SFB/TR 8 Spatial Cognition. Financial support by the Deutsche Forschungsgemeinschaft (DFG) is gratefully acknowledged.


References

[1] J. F. Allen. Maintaining knowledge about temporal intervals. Communications of the ACM, pages 832–843, Nov. 1983.
[2] P. Balbiani, J.-F. Condotta, and L. F. del Cerro. A model for reasoning about bidimensional temporal relations. In A. G. Cohn, L. Schubert, and S. C. Shapiro, editors, KR'98: Principles of Knowledge Representation and Reasoning, pages 124–130. Morgan Kaufmann, San Francisco, California, 1998.
[3] P. Balbiani, J.-F. Condotta, and L. F. del Cerro. A new tractable subclass of the rectangle algebra. In 16th International Joint Conference on Artificial Intelligence, pages 442–447, 1999.
[4] P. Balbiani and A. Osmani. A model for reasoning about topologic relations between cyclic intervals. In A. G. Cohn, F. Giunchiglia, and B. Selman, editors, KR2000: Principles of Knowledge Representation and Reasoning, pages 378–385, San Francisco, 2000. Morgan Kaufmann.
[5] B. Bennett. Logical Representations for Automated Reasoning about Spatial Relationships. PhD thesis, School of Computer Studies, The University of Leeds, 1997.
[6] B. Bennett and A. P. Galton. A unifying semantics for time and events. Artificial Intelligence, 153(1-2):13–48, March 2004.
[7] E. Clementini, P. D. Felice, and D. Hernandez. Qualitative representation of positional information. Artificial Intelligence, 95(2):317–356, 1997. ISSN 0004-3702.
[8] A. G. Cohn and S. M. Hazarika. Qualitative spatial representation and reasoning: An overview. Fundamenta Informaticae, 46(1-2):1–29, 2001.
[9] J.-F. Condotta, M. Saade, and G. Ligozat. A generic toolkit for n-ary qualitative temporal and spatial calculi. In TIME '06: Proceedings of the Thirteenth International Symposium on Temporal Representation and Reasoning (TIME'06), pages 78–86, Washington, DC, USA, 2006. IEEE Computer Society. ISBN 0-7695-2617-9.
[10] F. Dylla. An Agent Control Perspective on Qualitative Spatial Reasoning, volume 320 of DISKI. Akademische Verlagsgesellschaft Aka GmbH (IOS Press), Heidelberg, Germany, 2008. ISBN 978-3-89838-320-2.
[11] F. Dylla and R. Moratz. Empirical complexity issues of practical qualitative spatial reasoning about relative position. In Workshop on Spatial and Temporal Reasoning at ECAI 2004, pages 37–46, Valencia, Spain, Aug. 2004.
[12] F. Dylla and R. Moratz. Exploiting qualitative spatial neighborhoods in the situation calculus. In Freksa et al. [22], pages 304–322.
[13] M. T. Escrig and F. Toledo. A framework based on CLP extended with CHRs for reasoning with qualitative orientation and positional information. Journal of Visual Languages and Computing, 9(1):81–101, 1998.
[14] A. Ferrein, C. Fritz, and G. Lakemeyer. Using Golog for deliberation and team coordination in robotic soccer. KI Künstliche Intelligenz, 19(1):24–43, 2005.
[15] K. D. Forbus. Qualitative process theory. Artificial Intelligence, 24(1-3):85–168, 1984. ISSN 0004-3702.
[16] A. Frank. Qualitative spatial reasoning about cardinal directions. In Proceedings of the American Congress on Surveying and Mapping (ACSM-ASPRS), pages 148–167, Baltimore, Maryland, USA, 1991.
[17] A. U. Frank. Qualitative spatial reasoning about distances and directions in geographic space. Journal of Visual Languages and Computing, 3:343–371, 1992.
[18] C. Freksa. Conceptual neighborhood and its role in temporal and spatial reasoning. In M. G. Singh and L. Travé-Massuyès, editors, Proceedings of the IMACS Workshop on Decision Support Systems and Qualitative Reasoning, pages 181–187, North-Holland, Amsterdam, 1991. Elsevier.
[19] C. Freksa. Temporal reasoning based on semi-intervals. Artificial Intelligence, 54(1):199–227, 1992.
[20] C. Freksa. Using orientation information for qualitative spatial reasoning. In A. U. Frank, I. Campari, and U. Formentini, editors, Theories and Methods of Spatio-Temporal Reasoning in Geographic Space, pages 162–178. Springer, Berlin, 1992.
[21] C. Freksa. Spatial cognition – An AI perspective. In Proceedings of the 16th European Conference on AI (ECAI 2004), 2004.
[22] C. Freksa, M. Knauff, B. Krieg-Brückner, B. Nebel, and T. Barkowsky, editors. Spatial Cognition IV. Reasoning, Action, Interaction: International Conference Spatial Cognition 2004, volume 3343 of Lecture Notes in Artificial Intelligence. Springer, Berlin, Heidelberg, 2005.
[23] C. Freksa and R. Röhrig. Dimensions of qualitative spatial reasoning. In N. P. Carreté and M. G. Singh, editors, Proceedings of the III IMACS International Workshop on Qualitative Reasoning and Decision Technologies – QUARDET'93, pages 483–492. CIMNE Barcelona, 1993.
[24] C. Freksa and K. Zimmermann. On the utilization of spatial structures for cognitively plausible and efficient reasoning. In F. A. H. Güsgen and J. van Benthem, editors, Proc. of the Workshop on Spatial and Temporal Reasoning (IJCAI'93), pages 61–66. Chambéry, 1993.
[25] A. Galton. Qualitative Spatial Change. Oxford University Press, 2000.
[26] B. Gottfried. Reasoning about intervals in two dimensions. In Proceedings of the 2004 IEEE International Conference on Systems, Man and Cybernetics, volume 6, pages 5324–5332, Oct. 2004.
[27] H. W. Guesgen. Spatial reasoning based on Allen's temporal logic. Technical report, International Computer Science Institute, 1989.
[28] IMO. International regulations for preventing collisions at sea 1972 (ColRegs). International Maritime Organization (IMO), 1972; adopted 2001.
[29] A. Isli, V. Haarslev, and R. Möller. Combining cardinal direction relations and relative orientation relations in qualitative spatial reasoning. Technical Report FBI-HH-M-304/01, Fachbereich Informatik, Universität Hamburg, 2001.
[30] A. Isli and R. Moratz. Qualitative spatial representation and reasoning: Algebraic models for relative position. Technical Report FBI-HH-M-284/99, Fachbereich Informatik, Universität Hamburg, 1999.
[31] B. Kuipers. Qualitative Reasoning: Modeling and Simulation with Incomplete Knowledge. MIT Press, Cambridge, Massachusetts, USA, 1994.
[32] P. Ladkin and A. Reinefeld. Effective solution of qualitative constraint problems. Artificial Intelligence, 57:105–124, 1992.
[33] S. C. Levinson. Space in Language and Cognition: Explorations in Cognitive Diversity. Cambridge University Press, Cambridge, MA, 2003.
[34] G. Ligozat. Qualitative triangulation for spatial reasoning. In A. U. Frank and I. Campari, editors, Spatial Information Theory: A Theoretical Basis for GIS (COSIT'93), Marciana Marina, Elba Island, Italy, volume 716 of LNCS, pages 54–68. Springer, 1993. ISBN 3-540-57207-4.
[35] G. Ligozat. Reasoning about cardinal directions. Journal of Visual Languages and Computing, 9:23–44, 1998.
[36] G. Ligozat. Categorical methods in qualitative reasoning: The case for weak representations. In A. G. Cohn and D. M. Mark, editors, COSIT, volume 3693 of Lecture Notes in Computer Science, pages 265–282. Springer, 2005. ISBN 3-540-28964-X.
[37] G. Ligozat and J. Renz. What is a qualitative calculus? A general framework. In Zhang et al. [60], pages 53–64.
[38] J. Liu. A method of spatial reasoning based on qualitative trigonometry. Artificial Intelligence, 98(1-2):137–168, 1998. ISSN 0004-3702.
[39] R. Moratz. Representing relative direction as a binary relation of oriented points. In G. Brewka, S. Coradeschi, A. Perini, and P. Traverso, editors, ECAI, pages 407–411, Riva del Garda, Italy, 2006. IOS Press. ISBN 1-58603-642-4.
[40] R. Moratz, F. Dylla, and L. Frommberger. A relative orientation algebra with adjustable granularity. In Proceedings of the Workshop on Agents in Real-Time and Dynamic Environments (IJCAI 05), pages 61–70, Edinburgh, Scotland, July 2005.
[41] R. Moratz, B. Nebel, and C. Freksa. Qualitative spatial reasoning about relative position: The tradeoff between strong formal properties and successful reasoning about route graphs. In C. Freksa, W. Brauer, C. Habel, and K. F. Wender, editors, Spatial Cognition III, volume 2685 of Lecture Notes in Artificial Intelligence, pages 385–400. Springer, Berlin, Heidelberg, 2003.
[42] R. Moratz, J. Renz, and D. Wolter. Qualitative spatial reasoning about line segments. In W. Horn, editor, Proceedings of the 14th European Conference on Artificial Intelligence (ECAI), Berlin, Germany, 2000. IOS Press.
[43] P. Muller. A qualitative theory of motion based on spatio-temporal primitives. In A. G. Cohn, L. Schubert, and S. C. Shapiro, editors, KR'98: Principles of Knowledge Representation and Reasoning, pages 131–141. Morgan Kaufmann, San Francisco, California, 1998.
[44] P. Muller. Space-time as a primitive for space and motion. In N. Guarino, editor, 1st Int. Conf. (FOIS-98), Frontiers in AI and Applications, volume 46, pages 63–76. IOS Press, 1998.
[45] J. Pinto. Integrating discrete and continuous change in a logical framework. Computational Intelligence, 14:39–88, 1998.
[46] D. A. Randell, Z. Cui, and A. Cohn. A spatial logic based on regions and connection. In B. Nebel, C. Rich, and W. Swartout, editors, Principles of Knowledge Representation and Reasoning: Proceedings of the Third International Conference (KR'92), pages 165–176. Morgan Kaufmann, San Mateo, CA, 1992.
[47] J. Renz. Qualitative Spatial Reasoning with Topological Information. Number 2293 in Lecture Notes in Computer Science. Springer-Verlag, New York, NY, USA, 2002. ISBN 3-540-43346-5.
[48] J. Renz and G. Ligozat. Weak composition for qualitative spatial and temporal reasoning. In Proceedings of the 11th International Conference on Principles and Practice of Constraint Programming (CP 2005), pages 534–548, Sitges (Barcelona), Spain, Oct. 2005.
[49] J. Renz and D. Mitra. Qualitative direction calculi with arbitrary granularity. In Zhang et al. [60], pages 65–74.
[50] R. Röhrig. A theory for qualitative spatial reasoning based on order relations. In AAAI'94: Proceedings of the Twelfth National Conference on Artificial Intelligence (vol. 2), pages 1418–1423, Menlo Park, CA, USA, 1994. American Association for Artificial Intelligence. ISBN 0-262-61102-3.
[51] B. Russell. Principles of Mathematics. Cambridge University Press, 1903. Second edition published by Norton, New York.
[52] C. Schlieder. Reasoning about ordering. In Proc. of COSIT'95, volume 988 of Lecture Notes in Computer Science, pages 341–349. Springer, Berlin, Heidelberg, 1995.
[53] A. Scivos and B. Nebel. Double-crossing: Decidability and computational complexity of a qualitative calculus for navigation. In Spatial Information Theory: Foundations of Geographic Information Science (COSIT-2001), pages 431–446, Morro Bay, CA, 2001. Springer, Berlin.
[54] A. Scivos and B. Nebel. The finest of its class: The practical natural point-based ternary calculus LR for qualitative spatial reasoning. In Freksa et al. [22], pages 283–303.
[55] O. Stock, editor. Spatial and Temporal Reasoning. Kluwer Academic Publishers, 1997.
[56] P. van Beek. Reasoning about qualitative temporal information. Artificial Intelligence, 58(1-3):297–321, 1992.
[57] N. van de Weghe. Representing and Reasoning about Moving Objects: A Qualitative Approach. PhD thesis, Ghent University, Belgium, 2004.
[58] M. Vilain, H. Kautz, and P. van Beek. Constraint propagation algorithms for temporal reasoning: A revised report. In D. S. Weld and J. de Kleer, editors, Readings in Qualitative Reasoning about Physical Systems, pages 373–381. Morgan Kaufmann, San Mateo, CA, USA, 1990. ISBN 1-55860-095-7.
[59] M. Worboys. Modelling changes and events in dynamic spatial systems with reference to socio-economic units. In A. Frank, J. Raper, and J.-P. Cheylan, editors, Life and Motion of Socio-Economic Units, number 8 in ESF GISDATA, pages 129–138. Taylor and Francis, 2001.
[60] C. Zhang, H. W. Guesgen, and W.-K. Yeap, editors. PRICAI 2004: Trends in Artificial Intelligence, 8th Pacific Rim International Conference on Artificial Intelligence, Auckland, New Zealand, Proceedings, volume 3157 of Lecture Notes in Computer Science. Springer, 2004.
[61] K. Zimmermann. Enhancing qualitative spatial reasoning – combining orientation and distance. In Spatial Information Theory: A Theoretical Basis for GIS, number 716 in Lecture Notes in Computer Science, pages 69–76. Springer, 1993.
[62] K. Zimmermann and C. Freksa. Qualitative spatial reasoning using orientation, distance, and path knowledge. Applied Intelligence, 6:49–58, 1996.


Behaviour Monitoring and Interpretation – BMI
B. Gottfried and H. Aghajan (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-048-3-129


Classifying Collective Motion

Zena WOOD, Antony GALTON
School of Engineering, Computing and Mathematics, University of Exeter

Abstract. Collective phenomena and their associated movement patterns are ubiquitous in everyday life. However, formal reasoning about these phenomena is currently hampered by the lack of adequate tools. We have previously developed a classification of collectives but this is incomplete with regards to movement and therefore needs to be integrated with a classification of collective motion. This paper analyses existing research into the movement patterns of collectives. Although there has been some research into the movement patterns of particular kinds of collectives, most of this is found to focus on the level of the individuals; vital information about the collective is lost. Existing research that focuses on movement patterns improves on this but still leaves many questions unanswered and important features of collectives unable to be represented. Therefore, we develop a list of goals which we believe a classification of collective motion needs to satisfy, and introduce the foundations of a system that we believe will satisfy these goals. We hope that this work will provide a sound basis for the development and formalisation of a comprehensive classification of collectives and their motions.

Keywords. spatial collectives, collective behaviour, collective motion, movement patterns


1. Introduction

Consider the following scenario. An organised demonstration is due to take place in a city where it is your responsibility to ensure the safety of both the residents and any visitors. As well as lining the entire route with police officers, you have decided to observe the demonstration on a monitor from a central command post. The monitor will show you regular snapshots of the entire route; each snapshot clearly indicates the movements of all the individuals, showing their velocities and identifying which are police officers and which are members of the public. A police officer is represented by a square and a member of the public by a circle. Figure 1(a) shows a section of one of the snapshots, which depicts the movements of individuals at the beginning of the demonstration. The snapshot clearly shows the police lining the route and the demonstrators moving in a reasonably ordered fashion along the route. However, at a certain point along the route, you observe that the demonstrators appear to have ceased moving along the route and have begun to form a stationary cluster (figure 1(b)). The police officers that were lining the edge of that particular section of the route can be seen to move to surround the expanding cluster. From your observations of the motions of these individuals, what do you think may be happening? Do you need to take any further action in order to ensure the continued safety of all those involved? Such an example of collective motion could represent the beginnings of a riot and, therefore, the need for more police officers or more specialised officers to be sent to the scene. A system like the one described here where the


Z. Wood and A. Galton / Classifying Collective Motion

Figure 1. Snapshots of sections of the demonstration: (a) an early stage in the demonstration; (b) a later stage showing possible development of a riot.

motion of a group of individuals can be observed using a succession of snapshots could not only be used to monitor motion in real time but also to predict collective motion: for example, predicting what would happen if the route of the demonstration were suddenly changed, or how a crowd of demonstrators would react in an emergency. A group of demonstrators is only one example of a collective; collective phenomena and their associated movement patterns are ubiquitous in everyday life. With some collectives being defined by their movement patterns, movement is one of the most important features of a collective that we would wish to reason about. However, it would appear that we do not currently possess the necessary tools. We have developed a taxonomy of collectives [27,26], but even though it appears to be capable of distinguishing a wide range of types of collective phenomena, it is incomplete with regard to movement. The taxonomy only acknowledges the absence or presence of movement and, if it is present, whether or not the motion of the collective is co-ordinated with the motion of its individual members. A comprehensive classification would allow us to define both the type of a collective and the movement patterns one might expect from that type. The ultimate goal of our research is to build such a taxonomy, which can then be formalised and used as a basis for an ontology of collectives. The system we envisage is similar to that described in the example at the beginning of this paper: one by which, using a combination of visual analysis and automatic processing, we could identify the type of collective from its movement patterns and vice versa. Such a system would have many applications, including security (e.g., identifying how the occupants of a building would exit in an emergency) [16], traffic monitoring and management, and the prediction of crowd behaviour. It is important to note what we mean by the term collective.
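A monitoring system of the envisaged kind could flag the transition from ordered motion to a stationary cluster with very simple per-snapshot statistics. The data layout and thresholds below are invented for illustration and are not part of the authors' proposal.

```python
import math

# A snapshot assigns each individual a (position, velocity) pair,
# both 2-D tuples. Layout and thresholds are invented for illustration.
def is_stationary_cluster(snapshot, speed_eps=0.2, radius=5.0):
    """Flag a snapshot whose individuals are (a) nearly stationary and
    (b) concentrated around their common centroid."""
    positions = [p for p, _ in snapshot]
    velocities = [v for _, v in snapshot]
    if any(math.hypot(vx, vy) > speed_eps for vx, vy in velocities):
        return False
    cx = sum(x for x, _ in positions) / len(positions)
    cy = sum(y for _, y in positions) / len(positions)
    return all(math.hypot(x - cx, y - cy) <= radius for x, y in positions)

# Demonstrators marching along the route vs. forming a tight cluster.
marching = [((i, 0.0), (1.0, 0.0)) for i in range(10)]
clustered = [((50 + math.cos(k), 20 + math.sin(k)), (0.0, 0.0)) for k in range(10)]
assert not is_stationary_cluster(marching)
assert is_stationary_cluster(clustered)
```

Raising an alert would then amount to observing this predicate flip from False to True over successive snapshots of the same route section.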
In order for a phenomenon to be considered a collective it must consist of more than one individual; it must also satisfy a set of further conditions [26]. We do not wish to consider abstract phenomena such as the set of natural numbers; under our terminology, therefore, a collective must be a concrete particular. An important feature of many collectives is variable membership: for example, members may join or leave a protest march in mid-course without affecting the existence or identity of the collective. It is vital that this time-dependence of membership is adequately modelled in any framework that aims to represent collective phenomena effectively.

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

Z. Wood and A. Galton / Classifying Collective Motion

In order to achieve our goal, we must first develop a classification of the movement patterns that are exhibited by collective phenomena as a class; we will refer to these movement patterns as collective motion. This classification system could then be integrated with our existing taxonomy to obtain the required comprehensive classification system. To produce a classification of collective motion we must fully understand the range of collective motion and any relations that may exist between the different types.

The introduction of technology such as GPS devices and mobile phones has led to an increasing amount of available data that records the movement of objects [2]. These datasets usually contain information on the movement patterns of a large number of individual objects which may or may not be related; implicit in such data there may be a good deal of information on how individuals are participating in a collective. Owing to this increase in availability, there has been increased research into the development of adequate extraction and analytical interpretation tools [22]. Researchers within many fields, including GIS, mathematics and computer science, have focused their attention on the definition and detection of movement patterns from large datasets [2,5,12,17,24].

It appears that most of the research that relates to collectives focuses on the level of the individuals, and in particular on the movement patterns that arise from the interactions between the individual members of a collective [13,25,1]. Although this research does not provide us with all of the necessary information regarding the collective's motion (Section 4), the movement of the individual members of a collective and the interactions that exist between them are still very important. Even though most of this research focuses on particular types of collective, there are many useful concepts and methods that can be extracted and possibly incorporated into a classification system that considers the movement patterns of a much wider range of collectives.
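The time-dependent membership condition noted earlier lends itself to a direct computational treatment. The following is a minimal sketch; the class and method names are our own illustrative choices, not taken from [26] or [27]:

```python
from dataclasses import dataclass, field

@dataclass
class Collective:
    """A collective whose membership varies over time.

    Membership is stored as [member, join_time, leave_time] records;
    a leave_time of None means the member has not yet left.
    """
    name: str
    intervals: list = field(default_factory=list)

    def join(self, member, t):
        self.intervals.append([member, t, None])

    def leave(self, member, t):
        for rec in self.intervals:
            if rec[0] == member and rec[2] is None:
                rec[2] = t
                return

    def members_at(self, t):
        """Members present at time t; the collective's identity is
        independent of this time-varying set."""
        return {m for m, t0, t1 in self.intervals
                if t0 <= t and (t1 is None or t < t1)}

march = Collective("protest march")
march.join("ann", 0); march.join("bob", 0)
march.join("cara", 5); march.leave("bob", 8)
assert march.members_at(3) == {"ann", "bob"}
assert march.members_at(9) == {"ann", "cara"}
```

The point of the sketch is only that the protest march object persists while `members_at` varies, which is exactly the distinction a framework for collectives must be able to express.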
Recent research has begun to focus on the movement of a collective itself, by looking either at specific patterns such as leadership [2] or at the patterns that can be exhibited by collectives in general [24,5,12]. Some of this research simply lists the different possible movement patterns [5], while some attempts to organise these patterns into a taxonomy [24,12]. Although these taxonomies are closer to what we are looking for, they still leave important features of collectives unrepresented and problems unaddressed. There also appears to be no research that has systematically related the different types of collective to the different types of movement pattern that they typically exhibit.

The aim of this paper is, through a survey of existing work in this area, to highlight the need for a classification of collective motion that does justice to all of the important features of a collective, particularly the relationship that exists between its motion and the motions of its individual members. From a detailed analysis of the existing research, the problems that remain ignored or only partially addressed with regard to collective motion are discussed; the results of this survey are used to produce a set of goals that we believe a classification of collective motion should satisfy. We have begun the development of such a classification, the foundations of which are introduced at the end of this paper.

A detailed and systematic analysis of the relevant literature is given. This analysis divides the literature into two parts: that which focuses on the level of the individuals (Section 2), and that which focuses on the level of the collective (Section 3). Our findings from this analysis are discussed (Section 4), leading to a list of desired goals (Section 5). An overview of our proposed classification is given which highlights how we believe we will achieve these goals (Section 6). The paper concludes with a list of further research questions that need to be addressed in order for the proposed classification system to be fully developed (Section 7).

2. Focusing on the Level of the Individuals

Researchers have begun to investigate the movement patterns exhibited by specific types of collective phenomena such as crowds [1,11] or animal aggregations [7,8,13,17], but the main focus appears to be on the individual members of the collective rather than on the collective as a whole. Although some researchers have studied the effect of decisions made by individual members, the majority concern themselves with how interactions between pairs of individual members affect the overall behaviour of the group; many of them have developed mathematical models to aid their investigations.

2.1. The interactions between individuals

Eftimie et al [13] have built a 'non-local continuum model' to help explain the way in which animal groups form and move in response to the information that they receive. Like Wood and Galton [27], Eftimie et al note that groups can be influenced by both internal and external factors, but they choose to focus only on the internal factors, and in particular on three interactions that can occur between the individual animals of a group: attraction, repulsion and alignment. Focusing on communication as a medium for these social interactions between animals, four different types of signal are given: acoustic, visual, chemical and tactile; both the range and the directionality of these signals are considered when defining the three social interactions.

Using their model, Eftimie et al indicate the different spatial patterns that can arise from different combinations of the three interactions. When all three interactions are present the model displays four types of movement pattern: stationary pulses, travelling pulses, travelling trains and zigzag pulses. Stationary pulses are 'spatially nonhomogeneous steady states'; travelling pulses are 'spatially nonhomogeneous solutions that have a fixed shape and move at a constant speed'; travelling trains are 'periodic solutions', while zigzag pulses are 'periodic solutions that change direction'.

For Eftimie et al's purposes three internal factors were sufficient, but it is unclear whether these interactions alone could be used to develop a model that would explain the movement of a wider range of collectives. It also seems unlikely that the four types of movement pattern defined by Eftimie et al are exhaustive enough to cover all types of collective motion. The effects of external factors also need to be considered: as noted by [1,23], the external factors that come from the environment in which the collective exists are very important and will affect the collective's motion.
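In individual-based simulations, attraction, repulsion and alignment are commonly realised as distance-dependent responses to neighbours. The following sketch illustrates that general idea only; it is not Eftimie et al's continuum model, and the radii, gains and update rule are all illustrative assumptions:

```python
import math

def step(agents, r_rep=1.0, r_ali=3.0, r_att=6.0, dt=0.1):
    """One update of (x, y, vx, vy) agents under the three social
    interactions: repulsion from very near neighbours, velocity
    alignment at mid range, attraction at long range."""
    new = []
    for i, (x, y, vx, vy) in enumerate(agents):
        fx = fy = 0.0
        for j, (x2, y2, vx2, vy2) in enumerate(agents):
            if i == j:
                continue
            dx, dy = x2 - x, y2 - y
            d = math.hypot(dx, dy) or 1e-9
            if d < r_rep:          # repulsion: move away
                fx -= dx / d; fy -= dy / d
            elif d < r_ali:        # alignment: match velocity
                fx += (vx2 - vx); fy += (vy2 - vy)
            elif d < r_att:        # attraction: move closer
                fx += dx / d; fy += dy / d
        nvx, nvy = vx + dt * fx, vy + dt * fy
        new.append((x + dt * nvx, y + dt * nvy, nvx, nvy))
    return new

# two distant agents fall inside the attraction zone and approach
a = [(0.0, 0.0, 0.0, 0.0), (4.0, 0.0, 0.0, 0.0)]
b = step(a)
assert b[0][0] > 0.0 and b[1][0] < 4.0
```

In models of this family, varying which of the three responses is active, and over what ranges, changes the spatial pattern that emerges at the group level.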
The continuum model developed by Eftimie et al [13] is one-dimensional. Simulation results show that the model enables them to compare the essential features of complex biological patterns; however, as they note, the model would need to be extended to two spatial dimensions in order to handle more general and realistic cases of communication. Topaz et al [25] have developed a two-dimensional continuum model of the motion of biological organisms. Like that of Eftimie et al, the model proposed by Topaz et al is designed to study the 'pattern-forming behaviour' that arises
from the social interactions between organisms. However, they focus on a particular type of collective motion, swarming, which as they note is an example of how a 'global structure can arise from more localized rules' [25]. The ultimate goal of their research is to be able to 'make specific statements about how social interactions between organisms affect the large-scale motion of a biological group' [25]. The two-dimensional model is based on four assumptions. Since Topaz et al only wish to consider swarming, they assume the population is always conserved (i.e., 'birth, death, immigration and emigration of organisms are negligible'). Although this may be suitable for swarming, variable membership is an important feature of many collectives and one which is likely to affect their movement patterns (see Section 4). It would appear that, like [13], they do not consider external factors.

In [6], Batty et al develop a model which simulates the effects of a route change on the Notting Hill Carnival; key to this model is the behaviour that 'emerges from the accumulated interactions between small-scale objects'. The interaction and mobility of large numbers of people over a short time period generates planning and management issues and, as Batty et al note, it is very important that we are able to reason about and solve the related spatial problems; they wish to introduce models, based on the dynamics of local movement, that can be generalised to a range of different 'small-scale spatial events'. Although Batty et al do not define or discuss any particular movement patterns related to collectives in general, crowds are briefly discussed. It is noted that crowds can grow in size and density, both to levels that are out of control; however, their model is currently not able to simulate such situations. Unlike the previous research that has been discussed, Batty et al do consider the effects of external factors, due to the environment, on the movement patterns of individuals.
The landscape, barriers, mobile attractions (e.g., floats) and police are all treated as 'mobile agents'; this approach allows a range of different interactions to be considered. Batty et al raise some important points regarding the problems that occur when handling collective motion, in particular those relating to granularity. As we discuss further in Section 4, the way in which collectives are observed depends very much on the level of granularity from which they are viewed. Batty et al note the need to observe movement at different levels of granularity and to develop software with which 'we can quickly visualise inputs and outputs from the model at different scales and through time' [6].

2.2. The decisions of individuals

As well as focusing on how the interactions between individuals affect the overall behaviour of the group in which they participate, some researchers have also examined how the decisions of individual members affect the group's behaviour. Gueron et al [17] have developed a mathematical model which is based on a set of hierarchical decisions that individuals make within a herd. Assuming the group is self-organising, the model examines how these decisions affect the movement, shape and cohesion of the group over time. Taking the perspective that the co-ordination of groups is locally controlled (i.e., patterns form due to the individuals' actions in relation to their neighbours or 'local environmental signals or markers'), Gueron et al note that external factors do not need to be considered; however, they do note the importance of the changes in environmental
conditions on the shape of the group. They note that their model is only a 'starting point' and that tools must be developed to allow the details of movement rules to be explored.

Sumpter [23] reflects on the principles that lie behind the collective behaviour of animals and asks whether a set of principles could be established that would allow us to classify and understand this behaviour. After analysing a wide range of self-organising collectives, Sumpter notes that many of the collective movement patterns exhibited by animals can be explained using rules which the individual members follow. He wishes to develop a set of principles which would allow the development of behavioural algorithms that individuals could follow; these algorithms could be used as a basis for mathematical models capable of predicting the collective behaviour of animal aggregates. In order to develop these behavioural algorithms, Sumpter begins by generating an initial list of principles covering the features that are common to observed collective animal behaviour: individual integrity, leadership, inhibition, redundancy and synchronisation. It is important to note that Sumpter does not intend these principles to be thought of in isolation; they need to be studied further to see how they interact to produce 'complex collective patterns'. Further research is required to see how relevant these principles are to collective motion in general.

Up to now our focus has been on research that looks at the movement patterns of individuals participating in collectives. However, there is existing research which does not recognise collectives as such but highlights what information movement patterns can reveal. This research could also help relate movement patterns to the relevant types of collective, and vice versa. Andrienko and Andrienko [3], for example, focus on the extraction and interpretation of information about a single individual's movement patterns from a large dataset.
The dataset records an individual’s car position over a period of five months. Since data was only recorded when the car was moving, periods when the car was stationary are only implicitly stored in the data. As Andrienko and Andrienko note, the dataset is too large to extract movement patterns using visual techniques alone, and therefore they have used various extraction methods including database queries, computational analysis methods, data transformations and other computer-based operations. Although they only look at behaviour patterns of an individual, rather than of a collective, they do suggest a number of techniques that could be used to identify different behaviour patterns such as the detection of important places in the individual’s life and also routes used to traverse between such places. For example, important places could include the individual’s home and place of work, and they may have a particular route for getting between them.
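The 'important places' idea can be made concrete in a simple way. The sketch below is a hypothetical illustration, not the Andrienkos' actual method: since positions are recorded only while the car is moving, a long gap between consecutive fixes is read as an implicit stop, and stops are grouped into grid cells to find frequently visited places. All thresholds and the sample track are invented for the example:

```python
def important_places(track, min_stop=3600.0, cell=100.0):
    """track: list of (t, x, y) fixes recorded only while moving.
    A gap of at least min_stop seconds between consecutive fixes is
    treated as an implicit stop at the last known position; stops are
    grouped into cell x cell grid squares and counted."""
    counts = {}
    for (t0, x0, y0), (t1, _, _) in zip(track, track[1:]):
        if t1 - t0 >= min_stop:                       # implicit stop
            key = (round(x0 / cell), round(y0 / cell))
            counts[key] = counts.get(key, 0) + 1
    # most frequently visited cells first
    return sorted(counts.items(), key=lambda kv: -kv[1])

track = [(0, 10, 10), (30000, 10, 12),          # long stop near (10, 10)
         (30060, 500, 500), (70000, 505, 498),  # long stop near (500, 500)
         (70060, 12, 9), (110000, 11, 11)]      # back near the first place
places = important_places(track)                # cell (0, 0) counted twice
```

A cell that accumulates many long stops is a candidate important place such as the home or workplace; the routes taken between the top-ranked cells could then be examined in a second pass.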

3. Focusing on the Level of the Collective

The following pieces of research have all defined a list, or even a taxonomy, of movement patterns relevant to collectives; each has its merits but also its problems. This research is summarised at the end of this section in Table 1.

3.1. Thériault et al

Thériault et al [24] propose a taxonomy that classifies sets of geographical entities (SGEs) which are located in a geographical space and whose members share a thematic
or functional relationship. The taxonomy is not designed to be exhaustive: it focuses only on entities that are found in the transportation modelling domain. However, some of the concepts used in the taxonomy could be applied to one that is less restrictive. It is important to note that Thériault et al also suggest a range of existing statistical methods for analysing the geographical patterns exhibited by the entities; however, since we are currently only concerned with the definition of movement patterns, we will focus here only on their proposed taxonomy.

A basic set of assumptions has been used as a basis for the proposed framework. The majority of these assumptions can be seen as acceptable; for example, time is defined as a 'set of measured times that is isomorphic to the set of real numbers' and temporal intervals delimit the life of a geographical entity. However, as discussed in Section 4, the level of granularity at which a movement pattern is observed can affect the way in which it is interpreted. Thériault et al appear to avoid this issue by simply assuming that the measurement units used for both space and time will be chosen according to the needs of the application; they state that they do not intend to 'identify propagation rules across geographical scales and temporal granularities'.

To produce the taxonomy, two sets are identified: one that defines the spatiotemporal properties characterising how the considered sets of entities evolve geographically, and one which defines the 'basic evolution processes of the geographical entities'. Thériault et al note that although a group of entities may exhibit simple behaviour, the 'set behaviour' may be complex, and that defining the 'homogeneous spatial properties for sets' is not a simple issue. Therefore, in their proposed taxonomy, for each SGE the properties of the geographical entities (GEs) at both the entity and the set level are represented as 'computed attributes using statistical aggregation methods'.
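The idea of representing set-level properties as computed attributes can be sketched directly. The attribute names below (centroid, mean speed, spread) are our own illustrative choices, not Thériault et al's:

```python
import statistics

def set_attributes(entities):
    """Aggregate entity-level (x, y, speed) records into set-level
    attributes: centroid, mean speed, and spread (mean distance of
    members from the centroid)."""
    xs = [e[0] for e in entities]
    ys = [e[1] for e in entities]
    speeds = [e[2] for e in entities]
    cx, cy = statistics.mean(xs), statistics.mean(ys)
    spread = statistics.mean(((x - cx) ** 2 + (y - cy) ** 2) ** 0.5
                             for x, y in zip(xs, ys))
    return {"centroid": (cx, cy),
            "mean_speed": statistics.mean(speeds),
            "spread": spread}

group = [(0, 0, 1.0), (2, 0, 1.2), (1, 2, 0.8)]
attrs = set_attributes(group)
```

The set-level behaviour can then be complex even when each member's attribute is simple, which is exactly the point Thériault et al make about 'set behaviour'.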
In previous research [9,10], a 'spatio-temporal model' has been developed that classifies 'basic spatiotemporal processes'. For the purposes of [24] the model restricts the way in which an entity can evolve over space and time to changes involving points (i.e., an entity is considered to have no shape, size or orientation); however, each entity is considered to have an intensity to indicate how important it is during the study. We believe that it is unlikely that this method will allow the relationship between a collective and its members to be modelled, and it is therefore unsuitable for use in our classification system. For example, the size of entities matters little if the spaces between them are particularly large; in contrast, if entities are closely packed together then their size is likely to affect the dynamics of the collective. An example of this is discussed in Section 4.7.

To enable the comparison and measurement of the patterns formed by the way in which the GEs are distributed through space, spatial and spatiotemporal profiles are compiled for each set of points. A spatial profile is one where the location of entities is measured at or over a given time interval (i.e., 'time is controlled'); in a spatio-temporal profile both location and time are measured. By combining these two profiles with movement measures such as travelling speed, the spatiotemporal behaviours of either entities or sets of entities can be compared. Although we are currently only focusing on the definition of collective motion, the use of profiles may need further consideration when we look at ways in which the spatiotemporal behaviour of groups of entities can be compared.

Making no attempt to be exhaustive, Thériault et al distinguish four 'basic components of spatial evolution', all relevant to the domain of transportation modelling: an SGE's footprint, the way in which the entities within the domain are spatially distributed,
the spatial pattern formed by the entities (i.e., their geographical arrangement) and the presence of spatial autocorrelation. Thériault et al note that these can be measured using existing statistical methods, and examples are given for each of the four components. Although only considered for the transportation domain, these components would also be highly relevant to a much broader range of collectives.

The spatiotemporal processes that involve the relevant sets of GEs are identified. It is stressed that Thériault et al are not attempting to build a formal taxonomy of these processes, or to show how the properties of entities are transferred or aggregated to the sets which they form; these are left for forthcoming research. Instead they focus on how changes of a GE affect the evolution of the SGE of which it is a member, in particular the changes that relate to the life of each GE, the movement of entities and the movement of the entire SGE. Possible changes in the life of each GE include its leaving or joining an SGE. Changes in the membership of an SGE can result in changes in the size and density of its footprint, which they choose to represent as a minimum convex polygon; the restriction to convex polygons here could be a serious limitation [15]. GEs could move around while remaining within the boundary of the SGE's footprint; however, the movement of a GE could lead to its interaction with the boundary. If entities move outside the current footprint of the SGE but remain members, then the footprint will expand; if the entities all move inwards (away from the SGE's boundary), then the footprint will contract. As discussed in [27], variable membership is an important feature of many collectives and, as highlighted by Thériault et al, this feature will affect the way in which collective motion is interpreted.

When discussing the possible changes in the movement of the entire SGE, only the fully co-ordinated movements of GEs are considered; partial co-ordination is not addressed.
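The footprint behaviour described above (the minimum convex polygon expanding when members move outside it and contracting when they all move inward) can be sketched with a convex hull. This is an illustrative reconstruction, not Thériault et al's implementation:

```python
def convex_hull(points):
    """Andrew's monotone-chain convex hull: the SGE footprint as a
    minimum convex polygon enclosing its members."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def half(seq):
        h = []
        for p in seq:
            # pop while the last turn is not strictly counter-clockwise
            while len(h) >= 2 and ((h[-1][0]-h[-2][0])*(p[1]-h[-2][1])
                                   - (h[-1][1]-h[-2][1])*(p[0]-h[-2][0])) <= 0:
                h.pop()
            h.append(p)
        return h[:-1]
    return half(pts) + half(reversed(pts))

def footprint_area(points):
    """Shoelace area of the footprint polygon."""
    h = convex_hull(points)
    return abs(sum(h[i][0]*h[(i+1) % len(h)][1] - h[(i+1) % len(h)][0]*h[i][1]
                   for i in range(len(h)))) / 2

members = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
a0 = footprint_area(members)                       # 16.0
moved_in = [(1, 1), (3, 1), (3, 3), (1, 3), (2, 2)]
a1 = footprint_area(moved_in)                      # 4.0: footprint contracts
```

Tracking the hull's area over successive snapshots gives a simple signal for expansion and contraction of the SGE; replacing the convex hull with a non-convex footprint would address the limitation noted in [15].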
Of the possible co-ordinated movements, they consider only GEs moving at the same speed and in the same direction, and GEs moving at different speeds but 'turning around a central point'. The basic components of spatial evolution and the spatiotemporal processes identified by Thériault et al are all highly relevant to the motion of collectives in general and will therefore need to be considered in our classification of collective motion.

3.2. Laube et al

Laube et al [21], in common with many other researchers, note that we need ways to analyse and interpret moving-object data; their research tries to increase the 'analytical power' of GIScience when exploring the motion of objects. In [21] an overview is given of the problems current database management systems have with 'moving point objects' (MPOs), along with a brief discussion of the problems current GIS software has when trying to handle the temporal dimension of information, a key area when trying to adequately represent moving objects. Laube et al have developed the 'RElative MOtion (REMO) analysis concept', which allows the motion attributes of point objects to be compared over space and time, and an object's motion to be related to that of others in the dataset. In addition, REMO allows the user to predefine the motion patterns that are to be detected, using a formalism that is also presented in [21]. Moving point objects are modelled using 'geospatial lifelines'; these consist of three pieces of data: id, location and time.
A two-dimensional matrix (the REMO matrix) is used, whose rows represent objects and whose columns represent time-steps. The first step in the REMO analysis concept involves populating the matrix with the motion attributes speed, change of speed and motion azimuth. A pattern in REMO is defined as 'a set of motion parameter values' which extend in time and across the objects. Primitive patterns are those that are present in only one dimension of the matrix and include constancy, concurrence and change; these can be used to build 'complex patterns'. For example, constancy and concurrence can be combined to form a complex pattern which they name 'trend-setter'. By considering the 'interrelations' that occur in both dimensions of the matrix, complex interactions between multiple moving objects can be considered.

To detect the predefined motion patterns, the REMO analysis concept uses 'syntactic pattern recognition'; in this technique a pattern is seen as comprising 'simpler subpatterns' (i.e., primitives). A description of possible pattern-matching algorithms is given, along with a reasonably detailed discussion of how the REMO analysis concept compares to other tools which analyse spatiotemporal data (e.g., database management systems, data mining, descriptive statistics and exploratory spatial data analysis (ESDA)). One way in which Laube et al state that their approach improves on these existing tools is the ability of the REMO analysis concept to analyse the movement data of multiple individuals concurrently.

Laube et al point out that movement is in itself continuous, but in order for it to be recorded and analysed in information systems its 'essential characteristics' must be represented in a discrete form. This raises the problem of granularity, especially the problem of choosing the most appropriate level of granularity.
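The row/column reading of the REMO matrix can be illustrated with a small sketch. The pattern names (constancy, concurrence) follow the paper; the data layout and the detection functions below are illustrative assumptions, not REMO's actual algorithms:

```python
def constancy(matrix, length):
    """Rows (objects) whose attribute value stays the same for at
    least `length` consecutive time-steps; returns (object, start)."""
    hits = []
    for obj, row in enumerate(matrix):
        run = 1
        for t in range(1, len(row)):
            run = run + 1 if row[t] == row[t - 1] else 1
            if run >= length:
                hits.append((obj, t - length + 1))
                break
    return hits

def concurrence(matrix, count):
    """Time-steps (columns) at which at least `count` objects share
    the same attribute value; returns (time, value)."""
    hits = []
    for t in range(len(matrix[0])):
        col = [row[t] for row in matrix]
        for v in set(col):
            if col.count(v) >= count:
                hits.append((t, v))
    return hits

# motion azimuths of 3 objects over 5 time-steps, in 45-degree sectors
m = [[0, 0, 0, 0, 45],
     [90, 0, 0, 45, 45],
     [0, 0, 45, 45, 45]]
assert constancy(m, 4) == [(0, 0)]      # object 0 holds azimuth 0 from t=0
assert (1, 0) in concurrence(m, 3)      # all objects head azimuth 0 at t=1
```

A 'trend-setter' style complex pattern would then be detected by composing the two: an object whose constancy run begins before a matching concurrence among the others.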
If too coarse a temporal granularity is chosen, undersampling will occur and important information is lost; too fine a granularity may lead to oversampling and possibly to high levels of noise, with 'feigned autocorrelation' being introduced between consecutive moves [21]. It is suggested that undersampling can be avoided if data are collected at the 'highest granularity possible'; oversampling can be lessened by resampling the tracks of the moving objects at increasingly coarse levels of granularity until 'autocorrelation between the moves disappears'. The importance of being able to change the temporal granularity is noted (i.e., temporal zooming [19]); if all the available information in the data is to be uncovered, it is vital that a system can move between 'more and less detailed views'.

The REMO analysis concept has been tested on a diverse range of data: in the paper, the model is evaluated by applying it to football players on a pitch and to 'data points in an abstract ideological space'. These two very different examples highlight the broad range of applications to which the REMO analysis concept could be applied and the wide range of motion patterns that can be defined and extracted from such datasets. However, as noted by Benkert et al [7] and Andersson et al [2], REMO only allows single time-steps to be considered and is therefore unable to detect sequences of change over much longer periods of, say, a month or even a year.

3.3. Andrienko et al

The main focus of the work presented by Andrienko and Andrienko in [5] is to develop a tool-set that would allow an analyst to detect the movement patterns of multiple entities in vast movement data. To do this, however, the patterns that an analyst may wish to detect must be determined, and therefore the properties and the structure of movement data
considered. Although no taxonomy is proposed, a detailed list is given of the movement characteristics and patterns that are specific to multiple entities; importantly, it is the collective movement of such entities on which Andrienko and Andrienko wish to focus their analysis.

Movement data is defined as a function which matches pairs (entity, time moment) with positions in space. It is noted that this function has to be finite (i.e., an abstraction of the real movement), since not every possible pairing of entity and time moment can be recorded. However, Andrienko and Andrienko believe that this model is sufficient for defining the 'types of patterns that may exist in movement data'. From this data, derivative movement characteristics can be computed, such as direction, speed and their changes (called turn and acceleration respectively).

A definition of a pattern is given; it is noted that various patterns can be built out of the basic elements 'pattern types' and 'pattern properties'. Andrienko and Andrienko treat a specific pattern as an 'instantiation of one or more pattern types'. An analyst will look for constructs in the data that can be related to known pattern types; once found, the pattern properties of the found constructs will be observed and measured. As they note, this means that the pattern types relevant to the data being analysed need to be defined.

When dealing with multiple entities, Andrienko and Andrienko distinguish between categories of what an analyst may wish to observe the movement characteristics of: a single entity over time, a set of entities at a single time moment, and multiple entities over a certain period (i.e., taking a holistic view). These are designated Individual Movement Behaviour (IMB), Momentary Collective Behaviour (MCB) and Dynamic Collective Behaviour (DCB) respectively.
The characteristics of an IMB are given as the path and distance travelled, the movement vector and the variation of speed and direction. The 'synoptic characteristics' of an MCB are given as 'the distribution of the entities in space, the spatial variation of the derivative movement characteristics, and the statistical distribution of the derivative characteristics over the set of entities'. It is the DCB that Andrienko and Andrienko believe would be the focus of interest when analysing the movements of multiple entities. They list the various factors that could influence the behaviours and movement characteristics of an entity; these are divided into four main categories: properties of space (e.g., a terrain's characteristics), properties of time (e.g., temporal cycles), properties and activities of the moving entities (e.g., means and way of movement), and various spatial, temporal and spatiotemporal phenomena such as weather, customs and legal regulations. Andrienko and Andrienko summarise the goal of an analyst who focuses on the movement of multiple entities as relating the DCB to these four factors, as well as describing and comparing dynamic collective behaviours. A DCB can be described from two points of view: as 'the behaviour of the IMB over the set of entities' and as 'the behaviour of the MCB over time'; these two views are referred to as 'aspectual behaviours' [5]. Importantly, Andrienko and Andrienko see these two aspectual behaviours as fundamentally different and therefore needing to be described using different pattern types. In previous work Andrienko and Andrienko [4] define what they deem to be the four most general types of pattern, three of which they see as relevant when detecting patterns using visual analysis alone: similarity, difference and arrangement. Arrangement patterns can only be considered when observing MCB, since they examine the 'changes in the MCB with respect to the ordering and distances between the corresponding time moments' [5].

Z. Wood and A. Galton / Classifying Collective Motion

Four different specialisations of a similarity pattern are listed: similarity of overall characteristics (e.g., travelled distances), co-location in space, synchronisation in time and co-incidence in space and time. Co-location in space occurs if the paths followed by the observed entities contain the same, or at least some of the same, positions; this similarity pattern can be further distinguished according to whether the shared positions are visited in the same order ('ordered co-location'), in different orders ('order-irrelevant co-location') or in opposite orders ('symmetry'). The pattern types which describe DCB behaviour are a combination of the three basic pattern types and include: constancy, change, trend, fluctuation, pattern change or pattern difference, repetition, periodicity and symmetry. All these patterns Andrienko and Andrienko refer to as descriptive because they allow us to describe a DCB. In order to describe the relation 'between the DCB and properties of space, time, entities, external phenomena, and events' it is stated that additional types of pattern must be used. Referred to as connectional patterns, these include correlation, influence and structure. Structure allows complex behaviours to be considered as built from simpler ones. Influence occurs when phenomena produce effects on others. Correlation can look at the co-occurrence of any characteristics. The multiple entities that Andrienko and Andrienko are concerned with analysing are those which retain their identity whilst changing their positions in space; the collectives that we are considering could be viewed in this way. However, they do not consider entities which can merge or split. We believe that this is an important feature of some types of collective. Consider a flock of birds — a phenomenon which can be seen daily.
During flight the flock of birds may split into two groups, but we might still refer to them as the same flock; this may be because the two sub-flocks will eventually merge to become one entity again. The question arises as to whether there is a period of time after which the two sub-flocks would be considered new collectives in their own right. How they move in relation to each other will also need to be considered. This problem will be discussed in section 4.

3.4. Dodge et al

A review of existing literature on the discovery of movement patterns in domains such as data mining and visual analytics has shown that there is little agreement on the variation of movement patterns that exist, and very few movement patterns have been defined. This has led Dodge et al [12] to begin to develop a taxonomy of movement patterns. Although incomplete, the taxonomy that is presented highlights the problems of defining movement patterns and introduces a range of concepts and definitions that could prove useful in the development of a classification of collective motion. However, since the focus is not on movement patterns that are specific to collectives, there are still a number of questions left unanswered and problems unsolved. The need to fully understand movement and its properties is noted, and a conceptual framework is developed which focuses on the elements that movement comprises (i.e., the primitives of movement). These include the parameters of movement and the external factors that may influence movement. The conceptual framework is then used as a basis for the proposed taxonomy. Three groups of movement parameters are given
where the first, the primitive parameters, is used to derive the other two: primary derivatives and secondary derivatives. The parameters within each of these three groups are organised according to whether they occur in the spatial, temporal or spatiotemporal dimension. For example, an object's position is given as the sole spatial primitive. From this, the distance, direction and spatial extent are considered primary derivatives; secondary derivatives are given as the spatial distribution (a function of distance), change of direction (a function of direction) and sinuosity (a function of distance). The consideration of the parameters of movement is a useful concept; however, as Dodge et al note, when considering groups of moving objects, movement parameters should be defined in a 'relative sense' (i.e., in relation to the movement of other moving objects) as well as in an 'absolute sense' (i.e., in relation to an external referencing system). It would appear, though, that they only define movement parameters in an absolute sense. In the conceptual framework, four groups of influencing factors are given: 'the intrinsic properties of the moving object', the spatial constraints, the environment in which the movement is taking place, and other agents. Dodge et al have designed their influencing factors to be as generic as possible, since such factors vary according to the type of entity. It is noted that this contrasts with the approach of Andrienko and Andrienko [5], who have developed a classification of influencing factors; however, Dodge et al believe that, with 'respect to the behavioural characteristics of movement', their classes are better defined using their approach. It is difficult to see whether this is indeed true, and further research is required to see the impact of this design choice on a classification of collective motion, or whether there may be a more efficient method. The behaviour of moving objects will differ according to whether they are travelling alone or in a group [12].
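The distinction between the 'absolute' and 'relative' senses of a movement parameter can be made concrete. The following sketch (ours; Dodge et al give no code) computes a direction against an external reference frame and relative to the motion of another object:

```python
import math

def absolute_direction(p0, p1):
    """Direction of motion in an external reference frame (degrees)."""
    return math.degrees(math.atan2(p1[1] - p0[1], p1[0] - p0[0]))

def relative_direction(p0_a, p1_a, p0_b, p1_b):
    """Direction of object a's motion relative to object b's motion:
    the signed angular difference between their absolute directions."""
    diff = absolute_direction(p0_a, p1_a) - absolute_direction(p0_b, p1_b)
    return (diff + 180.0) % 360.0 - 180.0  # normalise to [-180, 180)
```

For example, an object heading due north relative to one heading due east has a relative direction of 90 degrees, wherever the two objects are located.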
As already noted, Andrienko and Andrienko distinguish between MCB and DCB. Dodge et al state that this approach does not allow the 'functional relationship' that exists between the members of a collection to be recognised, and instead distinguish between groups, whose members 'share a behaviourally relevant functional relationship', and cohorts, whose members, while lacking such a relationship, exhibit a common factor that is of some statistical relevance. This distinction is an important one, but it is lost in their proposed taxonomy, which only distinguishes between individuals and groups; they consider the type of relationship that exists between the individuals in a collection useful only in the interpretation of movement patterns and not in their definition. The proposed taxonomy is based on the relevant literature that they have reviewed, to ensure that redundant terminology is minimised. This has resulted in many of the movement patterns defined by Andrienko and Andrienko [5] and Laube et al [21] being included. However, they have added some new movement patterns and an organisation scheme to produce a taxonomy. Primarily, the proposed taxonomy is organised according to the two pattern types distinguished by Dodge et al: generic patterns and behavioural patterns. The former are movement patterns that can be found in 'any form of behaviour', whereas the latter cover movement that is specific to particular types of moving object. Generic patterns can be thought of as the building blocks for forming behavioural patterns. Since generic patterns can also range in complexity, they are further categorised into primitive patterns, where only one movement parameter varies, and compound patterns, which comprise a set of primitive patterns. After distinguishing by type, the taxonomy is then organised by the dimensions in which possible movement patterns can occur. Primitive patterns are split into three: spatial, temporal and spatiotemporal; compound patterns are only
categorised as spatiotemporal, and behavioural patterns are not further categorised. After the second distinction, a list of movement patterns is given, some of which are split into sub-categories. However, we would suggest that the movement patterns could be organised further if included in a classification of collective motion. Many of the generic movement patterns that are defined by Dodge et al would need to be included in a classification of collective motion; for example, spatial concentration, synchronisation, repetition, co-incidence, constancy and concurrence. The taxonomy proposed by Dodge et al acknowledges the possibility of fixed or variable membership by qualifying their movement patterns meet and moving cluster by means of the additional attributes fixed and varying, which refer to whether the membership of the group can change. The difference between a meet and a moving cluster is that the former stays within a stationary region over the period of its existence whereas the latter follows a trajectory through space. However, there are examples of collective motion that have not been included. A football match involves two teams (i.e., two collectives) interacting with each other. The movement patterns exhibited by each team arise primarily from its collective purpose of winning the match, but the spatial patterns involved in winning the match are oppositely oriented for the two teams. The resulting complementarity of the movement patterns forms a new type of movement pattern in its own right, not present in the Dodge classification. We believe that lack of movement is also an important movement pattern which is missing. Consider a traffic jam in which no traffic is moving; the lack of movement could indicate that the road has been blocked by something such as an accident. Another example is an orchestra during a performance.
In the proposed taxonomy, Dodge et al define meet as a movement pattern consisting of a 'set of MPOs [moving point objects] that stay within a stationary disk of specific radius in a certain time interval . . . a stationary cluster'. This could be suitable for describing the movement of an orchestra whilst it is performing, but it could equally describe a football team during a match — the taxonomy is not capable of distinguishing between these two very different types of collective: one where the individuals exhibit a wide variety of movement within the fixed region to which they are confined, and another where movement is largely absent. The importance of granularity in analysing a movement pattern is noted, but Dodge et al believe that the decision of what temporal and spatial granularity to use is specific to a domain, and it is therefore somewhat ignored by their classification.
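The meet pattern, and the distinction it fails to make, can be sketched as follows. The `is_meet` predicate paraphrases Dodge et al's definition; `mean_internal_displacement` is our own suggested extra attribute (not part of the published taxonomy) for separating a near-static meet, such as an orchestra, from one with lively internal motion, such as a football team. The example data are invented:

```python
import math

def is_meet(trajectories, centre, radius):
    """Dodge et al's 'meet': every moving point object stays within a
    stationary disk of the given radius for the whole interval.
    trajectories: {object_id: [(x, y), ...]} sampled at common times."""
    return all(
        math.hypot(x - centre[0], y - centre[1]) <= radius
        for path in trajectories.values() for (x, y) in path
    )

def mean_internal_displacement(trajectories):
    """Mean per-step displacement of the members: near zero for an
    orchestra-like meet, large for a football-team-like meet."""
    steps, total = 0, 0.0
    for path in trajectories.values():
        for (x0, y0), (x1, y1) in zip(path, path[1:]):
            total += math.hypot(x1 - x0, y1 - y0)
            steps += 1
    return total / steps if steps else 0.0

# Invented examples: static performers versus mobile players.
orchestra = {"violin": [(1.0, 1.0)] * 4, "cello": [(2.0, 1.0)] * 4}
footballers = {"p1": [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0), (3.0, 3.0)],
               "p2": [(3.0, 3.0), (0.0, 3.0), (3.0, 0.0), (0.0, 0.0)]}
```

Both examples satisfy `is_meet` for a suitable disk, yet their internal displacement differs sharply, which is exactly the distinction the taxonomy cannot make.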

4. What is Missing?

Our analysis of existing research has highlighted many useful concepts and methods that could be used in the development of a classification of collective motion. However, many problems relating to the definition of the movement patterns of a collective remain unanswered. This section will briefly discuss these issues.

4.1. Considering the collective

There are two distinct levels at which a collective can be observed: a lower level at which all that can be seen are the individual members of the collective, and a higher level at which the collective itself can be seen. By focusing on the individual members and the interactions
Table 1. A summary of research that focuses on the level of the collective

Thériault et al [24]
Merits:
• The use of spatial and spatiotemporal profiles.
• The 'basic components of spatial evolution' and spatiotemporal processes that have been identified.
Outstanding problems:
• Does not address the granularity problem.
• Restriction of entity changes to those that involve a point.
• Restriction of footprint to convex polygon.
• Only considers co-ordinated motion.

Laube et al [21]
Merits:
• REMO analysis concept allows a wide range of motion patterns to be detected.
• Broad range of applications.
• Can analyse the movement data of multiple individuals concurrently.
• Discussion of granularity problem.
Outstanding problems:
• Can only consider single time steps and therefore cannot detect sequences of change over much longer periods.
• Some solutions are given to choosing the correct granularity, but still unable to view movement on multiple levels of granularity.

Andrienko and Andrienko [5]
Merits:
• A list is given of the movement characteristics and patterns that are specific to multiple entities.
• Define patterns as comprising two elements: 'pattern types' and 'pattern properties'.
• List the various factors that could influence the behaviours and characteristics of an entity.
• Distinguish between three types of pattern relevant when detecting patterns using visual analysis alone: similarity, difference and arrangement.
Outstanding problems:
• Movement patterns are simply listed — they have not been organised into a taxonomy.
• Do not consider entities that can split or merge.

Dodge et al [12]
Merits:
• Present a detailed taxonomy of movement.
• Taxonomy is based on a conceptual framework which looks at what movement comprises.
• Organisation of the taxonomy considers whether the movement pattern is 'generic' or 'behavioural', and then which dimension the pattern occurs in (i.e., spatial, temporal or spatiotemporal).
Outstanding problems:
• Movement patterns are only defined in an absolute sense (i.e., in relation to an external referencing system).
• Since the taxonomy is incomplete there are gaps.
• Taxonomy does not include all of the necessary examples of collective motion (e.g., lack of movement).
• Taxonomy is unable to distinguish between some examples of collective motion which are in fact very different.
• The problem of granularity is believed to be domain-specific and therefore somewhat ignored in the taxonomy.

that exist between them, it is difficult, if not impossible, to see the behaviour of the collective as a whole; vital information regarding the collective's motion is lost. As humans, we are very good at seeing the 'big picture' and therefore at seeing the motion of a collective as a whole. However, this ability will need to be duplicated within a machine — a way must be found to consider both the motion of the collective (when considered as a single unit) and the motions of its individual members. As Thériault et al note, 'existing
frameworks must be improved in order to describe joint evolution of entities forming sets at various abstraction levels'.

4.2. The relationship between a collective and its members
The relationship that exists between the motion of the collective and those of its members is very important; it seems unlikely that this relationship will be accurately modelled by modelling the motions of the collective and those of its members separately. Some may argue that the movement of a collective is simply the aggregated motion of its individual members; however, we would not always agree with this statement, and believe there to be at least two cases describing the relation between a collective's motion and those of its members. In the first case, the movement of the collective is formed from the aggregation of the motions of its members, and the collective and its members are moving in a similar way. In [27] we say that the movements of the collective and those of its members are co-ordinated; examples include a procession and a platoon on the march. In the second case, the movement of the collective is not co-ordinated [27] — no individual member of the collective is following the same path as the collective. Although the collective's motion is still formed from the aggregated motions of the individuals, the motions are qualitatively distinct. An example of this case is a crowd whose individual members are moving about randomly while the crowd, considered as a single unit, gradually drifts off in a particular direction. Figures 2(a) and 2(b) illustrate these two extremes of co-ordination. To indicate the relationship between the motion of the collective and those of its members, the top row shows the motions of the individuals in relation to that of the collective, and the bottom row the actual motions of the individuals. Each vector shown for an individual in the bottom row is the vector sum of the two vectors in the top row: the individual's vector (zero in the fully co-ordinated case) and the collective's vector.
It should be noted that these two cases are extremes; there may be intermediate cases that we have not yet considered.

(a) The movement of the collective and those of its members are co-ordinated. (b) The motion of the collective and those of its members are not co-ordinated.
Figure 2. The two extremes of co-ordination
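The vector-sum construction of figure 2 can be sketched in a few lines of Python (our illustration, with invented example vectors):

```python
def actual_vectors(collective_vector, relative_vectors):
    """Each member's actual motion vector is the vector sum of the
    collective's vector and the member's own vector relative to the
    collective (zero in the fully co-ordinated case)."""
    cx, cy = collective_vector
    return [(cx + rx, cy + ry) for (rx, ry) in relative_vectors]

# Fully co-ordinated: every relative vector is zero, so all actual
# vectors equal the collective's vector.
coordinated = actual_vectors((1.0, 0.0), [(0.0, 0.0)] * 3)

# Not co-ordinated: members move in varied directions, yet the
# collective still drifts along its own vector.
drifting = actual_vectors((1.0, 0.0), [(0.5, 1.0), (-0.5, -1.0), (0.0, 0.3)])
```

In the second case no member's actual vector coincides with the collective's, matching the crowd example above.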


Modelling social interactions and the decisions made by the individual members of a collective constitutes a 'bottom-up' approach to identifying a collective's movement patterns. As well as a bottom-up approach, we believe that a 'top-down' approach is needed, which could be used to identify the type of collective from its global movement patterns. Once the type has been identified, a bottom-up approach may be needed to explain or predict how the individual members of a collective will interact, and therefore which movement patterns of the collective will follow.

4.3. The relationship between members

As noted by Dodge et al [12] and Laube et al [21], the motions of individuals need to be considered in relation to each other as well as in relation to an external referencing system. Therefore, as well as looking at the co-ordination between the motion of the collective and the motions of its members, we can also analyse the relationship that exists between the individual members, in particular the degree of co-ordination exhibited [5,12]. There are many ways in which the motions of individual members can be considered co-ordinated; three examples can be seen in figure 3: (1) all individuals move away from a central point, (2) all individuals move toward a central point, and (3) all individuals move in the same direction at the same speed.


Figure 3. Ways in which the members’ movements can be considered co-ordinated
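The three modes of co-ordination in figure 3 can be given a rough operational sketch (ours; tolerances and noise handling for real data are glossed over):

```python
def coordination_mode(positions, velocities, tol=1e-9):
    """Classify the co-ordination modes of figure 3 from member
    positions and velocity vectors: 'translate' (all share one
    velocity), 'diverge' (all move away from the centroid),
    'converge' (all move toward it), or None otherwise."""
    n = len(positions)
    cx = sum(x for x, _ in positions) / n
    cy = sum(y for _, y in positions) / n
    # Radial component of each velocity relative to the centroid:
    # positive means outward motion, negative means inward motion.
    radial = [(x - cx) * vx + (y - cy) * vy
              for (x, y), (vx, vy) in zip(positions, velocities)]
    if all(abs(vx - velocities[0][0]) < tol and abs(vy - velocities[0][1]) < tol
           for vx, vy in velocities):
        return "translate"
    if all(r > tol for r in radial):
        return "diverge"
    if all(r < -tol for r in radial):
        return "converge"
    return None
```

For four members placed symmetrically around a central point, outward velocities yield 'diverge', inward velocities 'converge', and a shared velocity 'translate'.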

Movement patterns have already been defined to consider this aspect of entities that are moving in groups. For example, both Dodge et al [12] and Andrienko and Andrienko [5] define co-location in space, concurrence and synchronisation in time [13]; these could all be used to describe the co-ordination of individual members' movements in space and time. Thériault et al [24] include in their proposed taxonomy the concept of co-ordination, but only consider full co-ordination; they do not consider partial co-ordination. If the relationship between members is to be fully modelled, the degree of co-ordination would have to be modelled, as well as which aspects of the individuals' motions are co-ordinated.

4.4. Granularity

Motion is very much dependent on granularity, both in space and time [12,6]. Much of the existing research examined in this paper involves decomposing complex movement patterns into simpler ones [21,5,12]. Usually referred to as primitives, these simpler movement patterns are ones in which only one movement parameter changes. The approach of using primitives as 'building blocks' to form more complex movement patterns could be seen as one way in which the granularity problem can be overcome. However, such primitives must be chosen with care. Consider walking as an
example. At one level of granularity this can be seen as a motion from one point to another. At a lower level of granularity, walking consists of the repeated movement of one leg after the other. There is no natural lowest level of granularity at which the motion can be seen as homogeneous. Another problem with the existing research regarding granularity is that none of it appears to allow switching between granularities; in order to adequately classify collective motion, we must be able to observe movement patterns at different levels of granularity, and since movement relies on both the spatial and temporal dimensions, we must be able to do so in both. This is a difficult problem to overcome, but it has been examined by Hornsby and Egenhofer [20], who have developed a model which allows the movement of individuals or objects to be represented over multiple granularities both in space and time. As they note, this method allows a user to uncover much more information about the movement in the data. In the proposed model [20], each individual's movement is modelled as a geospatial lifeline, inspired by Hägerstrand's time geography [18]. Like that of Laube et al [21], a geospatial lifeline records the locations visited by the individual over a period of time. However, with Hornsby and Egenhofer's model the user can choose at which level of granularity to observe the geospatial lifeline. It is noted that movement is a continuous process which is usually observed through discrete time samples; Hornsby and Egenhofer relate the switching between different levels of granularity to a change of sampling rate. Depending on the level chosen, the lifeline will be modelled as a lifeline bead, a lifeline necklace, a lifeline thread, a lifeline tube or a lifeline trace.
A lifeline bead gives the set of all possible locations that an individual may have visited or passed through; this set is calculated from the individual's given start and end points in space-time and maximum speed. Consisting of 'two inverted half cones', a lifeline bead allows us to analyse which locations might have been visited by an individual and for how long. Comparisons of the beads of two individuals can reveal periods when they might have met whilst moving (two beads share a common part at the rim), whilst stationary (two beads intersect), and points at which they might have met (two beads touch at the rim). To refine the granularity at which the lifeline is observed, a lifeline necklace is observed instead of a single bead. The necklace arises from additional sample points being introduced into the model, and consists of a sequence of beads. It is important to note that the end point of one bead is also the starting point of the next. By examining the necklace, the locations that an individual visited can be refined. Since each bead is calculated from the individual's maximum speed, comparisons of two beads within a necklace can help visualise changes in speed; for example, a wider bead indicates a faster speed than a narrower one. To view the movement at a coarser level of granularity, Hornsby and Egenhofer present several methods [20]. The beads in a lifeline necklace can be aggregated into 'fewer, generalised beads', or some can be selectively omitted. However, one can also move from viewing a necklace to a lifeline thread, a lifeline tube or a lifeline trace; each of these allows the model to be viewed at an increasingly coarse level of granularity.
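The two-half-cone bead can be sketched as a simple reachability test (our paraphrase of the construction, not code from [20]): a space-time point lies in the bead exactly when it is reachable from the start fix, and the end fix is still reachable from it, given the maximum speed.

```python
import math

def in_bead(start, end, vmax, point):
    """Test whether a space-time point (x, y, t) lies inside the
    lifeline bead defined by start and end fixes (x, y, t) and a
    maximum speed vmax. The two conditions correspond to the two
    inverted half cones: reachable-from-start and can-still-reach-end."""
    (xs, ys, ts), (xe, ye, te), (x, y, t) = start, end, point
    if not ts <= t <= te:
        return False
    return (math.hypot(x - xs, y - ys) <= vmax * (t - ts) and
            math.hypot(xe - x, ye - y) <= vmax * (te - t))
```

For an individual observed at the origin at times 0 and 10 with maximum speed 1, a point 4 units away at the midpoint time is inside the bead, whereas one 6 units away is not; the bead is widest at the midpoint, which is why a wider bead indicates a faster maximum speed.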
Retaining the start and end points of each bead in the necklace, a lifeline thread is a 'linear approximation of an ordered sequence of space-time samples' which shows the 'likely space-time points' that an object or individual may have visited whilst in continuous
motion between two points. A lifeline tube is a 'shape-approximating approach' which approximates a necklace's geometry; a tube models the movement by taking the speed at the start and end points of the necklace, along with point locations selectively chosen from the rim of each of the necklace's beads. Derived from a tube (based on its centre or 'biased towards the side of the tube'), a lifeline trace is the coarsest view from which a lifeline can be observed.

4.5. External and internal factors

Collective motion can be affected by both internal and external factors; therefore, both of these will need to be represented in our classification system. External factors include phenomena such as weather conditions [5], environmental factors (e.g., terrain [24]) and other agents, which may themselves be collectives. The various interactions that can take place between individual members of a collective can be viewed as internal factors. Some interactions have already been suggested and examined (e.g., attraction, repulsion and alignment [13]) but, as already noted, further research is needed to see if these three interactions alone can represent the full range of collective motions. The degree of co-ordination between members, as discussed in section 4.3, could also be considered an internal factor.

4.6. Allowing for variable membership


Thériault et al [24] include in their proposed taxonomy the possibility of variable membership; however, from the existing research that we have examined, they appear to be the only ones to do so. This is an important feature of many collectives, and it can have a dramatic effect on the shape of a collective's footprint.

(a) Members leaving the collective. (b) Members joining the collective. (c) New members joining the collective and old ones leaving.
Figure 4. Effects of variable membership


Figure 4 highlights three very simple examples of how the footprint of a collective can change because of variable membership: members leaving a collective will result in the size of the footprint decreasing (figure 4(a)); new members joining will result in the footprint increasing (figure 4(b)); and when new members join and old members leave at the same time, the footprint will remain essentially the same (figure 4(c)). In each of these diagrams, the upper figure shows an earlier state of the collective and the lower figure a later state. Since variable membership can have a dramatic effect on the size, shape and dimensions of a collective's footprint, it must be considered throughout the development of a classification of collective motion.

4.7. Accounting for the footprints of the individuals

Much of the existing research examined in this paper ignores the size, shape and dimensions of the entities being considered. We believe that this approach is not suitable for many of the phenomena which we wish to classify since, for some entities, their size and shape will have an impact on their collective motion. This effect could be due to the members of the collective interacting with the environment. For example, consider the collective which consists of a convoy of buses travelling from A to B. The sizes of the buses will affect the collective motion that is exhibited (e.g., their movement may be constrained by a low bridge or a narrow road). Existing approaches take account of the size and shape of the collective by modelling its footprint [24]; we believe that these attributes of the individual members should also sometimes be taken account of in this way (i.e., by modelling them as footprints).
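One simple way to make the footprint discussion concrete is to model the footprint as a convex polygon, as Thériault et al do, and observe how a membership change alters its area. The following sketch (ours, with invented member positions) uses the convex hull of the members' positions as the footprint:

```python
def convex_hull(points):
    """Andrew's monotone-chain convex hull, returned counter-clockwise.
    The collective's footprint is here modelled, as in Thériault et al,
    as a convex polygon around its members' positions."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain
    lower, upper = half(pts), half(reversed(pts))
    return lower[:-1] + upper[:-1]

def footprint_area(points):
    """Shoelace area of the convex-polygon footprint."""
    hull = convex_hull(points)
    return abs(sum(x0 * y1 - x1 * y0 for (x0, y0), (x1, y1)
                   in zip(hull, hull[1:] + hull[:1]))) / 2.0

# Invented positions: members leaving shrink the footprint (figure 4(a)).
members = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
after_leaving = [(0, 0), (2, 0), (2, 2), (0, 2)]
```

Note that interior members (such as the one at (2, 2)) do not affect a convex footprint at all, which is one limitation of the convex-polygon model noted in table 1.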


4.8. Splitting and merging

From the existing research that we have studied, no-one appears to have considered the motion patterns of collectives which can merge or split whilst still retaining their identity. As already noted, this is a possible feature of collectives (e.g., a flock of birds). There are many questions which will need to be answered regarding these types of collective (see section 3.3); however, we do feel that the motion of these phenomena should be represented, or at least considered, in a classification of collective motion.

5. The Goals of a Classification of Collective Motion

From the discussions in the previous section, it is clear that an adequate and accurate classification of collective motion will need to satisfy the following goals:

1. Both the collective's motion (when considered as a single unit) and the motions of its individual members are accurately modelled.
2. The system models, and does justice to, the relationship that exists between the motion of a collective and those of its members.
3. The motions of the individual members are modelled in relation to each other as well as to an external referencing system (i.e., the relationships between the individual members are modelled).
4. The system allows the collective motion to be viewed on multiple levels of spatial and temporal granularity (the model should also be able to switch between these levels).


5. Both the internal and external factors that affect the collective motion are modelled, especially environmental factors and those due to external agents.
6. The effects of variable membership on a collective's motion (including changes in footprint) are accurately modelled in as much detail as possible.
7. The dimensions of both the individual members and the collective are modelled and considered.
8. The system allows the splitting and merging of collectives to be represented.

6. Our Proposed Classification System

Taking these findings into account, we have begun the development of a classification of collective motion. The remainder of this paper will introduce this classification system and highlight how it tries to satisfy the above goals and thereby improve on existing research.

6.1. The foundation of the system: what is movement?

In order to develop an adequate classification of collective motion it is important to fully understand what we mean by the term motion and how it should be treated — this will act as the foundation of the system and will have a great effect on how the rest of the system is developed. This concept is similar to that of the conceptual framework in the taxonomy proposed by Dodge et al [12]. After considering many examples of collective motion, we believe that to describe such motions adequately it is necessary to take into account the distinction between processes and events and the ways in which they can be combined.


6.1.1. The theory of processes and events

By a process we mean an open-ended activity or sequence of events which can in principle be continued indefinitely and which can, at some level of granularity, be viewed as homogeneous. Examples are a person walking or running, or the flowing of a river. Processes of this kind (called 'open processes' in [14]) should not be confused with closed routines consisting of a definite sequence of actions with a clear end point, such as making a table or preparing a meal; these are called 'closed processes' in [14], but they lack the homogeneity and open-ended character of the open processes. It is only the open processes that we will regard as 'true' processes; the 'closed processes' are better understood as events, by which we mean bounded occurrences with more or less definite beginnings and endings, which at some level of granularity can be conceptualised as point-like. The simplest kinds of events include both the initiation and termination of processes or states (e.g., starting to walk) and homogeneous 'chunks' of process bounded by starting and stopping events (e.g., an event consisting of someone's beginning to walk, walking for a while, and then stopping). A theory of processes and events must include, in addition to these primitive events, a set of operators by means of which primitive events can be combined to form more complex events and processes. Such operators include sequential composition, by which a composite event consists of two or more simpler events occurring sequentially, one after the other, and parallel composition, by which a composite event can be built up from two or more simpler events occurring

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

149

Z. Wood and A. Galton / Classifying Collective Motion

over the same time interval. An operator which forms higher-level processes from events is indefinite repetition: the open-ended repetition of some event type, when considered at a level of temporal granularity too coarse to discern the individual events, is seen as a process. An example is a person’s heartbeat, which is an ongoing process comprising many individual heartbeat events. Processes can also consist of other processes; unlike events, however, they can only be combined using parallel composition, since they do not have the endpoints which are a prerequisite for sequential composition. An apparent sequential composition of processes such as ‘walking followed by running’ must in fact be a composition of events, i.e., definite ‘chunks’ of walking and running, the walking ‘chunk’ finishing as the running ‘chunk’ begins. In what follows we shall refer to such ‘chunks’ as episodes. 6.1.2. Episodes We would like to use some of the concepts that are introduced in the theory of processes and events and propose the following basis to consider motion.

• A particular example of motion that we wish to classify will be referred to as an instantiation of collective motion, with each instantiation taking place over a period of time.
• An instantiation can be thought of as an event which may comprise a group of smaller events that are themselves made up of episodes.
• In our classification system a set of possible episodes will be defined.
• Also defined in our proposed system is a set of transitions that allow us to examine how episodes are joined together.

The system should allow each instantiation to be viewed on multiple levels of granularity (both spatially and temporally), with particular episodes becoming apparent when viewed at the level of granularity at which their constituent processes may be seen as homogeneous. Our use of episodes can be thought of as a way of enabling 'temporal zooming', and since we will examine both the collective's motion and those of its individuals using separate episodes (see section 6.2), our system can also be considered as instantiating a form of 'spatial zooming'. Figure 5 illustrates how an instantiation may be broken down into episodes and transitions when viewed at increasingly fine levels of granularity.

Figure 5. An illustration of viewing motion using episodes, from coarse to fine granularity over time — black separators mark transitions between white episodes.
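The episode-based view of an instantiation can be sketched in code. The following is a minimal illustration of our own (not part of the authors' system; all names are ours, and the majority-label windowing is one of several ways 'temporal zooming' could be realised): a trace of time-stamped activity labels is segmented into episodes, and the chosen granularity decides which episodes are discernible.

```python
from dataclasses import dataclass
from collections import Counter
from typing import List, Tuple

@dataclass
class Episode:
    label: str      # homogeneous activity, e.g. "walking"
    start: float
    end: float

def segment(samples: List[Tuple[float, str]], granularity: float) -> List[Episode]:
    """Group time-stamped activity labels into episodes.

    At a coarse granularity short fluctuations are absorbed into the
    dominant activity of each window, so fewer, longer episodes appear;
    at a fine granularity every change of label starts a new episode.
    """
    if not samples:
        return []
    t0, t_end = samples[0][0], samples[-1][0]
    # Label each window of width `granularity` by its majority activity.
    windows = []
    t = t0
    while t <= t_end:
        in_win = [lab for (ts, lab) in samples if t <= ts < t + granularity]
        if in_win:
            windows.append((t, Counter(in_win).most_common(1)[0][0]))
        t += granularity
    # Merge adjacent windows with the same label into one episode;
    # each change of label corresponds to a transition between episodes.
    episodes = [Episode(windows[0][1], windows[0][0], windows[0][0] + granularity)]
    for (ws, lab) in windows[1:]:
        if lab == episodes[-1].label:
            episodes[-1].end = ws + granularity
        else:
            episodes.append(Episode(lab, ws, ws + granularity))
    return episodes

# A trace of someone walking, pausing briefly, then running:
trace = [(t, "walk") for t in range(0, 10)] + \
        [(t, "stand") for t in range(10, 12)] + \
        [(t, "run") for t in range(12, 20)]
fine = segment(trace, granularity=1)    # three episodes: walk, stand, run
coarse = segment(trace, granularity=10) # the brief pause is no longer discernible
```

At the fine granularity the brief standing 'chunk' appears as an episode in its own right; at the coarse granularity it is absorbed, mirroring how Figure 5's separators multiply as granularity increases.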


6.2. Building on the foundations: how will instantiations be classified?

The proposed classification system will classify an instantiation of collective motion using the following three criteria.

1. Movement of the collective considered as a point. This criterion classifies an instantiation of collective motion by considering the motion of some representative point within the collective (e.g., its geometric centroid, or its centre of gravity). This approach allows us to consider a collective as a single unit. To classify a collective in this way two features can be examined: the composition of the motion in terms of episodes and the structure of each individual episode.
2. Evolution of the footprint. This criterion classifies a collective motion according to the changes in shape, size and orientation of its footprint. As with the other criteria we will use the concept of an episode to classify the evolution of a footprint.
3. Movements of the individuals. Collective motion ultimately arises from the motions of a group of individuals, and this criterion looks specifically at how criteria 1 and 2 arise from these movements. Three features of the individual members will be examined: the composition of their motion in terms of episodes, the structure of each individual episode, and the relation between the movements of the members of the group (i.e., the degree of co-ordination exhibited).
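Criteria 1 and 2 both derive from the positions of the individual members at each instant. As a rough sketch of our own (not the authors' implementation), the representative point can be taken as the geometric centroid, and the footprint approximated by the axis-aligned bounding box of the members' positions; a convex hull would be a more faithful footprint, but a bounding box keeps the illustration short.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def centroid(members: List[Point]) -> Point:
    """Criterion 1: reduce the collective to a single representative point."""
    n = len(members)
    return (sum(x for x, _ in members) / n, sum(y for _, y in members) / n)

def footprint_bbox(members: List[Point]) -> Tuple[Point, Point]:
    """Criterion 2 (crude footprint): axis-aligned bounding box of the members."""
    xs = [x for x, _ in members]
    ys = [y for _, y in members]
    return ((min(xs), min(ys)), (max(xs), max(ys)))

def footprint_area(members: List[Point]) -> float:
    """Size of the (approximate) footprint at one instant."""
    (x0, y0), (x1, y1) = footprint_bbox(members)
    return (x1 - x0) * (y1 - y0)

# Four members of a collective at one instant:
snapshot = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
assert centroid(snapshot) == (1.0, 0.5)
assert footprint_area(snapshot) == 2.0
```

Tracking these two quantities over successive snapshots gives the per-instant raw material from which episodes under criteria 1 and 2 can be identified.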


6.2.1. Classifying an Instantiation of Collective Motion

This section uses an example to illustrate how collective motion may be presented in our proposed system. It is important to remember that the system we are presenting is still in its infancy and many questions require further research; however, we believe that the method we are proposing (i.e., considering motion as a combination of processes and events) will provide a solid foundation for a classification of collective motion which also begins to satisfy the goals laid out in section 5. For our example, we consider the movement of multiple people through a tunnel. There are two distinct types of collective that we might pick out here. On the one hand, there is the collective whose members at any given time are just the people who are in the tunnel at that time (Figure 6(a)); this collective has variable membership and is always located in the tunnel, so is in effect stationary. This case can be considered as a continuous process comprising three spatially distinct sub-processes that are all in operation at the same time: convergence towards the tunnel's entrance, passage through the tunnel, and dispersion from the tunnel's exit. On the other hand, we can consider a collective comprising a fixed group of people (for example a tour group) whose path takes them through the tunnel (Figure 6(b)). In contrast to the previous case, the motion of this type of collective exhibits three distinct temporal phases: the arrival of the crowd at the tunnel (Figure 7(a)), the period when the crowd is passing through the tunnel (Figure 7(b)), and the crowd emerging from the tunnel and dispersing (Figure 7(c)). Let us take the example of a tour group moving through the tunnel and consider how it may be classified in our proposed system.

Figure 6. Two possible examples of collective motion through the tunnel: (a) the collective which consists of those passing through the tunnel; (b) a crowd moving through the tunnel.

6.2.2. A crowd passing through the tunnel

Firstly, all of the necessary episodes and transitions need to be defined (Table 2). Once they have been defined, the instantiation of collective motion needs to be classified according to the three criteria set out in section 6.2. Since we only wish to give a demonstration of how the foundations of our system will work, it is sufficient to look only at criteria 1 and 2.


• The movement of the collective considered as a point:
∗ At a coarse level of granularity, the motion of the collective when considered as a point can be observed as that shown in Figure 8; the motion of the point can be seen as consisting of a single episode which starts at a time before the point has entered the tunnel (point A in Figure 8) and finishes at a time after it has left (point B). If we increase the level of granularity, three clear episodes can be observed: moving to the entrance of the tunnel, passing through the tunnel, and then moving from the exit of the tunnel and away from it. Each episode is joined to the next by a transition which, minimally, simply refers to the change of external context of the motion (i.e., in the tunnel vs. out of the tunnel) but may also be accompanied by internal changes in speed or direction. At increasingly fine levels of granularity, more episodes may become apparent; however, these will not be relevant at the description level of 'passing through the tunnel'. Therefore, the instantiation of collective motion according to this criterion will be classified as shown in Figure 9(a).

Transitions            Episodes
in the tunnel          constant velocity
out of tunnel          constant expansion
change in direction    constant contraction
change in speed        constant convergence
change in size         constant divergence

Table 2. The pre-defined sets of transitions and episodes

• The evolution of the footprint:


∗ At a very coarse level of granularity the footprint of the collective can be seen to move between two points; at this granularity, the footprint would be considered so small that changes to its size, shape and dimension are insignificant. Viewing the footprint at a finer level of granularity, the motion of the footprint will be observed as consisting of three episodes: moving towards the entrance of the tunnel, moving through the tunnel, and exiting and moving away from the tunnel; these three episodes are distinguished by changes in the relationship between the footprint and external factors (i.e., whether it is in or out of the tunnel). It is only at a finer level of granularity that changes in the size and shape of the footprint become apparent. At this level of granularity, the instantiation of motion will consist of five episodes: essentially constant size and shape; decreasing size and shape; essentially constant shape; increasing size and shape; and finally, essentially constant size and shape. These episodes are joined by the one transition, change in shape. Figure 9(b) illustrates how the evolution of the footprint for this instantiation of collective motion will be classified.
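The five footprint episodes above can be recovered mechanically from a series of footprint sizes. The sketch below is our own illustration (the tolerance value is an assumption, not part of the proposed system): each interval between successive measurements is given a label drawn from the episode vocabulary of Table 2, and runs of identical labels are merged into episodes.

```python
def label_footprint_changes(areas, tolerance=0.05):
    """Assign an episode label (cf. Table 2) to each interval between
    successive footprint measurements, based on relative area change.
    `tolerance` (our assumption) sets how much change still counts
    as 'essentially constant'."""
    labels = []
    for prev, cur in zip(areas, areas[1:]):
        change = (cur - prev) / prev
        if abs(change) <= tolerance:
            labels.append("essentially constant")
        elif change > 0:
            labels.append("constant expansion")
        else:
            labels.append("constant contraction")
    # Merge consecutive identical labels into episodes of [label, length].
    episodes = []
    for lab in labels:
        if episodes and episodes[-1][0] == lab:
            episodes[-1][1] += 1
        else:
            episodes.append([lab, 1])
    return episodes

# Footprint areas of a crowd approaching, squeezing through and leaving
# a tunnel: steady, contracting, steady (in the tunnel), expanding, steady.
areas = [10.0, 10.1, 10.0, 8.0, 6.0, 6.1, 6.0, 8.0, 10.0, 10.1, 10.0]
episodes = label_footprint_changes(areas)
# Five episodes are recovered, matching the five phases described above.
```

Each boundary between two differently-labelled episodes in the output corresponds to the 'change in shape' (or size) transition of Table 2.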

Figure 7. The three temporal parts occurring when a crowd moves through a tunnel: (a) a crowd arriving at the entrance of the tunnel; (b) a crowd passing through the tunnel; (c) a crowd emerging from the tunnel and dispersing.

Figure 8. The motion of the collective when considered as a point, moving from point A to point B.

6.3. Considering the goals

Table 3 indicates how we believe we can develop a classification of collective motion which satisfies the goals discussed in section 5.


Figure 9. The classification of the instantiation of collective motion based on two criteria: (a) criterion 1, the motion of the collective when considered as a point; (b) criterion 2, the evolution of the footprint.

Table 3. How does our proposed system satisfy the previously defined goals?

Goal 1: Our proposed framework will allow an instantiation of collective motion to be classified by looking at both the motion of the collective (when considered as a single unit) and those of its individual members. These two different motions are looked at separately.

Goal 2: Each instantiation of collective motion will classify the motion of the collective and the motions of the individual members separately. The relationship between the two motions can be examined by comparing the way in which the collective motion is classified according to criteria 1 and 3. The different combinations of these two criteria may reveal the different types of relationship that can exist.

Goal 3: A set of operators will be defined which indicate the different levels of co-ordination of movement that can occur between the individual members. Movement in relation to each other determines the footprint's evolution; movement in relation to an external reference system determines the centroid motion.

Goal 4: By using the concepts found in the theory of processes and events [14], along with viewing the phenomena at different levels of granularity (i.e., the level of the collective and the level of the individuals), we believe our model will be capable of viewing motion on different levels of granularity.

Goal 5: Collective motion patterns will be defined which arise from interactions within the collective (internal factors) and with something outside the collective (external factors).

Goal 6: Criterion 3 has been introduced to help satisfy this goal.

Goal 7: We would like to see if individual members of a collective could also be represented using footprints, which we refer to as sub-footprints. Each sub-footprint would represent the dimensions of an individual entity. The granularity at which a collective's footprint is described will play an important part, since at too coarse a granularity the sub-footprints will not be viewable. However, it could be argued that it is only at a finer level of granularity that the dimensions of the individual members are important; if this is true, then our proposed approach should be suitable.

Goal 8: We will try to ensure that all of the necessary examples of collective motion have been defined in our completed system.

7. Further Work

The foundations of our proposed classification system have been laid out in this paper, but there are many questions that need to be answered and problems solved before a classification of collective motion can be fully developed.

• We wish to produce a classification of collective motion that can be used across domains; however, further research is needed to see to what extent this can be achieved. For example, can we give a detailed definition of our movement patterns without having to omit information because part of the definition is domain specific?
• Many movement patterns relevant to collectives have already been defined [12,5,21,24]. We will need to make sure that all those that are relevant have been included in our classification of collective motion, as well as new types of collective motion.
• Although there have been attempts to classify movement patterns that are relevant to collective phenomena, there has been no attempt to systematically link the different movement patterns to the associated collectives. In order to achieve our research goal, we will need to complete this task. However, this can only be done once a classification of collective motion has been developed. Once completed, this classification can be integrated with the existing taxonomy of collectives; the process of integration will heavily involve the systematic linking of collectives to their associated movement patterns.
• The analysis of motion into processes and events needs to be developed further to see if it is indeed an adequate and efficient method of handling movement. Once this is established, and we find that it allows motions to be viewed on multiple levels of granularity, the rest of the classification system can be completed; this will involve the development of a full set of transitions and episodes that are not domain specific. The system will be validated via thorough testing.


8. Conclusion

The increase in data recording the movement of objects has led to an increased need to analyse and interpret such data. Particular collective phenomena and movement patterns have been studied to some degree, but most of the research appears to focus on the movement patterns of the individuals within a collective rather than the motion of the collective as a whole. There appears to be little research into developing a general framework that models the movement patterns of a wide range of collective phenomena, or even into identifying that a collective exists in a dataset. We have carried out a review of the existing relevant research. Although this provides us with some useful concepts and methods that could be used in the development of a classification of collective motion, there are many questions that remain unanswered. More importantly, there are vital features of collectives which these existing proposals cannot represent. Therefore, we have produced a list of goals that a classification of collective motion should fulfil. We have also given a brief overview of the classification system that we are currently developing with these goals in mind.

References

[1] W. Ali and B. Moulin. 2D-3D multiagent geosimulation with knowledge-based agents of customers' shopping behaviour in a shopping mall. In David M. Mark and Anthony G. Cohn, editors, Spatial Information Theory: Proceedings of International Conference COSIT 2005, Lecture Notes in Computer Science, pages 445–458, Ellicottville, NY, USA, 2005. Springer.
[2] M. Andersson, J. Gudmundsson, P. Laube, and T. Wolle. Reporting leaders and followers among trajectories of moving point objects. Geoinformatica, 12(4):497–528, 2008.
[3] G. Andrienko and N. Andrienko. Extracting patterns of individual movement behaviour from a massive collection of tracked positions. In Bjoern Gottfried, editor, Workshop on Behaviour Monitoring and Interpretation, Technical Report 42, pages 1–16, Germany, 2007. Technologie-Zentrum Informatik.
[4] N. Andrienko and G. Andrienko. Exploratory Analysis of Spatial and Temporal Data: A Systematic Approach. Springer, Berlin, 2006.
[5] N. Andrienko and G. Andrienko. Designing visual analytics methods for massive collections of movement data. Cartographica, 42(2):117–138, 2007.
[6] M. Batty, J. Desyllas, and E. Duxbury. The discrete dynamics of small-scale carnival events: agent-based models of mobility in carnivals and street parades. Int. J. of Geographical Information Science, 17(7):672–697, 2003.
[7] M. Benkert, J. Gudmundsson, F. Hübner, and T. Wolle. Reporting flock patterns. In Proceedings of the 14th Annual European Symposium, volume 14 of Lecture Notes in Computer Science, pages 660–671. Springer, 2006.
[8] P. Blythe, G. Miller, and P. Todd. Human simulation of adaptive behaviour: Interactive studies of pursuit, evasion, courtship, fighting, and play. In Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, pages 13–22, Cambridge, MA, 1996. MIT Press/Bradford Books.
[9] C. Claramunt and M. Thériault. Toward semantics for modelling spatio-temporal processes within GIS. In M. J. Kraak and M. Molenaar, editors, Advances in GIS Research, pages 47–63. Taylor & Francis, 1996.
[10] C. Claramunt, M. Thériault, and C. Parent. A qualitative representation of evolving spatial entities in two-dimensional topological spaces. In S. Carver, editor, Innovations in GIS V, pages 119–129. Taylor & Francis, 1997.
[11] C. Couch. Collective behaviour: An examination of some stereotypes. Social Problems, 15(3):310–322, 1968.
[12] S. Dodge, R. Weibel, and A. K. Lautenschütz. Towards a taxonomy of movement patterns. Information Visualization, 7:240–252, 2008.
[13] R. Eftimie, G. de Vries, M. A. Lewis, and F. Lutscher. Modeling group formation and activity patterns in self-organising collectives of individuals. Bulletin of Mathematical Biology, 69:1537–1565, 2007.
[14] A. Galton. Experience and history: Processes and their relation to events. Journal of Logic and Computation, 18(3):323–340, 2007.
[15] A. P. Galton. Pareto-optimality of cognitively preferred polygonal hulls for dot patterns. In Christian Freksa, Nora S. Newcombe, Peter Gärdenfors, and Stefan Wölfl, editors, Spatial Cognition VI: Learning, Reasoning and Talking about Space, pages 409–425. Springer, 2008.
[16] J. Gudmundsson, P. Laube, and T. Wolle. Movement patterns in spatio-temporal data. In Encyclopedia of GIS. Springer-Verlag, 2007.
[17] S. Gueron, S. Levin, and D. Rubenstein. The dynamics of herds: From individuals to aggregations. Journal of Theoretical Biology, 182(1):85–98, 1996.
[18] T. Hägerstrand. What about people in regional science? Papers in Regional Science, 24(1):6–21, 1970.
[19] K. Hornsby. Temporal zooming. Transactions in GIS, 5:255–272, 2001.
[20] K. Hornsby and M. Egenhofer. Modelling moving objects over multiple granularities. Annals of Mathematics and Artificial Intelligence, 36:177–194, 2002.
[21] P. Laube, S. Imfeld, and R. Weibel. Discovering relative motion patterns in groups of moving point objects. International Journal of Geographical Information Science, 19(6):639–668, 2005.
[22] H. J. Miller. What about people in geographic information science? Computers, Environment and Urban Systems, 27(5):447–453, 2003.
[23] D. J. Sumpter. The principles of collective animal behaviour. Philosophical Transactions of The Royal Society B, 361:5–22, 2006.
[24] M. Thériault, C. Claramunt, and P. Y. Villeneuve. A spatio-temporal taxonomy for the representation of spatial set behaviours. In M. H. Böhlen, C. S. Jensen, and M. Scholl, editors, Spatio-Temporal Database Management, International Workshop STDBM'99, pages 1–18. Springer, 1999.
[25] C. M. Topaz and A. L. Bertozzi. Swarming patterns in a two-dimensional kinematic model for biological groups. SIAM Journal on Applied Mathematics, 65(1):152–174, 2004.
[26] Z. Wood and A. Galton. A taxonomy of collectives. Under review, 2009.
[27] Z. M. Wood and A. P. Galton. A new classification of collectives. In Carola Eschenbach and Michael Grüninger, editors, Formal Ontology in Information Systems: Proceedings of the Fifth International Conference (FOIS 2008). IOS Press, 2008.


Well-Being and Assisted Living


Behaviour Monitoring and Interpretation – BMI
B. Gottfried and H. Aghajan (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-048-3-159


Implementing Monitoring and Technological Interventions in Smart Homes for People with Dementia: Case Studies

Tim ADLAM a,1, Bruce CAREY-SMITH a, Nina EVANS a, Roger ORPWOOD a, Jennifer BOGER b, Alex MIHAILIDIS b


a Bath Institute of Medical Engineering, Wolfson Centre, Royal United Hospital, Bath, BA1 3NG, UK.
b IATSL, Dept. Occupational Therapy, University of Toronto, 160–500 University Avenue, Toronto, ON, M5G 1V7, Canada.

Abstract. This chapter is about the application of behaviour monitoring technology in the context of a smart home for people with dementia. It is not about the design of technology, but about the application and configuration of existing technology in a specific context: in this case, smart flats for people with dementia in London and Bristol. Technology was installed in a flat in London and evaluated over a year-long period by a resident tenant. He was assessed throughout his tenancy by clinical professionals using standardized outcome measures, and through the analysis of data collected by sensors installed in his flat. It was demonstrated that the technology had a positive impact on his life, improving his sleep in particular. This improvement had a positive effect on many other aspects of his life in the extra care setting where he lived. The Bristol evaluation, also an evaluation of smart home technology embedded in a person's own home, is in progress. This chapter also describes two technologies being developed at the University of Toronto, in Canada. The first is COACH, a system used for the guidance of activities of daily living, and the second is HELPER, a fall detection and personal emergency response system (PERS). These technologies operate autonomously with little or no explicit input from the person using them, making them extremely intuitive and effortless to use. Practical experience and clinical results gained from the latest efficacy trials with COACH are presented and discussed. From the data collected through these trials, it seems that COACH has a positive effect on people's ability to independently complete the activity of handwashing. It is hoped that monitoring technologies such as these will improve the independence and quality of life of people with dementia.

Keywords. dementia, monitoring, behaviour, technology, evaluation, guidelines, implementation, installation, sensors, measurement

1 Corresponding Author: Tim Adlam, [email protected]


Introduction

The monitoring of the behaviour of people with dementia has been implemented for several primary reasons.

Monitoring for safety: The continuous monitoring of behaviour in a person's home can enable changes in behaviour to be detected that might indicate that a dangerous situation has occurred, such as the person having fallen or become suddenly ill.

Monitoring for long term trend prediction: Behaviour monitoring can also be used to detect long term trends in behaviour. It is thought that such trends can be used to detect potential health problems or changes in cognitive status, for example through the measurement of specific parameters such as walking speed, or of more abstract parameters such as activity levels and sleep patterns.

Monitoring for control and individualisation: Behaviour monitoring can be used to determine lifestyle-related parameters that can then be used to customise the configuration of intelligent assistive systems. Customisation can either be manual, based upon human interpretation of the monitoring data, or automatic, based upon data interpretation by an artificial intelligence.

At this time, behaviour monitoring has been used widely in telecare systems for the detection of suddenly occurring problems. Networked sensors and communication systems installed in a person's home are linked to a call centre; when a problem is detected, a call to the centre is generated and a person is sent to the home to deal with the problem. Some systems, such as those described in this chapter [1,3], are taking a more intelligent approach to this application and are using artificially intelligent software agents to make judgements about the nature of the problem detected.
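Long-term trend prediction of the kind described above can be illustrated with a simple sketch. This is our own example, not a system described in this chapter: a least-squares slope is fitted over a sliding window of daily measurements (such as hours slept per night), and a sustained decline is flagged. The window length and threshold are illustrative assumptions, not clinically derived values.

```python
def trend_slope(values):
    """Least-squares slope of a series of daily measurements
    (e.g. hours slept per night, or daily activity counts)."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def flag_decline(daily, window=14, threshold=-0.05):
    """Flag a long-term decline when the slope over the most recent
    `window` days falls below `threshold` units per day.
    Both parameter values are illustrative assumptions."""
    if len(daily) < window:
        return False
    return trend_slope(daily[-window:]) < threshold

# Two weeks of stable sleep followed by two weeks of gradual decline:
stable = [7.0 + 0.1 * (i % 2) for i in range(14)]
declining = [7.0 - 0.1 * i for i in range(14)]
assert not flag_decline(stable)
assert flag_decline(stable + declining)
```

In practice such a flag would prompt human review rather than an automatic intervention, since day-to-day variation in behaviour is large and the underlying cause needs clinical interpretation.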


1. Technology in Bristol and Deptford

This section describes the technology installed in the Bristol and Deptford flats. Both are part of Extra Care developments that provide a home and care for older people. The flats are similar in the facilities they provide, and the infrastructure upon which the smart systems are built is also similar. The Deptford flat is hard wired, with some limited proprietary wireless systems for voice messaging. The Bristol flat employs a hybrid wireless and wired system but is mostly wireless, using off-the-shelf sensors and devices to deliver monitoring and interventional services to the user. A range of assistive devices [4] is installed in both flats that aim to support people with dementia to be safer and more independent, and to support care staff in their duties.

1.1. Infrastructure

The Bristol and Deptford flats have been built on the KNX building automation system. This is an off-the-shelf, standardised and highly interoperable system of devices for networking, sensing and controlling. Devices are available using wired and wireless communications. KNX devices are also available to link the system to remote communications, including webservers, gateways into TCP/IP networks, and facilities for information distribution and control using GSM and the GSM Short Message Service (SMS).


Sensor                                  Location     Source
Passive infrared (PIR) motion sensors   All rooms    KNX supplier
Bed occupancy sensor                    Bedroom      BIME
Smoke sensor (modified)                 Kitchen      Hardware store
Door opening sensor                     Front door   KNX device supplier

Table 1. A table of sensors installed in the Deptford and Bristol flats

KNX is based upon the European Installation Bus (EIB), the European Home System (EHS) and BatiBus. It is supported by many major device manufacturers, who offer a wide range of sensors, actuators and controllers suitable for use in commercial and domestic applications. Interoperability of devices from different manufacturers is excellent due to strict management of the standard by the Konnex Association.

The Deptford flat used wired KNX, which at the time of installation was known as EIB or 'European Installation Bus'2. Control logic was provided by an ABB AB1/S logic controller installed in the flat on a DIN rail. The AB1/S is a programmable controller with up to a total of 200 logic elements available to the programmer across all programmes. The PC-based programming software utilizes a crude graphical interface to link and configure logical elements.

The Bristol flat used a hybrid of the wired and wireless versions of the KNX standard, as well as some off-the-shelf proprietary wireless devices operating through a KNX gateway. Since the time of installation, the range of wireless KNX devices has been extended and it is no longer necessary to use proprietary devices in this way.
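The kind of logic element programmed into such a controller can be sketched as a simple rule. The following is a hypothetical example of ours (the rule, device names and times are all assumptions, not taken from the installed configuration; the real logic runs on KNX hardware, not in Python): if the bed occupancy sensor reports the tenant leaving the bed during the night, the bedroom light is switched on, and it is switched off again when they return.

```python
from datetime import time

def night_support_rule(bed_occupied: bool, was_occupied: bool, now: time) -> dict:
    """One logic element of the kind a programmable controller might run:
    switch the bedroom light on when the tenant leaves the bed at night,
    and off again when they return. All names are illustrative, not
    actual KNX group addresses."""
    night = now >= time(22, 0) or now < time(7, 0)
    actions = {}
    if night and was_occupied and not bed_occupied:
        actions["bedroom_light"] = "on"   # bed exit detected at night
    elif night and bed_occupied and not was_occupied:
        actions["bedroom_light"] = "off"  # tenant back in bed
    return actions

assert night_support_rule(False, True, time(3, 0)) == {"bedroom_light": "on"}
assert night_support_rule(True, False, time(3, 30)) == {"bedroom_light": "off"}
assert night_support_rule(False, True, time(14, 0)) == {}
```

Even a modest installation combines many such elements, which is why the 200-element budget of a controller like the AB1/S matters to the installer.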


1.2. Sensors and Actuators

The Bristol and Deptford flats were similarly equipped with a range of sensors and actuators. The Deptford devices were installed in the flat in the six months prior to the beginning of the tenancy in April 2006. The actuators and sensors were sourced from KNX suppliers, electrical wholesalers, hardware stores and BIME. The passive infrared motion sensors and the door opening switch were from the range available off the shelf from KNX manufacturers such as ABB, Gira and Siemens. The bed occupancy sensor was designed and built by BIME, and the smoke sensor was bought from a hardware store and adapted for use with a KNX binary input interface. As with the sensors, the actuators used were a mix of off-the-shelf KNX devices and devices designed and built by BIME. Both flats used standard KNX lighting actuators, and KNX binary output interfaces to control BIME devices such as the cooker isolator, the smart taps (Deptford only), and the warden call interface. The voice messaging was controlled by a KNX binary output interface linked to a BIME wireless controller and a network of message boxes in each of the Deptford rooms, and by a standard off-the-shelf KNX voice messaging system hard wired to speakers in each of the rooms in the Bristol flat.

2 The EIB standard was merged with the European Home Automation System (EHS) and BatiBus standards to form KNX: a new single European standard for building automation networks. KNX has since been granted an international ISO standard. See http://www.knx.com for further information.

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,


T. Adlam et al. / Implementing Monitoring and Technological Interventions in Smart Homes

Actuator                                            Location     Source
Lighting control                                    All rooms    KNX supplier
Cooker isolation contactor (and KNX binary output)  Kitchen      Electrical wholesaler and KNX supplier
Voice messaging (Deptford)                          All rooms    BIME device
Voice messaging (Bristol)                           All rooms    KNX supplier
Smart taps and KNX binary input (Deptford only)     Kitchen      BIME device and KNX supplier
Door opening sensor                                 Front door   KNX supplier

Table 2. Actuators installed in the Deptford and Bristol flats


Passive infrared motion sensors (PIR): The Deptford PIRs were standard KNX sensors bought off the shelf from a UK KNX supplier. The specific devices purchased were designed to work as presence sensors, triggering after several motion detections have registered and resetting after a short period of inactivity. These sensors were selected because they were ceiling mounted with a 360° field of view. It would have been preferable to use wall mounted motion sensors designed for security applications; however, the location of the KNX wiring installed at the time of build made this impractical. As a result, the presence sensors did sometimes under-report movements through the lounge if the person was moving quickly. This was not thought to be a substantial problem in practice as the tenant did not walk quickly. The sensors were used for detecting the entry and exit of the tenant from the room. For this reason they needed to be sited carefully, to prevent a sensor 'seeing' into more than one room and falsely triggering. Again, the incorrect positioning of the KNX wiring prevented such careful siting, and there was some overlap of the lounge sensors into the kitchen and the bedroom.

Bed occupancy sensor: A bed occupancy sensor was installed underneath the bed legs. The sensor weighs the bed and can detect whether a person gets in or out of bed; a pressure mat by the bed can detect only that a person is standing by the bed, not whether they are getting in or out. The sensor was linked to the KNX bus using a binary input device.

Cooker minder: A cooker monitoring system was installed that used a smoke detector to trigger a voice warning message if smoke was detected. If the smoke did not clear shortly after the message was played, the system isolated the cooker from its electrical power supply.
Smart taps: The kitchen was fitted with smart taps that could be used just like ordinary taps using their handwheels, but which also incorporated a timer that turned off the water after a preset time. After the water is turned off, the taps remain usable and are not left in the ambiguous situation of being turned on but with no water coming out.

Voice messaging: The messaging system was designed and built at BIME rather than being an off-the-shelf system. A message device was installed in each room and wirelessly linked to a controller installed with the bus system devices in a cupboard. A code sent from the KNX logic controller determined which message was sent to which room. A different, off-the-shelf KNX system was installed in the Bristol flat, which used a single message generating device and distributed hard wired speakers. Codes were delivered to the device using multiple binary output channels.


Front door sensor: A magnetically actuated reed switch was fitted to the front door to detect door opening. It was interfaced with the bus using a binary input channel.

Warden call interface: It was necessary to be able to alert the staff if a problem was detected that could not be resolved with a local intervention by the system. For this purpose a device was integrated into the warden call system that could generate a call when needed. The call system was linked to a DECT3 phone handset that was carried at all times by the duty manager in the building. The Deptford flat had a single warden call channel available, so the call to the duty manager's DECT phone did not distinguish between activation by the system and activation by the occupant pressing a button. The Bristol flat had two warden call channels, one of which was dedicated to the monitoring system, enabling the duty manager to know whether the occupant or the system had initiated a call.

Presentation of monitoring data: The data gathered by the monitoring system needs to be presented to the care staff and duty manager in a way that is understandable and useful to them. An example of raw sensor data is shown in Figure 1. Though rich and detailed, this data is not useful in a care context without further interpretation and reformatting. Care staff do not need to know how many times a light switch was operated; much more useful parameters are whether the occupant is asleep or awake, or whether the occupant has been inactive for an extended period of time.

25/06/2006 11:00:30 GDIW 0B19 6 3 0x01
25/06/2006 11:00:30 GDIW 0B1D 6 3 0x00
25/06/2006 11:01:09 GDIW 0C0C 6 3 0x01
25/06/2006 11:01:09 GDIW 0B19 6 3 0x01

Figure 1. A sample of raw sensor data from the Deptford evaluation. Each line is an event recorded in the log that might be a motion sensor trigger or a light switch being operated. The events are automatically time and date stamped.
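Log lines in this format can be parsed mechanically before any interpretation. The sketch below is an assumption-laden illustration: only the timestamp, the hex address field and the final value are interpreted, and the meaning of the 'GDIW' code and the two middle numeric fields is left opaque, since the chapter does not define them.

```python
# Minimal parser for log lines in the Figure 1 format. Field semantics beyond
# the timestamp, address and value are assumed/unknown (illustration only).
from datetime import datetime
from typing import NamedTuple

class SensorEvent(NamedTuple):
    timestamp: datetime
    code: str      # e.g. 'GDIW'; meaning not defined in the chapter
    address: str   # hex group address, e.g. '0B19'
    value: int     # final field, e.g. 0x01

def parse_event(line: str) -> SensorEvent:
    date, time_, code, address, _f1, _f2, value = line.split()
    return SensorEvent(
        timestamp=datetime.strptime(f"{date} {time_}", "%d/%m/%Y %H:%M:%S"),
        code=code,
        address=address,
        value=int(value, 16),
    )

event = parse_event("25/06/2006 11:00:30 GDIW 0B19 6 3 0x01")
```

Parsed events of this kind are the raw material for the interpretation and summarisation steps discussed in the rest of this section.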

The Bristol and Deptford flats used slightly different approaches to data presentation. Both used alerts delivered to care staff, based on real-time monitoring data, through several different media.

Automatic staff calls

The staff call facility incorporated into the smart flat was used to alert carers to situations which were likely to need their immediate attention. In this way they were able to target their attention at appropriate times. Staff calls were raised in the following three scenarios:

Unsettled night time behaviour: Bed occupancy and movement within the flat were monitored and analysed in real time, and staff were alerted if unsettled behaviour was detected at night for a prolonged period. This was triggered on a number of occasions during the evaluation period and allowed staff to provide reassurance and reinforce daily routines. The tenant also had a tendency to become anxious when she had an appointment the following day: she would wake early in the morning and, being disorientated about the time of day, would proceed to get ready.

3 Digital Enhanced Cordless Telecommunications: a standard for cordless portable phones, typically used for domestic and corporate purposes, that permits the sending of voice and data.


The automatic staff calls alerted carers to this activity, and upon visiting the flat they could reassure the tenant of the time and encourage her back to bed.

Wandering: The front door contact was monitored during the night time 'at risk' period and staff were alerted if the door was opened. Although the smart flat was located in a sheltered living environment, the family were anxious that there would be safety implications if the tenant left her door open late at night or early in the morning. During the evaluation period the tenant opened the door many times during the night time period. She was sometimes found outside her flat, particularly if she was anxious about an appointment; however, the majority of cases appeared to be due to confusion about the layout of the flat. In either case, the capability of the monitoring system to call staff when the door was opened helped staff to provide reassurance and support to the tenant when it was needed.

Difficulties in cooking: The kitchen smoke detector was monitored and analysed in real time and staff were alerted if persistent smoke was detected. Cooking can be an important activity for people, preserving function and skills, providing patterns of daily living which help anchor them in time and space, and, for some, helping to maintain identity and a sense of value. The tenant had a history of cooking at home, and continued to do some cooking once established in the sheltered living environment. Most of this occurred with one-to-one support from a carer; however, she did attempt a limited amount of cooking on her own. On five occasions during the evaluation period smoke was detected in the kitchen by the monitoring equipment, and in four of these it persisted long enough for staff to be called. In most care environments safety is considered of primary importance.
The ability of the smart flat to monitor and respond to signs of danger in the kitchen resulted in carers being more at ease in leaving the tenant to cook for herself, ultimately allowing greater independence.

Email updates

The computer within the Hillside flat was equipped with a mail server, which was used to generate emails in response to specific events within the flat. Primarily this was used to notify engineering support staff when faults occurred or when interventions such as verbal messages or staff calls were made. However, the system was also set up to notify the care manager of the time and reason for each automated staff call. As these calls were predominantly made during night time hours, this enabled the care manager to gain a picture of the previous night's events upon arriving at work in the morning. This information complemented the written logs kept by the night staff.

Towards the end of the evaluation, the system was modified to allow the generation of more flexible summary emails. The intention was that these emails would be sent to carers and family if unsettled behaviour was detected at night, and would contain a summary of information deemed relevant to the recipient. The information would include the bedtime and rise time of the tenant and any periods of sustained activity at night. If messages were played or staff were called, the times of these interventions would be included. The impetus for creating these summary emails was that the key family member was, from time to time, requesting feedback from the monitoring. This was predominantly to help them confirm information received verbally from the tenant about her night time experience, and it helped paint a bigger picture of events over previous nights in terms of restlessness and activity. This information had to be manually extracted, and it was felt that automating the process would allow the family to gain greater benefit from the monitoring.

However, the automated provision of information to interested parties who weren't involved in the day-to-day, hands-on care of the tenant raised some important questions. The overarching conclusion was that any sharing of behavioural monitoring data must be done in a way that supports, rather than undermines, the existing relationships between the people who have a strong interest in the wellbeing of the person being cared for and those who are carrying out the person's day-to-day care.

The technology installed in the Bristol and Deptford flats was mostly off-the-shelf. It did not have a high degree of intelligence, and employed simple conditional algorithms to determine its responses to situations detected using its array of sensors. However, even with these modest technologies, a strong positive impact on the quality of life of the occupant was demonstrated using sensor based and personal evaluation of the technology. Some of the results of the use of this technology are presented in the results section of this chapter.

Long term statistical analysis

Processing of data was undertaken by the research staff to extract behavioural indicators of tenant wellbeing. Examples included hours in bed, the number and duration of out-of-bed episodes, daytime room transitions, and the number of, and reasons for, staff calls. Graphs and statistics were presented to staff and family at reviews one month after the tenancy began and then every three months until the end of the evaluation. At the initial review the data was used to inform which interventions were to be enabled. At subsequent reviews the monitoring data was useful in highlighting difficulties that the tenant and/or staff may have been having and in prompting discussion about how the technology might be modified to better support all parties.
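Indicators such as hours in bed and the number of out-of-bed episodes can be derived directly from timestamped bed-occupancy events. The sketch below illustrates one plausible derivation; the event format is an assumption for illustration, not the project's actual log schema.

```python
# Sketch: derive simple long-term indicators from bed-occupancy events.
# Event format (timestamp, in_bed) is an illustrative assumption.
from datetime import datetime

def bed_statistics(events):
    """events: time-ordered list of (datetime, bool) pairs,
    True = got into bed, False = got out of bed.
    Returns (hours_in_bed, number_of_out_of_bed_transitions)."""
    hours_in_bed = 0.0
    exits = 0
    last_in = None
    for ts, in_bed in events:
        if in_bed:
            last_in = ts
        elif last_in is not None:
            hours_in_bed += (ts - last_in).total_seconds() / 3600.0
            exits += 1
            last_in = None
    return hours_in_bed, exits
```

A real analysis would also need to distinguish the final morning rise from night-time episodes and report episode durations, but the principle — reducing a raw event stream to a handful of care-relevant numbers — is the same.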
Each review led to refinements of the system to better cater for the needs of the specific tenant and to accommodate changes in her condition. To be valuable to staff and family, the long term recorded data must provide information which is directly relevant to the tenant's care. The challenge is in identifying which parameters are most useful as indicators of the tenant's wellbeing and then presenting them in such a way as to allow clear interpretation and comparison. The adoption of monitoring data as a useful tool in developing a tenant's care plan is most likely to be successful when a clear link can be made between the behavioural trends shown in the data and existing measures of tenant wellbeing made through staff observation.

2. Technology in Toronto

This section presents the COACH and HELPER systems. Both systems are autonomous and are aimed at supporting different aspects of in-home living with little or no effort from the people using the technology. The COACH system guides the completion of activities of daily living, while the HELPER system detects acute and long-term changes in health. Both systems rely on artificially intelligent sensing, planning, and response.

2.1. COACH (Cognitive Orthosis for Assisting aCtivities in the Home)

The confusion and impaired memory functioning that accompany dementia make it difficult for even mildly affected people to complete activities of daily living (ADL) [5].


These difficulties in ADL completion become more challenging as the condition worsens. It is not uncommon for a person with dementia to become disoriented part way through an ADL and/or be unable to remember what steps are required to complete the activity. The current solution is to have a caregiver supervise the person with dementia as they go about their day, something that can be difficult and humiliating for everyone involved. The COACH system has been developed to help people complete ADL independently of a human caregiver, easing caregiver burden and enabling greater independence for the person with dementia. COACH monitors a person with dementia as they complete an ADL and provides prompts if the person appears to be having trouble with a step in the activity. While the goal is to support activity completion, the intent is for the person with dementia to remain in control of the activity and to engage with their surroundings as much as possible. As such, prompts are only given if and when the person is having trouble with a step (e.g., confusion about the order of steps, missing a step, or remaining inactive for a period of time).

People with dementia are as unique as cognitively aware people; each individual has his or her own needs and abilities. Moreover, the effects of dementia are usually dynamic, causing changes in a person's comprehension and execution capabilities that vary from day to day and generally degrade over longer periods of time. For prompts to be effective, they must not only be accurate, clear, and delivered in a timely manner (not too soon or too late), but must also be sensitive to a person's capabilities and to changes in those capabilities, providing greater or less support as necessary. This requires the system to take into account the context of its environment, enabling COACH to react appropriately to the individual needs of the person using it.
Aspects that need to be considered include: Where is the person in the activity? What is his or her preferred ordering of ADL steps? How responsive is the person to different kinds of prompts?


2.1.1. Infrastructure, Sensors and Actuators

As it is intended for use in the home, the design criteria for COACH included ensuring that the system was affordable and simple to install. This resulted in the intentional development of a system with as few sensors and actuators as possible. The current system (version three of COACH) consists of a computer, an overhead camera, a flat-screen monitor, and speakers. An example setup of COACH can be seen in Figure 5. To test the efficacy of the system, it was configured to guide the ADL of handwashing, as this activity must be completed several times every day and steps are often missed.

Overhead camera: A video camera is placed over the area of interest, in this case the sink. Images from the camera are relayed to the computer, which analyses each incoming frame for hand, towel, and soap positions before deleting it. At the moment the camera is the only sensor in COACH, although sensors such as a microphone will likely be added in the next version to enable increased perception (e.g., "hearing" when the water is on or off) and interaction with the person using the technology (e.g., basic speech recognition).

Computer: A single computer is all that is needed to analyse incoming images (sensing), make decisions about how COACH should react (planning), and issue prompts to the person using the system (prompting). The particulars of how this is accomplished are discussed in more detail below.


Flat-screen monitor and speakers: The monitor and speakers are used to play audio and audio/video prompts that guide the person using the system to the next appropriate step in the activity.

Using the hardware listed above, the system guides a person through an ADL, in this case handwashing. Handwashing can be broken into six steps, as shown in Figure 6; any pathway from the Start to the Finish node in Figure 6 represents a successful instance of handwashing. Using this representation, the system interprets images from the camera to determine where in the activity the person is and whether or not s/he needs assistance (i.e., if s/he has missed a step, is completing steps in the incorrect order, is leaving the activity before it is completed, or has been inactive for a period of time). If assistance is required, the system plays an audio or audio/visual prompt to guide the person to the next appropriate step. Prompts vary in specificity from minimal (e.g., a simple audio prompt asking the person to "Turn on the water") to specific (e.g., "Tom, pull up on the silver lever in front of you to turn the water on," accompanied by a video demonstrating how to turn the water on). Should the person remain unresponsive to the guidance given by COACH, the system will summon a human caregiver to intervene (e.g., via a pager). As such, COACH is not intended to replace the caregiver, but rather to assure the caregiver that s/he will be summoned if difficulties arise that prompting alone cannot handle.

A novel feature of COACH is the dynamic history that the system maintains about each user. Each individual's short- and long-term history is learned and adapted autonomously by COACH as it interacts with the person, and includes his/her level of dementia, responsiveness, and the number of times he/she has been prompted that day. By maintaining and automatically adjusting this history, the system can detect and respond appropriately to changes in daily and overall abilities. In an effort to keep the person using the device as involved and in control of the activity as possible, prompts are given at the minimum level that elicits compliance.
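The minimum-specificity policy just described can be sketched as a small escalation rule: start at the least specific prompt, escalate only while the person does not comply, and summon the caregiver once all levels are exhausted. The level names and the de-escalation-on-success detail below are illustrative assumptions, not COACH's actual (more sophisticated) decision machinery.

```python
# Illustrative sketch of minimum-level prompt escalation; not COACH's
# actual planner. Level names are invented for illustration.
PROMPT_LEVELS = ["minimal audio", "detailed audio", "audio with video demo"]

def next_action(level, complied):
    """Given the level of the last prompt and whether the person complied,
    return (action, level_for_next_prompt)."""
    if complied:
        # Success: drift back toward less specific prompting next time.
        return "continue activity", max(level - 1, 0)
    if level + 1 < len(PROMPT_LEVELS):
        return f"prompt: {PROMPT_LEVELS[level + 1]}", level + 1
    return "summon caregiver", level
```

The per-user history described above would, in a fuller sketch, set the starting level for each person and adjust it as their responsiveness changes over days and months.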


2.2. Health Evaluation and Logging Personal Emergency Response (HELPER) System

Falls, heart attacks, and other acute adverse health events become more prevalent with age. When an adverse event occurs, the affected person is often unable to summon help. This can result in many hours, or even days, passing before the affected person receives the medical attention they require, which in turn leads to higher rates of morbidity and mortality. To provide support and security, many people install a personal emergency response system (PERS). A PERS generally consists of a push-button worn on a bracelet or necklace and a speaker-phone base-station. When the button is pressed, the base-station places a call to a call centre. The PERS client then speaks with a trained responder, who identifies the situation and, if required, dispatches the appropriate type of assistance.

While PERSs can be helpful, many owners do not wear the push-button because they forget to do so and/or feel stigmatised. Thus many owners do not have access to the system when an adverse event occurs, cannot activate the system because of the adverse event itself (e.g., unconsciousness, a broken wrist, etc.), or are afraid to activate the system because they perceive it as likely to result in long-term care placement. Additionally, this type of PERS is not suitable for people with dementia, as it requires a reliable, explicit input from the owner (i.e., a button press), which is an unrealistic expectation of this population. Building on the experience the research team has gained through clinical trials, a PERS has been built to operate more automatically and intuitively than conventional
PERSs. The prototype, called HELPER, can autonomously detect adverse events and, using speech recognition software, holds a dialogue with the occupant to determine what help the occupant needs. Depicted in Figure 2, HELPER consists of one or more ceiling mounted units that communicate with a central control unit. Each ceiling mounted unit uses computer vision and artificial intelligence to monitor a room and detect when an adverse event, in particular a fall, has occurred. In the case of an adverse event, the ceiling mounted unit determines the type of assistance the occupant needs through a series of simple yes/no questions. If the system receives no response, or cannot understand the occupant's response, it connects the occupant with a PERS call centre responder. The desired system response is relayed to the central control unit, which dispatches the appropriate type of assistance. Each ceiling mounted unit has other safety features, such as a smoke alarm that can automatically summon the fire department if it is not reset within a certain amount of time. In addition to contacting responders, the central control unit is responsible for coordinating the ceiling mounted units in an installation, keeping track of the occupant's whereabouts, and building models of the occupant's long-term living habits, which could be used to detect more gradual changes in health. Pilot trials with the HELPER system are set to begin in late 2009.
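The yes/no dialogue with fallback to a human responder can be sketched as a small decision procedure. The question wording, the set of outcomes, and the `ask` callback below are all illustrative assumptions; the HELPER prototype's actual dialogue is not published in this chapter.

```python
# Illustrative sketch of a fall-response yes/no dialogue with human fallback.
# Question text and outcomes are invented; not HELPER's actual dialogue.
def fall_dialogue(ask):
    """`ask(question)` returns 'yes', 'no', or None when the speech
    recogniser gets no response or cannot understand the answer."""
    answer = ask("Have you fallen?")
    if answer is None:
        return "connect to call centre responder"
    if answer == "no":
        return "no action"
    answer = ask("Do you need an ambulance?")
    if answer == "yes":
        return "dispatch ambulance"
    if answer == "no":
        return "notify caregiver"
    return "connect to call centre responder"
```

The key design point the sketch preserves is that silence or an unintelligible reply never ends the interaction silently: it always escalates to a human responder.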

Figure 2. Schematic of an occupant verbally interacting with the HELPER system to procure assistance following an adverse event.

While this technology has been developed with independent older adults living at home alone in mind, it is not difficult to envision how the system could be adapted to monitor a person with dementia, detecting long-term behaviour patterns and notifying a caregiver if an adverse event occurs.

3. Practical Guidelines for Installation and Support of Monitoring Systems

Installing complex systems in the homes of people with cognitive disabilities needs special care and awareness because of the reduced ability of the occupant to adapt to the
presence of an unfamiliar person, for example an electrical contractor, in what would normally be a mostly private space. This section describes some practical guidelines that installers may find helpful. They are based on the experiences of the authors when conducting user evaluations of monitoring and interventional systems in the homes of people with dementia.

3.1. Guidelines on installation and support from the work in Bristol and Deptford


Survey the site prior to installation: The installer or another technically competent person should survey the site prior to installation to make sure that no unexpected problems will arise, such as the unavailability of a power supply or gas pipes that obstruct the installation of a device. The survey should be thorough and, where necessary, include measurements so that cable runs can be calculated in advance. The survey should enable the installer to plan the installation before arriving at the person's home with equipment and tools, reducing the impact of the installation by making it as short and as efficient as possible.

Maintain good communication with the occupant of the evaluation site, another person whom the occupant trusts, and care staff: The Deptford and Bristol evaluations were carried out in an extra care context where staff were available 24 hours a day. In both cases a family member was nominated as a 'primary carer' who would hold the occupant's best interests uppermost. In the Deptford evaluation, contact with the carer was frequent and easy. The occupant's daughter worked, but was happy to be contacted by email or phone, and made time to attend meetings and assessments when she was able to do so. The researchers became well known to the care staff even before the evaluation began, as they installed much of the technology in the apartment themselves. After the beginning of the evaluation, there were some technical issues to resolve that required the attendance of the researchers in addition to the bimonthly review meetings held at the site. This frequent contact resulted in a good working relationship between the researchers, the care staff and the primary carer, which contributed significantly to the durability of the evaluation when problems were encountered.

By contrast, the Bristol evaluation presented different communication challenges.
The system was installed by a contracted KNX installer who efficiently and cost effectively completed the installation in a few days. There was no extended installation period prior to the evaluation, so the researchers needed to work harder to build a relationship with the care staff. The main communication with the family was by email. The resulting lack of face-to-face contact meant that at times it was especially difficult to know the family's feelings about situations that were occurring. Delays that occurred in the evaluation might have been more swiftly resolved had more direct communication been possible.

Send two people and make sure one remains in the house: Always send at least two installers to install a system, and make sure that one of the installers remains in the property until the installation is completed. This is especially important if the installation will be prolonged or complex. This guideline follows experience on the ENABLE project, where two installers visited the home of a person with dementia to install a cooker monitor. While in the middle of a complex installation, with tools
and materials distributed around the occupant's kitchen, both installers simultaneously left the house to collect some additional materials from their van. They returned to find that the occupant had forgotten who they were and was not willing to allow them access to her house. Fortunately, in this case the woman's son-in-law happened to visit at that time and persuaded her to allow the installers back into her house.

Bring spares for key components, especially if they are vulnerable to damage in transit: When installing a cooker monitor in Lithuania, the ENABLE installers found that their gas isolation valve had been damaged in transit, even though it had been carefully packed in a rigid case. Because the valve was damaged and no spare had been brought, the installation had to be abandoned and then completed later by a local gas installer.

Installation standards may be different in different regions: At the installation mentioned in the previous paragraph, the installer had to go out to modify a component to make it fit the cooker in the flat. When he returned to the small kitchen, the gas fitter who was installing the valve had removed the gas supply stop valve, stuffed a piece of rag into the end of the open pipe, and was trying to test the operation of the electric stop valve by inserting its wires into a mains power socket. The gas fitter had been selected for the job by a very senior person in the state energy company.


Responding to technical faults: When evaluating prototypes of complex technology that is still being developed, it is likely that faults and problems will emerge during the evaluation. People with dementia are less able to cope with fault conditions than people without dementia because of their impaired ability to reason, plan and adapt.

Working with care staff: Building an effective and robust working relationship with care staff in an evaluation of monitoring equipment is an important and challenging task that requires constant effort. Maintaining contact, communication and cooperation with the care manager is key to the success of the evaluation, so that access to staff and facilities is easily obtainable, and so that high level feedback can be gathered about the impact of the installation on the working of the care facility and its integration into, for example, care plans and daytime activities. The care staff should also be engaged with the project through good communication and involvement. They can provide key insight into the effectiveness of a monitoring system, and are users of the system just as much as the occupant of the accommodation.

Implement and test new logic in-situ but off-line: When working with people with dementia, it will be necessary to revise the control logic in use in their flat as their dementia progresses and their abilities change. In the Deptford flat, and initially in the Bristol installation, revised logic was tested in the laboratory as much as possible and then uploaded to the flat controller for immediate use. Inevitably there were some problems when the new control logic was faced with the complexity of a real person behaving unpredictably, as real people do. Hasty changes had to be made to solve problems as they arose. In the later stages of the Bristol project, the new control logic was tested in the laboratory, then uploaded to the controller and enabled, but not connected to the system actuators until it had been run for a week or so with only the sensors active. This enabled the
Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

T. Adlam et al. / Implementing Monitoring and Technological Interventions in Smart Homes

171

programmer to check the sensor input and control outputs with real live data, but without the risk of causing unwanted interventions in the occupant’s flat.
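This "sensors live, actuators disconnected" testing approach can be sketched as a shadow mode in a control loop. The sketch below is illustrative only: the class, sensor names and rule are hypothetical, not the actual Bristol controller.

```python
from datetime import datetime


class FlatController:
    """Minimal sketch of a control loop with a 'shadow mode': new logic
    sees live sensor data, but its decisions are only logged, never
    actuated, until it has proved itself. All names are illustrative."""

    def __init__(self, live: bool = False):
        self.live = live          # False = shadow mode
        self.decision_log = []    # (time, event, action) tuples for review

    def on_sensor_event(self, event: dict) -> None:
        action = self.decide(event)
        if action is None:
            return
        self.decision_log.append((datetime.now(), event, action))
        if self.live:
            self.actuate(action)  # reached only after shadow testing

    def decide(self, event: dict):
        # Hypothetical rule: call staff if the front door opens at night.
        hour = event.get("hour", 12)
        if event.get("sensor") == "front_door" and event.get("state") == "open":
            if hour >= 23 or hour < 7:
                return "staff_call"
        return None

    def actuate(self, action: str) -> None:
        print(f"actuating: {action}")


controller = FlatController(live=False)
controller.on_sensor_event({"sensor": "front_door", "state": "open", "hour": 2})
assert len(controller.decision_log) == 1  # decision recorded, nothing actuated
```

Running in shadow mode for a week or so lets the programmer compare the decision log against what actually happened in the flat before setting `live=True`.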


Guidelines around automatic staff calls and alerts: Feedback from carers indicated that the automated staff calls were useful in supporting them in providing appropriate care. Knowing that the smart flat would call them if a potentially risky situation arose, or if behaviour indicated that the tenant was unsettled, allowed them to target their care at the times when it was needed most and allowed the resident greater independence in her day-to-day life. However, several key issues were highlighted through the use of an automatic staff call system based on the monitoring and interpretation of sensor data.

1. Trust in the system by the carers is key to its success. Carers need to know that they will be notified consistently if risky or unsettled behaviour is detected. However, they must also be relatively sure that if they are called it is for good reason. False alarms erode confidence and can lead to calls being ignored. This will be unacceptable in the long term and will undermine the technology's acceptance. False alarms can be due to inadequate algorithms which incorrectly diagnose situations, or to unreliable sensor data. In either case, thorough testing is a necessary prerequisite before activating the system.

2. Carers may also ignore alerts if the automated system has been set up incorrectly. In the Hillside flat, the period over which the door sensors would trigger a staff call was set from eleven o'clock at night to seven o'clock in the morning. However, the tenant was an early riser and frequently got out of bed between six and seven o'clock. Due to her disorientation she would sometimes open the front door. Data logs showed that staff didn't always respond immediately to calls during this period. This was due, in some part, to it being a busy time for staff.
However, it may also have been influenced by the knowledge that this was not an inappropriate time for the tenant to be up, and that there was less risk to her if she did leave her flat during this period. Adjusting the settings so that opening the front door during this period did not trigger a staff call may have resulted in a more consistent staff response.

3. Knowledge of the reason for a call can assist the carer in knowing how best to respond. For staff calls initiated by the tenant, the carer can use an intercom system to find out immediately from the tenant what assistance they need. Where the smart flat has initiated the call, the tenant is unlikely to know why the call has been made, and interrogating them is unlikely to be beneficial. The Hillside installation did not include the facility to provide additional information to carers remotely once a call had been made; this would be a useful addition to the system.

4. The effect of staff calls on the tenant must be taken into consideration. Calls must be able to be made without alarming the tenant. In the Hillside installation, calls initiated by the smart flat activated the staff call unit located in the hallway of the flat. This resulted in an audible alarm sounding in the flat, and continuing to sound until a staff member deactivated the call. The alarm caused the tenant distress, particularly as she was always conscious of disturbing others. If the tenant initiates a staff call herself then there must be some acknowledgement that this has been registered. However, where calls are automatically generated, thought


should be given to whether the tenant should be made aware of the call and, if so, how this can be done without causing distress.

3.2. Practical guidelines from the work in Toronto

Other lessons were learned whilst conducting the work in Toronto described above. Three iterative versions of the COACH system have been installed and tested in three different long-term care facilities and evaluated with more than 30 people with moderate-to-severe dementia, established by an MMSE [2] score of 20 or less. While the following points reflect insights gained through daily, supervised pilot trials with a new system, many are applicable to permanent installations as well.

Get to Know the Facility: As early as possible, the research team should get in touch with the director or other management at the facility of interest. This allows familiarisation with the facility and enables the facility and researchers to cooperate in determining interest in the system, appropriate (and, for prototype devices, dedicated) areas for the system, access procedures, and other matters of interest.


Obtain a Letter of Support: Before attempting to install the technology in the facility and/or applying for ethics approval for the study, obtain a letter of support from the director (or equivalent) of the facility. The letter should indicate the facility's understanding of the purpose of the system, acknowledge their interest in having it installed, and (if applicable) their intention to participate in the research study.

Get to Know Your Participants: After obtaining appropriate ethics permissions, take the time to get to know the people who will be participating in the research study before it begins. Understanding the personal preferences and needs of the participants allows the research team to anticipate and avoid many difficulties before they arise. Not surprisingly, interacting with people in a way that respects their individuality will result in happier, more cooperative participants. It is vital that each person participates at a time that does not conflict with his/her schedule (e.g., meal times, nap times, social groups, etc.). Also, a little extra goes a long way. For example, one participant was an avid, lifelong painter. The research team brought watercolours and poster-paper, which were set up near the study location and made available to the participant after each trial. Very soon, she went from being an aloof to a quite keen participant (although her smile while painting was the greatest success of all).

Involve the Staff: Prior to installation of the system, the staff of the facilities were made aware of the purpose of the study and the proposed execution plan at staff meetings. The team answered questions and asked the staff for suggestions as to the most effective and least disruptive way to conduct trials. During the studies, researchers interacted with the staff often, answering questions and demonstrating the technology.

As the staff of long-term care facilities are often very busy, it is important that researchers continually work with them to ensure as little disruption to everyone's schedules as possible.

Technical Support: Software/hardware support must be available and easily accessible at all times while the system is installed. It must be made clear to the people operating the device how assistance can be obtained, and the response to technical difficulties must be timely. Enthusiastic and timely support will lead to greater interest and cooperation from the staff of the facility; if you do not take your responsibilities regarding your equipment seriously, you cannot expect others to do so.


Continually Assess the Installation: After installing a system, continually (remotely) monitor the operational status of the system and regularly assess the satisfaction of the people who interact with it (e.g., staff, families of people with dementia, and the people with dementia themselves). Apart from asking direct questions, simply observing people interacting with the system will yield significant information. This provides many insights into the strengths and weaknesses of the installation and ensures that any problematic situations regarding the system are addressed as quickly as possible and, in many cases, before they even arise.

Be Prepared for the Reality of Working with this Population: Older adults with dementia are a frail and vulnerable population. Research trials must be designed in a way that is sensitive to peoples' abilities and morbidities. Particularly with longer trials, days may be missed because of illness, and in some cases participants may cease to participate entirely because of health concerns or, sometimes, death. Not only should the methodology of the trials take this into account, but adequate support should be made available for the research team, who may very well be affected by changes in the participants' health and wellbeing.


Be Creative: Ensuring that participants willingly and enjoyably participate in a study is important to most researchers. However, many conventional incentives are not appropriate for this group: monetary incentives cannot be given, and it is inadvisable to offer food incentives because of the large number of dietary constraints. Be creative when devising appropriate incentives for participation. For example, during one set of trials, one of the researchers brought her small dog to the facility every day (after gaining approval from facility management). The dog stayed in a lounge area adjacent to the washroom where the trials were being conducted. Very soon after the dog's appearance, people would come and go throughout the day to visit and play with the dog. Usually there were two or three people in the lounge area, some of whom were in the study and some who were not. The dog proved to be a wonderful addition to the research team, as she caused many participants to come to the trial area of their own accord, created an enjoyable place for socialisation, and became a conversation topic for the residents and researchers.

Take the Time: Apart from the performance of the system itself, taking the time to establish a good rapport with everyone involved in the study is the most important thing one can do to ensure a successful installation. Participants, staff, management, and families all notice when a few extra minutes are taken to visit with the participant (e.g., to chat a bit, help him/her to the room they wish to go to, etc.). While it takes more time, being a friend to the participants is not only personally rewarding, but is more than worth the effort for the trust, good rapport, and cooperation it yields. This positive relationship is especially important for trials that require many weeks or months to complete.
This approach also fosters good long-term relationships with people, who will be keen to participate in future studies, greatly reducing the effort needed to find interested facilities, identify willing participants, and obtain consent from substitute decision makers. Setting up and maintaining good relationships will always pay off in the long run.

Provide Recognition and Closure for the People Involved: It is important to convey the closure of a study in a way that is appropriate for the people who participated. This gives researchers a chance to show their appreciation for peoples' participation and provides closure for everyone. At the completion of the COACH trials, a "thank you" party with


cake and juice was given for all the residents and staff in the wing and certificates were given to participants. Upon publication of the results several months after the completion of the study, a pizza lunch with a 20-minute presentation followed by a question and answer period was held for the staff, management, and family members of participants. Getting the opportunity to see the impact of their participation not only resulted in a sense of satisfaction, but generated an excellent discussion with many observations the researchers plan to implement in the future.

4. Evaluation and Assessment

The assessment of the impact of assistive technology on the people for whom it is intended is important not only to determine whether it should be used on a larger scale, whether it is cost-effective, or whether it achieves the aims for which it was designed. It is also necessary to assess and evaluate systems for their usability and usefulness during the design and development phases, so that the technology is known to be usable by its intended users and also useful to them. In these cases the technology has been designed to be usable and useful to people with dementia and their caregivers.


4.1. Evaluation of the Deptford and Bristol Flats

The flat in Deptford was evaluated by a man who had moderate Alzheimer's Disease. He moved into the flat from his own home, where he had been supported by community services coming into his home. He and his daughter agreed to participate in the evaluation programme. On arrival at the flat the participant had a Mini Mental State Examination (MMSE) score of 10.

The flat in Bristol was also evaluated by a resident tenant who agreed to be part of the research programme. The first tenant to reside in the flat was found to be unsuitable as a tenant by the housing association and was found alternative accommodation; this was not because he was unsuitable as an evaluator. The second tenant found for the flat in Bristol was a woman experiencing the early stages of Alzheimer's Disease. She agreed to participate in the technology evaluation along with her daughter, who was nominated as the tenant's primary carer after the caregiving staff in the development. The woman's immediate family did not live locally and maintained contact with her primarily by telephone. She was visited by family occasionally and went to stay with relatives several times over the course of the evaluation period.

A variety of evaluation methods were used, including standardised outcome measures, qualitative methods and sensor-derived parameters. These methods are described below.

Qualitative Measures

These measures form an unstructured and anecdotal record of the impact of the installed technological interventions. Due to the small-scale nature of these evaluations, gathering sufficient quantitative data on which to base scientific analysis was difficult; the recorded views of the people using the technology became invaluable in assessing its overall value.


Interview: The participants, their key family member and the care facility manager (or her deputy) were individually interviewed at regular review meetings. In the Deptford case these occurred every two to three months; in the Bristol case an initial interview was conducted five weeks after the tenant moved in, and review meetings were then held every three months. Notes were made from these interviews. The interviews took the form of a two-way dialogue, with research staff presenting observations and trends collected from the monitoring data, and staff, family and the resident giving feedback on the usefulness of the technology. The two-way nature of these interviews made them an invaluable setting in which changes to the technology could be discussed and agreed upon.

Care notes and event log: The notes made by staff relating to the participant were copied and collected at each review meeting. The staff also filled in a separate form recording calls made by the monitoring system and their responses to the calls. In the Bristol case, these logs were of limited success due to poor compliance by staff. This may also have been because night staff training was delivered by care managers rather than by the researchers themselves, as had been the case at the flat in Deptford.

Quantitative Measures


Two non-technological quantitative measures were used to measure parameters relating to the participant and the evaluation.

Mini Mental State Examination (MMSE): The MMSE [2] is a crude but quick measure of cognitive function that is widely used to track the progression of diseases that cause cognitive impairment. MMSE was measured every three months throughout the evaluation. The MMSE score of the Deptford flat's participant was 10 upon arrival at the flat, indicating moderate to severe dementia. After three months, at the assessment made just prior to switching on the interventions, his MMSE score had risen to 16. It subsequently decreased to 15 when measured at the six-monthly review. It is thought that the increase in his MMSE score at the three-month point was due to improved nutrition and basic care over the period after his arrival at the extra care facility. The Bristol participant's MMSE score was 21 on arrival, indicating moderate dementia. Her MMSE score was not measured again during the evaluation.

Individual Prioritised Problem Assessment (IPPA): The IPPA [6] outcome measure is designed to measure the impact of assistive technological interventions on a specific problem identified by a person (and his or her caregivers, in the case of people with cognitive impairment). IPPA was measured at each review by the project occupational therapist.

Sensor Derived Measures

The monitoring and intervention system included a large number of sensors installed in the flats. Data from these sensors were collected and logged. Offline analysis of the data yielded meta-parameters describing aspects of the participants' lifestyle and behaviour. Changes in these parameters were used as indicators of changes in behaviour and overall wellbeing.


Cumulative time asleep at night: This parameter was measured using the installed bed occupancy sensor. Time asleep was determined by aggregating the durations of continuous periods of one hour or more spent in bed. It was assumed that if the participant was in bed at night then he was asleep. It is accepted that he may have been in bed without being asleep; however, determining his state of consciousness was beyond the scope of the technology used for this evaluation. Figure 3 shows a graph of total time spent in bed per 24-hour day from the Bristol evaluation. This type of long-term data can be useful for detecting trends in health or progressive disease.
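The aggregation rule above (sum only continuous in-bed periods of one hour or more) can be sketched as follows. This is an illustrative reconstruction of the working definition, not the project's actual analysis code.

```python
from datetime import datetime, timedelta

MIN_SLEEP_PERIOD = timedelta(hours=1)


def time_asleep(bed_intervals):
    """Sum the durations of continuous in-bed periods of one hour or more,
    following the evaluation's working definition of night-time sleep.
    `bed_intervals` is a list of (enter, leave) datetime pairs derived
    from the bed occupancy sensor. Sketch only."""
    total = timedelta()
    for enter, leave in bed_intervals:
        duration = leave - enter
        if duration >= MIN_SLEEP_PERIOD:
            total += duration  # short visits to bed are not counted as sleep
    return total


# Example: two long periods count; a brief 20-minute period does not.
night = [
    (datetime(2007, 1, 1, 22, 30), datetime(2007, 1, 2, 2, 0)),   # 3.5 h
    (datetime(2007, 1, 2, 2, 40), datetime(2007, 1, 2, 3, 0)),    # 20 min, ignored
    (datetime(2007, 1, 2, 3, 15), datetime(2007, 1, 2, 7, 15)),   # 4 h
]
print(time_asleep(night))  # 7:30:00
```

Averaging these daily totals over a 7-day window gives the kind of trend line shown in Figure 3.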


Figure 3. A graph of the total daily time spent in bed, averaged over 7 days. 8 months of data are presented. The gaps in the data indicate that data was not recorded for those days.

Number of exits from the flat during the risk period: This parameter was measured by identifying the number of times that the participant left the flat during the risk period. For the participant in the Deptford flat this was configured to be between 22:00 and 06:00; for the Bristol participant, between 23:00 and 07:00. An exit from the flat was defined as a trigger of the hall motion sensor, then the front door motion sensor, and then the front door sensor, without a subsequent trigger of the front door or hall motion sensors shortly afterwards. In the Bristol flat this was measured slightly differently due to difficulties with the delay in the PIR sensors resetting: the parameter recorded was the number of times the door was opened during the risk period, defined as a trigger of the front door motion sensor followed by the front door being opened within 30 seconds. Figure 4 shows an effective means of presenting this data that gives an overview of a whole night of activity.

Number of room transitions: Room transitions were measured throughout the day and night as a measure of activity. A room transition was defined as a trigger of a room motion sensor followed by the triggering of the sensor in an adjacent room. Room transition data recorded from Deptford are shown in the tables below. It is thought that the number of room transitions occurring within a defined time-frame is a useful measure of activity levels.
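The Bristol definition (a front-door motion trigger followed by the door opening within 30 seconds, inside a risk period that crosses midnight) can be sketched over a time-ordered event stream. Sensor names here are illustrative placeholders, not the identifiers used in the actual installation.

```python
from datetime import datetime, timedelta

RISK_START, RISK_END = 23, 7        # risk period 23:00-07:00
PAIRING_WINDOW = timedelta(seconds=30)


def in_risk_period(t: datetime) -> bool:
    # A window that crosses midnight cannot use a simple range check:
    # an hour is "at risk" if it is late evening OR early morning.
    return t.hour >= RISK_START or t.hour < RISK_END


def count_risk_period_openings(events):
    """Count front-door openings during the risk period, per the Bristol
    definition: a front-door motion trigger followed by the door opening
    within 30 seconds. `events` is a time-ordered list of
    (timestamp, sensor_id) tuples; sensor ids are illustrative."""
    count = 0
    last_motion = None
    for timestamp, sensor in events:
        if sensor == "front_door_motion":
            last_motion = timestamp
        elif sensor == "front_door_open":
            if (last_motion is not None
                    and timestamp - last_motion <= PAIRING_WINDOW
                    and in_risk_period(timestamp)):
                count += 1
            last_motion = None  # each motion trigger pairs with one opening
    return count


events = [
    (datetime(2007, 1, 2, 1, 14, 0), "front_door_motion"),
    (datetime(2007, 1, 2, 1, 14, 12), "front_door_open"),   # night: counted
    (datetime(2007, 1, 2, 10, 5, 0), "front_door_motion"),
    (datetime(2007, 1, 2, 10, 5, 10), "front_door_open"),   # daytime: ignored
]
print(count_risk_period_openings(events))  # 1
```

The Deptford three-sensor exit sequence could be handled the same way, by matching a longer trigger pattern instead of a single motion/open pair.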


Figure 4. A sample of data from the Bristol evaluation showing one night of bed occupancy and front door opening data. A downward transition on the bed occupancy trace indicates the tenant getting out of bed; an upward transition indicates the tenant getting into bed. An upward transition on the door opening trace indicates the door opening; a downward transition indicates the door closing.

Daytime transitions                    Minimum   Maximum   Mean
Before switch-on (up to Aug. 15th)       217       445      348
After switch-on (after Aug. 15th)        199       535      367
November                                 258       599      357
December / January                       157       536      297
February / March                         201       694      385

Table 3. The number of daytime room transitions in the Deptford flat during periods between July 2006 and March 2007.


Night-time transitions                 Minimum   Maximum   Mean
Before switch-on (up to Aug. 15th)        0        134      42
After switch-on (after Aug. 15th)         8         90      42
November                                  2        202      38
December / January                        2        108      41
February / March                         12        189      80

Table 4. The number of night-time room transitions in the Deptford flat during periods between July 2006 and March 2007.

4.2. Evaluation and Assessment of COACH

The following presents selected results from the latest clinical trials with the COACH system. A single-subject design was used, with two alternating baseline (COACH not used) and intervention (COACH used) phases, with five participants who had moderate-level dementia (i.e., an MMSE score between 10 and 20). Each participant completed ten days of trials for each phase, for a total of 40 trials per participant. During the baseline phases, a caregiver assisted the participant through the activity of handwashing in the conventional way (i.e., providing prompts, gestures, and physical assistance when the caregiver felt they were necessary). During the intervention phases, the caregiver stood outside the washroom and intervened only if COACH summoned her to do so. Data from the trials were analysed to determine the efficacy of COACH. However, "successful


device operation" does not mean the same thing to clinicians as it does to technicians. From a clinical point of view, a technology is successful if it achieves desirable clinical outcomes, in this case independent handwashing by the participant. From a technical point of view, the system is considered a success if it accurately captures the environment, interprets this information correctly, and gives reasonable prompts to the participant. Hence COACH's efficacy was evaluated in two ways, one examining clinical outcomes and the other the technical operation of the system.


Figure 5. Example setup of the COACH system when it is installed to monitor the activity of handwashing.

To reflect the importance of a person's independence, the clinical assessment of COACH focused on: 1) independent step completion; 2) number of caregiver interactions; and 3) the functional assessment score (FAS). For independent step completion, a participant scored one point the first time s/he completed a step in a trial without any assistance from a human caregiver, up to a maximum score of five (one for each essential handwashing step). The number of caregiver interactions was the number of times the caregiver interacted with the participant; an interaction was considered to be any exchange between the caregiver and the participant related to activity completion, including verbal prompting, demonstration, and touching (either the participant or an object). The functional assessment score (FAS) is a modified tool for rating independence. For each trial, each of the five essential handwashing steps was scored from zero (no attempt/refusal) to seven (complete independence), with an overall maximum of 35. If the participant completed a step in response to prompts provided by COACH, a score of seven was given. Table 5 summarises the results from the pilot study. From these results, it appears that the introduction of the COACH system leads to improvement trends in the three measures that were examined: independent step completion increased, caregiver interactions went down, and the average FAS increased. It is interesting that an individual such as S4 was able to obtain complete independence from a human caregiver when COACH was used. For people who are already relatively independent in handwashing, such as S3 and S6, COACH appears to serve more as a "maintenance" than a guidance tool, with


Figure 6. The steps required to complete the activity of handwashing. Note that “wetting hands” was a non-essential (optional) step in the activity as liquid soap was used in the trials.


no significant positive or negative effects (although positive effects would be difficult to detect using these methods, as these subjects essentially had maximum scores). In all cases, participants' measures of clinical independence tended to improve when COACH was used.
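The per-trial scoring scheme described above can be sketched as follows. The step names are hypothetical stand-ins for the five essential handwashing steps; only the scoring rules (0-7 per step, maximum 35; one point per step first completed unassisted, maximum 5) come from the text.

```python
# Illustrative names for the five essential handwashing steps.
ESSENTIAL_STEPS = ["turn on water", "use soap", "rinse hands",
                   "turn off water", "dry hands"]


def fas_score(step_scores):
    """Per-trial functional assessment score: each of the five essential
    steps is rated 0 (no attempt/refusal) to 7 (complete independence)
    and the ratings are summed, to a maximum of 35. A step completed in
    response to a COACH prompt scores 7, per the study's convention."""
    assert len(step_scores) == len(ESSENTIAL_STEPS)
    assert all(0 <= s <= 7 for s in step_scores)
    return sum(step_scores)


def independent_step_completion(trial_steps):
    """One point for each step completed at least once in the trial
    without human-caregiver assistance, to a maximum of five.
    `trial_steps` maps step name -> True if completed unassisted."""
    return sum(1 for step in ESSENTIAL_STEPS if trial_steps.get(step, False))


print(fas_score([7, 7, 7, 7, 7]))  # 35: complete independence
print(independent_step_completion({s: True for s in ESSENTIAL_STEPS[:3]}))  # 3
```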

Figure 7. Hits, misses, false alarms, and correct rejects made by the COACH system and participants’ responses during trials with five individuals who had moderate-level dementia.

To gain a better understanding of the device performance from a more technical point of view, how and when the device intervened was examined with respect to the


Participant          Phase    Mean steps completed     Mean interactions with    Mean FAS*
[Average MMSE score]          independently (out of 5) human caregiver           (out of 35)

S3 [15]              A1       5.0                      0.0                       35.0
                     B1       5.0                      0.0                       35.0
                     A2       4.9                      0.2                       34.8
                     B2       5.0                      0.0                       35.0

S4 [12]              A1       3.6                      2.6                       29.9
                     B1       5.0                      0.0                       34.7
                     A2       4.5                      1.4                       33.9
                     B2       5.0                      0.0                       34.5

S5 [20]              A1       3.3                      4.4                       30.7
                     B1       4.1                      2.6                       31.6
                     A2       3.8                      3.8                       32.2
                     B2       4.9                      0.3                       33.2

S6 [13]              A1       5.0                      0.0                       35.0
                     B1       5.0                      0.0                       34.9
                     A2       5.0                      0.3                       34.8
                     B2       5.0                      0.0                       34.9

S8 [11]              A1       4.6                      1.9                       33.2
                     B1       5.0                      1.3                       33.1
                     A2       4.5                      2.2                       32.8
                     B2       5.0                      2.6                       32.3

Mean score           A1&A2    4.4                      1.7                       33.2
over phases          B1&B2    4.9                      0.6                       34.1
Percent change**              11                       -66                       2.4

* Functional assessment score
** Calculated by [(B1+B2)-(A1+A2)]/(A1+A2)*100

Table 5. Results from efficacy trials with COACH.
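The percent-change formula in the table footnote can be expressed directly in code. This is a minimal sketch of the calculation only; the exact published figures were presumably computed from unrounded trial data.

```python
def percent_change(a_total: float, b_total: float) -> float:
    """Percent change from the baseline phases (A1+A2) to the
    intervention phases (B1+B2), per the formula in Table 5:
    [(B1+B2) - (A1+A2)] / (A1+A2) * 100."""
    return (b_total - a_total) / a_total * 100


# Independent step completion, combined over phases: A = 4.4, B = 4.9.
print(round(percent_change(4.4, 4.9)))  # 11, matching the table
```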

                                       COACH Response
                                Prompt           No Prompt
Participant      Error          Hit              Miss
Action           No Error       False Alarm      Correct Reject

Table 6. Possible reactions of the COACH system to participant actions.

participant's action. Each device intervention can be categorised as a hit, miss, correct reject, or false alarm, as outlined in Table 6. In addition to labelling with these categories, where applicable the trial data were also marked with whether or not the participant responded to the prompt given by COACH. These results can be seen in Figure 7. These data suggest that while the use of COACH may improve independence, there is still work to be done in improving prompting accuracy. A more in-depth discussion of the results presented above can be found in [3].
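The categorisation in Table 6 is the standard signal-detection contingency and maps directly to code; the function below is an illustrative sketch of that mapping.

```python
def classify(prompt_given: bool, error_made: bool) -> str:
    """Categorise one COACH intervention per Table 6: a prompt when the
    participant erred is a hit; no prompt despite an error is a miss; a
    prompt without an error is a false alarm; no prompt and no error is
    a correct reject."""
    if error_made:
        return "hit" if prompt_given else "miss"
    return "false alarm" if prompt_given else "correct reject"


print(classify(True, True))    # hit
print(classify(False, True))   # miss
print(classify(True, False))   # false alarm
print(classify(False, False))  # correct reject
```

Tallying these four categories over a set of trials yields the counts plotted in Figure 7.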



5. Conclusions

The systems being developed in Toronto are designed to autonomously support people in their homes, providing assistance only when it is needed and without any intentional input from the person being supported. Important features of these devices are: 1) they do not require people to learn or remember how to use them, as they intervene automatically when needed; 2) interactions with the devices are intuitive (e.g., speech); 3) the devices learn about and adapt appropriately to the people using them; and 4) the person using the device remains in control of the situation, deciding his/her preferred course of action. As such, these devices are not specific to particular stages of aging or dementia, but can support a very wide range of mental and physical abilities, adapting to the person as his/her needs change. It is anticipated that monitoring technologies such as these will significantly increase the independence and quality of life of people with dementia in the not-too-distant future.

The behaviour monitoring technologies installed in these three examples (Deptford, Bristol and Toronto) have been shown to be beneficial to the individuals who evaluated them. These case studies clearly show that there is great potential for further development of such technology. However, if this future potential is to be realised, then methods and procedures for installation must be developed that take into account the particular needs of people with dementia. Design, installation, configuration and support must all be done with reference to the abilities and impairments of people with dementia and their carers and, in the case of configuration and installation, with reference to the specific abilities, preferences and impairments of the individuals for whom a particular system is being installed.
Especially where more complex systems are being installed, it is not sufficient to bring a generic design to production and expect it to be suitable for all people with dementia without scope for individualised configuration. Monitoring technologies such as those described in this chapter have the potential to transform the lives of people with dementia and their carers: by providing greater independence; by allowing care to be more accurately tailored to an individual's needs through the availability of high-quality data describing a person's lifestyle; and by reducing the anxiety and burden experienced by carers who need to check on the status of people with dementia, and who worry about what may have happened to them when they are not there. Fully integrated user involvement in design and implementation is key to the success of such technologies because of the complexity and uniqueness of the people who use them, and the degree to which success or failure depends on how well the technology fits the individual user.

6. Future Research

This work has shown that monitoring of people with dementia can be of value to both professional and informal carers. However, there has been very little deployment of this technology in the field except in small-scale technology evaluation contexts, such as the case studies described here. Further work is needed to develop a fuller understanding of the impact of these technologies on a larger scale, and to understand how best to gather, analyse, interpret and present the data that the technology makes it possible to acquire.

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,


T. Adlam et al. / Implementing Monitoring and Technological Interventions in Smart Homes



Behaviour Monitoring and Interpretation – BMI
B. Gottfried and H. Aghajan (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-048-3-183


The Praxis of Cognitive Assistance in Smart Homes

Sylvain GIROUX a,1, Tatjana LEBLANC b, Abdenour BOUZOUANE c, Bruno BOUCHARD c, Hélène PIGOT a, Jérémy BAUCHET a

a DOMUS laboratory, University of Sherbrooke, Canada
b School of Industrial Design, University of Montreal, Canada
c Dept. of Computer Science, Université du Québec à Chicoutimi, Chicoutimi, Canada


Abstract. The current and prospective situation of cognitively impaired people entails great human, social, and economic costs. Smart homes can help to keep cognitively impaired people at home, improve their autonomy, and accordingly alleviate the burden put on informal and professional caregivers. This chapter provides a comprehensive view of the research performed at the DOMUS lab. This research aims at turning the whole home into a cognitive prosthetic, especially by providing cognitive assistance. The first part of the chapter presents research on the infrastructure, both sensor networks and middleware. Work on autonomic computing, multi-person localization, context awareness, and personalization is presented. The next part of the chapter illustrates, by means of DOMUS research prototypes, how cognitive assistance can help to address four kinds of cognitive deficits: initiation, attention, planning, and memory. Studies involving cognitively impaired people are also presented. In the final part of the chapter, the roles of AI, context awareness and behavior tracking are questioned. To what extent are they compulsory? Can design provide smart and simple solutions to complex issues?

Keywords. Smart homes, pervasive computing, context awareness, cognitive deficits, design, artificial intelligence.

Introduction

Who has never searched for his keys? Who has never forgotten a pan on the stove? Considered individually, such lapses of memory and attention do not have serious consequences. But people suffering from cognitive deficits have to face them on a daily basis. People suffering from head trauma, schizophrenia, intellectual disability, or Alzheimer's disease know all too well how such deficits may change one's life. In many cases, they would be able to stay at home if light assistance were provided. But resources are scarce. Oftentimes, relatives must take the initiative and responsibility of care without appropriate professional support. Smart homes can help to keep cognitively impaired people at home, improve their autonomy, and accordingly alleviate the burden put on informal and professional caregivers.

1 Sylvain Giroux, Département d'informatique, Université de Sherbrooke, Téléphone (819) 821-8000 poste 62027, eMail [email protected].


S. Giroux et al. / The Praxis of Cognitive Assistance in Smart Homes

This chapter presents an overview of the research performed at the DOMUS lab. Cognitively impaired people are more numerous than one may think. Given the human, social and economic cost, means have to be found to allow them to stay in their homes and live autonomously (§1). Smart homes can rely on pervasive computing and tangible user interfaces to turn the whole house into a cognitive orthosis (§2). A preliminary study involving cognitively impaired people together with natural and professional caregivers shows that cognitive assistance and tele-vigilance services have to be designed, implemented and deployed to address deficits of attention, planning, memory, and initiation (§3). However, many considerations have to be taken into account, from ethics to cost, before building a true smart home (§4). The DOMUS smart home infrastructure rests on three layers: hardware and networks, middleware, and services (§5). Next, a pervasive cognitive assistant for the morning routine illustrates how cognitive assistance can help to address four kinds of cognitive deficits: initiation, attention, planning, and memory (§6). Then the implementation of a cognitive assistant for meal preparation is sketched, and a usability study of this assistant involving people with intellectual deficiencies is detailed (§7). As a complement to pervasive cognitive assistance, mobile services ensure continuity of assistance and vigilance outside the home (§8). The final part of the chapter investigates how to overcome current limitations of the DOMUS cognitive assistance and tele-vigilance services (§9). First, the roles of AI, context awareness and behavior recognition are questioned (§10). To what extent are they compulsory? Then we explore how design can provide smart and simple solutions to complex issues (§11).


1. Cognitive Deficits Entail High Human, Social, and Economic Costs

The "World Population Ageing: 1950-2050" report of the Population Division, Department of Economic and Social Affairs (United Nations) [1] shows how the increase in life expectancy 2 leads to an expected global population of 2 billion people aged 60 or over in 2050, more than three times the current elder population. Accordingly, a tremendous rise in ageing-related diseases like dementia and Alzheimer's disease should be observed. For instance, [2] estimates that in 2005 more than 6.4 million European people were suffering from various forms of dementia (around 1.25% of the population); this number is expected to grow to 15.9 million by 2040. So it is not surprising that current demographic trends tend to focus on elders when speaking of cognitive deficits. Unfortunately, cognitive deficits are far more widespread than one may think at first sight. Traumatic brain injuries (TBI), schizophrenia, and intellectual deficiencies are just a few examples of sources of cognitive deficits. Direct medical costs and indirect costs such as lost productivity due to TBI totaled an estimated $56.3 billion in the United States in 1995 [2]. TBI patients account for 6% of the SAAQ clientele but for 28% of its costs [3]. Schizophrenia affects 1% to 2% of the population in the United States [4] 3. In 2002, the overall U.S. cost of schizophrenia was estimated

2 Life expectancy in 2045-2050 is expected to be 76 years old.
3 However, the authors of a systematic analysis suggest that the reality is somewhat lower: "if we wish to provide the general public with a measure of the likelihood that individuals will develop schizophrenia during their lifetime, then a more accurate statement would be that about seven to eight individuals per 1,000 will be affected." [78]



to be $62.7 billion [5]. Mental retardation affects 1 to 3% of the population 4 [6]. Therefore many young people are also affected by cognitive deficits, and they have a long life span in front of them. Governments try to maintain in their community people suffering from cognitive impairments (Alzheimer's disease, trauma, schizophrenia...). Continuous care and monitoring are then compulsory to keep them at home. However, in many cases, they would be able to stay at home if light assistance were provided. But resources are scarce. Thus, most of the time, relatives take responsibility for care without access to appropriate resources. Too often this situation turns into an exhausting burden. Hence relatives and caregivers are urgently asking for help.

2. Smart Environments, a Source of Hope


Today, networks, microprocessors, memory chips, smart sensors and actuators are faster, more powerful, cheaper and smaller than ever. Chips are all around, invading everyday objects. Wireless networks make it easy to connect them. Everyday objects can then propose innovative and unexpected interactions [7]. Clothes will transport one's profile to reconfigure his environment according to his preferences [8]. Lamps will help him to find lost objects [9]. Interactive portraits will reflect at a distance the mood and health state of his loved ones [10]. Pervasive computing [11], cognitive orthotics [12] [13], and tangible user interfaces [7] thus form a promising combination for a seamless integration of assistance services into the everyday life of cognitively impaired people [14] [15]. Such smart environments will change at their root the way we conceive and use health-related services: diagnosis, therapy, assistance, and vigilance. The DOMUS laboratory is studying the theory and praxis of pervasive computing to create smart homes [16, 17]. Smart homes are environments augmented with embedded computers, information appliances, and multi-modal sensors. The research target is the realization of a smart home for people suffering from cognitive deficits. To enable them to perform their activities of daily living securely, the lab is developing:

- Cognitive assistants. A home enhanced with cognitive assistance will give people suffering from cognitive deficits the capacity to define their own life project and will foster their autonomy. The entire home will then be a cognitive orthosis able to remedy deficits of attention, memory, planning and initiation.

- Tele-vigilance systems. Tele-vigilance systems will support better medical and human supervision while relieving caregivers.

4 The great variation of prevalence extracted from the clinical studies is related to the definition of mental retardation used by the study, the evaluation methods and the studied populations.


3. Users of a Smart Home

The users of a smart home will of course be the cognitively impaired residents, but also the natural and professional caregivers. Before starting to design and build smart homes for cognitively impaired people, DOMUS established its target population (§3.1) and their needs and requirements (§3.2).

3.1. Target Population

The global deterioration scale (GDS) for assessment of primary degenerative dementia [18] is used to assess the severity of cognitive impairments by means of a staging system (Table 1). Although dedicated to dementia, the GDS was used to characterize the cognitive capacities of the expected typical residents of a smart home 5. We believe that stages 3, 4, and 5 are those where technology can prove most fruitful in keeping cognitively impaired people at home. They delineate those people who would not be able to stay at home without some assistance, but whose health status is not so severe that they cannot benefit from assistance and remote vigilance.

Table 1 The Global Deterioration Scale for Assessment of Primary Degenerative Dementia [18]

Stage | Type of cognitive decline           | Description
1     | No cognitive decline                | No subjective or objective deficits
2     | Very mild cognitive decline         | Some subjective complaints, no objective deficits
3     | Mild cognitive decline              | Mild working memory deficits (attention, concentration)
4     | Moderate cognitive decline          | Episodic memory deficits (memory of recent events)
5     | Moderately severe cognitive decline | Explicit memory deficits (ability to accomplish usual tasks)
6     | Severe cognitive decline            | Severe memory deficits (which cause delusion)
7     | Very severe cognitive decline       | All verbal abilities are lost
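The staging rule described above (stages 3 to 5 as the target population) can be sketched as a small lookup. This is an illustrative sketch only; the dictionary and function names are assumptions, not DOMUS code.

```python
# Illustrative sketch (not DOMUS code): Table 1 as a lookup, plus the
# working hypothesis that residents at GDS stages 3-5 benefit most.

GDS_STAGES = {
    1: "No cognitive decline",
    2: "Very mild cognitive decline",
    3: "Mild cognitive decline",
    4: "Moderate cognitive decline",
    5: "Moderately severe cognitive decline",
    6: "Severe cognitive decline",
    7: "Very severe cognitive decline",
}

def is_target_resident(stage: int) -> bool:
    """Stages 3-5: need some assistance, yet can still benefit from it."""
    return 3 <= stage <= 5
```

For instance, a resident at stage 4 falls within the envisioned population, while stages 1-2 and 6-7 fall outside it.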


3.2. Requirements and Needs

A study on the requirements and needs of cognitively impaired people and caregivers in Quebec and in France [19] established, first, that special attention should be paid to attention, initiation, memory, and planning deficits, and, second, that two classes of applications should be privileged, namely cognitive assistance and tele-vigilance. Cognitively impaired people should be able to perform their activities of daily living while being, and feeling, safe at home. As a result, pervasive computing and tangible user interfaces stand as the backbone for cognitive assistance. They make it possible to provide adapted and personalized ambient cues that foster the autonomy of cognitively impaired people [15] and reduce risks and hazards. Mobile computing and location-based services then keep ensuring cognitive assistance when people are outside their home [20] [21]. Finally, these technologies help relatives and caregivers stay in touch at a distance with cognitively impaired people. Besides, tools for teams of caregivers provide means for synchronous and asynchronous collaboration [22] [23].

5 The GDS was used more as a guideline than a real assessment tool, since the residents may of course suffer from dementia, but also from other diseases such as TBI.


4. Design Principles

Though technology can be very enabling and powerful, building cognitive assistance and tele-vigilance services raises many considerations: effectiveness, costs, ethics [24]. For instance, doing the actions in place of an Alzheimer's patient can lead to a faster deterioration of his cognitive capacities. With this in mind, the DOMUS lab has set the following guidelines and hypotheses:

- We don't need to know everything to be helpful.
- Technology is not the only way to go; maybe there exists a non-technology-based solution.
- Use what is already commercially available.
- The system should be as unobtrusive as possible.
- There should be as little intervention as possible; less but useful and meaningful is better.
- The user owns the control and can always turn the system off.
- Advise instead of doing.
- There is always a human being at the other end of the system.

5. From Homes to Smart Homes, to Smart Care


To explore how to transform a real apartment into a smart home, and how to use this smart home to provide smart care, DOMUS benefits from a real apartment on the campus of the University of Sherbrooke (Figure 1).


Figure 1 The apartment at the DOMUS lab. (a) the living room. (b) the kitchen. (c) the dining room. (d) a map of the complete apartment.



The infrastructure of this smart home has three layers. The lower layer is made of hardware, embedded processors, and networks (§5.1). Middleware is deployed on top of the hardware layer (§5.2). At the upper level, smart care services are deployed (§5.3). Prototypes are then experimented with, validated and progressively transferred to real life. In the long term, we hope that results will be applied to non-medical contexts. Indeed, targeting the most difficult cases will help to better understand the needs and uses of smart homes in general.

5.1. Hardware, Devices and Network

A smart home is an environment augmented with heterogeneous networks, sensor networks, processors embedded in appliances, clothes, jewels, information devices, and networked communicating objects. The DOMUS laboratory provides a cutting-edge research infrastructure for pervasive computing and TUI. It consists of a standard apartment (kitchen, living room, dining hall, bedroom, and bathroom) augmented with sensors, localization systems, microphones and speakers, TV, touch screens, etc. Lighting, plumbing, audio and video streams can be entirely monitored and controlled. Available networks include wired Ethernet, WiFi, Bluetooth, data communication through the power line, and a Crestron specialized domotic system. Currently available sensors are electromagnetic contacts, sensitive rugs, movement detectors, RFID readers and tags, a Ubisense localization system, Watteco sensors, flowmeters, and microphones. They are used for the localization and identification of people and objects. Processors and sensors are embedded into the stove. To provide feedback to the user, services may use information devices such as wireless computer screens, touch screens, the Icebox, PDAs, LEDs, speakers, telephones, videoconference systems, TVs, smartboards, and lighting systems. Servers coupled with programmable logic controllers provide full control over the devices and the sound and video streams in each room of the apartment.


5.2. Middleware Layer of a Pervasive Infrastructure

The middleware links the hardware layer to the application layer. Once the hardware is installed, one needs to build a middleware layer for the pervasive infrastructure. On the one hand, middleware makes it possible to cope with the heterogeneity of devices, networks, and operating systems. On the other hand, it provides generic services useful for many applications (Figure 2). Research is done on low-level event awareness and context awareness, enabling for instance localization of the user from simple sensor information [25], transparent user-friendly migration of sessions [26], autonomic computing and self-management [27], extension of OSGi for remote deployment and monitoring [28], mobile agents, multi-channel delivery of services [29], and application of design patterns for security and dependability [30].
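As an illustration of the low-level event awareness mentioned above, localizing a user from simple sensor information can, in its most naive form, be reduced to mapping each firing sensor to a room and keeping the most recent hit. This is a hedged sketch only: the sensor identifiers and the last-event-wins rule are assumptions for illustration, not the actual design of the DOMUS middleware [25].

```python
from collections import deque
from typing import Optional

# Naive localization sketch: each anonymous binary sensor (movement
# detector, electromagnetic contact, sensitive rug) is statically mapped
# to a room; the most recent mapped event gives the estimated location.
SENSOR_ROOM = {
    "pir_kitchen": "kitchen",
    "pir_living": "living room",
    "fridge_contact": "kitchen",
    "rug_bedroom": "bedroom",
}

class SimpleLocalizer:
    def __init__(self, history: int = 10):
        self.rooms = deque(maxlen=history)  # short event history

    def on_event(self, sensor_id: str) -> None:
        room = SENSOR_ROOM.get(sensor_id)
        if room is not None:
            self.rooms.append(room)

    def current_room(self) -> Optional[str]:
        # Last-event-wins rule; None until a mapped sensor fires.
        return self.rooms[-1] if self.rooms else None
```

For example, after a movement detection in the living room followed by an opening of the fridge contact, the estimated location becomes the kitchen. Real deployments would need to handle multiple occupants and sensor noise, which is precisely what the cited research addresses.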



Figure 2 The middleware layer links the sensors and effectors through the various available networks. Cognitive assistance and tele-vigilance use the middleware to gather information and to interact with users through traditional graphical user interfaces and tangible user interfaces.

5.3. Application Layer: Cognitive Assistance and Tele-Vigilance

So far, applications and services include pervasive cognitive assistance for the morning routine, pervasive cognitive assistance for meal preparation, and pervasive reminder systems. With respect to mobile cognitive assistance, we have built a cognitive assistant for ADLs, tele-monitoring tools, medical assessment tools, and tools for the coordination of caregivers. Finally, we are also addressing pervasive space as a medium for asynchronous coordination and collaboration of caregivers. Privacy and security are also taken into consideration in these applications. Embedding knowledge directly in the design of artifacts is another path to assistance that is explored. The next sections sketch some of these services.


6. Cognitive Assistance for the Morning Routine

According to the caregivers, the four cognitive deficits elicited in the study [19] are mainly responsible for the disruption of autonomy in daily life: initiation, planning, attention and memory deficits. Autonomy can be diminished by difficulties in remembering which activity to perform and how to do it, by difficulties in focusing on the activity in progress, or even in beginning it. Such behaviors may impair the health and well-being of the person, as essential activities, like eating or taking medicine, may not be completed in time. The person then needs continuous prompts from her caregiver, which can affect the relationship. This section sketches how a pervasive cognitive assistant (PCA) can recognize these deficits during the morning routine and assist a cognitively impaired resident in the context of a smart home. The cognitively impaired population considered when we implemented the PCA was composed of persons suffering from Alzheimer's disease, head trauma, and schizophrenia.

6.1. Initiation Deficit

An initiation deficit leads to inactive periods while the person is supposed to perform actions [31]. For example, during breakfast time, standing in the kitchen for a


long time could be attributed to an initiation deficit. The PCA detects an initiation deficit when no action is observed during a period where the occupant is supposed to be active. To diagnose it, the PCA must combine information from three features: first, the lack of actions detected by the sensors; second, the period of inaction; and third, the occupant's habits. The third feature is used to compare the actual activity with the expected one. It is essential to avoid detecting an initiation deficit in a period where the occupant is usually inactive, for example during a nap. The PCA infers an initiation deficit when it notices that, contrary to the information stored about the occupant's habits, the sensors indicate no action during a period. The aim of the assistance is to urge the person to begin the activity.

6.2. Planning Deficit


Planning is the identification and organization of the steps and elements needed to achieve a goal [32]. A planning deficit leads to difficulty in performing an appropriate sequence of actions in order to achieve a goal. To prepare tea, it is at least necessary to put a tea bag in a cup, to boil the water, and to pour it in the cup. The appropriate sequence requires that the third task be performed after the two others. Performing actions in an inappropriate sequence indicates a planning deficit. Given an activity in progress, three cases are distinguished according to the current action performed:

1. The current action is related to the activity in progress, but should not occur at that time. Previous actions have to be performed before this one. For example, pouring the coffee before the water is hot.

2. The action in progress is not related to the activity, but it takes place in the same location as the activity. It can be inferred that the person has difficulty formalizing the next step to be completed. She then tries an action without following the goal. For example, she opens the cabinet doors instead of the drawer to pick up a spoon.

3. The activity is engaged, but the average duration of the current step has run out. The occupant seems to be unable to perform the next step.

Time duration is involved in the third case, as shown previously for the initiation deficit. But here the activity has already begun and the person is lost, showing difficulty in finding the next step, instead of waiting without beginning the activity. In the second case, the location where the non-relevant action takes place allows identifying the nature of the deficit: outside the activity area it could be due to a stimulus leading to an attention deficit; inside the activity area it is attributed to the difficulty of finding the next step. The PCA detects a planning deficit according to the three types of inappropriate actions presented above. To diagnose one of these cases, the PCA must combine information from the following four features:

- the location of the ADL in progress;
- the location of the current action;
- the sequence of actions in an activity;
- the average duration of a step completion.

Given the activity in progress and the current action, the PCA diagnoses a planning deficit if the current action is not the one to be performed at that time even though it is performed in the same location, or if it takes too much time to perform the next action. The aim of the assistance is to recall the next step in the sequence of actions for the current activity.

6.3. Attention Deficit

The concept of attention is linked to the processing of external stimuli [33]. During task completion, the person shifts her attention from the activity in progress to a stimulus causing interference. The person has difficulty focusing on the activity to be performed and, as a consequence, the current activity may be forgotten and never completed. The PCA detects an attention deficit when the actions performed are not related to the current activity. To distinguish it from a planning deficit, the location of the new action is crucial. As explained before, if the current action's location is different from the activity's, then the PCA diagnoses an attention deficit, otherwise a planning deficit. Given the activity in progress and the current action, the PCA diagnoses an attention deficit if the current action is not the one to be performed at that time and if it is performed in another location than the activity in progress. The aim of the assistance is to help the person continue the current activity. It will then recall the goal of the activity to help her keep her focus on it.
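The diagnosis rules of §6.2 and §6.3 can be restated compactly as a decision function over the features listed above (expected vs. current action, locations, step duration). The function signature, labels, and flat rule ordering are illustrative assumptions for this sketch, not the PCA's actual implementation.

```python
# Sketch of the planning/attention decision rules (assumed interface):
# - overdue step, or wrong action in the activity's location -> planning deficit
# - wrong action in another location -> attention deficit
def diagnose(expected_action: str, current_action: str,
             activity_room: str, action_room: str,
             step_elapsed_s: float, step_avg_s: float) -> str:
    if step_elapsed_s > step_avg_s:
        return "planning deficit"      # unable to move on to the next step
    if current_action == expected_action:
        return "no deficit"
    if action_room == activity_room:
        return "planning deficit"      # on location, but wrong step order
    return "attention deficit"         # drawn away by an external stimulus
```

For instance, pouring the coffee before boiling the water, while still in the kitchen, is classified as a planning deficit; wandering off to the living room in the middle of the recipe as an attention deficit.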


6.4. Memory Deficit

Memory processes refer to information storage and retrieval [34]. Suffering from memory deficits can lead to difficulties in remembering the activity to perform, the steps of the activity, or the locations of the tools and materials involved in that activity. It is assumed in this study that the person is aware of her memory deficits. She will then take the initiative to ask for assistance. The PCA diagnoses the memory deficit from the kind of request made by the occupant. The aim of the assistant is to provide the forgotten information.

6.5. Interacting with Users

Interactions between the PCA, residents and caregivers raise many issues. Who initiates the interactions? How do users and the PCA interact? Are the interactions synchronous or asynchronous?

6.5.1. Triggering Interactions

Assistive interactions can be triggered by the resident as well as by the PCA. Whenever a user feels the need for some help, he can ask for it explicitly. For instance, when he is looking for the objects needed to perform a task, he can use a service dedicated to object localisation. When he does not know which pills he has to take, he can ask the PCA to establish a connection with his caregivers. Conversely, when the PCA is supervising an activity step by step, for instance meal preparation, it can decide to highlight



specific objects in the room related to the current step of the recipe. Or, when the PCA detects an initiation deficit, it can prompt the resident and suggest an ADL to perform.

6.5.2. User Interaction Mode

The PCA makes use of two approaches for interaction with the resident. At one end of the spectrum, it can use traditional user interfaces, for instance a message displayed on a touch screen to suggest the next ADL to perform when an initiation deficit is recognized, or a map of the apartment displayed to help find objects and assist the user's memory. Sound, recorded messages, and videos can also be used to prompt the resident or attract his attention. At the other end of the spectrum, the PCA can use tangible user interfaces, for instance blinking LEDs to attract the resident's attention to a specific location, or straightforwardly using movements of objects to control the monitoring process [14]. Nonetheless, whatever the devices used, interactions should be as natural, easy and fluid as possible.

6.5.3. Synchronicity of Interactions

Most of the time, interactions are synchronous. When a resident asks for help, he expects to receive an answer immediately. But there are cases where asynchronous interactions are appropriate. For instance, when the PCA detects an initiation deficit, it sends an asynchronous message to the caregivers warning them that the resident may require more attention this morning. Caregivers can then concentrate their attention on those who may need supervision.
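The asynchronous path described above can be sketched with a simple message queue: the PCA posts a non-blocking alert, and caregivers read pending alerts later at their own pace. The queue-based design and the message format are assumptions for illustration, not the PCA's actual messaging layer.

```python
import queue

# Sketch of asynchronous PCA-to-caregiver messaging (assumed design).
alerts: "queue.Queue[str]" = queue.Queue()

def notify_caregivers(resident: str, deficit: str) -> None:
    """The PCA posts an alert without interrupting the resident."""
    alerts.put(f"{resident}: possible {deficit}, may need extra attention")

def caregiver_poll() -> list:
    """Caregivers later drain pending alerts, e.g. from a dashboard."""
    pending = []
    while not alerts.empty():
        pending.append(alerts.get())
    return pending
```

The decoupling is the point of the design: the resident is never interrupted by the notification, and caregivers triage alerts when convenient rather than reacting to each one synchronously.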


7. Archipel

Archipel is a context-oriented framework for cognitive assistance [15]. It has been applied to meal preparation. Its objectives are to promote ADL completion for people with cognitive impairments, to foster their functional autonomy and their quality of life, and to exploit context awareness and resources in the environment for assistance. A framework integrating four axes was implemented: knowledge representation, man-machine interfaces, ADL monitoring and ADL assistance. A first experimentation of the prototype was done with 12 people with intellectual disabilities. Its protocol was presented in [35]. These persons were asked to complete two similar recipes, one with the assistant and the other without it. Familiarization with the assistant was done during an initiation recipe, completed before the experimentation. A researcher stayed with the person all the time, providing some assistance when the person asked for it. The preliminary results show a decrease in the number of cues the researcher gives when the assistant is present (Figure 3). Furthermore, cues are more abstract with the assistant, most of the time concerning the use of the orthosis. For example, instead of saying how to complete a step, the researcher invites the person to use the assistant to watch the video explaining the completion. Everyone was able to complete the recipe, even those who were not able to read.

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

S. Giroux et al. / The Praxis of Cognitive Assistance in Smart Homes

193

Figure 3. We clearly observed that fewer human interventions were needed to perform the recipe when Archipel provided assistance. The Y-axis is the number of interventions per participant. For participants 1, 2, 3, 9, 12 and 6, the first value is for the experiment without the cognitive assistant, while the second value is for the experiment with the cognitive assistant. For participants 7, 4, 5, 11, 10 and 8, the first value is for the experiment with the cognitive assistant, while the second value is for the experiment without it.


8. Mobile Cognitive Assistance

The PCA and Archipel are context-aware assistants designed to work inside the apartment of a cognitively impaired person. However, if he does not have services following him everywhere, his home will soon become a sort of prison. Hence we implemented Mobus, a simple mobile orthosis, as a complement to these assistants. Mobus clients run on personal digital assistants (PDAs) or smart phones. Mobus web services are accessed through wireless networks, e.g. GPRS or 3G. Mobus offers four kinds of services: activity reminders, medical assessment, requests for assistance, and contextual location-based assistance [36] [37] [38]. Each service has two facets: one for cognitively impaired people and the other for caregivers. Many usability and clinical studies of these services with real users have been conducted or are on-going. Results of usability studies are available in [39].

8.1. Activity Reminder

The web service for activity reminders enables a cognitively impaired person to consult his activity reminders at any time. The reminders are decided in collaboration between the cognitively impaired person and his caregivers. The caregiver can enter them in his own MOBUS system. When the person completes an activity, he validates it on his MOBUS system. The caregivers can supervise the realization of the activities on their client (consultation, creation, modification, etc.). They are notified when specific tasks (medication, meals, etc.) have been or have not been performed.

8.2. Gathering Ecological Data for Medical Assessment

Schizophrenic people usually see their psychiatrist once a month. Often the patient's answers are vague or not representative of the true situation. Furthermore, medication often has harsh side-effects, and doses must be fine-tuned with care. A Mobus web service enables a patient to note facts valuable for a better cure, e.g. the occurrence of symptoms and their intensity. The psychiatrist thus has real ecological values when he meets the patient. In the prototype, data are also uploaded to a database through wireless networking, so the psychiatrist can adjust the dose of medication according to the observations and the intensity of side-effects.

8.3. Requests for Assistance

If the patient has a technical or personal problem, he can ask his caregivers for assistance using the Mobus client.

8.4. Context-Awareness and Location-Based Assistance

Usually the smart phone or PDA used by the person is equipped with a GPS. A Mobus web service exploits GPS information to provide cues and advice related to the user's location and/or the activity in his agenda. For instance, when the user enters a predefined area, MOBUS displays previously saved information related to the current activity, such as security rules, orientation help, bus schedules, etc. When in crisis, people with schizophrenia tend to follow characteristic paths or to stay in front of specific locations, for instance churches. The GPS makes it possible to follow the patient's movements and to detect a crisis. Since the patient's location is then known at that moment, somebody can go there and help him.
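The core of such location-based cueing is a geofence check: when a GPS fix falls inside a predefined circular area, the cue saved for that area is shown. The sketch below is a minimal illustration of this idea, not the Mobus implementation; all place coordinates, radii and cue texts are invented.

```python
import math

# Illustrative geofence sketch (not the actual Mobus web service).
# Coordinates, radii and cue texts are invented for the example.

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two GPS fixes."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Predefined areas: (latitude, longitude, radius in metres, cue shown on entry).
AREAS = [
    (45.4042, -71.8929, 150, "Bus 7 leaves every 20 minutes from this stop."),
    (45.4100, -71.9000, 100, "Remember: the pharmacy closes at 18:00."),
]

def cues_for_fix(lat, lon):
    """Return the cues of every predefined area containing the current fix."""
    return [cue for (alat, alon, radius, cue) in AREAS
            if haversine_m(lat, lon, alat, alon) <= radius]

print(cues_for_fix(45.4043, -71.8930))  # a fix a few metres inside the first area
```

The same containment test, run continuously on the caregiver side, is what would let a system notice that a patient is lingering near a known crisis location.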


9. Towards Better and More Relevant Systems

The PCA (§6) and Archipel (§7) seamlessly weave pervasive computing, straightforward context-awareness and tangible user interfaces into one's home and locus of activities (the morning routine and meal preparation respectively). But in both cases, reasoning capacities and adaptability are limited, ad hoc, and based on low-level information. As a consequence, their behaviour may become brittle and irrelevant. Mobus makes useful mobile cognitive assistance services available without any sophisticated computation or reasoning, exploiting very few sources of contextual information. The load is put on the cognitively impaired people and their caregivers, who are responsible for explicitly providing the information to record and for interpreting it. If we want to improve and personalize the services, assistants have to perform activity recognition. They need better models of the users. They have to carry out complex reasoning on events sent by sensors and generate high-level contextual information. They have to be more precisely aware of where the person and the objects are. They have to interact with users while exploiting and understanding all the capacities and limitations of the available devices. Finally, they have to be operative in multi-person environments. Obviously, Artificial Intelligence (AI) appears as the silver bullet able to tackle and solve many of these issues. The next section explores how more fundamental research at DOMUS can propose solutions, especially in relation to ontologies, activity recognition, and cognitive modeling. These works indeed address the issue of modelling and using context in the large. Section 11 takes the opposite direction and presents other research at DOMUS that deliberately avoids using any contextual


information to achieve effective assistance by embedding knowledge directly into the objects.

10. Artificial Intelligence and Ambient Assisted Living

Combining ambient computing with techniques from AI greatly increases the acceptance of ambient assisted living and makes it more capable of providing a better quality of life in a non-intrusive way; elderly people, with or without disabilities, could clearly benefit from this concept. From the computational perspective there is a natural association between the two. However, research addressing smart environments has in the past largely focused on network- and hardware-oriented solutions. AI-based techniques (planning and action theory, ontological and temporal reasoning, etc.) which promote intelligent behavior have not been examined to the same extent [40], although notable exceptions can be found in the domain of activity recognition for healthcare. Prior work has used sensors to recognize the execution status of particular types of activities, such as handwashing [41], meal preparation [42], and movements around town [43]. Additionally, several projects have attempted more general activity recognition, using radio frequency identification (RFID) tags attached to household objects and gloves [44]. At the Domus lab, we investigate the theory and praxis of plan recognition. Most theoretical and long-term approaches are based on hierarchical Markovian task models [45] [35], Bayesian networks [46], and lattice-based models [47] enhanced with probabilities [48] to recognize ADLs and to anticipate erroneous behaviors classified according to cognitive errors [49].


10.1. Activity Recognition

Activity recognition aims to recognize the actions and goals of one or more agents from a series of observations of the agents' actions and the environmental conditions [50]. Owing to its many-faceted nature, different fields may refer to activity recognition as plan recognition. The problem of plan recognition has been an active research topic for a long time [51] and still remains very challenging. The keyhole, adversarial, or intended plan recognition problem refers to a fundamental question: how can we predict the behavior of an observed or communicating agent, so that this prediction can then be used for task coordination, cooperation, assistance, etc.? The theory of keyhole plan recognition, on which we are working, tries to establish a formalization of this behavioural prediction. It is usually based on probabilistic-logical inference for the construction of hypotheses about the possible plans, and on a matching process linking the observations with plans included in a library or a model of activities related to the application domain. This library describes the plans that the observed agent can potentially carry out. At each observation of an action occurrence, the recognition agent tries to build hypotheses based on the knowledge described in this library. Since many possible plans can explain the observations, and thus the behaviour of the observed agent, the challenge is then to disambiguate these concurrent hypotheses. The researchers at the Domus lab are exploring the following representation models to attack this issue.


10.1.1. Lattice-Based Models


The lattice plan recognition model tries to address the recognition issue by using lattice theory and Description Logics (DL) [52], which transform the plan recognition problem into a classification issue. Description logics are a well-known family of knowledge representation formalisms that may be viewed as fragments of first-order logic. The main strength of DLs is that they offer considerable expressive power, going far beyond propositional logic, while reasoning remains decidable. The proposed model [47] provides an adequate basis to define the algebraic tools used to formalize the inferential process of ADL recognition for Alzheimer's patients. To summarize, our approach consists of developing a minimal interpretation model for a set of observed actions, by building a plan lattice structure as shown in Figure 4.

Figure 4. A plan lattice structure that models two plans: “Cooking pasta” and “Preparing tea”.
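The hypothesis-construction step that such a lattice supports can be caricatured in a few lines: observed actions are matched against a small plan library, and the set of plans still consistent with the observations shrinks as actions accumulate. The sketch below is a toy illustration under invented plan and action names, not the DL-based model of [47].

```python
# Toy sketch of hypothesis construction: which library plans can still
# explain the observed actions? Plan and action names are invented.

PLAN_LIBRARY = {
    "cooking-pasta": ["take-pot", "fill-pot", "boil-water", "add-pasta"],
    "preparing-tea": ["take-kettle", "fill-kettle", "boil-water", "add-teabag"],
}

def is_subsequence(observed, plan):
    """True if the observed actions occur, in order, within the plan."""
    it = iter(plan)
    # `action in it` advances the iterator, enforcing the ordering.
    return all(action in it for action in observed)

def consistent_plans(observed):
    """Plans of the library that can still explain the observations."""
    return sorted(name for name, plan in PLAN_LIBRARY.items()
                  if is_subsequence(observed, plan))

print(consistent_plans(["boil-water"]))              # both plans remain possible
print(consistent_plans(["fill-pot", "boil-water"]))  # only cooking-pasta remains
```

An observation sequence explained by no library plan (the empty result) is what the lattice model would treat as an incoherent, dynamically generated hypothesis rather than a dead end.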

In this model, the uncertainty related to the anticipated patient's behavior is characterized by an intention schema. This schema corresponds to the lower bound of the lattice and is used to extract the anticipated incoherent plans, not pre-established in the knowledge base, that the patient may potentially carry out as a result


of the symptoms of his disease. However, this alone is not sufficient to disambiguate the relevant hypotheses. Therefore, the addition of a probabilistic quantification on the lattice structure [53] is an interesting and effective alternative, in the sense that it makes it possible to combine the symbolic approach for hypothesis construction with a probabilistic inferential process. The symbolic recognition agent filters the hypotheses by passing only a bounded lattice recognition space to the probabilistic inference engine, instead of considering the whole set of plans included in the library, as classical probabilistic approaches usually do. The probabilistic quantification that we propose is based on samples of observation frequencies obtained at the end of a training period while the system learns the usual routines of the patient. This knowledge allows us to create a profile of the patient that offers a relevant basis to accurately estimate the probabilities of possible on-going plans. This approach was implemented and tested in the Domus experimental infrastructure, where we simulated different scenarios based on 40 low-level actions and 10 activities of daily living. Each of these activities corresponds to a common kitchen task (cooking cake, cooking pasta, making tea, etc.) sharing several actions with other activities, in order to create a realistic context where plans can be interleaved and can lead to many different kinds of planning errors (realization, initiation, sequence, completion, etc.). The observation frequencies of the erroneous and coherent behaviours are based on the frequencies described in the study of Giovannetti et al. [54], conducted on 51 patients suffering from neurodegenerative diseases, including Alzheimer's disease. The results clearly show that the model recognizes all of the interleaved plans and realization-type errors, and 70% of the sequence-type errors.
These results are promising, as none of these recognized hypotheses were pre-established in the knowledge base; they were dynamically generated in the recognition space, according to the initially identified set of possible plans. However, our approach is limited by the fact that the first observed action is assumed to be correct (no errors) and coherent with the patient's goal. The problem is that in some of the scenarios we simulated, the patient started by performing an action that he was only supposed to carry out at a later stage. This limitation explains the 30% of unpredicted sequence errors and also why our system has trouble predicting initiation errors. On the other hand, we have also experimented with the approach in a concrete case by extending the system named COACH [49] [41], a cognitive aid for Alzheimer's patients that actively monitors a user attempting a handwashing task and offers assistance in the form of task guidance (e.g., prompts or reminders) when it is most appropriate. When an Alzheimer's patient is performing the handwashing activity, the system receives as observations a set of state variables obtained using cameras, such as the patient's hand location, the tap position (open or closed), etc., in order to determine the completion status of the task according to a previously handcrafted model. If the completion status of the task regresses or does not evolve for a certain period of time, the system computes the best possible solution to achieve the task and tries to guide the person to the next activity step.

10.1.2. Hierarchical Markovian Task Model

Over the last decade, there has been significant research and development on the Hidden Markov Model (HMM) formalism [55] as the predictive core of many systems. Several investigators have highlighted the importance of representing hierarchically structured complex activities with this dynamic probabilistic model. For instance, in


Pigot et al. [35] and Bauchet et al. [45], the recognition process is based on a model of activities where tasks are described using a hierarchical structure, as shown in Figure 5.


Figure 5. The ADL recognition process is based on a hierarchical model of activities.

The model includes two types of task nodes: the occupant's goals and the methods to complete them. Leaves are methods of terminal tasks, i.e. atomic ways to realize a concrete goal. Similar approaches can be found in hierarchical task network planning. However, this hierarchical model does not consider the set of subtasks as a predefined sequence, since there are numerous ways to realize an activity with a given method. Instead of generating all plausible sequences, rules are defined to generalize, for a given method, the criteria for the integration of subtasks: partial or total sequence, repetition and/or necessity constraints. Breaking those rules is considered an improper activity completion. To monitor the proper completion of activities, temporal information is introduced at the task nodes. It deals with the average time needed to realize the task and the time slot of completion. The validation of these constraints during task realization is done according to the Epitalk approach, a tutoring architecture used for generating adviser agents [56]. Each adviser manages a local model of the activity based on a hierarchical Markov model of the patient's habits, using an episodic memory. The activity is considered as an episode incorporating information on the method used for task completion, on the right time slots, locations, sequences of subepisodes, frequencies of the observed activities, and so on. Hence, the adviser agent is responsible both for recognizing a precise subtask and for providing assistance related to this task. The leaves of the model are connected to the IO events server and are fed


by low-level events triggered by the sensors. A bottom-up traversal of the hierarchy aggregates information to provide a larger view of what is going on. The main characteristic of this model is that plan recognition and the production of pieces of advice are combined into a single walk through the adviser tree. The principle is simple: each time a sensor triggers an event, it sends it to the corresponding terminal advisers. Then a bottom-up spreading is activated as follows: (i) each adviser (terminal or non-terminal) processes the information, either to issue local advice or to update a local model of the activity being observed; (ii) the adviser transmits to its direct father any information it considers relevant. This scheme is applied recursively for all advisers of the tree, terminal or non-terminal, until the root adviser is reached. Terminal advisers receive information directly from the host system, in particular sensors, whereas non-terminal advisers receive information from the advisers below them in the hierarchy. Compared to previous work, this model allows a more effective description of ADLs for cognitive assistance. Despite the good results shown in real-case assistance scenarios, the system appears somewhat limited in that it is only able to monitor one specific ADL, and the assistance agent reacts only after the user's error. This model constitutes the base component of the Archipel system described above in section 7.
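The bottom-up spreading described above can be sketched as follows. This is a minimal illustration of the adviser-tree idea, not the Epitalk implementation: adviser names, the event vocabulary and the toy advice rule are all invented.

```python
# Minimal adviser-tree sketch (not Epitalk itself): sensor events enter at a
# terminal adviser and are propagated bottom-up to the root, each adviser
# updating a local model and optionally issuing local advice via a rule.

class Adviser:
    def __init__(self, name, parent=None, rule=None):
        self.name = name
        self.parent = parent
        self.rule = rule          # optional local advice rule
        self.seen = []            # local model: events observed so far
        self.advice = []          # advice issued by this adviser

    def receive(self, event):
        self.seen.append(event)                  # (i) update local model
        if self.rule is not None:
            tip = self.rule(self.seen)
            if tip:
                self.advice.append(tip)          # (i) issue local advice
        if self.parent is not None:
            self.parent.receive(event)           # (ii) transmit to the father

def tap_rule(seen):
    """Toy rule attached to the terminal adviser for tap usage."""
    if seen[-1] == "tap-open":
        return "Remember to close the tap when done."

root = Adviser("prepare-breakfast")
make_tea = Adviser("make-tea", parent=root)
tap = Adviser("use-tap", parent=make_tea, rule=tap_rule)  # terminal adviser

tap.receive("tap-open")   # a sensor fires on the tap
print(root.seen)          # the root aggregates a larger view of the activity
print(tap.advice)
```

The single `receive` call thus performs both recognition (every ancestor's local model is updated) and assistance (the terminal adviser's rule fires), mirroring the single walk through the adviser tree.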


10.1.3. Bayesian Networks

In general, Bayesian networks are the principal technology used for performing activity recognition [57]. A typical approach is that taken in the Barista system [58], a fine-grained ADL recognition system that uses object IDs to determine which activities are currently being executed. It uses radio frequency identification (RFID) tags on objects and two RFID gloves that the user wears in order to recognize activities in a smart home. The system is composed of a set of sensors (RFID tags and gloves) that detect object interactions, a probabilistic engine that infers activities from the sensor observations, and a model creator that allows probabilistic models of activities to be created from, for instance, written recipes. The activities are represented as sequences of activity stages. Each stage is composed of the objects involved, the probability of their involvement, and, optionally, a time to completion modeled as a Gaussian probability distribution. The activities are converted into Dynamic Bayesian Networks (DBN) by the probabilistic engine. By using the current sub-activity as a hidden variable, and the set of objects seen and the time elapsed as observed variables, the engine is able to probabilistically estimate the activities from sensor data. The engine was also tested with hidden Markov models (HMM) in order to evaluate the accuracy of activity recognition. These models were trained with a set of examples where a user performs a set of interleaved activities. Some HMM variants perform poorly, whereas the DBN model was able to identify the specific on-going activity with a recognition accuracy higher than 80%, which is very impressive. This approach is able to identify the currently carried-out ADL in a context where activities can be interleaved. However, it does not take into account the erroneous realization of activities, because the result of the activity recognition is only the most plausible on-going ADL.
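A heavily simplified version of object-based activity inference can convey the intuition. The sketch below is a naive scoring scheme, not the Barista DBN: each activity model lists the objects involved with a probability of involvement, and the observed objects are scored independently. Activity names, objects and probabilities are invented.

```python
# Naive sketch of object-based activity inference (not the Barista engine):
# score each candidate activity by the product of the involvement
# probabilities of the objects actually touched. All values are invented.

ACTIVITY_MODELS = {
    "making-tea":    {"kettle": 0.9, "cup": 0.8, "teabag": 0.9, "pot": 0.05},
    "cooking-pasta": {"pot": 0.9, "pasta-box": 0.9, "cup": 0.1, "kettle": 0.2},
}

def most_likely_activity(observed_objects):
    """Return (best activity, all scores) for a list of touched objects."""
    scores = {}
    for activity, model in ACTIVITY_MODELS.items():
        p = 1.0
        for obj in observed_objects:
            p *= model.get(obj, 0.01)   # small floor for unmodelled objects
        scores[activity] = p
    return max(scores, key=scores.get), scores

best, scores = most_likely_activity(["kettle", "cup"])
print(best)   # -> making-tea
```

A real DBN additionally threads a hidden sub-activity variable through time, so that the order of observations and the elapsed time influence the estimate, which this independent-scoring toy ignores.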


10.2. User Modeling

In the past, user modeling focused on developing domain-dependent software architectures for user models. These artificial representations have been studied by the Human-Computer Interaction and Intelligent Tutoring Systems communities for years. However, developing applications for ambient living environments poses the challenge of continuously updating user models and, more importantly, implies being able to deal not only with ongoing technological developments (e.g. pervasive computing, wearable devices, sensor networks, etc.), but also with any type of objective, subjective or emotional user feature. Mostly, the proposed user models are based on learning the user's preferences, as in Lin's work [59], which builds a system to learn a dependency between the user's services and sensor observations. On the other hand, in order to contribute to this kind of future ambient user model, Casas et al. [60] use the persona concept to build a user model based on the persona's aptitudes, with the intention of creating an accurate, parameterized user profile that could be adjusted to determine the User Interface (UI) features most appropriate for a specific user at any time. They defined ten data-driven fictional characters, based on age, education, work, family situation, impairments, and technology background. For instance, to compensate for the visual impairment of elders, they developed an adaptive glass magnifier agent for checking emails according to the evolution of the user's disabilities. This work has argued for the importance of involving user modeling in the development of ambient assisted living.
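The persona-driven adaptation idea can be illustrated with a small parameterized profile. The sketch below is an invented example in the spirit of such profiles, not the actual model of Casas et al.: the persona fields and the mapping from impairment level to magnification are assumptions for illustration.

```python
from dataclasses import dataclass

# Illustrative persona-driven user model (fields and mapping are invented):
# a parameterized profile drives a UI feature, here a text magnifier.

@dataclass
class Persona:
    name: str
    age: int
    visual_impairment: int   # 0 (none) .. 3 (severe)
    tech_background: str

def ui_font_scale(persona):
    """Map the visual impairment level to a text magnification factor."""
    return {0: 1.0, 1: 1.25, 2: 1.5, 3: 2.0}[persona.visual_impairment]

claire = Persona(name="Claire", age=78, visual_impairment=2,
                 tech_background="novice")
print(ui_font_scale(claire))   # -> 1.5
```

Because the profile is explicit data rather than code, it can be adjusted over time as the user's disabilities evolve, which is the point of the adaptive magnifier example.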


10.3. Cognitive Modeling Based on Episodic Memory

The ability to remember where you have been, what you have sensed and what actions you have taken in various situations provides a knowledge base of information that is invaluable for acting in the present. Such an episodic memory supports several cognitive skills in the context of sensing, reasoning and learning [61]. Serna et al. [62] have conducted an investigation to explore how an episodic memory system based on the ACT-R [63] cognitive architecture can be exploited to assist Alzheimer's patients in a smart home context, in order to know from life habits, for example, how one cognitively and usually performs an activity. Like the SOAR system [64], which is used to build intelligent decision-making agents, ACT-R has been inspired by the work of Allen Newell on unified theories of cognition [65]. The knowledge is stored in episodic memory, a component of the declarative memory subsystem, in the form of memory chunks, each of which has a base level of activation that decays according to a power law of forgetting and increases through rehearsal. Specifically, Serna's work attempted to investigate whether episodic memory is sufficient to support cognitive capabilities across a range of tasks. They created a model of an Alzheimer's patient completing a cooking task, based on a study of mistaken behavior during the Kitchen Task Assessment (KTA) developed by occupational therapists [66]. They incorporated into the model the types of errors observed during the KTA as production rules, and used the chunks generated automatically during the execution of the system as a means to model cognitive phenomena such as memory loss, according to the increase in the number of errors committed over the course of the disease's progression. This model has been developed at the Domus lab as part of the project on home support for people suffering from cognitive disorders.
The result of this research demonstrates the effectiveness of computational cognitive modeling of


daily activities for ambient assisted living technologies. These technologies should be based on a better understanding of the people they seek to assist. Cognitive modeling is therefore an important step in designing cognitive assistive devices. Unfortunately, to our knowledge, few technology-based solutions have focused on building a computational cognitive model of the person's living environment.
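The "power law of forgetting" mentioned above is usually expressed in ACT-R as a base-level activation B = ln(Σ_j t_j^(−d)), where t_j is the time since the j-th rehearsal of a chunk and d is the decay rate (commonly d = 0.5). The sketch below computes this quantity; the rehearsal history is invented for illustration.

```python
import math

# Sketch of ACT-R base-level activation: activation decays with the age of
# each rehearsal following a power law and rises with repeated use.
# d = 0.5 is the common ACT-R default; the rehearsal ages are invented.

def base_level_activation(rehearsal_ages, d=0.5):
    """B = ln( sum over rehearsals of age**(-d) ), ages in seconds since use."""
    return math.log(sum(age ** (-d) for age in rehearsal_ages))

# A chunk rehearsed often and recently is more active than a neglected one.
recent = base_level_activation([10, 60, 300])   # used 3 times, recently
old    = base_level_activation([86400])         # used once, a day ago
print(recent > old)   # -> True
```

In a model of Alzheimer's progression, lowering the retrieval threshold margin relative to such activations is one plausible way to make memory-loss errors more frequent as the disease advances.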


10.4. What Are the AI Techniques Needed for Better Systems?

The development towards ambient computing will stimulate research in many fields of artificial intelligence, such as the multi-agent approach as a development paradigm for this open and highly dynamic environment [67] [68]. For forty years, artificial intelligence has been used on a large scale through expert system applications, web search agents, etc. If the Internet marked the advent of conventional planetary networks, the next evolution, which will support the development of artificial intelligence, concerns new challenging issues: how a network of ambient agents will be deployed within our natural living environment, and how each of these artificial agents, in the sense of multi-agent systems [69], will be endowed with the following ambient capacities: (i) ubiquity, which means that the agent must be able to interact with embedded heterogeneous electronic devices within the assisted-living environment using pervasive computing technology; (ii) context-awareness, based on ontological reasoning, to detect the localization and the involvement of objects and inhabitants in daily activities; (iii) natural interaction, for communicating intuitively with the occupant through a personalized multimodal interface; and finally (iv) intelligence, based on activity recognition and machine learning, in order to predict the behaviour of the inhabitant, allowing cognitive assistance as well as stimulation and thereby avoiding the rejection of such ambient technology. Hence, the question concerns the integration of these four characteristics of ambient agents into any object of everyday life.
For instance, if the door of the refrigerator is open, the associated ambient agent must be able to form an idea of the person's behaviour, e.g. that the door was opened in the context of meal preparation, while it communicates opportunistically with other objects of the habitat, for example the cooker's ambient agent. The stimulation to close a door forgotten because of memory loss can be provided through an intuitive interaction (a game) between the refrigerator's ambient agent and the occupant with disabilities, conveying the concept of the closed door. This new concept of ambient agent will inevitably bring about a major evolution in assisted living.
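The refrigerator scenario can be sketched as a tiny agent interaction. This is purely speculative code matching the scenario above, not an existing system: the agent classes, the peer query and the prompt text are all invented for illustration.

```python
# Speculative sketch of the refrigerator scenario (all names invented):
# the fridge's ambient agent asks a peer agent (the cooker's) whether the
# open door fits an on-going meal preparation, and prompts the occupant
# only when it does not.

class CookerAgent:
    """Peer agent that knows whether a meal is being prepared."""
    def __init__(self, cooking):
        self.cooking = cooking

    def meal_in_preparation(self):
        return self.cooking

class FridgeAgent:
    def __init__(self, cooker_agent):
        self.cooker = cooker_agent

    def on_door_open(self, open_seconds):
        # Opportunistic context exchange with the cooker's ambient agent:
        # a briefly open door during meal preparation is expected behaviour.
        if self.cooker.meal_in_preparation() and open_seconds < 120:
            return None
        return "The refrigerator door is open. Shall we close it together?"

fridge = FridgeAgent(CookerAgent(cooking=False))
print(fridge.on_door_open(open_seconds=300))
```

The point of the design is that neither agent needs a global model of the home: context is assembled on demand by querying peers, which is the ubiquity-plus-context-awareness combination listed above.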

11. Design and Ambient Assisted Living

The previous section explored extensively how AI can enhance cognitive assistants to render them less brittle and more relevant. But information and communication technology (ICT) is not necessarily THE solution. Using design to embed knowledge into objects does not require artificial intelligence, while it may dramatically improve and simplify technology-based solutions. This section describes the philosophy and examples of this research direction at DOMUS towards assistance and well-being for cognitively impaired people.


11.1. Technology and Its Impact on User Behavior

What distinguishes intelligent living environments from ordinary living spaces is their ability to sense and register information and to provide feedback and assist the inhabitants. A communication process takes place between the user and his physical surroundings via ICT and intelligent devices such as computer systems, TVs, cell phones, etc. Many of these devices and their services have easily found their place in people's daily lives and contributed to the transformation of social dynamics. Some suggest that their ubiquity not only provides instant access to communication and information, but more significantly contributes to the emergence and structure of new social cultures and changes the meaning of the concept of presence (physical, virtual, geographic, etc.). Intelligent products not only have the ability to facilitate communication and social interaction, they also have semiotic qualities and rhetoric abilities of their own. A cell phone in plain sight, for instance, suggests the presence of a third party, while its ringing not only announces a number of possibilities (a social event, an emergency or a pleasant conversation with a loved one) but also imposes itself on people's lives by interrupting and obliging the user to take immediate action. Even though it is the designers and engineers who determine the product features and interface, once in use, “intelligent products take on a relative autonomy on their own: the ability to affect the user, his actions and to alter his behavior.” Caron stresses: “it is the role of society to rethink, rectify, reinvent or legitimize behavior and practices of social interaction to instate rule of engagement that can be accepted and shared by all.” [70].
With the introduction of ICT, a whole range of services has been adopted (e-commerce, e-banking, e-health, intelligent environments), changing the way people work, access information and manage their daily lives, and transforming society as a whole. Many welcomed and embraced these communication technologies without hesitation, while others apprehend, fear or even reject them. Those who most apprehend them are often seniors and neophytes: those who feel oppressed by their ubiquity, those who feel overrun by technology and unable to cope with its perpetually changing nature, and those who worry about losing control over their lives. Such inaccessibility or rejection has led to new social phenomena, such as digital exclusion, a subject of numerous studies. Therefore many disciplines, including design, have adopted human-centered approaches in order to propose user-friendly and inclusive design solutions.

11.2. Interdisciplinary Approaches Towards Inclusive Design

When designing intelligent living environments for the aging population and people with reduced physical and cognitive capacities, one needs to look at the problem from a larger perspective and study the contextual environment at large. It is especially important to understand this particular type of user: his real challenges, what is meaningful to him, how he feels about himself, etc. Obviously, this task cannot be approached from a single disciplinary perspective. It requires experts' input, interdisciplinary approaches and transdisciplinary thinking. Therefore, designers have joined the DOMUS team in their interdisciplinary research on ambient assisted living, which seeks to promote autonomy and self-determination and to enable the user through pervasive technologies. By merging knowledge bases, designers were able to study the specific contextual environment of

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

S. Giroux et al. / The Praxis of Cognitive Assistance in Smart Homes


use, identified users' challenges and integrated expert knowledge in the process of creative problem solving. As part of the design research, designers tackled the following questions:

1. How to assist in completing a complex task (e.g., preparing a meal)
2. How to assist in locating objects
3. How to motivate, incite and engage in activities of daily living

The design research studied the environment of use, the context of use, the actors involved and the challenges they face, and ultimately identified design opportunities and design concepts likely to address the needs and further the research. In order to assess the complexity of a task and identify design opportunities, analytical scenarios were developed. This method involved a step-by-step photo documentation and analysis of a task (e.g., preparing a meal) and all its variables, followed by a categorization process which allowed the grouping of individual steps into structural phases: motivation, initiation, preparation, realization, etc. Other techniques involved the creation of personas or the generation of contextual user scenarios, which include all actors (the user, his family and friends, health professionals, neighbors, the community).

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

Various challenging elements have been identified during this assessment process:

- inadequate living environment (kitchen configuration, too many cabinets, non-disclosure of their content, ...)
- complexity of various appliances and interfaces (microwave oven, dishwasher, cooktop, etc.)
- complexity of the task itself (too many steps involved, complex recipe)
- complexity of produce and its packaging (recognition of an ingredient or its packaging, changing packaging design, information overload)

11.3. Complexity of Intelligent Products

For many seniors and people with reduced cognitive and intellectual capacities, the complexity of user interfaces, their logic and their configuration appear to be the biggest challenge. Many designers and manufacturers still overlook these problems. Simplifying user interfaces should therefore be a main focus for designers and software engineers. Technological progress in recent years has already contributed significantly to simpler interfaces: intelligent devices such as the iPod, the Wii interface, dual touch screens and projection technologies are only a few of those that have proliferated with great success among young and old, and these advances have made other applications possible. However, technological features such as digital displays, touch-sensitive screens, numeric keys, light-emitting diodes, sensors, etc. have become so inexpensive that manufacturers keep adding them mindlessly wherever they can. Evidently, issues such as complexity, legibility, compatibility, feature redundancy, information overload or coexistence with other equipment are barely taken into account. As a result, many devices remain complex and unsuitable for many elderly


and people with intellectual disabilities. Either the interfaces are too intricate, the push buttons too small, the typeface illegible or the pictograms incomprehensible. Such a misfit mainly arises when designers or engineers unconsciously transpose their own cognitive model onto products without taking into consideration the cognitive model of the end user. Indeed, "simple" or "intuitive" can be interpreted quite differently by designer and user. Thus, before engaging in the process of designing intelligent living spaces, it is critical to comprehend how users perceive, interact with and experience their environment and products, and how they infer meaning.

11.4. Universal Design

Universal design refers to design solutions that are accessible, comprehensible and intuitive for all, regardless of age, ability or status, and that avoid stigmatization and digital exclusion. Today more than ever, designers are sensitive to the need for simple and meaningful products, especially considering the complexity and continuously changing nature of digital products and devices. Physical or psychological barriers can be reduced if user interfaces are intuitive and decipherable. Some principles for universal design developed by the Center for Universal Design, North Carolina State University, are [71] [72]:

Flexibility – Space configuration, interface options and modes of communication (visual, acoustic, tactile, feedback) need to be flexible enough to adjust to user preferences and potentially evolving cognitive ability.

Tolerance – The user interface should be able to anticipate and tolerate missteps, unintended use, input errors, etc., and propose corrective measures or alternatives.


Minimize physical and cognitive effort – Cognitive and physical effort should be reduced (unless desired, e.g. to stimulate mental function).

Intuitive use – The interface should be obvious, its features easily identifiable and its purpose recognizable, allowing the user to anticipate actions and consequences.

Perceptible – Essential information (in all modes: visual, acoustic, etc.) needs to be made available in a comprehensible and legible way to ensure effective communication. The message has to be distinguishable and its meaning decipherable from the surrounding context:

- the purpose is easily identifiable
- essential information is perceptible (in a variety of modes: acoustic, tactile, symbolic)
- actions can be anticipated
- the interface is obvious and its signs (symbols, icons, pictograms, indexes) comprehensible.
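The Tolerance principle listed above lends itself to a brief illustration. The sketch below is a hypothetical example, not part of the DOMUS work: the command list and the 0.6 similarity cutoff are invented. It accepts near-miss input and proposes the closest valid command instead of failing outright.

```python
# Illustrative sketch only: COMMANDS and the 0.6 cutoff are hypothetical
# choices for this example, not values from the chapter.
import difflib
from typing import Optional, Tuple

COMMANDS = ["start oven", "stop oven", "open recipe", "call help"]

def interpret(user_input: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (command, suggestion): exact input maps to a command,
    while a near-miss yields a corrective suggestion instead of an error."""
    text = user_input.strip().lower()
    if text in COMMANDS:
        return text, None
    # Propose the closest known command as a corrective alternative.
    close = difflib.get_close_matches(text, COMMANDS, n=1, cutoff=0.6)
    return None, (close[0] if close else None)
```

A real assistive interface would of course confirm such a suggestion with the user rather than act on it silently.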


11.5. Low-Tech and Non-Tech Modes of Communication

11.5.1. The Role of Light

Light has an effect on a person's mental and physical health and is considered a regulator of the human biological cycle, influencing sleep patterns, body temperature, hunger, energy level, mental alertness, emotion, etc. Disturbances in circadian rhythms can trigger a number of health problems, such as irritability, loss of appetite, sleeping disorders, depression or seasonal affective disorder [73]. As an example, appropriate lighting can improve sleep patterns and brain activity [74] and stimulate a person during his or her morning routine, while dimmed lighting can reduce visual strain and produce calming effects.

Light also provides critical information and visual feedback about one's surroundings: about 85% of all sensory information is absorbed through the visual system. Designers exploit the advantages of both natural and artificial light sources when creating modern living spaces, distinguishing direct and indirect lighting, task lighting, ambient lighting, accent lighting, decorative lighting, etc. The size of a lit area, the light source, the light intensity, the color of the light and direct versus indirect light are all attributes that not only contribute to the way someone perceives the built environment but also significantly affect a person's mood. Orange lighting, for instance, is supposed to set an intimate mood and lessen depression, but could have undesirable effects on people suffering from insomnia; blue tints, on the other hand, tend to be calming and facilitate sleep [75]. Red lights are often associated with danger and action, and are therefore largely used as limitation indicators in user interface design.

When lighting spaces, reflective surfaces ought to be chosen carefully, since they tend to produce glare that contributes to visual fatigue. In severe circumstances or with long-term exposure, glare can lead to visual impairment. Strong contrast between a light source and the surrounding light conditions, or heavy contrasts in color, should be carefully weighed, since they too can trigger visual stress.


11.5.2. Communication Between a User and His Surroundings

Information can be transmitted to and perceived by people in a number of ways, and designers and software engineers should not neglect the human's unique ability to perceive information through the sensory organs, which enable him to see, hear, touch, taste and smell. Interface solutions do not always have to be technological, and technological solutions do not always require a digital interface, a monitor or numeric controls. Some low-tech forms of communication can be sufficient, such as the use of signs (symbols, icons, indexes) that are more or less invasive and can assist a person in locating an object, for instance. Activated by sensors of various types (light, weight, temperature, sound, movement), the intelligent environment can be programmed to sense and intervene if desired. Typical visual indicators are LEDs, light signals, labels and projected symbols or icons (Figure 6). Acoustic indicators can produce sounds such as a jingle, a tune, a bell or speech, activated or deactivated in multiple ways.
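How such sensor-triggered interventions might be wired together can be sketched as a simple routing table. Everything below (the sensor types, the indicator kinds and the pairings) is an illustrative assumption for the sake of the example, not the DOMUS implementation:

```python
# Hypothetical routing of sensor events to low-tech indicators.
# Sensor names, indicator kinds and pairings are invented for this sketch.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Indicator:
    kind: str    # "visual" or "acoustic"
    action: str  # what the environment does when the sensor fires

ROUTING = {
    "motion": Indicator("visual", "illuminate area"),
    "weight": Indicator("visual", "highlight cupboard"),
    "sound":  Indicator("acoustic", "play reassuring jingle"),
}

def intervene(sensor_type: str, enabled: bool = True) -> Optional[Indicator]:
    """Return the indicator to activate, or None: the resident can
    disable interventions, and unknown sensor types are ignored."""
    if not enabled:
        return None
    return ROUTING.get(sensor_type)
```

The `enabled` flag reflects the text's "intervene if desired": the environment stays silent unless the resident has opted in.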


Figure 6. Projectable information (steady, pulsating, increasing in size).

More or less explicit information can be diffused through light. An illuminated area (activated by motion sensors) can indicate the presence of a person; the change from a non-lit to a lit state can likewise attract a person's attention, and changing light intensity can act as an attention seeker. The intelligent built living space can assist its user in locating an object by simply illuminating a localized area, a drawer or a cupboard, thus revealing its content. A non-technological form of communication, on the other hand, is the configuration of a product or space itself. A specific kitchen layout (sink, work surface, appliances, utensil storage) can be suggestive of the procedural sequence of a certain task. Figure 7 illustrates the concept of a kitchen workstation that, through its linear configuration, suggests:


1. washing and preparing produce,
2. gathering and preparing ingredients,
3. cooking,
4. enjoying the meal.

In this design concept, technology intervenes progressively and only if desired. In the process of realizing a task, additional assistance can be provided on demand: since the weight-sensitive flooring can detect a person's position in space, a touch-sensitive interface can be offered whenever needed, activating, for example, the projection of a "how to ..." demonstration video on a desired surface (wall, screen, cabinet door, work surface, ...). Such solutions are sensitive to the user's context and flexible enough to accommodate different types of users and avoid stigmatization.
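The progressive, on-demand intervention described above can be outlined as follows. The zone names follow the four workstation steps, but the function name and the returned assistance strings are assumptions made for this example only:

```python
# Sketch of zone-based, on-demand assistance. Zone names mirror the
# four workstation steps; the assistance strings are invented.
from typing import Optional

ZONES = ["washing", "ingredients", "cooking", "eating"]

def assistance_for(zone: str, help_requested: bool) -> Optional[str]:
    """Offer a demonstration video only for the zone the (hypothetical)
    weight-sensitive floor places the resident in, and only on demand."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone}")
    if not help_requested:
        return None  # stay unobtrusive by default
    return f"project 'how to' video for the {zone} step"
```

Keeping the default path silent mirrors the design intent: assistance appears only when the resident asks for it, which helps avoid stigmatization.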

Figure 7. LED command in the work surface: the indicator for the active interface illuminates and invites the user to interact.


11.5.3. Communicating with Sound

Information and communication technologies mainly make use of humans' audiovisual capacities, and sound is being employed more consciously as a command feature. Voice-activated devices and interfaces are indeed becoming increasingly popular, even though some still lack precision and reliability. Sound experts have been especially involved in developing new sound indicators (positive, negative, reassuring, alerting) for technological products and environments, a new avenue that is being explored by many sound designers.

Interior design trends show that hard surfaces (glass, concrete, stone, stainless steel, ceramic, hardwood) are increasingly popular. Unfortunately, these surfaces become a source of reverberation and undesirable noise. Appliances and technical equipment generate sounds, either unintentionally (e.g., due to vibration) or intentionally, as a new form of communication signaling a certain state of the equipment: a toaster that has finished toasting, an oven requiring attention, etc. The proliferation of sound-producing equipment in home environments can have negative effects, such as difficulty distinguishing voices or a ringing phone. According to Kopec, "slight but continual noise can lead to reduced mental alertness, annoyance, irritability and can have extremely negative affect on the psychological and physical wellbeing". He also suggests that unpleasant sounds can trigger physical reactions such as an increase in blood pressure, muscular tension and the release of stress hormones, whereas pleasant sounds are capable of stimulating people and inciting immediate action [76]. To control noise levels and achieve a well-balanced atmosphere, interior designers can make use of sound-absorbing materials on ceilings, walls, windows and floors.
Nevertheless, if managed appropriately, sound can be a highly desirable mode of communication, capable of creating a stimulating or calming atmosphere in a living environment.


11.5.4. Design Implications

Overall, design needs to consider the aging process and the gradual changes a human body experiences over time. Among others, these changes include a deterioration of the perceptive system and may result in reduced touch sensitivity, reduced visual capacity, hearing difficulties and/or memory loss. Design solutions need to be responsive to such biological, physical and psychological changes. They should provide appropriate lighting and manage its intensity, color, glare, etc., and should also assure the legibility of user interfaces by carefully weighing the use of signs and by testing the legibility (size, color, contrast, detailing) and comprehension of information.

12. Conclusion

The population of cognitively impaired people is not restricted to dementia and Alzheimer's disease; schizophrenia, traumatic brain injury and intellectual disabilities also strike a significant part of the population. Cognitive deficits therefore entail high human, social and economic costs. Cognitively impaired people would prefer to stay at home and live autonomously, and for many of them this would be possible if assistance was


provided. But resources are scarce and relatives have to carry this load. Fortunately, current advances in technology could give a helping hand. At the Domus lab, research on smart homes, cognitive assistance and tele-vigilance has produced promising prototypes of pervasive cognitive assistance for the morning routine and meal preparation inside one's home. Pervasive computing, artificial intelligence and tangible user interfaces are the key fields. Cognitive assistants perform activity recognition, achieve personalization thanks to user models, reason on low-level events sent by sensors, use contextual information and localization, understand device capabilities, and finally interact with the resident. Interactions can be triggered at the user's request or on the assistant's initiative. Traditional graphical user interfaces may be used, but tangible user interfaces are also investigated to provide a seamless integration into the resident's activities. Indeed, the whole home itself becomes the interface and acts as a cognitive orthosis. As a complement to pervasive assistants, a mobile assistant was also presented, enabling the resident to go outside while still benefiting from cognitive assistance. These prototypes showed a lack of flexibility and adaptability, though they proved useful and relevant in specific contexts during usability studies involving cognitively impaired people. Current Domus research therefore builds on AI, knowledge representation and context-awareness to overcome these limitations in pervasive environments; this seems a very promising avenue. Nonetheless, for the sake of the assistance and well-being of cognitively impaired people, Domus is also conducting research in design, in search of non-technical solutions to hard problems. To paraphrase Rodney Brooks' famous phrase "intelligence without representation" [77], this last avenue raises the question of "pervasive computing without context".


Acknowledgments

All our grateful thanks to all of those who have contributed to the Domus research project, especially Francis Bouchard, Sophie Cayouette, Yves Lachapelle, Virginie Lapointe, Dany Lussier-Desrochers, Philippe Mabilleau, Nicolas Marcotte, Blandine Paccoud, Juliette Sablier, Jean-Pierre Savary, Emmanuel Stip, Denis Vergnès.

References

[1] United Nations, World Population Ageing, 1950-2050. United Nations, 2002.
[2] C. P. Ferri, M. Prince, C. Brayne, H. Brodaty, L. Fratiglioni, M. Ganguli, K. Hall, K. Hasegawa, H. Hendrie, Y. Huang, A. Jorm, C. Mathers, P. R. Menezes, E. Rimmer and M. Scazufca, "Global prevalence of dementia: a Delphi consensus study," The Lancet, vol. 366, pp. 2112-2117, Dec. 17, 2005.
[3] C. Laberge-Nadeau, S. Messier and I. Huot, Guide des services offerts aux blessés de la route, au Québec. Laboratoire sur la sécurité des transports du Centre de recherche sur les transports de l'Université de Montréal, 1998, pp. 212.
[4] Springhouse, Professional Guide to Diseases, 8th ed. Lippincott Williams & Wilkins, 2005.
[5] E. Q. Wu, H. G. Birnbaum, L. Shi, D. E. Ball, R. C. Kessler, M. Moulis and J. Aggarwal, "The economic burden of schizophrenia in the United States in 2002," The Journal of Clinical Psychiatry, vol. 66, pp. 1122-1129, Sep. 2005.


[6] J. Chelly, M. Khelfaoui, F. Francis, B. Chérif and T. Bienvenu, "Genetics and pathophysiology of mental retardation," European Journal of Human Genetics, vol. 14, pp. 701-713, 2006.
[7] B. Ullmer and H. Ishii, "Emerging frameworks for tangible user interfaces," IBM Systems Journal, vol. 39, 2000.
[8] G. D. Abowd, A. K. Dey, R. Orr and J. A. Brotherton, "Context-awareness in wearable and ubiquitous computing," in ISWC, 1997, pp. 179-180.
[9] D. Vergnes, S. Giroux and D. Chamberland-Tremblay, "Interactive assistant for activities of daily living," in From Smart Home to Smart Care, July 4-6, Sherbrooke, Canada, 2005.
[10] E. D. Mynatt, J. Rowan, S. Craighill and A. Jacobs, "Digital family portraits: Providing peace of mind for extended family members," 2001, Seattle, Washington, pp. 333-340.
[11] M. Weiser, "The computer for the 21st century," Scientific American, pp. 94-104, 1991.
[12] M. E. Pollack, "Planning technology for intelligent cognitive orthotics," in 6th International Conference on Automated Planning and Scheduling, 2002, pp. 322-331.
[13] E. LoPresti, A. Mihailidis and N. Kirsch, "Assistive technology for cognitive rehabilitation: State of the art," Neuropsychological Rehabilitation, vol. 14(1/2), pp. 5-39, 2004.
[14] B. Boussemart and S. Giroux, "Tangible user interfaces for cognitive assistance," in 21st International Conference on Advanced Information Networking and Applications Workshops (AINAW'07), 2007.
[15] J. Bauchet, S. Giroux, H. Pigot, D. Lussier-Desrochers and Y. Lachapelle, "Pervasive assistance in smart homes for people with intellectual disabilities: A case study on meal preparation," IJARM, vol. 9, pp. 53-65, December 2008.
[16] H. Pigot, A. Mayers and S. Giroux, "The intelligent habitat and everyday life activity support," in 5th International Conference on Simulations in Biomedicine, April 2003.
[17] H. Pigot, B. Lefebvre, J. Meunier, B. Kerhervé, A. Mayers and S. Giroux, "The role of intelligent habitats in upholding elders in residence," in 5th International Conference on Simulations in Biomedicine, April 2-4, 2003, Slovenia, pp. 497-506.
[18] B. Reisberg et al., "Signs, symptoms and course of age-associated cognitive decline," in S. Corkin et al. (Eds.), Aging: Alzheimer's Disease: A Report of Progress. New York: Raven Press, 1982, pp. 177-182.
[19] H. Pigot, J. P. Savary, J. L. Metzger, A. Rochon and M. Beaulieu, "Advanced technology guidelines to fulfill the needs of the cognitively impaired population," in 3rd International Conference on Smart Homes and Health Telematics (ICOST), Assistive Technology Research Series, July 4-6, 2005, Magog, Canada, pp. 25-32.
[20] S. Giroux, H. Pigot, J. Moreau and J. Savary, "Distributed mobile services and interfaces for people suffering from cognitive deficits," in Handbook of Research on Mobile Multimedia, 2006, pp. 544-554.
[21] B. Paccoud, D. Pache, H. Pigot and S. Giroux, "Report on the impact of a user-centered approach and usability studies for designing mobile and context-aware cognitive orthosis," ICOST, 2007.
[22] S. Giroux and H. Pigot, "Mobile devices to enhance interactions between cognitive impaired people and medical staff," in 2nd International Conference on Smart Homes and Health Telematics, 15-17 September 2004, Singapore, pp. 261-268.
[23] D. Chamberland-Tremblay, S. Giroux, C. Caron and M. Berthiaume, "Space-mediated learning at the locus of action in a heterogeneous team of mobile workers," Feb. 1-7, 2009, Cancun, Mexico, pp. 35-40.
[24] E. Stip and V. Rialle, "Environmental cognitive remediation in schizophrenia: Ethical implications of smart home technology," Canadian Journal of Psychiatry, vol. 50, pp. 281-291, 2005.
[25] Y. Rahal, P. Mabilleau and H. Pigot, "Bayesian filtering and anonymous sensors for localization in a smart home," in Proceedings of the 21st IEEE International Conference on Advanced Information Networking and Applications (AINA'07), May 21-23, Niagara Falls, Ontario, Canada, 2007.
[26] S. Giroux and S. Guertin, "A pervasive reminder system for smart homes," Sept. 15-17, Singapore, 2004.
[27] C. Gouin-Vallerand, B. Abdulrazak, S. Giroux and M. Mokhtari, "A self-configuration middleware for smart spaces," International Journal of Smart Home, vol. 3, pp. 7-16, January 2009.
[28] C. Gouin-Vallerand and S. Giroux, "Managing and deployment of applications with OSGi in the context of smart homes," in 3rd IEEE International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob 2007), New York, USA, 8-10 October 2007.
[29] S. Giroux, D. Carboni, G. Paddeu, A. Piras and S. Sanna, "Delivery of services on any device: From Java code to user interface," in 10th International Conference on Human-Computer Interaction, vol. 1-2, Crete, Greece, June 22-27, 2003.
[30] P. Busnel, P. El-Khoury, S. Giroux and K. Li, "Achieving socio-technical confidentiality using security pattern in smart homes," in Second International Conference on Future Generation Communication and Networking (FGCN '08), 13-15 Dec. 2008, pp. 447-452.


[31] M. E. O'Connell, C. A. Mateer and K. A. Kerns, "Prosthetic systems for addressing problems with initiation: guidelines for selection, training, and measuring efficacy," NeuroRehabilitation, vol. 18, pp. 9-20, 2003.
[32] T. Meulemans, F. Collette and M. Van der Linden, Neuropsychologie des fonctions exécutives. Solal, 2004.
[33] R. J. Perry and J. R. Hodges, "Attention and executive deficits in Alzheimer's disease. A critical review," Brain, vol. 122 (Pt 3), pp. 383-404, Mar. 1999.
[34] E. Vakil, "The effect of moderate to severe traumatic brain injury (TBI) on different aspects of memory: a selective review," J. Clin. Exp. Neuropsychol., vol. 27, pp. 977-1021, Nov. 2005.
[35] H. Pigot, D. Lussier-Desrochers, J. Bauchet, Y. Lachapelle and S. Giroux, "A smart home to assist recipes' completion (extended version)," in Technology and Aging, Assistive Technology Research Series, vol. 21, A. Mihailidis, H. Kautz and J. Boger, Eds. IOS Press, 2008.
[36] S. Giroux and H. Pigot, "Computing and outdoors mobile computing for assisted cognition and telemonitoring," in 9th International Conference on Computers Helping People with Special Needs, July 7-9, 2004, Paris, pp. 953-960.
[37] H. Pigot and S. Giroux, "Keeping in touch with cognitively impaired people: How mobile devices can improve medical and cognitive supervision," in 2nd International Conference on Smart Homes and Health Telematics (ICOST 2004), Sept. 15-17, 2004, Singapore.
[38] J. F. Moreau, H. Pigot and S. Giroux, "Assistance to cognitively impaired people and distance monitoring by caregivers: A study on the use of electronic agendas," in International Conference on Aging, Disability and Independence (ICADI), February 1-5, 2006, St. Petersburg, FL, USA, pp. 29-30.
[39] S. Giroux, H. Pigot, B. Paccoud, D. Pache, E. Stip and J. Sablier, "Enhancing a mobile cognitive orthotic - A user-centered design approach," International Journal of Assistive Robotics and Mechatronics, vol. 9, pp. 36-47, 2008.
[40] J. C. Augusto and C. D. Nugent, Designing Smart Homes: The Role of Artificial Intelligence, State of the Art Survey, vol. LNAI 4008, Springer-Verlag, 2006, p. 183.
[41] A. Mihailidis, J. Boger, M. Canido and J. Hoey, "The use of an intelligent prompting system for people with dementia," Interactions, vol. 14, pp. 34-37, 2007.
[42] T. Barger, M. Alwan, S. Kell, B. Turner, S. Wood and A. Naidu, "Objective remote assessment of activities of daily living: Analysis of meal preparation patterns," Medical Automation Research Center, University of Virginia Health System, Charlottesville, 2002.
[43] L. Liao, H. Kautz and D. Fox, "Learning and inferring transportation routines," in Proceedings of the 19th National Conference on Artificial Intelligence (AAAI), 2004.
[44] M. Philipose, K. P. Fishkin, M. Perkowitz, D. J. Patterson, D. Hahnel, D. Fox and H. Kautz, "Inferring activities from interactions with objects," IEEE Pervasive Computing, vol. 3-4, pp. 50-57, 2004.
[45] J. Bauchet and A. Mayers, "A modelisation of ADLs in its environment for cognitive assistance," in 3rd International Conference on Smart Homes and Health Telematics (ICOST), July 4-6, 2005, Magog, Canada, pp. 221-228.
[46] P. Naîm, P. H. Wuillemin, P. Leray, O. Pourret and A. Becker, Réseaux bayésiens. Eyrolles, 2004.
[47] B. Bouchard, A. Bouzouane and S. Giroux, "A keyhole plan recognition model for Alzheimer's patients: First results," Journal of Applied Artificial Intelligence (AAI), vol. 22(7), pp. 1-34, July 2007 (accepted for publication).
[48] P. C. Roy, B. Bouchard, A. Bouzouane and S. Giroux, "A hybrid plan recognition model for Alzheimer's patients: Interleaved-erroneous dilemma," presented at the 2007 IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT'07), November 2-5, Fremont, CA.
[49] B. Bouchard, P. C. Roy, A. Bouzouane, S. Giroux and A. Mihailidis, "Towards an extension of the COACH task guidance system: Activity recognition of Alzheimer's patients," in ECAI'08 3rd Workshop on Artificial Intelligence Techniques for Ambient Intelligence (AITAmI'08), 2008, Patras, Greece, pp. 16-20.
[50] H. Kautz, O. Etzioni, D. Fox and D. Weld, "Foundations of assisted cognition systems," University of Washington, Tech. Rep. CSE-02-AC-01, 2003.
[51] S. Carberry, "Techniques for plan recognition," User Modeling and User-Adapted Interaction, vol. 11, pp. 31-48, 2001.
[52] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi and P. F. Patel-Schneider, The Description Logic Handbook. Cambridge University Press, New York, NY, USA, 2007.
[53] B. Bouchard, "Un modèle de reconnaissance de plan pour les personnes atteintes de la maladie d'Alzheimer basé sur la théorie des treillis et sur un modèle d'action en logique de description," pp. 268, 2006.
[54] T. Giovannetti, D. J. Libon, L. J. Buxbaum and M. F. Schwartz, "Naturalistic action impairments in dementia," Neuropsychologia, vol. 40, pp. 1220-1232, 2002.
[55] L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, Feb. 1989, pp. 257-286.


[56] G. Paquette, F. Pachet, S. Giroux and J. Girard, "Epitalk, generating advisor agents for existing information systems," Artificial Intelligence in Education, vol. 7(3-4), pp. 349-379, 1996.
[57] M. E. Pollack, "Intelligent technology for an aging population: The use of AI to assist elders with cognitive impairment," AI Magazine, vol. 26(2), pp. 9-24, 2005.
[58] D. J. Patterson, H. A. Kautz, D. Fox and L. Liao, "Pervasive computing in the home and community," in Pervasive Computing in Healthcare, J. E. Bardram, A. Mihailidis and D. Wan, Eds. CRC Press, 2007, pp. 79-103.
[59] Z. Lin and L. Fu, "Multi-user preference model and service provision in a smart home environment," in IEEE International Conference on Automation Science and Engineering, 2007, pp. 759-764.
[60] R. Casas, R. Blasco Marín, A. Robinet, A. R. Delgado, A. R. Yarza, J. McGinn and R. Picking, "User modelling in ambient intelligence for elderly and disabled people," in Proceedings of the 11th International Conference on Computers Helping People with Special Needs, 2008, pp. 114-122.
[61] A. M. Nuxoll, "Enhancing Intelligent Agents with Episodic Memory," 2007.
[62] A. Serna, H. Pigot and V. Rialle, "Modeling the progression of Alzheimer's disease for cognitive assistance in smart homes," UMUAI, vol. 17, pp. 415-438, 2007.
[63] D. W. Glasspool and R. Cooper, "Executive processes," in Modelling High Level Cognitive Processes, R. Cooper, Ed. New Jersey: Lawrence Erlbaum Associates, 2002, pp. 313-362.
[64] J. Laird, P. Rosenbloom and A. Newell, "Soar: An architecture for general intelligence," Artificial Intelligence Journal, vol. 33, pp. 1-64, 1987.
[65] A. Newell, Unified Theories of Cognition. Cambridge (Mass.): Harvard University Press, 1990, pp. 549.
[66] C. Baum and D. F. Edwards, "Cognitive performance in senile dementia of the Alzheimer's type: the Kitchen Task Assessment," American Journal of Occupational Therapy, vol. 47, pp. 431-436, 1993.
[67] T. Patkos, A. Bikakis, G. Antoniou, M. Papadopouli and D. Plexousakis, "Distributed AI for ambient intelligence: Issues and approaches," in Ambient Intelligence, Proceedings of the European Conference AmI 2007, Darmstadt, Germany, November 7-10, 2007. Springer-Verlag, LNCS 4794, 2007, pp. 159-176.
[68] C. Ramos, J. C. Augusto and D. Shapiro, "Ambient intelligence: the next step for artificial intelligence," IEEE Intelligent Systems, vol. 23, pp. 5-18, 2008.
[69] G. Weiss, Multiagent Systems, a Modern Approach to Distributed Artificial Intelligence. MIT Press, 1999.
[70] A. H. Caron, Culture mobile, les nouvelles pratiques de communication. Les Presses de l'Université de Montréal, 2005.
[71] R. L. Mace, G. J. Hardie and J. P. Place, Accessible Environments: Toward Universal Design. Center for Accessible Housing, North Carolina State University, 1990.
[72] Center for Universal Design, "Principles of universal design."
[73] P. Boyce, Human Factors in Lighting. London: Taylor & Francis, 2003.
[74] M. S. Rea, M. G. Figueiro and J. D. Bullough, "Circadian photobiology: An emerging framework for lighting practice and research," Lighting Research and Technology, vol. 34, pp. 177, 2002.
[75] S. Chiazzari, Healing Home: Creating the Perfect Place to Live with Colour, Aroma, Light and Other Natural Elements, 1998.
[76] D. Kopec, "Health Sustainability and the Built Environment," 2009.
[77] R. A. Brooks, "Intelligence without representation," 1997.
[78] S. Saha, D. Chant, J. Welham and J. McGrath, "A systematic review of the prevalence of schizophrenia," PLoS Medicine, vol. 2, pp. 413-433, May 2005.

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,


Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-048-3-212

Avatar communication for a social in-home exercise system: a user study


Joyca LACROIX 1, Yasmin AGHAJAN 2 and Aart VAN HALTEREN

Philips Research, Eindhoven High-Tech Campus, The Netherlands

Abstract. Although many people are aware of the beneficial effects of physical exercise on health, they often experience difficulties in finding time and motivation to incorporate regular gym visits into their busy schedules. Exercising at home is a viable alternative, but lacks the human coaching and social support that is available in traditional exercise settings. Modern technology opens an enormous space of possibilities for enriching the in-home exercise experience with motivational coaching and social factors through interactive monitoring, feedback, and social connectedness. In this chapter we introduce an interactive social in-home exercise system, which enables the monitoring and sharing of exercise movements through an avatar in a social context (composed of a coach or other exercisers). We conduct a survey study in two user groups to explore to what extent users appreciate the use of avatars to visualize their exercise movements and to gain insight into preferences with respect to the level of personalization of avatars in various social contexts. Also, we examine the relationship between several user characteristics (body image, gender, and age) and these avatar preferences. The results show that individuals overall like the use of avatars to visualize exercise movements. Moreover, we find that the appreciation of visualizing exercise movements through an avatar and the preferences with respect to the level of personalization vary with the type of social context, with a greater appreciation for the avatar when exercising in private than when the avatar is communicated in a social context of unknown others. Also, in one group we find a greater appreciation to share the avatar with a coach than with friends or with unknown others. Finally, the results suggest that body image, gender, and age play an important role for avatar preferences.
Keywords. Avatar preferences, monitoring and feedback, social connectedness, motivation, in-home exercise system

1 Corresponding Author. E-mail: [email protected]
2 Yasmin Aghajan did an assignment at Philips Research.

A plethora of published research in recent years has underscored the relationship between physical activity and physical as well as mental well-being [31]. It has been shown that regular physical exercise, combined with a healthy diet, has many beneficial effects, including the prevention of heart disease, diabetes, and obesity, reduced blood pressure, and improved self-esteem and energy [31,20,28]. As a consequence of these findings, public awareness and efforts to promote exercise behavior at home and in fitness centers have considerably increased. The focus on a healthier lifestyle and on increased physical fitness has prompted people to introduce exercise programs into their daily lives. Nevertheless, while the benefits of exercise are widely known, preoccupation with career demands and other daily obligations has caused difficulties for many in finding the time and energy to incorporate regular exercise at the gym or outside the home into their schedules.

Exercising at home is a viable alternative, but it has two main disadvantages compared to exercising in the gym. The first disadvantage concerns the absence of human coaches, who often guide a person in creating and following an effective exercise program in a gym and can provide educational and motivational feedback. The second disadvantage concerns the lack of social factors, such as the presence of other exercisers or the company of an exercise friend, that may help to enhance motivation and realize commitment to previously formulated exercise intentions.

Modern technology opens an enormous space of possibilities for enabling such motivational coaching or social support in situations where human face-to-face contact is missing. Many technological solutions have been developed for monitoring the user’s exercise behavior and the physiological variables relevant for exercising (e.g., heart rate), aiming to enrich the exercise experience. In accordance with studies from the field of traditional sports and exercise psychology that showed the importance of feedback for motivation and performance [16], studies from the domain of persuasive and mobile technology have shown that monitoring and feedback can indeed be a powerful way to enhance motivation and performance in health interventions [38,3,10,25,43]. Several studies showed that measuring physical activity (e.g., with a pedometer, activity monitor, or self-report) and providing feedback about performance increased physical activity behavior [3]. [3] conducted a systematic review of pedometer intervention studies and found that, overall, pedometer users significantly increased their physical activity and decreased their body mass index.
[38] performed a randomized controlled trial aimed at testing the effect of physical activity measurement on physical activity behavior and found that the mere monitoring of physical activity already had a positive effect on physical activity behavior, which they expected to be due to an increased awareness of physical activity level. Other studies combined setting a physical activity goal, providing feedback about progress towards the goal, and coaching through a web service and e-mail, all of which led to an increase in physical activity [25,43].

In addition to individual goals and feedback about monitored performance, studies from the domain of sports and exercise psychology have demonstrated that social factors can have a major impact on exercise behavior and motivation. Not only can the role of a coach or trainer be quite influential [16], but the social exercise context created by friends and family can also play an important role in determining exercise behavior [42,40,29]. Moreover, research into online social networks and virtual communities has demonstrated the persuasive dynamics within these networks [13].

On the basis of the literature, we believe that enhancing the concept of exercising at home with interactive monitoring and coaching elements and with elements of social connectedness (i.e., connecting exercisers with a coach or with other exercisers) may be a powerful way to provide the coaching and social support that helps people remain motivated and committed to their exercise intentions. In particular, technology can be employed to monitor exercise movements while providing real-time feedback and instructions for a more efficient and sound exercise session. Furthermore, computer-mediated interactions to connect with others may also enrich the user’s experience and enhance motivation through social factors, thereby increasing the enjoyment and frequency of exercising.

One way to monitor and provide feedback about the exercise movements of an exerciser is through the use of a real-time image processing system. The user is presented with a three-dimensional (3D) model of the body and its movements on a screen. This model is constructed from camera observations. The system may be used not only to give feedback about exercise movements to the exerciser, but also to share the exerciser’s model with relevant others, such as a trainer/coach, an exercise buddy, or a group of other exercisers in a virtual community. Depending on user preferences, the graphical model can be displayed at various levels of personalization (we refer to personalization as the degree of body similarity between the avatar and the user, in other words, the degree to which an avatar reflects the bodily characteristics of the user) through a so-called avatar, ranging from not personalized (i.e., an abstract skeletal "stick figure" avatar) to highly personalized (i.e., a mirror-image avatar mimicking the appearance of the exerciser). In such a system, avatars have two roles: (i) offering the exerciser immediate visual feedback, and (ii) serving as a communication token in a social context, i.e., where others (e.g., a coach or other exercisers) can see the avatar. In a study experimenting with the former role, it was found that people who watched an avatar that looked similar to themselves running on a treadmill for approximately five minutes exercised more the next day than people whose avatars were not similar in appearance to themselves with regard to body weight, height, and age [14]. Also, those who viewed avatars lounging around exercised less than those whose avatars were active. Displaying exercise movements through a personalized avatar and sharing it with a coach or with other exercisers is expected to enhance the user’s awareness of his/her appearance, body shape, and movements.
Social comparison theory [11] would predict that the confrontation with an avatar that reflects certain personally important body characteristics evokes a pleasant or unpleasant feeling depending on the personal judgment of these characteristics. Therefore, the level of satisfaction with one’s bodily characteristics and the level of self-confidence about one’s own exercise capabilities may impact user preferences with respect to the degree of personalization of the avatar. Moreover, levels of preferred personalization may vary across different types of social contexts (the social context being defined as the individuals to whom the avatar is communicated). In short, considering the identity-expressive properties of an avatar, we expect that both the characteristics of the user and the composition of the social context (for example a coach or a few exercise friends) with whom the avatar is shared affect the preferences with respect to the level of personalization of the avatar. Therefore, it is important to gain insight into the relationship between avatar preferences and relevant user characteristics in various (technology-based) social contexts. In this chapter, we explore user preferences for various avatar types employed in a social exercise system to provide movement feedback to an exerciser and to share exercise movements with others (coach or other exercisers). In section 1, we briefly describe the use of image-processing technology for monitoring the user in a social exercise system and the use of different types of avatars for displaying the monitored movements. Then in section 2, we shed some light on the role of the social context. This is followed in section 3 by a discussion of the user characteristics that are expected to be of relevance for avatar preferences. In particular, we discuss body-image (and social physique anxiety), gender, and age. 
Subsequently, in section 4, we present an overview of our study aimed at gaining insight into avatar preferences in various social (and also non-social) exercise settings and the relation between these preferences and the user characteristics discussed in section 3. Section 5 presents the methods of the study, followed by a presentation of the results in section 6. An overview and discussion of these results are then given in section 7. Finally, section 8 provides our conclusion.

1. Monitoring the User in a Social Exercise System

In order to realize an in-home exercise system that allows for visual feedback of the user’s movements and the possibility of sharing these movements in a social exercise context, we need to consider: 1) technology for monitoring the body and the body movements of the exerciser, and 2) possibilities to display the monitored data in a user-friendly way. Below, we briefly consider monitoring technology and the displaying of the monitored data through avatars.

1.1. Monitoring Technology


The use of camera and computer vision technology presents a wide scope of possibilities for monitoring the user during exercising. Video taken of the user by a camera as he performs exercises can be displayed to him in real-time as a simple means of visual feedback. Such video can be processed locally to extract measurements of the user’s body movements, such as the number of repetitions and the relative positions of joints. In addition, the actions of the exerciser can be mapped onto a 3D model displayed through an avatar in real-time, allowing the user to receive visual feedback which may be augmented with annotations related to the performance or deviations from the ideal routine. The processing involves methods of human pose and gesture analysis. Such methods, which extract the positions of body joints as the user's movements are registered between video frames, have been reported extensively [39,7,33,34,37]. Multi-camera implementations in real-time using embedded processing have also been reported, for example in multi-player games with networked cameras [44].

1.2. Displaying Monitored Data through Avatars

Considering the intensity of interactions between these monitoring solutions and the user, it is of major importance to deepen our understanding of their impact on the user’s experience. These user experiences with the monitoring technology employed for in-home exercise systems play a major role in the adoption of the technology. In addition to the type of measurements made from the acquired video, the overall experience is determined partly by the way that the graphical representation of the user is presented to the user and to the user’s social environment. Privacy issues as well as cognitions and confidence regarding body image and comfort of exposing oneself to other people may impact a user’s willingness to use personal avatars as a visual representation both when exercising alone and in a social context.
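As an illustration of the kind of measurement mentioned in subsection 1.1 (e.g., deriving the number of repetitions from the relative positions of joints), the following sketch counts squat repetitions from a knee angle. This is our own minimal example, not the system described in this chapter: it assumes 2-D joint coordinates have already been extracted by a pose-estimation step, and the joint names and angle thresholds are hypothetical.

```python
# Illustrative sketch, not the chapter's implementation: counting exercise
# repetitions from per-frame joint positions, as a pose-estimation step
# might deliver them. Joint choice and angle thresholds are hypothetical.
import math

def knee_angle(hip, knee, ankle):
    """Angle at the knee joint, in degrees, from three 2-D positions."""
    v1 = (hip[0] - knee[0], hip[1] - knee[1])
    v2 = (ankle[0] - knee[0], ankle[1] - knee[1])
    cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))

def count_squat_reps(frames, down_thresh=100.0, up_thresh=160.0):
    """Count repetitions with hysteresis: one rep is a drop of the knee
    angle below down_thresh followed by a return above up_thresh."""
    reps, down = 0, False
    for hip, knee, ankle in frames:
        angle = knee_angle(hip, knee, ankle)
        if not down and angle < down_thresh:
            down = True          # user has reached the 'down' posture
        elif down and angle > up_thresh:
            down = False         # back up again: one full repetition
            reps += 1
    return reps
```

A real system would obtain the joint positions from the camera pipeline referenced above; the hysteresis thresholds keep noisy angle estimates from being double-counted.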
Visually representing the self through avatars has been employed extensively within the domain of virtual communities. Avatars can range from imaginary to realistic, depending on the application, system design, and user preferences. The most popular use of avatars has been on instant messaging services in cyberspace, where people can use avatars to construct a different identity [21,22]. More recently, the use of avatars has been considered for virtual reality applications such as Second Life (www.secondlife.com) and Qwaq Forum (www.qwaq.com). Fitness applications for avatars have also been explored to motivate healthy practices by creating and keeping track of a body model avatar online (My Virtual Model). Interactive avatar applications for exercising have appeared based on devices held by or attached to the body of the user. Gaming platforms such as the Nintendo Wii Fit have been used for interactive exercise sessions involving the display of an avatar [35]. Below, section 2 briefly addresses the role of the social context in which the avatar is shared and subsequently section 3 discusses three user characteristics that are expected to be of relevance for avatar preferences in a social exercise system.


2. Social Contexts

The communication function of avatars is particularly relevant in a social context where others can see the avatar. Based on the literature from the domain of social psychology (e.g., [11,12]), we expect that avatar preferences across types of social contexts are impacted by: (i) the user’s emotions and cognitions with respect to the type of behavior, and (ii) the user’s relationship with, and perception of, the individual(s) of the social context.

First, we expect that the preferred level of avatar personalization is partly determined by the user’s emotions and cognitions with respect to exercising, in particular the degree to which the user considers exercising in the social context under consideration an embarrassing act or an act to be proud of. Research has demonstrated that people dislike the presence of others when they are performing embarrassing behavior [12]. In a similar way, people can feel satisfied with the presence of others when they are proud of their behavioral performance or appearance (see, e.g., social comparison theory, [11]). A high level of embarrassment with respect to one's own exercise behavior and body (see also subsection 3.1, which discusses the construct of body image) may lead to a preference for a low level of avatar personalization (i.e., a low level of resemblance to the user’s body characteristics) in order to allow the user to disassociate from the avatar and thereby from the behavior of the avatar. In contrast, a high level of pride with respect to one's own exercise behavior may lead to a stronger desire to expose this through a personalized avatar that better reflects the user’s characteristics. The effect of embarrassment may be counteracted and the effect of pride may be strengthened by the general preference and greater sensitivity for what is similar that has been shown to exist in many domains [5].
For example, [1] showed that individuals liked virtual representations of the self more than virtual representations of others, in particular when the virtual representation of the self showed a high similarity to the individual.

A second main determinant of avatar preferences in a social context is the relationship with, and the perception of, the individuals that together compose the social context. In a study aimed at exploring to what extent users are willing to share their location with social relations, it was shown that the most important factors determining willingness to share were: the relationship with the individual with whom the information was shared, the reason for sharing, and the usefulness of the level of detail [6]. In accordance with these findings, [45] found that individuals were more willing to share personal information with other exercisers than with anyone (public) in a study that explored the social requirements for exercise group formation. Considering the results of [6,45], it is expected that the relationship with the individual(s) of the social context (compare a coach, friends, or strangers) affects the extent to which users are willing to expose themselves and therefore affects the preferred level of avatar personalization. Moreover, the perception of the individual(s) of the social context partly determines the reason for and usefulness of sharing personal details (compare a sports coach and a lay person) and thereby may affect the willingness to share personal characteristics through the avatar.

3. User Characteristics

It is expected that user characteristics impact a user’s avatar preferences with respect to the level of personalization (i.e., resemblance to one’s own appearance), as well as his acceptance of the technology and willingness to use it in the home. Many user characteristics may directly or indirectly impact a user’s preferences for the level of personalization of the graphical image when sharing visual data during exercising in different social contexts. For our study, we focused on three types of user characteristics that are expected to play an important role in shaping these preferences: body-image, gender, and age. These characteristics and their role within social contexts are briefly discussed below.


3.1. Body-image

Body-image falls under the umbrella of the larger concept of self-image. Self-image refers to the way in which a person perceives himself, either physically or as a concept. Self-image has important implications for a person’s self-esteem and confidence [9]. One’s self-image encompasses a number of self-impressions built up over a period of time, and may be influenced by previous successes (e.g., winning an award, being praised). A high self-image indicates that a person feels competent and in control of his life, feels that he deserves others’ respect, and ultimately feels good about who he is. A low self-image, on the other hand, is characteristic of people who do not feel comfortable with who they are, and perceive themselves in an unfavorable manner.

Body-image is the feeling a person has specifically about his physical appearance [4]. While self-image concerns the perception of all aspects of the self, body-image targets solely the body-related aspects of the self. Someone with a low body-image may feel poorly about his body and may wish to change his physique. Someone with a high body-image is satisfied with the way his body looks.

Related to body-image is the concept of social physique anxiety (SPA) [26]. SPA indicates how anxious an individual is about his body when around others. Although related to self-image and self-esteem, SPA is different in that it focuses on the level of comfort with one’s body when others are present. Since the body forms the center of attention in exercising, we believe that it is important to consider levels of SPA when sharing exercise movements in a social context where the user’s body is open to social evaluation. In an experiment regarding social physique anxiety and exercise behavior, it was found that individuals with high SPA were less likely to participate in exercise in situations where their bodies could be criticized [26].
The results of [26] suggest that individuals with high SPA levels feel uncomfortable exercising in social contexts where others can see and judge their physique. In a similar way, these individuals may feel uncomfortable sharing their exercise movements within a social context, in particular when the avatar reflects their bodily characteristics more accurately.

3.2. Gender

Several studies have shown that males and females can differ in motivations and reasons to exercise, body perceptions and cognitions, and willingness to engage in competition and comparison with others [24,2,41,4,27,16], which are related to differences in body-image. While the results are ambiguous, overall it seems that females experience stronger weight- and appearance-related motives than males and higher levels of body dissatisfaction [16]. Also, several studies indicate that females place higher meaning and importance on physical characteristics and perceptions of appearance than do males [17]. Moreover, females are more critical with respect to their own body than males and experience higher levels of body dissatisfaction [4]. While the differences in exercise motives and body-image related cognitions between males and females may be indicative of differences in preferences for the avatar type, no study we know of has directly assessed gender differences in avatar preferences in the field of exercise. However, there have been studies that explored the relationship between gender and avatar preferences in the context of virtual communities. For example, in their study on the use of avatars in computer-mediated communication, [22] found that females were more likely to use avatars that were less similar to themselves, whereas males did not mind using avatars that were somewhat representative of reality. Based on the results of avatar studies and those from the domain of exercise psychology, we may expect that females are less willing than males to express themselves by means of personalized avatars that reflect their bodily characteristics.


3.3. Age

Several studies have shown that exercise motivation and self-perceptions related to the body (such as body-image, perceived physical fitness, self-esteem, and self-efficacy) may decline with age [32,23,30,16], possibly affecting avatar preferences. Also, research from the domain of gaming and online social networks suggests that avatar preferences may vary with age [22,8]. For example, a recent study on avatar personalization by players in virtual worlds such as Maple Story, Second Life, and World of Warcraft found that older people broadly prefer creating avatars that look younger than they really are [8]. In contrast, younger individuals preferred avatars of a similar age. Avatar studies, as well as studies from the domain of exercise psychology, seem to suggest that younger individuals are often more satisfied with their body and are more willing to see their personal body-related characteristics reflected in avatars in social networks than older individuals. Therefore, age should be considered a possible determinant of the willingness to visualize and share the body and exercise movements through an avatar in different social contexts.

4. Study Overview and Expectations

The study described in this chapter is an initial exploration into the following research question: To what extent do users appreciate the use of avatars to visualize their exercise movements, and what is the preferred level of personalization of an avatar when it is communicated within various social contexts?


Figure 1. Avatar examples used in the survey: a skeletal "stick figure" avatar (right), a body-shape based avatar (center), and a mirror-image avatar (left).

In order to gain insight into the extent to which users appreciate the use of avatars for exercise visualization and into the preferred level of avatar personalization, we administered a survey in which individuals rated the use of three types of avatars in four types of social exercise contexts. In this way, avatar appreciations (as measured by the avatar ratings) were obtained for each type of avatar in each type of social context. The three types of avatars varied in the level of personalization (i.e., the level of similarity to the user’s bodily characteristics): (i) a skeletal "stick figure" avatar, (ii) a body-shape based avatar, and (iii) a mirror-image avatar. The least personalized avatar was the skeletal avatar, which did not show any resemblance to the user’s bodily characteristics apart from reflecting the user’s body movements. The second avatar was the body-shape based avatar, which showed a considerable degree of personalization, resembling the user in terms of body shape and body movements (individuals were explicitly told that this avatar reflected their body shape). The most personalized avatar was the mirror-image avatar, which fully resembled the user’s appearance and movements. Examples of these avatar types are shown in Figure 1: skeletal figure (right), body-shape based avatar (center), and mirror-image avatar (left).

Four types of (social) exercise contexts were distinguished, defined by the individuals with whom the user’s avatar was shared: (i) the user himself/herself (context Self), (ii) an exercise coach (context Coach), (iii) a (small) group of exercise friends (context Friends), and (iv) a group of other exercisers unknown to the user (context Others). The various assumed determinants of the appreciation of using avatars and the preferred level of avatar personalization in different social contexts (see e.g. section 2) complicate the formulation of specific predictions with respect to the avatar ratings collected in the survey.
Therefore, we addressed our main research question through a truly open exploration without explicit prior predictions. In addition to our main research question, we aimed to gain some insight into the relationship between avatar ratings and three types of user characteristics: body-image (in particular, level of SPA), gender, and age. While this preliminary study is too limited to be considered a systematic and thorough examination of the relationship of these user characteristics with avatar ratings, it does provide some initial insight into these relationships by comparing avatar ratings across gender, across two levels of SPA (low SPA vs. high SPA), and across two age groups. From the literature discussed in the previous section we derive the following expectations: (a) males are more willing to visualize and communicate their exercise movements through an avatar than females, in particular when the avatar is more personalized (i.e., more similar to their bodily characteristics); (b) individuals low on SPA are more willing to visualize and communicate their exercise movements through an avatar than individuals high on SPA, in particular when the avatar is more personalized; and (c) younger individuals are more willing to visualize and communicate their exercise movements through an avatar than older individuals, in particular when the avatar is more personalized. Based on these general expectations, we formulated the following three explicit expectations with respect to the avatar ratings of our survey study:

1. Avatar appreciation varies with gender, with males overall showing higher avatar appreciations (i.e., higher avatar ratings) than females, in particular for the personalized avatars.
2. Avatar appreciation varies with SPA level, with individuals low on SPA overall showing higher avatar appreciations than individuals high on SPA, in particular for the more personalized avatars.
3. Avatar appreciation varies with age, with younger individuals overall showing higher avatar appreciations than older individuals, in particular for the more personalized avatars.

5. Method

5.1. Study Design

The study entailed a 3 (avatar type: skeletal avatar, body-shape based avatar, mirror-image avatar) x 4 (social context: contexts Self, Coach, Friends, and Others; see section 4) factorial survey design.
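The 3 x 4 design yields twelve avatar-by-context cells, each rated by every participant. The cells can be enumerated programmatically; a minimal sketch (the label strings are our own shorthand, not taken from the survey):

```python
# Enumerate the 12 cells of the 3 (avatar type) x 4 (social context)
# factorial survey design described above.
from itertools import product

avatar_types = ["skeletal", "body-shape based", "mirror-image"]
social_contexts = ["Self", "Coach", "Friends", "Others"]

conditions = list(product(avatar_types, social_contexts))
print(len(conditions))  # → 12
```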


5.2. Participants We recruited male and female participants within two different age groups: (A) working adults in the Netherlands (21 males, 9 females, mean age = 31 years, std = 4.6 years), and (B) high school students in California (4 males, 9 females, mean age = 15 years, std = 0.5 years). 5.3. Procedure For our explorations into user preferences for the different types of avatars across the different types of social contexts, we administered a survey to participants in the two groups. The survey introduced the scenario of exercising at home with the technology-based exercise monitoring system. Survey takers were informed about the visualization and communication of exercise movements through different types of avatars within different types of social contexts. Then they were asked to provide their avatar ratings through a set of survey questions. Following the questions about avatar ratings, the participant’s level of SPA was assessed. In order to explore the relationship between avatar appreciation and gender, we compared avatar ratings between males and females within each participant group. In a similar way, we explored the relationship between avatar appreciation and SPA by comparing

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

J. Lacroix et al. / Avatar Communication for a Social In-Home Exercise System: A User Study

221

avatar ratings between participants with low SPA levels and those with high SPA levels. Finally, to gain an initial idea of the relationship between avatar appreciation and age, we compared avatar ratings across the two participant groups.

5.4. Measurements

Avatar appreciation was measured by the avatar ratings obtained through a set of survey questions following the scenario explanations at the start of the survey. To obtain a composite avatar rating based on the variables that we considered most important for the exercise experience, the survey questions addressed three dimensions: enjoyment, comfort, and helpfulness. Each survey question sketched the social context and the type of avatar under consideration and asked for the avatar rating (on a scale from 1 = not at all to 5 = very much) on one of the dimensions (e.g., "When exercising with the system I would feel comfortable when my coach sees my movements through my Skeletal Avatar": (not at all) 1 - 2 - 3 - 4 - 5 (very much)). The avatar ratings on the three dimensions (enjoyment, comfort, helpfulness) were averaged to yield a composite avatar rating for each avatar type in each type of social context. To assess the level of SPA, participants filled out the 12-item Social Physique Anxiety Scale (SPAS) designed to measure the level of SPA [18].
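The composite rating described above is simply the per-cell mean of the three dimension ratings. A minimal pure-Python sketch of that computation (the ratings and field layout are invented for illustration, not the study's data):

```python
from collections import defaultdict
from statistics import mean

# One record per (avatar type, social context, dimension) rating on the
# 1-5 scale, for a single participant. Toy data for illustration only.
ratings = [
    ("skeletal",     "Self", "enjoyment",   2),
    ("skeletal",     "Self", "comfort",     3),
    ("skeletal",     "Self", "helpfulness", 4),
    ("mirror-image", "Self", "enjoyment",   5),
    ("mirror-image", "Self", "comfort",     4),
    ("mirror-image", "Self", "helpfulness", 3),
]

# Composite avatar rating: mean of the three dimension ratings
# for each avatar type within each social context.
cells = defaultdict(list)
for avatar, context, _dimension, rating in ratings:
    cells[(avatar, context)].append(rating)

composite = {cell: mean(values) for cell, values in cells.items()}
print(composite)
```

These per-participant composites are the dependent variable entering the ANOVAs reported in section 6.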

6. Results

Below we present the results of participant groups A (section 6.1) and B (section 6.2). Subsequently, we compare the two groups (section 6.3).


6.1. Results Participant Group A

Figure 2 presents the overall avatar ratings averaged across participants for each type of avatar (light grey bars: skeletal avatar; black bars: body-shape based avatar; dark-grey bars: mirror-image avatar) and each type of social context (Self, Coach, Friends, Others). From Figure 2 it can be seen that the participants from group A prefer the body-shape based avatar overall and in each type of social context. Moreover, for the contexts Self, Coach, and Friends, the mirror-image avatar is preferred second and the skeletal avatar is preferred least. In contrast, for the context Others we see a different pattern: the skeletal avatar is preferred second and the mirror-image avatar is clearly rated lowest. An ANOVA (factors avatar type and social context) with avatar rating as the dependent variable reveals a significant main effect for avatar type, F(2, 359) = 28.8, p < 0.001, as well as for social context, F(3, 359) = 15.7, p < 0.001 (the highest ratings were obtained for contexts Coach, mean = 3.3, and Self, mean = 3.14, followed by Friends, mean = 2.74, and finally Others, mean = 2.28). A multiple comparisons test revealed that ratings differed significantly between contexts Self and Others, between contexts Coach and Friends, and between contexts Coach and Others, but did not differ significantly between contexts Self and Coach, between contexts Self and Friends, and between contexts Friends and Others. Moreover, the two-way interaction between avatar type and social context was significant, F(6, 359) = 5.1, p < 0.001, indicating that avatar preferences for


Figure 2. Avatar ratings from participant group A. Error bars represent standard errors of the mean.


Figure 3. Avatar ratings from (a) males and (b) females in participant group A.

the three types of avatars vary across social contexts. In particular, the context Others shows a different pattern of avatar ratings than the other contexts. The graphs in Figure 3 present the avatar ratings for males and females per type of social context and overall (not differentiating between types of social contexts). From these graphs it can be seen that the patterns for males and females are quite similar overall and also for the social contexts Coach, Friends, and Others. In the social contexts Coach and Friends, both males and females score the body-shape based avatar highest, followed by the mirror-image avatar and subsequently the skeletal avatar. For the context Others, both males and females rate the body-shape based avatar highest, followed by the skeletal avatar, and the lowest ratings were obtained for the mirror-image avatar. For the context Self a different pattern is observed for the males and females. While males rated the mirror-image avatar highest (followed by the body-shape based avatar and finally the skeletal avatar), females rated the body-shape based avatar highest (followed by the mirror-image avatar and finally the skeletal avatar). Nevertheless, an ANOVA (with the factors avatar type and gender) for context Self did not confirm the statistical significance of the interaction between gender and avatar type (F(2, 89) = 1.2, p > 0.30), and no main effect for gender was obtained either (F(1, 89) = 1.37, p > 0.24). Also, overall (not differentiating between types of social contexts), an ANOVA indicated no significant main effect for gender (F(1, 359) = 1.21, p > 0.2) nor an interaction between gender and avatar type (F(2, 359) = 1.86, p > 0.15). A possible explanation for the absence of a main effect for gender is the small number of females in the participant group. The absence of a significant main effect and of the interaction effect means that we find no support for our first expectation based on the results of participant group A. Nevertheless, the results indicate that for context Self males prefer the highest level of avatar personalization (mirror-image avatar), while females prefer a moderate level of avatar personalization. For two of the participants we did not obtain an accurate SPA score because of missing data. For the remaining 28 participants, the graphs in Figure 4 present the avatar ratings per type of social context and the overall avatar ratings for participants with a low SPA score and participants with a high SPA score.
It can be seen from these graphs that the patterns of avatar ratings obtained for participants with a low SPA score differ from those of participants with a high SPA score. An ANOVA with the factors avatar type and SPA shows a significant interaction between avatar type and SPA (F(2, 335) = 5.59, p < 0.01), but no main effect for SPA. From the graphs it can also be derived that the interaction effect can in large part be explained by differences in ratings for the skeletal and the mirror-image avatars, with the participants low on SPA rating skeletal avatars higher and mirror-image avatars lower than participants high on SPA. When analyzed separately per context, we can see from the graphs in Figure 4 that different patterns of avatar ratings were obtained for participants with low SPA scores and those with high SPA scores. ANOVAs revealed a significant interaction between avatar type and SPA score for the social context Self (F(2, 83) = 3.27, p < 0.05). No significant interactions were obtained between avatar type and SPA score for the social contexts Coach (F(2, 83) = 1.81, p > 0.15), Friends (F(2, 83) = 1.67, p > 0.19), and Others (F(2, 83) = 1.02, p > 0.36). Moreover, these ANOVAs showed a significant main effect for avatar type and no significant main effect for SPA score. Given the absence of a main effect for SPA level in participant group A, we failed to find support for the first part of our second expectation, namely that individuals with low SPA levels would show higher overall avatar ratings (across all avatar types) than those with high SPA levels.
Moreover, the significant interaction effect of avatar type and SPA level, combined with the data presented in Figure 4, contradicts the second part of our second expectation by showing that participants with high SPA levels overall show higher ratings for the highest level of avatar personalization (mirror-image avatar) and lower ratings for the lowest level of avatar personalization (skeletal avatar) compared to


Figure 4. Avatar ratings for participants with (a) low and (b) high SPA levels in participant group A.


participants with low SPA levels (this interaction effect was significant for context Self).

6.2. Results Participant Group B

Figure 5 presents the avatar ratings from participant group B for each type of avatar (light grey bars: skeletal avatar; black bars: body-shape based avatar; dark-grey bars: mirror-image avatar) and each type of social context (Self, Coach, Friends, Others) and overall. From Figure 5 it can be seen that there is an overall preference for the mirror-image avatar. Rather similar patterns of avatar ratings were obtained for the social contexts Self, Coach, and Friends, with the highest ratings for the mirror-image avatar, slightly lower ratings for the body-shape based avatar, and the lowest ratings for the skeletal avatar. In contrast, the ratings for context Others show a different pattern. There, the highest ratings were obtained for the skeletal avatar, followed by the body-shape based avatar and finally the mirror-image avatar. An ANOVA with factors avatar type and social context, and avatar rating as the dependent variable, reveals a significant main effect for social context, F(3, 161) = 4.13, p < 0.008 (the highest ratings were obtained for context Self, mean = 3.34, then Coach, mean = 3.17, followed by Friends, mean = 3.10, and finally context Others, mean = 2.50). A multiple comparisons test revealed that only the ratings between social contexts Self and Others were significantly different; the other differences between social contexts did not reach


Figure 5. Avatar ratings from participant group B.


Figure 6. Avatar ratings from (a) males and (b) females in participant group B.

significance. No main effect for avatar type was found, F(2, 161) = 2.49, p = 0.09. Moreover, the two-way interaction between avatar type and social context was significant, F(6, 161) = 5.1, p < 0.005. This interaction effect is due to the different pattern of avatar type ratings between the social context Others and the other social contexts; when this context is not considered in the ANOVA, the main effect of context disappears, F(2, 121) = 0.43, p = 0.65, as well as the interaction effect between avatar type and social context, F(4, 121) = 0.4, p = 0.81. Moreover, leaving the social context Others out of the ANOVA results in a significant main effect of avatar type,


F(2, 121) = 8.57, p < 0.001. The graphs in Figure 6 present the avatar ratings for males and females. These graphs show that overall the avatar ratings of the males are higher than the ratings of the females for the skeletal and the mirror-image avatars. For social contexts Self and Coach, males and females differ mainly in the scores for the mirror-image avatar, with males scoring this avatar higher than females. For the contexts Coach and Friends, males score the skeletal avatar as well as the mirror-image avatar higher than females. Finally, for the context Others, males seem to score the skeletal and body-shape based avatars higher than females. The higher scores for males compared to females were confirmed in an ANOVA (with the factors avatar type and gender) showing a main effect for gender, F(1, 161) = 4.93, p < 0.03. The main effect for avatar type did not reach statistical significance (F(2, 161) = 2.44, p = 0.09), nor did the interaction between gender and avatar type (F(2, 161) = 0.85, p = 0.43). These results support the first part of our first expectation, that males show higher avatar ratings overall than females, but fail to support the second part, that this difference is particularly large for the most personalized avatars. The absence of an interaction effect between gender and avatar type can also be observed in the avatar rating patterns in the graphs of Figure 6, where males and females showed rather similar patterns. For the social contexts Self and Coach, both males and females scored the mirror-image avatar highest, followed by the body-shape based avatar, and finally the skeletal avatar. For the context Others, both males and females rated the skeletal avatar highest, followed by the body-shape based avatar and then the mirror-image avatar.
For the context Friends, males and females differed in their ranking: both scored the mirror-image avatar highest, but males scored the body-shape based avatar lowest whereas females scored the skeletal avatar lowest. Not surprisingly, when analyzed separately per social context type, ANOVAs (with factors avatar type and gender) did not reveal a significant interaction effect between gender and avatar type for any of the social contexts (probably also due to the small sample sizes). The graphs of Figure 7 present the avatar ratings for participants with a low and a high SPA score. It can be seen from these graphs that overall there are some differences in the patterns of avatar ratings obtained for participants with a low SPA score compared to those of participants with a high SPA score. Overall, participants with a low SPA score rated the body-shape based and the mirror-image avatars higher than the participants with a high SPA score. This difference was most pronounced for the contexts Self, Coach, and Friends. Moreover, for the context Others participants with a high SPA score tended to rate skeletal avatars higher than participants with a low SPA score. Despite the trends in Figure 7, an ANOVA with the factors avatar type and SPA level revealed no significant main effect of SPA level (F(1, 161) = 2.86, p = 0.09) and no significant interaction between avatar type and SPA level (F = 1.74, p = 0.17). Also, when analyzed separately per type of social context, no significant main effects were revealed for SPA level and no interactions between avatar type and SPA level were found. Therefore, we failed to find support for our second expectation on the basis of the results obtained for participant group B. Nevertheless, the trends in the data are in line with both parts of the expectation: participants with low SPA levels show higher avatar ratings overall, and in particular for the more personalized avatars (the body-shape based and the mirror-image avatars).
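The factorial ANOVAs reported in sections 6.1 and 6.2 would in practice be run in a statistics package; as a minimal illustration of where the reported F values come from, the following sketch computes the F statistic for a one-way main effect (e.g., of avatar type) on composite ratings. The numbers are invented for illustration, not the study data:

```python
from statistics import mean

def one_way_anova_F(groups):
    """F statistic for a one-way ANOVA over a list of groups of ratings."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares (df = k - 1, for k groups).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares (df = N - k, for N observations).
    ss_within = sum(sum((v - mean(g)) ** 2 for v in g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Toy composite ratings per avatar type (invented, not the study data).
skeletal     = [2.0, 2.3, 2.7, 2.0]
body_shape   = [3.7, 4.0, 3.3, 3.7]
mirror_image = [3.0, 3.3, 2.7, 3.0]

F = one_way_anova_F([skeletal, body_shape, mirror_image])
print(round(F, 2))
```

The chapter's actual analyses are two-way models (avatar type x social context, or avatar type x gender/SPA) followed by pairwise multiple comparisons; those would be fitted with a dedicated package (e.g., R or statsmodels' `anova_lm` with a Tukey HSD post hoc) rather than by hand. The sketch only shows what the reported F(df_between, df_within) statistics measure: between-cell variance relative to within-cell variance.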


Figure 7. Avatar ratings from participants with (a) low and (b) high SPA levels in participant group B.


6.3. Comparison of Participant Groups A and B

When we compare the average avatar ratings across all avatar types in each participant group, we find no differences between participant group A (mean = 2.87) and participant group B (mean = 2.84); these values are not shown in the graphs, but are obtained by averaging the avatar ratings across all avatar types in each group. This means that we do not find support for the first part of our third expectation, that younger individuals show a higher appreciation for all avatars than older individuals. However, the main observation that can be made is that the groups differ in their preferences, with participant group A having a preference for the body-shape based avatar and participant group B preferring the mirror-image avatar. This difference in avatar preferences between participant groups A and B can be seen by comparing the overall ratings in Figures 2 and 5. It accords with the second part of our expectation, since we found that the younger participants preferred the highest level of avatar personalization, while the older participants preferred a moderate level of personalization (body-shape based avatar).


7. Overview and Discussion

The purpose of this investigation was to explore user preferences for seeing and sharing exercise movements through an avatar and to examine how different social contexts and several user characteristics affect these preferences. Regarding the overall user preferences, participant group A preferred the body-shape based avatar, while participant group B preferred the mirror-image avatar. For participant group A, a main effect for avatar type was obtained, while the observed trend did not reach statistical significance for group B. The results suggest that avatar personalization is appreciated, which accords with earlier research on avatar preferences and self-expression through avatars [1,14]. Nevertheless, the current study did not control for differences in aesthetics between the avatar types. Therefore, a follow-up study should address avatar preferences using avatar designs that control for the aesthetics factor. For both participant groups a significant main effect for social context was obtained, and in both groups the difference in ratings between social contexts Self and Others reached statistical significance, which implies that, in general, individuals have a lower preference for sharing their exercise movements through an avatar with unknown others than for viewing their own exercise movements through an avatar. Also, for participant group A, significant differences in ratings were obtained between contexts Coach and Friends and between contexts Coach and Others, which implies a greater willingness to share exercise movements with a coach than with friends or with unknown others. This finding suggests that people see additional benefits in sharing their exercise movements with a coach compared to sharing their movements with friends or other people. We believe that these possible benefits concern the expected support of a coach and the expert judgments on the quality of exercise movements enabled by the sharing.
Future research should address the robustness of this finding and the reasons underlying these preferences. With respect to the role of user characteristics and avatar preferences across different social contexts, we had formulated three expectations in section 4. Some of the obtained results are consistent with our expectations, while others contradict them. Our first expectation, that males would show higher overall avatar appreciation (corresponding to the avatar ratings) than females, was supported by the results of participant group B, but not by those of group A. For participant group A we found no overall support for this expectation, although we did see the trend that females would be less inclined than males to prefer a personalized avatar in the context Self. For participant group B, the results support the first part of our first expectation, but fail to support the second part, that this difference is particularly large for the most personalized avatars. The second expectation, that individuals low on SPA would show higher overall avatar appreciation than individuals high on SPA, was not supported by the results. First, neither group showed a main effect for SPA level, so we failed to find support for the first part of the expectation that overall higher ratings would be obtained for individuals with low SPA levels compared to those with high SPA levels. Moreover, with regard to the second part of our expectation, the results were opposite to what we expected for participant group A. For this group, it was found that participants low on SPA rated skeletal avatars higher and mirror-image avatars lower than participants high on SPA. This was the case for the context Self. A possible explanation for this is that the preoccupation with body appearance that often accompanies body dissatisfaction, which is


closely related to SPA [26], may have led to a preference for a more personalized avatar despite the body dissatisfaction. In the context Self, where only the user sees the avatar, this preference is probably not counteracted by the feeling of being judged by others. This may explain the higher ratings for the personalized avatars in the context Self by participants with high SPA levels. Further investigations are needed to deepen our understanding of the relationship between avatar preferences and body image (and SPA in particular). The third expectation was partially supported by our results. While we failed to find support for the first part of our third expectation, that overall the appreciation of avatars (across all avatar types) varies with age, we did find a different pattern of avatar preferences between the participant groups. Younger participants preferred the highest level of avatar personalization, while the older participants preferred a moderate level of personalization (body-shape based avatar), which supports the second part of our expectation. A limitation of this study is that we did not control for confounding factors that may have led to differences in avatar appreciation between the two groups. Therefore, it is difficult to conclude that the differences between avatar ratings in the two groups are based on age differences. The differences may also be explained by other factors such as culture or education. Overall, the results from this survey study provide an initial idea of avatar preferences and the role of social contexts and user characteristics in these preferences. Future research should enhance this understanding by performing large-scale survey studies into the preferences of different types of user groups. Also, in addition to the user characteristics addressed in the current survey, future research should consider other user characteristics that may be relevant in shaping avatar preferences, such as personality.
We expect that two of the five broad dimensions of personality that are generally distinguished in traditional psychology [15] may be particularly relevant for avatar preferences: extraversion and openness to experience. These dimensions have been shown to be related to the willingness of people to expose themselves to others [19]. Therefore, they may affect users' acceptance of avatars in social contexts and their preferences with respect to the level of personalization of the avatar. Finally, we think that future research should aim to deepen our understanding of the effects of avatar types on behavior. Several studies suggest that the type of avatar may impact a user's real-life behavior [36,14]. Therefore, we believe it is important to investigate the effects of different types of avatars on the user's exercise behavior.

8. Conclusion

Guidance and feedback from a coach, as well as the company of other exercisers, can be quite powerful in enhancing motivation and exercise behavior in traditional exercise settings such as the gym. Many technological developments aim to realize the motivational power of these elements in an exercise setting that lacks human face-to-face contact. This chapter introduced an in-home exercise system that implements interactive monitoring, feedback, and social connectedness by sharing exercise movements through an avatar. Although several studies have shown that these elements can have a positive impact on a user's exercise experience, findings so far are inconclusive about the desired type and level of personalization of the feedback and about the extent to which users wish to share this feedback with others (for example, a coach or other exercisers). The study presented in this chapter entailed an initial exploration into the extent to which users appreciate the visualization of exercise movements through avatars and their preferred level of personalization in different social contexts. Moreover, we examined the relationship between these avatar preferences and various user characteristics. Our results indicate that the type of social context and the characteristics of the user play a role in the appreciation of avatar communication and in the preferences with respect to the level of personalization. This implies that these variables should be studied carefully when developing social exercise systems that communicate exercise movements within a social context. Further larger-scale investigations are needed to deepen our understanding of the role of these variables in shaping avatar preferences. Also, future studies should address the effect of avatar communication and the level of avatar personalization on exercise motivation and behavior.

Acknowledgements

Chen Wu and Hamid Aghajan from Stanford University are gratefully acknowledged for providing useful technological insights with respect to the real-time monitoring and displaying of movements.


References

[1] J. N. Bailenson and J. Blascovich. Self-representations in immersive virtual environments. Journal of Applied Social Psychology, 38:2673–2690, 2008.
[2] R. Bowden, B. Lanning, L. Irons, and J. Briggs. Gender comparisons of social physique anxiety and perceived fitness in a college population. Research Quarterly for Exercise and Sport, 73(1):87, 2002.
[3] D. M. Bravata. Using pedometers to increase physical activity and improve health. Journal of the American Medical Association, 298(19):2296–2304, 2007.
[4] T. F. Cash and K. Grasso. The norms and stability of new measures of the multidimensional body image construct. Body Image, 2:199–203, 2005.
[5] R. B. Cialdini. Influence: Science and Practice. Harper Collins, New York, 2001.
[6] S. Consolvo, I. E. Smith, T. Matthews, A. LaMarca, J. Tabert, and P. Powledge. Location disclosure to social relations: why, when, & what people want to share. In CHI '05: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, New York, NY, USA.
[7] M. Dimitrijevic, V. Lepetit, and P. Fua. Human body pose detection using Bayesian spatio-temporal templates. Computer Vision and Image Understanding, 104(2), 2006.
[8] N. Ducheneaut, M. Wen, N. Yee, and G. Wadley. Body and mind: a study of avatar personalization in three virtual worlds. In Proceedings of the 27th International Conference on Human Factors in Computing Systems, 2009.
[9] E. Ekeland, F. Heian, and K. B. Hagen. Can exercise improve self esteem in children and young people? A systematic review of randomised controlled trials. British Journal of Sports Medicine, 39:792–798, 2005.
[10] P. D. Faghri, C. Omokaro, C. Parker, E. Nichols, S. Gustavesen, and E. Blozie. E-technology and pedometer walking program to increase physical activity at work. The Journal of Primary Prevention, 29:73–91, 2008.
[11] L. Festinger. A theory of social comparison processes. Human Relations, 7:117–140, 1954.
[12] B. Fish, S. A. Karabenick, and M. Heath. The effects of observation on emotional arousal and affiliation. Journal of Experimental Social Psychology, 14:256–265, 1978.
[13] B. J. Fogg. Mass interpersonal persuasion: An early view of a new phenomenon. In PERSUASIVE '08: Proceedings of the 3rd International Conference on Persuasive Technology, 2008.


[14] J. Fox and J. Bailenson. Virtual self-modeling: The effects of vicarious reinforcement and identification on exercise behaviors. Media Psychology, 2009.
[15] L. R. Goldberg. The structure of phenotypic personality traits. American Psychologist, 48:26–34, 1993.
[16] M. S. Hagger and N. L. D. Chatzisarantis. Intrinsic Motivation and Self-determination in Exercise and Sport. Human Kinetics, Champaign, IL, 2007.
[17] S. Harter. The Construction of the Self: A Developmental Perspective. The Guilford Press, New York, 1999.
[18] E. A. Hart, M. R. Leary, and W. J. Rejeski. The measurement of social physique anxiety. Journal of Sport and Exercise Psychology, 11:94–104, 1989.
[19] R. Hogan, J. Johnson, and S. Briggs. Handbook of Personality Psychology. Academic Press, California, 1997.
[20] J. Huang, G. Norman, M. Zabinski, K. Calfas, and K. Patrick. Body image and self-esteem among adolescents undergoing an intervention targeting dietary and physical activity behaviors. Journal of Adolescent Health, 40:245–251, 2007.
[21] T. Jordan. Cyberpower: The Culture and Politics of Cyberspace and the Internet. Routledge, New York, 1999.
[22] H. Kang and H. Yang. The visual characteristics of avatars in computer-mediated communication: Comparison of Internet Relay Chat and Instant Messenger as of 2003. International Journal of Human-Computer Studies, 64:1173–1183, 2006.
[23] R. J. Kirkby, G. S. Kolt, K. Habel, and J. Adams. Exercise in older women: Motives for participation. Australian Psychologist, 34:122–127, 1999.
[24] N. Koivula. Sport participation: Differences in motivation and actual participation due to gender typing. Journal of Sport Behavior, 22:360–381, 1999.
[25] J. Lacroix, P. Saini, and R. Holmes. The relationship between goal difficulty and performance in the context of a physical activity intervention program. In Proceedings of the 10th International Conference on Human-Computer Interaction with Mobile Devices and Services (MobileHCI 2008), pages 415–418, September 2008.
[26] C. D. Lantz and C. J. Hardy. Social physique anxiety and perceived exercise behavior. Journal of Sport Behavior, 20:83–93, 1997.
[27] K. Martin Ginis, H. Prapavessis, and A. Haase. The effects of physique-salient and physique non-salient exercise videos on women's body image, self-presentational concerns, and exercise motivation. Body Image, 5:164–172, 2007.
[28] E. McAuley and K. S. Morris. Advances in physical activity and mental health: Quality of life. American Journal of Lifestyle Medicine, 1:389–396, 2007.
[29] J. A. M. Murcia, M. L. de San Roman, C. M. Galindo, N. Alonso, and D. Gonzalez-Cutre. Peers' influence on exercise enjoyment: A self-determination theory approach. Journal of Sports Science and Medicine, 7:23–31, 2008.
[30] Y. Netz and S. Raviv. Age differences in motivational orientation toward physical activity: An application of social-cognitive theory. The Journal of Psychology, 138:35–48, 2004.
[31] F. J. Penedo and J. R. Dahn. Exercise and well-being: A review of mental and physical health benefits associated with physical activity. Current Opinion in Psychiatry, 18:189–193, 2005.
[32] M. Polce-Lynch, B. J. Myers, C. T. Kilmartin, R. Forssmann-Falck, and W. Kliewer. Gender and age patterns in emotional expression, body image, and self-esteem: A qualitative analysis. 38:1025–1048, 1998.
[33] J. Rittscher, A. A. Blake, and S. Roberts. Towards the automatic analysis of complex human body motions. Image and Vision Computing, 12:905–916, 2002.
[34] C. Robertson and E. Trucco. Human body posture via hierarchical evolutionary optimization. page III:999, 2006.
[35] S. Schiesel. O.K., avatar, work with me. New York Times, 2008. Retrieved Dec. 31, 2008 from http://www.nytimes.com/2008/05/15/fashion/15fitness.html.
[36] R. Schroeder. The Social Life of Avatars: Presence and Interaction in Shared Virtual Environments. Springer-Verlag New York, Inc., New York, NY, USA, 2002.
[37] L. Sigal and M. J. Black. Measure locally, reason globally: Occlusion-sensitive articulated pose estimation. In CVPR06, pages II: 2041–2048, 2006.
[38] E. Sluijs, M. van Poppel, J. Twisk, and W. van Mechelen. Physical activity measurements affected participants' behavior in a randomized controlled trial. Journal of Clinical Epidemiology, 59:404–411,

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,


J. Lacroix et al. / Avatar Communication for a Social In-Home Exercise System: A User Study

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

2006.
[39] C. Sminchisescu and B. Triggs. Kinematic jump processes for monocular 3D human tracking. In CVPR (1), pages 69–76, 2003.
[40] A. L. Smith. Peer relationships in physical activity contexts: a road less traveled in youth sport and exercise psychology. Psychology of Sport and Exercise, 4:25–39, 2003.
[41] A. L. Smith. Measurement of social physique anxiety in early adolescence. Medicine and Science in Sports and Exercise, 36:475–483, 2004.
[42] S. Trost, N. Owen, and A. E. Bauman. Correlates of adults' participation in physical activity: review and update. Medicine and Science in Sports and Exercise, 34:1996–2001, 2002.
[43] L. J. Ware, R. Hurling, O. B. W. B. Fairley, L. T. Hurst, P. Murray, L. K. Rennie, E. C. Tomkins, A. Finn, R. M. Cobain, A. D. Pearson, and P. J. Foreyt. Rates and determinants of uptake and use of an internet physical activity and weight management program in office and manufacturing work sites in England: Cohort study. Journal of Medical Internet Research, 10(4):e56, Dec 2008.
[44] C. Wu, H. Aghajan, and R. Kleihorst. Real-time human posture reconstruction in wireless smart camera networks. In IPSN '08: Proceedings of the 7th International Conference on Information Processing in Sensor Networks, 2008.
[45] M. Wu, A. Ranjan, and K. N. Truong. An exploration of social requirements for exercise group formation. In CHI '09: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, New York, NY, USA.


Behaviour Interpretation in Smart Environments


Behaviour Monitoring and Interpretation – BMI
B. Gottfried and H. Aghajan (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-048-3-235


Rule-based intention recognition from spatio-temporal motion track data in ambient assisted living

Peter KIEFER, Klaus STEIN and Christoph SCHLIEDER
Laboratory for Semantic Information Technologies, University of Bamberg
{peter.kiefer,klaus.stein,christoph.schlieder}@uni-bamberg.de

Abstract. A central design issue in ambient assisted living consists in creating environments which show a smart behavior that is transparent to the user, in the sense that the effects of ambient intelligence are easily predictable. Another aspect of behavioral transparency is related to the effective and efficient communication with the user about relevant background knowledge. Rule-based specifications are widely used in today's home automation systems as a means to communicate user knowledge. This chapter addresses ambient intelligence based on the recognition of the user's intentions to act and discusses approaches that identify user intentions in the context of specific background knowledge about possible tasks and the spatial environment. We give a survey of knowledge-based methods for intention recognition and compare different rule-based formalisms with regard to the trade-off between expressiveness and complexity. Special emphasis is laid on approaches that assume a user moving in an environment which is spatially structured by a partonomy, as most indoor and near-outdoor environments are.


Keywords. Intention recognition, knowledge-intensive modeling

1. Introduction

Although the following short description of a person's behavior in his home environment reminds us of typical examples from the literature on ambient assisted living, it is from an article on plan recognition that was published long before the establishment of ambient intelligence as a field of research: '(A1) Steve walked to the cabinet. (A2) He opened the cabinet. (A3) He took the record out of the cabinet. (A4) He took the record out of the record jacket.' [30, p.3]

We expect that sensors are attached to the cabinet door, the record jacket, and the record, and that we get Steve’s position by some means of indoor positioning. We also imagine a number of services our ambient environment could offer after each of the steps, like (S1) illuminating the cabinet with a spot-light, (S2) giving a personal music recommendation,


P. Kiefer et al. / Rule-Based Intention Recognition from Spatio-Temporal Motion Track Data

Figure 1. User interface of a commercial home automation system software (mControl v2, © Embedded Automation, Inc.) [8, p.80]: simple trigger-action rules.


(S3) playing a preview sound snippet through the headphone, (S4) switching on the hi-fi system.

Current commercial home automation systems implement services like these with simple trigger-action rules, e.g. trigger: somebody approaches the cabinet; action: switch on the spot-light. The basic behavior is directly mapped to a service. This is not appropriate in many situations: Steve, for instance, might be approaching the cabinet accidentally (on his way to the kitchen or some other place). Schlieder calls this the room-crossing problem [28]. When opening the cabinet, Steve might as well have the intention to pick something else from the cabinet, or to put something back into it. By offering an inappropriate service, like turning on music or switching on the light at the wrong time, our ambient system will bother the user and make her feel 'lost in ambient intelligence' [23].

Intention recognition can help us to design more intelligent ambient environments. Our aim is to find out the user's current intention, the 'state [. . . ] of mind' directed 'towards some future state of affairs' [36, p.23]. The service we offer is selected depending on the intention and not triggered by the sensor directly [29]. Since we cannot read the user's mind, we try to draw conclusions from her previous behavior, and from the place where that behavior happened. For example, in step (A2) we try to distinguish whether Steve has the intention SelectMusic or the intention PickDishes. This depends on the higher-level intention, given the behavior Steve has shown before. In other words, if he has prepared dinner in the kitchen he will probably now have the intention SetTheTable with the sub-intention PickDishes.

For this kind of inference our system needs background knowledge about the intentions a user can have in our environment. The choice of the representational formalism for encoding this knowledge, and the inference mechanism we run on the data, are the object of research of the plan and intention recognition community. The article we cited above


[30], as well as other early work on plan recognition, dealt with this problem on a very general level. The representational formalisms were rather expressive, like Kautz's event hierarchies [15]. However, the problem of plan recognition has an intrinsic complexity that makes it intractable in the general case. Current work in plan and intention recognition focuses on finding solutions for special cases, e.g. Liao et al.'s work on determining a user's destination and mode of transportation in public transport [21]. While the variable domains and the transition parameters in their hierarchical Markov model are learned, the structure of the model is fixed for the public transportation use case.

In ambient intelligence it is often not possible to use one fixed intention model for all scenarios. Every ambient environment is different, and so are the habits and daily routines of its inhabitants. If a computer scientist were needed to customize an ambient environment system at installation time (and to modify it at run time), the costs would be rather high. This is one reason why current commercial home automation systems allow their users to modify the behavior of the system on their own. As you can see in Fig. 1, the user can create her own trigger-action rules in an intuitive way. Another reason is that users simply feel more comfortable if they have the system under their control. This is one conclusion of an experiment performed by Misker et al. on manual vs. automatic device selection in an ambient intelligence environment: 'subjects were quite willing to invest a bit more time and effort in exchange for more control' [22]. It is also important that we can explain the outcome of our computations to the user (no black box). For these reasons our aim is to represent intentions in an intuitive way, understandable even for the user herself. As we will see in this chapter, formal grammars are cognitively easy to understand. This distinguishes them from the dynamic-network-based approaches which are also used for intention recognition, ranging from naive Bayes approaches [5], through Dynamic Bayesian Networks [1] and Hidden Markov Models [3], to hierarchical Markov models [21]. In these approaches the system comes with a previously learned model which the user is not able to view or modify herself. It requires, for instance, too much background knowledge to interpret the conditional independencies encoded in a Bayesian network correctly.

This chapter addresses intention recognition in ambient environments with formal grammars. As space plays a major role in any ambient environment, the spatial behavior (motion track with spatial context) of the user is the main input to our intention recognition. Motion tracks observed in typical scenarios of ambient assisted living show certain regularities that require a corresponding degree of complexity for representing them. We discuss common patterns of spatial behavior and explain which degree of complexity is needed for representing these patterns with an appropriate spatial grammar. We give a survey of a hierarchy of grammar formalisms, and discuss their formal expressiveness, their cognitive understandability, and their inference complexity. We develop the ideas of this chapter using an ambient library as an example, which we introduce in section 2. Starting with simple context-free formalisms (section 3), we proceed to probabilistic (state-dependent) grammars (section 4). We lay special emphasis on spatial grammars, formalisms that combine knowledge about space and intentions (section 5).


Figure 2. The floor plan of the library.

Figure 3. The spatial partonomy of the library.

2. Use Case: Intention Recognition in an Intelligent Library

2.1. The Library: A Partonomically Structured Environment

We use a library scenario as our running example. The library is structured as displayed in the floor plan in Fig. 2: we have three thematic sections (economics, humanities, computer science), each of which has a number of book shelves, and a designated reading


room where visitors can browse the books they have collected. At the checkout counter visitors present the books they plan to borrow to the library staff, who register the checkout in the computer system. Visitors wearing coats or carrying bags are required to leave their belongings in the lockers room during the time of their visit. As you see in Fig. 3, the spatial regions in our library form a partonomy: we can arrange the regions in a tree with part_of relations.¹ Almost all ambient environments can be structured partonomically. A smart home, for instance, has several floors with rooms which contain regions where you usually cook, work, watch TV, and so on. It is important to note that our regions structure the library not only spatially, but also semantically. If we know that a person is located inside a certain region, we already get some idea about the intentions she can have in that region. This comes close to the intuition behind activity-based spatial ontologies which characterize spatial regions by the actions they afford [20,12].
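A partonomy like the one in Fig. 3 can be encoded as a plain parent map. The following sketch is our own illustration; region names follow the figures, the helper functions are not from the chapter, and the exact nesting of Foyer and Inside is one plausible reading of Fig. 3.

```python
# Sketch: the library partonomy of Fig. 3 as a parent map (illustrative only).
PART_OF = {
    "Entrance": "Library", "Foyer": "Library", "Inside": "Library",
    "Counter": "Foyer", "Lockers Room": "Foyer",
    "Locker Shelf 1": "Lockers Room", "Locker Shelf 2": "Lockers Room",
    "Aisle": "Inside", "Reading Room": "Inside",
    "BSection CS": "Inside", "BSection Ec": "Inside", "BSection Hu": "Inside",
    **{f"Shelf CS-{i}": "BSection CS" for i in range(1, 5)},
    **{f"Shelf Ec-{i}": "BSection Ec" for i in range(1, 5)},
    **{f"Shelf Hu-{i}": "BSection Hu" for i in range(1, 5)},
}

def ancestors(region):
    """All regions that transitively contain the given region."""
    result = []
    while region in PART_OF:
        region = PART_OF[region]
        result.append(region)
    return result

def part_of(inner, outer):
    """True if `inner` lies (transitively) inside `outer`."""
    return outer in ancestors(inner)
```

Such a containment test is exactly what lets a rule grounded on a coarse region (e.g. the whole library) apply to behaviors observed in any of its sub-regions.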


2.2. Interpreting Human Behavior in an Ambient Library

Our ambient library should be clearly more intelligent than a simple mapping of sensor data to services. We are not talking about switching on the light when the visitor enters a reading room, or muting the mobile phone if other library visitors are in the same room. Our interest is in recognizing the (complex and possibly long-term) intentions a visitor has in mind. In contrast to plan recognition, intention recognition is satisfied with finding out the current intention of a person, not necessarily a complex plan structure.² We are not interested in recognizing the intentions of a group, but only of an individual visitor.

The kind of input data we can use for intention recognition heavily depends on the sensors we have in our library. Let us assume we get the following data for each visitor: the motion track (i.e. a sequence of 2D coordinates), a pick- and drop-event whenever the visitor picks or drops a book from a shelf, a checkout-event when books are scanned at the checkout counter, and a lock- and unlock-event if a locker is opened or closed. These data are collected by several sensors in the environment, and by a portable device attached to the user. We could, for instance, retrieve the indoor position by fusing the data from WLAN, camera surveillance, and sensors in the floor [32], and get the pick/drop/checkout-events from RFID.

The semantic gap between low-level data and high-level intentions is very large, especially for 2D position data. To bridge this gap, we use a multi-level architecture like that presented for the use case of a GPS game in [19], see Fig. 4: the motion track is preprocessed with the steps segmentation, feature extraction, and classification. This turns a sequence of (quantitative) coordinates into a sequence of (qualitative) descriptions which we call behaviors. As displayed in Fig. 4 we also treat the non-motion track data as behaviors.
For each behavior we know the spatial region where it happened. In the library plan from Fig. 2 we see an example of a visitor's behavior sequence. The intention recognition problem consists in interpreting a behavior sequence b1, ..., bn (with bj from the set of possible behaviors B) in terms of intentions. That means,

¹ There are several other spatial relations that hold between the regions in our library, for instance Lockers Room touches Foyer or, trivially, Shelf CS-1 disjoint BSection Ec. These relations are left out for clarity reasons.
² That means, the problem of intention recognition is a subproblem of plan recognition. If we have recognized a complex plan structure, we have also recognized the current intention.

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

240

P. Kiefer et al. / Rule-Based Intention Recognition from Spatio-Temporal Motion Track Data

Figure 4. From low-level data to information services.

we try to find an intention i (from the set of intentions I) for each element of the behavior sequence, resulting in an intention sequence i1, ..., in. Two or more succeeding intentions in this sequence will be equal if the intention does not change from one behavior to the next. The intentions we allow in the set I should be selected by a domain expert. Although a real person could have almost any intention, we can restrict ourselves to those intentions that we plan to support with a corresponding service. In the following we are only interested in the intention recognition problem; we do not describe the services we want to provide in our library.
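The segmentation and classification steps of Fig. 4 could, for instance, label track segments by speed. The sketch below is purely illustrative: the 0.2 m/s threshold, the function name, and the restriction to the two behaviors searching and standing are our own assumptions, not values from the chapter.

```python
import math

def classify_track(track, speed_threshold=0.2):
    """Turn a timed 2D motion track into qualitative behaviors.

    track: list of (t, x, y) samples. Each consecutive pair of samples is
    classified by average speed (m/s); runs of identical labels are merged
    into single behavior segments.
    """
    behaviors = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        speed = math.hypot(x1 - x0, y1 - y0) / (t1 - t0)
        behaviors.append("searching" if speed > speed_threshold else "standing")
    # merge runs of identical labels into one segment each
    merged = [behaviors[0]]
    for b in behaviors[1:]:
        if b != merged[-1]:
            merged.append(b)
    return merged
```

A real system would additionally extract features such as heading changes and attach the spatial region to each resulting behavior.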


3. Intention Recognition With Context-Free Grammars

With formal grammars we can describe structural regularities. Starting with Chomsky [6], they were developed for natural language description and processing (NLP) (see e.g. [14]), adopted by computer scientists in formal language theory (e.g. regular expressions or the BNF), and used for pattern recognition [11] and computer vision [4]. This idea can be adapted for recognizing the patterns in human behavior. For intention recognition the non-terminals represent the possible intentions, while terminals denote the behaviors. Geib and Steedman [9] argue that almost all plan recognition approaches work on hierarchical plans, which can be seen as special cases of hierarchical task networks (HTN, see [10]). For instance, a student's intention BorrowBooks while visiting the library could be hierarchically split into GetBooks, CheckOutBooks, and Leave, where GetBooks must occur temporally before CheckOutBooks, which has to be done before Leave. Written as an HTN method this gives m1: (m1, BorrowBooks, {GetBooks, CheckOutBooks, Leave}, {(1 ≺ 2), (2 ≺ 3)}). This can easily be transformed into a CFG rule:

BorrowBooks → GetBooks CheckOutBooks Leave

Figure 5 gives a set of production rules describing a visitor's intentions and behaviors in a library. This obviously is a simplified ruleset; we did not include bringing a book back, or putting a book back on the shelf after inspecting it, etc., as we want to keep the examples simple. A more realistic model may consist of around fifty to a hundred rules.


Production Rules (library)

BorrowBooks       → GetBooks CheckOutBooks Leave    (1)
GetBooks          → SearchSection SelectBooks       (2)
SearchSection     → searching                       (3)
SelectBooks       → InspectShelf SelectBooks        (4)
                  | InspectShelf InspectBooks       (5)
InspectShelf      → SearchShelf ExamineBook         (6)
SearchShelf       → searching                       (7)
ExamineBook       → standing                        (8)
                  | standing pick                   (9)
InspectBooks      → SearchReadingRoom BrowseBooks   (10)
SearchReadingRoom → searching                       (11)
BrowseBooks       → standing                        (12)
CheckOutBooks     → SearchCounter checkout          (13)
SearchCounter     → searching                       (14)
Leave             → searching                       (15)

Figure 5. Representing the intentions of a visitor in an ambient environment library.

We call a production system mapping intentions to behaviors an intentional system [28, p.11]:


Definition 1 An intentional system is a production system A = (B, I, P, S) with terminals B denoting behaviors, non-terminals I denoting intentions, and start symbol S ∈ I called the highest-level intention. P is a set of productions over I ∪ B. All sets B, I and P are finite.

The rules in Fig. 5 clearly describe a context-free grammar (CH-2), as the ruleset P ⊂ (I × (I ∪ B)*). Regular grammars (CH-3) would be too restrictive for many tasks, as they are not able to count pairs: e.g., if the user picks n objects and drops them afterwards, we get the sequence pick^n drop^n, which cannot be expressed in CH-3. This kind of pattern frequently occurs in spatial partonomies with an arbitrary number of nested regions, where users enter and leave regions frequently, e.g. enter_a . . . enter_b . . . leave_(a+b). All these patterns of the form x^n y^n cannot be expressed with a regular grammar. We take CFG as a starting point for grammar-based intention recognition parsers. As we will see later, some tasks need more expressive grammars, introducing higher levels of complexity (and increasing parse time).

Fig. 6 gives the parse tree of a certain behavior sequence derived from the highest-level intention. We call the direct parent of a behavior in the parse tree the corresponding current intention. It gives an interpretation of the underlying behavior and can be used to provide the appropriate information service. In our example parse tree the current intention of the first searching behavior is SearchSection, and that of the last searching behavior is Leave.

In ambient environments we want to provide the user with tailored information services in almost real-time, so the parsing must be done on the fly. This means that the parser has to work on a growing sequence of behaviors, only knowing what the user has done until now, not knowing what she will do next. Fortunately, incremental parsers are available from NLP applications, where the computer tries to understand the speaker
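For a ruleset of this size, a complete behavior sequence can also be checked directly with a memoized recognizer. The sketch below encodes the grammar of Fig. 5; it is our own illustration and performs a whole-sequence membership test rather than the incremental Earley parsing a deployed system would use.

```python
from functools import lru_cache

# The production rules of Fig. 5; lowercase symbols are terminal behaviors.
GRAMMAR = {
    "BorrowBooks": [["GetBooks", "CheckOutBooks", "Leave"]],
    "GetBooks": [["SearchSection", "SelectBooks"]],
    "SearchSection": [["searching"]],
    "SelectBooks": [["InspectShelf", "SelectBooks"],
                    ["InspectShelf", "InspectBooks"]],
    "InspectShelf": [["SearchShelf", "ExamineBook"]],
    "SearchShelf": [["searching"]],
    "ExamineBook": [["standing"], ["standing", "pick"]],
    "InspectBooks": [["SearchReadingRoom", "BrowseBooks"]],
    "SearchReadingRoom": [["searching"]],
    "BrowseBooks": [["standing"]],
    "CheckOutBooks": [["SearchCounter", "checkout"]],
    "SearchCounter": [["searching"]],
    "Leave": [["searching"]],
}

def derivable(seq, symbol="BorrowBooks"):
    """True if `symbol` derives exactly the behavior sequence `seq`."""
    seq = tuple(seq)

    @lru_cache(maxsize=None)
    def ends(sym, i):
        # all positions j such that sym derives seq[i:j]
        if sym not in GRAMMAR:  # terminal behavior
            return frozenset({i + 1}) if i < len(seq) and seq[i] == sym else frozenset()
        result = set()
        for rhs in GRAMMAR[sym]:
            positions = {i}
            for part in rhs:
                positions = {j for p in positions for j in ends(part, p)}
            result |= positions
        return frozenset(result)

    return len(seq) in ends(symbol, 0)
```

The memoization keeps the check polynomial; the recursion terminates because every rule alternative consumes at least one terminal before recursing on the same non-terminal.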


Figure 6. A parse tree for a visitor of the library borrowing books.

Probabilistic Production Rules (library)

SelectBooks → InspectShelf SelectBooks     d(p) = 0.7   (4)
            | InspectShelf InspectBooks    d(p) = 0.3   (5)
ExamineBook → standing                     d(p) = 0.8   (8)
            | standing pick                d(p) = 0.2   (9)
all other rules from Fig. 5                d(p) = 1.0

Figure 7. Production rules from Fig. 5 with probabilities.

before she finished her sentence. Parsing algorithms for CFG typically use a dynamic programming approach, storing and reusing partial derivations during execution. Simple variations of the well-known Earley parser [7] will incrementally parse CFG sequences in O(n^3). For a description of incremental parsing techniques for CFG see e.g. [14, p.377ff] or one of the many other textbooks on formal grammars or NLP.

4. Probabilistic (State-Dependent) Grammars

One important issue when parsing CFGs is ambiguity. For many behavior sequences a number of different parse trees match, especially in incremental parsing, where only the starting part of the sequence is known. For instance the sequence searching searching


standing pick searching standing is ambiguous as the last standing might either be ExamineBook or BrowseBooks, as rules (4) and (5) can be applied here. Ambiguous parsing is very common in NLP: ‘I shot an elephant in my pajamas’ may be parsed as ‘I was wearing my pajamas while I shot the elephant’ as well as ‘I shot an elephant which was wearing my pajamas’ (example taken from [14, p.373]). While the second interpretation is not the preferred reading it is syntactically correct.

4.1. Probabilistic Intentional Systems

A probabilistic context-free grammar (PCFG) is an extension of CFG which assigns a certain probability to each production rule. PCFGs were introduced to cope with the described ambiguities [2]. The underlying idea is that the probabilities help to compute the preferred tree without withdrawing the others.

Definition 2 A probabilistic intentional system PIS = (B, I, P, S, d) is an intentional system (B, I, P, S) enhanced by production probabilities d : P → [0, 1], where the sum of d(p) for all productions p with the same left-hand intention is 1.


In the ruleset of Fig. 5, rules (4) and (5) and rules (8) and (9) are ambiguous, as (4) or (5) can be applied to SelectBooks, and (8) or (9) can be applied to ExamineBook. By assigning probabilities to all productions we turn our intentional system into a probabilistic intentional system. Figure 7 shows the updated ruleset. For our example the probabilities are chosen manually. If a sufficient amount of annotated behavioral data is available, we can also learn the probabilities from a corpus.

Figure 8 shows the five possible incremental parse trees for the (partial³) behavior sequence searching searching standing pick searching standing.⁴ The probability of each of these five trees is derived by computing their conditional probabilities. With (d(4) · d(9) · d(4) · d(8)) = (0.7 · 0.2 · 0.7 · 0.8) = 0.0784, parse tree 1 has the highest probability, so the preferred interpretation is that the user has the current intention ExamineBook. For a probabilistic Earley parser see Stolcke [33].

4.2. Probabilistic State Dependent Grammars

As PCFGs are context-free, the probabilities of applying productions are independent from each other. In other words: they ignore what happened in other parts of the parse tree. In our example, the probability for inspecting another shelf and choosing another book is always higher (d(4) = 0.7) than the probability for finally inspecting the books collected (d(5) = 0.3), which is not very intuitive, as the number of books the visitor can carry is limited. Preferring rule (4) always expects another (searching standing)-sequence by inferring the intention InspectShelf, and never infers the intention BrowseBooks (if our inference algorithm chooses the hypothesis with the highest probability). Because the probabilities of rules are not context-free in real world scenarios, you have to take the context into account to get useful preferred interpretations. What the user has done certainly influences the probabilities of what she will do in future. On the other

³ An extract from the sequence shown in Fig. 2.
⁴ Trees 1 and 2 differ in the last applicable production rule (choosing rule (8) or (9) after the last standing behavior) and are therefore given in one graph. The same holds for trees 3 and 4.
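The tree scores can be reproduced in a few lines: a PCFG parse tree's probability is the product of the d(p) values of its applied rules. The rule multisets for trees 2–5 below are our own reading of Fig. 8 and are included only for illustration.

```python
from math import prod

# Probabilities d(p) from Fig. 7; every omitted rule has probability 1.0.
D = {4: 0.7, 5: 0.3, 8: 0.8, 9: 0.2}

def tree_probability(applied_rules):
    """Probability of a parse tree = product of its rule probabilities."""
    return prod(D.get(r, 1.0) for r in applied_rules)

# The five competing incremental parse trees of Fig. 8, each given by the
# ambiguous rules it applies (our reading of the figure).
trees = {
    1: [4, 9, 4, 8],
    2: [4, 9, 4, 9],
    3: [4, 9, 5, 8],
    4: [4, 9, 5, 9],
    5: [5, 9],
}
scores = {k: tree_probability(v) for k, v in trees.items()}
best = max(scores, key=scores.get)
```

Tree 1 scores 0.7 · 0.2 · 0.7 · 0.8 = 0.0784 and wins, matching the preferred interpretation ExamineBook derived in the text.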


Figure 8. Probabilistic parse trees for an ambiguous partial behavior sequence. Probabilities d(k) not given in the tree are 1.0.




π0(q) = 1.0 for q = 0,
        0.0 otherwise.

π1(qt−1, b, qt) = 1.0 for b = pick ∧ qt = qt−1 + 1 ∧ qt ≤ 8,
                  1.0 for b ∈ B \ {pick} ∧ qt = qt−1,
                  0.0 otherwise.

Figure 9. Probability distributions for the library example PSDIS. q denotes the number of books currently carried by the visitor.

hand, switching to a fully context-sensitive grammar would make the parsing rather expensive. This is the motivation of Pynadath and Wellman [25] to introduce Probabilistic State Dependent Grammars (PSDG). A PSDG is a PCFG with an additional state variable q ∈ Q. q describes the current state of the user within the domain Q, which may be a complex multidimensional set of states. For our library example Q holds the number of books currently carried by the visitor, i.e. Q = {0, . . . , 8} if we assume the visitor carries no more than 8 books. q0 describes the initial state of the parser, q1 the state after the first behavior is parsed, and qt the state after the t-th terminal. For all non-terminals in the parse tree we (recursively) define their time as the time of their leftmost child. This allows us to define the probability for a certain rule at time t to depend on the state qt−1:

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

Definition 3 A probabilistic state dependent intentional system PSDIS = (B, I, P, S, Q, d, π0, π1) is a probabilistic intentional system (B, I, P, S, d) enhanced by a domain Q and state-dependent probabilities d : P × Q → [0, 1], where for a fixed qt the sum of d(p, qt) for all productions p with the same left-hand intention is 1. π0 : Q → [0, 1] is the initial probability distribution of q and π1 : Q × B × Q → [0, 1] the behavior-dependent state transition probability distribution.

So π0(q) tells with which probability some state q ∈ Q is the initial state q0 of the parser, and π1(qt−1, b, qt) tells for a given state qt−1 ∈ Q and a given terminal b ∈ B with which probability the system will go to a state qt ∈ Q. See Fig. 9 for our library: the system starts with q0 = 0 with probability 1.0, i.e. the visitor has no books in the beginning. Each time the visitor picks a book, the state changes from n to n + 1 with probability 1.0, i.e. the number of books increases by one; for any other behavior it stays the same. More complex domains include many more features, like the number of shelves visited. In domains where state transitions are not deterministic and where we cannot observe the state q directly (hidden state), π1(qt−1, b, qt) will take other values than 0 or 1.

While π1 models the state transitions, d(p, qt) gives the state-dependent probability of a production rule p to be applied. Figure 10 gives the ruleset for the library example. Now, obviously, with an increasing number of books carried, the probability for choosing rule (5) (instead of (4)) and rule (8) (instead of (9)) also increases. Pynadath describes a method for efficiently answering intention recognition queries by converting a given PSDG/PSDIS description into a dynamic Bayes network and exploiting the particular independence properties of the PSDG language [24].


P. Kiefer et al. / Rule-Based Intention Recognition from Spatio-Temporal Motion Track Data

PSDG Production Rules (library)                            d(p, q):  q=0   q=1   q=2   q=3   q>3
SelectBooks → InspectShelf SelectBooks        (4)                    0.9   0.6   0.5   0.3   0.1
            | InspectShelf InspectBooks       (5)                    0.1   0.4   0.5   0.7   0.9
ExamineBook → standing                        (8)                    0.5   0.5   0.6   0.6   0.8
            | standing pick                   (9)                    0.5   0.5   0.4   0.4   0.2
all other rules from Fig. 5                                          1.0   1.0   1.0   1.0   1.0

Figure 10. Production rules from Fig. 5 with state-dependent probabilities.

Probabilistic approaches require us to obtain useful a priori probabilities. In contrast to NLP, where large corpora of annotated texts exist, such corpora still need to be acquired for behavioral data. In a public space like the library use case, where lots of people show quite similar behavior, we can easily build a corpus by observing visitors. We can, for instance, empirically determine the probability of picking another book. Other ambient environments, especially at home, are very diverse and highly personalized. The spatial structure and number of rooms, the tasks that are possible in that environment, and in particular the users' habits are quite individual. In this case we should take the probabilities from a (global) corpus and personalize the model by learning at run time.
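Determining rule probabilities from such a corpus amounts to count-based maximum-likelihood estimation. A minimal sketch, assuming annotated parse trees are available as nested (intention, children) tuples; this encoding and the function names are our own illustration:

```python
# Sketch: MLE of production probabilities from a corpus of annotated parse
# trees. A tree is (lhs, [children]); leaves (behaviors) are plain strings.
from collections import Counter

def rule_counts(tree, counts):
    """Recursively count (lhs -> rhs) rule applications in one parse tree."""
    lhs, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(lhs, rhs)] += 1
    for c in children:
        if not isinstance(c, str):
            rule_counts(c, counts)

def estimate(corpus):
    """d(p) = count(p) / count of all rules with the same left-hand intention,
    so the probabilities of rules sharing a left-hand side sum to 1."""
    counts = Counter()
    for tree in corpus:
        rule_counts(tree, counts)
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {p: n / lhs_totals[p[0]] for p, n in counts.items()}
```

Observing ExamineBook resolved twice as standing and once as standing pick, for example, would yield probabilities 2/3 and 1/3 for the two rules.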

5. Spatial Grammars


5.1. Spatially Grounded Intentional Systems

A special kind of context for an ambient environment is its spatial structure. Behaviors reveal different intentions depending on where they occur. A searching behavior in the area of the book shelves is typically driven by a different intention than a searching behavior in the lockers room. The spatial structure of an ambient environment helps to disambiguate possible parse trees by using the user's spatial context [28, p.13]:

Definition 4 A spatially grounded intentional system SGIS = (B, I, P, S, R, G) is an intentional system (B, I, P, S) enhanced by a finite set of spatial regions R and a relation G ⊆ P × R describing the regions in which a production rule is applicable.

For parsing an SGIS each behavior is attributed with the region where it happened. A given production rule p is only applicable if all corresponding behaviors (i. e. the leaf nodes derived from p in the parse tree) occur in spatial regions r1 to rk with (p, ri) ∈ G. By organizing the regions in R as a partonomy, the hierarchical structure allows each production rule to be grounded on an appropriate level, i. e. there are rules applicable in the whole library, others only in the lockers room, etc.

So let us have a look at the ambiguous behavior sequence from Sec. 4: searching searching standing pick searching standing. We now attribute each behavior with the region in which it occurred: (searching, AISLE) (searching, BSECTION CS) (standing, SHELF CS-4) (pick, SHELF CS-4) (searching, BSECTION CS) (standing, READINGROOM). Using spatially grounded production rules (Fig. 11) it is obvious that the last intention cannot be ExamineBook, as neither production rule (8) nor (9) is applicable in the


SGIS Production Rules (library)                                 spatial grounding
SelectBooks       → InspectShelf SelectBooks         (4)        INSIDE
                  | InspectShelf InspectBooks        (5)        INSIDE
InspectShelf      → SearchShelf ExamineBook          (6)        BSECTION m...n
SearchShelf       → searching                        (7)        BSECTION m...n
ExamineBook       → standing                         (8)        SHELF 1...k
                  | standing pick                    (9)        SHELF 1...k
InspectBooks      → SearchReadingRoom BrowseBooks    (10)       INSIDE
SearchReadingRoom → searching                        (11)       INSIDE
BrowseBooks       → standing                         (12)       READINGROOM
all other rules from Fig. 5                                     LIBRARY

Figure 11. SGIS rules for the library example.
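The applicability test of Definition 4, combined with the partonomy of regions, can be sketched as follows. The parent-pointer encoding of the partonomy and the region names are our own illustrative assumptions:

```python
# Sketch: SGIS rule applicability over a partonomy of regions, given as
# child -> parent links (names follow the library example, illustratively).
PARENT = {
    "ShelfCS-4": "BSectionCS", "BSectionCS": "Inside",
    "ReadingRoom": "Inside", "Inside": "Library", "Aisle": "Library",
}

def part_of(region, ancestor):
    """True iff `region` equals `ancestor` or lies below it in the partonomy."""
    while region is not None:
        if region == ancestor:
            return True
        region = PARENT.get(region)
    return False

def applicable(rule_region, behavior_regions):
    """A rule p grounded in `rule_region` is applicable iff every behavior
    derived from p occurred in a region r_i with (p, r_i) in G, i.e. in or
    below the grounding region."""
    return all(part_of(r, rule_region) for r in behavior_regions)
```

With this check, a rule grounded in SHELF CS-4 matches behaviors observed there, but not a standing observed in the reading room, which is exactly why rules (8) and (9) fail for the final behavior of the example sequence.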

SGIS Production Rule extensions (library)                       spatial grounding
SearchSection     → searching SearchSection          (16)       INSIDE
SearchShelf       → searching SearchShelf            (17)       BSECTION m...n
SearchReadingRoom → searching SearchReadingRoom      (18)       INSIDE
SearchCounter     → searching SearchCounter          (19)       LIBRARY
Leave             → searching Leave                  (20)       LIBRARY
all other rules from Fig. 5, 11

Figure 12. SGIS rule extensions for the library example. These additional rules are necessary because the behavior sequence is additionally segmented by the spatial regions.

READINGROOM (see Fig. 13 for the corresponding SGIS parse tree). And this is very reasonable: if I am not standing at the shelf, how should I choose a book from it? Binding possible intentions to the places or regions where they may occur is very natural in ambient environments. Many services provided by an ambient environment depend on the current position of the user anyway, so spatial grounding is very intuitive. The partonomial structure of the regions allows each rule to be grounded on the appropriate level. How to select the correct level of interpretation in a partonomy is discussed in [29].

The introduction of regions allows us to ground behaviors and intentions. This also means that the user's behavior sequence is now not only segmented by changes in behavior (e. g. when the user stops searching and stands in front of a shelf) but also by moving from one region to another. For example, the track of a user walking through the library showing searching behavior all the time was previously represented as a single searching and is now segmented into (searching, AISLE) (searching, BSECTION EC) (searching, BSECTION HU) . . . . This means we have to expand rules (3), (7), (11), (14) and (15), each representing a single searching behavior, into a sequence of searching behaviors (Fig. 12).

At first sight an SGIS looks like a special case of a PSDG, where the domain Q corresponds to the set of regions R. This is generally true, but contrary to the latter the production rules of an SGIS are not chosen with a given probability. This allows an SGIS to be parsed with a modified Earley parser, where the additional constraints simply limit the breadth of the search space, thus speeding up the parser. For many applications it may


[Parse tree (bracket notation):
 BorrowBooks
   GetBooks
     SearchSection: (searching, AISLE)
     SelectBooks
       InspectShelf
         SearchShelf: (searching, BSECTION CS)
         ExamineBook: (standing, SHELF CS-4) (pick, SHELF CS-4)
       InspectBooks
         SearchReadingRoom: (searching, BSECTION CS)
         BrowseBooks: (standing, READINGROOM)]

Figure 13. The only possible SGIS parse tree for the spatially grounded partial behavior sequence.

SCCFG Production Rules (library)                                             spatial grounding
BorrowBooks   → SearchLockers LockBagi GetBooks CheckOutBooks
                SearchLockers UnlockBagj Leave                    (21)       LIBRARY, ri = rj
SearchLockers → searching                                         (22)       ENTRANCE
LockBag       → lock                                              (23)       LOCKERSHELF 1,2
UnlockBag     → unlock                                            (24)       LOCKERSHELF 1,2

Figure 14. Extending the example from Fig. 11. The visitor has to leave her bag outside.

be feasible to use a probabilistic SGIS, where for each region the probability of each rule is given. While this certainly leads to a more expressive grammar, the drawback is that it needs a more complex parser.

5.2. Spatially Constrained Context-Free Grammars

One important restriction of SGIS is that the regions are propagated hierarchically through the parse tree. This allows only a restricted subset of spatial configurations to be expressed. In Fig. 14 we extend the ruleset by some rules for putting the visitor's bag into a locker before entering the inner library and fetching it again afterwards. Obviously, the visitor will fetch the bag from the same locker where she locked it up before. So what we have here is that she goes to one locker shelf (LOCKERSHELF 2), locks the bag, goes into the library, comes back to the same locker shelf (LOCKERSHELF 2), takes her bag and leaves:

BorrowBooks → SearchLockers LockBag GetBooks CheckOutBooks SearchLockers UnlockBag Leave
(with identical regions for LockBag and UnlockBag)

So here we need a grammar in which we can express that two intentions must be in the same region while some intentions in between may be elsewhere. So we define [16]:


Extended SGIS Production Rules (library)                        spatial grounding
SelectBooks → TakeBooki SelectBooks ReturnBooki      (25)       INSIDE
            | TakeBooki InspectBooks ReturnBooki     (26)       INSIDE
            | TakeBook SelectBooks                   (27)       INSIDE
            | TakeBook InspectBooks                  (28)       INSIDE
TakeBookj   → SearchShelf PickBookj                  (29)       INSIDE
PickBook    → pick                                   (30)       SHELF 1...k
ReturnBookk → SearchShelf PushBookk                  (31)       INSIDE
PushBook    → standing push                          (32)       SHELF 1...k
            | push                                   (33)       SHELF 1...k

Figure 15. Extended SGIS rules. Rule (9) is now dispensable.


Definition 5 A spatially constrained context-free grammar SCCFG = (SGIS, R, SR, NLC) is a spatially grounded intentional system SGIS = (B, I, P, S, G) extended by a set of spatial relations SR over R, with each ρ ∈ SR : ρ ⊆ R × R, and a set of spatial non-local constraints NLC, each of which defines a spatial relation ρ ∈ SR between two regions r1, r2 ∈ R, where r1 and r2 are attached to two right-side intentions of the same rule.

Using the spatial relation = ∈ SR we can now define the spatial non-local constraint for rule (21), namely that the regions of LockBag and UnlockBag have to be equal5. For equality we normally use a simplified notation by just using the same index for the connected intentions, so this would be: BorrowBooks → . . . LockBagi . . . UnlockBagi . . . As arbitrary spatial relations (like touches, overlaps, isNorthOf etc.) are allowed, also more complex situations can be modeled. For a given ruleset and a given partonomy, an SGIS can be automatically computed from an SCCFG (with many more rules, i. e. for a given finite partonomy and set of relations each SCCFG can be expanded to a set of SGIS rules instantiated with concrete regions fulfilling the given relations). However, for reasons of intuitive modeling and portability (if we change the spatial relations without changing the production rules), it makes sense to choose an SCCFG model whenever we find a spatial constellation like that of returning to the lockers shelf. Currently, the visitor of the library may pick up some books from the shelves, inspect them in the reading room, and finally check them out at the counter before leaving the library. In reality she will probably not take all books with her. So what we are currently missing is putting some of the books back on the shelf after browsing them. Figure 15 adds rules to describe this sequence. A book is taken from a shelf and put back afterwards.
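Evaluating such a non-local constraint during parsing can be sketched in a few lines; the constraint encoding, the relation set, and the region names are our own illustration, not the chapter's implementation:

```python
# Sketch: checking a spatial non-local constraint of an SCCFG rule.

def equal(r1, r2):
    """The '=' relation: two regions are equal iff they are the same region
    (more precisely, the smallest common parent regions of the behaviors
    under the two intentions coincide; see footnote 5)."""
    return r1 == r2

RELATIONS = {"=": equal}  # arbitrary relations (touches, overlaps, ...) fit here

def check_nlc(constraint, bindings):
    """A constraint ties two right-side intentions of one rule, e.g.
    ('=', 'LockBag', 'UnlockBag') for rule (21). `bindings` maps each
    indexed intention to the region its behaviors were observed in."""
    rel, i1, i2 = constraint
    return RELATIONS[rel](bindings[i1], bindings[i2])
```

For rule (21), locking at LOCKERSHELF 2 and unlocking at LOCKERSHELF 2 satisfies the constraint, while unlocking at a different locker shelf does not.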
Rule (25) states this, and the indexes tell us that TakeBook and ReturnBook must be in the same region; the book has to be returned to the same book shelf: the PickBook from rule (29) must be in the same book shelf region as the PushBook from rule (31). The SearchShelf intentions in both rules are not concerned. We achieve this by the indexes j and k on the left-hand side of (29) and (31): the index in the left-side non-terminal binds any spatial constraint and propagates it to the right side. As this is a constraint between two production rules, the production system in Fig. 15 is not an SCCFG. However, we could also state the relations explicitly as in Fig. 14 by writing the content of rules (29) and (31) directly into rule (25). Thus, Fig. 15 is an SCCFG with a simplified notation.

5 We need some explanation of the meaning of equal in this context: with equal we mean that the smallest common parent regions of all behaviors that are (transitive) children of two intentions i1 and i2 connected through a spatial relation ρ are identical. We need that explanation because if we have a top region (LIBRARY), all behaviors will obviously always be in that library and thus, in some sense, in the equal region.

5.3. Spatially Constrained Tree-Adjoining Grammars

An SCCFG allows us to express long-range spatial dependencies, as the examples in the last section illustrated. These are instances of a general "come back again to where you were before" pattern, which shows up whenever some intention has to occur in the same region as an earlier one. The rules given in Fig. 15 are not sufficient to describe this general pattern. If our visitor takes three books from the shelves, she is supposed to bring them back in reverse order (or keep them), otherwise our ruleset will not match. And this is not a problem of us having chosen the wrong rules, but a general restriction of context-free grammars: they are simply not able to cope with arbitrary overlapping dependencies. So, for taking three books and putting two back on their shelf, from the following six behavior sequences


1. PickBook1 PickBook2 PickBook3 . . . PushBook1 PushBook2
2. PickBook1 PickBook2 PickBook3 . . . PushBook1 PushBook3
3. PickBook1 PickBook2 PickBook3 . . . PushBook2 PushBook1
4. PickBook1 PickBook2 PickBook3 . . . PushBook2 PushBook3
5. PickBook1 PickBook2 PickBook3 . . . PushBook3 PushBook1
6. PickBook1 PickBook2 PickBook3 . . . PushBook3 PushBook2
only the nested ones (3, 5 and 6) can be expressed by a CFG. This problem is discussed in recent plan and intention recognition research (Geib and Steedman [9], Kiefer and Schlieder [18]) and has been known for a long time in NLP, as some languages have long-ranging crossing dependencies. For instance, in the Dutch sentence 'omdat ik1 Jan2 het lied3 probeer1 te leren2 zingen3'6 the words with the same indexes are connected (example from [9, p.4]). In NLP this was addressed by developing mildly context-sensitive grammars (MCSG), a class of grammars more expressive than CFGs but less complex than CSGs (context-sensitive grammars). MCSG include tree-adjoining grammars (TAG) [13], combinatory categorial grammars (CCG) [31], linear indexed grammars (LIG) [26], and head-driven phrase-structure grammars (HDPSG) [27]. All of them support certain kinds of crossing dependencies but are still parsable in polynomial time.7 Compared to other MCSG, TAGs are cognitively rather easy to understand. The main idea of TAG is to give the production rules directly in the form of parse trees. A good introduction to TAG is given by Joshi and Schabes in [13].

Definition 6 A Tree-Adjoining Grammar is defined as TAG = (NT, Σ, IT, AT, S), where

• NT are non-terminals.
• Σ are terminals.
• IT is a finite set of initial trees. In an initial tree, the root and interior nodes are non-terminals. Leaf nodes are either terminals, or non-terminals marked for substitution (we mark substitution nodes with a ↓).

6 'because I try to teach Jan to sing the song'
7 The formalisms have been shown to be weakly equivalent [35].


[Figure: two tree diagrams. Left: substitution, replacing a substitution node A↓ of a tree τ by an initial tree β with root A. Right: adjoining, inserting an auxiliary tree α at an inner node X of a tree ϑ, with the subtree γ of X moved down to the foot node X∗.]

Figure 16. Substitution (left) and adjoining (right) on a TAG (taken from [13, Fig. 2.2]).

• AT is a finite set of auxiliary trees. An auxiliary tree is an initial tree with exactly one special leaf node: the foot node (marked with an asterisk ∗), which must have the same label as the root node.
• S is a distinguished non-terminal (starting symbol).

The operations defined on a TAG are substitution and adjoining (see Fig. 16):


• Any substitution leaf A↓ of a tree τ (i. e. any non-terminal leaf besides the foot node in any tree) may be substituted by any initial tree β with root A, i. e. the A↓ is replaced by β. This operation is similar to substitution in a CFG.
• To any inner node X of a tree ϑ an auxiliary tree α with root X may be adjoined, i. e. the inner node X and its descendants (the subtree γ) are removed from ϑ and substituted to the foot node X∗ of α (they are attached to α there), and afterwards the resulting tree α′ is attached to ϑ at the original position of the inner node X. In other words: the auxiliary tree α is inserted into ϑ at some inner node X, shifting the subtree of X down to the foot node.

This second operation brings in the additional expressiveness of TAGs. A TAG defines not a string language but a tree language. If we interpret the trees as strings by traversing them, we see that adjoining manipulates a string in an intricate way: a part of the string becomes surrounded by new strings to the left and to the right. We could as well say that adjoining is like substituting at the same time on the left and on the right of some middle part. This allows us to express all 6 cases of crossing constraints from the PickBook/PushBook example above. We define a spatial TAG (see [16]):

Definition 7 A Spatially Constrained Tree-Adjoining Grammar is defined as SCTAG = (TAG, R, SR, GC, NLC), where

• TAG = (I, B, IT, AT, S) is a TAG defined over intentions I and behaviors B.
• R is a set of regions.
• SR is a set of spatial relations, where each relation ρ ⊆ R × R.
• GC ⊆ (IT ∪ AT) × R is a set of grounding constraints.
• NLC is a set of spatial non-local constraints. Each constraint has a type from the spatial relations SR and is defined for two nodes in one tree from IT ∪ AT.
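Before turning to the concrete SCTAG rules, note that the nested/crossing distinction that separates the CFG-expressible pick/push sequences from the others can be checked mechanically. A small sketch, with an encoding of the push orders that is our own:

```python
# Sketch: deciding whether the Pick/Push dependencies of a behavior sequence
# are nested (CFG-expressible) or crossing (mildly context-sensitive).

def crossing(pushes):
    """Books are picked in order 1, 2, 3, ...; two dependencies cross iff
    some pair of books is pushed back in its original (non-reversed)
    relative order."""
    return any(a < b for i, a in enumerate(pushes) for b in pushes[i + 1:])

# The six push orders from the example (the picks are always 1 2 3):
cases = [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
nested = [i + 1 for i, c in enumerate(cases) if not crossing(c)]
# `nested` collects exactly the sequences a CFG can express
```

Running this confirms that only sequences 3, 5 and 6 are nested, matching the discussion above.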

Figure 17 shows a possible ruleset of an SCTAG for the PickBook/PushBook example. To have concise trees we abbreviated PickBook to P and PushBook to D. SelectBooks is represented by A. As before the indices indicate identical regions, so e. g. in tree β the Pi has to be in the same spatial region as Di . Figure 18 shows how these rules may be applied to derive the given sequence which could not be expressed by a CFG ruleset. It also shows how the spatial regions are instantiated during tree assembly.
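The adjoining operation itself can be sketched on simple nested-list trees. The list encoding, the trailing-'*' foot-node marker, and the example labels are our own conventions, not the chapter's notation:

```python
# Sketch of TAG adjoining on trees encoded as nested lists [label, child...].
# The single foot node of an auxiliary tree carries a trailing '*' on its
# label (and must, by definition, share the root's label).
import copy

def adjoin(tree, path, aux):
    """Adjoin auxiliary tree `aux` at the inner node addressed by `path`
    (a sequence of child indices): the subtree at that node moves down to
    aux's foot node, and the rewritten aux takes the node's place."""
    tree = copy.deepcopy(tree)
    node = tree
    for i in path[:-1]:
        node = node[i]
    subtree = node[path[-1]]
    aux = copy.deepcopy(aux)
    _replace_foot(aux, subtree)
    node[path[-1]] = aux
    return tree

def _replace_foot(aux, subtree):
    for i, child in enumerate(aux[1:], start=1):
        if isinstance(child, list):
            if child[0].endswith("*"):
                aux[i] = subtree  # attach the displaced subtree here
            else:
                _replace_foot(child, subtree)

# Adjoining an auxiliary tree with root A into an inner A-node surrounds the
# old subtree with new material on both sides:
alpha = ['S', ['A', ['t']]]
beta = ['A', ['P'], ['A*'], ['D']]
result = adjoin(alpha, [1], beta)
# result == ['S', ['A', ['P'], ['A', ['t']], ['D']]]
```

The leaf sequence of the result surrounds the old material t with P on the left and D on the right, which is exactly the string-level effect of adjoining described above.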


[Figure: the elementary trees of the SCTAG: an initial tree α, auxiliary trees β, γ and δ, and trees π (P → p) and σ (D → d), built over the non-terminals S, A (= SelectBooks), P (= PickBook) and D (= PushBook) and the terminals p, d and t; within each tree, the indexed substitution nodes Pi↓ and Di↓ must lie in the same spatial region.]

Figure 17. Elementary trees of an SCTAG that allows for crossing dependencies.

[Figure: the derivation in three steps: μ1 = β adjoined into α (yielding P1 t D1), μ2 = δ adjoined into μ1 (yielding P1 P2 t D1), μ3 = β adjoined into μ2 (yielding P1 P2 P3 t D1 D3), with the spatial regions instantiated during tree assembly.]

Figure 18. Deriving PickBook1 PickBook2 PickBook3 . . . PushBook1 PushBook3.

For parsing TAGs, an algorithm with polynomial worst-case (O(n^6)) and average-case complexity (based on the Cocke-Younger-Kasami algorithm) was proposed in [34]. An improved incremental TAG parser which decreases the average-case complexity was presented by Joshi [13]. It adopts the idea of the Earley parser, whereas the 'Earley dot' traverses trees and not strings. Besides the three original operations (scan, predict, complete), a fourth one (adjoin) is added to cover the corresponding TAG operation. By collecting the set of trees (starting with the initial tree) which fit the beginning of the behavior sequence and successively adjoining/substituting into them, the possible intentions at any point in time are available from the parser. A new behavior coming in may invalidate some trees (which are removed from the set) while the existing ones are further expanded by additional adjoinings/substitutions. Adding spatial constraints lowers the number of applicable adjoinings and substitutions and may invalidate additional trees, as spatial constraints give us more predictive information. 'Any algorithm should have enough information to know which tokens are to be expected after a given left context' [13, p.36]. Knowing the spatial context of left-hand terminals, we can throw away those hypotheses that are not consistent with the spatial constraints. This gives a further speedup


in processing. Taking the polynomial-time algorithms mentioned above as a basis, the modified algorithms for the corresponding spatially constrained grammars will thus also have polynomial time complexity.
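The incremental invalidation of parse hypotheses by spatial constraints can be sketched as a simple filter. The hypothesis representation here is our own drastic simplification of a chart parser state, for illustration only:

```python
# Sketch: pruning parser hypotheses with spatial predictions. Each hypothesis
# predicts the next expected behavior together with the regions in which it
# is allowed to occur; inconsistent observations invalidate the hypothesis.

def prune(hypotheses, observation):
    """Keep only hypotheses consistent with the observed (behavior, region)."""
    behavior, region = observation
    return [h for h in hypotheses
            if h["next_behavior"] == behavior and region in h["regions"]]

hyps = [
    {"rule": 8, "next_behavior": "standing", "regions": {"ShelfCS-4"}},
    {"rule": 12, "next_behavior": "standing", "regions": {"ReadingRoom"}},
]
# Observing standing in the reading room rules out the ExamineBook reading:
surviving = prune(hyps, ("standing", "ReadingRoom"))
```

Here only the BrowseBooks hypothesis (rule 12) survives, mirroring the disambiguation of the library example.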


6. Summary

Ambient environments are supposed to support users in doing their tasks. For this they have to know what the user currently wants. Besides explicit user interaction, such as pressing buttons, user intentions can often be inferred from the user's behaviors, allowing the environment to provide appropriate services. We have discussed a number of grammar formalisms for representing and interpreting high-level human intentions and behaviors in space. In general, rule-based formalisms are cognitively easier to understand than dynamic network based methods. This allows for easy configurability and explainability. Current commercial home automation systems [8] as well as empirical research [22] show that there is interest in ambient environments that allow configuration by the user. Users living in an ambient environment should not feel lost in a system they do not understand, but comprehend the system's behavior, and possibly also be able to modify it easily.

We have described typical patterns of spatial behavior and the grammar complexity we need for representing them. As the simplest formalism, we presented Intentional Systems, which are derived from CFGs, thus covering behavioral patterns of the form x^n y^n. The problem of ambiguity led us to their probabilistic counterpart, Probabilistic Intentional Systems, and Pynadath's extension, Probabilistic State-Dependent Grammars, with its arbitrarily complex state variable. In the next step, we considered space as a special form of context, and explained Spatially Grounded Intentional Systems, Spatially Constrained Context-Free Grammars, and Spatially Constrained Tree-Adjoining Grammars. All these spatial grammars exploit the fact that not every intention is possible at every place. Rules in these grammars are annotated with spatial constraints, and a spatial model is assumed. The connection of intentional knowledge and spatial knowledge makes these formalisms special.
The system designer should choose the formalism of lowest complexity with which the intentional-spatial patterns typical for his use case can still be expressed. We exemplified our grammars using a visitor's behavior in an ambient library. The connections of intentions and space we identified in this use case follow general intentional-spatial patterns which we can also find in a number of other use cases. For instance, in many domains the nesting of intentions and sub-intentions strictly follows the nesting of regions in a partonomy. This can be handled with an SGIS. If the use case requires long-ranging dependencies without nested child regions, an SCCFG becomes necessary. The return-to-region pattern, for instance, occurs if the user returns to a region r she has been to some time before, after having shown some behavior in regions that are disjoint from r. If these long-ranging dependencies may also overlap, we must choose the mildly context-sensitive SCTAG formalism (crossed return-to-region pattern). Figure 19 summarizes the three spatial grammars and gives short examples for the library use case.


SGIS: Nested dependencies, only the part-of relation; no cross-dependencies. Typical spatial intention pattern: sub-intentions are located in the same region as, or in sub-regions of, their parent intention. (Library example: a pick at SHELF CS-4, with SHELF CS-4 part-of BSECTION CS (Computer Science) part-of LIBRARY.)

SCCFG: Nested dependencies; no cross-dependencies (unless statically defined). Typical spatial intention pattern: the return-to-region pattern. (Library example: lock and unlock at the identical LOCKERSHELF 2 in the lockers room, with visits to the inner library in between.)

SCTAG: Nested dependencies and cross-dependencies. Typical spatial intention pattern: the crossed return-to-region pattern. (Library example: picks and drops at SHELF CS-1 and SHELF CS-4 in BSECTION CS, with crossing identity constraints between the shelves.)

Figure 19. A hierarchy of spatial grammars for mobile intention recognition.


7. Outlook

We have presented probabilistic grammars on the one hand, and spatial grammars on the other. No formalism that combines both spatial knowledge and probabilities exists yet. The probabilistic and state-dependent PSDIS formalism is not designed specifically for spatial knowledge, and does not catch mild context sensitivity. As argued in [17], making a PSDIS dependent on the region history would lead to an explosion of the state space. Our current research is concerned with a new formalism that combines SCTAG with probabilities in a way that keeps inference tractable. We are also going to develop a design approach for rule-based intention recognition systems that guides a system developer in finding and implementing the right formalism for a specific domain.

An issue that should be tested empirically is the idea of the user writing grammar rules for her own ambient environment. Are the formalisms cognitively appealing enough? Which tools can support non-computer-scientists in writing the rule sets? Studies with real users of ambient environments may also show whether self-created rule sets are too error-prone and whether users are willing to modify their rule sets at run time. Finally, one current goal in research on rule-based intention recognition is the building of large corpora of annotated behavioral data, which would make different approaches more comparable.

References

[1] D.W. Albrecht, I. Zukerman, A.E. Nicholson, and A. Bud. Towards a Bayesian model for keyhole plan recognition in large domains. In Proceedings of the Sixth International Conference on User Modeling (UM '97), pages 365–376. Springer, 1997.
[2] T.L. Booth and R.A. Thompson. Applying probability measures to abstract languages. IEEE Transactions on Computing, C-22(5):442–450, 1973.
[3] H.H. Bui. A general model for online probabilistic plan recognition. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2003.
[4] G. Chanda and F. Dellaert. Grammatical methods in computer vision: An overview. Technical Report GIT-GVU-04-29, College of Computing, Georgia Institute of Technology, Atlanta, GA, USA, November 2004. ftp://ftp.cc.gatech.edu/pub/gvu/tr/2004/04-29.pdf.
[5] E. Charniak and R.P. Goldman. A Bayesian model of plan recognition. Artificial Intelligence, 64(1):53–79, 1993.
[6] N. Chomsky. Aspects of the Theory of Syntax. MIT Press, 1965.
[7] J. Earley. An efficient context-free parsing algorithm. Communications of the ACM, 13(2):94–102, 1970.
[8] Embedded Automation Inc., Surrey, BC, USA. mControl v2 (Home Edition) User Manual, 2008.
[9] C.W. Geib and M. Steedman. On natural language processing and plan recognition. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), pages 1612–1617, 2007.
[10] M. Ghallab, D. Nau, and P. Traverso. Automated Planning: Theory and Practice. Morgan Kaufmann, 2004.
[11] R.C. Gonzalez and M.G. Thomason. Syntactic Pattern Recognition: An Introduction. Addison-Wesley, Reading, Massachusetts, USA, 1978.
[12] T. Jordan, M. Raubal, B. Gartrell, and M. Egenhofer. An affordance-based model of place in GIS. In Proc. 8th Int. Symposium on Spatial Data Handling, pages 98–109, Vancouver, 1998. IUG.
[13] A.K. Joshi and Y. Schabes. Tree-adjoining grammars. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 3, pages 69–124. Springer, Berlin, New York, 1997.
[14] D. Jurafsky and J.H. Martin, editors. Speech and Language Processing. Prentice Hall, Upper Saddle River, New Jersey, USA, 2000.
[15] H.A. Kautz. A Formal Theory of Plan Recognition. PhD thesis, University of Rochester, Rochester, NY, 1987.
[16] P. Kiefer. Spatially constrained grammars for mobile intention recognition. In C. Freksa, N.S. Newcombe, P. Gärdenfors, and S. Wölfl, editors, Spatial Cognition VI, pages 361–377, Berlin Heidelberg, 2008. Springer (LNAI 5248).
[17] P. Kiefer. SCTAG: A mildly context-sensitive formalism for modeling complex intentions in spatially structured environments. In AAAI Spring Symposium on Human Behavior Modeling, AAAI Technical Report SS-09-04, 2009. Accepted.
[18] P. Kiefer and C. Schlieder. Exploring context-sensitivity in spatial intention recognition. In Workshop on Behavior Monitoring and Interpretation, 30th German Conference on Artificial Intelligence (KI-2007), pages 102–116. CEUR Vol-296, 2007. ISSN 1613-0073.
[19] P. Kiefer and K. Stein. A framework for mobile intention recognition in spatially structured environments. In 2nd Workshop on Behavior Monitoring and Interpretation (BMI08), 31st German Conference on Artificial Intelligence (KI-2008), pages 28–41, 2008.
[20] W. Kuhn. Ontologies in support of activities in geographical space. International Journal of Geographical Information Science, 15(7):613–631, 2001.
[21] L. Liao, D.J. Patterson, D. Fox, and H. Kautz. Learning and inferring transportation routines. Artificial Intelligence, 171(5-6):311–331, 2007.
[22] J.M.V. Misker, J. Lindenberg, and M.A. Neerincx. Users want simple control over device selection. In Proceedings of the 2005 Joint Conference on Smart Objects and Ambient Intelligence, pages 129–134, New York, USA, 2005. ACM.
[23] A. Nijholt, T. Rist, and K. Tuinenbreijer. Lost in ambient intelligence? In Proc. ACM Conference on Computer Human Interaction (CHI 2004), pages 1725–1726, New York, USA, 2004. ACM.
[24] D.V. Pynadath. Probabilistic Grammars for Plan Recognition. PhD thesis, The University of Michigan, 1999.
[25] D.V. Pynadath and M.P. Wellman. Probabilistic state-dependent grammars for plan recognition. In Proceedings of the 16th Annual Conference on Uncertainty in Artificial Intelligence, pages 507–514, 2000.
[26] O. Rambow. Multiset-valued linear index grammars: imposing dominance constraints on derivations. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, pages 263–270, Morristown, NJ, USA, 1994. Association for Computational Linguistics.
[27] I.A. Sag and C. Pollard. Head-driven phrase structure grammar: An informal synopsis. CSLI Report 87-79, Stanford University, 1987.
[28] C. Schlieder. Representing the meaning of spatial behavior by spatially grounded intentional systems. In GeoSpatial Semantics, First International Conference, volume 3799 of Lecture Notes in Computer Science, pages 30–44. Springer, 2005.
[29] C. Schlieder and A. Werner. Interpretation of intentional behavior in spatial partonomies. In Spatial Cognition III, volume 2685 of Lecture Notes in Computer Science, pages 401–414. Springer, 2003.
[30] C.F. Schmidt, N.S. Sridharan, and J.L. Goodson. The plan recognition problem: An intersection of psychology and artificial intelligence. Artificial Intelligence, 11(1-2):45–83, 1978.
[31] M. Steedman. Dependency and coordination in the grammar of Dutch and English. Language, 61(3):523–568, 1985.
[32] A. Steinhage and C. Lauterbach. Monitoring movement behavior by means of a large area proximity sensor array in the floor. In 2nd Workshop on Behavior Monitoring and Interpretation (BMI08), 31st German Conference on Artificial Intelligence (KI-2008), pages 15–27, 2008.
[33] A. Stolcke. An efficient probabilistic context-free parsing algorithm that computes prefix probabilities. Computational Linguistics, 21(2), 1995.
[34] K. Vijay-Shanker and A.K. Joshi. Some computational properties of tree adjoining grammars. In Meeting of the Association for Computational Linguistics, pages 82–93, Chicago, Illinois, 1985.
[35] K. Vijay-Shanker and D.J. Weir. The equivalence of four extensions of context-free grammars. Mathematical Systems Theory, 27(6):511–546, 1994.
[36] M. Wooldridge. Reasoning About Rational Agents. MIT Press, 2000.

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-048-3-257


Model-based Inference Techniques for Detecting High-Level Team Intentions

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

Albert HEIN, Christoph BURGHARDT, Martin GIERSICH and Thomas KIRSTE
Dept. of Computer Science, University of Rostock, 18059 Rostock, Germany
e-mail: fi[email protected]

Abstract. Situation awareness is a critical prerequisite for providing proactive assistance in the context of Ubiquitous Computing. In this chapter we focus on the detection of high-level team intentions. First we introduce an exemplary scenario of a team meeting in a smart environment and outline relevant criteria for our approach. Then we give a short insight into the current state of the art, discussing related research projects while trying to find points of intersection for a more generic modeling approach. We examine team behavior and individual problem solving strategies against the background of social and cognitive psychology, and review sources of prior information such as domain knowledge and common sense to outline a basic generic model structure. We argue that probabilistic model-based approaches in particular are capable of efficiently and robustly inferring activities by making use of this knowledge. We take a closer look at models which are able to represent human behavior at different levels of abstraction, accompanied by a short explanation of their reasoning mechanisms for inference and prediction. Finally we look at the synthesis of such models using planning methods for behavior sequences.

Keywords. High-Level Intention Recognition, Probabilistic Models, Team Support

1. Introduction

The objective of this chapter is to investigate techniques for inferring high-level goals ("intentions") of individual users and teams from low-level sensor data generated by the physical actions of individuals. Specifically, we want to investigate how prior knowledge on human behavior – both at the level of individual behavior and at the level of group activities – can be exploited for building models for intention inference. We regard behavior as the primary conceptual notion on which to build an inference model. We will argue that behavior can be regarded as a goal-driven process, where user goals generate observable actions. Thus, we essentially have a process-based, generative modeling approach. Therefore, we will look at inference schemes that employ generative probabilistic models – more precisely, we will use dynamic Bayesian networks as the underlying paradigm for probabilistic modeling.


[Figure 1. SmartApplianceLab at University of Rostock (left) and a preliminary agenda (right) of a meeting. The agenda reads: Topic: Software Architecture Meeting, Smart Appliance Lab – 10:00 Presentation of Software Architect Alice; 10:05 Presentation of Software Architect Bob; 10:10 Presentation of Chief Architect Carol; 10:15 Discussion.]

One branch of ubiquitous computing research looks at supporting a user in performing activities by proactive assistance: by building smart environments that are aware of their inhabitants and react to their behavior without explicit interaction, such as Smart Homes [3], Smart Health Care environments [1], Smart Laboratories [26], or Smart Rooms [11]; or by building mobile assistants that support persons in performing mobile activities, such as helping persons with mild cognitive impairment follow a travel itinerary [21] or supporting elder care nurses providing out-patient care [27]. Providing proactive assistance requires the assistive system to understand what the user tries to achieve – ideally, it would have a notion of the user's intention (or "goal"), in order to provide the assistance which best helps the user in achieving this goal. For instance, consider a Smart Meeting Room Environment designed to incorporate inhabitant tracking and environment monitoring as well as occupancy schedule and meeting agenda retrieval. The room is equipped with sensing devices (e.g., RF-positioning sensors, motion sensors, light sensors) and actuator appliances (e.g., steerable projectors, motor screens, motor window blinds) that form an ad-hoc ensemble together with brought-in devices including notebooks and mobile projectors. In such a room, situated in the IT department of a company, a meeting of a software design group could be appointed (see Fig. 1, left). A typical scenario for proactive assistance located in such a room might read as follows: Chief architect Carol announces the meeting using the internal calendar management system of the company. With her announcement she provides an outline of agenda items that should be addressed during the meeting. The meeting may be structured like the agenda of Fig. 1, where first software architect Alice is to present her thoughts about an envisioned software design.
Then software architect Bob provides his presentation, and after that it is the turn of chief architect Carol to present. Shortly before the appointed meeting time, the two software architects Alice and Bob enter the Smart Meeting Room. Assuming all employees and visitors of the company wear identifiable RF-badges, the room immediately knows who is walking in. The calendar management system indicates that a meeting is about to begin. Hence, the ensemble goes into a meeting stand-by configuration where screens and projectors are prepared to provide their assistance. As chief architect Carol walks in, a short round of small talk starts, occupants walk to their seats and open their notebook computers. The notebooks add themselves dynamically to the ensemble and make the presentations of their owners available to the room. Then the meeting starts and, deviating from the preliminary agenda, Bob goes to the presentation stage to give his talk. The environment recognizes this deviation, infers that the team decided to bring forward Bob's presentation, and puts his presentation on the screen just before he enters the presentation stage. After Bob's contribution, the team turns back to the agenda and Alice presents. Finally, chief architect Carol moves to present. Every speaker is proactively provided with his or her particular presentation. At the end of the meeting the attendees grab their mobile appliances and leave the room. Now the remaining appliance ensemble in the room can go to an energy saving or re-calibration mode and rest until the next scheduled meeting.

Taking this scenario as the general setup for our discussion, a set of criteria can be identified that are relevant to the design of the desired system for Smart Meeting Rooms. The above scenario indicates that the envisioned Smart Meeting Room should be an open environment, where a dynamic ensemble of mobile and, of course, also built-in appliances forms the basis for the capabilities and features of the environment. The dynamic nature of this infrastructure implies that it cannot be foreseen at design or training stage, and that reliable training data would hardly be available. One consequence is that the proposed concept should pursue a training-free approach. Secondly, the meeting room is open to various groups of various sizes with various meeting practices. That is, the context, namely the inhabitant identities and the meeting agendas, changes from meeting to meeting, too. Furthermore, the dynamic character of the Smart Meeting Room leads to two other criteria for modeling proper intention recognition. An adequate model must allow easy changes or extensions of either the lexica of team activities or the size of teams (i.e., the number of team members) to allow flexible handling of the various team settings just mentioned. Another aspect is that the modeling approach has to deal with physical infrastructural constraints. Acting under the observation of, for instance, an audio-vision system may be found embarrassing. Additionally, a company deployment of audio-vision-based recognition may raise security concerns, because recorded confidential meeting content may be abused. Therefore we decided to base recognition on simple unobtrusive sensor data. In the case of the described scenario these are the 2D positions of each of the three team members, namely a six-dimensional feature vector of position data, which can be easily acquired with active RFID systems.
But the concept should also allow the usage of even simpler sensor hardware (e.g., proximity sensors). From this it follows that the proposed concept must enable robust recognition from simple, possibly sparse or noisy, sensor data. In summary, the relevant criteria for the team intention model are:

1. support of a training-free model setup,
2. capability of using various lexica of team activities (i.e., agendas),
3. allowance for easy extensions (e.g., to larger teams),
4. provision of robust recognition from sparse or noisy sensor data.

Additionally, as team intention analysis and prediction is used in the real-world surroundings of an assistive Smart Meeting Room, it is mandatory that the proposed concept provides online recognition and prediction, and even some kind of 'hindsight' labeling of team activities.

2. Current Approaches

When designing a system for recognizing human behavior, it is helpful to look at current approaches by different research groups to identify promising methods.


Opportunity Knocks
The objective of the Assisted Cognition in Community, Employment, and Support Settings (ACCESS) project [13] at the University of Washington was to enhance the quality of life of persons with cognitive disabilities through computer-based memory and problem solving aids. A major part of the efforts was focused on Opportunity Knocks – a prototype described by Patterson et al. [21] that logged location sensor data to recognize a user's mode of transportation and learn typical locations of activities. The system was built to support the memory of users from the target group by monitoring deviations from the usual daily routine, detecting a likely aberration, and providing guidance back on track. Liao et al. [14] and Patterson et al. [21] developed a hierarchical dynamic Bayesian network (DBN) which could infer a user's hidden goal (destination) from his or her current GPS position. The hierarchical DBN has a layered structure with the raw GPS data as the only observable nodes. Starting with the GPS data, the model inferred the current transportation mode, the trip segment, and finally the goal associated with this trip. The proposed hierarchical model is very successful in inferring the goal of the user from his or her trajectories and is therefore a good starting basis for further model design.

Multimodal Meeting Manager (M4)
Another project is the Multimodal Meeting Manager (M4), a project sponsored by the EU IST Programme. It was focused on the realization of a system to enable structuring, browsing and querying of an archive of automatically analyzed meetings. Zhang et al. [29] developed a DBN with a two-level structure, namely a player (i.e., single person or individual) level and a team level. A single person's activities were modeled as a conventional hidden Markov model (HMM). The activity states of all persons at a certain moment were parent nodes of the team state node and thus potentially influenced the team state.
In addition to those conditional parents, a switching node decided which particular activity was to affect the team activity. The team node itself in turn only influenced the activities of the individual persons in the next time step. This also implies that there is no direct dependence between a current team state and its previous state, but only the described two-level bidirectional influence. Zhang et al. estimated the team level as an aggregation of the behaviors of the individuals, where the contribution of a certain person's behavior was described by the distribution over the switching node variable. The probability distribution was automatically learned from data in an unsupervised fashion, and in the end this model with its influence values outperformed a method that quantified influence by the proportion of time during a meeting for which each participant spoke. Their work was a first step in the direction of modeling team interaction, which we want to use in our work.

CIGAR
A major challenge in goal recognition is that users often pursue several high-level goals in a concurrent and interleaving manner, where the pursuit of a goal may spread over different parts of an activity sequence. Hu and Yang [12] proposed the CIGAR (Concurrent and Interleaving Goal and Activity Recognition) framework that uses skip-chain conditional random fields (SCCRF) to learn the relations between activities and goals. Given an activity sequence, CIGAR can infer the probabilities of a goal.

In our work we seek to combine the hierarchical approach of Opportunity Knocks with the ability to recognize team intentions as in the M4 project. Recognizing multiple concurrent and interleaving goals is still part of our and others' ongoing research, but first projects like CIGAR show a promising route.


3. Sources of Prior Knowledge

In the previous section, we have briefly reviewed the approaches of other researchers for recognizing user or team activities from sensor data. We have seen that specific structures are utilized (e.g., the team–user hierarchy of M4), that the concept of goal-driven behavior is employed (Opportunity Knocks), and that persons can be regarded as parallel processes (cf. CIGAR). When looking at these different approaches, one wonders what the possible connections are, and whether a model could successfully be conceived that incorporates all of these ideas. Furthermore, we see that – at least partially – prior information on human behavior and problem solving strategies is used for defining probabilistic models (cf. Opportunity Knocks). In this section, we will explore both questions a little further. We will first look at possible sources of prior knowledge on human behavior that could be employed for simplifying the definition of behavior recognition mechanisms. Then we will outline a generic model structure that is able to incorporate these different sources of prior knowledge, as well as to integrate the different structural ideas developed in the above projects.

3.1. Social Team Behavior

While intention recognition in Smart Environments is an ongoing research domain in which the focus on teams rather than single users is not really established yet, psychology has long had a field – social psychology – that focuses on group behavior. Social psychology can be described as "the study of the manner in which the personality, attitudes, motivations, and behavior of the individual influence and are influenced by social groups" [16]. An extensive amount of literature on group behavior is already available in the social psychology area. Forsyth [7], who gives an overview of Group Dynamics, identifies properties and dynamics that all kinds of groups possess, and classifies the so-called "nature of groups" into six categories: interaction, interdependence, structure, goals, cohesiveness, and stage (team phases). A specific type of group is the team. A review of the literature on group behavior shows that:

1. if a group is a team in terms of collaboration, a joint objective, and thus goal-oriented acting of the team members, can be assumed;
2. joint goals (and individual member objectives) are negotiated according to a specific strategy which can be characterized as a social protocol;
3. the group interdependences and structures of a team give its members equal rights, so the behavior of the different members of a team can be modeled equally; and
4. sensor observation of the team is an adequate technique that, combined with a coding scheme (e.g., an a priori agenda), allows an objective and systematic recognition of team events.

In our further discussion we will primarily consider groups that exhibit team characteristics. Furthermore, additional and more concrete information can be extracted from agendas. An agenda (e.g., the one from Fig. 1, right) is conceptually a plan of actions to achieve a certain goal. First, it provides a temporal sequence of a set of group tasks which will probably occur during, e.g., the course of a meeting. Secondly, a person's name assigned to a task may refer to a special role of that person within the team. He very likely adopts this role if the team intends to process this specific task. That is, 'Presentation of Software Architect Bob' from the agenda example means that Bob has a presenter role and all other team members may adopt a listener role, but at least are not presenters at the same time. Finally, a specific agenda can be split into a kind of task hierarchy. It may consist of a set of related team tasks or actions that form a tree-like hierarchy, but at least splits into a set of quasi-parallel user action sequences of the team members.
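To make the agenda notion concrete, the preliminary agenda of Fig. 1 can be thought of as a temporally ordered task list from which individual roles are derived. The following is only an illustrative sketch: the names (`AgendaItem`, `roles`, the role labels) are hypothetical and not part of any system described in this chapter.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AgendaItem:
    start: str                 # scheduled start time
    task: str                  # team task, e.g. a presentation slot
    presenter: Optional[str]   # member holding the presenter role (None for joint tasks)

# The preliminary agenda of Fig. 1 (right) as a plan of actions.
agenda = [
    AgendaItem("10:00", "presentation", "Alice"),
    AgendaItem("10:05", "presentation", "Bob"),
    AgendaItem("10:10", "presentation", "Carol"),
    AgendaItem("10:15", "discussion", None),
]

def roles(item, members):
    """Derive individual roles from a team task: the assigned member presents,
    all others listen; joint tasks make everybody a participant."""
    if item.presenter is None:
        return {m: "participant" for m in members}
    return {m: ("presenter" if m == item.presenter else "listener") for m in members}

print(roles(agenda[1], ["Alice", "Bob", "Carol"]))
```

Note how the quasi-parallel user action sequences mentioned above fall out of this representation: each agenda item induces one role, and thereby one expected behavior, per team member.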
In addition to the aspects of group dynamics as considered in social psychology, there is also a growing body of research on the description and modeling of the hierarchical spatio-temporal structure of individual and collective motion behavior. (The chapter "Classifying Collective Motion" by Wood and Galton in this book gives an interesting review of this topic.) Combining such knowledge with background information on group dynamics will potentially enable us to reason about the collective intentions that give rise to specific patterns of motion behavior.

3.2. Individual Problem Solving Strategies

Cognitive psychology is "a branch of psychology concerned with mental processes (as perception, thinking, learning, and memory) especially with respect to the internal events occurring between sensory stimulation and the overt expression of behavior" [15]. The field of cognitive psychology is interested in describing the mental processes that occur between a stimulus and the related response. It is concerned with all human activities, rather than some portion of them, from a cognitive point of view. As claimed by Neisser [18], the question that cognitive psychology addresses is how a sensory input is transformed, reduced, elaborated, stored, recovered, and used by a human processor. Metaphors and terminology used in cognitive psychology are rather computational, and research is closely intermeshed with artificial intelligence research as a significant aspect of the interdisciplinary subject of cognitive science. Among others, the main categories in cognitive psychology are perception, memory, learning (with knowledge representation and language aspects), and thinking. Especially reasoning and problem solving, as part of the cognitive psychology subfield of thinking, promise insights into human behavior that are constructive with respect to the design of a team intention model. The fundamental statement here is that human reasoning and problem solving is goal oriented. People tackle a certain goal in a "divide & conquer" manner. Abstracting this behavior, they try to find an efficient transformation from an initial state to a desired goal state by subdividing the possibly composite activity into a set of atomic actions. The different models from cognitive research typically enable a hierarchical formulation of the individual user goal. A review of this research field shows that:

1. cooperative behavior of individuals can be modeled by hierarchical structures that reflect typical problem solving strategies;
2. the temporal sequence of certain activities is tied to observable preconditions and effects of the underlying actions; and
3. the knowledge for solving problems can be derived from perception, memory¹, or reasoning.

For an in-depth review of both social and cognitive psychological aspects of modeling team behavior, see [8].


3.3. Domain and Context Knowledge

Besides agenda information, which is usually integrated at team level, additional domain-specific knowledge influences the design decisions regarding an intention model at a lower level of abstraction. As individual problem solving is goal-based, this domain knowledge has to provide the basic activities needed to achieve such a goal. In our meeting scenario this would involve the knowledge that each single team member has a position of his own and can either sit or stand at a certain place for some time to listen or hold a presentation, or can walk around. If he moves, he does so along a certain path over a certain distance at a certain velocity, usually intending to go to some place, and so on. In addition to this domain knowledge, specific context information on, e.g., the physical environment, room topologies, personal attributes or utilized devices provides valuable input for designing an appropriate model. As this prior knowledge strongly depends on the particular application domain, it is clearly the most heterogeneous source of information and always has to be considered individually.

Generic Model Structure

Keeping in mind these sources of prior knowledge on human behavior, we can sum up three basic building blocks for a fundamental and generic model:

¹ These are, in the context of temporal probabilistic models: sensor observations, preliminary agenda knowledge, and history tracking.


• Team goal deliberation (Team Level)
• Individual goal deliberation (User Level)
• Action planning and execution (Activity Level)
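Read generatively, these building blocks say that a team goal constrains the goals of the individual members, which in turn select activities that finally produce sensor readings. A deliberately tiny top-down sketch of this idea for the meeting scenario follows; all names, activities, coordinates and the noise level are invented for illustration only.

```python
import random

# Hypothetical mapping tables for the levels of the model sketch:
# team goal -> member goals -> activity -> expected 2D position.
TEAM_GOALS = {
    "presentation(Bob)": {"Alice": "listen", "Bob": "present", "Carol": "listen"},
}
ACTIVITY = {"present": "stand_at_stage", "listen": "sit_at_table"}
POSITION = {"stand_at_stage": (1.0, 4.0), "sit_at_table": (3.0, 2.0)}

def observe(team_goal, noise=0.2, rng=random):
    """Generate one noisy 2D position reading per member for a given team goal."""
    readings = {}
    for member, goal in TEAM_GOALS[team_goal].items():
        x, y = POSITION[ACTIVITY[goal]]
        readings[member] = (x + rng.gauss(0, noise), y + rng.gauss(0, noise))
    return readings

random.seed(0)
print(observe("presentation(Bob)"))
```

Inference then runs this generative story in reverse: given the six-dimensional position vector, estimate the activities, the individual goals, and finally the team goal.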

[Figure 2. Schematic outline of the proposed team model, with four levels: Team Level (team goal deliberation), User Level (individual goal deliberation), Activity Level (action planning/execution), and Sensor Level (sensor observations). Social behavior is modeled jointly at team level, while for each single member problem solving strategies, activities and physical sensor observations are managed separately.]

The hierarchical structure we impose on these building blocks can be interpreted as a proposal for a specific cognitive architecture, providing three levels of control loops of increasing complexity: motor skills, individual reasoning, and group reasoning. Figure 2 shows a schematic outline incorporating these levels and adding a fourth level for physical sensor observations. This model makes explicit the fact that motor behavior is (at least sometimes) caused by user objectives, and that user objectives are "caused" by (rather: synchronized with) objectives of a larger social unit (such as a team). In this sense, the model we have outlined does indeed provide the capability to represent high-level behavior and high-level objectives, respectively. Systems such as Opportunity Knocks or M4 can be regarded as specific instances of this approach. So, with respect to its fundamental structure, we believe this model should be able to cover a wide range of applications for modeling and recognizing team behavior, beyond the limits of the meeting room domain. Furthermore, it should provide for the flexible integration of the prior knowledge on user activities, problem solving strategies, and social interaction patterns inherent to these application areas.

4. Team Intention Model

In the previous section we have identified the basic structure of a model for describing team behavior, and the different sources of prior knowledge on behavior we can use for populating this model. The objective of this section is to show how this knowledge can be turned into a mechanism for inferring high-level team goals from sensor data.

4.1. State estimation in state space systems

Inferring team activities from noisy and ambiguous sensor data is a problem akin to tasks such as estimating a vehicle's position from (noisy) velocity and direction information, or estimating a verbal utterance from speech sounds. The latter tasks are tackled by employing probabilistic inference methods, such as Kalman filters or hidden Markov models. We briefly recapitulate the basic concepts of Bayesian state estimation; more elaborate introductions can be found, for instance, in [23].

4.1.1. Probabilistic Inference

The general objective of probabilistic inference as considered here is to estimate the state of a system at time t, denoted by x_t, given a sequence y_{1:t} of system observations from time 1 to time t. (Here, y_{1:t} is a shorthand for the sequence y_1, y_2, ..., y_t.) x_t is an element of X, the system state space. In a probabilistic framework, the basic inference problem is to compute the distribution function p(X_t | Y_{1:t} = y_{1:t}); that is, the probability distribution of the state random variable X_t given a specific realization of the sequence of observation random variables Y_{1:t}. The probability of having a specific state x_t is then given by p(X_t = x_t | Y_{1:t} = y_{1:t}); we write this more briefly as p(x_t | y_{1:t}). It is now of interest to consider the additional information we have available for computing this value.


4.1.2. State Space Models and Graphical Models

In a model-based setting, we can assume that our model specifies the state space of the system, X. In addition, our model will provide information on the system dynamics: it will make a statement about which future state x_{t+1} ∈ X the system will be in if it is currently in state x_t ∈ X. In a deterministic setting, the system dynamics will be a function x_{t+1} = f(x_t). However, many systems (e.g., humans) act in a non-deterministic fashion; therefore, the model will in general only provide us with a probability of reaching a state x_{t+1} given a state x_t, denoted by p(x_{t+1} | x_t). This is the system model. The system model here is first-order Markovian: it assumes that the current state depends only on the previous state and on no other states. In addition, observations y_t will depend on the current system state x_t, again non-deterministically in general, giving the distribution p(y_t | x_t) as the observation model. For instance, if we are modeling the behavior of a vehicle, and the state x contains the true vehicle location x.loc, the observation – for instance, a GPS reading – may be a normally distributed random variable with x.loc as mean. Here, one would define p(y | x) = N(y; μ = x.loc, Σ = ...), with N denoting the normal distribution. Finally, we assume that we have a notion of what the initial state of the system may be, given by the distribution p(x_1) (this may be the uniform distribution, representing ignorance of the initial state).
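As a one-dimensional sketch of such an observation model, the likelihood p(y | x) of a GPS-like reading y can be computed from a normal density centered at the true location x.loc. The function name and the choice of σ below are illustrative assumptions, not part of any specific system.

```python
import math

def obs_likelihood(y, true_loc, sigma=1.0):
    """p(y | x) = N(y; mu = x.loc, sigma^2): density of a reading y given
    that the true location is true_loc (1D sketch of the observation model)."""
    return math.exp(-0.5 * ((y - true_loc) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Readings close to the true location are more likely than distant ones.
print(obs_likelihood(10.5, 10.0), obs_likelihood(13.0, 10.0))
```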



[Figure 3. A simple graphical model, the chain A → B → C (left), and a dynamic system with hidden state: states X_{t-1} → X_t → X_{t+1}, each emitting an observation Y_{t-1}, Y_t, Y_{t+1} (right).]
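The chain model on the left of Fig. 3 encodes the factorization p(A, B, C) = p(A) · p(B | A) · p(C | B), which implies p(C | A, B) = p(C | B). This can be checked numerically; the CPD numbers below are made up purely for illustration.

```python
# Made-up CPDs for the chain A -> B -> C with binary variables.
pA = {0: 0.6, 1: 0.4}
pB_given_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # pB_given_A[a][b]
pC_given_B = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}   # pC_given_B[b][c]

def joint(a, b, c):
    """p(A, B, C) = p(A) * p(B | A) * p(C | B)."""
    return pA[a] * pB_given_A[a][b] * pC_given_B[b][c]

def p_c_given_ab(a, b, c):
    """p(C | A, B), computed from the joint by conditioning on (A, B)."""
    norm = sum(joint(a, b, cc) for cc in (0, 1))
    return joint(a, b, c) / norm

# Conditional independence: p(C | A, B) equals p(C | B) for every value of A.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            assert abs(p_c_given_ab(a, b, c) - pC_given_B[b][c]) < 1e-12
print("p(C | A, B) = p(C | B) holds for the chain")
```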

The conditional independence assumptions can be written as a graphical model, where the nodes represent random variables and the arrows conditional dependence. Fig. 3, left, is a small example. Here, C is conditionally dependent on B, but not on A. This means p(C | A, B) = p(C | B). The probabilistic structure represented by such a model is completely defined by giving the conditional probability distributions (CPDs) for the model's nodes. The model here is completely defined by giving p(A), p(B | A), and p(C | B). The model asserts that the following equality holds for the joint distribution of the model's random variables: p(A, B, C) = p(A) · p(B | A) · p(C | B). The conditional independence assumptions between the state variables X_t and the observation variables Y_t are shown in Fig. 3, right.

4.1.3. Bayesian State Estimation


In a system such as the one given by Fig. 3 (right), recursive Bayesian estimation can be used for estimating p(xt | y1:t). The basic idea is: if we already know p(xt−1 | y1:t−1), i.e., the state distribution at the previous time point t − 1, and if we have a new observation yt, we can use this for computing an updated state estimate p(xt | y1:t). So, starting with p(x1) as initial prediction, we can iteratively process the observations y1, y2, . . . , yt. The estimation process consists of two steps:

• Prediction of the new state using the current state and the system model. Prediction is given by

p(xt | y1:t−1) = ∫ p(xt | xt−1) p(xt−1 | y1:t−1) dxt−1   (1)

• Correction of the predicted state using the new observation and the observation model:

p(xt | y1:t) = p(yt | xt) p(xt | y1:t−1) / p(yt | y1:t−1)   (2)

Here, p(yt | y1:t−1) = ∫ p(yt | xt) p(xt | y1:t−1) dxt.

Then, using p(x1) as initial prediction, the first step is to compute p(x1 | y1) using the correction step; afterwards, the estimation process alternates between computing the next prediction and the next correction.
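For a finite state space, the integral in Eq. 1 becomes a sum and Eq. 2 a pointwise product followed by normalization. A minimal sketch of this alternation, with an illustrative two-state model (the states, transition, and observation probabilities are our own example):

```python
# Discrete-state recursive Bayesian estimation (the hidden-Markov-model case).
states = ["moving", "stationary"]
p_init = {"moving": 0.5, "stationary": 0.5}                       # p(x_1)
p_sys = {"moving": {"moving": 0.8, "stationary": 0.2},            # p(x_t | x_{t-1})
         "stationary": {"moving": 0.3, "stationary": 0.7}}
p_obs = {"moving": {"displacement": 0.9, "no_displacement": 0.1}, # p(y_t | x_t)
         "stationary": {"displacement": 0.2, "no_displacement": 0.8}}

def predict(belief):
    # Eq. 1 as a sum over the previous state
    return {x2: sum(p_sys[x1][x2] * belief[x1] for x1 in states) for x2 in states}

def correct(prior, y):
    # Eq. 2: reweight by the observation likelihood, then normalize
    post = {x: p_obs[x][y] * prior[x] for x in states}
    norm = sum(post.values())  # this normalizer is p(y_t | y_{1:t-1})
    return {x: v / norm for x, v in post.items()}

belief = correct(p_init, "displacement")  # first correction on p(x_1)
for y in ["displacement", "no_displacement", "no_displacement"]:
    belief = correct(predict(belief), y)  # alternate prediction / correction
print(belief)
```

After two consecutive "no_displacement" observations the belief mass shifts towards the stationary state, as one would expect.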

[Figure 4 diagram: timelines for filtering, prediction, fixed-lag smoothing, fixed-interval smoothing, and MAP explanation]
Figure 4. Basic inference tasks in temporal probabilistic models. Shaded segments are the intervals for which observations are available. Arrows pointing up represent the time steps for which the system states are inferred. T denotes the length of a complete data sequence, t is the current time, h represents the prediction horizon, and ℓ is the time lag that inference is behind current time. (Source: Adapted from [17, pg. 3])

4.1.4. Inference Tasks


The above state estimation process gives an estimate of the current state at time t by using the observations up to time t. This is also known as filtering. If we continue to perform prediction steps beyond time t, e.g., up to a time t + h (omitting correction after time t), this is known as prediction (with horizon h). If we have observations up to time T and we are using this full observation sequence y1:T for estimating the state at time t < T, this is known as smoothing (sometimes called "hindsight"). Smoothing produces better results than filtering, but clearly cannot be used in real-time settings where state estimates have to be produced as data arrives. (Smoothing is required for parameter estimation from training data when employing the Expectation Maximization algorithm.) Finally, while smoothing allows one to determine the most probable state for each time t ≤ T, one is also interested in the most probable state sequence. Smoothing may produce a state sequence that is not a possible system history, as it averages over all possible histories; the MAP sequence is the most probable individual history explaining the observations (see also Fig. 4 for the different inference tasks).

4.1.5. Inference Strategies

Unfortunately, Eq. 1 and Eq. 2 do not have an exact solution in general. Exact solutions are available, for instance, if the state space is finite (here, algorithms for hidden Markov models can be used), or if the system model and observation model are linear Gaussian; then the Kalman filter can be used. Non-linear, non-Gaussian systems require the use of approximate inference methods such as grid-based filters (e.g., discretization of the state space) or sampling methods, such as particle filters.
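To illustrate the sampling approach, here is a minimal bootstrap particle filter for a made-up one-dimensional non-linear system; the dynamics, noise levels, and observation model are our own example, not taken from the chapter:

```python
import math
import random

def f(x):
    # non-linear system model mean (illustrative)
    return x + math.sin(x)

def likelihood(y, x, sigma=0.5):
    # unnormalized Gaussian p(y | x); normalization cancels in the weights
    return math.exp(-0.5 * ((y - x) / sigma) ** 2)

def particle_filter_step(particles, y):
    # 1. prediction: propagate every particle through the system model
    particles = [f(x) + random.gauss(0, 0.2) for x in particles]
    # 2. correction: weight particles by the observation likelihood
    weights = [likelihood(y, x) for x in particles]
    total = sum(weights)
    if total == 0.0:  # degenerate case: fall back to uniform weights
        weights, total = [1.0] * len(particles), float(len(particles))
    weights = [w / total for w in weights]
    # 3. resampling: draw a new population proportional to the weights
    return random.choices(particles, weights=weights, k=len(particles))

random.seed(0)
particles = [random.uniform(-2, 2) for _ in range(500)]
for y in [0.5, 1.2, 1.9]:                   # incoming observations
    particles = particle_filter_step(particles, y)
estimate = sum(particles) / len(particles)  # posterior mean estimate
print(round(estimate, 2))
```

The posterior mean tracks the observation sequence; with more particles the approximation to the exact filter of Eq. 1 and Eq. 2 improves.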
Note that the inference approach here is generative: for estimating p(x | y), the system state given the observation, we use knowledge of how the system generates new states from a current state (given by the system model), and how observations are generated from the system state (given by the observation model). Bayes' theorem readily gives the required basic translation:

p(x | y) = p(y | x) p(x) / p(y)   (3)

From this we see that a generative approach effectively requires modeling the joint distribution p(x, y) = p(y | x) p(x). In contrast, a discriminative approach would directly model just p(x | y). There is some debate about which kind of approach makes more efficient use of training data. One statement is that discriminative models achieve a lower asymptotic error bound, but that generative models reach their (typically higher) error bound faster [19] (although more recent investigations show that the situation is not so clear [28]). That is, with little training data, a generative approach (built on prior knowledge of the behavior of the generating process) is able to provide better estimates. In addition, a discriminative approach cannot readily be used for prediction, as it does not incorporate a model of the system dynamics, and prior knowledge of system dynamics cannot easily be incorporated into such a model. The ability for prediction also means that a generative model can be used for simulation, which in turn means that a generative model can readily be employed in simulation-based inference schemes such as particle filtering. If a generative model correctly reflects the system dynamics, it provides better performance with limited training data than a comparable discriminative model.


4.2. Modeling Team Dynamics

Given (noisy and intermittent) sensor readings of the team members' positions in a meeting room, we are interested in inferring the team's current objective, such as having a presentation delivered by a specific team member, or having a round-table discussion, a break, or the end of the meeting. In Section 3 we have outlined that several sources of prior knowledge on the system dynamics of a team as a whole, as well as of the individual users, do in fact exist. In Fig. 2, we have identified the major modeling levels that should be considered when investigating the activity of a team. The objective of this section is to show how these layers can be mapped to a probabilistic model of a team's system dynamics, which can then be used for the above inference tasks. A straightforward step is to provide a refined model for the hidden state X based on the identified modeling levels: this state is decomposed into components reflecting the team state, T, the user state, U, the state of user actions, A, and the observable sensor data Y. As the basic concept of team operation, we consider a simple two-phase prepare-act model operating on an agenda of joint objectives. The team will negotiate an objective (such as having a presentation by a team member), then it will prepare for this objective (e.g., the presenter moves to the stage, the listeners move to their seats), then it will perform the action of this objective (e.g., the presenter gives his presentation). Afterwards, the team will enter the next prepare-act cycle until all agenda items are finished. It is the job of the T node to manage this two-phase "protocol" as well as the selection of agenda items, conditional on whether users have finished their activities for the current phase.
There is a mutual dependency between a team and its members within a time slice: whether a team adopts a new objective depends on how far the members have got in achieving their current goal, and the new goal a team member adopts depends on the team’s objective. To resolve this mutual dependency between the


Figure 5. Deriving the basic probabilistic structure of the team intention model

random variables "User State" and "Team State", we have split the user state into the variables U, representing the completion state of the current goal, and G, the goal adopted for the next time interval. So, T depends on U, and G depends on T. Whether the current goal has been achieved depends on the state of the current user action, A, which models activities such as following a route from the stage to a seat. Finally, the observation Y depends just on the state of the user action. So we arrive at the basic probabilistic structure outlined in Fig. 5, which states that X = (A, U, T, G) and that p(Y | X) = p(Y | A). As a further refinement, we have to consider that each team consists of several members a, b, c, . . . . So for each team member there is an individual set of A, U, G, Y nodes, as outlined in Fig. 6. In addition, we see that a user's action at time t depends on the previous action, if this is being continued, as well as on the previous new goal (the current goal), if this has been set. Besides the current state of activity, the goal completion also depends on the current goal. Using V(i) as a shorthand for the list of variables V(a), V(b), V(c), . . . , we arrive at the state model X = (A(i), U(i), G(i), T). If we let M be the set of team members, we get the following decomposition for p(Xt | Xt−1), based on the conditional independence defined by Fig. 6:

p(Xt | Xt−1) = p(At(i), Ut(i), Gt(i), Tt | At−1(i), Ut−1(i), Gt−1(i), Tt−1)
             = p(Tt | Ut(i), Tt−1) × ∏i∈M [ p(At(i) | At−1(i), Gt−1(i)) × p(Ut(i) | At(i), Gt−1(i)) × p(Gt(i) | Tt, Tt−1, Gt−1(i)) ]   (4)

In addition, we have the following equation for the observation model:

p(Yt | Xt) = ∏i∈M p(Yt(i) | At(i))   (5)


Figure 6. Two-sliced dynamic Bayesian network (DBN) modeling team intention inference. It shows the intra-slice dependencies between observable (double contoured) and hidden variables, as well as the inter-slice dependencies between consecutive states.

The next step is the definition of the individual factors. In these definitions, we use the following notational conventions:

• I(P) denotes the indicator function for a proposition P. It is defined by

I(P) = 1 if proposition P is True, 0 otherwise   (6)

We sometimes use the indicator function as a probability distribution. p(X | B) = I(X = B + 1) would define p(X | B) as a probability distribution that assigns a non-zero probability to only one value of X, namely when X = B + 1. So, p(3 | 2) = 1 and p(3 | 6) = 0. If sampling from p(X | 4), one would deterministically get 5 as result. Indeed, one would directly use the assertion X = B + 1 to compute the sample deterministically from a given value for B.

• Random variables may have several named slots. We write V.s to access slot s of variable V. The notation V ⊕ {s = v} denotes a variable that has the same slot values as V except for s, where it has the value v.

Using these conventions, a statement such as

p(Tt | Tt−1, Ut(i)) = I(Tt = Tt−1 ⊕ {phase = Achieve})   (7)

means: the probability of observing a value of Tt given Tt−1 and Ut(i) will be 1 if the values of all slots of Tt's realization are identical to the slot values of Tt−1's realization, except for the slot phase, which Tt must map to the value Achieve. The probability of observing any other value for Tt is 0. (So, in this example, the realization of Tt deterministically follows from the realization of Tt−1.)
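Read this way, the conventions translate almost directly into code. A small sketch, assuming dict-valued realizations with slots as keys (the representation is our own choice):

```python
def I(p):
    # indicator function of Eq. 6: maps a truth value to 0 or 1
    return 1 if p else 0

def replace_slot(V, **updates):
    # V ⊕ {s = v}: a copy of V with the named slots overwritten
    W = dict(V)
    W.update(updates)
    return W

T_prev = {"phase": "Prepare", "goal": "PresentA",
          "agenda": ("PresentB", "Discussion"), "history": ()}
T = replace_slot(T_prev, phase="Achieve")

# The CPD I(T_t = T_{t-1} ⊕ {phase = Achieve}) assigns probability 1 to
# exactly this realization and 0 to everything else:
print(I(T == replace_slot(T_prev, phase="Achieve")))  # 1
print(I(T == T_prev))                                 # 0
```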


p(Tt | Tt−1, Ut(i)) =
    if ∀i ∈ M : Ut(i).done = True then
        if Tt−1.phase = Prepare then
            return I(Tt = Tt−1 ⊕ {phase = Achieve})
        else
            return I( Tt.phase = Prepare
                      ∧ Tt.history = Tt−1.history ∪ {Tt−1.goal}
                      ∧ Tt.agenda = Tt−1.agenda \ {Tt.goal} )
                   × pselect(Tt.goal | Tt−1.agenda, Tt−1.history)
    else
        return I(Tt = Tt−1)

Figure 7. Computation of p(Tt | Tt−1, Ut(i))

4.2.1. Overall team behavior

The computation of p(Xt | Xt−1) by evaluating the different factors of Eq. 4 is performed in the following order, directly implied by the direction of the conditional dependencies:


• First, the members have been busy with their activities during the time that has passed since the previous time slice. This is captured by computing the new state of the action variables, p(At(i) | At−1(i), Gt−1(i)).
• Next, the members individually decide if they have achieved their subgoals. This is given by p(Ut(i) | At(i), Gt−1(i)).
• Then the team negotiates to switch between the phases (preparing / performing), possibly picking up a new joint goal, p(Tt | Ut(i), Tt−1).
• Finally, if a phase switch has occurred, the members select new subgoals based on the team goal and phase, p(Gt(i) | Tt, Tt−1, Gt−1(i)).

This procedure mimics the specific social protocol we assume to be in effect for coordinating the member actions in our team. Different social protocols, for instance protocols that allow team members to continue preparing (e.g., finding their seat) while the perform phase has already started (e.g., while the lecture has begun), require other procedures.

4.2.2. The team variable T

The team variable T and its associated CPD p(Tt | Ut(i), Tt−1) encode the team's joint decision-making strategy regarding when to switch phases and how to select new goals from the agenda. The team variable contains the following slots:

• agenda, the list of goals yet to be achieved by the team.
• goal, the team's current goal.
• phase, the current execution phase (possible values: Prepare or Achieve).
• history, the set of goals already achieved.

We assume that a probability distribution function pselect(goal | agenda, history) exists, defining the probability that the team decides to achieve goal next, given agenda and history. Furthermore, the U variables are required to have a slot done that contains the value True if the user has finished his current subgoal. Based on this, the computation of p(Tt | Tt−1, Ut(i)) can be defined as outlined in Fig. 7. From the viewpoint of a procedural behavior description, this definition can be read as follows: (i) If all users have achieved their subgoals and the team is currently preparing for a goal, the team will switch to the achieve phase. (ii) Otherwise, if all users have finished their subgoals and the team is currently in the achieve phase, the team goal has been achieved. The current goal is moved to the history, a new team goal is selected from the agenda, and the team switches phase to preparing for this goal. (iii) Otherwise, if some users are still working on their subgoals, the team remains in its current phase with the current team goal. Clearly, the team operates highly deterministically; the only randomness is introduced in the selection of the next goal.
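A procedural reading of this team-variable update as a sampler, using a dict-based slot representation; the uniform stand-in for pselect and all helper names are our own:

```python
import random

def p_select_sample(agenda, history):
    # uniform stand-in for sampling from p_select (illustrative only)
    return random.choice(agenda)

def team_step(T_prev, done_flags):
    if all(done_flags):                   # every member finished its subgoal
        if T_prev["phase"] == "Prepare":  # case (i): switch to achieve phase
            return {**T_prev, "phase": "Achieve"}
        # case (ii): goal achieved; archive it and prepare for the next one
        agenda = list(T_prev["agenda"])
        goal = p_select_sample(agenda, T_prev["history"])
        agenda.remove(goal)
        return {"phase": "Prepare", "goal": goal, "agenda": agenda,
                "history": T_prev["history"] + [T_prev["goal"]]}
    return dict(T_prev)                   # case (iii): members still busy

random.seed(1)
T = {"phase": "Achieve", "goal": "PresentA",
     "agenda": ["PresentB", "PresentC", "Discussion"], "history": []}
T = team_step(T, [True, True, True])
print(T["phase"], T["history"])  # Prepare ['PresentA']
```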


4.2.3. pselect: Managing the Agenda

The definition of pselect (together with the precise models for history) governs a team's strategy for sequentially discharging agenda items. We assume that agenda holds a list of objectives to be achieved by the team, and that the order of objectives identifies the preferred execution order. So, an agenda A, B, C, D would state that the favored execution sequence is indeed first A, then B, etc. However, real teams in the real world may deviate from whatever agenda they have initially set up. Therefore, we have to assign non-zero probabilities to selecting items out of sequence. In addition, the probability of which item is selected next may depend not only on the issues still open, but also on the items already discharged (as well as their execution order). Finally, the team may decide to redo action items or to take up action items not initially planned. A very rough model is simply to assign constant probabilities to the different options for selecting an action item, as outlined below:

pselect(g | a, h) =
    pfollow         if g is the next item on agenda a
    pdeviate / na   if g is some other item on agenda a
    predo / nh      if g is some item already on history h
    padhoc / nc     if g is some other possible goal with g ∉ a ∪ h
(8)

Here na, nh, and nc are the numbers of items eligible for the respective cases. (I.e., na is the number of items left on the agenda without the next item, etc.) Other, more refined models are possible, such as computing the set of possible execution histories and then assigning probabilities for moving from one execution history to a possible successor history, as outlined in Fig. 8. The question here, of course, is how to specify such an elaborate model efficiently, and whether the achievable payoff is worth the increased modeling effort. (To this we do not yet have a decisive answer.)
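Eq. 8 transcribes directly into code; the base probabilities and the `universe` of conceivable goals below are illustrative choices of our own:

```python
# Base probabilities for following the agenda, deviating, redoing,
# and picking up an ad-hoc item (illustrative values summing to 1).
P_FOLLOW, P_DEVIATE, P_REDO, P_ADHOC = 0.7, 0.15, 0.05, 0.10

def p_select(g, agenda, history, universe):
    adhoc = [x for x in universe if x not in agenda and x not in history]
    if agenda and g == agenda[0]:
        return P_FOLLOW                        # next item on the agenda
    if g in agenda:
        return P_DEVIATE / (len(agenda) - 1)   # n_a: other agenda items
    if g in history:
        return P_REDO / len(history)           # n_h: items already done
    if g in adhoc:
        return P_ADHOC / len(adhoc)            # n_c: unplanned goals
    return 0.0

agenda, history = ["B", "C", "D"], ["A"]
universe = ["A", "B", "C", "D", "E", "F"]
probs = {g: p_select(g, agenda, history, universe) for g in universe}
print(probs["B"], probs["C"], probs["A"], probs["E"])  # 0.7 0.075 0.05 0.05
```

By construction the probabilities over the universe sum to one as long as each of the four cases is non-empty.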

[Figure 8 diagram: Markov chain over execution histories {A}, {B}, {C}, {A, B}, {A, C}, {B, C}, {A, B, C}, {A, B, C, D} with exemplary transition probabilities]

Figure 8. Markov model of the agenda-driven team activity selection process with exemplary transition probabilities.
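The history-transition view can be read as an ordinary Markov chain over sets of discharged agenda items; the transition probabilities below are illustrative, in the spirit of Fig. 8 but not copied from it:

```python
import random

# Transition model over execution histories (sets of discharged items).
TRANSITIONS = {
    frozenset(): {frozenset({"A"}): 0.9, frozenset({"B"}): 0.1},
    frozenset({"A"}): {frozenset({"A", "B"}): 0.9, frozenset({"A", "C"}): 0.1},
    frozenset({"B"}): {frozenset({"A", "B"}): 0.8, frozenset({"B", "C"}): 0.2},
    frozenset({"A", "B"}): {frozenset({"A", "B", "C"}): 1.0},
    frozenset({"A", "C"}): {frozenset({"A", "B", "C"}): 1.0},
    frozenset({"B", "C"}): {frozenset({"A", "B", "C"}): 1.0},
    frozenset({"A", "B", "C"}): {frozenset({"A", "B", "C", "D"}): 1.0},
}

def sample_history(start=frozenset()):
    # walk the chain until an absorbing history is reached
    h, trace = start, [start]
    while h in TRANSITIONS:
        nxt = TRANSITIONS[h]
        h = random.choices(list(nxt), weights=list(nxt.values()))[0]
        trace.append(h)
    return trace

random.seed(0)
trace = sample_history()
print([sorted(s) for s in trace])
```

Every sampled run ends in the complete history {A, B, C, D}; the branching earlier in the chain encodes the team's possible deviations from the preferred order.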

p(Gt(i) | Tt, Tt−1, Gt−1(i)) =
    if Tt.phase ≠ Tt−1.phase then
        return I(Gt(i).new = True) × psubgoal(Gt(i).goal | Tt.phase, Tt.goal, i)
    else
        return I(Gt(i) = Gt−1(i) ⊕ {new = False})

Figure 9. Computation of p(Gt(i) | Tt, Tt−1, Gt−1(i))

4.2.4. User goal selection behavior and G(i)


Subgoal selection by a team member can be modeled quite straightforwardly: if the team switches the execution phase (i.e., Tt−1.phase ≠ Tt.phase), all members have finished their previous subgoal, so in this case every member i can take up a new subgoal according to the member's identity and role, the execution phase, and the team's goal. If there is no phase switch, the current goal is maintained. The G variable contains two slots:

• goal, the subgoal selected by the team member.
• new, indicating that a new subgoal has been selected in this time slice.

The goal selection procedure is defined in Fig. 9. The core function is the CPD

psubgoal(G(i).goal | Tt.phase, Tt.goal, i),   (9)

which implements the user's reasoning process in selecting a goal. In principle, this could be an arbitrarily complex deliberation process. However, for our three-person meeting scenario it can be implemented fully deterministically by a simple lookup table, such as given in Table 1. Note that subgoal selection by a team member is conditionally independent of other team members' considerations given the joint team goal: any negotiation required between team members is assumed to have occurred in the T node. Also, note that the procedures for computing T and G are independent of the application domain (i.e., independent of the concrete set of team goals, team members, and member goals). It is the definition of psubgoal where the concrete domain first becomes visible.


Table 1. Deterministic mapping of team goals, execution phase, and member identity to subgoal.

Team goal Tt.goal | User i | Role      | Tt.phase = Prepare | Tt.phase = Achieve
A Presents        | a      | Presenter | Goto(Stage_a)      | Do(Talk_a)
A Presents        | b      | Listener  | Goto(Seat_b)       | Do(ListenTo_a)
A Presents        | c      | Listener  | Goto(Seat_c)       | Do(ListenTo_a)
B Presents        | a      | Listener  | Goto(Seat_a)       | Do(ListenTo_b)
B Presents        | b      | Presenter | Goto(Stage_b)      | Do(Talk_b)
B Presents        | c      | Listener  | Goto(Seat_c)       | Do(ListenTo_b)
C Presents        | a      | Listener  | Goto(Seat_a)       | Do(ListenTo_c)
C Presents        | b      | Listener  | Goto(Seat_b)       | Do(ListenTo_c)
C Presents        | c      | Presenter | Goto(Stage_c)      | Do(Talk_c)
Discussion        | a      | Panelist  | Goto(Seat_a)       | Do(Debate)
Discussion        | b      | Panelist  | Goto(Seat_b)       | Do(Debate)
Discussion        | c      | Panelist  | Goto(Seat_c)       | Do(Debate)
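A lookup table like Table 1 makes the deterministic psubgoal a simple membership check; the encoding below (tuples keyed by team goal and member) is our own, the goal vocabulary is the chapter's:

```python
# (team goal, member) -> (role, prepare subgoal, achieve subgoal)
SUBGOAL = {
    ("A Presents", "a"): ("Presenter", "Goto(Stage_a)", "Do(Talk_a)"),
    ("A Presents", "b"): ("Listener",  "Goto(Seat_b)",  "Do(ListenTo_a)"),
    ("A Presents", "c"): ("Listener",  "Goto(Seat_c)",  "Do(ListenTo_a)"),
    ("B Presents", "a"): ("Listener",  "Goto(Seat_a)",  "Do(ListenTo_b)"),
    ("B Presents", "b"): ("Presenter", "Goto(Stage_b)", "Do(Talk_b)"),
    ("B Presents", "c"): ("Listener",  "Goto(Seat_c)",  "Do(ListenTo_b)"),
    ("C Presents", "a"): ("Listener",  "Goto(Seat_a)",  "Do(ListenTo_c)"),
    ("C Presents", "b"): ("Listener",  "Goto(Seat_b)",  "Do(ListenTo_c)"),
    ("C Presents", "c"): ("Presenter", "Goto(Stage_c)", "Do(Talk_c)"),
    ("Discussion", "a"): ("Panelist",  "Goto(Seat_a)",  "Do(Debate)"),
    ("Discussion", "b"): ("Panelist",  "Goto(Seat_b)",  "Do(Debate)"),
    ("Discussion", "c"): ("Panelist",  "Goto(Seat_c)",  "Do(Debate)"),
}

def p_subgoal(goal, team_goal, phase, member):
    # deterministic CPD of Eq. 9: probability 1 for the table entry, 0 else
    role, prepare, achieve = SUBGOAL[(team_goal, member)]
    expected = prepare if phase == "Prepare" else achieve
    return 1.0 if goal == expected else 0.0

print(p_subgoal("Goto(Stage_b)", "B Presents", "Prepare", "b"))  # 1.0
```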

4.2.5. The user action variables A(i)

The user action variable A captures the state of a user's success in his current subgoal, and his strategy for planning how to go on achieving this goal. Here the concrete set of possible user goals and user actions becomes explicit. For the meeting scenario, we have set up a very simple domain: a team member may do two things:


• Walk from one place to another, in order to reach a specific location, as required for achieving goals of the type Goto(loc).
• Stay at a specific place for a certain amount of time, as mandated by goals of the type Do(duration).

The rationale for this very simple action model is based on the available sensor data: if we have only location tracking, the only point of interest is the location pattern associated with an action. So, there is no need to differentiate between being stationary because of standing at the stage giving a talk, and being stationary because of sitting on a seat listening to a talk. Arguments to Do, such as Talk_a or Debate, simply specify the expected duration of this action. Of course, this is arguably an overly simplified model: as soon as other sensor data becomes available, a finer differentiation of actions is required. An A(i) variable contains the following slots:

• pos, the user's current position.
• If trying to achieve a Goto(loc) goal:
  – path, the path leading to loc
  – distance, the distance already traveled on path
  – velocity, the current speed
• If trying to achieve a Do(duration) goal:
  – worktime, the time spent working on the goal


p(At | At−1, Gt−1) =
    if Gt−1.new then
        switch Gt−1.goal do
            case Goto(loc)
                return I( At.path = route(At−1.pos, loc)
                          ∧ At.distance = At.velocity × δt
                          ∧ At.pos = map(At.distance, At.path) )
                       × pvstart(At.velocity)
            case Do(duration)
                return I( At = At−1 ⊕ {worktime = δt} )
    else
        switch Gt−1.goal do
            case Goto(loc)
                return I( At.path = At−1.path
                          ∧ At.distance = At−1.distance + At.velocity × δt
                          ∧ At.pos = map(At.distance, At.path) )
                       × pvchange(At.velocity | At−1.velocity)
            case Do(duration)
                return I( At = At−1 ⊕ {worktime = At−1.worktime + δt} )


Figure 10. Computation of p(At | At−1 , Gt−1 )

In addition, the function route(from, to) represents the process of planning how to go from the location from to the location to. In our scenario, we have implemented this quite simply by providing a graph of possible locations in the meeting room, together with their connections (see Fig. 11); route(from, to) just computes the shortest path on this graph. The function map(d, p) takes a distance d and a path p and maps this to the point reached when traveling along p for the distance d. The user starts to move along a path with an initial velocity v distributed as pvstart(v); he changes his velocity to v′ while traveling with probability pvchange(v′ | v). δt is the absolute time passed since the last time slice. Based on these definitions, Fig. 10 gives the computation of the user action state variable. The process is straightforward: if we have changed to a new goal in the last time slice, we plan how to reach this goal (either by planning a route from the current position to the destination and picking an initial speed, or by starting to work on the stationary task); if we are continuing with an already established goal, we update the state either by the distance traveled since the last time slice, or by the amount of time spent working. (Note that in case a new goal has been selected at the end of time slice t − 1, the user has been working on this new goal during the time interval between slices t − 1 and t. Therefore, the state of the new action for the new goal has to reflect this already spent effort.)

4.2.6. The user goal achievement variables U(i)

Finally, we compute the success in achieving a goal. This computation, too, is based on the concrete application domain. If the user is traveling along a path, he has reached his goal when arriving at the destination.
If the user is working on a temporal activity with duration d for a time w, he has achieved this goal with probability pfinish(w | d), allowing for a deviation between the scheduled time and the time actually needed. The definition of the computation is given in Fig. 12.


Figure 11. The motion graph for the meeting scenario.
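A possible implementation of route() and map() over a motion graph like the one in Fig. 11; the concrete graph, node names, and edge lengths below are made-up stand-ins for the real meeting room:

```python
import heapq

# node -> {neighbour: edge length in metres} (illustrative graph)
GRAPH = {
    "Door":   {"Aisle": 2.0},
    "Aisle":  {"Door": 2.0, "Seat_a": 1.5, "Seat_b": 1.5, "Stage": 4.0},
    "Seat_a": {"Aisle": 1.5},
    "Seat_b": {"Aisle": 1.5},
    "Stage":  {"Aisle": 4.0},
}

def route(frm, to):
    """Shortest path on the location graph (Dijkstra)."""
    heap, seen = [(0.0, frm, [frm])], set()
    while heap:
        d, node, path = heapq.heappop(heap)
        if node == to:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nbr, w in GRAPH[node].items():
            if nbr not in seen:
                heapq.heappush(heap, (d + w, nbr, path + [nbr]))
    return None

def path_position(distance, path):
    """map(d, p) at node granularity: the last node reached after
    travelling `distance` along `path`."""
    travelled, pos = 0.0, path[0]
    for a, b in zip(path, path[1:]):
        if travelled + GRAPH[a][b] > distance:
            return pos  # still travelling on the edge a -> b
        travelled += GRAPH[a][b]
        pos = b
    return pos

p = route("Seat_a", "Stage")
print(p, path_position(3.0, p))  # ['Seat_a', 'Aisle', 'Stage'] Aisle
```

The chapter's map(d, p) returns a geometric point along the path; the node-granularity version here is a simplification sufficient for a sketch.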

p(Ut(i) | At(i), Gt−1(i)) =
    switch Gt−1(i).goal do
        case Goto(loc)
            return I( (At(i).pos = loc ∧ Ut(i).done = True)
                      ∨ (At(i).pos ≠ loc ∧ Ut(i).done = False) )
        case Do(duration)
            p ← pfinish(At(i).worktime | duration)
            return I(Ut(i).done = True) × p + I(Ut(i).done = False) × (1 − p)

Figure 12. Computation of p(Ut(i) | At(i), Gt−1(i))


We have now given definitions for all CPDs required to describe p(Xt | Xt−1). What remains is to define p(Yt | Xt).

4.2.7. Computing p(Yt | Xt)

As given by Eq. 5, the observation model factorizes into independent factors p(Yt(i) | At(i)), so it suffices to give the definition for observing an individual user. At(i) contains a slot pos, which we assume to hold the x and y coordinates of the user's position. Our sensors also give user positions as 2D coordinates. However, not all users will be observed in a time slice, so in addition to a pos slot, an observation Y(i) also contains a seen slot, indicating whether an observation for this user exists. So we arrive at

p(Yt(i) | At(i)) = ppos(Yt(i).pos | At(i).pos) if Yt(i).seen, and 1 otherwise   (10)

where ppos is the location sensor model, in the most simple case a bivariate normal distribution with, for instance, a unit matrix as covariance (if locations are given in meters), so we have:

ppos(ypos, apos) = N(ypos | μ = apos, Σ = 1)   (11)

In general, in our experience, location sensors produce more outliers than are compatible with a normal distribution model, so it is better either to use a Gaussian mixture such as

ppos(ypos, apos) = (1 − poutlier) × N(ypos | μ = apos, Σ = 1) + poutlier × N(ypos | μ = apos, Σ = 100 × 1)   (12)

or an explicitly heavy-tailed distribution such as the Cauchy distribution. (One would in general use training data for computing estimators for distribution parameters such as covariance matrices.)
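An outlier-robust likelihood in the spirit of Eq. 12 can be sketched as a two-component isotropic Gaussian mixture; the outlier weight and the factor-100 variance below follow the equation, while the numeric test values are our own:

```python
import math

P_OUTLIER = 0.05  # illustrative mixture weight for the outlier component

def gauss2d(y, mu, var):
    # isotropic bivariate normal density with covariance var * I
    dx, dy = y[0] - mu[0], y[1] - mu[1]
    return math.exp(-0.5 * (dx * dx + dy * dy) / var) / (2 * math.pi * var)

def p_pos(y, true_pos):
    # Eq. 12: narrow inlier component plus a 100x wider outlier component
    return ((1 - P_OUTLIER) * gauss2d(y, true_pos, 1.0)
            + P_OUTLIER * gauss2d(y, true_pos, 100.0))

near = p_pos((0.5, 0.0), (0.0, 0.0))
far = p_pos((8.0, 0.0), (0.0, 0.0))  # 8 m off: an outlier reading
print(near > far, far > 0)           # True True
```

Under a pure unit-covariance Gaussian the 8 m reading would receive essentially zero likelihood and could derail the filter; the wide mixture component keeps its probability small but non-negligible.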


4.3. Evaluation

We have derived a generic model for a probabilistic description of team behavior, and we have given a simple concrete instantiation of this model from the application domain of meeting support. With respect to the application scenario, it is now of interest to analyze how well the model is able to correctly identify team behavior from sensor data. In the following, we give a brief account of some of our evaluations; a more complete presentation can be found in [8]. Based on our three-person meeting scenario, we have run a number of experiments for evaluating the precision of the model. We have recorded a data set of 20 three-person meetings with the agenda PresentA, PresentB, PresentC, Discussion and different execution orders (either according to the agenda, or with smaller or larger deviations). We then used position sensor data for the three team members (provided by a Ubisense indoor positioning system) for inferring the current team intention using the model described in the previous sections, employing both forward filtering and smoothing. A typical inference run for a meeting data set, together with the mean squared error (MSE), is shown in Fig. 13. It can be noted that, except for a short phase of uncertainty in estimating the right Prepare goal at the start of a prepare phase, the classification errors are only due to the lag between the true onset of a new phase and the recognition of this phase by the system. Over all 20 data sets, the average MSE using forward filtering is 9.1%; for smoothing it is 6.2%. Although the concrete application scenario is rather simple, we think these are promising results that justify further investigation of this modeling approach. However, our experiments also point to some potential areas of modeling errors.
There is quite a large variance in the MSEs produced (for the forward filter MSE we get σ = 0.045, μ = 0.091), the reason being that some data sets produce rather bad results, estimating a team member as constantly leaving the stage while in reality he is still presenting his lecture. Looking at the sensor data, we see that the user is indeed located outside the spatial area associated with the lecture position, already on a path leading back to a seat. The model has explained the sensor data by assuming the user to have stopped on the path back to his seat. There are of course immediate remedies for this problem: enlarging the lecturing region or refining the motion model. But this experience shows that simple things, such as movements of persons in a room, have quite elaborate in-


[Figure 13 plot: state index over time; top panel "Forward, mse = 0.0614", bottom panel "Smoothed, mse = 0.0423", each showing truth vs. estimate]

Figure 13. A sample run of the intention classifier, using filtering (top) and smoothing (bottom). The state numbers 1–4 represent the PresentA, PresentB, PresentC, Discussion team goals, in this order. State numbers 5 and above represent different prepare goals. In the underlying data set, the team has discharged the agenda items in reverse order. Note that when using forward filtering, the model shows some initial uncertainty about which of several possible Prepare goals is currently in effect at the start of such a goal; this uncertainty vanishes with hindsight.

ternal structure, which, if ignored, will allow an inference system to come up with surprising explanations for the observations. Building blocks for human behavior are required that encapsulate such internal structures of everyday behavior in a reusable fashion. Another interesting point regarding the model-based, generative approach is whether it is able to compensate for missing sensor data: how will the model behave if, say, the sensor for user b does not deliver data (although we know that b is a participant)? Here, we have investigated system behavior when removing the observations for a, b, or c, respectively. Our experiments show that in some configurations (team following the agenda in sequence), the lack of sensor data has virtually no effect. This is, after all, not a big surprise, as the system's prediction then coincides with reality: in this case, sensor data would only confirm the model's predictions. If there is no sensor data, the system will still make its predictions. On the other hand, if the team is behaving against expectation and the remaining sensor data is noisy, the recognition rate may drop by 20%. This shows both the potential benefit of a generative approach (its capability for prediction), and its danger in case the sensor data is not good enough to divert it from its conviction of how reality should look.

4.4. Discussion

In the preceding sections, we have shown how a hierarchical model of team behavior can be mapped to a generic probabilistic structure enabling the use of established probabilistic inference techniques (e.g., Bayesian filtering) for recognizing behavior, and we have shown how such a generic structure can be instantiated for

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,

A. Hein et al. / Model-Based Inference Techniques for Detecting High-Level Team Intentions


a concrete – albeit simple – application scenario. We now review some important aspects of this modeling approach.

4.4.1. The use of prior knowledge

In Section 3, we have discussed potential sources of prior knowledge that may aid the definition of a probabilistic model of team behavior. In the model we have defined above, the following aspects of prior knowledge have been integrated:

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

• Social protocol. This has been used to set up the specific form of Prepare/Achieve cycles (i.e., activity synchronization) and the negotiation strategy used by the team to agree upon the next objective (pselect).
• Human problem solving strategies. This aspect introduces the notion of a goal into the model and the use of a hierarchical task breakdown (psubgoal).
• Domain knowledge on user activities. Here we have the set of concrete member activities (Do, Move) that are relevant to the domain and information on how a member will perform such activities.
• Domain knowledge on the physical environment. The room topology has been used to identify the relevant locations and the possible motion paths of members.
• Explicit information provided by team members. The agenda provided by the team, indicating the (expected) members, the objectives, sequencing information, and the time allocated to specific goals, is the final source of prior information.

The above prior knowledge enables us to set up a probabilistic model that can be used without additional training data. This is an important point for highly variable application domains, where a system has to work in settings that have not occurred before. For instance, in our application scenario, a change in the room topology (e.g., having the meeting in a different room) or a different team size would immediately invalidate previous training data (i.e., location sightings for team members).

4.4.2. Procedural probabilistic behavior modeling

In our exposition, a model is not defined declaratively, but rather procedurally, in a state-based manner. Team members are individual processes that synchronize their activities in certain situations, where the team level provides the necessary synchronization facilities. This process-based, procedural approach has important consequences:

+ Arbitrary parallelism of member activities is automatically captured at the model level.
+ The model can directly be used for simulation (here, rather than computing the probability for a given realization of a state variable, we simply draw a realization from the state variable's CPD). Thus, sampling-based inference schemes (e.g., particle filtering) are immediately usable in cases where exact inference algorithms are no longer applicable.


(We note that already the simple concrete model we have given rules out exact inference: the path-based motion model is non-linear, so Kalman filters are not applicable; at the same time we have continuous location variables, so we cannot use Markov models. We come back to this in Section 5.)

− If we want the behavior described by our model to exhibit a specific property, we need to find a way that procedurally implements this property (rather than stating it declaratively). This can be tedious.

A very important aspect is the "language level" a model provides for defining behavior: one does not only want to be able to recognize high-level behavior, one would also like to describe behavior using a high-level language. Within the model we have presented, a first step in this direction is to provide variables for representing high-level states (U, G, T). But besides this, one would also like to be able to concisely express high-level dynamics. The "Prepare"/"Achieve" cycle is a specific example of such high-level dynamics, but it is hardwired in the model we have given. Likewise, the strategy for selecting agenda items based on the history represents such higher-level behavior. So we are looking for a "behavior description language" that provides building blocks for basic behavioral units and for composing complex behavior from simple behavior – and it is important that this language enables the generation of probabilistic models: it should also allow making statements about the probability of the modeled behavior. Languages enabling the high-level description of probabilistic models are an active area of research – see for instance [22,20,5] – and we expect such concepts to provide valuable tools for building a high-level behavior modeling language. In Section 6 we outline some of our own experiments in this direction.
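The simulation property mentioned above (drawing realizations instead of computing probabilities) can be illustrated with a few lines of procedural code. The sketch below samples complete Prepare/Achieve traces from a toy team process; the goal names, the uniform agenda-selection rule, and the self-transition parameter are illustrative stand-ins, not the chapter's actual pselect:

```python
import random

def sample_meeting(agenda, p_self=0.7, rng=random):
    """Sample one realization of the Prepare/Achieve cycle: repeatedly
    draw a remaining agenda item (a uniform stand-in for p_select),
    prepare it, then achieve it; self-transitions with probability
    p_self make both phases durative."""
    remaining, trace = list(agenda), []
    while remaining:
        goal = rng.choice(remaining)       # draw from the selection CPD
        remaining.remove(goal)
        for phase in ("Prepare", "Achieve"):
            trace.append((phase, goal))
            while rng.random() < p_self:   # stay in the current phase
                trace.append((phase, goal))
    return trace

rng = random.Random(0)
trace = sample_meeting(["PresentA", "PresentB", "Discuss"], rng=rng)
```

Every sampled trace covers the whole agenda, and for each goal the Prepare phase strictly precedes the Achieve phase – the synchronization structure is preserved by construction, while the ordering and durations vary between runs.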


4.4.3. Unknown objectives

It is not always possible to compile a comprehensive set of user actions (and user goals and team goals) for a given application domain: unexpected team behavior should always be anticipated. This obvious contradiction can be tackled by adding a team goal "Unknown objective" that effectively mandates uniform distributions for the possible user locations (on the free floor space): if user locations cannot be associated with known team goals, the "Unknown objective" will be used. Eventually, location data collected under the label "Unknown objective" might be used for discovering new objectives, but a discussion of this topic is outside the scope of this chapter.

4.4.4. Why bother at all?

The exposition here has shown that modeling is not trivial. One wonders up to what level of complexity modeling can be driven, and what the benefit of all this modeling effort really is. Do we need a complex motion model? Do we need a highly elaborate agenda model? Or aren't there much simpler models that perform just as well? What we have done is to define a model structure that is able to accommodate several of the modeling efforts proposed by different researchers for the recognition of individual user and team activities. So we provide an integrated view on the


different modeling aspects that might need to be taken into account when building a system for intention inference. Once this generic model is instantiated for a specific application domain, a significantly more compact model description might be found by hardwiring the instantiation into the model structure itself. In the next section we will show that this is indeed possible, providing us with a very compact and simple probabilistic structure for our application scenario. The benefit of a generic model is that it provides scenario-independent building blocks for defining team behavior, and by this a universal starting point for the development of specific applications. Furthermore, it furnishes the conceptual notions for mapping different applications to a common description framework, thereby establishing the basic prerequisite for integrating heterogeneous models into a common intention recognition architecture. This at least is what should be the aim of a generic model – and we admit that we have not yet proven to what extent the proposal we have presented lives up to this challenge. But we hope our proposal is a step in the right direction.
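The "Unknown objective" fallback of Section 4.4.3 amounts to a simple Bayesian comparison: each known goal predicts user positions via a density around its expected spot, while the unknown goal contributes a uniform density over the free floor area. A minimal sketch, with invented coordinates, floor area, and variance, and equal goal priors assumed for simplicity:

```python
import math

def gauss2d(pos, mean, sigma):
    """Isotropic bivariate normal density."""
    dx, dy = pos[0] - mean[0], pos[1] - mean[1]
    return (math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
            / (2 * math.pi * sigma * sigma))

def goal_posterior(pos, goal_spots, floor_area, sigma=0.5):
    """Posterior over known goals plus an 'Unknown objective' that
    assigns a uniform density 1/floor_area to any position."""
    lik = {g: gauss2d(pos, spot, sigma) for g, spot in goal_spots.items()}
    lik["Unknown"] = 1.0 / floor_area
    z = sum(lik.values())
    return {g: l / z for g, l in lik.items()}

spots = {"PresentA": (1.0, 1.0), "Discussion": (4.0, 3.0)}  # invented layout
on_stage = goal_posterior((1.0, 1.0), spots, floor_area=30.0)
far_away = goal_posterior((10.0, 10.0), spots, floor_area=30.0)
```

A sighting near a known spot is explained by that goal; a sighting far from every known spot, where all Gaussian likelihoods are negligible, falls through to "Unknown" – exactly the catch-all behavior described above.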


5. Reducing complexity

In the previous section we have outlined a generic probabilistic structure that is able to handle quite complex models of team activities. However, we have instantiated this generic structure with a rather simple domain model. One may ask how this domain model affects the effective probabilistic structure. It would be an important point for efficient inference if the effective complexity of the team model, once instantiated with a concrete application domain, allowed, at least for some domains, both exact and efficient inference algorithms. For the meeting scenario, all slots of the state variables have finite domains (with the exception of pos, distance, velocity, and worktime, contained in the A variable). Therefore, all these variables can be folded into a single discrete and finite state space variable X. If we could somehow get rid of the continuous variables, we would be able to effectively map the probabilistic model for the team scenario to a simple hidden Markov model. If we are not really interested in the precise user positions while achieving a Goto(loc), we might be satisfied with the assertion that the sensor will observe the user somewhere on the route between source and destination, for the duration it typically takes to follow this route. So we remove the complete facility for user location tracking (pos, distance, velocity, path, route, map) and replace it with a sensor model that is only interested in which spatial region observations will arrive while a user is performing a specific action. For instance, we can represent each motion path by a Gaussian mixture that covers the region containing the positions where we will observe the user when on this path. For our meeting scenario, we have computed such mixtures from training data created by recording several of these meetings, yielding the mixtures shown in Fig. 14 (left), together with the bivariate normals representing stationary places.
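A minimal stand-in for such a path mixture can be built with hand-placed waypoints instead of mixtures fitted from recorded meetings (all coordinates and the variance below are invented). An observation near the path then receives a much higher density than one away from it:

```python
import math

def gauss2d(pos, mean, sigma):
    """Isotropic bivariate normal density."""
    dx, dy = pos[0] - mean[0], pos[1] - mean[1]
    return (math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
            / (2 * math.pi * sigma * sigma))

def path_mixture(waypoints, sigma=0.4):
    """Cover a motion path by equally weighted Gaussians centred on its
    waypoints -- a crude stand-in for mixtures fitted from recordings."""
    w = 1.0 / len(waypoints)
    return lambda pos: sum(w * gauss2d(pos, m, sigma) for m in waypoints)

# Hypothetical door-to-stage path along the x-axis:
density = path_mixture([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)])
```

Evaluating `density` at an on-path point versus an off-path point gives the likelihood ratio that lets the folded model decide whether a sighting is consistent with a given motion action.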
Then, we define the observation model by simply mapping each state in X to an expected observation region for each user – Fig. 15 shows some of these states and their associated observation regions. If Tt.phase = Achieve, users are


Figure 14. Modeling motion paths by Gaussian mixtures (left); An extremely simple location model (right)

Figure 15. Team states and observation regions

stationary and this mapping is straightforward: PresentA, for instance, is the folded state where Tt.phase = Achieve and Tt.goal = PresentA. For Tt.phase = Prepare, where users change locations, the mapping is a little more involved, as the user motion of course depends on the previous locations of the user. So for each combination Last Goal Achieved / New Goal to Prepare, there is a separate folded state. For instance, PresentA.PresentB represents the state Tt.phase = Prepare and Tt.goal = PresentB where the previous goal has been PresentA. If we include an explicit Exit goal that must be the last one to achieve, we arrive at five folded Achieve-phase states and 16 folded Prepare-phase states. These 21 states make up the state space of a hidden Markov model; a rough treatment of durative actions can be added by self-transitions with appropriate probabilities. Folded Prepare-phase states have only one external transition: to the corresponding Achieve-phase state. Achieve-phase states have transitions to four


of the Prepare-phase states (e.g., PresentA transits to PresentA.PresentB, PresentA.PresentC, PresentA.Discuss, and PresentA.Exit). The probability of a transition from an Achieve-state x to a Prepare-state x.y can be computed by creating the set of all possible histories H via pselect and then counting the frequency of x.y states in all h ∈ H (weighted by h's probability), relative to the number of x states. These massive simplifications lead to a very small model that can be handled very efficiently using standard algorithms for hidden Markov models. Of course, durative actions are modeled at a very coarse level, history-awareness has been much reduced, and location tracking has vanished. Such a model does not perform too badly when smoothing is used. However, the lack of sufficiently faithful temporal knowledge makes it comparatively error-prone when sensor data is missing. It is possible to incorporate temporal information without leaving the HMM framework: duration can be modeled as a delay line, a sequence of states each having a defined self-transition probability. The number of time slices required to pass through a sequence of n states with a self-transition probability of p has a negative binomial distribution, which is a good option for modeling durative actions (see for instance [2] for additional details). Adding duration information noticeably increases the number of states in the Markov model, but also produces significantly better results in situations where forward filtering or prediction is required. So in total, we see that for specific application domains, and by sacrificing some state information (i.e., precise user location), a rather complex generic model can be compiled into a very compact and efficient description.
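The delay-line claim is easy to check by simulation: traversing n chained states, each with self-transition probability p, takes a number of time slices distributed as a negative binomial with mean n/(1 − p). The Monte Carlo sketch below (illustrative parameters, not the chapter's model) confirms the mean:

```python
import random

def passage_time(n_states, p_self, rng):
    """Time slices needed to traverse a delay line of n_states states,
    each with self-transition probability p_self."""
    t = 0
    for _ in range(n_states):
        t += 1                      # the slice that enters the state
        while rng.random() < p_self:
            t += 1                  # extra slices from self-transitions
    return t

# 5 states with p = 0.6: expected mean duration 5 / (1 - 0.6) = 12.5 slices.
rng = random.Random(1)
samples = [passage_time(5, 0.6, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
```

Tuning n and p thus gives direct control over both the expected duration and its spread, which is what makes the delay-line encoding a practical duration model inside a plain HMM.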
There is an interesting point with respect to the training data required for building the distribution functions representing user whereabouts: it would clearly be a disadvantage if, after furnishing a new meeting room, a number of "teach-in" meetings were required to train the room on likely user motion patterns. However, possible motion paths can be generated automatically from a floor plan, as outlined by Fig. 11. And, if necessary, a detailed floor plan can be automatically generated by, e.g., a small robot that is exploring the room [24] or by a wired sensor network installed in the floor [25]. When detailed information cannot be made available, symbolic spatial reasoning is a possible alternative, which replaces quantitative position information by symbolic position information, such as "user is at region A", "region A is connected to region B", etc. This concept is proposed for instance in [10]. We found that it is even possible, in some scenarios, to use an extremely rough location model: either a person is at an interesting place (seat or stage), which are represented by spherical normal distributions with, say, 1 meter standard deviation, or the user is "moving from somewhere to somewhere else", represented by a large-variance normal covering the complete room (better still would be a uniform distribution covering just the free floor space). We have used this location model (shown in Fig. 14 at right) for tracking our three-person meeting scenario with the effect that the Achieve-phase activities could still be identified correctly. (Necessarily, Prepare-phase classification was no longer possible, as the location model now provides no information on the possible destination of a moving user.) The moral: in simple situations, simple models suffice.
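The coarse location model of Fig. 14 (right) amounts to a maximum-likelihood choice between a few 1-m spherical normals at interesting places and one broad normal covering the whole room. A sketch with invented coordinates and parameters:

```python
import math

def gauss2d(pos, mean, sigma):
    """Isotropic bivariate normal density."""
    dx, dy = pos[0] - mean[0], pos[1] - mean[1]
    return (math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
            / (2 * math.pi * sigma * sigma))

def classify_sighting(pos, places, room_center,
                      place_sigma=1.0, room_sigma=5.0):
    """Most likely label for a position: one of the interesting places
    (1-m spherical normals) or 'moving' (broad normal over the room)."""
    scores = {name: c for name, c in
              ((n, gauss2d(pos, c, place_sigma)) for n, c in places.items())}
    scores["moving"] = gauss2d(pos, room_center, room_sigma)
    return max(scores, key=scores.get)

places = {"seatA": (1.0, 1.0), "stage": (5.0, 1.0)}  # invented layout
```

A sighting right next to a seat is labeled with that place, while a sighting on the open floor, poorly explained by every place-normal, falls through to "moving" – which is all the information the coarse Achieve-phase classifier needs.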


6. Model Synthesis

6.1. The role of generic planning

The model we have outlined in the previous sections can be interpreted as a specific hierarchical planning model, although the individual planning steps are rather simple (e.g., table lookup).


• A team plans how to break down an agenda of objectives into a sequence of individual joint activities for achieving these objectives. Indeed, the Prepare/Achieve cycle is a specific instantiation of such a planning process: it is a hardwired planner based on the assumption that individual objectives on the agenda do not depend on each other, while in order to achieve an objective, some preconditions have to be established by the Prepare phase.
• A user plans which goal to pursue in order to support the team in performing a joint activity (G). In our model we have hardwired the goals with a very simple table lookup (Table 1).
• A user plans how to achieve a specific goal, such as reaching a certain location (A). As described in Section 4.2.5, we have a floor plan and simply compute the shortest route to the goal location.

Hardwiring the Prepare/Achieve cycle into the model by table lookup is a top-down modeling approach. We have researched tools and methods to facilitate the generation of models of human behavior by describing the plans with ConcurTaskTrees (CTT) [9]. But even for simple scenarios this can become very tedious, as the designer has to anticipate the user behavior for every possible meeting. If we have no information about the meeting structure, we can only construct an ergodic model where the transition probabilities between all states are greater than zero. Such a flat model would be rather useless: it does not make any useful predictions and therefore it is not able to disambiguate between two activities that may produce the same observations. An alternative formalism for modeling human behaviour in a top-down manner is probabilistic grammars, as described in the chapter "Rule-based intention recognition from spatio-temporal motion track data in ambient assisted living" by Kiefer et al. in this book.

6.2. Generating meeting sequences with partial order planning

The second approach to describing the human behavior in a smart meeting room in a probabilistic manner is a bottom-up approach. We need an activity catalog in which we describe an ontology of the basic human actions and their dependencies. The main idea is to describe the behavior of the humans as planning operators, e.g. in STRIPS [6]. We first proposed this approach in [4]. For our work, we use the STRIPS formalism for defining possible user actions. From this definition, we compute the possible action sequences that can occur


in our environment and transform them into the transition function for a generative model. This means we employ planning as a means for generating the possible sequences of actions that might occur in the real world, in order to relieve a human designer of this step. Going back to the small scenario from the introduction, the team members have three actions that they can execute. In the prepare phase the users move throughout the room:

(:action move
 :parameters (?who ?from ?to)
 :precondition (at ?who ?from)
 :effect (and (not (at ?who ?from)) (at ?who ?to)))

Such a STRIPS operator can be read as follows: there is an action move that takes three parameters: ?who – the user that wants to move, ?from – the user's current position, and ?to – the user's destination. The precondition of move (i.e., the state that must hold in order for move to be executable) is that the user ?who must currently be at location ?from. The effect of the move action is that afterwards the user is no longer at ?from, but rather at ?to. When the users have reached their destinations, they can start their work, e.g. giving a presentation. In order to begin with the presentations, all other users have to reach their seats, where they can listen:


(:action present
 :parameters (?who)
 :precondition (and (at ?who stage)
                    (forall (?x - user) (or (= ?x ?who) (at ?x seat))))
 :effect (has-presented ?who))

After all presentations, a discussion is scheduled. The discussion operator can be described as:

(:action discuss
 :precondition (forall (?x - user) (at ?x seat))
 :effect (have-discussed))

That is: before the discussion can commence, all team members must take a seat. The devices and persons that are present during the meeting would be given as the initial state to the planner. The goal (in our scenario, the meeting) is the set of conditions that need to be achieved. Expressed in STRIPS, this looks as follows: all persons have given a presentation and the team has discussed.

(:goal (and (forall (?p - person) (has-presented ?p))
            (have-discussed)))
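The three operators can be grounded in a tiny forward planner that enumerates, by depth-first search, every loop-free action sequence achieving the goal. The sketch below is a deliberately simplified stand-in (two users, two locations per user), not the full implementation for the 300–400 operator problem:

```python
def enumerate_plans(users):
    """Depth-first forward search over the move/present/discuss
    operators; returns every loop-free action sequence reaching the
    goal 'all users have presented and the team has discussed'."""
    start = (tuple(sorted((u, "seat") for u in users)), frozenset(), False)
    plans = []

    def successors(state):
        pos, presented, discussed = state
        loc = dict(pos)
        for u in users:  # move: between the user's seat and the stage
            other = "stage" if loc[u] == "seat" else "seat"
            new_loc = dict(loc)
            new_loc[u] = other
            yield (("move", u, loc[u], other),
                   (tuple(sorted(new_loc.items())), presented, discussed))
        for u in users:  # present: u on stage, everyone else seated
            if (loc[u] == "stage" and u not in presented
                    and all(loc[v] == "seat" for v in users if v != u)):
                yield (("present", u), (pos, presented | {u}, discussed))
        if all(loc[u] == "seat" for u in users) and not discussed:
            yield (("discuss",), (pos, presented, True))

    def dfs(state, plan, seen):
        pos, presented, discussed = state
        if presented == frozenset(users) and discussed:
            plans.append(plan)
            return
        for action, nxt in successors(state):
            if nxt not in seen:  # keep the search loop-free
                dfs(nxt, plan + [action], seen | {nxt})

    dfs(start, [], {start})
    return plans

plans = enumerate_plans(["a", "b"])
```

Every enumerated plan contains a present action for each user and one discuss action, with the required seating preconditions enforced along the way; the union of these plans is the action graph discussed below.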

Figure 16. A possible meeting sequence that is generated by a partial order planner, given the described operators.

The result of a partial order planning process is a directed acyclic graph representing one of the possible execution histories of a meeting (see Figure 16). Comparing the plan to the previously described model, it can be seen that the Prepare/Achieve phases emerge automatically from the operator description. The union of all valid plans is a complete directed acyclic graph that can be translated into the transition function of the generative model. We implemented a forward planner that uses depth-first search to build this complete directed acyclic graph. While the planner is not suitable for all types of planning problems, the description of human behavior in a meeting room is a relatively small planning problem with about five to ten parametrized operators. After variable substitution this expands to 300 to 400 atomic operators. Therefore a complete iteration through the search space is still possible. The automatically generated transition model is a "flat" model where all transition probabilities are equal, determined by the number of available branches in the graph (Figure 16). In order to make this model more intelligent, we have two options. We can use prior information and annotate specific actions with priorities that encode common sense about the probabilities of actions; for example, we could assign higher probabilities to useful actions like "stand" or "sit" than to undirected and less goal-oriented actions like "wander around". The second option is to fuse the resulting flat model with other prior information, like our agenda information, in order to prefer Present A over Present B or C. Clearly, all these planning and reasoning processes, for which we have given very simple instantiations, could be modeled by generic multi-agent planning algorithms.
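The translation from the merged plan graph to a flat transition function, optionally biased by action priorities, can be sketched in a few lines (the node names below are illustrative, not the actual grounded operators):

```python
def transition_matrix(dag, priorities=None):
    """Turn a merged plan DAG into a flat transition function: each
    outgoing edge gets probability proportional to its target's
    priority (uniform when no priorities are given)."""
    priorities = priorities or {}
    trans = {}
    for node, succs in dag.items():
        w = [priorities.get(s, 1.0) for s in succs]
        z = sum(w)
        trans[node] = {s: wi / z for s, wi in zip(succs, w)} if succs else {}
    return trans

# Hypothetical fragment of the merged plan graph:
dag = {"start": ["moveA", "moveB"], "moveA": ["presentA"],
       "moveB": ["presentB"], "presentA": [], "presentB": []}
flat = transition_matrix(dag)                    # uniform branching
biased = transition_matrix(dag, {"moveA": 3.0})  # agenda prefers A first
```

With no priorities, each branch at `start` gets probability 0.5; annotating `moveA` with priority 3 shifts that branch to 0.75, which is exactly the kind of common-sense or agenda-based biasing described above.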
However, there clearly is a continuous tug of war between the desire to model reasoning by, e.g., a complete partial order planner and the wish to hardwire scenario-specific social behavior models into a probabilistic model. As shown, it is not too difficult to define an ontology of possible team member actions, from which all possible plans for how a team may achieve a given agenda can be generated by a planning algorithm. But the question of course is: which of the generated sequences will be observed for real teams? If all plans have to be considered, combinatorial explosion will make this approach intractable for non-trivial scenarios. If plans are expanded only up to the next time step, pruning away plans that no longer agree with sensor data, the capability for prediction is significantly reduced. Again, the truth lies in the middle: a combination of generic planning, to keep the amount of modeling reasonable, with hardwired behavior, to reliably capture typical patterns of human behavior, providing a heuristic for expanding possible action sequences into the future.


7. Summary

The objective of this chapter has been to develop a reusable model for inferring high-level intentions of teams of cooperating users. The approach we propose is based on the utilization of prior knowledge on human and group behavior that can be cast into a procedural description of team activities. Here, we have identified the different potential sources of prior knowledge and mapped them to a hierarchical model of goal deliberation and activity control, which can be interpreted as a layered control architecture of a multi-agent system. We have shown that such a model can be used as a generative probabilistic model, enabling the use of Bayesian state estimation for filtering, smoothing, and prediction tasks – especially the latter task allows an environment to anticipate future team behavior, enabling proactive assistance. Furthermore, the generative approach allows defining models that are able to operate in new situations without the need for training data. In addition, we have shown that, for concrete application domains, an instantiation of the generic model can be compiled into a very compact and simple probabilistic structure, enabling the use of simple and exact inference algorithms in these cases. Finally, we have outlined some options for synthesizing such a model in a bottom-up fashion, from a STRIPS description of the individual possible user actions.

Acknowledgements


This work has been partially supported by the German Research Foundation (DFG) as part of the graduate school 1424 MuSAMA and by the Landesforschungsschwerpunkt Mecklenburg-Vorpommern MAike.

References

[1] S. Agarwal, A. Joshi, T. Finin, Y. Yesha, and T. Ganous. A pervasive computing system for the operating room of the future. Mob. Netw. Appl., 12(2-3):215–228, 2007.
[2] J. A. Bilmes. What HMMs can do. UWEE Technical Report UWEETR-2002-0003, Department of Electrical Engineering, University of Washington, Jan. 2002.
[3] B. Brumitt, B. Meyers, J. Krumm, A. Kern, and S. Shafer. EasyLiving: Technologies for intelligent environments. In Handheld and Ubiquitous Computing, pages 97–119, 2000.
[4] C. Burghardt, M. Giersich, and T. Kirste. Synthesizing probabilistic models for team activities using partial order planning. In KI'2007 Workshop: Towards Ambient Intelligence: Methods for Cooperating Ensembles in Ubiquitous Environments (AIM-CU), September 2007.
[5] S. K. Card. The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates.
[6] R. E. Fikes and N. J. Nilsson. STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2(3-4):189–208, 1971.
[7] D. R. Forsyth. Group Dynamics. Thomson Wadsworth, Belmont, CA, USA, 4th International Students edition, 2006.
[8] M. Giersich. Concept of a Robust Training-free Probabilistic System for Online Intention Analysis in Teams. PhD thesis, University of Rostock, 2009. To appear.


[9] M. Giersich, P. Forbrig, G. Fuchs, T. Kirste, D. Reichart, and H. Schumann. Towards an integrated approach for task modeling and human behavior recognition. In J. Jacko, editor, Human-Computer Interaction, HCII 2007, volume Part I, pages 1109–1118, Heidelberg, June 2007. Springer Verlag.
[10] B. Gottfried. Spatial health systems. pages 1–7, Nov. 29–Dec. 1, 2006.
[11] T. Heider and T. Kirste. Supporting goal based interaction with dynamic intelligent environments. In F. van Harmelen, editor, ECAI, pages 596–600. IOS Press, 2002.
[12] D. H. Hu and Q. Yang. CIGAR: Concurrent and interleaving goal and activity recognition. In D. Fox and C. P. Gomes, editors, AAAI, pages 1363–1368. AAAI Press, 2008.
[13] H. Kautz, L. Arnstein, G. Borriello, O. Etzioni, and D. Fox. An overview of the assisted cognition project. In AAAI-2002 Workshop on Automation as Caregiver: The Role of Intelligent Technology in Elder Care, 2002.
[14] L. Liao, D. Fox, and H. A. Kautz. Learning and inferring transportation routines. In D. L. McGuinness and G. Ferguson, editors, Proceedings of the Nineteenth National Conference on Artificial Intelligence, Sixteenth Conference on Innovative Applications of Artificial Intelligence, pages 348–353. AAAI Press, 2004.
[15] Merriam–Webster Medical Online Dictionary, 2008. Definition of "Cognitive Psychology" [online]. (Accessed: January 25, 2008.)
[16] Merriam–Webster Online Dictionary, 2008. Definition of "Social Psychology" [online]. (Accessed: December 7, 2007.)
[17] K. P. Murphy. Dynamic Bayesian Networks: Representation, Inference and Learning. PhD thesis, University of California, Berkeley, CA, USA, 2002.
[18] U. Neisser. Cognitive Psychology. Appleton-Century-Crofts, New York, NY, USA, 1967.
[19] A. Y. Ng and M. Jordan. On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes. In NIPS, pages 841–848, 2001.
[20] F. Paterno, C. Mancini, and S. Meniconi. ConcurTaskTrees: A diagrammatic notation for specifying task models. In INTERACT '97: Proceedings of the IFIP TC13 International Conference on Human-Computer Interaction, pages 362–369, London, UK, 1997. Chapman & Hall, Ltd.
[21] D. J. Patterson, L. Liao, K. Gajos, M. Collier, N. Livic, K. Olson, S. Wang, D. Fox, and H. Kautz. Opportunity Knocks: A system to provide cognitive assistance with transportation services. pages 433–450, 2004.
[22] A. Pfeffer. The design and implementation of IBAL: A general-purpose probabilistic programming language. The MIT Press, 2007.
[23] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach (Second Edition). Prentice Hall International, 1995. ISBN-13: 978-0130803023.
[24] R. Smith, M. Self, and P. Cheeseman. Estimating uncertain spatial relationships in robotics. In Robotics and Automation. Proceedings. 1987 IEEE International Conference on, volume 4, page 850, 1987.
[25] A. Steinhage and C. Lauterbach. Monitoring movement behavior by means of a large area proximity sensor array in the floor. In B. Gottfried and H. K. Aghajan, editors, BMI, volume 396 of CEUR Workshop Proceedings, pages 15–27. CEUR-WS.org, 2008.
[26] K. Thurow, B. Göde, U. Dingerdissen, and N. Stoll. Laboratory information management system for life science applications. Organic Process Research & Development, 8:970–982, 2004.
[27] T. Umblia, A. Hein, I. Bruder, and T. Karopka. Marika: A mobile assistance system for supporting home care. In MobiHealthInf 2009 – 1st International Workshop on Mobilizing Health Information to Support Healthcare-related Knowledge Work, 2009.
[28] J.-H. Xue and D. M. Titterington. Comment on "On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes". Neural Processing Letters, 28(3):169–187, 2008.
[29] D. Zhang, D. Gatica-Perez, S. Bengio, and D. Roy. Learning influence among interacting Markov chains. In NIPS, number 48, Martigny, Switzerland, 2005. IDIAP-RR 05-48.


Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-048-3-289


Learning about preferences and common behaviours of the user in an intelligent environment

Asier AZTIRIA a,1, Alberto IZAGUIRRE a, Rosa BASAGOITI a and Juan Carlos AUGUSTO b

a University of Mondragon, Spain
b University of Ulster, United Kingdom

Abstract. Intelligent Environments are supposed to act proactively, anticipating the user's needs and preferences in order to provide effective support. Therefore, the capability of an Intelligent Environment to learn the user's habits and common behaviours becomes an important step towards allowing an environment to provide such personalized services. In this chapter we explain and exemplify the importance of detecting patterns of behaviour; we propose a system which learns patterns of behaviour, as well as a complementary interaction system, based on speech recognition, which facilitates the use of such patterns in real applications.

Keywords. Intelligent Environments, pattern learning, interaction based on speech

Copyright © 2009. IOS Press, Incorporated. All rights reserved.

1. Introduction

Intelligent Environments (IE) are digital environments that proactively, but sensibly, assist people in their daily lives [1]. They offer an opportunity to blend a diversity of disciplines, from the more technical to the more human oriented, which at this point in history can be combined to help people in different environments: at home, in a car, in the classroom, in a shopping centre, etc. Ambient Assisted Living (AAL) [2] refers to the potential use of an IE to enhance the quality of life of people; that is, it emphasizes the applications of IE for healthcare and wellbeing in general. For example, it is well known that the majority of elderly people prefer to live in their own houses, carrying out an independent life as far as possible [3]. AAL systems aim at achieving that by:
• Extending the time people can live in their preferred environment.
• Supporting the maintenance of health and functional capability of elderly individuals.
• Promoting a better and healthier lifestyle for individuals at risk.
• Supporting carers, families and care organizations.

Let us consider two scenarios which illustrate potential applications of IE for AAL:

1 Corresponding Author. E-mail: [email protected]


Scenario 1: Michael is a 60-year-old man who lives alone and enjoys an assisting system that makes his daily life easier. On weekdays Michael's alarm goes off a few minutes after 08:00AM. He gets dressed, and on Tuesdays, Thursdays and Fridays approximately 10-15 minutes later he usually steps into the bathroom. At that moment the lights are turned on automatically. The temperature of the water in the shower is already set according to Michael's preferences (around 24-26 degrees in winter and around 21-23 degrees in summer). After he leaves the bathroom all blinds are opened. When he goes into the kitchen the radio is turned on so that he can listen to the headlines while he prepares his breakfast. Just before he has breakfast the system reminds him that he has to measure his blood pressure and heart rate, and these data are sent to his GP using the broadband internet connection. He has breakfast and in 15-20 minutes he leaves the house. At that moment all lights are turned off and safety checks are performed in order to prevent hazardous situations in his absence (e.g. checking whether the cooker has been left on), and if needed the house acts accordingly (e.g. turning the cooker off).
Scenario 2: Sarah is a 75-year-old woman who is frail and hence needs help in order to carry out some daily tasks such as dressing up or having a shower. Fortunately, she lives in a modern building where people live in independent apartments and share communal services of nursing care, rehabilitation, entertainment, and so forth. The staff members in that accommodation know (by means of previous reports generated by the intelligent environment) that Sarah usually likes to have a shower just after getting up, so that when her alarm goes off, nurses are ready to help her. In relation to her rehabilitation, Sarah has the freedom of choosing what type of exercises she wants to do.
Specialized staff members, after monitoring and detecting Sarah's preferences, design a personalized treatment much more suitable to her needs and preferences. At night, on Tuesdays and Thursdays she likes watching her favourite sitcom, so she goes to bed around 11:00PM, whereas on the rest of the days she goes around 10:00PM. Staff members are concerned about Sarah's recent behaviour because the system has detected that, although she takes her pills every day, she takes them just before having lunch, which is not desirable. Finally, doctors are concerned because there are some indications that she could be in the first stage of Alzheimer's disease. In order to confirm or rule out these suspicions they are going to check whether she carries out repetitive tasks in short periods of time or shows signs of disorientation (e.g. going back repetitively to places where she has been).
These scenarios show desirable environments that make the user's life easier and safer. It is clear that knowing the users' common habits and preferences gives either the environment (scenario 1) or the staff members (scenario 2) the opportunity to act more intelligently in each situation. Discovering these habits and preferences demands a previous task of learning. In an Intelligent Environment, learning means that the environment has to gain knowledge about the user's preferences, common behaviour or activity patterns in an unobtrusive and transparent way [4,5]. For example, the environment of scenario 1 had to learn Michael's morning habits, which are represented in Figure 1.

Figure 1. Example of a pattern

1.1. Advantages of learning patterns

Michael's example shows how the environment, knowing his most common behaviour, can act proactively in order to make his life easier and safer. In this case acting proactively means the environment can automatically turn the lights on and off, set the temperature of the water, open the blinds, turn on the radio and so on. Automation of actions and/or devices can be considered a positive side effect which can be obtained once the environment has learned his common habits and preferences.
In Sarah's case, patterns are not used to automate actions or devices; they are used to understand her behaviour and act in accordance with it. From Sarah's perspective, the staff members are always at the correct place at the correct time. For the staff members, in turn, knowing the habits of different patients allows them to organize their time efficiently as well as to give more personalized services to patients. The understanding of usual patterns also allows the detection of bad or unhealthy habits (e.g. taking pills just before lunch).
The learning of patterns is not merely an optional aspect which may bring some advantages to an intelligent environment; rather, we consider it an essential contribution to the idea that an environment can be intelligent. It supports an environment which adapts itself to its users in an unobtrusive way and one where the users are released from the burden of programming any device [6]. Therefore the ability to learn patterns of behaviour is of paramount importance for the successful implementation of Intelligent Environments.
The remainder of this chapter is organized as follows. Section 2 describes the special features of Intelligent Environments which have to be considered when performing the learning process. Section 3 provides a literature review of the different approaches suggested so far. In Section 4 we propose a new approach to learning patterns. Finally, Section 5 provides some overarching reflections on this topic.

2. Intelligent Environments' special features

In this section we overview the special features which make these environments different from others in the process of acquiring new knowledge.

2.1. Importance of the user

One of the hidden assumptions in Intelligent Environments is that, unlike other current computing systems where the user has to learn how to use the technology, a fundamental
axiom in Intelligent Environments requires that the environment adapts its behaviour to the user. Thus, the user gains a central role. The learning process has to be accomplished as unobtrusively as possible, being transparent to the user. This implies that:
• Data have to be collected by means of sensors installed either on standard devices or in the environment (e.g. temperature sensors).
• System actions in relation to the user have to be performed maximizing the user's satisfaction.
• The user's feedback has to be collected either through the normal operation of standard devices (e.g. light switches) or through friendly interfaces such as multimodal user interfaces (e.g. voice and image processing technologies) [7,8,9,10].

2.2. Collected data


As we suggested in the previous section, the necessary data must be collected from sensors installed in the environment. All patterns will depend upon the data captured. In Michael's case an example of collected data could be:

Devices' activations (date;device;status;value):

2008-10-20
08:02:12;Alarm;on;100
08:15:54;MotionCorridor;on;100
08:15:55;BathroomRFID;on;Michael
08:15:55;MotionBathroom;on;100
08:15:57;SwitchBathroomLights;on;100
08:31:49;MotionBathroom;on;100
08:31:50;BathroomRFID;on;Michael
08:31:50;MotionCorridor;on;100
...

2008-10-21
08:10:50;Alarm;on;100
08:23:18;MotionCorridor;on;100
08:23:19;BathroomRFID;on;Michael
08:23:19;MotionBathroom;on;100
08:23:21;SwitchBathroomLights;on;100
08:30:52;shower;on;24
08:48:33;MotionBathroom;on;100
08:48:32;BathroomRFID;on;Michael
08:48:32;MotionCorridor;on;100
...

Other sensors (date;device;status;value):

2008-10-20
08:01:53;TempBedroom;on;21
08:04:16;TempBathroom;on;18
08:12:26;TempBedroom;on;22
08:13:49;HumBathroom;on;51
08:16:04;TempBathroom;on;20
08:19:04;TempBathroom;on;21
08:26:42;HumBathroom;on;53
08:28:12;TempBathroom;on;20
...

2008-10-21
08:05:16;TempBedroom;on;22
08:19:10;HumBathroom;on;50
08:22:42;TempBathroom;on;19
08:23:58;TempBathroom;on;20
08:31:30;HumBathroom;on;53
08:32:52;TempBathroom;on;22
08:38:10;HumBathroom;on;54
08:45:39;TempBathroom;on;21
08:49:02;HumBathroom;on;52
...
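Logs in this `time;device;status;value` format are straightforward to turn into structured records. The sketch below is our own illustration (the field layout and device names follow the example data above; the chapter does not show the actual SPUBS parsing code):

```python
from datetime import datetime, date, time

def parse_log(block):
    """Parse one day's block: a date line (YYYY-MM-DD) followed by
    semicolon-separated 'time;device;status;value' records."""
    lines = [ln.strip() for ln in block if ln.strip() and ln.strip() != "..."]
    day = date.fromisoformat(lines[0])
    events = []
    for ln in lines[1:]:
        t, device, status, value = ln.split(";")
        events.append({"timestamp": datetime.combine(day, time.fromisoformat(t)),
                       "device": device, "status": status, "value": value})
    return events

events = parse_log([
    "2008-10-20",
    "08:02:12;Alarm;on;100",
    "08:15:54;MotionCorridor;on;100",
    "08:15:55;BathroomRFID;on;Michael",
])
```

Each record keeps the value as a string, since its meaning (percentage, temperature, user identity) depends on the device.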

2.2.1. Need for pre-processing

Data will typically be collected in a continuous way from different information sources. Integrating data from different sources usually presents many challenges, because different sources will use different recording styles and different devices will have different possible statuses [11]. Finally, as in other areas of computing, 'noisy' data, with missing or inaccurate values, will be common, and finding out how to deal appropriately with them is another important challenge.

2.2.2. Nature of collected data

As far as the nature of the information provided by sensors is concerned, different classifications can be used. An interesting classification from the point of view of learning is the one which distinguishes between information about the user's actions and information about the context. Direct information about the user's actions is usually provided either by sensors installed on devices, which indicate when the user has modified the status of that device, or by movement sensors which detect where the user is. A simple example is a sensor installed in a switch which indicates when the user turns on that light. On the other hand, there will be sensors which provide general information about the context. Sensors such as temperature, light or humidity sensors do not provide direct information about the user's actions, but they provide information about the context.
Finally, it is worth noting that, due to the complexity of Intelligent Environments, external knowledge in combination with the collected data could be useful to carry out a successful learning process. Externally gathered knowledge will typically be domain-related knowledge such as:
• The patient's medical background.
• Preferences specified in advance by the user.
• Calendar information (e.g. when the user goes on holiday).
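This action/context distinction can be made operational with a simple lookup, assigned once per installed sensor. The sketch below is ours; the device names come from the example in Section 2.2, and the division into the two sets is an assumption for illustration:

```python
# Sensors whose events directly reflect a user action (assumed split)
ACTION_DEVICES = {"Alarm", "MotionCorridor", "MotionBathroom",
                  "BathroomRFID", "SwitchBathroomLights", "shower"}
# Ambient sensors that only describe the state of the context
CONTEXT_DEVICES = {"TempBedroom", "TempBathroom", "HumBathroom"}

def information_type(device):
    """Classify a sensor reading as user-action or context information."""
    if device in ACTION_DEVICES:
        return "action"
    if device in CONTEXT_DEVICES:
        return "context"
    return "unknown"
```

In a real deployment this mapping would be part of the environment's configuration, since only the installer knows what each sensor measures.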


2.2.3. Spatiotemporal aspects

Every act of a person situated in an environment is performed in a spatiotemporal dimension. The importance of considering spatial and temporal aspects in Intelligent Environments has been pointed out before by various authors [12,13,14]. As shown in the example given in Section 2.2, spatial information about different actions is given by the location of either devices (e.g. the light switch) or the user (e.g. motion in the bathroom), whereas temporal information is given by the timestamp of each action. As we will see later on, spatiotemporal aspects are present in every step of the learning process.

2.3. Scheduling the learning process

Discovering common patterns of behaviour in recorded data is part of a complex process involving several stages. On the one hand, it is desirable that the system acts as intelligently as possible from the very beginning. Typically these actions will not be as intelligent and efficient as those performed once the patterns of the user have been learnt, and we can expect minimal services at this stage. On the other hand, once patterns have been discovered, it seems clear that those patterns will have to be continuously revised and updated because:
• The user can change his/her preferences or habits (e.g. Sarah now prefers other types of exercises for her rehabilitation).


• New patterns could appear (e.g. Sarah has started receiving some visits at the weekends).
• Previously learned patterns were incorrect (e.g. the system wrongly learned that Sarah likes going to bed at 9:00PM).

This adaptation process could mean modifying the parameters of a previously learnt pattern, adding a new pattern or even deleting one. This is a sustained process which will last throughout the lifetime of the environment. To do this effectively the user's feedback is essential.
We can conclude that at least three learning periods seem to be necessary. The first one is to act as intelligently as possible without patterns while the system starts to gather data. The second and main one deals with learning the user's common behaviours and habits. Finally, while the system is acting in accordance with previously learned patterns, it is necessary to update those patterns continuously. This chapter and the approach we suggest in Section 4 are mainly focused on the process of learning patterns from collected data, that is, the second of these periods.


2.4. Representation of the acquired patterns

Depending on the objectives of each environment, different representations can be used. For example, if the only goal is to provide an output given the inputs (e.g. switch the light on given the current situation), a user-comprehensible representation is not required. However, most of the time the representation of the patterns is relevant; in these cases a human-understandable representation of the patterns is an essential feature for the success of the system. Unlike the previous example, where the internal representation is not important, most of the time the internal representation of patterns is as important as the final output. For instance, in order to understand the behaviour of a user, a comprehensible representation is essential. Moreover, it may be necessary for the system to explain its decisions to the user. Thus, the system could explain to the user that it turns on the light of the bathroom because it has strong evidence that Michael usually does that. Sometimes the output of the learning process has to be integrated into a bigger system, or has to be combined with other types of knowledge in order to make sensible high-level decisions.
Representing common behaviours by means of sequences of actions seems to be a promising approach. Figure 1 shows Michael's habits as such a sequence. It is worth mentioning that this type of representation allows actions to be inter-related (e.g. 'go into the bathroom' and 'turn on the light'). At the same time it allows time relations to be represented using relative time references instead of absolute times (e.g. 'go into the bathroom'; 2 seconds after; 'turn on the light'). Finally, conditions are necessary to further specify the occurrence of these sequences. General conditions help to contextualize the whole sequence (e.g. 'On weekdays between 8AM and 9AM' or 'takes a shower on Tuesdays, Thursdays and Fridays').
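A sequence-with-conditions representation of this kind could be captured by a small data structure. The sketch below is our own, not the chapter's implementation; the field names are assumptions, and the pattern instance is taken from the Figure 1 example in the text:

```python
from dataclasses import dataclass

@dataclass
class Step:
    action: str
    offset_after_previous: str = ""  # relative time reference, e.g. "2 seconds after"

@dataclass
class Pattern:
    general_conditions: list  # conditions contextualizing the whole sequence
    steps: list               # ordered, inter-related actions

# Part of Michael's morning habit, as described in Scenario 1 (illustrative)
morning = Pattern(
    general_conditions=["weekday", "between 8AM and 9AM"],
    steps=[
        Step("go into the bathroom"),
        Step("turn on the light", "2 seconds after"),
    ],
)
```

Because the structure stores actions, relative offsets and conditions as readable fields, it can be rendered back to the user as an explanation ("the light is turned on 2 seconds after you enter the bathroom, on weekdays").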

3. State of the art

Intelligent Environments as a technological paradigm have attracted a significant number of researchers, and many applications are already being deployed, with different degrees of success. The complexity of these systems is due to the combination of hardware, software and networks which have to cooperate in an efficient and effective way to provide a suitable result to the user. Due to this complexity, up to now each project has focused upon different aspects of such complex architectures. In that sense, it is understandable, and even logical in some way, that the first developments were focused upon the needs associated with hardware and networking as supporting infrastructure. This has resulted in simple automation that implements a reactive environment. Although to date many researchers [3,13,15,16] have noted the importance of providing the environment with intelligence, little emphasis has in general been placed upon the subject of learning 'per se'. There are some notable exceptions, and next we provide an overview of those, focused on the Machine Learning techniques they use for learning the user's patterns.


3.1. Artificial Neural Networks (ANN)

Mozer et al. [17] and Chan et al. [18] were amongst the first groups that developed applications for Intelligent Environments where the user's patterns were involved. Mozer et al. designed an adaptive control system for the environment named Neural Network House, which considered the lifestyle of the inhabitants and the energy consumption. For that, they used a feed-forward neural network which predicted where the user would be in the coming seconds; based on these predictions they controlled the lighting. Chan et al. also used ANNs for similar objectives. They developed a system that predicted the presence or absence of the user and his/her location. The system compared the current location with the prediction made by the system and indicated whether the current situation was normal or abnormal.
Other authors have used ANNs in order to learn patterns related to users. A similar application, which calculated the probability of occupation of each area of the house, was developed by Campo et al. [19]. Boisvert et al. [20] employed ANNs in order to develop an intelligent thermostat. By monitoring the use of the thermostat, the system automatically adapted it to the user's preferences, reducing the number of interactions as well as the energy consumption. See [21] for a survey focused on ANNs for Smart Homes.

3.2. Classification techniques

Classification techniques such as decision trees or rule induction have been used by other groups. The group that works in the environment named 'SmartOffice' [22] developed, using decision trees, an application which generated rules splitting situations where examples indicated different reactions. Considering Michael's case, an example could be that when he goes into the bathroom he sometimes has a shower and sometimes he does not. The application developed by the SmartOffice group would be able to discover the rules that separate these different reactions.
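The kind of rule such techniques induce can be illustrated with a toy frequency count over invented observations. This is our own sketch, not the SmartOffice algorithm: it simply finds the attribute values under which one reaction always, or never, occurs:

```python
from collections import defaultdict

def separating_values(examples, attribute, label):
    """Return the attribute values for which the label is always / never True."""
    seen = defaultdict(set)
    for ex in examples:
        seen[ex[attribute]].add(ex[label])
    always = sorted(v for v, labels in seen.items() if labels == {True})
    never = sorted(v for v, labels in seen.items() if labels == {False})
    return always, never

# Invented observations of Michael's mornings
examples = [
    {"day": "Mon", "shower": False}, {"day": "Tue", "shower": True},
    {"day": "Wed", "shower": False}, {"day": "Thu", "shower": True},
    {"day": "Fri", "shower": True},  {"day": "Tue", "shower": True},
]
always, never = separating_values(examples, "day", "shower")
# always == ['Fri', 'Thu', 'Tue'], never == ['Mon', 'Wed']
```

A real decision-tree learner would additionally handle noisy examples and choose the most informative attribute automatically, but the output has the same shape: conditions that split the observed reactions.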
In our case, the conditions would be that he has a shower on Tuesdays, Thursdays and Fridays.
Stankovski and Trnkoczy [23] generated a decision tree based on the training data. They considered that the training data set described normal events and that the induced decision tree would therefore describe the normal state of the environment. Thus, they tried to detect abnormal situations, which would fall outside the tree. Let us consider that we create a decision tree based on Michael's normal behaviour. When Michael forgets to switch the cooker off and goes out of the house, the system would detect that this situation is outside the tree, so it would be labelled as an abnormal situation and an alarm would be issued.

3.3. Fuzzy rules

Researchers at Essex's iDorm lab have given prominence to the problem of learning, being one of the most active groups in this area. They represented the user's patterns by means of fuzzy rules. Their initial efforts [6,24] were focused on developing an application that generated a set of fuzzy rules representing the user's patterns. By recording the changes caused by the user in the environment, they generated membership functions as well as fuzzy rules which mapped those changes. Besides, they defined a strategy to adapt such rules based on negative feedback given by the user. Vainio et al. [25] also used fuzzy rules to represent the user's habits. In contrast to the approach followed in the iDorm project, they manually constructed the membership functions and used reinforcement learning to replace old rules, in order to avoid a single override event having a large impact unless it lasts for a significant amount of time.


3.4. Sequence discovery

The group working on the 'MavHome' and 'CASAS' environments is one of the most active groups. The first applications developed by this group were oriented to building universal models, represented by means of Markov Models, in order to predict either future locations or activities [26]. They then carried out notable improvements, developing applications to discover daily and weekly patterns [27] or to infer abstract tasks automatically, with the corresponding activities that were likely to be part of the same task [28]. One of the major contributions of this group is the association of time intervals between actions [29]. They were one of the first groups that considered relations between actions, representing these relations using Allen's temporal logic relations [30]. Once time relations between actions were defined, they tried to discover frequent relations by means of sequence discovery techniques. Considering Sarah's case, this approach would be able to relate the actions of 'getting up' and 'having a shower', establishing that 'having a shower' comes 'after' 'getting up'.

3.5. Instance based learning

Researchers at Carnegie Mellon University's MyCampus lab developed a message filtering application using Case-Based Reasoning (CBR), which can be defined as an instance based learning technique [31]. Based on the user's preferences, shown in previous interactions, the system filtered the messages. They validated the system with and without the CBR module, and participants' satisfaction increased from 50% to 80%. Another example of the use of CBR techniques to learn the user's preferences is the UT-AGENT [32]. Recording the set of tasks that the user carried out, the UT-AGENT tried to provide the information the user needed, based on the information he/she used to ask for in similar situations.


3.6. Reinforcement learning

Some of the groups we have previously mentioned, e.g. Neural Network House and SmartOffice, have developed modules based on reinforcement learning in order to add the capacity of adaptation to the environment. Mozer et al. [33] used Q-learning for lighting regulation. Taking as a starting point that the user has no initial preferences, the system tried to minimize the energy consumption as long as the user did not express discomfort. Zaidenberg et al. [34], starting from a pre-defined set of actions, progressively adapted them to the user by giving rewards to the system associated with good decisions.

3.7. Summary of related work

As we have seen in the previous sections, different learning techniques have been used for developing different applications in Intelligent Environments. Analysing these applications, it seems clear that the use of a particular technique is firmly conditioned by the specific needs of each environment or application. Muller [35] pointed out that 'In many research projects, great results were achieved ... but the overall dilemma remains: there does not seem to be a system that learns quickly, is highly accurate, is nearly domain independent, does this from few examples with literally no bias, and delivers a user model that is understandable and contains breaking news about the user's characteristics. Each single problem favours a certain learning approach'.


The current state of the art shows that there is no global or holistic approach yet. In that sense, given the strengths and weaknesses of the individual techniques, combining different techniques seems a promising approach. The next section presents a first version of a system that combines different techniques to learn user patterns.

4. Sequential Patterns of User Behaviour System

The Sequential Patterns of User Behaviour System (SPUBS) is a system that discovers the user's common behaviours and habits (e.g. Michael's common behaviour shown in Figure 1). It is mainly focused on discovering common sequences of actions from recorded data. Due to the complexity of Intelligent Environments, the architecture of the system demands an exhaustive analysis. Thus, we have created a three-layered architecture which allows us to distinguish those aspects related to particular environments from those aspects that can be generalized to all environments. Figure 2 shows the global architecture of SPUBS.

4.1. Transformation Layer

The objective of this first layer is to transform raw data, i.e. information collected from sensors, into useful information for the learning layer. As we are going to see in this section, most of the transformations carried out in order to obtain useful information are dependent on each environment. Therefore, although some general transformations can be defined, different environments will demand different transformations.


Figure 2. Global Architecture of SPUBS

Considering the data shown in Section 2.2, the following sections suggest some basic transformations.

4.1.1. Inference of simple actions

Once data from sensors have been collected, an important task is to infer meaningful information from these raw data. Sometimes the information provided by a sensor is directly meaningful, for example:


from
  2008-10-20T08:15:57, SwitchBathroomLights, on, 100
we infer
  2008-10-20T08:15:57, BathroomLights, on, 100

In this case the sensor event itself is meaningful, because we can directly infer from it the action of the user. But there are other actions that are quite difficult to infer from the activation of a single sensor. Let us consider that the inference of the simple action 'Go into the Bathroom' is not possible from the activation of a single sensor, so that it must be inferred by combining different events. The following example shows that there is motion in the corridor, then the RFID tag installed in the door of the bathroom detects that it is Michael, and then there is motion in the bathroom. We can logically infer that Michael has entered the bathroom. Thus, the transformation of those three events into a single meaningful action allows us to annotate the sequence of raw data items with meaning.


from
  2008-10-20T08:15:54, Motion Corridor, on, 100
  2008-10-20T08:15:55, Bathroom RFID, on, Michael
  2008-10-20T08:15:55, Motion Bathroom, on, 100
we infer
  2008-10-20T08:15:55, Bathroom, on, 100

The most basic way of inferring these actions is by means of templates. Templates define which actions must be combined as well as which constraints are to be considered. Actions can be defined either as mandatory or as optional. As far as constraints are concerned, they can be related to the order of the actions or to durational aspects. The 'Go into Bathroom' action's template is defined as:

'Go into Bathroom (Bathroom, on, 100)'
Actions:
  Motion Corridor (Mandatory)
  RFID Detection (Mandatory)
  Open Door (Optional if already open)
  Motion Bathroom (Mandatory)


Constraints:
  Order: Motion Corridor < RFID Detection < Open Door < Motion Bathroom
  Time: T(Motion Bathroom) − T(Motion Corridor) < 3 sec.

At the end of this first step all actions should be meaningful. It is clear that the definition of templates depends on the particular environment, because it is mainly influenced by the sensors installed in that environment. It is worth mentioning that this step of inferring meaningful actions is important because, once we have defined such actions, the rest of the learning process will depend upon them.

4.1.2. Inference of complex actions

The inference of simple actions makes sure that, from that point on, all considered actions are meaningful. Once simple actions have been inferred, a similar process can be carried out in order to infer complex actions such as 'prepare a coffee' or 'take a pill'. This inference can be necessary because simple actions do not always represent the type of actions we want to analyse. As in the inference of simple actions, the most basic method is the use of templates, with only one difference: whereas the former combine raw data, in order to infer complex actions we combine simple actions. The 'Prepare a coffee' action's template could be defined as:

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,


A. Aztiria et al. / Learning About Preferences and Common Behaviours of the User in IE

‘Prepare a Coffee (PrepareCoffee, on, 100)’
Actions:
  Put Kettle on (Optional)
  Open Cupboard (Optional)
  Get Coffee (Mandatory)
  Take a cup (Mandatory)
  Open fridge (Optional)
  Get Milk (Optional)
Constraints:
  Time: T_LastAction − T_FirstAction < 5 min

Combining different actions into a single action does not prevent us from defining its internal structure. For example, by retrieving all cases that were labelled as ‘Prepare a coffee’, we can carry out a dedicated learning process and detect whether there exists a pattern that defines how Michael prepares a coffee.

4.1.3. Nature of information

Once the different actions have been identified, it is important to determine what type of information has been collected. As stated in Section 2.2.2, different sensors provide information of a different nature, so they will be used for different purposes in the learning process. For this first approach, two types of information are considered:


• Information related to the user’s actions. Simple actions detected by sensors (e.g. ‘turn the light on’), inferred simple actions (e.g. ‘go into bathroom’) or complex actions (e.g. ‘prepare a coffee’) that represent the user’s actions.
• Information related to context. Temperature, humidity, luminosity and other types of sensors indicate the state of the context at each moment. Besides, temporal information is also considered context information; for instance, the time of day, the day of the week or the season is useful to contextualize actions.

4.1.4. Splitting actions into sequences

Data are collected continuously from the sensors, so they arrive as a string of actions without any structure or organization. The aim of this task is to structure the collected data, using that organization to add meaning to them. Many different organizations can be suggested. The one we have used is based on the observations that the user usually carries out actions in sequence and that actions are mainly influenced by the previous and next actions. Thus, we split the string of actions into sequences, but instead of using a quantitative window width, we use a relative and more flexible criterion that determines the end of one meaningful sequence and the beginning of a new one. For instance, going to bed and staying there for more than 2 hours, or going out and staying out for more than 30 minutes, are considered ‘landmarks’ demarcating sequences. This task is dependent on each environment, because different environments demand different criteria. For example, a criterion defined for a Smart Home will not make sense in a Smart Car.
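The landmark-based splitting described above can be sketched in a few lines of Java. All names here (Action, the landmark predicate) are hypothetical illustrations, not part of the system described in the chapter:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch: split a flat stream of timestamped actions into
// meaningful sequences at "landmark" actions (e.g. going to bed, leaving home).
public class SequenceSplitter {

    public record Action(String name, long timestamp) {}

    // The landmark predicate encodes an environment-specific criterion;
    // landmarks close the current sequence and are not kept inside it.
    public static List<List<Action>> split(List<Action> stream,
                                           Predicate<Action> isLandmark) {
        List<List<Action>> sequences = new ArrayList<>();
        List<Action> current = new ArrayList<>();
        for (Action a : stream) {
            if (isLandmark.test(a)) {
                if (!current.isEmpty()) {
                    sequences.add(current);      // close the current sequence
                    current = new ArrayList<>();
                }
            } else {
                current.add(a);
            }
        }
        if (!current.isEmpty()) sequences.add(current);
        return sequences;
    }
}
```

Because the criterion is passed in as a predicate, the same splitting code can be reused in a Smart Home or a Smart Car with different landmark definitions.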




Figure 3. Essential Components of the Learning Layer


4.2. Learning Layer

The objective of this layer is to discover common behaviours and habits of the user, taking as its starting point the information provided by the transformation layer. First of all, it is necessary to define what type of patterns we are trying to discover. The objective of this first approach is to discover patterns whose representation allows their use in most of the applications that can be proposed in Intelligent Environments. Considering different applications (see Section 4.3), a comprehensible representation of patterns was defined as an essential aspect. It was also considered that the user’s patterns are best represented by means of sequences of actions, where actions are related temporally among themselves and conditions are defined in order to obtain more accurate patterns. An example of this type of representation was shown in Figure 1. The core of this layer is A_SPUBS, an algorithm that discovers patterns using the information coming from the Transformation Layer. The language L_SPUBS included within this layer provides a standard framework that represents patterns with a clear syntax and facilitates their application. The essential components of this layer are shown in Figure 3. It is worth mentioning that, unlike the Transformation and Application layers, the components of this layer are not dependent on particular environments, so its design allows its use in other environments too.

4.2.1. Representing patterns with L_SPUBS

Defining a language that represents patterns of user behaviour in Intelligent Environments is necessary to have a clear and unambiguous representation. The language integrated into our system, L_SPUBS (see Appendix A), is an extension of the



language defined in [36], and it is also based on ECA (Event-Condition-Action) rules [37]. ECA rules define what action has to be carried out when an event occurs under relevant conditions. Thus, a sequence is represented as a string of ECA rules, named ActionPatterns, contextualized by means of general conditions. The sequence shown in Figure 1 would be represented as:

General Conditions
  IF context (day of week is (=, weekday) & time of day is ((>, 08:00:00) & (…

  … MAX_TEMP_THRESHOLD) Then switchOffHeating()

Second case
  If (readTemp() < MIN_TEMP_THRESHOLD) Then switchOnHeating()
  If (readTemp() > MAX_TEMP_THRESHOLD) Then switchOffHeating()


Although both interfaces provide the same functionality (reading temperature values from a sensor), the different ways of accessing the temperature data clearly hamper the interoperability and flexibility of the services. As a result, service developers need different implementations of the same service, and a physical change between devices from different technology providers becomes complicated. A first attempt at defining devices in a standard, common way was made by the UPnP Forum. However, UPnP has so far focused more on audio and video devices than on home automation devices, so only a couple of home automation devices have been standardized to date [24]. Another attempt to solve this issue was made by Telefónica I+D [25][26], which deployed a three-level architecture that addressed this lack of device definition and suggested a solution to enable device management regardless of the network a device is attached to:

• Network level. Here, a device is considered an element within a network, and base drivers speak the protocol needed for each network.
• Device level. Each device belongs to a type class of devices according to its functionality (a light, a camera, and so on), no matter which network it is attached to, and devices are also categorized into groups (OSGiWashingMachine, OSGiBlinds, ...) that define the minimum set of operations common to a type of device. For example, a KNX light, a Lonworks light and a UPnP light will have the same operations available from the interface.
• Application level. Each device is considered an element of the home automation platform. At this level, all devices are managed in the same way, no matter which network they are attached to or what type of device they are.
All devices have a common interface that is published, so that external applications can use this interface to interact with the devices no matter what they are; that is to say, a washing machine is treated in the same way as a blind or a dimmer. This solution has been used by Telefónica I+D for internal projects and in some European projects such as [20][21][22], and a similar concept of a three-level architecture was also explored in [27].
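The device-level idea above, one interface per device class regardless of the underlying technology, can be illustrated with a minimal Java sketch. The interface and class names here are hypothetical, not taken from any of the cited platforms:

```java
// Illustrative sketch: a single device-level interface shared by all light
// implementations, regardless of the underlying network technology.
public interface Light {
    void switchOn();
    void switchOff();
    boolean isOn();
}

// A real implementation would translate these calls into KNX telegrams
// through the network-level driver; here the state is simply kept locally.
class KnxLight implements Light {
    private boolean on;
    public void switchOn()  { on = true; }
    public void switchOff() { on = false; }
    public boolean isOn()   { return on; }
}

// Same contract, different underlying protocol (UPnP actions).
class UpnpLight implements Light {
    private boolean on;
    public void switchOn()  { on = true; }
    public void switchOff() { on = false; }
    public boolean isOn()   { return on; }
}
```

An application-level service depends only on Light, so swapping a KNX device for a UPnP one requires no change in the service code.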


A. Marco et al. / Common OSGi Interface for Ambient Assisted Living Scenarios

The architecture presented here aims to overcome this lack of standardization. It proposes an open taxonomy of devices and services, framed in OSGi, that focuses on the functionalities inherent in their nature: a temperature sensor provides temperature information regardless of its technology, and the way of accessing it should be the same for any service using it. The benefits are multiple: technology developers know the interfaces their devices must follow, services are developed independently of the technology, and end users can use any technology transparently.

2. OSGi, the best solution for RGs?

Before introducing common OSGi components for smart home environments, this section shows in more detail why OSGi was chosen as the service platform for an RG. The first subsection lists common requirements that must be fulfilled by service platforms used in smart home environments, and the second shows the resulting advantages as well as disadvantages of using OSGi on an RG.


2.1. Smart home environments. Requirements on a service platform

As explained above, the RG is a device in charge of connecting the internal networks with the external world. These internal networks can be data networks and home automation networks; the latter are connected to the RG using USB or RS232 interfaces. As explained before, RGs are intended to offer an environment in which to deploy several services, one of which can be environment monitoring. To make this possible, drivers able to talk to the different networks are needed; services can then be developed using the device information they provide. Since smart home environments are normally dynamic systems, in which devices and services are added or removed at runtime, the RG must provide functionality to add new drivers to manage new networks, devices and services. Because in most cases smart environments consist of a huge number of services and devices using different technologies and communication protocols, an easy way to hide specific implementations is one of the strongest requirements on the service platform; only in this way is easy service and device development possible. As already mentioned, services combine information from several devices and other services. Hence the service platform must ensure that all devices and services used by another RG component are available at runtime, which means that the service platform must provide dynamic component dependency management. Besides this, another strong requirement on the RG is the ability to be controlled remotely; of course, service providers want to manage and deploy services remotely. Because there is no requirement on the RG’s operating system and architecture, the service platform should be able to run on heterogeneous processing units, especially those with slow processors and little memory.





2.2. OSGi, the best service platform for an RG?

Most of the requirements placed on the RG in terms of dynamism and ease of development are met by SOA platforms, which are usually implemented using web services, especially in enterprise environments. However, web services are intended to interoperate across different machines and platforms, relying on message-exchange mechanisms that introduce a certain delay in operations, and this delay is often not acceptable in AAL applications. When making a bank transaction, we can accept waiting a few seconds to receive our balance, but if we detect a domestic alarm like a gas leakage, we should react as soon as possible. OSGi is an open standard for modular Java application management and development. It is already used in a wide range of systems, such as the Eclipse development environment and embedded systems like the BMW 3 Series car. The development of OSGi is driven by the OSGi Alliance, with contributions from software companies like IBM and Oracle and device manufacturers such as Nokia, Bosch and Siemens [28]. Regarding smart home environments, statements about OSGi like “The OSGi (Open Service Gateway initiative) service platform specification is the most widely adopted technology for building a control system for the networked home.” are found in several publications [11][29]. When comparing the above-mentioned requirements on an RG service platform to the functionality provided by OSGi, it can be seen that OSGi fulfills most of them. The OSGi platform is based on a Java virtual machine and is hence platform independent. OSGi consists of an OSGi framework and a set of so-called bundles, which are the basic components in OSGi [11]. A bundle collects information from other bundles or real devices, processes these data and offers new information to other bundles; the connections between bundles are handled by the framework’s service registry.
Each bundle is able to provide an interface for information exchange with other bundles; hence it is easy to hide or change specific implementations, so OSGi is an SOA-based service platform. To organize bundles, the framework includes a dynamic bundle management system, which makes it possible to install, update, uninstall and start/stop bundles at runtime, and which is also able to handle dependencies between bundles. This is very important for smart home environments because, as mentioned above, they are dynamic systems. Besides this, it is also possible to control OSGi remotely, which is another strong requirement on an RG service platform. A comparison of the functionality of OSGi with the requirements mentioned above shows that almost all important requests are covered. But regarding smart home environments, there are still disadvantages and problems. For example, as already mentioned in the introduction, OSGi does not provide any pre-defined interfaces for devices and services of the same type; hence service and device development across different providers is nearly impossible. Another problem concerns easy “service discovery” mechanisms: using OSGi, developers have to know the exact names of the services they want to use. In [29] a semantic solution is proposed to handle this problem. Beyond these, several further disadvantages and problems occur when using OSGi in smart homes, but for many of them solutions or workarounds have already been proposed by researchers and companies.
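The service-registry idea behind this dynamism can be modelled in plain Java. This is a deliberately simplified sketch of the concept, not the real OSGi API (which uses BundleContext and ServiceReference objects): implementations are registered under an interface at runtime, and consumers look them up by type, so the wiring between components can change while the system is running.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified model of a dynamic service registry (not the real OSGi API).
public class MiniServiceRegistry {
    private final Map<Class<?>, Object> services = new ConcurrentHashMap<>();

    // A "bundle" publishes an implementation under a service interface.
    public <T> void register(Class<T> type, T implementation) {
        services.put(type, implementation);
    }

    // A consumer resolves a service by its interface; null if not registered.
    public <T> T lookup(Class<T> type) {
        return type.cast(services.get(type));
    }

    // Services can disappear at runtime, e.g. when a bundle is stopped.
    public void unregister(Class<?> type) {
        services.remove(type);
    }
}
```

A consumer that handles a null lookup result gracefully keeps working when a device or service bundle is removed, which is exactly the dynamic dependency management the text asks of the platform.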




Figure 2. Components of a functional service.

Nevertheless, when comparing OSGi to other frameworks like Jini, DebianPackage, XBone, SNMP and OCAP in the field of smart home environments, OSGi seems to be the best solution [30][31]. This is also confirmed by the huge number of companies and research groups using OSGi as a service platform for RGs.


3. Description of Common OSGi Components for AmI Service Development

Figure 1 shows the architecture of a smart home environment. Here, a network of devices (sensors, actuators, etc.) is connected to the RG, running OSGi, which is responsible for device interaction and service implementation. Therefore, OSGi components are needed to connect and manage home automation networks (drivers), provide meaningful information, process data and implement services. The cooperation between components is shown in Figure 2, which also follows a three-level architecture like those in [25][27]. Drivers are OSGi bundles located at the lowest level of the framework. They take care of transport and network management duties with the physical devices connected to the RG. In general, one driver bundle is needed per communication protocol. For example, the right-hand driver bundle in Figure 2 could be a ZigBee driver and the left-hand one a LonWorks driver. Devices are the common representation of physical devices that are virtually connected to the RG. They encapsulate real device operations and provide functions in both directions (to and from the RG) to control the device and to make device data available to other OSGi bundles inside the RG. In Figure 2, the right-hand device bundles represent real devices based on ZigBee (e.g. temperature sensor, presence sensor, light actuator), while the left-hand device bundles represent real devices based on LonWorks (e.g. again a temperature sensor, heating and window actuators). Technological services are OSGi bundles that represent services on a technical level. Such services provide high-level information obtained by processing raw data from devices, and can also be used to change device states (e.g. switch on the lights) or to communicate with the environment (e.g. sending an email to a




carer). In the case of a temperature sensor, this information can be an event raised when the temperature exceeds a pre-defined threshold. Another example could be a technological service able to send an SMS to a mobile phone. As Figure 2 shows, technological services are able to cooperate with other technological services and to combine information from devices and technological services to build further technological services. Devices and technological services are the focus of this work, where a common taxonomy derived from the devices’ nature and the services’ functionality is presented. Functional services represent high-level end-user services. These services consist mostly of IF-THEN rules, where the conditional part combines information from several devices and technological services, and likewise the operational part is based on a combination of technological services and devices. A simple example of a functional service could be a service for automatic heating control. There, information from two temperature sensors (devices), one installed inside and one installed outside the house, is combined. The system automatically checks both temperature sensors. The heating must be turned on if the temperature in the room falls below a predefined threshold, but also if the system detects that the temperature outside is much lower than the temperature inside, to prevent a quick decrease of the inside temperature due to a large temperature gradient. Because it would be a waste of energy to heat the house if no residents are currently at home, the service should only turn the heating on if at least one resident is at home. Therefore an additional technological service is used to determine whether anybody is at home by analyzing data from all the presence sensors in the home. The following rule describes the mentioned functional service.


IF   ((T_inside < T_min) OR (T_inside − T_outside > ΔT_max))
     AND (N_PersonsAtHome > 0)
THEN turnOnHeatings

There, each function (like the calculation of the temperature difference) can be realized by a technological service, or by accessing a device’s raw data and processing it inside the functional service.
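As a minimal plain-Java sketch, the heating rule above can be written as a pure condition function. The threshold constants and the method name are illustrative assumptions, not part of the OSGi4AMI interfaces:

```java
// Direct transcription of the heating rule into plain Java.
// Threshold values and names are illustrative only.
public class HeatingRule {
    static final double T_MIN = 19.0;       // minimum comfortable temperature (°C)
    static final double DELTA_T_MAX = 10.0; // maximum inside/outside gradient (°C)

    // Returns true when the heating should be turned on:
    // (too cold OR big inside/outside gradient) AND somebody is at home.
    public static boolean shouldHeat(double tInside, double tOutside,
                                     int personsAtHome) {
        boolean tooCold = tInside < T_MIN;
        boolean bigGradient = (tInside - tOutside) > DELTA_T_MAX;
        return (tooCold || bigGradient) && personsAtHome > 0;
    }
}
```

In the framework described above, the two temperatures would come from device bundles and the occupancy count from a presence-monitoring technological service; the functional service only evaluates the condition and triggers the actuator.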

4. Common OSGi Interface Definition

In the last section, the different components of an AmI application were introduced. The goal is now to define standardized interfaces for each component class, allowing service developers to share already implemented OSGi bundles, even between different projects and working groups. Drivers and functional services are quite close to the hardware and user levels, respectively; defining how their interfaces should look would thus be too restrictive in terms of development. Hence we defined interfaces for components at the device and technological service levels. Several approaches could be followed for this definition: considering the technology, the networking, etc. We finally decided to refer to the intrinsic nature of devices and services, finding the attributes and methods they have in common. We found the ZigBee Cluster Library (ZCL) to take a similar approach, grouping functionalities of devices in clusters of attributes and commands, which together define a communication interface [32]. These clusters are grouped into domains addressing specific functionalities, e.g. closures, lighting, security and safety, measurement and sensing, etc. OSGi4AMI shares the ZCL’s approach of having clusters and domains; nevertheless, substantial differences exist between them. Considering the general architecture in Figure 2, the ZCL focuses on the driver level, while OSGi4AMI defines the upper layers. Moreover, many of the defined clusters are not suitable for OSGi, as they were defined in the frame of a specific wireless communication standard (e.g. RSSI Location, Groups management, etc.). Also, the domains addressed in the ZCL are specific to wireless sensor networks whose devices have reduced data throughput and small processing capabilities; as OSGi4AMI does not stick to any hardware or communications architecture, many other devices and services are considered (e.g. cameras, MMS managing, household appliances, etc.). The OSGi4AMI taxonomy can be used to build ambient intelligence environments; nevertheless, the devices and technological services selected here are those most useful in Ambient Assisted Living (AAL) applications. This selection results from the requirements of various European and national projects in which the authors are involved, dealing with different AAL applications: MonAMI focuses on ambient intelligence using mainstream devices and services to increase people’s autonomy [13]; Easyline+ builds a smart kitchen that supports elderly autonomy [14]; AmiVital sets up a technological platform comprising standardized device, network and software components, allowing simple creation of services adapted to different needs and environments [19]; and AmbienNet aims to demonstrate the viability of navigation systems, supported by intelligent environments, that assist users with and without disabilities [18].


4.1. Devices

A device is a representation of a physical device, or of another entity that can be attached by a driver service and has an explicit functionality. Examples of devices are temperature sensors, dimmer-light actuators, speakers or blood-pressure measurement tools. Because there is a wide variety of devices, we have grouped them into the following device categories (bundles) according to their functionalities. In AmI environments we mainly find the following devices:

• Sensors are devices able to sense physical magnitudes (like temperature, presence, acceleration, etc.).
• Actuators are devices able to change the state of a physical tool (like switching a light/plug on or off, closing/opening a door/window, etc.).
• Simple HMI (Human-Machine Interface) devices are used to put data into the RG (remote controllers, level controllers, etc.) or are used by the RG to provide information to the user (LED controllers, buzzers, etc.).

Figure 3 visualizes the introduced device bundles. Of course, this taxonomy can be enlarged with other categories of devices having different functionalities that might not be included in these groups. Examples could be:




Figure 3. Device categories.

• Household appliances, which are complex devices present in a standard home environment (like washing machines, TVs, ovens, etc.).
• Multimedia IO devices, which are able to capture or produce multimedia data such as sound and images (examples are cameras, microphones, etc.).
• Medical devices, which are tools specially designed for the acquisition of medical parameters.
• Special devices, which are other devices too specific to be clustered into dedicated bundles, possibly with unique functionalities. The functionality of these devices is usually so specific that it does not make much sense to cluster them or to provide particular interfaces to use them in a different way than the ones provided at the device level.

As Figure 3 shows, all device interfaces extend the device bundle, and therefore all devices have some properties in common. Because some devices may be able to provide information about their location (in the case of fixed-mounted devices or devices implementing location functionality) or not (in the case of wearable or





movable devices without location functionality), the device bundle includes, besides the BasicDevice interface, which is mandatory for all devices, an optional Location Cluster². If devices are able to provide information about their location, they have to implement the functions specified in this cluster; if not, the Location Cluster can be neglected. Most devices are also able to provide information about their power state, and hence the device bundle includes an optional Power Cluster as well. The same idea of optional clusters applies to specific devices like sensors and actuators, which may be able to provide different levels of information. Each device category includes mandatory and optional clusters; in this way service developers gain more flexibility for creating new devices. Devices are able to inform registered services about device events. For instance, such an event could be that the battery level is too low, or that the device’s location has been refreshed. If a service wants to be informed about device events, it has to implement the corresponding device listener interface, which is included in every device bundle. Every time a device raises a new device event, it informs all registered listeners about it. Looking at the child bundles in Figure 3, sensors are of type device and inherit all device properties. To particularize the functionality of sensors, additional properties and methods are necessary. Besides the mandatory BasicSensor interface, which defines common sensor functionalities, different clusters with additional methods are available. These clusters provide functionalities like automatic refreshing of sensor values, raising an event when a threshold is crossed, storing maximum and minimum values, or the ability to stream data. Like sensors, actuators are also of type device and inherit its properties.
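The mandatory-plus-optional-cluster pattern for sensors can be sketched with plain Java interfaces. The interface and method names below are hypothetical illustrations of the pattern, not the actual OSGi4AMI signatures:

```java
// Illustrative sketch of mandatory vs. optional clusters for sensors.
// BasicSensor is mandatory; ThresholdCluster is an optional add-on
// that only threshold-capable sensors implement.
public interface BasicSensor {
    double lastValue();
}

interface ThresholdCluster {
    void setThreshold(double limit);
    boolean thresholdCrossed();
}

// A sensor that implements both the mandatory interface and one
// optional cluster; a simpler sensor would implement BasicSensor only.
class TemperatureSensor implements BasicSensor, ThresholdCluster {
    private double value;
    private double limit = Double.MAX_VALUE;

    // In a real bundle this would be fed by the network driver.
    public void update(double newValue) { value = newValue; }

    public double lastValue() { return value; }
    public void setThreshold(double limit) { this.limit = limit; }
    public boolean thresholdCrossed() { return value > limit; }
}
```

A consumer can test a device for an optional cluster with `instanceof ThresholdCluster` and degrade gracefully when the capability is absent, which mirrors the "optional cluster can be neglected" rule in the text.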
The primary functionality of actuators is gathered in the mandatory BasicActuator interface, and the specialized functionality of each actuator can be shaped with different clusters, such as turning a light on/off or setting a level for a dimmer light. Lastly, simpleHMI (Human-Machine Interface) devices are used by the residents to input simple data into the RG, or by the RG to output simple data to the user. Hence the simpleHMI bundle includes, besides the mandatory BasicSimpleHMI interface, a cluster for input and a cluster for output HMI devices. Like other devices, simpleHMI devices inherit all device properties. A common approach for all the devices defined in these bundles is how they are related to the physical, real device. Device objects represent physical devices that are connected to the RG through sensor networks, typically wireless ones. In that sense, when we want to retrieve some information from a device by calling a method, there will be an internal message exchange that requires some time. For that reason, methods in the interfaces are constructed not to directly retrieve information (synchronous operation), but to ask the device to refresh it and notify the caller when the requested information is available (asynchronous operation). Another characteristic of the proposed interfaces is that they are kept as simple as possible, having only methods that are directly related to general functionalities, whereas methods for the configuration of devices are intentionally omitted. For example, a shutter actuator may require a procedure to determine how much time it takes to raise the shutter from the bottom to the top, in order to be able to raise the shutter to 50% of the full range in normal operation. In this case, the interface does not offer a method to do so, and it is the responsibility of the implementation to run that procedure before publishing the shutter actuator, if this is required to work properly.

²In the following, “cluster” specifies a package of functions and features which fall in the same application area and correspond to an interface.

4.2. Technological Services


As already mentioned, technological services may combine information from several devices or device events to provide their functionality. Like devices, services are organized in bundles with similar functionalities. The proposed categories for technological services are:

• Communication. This bundle includes all services used for “home - outside world” communication (like sending/receiving SMS/email, calling a carer, etc.).
• Personal Monitoring. This category includes services that provide information about monitored users (like current user location, activity monitoring, abnormal behavior detection, etc.).
• Ambient Monitoring. Here, services are included that provide information about the current state of the environment or the state of fixed-mounted devices within it (like the current state of shutters, washbowl, TV, etc.).
• Ambient Control. This category includes services used to control the environment (mainly devices).
• Personal Support. This class includes services that provide support to users with special needs (like speech recognition, text-to-speech functionality, etc.).
• Special Services. This group includes services that provide generic support to other services in the framework, such as user information, or other special services that do not fit into the other categories.

Figure 4 shows the graphical representation of the introduced technological service groups and the interfaces already defined. As can be seen, all technological services are of Service type and hence have some basic features in common. Besides this, each service may extend the Service interface to provide specific service functions and features. All services are able to inform other technological or functional services about raised service events. These events include the information a service wants to share with other services or the environment.
For example, in the case of a fall detection service, such an event could be “person (id = 02) fell down in the kitchen”. Of course, the data provided by an event depends strongly on the service type. To allow more flexibility, some events provide event data coded in a map with defined keys. If a service wants to be informed about a raised event provided by another service, it has to register to that service as a ServiceListener. This means it has to implement a corresponding function to receive event data from the service. A detailed description of all the already defined device and service interfaces can be found at the OSGi4AMI project page on Sourceforge [33].
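As a minimal sketch of the event mechanism described above, the following Java code outlines how a fall detection service could raise an event whose data is coded in a map with defined keys, and how another service registers as a listener. The class and method names here (ServiceEvent, ServiceListener, addServiceListener, serviceEventRaised, reportFall) are illustrative assumptions, not the actual OSGi4AMI signatures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: names are assumed, not taken from the published interfaces.
public class EventSketch {

    /** Event data is coded in a map with defined keys, as described in the text. */
    public static class ServiceEvent {
        private final Map<String, String> data = new HashMap<>();
        public void put(String key, String value) { data.put(key, value); }
        public String get(String key) { return data.get(key); }
    }

    /** A service that wants to receive events implements this callback. */
    public interface ServiceListener {
        void serviceEventRaised(ServiceEvent event);
    }

    /** Minimal event source, e.g. a fall detection service. */
    public static class FallDetectionService {
        private final List<ServiceListener> listeners = new ArrayList<>();

        public void addServiceListener(ServiceListener listener) {
            listeners.add(listener);
        }

        /** Raise a "person fell down" event to all registered listeners. */
        public void reportFall(String personId, String room) {
            ServiceEvent event = new ServiceEvent();
            event.put("personId", personId);
            event.put("room", room);
            for (ServiceListener listener : listeners) {
                listener.serviceEventRaised(event);
            }
        }
    }
}
```

The map-based payload keeps the listener interface generic: a new event field needs a new key, not a new callback signature.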

Behaviour Monitoring and Interpretation - BMI : Smart Environments, edited by B. Gottfried, and H. Aghajan, IOS Press,



A. Marco et al. / Common OSGi Interface for Ambient Assisted Living Scenarios

Figure 4. Technological service categories

5. Devices and Services Implementation

Up to now, a set of device and service definitions has been presented, but the purpose of AAL is to provide functional services to end-users. To illustrate how that can be achieved using the Common OSGi Interface, let us consider two example services coming from [34]: a guiding service and an alarm detection service.

5.1. Guiding Service

Reaching a destination inside a big building can be a problem for people with cognitive disabilities, and also for people without disabilities in unfamiliar buildings, who can become disoriented and get lost. Also, in special education schools, one of the skills in which children with cognitive disabilities are trained is increasing their autonomy, asking them to go to some place and perform some actions, such as transmitting a message from one therapist to another, which usually requires another person to accompany the child. Thus, with this example it is clear how such a service, which helps people to navigate inside unfamiliar buildings, can be very useful. However, mainstream navigation systems, such as commercial car navigation systems, may not be suitable for indoor guiding of people with cognitive disabilities, as they can misunderstand navigation instructions. In that sense, an indoor guiding system was developed in [35] that takes into account user characteristics (cognitive level profile, physical abilities...) and building status (crowded corridors, elevator availability...) to calculate appropriate routes, providing easy-to-understand instructions supported by building landmarks such as room furniture or structural elements.

Figure 5. Guiding Service architecture

The service model according to the proposed framework is shown in figure 5. The guiding functional service uses a number of technological services to provide its functionality. The service is aware of the user's position by means of a device able to provide its location. Guiding is materialized by two technological services that compute routes and compose adequate guiding messages. The RouteCalculator service uses a BuildingState service that informs about the building's dynamic state (elevator availability, crowded areas...), a BuildingMapsDB service which provides graphs for route calculation, and a UserInformationDB service that provides information about user characteristics that must be taken into account when calculating a route (for instance, if the user uses a wheelchair, routes that involve stairs must not be considered).
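A sketch of how a RouteCalculator could combine information from its three supporting services might look as follows. All names below (canUseStairs, isElevatorAvailable, the Route fields) are assumptions for illustration, not the published OSGi4AMI interfaces.

```java
import java.util.List;

// Hypothetical composition of UserInformationDB and BuildingState inside
// a RouteCalculator; candidate routes stand in for the BuildingMapsDB graphs.
public class RouteSketch {

    public interface UserInformationDB { boolean canUseStairs(String userId); }
    public interface BuildingState { boolean isElevatorAvailable(); }

    /** A candidate route, as it might come from the building maps. */
    public static class Route {
        final boolean usesStairs;
        final boolean usesElevator;
        final int length;
        public Route(boolean usesStairs, boolean usesElevator, int length) {
            this.usesStairs = usesStairs;
            this.usesElevator = usesElevator;
            this.length = length;
        }
    }

    public static class RouteCalculator {
        private final UserInformationDB users;
        private final BuildingState building;

        public RouteCalculator(UserInformationDB users, BuildingState building) {
            this.users = users;
            this.building = building;
        }

        /** Pick the shortest candidate that respects user and building constraints. */
        public Route calculateRoute(String userId, List<Route> candidates) {
            Route best = null;
            for (Route route : candidates) {
                if (route.usesStairs && !users.canUseStairs(userId)) continue;
                if (route.usesElevator && !building.isElevatorAvailable()) continue;
                if (best == null || route.length < best.length) best = route;
            }
            return best;
        }
    }
}
```

For a wheelchair user, a longer elevator route wins over a shorter one with stairs, matching the constraint described in the text.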
The GuidingAssistant service also uses the BuildingMapsDB and UserInformationDB services, and a BuildingLandmarksDB service that provides adequate guiding messages using reference elements in the building (for example, “cross the door next to the fire extinguisher” or “go to the end of the corridor and turn left”). When the user wants to reach some destination, the service looks at his/her current position provided by the location device, and computes the safest route with the RouteCalculator service. Depending on the user position, the next guiding message is provided by the GuidingAssistant service and displayed to the user. If the service detects that the user is off the route, it recalculates a new route and continues the guiding.

Figure 6. Alarm service architecture


5.2. Alarm Service

If knowing the location of a user is of interest to enable guiding him/her to a desired destination, it is also interesting to prevent this user from staying at certain undesired places. A common problem in residences for elderly and disabled people is how to prevent users from wandering into restricted areas (kitchen, healing room, warehouse...), leaving their rooms at night or even leaving the residence. This is usually accomplished by locking doors and confining users in their rooms (which, among other aspects, obviously entails a big security issue in case of an emergency). This problem can be alleviated with a service that raises an alarm when the user enters such restricted areas, so that carers can react quickly, avoiding the need to lock users in their rooms. In addition, it is also of interest to raise alarms when the user demands help, for example by pressing a button, or when the user suffers a fall, which actually is one of the biggest worries for people living alone in their homes and is forcing them to move into residences where they can feel safer.

A model of such a service can be seen in figure 6. The Alarm service checks, with a DangerousAreas service, whether the current user location, which is provided by the device carried by the user, falls inside a dangerous area. This service combines restrictions for the user from a UserInformationDB service with the existing areas in the building supplied by the BuildingAreasDB service to determine if an alarm must be triggered. In this case, the alarm can be displayed on a monitor or sent by SMS to the carer or a relative using a HandleSMS service, with information on the dangerous area. If the device detects a fall, the event raised by the FallDetector device is handled by the Alarm service in the same way as location alarms, and likewise if the user demands help by himself/herself.

Figure 7. Mobile device


5.3. Device Implementation

As mentioned before, the proposed services come from [34]. There, an indoor localization system working with ZigBee and ultrasound was presented, which was installed in a residence for children with mental disabilities, with the aim of providing AAL location-based services like the ones described. Ultrasound technology offers the best accuracy/cost ratio in localization, with well-known systems that support that assertion, like Active Bat [36][37] or Cricket [38]. ZigBee is a communication standard for wireless sensor networks (WSN) that provides many interesting features for that field, like low consumption, message auto-routing and self-healing mechanisms, and it is mainly focused on home control applications.

The mobile device (MD) designed in [34] to compute location is a piece of hardware that integrates an ultrasound transceiver, an accelerometer, a temperature sensor, two buttons and a ZigBee communication module (see figure 7). Once the MD is connected to the RG through the ZigBee network, in the OSGi framework, the ZigBee driver will instantiate a ZigBeeDevice object, from which several devices will derive (see figure 5 and figure 6). Each device will implement the BasicDevice interface, which provides basic methods such as the identification or state of each device. The MD is able to compute its location, so besides the BasicDevice, the LocationCluster (ExtendedLocation interface) will be implemented to retrieve that location. The MD is also able to measure its battery level, so devices will also implement the PowerCluster (ExtendedPower interface), although the services previously presented do not use it. The MD is able to detect falls with the integrated accelerometer [39]. Thus, a FallDetector device is derived, which implements the BasicSensor interface plus the EventSensorCluster.
The built-in buttons in the MD provide the user with the ability to demand help, which is handled by an AlarmButton device that implements the SimpleHMI interface plus the InputSimpleHMICluster. These two devices extend the base device, implementing the common clusters (see figure 8).

Figure 8. Implementation of FallDetector and AlarmButton devices

These devices are used by the technological and functional services presented, and can be shared with other services without interference. The ZigBee driver will also publish devices for the accelerometer and temperature sensor, which besides the BasicSensor can implement any of the sensor-related clusters, such as the ThresholdSensorCluster, the MinMaxSensorCluster or the StreamingSensorCluster, which will be available to any other services (see figure 9).

Figure 9. Implementation of Temperature Sensor and Accelerometer Sensor devices
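The base-interface-plus-cluster pattern described for the FallDetector can be sketched in Java as follows. The interface names follow the text, but their methods (getId, getState, addSensorListener, sensorEventRaised) are assumptions for this illustration, not the published signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a device combining a base interface with a cluster, in the
// spirit of figure 8; method names are assumed.
public class DeviceSketch {

    /** Basic methods every device offers (identification, state). */
    public interface BasicDevice {
        String getId();
        String getState();
    }

    /** Listener side of the assumed EventSensorCluster. */
    public interface SensorListener {
        void sensorEventRaised(String deviceId, String event);
    }

    /** Cluster for sensors that raise asynchronous events. */
    public interface EventSensorCluster {
        void addSensorListener(SensorListener listener);
    }

    /** FallDetector device: the base interface plus the event cluster. */
    public static class FallDetector implements BasicDevice, EventSensorCluster {
        private final String id;
        private final List<SensorListener> listeners = new ArrayList<>();

        public FallDetector(String id) { this.id = id; }
        public String getId() { return id; }
        public String getState() { return "connected"; }
        public void addSensorListener(SensorListener listener) { listeners.add(listener); }

        /** Would be called by the ZigBee driver when the accelerometer signals a fall. */
        public void fireFallDetected() {
            for (SensorListener listener : listeners) {
                listener.sensorEventRaised(id, "fall");
            }
        }
    }
}
```

Because clusters are separate interfaces, the same device object can be used through BasicDevice by one service and through EventSensorCluster by another, which is what allows sharing without interference.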


5.4. Services Implementation

The guiding service will use the information provided by the device implementation presented in the previous section, but it could use any other device providing location with any other technology. All the existing services will implement the BaseService interface, plus their specific interfaces (see figure 10). BuildingMapsDB and BuildingLandmarksDB are database-based services, which belong to the SpecialServices cluster, and provide static information about the building, the same as the UserInformationDB, whereas the BuildingState service from the AmbientMonitoring cluster provides dynamic information about what is happening in the building. RouteCalculator and GuidingAssistant are upper-level technological services, which implement the BaseService and their specific interfaces. In the case of the RouteCalculator service, it must implement a BuildingStateListener in order to be notified about changes in the building state, and register itself as a listener for that service. The guiding functional service will implement the BaseService as well, and a DeviceListener to detect when the user location changes, and will provide the adequate guiding message or recalculate the route if the user abandons the provided one. It does not implement listeners for the RouteCalculator or GuidingAssistant services, as these services operate synchronously and the guiding service will be blocked when calling methods on them until the reply is ready. The process flow of the service can be sketched as follows:


ON   DeviceLocationChanged
IF   locationBelongsToRoute
THEN showNextGuidanceMessage
ELSE calculateRoute
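The flow above can be written out as a hypothetical DeviceListener callback; the class name, method name and the Set-based route model below are assumptions for illustration.

```java
import java.util.Set;

// ON DeviceLocationChanged, decide the next action for the new location.
public class GuidingFlowSketch {

    private final Set<String> currentRoute;

    public GuidingFlowSketch(Set<String> currentRoute) {
        this.currentRoute = currentRoute;
    }

    /** Invoked when the location device reports a new position. */
    public String onDeviceLocationChanged(String location) {
        if (currentRoute.contains(location)) {   // IF locationBelongsToRoute
            return "showNextGuidanceMessage";    // THEN
        } else {
            return "calculateRoute";             // ELSE: the user left the route
        }
    }
}
```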

For the alarm functional service, the UserInformationDB is the same as in the guiding service, and there is a BuildingAreasDB service, similar to the BuildingMapsDB, implementing the BaseService interface and belonging to the SpecialServices cluster. There is a HandleSMS service belonging to the Communication cluster that provides SMS support (send and receive messages). The DangerousAreas service is similar to the RouteCalculator service, as it relates static information about the user and the building. The alarm functional service is the most complex one, as it must implement the BaseService plus listeners for services and devices: HandleSMSListener, FallDetectorListener, DeviceListener and SimpleHMIListener, as well as register as a listener of those services and devices. The process flow for the service is:

IF   userRequestHelp OR userEntersOnRestrictedArea OR userFallDetected
THEN getUserLocation AND getUserIdentification AND informCarerUsingSMS
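The alarm rule above can be condensed into a single hypothetical method: any of the three conditions triggers one notification carrying the user identification and location. The method name and message format are assumptions.

```java
// Sketch of the alarm rule: OR over the triggers, then identify, locate
// and inform the carer (here reduced to composing the SMS text).
public class AlarmFlowSketch {

    /** Returns the SMS text for the carer, or null when no alarm is needed. */
    public static String handle(boolean userRequestHelp,
                                boolean userEntersOnRestrictedArea,
                                boolean userFallDetected,
                                String userId, String location) {
        if (userRequestHelp || userEntersOnRestrictedArea || userFallDetected) {
            // getUserLocation AND getUserIdentification AND informCarerUsingSMS
            return "User " + userId + " needs attention at " + location;
        }
        return null;
    }
}
```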

As with the guiding service, location or fall detection could also be provided by other technologies; only the related services or devices would need to be used, without modifying the alarm service.


Figure 10. Services definition for guiding and alarm functional services


6. Conclusion and Outlook

This paper introduced common interfaces for AmI applications based on OSGi. By specifying standardized interfaces for devices and services that are used in many smart home environments, it becomes possible to share already implemented OSGi service and device bundles. Because developers of functional services, technological services and devices share the same interfaces, they can exchange their services and devices easily. There is no need to re-implement services or devices, or to define wrapper classes for adapting the provided technology to one's own applications. The proposed interfaces do not claim to be exhaustive definitions of every device or service available, but a guide to deploy devices and services that can be used homogeneously. The interfaces are kept as simple as possible, and details about how to configure devices and services, or redundant functionalities, are intentionally omitted. Nevertheless, these interfaces only propose common methods that ease the utilization of devices and services; they do not limit the additional functionalities implementations can provide.

Looking to the near future, device technologies and communication protocols will probably stay heterogeneous, because of new hardware developments and commercial reasons. But regarding technological services, a standardized common way to access service information, like the one described in this paper using OSGi and common interfaces, accepted by all service developers, is imaginable. Of course, this would clear the way for distributed service development as well as service sharing and service selling. As a result, it would open up a big market and also offer a new possibility for easy service usage.

When talking to service developers in different countries and projects, the same problem seems to occur again and again: “How could high-level service development be simplified?” In many cases, companies or service providers want an easy solution for creating new high-level services (functional services) just by using existing technological services, even without knowledge of complex programming languages such as Java or C++. Because high-level services are mostly based on rules, which combine information from different services, they want the possibility to create such rules using a very simple and basic programming language, based on Boolean expressions, as BPEL (Business Process Execution Language) does in the development of enterprise applications. If standardized technological service interfaces are defined and agreed upon by all service developers, such a rule-based high-level service development framework could easily be developed. This means that every company providing services would be able to offer its technological services to functional service developers, who would in turn use these services to create high-level services with ease. As a result, such services would no longer be created only by computer specialists working for companies, but by the persons who want to use them, like nurses, caretakers or the end users themselves. Because such people are very close to the real application field, services can be created or adapted to a very specific task. And this would mean a huge improvement in service quality, service usage and service efficiency.
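The envisioned Boolean-rule layer could be as thin as the following sketch: facts published by technological services are named, and a functional service is just a predicate over them. The fact names and the map-of-Booleans model are assumptions for this illustration, not part of the proposed interfaces.

```java
import java.util.Map;
import java.util.function.Predicate;

// Sketch of a rule layer over technological-service facts, so that a
// high-level service can be expressed as a Boolean combination of facts.
public class RuleSketch {

    /** A named fact, looked up in the current snapshot of service outputs. */
    public static Predicate<Map<String, Boolean>> fact(String name) {
        return facts -> facts.getOrDefault(name, Boolean.FALSE);
    }

    /** Example rule: alarm IF helpRequested OR (nightTime AND doorOpen). */
    public static final Predicate<Map<String, Boolean>> ALARM_RULE =
            fact("helpRequested").or(fact("nightTime").and(fact("doorOpen")));
}
```

Combinators like or/and are exactly the vocabulary a non-programmer rule editor would expose, while the evaluation stays ordinary Java underneath.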


References

[1] A. Millonig, N. Brändle, S. Van Der Spek, and D. Bauer, Pedestrian Behaviour Monitoring: Methods and Experiences, Behaviour Monitoring and Interpretation in Ambient Environments, IOS Press (2009).
[2] P. Laube, Movement patterns in spatio-temporal data, Behaviour Monitoring and Interpretation in Ambient Environments, IOS Press (2009).
[3] Y. Kurata, and M. J. Egenhofer, Interpretation of Behaviours from a Viewpoint of Topology, Behaviour Monitoring and Interpretation in Ambient Environments, IOS Press (2009).
[4] F. Dylla, Qualitative Spatial Reasoning for Navigating Agents, Behaviour Monitoring and Interpretation in Ambient Environments, IOS Press (2009).
[5] Z. Wood, and A. Galton, Collectives and their Movement Patterns, Behaviour Monitoring and Interpretation in Ambient Environments, IOS Press (2009).
[6] A. Aztiria, and A. Izaguirre, Learning of User's Preferences improved through speech based interaction, Behaviour Monitoring and Interpretation in Ambient Environments, IOS Press (2009).
[7] A. Hein, M. Giersich, C. Burghardt, and T. Kirste, Model based Inference Techniques for detecting Activities of Daily Living, Behaviour Monitoring and Interpretation in Ambient Environments, IOS Press (2009).
[8] T. Adlam, B. Carey-Smith, N. Evans, and A. Mihailidis, Implementing Monitoring and Technological Interventions in Smart Homes for People with Dementia: Case Studies, Behaviour Monitoring and Interpretation in Ambient Environments, IOS Press (2009).
[9] S. Giroux, The praxis of cognitive assistance in smart homes, Behaviour Monitoring and Interpretation in Ambient Environments, IOS Press (2009).
[10] Home Gateway Initiative, http://www.homegatewayinitiative.org/ (2009).
[11] OSGi Alliance, OSGi Service Platform, Core Specification, Release 4, Version 4.1, http://www.osgi.org (2007).


[12] Charles Gouin-Vallerand, Sylvain Giroux, Managing and deployment of applications with OSGi in the context of Smart Homes, Third IEEE International Conference on Wireless and Mobile Computing, Networking and Communications WiMob 2007 (2007).
[13] EU Project: MonAMI - Mainstreaming on Ambient Intelligence, http://www.monami.info (2009).
[14] EU Project: Easyline+ - Low cost advanced white goods for a longer independent life of elderly people, http://www.easylineplus.com (2009).
[15] EU Project: Aspire - Advanced Sensors and lightweight Programmable middleware for Innovative Rfid Enterprise applications, http://fp7-aspire.eu/index.html (2009).
[16] EU Project: ePerSpace - Towards the era of personal services at home and everywhere, http://www.ist-eperspace.org/ (2009).
[17] EU Project: Amigo - Intelligence for the networked home environment, http://www.amigoproject.org (2008).
[18] Spanish Project: Ambiennet: Soporte a la navegación de personas con discapacidad en Ambientes Inteligentes (2009).
[19] Spanish Project: AmiVital: Entorno personal digital para la salud y el bienestar, http://www.amivital.es (2007).
[20] EU Project: TEAHA - The European Application Home Alliance, http://www.teaha.org (2008).
[21] EU Project: H@H - Hearing at Home, http://www.hearing-at-home.eu/ (2009).
[22] EU Project: Share-it - Supported Human Autonomy for Recovery and Enhancement of cognitive and motor abilities using information technologies, http://www.istshareit.eu/shareit (2009).
[23] Andre L. C. Tavares, and Marco Tulio Valente, A Gentle Introduction to OSGi, ACM SIGSOFT Software Engineering Notes, 33 (5) (2008).
[24] UPnP Forum, http://www.upnp.org/standardizeddcps/default.asp (2008).
[25] Miriam Ibañez, Service Provisioning for the residential environment, OSGi Alliance Congress (2002).
[26] Miriam Ibañez, Víctor Manuel García, José María Montero and Cristina Díaz, Aplicación del estándar OSGi en la arquitectura del proyecto Hogar.es, Comunicaciones de Telefónica I+D, 31 (2003), 25–34.
[27] Jorge Falcó, Roberto Casas, Álvaro Marco, José Luis Sevillano, Domótica en Entornos Asistenciales. ¿Puede la Domótica Ayudar a Personas con Discapacidad? Retos Pendientes, Montajes e Instalaciones, 386 (2004), 102–106.
[28] Jan S. Rellermeyer, Michael Duller, Ken Gilmer, Damianos Maragkos, Dimitrios Papageorgiou, and Gustavo Alonso, The Software Fabric for the Internet of Things, IOT (2008), 87–104.
[29] Rebeca P. Diaz Redondo, Ana Fernandez Vilas, Manuel Ramos Cabrer, Jose Juan Pazos Arias, Jorge Garcia Duque, and Alberto Gil Solla, Enhancing Residential Gateways: A Semantic OSGi Platform, IEEE Intelligent Systems, IEEE Educational Activities Department (2008), 32–47.
[30] Andre Bottaro, Anne Gerodolle, and Philippe Lalanda, Pervasive Service Composition in the Home Network, 21st International Conference on Advanced Networking and Applications AINA'07 (2007).
[31] C. Lee, D. Nordstedt, and S. Helal, Enabling Smart Spaces with OSGi, IEEE Pervasive Computing, 33 (3) (2003), 89–94.
[32] ZigBee Cluster Library Specification, ZigBee Alliance, available at http://www.zigbee.org/en/spec download/download request cl.asp (2007).
[33] OSGi4AMI Sourceforge project page, http://sourceforge.net/projects/osgi4ami/ (2009).
[34] Álvaro Marco, Roberto Casas, Jorge Falco, Héctor Gracia, José Ignacio Artigas, and Armando Roy, Location-based services for elderly and disabled people, Computer Communications, 31 (6) (2008), 1055–1066.
[35] José M. Falcó, GUIA: Una aportación al guiado de personas mayores o con discapacidad, doctoral dissertation, Univ. of Zaragoza (2004).
[36] A. Ward, A. Jones, and A. Hopper, “A New Location Technique for the Active Office”, IEEE Personal Communications, 4 (5) (1997), 42–47.


[37] M. Addlesee, R. Curwen, S. Hodges, J. Newman, P. Steggles, A. Ward, and A. Hopper, “Implementing a Sentient Computing System”, Computer, 34 (8) (2001), 50–56.
[38] N.B. Priyantha, A. Chakraborty, and H. Balakrishnan, “The Cricket Location-Support System”, International Conference on Mobile Computing and Networking (2000), 32–43.
[39] Rubén Blasco, Roberto Casas, Álvaro Marco, Victorián Coarasa, Yolanda Garrido, and Jorge Falco, Fall Detector based on Neural Networks, Proceedings of the International Conference on Bio-inspired Systems and Signal Processing BIOSIGNALS 2008 (2008), 540–545.


Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2009 © 2009 The authors and IOS Press. All rights reserved.

Author Index

Adlam, T. 159
Aghajan, H.K. v, 1
Aghajan, Y. 212
Asensio, A. 336
Augusto, J.C. 289
Aztiria, A. 289
Basagoiti, R. 289
Bauchet, J. 183
Bauer, D. 11
Bauer, G. 336
Bennett, R. 319
Blasco, R. 336
Boger, J. 159
Bouchard, B. 183
Bouzouane, A. 183
Brändle, N. 11
Burghardt, C. 257
Carey-Smith, B. 159
Casas, R. 336
Duckham, M. 319
Dylla, F. 98
Egenhofer, M.J. 75
Evans, N. 159
Galton, A. 129
Giersich, M. 257
Giroux, S. 183
Gottfried, B. v, 1
Hein, A. 257
Ibañez, M. 336
Izaguirre, A. 289
Jean-Bart, B. 336
Kiefer, P. 235
Kirste, T. 257
Kurata, Y. 75
Lacroix, J. 212
Laube, P. 43
Leblanc, T. 183
Marco, A. 336
Mihailidis, A. 159
Millonig, A. 11
Orpwood, R. 159
Pigot, H. 183
Ray, M. 11
Schlieder, C. 235
Stein, K. 235
van der Spek, S. 11
van Halteren, A. 212
Wood, Z. 129
