Computer and Information Technology in Pervasive Environments 9781846637636, 9781846637629


English Pages 105 Year 2007


ISSN 1742-7371 Volume 3 Number 2 2007

International Journal of Pervasive Computing and Communications

Computer and information technology in pervasive environments

Guest Editor: Daming Wei

www.emeraldinsight.com


CONTENTS

Access this journal online

Editorial advisory board

Guest editorial

Energy-efficient adaptive resource management strategy for large-scale mobile ad hoc networks
Shengfei Shi, Jianzhong Li, Chaokun Wang and Yuhui Wu

Ontology alignment as a basis for mobile service integration and invocation
Jingshan Huang, Jiangbo Dang, Michael N. Huhns and Yongzhen Shao

Towards an analysis of WonGoo performance
Tianbo Lu, Binxing Fang, Yuzhong Sun and Xueqi Cheng

Integrated business-process driven design for service-oriented enterprise applications
Xingdong Shi, Weili Han, Yinsheng Li and Ying Huang

JDGC: an integrated decentralized distributed computing platform for Java program
Tay Teng Tiow, Chu Yingyi and Sun Yang

k-PCA: a semi-universal encoder for image compression
Chuanfeng Lv and Qiangfu Zhao

Access this journal electronically
The current and past volumes of this journal are available at: www.emeraldinsight.com/1742-7371.htm
You can also search more than 150 additional Emerald journals in Emerald Management Xtra (www.emeraldinsight.com). See page following contents for full details of what your access includes.

www.emeraldinsight.com/ijpcc.htm

As a subscriber to this journal, you can benefit from instant, electronic access to this title via Emerald Management Xtra. Your access includes a variety of features that increase the value of your journal subscription.

How to access this journal electronically
To benefit from electronic access to this journal, please contact [email protected] A set of login details will then be provided to you. Should you wish to access via IP, please provide these details in your e-mail. Once registration is completed, your institution will have instant access to all articles through the journal’s Table of Contents page at www.emeraldinsight.com/1742-7371.htm More information about the journal is also available at www.emeraldinsight.com/ijpcc.htm

Our liberal institution-wide licence allows everyone within your institution to access your journal electronically, making your subscription more cost-effective. Our web site has been designed to provide you with a comprehensive, simple system that needs only minimum administration. Access is available via IP authentication or username and password.

Emerald online training services
Visit www.emeraldinsight.com/training and take an Emerald online tour to help you get the most from your subscription.

Key features of Emerald electronic journals

Automatic permission to make up to 25 copies of individual articles
This facility can be used for training purposes, course notes, seminars etc. This only applies to articles of which Emerald owns copyright. For further details visit www.emeraldinsight.com/copyright

Online publishing and archiving
As well as current volumes of the journal, you can also gain access to past volumes on the internet via Emerald Management Xtra. You can browse or search these databases for relevant articles.

Key readings
This feature provides abstracts of related articles chosen by the journal editor, selected to provide readers with current awareness of interesting articles from other publications in the field.

Reference linking
Direct links from the journal article references to abstracts of the most influential articles cited. Where possible, this link is to the full text of the article.

E-mail an article
Allows users to e-mail links to relevant and interesting articles to another computer for later use, reference or printing purposes.

Structured abstracts
Emerald structured abstracts provide consistent, clear and informative summaries of the content of the articles, allowing faster evaluation of papers.

Additional complimentary services available
Your access includes a variety of features that add to the functionality and value of your journal subscription:

Xtra resources and collections
When you register your journal subscription online you will gain access to additional resources for Authors and Librarians, offering key information and support to subscribers. In addition, our dedicated Research, Teaching and Learning Zones provide specialist "How to guides", case studies and interviews and you can also access Emerald Collections, including book reviews, management interviews and key readings.

E-mail alert services
These services allow you to be kept up to date with the latest additions to the journal via e-mail, as soon as new material enters the database. Further information about the services available can be found at www.emeraldinsight.com/alerts

Emerald Research Connections
An online meeting place for the world-wide research community, offering an opportunity for researchers to present their own work and find others to participate in future projects, or simply share ideas. Register yourself or search our database of researchers at www.emeraldinsight.com/connections

Choice of access
Electronic access to this journal is available via a number of channels. Our web site www.emeraldinsight.com is the recommended means of electronic access, as it provides fully searchable and value added access to the complete content of the journal. However, you can also access and search the article content of this journal through the following journal delivery services:
EBSCOHost Electronic Journals Service: ejournals.ebsco.com
Informatics J-Gate: www.j-gate.informindia.co.in
Ingenta: www.ingenta.com
Minerva Electronic Online Services: www.minerva.at
OCLC FirstSearch: www.oclc.org/firstsearch
SilverLinker: www.ovid.com
SwetsWise: www.swetswise.com

Emerald Customer Support
For customer support and technical help contact:
E-mail [email protected]
Web www.emeraldinsight.com/customercharter
Tel +44 (0) 1274 785278
Fax +44 (0) 1274 785201

IJPCC 3,2


ADVISORY EDITORIAL BOARD
Koji Nakano, Hiroshima University, Japan
Stephan Olariu, Old Dominion University, USA
Sartaj Sahni, University of Florida, USA
Franciszek Seredynski, Polish Academy of Science, Poland
Arun Somani, Iowa State University, USA
Ralf Steinmetz, Darmstadt University of Technology, Germany
Mazin Yousif, Intel, USA
Albert Y. Zomaya (Chair), University of Sydney, Australia

EDITORIAL BOARD
Emile Aarts, Philips Research Labs, The Netherlands
Ralf Ackermann, Darmstadt University of Technology, Germany
Enrique Alba, University of Malaga, Spain
Leonard Barolli, Fukuoka Institute of Technology, Japan
Mark Billinghurst, University of Washington, USA
Victor Callaghan, University of Essex, UK
Sajal K. Das, University of Texas at Arlington, USA
Amitava Datta, University of Western Australia, Australia
Dave De Roure, University of Southampton, UK
Richard W. DeVaul, MIT Media Lab, USA
Minyi Guo, University of Aizu, Japan
Jürg Gutnecht, ETHZ, Switzerland
Oliver Haase, Bell Labs Research, USA
Uwe Hansmann, IBM, Germany
Matthias Hollick, Darmstadt University of Technology, Germany
Tsung-Chuan Huang, National Sun Yat-sen University, Taiwan
Jadwiga Indulska, University of Queensland, Australia
Chung-Ta King, National TsingHua University, Taiwan
Gerd Kortuem, University of Lancaster, UK
Bjorn Landfeldt, University of Sydney, Australia
Qusay H. Mahmoud, University of Guelph, Canada
Tom Martin, Virginia Tech, USA
Rene Mayrhofer, Johannes Kepler University at Linz, Austria
Jörg R. Mühlbacher, Johannes Kepler University at Linz, Austria
Max Mühlhäuser, Darmstadt University of Technology, Germany
Hideyuki Nakashima, Future University, Hakodate, Japan
Joseph Kee-Yin Ng, Hong Kong Baptist University, China
Lionel M. Ni, Hong Kong University of Science and Technology, China
Tom Pfeifer, Waterford Institute of Technology, Ireland
Joachim Posegga, University of Hamburg, Germany
Aaron Quigley, University College Dublin, Ireland
Omer Rana, Cardiff University, UK
Kurt Rothermel, University of Stuttgart, Germany
Anthony Savidis, ICS Forth, Greece
Dieter Schmalstieg, TU Vienna, Austria
Andreas Schrader, University of Lübeck and International School of New Media GmbH, Germany
Timothy K. Shih, Tamkang University, Taiwan
Ivan Stojmenovic, Ottawa University, Canada
Makoto Takizawa, Tokyo Denki University, Japan
El-ghazali Talbi, University of Lille, France
David Taniar, Monash University, Australia
Yoshito Tobe, Tokyo Denki University, Japan
Hideyuki Tokuda, Keio University, Japan
Yu-Chee Tseng, National Chiao-Tung University, Taiwan
Cho-li Wang, Hong Kong University, China
Jie Wu, Florida Atlantic University, USA
Zhaohui Wu, Zhejiang University, China
Franco Zambonelli, University of Modena and Reggio Emilia, Italy
Sherali Zeadally, Wayne State University, USA

International Journal of Pervasive Computing and Communications, Vol. 3 No. 2, 2007, p. 120. © Emerald Group Publishing Limited, 1742-7371

Guest editorial

About the Guest Editor
Daming Wei graduated from the Department of Mathematics and Mechanics, Tsinghua University, Beijing, China, in 1970. He received the M.Eng. degree in Computer Engineering from Shanghai Institute of Technology (Shanghai University) in 1981, and the PhD in Biomedical Engineering from Zhejiang University, China, in 1985. He was a deputy director of the Biomedical Engineering Section at Zhejiang University before joining Tokyo Institute of Technology in 1986. Since then, he has worked in industry and at universities in Japan. He is currently a professor in the Faculty of Computer Science and Engineering and director of the Information Systems and Technology Center (ISTC), University of Aizu, Japan. He has previously served as Director of the Software Department and Chair of the Graduate Department of Information Systems. Professor Wei is well known for research and development of a state-of-the-art computer heart model and simulation of the electrocardiogram. Recent directions in his group include biomedical modeling and computer simulation, and information technology in biomedicine. He serves as a council member of the International Society of Bioelectromagnetism and as an editor of the International Journal of Bioelectromagnetism. He is a council member of the Japan Biomedical Engineering Society Tohoku Branch. He is founder and co-chair of the IEEE International Conference on Computer and Information Technology (IEEE CIT). He leads several large-scale research projects, supported by central or prefectural government funds, that focus on information technology in healthcare. He is a guest professor at several universities in Japan and China.

Special issue on computer and information technology in pervasive environments
With the emergence and convergence of advanced network technologies and electronic devices, the trend from personal computing towards pervasive computing is continuing. Pervasive computing devices are either mobile or embedded in any type of object and are linked through interconnected networks. Pervasive computing environments are characterized as dynamic, mobile, reconfigurable, and personalized, because pervasive applications should be aware of variations in location, time, device, and user. Developing dynamic, context-aware services and applications is extremely complex, since current computing, networking, and information technology are far from providing a perfect solution to any user at any time, at any location, on any device.

CIT'05, the fifth International Conference on Computer and Information Technology, provides a premier venue to bring together researchers working in all foundational and applied research areas of pervasive and ubiquitous computing. This special issue on computer and information technology in pervasive environments contains extended versions of papers selected from CIT'05 after a second round of review.

It is more challenging to develop resource management methods for mobile ad hoc networks than for infrastructure networks, since the mobile hosts are connected by a wireless network with a frequently changing topology. Shi et al. propose a novel resource management strategy in their paper, which can be used in many current routing protocols. The QoS issue is also considered for the resource discovery algorithm and the broadcasting method.

To use services from other devices, a mobile device must be able to comprehend their descriptions. Ontologies can aid in this comprehension, but ontologies designed


independently for each device would have heterogeneous semantics. Huang et al. present an automated schema-based approach to align the ontologies from interacting devices as a basis for mobile service invocation. When the ontologies are ambiguous about the services provided, the approach introduces compatibility vectors as a means of maintaining ontology quality and deciding which service to choose to reduce the ambiguity.

Lu et al. present the WonGoo system, which provides strong anonymity and high efficiency with layered encryption and random forwarding. The paper focuses on measuring the performance of the WonGoo system. The results show that WonGoo can protect against the (n - 1) attack and provide variable anonymity, and they indicate how confident collaborators can be that their immediate predecessor is in fact the path initiator.

The paper by Shi et al. proposes a new approach to integrated business-process-driven modeling and implementation for service-oriented enterprise applications. There are three phases in this approach: business environment modeling, business process modeling, and compiling. The approach is implemented on the open platform Eclipse V3.1 so that it can be integrated with other SOA tools to provide a total solution for building enterprise applications.

The paper by Tay et al. demonstrates an integrated distributed computing platform, Java distributed code generating and computing (JDGC), which allows standard, single-machine-oriented Java programs to be transparently executed in a distributed system. The functions of JDGC include distributed code generation, host recruitment, code distribution, runtime monitoring and control, and error recovery. The architecture of JDGC is fully decentralized in that every participating host is identical in function.

Lv et al. propose a new technique called k-PCA (principal component analysis) and apply it to image compression. k-PCA is a combination of vector quantization and PCA. Although a k-PCA encoder is more complex than a single PCA encoder, the compression ratio can be much higher.

As the guest editor-in-chief, I would like to thank the editorial board for supporting this special issue, as well as the CIT. I appreciate the contributions of all reviewers for their excellent work in reviewing the submitted papers. I hope this issue will be helpful and valuable for all readers. Congratulations to all authors in this issue!

Daming Wei
University of Aizu, Japan

Guest referees
Phillip G. Bradford, University of Alabama, USA
Chin-Chen Chang, Feng Chia University, Taiwan
Xiaoyan Hong, University of Alabama, USA
Chun-Hsi Huang, University of Connecticut, USA
Hai Jin, Huazhong University of Science and Technology, China
Qun Jin, Waseda University, Japan
Mengchi Liu, Carleton University, Canada
Bhanu Prasad, Florida A&M University, USA
Hui Wang, University of Aizu, Japan
Yan Wang, Macquarie University, Australia
Haibin Zhu, Nipissing University, Canada
Ying Zhu, Georgia State University, USA

The current issue and full text archive of this journal is available at www.emeraldinsight.com/1742-7371.htm

Energy-efficient adaptive resource management strategy for large-scale mobile ad hoc networks

Shengfei Shi and Jianzhong Li
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China

Chaokun Wang
School of Software, Tsinghua University, China, and

Yuhui Wu
School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China

Received 31 December 2005
Revised 28 September 2006

Abstract
Purpose – The purpose of this paper is to propose a novel resource management strategy, which needs no special frameworks and directory servers.
Design/methodology/approach – The key idea is to piggyback a little extra packet header on the normal routing message by resource providers randomly. The clients can obtain the resource information gradually and need no dedicated resource queries.
Findings – The results of simulation confirm the good performance of our algorithms in different situations in terms of query latency and power consumption.
Originality/value – A novel resource management strategy, which needs no special frameworks and directory servers. The approach can be used in many current routing protocols. The quality of service issue is also considered for resource discovery algorithm and broadcasting method.
Keywords Resource management, Mobile communication systems, Telecommunication network routing
Paper type Research paper

International Journal of Pervasive Computing and Communications, Vol. 3 No. 2, 2007, pp. 123-137. © Emerald Group Publishing Limited, 1742-7371. DOI 10.1108/17427370710847273

This work is supported by the National Natural Science Foundation of China under Grant No. 60473075, the Key Program of the National Natural Science Foundation of China under Grant No. 60533110, the Program for New Century Excellent Talents in University (NCET) under Grant No. NCET-05-0333, the National Grand Fundamental Research 973 Program of China under Grant No. 2006CB303000, and the key Natural Science Foundation of Heilongjiang Province under Grant No. zjg03-05.

1. Introduction
In mobile ad hoc networks (MANETs), all hosts roam. The network that interconnects these mobile hosts is a wireless one with a frequently changing topology, and there are no fixed infrastructures or mobile support stations of the kind that provide basic wireless communication management in traditional cellular systems such as GSM. The applications that originally matched the features of MANETs are emergency rescue operations and battlefield communication, which are deployed in unfriendly, remote, infrastructure-less areas. Recently, more and more civilian applications using MANETs have been developed, such as urban and campus resource sharing systems for large-scale

MANETs. In these research areas, much attention has been paid to efficient resource discovery methods, which are crucial for locating a resource provider (RP) with low query latency and communication overhead.

Many resource discovery protocols have been proposed for wired networks. The Dynamic Host Configuration Protocol (DHCP) provides configuration parameters to Internet hosts (Droms, 1993). DHCP is built on a client–server model, where designated DHCP server hosts allocate network addresses and deliver configuration parameters to dynamically configured hosts. The domain name system (DNS) provides name service for the Internet. It is one of the largest name services in operation today (Mockapetris and Dunlap, 1988). The data in DNS can also be changed with low frequency (Vixie et al., 1997). The LDAP protocol is used for retrieving and updating information in an X.500 model-based directory (Wahl, 1997). The Simple Object Access Protocol (SOAP) provides a message-passing mechanism to exchange information between service providers and requestors (http://www.w3.org/TR/soap/). The resource discovery working group in the Internet Engineering Task Force has developed the service location protocol (SLP) (Veizades et al., 1997). The main entity in SLP is the directory agent, which is responsible for registering services.

It is more challenging, however, to develop resource discovery methods for MANETs, which have several distinct features compared with wired networks and wireless networks with infrastructure. These features include the mobility of clients and RPs and frequent topology changes. In recent years, some resource discovery approaches have been proposed for MANETs. A two-stage protocol to solve the resource discovery problem in ad hoc networks is proposed in Tang et al. (1998). The solution is fully distributed, with each node advertising the services it provides. There are two stages in the solution: resource discovery and resource location. Every node that provides a service sends an advertisement in the first stage, and in the next stage the clients send requests based on the information obtained in the first stage. In Liu et al. (2002), a framework was proposed for the discovery of resources and quality of service (QoS)-aware selection of RPs. The key entities of this framework are a set of self-organized discovery agents, which are responsible for registering the resources and answering resource discovery queries. As in Tang et al. (1998), the service providers have to register their resource information with agents before clients submit queries to obtain the location of resources. In Klein et al. (2003), a lightweight overlay structure, LANES, was proposed. LANES is an application-layer overlay method developed from CAN, which was designed for Internet-based peer-to-peer networks. In Tchakarov and Vaidya (2004), an algorithm for content location in location-aware ad hoc networks was proposed, but it is a geography-based content location protocol that needs geographical location information and special hardware support.

Although these approaches have many advantages for ad hoc networks, they are not suitable for very large-scale MANETs for the following reasons. First, they all need special stages to register resources and to maintain the directory information periodically; the costs of these stages are very high in large-scale ad hoc networks. Second, none of them utilizes the large amount of information present in current routing messages, such as the route request and route reply packets used in the AODV (Perkins and Royer, 1999) and MAODV (Royer and Perkins, 1999) routing protocols. In fact, clients can obtain resource information by analyzing the data packets they forward and overhear. Third, in large-scale ad hoc networks it is very difficult for discovery agents to monitor the QoS metrics. Moreover, a good evaluation given by agents cannot guarantee the QoS between

the clients who select service providers based on these evaluations and the corresponding service providers, because the mobility patterns and energy levels of clients vary. Finally, it is very difficult and inefficient to manage resource and QoS information when hosts move very fast.

In this paper, a novel resource management strategy is proposed. Our approach does not require any additional framework or hardware, and it can be extended to most current routing protocols. The key idea is for RPs to randomly piggyback a small extra packet header on normal routing messages. The clients obtain the resource information gradually and need no dedicated resource queries. We also consider the QoS issue and the broadcasting optimization problem, because broadcasting is a very important method in MANETs, as Gruenwald et al. (2002) proposed. Simulation results confirm the good performance of our algorithms in different situations in terms of query latency and power consumption.

2. A novel energy-efficient adaptive resource management strategy
2.1 Overview of the strategy
Suppose that in a large-scale MANET there are a number of hosts that can provide some kinds of resources. The resources can be replicated among the RPs. A client interested in a particular resource can submit a query to the RPs that hold the resource or a replica of it. RPs broadcast the frequently requested resources, which are called hot resources. Our approach is composed of several algorithms that together provide energy-efficient resource management, described below.

Resource notification initiation and maintenance procedure. Initially the clients have no knowledge of the RPs, so the RPs must inform the clients which kinds of resources they hold. Note that it is not necessary to flood special resource information periodically, because many routine procedures can perform this task. Unlike other resource management approaches, our resource discovery method builds on most current routing protocols. To reduce the communication overhead, we introduce only a small amount of extra data piggybacked on the header of normal routing packets, which is called the PBH. The PBH is not flooded by RPs as in many previous approaches. Instead, the RP randomly selects some neighbors and sends the PBH to them. These hosts then forward the PBH at an appropriate time, randomly selected from occasions such as route request replies and message forwarding. Each intermediate host along the routing path will occasionally obtain these PBH messages. Hosts that do not receive these messages directly can also get them by overhearing their neighbors. In this way, as time goes by, the resource information spreads over the whole ad hoc network.

Mobility-aware hot resource broadcasting schedule and optimization algorithm. Our approach to broadcasting considers not only the energy limitation of the RP, but also the mobility of all hosts involved in the operation. We propose cost metrics that are used to determine the resource content to be broadcast. The metric value is calculated by combining several characteristics, such as the distance vector (DV), the mobility trend, and the density of clients that share an interest in the resource.

Resource discovery and selection optimization algorithm. Before it can query a resource from an RP, a client must know which RP can provide the resource of interest. As mentioned, the client is initially not aware of the RPs. It randomly selects some hosts from its local routing table as possible relay hosts that may be able to reach the RP. Each host that receives the resource request checks its local PBH buffer; if


some hosts are related to an RP that satisfies the resource request, the IDs of these hosts are returned. Otherwise, the process is repeated by randomly selecting some hosts in its local routing table. The procedure converges to a set of hosts that are either RPs that can provide the resource or hosts that are very near to RPs. Throughout this procedure, the client maintains a set of candidate hosts and monitors the QoS metrics, such as the communication delay between it and each host, the possible mobility trend, and the request density of each candidate in the neighborhood. In the next sections, we give the details of these algorithms.

2.2 Resource notification initiation and maintenance algorithm
Before describing the details of the algorithms, we give the notation used in them. In this algorithm, we introduce two basic packets:

PBH ::= (RP_id, Typ_id, Pwr_lv), where RP_id is the id of the RP; because RP_id can be represented by the id of the host, in practice RP_id does not consume any space in the PBH. Typ_id is the id of the resource type, and Pwr_lv is the power level of the RP. The total length of the PBH is one byte: 6 bits for Typ_id and 2 bits for Pwr_lv.

Mes_spark ::= (L_hosts, PBH, n_Forwarded, M_TTL, n_Cached), where L_hosts is the list of hosts that should receive this packet, n_Forwarded is the number of times the packet has been forwarded, M_TTL defines the maximum number of times it may be forwarded, and n_Cached is the number of times the Mes_spark message has been cached along the forwarding path. The total length of Mes_spark is five bytes: one byte each for L_hosts, n_Forwarded, M_TTL, and n_Cached, plus the one-byte PBH.

n_per: ratio threshold of caching times along the longest forwarding path.
L0, L1, DV_threshold, p_PBH, p_Cache: constants defined by the system.

Algorithm 1 shows the framework of resource notification and maintenance. The details of the algorithm are described in the subsequent sections.

Algorithm 1. Framework of resource notification and maintenance.
(1) At the initial stage, when there is no routing information:
      The RP randomly selects L0 neighbors and sends the packet Mes_spark to them;
(2) During the establishment of local routing information:
      Send the PBH piggybacked on a route request packet (RREQ) with probability p_PBH;
(3) During the maintenance of local routing information:
      The RP randomly selects L1 destination hosts from its routing table or local routing cache;
      The RP sends the PBH along the routing path at a random time.
(4) for (each host along the path)
      forward the PBH with probability p_PBH;
(5) for (each host that overheard the PBH)
      if (distance between it and the RP > DV_threshold)
        if (the PBH is not a recent duplicate)
        {
          (A) Randomly select one destination host in its local routing table;
          (B) Forward the PBH with probability p_PBH;
        }

Because of mobility or intentional disconnection from the network, the information about RPs held in the network decays over time. In addition, while the network is being established, all RPs initiate the resource information in order to notify clients. Contrary to previously proposed methods, we do not use a flooding protocol to let all hosts in the network learn the meta information. Instead, the RP adopts the Mes_spark forwarding algorithm, which keeps the overhead within a bounded scope. The hosts that cache the Mes_spark messages of the RP are distributed sparsely in the network and are used to answer resource queries. Note that the Mes_spark messages are scattered only during certain initialization procedures, such as routing setup, topology change, and the joining of a new RP. Under these circumstances, the Mes_spark message can be piggybacked on control messages, such as RREQ (Route Request), RRLY (Route Reply) and HELLO messages.

Algorithm 2. Mes_spark forwarding.
Input: Mes_spark M;
Output: Host of next hop to forward the message, nh
(1) M.L_hosts ← {RP randomly selects n_forward neighbors};
(2) sends M to M.L_hosts;
(3) for (all hosts h in M.L_hosts)
(4) {
      (a) h caches M.PBH with probability p_Cache;
(5)   if (h will cache M.PBH) &&
(6)      (M.n_Cached < M.M_TTL * M.n_per)
      {
        Cache M.PBH;
        M.n_Cached++;
      }
      if (M.n_Forwarded > M.M_TTL)
      {
        Drop Mes_spark;
        return;
      }
      else
      {
        nh ← {h randomly selects one of its neighbors from whom h did not receive M before};
      }
    }
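The packet layout and the per-host step of Algorithm 2 can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation. The names `pack_pbh`, `unpack_pbh`, `MesSpark` and `handle_mes_spark` are hypothetical. The paper gives the field widths of the PBH (6 bits for Typ_id, 2 bits for Pwr_lv) but not the bit order, so placing Typ_id in the high bits is an assumption. Incrementing n_Forwarded on each hop and dropping the message when no eligible neighbor exists are also assumptions, since Algorithm 2 does not state them.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence, Set, Tuple


def pack_pbh(typ_id: int, pwr_lv: int) -> int:
    """Pack Typ_id (6 bits) and Pwr_lv (2 bits) into a single byte.

    Assumption: Typ_id occupies the high six bits; the paper only gives widths.
    """
    if not 0 <= typ_id < 64 or not 0 <= pwr_lv < 4:
        raise ValueError("Typ_id must fit in 6 bits and Pwr_lv in 2 bits")
    return (typ_id << 2) | pwr_lv


def unpack_pbh(byte: int) -> Tuple[int, int]:
    """Inverse of pack_pbh: return (typ_id, pwr_lv)."""
    return byte >> 2, byte & 0b11


@dataclass
class MesSpark:
    """Counters of a Mes_spark message (L_hosts is left out of this sketch)."""
    pbh: int
    m_ttl: int
    n_forwarded: int = 0
    n_cached: int = 0


def handle_mes_spark(
    msg: MesSpark,
    neighbors: Sequence[str],
    received_from: Set[str],
    cache: Set[int],
    p_cache: float,
    n_per: float,
    rng: random.Random,
) -> Optional[str]:
    """Process one Mes_spark at host h; return the next hop or None if dropped."""
    # Steps (a), (5), (6): cache with probability p_Cache while the caching
    # budget M_TTL * n_per along the path is not exhausted.
    if rng.random() < p_cache and msg.n_cached < msg.m_ttl * n_per:
        cache.add(msg.pbh)
        msg.n_cached += 1
    # Drop the message once it has been forwarded more than M_TTL times.
    if msg.n_forwarded > msg.m_ttl:
        return None
    # Pick a random neighbor from which h has not received this message.
    candidates = [n for n in neighbors if n not in received_from]
    if not candidates:
        return None  # assumption: nothing to forward to, so the message stops
    msg.n_forwarded += 1  # assumption: the counter is incremented per hop
    return rng.choice(candidates)
```

For instance, `pack_pbh(5, 3)` yields the byte value 23, and `unpack_pbh(23)` recovers `(5, 3)`. Passing a seeded `random.Random` keeps simulation runs repeatable.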


We show an example of algorithm 2 in Figure 1. From the figure we can see that, after Mes_spark has been forwarded, some hosts that have cached the PBH messages are distributed sparsely in the network. Subsequently, these hosts randomly diffuse the PBH.

Figure 1. Example of algorithm 2

Algorithm 2 shows how RPs diffuse their resource metadata without using any routing information, while in algorithm 3 the RP scatters resource information with the help of local routing information. In a MANET, because every host acts as a router that relays messages among other hosts, the RP will over time establish some routing paths locally. The routing information includes much useful data for resource notification, such as the destination host, the distance between the RP and other hosts, and the next-hop host. In algorithm 3, we show how the RP diffuses resource notifications by using the routing information. Assume each RP maintains a routing table or routing cache locally; the main contents of the routing data structure are:
Dest_Host: ID (address) of a destination host;
Next_Hop: the next hop from the RP to the destination;
DV: distance vector to the destination, expressed as a hop count.

The aim of algorithm 3 is to select some paths along which the RP diffuses its resource information (PBH); ideally, these paths are distributed uniformly among all directions. In fact, unless a location-based routing protocol is used, in which each host knows its current location, the RP cannot take the directions of the paths into account. It can only select paths without any knowledge of geographical location, so in general it is hard for the RP to guarantee a uniform distribution of paths along all directions. In algorithm 3, we adopt a greedy method to select paths. The selection criteria are based on the following observations: first, if a host, say h, appears in the next hop of

many routing entries, the paths through h will have high probability to forward PBH messages. This is because h will forward PBH randomly in the subsequent routing routines. So we only select one path from these paths which begins with h; Second, for a host which appears for fewer times in the next hop of routing entries, as we have mentioned above, the paths going through this host have lower probability to forward PBH messages, so the algorithm selects these hosts with higher priority, aiming to keep the distribution of PBH in the networks uniformly; Third, the longer the path length is, the more possibility there is for more hosts to obtain PBH. In conclusion, the RP selects paths to diffuse PBH based on the following rules: consider the uniformity along all directions of distribution of PBH firstly, and then consider the distance of diffusion. Algorithm 3. Path selection of PBH diffusion. Input: Local routing table or routing cache: RT_TB Output: Set of destination hosts: L_DestHosts (1) Group the routing entries by RT_TB.Next_Hop; (2) Gi.n_row

line number of each group i;

(3) Sort groups by Gi.n_row in ascending order; (4) In each group, sort each entry by RT_TB.DV in descending order; (5) Select one host from each group recurrently and insert them into L_Desthosts until jL_Desthostsj ¼ n_forward; (6) Return L_Desthosts. 2.3 Optimization algorithm of resource broadcasting schedule For the hosts within the scope of dv_tr hops around the RP, they can benefit from broadcasting common hot resource items. To select the hot items, our algorithm considers not only the access frequency like previous methods do (Gruenwald et al., 2002), but also considers the distance between clients and RP. The mobility trend of clients is another important character. For example, if many clients request a resource item, the previous methods (Gruenwald et al., 2002) will probably broadcast this item. But in fact, these clients may be far away from the RP, so the broadcasting may lead to heavy communication overhead. On the other hand, if the members of broadcast group have the trend to move out of the scope of dv_tr, they should be assigned the low possibility to get resource item by broadcasting from RP. The RP can obtain the distance dv and distance change value v_dv of the client who requests the resource item. Once the number of clients who are interested in an item exceeds the threshold N_bro, the RP will estimate the distance mean and variance of these hosts and the mobility trend. A client may change route path to the RP for the reason of mobility. Value v_dv indicates the hop difference between the different route paths. While the RP is calculating the weight of distance, which is indicated by the hop number, v_dv is treated differently in different situations. If the client moves away from the RP, v_dv is calculated as v_dv ¼ v_dv * dv/dv_tr, which means mobility trend is enlarged when the client moves out of the scope of dv_tr (because dv is larger than dv_tr at this time). 
If the distance is reduced, that is, when v_dv is less than 0, v_dv = v_dv * dv_tr/dv, so v_dv shrinks while dv is still outside the dv_tr scope. This calculation shows that the RP gives higher weight to items accessed by hosts moving toward the RP. The mobility trend is indicated by the ratio of the number of clients moving away from the RP to the total number of clients


that access the data item. During each schedule period, the RP evaluates the current mean and variance of the clients' distances for each item. Items are selected into the broadcast schedule by combining the following characteristics: mean and variance of distance in ascending order, trend value in ascending order, and average access frequency per hop in descending order. The variables and data structures used in algorithm 4 are defined as follows:
N_bro: threshold on the number of clients.
Buff_item: buffer of resource items whose number of clients exceeds N_bro. The buffer maintains an entry for each item. Each buffer entry records the hosts that request the item and their metric values: current distance in hops dv_cur and mobility trend bool_away. Each item is also assigned the following values: e_dv and var_dv, the mean and variance of the clients' distances, respectively; access count acc_num; access frequency per hop af_ph; and trend ratio rt_away.
R_Q: query request packet, R_Q ::= (host id h_id, item id i_id, dv: hop number between host h_id and the RP).
dv_tr: threshold of the distance scope within which broadcasting can be considered.
N_schedule: number of broadcasting items.

Algorithm 4. Optimization algorithm of resource broadcasting schedule.
(1) for each received request of resource item R_Q, do
(2) {
(3)   Buff_item(R_Q.i_id, R_Q.h_id).acc_num++;
      v_dv = R_Q.dv - Buff_item(R_Q.i_id, R_Q.h_id).dv_cur;
      if (v_dv > 0)
      {
        v_dv = v_dv * R_Q.dv/dv_tr;
        Buff_item(R_Q.i_id, R_Q.h_id).bool_away = 1;
      }
      else
      {
        v_dv = v_dv * dv_tr/R_Q.dv;
        Buff_item(R_Q.i_id, R_Q.h_id).bool_away = 0;
      }
      Buff_item(R_Q.i_id, R_Q.h_id).dv_cur += v_dv;
    }
(4) for each item i in Buff_item, do
    {
      i.e_dv = mean of i.dv_cur;
      i.var_dv = variance of i.dv_cur;
      i.af_ph = sum(i.acc_num)/sum(i.dv_cur);
      i.rt_away = sum(i.bool_away)/count(hosts of i);
    }
    Sort items by (i.e_dv, i.var_dv, i.rt_away) in ascending order, by i.af_ph in descending order;
    Select top N_schedule items into broadcasting schedule;
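The bookkeeping of algorithm 4 can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the names HostEntry, on_request and build_schedule are hypothetical, a client's first request is assumed to initialise dv_cur to the reported hop count, the population variance is used, and the four sort keys are assumed to be compared lexicographically in the order listed in the algorithm.

```python
from dataclasses import dataclass
from statistics import mean, pvariance


@dataclass
class HostEntry:
    dv_cur: float       # weighted current distance (hops) of this client
    acc_num: int = 0    # number of requests received from this client
    bool_away: int = 0  # 1 if the latest request showed the client moving away


def on_request(buff_item, i_id, h_id, dv, dv_tr):
    """Step (3): update Buff_item for one request R_Q = (h_id, i_id, dv)."""
    hosts = buff_item.setdefault(i_id, {})
    # Assumption: the first request from a client initialises dv_cur to dv.
    entry = hosts.setdefault(h_id, HostEntry(dv_cur=dv))
    entry.acc_num += 1
    v_dv = dv - entry.dv_cur
    if v_dv > 0:               # client moved away: amplify the change
        v_dv *= dv / dv_tr
        entry.bool_away = 1
    else:                      # client moved closer (or stayed): shrink it
        v_dv *= dv_tr / dv
        entry.bool_away = 0
    entry.dv_cur += v_dv


def build_schedule(buff_item, n_bro, n_schedule):
    """Step (4) and the final sort: return the ids of the items to broadcast."""
    rows = []
    for i_id, hosts in buff_item.items():
        if len(hosts) <= n_bro:   # only items whose client count exceeds N_bro
            continue
        dvs = [e.dv_cur for e in hosts.values()]
        e_dv = mean(dvs)
        var_dv = pvariance(dvs)
        af_ph = sum(e.acc_num for e in hosts.values()) / sum(dvs)
        rt_away = sum(e.bool_away for e in hosts.values()) / len(hosts)
        # Ascending in e_dv, var_dv, rt_away; descending in af_ph.
        rows.append((e_dv, var_dv, rt_away, -af_ph, i_id))
    rows.sort()
    return [row[-1] for row in rows[:n_schedule]]
```

Items requested by nearby, stable clients sort first under this ordering, which matches the stated intent of favouring hosts inside the dv_tr scope; a weighted combination of the four metrics would be an equally plausible reading of the selection step.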

2.4 Resource discovery algorithm
Because both RPs and clients are mobile, it is generally difficult to designate one RP as a client's RP: doing so can introduce a large amount of communication overhead when the routing path between the client and the RP changes frequently. Instead, our algorithm represents the RP by a set of candidate hosts that have a high probability of answering resource requests efficiently. As noted above, the RPs disseminate PBH randomly. If a host receives these PBHs frequently, it can be regarded as an RP candidate even though it possesses no resources itself, because other clients can, with high probability, obtain resource items through it indirectly. The PBH messages are used to filter unnecessary hosts out of the candidate set. The algorithm also considers QoS when the candidate hosts are selected. As mentioned for algorithms 2 and 3, the PBHs would ideally be distributed uniformly in all directions, but in practice this goal is hard to achieve. Figure 2 shows the relationship between the distribution of PBH and the resource discovery requests submitted by a client. The main idea is that a resource discovery request submitted by a client is sent to several destination hosts; the paths to these hosts are represented by dotted lines in Figure 2. With some probability, these paths intersect one or more PBH forwarding paths. In this way, the client can obtain RP information and avoid flooding requests.

Figure 2. Distribution of PBH and requests

Figure 3 shows an example of algorithm 5. Initially the client has no knowledge of the RPs. It randomly selects RPN = 4 hosts from its routing table as its RPs, as shown in Figure 3(a). The client sends the resource request to these hosts even if they are not RPs. A host that receives a request checks its local buffer of previously received PBHs. If it finds an RP that satisfies the request, it forwards the request to that RP and returns an acknowledgement message to the client. If it knows nothing about such an RP, it randomly selects RPN hosts from its local routing table and forwards the request to them. In Figure 3(b), hosts H1 and H2 can provide the resource obtained from their neighbors. Once the client receives an acknowledgement from a host, it increases the weight of that host. If no message is returned from a host in the set, the client reduces the weight of that host. In Figure 3(c), the client selects H1 and H2 as RP candidates. On receiving the resource items, the client also evaluates the QoS metrics of the different hosts. The metrics include packet transfer delay, which is an important criterion, and other characteristics that influence QoS indirectly, such as hop count and mobility trend. Finally, the client selects host H1 as its RP because H1 has the best QoS. The client also randomly selects another three hosts as RP candidates and repeats the above procedure, as shown in Figure 3(d). This randomized resource discovery algorithm copes with the problems caused by frequent topology changes.

Figure 3. Distribution of PBH and requests

Algorithm 5. Resource discovery algorithm.
Input: Client ch, resource request q
Output: RP candidates set S
(1) S = {set of RPN hosts, which are selected randomly by ch from its routing table};
(2) ch sends q to hosts in S;
(3) On each host h in S, do
(4) {
      Check the local buffer of PBH to find the RP with the smallest hop number;
      if (found)
        return acknowledgement message ack to ch;
      else
      {
        Randomly select RPN hosts from its local routing table;
        Forward q to them;
(5)   }
(6)   if (h receives ack from hosts)
(7)     Forward ack to ch;
(8) }; //Each host in S
(9) On client ch do
(10) {
      if (ch received ack from h within the time interval specified by system)
(11)  {
(12)    Increase the weight of h;
(13)    if (ack.RP.distance < h.RP.distance)
          Update RP.id and RP.distance of h with ack;
      };
(14)  ch deletes hosts in S which return no ack;
      if (|S| < RPN)
      {
        ch randomly selects RPN − |S| hosts from routing table;
        ch sends request q to new hosts in S;
      };
      ch monitors the packet delay between hosts in S and itself;
      ch selects as its RP the host in S that ranks first in ascending order of packet delay and hop number;
}; //On client ch do.
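The client-side part of algorithm 5 (steps 9 to 14) can be sketched in Python as below. The sketch makes several assumptions that are not in the paper: probe(h) is a hypothetical callback that stands in for "send q to h and wait for an ack within the timeout" and returns a (delay, hops) pair or None; weights change by one per round; and the final choice orders hosts by delay first and hop count second.

```python
import random


def discover(routing_table, probe, rpn, rounds, rng):
    """Client-side loop of algorithm 5 (illustrative sketch).

    probe(h) returns (delay, hops) if h acknowledges in time, else None.
    """
    weights = {}   # host -> accumulated weight
    qos = {}       # host -> latest (delay, hops) for acknowledged candidates
    candidates = set(rng.sample(routing_table, rpn))
    for _ in range(rounds):
        for h in list(candidates):
            reply = probe(h)
            if reply is None:
                # No ack: reduce the weight and delete h from S.
                weights[h] = weights.get(h, 0) - 1
                candidates.discard(h)
                qos.pop(h, None)
            else:
                # Ack received: increase the weight and record QoS metrics.
                weights[h] = weights.get(h, 0) + 1
                qos[h] = reply
        # Refill S with randomly chosen hosts until |S| = RPN again.
        spare = [h for h in routing_table if h not in candidates]
        need = min(rpn - len(candidates), len(spare))
        candidates.update(rng.sample(spare, need))
    # Pick the acknowledged candidate with the smallest (delay, hops).
    best = min(qos, key=qos.get) if qos else None
    return best, candidates, weights
```

With a four-entry routing table in which only H1 and H2 answer, H1 is chosen whenever its delay is the smaller of the two, mirroring the outcome described for Figure 3. The host-side forwarding of q (steps 3 to 8) is deliberately hidden behind probe.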

3. Performance evaluation
3.1 Simulation environment
We simulate our algorithms using the network simulator ns-2 (http://www.isi.edu/nsnam/ns/). The network contains 200 mobile hosts distributed over an area of 1,500 m by 1,500 m. The MAC layer protocol is IEEE 802.11. Each wireless channel has a bandwidth of 2 Mbps and a circular radio range with a radius of 100 m. The routing protocol is ad hoc on-demand distance vector (AODV). There are 30 RPs and 15 replications randomly selected from them. Each RP has 50 data items. We compare our work with the algorithms proposed in Liu et al. (2002) and Gruenwald et al. (2002).

3.2 Query latency
We evaluate the resource query performance at different host speeds. Figure 4 shows that the average query latency grows as the host speed increases. This is because the topology changes more frequently at higher speeds, so extra queries are submitted to discover the RP. In Figure 4, FRM stands for the method in Liu et al. (2002). When the speed is low, say less than 8 m/s, the query latency of FRM is lower than that of our method, because at such speeds the RP can be regarded as static and mobility has little effect on the query. As the host speed increases, however, the query latency of our method grows more slowly than that of FRM: in our method, the resource information is piggybacked on ordinary routing packets, so clients can obtain resource location information in many cases and the time for resource discovery is reduced dramatically. The simulation result shows that our method is suitable for situations in which the hosts move at high speed. In addition, the broadcasting method we propose considers the mobility of both RPs and clients, and the optimized broadcasting schedule can satisfy a majority of the clients that want the resource. In section 3.4, we examine the average power consumption of hosts.

Figure 4. Query latency

3.3 Extra communication costs
We have shown that our method has a shorter query latency than FRM at high host speeds. In this section, we show that although this advantage is obtained by piggybacking extra data on general messages, the total extra communication costs are no larger than those of other methods. Other methods, such as that of Liu et al. (2002), need to maintain additional architectures, which results in extra communication costs. Figure 5 compares the extra communication costs of our method and that of Liu et al. (2002). The extra communication costs increase with the number of RPs, but the cost of our method is almost independent of the number of RPs and is much smaller than that of the method proposed in Liu et al. (2002). In the AODV protocol as implemented in ns-2, a routing request message is 48 bytes long. In our method, PBH is only one byte, which is piggybacked in the AODV packet header with probability p_PBH, whereas in Liu et al. (2002) many application-level messages must be sent by clients and RPs, which incurs considerable extra communication costs.
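As a rough illustration of these figures, and assuming (the paper does not state this) that PBH is piggybacked only on 48-byte routing request messages, the expected per-message overhead follows directly from the one-byte PBH and the probability p_PBH:

```latex
\[
\mathbb{E}[\text{extra bytes per routing request}]
  = p_{PBH}\cdot 1 + (1 - p_{PBH})\cdot 0 = p_{PBH},
\qquad
\frac{\mathbb{E}[\text{extra bytes per routing request}]}{48}
  = \frac{p_{PBH}}{48} \le \frac{1}{48} \approx 2.1\%.
\]
```

Under this assumption the relative overhead is bounded by about 2.1 per cent of a routing request even when p_PBH = 1, and it does not depend on the number of RPs.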


3.4 Average power consumption of host
We compare the average power consumption of a host under our method and under the PBARBSA method proposed in Gruenwald et al. (2002). Figure 6 shows that power consumption grows with host speed in both methods. The power consumption of our method, however, grows more slowly than that of PBARBSA. The main reason is that our method considers the mobility of both the clients and the RP, which allows it to adapt to high-speed mobility.

Figure 5. Extra messages

Figure 6. Average power consumption


In the PBARBSA method, the server calculates the weight of items according to their access times and duration, but does not consider the distance and mobility trend of the clients that send the queries. If the clients that access an item very frequently are far away from the server, broadcasting this item incurs high communication costs. Similarly, if the clients tend to move away from the server, forwarding these data to the moving hosts places a heavy burden on the intermediate hosts. These problems become more severe as the hosts' speed increases.

4. Conclusion
In this paper, we propose a novel resource management strategy for large-scale MANETs that needs no special frameworks or directory servers. The approach can be built on most current routing protocols, and it also takes QoS into account. The simulation results confirm the good performance of our algorithms in different situations in terms of query latency and power consumption.

References
Droms, R. (1993), "Dynamic host configuration protocol", RFC 1531, October.
Gruenwald, L., Javed, M. and Gu, M. (2002), "Energy-efficient data broadcasting in mobile ad-hoc networks", Proceedings of the 2002 International Symposium on Database Engineering & Applications, July, IEEE Computer Society, Washington, DC, pp. 64-73.
Klein, M., Konig-Ries, B. and Obreiter, P. (2003), "Lanes – a lightweight overlay for service discovery in mobile ad hoc networks", Third Workshop on Applications and Services in Wireless Networks (ASWN2003), Berne, July.
Liu, J., Zhang, Q., Zhu, W., Zhang, J. and Li, B. (2002), "A novel framework for QoS-aware resource discovery in mobile ad hoc networks", IEEE International Conference on Communications (ICC'02), April, IEEE Computer Society, New York City, pp. 1011-6.
Mockapetris, P. and Dunlap, K.J. (1988), "Development of the domain name system", ACM SIGCOMM Computer Communication Review, ACM Press, New York, NY, August, pp. 123-33.
Perkins, C.E. and Royer, E.M. (1999), "Ad hoc on-demand distance vector routing", The 2nd IEEE Workshop on Mobile Computing Systems and Applications, February, IEEE Computer Society, New Orleans, LA, pp. 90-100.
Royer, E.M. and Perkins, C.E. (1999), "Multicast operation of the ad hoc on-demand distance vector routing protocol", The 5th Annual ACM/IEEE International Conference on Mobile Computing and Networking, August, ACM Press, Seattle, WA, pp. 207-18.
Tang, D., Chang, C.-Y., Tanaka, K. and Baker, M. (1998), "Resource discovery in ad hoc networks", Technical Report No.: CSL-TR-98-769, Stanford University, August.
Tchakarov, J.B. and Vaidya, N.H. (2004), "Efficient content location in wireless ad hoc networks", IEEE International Conference on Mobile Data Management (MDM'04), January, IEEE Computer Society, Berkeley, CA, pp. 74-87.
Veizades, J., Guttman, E., Perkins, C. and Kaplan, S. (1997), "Service location protocol", RFC 2165, July.
Vixie, P., Thomson, S., Rekhter, Y. and Bound, J. (1997), "Dynamic updates in the domain name system", RFC 2136, April.
Wahl, M. (1997), "Lightweight directory access protocol", RFC 2251, December.

About the authors
Shengfei Shi received the BE degree in computer science from Harbin Institute of Technology, China, in 1995, and his MS and PhD degrees in computer software and theory from Harbin Institute of Technology, China, in 2000 and 2006. He is currently a lecturer in the School of Computer Science and Technology at Harbin Institute of Technology, China. His research interests include wireless networks, mobile computing, and data mining. Shengfei Shi is the corresponding author and can be contacted at: [email protected]

Jianzhong Li is a professor and the chairman of the Department of Computer Science and Engineering at Harbin Institute of Technology, China. He worked at the University of California at Berkeley as a visiting scholar in 1985. From 1986 to 1987 and from 1992 to 1993, he was a staff scientist in the Information Research Group at Lawrence Berkeley National Laboratory, Berkeley, USA. His current research interests include data warehousing, data mining, bioinformatics, and wireless sensor networks.

Chaokun Wang received the BE degree in computer science and technology, the MSc degree in computational mathematics and the PhD degree in computer software and theory in 1997, 2000, and 2005, respectively, from the Harbin Institute of Technology, China. He joined the faculty of the School of Software at Tsinghua University in February 2006, where he is currently a lecturer. His research interests include data management, P2P systems, digital music libraries, and mobile databases. He has authored and coauthored 19 papers on these topics.

Yuhui Wu received the BE degree in computer science from Harbin Institute of Technology, China, in 2005. He is currently an MS degree candidate in the School of Computer Science and Technology at Harbin Institute of Technology, China. His research interest is wireless networks.


The current issue and full text archive of this journal is available at www.emeraldinsight.com/1742-7371.htm


Ontology alignment as a basis for mobile service integration and invocation

Jingshan Huang, Jiangbo Dang and Michael N. Huhns
Computer Science and Engineering Department, University of South Carolina, Columbia, South Carolina, USA, and
Yongzhen Shao
Software School, Fudan University, Shanghai, People's Republic of China

Received 31 December 2005
Revised 7 May 2006

Abstract
Purpose – The purpose of this paper is to present ontology alignment as a basis for mobile service integration and invocation.
Design/methodology/approach – This paper presents an automated schema-based approach to align the ontologies from interacting devices as a basis for mobile service invocation. When the ontologies are ambiguous about the services provided, compatibility vectors are introduced as a means of maintaining ontology quality and deciding which service to choose to reduce the ambiguity.
Findings – Both precision and recall measurements are applied in the evaluation of the alignment approach, with promising results. In addition, for the compatibility vector system, it is not only proved theoretically that the approach is both precise and efficient, but it also shows promising results experimentally.
Originality/value – In cases where sufficient resources are not available and only a certain number of mobile devices can be chosen for interaction, this approach increases the efficiency by choosing suitable mobile device(s).
Research limitations/implications – This current approach makes use of a center ontology, but introduces the problem of how to handle the vulnerability issue inherent in this centralized solution. To analyze and solve this problem is a potential research direction.
Keywords Mobile communication systems, Integration
Paper type Research paper

International Journal of Pervasive Computing and Communications, Vol. 3 No. 2, 2007, pp. 138-158. © Emerald Group Publishing Limited, 1742-7371. DOI 10.1108/17427370710847282

1. Introduction
Mobile computing is becoming more widespread and increasingly important. Mobile portable devices already outnumber traditional desktop computers and are expected to determine the view of computers for future generations. However, mobile devices typically have rather limited capabilities. To overcome this limitation, a mobile device can make use of the functionalities and services provided by other mobile devices, and thereby extend its own capabilities. The first step for mobile devices to achieve this goal will be to understand the descriptions of services that they can provide to each other. Only in this way can the future integration and/or invocation of mobile services take place automatically and successfully.

An ontology serves as a declarative model for the knowledge and capabilities possessed by a device or of interest to a device. It forms the foundation upon which machine-understandable descriptions of services can be obtained, and as a result, automatic interaction among mobile devices is enabled. Therefore, adding ontologies into the mobile service scenario will facilitate the extension of mobile device capabilities by providing a more comprehensible and formal semantics. The functionalities and behaviors of mobile services can be described, advertised, discovered, and composed by others through the use of and reference to ontologies. Eventually, these mobile services would be able to interoperate with each other, even though they have not been designed to work together. This is the vision for pervasive computing among mobile devices.

However, because it is impractical to have a global ontology that describes every concept that is or might be included as part of the mobile services, ontologies from different mobile devices typically have heterogeneous semantics. Because of this basic characteristic, mobile devices need to reconcile ontologies and form a mutual understanding when they interact with each other. Only then are mobile services able to comprehend and/or integrate the information from different sources, and thereby enhance service interoperability.

In this paper, we first present an automated schema-based ontology merging algorithm to align heterogeneous ontologies. Then we focus on an important but mostly neglected research topic: how a mobile device can select suitable ontologies to interact with. We introduce the concept of compatibility vectors as a means of evaluating and maintaining ontology quality, and use them as a basis for selecting suitable ontologies. Our approach is able to create and dynamically adjust the compatibilities of mobile devices with regard to the quality of their underlying ontologies.

The rest of the paper is organized as follows. Section 2 briefly introduces related work in ontology matching and quality of service (QoS). Section 3 discusses the challenges after the introduction of ontologies into mobile services. Sections 4 and 5 present our ontology merging algorithm and compatibility vector system, respectively. Section 6 briefly reports on the experiment results and section 7 concludes with future work.

2. Related work
2.1 Related work in ontology matching
Many researchers have investigated the problem of ontology matching, mostly using one of two approaches: instance-based and schema-based.
All of the following systems belong to the latter, except for GLUE (Doan et al., 2003). GLUE introduces well-founded notions of semantic similarity, applies multiple machine learning strategies, and can find not only one-to-one mappings, but also complex mappings. However, it depends heavily on the availability of instance data. It is therefore not practical when there are few instances or none at all.

PROMPT (Noy and Musen, 2001) is a tool that uses linguistic similarity matches between concepts to initiate the merging or alignment process, and then uses the underlying ontological structures of the Protégé-2000 environment to inform a set of heuristics for identifying further matches between the ontologies. PROMPT performs well in terms of precision and recall. However, it requires user intervention, which is not always available in real-world applications.

COMA (Do et al., 2002) provides an extensible library of matching algorithms, a framework for combining results, and an evaluation platform. According to its authors' evaluation, COMA performs well in terms of precision, recall, and overall measures. Although it is a composite schema matching tool, COMA does not integrate reasoning or machine learning techniques.

Similarity Flooding (Melnik et al., 2002) uses a hybrid matching technique based on the idea that similarity spreads from similar nodes to their adjacent neighbors. Alignments between nodes are refined iteratively until a fix-point is reached. This algorithm considers only the simple linguistic similarity between node names, ignoring node properties and inter-node relationships.

Cupid (Madhavan et al., 2001) combines linguistic and structural schema matching techniques with the help of a precompiled dictionary. But it can only work with a tree-structured ontology instead of a more general graph-structured one. As a result, there are many limitations to its application, because a tree cannot represent multiple inheritance, an important characteristic of ontologies.

S-Match (Giunchiglia et al., 2004) is a modular system into which individual components can be plugged and unplugged. The core of the system is the computation of relations. Five possible relations are defined between nodes: equivalence, more general, less general, mismatch, and overlapping. Giunchiglia et al. claim that S-Match outperforms Cupid, COMA, and Similarity Flooding in measurements of precision, recall, overall, and F-measure. However, like Cupid, S-Match uses a tree-structured ontology.

2.2 Related work in QoS
QoS is becoming a significant factor with the widespread deployment of Web services. By QoS we refer to the nonfunctional properties of services, such as reliability, availability, and security. Bilgin and Singh (2004) propose a service query and manipulation language (SWSQL) to maintain QoS attribute ontologies and to publish, rate, and select services by their functionality as well as QoS properties. Based on SWSQL, they extend the universal description, discovery, and integration (UDDI) registry to a service repository by combining a relational database and the attribute ontology. Zhou et al. (2004) provide a DAML-QoS ontology as a complement to a DAML-S ontology in which multiple QoS profiles can be attached to one service profile. In addition, they present a matchmaking algorithm for QoS properties.
One widely used QoS attribute is user rating, but it is subject to the perception of the end user and is limited by the lack of an objective representation of the performance history. Kalepu et al. (2004) introduce reputation, a composition of user rating, compliance, and verity, as a more viable QoS attribute. Ontologies are applied to QoS-aware service selection, execution, and composition. A selected ontology itself can adopt some QoS measures to facilitate mutual ontology understanding, as discussed in this paper.

3. Challenges of ontology integration into mobile services
3.1 Adding ontologies into mobile services
In order to integrate and invoke the services rendered by other mobile devices, a mobile device must first be able to comprehend the descriptions of those services. Being a formal knowledge representation model, ontologies can aid in this comprehension by providing the necessary semantics during communications among mobile devices. An example scenario of the interaction among different mobile devices can be envisioned as follows.
(1) A number of mobile devices form a mobile service community (MSC) within which services provided by different devices might be integrated and thereby render better services. This integration requires the mutual understanding of the individual ontologies underlying each service.
(2) The mobile devices outside this MSC can ask for help from the community and consume its services, either the original ones or the integrated one. This invocation requires not only an understanding of the related ontologies, but also the ability to choose suitable service provider(s), especially in situations where resources are limited.

3.2 Problems encountered
Because there is no global, common, and agreed-upon ontology, any mobile device can maintain and use ontologies according to its own conceptual view of the world. Consequently, ontological heterogeneity among different mobile devices becomes an inherent characteristic of a mobile computing environment. The heterogeneity can occur in two ways:
(1) Different ontologies could use different terminologies to describe the same conceptual model. That is, different terms could be used for the same concept or the same term could be used for different concepts.
(2) Even if two ontologies use the same name for a certain concept C, its corresponding properties and relationships with other concepts can be different.
Therefore, two major problems are envisioned here. First, during the formation of an MSC, how can it be ensured that all devices within the community have no problem in understanding each other's ontology? Second, a mobile device seeking services from outside this community would like to choose those devices that understand its ontology best. How can it ensure that this selection is a correct one?

3.3 Overview of our approach
In order to solve the first problem, mutual understanding of ontologies within an MSC, we need an approach to match/align ontologies from different mobile devices. By this means, communicating devices can comprehend each other's concepts, and possible integration of related services can be achieved. In the next section, we present an ontology merging algorithm to reconcile heterogeneous ontologies. To tackle the second problem, the correct selection of the mobile devices that are most acquainted with the ontologies of service-consuming devices, we introduce compatibility vectors as a means of measuring and maintaining ontology quality. By setting up the compatibility for each mobile device during the formation of an MSC, not only can the mobile devices seeking service from this community select the best service provider(s) with ease, but a better mutual understanding of ontologies within the MSC is also obtained.

4. A schema-based ontology merging algorithm
Our goal is to construct a merged ontology from two original ones. Although no global and agreed-upon ontology exists, we do assume that there is a common metamodel, i.e. web ontology language description logic (OWL DL), for the ontologies to be merged, and we also assume that natural language provides common semantics during the ontology merging process.


4.1 Top-level procedure
The ontology merging is carried out at the schema level, that is, we concentrate on the structure (schema) information of ontologies. Internally we represent an ontology using a directed acyclic graph $G(V, E)$, where $V$ is a set of ontology concepts (nodes), and $E$ is a set of edges between two concepts, i.e. $E = \{(u, v) \mid u, v \in V \text{ and } u \text{ is a superclass of } v\}$. In addition, we assume that all ontologies share "Thing" as a common "built-in" root. In order to merge ontologies, G1 and G2, we try to relocate each concept (node) from one ontology into the other one. We adopt a breadth-first order to traverse G1 and pick up a concept C as the target to be relocated into G2. Consequently, at least one member of C's parent set Parent(C) in the original graph G1 has already been put into the suitable place in the destination graph G2 before the relocation of C itself. The pseudocode below describes this top-level procedure, whose time complexity is $O(n^2)$, with n the number of concepts in the merged ontology.

Top-level procedure – merge(G1, G2)
Input: Ontology G1 and G2
Output: Merged Ontology G2
begin
  new location of G1's root = G2's root
  for each node C (except for the root) in G1
    Parent(C) = C's parent set in G1
    for each member pi in Parent(C)
      pj = new location of pi in G2
      relocate(C, pj)
    end for
  end for
end

An implementation detail is worth mentioning here. Because of the characteristics of traversing a directed acyclic graph, there is a possibility that one or more parents of a certain concept may not have been relocated before that concept is visited. However, at least one of the parents will have been relocated. In this case, we revisit the target concept after all its parents have been visited. Notice that progress is guaranteed, because the graphs in question are acyclic.

4.2 Relocate function
The relocate function in the top-level procedure is used to relocate C into a subgraph rooted by pj.
The following pseudocode shows the details of this function.

Relocate function – relocate(N1, N2)
Input: nodes N1 and N2
Output: the modified structure of N2 according to information from N1
begin
  if there exists any equivalentclass of N1 in the child(ren) of N2
    merge N1 with it
  else if there exists any subclass of N1 in the child(ren) of N2
    Children(N1) = set of such subclass(es)
    for each member ci in Children(N1)
      add links from N2 to N1 and from N1 to ci
      remove the link from N2 to ci
    end for
  else if there exists any superclass of N1 in the child(ren) of N2
    Parent(N1) = set of such superclass(es)
    for each member pi in Parent(N1)
      recursively call relocate(N1, pi)
    end for
  else
    add a link from N2 to N1
  end if
end

Our main idea is to find the relationship between C and pj's direct child(ren) in the following descending priorities: equivalentclass, superclass, and subclass. Because equivalentclass carries the most significant and accurate information, it is assigned the highest priority. For superclass and subclass, since we adopt a top-down procedure to relocate concepts, the former has been given a higher priority than the latter. If we cannot find any of these three relationships, the only option is to let C be another direct child of pj. Notice that there is a recursive call within relocate. This recursion is guaranteed to terminate because the number of nodes within a graph is finite, and the worst case is to call relocate repeatedly until the algorithm encounters a node without any child.

To determine the relationship between C and pj's direct child(ren), we need to consider both the linguistic and the contextual features. The meaning of a concept is determined by two aspects:
(1) its linguistic feature – concept name
(2) its contextual feature – property list and the relationship(s) with other concept(s).
These two features are discussed next; they together specify a concept's semantics.

4.3 Linguistic matching
The name of a concept reflects the meaning that the ontology designer intended to encode. Our approach uses string matching techniques to match linguistic features. Furthermore, we integrate WordNet by using the JWNL API (JWNL, 2005) in our system.
In this way, we are able to obtain the synonyms, antonyms, hyponyms, and hypernyms of an English word. In addition, WordNet performs some stemming, e.g. the transformation of a noun from plural to singular form. We claim that for any pair of ontology concepts $C$ and $C'$, their names $N_C$ and $N_{C'}$ have the following mutually exclusive relationships, in terms of their linguistic features (the $v_{linguistic}$ mentioned below refers to the similarity between two concept names).


• anti-match: $N_C$ is an antonym of $N_{C'}$, with $v_{linguistic} = 0$;
• exact-match: either $N_C$ and $N_{C'}$ have an exact string matching, or they are synonyms of each other, with $v_{linguistic} = 1$;
• sub-match: $N_C$ is either a postfix or a hypernym of $N_{C'}$, with $v_{linguistic} = 1$;
• super-match: $N_{C'}$ is either a postfix or a hypernym of $N_C$, with $v_{linguistic} = 1$;
• leading-match: the leading substrings from $N_C$ and $N_{C'}$ match with each other, with $v_{linguistic}$ equaling the length of the common leading substring divided by the length of the longer string. For example, "active" and "actor" have a common leading substring "act", resulting in a leading-match value of 3/6;
• other: $v_{linguistic} = 0$.
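The six linguistic cases above can be sketched in a few lines of Python. This is only an illustration: the lexicon arguments (`synonyms`, `antonyms`, `hypernyms`) are hypothetical sets of word pairs standing in for the WordNet lookups, and the order in which the cases are tested is an assumption, since the text does not specify one.

```python
from enum import Enum


class Match(Enum):
    ANTI = "anti-match"
    EXACT = "exact-match"
    SUB = "sub-match"
    SUPER = "super-match"
    LEADING = "leading-match"
    OTHER = "other"


def common_prefix_len(a, b):
    """Length of the common leading substring of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def linguistic_match(nc, nc2, synonyms=(), antonyms=(), hypernyms=()):
    """Classify two concept names; returns (match kind, v_linguistic).

    Hypothetical lexicon format: sets of lowercase (word, word) pairs.
    A pair (x, y) in `hypernyms` means "x is a hypernym of y".
    """
    a, b = nc.lower(), nc2.lower()
    if (a, b) in antonyms or (b, a) in antonyms:
        return Match.ANTI, 0.0
    if a == b or (a, b) in synonyms or (b, a) in synonyms:
        return Match.EXACT, 1.0
    if b.endswith(a) or (a, b) in hypernyms:   # N_C is a postfix/hypernym of N_C'
        return Match.SUB, 1.0
    if a.endswith(b) or (b, a) in hypernyms:   # N_C' is a postfix/hypernym of N_C
        return Match.SUPER, 1.0
    k = common_prefix_len(a, b)
    if k > 0:
        return Match.LEADING, k / max(len(a), len(b))
    return Match.OTHER, 0.0


kind, value = linguistic_match("active", "actor")
```

For the example in the text, `kind` is `Match.LEADING` and `value` is 3/6.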

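The merge and relocate procedures of sections 4.1 and 4.2 can be sketched as follows. Several simplifications are assumptions of this sketch rather than part of the paper: a graph is a dictionary from each concept to the set of its direct children; `relation(n1, k)` is a hypothetical oracle that stands in for the linguistic and contextual matching and reports how `k` relates to `n1`; a topological order replaces the breadth-first traversal with revisiting, so every parent is placed before its children; and when a concept is reached through several parents, its first placement is kept.

```python
from collections import deque


def topo_order(children, root):
    """Nodes ordered so that each parent precedes its children (Kahn's algorithm)."""
    indegree = {root: 0}
    for parent, kids in children.items():
        indegree.setdefault(parent, 0)
        for kid in kids:
            indegree[kid] = indegree.get(kid, 0) + 1
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for kid in sorted(children.get(node, ())):
            indegree[kid] -= 1
            if indegree[kid] == 0:
                queue.append(kid)
    return order


def relocate(n1, n2, g2, relation, new_loc):
    """Place n1 in the subgraph of g2 rooted at n2 (g2 is modified in place)."""
    kids = set(g2.get(n2, ()))
    if n1 in kids:                       # already linked under this parent
        new_loc.setdefault(n1, n1)
        return
    equiv = [k for k in kids if relation(n1, k) == "equivalent"]
    subs = [k for k in kids if relation(n1, k) == "subclass"]      # k is a subclass of n1
    supers = [k for k in kids if relation(n1, k) == "superclass"]  # k is a superclass of n1
    if equiv:
        new_loc.setdefault(n1, equiv[0])  # merge n1 with the equivalent concept
    elif subs:
        g2.setdefault(n2, set()).add(n1)
        for c in subs:                    # insert n1 between n2 and each subclass
            g2.setdefault(n1, set()).add(c)
            g2[n2].discard(c)
        new_loc.setdefault(n1, n1)
    elif supers:
        for p in supers:
            relocate(n1, p, g2, relation, new_loc)
    else:
        g2.setdefault(n2, set()).add(n1)
        new_loc.setdefault(n1, n1)


def merge(g1, g2, relation, root="Thing"):
    """Relocate every non-root concept of g1 into g2; returns the location map."""
    parents = {}
    for parent, kids in g1.items():
        for kid in kids:
            parents.setdefault(kid, []).append(parent)
    new_loc = {root: root}
    for c in topo_order(g1, root)[1:]:
        for p in parents[c]:
            relocate(c, new_loc[p], g2, relation, new_loc)
    return new_loc


# Toy example with a hand-written relation oracle.
table = {("Vehicle", "Auto"): "subclass", ("Car", "Auto"): "equivalent"}
g1 = {"Thing": {"Vehicle"}, "Vehicle": {"Car"}}
g2 = {"Thing": {"Auto"}}
new_loc = merge(g1, g2, lambda a, b: table.get((a, b), "other"))
```

In this toy run "Vehicle" is inserted between "Thing" and "Auto", and "Car" is merged with "Auto".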
When relocating C, we perform the linguistic matching between C and all the candidate concepts in the destination graph G2, and build a list for each of the three types of relationship of C, i.e. equivalentclass, superclass, and subclass. For each candidate concept $C'$, if an exact-match or a leading-match (with $v_{linguistic} \ge$ threshold) is found, we put $C'$ into C's candidate equivalentclass list; if a sub-match is found, we put $C'$ into C's candidate subclass list; and if a super-match is found, we put $C'$ into C's candidate superclass list. Then we continue with the contextual matching between C and each concept in the three candidate lists to obtain further information.

4.4 Contextual matching
In essence, the context of an ontology concept C consists of two parts: its relationship(s) with other concept(s), and its property list. The former includes equivalentclass, subclass, superclass, and sibling, and is implicitly embodied in the graph traversal process mentioned previously. The latter is discussed next. Considering the property lists, $P(C)$ and $P(C')$, of a pair of concepts $C$ and $C'$ being matched, our goal is to calculate their similarity value:

$$v_{contextual} = w_{required} \times v_{required} + w_{non\text{-}required} \times v_{non\text{-}required},$$

where $v_{required}$ and $v_{non\text{-}required}$ are the similarity values calculated for the required property list and non-required property list, respectively, and $w_{required}$ and $w_{non\text{-}required}$ are the weights assigned to each list. Notice that $v_{required}$ and $v_{non\text{-}required}$ are calculated by the same procedure. Suppose the number of properties in two property lists (either required or non-required ones), $P_1$ and $P_2$, is $n_1$ and $n_2$, respectively. Without loss of generality, we assume that $n_1 \le n_2$. There are three different matching models between two properties.

(1) Total-match
• The linguistic matching of the property names results in either an exact-match, or a leading-match with $v_{linguistic} \ge$ threshold; and
• The data types match exactly.
Let $v_t$ = number of properties with a total-match, and $f_t = v_t / n_1$. Here $f_t$ is a correcting factor for name-match, embodying the integration of heuristic reasoning. We claim that between two property lists, the more pairs of properties are regarded as total-match, the more likely it is that the remaining pairs of properties will also hit a match, as long as the linguistic match between their names is above a certain threshold value. For example, assume that both $P_1$ and $P_2$ have ten properties. If there are already nine pairs with a total-match, and furthermore, if we find that the names in the remaining pair of properties are similar to each other, then it is much more likely that this pair will also have a match, as opposed to the case where only one or two out of ten pairs have a total-match.

(2) Name-match
• The linguistic matching of the property names results in either an exact-match, or a leading-match with $v_{linguistic} \ge$ threshold; and
• The data types do not match.
Let $v_n$ = number of properties with a name-match, and $f_n = (v_t + v_n) / n_1$. Similarly to $f_t$, $f_n$ also serves as a correcting factor for datatype-match.

(3) Datatype-match: Only the data types match. Let $v_d$ = number of properties with a datatype-match.

After we find all the possible matching models in the above order, we can calculate the similarity between the property lists as

$$\frac{1}{n_1}\left(v_t \times w_1 + v_n (w_2 + f_t \times w'_2) + v_d (w_3 + f_n \times w'_3)\right),$$

where:
• $w_i$ (i from 1 to 3) is the weight assigned to each matching model; and
• $w'_i$ (i from 2 to 3) is the correcting weight assigned to the matching models of name-match and datatype-match.
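A minimal sketch of the property-list similarity follows. It rests on assumptions not fixed by the text: a property is a `(name, datatype)` pair, names are compared by exact match or leading-match only (no WordNet lookup), pairs are matched greedily in the order total-match, name-match, datatype-match, and all weight and threshold values are purely illustrative.

```python
def name_similar(a, b, threshold=0.5):
    """Exact match, or leading-match with v_linguistic >= threshold."""
    a, b = a.lower(), b.lower()
    if a == b:
        return True
    k = 0
    for x, y in zip(a, b):
        if x != y:
            break
        k += 1
    return k / max(len(a), len(b)) >= threshold


def property_list_similarity(p1, p2, w=(1.0, 0.5, 0.2), w_corr=(0.3, 0.1),
                             threshold=0.5):
    """Similarity of two property lists; p1 plays the role of P1 (n1 = len(p1)).

    w = (w1, w2, w3) and w_corr = (w2', w3') are hypothetical weights.
    """
    n1 = len(p1)
    if n1 == 0:
        return 0.0
    left, right = list(p1), list(p2)

    def take(pred):
        """Greedily pair off unmatched properties satisfying pred; count the pairs."""
        count = 0
        for a in list(left):
            for b in right:
                if pred(a, b):
                    left.remove(a)
                    right.remove(b)
                    count += 1
                    break
        return count

    vt = take(lambda a, b: name_similar(a[0], b[0], threshold) and a[1] == b[1])
    vn = take(lambda a, b: name_similar(a[0], b[0], threshold) and a[1] != b[1])
    vd = take(lambda a, b: a[1] == b[1])
    ft = vt / n1            # correcting factor for name-match
    fn = (vt + vn) / n1     # correcting factor for datatype-match
    w1, w2, w3 = w
    w2c, w3c = w_corr
    return (vt * w1 + vn * (w2 + ft * w2c) + vd * (w3 + fn * w3c)) / n1


p1 = [("name", "string"), ("height", "int")]
p2 = [("name", "string"), ("height", "float"), ("age", "int")]
similarity = property_list_similarity(p1, p2)
```

Here one pair is a total-match and one a name-match, so with the illustrative weights the result is (1.0 + (0.5 + 0.5 × 0.3)) / 2 = 0.825.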

4.5 Domain-independent reasoning
Remember that to merge two ontologies, we in essence relocate each concept from one ontology into the other one. After we obtain the linguistic and contextual similarities, we apply a domain-independent reasoning rule to infer the relationship between the target concept to be relocated and the candidate concept in the destination ontology.

Relationships among property lists. Suppose we have two ontologies A and B, each of which is designed according to the OWL DL specification. Furthermore, let $n(A)$ and $n(B)$ be the sets of concepts in A and B, respectively, with $n_i(A)$ and $n_j(B)$ the individual concepts of each set ($1 \le i \le |n(A)|$ and $1 \le j \le |n(B)|$), and $P(n_i(A))$ and $P(n_j(B))$ the property list of each individual concept. Consider the property lists $P(n_i(A))$ and $P(n_j(B))$, and let $s_i$ and $s_j$ be the set sizes of these two lists. There are four mutually exclusive possibilities for the relationship between $P(n_i(A))$ and $P(n_j(B))$:

(1) $P(n_i(A))$ and $P(n_j(B))$ are consistent with each other if and only if
• either $s_i = s_j$ or $|s_i - s_j| / (s_i + s_j) \le$ threshold, and
• $v_{contextual} \ge$ threshold.
We denote the corresponding concepts $n_i(A)$ and $n_j(B)$ by $n_i(A) \overset{p}{\leftrightarrow} n_j(B)$;

(2) $P(n_i(A))$ is a subset of $P(n_j(B))$ if and only if
• $s_i \le s_j$, and
• $v_{contextual} \ge$ threshold.
We denote the corresponding concepts $n_i(A)$ and $n_j(B)$ by $n_i(A) \overset{p}{\rightarrow} n_j(B)$;

(3) $P(n_i(A))$ is a superset of $P(n_j(B))$ if and only if
• $s_i \ge s_j$, and
• $v_{contextual} \ge$ threshold.
We denote the corresponding concepts $n_i(A)$ and $n_j(B)$ by $n_i(A) \overset{p}{\leftarrow} n_j(B)$;

(4) Other.

Relationships among concepts. Given any two ontology concepts, we can have the following five mutually exclusive relationships between them:
(1) subclass, denoted by $\subseteq$
(2) superclass, denoted by $\supseteq$
(3) equivalentclass, denoted by $\equiv$
(4) sibling, denoted by $\approx$
(5) other, denoted by $\neq$

Reasoning rule. If two classes share a same parent, then their relationship is one of: equivalentclass, superclass, subclass, and sibling.

(1) Preconditions: $n_{i1}(A) \supseteq n_{i2}(A)$ and $n_{j1}(B) \supseteq n_{j2}(B)$ and $n_{i1}(A) \equiv n_{j1}(B)$ and
• (the names of $n_{i2}(A)$ and $n_{j2}(B)$ have either an exact-match, or a leading-match with $v_{linguistic} \ge$ threshold) and $n_{i2}(A) \overset{p}{\leftrightarrow} n_{j2}(B)$
• (the name of $n_{j2}(B)$ is a sub-match of the name of $n_{i2}(A)$) and $n_{i2}(A) \overset{p}{\rightarrow} n_{j2}(B)$
• (the name of $n_{j2}(B)$ is a super-match of the name of $n_{i2}(A)$) and $n_{i2}(A) \overset{p}{\leftarrow} n_{j2}(B)$
• None of above three holds

(2) Conclusion:
• $n_{i2}(A) \equiv n_{j2}(B)$
• $n_{i2}(A) \supseteq n_{j2}(B)$
• $n_{i2}(A) \subseteq n_{j2}(B)$
• $n_{i2}(A) \approx n_{j2}(B)$

The intuition behind our reasoning rule is as follows. After the linguistic matching phase we obtain three candidate lists for the target concept C. For each concept $C'$ in these lists, we then compute the contextual similarity between C and $C'$ to make the final decision.

4.6 Features of our merging algorithm
Compared with the research work mentioned in section 2, our approach advances the state of the art of ontology merging techniques by including the following features.
(1) We carry out ontology merging at the schema level, and separate the performance of the merging algorithm from the availability of a large volume of instance data. As a result, it is more practical than GLUE in cases where there is not enough data to carry out instance-based matching.

(2) Our approach is fully automated. This feature is necessary, especially in terms of the successful invocation and seamless integration of mobile services dynamically. Some semi-automated systems, PROMPT for example, require user intervention, which is not always available in a dynamic environment. (3) We treat graph-structured ontologies, which are not only more complex than tree-structured ones (as in Cupid and S-Match), but also more realistic, because multiple-inheritance cannot be represented by a tree. (4) We exploit both the linguistic and the contextual features of a concept, and combine these two features to determine what a concept means in an ontology. It is more advanced than Similarity Flooding, which considers concept names alone and can represent only partial semantics of ontological concepts. (5) We incorporate WordNet into the linguistic analysis phase, under the assumption that natural language provides common semantics; and then integrate heuristic knowledge into the contextual analysis phase. (6) We apply a reasoning rule to infer new relationships among concepts. This rule is based on the domain-independent relationships subclass, superclass, equivalentclass, and sibling, together with each concept’s property list. 5. Compatibility vectors We introduce compatibility vectors as a means of measuring and maintaining the ontology quality, which determines the compatibilities of mobile devices providing services. Along with the formation of a MSC, we create a center ontology by merging all the original ontologies; then the distances from the latter to the center ontology are suitably encoded in compatibility vectors, and can be adjusted efficiently and dynamically during the period in which the MSC is formed. Based on the information contained in the vectors, mobile devices are supposed to understand the ontology from each other without trouble. 
In addition, mobile devices seeking services from outside this community can readily choose the devices with good compatibilities, and this choice is objective and unbiased.

5.1 Center ontology formation
The center ontology is generated by merging all original ontologies, step by step, as each new mobile device joins a MSC. At the beginning, when there is only one mobile device, its ontology is regarded as the center ontology. Each time a new device joins the community, the new ontology is merged with the current center one. The resultant merged ontology is the new center ontology.

5.2 Ontology distance and compatibility vectors
5.2.1 Concept distance. The center ontology contains information from all original ontologies, because the former is the result of merging the latter. Therefore, with respect to whether a specific original ontology understands each concept in the center ontology, there are two situations. In the first, the original ontology can understand a specific concept in the center ontology, but possibly with less accurate and/or complete information. In the second, the original ontology is not able to recognize that concept at all. In either case, the concept distance is represented


Figure 1. Graphical representation for ontology1

Figure 2. Graphical representation for center1

by the amount of information missing, i.e. the number of relationships not known in the original ontology. The following equation formalizes the concept distance:

$$d = w_1 \times n_{sub\text{-}super} + w_2 \times n_{other}, \quad \text{with the constraint } w_1 + w_2 = 1,$$

where $n_{sub\text{-}super}$ is the number of sub/superclass (isa) relationships not known in the original ontology, and $n_{other}$ is the number of other relationships not known in the original ontology. $w_i$ is the weight assigned to different kinds of relationship, including subclass, superclass, equivalentclass, disjointWith, parts, owns, contains, causes, etc. Because the sub/superclass relationship is the most important one in an ontology schema, $w_1$ will be given a greater value than $w_2$. Consider the ontologies in Figures 1 and 2. In ontology1, concept "Intangible" has one superclass ("AbstractThing"); four subclasses ("TemporalThing", "SpatialThing", "Mathematical", and "IntangibleIndividual"); and one disjointWith relationship (with "PartiallyTangible"). In the merged center1, the concept "Intangible" has more information from the other ontologies: one more superclass ("PartiallyIntangible"); one more disjointWith relationship (with "Tangible"); and one more subclass ("OtherIntangibleStuff"). Thus, the concept distance from "Intangible" in ontology1 to "Intangible" in center1 is $(w_1 \times 2 + w_2 \times 1)$. Also notice that the concept distance formula is suitable for both situations, i.e. independent of whether the original ontology recognizes that concept or not. For example, if in ontology1 there is no concept "Intangible", then the distance becomes $(w_1 \times 7 + w_2 \times 2)$.
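The "Intangible" example can be reproduced with a short sketch. The representation of a concept as a set of `(kind, other_concept)` pairs and the weight values 0.7 and 0.3 are assumptions made here for illustration; the text only requires $w_1 + w_2 = 1$ with $w_1 > w_2$.

```python
def concept_distance(center_rels, original_rels, w1=0.7, w2=0.3):
    """Weighted count of center-ontology relationships missing from the original.

    Each argument is a set of (kind, other_concept) pairs for one concept.
    Pass an empty set when the original ontology lacks the concept entirely.
    """
    if abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    missing = center_rels - original_rels
    n_sub_super = sum(1 for kind, _ in missing if kind in ("subclass", "superclass"))
    n_other = len(missing) - n_sub_super
    return w1 * n_sub_super + w2 * n_other


intangible_in_ontology1 = {
    ("superclass", "AbstractThing"),
    ("subclass", "TemporalThing"),
    ("subclass", "SpatialThing"),
    ("subclass", "Mathematical"),
    ("subclass", "IntangibleIndividual"),
    ("disjointWith", "PartiallyTangible"),
}
intangible_in_center1 = intangible_in_ontology1 | {
    ("superclass", "PartiallyIntangible"),
    ("disjointWith", "Tangible"),
    ("subclass", "OtherIntangibleStuff"),
}

d_known = concept_distance(intangible_in_center1, intangible_in_ontology1)
d_unknown = concept_distance(intangible_in_center1, set())
```

With these illustrative weights, `d_known` corresponds to $w_1 \times 2 + w_2 \times 1$ and `d_unknown` to $w_1 \times 7 + w_2 \times 2$.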

5.2.2 Ontology distance. After each concept distance has been calculated as shown above, we can work out the ontology distance between the original ontology and the center one:

$$D = \sum_{i=1}^{n} (W_i \times d_i),$$

where $d_i$ is the distance between a pair of concepts, and n is the number of concepts in the center ontology. Recall that the concept set of the original ontology is a subset of that of the center ontology, and the concept distance is encoded by the missing relationships in the original ontology compared to the center one. The above formula shows that the ontology distance is obtained as the weighted sum of the concept distances between two ontologies. How much a concept contributes to the ontology distance is determined by the importance of that concept in its ontology. We use the percentage of the number of relationships to represent this measurement. For example, if ontology1 has 100 relationships in total, and concept "Spatial" has 15 relationships, then the weight for this concept in ontology1 is 0.15.

5.2.3 Compatibility vectors. Inside the center ontology, there is a set of compatibility vectors, one for each original ontology. A compatibility vector consists of a set of domains, each corresponding to one concept in the center ontology. Therefore, all compatibility vectors have identical dimension, equal to the number of concepts in the center ontology. Each dimension has three sub-dimensions. The first sub-dimension tells us whether the original ontology understands this concept or not; the second sub-dimension records the concept name in the original ontology if the latter does recognize that concept; and the third sub-dimension encodes the distance from the concept of the original ontology to the concept of the center ontology. An example of compatibility vectors is shown in Figure 3.
For the first concept ("Spatial") in the center ontology, device1 knows it as "Spatial" and has a concept distance of 2.7; device3 also understands this concept, but with a different name ("Space") and a bigger concept distance of 4.5; neither device2 nor $device_m$ recognizes concept "Spatial", therefore, they have the same concept distance (5.0).
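The ontology distance of section 5.2.2 is a weighted sum, which the following sketch makes concrete. The dictionary-based inputs and the second concept ("Temporal") with its numbers are hypothetical; only the "Spatial" weight of 15/100 and the distance 2.7 echo values mentioned in the text.

```python
def ontology_distance(concept_distances, relationship_counts, total_relationships):
    """D = sum_i W_i * d_i, with W_i = (relationships of concept i) / (total relationships).

    concept_distances: concept name -> concept distance d_i
    relationship_counts: concept name -> number of relationships of that concept
    """
    return sum(
        relationship_counts[name] / total_relationships * d
        for name, d in concept_distances.items()
    )


distances = {"Spatial": 2.7, "Temporal": 1.0}   # "Temporal" is a made-up entry
counts = {"Spatial": 15, "Temporal": 10}
D = ontology_distance(distances, counts, total_relationships=100)
```

Here the contribution of "Spatial" is 0.15 × 2.7 and that of "Temporal" is 0.10 × 1.0.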


Figure 3. Compatibility vectors


5.3 Dynamic adjustment of compatibility vectors As mentioned before, when there is only one mobile device, its compatibility is perfect. In the compatibility vectors stored in the center ontology, each concept distance has a value of zero. However, with the adding of new devices into this MSC, the compatibilities for existing devices might be changed because newly joined devices could contain more accurate and/or complete information regarding the ontology in the same domain. An example is shown in Figure 4, demonstrating the process of dynamic distance adjustment. After ontology1 and ontology2 are merged to generate center1, the distance between these two original ontologies and the merged one (center1) is calculated and stored in compatibility vectors of center1. Upon the joining of ontology3 and the generation of center2, the distance from center1 to center2 is calculated and then integrated into the compatibility vectors in center2 for ontology1 and ontology2. This is accomplished by vector integration. For example, we have compatibility vectors in both center1 and center2. Now we want to upgrade the compatibility vectors in center2. Originally there are two compatibility vectors in center2: one for ontology3, and the other for center1. The former will stay the same as is; while the latter will be replaced by several new vectors, the number of which is determined by the number of the vectors in center1 (two in our example). Remember that center1 has one vector for each mobile device when center1 is generated. Each vector in center1 will be integrated with the vector for center1 in center2, therefore creating a new vector correspondingly in center2. The following procedure describes the generation of such a new vector. Pseudocode for new vector generation Input:

Figure 4. Dynamic adjustment of compatibility vectors

• compatibility vector v for center1 in center2
• compatibility vector u for $device_i$ in center1
Output:
• compatibility vector w for $device_i$ in center2

begin
  for each dimension d in v
    yn = d's first sub-dimension's value
    nm = d's second sub-dimension's value
    dis = d's third sub-dimension's value
    create a new dimension nd in w
    if yn = "Yes"
      find in u the dimension od for concept nm
      yn_old = od's first sub-dimension's value
      nm_old = od's second sub-dimension's value
      dis_old = od's third sub-dimension's value
      nd's first sub-dimension = yn_old
      nd's second sub-dimension = nm_old
      nd's third sub-dimension = dis + dis_old
    else // yn = "No"
      nd's first sub-dimension = yn
      nd's second sub-dimension = nm
      nd's third sub-dimension = dis
    end if
  end for
end

The time complexity of the above procedure is $O(n \log n)$. There are n dimensions in each vector, so the loop takes n steps. Within each iteration, all steps take constant time, except for the one finding a dimension in u. If the dimensions in u are indexed by concept name, a binary search can locate a specific dimension in $O(\log n)$ time.

Figure 5 exemplifies how the above pseudocode works. There are two source vectors, u and v, and we traverse the second one, one dimension at a time.
(1) The values for the first dimension are "Yes", "Intangible", and "2.3". We then find the dimension for "Intangible" in u, and obtain ("Yes", "Intang", and "1.6"). Finally we calculate the values for the new dimension in the resultant vector w, which are "Yes", "Intang", and "3.9" (the result of 1.6 + 2.3).
(2) The values for the second dimension are "Yes", "Tangible", and "1.7". After we obtain the values for dimension "Tangible" in u ("No", "N/A", and "6.7"), we find that the values for the new dimension in w are "No", "N/A", and "8.4" (the result of 6.7 + 1.7).
(3) The values for the third dimension are "No", "N/A", and "5.9". We simply copy these three values into the new dimension in w.
(4) This continues until we finish the traversal of all dimensions in v.
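The vector-integration procedure and the three worked dimensions above translate directly into Python. In this sketch each dimension is a `(understood, name, distance)` tuple, which is an assumed encoding, and u is a dictionary keyed by concept name. Note that the dictionary lookup is a swapped-in technique: it replaces the binary search assumed in the complexity analysis, so each lookup takes expected constant time rather than $O(\log n)$.

```python
def integrate_vectors(v, u):
    """Build w (device_i relative to center2).

    v: list of (understood, name, distance) for center1 in center2,
       one entry per concept of center2.
    u: dict from a center1 concept name to (understood, name, distance)
       for device_i in center1.
    """
    w = []
    for yn, nm, dis in v:
        if yn == "Yes":
            # center1 knows the concept: take device_i's view and add up the distances
            yn_old, nm_old, dis_old = u[nm]
            w.append((yn_old, nm_old, dis + dis_old))
        else:
            # center1 does not know the concept, so neither does device_i
            w.append((yn, nm, dis))
    return w


v = [("Yes", "Intangible", 2.3), ("Yes", "Tangible", 1.7), ("No", "N/A", 5.9)]
u = {"Intangible": ("Yes", "Intang", 1.6), "Tangible": ("No", "N/A", 6.7)}
w = integrate_vectors(v, u)
```

Up to floating-point rounding, the three dimensions of `w` carry the distances 3.9, 8.4, and 5.9, as in the worked example.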


Figure 5. Example of new vector generation

5.4 Ontology understanding via compatibility vectors The center ontology maintains the compatibility vectors for all original devices; in addition, the vectors themselves contain such information as whether a device understands a specific concept or not, what is the concept name in the original ontology, and so on. Therefore, if two devices would like to try to understand each other’s ontology, they can simply refer to the center ontology and obtain the corresponding compatibility vectors. By this means, compatibility vectors help a mobile device in the mutual understanding of ontological concepts. 5.5 Mobile device(s) selection through compatibility vectors When a mobile device from outside this MSC needs to ask for service(s), it would like to choose the device(s) understanding its ontology best. The device first compares its own ontology with the center ontology, and then searches in the compatibility vectors to find all those devices understanding the concept of its interest. If there is more than one device competing to provide this service, the request will be sent to those with good compatibilities, that is, devices with concept and/or ontology distance below a certain threshold. Such a threshold could be either specified by the service-consuming device, or otherwise determined by the center ontology. Because the compatibility vectors are stored and maintained by the center ontology, the servicerendering devices have no way to modify or manipulate the vectors. In this sense, the selection of service device(s) is objective and with no bias. 5.6 Features of compatibility vectors Correctness of compatibility vectors – a precise approach. In this section we prove that our approach obtains a correct compatibility for each mobile device. 
To record and maintain the proper compatibility of each device inside a MSC, the key is to obtain a correct center ontology by which to evaluate the distance from it to each original ontology, and thereby acquire the corresponding compatibility vector. When a new device and its associated ontology join the MSC, instead of communicating with each existing device, it only talks with the center ontology. Therefore, if we can prove that the newly merged ontology is a correct new center ontology, the correctness of compatibility vectors is guaranteed. First, we point out that according to the merging algorithm in this paper, each time we merge two ontologies, the resultant one will contain all information from both original ones. Next, we introduce Lemma 1 and Theorem 1.

Lemma 1. When we merge two ontologies A and B using the algorithm in section 4, the result is the same regardless of whether we merge A into B or merge B into A. Proof by induction:


(1) Base case: When both A and B contain two concepts, i.e. besides one common built-in root, "Thing", A contains C1 and B contains C2. If we merge A into B according to the top-level merging procedure in section 4, "Thing" in A is considered equivalent with "Thing" in B; then C1 is compared with all the direct children of the root in B, in this case C2, to determine where to put C1 in B. This is based on the relocate function inside the top-level merging procedure. On the contrary, if we merge B into A, "Thing" in B is considered equivalent with "Thing" in A; then C2 is compared with C1 to determine where to put C2 in A. Obviously, we obtain the same merged ontology in both cases.

(2) Induction: Assume that Lemma 1 holds for all cases where the number of concepts contained in A and B is less than $(i + 1)$ and $(j + 1)$, respectively. Now consider the case where A and B contain $(i + 1)$ and $(j + 1)$ concepts, respectively. Suppose the superclass set of the $(i + 1)$th concept in A, $C_{i+1}$, is $P_A(C_{i+1})$, and suppose the location of $P_A(C_{i+1})$ in merged ontology M is $P_M(C_{i+1})$. The position of $C_{i+1}$ in M is determined by the relationships between $C_{i+1}$ and all the direct children of $P_M(C_{i+1})$. From the inductive hypothesis we know that $P_M(C_{i+1})$ is identical no matter whether we merge A into B or merge B into A. Therefore, the position of $C_{i+1}$ in M will also be the same in both situations. That is, $C_{i+1}$, the $(i + 1)$th concept in A, will be put into the same position in M in both merging orders. Similarly, the $(j + 1)$th concept in B will also be put into the same position in M in both merging orders. So in the case where A and B contain $(i + 1)$ and $(j + 1)$ concepts, respectively, we still have the same resultant ontology regardless of the merging order taken.

Theorem 1. The final result of merging a number of ontologies is identical no matter by which order the original ontologies are merged using the algorithm in section 4.
Proof by induction:

(1) Base case: There are two ontologies to be merged. According to Lemma 1, when we merge two ontologies A and B, the result is the same no matter whether we merge A into B, or merge B into A.

(2) Induction: Assume that Theorem 1 holds for all cases where the number of ontologies to be merged is less than $(n + 1)$. Now consider the case where we merge $(n + 1)$ ontologies. Let the indexes of these ontologies be: $1, 2, \ldots, (n + 1)$. Consider two arbitrary orders by which we merge these $(n + 1)$ ontologies: order1 and order2. Suppose the last index in order1 and order2 is i and j, respectively.
• If i equals j, then the first n indexes in order1 and order2 are the same, just in different orders. We merge the first n ontologies to get $Merged_n$. According to the inductive hypothesis, $Merged_n$ in order1 is identical with $Merged_n$ in order2. Then we merge $Merged_n$ with the last ontology in both order1 and order2, and we will get the same result.
• If i does not equal j, we permute the first n indexes in order1 and make the nth index be j; then permute the first n indexes in order2 and make the nth index be i. Now the first $(n - 1)$ indexes in order1 and order2 are in common (possibly in different orders), and the last two are (j, i) and (i, j), respectively. Notice that this kind of permutation will not affect the merging result of the first n ontologies according to our inductive hypothesis. We then merge the first $(n - 1)$ ontologies to get $Merged_{n-1}$. According to the hypothesis, $Merged_{n-1}$ in order1 is identical with $Merged_{n-1}$ in order2. Finally we merge $Merged_{n-1}$ with the last two ontologies in both order1 and order2, and we will get the same result.

Complexity of compatibility vectors – an efficient approach. The time complexity of establishing a MSC, along with the achievement of mutual understanding of ontological concepts, is in the order of $O(mn^2)$. For the ontology merging, $O(mn^2)$ is needed, because we need to merge m ontologies, and each merging procedure takes time $O(n^2)$ as described in section 4. In addition, in order to dynamically update the compatibility vectors, extra time will be spent. According to the previous analysis, $O(n \log n)$ is needed for updating one device, so the time for extra work for all services is $O(mn \log n)$. Therefore, the total complexity becomes $O(m(n^2 + n \log n))$, which is in the same order of $O(mn^2)$. For device selection, the time complexity is $O(n^2)$. We only need to compare the ontology from the service-consuming device with the center ontology.

6. Experimental results
6.1 Test ontologies
A collection of 16 ontologies for the domain of "Building" were constructed by graduate students in computer science at the University of South Carolina and used for evaluating the performance of our merging algorithm, and the utilities from compatibility vectors as well. Table I lists the summarized characteristics of these test ontologies. They have between 10 and 15 concepts with 19 to 38 properties.

6.2 Experiments on merging algorithm itself
To decide whether a correctly merged ontology is obtained, we asked two ontology experts to carry out a manual mapping and we compared their results with ours. Both precision and recall measurements are applied in the evaluation. The evaluation result is shown in Figure 6, reflecting a promising result. Please refer to Huang et al. (2005) for more details.
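The text does not spell out how precision and recall are computed against the experts' manual mapping. Assuming the standard definitions, with a mapping represented as a set of concept pairs (an assumed encoding, with made-up pairs below), the measurement could be sketched as:

```python
def precision_recall(found, reference):
    """Standard precision and recall of a mapping against a reference mapping.

    found, reference: sets of (concept_a, concept_b) pairs.
    """
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical mappings, for illustration only.
found = {("Room", "Room"), ("Door", "Entrance"), ("Wall", "Roof")}
reference = {("Room", "Room"), ("Door", "Entrance"),
             ("Floor", "Storey"), ("Window", "Window")}
precision, recall = precision_recall(found, reference)
```

Two of the three found pairs are correct, and two of the four reference pairs are found.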

Table I. Statistic of test ontologies

                        Average of original ontologies    Merged ontology
Max depth               7                                 8
No. of total nodes      14                                64
No. of inner nodes      6                                 42
No. of total nodes      5                                 47
No. of inner nodes      23                                182


Figure 6. Precision and recall result in ontology merging

6.3 Experiments on compatibility vectors
6.3.1 Correctness of compatibility vectors. We simulated a MSC out of 16 test ontologies. Based on the ontology distances calculated (see Figure 7), we sorted the original ontologies with regard to their distances to the center. We then asked two experts to rank the qualities of these ontologies manually; the result is the same as the one from our system.
6.3.2 Efficiency of compatibility vectors. A set of experiments has been conducted. We first fixed one original ontology as the service-consuming one, and simulated a MSC out of the remaining 5, 10, and 15 ontologies as three experiment settings; then for each MSC setting we did the following in two groups. In the first group the service-consuming ontology always interacted with the ontology with the best quality, while in the second group the interaction happened with a randomly chosen ontology. We compared the resultant merged ontologies from the two groups. The result is shown in Figure 8. It is clear

Figure 7. Ontology distance calculated


Figure 8. Improvement with compatibility vectors

that, after adopting our compatibility vectors, both precision and recall measurements have been improved. Therefore, in cases where sufficient resources are not available and only a certain number of mobile devices can be chosen for interaction, our approach increases the efficiency by choosing suitable mobile device(s).

7. Conclusion
Mobile computing has become increasingly important with the proliferation of mobile devices. To extend the usually limited capabilities of typical mobile devices, it is essential to integrate mobile services from different providers. As the first step of this process, mobile devices must understand each other’s service descriptions. Although ontologies can aid in this understanding, they will likely have heterogeneous semantics if designed independently, as they typically are. We present an automated approach carried out at the schema level to reconcile ontologies as a basis for mobile service integration and invocation. In addition, we introduce compatibility vectors as a method to evaluate and maintain ontology qualities, thereby handling the problem of how to choose ontologies with good compatibilities. We not only prove theoretically that our approach is both precise and efficient, but also show promising results experimentally. Our current approach makes use of a center ontology, which introduces the vulnerability inherent in a centralized solution. Analyzing and solving this problem is a potential research direction. Other future work includes:
(1) how to maintain compatibility vectors when existing mobile devices modify their corresponding ontologies; and
(2) what kind of mechanism is suitable if we simultaneously consider qualities of both ontologies and services.

References
Bilgin, A.S. and Singh, M.P. (2004), “A DAML-based repository for QoS-aware semantic web service selection”, paper presented at IEEE International Conference on Web Services, San Diego, CA.
Do, H.H., Melnik, S. and Rahm, E. (2002), “Comparison of schema matching evaluations”, Proceedings of Workshop on Web and Databases, Madison, WI.
Doan, A., Madhavan, J., Dhamankar, R., Domingos, P. and Halevy, A. (2003), “Learning to match ontologies on the semantic web”, The VLDB Journal, Vol. 12, Springer-Verlag, New York, NY, pp. 303-19.
Giunchiglia, F., Shvaiko, P. and Yatskevich, M. (2004), “S-Match: an algorithm and an implementation of semantic matching”, Proceedings of the 1st European Semantic Web Symposium, Vol. 3053, Springer-Verlag, New York, NY, pp. 61-75.
Huang, J., Zavala Gutiérrez, R., Mendoza, B. and Huhns, M.N. (2005), “Sharing ontology schema information for web service integration”, Proceedings of 5th International Conference on Computer and Information Technology (CIT 2005), Shanghai.
JWNL (2005), “Java WordNet Library – JWNL 1.3”, available at: http://sourceforge.net/projects/jwordnet/
Kalepu, S., Krishnaswamy, S. and Loke, S.W. (2004), “Reputation = f(user ranking, compliance, verity)”, paper presented at IEEE International Conference on Web Services, San Diego, CA.
Madhavan, J., Bernstein, P.A. and Rahm, E. (2001), “Generic schema matching with cupid”, Proceedings of the 27th VLDB Conference, Springer-Verlag, New York, NY.
Melnik, S., Garcia-Molina, H. and Rahm, E. (2002), “Similarity flooding: a versatile graph matching algorithm and its application to schema matching”, Proceedings of the 18th International Conference on Data Engineering, IEEE Computer Society Press, Los Alamitos, CA.
Noy, N.F. and Musen, M.A. (2001), “Anchor-PROMPT: using non-local context for semantic matching”, Workshop on Ontologies and Information Sharing at the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI), Seattle, WA.
Zhou, C., Chia, L.-T. and Lee, B.S. (2004), “DAML-QoS ontology for web services”, paper presented at IEEE International Conference on Web Services, San Diego, CA.

About the authors
Jingshan Huang is a PhD candidate in the Computer Science and Engineering Department at the University of South Carolina. Mr Huang is a member of AAAI and SIAM, and a review board member of Journal of Open Research on Information Systems (JORIS). He has served as a programme committee member for several international conferences and is a technical paper reviewer for many journals and conferences. Mr Huang’s research interests include ontology matching/aligning, ontology quality, semantic integration, Web services, and service-oriented computing. Jingshan Huang is the corresponding author and can be contacted at: [email protected]

Jiangbo Dang earned his PhD degree in Computer Science from the Computer Science and Engineering Department at the University of South Carolina. Dr Dang’s research interests include distributed artificial intelligence and multiagent systems, service-oriented computing, business process and workflow management, and knowledge discovery, data mining, and machine learning.

Michael N. Huhns is a professor of computer science and engineering at the University of South Carolina, where he also directs the Center for Information Technology. Dr Huhns is a Fellow of the IEEE and a member of Sigma Xi, Tau Beta Pi, Eta Kappa Nu, ACM, Upsilon Pi Epsilon, and AAAI. He is the author of over 200 technical papers and four books in machine intelligence, including the recently coauthored (with Prof. Munindar P. Singh of NCSU) textbook Service-Oriented Computing: Semantics, Processes, Agents for John Wiley Publishing Co. Dr Huhns is on the editorial boards of seven journals and is a founder and board member for the International Foundation for Cooperative Information Systems and the International Foundation for Multiagent Systems. His research interests include multiagent systems, service-oriented computing, and semantic Web services.

Yongzhen Shao is pursuing a Master’s degree in the Institute of E-Business of Software School at Fudan University, China. Mr Shao’s research interests include E-Business, Web services, and ontology management.


Towards an analysis of WonGoo performance

WonGoo performance

Tianbo Lu National Computer Network Emergency Response Technical Team/ Coordination Center of China, Beijing, People’s Republic of China, and

Binxing Fang, Yuzhong Sun and Xueqi Cheng Institute of Computing Technology, Chinese Academy of Sciences, Beijing, People’s Republic of China

Received 5 December 2005 Revised 29 April 2006

Abstract
Purpose – As a peer-to-peer scalable anonymous communication system, WonGoo is a tradeoff between anonymity and efficiency. Usually, the longer the path, the stronger the anonymity, but at the same time the heavier the overhead. WonGoo lengthens the anonymity path and reduces the overhead, providing strong anonymity and high efficiency with layered encryption and random forwarding. The purpose of this paper is to analyze its performance in detail.
Design/methodology/approach – The paper focuses on measuring the performance of the WonGoo system with probability theory. First, it gives a brief description of the system and evaluates its payload. Then it presents a detailed security analysis of the system.
Findings – It is shown that WonGoo can protect against the (n − 1) attack and provide variable anonymity, as well as how confident the collaborators can be that their immediate predecessor is in fact the path initiator. The paper measures the anonymity degree provided by the WonGoo system based on information entropy and compares it with other anonymity systems.
Practical implications – The paper is helpful for the further development of the WonGoo system. In addition, the results presented in this paper will be useful for users designing other anonymity systems.
Originality/value – WonGoo is a peer-to-peer anonymity system that provides strong anonymity and high efficiency with layered encryption and random forwarding. The paper presents a detailed analysis of its performance with probability theory and measures its anonymity degree with information theory.
Keywords Internet, Data security
Paper type Research paper

This work has been supported by the National Basic Research Program of China (973 Program) under Grant No. 2007CB311100.

1. Introduction
With the rapid growth and wide acceptance of the Internet as a means of communication and information dissemination, concerns about privacy and security on the Internet have grown significantly. Privacy refers not only to the confidentiality of information, but also to not revealing who is communicating with whom. Anonymity is an essential requirement for privacy. Since Chaum (1981) proposed the mix network, many anonymous communication systems have been developed (Boucher et al., 2000; Danezis et al., 2003; Dingledine et al., 2004; Freedman and Morris, 2002; Gülcü and Tsudik, 1996; Lu et al., 2004; Möller et al., 2003; Rennhard and Plattner, 2002; Reed et al., 1998) which protect the identities of the participants in various forms: sender anonymity protects the identity of the sender, and receiver anonymity protects that of the receiver. Relationship anonymity guarantees that both parties of a communication remain anonymous to each other. A mix is a store-and-forward device that accepts a number of fixed-length messages from different sources, performs cryptographic transformations on the messages, and then forwards the messages to the next destination in an order not predictable from the

International Journal of Pervasive Computing and Communications Vol. 3 No. 2, 2007 pp. 159-174 # Emerald Group Publishing Limited 1742-7371 DOI 10.1108/17427370710847291


order of inputs. A single mix makes it difficult to track a particular message by specific bit pattern, size, or ordering with respect to other messages. By routing through a mix network consisting of numerous mixes, determining who is talking to whom becomes even more difficult. Most existing anonymous systems are based on the Chaum mix, such as Babel (Gülcü and Tsudik, 1996), Mixmaster (Möller et al., 2003) and its successor Mixminion (Danezis et al., 2003), Onion Routing (Reed et al., 1998) and its successor Tor (Dingledine et al., 2004), the Freedom network (Boucher et al., 2000), Tarzan (Freedman and Morris, 2002), MorphMix (Rennhard and Plattner, 2002) and I2P (www.i2p.net). However, all of the above systems have a heavy message overhead or communication cost, which limits their scalability and decreases their efficiency. Crowds (Reiter and Rubin, 1998) is an application-layer protocol designed for anonymous browsing on the Internet. The basic idea of Crowds is to hide one user’s actions within the actions of many others, in the form of a so-called crowd, which issues requests to end servers on behalf of its members. As a consequence, the end servers cannot determine the originating sender, since a request is equally likely to have originated from any member of the crowd. The covert paths in Crowds are constructed using a probabilistic algorithm. Crowds provides sender probable innocence against an adversary who controls a certain fraction of the participants. However, Crowds provides no protection against a global eavesdropper. Based on mix (Chaum, 1981) and Crowds (Reiter and Rubin, 1998), WonGoo (Lu et al., 2004) is a scalable peer-to-peer system for low-latency anonymous communication that is resistant to both eavesdropping and traffic analysis. It achieves strong anonymity and high efficiency with layered encryption and random forwarding.
The original paper (Lu et al., 2004) presented a detailed comparison among WonGoo, Crowds and mix, showing that WonGoo makes a tradeoff between Crowds and mix in efficiency and anonymity. In this paper, we focus on measuring the anonymity provided by the WonGoo system, as well as analyzing its payload. The question we consider here is whether the attacker can determine who initiated the path. The definition of anonymity we use is given in Pfitzmann and Köhntopp (2003): “Anonymity is the state of being not identifiable within a set of subjects, the anonymity set”. We organize the paper as follows. In section 2, we present an overview of the WonGoo system. In section 3, we evaluate its payload. In section 4, we provide a detailed security analysis, and in section 5 we measure its degree of anonymity based on information entropy. We conclude in section 6.

2. WonGoo overview
This section provides a brief overview of the WonGoo protocol (Lu et al., 2004). A node that initiates an anonymous communication is called an initiator, and a node for which messages of the initiator are destined is called a responder. We use I to represent the initiator, R to represent the responder, and $P_i$ or $Q_i$ ($i = 1, 2, \ldots$) to represent a peer. Link-by-link encryption is also used to defend against a powerful adversary. There are two kinds of nodes in a WonGoo path: fixed nodes and probability nodes. The former, represented by $P_i$ ($1 \le i \le K$, with $K$ the number of fixed nodes as in section 3), perform layered encryption on the messages received from their predecessors, while the latter, represented by $Q^i_j$ ($1 \le i \le K$; $j = 1, 2, \ldots$), only forward the message. In the rest of the paper, we use the terms “peer”, “node” and “user” interchangeably. The WonGoo protocol works as follows.

Step 1
The initiator I determines the number $K$ of fixed nodes according to some probability distribution and chooses the first fixed node $P_1$ from its neighbors; then each fixed node $P_i$ ($i = 1, 2, \ldots, K-1$) along the anonymity path picks the following node $P_{i+1}$ and sends it to the initiator. A node already in the set $\{P_1, P_2, \ldots, P_K\}$ will not be picked again. Note that the sender I and the receiver R are also fixed nodes, represented by $P_0$ and $P_{K+1}$, respectively. Suppose the initial message is $M_0$. The initiator I then creates an initial packet $M_1$ with layered encryption and sends it to $P_1$:

$$M_r = K_r(R_0, M_0), \quad \ldots, \quad M_i = K_i(R_i, M_{i+1}, S_i, P_{i+1}), \quad \ldots, \quad M_1 = K_1(R_1, M_2, S_1, P_2).$$

Here the subscripted $K_i(\cdot)$ is the encryption layer for node $P_i$ (distinct from the count $K$), $M_i$ is the message received by $P_i$, and $R_i$ is a string of random bits as proposed by Chaum (1981). $S_i$ is the maximal number of nodes that message $M_{i+1}$ can pass between nodes $P_i$ and $P_{i+1}$.

Step 2
$P_i$ decrypts the received message $M_i$ to obtain the address of the next node $P_{i+1}$ in the path. In a Chaum mix, $P_i$ would forward the message $M_{i+1}$ to node $P_{i+1}$ directly. In WonGoo, this is not necessarily the case. If $S_i \le 0$, then node $P_i$ forwards the message $M_{i+1}$ to node $P_{i+1}$ directly. If $S_i > 0$, then $S_i = S_i - 1$, and $P_i$ makes a random choice to forward the message $M_{i+1}$ either to $P_{i+1}$ directly or to another node. $P_i$ flips a biased coin to determine whether or not to forward $M_{i+1}$; the coin indicates to forward with probability $p_f$. If the result of the flip is “heads”, then $P_i$ forwards the message $M_{i+1}$ to $P_{i+1}$ directly; otherwise $P_i$ randomly selects another node $Q^i_1$ (called a probability node) from its neighbors and forwards the message $M'_{i+1}$ to it. The value of $p_f$ affects the anonymity properties offered by the system.

$$M'_{i+1} = (M_{i+1}, S_i - 1, P_{i+1})$$

Step 3
$Q^i_1$ receives the message $M'_{i+1}$ and obtains the address of the next node $P_{i+1}$, then performs an operation similar to that of $P_i$. If $Q^i_1$ does not forward $M_{i+1}$ to $P_{i+1}$ directly, it forwards $M''_{i+1}$ to a peer $Q^i_2$ selected randomly from its neighbors. Note that $Q^i_j$ ($1 \le j \le S_i$) only performs link-by-link encryption, not layered encryption, while $P_i$ performs both encryption operations.

$$M''_{i+1} = (M_{i+1}, S_i - 2, P_{i+1})$$

Step 4
Finally, $M_0$ passes through the fixed nodes $P_1, P_2, \ldots, P_i, \ldots, P_K$ and reaches R, forming a WonGoo path between I and R:

$$I - P_1 - Q^1_1 - Q^1_2, \ldots, Q^1_{S_1} - P_2, \ldots, P_i - Q^i_1 - Q^i_2, \ldots, Q^i_{S_i} - P_{i+1}, \ldots, P_K - R$$
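The four steps above can be sketched as a small simulation of path construction. This is an illustrative sketch rather than the authors' implementation: it assumes that each node on a segment forwards to a random neighbor with probability $p_f$ (the reading consistent with the expected path length in section 3), that the hop budget $S_i$ is the same value `s_max` for every segment, and that I sends to $P_1$ directly. The function names and node labels are hypothetical, and no encryption is modeled.

```python
import random


def build_segment(p_f, s_max, rng):
    """Number of probability nodes inserted between two neighboring fixed nodes.

    Assumption: each node forwards to a random neighbor with probability p_f,
    and the hop budget S_i (here s_max) caps the number of random forwards.
    """
    hops = 0
    while hops < s_max and rng.random() < p_f:
        hops += 1
    return hops


def build_path(num_fixed, p_f, s_max, seed=0):
    """Return a WonGoo-style path from I to R as a list of node labels."""
    rng = random.Random(seed)
    fixed = ["I"] + [f"P{i}" for i in range(1, num_fixed + 1)] + ["R"]
    path = [fixed[0]]
    for i in range(len(fixed) - 1):
        if i >= 1:  # random forwarding starts at P_1; I sends to P_1 directly
            for j in range(1, build_segment(p_f, s_max, rng) + 1):
                path.append(f"Q{i}_{j}")  # probability node Q^i_j
        path.append(fixed[i + 1])
    return path


print(" - ".join(build_path(num_fixed=3, p_f=0.5, s_max=4, seed=1)))
```

Setting `p_f` to 0 yields a plain mix cascade with only fixed nodes, while larger values lengthen the path with probability nodes that do no layered encryption.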


All of the following messages pass along the same path, as do the return messages. An example is illustrated in Figure 1. For brevity, we define the concepts of fixed path and probability path. The path from one fixed node directly to a neighboring fixed node is called a fixed path (the initiator and the responder are also fixed nodes); the path between two neighboring fixed nodes that is built by random forwarding is called a probability path. Obviously, when the length of a probability path is one, it becomes a fixed path. In a large-scale peer-to-peer system, the topology information one node owns is local, not global. So it is possible that one node may reappear on a WonGoo path, forming a loop. Consequently, WonGoo aims to decrease the probability of a node reappearing, not to avoid it altogether. To reduce reappearances, the fixed nodes determined by the initiator must be different in a WonGoo path. Furthermore, when a node forwards a message randomly, the next hop it selects cannot be itself or its predecessor.

3. Payload analysis
In this section, we analyze the payload of the WonGoo system. To measure the payload of an anonymous communication system, Reiter and Rubin (1998) advised evaluating the expected total number of appearances that each node makes on all paths at any point in time. For example, if a node occupies two positions on one path and one position on another, then it makes a total of three appearances on these paths. An evaluation method proposed in Sui et al. (2004) can be described as follows. During a period, let $N$ ($N = 1, 2, \ldots$) be the size of the system, and $M$ ($M = 1, 2, \ldots$) be the number of paths. The length of path $m$ ($1 \le m \le M$) is $L_m$, a discrete random variable with probability distribution

$$\Pr\{L_m = k\} = f(k), \quad 0 \le f(k) \le 1, \quad \sum_{k=1}^{\infty} f(k) = 1, \quad k = 1, 2, \ldots \tag{1}$$

Assuming that nodes are selected at random from the total $N$ nodes when creating a path, the expected payload of any node $v_i$ ($i = 1, 2, \ldots, N$) is

$$E(F_i) = \frac{M}{N} E(L) \tag{2}$$

where $E(L)$ is the expected length of path L.

Figure 1. An example of WonGoo protocol

For simplicity, we assume that a node can reappear at any point on a path. The payload of the WonGoo system is therefore lower than the value calculated below, because in WonGoo not all nodes on a path can reappear; only some of them can, as described in section 2. There is at most one path initiated by a node during a period in the WonGoo system, so in general $M < N$. In the extreme case $M = N$, the system operates fully, i.e. each node initiates one path. We calculate the expected payload $E(F_i)$ of node $v_i$ in this case. Firstly, we calculate the expected length of path L. Suppose that there are $K$ ($K \ge 0$) fixed nodes (not including the initiator and the responder) on path L. $K = 0$ means that there is no fixed node on path L, that is, the system reduces to a Crowds system. As shown in Figure 1, path L is composed of the probability paths $L_1, L_2, \ldots, L_k, \ldots, L_{K+1}$ whose lengths are $l_1, l_2, \ldots, l_k, \ldots, l_{K+1}$, respectively, where $l_t \ge 1$, $t = 1, 2, \ldots, K+1$. It is assumed that the length of path L is $R - 1$ if there are $R$ nodes (including the initiator and the responder) on path L. As a result, the expected length of path L is:

$$E(L) = E\left(\sum_{i=1}^{K+1} L_i\right) = \sum_{i=1}^{K+1} E(L_i) = (K+1)\left[(1 - p_f)\sum_{r=1}^{\infty} r\,p_f^{\,r-1}\right] = \frac{K+1}{1 - p_f} \tag{3}$$

If $M = N$, the expected payload of node $v_i$ ($i = 1, 2, \ldots, N$) is:

$$E(F_i) = \frac{M}{N} E(L) = E(L) = \frac{K+1}{1 - p_f} \tag{4}$$
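The closed form of the expected path length can be checked numerically. The sketch below is an illustration under the section's own assumption that each of the $K + 1$ probability paths has length $r \ge 1$ with probability $(1 - p_f)\,p_f^{\,r-1}$; the function names are hypothetical.

```python
import random


def expected_path_length(num_fixed, p_f):
    """Closed form of equation (3): E(L) = (K + 1) / (1 - p_f)."""
    return (num_fixed + 1) / (1.0 - p_f)


def expected_payload(num_paths, num_nodes, num_fixed, p_f):
    """Equation (2) combined with (3): E(F_i) = (M / N) * E(L)."""
    return num_paths / num_nodes * expected_path_length(num_fixed, p_f)


def simulated_path_length(num_fixed, p_f, trials=20000, seed=0):
    """Monte Carlo estimate of E(L) from geometric segment lengths."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        for _ in range(num_fixed + 1):  # K + 1 probability paths
            r = 1
            while rng.random() < p_f:   # extend the segment with prob. p_f
                r += 1
            total += r
    return total / trials


print(expected_path_length(5, 0.5), simulated_path_length(5, 0.5))
```

With $M = N$ the payload equals the expected path length, so the system size drops out, as equation (4) states.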

It can be seen that when the system is full, the expected payload of WonGoo is influenced only by the number of fixed nodes $K$ and the forward probability $p_f$, independent of the system size $N$. Figure 2 shows that the expectation $E(F)$ increases linearly with the number of fixed nodes $K$. Figure 3 shows that the expectation also increases with the forward probability $p_f$: when $p_f < 0.5$, $E(F)$ increases steadily, while when $p_f > 0.5$, it increases sharply.

4. Security assessment
4.1 Threat model
WonGoo does not protect against a strong global adversary who can break any anonymity system. Instead, we assume a well-funded adversary who can observe some fraction of network traffic; who can operate WonGoo nodes of its own; who can compromise some fraction of the WonGoo nodes on the network; and who can forge, modify, delete, or delay traffic. However, the adversary is not able to invert encryptions and read encrypted messages.


Figure 2. Expectation of payload vs the number of fixed nodes

Figure 3. Expectation value of payload vs forward probability

4.2 (n − 1) attack
The (n − 1) attack can be performed by flooding a node with fake messages alongside a single message to be traced. The attacker manipulates the messages entering the node by delaying or deleting the genuine messages except for the target one and flooding the node with the fake messages simultaneously. When the node fires, the only message unknown to the attacker must be the target message to be traced. The success of the (n − 1) attack depends on the following abilities of the attacker:
(1) the attacker can observe arbitrary traffic on the network links;
(2) the attacker can delay or delete the genuine messages without being detected by the nodes;
(3) the attacker can inject fake messages to flood the node;
(4) the attacker can distinguish his messages from other messages on the output side.

Though some countermeasures (Danezis et al., 2003; Gülcü and Tsudik, 1996; Möller et al., 2003; Kesdogan et al., 1998; Danezis and Sassaman, 2003) have been proposed to defend against the (n − 1) attack, none of them can thwart it completely. However, WonGoo can successfully protect against the (n − 1) attack. In layered-encryption mix systems, the next node of a fake message is known to the attacker even though the coding of the message has been changed. With this knowledge, the attacker can identify his fake messages. In the WonGoo system, by contrast, a node not only changes the coding of a message but also forwards it randomly. The attacker therefore cannot distinguish his fake messages from the target one, because the next nodes of all the messages are indeterminate.

4.3 Variable anonymity
The main problem of long mix paths is the heavy overhead. Park et al. (1993) point out that the length of the ciphertext in a mixnet grows proportionally to the number of mixes. In WonGoo, we lengthen the path with random forwarding, which reduces the overhead because probability nodes only perform link-by-link encryption, not layered encryption. An outside eavesdropper cannot relate the incoming and outgoing messages of a node because of link-by-link encryption; that is, he cannot distinguish between probability nodes and fixed nodes. However, compromised nodes inside the system can. Assume that all nodes between two malicious nodes are probability nodes; then the two malicious nodes can recognize a message m between them because the probability nodes do not apply layered encryption to m. This means that the adversary can reduce the effective length of the path by excluding all the probability nodes between them. However, if there is at least one fixed node between them, the adversary cannot do so, because the encoding of m changes when it passes through the fixed node. Hence, for a fixed path length, stronger anonymity is achieved by adding fixed nodes and reducing probability nodes, but the cost also increases. When strong anonymity is not needed, we can add probability nodes and reduce fixed nodes, achieving high efficiency. In short, WonGoo provides variable anonymity influenced by the number of fixed nodes and the forwarding probability $p_f$. It is a tradeoff between anonymity and efficiency.

4.4 Collaborating nodes
We call the nodes operated or compromised by the attacker corrupted nodes, or collaborators, and call those not corrupted honest nodes; honest nodes would not offer any information to the attacker. Assume the size of the WonGoo system is $N$, and there are $C$ ($1 \le C \le N - 1$) corrupted nodes. The question we consider is whether the attacker can determine who the initiator is. The analysis for the responder is the same as that for the initiator. If the initiator itself is corrupted, then the system provides no anonymity protection. Considering any path that is initiated by an honest node and on which a collaborator occupies a position, the goal of the collaborators is to determine the peer that initiated the path. We assume that the contents of the communication do not suggest an initiator, so the collaborators have no reason to suspect any peer other than the one from which they immediately received it, i.e. the peer immediately preceding the first collaborator on the path. All other honest peers are each equally likely to be the initiator, but are also obviously less likely to be the initiator than the collaborators’ immediate predecessor. We now analyze how confident the collaborators can be that their immediate predecessor is in fact the path initiator.


Let $H_k$ ($1 \le k$) denote the event that the first collaborator occupies the kth position on the path, where the initiator itself occupies the 0th position (and possibly others), and define $H_{k+} = H_k \vee H_{k+1} \vee H_{k+2} \vee \ldots$, so that $H_{1+}$ means that there is a collaborator on the path. Let $I$ denote the event that the immediate predecessor of the first collaborator on the path is the path initiator. When the first collaborator occupies the first position, its immediate predecessor is surely the initiator, i.e. $H_1 \Rightarrow I$, but the converse is not true, because the initiator might appear on the path multiple times. Given this notation, the collaborator now hopes to determine $P(I \mid H_{1+})$, the probability that the path initiator is the first collaborator’s immediate predecessor given that a collaborator is on the path. Let $q = (N - C)/N$, then $1 - q = C/N$. We assume that the length of path L is $R$ if there are $R$ nodes (including the initiator and the responder) on it, differing from the definition in section 3. If the first collaborator is on probability path $L_1$, then

$$P(H_i^1) = (p_f q)^{i-1}(1 - q) \tag{5}$$

$$P(H_{1+}^1) = (1 - q)\sum_{m=0}^{l_1 - 1}(p_f q)^m = (1 - q)\left(\frac{1 - (p_f q)^{l_1}}{1 - p_f q}\right) \tag{6}$$

where node $i$ is on path $L_1$, i.e. $1 \le i \le l_1$. If the first collaborator is on probability path $L_2$, then

$$P(H_i^2) = (p_f q)^{l_1 - 1}\,((1 - p_f)q)\,(p_f q)^{i-1}(1 - q) \tag{7}$$

$$P(H_{1+}^2) = (p_f q)^{l_1 - 1}\,((1 - p_f)q)\,(1 - q)\sum_{m=0}^{l_2 - 1}(p_f q)^m = (p_f q)^{l_1 - 1}\,((1 - p_f)q)\,(1 - q)\left(\frac{1 - (p_f q)^{l_2}}{1 - p_f q}\right) \tag{8}$$

where node $i$ is on path $L_2$, i.e. $1 \le i \le l_2$. If the first collaborator is on probability path $L_k$, then

$$P(H_i^k) = (p_f q)^{\sum_{j=1}^{k-1}(l_j - 1)}\,((1 - p_f)q)^{k-1}\,(p_f q)^{i-1}(1 - q) \tag{9}$$

$$P(H_{1+}^k) = (p_f q)^{\sum_{j=1}^{k-1}(l_j - 1)}\,((1 - p_f)q)^{k-1}\,(1 - q)\sum_{m=0}^{l_k - 1}(p_f q)^m = (p_f q)^{\sum_{j=1}^{k-1}(l_j - 1)}\,((1 - p_f)q)^{k-1}\,(1 - q)\left(\frac{1 - (p_f q)^{l_k}}{1 - p_f q}\right) \tag{10}$$

where node $i$ is on path $L_k$, i.e. $1 \le i \le l_k$.

To simplify our discussion, let $l_1 = l_2 = \cdots = l_k = \cdots = l_{K+1} = l \ge 1$, then

$$P(H_{1+}) = \sum_{j=1}^{K+1} P(H_{1+}^j) = (1 - q)\left(\frac{1 - (p_f q)^l}{1 - p_f q}\right)\sum_{m=0}^{K}\left((p_f q)^{m(l-1)}\,((1 - p_f)q)^m\right) = (1 - q)\left(\frac{1 - (p_f q)^l}{1 - p_f q}\right)\left(\frac{1 - \left((p_f q)^{l-1}(1 - p_f)q\right)^{K+1}}{1 - (p_f q)^{l-1}(1 - p_f)q}\right) = \frac{1 - q}{T} \tag{11}$$

where

$$T = \left(\frac{1 - p_f q}{1 - (p_f q)^l}\right)\left(\frac{1 - (p_f q)^{l-1}(1 - p_f)q}{1 - \left((p_f q)^{l-1}(1 - p_f)q\right)^{K+1}}\right).$$

We know that $P(I \mid H_1) = 1$, $P(H_1) = 1 - q$, and $P(I \mid H_{2+}) = 1/(N - C)$. The last of these follows from the observation that if the first collaborator on the path occupies only the second or a higher position, then it is immediately preceded on the path by any honest peer (including the initiator) with equal likelihood. Now $P(I)$ can be captured as

$$P(I) = P(H_1)P(I \mid H_1) + P(H_{2+})P(I \mid H_{2+}) = (1 - q) + \frac{1}{N - C}\left(P(H_{1+}) - P(H_1)\right)$$

$$= \frac{1 - q}{N - C}\left(\frac{1 - (p_f q)^l}{1 - p_f q}\right)\left(\frac{1 - \left((p_f q)^{l-1}(1 - p_f)q\right)^{K+1}}{1 - (p_f q)^{l-1}(1 - p_f)q}\right) + \frac{(N - C - 1)(1 - q)}{N - C} = \frac{(1 - q)\left(1 + T(N - C - 1)\right)}{T(N - C)} \tag{12}$$

Then, since $I \Rightarrow H_{1+}$, we get

$$P(I \mid H_{1+}) = \frac{P(I \wedge H_{1+})}{P(H_{1+})} = \frac{P(I \wedge H_1) + P(I \wedge H_2) + \cdots}{P(H_{1+})} = \frac{P(H_1)P(I \mid H_1) + P(H_2)P(I \mid H_2) + \cdots}{P(H_{1+})} = \frac{P(I)}{P(H_{1+})} = \frac{1 + T(N - C - 1)}{N - C} \tag{13}$$
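Equations (11) and (13) are easy to evaluate numerically. The sketch below is an illustrative transcription, assuming equal probability-path lengths $l$ as in the simplification above; the function names are hypothetical. As a sanity check it compares the case $K = 0$ with large $l$ against the Crowds result $P(I \mid H_{1+}) = (N - p_f(N - C - 1))/N$ from Reiter and Rubin (1998).

```python
def wongoo_T(p_f, q, l, K):
    """The factor T defined after equation (11)."""
    a = p_f * q                          # honest node forwards randomly
    b = a ** (l - 1) * (1 - p_f) * q     # traverse one whole probability path
    return ((1 - a) / (1 - a ** l)) * ((1 - b) / (1 - b ** (K + 1)))


def prob_collaborator_on_path(N, C, p_f, l, K):
    """Equation (11): P(H_1+) = (1 - q) / T."""
    q = (N - C) / N
    return (1 - q) / wongoo_T(p_f, q, l, K)


def prob_predecessor_is_initiator(N, C, p_f, l, K):
    """Equation (13): P(I | H_1+) = (1 + T (N - C - 1)) / (N - C)."""
    q = (N - C) / N
    T = wongoo_T(p_f, q, l, K)
    return (1 + T * (N - C - 1)) / (N - C)


# Crowds limit: no fixed nodes (K = 0) and a very long probability path.
N, C, p_f = 100, 10, 0.75
crowds = (N - p_f * (N - C - 1)) / N
print(prob_predecessor_is_initiator(N, C, p_f, l=200, K=0), crowds)
```

A smaller value of $P(I \mid H_{1+})$ means the collaborators are less certain that their predecessor is the initiator.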

When $K = 0$ and $l \to \infty$, we can see that $P(H_{1+})$, $P(I)$ and $P(I \mid H_{1+})$ are the same as those of Crowds (Reiter and Rubin, 1998). The reason is that WonGoo reduces to Crowds in this case. In a real system, because the initiator cannot appear in every position on a path, the probability $P(I \mid H_{1+})$ will be smaller than the value derived above. From equation (13) we can see that the degree of anonymity provided by the system increases with the forward probability and the path length, and decreases as the number of corrupted nodes increases. In addition, it is hard to increase anonymity significantly only by increasing the size of the system. The reason is that anonymity is achieved by hiding one’s actions within the actions of many others: the larger the anonymity set, the stronger the anonymity. However, only some users participate in a given communication in a large-scale peer-to-peer system. To benefit from a larger system, we must therefore take other measures to ensure that the attacker believes that all nodes participate in the communication.

5. Measuring anonymity based on entropy
Measuring anonymity is a challenging task. Díaz et al. (2002) and Serjantov and Danezis (2002) each proposed a metric to measure anonymity based on information entropy. In a system with $N$ users, let $U = \{u_1, u_2, \ldots, u_N\}$ be the anonymity set, and $X$ be the discrete random variable with probability mass function $p_i = \Pr(X = u_i)$, the probability with which the attacker guesses that peer $i$ is the initiator. Each $u_i \in U$ corresponds to an element of the anonymity set (a sender). Ideally, before the execution of an attack, every $u_i$ is the initiator with a priori probability $p_i = 1/N$ from the attacker’s view of the system. After the attack, the adversary might have additional information that changes the probability distribution on the anonymity set, and assigns a posteriori probabilities to the users. Entropy can be used as a measure to describe the degree of anonymity the system provides against a specific attack.
For each sender P belonging to the senders set of size N, the attacker assigns a probability pi ð Ni¼1 pi ¼ 1Þ. The posteriori entropy of the system denoted by H ðXÞ can be calculated as: H ðXÞ ¼ 

N X

pi log2 ðpi Þ:

ð14Þ

i¼1

Obviously the maximum entropy HM of the system is HM ¼ maxðH ðXÞÞ ¼ log2 N:

ð15Þ

The information that the attacker has learned with the attack can be expressed as HM  H ðXÞ. Dı´az et al. (2002) normalize this information and define the degree of anonymity provided by the system as: d ¼1

HM  H ðXÞ HX ¼ : HM HM

ð16Þ

For the particular case of one user, d is assumed to be zero. The advantage of the normalization is that the degree lies within the finite range [0, 1]. This degree of anonymity quantifies the amount of information the system is leaking. There are some further observations:

• It is always the case that $0 \le d \le 1$;

• $d = 0$ when a user appears as being the originator of a message with probability one;

• $d = 1$ when all users appear as being the initiator with the same probability ($p_i = 1/N$).
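As an illustration of equations (14)-(16), the following minimal Python sketch computes the entropy-based degree of anonymity from a list of attacker-assigned probabilities. The function names and the sample distributions are assumptions made for this example; they do not come from the cited works.

```python
import math


def entropy_bits(probs):
    """Shannon entropy H(X) in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def degree_of_anonymity(probs):
    """Normalized degree d = H(X) / H_M with H_M = log2(N), as in equation (16)."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    n = len(probs)
    if n == 1:
        return 0.0  # single-user case: d is taken to be zero
    return entropy_bits(probs) / math.log2(n)


# Hypothetical posterior distributions over four senders.
uniform = degree_of_anonymity([0.25, 0.25, 0.25, 0.25])
identified = degree_of_anonymity([1.0, 0.0, 0.0, 0.0])
skewed = degree_of_anonymity([0.5, 0.25, 0.25, 0.0])
```

The three sample calls correspond to the observations listed above: a uniform posterior gives the maximum degree, a fully identified sender gives zero, and any other distribution falls in between.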

In some systems there may be several different distributions, each occurring with a certain probability. Díaz et al. (2002) advise calculating the degree of anonymity offered by the system taking into account all possibilities, and combining the obtained degrees as follows:

$$d = \sum_{j=1}^{S} p_j d_j \tag{17}$$

where $d_j$ is the degree obtained under particular circumstances, $p_j$ is the probability of occurrence of such circumstances, and S is the number of different possibilities.

5.1 Degree of anonymity from the point of view of an attacker on the path
We now calculate the degree of anonymity of a user who sends a message along a path with corrupted nodes. From section 4, we know that in this case the probability assigned to the predecessor of the first collaborator on the path is given by equation (13). The probabilities assigned to the corrupted nodes remain zero. Assuming that the attacker does not have any extra information about the rest of the honest nodes, the probabilities assigned to those members are:

$$p_i = \Pr(X = u_i) = \frac{1 - P(I \mid H_{1+})}{N - C - 1} = \frac{1 - T}{N - C} \tag{18}$$

where $u_i$ is any honest node other than the one immediately before the first collaborator. The posterior entropy $H(X)$ of the system is

$$H(X) = \frac{(N - C - 1)(1 - T)}{N - C} \log_2 \frac{N - C}{1 - T} + \frac{1 + (N - C - 1)T}{N - C} \log_2 \frac{N - C}{1 + (N - C - 1)T}. \tag{19}$$

The maximum entropy $H_M$, taking into account that the size of the anonymity set is $N - C$, is equal to:

$$H_M = \log_2(N - C). \tag{20}$$

Therefore, the degree of anonymity of the system after the attack will be:

$$d = \frac{H(X)}{H_M} = \frac{(N - C - 1)(1 - T)}{(N - C) \log_2(N - C)} \log_2 \frac{N - C}{1 - T} + \frac{1 + (N - C - 1)T}{(N - C) \log_2(N - C)} \log_2 \frac{N - C}{1 + (N - C - 1)T}. \tag{21}$$
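The closed form in equation (21) can be cross-checked numerically against a direct entropy computation over the posterior distribution of equation (18). The sketch below is only a sanity check: it treats T as a free parameter in [0, 1] (its definition belongs to equation (13)), and the values of N, C and T are made up for illustration.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def posterior_on_path(n, c, t):
    """Attacker's posterior over the N - C honest nodes, following equation (18).

    The first entry is the predecessor of the first collaborator; it receives
    whatever mass is left after every other honest node gets (1 - T) / (N - C).
    """
    honest = n - c
    other = (1 - t) / honest
    suspect = 1 - (honest - 1) * other
    return [suspect] + [other] * (honest - 1)


def degree_closed_form(n, c, t):
    """Degree of anonymity from equation (21)."""
    honest = n - c
    if honest == 1:
        return 0.0  # only the initiator is honest
    suspect = (1 + (honest - 1) * t) / honest
    h = suspect * math.log2(1 / suspect)
    if t < 1:
        h += (honest - 1) * (1 - t) / honest * math.log2(honest / (1 - t))
    return h / math.log2(honest)


n, c, t = 20, 4, 0.3  # hypothetical values, not taken from the paper
direct = entropy_bits(posterior_on_path(n, c, t)) / math.log2(n - c)
closed = degree_closed_form(n, c, t)
```

Under these assumptions `direct` and `closed` agree up to floating-point error, and the boundary cases T = 0 (uniform posterior) and T = 1 (predecessor identified with certainty) give d = 1 and d = 0 respectively.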


From section 4, we know $1 \le C \le N - 1$, so $1/N \le C/N \le (N - 1)/N$. The case $C = N - 1$ means there is only one honest node, i.e. the initiator itself. In this case, $d = 0$. Figure 4 shows the degree of anonymity d vs the proportion of corrupted nodes with $l = 4$ and $K = 5$. We can see that d decreases as the proportion $C/N$ increases. Figure 5 shows the degree of anonymity d vs the forward probability $p_f$ with $l = 4$ and $K = 5$. For a given number of corrupted nodes, the degree of anonymity increases as the forward probability increases.

Figure 4. Anonymity degree from the point of view of an attacker on the path vs C/N

Figure 5. Anonymity degree from the point of view of an attacker on the path vs $p_f$

5.2 Degree of anonymity from the point of view of the sender
From the point of view of the sender, the path will either have corrupted nodes or not. So we have to take into account whether the message goes through a corrupted node or only through honest nodes. The probability $p_c$ that the path through which the message goes has collaborators is given by equation (11). The probability $p_h$ of the message going through only honest nodes is:

$$p_h = 1 - p_c = 1 - P(H_{1+}) = \frac{T + q - 1}{T}. \tag{22}$$


As a result, the degree of anonymity of the system is:

$$d = p_c d_c + p_h d_h = p_c d_c + p_h = \frac{T + q - 1}{T} + \frac{(1 - q)(N - C - 1)(1 - T)}{T (N - C) \log_2(N - C)} \log_2 \frac{N - C}{1 - T} + \frac{(1 - q)(1 + (N - C - 1)T)}{T (N - C) \log_2(N - C)} \log_2 \frac{N - C}{1 + (N - C - 1)T}. \tag{23}$$
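Equation (23) is an instance of the weighted combination in equation (17) with two cases: a path with collaborators (probability $p_c$, degree $d_c$) and an all-honest path (probability $p_h = 1 - p_c$, degree $d_h = 1$). The short Python sketch below shows only that combination step; `p_c` and `d_c` are passed in as plain numbers, and the sample values are hypothetical rather than results from the paper.

```python
def combined_degree(cases):
    """Equation (17): d = sum of p_j * d_j over mutually exclusive cases."""
    total = sum(p for p, _ in cases)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("case probabilities must sum to 1")
    return sum(p * d for p, d in cases)


def sender_view_degree(p_c, d_c):
    """Sender's view: corrupted-path case weighted with the all-honest case (d_h = 1)."""
    return combined_degree([(p_c, d_c), (1.0 - p_c, 1.0)])


# Hypothetical inputs: a quarter of paths contain a collaborator.
example = sender_view_degree(p_c=0.25, d_c=0.6)
```

Because $d_h = 1$ is the maximum degree, the sender's-view degree is never lower than $d_c$ for the same parameters.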


If a message does not go through any collaborating nodes, the attacker will assign all honest senders the same probability, $p_i = 1/(N - C)$, and the degree of anonymity will be $d_h = 1$ (the maximum degree is achieved because the attacker cannot distinguish the sender from the rest of the honest users). Figure 6 shows that the degree of anonymity d decreases as the proportion of corrupted nodes increases, and Figure 7 shows that d increases as the forward probability $p_f$ increases.

Figure 6. Anonymity degree from the point of view of the sender vs C/N

Figure 7. Anonymity degree from the point of view of the sender vs $p_f$

5.3 Discussions
So far, in mix-based systems such as Mixminion (Danezis et al., 2003), Onion Routing (Reed et al., 1998), Tor (Dingledine et al., 2004), MorphMix (Rennhard and Plattner, 2002) and Tarzan (Freedman and Morris, 2002), the path initiator would not reappear on the path because all nodes along the path are selected by the initiator. Therefore, the probability $P_{mix}(I \mid H_{1+})$ of those systems under our threat model is

$$P_{mix}(I \mid H_{1+}) = \frac{C}{N}. \tag{24}$$

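Assuming that there are K nodes on a Crowds path, we get the following equations:

$$P_{Crowds}(H_{1+}) = \frac{C}{N} \sum_{i=1}^{K} (p_f q)^{i-1} = \frac{C}{N} \left( \frac{1 - (p_f q)^K}{1 - p_f q} \right) \tag{25}$$

$$P_{Crowds}(H_{2+}) = \frac{C}{N} \left( \frac{p_f q - (p_f q)^K}{1 - p_f q} \right) \tag{26}$$

$$P_{Crowds}(I) = \frac{C}{N} + \frac{C (p_f q - (p_f q)^K)}{N (N - C)(1 - p_f q)} \tag{27}$$

$$P_{Crowds}(I \mid H_{1+}) = \frac{1 - p_f q}{1 - (p_f q)^K} + \frac{p_f q - (p_f q)^K}{(N - C)(1 - (p_f q)^K)}. \tag{28}$$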

Equation (13) gives the probability $P_{WonGoo}(I \mid H_{1+})$ for the WonGoo system. We can see that, when the anonymity path is very long, the attacker has the same confidence in WonGoo and in Crowds that the initiator is the first collaborator's immediate predecessor. That is to say:

$$P_{WonGoo}(I \mid H_{1+}) = P_{Crowds}(I \mid H_{1+}) = 1 - \frac{(N - C - 1) p_f}{N}. \tag{29}$$
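The long-path limit in equation (29) can be checked numerically for the Crowds side using equation (28). The sketch below assumes $q = (N - C)/N$, i.e. the probability that a hop lands on an honest node; this section does not restate the definition of q, but that value is what makes the limit of equation (28) agree with equation (29). The parameter values are made up.

```python
def crowds_predecessor_probability(n, c, p_f, k):
    """Equation (28): P_Crowds(I | H_1+) for a path of K nodes.

    Assumes q = (N - C) / N (probability that a hop is honest).
    """
    q = (n - c) / n
    x = p_f * q
    return (1 - x) / (1 - x**k) + (x - x**k) / ((n - c) * (1 - x**k))


def long_path_limit(n, c, p_f):
    """Equation (29): value approached as the path length grows."""
    return 1 - (n - c - 1) * p_f / n


# Hypothetical parameters: 100 nodes, 10 corrupted, forward probability 0.75.
short_path = crowds_predecessor_probability(100, 10, 0.75, 1)
long_path = crowds_predecessor_probability(100, 10, 0.75, 200)
limit = long_path_limit(100, 10, 0.75)
```

With K = 1 the first collaborator's predecessor can only be the initiator, so the probability is 1; as K grows the value falls towards the limit of equation (29).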

When the path is not very long, we have the following properties:

$$P_{WonGoo}(I \mid H_{1+}) < P_{Crowds}(I \mid H_{1+}), \quad (p_f < 0.5) \tag{30}$$

$$P_{WonGoo}(I \mid H_{1+}) = P_{Crowds}(I \mid H_{1+}), \quad (p_f = 0.5) \tag{31}$$

$$P_{WonGoo}(I \mid H_{1+}) > P_{Crowds}(I \mid H_{1+}), \quad (p_f > 0.5). \tag{32}$$

And it always holds that:

$$P_{WonGoo}(I \mid H_{1+}) > P_{mix}(I \mid H_{1+}). \tag{33}$$

Note that our threat model is not very powerful; therefore, the benefit of layered encryption in WonGoo does not show up in the comparison with Crowds. A detailed comparison among WonGoo, Crowds and mix-based systems under a more powerful threat model can be found in Lu et al. (2004).

6. Conclusion
In this paper, we first quantitatively evaluated the payload of the current WonGoo system, which is affected mainly by the number of fixed nodes and the forward probability. Moreover, the payload is independent of the system size, which shows good scalability. We then analyzed the security properties provided by the WonGoo system in detail. WonGoo is a tradeoff between anonymity and efficiency, providing variable anonymity. Thanks to random forwarding, WonGoo protects against (n − 1) attacks. After showing how confident the collaborators can be that their immediate predecessor is in fact the path initiator, i.e. the probability $P(I \mid H_{1+})$, we measured the degree of anonymity of the system based on information theory and presented a security comparison among WonGoo, Crowds and mix-based systems.

References
Boucher, P., Shostack, A. and Goldberg, I. (2000), "Freedom systems 2.0 architecture", December, available at: www.freedom.net/
Chaum, D. (1981), "Untraceable electronic mail, return addresses and digital pseudonyms", Communications of the ACM, Vol. 24 No. 2, pp. 84-8.
Danezis, G. and Sassaman, L. (2003), "Heartbeat traffic to counter (n − 1) attacks", Proceedings of the Workshop on Privacy in the Electronic Society (WPES), Washington, DC, October, pp. 89-93.
Danezis, G., Dingledine, R. and Mathewson, N. (2003), "Mixminion: design of a type III anonymous remailer protocol", Proceedings of the 2003 IEEE Symposium on Security and Privacy, Oakland, California, May, pp. 2-15.
Díaz, C., Seys, S., Claessens, J. and Preneel, B. (2002), "Towards measuring anonymity", Proceedings of Privacy Enhancing Technologies Workshop (PET), Oakland, California, April, pp. 54-68.
Dingledine, R., Mathewson, N. and Syverson, P. (2004), "Tor: the second-generation onion router", Proceedings of the 13th USENIX Security Symposium, San Diego, CA, August, pp. 303-20.
Freedman, M. and Morris, R. (2002), "Tarzan: a peer-to-peer anonymizing network layer", Proceedings of the 9th ACM Conference on Computer and Communications Security (CCS), Washington, DC, November, pp. 193-206.
Gülcü, C. and Tsudik, G. (1996), "Mixing E-mail with BABEL", Proceedings of the Network and Distributed Security Symposium (NDSS), San Diego, CA, February, pp. 2-16.
Kesdogan, D., Egner, J. and Büschkes, R. (1998), "Stop-and-Go MIXes: providing probabilistic anonymity in an open system", Proceedings of Information Hiding Workshop (IH), Portland, Oregon, LNCS 1525, pp. 83-98.
Lu, T., Fang, B., Sun, Y. and Cheng, X. (2004), "WonGoo: a peer-to-peer protocol for anonymous communication", Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), Las Vegas, Nevada, June, pp. 1102-6.
Möller, U., Cottrell, L., Palfrader, P. and Sassaman, L. (2003), "Mixmaster Protocol-Version 2 Draft", July, available at: www.abditum.com/mixmaster-spec.txt
Park, C., Itoh, K. and Kurosawa, K. (1993), "Efficient anonymous channel and all/nothing election scheme", Proceedings of Eurocrypt, Lofthus, Norway, LNCS 765, pp. 248-59.
Pfitzmann, A. and Köhntopp, M. (2003), "Anonymity, unobservability, and pseudonymity – a proposal for terminology, Draft v0.14", May, available at: www.freehaven.net/anonbib/papers/
Reed, M., Syverson, P. and Goldschlag, D. (1998), "Anonymous connections and onion routing", IEEE Journal on Selected Areas in Communications, Vol. 16 No. 4, pp. 482-94.
Reiter, M.K. and Rubin, A.D. (1998), "Crowds: anonymity for web transactions", ACM Transactions on Information and Systems Security, Vol. 1 No. 1, pp. 66-92.


Rennhard, M. and Plattner, B. (2002), "Introducing MorphMix: peer-to-peer based anonymous internet usage with collusion detection", Proceedings of the ACM Workshop on Privacy in the Electronic Society (WPES), Washington, DC, November, pp. 91-102.
Serjantov, A. and Danezis, G. (2002), "Towards an information theoretic metric for anonymity", Proceedings of Privacy Enhancing Technologies Workshop (PET), San Francisco, CA, April, pp. 41-53.
Sui, H., Chen, S., Chen, J. and Wang, J. (2004), "Payload analysis of rerouting-based anonymous communication systems", Journal of Software, Vol. 15 No. 2, pp. 278-85.

About the authors
Tianbo Lu received the PhD degree in Computer Science from the Institute of Computing Technology, Chinese Academy of Sciences in 2006. His research interests include information security, computer networks and distributed systems. Tianbo Lu is the corresponding author and can be contacted at: [email protected]

Binxing Fang received the PhD degree in Computer Science from the Harbin Institute of Technology in 1989. He is currently a professor in the Institute of Computing Technology, Chinese Academy of Sciences. He has published his research papers in the areas of information security, computer architecture, information assurance, parallel and distributed computing. He has served as program committee member for several International Conferences and also served as reviewer for several International Journals.

Yuzhong Sun received the PhD degree in Computer Science from the Institute of Computing Technology, Chinese Academy of Sciences in 1997. He is currently a professor in the Institute of Computing Technology, Chinese Academy of Sciences. His research interests include computer architecture and grid computing.

Xueqi Cheng received the PhD degree in Computer Science from the Institute of Computing Technology, Chinese Academy of Sciences in 2006. His research interests include information security, data mining and P2P computing.



Integrated business-process driven design for service-oriented enterprise applications
Xingdong Shi, Weili Han and Yinsheng Li

Business-process driven design

175

Software School, Fudan University, Shanghai, People’s Republic of China, and

Ying Huang
IBM T.J. Watson Research Center, New York, New York, USA

Abstract
Purpose – An enterprise application can be built quickly by service composition, and business process composition is the essence of service composition. To build such a service-oriented enterprise application, the developer needs an integrated design tool. The purpose of this paper is to present an integrated business-process driven design for service-oriented enterprise applications.
Design/methodology/approach – The approach has three phases: business environment modeling, business process modeling, and script compiling. Business environment modeling adopts a new modeling technique which combines the advantages of the use case diagram and the sequence diagram in UML. Business process modeling builds a concrete model from the business environment model. The mapping algorithms from business environment model to business process model are also given. In the script compiling phase, the business process model is compiled into several deployable files. The paper then presents a demonstration which shows how to apply the approach to developing a supply chain management system for the retail industry.
Findings – The analysis shows that the approach can meet the requirements of service composition. The approach helps business experts freely express their business requirements in the business environment modeling phase, and helps IT experts quickly design service-oriented enterprise applications from the business environment model in the business process modeling phase.
Originality/value – This paper proposes a novel integrated approach to model and implement business-process driven service composition, and presents an integrated tool based on Eclipse to implement this approach.
Keywords Business planning, Modeling, Business process re-engineering, Information systems
Paper type Research paper

1. Introduction
Business process integration is critical to the further development of enterprises in the near future (Li et al., 2005a). It requires enterprise information systems to support not only the internal processes within the enterprise but also the collaboration among business partners. Moreover, all the partners' IT systems together form a complicated cooperating system. Service-oriented architecture (SOA) (He, 2003) is one of the enabling technologies for building such a complex system, and is considered to be the most efficient in designing enterprise systems and integrating heterogeneous applications in a large-scale distributed environment (Mukhi et al., 2004). In an SOA-based environment, every enterprise exposes its functionalities as services and integrates with the others in a loosely coupled way. This makes it possible for distributed heterogeneous systems to share information and cooperate; however, it also makes the applications more complicated.

Supported by IBM Shared University Research (SUR) Program: On Demand Business Integration with Intelligent Web Service and Process Composition.

International Journal of Pervasive Computing and Communications Vol. 3 No. 2, 2007 pp. 175-189 © Emerald Group Publishing Limited 1742-7371 DOI 10.1108/17427370710847309


The rapid change of business requirements means that enterprises should not expect software developers to build a perfect system that never changes. A new trend is for software developers to concentrate on constructing the infrastructure and to leave business-level system building to business specialists (Xiong et al., 2004), so that the system can be easily changed to meet new business requirements. This paper proposes a new, business-driven approach to building service-oriented enterprise applications. In this approach, the main work is computer-related business process modeling; however, it starts from business environment modeling, which is entirely about business. This paper proposes a new modeling technique for business environment modeling, designs a business process execution language (BPEL)-enabled foldable flowchart (BFF) for business process modeling, and proves that the mapping algorithm has linear time complexity. The rest of this paper is organized as follows. Section 2 introduces related work in this field. Section 3 gives an overview of integrated business-process-driven design for service-oriented enterprise applications. Section 4 discusses the business environment model (BEM) and business process model (BPM) in detail. Section 5 introduces the mapping algorithms and analyzes their time complexity. Section 6 illustrates the application of this approach in a supply chain management scenario. Section 7 summarizes this paper and proposes some future work.

2. Related works
Service composition is one of the hot spots in the field of web services.
There have been some solutions such as BPEL (Business Process Execution Language for web services, Version 1.1 Specification [online], 2003) and the web services choreography description language (Web Services Choreography Description Language Version 1.0 [online], 2005); however, they only give technical solutions for service composition and do not say how to use these solutions to build IT systems that meet ever-changing business requirements. Most research is based on object-oriented techniques and component-based development (Tsoi et al., 2003; Liang, 2002). With the emergence of web services technology, service-oriented development has become a new hot spot in academia. In service-oriented development, a critical problem is how to rapidly and flexibly compose existing services to build a new service that meets the business requirement. Current service composition techniques can be classified as manual or computer-aided. In manual composition, the user writes the process script, usually with the help of a graphical user interface. The script is then deployed on a process execution engine. In computer-aided composition, a computer program generates a rough process and the user then starts work from the generated one. There has been a lot of research on computer-aided service composition, which has explored two kinds of approaches. One kind annotates a service with service semantics, quality-of-service information and similar metadata, using techniques such as ontologies (Cao et al., 2005; Li et al., 2005; Zhoo et al., 2005). This extra information about a service helps the computer select and compose services by artificial intelligence techniques or mathematical reasoning.
The other kind of approach adopts the idea of model-driven architecture (Miller and Mukerji, 2003) and simplifies the users' work by automatic transformation from high-level models to low-level models (Shin et al., 2005; Miller et al., 2003). Agarwal et al. (2005) combined both kinds of approaches. Their work separates service composition into two phases: a logical composition phase and a physical composition phase. In the logical composition phase, the analysis of service semantics and business requirements, both of which are described in OWL-S, helps build an abstract BPEL process without binding to any services. In the physical composition phase, the abstract process is bound to concrete services. Because the requirement is described in OWL-S, this is not really business-oriented; it is still a developer-oriented approach. Machiraju et al. (2004) proposed a theoretical framework for business-driven IT infrastructure. However, that work only structured the problem into three layers at different levels of abstraction and analyzed the techniques that may be used in this infrastructure; no implementation is provided. The above research is all developer-oriented. Other research puts forward the idea of business-oriented service composition. Fang et al. (2005), for example, proposed a service virtualization mechanism for business-end programming. When service composition is started by business process modeling, however, current research concentrates on the internal process within a single enterprise, while Gordijn et al. (2000) pointed out that process models are not a good starting point for identifying business stakeholder requirements. Most e-business projects start with the design of a business model stating who is offering and exchanging what with whom and expects what in return, rather than how these offerings are selected, negotiated, contracted and fulfilled operationally, as is explained by a process model. The cooperation among enterprise applications from different business partners is illustrated in Figure 1. Apart from the internal processes within each enterprise, there should be a collaborative process among all the enterprises, as shown in the central ellipse, which is exactly Gordijn's "who is offering and exchanging what with whom and expects what in return".
This paper proposes a new business-driven approach to building service-oriented enterprise applications. It provides a BEM to capture the requirements of the cooperation among enterprises, and a BFF for business process modeling. In addition, it provides model mapping algorithms of linear time complexity.

Figure 1. Cooperated enterprise IT systems

3. Integrated business-process driven design
Constructing a new IT system starts with business environment modeling. We define the business environment as the business partners and the cooperation between the enterprise itself and these partners. The business environment is described in a BEM, which contains only the information about the exposed functionalities of each enterprise application, not the internal process. In this step, we obtain a computing-independent model that describes only the business requirement to be met. There is no limitation on implementation technologies. After the first step is done, the BEM is mapped to a BPM using a set of mapping rules, which are the key point of this step. We have discovered some rules and applied them to help generate a rough BPM from a BEM. Software developers start their work from the generated BPM rather than from scratch. The BPM describes the internal process, and it is a platform-independent model. The process consists of actions and the logical relationships among these actions, but it does not specify what technologies should be used to implement the process. The software developers' main work here is to complement the BPM with enough details. After the BPM has been constructed, we can compile the model and obtain some script files. Building the new system is finished by deploying these script files on a process execution engine. In this solution, constructing a new system starts by capturing the business requirement, and the following steps are also driven by this requirement. This is illustrated in Figure 2.

Figure 2. Overview of business-driven SOA systems development

4. Model definition
4.1 Business environment model
The business environment model is a macroscopic view of the cooperating enterprise IT systems. It describes the business requirement that is supposed to be met. Many technologies can be used to model the business environment, such as IDEF and UML, but they all have some flaws in a service-oriented environment. IDEF is a family of methods for enterprise modeling and analysis. Within the family, IDEF0 is used for system function modeling in a lot of research (Ni et al., 2005). However, IDEF0 mainly targets modeling a single complicated system, whereas in an SOA environment the key point is to analyze the cooperation among the systems (services). UML is another frequently used modeling language. The use case diagram and sequence diagram in UML can be used for system modeling in many cases. A use case diagram is quite flexible. It treats a system as a service provider and describes what the system is offering. However, the use case diagram is not effective in an SOA environment, because every system may have the roles of both service provider and service consumer. A sequence diagram can model the cooperation among systems, but dynamically inserting or removing a system in a sequence diagram is time-consuming. The modeling technique we propose has the advantages of both the use case diagram and the sequence diagram, and is suitable for modeling cooperation in an SOA environment. We define a BEM as BEM = {BPS, VXS, OPS, IPS, DCS, ECS}. Business participant set (BPS) is the set of all the business participants involved in the business process. There are two types of participants. One is the service provider, for example, the headquarters or a distribution center of a retailer. The other includes business partners of the service provider. In the case of a retailer, its partners may include manufacturers and banks, among others. An example BPS is {Headquarters, Store, Manufacturer, Bank, Shipment}. Each business participant is denoted by an icon. Value exchange set (VXS) is the set of value exchanges among these business participants.
Each value exchange is in the form vx(bp1, bp2), indicating that there is an information flow from bp1 to bp2. It can be defined as VXS = {vx(bp1, bp2) | there is an information flow from bp1 to bp2}. An example VXS is {vx(Headquarters, Manufacturer), vx(Store, Headquarters)}. Each value exchange is denoted by an arrow from one icon to another, with a sequence number indicating its order in time. A value exchange with a smaller sequence number happens earlier than one with a bigger sequence number. Dependence constraint set (DCS) is the set of all dependence constraints. Each dependence constraint is a pair dc(bp1, bp2), indicating that the presence of one business participant depends on the other. For example, dc(Manufacturer, Bank) indicates that if a manufacturer is involved, a bank must also be involved to provide some service. So the DCS can be formally defined as DCS = {dc(bp1, bp2) | the presence of bp1 depends on bp2}. Exclusion constraint set (ECS) is the set of all exclusion constraints. Each exclusion constraint is a pair ec(bp1, bp2), indicating that one business scenario should not involve both bp1 and bp2. For example, ec(Intel, AMD) indicates that we should not buy CPUs from both Intel and AMD in one business scenario. So the ECS can be formally defined as ECS = {ec(bp1, bp2) | bp1 and bp2 should not both be involved in one business scenario}. The BEM shown in Figure 3 is simple to edit and convenient for communication. Furthermore, the research by Danesh found that an emphasis on a communication-oriented view of processes seems to increase perceived modeling quality and redesign success (Danesh and Knock, 2005).
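To make the set-based definition concrete, here is a minimal Python sketch of a BEM as a data structure, with a check of the dependence and exclusion constraints for one business scenario. The class and method names are assumptions made for this illustration, not the tool's actual object model; OPS and IPS are left out because the text above does not define them, and the sequence numbers are invented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ValueExchange:
    seq: int      # sequence number: smaller means earlier
    source: str   # bp1
    target: str   # bp2


@dataclass
class BusinessEnvironmentModel:
    participants: set                                # BPS
    exchanges: list                                  # VXS
    dependence: set = field(default_factory=set)     # DCS: (bp1, bp2), bp1 requires bp2
    exclusion: set = field(default_factory=set)      # ECS: (bp1, bp2) must not co-occur

    def ordered_exchanges(self):
        """Value exchanges sorted by their sequence numbers."""
        return sorted(self.exchanges, key=lambda vx: vx.seq)

    def violations(self, scenario):
        """List constraint violations for the participants chosen in one scenario."""
        problems = []
        for bp1, bp2 in sorted(self.dependence):
            if bp1 in scenario and bp2 not in scenario:
                problems.append(f"{bp1} requires {bp2}")
        for bp1, bp2 in sorted(self.exclusion):
            if bp1 in scenario and bp2 in scenario:
                problems.append(f"{bp1} excludes {bp2}")
        return problems


bem = BusinessEnvironmentModel(
    participants={"Headquarters", "Store", "Manufacturer", "Bank", "Shipment", "Intel", "AMD"},
    exchanges=[
        ValueExchange(2, "Headquarters", "Manufacturer"),
        ValueExchange(1, "Store", "Headquarters"),
    ],
    dependence={("Manufacturer", "Bank")},
    exclusion={("Intel", "AMD")},
)
```

For example, `bem.violations({"Store", "Manufacturer"})` reports the missing bank, while a scenario that includes both Manufacturer and Bank passes the dependence check.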


Figure 3. Business environment model

Figure 4. In a normal flowchart, a 3d-vision is required to show all levels of model details

4.2 Business process model
In the field of business process modeling, there is already a great deal of research. Proposed approaches include flowcharts, data flow diagrams, role activity diagrams, role interaction diagrams, IDEF, coloured Petri nets, etc. When dealing with processes that need a high level of detail, the flowchart technique is probably the best choice because of its flexibility (Ruth Sara, 2004). However, a flowchart has several layers: the top layer gives an overview and the sub-layers give different levels of detail. In other words, a flowchart needs a 3D view to show all levels of detail, as illustrated in Figure 4. This paper proposes a new modeling method named BFF for business process modeling. It is much more legible than the normal flowchart. Each modeling element in BFF is exactly a counterpart of the concept "activity" in the BPEL language. Activities include not only actions such as receiving data and invoking services but also control logic such as sequence, selection and loop, as in structured programming. Other control logic includes concurrency, exception handling and compensation. Each activity is denoted by a unique icon. All icons are arranged top-to-bottom and left-to-right according to their execution order. The advantage is that the arrows of a normal flowchart can be eliminated, which makes the diagram more concise.

As opposed to a normal flowchart, BFF can show all levels of detail in the same plane. Icons denoting containers, such as SEQUENCE and FLOW, can be folded and unfolded. A folded icon hides the internal process in the container while an unfolded one shows it. So we can see any level of detail by folding or unfolding some icons. In the BFF example shown in Figure 5, the SEQUENCE icon is unfolded to show some details, and then the FLOW icon in the SEQUENCE is unfolded to show more details.

5. Model-mapping algorithms
Mapping from BEM to BPM is a critical step. The BEM is about external requirements, while the BPM is about the internal operation process. There is no direct mapping between these two models. However, the sequence numbers in the BEM give us clues about the possible process. The sequence numbers tell us the time order of the arrows, and the source and destination of each arrow tell us which two partners are involved in the corresponding action. Thus, we can obtain a framework of the business process from this information. Moreover, we have discovered some rules that help generate a more accurate BPM. To allow dynamic extension of these rules, we use a rules engine to do the mapping. The mapping algorithm is described in Figure 6.


(1) Create an empty BPM object BPM. (2) Create a SEQUENCE object in the BPM. All the activity objects created later will be wrapped in this SEQUENCE object. (A process in BPEL language consists of activities. SEQUENCE is a special kind of activity that act as a container, and all the activities wrapped in a SEQUENCE will be executed sequentially.) (3) Scan the BEM to get a list of business partners. (4) For each business partner in BEM, generate a PartnerLink object in BPM. (5) Scan the BEM to get a list of information flows.

Figure 5. An example of BFF showing all levels of details in the same plane

IJPCC 3,2

182

Figure 6. Sequence diagram for mapping from BEM to BPM

(6)

Ask the rules engine to generate a more accurate process model according to the business partner list and the information flow list. The rules engine has been given several rules, some of which are:

Rule-1: If the source of the first information flow and the destination of the last information flow are both BP1, then generate a RECEIVE action with the source of BP1 at the beginning of the BPM and a REPLY action with the destination of BP1 at the end of the BPM.
Rule-2: If the destination of an information flow and the source of the next information flow are both BP1, then generate an INVOKE action with the destination of BP1.
Rule-3: As a general rule, insert an ASSIGN action before each INVOKE action.
Rule-4: If more than one information flow starts from BP1, and their sequence numbers are consecutive (N, N + 1, N + 2, ...), then generate a SWITCH action for BP1.

In the above algorithm, the time complexities of steps 1 and 2 are both O(1), and those of steps 3 and 4 are both O(|BPS|). Step 5 scans all the information flows, so its time complexity is O(|VXS|). Step 6 applies every mapping rule exactly once. In the worst case, each mapping rule needs to access the whole of BPS and VXS, so the time complexity of step 6 is O(m(|BPS| + |VXS|)), where m is the number of mapping rules. Each business participant is involved in at least one information flow and each flow has only two endpoints, so |BPS| ≤ 2|VXS|, and the time complexity of step 6 simplifies to O(m|VXS|). Furthermore, m can be considered a constant, so the time complexity of the whole algorithm is O(|VXS|), which means that the algorithm is of linear time complexity.

As mentioned above, each modeling element in BFF is exactly a counterpart of the concept "activity" in the BPEL language, so the BPM can be easily compiled into BPEL script files. In general, two script files are required to deploy a new service: a BPEL file describing the internal process and a WSDL file describing the external interface. Different BPEL engines conform to the same BPEL specification, so a BPEL file can be deployed on any BPEL engine without modification. However, these engines usually add proprietary extensions to the standard WSDL file.
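Returning to the mapping rules listed earlier in this section, the following Java fragment is a minimal sketch of how Rule-1 to Rule-3 might be applied to an ordered list of information flows. It is not the S-BMW rules engine: the class `MappingRulesSketch`, the `Flow` type and the `owner` parameter are hypothetical names introduced for illustration. In particular, the sketch assumes that flows returning to the participant that owns the process do not produce an INVOKE action, which the rules as quoted leave implicit.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of Rule-1 to Rule-3; all names here are hypothetical. */
final class MappingRulesSketch {

    /** One information flow between two business participants. */
    static final class Flow {
        final int seq;
        final String source;
        final String destination;

        Flow(int seq, String source, String destination) {
            this.seq = seq;
            this.source = source;
            this.destination = destination;
        }
    }

    /**
     * Maps flows (already sorted by sequence number) to a flat action list.
     * Assumption: 'owner' is the participant whose process is being modeled,
     * so a flow that merely returns to the owner does not trigger an INVOKE.
     */
    static List<String> map(List<Flow> flows, String owner) {
        List<String> actions = new ArrayList<>();
        if (flows.isEmpty()) {
            return actions;
        }
        Flow first = flows.get(0);
        Flow last = flows.get(flows.size() - 1);
        // Rule-1: the same participant starts and ends the exchange.
        boolean requestReply = first.source.equals(last.destination);
        if (requestReply) {
            actions.add("RECEIVE from " + first.source);
        }
        for (int i = 0; i + 1 < flows.size(); i++) {
            String partner = flows.get(i).destination;
            boolean roundTrip = partner.equals(flows.get(i + 1).source);
            if (roundTrip && !partner.equals(owner)) {
                actions.add("ASSIGN");             // Rule-3
                actions.add("INVOKE " + partner);  // Rule-2
            }
        }
        if (requestReply) {
            actions.add("REPLY to " + last.destination);
        }
        return actions;
    }
}
```

The single pass over the flow list mirrors the linear-time argument given above; Rule-4 (SWITCH) is deliberately left out of the sketch.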

To generate the BPEL script file, we only need to traverse the object tree of the BPM. Each node in the object tree generates its own BPEL script. After traversing the tree, we obtain a BPEL script for the whole process. Each node is visited exactly once, so the time complexity is O(n), where n is the number of nodes (activities) in the process. This is a linear time algorithm. To generate the WSDL file, we apply the algorithm shown in Figure 7.

(1) Generate standard namespaces in WSDL, including xmlns:xsd, xmlns:plnk and xmlns:wsdl, etc.
(2) For each third-party service invoked, generate a corresponding namespace named after the filename of the WSDL file describing the service.
(3) Generate message definitions for the variables defined in the BPM.
(4) Generate porttype definitions.
(5) Generate BPEL-specific partnerLinkType definitions.
(6) Generate platform-specific definitions.

In the above algorithm, the time complexities of steps 1, 2, 4 and 6 are all O(1), and that of step 3 is O(v), where v is the number of variables. Step 5 generates a partnerLinkType definition for each business partner, so its time complexity is O(|BPS|). As a result, the time complexity of this algorithm is O(v + |BPS|). This is also a linear time algorithm.

Figure 7. Sequence diagram for generating a WSDL file
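The BPEL generation step described earlier (one visit per node of the BPM object tree) can be illustrated with a short Java sketch. The `Node` type and the `toBpel` method are hypothetical, and the emitted XML is deliberately simplified: it uses BPEL activity names such as sequence, receive, invoke and reply as element names but omits the attributes and namespaces that a deployable script needs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: each tree node emits its own fragment, visited once. */
final class BpelTreeSketch {

    /** A BPM activity node; 'activity' doubles as the XML element name. */
    static final class Node {
        final String activity;
        final List<Node> children = new ArrayList<>();

        Node(String activity) {
            this.activity = activity;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Depth-first traversal: O(n) in the number of nodes. */
    static String toBpel(Node root) {
        StringBuilder out = new StringBuilder();
        emit(root, 0, out);
        return out.toString();
    }

    private static void emit(Node node, int depth, StringBuilder out) {
        indent(depth, out);
        if (node.children.isEmpty()) {
            // Leaf activity: a single self-closing element.
            out.append('<').append(node.activity).append("/>\n");
            return;
        }
        out.append('<').append(node.activity).append(">\n");
        for (Node child : node.children) {
            emit(child, depth + 1, out);
        }
        indent(depth, out);
        out.append("</").append(node.activity).append(">\n");
    }

    private static void indent(int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }
}
```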


According to the above analysis, the three algorithms in the business-driven solution are all linear time algorithms, so the solution is efficient.

6. Case study
We have developed a tool named S-BMW (Service-oriented Business Modeling Workshop) to support our idea. It has three main components: the BEM designer, the BPM designer and the BPM compiler, as shown in Figure 8. S-BMW runs as a plug-in on the open-source platform Eclipse (www.eclipse.org), so it can collaborate with other Eclipse-based SOA tools. For example, when S-BMW is integrated with Systinet Developer for Eclipse (www.systinet.com), we gain the ability to publish our web services to a UDDI registry and to retrieve interface descriptions of business partners' web services from the UDDI registry.

Developing a service-oriented enterprise application needs support from software and hardware infrastructures. IBM has provided the Store Integration Framework (IBM Store Integration Framework, 2004) as a service-oriented infrastructure for the retail industry. We give an example of developing a supply chain management service for a retailer using our solution, assuming that the retailer has already adopted the Store Integration Framework and that the IT systems of the retailer and its business partners are all exposed as web services.

Figure 9 shows a typical scenario for a retail supply chain. The stores ask the distribution center for an additional shipment of goods. The distribution center asks its suppliers for the products. The suppliers accept and reply with their quotes. The distribution center then tells the bank to pay the supplier from its bank account. The factory provides a series of web services. One of these services is for receiving orders, and the main part of its WSDL description is shown in Figure 10. The bank provides a series of web services as well. One of these services is for transferring an amount of money from one account to another, and the main part of its WSDL description is shown in Figure 11.

The business logic of this scenario can be modeled in the BEM designer as shown in Figure 12(a). Four business entities participate in this scenario, and there are six information flows with sequence numbers from 1 to 6. Using the defined mapping rules, we can build a rough BPM as shown in Figure 12(b). Some of the mapping rules applied in this example include:

(1) Rule-1: If the source of the first information flow and the destination of the last information flow are both BP1, then generate a RECEIVE action with the source of BP1 at the beginning of the BPM and a REPLY action with the destination of BP1 at the end of the BPM.

Figure 8. The architecture of S-BMW

Figure 9. An SCM scenario for retail industry

Figure 10. Service description for the factory

In this case, the first information flow (with sequence number one) comes from the store, and the last information flow (with sequence number six) goes to the store as well, so in the BPM the rules engine generated a RECEIVE action at the beginning and a REPLY action at the end.

(2) Rule-2: If the destination of an information flow and the source of the next information flow are both BP1, then generate an INVOKE action with the destination of BP1. In this case, the No. 2 information flow goes to the factory, and the No. 3 information flow comes from the factory, so in the BPM the rules engine generated an INVOKE action, meaning that the distribution center invokes a service provided by the factory. Applying the same rule, the rules engine generated another INVOKE action for the Nos. 4 and 5 information flows between the distribution center and the bank.

(3) Rule-3: As a general rule, insert an ASSIGN action before each INVOKE action. Before a service is invoked, values usually have to be assigned to the parameters of the invoked web service. So in this case the rules engine inserted an ASSIGN action before each INVOKE action.

Figure 11. Service description for the bank

Figure 12. Modeling in S-BMW

Finally, applying the algorithms in section 5, the BPM compiler can generate the deployable script file, as shown in Figure 13(a).

7. Summary
This paper proposed a new approach for business-driven modeling and implementation of enterprise applications. In this approach, the flexible and expressive business environment model is based on the use case diagram and the sequence diagram in UML, and takes advantage of both. The BFF for business process modeling is a variation of the normal flowchart, and is more concise and more suitable for service-oriented modeling. One advantage of BFF is that all levels of detailed information can be shown in the same panel, whereas a normal flowchart would need a three-dimensional view. This paper also showed that the model mapping algorithms are of linear time complexity. Compared to other approaches, this approach makes it much easier to capture the business requirements, and thus the constructed enterprise applications are closer to the real business. Furthermore, this approach is implemented on the open platform Eclipse, so it can be integrated with other SOA tools to provide a total solution for enterprises.

Further studies may include automatic service selection using semantic web services technology at design time and a rules engine at runtime. In addition, further study of the mapping rules may help generate much more accurate BPMs.

Figure 13. Generated deployable files by S-BMW

References
Agarwal, V., Dasgupta, K., Karnik, N., Kumar, A., Kundu, A., Mittal, S. et al. (2005), "A service creation environment based on end to end composition of web services", Proceedings of the 14th International World Wide Web Conference, Chiba, 10-14 May, ACM Press, New York, NY, pp. 128-37.
Business Process Execution Language for Web Services, Version 1.1 Specification [online] (2003), May, available at: ftp://www6.software.ibm.com/software/developer/library/ws-bpel.pdf (accessed 15 November 2005).
Cao, J., Zhang, S. and Li, M. (2005), "A goal driven and process reuse based web service customization model", Chinese Journal of Computers, Vol. 28 No. 4, pp. 721-30.
Danesh, A. and Kock, N. (2005), "An experimental study of process, representation approaches and their impact on perceived modeling quality and redesign success", Business Process Management Journal, Vol. 11 No. 6, pp. 724-35.
Fang, J., Hu, S., Han, Y. and Liu, C. (2005), "A service virtualization mechanism for business end programming", Chinese Journal of Computers, Vol. 28 No. 4, pp. 549-57.
Gordijn, J., Akkermans, H. and van Vliet, H. (2000), "Business modelling is not process modelling", Conceptual Modeling for E-Business and the Web – ER 2000 Workshops on Conceptual Modeling Approaches for E-Business and The World Wide Web and Conceptual Modeling, Salt Lake City, UT, October.
He, H. (2003), "What is service-oriented architecture", 30 September, available at: http://webservices.xml.com/pub/a/ws/2003/09/30/soa.html (accessed 15 November 2005).
IBM Store Integration Framework (2004), May, available at: www-03.ibm.com/industries/retail/doc/content/bin/SIF.pdf (accessed 15 November 2005).
Li, H., Han, Y., Hu, S., Shan, B. and Liang, Y. (2005a), "An approach to constructing service-oriented and event-driven application dynamic alliances", Chinese Journal of Computers, Vol. 28 No. 4, pp. 739-49.
Li, M., Wang, D., Du, X. and Wang, S. (2005b), "Dynamic composition of web services based on domain ontology", Chinese Journal of Computers, Vol. 28 No. 4, pp. 644-50.
Liang, Y. (2002), "Generation of object models for information systems from business system models", Proceedings of the 8th International Conference on Object-Oriented Information Systems, Montpellier, France, 2-5 September, Springer-Verlag, London, pp. 255-66.
Machiraju, V., Bartolini, C. and Casati, F. (2004), "Technologies for business-driven IT management", 7 June, available at: www.hpl.hp.com/techreports/2004/HPL-2004-101.pdf (accessed 15 November 2005).
Miller, J.P., Bauer, B. and Friese, T. (2003), "Programming software agents as designing executable business processes: a model-driven perspective", First International Workshop on PROMAS, Melbourne, 15 July.
Miller, J. and Mukerji, J. (2003), "MDA Guide Version 1.0.1", 1 June, available at: www.omg.org/docs/omg/03-06-01.pdf (accessed 15 November 2005).
Mukhi, N.K., Konuru, R. and Curbera, F. (2004), "Cooperative middleware specialization for service oriented architectures", Proceedings of the 13th International World Wide Web Conference on Alternate Track Papers and Posters, New York, 17-22 May, ACM Press, New York, NY, pp. 206-15.
Ni, M., Xu, F. and Shen, Y. (2005), "The complicated system model for implementing enterprises informatization based on IDEF0 theory", System Engineering, Vol. 23 No. 3, pp. 69-74.
Ruth Sara, A.-S. (2004), "Business process modeling: review and framework", International Journal of Production Economics, Vol. 90, pp. 129-49.
Shin, H., Kim, H.-K. and Shim, B. (2005), "Development of business rule engine and builder for manufacture process productivity", Knowledge-Based Intelligent Information and Engineering Systems: 9th International Conference, Melbourne.
Tsoi, S.K., Cheung, C.F. and Lee, W.B. (2003), "Knowledge-based customization of enterprise applications", Expert Systems with Applications, Vol. 25, pp. 123-32.
Web Services Choreography Description Language Version 1.0 (2005), November, available at: www.w3.org/TR/ws-cdl-10/ (accessed 26 December 2005).
Xiong, J.H., Li, H., Han, Y. and Geng, H. (2004), "VINCA: a business-end programming language for just-in-time application construction", Journal of Computer-Aided Design and Computer Graphics, Vol. 16 No. 2, pp. 180-5.
Zhao, J., Xie, B., Zhang, L. and Yang, F. (2005), "A web services composition method supporting domain feature", Chinese Journal of Computers, Vol. 28 No. 4, pp. 731-8.

Corresponding author
Weili Han can be contacted at: [email protected]


The current issue and full text archive of this journal is available at www.emeraldinsight.com/1742-7371.htm


JDGC: an integrated decentralized distributed computing platform for Java program


Tay Teng Tiow, Chu Yingyi and Sun Yang Department of Electrical and Computer Engineering, National University of Singapore, Singapore

Received 27 December 2005
Revised 8 May 2006

Abstract
Purpose – To utilize the idle computational resources in a network to collectively solve middle to large problems, this paper aims to propose an integrated distributed computing platform, Java distributed code generating and computing (JDGC).
Design/methodology/approach – The proposed JDGC is fully decentralized in that every participating host is identical in function. It allows standard, single machine-oriented Java programs to be transparently executed in a distributed system. The code generator reduces the communication overhead between runtime objects based on a detailed analysis of the communication affinities between them.
Findings – The experimental results show that JDGC can efficiently reduce the execution time of applications by utilizing the networked computational resources.
Originality/value – JDGC releases the developers from any special programming considerations for distributed environment, and solves the portability problem of using system-specific programming methods.
Keywords Java, Programming
Paper type Research paper

International Journal of Pervasive Computing and Communications, Vol. 3 No. 2, 2007, pp. 190-204, © Emerald Group Publishing Limited, 1742-7371, DOI 10.1108/17427370710847318

1. Introduction
Distributed computing, or grid computing, is an emerging platform for many computation-intensive applications. These systems utilize the idle computational resources in a network to collectively solve large problems. The systems of Neary et al. (2000b) and Nisan et al. (1998) are examples. Most computing schemes proposed in previous works have a centralized architecture in which the mapping of requests to supplies of computational resources is performed in, or with the aid of, dedicated components. For example, in Christiansen (1997) a broker component is used to distribute the tasks submitted by clients. In Lewis and Grimshaw (1996), a set of special objects is responsible for management functions such as naming and binding services. These dedicated components are usually deployed on one or more dedicated machines that serve as controllers.

Centralized systems with dedicated controllers are usually employed to handle ultra-large tasks, while our system targets middle to large tasks. Such a task needs the processing ability of several or tens of computers. As the number of nodes involved in the computation of one task is not very large, one common PC can manage all the computational resources needed, so dedicated controllers are not necessary. On the other hand, the number of tasks and the number of participants in a network may change dynamically over time, and a centralized architecture cannot scale well with these changes. When the number of tasks and participants is small, the dedicated controllers may be a waste of resources. When the number becomes large, the controllers may become a bottleneck and fail to meet demand. Although some solutions to the scalability problem have been given in Neary et al. (2000a) and Cappello and Mourloukos (2001), they cannot solve this problem at the fundamental level because of their centralized nature.

In comparison, the scheme proposed in this paper has a fully decentralized architecture. In this scheme, the participants themselves, rather than dedicated controllers, manage the mapping of resource requests and supplies. To support this scheme, a set of network APIs based on a group communication protocol is proposed. Since there is no dedicated controller that manages the tasks and resources for the whole system, this task is distributed to every participating host. This may be viewed as a self-broking scheme, in the language of Neary et al. (2000), where cooperative computing may be seeded in any part of the network by any participating host. A key consideration in such a scheme is the reliability of the recruited nodes in getting the assigned tasks done. To address this concern, we adopt a design philosophy similar to that of the Internet Protocol, in that the performance of the recruited hosts is on a best-effort basis. The runtime environment of our platform implements progress monitoring and migration protocols to determine whether the minimum performance measure is met and, if not, to migrate the tasks assigned to an errant host to another appropriate host.

The application layer of any distributed computing platform provides two basic functions. Firstly, it provides a method to map computing requirements to available hosts. Secondly, it provides a method to produce distributed applications that run on the specific platform. For the first aspect, the platforms of Neary et al. (2000b) and Nisan et al. (1998) use web browser and Java Applet techniques to distribute computations in the systems. The proposal in Neary et al. (2000) is based on Java applications and operates in client/server mode. In our proposal, the nodes operate in peer-to-peer mode. The assignment of computations is implemented in the distributed application itself, which is generated based on the communication affinities within the application.

For the second aspect, support for the development of distributed applications is an important issue. In proposals such as Brecht et al. (1996) and London (1998), new language features were introduced to facilitate programming for individual platforms. Other proposals such as Fahringer (2000) and Baratloo et al. (1999) make use of an existing language such as Java, but require compliance with some specific programming paradigms. In these systems, the production of distributed code is done by programmers using system-specific programming rules. In our system, the distributed code generation function automatically ports standard concurrent programs running on a single machine to a form that runs on the network. The method was proposed in our previous work (Tay et al., 2005) and is summarized in this paper within the context of the whole system.

Security is another important issue in an open distributed computing system, especially for the proposed fully decentralized scheme. Most previous proposals provide some discussion and solutions on this aspect. However, there are several different aspects to the security of a distributed computing platform, each of which has its own solutions. In this paper, we provide a systematic discussion of security in the context of our platform. The security problems are classified and the corresponding solutions are discussed in detail.

The rest of this paper is organized as follows. Section 2 introduces the architecture of the Java distributed code generating and computing (JDGC) platform. Sections 3 and 4 detail the network layer and the middle layer, respectively. The application layer of the JDGC platform is described in section 5. Section 6 provides a systematic discussion of the security issues and their solutions. Section 7 presents sample applications on the system. Section 8 concludes the paper.

Platform for Java program


2. System overview
The JDGC platform is designed for Java applications. This allows machines with different architectures and operating systems to participate in the open distributed computing system. As mentioned in section 1, the JDGC platform has a fully decentralized architecture. The platform consists of all participating peers or hosts. The hosts involved in the computation of an application constitute a dynamic cluster that a participating host can join or leave without the need to register its presence or absence with a "controller". Multiple such clusters can coexist in the system at a time. In order to adapt to changing computation tasks, these hosts can be automatically reorganized into new clusters.

All participating hosts are equipped with the same JDGC software system. The JDGC software system consists of three layers, namely, the network layer, the middle layer and the application layer.

The network layer is the base of the whole JDGC platform. It contains the low-level network routines used in resource management and application object communication. These routines can be classified into two sets. The first set is used to discover the computational resources requested by applications and is based on a group communication protocol. The second set implements the communications between application objects using a unicast protocol and is incorporated into the distributed code running on the platform.

The middle layer of the JDGC system provides a runtime environment for the distributed code. This environment supports simultaneous execution of multiple applications. All information about the current applications, objects and hosts is maintained. The states of the application objects are backed up, and can be automatically recovered without user intervention.

The application layer consists of a graphical user interface and a distributed code generator.
The distributed code generator preprocesses standard Java input programs, automatically transforming them into code suitable for execution in a distributed environment. The distributed code is then submitted to the runtime application manager at the middle layer. The above components and their relationships are illustrated in Figure 1.

Figure 1. Architecture of JDGC platform

3. The network layer
To enable the distributed code to run on the platform, we need to discover available hosts, create objects on them, and handle the inter-machine or intra-machine communication between objects. The network layer provides the mechanisms and APIs to achieve these functions.

3.1 Recruitment protocol using multicast
The recruitment protocol provides the facility for a host to acquire available computational resources requested by the applications submitted to it. We call the host that initiates the resource request a client host and the host that responds to the request a server host. A client host recruits server hosts by invoking the Recruit( ) function. The number of hosts to be recruited can be specified by the user.

Figure 2 shows the state transition diagram of the client. When the function is invoked, a multicast packet is sent to the dedicated multicast address. The module then goes into the WAIT_OFFER state and listens for replies at a specific network port. On receiving an offer packet, an accept packet is sent to the remote server and the variable OfferCount is increased by one. A "wait for acknowledge" thread is then created, which waits in the WAIT_ACK state. If the appropriate acknowledge is not received within a period of time (i.e. the acknowledge packet is lost), an accept packet is resent to the server. Once the desired acknowledge is received, the variable ServerCount is increased by one, the remote server is added to the array AvailableServers, and the thread exits. Note that there can be at most HostCount mutually exclusive threads waiting in the WAIT_ACK state at the same time. Once the variable OfferCount equals HostCount (i.e. enough offers have been received), the recruit module goes into the ENOUGH_OFFER state. Any offer received in this state is rejected. Once ServerCount equals HostCount, the recruit function goes to the END state and returns the array AvailableServers. During the recruitment process, RecruitTimeout is set to make sure that the recruitment process does not wait indefinitely in the WAIT_ACK and ENOUGH_OFFER states.
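The client-side transitions described above can be sketched as a small event-driven state machine in Java. This is an illustration under stated assumptions rather than the JDGC implementation: the class `RecruitClientSketch` and its methods are hypothetical, the per-offer "wait for acknowledge" threads are modeled as a set of pending servers, and networking, retransmission and the timeout are omitted.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, network-free model of the client side of Recruit(). */
final class RecruitClientSketch {

    enum State { WAIT_OFFER, ENOUGH_OFFER, END }

    private final int hostCount;                                  // hosts requested
    private int offerCount;                                       // offers accepted so far
    private final Set<String> waitingForAck = new HashSet<>();    // models WAIT_ACK threads
    private final List<String> availableServers = new ArrayList<>();
    private State state = State.WAIT_OFFER;

    RecruitClientSketch(int hostCount) {
        if (hostCount < 1) {
            throw new IllegalArgumentException("hostCount must be positive");
        }
        this.hostCount = hostCount;
    }

    /** Handles an offer; returns true if an accept is (re)sent, false if rejected. */
    boolean onOffer(String server) {
        if (waitingForAck.contains(server)) {
            return true;              // duplicate offer: the accept would simply be resent
        }
        if (state != State.WAIT_OFFER) {
            return false;             // ENOUGH_OFFER or END: reject further offers
        }
        offerCount++;
        waitingForAck.add(server);
        if (offerCount == hostCount) {
            state = State.ENOUGH_OFFER;
        }
        return true;
    }

    /** Handles an acknowledge from a server that was sent an accept. */
    void onAck(String server) {
        if (!waitingForAck.remove(server)) {
            return;                   // unexpected or repeated acknowledge: ignore
        }
        availableServers.add(server);
        if (availableServers.size() == hostCount) {
            state = State.END;        // ServerCount == HostCount
        }
    }

    State state() {
        return state;
    }

    List<String> availableServers() {
        return new ArrayList<>(availableServers);
    }
}
```

Here the size of `availableServers` plays the role of ServerCount, so the two quantities cannot drift apart.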
If a timeout occurs, an error flag is set when the recruitment process returns.

If a host is willing to contribute its computational resources, it starts a daemon to listen for requests. Figure 3 shows the states of a server daemon. There are three states: IDLE, OFFER_SENT and TIME_WAIT. Upon invocation, the daemon waits in the IDLE state and listens on the dedicated multicast address. Once a recruit message is received, an offer packet is sent to the recruiting client and the server daemon waits for the response in the OFFER_SENT state. If the response is not received within a period of time or a denial is received, the daemon returns to the IDLE state. On the other hand, if the response is affirmative, an acknowledge message is sent and the daemon goes to the TIME_WAIT state. The daemon remains in this state, instead of exiting immediately, after it sends the acknowledge message. This ensures that if the acknowledge message is lost, the daemon can retransmit it to the client on receiving the resent accept packet. The remote server remains in this state for about twice the maximum segment lifetime or until the daemon receives other messages from the client.

The recruitment protocol uses 4-way handshaking. After sending an accept packet to a server, the client still needs to receive the ACK packet from the server before distributing objects to it. Otherwise the objects may be mistakenly sent to a server that has not received the accept packet and has returned to the IDLE state because of the timeout.

Figure 2. Recruit() state transition diagram: the client side

Figure 3. Recruit() state transition diagram: the server daemon

3.2 Unicast-based protocol
The other set of API functions in the network layer provides the facilities to create objects and invoke their methods in a distributed environment. The RemoteCreate( ) function creates an object on a specified networked machine. The RemoteInvoke( ) function enables the application code to invoke a method of an object on a remote machine. Both are incorporated into the distributed code that runs on the system.

The implementation of RemoteCreate( ) is based on the dynamic class loading and RMI protocol of Java (Pitt and McNiff, 2001). The distributed code invokes RemoteCreate( ), specifying the destination host and the class of the object. The bytecode of this class is then transmitted to the host in the form of a Jar stream. After receiving the Jar stream, the destination Java virtual machine loads the bytecode in the stream with the RMIClassLoader and an object is created on the virtual machine. Using the RMI interface, an RObject is created as a proxy of the real object on the destination virtual machine.
After an object is created, the distributed code is able to invoke its methods through the reference of RObject using the RemoteInvoke( ) function.

To facilitate object backup and crash recovery, the network layer provides the Checkpoint( ) function. The implementation of this function is based on the Object Serialization of Java (Darby, 1997). The distributed code invokes this function, specifying a reference of RObject, to back up the corresponding remote object. The function sends a message to the destination virtual machine, which in turn uses the Object Serialization interface to transform the object into an array of bytes. The array records the current state of the object. The array is then returned to the object that invoked the Checkpoint( ) function.
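The serialization step behind Checkpoint( ) can be illustrated with the standard `java.io` object streams. The sketch below only shows how an object is turned into a byte array and restored from it; the class `CheckpointSketch`, its two static methods and the `Counter` example type are hypothetical, and the remote messaging that JDGC wraps around this step is not modeled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical sketch of snapshotting an object with Java Object Serialization. */
final class CheckpointSketch {

    /** Example application object; it must be Serializable to be checkpointed. */
    static final class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    /** Transforms the object's current state into an array of bytes. */
    static byte[] checkpoint(Serializable object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds an object from a snapshot, e.g. on a replacement host. */
    static Object restore(byte[] snapshot) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(snapshot))) {
            return in.readObject();
        }
    }
}
```

A snapshot taken this way is independent of later changes to the live object, which is the property that crash recovery relies on.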


4. The middle layer
The middle layer provides a runtime environment for distributed code.

4.1 The runtime application manager
The runtime application manager receives distributed code from the application layer. All received applications and recruited hosts are recorded in the runtime application manager of the local JDGC system. The application manager also records the current status of the dynamic system.

The main information used in the runtime application manager is contained in three kinds of nodes, namely application nodes, object nodes and host nodes. All these nodes are organized into two global cross-linked lists. One global list is the list of all application nodes. An application node links to all its object nodes. Each object node points to the host node where the object resides. The overall data structure is illustrated in Figure 4.

Each application node has an application ID that uniquely identifies an application in the local system. The affinity metrics obtained from the object-level analysis are maintained here and used in distributing the objects at run time.

Each object node has an object ID that identifies a certain object within an application. The node also contains a reference to the real object, which enables the manager to access the local or remote object. The backup information saved in the object nodes is used for crash recovery.

Each host node has a host ID that is global to the local system. This node also contains information about the remote host. One host node is maintained for each physical host. If different objects use the same host, they have references pointing to the same host node.

Figure 4. The data structure in runtime application manager

4.2 Backup and crash recovery
An important function provided by the runtime application manager is backup and crash recovery. The system backs up the state of all objects on the local host during the lifetime of the objects. The backup information is held in each object node. During an invocation, if the remote host crashes, does not respond or raises other exceptions, the system automatically retries the invocation a number of times. After that, the remote host or server daemon is considered crashed. The system then follows a procedure similar to the object creation procedure to select a host and to migrate the object using the backup information in the corresponding object node. After the object is created, the system performs the invocation again. All of this is done by the system automatically, without user intervention.

5. The application layer
The application layer provides a user interface (UI) for the platform. The UI is built on the Java Swing library. Users can transform their Java applications, submit the generated distributed code to the system, and manage their shared resources in this unified environment. A snapshot of the UI is shown in Figure 5.

The distributed code generation function in the application layer automatically ports a standard single-machine concurrent application to a form that runs on the networked machines. The object placement is based on a detailed analysis of communication affinities between the runtime objects.

5.1 Communication affinity analysis
The analysis of communication affinity is done first at the class level and then at the object level. Class-level analysis generates three-dimensional affinity metrics, which are needed by the subsequent object-level analysis. The first dimension is the identifier of each class. The second dimension is the entry point from which a class is accessed.
Figure 5. User interface

In an object-oriented program, the entry points of a class are the beginnings of its public methods. The third dimension is one of the visible objects in the entry method of the class. For a class A with entry method M, a visible object can be a member variable of A, a local variable defined in M or a parameter passed to M. Formally, an affinity index is associated with each 3-tuple (ClassFrom, EntryMethod, VisibleObjectReference). For each class A and entry method M, the class-level analyzer analyzes the communication from A to all the visible objects or, more specifically, to the reference of each visible object. The number of bytes to be transferred to the object within this method is used as the affinity index. The class-level analyzer analyzes the public methods in each class one at a time, calculating the affinity index for each 3-tuple. The procedure is shown in Figure 6.

The purpose of the object-level analyzer is to extract the affinity between runtime instances. In an object-oriented program, any potentially existing instance must be created by a certain statement in the class definition. In Java this is typically a new statement. An object-level affinity value is associated with any two new statements in the application, while a class-level affinity value is associated with a class, a method and a variable reference. The object-level analyzer maps the class-level affinity values to runtime objects. The analyzer first finds the main method of the application. It then steps through all possible control flows of the application. On encountering a new statement, the analyzer calculates the affinity of the newly created object. The procedure of the object-level analyzer is described in Figure 7.

The class-level and object-level analyses are not performed on JDK classes, because the JDK classes need to be installed on each participating machine that runs our system.
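As a small illustration of the class-level metrics, the 3-tuple (ClassFrom, EntryMethod, VisibleObjectReference) can be used as a key into a table of byte counts. The Java sketch below is only one possible representation: the class `ClassAffinitySketch`, its methods and the string encoding of the key are assumptions made here, and the static analysis that would produce the byte counts is not shown.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical table of class-level affinity indices keyed by 3-tuple. */
final class ClassAffinitySketch {

    private final Map<String, Long> index = new HashMap<>();

    /** Encodes the 3-tuple as one string key (an illustrative choice). */
    private static String key(String classFrom, String entryMethod, String visibleRef) {
        return classFrom + "#" + entryMethod + "->" + visibleRef;
    }

    /** Adds the bytes sent to 'visibleRef' by one statement inside the entry method. */
    void record(String classFrom, String entryMethod, String visibleRef, long bytes) {
        index.merge(key(classFrom, entryMethod, visibleRef), bytes, Long::sum);
    }

    /** Returns the accumulated affinity index, or 0 if nothing was recorded. */
    long affinity(String classFrom, String entryMethod, String visibleRef) {
        return index.getOrDefault(key(classFrom, entryMethod, visibleRef), 0L);
    }
}
```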


5.2 Distributed code generation
The tree transformer incorporates the object placement information derived from the object-level metrics into the abstract syntax tree (AST) of the original program. Firstly, the new statement will be reimplemented by a procedure that places the new object. According to the object-level affinity metrics, the procedure calculates the

Figure 6. Class-level analysis


Figure 7. Object-level analysis

affinity between the new object and every machine, that is, the sum of affinity values between the new object and every object in the machine. The code generator strives to reduce the communication overhead between objects. Inter-machine communications are much more expensive than intra-machine communications. The generator groups heavily communicating objects into the same machine so that they can use lightweight communication mechanisms. So, the new object will be placed in the machine that has the largest affinity value with the new object. That is, suppose the application recruits $m$ machines $M_1, M_2, \ldots, M_m$, the $i$th machine has the existing objects $O_{i,1}, O_{i,2}, \ldots, O_{i,n_i}$, and a new object $O_{new}$ is created. The machine to place $O_{new}$, denoted by $M_{place}$, satisfies:

$$\mathrm{Affinity}(O_{new}, M_{place}) = \max_{i = 1, 2, \ldots, m} \{ \mathrm{Affinity}(O_{new}, M_i) \},$$

where

$$\mathrm{Affinity}(O_{new}, M_i) = \sum_{k=1}^{n_i} \mathrm{Affinity}(O_{new}, O_{i,k}).$$

The affinity between two objects, $\mathrm{Affinity}(O_{new}, O_{i,k})$, is directly obtainable from the object-level affinity metrics. The network layer API will be used to do the inter-machine creation of a new instance which should be placed in a certain network machine. Secondly, the inter-object communication statements will be reimplemented to correctly handle intra- or inter-machine data delivery. Every communication statement

will be replaced by a procedure, which first decides whether the destination instance is currently located on the local machine or on a network machine. If it is local, the communication is done as normal. If it is on a network machine, the network layer API functions are used to handle the inter-machine communication. To facilitate the automatic transformation, a reference to a user-defined class is replaced by a reference to a wrapper class. The wrapper class hides the difference between a local and a remote object at compile time and at runtime. If the object is local, the corresponding wrapper just encapsulates the normal object in the same virtual machine. If the object is remote, the corresponding wrapper encapsulates a RemoteClient object, which is the reference to this remote object. The transformation procedure is shown in Figure 8. All generated sub-ASTs are parsed at generation time. The transformed AST is again attributed and directly ported to bytecode.
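The placement rule of this section (put the new object on the machine with the largest summed affinity) can be illustrated with a minimal sketch. It is not the JDGC implementation: the table layout, the function names and the sample values are hypothetical, and the affinity table is assumed to be already available from the object-level analysis.

```python
from typing import Dict, List

# Hypothetical object-level affinity table: bytes exchanged between two
# objects, keyed by the unordered pair of object names (illustrative values).
AffinityTable = Dict[frozenset, int]


def object_affinity(table: AffinityTable, a: str, b: str) -> int:
    """Affinity(a, b); pairs missing from the table are taken as 0."""
    return table.get(frozenset((a, b)), 0)


def machine_affinity(table: AffinityTable, new_obj: str, resident: List[str]) -> int:
    """Affinity(O_new, M_i): sum over the objects already placed on machine i."""
    return sum(object_affinity(table, new_obj, o) for o in resident)


def place(table: AffinityTable, new_obj: str, machines: Dict[str, List[str]]) -> str:
    """Return the name of the machine with the largest affinity to new_obj.

    max() returns the first maximal element, so ties go to the machine
    listed first (an arbitrary choice made for this sketch).
    """
    return max(machines, key=lambda m: machine_affinity(table, new_obj, machines[m]))


if __name__ == "__main__":
    table = {
        frozenset(("new", "a")): 100,
        frozenset(("new", "b")): 30,
        frozenset(("new", "c")): 80,
        frozenset(("new", "d")): 60,
    }
    machines = {"M1": ["a"], "M2": ["b", "c"], "M3": ["d"]}
    print(place(table, "new", machines))
```

In the sample data, M2 holds two moderately related objects whose affinities sum to more than the single strongly related object on M1, so the sum, not the largest pairwise value, decides the placement.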


6. Security issues and protection The distributed computing environment is an open system, where any host can participate in or quit the system at will. In such an environment, security is another important issue. In this section, we classify the major security issues in our system. We then discuss the self-signature integrity protection scheme used in our platform. 6.1 Major security issues There are two separate security issues in the discussed computing environment. The first is the protection of the server hosts from malicious programs. This aspect of the security problem is similar in nature to that of a multi-user computer system, where individual programs must be protected from other programs running in the same physical system. This problem has been investigated and addressed in the Java technology. The virtual machine concept proposed for Java provides a practical solution for this problem. The Java virtual machine has a bytecode verification

Figure 8. Code generation


mechanism (Leroy, 2001). Before loading the bytecode of a defined class, the code verifier checks whether it possibly poses security threats to the local system. The next line of defense is the Security Manager (Venners, n.d.), which is a single module that can perform runtime checks on dangerous operations of the distributed code. Code in the Java library consults the Security Manager whenever a dangerous operation is about to be attempted. According to a predefined security policy, the Security Manager is given a chance to restrict the operation by generating a security exception at runtime. These security solutions in Java are one important reason for choosing Java as the implementation technique of the JDGC platform. The second issue, which is the protection of the distributed program executing in a malicious environment (remote server), can be divided into two parts. The first part is secrecy, which requires the content of the data and the semantics of the code in the program to be shielded from the remote server. This risk is reduced somewhat by ensuring that no server receives the complete set of private information. Traditional encrypted and authenticated channels are also utilized to enforce this kind of protection. The second part is integrity, which requires untampered execution of the distributed program by the untrusted server. Violation of integrity must be detected as early as possible in order to recognize the malicious servers and to avoid using the wrong results. Our self-signature security protection scheme is focused on this aspect of the security problem.
6.2 Security protection scheme using self-signature
The key of the protection scheme is the insertion of a self-signature algorithm into the distributed bytecode. The execution of the original bytecode can be divided into several phases. A procedure that implements a self-signature algorithm is inserted into every phase of the program.
This is done in preprocessing of the distributed bytecode before distributing it to an untrusted server. In each phase of the execution, the transformed bytecode will perform both the workload computation and the signature calculation. At the end of each phase, the client host will request the server to serialize and return the current status of the execution. All intermediate variable values at this time are recorded in an image, and returned to the client. The signature calculated by the self-signature algorithm is also included in the image among other intermediate results of this phase. All the signatures can be recalculated at the client host to check for correctness. Any inconsistency of the signature indicates an integrity violation. The self-signature algorithm is a function f which calculates a signature value from a bytecode stream. In general, the function f implements an algorithm based on a set of parameters. In our scheme, the parameters of f are sent to the distributed bytecode at runtime, at the beginning of each phase (not at load time). The self-signature procedure first requests the function parameters from the client host. After receiving the parameters, it calculates the signature of the input bytecode using these parameters. Upon transmitting the parameters of the function f, the client host sets a deadline for the completion of the phase. The deadline is required to be longer than the ordinary execution time of the code, but shorter than the time needed to reverse engineer f. We use the Crypto-Computing technique (Sander and Tschudin, 1998; Lipton and Sander, 1997; Goldwasser and Micali, 1984) to encrypt the self-signature function such that the function cannot be determined in polynomial time. The technique provides mathematically provable security for encrypting a function. Although this method is not applicable to general algorithms, it is sufficient for our self-signature function.
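The per-phase protocol described above (fresh parameters, a deadline, and recalculation of the signature at the client) can be sketched as follows. This is only an illustration under strong assumptions: a keyed hash (HMAC-SHA256) stands in for the parameterized function f, whereas the scheme in the text encrypts f with Crypto-Computing; the function names and the layout of the returned image are hypothetical.

```python
import hashlib
import hmac
import os
import time


def signature(params: bytes, bytecode: bytes) -> str:
    """Stand-in for f: a keyed hash of the bytecode stream (assumption)."""
    return hmac.new(params, bytecode, hashlib.sha256).hexdigest()


def run_phase(params: bytes, bytecode: bytes, workload):
    """Server side: do the workload and compute the signature for this phase.

    The returned dict plays the role of the serialized image.
    """
    return {"result": workload(), "signature": signature(params, bytecode)}


def verify_phase(trusted_bytecode: bytes, remote_call, deadline_s: float) -> bool:
    """Client side: send fresh parameters, then check deadline and signature."""
    params = os.urandom(16)                  # parameters chosen at phase start
    start = time.monotonic()
    image = remote_call(params)              # server returns the phase image
    on_time = (time.monotonic() - start) <= deadline_s
    expected = signature(params, trusted_bytecode)   # recalculated locally
    return on_time and hmac.compare_digest(image["signature"], expected)


if __name__ == "__main__":
    code = b"\xca\xfe\xba\xbe phase-1"
    honest = lambda p: run_phase(p, code, lambda: 42)
    tampered = lambda p: run_phase(p, code + b"!", lambda: 42)
    print(verify_phase(code, honest, 5.0), verify_phase(code, tampered, 5.0))
```

Because the parameters are drawn anew for every phase, a server cannot precompute the signature before the phase starts; the deadline then limits the time available for analysing the signing procedure.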

7. Experiments and results
In this section, we show the experimental results with two sample applications that run on the JDGC platform. Application 1: Program MultT. This application performs multiplication of two large matrices of floating point numbers. Application 2: Program Curve. This application performs curve fitting for a series of data using the least-squares error criterion. The two applications are tested in different situations where one or more single-threaded and multi-threaded applications are submitted to the system. Execution times in the following tables are given in seconds.


7.1 Single-threaded applications
In this case, the computation is done within one object. Four identical applications are submitted simultaneously to the system. We increase the number of hosts in the system, and record the average execution time of the applications. For the program MultT, the order of the matrix is 1,000. For the program Curve, the size of the data series is 1,600, and the precision of parameters is 104. From Table I, we can see that the average execution time decreases as we increase the number of server hosts. When there are two hosts available, each of them receives two applications. In the case of four hosts, each application is sent to one host to be computed simultaneously. The execution time with n hosts is roughly equal to, but slightly longer than, 1/n of the execution time with one host. This overhead is due to the time of launching the applications on a networked machine.
7.2 Multi-threaded applications
In this case, the computation of an application is done in multiple concurrent objects that can be distributed to available hosts. The applications with 48 concurrent threads are submitted to the system with different numbers of hosts recruited. We also compare the execution times of the multi-threaded applications with those of a single-threaded application for the same problem. For the program MultT, the order of the matrix is 3,000. For the program Curve, the size of the data series is 6,400, and the precision of parameters is 106. Table II shows the execution times and speedups of a multi-threaded application. If the application is implemented as multi-threaded and runs on a single machine, it is slower than a single-threaded implementation. This is due to the context switching on the single machine, as well as the local communication between the multiple threads. When there are more hosts available, the multi-threaded application outperforms the single-threaded one running on a single machine. The speedups, however, are below the increase of the number of hosts.
The overhead is partly incurred by the time of transferring and initializing the objects on a networked machine. This cause is similar to the case of multiple single-threaded applications. On the other hand, for a multi-threaded application, the overhead is also due to the communications between the multiple threads of an application.

Table I. Average execution time of four single-threaded applications

No. of hosts | MultT (1,000) execution time | Curve (1,600; 104) execution time
1 | 421.974 | 975.951
2 | 217.812 | 488.902
4 | 115.123 | 260.160
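The statement that the time with n hosts is roughly, but slightly more than, 1/n of the one-host time can be checked directly against Table I. The short sketch below only redoes that arithmetic; the dictionary layout and the function name are illustrative.

```python
# Average execution times in seconds, copied from Table I.
TABLE_I = {
    "MultT": {1: 421.974, 2: 217.812, 4: 115.123},
    "Curve": {1: 975.951, 2: 488.902, 4: 260.160},
}


def ratio_to_ideal(times):
    """For each host count n, return measured time divided by T1 / n."""
    t1 = times[1]
    return {n: t / (t1 / n) for n, t in times.items()}


for name, times in TABLE_I.items():
    for n, ratio in ratio_to_ideal(times).items():
        print(f"{name}: {n} host(s), {ratio:.3f} x the ideal T1/n")
```

Every ratio is at least 1, and for these six measurements the excess over the ideal stays below 10 per cent.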


Table II. Average execution time of multi-threaded applications

Threads | No. of hosts | MultT (3,000) execution time | MultT speed-up | Curve (6,400; 106) execution time | Curve speed-up
Single | 1 | 3,512.217 | 1 | 5,798.150 | 1
Multi (48) | 1 | 3,537.823 | 0.99 | 5,814.292 | 0.997
Multi (48) | 2 | 1,801.137 | 1.95 | 2,958.240 | 1.96
Multi (48) | 4 | 921.842 | 3.81 | 1,506.013 | 3.85
Multi (48) | 8 | 474.624 | 7.40 | 775.154 | 7.48
Multi (48) | 16 | 245.781 | 14.29 | 394.770 | 14.69
Multi (48) | 32 | 132.787 | 26.45 | 206.119 | 28.16
Multi (48) | 48 | 103.483 | 33.94 | 141.868 | 40.87

7.3 Comparison of different object placement methods
We validated the effectiveness of our method by comparing it with two other object allocation methods, namely the random allocation and the manual (worst case) allocation. We used the same parameters as in the multi-threaded case. In this test the two applications are implemented as 16 concurrent threads. The number of recruited hosts is fixed to four. In the random case, every recruited host is randomly assigned four objects without considering their communication relations. In the manual case, the objects are manually allocated to the hosts according to the worst case of all combinations. The comparison of their execution times is summarized in Table III. The proposed method outperforms the random allocation and the manual allocation methods. The difference in performance is due to the different communication overheads of the different methods. In the worst case, objects that have the largest communication affinity are manually separated into different hosts, so that the overheads introduced are larger than in the other cases. In the random case, the execution time fluctuates between trials. The average execution time, however, lies between those of the manual method and the proposed method. These comparison experiments are done in a local area network. If the experiment is extended to a wide area network or the Internet environment, a larger difference between the three cases can be expected.

Table III. Comparison of different object allocation methods

Method | Threads/hosts | MultT (2,400) execution time | Curve (3,200; 106) execution time
Random | 16/4 | 473.956 | 380.758
Manual | 16/4 | 505.918 | 398.648
Proposed method | 16/4 | 431.518 | 344.585

8. Conclusions
The proposed JDGC platform is a fully decentralized integrated platform for distributed computing based on Java. It allows standard, single machine oriented Java programs to be transparently executed in a distributed computing environment. The automatic distributed code generation method in JDGC releases the developers from any special programming considerations for the distributed environment, and solves the portability problem of using system-specific programming methods for individual distributed computing platforms. The object-level analysis method, while motivated by our integrated system for object distribution, is nevertheless a general approach for extracting object-to-object communication affinity metrics and can also be used in other contexts. Our experimental results show that, in comparison with two other object placement methods, random and manual, the method based on object-level analysis achieves better performance.

References
Baratloo, A., Dasgupta, P., Karamcheti, V. and Kedem, Z. (1999), "Metacomputing with MILAN", Proceedings of the Heterogeneous Computing Workshop (HCW'99), International Parallel Processing Symposium (IPPS/SPDP 1999), April.
Brecht, T., Sandhu, H., Shan, M. and Talbot, J. (1996), "ParaWeb: towards world-wide supercomputing", Proceedings of the Seventh ACM SIGOPS European Workshop: Systems Support for Worldwide Applications, Connemara, 9-11 September.
Cappello, P. and Mourloukos, D. (2001), "A scalable, robust network for parallel computing", Proceedings of the ACM 2001 Java Grande Conference, Stanford University, California, 2-4 June.
Christiansen, B.O., Cappello, P., Ionescu, M.F., Neary, M.O., Schauser, K.E. and Wu, D. (1997), "Javelin: internet-based parallel computing using Java", Concurrency: Practice and Experience, Vol. 9 No. 11, pp. 1139-60.
Darby, C. (1997), "Object serialization in Java 1.1. Making objects persistent", WEB Techniques, Vol. 2 No. 9, pp. 55, 58-9.
Fahringer, T. (2000), "JavaSymphony: a system for development of locality-oriented distributed and parallel Java applications", Proceedings IEEE International Conference on Cluster Computing, pp. 145-52.
Goldwasser, S. and Micali, S. (1984), "Probabilistic encryption", Journal of Computer and System Sciences, Vol. 28 No. 2, April, pp. 270-99.
Leroy, X. (2001), "Java bytecode verification: an overview", Proceedings of 13th International Conference on Computer Aided Verification (CAV 2001) (Lecture Notes in Computer Science Vol. 2102), Paris, France, 18-22 July, pp. 265-85.
Lewis, M. and Grimshaw, A. (1996), "The core Legion object model", Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing (Cat. No.TB100069), Syracuse, NY, 6-9 August, pp. 551-61.
Lipton, R. and Sander, T. (1997), "An additive homomorphic encryption scheme or how to introduce a partial trapdoor in the discrete log", November.
London, S. (1998), "POPCORN – a paradigm for global-computing", MSc thesis, Institute of Computer Science, The Hebrew University of Jerusalem, Jerusalem, June.
Neary, M.O., Brydon, S.P., Kmiec, P., Rollins, S. and Cappello, P. (2000a), "Javelin++: scalability issues in global computing", Concurrency: Practice and Experience, Vol. 12 No. 8, pp. 727-53.
Neary, M.O., Phipps, A., Richman, S. and Cappello, P. (2000b), "Javelin 2.0: Java-based parallel computing on the internet", Euro-Par 2000, Munich, 29 August-1 September.
Nisan, N., London, S., Regev, O. and Camiel, N. (1998), "Globally distributed computation over the internet – the POPCORN project", International Conference on Distributed Computing Systems, Amsterdam, 26-29 May.
Pitt, E. and McNiff, K. (2001), Java.rmi: The Remote Method Invocation Guide, Addison-Wesley, Boston, MA, July.
Sander, T. and Tschudin, C.F. (1998), "Towards mobile cryptography", Proceedings of IEEE Symposium on Security and Privacy, Oakland, CA, 3-6 May.
Tay, T.-T., Chu, Y. and Sun, Y. (2005), "Distributed code generation using object level analysis", Proceedings of 5th International Conference on Computer and Information Technology (CIT2005), September, pp. 858-62.
Venners, B. (n.d.), "Java's security architecture", available at: www.javaworld.com/javaworld/jw08-1997/jw-08-hood.html

Corresponding author
Sun Yang can be contacted at: [email protected]


k-PCA: a semi-universal encoder for image compression

Encoder for image compression

Chuanfeng Lv Department of Electric and Information Engineering, The Beijing Institute of Technology, People’s Republic of China, and

Qiangfu Zhao Multimedia Device Laboratory, The University of Aizu, Aizu-Wakamatsu City, Japan

Received January 2005

Abstract
Purpose – In recent years, principal component analysis (PCA) has attracted great attention in dimension reduction. However, since a very large transformation matrix must be used for reconstructing the original data, PCA has not been successfully applied to image compression. To solve this problem, this paper aims to propose a new technique called k-PCA.
Design/methodology/approach – Actually, k-PCA is a combination of vector quantization (VQ) and PCA. The basic idea is to divide the problem space into k clusters using VQ, and then find a PCA encoder for each cluster. The point is that if the k-PCA encoder is obtained using data containing enough information, it can be used as a semi-universal encoder to compress all images in a given domain.
Findings – Although a k-PCA encoder is more complex than a single PCA encoder, the compression ratio can be much higher because the transformation matrices can be excluded from the encoded data. The performance of the k-PCA encoder can be improved further through learning. For this purpose, this paper proposes an extended LBG algorithm.
Originality/value – The effectiveness of the k-PCA is demonstrated through experiments with several well-known test images.
Keywords Data compression, Computer components, Encoders, Image processing
Paper type Research paper

1. Introduction
So far, many techniques have been proposed for image compression. These techniques can be roughly divided into two categories: predictive approaches and transformational ones. In brief, predictive approaches like differential pulse code modulation (DPCM) (Ching, 1973) and vector quantization (VQ) (Linde et al., 1980) try to predict a pixel or a block of pixels based on known data (already observed or previously stored). Usually, only local prediction is considered. For example, in DPCM, good prediction can be made even if the predictor is very simple because neighboring pixels are often highly correlated. In VQ, a block of pixels can be predicted very well using the nearest code word. Transformational approaches (Ahmed et al., 1974; Jolliffe, 1986) project the data into a domain which requires fewer parameters for data representation. Principal component analysis (PCA) (Jolliffe, 1986) is known as the optimal linear transformation for this purpose. Compared with VQ, which approximates each point in the problem space using a different code word, PCA approximates all points using linear combinations of the same set of basis vectors. Thus, we may consider VQ and PCA as two extreme cases. VQ is an extremely local approach, which approximates each point using only one point (the nearest code word), while PCA is an extremely global approach, which approximates all points using the same set of basis vectors.

The authors would like to thank the anonymous reviewers for their invaluable suggestions and comments.

International Journal of Pervasive Computing and Communications Vol. 3 No. 2, 2007 pp. 205-220 # Emerald Group Publishing Limited 1742-7371 DOI 10.1108/17427370710847327


Currently, PCA has been successfully adopted in many signal processing applications such as image processing, system control, communication, and pattern recognition. In all applications, PCA can be used to reduce the dimensionality of the problem space. PCA achieves dimension reduction by discarding the eigenvectors associated with small eigenvalues. However, since a very large transformation matrix consisting of the eigenvectors must be used to reconstruct the original data, PCA has not been successfully applied to image compression. Another transformation currently in use for image compression is discrete cosine transform (DCT) (Ahmed et al., 1974). Although DCT is not optimal, it is one of the most popular transforms, and has been used and studied extensively. The important feature of DCT is that it takes correlated input data and concentrates its energy in just the first few transformed coefficients. The advantage of using DCT is that we need only to preserve the transformed coefficients, since the transformation matrix is universal in the sense that it can be used to compress all images. In general, a PCA encoder built from one image cannot be used to compress other images because the eigenvectors obtained from one image cannot approximate other images well. In fact, even within the same image, the PCA encoder may not be able to approximate all image blocks equally well using a fixed set of eigenvectors. It may perform poorly in local regions containing edges or noises. To increase the approximation ability, some non-linear PCA approaches have been proposed in the literatures (Oja, 1982; Kung and Diamantaras, 1990; Dony, 1995, 1998). The basic assumption in these approaches is that different patterns should belong to different sub-spaces, and patterns in each sub-space can be approximated better by using a separate PCA encoder. 
Although these approaches can improve the performance of the PCA encoder, they are very time-consuming because the convergence speed of the learning process is very slow. Several improvements were given in Dony (1998), but they are far from being practically useful. To speed up the learning process, several ideas have been proposed in Roweis and Saul (2000), Tenenbaum et al. (2000), Scholkopf et al. (1998), and Kambhatla and Leen (1997). These ideas are very similar to each other, and have been used mainly for dimension reduction, in which the compression ratio is not the main concern. In fact, as stated earlier, PCA based approaches are not efficient for image compression because the transformation matrix corresponding to each image must be saved. One of the main contributions of this paper is to introduce the concept of semi-universal encoding. Using this concept, the compression ratio can be greatly improved because it is not necessary to save the transformation matrix in the compressed data. Another main contribution we make to this field is the proposal of a new technique called k-PCA for image compression. Actually, the original version of k-PCA was proposed in Lv and Zhao (2005), and this paper is an improved journal version of the same. Briefly, k-PCA is a combination of VQ and PCA. The basic idea is to perform a rough compression on the image data using PCA, then divide the problem space into k clusters using VQ, and finally find a set of PCA encoders for each cluster. Therefore, k-PCA is actually PCA-VQ-PCA. The point here is that if the training data contain enough information, we can construct a k-PCA encoder that can be used to compress all images in a given domain. That is, a k-PCA encoder is a semi-universal encoder. Of course, a k-PCA encoder is not truly universal because it cannot be used for compressing images of all domains. In this paper, an extended LBG learning algorithm is proposed to improve the generalization ability of the k-PCA, and more experimental results with more test images are provided to verify the effectiveness of the k-PCA. The paper is organized as follows: section 2 provides a short review of VQ and PCA, and introduces briefly the concept of mixture of principal components (MPC). In section 3, the k-PCA approach is first introduced, and the extended LBG algorithm is proposed. The proposed algorithm is verified through experiments in section 4, and section 5 is the conclusion.


2. Preliminaries
2.1 Vector quantization (VQ)
VQ is a lossy data compression technique. It achieves compression by representing continuous signals using discrete ones. The simplest example of VQ is the round-off operation, which replaces any value with the nearest integer. In general, there is a codebook corresponding to each VQ. The codebook is a set of codewords. For any given data x, which is a point in the problem space, VQ maps x to the nearest codeword y, and represents x using the position or the index of y in the codebook. The smaller the codebook is, the higher the compression ratio will be. However, when the codebook is too small, the quality of the decoded image will be very poor. To find the codebook of a VQ, we can use the LBG algorithm (Linde et al., 1980). The main steps of the LBG algorithm are as follows:

Step 1: Select a threshold value $\varepsilon$ ($> 0$), set $k = 1$ and set the mean of all input data (the training data) as the first codeword $C_1$.

Step 2: If $k$ is smaller than the pre-specified codebook size, continue; otherwise, terminate.

Step 3: Split each of the current codewords into two by duplicating it and adding a small noise; then double the value of $k$.

Step 4: Based on the current codebook, calculate the distortion, say $e_0$. For each codeword $C_i$ ($i = 1, 2, \ldots, k$), find the set $S_i$ as follows:

$$S_i = \{ x \mid d(x, C_i) = \min_{j = 1, 2, \ldots, k} d(x, C_j) \} \qquad (1)$$

where $x \in \Omega$, and $\Omega$ is the set of all input data.

Step 5: Re-calculate each codeword $C_i$ as the mean of the input data contained in $S_i$. Based on the new codewords, calculate the reconstructed distortion, say $e_1$. If $e_0 - e_1 < \varepsilon$, then replace $e_0$ with $e_1$, and return to Step 2; otherwise return to Step 4.
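As an illustration of Steps 1 to 5, the following is a minimal sketch of the LBG loop using only the Python standard library. It makes several assumptions that are not stated in the text: the distortion is the mean squared Euclidean error, the "small noise" of Step 3 is a fixed offset of plus or minus `noise` on every component (chosen so that runs are reproducible), an empty cell keeps its old codeword, and the target codebook size is a power of two.

```python
def dist2(a, b):
    """Squared Euclidean distance between two vectors given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def nearest(x, codebook):
    """Index of the codeword closest to x (also the VQ code of x)."""
    return min(range(len(codebook)), key=lambda i: dist2(x, codebook[i]))


def distortion(data, codebook):
    """Mean squared error between each input and its nearest codeword."""
    return sum(dist2(x, codebook[nearest(x, codebook)]) for x in data) / len(data)


def lbg(data, size, eps=1e-9, noise=1e-3):
    """Grow a codebook to `size` codewords (assumed to be a power of two)."""
    codebook = [mean(data)]                          # Step 1
    while len(codebook) < size:                      # Step 2
        codebook = [tuple(v + s * noise for v in c)  # Step 3: split every codeword
                    for c in codebook for s in (1, -1)]
        e0 = distortion(data, codebook)              # Step 4
        while True:
            cells = [[] for _ in codebook]
            for x in data:
                cells[nearest(x, codebook)].append(x)
            codebook = [mean(cell) if cell else c    # Step 5: move to cell means
                        for c, cell in zip(codebook, cells)]
            e1 = distortion(data, codebook)
            if e0 - e1 < eps:                        # converged for this size
                break
            e0 = e1
    return codebook


if __name__ == "__main__":
    samples = [(0.0,), (1.0,), (9.0,), (10.0,)]
    print(sorted(lbg(samples, 2)))
```

On the four one-dimensional samples, the single initial codeword is the global mean 5.0; after one split and refinement the two codewords settle at the means of the two groups.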

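The distortion is often defined as the mean squared error (MSE) given by

$$\mathrm{MSE} = \frac{1}{|\Omega|} \sum_{x \in \Omega} \| x - n(x) \|^2 \qquad (2)$$

where $\Omega$ is the set of all input data, $n(x)$ is the nearest codeword of the input $x$, and $\| \cdot \|$ is the Euclidean norm, so that $\| x - n(x) \|$ is the Euclidean distance between the two vectors. The distortion can also be defined as the peak signal to noise ratio (PSNR) as follows:

$$\mathrm{PSNR} = 10 \log_{10} \frac{f_{max}^2}{\mathrm{MSE}} \ (\mathrm{dB}) \qquad (3)$$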

where fmax is the maximum value of the image. For a gray scale image with eight bits per pixel, fmax is 255. Once the codebook is built, the encoding procedure is very simple. For each input datum x, find the nearest codeword y in the codebook. The index of y will be the code of x. The decoding procedure is also simple: read in the indices one by one, substitute each index with the codeword itself, and put it into the image in order. VQ is a piece-wise-linear approach. It approximates each point locally. It is locally linear but globally non-linear. It uses only one codeword for each input vector. In addition, VQ is a pure discrete representation of the data, and thus can achieve high compression ratio. There are several problems with the use of VQ. The first is the so-called trade-off relation between the compression ratio ðCr Þ and the fidelity. For example, in order to increase Cr , the codebook size needs to be reduced but the fidelity will be decreased. To resolve this problem, we have proposed an iterated function system (IFS) based algorithm in Lv and Zhao (2004), and Lv (2004). The second problem encountered is the computational cost for building the codebook, which has become a bottleneck for applying VQ. This is actually one of the major research topics for improving VQ. Another drawback in using VQ is the high computational cost for searching the codebook for encoding. Several algorithms have been proposed to reduce the computational cost and to accelerate the encoding process (Chang et al., 1992; Gray and Linde, 1982; Torres and Huguet, 1994; Chang et al., 1997). 2.2 Principal component analysis Principal component analysis (PCA), also known as Karhunen-Loeve transformation (KLT) in the context of communication theory, is an important technique for data compression because of its excellent capability that can grasp the correlations between different observations of the signals. 
The PCA maps the observations into another space wherein all variables become uncorrelated with each other. Because all variables are uncorrelated with each other, the redundancy between observations can be removed, and thus very efficient encoding can be expected. In linear algebra, PCA can be formulated as an eigenvector decomposition problem. Actually, PCA is implemented by solving the following equation:

$$R q = \lambda q \qquad (4)$$

where $R$ is the correlation matrix of the input data, $\lambda$ is an eigenvalue of $R$, and $q$ is the corresponding eigenvector. If the problem space is $N$ dimensional, we can have $N$ possible solutions for the vector $q$. The $N$ solutions $q_1, q_2, \ldots, q_N$ are called the principal components of $R$. For any observation $x$, its projection to the $j$th principal component is given by

$$a_j = \langle x, q_j \rangle, \quad j = 1, 2, \ldots, N \qquad (5)$$

where $\langle \cdot, \cdot \rangle$ is the inner product. To reconstruct the original data, we simply have

$$x = \sum_{j=1}^{N} a_j q_j. \qquad (6)$$

Usually, some of the eigenvalues are very small, and the corresponding eigenvectors can be omitted in equation (6). This is the basic idea for data compression using PCA.

The more eigenvectors we omit, the higher the compression ratio we can obtain. Usually, the eigenvalues correspond to the omitted eigenvectors are not zero. Therefore, PCA is also a lossy compression technique. Instead of solving equation (4), researchers have also tried some other methods. In 1982, Oja proposed a self organized neural network with constrained Hebbian learning rule that can extract the principal components from stationary input data (Oja, 1982). Thereafter, there has been increasing interests in the study of connections between PCA and neural networks. A symmetrical multilayer perceptron (MLP) neural network with the back propagation algorithm in supervised auto-associative memory, have been shown closely connected to PCA (Carrato, 1982). Sanger’s generalized Hebbian algorithm (GHA) (Sanger 1989) extended Oja’s single model to M principal components. Kung and Diamantara (1990) proposed an adaptive principal component extraction (APEX) model, in which the output of the mth principal component can be calculated based on the previous m  1 components. The most important advantage of neural network based PCAs is that they are good for learning with changing data. However, the learning process is usually very time-consuming, and this is the main reason why these approaches have not been applied successfully to image compression.


2.3 Mixture of principal components

One drawback of PCA is that it is a linear approach: it cannot approximate all areas of an image equally well. In other words, a single PCA cannot simultaneously capture the features of all regions. To solve this problem, the mixture of principal components (MPC) has been studied (Dony, 1995, 1998). The basic idea of MPC is to partition the problem space into a number of sub-spaces and to find a PCA encoder for each sub-space. Dony (1995) proposed an optimal adaptive transform coding method in which the whole encoder is composed of a number of GHA neural networks. Each GHA neural network is expected to become a PCA encoder after learning. Figure 1 illustrates how the appropriate GHA neural network is selected to learn from the current input vector.

Figure 1. Selection of the best GHA for a given input

The training algorithm is given as follows:

Step 1: Initialize (at random) k transformation matrices $W_1, W_2, \ldots, W_k$, where $W_j$ is the weight matrix of the jth GHA neural network.

Step 2: For each training input vector x, classify it into the ith sub-space if

\[ \| P_i x \| = \max_{j} \| P_j x \|, \quad \text{where } P_i = W_i^{T} W_i, \tag{7} \]

and update the weights according to the following rule:

\[ W_i^{\mathrm{new}} = W_i^{\mathrm{old}} + \alpha Z(x, W_i^{\mathrm{old}}) \tag{8} \]

where $\alpha$ is the learning rate and Z is the GHA learning rule, which converges to the principal components.

Step 3: Repeat the above training procedure until some terminating condition is satisfied.
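The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the update Z is written as Sanger's rule, $Z(x, W) = y x^T - \mathrm{LT}[y y^T] W$ with $y = W x$, and the function names, the learning rate value and the fixed iteration budget are hypothetical choices rather than the settings used by Dony (1995).

```python
import numpy as np

def select_subspace(x, weights):
    """Step 2, equation (7): index i that maximizes ||P_i x|| with P_i = W_i^T W_i."""
    norms = [np.linalg.norm(W.T @ (W @ x)) for W in weights]
    return int(np.argmax(norms))

def gha_update(x, W):
    """Sanger's GHA rule Z(x, W) = y x^T - LT[y y^T] W, where y = W x.

    W has one row per component; LT keeps the lower triangle including the diagonal.
    """
    y = W @ x
    return np.outer(y, x) - np.tril(np.outer(y, y)) @ W

def train_mpc(data, k, m, alpha=0.01, iterations=2000, seed=0):
    """Train k GHA networks with m components each on the rows of `data`."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    # Step 1: small random initial weight matrices, one per sub-space.
    weights = [rng.normal(scale=0.1, size=(m, dim)) for _ in range(k)]
    # Step 3: here the terminating condition is simply a fixed iteration budget.
    for _ in range(iterations):
        x = data[rng.integers(len(data))]
        i = select_subspace(x, weights)
        weights[i] = weights[i] + alpha * gha_update(x, weights[i])   # equation (8)
    return weights
```

Every input requires k projections before a single weight matrix is updated, which makes the cost per training sample grow with both k and the block dimension.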

In Dony (1995), the training parameters were as follows: the number of sub-spaces is 64 and the number of training iterations is 80,000. Clearly, building an MPC encoder is very time-consuming. For each input datum, we must find the sub-space it belongs to and then update the corresponding weight matrix. In Dony (1998), several methods were proposed to speed up the training process and decrease the distortion. These methods include growth by class insertion, growth by component addition and a tree-structured network. The key point is that, whatever method we use, training converges very slowly.

The purpose of this study is to propose a semi-universal encoder based on PCA. By "semi-universal" we mean that the encoder can be used to compress all images in a given domain (say, medical images or face images). Clearly, standard PCA cannot be used because it produces one transform based on one image, and this transform is optimal only for that image. As stated earlier, PCA is a linear technique and cannot approximate a large number of non-linear input patterns well. In this sense MPC is better, but not significantly better for our purpose, because its training process is very time-consuming.

2.4 Discrete cosine transform

Discrete cosine transform (DCT) (Ahmed et al., 1974; Salomon, 2004) is another transform-based image compression technique. Usually, the value at each pixel of an image is closely correlated with the values of its neighbors. DCT de-correlates the input data and concentrates the energy in the low-frequency domain (the top-left corner of the transformed image). The two-dimensional DCT and inverse DCT (IDCT) are given by

\[ G_{ij} = \frac{1}{\sqrt{2n}} C_i C_j \sum_{x=0}^{n-1} \sum_{y=0}^{n-1} p_{xy} \cos\left[\frac{(2y+1) j \pi}{2n}\right] \cos\left[\frac{(2x+1) i \pi}{2n}\right] \tag{9} \]

\[ p_{xy} = \frac{1}{\sqrt{2n}} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} C_i C_j G_{ij} \cos\left[\frac{(2y+1) j \pi}{2n}\right] \cos\left[\frac{(2x+1) i \pi}{2n}\right] \tag{10} \]

\[ \text{where } C_f = \begin{cases} 1/\sqrt{2}, & f = 0 \\ 1, & f > 0 \end{cases} \]

and n × n is the image block size, typically n = 8. The two-dimensional DCT can be interpreted as a transform of the original data into a new data space spanned by the n × n basis functions defined by equation (9). A graphic representation of the 8 × 8 basis is shown in Figure 2. The horizontal frequency increases from left to right, and the vertical frequency increases from top to bottom. The constant-valued basis function at the upper left is often called the DC basis function, and the corresponding DCT coefficient is often called the DC coefficient. The others are called AC basis functions and AC coefficients, respectively. The procedure for encoding an image with DCT is summarized as follows:

Step 1: Divide the image into small blocks of n × n pixels each.

Step 2: Apply DCT to each small block, and preserve the transformed coefficients that are significant.

Step 3: Quantize the preserved coefficients and write them to the compressed stream.
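As an illustration of equations (9) and (10), the sketch below evaluates the transform directly for one block in plain Python. For n = 8 the factor 1/sqrt(2n) equals 1/4, and the inverse then recovers the block exactly up to rounding. The helper `keep_largest` is an assumed reading of "significant" in Step 2 (largest magnitude); the text does not state which selection rule was actually used, and the function names are hypothetical.

```python
import math

def c(f):
    """Normalization factor C_f from equations (9) and (10)."""
    return 1 / math.sqrt(2) if f == 0 else 1.0

def dct2(block):
    """Equation (9): forward 2-D DCT of an n x n block indexed as block[x][y]."""
    n = len(block)
    scale = 1 / math.sqrt(2 * n)
    return [[scale * c(i) * c(j) * sum(
                 block[x][y]
                 * math.cos((2 * y + 1) * j * math.pi / (2 * n))
                 * math.cos((2 * x + 1) * i * math.pi / (2 * n))
                 for x in range(n) for y in range(n))
             for j in range(n)] for i in range(n)]

def idct2(coeffs):
    """Equation (10): inverse 2-D DCT, returning the block indexed as [x][y]."""
    n = len(coeffs)
    scale = 1 / math.sqrt(2 * n)
    return [[scale * sum(
                 c(i) * c(j) * coeffs[i][j]
                 * math.cos((2 * y + 1) * j * math.pi / (2 * n))
                 * math.cos((2 * x + 1) * i * math.pi / (2 * n))
                 for i in range(n) for j in range(n))
             for y in range(n)] for x in range(n)]

def keep_largest(coeffs, count):
    """Assumed Step 2 rule: zero all but the `count` largest-magnitude coefficients."""
    ranked = sorted(((abs(v), i, j) for i, row in enumerate(coeffs)
                     for j, v in enumerate(row)), reverse=True)
    kept = {(i, j) for _, i, j in ranked[:count]}
    return [[v if (i, j) in kept else 0.0 for j, v in enumerate(row)]
            for i, row in enumerate(coeffs)]

# A constant 8 x 8 block has energy only in the DC coefficient G[0][0].
flat_block = [[0.5] * 8 for _ in range(8)]
dc_only = dct2(flat_block)
```

A smooth block is therefore represented by very few non-zero coefficients, which is the property the encoding steps rely on.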


Figure 2. The transform coefficients of an 8 × 8 two-dimensional DCT

The most popular technique for still image compression is JPEG (Joint Photographic Experts Group), which depends heavily on DCT. However, using the DCT on 8 × 8 blocks often results in a blocky appearance in the reconstructed image (especially when the compression ratio is high). This is the main reason why the wavelet transform replaced DCT with the introduction of JPEG2000 (Antonini et al., 1992).

3. k-PCA: a semi-universal encoder

As pointed out in the previous discussion, there are two reasons why the computational cost of designing an MPC encoder is very high: (1) the weight matrices to be updated are of high dimensionality, and (2) the convergence speed of the GHAs is slow. To solve these problems, we propose the so-called k-PCA approach. The basic idea of k-PCA is to divide the problem space into k clusters using VQ, and to find a set of principal components using PCA for each cluster. Although k-PCA and MPC are very similar, k-PCA has the following advantages. First, the dimension of the vectors to be updated is much smaller. Second, the LBG algorithm is faster than the learning algorithm used in MPC. Third, for each cluster we use a PCA rather than a GHA, and a PCA is much easier to obtain.

3.1 The training procedure

The first step of the proposed method is to build an encoder based on the given training data. The training procedure can be described as follows:

Step 1: Normalize the pixel values ($0 \le x_i \le 1$), and divide each training image into small blocks of n × n pixels.

Step 2: For all blocks of all training images, find an l-dimensional ($l < n^2$) PCA encoder. By so doing we can reduce the dimension of the problem space from $n^2$ to l.

Step 3: Apply VQ in the reduced l-dimensional data space, with the number of codewords set to k (k = 64 in our experiments). The codeword index assigned to each input vector is then used to partition the original n × n dimensional data space.

Step 4: One by one, assign each n × n dimensional block to its sub-space according to the index found in Step 3.

Step 5: For each sub-space, find an m-dimensional ($m < n^2$) PCA encoder. For the jth sub-space, the transformation matrix is given by

\[ V_j = (v_{1j}, v_{2j}, \ldots, v_{mj}), \quad j = 1, 2, \ldots, k \tag{11} \]

where $v_{ij}$ is the ith eigenvector of the jth PCA encoder (supposing that the eigenvectors are already sorted in non-increasing order of their eigenvalues).

Note that the PCA and VQ in Steps 2 and 3 may be replaced by a single VQ in the original high-dimensional space (i.e. n × n). We reduce the dimension first because the computational cost of VQ is significantly lower once the dimension of the input space has been reduced. To lessen the computational cost further, DCT may be used rather than PCA. Through experiments we have found that an l = 8 dimensional PCA encoder can represent the original image blocks very well. The codebook obtained from the 8-dimensional vectors performs almost as well as that obtained from the original $n^2 = 64$ dimensional vectors.

In this paper, we call the above algorithm k-PCA. Note that if we train the k-PCA encoder using enough data, we can use it as a semi-universal encoder to compress many images of the same domain. In this case, we do not have to include the transformation matrices in the compressed data, and thus the compression ratio can be increased greatly.

3.2 The encoding procedure

After building the encoder in the training procedure, encoding is performed as follows:

Step 1: Normalization ($0 \le x_i \le 1$): normalize the pixel values and divide each image to be encoded into n × n sub-image blocks.

Step 2: Centralization: subtract the offset from each sub-image block.

Step 3: For each block x, find the jth PCA encoder that minimizes the reconstruction error:

\[ V_j = \arg\min_{\forall j} \| x - V_j V_j^{T} x \|. \tag{12} \]

The principal components y can be expressed as

\[ y_i = v_{ij}^{T} x, \quad i = 1, 2, \ldots, m. \tag{13} \]

The index j and the coefficients $y_i$ (i = 1, 2, ..., m) are used as the code of x.

Step 4: Quantization: the number of bits assigned to each dimension of the transformed space should be determined by its variance (the corresponding eigenvalue in the input space).
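The training procedure of Section 3.1 and the encoder selection of equations (12) and (13) can be sketched as below. This is a simplified illustration, not the authors' implementation: plain Lloyd iterations (k-means) stand in for the LBG algorithm, normalization and centralization are assumed to have been applied to `blocks` already, empty clusters fall back to the global data, and all function names are hypothetical.

```python
import numpy as np

def top_eigvecs(X, m):
    """Columns are the m leading eigenvectors of the correlation matrix of the rows of X."""
    vals, vecs = np.linalg.eigh(X.T @ X / len(X))
    return vecs[:, np.argsort(vals)[::-1][:m]]

def kmeans_labels(X, k, iters=20, seed=0):
    """Plain Lloyd iterations, used here as a simple stand-in for LBG."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def train_kpca(blocks, k, l, m):
    """Steps 2 to 5: global l-dim PCA, VQ in the reduced space, one m-dim PCA per cluster."""
    reduced = blocks @ top_eigvecs(blocks, l)             # Step 2
    labels = kmeans_labels(reduced, k)                    # Step 3
    encoders = []
    for j in range(k):                                    # Steps 4 and 5
        members = blocks[labels == j]
        if len(members) == 0:                             # assumed fallback for empty clusters
            members = blocks
        encoders.append(top_eigvecs(members, m))          # V_j of equation (11)
    return encoders

def encode_block(x, encoders):
    """Equations (12) and (13): choose the best encoder and return (j, y)."""
    errors = [np.linalg.norm(x - V @ (V.T @ x)) for V in encoders]
    j = int(np.argmin(errors))
    return j, encoders[j].T @ x
```

Because the clustering runs on l-dimensional vectors and each cluster needs only one eigen-decomposition, the expensive iterative part never touches the full $n^2$-dimensional space.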

3.3 The decoding procedure

The decoding procedure is as follows:

Step 1: Initialize the parameters of the image to be reconstructed, including its size, the offset, the k-PCA encoder, the quantization tables, etc. Using these parameters, iteratively perform Steps 2 through 4.

Step 2: Read a segment of the code, and retrieve the index and the quantized principal components of the current sub-image block. Apply inverse quantization to the quantized values to recover the principal components of the block.

Step 3: Select the transformation matrix that corresponds to the index, and reconstruct the sub-image block via

\[ \hat{x} = \sum_{i=1}^{m} y_i v_{ij} \tag{14} \]

where j is the index and $y_i$ are the principal components.

Step 4: Decentralization: add the offset to the current sub-image block, and place the block into the image reconstructed so far.
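A minimal sketch of Steps 2 to 4 for a single block follows. The uniform 8-bit quantizer over a fixed range [lo, hi] is an assumption made for illustration only; the text states that the number of bits should follow the variance of each dimension but does not give the quantizer design. The function names and the scalar offset are likewise hypothetical.

```python
import numpy as np

def quantize(y, lo, hi, bits=8):
    """Assumed uniform scalar quantizer: map values in [lo, hi] to integer codes."""
    levels = 2 ** bits - 1
    return np.round((np.clip(y, lo, hi) - lo) / (hi - lo) * levels).astype(int)

def dequantize(q, lo, hi, bits=8):
    """Step 2: inverse quantization back to approximate principal components."""
    levels = 2 ** bits - 1
    return lo + q / levels * (hi - lo)

def decode_block(j, q, encoders, offset, lo, hi, bits=8):
    """Steps 2 to 4: inverse quantization, equation (14), then add the offset."""
    y = dequantize(q, lo, hi, bits)
    return encoders[j] @ y + offset      # sum over i of y_i * v_ij, plus the offset
```

With this quantizer the error added to each coefficient is at most half a quantization step, (hi - lo) / (2 * (2^bits - 1)).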

3.4 Improving the performance of k-PCA through learning

From the training process we can see that, to obtain a k-PCA encoder that generalizes well, it is important to partition the input data space properly. Although the VQ-based k-PCA is faster than other existing adaptive PCA approaches, the encoder obtained is not optimal because the decision boundaries formed by the k cluster centers are in general different from those formed by the k PCA encoders. To improve the approximation ability of the k-PCA encoder, we can conduct a post-training process based on the LBG algorithm (Linde et al., 1980). We call this approach the extended LBG. The basic idea is as follows. After partitioning the problem space by VQ, we can find k PCA encoders. For each PCA encoder, we can find the set of training data that is approximated best by this encoder, and the encoder can be rebuilt using these data. We repeat this process until the incremental gains in the fidelity of the reconstructed image are insignificant. The process is given as follows:

Step 1: For each input datum (training datum), find the closest PCA encoder, i.e. the one that produces the smallest error between the original and the reconstructed data, and mark the datum with the index of that encoder.

Step 2: For each PCA encoder, find all input data marked with its index, construct a new encoder using these input data, and replace the old one.

Step 3: Evaluate the error between the original and the reconstructed image, and compare it with the error in the previous iteration.

Step 4: If there is no significant improvement, terminate; otherwise, return to Step 1.
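The four steps above can be sketched as a short loop. This is an illustrative reading rather than the authors' code: the relative tolerance `tol`, the iteration cap and the function names are assumptions, and an encoder whose cluster becomes empty is simply left unchanged.

```python
import numpy as np

def top_eigvecs(X, m):
    """Columns are the m leading eigenvectors of the correlation matrix of the rows of X."""
    vals, vecs = np.linalg.eigh(X.T @ X / len(X))
    return vecs[:, np.argsort(vals)[::-1][:m]]

def reconstruction_errors(blocks, encoders):
    """Squared error of every block under every encoder, shape (num_blocks, k)."""
    return np.stack([((blocks - blocks @ V @ V.T) ** 2).sum(axis=1)
                     for V in encoders], axis=1)

def mean_error(blocks, encoders):
    """Mean squared error when each block uses its best encoder."""
    return reconstruction_errors(blocks, encoders).min(axis=1).mean()

def extended_lbg(blocks, encoders, m, tol=1e-3, max_iters=100):
    """Retrain the PCA encoders until the relative error gain drops below `tol`."""
    prev = mean_error(blocks, encoders)
    mse = prev
    for _ in range(max_iters):
        labels = reconstruction_errors(blocks, encoders).argmin(axis=1)    # Step 1
        encoders = [top_eigvecs(blocks[labels == j], m) if np.any(labels == j) else V
                    for j, V in enumerate(encoders)]                       # Step 2
        mse = mean_error(blocks, encoders)                                 # Step 3
        if prev - mse < tol * prev:                                        # Step 4
            break
        prev = mse
    return encoders, mse
```

Each pass can only keep or lower the error: reassignment picks the best encoder for every block, and rebuilding a PCA on the reassigned blocks is the best rank-m projection for that group.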


4. Experimental results

To verify the proposed method, we conducted experiments with ten well-known test images. All images have the same size, 512 × 512 pixels, with 256 gray levels. The uncompressed size of each image is therefore 256 kB. In the experiments, the block size n is 8, the codebook size (the number of clusters) k is 64, and the number of basis vectors m is 4. The coefficients of the principal components were quantized to 8 bits.

In the first set of experiments, we constructed a k-PCA encoder using one of the images and tested the performance using the same image. Table I shows the MSE and the PSNR of the k-PCA encoders obtained for different images. In this table, I is the number of iterations used by the extended LBG for improving the k-PCA. For comparison, the results obtained by using MPC (Dony, 1995) are provided in Table II. In Tables I and II, the parameters n, k and m are the same. The compression ratio (Cr) in this case is 3.084 (the compression ratio is quite low because the transformation matrices, as well as the transformed coefficients, are all counted).

Table I. MSE/PSNR of k-PCA encoders for the same image

Name       k-PCA before retraining   k-PCA after retraining   I
Airport    140.27/26.66              117.61/27.43             28
Barbara    113.04/27.6               79.37/29.13              25
Boat       84.08/28.88               66.09/29.93              24
Elaine     37.74/32.36               31.16/33.19              35
F16        56.66/30.6                43.79/31.72              46
Lena       32.69/32.99               25.26/34.11              27
Man        116.68/27.46              93.52/28.42              28
Mandrill   331.09/22.93              287.47/23.54             22
Peppers    40.18/32.09               30.79/33.25              27
Zelda      18.75/35.4                14.20/36.61              33

Table II. Results of MPC

Name       MSE      PSNR
Airport    258.33   24.01
Barbara    286.27   23.56
Boat       178.85   25.61
Elaine     66.88    29.88
F16        149.96   26.37
Lena       75.95    29.33
Man        303.03   23.32
Mandrill   488.84   21.24
Peppers    112.26   27.63
Zelda      36.02    32.57

From these results, we can see that the k-PCA encoders perform significantly better than MPC in all cases. The results after retraining are even better. The improved algorithms given in Dony (1998) may produce results better than those given in Table II. Since the programs of these algorithms are not available, we could not provide their results here. However, the results for the image Lena were given in Dony (1998), and they are worse than the results of k-PCA (see Table III). Note that the results given for Lena in Table II are different from those given in Table III because the results of MPC differ from trial to trial.

Table III. Results of different MPC algorithms for the image Lena

Algorithm      MSE    PSNR
Growth MPC     57.1   30.06
Tree MPC       57.0   30.05
Standard MPC   84.9   28.8

In principle, VQ may achieve a higher compression ratio than the proposed method. However, it is usually very time-consuming. Figure 3 shows the relation between the training time and the MSE of VQ and the k-PCA. Obviously, the computational cost of building the codebook is very high, and thus VQ cannot be used in many real-time applications. Thus, in this paper, we use VQ only for finding the k clusters, and the k-PCA is then found based on these clusters.

To confirm the generalization ability of k-PCA encoders, which is very important if we intend to use them as semi-universal encoders, we conducted another set of experiments. Specifically, we used nine of the ten images for training, and tested the resulting encoder using the remaining image. This method is often called ten-fold cross-validation in machine learning. The basic idea is that if the training samples are adequate, good performance for the test image can be expected.

Figure 3. Training time vs MSE of VQ and k-PCA
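The PSNR values and compression ratios quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes a peak value of 255 for the PSNR. The bit accounting is not spelled out in the text, so it is an assumed breakdown: a 6-bit cluster index plus m = 4 coefficients of 8 bits per block, and 4 bytes per stored matrix entry when the transformation matrices are counted. These assumptions happen to reproduce the ratios 13.47 and 3.084 reported in the paper.

```python
import math

def psnr(mse, peak=255):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10 * math.log10(peak ** 2 / mse)

n, k, m, bits = 8, 64, 4, 8            # block size, clusters, basis vectors, bits per coefficient

raw_bits = n * n * 8                   # 64 pixels of 8 bits per block
code_bits = m * bits + int(math.log2(k))   # assumed: 4 coefficients plus a 6-bit index
cr_block = raw_bits / code_bits        # ratio when the matrices are not transmitted

image_bytes = 512 * 512                # one byte per pixel, i.e. 256 kB
num_blocks = (512 // n) ** 2
code_bytes = num_blocks * code_bits / 8
matrix_bytes = k * m * n * n * 4       # assumed: 4-byte values for all k matrices
cr_with_matrices = image_bytes / (code_bytes + matrix_bytes)

# An i-D DCT encoder that keeps i of the 64 coefficients at 8 bits each.
cr_dct = {i: (n * n) / i for i in (4, 5, 6)}

print(round(psnr(140.27), 2), round(cr_block, 2), round(cr_with_matrices, 3), cr_dct)
```

Under this accounting the transformation matrices take several times more space than the per-block codes for a single image, which is why a semi-universal encoder that need not transmit them compresses so much better.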


The MSE and PSNR of five-dimensional (using five principal components) PCA encoders for training and test data are given in Table IV. The results of k-PCA are given in Table V. The results of k-PCA after ten iterations of retraining are given in Table VI, and the results after convergence of retraining are given in Table VII. These results show that the extended LBG can improve the generalization ability in all cases, with only a slight increase in training cost. Table VI shows that the extended LBG converges very quickly. Usually, ten iterations are enough. If the k-PCA encoder is constructed off-line, this additional cost can be ignored.

Table IV. MSE/PSNR of five-dimensional PCA encoders (ten-fold cross-validation, Cr = 12.8)

Name       Training       Test
Airport    161.02/26.06   236.46/24.39
Barbara    157.14/26.17   271.86/23.79
Boat       170.25/25.82   154.66/26.24
Elaine     180.83/25.56   58.11/30.49
F16        172.87/25.75   129.97/26.99
Lena       180.38/25.57   62.48/30.17
Man        166.49/25.92   188.02/25.39
Mandrill   134.19/26.85   478.43/21.33
Peppers    177.05/25.65   92.93/28.45
Zelda      183.85/25.49   30.65/33.27

Table V. MSE/PSNR of the k-PCA encoders (without retraining, Cr = 13.47)

Name       Training       Test
Airport    115.27/27.51   175.95/25.68
Barbara    112.61/27.62   273.19/23.77
Boat       122.39/27.25   106.93/27.84
Elaine     129.31/27.01   49.09/31.22
F16        124.96/27.16   79.95/29.1
Lena       130.32/26.98   46.59/31.45
Man        112.61/27.62   133.28/26.88
Mandrill   92.06/28.49    378.58/22.35
Peppers    127.66/27.07   58.06/30.49
Zelda      131.92/26.93   23.28/33.29

Table VI. MSE/PSNR of k-PCA encoders (ten iterations of retraining, Cr = 13.47)

Name       Training       Test
Airport    94.16/28.39    160.48/26.08
Barbara    96.89/28.27    227.24/24.57
Boat       101.01/28.09   98.1/28.21
Elaine     106.05/27.88   45.33/31.57
F16        103.82/27.97   73.11/29.49
Lena       107.75/27.81   40.49/32.06
Man        98.38/28.2     122.55/27.25
Mandrill   73.62/29.46    361.51/22.55
Peppers    106.46/27.86   49.86/31.15
Zelda      108.59/27.77   21.8/34.75

Table VII. MSE/PSNR of k-PCA encoders (retraining until convergence, Cr = 13.47)

Name       Training       Test
Airport    92.95/28.45    160.72/26.07
Barbara    95.39/28.34    224.54/24.62
Boat       98.68/28.19    97.87/28.22
Elaine     104.33/27.95   44.9/31.61
F16        101.51/28.07   73.24/29.48
Lena       106.12/27.87   40.12/32.1
Man        96.44/28.29    122.44/27.25
Mandrill   71.10/29.61    361.1/22.55
Peppers    105.45/27.9    49.19/31.21
Zelda      107.14/27.83   21.72/34.76

Next, we compare the performance of k-PCA with that of DCT. Results of different DCT encoders are given in Table VIII, in which i-D DCT means a DCT encoder that uses i coefficients in image encoding. Clearly, both the original and the retrained k-PCA encoders are better than the DCT encoders, with smaller error and higher Cr.

Table VIII. MSE/PSNR of different DCT encoders

Name       4-D DCT (Cr = 16)   5-D DCT (Cr = 12.8)   6-D DCT (Cr = 10.67)
Airport    281.56/23.64        238.22/24.36          216.04/24.79
Barbara    295.93/23.42        272.32/23.78          259.51/23.99
Boat       196.64/25.19        156.09/26.2           131.32/26.95
Elaine     68.39/29.78         58.13/30.49           51.37/31.02
F16        162.92/26.01        133.57/26.87          98.45/28.2
Lena       83.8/28.9           62.55/30.17           56.6/30.6
Man        226.08/24.59        187.35/25.4           161.57/26.05
Mandrill   509.65/21.06        483.78/21.28          435.55/21.74
Peppers    116.4/27.47         91.72/28.51           74.41/29.41
Zelda      39.33/32.18         29.95/33.37           24.79/34.19

To visualize the fidelity improvement, Figure 4 shows a sub-image of the image Lena that contains edges and fine features. The sub-image is reconstructed by different coding methods. We can see from the figure that the k-PCA encoders (before and after retraining) have much better approximation ability than the five-dimensional PCA encoder and the five-dimensional DCT encoder, especially around the edges of the cap and the iris. Note that the compression ratios of all encoders are almost the same in this case. Also notice that the images reconstructed by the five-dimensional PCA encoder and the five-dimensional DCT encoder are almost the same. This implies that the transform obtained by PCA using many images is almost the same as DCT. In this sense, DCT is the PCA averaged over many images.

Figure 4. Magnified sub-image of the original and the reconstructed image Lena


To further confirm the generalization ability of the proposed method and to evaluate how much training data is required for obtaining a good k-PCA encoder, we reduced the training data. Specifically, the ratio between the training set and the test set was 5:5 or 1:9. Tables IX and X show the experimental results. Comparing Table VII with Table IX, we can see that five training images are already enough to get a good k-PCA encoder. Comparing the last column of Table X with the third column of Table VIII, we can see that even though the training set contains only one image, the results are still better than those of the DCT-based methods.

Table IX. MSE/PSNR of the k-PCA encoders using five images for training

Name       Training       Test
Airport    138.34/26.72   164.97/25.96
Barbara    103.6/27.98    226.86/24.57
Boat       83.01/28.94    102.79/28.01
Elaine     41.15/31.99    46.23/31.48
F16        57.89/30.5     79.04/29.15
Lena       35.94/32.58    41.74/31.93
Man        106.4/27.86    122.6/27.25
Mandrill   313.9/23.16    368.3/22.47
Peppers    43.21/31.77    52.09/30.96
Zelda      20.96/34.92    22.04/34.7

Table X. MSE/PSNR of the k-PCA encoders using one image for training

Name       Training       Test
Airport    117.61/27.43   186.06/25.43
Barbara    79.37/29.13    239.54/24.34
Boat       66.09/29.93    120.02/27.34
Elaine     31.16/33.19    53.11/30.88
F16        43.79/31.72    97.05/28.26
Lena       25.26/34.11    52.25/30.95
Man        93.52/28.42    147.43/26.44
Mandrill   287.47/23.54   396.6/22.15
Peppers    30.79/33.25    68.34/29.78
Zelda      14.20/36.61    26.67/33.87

5. Conclusion

In this paper, we have focused our attention on how to improve the performance of PCA-based image compression. A new technique, called k-PCA, has been proposed for obtaining semi-universal encoders. The basic idea of k-PCA is similar to MPC. However, the process for obtaining a k-PCA is much more efficient. We have also proposed an extended LBG algorithm that can improve the k-PCA through retraining. The results of the experiments show that the k-PCA encoders, both before and after retraining, are better than DCT, PCA and MPC encoders. Further, the k-PCA encoders can have very good generalization ability even if the training set is a small portion of the whole database.

Note that in this paper we did not compare the k-PCA with JPEG directly. In fact, we can replace DCT with a k-PCA in JPEG and get a better encoder for a given domain. Of course, since the k-PCA encoder is not truly universal, we cannot use it for compressing all kinds of images. Currently, we are trying to replace DCT in JPEG with a k-PCA encoder and to apply the resulting encoder to the compression of facial images in a database used by our group for face/expression recognition. Further results will be reported later.

References

Ahmed, N., Natarajan, T. and Rao, K.R. (1974), "Discrete cosine transform", IEEE Transactions on Computers, Vol. C-23, pp. 90-3.

Antonini, M., Barlaud, M., Mathieu, P. and Daubechies, I. (1992), "Image coding using the wavelet transform", IEEE Transactions on Image Processing, Vol. 1, pp. 205-20.

Carrato, S. (1992), "Neural networks for image compression", in Gelenbe (Ed.), Neural Networks: Advances and Applications, 2nd ed., North-Holland, Amsterdam, pp. 177-98.

Chang, C.C., Lin, D.C. and Chen, T.S. (1997), "An improved VQ codebook searching using principal component analysis", Journal of Visual Communication and Image Representation, Vol. 8 No. 1, pp. 27-37.

Chang, R.F., Chen, W.T. and Wang, J.S. (1992), "Image sequence coding adaptive tree-structured vector quantization with multipath searching", IEE Proceedings, Vol. 139 No. 1, pp. 9-14.

Ching, Y.C. (1973), "Differential pulse code modulations system having dual quantization schemes", US Patent 3781685.

Dony, R.D. (1995), "Adaptive transform coding of images using a mixture of principal components", PhD thesis, McMaster University, Hamilton.

Dony, R.D. (1998), "A comparison of Hebbian learning methods for image compression using the mixture of principal components network", Proceedings of SPIE, Applications of Artificial Neural Networks in Image Processing III, Vol. 3307, pp. 64-75.

Gray, R.M. and Linde, Y. (1982), "Vector quantization and predictive quantizers for Gauss-Markov sources", IEEE Transactions on Communications, Vol. 30 No. 2, pp. 381-98.

Jolliffe, I.T. (1986), Principal Component Analysis, Springer-Verlag, New York, NY.

Kambhatla, N. and Leen, T.K. (1997), "Dimension reduction by local principal component analysis", Neural Computation, Vol. 9, pp. 1493-516.

Kung, S.Y. and Diamantaras, K.I. (1990), "A neural network learning algorithm for adaptive principal component extraction (APEX)", Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Albuquerque, NM, 3-6 April, Vol. 90, pp. 861-4.

Linde, Y., Buzo, A. and Gray, R.M. (1980), "An algorithm for vector quantization", IEEE Transactions on Communications, Vol. 28 No. 1, pp. 84-95.

Lv, C.F. and Zhao, Q.F. (2004), "Fractal based VQ image compression algorithm", Proceedings of the 66th National Convention of IPSJ.

Lv, C.F. and Zhao, Q.F. (2005), "A simplified MPC for image compression", Proceedings of the International Conference on Computer and Information Technology, Shanghai, pp. 580-4.

Lv, C.F. (2004), "IFS+VQ: a new method for image compression", Master's thesis, The University of Aizu, Aizu.

Oja, E. (1982), "A simplified neuron model as a principal component analyzer", Journal of Mathematical Biology, Vol. 15, pp. 267-73.

Roweis, S.T. and Saul, L.K. (2000), "Nonlinear dimensionality reduction by locally linear embedding", Science, Vol. 290, pp. 2323-6.

Salomon, D. (2004), Data Compression: The Complete Reference, Springer, London, pp. 289-325.

Sanger, T.D. (1989), "Optimal unsupervised learning in a single-layer linear feedforward neural network", Neural Networks, Vol. 2, pp. 459-73.


Schölkopf, B., Smola, A. and Müller, K.R. (1998), "Nonlinear component analysis as a kernel eigenvalue problem", Neural Computation, Vol. 10, pp. 1299-319.

Tenenbaum, J., de Silva, V. and Langford, J. (2000), "A global geometric framework for nonlinear dimensionality reduction", Science, Vol. 290, pp. 2319-23.

Torres, L. and Huguet, J. (1994), "An improvement on codebook search for vector quantization", IEEE Transactions on Communications, Vol. 42 Nos. 2/3/4, pp. 208-10.

Further reading

Fisher, Y. (1994), Fractal Image Compression, Springer, New York, NY.

ISO/IEC JTC 1/SC 29/WG 1 WD14495, "JPEG LS image coding system", July, 1996.

ISO/IEC WD15444-1, "JPEG 2000 lossless and lossy compression of continuous-tone and bi-level still images", ISO, December 1999.

Lv, C.F. and Zhao, Q.F. (2005), "A universal PCA for image compression", Proceedings of the International Conference on Embedded and Ubiquitous Computing, Nagasaki, Springer LNCS 3824, pp. 910-9.

Corresponding author

Chuanfeng Lv can be contacted at: [email protected]
